Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-07-17 Thread Neil Hodgson
Martin v. Löwis:

 This appears to be based on the usedDefault return value of
 WideCharToMultiByte. I believe this is insufficient:
 WideCharToMultiByte might convert Unicode characters to
 codepage characters in a lossy way, without using the default
 character. For example, it converts U+0308 (combining diaeresis)
 to U+00A8 (diaeresis) (or something like that, I forgot the
 exact details). So if you have, say, p-umlaut (i.e. U+0070
 U+0308), it converts it to U+0070 U+00A8 (in the local code page).
 Trying to use this as a filename later fails.

   There is WC_NO_BEST_FIT_CHARS to defeat that. It says that it will
use the default character if the translation can't be round-tripped.
Available on Windows 2000 and XP but not NT4. We could compare the
original against the round-tripped string as described at
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/unicode_2bj9.asp
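
A rough Python sketch of that round-trip comparison follows. It is only an illustration, not the Windows API: BEST_FIT is a made-up one-entry table standing in for whatever lossy best-fit mapping WideCharToMultiByte may apply, and lossy_encode and round_trips are hypothetical helper names.

```python
# Hypothetical stand-in for a best-fit table: combining diaeresis -> diaeresis.
BEST_FIT = {"\u0308": "\u00a8"}


def lossy_encode(name, codepage="cp1252"):
    """Encode like a best-fit converter: substitute look-alikes, then '?'."""
    substituted = "".join(BEST_FIT.get(ch, ch) for ch in name)
    return substituted.encode(codepage, "replace")


def round_trips(name, codepage="cp1252"):
    """True if the encoded name decodes back to exactly the original."""
    return lossy_encode(name, codepage).decode(codepage) == name


print(round_trips("caf\u00e9.txt"))  # e-acute exists in cp1252
print(round_trips("p\u0308.txt"))    # p + combining diaeresis gets altered
```

A name that fails the comparison would have to stay in Unicode, because the code page version no longer names the same file.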

   Neil
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-07-17 Thread Martin v. Löwis
Neil Hodgson wrote:
There is WC_NO_BEST_FIT_CHARS to defeat that. It says that it will
 use the default character if the translation can't be round-tripped.
 Available on Windows 2000 and XP but not NT4.

Ah, ok, that's a useful feature. Of course, limited availability of the
feature means that we either need to drop support for some systems, or
provide yet another layer of fallback routines.

Regards,
Martin


Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-07-15 Thread M.-A. Lemburg
Martin v. Löwis wrote:
 Guido van Rossum wrote:
 
Ah, sigh. I didn't know that os.listdir() behaves differently when the
argument is Unicode. Does os.listdir(".") really behave differently
than os.listdir(u".")? Bah! I don't think that's a very good design
(although I see where it comes from). Promoting only those entries
that need it seems the right solution
 
 
 Unfortunately, this solution is hard to implement (I don't know whether
 it can be implemented correctly at all; at least on Windows, I see no
 way to implement it efficiently).
 
 Here are a number of problems/questions:
 - On Windows, should listdir use the narrow or the wide API? Obviously
   the wide API, since it is not Python which returns the question marks,
   but the Windows API.

Right.

 - But then, the wide API gives all results as Unicode. If you want to
   promote only those entries that need it, it really means that you
   only want to demote those that don't need it. But how can you tell
   whether an entry needs it? There is no API to find out.
   You could declare that anything with characters >= 128 needs it,
   but that would be an incompatible change: If a character >= 128 in
   the system code page is in a file name, listdir currently returns
   it in the system code page. It then would return a Unicode string.
   Applications relying on the old behaviour would break.

We will need a Python C API that returns:

* a string if the Unicode value is representable in the
  default encoding (usually ASCII)

* Unicode if it is not

The file system encoding should be hidden in the OS
layer (e.g. posixmodule). Python should only return
strings with the default encoding and Unicode
otherwise.

See my suggestion to Neil about making the transition to
this new strategy less painful.
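
The return rule proposed above can be sketched in a few lines. This is an assumption-laden illustration in modern Python, where bytes plays the role of the 2.x string type and str plays the role of Unicode; native_name is a hypothetical helper, not an existing C API.

```python
def native_name(name, default_encoding="ascii"):
    """Return bytes if name fits the default encoding, else the text itself."""
    try:
        return name.encode(default_encoding)
    except UnicodeEncodeError:
        # Not representable in the default encoding: keep it as Unicode.
        return name


print(repr(native_name("readme.txt")))
print(ascii(native_name("\u20ac.txt")))
```

Callers therefore see the old string type for plain ASCII names and only meet Unicode objects when a name really needs them.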

 - On Unix, all file names come out as byte strings. Again, how do
   you know which ones to promote, and using what encoding? Python
   currently guesses an encoding, but that may or may not be the one
   intended for the file name.

This is a tough one: AFAIK the file system encoding in Unix
was never really specified, in fact most file systems just
stored the names as-is without any encoding information attached
to it.

Things are moving into the direction of using UTF-8 for
filenames, though.

To solve this issue, various applications have come up with
ways around the problem, e.g. GTK uses the following strategy
to find the encoding (in the given order and adjustable using
an environment variable):

1. locale based encoding, if given (UTF-8 on most modern Unixes)
2. UTF-8
3. Latin-1
4. CP1252 (Windows Latin-1 version)

Perhaps we should add similar support to Python ?
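
A minimal sketch of such an ordered fallback, assuming the candidate list given above; decode_filename is a hypothetical helper and the real GTK logic may differ in detail.

```python
def decode_filename(raw, locale_encoding=None):
    """Try encodings in order; return (text, encoding_that_worked)."""
    candidates = []
    if locale_encoding:
        candidates.append(locale_encoding)
    candidates += ["utf-8", "latin-1", "cp1252"]
    for encoding in candidates:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # Not reached with this list: latin-1 accepts every byte sequence.
    raise ValueError("no candidate encoding matched")


print(decode_filename(b"caf\xc3\xa9.txt")[1])  # valid UTF-8
print(decode_filename(b"caf\xe9.txt")[1])      # not UTF-8, falls through
```

Because Latin-1 decodes any byte sequence, the CP1252 entry only matters if the order is changed.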

We should probably use a file system encoding default
of Latin-1 on Unix if no other information can be found.

That way we will assure that things don't change on
Unix unless explicitly setup by the user (Latin-1 is
round-trip safe when converting it to Unicode and back).

os.listdir() would then continue to return plain strings
and file() will open them just as it does now.
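
The round-trip claim for Latin-1 is easy to check. The snippet below is a small sketch in modern Python (bytes standing in for 2.x strings); latin1_round_trip is a hypothetical name used only for illustration.

```python
def latin1_round_trip(raw):
    """Decode bytes as Latin-1 and encode them straight back."""
    return raw.decode("latin-1").encode("latin-1")


every_byte = bytes(range(256))
print(latin1_round_trip(every_byte) == every_byte)

# By contrast, UTF-8 rejects arbitrary byte sequences.
try:
    every_byte.decode("utf-8")
except UnicodeDecodeError as exc:
    print("utf-8 rejects arbitrary bytes:", exc.reason)
```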

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jul 15 2005)
 Python/Zope Consulting and Support ...http://www.egenix.com/
 mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
 mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/


::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! 
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-07-14 Thread M.-A. Lemburg
Hi Neil,

   With the proposed modification, sys.argv[1] u'\u20ac.txt' is
converted through cp1251

Actually, it is not: if you pass in a Unicode argument to
one of the file I/O functions and the OS supports Unicode
directly or at least provides the notion of a file system
encoding, then the file I/O should use the Unicode APIs
of the OS or convert the Unicode argument to the file system
encoding. AFAIK, this is how posixmodule.c already works
(more or less).
 
 
Yes it is. The initial stage is reading the command line arguments.
 The proposed modification is to change behaviour when constructing
 sys.argv, os.environ or when calling os.listdir to "Return unicode
 when the text can not be represented in Python's default encoding". I
 take this to mean that when the value can be represented in Python's
 default encoding then it is returned as a byte string in the default
 encoding.
 
Therefore, for the example, the code that sets up sys.argv has to
 encode the unicode command line argument into cp1251.

Ok, I missed your point about sys.argv *not* returning Unicode
in this particular case.

However, with the modification of having posixmodule
and fileobject recode string input via Unicode (based on the
default encoding) into the file system encoding by basically
just changing the parser marker from "et" to "es", you
get correct behaviour - even in the above case.

Both posixmodule and fileobject would then take the cp1251
default encoded string, convert it to Unicode and then
to the file system encoding before opening the file.

On input, file I/O APIs should accept both strings using
the default encoding and Unicode. How these inputs are then
converted to suit the OS is up to the OS abstraction layer, e.g.
posixmodule.c.
 
 
This looks to me to be insufficiently compatible with current
 behaviour which accepts byte strings outside the default encoding.
 Existing code may call open("€.txt"). This is perfectly legitimate
 current Python (with a coding declaration) as "€.txt" is a byte string
 and file systems will accept byte string names. Since the standard
 default encoding is ASCII, should such code raise UnicodeDecodeError?

Yes.

The above proposed change is indeed more restrictive than
the current pass-through approach. I'm not sure whether we
can impose such a change on the users in the 2.x series...
perhaps we should have a two phase approach:

Phase 1:
   try "es" and if this fails with a UnicodeDecodeError,
   revert back to the old "et" pass-through approach, issuing
   a warning as a non-disruptive signal to the user

Phase 2:
   move to "es" for good and issue decode errors
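
The two phases can be sketched in Python as a single helper with a strict switch. This is an illustration under assumptions, not the posixmodule code: to_filesystem_bytes and its parameters are hypothetical, and bytes stands in for the 2.x string type.

```python
import warnings


def to_filesystem_bytes(name, default_encoding="ascii",
                        fs_encoding="utf-8", strict=False):
    """Recode a file name for the OS; strict=False mimics phase 1."""
    if isinstance(name, str):
        # Unicode input goes straight to the file system encoding.
        return name.encode(fs_encoding)
    try:
        text = name.decode(default_encoding)
    except UnicodeDecodeError:
        if strict:
            raise  # phase 2: decode errors reach the caller
        warnings.warn(
            "file name is not valid %s; passing it through unchanged"
            % default_encoding,
            stacklevel=2,
        )
        return name  # phase 1: old pass-through behaviour
    return text.encode(fs_encoding)


print(to_filesystem_bytes(b"notes.txt"))
```

Switching the default of strict to True would correspond to moving from phase 1 to phase 2.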

Changing this is easy, though: instead of using the "et"
getargs format specifier, you'd have to use "es". The latter
recodes strings based on the default encoding assumption to
whatever other encoding you specify.
 
Don't you want to convert these into unicode rather than another
 byte string encoding? It looks to me as though the "es" format always
 produces byte strings and the only byte string format that can be
 passed to the operating system is the file system encoding which may
 not contain all the characters in the default encoding.

If the OS supports Unicode directly, we can (and do) have a
special case that bypasses the recoding altogether. However,
this currently only appears to be available on Windows
versions NT, XP and up, where we already support this.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-07-12 Thread Neil Hodgson
M.-A. Lemburg:

  2) Return unicode when the text can not be represented in ASCII. This
  will cause a change of behaviour for existing code which deals with
  non-ASCII data.
 
 +1 on this one (s/ASCII/Python's default encoding).

   I assume you mean the result of sys.getdefaultencoding() here.
Unless much of the Python library is modified to use the default
encoding, this will break. The problem is that different implicit
encodings are being used for reading data and for accessing files.
When calling a function, such as open, with a byte string, Python
passes that byte string through to Windows which interprets it as
being encoded in CP_ACP. When this differs from
sys.getdefaultencoding() there will be a mismatch.

   Say I have been working on a machine set up for Australian English
(or other Western European locale) but am working with Russian data so
have set Python's default encoding to cp1251. With this simple script,
g.py:

import sys
print file(sys.argv[1]).read()

   I process a file called '€.txt' with contents "European Euro" to produce

C:\zed>python_d g.py €.txt
European Euro

   With the proposed modification, sys.argv[1] u'\u20ac.txt' is
converted through cp1251 to '\x88.txt' as the Euro is located at 0x88
in CP1251. The operating system is then asked to open '\x88.txt' which
it interprets through CP_ACP to be u'\u02c6.txt' ('ˆ.txt') which then
fails. If you are very unlucky there will be a file called 'ˆ.txt' so
the call will succeed and produce bad data.

   Simulating with str(sys.argvu[1]):

C:\zed>python_d g.py €.txt
Traceback (most recent call last):
  File "g.py", line 2, in ?
    print file(str(sys.argvu[1])).read()
IOError: [Errno 2] No such file or directory: '\x88.txt'
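
The mismatch can be reproduced without touching the file system. The sketch below assumes, as in the scenario above, a default encoding of cp1251 and a Western European CP_ACP of cp1252; name_seen_by_os is a hypothetical helper written in modern Python.

```python
def name_seen_by_os(name, default_encoding="cp1251", ansi_code_page="cp1252"):
    """Encode with one code page, then reinterpret the bytes with another."""
    as_bytes = name.encode(default_encoding)
    return as_bytes, as_bytes.decode(ansi_code_page)


raw, seen = name_seen_by_os("\u20ac.txt")
print(repr(raw))    # bytes handed to the narrow file API
print(ascii(seen))  # name the OS derives from those bytes
```

The Euro sign becomes byte 0x88, which the other code page reads as U+02C6, so a different file name is looked up.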

 -1: code pages are evil and the reason why Unicode was invented
 in the first place. This would be a step back in history.

   Features used to specify files (sys.argv, os.environ, ...) should
match functions used to open and perform other operations with files
as they do currently. This means their encodings should match.

   Neil


Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-07-12 Thread M.-A. Lemburg
Hi Neil,

2) Return unicode when the text can not be represented in ASCII. This
will cause a change of behaviour for existing code which deals with
non-ASCII data.

+1 on this one (s/ASCII/Python's default encoding).
 
 
I assume you mean the result of sys.getdefaultencoding() here.

Yes.

The default encoding is the encoding that Python assumes when
auto-converting a string to Unicode. It is normally set to ASCII,
but a user may want to use a different encoding.

However, we've always made it very clear that the user is on his
own when changing the ASCII default to something else.

 Unless much of the Python library is modified to use the default
 encoding, this will break. The problem is that different implicit
 encodings are being used for reading data and for accessing files.
 When calling a function, such as open, with a byte string, Python
 passes that byte string through to Windows which interprets it as
 being encoded in CP_ACP. When this differs from
 sys.getdefaultencoding() there will be a mismatch.

As I said: code pages are evil :-)

Say I have been working on a machine set up for Australian English
 (or other Western European locale) but am working with Russian data so
 have set Python's default encoding to cp1251. With this simple script,
 g.py:
 
 import sys
 print file(sys.argv[1]).read()
 
I process a file called '€.txt' with contents "European Euro" to produce
 
 C:\zed>python_d g.py €.txt
 European Euro
 
With the proposed modification, sys.argv[1] u'\u20ac.txt' is
 converted through cp1251 

Actually, it is not: if you pass in a Unicode argument to
one of the file I/O functions and the OS supports Unicode
directly or at least provides the notion of a file system
encoding, then the file I/O should use the Unicode APIs
of the OS or convert the Unicode argument to the file system
encoding. AFAIK, this is how posixmodule.c already works
(more or less).

I was suggesting that OS filename output APIs such as os.listdir()
should return strings, if the filename matches the default
encoding, and Unicode, if not.

On input, file I/O APIs should accept both strings using
the default encoding and Unicode. How these inputs are then
converted to suit the OS is up to the OS abstraction layer, e.g.
posixmodule.c.

Note that the posixmodule currently does not recode string
arguments: it simply passes them to the OS as-is, assuming
that they are already encoded using the file system encoding.
Changing this is easy, though: instead of using the "et"
getargs format specifier, you'd have to use "es". The latter
recodes strings based on the default encoding assumption to
whatever other encoding you specify.

 to '\x88.txt' as the Euro is located at 0x88
 in CP1251. The operating system is then asked to open '\x88.txt' which
 it interprets through CP_ACP to be u'\u02c6.txt' ('ˆ.txt') which then
 fails. If you are very unlucky there will be a file called 'ˆ.txt' so
 the call will succeed and produce bad data.
 
Simulating with str(sys.argvu[1]):
 
 C:\zed>python_d g.py €.txt
 Traceback (most recent call last):
   File "g.py", line 2, in ?
     print file(str(sys.argvu[1])).read()
 IOError: [Errno 2] No such file or directory: '\x88.txt'

See above: this is what I'd consider a bug in posixmodule.c

-1: code pages are evil and the reason why Unicode was invented
in the first place. This would be a step back in history.
 
 
Features used to specify files (sys.argv, os.environ, ...) should
 match functions used to open and perform other operations with files
 as they do currently. This means their encodings should match.

Right. However, most of these APIs currently either don't
make any assumption about the strings' contents and simply pass
them around, or they assume that these strings use the file
system encoding - which, like in the example you gave above,
can be different from the default encoding.

To untie this Gordian Knot, we should use strings and Unicode
like they are supposed to be used (in the context of text
data):

* strings are fine for text data that is encoded using
  the default encoding

* Unicode should be used for all text data that is not
  or cannot be encoded in the default encoding

Later on in Py3k, all text data should be stored in Unicode
and all binary data in some new binary type.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-07-12 Thread Neil Hodgson
   Hi Marc-Andre,

 With the proposed modification, sys.argv[1] u'\u20ac.txt' is
  converted through cp1251
 
 Actually, it is not: if you pass in a Unicode argument to
 one of the file I/O functions and the OS supports Unicode
 directly or at least provides the notion of a file system
 encoding, then the file I/O should use the Unicode APIs
 of the OS or convert the Unicode argument to the file system
 encoding. AFAIK, this is how posixmodule.c already works
 (more or less).

   Yes it is. The initial stage is reading the command line arguments.
The proposed modification is to change behaviour when constructing
sys.argv, os.environ or when calling os.listdir to "Return unicode
when the text can not be represented in Python's default encoding". I
take this to mean that when the value can be represented in Python's
default encoding then it is returned as a byte string in the default
encoding.

   Therefore, for the example, the code that sets up sys.argv has to
encode the unicode command line argument into cp1251.

 On input, file I/O APIs should accept both strings using
 the default encoding and Unicode. How these inputs are then
 converted to suit the OS is up to the OS abstraction layer, e.g.
 posixmodule.c.

   This looks to me to be insufficiently compatible with current
behaviour which accepts byte strings outside the default encoding.
Existing code may call open("€.txt"). This is perfectly legitimate
current Python (with a coding declaration) as "€.txt" is a byte string
and file systems will accept byte string names. Since the standard
default encoding is ASCII, should such code raise UnicodeDecodeError?

 Changing this is easy, though: instead of using the "et"
 getargs format specifier, you'd have to use "es". The latter
 recodes strings based on the default encoding assumption to
 whatever other encoding you specify.

   Don't you want to convert these into unicode rather than another
byte string encoding? It looks to me as though the "es" format always
produces byte strings and the only byte string format that can be
passed to the operating system is the file system encoding which may
not contain all the characters in the default encoding.

   Neil


Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-07-11 Thread Neil Hodgson
Guido van Rossum:

 In some sense the safest approach from this POV would be to return
 Unicode as soon as it can't be encoded using the global default
 encoding. IOW normally this would return Unicode for all names
 containing non-ASCII characters.

   On unicode versions of Windows, for attributes like os.listdir,
os.getcwd, sys.argv, and os.environ, which can usefully return unicode
strings, there are 4 options I see:

1) Always return unicode. This is the option I'd be happiest to use,
myself, but expect this choice would change the behaviour of existing
code too much and so produce much unhappiness.

2) Return unicode when the text can not be represented in ASCII. This
will cause a change of behaviour for existing code which deals with
non-ASCII data.

3) Return unicode when the text can not be represented in the default
code page. While this change can lead to breakage because of combining
byte string and unicode strings, it is reasonably safe from the point
of view of data integrity as current code is returning garbage strings
that look like '?'.

4) Provide two versions of the attribute, one with the current name
returning byte strings and a second with a "u" suffix returning
unicode. This is the least intrusive, requiring explicit changes to
code to receive unicode data. For patch #1231336 I chose this approach
producing sys.argvu and os.environu.

For os.listdir the current behaviour of returning unicode when its
argument is unicode can be retained but that is not extensible to, for
example, sys.argv.
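
For reference, the argument-driven behaviour of os.listdir survives in Python 3, where the split is between bytes and str rather than str and unicode. The sketch below creates a throwaway directory so it can run anywhere; the file name data.txt is arbitrary.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as directory:
    # Create one file so there is something to list.
    open(os.path.join(directory, "data.txt"), "w").close()

    text_names = os.listdir(directory)               # str in, str out
    byte_names = os.listdir(os.fsencode(directory))  # bytes in, bytes out

print(text_names, byte_names)
```

There is no argument to inspect for sys.argv or os.environ, which is why the same trick does not extend to them.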

   Since this issue may affect many attributes a common approach
should be chosen.

   For experimenting with os.listdir, there is a patch for
posixmodule.c at http://www.scintilla.org/difft.txt which implements
(2). To specify the US-ASCII code page, the number 20127 is used as
there is no definition for this in the system headers. To change to
(3) comment out the line with 20127 and uncomment the line with
CP_ACP. Unicode arguments produce unicode results.

   Neil


Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-07-11 Thread M.-A. Lemburg
Neil Hodgson wrote:
On unicode versions of Windows, for attributes like os.listdir,
 os.getcwd, sys.argv, and os.environ, which can usefully return unicode
 strings, there are 4 options I see:
 
 1) Always return unicode. This is the option I'd be happiest to use,
 myself, but expect this choice would change the behaviour of existing
 code too much and so produce much unhappiness.

Would be nice, but will likely break too much code - if you
let Unicode object enter non-Unicode aware code, it is likely
that you'll end up getting stuck in tons of UnicodeErrors. If you
want to get a feeling for this, try running Python with the -U command
line switch.

 2) Return unicode when the text can not be represented in ASCII. This
 will cause a change of behaviour for existing code which deals with
 non-ASCII data.

+1 on this one (s/ASCII/Python's default encoding).

 3) Return unicode when the text can not be represented in the default
 code page. While this change can lead to breakage because of combining
 byte string and unicode strings, it is reasonably safe from the point
 of view of data integrity as current code is returning garbage strings
 that look like '?'.

-1: code pages are evil and the reason why Unicode was invented
in the first place. This would be a step back in history.

 4) Provide two versions of the attribute, one with the current name
 returning byte strings and a second with a "u" suffix returning
 unicode. This is the least intrusive, requiring explicit changes to
 code to receive unicode data. For patch #1231336 I chose this approach
 producing sys.argvu and os.environu.

-1 - this is what Microsoft did for many of their APIs. The
result is two parallel universes with two sets of features,
bugs, documentation, etc.

 For os.listdir the current behaviour of returning unicode when its
 argument is unicode can be retained but that is not extensible to, for
 example, sys.argv.

I don't think that using the parameter type as a parameter
to the function is a good idea. However, accepting both strings
and Unicode will make it easier to maintain backwards
compatibility.

Since this issue may affect many attributes a common approach
 should be chosen.

Indeed.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-07-11 Thread Guido van Rossum
I'm in full agreement with Marc-Andre below, except I don't like (1)
at all -- having used other APIs that always return Unicode (like the
Python XML parsers) it bothers me to get Unicode for no reason at all.
OTOH I think Python 3.0 should be using a Unicode model closer to
Java's.

On 7/11/05, M.-A. Lemburg [EMAIL PROTECTED] wrote:
 Neil Hodgson wrote:
 On unicode versions of Windows, for attributes like os.listdir,
  os.getcwd, sys.argv, and os.environ, which can usefully return unicode
  strings, there are 4 options I see:
 
  1) Always return unicode. This is the option I'd be happiest to use,
  myself, but expect this choice would change the behaviour of existing
  code too much and so produce much unhappiness.
 
 Would be nice, but will likely break too much code - if you
 let Unicode object enter non-Unicode aware code, it is likely
 that you'll end up getting stuck in tons of UnicodeErrors. If you
 want to get a feeling for this, try running Python with -U command
 line switch.
 
  2) Return unicode when the text can not be represented in ASCII. This
  will cause a change of behaviour for existing code which deals with
  non-ASCII data.
 
 +1 on this one (s/ASCII/Python's default encoding).
 
  3) Return unicode when the text can not be represented in the default
  code page. While this change can lead to breakage because of combining
  byte string and unicode strings, it is reasonably safe from the point
  of view of data integrity as current code is returning garbage strings
  that look like '?'.
 
 -1: code pages are evil and the reason why Unicode was invented
 in the first place. This would be a step back in history.
 
  4) Provide two versions of the attribute, one with the current name
  returning byte strings and a second with a "u" suffix returning
  unicode. This is the least intrusive, requiring explicit changes to
  code to receive unicode data. For patch #1231336 I chose this approach
  producing sys.argvu and os.environu.
 
 -1 - this is what Microsoft did for many of their APIs. The
 result is two parallel universes with two sets of features,
 bugs, documentation, etc.
 
  For os.listdir the current behaviour of returning unicode when its
  argument is unicode can be retained but that is not extensible to, for
  example, sys.argv.
 
 I don't think that using the parameter type as parameter
 to function is a good idea. However, accepting both strings
 and Unicode will make it easier to maintain backwards
 compatibility.
 
 Since this issue may affect many attributes a common approach
  should be chosen.
 
 Indeed.
 


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-07-09 Thread M.-A. Lemburg
Neil Hodgson wrote:
 Thomas Heller:
 
 
But adding u'\u5b66\u6821\u30c7\u30fc' to sys.path won't allow importing
this file as a module.  Internally Python/import.c converts everything to
strings.  I started to refactor import.c to work with PyStringObjects
instead of char buffers as a first step - PyUnicodeObjects could have
been added later, but I gave up because there seems to be absolutely zero
interest in it.

Well, most people when confronted with this will rename the
 directory to something simple like ulib and continue.

I don't really buy this trick: what if you happen to have
a home directory with Unicode characters in it ?

I can't judge on this - but it's easy to experiment with it, even in
current Python releases since sys.argvu, os.environu can also be
provided by extension modules.
 
 
It is the effect of this on the non-unicode-savvy that is
 important: if os.environu goes into prereleases of 2.5 then the only
 people that will use it are likely to be those who already try to keep
 their code unicode compliant. There is only likely to be (negative)
 feedback if existing features are made unicode-only or use unicode for
 non-ASCII.

I don't like the idea of creating a parallel universe for
Unicode - OSes are starting to integrate Unicode filenames
rather quickly (UTF-8 on Unix, UTF-16-LE on Windows), so
it's much better to follow them and start accepting Unicode in
sys.path.

Wouldn't it be easy to have the import logic convert Unicode
entries in sys.path to whatever the OS uses internally (UTF-8
or UTF-16-LE) and then keep the char buffers in place ?
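
As a rough illustration of that idea, Unicode entries could be encoded once into the OS-native byte form while byte entries pass through untouched. The helper below is hypothetical and written in modern Python; the default of UTF-8 is an assumption that fits the Unix case, with "utf-16-le" as the Windows counterpart.

```python
def encode_path_entries(entries, os_encoding="utf-8"):
    """Turn text entries into OS-native bytes; leave byte entries alone."""
    encoded = []
    for entry in entries:
        if isinstance(entry, str):
            entry = entry.encode(os_encoding)
        encoded.append(entry)
    return encoded


path = ["lib", "\u5b66\u6821", b"site-packages"]
print(encode_path_entries(path))
print(encode_path_entries(["\u5b66"], os_encoding="utf-16-le"))
```

The import machinery could then keep working on plain byte buffers, at the cost of one conversion per sys.path entry.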

But thanks that you care about this stuff - I'm a little bit worried
because all the other folks seem to think everything's ok (?).
 
Unicode is becoming more of an issue: many Linux distributions now
 install by default with a UTF8 locale and other tools are starting to
 use this: GCC 4 now delivers error messages using Unicode quote
 characters like ‘these’ rather than `these'. There are 131 threads
 found by Google Groups for (UnicodeEncodeError OR UnicodeDecodeError)
 and 21 of these were in this June. A large proportion of the threads
 are in language-specific groups so are not as visible to core
 developers.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-07-09 Thread Neil Hodgson
M.-A. Lemburg:

 I don't really buy this trick: what if you happen to have
 a home directory with Unicode characters in it ?

   Most people choose account names and thus home directory names that
are compatible with their preferred locale settings: German users are
unlikely to choose an account name that uses Japanese characters.
Unicode is only necessary for file names that are outside your default
locale. An administration utility may need to visit multiple users'
home directories and so is more likely to encounter files with names
that can not be represented in its default locale.

   I think it would be better if sys.path could include unicode
entries but expect the code will rarely be exercised.

   Neil


Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-07-09 Thread M.-A. Lemburg
Neil Hodgson wrote:
 M.-A. Lemburg:
 
 
I don't really buy this trick: what if you happen to have
a home directory with Unicode characters in it ?
 
 
Most people choose account names and thus home directory names that
 are compatible with their preferred locale settings: German users are
 unlikely to choose an account name that uses Japanese characters.

It's naive to assume that all people in Germany using the German
locale have German names ;-) E.g. we have a large Japanese community
living here in Düsseldorf. If that example does not convince you,
just have a look at all the Chinese restaurants in cities around
the world - I'm sure that quite a few of the owners will want to
use their correctly written name as account name. Unicode makes
this possible and while it may not be in wide-spread use nowadays,
things will definitely change over the next few years as more and
more OSes and platforms will introduce native Unicode support.

 Unicode is only necessary for file names that are outside your default
 locale. An administration utility may need to visit multiple user's
 home directories and so is more likely to encounter files with names
 that can not be represented in its default locale.

I'm not sure why you bring up an administration tool: isn't
the discussion about being able to load Python modules from
directories with Unicode path components ?

I think it would be better if sys.path could include unicode
 entries but expect the code will rarely be exercised.

I think that sys.path should always use Unicode for non-ASCII
path names - this would make it locale setting independent, which
is what we should strive for in Py3k: locales are much easier to
handle at the application level and only introduce portability
problems if used at the OS or C lib level.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-07-09 Thread Guido van Rossum
On 7/9/05, Neil Hodgson wrote:
 M.-A. Lemburg:
 
  I don't really buy this trick: what if you happen to have
  a home directory with Unicode characters in it ?
 
Most people choose account names and thus home directory names that
 are compatible with their preferred locale settings: German users are
 unlikely to choose an account name that uses Japanese characters.
 Unicode is only necessary for file names that are outside your default
 locale. An administration utility may need to visit multiple user's
 home directories and so is more likely to encounter files with names
 that can not be represented in its default locale.
 
I think it would be better if sys.path could include unicode
 entries but expect the code will rarely be exercised.

Another problem is that if you can return 8-bit strings encoded in the
local code page, and also Unicode, combining the two using string
operations (e.g. a directory using the local code page containing a
file using Unicode, and then combining the two using os.path.join())
will fail unless the local code page is also Python's global default
encoding (which it usually isn't -- we really try hard to keep the
default encoding 'ascii' at all times).

In some sense the safest approach from this POV would be to return
Unicode as soon as it can't be encoded using the global default
encoding. IOW normally this would return Unicode for all names
containing non-ASCII characters.

The problem is of course that while the I/O functions will handle this
fine, *printing* Unicode still doesn't work by default. :-( I can't
wait until we switch everything to Unicode and have encoding on all
streams...

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-07-08 Thread Neil Hodgson
Thomas Heller:

 OTOH, I once had a bug report from a py2exe user who complained that the
 program didn't start when installed in a path with japanese characters
 on it.  I tried this out, the bug existed (and still exists), but I was
 astonished how many programs behaved the same: On a PC with english
 language settings, you cannot start WinZip or Acrobat Reader (to give
 just some examples) on a .zip or .pdf file contained in such a
 directory.

   Much of the time these sorts of bugs don't make themselves too hard
to live with because most non-ASCII names that any user encounters
are still in the user's locale and so get mapped by Windows. It can be
a lot of work supporting wide file names. I have just added wide file
name support to my editor, SciTE, for the second time and am about to
rip it out again as it complicates too much code for too few
beneficiaries. (I want one executable for both Windows NT+ and 9x, so
wide file names has to be a runtime choice leading to maybe 50 new
branches in the code).

   If returning a mixture of unicode and narrow strings from
os.listdir is the right thing to do then maybe it is better for sys.argv
and os.environ to also be mixtures. In patch #1231336 I added parallel
attributes, sys.argvu and os.environu, to hold unicode versions of this
information. The alternative, placing unicode items in the existing
attributes, minimises API size.

   One question here is whether unicode items should be added only
when the element is outside the user's locale (the CP_ACP code page)
or whenever the item is outside ASCII. The former is more similar to
existing behaviour but the latter is safer as it makes it harder to
implicitly treat the data as being in an incorrect encoding.

   Neil


Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-07-07 Thread Thomas Heller
Neil Hodgson writes:

 Guido van Rossum:

 Ah, sigh. I didn't know that os.listdir() behaves differently when the
 argument is Unicode. Does os.listdir(".") really behave differently
 than os.listdir(u".")? 

Yes:
 >>> os.listdir(".")
 ['abc', '']
 >>> os.listdir(u".")
 [u'abc',
 u'\u0417\u0434\u0440\u0430\u0432\u0441\u0442\u0432\u0443\u0439\u0442\u0435']

 Bah! I don't think that's a very good design
 (although I see where it comes from). 

Partly my fault. At the time I was more concerned with making
 functionality possible rather than convenient.

 Promoting only those entries
 that need it seems the right solution -- user code that can't deal
 with the Unicode entries shouldn't be used around directories
 containing unicode -- if it needs to work around unicode it should be
 fixed to support that!

I'm sorry but that's not my opinion.

Code that can't deal with unicode entries is broken, imo.  The
programmer does not know where the user runs this code and what he throws
at it.  I think that this will hide bugs.

When I installed the first game written in Python with pygame on my
daughter's PC it didn't run, simply because there was a font listed in
the registry which contained umlauts somewhere.

OTOH, I once had a bug report from a py2exe user who complained that the
program didn't start when installed in a path with Japanese characters
in it.  I tried this out, the bug existed (and still exists), but I was
astonished how many programs behaved the same: On a PC with English
language settings, you cannot start WinZip or Acrobat Reader (to give
just some examples) on a .zip or .pdf file contained in such a
directory.

OK, I'll work on a patch for that but I'd like to see the opinions
 of the usual unicode guys as this will produce more opportunities for
 UnicodeDecodeError. The modification will probably work in the
 opposite way, asking for all the names in unicode and then attempting
 to convert to the default code page with failures retaining the
 unicode name.

Thomas



Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-07-06 Thread Neil Hodgson
Guido van Rossum:

 Ah, sigh. I didn't know that os.listdir() behaves differently when the
 argument is Unicode. Does os.listdir(".") really behave differently
 than os.listdir(u".")? 

   Yes:
>>> os.listdir(".")
['abc', '']
>>> os.listdir(u".")
[u'abc',
u'\u0417\u0434\u0440\u0430\u0432\u0441\u0442\u0432\u0443\u0439\u0442\u0435']

 Bah! I don't think that's a very good design
 (although I see where it comes from). 

   Partly my fault. At the time I was more concerned with making
functionality possible rather than convenient.

 Promoting only those entries
 that need it seems the right solution -- user code that can't deal
 with the Unicode entries shouldn't be used around directories
 containing unicode -- if it needs to work around unicode it should be
 fixed to support that!

   OK, I'll work on a patch for that but I'd like to see the opinions
of the usual unicode guys as this will produce more opportunities for
UnicodeDecodeError. The modification will probably work in the
opposite way, asking for all the names in unicode and then attempting
to convert to the default code page with failures retaining the
unicode name.
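
The approach outlined above (ask for every name in Unicode, then narrow where possible) might look roughly like the sketch below. It is written for Python 3, so the "narrow" result is bytes; narrow_where_possible and the cp1252 code page are hypothetical stand-ins, not part of any patch mentioned in this thread.

```python
def narrow_where_possible(names, encoding):
    """Return bytes for names the encoding can represent, else keep str."""
    result = []
    for name in names:
        try:
            result.append(name.encode(encoding))
        except UnicodeEncodeError:
            # Failure retains the Unicode name instead of mangling it.
            result.append(name)
    return result


names = ["abc", "\u0417\u0434\u0440\u0430\u0432\u0441\u0442\u0432\u0443\u0439\u0442\u0435"]
mixed = narrow_where_possible(names, "cp1252")
```

Passing "ascii" instead of a code page gives the stricter policy, in which every non-ASCII name stays Unicode.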

   Neil


Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-07-04 Thread Guido van Rossum
 Guido van Rossum:
  Then maybe the code that handles Unicode paths in arguments should be
  fixed rather than adding a module that encapsulates a work-around...

On 7/3/05, Neil Hodgson wrote:
It isn't clear whether you are saying this should be fixed by the
 user or in the library.

I meant the library.

 For a quick example, say someone wrote some
 code for counting lines in a directory:
[deleted]

Ah, sigh. I didn't know that os.listdir() behaves differently when the
argument is Unicode. Does os.listdir(".") really behave differently
than os.listdir(u".")? Bah! I don't think that's a very good design
(although I see where it comes from). Promoting only those entries
that need it seems the right solution -- user code that can't deal
with the Unicode entries shouldn't be used around directories
containing unicode -- if it needs to work around unicode it should be
fixed to support that! Mapping Unicode names to ? seems the
wrong behavior (and doesn't work very well once you try to do anything
with those names except for printing).
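
A small sketch of why the replacement mapping is a dead end, using Python 3 syntax and cp1252 as an assumed local code page: the lossy encoding can be printed, but the original name cannot be recovered from it, so it is useless for opening the file.

```python
name = "\u0417\u0434\u0440\u0430\u0432\u0441\u0442\u0432\u0443\u0439\u0442\u0435"

# errors="replace" substitutes "?" for every character the code page lacks.
lossy = name.encode("cp1252", errors="replace")

# Decoding gives back question marks, not the original file name.
round_tripped = lossy.decode("cp1252")
```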

Face it. Unicode stinks (from the programmer's POV). But we'll have to
live with it. In Python 3.0 I want str and unicode to be the same data
type (like String in Java) and I want a separate data type to hold a
byte array.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-07-04 Thread Thomas Heller
Neil Hodgson writes:

 Thomas Heller:

 OTOH, Python is lacking a lot when you have to handle unicode strings on
 sys.path, in command line arguments, environment variables and maybe
 other places.  

A new patch #1231336 "Add unicode for sys.argv, os.environ,
 os.system" is now in SourceForge. New parallel features sys.argvu and
 os.environu are provided and os.system accepts unicode arguments
 similar to PEP 277. A screenshot showing why the existing features are
 inadequate and the new features an enhancement are at
 http://www.scintilla.org/pyunicode.png
One problem is that when using python -c cmd args, sys.argvu
 includes the cmd but sys.argv does not. They both contain the -c.

Not only that, all the other flags like -O and -E are also in sys.argvu
but not in sys.argv.

os.system was changed to make it easier to add some test cases but
 then that looked like too much trouble. There are far too many
 variants on exec*, spawn* and popen* to write a quick patch for these.

Those are nearly obsoleted by the subprocess module (although I do not
know how that handles unicode).

Thomas



Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-07-04 Thread Neil Hodgson
Thomas Heller:

 Not only that, all the other flags like -O and -E are also in sys.argvu
 but not in sys.argv.

   OK, new patch fixes these and the -c issue.

 Those are nearly obsoleted by the subprocess module (although I do not
 know how that handles unicode).

   It breaks. The argspec is "zzOOiiOzO:CreateProcess".

>>> z = subprocess.Popen(u"cmd /c echo \u0417")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "c:\zed\python\dist\src\lib\subprocess.py", line 600, in __init__
    errread, errwrite)
  File "c:\zed\python\dist\src\lib\subprocess.py", line 791, in _execute_child
    startupinfo)
UnicodeEncodeError: 'ascii' codec can't encode character u'\u0417' in
position 12: ordinal not in range(128)

   Neil


Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-07-03 Thread Guido van Rossum
On 6/30/05, Neil Hodgson wrote:
One benefit I see for the path module is that it makes it easier to
 write code that behaves correctly with unicode paths on Windows.
 Currently, to implement code that may see unicode paths, you must
 first understand that unicode paths may be an issue, then write
 conditional code that uses either a string or unicode string to hold
 paths whenever a new path is created.

Then maybe the code that handles Unicode paths in arguments should be
fixed rather than adding a module that encapsulates a work-around...

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-07-01 Thread Thomas Heller
 Guido van Rossum:

 Whoa! Do we really need a completely different mechanism for doing the
 same stuff we can already do? 


Neil Hodgson writes:

One benefit I see for the path module is that it makes it easier to
 write code that behaves correctly with unicode paths on Windows.
 Currently, to implement code that may see unicode paths, you must
 first understand that unicode paths may be an issue, then write
 conditional code that uses either a string or unicode string to hold
 paths whenever a new path is created.

Indeed.  This would probably handle the cases where you have to
manipulate file paths in code.

OTOH, Python is lacking a lot when you have to handle unicode strings on
sys.path, in command line arguments, environment variables and maybe
other places.  See, for example
http://mail.python.org/pipermail/python-list/2004-December/256969.html

I had started to work on the sys.path unicode issues, but it seems a
considerable rewrite of (not only) Python/import.c is required.  But I
fear the patch http://python.org/sf/1093253 is slowly getting out of
date.

Thomas



Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-06-30 Thread Neil Hodgson
Guido van Rossum:

 Whoa! Do we really need a completely different mechanism for doing the
 same stuff we can already do? 

   One benefit I see for the path module is that it makes it easier to
write code that behaves correctly with unicode paths on Windows.
Currently, to implement code that may see unicode paths, you must
first understand that unicode paths may be an issue, then write
conditional code that uses either a string or unicode string to hold
paths whenever a new path is created.

   Neil


Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-06-28 Thread Guido van Rossum
On 6/27/05, Phillip J. Eby wrote:

 I think the only questions remaining open are where to put it and what to
 call the class.

Whoa! Do we really need a completely different mechanism for doing the
same stuff we can already do? The path module seems mostly useful for
folks coming from Java who are used to the Java Path class. With the
massive duplication of functionality we should also consider what to
recommend for the future: will the old os.path module be deprecated,
or are we going to maintain both alternatives forever? (And what about
all the duplication with the os module itself, like the cwd()
constructor?) Remember TOOWTDI.

 I think we should put it in os.path, such that 'from
 os.path import path' gives you the path class for your platform, and using
 one of the path modules directly (e.g. 'from posixpath import path') gives
 you the specific platform's version.

Aargh! Call it anything except path. Having two things nested inside
each other with the same name is begging for confusion forever. We
have a few of these in the stdlib now (StringIO, UserDict etc.) and
they were MISTAKES.

 This is useful because sometimes you
 need to manipulate paths that are foreign to your current OS.  For example,
 the distutils and other packages sometimes use POSIX paths for input and
 then convert them to local OS paths.  Also, POSIX path objects would be
 useful for creating or parsing the path portion of many kinds of URLs,
 and I have often used functions from posixpath for that myself.

Right. That's why posixpath etc. always exists, not only when os.name
== "posix".

 As for a PEP, I doubt a PEP is really required for something this simple; I
 have never seen anyone say, no, we shouldn't have this in the stdlib.  I
 think it would be more important to write reference documentation and a
 complete test suite.

No, we shouldn't have this in the stdlib.

At least, not without more motivation than "it gets high praise".

 By the way, it also occurs to me that for the sake of subclassability, the
 methods should not return 'path(somestr)' when creating new objects, but
 instead use self.__class__(somestr).

Clearly it needs a PEP.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-06-28 Thread Just van Rossum
Phillip J. Eby wrote:

 At 03:45 PM 6/27/2005 -0500, Skip Montanaro wrote:
 We're getting enough discussion about various aspects of Jason's
 path module that perhaps a PEP is warranted.  All this discussion on
 python-dev is just going to get lost.
 
 AFAICT, the only unresolved issue outstanding is a compromise or
 Pronouncement regarding the atime/ctime/mtime members' datatype. 
 This is assuming, of course, that making the empty path be
 os.curdir doesn't receive any objections, and that nobody strongly
 prefers 'path.fromcwd()' over 'path.cwd()' as the alternate
 constructor name.
 
 Apart from these fairly minor issues, there is a very short to-do
 list, small enough to do an implementation patch in an evening or
 two.  Documentation might take a similar amount of time after that;
 mostly it'll be copy-paste from the existing os.path docs, though.
 
 As for the open issues, if we can't reach some sane compromise about
 atime/ctime/mtime, I'd suggest just providing the stat() method and
 let people use stat().st_mtime et al.  Alternately, I'd be okay with
 creating last_modified(), last_accessed(), and created_on() methods
 that return datetime objects, as long as there's also
 atime()/mtime()/ctime() methods that return timestamps.

My issues with the 'path' module (basically recapping what I've said on
the subject in the past):

  - It inherits from str/unicode, so path objects have many str methods
that make no sense for paths.
  - On OSX, it inherits from str instead of unicode, due to
http://python.org/sf/767645
  - I don't like __div__ overloading for join().
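
To make the first and third objections concrete, here is a small sketch of a str-derived path class. It is not Jason's module: the class body is an assumption written for Python 3, where the division hook is __truediv__ rather than the __div__ of Python 2.

```python
import posixpath


class path(str):
    """Illustrative str subclass; not the real path module."""

    def __truediv__(self, other):
        # "/" overloaded to mean join, the behaviour objected to above.
        return type(self)(posixpath.join(self, other))


p = path("/usr") / "lib"

# Every str method comes along for free, meaningful for a path or not.
titled = p.title()
centered = p.center(12, "*")
```

Calls such as title() and center() are legal here only because of the inheritance; neither returns anything useful as a path.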

Just


Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-06-27 Thread Michael Hoffman
On Sun, 26 Jun 2005, Phillip J. Eby wrote:

 At 08:19 PM 6/26/2005 +0100, Michael Hoffman wrote:
 On Sun, 26 Jun 2005, Phillip J. Eby wrote:

 * drop getcwd(); it makes no sense on a path instance

 Personally I use path.getcwd() as a class method all the time. It
 makes as much sense as fromkeys() does on a dict instance, which is
 technically possible but non-sensical.

 It's also duplication with os.path; I'm -1 on creating a new staticmethod
 for it.

os.getcwd() returns a string, but path.getcwd() returns a new path
object. Almost everything in path is a duplication of os.path--the
difference is that the path methods start and end with path objects.
-- 
Michael Hoffman
European Bioinformatics Institute



Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-06-27 Thread Reinhold Birkenfeld
Michael Hoffman wrote:
 On Sun, 26 Jun 2005, Phillip J. Eby wrote:
 
 At 08:19 PM 6/26/2005 +0100, Michael Hoffman wrote:
 On Sun, 26 Jun 2005, Phillip J. Eby wrote:

 * drop getcwd(); it makes no sense on a path instance

 Personally I use path.getcwd() as a class method all the time. It
 makes as much sense as fromkeys() does on a dict instance, which is
 technically possible but non-sensical.

 It's also duplication with os.path; I'm -1 on creating a new staticmethod
 for it.
 
 os.getcwd() returns a string, but path.getcwd() returns a new path
 object. Almost everything in path is a duplication of os.path--the
 difference is that the path methods start and end with path objects.

+1.

Reinhold

-- 
Mail address is perfectly valid!



Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-06-27 Thread Phillip J. Eby
At 08:20 AM 6/27/2005 +0100, Michael Hoffman wrote:
os.getcwd() returns a string, but path.getcwd() returns a new path
object.

In that case, I'd expect it to be 'path.fromcwd()' or 'path.cwd()'; i.e. a 
constructor classmethod by analogy with 'dict.fromkeys()' or 
'datetime.now()'.  'getcwd()' looks like it's getting a property of a path 
instance, and doesn't match stdlib conventions for constructors.

So, +1 as long as it's called cwd() or something better (i.e. clearer 
and/or more consistent with stdlib constructor conventions).
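
For reference, the constructor-classmethod shape being proposed could be sketched as follows. The class is a hypothetical stand-in for the path class under discussion; only the cwd() name comes from the thread.

```python
import os


class path(str):
    """Illustrative stand-in for the proposed path class."""

    @classmethod
    def cwd(cls):
        # Alternate constructor, in the style of dict.fromkeys() or
        # datetime.now(): called on the class, returns a new instance.
        return cls(os.getcwd())


here = path.cwd()
```

Because cwd() goes through cls, a subclass calling it gets an instance of the subclass.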



Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-06-27 Thread Reinhold Birkenfeld
Phillip J. Eby wrote:
 At 08:20 AM 6/27/2005 +0100, Michael Hoffman wrote:
os.getcwd() returns a string, but path.getcwd() returns a new path
object.
 
 In that case, I'd expect it to be 'path.fromcwd()' or 'path.cwd()'; i.e. a 
 constructor classmethod by analogy with 'dict.fromkeys()' or 
 'datetime.now()'.  'getcwd()' looks like it's getting a property of a path 
 instance, and doesn't match stdlib conventions for constructors.
 
 So, +1 as long as it's called cwd() or something better (i.e. clearer 
 and/or more consistent with stdlib constructor conventions).

You're right. +1 for calling it fromcwd().

With that settled, should I rewrite the module? Should I write a PEP?

Reinhold

-- 
Mail address is perfectly valid!



Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-06-27 Thread Phillip J. Eby
At 05:10 PM 6/27/2005 +0200, Reinhold Birkenfeld wrote:
Phillip J. Eby wrote:
  At 08:20 AM 6/27/2005 +0100, Michael Hoffman wrote:
 os.getcwd() returns a string, but path.getcwd() returns a new path
 object.
 
  In that case, I'd expect it to be 'path.fromcwd()' or 'path.cwd()'; i.e. a
  constructor classmethod by analogy with 'dict.fromkeys()' or
  'datetime.now()'.  'getcwd()' looks like it's getting a property of a path
  instance, and doesn't match stdlib conventions for constructors.
 
  So, +1 as long as it's called cwd() or something better (i.e. clearer
  and/or more consistent with stdlib constructor conventions).

You're right. +1 for calling it fromcwd().

I'm leaning slightly towards .cwd() for symmetry with datetime.now(), but 
not enough to argue about it if nobody has objections to fromcwd().


With that settled, should I rewrite the module? Should I write a PEP?

I think the only questions remaining open are where to put it and what to 
call the class.  I think we should put it in os.path, such that 'from 
os.path import path' gives you the path class for your platform, and using 
one of the path modules directly (e.g. 'from posixpath import path') gives 
you the specific platform's version.  This is useful because sometimes you 
need to manipulate paths that are foreign to your current OS.  For example, 
the distutils and other packages sometimes use POSIX paths for input and 
then convert them to local OS paths.  Also, POSIX path objects would be 
useful for creating or parsing the path portion of many kinds of URLs, 
and I have often used functions from posixpath for that myself.
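
The URL use case can be sketched with the existing posixpath functions. This assumes Python 3 (where the URL helpers live in urllib.parse); sibling_url and the example URL are hypothetical names chosen for illustration.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def sibling_url(url, name):
    """Replace the last path segment of url with name."""
    parts = urlsplit(url)
    # URL paths always use "/", so posixpath is correct on every platform.
    new_path = posixpath.join(posixpath.dirname(parts.path), name)
    return urlunsplit(parts._replace(path=new_path))


result = sibling_url("http://www.python.org/doc/current/index.html", "lib.html")
```

Using os.path here instead would give backslash-based results on Windows, which is the reason for wanting a platform-specific class.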

As for a PEP, I doubt a PEP is really required for something this simple; I 
have never seen anyone say, no, we shouldn't have this in the stdlib.  I 
think it would be more important to write reference documentation and a 
complete test suite.

By the way, it also occurs to me that for the sake of subclassability, the 
methods should not return 'path(somestr)' when creating new objects, but 
instead use self.__class__(somestr).
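
A minimal sketch of the subclassability point, with a hypothetical parent() method and ConfigPath subclass (neither is taken from the actual module):

```python
import posixpath


class path(str):
    def parent(self):
        # self.__class__ instead of the hard-coded name "path", so that
        # subclasses get instances of themselves back.
        return self.__class__(posixpath.dirname(self))


class ConfigPath(path):
    """Hypothetical subclass used only to show the effect."""


p = ConfigPath("/etc/app/settings.ini").parent()
```

With 'return path(...)' in parent(), p would silently degrade to the base class.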



Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-06-27 Thread Trent Mick
 os.getcwd() returns a string, but path.getcwd() returns a new path
 object.
 
 In that case, I'd expect it to be 'path.fromcwd()' or 'path.cwd()'; i.e. a 
 constructor classmethod by analogy with 'dict.fromkeys()' or 
 'datetime.now()'.  'getcwd()' looks like it's getting a property of a path 
 instance, and doesn't match stdlib conventions for constructors.
 
 So, +1 as long as it's called cwd() or something better (i.e. clearer 
 and/or more consistent with stdlib constructor conventions).

What about having it just be the default empty constructor?

    assert path.Path() == os.getcwd() \
        or path.Path() == os.getcwdu()

Dunno if that causes other weirdnesses with the API, though.

Trent

-- 
Trent Mick


Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-06-27 Thread Walter Dörwald
Phillip J. Eby wrote:

 At 09:26 PM 6/26/2005 -0400, Bob Ippolito wrote:

 On Jun 26, 2005, at 8:54 PM, Phillip J. Eby wrote:

 At 12:22 AM 6/27/2005 +0200, Dörwald Walter wrote:

 Phillip J. Eby wrote:

 I'm also not keen on the fact that it makes certain things
 properties whose value can change over time; i.e. ctime/mtime/atime
 and size really shouldn't be properties, but rather methods.

 I think ctime, mtime and atime should be (or return)
 datetime.datetime objects instead of integer timestamps.

 With what timezone?  I don't think that can be done portably and
 unambiguously, so I'm -1 on that.

 That makes no sense, timestamps aren't any better,

 Sure they are, if what you want is a timestamp.  In any case, the  
 most common use case I've seen for mtime and friends is just  
 comparing against a previous value, or the value on another file,  
 so it doesn't actually matter most of the time what the type of the  
 value is.

I find timestamp values to be somewhat opaque. So all things being  
equal, I'd prefer datetime objects.

  and datetime
 objects have no time zone set by default anyway.
 datetime.fromtimestamp(time.time()) gives you the same thing as
 datetime.now().


 In which case, it's also easy enough to get a datetime if you  
 really want one.  I personally would rather do that than complicate  
 the use cases where a datetime isn't really needed.  (i.e. most of  
 the time, at least in my experience)

We should have one uniform way of representing time in Python. IMHO  
datetime objects are the natural choice.
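
Whichever type the path methods end up returning, the conversion both sides refer to is short. A sketch in Python 3, using a temporary file as a stand-in for a real path:

```python
import os
import tempfile
from datetime import datetime

with tempfile.NamedTemporaryFile() as handle:
    # The raw value: seconds since the epoch, handy for comparisons.
    stamp = os.stat(handle.name).st_mtime
    # The same instant as a naive datetime in local time (no tzinfo set).
    modified = datetime.fromtimestamp(stamp)
```

The resulting datetime carries no time zone, which is the ambiguity raised in the objection quoted above.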

Bye,
Walter Dörwald



Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-06-27 Thread Phillip J. Eby
At 08:24 PM 6/27/2005 +0100, Michael Hoffman wrote:
On Mon, 27 Jun 2005, Phillip J. Eby wrote:

  At 08:20 AM 6/27/2005 +0100, Michael Hoffman wrote:
  os.getcwd() returns a string, but path.getcwd() returns a new path
  object.
 
  In that case, I'd expect it to be 'path.fromcwd()' or 'path.cwd()'; i.e. a
  constructor classmethod by analogy with 'dict.fromkeys()' or
  'datetime.now()'.  'getcwd()' looks like it's getting a property of a path
  instance, and doesn't match stdlib conventions for constructors.
 
  So, +1 as long as it's called cwd() or something better (i.e. clearer
  and/or more consistent with stdlib constructor conventions).

+1 on cwd().

-1 on making this the default constructor. Essentially the default
constructor returns a path object that will reflect the CWD at the
time that further instance methods are called.

Only if we make the default argument to path() be os.curdir, which isn't a 
bad idea.


> Unfortunately only some of the methods work on paths created with the
> default constructor:
>
> >>> path().listdir()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
>   File "/usr/lib/python2.4/site-packages/path.py", line 297, in listdir
>     names = os.listdir(self)
> OSError: [Errno 2] No such file or directory: ''

This wouldn't be a problem if the default constructor arg were os.curdir 
(i.e. '.' for most platforms) instead of an empty string.
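
For illustration only (this sketch is not from the thread): a minimal
str-based class showing the two ideas discussed above, a default constructor
argument of os.curdir and a cwd() alternate constructor. The class name Path
and its methods are assumptions rather than Jason Orendorff's actual code, and
the syntax is current Python 3, not the Python 2.4 in use at the time.

```python
import os


class Path(str):
    """Hypothetical str-based path type, for illustration only."""

    def __new__(cls, value=os.curdir):
        # Default to os.curdir ('.' on most platforms) instead of '',
        # so that Path().listdir() has a directory to list.
        return super().__new__(cls, value)

    @classmethod
    def cwd(cls):
        # Alternate constructor, by analogy with dict.fromkeys().
        return cls(os.getcwd())

    def listdir(self):
        # Plain names, exactly as os.listdir() returns them.
        return os.listdir(self)


print(repr(Path()))              # the os.curdir value
print(Path.cwd() == os.getcwd())
print(len(Path().listdir()) >= 0)
```

With this default, Path() names the current directory at the moment a method
is called, while Path.cwd() captures the absolute directory at construction.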


> Is there support to have all of the methods work when the path is the
> empty string? Among other benefits, this would mean that sys.path
> could be turned into useful path objects with a simple list
> comprehension.

Ugh.  sys.path entries are not path objects, nor should they be.  PEP 302 
(implemented in Python 2.3 and up) allows sys.path to contain any strings 
you like, as interpreted by objects in sys.path_hooks.  Programs that 
assume only filesystem paths appear in sys.path will break in the presence 
of PEP 302-sanctioned import hooks.
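
A hedged sketch of the defensive approach this implies (not from the original
message): treat sys.path entries as opaque strings and keep only those that
happen to name existing directories. The helper name directory_entries is
hypothetical.

```python
import os
import sys


def directory_entries(entries=None):
    """Return only the entries that name existing directories.

    An empty string is commonly used in sys.path to mean the current
    directory, so it is mapped to os.curdir before testing.
    """
    if entries is None:
        entries = sys.path
    return [
        e for e in entries
        if isinstance(e, str) and os.path.isdir(e or os.curdir)
    ]


print(directory_entries())
```

Anything filtered out (a zip archive, or a string meant for an import hook) is
left for the import machinery to interpret.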



Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-06-27 Thread Skip Montanaro
We're getting enough discussion about various aspects of Jason's path module
that perhaps a PEP is warranted.  All this discussion on python-dev is just
going to get lost.

Skip


Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-06-27 Thread Phillip J. Eby
At 03:45 PM 6/27/2005 -0500, Skip Montanaro wrote:
> We're getting enough discussion about various aspects of Jason's path module
> that perhaps a PEP is warranted.  All this discussion on python-dev is just
> going to get lost.

AFAICT, the only unresolved issue outstanding is a compromise or 
Pronouncement regarding the atime/ctime/mtime members' datatype.  This is 
assuming, of course, that making the empty path be os.curdir doesn't 
receive any objections, and that nobody strongly prefers 'path.fromcwd()' 
over 'path.cwd()' as the alternate constructor name.

Apart from these fairly minor issues, there is a very short to-do list, 
small enough to do an implementation patch in an evening or 
two.  Documentation might take a similar amount of time after that; mostly 
it'll be copy-paste from the existing os.path docs, though.

As for the open issues, if we can't reach some sane compromise about 
atime/ctime/mtime, I'd suggest just providing the stat() method and let 
people use stat().st_mtime et al.  Alternately, I'd be okay with creating 
last_modified(), last_accessed(), and created_on() methods that return 
datetime objects, as long as there's also atime()/mtime()/ctime() methods 
that return timestamps.
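
To make the compromise concrete, here is a minimal sketch (an illustration,
not the proposed patch) with a timestamp accessor and a datetime accessor side
by side. It uses free functions named mtime and last_modified for brevity; in
the proposal these would be methods on the path object.

```python
import os
from datetime import datetime


def mtime(p):
    """Modification time as a float timestamp (seconds since the epoch)."""
    return os.stat(p).st_mtime


def last_modified(p):
    """Modification time as a naive datetime in local time."""
    return datetime.fromtimestamp(os.stat(p).st_mtime)


print(mtime(os.curdir))
print(last_modified(os.curdir))
```

Both accessors read the same st_mtime field; only the return type differs.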




Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-06-27 Thread Donovan Baarda
On Mon, 2005-06-27 at 14:25, Phillip J. Eby wrote:
[...]
> As for the open issues, if we can't reach some sane compromise about
> atime/ctime/mtime, I'd suggest just providing the stat() method and let
> people use stat().st_mtime et al.  Alternately, I'd be okay with creating
> last_modified(), last_accessed(), and created_on() methods that return
> datetime objects, as long as there's also atime()/mtime()/ctime() methods
> that return timestamps.

+1 for atime/mtime/ctime being timestamps
-1 for redundant duplicates that return DateTimes
+1 for a stat() method (there are lots of other goodies in a stat).

-- 
Donovan Baarda



Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-06-27 Thread Neil Hodgson
Andrew Durdin:

> While we're discussing outstanding issues: In a related discussion of
> the path module on c.l.py, Thomas Heller pointed out that the path
> module doesn't correctly handle unicode paths:
> ...

   Here is a patch that avoids failure when paths can not be
represented in a single 8 bit encoding. It adds a _cwd variable in the
initialisation and then calls this rather than os.getcwd. I sent the
patch to Jason as well.

_base = str
_cwd = os.getcwd
try:
    if os.path.supports_unicode_filenames:
        _base = unicode
        _cwd = os.getcwdu
except AttributeError:
    pass

#...

    def getcwd():
        """ Return the current working directory as a path object. """
        return path(_cwd())

   Neil


Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-06-26 Thread Reinhold Birkenfeld
Phillip J. Eby wrote:
> At 06:57 PM 6/26/2005 +0200, Reinhold Birkenfeld wrote:
>> 1226256:
>> The path module by Jason Orendorff should be in the standard library.
>> http://www.jorendorff.com/articles/python/path/
>> Review: the module is great and seems to have a large user base. On c.l.py
>> there are frequent praises about it.

[...]

> Aside from all these concerns, I'm +1 on adding the module.
>
> Here's my list of suggested changes:

[...]

I agree with your changes list.

One more issue is open: the one of naming. As path is already the name of
a module, what would the new object be called to avoid confusion? pathobj?
objpath? Path?

Reinhold

-- 
Mail address is perfectly valid!



Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-06-26 Thread Michael Hoffman
On Sun, 26 Jun 2005, Phillip J. Eby wrote:

> * drop getcwd(); it makes no sense on a path instance

Personally I use path.getcwd() as a class method all the time. It
makes as much sense as fromkeys() does on a dict instance, which is
technically possible but non-sensical.

> And, assuming these file-content methods are kept:
>
> * path.bytes()         -> path.get_file_bytes()
> * path.write_bytes()   -> path.set_file_bytes() and path.append_file_bytes()
> * path.text()          -> path.get_file_text()
> * path.write_text()    -> path.set_file_text() and path.append_file_text()
> * path.lines()         -> path.get_file_lines()
> * path.write_lines()   -> path.set_file_lines() and path.append_file_lines()

I don't know how often these are used. I don't use them myself. I am
mainly interested in this module so that I don't have to use os.path
anymore.

Reinhold Birkenfeld wrote:

> One more issue is open: the one of naming. As path is already the
> name of a module, what would the new object be called to avoid
> confusion? pathobj?  objpath? Path?

I would argue for Path. It fits with the recent cases of:

from sets import Set
from decimal import Decimal
-- 
Michael Hoffman
European Bioinformatics Institute



Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-06-26 Thread Skip Montanaro
    Phillip> It has many ways to do the same thing, and many of its property
    Phillip> and method names are confusing because they either do the same
    Phillip> thing as a standard function, but have a different name (like
    Phillip> the 'parent' property that is os.path.dirname in disguise), or
    Phillip> they have the same name as a standard function but do something
    Phillip> different (like the 'listdir()' method that returns full paths
    Phillip> rather than just filenames).

To the extent that the path module tries to provide a uniform abstraction
that's not saddled with a particular way of doing things (e.g., the Unix way
or the Windows way), I don't think this is necessarily a bad thing.

Skip


Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-06-26 Thread Dörwald Walter
Phillip J. Eby wrote:

> [...]
> I'm also not keen on the fact that it makes certain things
> properties whose value can change over time; i.e. ctime/mtime/atime
> and size really shouldn't be properties, but rather methods.

I think ctime, mtime and atime should be (or return)  
datetime.datetime objects instead of integer timestamps.

Bye,
Walter Dörwald



Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-06-26 Thread Phillip J. Eby
At 09:00 PM 6/26/2005 +0200, Reinhold Birkenfeld wrote:
> One more issue is open: the one of naming. As path is already the name of
> a module, what would the new object be called to avoid confusion? pathobj?
> objpath? Path?

I was thinking os.Path, myself.



Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-06-26 Thread Phillip J. Eby
At 08:19 PM 6/26/2005 +0100, Michael Hoffman wrote:
> On Sun, 26 Jun 2005, Phillip J. Eby wrote:
>
>> * drop getcwd(); it makes no sense on a path instance
>
> Personally I use path.getcwd() as a class method all the time. It
> makes as much sense as fromkeys() does on a dict instance, which is
> technically possible but non-sensical.

It's also duplication with os.path; I'm -1 on creating a new staticmethod 
for it.


> Reinhold Birkenfeld wrote:
>
>> One more issue is open: the one of naming. As path is already the
>> name of a module, what would the new object be called to avoid
>> confusion? pathobj?  objpath? Path?
>
> I would argue for Path. It fits with the recent cases of:
>
> from sets import Set
> from decimal import Decimal

I like it too, as a class in the os module.



Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-06-26 Thread Phillip J. Eby
At 02:31 PM 6/26/2005 -0500, Skip Montanaro wrote:
>     Phillip> It has many ways to do the same thing, and many of its property
>     Phillip> and method names are confusing because they either do the same
>     Phillip> thing as a standard function, but have a different name (like
>     Phillip> the 'parent' property that is os.path.dirname in disguise), or
>     Phillip> they have the same name as a standard function but do something
>     Phillip> different (like the 'listdir()' method that returns full paths
>     Phillip> rather than just filenames).
>
> To the extent that the path module tries to provide a uniform abstraction
> that's not saddled with a particular way of doing things (e.g., the Unix way
> or the Windows way), I don't think this is necessarily a bad thing.

I'm confused by your statements.  First, I didn't notice the path module 
providing any OS-abstractions that aren't already provided by 
os.path.  Second, using inconsistent and confusing names is pretty much 
always a bad thing.  :)



Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-06-26 Thread Phillip J. Eby
At 12:22 AM 6/27/2005 +0200, Dörwald Walter wrote:
> Phillip J. Eby wrote:
>> [...]
>> I'm also not keen on the fact that it makes certain things
>> properties whose value can change over time; i.e. ctime/mtime/atime
>> and size really shouldn't be properties, but rather methods.
>
> I think ctime, mtime and atime should be (or return)
> datetime.datetime objects instead of integer timestamps.

With what timezone?  I don't think that can be done portably and 
unambiguously, so I'm -1 on that.



Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-06-26 Thread Skip Montanaro

    Walter> I think ctime, mtime and atime should be (or return)
    Walter> datetime.datetime objects instead of integer timestamps.

+1

Skip


Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-06-26 Thread Skip Montanaro

    Phillip> ... but have a different name (like the 'parent' property that
    Phillip> is os.path.dirname in disguise) ...

    Phillip> ... (like the 'listdir()' method that returns full paths rather
    Phillip> than just filenames).

    Skip> To the extent that the path module tries to provide a uniform
    Skip> abstraction that's not saddled with a particular way of doing
    Skip> things (e.g., the Unix way or the Windows way), I don't think this
    Skip> is necessarily a bad thing.

    Phillip> I'm confused by your statements.  First, I didn't notice the
    Phillip> path module providing any OS-abstractions that aren't already
    Phillip> provided by os.path.  Second, using inconsistent and confusing
    Phillip> names is pretty much always a bad thing.  :)

Sorry, let me be more explicit.  dirname is the Unix name for "return the
parent of this path".  In the Windows and Mac OS9 worlds (ignore any
possible Posix compatibility for a moment), my guess would be it's probably
something else.  I suspect listdir gets its "return individual filenames
instead of full paths" behavior from the semantics of the Posix
opendir/readdir/closedir functions.  If it makes more sense to return
strings that represent full paths or new path objects that have been
absolute-ified, then the minor semantic change going from os.listdir() to
the listdir method of Jason's path objects isn't a big problem to me.

Skip



Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-06-26 Thread Bob Ippolito

On Jun 26, 2005, at 8:54 PM, Phillip J. Eby wrote:

> At 12:22 AM 6/27/2005 +0200, Dörwald Walter wrote:
>
>> Phillip J. Eby wrote:
>>
>>> [...]
>>> I'm also not keen on the fact that it makes certain things
>>> properties whose value can change over time; i.e. ctime/mtime/atime
>>> and size really shouldn't be properties, but rather methods.
>>
>> I think ctime, mtime and atime should be (or return)
>> datetime.datetime objects instead of integer timestamps.
>
> With what timezone?  I don't think that can be done portably and
> unambiguously, so I'm -1 on that.

That makes no sense, timestamps aren't any better, and datetime  
objects have no time zone set by default anyway.   
datetime.fromtimestamp(time.time()) gives you the same thing as  
datetime.now().
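
A small sketch of the point about time zones (an illustration, not from the
original message): a datetime built from a timestamp carries no tzinfo unless
one is passed explicitly. It assumes a current Python 3; datetime.timezone
postdates this thread.

```python
import time
from datetime import datetime, timezone

ts = time.time()

# Naive: no tzinfo attached, value is local wall-clock time.
naive_local = datetime.fromtimestamp(ts)

# Aware: the caller has to choose the zone explicitly.
aware_utc = datetime.fromtimestamp(ts, tz=timezone.utc)

print(naive_local.tzinfo)   # None
print(aware_utc.tzinfo is timezone.utc)
```

The naive value is only unambiguous to a reader who already knows which local
zone produced it; the raw timestamp has no such dependency.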

-bob



Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-06-26 Thread Phillip J. Eby
At 08:29 PM 6/26/2005 -0500, Skip Montanaro wrote:

>     Phillip> ... but have a different name (like the 'parent' property that
>     Phillip> is os.path.dirname in disguise) ...
>
>     Phillip> ... (like the 'listdir()' method that returns full paths rather
>     Phillip> than just filenames).
>
>     Skip> To the extent that the path module tries to provide a uniform
>     Skip> abstraction that's not saddled with a particular way of doing
>     Skip> things (e.g., the Unix way or the Windows way), I don't think this
>     Skip> is necessarily a bad thing.
>
>     Phillip> I'm confused by your statements.  First, I didn't notice the
>     Phillip> path module providing any OS-abstractions that aren't already
>     Phillip> provided by os.path.  Second, using inconsistent and confusing
>     Phillip> names is pretty much always a bad thing.  :)
>
> Sorry, let me be more explicit.  dirname is the Unix name for "return the
> parent of this path".  In the Windows and Mac OS9 worlds (ignore any
> possible Posix compatibility for a moment), my guess would be it's probably
> something else.  I suspect listdir gets its "return individual filenames
> instead of full paths" behavior from the semantics of the Posix
> opendir/readdir/closedir functions.  If it makes more sense to return
> strings that represent full paths or new path objects that have been
> absolute-ified, then the minor semantic change going from os.listdir() to
> the listdir method of Jason's path objects isn't a big problem to me.

The semantics aren't the issue; it's fine and indeed quite useful to have a 
method that returns path objects.  I'm just saying it shouldn't be called 
listdir(), since that's confusing when compared to what the existing 
listdir() function does.  If you look at my original post, you'll see I 
suggested it be called 'subpaths()' instead, to help reflect that it 
returns paths, rather than filenames.
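
To show the distinction being drawn (a sketch, not the path module's code):
os.listdir() returns bare names, while a subpaths-style helper returns each
name joined onto the directory. The free function subpaths below is a
hypothetical stand-in for the suggested method.

```python
import os
import tempfile


def subpaths(directory):
    """Hypothetical helper: children of `directory` as joined paths."""
    return [os.path.join(directory, name) for name in os.listdir(directory)]


with tempfile.TemporaryDirectory() as d:
    # Create one empty file so there is something to list.
    open(os.path.join(d, "a.txt"), "w").close()
    print(os.listdir(d))   # bare names
    print(subpaths(d))     # names joined onto d
```

Giving the two behaviours different names keeps code that expects bare
filenames from silently receiving joined paths.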



Re: [Python-Dev] Adding the 'path' module (was Re: Some RFE for review)

2005-06-26 Thread Phillip J. Eby
At 09:26 PM 6/26/2005 -0400, Bob Ippolito wrote:

> On Jun 26, 2005, at 8:54 PM, Phillip J. Eby wrote:
>
>> At 12:22 AM 6/27/2005 +0200, Dörwald Walter wrote:
>>
>>> Phillip J. Eby wrote:
>>>> I'm also not keen on the fact that it makes certain things
>>>> properties whose value can change over time; i.e. ctime/mtime/atime
>>>> and size really shouldn't be properties, but rather methods.
>>>
>>> I think ctime, mtime and atime should be (or return)
>>> datetime.datetime objects instead of integer timestamps.
>>
>> With what timezone?  I don't think that can be done portably and
>> unambiguously, so I'm -1 on that.
>
> That makes no sense, timestamps aren't any better,

Sure they are, if what you want is a timestamp.  In any case, the most 
common use case I've seen for mtime and friends is just comparing against a 
previous value, or the value on another file, so it doesn't actually matter 
most of the time what the type of the value is.


> and datetime
> objects have no time zone set by default anyway.
> datetime.fromtimestamp(time.time()) gives you the same thing as
> datetime.now().

In which case, it's also easy enough to get a datetime if you really want 
one.  I personally would rather do that than complicate the use cases where 
a datetime isn't really needed.  (i.e. most of the time, at least in my 
experience)
