Re: [Python-Dev] Unicode Imports

2006-09-09 Thread Martin v. Löwis

Nick Coghlan schrieb:
 So this is taking something that *already works properly on POSIX
 systems* and making it work on Windows as well.

I doubt it does without side effects. For example, an application that
would go through sys.path, and encode everything with
sys.getfilesystemencoding() currently works, but will break if the patch
is applied and non-mbcs strings are put on sys.path.

Also, what will be the effect on __file__? What value will it have
if the module originates from a sys.path entry that is a non-mbcs
unicode string? I haven't tested the patch, but it looks like
__file__ becomes a unicode string on Windows, and remains a byte
string encoded with the file system encoding elsewhere. That's also
a change in behavior.
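
The consumer pattern described above can be sketched roughly as follows. This is an illustration only, not code from the patch: the helper name `encode_path_entries` is hypothetical, `cp1252` stands in for the Windows-only `mbcs` codec, and the sketch uses Python 3's `str`/`bytes` where Python 2.5 would have `unicode`/`str`.

```python
def encode_path_entries(entries, encoding):
    """Return every path entry as a byte string in the given encoding."""
    encoded = []
    for entry in entries:
        if isinstance(entry, bytes):
            # Already a byte string: pass it through untouched.
            encoded.append(entry)
        else:
            # Raises UnicodeEncodeError if the code page lacks a character.
            encoded.append(entry.encode(encoding))
    return encoded


# Entries limited to the code page encode without trouble.
print(encode_path_entries(["C:\\lib", b"C:\\dlls", "C:\\caf\u00e9"], "cp1252"))

# An entry outside the code page makes the same loop fail.
try:
    encode_path_entries(["C:\\\u4e2d\u6587"], "cp1252")
except UnicodeEncodeError as exc:
    print("cannot encode:", exc.reason)
```

The loop is correct as long as every entry fits the code page; it only starts failing once entries outside the code page become legal on the path.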

Regards,
Martin

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Unicode Imports

2006-09-09 Thread Steve Holden

Martin v. Löwis wrote:
 Nick Coghlan schrieb:
 
So this is taking something that *already works properly on POSIX
systems* and making it work on Windows as well.
 
 
 I doubt it does without side effects. For example, an application that
 would go through sys.path, and encode everything with
 sys.getfilesystemencoding() currently works, but will break if the patch
 is applied and non-mbcs strings are put on sys.path.
 
 Also, what will be the effect on __file__? What value will it have
 if the module originates from a sys.path entry that is a non-mbcs
 unicode string? I haven't tested the patch, but it looks like
 __file__ becomes a unicode string on Windows, and remains a byte
 string encoded with the file system encoding elsewhere. That's also
 a change in behavior.
 
Just to summarise my feeling having read the words of those more 
familiar with the issues than me: it looks like this should be a 2.6 
enhancement if it's included at all. I'd like to see it go in, but there 
do seem to be problems ensuring consistent behaviour across inconsistent 
platforms.

regards
  Steve
-- 
Steve Holden   +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd  http://www.holdenweb.com
Skype: holdenweb   http://holdenweb.blogspot.com
Recent Ramblings http://del.icio.us/steve.holden

Re: [Python-Dev] Unicode Imports

2006-09-09 Thread David Hopwood

Martin v. Löwis wrote:
 Nick Coghlan schrieb:
 
So this is taking something that *already works properly on POSIX
systems* and making it work on Windows as well.
 
 I doubt it does without side effects. For example, an application that
 would go through sys.path, and encode everything with
 sys.getfilesystemencoding() currently works, but will break if the patch
 is applied and non-mbcs strings are put on sys.path.

Huh? It won't break on any path for which it is not already broken.

You seem to be saying "Paths with non-mbcs strings shouldn't work on Windows,
because they haven't worked in the past."

-- 
David Hopwood [EMAIL PROTECTED]




Re: [Python-Dev] Unicode Imports

2006-09-09 Thread Martin v. Löwis

David Hopwood schrieb:
 I doubt it does without side effects. For example, an application that
 would go through sys.path, and encode everything with
 sys.getfilesystemencoding() currently works, but will break if the patch
 is applied and non-mbcs strings are put on sys.path.
 
 Huh? It won't break on any path for which it is not already broken.
 
 You seem to be saying Paths with non-mbcs strings shouldn't work on Windows,
 because they haven't worked in the past.

That's not what I'm saying. I'm saying that it shouldn't work in 2.5.x,
because it didn't in 2.5.0. Changing it in 2.6 is fine, along with the
incompatibilities it causes.

Regards,
Martin


Re: [Python-Dev] Unicode Imports

2006-09-09 Thread Nick Coghlan

David Hopwood wrote:
 Martin v. Löwis wrote:
 Nick Coghlan schrieb:

 So this is taking something that *already works properly on POSIX
 systems* and making it work on Windows as well.
 I doubt it does without side effects. For example, an application that
 would go through sys.path, and encode everything with
 sys.getfilesystemencoding() currently works, but will break if the patch
 is applied and non-mbcs strings are put on sys.path.
 
 Huh? It won't break on any path for which it is not already broken.
 
 You seem to be saying Paths with non-mbcs strings shouldn't work on Windows,
 because they haven't worked in the past.

I think MvL is looking at it from the point of view of consumers of the list 
of strings in sys.path, such as PEP 302 importer and loader objects, and tools 
like module_finder. Currently, the list of values in sys.path is limited to:

1. 8-bit strings
2. Unicode strings containing only characters which can be encoded using the 
default file system encoding

For PEP 302 loaders, it is currently correct for them to take the 8-bit string 
they receive and do path.decode(sys.getfilesystemencoding())
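
A minimal sketch of that decoding step, assuming a hypothetical helper name `path_entry_as_text` and using Python 3's `bytes`/`str` in place of 2.x's `str`/`unicode`. It is not taken from any actual PEP 302 loader.

```python
import sys


def path_entry_as_text(entry, encoding=None):
    """Return a sys.path entry as text, decoding byte strings if needed."""
    if encoding is None:
        # The encoding the interpreter uses for file names.
        encoding = sys.getfilesystemencoding()
    if isinstance(entry, bytes):
        # 8-bit entry: decode it with the file system encoding.
        return entry.decode(encoding)
    # Text entry: nothing to do.
    return entry


print(path_entry_as_text(b"/usr/lib/python", "utf-8"))
print(path_entry_as_text("already text"))
```

The decode direction is safe for entries that were produced by the same encoding; the compatibility concern in this thread is the opposite direction, where text entries are encoded.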

Kristján's patch works nicely for his application because he doesn't have to 
worry about compatibility with existing loaders and utilities. The core 
doesn't have that luxury.

We *might* be able to find a backwards compatible way to do it that could be 
put into 2.5.x, but that is effort that could more profitably be spent 
elsewhere, particularly since the state of the import system in Py3k will be 
for it to be based entirely on Unicode (as GvR pointed out last time this 
topic came up [1]).

Cheers,
Nick.

[1] http://mail.python.org/pipermail/python-dev/2006-June/066225.html



-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org

Re: [Python-Dev] Unicode Imports

2006-09-09 Thread Martin v. Löwis

Nick Coghlan schrieb:
 I think MvL is looking at it from the point of view of consumers of the list 
 of strings in sys.path, such as PEP 302 importer and loader objects, and 
 tools 
 like module_finder. Currently, the list of values in sys.path is limited to:

That, and all kinds of inspection tools. For example, when __file__ of a
module object changes to be a Unicode string (which it does under the
proposed patch), then these tools break. They currently don't break in
that way because putting arbitrary Unicode strings on sys.path doesn't
work in the first place.

Regards,
Martin

Re: [Python-Dev] Unicode Imports

2006-09-09 Thread David Hopwood

Nick Coghlan wrote:
 David Hopwood wrote:
 Martin v. Löwis wrote:
 Nick Coghlan schrieb:

 So this is taking something that *already works properly on POSIX
 systems* and making it work on Windows as well.

 I doubt it does without side effects. For example, an application that
 would go through sys.path, and encode everything with
 sys.getfilesystemencoding() currently works, but will break if the patch
 is applied and non-mbcs strings are put on sys.path.

 Huh? It won't break on any path for which it is not already broken.

 You seem to be saying Paths with non-mbcs strings shouldn't work on
 Windows, because they haven't worked in the past.
 
 I think MvL is looking at it from the point of view of consumers of the
 list of strings in sys.path, such as PEP 302 importer and loader
 objects, and tools like module_finder. Currently, the list of values in
 sys.path is limited to:
 
 1. 8-bit strings
 2. Unicode strings containing only characters which can be encoded using
 the default file system encoding

On Windows, file system pathnames can contain arbitrary Unicode characters
(well, almost). Despite the existence of ANSI filesystem APIs, and
regardless of what 'sys.getfilesystemencoding()' returns, the underlying
file system encoding for NTFS and FAT filesystems is UTF-16LE.

Thus, either:
 - the fact that sys.getfilesystemencoding() returns a non-Unicode encoding
   on Windows is a bug, or
 - any program that relies on sys.getfilesystemencoding() being able to
   encode arbitrary Windows pathnames has a bug.

We need to decide which of these is the case.
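
The gap between the two views can be illustrated with a small sketch. The helper name `can_encode` is hypothetical, and `cp1252` is used as a stand-in for whatever ANSI code page `mbcs` maps to on a given Windows machine.

```python
def can_encode(name, encoding):
    """Report whether a file name is representable in the given encoding."""
    try:
        name.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


name = "\u4e2d\u6587.py"

# UTF-16LE, the form used by the wide-character file APIs, round-trips the name.
assert name.encode("utf-16-le").decode("utf-16-le") == name

print("utf-16-le:", can_encode(name, "utf-16-le"))
print("cp1252:   ", can_encode(name, "cp1252"))
```

A name the file system stores without difficulty may therefore have no byte-string form at all under the ANSI code page.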

-- 
David Hopwood [EMAIL PROTECTED]




Re: [Python-Dev] Unicode Imports

2006-09-09 Thread Martin v. Löwis

David Hopwood schrieb:
 On Windows, file system pathnames can contain arbitrary Unicode characters
 (well, almost). Despite the existence of ANSI filesystem APIs, and
 regardless of what 'sys.getfilesystemencoding()' returns, the underlying
 file system encoding for NTFS and FAT filesystems is UTF-16LE.
 
 Thus, either:
  - the fact that sys.getfilesystemencoding() returns a non-Unicode encoding
on Windows is a bug, or
  - any program that relies on sys.getfilesystemencoding() being able to
encode arbitrary Windows pathnames has a bug.
 
 We need to decide which of these is the case.

There is a third option:
- the operating system has a bug

It is actually this option that rules out the other two.
sys.getfilesystemencoding() returns mbcs on Windows, which means
CP_ACP. The file system encoding is an encoding that converts a
file name into a byte string. Unfortunately, on Windows, there are
file names which cannot be converted into a byte string in a standard
manner. This is an operating system bug (or mis-design; they should
have chosen UTF-8 as the byte encoding of file names, instead of
making it depend on the system locale, but they of course did so
for backwards compatibility with Windows 3.1 and 9x).
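
To make the locale dependence concrete, here is a rough sketch that encodes one file name under several code pages. The helper name `encode_under_code_pages` is hypothetical, and the chosen code pages are only examples of what CP_ACP might be on different systems.

```python
def encode_under_code_pages(name, code_pages):
    """Map each code page to the encoded bytes, or None if it cannot encode name."""
    results = {}
    for code_page in code_pages:
        try:
            results[code_page] = name.encode(code_page)
        except UnicodeEncodeError:
            # This code page has no byte form for the name.
            results[code_page] = None
    return results


# A Japanese file name: representable under a Japanese locale, not a Western one.
for code_page, raw in encode_under_code_pages(
        "\u65e5\u672c.py", ["cp1252", "cp932", "utf-8"]).items():
    print(code_page, raw)
```

The same name yields different bytes, or no bytes at all, depending on the code page, whereas UTF-8 always produces a byte string.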

As a side note: every encoding in Python is a Unicode encoding;
so there aren't any non-Unicode encodings.

Programs that rely on sys.getfilesystemencoding() being able to
represent arbitrary file names on Windows might have a bug;
programs that rely on sys.getfilesystemencoding() being able
to encode all elements of sys.path do not (at least not for
Python 2.5 and earlier).

Regards,
Martin


Re: [Python-Dev] Unicode Imports

2006-09-09 Thread David Hopwood

Martin v. Löwis wrote:
 David Hopwood schrieb:
 
On Windows, file system pathnames can contain arbitrary Unicode characters
(well, almost). Despite the existence of ANSI filesystem APIs, and
regardless of what 'sys.getfilesystemencoding()' returns, the underlying
file system encoding for NTFS and FAT filesystems is UTF-16LE.

Thus, either:
 - the fact that sys.getfilesystemencoding() returns a non-Unicode encoding
   on Windows is a bug, or
 - any program that relies on sys.getfilesystemencoding() being able to
   encode arbitrary Windows pathnames has a bug.

We need to decide which of these is the case.
 
 There is a third option:
 - the operating system has a bug

This behaviour is by design. If it is a bug, then it is a "won't ever fix --
no way, no how" bug, that Python must accommodate if it is to properly support
Unicode on Windows.

 It is actually this option that rules out the other two.
 sys.getfilesystemencoding() returns mbcs on Windows, which means
 CP_ACP. The file system encoding is an encoding that converts a
 file name into a byte string. Unfortunately, on Windows, there are
 file names which cannot be converted into a byte string in a standard
 manner. This is an operating system bug (or mis-design; they should
 have chosen UTF-8 as the byte encoding of file names, instead of
 making it depend on the system locale, but they of course did so
 for backwards compatibility with Windows 3.1 and 9x).

Although UTF-8 was invented (in September 1992) technically before the release
of the first version of NT supporting NTFS (NT 3.1 in July 1993), it had not
been invented before the decision to use Unicode in NTFS, or in Windows NT's
file APIs, had been made.

(I believe OS/2 HPFS had not supported Unicode, even though NTFS was otherwise
almost identical to it.)

At that time, the decision to use Unicode at all was quite forward-looking;
the final version of Unicode 1.0 had only been published in June 1992
(although it had been approved earlier; see http://www.unicode.org/history/).

UTF-8 was only officially added to the Unicode standard in an appendix of
Unicode 2.0 (published July 1996), and only given essentially equal status to
UTF-16 and UTF-32 in Unicode 3.0 (September 1999).

 As a side note: every encoding in Python is a Unicode encoding;
 so there aren't any non-Unicode encodings.

It was clear from context that I meant "encoding capable of representing
all Unicode characters".

 Programs that rely on sys.getfilesystemencoding() being able to
 represent arbitrary file names on Windows might have a bug;
 programs that rely on sys.getfilesystemencoding() being able
 to encode all elements of sys.path do not (at least not for
 Python 2.5 and earlier).

Elements of sys.path can be Unicode strings in Python 2.5, and should be
pathnames supported by the underlying OS. Where is it documented that there
is any further restriction on them? And why should there be any further
restriction on them?

-- 
David Hopwood [EMAIL PROTECTED]




Re: [Python-Dev] Unicode Imports

2006-09-09 Thread Martin v. Löwis

David Hopwood schrieb:
 Elements of sys.path can be Unicode strings in Python 2.5, and should be
 pathnames supported by the underlying OS. Where is it documented that there
 is any further restriction on them? And why should there be any further
 restriction on them?

It's not documented in that detail; if people think it should be
documented more thoroughly, that should be done (contributions are
welcome). Changing the import machinery to deal with Unicode strings
differently cannot be done for Python 2.5, though: it cannot be done
for 2.5.0 as the release candidate has already been published, and there
is no acceptable patch available at this moment. It cannot be added
to 2.5.x as it may reasonably break existing applications.

Regards,
Martin


Re: [Python-Dev] Unicode Imports

2006-09-09 Thread Nick Coghlan

David Hopwood wrote:
 Martin v. Löwis wrote:
 Programs that rely on sys.getfilesystemencoding() being able to
 represent arbitrary file names on Windows might have a bug;
 programs that rely on sys.getfilesystemencoding() being able
 to encode all elements of sys.path do not (at least not for
 Python 2.5 and earlier).
 
 Elements of sys.path can be Unicode strings in Python 2.5, and should be
 pathnames supported by the underlying OS. Where is it documented that there
 is any further restriction on them? And why should there be any further
 restriction on them?

There's no suggestion that this limitation shouldn't be fixed - merely that 
fixing it is likely to break some applications which rely on sys.path for 
importing or introspection purposes. A 2.5.x maintenance release typically 
shouldn't break anything that worked correctly on 2.5.0, hence fixing this 
becomes a project for either 2.6 or 3.0.

To put it another way: fixing this is likely to require changes to more than 
just the interpreter core. It will also potentially require changes to all 
applications which currently expect to be able to use 
's.encode(sys.getfilesystemencoding())' to convert any Unicode path entry or 
__file__ attribute to an 8-bit string.
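
One way an application might check whether it depends on that assumption is sketched below. The function names `find_unencodable` and `audit_import_state` are hypothetical, and the code is written for Python 3, where path entries and `__file__` are normally `str`.

```python
import sys


def find_unencodable(values, encoding):
    """Return the text values that cannot be encoded with the given encoding."""
    bad = []
    for value in values:
        if not isinstance(value, str):
            # Skip byte strings, None and other non-text entries.
            continue
        try:
            value.encode(encoding)
        except UnicodeEncodeError:
            bad.append(value)
    return bad


def audit_import_state(encoding):
    """Check sys.path and the __file__ of loaded modules in one pass."""
    files = [getattr(module, "__file__", None)
             for module in list(sys.modules.values())]
    return find_unencodable(list(sys.path) + files, encoding)


# Demo on a literal list; "cp1252" is a stand-in for the mbcs code page.
print(find_unencodable(["C:\\lib", "C:\\\u4e2d\u6587", None], "cp1252"))
```

Calling `audit_import_state(sys.getfilesystemencoding())` would list the entries for which the `s.encode(...)` idiom fails in the running interpreter.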

Doing that qualifies as correcting a language design error or limitation, but 
it would require a real stretch of the definition to qualify as a bug fix.

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org

Re: [Python-Dev] Unicode Imports

2006-09-08 Thread Steve Holden

Anthony Baxter wrote:
 On Friday 08 September 2006 02:56, Kristján V. Jónsson wrote:
 
Hello All.
I just added patch 1552880 to sourceforge.  It is a patch for 2.6 (and 2.5)
which allows unicode paths in sys.path and uses the unicode file api on
windows. This is tried and tested on 2.5, and backported to 2.3 and is
currently running on clients in China and elsewhere.  It is minimally
intrusive to the importing mechanism, at the cost of some string conversion
overhead (to utf8 and then back to unicode).
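
The thread does not show how the patch performs this conversion, so the following is only a sketch of why UTF-8 works as a lossless carrier for a Unicode path through byte-oriented code. The function names `to_carrier` and `from_carrier` are hypothetical.

```python
def to_carrier(path):
    """Encode a text path as UTF-8 so byte-oriented code can pass it along."""
    return path.encode("utf-8")


def from_carrier(raw):
    """Recover the original text path before calling a wide-character API."""
    return raw.decode("utf-8")


path = "C:\\\u4e2d\u6587\\lib"
raw = to_carrier(path)

# The round trip is lossless for any well-formed text path.
assert from_carrier(raw) == path

# UTF-8 contains no zero bytes for such a path, so it survives
# NUL-terminated C strings; UTF-16LE does not have that property.
assert b"\x00" not in raw
assert b"\x00" in path.encode("utf-16-le")
print(raw)
```

The cost mentioned in the message is the pair of conversions on each lookup, which this sketch makes explicit.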
 
 
 As this can't be considered a bugfix (that I can see), I'd be against it 
 being 
 checked into 2.5. 
 
Are you suggesting that Python's inability to correctly handle Unicode 
path elements isn't a bug? Or simply that this inability isn't currently 
described in a bug report on Sourceforge?

I agree it's a relatively large patch for a release candidate but if 
prudence suggests deferring it, it should be a *definite* for 2.5.1 and 
subsequent releases.

regards
  Steve
-- 
Steve Holden   +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd  http://www.holdenweb.com
Skype: holdenweb   http://holdenweb.blogspot.com
Recent Ramblings http://del.icio.us/steve.holden


Re: [Python-Dev] Unicode Imports

2006-09-08 Thread Anthony Baxter

On Friday 08 September 2006 18:24, Steve Holden wrote:
  As this can't be considered a bugfix (that I can see), I'd be against it
  being checked into 2.5.

 Are you suggesting that Python's inability to correctly handle Unicode
 path elements isn't a bug? Or simply that this inability isn't currently
 described in a bug report on Sourceforge?

I'm suggesting that adding the ability to handle unicode paths is a *new* 
*feature*.

If people actually want to see 2.5 final ever released, they're going to have 
to accept that "oh, but just this _one_ _more_ _thing_" is not going to fly.

We're _well_ past beta1, where new features should have been added. At this 
point, we have to cut another release candidate. This is far too much to add 
during the release candidate stage.

 I agree it's a relatively large patch for a release candidate but if
 prudence suggests deferring it, it should be a *definite* for 2.5.1 and
 subsequent releases.

Possibly. I remain unconvinced. 

-- 
Anthony Baxter [EMAIL PROTECTED]
It's never too late to have a happy childhood.

Re: [Python-Dev] Unicode Imports

2006-09-08 Thread Steve Holden

Anthony Baxter wrote:
 On Friday 08 September 2006 18:24, Steve Holden wrote:
 
As this can't be considered a bugfix (that I can see), I'd be against it
being checked into 2.5.

Are you suggesting that Python's inability to correctly handle Unicode
path elements isn't a bug? Or simply that this inability isn't currently
described in a bug report on Sourceforge?
 
 I'm suggesting that adding the ability to handle unicode paths is a *new* 
 *feature*.
 
That's certainly true.

 If people actually want to see 2.5 final ever released, they're going to have 
 to accept that oh, but just this _one_ _more_ _thing_ is not going to fly.
 
 We're _well_ past beta1, where new features should have been added. At this 
 point, we have to cut another release candidate. This is far too much to add 
 during the release candidate stage.
 
Right. I couldn't argue for putting this in to 2.5 - it would certainly 
represent unwarranted feature creep at the rc2 stage.
 
I agree it's a relatively large patch for a release candidate but if
prudence suggests deferring it, it should be a *definite* for 2.5.1 and
subsequent releases.
 
 
 Possibly. I remain unconvinced. 
 

But it *is* a desirable, albeit new, feature, so I'm surprised that you 
don't appear to perceive it as such for a downstream release.

regards
  Steve
-- 
Steve Holden   +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd  http://www.holdenweb.com
Skype: holdenweb   http://holdenweb.blogspot.com
Recent Ramblings http://del.icio.us/steve.holden


Re: [Python-Dev] Unicode Imports

2006-09-08 Thread Nick Coghlan

Steve Holden wrote:
 Anthony Baxter wrote:
 On Friday 08 September 2006 18:24, Steve Holden wrote:
 I agree it's a relatively large patch for a release candidate but if
 prudence suggests deferring it, it should be a *definite* for 2.5.1 and
 subsequent releases.

 Possibly. I remain unconvinced. 

 
 But it *is* a desirable, albeit new, feature, so I'm surprised that you 
 don't appear to perceive it as such for a downstream release.

And unlike 2.2's True/False problem, it is an *environmental* feature, rather 
than a programmatic one.

So while it's a new feature, it would merely mean that 2.5.1 works correctly 
in more environments than 2.5.

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org

Re: [Python-Dev] Unicode Imports

2006-09-08 Thread Anthony Baxter

On Friday 08 September 2006 19:19, Steve Holden wrote:
 But it *is* a desirable, albeit new, feature, so I'm surprised that you
 don't appear to perceive it as such for a downstream release.

Point releases (2.x.1 and suchlike) are absolutely not for new features. 
They're for bugfixes, only. It's possible that this could be considered a 
bugfix, but as I said right now I'm dubious.

Anthony
-- 
Anthony Baxter [EMAIL PROTECTED]
It's never too late to have a happy childhood.

Re: [Python-Dev] Unicode Imports

2006-09-08 Thread Steve Holden

Anthony Baxter wrote:
 On Friday 08 September 2006 19:19, Steve Holden wrote:
 
But it *is* a desirable, albeit new, feature, so I'm surprised that you
don't appear to perceive it as such for a downstream release.
 
 
 Point releases (2.x.1 and suchlike) are absolutely not for new features. 
 They're for bugfixes, only. It's possible that this could be considered a 
 bugfix, but as I said right now I'm dubious.
 
OK, in that case I'm going to argue that the current behaviour is buggy.

I suppose your point is that, assuming the patch is correct (and it 
seems the authors are relying on it for production purposes in tens of 
thousands of installations), it doesn't change the behaviour of the 
interpreter in existing cases, and therefore it is providing a new feature.

I don't regard this as the provision of a new feature but as the removal 
of an unnecessary restriction (which I would prefer to call a bug). If 
it was *documented* somewhere that Unicode paths aren't legal I would 
find your arguments more convincing. As things stand new Python users 
would, IMHO, be within their rights to assume that arbitrary directories 
could be added to the path without breakage.

Ultimately, your call, I guess. Would it help if I added "inability to 
import from Unicode directories" as a bug? Or would you prefer to change 
the documentation to state that some directories can't be used as path 
elements <0.3 wink>?

regards
  Steve
-- 
Steve Holden   +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd  http://www.holdenweb.com
Skype: holdenweb   http://holdenweb.blogspot.com
Recent Ramblings http://del.icio.us/steve.holden

Re: [Python-Dev] Unicode Imports

2006-09-08 Thread Guido van Rossum

On 9/8/06, Steve Holden [EMAIL PROTECTED] wrote:
 Anthony Baxter wrote:
  On Friday 08 September 2006 19:19, Steve Holden wrote:
 
 But it *is* a desirable, albeit new, feature, so I'm surprised that you
 don't appear to perceive it as such for a downstream release.
 
 
  Point releases (2.x.1 and suchlike) are absolutely not for new features.
  They're for bugfixes, only. It's possible that this could be considered a
  bugfix, but as I said right now I'm dubious.
 
 OK, in that case I'm going to argue that the current behaviour is buggy.

 I suppose your point is that, assuming the patch is correct (and it
 seems the authors are relying on it for production purposes in tens of
 thousands of installations), it doesn't change the behaviour of the
 interpreter in existing cases, and therefore it is providing a new feature.

 I don't regard this as the provision of a new feature but as the removal
 of an unnecessary restriction (which I would prefer to call a bug). If
 it was *documented* somewhere that Unicode paths aren't legal I would
 find your arguments more convincing. As things stand new Python users
 would, IMHO, be within their rights to assume that arbitrary directories
 could be added to the path without breakage.

 Ultimately, your call, I guess. Would it help if I added inability to
 import from Unicode directories as a bug? Or would you prefer to change
 the documentation to state that some directories can't be used as path
 elements 0.3 wink?

We've all heard the arguments for both sides enough times I think.

IMO it's the call of the release managers. Board members ought to
trust the release managers and not apply undue pressure.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

Re: [Python-Dev] Unicode Imports

2006-09-08 Thread skip


Guido> IMO it's the call of the release managers. Board members ought to
Guido> trust the release managers and not apply undue pressure.

Indeed.  Let's not go whacking people with boards.  The Perl people would
just laugh at us...

Skip

Re: [Python-Dev] Unicode Imports

2006-09-08 Thread Giovanni Bajo

Guido van Rossum [EMAIL PROTECTED] wrote:

 IMO it's the call of the release managers. Board members ought to
 trust the release managers and not apply undue pressure.


+1, but I would love to see a more formal definition of what a bugfix is,
which would reduce the ambiguous cases, and thus reduce the number of times the
release managers are called to pronounce.

Other projects, for instance, describe point releases as open for regression
fixes only, which means that a patch, to be eligible for a point release, must
fix a regression (something which used to work before, and doesn't anymore).

Regressions are important because they affect people wanting to upgrade Python.
If something never worked before (like this unicode path thingie), surely
existing Python users are not affected by the bug (or they have already
workarounds in place), so that NOT having the bug fixed in a point release is
not a problem.

Anyway, I'm not pushing for this specific policy (even if I like it): I'm just
suggesting Release Managers to more formally define what should and what should
not go in a point release.

Giovanni Bajo


Re: [Python-Dev] Unicode Imports

2006-09-08 Thread Raymond Hettinger

Giovanni Bajo wrote:


+1, but I would love to see a more formal definition of what a bugfix is,
which would reduce the ambiguous cases, and thus reduce the number of times the
release managers are called to pronounce.
  


Sorry, that is just a pipe-dream. To some degree, all bug-fixes are new 
features in that there is some behavioral difference, something will now 
work that wouldn't work before. While some cases are clear-cut (such as 
API changes), the ones that are interesting will defy definition and 
need a human judgment call as to whether a given change will help more 
than it hurts. The RMs are also strongly biased against extensive 
patches that haven't had a chance to go through a beta-cycle -- they 
don't want their releases mucked-up.


Raymond






Re: [Python-Dev] Unicode Imports

2006-09-08 Thread M.-A. Lemburg

Kristján V. Jónsson wrote:
 Hello All.
 I just added patch 1552880 to sourceforge.  It is a patch for 2.6 (and 2.5) 
 which allows unicode paths in sys.path and uses the unicode file api on 
 windows.
 This is tried and tested on 2.5, and backported to 2.3 and is currently 
 running on clients in China and elsewhere.  It is minimally intrusive to the 
 importing mechanism, at the cost of some string conversion overhead (to utf8 
 and then back to unicode).

+1 on adding it to Python 2.6.

-0 for Python 2.5.x:

Applications/modules written for Python 2.4 and 2.5 won't be expecting
Unicode strings in sys.path with all the consequences that go with it,
so this is a true change in semantics, not just a nice to have
additional feature or bug fix.

OTOH, those applications will just break in a different place with the
patch applied :-)

-- 
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] Unicode Imports

2006-09-08 Thread Martin v. Löwis

Steve Holden schrieb:
 As this can't be considered a bugfix (that I can see), I'd be against it 
 being 
 checked into 2.5. 

 Are you suggesting that Python's inability to correctly handle Unicode 
 path elements isn't a bug?

Not sure whether Anthony suggests it, but I do.

 Or simply that this inability isn't currently 
 described in a bug report on Sourceforge?

No: sys.path is specified (originally) as containing a list of byte
strings; it was extended to also support path importers (or whatever
that PEP calls them). It was never extended to support Unicode strings.
That other PEP e

 I agree it's a relatively large patch for a release candidate but if 
 prudence suggests deferring it, it should be a *definite* for 2.5.1 and 
 subsequent releases.

I'm not so sure it should. It *is* a new feature: it makes applications
possible which aren't possible today, and the documentation does not
ever suggest that these applications should have been possible. In fact,
it is common knowledge that this currently isn't supported.

Regards,
Martin

Re: [Python-Dev] Unicode Imports

2006-09-08 Thread Martin v. Löwis

Steve Holden schrieb:
 I agree it's a relatively large patch for a release candidate but if
 prudence suggests deferring it, it should be a *definite* for 2.5.1 and
 subsequent releases.

 Possibly. I remain unconvinced. 

 
 But it *is* a desirable, albeit new, feature, so I'm surprised that you 
 don't appear to perceive it as such for a downstream release.

Because 2.5.1 shouldn't include any new features. If it is a new feature
(which it is), it should go into 2.6.

Regards,
Martin

Re: [Python-Dev] Unicode Imports

2006-09-08 Thread Martin v. Löwis

Nick Coghlan schrieb:
 But it *is* a desirable, albeit new, feature, so I'm surprised that you 
 don't appear to perceive it as such for a downstream release.
 
 And unlike 2.2's True/False problem, it is an *environmental* feature, rather 
 than a programmatic one.

Not sure what you mean by that; if you mean that existing applications
cannot break: this is not true. In fact, it seems that some
applications are extremely susceptible to the types of objects on
sys.path. Some applications apparently know exactly what you can and
cannot find on sys.path; changing that might break them.
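
A minimal sketch of the kind of application code this refers to, written in current Python 3 syntax (where `str` is text and `bytes` is a byte string, unlike the 2.x interpreters under discussion). The helper name `encode_entries` and the sample entries are hypothetical, and `cp1252` merely stands in for a Windows ANSI code page:

```python
def encode_entries(entries, encoding):
    """Return every path entry as bytes, encoding text entries strictly."""
    encoded = []
    for entry in entries:
        if isinstance(entry, bytes):
            encoded.append(entry)  # already a byte string: pass through
        else:
            # Strict encoding: raises UnicodeEncodeError for characters
            # the target code page cannot represent.
            encoded.append(entry.encode(encoding))
    return encoded


if __name__ == "__main__":
    print(encode_entries(["c:/tmp", b"c:/lib"], "cp1252"))
    try:
        encode_entries(["c:/tmp/\u814c"], "cp1252")
    except UnicodeEncodeError as exc:
        print("breaks on:", exc.object)
```

Code of this shape works as long as every entry fits the code page, and starts failing once an entry outside the code page appears on the path.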

Regards,
Martin

Re: [Python-Dev] Unicode Imports

2006-09-08 Thread Martin v. Löwis

Steve Holden schrieb:
 I don't regard this as the provision of a new feature but as the removal 
 of an unnecessary restriction (which I would prefer to call a bug).

You got the definition of bug wrong. Primarily, a bug is a deviation
from the specification. Extending the domain of an argument to an
existing function is a new feature.

Regards,
Martin

Re: [Python-Dev] Unicode Imports

2006-09-08 Thread Martin v. Löwis

Giovanni Bajo schrieb:
 +1, but I would love to see a more formal definition of what a bugfix is,
 which would reduce the ambiguous cases, and thus reduce the number of times 
 the
 release managers are called to pronounce.
 
 Other projects, for instance, describe point releases as open for regression
 fixes only, which means that a patch, to be eligible for a point release, 
 must
 fix a regression (something which used to work before, and doesn't anymore).

In Python, the tradition has accepted bug fixes beyond that. For
example, fixing a memory leak would also count as a bug fix.

In general, I think a bug is a deviation from the specification (it
might be necessary to interpret the specification first to find out
whether the implementation deviates). A bug fix is then a behavior
change so that the new behavior follows the specification, or a
specification change so that it correctly describes the behavior.

Regards,
Martin

Re: [Python-Dev] Unicode Imports

2006-09-08 Thread Nick Coghlan

Martin v. Löwis wrote:
 Steve Holden schrieb:
 Or simply that this inability isn't currently 
 described in a bug report on Sourceforge?
 
 No: sys.path is specified (originally) as containing a list of byte
 strings; it was extended to also support path importers (or whatever
 that PEP calls them). It was never extended to support Unicode strings.
 That other PEP e

That other PEP being PEP 302. That said, Unicode strings *are* permitted on 
sys.path - the import system will automatically encode them to an 8-bit string 
using the default filesystem encoding as part of the import process.

This works fine on Unix systems that use UTF-8 encoded strings to handle 
Unicode paths at the C API level, but is screwed on Windows because the 
default mbcs filesystem encoding can't handle the full range of possible 
Unicode path names (such as the Chinese directories that originally gave 
Kristján grief).

To get Unicode path names to work on Windows, you have to use the 
Windows-specific wide character API instead of the normal C API, and the 
import machinery doesn't do that.

So this is taking something that *already works properly on POSIX systems* and 
making it work on Windows as well.
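
The difference can be sketched in a few lines of current Python 3 (not the 2.x code under discussion). The helper name `representable` is hypothetical, and `gbk` and `cp1252` are assumed stand-ins for a Chinese and a Western European ANSI code page:

```python
import sys


def representable(path, encoding=None):
    """True if `path` can be encoded losslessly in `encoding`."""
    if encoding is None:
        encoding = sys.getfilesystemencoding()
    try:
        path.encode(encoding)  # strict error handling by default
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    path = "c:/tmp/\u814c"
    for codec in ("utf-8", "gbk", "cp1252"):
        print(codec, representable(path, codec))
```

UTF-8 can represent any Unicode path, while a single-locale code page only covers the characters of that locale, which is why the same sys.path entry is fine on one system and unusable on another.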

 I agree it's a relatively large patch for a release candidate but if 
 prudence suggests deferring it, it should be a *definite* for 2.5.1 and 
 subsequent releases.
 
 I'm not so sure it should. It *is* a new feature: it makes applications
 possible which aren't possible today, and the documentation does not
 ever suggest that these applications should have been possible. In fact,
 it is common knowledge that this currently isn't supported.

It should already work fine on POSIX filesystems that use the default 
filesystem encoding for path names. As far as I am aware, it is only Windows 
where it doesn't work.

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org

[Python-Dev] Unicode Imports

2006-09-07 Thread Kristján V . Jónsson




Hello All.
I just added patch 1552880 to sourceforge. It is a patch for 2.6 (and 2.5)
which allows unicode paths in sys.path and uses the unicode file api on
windows.
This is tried and tested on 2.5, and backported to 2.3 and is currently
running on clients in china and elsewhere. It is minimally intrusive to the
importing mechanism, at the cost of some string conversion overhead (to utf8
and then back to unicode).

Cheers,
Kristján

Re: [Python-Dev] Unicode Imports

2006-09-07 Thread Anthony Baxter

On Friday 08 September 2006 02:56, Kristján V. Jónsson wrote:
 Hello All.
 I just added patch 1552880 to sourceforge.  It is a patch for 2.6 (and 2.5)
 which allows unicode paths in sys.path and uses the unicode file api on
 windows. This is tried and tested on 2.5, and backported to 2.3 and is
 currently running on clients in china and elsewhere.  It is minimally
 intrusive to the importing mechanism, at the cost of some string conversion
 overhead (to utf8 and then back to unicode).

As this can't be considered a bugfix (that I can see), I'd be against it being 
checked into 2.5. 


Re: [Python-Dev] unicode imports

2006-06-20 Thread Thomas Heller

Martin v. Löwis schrieb:
 Thomas Heller wrote:
 It should be noted that I once started to convert the import machinery
 to be fully unicode aware.  As far as I can tell, a *lot* has to be changed
 to make this work.
 
 Is that code available somewhere still? Does it still work?

Available as patch 1093253; I have not tried whether it still works.
 
 I started with refactoring Python/import.c, but nobody responded to the 
 question
 whether such a refactoring patch would be accepted or not.
 
 I would like to see minimal changes only. I don't see why massive
 refactoring would be necessary: the structure of the code should
 persist - only the data types should change from char* to PyObject*.
 Calls like stat() and open() should be generalized to accept
 PyObject*, and otherwise keep their interface.

To be really useful, wide char versions of other things must also be
made available: command line arguments, environment variables
(PYTHONPATH), and maybe other stuff.

Thomas

Re: [Python-Dev] unicode imports

2006-06-20 Thread Martin v. Löwis

Thomas Heller wrote:
 Is that code available somewhere still? Does it still work?
 
 Available as patch 1093253, I have not tried if it still works

I see. It's quite a huge change, that's probably why nobody found
the time to review it, yet.

 To be really useful, wide char versions of other things must also be
 made available: command line arguments, environment variables
 (PYTHONPATH), and maybe other stuff.

While I think these things should eventually be done, I don't think
they are that related to import.c.

If W9x support gets dropped, we can rewrite PC/getpathp.c to use the
Unicode API throughout; that would allow putting non-ANSI path
names onto PYTHONPATH.

Making os.environ support Unicode is an entirely different issue.
I would like to see os.environ return Unicode if the key is Unicode;
another option would be to introduce os.uenviron.

Regards,
Martin

Re: [Python-Dev] unicode imports

2006-06-19 Thread Kristján V . Jónsson


Ideally, I would like for python to simply work. It seems to me that it is 
only a matter of time before all modern platforms offer unicode filesystems 
and hence unicode APIs.  IMHO, stuff like the importer should really be written 
in native unicode and revert to ASCII only as a fallback for platforms that 
don't support it.  Is WITH_UNICODE ever left undefined these days?

And sure, module names need to be python identifiers (thus ASCII), although I 
wouldn't be surprised if that restriction were lifted in a not too distant 
future :)  After all, we support the utf-8 encoding of source files, but I 
cannot write kristján = 1.  But that's for a future PEP.

Kristján

-Original Message-
From: Nick Coghlan [mailto:[EMAIL PROTECTED] 
Sent: 16. júní 2006 15:30
To: Kristján V. Jónsson
Cc: Python Dev
Subject: Re: [Python-Dev] unicode imports

Kristján V. Jónsson wrote:
 A cursory glance at import.c shows that the import mechanism is fairly 
 complicated, and riddled with char *path thingies, and manual string 
 arithmetic.  Do you have any suggestions on a clean way to unicodify 
 the import mechanism?

Can you install a PEP 302 path hook and importer/loader that can handle path 
entries that are Unicode strings? (I think this would end up being the parallel 
implementation you were talking about, though)

If the code that traverses sys.path and sys.path_hooks is itself 
unicode-unaware (I don't remember if it is or isn't), then you might be able to 
trick it by poking a Unicode-savvy importer directly into the 
path_importer_cache for affected Unicode paths.

One issue is that the package and file names still have to be valid Python 
identifiers, which means ASCII. Unicode would be, at best, permitted only in 
the path entries.

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org

Re: [Python-Dev] unicode imports

2006-06-19 Thread Kristján V . Jónsson

Okay, for specifics which demonstrate the problem.
I have a directory, C:\tmp\腌
In it, there is a file, doo.py
>>> d = os.listdir(u"c:/tmp")[-1]
>>> d
u'\u814c'
>>> d2 = os.listdir(u"c:/tmp/"+d)
>>> d2
[u'doo.py']
>>> p = u"c:/tmp/"+d
>>> p
u'c:/tmp/\u814c'
>>> sys.path.append(p)
>>> import doo
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named doo

>>> p.encode("mbcs")
'c:/tmp/?'
>>> p.encode("gb2312")
'c:/tmp/\xeb\xe7'

Running your example test code gives:
Prefixes: C:\PyDev25 C:\PyDev25
Path: ['c:\\tmp', 'c:\\documents and settings\\kristjan\\my documents\\python',
'C:\\PyDev25\\PCbuild8\\python25.zip', 'C:\\PyDev25\\DLLs', 'C:\\PyDev25\\lib',
'C:\\PyDev25\\lib\\plat-win', 'C:\\PyDev25\\lib\\lib-tk',
'C:\\PyDev25\\PCbuild8', 'C:\\PyDev25', 'C:\\PyDev25\\lib\\site-packages']
Default encoding: ascii
Input encoding: cp850 Output encodings: cp850 cp850

-Original Message-
From: Nick Coghlan [mailto:[EMAIL PROTECTED] 
Sent: 17. júní 2006 04:17
To: Phillip J. Eby
Cc: Kristján V. Jónsson; Python Dev
Subject: Re: [Python-Dev] unicode imports

Phillip J. Eby wrote:
 Actually, you would want to put it in sys.path_hooks, and then 
 instances would be placed in path_importer_cache automatically.  If 
 you are adding it to the path_hooks after the fact, you should simply 
 clear the path_importer_cache.  Simply poking stuff into the 
 path_importer_cache is not a recommended approach.

Oh, I agree - poking it in directly was a desperation measure if the path_hooks 
machinery didn't like Unicode either.
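
The approach Phillip recommends can be sketched in current Python 3 (3.7 or later, for `str.isascii`), where the standard finders already handle Unicode paths, so this only illustrates the PEP 302 protocol rather than fixing the 2.x problem. The names `non_ascii_path_hook` and `install` are hypothetical:

```python
import sys
from importlib.machinery import FileFinder, SourceFileLoader, SOURCE_SUFFIXES


def non_ascii_path_hook(entry):
    """PEP 302 path hook that claims only text entries with non-ASCII characters."""
    if not isinstance(entry, str) or entry.isascii():
        # Raising ImportError tells the import system to try the next hook.
        raise ImportError("entry not handled by this hook")
    # Reuse the stock file-system finder, restricted to .py source files.
    return FileFinder(entry, (SourceFileLoader, SOURCE_SUFFIXES))


def install(hook):
    """Register the hook ahead of the defaults and drop stale cached finders."""
    sys.path_hooks.insert(0, hook)
    sys.path_importer_cache.clear()
```

Clearing `sys.path_importer_cache` after registering matters because finders already cached for a path entry would otherwise keep being used.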

I've since gone and looked, and you may be screwed either way - the standard 
import paths appear to be always put on the system path as encoded 8-bit 
strings, not as Unicode objects.

That said, it also appears that the existing machinery *should* be able to 
handle non-ASCII path items, so long as 'Py_FileSystemDefaultEncoding' is set 
correctly. If it isn't handling it, then there's something else going wrong.

Modules/getpath.c and friends don't encode the results returned by the platform 
APIs, so the strings in

Kristján, can you provide more details on the fault you get when trying to 
import from the path containing the Chinese characters? Specifically:

What is the actual file system path?
What do sys.prefix, sys.exec_prefix and sys.path contain?
What does sys.getdefaultencoding() return?
What do sys.stdin.encoding, sys.stdout.encoding and sys.stderr.encoding say?
What does python -v show?
Does adding the standard lib directories manually to sys.path make any 
difference?
Does setting PYTHONHOME to the appropriate settings make any difference?

Running something like the following would be good:

   import sys
   print "Prefixes:", sys.prefix, sys.exec_prefix
   print "Path:", sys.path
   print "Default encoding:", sys.getdefaultencoding()
   print "Input encoding:", sys.stdin.encoding,
   print "Output encodings:", sys.stdout.encoding, sys.stderr.encoding
   try:
       import string # Make python -v do something interesting
   except ImportError:
       print "Could not find string module"
   sys.path.append(u"stdlib directory name")
   try:
       import string # Make python -v do something interesting
   except ImportError:
       print "Could not find string module"






-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org

Re: [Python-Dev] unicode imports

2006-06-19 Thread Kristján V . Jónsson

Well, my particular test uses  u'c:/tmp/\u814c'
If that cannot be encoded in mbcs, then mbcs isn't useful.
Note that this is both an issue of python being able to run from an arbitrary 
install position, and also the ability of users to import and run scripts from 
any other arbitrary directory.

Kristján

-Original Message-
From: Neil Hodgson [mailto:[EMAIL PROTECTED] 
Sent: 17. júní 2006 04:53
To: Kristján V. Jónsson
Cc: Python Dev
Subject: Re: [Python-Dev] unicode imports

Kristján V. Jónsson:

 Although python has had full unicode support for filenames for a long 
 time on selected platforms (e.g. Windows), there is one glaring 
 deficiency:  It cannot import from paths containing unicode.  I´ve 
 tried creating folders with chinese characters and adding them to path, to no 
 avail.
 The standard install path in chinese distributions can be with a 
 non-ANSI path, and installing an embedded python application there will break 
 it.

   It should be unusual for a Chinese installation to use an install path that 
can not be represented in MBCS. Try encoding the install directory into MBCS 
before adding it to sys.path.

   Neil

Re: [Python-Dev] unicode imports

2006-06-19 Thread Kristján V . Jónsson

I don't have specific information on the machines.  We didn't try very hard to 
get things to work with 2.3 since we simply assumed it would work automatically 
when we upgraded to a more mature 2.4.
I could try to get more info, but it would be 2.3 specific.  Have there been 
any changes since then?

Note that it may not go into program files at all.  Someone may want to install 
his modules in a folder named in the honour of his mother.

Also, I really would like to see a general solution that doesn't assume that 
the path name can somehow be transmuted to an ascii name.  Users are 
unpredictable.  When you have a wide distribution, you come up against all 
kinds of problems.  (Currently we have around 500,000 users in china.) 
Also, relying on some locale settings is not acceptable.  My machine here has 
the icelandic locale.  Yet, I need to be able to set up and use a chinese 
install.  Likewise, many machines in china will have an english locale.  A 
default encoding and locale is essentially an evil hack in our increasingly 
global environment.  We have converted more or less our entire code base to 
unicode because keeping track of encoded strings is simply unworkable in a 
large project.

Funny that no other platforms could benefit from a unicode import path.  Does 
that mean that windows will reign supreme?  Please explain.

Cheers,

Kristján

-Original Message-
From: Martin v. Löwis [mailto:[EMAIL PROTECTED] 
Sent: 17. júní 2006 08:42
To: Kristján V. Jónsson
Cc: Python Dev
Subject: Re: [Python-Dev] unicode imports

Kristján V. Jónsson wrote:
 The standard install path in chinese distributions can be with a 
 non-ANSI path, and installing an embedded python application there 
 will break it.

I very much doubt this. On a Chinese system, the Program Files folder likely 
has a non-*ASCII* name, but it will have a fine *ANSI* name, as the ANSI code 
page on that system should be either 936 (simplified
chinese) or 950 (traditional chinese) - unless the system is misconfigured.

Can you please report what the path is, what the precise name of the operating 
system is, and what the system locale and the system code page are?

 A completely parallel implementation on the sys.path[i] level?

You should also take a look at what the 8.3 name of the path is.
I really cannot believe that the path is unaccessible to DOS programs.

 Are there other platforms beside Windows that would profit from this?

No.

Regards,
Martin

Re: [Python-Dev] unicode imports

2006-06-19 Thread Thomas Heller

It should be noted that I once started to convert the import machinery
to be fully unicode aware.  As far as I can tell, a *lot* has to be changed
to make this work.

I started with refactoring Python/import.c, but nobody responded to the question
whether such a refactoring patch would be accepted or not.

Thomas


Re: [Python-Dev] unicode imports

2006-06-19 Thread M.-A. Lemburg

Thomas Heller wrote:
 It should be noted that I once started to convert the import machinery
 to be fully unicode aware.  As far as I can tell, a *lot* has to be changed
 to make this work.
 
 I started with refactoring Python/import.c, but nobody responded to the 
 question
 whether such a refactoring patch would be accepted or not.

Perhaps someone should start a PEP on this subject ?!
(not me, though :-)

-- 
Marc-Andre Lemburg
eGenix.com


Re: [Python-Dev] unicode imports

2006-06-19 Thread Nick Coghlan

Kristján V. Jónsson wrote:
 Funny that no other platforms could benefit from a unicode import path.
 Does that mean that windows will reign supreme?  Please explain.

As near as I can tell, other platforms use encoded strings with the normal 
(byte-based) posix file API, so the Python interpreter and the file system 
simply need to agree on the encoding (typically utf-8) in order for both 
filesystem access and importing from non-ASCII paths to work.

On Windows, though, most of the file system interaction code has had to be 
updated to use the wide-character API where possible. import.c is one of the 
few holdouts that relies entirely on the byte-based posix API.

If I had to put money on what's currently happening on your test machine, it's 
that import.c is trying to do u'c:/tmp/\u814c'.encode('mbcs'), getting 
'c:/tmp/?' and proceeding to do nothing useful with that path entry. Checking 
the result of sys.getfilesystemencoding() should be able to confirm that.
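
A hedged illustration of that guess in current Python 3. The `mbcs` codec exists only on Windows, so `cp1252` with `errors="replace"` is used here as an assumed stand-in for a lossy ANSI conversion, and `round_trips` is a hypothetical helper name:

```python
path = "c:/tmp/\u814c"

# Lossy conversion: the unrepresentable character collapses to "?".
lossy = path.encode("cp1252", errors="replace")

# A code page that contains the character keeps it intact.
exact = path.encode("gb2312")


def round_trips(text, encoding):
    """True if a replace-mode encode followed by a decode returns `text`."""
    return text.encode(encoding, errors="replace").decode(encoding) == text


if __name__ == "__main__":
    print(lossy, exact)
    print(round_trips(path, "cp1252"), round_trips(path, "gb2312"))
```

Once the directory name has been replaced by "?", the resulting byte path no longer names any real directory, so the entry is silently useless to the importer.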

So it looks like it ain't really gonna work properly on Windows unless 
import.c is rewritten to use the Unicode-aware platform independent IO 
implementation in posixmodule.c.

Until that happens (hopefully by Python 2.6), I like MvL's suggestion - look 
at the 8.3 DOS name on the command prompt and put that into sys.path. ctypes 
and/or pywin32 should let you get at that information programmatically.
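
A rough sketch of the ctypes route, in current Python 3 syntax. `GetShortPathNameW` is the Win32 call for this; the wrapper name `short_path_name` is hypothetical, and on non-Windows platforms the sketch simply returns its argument unchanged:

```python
import sys


def short_path_name(long_path):
    """Return the 8.3 form of an existing path on Windows; no-op elsewhere."""
    if sys.platform != "win32":
        return long_path
    import ctypes
    from ctypes import wintypes

    get_short = ctypes.windll.kernel32.GetShortPathNameW
    get_short.argtypes = [wintypes.LPCWSTR, wintypes.LPWSTR, wintypes.DWORD]
    get_short.restype = wintypes.DWORD

    # First call with no buffer asks for the required size (including NUL).
    size = get_short(long_path, None, 0)
    if size == 0:
        raise ctypes.WinError()
    buf = ctypes.create_unicode_buffer(size)
    if get_short(long_path, buf, size) == 0:
        raise ctypes.WinError()
    return buf.value
```

The returned name could then be appended to sys.path in place of the original directory, assuming the volume actually provides short names for it.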

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org

Re: [Python-Dev] unicode imports

2006-06-19 Thread Kristján V . Jónsson

Wouldn't it be possible then to emulate the unix way?  Simply encode any 
unicode paths to utf-8, process them as normal, and then decode them just prior 
to the actual windows io call?  It would make sense to just use the utf-8 
encoding all the way for all platforms (since it is easy to work with), and 
then convert to the most appropriate encoding for the platform in question right 
at the end, e.g. unicode for windows, mbcs for windows without unicode (win98) 
(which relies on the LC_LOCALE setting) and whatever 8 bit encoding is 
appropriate for the particular unix platform.

Of course, once there, why not do it unicode all the way up to that last point? 
 Unless there are platforms without wchar_t that would make sense.

At any rate, I am trying to find a coding path of least resistance here.  
Regardless of the timeline or acceptance in mainstream python for this feature, 
it is something I will have to patch in for our application.

Cheers,
Kristján

-Original Message-
From: Nick Coghlan [mailto:[EMAIL PROTECTED] 
Sent: 19. júní 2006 13:46
To: Kristján V. Jónsson
Cc: Martin v. Löwis; Python Dev
Subject: Re: [Python-Dev] unicode imports

Kristján V. Jónsson wrote:
 Funny that no other platforms could benefit from a unicode import path.
 Does that mean that windows will reign supreme?  Please explain.

As near as I can tell, other platforms use encoded strings with the normal
(byte-based) posix file API, so the Python interpreter and the file system 
simply need to agree on the encoding (typically utf-8) in order for both 
filesystem access and importing from non-ASCII paths to work.

Re: [Python-Dev] unicode imports

2006-06-19 Thread Guido van Rossum

On 6/16/06, Kristján V. Jónsson [EMAIL PROTECTED] wrote:
 Although python has had full unicode support for filenames for a long time
 on selected platforms (e.g. Windows), there is one glaring deficiency:  It
 cannot import from paths containing unicode.  I´ve tried creating folders
 with chinese characters and adding them to path, to no avail.

I don't know exactly where this discussion is heading at this point,
but I think it's clear that there's a real (though -- yet -- rare)
problem, for which currently only ugly work-arounds exist. I'm not
convinced that it occurs on other platforms than Windows -- everyone
else seems to use UTF-8 for pathnames, while Windows is stuck with
code pages and other crap, and the only reasonable way to access
Unicode pathnames is via the Windows-specific Unicode API (which is
why import is the last place where this isn't easily solved, as the
import machinery is completely 8-bit-based).

Has it been determined yet whether the DOS 8+3 filename cannot be used
as a workaround?

Perhaps it would be good enough to wait for Py3k? That will have pure
Unicode strings and the import machinery will be completely rewritten
anyway. (And I wouldn't be surprised if that rewrite were to use pure
Python code.) Py3k will be released later than Python 2.6, but most
likely before 2.7.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

Re: [Python-Dev] unicode imports

2006-06-19 Thread Martin v. Löwis

Kristján V. Jónsson wrote:
 I don't have specific information on the machines.  We didn´t try
 very hard to get things to work with 2.3 since we simply assumed it
 would work automatically when we upgraded to a more mature 2.4. I
 could try to get more info, but it would be 2.3 specific.  Have there
 been any changes since then?

Not in that respect, no.

 Note that it may not go into program files at all.  Someone may want
 to install his modules in a folder named in the honour of his mother.

It's certainly possible to set this up in a way that it won't work,
on any localized version: just use a path name that isn't supported
in the ANSI code page. However, that should rarely happen: the
name of his mother should still be expressible in the ANSI code
page, if the system is set up correctly.

 Also, I really would like to see a general solution that doesn't
 assume that the path name can somehow be transmuted to an ascii name.

(Please don't say ASCII here. Windows *A APIs are named that way
 because Microsoft Windows has the notion of an ANSI code page,
 which, in turn, is just an indirection to some selected code page
 meant to support the characters of the user's locale.)

 Users are unpredictable.  When you have a wide distribution  , you
 come up against all kinds of problems (Currently we have around
 500.000 users in china.) Also, relying on some locale settings is not
 acceptable.

Sure, but stating that doesn't really help. Code contributions
would help, but that part of Python has been left out of using
the *W API, because it is particularly messy to fix.

 Funny that no other platforms could benefit from a unicode import
 path.  Does that mean that windows will reign supreme?

That is the case, more or less. Or, more precisely:
- On Linux, Solaris, and most other Unices, file names are bytes
  on the system API, and are expected to be encoded in the user's
  locale. So if your locale does not support a character, you
  can't name a file that way, on Unix. There is a trend towards
  using UTF-8 locales, so that the locale contains all Unicode
  characters.
- On Mac OS X, all file names are UTF-8, always (unless the
  user managed to mess it up), so you can have arbitrary
  Unicode file names

That means that the approach of converting a Unicode sys.path
element to the file system encoding will always do the right
thing on Linux and OS X: the file system encoding will be
the locale's encoding on Linux, and will be UTF-8 on OS X.

It's only Windows which has valid file names that cannot
be represented in the current locale.

Regards,
Martin

Re: [Python-Dev] unicode imports

2006-06-19 Thread Martin v. Löwis

Kristján V. Jónsson wrote:
 Wouldn´t it be possible then to emulate the unix way?  Simply encode
 any unicode paths to utf-8, process them as normal, and then decode
 them just prior to the actual windows io call?

That won't work. People also put path names from the ANSI code page
onto sys.path and expect that to work - it always worked, and is
a nearly-complete work-around to put directories with funny characters
onto sys.path. sys.path is a list, so we have little control over
what gets put onto it.

 Of course, once there, why not do it unicode all the way up to that
 last point?  Unless there are platforms without wchar_t that would
 make sense.

Again, we can't really control that. Also, most platforms have no
wchar_t API for file IO. We would have to encode each sys.path
element for each stat() call, which would be quite expensive.

 At any rate, I am trying to find a coding path of least resistance
 here.  Regardless of the timeline or acceptance in mainstream python
 for this feature, it is something I will have to patch in for our
 application.

The path with least resistance should be usage of 8.3 directory names.
The one to implement in future Python versions should be the rewrite
of import.c, to operate on PyObject* instead of char*, and perform
conversion to the native API only just before calling the native API.
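
The design described here, keeping path objects unconverted and converting only at the native call, can be sketched in Python itself rather than in the C of import.c. This uses current Python 3; `to_native` and `is_dir` are hypothetical names, and returning `str` on Windows assumes the wide-character API is used underneath:

```python
import os
import sys


def to_native(path):
    """Convert a text path to what the platform's file API expects."""
    if sys.platform == "win32":
        return path  # wide-character API: keep the text as-is
    return os.fsencode(path)  # byte-based API: use the file system encoding


def is_dir(path):
    """Example of a call site: convert only at the last moment."""
    return os.path.isdir(to_native(path))


if __name__ == "__main__":
    print(is_dir(os.getcwd()))
```

Everything above the call site keeps working with one path type, and the platform difference stays confined to a single conversion function.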

Regards,
Martin

Re: [Python-Dev] unicode imports

2006-06-19 Thread Martin v. Löwis

Thomas Heller wrote:
 It should be noted that I once started to convert the import machinery
 to be fully unicode aware.  As far as I can tell, a *lot* has to be changed
 to make this work.

Is that code available somewhere still? Does it still work?

 I started with refactoring Python/import.c, but nobody responded to the
 question whether such a refactoring patch would be accepted or not.

I would like to see minimal changes only. I don't see why massive
refactoring would be necessary: the structure of the code should
persist - only the data types should change from char* to PyObject*.
Calls like stat() and open() should be generalized to accept
PyObject*, and otherwise keep their interface.

Regards,
Martin

Re: [Python-Dev] unicode imports

2006-06-17 Thread Martin v. Löwis

Kristján V. Jónsson wrote:
 The standard install path in Chinese distributions can be a
 non-ANSI path, and installing an embedded Python application there will
 break it.

I very much doubt this. On a Chinese system, the Program Files folder
likely has a non-*ASCII* name, but it will have a fine *ANSI* name,
as the ANSI code page on that system should be either 936 (Simplified
Chinese) or 950 (Traditional Chinese) - unless the system is
misconfigured.

Can you please report what the path is, what the precise name of the
operating system is, and what the system locale and the system
code page are?

 A completely parallel implementation on the sys.path[i] level?

You should also take a look at what the 8.3 name of the path is.
I really cannot believe that the path is inaccessible to DOS
programs.

 Are there other platforms beside Windows that would profit from this?

No.

Regards,
Martin
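
The distinction drawn above between non-ASCII and non-ANSI names can be checked mechanically. The following is only a sketch for present-day Python 3: the helper is_representable and the sample folder name are illustrative assumptions, and the codec names are those of Python's standard codecs for the code pages mentioned in the message.

```python
def is_representable(path, codepage):
    """Return True if every character of path can be encoded in the given code page."""
    try:
        path.encode(codepage)
    except UnicodeEncodeError:
        return False
    return True


# Hypothetical folder name written with two Chinese characters.
install_dir = "C:\\\u7a0b\u5e8f"

for codepage in ("ascii", "cp1252", "cp936"):
    print(codepage, is_representable(install_dir, codepage))
```

On a system whose ANSI code page is 936 the name is non-ASCII yet perfectly representable; it becomes a problem only when the ANSI code page is something like 1252.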

Re: [Python-Dev] unicode imports

2006-06-17 Thread Martin v. Löwis

Neil Hodgson wrote:
 It should be unusual for a Chinese installation to use an install
 path that can not be represented in MBCS. Try encoding the install
 directory into MBCS before adding it to sys.path.

Indeed. Unfortunately, people apparently install an English version
(because they can get that without paying any license fee), and then
create directory names that can't be represented in the ANSI code
page (which would then be 1252). Still, on such a system, the
target folder for programs should be Program Files.

If people do that, they *should* change the system locale to some
Chinese locale, but being non-admin people, they often don't.

Regards,
Martin

Re: [Python-Dev] unicode imports

2006-06-17 Thread Ronald Oussoren


On 17-jun-2006, at 6:44, Nick Coghlan wrote:

 Bob Ippolito wrote:
 There's a similar issue in that if sys.prefix contains a colon, Python
 is also busted:
 http://python.org/sf/1507224

 Of course, that's not a Windows issue, but it is everywhere else. The
 offending code in that case is Modules/getpath.c,

 Since it has to do with the definition of Py_GetPath as returning a single
 string that is really a DELIM separated list of strings, where DELIM is
 defined by the current platform (';' on Windows, ':' everywhere else), this
 seems more like a platform problem than a Python problem, though - you can't
 have directories containing a colon as an entry in PATH or PYTHONPATH either.
 It's not really Python's fault that the platform defines a legal filename
 character as the delimiter for path entries.

On unix-y systems any character except the NUL byte can be used in a
legal filesystem path, which leaves awfully few characters to use as a
delimiter without risking issues like the one in the bug Bob mentioned.


 The only real alternative I can see is to normalise Py_GetPath to always
 return a ';' delimited list of strings, regardless of platform, and update
 PySys_SetPath accordingly. That'd cause potential compatibility problems for
 embedded interpreters, though.

That wouldn't help: ';' is also a valid character in filenames on
Unix.  Apart from accepting the status quo (which is a perfectly fine
alternative), there seem to be two valid ways to solve this problem:
either define a Py_GetPath2 that returns a Python list or tuple, or
introduce some way of quoting the delimiter. Both would be backward
incompatible.

Ronald
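
The second option mentioned above, quoting the delimiter, might look roughly like the sketch below. It is written in Python purely for illustration (the real change would live in the C code behind Py_GetPath and PySys_SetPath), the backslash escape character is an assumption and not something proposed in the thread, and the function names join_entries and split_entries are hypothetical.

```python
def join_entries(entries, delim=":", esc="\\"):
    """Join path entries into one string, escaping the delimiter and the escape character."""
    quoted = []
    for entry in entries:
        entry = entry.replace(esc, esc + esc).replace(delim, esc + delim)
        quoted.append(entry)
    return delim.join(quoted)


def split_entries(joined, delim=":", esc="\\"):
    """Inverse of join_entries: split on unescaped delimiters and undo the escaping."""
    entries, current, i = [], [], 0
    while i < len(joined):
        ch = joined[i]
        if ch == esc and i + 1 < len(joined):
            current.append(joined[i + 1])  # take the escaped character literally
            i += 2
        elif ch == delim:
            entries.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    entries.append("".join(current))
    return entries


# A prefix containing a colon, as in the bug report referenced above.
entries = ["/opt/py:thon/lib", "/usr/lib/python2.4"]
joined = join_entries(entries)
print(joined.split(":"))      # naive splitting yields three pieces
print(split_entries(joined))  # quoting-aware splitting recovers the two entries
```

Any existing embedder that splits the string on ':' directly would see the escape characters, which is one way to read the remark that this route is backward incompatible.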

[Python-Dev] unicode imports

2006-06-16 Thread Kristján V . Jónsson




Greetings!

Although Python has had full Unicode support for filenames for a long time
on selected platforms (e.g. Windows), there is one glaring deficiency: it
cannot import from paths containing Unicode. I've tried creating folders
with Chinese characters and adding them to the path, to no avail.

The standard install path in Chinese distributions can be a non-ANSI path,
and installing an embedded Python application there will break it. At the
moment this is hindering the installation of EVE in Chinese internet cafés.

A cursory glance at import.c shows that the import mechanism is fairly
complicated, and riddled with "char *path" thingies and manual string
arithmetic. Do you have any suggestions on a clean way to unicodify the
import mechanism?

A completely parallel implementation on the sys.path[i] level?

Are there other platforms besides Windows that would profit from this?

Cheers,

Kristján

Re: [Python-Dev] unicode imports

2006-06-16 Thread Nick Coghlan

Kristján V. Jónsson wrote:
 A cursory glance at import.c shows that the import mechanism is fairly 
 complicated, and riddled with char *path thingies, and manual string 
 arithmetic.  Do you have any suggestions on a clean way to unicodify the 
 import mechanism?

Can you install a PEP 302 path hook and importer/loader that can handle path 
entries that are Unicode strings? (I think this would end up being the 
parallel implementation you were talking about, though)

If the code that traverses sys.path and sys.path_hooks is itself 
unicode-unaware (I don't remember if it is or isn't), then you might be able 
to trick it by poking a Unicode-savvy importer directly into the 
path_importer_cache for affected Unicode paths.

One issue is that the package and file names still have to be valid Python 
identifiers, which means ASCII. Unicode would be, at best, permitted only in 
the path entries.

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org

Re: [Python-Dev] unicode imports

2006-06-16 Thread Phillip J. Eby

At 01:29 AM 6/17/2006 +1000, Nick Coghlan wrote:
 Kristján V. Jónsson wrote:
  A cursory glance at import.c shows that the import mechanism is fairly
  complicated, and riddled with char *path thingies, and manual string
  arithmetic.  Do you have any suggestions on a clean way to unicodify the
  import mechanism?

 Can you install a PEP 302 path hook and importer/loader that can handle path
 entries that are Unicode strings? (I think this would end up being the
 parallel implementation you were talking about, though)

 If the code that traverses sys.path and sys.path_hooks is itself
 unicode-unaware (I don't remember if it is or isn't), then you might be able
 to trick it by poking a Unicode-savvy importer directly into the
 path_importer_cache for affected Unicode paths.

Actually, you would want to put it in sys.path_hooks, and then instances 
would be placed in path_importer_cache automatically.  If you are adding it 
to the path_hooks after the fact, you should simply clear the 
path_importer_cache.  Simply poking stuff into the path_importer_cache is 
not a recommended approach.


 One issue is that the package and file names still have to be valid Python
 identifiers, which means ASCII. Unicode would be, at best, permitted only in
 the path entries.

If I understand the problem correctly, the issue is that if you install 
Python itself to a Unicode directory, you'll be unable to import anything 
from the standard library.  This isn't about module names, it's about the 
places on the path where that stuff goes.

However, if the issue is that the program works, but it puts unicode 
entries on sys.path, I would suggest simply encoding them to strings using 
the platform-appropriate codec.
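
The recommended registration order (add the hook to sys.path_hooks, then clear sys.path_importer_cache) can be sketched as follows. This is a minimal illustration using the importlib machinery of present-day Python 3, not the import.c code discussed in this thread; make_path_hook, install_path_hook and the consulted list are hypothetical names, and the hook simply delegates to the standard FileFinder for entries under a chosen prefix.

```python
import sys
from importlib.machinery import FileFinder, SourceFileLoader, SOURCE_SUFFIXES

# Path entries the hook agreed to handle; kept only so the effect can be observed.
consulted = []


def make_path_hook(prefix):
    """Build a path hook that claims only sys.path entries starting with prefix."""
    def hook(path_entry):
        if not isinstance(path_entry, str) or not path_entry.startswith(prefix):
            # Declining with ImportError lets the remaining hooks try this entry.
            raise ImportError("entry not handled by this hook")
        consulted.append(path_entry)
        return FileFinder(path_entry, (SourceFileLoader, SOURCE_SUFFIXES))
    return hook


def install_path_hook(prefix):
    """Register the hook and drop cached finders so existing entries are re-examined."""
    sys.path_hooks.insert(0, make_path_hook(prefix))
    sys.path_importer_cache.clear()
```

Calling install_path_hook(some_directory) and then appending that directory to sys.path makes later imports from it go through the hook; finders land in sys.path_importer_cache automatically, so nothing needs to be poked into the cache by hand.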


Re: [Python-Dev] unicode imports

2006-06-16 Thread Bob Ippolito


On Jun 16, 2006, at 9:02 AM, Phillip J. Eby wrote:

 At 01:29 AM 6/17/2006 +1000, Nick Coghlan wrote:
 Kristján V. Jónsson wrote:
 A cursory glance at import.c shows that the import mechanism is fairly
 complicated, and riddled with char *path thingies, and manual string
 arithmetic.  Do you have any suggestions on a clean way to unicodify the
 import mechanism?

 Can you install a PEP 302 path hook and importer/loader that can handle path
 entries that are Unicode strings? (I think this would end up being the
 parallel implementation you were talking about, though)

 If the code that traverses sys.path and sys.path_hooks is itself
 unicode-unaware (I don't remember if it is or isn't), then you might be able
 to trick it by poking a Unicode-savvy importer directly into the
 path_importer_cache for affected Unicode paths.

 Actually, you would want to put it in sys.path_hooks, and then instances
 would be placed in path_importer_cache automatically.  If you are adding it
 to the path_hooks after the fact, you should simply clear the
 path_importer_cache.  Simply poking stuff into the path_importer_cache is
 not a recommended approach.

 One issue is that the package and file names still have to be valid Python
 identifiers, which means ASCII. Unicode would be, at best, permitted only in
 the path entries.

 If I understand the problem correctly, the issue is that if you install
 Python itself to a Unicode directory, you'll be unable to import anything
 from the standard library.  This isn't about module names, it's about the
 places on the path where that stuff goes.

There's a similar issue in that if sys.prefix contains a colon,  
Python is also busted:
http://python.org/sf/1507224

Of course, that's not a Windows issue, but it is everywhere else. The  
offending code in that case is Modules/getpath.c, which probably also  
has to change in order to make unicode directories work on Win32  
(though I think there may be a separate win32 implementation of  
getpath).

-bob


Re: [Python-Dev] unicode imports

2006-06-16 Thread Nick Coghlan

Phillip J. Eby wrote:
 Actually, you would want to put it in sys.path_hooks, and then instances 
 would be placed in path_importer_cache automatically.  If you are adding 
 it to the path_hooks after the fact, you should simply clear the 
 path_importer_cache.  Simply poking stuff into the path_importer_cache 
 is not a recommended approach.

Oh, I agree - poking it in directly was a desperation measure if the 
path_hooks machinery didn't like Unicode either.

I've since gone and looked, and you may be screwed either way - the standard 
import paths appear to be always put on the system path as encoded 8-bit 
strings, not as Unicode objects.

That said, it also appears that the existing machinery *should* be able to 
handle non-ASCII path items, so long as 'Py_FileSystemDefaultEncoding' is set 
correctly. If it isn't handling it, then there's something else going wrong.

Modules/getpath.c and friends don't encode the results returned by the 
platform APIs, so the strings in

Kristján, can you provide more details on the fault you get when trying to 
import from the path containing the Chinese characters? Specifically:

What is the actual file system path?
What do sys.prefix, sys.exec_prefix and sys.path contain?
What does sys.getdefaultencoding() return?
What do sys.stdin.encoding, sys.stdout.encoding and sys.stderr.encoding say?
What does python -v show?
Does adding the standard lib directories manually to sys.path make any 
difference?
Does setting PYTHONHOME to the appropriate settings make any difference?

Running something like the following would be good:

   import sys
   print "Prefixes:", sys.prefix, sys.exec_prefix
   print "Path:", sys.path
   print "Default encoding:", sys.getdefaultencoding()
   print "Input encoding:", sys.stdin.encoding
   print "Output encodings:", sys.stdout.encoding, sys.stderr.encoding
   try:
       import string # Make python -v do something interesting
   except ImportError:
       print "Could not find string module"
   sys.path.append(u"stdlib directory name")
   try:
       import string # Make python -v do something interesting
   except ImportError:
       print "Could not find string module"






-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org

Re: [Python-Dev] unicode imports

2006-06-16 Thread Nick Coghlan

Bob Ippolito wrote:
 There's a similar issue in that if sys.prefix contains a colon, Python 
 is also busted:
 http://python.org/sf/1507224
 
 Of course, that's not a Windows issue, but it is everywhere else. The 
 offending code in that case is Modules/getpath.c,

Since it has to do with the definition of Py_GetPath as returning a single 
string that is really a DELIM separated list of strings, where DELIM is 
defined by the current platform (';' on Windows, ':' everywhere else), this 
seems more like a platform problem than a Python problem, though - you can't 
have directories containing a colon as an entry in PATH or PYTHONPATH either. 
It's not really Python's fault that the platform defines a legal filename 
character as the delimiter for path entries.

The only real alternative I can see is to normalise Py_GetPath to always 
return a ';' delimited list of strings, regardless of platform, and update 
PySys_SetPath accordingly. That'd cause potential compatibility problems for 
embedded interpreters, though.

I guess we could create a Py_GetPathEx and a PySys_SetPathEx that accept the
delimiter as an argument, and change the call in pythonrun.c from:

   PySys_SetPath(Py_GetPath())

to:

   PySys_SetPathEx(Py_GetPathEx(';'), ';')

(still an incompatible change, but an easier to manage one since you can 
easily provide different behavior for earlier versions of Python)

 which probably also 
 has to change in order to make unicode directories work on Win32 (though 
 I think there may be a separate win32 implementation of getpath).

There is - PC/getpathp.c

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org
