Re: [Python-3000] [Python-Dev] Reminder: last alphas next Wednesday 07-May-2008

2008-05-06 Thread glyph

On 11:45 pm, [EMAIL PROTECTED] wrote:

I like this, except one issue: I really don't like the .local
directory. I don't see any compelling reason why this needs to be
~/.local/lib/ -- IMO it should just be ~/lib/. There's no need to hide
it from view, especially since the user is expected to manage this
explicitly.


I've previously given a spirited defense of ~/.local on this list ( 
http://mail.python.org/pipermail/python-dev/2008-January/076173.html ) 
among other places.


Briefly, "lib" is not the only directory participating in this 
convention; you've also got the full complement of other stuff that 
might go into an installation like /usr/local.  So, while "lib" might 
annoy me a little, "bin etc games include lib lib32 man sbin share src" 
is going to get ugly pretty fast, especially if this is what comes up in 
Finder or Nautilus or Explorer every time I open a window.  If it's 
going to be a visible directory on the grounds that this is a
Python-specific thing that is explicitly *not* participating in a convention 
with other software, then please call it "~/Python" or something.


Am I the only guy who finds software that insists on visible, fixed 
files in my home directory rude?  vmware, for example, wants a 
"~/vmware" directory, but pretty much every other application I use is 
nice enough to use dotfiles (even cedega, with a
roughly-comparable-to-lib "applications I've installed for you" folder).


Put another way - it's trivial to make ~/.local/lib show up by 
symlinking ~/lib, but you can't make ~/lib disappear, and lots of 
software ends up looking at ~.
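
A minimal sketch of the symlink point, assuming a POSIX system. The function name and the throwaway "home" directory are invented for the illustration and are not part of any proposal in this thread:

```python
import os
import tempfile


def expose_hidden_lib(home):
    """Create <home>/.local/lib and a visible <home>/lib symlink to it."""
    hidden = os.path.join(home, ".local", "lib")
    visible = os.path.join(home, "lib")
    os.makedirs(hidden, exist_ok=True)
    # Only create the link if nothing is already sitting at ~/lib.
    if not os.path.lexists(visible):
        os.symlink(hidden, visible)
    return visible


# Use a temporary directory instead of the real home directory.
demo_home = tempfile.mkdtemp()
link = expose_hidden_lib(demo_home)
print(link, "->", os.readlink(link))
```

Going the other way (hiding an existing ~/lib that other tools expect to find) has no equally simple counterpart, which is the asymmetry being argued.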

___
Python-3000 mailing list
Python-3000@python.org
http://mail.python.org/mailman/listinfo/python-3000
Unsubscribe: 
http://mail.python.org/mailman/options/python-3000/archive%40mail-archive.com


Re: [Python-3000] [Python-Dev] Reminder: last alphas next Wednesday 07-May-2008

2008-05-06 Thread glyph

On 01:55 am, [EMAIL PROTECTED] wrote:

On Thu, May 1, 2008 at 5:03 PM,  <[EMAIL PROTECTED]> wrote:


Hi everybody.  I apologize for writing yet another lengthy screed about 
a simple directory naming issue.  I feel strongly about it but I 
encourage anyone who doesn't to simply skip it.


First, some background: my strong feelings here are actually based on an 
experience I had a long time ago when helping someone with some C++ 
programming homework.  They were baffled because when I helped them the 
programs compiled, but then as soon as they tried it on their own it 
didn't.  The issue was that I had replicated my own autotools-friendly 
directory structure for them (at the time, "~/bin", "~/include", 
"~/lib", "~/etc", and so on managed with GNU stow) onto their machine 
and edited their shell setup to include them appropriately.  But, as 
soon as I was finished, they "cleaned up" the "mess" I had left behind, 
and thereby removed all of their build dependencies.  This was on a 
shared university build server, before the days of linux as a friendly, 
graphical operating system which encouraged you to look even more 
frequently at your home directory, so if anything I suspect the 
likelihood that this is a problem would be worse now.  Since cleaning up 
my own home directory, of course, I find that I appreciate the lack of 
visual noise in Nautilus et al. as well.


Also, while I obviously think all tools should work this way, I think 
that Python in particular will attract an audience who is learning to 
program but not necessarily savvy with arcane nuances of filesystem 
layout, and it would be best if those details were abstracted.


My concern here is for the naive python developer reading installation 
instructions off of a wiki and trying to get started with Twisted 
development.  Seeing a directory created in your home directory (or, as 
the case may be, 3 directories, "bin", "lib", and "include") is a bit of 
a surprise.  They don't actually care where the files in their installed 
library are, as long as they're "installed", and they can import them. 
However, they may care that clicking on the little house icon now shows 
not "Pictures", "Movies", etc, but "lib" (what's a 'lib'?) "bin" (what's 
a bin?  is that like a box where I throw my stuff?) "share" (I put my 
stuff in "share", but it's not shared.  Wait, I'm supposed to put it in 
"Public"?).

Briefly, "lib" is not the only directory participating in this convention;
you've also got the full complement of other stuff that might go into an
installation like /usr/local.  So, while "lib" might annoy me a little, "bin
etc games include lib lib32 man sbin share src" is going to get ugly pretty
fast, especially if this is what comes up in Finder or Nautilus or Explorer
every time I open a window.


Unless I misread the PEP, there's only going to be a lib subdirectory.
Python packages don't put stuff in other places AFAIK.


Python packages, at the very least, frequently put stuff in "bin" (or 
"scripts", I think, on Windows).  Not all Python packages are pure-Python
packages either; setup.py boasts --install-platlib, --install-headers,
--install-data, and --exec-prefix options, which suggests an "include",
"bin", and "share" directory, at least.  I'm sure if I had 
more time to grovel around I'd find one that installed manpages. 
Twisted has some, but apparently setup.py doesn't do anything with them, 
we leave that to the OS packages...
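
To make the "more than just lib" point concrete, here is a hedged sketch using the `sysconfig` module from current Python releases, which postdates this thread. It assumes a POSIX build where the `posix_user` install scheme is available; the helper name `user_layout` and the example path are made up for the illustration:

```python
import sysconfig


def user_layout(userbase):
    """Return where a per-user install would place each kind of file."""
    paths = sysconfig.get_paths(scheme="posix_user", vars={"userbase": userbase})
    wanted = ("purelib", "platlib", "include", "scripts", "data")
    return {key: paths[key] for key in wanted}


# "/home/example/.local" is only an example prefix, not a real account.
layout = user_layout("/home/example/.local")
for key, value in sorted(layout.items()):
    print(f"{key:8} {value}")
```

Scripts, headers and data files each get their own location under the prefix, alongside the library directories.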


Of course, very little of this is handled by the PEP.  But even the 
usage of the name "lib" implies that the PEP is taking some care to be 
compatible with an idiom that goes beyond Python itself here, or at 
least beyond simple Python packages.


Even assuming that no Python library ever wanted to install any of these 
things, there are many Python libraries which are simply wrappers around 
lower-level libraries, and if I want to perform a per-user install of 
one of those, I am going to ./configure --prefix=~/something (and by 
"something", I mean ".local" ;)) and it would be nice to have Python 
living in the same space.  For that matter it'd be nice to get autotools 
and Ruby and PHP and Perl and Emacs (ad nauseam) all looking at ~/.local 
as a mirror of /usr, so that I didn't have to write a bunch of shell 
bootstrap glue to get everything to behave consistently, or learn the 
new, special names for bits of configuration under "~" that are 
different from the ones under /usr/local or /etc.


I replicate a consistent Python development environment with a ton of 
bizarre dependencies across something like 15 different OS installations 
(not to mention a bevy of virtual machines I keep around just for fun), 
so I think about these issues a lot.  Most of these machines are macs 
and linux boxes, but I do my best on Windows too.  FWIW I don't have any 
idea what the right thing to do is on Windows; ".local" doesn't 
particularly make sense, but neither does "lib" in that context. 
There's no rea

Re: [Python-3000] [Python-Dev] Reminder: last alphas next Wednesday 07-May-2008

2008-05-06 Thread glyph


On 03:49 am, [EMAIL PROTECTED] wrote:

I stand corrected on a few points. You've convinced me that ~/lib/ is
wrong. But I still don't like ~/.local/; not in the last place because
it's not any more local than any other dot files or directories. The
"symmetry" with /usr/local/ is pretty weak, and certainly won't help
beginning users.


Why do you say the symmetry is weak?  The name might not be that
evocative, but the main thrust of what I'm saying is that "~/.<something>"
should be an autotools-style directory layout.  The symmetry I suggest
is in exactly that sense; that's what /usr/local is.  I don't actually
care what "<something>" is, except I (and many others) already use "local"
for that value, and the more software that honors it, the better.  GNU stow
(arguably the king of per-user installation management) suggests ~/local 
as an autotools --prefix target; the free desktop project implicitly 
suggests ~/.local (by suggesting ~/.local/share is a place to put the 
same files that would normally be searched for in /usr/share and 
/usr/local/share).  So the word "local" is just floating around in this 
meme space; I don't like the word that much, but I don't see that 
there's a different one which more clearly evokes the concept either.  I 
originally used "~/UNIX" and then ~/.unix, but switched to .local when I 
noticed other folks doing it.  One I've actually seen mentioned a few 
times is "~/.nix-config", which I certainly don't think is any better.


It would help beginning users if ~/.local/bin and ~/.local/lib were 
honored by the system.  I, and other adherents of this idea that it 
would be nice if users could install source without admin privs, have 
been suggesting that to distro guys when I (we) can, and I figure in a 
few years, somebody might bite.  If that happens, it will start being 
*easier* to build stuff from source into a separated location than to 
need root, stomp on the system, and inevitably break some stuff. 
Agitating for ~/Python/Platform/Libraries on $LD_LIBRARY_PATH (or 
equivalent) is a lot harder to do with a straight face.


This is the reason I'm bothering to spill so many pixels on this topic; 
I think it would be great if Python were the first real adopter of this 
convention, and once *one* project has really gone full bore, each 
subsequent one is progressively easier to convince.  However, if you've 
made up your mind on ~/Python, I think I've more than made my case at 
this point, so I'll stop cluttering up the lists :).


(By the way, for what it's worth: I _hate_ the 
bin/lib/etc/man/src/include naming convention mess, but it's a mess 
which is programmatically honored in like a hundred billion lines of 
code.  This is why I want it supported, but hidden ;).)

As a compromise, I'm okay with ~/Python/. I would like to be able to
say that the user explicitly has to set an environment variable in
order to benefit from this feature, just like with $PYTHONPATH and
$PYTHONSTARTUP. But that might defeat the point of making this easy to
use for noobs.


Is there another point?  It seems to me that this change is entirely 
about shared conventions and "works by default" behavior.  If you are 
going to set an environment variable, set PYTHONPATH; it's already much 
more flexible.


~/Python opens up some new problems though, although perhaps they are 
trivially resolved: how should this interoperate with distutils?  'Just 
make "python setup.py --user" do what "python setup.py --prefix 
~/.local" would do' is pretty straightforward, but "~/Python" would need 
a new convention.  Should "~/Python" have a "~/Python/Scripts" directory 
that one could add to $PATH?  A "~/Python/Platform" directory, for 
includes, libraries, other random junk like manpages or HTML docs? 
~/Python/2.6/lib, or ~/Python2.6/lib?


To be fair, a separate, and purpose-designed Python directory layout 
might also make certain things neater.  For example one could support 
parallel installation with Python2.6 (or Python/2.6) by giving each a 
'lib' and 'bin' directory, and always having the scripts in the 2.6/bin 
dir invoke the 2.6 interpreter, rather than having separated space for 
libraries but having to mangle the names of scripts ("twistd8.0-py2.6"). 
I'd still prefer compatibility-by-convention with other tools, 
languages, etc, though.  In the long term, if everyone followed suit on 
~/.local, that would be great.  But I don't want a ~/Python, ~/Java, 
~/Ruby, ~/PHP, ~/Perl, ~/OCaml and ~/Erlang and a $PATH as long as my 
arm just so I can run a few applications without system-installing them.

On OS X I think we should put this somewhere under ~/Library/. Just
put it in a different place than where the Python framework puts its
stuff.


Isn't the whole point that it should be the same place?  Under current 
Python releases, OS X already has this functionality via 
~/Library/Python/2.5/site-packages.


Also, I'd strongly suggest supporting both ~/Library (although the 
existing location seems fine to me) *an

Re: [Python-3000] [Python-Dev] Reminder: last alphas next Wednesday 07-May-2008

2008-05-06 Thread glyph


On 05:53 pm, [EMAIL PROTECTED] wrote:

On May 1, 2008, at 7:54 PM, Barry Warsaw wrote:
Interesting.  I'm of the opposite opinion.  I really don't want
Python dictating to me what my home directory should look like (a dot
file doesn't count because so many tools conspire to hide it from
me).  I guess there's always $PYTHONUSERBASE, but I think I will not
be alone. ;)


Using ~/.local/ for user-managed content doesn't seem right to me at 
all, because it's hidden by default.


I don't understand your reason for saying this.  Terms like "user" and 
"manage" are somewhat vague.  What sort of experience are you hoping to 
provide what sort of user with this convention?  I hope my earlier 
explanations were clear as far as the types of users.


I believe that the management of ~/.local/ is a subtle question.  It 
will largely be "managed" by simply telling distutils to put files 
there; I hope, implicitly.  In my mind there are 2 types of users who 
will be "managing" it - newbies, who don't really know what's going on 
but want "cd mypackage-0.0.1; python setup.py install; python -c 'import 
mypackage'" (or perhaps even "easy_install mypackage") to work, and 
advanced users who want to be able to mix-and-match different versions 
of different packages.  Advanced users might already have a PYTHONPATH 
management (virtual python, virtualenv, combinator, ~/.bashrc hacks, a 
directory full of symlinks) that already works for them, or be 
comfortable with inspecting a hidden directory, so ~/.local isn't a 
problem for them (i.e. us); newbies don't want to see the directory 
until they already know what's going on.

I'd be even happier if there were no default per-user location, but a 
required configuration setting (in the existing distutils config 
locations) in order to enable per-user installation.


If you're happier without this feature, then perhaps your tastes run 
counter to a useful implementation of it :).  Why wouldn't you want it, 
though?  PYTHONPATH still exists; you don't have to use it, personally.
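
For readers who want to see the override mentioned earlier in this message, a small sketch follows. It assumes current CPython, where `site.getuserbase()` honors the `PYTHONUSERBASE` environment variable; the helper name and the example path are invented here. A child interpreter is used because the running interpreter has usually computed its user base already at startup:

```python
import os
import subprocess
import sys


def user_base_with_override(path):
    """Ask a fresh interpreter for its user base with PYTHONUSERBASE set."""
    env = dict(os.environ, PYTHONUSERBASE=path)
    out = subprocess.check_output(
        [sys.executable, "-c", "import site; print(site.getuserbase())"],
        env=env,
    )
    return out.decode().strip()


# Example path only; nothing is created on disk.
print(user_base_with_override("/tmp/alt-user-base"))
```

The default location applies when the variable is unset, so the explicit route and the works-by-default route can coexist.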



Re: [Python-3000] [Python-Dev] Reminder: last alphas next Wednesday 07-May-2008

2008-05-06 Thread glyph


On 3 May, 11:34 pm, [EMAIL PROTECTED] wrote:

On May 3, 2008, at 7:51 AM, [EMAIL PROTECTED] wrote:
Fred asked for a --prefix flag (which is what I was voting on).  I don't
really care what you do by default as long as you give me a way to do it
differently.


What's most interesting (to me) is that no one's commented on my note 
that my preferred approach would be that there's no default at all; 
the location would have to be specified explicitly.  Whether on the 
command line or in the distutils configuration doesn't matter, but 
explicitness should be required.


I thought I responded to it in my initial response, but let me be 
clearer.


First, Skip, I *only* care about the default behavior.  There's already 
a way to do it differently: PYTHONPATH.  So, Fred, I think what you're 
arguing for is to drop this feature entirely.  Or is there some other 
use for a new way to allow users to explicitly add something to 
sys.path, aside from PYTHONPATH?  It seems that it would add more 
complexity and I can't see what the value would be.


As I've said a dozen times in this thread already, the feature I'd like 
to get from a per-user installation location is that 'setup.py install', 
or at least some completely canonical distutils incantation, should 
work, by default, for non-root users; ideally non-administrators on 
windows as well as non-root users on unixish platforms.



Re: [Python-3000] [Python-Dev] Patch for an initial support of bytes filename in Python3

2008-09-30 Thread glyph

On 12:47 am, [EMAIL PROTECTED] wrote:

This is the most sane contribution I've seen so far :).

See attached patch: python3_bytes_filename.patch

Using the patch, you will get:
- open() support bytes
- listdir(unicode) -> only unicode, *skip* invalid filenames
  (as asked by Guido)


Forgive me for being a bit dense, but I couldn't find this hunk in the 
patch.  Do I understand properly that (listdir(bytes) -> bytes)?


If so, this seems basically sane to me, since it provides text behavior 
where possible and allows more sophisticated filesystem wrappers (i.e. 
Twisted's FilePath, Will McGugan's "FS") to do more tricky things, 
separating filenames for display to the user and filenames for exchange 
with the FS.

- remove os.getcwdu()
- create os.getcwdb() -> bytes
- glob.glob() support bytes
- fnmatch.filter() support bytes
- posixpath.join() and posixpath.split() support bytes


It sounds like maybe there should be some 2to3 fixers in here somewhere, 
too?  Not necessarily as part of this patch, but somewhere related?  I 
don't know what they would do, but it does seem quite likely that code 
which was previously correct under 2.6 (using bytes) would suddenly be 
mixing bytes and unicode with these APIs.



Re: [Python-3000] [Python-Dev] Patch for an initial support of bytes filename in Python3

2008-10-01 Thread glyph

On 02:32 pm, [EMAIL PROTECTED] wrote:

On Tue, Sep 30, 2008 at 6:21 AM,  <[EMAIL PROTECTED]> wrote:

On 12:47 am, [EMAIL PROTECTED] wrote:


It sounds like maybe there should be some 2to3 fixers in here somewhere,
too?  Not necessarily as part of this patch, but somewhere related?  I don't
know what they would do, but it does seem quite likely that code which was
previously correct under 2.6 (using bytes) would suddenly be mixing bytes
and unicode with these APIs.


Doesn't seem easy for 2to3 to recognize such cases.


Actually I think I'm wrong.  As far as dealing with glob(), listdir() 
and friends, I suppose that other bytes/text fixers will already have 
had their opportunity to deal with getting the type to be the 
appropriate thing, and if you have glob(<should be bytes>) it will work
as expected in 3.0.  (I am really just 
confirming that I have nothing useful to say here, using too many words 
to do it: at least, I hope that nobody will waste further time thinking 
about it as a result.)

If 2.6 weren't pretty much released already I'd ask to add
os.getcwdb() there, as an alias for os.getcwd(), and add a 2to3 fixer
that converts os.getcwdu() to os.getcwd(), leaves os.getcwd() alone
(benefit of the doubt) and leaves os.getcwdb() alone as well (a strong
indication the user meant to get bytes in the 3.x version of their
code). (Similar to using bytes instead of str in 2.6 even though they
mean the same thing there -- they will be properly separated in 3.x.)


In the absence of a 2.6 getcwdb, perhaps the fixer could just drop the 
"benefit of the doubt" case?  It could always be added to 2.7, and the 
parity release of 2to3 could have a --2.7 switch that would modify the 
behavior of this and other fixers.



Re: [Python-3000] [Python-Dev] Patch for an initial support of bytes filename in Python3

2008-10-01 Thread glyph


On 05:56 pm, [EMAIL PROTECTED] wrote:

On Tue, Sep 30, 2008 at 10:59 AM,  <[EMAIL PROTECTED]> wrote:

On 02:32 pm, [EMAIL PROTECTED] wrote:



In the absence of a 2.6 getcwdb, perhaps the fixer could just drop the
"benefit of the doubt" case?  It could always be added to 2.7, and the
parity release of 2to3 could have a --2.7 switch that would modify the
behavior of this and other fixers.


I'm not sure what you're proposing. *My* proposal is that 2to3 changes
os.getcwdu() calls to os.getcwd() and leaves os.getcwd() calls alone
-- there's no way to tell whether os.getcwdb() would be a better
match, and for portable code, it won't be (since os.getcwdb() is a
Unix-only thing).


My proposal is simply to change getcwd to getcwdb, and getcwdu to 
getcwd.  This preserves whatever bytes/text behavior you are expecting 
from 2.6 into 3.0.  Granted, the fact that unicode is really always the 
right thing to do on Windows complicates things.
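
The renaming proposed here can be sketched as a plain text rewrite. This is only an illustration of the mapping (getcwd to getcwdb, getcwdu to getcwd), written with a regular expression rather than the real 2to3 fixer machinery; the function name `rewrite_getcwd` is hypothetical:

```python
import re

# Matches "os.getcwd(" and "os.getcwdu(", but not "os.getcwdb(".
_GETCWD_CALL = re.compile(r"\bos\.getcwd(u?)\(")


def rewrite_getcwd(source):
    """Map 2.x getcwd -> getcwdb and getcwdu -> getcwd in one pass."""
    def swap(match):
        # A trailing "u" meant "give me text", which is plain getcwd in 3.x.
        return "os.getcwd(" if match.group(1) else "os.getcwdb("
    return _GETCWD_CALL.sub(swap, source)


before = "raw = os.getcwd()\ntext = os.getcwdu()\nkept = os.getcwdb()\n"
after = rewrite_getcwd(before)
print(after)
```

Doing both substitutions in a single pass matters: two sequential replacements would turn getcwdu into getcwd and then into getcwdb.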


I already tend to avoid os.getcwd() though, and this is just one more 
reason to avoid it.  In the rare cases where I really do need it, it 
looks like os.path.abspath(b".") / os.path.abspath(u".") will provide 
the clarity that I want.
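
A short sketch of the two spellings as they behave in Python 3, where `os.getcwdb()` exists and `os.path.abspath` returns the same type it is given (the `u"."` spelling from the message is written as a plain string literal here):

```python
import os

cwd_text = os.getcwd()             # str
cwd_bytes = os.getcwdb()           # bytes
abs_text = os.path.abspath(".")    # str in, str out
abs_bytes = os.path.abspath(b".")  # bytes in, bytes out

print(type(cwd_text).__name__, type(cwd_bytes).__name__)
print(type(abs_text).__name__, type(abs_bytes).__name__)
```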



Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-10-01 Thread glyph

On 30 Sep, 09:22 pm, [EMAIL PROTECTED] wrote:
On Tue, Sep 30, 2008 at 1:04 PM, "Martin v. Löwis" <[EMAIL PROTECTED]> 
wrote:

Guido van Rossum wrote:
On Mon, Sep 29, 2008 at 11:00 PM, "Martin v. Löwis" 
<[EMAIL PROTECTED]> wrote:



Martin, I don't understand why you are in favor of storing raw bytes
encoded as Latin-1 in Unicode string objects, which clearly gives rise
to mojibake.


This is my word of the day, by the way.  Reading this whole thread was 
_totally_ worth it to learn about "mojibake".  Obviously I'm familiar 
with the phenomenon but somehow I'd never heard this awesome term 
before.

I am also encouraged by Glyph's support for (a). He has a lot of
practical experience.


Thanks for the vote of confidence.  I hope for all our sakes that you're 
not over-valuing that experience ;-).


For what it's worth, I can see MvL's point in that I think there is some 
danger in generating confusion by adding _too many_ string-like 
functions to the bytes type.  I don't want my suggestion to contribute 
to the confusion between bytes and text.


However, Martin, I can promise you that I will _never_ ask for any 
convenience functions related to bytes as a result of this decision.  I 
want bytes to come back from filesystem APIs because I intend to have a 
wrapper layer which knows two things about the file: the bytes (which 
are needed to talk to POSIX filesystem APIs) and the characters (which 
are computed from those bytes, can be safely renormalized, displayed to 
users, etc).  On Windows this filesystem wrapper will necessarily behave 
differently, and will not use bytes for anything.  Any formatting beyond 
joining path segments together and possibly splitting extensions off 
will be done on character strings, not byte strings.
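
A very rough sketch of the kind of wrapper described above, assuming a POSIX system. It is not Twisted's FilePath; the class name `DisplayPath`, its attributes, and the choice of UTF-8 with replacement characters for display are all assumptions made for the illustration:

```python
import os


class DisplayPath:
    """Hypothetical wrapper: raw bytes for the OS, text for people."""

    def __init__(self, raw, encoding="utf-8"):
        self.raw = raw            # exact bytes to hand to POSIX APIs
        self.encoding = encoding  # assumed encoding, used for display only

    @property
    def text(self):
        # Lossy on purpose: undecodable bytes become U+FFFD, so this
        # string is for showing to users, never for reopening the file.
        return self.raw.decode(self.encoding, errors="replace")

    def child(self, segment):
        # Joining segments is the only "formatting" done on bytes.
        return DisplayPath(os.path.join(self.raw, segment), self.encoding)


p = DisplayPath(b"/srv/files").child(b"caf\xe9.txt")
print(p.raw)
print(p.text)
```

The bytes stay intact for talking to the filesystem, while anything shown to a user goes through the text view.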


The proposal of using U+0000 seems like it would have been almost the 
same from such a wrapper's perspective, except (A) people using the 
filesystem APIs without the benefit of such a wrapper would have been 
even more screwed, and (B) there are a few nasty corner-cases when 
dealing with surrogate (i.e. invalid, in UTF-8) code points which I'm 
not quite sure what it would have done with.


Guido already mentioned "libraries" as a hypothetical issue, but here's 
a real-world problem that results from putting NULLs into filenames. 
Consider this program:


   # Python 2 / PyGTK: the label starts with U+0000 (a NUL character).
   import gtk
   w = gtk.Window()
   b = gtk.Button(u"\u0000/hello/world")
   w.add(b)
   w.show_all()
   gtk.main()

which emits this message:
   TypeError: OGtkButton.__init__() argument 1 must be string without null bytes or None, not unicode


SQLite has a similar problem with NULLs, and I'm definitely sticking 
paths in there, too.
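
The same kind of rejection can be seen without GTK. The sketch below assumes CPython 3, where passing a path containing a NUL character to `open()` raises before the operating system is ever consulted; the helper name `rejects_nul` is invented for the example:

```python
def rejects_nul(path):
    """Return True if open() refuses the path outright because of a NUL."""
    try:
        open(path).close()
    except (ValueError, TypeError):
        # Raised by the interpreter itself for an embedded NUL.
        return True
    except OSError:
        # An ordinary filesystem error (for example, file not found).
        return False
    return False


print(rejects_nul("\u0000/hello/world"))
```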


Eventually I'd like to propose such a path type for inclusion in the 
stdlib, but that will have to wait for issues like 
 to be resolved.


Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-10-01 Thread glyph

On 03:32 am, [EMAIL PROTECTED] wrote:

On Sep 30, 2008, at 10:06 PM, [EMAIL PROTECTED] wrote:



Can you clarify what proposal you are supporting for Python:


Sure.  Neither of your descriptions is terribly accurate, but I'll try 
to explain.

1) Two sets of APIs, one returning unicode strings, and one returning 
bytestrings. (subpoints: what does the unicode-returning API do when 
it cannot decode the bytestring into unicode? raise exception, pretend 
argument/envvar/file didn't exist/?)


The only API discussed so far which would actually provide two variants 
is 'getcwd', which would have a 'getcwdb' that gives back bytes instead.


Pretty much every other API takes some kind of input.  listdir(bytes) 
would give back bytes, while listdir(text) would give back text. 
listdir(text) would skip undecodable filenames.
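
The "type of the result follows the type of the argument" rule can be sketched with Python 3's `os.listdir`. The temporary directory and file name are made up for the example, and it only shows the type-following behavior, not the handling of undecodable names:

```python
import os
import tempfile

# Throwaway directory containing one plainly named file.
demo_dir = tempfile.mkdtemp()
open(os.path.join(demo_dir, "hello.txt"), "w").close()

as_text = os.listdir(demo_dir)                # str in, list of str out
as_bytes = os.listdir(os.fsencode(demo_dir))  # bytes in, list of bytes out

print(as_text, as_bytes)
```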


Similarly for all the other APIs in os and os.path that take pathnames 
for input.

2) All APIs return bytestrings only. Converting to unicode is 
considered lossy, and would have to be done by applications for 
display purposes only.


This is a bad way to do things, because on Windows, filenames *really 
are* unicode.  Converting to bytes is what's lossy.  (See previous 
discussion of active codepages and CreateFileA/CreateFileW.)

I really don't understand the reasoning for (1).


The reasoning is that a lot of software doesn't care if it's wrong for
edge cases, and it's really hard to come up with something that's correct
with respect to all of those edge cases (absurdly difficult, if you need
to stay in the straitjacket of string / bytes types, as well as
provide a useful library interface - which is why we're having this
discussion).  But it should be _possible_ to write software that's
correct in the face of those edge cases.


And - let's not forget this - the worlds of POSIX and Windows really are 
different and really do require subtly different inputs.  Python can try 
to paper over this like Java does and make it impossible to write 
certain classes of application, or it can just provide an ugly, slightly 
inconsistent API that exposes the ugly, slightly inconsistent reality. 
Modulo the issues you've raised which I don't think the proposal totally 
covers yet (abspath with a non-decodable cwd) I think it strikes a nice 
balance; allow people to live in the delusion of unicode-on-POSIX and 
have software that mostly works, most of the time, or allow them to face 
the unpleasantness and spend the effort to get something really solid.


I think the _right_ answer to all of this is to (A) make FilePath work 
completely correctly for every totally insane edge case ever, and (B) 
include it in the stdlib.  One day I think we'll do that.  But nobody 
has the time or energy to do even the first part of that *right now*, 
before 3.0 is released, so I'm just looking for something which it will 
be possible to build FilePath, or something like it, on top of, without 
breaking other people's applications who rely on the os module directly 
too badly.

It seems to me that most software (probably including all of the
Python stdlib) would continue to use the unicode string API.


That's true.  And that software wouldn't handle these edge cases 
completely correctly.  As Guido put it, "it's a quality of 
implementation issue".

Switching all of the Python stdlib to use the bytestring APIs instead
would certainly be a large undertaking, and would have all sorts of
ripple-on API changes (e.g. __file__).


I am not quite sure what to do about __file__.  My preference would 
probably be to use a unicode filename for consistency so it can always be 
displayed, but provide a second attribute (__open_file__?) that would be 
sometimes unicode, sometimes bytes, which would be guaranteed to work 
with open().  I suspect that most software which interacts with __file__ 
on a deep level would be of the variety which would deal with the edge 
cases.


But where the Python stdlib wants a pathname it should be accepting 
either bytes or unicode, as all of the os.path functions want.  This 
does kind of suck, but the alternatives are to encode crazy extra 
information in unicode path names that cannot be exchanged with other 
programs (or with users: NULL is potentially the worst bogus character 
from a UI perspective), or revert to bytes for everything (which is a 
non-solution, cf. Windows above).

So I can only imagine that if you're proposing (1), you're doing so
without the intention of suggesting that Python be converted to use
it.


Maybe updating the stdlib to be correct in the face of such changes is
hard, but it doesn't seem intractable.  There appear to be only about 100
calls to getcwd and abspath combined in the stdlib; I suspect many of
them are for purely aesthetic purposes and could just be eliminated, and
many others are redefinitions of the functions and don't need any changes.


All the other path manipulation functions would continue to work 

Re: [Python-3000] [Python-Dev] New proposition for Python3 bytes filename issue

2008-10-01 Thread glyph


On 03:54 pm, [EMAIL PROTECTED] wrote:
I'm actually sort of liking this idea.  A Pathname class, for convenience
a subtype of String, but containing the underlying binary representation
used by the OS.  Even non-unicode pathnames could be represented.


On the one hand, I agree with you - except for the part where it's a 
subtype of String; that doesn't work.  In case I haven't mentioned it 
enough times already:


http://twistedmatrix.com/documents/8.1.0/api/twisted.python.filepath.FilePath.html

On the other hand, we've all been on this merry-go-round before:

   http://www.python.org/dev/peps/pep-0355/

Note especially the rejection notice: "Subclassing from str is a 
particularly bad idea".


Again, one day I'd really like to add one of these to Python.  Now is 
not the time.
