Victor Stinner wrote:
> POSIX OS
> --------
>
> The default behaviour should be to use unicode and raise an error if
> conversion to unicode fails. It should also be possible to use bytes using
> bytes arguments and optional arguments (for getcwd).
>
>  - listdir(unicode) -> unicode and raise an error on invalid filename
>  - listdir(bytes) -> bytes
>  - getcwd() -> unicode
>  - getcwd(bytes=True) -> bytes
>  - open(): accept bytes or unicode
>
> os.path.*() should accept operations on bytes filenames, but maybe not on
> bytes+unicode arguments. os.path.join('directory', b'filename'): raise an
> error (or use *implicit* conversion to bytes)?
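
To make the os.path.join case concrete, here is a minimal sketch of the strict variant from the proposal: all arguments must be str or all must be bytes, and mixing raises an error. The name strict_join is hypothetical, only the POSIX separator is handled, and this is an illustration under those assumptions rather than a proposed implementation.

```python
def strict_join(first, *rest):
    """Join path components that are all str or all bytes (POSIX-style sketch)."""
    if isinstance(first, str):
        kind = str
    elif isinstance(first, bytes):
        kind = bytes
    else:
        raise TypeError("path components must be str or bytes")

    # Every literal has to exist in both flavours and be chosen at run time.
    sep = "/" if kind is str else b"/"

    path = first
    for part in rest:
        if not isinstance(part, kind):
            raise TypeError("cannot mix str and bytes path components")
        if part.startswith(sep):
            path = part              # an absolute component restarts the path
        elif not path or path.endswith(sep):
            path += part
        else:
            path += sep + part
    return path
```

Even in a function this small, the separator has to be duplicated per type and selected on each call, and the same pattern would be needed in every function that touches a path.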

This approach (changing all path-handling functions to accept either bytes
or strings, but not both) is doomed in my eyes. First, there are lots of
them; second, they are not only in os.path but in many other modules and
also in user code; and third, I see no clean way of implementing them in
the specified way. (Just try to do it with os.path.join as an example; I
couldn't find the good way to write it, only the bad and the ugly...)

If I had to choose, I'd still argue for the modified UTF-8 as filesystem
encoding (in the cases where it would otherwise be plain UTF-8), despite
possible surprises when a filename encoded that way escapes from Python.

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.

_______________________________________________
Python-3000 mailing list
Python-3000@python.org
http://mail.python.org/mailman/listinfo/python-3000