[issue3187] os.listdir can return byte strings

2008-10-07 Thread Martin v. Löwis

Martin v. Löwis [EMAIL PROTECTED] added the comment:

> Most (or all) patches include new tests about bytes. Here is a patch for
> os.rst documentation about listdir(), getcwdb() and readlink().

Thanks! Committed as r66829.

I've added additional documentation in r66830, which should complete
Guido's list of things to be documented. So the issue can be closed
now.

>> See msg74271 for what Guido considers the lacking documentation;
>> you may find that other aspects also need documentation.
>
> I wrote a long document about bytes for filenames but not only. I'm still
> waiting for some contributors or reviewers:
> http://wiki.python.org/moin/Python3UnicodeDecodeError

We should discuss that on python-dev, of course - the question is
whether additional documentation patches are needed in response to
this specific change.

>> As for test cases: it seems that those got waived, in the hurry.
>
> Can you be more precise? Which tests have to be improved/rewritten?

I was probably looking at the wrong patches (such as getcwd_bytes.patch,
merge_os_getcwd_getcwdu.patch, etc); I now see that the final patch did
have tests. I recommend that patches that get superseded by other
patches are removed from the issue. They won't be deleted; it's still
possible to navigate to them through the History at the bottom of the
issue.

___
Python tracker [EMAIL PROTECTED]
http://bugs.python.org/issue3187
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue3187] os.listdir can return byte strings

2008-10-07 Thread Martin v. Löwis

Changes by Martin v. Löwis [EMAIL PROTECTED]:


--
status: open -> closed




[issue3187] os.listdir can return byte strings

2008-10-06 Thread STINNER Victor

STINNER Victor [EMAIL PROTECTED] added the comment:

Would it be possible to close this issue, since os.listdir() is fixed and
many other related functions (posix, posixpath, ntpath, macpath, etc.)
are also fixed? I propose opening new issues for new bugs, since this
issue is becoming a little bit long :)

E.g., see new issues #4035 and #4036!




[issue3187] os.listdir can return byte strings

2008-10-06 Thread Martin v. Löwis

Martin v. Löwis [EMAIL PROTECTED] added the comment:

> Would it be possible to close this issue, since os.listdir() is fixed and
> many other related functions (posix, posixpath, ntpath, macpath, etc.)
> are also fixed?

IIUC, these fixes are still not complete: they lack documentation
changes. Of course, it would have been better if the original patches
already contained the necessary documentation and test suite changes.
See msg74271 for what Guido considers the lacking documentation;
you may find that other aspects also need documentation.

As for test cases: it seems that those got waived, in the hurry.




[issue3187] os.listdir can return byte strings

2008-10-06 Thread STINNER Victor

STINNER Victor [EMAIL PROTECTED] added the comment:

On Tuesday 07 October 2008 01:13:22, Martin v. Löwis wrote:
> IIUC, these fixes are still not complete: they lack documentation
> changes. (...) Of course, it would have been better if the original patches
> already contained the necessary documentation and test suite changes.

Most (or all) patches include new tests about bytes. Here is a patch for
the os.rst documentation about listdir(), getcwdb() and readlink().

> See msg74271 for what Guido considers the lacking documentation;
> you may find that other aspects also need documentation.

I wrote a long document about bytes in filenames, among other things. I'm still
waiting for some contributors or reviewers:
http://wiki.python.org/moin/Python3UnicodeDecodeError

> As for test cases: it seems that those got waived, in the hurry.

Can you be more precise? Which tests have to be improved/rewritten?

Added file: http://bugs.python.org/file11721/library_os_doc.patch

Index: Doc/library/os.rst
===================================================================
--- Doc/library/os.rst  (revision 66821)
+++ Doc/library/os.rst  (working copy)
@@ -693,13 +693,13 @@
 
 .. function:: getcwd()
 
-   Return a bytestring representing the current working directory.
+   Return a string representing the current working directory.
    Availability: Unix, Windows.
 
 
-.. function:: getcwdu()
+.. function:: getcwdb()
 
-   Return a string representing the current working directory.
+   Return a bytestring  representing the current working directory.
    Availability: Unix, Windows.
 
 
@@ -801,8 +801,10 @@
    ``'..'`` even if they are present in the directory. Availability:
    Unix, Windows.
 
-   On Windows NT/2k/XP and Unix, if *path* is a Unicode object, the result will be
-   a list of Unicode objects.
+   If *path* is a Unicode object, the result will be a list of Unicode objects.
+   If a filename can not be decoded to unicode, it is skipped. If *path* is a
+   bytes string, the result will be list of bytes objects included files
+   skipped by the unicode version.
 
 
 .. function:: lstat(path)
@@ -916,7 +918,9 @@
    be converted to an absolute pathname using ``os.path.join(os.path.dirname(path),
    result)``.
 
-   If the *path* is a Unicode object, the result will also be a Unicode object.
+   If the *path* is an Unicode object, the result will also be a Unicode object
+   and may raise an UnicodeDecodeError. If the *path* is a bytes object, the
+   result will be a bytes object.
 
    Availability: Unix.
 



[issue3187] os.listdir can return byte strings

2008-10-03 Thread STINNER Victor

STINNER Victor [EMAIL PROTECTED] added the comment:

On Friday 03 October 2008 03:45:44, Amaury Forgeot d'Arc wrote:
> Here is a patch for Windows: (...)
> test_ntpath also runs functions with bytes.

Which charset is used when you use a bytes filename? I read somewhere that it's
the current codepage. How can the user get this codepage in Python? I ask
this to complete my document:
  http://wiki.python.org/moin/Python3UnicodeDecodeError

Don't hesitate to edit the document directly; it may also be moved to the
Python3 Doc/ directory.

You should also support bytearray() in ntpath:
   isinstance(path, (bytes, bytearray))

The unit tests might use pure unicode on Windows and bytes on Linux,
especially getcwd() vs getcwdb().

I have neither Windows nor Mac to test bytes filenames on these systems.




[issue3187] os.listdir can return byte strings

2008-10-03 Thread Martin v. Löwis

Martin v. Löwis [EMAIL PROTECTED] added the comment:

> Which charset is used when you use bytes filename?

It's the ANSI code page, which is a system-wide admin-modifiable
indirection to some real code page (changing it requires a reboot).
In the API, it's referred to as CP_ACP. It's also related to the
multi-byte API, which has caused Mark Hammond to call the codec
invoking it mbcs (IOW, mbcs is always the codec name for the
file system encoding). The specific code page that CP_ACP denotes
can be found with locale.getpreferredencoding(). Using that codec
name (which might be e.g. cp1252) is different from using mbcs,
as that goes through a regular (table-driven) Python codec. In
particular, the Python codec will report errors, whereas the mbcs
codec will find replacement characters.
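
To illustrate the distinction drawn above, here is a minimal sketch. It assumes cp1252 as the concrete code page (the message gives it only as an example) and does not call the Windows-only mbcs codec at all; it only shows that a regular table-driven Python codec reports an undecodable byte unless replacement is requested explicitly.

```python
import locale

# Codec name Python reports for the user's preferred encoding,
# e.g. "cp1252" on some Windows installs; the value depends on the system.
preferred = locale.getpreferredencoding()
print("preferred encoding:", preferred)

# Byte 0x81 has no mapping in Python's table-driven cp1252 codec,
# so a strict decode reports an error instead of guessing.
raw = b"caf\x81"
try:
    raw.decode("cp1252")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# Asking explicitly for replacement yields U+FFFD for the bad byte.
replaced = raw.decode("cp1252", errors="replace")
print(strict_failed, ascii(replaced))
```

The replacement shown here is Python's own "replace" error handler, used as a stand-in; the characters the mbcs codec would pick are chosen by Windows and may differ.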




[issue3187] os.listdir can return byte strings

2008-10-03 Thread Antoine Pitrou

Antoine Pitrou [EMAIL PROTECTED] added the comment:

> You should also support bytearray() in ntpath:
>    isinstance(path, (bytes, bytearray))

The most generic way of allowing all bytes-like objects is to write:

    path = bytes(path)

It raises a TypeError if `path` can't export a read-only buffer of
contiguous bytes; also, it is a no-op if `path` is already a bytes
object, so very cheap in the common case.
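
A small sketch of the behaviour described above, using a hypothetical helper name `as_bytes`. One caveat that is easy to overlook: `bytes()` also accepts an integer and returns that many zero bytes, so it does not reject every non-buffer argument.

```python
def as_bytes(path):
    """Hypothetical helper: coerce any bytes-like object to bytes."""
    return bytes(path)

# Buffer-exporting objects are converted to an immutable bytes object.
from_bytearray = as_bytes(bytearray(b"spam"))
from_memoryview = as_bytes(memoryview(b"eggs"))

# A bytes argument comes back unchanged in value.
unchanged = as_bytes(b"ham")

# A str cannot export a buffer, so the conversion is rejected.
try:
    as_bytes("text")
    str_rejected = False
except TypeError:
    str_rejected = True

print(from_bytearray, from_memoryview, unchanged, str_rejected)
```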




[issue3187] os.listdir can return byte strings

2008-10-03 Thread STINNER Victor

STINNER Victor [EMAIL PROTECTED] added the comment:

> The most generic way of allowing all bytes-alike objects is to write:
> path = bytes(path)

If you use that, any unicode path may fail, and the function will always return
unicode. The goal is to get:
  func(bytes) -> bytes
  func(bytearray) -> bytes (or maybe bytearray, it doesn't matter)
  func(unicode) -> unicode




[issue3187] os.listdir can return byte strings

2008-10-03 Thread Antoine Pitrou

Antoine Pitrou [EMAIL PROTECTED] added the comment:

On Friday 03 October 2008 at 11:43 +, STINNER Victor wrote:
> STINNER Victor [EMAIL PROTECTED] added the comment:
>
>> The most generic way of allowing all bytes-alike objects is to write:
>> path = bytes(path)
>
> If you use that, any unicode may fails and the function will always return
> unicode. The goal is to get:
>   func(bytes) -> bytes
>   func(bytearray) -> bytes (or maybe bytearray, it doesn't matter)
>   func(unicode) -> unicode

Then make it:

    path = path if isinstance(path, str) else bytes(path)




[issue3187] os.listdir can return byte strings

2008-10-03 Thread STINNER Victor

STINNER Victor [EMAIL PROTECTED] added the comment:

"path = path" is useless most of the time (unicode path); this code is
faster in both cases (bytes or unicode):
   if not isinstance(path, str):
       path = bytes(path)

* a if b else c: unicode=0.756730079651; bytes=1.93071103096
* if test: path=...: unicode=0.681571006775; bytes=1.88843798637
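
As a sketch of how the idiom above yields the "same type in, same type out" behaviour discussed in this exchange, consider a hypothetical function `last_component` (not part of any patch in this thread) that picks its separator to match the argument type:

```python
def last_component(path):
    """Hypothetical example: return the part after the final '/'.

    str in -> str out; bytes or bytearray in -> bytes out.
    """
    if not isinstance(path, str):
        path = bytes(path)
    # Choose a separator of the same type as the (normalised) path.
    sep = "/" if isinstance(path, str) else b"/"
    return path.rsplit(sep, 1)[-1]

print(last_component("usr/lib/python"))
print(last_component(b"usr/lib/python"))
print(last_component(bytearray(b"usr/lib/python")))
```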




[issue3187] os.listdir can return byte strings

2008-10-03 Thread Martin v. Löwis

Martin v. Löwis [EMAIL PROTECTED] added the comment:

I've committed sys.setfilesystemencoding as r66769.

Declaring it as a documentation issue now. Not sure whether it should
remain a release blocker; IMO, the documentation can still be produced
after the release.

--
assignee: loewis -> georg.brandl
components: +Documentation -Library (Lib)
nosy: +georg.brandl




[issue3187] os.listdir can return byte strings

2008-10-03 Thread Guido van Rossum

Guido van Rossum [EMAIL PROTECTED] added the comment:

Reducing priority to critical, it's just docs and tweaks from here.

> You should also support bytearray() in ntpath:
>    isinstance(path, (bytes, bytearray))

No, you shouldn't.  I changed my mind on this several times and in the
end figured it's good enough to just support bytes and str instances.

Amaury: I've reviewed your patch and ran test_ntpath.py on a Linux box.
 I get this traceback:

======================================================================
ERROR: test_relpath (__main__.TestNtpath)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "Lib/test/test_ntpath.py", line 188, in test_relpath
    tester('ntpath.relpath("a")', 'a')
  File "Lib/test/test_ntpath.py", line 22, in tester
    gotResult = eval(fn)
  File "<string>", line 1, in <module>
  File "/usr/local/google/home/guido/python/py3k/Lib/ntpath.py", line 530, in relpath
    start_list = abspath(start).split(sep)
  File "/usr/local/google/home/guido/python/py3k/Lib/ntpath.py", line 499, in abspath
    path = join(os.getcwd(), path)
  File "/usr/local/google/home/guido/python/py3k/Lib/ntpath.py", line 137, in join
    if b[:1] in seps:
TypeError: 'in <string>' requires string as left operand, not bytes
----------------------------------------------------------------------

The fix is to change the fallback abspath to this code:

def abspath(path):
    """Return the absolute version of a path."""
    if not isabs(path):
        if isinstance(path, bytes):
            cwd = os.getcwdb()
        else:
            cwd = os.getcwd()
        path = join(cwd, path)
    return normpath(path)

Once you fix that please check it in!
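
The failure mode behind this traceback (a str working directory joined with a bytes path) can be reproduced in isolation. The sketch below assumes a current Python 3, where ntpath is importable on every platform; the sample paths are made up for illustration.

```python
import ntpath

# Joining a bytes component with a str component is rejected.
try:
    ntpath.join(b"C:\\work", "file.txt")
    mixed_rejected = False
except TypeError:
    mixed_rejected = True

# Components of one type join normally and keep that type.
joined = ntpath.join(b"C:\\work", b"file.txt")

# abspath() keeps the argument's type, which requires picking
# getcwdb() for bytes input and getcwd() for str input.
abs_bytes = ntpath.abspath(b"a")
abs_text = ntpath.abspath("a")

print(mixed_rejected, joined, type(abs_bytes).__name__, type(abs_text).__name__)
```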

--
priority: release blocker -> critical




[issue3187] os.listdir can return byte strings

2008-10-03 Thread Guido van Rossum

Guido van Rossum [EMAIL PROTECTED] added the comment:

Assigning to Amaury for Windows fix first.

--
assignee: georg.brandl -> amaury.forgeotdarc




[issue3187] os.listdir can return byte strings

2008-10-03 Thread Amaury Forgeot d'Arc

Amaury Forgeot d'Arc [EMAIL PROTECTED] added the comment:

Thanks for testing the non-Windows part of ntpath.
Committed patch in r66777.

Leaving the issue open: macpath.py should certainly be modified.

--
assignee: amaury.forgeotdarc -> 




[issue3187] os.listdir can return byte strings

2008-10-03 Thread Guido van Rossum

Guido van Rossum [EMAIL PROTECTED] added the comment:

FWIW, I don't see a need to change macpath.py -- it's only used for
MacOS 9 and the occasional legacy app.  OSX uses posixpath.py.

--
resolution:  -> accepted




[issue3187] os.listdir can return byte strings

2008-10-03 Thread Guido van Rossum

Guido van Rossum [EMAIL PROTECTED] added the comment:

Sorry Amaury, but there's another issue.

test_ntpath now fails when run with -bb:

======================================================================
ERROR: test_expandvars (__main__.TestNtpath)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "Lib/test/test_ntpath.py", line 151, in test_expandvars
    tester('ntpath.expandvars("$foo bar")', "bar bar")
  File "Lib/test/test_ntpath.py", line 10, in tester
    gotResult = eval(fn)
  File "<string>", line 1, in <module>
  File "/usr/local/google/home/guido/python/py3k/Lib/ntpath.py", line 344, in expandvars
    if c in ('\'', b'\''):   # no expansion within single quotes
BytesWarning: Comparison between bytes and string

======================================================================
ERROR: test_normpath (__main__.TestNtpath)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "Lib/test/test_ntpath.py", line 120, in test_normpath
    tester("ntpath.normpath('A//././//.//B')", r'A\B')
  File "Lib/test/test_ntpath.py", line 10, in tester
    gotResult = eval(fn)
  File "<string>", line 1, in <module>
  File "/usr/local/google/home/guido/python/py3k/Lib/ntpath.py", line 465, in normpath
    if comps[i] in ('.', '', b'.', b''):
BytesWarning: Comparison between bytes and string

======================================================================
ERROR: test_relpath (__main__.TestNtpath)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "Lib/test/test_ntpath.py", line 188, in test_relpath
    tester('ntpath.relpath("a")', 'a')
  File "Lib/test/test_ntpath.py", line 10, in tester
    gotResult = eval(fn)
  File "<string>", line 1, in <module>
  File "/usr/local/google/home/guido/python/py3k/Lib/ntpath.py", line 534, in relpath
    start_list = abspath(start).split(sep)
  File "/usr/local/google/home/guido/python/py3k/Lib/ntpath.py", line 504, in abspath
    return normpath(path)
  File "/usr/local/google/home/guido/python/py3k/Lib/ntpath.py", line 465, in normpath
    if comps[i] in ('.', '', b'.', b''):
BytesWarning: Comparison between bytes and string

--
assignee:  -> amaury.forgeotdarc




[issue3187] os.listdir can return byte strings

2008-10-03 Thread Amaury Forgeot d'Arc

Amaury Forgeot d'Arc [EMAIL PROTECTED] added the comment:

Committed r66779: test_ntpath now passes with the -bb option.

It seems that the Windows buildbots do not set -bb.
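
For readers unfamiliar with the -bb option, here is a hedged sketch of why a membership test against a tuple mixing str and bytes literals trips it. It runs a one-line snippet in a child interpreter with and without -bb; the helper names `run` and `is_dot_or_empty` are hypothetical, and the second function is only one possible way to avoid the mixed comparison, not necessarily what r66779 did.

```python
import subprocess
import sys

# Same shape as the membership test in the tracebacks: a bytes value is
# compared against str members before it reaches the bytes ones.
SNIPPET = "print(b'.' in ('.', '', b'.', b''))"

def run(flags):
    """Run SNIPPET in a child interpreter with the given extra flags."""
    return subprocess.run(
        [sys.executable, *flags, "-c", SNIPPET],
        capture_output=True,
        text=True,
    )

plain = run([])        # comparison silently evaluates to False, then matches b'.'
strict = run(["-bb"])  # bytes/str comparison becomes a BytesWarning error

def is_dot_or_empty(comp):
    """Compare only against constants of the same type as comp."""
    special = (b".", b"") if isinstance(comp, bytes) else (".", "")
    return comp in special

print(plain.returncode, strict.returncode, is_dot_or_empty(b"."), is_dot_or_empty("x"))
```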




[issue3187] os.listdir can return byte strings

2008-10-03 Thread Guido van Rossum

Guido van Rossum [EMAIL PROTECTED] added the comment:

Thanks Amaury!

On to Georg for doc tweaks.  Summary:

- all the os.path functions now work on bytes as well, on all platforms
- only on Unix (but not OSX) do we recommend using bytes
- os.getcwdu() no longer exists
- os.getcwdb() returns bytes
- os.listdir(str) skips undecodable entries (previously it returned a
mixture of str and bytes instances)
- open() accepts bytes as filename

Stuff that didn't change but that you might want to mention:

- all the syscalls in os support bytes args; readlink() and listdir()
return bytes if the arg is bytes
- getcwd() may raise UnicodeDecodeError

Martin already documented sys.setfilesystemencoding().
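
The type rules in this summary can be checked with a short sketch against a temporary directory. It assumes a current Python 3 and uses os.fsencode(), which is not mentioned anywhere in this thread, purely as a convenient way to obtain a bytes path. It checks only the argument and result types; how undecodable entries are handled is not exercised.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create one file with an ASCII name so it decodes everywhere.
    with open(os.path.join(tmp, "spam.txt"), "w") as f:
        f.write("hello")

    tmp_bytes = os.fsencode(tmp)  # assumption: helper not from this thread

    str_names = os.listdir(tmp)          # str in  -> list of str
    bytes_names = os.listdir(tmp_bytes)  # bytes in -> list of bytes

    # open() accepts a bytes filename as well.
    with open(os.path.join(tmp_bytes, b"spam.txt")) as f:
        content = f.read()

cwd_text = os.getcwd()    # str
cwd_bytes = os.getcwdb()  # bytes

print(str_names, bytes_names, content)
```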

--
assignee: amaury.forgeotdarc -> georg.brandl




[issue3187] os.listdir can return byte strings

2008-10-03 Thread Amaury Forgeot d'Arc

Amaury Forgeot d'Arc [EMAIL PROTECTED] added the comment:

I have a patch for macpath.py nonetheless.
Tested on Windows (of course ;-) but all functions are pure text 
manipulation, except realpath(). It was much easier than ntpath.py.

I also added tests for three functions which were not exercised at all.

Added file: http://bugs.python.org/file11693/macpath.patch




[issue3187] os.listdir can return byte strings

2008-10-03 Thread Benjamin Peterson

Benjamin Peterson [EMAIL PROTECTED] added the comment:

Amaury, your patch looks good.




[issue3187] os.listdir can return byte strings

2008-10-03 Thread Amaury Forgeot d'Arc

Amaury Forgeot d'Arc [EMAIL PROTECTED] added the comment:

Committed macpath.py in r66781.




[issue3187] os.listdir can return byte strings

2008-10-02 Thread Barry A. Warsaw

Changes by Barry A. Warsaw [EMAIL PROTECTED]:


--
priority: deferred blocker -> release blocker




[issue3187] os.listdir can return byte strings

2008-10-02 Thread Guido van Rossum

Guido van Rossum [EMAIL PROTECTED] added the comment:

Martin, can you check in your changes to add sys.setfilesystemencoding()?

I will check in Victor's changes (with some edits).

Together this means that the various suggested higher-level solutions
(like returning path-like objects, or some kind of roundtripping
almost-but-not-quite-utf-8 encoding) can be implemented in pure Python.

--
assignee:  -> gvanrossum




[issue3187] os.listdir can return byte strings

2008-10-02 Thread STINNER Victor

Changes by STINNER Victor [EMAIL PROTECTED]:


Removed file: http://bugs.python.org/file11667/python3_bytes_filename-2.patch




[issue3187] os.listdir can return byte strings

2008-10-02 Thread Martin v. Löwis

Martin v. Löwis [EMAIL PROTECTED] added the comment:

> Martin, can you check in your changes to add sys.setfilesystemencoding()?

Will do tomorrow.




[issue3187] os.listdir can return byte strings

2008-10-02 Thread djc

Changes by djc [EMAIL PROTECTED]:


--
nosy: +djc




[issue3187] os.listdir can return byte strings

2008-10-02 Thread Benoit Boissinot

Changes by Benoit Boissinot [EMAIL PROTECTED]:


--
nosy: +bboissin




[issue3187] os.listdir can return byte strings

2008-10-02 Thread Amaury Forgeot d'Arc

Amaury Forgeot d'Arc [EMAIL PROTECTED] added the comment:

Here is a patch for Windows:

The failing tests on buildbots now pass
(test_fnmatch test_posixpath test_unicode_file)

test_ntpath also runs functions with bytes.

I suppose macpath.py is broken as well.

Added file: http://bugs.python.org/file11685/win32-bytes-filenames.patch




[issue3187] os.listdir can return byte strings

2008-09-30 Thread Martin v. Löwis

Martin v. Löwis [EMAIL PROTECTED] added the comment:

Here is a patch that solves the issue in a different way: it introduces
sys.setfilesystemencoding. If applications invoke
sys.setfilesystemencoding("iso-8859-1"), all file names can be
successfully converted into a character string.

Added file: http://bugs.python.org/file11663/setfsenc.diff




[issue3187] os.listdir can return byte strings

2008-09-30 Thread Guido van Rossum

Guido van Rossum [EMAIL PROTECTED] added the comment:

On Tue, Sep 30, 2008 at 8:21 AM, Martin v. Löwis [EMAIL PROTECTED] wrote:
> Martin v. Löwis [EMAIL PROTECTED] added the comment:
> Here is a patch that solves the issue in a different way: it introduces
> sys.setfilesystemencoding. If applications invoke
> sys.setfilesystemencoding("iso-8859-1"), all file names can be
> successfully converted into a character string.

I'm not opposed to this going in as well, but I don't think it's the
right approach, as it can cause severe cases of mojibake (which you
have strongly opposed in the past). It's quite orthogonal to Victor's
patch IMO.
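
Both sides of this exchange can be illustrated without touching the file system encoding itself. The sketch below decodes raw bytes directly instead of calling sys.setfilesystemencoding(); the sample name "café" is an assumption chosen for illustration.

```python
# Every byte value 0..255 maps to a character under iso-8859-1,
# so decoding can never fail and the round trip is lossless.
all_bytes = bytes(range(256))
text = all_bytes.decode("iso-8859-1")
round_trip_ok = text.encode("iso-8859-1") == all_bytes

# The cost: a name that was actually written as UTF-8 is shown with
# the wrong characters (mojibake) when read as iso-8859-1.
utf8_name = "café".encode("utf-8")
mojibake = utf8_name.decode("iso-8859-1")

print(round_trip_ok, len(text), mojibake)
```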




[issue3187] os.listdir can return byte strings

2008-09-30 Thread STINNER Victor

STINNER Victor [EMAIL PROTECTED] added the comment:

As I wrote, python3_bytes_filename.patch was just initial support
for bytes filenames. So, as asked by Guido, here is a new version of my
patch.

Changes:
 - for all functions, support bytes as well as bytearray
 - os.readlink(unicode) -> unicode, raising an error if unicode
conversion fails. Note: os.readlink(bytes) -> bytes was already working.
 - many changes in posixpath to fix all functions: add many
"if isinstance(...):" checks and repeat sep / curdir / parent / ... in bytes
 - the current version of test_posixpath contains a duplicate of
test_splitdrive(), and test_normcase() calls normcase() twice, which is
wrong (fixed in my patch)
 - I used copy/paste + conversion to bytes to test posixpath with
bytes arguments
 - I added some checks in posixpath tests to reject mixing bytes + str
 - fix quoting style
 - factorize pattern compilation in fnmatch
 - fnmatch.fnmatchcase() supports bytes
 - fix test_unicode_file: replace getcwdu() with getcwd(), and sometimes
getcwd() with getcwdb()

Open issues:
 - pwd.getpwnam() and grp.getgrnam() should accept bytes, and then
   expanduser() should use pwd with bytes. Currently expanduser()
   assumes that a username is an ASCII string and that the user
   directory can be converted using getfilesystemencoding()
 - expandvars() doesn't support non-ASCII variable values:
   that's a new problem. Should os.environ keys be str or bytes?
   And the values: str or bytes? If str is chosen, what is the
   charset to convert str to bytes?

Added file: http://bugs.python.org/file11667/python3_bytes_filename-2.patch




[issue3187] os.listdir can return byte strings

2008-09-29 Thread STINNER Victor

STINNER Victor [EMAIL PROTECTED] added the comment:

About os.getcwd(), another solution is merge_os_getcwd_getcwdu.patch:
os.getcwd() always returns a unicode string and raises an error on a unicode
decode error, whereas os.getcwd(bytes=True) always returns bytes.

The old function os.getcwdu() is removed since os.getcwd() already
returns a unicode string.

Note: the current version of os.getcwd() uses the wrong encoding to
convert bytes to unicode: it uses PyUnicode_FromString() instead of
PyUnicode_Decode(..., Py_FileSystemDefaultEncoding, "strict") (as
getcwdu() does).

Added file: http://bugs.python.org/file11652/merge_os_getcwd_getcwdu.patch




[issue3187] os.listdir can return byte strings

2008-09-29 Thread Raghuram Devarakonda

Changes by Raghuram Devarakonda [EMAIL PROTECTED]:


--
nosy: +draghuram




[issue3187] os.listdir can return byte strings

2008-09-29 Thread STINNER Victor

STINNER Victor [EMAIL PROTECTED] added the comment:

As Steven Bethard proposed, here is a new version of my getcwd()
patch: instead of adding a keyword argument "bytes", I created a
function getcwdb():
 * os.getcwd() -> unicode
 * os.getcwdb() -> bytes

In Python2 it was:
 * os.getcwd() -> str (bytes)
 * os.getcwdu() -> unicode

Added file: http://bugs.python.org/file11655/os_getcwdb.patch




[issue3187] os.listdir can return byte strings

2008-09-29 Thread STINNER Victor

STINNER Victor [EMAIL PROTECTED] added the comment:

Patch python3_bytes_filename.patch:
 - open() supports bytes
 - listdir(unicode) -> only unicode, *skip* invalid filenames
   (as asked by Guido)
 - remove os.getcwdu()
 - create os.getcwdb() -> bytes
 - glob.glob() supports bytes
 - fnmatch.filter() supports bytes
 - posixpath.join() and posixpath.split() support bytes

Mixing bytes and str is invalid. Examples raising a TypeError:
 - posixpath.join(b'x', 'y')
 - fnmatch.filter([b'x', 'y'], '*')
 - fnmatch.filter([b'x', b'y'], '*')
 - glob.glob1('.', b'*')
 - glob.glob1(b'.', '*')
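
A brief sketch of the "no mixing" rule as it behaves on a current Python 3. It covers only posixpath.join() and fnmatch.fnmatchcase(), since glob.glob1() is a private helper; the wrapper `raises_type_error` is a hypothetical convenience for this example.

```python
import fnmatch
import posixpath

def raises_type_error(func, *args):
    """Return True if func(*args) raises TypeError, else False."""
    try:
        func(*args)
    except TypeError:
        return True
    return False

# Mixing bytes and str arguments is rejected.
join_mixed = raises_type_error(posixpath.join, b"x", "y")
match_mixed = raises_type_error(fnmatch.fnmatchcase, b"x", "*")

# Arguments of a single type work and keep that type.
joined_bytes = posixpath.join(b"x", b"y")
matched_bytes = fnmatch.fnmatchcase(b"x.py", b"*.py")

print(join_mixed, match_mixed, joined_bytes, matched_bytes)
```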

TODO:
 - review this patch :-)
 - support non-ASCII bytes in fnmatch.filter()
 - fix other functions, e.g. posixpath.isabs() and
fnmatch.fnmatchcase()
 - fix functions written in C: grep FileSystemDefaultEncoding
 - make sure that mixing bytes and str is rejected

Added file: http://bugs.python.org/file11658/python3_bytes_filename.patch




[issue3187] os.listdir can return byte strings

2008-09-28 Thread Martin v. Löwis

Martin v. Löwis [EMAIL PROTECTED] added the comment:

I'd like to propose yet another approach: make sure that conversion
according to the file system encoding always succeeds. If an
unconvertable byte is detected, map it into some private-use character.
To reduce the chance of conflict with other people's private-use
characters, we can use some of the plane 15 private-use characters, e.g.
map byte 0xPQ to U+F30PQ (in two-byte Unicode mode, this would result in
a surrogate pair).

This would make all file names accessible to all text processing
(including glob and friends); UI display would typically either report
an encoding error, or arrange for some replacement glyph to be shown.

There are certain variations of the approach possible, in case there is
objection to a specific detail.

--
nosy: +loewis




[issue3187] os.listdir can return byte strings

2008-09-28 Thread Dwayne Litzenberger

Dwayne Litzenberger [EMAIL PROTECTED] added the comment:

Martin,

Consider this scenario.  On ext3/Linux, assume that UTF-8 is specified
in the system locale.  What would happen if you have two files, named
b"\xf3\xb3\x83\x80\x00" and b"\xc0\x00"?  Under your proposal, the first
file would decode successfully as "\U000f30c0\x00", and the second file
would decode unsuccessfully, so it would be mapped to
"\U000f30c0\x00"---the same thing!

Under your proposal, you could end up with multiple files having the
same filename (from Python's perspective). Python shouldn't break if
somebody deliberately created some weird filenames.  Your proposal would
make it impossible to write a robust remote backup tool in Python 3.

Pathnames on ext3/Linux *are not Unicode*.  Blindly pretending they're
Unicode is a leaky abstraction at best, and a security hole at worst.
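
The collision can be reproduced with a small sketch. The error-handler name
private_use_escape and the helper decode_filename are made up for
illustration; this is not code from any patch attached to this issue, only
one way the proposed mapping (byte 0xPQ to U+F30PQ) might behave:

```python
import codecs


def private_use_escape(exc):
    """Map each undecodable byte 0xPQ to the private-use character U+F30PQ."""
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    replacement = "".join(chr(0xF3000 + byte) for byte in bad)
    # Resume decoding right after the bytes that could not be decoded.
    return replacement, exc.end


# Hypothetical handler name, registered only for this demonstration.
codecs.register_error("private_use_escape", private_use_escape)


def decode_filename(raw, encoding="utf-8"):
    """Decode a raw filename so that decoding never fails."""
    return raw.decode(encoding, "private_use_escape")


name_a = b"\xf3\xb3\x83\x80\x00"  # valid UTF-8 for U+F30C0, then NUL
name_b = b"\xc0\x00"              # 0xC0 is not valid UTF-8, so it is escaped

# Two different byte strings end up as the same str.
print(decode_filename(name_a) == decode_filename(name_b))
```

Both names decode to the same str, so the original bytes can no longer be
recovered from the decoded name alone.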




[issue3187] os.listdir can return byte strings

2008-09-28 Thread Guido van Rossum

Guido van Rossum [EMAIL PROTECTED] added the comment:

You can call it a leaky abstraction all you want, but most people think
of filenames as text strings most of the time, and we need to somehow
support this, at least for users who agree.  I agree we also need to
support bytes strings (at least on Unix) in order to support backup
routines, and support for bytes in -> bytes out in os.listdir() is meant
for this.  The open() function should also support a pure bytes filename
(and almost does so -- _fileio does, but io.py doesn't yet). 
os.getcwd() is a weird case and will probably need to be given a flag to
make it return bytes (I don't like that style of API much, but the
alternative is perhaps worse -- os.getcwd_bytes()).

Conclusion: I support patches that make the I/O library work with either
bytes or strings.  (It's OK if the bytes don't actually work on Windows,
where the native type is apparently strings -- though it has a bytes API
too, doesn't it?)




[issue3187] os.listdir can return byte strings

2008-09-28 Thread Martin v. Löwis

Martin v. Löwis [EMAIL PROTECTED] added the comment:

> Consider this scenario.  On ext3/Linux, assume that UTF-8 is specified
> in the system locale.  What would happen if you have two files, named
> b"\xf3\xb3\x83\x80\x00" and b"\xc0\x00"?  Under your proposal, the first
> file would decode successfully as "\U000f30c0\x00", and the second file
> would decode unsuccessfully, so it would be mapped to
> "\U000f30c0\x00"---the same thing!

Correct.

> Under your proposal, you could end up with multiple files having the
> same filename (from Python's perspective). Python shouldn't break if
> somebody deliberately created some weird filenames.

I'm not so sure about that. Practicality beats purity.

> Your proposal would
> make it impossible to write a robust remote backup tool in Python 3.

There could be an option to set the file system encoding via an API
to some known safe value, such as Latin-1, or ASCII. If you set the
file system encoding to Latin-1, this escaping would never happen;
if you set it to ASCII, it would happen uniformly for all non-ASCII
bytes. The robust backup tool would have to know to set this option
on POSIX systems.

> Pathnames on ext3/Linux *are not Unicode*.  Blindly pretending they're
> Unicode is a leaky abstraction at best, and a security hole at worst.

I think most Linux users would disagree, and claim that file names are
indeed character strings (which is synonymous with being Unicode). It is
technically true that it's possible to create file names which are not
text, but that's really a bug, not a feature - Unix and POSIX were never
intended to work this way. Also, in the overwhelming majority of Python
applications, consistent support for practically-existing systems
matters more than robustness against malicious users.




[issue3187] os.listdir can return byte strings

2008-09-28 Thread Martin v. Löwis

Martin v. Löwis [EMAIL PROTECTED] added the comment:

> I agree we also need to
> support bytes strings (at least on Unix) in order to support backup
> routines

How about letting such applications set the file system encoding to
Latin-1?
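
Why Latin-1 is a "safe" choice here can be checked directly: every byte value
maps to exactly one code point, so decoding never fails and encoding gives the
original bytes back. A minimal sketch (the helper name roundtrips is made up
for illustration; no API for setting the file system encoding is assumed):

```python
def roundtrips(raw: bytes, encoding: str) -> bool:
    """Return True if raw survives a strict decode/encode round trip."""
    try:
        return raw.decode(encoding).encode(encoding) == raw
    except UnicodeDecodeError:
        return False


every_byte = bytes(range(256))

# Latin-1 maps each of the 256 byte values to one code point.
print(roundtrips(every_byte, "latin-1"))
# ASCII and UTF-8 reject the lone byte 0xC0 under strict decoding.
print(roundtrips(b"\xc0", "ascii"), roundtrips(b"\xc0", "utf-8"))
```

The price is that a Latin-1 decoded name is only correct as text if the file
really was named in Latin-1; otherwise it is just a lossless container for the
bytes.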




[issue3187] os.listdir can return byte strings

2008-09-28 Thread Martin v. Löwis

Martin v. Löwis [EMAIL PROTECTED] added the comment:

James Knight points out that UTF-8b can be used to give unambiguous
round-tripping of characters in a UTF-8 locale. So I would like to amend
my previous proposal:
- for a non-UTF-8 encoding, use private-use characters for roundtripping
- if the locale's charset is UTF-8, use UTF-8b as the file system encoding.
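
The UTF-8b idea later became available in Python as the "surrogateescape"
error handler (PEP 383, Python 3.1 and later), which maps an undecodable byte
0xNN to the lone surrogate U+DCNN. A minimal sketch of the round trip, using
the two names from the earlier scenario (the helper names to_text and
to_bytes are made up for illustration):

```python
def to_text(raw: bytes) -> str:
    """Decode a raw filename, escaping undecodable bytes as lone surrogates."""
    return raw.decode("utf-8", "surrogateescape")


def to_bytes(text: str) -> bytes:
    """Reverse to_text(), turning escaped surrogates back into raw bytes."""
    return text.encode("utf-8", "surrogateescape")


name_a = b"\xf3\xb3\x83\x80\x00"  # valid UTF-8 for U+F30C0, then NUL
name_b = b"\xc0\x00"              # 0xC0 is not valid UTF-8

# The two names stay distinct, and both round-trip to the original bytes.
print(to_text(name_a) != to_text(name_b))
print(to_bytes(to_text(name_a)) == name_a, to_bytes(to_text(name_b)) == name_b)
```

Because lone surrogates cannot come out of valid UTF-8 input, the escaped form
cannot collide with a successfully decoded name.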




[issue3187] os.listdir can return byte strings

2008-09-27 Thread STINNER Victor

Changes by STINNER Victor [EMAIL PROTECTED]:


Removed file: http://bugs.python.org/file11189/filename.py




[issue3187] os.listdir can return byte strings

2008-09-27 Thread STINNER Victor

Changes by STINNER Victor [EMAIL PROTECTED]:


Removed file: http://bugs.python.org/file11210/invalid_filename.patch




[issue3187] os.listdir can return byte strings

2008-09-27 Thread STINNER Victor

STINNER Victor [EMAIL PROTECTED] added the comment:

getcwd() fails with NOT FOUNT (not foun*d*?) if the current 
directory filename can't be converted to unicode (str type). Here is a 
patch to fall back to bytes if creating the unicode string fails.

Added file: http://bugs.python.org/file11632/getcwd_bytes.patch




[issue3187] os.listdir can return byte strings

2008-09-27 Thread Dwayne Litzenberger

Dwayne Litzenberger [EMAIL PROTECTED] added the comment:

On Sat, Sep 27, 2008 at 01:15:46AM +, Guido van Rossum wrote:
> I don't see the advantage over the existing rule bytes in -> bytes out...

Guido,

I figure I should say something since I have some experience in this area.

I wrote some automatic backup software in Python 2 earlier this year.  It
had to work on ext3/Linux (where filenames are natively octet-strings) and
on NTFS/Win32 (where filenames are natively unicode-strings).  I had to be
ridiculously careful to always use unicode paths on Win32, and to always
use str paths on Linux, because otherwise Python would do the conversion
automatically---poorly.

It was particularly bad on Win32, where if you used os.listdir() with a
non-unicode path (Python 2.x str object) in a directory that contained
non-ascii filenames, Windows would invent filenames that looked similar but
couldn't actually be found when using open().  So, naive (Python 2) code
like this would break:

    for filename in os.listdir("."):
        f = open(filename, "rb")
        # ...

On Linux, it was bad too, since if you used unicode paths, the filenames
actually opened would depend on your LANG or LC_CTYPE or LC_ALL environment
variables, and those could vary from one system to another, or even from
one invocation of the program to another.

The simple fact of the matter is that pathnames on Linux are _not_ Unicode,
and pathnames on Windows are _not_ octet strings.  They're fundamentally
incompatible types that can only be reconciled when you make assumptions
(e.g. specifying a character encoding) that allow you to convert from one
to the other.

Ideally, io.open(), os.listdir(), os.path.*, etc. would accept _only_
pathnames in their native format, and it would be the job of a wrapper to
provide a portable-but-less-robust interface on top of that.  Perhaps the
built-in functions would use the wrapper (with reasonable defaults), but
the native-only interface should be there for module-writers who want
robust pathname handling.




[issue3187] os.listdir can return byte strings

2008-09-26 Thread Barry A. Warsaw

Changes by Barry A. Warsaw [EMAIL PROTECTED]:


--
priority: release blocker - deferred blocker




[issue3187] os.listdir can return byte strings

2008-09-26 Thread Benjamin Peterson

Benjamin Peterson [EMAIL PROTECTED] added the comment:

Ok. Here's another possibility. It adds another optional parameter to
listdir. If False, bytes strings can be returned. Otherwise, the
UnicodeDecodeError is reraised.

Added file: http://bugs.python.org/file11629/force_unicode.patch




[issue3187] os.listdir can return byte strings

2008-09-26 Thread Benjamin Peterson

Changes by Benjamin Peterson [EMAIL PROTECTED]:


Removed file: http://bugs.python.org/file11629/force_unicode.patch




[issue3187] os.listdir can return byte strings

2008-09-26 Thread Benjamin Peterson

Changes by Benjamin Peterson [EMAIL PROTECTED]:


Added file: http://bugs.python.org/file11630/force_unicode.patch




[issue3187] os.listdir can return byte strings

2008-09-26 Thread Guido van Rossum

Guido van Rossum [EMAIL PROTECTED] added the comment:

On Fri, Sep 26, 2008 at 5:47 PM, Benjamin Peterson
[EMAIL PROTECTED] wrote:
> Ok. Here's another possibility. It adds another optional parameter to
> listdir. If False, bytes strings can be returned. Otherwise, the
> UnicodeDecodeError is reraised.

I don't see the advantage over the existing rule bytes in -> bytes out...




[issue3187] os.listdir can return byte strings

2008-09-26 Thread Benjamin Peterson

Benjamin Peterson [EMAIL PROTECTED] added the comment:

Does that mean that the right thing to do is raise decoding errors when
unicode is given and fix the path modules so they can use bytes?




[issue3187] os.listdir can return byte strings

2008-09-23 Thread Benjamin Peterson

Benjamin Peterson [EMAIL PROTECTED] added the comment:

Here's another patch. It simply propagates the UnicodeDecodeErrors. I
like this because it avoids the problem of silently ignoring errors, and
people can get bytes if they want by passing in a bytes path.
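
The "pass in a bytes path to get bytes" rule can be seen with a throwaway
directory. A minimal sketch, assuming a current Python 3 (the helper name
list_both_ways and the file name example.txt are made up for illustration):

```python
import os
import tempfile


def list_both_ways(directory: str):
    """List a directory once with a str path and once with a bytes path."""
    as_str = os.listdir(directory)                 # str in -> str out
    as_bytes = os.listdir(os.fsencode(directory))  # bytes in -> bytes out
    return as_str, as_bytes


with tempfile.TemporaryDirectory() as tmp:
    # Create one empty file so that there is something to list.
    open(os.path.join(tmp, "example.txt"), "w").close()
    names_str, names_bytes = list_both_ways(tmp)

print(names_str, names_bytes)
```

The type of the argument selects the type of every returned name, so a caller
never has to handle a mixed list.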

Added file: http://bugs.python.org/file11581/raise_decoding_errors.patch




[issue3187] os.listdir can return byte strings

2008-09-23 Thread Guido van Rossum

Guido van Rossum [EMAIL PROTECTED] added the comment:

Hmm... much of the os.path machinery (and os.walk) probably doesn't work
with bytes, and neither do fnmatch.py and glob.py, I expect.  Plus
io.open() refuses bytes for the filename, even though _fileio accepts
them.  The latter should be fixed regardless, and one of the attachments
here has a fix IIRC.

Gotta run, sorry.




[issue3187] os.listdir can return byte strings

2008-09-23 Thread STINNER Victor

STINNER Victor [EMAIL PROTECTED] added the comment:

Guido compiled my patches here: http://codereview.appspot.com/3055

My patches allow bytes for fnmatch.filter(), glob.glob1(), 
os.path.join() and open().
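
What an all-bytes workflow looks like once these functions accept bytes can be
sketched as follows. This assumes a current Python 3, where os.path.join(),
fnmatch.filter() and open() accept bytes; the file name notes.txt is made up,
and glob.glob1() is left out because it is a private helper:

```python
import fnmatch
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    tmp_b = os.fsencode(tmp)                    # directory path as bytes

    # os.path.join() with bytes arguments returns bytes.
    path_b = os.path.join(tmp_b, b"notes.txt")

    # open() accepts a bytes filename.
    with open(path_b, "wb") as f:
        f.write(b"hello")

    # fnmatch.filter() matches bytes names against a bytes pattern.
    names = os.listdir(tmp_b)
    matches = fnmatch.filter(names, b"*.txt")

    with open(os.path.join(tmp_b, matches[0]), "rb") as f:
        content = f.read()

print(matches, content)
```

No name is decoded at any point, so the sequence does not depend on the locale
or on the file system encoding.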




[issue3187] os.listdir can return byte strings

2008-09-21 Thread Benjamin Peterson

Benjamin Peterson [EMAIL PROTECTED] added the comment:

Here's a potential patch for listdir. It emits a UnicodeWarning (or
should that be a BytesWarning?) and skips the file when decoding fails.
What would be the best way to test this?

Added file: http://bugs.python.org/file11546/listdir_bytes_warning.patch




[issue3187] os.listdir can return byte strings

2008-09-21 Thread Amaury Forgeot d'Arc

Amaury Forgeot d'Arc [EMAIL PROTECTED] added the comment:

I did not test the patch, but I have some remarks about it:
- %r does not seem to be handled by PyUnicode_FromFormat; %R maybe?
- In this case, PyObject_Repr(v) is not necessary - and this will avoid 
a reference leak.
- Does the warning warn multiple times? IIRC the default behaviour is to 
warn once.




[issue3187] os.listdir can return byte strings

2008-09-21 Thread Benjamin Peterson

Changes by Benjamin Peterson [EMAIL PROTECTED]:


Removed file: http://bugs.python.org/file11546/listdir_bytes_warning.patch




[issue3187] os.listdir can return byte strings

2008-09-21 Thread Benjamin Peterson

Benjamin Peterson [EMAIL PROTECTED] added the comment:

Here are two more patches. One is like the old one with Amaury's comments
observed. The other simply notes if there were decoding problems and
warns once at the end of the listdir call.

Making a warning happen more than once is tricky because it requires
messing with the warnings filter. This of course takes away some of the
user's control which is one of the main reasons for using the Python
warning system in the first place.

(I almost wish we could write another listdir that returned the names it
could decode and a list of those it couldn't.)
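
Such a "two lists" listdir can be sketched in pure Python on top of the bytes
API. The function names split_decodable and listdir_split are hypothetical and
not part of any patch here; strict decoding with the file system encoding is
an assumption of the sketch:

```python
import os
import sys


def split_decodable(raw_names, encoding=None):
    """Split raw names into (decoded str names, undecodable bytes names)."""
    encoding = encoding or sys.getfilesystemencoding()
    decoded, undecodable = [], []
    for raw in raw_names:
        try:
            decoded.append(raw.decode(encoding))  # strict: fails on bad bytes
        except UnicodeDecodeError:
            undecodable.append(raw)               # keep the original bytes
    return decoded, undecodable


def listdir_split(path="."):
    """Hypothetical listdir variant returning both lists for a directory."""
    return split_decodable(os.listdir(os.fsencode(path)))


good, bad = split_decodable([b"ok.txt", b"bad\xff"], "utf-8")
print(good, bad)
```

Nothing is skipped silently and no warning filter is involved; the caller
decides what to do with the names that could not be decoded.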

Added file: http://bugs.python.org/file11549/listdir_encoding_warning.patch




[issue3187] os.listdir can return byte strings

2008-09-21 Thread Benjamin Peterson

Changes by Benjamin Peterson [EMAIL PROTECTED]:


Removed file: http://bugs.python.org/file10719/oslistdir_string.patch




[issue3187] os.listdir can return byte strings

2008-09-21 Thread Benjamin Peterson

Changes by Benjamin Peterson [EMAIL PROTECTED]:


Added file: http://bugs.python.org/file11550/warn_at_the_end.patch




[issue3187] os.listdir can return byte strings

2008-09-18 Thread Helmut Jarausch

Helmut Jarausch [EMAIL PROTECTED] added the comment:

Hi,
is this assumed to be fixed in 3.0rc1?

With SVN 66506 (3.0rc1+),

    for dirname, subdirs, files in os.walk(bytes(Top,'iso-8859-1')) :

still gives an error here:

        for dirname, subdirs, files in os.walk(bytes(Top,'iso-8859-1')) :
      File "/usr/local/lib/python3.0/os.py", line 268, in walk
        if isdir(join(top, name)):
      File "/usr/local/lib/python3.0/posixpath.py", line 64, in join
        if b.startswith('/'):
    TypeError: expected an object with the buffer interface
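
For comparison, the same kind of call can be tried against a throwaway
directory. A minimal sketch, assuming a current Python 3 in which os.walk()
accepts a bytes top (the directory layout and the name walked are made up for
illustration):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Build a tiny tree: <tmp>/sub/file.txt
    os.mkdir(os.path.join(tmp, "sub"))
    open(os.path.join(tmp, "sub", "file.txt"), "w").close()

    # Walking from a bytes path yields bytes for every component.
    walked = [
        (dirname, sorted(subdirs), sorted(files))
        for dirname, subdirs, files in os.walk(os.fsencode(tmp))
    ]

for dirname, subdirs, files in walked:
    print(dirname, subdirs, files)
```

Every dirname, subdirectory name and file name comes back as bytes, so the
join inside the walk never mixes bytes and str.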




[issue3187] os.listdir can return byte strings

2008-09-17 Thread Barry A. Warsaw

Changes by Barry A. Warsaw [EMAIL PROTECTED]:


--
priority: deferred blocker - release blocker




[issue3187] os.listdir can return byte strings

2008-09-09 Thread Barry A. Warsaw

Changes by Barry A. Warsaw [EMAIL PROTECTED]:


--
priority: release blocker - deferred blocker




[issue3187] os.listdir can return byte strings

2008-08-26 Thread Dwayne Litzenberger

Changes by Dwayne Litzenberger [EMAIL PROTECTED]:


--
nosy: +dlitz




[issue3187] os.listdir can return byte strings

2008-08-26 Thread Dwayne Litzenberger

Dwayne Litzenberger [EMAIL PROTECTED] added the comment:

I think Guido already understands this, but I haven't seen it stated
very clearly here:

** Different systems use different things to identify files. **

On Linux/ext3, all filenames are *octet strings* (i.e. bytes), and
*only* the following caveats apply:
- a filename/pathname cannot contain the zero-octet (b"\x00").
- a filename/pathname cannot be empty.
- a filename cannot contain the slash (b"/"); in a pathname, the slash
is used to separate filenames.
- the filenames b"." and b".." have special meanings; they cannot be
created, deleted, or renamed.

All filenames that meet these criteria are valid, and calling them
invalid amounts to plugging one's ears and shouting "LA LA LA" while
imagining Unicode having pre-dated Unix.
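
The caveats listed above translate almost literally into a predicate on bytes.
A minimal sketch (the function name is_creatable_linux_filename is made up for
illustration, and it covers only the rules listed in this message, nothing
else a real file system might enforce):

```python
def is_creatable_linux_filename(name: bytes) -> bool:
    """Check a single filename (not a pathname) against the listed caveats."""
    if not name:
        return False                 # a filename cannot be empty
    if b"\x00" in name:
        return False                 # no zero-octet
    if b"/" in name:
        return False                 # the slash separates filenames
    if name in (b".", b".."):
        return False                 # special names, cannot be created
    return True


# Bytes that are not valid UTF-8 are still a perfectly valid filename.
print(is_creatable_linux_filename(b"\xc0\xff"))
print(is_creatable_linux_filename(b"a/b"))
```

Note that nothing in the check refers to an encoding: validity is defined on
octets only.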

It is sometimes convenient to imagine filenames on Linux/ext3 as
sequences of Unicode code points (where the encoding is specified by
LC_CTYPE---it's not necessarily UTF-8), but other times (e.g. in backup
tools that need to be robust in the face of mischievous users) it is an
unnecessary abstraction that introduces bugs.

On Windows/NTFS, the situation is entirely different: Filenames are
actually sequences of Unicode code points, and if you pretend they are
octet strings, Windows will happily invent phantom filenames for you
that will show up in the output of os.listdir(), but that will return
"File not found" if you try to open them for reading (if you open them
for writing, you risk clobbering other files that happen to have the
same names).

To avoid bugs, it should be possible to work exclusively with filenames
in the platform's native representation.  It was possible in Python 2
(though you had to be very careful).  Ideally, Python 3 would recognize
and enforce the difference instead of trying to guess the translations;
Explicit is better than implicit and all that.




[issue3187] os.listdir can return byte strings

2008-08-22 Thread STINNER Victor

STINNER Victor [EMAIL PROTECTED] added the comment:

I implemented the invalid filename class feature:
 - by default, os.listdir() raises an error (UnicodeDecodeError) on an 
invalid filename. The previous behaviour was to return a bytes object 
instead of str.
 - if invalid_filename=True: create an InvalidFilename class instance

InvalidFilename is not a bytes string, it's not a str string, it's a 
new class. It has three attributes:
 - bytes: the real filename
 - charset: charset (type str)
 - str: fake filename (type str) used by __str__() method

My patch also fixes os.path.join() to accept InvalidFilename: if at 
least one argument is an InvalidFilename, use InvalidFilename.join() 
(class method).

os.listdir() and os.unlink() are patched to accept InvalidFilename. 
unlink always accepts InvalidFilename whereas listdir() only produces 
InvalidFilename if os.listdir(path, invalid_filename=True) is used.

I added an optional argument invalid_filename to shutil.rmtree(), 
default value is *True*.

To sum up, visible changes:
 - os.listdir() raises an error on an invalid filename instead of 
returning a mixed list of str and bytes
 - shutil.rmtree() manipulates str and InvalidFilename instead of str 
and bytes

Added file: http://bugs.python.org/file11210/invalid_filename.patch




[issue3187] os.listdir can return byte strings

2008-08-22 Thread Guido van Rossum

Guido van Rossum [EMAIL PROTECTED] added the comment:

I'm not interested in the InvalidFilename class; it's an API
complification that might seem right for your situation but will hinder
most other people.  However I *am* interested in a patch that makes
os.unlink() (and as many other functions as you can think of) accept
bytes.  You'll have to think what encoding to use on Windows though,
since (AFAIK) the Windows filesystem APIs *do* use Unicode.




[issue3187] os.listdir can return byte strings

2008-08-22 Thread STINNER Victor

STINNER Victor [EMAIL PROTECTED] added the comment:

@gvanrossum: os.unlink() and os.lstat() already accept byte filenames 
(but open() doesn't).

Ok, here is a very small patch for posixpath.join() to accept bytes 
strings. This patch is enough to fix my initial problem (#3616).

Added file: http://bugs.python.org/file11212/posix_path_bytes.patch




[issue3187] os.listdir can return byte strings

2008-08-22 Thread STINNER Victor

STINNER Victor [EMAIL PROTECTED] added the comment:

My last patch (posix_join_bytes.patch) is also enough to fix the 
initial reported problem: error in posixpath.join() called by 
os.walk(). I tried os.walk() on a directory with invalid filenames and 
invalid directory name and it works well.

So the last bug is open(), which disallows opening a file with an 
invalid name. So here is another patch for that.

Added file: http://bugs.python.org/file11213/io_byte_filename.patch




[issue3187] os.listdir can return byte strings

2008-08-22 Thread STINNER Victor

STINNER Victor [EMAIL PROTECTED] added the comment:

Patch glob.glob() to accept a directory with invalid filenames (invalid 
in the filesystem charset): just ignore bytes -> str conversion errors.
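
An alternative to ignoring conversion errors is to keep the whole glob in
bytes. A minimal sketch, assuming a current Python 3 in which glob.glob() and
glob.escape() accept bytes (the file names are made up for illustration, and
this is not what the attached patch does):

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    tmp_b = os.fsencode(tmp)
    for name in (b"a.txt", b"b.log"):
        open(os.path.join(tmp_b, name), "wb").close()

    # Escape the directory part so that only "*.txt" acts as a pattern.
    pattern = os.path.join(glob.escape(tmp_b), b"*.txt")
    found = glob.glob(pattern)
    found_names = [os.path.basename(p) for p in found]

print(found_names)
```

With a bytes pattern the results are bytes, so no name has to be decoded and
none can fail to decode.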

Added file: http://bugs.python.org/file11216/glob1_bytes.patch




[issue3187] os.listdir can return byte strings

2008-08-22 Thread Guido van Rossum

Guido van Rossum [EMAIL PROTECTED] added the comment:

See http://codereview.appspot.com/3055 for a code review of Victor's
latest patches.




[issue3187] os.listdir can return byte strings

2008-08-21 Thread STINNER Victor

STINNER Victor [EMAIL PROTECTED] added the comment:

If the filename can not be encoded correctly in the system charset, 
it's not really a problem. The goal is to be able to use open(), 
shutil.copyfile(), os.unlink(), etc. with the given filename.

    orig = filename from the kernel (bytes)
    filename = filename from listdir() (str)
    dest = filename to the kernel (bytes)

The goal is to get orig == dest. In my program Hachoir, to work around 
this problem I store the original filename (bytes) and convert it to 
unicode with character replacements (e.g. replace each invalid byte 
sequence by "?"). So the bytes string is used for open(), 
unlink(), ... and the unicode string is displayed to stdout for the 
user.

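IMHO, the best solution is to create a class like this:

    class Filename:
        def __init__(self, orig):
            # orig: the raw filename from the kernel (bytes)
            self.as_bytes = orig
            # myformat(): placeholder for the bytes -> str display conversion
            self.as_str = myformat(orig)
        def __str__(self):
            return self.as_str
        def __bytes__(self):
            return self.as_bytes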

New problems: I guess that functions operating on filenames 
(os.path.*) will have to support this new type (Filename class).
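
A runnable variant of the class can be sketched by filling in the display
conversion. Here the undefined myformat() is replaced by a plain decode with
errors="replace", which yields U+FFFD rather than "?"; that substitution and
the encoding parameter are assumptions of this sketch, not part of the
proposal:

```python
class Filename:
    """Keep the raw bytes for system calls and a lossy str for display."""

    def __init__(self, orig: bytes, encoding: str = "utf-8"):
        self.as_bytes = orig
        # Lossy, display-only form: bad byte sequences become U+FFFD.
        self.as_str = orig.decode(encoding, errors="replace")

    def __str__(self) -> str:
        return self.as_str

    def __bytes__(self) -> bytes:
        return self.as_bytes


name = Filename(b"caf\xe9.txt")   # 0xE9 is Latin-1 for e-acute, not valid UTF-8
print(str(name))                  # what the user would see
print(bytes(name))                # what would be passed back to the kernel
```

The str form is only for showing to the user; passing it back to the file
system would name a different (probably nonexistent) file.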




[issue3187] os.listdir can return byte strings

2008-08-21 Thread Antoine Pitrou

Antoine Pitrou [EMAIL PROTECTED] added the comment:

Quoting STINNER Victor [EMAIL PROTECTED]:
> IMHO, the best solution is to create such class:
>
> class Filename:
>     def __init__(self, orig):
>         self.as_bytes = orig
>         self.as_str = myformat(orig)
>     def __str__(self):
>         return self.as_str
>     def __bytes__(self):
>         return self.as_bytes

I agree that logically it's the right solution. It's also the most invasive. If
that class is made a subclass of str, however, existing code shouldn't break
more than it currently does.




[issue3187] os.listdir can return byte strings

2008-08-21 Thread STINNER Victor

STINNER Victor [EMAIL PROTECTED] added the comment:

I wrote a Filename class. I tried different approaches:
 * no parent class, "class Filename: ..." - I don't know how to make 
bytes(filename) work!? But it's the best option to avoid strange bugs 
(mixing bytes/str, remember Python 2.x...)
 * str parent class, "class Filename(str): ..." - doesn't work because 
os functions use the fake unicode filename before testing the bytes 
(real) filename
 * bytes parent class, "class Filename(bytes): ..." - that's the 
current implementation

The idea is to encode str -> bytes (and not bytes -> str, because we 
want to avoid problems with such conversions). So I reimplemented most 
bytes methods: __add__, __radd__, __contains__, startswith, endswith 
and index. The index method has no start/end arguments since the 
behaviour would be different from that of a real unicode string :-/

I added an example of fixed os.listdir(): create Filename() object if 
we get bytes. Should we always create Filename objects? I don't think 
so.

Added file: http://bugs.python.org/file11189/filename.py




[issue3187] os.listdir can return byte strings

2008-08-21 Thread Antoine Pitrou

Antoine Pitrou [EMAIL PROTECTED] added the comment:

> * bytes parent class, "class Filename(bytes): ..." - that's the
> current implementation

I don't think that makes sense (especially under Windows which has Unicode file
APIs). os.listdir() and friends should really return str or str-like objects,
not bytes-like objects with an additional __str__ method.

> * str parent class, "class Filename(str): ..." - doesn't work because
> os functions uses the fake unicode filename before testing the bytes
> (real) filename

Well, of course, if we create a filename type, then all os functions must be
adapted to accept it rather than assume str.

All this is highly speculative of course, and if we really follow this course
(i.e. create a filename type) it should probably be postponed to 3.1: too many
changes with far-reaching consequences.




[issue3187] os.listdir can return byte strings

2008-08-21 Thread STINNER Victor

STINNER Victor [EMAIL PROTECTED] added the comment:

On Thursday 21 August 2008 14:55:43, Antoine Pitrou wrote:
>> * bytes parent class, "class Filename(bytes): ..." - that's the
>> current implementation
>
> I don't think that makes sense (especially under Windows which has Unicode
> file APIs). os.listdir() and friends should really return str or str-like
> objects, not bytes-like objects with an additional __str__ method.

If we use "class Filename(str): ...", we have to ensure that all operations 
take care of the charset, because the unicode version is invalid and must 
not be used to access the file system. Dummy example: Filename() + "/" 
should not return str but raise an error or create a new filename.

> Well, of course, if we create a filename type, then all os functions must
> be adapted to accept it rather than assume str.

If Filename has no parent class but is convertible to bytes(), os functions 
require no change and so we can fix it before final 3.0 ;-)




[issue3187] os.listdir can return byte strings

2008-08-21 Thread Antoine Pitrou

Antoine Pitrou [EMAIL PROTECTED] added the comment:

> If Filename has no parent class but is convertible to bytes(), os
> functions requires no change and so we can fix it before final 3.0 ;-)

This sounds highly optimistic.

Also, I think it's wrong to introduce a string-like class with implicit
conversion both to bytes and to str, while we have taken all measures to
make sure that bytes/str exchangeability doesn't exist any more in py3k.




[issue3187] os.listdir can return byte strings

2008-08-21 Thread Guido van Rossum

Guido van Rossum [EMAIL PROTECTED] added the comment:

The proper work-around is for the app to pass bytes into os.listdir();
then it will return bytes.  It would be nice if open() etc. accepted
bytes (as well as strings of course), at least on Unix, but not
absolutely necessary -- the app could also just know the right encoding.

I see two reasonable alternatives for what os.listdir() should return
when the input is a string and one of the filenames can't be decoded:
either omit it from the output list; or use errors='replace' in the
encoding.  Failing the entire os.listdir() call is not acceptable, and
neither is returning a mixture of str and bytes instances.
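
The "bytes in, bytes out" work-around can be sketched as follows. This
reflects how os.listdir() behaves in current Python 3 releases; the helper
name list_both_ways is made up for the illustration, and os.fsencode() only
exists in releases later than the ones discussed in this thread:

```python
import os
import tempfile


def list_both_ways(directory):
    """Return the directory listing once as str and once as bytes."""
    as_str = sorted(os.listdir(directory))
    # A bytes argument makes os.listdir() return bytes names.
    as_bytes = sorted(os.listdir(os.fsencode(directory)))
    return as_str, as_bytes


with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "example.txt"), "w").close()
    str_names, bytes_names = list_both_ways(tmp)
    print(str_names)
    print(bytes_names)
```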

--
nosy: +gvanrossum




[issue3187] os.listdir can return byte strings

2008-08-21 Thread STINNER Victor

STINNER Victor [EMAIL PROTECTED] added the comment:

On Thursday 21 August 2008 18:17:47, Guido van Rossum wrote:
 The proper work-around is for the app to pass bytes into os.listdir();
 then it will return bytes.

In my case, I would just like to remove a directory with shutil.rmtree(). I 
don't know whether it contains bytes or character filenames :-)

 It would be nice if open() etc. accepted 
 bytes (as well as strings of course), at least on Unix, but not
 absolutely necessary -- the app could also just know the right encoding.

An invalid filename has no charset. It's just a raw byte string. So open(), 
unlink(), etc. have to accept byte strings. Maybe not in the Python-level 
version, but at the low level (C version)?

 I see two reasonable alternatives for what os.listdir() should return
 when the input is a string and one of the filenames can't be decoded:
 either omit it from the output list;

It's not a good option: rmtree() will fail because the directory is not 
empty :-/

 or use errors='replace' in the encoding.

It will also fail because the filenames will be invalid (valid unicode strings 
but nonexistent file names :-/).

 Failing the entire os.listdir() call is not acceptable, and 
 neither is returning a mixture of str and bytes instances.

Ok, I have another suggestion:
 - *by default*, listdir() only returns str and raises an error (TypeError?) 
   on an invalid filename
 - add an optional argument (a callback), e.g. fallback_encoder, to catch
   such errors (similar to onerror from shutil.rmtree())

Example of a new listdir implementation (pseudo-code):

   # opendir(), readdir() and closedir() stand for the C-level calls
   charset = sys.getfilesystemencoding()
   dirobj = opendir(path)
   try:
      for bytesname in readdir(dirobj):
         try:
            name = str(bytesname, charset)
         except UnicodeDecodeError:
            # let the callback decide what to do with an undecodable name
            name = fallback_encoder(bytesname)
         yield name
   finally:
      closedir(dirobj)

The default fallback_encoder:

   def fallback_encoder(name):
      raise

Keep raw bytes string:

   def fallback_encoder(name):
      return name

Create my custom filename object:

   class Filename:
      ...

   def fallback_encoder(name):
      return Filename(name)

If a callback is overkill, we can just add an option, 
e.g. keep_invalid_filename=True, to ask listdir() to keep the bytes string if 
the conversion to unicode fails.

In any case, open(), unlink(), etc. have to accept byte strings to be able to 
read, copy, and remove invalid filenames. In a perfect world, all filenames 
would be valid UTF-8 strings, but in the real world (think of The Matrix :-)), 
we have to support such strange cases...




[issue3187] os.listdir can return byte strings

2008-08-21 Thread Guido van Rossum

Guido van Rossum [EMAIL PROTECTED] added the comment:

So shutil should be fixed to pass a bytes value to os.listdir().  But
then os.remove() should be fixed to accept bytes as well.  This is the
crux I believe: on Unix at least, syscall wrappers should accept bytes
for filenames.  And this would then have to be extended to things like
the functions in os.path, and we'd need bytes versions of os.sep and
os.altsep...  This sounds like a good project for 3.1.

I do not accept an os.listdir() that raises an error because one
filename cannot be decoded.  It sounds like using errors='replace' is
also wrong -- so the only solution is for os.listdir() to skip files it
cannot decode.  While this doesn't help for rmtree(), it is better than
errors='replace' for code that descends into the tree looking for files
matching a pattern or other property.  So I propose this as a patch for 3.0.

The callback variant is too complex; you could write it yourself by
using os.listdir() with a bytes argument.  This also applies to
proposals like passing optional encoding and errors arguments to
os.listdir().
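
A rough sketch of "writing it yourself" on top of os.listdir() with a bytes
argument. The names decode_names, listdir_with_fallback and keep_bytes are
hypothetical helpers, not part of the os module; the fallback callback
follows the fallback_encoder idea proposed earlier in the thread:

```python
import os
import sys


def decode_names(raw_names, encoding, fallback=None):
    """Decode bytes names; hand undecodable ones to the fallback callback."""
    names = []
    for raw in raw_names:
        try:
            names.append(raw.decode(encoding))
        except UnicodeDecodeError:
            if fallback is None:
                raise  # default policy: propagate the decoding error
            names.append(fallback(raw))
    return names


def listdir_with_fallback(path, fallback=None):
    """List a directory as bytes, then decode the names in user code."""
    raw_names = os.listdir(os.fsencode(path))
    return decode_names(raw_names, sys.getfilesystemencoding(), fallback)


def keep_bytes(raw):
    # One possible policy: keep the undecodable name as raw bytes.
    return raw
```

With keep_bytes the result can mix str and bytes, which os.listdir() itself
should not do according to the comments above; in this sketch that choice is
left to the caller.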




[issue3187] os.listdir can return byte strings

2008-08-21 Thread Benjamin Peterson

Benjamin Peterson [EMAIL PROTECTED] added the comment:

On Thu, Aug 21, 2008 at 6:31 PM, Guido van Rossum
[EMAIL PROTECTED] wrote:

 Guido van Rossum [EMAIL PROTECTED] added the comment:

 So shutil should be fixed to pass a bytes value to os.listdir().  But
 then os.remove() should be fixed to accept bytes as well.  This is the
 crux I believe: on Unix at least, syscall wrappers should accept bytes
 for filenames.  And this would then have to be extended to things like
 the functions in os.path, and we'd need bytes versions of os.sep and
 os.altsep...  This sounds like a good project for 3.1.

 I do not accept an os.listdir() that raises an error because one
 filename cannot be decoded.  It sounds like using errors='replace' is
 also wrong -- so the only solution is for os.listdir() to skip files it
 cannot decode.  While this doesn't help for rmtree(), it is better than
 errors='replace' for code that descends into the tree looking for files
 matching a pattern or other property.  So I propose this as a patch for 3.0.

As much as this may be the right idea, I don't like the idea of
silently losing the contents of a directory. That's asking for
difficult to discover bugs. Could Python emit a warning in this case?

 The callback variant is too complex; you could write it yourself by
 using os.listdir() with a bytes argument.  This also applies to
 proposals like passing optional encoding and errors arguments to
 os.listdir().




[issue3187] os.listdir can return byte strings

2008-08-21 Thread Guido van Rossum

Guido van Rossum [EMAIL PROTECTED] added the comment:

 I do not accept an os.listdir() that raises an error because one
 filename cannot be decoded.  It sounds like using errors='replace' is
 also wrong -- so the only solution is for os.listdir() to skip files it
 cannot decode.  While this doesn't help for rmtree(), it is better than
 errors='replace' for code that descends into the tree looking for files
 matching a pattern or other property.  So I propose this as a patch for 3.0.

 As much as this may be the right idea, I don't like the idea of
 silently losing the contents of a directory. That's asking for
 difficult to discover bugs.

Well, the other approaches also cause difficult to discover bugs (the
original bug report here was an example :-).

 Could Python emit a warning in this case?

This may be the best compromise yet. It would have to use the warnings
module so that you could disable it.
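
A minimal sketch of that compromise, written in pure Python rather than in
the C implementation of os.listdir(). The function name decode_or_skip and
the choice of RuntimeWarning are assumptions made for the illustration:

```python
import warnings


def decode_or_skip(raw_names, encoding="utf-8"):
    """Decode bytes names, skipping undecodable ones with a warning."""
    decoded = []
    for raw in raw_names:
        try:
            decoded.append(raw.decode(encoding))
        except UnicodeDecodeError:
            # Going through the warnings module lets callers filter this out.
            warnings.warn(
                f"skipping undecodable filename {raw!r}",
                RuntimeWarning,
                stacklevel=2,
            )
    return decoded
```

A caller who does not want the message can silence it with
warnings.simplefilter("ignore", RuntimeWarning) or a narrower filter.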




[issue3187] os.listdir can return byte strings

2008-08-20 Thread Antoine Pitrou

Antoine Pitrou [EMAIL PROTECTED] added the comment:

See #3616 for a consequence of this.

--
nosy: +haypo




[issue3187] os.listdir can return byte strings

2008-08-20 Thread Barry A. Warsaw

Changes by Barry A. Warsaw [EMAIL PROTECTED]:


--
priority: deferred blocker -> release blocker




[issue3187] os.listdir can return byte strings

2008-08-09 Thread Antoine Pitrou

Antoine Pitrou [EMAIL PROTECTED] added the comment:

Hmm, I suppose that while the filename is latin1-encoded,
Py_FileSystemDefaultEncoding is utf-8 and therefore os.listdir fails
decoding the filename and falls back on returning a byte string.
It was acceptable in Python 2.x but is a very annoying problem in py3k
now that unicode and bytes objects can't be mixed together anymore. I'm
bumping this to critical, although there is probably no clean solution.
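
The decoding failure described here can be reproduced without touching the
file system. The sample name "café" is only an illustration, not the file
from the original report:

```python
# A name written by a latin-1 application: the "é" becomes the single byte 0xE9.
raw_name = "café".encode("latin-1")

try:
    raw_name.decode("utf-8")
    decodable = True
except UnicodeDecodeError:
    # 0xE9 starts a multi-byte UTF-8 sequence that is never completed.
    decodable = False

print(decodable)  # False
```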

--
nosy: +pitrou
priority:  -> critical
type: crash -> behavior




[issue3187] os.listdir can return byte strings

2008-08-09 Thread Benjamin Peterson

Benjamin Peterson [EMAIL PROTECTED] added the comment:

Let's make this a release blocker for RCs.

--
priority: critical -> deferred blocker




[issue3187] os.listdir can return byte strings

2008-06-24 Thread Benjamin Peterson

Changes by Benjamin Peterson [EMAIL PROTECTED]:


--
title: os.walk - strange bug -> os.listdir can return byte strings
