[issue15441] test_posixpath fails on Japanese edition of Windows
Roundup Robot added the comment: New changeset b3434c1ae503 by Victor Stinner in branch 'default': Issue #15441, #15478: Reenable test_nonascii_abspath() on Windows http://hg.python.org/cpython/rev/b3434c1ae503 -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue15441 ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue15441] test_posixpath fails on Japanese edition of Windows
STINNER Victor added the comment:

> If a file name was invalid byte character, os.chdir() raises UnicodeDecodeError() instead of WindowsError.

I believe this case is now handled correctly in Python 3.4 (version under development) thanks to my work on issue #15478.

Thanks for the report, Atsuo Ishimoto! Don't hesitate to report other similar issues ;-)
[issue15441] test_posixpath fails on Japanese edition of Windows
Roundup Robot added the comment: New changeset 3edc71ed19e7 by Victor Stinner in branch 'default': Issue #15441: Skip test_nonascii_abspath() of test_genericpath on Windows http://hg.python.org/cpython/rev/3edc71ed19e7 -- nosy: +python-dev
[issue15441] test_posixpath fails on Japanese edition of Windows
STINNER Victor added the comment: This issue (the test) should be fixed; see issue #15478 for the real fix. -- resolution: -> fixed status: open -> closed
[issue15441] test_posixpath fails on Japanese edition of Windows
STINNER Victor victor.stin...@gmail.com added the comment:

> I still would prefer if only one issue at a time gets fixed, in particular if the two issues require independent changes.

Sorry, you are right: I created issue #15478 for the general fix, and we will use this issue to fix test_genericpath.

This issue can be fixed in Python 3.3, whereas #15478 will have to wait for Python 3.4 because it's a major change and may break a lot of code. It's better to wait for the next release to test such a change correctly.
[issue15441] test_posixpath fails on Japanese edition of Windows
STINNER Victor victor.stin...@gmail.com added the comment:

> +@unittest.skipIf(sys.platform == 'win32',
> +    "Win32 can fail cwd() with invalid utf8 name")
>  def test_nonascii_abspath(self):

You should not always skip the test on Windows: the filename is decodable in code pages other than cp932. It would be better to add the following code at the beginning of test_nonascii_abspath():

    name = b'\xe7w\xf0'
    if sys.platform == 'win32':
        try:
            os.fsdecode(name)
        except UnicodeDecodeError:
            self.skipTest("the filename %a is not decodable "
                          "from the ANSI code page (%s)"
                          % (name, sys.getfilesystemencoding()))

Note: Windows does not use UTF-8 for ANSI or OEM code pages, except if you change it manually.

> +batfile = """
> +chcp 932
> +{exe} {scriptname}
> +chcp {codepage}
> +"""

chcp does only change the OEM code page, whereas Python uses the ANSI code page for sys.getfilesystemencoding().

It is possible to change the ANSI code page of the current thread (CP_THREAD_ACP) using SetThreadLocale(), but it doesn't help because Python uses the global ANSI code page (CP_ACP). I don't think that changing the CP_THREAD_ACP code page changes the CP_ACP code page of child processes.

Changing the ANSI code page manually is possible in the Control Panel, but it requires rebooting Windows.

--

Your patch expects that os.mkdir(b'\xe7w\xf0'); os.chdir(b'\xe7w\xf0') works, whereas I tested it manually in Python and it doesn't work, because Windows creates a directory called "\u8f42" (b'\xe7w'); see my previous message (msg166441). At least with an NTFS filesystem on Windows 7.

--

Your last patch tries to decode the bytes filename from the filesystem encoding, or uses repr(filename). It may be better to keep the bytes filename unchanged in OSError.filename, instead of using repr(). But it sounds like a good idea to patch all PyErr_Set*WithFilename(..., char*) functions. My patch for path_error() avoids the creation of a temporary bytes object.

--

The test_support.temp_cwd(b'\xe7w\xf0') test was added by changeset ebdc2aa730c0 and is related to issue #3426. I'm not sure that it was really intended to test b'\xe7w\xf0', because a previous test was using u'\xe7w\xf0':

-# Issue 3426: check that abspath retuns unicode when the arg is unicode
-# and str when it's str, with both ASCII and non-ASCII cwds
-for cwd in (u'cwd', u'\xe7w\xf0'):

We may use b'\xe7w' instead of b'\xe7w\xf0' if b'\xe7w\xf0' cannot be decoded.

--

Attached patch win32_bytes_filename.patch tries to solve both issues: the test and UnicodeDecodeError on raising the OSError. It tries to decode the bytes filename from the FS encoding, or keeps it unchanged (as bytes), as Python 2 does with os.listdir(unicode).

-- nosy: +flox Added file: http://bugs.python.org/file26524/win32_bytes_filename.patch
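The "decode, or keep as bytes" behaviour described for win32_bytes_filename.patch can be sketched in pure Python. This is only an illustration of the idea, not the C code of the patch: the helper names filename_for_error and make_os_error are hypothetical, and the encoding is passed explicitly (cp932 here) rather than taken from the ANSI code page.

```python
import errno


def filename_for_error(raw: bytes, encoding: str):
    """Return the decoded name if it decodes strictly, else the bytes unchanged."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        # Undecodable name: keep the original bytes so no information is lost.
        return raw


def make_os_error(code: int, message: str, raw: bytes, encoding: str) -> OSError:
    """Build an OSError whose filename is str when possible, bytes otherwise."""
    return OSError(code, message, filename_for_error(raw, encoding))


if __name__ == "__main__":
    err = make_os_error(errno.ENOENT, "No such directory", b"\xe7w\xf0", "cp932")
    print(type(err.filename).__name__, repr(err.filename))
```

The point of the design is that building the exception can never itself fail with a UnicodeDecodeError, and the caller still sees exactly the name that was passed in.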
[issue15441] test_posixpath fails on Japanese edition of Windows
Atsuo Ishimoto ishim...@gembook.org added the comment:

> chcp does only change the OEM code page, whereas Python uses the ANSI code page for sys.getfilesystemencoding().

Sorry, I should have investigated the code more carefully.

> Attached patch win32_bytes_filename.patch tries to solve both issues: the test and UnicodeDecodeError on raising the OSError.

Looks good to me. Thank you!
[issue15441] test_posixpath fails on Japanese edition of Windows
Martin v. Löwis mar...@v.loewis.de added the comment: I still would prefer if only one issue at a time gets fixed, in particular if the two issues require independent changes. This issue is about test_nonascii_abspath failing on the Japanese edition of Windows (see the first sentence of the first message from Atsuo). If you absolutely must fix the other issue right away also, it needs a test case.
[issue15441] test_posixpath fails on Japanese edition of Windows
STINNER Victor victor.stin...@gmail.com added the comment:

The following change is a major change in how Python handles undecodable filenames on Windows:

-return PyUnicode_DecodeMBCS(s, size, NULL);
+return PyUnicode_DecodeMBCS(s, size, "surrogateescape");

I disagree with this change: Python should not generate surrogates *on Windows*.

By the way, there is also os.fsdecode(); it has the same behaviour as PyUnicode_DecodeFSDefault() and PyUnicode_DecodeFSDefaultAndSize() (it uses the strict error handler on Windows).

-- nosy: +loewis, tim.golden
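The difference between the error handlers discussed here can be seen from plain Python. This is a rough sketch, assuming Python's cp932 codec as a stand-in for the Japanese ANSI code page (the real 'mbcs' codec exists only on Windows); decode_with is a hypothetical helper introduced for the illustration.

```python
NAME = b"\xe7w\xf0"  # the test's directory name; not valid cp932 as a whole


def decode_with(handler: str, encoding: str = "cp932"):
    """Decode NAME with the given error handler; return the exception on failure."""
    try:
        return NAME.decode(encoding, handler)
    except UnicodeDecodeError as exc:
        return exc


if __name__ == "__main__":
    for handler in ("strict", "replace", "surrogateescape"):
        result = decode_with(handler)
        # ascii() keeps lone surrogates printable on any console.
        print(handler, "->", ascii(result))
```

With "strict" the decode fails, with "replace" the trailing byte becomes U+FFFD and is lost, and with "surrogateescape" it becomes a lone surrogate code point, which is the behaviour being objected to for Windows.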
[issue15441] test_posixpath fails on Japanese edition of Windows
STINNER Victor victor.stin...@gmail.com added the comment:

> If a file name was invalid byte character, os.chdir() raises UnicodeDecodeError() instead of WindowsError.

I realized that the problem is in the error handling: raising the OSError fails with a UnicodeDecodeError because PyErr_SetFromWindowsErrWithFilename() calls PyUnicode_DecodeFSDefault(), whereas the filename is not decodable.

If you want to change something, it should be PyErr_SetFromWindowsErrWithFilename(). We may use PyObject_Repr() or PyObject_ASCII(), for example.

--

See also issue #13374: the Windows bytes API has been deprecated in the os module. Use Unicode filenames instead of bytes filenames so that you no longer depend on the ANSI code page and can support any filename.
[issue15441] test_posixpath fails on Japanese edition of Windows
Atsuo Ishimoto ishim...@gembook.org added the comment: Yes, I know #13374; that's why I wrote "This is a byte-api issue on Windows, so we may be able to simply skip this test." Do you think we need a patch to avoid UnicodeDecodeError raised? Or should we change test to skip this?
[issue15441] test_posixpath fails on Japanese edition of Windows
STINNER Victor victor.stin...@gmail.com added the comment:

> Do you think we need a patch to avoid UnicodeDecodeError raised? Or should we change test to skip this?

It's a bug; the test should not be skipped. You should get an OSError because the chdir() failed, not a UnicodeDecodeError.
[issue15441] test_posixpath fails on Japanese edition of Windows
Martin v. Löwis mar...@v.loewis.de added the comment: As for your patch: you are missing the point of the test. The file name is assumed to be valid, despite it not being in the file system encoding.
[issue15441] test_posixpath fails on Japanese edition of Windows
Martin v. Löwis mar...@v.loewis.de added the comment: IMO, it is ok to skip the test on Windows; it was apparently targeted for Unix. If we preserve it, we should pick a file name (on Windows) which is encodable in the file system encoding.
[issue15441] test_posixpath fails on Japanese edition of Windows
STINNER Victor victor.stin...@gmail.com added the comment: decode_filename_mbcs.patch uses the "replace" error handler to decode the filename on Windows. It should solve the issue. -- Added file: http://bugs.python.org/file26515/decode_filename_mbcs.patch

diff -r b127046831e2 Python/errors.c
--- a/Python/errors.c   Mon Jul 23 00:24:24 2012 -0500
+++ b/Python/errors.c   Wed Jul 25 15:40:52 2012 +0200
@@ -463,11 +463,22 @@ PyErr_SetFromErrnoWithFilenameObject(PyO
     return NULL;
 }
 
+static PyObject *
+decode_filename(const char *filename)
+{
+    if (!filename)
+        return NULL;
+#ifdef HAVE_MBCS
+    return PyUnicode_DecodeMBCS(filename, strlen(filename), "replace");
+#else
+    return PyUnicode_DecodeFSDefault(filename);
+#endif
+}
 PyObject *
 PyErr_SetFromErrnoWithFilename(PyObject *exc, const char *filename)
 {
-    PyObject *name = filename ? PyUnicode_DecodeFSDefault(filename) : NULL;
+    PyObject *name = decode_filename(filename);
     PyObject *result = PyErr_SetFromErrnoWithFilenameObject(exc, name);
     Py_XDECREF(name);
     return result;
@@ -558,7 +569,7 @@ PyObject *PyErr_SetExcFromWindowsErrWith
     int ierr,
     const char *filename)
 {
-    PyObject *name = filename ? PyUnicode_DecodeFSDefault(filename) : NULL;
+    PyObject *name = decode_filename(filename);
     PyObject *ret = PyErr_SetExcFromWindowsErrWithFilenameObject(exc,
                                                                  ierr,
                                                                  name);
@@ -595,7 +606,7 @@ PyObject *PyErr_SetFromWindowsErrWithFil
     int ierr,
     const char *filename)
 {
-    PyObject *name = filename ? PyUnicode_DecodeFSDefault(filename) : NULL;
+    PyObject *name = decode_filename(filename);
     PyObject *result = PyErr_SetExcFromWindowsErrWithFilenameObject(
                                                   PyExc_WindowsError,
                                                   ierr, name);
@@ -892,7 +903,7 @@ PyErr_SyntaxLocationEx(const char *filen
         }
     }
     if (filename != NULL) {
-        tmp = PyUnicode_DecodeFSDefault(filename);
+        tmp = decode_filename(filename);
         if (tmp == NULL)
             PyErr_Clear();
         else {
[issue15441] test_posixpath fails on Japanese edition of Windows
Martin v. Löwis mar...@v.loewis.de added the comment: haypo: how is this meant to fix the bug? Won't it now cause a WindowsError, when a successful operation is expected?
[issue15441] test_posixpath fails on Japanese edition of Windows
STINNER Victor victor.stin...@gmail.com added the comment:

> haypo: how is this meant to fix the bug? Won't it now cause a WindowsError, when a successful operation is expected?

Oh, I was referring to the new test proposed in the attached patch (issue15441.patch):

+def test_chdir_invalid_filename(self):
+    self.assertRaises(WindowsError, os.chdir, b'\xe7w\xf0')

os.chdir() to a non-existent directory with a bytes name should raise an OSError, not a UnicodeDecodeError.

--

About the original issue: it looks like mkdir(bytes) decodes the directory name internally and ignores undecodable bytes. On Windows 7, mkdir(b"\xe7w\xf0") creates a directory called "\u8f42" (b"\xe7w"; the b"\xf0" suffix has been dropped). It is not possible to change the directory to b"\xe7w\xf0", but it works with b"\xe7w" or "\u8f42".

There are 2 issues:

* On Windows, os.chdir(bytes) should not raise a UnicodeDecodeError if the directory does not exist
* test_nonascii_abspath() can be skipped on Windows if os.fsdecode(b"\xe7w\xf0") fails, or the b"\xe7w" name should be used instead

My patch is not the best solution because it loses information (if the filename contains undecodable bytes). I realized that OSError.filename is not necessarily a str; bytes is also accepted. win32_error_object() can be used.

The following patch passes the original bytes object to the OSError constructor instead:

diff -r 43ae2a243eca Modules/posixmodule.c
--- a/Modules/posixmodule.c   Thu Jul 26 00:47:15 2012 +0200
+++ b/Modules/posixmodule.c   Thu Jul 26 01:19:14 2012 +0200
@@ -1138,11 +1138,10 @@ static PyObject *
 path_error(char *function_name, path_t *path)
 {
 #ifdef MS_WINDOWS
-    if (path->narrow)
-        return win32_error(function_name, path->narrow);
-    if (path->wide)
-        return win32_error_unicode(function_name, path->wide);
-    return win32_error(function_name, NULL);
+    return PyErr_SetExcFromWindowsErrWithFilenameObject(
+        PyExc_OSError,
+        0,
+        path->object);
 #else
     return path_posix_error(function_name, path);
 #endif

(sorry, I failed to attach a patch, I have an issue with my file chooser...)
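The observation that mkdir() keeps only the decodable prefix, so a later chdir() with the full bytes name cannot match, can be approximated outside Windows. This is a hedged sketch only: it assumes that Python's cp932 codec with the "ignore" error handler behaves like the internal conversion Windows applied in that experiment, which the thread does not confirm. The helper name name_seen_by_filesystem is hypothetical.

```python
REQUESTED = b"\xe7w\xf0"


def name_seen_by_filesystem(raw: bytes, encoding: str = "cp932") -> str:
    """Approximate a conversion that silently drops undecodable bytes."""
    return raw.decode(encoding, "ignore")


if __name__ == "__main__":
    created = name_seen_by_filesystem(REQUESTED)
    # Only the leading two-byte character survives; the lone b"\xf0" is dropped.
    print(ascii(created), len(created))
    # The shorter name b"\xe7w" decodes cleanly and maps to the same directory name.
    print(created == b"\xe7w".decode("cp932"))
```

Under that assumption, the created directory corresponds to b"\xe7w", which is why using the shorter name (or the Unicode name) works while the original three-byte name does not.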
[issue15441] test_posixpath fails on Japanese edition of Windows
Atsuo Ishimoto ishim...@gembook.org added the comment:

Here's another try. In this patch:

- Skip the test_nonascii_abspath() test, since it fails on some code pages.
- Add a test to reproduce the bug on Latin code pages.
- Use repr(filename) only if decoding failed. This is more backward-compatible and does not lose any information.

-- Added file: http://bugs.python.org/file26517/issue15441_2.patch
[issue15441] test_posixpath fails on Japanese edition of Windows
Atsuo Ishimoto ishim...@gembook.org added the comment: martin: os.mkdir(b'\xe7w\xf0') succeeds but, strangely enough, the subsequent os.chdir(b'\xe7w\xf0') fails. This is not a bug in Python, but a Windows issue.
[issue15441] test_posixpath fails on Japanese edition of Windows
Atsuo Ishimoto ishim...@gembook.org added the comment: Updated the patch again. In this patch, the bytes object is passed as an argument to WindowsError, as in Victor's patch. I prefer to fix PyErr_SetFromWindowsErrWithFilename() over path_error() because PyErr_SetFromWindowsErrWithFilename() is a public API, so someone may use it in their own extension modules. -- Added file: http://bugs.python.org/file26518/issue15441_3.patch
[issue15441] test_posixpath fails on Japanese edition of Windows
New submission from Atsuo Ishimoto ishim...@gembook.org:

test_posixpath.PosixCommonTest.test_nonascii_abspath fails on Japanese edition of Windows. If a file name was invalid byte character, os.chdir() raises UnicodeDecodeError() instead of WindowsError. This is a byte-api issue on Windows, so we may be able to simply skip this test.

======================================================================
ERROR: test_nonascii_abspath (test.test_posixpath.PosixCommonTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\cygwin\home\ishimoto\src\cpython\lib\test\test_genericpath.py", line 318, in test_nonascii_abspath
    with support.temp_cwd(b'\xe7w\xf0'):
  File "C:\cygwin\home\ishimoto\src\cpython\lib\contextlib.py", line 48, in __enter__
    return next(self.gen)
  File "C:\cygwin\home\ishimoto\src\cpython\lib\test\support.py", line 614, in temp_cwd
    os.chdir(path)
UnicodeDecodeError: 'mbcs' codec can't decode bytes in position 0--1: No mapping for the Unicode character exists in the target code page.
----------------------------------------------------------------------

----------
components: Unicode, Windows
files: test_nonascii_abspath.patch
keywords: patch
messages: 166300
nosy: ezio.melotti, ishimoto
priority: normal
severity: normal
status: open
title: test_posixpath fails on Japanese edition of Windows
type: behavior
versions: Python 3.3

Added file: http://bugs.python.org/file26500/test_nonascii_abspath.patch
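Why the same byte name passes on some Windows editions and fails on the Japanese one can be checked by decoding it under different code pages. The following is a minimal sketch, assuming Python's cp1252 and cp932 codecs stand in for the Western European and Japanese ANSI code pages; is_decodable is a hypothetical helper and not part of the test suite.

```python
def is_decodable(name: bytes, encoding: str) -> bool:
    """Return True if name decodes strictly in the given encoding."""
    try:
        name.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    name = b"\xe7w\xf0"
    for codepage in ("cp1252", "cp932"):
        print(codepage, is_decodable(name, codepage))
    # In cp932, b"\xe7w" forms one double-byte character and b"\xf0" is a
    # lead byte with nothing after it, so only the shorter name decodes.
    print("cp932 (short name)", is_decodable(b"\xe7w", "cp932"))
```

In cp1252 every byte of the name is a single character, so the name is valid there; in cp932 the trailing byte is an incomplete sequence, which matches the decode failure in the traceback above.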
[issue15441] test_posixpath fails on Japanese edition of Windows
Atsuo Ishimoto ishim...@gembook.org added the comment: Changed the name of the test method. In the old patch, the new test method shadowed an existing one. -- Added file: http://bugs.python.org/file26502/test_nonascii_abspath_2.patch
[issue15441] test_posixpath fails on Japanese edition of Windows
Changes by Atsuo Ishimoto ishim...@gembook.org: Removed file: http://bugs.python.org/file26500/test_nonascii_abspath.patch
[issue15441] test_posixpath fails on Japanese edition of Windows
Changes by Antoine Pitrou pit...@free.fr: -- nosy: +haypo stage: -> patch review
[issue15441] test_posixpath fails on Japanese edition of Windows
Changes by Atsuo Ishimoto ishim...@gembook.org: Removed file: http://bugs.python.org/file26502/test_nonascii_abspath_2.patch
[issue15441] test_posixpath fails on Japanese edition of Windows
Atsuo Ishimoto ishim...@gembook.org added the comment: I'm sorry, I generated the patch in the wrong direction. -- Added file: http://bugs.python.org/file26506/issue15441.patch