[issue9425] Rewrite import machinery to work with unicode paths

2010-10-18 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: Starting at r85691, the full test suite of Python 3.2 passes with ASCII, ISO-8859-1 and UTF-8 locale encodings in a non-ASCII directory. The work on this issue is done. -- resolution: -> fixed status: open -> closed
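To illustrate why the three locale encodings named above behave differently, here is a minimal Python sketch that checks whether a directory name can be encoded with each of them. The helper `is_encodable` and the sample directory names are hypothetical and are not part of the patch series discussed in this thread.

```python
def is_encodable(name, encoding):
    """Return True if `name` can be encoded with `encoding` in strict mode."""
    try:
        name.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Hypothetical directory names, used only for illustration.
names = ["build", "caf\u00e9", "\u20ac-dir"]
encodings = ["ascii", "iso-8859-1", "utf-8"]

# Map each name to its encodability under ASCII, ISO-8859-1 and UTF-8.
results = {
    name: [is_encodable(name, enc) for enc in encodings]
    for name in names
}
```

A name containing U+00E9 is rejected by ASCII only, while one containing U+20AC is also rejected by ISO-8859-1, which is why a test run in a non-ASCII directory needs to cover more than one locale encoding.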

[issue9425] Rewrite import machinery to work with unicode paths

2010-09-29 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: r85115 closes #9630: an important patch for #9425, redecode all filenames when setting the filesystem encoding. Next tasks (maybe not in this order): - merge getpath.c - redecode argv[0] used by PySys_SetArgvEx() to feed sys.path

[issue9425] Rewrite import machinery to work with unicode paths

2010-09-01 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: r84429 creates Py_UNICODE_strcat() (change with the patch: return the right value). r84430 creates PyUnicode_strdup() (change with the patch: rename the function from Py_UNICODE_strdup() to PyUnicode_strdup() and mangle the

[issue9425] Rewrite import machinery to work with unicode paths

2010-09-01 Thread STINNER Victor
Changes by STINNER Victor victor.stin...@haypocalc.com: Removed file: http://bugs.python.org/file18672/Py_UNICODE_strdup.patch ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue9425 ___

[issue9425] Rewrite import machinery to work with unicode paths

2010-09-01 Thread STINNER Victor
Changes by STINNER Victor victor.stin...@haypocalc.com: Removed file: http://bugs.python.org/file18671/Py_UNICODE_strcat.patch

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-29 Thread STINNER Victor
Changes by STINNER Victor victor.stin...@haypocalc.com: Added file: http://bugs.python.org/file18671/Py_UNICODE_strcat.patch

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-29 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: Py_UNICODE_strcat.patch: create Py_UNICODE_strcat() function. Py_UNICODE_strdup.patch: create Py_UNICODE_strdup() function. -- Added file: http://bugs.python.org/file18672/Py_UNICODE_strdup.patch

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-25 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: r84012 patches zipimporter_init() to use the new PyUnicode_FSDecoder() and use Py_UNICODE* (unicode) strings instead of char* (byte) strings. Oops, it's r84013 (not r84012)

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-24 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: See also #1552880.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-24 Thread Kristján Valur Jónsson
Kristján Valur Jónsson krist...@ccpgames.com added the comment: Yes. In #1552880 I tried to make as minimal a change as possible. This particular patch is still in use in EVE Online, which is installed in various strange and exotic paths in the Orient. The trick I employed there was to

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-22 Thread Romme
Changes by Romme sad.n...@gmail.com: -- nosy: +Romme

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-17 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: r84168 creates PyModule_GetFilenameObject(). I created a separate issue for the patch reencoding all filenames when setting the filesystem encoding: #9630.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-16 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: zipimport_read_directory.patch committed as r84095.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-16 Thread STINNER Victor
Changes by STINNER Victor victor.stin...@haypocalc.com: Removed file: http://bugs.python.org/file18527/zipimport_read_directory.patch

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-16 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: Py_UNICODE_strncmp.patch: create Py_UNICODE_strncmp() function. -- Added file: http://bugs.python.org/file18547/Py_UNICODE_strncmp.patch

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-16 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: Py_UNICODE_strncmp.patch was wrong for n=0. New version based on libiberty/strncmp.c source code. -- Added file: http://bugs.python.org/file18548/Py_UNICODE_strncmp-2.patch
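The n=0 case mentioned above is easy to get wrong in a bounded comparison. The following Python sketch models the usual C `strncmp()` contract on code points. It is an illustration only, not the committed `Py_UNICODE_strncmp()` code; the function name and the choice to treat the end of a Python string as a NUL terminator are assumptions made for this example.

```python
def strncmp(s1, s2, n):
    """Compare at most n code points, in the style of C strncmp().

    Returns 0 when the compared prefixes are equal, otherwise the
    difference between the first pair of differing code points.
    The end of a string is treated as a NUL (0) terminator.
    """
    for i in range(n):
        c1 = ord(s1[i]) if i < len(s1) else 0
        c2 = ord(s2[i]) if i < len(s2) else 0
        if c1 != c2:
            return c1 - c2
        if c1 == 0:
            # Both strings ended at the same position.
            return 0
    # Nothing differed within the first n code points (this includes n == 0).
    return 0
```

With this shape, comparing zero code points never reads from either string and always reports equality, which is the behaviour a caller expects from n=0.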

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-16 Thread STINNER Victor
Changes by STINNER Victor victor.stin...@haypocalc.com: Removed file: http://bugs.python.org/file18547/Py_UNICODE_strncmp.patch

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-16 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: Py_UNICODE_strncmp-2.patch committed as r84111.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-16 Thread STINNER Victor
Changes by STINNER Victor victor.stin...@haypocalc.com: Removed file: http://bugs.python.org/file18548/Py_UNICODE_strncmp-2.patch

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-16 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: r84120: the get_data() function of zipimport uses a unicode path.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-16 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: r84121: the repr() method of the zipimporter object uses unicode.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-16 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: r84122 saves/restores the exception around filename = _PyUnicode_AsString(co->co_filename); because it raises a unicode error on an unencodable filename.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-15 Thread Florent Xicluna
Florent Xicluna florent.xicl...@gmail.com added the comment: r83972 breaks OS X buildbots: support.TESTFN_UNENCODABLE is not defined if sys.platform == 'darwin'. File "/Users/db3l/buildarea/3.x.bolen-tiger/build/Lib/test/test_imp.py", line 309, in <module> class

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-15 Thread Florent Xicluna
Florent Xicluna florent.xicl...@gmail.com added the comment: It breaks test_unicode_file on OS X, too: File "/Users/db3l/buildarea/3.x.bolen-tiger/build/Lib/test/test_unicode_file.py", line 8, in <module> from test.support import (run_unittest, rmtree, ImportError: cannot import name

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-15 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: I tried to fix Mac OS X (TESTFN_UNENCODABLE) with r84035, but I don't have access to Mac OS X to test and my patch was not correct. It should now be ok with r84080.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-14 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: r84012 creates _Py_stat(). It differs slightly from the attached patch (_Py_stat.patch): it doesn't clear the Python exception on a unicode conversion error.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-14 Thread STINNER Victor
Changes by STINNER Victor victor.stin...@haypocalc.com: Removed file: http://bugs.python.org/file18448/_Py_stat.patch

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-14 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: r84012 patches zipimporter_init() to use the new PyUnicode_FSDecoder() and use Py_UNICODE* (unicode) strings instead of char* (byte) strings.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-14 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: r84030 creates _Py_fopen() for PyUnicodeObject path.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-14 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: zipimport_read_directory.patch: patch for read_directory() function of the zipimport module to support unencodable filenames. This patch requires #9599 (PySys_FormatStderr). The patch changes the encoding of the name: decode name

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-13 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: r83971 enables test.support.TESTFN_UNDECODEABLE on non-Windows OSes.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-13 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: I committed nullimporter_unicode.patch with a unit test as r83972.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-13 Thread STINNER Victor
Changes by STINNER Victor victor.stin...@haypocalc.com: Removed file: http://bugs.python.org/file18434/nullimporter_unicode.patch

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-13 Thread STINNER Victor
Changes by STINNER Victor victor.stin...@haypocalc.com: Removed file: http://bugs.python.org/file18469/_PyFile_FromFdUnicode.patch

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-13 Thread STINNER Victor
Changes by STINNER Victor victor.stin...@haypocalc.com: Removed file: http://bugs.python.org/file18444/pyerr_warnformat-2.patch

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-13 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: r83981 closes #9560: avoid the filename in _syscmd_file() to fix a bug with non-encodable filenames in platform.architecture().

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-13 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: About wchar2char: - PEP 383 says “With this PEP, non-decodable bytes >= 128 will be represented as lone surrogate codes U+DC80..U+DCFF. Bytes below 128 will produce exceptions”. Your patch accepts bytes below 128. - I don't understand why you
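The PEP 383 rule quoted above can be observed from Python through the `surrogateescape` error handler. The sketch below is an illustration of that rule only and says nothing about how `_Py_wchar2char()` is implemented in C; the helper name `can_escape` and the sample bytes are made up for this example.

```python
raw = b"caf\xe9"  # 0xE9 is not a valid ASCII byte

# The undecodable byte 0xE9 is mapped to the lone surrogate U+DCE9.
decoded = raw.decode("ascii", "surrogateescape")

# Encoding with the same handler restores the original byte.
restored = decoded.encode("ascii", "surrogateescape")


def can_escape(code_point):
    """Return True if the handler can turn this code point back into a byte."""
    try:
        chr(code_point).encode("ascii", "surrogateescape")
    except UnicodeEncodeError:
        return False
    return True
```

Only code points in U+DC80..U+DCFF can be turned back into bytes. A surrogate such as U+DC41, which would correspond to a byte below 128, is rejected, matching the range restriction raised in this review.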

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-13 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: New version of the patch _Py_wchar2char-2.patch: - _Py_wchar2char() only escapes characters in range U+DC80..U+DCFF (instead of U+DC00..U+DCFF) - add a comment to _Py_char2wchar() I don't understand why you decrement `size` in

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-13 Thread STINNER Victor
Changes by STINNER Victor victor.stin...@haypocalc.com: Removed file: http://bugs.python.org/file18431/_Py_wchar2char.patch

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-13 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: I know this is not introduced by your patch, just moved, but couldn’t the typo in UNDECODEABLE be fixed? (extraneous e) I wasn't sure that it was a typo, so I kept it unchanged. It's now fixed by r83987. --

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-13 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: r83989 creates _Py_wchar2char() function (_Py_wchar2char-2.patch).

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-13 Thread STINNER Victor
Changes by STINNER Victor victor.stin...@haypocalc.com: Removed file: http://bugs.python.org/file18514/_Py_wchar2char-2.patch

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-13 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: r83990 closes #9542 by creating the PyUnicode_FSDecoder() PyArg_ParseTuple parser.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-13 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: r83976 adds PyErr_WarnFormat() (pyerr_warnformat-2.patch).

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-13 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: I created #9599: Add PySys_FormatStdout and PySys_FormatStderr functions.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-12 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: (About PyFile_FromFd) pitrou Actually, I'm not sure there's much point since the name pitrou attribute is currently read-only: (...) Oh, it reminds me of #4762. I closed this issue with the message The last problem occurs with

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-11 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: Actually, I'm not sure there's much point since the name attribute is currently read-only: >>> f = open(1, "wb") >>> f.name = "foo" Traceback (most recent call last): File "<stdin>", line 1, in <module> AttributeError: attribute 'name' of '_io.BufferedWriter'
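The read-only behaviour reported above can be reproduced without touching standard output. This sketch opens the write end of a pipe instead of file descriptor 1, which is a choice made here for safety rather than something taken from the comment; the variable names are likewise illustrative.

```python
import os

# Use a pipe so the example does not write to or close stdout.
read_fd, write_fd = os.pipe()
f = open(write_fd, "wb")  # a buffered binary writer wrapping the descriptor

# For a file opened from a descriptor, `name` is the descriptor number.
original_name = f.name

try:
    f.name = "foo"
    assignment_failed = False
except AttributeError:
    # The buffered object exposes `name` as a read-only attribute.
    assignment_failed = True
finally:
    f.close()
    os.close(read_fd)
```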

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-10 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: I committed Py_UNICODE_strrchr.patch as r83933 after removing the useless start variable.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-10 Thread STINNER Victor
Changes by STINNER Victor victor.stin...@haypocalc.com: Removed file: http://bugs.python.org/file18446/Py_UNICODE_strrchr.patch

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-10 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: _PyFile_FromFdUnicode.patch: create _PyFile_FromFdUnicode() function. It will be used in import.c to open a file using a unicode filename. For _PyFile_FromFd(), I kept the previous behaviour: clear the exception on

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-08 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: _Py_wchar2char.patch: create _Py_wchar2char() private function, and _wstat() and _wfopen() use it. _Py_wchar2char() function has been improved since the previous version posted to Rietveld: it now computes the exact length of the

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-08 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: r83783 creates run_file() subfunction.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-08 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: pyerr_warnformat.patch: create PyErr_WarnFormat() function, and use it in PyType_Ready() and PyUnicode_AsEncodedString(). The patch also fixes setup_context(): work on the unicode filename, not the encoded (bytes) filename. It

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-08 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: nullimporter_unicode.patch: patch NullImporter_init(): - use GetFileAttributesW() instead of GetFileAttributesA() for the Windows version to be fully Unicode compliant - use O format with PyUnicode_FSConverter instead of es with

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-08 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: It looks like you are fixing a bug in setup_context() at the same time as you introduce PyErr_WarnFormat(). Both changes should probably go in separately. The PyErr_WarnFormat() doc needs a versionadded tag.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-08 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: pitrou It looks like you are a fixing a bug in setup_context() pitrou at the same time as you introduce PyErr_WarnFormat(). pitrou Both changes should probably go in separately. Right. r83860 fixes the bug, and I attached a new

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-08 Thread STINNER Victor
Changes by STINNER Victor victor.stin...@haypocalc.com: Removed file: http://bugs.python.org/file18432/pyerr_warnformat.patch

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-08 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: gutworth's comment about r83860: Test?

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-08 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: Py_UNICODE_strrchr.patch: Create Py_UNICODE_strrchr() function. It will be used for zipimport to work on unicode paths instead of bytes paths. Antoine noticed that the input string is const whereas the output string is not const,

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-08 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: I created a separate issue, #9542, to add the new function PyUnicode_FSDecoder().

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-08 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: _Py_stat.patch: create _Py_stat() function. It will be used in import.c and zipimport.c. I added the function to import.c because, initially, I only used it there. But it's maybe not the best place for such a function. posixmodule.c

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-08 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: r83870 creates load_builtin() subfunction in import.c to prepare and simplify the big patch.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-07 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: The patch is too huge to be committed at once. I will split it again into smaller parts. First related commit: r83778 fixes tests for non-encodable filenames.

[issue9425] Rewrite import machinery to work with unicode paths

2010-08-07 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: r83779 creates run_command(), it's just a refactoring.

[issue9425] Rewrite import machinery to work with unicode paths

2010-07-31 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: After some tests on Windows, I realized that my patch is not enough to be fully unicode compliant (on Windows). Some functions are still using PyUnicode_DecodeFSDefault() or PyUnicode_EncodeFSDefault(). Until all functions are

[issue9425] Rewrite import machinery to work with unicode paths

2010-07-29 Thread STINNER Victor
New submission from STINNER Victor victor.stin...@haypocalc.com: Python (2 and 3) is unable to load a module installed in a directory containing characters not encodable to the locale encoding. And Python doesn't work if it's installed in a non-ASCII directory on Windows or with a locale

[issue9425] Rewrite import machinery to work with unicode paths

2010-07-29 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: Oh, I forgot to say that I created an svn branch including my work: import_unicode. http://svn.python.org/view/python/branches/import_unicode/ You can try it if you prefer svn to a huge patch. I created a branch so you can follow

[issue9425] Rewrite import machinery to work with unicode paths

2010-07-29 Thread Ezio Melotti
Ezio Melotti ezio.melo...@gmail.com added the comment: I wrote a few minor comments on codereview. The patch should also include more tests. -- nosy: +ezio.melotti

[issue9425] Rewrite import machinery to work with unicode paths

2010-07-29 Thread Arfrever Frehtes Taifersar Arahesis
Changes by Arfrever Frehtes Taifersar Arahesis arfrever@gmail.com: -- nosy: +Arfrever

[issue9425] Rewrite import machinery to work with unicode paths

2010-07-29 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: The patch should also include more tests. Which kind of test? Running the test suite in a non-ASCII directory with an encoding other than UTF-8 is enough. If the patch is accepted, the solution is maybe a specific buildbot.

[issue9425] Rewrite import machinery to work with unicode paths

2010-07-29 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: Another important TODO: use weak references for the code objects list. -- I tested my patch on Windows. It fixes #8988 because non-ASCII characters are now correctly decoded with mbcs and not UTF-8. But it doesn't work with
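The difference between decoding with mbcs and decoding with UTF-8 can be sketched in portable Python. The `mbcs` codec exists only on Windows, so this example uses `cp1252` as a stand-in for the ANSI code page; that substitution and the sample bytes are assumptions made for illustration.

```python
raw = b"caf\xe9"  # "café" as stored by a single-byte Windows code page

# Decoding with the code page that produced the bytes gives the right text.
as_cp1252 = raw.decode("cp1252")

# The same bytes are not valid UTF-8: 0xE9 starts a multi-byte sequence
# that is never completed.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
```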