STINNER Victor <victor.stin...@haypocalc.com> added the comment:

Here is a work-in-progress patch: issue3080-3.patch. The patch is HUGE and written for Python 3.3.

$ diffstat issue3080-3.patch
 Doc/c-api/module.rst   |   24
 Include/import.h       |   73 +
 Include/moduleobject.h |    2
 Include/pycapsule.h    |    4
 Modules/zipimport.c    |  272 +++---
 Objects/moduleobject.c |   52 -
 PC/import_nt.c         |   84 +-
 Python/dynload_aix.c   |    2
 Python/dynload_dl.c    |    2
 Python/dynload_hpux.c  |    2
 Python/dynload_next.c  |    4
 Python/dynload_os2.c   |    2
 Python/dynload_shlib.c |    2
 Python/dynload_win.c   |    2
 Python/import.c        | 1910 +++++++++++++++++++++++++++----------------------
 Python/importdl.c      |   79 +-
 Python/importdl.h      |    2
 issue3080.py           |   29
 18 files changed, 1484 insertions(+), 1063 deletions(-)

As expected, most of the work is done in import.c.

 - Decode the module name earlier and encode it later. Try to manipulate PyUnicodeObject objects instead of char* buffers (so the string length is directly available).
 - Split the huge and very complex find_module() function into 3 functions (find_module, find_module_filename and find_module2) and document them.
 - Drop OS/2 support in find_module() (it can be kept, but it was easier for me to drop it, and the OS/2 maintainer wrote that Python 3 is far from being compatible with OS/2).
 - The patch creates some functions: PyModule_GetNameObject(), PyImport_ExecCodeModuleUnicode(), PyImport_AddModuleUnicode(), PyImport_ImportFrozenModuleUnicode(), PyModule_NewUnicode(), ...
 - Use the "U" format to parse a module name, and "%R" to format a module name (to escape surrogate characters and add quotes, instead of "... '%.200s' ...").
 - PyWin_FindRegisteredModule() is now private.
 - Remove the fqname argument from _PyImport_GetDynLoadFunc(): it wasn't used.
 - Replace open_exclusive() by fopen(name, "wb") on Windows: is it correct?

TODO:

 - rename xxxobj => xxx to keep the original names and have a shorter patch (e.g. I renamed name to nameobj during the transition to detect bugs)
 - catch encoding errors in case_ok()
 - don't encode in case_ok() if case_ok() does nothing (e.g. on Linux)
 - find a better name for find_module2()

The patch contains a tiny script, issue3080.py, to test the patch using an ISO-8859-1 locale.

I will open a thread on the mailing list (python-dev) to decide whether this patch is needed or not. If we agree that this issue should be fixed, I will split the patch into smaller parts and start a review process.

----------
keywords: +patch
Added file: http://bugs.python.org/file20448/issue3080-3.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue3080>
_______________________________________
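
For readers who want to see why the message above prefers "%R" over "'%.200s'" and why it decodes the name early and encodes it late, here is a minimal Python-level sketch. It is not taken from the patch or from issue3080.py. The module name is made up, and the ASCII codec with the surrogateescape error handler stands in for whatever filesystem encoding is really in use (an assumption made only to keep the example deterministic). Python's %r calls repr(), which is also what "%R" does in PyUnicode_FromFormat().

```python
# Hypothetical module name as raw bytes, e.g. a file name read from a
# filesystem that uses ISO-8859-1 (0xE9 is "e acute" there).
raw_name = b"caf\xe9"

# Decode early. ASSUMPTION: the ASCII codec stands in for the real
# filesystem encoding. With surrogateescape, the undecodable byte 0xE9
# becomes the lone surrogate U+DCE9 instead of raising an error.
name = raw_name.decode("ascii", "surrogateescape")

# Encode late: the original bytes are recovered exactly.
assert name.encode("ascii", "surrogateescape") == raw_name

# repr() adds quotes and escapes the lone surrogate, so the resulting
# message contains only ASCII characters and can be shown anywhere.
message = "No module named %r" % name
print(message)

# By contrast, the raw name cannot be encoded strictly.
try:
    name.encode("utf-8")
    strict_encoding_failed = False
except UnicodeEncodeError:
    strict_encoding_failed = True
```

The point of the sketch is only that a name kept as a Unicode string survives the round trip, and that the repr-style form stays safe to put in an error message even when the name holds surrogate characters.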