[issue10955] Possible regression with stdlib in zipfile

2011-01-23 Thread Ronald Oussoren
Ronald Oussoren ronaldousso...@mac.com added the comment: Data files can be anything that can be a data-file in a setuptools/distribute setup.py file. Note that #10972 isn't necessary when python32.zip is built using the zipfile module: _encodeFilenameFlags uses either ASCII or UTF-8 to
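
The ASCII-or-UTF-8 behaviour mentioned above can be observed from the public zipfile API. This is a minimal sketch, assuming a current CPython zipfile module; the member names and the helper functions build_archive and utf8_flags are made up for illustration. 0x800 is general-purpose bit 11 of the ZIP entry flags, which marks a UTF-8 encoded filename.

```python
import io
import zipfile

UTF8_FLAG = 0x800  # general-purpose bit 11: filename is stored as UTF-8


def build_archive(names):
    """Write empty members with the given names into an in-memory ZIP."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in names:
            zf.writestr(name, b"")
    return buf.getvalue()


def utf8_flags(data):
    """Map each member name to whether its UTF-8 flag is set."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return {
            info.filename: bool(info.flag_bits & UTF8_FLAG)
            for info in zf.infolist()
        }


# Hypothetical member names: one pure ASCII, one with a non-ASCII character.
flags = utf8_flags(build_archive(["encodings/__init__.py", "data/café.txt"]))
```

With an archive written this way, only the non-ASCII name should carry the UTF-8 flag, so a reader never needs cp437 for it.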

[issue10955] Possible regression with stdlib in zipfile

2011-01-22 Thread Georg Brandl
Georg Brandl ge...@python.org added the comment: Patch #2 looks innocent enough to me, and is clearly an improvement. -- ___ Python tracker rep...@bugs.python.org http://bugs.python.org/issue10955 ___

[issue10955] Possible regression with stdlib in zipfile

2011-01-22 Thread Georg Brandl
Georg Brandl ge...@python.org added the comment: For 3.3, we might want to consider implementing cp437 in C, as a necessary consequence of supporting import from zipfiles. Shouldn't be so hard, I guess.

[issue10955] Possible regression with stdlib in zipfile

2011-01-22 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: georg.brandl wrote: "Patch #2 looks innocent enough to me, and is clearly an improvement." Ok, issue fixed by r88140 (+r88141): Issue #10955: zipimport uses ASCII encoding instead of cp437 to decode filenames, at bootstrap, if

[issue10955] Possible regression with stdlib in zipfile

2011-01-21 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc amaur...@gmail.com added the comment: No, your change is in the read_directory() function, which reads the whole archive the first time it's used.

[issue10955] Possible regression with stdlib in zipfile

2011-01-21 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: "No, your change is in the read_directory() function, which reads the whole archive the first time it's used." Oh, I thought that read_directory() only reads files one by one.

[issue10955] Possible regression with stdlib in zipfile

2011-01-21 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: Ronald Oussoren and Amaury Forgeot d'Arc: do you think that it is an acceptable limitation to only accept ASCII filenames in python32.zip? (not in all ZIP files, just in the file loaded at startup) All possible solutions: a)

[issue10955] Possible regression with stdlib in zipfile

2011-01-21 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc amaur...@gmail.com added the comment: What about tools that build one .zip file for all modules, like py2exe? A cp437 decoder is not so ugly to implement in C. It's just a charmap.
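
To illustrate the "just a charmap" remark: a charmap decoder is a 256-entry lookup table indexed by byte value. The sketch below shows the idea in Python rather than C. The names CP437_TABLE and decode_cp437 are hypothetical, and the table is taken from Python's own cp437 codec for brevity, whereas a C implementation would presumably embed it as a static array.

```python
# 256-entry table: index is the byte value, entry is the decoded character.
# Built here from Python's cp437 codec; this is an assumption made for the
# sketch, not how a builtin C decoder would obtain its table.
CP437_TABLE = bytes(range(256)).decode("cp437")


def decode_cp437(raw: bytes) -> str:
    """Decode bytes by looking each byte up in the charmap table."""
    return "".join(CP437_TABLE[byte] for byte in raw)


# Byte 0x82 is "é" in cp437; ASCII bytes map to themselves.
name = decode_cp437(b"caf\x82.txt")
```

Because every byte value has an entry, such a decoder cannot fail on any input, which is part of why it is cheap to implement.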

[issue10955] Possible regression with stdlib in zipfile

2011-01-21 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: Oh, py2app is implemented in Python and uses the zipfile module. So if we can control how the filename is encoded, we can fix py2app to work around this limitation :-) 7zip and WinRAR use the same algorithm as

[issue10955] Possible regression with stdlib in zipfile

2011-01-21 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: #10972 has a patch for zipfile to set the filename encoding of a ZipInfo object (to force the encoding to UTF-8).

[issue10955] Possible regression with stdlib in zipfile

2011-01-21 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: On Linux, the zip command line program (InfoZIP zip program) only sets the unicode flag if it is able to set the locale to en_US.UTF-8. It can do better: check if the locale encoding is UTF-8, and only set the en_US.UTF-8 locale if the

[issue10955] Possible regression with stdlib in zipfile

2011-01-21 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: Patch version 2: display a more useful error message:
    $ python
    Fatal Python error: Py_Initialize: Unable to get the locale encoding
    NotImplementedError: bootstrap issue: python32.zip contains non-ASCII filenames without the unicode

[issue10955] Possible regression with stdlib in zipfile

2011-01-21 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: Victor's second patch looks good to me. Georg, is this a release blocker? -- nosy: +pitrou stage: unit test needed -> patch review

[issue10955] Possible regression with stdlib in zipfile

2011-01-21 Thread Ronald Oussoren
Ronald Oussoren ronaldousso...@mac.com added the comment: The python32.zip file generated by py2app contains both files from the stdlib and application files. I cannot avoid having non-ascii filenames when a python package contains data files that have such names. The patch in Issue10972

[issue10955] Possible regression with stdlib in zipfile

2011-01-21 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: "The python32.zip file generated by py2app contains both files from the stdlib and application files. I cannot avoid having non-ascii filenames when a python package contains data files that have such names." I don't think this is a problem.

[issue10955] Possible regression with stdlib in zipfile

2011-01-21 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc amaur...@gmail.com added the comment: We are only talking about bootstrap-time importing of encodings modules. Again, the whole zip central directory is loaded on first import. If the zip file contains non-ascii filenames, nothing can be imported.

[issue10955] Possible regression with stdlib in zipfile

2011-01-21 Thread Antoine Pitrou
Antoine Pitrou pit...@free.fr added the comment: "Again, the whole zip central directory is loaded on first import. If the zip file contains non-ascii filenames, nothing can be imported." Does it have to be decoded eagerly, though?

[issue10955] Possible regression with stdlib in zipfile

2011-01-21 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: "I cannot avoid having non-ascii filenames when a python package contains data files that have such names." Are data files Python modules (.py files)? Or can it be anything?

[issue10955] Possible regression with stdlib in zipfile

2011-01-20 Thread Ronald Oussoren
New submission from Ronald Oussoren ronaldousso...@mac.com: I ran into this issue while debugging why py2app doesn't work with python 3.2rc2. The reason seems to be a regression w.r.t. having the stdlib inside a zipfile. Note that I haven't tested this without going through py2app yet.

[issue10955] Possible regression with stdlib in zipfile

2011-01-20 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: It should be a regression introduced by #8611 or #9425. -- nosy: +haypo

[issue10955] Possible regression with stdlib in zipfile

2011-01-20 Thread STINNER Victor
Changes by STINNER Victor victor.stin...@haypocalc.com: -- nosy: +georg.brandl priority: normal -> release blocker

[issue10955] Possible regression with stdlib in zipfile

2011-01-20 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: zipimport decodes filenames of the archive from cp437 or UTF-8 (depending on a flag in each file entry). Python has a builtin UTF-8 codec, but no cp437 builtin codec. You should try to add encodings/cp437.py to your python3.2/
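
The per-entry choice of codec described above can be sketched as follows. This is an illustration in Python, not zipimport's actual code: the function name decode_member_name is hypothetical, and 0x800 is assumed to be the flag in question (general-purpose bit 11 of the ZIP entry).

```python
UTF8_FLAG = 0x800  # general-purpose bit 11: filename is stored as UTF-8


def decode_member_name(raw: bytes, flag_bits: int) -> str:
    """Decode a raw ZIP member name, choosing the codec from the entry flags."""
    # UTF-8 is always available; cp437 needs the encodings package to be
    # importable, which is the bootstrap dependency discussed in this issue.
    encoding = "utf-8" if flag_bits & UTF8_FLAG else "cp437"
    return raw.decode(encoding)


# The same name stored both ways decodes to the same string.
flagged = decode_member_name("café.txt".encode("utf-8"), UTF8_FLAG)
unflagged = decode_member_name(b"caf\x82.txt", 0)
```

Pure ASCII names decode identically under either codec, which is why an archive with only ASCII names does not depend on cp437 being available.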

[issue10955] Possible regression with stdlib in zipfile

2011-01-20 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: Restore priority to normal: this is a workaround, and a better fix cannot be done before 3.2 final. -- priority: release blocker -> normal

[issue10955] Possible regression with stdlib in zipfile

2011-01-20 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: The regression was introduced in r85690: use the correct encoding to decode the filename from the ZIP file. Attached patch fixes the bootstrap issue. -- keywords: +patch Added file:

[issue10955] Possible regression with stdlib in zipfile

2011-01-20 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc amaur...@gmail.com added the comment: About the patch: "Break out of this dependency by assuming that the path to the encodings module is ASCII-only." The 'path' here is the entry inside the zip file (and does not include the location of the zip file itself), so the comment

[issue10955] Possible regression with stdlib in zipfile

2011-01-20 Thread STINNER Victor
STINNER Victor victor.stin...@haypocalc.com added the comment: On Thursday, 20 January 2011 at 18:15, Amaury Forgeot d'Arc wrote: But if the zip file contains the stdlib *and* some other custom modules with cp437 names, the whole operation will fail; it can be the case with py2exe