New submission from Joel Puig Rubio <joel.puig.ru...@gmail.com>:

I'm attempting to make a script to create ZIP archives from files which 
filenames are encoded in Shift-JIS. However, Python seems to limit its 
filenames to ASCII or UTF-8, which means that attempting to archive said files 
will raise an exception. This is very inconvenient.

joel@bliss:~/test$ python3 -m zipfile -c mojibake.zip .
Traceback (most recent call last):
  File "/usr/lib/python3.8/zipfile.py", line 457, in _encodeFilenameFlags
    return self.filename.encode('ascii'), self.flag_bits
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-7: 
ordinal not in range(128)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/usr/lib/python3.8/zipfile.py", line 2441, in <module>
    main()
  File "/usr/lib/python3.8/zipfile.py", line 2437, in main
    addToZip(zf, path, zippath)
  File "/usr/lib/python3.8/zipfile.py", line 2426, in addToZip
    addToZip(zf,
  File "/usr/lib/python3.8/zipfile.py", line 2421, in addToZip
    zf.write(path, zippath, ZIP_DEFLATED)
  File "/usr/lib/python3.8/zipfile.py", line 1775, in write
    with open(filename, "rb") as src, self.open(zinfo, 'w') as dest:
  File "/usr/lib/python3.8/zipfile.py", line 1517, in open
    return self._open_to_write(zinfo, force_zip64=force_zip64)
  File "/usr/lib/python3.8/zipfile.py", line 1614, in _open_to_write
    self.fp.write(zinfo.FileHeader(zip64))
  File "/usr/lib/python3.8/zipfile.py", line 447, in FileHeader
    filename, flag_bits = self._encodeFilenameFlags()
  File "/usr/lib/python3.8/zipfile.py", line 459, in _encodeFilenameFlags
    return self.filename.encode('utf-8'), self.flag_bits | 0x800
UnicodeEncodeError: 'utf-8' codec can't encode characters in position 0-7: 
surrogates not allowed

The zip command from the Linux Info-ZIP package is able to create the same 
archive with no issues, which I've attached to this issue. Here you can see how 
the proper filenames are shown in WinRAR once the right encoding is selected: 
https://i.imgur.com/TVcI95A.png

The same should be seen on any computer using Shift-JIS as their locale.

----------
components: Library (Lib)
files: mojibake.zip
messages: 399049
nosy: joelpuig
priority: normal
severity: normal
status: open
title: zipfile: cannot create zip file from files with non-utf8 filenames
versions: Python 3.8
Added file: https://bugs.python.org/file50204/mojibake.zip

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue44846>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to