New submission from Joel Puig Rubio <joel.puig.ru...@gmail.com>:
I'm attempting to make a script to create ZIP archives from files which filenames are encoded in Shift-JIS. However, Python seems to limit its filenames to ASCII or UTF-8, which means that attempting to archive said files will raise an exception. This is very inconvenient. joel@bliss:~/test$ python3 -m zipfile -c mojibake.zip . Traceback (most recent call last): File "/usr/lib/python3.8/zipfile.py", line 457, in _encodeFilenameFlags return self.filename.encode('ascii'), self.flag_bits UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-7: ordinal not in range(128) During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main return _run_code(code, main_globals, None, File "/usr/lib/python3.8/runpy.py", line 87, in _run_code exec(code, run_globals) File "/usr/lib/python3.8/zipfile.py", line 2441, in <module> main() File "/usr/lib/python3.8/zipfile.py", line 2437, in main addToZip(zf, path, zippath) File "/usr/lib/python3.8/zipfile.py", line 2426, in addToZip addToZip(zf, File "/usr/lib/python3.8/zipfile.py", line 2421, in addToZip zf.write(path, zippath, ZIP_DEFLATED) File "/usr/lib/python3.8/zipfile.py", line 1775, in write with open(filename, "rb") as src, self.open(zinfo, 'w') as dest: File "/usr/lib/python3.8/zipfile.py", line 1517, in open return self._open_to_write(zinfo, force_zip64=force_zip64) File "/usr/lib/python3.8/zipfile.py", line 1614, in _open_to_write self.fp.write(zinfo.FileHeader(zip64)) File "/usr/lib/python3.8/zipfile.py", line 447, in FileHeader filename, flag_bits = self._encodeFilenameFlags() File "/usr/lib/python3.8/zipfile.py", line 459, in _encodeFilenameFlags return self.filename.encode('utf-8'), self.flag_bits | 0x800 UnicodeEncodeError: 'utf-8' codec can't encode characters in position 0-7: surrogates not allowed The zip command from the Linux Info-ZIP package is able to create the same archive with no issues, which I've attached to this issue. Here you can see how the proper filenames are shown in WinRAR once the right encoding is selected: https://i.imgur.com/TVcI95A.png The same should be seen on any computer using Shift-JIS as their locale. ---------- components: Library (Lib) files: mojibake.zip messages: 399049 nosy: joelpuig priority: normal severity: normal status: open title: zipfile: cannot create zip file from files with non-utf8 filenames versions: Python 3.8 Added file: https://bugs.python.org/file50204/mojibake.zip _______________________________________ Python tracker <rep...@bugs.python.org> <https://bugs.python.org/issue44846> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com