Barry Scott writes:

 > Also beware that zip file format does not include the encoding of
 > the files that are in the zip file.

The most recent zip file format, which is now a decade or so old, does
specify the encoding of file names, via a flag with values
0 = ASCII, 1 = UTF-8.[1]

 > This means that for practical purposes only ASCII filenames are
 > portable across systems. Is this limitation a problem for this
 > proposal?

As far as I know, with the exception of a few Japanese bureaucrats,
everybody uses zip implementations that handle non-ASCII properly.
Info-ZIP is one such implementation that is portable, although I don't
recall how it handles filesystems with non-Unicode file name encodings.

From the point of view of this proposal, just require that filename
encodings be properly specified, and provide an option to use the
appropriate codec.  This isn't too hard.  The main thing to rule out
is multiple encodings in one file system (yes, I've seen it, but not
recently, thank the powers).  This could even be handled (on POSIX
filesystems) with an auxiliary utility that converts whatever-encoded
filenames to UTF-8 (could be a symlink tree).  Then you can just
require a UTF-8 filesystem throughout the zipapp handling system.

The only remaining question in my mind would be backward compatibility
with any existing zipapp specs (which I have no idea about, but if I
were participating in implementation I'd be sure to check).

Footnotes:
[1]  Or maybe it's 0 = ISO-8859-1, 1 = UTF-8.  Sorry, don't have a
copy of the spec handy.

_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/I7AE3GYD7T57NEMVGFWIEWC2DQZ6MMPN/
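
For what it's worth, here is a rough sketch of the "option to use the
appropriate codec" idea using Python's zipfile module.  It relies on
two things zipfile does today: the UTF-8 flag is exposed as bit 0x800
of ZipInfo.flag_bits, and names without that flag are decoded as
cp437 by default.  The helper name member_names and its
fallback_codec parameter are made up for illustration; they are not
part of any existing zipapp spec.

```python
import zipfile

UTF8_FLAG = 0x800  # general-purpose bit 11: name and comment are UTF-8


def member_names(zf, fallback_codec):
    """Return member names, re-decoding unflagged names with fallback_codec.

    Assumes zf was opened without metadata_encoding, so zipfile used its
    default cp437 decoding for names that lack the UTF-8 flag.
    fallback_codec is a hypothetical caller-supplied option,
    e.g. "shift_jis".
    """
    names = []
    for info in zf.infolist():
        if info.flag_bits & UTF8_FLAG:
            # The archive declares UTF-8; zipfile already decoded it.
            names.append(info.filename)
        else:
            # Undo zipfile's cp437 decoding to recover the raw bytes,
            # then apply the codec the caller says was really used.
            raw = info.filename.encode("cp437")
            names.append(raw.decode(fallback_codec))
    return names
```

On Python 3.11 and later, zipfile.ZipFile also accepts a
metadata_encoding argument when reading, which covers the same case
without the round trip through cp437.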