> I don't think always encoding them to utf-8 (and using bit 11 of
> flag_bits) is a good idea, since there's a chance to create archives
> that won't be correctly readable by programs not supporting this bit
> (it's no secret that currently some programs just assume that
> filenames are encoded using one of system encodings).

I think it is also fairly uniformly agreed that these programs are
incorrect; the official encoding of file names in a zip file is
Windows/DOS code page 437.

> This is too
> complex and hazy to implement. Even if I know what is the situation on
> Windows (i.e. using OEM, also called DOS encoding, but I'm not sure
> how to determine its codec name from within python apart from calling
> GetConsoleCP), I'm totally unaware of the situation on other operating
> systems.

I don't think that the situation on Windows is that the OEM code page
should be used. Instead, CP 437 should be used, independent of the OEM
code page.

>> The tricky question is what to do when reading in zipfiles with
>> non-ASCII characters (and yes, I understand that in your case
>> there were only ASCII characters in the file names).
>
> I don't think it should be changed.

In Python 3, it will certainly change, since the string type will be
unicode-based. It probably should not change for the rest of 2.x.

> Current zipfile seems to officially support ascii filenames only
> anyway

That's not true. You can use any byte string as the file name that you
want, including non-ASCII strings encoded in CP437.

> + filename = str(self.filename)

That would be incorrect, as it relies on the system encoding, which
shouldn't be relied upon. Plus, it would allow arbitrary non-string
things as filenames.

What it should do instead (IMO) is to encode in CP437. Bonus points if
it falls back to the UTF-8 feature of zip files if encoding as CP437
fails.

Regards,
Martin
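
For illustration only, here is a minimal sketch of the rule suggested
above: encode the name as CP437, and fall back to UTF-8 with bit 11 of
flag_bits set if that fails. It assumes Python 3 text strings. The
helper name encode_zip_filename and the constant UTF8_FLAG are
hypothetical; they are not part of zipfile and not taken from the patch
under discussion.

```python
# Bit 11 of the general-purpose flag bits marks the name as UTF-8.
UTF8_FLAG = 0x800


def encode_zip_filename(filename, flag_bits=0):
    """Hypothetical helper: return (encoded_name, flag_bits) for a zip entry.

    Tries CP437 first; only if the name cannot be represented there does it
    fall back to UTF-8 and set bit 11 in the returned flags.
    """
    # Reject non-string objects instead of silently calling str() on them.
    if not isinstance(filename, str):
        raise TypeError("filename must be a text string")
    try:
        # Representable in CP437: make sure the UTF-8 bit is cleared.
        return filename.encode("cp437"), flag_bits & ~UTF8_FLAG
    except UnicodeEncodeError:
        # Not representable in CP437: use the UTF-8 feature of zip files.
        return filename.encode("utf-8"), flag_bits | UTF8_FLAG


if __name__ == "__main__":
    for name in ("readme.txt", "caf\u00e9.txt", "\u65e5\u672c.txt"):
        encoded, flags = encode_zip_filename(name)
        print(repr(encoded), hex(flags))
```

Names that fit in CP437 stay readable by programs that ignore bit 11;
only names outside CP437 depend on the reader supporting the UTF-8
flag.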