Bug Hint (not reported by me): https://bugzilla.gnome.org/show_bug.cgi?id=648673
There are basically two kinds of ZIP archives: those whose file names are stored in an arbitrary, undeclared encoding (not Unicode-enabled), and those whose file names are stored as UTF-8 with the proper metadata set (Unicode-enabled). UnZip 6.0 from Info-ZIP (the latest released version) extracts Unicode-enabled archives correctly. However, its listing feature shows every non-ASCII character in a file name as '?', even for Unicode-enabled archives. This affects File Roller as well, hence the bug mentioned above.

Fortunately, UnZip has a -U option. For Unicode-enabled archives, it escapes each non-ASCII character as #UXXXX or #LYYYYYY. I have already made a working patch for File Roller that makes use of this:
https://gist.github.com/4057999

Unfortunately, #UXXXX and #LYYYYYY are also legitimate file names in ZIP archives, and UnZip's -U option currently does not escape a literal #. I am already trying to contact upstream:
http://www.info-zip.org/phpBB3/viewtopic.php?f=4&t=405

On the File Roller side, we could list the archive twice, once without -U and once with -U. Comparing the two listings then tells us which # is literal and which # starts an escape.

One more annoying detail worth noting: vanilla UnZip shows exactly one ? per Unicode character, while patched UnZip (found in at least Arch and Ubuntu) shows several ? per Unicode character (as many ? as the character has UTF-8 bytes).

What do you think?
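To make the "list twice" idea concrete, here is a rough sketch in Python (not File Roller's C, just for brevity). It is not taken from the patch above. The function names and the `per_byte` switch are made up for illustration, and it assumes the escapes are exactly `#U` plus four hex digits or `#L` plus six hex digits, in either case. An escape is only decoded when the listing without -U shows `?` at the same position; otherwise the `#` is kept as a literal.

```python
import re
from typing import Optional

# Assumed escape shapes: "#U" + 4 hex digits, or "#L" + 6 hex digits.
_ESCAPE = re.compile(r"#U([0-9A-Fa-f]{4})|#L([0-9A-Fa-f]{6})")


def _escaped_char(match) -> Optional[str]:
    """Return the character an escape stands for, or None if it is not valid."""
    code_point = int(match.group(1) or match.group(2), 16)
    if code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        return None  # out of range or a lone surrogate: treat as literal text
    return chr(code_point)


def decode_with_plain_listing(escaped: str, plain: str,
                              per_byte: bool = False) -> Optional[str]:
    """Decode a name from the -U listing, using the plain listing as a guide.

    escaped  -- the name as listed with -U (non-ASCII shown as escapes)
    plain    -- the same name as listed without -U (non-ASCII shown as '?')
    per_byte -- False: one '?' per character (vanilla UnZip behaviour);
                True: one '?' per UTF-8 byte (patched UnZip behaviour)

    Returns None if the two listings cannot be aligned.
    """
    out = []
    i = j = 0  # i indexes `escaped`, j indexes `plain`
    while i < len(escaped):
        match = _ESCAPE.match(escaped, i)
        char = _escaped_char(match) if match else None
        if char is not None:
            width = len(char.encode("utf-8")) if per_byte else 1
            if plain.startswith("?" * width, j):
                # The plain listing has '?' here, so this is a real escape.
                out.append(char)
                i = match.end()
                j += width
                continue
        # Otherwise this is a literal character and must agree in both listings.
        if j >= len(plain) or plain[j] != escaped[i]:
            return None
        out.append(escaped[i])
        i += 1
        j += 1
    return "".join(out) if j == len(plain) else None


if __name__ == "__main__":
    # "#U00e9" is an escape here, while "#U1234" is literally part of the name.
    print(decode_with_plain_listing("caf#U00e9 #U1234.txt", "caf? #U1234.txt"))
    # prints: café #U1234.txt
```

A caller that does not know which UnZip build is installed could try `per_byte=False` first and fall back to `per_byte=True` when the result is None. A literal `?` in a file name can still make the alignment ambiguous, so this is only a heuristic until upstream escapes a literal #.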
