This is from one of my machines running Lubuntu:

$ export |grep LANG
declare -x LANG="en_US.UTF-8"
$ export |grep LC
declare -x LC_ADDRESS="en_US.UTF-8"
declare -x LC_IDENTIFICATION="en_US.UTF-8"
declare -x LC_MEASUREMENT="en_US.UTF-8"
declare -x LC_MONETARY="en_US.UTF-8"
declare -x LC_NAME="en_US.UTF-8"
declare -x LC_NUMERIC="en_US.UTF-8"
declare -x LC_PAPER="en_US.UTF-8"
declare -x LC_TELEPHONE="en_US.UTF-8"
declare -x LC_TIME="en_US.UTF-8"
$ unzip -h
UnZip 6.00 of 20 April 2009, by Debian. Original by Info-ZIP.
...

Use the file from here:
http://www1.axfc.net/uploader/Sc/so/325701.zip (passwd: backer) (CP932)

$ unzip celluloid.zip
Archive:  celluloid.zip
  inflating: celluloid/readme.txt
  inflating: celluloid/В╣ВщВчВдВ╟.ust
  inflating: celluloid/В╣ВщВчВдВ╟2Ф╘.ust
  inflating: celluloid/В╣ВщВчВдВ╟СхГTГrСOВйВч.ust
$ unzip -O cp932 celluloid.zip
Archive:  celluloid.zip
  inflating: celluloid/readme.txt
  inflating: celluloid/せるらうど.ust
  inflating: celluloid/せるらうど2番.ust
  inflating: celluloid/せるらうど大サビ前から.ust
$ unzip -O cp936 celluloid.zip
Archive:  celluloid.zip
  inflating: celluloid/readme.txt
  inflating: celluloid/偣傞傜偆偳.ust
  inflating: celluloid/偣傞傜偆偳2斣.ust
  inflating: celluloid/偣傞傜偆偳戝僒價慜偐傜.ust
$ unzip -O cp950 celluloid.zip
Archive:  celluloid.zip
  inflating: celluloid/readme.txt
  inflating: celluloid/�����炤��.ust
  inflating: celluloid/�����炤��2��.ust
  inflating: celluloid/�����炤�Ǒ��T�r�O����.ust

Another file, from here:
http://3jf.wodemo.com/file/310894 (CP936)

$ unzip -L 王妃.zip
Archive:  王妃.zip
  inflating: ═їх·_a.ust
  inflating: ═їх·_b.ust
$ unzip -O cp932 王妃.zip
Archive:  王妃.zip
  inflating: ヘ銈A.ust
  inflating: ヘ銈B.ust
$ unzip -O cp936 王妃.zip
Archive:  王妃.zip
  inflating: 王妃_A.ust
  inflating: 王妃_B.ust
$ unzip -O cp950 王妃.zip
Archive:  王妃.zip
  inflating: 卼漦_A.ust
  inflating: 卼漦_B.ust

Actually, not all of the wrong cases map to illegal UTF-8 strings
(question marks). I guess that is why auto-detection is not so
straightforward?
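To illustrate the point about auto-detection, here is a rough Python sketch (not anything unzip itself does). It takes the raw file name bytes and checks which candidate code pages decode them without an error. The helper name plausible_charsets and the candidate list are made up for this example; "gbk" is Python's codec name for CP936. That the plain unzip run decodes as CP866 is only my reading of its output above, not something I checked in the source.

```python
# Hypothetical helper for illustration only; not part of unzip.
CANDIDATES = ["cp932", "gbk", "cp950", "cp866"]  # "gbk" is Python's CP936 codec


def plausible_charsets(raw, candidates=CANDIDATES):
    """Return the candidate charsets under which `raw` decodes without error."""
    ok = []
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue
        ok.append(name)
    return ok


# File name bytes as they would be stored in the two archives.
raw_jp = "せるらうど".encode("cp932")
raw_zh = "王妃".encode("gbk")

# Assumption: the plain `unzip` run treats the bytes as CP866.
mojibake_jp = raw_jp.decode("cp866")
mojibake_zh = raw_zh.decode("cp866")

print(mojibake_jp, plausible_charsets(raw_jp))
print(mojibake_zh, plausible_charsets(raw_zh))
```

More than one candidate comes back for each name, so "decodes cleanly" alone cannot pick the right charset; a real detector would need some extra scoring, for example on how likely the resulting characters are.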
https://bugs.launchpad.net/bugs/1422290

Title:
  Default charsets handling for Windows archives in CJKV+th locale
