Hi Rich!

>> I know this problem very well. It happens about every few
>> month, that I get a ZIP packaged file from a Windows system.
>> As the maintainer is a bit stupid, he can't manage to avoid
>> foreign characters and I end up with unusual file names after
>> unzip.
>
>This sounds like a bug in the unzip utility. If it encounters
>byte sequences which are not UTF-8, it should convert them from
>whatever legacy encoding they're in to UTF-8, possibly issuing
>an error that the user needs to specify this encoding if it
>can't be determined.

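Just so we are talking about the same thing, the behaviour proposed
above could be sketched roughly like this. It is only a minimal
sketch in Python, not what unzip or BusyBox actually does; the
function name normalize_name and its legacy_encoding parameter are
made up for illustration.

```python
from typing import Optional


def normalize_name(raw: bytes, legacy_encoding: Optional[str] = None) -> str:
    """Return a file name as text.

    Valid UTF-8 is accepted as-is. Anything else is decoded with the
    caller-supplied legacy encoding (hypothetical parameter). If no
    encoding was given, raise an error asking the user to specify one.
    """
    try:
        # bytes.decode() is strict by default, so invalid UTF-8 raises.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        if legacy_encoding is None:
            raise ValueError(
                f"{raw!r} is not valid UTF-8; please specify the legacy encoding"
            ) from None
        return raw.decode(legacy_encoding)


if __name__ == "__main__":
    # The same name as stored under two different 8-bit charsets.
    print(normalize_name(b"M\x81ller.txt", "cp437"))
    print(normalize_name(b"M\xfcller.txt", "latin-1"))
```

Note that the result depends entirely on the encoding being guessed
or supplied correctly: the byte strings differ per charset, and
decoding with the wrong one silently yields a different name.
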
Then you would have to consider every program buggy that doesn't
mangle file names. There are so many programs which just pass
file names through and let the kernel decide what to do. And I do
not mean BB unzip here, normally I'm using the upstream unzip.

... and how can you assume all names are UTF-8? Nowadays maybe,
but what about 8-bit locales with different charsets? UTF-8
mangling would be wrong on those.

... and unzip is not the only thing that produces such results.
Think of using a USB stick on a Windows machine and then carrying
it over to a Linux machine. Depending on how the file system is
mounted, you may get unusual file names when copying names with
foreign characters. Now who is to blame?

It would be nice to have them all fixed, and fixed the same way if
some mapping is done, but can that ever reach all programs? This
is such a long-standing problem that nobody really cares.

>
>Rich

--
Harald
_______________________________________________
busybox mailing list
[email protected]
http://lists.busybox.net/mailman/listinfo/busybox
