Hi Rich!

>> I know this problem very well. Every few months I get a
>> ZIP-packaged file from a Windows system. As the maintainer
>> is a bit stupid, he can't manage to avoid foreign characters,
>> and I end up with unusual file names after unzip.
>
>This sounds like a bug in the unzip utility. If it encounters
>byte sequences which are not UTF-8, it should convert them from
>whatever legacy encoding they're in to UTF-8, possibly issuing
>an error that the user needs to specify this encoding if it
>can't be determined.

Then you need to consider every program buggy that doesn't
mangle file names. So many programs just pass file names
through and let the kernel decide what to do. And I do not mean
BB unzip here; I normally use the upstream unzip.
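
To make "copy through" concrete, here is a minimal Python sketch
(my own illustration; the helper name pass_through is made up and
not taken from any unzip source). It treats the name as opaque
bytes and gives back exactly what it received, whatever the
encoding was:

```python
# Sketch only: pass a file name through untouched, whatever its encoding.
def pass_through(name: bytes) -> bytes:
    # surrogateescape maps each undecodable byte to a lone surrogate
    # code point, so encoding again restores the original bytes exactly.
    text = name.decode("utf-8", "surrogateescape")
    return text.encode("utf-8", "surrogateescape")


legacy = b"f\xe4.txt"            # 0xE4 on its own is not valid UTF-8
modern = b"f\xc3\xa4.txt"        # the same letter encoded as UTF-8

# Both names come out byte for byte as they went in.
legacy_out = pass_through(legacy)
modern_out = pass_through(modern)
```

This is the same error handler Python uses for file names on POSIX
systems (PEP 383), so a program written this way never has to
guess a charset at all.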

... and how can you assume all names are UTF-8? Nowadays
maybe, but what about 8-bit locales with different charsets?
UTF-8 mangling would be wrong on those.
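
A small Python sketch of why guessing is hard (only an
illustration; is_valid_utf8 is a hypothetical helper, and the
three charsets are just examples I picked). The same bytes give
a different name under every legacy charset, and a UTF-8 name is
also a perfectly legal name in an 8-bit locale:

```python
# Hypothetical illustration; nothing here is taken from unzip itself.
def is_valid_utf8(name: bytes) -> bool:
    """Return True if the raw file name decodes cleanly as UTF-8."""
    try:
        name.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


raw = b"f\xe4.txt"               # "f", a-umlaut, ".txt" in ISO 8859-1

# Not valid UTF-8, so a converter would have to guess the charset ...
raw_is_utf8 = is_valid_utf8(raw)

# ... and the guess changes the result: same bytes, different names.
as_latin1 = raw.decode("latin-1")
as_koi8r = raw.decode("koi8_r")
as_cp437 = raw.decode("cp437")

# The reverse is ambiguous too: a name stored as UTF-8 is also a valid
# (if odd looking) ISO 8859-1 name, so an 8-bit locale cannot tell.
utf8_name = "f\u00e4.txt".encode("utf-8")
as_seen_in_latin1 = utf8_name.decode("latin-1")
```

So a check for "is this valid UTF-8" can only say that a name is
not UTF-8. It cannot say which charset the name is really in.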

... and unzip is not the only thing that produces such results.
Think of using a USB stick on a Windows machine, then carrying
it over to a Linux machine. Depending on how the file system is
mounted, you may get unusual file names when copying names with
foreign characters. Now who is at fault?

It would be nice to have them all fixed, and fixed the same way
wherever some mapping is done, but can that ever reach all
programs? This is such a long-standing problem that nobody
really cares.

>
>Rich


--
Harald
_______________________________________________
busybox mailing list
[email protected]
http://lists.busybox.net/mailman/listinfo/busybox
