Jürg Billeter wrote:
On Mon, 2005-10-17 at 10:42 +0600, Alexander E. Patrakov wrote:
ROX insists that filenames are UTF-8 encoded. This means either a UTF-8
locale or a deviation from POSIX (POSIX implies that filenames are
stored on disk in the locale encoding when it describes the "tar"
program). Such "UTF-8 locale is required" statements are bugs, always.
Maybe it's a bug, but the other way round is no better. If you
use filenames with non-UTF-8 and non-ASCII characters, these filenames
only get displayed correctly in locales with the same charset. Let's
assume those files are located on a network share that people with
different locales try to access: horribly broken...
Correct, this is indeed a problem with non-UTF-8 locales. But if you try
to list a directory with UTF-8 filenames with "ls" in a non-UTF-8 locale,
the output is horribly broken...
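
To make the "ls" case concrete, here is a small Python sketch. The filename and the choice of KOI8-R as the non-UTF-8 locale are assumptions picked for illustration; it only simulates what a terminal would do when it interprets UTF-8 bytes in another charset.

```python
# Illustrative only: the filename and the KOI8-R locale are assumptions.
name = "Привет.txt"

# Bytes as an application that insists on UTF-8 filenames would store them.
on_disk = name.encode("utf-8")

# What a terminal in a KOI8-R locale would render for those same bytes.
shown = on_disk.decode("koi8_r", errors="replace")

print(shown)           # garbage, not the original name
print(shown == name)   # False
print(len(name), len(shown))  # each Cyrillic letter became two characters
```

The reverse case (KOI8-R bytes viewed in a UTF-8 locale) breaks in the same way, usually as invalid-sequence placeholders.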
IMO the only real problem with UTF-8 is that not all applications
support it (correctly).
You are almost right. The second problem is that not all applications in
UTF-8 locales can correctly deal with legacy non-UTF-8 documents, e.g.
those of Windows origin (like ID3v1 tags created by WinAMP).
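
A rough sketch of what "dealing with legacy documents" could look like, in Python. The helper name decode_legacy and the cp1251 fallback are assumptions for illustration (cp1251 being a common Windows code page for Russian text); this is a heuristic, not how any particular player handles ID3v1 tags.

```python
def decode_legacy(raw: bytes, fallback: str = "cp1251") -> str:
    """Try UTF-8 first; if the bytes are not valid UTF-8, assume a legacy code page.

    Hypothetical helper: the fallback charset is a guess that the caller must supply.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(fallback)


# A title as a Windows program might have written it into an ID3v1 tag.
legacy_title = "Кино".encode("cp1251")
print(decode_legacy(legacy_title))

# Text that is already UTF-8 passes through unchanged.
print(decode_legacy("Кино".encode("utf-8")))
```

The heuristic is not airtight: some legacy byte sequences happen to be valid UTF-8, and the right fallback depends on where the file came from.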
I switched to a UTF-8 locale at least two
years ago and luckily no application I use is horribly broken AFAIR.
That's because you speak German and use only a few non-ASCII characters.
For Russians, every character in their texts is non-ASCII, thus the bugs
are more visible and a Russian speaker can say that a particular bug is
a showstopper while a German speaker says it is only a minor annoyance.
BTW have you patched your copy of the "mkisofs" program to accept UTF-8
as the input charset for filenames or ignored the problem? :)
As for the "sharing filenames" problem, the following viewpoint is
popular here:
If you know both languages and are going to collaborate with your
foreign colleagues by means of sharing files via NFS, then UTF-8 is the
only way to go.
But this situation is not very common. Let's suppose you don't know
Chinese and see a Chinese-named file. For you, its name is just garbage.
There is no reason to prefer meaningless hieroglyphs that you can't even
type to the (equally meaningless) broken filename. In such a situation,
UTF-8 is not a real benefit.
Also it (theoretically) brings its own problems like normalization forms
(i.e. why should precomposed á and "a" followed by a combining acute accent
be different?) and visually similar but really different characters (like
Latin o and Cyrillic о) that can be abused.
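
Both problems are easy to demonstrate with Python's standard unicodedata module. This is only a sketch of the two effects; the variable names are mine.

```python
import unicodedata

# Two spellings of the same visible character.
precomposed = "\u00e1"   # LATIN SMALL LETTER A WITH ACUTE
decomposed = "a\u0301"   # "a" followed by COMBINING ACUTE ACCENT

print(precomposed == decomposed)                                # False
print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True
print(unicodedata.normalize("NFD", precomposed) == decomposed)  # True

# Two different characters that look alike.
latin_o = "o"
cyrillic_o = "\u043e"

print(latin_o == cyrillic_o)         # False
print(unicodedata.name(latin_o))     # LATIN SMALL LETTER O
print(unicodedata.name(cyrillic_o))  # CYRILLIC SMALL LETTER O
```

Normalization fixes the first case if everybody agrees on one form, but it does nothing for the second: the two "o" characters remain distinct under every normalization form.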
It's odd that ROX Filer uses glib-2.0 which supports non-UTF-8 filename
encodings (with G_FILENAME_ENCODING) but ROX seems to work around that
environment variable...
Indeed, ROX ignores this variable and thus conforms to neither POSIX nor
the Glib-2.0 semi-standard. If G_FILENAME_ENCODING were respected, I would
not call this a bug.
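
For reference, this is my understanding of how Glib picks the filename charset on Unix, written as a Python sketch rather than the actual C code. The function filename_charset is hypothetical, and the details (comma-separated list, the "@locale" token, the older G_BROKEN_FILENAMES variable, UTF-8 as the default) are as I recall them from the Glib documentation, so please check them against your Glib version.

```python
def filename_charset(environ: dict, locale_charset: str) -> str:
    """Emulate (approximately) how Glib chooses the filename encoding.

    Hypothetical helper; 'environ' stands in for the process environment
    and 'locale_charset' for the charset of the current locale.
    """
    value = environ.get("G_FILENAME_ENCODING")
    if value:
        # The first entry of the comma-separated list is the filename encoding.
        first = value.split(",")[0].strip()
        return locale_charset if first == "@locale" else first
    if "G_BROKEN_FILENAMES" in environ:
        # Older switch: filenames are in the locale encoding.
        return locale_charset
    return "UTF-8"


print(filename_charset({}, "KOI8-R"))
print(filename_charset({"G_FILENAME_ENCODING": "@locale"}, "KOI8-R"))
print(filename_charset({"G_FILENAME_ENCODING": "ISO-8859-1,UTF-8"}, "KOI8-R"))
```

An application that follows this logic works for both camps: UTF-8 by default, the locale charset when the user asks for it.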
--
Alexander E. Patrakov
--
http://linuxfromscratch.org/mailman/listinfo/blfs-dev
FAQ: http://www.linuxfromscratch.org/blfs/faq.html
Unsubscribe: See the above information page