On Thu, Feb 21, 2002 at 11:08:24AM +0100, Radovan Garabik wrote:
> > One thing that's bound to be lost in the transition to UTF-8 filenames:
> > the ability to reference any file on the filesystem with a pure CLI.
> > If I see a file with a pi symbol in it, I simply can't type that; I have
> > to copy and paste it or wildcard it. If I have a filename with all
> > Kanji, I can only use wildcards.
(Er, I meant copy and paste for the last one; wildcards aren't useful for
selecting a filename when you can't enter *any* of the characters, unless
the length is unique.)

> sorry, but that is just plain impossible. For one thing, the "c" can
> quite well be U+0441, CYRILLIC SMALL LETTER ES, ditto for other
> letters. But I agree that normalization can save us a lot of headache.

Normalization would catch the cases where it's impossible to tell from
context what a character is likely to be.

> Input method should produce normalized characters. Since most
> filenames are somehow produced via human operation, it would
> catch most of pathological cases.

Not just at the input method. I'm in Windows; my input method produces
wide characters, which my terminal emulator catches and converts to UTF-8,
so my terminal would need to follow the same normalization as input
methods in X. Terminal compose keys and real keybindings (actual
non-English keyboards) are other things an IM isn't involved in; terminals
and GUI apps (or at least widget sets) would need to handle it directly.

-- 
Glenn Maynard

--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/
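P.S. For concreteness, here's a quick sketch of what normalization does
and doesn't buy you (Python, with made-up filenames; just an
illustration, not a proposal for where the normalization should live):

```python
import unicodedata

# Two byte-for-byte different UTF-8 filenames that render identically:
# U+00E9 (precomposed e-acute) vs. U+0065 U+0301 (e + combining acute).
composed = "caf\u00e9.txt"
decomposed = "cafe\u0301.txt"

# To the filesystem these are two distinct names.
assert composed != decomposed

# Normalizing both to NFC makes them compare equal -- this is the class
# of headache a normalizing input method (or terminal) would remove.
assert unicodedata.normalize("NFC", composed) == \
       unicodedata.normalize("NFC", decomposed)

# But normalization does NOT touch confusables: Cyrillic es (U+0441)
# stays distinct from Latin "c" under every normalization form.
assert unicodedata.normalize("NFC", "\u0441") != "c"
```

So normalization settles the composed-vs.-decomposed ambiguity, but the
Cyrillic-es-vs.-Latin-c problem remains a lookalike issue, not a
normalization issue.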
