Package: linux-2.6
Severity: important

UTF-8 has been the default charset since Debian Etch, but the default
I/O charset of the kernel's vfat filesystem is ISO-8859-1. Therefore,
when a vfat or ntfs filesystem is mounted, the kernel assumes that the
system's charset is ISO-8859-1 unless told otherwise. Wouldn't UTF-8 be
a better default for the kernel's filesystem modules now? (Ntfs and vfat
always store filename strings in UTF-16.)

The current situation is problematic. Here are two examples. Mount a
vfat partition with the command:

  mount -t vfat /dev/hdb1 /mnt/vfat

Because of the kernel's CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1", the
system converts vfat's UTF-16 encoded filenames to ISO-8859-1. On a
UTF-8 system only characters U+0000..U+007F appear correctly, because
only those are encoded identically in ISO-8859-1 and UTF-8. Everything
else, U+0080..U+(10)FFFF, shows up as garbage.
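
To make the read path concrete, here is a small Python sketch. It is not
kernel code: the function name name_as_listed is made up for this
illustration, and replacing unconvertible characters with "?" is an
assumption. It only mimics the chain described above (UTF-16 on disk,
conversion to the iocharset, then a UTF-8 userland decoding the bytes).

```python
def name_as_listed(stored_name: str, iocharset: str, locale_charset: str) -> str:
    """Simulate how a filename stored on vfat shows up in a directory listing.

    Hypothetical helper for illustration only; it does not call the kernel.
    """
    # vfat keeps long filenames as UTF-16 on disk.
    on_disk = stored_name.encode("utf-16-le")
    # The filesystem module converts them to the configured iocharset.
    # Assumption: characters that do not fit are replaced with "?".
    kernel_bytes = on_disk.decode("utf-16-le").encode(iocharset, errors="replace")
    # Userland interprets those bytes according to the locale's charset.
    return kernel_bytes.decode(locale_charset, errors="replace")


if __name__ == "__main__":
    # Default iocharset on a UTF-8 system: non-ASCII characters are damaged.
    print(name_as_listed("päivä.txt", "iso8859-1", "utf-8"))
    # Matching iocharset: the name survives intact.
    print(name_as_listed("päivä.txt", "utf-8", "utf-8"))
```

Plain ASCII names pass through unchanged in both cases, which is why the
problem is easy to miss until a filename contains a non-ASCII character.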

There is another problem when creating new filenames on a UTF-8 system.
If a user creates a file with UTF-8 multibyte characters
(U+0080..U+10FFFF [actually U+FFFF is the maximum in vfat]) in its
filename, the kernel's filesystem module thinks that the system uses
ISO-8859-1 and converts every single byte of the UTF-8 encoded string,
as if it were a separate character, to the destination encoding UTF-16.
The result is that such filenames are pretty much garbage on all systems
that are configured correctly (Microsoft Windows, for example).
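
The write path can be sketched the same way. Again this is only an
illustration in Python, not kernel code, and name_as_stored is a
hypothetical name: userland hands over UTF-8 bytes, the filesystem
module reads each byte as one ISO-8859-1 character, and the result is
written to disk as UTF-16.

```python
def name_as_stored(typed_name: str, locale_charset: str, iocharset: str) -> str:
    """Simulate which filename ends up on a vfat disk.

    Hypothetical helper for illustration only; it does not call the kernel.
    """
    # Userland passes the filename as bytes in the locale's charset.
    userland_bytes = typed_name.encode(locale_charset)
    # The filesystem module interprets those bytes using its iocharset.
    kernel_view = userland_bytes.decode(iocharset)
    # The interpreted characters are stored as UTF-16 ...
    on_disk = kernel_view.encode("utf-16-le")
    # ... and that is what a correctly configured system reads back.
    return on_disk.decode("utf-16-le")


if __name__ == "__main__":
    # "ä" is two bytes in UTF-8, so it becomes two characters on disk.
    print(name_as_stored("päivä.txt", "utf-8", "iso8859-1"))
    # With a matching iocharset the name is stored as typed.
    print(name_as_stored("päivä.txt", "utf-8", "utf-8"))
```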

If the user knows that the kernel's default fat filesystem charset
differs from her system's default, she can use the mount option "utf8"
(for vfat) or "nls=utf8" (for ntfs) to make filename conversion work
correctly. The KDE desktop environment is clever: it mounts vfat USB
memory sticks with the option "utf8" if the current locale uses UTF-8
encoding. A user editing the /etc/fstab file may not be aware of this,
and when facing problems many users conclude that "UTF-8 is not ready
yet" and switch their whole system to an ISO-8859-1 locale, which makes
at least characters U+0000..U+00FF on a vfat filesystem work correctly.
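
The kind of check a desktop environment could perform can be sketched in
a few lines of Python. This is not KDE's actual implementation; the
function names are hypothetical, and the sketch only builds the command
line without running it. The codeset string is assumed to come from the
locale, for example the charmap=UTF-8 shown in the system information
below.

```python
import codecs


def locale_is_utf8(codeset: str) -> bool:
    """Return True if the codeset name refers to UTF-8 (e.g. "UTF-8", "utf8")."""
    try:
        return codecs.lookup(codeset).name == "utf-8"
    except LookupError:
        # Unknown codeset names are treated as "not UTF-8".
        return False


def vfat_mount_command(device: str, mountpoint: str, codeset: str) -> list:
    """Build a mount command, adding "-o utf8" only for UTF-8 locales."""
    command = ["mount", "-t", "vfat"]
    if locale_is_utf8(codeset):
        command += ["-o", "utf8"]
    return command + [device, mountpoint]


if __name__ == "__main__":
    print(" ".join(vfat_mount_command("/dev/hdb1", "/mnt/vfat", "UTF-8")))
```

For a non-UTF-8 locale the sketch leaves the options alone, so the
kernel's compiled-in default still applies.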

So, I suggest that you change the kernel options which define the
default charset of filesystem modules.

Currently /boot/config-2.6.18-4-k7 says:

CONFIG_FAT_DEFAULT_CODEPAGE=437
CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1"

-- System Information:
Debian Release: 4.0
  APT prefers testing
  APT policy: (900, 'testing')
Architecture: i386 (i686)
Shell:  /bin/sh linked to /bin/dash
Kernel: Linux 2.6.18-4-k7
Locale: LANG=fi_FI.UTF-8, LC_CTYPE=fi_FI.UTF-8 (charmap=UTF-8)

