Bug#417324: linux-2.6: UTF-8 is the system-wide default encoding in Debian but kernel's filesystem modules use ISO-8859-1

2008-01-30 Thread Changwoo Ryu
Is there any chance to fix this in lenny?

The current default Debian GNOME desktop does NOT handle this correctly,
in every non-Latin environment, I think.

In my (ko_KR.UTF-8) environment, non-ASCII filenames on removable media
are not displayed correctly. And if I create a file with a non-ASCII name
on such media, the file can't be read in MS-Windows.


-- 
Changwoo Ryu [EMAIL PROTECTED]





2007-04-02 Thread Teemu Likonen
Package: linux-2.6
Severity: important

UTF-8 has been the default charset since Debian Etch. The default charset
of the kernel's vfat filesystem is ISO-8859-1. Therefore, when a vfat or
ntfs filesystem is mounted, the kernel assumes that the system's charset
is ISO-8859-1 unless told otherwise. Wouldn't UTF-8 be a better default
for the kernel's filesystem modules now? (Ntfs and vfat always store
filename strings in UTF-16.)

The current situation is problematic. Here are two examples. Mount a vfat
partition with the command:

  mount -t vfat /dev/hdb1 /mnt/vfat

Because of the kernel's CONFIG_FAT_DEFAULT_IOCHARSET=iso8859-1, the system
converts vfat's UTF-16 encoded filenames to ISO-8859-1. On a UTF-8 system
only characters U+0000..U+007F actually appear correctly, because only
those are encoded identically in ISO-8859-1 and UTF-8. All the rest,
U+0080..U+(10)FFFF, is garbage.
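
The read-side effect described above can be imitated in user space. The
following is only a rough sketch, not the kernel's code: the function
names are made up for illustration, and the model covers only names that
ISO-8859-1 can represent (what the kernel does with other characters is
not modelled here).

```python
def kernel_output_bytes(name: str) -> bytes:
    """Hypothetical model of iocharset=iso8859-1 on the read path."""
    on_disk = name.encode("utf-16-le")        # how vfat stores the long name
    decoded = on_disk.decode("utf-16-le")     # kernel reads the UTF-16 name
    return decoded.encode("iso-8859-1")       # ...and hands out Latin-1 bytes


def shown_in_utf8_locale(raw: bytes) -> str:
    """What a UTF-8 userland makes of those bytes (invalid bytes -> U+FFFD)."""
    return raw.decode("utf-8", errors="replace")


for name in ["readme.txt", "päivä.txt"]:
    raw = kernel_output_bytes(name)
    print(repr(name), "->", raw, "->", repr(shown_in_utf8_locale(raw)))
```

The pure-ASCII name survives unchanged, while each "ä" arrives as the
single byte 0xE4, which is not valid UTF-8 on its own and is shown as a
replacement character.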

There is another problem when creating new filenames on a UTF-8 system.
If the user creates a file which has UTF-8 multibyte characters
(U+0080..U+10FFFF [actually U+FFFF is max in vfat]) in its filename,
the kernel's filesystem module thinks that the system is ISO-8859-1 and
converts every single byte of the UTF-8 encoded string, as a separate
character, to the destination encoding UTF-16. The result is that
filenames are pretty much garbage on all systems that are configured
correctly (Microsoft Windows, for example).
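
The write-side problem can be reproduced the same way. Again this is just
an illustrative sketch with a made-up function name, assuming the module
treats each incoming byte as one ISO-8859-1 character, as described above.

```python
def name_stored_on_disk(name: str) -> str:
    """Hypothetical model of iocharset=iso8859-1 on the write path."""
    utf8_bytes = name.encode("utf-8")         # what a UTF-8 userland passes in
    # Each byte is taken as one ISO-8859-1 character; that string is what
    # ends up stored as UTF-16 and what a correctly configured system shows.
    return utf8_bytes.decode("iso-8859-1")


stored = name_stored_on_disk("päivä.txt")
print(stored)                                 # pÃ¤ivÃ¤.txt

# The same misconfigured system reads its own names back intact, which is
# why the damage only becomes visible elsewhere.
print(stored.encode("iso-8859-1").decode("utf-8"))   # päivä.txt
```

Each "ä" is two bytes in UTF-8 (0xC3 0xA4), so it is stored as the two
characters "Ã" and "¤".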

If the user knows that the kernel's default fat filesystem charset is
different from her system's default, she can use the mount option utf8 or
nls=utf8 to make filename conversion work correctly. The KDE desktop
environment is clever in that it mounts vfat USB memory sticks with the
utf8 option if the current locale uses UTF-8 encoding. A user editing the
/etc/fstab file may not be aware of this, and when facing problems many
users conclude that UTF-8 is not ready yet and switch their whole system
to an ISO-8859-1 locale, which makes at least characters U+0000..U+00FF
in a vfat filesystem work correctly.

So, I suggest that you change the kernel options which define the
default charset of filesystem modules.

Currently /boot/config-2.6.18-4-k7 says:

CONFIG_FAT_DEFAULT_CODEPAGE=437
CONFIG_FAT_DEFAULT_IOCHARSET=iso8859-1

-- System Information:
Debian Release: 4.0
  APT prefers testing
  APT policy: (900, 'testing')
Architecture: i386 (i686)
Shell:  /bin/sh linked to /bin/dash
Kernel: Linux 2.6.18-4-k7
Locale: LANG=fi_FI.UTF-8, LC_CTYPE=fi_FI.UTF-8 (charmap=UTF-8)

