On Sat, Jan 03, 2004 at 03:46:06PM +0200, Oren Held wrote:
> Correct me if I'm wrong, but I don't think that people use UTF-8
> filenames yet. A small test I've made shows that even KDE saves hebrew
> filenames in a non-unicode form.

Okay, here's a short roundup on Unicode filenames:

On (modern) FAT, NTFS and Joliet CDs, file names are always stored
in Unicode.
On Unix filesystems, the filesystem makes no claims about the textual
interpretation of the file name; each program can decide on its own.
That's why a system-wide convention is needed.
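
To make this concrete, here is a small Python sketch. The sample name
and the choice of ISO-8859-8 as the legacy charset are only assumptions
for illustration, and read_name() is a made-up helper. It shows that the
same Hebrew name becomes different bytes depending on the charset the
writing program used, and that a reader guessing the wrong charset does
not get the name back.

```python
# The same name, as two different programs might write it to disk.
name = "\u05e9\u05dc\u05d5\u05dd"  # sample Hebrew name, chosen for illustration

as_utf8 = name.encode("utf-8")        # bytes a UTF-8 program would write
as_legacy = name.encode("iso8859_8")  # bytes an ISO-8859-8 program would write

# The filesystem stores only the bytes; it does not record the charset.
print(as_utf8)
print(as_legacy)


def read_name(raw: bytes, charset: str) -> str:
    """Decode raw file name bytes the way a program assuming `charset` would."""
    return raw.decode(charset, errors="replace")


# Matching charsets round-trip; mismatched ones give garbage.
print(ascii(read_name(as_utf8, "utf-8")))
print(ascii(read_name(as_legacy, "utf-8")))
print(ascii(read_name(as_utf8, "iso8859_8")))
```

With a system-wide convention, every program picks the same charset
argument and the mismatch cases never happen.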

* GTK+ 1:

GTK+ 1 will treat filenames according to your locale. If you use
a UTF-8 locale (check with "locale charmap"), GTK+ 1 will consider the
file names on disk to be in UTF-8 charset.

* KDE: 

Same as GTK+ 1. You can force UTF-8 file names despite a non-UTF-8
locale by setting the KDE_UTF8_FILENAMES environment variable.

* GTK+ 2:

GTK+ 2 (and thus GNOME 2) always considers file names to be in UTF-8
encoding unless you set the G_BROKEN_FILENAMES environment variable,
in which case it falls back to the GTK+ 1 behavior.
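
The three toolkit rules above can be summed up in a short Python
sketch. This is only a model of the rules as described here, not code
from any of the toolkits: filename_charset() and its parameters are
made-up names, and treating a variable as "set" whenever it is present
in the environment (whatever its value) is an assumption.

```python
import os


def filename_charset(toolkit: str, locale_charset: str, env=None) -> str:
    """Return the charset a toolkit would assume for file names on disk.

    toolkit is one of "gtk1", "kde" or "gtk2" (hypothetical labels).
    locale_charset is what "locale charmap" reports, e.g. "ISO-8859-8".
    """
    if env is None:
        env = os.environ

    if toolkit == "gtk1":
        # GTK+ 1 follows the locale.
        return locale_charset
    if toolkit == "kde":
        # KDE follows the locale unless KDE_UTF8_FILENAMES is set.
        return "UTF-8" if "KDE_UTF8_FILENAMES" in env else locale_charset
    if toolkit == "gtk2":
        # GTK+ 2 assumes UTF-8 unless G_BROKEN_FILENAMES is set.
        return locale_charset if "G_BROKEN_FILENAMES" in env else "UTF-8"
    raise ValueError(f"unknown toolkit: {toolkit}")


# In an ISO-8859-8 locale with no variables set, GTK+ 2 disagrees with the rest.
for tk in ("gtk1", "kde", "gtk2"):
    print(tk, filename_charset(tk, "ISO-8859-8", env={}))
```

In a UTF-8 locale all three agree, which is the easy way out of the
problem.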
 
* Mounting Unicode file systems:

As I said, VFAT, NTFS and ISO 9660 with Joliet specify the on-disk
storage format to always be Unicode. When you mount them on Linux, you
can choose to see the file names in UTF-8 encoding (with the "utf8"
mount option) or in any legacy encoding (with the "iocharset" mount
option).
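
As a rough model of what these two mount options do, here is a Python
sketch. It is not kernel code: visible_name() is a made-up function, the
on-disk name is assumed to be UTF-16 little-endian (as on VFAT and NTFS;
Joliet uses big-endian), and replacing unrepresentable characters with
"?" is an assumption about the legacy case.

```python
def visible_name(on_disk: bytes, utf8: bool = False,
                 iocharset: str = "iso8859_1") -> bytes:
    """Return the file name bytes a program would see after mounting.

    on_disk is the stored name, assumed here to be UTF-16 little-endian.
    """
    name = on_disk.decode("utf-16-le")
    if utf8:
        # "utf8" mount option: programs see UTF-8 bytes.
        return name.encode("utf-8")
    # "iocharset" mount option: programs see the legacy charset; characters
    # it cannot represent are replaced with "?" in this sketch.
    return name.encode(iocharset, errors="replace")


stored = "\u05e9\u05dc\u05d5\u05dd.mp3".encode("utf-16-le")  # sample name
print(visible_name(stored, utf8=True))
print(visible_name(stored, iocharset="iso8859_8"))
print(visible_name(stored, iocharset="iso8859_1"))
```

The last call shows why the option has to match what your programs
expect: a charset that cannot represent the characters loses them.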

> I think that Windows behaves in a similar way.

Windows NT-based operating systems (including Windows XP) always store
file names as Unicode.
 
> > > However.. I still cannot see my hebrew songs well: because now with
> > > gtk2/pango, it expects filenames/id3 in the unicode format, which I

Regarding file names, see above.
Regarding ID3 tags:

* ID3v1 tags don't specify a charset so they should be assumed to have
the locale's charset.

* ID3v2 tags specify their charset, either as Unicode or as ISO-8859-1
  (Latin-1).

  * ISO-8859-1:

    By standard they have the ISO-8859-1 charset, but for "Real World
    compatibility" they should be assumed to have the locale's charset.

  * Unicode:

    These are Unicode, as the standard says, and should be decoded as
    such.
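
The ID3 rules above could be sketched in Python like this. The function
names are made up, and the sketch only handles the text encoding byte
values 0 (the non-Unicode case) and 1 (UTF-16 with a byte order mark)
of an ID3v2 text frame body; everything else about tag parsing is left
out.

```python
def decode_id3v1_field(raw: bytes, locale_charset: str) -> str:
    """ID3v1: no charset is recorded, so assume the locale's charset."""
    # ID3v1 fields are fixed width, padded with NUL bytes or spaces.
    return raw.rstrip(b"\x00 ").decode(locale_charset, errors="replace")


def decode_id3v2_text(frame_body: bytes, locale_charset: str) -> str:
    """ID3v2 text frame body: one encoding byte, then the text itself."""
    encoding, data = frame_body[0], frame_body[1:]
    if encoding == 1:
        # Unicode: trust the tag; the byte order mark selects the byte order.
        text = data.decode("utf-16", errors="replace")
    elif encoding == 0:
        # Non-Unicode: in the real world, assume the locale's charset.
        text = data.decode(locale_charset, errors="replace")
    else:
        raise ValueError(f"encoding byte {encoding} not handled in this sketch")
    return text.rstrip("\x00")


title = "\u05e9\u05dc\u05d5\u05dd"  # sample Hebrew title
print(ascii(decode_id3v1_field(title.encode("iso8859_8") + b"\x00" * 26, "iso8859_8")))
print(ascii(decode_id3v2_text(b"\x00" + title.encode("iso8859_8"), "iso8859_8")))
print(ascii(decode_id3v2_text(b"\x01" + title.encode("utf-16"), "iso8859_8")))
```

All three calls print the same title, which is the point: only the
Unicode case is independent of the locale.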
