[EMAIL PROTECTED] wrote on 2008-06-13 17:33 UTC:
> ï¼³ï½ï½ï½ï¼´ï½ï½ï½ wrote:
> > I've also given tarballs a shot for this task, but sadly cygwin is
> > ascii-only.
> To quickly respond to this side-topic: It is possible to enable UTF-8 in
> cygwin in a limited way although cygwin has only bogus locale support.
> Some applications, however, are able to support UTF-8 without locale
> support:
>
> * xterm works nicely in UTF-8 mode if configured properly
> * rxvt-unicode can be patched to support UTF-8 (the package includes
>   my patch)
> * my editor mined supports UTF-8 if it finds the terminal to be
>   running in UTF-8 mode

Cygwin is a Windows DLL that provides Windows C applications with a
POSIX API very similar to that available under Linux.

Under Linux, if you open a file with a UTF-8 filename and the file is
located on a VFAT or NTFS filesystem, then it is the job of the kernel
file-system driver to convert between the UTF-8 encoding used in the
open() system call and the UTF-16 encoding used on Microsoft's file
systems. Under Linux, this works nicely if the utf8 option is passed to
the ntfs driver by mount from /etc/fstab. This is now done by default in
all recent (i.e., post-2005) major Linux distributions.

So one option for doing the file transfer is to mount the relevant NTFS
partition under Linux and then use any standard Linux file copy tool
(cp, tar, rsync, etc.) to do the job. This is trivial if the Linux and
NTFS partitions reside on the same system. Otherwise, either (a) connect
the NTFS hard disk to the Linux computer or (b) temporarily boot the PC
that contains the NTFS partition from a CD-R with one of the many
Live-CD Linux distributions (Knoppix, etc.).

The question with regard to Cygwin is not what locales it has, but
whether it translates a UTF-8 string provided to it in a POSIX system
call, such as open(), into a UTF-16 string before passing the data on to
the equivalent Win32 system call, and vice versa.

Markus

-- 
Markus Kuhn, Computer Laboratory, University of Cambridge
http://www.cl.cam.ac.uk/~mgk25/ || CB3 0FD, Great Britain

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
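
To illustrate the UTF-8 <-> UTF-16 filename translation discussed above,
here is a minimal Python sketch. It is not Cygwin's or the Linux kernel's
actual code; the function names and the sample filename are hypothetical,
and UTF-16LE is assumed as the byte order for the wide-character form.

```python
def posix_name_to_win32(name_bytes: bytes) -> bytes:
    """Re-encode a UTF-8 filename (POSIX side) as UTF-16LE (wide-char side).

    Hypothetical helper for illustration only.
    """
    return name_bytes.decode("utf-8").encode("utf-16-le")


def win32_name_to_posix(wide_bytes: bytes) -> bytes:
    """Re-encode a UTF-16LE filename back to UTF-8 (the 'vice versa' direction)."""
    return wide_bytes.decode("utf-16-le").encode("utf-8")


# Sample filename with two non-ASCII characters (an assumption for the demo).
utf8_name = "Grüße.txt".encode("utf-8")
wide_name = posix_name_to_win32(utf8_name)

# The round trip must give back exactly the bytes the application passed in.
assert win32_name_to_posix(wide_name) == utf8_name
```

A layer that skips this step and hands the UTF-8 bytes to a
non-Unicode Win32 call unchanged would have them interpreted in the
current Windows code page, which is why the locale question alone does
not settle whether non-ASCII filenames survive.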
