On 12/21/2011 12:26 PM, John Arbash Meinel wrote:

...

On U1 we have a lot of code to handle this, because we also deal
with Windows, where things are completely different.
I'd be interested to hear more, given that on Windows the filesystem
encoding of NTFS is officially UTF-16 (without surrogate pairs, I
believe, making it UCS-2).

So while there are lots of 8-bit mappings on Windows (code page, ANSI,
OEM, file-content vs filename, etc, etc.), the filenames on disk are
all Unicode. (FAT-32 is probably a different story, though.)


Well, the main problem, IIRC, was that since this was Linux code, it assumed things were either UTF-8 byte strings or unicode objects, and on Windows they are never UTF-8 strings. On Windows, most FS APIs will just give you Unicode strings encoded in UTF-16.

Also, in Python there are lots of "fun" things, like listdir(".") and listdir(u".") giving different results (byte-string names from the first, unicode names from the second), of course :-)
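
To make the listdir point concrete, here is a rough sketch. It is written for Python 3, where the same split shows up as bytes vs. str rather than the Python 2 str vs. unicode discussed above. The helper name list_both_ways and the file name example.txt are made up for illustration and are not from the U1 code.

```python
import os
import tempfile


def list_both_ways(path):
    """Return (text_names, byte_names) for the same directory.

    os.listdir() returns names of the same type as its argument:
    a str path gives str names, a bytes path gives bytes names.
    """
    text_names = sorted(os.listdir(path))
    # os.fsencode() converts the path with the filesystem encoding.
    byte_names = sorted(os.listdir(os.fsencode(path)))
    return text_names, byte_names


# Use a throwaway directory with one ASCII-named file so the result
# does not depend on the platform's filesystem encoding.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "example.txt"), "w").close()
    text_names, byte_names = list_both_ways(tmp)

print(text_names)   # names as text
print(byte_names)   # the same names as raw bytes
```

Code that wants one representation everywhere can normalise with os.fsdecode() or os.fsencode() at the boundary instead of assuming UTF-8.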

--
ubuntu-devel mailing list
[email protected]
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-devel