Michael (michka) Kaplan writes:
> I would not expect Windows (whose most recent shipping version shipped
> before Unicode 4.0 was released) to support 4.0 properties and such.
> But at the same time, if you have fonts and build a keyboard you can
> support any number of 4.0-only scripts.

Wasn't case folding standardized long before Unicode 4.0?

Well, the Windows case mappings for its NTFS filesystem predate the standardized Unicode case folding rules, and I think that Microsoft wants to avoid the nightmare of filesystem migration. But I think that an NTFS filesystem should track the Unicode version it was created with, so that the filesystem driver can adapt to the set of folding rules supported on that system.

The other option would be to add an option to CHKDSK to find files in the same directory whose names would collide if new case folding rules were applied. CHKDSK could then list them and let the user choose which name to keep and which file must be renamed. If there is no conflict in a given directory, it could be marked as supporting the newer Unicode rules.

There's an interesting question with FAT32: it was designed after NTFS to add Unicode and LFN support on top of FAT16, at a time when Unicode was already publishing standard case folding rules. I can't believe that Microsoft chose for its LFN directory extensions to use the same folding rules as those used in NTFS. Maybe the intent was to maximize the compatibility of FAT32 with NTFS, even if NTFS has some defects. For now we have to live with the past!

I'm quite sure that lowercase sharp s (ß, eszett) and double lowercase s are both used on German filesystems. This is even the case on FAT filesystems, with which both FAT32 and NTFS must keep some compatibility (for short file names), because FAT uses the OEM codepage (CP437 or CP850 in Germany), where sharp s has long been allowed and kept distinct from double s. If Windows were changed to fold sharp s to double s, it would have trouble reading filesystems containing such short filenames (including floppies, which use FAT12 with the same naming constraints as FAT16). However, this is mitigated by the fact that FAT12 and FAT16 have always been ambiguous about the OEM charset their names were actually encoded with.

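
To make the CHKDSK idea above concrete, here is a minimal sketch in Python of the collision check for one directory. It is only an illustration: find_fold_collisions is a hypothetical helper name, and Python's str.casefold() (full Unicode case folding, which maps ß to ss) stands in for "the new folding rules". It does not reproduce whatever mapping table a real filesystem driver uses.

```python
from collections import defaultdict


def find_fold_collisions(names):
    """Group names that become identical under full Unicode case folding.

    `names` is the list of file names found in ONE directory.
    Returns a list of groups (lists) containing two or more names each.
    """
    groups = defaultdict(list)
    for name in names:
        # str.casefold() is an assumption standing in for the "new" rules.
        groups[name.casefold()].append(name)
    return [group for group in groups.values() if len(group) > 1]


if __name__ == "__main__":
    directory = ["Strasse.txt", "Straße.txt", "README", "readme", "notes.txt"]
    for group in find_fold_collisions(directory):
        # A real tool would ask the user which name to keep.
        print("would collide:", group)
```

A directory for which this returns an empty list is one that could be marked as safe for the newer rules.
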
Remember the issues when migrating from Windows 3.x to Windows 95, caused by legacy filesystems created with ambiguous OEMCP-only short names. SCANDISK also had to be used for some time, because there were applications expecting OEMCP-encoded names, which sometimes conflicted between CP437 and CP850. Even after the upgrade, the current codepage of the running application can still create encoding conflicts, detected later by CHKDSK or SCANDISK when OEMCP-encoded short names do not match their Unicode-encoded LFN names. SCANDISK proposes to trust the Unicode LFN name and to alter the short name so that it reflects the Unicode name in the current OEM codepage.

Even today such errors occur when, for some reason such as a virus infection, AUTOEXEC.BAT is not run at startup to set the codepage, so that Windows starts using short names in FAT filesystems with an OEM codepage different from the one with which the filesystem was previously used.

Thankfully, going to Unicode has fixed all this: short names are retained only for compatibility. However, FAT32 filesystems still try to open the file by its short name, converted to the current OEMCP, before trying the LFN name in Unicode. As FAT32 is definitely not dead or deprecated in favor of NTFS (for some performance reasons, and disregarding the stronger security and stability of NTFS in the face of system crashes), we still have an issue in Windows 2000/XP/2003...
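
The OEM codepage ambiguity described above is easy to reproduce. The sketch below, in Python, decodes the same short-name bytes under CP437 and CP850 and compares the result with a Unicode long name, roughly the kind of mismatch SCANDISK reports. The helper name short_name_matches_lfn and the sample file name are made up for illustration, and a real check would also have to account for 8.3 truncation and uppercasing.

```python
def short_name_matches_lfn(raw_short, lfn, oem_codepage):
    """Return True if the short-name bytes, decoded with the given OEM
    codepage, spell exactly the same string as the Unicode long name."""
    return raw_short.decode(oem_codepage) == lfn


if __name__ == "__main__":
    # Byte 0x9D is U+00D8 (Ø) in CP850 but U+00A5 (¥) in CP437.
    raw = b"\x9dRE.TXT"
    lfn = "\u00d8RE.TXT"
    for codepage in ("cp850", "cp437"):
        decoded = raw.decode(codepage)
        ok = short_name_matches_lfn(raw, lfn, codepage)
        print(codepage, repr(decoded), "matches LFN:", ok)
```

The bytes on disk never change; only the codepage assumed by the running system does, which is why the conflict can appear after a change of configuration.
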

