On Sun, 26 Oct 2003 23:36:55 -0500
"Mrs. Brisby" <[EMAIL PROTECTED]> wrote:

> It's good to use null-terminated in many cases; especially in collating
> and sorting. It helps to understand that in those cases you stop
> processing _after_ you see the terminator (and treat the terminator as
> it is: zero.)

Collating involves length. If the data length is known before scanning the data, in some cases you can skip a non-matching entry without scanning its body. It helps to understand that in those cases you stop processing _before_ you see the terminator or anything else.

> UTF-16 is NOT used in HFS+. HFS+ still uses ASCII with some "tricks".
> UFS is what's "preferred" in MacOS X, and it doesn't use UTF-16 either.
> UTF-16 isn't what we're talking about anyway, it's UCS16.

Thank you for the clarification. I'd like to hear more about those imaginative "tricks", but I'm afraid that's OT.

MacOS X uses "Unicode" as its native encoding, and the Unicode encoding most used in MacOS X is UTF-16. It uses UTF-8 only to call the BSD APIs. It's a kind of hybrid: UTF-8 is used just for compatibility with the Unix parts of MacOS X, while the non-Unix pieces, which are what make MacOS X a Mac, use UTF-16 internally, including Carbon, Cocoa and ATSUI.

For HFS+, from Apple's Technical Note TN2078 (Migrating to FSRefs & long Unicode names from FSSpecs):
http://developer.apple.com/technotes/tn2002/tn2078.html

"How file names are encoded

HFS+ disks store file names as UTF-16 in an Apple-modified form of Normalization Form D (decomposed). This form excludes certain compatibility decompositions and parts of the symbol blocks, in order to assure round-trip of file names to Mac OS encodings (applications using the HFS APIs assume they get the same bytes out that they put in).

In Mac OS X 10.2, the decomposition rules used were changed from Unicode 2.0.x (based on an intermediate draft) plus the above-mentioned Apple modifications, to Unicode 3.2 plus the above-mentioned Apple modifications. The Unicode Consortium has committed to not changing the decomposition rules after Unicode 3.2, so we shouldn't have to do this again. The change from 2.0.x to 3.2 was necessary because A) lots of new decompositions had been added, and B) the 2.0.x data was full of errors.

Other file systems use different storage formats. UFS disks use UTF-8, HFS disks use Mac OS encodings. AFP (AppleShare) uses Mac OS encodings prior to 3.0, and UTF-16 for 3.0 or later."

-- KL
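
For illustration only, here is a minimal Python sketch of the two technical points in this message; it is not taken from any Apple source. The names `find_counted` and `nfd_utf16_units` are hypothetical. The first function assumes each entry is stored with its length, so a length mismatch rejects the entry before its body is read. The second uses the standard NFD from Python's `unicodedata` module as a stand-in for the Apple-modified decomposition described in TN2078, which it does not reproduce.

```python
import unicodedata


def find_counted(records, key):
    """Search (length, body) records for key, checking the stored length first.

    Returns (index, bodies_compared); index is -1 when nothing matches.
    """
    bodies_compared = 0
    for index, (length, body) in enumerate(records):
        if length != len(key):
            continue  # length mismatch: skip without reading the body
        bodies_compared += 1
        if body == key:
            return index, bodies_compared
    return -1, bodies_compared


def nfd_utf16_units(name):
    """Decompose name with standard NFD (an assumption: NOT Apple's modified
    form) and return the resulting UTF-16 code units as integers."""
    decomposed = unicodedata.normalize("NFD", name)
    raw = decomposed.encode("utf-16-be")
    return [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]


# Hypothetical sample data: each record carries its length up front.
records = [(len(b), b) for b in (b"alpha", b"be", b"gamma", b"delta")]

print(find_counted(records, b"be"))                 # (1, 1)
print([hex(u) for u in nfd_utf16_units("\u00e9")])  # ['0x65', '0x301']
```

In the first call only one body is compared, because "alpha" is rejected on length alone. In the second, the precomposed U+00E9 becomes "e" followed by the combining acute accent U+0301, giving two UTF-16 code units.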