> -----Original Message-----
> From: Barry Caplan [mailto:[EMAIL PROTECTED]]
>
> At 01:27 PM 7/11/2002 -0400, Suzanne M. Topping wrote:
> >Unicode is a character set. Period.
>
> Each character has numerous properties in Unicode, whereas they
> generally don't in legacy character sets.

Each character, or some characters?

> Maybe Unicode is more of a shared set of rules that apply to
> low-level data structures surrounding text and its algorithms
> than a character set.

Sounds like the start of a philosophical debate.

If Unicode is described as a set of rules, we'll be in a world of hurt.

> >The Unicode Consortium very wisely keeps its focus narrow. It provides
> >a mechanism for specifying characters. Not for manipulating them, not
> >for describing them, not for making them twinkle.
>
> All true, except for some special cases (BOM, bidi issues and
> algorithms, vertical variants, etc). Not saying those
> shouldn't be in there, just that they are useful only in the
> use of algorithms that are explicit (bi-di) or assumed (upper
> case/lower case, vertical/horizontal) etc.

<humour>
Why mess up a nice clean statement simply because of a few hard facts?
</humour>

I choose to look at this stuff as the exceptions that make the rule.

(On a serious note, these exceptions are exactly what make writing some
sort of "is and isn't" FAQ pretty darned hard. I can't very well say
that Unicode manipulates characters given certain historical/legacy
conditions and under duress. If I did, people would be scurrying around
trying to figure out how to foment the duress.)
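
For anyone following the "each character, or some characters?" question, here is a minimal sketch of what per-character properties look like in practice. It is not from the thread: it assumes Python and its standard `unicodedata` module, and `describe` is a hypothetical helper name chosen for illustration. It only reads a handful of the properties the Unicode Character Database records, including the bidi class and mirroring flag that the bidi discussion above relies on.

```python
import unicodedata

def describe(ch):
    """Collect a few Unicode Character Database properties for one character.

    Hypothetical helper for illustration; the property lookups themselves
    are standard-library calls.
    """
    return {
        # Unnamed characters (e.g. controls) fall back to the default string.
        "name": unicodedata.name(ch, "<unnamed>"),
        "category": unicodedata.category(ch),    # general category, e.g. "Lu"
        "bidi": unicodedata.bidirectional(ch),   # bidi class, e.g. "L", "R"
        "combining": unicodedata.combining(ch),  # canonical combining class
        "mirrored": unicodedata.mirrored(ch),    # 1 if mirrored in bidi text
    }

# Latin letter, Hebrew letter, bracket, combining accent, control character.
for ch in ("A", "\u05d0", "(", "\u0301", "\x01"):
    print(f"U+{ord(ch):04X}", describe(ch))
```

Even the control character comes back with a general category and a bidi class, although it has no name, which suggests the answer is closer to "each character" than "some characters", at least for the properties sampled here.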

