Andrew Lentvorski wrote:
>> Heh. My issue *is* Unicode. I believe that Unicode was a solution
>> that was arrived at early and all the brainpower was put into making
>> it work instead of asking "is this the right thing to do?" This is
>> often the case with smart people, I find... they *can* make it work,
>> so they don't stop to think about whether it's worth it.
>
> I disagree. Completely. Unicode means that I can just have a single
> "String" abstraction that works across multiple human and computer
> languages.

That part I agree with, and I'd add in something about fonts, printers,
and font rendering systems.

> The cacophony of "String" data types in various programming languages
> and libraries prior to Unicode shows that a solution was needed.

This part I disagree with. Even when you assume you are strictly working
with an ASCII character set, you'll find a cacophony of "String" data
types with all kinds of curious properties (copy-on-write vs. always
copy, flyweight pattern vs. not, length field vs. termination character,
specialized fields for various properties vs. all properties computed
from the string, etc., etc.). I think it's fair to say that this is
simply a byproduct of the fact that there is no perfect string
implementation, only a series of trade-offs. Perhaps with generic
programming you could come up with something that could be all things to
all people, but its behavior would have so much variation you'd
effectively have the same cacophony, just tied to a single data type.

> I don't see how any other solution will avoid dealing with the same
> issues as Unicode addressed.

Yeah, the horrible thing about Unicode is that it seems so painfully
complex until you actually start delving into the problem. Then you find
out just how painfully complex the problem is, and start crying uncle
and asking for Unicode. ;-)

That said, there are some stupid aspects to Unicode that are simply
design-by-committee issues, like having an *optional* byte order mark
that doesn't even mean anything for UTF-8.

The thing is, there are still a lot of circumstances where you can
completely side-step the issue, and requiring the use of Unicode even in
those circumstances just sucks horribly.

--Chris

-- 
http://www.kernel-panic.org/cgi-bin/mailman/listinfo/kplug-lpsg
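
To make the byte order mark complaint above concrete, here is a minimal
Python sketch. The sample string and variable names are made up for
illustration; the codecs used ("utf-16-le", "utf-16-be", "utf-8",
"utf-8-sig") and codecs.BOM_UTF8 are from the Python standard library.
It only shows that UTF-16 has two byte orders to tell apart while UTF-8
has one, so a UTF-8 BOM carries no ordering information.

```python
import codecs

# Hypothetical sample text with one non-ASCII character.
text = "caf\u00e9"

# UTF-16 has two possible byte orders, so the same text
# serializes to two different byte strings.
le = text.encode("utf-16-le")
be = text.encode("utf-16-be")

# UTF-8 is defined byte by byte, so there is only one serialization.
utf8 = text.encode("utf-8")

# The optional UTF-8 "BOM" is just U+FEFF encoded as EF BB BF.
with_bom = codecs.BOM_UTF8 + utf8

# The plain "utf-8" codec keeps U+FEFF as an ordinary character;
# "utf-8-sig" drops a leading BOM if one is present.
plain = with_bom.decode("utf-8")
stripped = with_bom.decode("utf-8-sig")

print(le != be)                # the byte order matters for UTF-16
print(plain == "\ufeff" + text)
print(stripped == text)
```

The practical annoyance is the middle case: code that decodes with plain
"utf-8" ends up with a stray U+FEFF at the front of the string unless it
strips it itself.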
