On Saturday 11 September 2004 16:13, Paul Rosen wrote:
> UNICODE Advantage: Any character in any language can be displayed.
> UNICODE Disadvantage: Everyone using the structure needs to be UNICODE
> aware. Are there systems and computer languages that can't handle it?

I don't know how well C handles Unicode, though there is support for it in the glibc library. I'm mainly a Java programmer, and Java has native Unicode support. Python and Perl have it as well.

> That's why I was wondering if there should be some type of switch passed to
> the parser about whether to output UNICODE.

My opinion is that applications that use this new parser SHOULD be Unicode-aware. I think that internationalisation is a big issue and will only grow in importance. Application developers can no longer afford to stick their heads in the sand and pretend that the fruit of their labour will only be used by native English speakers who need nothing but unaccented letters from the Latin alphabet...

Just my €0,02...

> Actually, since the app passes the string with the tune to the parser, we
> could have two functions, one that accepts a UNICODE string and outputs
> UNICODE strings, and the other that accepts an ASCII string and outputs
> ASCII strings.
>
> (There's no point in returning UNICODE strings if the original was ASCII,
> since no extra characters were used.)

That's not necessary at all, at least if you go for the UTF-8 encoding. If the input contains only US-ASCII characters, it makes no difference: the first 128 code points (0 to 127) are encoded in UTF-8 exactly as they are in US-ASCII, so the Unicode version of the function will behave as expected for ASCII input.

Where Unicode *does* make a difference is in header fields like titles and authors, and in lyrics. If the input file contains lyrics in, say, Greek, I don't know how an application that only knows US-ASCII could handle that...

-- 
Bert Van Vreckem <http://flanders.blackmill.net/>
Te audire non possum. Musa sapientum fixa est in aure.
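
P.S. To illustrate the UTF-8 point above, here is a minimal Python sketch. It is only an illustration, not part of any proposed parser: the helper name utf8_matches_ascii and both sample strings are made up for the example. It checks that pure US-ASCII text produces the same bytes under either encoding, while a Greek word cannot be encoded as US-ASCII at all and takes two bytes per letter in UTF-8.

```python
# Hypothetical sample inputs; neither string comes from the discussion.
ascii_title = "The Kesh Jig"
# A Greek word ("kalimera"), written with escapes so the source file
# itself stays pure ASCII: 8 code points in total.
greek_lyric = "\u039a\u03b1\u03bb\u03b7\u03bc\u03ad\u03c1\u03b1"


def utf8_matches_ascii(text):
    """Return True if text encodes to identical bytes in UTF-8 and US-ASCII."""
    try:
        return text.encode("utf-8") == text.encode("ascii")
    except UnicodeEncodeError:
        # At least one character lies outside the US-ASCII range.
        return False


# Every code point from 0 to 127 is a single, identical byte in UTF-8.
all_ascii_identical = all(chr(i).encode("utf-8") == bytes([i]) for i in range(128))

ascii_same = utf8_matches_ascii(ascii_title)
greek_same = utf8_matches_ascii(greek_lyric)
greek_chars = len(greek_lyric)
greek_bytes = len(greek_lyric.encode("utf-8"))

print(all_ascii_identical, ascii_same, greek_same, greek_chars, greek_bytes)
```

So a single UTF-8 code path would leave ASCII-only input byte-for-byte unchanged, and only text such as the Greek example ends up looking any different.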
