> So I think Zhang Weiwu is suggesting a heuristic algorithm for
> discriminating a unicode text which is already known, or assumed to be, in
> Chinese.

well, the site will deliver Chinese content with double-checking of the browser locale, etc., so yes, most likely Chinese users.

> to encounter at least one "ge" u+500B or u+4E2A? One "wei" u+70BA or
> u+4E3A? One "shuo" u+8AAC or u+8BF4? It wouldn't take long to figure
> this out.

might for me ;-)

> Marco Cimarosti has questioned, why do you need to classify text as being
> simplified or traditional?

if i understand their needs correctly, it's to implement a search system that accepts search phrases in either "type" of Chinese -- the content would be in both types.

> So, basically all you would be doing is providing a convenience for your
> readers, making it easier on their eyes to read your web documents in
> either traditional or simplified according to their preference. I know
> that something like that would help me -- sometimes I forget the
> traditional version of a character, and sometimes I forget the simplified
> version. It would be very cool if I could just press a button on a web
> site to switch the display between the two ;-) .

from what i understand, this isn't something they've considered, but it sounds pretty cool.
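
For what it's worth, the heuristic quoted above (look for characters that exist in only one of the two forms) could be sketched roughly like this in Python. This is only an illustration under assumptions: the pair table holds just the three characters named in the thread, the names PAIRS and classify are made up, and a real table would need many more pairs before the result could be trusted.

```python
# Hypothetical sample table of (traditional, simplified) pairs.
# Only the three pairs named in the thread are listed; a usable
# table would need to be far larger.
PAIRS = [
    ("\u500B", "\u4E2A"),  # "ge"
    ("\u70BA", "\u4E3A"),  # "wei"
    ("\u8AAC", "\u8BF4"),  # "shuo"
]

TRADITIONAL = {trad for trad, _ in PAIRS}
SIMPLIFIED = {simp for _, simp in PAIRS}


def classify(text):
    """Guess the script of text assumed to be Chinese.

    Counts occurrences of the marker characters from each column of
    PAIRS and returns "traditional", "simplified", or "unknown"
    (no markers found, or a tie).
    """
    trad_hits = sum(ch in TRADITIONAL for ch in text)
    simp_hits = sum(ch in SIMPLIFIED for ch in text)
    if trad_hits > simp_hits:
        return "traditional"
    if simp_hits > trad_hits:
        return "simplified"
    return "unknown"
```

A search front end could run something like classify() on the query string and then decide which form of the content to match against. Short queries may contain none of the marker characters, which is why the sketch returns "unknown" rather than guessing.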

