Ken is right; collation is *quite* complex. Anyone wanting to see some of what is involved can look at the ICU implementation (which is UCA and ISO 14651 compliant, and open-source):

User Guide:
http://www-124.ibm.com/icu/userguide/Collate_Intro.html

Internal Design:
http://oss.software.ibm.com/cvs/icu/~checkout~/icuhtml/design/collation/ICU_collation_design.htm
(source files are linked from there)

ICU home:
http://oss.software.ibm.com/icu/

While the performance is good, the code is many, many orders of magnitude more complicated than the current nameprep. It is not appropriate for IDN.

Mark
—————

Δός μοι ποῦ στῶ, καὶ κινῶ τὴν γῆν — Ἀρχιμήδης
[http://www.macchiato.com]

----- Original Message -----
From: "Kenneth Whistler" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Thursday, October 18, 2001 8:28 PM
Subject: Re: [idn] call for comments for REORDERING

> James Seng said:
>
> > Third, I would really prefer to reference a work from established expert
> > group if possible. For example, ISO/IEC JTC1/SC22/WG20 publishes ISO
> > 14651 on weighted sorting. I am not sure how ISO 14651 would perform for
> > the IDN purpose but I thought it might be worthwhile to examine.
>
> As one of the principal authors of ISO 14651, who has also implemented
> the synchronized Unicode Technical Standard #10, the Unicode Collation
> Algorithm, I can attest that this is a very tricky and complicated
> area, and the algorithms to do all this correctly are not the kind
> you can write on the back of a cocktail napkin. It is very complex
> to get all the details right and to get good-performing algorithms
> (in speed and in resource usage). It is also very difficult for
> independent implementations to get themselves all exactly lined
> up, and even more difficult for independent implementations to *prove*
> that they are getting the same results for all data (as opposed to
> a particular result for one set of data -- which is pretty easy).
>
> IDN doesn't need to add this kind of headache to the already
> complex enough issues of nameprep.
>
> >
> > Whatever the case, we should make a decision quickly on this. Let's not
> > drag this further if possible.
> >
> > -James Seng
> >
> > ----- Original Message -----
> > From: "Martin Duerst" <[EMAIL PROTECTED]>
> >
> > > So this is a solution in search of a real problem,
> > > not worth bothering the whole world with additional
> > > complexity.
>
> I heartily concur with Martin's assessment.
>
> --Ken Whistler
