Resent with a non-renegade email address... (^8=

At 14:10 2004-07-09, Jony Rosenne wrote:
I think the problem is with the concept of default in this case. The default
should be the basis for a specific tailoring, and as a last resort for
scripts and letters that do not have specific weights, but each
implementation should have its own weights when it matters. Only rarely is
the default useful in itself, except possibly for Latin-based locales.

[Alain]  My two cents in this debate (in full support of Jony's fundamental statement): there is no concept of "default" in ISO/IEC 14651, the International String Ordering Standard (a significant difference from the UCA), because, in order to be conformant, one  * s h a l l *  declare a delta, even if it is only one line.

   Adaptation to the world's cultures (at the limit, even to individual needs) is the key here.

   And even for Latin-based locales, the UCA "default" does not make complete sense for any language written in the Latin script anywhere in the world.

   Given that there is no such thing as a default according to the international standard, the debate is mostly futile in this context. It looks to me like the well-known "my-father-is-stronger-than-yours" debate.

   That said, Peter Kirk raised an important issue (that *could* be solved by applying a particular delta consistently):
One Danish participant is Søren Holst and so called in the name field of his e-mails, but signs himself "Soren" in messages in English. If I type "Soren" into the name search box (in Mozilla 1.7), I get no matches. This is not what I expect, because to me, and to Søren himself when thinking in English, ø is a variant of o. (But actually Mozilla is inconsistent: when sorting, it puts Søren after Sonny but before Soshie.)

[Alain]  Mozilla (and, for that matter, even "Find" in the most popular Microsoft products, which of course have nothing to do with Mozilla) does not seem to be smart enough to be *able* to treat accented data "correctly" and consistently between searching and sorting. Mozilla (like the Microsoft products) does not do any accent decomposition for searching; it only folds case, which seems to be as much as it cares to do. That is not the expected behaviour in French for my name [LaBonté] either, even though "é" is merely an accented form of "e" and not a separate letter.
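
   To make the distinction between case folding and accent decomposition concrete, here is a minimal Python sketch. It is not how Mozilla or any Microsoft product works, and it is far simpler than the UCA or ISO/IEC 14651; the function names `fold_accents_and_case` and `loose_contains` are hypothetical. It decomposes the text (Unicode NFD), drops the combining marks, and then folds case. Note that "ø" has no canonical decomposition, so this step alone finds "LaBonté" from "labonte" but still does not find "Søren" from "Soren".

```python
import unicodedata


def fold_accents_and_case(text: str) -> str:
    """Hypothetical helper: strip combining accents, then fold case."""
    # NFD splits "é" into "e" + U+0301 (combining acute accent).
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def loose_contains(haystack: str, needle: str) -> bool:
    """Substring search that ignores case and decomposable accents."""
    return fold_accents_and_case(needle) in fold_accents_and_case(haystack)


print(loose_contains("Alain LaBonté", "labonte"))  # True
print(loose_contains("Søren Holst", "Soren"))      # False: "ø" does not decompose
```

   The second result is why decomposition by itself is not enough: treating "ø" as a variant of "o" is a cultural choice that has to come from a tailoring.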

   It would be much better to make sorting, matching and searching consistent with tailored tables of either the UCA or ISO/IEC 14651. Unfortunately, that is not what happens in most products, except in some good search engines (Google, Altavista and the like, which are smart enough for this but, to my knowledge, are not tailorable; there are slight differences in behaviour between Google and Altavista, although both are very much better than Mozilla or MS products in all cases).

   There is probably a need for an international standard for searching that would just say: "searching should be consistent with sorting". Sometimes international standards do not need to be complicated. Simple ideas are great, but they seem so intellectually obvious that one would have to write them out 1000000 times in one's homework book to get them fully understood and applied (i.e. not only intellectually but in human-made tools as well).
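
   As an illustration of "searching should be consistent with sorting", the sketch below derives both operations from one key function. Everything in it is an assumption made for the example: the names `ENGLISH_DELTA`, `primary_key`, `sort_names` and `search_names` are hypothetical, the "delta" is a plain character mapping rather than a real ISO/IEC 14651 or UCA tailoring, and only a single comparison level is modelled.

```python
import unicodedata

# Hypothetical one-line "delta": for an English-speaking user,
# treat "ø" as a variant of "o".
ENGLISH_DELTA = {"ø": "o", "Ø": "O"}


def primary_key(text, delta=None):
    """Single-level key shared by sorting and searching (simplified)."""
    delta = delta or {}
    # Apply the tailoring first, then strip accents and fold case.
    mapped = "".join(delta.get(ch, ch) for ch in text)
    decomposed = unicodedata.normalize("NFD", mapped)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return base.casefold()


def sort_names(names, delta=None):
    return sorted(names, key=lambda name: primary_key(name, delta))


def search_names(names, query, delta=None):
    wanted = primary_key(query, delta)
    return [name for name in names if wanted in primary_key(name, delta)]


names = ["Soshie", "Søren", "Sonny"]
print(sort_names(names, ENGLISH_DELTA))             # ['Sonny', 'Søren', 'Soshie']
print(search_names(names, "Soren", ENGLISH_DELTA))  # ['Søren']
print(search_names(names, "Soren"))                 # [] without the delta
```

   Because both functions go through the same key, a name that sorts as if it were spelled "Soren" is also found by searching for "Soren". A different locale would supply a different delta to the same two functions.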

Alain LaBonté
