2015-02-19 21:17 GMT+01:00 Eli Zaretskii <e...@gnu.org>:

> > From: Philippe Verdy <verd...@wanadoo.fr>
> > Date: Thu, 19 Feb 2015 20:31:07 +0100
> > Cc: Julian Bradfield <jcb+unic...@inf.ed.ac.uk>,
> >     unicode Unicode Discussion <unicode@unicode.org>
> >
> > The decompositions are not needed for plain text searches, that can use the
> > collation data (with the collation data, you can unify at the primary level
> > differences such as capitalisation and ignore diacritics, or transform some
> > base groups of letters into a single entry, or make some significant primary
> > difference when there are diacritics (for example in German equating 'ae' and
> > 'ä' at the primary level).
>
> Sorry, I disagree. First, collation data is overkill for search,
> since the order information is not required, so the weights are simply
> wasting storage. Second, people do want to find, e.g., "²" when they
> search for "2" etc. I'm not saying that they _always_ want that, but
> sometimes they do. There's no reason a sophisticated text editor
> shouldn't support such a feature, under user control.
The weights and the collation strings do not need to be stored. Even database engines and plain-text search engines on the web now provide collation algorithms for searching and sorting data, so you don't need to store them in your tables. It is not overkill: good implementations of collation are used effectively in high-performance database servers (and many users of these databases do not realize that collation is being used). There are also good text editors that implement collation-based searches.
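
A minimal sketch of the idea in Python, under loud assumptions: the weight table below is invented for illustration (it is not the DUCET or CLDR data), the names TOY_WEIGHTS, primary_key and find_primary are hypothetical, and every character maps to exactly one collation element, so contractions and expansions such as 'ae' versus 'ä' are not handled. It only shows that primary weights can be computed on the fly at search time, with nothing stored alongside the text.

```python
# Toy collation element table: char -> (primary, secondary, tertiary).
# All weights are invented for this sketch; a real implementation would
# load them from actual collation data.
TOY_WEIGHTS = {
    "a": (1, 0, 0), "A": (1, 0, 1),
    "\u00e4": (1, 1, 0),   # a with diaeresis
    "\u00c4": (1, 1, 1),   # A with diaeresis
    "e": (2, 0, 0), "E": (2, 0, 1),
    "\u00e9": (2, 1, 0),   # e with acute
    "2": (10, 0, 0),
    "\u00b2": (10, 0, 2),  # superscript two: same primary as "2" in this toy table
}


def primary_key(text):
    """Compute the list of primary weights for text, on the fly."""
    keys = []
    for ch in text:
        weights = TOY_WEIGHTS.get(ch)
        if weights is None:
            # Characters missing from the toy table only match themselves;
            # the offset keeps them clear of the small table primaries.
            keys.append(0x10000 + ord(ch))
        else:
            keys.append(weights[0])
    return keys


def find_primary(haystack, needle):
    """Return start indexes where needle matches haystack at the primary level."""
    h = primary_key(haystack)
    n = primary_key(needle)
    if not n:
        return []
    return [i for i in range(len(h) - len(n) + 1) if h[i:i + len(n)] == n]


if __name__ == "__main__":
    print(find_primary("\u00c4pfel x\u00b2 Apfel", "apfel"))
    print(find_primary("x\u00b2 + 2", "2"))
```

Comparing on the secondary or tertiary weights as well would make the same search accent- or case-sensitive, which is one way to put the strictness under user control.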
_______________________________________________
Unicode mailing list
Unicode@unicode.org
http://unicode.org/mailman/listinfo/unicode