Re: Compatibility decomposition for Hebrew and Greek final letters

Martin J. Dürst Thu, 19 Feb 2015 18:54:57 -0800

On 2015/02/20 05:17, Eli Zaretskii wrote:

From: Philippe Verdy <[email protected]>
Date: Thu, 19 Feb 2015 20:31:07 +0100
Cc: Julian Bradfield <[email protected]>,
        unicode Unicode Discussion <[email protected]>


The decompositions are not needed for plain text searches, that can use the
collation data (with the collation data, you can unify at the primary level
differences such as capitalisation and ignore diacritics, or transform some
base groups of letters into a single entry, or make some significant primary
difference when there are diacritics, for example in German equating 'ae' and
'ä' at the primary level).


Sorry, I disagree.  First, collation data is overkill for search,
since the order information is not required, so the weights are simply
wasting storage.  Second, people do want to find, e.g., "²" when they
search for "2" etc.  I'm not saying that they _always_ want that, but
sometimes they do.  There's no reason a sophisticated text editor
shouldn't support such a feature, under user control.

Well, for cased scripts, search is usually case-insensitive, but case conversions aren't given by compatibility decompositions.
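
As a small illustration of that distinction, here is a sketch using Python's standard `unicodedata` module. The helper name `nfkd` is made up for this example. It shows that compatibility decomposition covers a case like "²" versus "2" but leaves case differences and Hebrew final letters untouched, while case folding is a separate operation.

```python
import unicodedata


def nfkd(text: str) -> str:
    """Hypothetical helper: apply compatibility decomposition (NFKD)."""
    return unicodedata.normalize("NFKD", text)


# SUPERSCRIPT TWO (U+00B2) has a compatibility decomposition to DIGIT TWO.
superscript_matches = nfkd("\u00B2") == "2"

# Case is not touched by NFKD; case folding is a separate operation.
nfkd_folds_case = nfkd("A") == "a"
casefold_folds_case = "A".casefold() == "a"

# HEBREW LETTER FINAL MEM (U+05DD) has no decomposition, so NFKD leaves it as is.
final_mem_unchanged = nfkd("\u05DD") == "\u05DD"

print(superscript_matches, nfkd_folds_case, casefold_folds_case, final_mem_unchanged)
```

A search feature that wants both behaviours would therefore have to combine the two steps itself, for example by case folding the NFKD form of both the pattern and the text.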

If the question isn't "Why are there equivalences useful for search that are not covered by compatibility decompositions?", but "Why doesn't Unicode provide some data for final/non-final Hebrew letter correspondence?", maybe the answer is that it hasn't been seen as a need up to now because it's so easy to figure out.
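
To show how small that correspondence is, here is a minimal sketch of a hand-written mapping from the five Hebrew final letters to their non-final forms. The names `FINAL_TO_BASE` and `fold_finals` are invented for this example and are not part of any Unicode data file or library.

```python
# Hypothetical table: the five Hebrew final letters and their non-final forms.
FINAL_TO_BASE = str.maketrans({
    "\u05DA": "\u05DB",  # FINAL KAF   -> KAF
    "\u05DD": "\u05DE",  # FINAL MEM   -> MEM
    "\u05DF": "\u05E0",  # FINAL NUN   -> NUN
    "\u05E3": "\u05E4",  # FINAL PE    -> PE
    "\u05E5": "\u05E6",  # FINAL TSADI -> TSADI
})


def fold_finals(text: str) -> str:
    """Replace Hebrew final letters with their non-final counterparts."""
    return text.translate(FINAL_TO_BASE)


# "shalom" written with FINAL MEM and with ordinary MEM compare equal after folding.
with_final = "\u05E9\u05DC\u05D5\u05DD"
with_base = "\u05E9\u05DC\u05D5\u05DE"
print(fold_finals(with_final) == fold_finals(with_base))
```

Applying such a folding to both the search pattern and the searched text would make the final and non-final forms match. Whether that is desirable is a matter for user control, as with the other equivalences discussed in this thread.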


Regards,   Martin.
