Date: Fri, 20 Feb 2015 11:50:17 +0900
From: Martin J. Dürst due...@it.aoyama.ac.jp
CC: jcb+unic...@inf.ed.ac.uk, unicode@unicode.org
Well, for cased scripts, search is usually case-insensitive, but case
conversions aren't given by compatibility decompositions.
That's true, but comparing
Date: Thu, 19 Feb 2015 22:02:57 +
From: Richard Wordingham richard.wording...@ntlworld.com
First, collation data is overkill for search,
since the order information is not required, so the weights are simply
wasting storage.
The big waste is not in text-dependent storage, but in
From: Philippe Verdy verd...@wanadoo.fr
Date: Fri, 20 Feb 2015 04:47:52 +0100
Cc: jcb+unic...@inf.ed.ac.uk, unicode Unicode Discussion unicode@unicode.org
Sorry, I disagree. First, collation data is overkill for search,
since the order information is not required, so the weights are
On Fri, 20 Feb 2015 10:04:32 +0200
Eli Zaretskii e...@gnu.org wrote:
Date: Thu, 19 Feb 2015 22:02:57 +
From: Richard Wordingham richard.wording...@ntlworld.com
First, collation data is overkill for search,
since the order information is not required, so the weights are
simply
Date: Fri, 20 Feb 2015 15:01:34 +
From: Richard Wordingham richard.wording...@ntlworld.com
Sorry, I don't think I follow: what is processing for search orders
to which you allude here?
The examples in the CLDR root locale and in DUCET are the massive sets
of 'contractions' of
On Thu, Feb 19, 2015 at 11:51 PM, Eli Zaretskii e...@gnu.org wrote:
I think decomposition to NFKD solves these issues, doesn't it?
Not completely. Judging from your question, you expected more mappings than
NFKD has. You might want to try the mappings that are used as input for
deriving the
From: Philippe Verdy verd...@wanadoo.fr
Date: Thu, 19 Feb 2015 20:31:07 +0100
Cc: Julian Bradfield jcb+unic...@inf.ed.ac.uk,
unicode Unicode Discussion unicode@unicode.org
The decompositions are not needed for plain text searches, that can use the
collation data (with the collation
The decompositions are not needed for plain text searches, that can use the
collation data (with the collation data, you can unify at the primary level
differences such as capitalisation and ignore diacritics, or transform some
base groups of letters into a single entry, or make some significant
On Thu, Feb 19, 2015 at 12:17 PM, Eli Zaretskii e...@gnu.org wrote:
Sorry, I disagree. First, collation data is overkill for search,
since the order information is not required, so the weights are simply
wasting storage. Second, people do want to find, e.g., ² when they
search for 2 etc.
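
The point about finding ² when searching for 2 can be illustrated with a small sketch. It assumes Python's standard `unicodedata` module; `search_key` and `loose_find` are hypothetical helper names, not anything proposed in the thread. It also shows that compatibility decomposition alone does not fold case, so a separate case-folding step is needed.

```python
import unicodedata


def search_key(text: str) -> str:
    """Hypothetical helper: fold text into a loose key for matching.

    Steps (an illustrative choice, not a standard algorithm):
    1. NFKD maps compatibility characters such as SUPERSCRIPT TWO to
       their plain equivalents and splits accented letters apart.
    2. Combining marks are dropped, so diacritics are ignored.
    3. casefold() removes case distinctions, which NFKD does not do.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_marks.casefold()


def loose_find(needle: str, haystack: str) -> int:
    """Return the match offset in the *folded* haystack, or -1.

    The offset refers to the folded string, not the original text.
    """
    return search_key(haystack).find(search_key(needle))


if __name__ == "__main__":
    print(loose_find("2", "E = mc\u00b2"))      # matches the superscript
    print(search_key("Caf\u00e9") == "cafe")    # accent and case ignored
```

No ordering weights are stored here, only an equivalence key, which is the storage argument made above. Mapping offsets back to the original text would need extra bookkeeping that this sketch omits.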
Does anyone know why the UCD defines compatibility decompositions
for Arabic initial, medial, and final forms, but doesn't do the same
for Hebrew final letters, like U+05DD HEBREW LETTER FINAL MEM? Or for
that matter, for U+03C2 GREEK SMALL LETTER FINAL SIGMA?
The relevant application where
On 19 Feb 2015, at 10:55, Eli Zaretskii e...@gnu.org wrote:
Does anyone know why the UCD defines compatibility decompositions
for Arabic initial, medial, and final forms, but doesn't do the same
for Hebrew final letters, like U+05DD HEBREW LETTER FINAL MEM? Or for
that matter, for U+03C2
From: Michael Everson ever...@evertype.com
Date: Thu, 19 Feb 2015 11:21:19 +
On 19 Feb 2015, at 10:55, Eli Zaretskii e...@gnu.org wrote:
Does anyone know why the UCD defines compatibility decompositions
for Arabic initial, medial, and final forms, but doesn't do the same
for
Date: Thu, 19 Feb 2015 11:47:24 GMT
From: Julian Bradfield jcb+unic...@inf.ed.ac.uk
In Arabic, the variant of a letter is determined entirely by its
position, so there is no compelling need to represent the forms separately
(as characters rather than glyphs) save for the existence of legacy
On Thu, 19 Feb 2015 22:17:30 +0200
Eli Zaretskii e...@gnu.org wrote:
First, collation data is overkill for search,
since the order information is not required, so the weights are simply
wasting storage.
The big waste is not in text-dependent storage, but in the
processing for search orders
2015-02-19 21:17 GMT+01:00 Eli Zaretskii e...@gnu.org:
From: Philippe Verdy verd...@wanadoo.fr
Date: Thu, 19 Feb 2015 20:31:07 +0100
Cc: Julian Bradfield jcb+unic...@inf.ed.ac.uk,
unicode Unicode Discussion unicode@unicode.org
The decompositions are not needed for plain text
On Fri, 20 Feb 2015 11:50:17 +0900
Martin J. Dürst due...@it.aoyama.ac.jp wrote:
If the question isn't Why are there equivalences useful for search
that are not covered by compatibility decompositions?, but Why
doesn't Unicode provide some data for final/non-final Hebrew letter
Date: Thu, 19 Feb 2015 13:08:57 -0800
From: Markus Scherer markus@gmail.com
Cc: Philippe Verdy verd...@wanadoo.fr, Julian Bradfield
jcb+unic...@inf.ed.ac.uk,
Unicode Mailing List unicode@unicode.org
Sorry, I disagree. First, collation data is overkill for search,
since
On 2015/02/20 05:17, Eli Zaretskii wrote:
From: Philippe Verdy verd...@wanadoo.fr
Date: Thu, 19 Feb 2015 20:31:07 +0100
Cc: Julian Bradfield jcb+unic...@inf.ed.ac.uk,
unicode Unicode Discussion unicode@unicode.org
The decompositions are not needed for plain text searches, that can use
On 2015/02/19 20:47, Julian Bradfield wrote:
On 2015-02-19, Eli Zaretskii e...@gnu.org wrote:
Does anyone know why the UCD defines compatibility decompositions
for Arabic initial, medial, and final forms, but doesn't do the same
for Hebrew final letters, like U+05DD HEBREW LETTER FINAL MEM?
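
The asymmetry raised in the original question can be checked directly against the Unicode Character Database. The sketch below assumes Python's standard `unicodedata` module; `describe` is a hypothetical helper name, and U+FEE3 is picked only as one representative Arabic presentation form.

```python
import unicodedata


def describe(ch: str) -> tuple[str, str, str]:
    """Hypothetical helper: (name, raw decomposition field, NFKD result)."""
    return (
        unicodedata.name(ch),
        unicodedata.decomposition(ch),  # empty string if none is defined
        unicodedata.normalize("NFKD", ch),
    )


samples = [
    "\ufee3",  # ARABIC LETTER MEEM INITIAL FORM (presentation form)
    "\u05dd",  # HEBREW LETTER FINAL MEM
    "\u03c2",  # GREEK SMALL LETTER FINAL SIGMA
]

if __name__ == "__main__":
    for ch in samples:
        name, decomp, nfkd = describe(ch)
        print(f"U+{ord(ch):04X} {name}: "
              f"decomposition={decomp!r} NFKD=U+{ord(nfkd):04X}")
```

The Arabic presentation form carries an `<initial>` compatibility mapping to U+0645, while the Hebrew and Greek final letters have an empty decomposition field and pass through NFKD unchanged. In Python, `"\u03c2".casefold()` does map final sigma to U+03C3, because that equivalence lives in the case-folding data rather than in the decompositions; Hebrew has no case, so no comparable mapping applies to U+05DD there.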
19 matches