Re: [sword-devel] Normalize the search string (comparing front-end apps)

2018-03-22 Thread DM Smith
Re case sensitivity, it was just a very simple example of the principle. If it doesn’t find all of them, then the search request and the index were not normalized the same way. NFC and NFD are different normalizations. Note, stripping diacritics may be an appropriate normalization. JSword doesn’t properly
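
To illustrate the remark that stripping diacritics can itself be a normalization, here is a minimal Python sketch. It is not JSword or SWORD code; the function name `strip_diacritics` is hypothetical, and the approach (decompose to NFD, drop every character with a non-zero canonical combining class, recompose) is only one possible way to do it.

```python
import unicodedata

def strip_diacritics(text):
    """Hypothetical helper: remove combining marks from text."""
    # NFD splits precomposed letters into base letter + combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns 0 for non-combining characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Recompose whatever is left so the result is in a single, known form.
    return unicodedata.normalize("NFC", stripped)

print(strip_diacritics("Efra\u00edm"))
```

Whichever variant is chosen, it would have to be applied to both the search request and the searched text, otherwise the two sides no longer match.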

Re: [sword-devel] Normalize the search string (comparing front-end apps)

2018-03-22 Thread David Haslam
Thanks, DM. My question was not about case-sensitivity, but about Unicode normalization. The main issue is composition vs decomposition and the canonical ordering of diacritics in each glyph. e.g. Suppose the module contains 181 instances of the name "Efraím" which has 6 characters. Suppose a
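
The composition vs decomposition issue can be shown with a short Python sketch using the standard `unicodedata` module. The two spellings of "Efraím" below are constructed for illustration, and the helper name `same_after_nfc` is hypothetical; nothing here is taken from a SWORD front-end.

```python
import unicodedata

# Precomposed: "í" is the single code point U+00ED (6 code points in total).
composed = "Efra\u00edm"
# Decomposed: "i" followed by U+0301 COMBINING ACUTE ACCENT (7 code points).
decomposed = "Efrai\u0301m"

def same_after_nfc(a, b):
    """Hypothetical helper: compare two strings after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

print(len(composed), len(decomposed))    # the code point counts differ
print(composed == decomposed)            # the raw strings do not compare equal
print(same_after_nfc(composed, decomposed))
```

Both strings render identically, yet a plain substring search for one would not find the other unless both are brought to the same normalization form first.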

Re: [sword-devel] Normalize the search string (comparing front-end apps)

2018-03-22 Thread DM Smith
It doesn’t matter that a search doesn’t use Lucene. The principle is the same. The search request has to be normalized to the same form as the searched text. For example, a case-insensitive search normalizes both to a single case. If it isn’t done, even on the fly, then search will fail at
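
The principle of normalizing the search request and the searched text the same way, even on the fly, might look like the following Python sketch. The names `normalize_for_search` and `count_matches` and the sample verses are hypothetical; this is not how any SWORD front-end or Lucene actually implements search, only an illustration of the idea.

```python
import unicodedata

def normalize_for_search(text, form="NFC"):
    """Hypothetical helper: one normalization applied to both sides."""
    # Unicode normalization first, then case folding for case-insensitivity.
    return unicodedata.normalize(form, text).casefold()

def count_matches(query, verses, form="NFC"):
    """Count occurrences of query in verses, normalizing on the fly."""
    key = normalize_for_search(query, form)
    return sum(normalize_for_search(v, form).count(key) for v in verses)

# Made-up sample text: one decomposed spelling, one upper-case precomposed.
verses = ["Efrai\u0301m went", "EFRA\u00cdM"]
print(count_matches("efra\u00edm", verses))
```

The choice of NFC here is arbitrary; what matters in this sketch is that the same function is applied to the query and to the text.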

Re: [sword-devel] Normalize the search string (comparing front-end apps)

2018-03-22 Thread David Haslam
Thanks DM, Not all searches make use of the Lucene index! e.g. In Xiphos, the advanced search panel gives the user a choice of which type of search. Lucene is only one of these mutually exclusive options. Btw, where is it documented that the creation of a Lucene search index normalizes the

Re: [sword-devel] Normalize the search string (comparing front-end apps)

2018-03-22 Thread DM Smith
The requirement is not that the search is normalized to NFC but rather that it is normalized the same as the index. This should not be a front-end issue. Btw, it doesn’t matter how Hebrew is stored in the module. Indexing should normalize it to a form that is internal to the engine. -- DM Smith

[sword-devel] Normalize the search string (comparing front-end apps)

2018-03-22 Thread David Haslam
Dear all, Not all front-ends automatically normalize the search string to Unicode NFC. e.g. - Eloquent does - Xiphos does not The data is incomplete for this feature in the table in our wiki page. https://wiki.crosswire.org/Choosing_a_SWORD_program#Search_and_Dictionary Please would other