We recommend that modules whose Unicode text is encoded as UTF-8 be normalized to NFC during module build or earlier.
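As a rough illustration of what NFC normalization does, here is a minimal sketch using Python's standard unicodedata module. The helper name normalize_nfc is hypothetical and not part of any SWORD API or front-end. The example assumes the Gurmukhi letter ZA, which can be entered either as one precomposed code point or as JA followed by the NUKTA sign.

```python
import unicodedata


def normalize_nfc(text: str) -> str:
    """Return text in Unicode Normalization Form C (hypothetical helper)."""
    return unicodedata.normalize("NFC", text)


# GURMUKHI LETTER ZA entered as a single precomposed code point.
za_precomposed = "\u0A5B"
# The same letter entered as GURMUKHI LETTER JA + GURMUKHI SIGN NUKTA.
za_with_nukta = "\u0A1C\u0A3C"

# The two spellings differ as raw code point sequences...
raw_match = za_precomposed == za_with_nukta
# ...but compare equal once both are normalized to NFC.
nfc_match = normalize_nfc(za_precomposed) == normalize_nfc(za_with_nukta)

print(raw_match, nfc_match)  # prints: False True
```

If the module text is normalized at build time and the search string is normalized the same way when it is entered, both spellings should match the same stored text. For this particular letter, NFC yields the JA + NUKTA sequence rather than the precomposed code point.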
For general background, see https://en.wikipedia.org/wiki/Unicode_equivalence#Normal_forms

I have just inserted a table row in this section of our wiki page:
https://crosswire.org/wiki/Choosing_a_SWORD_program#Search_and_Dictionary

The new row is for *Normalize the search string* and has an explanatory note.

Which front-ends are able to normalize the search string to NFC automatically when it is entered? So far, I have only filled in the cell for the *Xiphos* column. /The rest need editing/.

This matter arose in particular during some text development work on a Punjabi Bible, and relates to how six Gurmukhi Unicode letters are normalized. The topic is of much wider application, potentially for all [non-Roman] scripts that use combining characters and/or diacritics.

I observed: Not all front-ends have a hidden feature to automatically normalize the search string when it is entered. Users accustomed to writing the Gurmukhi letters *LLA SHA KHHA GHHA ZA FA* instead of the letters *LA SA KHA GA JA PHA* with a *NUKTA* sign will thus experience difficulties with the search feature.

Best regards,

David
