Troy A. Griffitts wrote:
Costas,
    A few comments...

Costas Stergiou wrote:

Hi David/Troy,
looking at the texts, I think there is some work to be done:
- remove any combining diacriticals & process everything as precomposed.


I think this is backwards. From my limited understanding, and from reading recent posts on sword-devel by people who know much more than I do, I think the text should be stored with no precomposed characters. If the renderer needs to send precomposed characters to the display control, it can precompose them (I think Sword can do this with an ICU filter).

In terms of combining characters vs. precomposed, all you really need to do is to remember to use a single normalization form. Unicode sort of informally suggests that NFC is best. W3C specifically recommends using NFC (see http://www.w3.org/TR/charmod-norm/). Roughly, NFC normalization consists of taking a string, decomposing all characters, then combining any codepoints that can be combined, provided the precomposed codepoints are not compatibility codepoints. The way to ensure that a string is NFC normalized is to just normalize it with something like the uconv program I mentioned.
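To make the decompose-then-recompose description concrete, here is a minimal sketch using Python's standard unicodedata module instead of uconv. The helper name to_nfc is just an illustration, not anything from Sword or ICU.

```python
import unicodedata


def to_nfc(text: str) -> str:
    """Return text in Normalization Form C (hypothetical helper name)."""
    return unicodedata.normalize("NFC", text)


# Alpha followed by a combining acute accent: two codepoints.
decomposed = "\u03b1\u0301"
# GREEK SMALL LETTER ALPHA WITH TONOS: one precomposed codepoint.
precomposed = "\u03ac"

# NFC combines the base letter and the combining mark.
assert to_nfc(decomposed) == precomposed

# Text that is already NFC comes back unchanged.
assert to_nfc(precomposed) == precomposed

# NFC never produces compatibility codepoints: "fi" is not turned
# into the U+FB01 ligature, and an existing ligature is left alone.
assert to_nfc("fi") == "fi"
assert to_nfc("\ufb01") == "\ufb01"
```

Whichever form is chosen, the point is the same: run every text through one normalizer so that the same letter is always stored as the same codepoint sequence.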


I really don't know whether Extended Greek is NFC or not. So the last step before creating the Sword module should be normalization.
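One way to find out for a particular text is simply to test it. The sketch below, again with Python's unicodedata module, lists the lines that would change under NFC. The function names and the two sample strings are made up for illustration; the second sample uses U+1F79 (omicron with oxia, from the Greek Extended block), which NFC replaces with U+03CC (omicron with tonos).

```python
import unicodedata


def is_nfc(text: str) -> bool:
    """True if NFC normalization would leave text unchanged."""
    return unicodedata.normalize("NFC", text) == text


def non_nfc_lines(lines):
    """Return (line_number, line) pairs for lines that are not NFC."""
    return [(n, line) for n, line in enumerate(lines, start=1)
            if not is_nfc(line)]


# Hypothetical sample data: the same word spelled two ways.
sample = [
    "\u03bb\u03cc\u03b3\u03bf\u03c2",  # omicron with tonos (U+03CC)
    "\u03bb\u1f79\u03b3\u03bf\u03c2",  # omicron with oxia (U+1F79)
]

offenders = non_nfc_lines(sample)
for number, line in offenders:
    print(f"line {number} is not NFC: {line!r}")
```

If a check like this reports anything, normalizing the whole source as the last step before building the module should take care of it.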


--Chris

_______________________________________________
sword-devel mailing list
[EMAIL PROTECTED]
http://www.crosswire.org/mailman/listinfo/sword-devel
