On 6/2/2013 7:44 AM, David Haslam wrote:
Some alphabets make use of a character that in other languages is normally
classed as a punctuation mark.

Examples are many, but here's a verse in *Tongan*, a language where the *ʻ
(fakauʻa)* occurs very frequently as the character for a glottal stop.
This should be written with the modifier letter turned comma (Unicode
U+02BB), and not some other character, even though it looks a bit like
the inverted curly apostrophe.

NAʻE fakatupu ʻe he ʻOtua ʻi he kamataʻanga ʻa e langi mo māmani.

How should the search feature in SWORD be tailored to support modules for
such languages?
Is this even possible, or would it require an enhancement to one of our
library components?

Why don't you investigate and find out? You're essentially posting a request that someone else go look for a bug that you think could potentially exist. If it matters to you, go look for it and report.

Has this topic ever been discussed before?

NB. In this example for Tongan, it's conceivable that, provided the right
codepoint is used, SWORD may already handle it correctly,
but there are other languages in which the ordinary apostrophe is used for
the same sound.

We don't need to support incorrectly encoded content. If someone encoded letters with codepoints that have punctuation properties, the error is in the encoding and nothing should be done to SWORD. (And, yes, we have discussed this in the past.)
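
For anyone who wants to check the properties in question, here is a minimal sketch in Python using only the standard library. It is not SWORD's search code and says nothing about how SWORD tokenizes; the helper names (general_category, word_tokens) are made up for illustration. It only shows that U+02BB carries a letter property while the ASCII apostrophe and the left single quotation mark carry punctuation properties, and how a generic Unicode-aware word matcher treats each.

```python
import re
import unicodedata

# Characters that can look alike in print. Only the first is the fakau'a.
CANDIDATES = {
    "modifier letter turned comma": "\u02bb",
    "ASCII apostrophe": "\u0027",
    "left single quotation mark": "\u2018",
}


def general_category(ch):
    """Return the Unicode general category, e.g. 'Lm' (letter) or 'Po' (punctuation)."""
    return unicodedata.category(ch)


def word_tokens(text):
    """Split text into runs of Unicode word characters (hypothetical stand-in
    for a search tokenizer; this is NOT what SWORD actually does)."""
    return re.findall(r"\w+", text)


for label, ch in CANDIDATES.items():
    print(f"U+{ord(ch):04X} {label}: {general_category(ch)}")

# Start of the verse quoted above, once with U+02BB and once with the
# ASCII apostrophe substituted for it.
correct = "NA\u02bbE fakatupu \u02bbe he \u02bbOtua"
wrong = correct.replace("\u02bb", "'")

print(word_tokens(correct))  # words keep their glottal stop
print(word_tokens(wrong))    # words are split or lose the leading character
```

With the correct codepoint the words stay whole; with the apostrophe, "NAʻE" breaks into two tokens and the leading glottal stop is dropped from the others, which is the encoding error described above rather than a search bug.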

--Chris

