A further idiosyncrasy of the UTF8GreekAccents filter that proves to be an interesting clue:
It changes U+00BE VULGAR FRACTION THREE QUARTERS ¾ to ordinary 3/4. Vulgar fractions are about as far as you can get from Koine Greek, aren't they?

This is what I think this proves: the filter must first convert the text to either NFKD or NFKC as a prelude to removing the "accents". These are the two compatibility forms among the four Unicode normalization forms (NFD and NFC being the canonical ones), and only the compatibility forms break ¾ apart. It should in theory then renormalize to NFC after the Greek accents have been removed. With the diacritics gone, that step would only matter if some non-Greek composite characters had also been present in the original module text.

This particular example is significant because once you've got "3/4", no amount of renormalization to NFC will change it back to the special Unicode vulgar fraction ¾. Some aspects of Unicode normalization cannot be reversed.

Who'd have thought that my suggestion to use this vulgar fraction character in one single verse of a Punjabi Bible could later prove to be useful evidence in the case for the prosecution?

II Chronicles 1:17:
ਸੁਲੇਮਾਨ ਦੇ ਵਿਉਪਾਰੀ ਮਿਸਰ ਤੋਂ ਇੱਕ ਰੱਥ ਚਾਂਦੀ ਦੇ 15 ਪੌਂਡ ਦਾ ਅਤੇ ਇੱਕ ਘੋੜਾ ਚਾਂਦੀ ਦੇ 3¾ ਪੌਂਡ ਦਾ ਖਰੀਦਦੇ ਸਨ । ਫ਼ੇਰ ਉਨ੍ਹਾਂ ਨੇ ਇਹ ਘੋੜੇ ਅਤੇ ਰੱਥ ਹਿੱਤੀ ਲੋਕਾਂ ਦੇ ਰਾਜਿਆਂ ਅਤੇ ਆਰਾਮ ਦੇ ਰਾਜਿਆਂ ਨੂੰ ਵੇਚ ਦਿੱਤੇ ।

Well there we are, you see. :)

David

--
View this message in context: http://sword-dev.350566.n4.nabble.com/GlobalOptionFilter-UTF8GreekAccents-and-non-Greek-modules-tp4656719p4656747.html
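
For anyone who wants to check the normalization argument above, here is a minimal sketch in Python. It is not the SWORD filter's code: `strip_marks_via` is a hypothetical stand-in that normalizes with Python's standard `unicodedata` module and then drops combining marks, on the assumption that the real filter does something broadly similar. Note that in Python the compatibility decomposition of ¾ uses U+2044 FRACTION SLASH rather than the ASCII solidus, though the two look much alike in most fonts.

```python
import unicodedata

THREE_QUARTERS = "\u00be"                 # ¾ VULGAR FRACTION THREE QUARTERS
LOGOS = "\u03bb\u03cc\u03b3\u03bf\u03c2"  # λόγος, with a precomposed accented omicron


def strip_marks_via(form, text):
    """Hypothetical stand-in for the filter: normalize, then drop combining marks."""
    decomposed = unicodedata.normalize(form, text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Canonical decomposition is enough to remove the Greek accent...
greek_plain = strip_marks_via("NFD", LOGOS)

# ...and it leaves the vulgar fraction untouched.
canonical = strip_marks_via("NFD", THREE_QUARTERS)

# Compatibility decomposition breaks the fraction into digit, slash, digit.
compat = strip_marks_via("NFKD", THREE_QUARTERS)

# Renormalizing to NFC afterwards does not bring the original character back.
round_trip = unicodedata.normalize("NFC", compat)

print([hex(ord(ch)) for ch in compat])
print(round_trip == THREE_QUARTERS)
```

If the filter only needed to strip Greek accents, the canonical form would suffice, which is why a changed ¾ points to a compatibility form being used.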