Mojca Miklavec <[email protected]> wrote in <[email protected]> on Tue, 7 Jun 2011 00:50:10 +0200:
[...]
> However now a semi-serious problem arises. If you try to typeset some
> text in both scripts and if we use hyph-sh-latn + hyph-sr-cyrl then
> you will end up in different hyphenation.

The bigger problem is that hyph-sr-latn and hyph-sr-cyrl differ (and
both make errors).

Take a look at this concrete example. This is an excerpt from my list
of hyphenation exceptions for Serbian Cyrillic. I added a Latin
equivalent of every word. This is what you get with sr-latn and sr-cyrl
loaded together. As you can see, Latin version behaves much better. At
least for the words from this list, there is no difference between
sh-latn and sr-latn.

--------------------------------------
шта-мпан            incorrect
štam-pan            correct
--------------------------------------
на-сто-ја-тељ       correct
na-sto-ja-te-lj     incorrect
--------------------------------------
српског             incorrect
srp-skog            correct
--------------------------------------
српско              incorrect
srp-sko             correct
--------------------------------------
српске              incorrect
srp-ske             correct
--------------------------------------
српском             incorrect
srp-skom            correct
--------------------------------------
српски              incorrect
srp-ski             correct
--------------------------------------
ак-ту-е-лност       incorrect
ak-tu-el-nost       correct
--------------------------------------
мо-ра-лни           incorrect
mo-ral-ni           correct
--------------------------------------
Ка-рлов-ци          incorrect
Kar-lov-ci          correct
--------------------------------------
за-хва-лност        incorrect
za-hval-nost        correct
--------------------------------------
по-ма-њка-ње        incorrect
po-manj-ka-nje      correct
--------------------------------------
Пса-лтир            incorrect
Psal-tir            correct
--------------------------------------
ин-те-рвју          incorrect
in-ter-vju          correct
--------------------------------------
ко-нвен-ци-је       incorrect
kon-ven-ci-je       correct
--------------------------------------
ку-лтур-ног         incorrect
kul-tur-nog         correct
--------------------------------------
шта-мпа-ње          incorrect
štam-pa-nje         correct
--------------------------------------
ба-њски             incorrect
banj-ski            correct
--------------------------------------
шта-мпа-ри-ја       incorrect
štam-pa-ri-ja       correct
--------------------------------------
ре-а-лка            incorrect
re-al-ka            correct
--------------------------------------
шко-лство           incorrect
škol-stvo           correct
--------------------------------------
би-лтен             incorrect
bil-ten             correct
--------------------------------------
су-мња-ти           incorrect
sum-nja-ti          correct
--------------------------------------
су-мња              incorrect
sum-nja             correct
--------------------------------------

> This might have been another reason why the serbian hyphenation
> patterns were disabled. I simply have no idea which ones should be
> included and why one set of patterns would be better than the other. I
> would prefer if we would either create hyph-sr-latn (transliteration
> from hyph-sr-cyrl) or include hyph-sh-latn+hyph-sh-cyrl.

I hope that the above example might be helpful.

> (We can prepare a longer document with hyphenation points colored, so
> that one could compare differences between hyph-sr-cyrl and
> hyph-sh-cyrl, but I'm not qualified to judge about differences and
> which ones are right or wrong.)

Please do. That would be great.

Best wishes,

-- 
Nikola Lečić = Никола Лечић
fingerprint : FEF3 66AF C90E EDC3 D878 7CDC 956D F4AB A377 1C9B
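
A comparison like the one in the list above could be automated. The following is only a minimal sketch, not part of the original message: it does not run any TeX patterns, it only takes two already hyphenated strings (one Cyrillic, one Latin) and checks whether the break points fall at the same places once the Cyrillic word is transliterated. The function names (to_latin, break_points, same_hyphenation) are made up for illustration; the letter table is the usual Serbian Cyrillic to Latin correspondence, in which љ, њ and џ become the digraphs lj, nj and dž.

```python
# Serbian Cyrillic -> Latin. Note that љ, њ and џ map to two Latin characters.
CYR_TO_LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ђ": "đ", "е": "e",
    "ж": "ž", "з": "z", "и": "i", "ј": "j", "к": "k", "л": "l", "љ": "lj",
    "м": "m", "н": "n", "њ": "nj", "о": "o", "п": "p", "р": "r", "с": "s",
    "т": "t", "ћ": "ć", "у": "u", "ф": "f", "х": "h", "ц": "c", "ч": "č",
    "џ": "dž", "ш": "š",
}


def to_latin(word):
    """Transliterate a (possibly hyphenated) Cyrillic word to Latin."""
    out = []
    for ch in word:
        lat = CYR_TO_LAT.get(ch.lower())
        if lat is None:
            out.append(ch)                # hyphens and unmapped characters pass through
        elif ch.isupper():
            out.append(lat.capitalize())  # Љ -> Lj, К -> K
        else:
            out.append(lat)
    return "".join(out)


def break_points(hyphenated):
    """Return the set of character offsets (hyphens not counted) where a break is marked."""
    points = set()
    offset = 0
    for ch in hyphenated:
        if ch == "-":
            points.add(offset)
        else:
            offset += 1
    return points


def same_hyphenation(cyr, lat):
    """True if the Cyrillic and Latin strings mark breaks at the same places."""
    return break_points(to_latin(cyr)) == break_points(lat)


if __name__ == "__main__":
    pairs = [
        ("шта-мпан", "štam-pan"),
        ("на-сто-ја-тељ", "na-sto-ja-te-lj"),
        ("су-мња", "su-mnja"),  # hypothetical agreeing pair, not taken from the list
    ]
    for cyr, lat in pairs:
        verdict = "same" if same_hyphenation(cyr, lat) else "DIFFERENT"
        print(f"{cyr}  ->  {to_latin(cyr)}  vs  {lat}: {verdict}")
```

Offsets are counted in Latin characters after transliteration, so a break inside a digraph (as in na-sto-ja-te-lj) shows up as an extra break point that has no Cyrillic counterpart.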
