Re: [NTG-context] Adding built-in support for Serbian language
Hi Mojca,

> \input hyph-sh-latn.tex
> \input hyph-sh-cyrl.tex
>
> That is: it loads both patterns at the same time.
>
> Hans, would you be willing to merge two sets of hyphenation patterns
> together? Alternatively maybe we could prepare hyph-sh.pat.txt on the
> hyph-utf8 side? I'm actually not sure why we didn't do that already,
> but maybe it was because we have two sets of cyrillic patterns and it
> has never been a clear cut which ones to take.

I think that a merged file is the most natural approach (isn't it "sr" instead of "sh"?). I can of course add all kinds of code for merging, but at some point I guess a merged file will be used anyway.

Hans

-
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-
___
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage : http://www.pragma-ade.nl / http://context.aanhet.net
archive : https://bitbucket.org/phg/context-mirror/commits/
wiki : http://contextgarden.net
___
Re: [NTG-context] Adding built-in support for Serbian language
On 30.10.2020 at 16:42, Mojca Miklavec wrote:
> Would you be willing to also prepare the latin one then?
> The codes should be sorted out by Hans (potentially with some help),
> but we definitely want to use "sr-latn" and "sr-cyrl".

Here is the lang-txt.lua diff with the labels transliterated from Serbian Cyrillic to Latin script. The other files are essentially the same; only the codes need to be sorted out. The language definition stays the same.

Regards,
Ivan

Serbian-Latn.7z
Description: Binary data
Re: [NTG-context] Adding built-in support for Serbian language
> (PS: I would say that adding support for transliteration of the text
> from one script to the other would be a really nice feature. Then you
> could type your text for a book once and have it typeset in both
> versions without any extra effort :)

There is Philipp Gesang's transliterator package:

https://gitlab.com/phgsng/transliterator
https://modules.contextgarden.net/cgi-bin/module.cgi/ruid=199735311/action=view/id=50

Cheers,
Henri
Re: [NTG-context] Adding built-in support for Serbian language
On 10/30/2020 8:18 PM, Ivan Pešić wrote:
> Dear Mojca,
>
> On 30.10.2020 at 19:10, Mojca Miklavec wrote:
> > Would you be willing to also prepare the latin one then?
> > The codes should be sorted out by Hans (potentially with some help),
> > but we definitely want to use "sr-latn" and "sr-cyrl".
>
> Sure, I will tomorrow create a transliteration to latin script and post
> diffs here. What you propose is in fact already used in some other
> places, I agree with you.
>
> > For the longer names there is some more freedom. LaTeX uses "serbianl"
> > and "serbianc", I think, but I believe we can come up with something
> > nicer.
> > Maybe something along the lines of the following?
> > \mainlanguage[serbian][script=latn]
> > or
> > \mainlanguage[serbian-latin]
> > \mainlanguage[serbian-cyrillic]
> > No clue, really.
> >
> > Thank you,
> > Mojca
> >
> > (PS: I would say that adding support for transliteration of the text
> > from one script to the other would be a really nice feature. Then you
> > could type your text for a book once and have it typeset in both
> > versions without any extra effort :)
>
> As for transliteration, cyrillic to latin is one-to-one, straightforward
> with no exceptions. A simple table lookup is enough. Going from latin to
> cyrillic, there are some exceptions, but we could solve that. I can
> provide Hans with all that is needed.

ok. there's quite some code already present in the core that we can use, so it's no big deal to do it

also think of additional things you want (some tracing?)

Hans
Re: [NTG-context] Adding built-in support for Serbian language
Dear Mojca,

On 30.10.2020 at 19:10, Mojca Miklavec wrote:
> Would you be willing to also prepare the latin one then?
> The codes should be sorted out by Hans (potentially with some help),
> but we definitely want to use "sr-latn" and "sr-cyrl".

Sure, tomorrow I will create a transliteration to Latin script and post the diffs here. What you propose is in fact already used in some other places, and I agree with you.

> For the longer names there is some more freedom. LaTeX uses "serbianl"
> and "serbianc", I think, but I believe we can come up with something
> nicer.
> Maybe something along the lines of the following?
> \mainlanguage[serbian][script=latn]
> or
> \mainlanguage[serbian-latin]
> \mainlanguage[serbian-cyrillic]
> No clue, really.
>
> Thank you,
> Mojca
>
> (PS: I would say that adding support for transliteration of the text
> from one script to the other would be a really nice feature. Then you
> could type your text for a book once and have it typeset in both
> versions without any extra effort :)

As for transliteration, Cyrillic to Latin is one-to-one and straightforward, with no exceptions. A simple table lookup is enough. Going from Latin to Cyrillic, there are some exceptions, but we could solve that. I can provide Hans with all that is needed.

Ivan
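A minimal sketch of the table lookup described above, written in Python purely for illustration (it is not ConTeXt code, and `CYR_TO_LAT` and `to_latin` are hypothetical names). The letter table is the standard Serbian Cyrillic to Latin correspondence. How capital Љ, Њ and Џ are cased ("Lj" versus "LJ", chosen from the neighbouring letters) is an assumption of this sketch and was not settled in the thread.

```python
# Serbian Cyrillic -> Latin, lowercase only; uppercase forms are derived in to_latin().
CYR_TO_LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ђ": "đ",
    "е": "e", "ж": "ž", "з": "z", "и": "i", "ј": "j", "к": "k",
    "л": "l", "љ": "lj", "м": "m", "н": "n", "њ": "nj", "о": "o",
    "п": "p", "р": "r", "с": "s", "т": "t", "ћ": "ć", "у": "u",
    "ф": "f", "х": "h", "ц": "c", "ч": "č", "џ": "dž", "ш": "š",
}


def to_latin(text: str) -> str:
    """Transliterate Serbian Cyrillic to Latin; other characters pass through."""
    out = []
    for i, ch in enumerate(text):
        lower = ch.lower()
        lat = CYR_TO_LAT.get(lower)
        if lat is None:
            out.append(ch)   # not a Serbian Cyrillic letter: keep as-is
        elif ch == lower:
            out.append(lat)  # lowercase letter: plain table lookup
        else:
            # Assumption of this sketch: inside an all-caps run a digraph
            # becomes "LJ", otherwise it is title-cased as "Lj".
            nxt = text[i + 1:i + 2]
            prev = text[i - 1] if i > 0 else ""
            in_caps = nxt.isupper() or (not nxt.isalpha() and prev.isupper())
            out.append(lat.upper() if in_caps else lat.capitalize())
    return "".join(out)


print(to_latin("Љубав"))   # Ljubav
print(to_latin("ЉУБАВ"))   # LJUBAV
```

Only the three letters that map to digraphs need the casing decision; for every other letter the lookup really is one character in, one character out.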
Re: [NTG-context] Adding built-in support for Serbian language
On 10/30/2020 1:42 PM, Mojca Miklavec wrote:
> I would say that this needs to be changed/improved. There's no reason
> why it wouldn't load both scripts at the same time (at least for
> Unicode engines, which is the only thing that's currently supported
> anyway).

i'll look into it once i finished some new stuff (in the middle of fit)

> (PS: I would say that adding support for transliteration of the text
> from one script to the other would be a really nice feature. Then you
> could type your text for a book once and have it typeset in both
> versions without any extra effort :)

just gimme the specs ... sounds like some nice distraction for a rainy weekend

Hans
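As one possible starting point for such specs, here is a minimal Python sketch of the Latin to Cyrillic direction. It is an illustration only, not ConTeXt code, and all names in it (`LAT_TO_CYR`, `DIGRAPHS`, `SPLIT_STEMS`, `to_cyrillic`) are hypothetical. The difficulty it shows is that "lj", "nj" and "dž" normally stand for a single Cyrillic letter, but in some words they are two separate letters; this is presumably the kind of exception mentioned elsewhere in this thread. The stem list is a tiny illustrative sample, not a complete list, and the sketch handles lowercase input only.

```python
# Serbian Latin -> Cyrillic, single letters (lowercase only in this sketch).
LAT_TO_CYR = {
    "a": "а", "b": "б", "c": "ц", "č": "ч", "ć": "ћ", "d": "д", "đ": "ђ",
    "e": "е", "f": "ф", "g": "г", "h": "х", "i": "и", "j": "ј", "k": "к",
    "l": "л", "m": "м", "n": "н", "o": "о", "p": "п", "r": "р", "s": "с",
    "š": "ш", "t": "т", "u": "у", "v": "в", "z": "з", "ž": "ж",
}

# Letter pairs that normally map to one Cyrillic letter.
DIGRAPHS = {"lj": "љ", "nj": "њ", "dž": "џ"}

# Illustrative, incomplete sample of stems in which the pair is NOT a digraph
# (injekcija, konjugacija, konjunktura, nadživeti).
SPLIT_STEMS = ("injek", "konjug", "konjunk", "nadživ")


def to_cyrillic(text: str) -> str:
    """Transliterate lowercase Serbian Latin to Cyrillic with stem exceptions."""
    out = []
    i = 0
    while i < len(text):
        stem = next((s for s in SPLIT_STEMS if text.startswith(s, i)), None)
        if stem is not None:
            # Exception: convert the stem letter by letter, no digraphs.
            out.extend(LAT_TO_CYR[c] for c in stem)
            i += len(stem)
        elif text[i:i + 2] in DIGRAPHS:
            out.append(DIGRAPHS[text[i:i + 2]])
            i += 2
        else:
            # Single letter; anything outside the table passes through.
            out.append(LAT_TO_CYR.get(text[i], text[i]))
            i += 1
    return "".join(out)


print(to_cyrillic("konj"))       # коњ
print(to_cyrillic("injekcija"))  # инјекција
```

A real implementation would need a vetted exception list and proper case handling; a word list, rather than stems, might turn out to be the better data structure.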
Re: [NTG-context] Adding built-in support for Serbian language
Dear Ivan,

On Fri, 30 Oct 2020 at 11:32, Ivan Pešić wrote:
>
> Hello all,
> I have recently started using ConTeXt.

Welcome!

> I've found that the distribution
> includes a proper (cyrillic) hyphenation file for Serbian language,

I would say that this needs to be changed/improved. There's no reason why it wouldn't load both scripts at the same time (at least for Unicode engines, which is the only thing that's currently supported anyway).

This is what XeTeX loads, for example:
https://github.com/hyphenation/tex-hyphen/blob/master/hyph-utf8/tex/generic/hyph-utf8/loadhyph/loadhyph-sr-latn.tex#L25

\input hyph-sh-latn.tex
\input hyph-sh-cyrl.tex

That is: it loads both patterns at the same time.

Hans, would you be willing to merge two sets of hyphenation patterns together? Alternatively maybe we could prepare hyph-sh.pat.txt on the hyph-utf8 side? I'm actually not sure why we didn't do that already, but maybe it was because we have two sets of cyrillic patterns and it has never been a clear cut which ones to take.

The author of hyph-sh-[latn|cyrl] says that his patterns should work universally for multiple languages (they are relatively old), but they were initially only released for the Latin script. Later another author wanted to have support for Cyrillic script and prepared his own patterns (I'm no longer sure whether they were partially based on the other ones) without the Latin alternative.

In Xe(La)TeX and Lua(La)TeX we use the "sh" patterns for both, for consistency reasons, among others. (You likely want the same word to be hyphenated in the same way in both scripts).

> but a complete language support is still not implemented. Therefore,
> I've added what I think is required, did some testing by putting changed
> files in my texmf-local, and the result looks fine.

Awesome, thank you.

> There is only one thing that requires a decision from the development team.
> Serbian language uses two scripts: cyrillic and latin. Context language
> codes are using 2 letters for identification. So I'm not sure how to
> include both scripts.

(Unless there are plans to transliterate the translations on the fly :) there should be two independent files. One should use the code sr-latn and the other one sr-cyrl. A two-letter code simply doesn't work in this situation, and we should not even try to support one single script, or even attempt to decide which one should be the default one. Both should be supported equally well.

> What I'm sending now is a cyrillic script implementation, using the code
> "sr".
>
> It is trivial to generate (completely automatic) latin script version of
> these changes, once it is decided how to label it.

Would you be willing to also prepare the latin one then?
The codes should be sorted out by Hans (potentially with some help), but we definitely want to use "sr-latn" and "sr-cyrl".

For the longer names there is some more freedom. LaTeX uses "serbianl" and "serbianc", I think, but I believe we can come up with something nicer.
Maybe something along the lines of the following?
\mainlanguage[serbian][script=latn]
or
\mainlanguage[serbian-latin]
\mainlanguage[serbian-cyrillic]
No clue, really.

Thank you,
Mojca

(PS: I would say that adding support for transliteration of the text from one script to the other would be a really nice feature. Then you could type your text for a book once and have it typeset in both versions without any extra effort :)
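The idea of a merged pattern file can be sketched as follows, in Python and purely as an illustration. It assumes that each pattern set is available as a list of pattern strings, one pattern per entry; `merge_patterns` is a hypothetical helper, and the sample patterns are made up for the example and are not taken from the real hyph-sh files. Because the Latin and Cyrillic alphabets do not overlap, the merged set is simply the concatenation of the two, which has the same effect as loading both files one after the other.

```python
def merge_patterns(latin: list[str], cyrillic: list[str]) -> list[str]:
    """Concatenate two pattern lists, dropping blank entries and exact
    duplicates while keeping first-seen order."""
    seen = set()
    merged = []
    for pattern in [*latin, *cyrillic]:
        pattern = pattern.strip()
        if pattern and pattern not in seen:
            seen.add(pattern)
            merged.append(pattern)
    return merged


# Made-up sample patterns, NOT real hyph-sh data.
latin_sample = ["a1b", "1ca", ""]
cyrillic_sample = ["а1б", "1ца"]

print(merge_patterns(latin_sample, cyrillic_sample))
```

Which of the two existing Cyrillic pattern sets should go into such a merge is the open question raised above; the sketch does not address it.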
[NTG-context] Adding built-in support for Serbian language
Apologies, I forgot to attach the file :$

Here it is.

Ivan

serbian.7z
Description: Binary data
[NTG-context] Adding built-in support for Serbian language
Hello all,

I have recently started using ConTeXt. I've found that the distribution includes a proper (Cyrillic) hyphenation file for the Serbian language, but complete language support is still not implemented. Therefore, I've added what I think is required and did some testing by putting the changed files in my texmf-local, and the result looks fine.

There is only one thing that requires a decision from the development team. The Serbian language uses two scripts: Cyrillic and Latin. ConTeXt language codes use two letters for identification, so I'm not sure how to include both scripts.

What I'm sending now is a Cyrillic script implementation, using the code "sr".

It is trivial to generate (completely automatically) a Latin script version of these changes, once it is decided how to label it.

Best regards,
Ivan