Dear Ivan,

On Fri, 30 Oct 2020 at 11:32, Ivan Pešić wrote:
>
> Hello all,
> I have recently started using ConTeXt.

Welcome!

> I've found that the distribution
> includes a proper (cyrillic) hyphenation file for Serbian language,

I would say that this needs to be changed/improved.
There's no reason why it shouldn't load the patterns for both scripts
at the same time (at least for Unicode engines, which are the only
ones currently supported anyway).

This is what XeTeX loads, for example:

https://github.com/hyphenation/tex-hyphen/blob/master/hyph-utf8/tex/generic/hyph-utf8/loadhyph/loadhyph-sr-latn.tex#L25

    \input hyph-sh-latn.tex
    \input hyph-sh-cyrl.tex

That is: it loads both patterns at the same time.

Hans, would you be willing to merge the two sets of hyphenation patterns?
Alternatively, maybe we could prepare hyph-sh.pat.txt on the hyph-utf8 side?
I'm actually not sure why we didn't do that already, but maybe it was
because we have two sets of Cyrillic patterns and it has never been
clear-cut which ones to take.

The author of hyph-sh-[latn|cyrl] says that his patterns should work
universally for multiple languages (they are relatively old), but they
were initially released only for the Latin script. Later, another
author wanted to have support for Cyrillic script and prepared his own
patterns (I'm no longer sure whether they were partially based on the
other ones) without the Latin alternative.

In Xe(La)TeX and Lua(La)TeX we use the "sh" patterns for both scripts,
for consistency reasons, among others. (You likely want the same word
to be hyphenated in the same way in both scripts.)

> but a complete language support is still not implemented. Therefore,
> I've added what I think is required, did some testing by putting changed
> files in my texmf-local, and the result looks fine.

Awesome, thank you.

> There is only one thing that requires a decision from the development team.
> Serbian language uses two scripts: cyrillic and latin. Context language
> codes are using 2 letters for identification. So I'm not sure how to
> include both scripts.

(Unless someone has plans to transliterate the translations on the fly :)
there should be two independent files. One should use the code sr-latn
and the other one sr-cyrl.

A two-letter code simply doesn't work in this situation, and we should
not even try to support just a single script, or attempt to decide
which one should be the default. Both should be supported equally
well.

> What I'm sending now is a cyrillic script implementation, using the code
> "sr".
>
> It is trivial to generate (completely automatic) latin script version of
> these changes, once it is decided how to label it.

Would you be willing to also prepare the Latin one, then?
The codes should be sorted out by Hans (potentially with some help),
but we definitely want to use "sr-latn" and "sr-cyrl".

For the longer names there is some more freedom. LaTeX uses "serbianl"
and "serbianc", I think, but I believe we can come up with something
nicer.
Maybe something along the lines of the following?

    \mainlanguage[serbian][script=latn]

or

    \mainlanguage[serbian-latin]
    \mainlanguage[serbian-cyrillic]

No clue, really.

Thank you,
    Mojca

(PS: I would say that adding support for transliteration of the text
from one script to the other would be a really nice feature. Then you
could type your text for a book once and have it typeset in both
versions without any extra effort :)
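
To make the transliteration idea a bit more concrete, here is a minimal
sketch of the Cyrillic-to-Latin direction. It is written in Python purely
for illustration; the function name and the mapping table are my own
assumptions and are not part of ConTeXt or hyph-utf8. The one subtle point
is that љ, њ and џ become two Latin letters, so their capitalisation
depends on the neighbouring letters.

```python
# Illustrative sketch only: Serbian Cyrillic -> Latin transliteration.
# The names below are hypothetical and not taken from any TeX distribution.
CYR_TO_LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ђ": "đ", "е": "e",
    "ж": "ž", "з": "z", "и": "i", "ј": "j", "к": "k", "л": "l", "љ": "lj",
    "м": "m", "н": "n", "њ": "nj", "о": "o", "п": "p", "р": "r", "с": "s",
    "т": "t", "ћ": "ć", "у": "u", "ф": "f", "х": "h", "ц": "c", "ч": "č",
    "џ": "dž", "ш": "š",
}


def cyrillic_to_latin(text: str) -> str:
    """Transliterate Serbian Cyrillic letters, leaving everything else as is."""
    out = []
    for i, ch in enumerate(text):
        lower = ch.lower()
        lat = CYR_TO_LAT.get(lower)
        if lat is None:
            # Not a Serbian Cyrillic letter: copy it through unchanged.
            out.append(ch)
            continue
        if ch != lower:
            # Uppercase input. A digraph is fully uppercased (LJ) only when
            # a neighbouring letter is uppercase too; otherwise title case (Lj).
            prev = text[i - 1] if i > 0 else ""
            nxt = text[i + 1] if i + 1 < len(text) else ""
            if len(lat) == 2 and (prev.isupper() or nxt.isupper()):
                lat = lat.upper()
            else:
                lat = lat.capitalize()
        out.append(lat)
    return "".join(out)


print(cyrillic_to_latin("Љубљана"))
```

The reverse direction is not a plain table lookup: Latin "nj", "lj" and
"dž" are usually one Cyrillic letter each but occasionally two separate
ones (as in "injekcija"), so it would need a list of exceptions.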
