Re: [NTG-context] Adding built-in support for Serbian language

2020-11-03 Thread Hans Hagen

Hi Mojca,


> \input hyph-sh-latn.tex
> \input hyph-sh-cyrl.tex
> That is: it loads both patterns at the same time.
>
> Hans, would you be willing to merge two sets of hyphenation patterns together?
> Alternatively maybe we could prepare hyph-sh.pat.txt on the hyph-utf8 side?
> I'm actually not sure why we didn't do that already, but maybe it was
> because we have two sets of cyrillic patterns and it has never been
> clear-cut which ones to take.


I think that a merged file is the most natural approach (isn't it "sr"
instead of "sh"?). I can of course add all kinds of code for merging, but
at some point I guess a merged file will be used anyway.

 Hans


-
  Hans Hagen | PRAGMA ADE
  Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
   tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-
___
If your question is of interest to others as well, please add an entry to the 
Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://context.aanhet.net
archive  : https://bitbucket.org/phg/context-mirror/commits/
wiki : http://contextgarden.net
___


Re: [NTG-context] Adding built-in support for Serbian language

2020-10-30 Thread Ivan Pešić

On 30.10.2020 at 16:42, Mojca Miklavec wrote:
> Would you be willing to also prepare the latin one then?
> The codes should be sorted out by Hans (potentially with some help),
> but we definitely want to use "sr-latn" and "sr-cyrl".
Here is the lang-txt.lua diff with labels transliterated from Serbian
Cyrillic to Latin script.
The other files are essentially unchanged; only the codes need to be sorted out.
The language definition stays the same.

Regards,
Ivan


Serbian-Latn.7z
Description: Binary data


Re: [NTG-context] Adding built-in support for Serbian language

2020-10-30 Thread Henri Menke
> (PS: I would say that adding support for transliteration of the text
> from one script to the other would be a really nice feature. Then you
> could type your text for a book once and have it typeset in both
> versions without any extra effort :)

There is Philipp Gesang's transliterator package:
https://gitlab.com/phgsng/transliterator
https://modules.contextgarden.net/cgi-bin/module.cgi/ruid=199735311/action=view/id=50

Cheers, Henri


Re: [NTG-context] Adding built-in support for Serbian language

2020-10-30 Thread Hans Hagen

On 10/30/2020 8:18 PM, Ivan Pešić wrote:
> Dear Mojca,
>
> On 30.10.2020 at 19:10, Mojca Miklavec wrote:
>> Would you be willing to also prepare the latin one then?
>> The codes should be sorted out by Hans (potentially with some help),
>> but we definitely want to use "sr-latn" and "sr-cyrl".
>
> Sure, tomorrow I will create a transliteration to Latin script and post
> the diffs here.
> What you propose is in fact already used in some other places; I agree
> with you.
>
>> For the longer names there is some more freedom. LaTeX uses "serbianl"
>> and "serbianc", I think, but I believe we can come up with something
>> nicer.
>> Maybe something along the lines of the following?
>>     \mainlanguage[serbian][script=latn]
>> or
>>     \mainlanguage[serbian-latin]
>>     \mainlanguage[serbian-cyrillic]
>> No clue, really.
>>
>> Thank you,
>> Mojca
>>
>> (PS: I would say that adding support for transliteration of the text
>> from one script to the other would be a really nice feature. Then you
>> could type your text for a book once and have it typeset in both
>> versions without any extra effort :)
>
> As for transliteration, cyrillic to latin is one-to-one, straightforward
> with no exceptions.
> A simple table lookup is enough.
> Going from latin to cyrillic, there are some exceptions, but we could
> solve that.
> I can provide Hans with all that is needed.


ok. there's quite some code already present in the core that we can use 
so it's no big deal to do it


also think of additional things you want (some tracing?)

Hans



Re: [NTG-context] Adding built-in support for Serbian language

2020-10-30 Thread Ivan Pešić
Dear Mojca,

On 30.10.2020 at 19:10, Mojca Miklavec wrote:
> Would you be willing to also prepare the latin one then?
> The codes should be sorted out by Hans (potentially with some help),
> but we definitely want to use "sr-latn" and "sr-cyrl".
Sure, tomorrow I will create a transliteration to Latin script and post
the diffs here.
What you propose is in fact already used in some other places; I agree
with you.
> For the longer names there is some more freedom. LaTeX uses "serbianl"
> and "serbianc", I think, but I believe we can come up with something
> nicer.
> Maybe something along the lines of the following?
>     \mainlanguage[serbian][script=latn]
> or
>     \mainlanguage[serbian-latin]
>     \mainlanguage[serbian-cyrillic]
> No clue, really.
>
> Thank you,
> Mojca
>
> (PS: I would say that adding support for transliteration of the text
> from one script to the other would be a really nice feature. Then you
> could type your text for a book once and have it typeset in both
> versions without any extra effort :)
As for transliteration, cyrillic to latin is one-to-one, straightforward
with no exceptions.
A simple table lookup is enough.
Going from latin to cyrillic, there are some exceptions, but we could
solve that.
I can provide Hans with all that is needed.
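
The Cyrillic-to-Latin table lookup described above could be sketched roughly
as follows. This is plain Python, not ConTeXt code, and only an illustration:
the names CYRL_TO_LATN, build_table and cyrl_to_latn are made up here, and
mapping an uppercase letter to a title-case digraph (Љ to "Lj") is a
simplifying assumption that would not be right for text set in all capitals.

```python
# Serbian Cyrillic -> Latin, lowercase letters of the 30-letter alphabet.
CYRL_TO_LATN = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ђ": "đ",
    "е": "e", "ж": "ž", "з": "z", "и": "i", "ј": "j", "к": "k",
    "л": "l", "љ": "lj", "м": "m", "н": "n", "њ": "nj", "о": "o",
    "п": "p", "р": "r", "с": "s", "т": "t", "ћ": "ć", "у": "u",
    "ф": "f", "х": "h", "ц": "c", "ч": "č", "џ": "dž", "ш": "š",
}


def build_table(lower_map):
    """Build a str.translate table covering lowercase and uppercase letters."""
    table = {}
    for cyrl, latn in lower_map.items():
        table[ord(cyrl)] = latn
        # Assumption: uppercase maps to a title-case form, e.g. Љ -> "Lj".
        table[ord(cyrl.upper())] = latn.capitalize()
    return table


_TABLE = build_table(CYRL_TO_LATN)


def cyrl_to_latn(text):
    """Transliterate Serbian Cyrillic text; other characters pass through."""
    return text.translate(_TABLE)
```

For example, cyrl_to_latn("Београд") gives "Beograd". The reverse direction is
not covered: the Latin digraphs lj, nj and dž can each stand for one Cyrillic
letter or for two, so a plain per-character lookup is not enough there.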

Ivan



Re: [NTG-context] Adding built-in support for Serbian language

2020-10-30 Thread Hans Hagen

On 10/30/2020 1:42 PM, Mojca Miklavec wrote:


> I would say that this needs to be changed/improved.
> There's no reason why it wouldn't load both scripts at the same time
> (at least for Unicode engines, which is the only thing that's
> currently supported anyway).


i'll look into it once i finished some new stuff (in the middle of fit)


> (PS: I would say that adding support for transliteration of the text
> from one script to the other would be a really nice feature. Then you
> could type your text for a book once and have it typeset in both
> versions without any extra effort :)
just gimme the specs ... sounds like some nice distraction for a rainy 
weekend


Hans




Re: [NTG-context] Adding built-in support for Serbian language

2020-10-30 Thread Mojca Miklavec
Dear Ivan,

On Fri, 30 Oct 2020 at 11:32, Ivan Pešić wrote:
>
> Hello all,
> I have recently started using ConTeXt.

Welcome!

> I've found that the distribution
> includes a proper (cyrillic) hyphenation file for Serbian language,

I would say that this needs to be changed/improved.
There's no reason why it wouldn't load both scripts at the same time
(at least for Unicode engines, which is the only thing that's
currently supported anyway).

This is what XeTeX loads, for example:

https://github.com/hyphenation/tex-hyphen/blob/master/hyph-utf8/tex/generic/hyph-utf8/loadhyph/loadhyph-sr-latn.tex#L25

\input hyph-sh-latn.tex
\input hyph-sh-cyrl.tex
That is: it loads both patterns at the same time.

Hans, would you be willing to merge two sets of hyphenation patterns together?
Alternatively maybe we could prepare hyph-sh.pat.txt on the hyph-utf8 side?
I'm actually not sure why we didn't do that already, but maybe it was
because we have two sets of cyrillic patterns and it has never been
clear-cut which ones to take.
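
To illustrate what preparing a merged file could involve, here is a rough
Python sketch. It assumes the pattern sources are plain text with
whitespace-separated patterns and optional %-comments, which may not match the
actual files exactly; merge_patterns is a made-up name, and the sample
patterns in the comment are invented, not taken from hyph-sh-latn or
hyph-sh-cyrl.

```python
def merge_patterns(*pattern_texts):
    """Merge pattern lists, dropping comments and exact duplicates.

    The order of first appearance is kept, so the result is stable.
    """
    seen = set()
    merged = []
    for text in pattern_texts:
        for line in text.splitlines():
            line = line.split("%", 1)[0]  # strip a TeX-style comment, if any
            for pattern in line.split():
                if pattern not in seen:
                    seen.add(pattern)
                    merged.append(pattern)
    return merged


# Invented sample input, for illustration only:
#   merge_patterns(".a2\nb1a % note\n", "б1а\n.a2\n")
```

Since the Latin and Cyrillic patterns use disjoint letters, they should not
interfere with each other, so the merge amounts to little more than
concatenation.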

The author of hyph-sh-[latn|cyrl] says that his patterns should work
universally for multiple languages (they are relatively old), but they
were initially only released for the Latin scripts. Later another
author wanted to have support for Cyrillic script and prepared his own
patterns (I'm no longer sure whether they were partially based on the
other ones) without the Latin alternative.

In Xe(La)TeX and Lua(La)TeX we use the "sh" patterns for both, for
consistency reasons, among others. (You likely want the same word to
be hyphenated in the same way in both scripts).

> but a complete language support is still not implemented. Therefore,
> I've added what I think is required, did some testing by putting changed
> files in my texmf-local, and the result looks fine.

Awesome, thank you.

> There is only one thing that requires a decision from the development team.
> Serbian language uses two scripts: cyrillic and latin. Context language
> codes are using 2 letters for identification. So I'm not sure how to
> include both scripts.

(Unless Hans has plans to transliterate the translations on the fly :)
there should be two independent files. One should use the code sr-latn
and the other one sr-cyrl.

A two-letter code simply doesn't work in this situation, and we should
not even try to support only a single script, or attempt to decide
which one should be the default. Both should be supported equally
well.

> What I'm sending now is a cyrillic script implementation, using the code
> "sr".
>
> It is trivial to generate (completely automatic) latin script version of
> these changes, once it is decided how to label it.

Would you be willing to also prepare the latin one then?
The codes should be sorted out by Hans (potentially with some help),
but we definitely want to use "sr-latn" and "sr-cyrl".

For the longer names there is some more freedom. LaTeX uses "serbianl"
and "serbianc", I think, but I believe we can come up with something
nicer.
Maybe something along the lines of the following?
    \mainlanguage[serbian][script=latn]
or
    \mainlanguage[serbian-latin]
    \mainlanguage[serbian-cyrillic]
No clue, really.

Thank you,
Mojca

(PS: I would say that adding support for transliteration of the text
from one script to the other would be a really nice feature. Then you
could type your text for a book once and have it typeset in both
versions without any extra effort :)


[NTG-context] Adding built-in support for Serbian language

2020-10-30 Thread Ivan Pešić
Apologies, I forgot to attach the file :$

Here it is.

Ivan



serbian.7z
Description: Binary data


[NTG-context] Adding built-in support for Serbian language

2020-10-30 Thread Ivan Pešić
Hello all,
I have recently started using ConTeXt. I've found that the distribution
includes a proper (cyrillic) hyphenation file for Serbian language,
but a complete language support is still not implemented. Therefore,
I've added what I think is required, did some testing by putting changed
files in my texmf-local, and the result looks fine.
There is only one thing that requires a decision from the development team.
The Serbian language uses two scripts: Cyrillic and Latin. ConTeXt language
codes use two letters for identification, so I'm not sure how to
include both scripts.
What I'm sending now is a Cyrillic-script implementation, using the code
"sr".

It is trivial to generate (completely automatic) latin script version of
these changes, once it is decided how to label it.

Best regards,
Ivan