# Re: [NTG-context] i18n in ConTeXt (& some bugs)

Adam Lindsay wrote:

In OpenType fonts, \i doesn't compose well with accents, unlike normal
TeX fonts. Therefore a couple more definitions (at least) are needed in
enco-acc:
\defineaccent ' {\i} {\iacute}
\defineaccent ` {\i} {\igrave} % etc.

ok, added
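
A loosely analogous effect shows up in plain Unicode normalization, sketched below in Python: an ordinary i plus a combining acute composes to a single precomposed character, while a dotless i (U+0131) plus the same accent does not, so an explicit mapping is needed. This is only an illustration of why definitions such as the ones above help. It is not how ConTeXt or XeTeX resolve accents internally, and the helper name is hypothetical.

```python
import unicodedata

def compose(base, combining):
    """Return the NFC-normalized form of base followed by a combining mark."""
    return unicodedata.normalize("NFC", base + combining)

ACUTE = "\u0301"  # COMBINING ACUTE ACCENT

# Dotted i: NFC produces one precomposed code point (U+00ED).
dotted = compose("i", ACUTE)

# Dotless i (U+0131): Unicode has no precomposed form, so two code points remain.
dotless = compose("\u0131", ACUTE)

print(len(dotted), [f"U+{ord(c):04X}" for c in dotted])
print(len(dotless), [f"U+{ord(c):04X}" for c in dotless])
```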

I had the toughest time getting strings like \v!january to switch
language. I found that this change (from \currentmainlanguage) fixed the
inability to \ShowLanguageValues. In lang-lab:
\def\labellanguage{\defaultlanguage\currentlanguage}

dangerous and wrong, better

\gdef\ShowHeadText#1{\tttf#1\VL\mainlanguage[\currentlanguage]\headtext{#1}\VisualizeLastSpace}
\gdef\ShowLabelText#1{\tttf#1\VL\mainlanguage[\currentlanguage]\labeltext{#1}\VisualizeLastSpace}

Another bug in visualisation was no doubt brought on by the switch to low
level english. In s-mod-00:
\VL \ShowLabelText \subsection           \VL\MR
\VL \ShowLabelText \subsubsection        \VL\MR
... changes to...
\VL \ShowLabelText \v!subsection           \VL\MR
\VL \ShowLabelText \v!subsubsection        \VL\MR

corrected

With that done, I noticed a few smaller bugs in the lang-* files. I am
not an expert in any of these languages, so all of my recommendations are
subject to verification by people who know what they're doing!

In lang-grk, a copy-paste error was propagated: \s!fi => \s!gr
corrected

In lang-ura, the Hungarian word for abbreviations has a typo:
R\"ovid\'\it\'esek => R\"ovid\'it\'esek (or R\"ovid\'\i t\'esek)

corrected

In lang-sla, the Polish word for part has an \ecedilla, which I'm
guessing should be an \eogonek...
Ust\c{e}p => Ust\k{e}p  (probably!)

corrected

(some day i'll change all these into \namedglyphs)
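
For anyone verifying the cedilla/ogonek distinction, a small Python sketch can list the Unicode names of both precomposed forms of a base letter. The helper name is hypothetical, and the snippet says nothing about which form a given language file should use; it only shows that the two are distinct characters.

```python
import unicodedata

def accent_names(base):
    """Return the Unicode names of base with a cedilla and with an ogonek."""
    cedilla = unicodedata.normalize("NFC", base + "\u0327")  # COMBINING CEDILLA
    ogonek = unicodedata.normalize("NFC", base + "\u0328")   # COMBINING OGONEK
    return unicodedata.name(cedilla), unicodedata.name(ogonek)

for name in accent_names("e"):
    print(name)
```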

In lang-ita, the Catalan word for March is spelled with a \,. That's
probably a cedilla:
mar\,c => mar\c{c}

ok

Finally, I've done some work filling out enco-uc with characters used in
the other encodings and (especially) characters used in the lang-* files.
It's attached. I have a couple questions that remain for experts:

indeed

Greek: I defined \Greekleftquot as a guillemet, as a guess. Is that right?

Cyrillic: The Unicode codepoint for \cyrillicii completely baffled me for
a while. (I don't have a copy of the t2a/b fonts here with me to take a
look for myself!) According to Unicode, it should be the same as
\cyrillici, but that can't be right. I'm now guessing that since it
appears the same as the roman 'i' in enco-cyr, it corresponds with
U+0456: CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I. Is that right?
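
A quick way to double-check a guess like this is to ask a Unicode database for the character names. The Python sketch below does that for U+0456 and, for comparison, U+0438. Pairing \cyrillici with U+0438 is an assumption made here for illustration, not something taken from enco-cyr, and the helper name is hypothetical.

```python
import unicodedata

def describe(codepoint):
    """Format a code point as 'U+XXXX NAME' using the Unicode database."""
    return f"U+{codepoint:04X} {unicodedata.name(chr(codepoint))}"

# Candidate code points; the first pairing is an assumption, the second is
# the guess made in the message above.
candidates = {
    "cyrillici (assumed)": 0x0438,
    "cyrillicii (guessed)": 0x0456,
}

for label, cp in candidates.items():
    print(f"{label}: {describe(cp)}")
```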

The additions to enco-uc are attached. I would be ever so grateful if
they found their way into the distribution!

ok, added

btw, we also need to extend the utf (unic-*) files

By the way, although these seem like complaints, I must say (again) that
hm, bugs are bugs, no complaints :-)

the plumbing supporting arbitrary encodings, accents, and input regimes
in ConTeXt is absolutely fantastic. Making XeTeX work with ConTeXt is
really quite trivial compared to the efforts being expended on the
XeLaTeX side!


ah, that's good news; i already got pessimistic seeing the stream of patches needed for latex that pass by on the xetex list

Hans
