Hi Mojca,
Hello everybody,
Finally I come to answer your mail... and I have to explain why
everything takes such a long time.
I must confess that I am totally absorbed by scientific work at the
moment (a major dictionary) and, apart from the traditional-script
features for Mongolian and Manju, I rarely use even my own MonTeX.
Language and computer use in Mongolia are overwhelmingly geared towards
a Cyrillic environment. The classical script is only used in academic,
formal and decorative settings, not in practical applications like
books, newspapers or (internet) communication. Nonetheless, there is a
strong need for a system that can seamlessly integrate all worlds of
writing --- this is the guiding thought which led me to write MonTeX.
When I wrote MonTeX, a universal Cyrillic standard was close to being
considered a pipe dream (see the plethora of Cyrillic input encodings
provided with MonTeX). Now that we have Unicode and functioning
Cyrillic support on all sorts of computing platforms, the Mongolians
/continue/ to modify the character slots between 128 and 255 of Arial
and other M$ fonts in order to run Win 1251 lookalike codepages, even
though the systems do speak Unicode these days. Software applications
which use German umlauts (ä, ö, ü), when run on vanilla office computers
in Mongolia, frequently find these characters (needed for
transliterations) unusable because somebody installed a homebrew
modified font which clutters up everything in the wrong place. And as
long as people know these fonts are out there, they continue to write
Word documents and create Web pages which require exactly these
crippled fonts... It sucks. In order to avoid these totally mindless
problems I continue to work in pure ASCII and let TeX do the generation
of nice Cyrillic and Mongolian traditional script. Besides, there are
linguistic advantages in being able to sort modern Xalx Mongolian and
classical Mongolian in traditional writing according to the same input
alphabet... So MonTeX continues to have a strong raison-d'être in an
albeit small community.
I do feel strongly that the limited Metafont set supplied with MonTeX is
not up to the standards of what we enjoy today. I used to stay with
Metafont because I like the elegance of the creative process, the
language is pleasing, the concept is pleasing. Alas, the world has gone
in a different direction. The type1 fonts contributed to MonTeX were
made by a kind fellow and author of a popular LaTeX book quite a few
years ago.
Given the mainstream popularity of Babel, it is entirely understandable
that we should support a Babel-style environment which can access modern
fonts --- yet in my recent work (a dictionary in five languages and more
scripts: Manju, Tibetan, Mongolian, Uighur (in Arabic script) and
Chinese) I started using XeLaTeX, even abandoning my own ctib Tibetan
system, due to the completeness of the now available Tibetan font. Then
I abandoned ArabTeX because some Arabic OpenType fonts have more glyphs
(and I need some historical material for which I never figured out how
ArabTeX would do exactly what I needed): hence I switched to XeLaTeX.
So my next thought
is not how to port the complete MonTeX functionality to Babel, but to
XeLaTeX; in the field where I work XeLaTeX font support is superior to
everything we've seen so far, and is indispensable. So I gave up
understanding the workings of a Babel system (and was not even aware of
the Babel support for Mongolian prior to your first mail; the Mongolian
community is small but highly fragmented---I developed my system
basically at the Academy of Sciences of Mongolia about 10 years ago, and
younger talents coming from different institutions have entirely studied
abroad; sometimes it is difficult to keep track of each other).
To make a long story short:
My suggestion is to keep the ASCII Input -> Cyrillic and Traditional
Script Output functionality of MonTeX alive in a new environment,
preferably based on XeLaTeX.
XeLaTeX is capable of accepting ASCII input and producing Script output
(e.g. consult the ArabXeTeX package!)
The hyphenation support and language settings will not be taken care of
by Babel, but by polyglossia.
I have no idea yet what such a hyphenation file should look like in
order to accommodate both ASCII and Cyrillic input methods.
The flaws in the hyphenation patterns I generated more than a decade
ago are mainly due to the incomplete dictionary on which my colleague
and I based our work.
Please, please, please understand that I am overwhelmed by the monstrous
dictionary work mentioned above, which I have been working on for more
than a decade and which is supposed to go to print this summer (MonTeX
was made for this purpose, very much like TeX was made for typesetting
Math, if I am allowed to make this comparison). Until summer of this
year, I have no chance to do any meaningful work on porting MonTeX to
contemporary Cyrillic fonts and making it work in XeLaTeX (currently I
am using MonTeX in XeLaTeX without relying on the beautiful font support
of XeTeX: a shame!). Currently my only focus is on finishing the
dictionary.
There are a lot of modifications I tweaked into MonTeX in order to
create complicated Manju transliterations of Tibetan text, but so far I
have not been able to consolidate everything into a new MonTeX version. Can we
postpone the renaming issue? It is not the only issue I want to fix.
Please accept my apologies for this lengthy mail.
Best regards,
Oliver.
On 22.03.2010 02:54, Mojca Miklavec wrote:
Hello Karl & others,
I took a bit more time and wrote a longer reply. There are two issues
with Mongolian patterns. The first one is that there are two sets (but
since there's a unique set of rules for hyphenation, there is a chance
that the two authors will be willing to agree on a single set) and the
second one is that they are in different encodings. I will concentrate
only on encodings in this mail.
A bit of background. There are two "big players" in the history of
support for Mongolian in TeX: Oliver Corff and Dorjgotov Batmunkh.
Both contributed quite a lot of material (and Mongolian support in
LaTeX is pretty complex anyway since Mongolian can be written in an
infinite number of scripts and directions).
Oliver Corff was the first one to publish any patterns. He wrote his
own "system", called MonTeX, which included packages, patterns and
fonts, but also other things:
- http://ctan.org/tex-archive/language/mongolian/montex/
- http://ctan.org/tex-archive/language/mongolian/MNT/
- http://ctan.org/tex-archive/language/mongolian/mxd/
- http://ctan.org/tex-archive/language/mongolian/soyombo/
On the other hand, Dorjgotov Batmunkh contributed Babel support,
translated "The not so short introduction to LaTeX", generated
patterns (and also some documents that show where Oliver's patterns
break in the wrong way), ...
When Arthur and I started with hyph-utf8, the language.dat file in
TeX Live was using:
- Oliver's patterns in LMC encoding (mnhyph.tex) under the name "mongolian"
- Dorjgotov's patterns in T2A encoding (mnhyphn.tex) under the name
"mongolian2a" (since the name mongolian was not available any more)
See:
http://www.tug.org/svn/texlive/trunk/Master/texmf/tex/generic/hyphen/mnhyph.tex?view=log&pathrev=34
http://www.tug.org/svn/texlive/trunk/Master/texmf/tex/generic/hyphen/mnhyphn.tex?view=log&pathrev=5096
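For illustration, the two language.dat entries described above
presumably looked roughly like this. This is a sketch reconstructed
from the language names and file names given in this mail, not a copy
of the actual file, and the comments are added here:

```tex
% language.dat (excerpt, reconstructed for illustration)
% each entry: <language name> <pattern file>
%
% Oliver's patterns, LMC encoding
mongolian mnhyph.tex
% Dorjgotov's patterns, T2A encoding
mongolian2a mnhyphn.tex
```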
Dorjgotov released his patterns not that long before we started the
work ... They were imported to TL in October 2007, while Oliver's have
been there since the beginning (revision 34) and are usually loaded
with
\language\numbe...@mongolian
inside mls.sty.
The main problem now is that there is
/usr/local/texlive/2009/texmf-dist/tex/latex/mongolian-babel/mongolian.ldf
but \usepackage[mongolian]{babel} would load the LMC-encoded patterns.
One could rename mongolian.ldf to mongolian2a.ldf and then use
\usepackage[mongolian]{babel}, but that would be stupid, in particular
because there's no other "conflicting babel support" that would force
one to use "mongolian2a" as a language name.
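To make the mismatch concrete, a minimal document of the following kind
would run into it. This is only a sketch: the T2A font encoding and
UTF-8 input encoding lines are assumptions about a typical setup, not
taken from the mongolian-babel documentation, and the sample text is
just an example.

```tex
\documentclass{article}
\usepackage[T2A]{fontenc}    % assumption: Cyrillic text set in T2A
\usepackage[utf8]{inputenc}  % assumption: UTF-8 source file
% Loads mongolian.ldf, but the name "mongolian" in language.dat
% currently points to the LMC-encoded patterns (mnhyph.tex), so
% T2A-encoded text is hyphenated with the wrong pattern set.
\usepackage[mongolian]{babel}
\begin{document}
Монгол хэл
\end{document}
```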
Maybe the best possible solution at this time would be to convince
Oliver Corff to allow us to rename his "mongolian" to "mongolianlmc"
and let him fix the line in his support mls.sty (no single user would
be affected by that) and rename "mongolian2a" to "mongolian", to let
babel support work properly (which would make many people happy).
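After such a renaming, the relevant language.dat entries might look as
follows. Again this is a sketch of the proposal, not an existing file;
only the names "mongolianlmc" and "mongolian" and the two pattern file
names come from this mail.

```tex
% language.dat (excerpt) after the proposed renaming, as a sketch
%
% LMC-encoded patterns, intended to be used only through mls.sty
mongolianlmc mnhyph.tex
% T2A-encoded patterns, found by \usepackage[mongolian]{babel}
mongolian mnhyphn.tex
```

The line in mls.sty that selects the language would then have to refer
to the new name "mongolianlmc" instead of "mongolian".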
On Sat, Mar 20, 2010 at 23:38, Karl Berry wrote:
>> - one author (of the old patterns) wants to have automatic
>> transliteration (he types in latin alphabet and wants the
>> corresponding cyrillic glyphs in the resulting document) which is
>> probably only possible with the proper font, but there's hardly any
>> font in that encoding present (LMC);
> I've never heard of LMC.
http://ctan.org/tex-archive/language/mongolian/montex/
> How is the author using these patterns now, ie, what font?
His own metafont font. So [I first thought that] Type 3 (bitmap) was
the only available font that can be used with these patterns. Or at
least that's what CTAN says and the montex documentation uses Type 3.
However I see that there is
/usr/local/texlive/2009/texmf-dist/fonts/type1/public/montex/
but I don't know where those outlines come from. Anyway: I guess that
that's the only font that supports his LMC encoding.
The main point of LMC encoding is that it allows transliteration (one
writes in ASCII and gets the pdf typeset in Cyrillic without letting
TeX notice that at all) and the author wants to keep that
functionality. However, I guess that this functionality only works in
connection with his package, so if we leave the functionality there
with his package, we should not worry about anything else.
> If the font(s) he is using is/are not in TL, we could forget it.
It is, but the way to use it is a bit unconventional. Here are some
paragraphs from
/usr/local/texlive/2009/texmf-dist/doc/latex/montex/montex.tex
\section{\MonTeX\ and Recent \TeX\ Trends}
As soon as the LH Cyrillic fonts support the Mongolian currency sign,
\MonTeX\ will switch to this font set. At the moment the
private encoding \LMC\ is favoured over LH; future implementations
of \MonTeX\ will provide a smooth transition for the user: documents
developed with older versions of \MonTeX\ will be upward compatible.
The \texttt{babel} package will, perhaps, also be supported in due
course; at the moment, \texttt{babel} support is lacking mainly due
to font encoding questions and a private RL setup. At present,
\MonTeX\ is \emph{not} built with \texttt{babel} compatibility in mind.
It must be seen as a stand-alone extension similar to
\texttt{german.sty} or the \textsf{CJK} package.
...
\section{Hyphenation Patterns}
\MonTeX\ provides hyphenation rules for Modern Mongolian (Xalx).
... hyphenation patterns for Russian exist at CTAN
but they are unfortunately not suited for \MonTeX\ without prior
work.
... A format file is usually
created when a new \TeX\ or \LaTeXe\ system is installed, but creating
a new format can be done at any later time again. A special variant
of \TeX\ called \texttt{initex} is used for this purpose.
The procedure sounds more intimidating than it actually is.
Since there are many different types of \TeX\ installations, the
procedure is somewhat system-dependent. There is detailed on-line
documentation available for performing this task, either in form of
a text file for emtex, or in form of a FAQ file which can be
displayed using the command \texttt{texconfig faq} on teTeX systems.
Mojca
On Mon, Nov 23, 2009 at 17:27, Oliver Corff wrote:
Dear Mojca,
No problem. I'll do that in about two weeks from now.
BTW, this is a good opportunity for me to clean up MonTeX code and adapt
MonTeX to XeLaTeX.
I've never dealt with Babel so I am a bit at a loss there.
Despite modern encodings, I still cherish the possibility to
write in a transliteration (pure ASCII) and have the system do the
conversion.
Best regards,
Oliver.
PS: Due to the different code spaces, merging romanized and Cyrillic
Mongolian hyphenations into ONE file should not be a big problem.
(It is not possible since some slots overlap.)
Summary (for those who managed to read this mail until this point): in
my opinion it would be best to rename "mongolian" to "mongolianlmc" in
language.dat and "mongolian2a" to "mongolian" + ask Oliver Corff to
fix his package. That way the users of Oliver's package would not be
affected and users of Dorjgotov's babel support would benefit a lot
with the ability to use babel+patterns in an easy way (now they need
to hack manually).