Re: Fallback fonts in LaTeX export for non latin scripts

Ihor Radchenko Mon, 04 Sep 2023 01:10:16 -0700

Juan Manuel Macías <maciasch...@posteo.net> writes:

>> #+language: ancientgreek russian arabic
>
> Of course, this syntax would be the most appropriate and consistent
> within Org. The problem is LaTeX, specifically babel, and that certain
> inconsistencies would be created with the rest of the backends. At first
> some pitfalls come to mind:
>
> - The keyword #+language accepts for now only language codes (es, en,
>   el, ar, ru, etc.). Consistency with other backends should
>   be maintained in this regard: ancientgreek is not a valid language
>   code, but a name that only babel understands. If we put something
>   like (a valid language code):
>
>   #+language: el-polyton
>
>   this could be translated in babel as polutonikogreek (in the classic
>   syntax, that is, the languages that are loaded in the options of
>   \usepackage[options]{babel}), or, in the new syntax, ancientgreek and
>   polytonicgreek, which are actually two different languages: the first
>   is ancient polytonic Greek and the second modern polytonic Greek. To
>   add more confusion to the matter, in classical babel syntax
>   greek.ancient and greek.polytonic are also supported. But neither of
>   these things can be deduced by simply putting el-polyton, unless
>   breaking the consistency with the other backends.


I am now working on unifying Org's translation system, as discussed in
https://orgmode.org/list/87o7iw8yem....@bzg.fr
As a part of the effort, I plan to introduce a new constant that will
unify language abbreviations across Org and also associate them with
more human-readable names.

(defconst org-language-abbrevs
  '(("am" . "Amharic")
    ("ar" . "Arabic")
    ("ast" . "Asturian")
    ("bg" . "Bulgarian")
    ("bn" . "Bengali")
    ...))

The idea is to allow
#+language: Austrian German, Greek
as a valid specifier, in addition to
#+language: de-at, el

Then, across Org, we will make use of the standardized language
abbreviations.
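
As a rough illustration of accepting both forms, here is a minimal
sketch. The names `my/org-language-abbrevs', `my/org-language-code' and
`my/org-parse-language-keyword' are hypothetical, and the table is only
a small illustrative subset; this is not the planned implementation.

```elisp
(require 'seq)

;; Hypothetical subset; the real constant is still being designed.
(defconst my/org-language-abbrevs
  '(("ar" . "Arabic")
    ("de-at" . "Austrian German")
    ("el" . "Greek"))
  "Illustrative alist of (ABBREVIATION . HUMAN-READABLE-NAME).")

(defun my/org-language-code (spec)
  "Return the abbreviation for SPEC, or nil if SPEC is unknown.
SPEC is either an abbreviation or a human-readable language name;
both are matched case-insensitively."
  (or
   ;; First try SPEC as an abbreviation.
   (car (assoc-string spec my/org-language-abbrevs t))
   ;; Otherwise try SPEC as a human-readable name.
   (car (seq-find (lambda (pair)
                    (string-equal (downcase (cdr pair)) (downcase spec)))
                  my/org-language-abbrevs))))

(defun my/org-parse-language-keyword (value)
  "Split the keyword VALUE on commas and normalize each entry."
  (mapcar #'my/org-language-code
          (split-string value "," t "[ \t]+")))
```

With this sketch, both "Austrian German, Greek" and "de-at, el" would
normalize to the same list of abbreviations, so the rest of Org would
only ever see the abbreviations.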

> - Added to this is that Babel has two ways to load languages: the
>   classic syntax and the \babelprovide command, which is the one we are
>   interested in here for languages with non-Latin scripts, because the
>   onchar=ids fonts property must be added here. And what happens if the
>   user has already defined several languages with babel, using the
>   current procedure: \usepackage[french, english, AUTO]{babel}?

For LaTeX specifically, `org-latex-language-alist' will be reused to
map whatever is allowed in the #+language keyword to its name in
babel/polyglossia.

Does it make sense?
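
To make the mapping step concrete, here is a minimal sketch under stated
assumptions: `my/latex-language-alist' and `my/latex-language-name' are
hypothetical names, and the plist shape and entries are only
illustrative, not a copy of `org-latex-language-alist'.

```elisp
;; Hypothetical stand-in table: abbreviation followed by a plist of
;; package-specific language names.  Entries are illustrative only.
(defconst my/latex-language-alist
  '(("ar" :babel "arabic" :polyglossia "arabic")
    ("el" :babel "greek" :polyglossia "greek")
    ("ru" :babel "russian" :polyglossia "russian"))
  "Illustrative alist mapping abbreviations to LaTeX language names.")

(defun my/latex-language-name (code package)
  "Return the name of language CODE for PACKAGE, or nil if unknown.
PACKAGE is either `:babel' or `:polyglossia'."
  (plist-get (cdr (assoc code my/latex-language-alist)) package))
```

The point of the design is that the #+language value is resolved to an
abbreviation first, and only the LaTeX backend needs to know the
babel/polyglossia spelling.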

> Therefore, the least complicated thing, in my opinion, is to leave the
> syntax of the keyword #+language as it is. It is not necessary for the
> user to explicitly define secondary non-latin languages. The idea is
> that Org is responsible for generating the necessary babel code by
> simply giving a command like enable font for X language. What we are
> talking about here is ensuring readability using a series of fonts that
> LaTeX does not load by default, not even LuaLaTeX. And, after all, Org
> is monolingual: it does not have multilingual support at the moment;
> that is, there is nothing in Org to switch languages in the middle of
> the document. What happens is that here we take advantage of the
> functionality that Babel has to automatically apply a font for a
> non-Latin language/script, also loading its properties (hyphen rules,
> captions, etc.).
>
> A new keyword #+latex_language could be created, which would understand
> the babel names, but I think it is unnecessary and would add more
> complexity. As I said before, defining the necessary fonts would be
> enough, since my idea in this is a basic practicality to ensure the
> readability of the documents. And anyone looking for more advanced
> functions would have to enter LaTeX code explicitly.

I think that we should move towards multi-language support.
In practice, such support would simplify the WORG and orgmode.org
translation process, and could also serve as a basis for translating
the Org manual.

My rough idea is to allow specifying the language as an affiliated
keyword and, in the future, to allow selective export to a chosen
target language.

Multi-language documents are another potential target to support.

>> #+latex_font[ancientgreek]: "Linux Libertine O" Scale=MatchLowercase
>>
>> #+latex_font[russian]: "FreeSerif" Numbers=Lowercase,Color=blue
>
> I like this idea, but with the exception that in the two examples you
> give the user is declaring two fonts for both languages. In my example
> there was also Arabic, where the default font for the Arabic script is
> used.

My idea was that

#+language: ancientgreek russian arabic

implies "use default font for arabic", unless #+latex_font is specified.
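
A minimal sketch of that fallback rule, assuming a hypothetical table of
per-language default fonts. The names `my/default-script-fonts' and
`my/latex-font-for' are made up for illustration, and the default font
entries are placeholders rather than a proposal.

```elisp
;; Hypothetical defaults; the font choices are placeholders only.
(defconst my/default-script-fonts
  '(("arabic" . "FreeSerif")
    ("russian" . "FreeSerif")
    ("ancientgreek" . "FreeSerif"))
  "Illustrative alist of (LANGUAGE . DEFAULT-FONT).")

(defun my/latex-font-for (language user-fonts)
  "Return the font to use for LANGUAGE.
USER-FONTS is an alist of (LANGUAGE . FONT) collected from
#+latex_font keywords.  An explicit entry wins; otherwise fall
back to `my/default-script-fonts'."
  (or (cdr (assoc language user-fonts))
      (cdr (assoc language my/default-script-fonts))))
```

Here a language listed in #+language but absent from USER-FONTS simply
picks up the default, which matches the "unless #+latex_font is
specified" behaviour described above.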

> #+latex_font[arabic]: "FreeSerif" Numbers=Lowercase,Color=blue
>
> This last syntax would also be valid to modify the main default fonts:
>
> #+latex_font[main]: "FreeSerif" Numbers=Lowercase
> #+latex_font[sans]: "some font"
> #+latex_font[mono]: "some font"
> #+latex_font[math]: "some font"
>
> A practical use case. Suppose a user has a document in Spanish, which
> includes passages in Greek and Russian. It would be enough to use the
> Old Standard font (included in TeX live) for the entire document,
> ensuring consistency:
>
> #+latex_header: \usepackage[AUTO]{babel}
> #+language:es
> #+latex_font[main,greek,russian]: Old Standard

Looks reasonable.

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>
