Hi list,

Thanks a ton to @RadioNoiseE for the examples: with your help we can
now export org documents in Chinese or Japanese to PDF using XeLaTeX.
The current implementation has the following constraints:

1. we use babel
2. the main document language has to be jp or zh

Secondary languages can be used alongside the main one (I've tested with en-gb and es).

@all: Please install or refresh the feature/all-tex-fonts branch to check the results. I have added a short section to the manual documenting my current understanding of the state of the branch.

CALL FOR HELP: Is anyone fluent in Korean who could point out what is still missing?

Best, /PA

On 11/11/25 16:01, RadioNoiseE wrote:
On Tue, 11 Nov 2025 22:05:55 +0800,
Pedro Andres Aranda Gutierrez wrote:

Hi

Thanks a lot for tuning in...
Answers - or maybe more questions ;-) - inline...

On Tue, 11 Nov 2025 at 14:35, RadioNoiseE wrote:

  On Tue, 11 Nov 2025 02:14:11 +0800,
  Ihor Radchenko wrote:
  >
  > Huang Jing writes:
  >
  > >> How does it play with babel and polyglossia?
  > >
  > > It's not mentioned in the documents of xeCJK and luatex-ja, however I
  > > believe they do work together. From my limited testing, when loaded as
  > > packages, xeCJK and luatex-ja do no localization, thus relying on
  > > babel. However they will override the font settings by babel, which is
  > > totally acceptable.
  >
  > That actually depends. If the user of Org mode customizes fonts, it may
  > be a surprise when xeCJK/luatex-ja override the fonts. So, we might only
  > load these packages conditionally, when no font is explicitly selected.
  > Or maybe we simply put font settings _after_ xeCJK/luatex-ja is loaded.

  We don't need to configure fonts through babel, since it only provides
  localization. xeCJK provides the \setCJK...font control sequences and
  luatex-ja provides \set...jfont, so we can use those for font
  configuration.

That was my understanding... I've done a couple of experiments based on
what overleaf.com was providing and was able to start handling
\setCJK...font{} with \usepackage{fontspec}. If you would be so kind as
to provide an MWE for luatex-ja, I think we could have something
reasonable for Japanese too.

Sure. This is for Chinese under LuaTeX:

   \documentclass{article}

   \makeatletter
   \def\ltj@stdmcfont{FandolSong} % serif font
   \def\ltj@stdgtfont{FandolHei}  % sans serif and monospace font (usually the same)
   \def\ltj@stdyokojfm{quanjiao}  % jfm
   \makeatother

   \usepackage{luatexja} % load after defining \ltj@std...
   \usepackage{indentfirst}              % convention
   \usepackage[chinese,provide=*]{babel} % load after luatexja

   \catcode`\^^^^200b=\active\let^^^^200b\relax % ignore zws
   \parindent=2\zw % convention
   \linespread{1.333} % 16pt/12pt

   \begin{document}
   \section{天山山脉}
   位于乌鲁木齐市以东的博格达峰海拔5445米,峰上的积雪终年不化,人们称它
   “雪海”。位于博格达峰山腰的天池,清澈透明,是新疆著名的旅游胜地。目前,
   博格达峰自然保护区已纳入联合国“人与生物圈”自然保护区网。托木尔峰,海
   拔7439米,是天山的最高峰,登山界一般承认1956年阿巴拉科夫首次登顶成功,
   但也有说1938年已有苏联登山队登顶;1975年7月25日首个中国登山队登顶成
   功。
   \end{document}

This is for Japanese under LuaTeX:

   \documentclass{article}

   \usepackage{luatexja} % OOTB Japanese support
   \usepackage{indentfirst}               % convention
   \usepackage[japanese,provide=*]{babel} % load after luatexja

   \catcode`\^^^^200b=\active\let^^^^200b\relax % ignore zws
   \parindent=\zw % convention, different from Chinese which is 2\zw
   \linespread{1.333} % 16pt/12pt

   \begin{document}
   \section{二億圓の犬}
   犬はよく訓練されたフォックス・テリアで「歐洲の驚異の犬」といわれたも
   のだそうである。それを加州へ送る途中、兩會社の不注意で、途中で死んで
   しまったので、それに對して、二億二千萬圓の損害賠償をしろというのが、
   この訴えである。

   いくらアメリカでも、こういう話は珍しいらしく、加州の話が、シカゴの新
   聞にまで載ったわけである。どんな犬かは知らないが、いくら名犬でも、二
   億圓の犬というのは、われわれには一寸考えが及ばない。とにかく、とんで
   もない話が時々起る國である。
   \end{document}


  > > 1. Under XeTeX and LuaTeX, xeCJK and luatex-ja will set up font support
  > > according to the platform (operating system) detected, and activate
  > > font, kinsoku, line-breaking support. They will not change the
  > > \baselineskip.
  > >
  > > 2. When ctex is being used, it will also configure correct
  > > \baselineskip (from the default 12pt to 16pt). It will also try to
  > > support pdfTeX.
  > >
  > > 3. Localization support provided by babel.
  > >
  > > So it's actually necessary to load babel when not using the document
  > > classes provided. It's safer to load babel first though.
  >
  > Note that babel also provides rules for typography. So,
  > xeCJK/luatex-ja do step onto babel a bit. But, as you said, they
  > basically add missing typographical rules, so it might be reasonable.
  >
  > > Neither xeCJK nor luatex-ja is necessary for font configuration when
  > > babel is being used. Since babel only supports Chinese and Japanese on
  > > LuaTeX and XeTeX with OTF support, the CJK font can be loaded the same
  > > way as Latin fonts. See https://latex3.github.io/babel/guides/locale-chinese.html.
  >
  > > However, babel is hardly ever used in the Chinese or Japanese community,
  > > since its support is so primitive. For example, it does not add
  > > xkanjiskip between Latin and CJK characters. Here's a relevant
  > > discussion on relying on babel for localization in the ctex community:
  > > https://github.com/CTeX-org/ctex-kit/issues/626#issuecomment-1147428749.
  >
  > My understanding from this is that we (1) always want to load xeCJK for
  > Chinese documents (what about luatex?); (2) always want to load
  > luatex-ja for Japanese (what about xetex?).

  We can configure luatex-ja for Chinese documents on LuaTeX by
  changing \parindent to 2\zw, changing the default font (HaranoAji)
  to FandolSong, and changing the JFM (Japanese font metric). The
  reverse works too: xeCJK can be configured for Japanese on XeTeX.

As said above... I'd like to see a MWE to check.

For LuaTeX, see above. For XeTeX, Chinese:

   \documentclass{article}

   \usepackage{xeCJK} % OOTB Chinese support
   \usepackage{indentfirst}              % convention
   \usepackage[chinese,provide=*]{babel} % load after xeCJK

   \catcode`\^^^^200b=\active\let^^^^200b\relax % ignore zws
   \parindent=2em % convention
   \linespread{1.333} % 16pt/12pt

   \begin{document}
   \section{天山山脉}
   位于乌鲁木齐市以东的博格达峰海拔5445米,峰上的积雪终年不化,人们称它
   “雪海”。位于博格达峰山腰的天池,清澈透明,是新疆著名的旅游胜地。目前,
   博格达峰自然保护区已纳入联合国“人与生物圈”自然保护区网。托木尔峰,海
   拔7439米,是天山的最高峰,登山界一般承认1956年阿巴拉科夫首次登顶成功,
   但也有说1938年已有苏联登山队登顶;1975年7月25日首个中国登山队登顶成
   功。
   \end{document}

and for Japanese:

   \documentclass{article}

   \usepackage{xeCJK} % load first
   \usepackage{indentfirst}               % convention
   \usepackage[japanese,provide=*]{babel} % load after xeCJK

   \setCJKmainfont{HaranoAjiMincho} % serif font
   \setCJKsansfont{HaranoAjiGothic} % sans serif font
   \setCJKmonofont{HaranoAjiGothic} % monospace font

   \catcode`\^^^^200b=\active\let^^^^200b\relax % ignore zws
   \parindent=1em % convention
   \linespread{1.333} % 16pt/12pt

   \begin{document}
   \section{二億圓の犬}
   犬はよく訓練されたフォックス・テリアで「歐洲の驚異の犬」といわれたも
   のだそうである。それを加州へ送る途中、兩會社の不注意で、途中で死んで
   しまったので、それに對して、二億二千萬圓の損害賠償をしろというのが、
   この訴えである。

   いくらアメリカでも、こういう話は珍しいらしく、加州の話が、シカゴの新
   聞にまで載ったわけである。どんな犬かは知らないが、いくら名犬でも、二
   億圓の犬というのは、われわれには一寸考えが及ばない。とにかく、とんで
   もない話が時々起る國である。
   \end{document}

  > >> > For the \setCJK...font declaration, I can provide a wrapper in LaTeX
  > >> > if needed, compatible with XeTeX, LuaTeX and probably other
  > >> > engines. You will need xeCJK for this control sequence while other
  > >> > engines will not compile because it is provided by the xeCJK package.
  > >> > Under other engines, there are different control sequences used for
  > >> > font configuration (i.e., under LuaTeX thus luatex-ja, you use
  > >> > \set...jfont).
  >
  > Could you expand on "other engines will not compile"? How does it fit to
  > "compatible with XeTeX, LuaTeX, and probably other engines"?
  > (Note that inclusion or not inclusion of xeCJK can be controlled by us -
  > we know which compiler is used for export during export and can
  > conditionally include it on Elisp level)

  What I mean by ``other engines will not compile'' is this: if the
  exported document calls \setCJK...font directly, it will not compile
  under, e.g., LuaTeX. ctex itself works across different TeX engines,
  but these commands are provided by xeCJK, not by ctex.

  But as we don't use ctex now, we just need to call \setCJK...font
  after loading xeCJK under XeTeX, and \set...jfont with luatex-ja under
  LuaTeX, since we can access the target engine through
  org-latex-compilers.
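
  A minimal sketch of that split, assuming the engine test is done in
  LaTeX with the iftex package rather than in Elisp, and assuming
  luatexja-fontspec is loaded so that \set...jfont exists. The HaranoAji
  font names are taken from the Japanese xeCJK example above and must be
  installed:

   \documentclass{article}
   \usepackage{iftex} % provides \ifXeTeX and \ifLuaTeX
   \ifXeTeX
     \usepackage{xeCJK}
     \setCJKmainfont{HaranoAjiMincho} % serif
     \setCJKsansfont{HaranoAjiGothic} % sans serif
     \setCJKmonofont{HaranoAjiGothic} % monospace
   \fi
   \ifLuaTeX
     \usepackage{luatexja-fontspec} % loads luatexja and fontspec
     \setmainjfont{HaranoAjiMincho} % serif
     \setsansjfont{HaranoAjiGothic} % sans serif
     \setmonojfont{HaranoAjiGothic} % monospace
   \fi
   \begin{document}
   二億圓の犬
   \end{document}

  In Org itself the same choice could be made before the preamble is
  written, so that only one of the two branches ends up in the exported
  file.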

Hmm... so my guess was not that wrong ;-)

  > >> Could you provide more details about these commands?
  > >
  > > Equivalents to \setCJK...font provided by luatex-ja are documented in
  > > English here: https://mirrors.ctan.org/macros/luatex/generic/luatexja/doc/luatexja-en.pdf
  > > Search for ``Table 1: Commands of luatexja-fontspec'' in that
  > > PDF. They are provided by luatexja-fontspec, which autoloads luatexja
  > > and fontspec.
  >
  > Ok. \setmainjfont, \setsansjfont, and \setmonojfont seem to be of
  > interest. They are direct equivalents of \setCJKmainfont,
  > \setCJKsansfont, and \setCJKmonofont. This is probably only relevant
  > when using bare bones fontspec or polyglossia to set fonts. When using
  > babel, it probably makes sense to keep using \babelfont[chinese]{rm}{...}

  I think we should configure fonts through the interfaces provided by
  xeCJK or luatex-ja, since they will override the babel font. Babel
  will not complain if no font is specified.
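
  For comparison, a sketch of the babel-only route that the quoted
  \babelfont suggestion refers to, with neither xeCJK nor luatex-ja
  loaded. It assumes LuaLaTeX and an installed FandolSong font. As noted
  earlier in the thread, babel alone does not add xkanjiskip between
  Latin and CJK characters:

   \documentclass{article}
   \usepackage[chinese,provide=*]{babel}
   \babelfont[chinese]{rm}{FandolSong} % serif font for Chinese only
   \begin{document}
   \section{天山山脉}
   位于乌鲁木齐市以东的博格达峰海拔5445米。
   \end{document}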

I'm close to designing a strategy for this. Currently, when I detect CJK
fonts, I include xeCJK. So, with an MWE for Japanese fonts, it would not
be too difficult to get this configuration right, too.

I think you need to include xeCJK even if the user does not specify
fonts, so there's a fallback/default one. (Not necessary for Chinese
under xeCJK, since it's OOTB; but for Japanese it's necessary, and
same for luatexja -- need to specify default Chinese Fandol font.)

Hopefully the MWEs help explain things.

  > > luatexja also patches LaTeX2e's NFSS2, adding CJK font
  > > support. However unless there's a specific reason we shouldn't use
  > > that in Org export results.
  >
  > That sounds concerning. What are the potential consequences?

  I think there are no observable consequences for Org export. It will
  not interfere with any existing functionality. What it does is extend
  the existing framework: it provides NFSS2-like interfaces for document
  classes and handles CJK font scaling, vertical typesetting, and
  similar features.

  However, I was thinking of not using luatexja-fontspec, which means we
  would no longer have the \set...jfont control sequences. The reason is
  that luatexja-fontspec should be loaded after fontspec, as it patches
  fontspec. As a replacement, we can use the following (see section 8.3
  of the luatexja documentation):

   \ltj@stdmcfont  -> The default Japanese font for the mincho family (serif)
   \ltj@stdgtfont  -> The default Japanese font for the gothic family (sans serif and monospace)
   \ltj@stdyokojfm -> The default JFM for horizontal direction
   \ltj@stdtatejfm -> The default JFM for vertical direction

  > > I'm currently having my mid-term exams, so I'll be able to work on
  > > this after Tuesday.
  >
  > No problem. I think Pedro wanted the whole thing to be in mergeable
  > state (not necessary final) before EmacsConf, but we are generally not
  > very pushy - we are all volunteers after all.

  >

I don't want to push... it's just that I have a talk on this in EmacsConf
and it would be cool to be able to say 'you have it in org-mode master'.

  > >> Org mode only supports exporting via pdflatex, xelatex, and lualatex.
  > >
  > > Then my idea is to drop ctex, and use xeCJK or luatex-ja with babel.
  > > These two packages support both Chinese and Japanese, while xeCJK
  > > comes with out-of-the-box Chinese support and luatex-ja comes with
  > > out-of-the-box Japanese support.
  >
  > Good.
  >
  > > pdfTeX support is also feasible, through the CJK package, which is
  > > used by ctex as well.
  >
  > Note that pdfTeX is something we are not certain about. I wish we could
  > do it, but it seems tricky. We will need to work out how we want to
  > design the pdftex support. Tentatively, we may add a field to
  > `org-latex-language-alist' where standard per-language config will be
  > stored and loaded according to #+LANGUAGE settings (note that there
  > might be multiple languages in one document).

  CJK support on pdfTeX would require appropriate TFM files; with those
  we should be able to use \pdfmapline to set up the CJK font. It is
  somewhat tricky.
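
  For reference, a sketch of the CJK-package route under pdflatex. The
  gbsn font family is an assumption (it comes with the arphic fonts and
  was not discussed in this thread):

   \documentclass{article}
   \usepackage{CJKutf8} % UTF-8 front end of the CJK package
   \begin{document}
   \begin{CJK}{UTF8}{gbsn}
   天山山脉
   \end{CJK}
   \end{document}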

  > --
  > Ihor Radchenko // yantar92,
  > Org mode maintainer,
  > Learn more about Org mode at <https://orgmode.org/>.
  > Support Org development at <https://liberapay.com/org-mode>,
  > or support my work at <https://liberapay.com/yantar92>

Best, /PA

--
Fragen sind nicht da, um beantwortet zu werden,
Fragen sind da um gestellt zu werden
Georg Kreisler

"Sagen's Paradeiser" (ORF: Als Radiohören gefährlich war) => write BE!
Year 1 of the New Koprocracy

