Just replying to myself: the dynamic latin9/utf8 switch does not work easily
in a LaTeX document (I am sure some people have made it work, but I did not
manage to).

Adding ZWSP support is easy: just insert

\DeclareUnicodeCharacter{200B}{}

in the document preamble. However, this is not enough, because the table of
contents gets ugly, with titles like « germe de gÃ©nÃ©rateur pseudo-alÃ©atoire »
instead of « germe de générateur pseudo-aléatoire » (some sections from the
UTF-8 part end up in the toc, which is read in the latin-9 part).
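
As a minimal sketch of that declaration, here is a document that is UTF-8
throughout, so it does not exercise the latin-9 switch or the toc problem. The
class, the fontenc line and the sample text are assumptions for illustration.
The ZWSP is written in ^^ notation only to make it visible; under pdflatex
this should be equivalent to the literal character.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
% Map U+200B (zero width space) to nothing, so that inputenc no longer
% reports it as an undefined Unicode character.
\DeclareUnicodeCharacter{200B}{}
\begin{document}
% ^^e2^^80^^8b is the three-byte UTF-8 sequence for U+200B.
\texttt{--my-option=}^^e2^^80^^8b\emph{option value}
\end{document}
```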

I tried to work around this by wrapping the UTF-8 part as follows:

\inputencoding{utf8}\addtocontents{toc}{\protect\inputencoding{utf8}}
...
\inputencoding{latin9}\addtocontents{toc}{\protect\inputencoding{latin9}}

(the \addtocontents commands are there in addition to \inputencoding so that
the encoding is also switched in the toc file), but there are still issues:
the dynamic switch does not work well with moving arguments such as section
titles going to the toc file. This is probably because the write is not
immediate and only happens at page shipout.

  V.

________________________________
From: Vincent Belaïche <vincent....@hotmail.fr>
Sent: Tuesday, 25 January 2022 18:18
To: Nicolas Goaziou <m...@nicolasgoaziou.fr>
Cc: Juan Manuel Macías <maciasch...@posteo.net>; orgmode
<emacs-orgmode@gnu.org>
Subject: RE: [RFC] Creole-style / Support for **emphasis**__within__**a word**

Hello,

Actually the source was in UTF-8, but it was using only characters that exist 
in latin-9, and it is exported to LaTeX for inclusion in a LaTeX document that 
is in latin-9.

So I used an Emacs Lisp snippet to do the export, and in this snippet, after
calling something like (org-export-to-buffer 'latex out-buffer nil nil nil t),
I was doing an insertion like

      (goto-char (point-max))
      (insert "
% Local Variables:
% coding: latin-9
% End:
")
      (save-buffer)
      (kill-buffer)

so that the exported buffer is converted to latin-9 before being saved.

OK, when I inserted the zero width space, this failed because there is no ZWSP
(aka U+200B) in latin-9.

Then I tried something else: I rewrote the text with a LaTeX snippet
@@latex:\kern-0.5em\relax@@ in it, like this:

    ~--my-option=~ @@latex:\kern-0.5em\relax@@ /option value/

That was OK, but it really makes the Org source ugly (maybe a custom entity
would be better), and it also works only for the LaTeX export.

Then I tried something else again: I passed the « utf8,latin9 » options to the
LaTeX inputenc package, instead of just « latin9 », and I kept my Org document
in UTF-8. Just before exporting, I did something like this in the input buffer:

    (goto-char (point-max))
    (insert "\n\n#+begin_export latex\n\\inputencoding{latin9}\n#+end_export\n")
    (goto-char (point-min))
    (insert "\n\n#+begin_export latex\n\\inputencoding{utf8}\n#+end_export\n")

This way the LaTeX processor switches dynamically from latin9 to utf8 at the
beginning of the exported part, and back to latin9 at the end of it. But there
are two pitfalls:

The first one is that ZWSP is not defined in the inputenc utf8.def definition
file, so a ZWSP character in the LaTeX code causes a LaTeX compilation error,
even though utf8 is declared as the input encoding.

The second (but this is less serious, I think) is that my document ends with
an enumerate list, and the Org exporter puts the second begin_export block
inside the enumerate list, not after it. I mean that I get this in the output:

   \inputencoding{latin9}
   \end{enumerate}

instead of this:

   \end{enumerate}
   \inputencoding{latin9}
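
For reference, the LaTeX side of this set-up might look roughly as follows.
This is only a sketch: the document class, the fontenc line and the body text
are made up, and it assumes that when inputenc is loaded with both options,
the last one (latin9) is the active encoding until the first switch.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}
% Both encodings are loaded; latin9, being last, should be active from here.
\usepackage[utf8,latin9]{inputenc}
\begin{document}
% ... latin-9 part of the host document ...

\inputencoding{utf8}   % inserted at the start of the Org export
% ... exported Org content, encoded in UTF-8 ...
\begin{enumerate}
\item last item of the exported list
\end{enumerate}
\inputencoding{latin9} % wanted position: after the list, not inside it

% ... latin-9 part continues ...
\end{document}
```

Section titles that travel to the toc file are not handled by this layout.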

My conclusion is that, for what I am after, an evolution of Org mode would be
preferable (maybe I will contribute something someday), so that writing one of
the following would do it:

   ~--my-option=~\relax{}/option value/
   ~--my-option=~@@:@@/option value/
   \left~--my-option=\right~/option value/
   \left~--my-option=\right~\left/option value\right/
   ~--my-option=~\left/option value\right/


________________________________
From: Nicolas Goaziou <m...@nicolasgoaziou.fr>
Sent: Tuesday, 25 January 2022 11:55
To: Vincent Belaïche <vincent....@hotmail.fr>
Cc: Juan Manuel Macías <maciasch...@posteo.net>; orgmode
<emacs-orgmode@gnu.org>
Subject: Re: [RFC] Creole-style / Support for **emphasis**__within__**a word**

Hello,

Vincent Belaïche <vincent....@hotmail.fr> writes:

> Thank-you both for the reply, I should have mentioned that I am aware of
> this trick but it works only for document encodings which have the
> zero-width space, like UTF-8, I was after a fix for documents in
> ISO-8859-15, aka latin-9.

You mean the source itself is not UTF-8?

I don't think there's a solution for you then, unless you convert it to
UTF-8, of course.

Regards,
--
Nicolas Goaziou
