Dear Michal,

Thank you very much for this solution! It sounds like exactly what I need.
I am eager to try it, but unfortunately when I tried to install helpers4ht (on Mac OS 10.11.5) by following your instructions (https://github.com/michal-h21/helpers4ht), the command

    git clone git@github.com:michal-h21/helpers4ht.git

returned an error (without asking me for a password):

    Cloning into 'helpers4ht'...
    Permission denied (publickey).
    fatal: Could not read from remote repository.

Do you happen to know what I'm doing wrong? (Sorry for such a basic question.)

Best,
Alex

On Sat, Jul 23, 2016 at 11:56 PM, Michal Hoftich <[email protected]> wrote:

> Dear Alex,
>
> > I would like to produce an ODT document from my XeLaTeX document (using
> > MacTeX 2016).
> >
> > The necessary code to include Unicode characters (including in Greek and
> > Arabic script) was kindly provided by CV Radhakrishnan and Michal Hoftich
> > back in February 2013. But I am running into a new difficulty: converting
> > a document that defines LaTeX macros that have Unicode characters in them.
> > (The reason I want this is to enable me to use macros within a
> > right-to-left script, Arabic. Mixing up RTL and LTR scripts in a text
> > editor, especially when punctuation -- or braces {} -- is involved, tends
> > to make the source file unreadable.)
> >
> > I am attaching a MWE in two files:
> >
> > 1. `main.tex`: standalone file that includes macro definition
> > 2. `utf2ent.pl`: the Perl script devised by CVR to keep Unicode in the new
> >    document
> >
> > The script I run to compile this is:
> >
> >     # CVR's script to preserve Unicode characters
> >     perl utf2ent.pl main.tex > main-ent.tex
> >
> >     # tex4ht
> >     mk4ht oolatex main-ent "xhtml, charset=utf-8" -utf8
>
> There are three problems:
>
> 1. Macros with Unicode names are supported only by Unicode engines, i.e.
>    XeTeX and LuaTeX. mk4ht oolatex runs 8-bit pdflatex, so it can't really
>    support them.
>
> 2. utf2ent converts all Unicode characters to entities, including your
>    command, so you end up with something like '\entity{1589}' in your code.
>
> 3. $\langle$ and $\rangle$ produce wrong MathML code, see
>
>    https://puszcza.gnu.org.ua/bugs/?278
>
>    The ODT format uses MathML, so the result may be an invalid file.
>
> Now what can be done:
>
> You need to use a Unicode engine. That means LuaTeX at the moment, as
> XeTeX support is currently broken in tex4ht. Fortunately, you can
> use XeTeX to produce the PDF and only modify some macros for tex4ht.
>
> With LuaTeX, it is possible to keep Unicode characters without the need to
> call external scripts to convert them to Unicode entities. See
>
> http://michal-h21.github.io/samples/helpers4ht/fontspec.html
>
> for more details. I've modified your file to use alternative4ht and to
> fix the problem with angles. Two new macros are introduced: extlangle
> and extrangle, which are redefined in the config file to use XML
> entities directly, instead of math mode.
>
> I've also found a problem that the angles are wrongly swapped in the ODT
> and HTML. This is probably because they use the BIDI algorithm, so they
> don't expect that the angles have already been swapped by the user (you use
> \rangle#1\langle). I've redefined the commands for angles in the config
> file to use the opposite side from the one their name suggests,
> so they are rendered correctly.
>
> The last problem is that mk4ht doesn't support LuaTeX, so you need to
> use a different way to compile the document. You can use:
>
>     make4ht -ulm draft -c hello.cfg main.tex "xhtml,ooffice" "ooffice/! -cmozhtf -utf8" " -cooxtpipes -coo"
>
> (it might be best to save it as a script, as it is not really a
> human-friendly command call :)
>
> Modified main.tex and hello.cfg are attached. main.tex can be compiled
> with xelatex to PDF; all needed changes for tex4ht are in the hello.cfg
> file.
>
> Best regards,
> Michal
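
Following the suggestion above to save the make4ht call as a script, a minimal sketch might look like the following. The file name `build-odt.sh` and the function name `build_odt` are hypothetical; the make4ht arguments are copied unchanged from the quoted message, and the default config name assumes hello.cfg sits next to main.tex.

```shell
#!/bin/sh
# build-odt.sh (hypothetical name): wraps the make4ht call quoted above.
# Usage: sh build-odt.sh main.tex [hello.cfg]

build_odt() {
    src="$1"                 # TeX source, e.g. main.tex
    cfg="${2:-hello.cfg}"    # tex4ht config file; defaults to hello.cfg (assumption)

    # Arguments are taken verbatim from the quoted message. The leading
    # space in the last argument is part of the original command.
    make4ht -ulm draft -c "$cfg" "$src" \
        'xhtml,ooffice' \
        'ooffice/! -cmozhtf -utf8' \
        ' -cooxtpipes -coo'
}

# Run the build only when a source file is given on the command line.
if [ "$#" -gt 0 ]; then
    build_odt "$@"
fi
```

It would then be called as `sh build-odt.sh main.tex`, or with a second argument to point at a different config file.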
