Hi all,

I've finally finished the Htfgen project [1]. Its objective is to automate the creation of HTF font mapping files. tex4ht uses these files to map character codes in DVI files to Unicode.

There are two new scripts: scanfdfile and dvitohtf. The first one searches for declared fonts in FD files; the other generates a literate TeX file for HTF generation. Sample usage is as follows:

    cat /usr/local/texlive/2018/texmf-dist/tex/latex/ebgaramond/*.fd | scanfdfile | dvitohtf > ebgaramont-htf.tex
    tex ebgaramont-htf.tex

This will create HTF files for all detected fonts defined in the FD files for EB Garamond.

dvitohtf can also generate HTF files for fonts that are missing in a DVI file. So if tex4ht reports missing HTF files, it can be used directly on the DVI file:

    dvitohtf sample.dvi > missing.tex
    tex missing.tex

dvitohtf supports both virtual and TFM fonts. It looks for a virtual font first; the TFM file is used only when no VF is found. It goes through all fonts referenced in the virtual font and tries to find the corresponding .enc files in pdftex.map. The .enc files contain glyph lists, which are then mapped to Unicode. It also parses the .pfb file for the font family name and tries to detect the style (italic, bold, small caps) from the full font name saved in the .pfb file. It computes hashes of the font tables so that duplicate font tables aren't written; fonts with the same characters just link to the first font that used that table.

If no .enc file is found, the font cannot be supported. Mappings between glyphs and Unicode can also be missing. The missing mappings are reported in the TeX file. Htfgen contains large mapping files, but some fonts use custom glyphs that don't have a Unicode equivalent, for example Q_u ligatures. In this case the mapping must be added by hand to glyphlists/glyphlist-fixes.txt.

It works reasonably well for fonts generated by Fontinst, because they usually use standard glyph names, come with .enc files, etc. For complex virtual fonts, especially math fonts, it fails. HTF files for such fonts still need to be created by hand.

What to do now?

There are some wrong HTF files in the tex4ht sources; for example, Linux Libertine support is wrong for some ligatures. I am sure there will be more examples, especially among fonts with a large number of ligatures. Support for them was added a few years ago, but only in the T1 font encoding.

We should remove HTF generation for these files from the huge literate sources for fonts and create a smaller literate TeX file for each of these fonts. This should speed up the tex4ht build, and it should be easier to manage.

Any volunteers are welcome.

Best regards,
Michal

[1] https://github.com/michal-h21/htfgen
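
To illustrate the hash-based deduplication of font tables mentioned above, here is a minimal Python sketch. It is not taken from Htfgen. The function names, the sample font names, the table representation (one Unicode string per character slot) and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib
import json


def table_hash(table):
    """Return a stable digest for a font table (a list of strings, one per slot)."""
    # json.dumps gives an unambiguous serialization of the list of strings.
    serialized = json.dumps(table, ensure_ascii=True)
    return hashlib.sha256(serialized.encode("ascii")).hexdigest()


def dedupe_fonts(fonts):
    """Keep the first font seen for each distinct table; link later ones to it.

    `fonts` maps a (hypothetical) font name to its table. The result maps each
    name to ("table", table) for the first occurrence of a table, or to
    ("link", first_name) when an identical table was already seen.
    """
    first_seen = {}  # digest -> name of the first font with that table
    result = {}
    for name, table in fonts.items():
        digest = table_hash(table)
        if digest in first_seen:
            result[name] = ("link", first_seen[digest])
        else:
            first_seen[digest] = name
            result[name] = ("table", table)
    return result


# Hypothetical sample data: two fonts share a table, one differs.
fonts = {
    "fontA-regular": ["A", "B", "fi"],
    "fontA-italic": ["A", "B", "fi"],
    "fontA-smallcaps": ["a", "b", "fi"],
}
deduped = dedupe_fonts(fonts)
for name, (kind, value) in deduped.items():
    print(name, kind, value)
```

In this sketch only the first font with a given table keeps its table; every later font with an identical table stores just the name of that first font.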
