There were already translations (HTML or Unicode or both) for several
specific accented characters.  I discovered how to translate accent macros
in general, using Unicode characters with combining class "above", and
added those translations as the default.  I did that for the accents \`
(grave), \' (acute), \hat (circumflex), \tilde, \~, \bar, \= (macron, or
short bar), \overbar (long bar), \dot, and \check (caron).
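
As an illustration of the combining-character approach, here is a minimal
Python sketch.  It is not the code in the patch: the table name
COMBINING_ABOVE and the function apply_accent are hypothetical, and the
choice of U+0304 for \bar and \= and U+0305 for \overbar is my assumption
based on the "short bar" / "long bar" descriptions above.

```python
import unicodedata

# Hypothetical table: accent macro name -> Unicode combining character.
# Every character here has canonical combining class 230 ("above").
COMBINING_ABOVE = {
    "`": "\u0300",        # combining grave accent
    "'": "\u0301",        # combining acute accent
    "hat": "\u0302",      # combining circumflex accent
    "tilde": "\u0303",    # combining tilde
    "~": "\u0303",        # combining tilde
    "bar": "\u0304",      # combining macron (assumed: short bar)
    "=": "\u0304",        # combining macron (assumed: short bar)
    "overbar": "\u0305",  # combining overline (assumed: long bar)
    "dot": "\u0307",      # combining dot above
    "check": "\u030c",    # combining caron
}


def apply_accent(macro, base):
    """Append the combining mark for `macro` to `base`, then compose (NFC).

    NFC yields a single precomposed character where Unicode defines one;
    otherwise the base letter followed by the combining mark is returned.
    """
    return unicodedata.normalize("NFC", base + COMBINING_ABOVE[macro])


if __name__ == "__main__":
    for macro, base in [("~", "u"), ("check", "s"), ("overbar", "x")]:
        result = apply_accent(macro, base)
        print(macro, base, [f"U+{ord(c):04X}" for c in result])
```

The advantage of a default like this is that it works for any base letter,
while specific translations can still override it where needed.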

I corrected a few translations:

 The translations for \~u and \~U had the hex Unicode values, but lacked
the "x".

 \dot was translated as a common period, but should be an accent.

 The translation for \models (as U+22a8) was incorrect.  It was
 changed to U+22a7, as per https://www.compart.com/en/unicode/U+22A8.
 (U+22a8 is the symbol for "true", which is similar to the one for
 "models" but not as tall.  There's no record of a symbol "not
 models".  U+22ad (decimal 8877) is the Unicode character for "not
 true", which is the symbol for "true" with a stroke added.  However,
 neither \true nor \false is a standard macro in LaTeX, so there is no
 need to provide those translations.)
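
The corrections above can be checked with a short Python sketch.  It
assumes the \~u translation was written as an HTML numeric character
reference; the exact string in the original file is not quoted here, so
"&#169;" is only an example of what a hex value without the "x" looks like.

```python
import html
import unicodedata

# A numeric reference without "x" is read as decimal, so the hex value
# for u-with-tilde (U+0169) silently becomes a different character.
wrong = html.unescape("&#169;")    # decimal 169
right = html.unescape("&#x169;")   # hexadecimal 169
print(unicodedata.name(wrong))
print(unicodedata.name(right))

# The official Unicode names of the code points discussed for \models.
for cp in (0x22A7, 0x22A8, 0x22AD):
    print(f"U+{cp:04X} (decimal {cp}): {unicodedata.name(chr(cp))}")
```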

 I added translations for dotless i and j without accents, \div, \ddagger,
\LongLeftArrow, and many more macros - all the plain TeX symbols in the
document "Every symbol (most symbols) defined by unicode-math" by Will
Robertson, v0.8m of 2018-07-29, found here:

http://mirrors.ibiblio.org/CTAN/macros/latex/contrib/unicode-math/unimath-symbols.pdf

There were Unicode translations for all the symbols.  Where the HTML
representation merely specified the numeric value for the Unicode
character, I just put the Unicode translation in the "Special characters"
or "Math macros" section.  Where there were also HTML characters with alpha
names, I added one translation to the "Unicode" section and the other to
the "HTML entities" section.  Some tools handle one type of translation
better than the other, so users should be able to indicate a preference.
In fact, I note that Chrome, Chromium, and Firefox browsers on Linux fail
to render these HTML characters:
&velip;
&doublelongleftarrow;
&doublelongrightarrow;
&doublelongleftrightarrow;
I believe those characters are correct.  The browsers handle the Unicode
translations just fine.
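
One possible explanation, offered only as a hypothesis: HTML named
character references are case-sensitive, and browsers accept only the
names in the HTML5 list.  The sketch below compares the four names against
the table in Python's standard html.entities module; the helper names
case_variants and near_misses are made up for this example.

```python
import difflib
from html.entities import html5

# The four entity names reported above, without the leading "&".
REPORTED = [
    "velip;",
    "doublelongleftarrow;",
    "doublelongrightarrow;",
    "doublelongleftrightarrow;",
]


def case_variants(name):
    """HTML5 entity names that match `name` when letter case is ignored."""
    return [k for k in html5 if k.lower() == name.lower()]


def near_misses(name):
    """HTML5 entity names spelled similarly to `name`."""
    return difflib.get_close_matches(name, list(html5), n=3)


if __name__ == "__main__":
    for name in REPORTED:
        print(name, "defined:", name in html5,
              "case variants:", case_variants(name),
              "near misses:", near_misses(name))
```

If a name is not defined but has a case variant or a near miss, the
spelling in the translation table is worth a second look.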

The attached patch implements these changes, and would close the bug.

Attachment: accent-translations
Description: Binary data