On 8/29/2023 8:59 PM, Heiko Oberdiek via luatex wrote:
Hello,

On 2023-08-29 20:29, Joseph Wright wrote:
On 29/08/2023 19:27, Heiko Oberdiek via luatex wrote:

using LuaTeX to review the glyphs of a font, I discovered an oddity about U+0387 ANO TELEIA. LuaTeX shows U+00B7 MIDDLE DOT instead.

         \symbol{"00B7}% MIDDLE DOT
         \symbol{"0387}% ANO TELEIA

From UnicodeData.txt:

     0387;GREEK ANO TELEIA;Po;0;ON;00B7;;;;N;;;;;

so it looks like it's a simple normalisation.
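
A quick way to check that mapping outside TeX is Python's standard unicodedata module. This is only a sketch: the helper name describe is made up for illustration, and the result reflects the Unicode database bundled with the Python build in use, not anything LuaTeX or a macro package does internally.

```python
import unicodedata

ANO_TELEIA = "\u0387"  # GREEK ANO TELEIA
MIDDLE_DOT = "\u00b7"  # MIDDLE DOT


def describe(ch):
    """Collect the UnicodeData properties relevant here (hypothetical helper)."""
    return {
        "code": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch),
        # Field 5 of UnicodeData.txt; an empty string means "no decomposition".
        "decomposition": unicodedata.decomposition(ch),
        # Code points that remain after canonical composition (NFC).
        "nfc": [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFC", ch)],
    }


info = describe(ANO_TELEIA)
print(info)
```

Because the decomposition is a canonical singleton, any canonical normalization form replaces U+0387 with U+00B7, which would match the substitution described above if some layer normalizes the text.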

Start of the UnicodeData.txt format description (https://www.unicode.org/reports/tr44/#UnicodeData.txt):
   [0] Code value
   [1] Character name
   [2] General category
   [3] Canonical combining classes
   [4] Bidirectional category
   [5] Character decomposition
   ...
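
To make the field numbering concrete, here is a minimal sketch that splits the UnicodeData.txt line quoted earlier into the first six fields. The names FIELD_NAMES and parse_unicode_data_line are invented for this example; only the field order comes from the format description.

```python
# Names for fields [0]..[5], following the format description above.
FIELD_NAMES = [
    "code_value",
    "name",
    "general_category",
    "combining_class",
    "bidi_category",
    "decomposition",
]


def parse_unicode_data_line(line):
    """Map the first six semicolon-separated fields to their names."""
    fields = line.strip().split(";")
    # zip() stops after the six names; the remaining fields are ignored.
    return dict(zip(FIELD_NAMES, fields))


record = parse_unicode_data_line("0387;GREEK ANO TELEIA;Po;0;ON;00B7;;;;N;;;;;")
print(record["decomposition"])
```

Field [5] is the one that matters here: it holds 00B7 with no compatibility tag in angle brackets, so the decomposition is canonical.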

In the LuaTeX manual, I found:

| Normalization of the Unicode input is on purpose not built-in and
| can be handled by a macro package during callback processing.
| We have made some practical choices and the user has to
| live with those.

The TeX input above, however, is plain ASCII. Therefore, any normalization of the file contents should not matter.

Of course, I do not want to have any decomposition that replaces
the glyph with a different character. That would make reviewing
the original glyph impossible.

Indeed, it's not a LuaTeX issue. Rendering in ConTeXt gives two different glyphs for the two code points mentioned.

Hans

-----------------------------------------------------------------
                                          Hans Hagen | PRAGMA ADE
              Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
       tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------
