RFR: 8353230: Emoji rendering regression after JDK-8208377

Daniel Gredler Thu, 03 Apr 2025 04:39:51 -0700

It looks like this regression actually fits into a longer series of fixes / 
regressions in this area:


- [JDK-4517298](https://bugs.openjdk.org/browse/JDK-4517298) fixed metrics for 
zero-width characters, but broke some ligatures / glyph substitutions
- [JDK-7017058](https://bugs.openjdk.org/browse/JDK-7017058) fixed the 
ligatures / glyph substitutions, but broke some zero-width metrics
- [JDK-8208377](https://bugs.openjdk.org/browse/JDK-8208377) fixed some metrics 
and rendering for zero-width characters, but broke some ligatures / glyph 
substitutions
- Now, with this PR, we aim to fix the ligatures without re-breaking zero-width 
metrics and display

We have two different types of use cases pulling `CharToGlyphMapper` in two 
different directions: the users who need raw, untransformed glyph info, and the 
users who need normalized / transformed glyph info.

It looks to me like, in the current code base, the only `CharToGlyphMapper` 
user which requires raw font data is HarfBuzz (explicitly confirmed with the 
HarfBuzz team here: https://github.com/harfbuzz/harfbuzz/discussions/5234).

The regression mechanism at play here is that the HarfBuzz font callbacks are 
currently providing HarfBuzz with transformed glyph info (e.g. ZWJ -> 
INVISIBLE_GLYPH_ID), which prevents HarfBuzz from recognizing and applying the 
correct font GSUB substitutions (which involve ZWJ).

In order to fix this without (yet again) breaking metrics and display behavior 
elsewhere, I've added two methods to `CharToGlyphMapper` which provide access 
to raw glyph info, to be used by the HarfBuzz font callbacks: 
`charToGlyphRaw(int)` and `charToVariationGlyphRaw(int)`.

Note two intricacies related to `CompositeGlyphMapper`:
1. We need to be careful to only cache raw (untransformed) values, to avoid 
conflicts between requests for a raw version of a glyph and a transformed 
version of the same glyph. Another option would have been two separate caches, 
but I don't think that's necessary.
2. Consumers who are using `CompositeGlyphMapper.SLOTMASK` to check glyph slots 
(e.g. `FontRunIterator` and `CTextPipe`) will "see" invisible glyphs as having 
come from slot 0. This isn't new, and I think it's OK, but something to be 
aware of.

The glyph cache handling in `CCharToGlyphMapper` (for macOS) also requires care 
to avoid mixing value types.

Please also note that I'm not sure if the tweak to `sunFont.c` is being tested, 
since FFM is being used by default for HarfBuzz integration. (Is there a plan 
to remove the JNI version soon?)

This PR includes a self-contained regression test. It includes a small font 
created just for this test, which exercises the ligature / glyph substitution 
infrastructure. The font tests, including the new regression test, all pass 
locally on Linux, Windows and macOS (`make test 
TEST="jtreg:test/jdk/java/awt/font"`).

Interestingly, the changes for JDK-7017058 (mentioned above) included a test 
(`ZWJLigatureTest`) which I think would have caught this last regression, but 
it depends on optional Windows fonts which I guess do not exist on any 
commonly-used test infrastructure. This should not be an issue with the new 
test, since it does not depend on any external fonts.

-------------

Commit messages:
 - Finish macOS implementation
 - Differentiate CharToGlyphMapper users who want raw vs normalized glyph info

Changes: https://git.openjdk.org/jdk/pull/24412/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24412&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8353230
  Stats: 305 lines in 10 files changed: 229 ins; 27 del; 49 mod
  Patch: https://git.openjdk.org/jdk/pull/24412.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/24412/head:pull/24412

PR: https://git.openjdk.org/jdk/pull/24412

RFR: 8353230: Emoji rendering regression after JDK-8208377

Reply via email to