It looks like this regression actually fits into a longer series of fixes / regressions in this area:
- [JDK-4517298](https://bugs.openjdk.org/browse/JDK-4517298) fixed metrics for zero-width characters, but broke some ligatures / glyph substitutions - [JDK-7017058](https://bugs.openjdk.org/browse/JDK-7017058) fixed the ligatures / glyph substitutions, but broke some zero-width metrics - [JDK-8208377](https://bugs.openjdk.org/browse/JDK-8208377) fixed some metrics and rendering for zero-width characters, but broke some ligatures / glyph substitutions - Now, with this PR, we aim to fix the ligatures without re-breaking zero-width metrics and display We have two different types of use cases pulling `CharToGlyphMapper` in two different directions: the users who need raw, untransformed glyph info, and the users who need normalized / transformed glyph info. It looks to me like, in the current code base, the only `CharToGlyphMapper` user which requires raw font data is HarfBuzz (explicitly confirmed with the HarfBuzz team here: https://github.com/harfbuzz/harfbuzz/discussions/5234). The regression mechanism at play here is that the HarfBuzz font callbacks are currently providing HarfBuzz with transformed glyph info (e.g. ZWJ -> INVISIBLE_GLYPH_ID), which prevents HarfBuzz from recognizing and applying the correct font GSUB substitutions (which involve ZWJ). In order to fix this without (yet again) breaking metrics and display behavior elsewhere, I've added two methods to `CharToGlyphMapper` which provide access to raw glyph info, to be used by the HarfBuzz font callbacks: `charToGlyphRaw(int)` and `charToVariationGlyphRaw(int)`. Note two intricacies related to `CompositeGlyphMapper`: 1. We need to be careful to only cache raw (untransformed) values, to avoid conflicts between requests for a raw version of a glyph and a transformed version of the same glyph. Another option would have been two separate caches, but I don't think that's necessary. 2. Consumers who are using `CompositeGlyphMapper.SLOTMASK` to check glyph slots (e.g. `FontRunIterator` and `CTextPipe`) will "see" invisible glyphs as having come from slot 0. This isn't new, and I think it's OK, but something to be aware of. The glyph cache handling in `CCharToGlyphMapper` (for macOS) also requires care to avoid mixing value types. Please also note that I'm not sure if the tweak to `sunFont.c` is being tested, since FFM is being used by default for HarfBuzz integration. (Is there a plan to remove the JNI version soon?) This PR includes a self-contained regression test. It includes a small font created just for this test, which exercises the ligature / glyph substitution infrastructure. The font tests, including the new regression test, all pass locally on Linux, Windows and macOS (`make test TEST="jtreg:test/jdk/java/awt/font"`). Interestingly, the changes for JDK-7017058 (mentioned above) included a test (`ZWJLigatureTest`) which I think would have caught this last regression, but it depends on optional Windows fonts which I guess do not exist on any commonly-used test infrastructure. This should not be an issue with the new test, since it does not depend on any external fonts. ------------- Commit messages: - Finish macOS implementation - Differentiate CharToGlyphMapper users who want raw vs normalized glyph info Changes: https://git.openjdk.org/jdk/pull/24412/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=24412&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8353230 Stats: 305 lines in 10 files changed: 229 ins; 27 del; 49 mod Patch: https://git.openjdk.org/jdk/pull/24412.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/24412/head:pull/24412 PR: https://git.openjdk.org/jdk/pull/24412