On Tue, 15 Apr 2025 17:28:12 GMT, Nikita Gubarkov <ngubar...@openjdk.org> wrote:
>> It looks like this regression actually fits into a longer series of fixes / >> regressions in this area: >> >> - [JDK-4517298](https://bugs.openjdk.org/browse/JDK-4517298) fixed metrics >> for zero-width characters, but broke some ligatures / glyph substitutions >> - [JDK-7017058](https://bugs.openjdk.org/browse/JDK-7017058) fixed the >> ligatures / glyph substitutions, but broke some zero-width metrics >> - [JDK-8208377](https://bugs.openjdk.org/browse/JDK-8208377) fixed some >> metrics and rendering for zero-width characters, but broke some ligatures / >> glyph substitutions >> - Now, with this PR, we aim to fix the ligatures without re-breaking >> zero-width metrics and display >> >> We have two different types of use cases pulling `CharToGlyphMapper` in two >> different directions: the users who need raw, untransformed glyph info, and >> the users who need normalized / transformed glyph info. >> >> It looks to me like, in the current code base, the only `CharToGlyphMapper` >> user which requires raw font data is HarfBuzz (explicitly confirmed with the >> HarfBuzz team here: https://github.com/harfbuzz/harfbuzz/discussions/5234). >> >> The regression mechanism at play here is that the HarfBuzz font callbacks >> are currently providing HarfBuzz with transformed glyph info (e.g. ZWJ -> >> INVISIBLE_GLYPH_ID), which prevents HarfBuzz from recognizing and applying >> the correct font GSUB substitutions (which involve ZWJ). >> >> In order to fix this without (yet again) breaking metrics and display >> behavior elsewhere, I've added two methods to `CharToGlyphMapper` which >> provide access to raw glyph info, to be used by the HarfBuzz font callbacks: >> `charToGlyphRaw(int)` and `charToVariationGlyphRaw(int)`. >> >> Note two intricacies related to `CompositeGlyphMapper`: >> 1. We need to be careful to only cache raw (untransformed) values, to avoid >> conflicts between requests for a raw version of a glyph and a transformed >> version of the same glyph. Another option would have been two separate >> caches, but I don't think that's necessary. >> 2. Consumers who are using `CompositeGlyphMapper.SLOTMASK` to check glyph >> slots (e.g. `FontRunIterator` and `CTextPipe`) will "see" invisible glyphs >> as having come from slot 0. This isn't new, and I think it's OK, but >> something to be aware of. >> >> The glyph cache handling in `CCharToGlyphMapper` (for macOS) also requires >> care to avoid mixing value types. >> >> Please also note that I'm not sure if the tweak to `sunFont.c` is being >> tested, since FFM is being used by default for Harf... > > We had similar emoji-related regressions at JetBrains. Although our > font-related code diverged from OpenJDK a bit, porting this patch seems to > resolve them too. I am not an OpenJDK reviewer, but LGTM nevertheless. > @YaaZ Thanks for the information! > > @prrace Have you had a chance to look at this PR? It passed all the testing I did. I still need to look hard at the changes. ------------- PR Comment: https://git.openjdk.org/jdk/pull/24412#issuecomment-2826162827