On Thu, 3 Apr 2025 11:23:42 GMT, Daniel Gredler <dgred...@openjdk.org> wrote:

> It looks like this regression actually fits into a longer series of fixes / 
> regressions in this area:
> 
> - [JDK-4517298](https://bugs.openjdk.org/browse/JDK-4517298) fixed metrics 
> for zero-width characters, but broke some ligatures / glyph substitutions
> - [JDK-7017058](https://bugs.openjdk.org/browse/JDK-7017058) fixed the 
> ligatures / glyph substitutions, but broke some zero-width metrics
> - [JDK-8208377](https://bugs.openjdk.org/browse/JDK-8208377) fixed some 
> metrics and rendering for zero-width characters, but broke some ligatures / 
> glyph substitutions
> - Now, with this PR, we aim to fix the ligatures without re-breaking 
> zero-width metrics and display
> 
> We have two different types of use cases pulling `CharToGlyphMapper` in two 
> different directions: the users who need raw, untransformed glyph info, and 
> the users who need normalized / transformed glyph info.
> 
> It looks to me like, in the current code base, the only `CharToGlyphMapper` 
> user which requires raw font data is HarfBuzz (explicitly confirmed with the 
> HarfBuzz team here: https://github.com/harfbuzz/harfbuzz/discussions/5234).
> 
> The regression mechanism at play here is that the HarfBuzz font callbacks are 
> currently providing HarfBuzz with transformed glyph info (e.g. ZWJ -> 
> INVISIBLE_GLYPH_ID), which prevents HarfBuzz from recognizing and applying 
> the correct font GSUB substitutions (which involve ZWJ).
> 
> In order to fix this without (yet again) breaking metrics and display 
> behavior elsewhere, I've added two methods to `CharToGlyphMapper` which 
> provide access to raw glyph info, to be used by the HarfBuzz font callbacks: 
> `charToGlyphRaw(int)` and `charToVariationGlyphRaw(int)`.
> 
> Note two intricacies related to `CompositeGlyphMapper`:
> 1. We need to be careful to only cache raw (untransformed) values, to avoid 
> conflicts between requests for a raw version of a glyph and a transformed 
> version of the same glyph. Another option would have been two separate 
> caches, but I don't think that's necessary.
> 2. Consumers who are using `CompositeGlyphMapper.SLOTMASK` to check glyph 
> slots (e.g. `FontRunIterator` and `CTextPipe`) will "see" invisible glyphs as 
> having come from slot 0. This isn't new, and I think it's OK, but something 
> to be aware of.
> 
> The glyph cache handling in `CCharToGlyphMapper` (for macOS) also requires 
> care to avoid mixing value types.
> 
> Please also note that I'm not sure if the tweak to `sunFont.c` is being 
> tested, since FFM is being used by default for HarfBuzz integration. (Is 
> there a plan to remove...

By the way, I see that in each implementation, both `charToGlyph` and 
`charToGlyphRaw` call a common method, like `getGlyph(int uniciode, boolean 
raw)`. At first there was just `charToGlyph`, then `charToVariationGlyph` was 
added and now you added a "raw" version for each of them, I see that in the 
future we will need other variants and how it's already starting an exponential 
explosion. Overriding all of those methods in each implementation brings quite 
a bit of boilerplate, and it becomes easier to miss something. Maybe take a 
step back and refactor this into a single `charToGlyph(int unicode, int 
variationSelector, boolean raw)` version?
Also, this `raw` parameter only really controls `isDefaultIgnorable` check in 
the end of each method. Maybe we could factor this out without bringing it 
separately into each mapper implementation?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24412#issuecomment-2838175747

Reply via email to