On 06/01/2016 14:37, Jonathan Kew wrote:
> On 6/1/16 14:17, Behdad Esfahbod wrote:
>> On 16-01-05 09:17 PM, Jamie Dale wrote:
>>> I actually just wrote something to give me very similar information
>>> since I realised that my basic "this is a ligature" flag wasn't enough
>>> data, so each of my glyphs now contains the number of characters that
>>> the glyph was composed from. This, along with the cluster index of the
>>> glyph from the source text, and the reading direction of the glyph,
>>> allow me to work out which characters formed the glyph.
>>
>> Correct. That's pretty much the only way to do it.
>>
>
> Don't forget the added complication that there may be multiple glyphs
> with the same cluster value. E.g. given the text
>
> <U+0915, U+094D, U+0915, U+093F, U+0915>
>
> you're very likely to get two glyphs with cluster index zero, as in
> something like
>
> [imatra=0 | kka=0 | ka=4]
>
> but it's not at all clear from this how you'd determine which
> characters formed each glyph.
>
> JK
>
> _______________________________________________
> HarfBuzz mailing list
> [email protected]
> http://lists.freedesktop.org/mailman/listinfo/harfbuzz

Hi Jonathan

Yes, I've seen multiple glyphs with the same cluster value with mixed
English and fully-vowelled Arabic.

Is it technically possible to "enhance" HarfBuzz to provide an API that
gives you the list of input characters used to shape a particular glyph?
I really do not know enough about the internals of OpenType shaping to
know whether that's an impossible (or hugely complex) task.

Here's a test/debug sample from a libraqm run (which of course uses
HarfBuzz+FriBidi). I modified libraqm to provide HarfBuzz data about
glyph class.

Glyph information:
glyph [525] glyph class: 3 x_offset: 440 y_offset: 360 x_advance: 0 cluster value: [18]
glyph [2023] glyph class: 2 x_offset: 0 y_offset: 0 x_advance: 850 cluster value: [18]
glyph [529] glyph class: 3 x_offset: 450 y_offset: -150 x_advance: 0 cluster value: [16]
glyph [765] glyph class: 1 x_offset: 0 y_offset: 0 x_advance: 925 cluster value: [16]
glyph [525] glyph class: 3 x_offset: 140 y_offset: -280 x_advance: 0 cluster value: [14]
glyph [519] glyph class: 1 x_offset: -100 y_offset: 0 x_advance: 506 cluster value: [14]
glyph [3] glyph class: 1 x_offset: 0 y_offset: 0 x_advance: 413 cluster value: [13]
glyph [64] glyph class: 1 x_offset: 0 y_offset: 0 x_advance: 604 cluster value: [6]
glyph [73] glyph class: 1 x_offset: 0 y_offset: 0 x_advance: 778 cluster value: [7]
glyph [66] glyph class: 1 x_offset: 0 y_offset: 0 x_advance: 682 cluster value: [8]
glyph [71] glyph class: 1 x_offset: 0 y_offset: 0 x_advance: 367 cluster value: [9]
glyph [68] glyph class: 1 x_offset: 0 y_offset: 0 x_advance: 375 cluster value: [10]
glyph [78] glyph class: 1 x_offset: 0 y_offset: 0 x_advance: 522 cluster value: [11]
glyph [67] glyph class: 1 x_offset: 0 y_offset: 0 x_advance: 769 cluster value: [12]
glyph [3] glyph class: 1 x_offset: 0 y_offset: 0 x_advance: 413 cluster value: [5]
glyph [792] glyph class: 1 x_offset: 0 y_offset: 0 x_advance: 1317 cluster value: [4]
glyph [527] glyph class: 3 x_offset: 30 y_offset: 290 x_advance: 0 cluster value: [2]
glyph [804] glyph class: 1 x_offset: 0 y_offset: 0 x_advance: 217 cluster value: [2]
glyph [525] glyph class: 3 x_offset: -10 y_offset: 420 x_advance: 0 cluster value: [0]
glyph [486] glyph class: 1 x_offset: 0 y_offset: 0 x_advance: 293 cluster value: [0]

UTF-32 clusters: 18 18 16 16 14 14 13 06 07 08 09 10 11 12 05 04 02 02 00 00
UTF-8 clusters:  27 27 23 23 19 19 18 11 12 13 14 15 16 17 10 08 04 04 00 00

Cheers
Graham
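
For what it's worth, here is a rough Python sketch of what can be recovered from cluster values alone. It is not HarfBuzz or libraqm API; the function names are made up for illustration, and the text length of 20 code points for the dump above is my assumption, since the source string itself isn't shown. The idea is that a cluster covers the characters from its own value up to the next higher cluster value (or the end of the text), and that all glyphs sharing a cluster value share that whole character range.

```python
def cluster_char_ranges(glyph_clusters, text_len):
    """Map each distinct cluster value to the half-open range
    (start, end) of character indices that the cluster covers."""
    starts = sorted(set(glyph_clusters))
    ranges = {}
    for i, start in enumerate(starts):
        # A cluster ends where the next higher cluster begins,
        # or at the end of the text for the last cluster.
        end = starts[i + 1] if i + 1 < len(starts) else text_len
        ranges[start] = (start, end)
    return ranges


def glyphs_by_cluster(glyph_clusters):
    """Group glyph indices (in buffer order) by their cluster value."""
    groups = {}
    for glyph_index, cluster in enumerate(glyph_clusters):
        groups.setdefault(cluster, []).append(glyph_index)
    return groups


# UTF-32 cluster values copied from the dump above (buffer order).
dump_clusters = [18, 18, 16, 16, 14, 14, 13, 6, 7, 8, 9, 10, 11, 12,
                 5, 4, 2, 2, 0, 0]
ASSUMED_TEXT_LEN = 20  # assumption: the source string is not shown

dump_ranges = cluster_char_ranges(dump_clusters, ASSUMED_TEXT_LEN)
dump_groups = glyphs_by_cluster(dump_clusters)

# Jonathan's example: [imatra=0 | kka=0 | ka=4] from five characters.
jk_ranges = cluster_char_ranges([0, 0, 4], 5)
jk_groups = glyphs_by_cluster([0, 0, 4])
```

For Jonathan's example this gives characters 0 to 3 for cluster 0, shared by two glyphs, which is exactly the ambiguity he describes: the cluster values say which characters the pair of glyphs came from, but not how those characters divide between the two glyphs.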
