On 07/25/2012 07:17 PM, Khaled Hosny wrote:
> On Sun, Jul 22, 2012 at 11:37:23PM -0400, Behdad Esfahbod wrote:
>> Hi Khaled,
>>
>> On 07/21/2012 05:49 AM, Khaled Hosny wrote:
>>> How do I map output glyphs back to input characters? I assume I've to
>>> use clusters for that, but I can't make much sense of the cluster
>>> numbers I'm seeing and don't seem to find any explanation for them.
>>
>> When you add text to a hb_buffer_t, you set a cluster number for each
>> character. The functions hb_buffer_add_utf* implicitly use the index into
>> the input string for the cluster. Ie. when using the UTF-8 version, UTF-8
>> indices are used.
>>
>> Note that hb-view/hb-shape by default use UTF-32 cluster numbers (ie.
>> character-count instead of byte-count). You can change that using
>> --utf8-clusters.
>
> I’m using UTF-16 (playing with porting LibreOffice to HarfBuzz), so how
> surrogate pairs are handled?

See bottom of hb-buffer.cc. "cluster" values after shaping hook back to UTF-16
index in the original.

If you want to be more impactful, don't port LibreOffice, port iculayout!
It's probably 400 lines of code...

behdad

>> The shaping process implicitly segments the input text + output glyphs in a
>> series of clusters. So you can think of, for LTR text, first cluster
>> followed by second cluster, followed by third cluster, etc, where each
>> cluster contains a number of characters and a number of glyphs.
>>
>> Now, the hb_glyph_info_t::cluster member after shaping simply points to the
>> minimum value of that member for all the characters that belong to the
>> cluster.
>>
>> For RTL it's similar, though in reverse direction.
>>
>> Quick example. If you add text for "differ", then initially characters get
>> cluster values 0,1,2,3,4,5 respectively. After shaping, if the 'ff' ligature
>> was formed, you will get five glyphs, with cluster values 0,1,2,4,5. This
>> means that the two characters that originally had cluster values 2 and 3 are
>> represented by the sole glyph having the cluster value 2.
>>
>> Hope that helps.
>
> Thanks Behdad, this was very helpful.
>
> Regards,
> Khaled
>
_______________________________________________
HarfBuzz mailing list
http://lists.freedesktop.org/mailman/listinfo/harfbuzz
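
The "differ" example above can be worked through in a few lines. What follows is a minimal sketch in plain Python, not HarfBuzz code: it does no shaping, and the cluster values are typed in by hand from the example. The helper names cluster_ranges and utf16_cluster_starts are hypothetical. The sketch assumes that each cluster covers the input indices from its own value up to the next larger cluster value (which is what "minimum value of all the characters in the cluster" implies for contiguous clusters), and that a character encoded as a UTF-16 surrogate pair is numbered by the index of its first code unit.

```python
def utf16_cluster_starts(text):
    """Initial cluster value of each character when indices count UTF-16
    code units (assumption: a surrogate pair is numbered by its first unit)."""
    starts = []
    index = 0
    for ch in text:
        starts.append(index)
        # Characters outside the BMP take two UTF-16 code units.
        index += 2 if ord(ch) > 0xFFFF else 1
    return starts


def cluster_ranges(glyph_clusters, text_length):
    """Map the cluster values of the shaped glyphs to half-open index
    ranges (start, end) into the input text.

    Sorting first makes the result independent of glyph order, so the
    same function handles LTR and RTL output.
    """
    starts = sorted(set(glyph_clusters))
    ends = starts[1:] + [text_length]
    return list(zip(starts, ends))


if __name__ == "__main__":
    text = "differ"
    # Cluster values after shaping, copied from the example (ff ligature formed).
    shaped = [0, 1, 2, 4, 5]
    for start, end in cluster_ranges(shaped, len(text)):
        print(start, repr(text[start:end]))
    # The range starting at 2 covers two characters: 'ff'.
```

For UTF-16 input, text_length would be the length in code units rather than in characters, and the ranges would then index into the UTF-16 array.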
