Re: [HarfBuzz] What is wrong with unicode in harfbuzz?

Kelvin Ma Thu, 16 Jun 2016 20:10:41 -0700

OK, thanks! In that case y'all should fix the example at
https://github.com/behdad/harfbuzz/blob/master/src/sample.py, because
it just uses string.encode('utf-x'), which is confusing.


On Thu, Jun 16, 2016 at 10:44 PM, Khaled Hosny <[email protected]>
wrote:

> On Thu, Jun 16, 2016 at 09:35:03PM -0400, Kelvin Ma wrote:
> > When I run a simple harfbuzz shaping like
> >
> > string = 'In begíffi our '
> > > utfstring = string.encode('utf-8')
> > >
> > > buf = hb.buffer_create()
> > > hb.buffer_add_utf8(buf, utfstring, 0, -1)
> > > hb.buffer_guess_segment_properties(buf)
> > >
> > > hb.shape(font, buf, [])
> > > infos = hb.buffer_get_glyph_infos(buf)
> > > positions = hb.buffer_get_glyph_positions(buf)
> > >
> >
> > I get
> >
> > len(string) = 15
> > len(infos) = 13
> > len(positions) = 13
> >
> > which makes sense: three characters became one glyph, so 15 characters
> > make 13 glyphs. But the cluster values are wrong because they don’t line
> > up with the character indexes any more (because of the accented character).
> >
> > But then when I change it to utf-16
> >
> > string = 'In begíffi our '
> > > utfstring = string.encode('utf-16')
>
> You need a list of UTF-16 code units here, but string.encode('utf-16')
> just gives you an array of UTF-16 bytes. You need something like:
>
> utfstring = [int.from_bytes(c.encode("utf-16be"), byteorder='big') for c in string]
>
> (This does not handle non-BMP characters that will be encoded as two
> UTF-16 code units, but you get the idea).
>
> > > hb.buffer_add_utf16(buf, utfstring, 0, -1)
>
> And pass the list length here (or add a null character at the end of the
> list).
>
> > And when I change it to utf-32, which this post
> > <http://comments.gmane.org/gmane.comp.freedesktop.harfbuzz/1836> says
> > should make it give character counts, but
>
> Same as above.
>
> Regards,
> Khaled
>
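
For reference, the conversion Khaled describes can be sketched in pure Python so that it also covers non-BMP characters. This is only a sketch: the helper names utf16_code_units and utf32_code_points are made up for illustration and are not part of the HarfBuzz bindings. Going by Khaled's note, the resulting list and its length would then be passed to hb.buffer_add_utf16 or hb.buffer_add_utf32 in place of the encoded bytes.

```python
import struct


def utf16_code_units(text):
    """Return the UTF-16 code units of text as a list of ints (hypothetical helper)."""
    # An explicit byte order avoids the BOM that encode('utf-16') prepends.
    data = text.encode("utf-16-le")
    # Each code unit is two bytes; non-BMP characters yield a surrogate pair.
    return list(struct.unpack("<%dH" % (len(data) // 2), data))


def utf32_code_points(text):
    """Return the Unicode code points of text as a list of ints (hypothetical helper)."""
    return [ord(c) for c in text]


string = "In beg\u00edffi our "
utf16 = utf16_code_units(string)
utf32 = utf32_code_points(string)
```

For the sample string both lists have one entry per character, so cluster values based on them would line up with Python string indexes. A string containing non-BMP characters gives a longer UTF-16 list, because each such character takes two code units.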

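On the original UTF-8 question: with hb.buffer_add_utf8 the cluster values are byte offsets into the encoded string, which is why they drift past the accented character. If staying with UTF-8, one option is to translate those offsets back to character indexes. A rough sketch follows, with byte_to_char_index being a name invented here rather than anything from the bindings:

```python
def byte_to_char_index(text):
    """Map each UTF-8 byte offset where a character starts to that character's index."""
    mapping = {}
    offset = 0
    for index, char in enumerate(text):
        mapping[offset] = index
        # Advance by the number of bytes this character occupies in UTF-8.
        offset += len(char.encode("utf-8"))
    return mapping


string = "In beg\u00edffi our "
lookup = byte_to_char_index(string)
# A cluster value reported by the shaper would be looked up as lookup[cluster].
```

Only offsets at which a character starts appear in the table, which should be enough as long as the cluster values point at character starts.
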
_______________________________________________
HarfBuzz mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/harfbuzz
