Re: [HarfBuzz] A couple of clarifications regarding HarfBuzz

2010-10-21 Thread Tom Hacohen
Hi Behdad,

Thanks a lot for your response, my comments are below:

On Wed, 2010-10-20 at 20:17 -0400, Behdad Esfahbod wrote:
 Yes, that's it.  It's more than plain reversal at the end though.  You get the
 direction from the output of UAX#9 (eg. FriBidi).
What does it do more than a plain reversal? And yes, I got the direction
from FriBiDi.
 
 Language is used to do language-specific adjustments when appropriate.  You
 typically just pass the locale or whatever your higher-level tells you (think
 of lang attribute in html) to hb_language_from_string.
As I thought, thanks, I wasn't thinking about languages using the same
script like many of the Latin languages and their ligatures.

 
 Some of the fallbacks and specific details wouldn't work.  For example,
 mirroring would not work, which means that you would get incorrect result when
 brackets are used in Arabic.
 
 Also, Jonathan Kew has some code in Firefox to implement those.  You may want
 to check them out.
Thank you very much, I will.

 
 HarfBuzz does the right thing no matter what you pass in.   So you can safely
 pass 0.  String length in characters would be most appropriate if you have it.
I assumed HarfBuzz does well anyway, but I want the fastest way
possible. Ok then, I have the string's length (as it's needed for
buffer_add anyway).

 Those are OpenType features.  You can ignore them for now I would say.
Thank you very much.

 The output glyphs have a member called ->cluster, which points to the start
 index of the cluster a glyph is part of.
Oh, very nice, thanks.
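
To make the cluster idea concrete, here is a minimal sketch in plain C. It
assumes only that each output glyph carries a cluster value as described
above. The struct and function names below are hypothetical stand-ins for
illustration, not HarfBuzz types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a shaped glyph: only the fields needed here. */
struct glyph {
    unsigned glyph_id; /* glyph index in the font */
    unsigned cluster;  /* start index (in the input text) of its cluster */
};

/* Count the clusters in a shaped run by counting runs of consecutive
 * glyphs that share the same cluster value. */
static size_t count_cluster_runs(const struct glyph *glyphs, size_t n)
{
    assert(glyphs != NULL || n == 0);
    size_t runs = 0;
    for (size_t i = 0; i < n; i++) {
        /* A new cluster starts at the first glyph, or whenever the
         * cluster value differs from the previous glyph's. */
        if (i == 0 || glyphs[i].cluster != glyphs[i - 1].cluster)
            runs++;
    }
    return runs;
}
```

Glyphs that share a cluster value came from the same stretch of input text,
so cursor movement and selection would normally treat them as one unit.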

 The low-level API to fetch that information from GDEF is available through
 hb_ot_layout_get_lig_carets(), however, very few fonts provide such
 information.  It's common to just divide the width by the number of graphemes.
graphemes being non diacritic glyphs?

Thanks a lot,
Tom.

___
HarfBuzz mailing list
HarfBuzz@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/harfbuzz


Re: [HarfBuzz] A couple of clarifications regarding HarfBuzz

2010-10-21 Thread Behdad Esfahbod
On 10/21/10 04:10, Tom Hacohen wrote:

 Language is used to do language-specific adjustments when appropriate.  You
 typically just pass the locale or whatever your higher-level tells you (think
 of lang attribute in html) to hb_language_from_string.

 As I thought, thanks, I wasn't thinking about languages using the same
 script like many of the Latin languages and their ligatures.

It's more than just Latin.


 HarfBuzz does the right thing no matter what you pass in.   So you can safely
 pass 0.  String length in characters would be most appropriate if you have it.

 I assumed HarfBuzz does well anyway, but I want the fastest way
 possible. Ok then, I have the string's length (as it's needed for
 buffer_add anyway).

If you have UTF-32 or UTF-16, just pass the length indeed.  For UTF-8, passing
the byte length will overshoot by a factor of 2 or 3 for anything but ASCII.
You need the # of characters, not # of bytes, etc.
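
For UTF-8 input, one way to get the character count is to count every byte
that is not a continuation byte.  This is a minimal sketch that assumes the
input is valid, NUL-terminated UTF-8; utf8_char_count is a hypothetical
helper name, not part of HarfBuzz.

```c
#include <assert.h>
#include <stddef.h>

/* Count Unicode code points in a NUL-terminated UTF-8 string.
 * Continuation bytes have the bit pattern 10xxxxxx, so every byte that
 * does NOT match that pattern starts a new code point.
 * Assumes the input is valid UTF-8. */
static size_t utf8_char_count(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}
```

For example, a four-letter Hebrew word takes 8 bytes in UTF-8 but is only
4 characters, so passing the byte length would overshoot by a factor of 2.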


 The low-level API to fetch that information from GDEF is available through
 hb_ot_layout_get_lig_carets(), however, very few fonts provide such
 information.  It's common to just divide the width by the number of graphemes.

 graphemes being non diacritic glyphs?

Graphemes are what a user (of a language) considers to be one entity.  Unicode
defines them:

  http://www.unicode.org/reports/tr29/

We may add code in HarfBuzz for that in the future.  A cheap heuristic is to
check for combining-class=0.
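
A rough sketch of both ideas in plain C: count graphemes with the
combining-class=0 heuristic, then split a ligature's advance evenly to place
the carets.  Everything here is an assumption for illustration: the helper
names are hypothetical, and toy_ccc only knows U+0300..U+0314, so real code
would take the combining class from a proper Unicode database.

```c
#include <assert.h>
#include <stddef.h>

/* Toy combining-class lookup, for demonstration only.  It reports class
 * 230 for U+0300..U+0314 and 0 for everything else, which is NOT complete
 * Unicode data. */
static unsigned toy_ccc(unsigned cp)
{
    return (cp >= 0x0300 && cp <= 0x0314) ? 230 : 0;
}

/* Heuristic grapheme count: every character whose combining class is 0
 * is taken to start a new grapheme. */
static size_t count_graphemes(const unsigned *text, size_t len,
                              unsigned (*ccc)(unsigned))
{
    assert(ccc != NULL);
    assert(text != NULL || len == 0);
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        if (ccc(text[i]) == 0)
            n++;
    }
    return n;
}

/* Split an advance width evenly between n_graphemes and write the interior
 * caret offsets (n_graphemes - 1 of them, at most out_cap) into out.
 * Returns the number of offsets written. */
static size_t even_carets(int advance, size_t n_graphemes,
                          int *out, size_t out_cap)
{
    assert(out != NULL || out_cap == 0);
    if (n_graphemes < 2)
        return 0; /* a single grapheme has no interior caret */
    size_t n = n_graphemes - 1;
    if (n > out_cap)
        n = out_cap;
    for (size_t i = 0; i < n; i++) {
        out[i] = (int)((long long)advance * (long long)(i + 1)
                       / (long long)n_graphemes);
    }
    return n;
}
```

The even split is only a fallback for fonts that provide no caret data.  The
heuristic miscounts wherever grapheme boundaries do not line up with
combining class 0, which is what the UAX #29 rules are for.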

behdad


 Thanks a lot,
 Tom.
 
 