Re: [HarfBuzz] A problem in thai shaper
On Mon, Apr 16, 2012 at 09:08:49PM -0400, Behdad Esfahbod wrote:
> > Problem 2:
> >
> > When no consonant exists, the dotted circle should be inserted as the
> > base character. Finding the invalid combining marks should be the first
> > step for the shaping engine. Refer to
> > http://www.microsoft.com/typography/otfntdev/thaiot/shaping.aspx#comb
>
> Right. We do not handle invalid combining marks yet. That's something I want
> to do at some point but it's not high priority.

I don't know about Thai, but the handling of "invalid" Arabic combining marks in Uniscribe is completely brain dead and a real PITA, and I'd really like not to see HarfBuzz going there. A shaping engine is not a spell checker and should not enforce any input pattern.

http://www.microsoft.com/typography/OpenType%20Dev/arabic/shaping.mspx#invalid

Regards,
Khaled
___
HarfBuzz mailing list
HarfBuzz@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/harfbuzz
Re: [HarfBuzz] A problem in thai shaper
Hi,

Thanks for the email. My comments inline.

On 04/13/2012 09:41 AM, datao zhang wrote:
> So I think for the new Thai shaper, the valid composition of “consonant [1
> mandatory] + diacritic vowel [1 optional] + tone mark [1 optional]” should be
> set as the same cluster.

I would guess that our generic layer will already take care of this based on canonical combining classes? Do you have a test case that you want to see improved?

> Problem 2:
>
> When no consonant exists, the dotted circle should be inserted as the base
> character. Finding the invalid combining marks should be the first step for
> the shaping engine. Refer to
> http://www.microsoft.com/typography/otfntdev/thaiot/shaping.aspx#comb

Right. We do not handle invalid combining marks yet. That's something I want to do at some point but it's not high priority.

behdad
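For reference, the cluster rule quoted in this message (one mandatory consonant, then at most one diacritic vowel, then at most one tone mark) can be sketched as below. This is only an illustration, not how HarfBuzz forms clusters: `ThaiClass`, `classify` and `thai_clusters` are hypothetical names, and the code point ranges are a simplified reading of the Thai block (consonants U+0E01..U+0E2E, above/below vowel signs U+0E31 and U+0E34..U+0E39, tone marks U+0E48..U+0E4B).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified classification for illustration only;
// these are not HarfBuzz's own tables.
enum class ThaiClass { Consonant, Vowel, Tone, Other };

static ThaiClass classify (char32_t u)
{
  if (u >= 0x0E01 && u <= 0x0E2E) return ThaiClass::Consonant;  // KO KAI .. HO NOKHUK
  if (u == 0x0E31 || (u >= 0x0E34 && u <= 0x0E39))
    return ThaiClass::Vowel;                                    // above/below vowel signs
  if (u >= 0x0E48 && u <= 0x0E4B) return ThaiClass::Tone;       // MAI EK .. MAI CHATTAWA
  return ThaiClass::Other;
}

// Returns one cluster value per code point: the index of the code point
// that starts the cluster. A consonant absorbs at most one following vowel
// sign and then at most one tone mark; anything else is its own cluster.
std::vector<unsigned> thai_clusters (const std::u32string &text)
{
  std::vector<unsigned> out (text.size ());
  std::size_t i = 0;
  while (i < text.size ()) {
    std::size_t start = i++;
    if (classify (text[start]) == ThaiClass::Consonant) {
      if (i < text.size () && classify (text[i]) == ThaiClass::Vowel) i++;
      if (i < text.size () && classify (text[i]) == ThaiClass::Tone) i++;
    }
    for (std::size_t j = start; j < i; j++)
      out[j] = static_cast<unsigned> (start);
  }
  return out;
}
```

In this sketch a vowel sign or tone mark with no preceding consonant ends up in a cluster of its own; that is the "invalid combining mark" case from Problem 2, for which the quoted proposal would insert a dotted circle as the base.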
Re: [HarfBuzz] Problem in complex indic
Thanks Dean. Fixed.

behdad

On 04/15/2012 08:03 AM, datao zhang wrote:
> Hi:
>
> Problem about finding the vowel syllable:
>
> If the Indic shaper of HarfBuzz is following the OT specification of
> Microsoft, then the following rule in “hb-ot-shape-complex-indic-machine.rl”
> should be changed:
>
> vowel_syllable = (Ra H)? V N? (z.H.c | ZWJ.c)? matra_group*
> syllable_tail %(found_vowel_syllable);
>
> =>
>
> vowel_syllable = (Ra H)? V N? (z?.H.c | ZWJ.c)? matra_group*
> syllable_tail %(found_vowel_syllable);
>
> Please refer to
> http://www.microsoft.com/typography/otfntdev/devanot/shaping.aspx
>
> Br,
> Dean
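To see what the change from `z.H.c` to `z?.H.c` accepts, here is a rough sketch that models only the middle of the rule with `std::regex`. The one-letter encoding is hypothetical (V = independent vowel, H = halant, c = consonant, j = ZWJ, n = ZWNJ), `z` is assumed to mean "ZWJ or ZWNJ", and `(Ra H)?`, `N?`, `matra_group*` and `syllable_tail` are left out. Ragel scanners do not behave exactly like a whole-string regex match, so treat this as an approximation of the grammar, not of the generated state machine.

```cpp
#include <regex>
#include <string>

// Hypothetical one-letter encoding of character classes (not from HarfBuzz):
//   V = independent vowel, H = halant, c = consonant, j = ZWJ, n = ZWNJ.
// The Ragel class z is assumed to be "ZWJ or ZWNJ", written here as [jn].
static const std::regex old_rule ("V([jn]Hc|jc)?");   // V (z.H.c  | ZWJ.c)?
static const std::regex new_rule ("V([jn]?Hc|jc)?");  // V (z?.H.c | ZWJ.c)?

// True if the whole class string is one vowel syllable under the old rule.
bool matches_old (const std::string &classes)
{
  return std::regex_match (classes, old_rule);
}

// True if the whole class string is one vowel syllable under the new rule.
bool matches_new (const std::string &classes)
{
  return std::regex_match (classes, new_rule);
}
```

Under these assumptions the sequence vowel, halant, consonant (`"VHc"`) is rejected by the old rule and accepted by the new one, while sequences that include a joiner (`"VjHc"`, `"Vjc"`) are accepted by both.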
[HarfBuzz] harfbuzz-ng: Branch 'master' - 2 commits
 src/hb-ot-shape-complex-indic-machine.rl | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

New commits:
commit 9ceca3aeb14cc096f5f87660cf7351bc35073084
Author: Behdad Esfahbod
Date:   Mon Apr 16 21:05:51 2012 -0400

    Fix ragel regexp in vowel-based syllable

    As reported by datao zhang on the mailing list.

diff --git a/src/hb-ot-shape-complex-indic-machine.rl b/src/hb-ot-shape-complex-indic-machine.rl
index 417880b..6406c24 100644
--- a/src/hb-ot-shape-complex-indic-machine.rl
+++ b/src/hb-ot-shape-complex-indic-machine.rl
@@ -67,7 +67,7 @@ action found_non_indic { found_non_indic (map, buffer, mask_array, last, p); }
 action next_syllable { buffer->merge_clusters (last, p); last = p; }
 consonant_syllable = (c.N? (H.z?|z.H))* c.N? A? (H.z? | matra_group*)? syllable_tail %(found_consonant_syllable);
-vowel_syllable = (Ra H)? V N? (z.H.c | ZWJ.c)? matra_group* syllable_tail %(found_vowel_syllable);
+vowel_syllable = (Ra H)? V N? (z?.H.c | ZWJ.c)? matra_group* syllable_tail %(found_vowel_syllable);
 standalone_cluster = (Ra H)? NBSP N? (z? H c)? matra_group* syllable_tail %(found_standalone_cluster);
 non_indic = X %(found_non_indic);

commit b870afcd1b436614af95db6dc297e54c8f03f0cd
Author: Behdad Esfahbod
Date:   Mon Apr 16 21:05:11 2012 -0400

    Rewrite ragel expression to better match the one on MS spec

    https://www.microsoft.com/typography/otfntdev/devanot/shaping.aspx

diff --git a/src/hb-ot-shape-complex-indic-machine.rl b/src/hb-ot-shape-complex-indic-machine.rl
index 7af23c1..417880b 100644
--- a/src/hb-ot-shape-complex-indic-machine.rl
+++ b/src/hb-ot-shape-complex-indic-machine.rl
@@ -66,7 +66,7 @@ action found_non_indic { found_non_indic (map, buffer, mask_array, last, p); }
 action next_syllable { buffer->merge_clusters (last, p); last = p; }
-consonant_syllable = (c.N? (z.H|H.z?))* c.N? A? (H.z? | matra_group*)? syllable_tail %(found_consonant_syllable);
+consonant_syllable = (c.N? (H.z?|z.H))* c.N? A? (H.z? | matra_group*)? syllable_tail %(found_consonant_syllable);
 vowel_syllable = (Ra H)? V N? (z.H.c | ZWJ.c)? matra_group* syllable_tail %(found_vowel_syllable);
 standalone_cluster = (Ra H)? NBSP N? (z? H c)? matra_group* syllable_tail %(found_standalone_cluster);
 non_indic = X %(found_non_indic);
[HarfBuzz] harfbuzz-ng: Branch 'master' - 6 commits
 src/hb-ot-shape.cc | 2
 src/hb-private.hh | 8 +
 test/shaping/texts/in-tree/shaper-default/MANIFEST | 1
 test/shaping/texts/in-tree/shaper-default/script-japanese/MANIFEST | 1
 test/shaping/texts/in-tree/shaper-default/script-japanese/misc/MANIFEST | 2
 test/shaping/texts/in-tree/shaper-default/script-japanese/misc/kazuraki-liga-lines.txt | 8 +
 test/shaping/texts/in-tree/shaper-default/script-japanese/misc/kazuraki-liga.txt | 53 ++
 util/hb-shape.cc | 8 -
 util/hb-view.hh | 2
 util/helper-cairo.cc | 22 +++-
 util/helper-cairo.hh | 3
 util/options.cc | 19 ++-
 util/options.hh | 26 +++-
 util/view-cairo.cc | 15 +-
 util/view-cairo.hh | 3
 15 files changed, 139 insertions(+), 34 deletions(-)

New commits:
commit 95cefdf96efe43a44133aa8a186155cf4e63e2b7
Author: Behdad Esfahbod
Date:   Mon Apr 16 18:08:20 2012 -0400

    Add --utf8-clusters

    Also fix cairo cluster generation.

diff --git a/util/hb-shape.cc b/util/hb-shape.cc
index a76a778..b22bc1f 100644
--- a/util/hb-shape.cc
+++ b/util/hb-shape.cc
@@ -36,7 +36,8 @@ struct output_buffer_t : output_options_t, format_options_t
   void init (const font_options_t *font_opts);
   void consume_line (hb_buffer_t *buffer,
                      const char *text,
-                     unsigned int text_len);
+                     unsigned int text_len,
+                     hb_bool_t utf8_clusters);
   void finish (const font_options_t *font_opts);
   protected:
@@ -57,11 +58,12 @@ output_buffer_t::init (const font_options_t *font_opts)
 void
 output_buffer_t::consume_line (hb_buffer_t *buffer,
                                const char *text,
-                               unsigned int text_len)
+                               unsigned int text_len,
+                               hb_bool_t utf8_clusters)
 {
   line_no++;
   g_string_set_size (gs, 0);
-  serialize_line (buffer, line_no, text, text_len, font, gs);
+  serialize_line (buffer, line_no, text, text_len, font, utf8_clusters, gs);
   fprintf (fp, "%s", gs->str);
 }
diff --git a/util/hb-view.hh b/util/hb-view.hh
index 68a5dd8..66d955b 100644
--- a/util/hb-view.hh
+++ b/util/hb-view.hh
@@ -65,7 +65,7 @@ struct hb_view_t
                           buffer))
           fail (FALSE, "All shapers failed");
-        output.consume_line (buffer, text, text_len);
+        output.consume_line (buffer, text, text_len, shaper.utf8_clusters);
       }
       hb_buffer_destroy (buffer);
diff --git a/util/helper-cairo.cc b/util/helper-cairo.cc
index abb8c15..9374d9e 100644
--- a/util/helper-cairo.cc
+++ b/util/helper-cairo.cc
@@ -301,7 +301,8 @@ helper_cairo_line_from_buffer (helper_cairo_line_t *l,
                                hb_buffer_t *buffer,
                                const char *text,
                                unsigned int text_len,
-                               double scale)
+                               double scale,
+                               hb_bool_t utf8_clusters)
 {
   memset (l, 0, sizeof (*l));
@@ -349,27 +350,38 @@ helper_cairo_line_from_buffer (helper_cairo_line_t *l,
     hb_bool_t backward = HB_DIRECTION_IS_BACKWARD (hb_buffer_get_direction (buffer));
     l->cluster_flags = backward ? CAIRO_TEXT_CLUSTER_FLAG_BACKWARD : (cairo_text_cluster_flags_t) 0;
     unsigned int cluster = 0;
+    const char *start = l->utf8, *end = start;
     l->clusters[cluster].num_glyphs++;
     if (backward) {
       for (i = l->num_glyphs - 2; i >= 0; i--) {
         if (hb_glyph[i].cluster != hb_glyph[i+1].cluster) {
           g_assert (hb_glyph[i].cluster > hb_glyph[i+1].cluster);
-          l->clusters[cluster].num_bytes += hb_glyph[i].cluster - hb_glyph[i+1].cluster;
+          if (utf8_clusters)
+            end = start + hb_glyph[i].cluster - hb_glyph[i+1].cluster;
+          else
+            end = g_utf8_offset_to_pointer (start, hb_glyph[i].cluster - hb_glyph[i+1].cluster);
+          l->clusters[cluster].num_bytes = end - start;
+          start = end;
           cluster++;
         }
         l->clusters[cluster].num_glyphs++;
       }
-      l->clusters[cluster].num_bytes += text_len - hb_glyph[0].clus