[HarfBuzz] harfbuzz: Branch 'master' - 4 commits

Behdad Esfahbod Wed, 05 Sep 2012 14:56:49 -0700

 src/hb-ot-shape-complex-indic-machine.rl |    8 ++++----
 test/shaping/hb_test_tools.py            |    8 +++++---
 2 files changed, 9 insertions(+), 7 deletions(-)


New commits:
commit f0b8ed1b6dd9f1d2b9084c101a6fc5dee0cc22a8
Author: Behdad Esfahbod <beh...@behdad.org>
Date:   Wed Sep 5 17:32:57 2012 -0400

    [Indic] Allow "H,ZWJ,M"
    
    Uniscribe accepts a Halant,ZWJ before matras.  Allow that.
    
    BENGALI down from 295 to 291
    DEVANAGARI down from 69 to 57
    GUJARATI down from 19 to 17
    KANNADA down from 871 to 867
    MALAYALAM down from 340 to 337
    TELUGU down from 20 to 16
    
    Currently at:
    
    BENGALI: 353897 out of 354188 tests passed. 291 failed (0.0821598%)
    DEVANAGARI: 707337 out of 707394 tests passed. 57 failed (0.00805774%)
    GUJARATI: 366440 out of 366457 tests passed. 17 failed (0.00463902%)
    GURMUKHI: 60704 out of 60747 tests passed. 43 failed (0.0707854%)
    KANNADA: 951046 out of 951913 tests passed. 867 failed (0.0910798%)
    KHMER: 299077 out of 299124 tests passed. 47 failed (0.0157125%)
    LAO: 53611 out of 53644 tests passed. 33 failed (0.0615167%)
    MALAYALAM: 1047997 out of 1048334 tests passed. 337 failed (0.0321462%)
    ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
    SINHALA: 271666 out of 271847 tests passed. 181 failed (0.0665816%)
    TAMIL: 1091754 out of 1091754 tests passed. 0 failed (0%)
    TELUGU: 970557 out of 970573 tests passed. 16 failed (0.00164851%)
    TIBETAN: 208469 out of 208469 tests passed. 0 failed (0%)

diff --git a/src/hb-ot-shape-complex-indic-machine.rl b/src/hb-ot-shape-complex-indic-machine.rl
index bcc942c..03e3910 100644
--- a/src/hb-ot-shape-complex-indic-machine.rl
+++ b/src/hb-ot-shape-complex-indic-machine.rl
@@ -69,7 +69,7 @@ syllable_tail =  (Coeng (cn|V))? (SM.ZWNJ?)? (VD VD?)?;
 place_holder = NBSP | DOTTEDCIRCLE;
 halant_group = (z?.h.(ZWJ.N?)?);
 final_halant_group = halant_group | h.ZWNJ;
-halant_or_matra_group = (final_halant_group | matra_group{0,4});
+halant_or_matra_group = (final_halant_group | (h.ZWJ)? matra_group{0,4});
 
 
 consonant_syllable =   Repha? (cn.halant_group){0,4} cn A? halant_or_matra_group? syllable_tail;
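
A rough way to see what this hunk changes is to transcribe the affected
rules into a Python regular expression and try "C,H,ZWJ,M" against the old
and new forms. This is only a sketch: the one-letter codes are made up for
illustration, Coeng is left out of h, and only a bare consonant plus the
optional halant-or-matra group is modelled, not the full Ragel machine.

```python
import re

# Hypothetical one-letter codes for the character classes (not HarfBuzz names).
C, N, H, ZWJ, ZWNJ, M, RA = "C", "N", "H", "J", "Z", "M", "R"

z = f"[{ZWJ}{ZWNJ}]"
h = H  # assumption: Coeng omitted for brevity
halant_group = f"(?:{z}?{h}(?:{ZWJ}{N}?)?)"
final_halant_group = f"(?:{halant_group}|{h}{ZWNJ})"
forced_rakar = f"{ZWJ}{H}{ZWJ}{RA}"
matra_group = f"(?:{z}{{0,3}}{M}{N}?(?:{H}|{forced_rakar})?)"

# The rule before and after this commit.
old_tail = f"(?:{final_halant_group}|{matra_group}{{0,4}})"
new_tail = f"(?:{final_halant_group}|(?:{h}{ZWJ})?{matra_group}{{0,4}})"


def accepts(tail, seq):
    """True if seq is one consonant followed by an optional tail group."""
    return re.fullmatch(f"{C}{tail}?", seq) is not None


for name, tail in (("old", old_tail), ("new", new_tail)):
    print(name, accepts(tail, "CHJM"))
```

In this model only the new rule accepts C,H,ZWJ,M, while plain C,M and C,H
are accepted by both.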
commit 4ed717ef61813fa16cf74f2874848e9feb81568f
Author: Behdad Esfahbod <beh...@behdad.org>
Date:   Wed Sep 5 17:21:17 2012 -0400

    [Indic] Relax grammar
    
    Now that we insert dotted-circle, tests break more easily when our indic
    machine breaks.
    
    In particular, a few Devanagari tests were having sequences like
    "C,H,ZWJ,N", and because of the ZWJ the Nukta does NOT get reordered to
    before the Halant as the grammar used to expect...  Fixup.
    
    Another case is as simple as "C,ZWJ,SM".
    
    Fixes 10 out of 79 failures:
    
    DEVANAGARI: 707325 out of 707394 tests passed. 69 failed (0.00975411%)

diff --git a/src/hb-ot-shape-complex-indic-machine.rl b/src/hb-ot-shape-complex-indic-machine.rl
index 283a246..bcc942c 100644
--- a/src/hb-ot-shape-complex-indic-machine.rl
+++ b/src/hb-ot-shape-complex-indic-machine.rl
@@ -62,12 +62,12 @@ z = ZWJ|ZWNJ;                       # is_joiner
 h = H | Coeng;                 # is_halant_or_coeng
 reph = (Ra H | Repha);         # possible reph
 
-cn = c.n?;
+cn = c.ZWJ?.n?;
 forced_rakar = ZWJ H ZWJ Ra;
 matra_group = z{0,3}.M.N?.(H | forced_rakar)?;
 syllable_tail =  (Coeng (cn|V))? (SM.ZWNJ?)? (VD VD?)?;
 place_holder = NBSP | DOTTEDCIRCLE;
-halant_group = (z?.h.ZWJ?);
+halant_group = (z?.h.(ZWJ.N?)?);
 final_halant_group = halant_group | h.ZWNJ;
 halant_or_matra_group = (final_halant_group | matra_group{0,4});
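
The two sequences named in the commit message, "C,H,ZWJ,N" and "C,ZWJ,SM",
can be tried against a simplified transcription of the grammar before and
after this change. This is a sketch under assumptions: the one-letter codes
are invented, forced_rakar is dropped from matra_group, and syllable_tail is
cut down to its SM part.

```python
import re

# Hypothetical one-letter codes for the character classes.
C, N, H, ZWJ, ZWNJ, M, SM = "C", "N", "H", "J", "Z", "M", "S"
z = f"[{ZWJ}{ZWNJ}]"


def consonant_syllable(cn, halant_group):
    """Build a simplified consonant_syllable regex from the two changed rules."""
    final_halant_group = f"(?:{halant_group}|{H}{ZWNJ})"
    matra_group = f"(?:{z}{{0,3}}{M}{N}?{H}?)"   # forced_rakar omitted
    halant_or_matra = f"(?:{final_halant_group}|{matra_group}{{0,4}})"
    syllable_tail = f"(?:{SM}{ZWNJ}?)?"           # Coeng and VD parts omitted
    return re.compile(
        f"(?:{cn}{halant_group}){{0,4}}{cn}{halant_or_matra}?{syllable_tail}"
    )


old = consonant_syllable(cn=f"{C}{N}?",
                         halant_group=f"(?:{z}?{H}{ZWJ}?)")
new = consonant_syllable(cn=f"{C}{ZWJ}?{N}?",
                         halant_group=f"(?:{z}?{H}(?:{ZWJ}{N}?)?)")

for seq in ("CHJN", "CJS", "CHC"):
    print(seq, bool(old.fullmatch(seq)), bool(new.fullmatch(seq)))
```

In this model the first two sequences are rejected by the old rules and
accepted by the new ones; the plain conjunct C,H,C is accepted by both.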
 
commit aa7141efe49991a1160489106984e95163fe2ab8
Author: Behdad Esfahbod <beh...@behdad.org>
Date:   Wed Sep 5 15:54:21 2012 -0400

    [Indic] Fix Khmer syllable-final coeng-consonant
    
    Brings down Khmer failures from 162 to 47.
    
    KHMER: 299077 out of 299124 tests passed. 47 failed (0.0157125%)
    
    Also rebaselined some of the test files that had only-inherited lines.
    Removing those, the stats are:
    
    BENGALI: 353893 out of 354188 tests passed. 295 failed (0.0832891%)
    DEVANAGARI: 707315 out of 707394 tests passed. 79 failed (0.0111678%)
    GUJARATI: 366438 out of 366457 tests passed. 19 failed (0.00518478%)
    GURMUKHI: 60704 out of 60747 tests passed. 43 failed (0.0707854%)
    KANNADA: 951042 out of 951913 tests passed. 871 failed (0.0915%)
    KHMER: 299077 out of 299124 tests passed. 47 failed (0.0157125%)
    LAO: 53611 out of 53644 tests passed. 33 failed (0.0615167%)
    MALAYALAM: 1047994 out of 1048334 tests passed. 340 failed (0.0324324%)
    ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
    SINHALA: 271666 out of 271847 tests passed. 181 failed (0.0665816%)
    TAMIL: 1091754 out of 1091754 tests passed. 0 failed (0%)
    TELUGU: 970553 out of 970573 tests passed. 20 failed (0.00206064%)
    TIBETAN: 208469 out of 208469 tests passed. 0 failed (0%)
    
    Still some regressions, but some of the more egregious cases are
    addressed.

diff --git a/src/hb-ot-shape-complex-indic-machine.rl b/src/hb-ot-shape-complex-indic-machine.rl
index c9309e9..283a246 100644
--- a/src/hb-ot-shape-complex-indic-machine.rl
+++ b/src/hb-ot-shape-complex-indic-machine.rl
@@ -65,7 +65,7 @@ reph = (Ra H | Repha);                # possible reph
 cn = c.n?;
 forced_rakar = ZWJ H ZWJ Ra;
 matra_group = z{0,3}.M.N?.(H | forced_rakar)?;
-syllable_tail = (SM.ZWNJ?)? (Coeng (cn|V))? (VD VD?)?;
+syllable_tail =  (Coeng (cn|V))? (SM.ZWNJ?)? (VD VD?)?;
 place_holder = NBSP | DOTTEDCIRCLE;
 halant_group = (z?.h.ZWJ?);
 final_halant_group = halant_group | h.ZWNJ;
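
The effect of swapping the first two parts of syllable_tail can be checked
with a small regular-expression transcription. The one-letter codes are
made up for illustration and cn is reduced to "C N?"; treat this as a
sketch of the ordering change, not as the Ragel machine itself.

```python
import re

# Hypothetical one-letter codes for the character classes.
C, N, V, COENG, SM, ZWNJ, VD = "C", "N", "V", "K", "S", "Z", "D"

cn = f"{C}{N}?"
coeng_part = f"(?:{COENG}(?:{cn}|{V}))?"
sm_part = f"(?:{SM}{ZWNJ}?)?"
vd_part = f"(?:{VD}{VD}?)?"

old_tail = re.compile(sm_part + coeng_part + vd_part)  # SM before Coeng
new_tail = re.compile(coeng_part + sm_part + vd_part)  # Coeng before SM

seq = COENG + C + SM  # coeng-consonant followed by a syllable modifier
print(bool(old_tail.fullmatch(seq)), bool(new_tail.fullmatch(seq)))
```

With the old ordering a trailing Coeng,C,SM cannot be consumed as one tail;
with the new ordering it can.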
commit efb8d3eb713bca7cbfca41380a012bdb4d380e5c
Author: Behdad Esfahbod <beh...@behdad.org>
Date:   Wed Sep 5 15:50:47 2012 -0400

    Fixup test failure reporting
    
    After we implemented dotted-circle, we were still ignoring any tests
    that had dottedcircle in it for any of the shapers.  That meant that if
    we wrongly outputted dottedcircle, the test was being ignored.  Ouch!
    
    Fixing that shows regressions across the board.  Most are Uniscribe
    bugs: NOT inserting dotted-circle when it should.  Some are our
    machine bugs.  This is in fact a nice way to catch Indic-machine
    deficiencies and when I fix the regressions, our clusters should be
    much closer to Uniscribe.  For now, we regressed from:
    
    BENGALI: 353997 out of 354285 tests passed. 288 failed (0.0812905%)
    DEVANAGARI: 707339 out of 707394 tests passed. 55 failed (0.00777502%)
    GUJARATI: 366489 out of 366506 tests passed. 17 failed (0.0046384%)
    GURMUKHI: 60769 out of 60809 tests passed. 40 failed (0.0657797%)
    KANNADA: 951086 out of 951913 tests passed. 827 failed (0.0868777%)
    KHMER: 299106 out of 299124 tests passed. 18 failed (0.00601757%)
    LAO: 53611 out of 53644 tests passed. 33 failed (0.0615167%)
    MALAYALAM: 1048104 out of 1048416 tests passed. 312 failed (0.0297592%)
    ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
    SINHALA: 271747 out of 271847 tests passed. 100 failed (0.0367854%)
    TAMIL: 1091837 out of 1091837 tests passed. 0 failed (0%)
    TELUGU: 970558 out of 970573 tests passed. 15 failed (0.00154548%)
    TIBETAN: 208469 out of 208469 tests passed. 0 failed (0%)
    
    To:
    
    BENGALI: 353990 out of 354285 tests passed. 295 failed (0.0832663%)
    DEVANAGARI: 707315 out of 707394 tests passed. 79 failed (0.0111678%)
    GUJARATI: 366447 out of 366506 tests passed. 59 failed (0.016098%)
    GURMUKHI: 60707 out of 60809 tests passed. 102 failed (0.167738%)
    KANNADA: 951042 out of 951913 tests passed. 871 failed (0.0915%)
    KHMER: 298962 out of 299124 tests passed. 162 failed (0.0541581%)
    LAO: 53611 out of 53644 tests passed. 33 failed (0.0615167%)
    MALAYALAM: 1048074 out of 1048416 tests passed. 342 failed (0.0326206%)
    ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
    SINHALA: 271666 out of 271847 tests passed. 181 failed (0.0665816%)
    TAMIL: 1091835 out of 1091837 tests passed. 2 failed (0.000183178%)
    TELUGU: 970553 out of 970573 tests passed. 20 failed (0.00206064%)
    TIBETAN: 208469 out of 208469 tests passed. 0 failed (0%)
    
    Investigating.
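
The per-script deltas between two such reports can be computed
mechanically. The following sketch parses lines in the format shown above
and prints the change in failure count; the regular expression and the
helper name are assumptions made for this example, and only two scripts
from each report are included.

```python
import re

# Matches lines such as "KHMER: 299106 out of 299124 tests passed. 18 failed (...)".
LINE = re.compile(r"(\w+): (\d+) out of (\d+) tests passed\. (\d+) failed")


def parse(report):
    """Return {script: (failed, total)} for every stats line in report."""
    stats = {}
    for name, passed, total, failed in LINE.findall(report):
        # Sanity check: the three counts must be consistent.
        assert int(passed) + int(failed) == int(total)
        stats[name] = (int(failed), int(total))
    return stats


before = parse("""
KHMER: 299106 out of 299124 tests passed. 18 failed (0.00601757%)
TAMIL: 1091837 out of 1091837 tests passed. 0 failed (0%)
""")
after = parse("""
KHMER: 298962 out of 299124 tests passed. 162 failed (0.0541581%)
TAMIL: 1091835 out of 1091837 tests passed. 2 failed (0.000183178%)
""")

for name in before:
    (f0, total), (f1, _) = before[name], after[name]
    print(f"{name}: {f0} -> {f1} failures ({100 * f1 / total:.6g}%)")
```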

diff --git a/test/shaping/hb_test_tools.py b/test/shaping/hb_test_tools.py
index 1d1d62c..0b1ec00 100644
--- a/test/shaping/hb_test_tools.py
+++ b/test/shaping/hb_test_tools.py
@@ -295,9 +295,11 @@ class DiffHelpers:
        def test_passed (lines):
                lines = list (lines)
                # XXX This is a hack, but does the job for now.
-               if any (l.find("space|space") >= 0 for l in lines): return True
-               if any (l.find("uni25CC") >= 0 for l in lines): return True
-               if any (l.find("dottedcircle") >= 0 for l in lines): return True
+               if any (l.find("space|space") >= 0 for l in lines if l[0] == '+'): return True
+               if any (l.find("uni25CC") >= 0 for l in lines if l[0] == '+'): return True
+               if any (l.find("dottedcircle") >= 0 for l in lines if l[0] == '+'): return True
+               if any (l.find("glyph0") >= 0 for l in lines if l[0] == '+'): return True
+               if any (l.find("notdef") >= 0 for l in lines if l[0] == '+'): return True
                return all (l[0] == ' ' for l in lines)
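
For reference, the patched check can be written as a standalone function.
This is a sketch modelled on test_passed above, not the code in
hb_test_tools.py: the function name and the marker tuple are invented, and
startswith() is used in place of l[0], so an empty line counts as a failure
here instead of raising IndexError.

```python
# Glyph names that cause a differing test to be ignored, per the patch above.
IGNORED_MARKERS = ("space|space", "uni25CC", "dottedcircle", "glyph0", "notdef")


def hunk_passed(lines):
    """Return True if a diffed test counts as passed.

    lines are diff output lines prefixed with ' ', '-' or '+'.
    """
    lines = list(lines)
    plus_lines = [l for l in lines if l.startswith("+")]
    # Only markers on '+' lines make the test ignorable now.
    if any(marker in l for l in plus_lines for marker in IGNORED_MARKERS):
        return True
    # Otherwise every line must be unchanged context.
    return all(l.startswith(" ") for l in lines)


print(hunk_passed(["-ka=0", "+dottedcircle=0"]))  # marker on a '+' line
print(hunk_passed(["-dottedcircle=0", "+ka=0"]))  # marker only on a '-' line
```

The second call is the case the old code ignored: a dotted circle that
appears only on the '-' side is now reported as a failure.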
 
 