On Thu, Sep 4, 2014 at 6:24 AM, Andrew Dunbar <[email protected]> wrote:
> Do we know if that's ambiguous?

Yes, only one alternative is listed.

> Don't we have to run it through Marpa::R2::ASF to know?

We test for ambiguity by calling ambiguous() on the recognizer (example
<https://gist.github.com/rns/4322555d750a9e83cd54>) and can then use
Marpa::R2::ASF to traverse the parse forest (example 1
<https://github.com/jeffreykegler/Marpa--R2/blob/master/cpan/t/sl_panda.t>,
example 2
<https://github.com/jeffreykegler/Marpa--R2/blob/master/cpan/t/sl_panda1.t>).

BTW, I took it (perhaps wrongly) that a lone consonant means itself or an
abbreviation, and that a sequence of consonants means a syllable. The syllable
type (CVC, CV, C) can then be inferred from the rule name (e.g. Syllable_CVC)
when traversing the parse tree. Another approach is to use a separate parser
for the syllable sequence.

> On Wednesday, 3 September 2014 20:10:42 UTC+10, rns wrote:
>
>> Can you please look at this gist
>> <https://gist.github.com/rns/fb6abf62a5fa779957ba>? The result is in the
>> comment below it. This might be a solution provided that I've got the right
>> idea.
>>
>> On Wed, Sep 3, 2014 at 11:44 AM, Andrew Dunbar <[email protected]>
>> wrote:
>>
>>> I've come back to Perl after a long absence just to play with Marpa
>>> because it looks like the most full-featured Earley parser in any of the
>>> programming languages I know.
>>>
>>> I'm interested in Earley specifically because it can handle ambiguity
>>> and can produce a parse forest.
>>>
>>> I'm using it to investigate the syllable structure of the writing system
>>> of the Lao language of Southeast Asia. Specifically to see whether it's
>>> inherently ambiguous, and how.
>>>
>>> So far it works great and I'm glad I've come here from the Bison and PEG
>>> grammars I was playing with earlier.
>>>
>>> But it seems that there might be two kinds of ambiguities, the kind I'm
>>> looking for, and a kind that might be an artefact of Earley parsing or of
>>> the way I've written the grammar.
>>>
>>> Without having to teach you Lao I'll attempt to analogize:
>>>
>>> All ::= Syllable+
>>>
>>> Syllable ::= C V C
>>>            | C V
>>>            | C
>>>
>>> C ~ [bcdfghjklmnpqrstvwxyz]
>>> V ~ [aeiou]
>>>
>>> The "Syllable ::= C" rule is to allow lone initial consonants, as are
>>> used occasionally for abbreviations.
>>>
>>> If my input string is "mat" I only want:
>>>
>>> (Syllable (C m) (V a) (C t))
>>>
>>> But due to the abbreviation rule I also get a second unwanted parse:
>>>
>>> (Syllable (C m) (V a))
>>> (Syllable (C t))
>>>
>>> I've been able to refactor my grammar to deal with other issues that
>>> have appeared, but I can't seem to think of anything which accounts for
>>> occasional abbreviations but doesn't generate a number of unwanted
>>> alternative parses.
>>>
>>> Can I refactor my grammar or is there some other way to deal with this
>>> but still generate all the other kinds of ambiguity that I am interested in?
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "marpa parser" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>>
>>> For more options, visit https://groups.google.com/d/optout.
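
For readers without Marpa at hand, the ambiguity in the toy grammar quoted above can be reproduced by brute-force enumeration. The following is only a rough sketch in Python, not Marpa and not anything posted in this thread; the names `all_parses`, `_tails` and `matches` are made up for illustration. It lists every way of splitting the input into CVC, CV or C syllables, so an input counts as ambiguous when more than one split comes back.

```python
# Hypothetical stand-alone illustration of the toy grammar; this is NOT Marpa.
#   All      ::= Syllable+
#   Syllable ::= C V C | C V | C
CONSONANTS = set("bcdfghjklmnpqrstvwxyz")
VOWELS = set("aeiou")
CLASSES = {"C": CONSONANTS, "V": VOWELS}

# Syllable shapes, in the order the grammar lists them.
SHAPES = ("CVC", "CV", "C")


def matches(shape, chunk):
    """True if chunk has exactly the consonant/vowel pattern of shape."""
    if len(shape) != len(chunk):
        return False
    return all(ch in CLASSES[kind] for kind, ch in zip(shape, chunk))


def _tails(text):
    """Every split of text into zero or more syllables."""
    if not text:
        return [[]]  # exactly one way to parse nothing
    results = []
    for shape in SHAPES:
        head = text[:len(shape)]
        if matches(shape, head):
            for rest in _tails(text[len(shape):]):
                results.append([(shape, head)] + rest)
    return results


def all_parses(text):
    """Every split of text into one or more syllables (Syllable+)."""
    return _tails(text) if text else []


if __name__ == "__main__":
    for parse in all_parses("mat"):
        print(parse)
```

For "mat" this yields the two readings from the original question: one CVC syllable, and a CV syllable followed by a lone C. It only enumerates splits of this toy grammar; it does not tell the wanted kind of ambiguity from the unwanted kind.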
