On Thu, Sep 4, 2014 at 8:00 AM, Jeffrey Kegler <[email protected]> wrote:
> Perhaps if we differentiate between "closed" and "open" spans -- "closed"
> ending in a consonant and "open" ending in a vowel.
>
>     All ::= Span
>     Span ::= Closed_Span | Open_Span
>     Closed_Span ::= Abbreviation
>     Closed_Span ::= Closed_Syllable
>     Closed_Span ::= Span Closed_Syllable
>     Closed_Span ::= Closed_Span Abbreviation
>     Open_Span ::= Span Open_Syllable
>     Open_Span ::= Open_Syllable
>     Abbreviation ::= C
>     Closed_Syllable ::= C V C
>     Open_Syllable ::= C V
>
> I'm a little busy now, so I didn't test this, but perhaps you get the idea
> -- the spans recurse, with open ones ending in a vowel and closed ones
> ending in a consonant. Abbreviations may only occur in two places --
> following a closed span, or at the very beginning of a span. Several
> abbreviations are allowed in a row, since an abbreviation can end a closed
> span.
>
> @rns: Does this work?
>

If a lone consonant can mean only an abbrev, then yes, I think:
https://gist.github.com/rns/4322555d750a9e83cd54

> I *think* it's unambiguous.
>

Yes, it's unambiguous; the output is in the comment, and I also added code
to test for ambiguous parses.

>
> -- jeffrey
>
> On 09/03/2014 09:35 PM, Andrew Dunbar wrote:
>
> Yes I'm vague on Data::Dumper and I don't know much about the workings of
> Marpa.
>
> I added on the example code for Marpa::R2::ASF so I can compare it with
> my real code.
> It doesn't seem to be ambiguous now but I actually can't see what's
> different about it.
> I'm not sure whether I simplified it too much when I made the analogy.
>
> I want rules that mean "only interpret a consonant as an abbreviation"
> when it can't be interpreted as part of a syllable.
>
> I don't know if that's possible of course (-:
>
> I'll see if I can come up with a better analogy based on some actual
> ambiguities I find in Lao.
>
> On Thursday, 4 September 2014 13:58:23 UTC+10, Jeffrey Kegler wrote:
>>
>> What rns did (as I read it) was list all the results of $slr->value().
>> The parse is unambiguous if and only if there is exactly one, which seems
>> to be the case here. (You've been away from Perl, so Data::Dumper may now
>> be hard to read, but you can confirm this for yourself by adding a line
>> before the dump of each value, as a "hi there", or giving a count.)
>>
>> Is your rule that you don't want to allow an abbreviation to follow a
>> vowel?
>>
>> -- jeffrey
>>
>> On 09/03/2014 08:24 PM, Andrew Dunbar wrote:
>>
>> Do we know if that's ambiguous? Don't we have to run it
>> through Marpa::R2::ASF to know?
>>
>> On Wednesday, 3 September 2014 20:10:42 UTC+10, rns wrote:
>>>
>>> Can you please look at this gist
>>> <https://gist.github.com/rns/fb6abf62a5fa779957ba>? The result is in
>>> the comment below it. This might be a solution provided that I've got the
>>> right idea.
>>>
>>> On Wed, Sep 3, 2014 at 11:44 AM, Andrew Dunbar <[email protected]>
>>> wrote:
>>>
>>>> I've come back to Perl after a long absence just to play with Marpa
>>>> because it looks like the most full featured Earley parser in any of the
>>>> programming languages I know.
>>>>
>>>> I'm interested in Earley specifically because it can handle ambiguity
>>>> and can produce a parse forest.
>>>>
>>>> I'm using it to investigate the syllable structure of the writing
>>>> system of the Lao language of Southeast Asia. Specifically to see whether
>>>> it's inherently ambiguous, and how.
>>>>
>>>> So far it works great and I'm glad I've come here from the Bison and
>>>> PEG grammars I was playing with earlier.
>>>>
>>>> But it seems that there might be two kinds of ambiguities, the kind
>>>> I'm looking for, and a kind that might be an artefact of Earley parsing or
>>>> of the way I've written the grammar.
>>>>
>>>> Without having to teach you Lao I'll attempt to analogize:
>>>>
>>>>     All ::= Syllable+
>>>>
>>>>     Syllable ::= C V C
>>>>                | C V
>>>>                | C
>>>>
>>>>     C ~ [bcdfghjklmnpqrstvwxyz]
>>>>     V ~ [aeiou]
>>>>
>>>> The "Syllable ::= C" rule is to allow lone initial consonants, as are
>>>> used occasionally for abbreviations.
>>>>
>>>> If my input string is "mat" I only want:
>>>>
>>>>     (Syllable (C m) (V a) (C t))
>>>>
>>>> But due to the abbreviation rule I also get a second unwanted parse:
>>>>
>>>>     (Syllable (C m) (V a))
>>>>     (Syllable (C t))
>>>>
>>>> I've been able to refactor my grammar to deal with other issues that
>>>> have appeared, but I can't seem to think of anything which accounts for
>>>> occasional abbreviations but doesn't generate a number of unwanted
>>>> alternative parses.
>>>>
>>>> Can I refactor my grammar or is there some other way to deal with
>>>> this but still generate all the other kinds of ambiguity that I am
>>>> interested in?
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "marpa parser" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> For more options, visit https://groups.google.com/d/optout.
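
For anyone who wants to compare the two grammars in this thread without a Marpa install, here is a minimal sketch that counts parse trees by brute force. It is written in Python rather than Perl and is not how Marpa works (Marpa is an Earley parser, and the gists above list the results of $slr->value()); it only illustrates the "exactly one parse means unambiguous" check. The names ORIGINAL, REFACTORED and count_parses are made up for this illustration, "Syllable+" is expanded by hand into left recursion, and Jeffrey's second Closed_Span rule is read as "Closed_Span ::= Closed_Syllable".

```python
from functools import lru_cache

CONSONANTS = set("bcdfghjklmnpqrstvwxyz")
VOWELS = set("aeiou")

# Andrew's analogy grammar; `Syllable+` is written out as left recursion.
ORIGINAL = {
    "All": [["Syllable"], ["All", "Syllable"]],
    "Syllable": [["C", "V", "C"], ["C", "V"], ["C"]],
}

# Jeffrey's open/closed span grammar.
REFACTORED = {
    "All": [["Span"]],
    "Span": [["Closed_Span"], ["Open_Span"]],
    "Closed_Span": [
        ["Abbreviation"],
        ["Closed_Syllable"],
        ["Span", "Closed_Syllable"],
        ["Closed_Span", "Abbreviation"],
    ],
    "Open_Span": [["Span", "Open_Syllable"], ["Open_Syllable"]],
    "Abbreviation": [["C"]],
    "Closed_Syllable": [["C", "V", "C"]],
    "Open_Syllable": [["C", "V"]],
}


def count_parses(grammar, text, start="All"):
    """Count distinct parse trees of `text`; assumes no empty rules and no unit-rule cycles."""

    @lru_cache(maxsize=None)
    def derive(symbol, i, j):
        # Number of ways `symbol` derives text[i:j].
        if symbol == "C":
            return int(j - i == 1 and text[i] in CONSONANTS)
        if symbol == "V":
            return int(j - i == 1 and text[i] in VOWELS)
        return sum(derive_seq(tuple(rhs), i, j) for rhs in grammar[symbol])

    @lru_cache(maxsize=None)
    def derive_seq(symbols, i, j):
        # Number of ways the symbol sequence derives text[i:j].
        if len(symbols) == 1:
            return derive(symbols[0], i, j)
        total = 0
        # Each symbol consumes at least one character, so the first symbol
        # must stop early enough to leave one character per remaining symbol.
        for k in range(i + 1, j - len(symbols) + 2):
            left = derive(symbols[0], i, k)
            if left:
                total += left * derive_seq(symbols[1:], k, j)
        return total

    return derive(start, 0, len(text))


if __name__ == "__main__":
    print(count_parses(ORIGINAL, "mat"))    # 2
    print(count_parses(REFACTORED, "mat"))  # 1
```

On "mat" the first grammar yields the two parses Andrew describes (one closed syllable, or an open syllable followed by a lone consonant), while the span grammar yields only the first, because an abbreviation is not allowed to follow an open span. A counter like this only checks the inputs it is given, so it cannot prove a grammar unambiguous in general.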
