On Thu, Sep 4, 2014 at 8:00 AM, Jeffrey Kegler <
[email protected]> wrote:

>  Perhaps if we differentiate between "closed" and "open" spans -- "closed"
> ending in a consonant and "open" ending in a vowel.
>
> All ::= Span
> Span ::= Closed_Span | Open_Span
> Closed_Span ::= Abbreviation
> Closed_Span ::= Closed_Syllable
> Closed_Span ::= Span Closed_Syllable
> Closed_Span ::= Closed_Span Abbreviation
> Open_Span ::= Span Open_Syllable
> Open_Span ::= Open_Syllable
> Abbreviation ::= C
> Closed_Syllable ::= C V C
> Open_Syllable ::= C V
>
> I'm a little busy now, so I didn't test this, but perhaps you get the idea
> -- the spans recurse, with open ones ending in a vowel and closed ones
> ending in a consonant.  Abbreviations may only occur in two places --
> following a closed span, or at the very beginning of a span.  Several
> abbreviations are allowed in a row, since an abbreviation can end a closed
> span.
>
> @rns: Does this work?
>
If a lone consonant can only be an abbreviation, then yes, I think so:
https://gist.github.com/rns/4322555d750a9e83cd54


> I *think* it's unambiguous.
>
Yes, it's unambiguous. The output is in the comment below the gist; I also
added code to test for ambiguous parses.
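
For anyone who wants to cross-check this without Marpa, here is a minimal
sketch in plain Python that counts parse trees by brute force. It is not the
code from the gist and it does not use Marpa at all. The names count_parses,
ORIGINAL and REVISED are made up for this illustration, "Syllable+" is
rewritten as a left-recursive pair of rules (which should not change the
number of parses), and the missing "=" in the second Closed_Span rule is
assumed to be a typo for "::=".

```python
from functools import lru_cache

VOWELS = set("aeiou")
CONSONANTS = set("bcdfghjklmnpqrstvwxyz")


def matches(sym, ch):
    """True if the single character ch belongs to terminal class sym."""
    return ch in (VOWELS if sym == "V" else CONSONANTS)


def count_parses(rules, start, text):
    """Count the distinct parse trees of text derived from start.

    rules maps each nonterminal to a list of right-hand sides (tuples of
    symbols); "C" and "V" are the terminals. Assumes no empty rules and no
    cycles of unit rules, which holds for both grammars below.
    """

    @lru_cache(maxsize=None)
    def sym_count(sym, i, j):
        if sym in ("C", "V"):
            return int(j - i == 1 and matches(sym, text[i]))
        return sum(seq_count(rhs, i, j) for rhs in rules[sym])

    @lru_cache(maxsize=None)
    def seq_count(rhs, i, j):
        if len(rhs) == 1:
            return sym_count(rhs[0], i, j)
        total = 0
        # The first symbol takes text[i:k]; every remaining symbol needs
        # at least one character, which bounds k from above.
        for k in range(i + 1, j - (len(rhs) - 1) + 1):
            left = sym_count(rhs[0], i, k)
            if left:
                total += left * seq_count(rhs[1:], k, j)
        return total

    return sym_count(start, 0, len(text))


# Andrew's original analogy; "All ::= Syllable+" written as left recursion.
ORIGINAL = {
    "All": [("Syllable",), ("All", "Syllable")],
    "Syllable": [("C", "V", "C"), ("C", "V"), ("C",)],
}

# Jeffrey's closed/open span grammar.
REVISED = {
    "All": [("Span",)],
    "Span": [("Closed_Span",), ("Open_Span",)],
    "Closed_Span": [
        ("Abbreviation",),
        ("Closed_Syllable",),
        ("Span", "Closed_Syllable"),
        ("Closed_Span", "Abbreviation"),
    ],
    "Open_Span": [("Span", "Open_Syllable"), ("Open_Syllable",)],
    "Abbreviation": [("C",)],
    "Closed_Syllable": [("C", "V", "C")],
    "Open_Syllable": [("C", "V")],
}

print(count_parses(ORIGINAL, "All", "mat"))  # 2: "mat" and "ma" + "t"
print(count_parses(REVISED, "All", "mat"))   # 1: only the closed syllable
```

A count of 1 for one input only shows that this input is unambiguous, not that
the grammar is; looping over more test strings would give more confidence, but
still not a proof.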


>
>
> -- jeffrey
>
>
> On 09/03/2014 09:35 PM, Andrew Dunbar wrote:
>
> Yes I'm vague on Data::Dumper and I don't know much about the workings of
> Marpa.
>
>  I added on the example code for Marpa::R2::ASF so I can compare it with
> my real code.
> It doesn't seem to be ambiguous now but I actually can't see what's
> different about it.
> I'm not sure whether I simplified it too much when I made the analogy.
>
>  I want rules that mean "only interpret a consonant as an abbreviation
> when it can't be interpreted as part of a syllable".
>
>  I don't know if that's possible of course (-:
>
>  I'll see if I can come up with a better analogy based on some actual
> ambiguities I find in Lao.
>
> On Thursday, 4 September 2014 13:58:23 UTC+10, Jeffrey Kegler wrote:
>>
>>  What rns did (as I read it) was list all the results of $slr->value().
>> The parse is unambiguous if and only if there is exactly one, which seems
>> to be the case here.  (You've been away from Perl, so Data::Dumper may now
>> be hard to read, but you can confirm this for yourself by adding a line
>> before the dump of each value, as a "hi there", or giving a count.)
>>
>> Is your rule that you don't want to allow an abbreviation to follow a
>> vowel?
>>
>> -- jeffrey
>>
>> On 09/03/2014 08:24 PM, Andrew Dunbar wrote:
>>
>> Do we know if that's ambiguous? Don't we have to run it
>> through Marpa::R2::ASF to know?
>>
>>
>>
>> On Wednesday, 3 September 2014 20:10:42 UTC+10, rns wrote:
>>>
>>> Can you please look at this gist
>>> <https://gist.github.com/rns/fb6abf62a5fa779957ba>? The result is in
>>> the comment below it. This might be a solution provided that I've got the
>>> right idea.
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Wed, Sep 3, 2014 at 11:44 AM, Andrew Dunbar <[email protected]>
>>> wrote:
>>>
>>>> I've come back to Perl after a long absence just to play with Marpa
>>>> because it looks like the most full featured Earley parser in any of the
>>>> programming languages I know.
>>>>
>>>>  I'm interested in Earley specifically because it can handle ambiguity
>>>> and can produce a parse forest.
>>>>
>>>>  I'm using it to investigate the syllable structure of the writing
>>>> system of the Lao language of Southeast Asia. Specifically to see whether
>>>> it's inherently ambiguous, and how.
>>>>
>>>>  So far it works great and I'm glad I've come here from the Bison and
>>>> PEG grammars I was playing with earlier.
>>>>
>>>>  But it seems that there might be two kinds of ambiguities, the kind
>>>> I'm looking for, and a kind that might be an artefact of Earley parsing or
>>>> of the way I've written the grammar.
>>>>
>>>>  Without having to teach you Lao I'll attempt to analogize:
>>>>
>>>>  All ::= Syllable+
>>>>
>>>> Syllable ::= C V C
>>>>          | C V
>>>>          | C
>>>>
>>>> C ~ [bcdfghjklmnpqrstvwxyz]
>>>> V ~ [aeiou]
>>>>
>>>>
>>>> The "Syllable ::= C" rule is to allow lone initial consonants, as are
>>>> used occasionally for abbreviations.
>>>>
>>>>  If my input string is "mat" I only want:
>>>>
>>>>   (Syllable (C m) (V a) (C t))
>>>>
>>>>  But due to the abbreviation rule I also get a second unwanted parse:
>>>>
>>>>   (Syllable (C m) (V a))
>>>>   (Syllable (C t))
>>>>
>>>>  I've been able to refactor my grammar to deal with other issues that
>>>> have appeared, but I can't seem to think of anything which accounts for
>>>> occasional abbreviations but doesn't generate a number of unwanted
>>>> alternative parses.
>>>>
>>>>  Can I refactor my grammar or is there some other way to deal with
>>>> this but still generate all the other kinds of ambiguity that I am
>>>> interested in?
>>>>  --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "marpa parser" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>
