Perhaps we could differentiate between "closed" and "open" spans --
"closed" ending in a consonant and "open" ending in a vowel.
All ::= Span
Span ::= Closed_Span | Open_Span
Closed_Span ::= Abbreviation
Closed_Span ::= Closed_Syllable
Closed_Span ::= Span Closed_Syllable
Closed_Span ::= Closed_Span Abbreviation
Open_Span ::= Span Open_Syllable
Open_Span ::= Open_Syllable
Abbreviation ::= C
Closed_Syllable ::= C V C
Open_Syllable ::= C V
I'm a little busy now, so I didn't test this, but perhaps you get the
idea -- the spans recurse, with open ones ending in a vowel and closed
ones ending in a consonant. Abbreviations may only occur in two places
-- following a closed span, or at the very beginning of a span. Several
abbreviations are allowed in a row, since an abbreviation can end a
closed span.
@rns: Does this work? I *think* it's unambiguous.
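
Since the grammar above was posted untested, one way to spot-check the "unambiguous" claim is to count derivations by brute force. The following is a minimal sketch in plain Python, not Marpa: the helper names (`parse_rules`, `count_parses`, `max_parses`) are hypothetical, `Syllable+` from the original grammar is spelled out as left recursion, and the counter assumes a grammar with no empty rules and no cycles of unit rules, which holds for both grammars in this thread.

```python
from functools import lru_cache
from itertools import product

CONSONANTS = set("bcdfghjklmnpqrstvwxyz")
VOWELS = set("aeiou")

# The span grammar from this message, with "::=" on every rule.
SPAN_GRAMMAR = """
All ::= Span
Span ::= Closed_Span | Open_Span
Closed_Span ::= Abbreviation
Closed_Span ::= Closed_Syllable
Closed_Span ::= Span Closed_Syllable
Closed_Span ::= Closed_Span Abbreviation
Open_Span ::= Span Open_Syllable
Open_Span ::= Open_Syllable
Abbreviation ::= C
Closed_Syllable ::= C V C
Open_Syllable ::= C V
"""

# The original grammar; "Syllable+" is rewritten as left recursion
# (an assumption of this sketch, not Marpa syntax).
ORIGINAL_GRAMMAR = """
All ::= Syllable | All Syllable
Syllable ::= C V C | C V | C
"""


def parse_rules(grammar_text):
    """Turn 'LHS ::= a b | c' lines into {LHS: [(a, b), (c,)]}."""
    rules = {}
    for line in grammar_text.strip().splitlines():
        lhs, rhs = line.split("::=")
        for alternative in rhs.split("|"):
            rules.setdefault(lhs.strip(), []).append(tuple(alternative.split()))
    return rules


def count_parses(grammar_text, start, text):
    """Count the distinct parse trees of `text` derived from `start`."""
    rules = parse_rules(grammar_text)

    @lru_cache(maxsize=None)
    def symbol_count(symbol, i, j):
        # C and V are the two terminals: one letter of the right class.
        if symbol == "C":
            return int(j - i == 1 and text[i] in CONSONANTS)
        if symbol == "V":
            return int(j - i == 1 and text[i] in VOWELS)
        return sum(sequence_count(rhs, i, j) for rhs in rules[symbol])

    @lru_cache(maxsize=None)
    def sequence_count(rhs, i, j):
        if len(rhs) == 1:
            return symbol_count(rhs[0], i, j)
        # Every symbol covers at least one letter, so the first symbol
        # must leave len(rhs) - 1 letters for the remaining symbols.
        total = 0
        for m in range(i + 1, j - len(rhs) + 2):
            left = symbol_count(rhs[0], i, m)
            if left:
                total += left * sequence_count(rhs[1:], m, j)
        return total

    return symbol_count(start, 0, len(text))


def max_parses(grammar_text, alphabet="ma", max_len=6):
    """Largest parse count over all strings up to max_len letters."""
    return max(
        count_parses(grammar_text, "All", "".join(word))
        for n in range(1, max_len + 1)
        for word in product(alphabet, repeat=n)
    )


print(count_parses(ORIGINAL_GRAMMAR, "All", "mat"))  # 2
print(count_parses(SPAN_GRAMMAR, "All", "mat"))      # 1
print(max_parses(SPAN_GRAMMAR))
```

If `max_parses` reports 1, no string of up to six letters over one consonant and one vowel has more than one parse under the span grammar. That is only a bounded check, not a proof, and it says nothing about how Marpa itself would evaluate the grammar.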
-- jeffrey
On 09/03/2014 09:35 PM, Andrew Dunbar wrote:
Yes I'm vague on Data::Dumper and I don't know much about the workings
of Marpa.
I added on the example code for Marpa::R2::ASF so I can compare it
with my real code.
It doesn't seem to be ambiguous now but I actually can't see what's
different about it.
I'm not sure whether I simplified it too much when I made the analogy.
I want rules that mean "only interpret a consonant as an abbreviation
when it can't be interpreted as part of a syllable."
I don't know if that's possible of course (-:
I'll see if I can come up with a better analogy based on some actual
ambiguities I find in Lao.
On Thursday, 4 September 2014 13:58:23 UTC+10, Jeffrey Kegler wrote:
What rns did (as I read it) was list all the results of
$slr->value(). The parse is unambiguous if and only if there is
exactly one, which seems to be the case here. (You've been away
from Perl, so Data::Dumper may now be hard to read, but you can
confirm this for yourself by adding a line before the dump of each
value, as a "hi there", or giving a count.)
Is your rule that you don't want to allow an abbreviation to
follow a vowel?
-- jeffrey
On 09/03/2014 08:24 PM, Andrew Dunbar wrote:
Do we know if that's ambiguous? Don't we have to run it
through Marpa::R2::ASF to know?
On Wednesday, 3 September 2014 20:10:42 UTC+10, rns wrote:
Can you please look at this gist
<https://gist.github.com/rns/fb6abf62a5fa779957ba>? The
result is in the comment below it. This might be a solution
provided that I've got the right idea.
On Wed, Sep 3, 2014 at 11:44 AM, Andrew Dunbar
<[email protected]> wrote:
I've come back to Perl after a long absence just to play
with Marpa because it looks like the most full-featured
Earley parser in any of the programming languages I know.
I'm interested in Earley specifically because it can
handle ambiguity and can produce a parse forest.
I'm using it to investigate the syllable structure of the
writing system of the Lao language of Southeast Asia,
specifically to see whether it's inherently ambiguous,
and how.
So far it works great and I'm glad I've come here from
the Bison and PEG grammars I was playing with earlier.
But it seems that there might be two kinds of
ambiguities, the kind I'm looking for, and a kind that
might be an artefact of Earley parsing or of the way I've
written the grammar.
Without having to teach you Lao I'll attempt to analogize:
    All      ::= Syllable+
    Syllable ::= C V C
               | C V
               | C
    C ~ [bcdfghjklmnpqrstvwxyz]
    V ~ [aeiou]
The "Syllable ::= C" rule is to allow lone initial
consonants, as are used occasionally for abbreviations.
If my input string is "mat" I only want:
    (Syllable(C m)(V a)(C t))
But due to the abbreviation rule I also get a second
unwanted parse:
    (Syllable(C m)(V a))
    (Syllable(C t))
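
The two readings of "mat" can be reproduced without Marpa by listing every way to cut the input into CVC, CV, or C syllables. This is only a hedged illustration in plain Python of why the lone-consonant rule introduces the second parse; the function names `letter_class`, `segmentations`, and `render` are hypothetical and not part of any library.

```python
CONSONANTS = set("bcdfghjklmnpqrstvwxyz")
VOWELS = set("aeiou")
SHAPES = ("CVC", "CV", "C")  # the three Syllable alternatives


def letter_class(ch):
    """Map a letter to its terminal name, or '?' if it is neither."""
    if ch in CONSONANTS:
        return "C"
    if ch in VOWELS:
        return "V"
    return "?"


def segmentations(text):
    """Yield every split of `text` into syllables matching SHAPES."""
    if not text:
        yield []
        return
    for shape in SHAPES:
        chunk = text[:len(shape)]
        if len(chunk) == len(shape) and "".join(map(letter_class, chunk)) == shape:
            for rest in segmentations(text[len(shape):]):
                yield [chunk] + rest


def render(syllable):
    """Format one syllable in the bracketed style used in this message."""
    letters = "".join(f"({letter_class(c)} {c})" for c in syllable)
    return f"(Syllable{letters})"


for parse in segmentations("mat"):
    print(" ".join(render(s) for s in parse))
# (Syllable(C m)(V a)(C t))
# (Syllable(C m)(V a)) (Syllable(C t))
```

Each segmentation corresponds to exactly one parse here because `Syllable+` is a flat sequence, so the second line of output is the unwanted reading in which "t" is taken as an abbreviation.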
I've been able to refactor my grammar to deal with other
issues that have appeared, but I can't seem to think of
anything which accounts for occasional abbreviations but
doesn't generate a number of unwanted alternative parses.
Can I refactor my grammar or is there some other way to
deal with this but still generate all the other kinds of
ambiguity that I am interested in?
--
You received this message because you are subscribed to
the Google Groups "marpa parser" group.
To unsubscribe from this group and stop receiving emails
from it, send an email to [email protected].
For more options, visit
https://groups.google.com/d/optout
<https://groups.google.com/d/optout>.