First off, I'm a new user trying to understand the Marpa module. It's been
a difficult ride so far, as I come from the regex side and had no experience
with lexing vs. parsing before Marpa. The question below is essentially
cross-posted from PerlMonks
(http://www.perlmonks.org/?node_id=1085393), but as it is very Marpa
specific, I thought this might be a better place to ask.
I am trying to understand how Marpa handles the case where the input string
matches the grammar, but only partially. All of the input string matches
the grammar to the point of input string exhaustion, but there are
remaining non-optional tokens in the grammar that are not yet matched.
Here's the test case I've been using to illustrate the issue:
use strict;
use warnings;
use Marpa::R2;
use Data::Dumper;

# Example 1: the input '123' matches the whole 'data' rule
{
    my $g_str = <<'END_GRAMMAR';
:default ::= action => [name, value]
:start ::= data
data ::= '12' '3'
END_GRAMMAR
    my $g_obj = Marpa::R2::Scanless::G->new({ source => \$g_str });
    my $p_obj = Marpa::R2::Scanless::R->new({
        grammar         => $g_obj,
        trace_values    => 1,
        trace_terminals => 1
    });
    my $s = '123';
    $p_obj->read( \$s );
    print "EXAMPLE 1 VALUE:".Dumper($p_obj->value)."\n";
}

# Example 2: same grammar, but the input '12' ends before the '3'
{
    my $g_str = <<'END_GRAMMAR';
:default ::= action => [name, value]
:start ::= data
data ::= '12' '3'
END_GRAMMAR
    my $g_obj = Marpa::R2::Scanless::G->new({ source => \$g_str });
    my $p_obj = Marpa::R2::Scanless::R->new({
        grammar         => $g_obj,
        trace_values    => 1,
        trace_terminals => 1
    });
    my $s = '12';
    $p_obj->read( \$s );
    print "EXAMPLE 2 VALUE:".Dumper($p_obj->value)."\n";
}
Output:
Setting trace_terminals option
Setting trace_values option
Lexer "L0" accepted lexeme L1c1-2: '12'; value="12"
Lexer "L0" accepted lexeme L1c3: '3'; value="3"
EXAMPLE 1 VALUE:$VAR1 = \[
'data',
'12',
'3'
];
Setting trace_terminals option
Setting trace_values option
Lexer "L0" accepted lexeme L1c1-2: '12'; value="12"
EXAMPLE 2 VALUE:$VAR1 = undef;
In both examples, the grammar has a single 'data' rule that requires a
'12' token followed by a '3' token. In example 1, both tokens are matched
and the recognizer object's value method returns the expected data
structure.
However, in example 2, the input string is only long enough to match the
first '12' token. It cannot match the '3' token required for 'data'.
Intuitively, I would expect the recognizer object's read method to fail in
this case, but it does not; the only sign of a problem is that value
returns undef. It seems that read does not fail when the input string is
exhausted partway through matching the grammar. I am assuming this
behavior is intentional and have tried to find an explanation in the
documentation, but without success so far. Amon over at PerlMonks has been
very helpful, but I feel this might be a better place for this question.
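
For reference, here is a minimal sketch of how I can at least detect the incomplete parse after the fact. It assumes that an undef return from value always means "no complete parse", which is what the output above suggests but which I have not confirmed. The helper name parse_or_undef is my own and not part of Marpa.

```perl
use strict;
use warnings;
use Marpa::R2;

my $g_str = <<'END_GRAMMAR';
:default ::= action => [name, value]
:start ::= data
data ::= '12' '3'
END_GRAMMAR

my $g_obj = Marpa::R2::Scanless::G->new({ source => \$g_str });

# Hypothetical helper (not a Marpa API): returns the parse value, or undef
# if the input was read without error but did not yield a complete parse.
sub parse_or_undef {
    my ($input) = @_;
    my $p_obj = Marpa::R2::Scanless::R->new({ grammar => $g_obj });
    $p_obj->read( \$input );          # did not die for '12' in my test above
    my $value_ref = $p_obj->value;    # assumption: undef means no complete parse
    return defined $value_ref ? ${$value_ref} : undef;
}

for my $input ( '123', '12' ) {
    my $value = parse_or_undef($input);
    print "'$input': ",
        ( defined $value ? "complete parse" : "no complete parse" ), "\n";
}
```

Going by the output above, this should report a complete parse for '123' and none for '12', but it only tells me that the parse is incomplete, not where or why.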