First off, I'm a new user trying to understand the Marpa module.  It's been 
a difficult ride so far, as I come from the regex side and had no 
experience with lexing vs. parsing before Marpa.  The questions below are 
essentially crossposted from PerlMonks 
(http://www.perlmonks.org/?node_id=1085393), but as they're very Marpa 
specific, I thought this might be a better place to ask.

I am trying to understand how Marpa handles the case where the input string 
matches the grammar, but only partially.  Every character of the input 
matches the grammar up to the point where the input is exhausted, but the 
grammar still requires non-optional tokens that have not yet been matched.  
Here's the test case I've been using to illustrate the issue:

use strict;
use warnings;

use Marpa::R2;
use Data::Dumper;

# Example 1
{
    my $g_str = <<'END_GRAMMAR';
:default    ::= action => [name, value]
:start      ::= data
data        ::= '12' '3'
END_GRAMMAR

    my $g_obj = Marpa::R2::Scanless::G->new({ source  => \$g_str });
    my $p_obj = Marpa::R2::Scanless::R->new({
        grammar => $g_obj,
        trace_values => 1,
        trace_terminals => 1
    });

    my $s = '123';
    $p_obj->read( \$s );

    print "EXAMPLE 1 VALUE:".Dumper($p_obj->value)."\n";
}

# Example 2
{
    my $g_str = <<'END_GRAMMAR';
:default    ::= action => [name, value]
:start      ::= data
data        ::= '12' '3'
END_GRAMMAR

    my $g_obj = Marpa::R2::Scanless::G->new({ source  => \$g_str });
    my $p_obj = Marpa::R2::Scanless::R->new({
        grammar => $g_obj,
        trace_values => 1,
        trace_terminals => 1
    });

    my $s = '12';
    $p_obj->read( \$s );

    print "EXAMPLE 2 VALUE:".Dumper($p_obj->value)."\n";
}
 

Output:

Setting trace_terminals option
Setting trace_values option
Lexer "L0" accepted lexeme L1c1-2: '12'; value="12"
Lexer "L0" accepted lexeme L1c3: '3'; value="3"
EXAMPLE 1 VALUE:$VAR1 = \[
            'data',
            '12',
            '3'
          ];

Setting trace_terminals option
Setting trace_values option
Lexer "L0" accepted lexeme L1c1-2: '12'; value="12"
EXAMPLE 2 VALUE:$VAR1 = undef;


In both examples, the grammar has a single 'data' rule that requires a 
'12' token followed by a '3' token.  In example 1, both tokens are matched, 
the call to the recognizer object's read method succeeds, and value returns 
the expected data structure.

However, in example 2, the input string is only long enough to match the 
first '12' token.  It cannot match the '3' token required for 'data'.  
Intuitively, I would expect the recognizer object's read method to fail in 
this case, but it does not; the only sign of trouble is that value returns 
undef.  It seems that read does not fail when the input string is entirely 
exhausted partway through matching the grammar.  I am assuming this is 
intentional behavior and have tried to read through the documentation to 
understand it, but without success.  Amon over at PerlMonks has been very 
helpful, but I feel this might be a better place for this question.
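
For reference, here is a minimal sketch of how I am currently telling the 
two cases apart.  It relies only on what the output above shows: read 
returns normally in both examples, and value returns undef when there is no 
complete parse.  The helper name parses_completely is my own invention, not 
part of Marpa, and wrapping read in eval is a precaution I am assuming is 
sensible for input that the grammar rejects outright.

```perl
use strict;
use warnings;

use Marpa::R2;

# Same grammar as in the two examples above.
my $g_str = <<'END_GRAMMAR';
:default    ::= action => [name, value]
:start      ::= data
data        ::= '12' '3'
END_GRAMMAR

my $g_obj = Marpa::R2::Scanless::G->new({ source => \$g_str });

# Hypothetical helper (not a Marpa API): returns 1 if the whole input
# parses as 'data', 0 otherwise.
sub parses_completely {
    my ($input) = @_;
    my $p_obj = Marpa::R2::Scanless::R->new({ grammar => $g_obj });

    # read() throws an exception on input it cannot accept at all,
    # so trap that and treat it as "no parse".
    my $read_ok = eval { $p_obj->read( \$input ); 1 };
    return 0 if not $read_ok;

    # value() returns a reference to the parse value, or undef when the
    # input was exhausted before the grammar was satisfied.
    my $value_ref = $p_obj->value;
    return defined $value_ref ? 1 : 0;
}

for my $input ( '123', '12' ) {
    my $verdict = parses_completely($input) ? 'complete parse' : 'no complete parse';
    print "'$input': $verdict\n";
}
```

This works as a stopgap, but it only tells me after the fact that the parse 
was incomplete, which is why I am asking whether read itself is meant to 
stay silent here.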
