Re: Finding where a rule matches input stream

Jeffrey Kegler Fri, 03 Aug 2018 08:35:42 -0700

In the attached, I corrected the script, and added detailed comments about
what is going on.


Hope this helps! -- jeffrey




On Fri, Aug 3, 2018 at 10:41 AM, Michael Spertus <[email protected]>
wrote:

> Hi Jeffrey,
> Unless I'm misreading what you are suggesting, it gives me the same failure
> as in the original post, as the code below shows.
>
>> use Marpa::R2;
>>
>> my $dsl = <<'END_OF_DSL';
>> :default ::= action => [name,values]
>> lexeme default = latm => 1
>>
>> :start ::= matches
>>
>> matches ::= match+
>> match ::= id | non_id
>> id ::= identifier action => do_id
>> non_id ::= other
>>
>> identifier ~ [\w]+
>> other ~ [\W]+
>> END_OF_DSL
>>
>>
>>
>> my $grammar = Marpa::R2::Scanless::G->new( { source => \$dsl } );
>> my $recce = Marpa::R2::Scanless::R->new(
>>     { grammar => $grammar, semantics_package => 'CodeProcess' } );
>>
>> my $input = '#####abc##';
>> my $length_read = $recce->read( \$input );
>>
>> my $value_ref = $recce->value;
>> my $value = ${$value_ref};
>>
>>
>> sub CodeProcess::do_id {
>>     my ($start, $end) = Marpa::R2::Context::location();
>>     print "location() is ($start, $end)\n";
>>
>>     my ($span_beg, $span_end) = $recce->g1_location_to_span($start, $end - $start);
>>     print "g1_location_to_span is ($span_beg, $span_end)\n";
>>     print "literal is ".$recce->literal($span_beg, $span_end - $span_beg)."\n";
>>     print 'but $_[1] shows this id rule matched '.$_[1]."\n";
>>  }
>>
> which incorrectly outputs
>
>> location() is (1, 2)
>> g1_location_to_span is (0, 5)
>> literal is #####
>> but $_[1] shows this id rule matched abc
>
> Fortunately, the more complicated solution from my previous post works, so
> I'm not blocked. But it clearly isn't "the right approach," so I'm still
> interested in understanding what is wrong in the above.
>
> Thanks,
> Mike
> On Monday, July 30, 2018 at 11:27:44 AM UTC-5, Jeffrey Kegler wrote:
>>
>> I've been traveling so my apologies for the delay and my thanks to
>> Jean-Damien for filling in.
>>
>> Also my apologies for the interface, which is not exactly a model of
>> clarity.
>>
>> G1 locations are *fenceposts*, but they are also used to represent tokens,
>> in which case a token's G1 location is the fencepost *after* it. So your
>> return value indicates that <id> starts at G1 location 1 and ends at G1
>> location 2.  It is *at* G1 location 2.  Here's the writeup in the doc:
>> <https://metacpan.org/pod/distribution/Marpa-R2/pod/Scanless/R.pod#G1-locations>
>>
>> The second issue is spans versus ranges.  Marpa::R2::Context::location()
>> returns the G1 start and end locations.  g1_location_to_span() wants a G1
>> *span*.  So you want $recce->g1_location_to_span($end, $end-$start).
>> (Warning: I have not tested this; it's from memory.)
>>
>> I hope this helps.
>>
>> On Sat, Jul 28, 2018 at 8:27 PM, Michael Spertus <[email protected]>
>> wrote:
>>
>>> I'm sure I'm doing something very dumb here, but I'm not able to find
>>> the part of the input stream that was matched by my rule from the action.
>>> The following code reports the section of the input stream before the match
>>>
>>>> use Marpa::R2;
>>>>
>>>> my $dsl = <<'END_OF_DSL';
>>>> :default ::= action => [name,values]
>>>> lexeme default = latm => 1
>>>>
>>>> :start ::= matches
>>>>
>>>> matches ::= match+
>>>> match ::= id | non_id
>>>> id ::= identifier action => do_id
>>>> non_id ::= other
>>>>
>>>> identifier ~ [\w]+
>>>> other ~ [\W]+
>>>> END_OF_DSL
>>>>
>>>>
>>>>
>>>> my $grammar = Marpa::R2::Scanless::G->new( { source => \$dsl } );
>>>> my $recce = Marpa::R2::Scanless::R->new(
>>>>     { grammar => $grammar, semantics_package => 'CodeProcess' } );
>>>>
>>>> my $input = '#####abc##';
>>>> my $length_read = $recce->read( \$input );
>>>>
>>>> my $value_ref = $recce->value;
>>>> my $value = ${$value_ref};
>>>>
>>>>
>>>> sub CodeProcess::do_id {
>>>>     my ($start, $end) = Marpa::R2::Context::location();
>>>>     print "location() is ($start, $end)\n";
>>>>
>>>>     my ($span_beg, $span_end) = $recce->g1_location_to_span($start, $end);
>>>>     print "g1_location_to_span is ($span_beg, $span_end)\n";
>>>>     print "literal is ".$recce->literal($span_beg, $span_end)."\n";
>>>>     print 'but $_[1] shows this id rule matched '.$_[1]."\n";
>>>>  }
>>>>
>>>>
>>> as shown by the following output
>>>
>>>> mps@CPP:~$ perl marpa_test.pl
>>>> location() is (1, 2)
>>>> g1_location_to_span is (0, 5)
>>>> literal is #####
>>>> but $_[1] shows this id rule matched abc
>>>>
>>>
>>> How would I display the section of input text that is matched by the id
>>> rule? I know I can reconstruct it from the children, but my real parser has
>>> much more complex rules for which it would be really helpful to just find
>>> the range in the input matched by the rule (including its children).
>>>
>>> Thanks,
>>>
>>> Mike
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "marpa parser" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> For more options, visit https://groups.google.com/d/optout.
>>>


spertus.pl
Description: Perl program
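
The fencepost convention discussed in this thread can be illustrated without
Marpa at all. The following is a minimal sketch in plain Perl. It is not the
attached spertus.pl and it does not use Marpa's API: the token table @tokens
and the helper g1_range_to_input_span are hypothetical names, and the offsets
are assumed from the input '#####abc##' and the output quoted above. It treats
the G1 range (1, 2) returned by location() as a pair of fenceposts and shows
that the token between them is the one ending at fencepost 2.

```perl
use strict;
use warnings;

# Hypothetical token table for the input '#####abc##' from the thread.
# Each entry is [start offset, length] in the input string.
# The token that ends at G1 location N is stored at index N - 1.
my $input  = '#####abc##';
my @tokens = ( [ 0, 5 ], [ 5, 3 ], [ 8, 2 ] );

# Convert a G1 range (fenceposts $g1_start .. $g1_end) into an input
# span (offset, length). Illustrative helper only, not a Marpa call.
sub g1_range_to_input_span {
    my ( $g1_start, $g1_end ) = @_;
    return if $g1_end <= $g1_start;       # empty range covers no tokens
    my $first  = $tokens[$g1_start];      # first token after fencepost $g1_start
    my $last   = $tokens[ $g1_end - 1 ];  # token that ends at fencepost $g1_end
    my $offset = $first->[0];
    my $length = $last->[0] + $last->[1] - $offset;
    return ( $offset, $length );
}

my ( $offset, $length ) = g1_range_to_input_span( 1, 2 );
print substr( $input, $offset, $length ), "\n";    # prints "abc"
```

With a range covering several tokens, for example (0, 3), the same arithmetic
yields the span from the start of the first token to the end of the last one.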
