Re: Finding where a rule matches input stream

2018-08-06 Thread Durand Jean-Damien
Your CodeProcess::do_macro_match is still wrong. Please read:
https://metacpan.org/pod/distribution/Marpa-R2/pod/Scanless/R.pod#g1_location_to_span()

"Mike's" workaround is quite correct: you have to "merge" the spans of the
G1 start and end locations.

Clearer code in attachment.
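
For readers without Marpa installed, here is a minimal sketch of just the
"merge" arithmetic, in plain Perl. The helper name merge_spans is
hypothetical and not part of Marpa::R2. The span (0, 7) for '#define' is
taken from the output quoted below; the span (12, 3) for 'bar' was worked out
by hand from the input string and is an assumption about what
g1_location_to_span() would return.

```perl
use strict;
use warnings;

# Merge two (start, length) spans into the smallest span covering both.
# merge_spans is a hypothetical helper, not a Marpa::R2 call.
sub merge_spans {
    my ($first, $last) = @_;    # each is an array ref: [start, length]
    my $first_end = $first->[0] + $first->[1];
    my $last_end  = $last->[0] + $last->[1];
    my $start = $first->[0] < $last->[0] ? $first->[0] : $last->[0];
    my $end   = $first_end > $last_end ? $first_end : $last_end;
    return ($start, $end - $start);
}

my $input = '#define foo bar';

# Assumed spans of the first lexeme ('#define') and the last lexeme ('bar').
my ($start, $length) = merge_spans([0, 7], [12, 3]);

print "merged span is ($start, $length)\n";
print "text is '", substr($input, $start, $length), "'\n";
```

The merged span is a (start, length) pair, which is the shape that both
substr and $recce->literal() expect.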

On Saturday, August 4, 2018 at 16:24:44 UTC+2, Michael Spertus wrote:
>
> Hi Jeffrey,
> While that works for the simple example, it seems to fail for slightly
> more complex examples. (Now I don't feel bad about having had trouble with
> this to begin with!)
>
> Attached is a program that demonstrates the problem and the (surprisingly
> messy) fix, as shown by the following output:
>
> location() is (0, 3)
> G1 first is 1
> G1 last is 3
> input span starts at 0
> span length is 7
> Jeffrey's literal is not the entire match: #define
> Mike's literal is the entire match:  #define foo bar
>
> In any case, I have a working solution now, so I'm good to go.
>
> Thanks,
>
> Mike
>

-- 
You received this message because you are subscribed to the Google Groups 
"marpa parser" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to marpa-parser+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
use Marpa::R2;

my $dsl = <<'END_OF_DSL';
:default ::= action => [name,values]
lexeme default = latm => 1

:start ::= matches

matches ::= match+
match ::= id | macro action => do_macro_match
macro ::= '#define' identifier identifier
id ::= identifier

identifier ~ [\w]+
:discard ~ whitespace
whitespace ~ [\s]+

END_OF_DSL


my $grammar = Marpa::R2::Scanless::G->new( { source => \$dsl } );
my $recce = Marpa::R2::Scanless::R->new(
{ grammar => $grammar, semantics_package => 'CodeProcess' } );

my $input = '#define foo bar';
my $length_read = $recce->read( \$input );

my $value_ref = $recce->value;
my $value = ${$value_ref};


# Return the input span (start, length) covering the rule being evaluated.
sub CodeProcess::span_merge {
    # G1 locations just before the rule's first lexeme and at its last lexeme.
    my ($g1_before, $g1_after) = Marpa::R2::Context::location();

    # Input spans (start, length) of the rule's first and last lexemes.
    my @span_first = $recce->g1_location_to_span($g1_before + 1);
    my @span_last  = $recce->g1_location_to_span($g1_after);

    my @starts = ($span_first[0], $span_last[0]);
    my @ends   = ($span_first[0] + $span_first[1], $span_last[0] + $span_last[1]);

    # Merge: earliest start, latest end.
    my $span_merge_start = ($starts[0] < $starts[1]) ? $starts[0] : $starts[1];
    my $span_merge_end   = ($ends[0] > $ends[1]) ? $ends[0] : $ends[1];

    return ($span_merge_start, $span_merge_end - $span_merge_start);
}

sub CodeProcess::do_macro_match {
    print "Span merge's literal gives the entire match:  "
        . $recce->literal(CodeProcess::span_merge()) . "\n";
}
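
The choice of starting point can matter when the rule is not at the very
start of the input. The sketch below compares two starting points: the end
of the lexeme before the rule, and the start of the rule's first lexeme
(which is what $g1_before + 1 selects above). It does not call Marpa. The
@span_at table is a hypothetical stand-in for g1_location_to_span(), built on
the assumption that G1 location N maps to the span of the Nth lexeme and that
location 0 maps to (0, 0). The input and the (1, 4) location pair are
likewise made up for illustration.

```perl
use strict;
use warnings;

my $input = 'x #define foo bar';

# Hypothetical lookup table standing in for g1_location_to_span():
# index is the G1 location, value is [start, length].
my @span_at = ([0, 0], [0, 1], [2, 7], [10, 3], [14, 3]);

# Suppose location() reported (1, 4) for the macro rule.
my ($g1_before, $g1_after) = (1, 4);

# Variant 1: start at the end of the lexeme *before* the rule.
my $prev_end = $span_at[$g1_before][0] + $span_at[$g1_before][1];

# Variant 2: start at the start of the rule's *first* lexeme.
my $first_start = $span_at[$g1_before + 1][0];

# Both variants end where the rule's last lexeme ends.
my $end = $span_at[$g1_after][0] + $span_at[$g1_after][1];

printf "from previous lexeme end: '%s'\n",
    substr($input, $prev_end, $end - $prev_end);
printf "from first lexeme start:  '%s'\n",
    substr($input, $first_start, $end - $first_start);
```

Under these assumptions the first variant picks up the discarded space before
'#define', while the second starts exactly at the rule's first lexeme.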


Re: Finding where a rule matches input stream

2018-08-06 Thread Jeffrey Kegler
The interface here is a series of layers that grew over the history of R2.
There never really was an overall design, so conventions change from one
call to the next.  In other words, it's a mess.  The emphasis in R2 is on
backward compatibility so, alas, it won't improve.

In R3 I redesigned G1 locations and their calls from the bottom up.

On Sat, Aug 4, 2018 at 10:24 AM, Michael Spertus wrote:

> Hi Jeffrey,
> While that works for the simple example, it seems to fail for slightly
> more complex examples. (Now I don't feel bad about having had trouble with
> this to begin with!)
>
> Attached is a program that demonstrates the problem and the (surprisingly
> messy) fix, as shown by the following output:
>
> location() is (0, 3)
> G1 first is 1
> G1 last is 3
> input span starts at 0
> span length is 7
> Jeffrey's literal is not the entire match: #define
> Mike's literal is the entire match:  #define foo bar
>
> In any case, I have a working solution now, so I'm good to go.
>
> Thanks,
>
> Mike
>



Re: Finding where a rule matches input stream

2018-08-04 Thread Michael Spertus
Hi Jeffrey,
While that works for the simple example, it seems to fail for slightly more
complex examples. (Now I don't feel bad about having had trouble with this
to begin with!)

Attached is a program that demonstrates the problem and the (surprisingly
messy) fix, as shown by the following output:

location() is (0, 3)
G1 first is 1
G1 last is 3
input span starts at 0
span length is 7
Jeffrey's literal is not the entire match: #define
Mike's literal is the entire match:  #define foo bar

In any case, I have a working solution now, so I'm good to go.

Thanks,

Mike

use Marpa::R2;

my $dsl = <<'END_OF_DSL';
:default ::= action => [name,values]
lexeme default = latm => 1

:start ::= matches

matches ::= match+
match ::= id | macro action => do_macro_match
macro ::= '#define' identifier identifier
id ::= identifier

identifier ~ [\w]+
:discard ~ whitespace
whitespace ~ [\s]+

END_OF_DSL


my $grammar = Marpa::R2::Scanless::G->new( { source => \$dsl } );
my $recce = Marpa::R2::Scanless::R->new(
{ grammar => $grammar, semantics_package => 'CodeProcess' } );

my $input = '#define foo bar';
my $length_read = $recce->read( \$input );

my $value_ref = $recce->value;
my $value = ${$value_ref};


sub CodeProcess::do_macro_match {
my ($g1_before, $g1_after) = Marpa::R2::Context::location();
print "location() is ($g1_before, $g1_after)\n";
# location() returns the g1 location before the token
# and the g1 location after the last token
# of a symbol

my $g1_first = $g1_before + 1;
my $g1_last = $g1_after;
# We regard a token as *at* a g1 location if the g1
# location is after it, so that the first token included
# in a symbol is the one *after* the start returned
# by location.  The last token in the symbol is "at",
# the g1 location *after* it, so no correction is needed.

print "G1 first is $g1_first\n";
print "G1 last is $g1_last\n";
   
my ($span_beg, $span_length) = $recce->g1_location_to_span(
   $g1_first,
   $g1_last);
print "input span starts at $span_beg\n";
print "span length is $span_length\n";
print "Jeffrey's literal does NOT give the entire match: " . $recce->literal($span_beg, $span_length) . "\n";


my @span1 = $recce->g1_location_to_span($g1_before);
my @span2 = $recce->g1_location_to_span($g1_after);
print "Mike's literal gives the entire match:  ".$recce->literal($span1[0] +$span1[1], $span2[0] + $span2[1] - $span1[0] - $span1[1])."\n";

 }
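
The fencepost convention described in the comments of the script above can be
checked with plain arithmetic. The sketch below is only an illustration and
makes no Marpa calls. The helper name fenceposts_to_tokens is hypothetical,
and the token list assumes the input '#define foo bar' lexes into three
tokens once whitespace is discarded.

```perl
use strict;
use warnings;

# Convert a (before, after) fencepost pair, as returned by location(),
# into the 1-based numbers of the first and last token it encloses,
# plus the token count. Hypothetical helper, not part of Marpa::R2.
sub fenceposts_to_tokens {
    my ($g1_before, $g1_after) = @_;
    return ($g1_before + 1, $g1_after, $g1_after - $g1_before);
}

# Assumed tokens of '#define foo bar' once whitespace is discarded.
my @tokens = ('#define', 'foo', 'bar');

my ($first, $last, $count) = fenceposts_to_tokens(0, 3);
print "first=$first last=$last count=$count\n";

# Token N sits in slot N-1 of a 0-based Perl array.
print join(' ', @tokens[$first - 1 .. $last - 1]), "\n";
```

For the location pair (0, 3) reported in the output above, this gives first
token 1 and last token 3, matching the "G1 first" and "G1 last" lines.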


Re: Finding where a rule matches input stream

2018-08-03 Thread Jeffrey Kegler
In the attached, I corrected the script, and added detailed comments about
what is going on.

Hope this helps! -- jeffrey




On Fri, Aug 3, 2018 at 10:41 AM, Michael Spertus wrote:

> Hi Jeffrey,
> Unless I'm misreading what you are suggesting, that seems to give me the
> same failure as in the original post, as the code below shows.
>
> use Marpa::R2;
>>
>> my $dsl = <<'END_OF_DSL';
>> :default ::= action => [name,values]
>> lexeme default = latm => 1
>>
>> :start ::= matches
>>
>> matches ::= match+
>> match ::= id | non_id
>> id ::= identifier action => do_id
>> non_id ::= other
>>
>> identifier ~ [\w]+
>> other ~ [\W]+
>> END_OF_DSL
>>
>>
>>
>> my $grammar = Marpa::R2::Scanless::G->new( { source => \$dsl } );
>> my $recce = Marpa::R2::Scanless::R->new(
>> { grammar => $grammar, semantics_package => 'CodeProcess' } );
>>
>> my $input = '#abc##';
>> my $length_read = $recce->read( \$input );
>>
>> my $value_ref = $recce->value;
>> my $value = ${$value_ref};
>>
>>
>> sub CodeProcess::do_id {
>> my ($start, $end) = Marpa::R2::Context::location();
>> print "location() is ($start, $end)\n";
>>
>> my ($span_beg, $span_end) = $recce->g1_location_to_span($start, $end
>> - $start);
>> print "g1_location_to_span is ($span_beg, $span_end)\n";
>> print "literal is ".$recce->literal($span_beg, $span_end -
>> $span_beg)."\n";
>> print 'but $_[1] shows this id rule matched '.$_[1]."\n";
>>  }
>>
>> which incorrectly outputs
>
>> location() is (1, 2)
>> g1_location_to_span is (0, 5)
>> literal is #
>> but $_[1] shows this id rule matched abc
>>
>> Fortunately, the more complicated solution from my previous post is
> working, so I'm not blocked, but it seems clearly not to be "the right
> approach," so definitely still interested in understanding what is wrong in
> the above.
>
> Thanks,
> Mike
> On Monday, July 30, 2018 at 11:27:44 AM UTC-5, Jeffrey Kegler wrote:
>>
>> I've been traveling so my apologies for the delay and my thanks to
>> Jean-Damien for filling in.
>>
>> Also my apologies for the interface, which is not exactly a model of
>> clarity.
>>
>> G1 locations are *fenceposts*, but they are also used to represent tokens,
>> in which case the G1 location *after* the symbol is its G1 location. So
>> your return indicates that the symbol starts at G1 location 1 and ends at
>> G1 location 2.  It is *at* G1 location 2.  There is a writeup in the doc.
>>
>> The second issue is spans versus ranges.  Marpa::R2::Context::location()
>> returns the G1 start and end locations.  g1_location_to_span() wants a G1
>> *span*.  So you want $recce->g1_location_to_span($end, $end-$start).
>>  (Warning: I have not tested this; it's from memory.)
>>
>> I hope this helps.
>>
>> On Sat, Jul 28, 2018 at 8:27 PM, Michael Spertus wrote:
>>
>>> I'm sure I'm doing something very dumb here, but I'm not able to find
>>> the part of the input stream that was matched by my rule from the action.
>>> The following code reports the section of the input stream before the match
>>>
>>> use Marpa::R2;

 my $dsl = <<'END_OF_DSL';
 :default ::= action => [name,values]
 lexeme default = latm => 1

 :start ::= matches

 matches ::= match+
 match ::= id | non_id
 id ::= identifier action => do_id
 non_id ::= other

 identifier ~ [\w]+
 other ~ [\W]+
 END_OF_DSL



 my $grammar = Marpa::R2::Scanless::G->new( { source => \$dsl } );
 my $recce = Marpa::R2::Scanless::R->new(
 { grammar => $grammar, semantics_package => 'CodeProcess' } );

 my $input = '#abc##';
 my $length_read = $recce->read( \$input );

 my $value_ref = $recce->value;
 my $value = ${$value_ref};


 sub CodeProcess::do_id {
 my ($start, $end) = Marpa::R2::Context::location();
 print "location() is ($start, $end)\n";

 my ($span_beg, $span_end) = $recce->g1_location_to_span($start,
 $end);
 print "g1_location_to_span is ($span_beg, $span_end)\n";
 print "literal is ".$recce->literal($span_beg, $span_end)."\n";
 print 'but $_[1] shows this id rule matched '.$_[1]."\n";
  }


>>> as shown by the following output
>>>
>>> mps@CPP:~$ perl marpa_test.pl
 location() is (1, 2)
 g1_location_to_span is (0, 5)
 literal is #
 but $_[1] shows this id rule matched abc

>>>
>>> How would I display the section of input text that is matched by the id
>>> rule? I know I can reconstruct it from the children, but my real parser has
>>> much more complex rules for which it would be really helpful to just find
>>> the range in the input matched by the rule (including its children).
>>>
>>> Thanks,
>>>
>>> Mike
>>>

Re: Finding where a rule matches input stream

2018-08-03 Thread Michael Spertus
Hi Jeffrey,
Unless I'm misreading what you are suggesting, that seems to give me the
same failure as in the original post, as the code below shows.

use Marpa::R2;
>
> my $dsl = <<'END_OF_DSL';
> :default ::= action => [name,values]
> lexeme default = latm => 1
>
> :start ::= matches
>
> matches ::= match+
> match ::= id | non_id
> id ::= identifier action => do_id
> non_id ::= other
>
> identifier ~ [\w]+
> other ~ [\W]+
> END_OF_DSL
>
>
>
> my $grammar = Marpa::R2::Scanless::G->new( { source => \$dsl } );
> my $recce = Marpa::R2::Scanless::R->new(
> { grammar => $grammar, semantics_package => 'CodeProcess' } );
>
> my $input = '#abc##';
> my $length_read = $recce->read( \$input );
>
> my $value_ref = $recce->value;
> my $value = ${$value_ref};
>
>
> sub CodeProcess::do_id {
> my ($start, $end) = Marpa::R2::Context::location();
> print "location() is ($start, $end)\n";
>
> my ($span_beg, $span_end) = $recce->g1_location_to_span($start, $end - 
> $start);
> print "g1_location_to_span is ($span_beg, $span_end)\n";
> print "literal is ".$recce->literal($span_beg, $span_end - 
> $span_beg)."\n";
> print 'but $_[1] shows this id rule matched '.$_[1]."\n";
>  }
>
> which incorrectly outputs

> location() is (1, 2)
> g1_location_to_span is (0, 5)
> literal is #
> but $_[1] shows this id rule matched abc
>
> Fortunately, the more complicated solution from my previous post is 
working, so I'm not blocked, but it seems clearly not to be "the right 
approach," so definitely still interested in understanding what is wrong in 
the above.

Thanks,
Mike
On Monday, July 30, 2018 at 11:27:44 AM UTC-5, Jeffrey Kegler wrote:
>
> I've been traveling so my apologies for the delay and my thanks to 
> Jean-Damien for filling in.
>
> Also my apologies for the interface, which is not exactly a model of 
> clarity. 
>
> G1 locations are *fenceposts*, but they are also used to represent tokens,
> in which case the G1 location *after* the symbol is its G1 location. So
> your return indicates that the symbol starts at G1 location 1 and ends at
> G1 location 2.  It is *at* G1 location 2.  There is a writeup in the doc.
>
> The second issue is spans versus ranges.  Marpa::R2::Context::location()
> returns the G1 start and end locations.  g1_location_to_span() wants a G1
> *span*.  So you want $recce->g1_location_to_span($end, $end-$start).
>  (Warning: I have not tested this; it's from memory.)
>
> I hope this helps.
>
> On Sat, Jul 28, 2018 at 8:27 PM, Michael Spertus wrote:
>
>> I'm sure I'm doing something very dumb here, but I'm not able to find the 
>> part of the input stream that was matched by my rule from the action. The 
>> following code reports the section of the input stream before the match
>>
>> use Marpa::R2;
>>>
>>> my $dsl = <<'END_OF_DSL';
>>> :default ::= action => [name,values]
>>> lexeme default = latm => 1
>>>
>>> :start ::= matches
>>>
>>> matches ::= match+
>>> match ::= id | non_id
>>> id ::= identifier action => do_id
>>> non_id ::= other
>>>
>>> identifier ~ [\w]+
>>> other ~ [\W]+
>>> END_OF_DSL
>>>
>>>
>>>
>>> my $grammar = Marpa::R2::Scanless::G->new( { source => \$dsl } );
>>> my $recce = Marpa::R2::Scanless::R->new(
>>> { grammar => $grammar, semantics_package => 'CodeProcess' } );
>>>
>>> my $input = '#abc##';
>>> my $length_read = $recce->read( \$input );
>>>
>>> my $value_ref = $recce->value;
>>> my $value = ${$value_ref};
>>>
>>>
>>> sub CodeProcess::do_id {
>>> my ($start, $end) = Marpa::R2::Context::location();
>>> print "location() is ($start, $end)\n";
>>> 
>>> my ($span_beg, $span_end) = $recce->g1_location_to_span($start, 
>>> $end);
>>> print "g1_location_to_span is ($span_beg, $span_end)\n";
>>> print "literal is ".$recce->literal($span_beg, $span_end)."\n";
>>> print 'but $_[1] shows this id rule matched '.$_[1]."\n";
>>>  }
>>>
>>>
>> as shown by the following output
>>
>> mps@CPP:~$ perl marpa_test.pl 
>>> location() is (1, 2)
>>> g1_location_to_span is (0, 5)
>>> literal is #
>>> but $_[1] shows this id rule matched abc
>>>
>>
>> How would I display the section of input text that is matched by the id 
>> rule? I know I can reconstruct it from the children, but my real parser has 
>> much more complex rules for which it would be really helpful to just find 
>> the range in the input matched by the rule (including its children).
>>
>> Thanks,
>>
>> Mike
>>
>
>
