Re: Capture and backreference, arbitrary parts in the grammar, and AST pattern matching

Ron Savage Wed, 10 Aug 2016 15:50:44 -0700

>
> I'm trying to work out some ideas, 
>
>    - I'd like to capture a string of characters, and then use the capture 
>    later in the grammar to match the same string of characters. I'm aware 
>    (although I'm no expert) of Text::Delimited and Text::Balanced, and I 
>    wonder how those two tools manage to match a dynamic token in the input, 
>    that is, if there is a grammar trick that captures like one would capture 
>    and backreference in a perl regular expression, or if that is computed in 
>    the semantic layer.
>
>
Firstly, good work to have looked at Text::Delimited (not mine), 
Text::Delimited::Marpa (I assume you saw that one too; mine) and 
Text::Balanced::Marpa (mine).


If you're trying to capture a known string, say 'foreach', then you can 
make it a lexeme, but since you say dynamic, I suspect you mean you will 
specify a pattern to capture a string whose exact text you don't know 
beforehand. You can still make it a lexeme, because the pattern which 
detected it the first time will of course detect it all other times too, so 
I'm not sure what the problem is. If you want to capture a string using a 
pattern, and then capture a similar but not identical string later, I'd 
start by defining 2 lexemes. But if that becomes too awkward, you can 
always use events: capture what's matched, and then for the 2nd and later 
captures, compare the captured text with what you stockpiled from the 
earlier ones. Lots of my code uses events.
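
To make the "stockpile and compare" idea concrete, here is a minimal language-neutral sketch in plain Python. It is not Marpa's event API. The heredoc-style `<<TAG` opener and the name `read_delimited` are assumptions invented for illustration. A pattern captures the delimiter the first time, and a later line counts as the closer only if its text equals the stored capture.

```python
import re

# Hypothetical opener syntax: "<<TAG" alone on a line, where TAG is any word.
OPENER = re.compile(r"^<<(\w+)$")

def read_delimited(lines):
    """Return (tag, body_lines) for the first <<TAG ... TAG block found."""
    tag = None   # text captured the first time the pattern matched
    body = []
    for line in lines:
        if tag is None:
            m = OPENER.match(line)
            if m:
                tag = m.group(1)   # stockpile the capture
            continue
        if line == tag:            # later occurrence: compare with the stored text
            return tag, body
        body.append(line)
    raise ValueError("no closing delimiter" if tag else "no opening delimiter")
```

The comparison happens outside the pattern itself, which is roughly the role an event handler would play in the semantic layer.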

>
>    - I also want to know how to express in the grammar that a part of the 
>    input text (many lines of arbitrary text) has to be taken 'as is' to the 
>    semantic layer, in order to parse it separately. I've read people are 
>    chaining parsers. Any pointers as how to do it are welcome.
>    
>
See GraphViz2::Marpa. It uses 2 grammars, $self -> bnf() and $self -> 
bnf4html(), and switches between them (obviously) depending on what appears 
in the input stream. That may work for you. 
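
As a rough picture of handing a chunk of input to a second parser, here is a sketch in plain Python rather than Marpa, and it makes no claim about how GraphViz2::Marpa is implemented. The `BEGIN RAW` / `END RAW` markers and all three function names are hypothetical. The outer pass recognises only the markers, passes everything between them through untouched, and a separate inner parser then deals with that raw text.

```python
def split_raw_blocks(text, begin="BEGIN RAW", end="END RAW"):
    """Outer pass: yield ("outer", line) or ("raw", block_text) items."""
    raw = None                       # None means we are in the outer grammar
    for line in text.splitlines():
        if raw is None:
            if line.strip() == begin:
                raw = []             # switch to collecting raw text
            else:
                yield ("outer", line)
        elif line.strip() == end:
            yield ("raw", "\n".join(raw))   # hand the block over as is
            raw = None               # switch back to the outer grammar
        else:
            raw.append(line)
    if raw is not None:
        raise ValueError("unterminated raw block")

def parse_inner(block):
    """Stand-in for the second parser: it only splits key=value lines."""
    return dict(line.split("=", 1) for line in block.splitlines() if "=" in line)

def parse(text):
    """Run the outer pass and send each raw block through the inner parser."""
    return [(kind, parse_inner(payload) if kind == "raw" else payload)
            for kind, payload in split_raw_blocks(text)]
```

In a real setup `parse_inner` would be the second grammar, and the outer pass would be the first grammar pausing when it sees the opening marker.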

>
>    - The previous point, relates to my desire to find some patterns in 
>    the parsed input, for instance, two nested loops, or two consecutive nested 
>    loops, or even 'while true ...' loops which are controlled by a variable, 
>    whose name could be anything, that is initialized immediately before the 
>    loop, and incremented or accumulated inside the loop. I wonder what 
>    approaches are good or known to 'parse' (meaning confronting to a 
>    specialized grammar over a bigger grammar) the AST that results from a 
>    parse stage. Is ordinary ad-hoc programming the way to deal with ASTs that 
>    have had some inner portions separated or eliminated in order to match 
>    outer or bigger structures to predefined patterns, or even to sort them out 
>    of a big codebase so as to see which programming patterns are most common in 
>    the codebase? Or is it better to bring the 'reduced' AST back to a DSL, 
>    and then process it against 'pattern grammars' until one fits the 
>    specialized pattern? (I hope this can be understood, sorry for any 
>    mistakes, and thanks for reading)
>
>
Sounds like a huge job to me :-(.

