In addition to the technique Ron mentioned, Marpa has a facility for traversing parse forests (and therefore sets of potential ASTs) from the top down: http://search.cpan.org/~jkegl/Marpa-R2-3.000000/pod/ASF.pod

On Wed, Aug 10, 2016 at 3:50 PM, Ron Savage <[email protected]> wrote:

> I'm trying to work out some ideas,
>>
>> - I'd like to capture a string of characters, and then use the capture later in the grammar to match the same string of characters. I'm aware (although I'm no expert) of Text::Delimited and Text::Balanced, and I wonder how those two tools manage to match a dynamic token in the input, that is, if there is a grammar trick that captures like one would capture and backreference in a perl regular expression, or if that is computed in the semantic layer.
>>
>
> Firstly, good work to have looked at Text::Delimited (not mine), Text::Delimited::Marpa (I assume you saw that one too; mine) and Text::Balanced::Marpa (mine).
>
> If you're trying to capture a known string, say 'foreach', then you can make it a lexeme, but since you say dynamic, I suspect you mean you will specify a pattern to capture a string whose exact text you don't know beforehand. You can still make it a lexeme, because the pattern which detected it the first time will of course detect it all other times too, so I'm not sure what the problem is. If you want to capture a string using a pattern, and then capture a similar but not identical string later, I'd start by defining 2 lexemes. But if that becomes too awkward, you can always use events, and capture what's matched, and then with the 2nd etc captures, compare the captured text with what you stockpiled from earlier captures. Lots of my code uses events.
>
>>
>> - I also want to know how to express in the grammar that a part of the input text (many lines of arbitrary text) has to be taken 'as is' to the semantic layer, in order to parse it separately. I've read people are chaining parsers. Any pointers as to how to do it are welcome.
>>
>
> See GraphViz2::Marpa. It uses 2 grammars, $self -> bnf() and $self -> bnf4html(), and switches between them (obviously) depending on what appears in the input stream. That may work for you.
>
>>
>> - The previous point relates to my desire to find some patterns in the parsed input, for instance, two nested loops, or two consecutive nested loops, or even 'while true ...' loops which are controlled by a variable, whose name could be anything, that is initialized immediately before the loop, and incremented or accumulated inside the loop. I wonder what approaches are good or known to 'parse' (meaning confronting to a specialized grammar over a bigger grammar) the AST that results from a parse stage. Is ordinary ad-hoc programming the way to deal with ASTs that have had some inner portions separated or eliminated in order to match outer or bigger structures to predefined patterns, or even to sort them out of a big codebase as to see which programming patterns are most common in the codebase? Or is it better to bring the 'reduced' AST back to a DSL, and then process it against 'pattern grammars' until one fits the specialized pattern? (I hope this can be understood, sorry for any mistakes, and thanks for the reading)
>>
>
> Sounds like a huge job to me :-(.
>
> --
> You received this message because you are subscribed to the Google Groups "marpa parser" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
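
For what it's worth, here is a rough sketch of the "capture it, stockpile it, compare later" idea from the quoted reply. It is plain Python with the standard re module, not Marpa: the scanning loop only stands in for what an event handler would do, and the `<<TAG` opener syntax and the name extract_blocks are made up for the illustration. The text between opener and closer is handed back untouched, which is also one way to pass a region 'as is' to a second parser.

```python
import re

# Hypothetical here-doc style opener: "<<TAG" at the end of a line.
OPENER = re.compile(r"<<(\w+)\n")


def extract_blocks(text):
    """Return (tag, body) pairs for every <<TAG ... TAG block in text."""
    blocks = []
    pos = 0
    while True:
        opener = OPENER.search(text, pos)
        if opener is None:
            return blocks
        tag = opener.group(1)  # stockpile the text captured by the opener
        # Later capture: a whole line whose text equals the stockpiled tag.
        closer = re.compile(r"^" + re.escape(tag) + r"$", re.MULTILINE)
        found = closer.search(text, opener.end())
        if found is None:
            raise ValueError(f"no closing line for tag {tag!r}")
        # Everything in between is taken verbatim, without being parsed here.
        body = text[opener.end():found.start()]
        blocks.append((tag, body))
        pos = found.end()
```

Calling extract_blocks on a string yields each tag together with its raw body, and the body can then be given to a separate parser. The comparison happens outside any grammar, which corresponds to doing it in the semantic layer rather than with a grammar trick.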
