I'd suggest 2 code examples: one for parsing preprocessor statements [1] and another for parsing a length-prefixed format with prediction and completion events [2]. They show how to use lexeme_read() with pos() and resume().
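Separately from the gists, the offset arithmetic in your handler ($pos + $len - $newlen + 1) can be checked without Marpa at all. Here is a minimal pure-Perl sketch under these assumptions: $len covers the text from '#ifdef' through '#endif' without the trailing newline, the '=' branch is the one selected, and splice_replacement is a hypothetical helper name, not part of Marpa::R2. It only verifies the string bookkeeping; it does not explain the rejection you see on resume.

```perl
use strict;
use warnings;

# Hypothetical helper (not a Marpa::R2 call): overwrite the tail of a
# preprocessor block with its replacement text and return the offset at
# which parsing would continue.
sub splice_replacement {
    my ($string_ref, $pos, $len, $replacement) = @_;
    my $newlen = length $replacement;

    # The block occupies offsets $pos .. $pos + $len - 1 and is followed by
    # one more character (assumed to be a newline), hence the "+ 1".
    my $resume_at = $pos + $len - $newlen + 1;

    # Four-argument substr replaces $newlen characters in place, so the
    # total length of the string does not change.
    substr(${$string_ref}, $resume_at, $newlen, $replacement);
    return $resume_at;
}

my $string = "abc\n#ifdef A\n=\n#else\n+\n#endif\n12";

# Start of the block: the first '#' in the input.
my $pos = index $string, '#';

# End of the block: just past '#endif' (the trailing newline is excluded).
my $end = index($string, '#endif', $pos) + length '#endif';
my $len = $end - $pos;

# Assumption: the external preprocessor picked the '=' branch.
my $resume_at = splice_replacement(\$string, $pos, $len, '=');

print "resume at $resume_at, remaining input: ", substr($string, $resume_at), "\n";
```

With the sample input this resumes at offset 29, where the newline after #endif has been overwritten by '='. A replacement longer than one character would overwrite part of the #endif text itself, which is harmless only if the parser never rereads anything before the resume offset.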
Hope this helps.

[1] https://gist.github.com/rns/3b2f48477fc23d0ab0f7
[2] https://gist.github.com/rns/ba250ed6a5ed1c82ce7b

On Tue, Feb 10, 2015 at 2:21 AM, Thomas Weigert <[email protected]> wrote:
> No, this is not about relationship troubles.
>
> I am struggling to work with rejection events. I am trying to deal with
> constructs like preprocessing statements or meaningful comments in
> programming languages. These (i) can go anywhere in the grammar and (ii)
> need to be propagated into the parse tree and (iii) may affect the parse
> itself and (iv) cannot be easily parsed with a grammar or an internal lexer.
>
> My idea to parse such constructs was to create lexemes invoked by fake G1
> productions which would be tried when the relevant text is encountered and
> would create a rejection event. I would then parse the text of these
> constructs in an external recognizer upon handling the rejection event and
> insert the proper text back into the input string and set the continuation
> of the parse to the start of the replacement text. If the replacement text
> is legal at the inserted point, parsing should continue just fine, thanks
> to the great infrastructure provided by Marpa.
>
> However, things did not go as planned. Please look at the attached example
> for detail. In this example, I try to handle preprocessor statements
> (#ifdef).
>
> I created a very simple grammar, and added these productions:
>
>     fakecpp ::= cpp
>     cpp ~ '#'
>
> The fakecpp production is actually not reachable. However, when in the
> input string, for example:
>
>     abc\n#ifdef A\n=\n#else\n+\n#endif\n12
>
> When we hit the "#ifdef", we get a rejection event, and in the handler I
> thought I could clean it up:
>
>     $pos = $pos + $len - $newlen + 1;
>     substr($string, $pos, $newlen) = $cpp2;
>
> ($string is the original string, $pos is the current position, $len is the
> total length of the ifdef, $newlen is the length of the replacement text,
> and $cpp2 is the replacement text).
> I insert the replacement text at the end of the ifdef and set the position
> to before the replacement text. Now I hoped that upon resume the parser
> would get the replacement text and be happy.
>
> No such luck. Please note that I got the following to work: Find out what
> lexeme was expected and read it with the external parser (lexeme_read), and
> proceed with the text after it.
>
>     $pos = $pos + $len + 1;
>     $recce->lexeme_read('OP', $pos, 1, '=');
>
> But this approach only works because this grammar is so simple and I can
> easily deal with all cases of possible rejections by looking at the
> expected lexemes.
>
> Note that if I put the "=" into the input string and try to continue
> parsing from before it, I get another rejection event at this very point.
> This is really strange because the grammar expects an OP, I give it an OP,
> but it cannot parse it.
>
> Intuitively, there is something I must be doing wrong as it seems there
> should be a way of getting this to work.
>
> Any suggestions would be greatly appreciated.
>
> Thanks, Th.
>
> --
> You received this message because you are subscribed to the Google Groups
> "marpa parser" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
