If the SymbolTable is not built by the Parser then you cannot generate the correct AST for a class definition vs a function call. Also you cannot generate prompt errors close to the point where the error occurs in the parse. 90+% of all semantic ambiguities can be resolved by a SymbolTable in the Parser. Most C++ programs will not have any statements that cannot be resolved inline in the parser and would need to be resolved in a later AST TreeParser phase. The 90% case should not pay the compile time cost of a separate TreeParser just to resolve ambiguities that can be handled by the Parser.
No feedback is required for the Lexer. The Lexer can resolve all pre-processor statements before the Parser is invoked. The Lexer uses it's own primitive SymbolTable to handle text substitutions and macros. Again, 90+% of all C++ programs do not use any complex macros where a second lexer pass would be needed. For simple includes and text substitutions the Lexer can handle them inline. In 10% of the cases The Lexer needs 2 passes where the first pass processes all includes, macros and substitutions and the second pass generates the token stream for the Parser. On Mon, Jul 11, 2011 at 5:16 PM, Loring Craymer <lgcray...@yahoo.com> wrote: > On 3.) : The parser just recognizes syntax and ignores semantic > ambiguities; > then in a first tree walker pass, the symbol tables are constructed; a > second > tree walker pass uses the symbol table information to resolve ambiguities > (which > of several interpretations of valid syntax is correct) and rewrites the > tree > into an unambiguous form. Then you have finished recognizing C++ and can > continue with further processing. Never, never, never would I suggest > adding a > feedback mechanism that couples parser to lexer. > > --Loring > > > ----- Original Message ---- > > From: The Researcher <researcher0...@gmail.com> > > To: antlr-interest@antlr.org > > Sent: Mon, July 11, 2011 12:55:05 PM > > Subject: Re: [antlr-interest] Can one identify the type of parser needed > for a > >given BNF grammar > > > > > > > > Loring, > > > > > > Thanks for the feedback. > > > > While I now can understand your answer, I still don't have enough > > experience to implement any approach correctly. > > > > As a help to others who are working on this in private or interested in > > creating a C++ parser I will gloss what Loring said because last week > it > > would have mostly been Greek to me. If I say something wrong, please > > correct. I am not afraid to make mistakes in public so that others may > learn > > from them. > > > > The big thing that one has to learn about parsing C++ is how to handle > the > > ambiguities. That is the dragon to be slayed here. > > > > 1. And then you have to figure out how to prune the GLR-generated > "forests". > > > > GLR http://en.wikipedia.org/wiki/GLR_parser > > > > The reason GLR is considered for C++ is that do to the ambiguities of > C++ > > most parsers in a typical fashion have to couple the feedback from the > > parser to the lexer and reference a symbol table to correctly parse C++. > > This is contrary to what is taught in school and most of the early > examples > > of parsing you will come across. This allows them to produce an AST > without > > ambiguities after the parser. This is pushing the work to the front. GLR > is > > one of the most powerful parsing strategies and can parse C++ without > > coupling the parser to the lexer, but does so by generating multiple > > subtrees for each ambiguity, for which you must later prune out the > > ambiguities from the forests. This is pushing the work to the back. > > > > 2. C++ is nasty; it can be parsed with ANTLR (as shown by NEXT and David > > Wigg's > > adaptions of that grammar). > > > > The ANTLR C++ grammar does so by creating a symbol table and using > > predicates during the parse. In my initial conversions of the ANTLR C > > grammar to C# I was able to convert it, but did not understand why it > did > > certain things at first. After researching it was clear that a lot of > the > > ambiguities of C++ arise when an name is encountered and then it's use ( > a > > type, an identifier, etc) has to be know in order to take the > appropriate > > branch. Thus the creation of the symbol tables and other tables and the > use > > of predicates. This pushes the work to the front. > > > > 3. I believe that the right strategy with ANTLR is actually to use > > multi-pass recognition to sort out the ambiguities. That has not been > done > > yet. > > The problem is that C++ cannot be adequately described with a > context-free > > grammar; you have to do some context-sensitive processing to resolve the > > syntax > > that is semantically ambiguous. > > > > It could/should be possible to parse C++ with ANTLR without a symbol > table > > and associated predicates, and then in the AST analysis verify that the > > input C++ syntax is valid C++ semantics. > > Remember that the BNF for a grammar does not guarantee valid > code/semantics > > only correct syntax. So the AST is analyzed to find invalid semantics. I > am > > not sure if this will require coupling the parser to the lexer or the > AST to > > the parser, but if I go this route I will soon know. > > > > I know this went off topic but was worth it. > > > > Any answer to the original question is still appreciated. > > > > Thanks > > > > Eric > > > > List: http://www.antlr.org/mailman/listinfo/antlr-interest > > Unsubscribe: > >http://www.antlr.org/mailman/options/antlr-interest/your-email-address > > > > List: http://www.antlr.org/mailman/listinfo/antlr-interest > Unsubscribe: > http://www.antlr.org/mailman/options/antlr-interest/your-email-address > List: http://www.antlr.org/mailman/listinfo/antlr-interest Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address -- You received this message because you are subscribed to the Google Groups "il-antlr-interest" group. To post to this group, send email to il-antlr-inter...@googlegroups.com. To unsubscribe from this group, send email to il-antlr-interest+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/il-antlr-interest?hl=en.