rsmith added inline comments.
================ Comment at: clang/lib/Lex/Preprocessor.cpp:870-900 + TokenSource Source; do { + Source = TokenSource(); + switch (CurLexerKind) { case CLK_Lexer: + Source.InDirective = CurLexer->ParsingPreprocessorDirective; ---------------- ilya-biryukov wrote: > rsmith wrote: > > This is a lot of extra stuff to be doing in the main `Lex` loop. Adding one > > (perfectly-predicted) branch on `OnToken` seems like it should be fine, but > > this seems like a bit more added complexity than I'd prefer. I'd like some > > performance measurements of `-cc1 -Eonly` to see if this makes any real > > difference. > Happy to do the measurements. Do you have a good collection of input files > for this in mind? I think any large header should be fine for testing; I don't expect this to scale worse in the presence of lots of macro expansion or anything like that. ================ Comment at: clang/lib/Lex/Preprocessor.cpp:896 case CLK_LexAfterModuleImport: - ReturnedToken = LexAfterModuleImport(Result); + Source.InDirective = true; + ---------------- ilya-biryukov wrote: > rsmith wrote: > > This isn't a directive; these are normal tokens. > Sorry if it's a silly question, just wanted to clarify I'm not missing > anything here. > > These tokens are the name of the module, "import-suffix" and the semicolon > that follows it? > And they end up being used to build the AST for `ImportDecl`? Right. LexAfterModuleImport is a pass-through lexing layer that takes special action on the ObjC `@import module.name;` and C++20 `import "header";` declarations, which are just normal phase 4 tokens. ================ Comment at: clang/lib/Lex/Preprocessor.cpp:956-957 --LexLevel; + if (OnToken) + OnToken(Result, Source); } ---------------- ilya-biryukov wrote: > ilya-biryukov wrote: > > ilya-biryukov wrote: > > > rsmith wrote: > > > > This seems like it's going to be extremely hard to use. If you just > > > > want the expanded token stream, then removing all the other calls to > > > > `OnToken` and only calling it here when `LexLevel == 0` will give you > > > > that. > > > I've tried `LexLevel == 0 && IsNewToken` and it **almost** works. > > > The only trouble left is dealing with the delayed parsing. E.g. I'm > > > getting the callbacks for the same tokens multiple times in cases like > > > parsing of method bodies. > > > > > > Any ideas on how to best tackle this? > > > > > > The example is: > > > ``` > > > struct X { > > > int method() { return 10; } > > > int a = 10; > > > }; > > > ``` > > > > > > Seeing multiple callbacks for `{ return 10; }` and `= 10;`, want to see > > > only the first one. > > It almost works now, see the update revision. > > Almost no tokens get reported by the callback twice now, except the > > following case. > > Input: > > ``` > > struct X { > > int method() { return method(); } > > }; > > ``` > > After the body of the class, we get a an extra callback for `)` token from > > one of the functions under `ParsedLexedMethodDef`. > > This seems to be happening only for parenthesis. > More precisely, this happens from a following sequence of calls: > > ``` > Lex() {LexLevel=0} > CachingLex() {LexLevel=1} // which does not have a token in `CachedTokens`, > so it recurses into ... > Lex() {LexLevel=1} // which results in a call to ... > CurTokenLexer->Lex() > ``` > > At this point `CurTokenLexer` returns a token of a pre-lexed method > definition, but the current implementation forgets that this was the case by > the time the token is processed at the the top of the callstack (`Lex() > {LexLevel=0}`). > So the token ends up being reported by a callback. Yeah, the parser messes with the token stream to support C++'s out-of-order parsing rule (among other things). Ignoring that here seems right. You might also want to make sure that calls to `EnterToken` are handled properly. The extra paren likely comes from such a call. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D59885/new/ https://reviews.llvm.org/D59885 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits