vvuksanovic wrote:

> Theoretically, this macro expansion context is a hack. The `MacroExpansion`
> hierarchy should perfectly describe how macros are expanded - I just never
> bothered learning how to do that, and slapping a token watcher worked well so
> far. Ideally, we wouldn't need a token watcher, and we should just lean on
> the `MacroExpansions` somehow.
>
> Serialising this redundant information (the token sequence) into a PCH is not
> elegant, because we start leaking our wacky implementation, and if possible,
> I'd strive for not doing that.

As far as I can tell, that isn't redundant information. Macro expansions are not materialized anywhere in the lexer or preprocessor. They are implemented as a stack of `TokenLexer` objects, each of which expands a single macro. Each lexer holds the tokens of the macro it is expanding (taken from the macro definition), with the arguments substituted for function-like macros. Nested macro expansions are handled recursively by pushing a new lexer onto the stack, and some other things, like the `##` operator, are also handled separately. After each token is lexed/parsed, it is lost.

It might be possible to modify these lexers to keep the complete list of expanded tokens and return it through a new preprocessor callback, but that implementation would be even more invasive than the current one. It would also be wasteful, since it would always be on but almost never used.

> Unless it's a big ask, I'd ask you to explore what MacroExpansions encode.
> Could we use that somehow to replace our token watcher implementation, and
> just rely on that? If that is possible, we already have everything in the
> PCH, so that part would get simplified. WDYT?

The `MacroExpansion` class doesn't actually contain much data: only the name (for built-in macros) or the definition (for user macros), plus the expanded source range. Unfortunately, the source range doesn't help in getting the expanded tokens; it just points to the macro identifier in the source code. The macro definition is also not useful, since we don't know the state of the preprocessor at that point. There doesn't seem to be a way to reconstruct the expanded tokens/string from just this information, which is why I added the expanded string there.

I remembered that clangd also shows expansions when hovering over a macro. Their implementation is in `clang/lib/Tooling/Syntax/Tokens.cpp`, and they do something similar to `MacroExpansionContext`.
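
To make the stack-of-lexers point above more concrete, here is a small toy model. It is not Clang code: `ToyTokenLexer`, `expandMacro`, and `expandToString` are hypothetical names, tokens are plain strings, and only object-like macros are modelled (no arguments, no `##`). It is only meant to show that the expanded tokens are produced one at a time and never stored, so the only way to recover the full expansion is to watch them as they go by.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Toy model only: none of these types are Clang's.
using Token = std::string;
using MacroTable = std::map<std::string, std::vector<Token>>;

// One frame per macro being expanded, loosely analogous to a TokenLexer.
struct ToyTokenLexer {
  std::string Name;               // macro this frame is expanding
  const std::vector<Token> *Body; // tokens from the macro definition
  std::size_t Next;               // index of the next token to hand out
};

// A macro is not re-expanded while one of its own frames is on the stack.
static bool isActive(const std::vector<ToyTokenLexer> &Stack,
                     const std::string &Name) {
  for (const ToyTokenLexer &Frame : Stack)
    if (Frame.Name == Name)
      return true;
  return false;
}

// Expands `Name` and hands each fully expanded token to `Watcher`.
// Nothing is retained here: once a token is emitted, it is gone.
void expandMacro(const MacroTable &Macros, const std::string &Name,
                 const std::function<void(const Token &)> &Watcher) {
  auto It = Macros.find(Name);
  if (It == Macros.end()) {
    Watcher(Name); // not a macro, so it passes through unchanged
    return;
  }
  std::vector<ToyTokenLexer> Stack;
  Stack.push_back({Name, &It->second, 0});
  while (!Stack.empty()) {
    ToyTokenLexer &Top = Stack.back();
    if (Top.Next == Top.Body->size()) {
      Stack.pop_back(); // this macro's tokens are exhausted
      continue;
    }
    const Token &Tok = (*Top.Body)[Top.Next++];
    auto Nested = Macros.find(Tok);
    if (Nested != Macros.end() && !isActive(Stack, Tok)) {
      // Nested expansion: push a new frame instead of emitting the token.
      Stack.push_back({Tok, &Nested->second, 0});
      continue;
    }
    Watcher(Tok);
  }
}

// The "token watcher" side: the only place the full expansion exists.
std::string expandToString(const MacroTable &Macros, const std::string &Name) {
  std::string Out;
  expandMacro(Macros, Name, [&Out](const Token &Tok) {
    if (!Out.empty())
      Out += ' ';
    Out += Tok;
  });
  return Out;
}
```

With macros `A` defined as `1 + B`, `B` as `2 * C`, and `C` as `3`, `expandToString(Macros, "A")` yields `1 + 2 * 3`. Without the watcher lambda, nothing in `expandMacro` would let you recover that string afterwards.
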
Using a token watcher and preprocessor callbacks, they maintain a buffer containing spelling and expansion tokens, along with a mapping between expansion locations and the tokens in the expansion buffer.

I don't think there is a way to get macro expansions from the PCH without serializing them in some way ourselves.

https://github.com/llvm/llvm-project/pull/176126
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
