[clang] [clang] Fix comma location tracking inside ParenListExpr. (PR #199411)

Oliver Hunt via cfe-commits Sun, 07 Jun 2026 00:59:51 -0700

ojhunt wrote:

> > > > > Hi, are there other suggestions? After reflecting a bit on the 
> > > > > approach proposed by @ojhunt I think that it won't fit my usecase all 
> > > > > that well, since I'd like to have accurate locations from clang in 
> > > > > all cases. I can understand memory concerns, though they do not seem 
> > > > > too excessive to me. I can run some benchmarks, though I'm not quite 
> > > > > sure how.
> > > > 
> > > > 
> > > > I guess this is restricted to just parenlist but it still doesn't seem 
> > > > necessary - can you explain your reasoning as to why finding the 
> > > > location at diagnostic time is not feasible/sufficent?
> > > > There should fundamentally not be any difference in the final source 
> > > > location that you end up with
> > > 
> > > 
> > > Because there are cases where you lose track of exact location, as you 
> > > correctly pointed out 
> > > [here](https://github.com/llvm/llvm-project/pull/199411#issuecomment-4543531463).
> > 
> > 
> > No, I did not say that, I said to give up on trying to diagnose the exact 
> > location, because
> > ```c++
> > #define COMMA ,
> > ...
> > ...
> > (0 COMMA 1)
> > ```
> > 
> > 
> >     
> >       
> >     
> > 
> >       
> >     
> > 
> >     
> >   
> > The diagnostic can point to the line containing the macro definition, or to 
> > the place where the macro is used. I think your change here means that the 
> > diagnostic will appear at the location of the macro, which is not something 
> > I think users would expect.
> > My suggestion meant that in cases like this the diagnostic would remain 
> > with the code.
> 
> I see. I would instead expect the opposite (the diagnostic would talk about a 
> comma, yet not point to a comma, which I find misleading), as the location 
> inside the macro is the real comma location within the source. Perhaps this 
> different perspective is influenced by the fact that our tool would actually 
> show **both** the source location (so pointing to `COMMA`), the macro 
> expansion chain and the final location in the preprocessed source, so it's 
> quite a bit richer than a diagnostic.
> 
> > > Specifically in 
> > > [findNextToken()](https://clang.llvm.org/doxygen/Lexer_8cpp_source.html#l01375)
> > >  it will give up if the location is not at the end of a macro expansion,
> > 
> > 
> > If `findNextToken` did that, then clang could never find the first token
> 
> well but wouldn't the first condition cause it to return `std::nullopt` here?
> 
> ```c++
> #define COMMA_PLUS +1,
> 
> void f() {
>  (int)(0 COMMA_PLUS 1);
> }
> ```
>


You can get the right this if you _really_ want to - it requires jumping 
through hoops to get macro expansions, that might not be helpful

>  > > so there is indeed a difference between tracking exact location and 
> trying to reconstruct it afterwards.
> > 
> > 
> > Not here. You know where to start when trying to find the token, and asking 
> > the lexer to provide the next token after the preceding expression will 
> > provide you the exact same location as when the node
> 
> Perhaps I'm missing some detail here, I will try to reason about it. The way 
> I see it, when in the example above we lex from the end of COMMA_PLUS instead 
> of the comma location we would get the `+` as the token, therefore the 
> reconstructed loc would point to that instead. I should have a go at 
> implementing the alternative solution to see if it is indeed the case.
> 

What I said was "to give up when you hit a macro" - e.g. if you trivially find 
the comma, then hurrah, otherwise just revert to the existing diagnostic

> > > The use of the `-Wcomma` diagnostic in the test was just an easy way to 
> > > show the problem, but potentially more intricate testcases can be derived 
> > > that would work with one approach and fail to reconstruct the exact 
> > > location with the other proposed method.
> > > For my usecase (static analyzer that uses the clang API as the parsing 
> > > backend) source fidelity does matter a lot, but if it is deemed excessive 
> > > for clang I can keep this downsteam.
> > 
> > 
> > There is no source fidelity issue here.
> 
> Will look also at the review comments. Thanks for feedback, really 
> appreciated.

I also realized: if you are doing this as a static analyzer you should be 
operating in conjunction with the AST which definitionally has the full source 
location anyway?

https://github.com/llvm/llvm-project/pull/199411
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] Fix comma location tracking inside ParenListExpr. (PR #199411)

Reply via email to