[clang] [Clang][Lexer][Performance] Optimize Lexer whitespace skipping logic (PR #180819)

Oliver Hunt via cfe-commits Fri, 13 Feb 2026 19:48:20 -0800

ojhunt wrote:

> I am not sure this is entirely true. While it _is_ certainly influenced by 
> microarchitectural details, it is by no means a "small" change. In pure -E 
> compilation, I achieved a nearly 2% performance improvement, which means it 
> was probably even more when considering only the lexing part.


I asked you to try the above because I'm assuming you already have testing set 
up. For me to look at this myself (I'm curious about how we might be able to do 
better), what codebases are you testing with?

> 
> Furthermore, it _does_ depend on the processor, but the idea behind this 
> change is that, for most codebases, the `isHorizontalWhitespace` call is 
> _never_ true, which means it can basically be skipped by the branch predictor 
> (which I assume will be the case on most modern processors), so we are simply 
> exchanging a table lookup with a `*CurPtr == 32`. This can be proven by the 
> `branch-misses` metrics. It is too noisy to be meaningful on the LLVM Compile 
> Time Tracker, but using `perf stat` locally I find that it goes down. This 
> means that:
> 
> * Branch misses go down.
> * Branching cost probably goes down as well.
> 
> Therefore, I would be surprised if this causes a regression on _any_ serious 
> processor.

I'm not so much expecting a regression as questioning the practical performance 
win in exchange for the relative complexity. That can easily be mitigated by 
embedding the space check directly into `isHorizontalWhitespace` rather than 
adding it ad hoc around the lexer.
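
A rough sketch of what embedding the check could look like. Everything in the 
`sketch` namespace is a hypothetical stand-in, not the actual `clang::charinfo` 
code: the table layout, the `HorzWS` flag, and the assumed whitespace set 
(space, `\t`, `\f`, `\v`) are only there to make the example self-contained.

```cpp
// Illustrative only: a simplified, hypothetical classifier, not clang's.
namespace sketch {

constexpr unsigned char HorzWS = 0x01; // assumed flag for ' ', '\t', '\f', '\v'

struct Table {
  unsigned char V[256];
};

// Build the lookup table at compile time.
constexpr Table makeTable() {
  Table T{};
  T.V[static_cast<unsigned char>(' ')] = HorzWS;
  T.V[static_cast<unsigned char>('\t')] = HorzWS;
  T.V[static_cast<unsigned char>('\f')] = HorzWS;
  T.V[static_cast<unsigned char>('\v')] = HorzWS;
  return T;
}

static constexpr Table InfoTable = makeTable();

// Baseline: every query goes through the table.
inline bool isHorizontalWhitespaceTable(unsigned char C) {
  return (InfoTable.V[C] & HorzWS) != 0;
}

// Same answer, but the plain-space case is tested first, so callers get the
// cheap compare without spelling it out at each call site.
inline bool isHorizontalWhitespace(unsigned char C) {
  if (C == ' ')
    return true;
  return (InfoTable.V[C] & HorzWS) != 0;
}

// Call sites stay unchanged: advance past a run of horizontal whitespace.
// Stops at the terminating NUL, which is not whitespace.
inline const char *skipHorizontalWhitespace(const char *CurPtr) {
  while (isHorizontalWhitespace(static_cast<unsigned char>(*CurPtr)))
    ++CurPtr;
  return CurPtr;
}

} // namespace sketch
```

Whether this is measurably faster than the table-only version would still need 
to be confirmed with the same `perf stat` runs.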

If we really want to go all in on this, we could do what I did in jsc: actually 
build out lexer, token, and parser stats from a large amount of real code, and 
start aggressively ordering the checks in response.



https://github.com/llvm/llvm-project/pull/180819
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
