ojhunt wrote:
I have been thinking about this more. This change is predicated on the
well-predicted branch for the (much more likely) space character being
responsible for the win, but I think the actual win is almost entirely due to
removing the table indirection in most whitespace cases. My testing shows that
the pure table lookup is around 20% slower than either the space branch + table
lookup or the single condition options. That would explain the measurable
improvement in the context of the work involved in actually running the
preprocessor. I also found that once around 6-7% of whitespace is tabs, the
`if (c==' '||tablelookup)` form actually becomes slower. Presumably at that
point the branch is sufficiently unpredictable that you pay for both the
mispredicted branch and the table lookup. I'm not sure what code has 7% of its
whitespace as tabs, but I also don't know what the actual percentage of tabs is
in projects using tab-based indenting.
That said, I'd like to know what performance difference you are seeing
between
```cpp
bool isHorizontalWhitespace(unsigned char c) {
  return c == ' ' || c == '\t' || c == '\f' || c == '\v';
}
```
and
```cpp
while (LLVM_LIKELY(Char == ' ') || isHorizontalWhitespace(Char))
```
and possibly even just updating the main function:
```cpp
bool isHorizontalWhitespace(unsigned char c) {
  using namespace charinfo;
  if (LLVM_LIKELY(c == ' '))
    return true;
  return (InfoTable[c] & (CHAR_HORZ_WS|CHAR_SPACE)) != 0;
}
```
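For anyone who wants to reproduce the comparison, here is a minimal sketch that
puts the three variants side by side. It is not the code from the PR. The names
`sketchTable`, `kHorzWS`, `SKETCH_LIKELY`, `makeWhitespace`, `skipWhitespace`
and `checkVariantsAgree` are hypothetical, and the table is a simplified
stand-in for `charinfo::InfoTable` that models only horizontal whitespace.
```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <string>

// Hypothetical stand-in for LLVM_LIKELY so the sketch builds outside LLVM.
#if defined(__GNUC__)
#define SKETCH_LIKELY(X) __builtin_expect(!!(X), 1)
#else
#define SKETCH_LIKELY(X) (X)
#endif

// Assumption: a single flag bit is enough to model the table lookup cost.
constexpr std::uint8_t kHorzWS = 0x1;

// Simplified 256-entry table, not the real charinfo::InfoTable.
inline const std::array<std::uint8_t, 256> &sketchTable() {
  static const std::array<std::uint8_t, 256> Table = [] {
    std::array<std::uint8_t, 256> T{};
    T[' '] = T['\t'] = T['\f'] = T['\v'] = kHorzWS;
    return T;
  }();
  return Table;
}

// Variant 1: pure table lookup.
inline bool isHWSTable(unsigned char C) {
  return (sketchTable()[C] & kHorzWS) != 0;
}

// Variant 2: space fast path, falling back to the table.
inline bool isHWSSpaceThenTable(unsigned char C) {
  if (SKETCH_LIKELY(C == ' '))
    return true;
  return (sketchTable()[C] & kHorzWS) != 0;
}

// Variant 3: plain comparisons, no table at all.
inline bool isHWSCompare(unsigned char C) {
  return C == ' ' || C == '\t' || C == '\f' || C == '\v';
}

// Builds N whitespace characters of which roughly TabPercent% are tabs,
// placed pseudo-randomly so the space branch is not trivially predictable.
inline std::string makeWhitespace(std::size_t N, unsigned TabPercent) {
  std::mt19937 Rng(12345); // fixed seed so runs are repeatable
  std::string Buf(N, ' ');
  for (char &Ch : Buf)
    if (Rng() % 100 < TabPercent)
      Ch = '\t';
  return Buf;
}

// The loop shape being measured: advance while the predicate holds.
template <typename Pred>
std::size_t skipWhitespace(const std::string &Buf, Pred IsWS) {
  std::size_t I = 0;
  while (I < Buf.size() && IsWS(static_cast<unsigned char>(Buf[I])))
    ++I;
  return I;
}

// Sanity check: all three variants classify every byte identically.
inline void checkVariantsAgree() {
  for (int C = 0; C < 256; ++C) {
    unsigned char U = static_cast<unsigned char>(C);
    assert(isHWSTable(U) == isHWSCompare(U));
    assert(isHWSSpaceThenTable(U) == isHWSCompare(U));
  }
}
```
To compare timings, one would call `skipWhitespace` in a timed loop over
buffers from `makeWhitespace` with different `TabPercent` values. The results
will depend on the compiler, flags, and CPU, so this is only a starting point.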
https://github.com/llvm/llvm-project/pull/180819