ojhunt wrote:

I have been thinking about this more. This change is predicated on spaces 
being common enough that a well-predicted branch is responsible for the win, 
but I think the actual win is almost entirely due to removing the table 
indirection in most whitespace cases. My testing shows that the pure table 
lookup is around 20% slower than either the space branch + table lookup or 
the single condition options. That would explain the measurable improvement 
in the context of the work involved in actually running the preprocessor. I 
also found that once around 6-7% of whitespace is tabs, the 
`if (c==' '||tablelookup)` form actually becomes slower, presumably because 
at that point the branch is sufficiently unpredictable that you pay for both 
the mispredicted branch and the table lookup. I'm not sure what code has 7% 
of its whitespace as tabs, but I also don't know what the actual percentage 
of tabs is in projects that use tab-based indentation.
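
For anyone who wants to reproduce this kind of comparison, here is a minimal 
sketch of the three variants plus an input generator with a configurable tab 
fraction. It is not the harness used for the numbers above: `WsTable` is a 
hypothetical stand-in for clang's `charinfo::InfoTable`, and all function 
names are made up for illustration. It assumes C++17.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <random>
#include <string>

// Hypothetical stand-in for charinfo::InfoTable: nonzero marks horizontal
// whitespace. The real table carries more flags than this.
constexpr std::array<unsigned char, 256> makeWsTable() {
  std::array<unsigned char, 256> t{};
  t[' '] = 1;
  t['\t'] = 1;
  t['\f'] = 1;
  t['\v'] = 1;
  return t;
}
static constexpr auto WsTable = makeWsTable();

// Variant 1: pure table lookup.
inline bool wsTableOnly(unsigned char c) { return WsTable[c] != 0; }

// Variant 2: space branch first, table lookup as the fallback.
inline bool wsSpaceThenTable(unsigned char c) {
  return c == ' ' || WsTable[c] != 0;
}

// Variant 3: chained comparisons, no table at all.
inline bool wsCompareChain(unsigned char c) {
  return c == ' ' || c == '\t' || c == '\f' || c == '\v';
}

// Build n whitespace characters where roughly tabFraction of them are tabs
// and the rest are spaces. A fixed seed keeps runs comparable.
inline std::string makeWhitespace(std::size_t n, double tabFraction,
                                  unsigned seed = 42) {
  assert(tabFraction >= 0.0 && tabFraction <= 1.0);
  std::mt19937 rng(seed);
  std::bernoulli_distribution isTab(tabFraction);
  std::string s(n, ' ');
  for (char &ch : s)
    if (isTab(rng))
      ch = '\t';
  return s;
}

// The loop under test: count the leading whitespace run using a predicate.
template <typename Pred>
std::size_t countLeading(const std::string &s, Pred isWs) {
  std::size_t i = 0;
  while (i < s.size() && isWs(static_cast<unsigned char>(s[i])))
    ++i;
  return i;
}
```

To time it, wrap repeated `countLeading(makeWhitespace(n, f), variant)` calls 
in `std::chrono::steady_clock` measurements and sweep `f` from 0 up to about 
0.1. Make sure the result is actually consumed so the compiler cannot drop 
the loop.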

That said, I'd like to know what performance difference you are seeing 
between

```cpp
bool isHorizontalWhitespace(unsigned char c) {
    return c == ' ' || c == '\t' || c == '\f' || c == '\v';
}
```

and

```cpp
while (LLVM_LIKELY(Char == ' ') || isHorizontalWhitespace(Char))
```

and possibly even just updating the main function:

```cpp
bool isHorizontalWhitespace(unsigned char c) {
  using namespace charinfo;
  if (LLVM_LIKELY(c == ' '))
    return true;
  return (InfoTable[c] & (CHAR_HORZ_WS|CHAR_SPACE)) != 0;
}
```



https://github.com/llvm/llvm-project/pull/180819
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
