Hi,

I am investigating the performance of AddressSanitizer checks. In the
process, I've found a few checks that look as follows:

     1: 0.27 |        mov    %rax,%rcx
     2: 0.01 |        shr    $0x3,%rcx
     3: 0.03 |        mov    0x7fff8000(%rcx),%cl
     4: 0.57 |        test   %cl,%cl
     5: 0.29 |     |<<je     abe
     6:      |     |  mov    %eax,%edx
     7:      |     |  and    $0x7,%edx
     8:      |     |  add    $0x3,%edx
     9:      |     |  movsbl %cl,%ecx
    10:      |     |  cmp    %ecx,%edx
    11:      |     |  jge    1208
    12: 0.25 | abe:|> movl   $0xdeadbeef,(%rax)

The address being tested is in %rax. This is for a four-byte load, so there
is a fast path and a slow path. The numbers to the left are samples from
the perf sampling profiler, and indicate the cost of each line. From these
it seems that the slow path is never taken.

What is intriguing is that the compiler (clang in this case) places the
slow path block in-line, thus the branch on line five is likely to be
mispredicted. This is not the case for the branch on line 11.

Is this the desired behavior? I.e., are these slow paths often taken? Or is
this layout desirable for other reasons?

If this is not the desired behavior, is it a compiler problem? Should clang
be able to recognize this pattern and automatically guess branch weights?

If it's not a compiler problem, would it make sense to specify branch
weights in AddressSanitizer? Currently, AddressSanitizer does not add any
weights to the branches it creates.

Cheers,
Jonas

-- 
You received this message because you are subscribed to the Google Groups 
"address-sanitizer" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.

Reply via email to