[Bug target/85961] scratch register rsi used after function call

2018-05-29 Thread bki at hacon dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85961

--- Comment #4 from bki at hacon dot de ---
(In reply to Alexander Monakov from comment #3)
> You'd need to disable IPA-RA after forcing -O2 with the pragma, i.e.:
> 
> #pragma GCC optimize "O2"
> #pragma GCC optimize "no-ipa-ra"

Yes, this changes the problematic instruction sequence to:
  7e: 48 89 c5  mov%rax,%rbp
  81: e8 7a ff ff ffcallq  0
<_ZNKSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE6lengthEv.isra.0>
  86: 48 8d 54 05 00lea0x0(%rbp,%rax,1),%rdx

Which is better, since rbp is not a scratch register.


> [...]. Do we want to adjust it given that "pragma optimized" is documented as 
> "not suitable for production use"?
I appreciate the explanation which makes sense to me. In the original context,
I did introduce the pragma to reduce compilation times like this:
  #pragma GCC optimize "-fno-var-tracking", "-fno-var-tracking-assignments"

So, with instrumentation but without the optimize pragma, scratch registers
will always be considered changed across a function call?

This would then both fix my issue and more general doubts on the feasibility of
my ABI messing. I also agree that the pragma is clearly documented as not fit
for production use and would remove it accordingly. I think I can manage to
pass these options on the command line somehow.

[Bug target/85961] scratch register rsi used after function call

2018-05-29 Thread bki at hacon dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85961

--- Comment #2 from bki at hacon dot de ---
(In reply to Richard Biener from comment #1)
> To me it shoulds you are messing with the ABI behind GCCs back.
True.

I'd still argue that the SysV-ABI suggests that a caller would need to save and
restore %rsi before/after a normal function call. But you are right in that the
function _ZNKSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE6lengthEv.isra.0
is not 'normal' in this case but completely in the hands of the compiler (who
does not clobber rsi).

I could rework my messing with the ABI to change fewer registers but since that
is very performance sensitive code, I'd like to understand better if it is
really necessary.

>  We use IPA
> register allocation to do this kind of transform that of course doesn't work
> if you do this.
> 
> So - does -fno-ipa-ra fix this for you?
No, I get the same instruction sequence with '-fno-ipa-ra'

[Bug c++/85961] New: scratch register rsi used after function call

2018-05-28 Thread bki at hacon dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85961

Bug ID: 85961
   Summary: scratch register rsi used after function call
   Product: gcc
   Version: 7.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bki at hacon dot de
  Target Milestone: ---

Created attachment 44203
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44203=edit
Minimal cpp-File producing the bug

With g++ 7.0.0 and g++ 7.3.0 and compiling the attached file as detailed in the
comments, produces an object file containing the following instruction
sequence:

  76:   48 89 c6mov%rax,%rsi
  79:   e8 82 ff ff ff  callq  0
<_ZNKSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE6lengthEv.isra.0>
  7e:   48 8d 14 06 lea(%rsi,%rax,1),%rdx

According to the System V AMD64 ABI
(https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf),
%rsi is a scratch register not preserved across function calls. I therefore
think that this instruction sequence is wrong.

Why is this a problem?

The function which is called does not clobber %rsi itself. But compiling with
'-pg -mfentry' will instrument that function call and allow a user specified
function to clobber %rsi. A call to __fentry__ should not clobber %rsi, since
%rsi is used for parameter passing and must be unchanged when __fentry__
returns. But by temporarily replacing the return address __fentry__ can
instrument the function exit. There, clobbering %rsi (and all other scratch
registers) should be acceptable.