[Bug target/85961] scratch register rsi used after function call
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85961 --- Comment #4 from bki at hacon dot de --- (In reply to Alexander Monakov from comment #3) > You'd need to disable IPA-RA after forcing -O2 with the pragma, i.e.: > > #pragma GCC optimize "O2" > #pragma GCC optimize "no-ipa-ra" Yes, this changes the problematic instruction sequence to: 7e: 48 89 c5 mov%rax,%rbp 81: e8 7a ff ff ffcallq 0 <_ZNKSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE6lengthEv.isra.0> 86: 48 8d 54 05 00lea0x0(%rbp,%rax,1),%rdx Which is better, since rbp is not a scratch register. > [...]. Do we want to adjust it given that "pragma optimized" is documented as > "not suitable for production use"? I appreciate the explanation which makes sense to me. In the original context, I did introduce the pragma to reduce compilation times like this: #pragma GCC optimize "-fno-var-tracking", "-fno-var-tracking-assignments" So, with instrumentation but without the optimize pragma, scratch registers will always be considered changed across a function call? This would then both fix my issue and more general doubts on the feasibility of my ABI messing. I also agree that the pragma is clearly documented as not fit for production use and would remove it accordingly. I think I can manage to pass these options on the command line somehow.
[Bug target/85961] scratch register rsi used after function call
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85961 --- Comment #2 from bki at hacon dot de --- (In reply to Richard Biener from comment #1) > To me it shoulds you are messing with the ABI behind GCCs back. True. I'd still argue that the SysV-ABI suggests that a caller would need to save and restore %rsi before/after a normal function call. But you are right in that the function _ZNKSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE6lengthEv.isra.0 is not 'normal' in this case but completely in the hands of the compiler (who does not clobber rsi). I could rework my messing with the ABI to change fewer registers but since that is very performance sensitive code, I'd like to understand better if it is really necessary. > We use IPA > register allocation to do this kind of transform that of course doesn't work > if you do this. > > So - does -fno-ipa-ra fix this for you? No, I get the same instruction sequence with '-fno-ipa-ra'
[Bug c++/85961] New: scratch register rsi used after function call
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85961 Bug ID: 85961 Summary: scratch register rsi used after function call Product: gcc Version: 7.3.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: bki at hacon dot de Target Milestone: --- Created attachment 44203 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44203=edit Minimal cpp-File producing the bug With g++ 7.0.0 and g++ 7.3.0 and compiling the attached file as detailed in the comments, produces an object file containing the following instruction sequence: 76: 48 89 c6mov%rax,%rsi 79: e8 82 ff ff ff callq 0 <_ZNKSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE6lengthEv.isra.0> 7e: 48 8d 14 06 lea(%rsi,%rax,1),%rdx According to the System V AMD64 ABI (https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf), %rsi is a scratch register not preserved across function calls. I therefore think that this instruction sequence is wrong. Why is this a problem? The function which is called does not clobber %rsi itself. But compiling with '-pg -mfentry' will instrument that function call and allow a user specified function to clobber %rsi. A call to __fentry__ should not clobber %rsi, since %rsi is used for parameter passing and must be unchanged when __fentry__ returns. But by temporarily replacing the return address __fentry__ can instrument the function exit. There, clobbering %rsi (and all other scratch registers) should be acceptable.