On 2/6/20 6:07 AM, Jakub Jelinek wrote:
> On Thu, Feb 06, 2020 at 01:00:36AM +0000, JonY wrote:
>> On 2/4/20 11:42 AM, Jakub Jelinek wrote:
>>> Hi!
>>>
>>> On Tue, Feb 04, 2020 at 11:16:06AM +0100, Uros Bizjak wrote:
>>>> I guess that Comment #9 patch from the PR should be trivially correct,
>>>> but although it looks obvious, I don't want to propose the patch since
>>>> I have no means of testing it.
>>>
>>> I don't have means of testing it either.
>>> https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=vs-2019
>>> is quite explicit that [xyz]mm16-31 are call clobbered and only xmm6-15 (low
>>> 128-bits only) are call preserved.
>>>
>>> Jonathan, could you please test this if it is sufficient to just change
>>> CALL_USED_REGISTERS or if e.g. something in the pro/epilogue needs tweaking
>>> too?  Thanks.
>>
>> Is this patch testing still required? I just got back from traveling.
> 
> Yes, our reading of the MS ABI docs shows that xmm16-31 are to be call used
> (not preserved over calls), while in gcc they are currently handled as
> preserved across the calls.
> 
>       Jakub
> 


--- original.s  2020-02-06 09:00:02.014638069 +0000
+++ new.s       2020-02-07 10:28:55.678317667 +0000
@@ -7,23 +7,23 @@
 qux:
        subq    $72, %rsp
        .seh_stackalloc 72
-       vmovaps %xmm18, 48(%rsp)
-       .seh_savexmm    %xmm18, 48
+       vmovaps %xmm6, 48(%rsp)
+       .seh_savexmm    %xmm6, 48
        .seh_endprologue
        call    bar
        vmovapd %xmm0, %xmm1
-       vmovapd %xmm1, %xmm18
+       vmovapd %xmm1, %xmm6
        call    foo
        leaq    32(%rsp), %rcx
-       vmovapd %xmm18, %xmm0
-       vmovaps %xmm0, 32(%rsp)
+       vmovapd %xmm6, %xmm0
+       vmovapd %xmm0, 32(%rsp)
        call    baz
        nop
-       vmovaps 48(%rsp), %xmm18
+       vmovaps 48(%rsp), %xmm6
        addq    $72, %rsp
        ret
        .seh_endproc
-       .ident  "GCC: (GNU) 10.0.0 20191024 (experimental)"
+       .ident  "GCC: (GNU) 10.0.1 20200206 (experimental)"
        .def    bar;    .scl    2;      .type   32;     .endef
        .def    foo;    .scl    2;      .type   32;     .endef
        .def    baz;    .scl    2;      .type   32;     .endef

GCC with the patch now seems to put the variables in xmm6. Unfortunately,
I don't know enough about AVX or stack setups to know if that's all that
is needed.
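
For reference, here is how I read the rule from the MS docs quoted above,
written out as a small C sketch. The names (vec_reg_class,
ms_x64_vec_reg_class, ms_x64_needs_prologue_save) are made up for
illustration and are not anything in GCC; the xmm0-5 case is not mentioned
in this thread and comes from the same MS page as I understand it.

```c
#include <stdbool.h>

/* Hypothetical helper, not GCC code: classifies vector register numbers
   (the N in xmmN/ymmN/zmmN) under the Microsoft x64 calling convention. */
enum vec_reg_class {
    VEC_CALL_CLOBBERED,  /* value may be lost across a call */
    VEC_CALL_PRESERVED   /* callee must save/restore the low 128 bits */
};

enum vec_reg_class ms_x64_vec_reg_class(int regno)
{
    /* xmm6-15 are call preserved (low 128 bits only). */
    if (regno >= 6 && regno <= 15)
        return VEC_CALL_PRESERVED;

    /* xmm0-5 and [xyz]mm16-31 are call clobbered. Out-of-range numbers
       also land here, which is the conservative answer. */
    return VEC_CALL_CLOBBERED;
}

/* A register only needs a prologue save (and a .seh_savexmm entry)
   if the callee is obliged to preserve it. */
bool ms_x64_needs_prologue_save(int regno)
{
    return ms_x64_vec_reg_class(regno) == VEC_CALL_PRESERVED;
}
```

Under that reading, xmm18 in original.s should never have been saved in the
prologue, and a value that must survive the calls belongs in a register
like xmm6, which is what new.s does.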

