Re: [i386] Scalar DImode instructions on XMM registers

Ilya Enkovich Wed, 27 May 2015 06:20:51 -0700

2015-05-27 6:31 GMT+03:00 Jeff Law <[email protected]>:
> On 05/25/2015 09:27 AM, Ilya Enkovich wrote:
>>
>> 2015-05-22 15:01 GMT+03:00 Ilya Enkovich <[email protected]>:
>>>
>>> 2015-05-22 11:53 GMT+03:00 Ilya Enkovich <[email protected]>:
>>>>
>>>> 2015-05-21 22:08 GMT+03:00 Vladimir Makarov <[email protected]>:
>>>>>
>>>>> So, Ilya, to solve the problem you need to avoid sharing subregs for
>>>>> the
>>>>> correct LRA/reload work.
>>>>>
>>>>>
>>>>
>>>> Thanks a lot for your help! I'll fix it.
>>>>
>>>> Ilya
>>>
>>>
>>> I've fixed SUBREG sharing and got a missing spill. I added
>>> --enable-checking=rtl to check other possible bugs. Spill/fill code
>>> still seems incorrect because different sizes are used.  Shouldn't
>>> block me though.
>>>
>>> .L5:
>>>          movl    16(%esp), %eax
>>>          addl    $8, %esi
>>>          movl    20(%esp), %edx
>>>          movl    %eax, (%esp)
>>>          movl    %edx, 4(%esp)
>>>          call    counter@PLT
>>>          movq    -8(%esi), %xmm0
>>>          **movdqa  16(%esp), %xmm2**
>>>          pand    %xmm0, %xmm2
>>>          movdqa  %xmm2, %xmm0
>>>          movd    %xmm2, %edx
>>>          **movq    %xmm2, 16(%esp)**
>>>          psrlq   $32, %xmm0
>>>          movd    %xmm0, %eax
>>>          orl     %edx, %eax
>>>          jne     .L5
>>>
>>> Thanks,
>>> Ilya
>>
>>
>> I was wrong assuming reloads with wrong size shouldn't block me. These
>> reloads require memory to be aligned which is not always true. Here is
>> what I have in RTL now:
>>
>> (insn 2 7 3 2 (set (reg/v:DI 93 [ l ])
>>          (mem/c:DI (reg/f:SI 16 argp) [1 l+0 S8 A32])) test.c:5 89
>> {*movdi_internal}
>>       (nil))
>> ...
>> (insn 27 26 52 6 (set (subreg:V2DI (reg:DI 87 [ D.1822 ]) 0)
>>          (ior:V2DI (subreg:V2DI (reg:DI 99 [ D.1822 ]) 0)
>>              (subreg:V2DI (reg/v:DI 93 [ l ]) 0))) test.c:11 3489
>> {*iorv2di3}
>>       (expr_list:REG_DEAD (reg:DI 99 [ D.1822 ])
>>          (expr_list:REG_DEAD (reg/v:DI 93 [ l ])
>>              (nil))))
>>
>> After reload I get:
>>
>> (insn 2 7 75 2 (set (reg/v:DI 0 ax [orig:93 l ] [93])
>>          (mem/c:DI (plus:SI (reg/f:SI 7 sp)
>>                  (const_int 24 [0x18])) [1 l+0 S8 A32])) test.c:5 89
>> {*movdi_internal}
>>       (nil))
>> (insn 75 2 3 2 (set (mem/c:DI (reg/f:SI 7 sp) [3 %sfp+-16 S8 A64])
>>          (reg/v:DI 0 ax [orig:93 l ] [93])) test.c:5 89 {*movdi_internal}
>>       (nil))
>> ...
>> (insn 27 26 52 6 (set (reg:V2DI 21 xmm0 [orig:87 D.1822 ] [87])
>>          (ior:V2DI (reg:V2DI 21 xmm0 [orig:99 D.1822 ] [99])
>>              (mem/c:V2DI (reg/f:SI 7 sp) [3 %sfp+-16 S16 A64])))
>> test.c:11 3489 {*iorv2di3}
>>
>>
>> 'por' instruction requires memory to be aligned and fails in a bigger
>> testcase. There is also movdqa generated for esp by reload. May it
>> mean I still have some inconsistencies in the produced RTL? Probably I
>> should somehow transform loads and stores?
>
> I'd start by looking at the AP->SP elimination step.  What's the defined
> stack alignment and whether or not a dynamic stack realignment is needed.
> If you don't have all that setup properly prior to the allocators, then
> they're not going to know how what objects to align nor how to align them.


I looked into the assign_stack_local_1 call for this spill. LRA correctly
requests a 16-byte slot with 16-byte alignment. But
assign_stack_local_1 reduces the alignment to 8 because the estimated
stack alignment before RA is 8 and the requested mode's (DI) alignment
fits it. Probably LRA should pass the biggest_mode of the reg when
requesting a stack slot?

I handled it by increasing stack_alignment_estimated when transforming
some instructions to vector mode.
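
To make the alignment issue concrete, here is a rough sketch in C of the
decision as I understand it. It is a simplified model, not the code from
assign_stack_local_1: the struct, the function names (slot_alignment,
note_vector_use) and the use of bytes instead of bits are all made up for
illustration.

```c
#include <assert.h>

/* Hypothetical model of the relevant frame state; alignments in bytes. */
struct stack_state {
    unsigned estimated_align;   /* stands in for stack_alignment_estimated */
};

static int is_pow2(unsigned x)
{
    return x != 0 && (x & (x - 1)) == 0;
}

/* Alignment a spill slot ends up with in this model.  If the request
   exceeds the estimate but the slot's mode alignment still fits the
   estimate, the request is reduced to the estimate.  */
unsigned slot_alignment(const struct stack_state *st,
                        unsigned requested_align,
                        unsigned mode_align)
{
    assert(is_pow2(requested_align));
    assert(is_pow2(mode_align));
    assert(is_pow2(st->estimated_align));

    if (requested_align > st->estimated_align
        && mode_align <= st->estimated_align)
        return st->estimated_align;   /* request silently reduced */
    return requested_align;
}

/* Workaround in this model: raise the estimate as soon as an
   instruction is converted to a vector mode.  */
void note_vector_use(struct stack_state *st, unsigned vector_align)
{
    assert(is_pow2(vector_align));
    if (st->estimated_align < vector_align)
        st->estimated_align = vector_align;
}
```

With an estimate of 8, slot_alignment(&st, 16, 8) gives 8, which is the
DI-mode case above where the V2DI memory operand ends up under-aligned.
After note_vector_use(&st, 16) the same request gives 16.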

Thanks for the help!

Ilya

>
> jeff
>
