On 05/25/2015 09:27 AM, Ilya Enkovich wrote:
2015-05-22 15:01 GMT+03:00 Ilya Enkovich <enkovich....@gmail.com>:
2015-05-22 11:53 GMT+03:00 Ilya Enkovich <enkovich....@gmail.com>:
2015-05-21 22:08 GMT+03:00 Vladimir Makarov <vmaka...@redhat.com>:
So, Ilya, to solve the problem you need to avoid sharing subregs so that
LRA/reload works correctly.
Thanks a lot for your help! I'll fix it.
Ilya
I've fixed SUBREG sharing and got a missing spill. I added
--enable-checking=rtl to check for other possible bugs. Spill/fill code
still seems incorrect because different sizes are used (a 16-byte movdqa
fill against an 8-byte movq spill of the same slot, marked below). That
shouldn't block me, though.
.L5:
movl 16(%esp), %eax
addl $8, %esi
movl 20(%esp), %edx
movl %eax, (%esp)
movl %edx, 4(%esp)
call counter@PLT
movq -8(%esi), %xmm0
**movdqa 16(%esp), %xmm2**
pand %xmm0, %xmm2
movdqa %xmm2, %xmm0
movd %xmm2, %edx
**movq %xmm2, 16(%esp)**
psrlq $32, %xmm0
movd %xmm0, %eax
orl %edx, %eax
jne .L5
Thanks,
Ilya
I was wrong to assume that reloads with the wrong size wouldn't block me.
These reloads require memory to be aligned, which is not always the case.
Here is what I have in RTL now:
(insn 2 7 3 2 (set (reg/v:DI 93 [ l ])
(mem/c:DI (reg/f:SI 16 argp) [1 l+0 S8 A32])) test.c:5 89
{*movdi_internal}
(nil))
...
(insn 27 26 52 6 (set (subreg:V2DI (reg:DI 87 [ D.1822 ]) 0)
(ior:V2DI (subreg:V2DI (reg:DI 99 [ D.1822 ]) 0)
(subreg:V2DI (reg/v:DI 93 [ l ]) 0))) test.c:11 3489 {*iorv2di3}
(expr_list:REG_DEAD (reg:DI 99 [ D.1822 ])
(expr_list:REG_DEAD (reg/v:DI 93 [ l ])
(nil))))
After reload I get:
(insn 2 7 75 2 (set (reg/v:DI 0 ax [orig:93 l ] [93])
(mem/c:DI (plus:SI (reg/f:SI 7 sp)
(const_int 24 [0x18])) [1 l+0 S8 A32])) test.c:5 89
{*movdi_internal}
(nil))
(insn 75 2 3 2 (set (mem/c:DI (reg/f:SI 7 sp) [3 %sfp+-16 S8 A64])
(reg/v:DI 0 ax [orig:93 l ] [93])) test.c:5 89 {*movdi_internal}
(nil))
...
(insn 27 26 52 6 (set (reg:V2DI 21 xmm0 [orig:87 D.1822 ] [87])
(ior:V2DI (reg:V2DI 21 xmm0 [orig:99 D.1822 ] [99])
(mem/c:V2DI (reg/f:SI 7 sp) [3 %sfp+-16 S16 A64])))
test.c:11 3489 {*iorv2di3}
The 'por' instruction requires its memory operand to be aligned and fails
in a bigger testcase. Reload also generates a movdqa for an esp-based
address. Could this mean I still have some inconsistencies in the produced
RTL? Should I somehow transform the loads and stores?
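
To make the mismatch concrete, here is a minimal C sketch of the check that
fails for the slot above: it is marked A64 (8-byte alignment known) but is
read as a 16-byte V2DI operand. The helper names and the idea of modelling
it this way are assumptions for illustration only, not anything GCC
provides.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if addr is a multiple of align; align must be a power of two. */
static int is_aligned(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}

/* Hypothetical model of the MEM attributes in the dump: known alignment
   in bits (the A value) against access size in bytes (the S value).
   A legacy SSE memory operand such as the one used by por needs
   alignment at least equal to its 16-byte size. */
static int sse_mem_operand_ok(unsigned known_align_bits, unsigned access_bytes)
{
    return known_align_bits / 8 >= access_bytes;
}
```

With these, an address such as 0x1008 passes is_aligned(addr, 8) but not
is_aligned(addr, 16), and sse_mem_operand_ok(64, 16) is false, which is the
S16 A64 case in insn 27.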
I'd start by looking at the AP->SP elimination step: what is the defined
stack alignment, and is dynamic stack realignment needed? If you don't
have all of that set up properly prior to the allocators, they're not
going to know which objects to align or how to align them.
jeff
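
For reference, dynamic stack realignment boils down to masking the stack
pointer in the prologue (on 32-bit x86 this typically shows up as something
like "andl $-16, %esp"). A minimal C sketch of that arithmetic follows; the
function name is hypothetical and this is not GCC's implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Round sp down to a multiple of align (a power of two). The x86 stack
   grows toward lower addresses, so rounding down keeps the new frame
   below the caller's data. */
static uintptr_t align_down(uintptr_t sp, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return sp & ~(align - 1);
}
```

For example, align_down(0x1008, 16) yields 0x1000, after which 16-byte
slots at offsets that are multiples of 16 from the stack pointer are safe
for movdqa and por.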