On Wed, Aug 16, 2017 at 3:29 PM, Michael Clark <michaeljcl...@mac.com> wrote:
> Hi,
>
> Is there any reason for 3 loads being issued for these bitfield accesses,
> given two of the loads are bytes, and one is a half; the compiler appears to
> know the structure is aligned at a half word boundary. Secondly, the riscv
> code is using a mixture of 32-bit and 64-bit adds and shifts. Thirdly, with
> -Os the riscv code size is the same, but the schedule is less than optimal,
> i.e. the 3rd load is issued much later.
Well, one thing is that SLOW_BYTE_ACCESS is most likely set to 0. This
forces byte access for bit-field accesses. The macro is misnamed now,
as it only controls bit-field accesses (and one thing in dojump dealing
with comparisons with an AND and a constant, but that might be dead
code). Setting it to 1 should allow you to get the code in hand-written
form. I suspect SLOW_BYTE_ACCESS support should be removed and a value
of 1 assumed, but I have not had time to look into each backend to see
whether that is correct. Maybe it is wrong for AVR.

Thanks,
Andrew Pinski

>
> - https://cx.rv8.io/g/2YDLTA
>
> code:
>
> struct foo {
>     unsigned int a : 5;
>     unsigned int b : 5;
>     unsigned int c : 5;
> };
>
> unsigned int proc_foo(struct foo *p)
> {
>     return p->a + p->b + p->c;
> }
>
> riscv asm:
>
> proc_foo(foo*):
>     lhu a3,0(a0)
>     lbu a4,0(a0)
>     lbu a5,1(a0)
>     srliw a3,a3,5
>     andi a0,a4,31
>     srli a5,a5,2
>     andi a4,a3,31
>     addw a0,a0,a4
>     andi a5,a5,31
>     add a0,a0,a5
>     ret
>
> x86_64 asm:
>
> proc_foo(foo*):
>     movzx edx, BYTE PTR [rdi]
>     movzx eax, WORD PTR [rdi]
>     mov ecx, edx
>     shr ax, 5
>     and eax, 31
>     and ecx, 31
>     lea edx, [rcx+rax]
>     movzx eax, BYTE PTR [rdi+1]
>     shr al, 2
>     and eax, 31
>     add eax, edx
>     ret
>
> hand coded riscv asm:
>
> proc_foo(foo*):
>     lhu a1,0(a0)
>     srli a2,a1,5
>     srli a3,a1,10
>     andi a0,a1,31
>     andi a2,a2,31
>     andi a3,a3,31
>     add a0,a0,a2
>     add a0,a0,a3
>     ret
>
> Michael
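
P.S. For concreteness, a minimal sketch of the change being suggested,
assuming the usual target-macro convention (the exact header in the
RISC-V port is an assumption here):

    /* e.g. in the backend's target header (riscv.h): declare that byte
       accesses are no cheaper than wider ones, so the middle end may
       read a bit-field in its declared container mode (a halfword
       here) rather than splitting it into byte loads.  */
    #define SLOW_BYTE_ACCESS 1

With that, all three fields should come out of a single lhu and the
extraction reduces to shifts and masks, as in the hand-coded version.
In portable C the same single-load computation looks roughly like this
(proc_foo_manual is a hypothetical helper, not from the thread; on a
little-endian target such as riscv or x86_64, a, b and c sit in bits
0-4, 5-9 and 10-14 of the low halfword):

    #include <string.h>

    /* Hypothetical C equivalent of the hand-coded asm, using struct
       foo as defined above.  */
    unsigned int proc_foo_manual(const struct foo *p)
    {
        unsigned short w;
        memcpy(&w, p, sizeof w);   /* one halfword load */
        return (w & 31) + ((w >> 5) & 31) + ((w >> 10) & 31);
    }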