On February 18, 2015 6:51:17 AM GMT+01:00, Vineet Gupta 
<vineet.gup...@synopsys.com> wrote:
>On Monday 16 February 2015 08:34 PM, Bernhard Reutner-Fischer wrote:
>>> While at it I also did some arch-specific adjustments in the sigaction
>>> path - inlining the rt_sigaction syscall stub detour to reduce branch
>>> return stack mispredicts etc. - which is what 6/8 does!
>> This sounds suspicious.
>> IIRC we already had that argument, last time around _dl_do_reloc and
>> _dl_do_lazy_reloc.
>> Could it be that your port has a bug here (a missed optimisation)
>> around ifunc handling? Sounds like back then on ARM:
>> https://gcc.gnu.org/PR40887#c6
>> 
>> What am I missing?
>
>
>I don't think my use-case is close to the ARM issue you pointed to above,
>as there is no ifunc or function pointer involved.

I was thinking more about the reloc functors.
Does GCC 5 produce identical code for ARC master with explicit function calls
compared to using a function pointer, as suggested and used in all other ports?
If not, then I'd consider this a bug.
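
A minimal sketch of the comparison in question. All names here are
hypothetical stand-ins, not the actual ldso code: one walker receives the
handler as a function pointer, the other calls it explicitly. When the pointer
is a compile-time constant at the call site, one would hope that both inline
to the same code.

```c
/* Hypothetical stand-ins for the two relocation handlers. */
static int do_reloc(int x)      { return x + 1; }
static int do_lazy_reloc(int x) { return x * 2; }

/* Generic walker: the handler is passed in as a function pointer. */
static inline int parse_fp(int n, int (*reloc_fn)(int))
{
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += reloc_fn(i);
    return sum;
}

/* The same walker written with an explicit call to one handler. */
static inline int parse_direct(int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += do_reloc(i);
    return sum;
}
```

Comparing the disassembly of a caller of parse_fp(n, do_reloc) against a
caller of parse_direct(n) at -O2 or -Os would show whether the compiler folds
the indirect call away.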

>
>With orig code, we get 2 function calls on ARC:
>
>0000b504 <__libc_sigaction>:
>    b504:      push_s     blink
>    b506:      sub_s      sp,sp,12
>    b508:      bl.d       36b20 <__st_r13_to_r15>
>...
>
>    b540:      bl.d       b750 <__syscall_rt_sigaction>   <--- DIRECT CALL
>    b544:      mov_s      r3,8
>    b546:      add_s      sp,sp,20
>    b548:      mov_s      r12,12
>    b54a:      b          36b88 <__ld_r13_to_r15_ret>
>    b54e:      nop_s
>
>0000b750 <__syscall_rt_sigaction>:
>    b750:      mov        r8,134
>    b754:      swi                        <---- SYSCALL TRAP INTO KERNEL
>    b758:      cmp        r0,0xfffffc00
>    b75c:      bls_s      b76a
>    b75e:      st.a       blink,[sp,-4]
>    b762:      bl         b550 <__syscall_error>
>    b766:      ld.ab      blink,[sp,4]
>    b76a:      j_s        [blink]
>
>The small function call is not necessarily good micro-architecturally when
>returning, due to the limited number of call return stack entries. That cost
>is amortized if the function is largish.
>
>I do understand that these small syscall wrappers are a common uClibc design
>pattern and exist all over the place, but given that this was all arch code I
>took the liberty of removing the one hop, and the code now looks as below:
>
>0000b4d8 <__libc_sigaction>:
>    b4d8:      st.a       gp,[sp,-4]
>    b4dc:      sub_s      sp,sp,20
>    b4de:      add        gp,pcl,0x00065284
>    b4e6:      breq_s     r1,0,b516
>    b4e8:      ld_s       r3,[r1,4]
>...
>    b516:      mov        r8,134
>    b51a:      mov_s      r3,8
>    b51c:      swi
>    b520:      cmp        r0,0xfffffc00
>    b524:      bls_s      b532
>    b526:      st.a       blink,[sp,-4]
>    b52a:      bl         b53c <__syscall_error>
>    b52e:      ld.ab      blink,[sp,4]
>    b532:      ld.a       gp,[sp,20]
>    b536:      j_s.d      [blink]
>    b538:      add_s      sp,sp,4
>    b53a:      nop_s

I would have assumed / hoped that GCC 5 should generate this 2nd variant for 
extern inline __syscall_rt_sigaction.

Doesn't it do that?
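
To make the question concrete, here is a minimal sketch of the pattern, under
some loudly stated assumptions: fake_trap() is a hypothetical stand-in for the
swi trap, the names are invented for illustration, and static inline is used
instead of extern inline only to keep the sketch self-contained. The error
test mirrors the "cmp r0,0xfffffc00 ; bls" pair in the listings above.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the "swi" trap: hands back a raw kernel result. */
static uint32_t fake_trap(uint32_t raw) { return raw; }

/* Mirrors "cmp r0,0xfffffc00 ; bls": values above 0xfffffc00 (unsigned)
 * are treated as -errno, everything else is a successful result. */
static inline int raw_is_error(uint32_t ret)
{
    return ret > 0xfffffc00u;
}

/* Inline syscall stub (assumed shape, not the real uClibc macro). */
static inline long stub_rt_sigaction(uint32_t raw)
{
    uint32_t ret = fake_trap(raw);
    if (raw_is_error(ret)) {
        errno = (int)(0u - ret);   /* negate the raw value to get errno */
        return -1;
    }
    return (long)ret;
}

/* Caller: with the stub inlined there is no second call/return hop. */
long wrapper_sigaction(uint32_t raw)
{
    return stub_rt_sigaction(raw);
}
```

If the compiler inlines the stub, the disassembly of wrapper_sigaction should
contain the trap and the error check directly, as in the second listing.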

TIA

_______________________________________________
uClibc mailing list
uClibc@uclibc.org
http://lists.busybox.net/mailman/listinfo/uclibc
