On February 18, 2015 6:51:17 AM GMT+01:00, Vineet Gupta <vineet.gup...@synopsys.com> wrote:
>On Monday 16 February 2015 08:34 PM, Bernhard Reutner-Fischer wrote:
>>> While at it I also did some arch specific adjustment in sigaction path
>>> - inlining the rt_sigaction syscall stub detour to reduce branch return
>>> stack mispredicts etc - which is what 6/8 does !
>> This sounds suspicious.
>> IIRC we already had that argument, last time around _dl_do_reloc and _dl_do_lazy_reloc.
>> Could it be that your port has a bug here ( missed optimisation ) around ifunc handling?
>> Sounds like back then on ARM https://gcc.gnu.org/PR40887#c6
>>
>> What am I missing?
>
>
>I don't think my use-case is close to the ARM issue you pointed to above as there is
>no ifunc or function pointer involved.

I was thinking more about the reloc functors. Does GCC 5 produce identical code for ARC master with respect to explicit function calls, compared to using a function pointer as suggested and used in all other ports?
If not then I'd consider this a bug.

>
>With orig code, we get 2 function calls on ARC:
>
>0000b504 <__libc_sigaction>:
>    b504:  push_s  blink
>    b506:  sub_s   sp,sp,12
>    b508:  bl.d    36b20 <__st_r13_to_r15>
>...
>
>    b540:  bl.d    b750 <__syscall_rt_sigaction>   <--- DIRECT CALL
>    b544:  mov_s   r3,8
>    b546:  add_s   sp,sp,20
>    b548:  mov_s   r12,12
>    b54a:  b       36b88 <__ld_r13_to_r15_ret>
>    b54e:  nop_s
>
>0000b750 <__syscall_rt_sigaction>:
>    b750:  mov     r8,134
>    b754:  swi                                     <---- SYSCALL TRAP INTO KERNEL
>    b758:  cmp     r0,0xfffffc00
>    b75c:  bls_s   b76a
>    b75e:  st.a    blink,[sp,-4]
>    b762:  bl      b550 <__syscall_error>
>    b766:  ld.ab   blink,[sp,4]
>    b76a:  j_s     [blink]
>
>The small function call is not necessarily good micro-architecturally when
>returning due to limited number of call return stack entries. That cost is
>amortized if function is largish.
>
>I do understand that these small syscall wrappers are a common uClibc design
>pattern and exist all over the place but given that this was all arch code I took
>the liberty of removing the one hop and the code now looks as below:
>
>0000b4d8 <__libc_sigaction>:
>    b4d8:  st.a    gp,[sp,-4]
>    b4dc:  sub_s   sp,sp,20
>    b4de:  add     gp,pcl,0x00065284
>    b4e6:  breq_s  r1,0,b516
>    b4e8:  ld_s    r3,[r1,4]
>...
>    b516:  mov     r8,134
>    b51a:  mov_s   r3,8
>    b51c:  swi
>    b520:  cmp     r0,0xfffffc00
>    b524:  bls_s   b532
>    b526:  st.a    blink,[sp,-4]
>    b52a:  bl      b53c <__syscall_error>
>    b52e:  ld.ab   blink,[sp,4]
>    b532:  ld.a    gp,[sp,20]
>    b536:  j_s.d   [blink]
>    b538:  add_s   sp,sp,4
>    b53a:  nop_s

I would have assumed / hoped that GCC 5 should generate this 2nd variant for extern inline __syscall_rt_sigaction.
Doesn't it do that?

TIA
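
For reference, here is a rough C sketch of the result check that both listings perform right after swi (the "cmp r0,0xfffffc00 / bls_s" pair followed by the call to __syscall_error). The helper name check_syscall_result is hypothetical and not taken from uClibc. The sketch assumes that __syscall_error stores the negated kernel return value in errno and returns -1. It is written as a static inline function, so the compiler is free to fold it into the caller, which is the effect being discussed for __syscall_rt_sigaction.

```c
#include <errno.h>

/*
 * Hypothetical helper, not part of uClibc: mirrors the
 * "cmp r0,0xfffffc00 / bls_s" sequence from the listings.
 * Raw kernel return values in the range -1023..-1 are treated
 * as errors; everything else is passed through unchanged.
 */
static inline long check_syscall_result(long raw)
{
    /* Unsigned compare: only -1023..-1 lie above (unsigned long)-1024. */
    if ((unsigned long)raw > (unsigned long)-1024L) {
        /* Assumed behaviour of __syscall_error: set errno, return -1. */
        errno = (int)-raw;
        return -1;
    }
    return raw;
}
```

A caller would pass the raw value returned by the trap straight into this helper. Whether GCC actually inlines it depends on the optimisation level and on how the function is declared, so the disassembly is still the thing to check.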