On Fri, 25 Nov 2016, Prathamesh Kulkarni wrote: > On 24 November 2016 at 18:08, Richard Biener <rguent...@suse.de> wrote: > > On Thu, 24 Nov 2016, Prathamesh Kulkarni wrote: > > > >> On 24 November 2016 at 17:48, Richard Biener <rguent...@suse.de> wrote: > >> > On Thu, 24 Nov 2016, Prathamesh Kulkarni wrote: > >> > > >> >> On 24 November 2016 at 14:07, Richard Biener <rguent...@suse.de> wrote: > >> >> > On Thu, 24 Nov 2016, Prathamesh Kulkarni wrote: > >> >> > > >> >> >> Hi, > >> >> >> Consider following test-case: > >> >> >> > >> >> >> void *f(void *a1, void *a2, __SIZE_TYPE__ a3) > >> >> >> { > >> >> >> __builtin_memcpy (a1, a2, a3); > >> >> >> return a1; > >> >> >> } > >> >> >> > >> >> >> return a1 can be considered equivalent to return value of memcpy, > >> >> >> and the call could be emitted as a tail-call. > >> >> >> gcc doesn't emit the above call to memcpy as a tail-call, > >> >> >> but if it is changed to: > >> >> >> > >> >> >> void *t1 = __builtin_memcpy (a1, a2, a3); > >> >> >> return t1; > >> >> >> > >> >> >> Then memcpy is emitted as a tail-call. > >> >> >> The attached patch tries to handle the former case. > >> >> >> > >> >> >> Bootstrapped+tested on x86_64-unknown-linux-gnu. > >> >> >> Cross tested on arm*-*-*, aarch64*-*-* > >> >> >> Does this patch look OK ? > >> >> > > >> >> > +/* Return arg, if function returns it's argument or NULL if it > >> >> > doesn't. > >> >> > */ > >> >> > +tree > >> >> > +gimple_call_return_arg (gcall *call_stmt) > >> >> > +{ > >> >> > > >> >> > > >> >> > Please just inline it at the single use - the name is not terribly > >> >> > informative. > >> >> > > >> >> > I'm not sure you can rely on code-generation working if you not > >> >> > effectively change the IL to > >> >> > > >> >> > a1 = __builtin_memcpy (a1, a2, a3); > >> >> > return a1; > >> >> > > >> >> > someone more familiar with RTL expansion plus tail call emission on > >> >> > RTL needs to chime in. > >> >> Well I was trying to copy-propagate function's argument into uses of > >> >> it's return value if > >> >> function returned that argument, so the assignment to lhs of call > >> >> could be made redundant. > >> >> > >> >> eg: > >> >> void *f(void *a1, void *a2, __SIZE_TYPE__ a3) > >> >> { > >> >> void *t1 = __builtin_memcpy (a1, a2, a3); > >> >> return t1; > >> >> } > >> >> > >> >> After patch, copyprop transformed it into: > >> >> t1 = __builtin_memcpy (a1, a2, a3); > >> >> return a1; > >> > > >> > But that's a bad transform -- if we know that t1 == a1 then it's > >> > better to use t1 as that's readily available in the return register > >> > while the register for a1 might have been clobbered and thus we > >> > need to spill it for the later return. > >> Oh I didn't realize this could possibly pessimize RA. > >> For test-case: > >> > >> void *t1 = memcpy (dest, src, n); > >> if (t1 != dest) > >> __builtin_abort (); > >> > >> we could copy-propagate t1 into cond_expr and make the condition redundant. > >> However I suppose this particular case could be handled with VRP instead > >> (t1 and dest should be marked equivalent) ? > > > > Yeah, exposing this to value-numbering in general can enable some > > optimizations (but I wouldn't put it in copyprop). Note it's then > > difficult to avoid copy-propgating things... > > > > The user can also write > > > > void *f(void *a1, void *a2, __SIZE_TYPE__ a3) > > { > > __builtin_memcpy (a1, a2, a3); > > return a1; > > } > > > > so it's good to improve code-gen for that (for the tailcall issue). > For the tail-call, issue should we artificially create a lhs and use that > as return value (perhaps by a separate pass before tailcall) ? > > __builtin_memcpy (a1, a2, a3); > return a1; > > gets transformed to: > _1 = __builtin_memcpy (a1, a2, a3) > return _1; > > So tail-call optimization pass would see the IL in it's expected form.
As said, a RTL expert needs to chime in here. Iff then tail-call itself should do this rewrite. But if this form is required to make things work (I suppose you checked it _does_ actually work?) then we'd need to make sure later passes do not undo it. So it looks fragile to me. OTOH I seem to remember that the flags we set on GIMPLE are merely a hint to RTL expansion and the tailcalling is verified again there? Thanks, Richard. > Thanks, > Prathamesh > > > > Richard. > > > >> Thanks, > >> Prathamesh > >> > > >> >> But this now interferes with tail-call optimization, because it is not > >> >> able to emit memcpy > >> >> as tail-call anymore due to which the patch regressed 20050503-1.c. > >> >> I am not sure how to workaround this. > >> >> > >> >> Thanks, > >> >> Prathamesh > >> >> > > >> >> > Richard. > >> >> > >> > > >> > -- > >> > Richard Biener <rguent...@suse.de> > >> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, > >> > HRB 21284 (AG Nuernberg) > >> > >> > > > > -- > > Richard Biener <rguent...@suse.de> > > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB > > 21284 (AG Nuernberg) > > -- Richard Biener <rguent...@suse.de> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)