Hi!

On Thu, Mar 12, 2020 at 01:18:50PM +1030, Alan Modra wrote:
> With lazy PLT resolution the first load of a PLT entry may be a value
> pointing at a resolver stub.  gcc's loop processing can result in the
> PLT load in inline PLT calls being hoisted out of a loop in the
> mistaken idea that this is an optimisation.  It isn't.  If the value
> hoisted was that for a resolver stub then every call to that function
> in the loop will go via the resolver, slowing things down quite
> dramatically.
> 
> The PLT really is volatile, so teach gcc about that.

It would be nice if we could keep it cached after it has been resolved
once; this has the potential to regress performance if we don't?  And
LD_BIND_NOW should keep working just as fast as it does now, too?
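
To make the trade-off concrete, here is a rough user-space sketch.  It
is not the real PLT call sequence and not what the patch emits: the
names plt_slot, resolver_stub, real_func, run_hoisted and run_reloaded
are all made up for illustration, and the "PLT entry" is just an
ordinary function pointer that a stub patches on first use.

```c
#include <assert.h>

typedef int (*fn_t)(int);

static int real_func(int x);
static int resolver_stub(int x);

/* Hypothetical stand-in for a lazily bound PLT entry: it starts out
   pointing at the resolver stub.  */
static fn_t plt_slot = resolver_stub;
static int resolver_calls;

static int real_func(int x)
{
    return x + 1;
}

/* Models lazy binding: count the call, patch the slot, then forward
   to the real function.  */
static int resolver_stub(int x)
{
    resolver_calls++;
    plt_slot = real_func;
    return real_func(x);
}

static void reset_slot(void)
{
    plt_slot = resolver_stub;
    resolver_calls = 0;
}

/* The slot is loaded once before the loop, as if the load had been
   hoisted.  Every iteration calls the stale stub address.  */
int run_hoisted(int n)
{
    reset_slot();
    fn_t f = plt_slot;
    int acc = 0;
    for (int i = 0; i < n; i++)
        acc = f(acc);
    return resolver_calls;
}

/* The slot is reloaded on every iteration through a volatile access,
   so only the first call goes via the stub.  */
int run_reloaded(int n)
{
    reset_slot();
    fn_t volatile *slot = &plt_slot;
    int acc = 0;
    for (int i = 0; i < n; i++)
        acc = (*slot)(acc);
    return resolver_calls;
}

/* Self-check of the resolver counts for both variants.  */
void demo(void)
{
    assert(run_hoisted(1000) == 1000);
    assert(run_reloaded(1000) == 1);
}
```

In this model the hoisted variant hits the stub on every iteration,
while the reloaded variant hits it once but pays for a load on each
call.  If the slot already held real_func before the loop (roughly the
LD_BIND_NOW case), neither variant would hit the stub and the
per-iteration reload would be the only difference, which is the cost
I am asking about.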


Segher
