On 05/06/2015 09:45 AM, Jakub Jelinek wrote:
> As for hoisting the load of the call address before the loop, with lazy binding that has the obvious disadvantage that you'd resolve the slot again and again, if you are unlucky enough that the function hasn't been resolved yet. Unless the shared PLT stub after computing _G_O_T_ (for x86) also rechecks the .got.plt address.
Yea, but I suspect that's the rare case rather than the common case.
Of course, it's so bloody expensive when it happens that it might totally outweigh the aggregated benefits from all the other profitable hoisted GOT loads.
jeff