On Thu, May 09, 2013 at 02:20:01PM +0200, Jilles Tjoelker wrote:
> I think architecture-specific memcmp() for i386 and amd64 can still be
> beneficial because of the fast unaligned access offered by these CPUs,
> which allows comparison of 4 or 8 bytes at a time. SSE2 allows
> comparison of 16 bytes at a time but is somewhat harder: not all i386
> CPUs support SSE2, unaligned access is slow on some older CPUs and it
> requires assembly so it only uses %xmm8-%xmm15 so rtld does not trash
> function parameters (or rtld needs to use non-SSE2 code).

FWIW, rtld is not allowed to modify any registers in the bind code
called from the PLT trampoline. The C ABI is not mandated for the
functions resolved through the PLT, so our rtld takes care not to destroy
even caller-save or scratch registers, at least on x86*.
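
To illustrate the "8 bytes at a time" comparison mentioned in the quoted
text, here is a minimal sketch in portable C. The name wordwise_memcmp is
hypothetical and this is not the libc or rtld implementation. It assumes
the compiler lowers the fixed-size memcpy() calls to plain unaligned loads,
which is typical on i386 and amd64 at normal optimization levels. Since it
is ordinary C using no SSE2, it does not touch the %xmm registers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper for illustration only. */
static int
wordwise_memcmp(const void *a, const void *b, size_t len)
{
	const unsigned char *p = a, *q = b;
	uint64_t x, y;

	/* Compare 8 bytes per iteration for as long as the words match. */
	while (len >= sizeof(x)) {
		/* memcpy() keeps the unaligned loads well-defined in C. */
		memcpy(&x, p, sizeof(x));
		memcpy(&y, q, sizeof(y));
		if (x != y)
			break;
		p += sizeof(x);
		q += sizeof(y);
		len -= sizeof(x);
	}

	/*
	 * Byte loop: handles the tail shorter than one word, and locates
	 * the first differing byte inside a mismatching word.
	 */
	for (; len > 0; p++, q++, len--) {
		if (*p != *q)
			return (*p - *q);
	}
	return (0);
}
```

Falling back to the byte loop on a mismatch avoids any byte-order handling,
at the cost of rescanning at most 8 bytes.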
