----- Original Message -----
> From: "Joerg Sonnenberger" <[email protected]>
> To: [email protected]
> Sent: Tuesday, March 3, 2015 12:29:12 PM
> Subject: Re: r230255 - Only lower __builtin_setjmp / __builtin_longjmp to
>
> On Tue, Mar 03, 2015 at 11:59:53AM -0600, Hal Finkel wrote:
> > Having implemented this, I assure you that the potential is not small.
> > Eliminating the unnecessary spilling, the overhead of the function call,
> > and better scheduling of the spills/restores, I've seen 10x speedups
> > (even on modern OOO cores). Please also remember that small functions
> > often don't use all available registers, especially vector registers
> > (which tend to be expensive to save and restore), and so you can just
> > ignore the caller-saved register entirely (you don't need to save them
> > in the prologue or in setjmp call if you don't use them -- it is a pure
> > savings).
>
> Huh? How do you know that the intermediate functions haven't clobbered a
> register? Without unwinding, which we explicitly do not want to do here,
> you can't. As such you *can't* avoid the spilling.

You can, because a caller-saved register has already been saved by the caller
(if the caller, indeed, needed to do so). That's the nice thing about the
builtins: they're call-site specific. So when you call setjmp, you don't need
to save a caller-saved register if you don't need its value afterward. When the
caller of setjmp returns, its caller will restore those registers as needed (as
it would have anyway). You do need to save callee-saved registers (along with
any other registers the function is actually using).

I think it is also worth noting that, as we implement them, the job of
restoring registers is shifted compared to the library calls. When using
library setjmp/longjmp, setjmp saves all of the necessary state into the jump
buffer, and longjmp restores all necessary state and jumps to the designated
location. With the builtins, __builtin_setjmp itself saves very little state
into the jump buffer (only the address and some reserved registers), but causes
the function calling it to spill and restore only the necessary state around
it. __builtin_longjmp restores only the reserved registers necessary to make
the jump, nothing more (the spill/restore code around the __builtin_setjmp
takes care of the rest).

So, for example, if we have some register, v1, which is caller-saved...

void bar(jmp_buf &jb) {
  // v1 is not used in this function, and the caller saved it if necessary,
  // so v1 is not spilled here
  __builtin_setjmp(&jb);
  // and v1 is not restored here (the caller saved it if necessary, and will
  // restore it if necessary, when we return)
}

void foo() {
  jmp_buf jb;
  // v1 is saved here if necessary
  bar(jb);
  ...
  // the value of v1 saved above is loaded here somewhere
  ...
  // this does not explicitly restore v1 here, although a library call
  // would need to
  __builtin_longjmp();
}
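
For anyone who wants to try the builtins directly, here is a minimal C sketch
of the call pattern (it is not the code from the patch). It assumes GCC or
Clang on a target that supports the builtins, and the names jump_buffer, fail,
and run_guarded are made up for illustration. The buffer is five pointer-sized
words, the second argument to __builtin_longjmp must be 1, and the longjmp is
kept in a separate noinline function because the two builtins should not be
used in the same function on the same buffer.

```c
/* Five pointer-sized words: the buffer layout the builtins expect. */
static void *jump_buffer[5];

/* Hypothetical helper: jumps back to the __builtin_setjmp call site.
 * noinline keeps the longjmp out of the function that calls setjmp. */
__attribute__((noinline, noreturn))
static void fail(void) {
    __builtin_longjmp(jump_buffer, 1);  /* the value must be the constant 1 */
}

/* Hypothetical wrapper: returns 0 if the body ran to completion,
 * or 1 if control came back through __builtin_longjmp. */
int run_guarded(int should_fail) {
    if (__builtin_setjmp(jump_buffer) != 0) {
        /* Reached only via fail(); any state this function needs here was
         * spilled around the setjmp by the compiler, not by the buffer. */
        return 1;
    }
    if (should_fail) {
        fail();
    }
    return 0;
}
```

With this sketch, run_guarded(0) returns 0 and run_guarded(1) returns 1.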
-Hal
>
> Joerg
> _______________________________________________
> cfe-commits mailing list
> [email protected]
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
>
--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory