> On 6 Jan, 2016, at 02:22, Steinar H. Gunderson <[email protected]> wrote:
> 
> On Tue, Jan 05, 2016 at 04:06:03PM -0800, Stephen Hemminger wrote:
>> The expensive part is often having to save and restore all the state in
>> registers and other bits on context switch.
> 
> Are you sure? There's not really all that much state to save, and all I've
> been taught before says the opposite.
> 
> Also, I've never ever seen the actual context switch turn up high in a perf
> profile.  Is this because of some sampling artifact?

ARM has dedicated register banks for several interrupt levels for exactly this 
reason.  Simple interrupt handlers can operate in these without spilling *any* 
userspace registers.  This gives ARM quite good interrupt latency, especially 
in the simpler implementations.

That doesn’t help for an actual context switch of course.  What does help is 
“lazy FPU state switching”, where on a context switch the FPU is simply marked 
as unavailable.  Only if and when the process attempts to *use* the FPU does 
the access trap; the trap handler then restores the correct state before 
returning an enabled FPU to userspace.  The same goes for SIMD register banks.

Lazy FPU switching is a kernel feature.  AFAIK it's used on all architectures 
whose FPU can be disabled at runtime.  For a switch into the kernel and back 
to the same process, the FPU & SIMD state is never actually switched, so 
there is almost no overhead.
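
To make the mechanism concrete, here is a toy model of lazy FPU switching.  It 
is only a sketch, not kernel code: the names Task, Cpu, fpu_owner, fpu_enabled 
and fpu_restores are all made up for illustration, and the "FPU" is just a 
list of four floats.  The point is to show that the register reload happens 
in the trap path, not in the context switch itself.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical task with a saved copy of its FPU registers."""
    name: str
    fpu_regs: list = field(default_factory=lambda: [0.0] * 4)


class Cpu:
    """Toy CPU with an FPU that can be disabled at runtime (an assumption)."""

    def __init__(self):
        self.fpu_regs = [0.0] * 4  # live FPU register contents
        self.fpu_enabled = False   # models the hardware "FPU available" bit
        self.fpu_owner = None      # task whose state currently sits in the FPU
        self.current = None        # task currently running
        self.fpu_restores = 0      # how many times FPU state was reloaded

    def context_switch(self, task):
        # Cheap path: FPU registers are not touched here.  The FPU stays
        # enabled only if the incoming task already owns the live state.
        self.current = task
        self.fpu_enabled = self.fpu_owner is task

    def syscall(self):
        # Kernel entry and return to the same task: FPU state is left alone.
        pass

    def fpu_add(self, reg, value):
        # Stand-in for any FPU instruction executed by the current task.
        if not self.fpu_enabled:
            self._fpu_trap()
        self.fpu_regs[reg] += value

    def _fpu_trap(self):
        # Expensive path, taken only on first FPU use after a switch:
        # save the previous owner's registers, then load the current task's.
        if self.fpu_owner is not None and self.fpu_owner is not self.current:
            self.fpu_owner.fpu_regs = list(self.fpu_regs)
        if self.fpu_owner is not self.current:
            self.fpu_regs = list(self.current.fpu_regs)
            self.fpu_restores += 1
        self.fpu_owner = self.current
        self.fpu_enabled = True
```

In this model, switching to a task that never touches the FPU and then back 
again leaves fpu_restores unchanged, and syscall() never changes it either.  
Only a switch to a different task that actually executes an FPU operation 
pays for the save and restore.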

 - Jonathan Morton

_______________________________________________
Bloat mailing list
[email protected]
https://lists.bufferbloat.net/listinfo/bloat
