The bottom line is that if we want hard numbers we probably have to measure.
Hoisting the cr2 read is a no-brainer, might even help performance...

On March 1, 2014 1:50:42 AM PST, Borislav Petkov <b...@alien8.de> wrote:
>On Sat, Mar 01, 2014 at 10:16:50AM +0100, Ingo Molnar wrote:
>>
>> * Steven Rostedt <rost...@goodmis.org> wrote:
>>
>> > > Also, this function is called a _LOT_ under certain workloads, I
>> > > don't know how cheap a CR2 read is, but it had better be really
>> > > cheap.
>> >
>> > That's a HPA question.
>>
>> We read CR2 in the page fault hot path, so it's on the top of CPU
>> architects' minds and it's reasonably optimized. A couple of cycles
>> IIRC, but would be nice to hear actual numbers.
>
>Yeah, we were discussing this last night on IRC.
>
>And hpa actually meant that the optimization potential was there but no
>one did do it, except maybe Transmeta. :-)
>
>So the expensive thing is writing to CR2 because it is a serializing
>instruction. In fact, all writes to control registers except CR8 are
>serializing.
>
>The reading from CR2 should be cheaper but not as cheap as a normal
>MOV %reg %reg is. On AMD, MOV %reg, %cr2 is done with microcode so
>definitely at least a couple of cycles and I'd guess it is not a
>trivial MOV on Intel too.
>
>Maybe a way to hide this cost is the OoO, as hpa suggested, depending
>on how much parallelism that particular code region can offer
>(serializing instructions close by).
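
For illustration only, here is roughly what "hoisting the cr2 read" means, as a user-space sketch. Everything in it is made up for the example: read_fault_address() is a hypothetical stand-in for the privileged read (in the kernel that would presumably be read_cr2()), and trace_fault()/handle_fault() are hypothetical consumers. It only counts reads; it says nothing about cycle costs, which still have to be measured on real hardware.

```c
/*
 * Hypothetical stand-in for the privileged CR2 read (ring 0 only on real
 * hardware). It counts calls so the effect of hoisting is observable.
 */
static unsigned long fake_cr2 = 0xdeadb000UL;
static int cr2_reads;

static unsigned long read_fault_address(void)
{
	cr2_reads++;
	return fake_cr2;
}

/* Hypothetical consumers of the fault address. */
static unsigned long traced_address;
static unsigned long handled_address;

static void trace_fault(unsigned long address)
{
	traced_address = address;
}

static void handle_fault(unsigned long address)
{
	handled_address = address;
}

/* Before: each consumer triggers its own read of the register. */
static void fault_entry_unhoisted(void)
{
	trace_fault(read_fault_address());
	handle_fault(read_fault_address());
}

/* After: a single read at the top, with the value passed down. */
static void fault_entry_hoisted(void)
{
	unsigned long address = read_fault_address();

	trace_fault(address);
	handle_fault(address);
}
```

With the read hoisted, the register is touched once per fault no matter how many callees need the address, and the value is captured before anything else gets a chance to change it.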