The bottom line is that if we want hard numbers we probably have to measure.

Hoisting the cr2 read is a no-brainer, might even help performance...

On March 1, 2014 1:50:42 AM PST, Borislav Petkov <b...@alien8.de> wrote:
>On Sat, Mar 01, 2014 at 10:16:50AM +0100, Ingo Molnar wrote:
>> 
>> * Steven Rostedt <rost...@goodmis.org> wrote:
>> 
>> > > Also, this function is called a _LOT_ under certain workloads, I 
>> > > don't know how cheap a CR2 read is, but it had better be really 
>> > > cheap.
>> > 
>> > That's a HPA question.
>> 
>> We read CR2 in the page fault hot path, so it's on the top of CPU 
>> architects' minds and it's reasonably optimized. A couple of cycles 
>> IIRC, but would be nice to hear actual numbers.
>
>Yeah, we were discussing this last night on IRC.
>
>And hpa actually meant that the optimization potential was there but no
>one did do it, except maybe Transmeta. :-)
>
>So the expensive thing is writing to CR2 because it is a serializing
>instruction. In fact, all writes to control registers except CR8 are
>serializing.
>
>The reading from CR2 should be cheaper but not as cheap as a normal
>MOV %reg, %reg is. On AMD, MOV %reg, %cr2 is done with microcode so
>definitely at least a couple of cycles and I'd guess it is not a
>trivial MOV on Intel too.
>
>Maybe a way to hide this cost is the OoO, as hpa suggested, depending
>on how much parallelism that particular code region can offer
>(serializing instructions close by).

-- 
Sent from my mobile phone.  Please pardon brevity and lack of formatting.