On Mon, 24 Sep 2012 16:50:13 +0200
Avi Kivity <[email protected]> wrote:
> Afterwards, most exits are APIC and interrupt related, HLT, and MMIO.
> Of these, some are special (HLT, interrupt injection) and some are not
> (read/write most APIC registers). I don't think one group dominates the
> other. So already vcpu->requests processing is not such a slow path, it
> is relatively common. We still see a lot of page faults during boot and
> during live migration though.
>
> With AVIC/APIC-V (still in the future) the mix will change again, with
> both special and non-special exits eliminated. We'll be left mostly
> with APIC timer and HLT (and ICR writes for APIC-V).
>
> So maybe the direction of your patch makes sense. Things like
> KVM_REQ_EVENT (or anything above 2-3% of exits) shouldn't be in
> vcpu->requests or maybe they deserve special treatment.
I see the point.
Since KVM_REQ_EVENT must be checked after handling some other requests,
it needs special treatment anyway, if defining it as the last flag
checked by for_each_set_bit() counts as special treatment.
As Gleb and you pointed out, KVM_REQ_STEAL_UPDATE needs to be fixed
first so that it is not set unnecessarily.
Then, by special-casing KVM_REQ_EVENT (either a one-line change or
moving it out of vcpu->requests), we can see whether further
improvement is needed.
If a few requests exceed the threshold (the 2-3% of exits you
mentioned), we can also define a mask to indicate which requests
should be treated as "not unlikely", as sketched below.
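To make that concrete, the shape I have in mind is roughly the
following (completely untested sketch; KVM_REQ_LIKELY_MASK and the
handle_*() helpers are placeholders I am inventing for this mail, not
existing code):

/* Hypothetical mask of "not unlikely" requests, chosen from exit stats. */
#define KVM_REQ_LIKELY_MASK     BIT(KVM_REQ_EVENT)

static void process_requests(struct kvm_vcpu *vcpu)
{
        unsigned long requests = vcpu->requests;
        unsigned int req;

        /* Rare requests: keep the generic for_each_set_bit() loop. */
        for_each_set_bit(req, &requests, BITS_PER_LONG) {
                if (BIT(req) & KVM_REQ_LIKELY_MASK)
                        continue;
                if (kvm_check_request(req, vcpu))
                        handle_rare_request(vcpu, req); /* placeholder */
        }

        /* The frequent one, checked last so the required ordering holds. */
        if (kvm_check_request(KVM_REQ_EVENT, vcpu))
                handle_req_event(vcpu);                 /* placeholder */
}

The only interesting decision is which requests end up in the mask;
the rest stays as generic as before.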
> > BTW, schedule() is really rare? We do either cond_resched() or
> > heavy weight exit, no?
>
> If 25% of exits are HLT (like a ping workload), then 25% of your exits
> end up in schedule().
>
> On modern hardware, a relatively larger percentage of exits are
> heavyweight (same analysis as above). On AVIC hardware most exits will
> be mmio, HLT, and host interrupts. Of these only host interrupts that
> don't lead to a context switch will be lightweight.
>
> >
> > I always see vcpu threads actively move around the cores.
> > (When I do not pin them.)
>
> Sure, but the frequency is quite low. If not that's a bug.
That's what I was originally testing: whether vcpu threads were being
scheduled as expected.
I've forgotten how I got onto this point.
> >> Modern processors will eliminate KVM_REQ_EVENT in many cases, so the
> >> optimization is wasted on them.
> >
> > Then, my Nehalem server was not so modern.
>
> Well I was referring to APIC-v/AVIC hardware which nobody has. On
> current hardware they're very common. So stuffing it in the
> vcpu->requests slow path is not warranted.
>
> My patch is cleaner than yours as it handles the problem generically,
> but yours matches reality better.
I guess so.
I remember someone once tried to inline the functions used inside
for_each_set_bit(), complaining that it was slow.  A generic approach
needs some scale to win.
> > I did something like this:
> >
> > if requests == KVM_REQ_EVENT
> > ++counter1;
> > if requests == KVM_REQ_STEAL_UPDATE
> > ++counter2;
> > ...
> >
> > in vcpu_enter_guest() and saw KVM_REQ_EVENT many times.
>
>
> (in theory perf probe can do this. But figuring out how is often more
> time consuming than patching the kernel).
Yes, I was actually playing with perf before counting each pattern directly.
But since I could not see the details easily, because of inlining or
something like that, I ended up doing it my own way.
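For reference, the counting hack was roughly this shape (req_hits and
count_requests are names made up for this mail; the real thing was a
quick throwaway):

static unsigned long req_hits[BITS_PER_LONG];   /* one counter per request bit */

static void count_requests(struct kvm_vcpu *vcpu)
{
        unsigned long requests = vcpu->requests;
        unsigned int req;

        /* Count how often each request bit is set when entering the guest. */
        for_each_set_bit(req, &requests, BITS_PER_LONG)
                ++req_hits[req];        /* e.g. req_hits[KVM_REQ_EVENT] */
}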
Thanks,
Takuya