On Wed, Mar 30, 2011 at 03:29:02PM +0200, Avi Kivity wrote:
> On 03/30/2011 03:26 PM, Gleb Natapov wrote:
> >On Wed, Mar 30, 2011 at 02:48:28PM +0200, Gleb Natapov wrote:
> >> On Wed, Mar 30, 2011 at 02:17:55PM +0200, Avi Kivity wrote:
> >> > On 03/30/2011 01:43 PM, Gleb Natapov wrote:
> >> > >After reboot perf started to work. I ran modified emulator.flat unit
> >> > >test. It was modified to run test_cmps() in an endless loop.
> >> > >
> >> > >Without patch:
> >> > >1.71% qemu-system-x86 [kvm] [k] x86_emulate_instruction
> >> > >1.51% qemu-system-x86 [kvm] [k] x86_emulate_instruction
> >> > >1.68% qemu-system-x86 [kvm] [k] x86_emulate_instruction
> >> > >
> >> > >With patch:
> >> > >0.84% qemu-system-x86 [kvm] [k] x86_emulate_instruction
> >> > >0.96% qemu-system-x86 [kvm] [k] x86_emulate_instruction
> >> > >0.89% qemu-system-x86 [kvm] [k] x86_emulate_instruction
> >> > >
> >> >
> >> > The cause might be kvm_rip_write() using vmwrite. Can you use perf
> >> > to see where the hits are in x86_emulate_instruction?
> >> >
> >> > If that's the case, we may be able to do local optimizations to
> >> > kvm_rip_write(), kvm_set_rflags(), and toggle_interruptibility()
> >> > instead of this global change.
> >> >
> >> I can leave copying there and eliminate only kvm_rip_write and see
> >> perf data.
> >>
> >
> >1.75% qemu-system-x86 [kvm] [k] x86_emulate_instruction
> >1.60% qemu-system-x86 [kvm] [k] x86_emulate_instruction
> >1.42% qemu-system-x86 [kvm] [k] x86_emulate_instruction
> >
> >This is with copy in place, but those are under if (writeback):
> > toggle_interruptibility(vcpu,
> > vcpu->arch.emulate_ctxt.interruptibility);
> > kvm_set_rflags(vcpu, vcpu->arch.emulate_ctxt.eflags);
> > kvm_make_request(KVM_REQ_EVENT, vcpu);
> > vcpu->arch.emulate_regs_need_sync_to_vcpu = false;
> > kvm_rip_write(vcpu, vcpu->arch.emulate_ctxt.eip);
> >
>
> It's weird. Do you get perf hits in the copying?
>
How can I check? The memcpy is inlined.
> Copying a couple of hot cache lines shouldn't take any measurable
> time compared to a heavyweight exit.
>
The whole function takes only 1.5% of CPU. Perf measures how much faster
this function became, and the heavyweight exit is not part of the function.
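
For reference, the kind of local optimization suggested above (making
the register write itself cheap instead of changing the writeback
globally) could look roughly like the user-space sketch below. It is not
KVM code: expensive_write_rip(), struct reg_cache, cached_rip_write() and
flush_rip() are hypothetical names, and the assumption that the costly
part is a single hardware write is exactly what the perf data would have
to confirm.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a costly register write (e.g. a vmwrite). */
static unsigned long expensive_writes;
static uint64_t hw_rip;

static void expensive_write_rip(uint64_t val)
{
	hw_rip = val;
	expensive_writes++;
}

/* Hypothetical software cache for one register. */
struct reg_cache {
	uint64_t val;
	bool dirty;
};

/* Update only the cached copy; skip the work if nothing changed. */
static void cached_rip_write(struct reg_cache *c, uint64_t val)
{
	if (c->val == val)
		return;
	c->val = val;
	c->dirty = true;
}

/* Push the cached value out once, and only if it was modified. */
static void flush_rip(struct reg_cache *c)
{
	if (!c->dirty)
		return;
	expensive_write_rip(c->val);
	c->dirty = false;
}

/* Emulate 1000 instructions, then flush twice; return the write count. */
static unsigned long demo_expensive_writes(void)
{
	struct reg_cache rip = { .val = 0, .dirty = false };
	int i;

	expensive_writes = 0;
	hw_rip = 0;
	for (i = 0; i < 1000; i++)
		cached_rip_write(&rip, 0x1000 + (uint64_t)i);
	flush_rip(&rip);
	flush_rip(&rip);	/* clean cache, so this does nothing */
	return expensive_writes;
}
```

In this sketch the 1000 cached updates cost one expensive write in
total, and the second flush is a no-op because the dirty flag is clear.
Whether that maps onto x86_emulate_instruction depends on where the
hits actually are.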
--
Gleb.