Re: [PATCH v2 00/11] KVM in-guest performance monitoring

2011-06-29 Thread Avi Kivity

On 06/13/2011 04:34 PM, Avi Kivity wrote:

This patchset exposes an emulated version 1 architectural performance
monitoring unit to KVM guests.  The PMU is emulated using perf_events,
so the host kernel can multiplex host-wide, host-user, and the
guest on available resources.

Caveats:
- counters that have PMI (interrupt) enabled stop counting after the
   interrupt is signalled.  This is because we need one-shot samples
   that keep counting, which perf doesn't support yet
- some combinations of INV and CMASK are not supported
- counters keep on counting in the host as well as the guest

perf maintainers: please consider the first three patches for merging (the
first two make sense even without the rest).  If you're familiar with the Intel
PMU, please review patch 5 as well - it effectively undoes all your work
of abstracting the PMU into perf_events by unabstracting perf_events into what
is hoped is a very similar PMU.

v2:
  -  don't pass perf_event handler context to the callback; extract it via the
 'event' parameter instead
  -  RDPMC emulation and interception
  -  CR4.PCE emulation


Peter, can you look at 1-3 please?

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 00/11] KVM in-guest performance monitoring

2011-06-29 Thread Peter Zijlstra
On Wed, 2011-06-29 at 10:52 +0300, Avi Kivity wrote:
 On 06/13/2011 04:34 PM, Avi Kivity wrote:
  This patchset exposes an emulated version 1 architectural performance
  monitoring unit to KVM guests.  The PMU is emulated using perf_events,
  so the host kernel can multiplex host-wide, host-user, and the
  guest on available resources.
 
  Caveats:
  - counters that have PMI (interrupt) enabled stop counting after the
 interrupt is signalled.  This is because we need one-shot samples
 that keep counting, which perf doesn't support yet
  - some combinations of INV and CMASK are not supported
  - counters keep on counting in the host as well as the guest
 
  perf maintainers: please consider the first three patches for merging (the
  first two make sense even without the rest).  If you're familiar with the 
  Intel
  PMU, please review patch 5 as well - it effectively undoes all your work
  of abstracting the PMU into perf_events by unabstracting perf_events into 
  what
  is hoped is a very similar PMU.
 
  v2:
-  don't pass perf_event handler context to the callback; extract it via 
  the
   'event' parameter instead
-  RDPMC emulation and interception
-  CR4.PCE emulation
 
 Peter, can you look at 1-3 please?

Queued them, thanks!

I was more or less waiting for a next iteration of the series because of
those problems reported, but those three stand well on their own.




Re: [PATCH v2 00/11] KVM in-guest performance monitoring

2011-06-29 Thread Avi Kivity

On 06/29/2011 11:38 AM, Peter Zijlstra wrote:


  Peter, can you look at 1-3 please?

Queued them, thanks!

I was more or less waiting for a next iteration of the series because of
those problems reported, but those three stand well on their own.


Thanks.  I'm mired in other work but will return to investigate & fix
those issues.


--
error compiling committee.c: too many arguments to function



Re: [PATCH v2 00/11] KVM in-guest performance monitoring

2011-06-16 Thread Avi Kivity

On 06/15/2011 07:51 PM, David Ahern wrote:

The qemu-kvm change is setting the pmu version to 1, and your patchset
introduces v1 event constraints. So based on intel_pmu_init model=0 is
an appropriate model - and a required parameter (-cpu host,model=0).
With that option I get the <not supported> label as expected.

Guest side:
 Performance counter stats for 'openssl speed aes':

      45160.015949 task-clock                #    0.998 CPUs utilized
               192 context-switches          #    0.000 M/sec
                 0 CPU-migrations            #    0.000 M/sec
               650 page-faults               #    0.000 M/sec
    57,064,592,321 cycles                    #    1.264 GHz                     [49.96%]
   138,608,368,094 instructions              #    2.43  insns per cycle         [50.04%]
     3,003,337,751 branches                  #   66.504 M/sec                   [50.04%]
        21,890,537 branch-misses             #    0.73% of all branches         [49.96%]

      45.242117218 seconds time elapsed

(<not supported> events removed). And comparable events from running the
same command host side:
 Performance counter stats for 'openssl speed aes':

      44947.093539 task-clock                #    0.998 CPUs utilized
             4,800 context-switches          #    0.000 M/sec
                 5 CPU-migrations            #    0.000 M/sec
               481 page-faults               #    0.000 M/sec
   124,610,137,228 cycles                    #    2.772 GHz                     [27.77%]
   338,982,292,106 instructions              #    2.72  insns per cycle
     6,061,899,079 branches                  #  134.867 M/sec                   [33.33%]
         2,236,965 branch-misses             #    0.04% of all branches         [33.33%]

      45.043442068 seconds time elapsed

So cycles are off by roughly a factor of 2, instructions by roughly a
factor of 2.5, and branches by a factor of 2. Those 3 events are fairly
close from one run to the next in the host.


Oh, there's the scaling issue that Peter pointed out.

Can you try the tests again, but now measuring just one counter per run 
(perf stat -e xxx command).
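
For reference, the scaling in question comes from counter multiplexing: when more events are requested than there are hardware counters, perf time-slices them and extrapolates each count from the time the event was actually on a counter. The bracketed percentage in the perf stat output is that fraction. A minimal sketch of the extrapolation follows; the function name and the sample values are illustrative assumptions, not taken from perf's source.

```python
def scale_count(raw_count, time_enabled, time_running):
    """Extrapolate a multiplexed counter value to the full enabled time."""
    if time_running == 0:
        # The event never got onto a counter; perf stat reports <not counted>.
        return None
    return raw_count * time_enabled / time_running


# Illustrative values only: an event that was on a counter for half of the run.
print(scale_count(1_000_000, time_enabled=2_000, time_running=1_000))
```

Measuring a single event per run keeps time_running equal to time_enabled, so no extrapolation is involved and any remaining difference has to come from somewhere else.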



--
error compiling committee.c: too many arguments to function



Re: [PATCH v2 00/11] KVM in-guest performance monitoring

2011-06-16 Thread David Ahern


On 06/16/2011 07:53 AM, Avi Kivity wrote:
 On 06/15/2011 07:51 PM, David Ahern wrote:
 The qemu-kvm change is setting the pmu version to 1, and your patchset
 introduces v1 event constraints. So based on intel_pmu_init model=0 is
 an appropriate model - and a required parameter (-cpu host,model=0).
 With that option I get the <not supported> label as expected.

 Guest side:
  Performance counter stats for 'openssl speed aes':

       45160.015949 task-clock                #    0.998 CPUs utilized
                192 context-switches          #    0.000 M/sec
                  0 CPU-migrations            #    0.000 M/sec
                650 page-faults               #    0.000 M/sec
     57,064,592,321 cycles                    #    1.264 GHz                     [49.96%]
    138,608,368,094 instructions              #    2.43  insns per cycle         [50.04%]
      3,003,337,751 branches                  #   66.504 M/sec                   [50.04%]
         21,890,537 branch-misses             #    0.73% of all branches         [49.96%]

       45.242117218 seconds time elapsed

 (<not supported> events removed). And comparable events from running the
 same command host side:
  Performance counter stats for 'openssl speed aes':

       44947.093539 task-clock                #    0.998 CPUs utilized
              4,800 context-switches          #    0.000 M/sec
                  5 CPU-migrations            #    0.000 M/sec
                481 page-faults               #    0.000 M/sec
    124,610,137,228 cycles                    #    2.772 GHz                     [27.77%]
    338,982,292,106 instructions              #    2.72  insns per cycle
      6,061,899,079 branches                  #  134.867 M/sec                   [33.33%]
          2,236,965 branch-misses             #    0.04% of all branches         [33.33%]

       45.043442068 seconds time elapsed

 So cycles are off by roughly a factor of 2, instructions by roughly a
 factor of 2.5, and branches by a factor of 2. Those 3 events are fairly
 close from one run to the next in the host.
 
 Oh, there's the scaling issue that Peter pointed out.
 
 Can you try the tests again, but now measuring just one counter per run
 (perf stat -e xxx command).
 
 
Command:
  perf stat -e instructions openssl speed aes

Guest:
   135,522,189,056 instructions              #    0.00  insns per cycle

Host:
   346,082,922,185 instructions              #    0.00  insns per cycle


Adding '--no-scale' to the perf-stat had no effect on the relative
difference.
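
To put numbers on "relative difference", here is a small sketch that computes the host/guest ratios from the counts reported in this thread. The dictionary layout and the helper name are just for illustration.

```python
# Counts copied from the perf stat runs of 'openssl speed aes' in this thread.
guest = {"cycles": 57_064_592_321, "instructions": 138_608_368_094,
         "branches": 3_003_337_751}
host = {"cycles": 124_610_137_228, "instructions": 338_982_292_106,
        "branches": 6_061_899_079}


def host_to_guest_ratios(host_counts, guest_counts):
    """Return host/guest for every event present in guest_counts."""
    return {name: host_counts[name] / guest_counts[name] for name in guest_counts}


ratios = host_to_guest_ratios(host, guest)
for name, ratio in ratios.items():
    print(f"{name:>12}: host/guest = {ratio:.2f}")

# The single-event run (perf stat -e instructions) from this message.
single_event_ratio = 346_082_922_185 / 135_522_189_056
print(f"instructions only: host/guest = {single_event_ratio:.2f}")
```

The single-event ratio lands close to the multiplexed one, which is consistent with multiplexing not being the main cause of the gap.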

David


Re: [PATCH v2 00/11] KVM in-guest performance monitoring

2011-06-16 Thread David Ahern
On 06/16/2011 08:08 AM, David Ahern wrote:
 Command:
   perf stat -e instructions  openssl speed aes

Hmm.. this might be the wrong benchmark for this. I thought
openssl-speed was a purely CPU intensive benchmark which should have
fairly similar performance numbers in both host and guest. I seem to
recall this as true 2 or so years ago, but that is not the case with
3.0-rc2 and F14.

Using a benchmark Vince W. wrote seems better:
  http://www.csl.cornell.edu/~vince/projects/perf_counter/million.s

perf stat -e instructions ./million

 Performance counter stats for './million':

         1,113,650 instructions              #    0.00  insns per cycle


David


 
 Guest:
135,522,189,056 instructions  #0.00  insns per cycle
 
 
 Host:
346,082,922,185 instructions  #0.00  insns per cycle
 
 
 Adding '--no-scale' to the perf-stat had no effect on the relative
 difference.
 
 David


Re: [PATCH v2 00/11] KVM in-guest performance monitoring

2011-06-16 Thread Avi Kivity

On 06/16/2011 05:19 PM, David Ahern wrote:

On 06/16/2011 08:08 AM, David Ahern wrote:
  Command:
perf stat -e instructions  openssl speed aes

Hmm.. this might be the wrong benchmark for this. I thought
openssl-speed was a purely CPU intensive benchmark which should have
fairly similar performance numbers in both host and guest. I seem to
recall this as true 2 or so years ago, but that is not the case with
3.0-rc2 and F14.

Using a benchmark Vince W. wrote seems better:
   http://www.csl.cornell.edu/~vince/projects/perf_counter/million.s

perf stat -e instructions ./million

  Performance counter stats for './million':

  1,113,650 instructions  #0.00  insns per cycle



Maybe it's sensitive to a cpuid bit which we don't pass through - likely 
a bug in qemu or perhaps in kvm.


--
error compiling committee.c: too many arguments to function



Re: [PATCH v2 00/11] KVM in-guest performance monitoring

2011-06-16 Thread David Ahern


On 06/16/2011 08:20 AM, Avi Kivity wrote:
 On 06/16/2011 05:19 PM, David Ahern wrote:
 On 06/16/2011 08:08 AM, David Ahern wrote:
   Command:
 perf stat -e instructions  openssl speed aes

 Hmm.. this might be the wrong benchmark for this. I thought
 openssl-speed was a purely CPU intensive benchmark which should have
 fairly similar performance numbers in both host and guest. I seem to
 recall this as true 2 or so years ago, but that is not the case with
 3.0-rc2 and F14.

 Using a benchmark Vince W. wrote seems better:
http://www.csl.cornell.edu/~vince/projects/perf_counter/million.s

 perf stat -e instructions ./million

   Performance counter stats for './million':

   1,113,650 instructions  #0.00  insns per cycle

 
 Maybe it's sensitive to a cpuid bit which we don't pass through - likely
 a bug in qemu or perhaps in kvm.
 

Seems to be a side effect of running perf-stat in the guest. Running
just 'openssl speed aes' in both host and guest shows very similar
numbers (for the first 3 columns). Adding 'perf stat' to the command
(i.e., perf stat openssl speed aes) causes a significant decline in the
guest - by a factor of 2. For comparison, 'perf stat' in the host has a
negligible impact.

David


Re: [PATCH v2 00/11] KVM in-guest performance monitoring

2011-06-16 Thread Avi Kivity

On 06/16/2011 05:32 PM, David Ahern wrote:

Seems to be a side effect of running perf-stat in the guest. Running
just 'openssl speed aes' in both host and guest shows very similar
numbers (for the first 3 columns). Adding the 'perf stat' to the command
(ie., perf stat openssl speed aes) causes a significant decline in the
guest - by a factor of 2. For comparison 'perf stat' in the host has a
negligible impact.


That's pretty bad.  I'll investigate.

--
error compiling committee.c: too many arguments to function



Re: [PATCH v2 00/11] KVM in-guest performance monitoring

2011-06-16 Thread David Ahern


On 06/16/2011 08:36 AM, Avi Kivity wrote:
 On 06/16/2011 05:32 PM, David Ahern wrote:
 Seems to be a side effect of running perf-stat in the guest. Running
 just 'openssl speed aes' in both host and guest shows very similar
 numbers (for the first 3 columns). Adding the 'perf stat' to the command
 (ie., perf stat openssl speed aes) causes a significant decline in the
 guest - by a factor of 2. For comparison 'perf stat' in the host has a
 negligible impact.
 
 That's pretty bad.  I'll investigate.
 

Before I let this go for the day:

Running perf in the host shows arch_local_irq_enable is a lot more
prevalent when adding 'perf stat' to the command in the guest.

David


Re: [PATCH v2 00/11] KVM in-guest performance monitoring

2011-06-16 Thread Peter Zijlstra
On Thu, 2011-06-16 at 08:08 -0600, David Ahern wrote:
 Command:
   perf stat -e instructions  openssl speed aes
 
 Guest:
135,522,189,056 instructions  #0.00  insns per cycle
 
 
 Host:
346,082,922,185 instructions  #0.00  insns per cycle 

How does: perf stat -e instructions:u openssl speed aes, compare?


Re: [PATCH v2 00/11] KVM in-guest performance monitoring

2011-06-16 Thread David Ahern


On 06/16/2011 09:08 AM, Peter Zijlstra wrote:
 On Thu, 2011-06-16 at 08:08 -0600, David Ahern wrote:
 Command:
   perf stat -e instructions  openssl speed aes

 Guest:
135,522,189,056 instructions  #0.00  insns per cycle


 Host:
346,082,922,185 instructions  #0.00  insns per cycle 
 
 How does: perf stat -e instructions:u openssl speed aes, compare?

I think the problem is that perf stat in the guest introduces
significant overhead. I ran perf-record in the host on the VM pid while
running 'perf stat openssl speed aes' in the guest.

perf-report on that data shows:

18.06%   9226  [k] arch_local_irq_enable
|
|--99.77%-- kvm_arch_vcpu_ioctl_run
|  kvm_vcpu_ioctl
|  do_vfs_ioctl
|  sys_ioctl
|  system_call_fastpath
|  __GI___ioctl
|  0x1010002
 --0.23%-- [...]

and then perf-annotate on kvm_arch_vcpu_ioctl_run shows:
         :      vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
   21.47 :   1613a:   48 8b 3b                mov    (%rbx),%rdi

David


Re: [PATCH v2 00/11] KVM in-guest performance monitoring

2011-06-16 Thread David Ahern


On 06/16/2011 09:08 AM, Peter Zijlstra wrote:
 On Thu, 2011-06-16 at 08:08 -0600, David Ahern wrote:
 Command:
   perf stat -e instructions  openssl speed aes

 Guest:
135,522,189,056 instructions  #0.00  insns per cycle


 Host:
346,082,922,185 instructions  #0.00  insns per cycle 
 
 How does: perf stat -e instructions:u openssl speed aes, compare?

In the past couple of months I recall you posted a one billion
instruction benchmark in analyzing perf correctness. I can't seem to
find that email. Do you recall the benchmark and if so can you resend ?

David


Re: [PATCH v2 00/11] KVM in-guest performance monitoring

2011-06-16 Thread Peter Zijlstra
On Thu, 2011-06-16 at 09:19 -0600, David Ahern wrote:
 
 On 06/16/2011 09:08 AM, Peter Zijlstra wrote:
  On Thu, 2011-06-16 at 08:08 -0600, David Ahern wrote:
  Command:
perf stat -e instructions  openssl speed aes
 
  Guest:
 135,522,189,056 instructions  #0.00  insns per cycle
 
 
  Host:
 346,082,922,185 instructions  #0.00  insns per cycle 
  
  How does: perf stat -e instructions:u openssl speed aes, compare?
 
 In the past couple of months I recall you posted a one billion
 instruction benchmark in analyzing perf correctness. I can't seem to
 find that email. Do you recall the benchmark and if so can you resend ?

Sure, I've got a couple of those things lying around:

# perf stat -e instructions:u ./loop_1b_instructions-4x

 Performance counter stats for './loop_1b_instructions-4x':

     4,000,085,344 instructions:u            #    0.00  insns per cycle

       0.311861278 seconds time elapsed

---

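#include <stdlib.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>	/* fork() */
#include <sys/wait.h>	/* wait() */

int main(void)
{
	int i;

	/* two forks -> four processes in total */
	fork();
	fork();

	/*
	 * The loop bound is inferred, since the archive mangled it:
	 * 7 nops plus about 3 loop instructions per iteration gives
	 * roughly one billion instructions per process.
	 */
	for (i = 0; i < 100000000; i++) {
		asm("nop");
		asm("nop");
		asm("nop");
		asm("nop");
		asm("nop");
		asm("nop");
		asm("nop");
	}
	wait(NULL);
	wait(NULL);
	wait(NULL);
	wait(NULL);
	return 0;
}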



Re: [PATCH v2 00/11] KVM in-guest performance monitoring

2011-06-16 Thread David Ahern
On 06/16/2011 09:27 AM, Peter Zijlstra wrote:

 Sure, I've got a couple of those things lying around:
 
 # perf stat -e instructions:u ./loop_1b_instructions-4x
 
  Performance counter stats for './loop_1b_instructions-4x':
 
  4,000,085,344 instructions:u#0.00  insns per cycle   
  
 
0.311861278 seconds time elapsed
 
 ---
 
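 #include <stdlib.h>
 #include <stdio.h>
 #include <time.h>
 #include <unistd.h>	/* fork() */
 #include <sys/wait.h>	/* wait() */

 int main(void)
 {
 	int i;

 	/* two forks -> four processes in total */
 	fork();
 	fork();

 	/*
 	 * The loop bound is inferred, since the archive mangled it:
 	 * 7 nops plus about 3 loop instructions per iteration gives
 	 * roughly one billion instructions per process.
 	 */
 	for (i = 0; i < 100000000; i++) {
 		asm("nop");
 		asm("nop");
 		asm("nop");
 		asm("nop");
 		asm("nop");
 		asm("nop");
 		asm("nop");
 	}
 	wait(NULL);
 	wait(NULL);
 	wait(NULL);
 	wait(NULL);
 	return 0;
 }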
 

That's the one.

Guest:
perf stat -e instructions:u /tmp/a.out

 Performance counter stats for '/tmp/a.out':

     4,000,090,357 instructions:u            #    0.00  insns per cycle

       2.972828828 seconds time elapsed

Host:
perf stat -e instructions:u /tmp/a.out

 Performance counter stats for '/tmp/a.out':

     4,000,083,592 instructions:u            #    0.00  insns per cycle

       0.278185315 seconds time elapsed

So the counting is correct,  but the time to run the command is
significantly longer in the guest. That emphasizes the performance
overhead of running perf-stat in the VM.
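
A quick sketch of the two comparisons behind that statement, using only the numbers from the instructions:u runs above (variable names are illustrative):

```python
# instructions:u counts and elapsed times from the guest and host runs above.
guest_insns, guest_seconds = 4_000_090_357, 2.972828828
host_insns, host_seconds = 4_000_083_592, 0.278185315

# Relative disagreement between the two instruction counts.
count_diff = abs(guest_insns - host_insns) / host_insns
# How much longer the same command took in the guest.
slowdown = guest_seconds / host_seconds

print(f"relative difference in instructions:u: {count_diff:.2e}")
print(f"guest/host elapsed time: {slowdown:.1f}x")
```

The counts agree to within a few parts per million, while the wall-clock time differs by about an order of magnitude.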

Even the default counters for perf-stat are similar, showing correctness
in counting:

Guest:
perf stat ./a.out

 Performance counter stats for './a.out':

       2707.156752 task-clock                #    0.996 CPUs utilized
               337 context-switches          #    0.000 M/sec
                 0 CPU-migrations            #    0.000 M/sec
               209 page-faults               #    0.000 M/sec
     3,103,481,148 cycles                    #    1.146 GHz                     [50.25%]
   <not supported> stalled-cycles-frontend
   <not supported> stalled-cycles-backend
     3,999,894,345 instructions              #    1.29  insns per cycle         [50.03%]
       406,716,307 branches                  #  150.237 M/sec                   [49.85%]
           270,801 branch-misses             #    0.07% of all branches         [50.02%]

       2.717859741 seconds time elapsed

Host:
perf stat /tmp/a.out

 Performance counter stats for '/tmp/a.out':

       1117.694687 task-clock                #    3.845 CPUs utilized
               140 context-switches          #    0.000 M/sec
                 3 CPU-migrations            #    0.000 M/sec
               203 page-faults               #    0.000 M/sec
     3,052,677,262 cycles                    #    2.731 GHz
     1,449,951,708 stalled-cycles-frontend   #   47.50% frontend cycles idle
       471,788,212 stalled-cycles-backend    #   15.45% backend  cycles idle
     4,006,074,559 instructions              #    1.31  insns per cycle
                                             #    0.36  stalled cycles per insn
       401,265,264 branches                  #  359.012 M/sec
            29,376 branch-misses             #    0.01% of all branches

       0.290722796 seconds time elapsed
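
For anyone cross-checking the right-hand column, the derived metrics can be recomputed from the raw values. This is a sketch that assumes task-clock is reported in milliseconds and that the metrics are plain ratios; the helper names are made up for illustration.

```python
def cpus_utilized(task_clock_msec, elapsed_sec):
    """CPU time consumed divided by wall-clock time."""
    return task_clock_msec / 1000.0 / elapsed_sec


def insns_per_cycle(instructions, cycles):
    return instructions / cycles


def clock_ghz(cycles, task_clock_msec):
    """Cycles per nanosecond of CPU time."""
    return cycles / (task_clock_msec * 1e6)


# Guest run of ./a.out
print(round(cpus_utilized(2707.156752, 2.717859741), 3))
print(round(insns_per_cycle(3_999_894_345, 3_103_481_148), 2))
print(round(clock_ghz(3_103_481_148, 2707.156752), 3))

# Host run of /tmp/a.out
print(round(cpus_utilized(1117.694687, 0.290722796), 3))
print(round(clock_ghz(3_052_677_262, 1117.694687), 3))
```

Under those assumptions the results line up with the printed columns: the guest run used about one CPU for the whole elapsed time, while the host run spread the four processes over almost four CPUs.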


David


Re: [PATCH v2 00/11] KVM in-guest performance monitoring

2011-06-16 Thread Avi Kivity

On 06/16/2011 06:34 PM, David Ahern wrote:


  main ()
  {
int i;

fork();
fork();


What happens without the two forks?

--
error compiling committee.c: too many arguments to function



Re: [PATCH v2 00/11] KVM in-guest performance monitoring

2011-06-16 Thread David Ahern


On 06/16/2011 09:59 AM, Avi Kivity wrote:
 On 06/16/2011 06:34 PM, David Ahern wrote:
 
   main ()
   {
   int i;
 
   fork();
   fork();
 
 What happens without the two forks?
 

you have a 1-billion instruction benchmark since there is only 1 process.
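
Spelled out as a sketch: each fork() is executed by every process alive at that point, so two forks give four processes, each running the 1-billion instruction loop. The constant and function names below are illustrative; the measured value is the instructions:u count reported earlier for the two-fork version.

```python
def process_count(num_forks):
    """Every existing process runs each fork(), so the total doubles per call."""
    return 2 ** num_forks


INSNS_PER_PROCESS = 1_000_000_000  # the 1-billion instruction loop, per process
measured = 4_000_085_344           # instructions:u for the two-fork version

expected = process_count(2) * INSNS_PER_PROCESS
print(f"expected {expected:,}, measured {measured:,}, "
      f"excess {measured - expected:,}")
```

The small excess over 4 billion is what is left for everything outside the loop.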

David


Re: [PATCH v2 00/11] KVM in-guest performance monitoring

2011-06-16 Thread Avi Kivity

On 06/16/2011 07:04 PM, David Ahern wrote:


On 06/16/2011 09:59 AM, Avi Kivity wrote:
  On 06/16/2011 06:34 PM, David Ahern wrote:
  
 main ()
 {
 int i;
  
 fork();
 fork();

  What happens without the two forks?


you have a 1-billion instruction benchmark since there is only 1 process.



I mean in terms of the overhead.  Is the overhead due to context 
switches being made more expensive by the pmu, or is it something else?


But there were only 337 context switches in your measurement, they 
couldn't possibly be so bad.

Anyway I'll investigate it.

--
error compiling committee.c: too many arguments to function



Re: [PATCH v2 00/11] KVM in-guest performance monitoring

2011-06-16 Thread David Ahern


On 06/16/2011 10:31 AM, Avi Kivity wrote:
 On 06/16/2011 07:04 PM, David Ahern wrote:

 On 06/16/2011 09:59 AM, Avi Kivity wrote:
   On 06/16/2011 06:34 PM, David Ahern wrote:
   
  main ()
  {
  int i;
   
  fork();
  fork();
 
   What happens without the two forks?
 

 you have a 1-billion instruction benchmark since there is only 1 process.

 
 I mean in terms of the overhead.  Is the overhead due to context
 switches being made more expensive by the pmu, or is it something else?

I figured you meant something else by the question.

 
 But there were only 337 context switches in your measurement, they
 couldn't possibly be so bad.
 Anyway I'll investigate it.
 

I don't think it's the context switching. See the email on perf-report
and perf-annotate from the host side while running perf-stat in the
guest. Perhaps more vmexits and associated preemption disable/enable
overhead - or the rcu change?

David


Re: [PATCH v2 00/11] KVM in-guest performance monitoring

2011-06-15 Thread Avi Kivity

On 06/14/2011 09:11 PM, David Ahern wrote:


  Based on Patch 2 you are expecting the guest to have this feature set.
  I've tried +perfmon and +arch_perfmon in the cpu definition for qemu-kvm
  (e.g., -cpu host,model=0,+perfmon) no luck

Never mind. I hand-applied your qemu-kvm patch and changed ebx instead of
eax. I noticed init_intel() looks at eax and discovered the user error.
With the patch applied correctly, it works. :-)


Okay.  If you do anything interesting with it, please let us know.  I 
only tested the watchdog, 'perf top', and 'perf stat'.


--
error compiling committee.c: too many arguments to function



Re: [PATCH v2 00/11] KVM in-guest performance monitoring

2011-06-15 Thread David Ahern
On 06/15/2011 02:57 AM, Avi Kivity wrote:
 Okay.  If you do anything interesting with it, please let us know.  I
 only tested the watchdog, 'perf top', and 'perf stat'.
 

For the following I was using the userspace command from latest
perf-core branch.

cycles H/W event is not working for me, so perf-top did not do much
other than start.

perf-stat -ddd shows a whole lot of 0's - which is interesting. It means
time enabled and time running are non-0, yet the counter value is 0.
The cycles and instructions events also show as <not counted>.

Command I was playing with:
  taskset -c 1 chrt -r 1 perf stat -ddd openssl speed aes

 Performance counter stats for 'openssl speed aes':

      46111.369065 task-clock                #    0.984 CPUs utilized
               195 context-switches          #    0.000 M/sec
                 0 CPU-migrations            #    0.000 M/sec
               650 page-faults               #    0.000 M/sec
     <not counted> cycles
                 0 stalled-cycles-frontend   #    0.00% frontend cycles idle    [ 7.63%]
                 0 stalled-cycles-backend    #    0.00% backend  cycles idle    [12.70%]
     <not counted> instructions
       801,002,999 branches                  #   17.371 M/sec                   [ 8.15%]
         8,491,676 branch-misses             #    1.06% of all branches         [15.17%]
                 0 L1-dcache-loads           #    0.000 M/sec                   [ 9.23%]
                 0 L1-dcache-load-misses     #    0.00% of all L1-dcache hits   [ 8.48%]
                 0 LLC-loads                 #    0.000 M/sec                   [13.89%]
                 0 LLC-load-misses           #    0.00% of all LL-cache hits    [12.47%]
                 0 L1-icache-loads           #    0.000 M/sec                   [ 9.46%]
                 0 L1-icache-load-misses     #    0.00% of all L1-icache hits   [ 9.44%]
                 0 dTLB-loads                #    0.000 M/sec                   [ 9.59%]
                 0 dTLB-load-misses          #    0.00% of all dTLB cache hits  [11.00%]
                 0 iTLB-loads                #    0.000 M/sec                   [11.13%]
                 0 iTLB-load-misses          #    0.00% of all iTLB cache hits  [ 9.73%]
                 0 L1-dcache-prefetches      #    0.000 M/sec                   [10.98%]
                 0 L1-dcache-prefetch-misses #    0.000 M/sec                   [12.51%]

      46.851192693 seconds time elapsed


Also, the numbers for branches and branch-misses just seem wrong
compared to the same command run in the host as well as running
perf-stat in the host on the vcpu thread running openssl (with the vcpu
pinned to a pcpu).

And then reality kicked in and I had to move on to other items.

David


Re: [PATCH v2 00/11] KVM in-guest performance monitoring

2011-06-15 Thread Avi Kivity

On 06/15/2011 03:40 PM, David Ahern wrote:

On 06/15/2011 02:57 AM, Avi Kivity wrote:
  Okay.  If you do anything interesting with it, please let us know.  I
  only tested the watchdog, 'perf top', and 'perf stat'.


For the following I was using the userspace command from latest
perf-core branch.

cycles H/W event is not working for me, so perf-top did not do much
other than start.


Strange, IIRC it did for me.  I'll re-test.


perf-stat -ddd shows a whole lot of 0's - which is interesting. It means
time enabled and time running are non-0, yet the counter value is 0.
cycles and instructions events also show as not counted


Most of those counters aren't supported by the emulated PMU.  What does 
dmesg say about Perf?



Also, the numbers for branches and branch-misses just seem wrong
compared to the same command run in the host as well as running
perf-stat in the host on the vcpu thread running openssl (with the vcpu
pinned to a pcpu).


Could be due to the fact that the counter is running in host mode.  Will 
be fixed once the exclude_host/exclude_guest patch makes it in (and 
gains Intel support).


--
error compiling committee.c: too many arguments to function



Re: [PATCH v2 00/11] KVM in-guest performance monitoring

2011-06-15 Thread David Ahern


On 06/15/2011 07:22 AM, Avi Kivity wrote:
 On 06/15/2011 03:40 PM, David Ahern wrote:
 On 06/15/2011 02:57 AM, Avi Kivity wrote:
   Okay.  If you do anything interesting with it, please let us know.  I
   only tested the watchdog, 'perf top', and 'perf stat'.
 

 For the following I was using the userspace command from latest
 perf-core branch.

 cycles H/W event is not working for me, so perf-top did not do much
 other than start.
 
 Strange, IIRC it did for me.  I'll re-test.
 
 perf-stat -ddd shows a whole lot of 0's - which is interesting. It means
 time enabled and time running are non-0, yet the counter value is 0.
 cycles and instructions events also show as not counted
 
 Most of those counters aren't supported by the emulated PMU. 

If the counter is unsupported, perf-stat should show either <not counted>
or <not supported> (I submitted a patch for the latter which is in the
perf-core branch). If you add -v to perf-stat you see the counters are
enabled and the time running is getting incremented, i.e., something is
probably not implemented correctly.

 What does
 dmesg say about Perf?

[0.050995] Performance Events: Nehalem events, core PMU driver.
[0.051466] ... version:1
[0.052998] ... bit width:  40
[0.053999] ... generic registers:  2
[0.054998] ... value mask: 00ff
[0.055998] ... max period: 7fff
[0.057997] ... fixed-purpose events:   0
[0.058998] ... event mask: 0003
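
As a reading aid for that output, here is a sketch of how the "bit width" and "generic registers" lines relate to the two masks. It assumes each mask simply has one bit set per counter bit or per generic counter, with no extra bits because there are no fixed-purpose events; that is an assumption about the driver, not something stated in this thread, and the function names are made up.

```python
def value_mask(bit_width):
    """Mask covering every bit a counter of the given width can hold."""
    return (1 << bit_width) - 1


def generic_counter_mask(num_generic):
    """One enable bit per generic counter, starting at bit 0."""
    return (1 << num_generic) - 1


# Values from the dmesg output: 40-bit counters, 2 generic registers.
print(f"{value_mask(40):016x}")
print(f"{generic_counter_mask(2):016x}")
```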

 
 Also, the numbers for branches and branch-misses just seem wrong
 compared to the same command run in the host as well as running
 perf-stat in the host on the vcpu thread running openssl (with the vcpu
 pinned to a pcpu).
 
 Could be due to the fact that the counter is running in host mode.  Will

You mean when perf is run in the guest?

 be fixed once the exclude_host/exclude_guest patch makes it in (and
 gains Intel support).
 

How does exclude_{host,guest} help if the guest-side counters are low
-- by orders of magnitude?

David


Re: [PATCH v2 00/11] KVM in-guest performance monitoring

2011-06-15 Thread Avi Kivity

On 06/15/2011 07:08 PM, David Ahern wrote:

  What does
  dmesg say about Perf?

[0.050995] Performance Events: Nehalem events, core PMU driver.
[0.051466] ... version:1
[0.052998] ... bit width:  40
[0.053999] ... generic registers:  2
[0.054998] ... value mask: 00ff
[0.055998] ... max period: 7fff
[0.057997] ... fixed-purpose events:   0
[0.058998] ... event mask: 0003


Well, it's not a Nehalem.  Can you tweak the model/family (via -cpu 
host) so it doesn't match a Nehalem and instead falls on the 
architectural PMU?


Trial-and-error should work to find a good combo.



  Also, the numbers for branches and branch-misses just seem wrong
  compared to the same command run in the host as well as running
  perf-stat in the host on the vcpu thread running openssl (with the vcpu
  pinned to a pcpu).

  Could be due to the fact that the counter is running in host mode.  Will

You mean when perf is run in the guest?


Yes - it's counting host events (mostly kvm.ko) as well as guest events.


  be fixed once the exclude_host/exclude_guest patch makes it in (and
  gains Intel support).


How does exclude_{host,_guest} help if the guest-side counters are low
-- by orders of magnitude?


It's probably the misidentification as Nehalem.

--
error compiling committee.c: too many arguments to function



Re: [PATCH v2 00/11] KVM in-guest performance monitoring

2011-06-15 Thread David Ahern


On 06/15/2011 10:27 AM, Avi Kivity wrote:
 On 06/15/2011 07:08 PM, David Ahern wrote:
   What does
   dmesg say about Perf?

 [0.050995] Performance Events: Nehalem events, core PMU driver.
 [0.051466] ... version:1
 [0.052998] ... bit width:  40
 [0.053999] ... generic registers:  2
 [0.054998] ... value mask: 00ff
 [0.055998] ... max period: 7fff
 [0.057997] ... fixed-purpose events:   0
 [0.058998] ... event mask: 0003
 
 Well, it's not a Nehalem.  Can you tweak the model/family (via -cpu
 host) so it doesn't match a Nehalem and instead falls on the
 architectural PMU?
 
 Trial-and-error should work to find a good combo.

The qemu-kvm change sets the PMU version to 1, and your patchset
introduces v1 event constraints. So, based on intel_pmu_init(), model=0 is
an appropriate model, and a required parameter (-cpu host,model=0).
With that option I get the "not supported" label as expected.

Guest side:
 Performance counter stats for 'openssl speed aes':

      45160.015949 task-clock                #    0.998 CPUs utilized
               192 context-switches          #    0.000 M/sec
                 0 CPU-migrations            #    0.000 M/sec
               650 page-faults               #    0.000 M/sec
    57,064,592,321 cycles                    #    1.264 GHz                     [49.96%]
   138,608,368,094 instructions              #    2.43  insns per cycle         [50.04%]
     3,003,337,751 branches                  #   66.504 M/sec                   [50.04%]
        21,890,537 branch-misses             #    0.73% of all branches         [49.96%]

      45.242117218 seconds time elapsed

("not supported" events removed.) Comparable events from running the
same command on the host side:
 Performance counter stats for 'openssl speed aes':

      44947.093539 task-clock                #    0.998 CPUs utilized
             4,800 context-switches          #    0.000 M/sec
                 5 CPU-migrations            #    0.000 M/sec
               481 page-faults               #    0.000 M/sec
   124,610,137,228 cycles                    #    2.772 GHz                     [27.77%]
   338,982,292,106 instructions              #    2.72  insns per cycle
     6,061,899,079 branches                  #  134.867 M/sec                   [33.33%]
         2,236,965 branch-misses             #    0.04% of all branches         [33.33%]

      45.043442068 seconds time elapsed

So cycles are off by roughly a factor of 2, instructions by roughly a
factor of 2.5, and branches by a factor of 2. Those three events are fairly
close from one run to the next on the host.
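
The factors quoted here can be recomputed from the two perf-stat
listings. A minimal sketch, assuming the raw counts are simply divided
host over guest; the struct and helper names are hypothetical, and the
constants are copied from the output above.

```c
#include <stdint.h>

/* Raw event counts copied from the guest and host perf-stat runs. */
struct event_counts {
    const char *name;
    uint64_t guest;
    uint64_t host;
};

static const struct event_counts counts[] = {
    { "cycles",         57064592321ULL, 124610137228ULL },
    { "instructions",  138608368094ULL, 338982292106ULL },
    { "branches",        3003337751ULL,   6061899079ULL },
    { "branch-misses",     21890537ULL,      2236965ULL },
};

/* Hypothetical helper: how many times larger the host count is. */
static double host_over_guest(const struct event_counts *e)
{
    return (double)e->host / (double)e->guest;
}
```

This gives about 2.18 for cycles, 2.45 for instructions and 2.02 for
branches. Branch-misses go the other way: the guest reports about 9.8
times more than the host.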


Re: [PATCH v2 00/11] KVM in-guest performance monitoring

2011-06-14 Thread Avi Kivity

On 06/13/2011 10:55 PM, David Ahern wrote:

On 06/13/2011 07:34 AM, Avi Kivity wrote:
  This patchset exposes an emulated version 1 architectural performance
  monitoring unit to KVM guests.  The PMU is emulated using perf_events,
  so the host kernel can multiplex host-wide, host-user, and the
  guest on available resources.

Any particular magic needed to try this patchset?



You'll need the attached patch, '-cpu host' (or '-cpu host,model=0' 
sometimes), and, as patch 2 is a guest bug fix, you'll need to run the 
patched kernel in the guest as well.


--
error compiling committee.c: too many arguments to function

From 520cf568954500457e1efe37e144c022a767e41f Mon Sep 17 00:00:00 2001
From: Avi Kivity a...@redhat.com
Date: Mon, 9 May 2011 09:59:52 +0300
Subject: [PATCH] pmu hack

Signed-off-by: Avi Kivity a...@redhat.com
---
 target-i386/cpuid.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c
index 091d812..52ee7a6 100644
--- a/target-i386/cpuid.c
+++ b/target-i386/cpuid.c
@@ -1124,7 +1124,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
 break;
 case 0xA:
 /* Architectural Performance Monitoring Leaf */
-*eax = 0;
+*eax = 0x07280201;
 *ebx = 0;
 *ecx = 0;
 *edx = 0;
-- 
1.7.5.3
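
For reference, the value written to eax in the patch above packs four
fields of CPUID leaf 0xA. The sketch below decodes it, assuming the
field layout of Intel's architectural performance monitoring leaf
(version in bits 7:0, general-purpose counters per logical processor in
bits 15:8, counter bit width in bits 23:16, length of the EBX event bit
vector in bits 31:24). The struct and function names are made up for
illustration and are not part of qemu-kvm or the kernel.

```c
#include <stdint.h>

/* Fields of CPUID leaf 0xA EAX (architectural performance monitoring). */
struct arch_perfmon_eax {
    unsigned version;       /* bits  7:0  */
    unsigned num_counters;  /* bits 15:8  */
    unsigned bit_width;     /* bits 23:16 */
    unsigned mask_length;   /* bits 31:24 */
};

/* Hypothetical helper: split a raw EAX value into its four byte fields. */
static struct arch_perfmon_eax decode_leaf_a_eax(uint32_t eax)
{
    struct arch_perfmon_eax f;

    f.version      = eax & 0xff;
    f.num_counters = (eax >> 8) & 0xff;
    f.bit_width    = (eax >> 16) & 0xff;
    f.mask_length  = (eax >> 24) & 0xff;
    return f;
}
```

For 0x07280201 this yields version 1, 2 general-purpose counters, a
40-bit counter width and an EBX vector length of 7, which lines up with
the "version", "generic registers" and "bit width" lines in the guest
dmesg output quoted elsewhere in this thread.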



Re: [PATCH v2 00/11] KVM in-guest performance monitoring

2011-06-14 Thread David Ahern
On 06/14/2011 02:36 AM, Avi Kivity wrote:
 On 06/13/2011 10:55 PM, David Ahern wrote:
 On 06/13/2011 07:34 AM, Avi Kivity wrote:
   This patchset exposes an emulated version 1 architectural performance
   monitoring unit to KVM guests.  The PMU is emulated using perf_events,
   so the host kernel can multiplex host-wide, host-user, and the
   guest on available resources.

 Any particular magic needed to try this patchset?

 
 You'll need the attached patch, '-cpu host' (or '-cpu host,model=0'
 sometimes), and, as patch 2 is a guest bug fix, you'll need to run the
 patched kernel in the guest as well.
 

qemu-kvm is not cooperating. git repo as of 05f1737582 with your patch
is aborting:

Welcome to Fedora
Starting udev: [4.031626] udev[409]: starting version 161
[4.831159] piix4_smbus :00:01.3: SMBus Host Controller at
0xb100, revision 0
qemu-kvm: /exports/daahern/qemu-kvm.git/hw/msix.c:616:
msix_unset_mask_notifier: Assertion `dev->msix_mask_notifier' failed.

David


Re: [PATCH v2 00/11] KVM in-guest performance monitoring

2011-06-14 Thread Jan Kiszka
On 2011-06-14 19:15, David Ahern wrote:
 On 06/14/2011 02:36 AM, Avi Kivity wrote:
 On 06/13/2011 10:55 PM, David Ahern wrote:
 On 06/13/2011 07:34 AM, Avi Kivity wrote:
  This patchset exposes an emulated version 1 architectural performance
  monitoring unit to KVM guests.  The PMU is emulated using perf_events,
  so the host kernel can multiplex host-wide, host-user, and the
  guest on available resources.

 Any particular magic needed to try this patchset?


 You'll need the attached patch, '-cpu host' (or '-cpu host,model=0'
 sometimes), and, as patch 2 is a guest bug fix, you'll need to run the
 patched kernel in the guest as well.

 
 qemu-kvm is not cooperating. git repo as of 05f1737582 with your patch
 is aborting:
 
   Welcome to Fedora
 Starting udev: [4.031626] udev[409]: starting version 161
 [4.831159] piix4_smbus :00:01.3: SMBus Host Controller at
 0xb100, revision 0
 qemu-kvm: /exports/daahern/qemu-kvm.git/hw/msix.c:616:
 msix_unset_mask_notifier: Assertion `dev->msix_mask_notifier' failed.

Use the qemu-kvm next branch. It has the fix you need.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


Re: [PATCH v2 00/11] KVM in-guest performance monitoring

2011-06-14 Thread David Ahern


On 06/14/2011 11:24 AM, Jan Kiszka wrote:
 On 2011-06-14 19:15, David Ahern wrote:
 On 06/14/2011 02:36 AM, Avi Kivity wrote:
 On 06/13/2011 10:55 PM, David Ahern wrote:
 On 06/13/2011 07:34 AM, Avi Kivity wrote:
  This patchset exposes an emulated version 1 architectural performance
  monitoring unit to KVM guests.  The PMU is emulated using perf_events,
  so the host kernel can multiplex host-wide, host-user, and the
  guest on available resources.

 Any particular magic needed to try this patchset?


 You'll need the attached patch, '-cpu host' (or '-cpu host,model=0'
 sometimes), and, as patch 2 is a guest bug fix, you'll need to run the
 patched kernel in the guest as well.


 qemu-kvm is not cooperating. git repo as of 05f1737582 with your patch
 is aborting:

  Welcome to Fedora
 Starting udev: [4.031626] udev[409]: starting version 161
 [4.831159] piix4_smbus :00:01.3: SMBus Host Controller at
 0xb100, revision 0
 qemu-kvm: /exports/daahern/qemu-kvm.git/hw/msix.c:616:
 msix_unset_mask_notifier: Assertion `dev->msix_mask_notifier' failed.
 
 Use the qemu-kvm next branch. It has the fix you need.

Indeed it does. Thanks.

Avi: still no luck:
[0.047996] Performance Events: unsupported p6 CPU model 0 no PMU
driver, software events only.

qemu-kvm next branch, ce5f0a588b740e8f28f46a6009e12cfa72edc51f with your
perfmon cpuid change. Host and guest are both running your kvm next
branch with pmu patch series.

David


 
 Jan
 


Re: [PATCH v2 00/11] KVM in-guest performance monitoring

2011-06-14 Thread David Ahern
On 06/14/2011 11:33 AM, David Ahern wrote:
 Avi: still no luck:
 [0.047996] Performance Events: unsupported p6 CPU model 0 no PMU
 driver, software events only.
 
 qemu-kvm next branch, ce5f0a588b740e8f28f46a6009e12cfa72edc51f with your
 perfmon cpuid change. Host and guest are both running your kvm next
 branch with pmu patch series.

The perf init code is going down the !perfmon route:

if (!cpu_has(boot_cpu_data, X86_FEATURE_ARCH_PERFMON)) {
switch (boot_cpu_data.x86) {
case 0x6:
return p6_pmu_init();
case 0xf:
return p4_pmu_init();
}
return -ENODEV;
}

Based on patch 2, you are expecting the guest to have this feature set.
I've tried +perfmon and +arch_perfmon in the cpu definition for qemu-kvm
(e.g., -cpu host,model=0,+perfmon), with no luck.

David


Re: [PATCH v2 00/11] KVM in-guest performance monitoring

2011-06-14 Thread David Ahern


On 06/14/2011 11:48 AM, David Ahern wrote:
 On 06/14/2011 11:33 AM, David Ahern wrote:
 Avi: still no luck:
 [0.047996] Performance Events: unsupported p6 CPU model 0 no PMU
 driver, software events only.

 qemu-kvm next branch, ce5f0a588b740e8f28f46a6009e12cfa72edc51f with your
 perfmon cpuid change. Host and guest are both running your kvm next
 branch with pmu patch series.
 
 The perf init code is going down the !perfmon route:
 
   if (!cpu_has(boot_cpu_data, X86_FEATURE_ARCH_PERFMON)) {
   switch (boot_cpu_data.x86) {
   case 0x6:
   return p6_pmu_init();
   case 0xf:
   return p4_pmu_init();
   }
   return -ENODEV;
   }
 
 Based on Patch 2 you are expecting the guest to have this feature set.
 I've tried +perfmon and +arch_perfmon in the cpu definition for qemu-kvm
 (e.g., -cpu host,model=0,+perfmon) no luck
 
Never mind. I hand-applied your qemu-kvm patch and changed ebx instead of
eax. I noticed that init_intel() looks at eax and discovered the user
error. With the patch applied correctly, it works. :-)
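
A sketch of why leaving eax at 0 keeps the guest on the p6 path,
assuming (from recollection of init_intel() in kernels of this period,
not from anything quoted in this thread) that X86_FEATURE_ARCH_PERFMON
is set only when CPUID leaf 0xA reports a non-zero version and more
than one counter in eax. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical user-space model of the check: returns true when the
 * given CPUID leaf 0xA EAX value would be treated as advertising an
 * architectural PMU (non-zero version, more than one counter).
 */
static bool leaf_a_eax_sets_arch_perfmon(uint32_t eax)
{
    unsigned version = eax & 0xff;              /* bits 7:0  */
    unsigned num_counters = (eax >> 8) & 0xff;  /* bits 15:8 */

    return version != 0 && num_counters > 1;
}
```

Under that assumption, eax = 0 fails the check and the
!X86_FEATURE_ARCH_PERFMON branch is taken, while eax = 0x07280201 passes.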

David


Re: [PATCH v2 00/11] KVM in-guest performance monitoring

2011-06-13 Thread David Ahern
On 06/13/2011 07:34 AM, Avi Kivity wrote:
 This patchset exposes an emulated version 1 architectural performance
 monitoring unit to KVM guests.  The PMU is emulated using perf_events,
 so the host kernel can multiplex host-wide, host-user, and the
 guest on available resources.

Any particular magic needed to try this patchset?

Host and guest both 64-bit, Fedora 14. Kernel for both is your 'kvm.git
next' with this patchset applied.

Host:
2 x Intel(R) Xeon(R) CPU   E5540  @ 2.53GHz
qemu-kvm git as of May 9.

Guest:
tried '-cpu host' and without a -cpu arg (so qemu-kvm default). In both
cases I get:

[0.044999] CPU0: Intel(R) Xeon(R) CPU   E5540  @ 2.53GHz
stepping 05
[0.046996] Performance Events: unsupported p6 CPU model 26 no PMU
driver, software events only.

David

 
 Caveats:
 - counters that have PMI (interrupt) enabled stop counting after the
   interrupt is signalled.  This is because we need one-shot samples
   that keep counting, which perf doesn't support yet
 - some combinations of INV and CMASK are not supported
 - counters keep on counting in the host as well as the guest
 
 perf maintainers: please consider the first three patches for merging (the
 first two make sense even without the rest).  If you're familiar with the Intel
 PMU, please review patch 5 as well - it effectively undoes all your work
 of abstracting the PMU into perf_events by unabstracting perf_events into what
 is hoped is a very similar PMU.
 
 v2:
  -  don't pass perf_event handler context to the callback; extract it via the
 'event' parameter instead
  -  RDPMC emulation and interception
  -  CR4.PCE emulation
 