RE: [PATCH] vhost: Add polling mode

2014-08-24 Thread Razya Ladelsky
David Laight david.lai...@aculab.com wrote on 21/08/2014 05:29:41 PM:

 From: David Laight david.lai...@aculab.com
 To: Razya Ladelsky/Haifa/IBM@IBMIL, Michael S. Tsirkin 
m...@redhat.com
 Cc: abel.gor...@gmail.com abel.gor...@gmail.com, Alex Glikson/
 Haifa/IBM@IBMIL, Eran Raichstein/Haifa/IBM@IBMIL, Joel Nider/Haifa/
 IBM@IBMIL, kvm@vger.kernel.org kvm@vger.kernel.org, linux-
 ker...@vger.kernel.org linux-ker...@vger.kernel.org, 
 net...@vger.kernel.org net...@vger.kernel.org, 
 virtualizat...@lists.linux-foundation.org 
 virtualizat...@lists.linux-foundation.org, Yossi 
Kuperman1/Haifa/IBM@IBMIL
 Date: 21/08/2014 05:31 PM
 Subject: RE: [PATCH] vhost: Add polling mode
 
 From: Razya Ladelsky
  Michael S. Tsirkin m...@redhat.com wrote on 20/08/2014 01:57:10 PM:
  
Results:
   
Netperf, 1 vm:
    The polling patch improved throughput by ~33% (1516 MB/sec -> 2046 MB/sec).
    Number of exits/sec decreased 6x.
    The same improvement was shown when I tested with 3 vms running netperf
    (4086 MB/sec -> 5545 MB/sec).
   
filebench, 1 vm:
ops/sec improved by 13% with the polling patch. Number of exits
was reduced by 31%.
The same experiment with 3 vms running filebench showed similar 
numbers.
   
Signed-off-by: Razya Ladelsky ra...@il.ibm.com
  
   This really needs a more thorough benchmarking report, including
   system data.  One good example for a related patch:
   http://lwn.net/Articles/551179/
   though for virtualization, we need data about the host as well, and if you
   want to look at streaming benchmarks, you need to test different message
   sizes and measure packet size.
  
  
  Hi Michael,
  I have already tried running netperf with several message sizes:
  64,128,256,512,600,800...
  But the results are inconsistent even in the baseline/unpatched
  configuration.
  For smaller msg sizes, I get consistent numbers. However, at some point,
  when I increase the msg size I get unstable results. For example, for a
  512B msg, I get two scenarios:
  vm utilization 100%, vhost utilization 75%, throughput ~6300
  vm utilization 80%, vhost utilization 13%, throughput ~9400 (line rate)
  
  I don't know why vhost is behaving that way for certain message sizes.
  Do you have any insight to why this is happening?
 
 Have you tried looking at the actual ethernet packet sizes.
 It may well jump between using small packets (the size of the writes)
 and full sized ones.

I will check it,
Thanks,
Razya

 
 If you are trying to measure ethernet packet 'cost' you need to use UDP.
 However that probably uses different code paths.
 
David
 
 
 



Re: [PATCH] vhost: Add polling mode

2014-08-22 Thread Zhang Haoyu
  
  Results:
  
  Netperf, 1 vm:
  The polling patch improved throughput by ~33% (1516 MB/sec -> 2046 MB/sec).
  Number of exits/sec decreased 6x.
  The same improvement was shown when I tested with 3 vms running netperf
  (4086 MB/sec -> 5545 MB/sec).
  
  filebench, 1 vm:
  ops/sec improved by 13% with the polling patch. Number of exits 
 was reduced by
  31%.
  The same experiment with 3 vms running filebench showed similar numbers.
  
  Signed-off-by: Razya Ladelsky ra...@il.ibm.com
 
 Gave it a quick try on s390/kvm. As expected it makes no difference 
 for big streaming workload like iperf.
 uperf with a 1-1 round robin got indeed faster by about 30%.
 The high CPU consumption is something that bothers me though, as 
 virtualized systems tend to be full.
 
 

Thanks for confirming the results!
The best way to use this patch would be along with a shared vhost thread 
for multiple
devices/vms, as described in:
http://domino.research.ibm.com/library/cyberdig.nsf/1e4115aea78b6e7c85256b360066f0d4/479e3578ed05bfac85257b4200427735!OpenDocument
This work assumes having a dedicated I/O core where the vhost thread 
serves multiple vms, which 
makes the high cpu utilization less of a concern. 

Hi Razya, Shirley,
I am going to test the combination of several vhost threads (the number
depending on the total number of cpus on the host, e.g., total_number * 1/3)
serving all VMs, together with "vhost: add polling mode".
I currently have the patch posted by Shirley at
http://thread.gmane.org/gmane.comp.emulators.kvm.devel/88682/focus=88723
Is there any update to this patch?

Also, I want to make a small change to this patch: create total_cpu_number *
1/N (N={3,4}) vhost threads instead of a per-cpu vhost thread to serve all VMs,
just like the xen netback threads, whose number is equal to num_online_cpus on
Dom0; for a kvm host, I think per-cpu vhost threads are too many.
Any ideas?

Thanks,
Zhang Haoyu


  +static int poll_start_rate = 0;
  +module_param(poll_start_rate, int, S_IRUGO|S_IWUSR);
  +MODULE_PARM_DESC(poll_start_rate, "Start continuous polling of 
 virtqueue when rate of events is at least this number per jiffy. If 
 0, never start polling.");
  +
  +static int poll_stop_idle = 3*HZ; /* 3 seconds */
  +module_param(poll_stop_idle, int, S_IRUGO|S_IWUSR);
  +MODULE_PARM_DESC(poll_stop_idle, "Stop continuous polling of 
 virtqueue after this many jiffies of no work.");
 
 This seems ridiculously high. Even one jiffy is an eternity, so 
 setting it to 1 as a default would reduce the CPU overhead for most cases.
 If we don't have a packet in one millisecond, we can surely go back 
 to the kick approach, I think.
 
 Christian
 

Good point, will reduce it and recheck.
Thank you,
Razya



Re: [PATCH] vhost: Add polling mode

2014-08-21 Thread Razya Ladelsky
Christian Borntraeger borntrae...@de.ibm.com wrote on 20/08/2014 
11:41:32 AM:


  
  Results:
  
  Netperf, 1 vm:
  The polling patch improved throughput by ~33% (1516 MB/sec -> 2046 MB/sec).
  Number of exits/sec decreased 6x.
  The same improvement was shown when I tested with 3 vms running netperf
  (4086 MB/sec -> 5545 MB/sec).
  
  filebench, 1 vm:
  ops/sec improved by 13% with the polling patch. Number of exits 
 was reduced by
  31%.
  The same experiment with 3 vms running filebench showed similar 
numbers.
  
  Signed-off-by: Razya Ladelsky ra...@il.ibm.com
 
 Gave it a quick try on s390/kvm. As expected it makes no difference 
 for big streaming workload like iperf.
 uperf with a 1-1 round robin got indeed faster by about 30%.
 The high CPU consumption is something that bothers me though, as 
 virtualized systems tend to be full.
 
 

Thanks for confirming the results!
The best way to use this patch would be along with a shared vhost thread 
for multiple
devices/vms, as described in:
http://domino.research.ibm.com/library/cyberdig.nsf/1e4115aea78b6e7c85256b360066f0d4/479e3578ed05bfac85257b4200427735!OpenDocument
This work assumes having a dedicated I/O core where the vhost thread 
serves multiple vms, which 
makes the high cpu utilization less of a concern. 



  +static int poll_start_rate = 0;
  +module_param(poll_start_rate, int, S_IRUGO|S_IWUSR);
  +MODULE_PARM_DESC(poll_start_rate, "Start continuous polling of 
 virtqueue when rate of events is at least this number per jiffy. If 
 0, never start polling.");
  +
  +static int poll_stop_idle = 3*HZ; /* 3 seconds */
  +module_param(poll_stop_idle, int, S_IRUGO|S_IWUSR);
  +MODULE_PARM_DESC(poll_stop_idle, "Stop continuous polling of 
 virtqueue after this many jiffies of no work.");
 
 This seems ridiculously high. Even one jiffy is an eternity, so 
 setting it to 1 as a default would reduce the CPU overhead for most cases.
 If we don't have a packet in one millisecond, we can surely go back 
 to the kick approach, I think.
 
 Christian
 

Good point, will reduce it and recheck.
Thank you,
Razya



Re: [PATCH] vhost: Add polling mode

2014-08-21 Thread Razya Ladelsky
Michael S. Tsirkin m...@redhat.com wrote on 20/08/2014 01:57:10 PM:

  Results:
  
  Netperf, 1 vm:
  The polling patch improved throughput by ~33% (1516 MB/sec -> 2046 MB/sec).
  Number of exits/sec decreased 6x.
  The same improvement was shown when I tested with 3 vms running netperf
  (4086 MB/sec -> 5545 MB/sec).
  
  filebench, 1 vm:
  ops/sec improved by 13% with the polling patch. Number of exits 
 was reduced by
  31%.
  The same experiment with 3 vms running filebench showed similar 
numbers.
  
  Signed-off-by: Razya Ladelsky ra...@il.ibm.com
 
 This really needs a more thorough benchmarking report, including
 system data.  One good example for a related patch:
 http://lwn.net/Articles/551179/
 though for virtualization, we need data about host as well, and if you
 want to look at streaming benchmarks, you need to test different message
 sizes and measure packet size.


Hi Michael,
I have already tried running netperf with several message sizes: 
64,128,256,512,600,800...
But the results are inconsistent even in the baseline/unpatched 
configuration.
For smaller msg sizes, I get consistent numbers. However, at some point, 
when I increase the msg size
I get unstable results. For example, for a 512B msg, I get two scenarios:
vm utilization 100%, vhost utilization 75%, throughput ~6300 
vm utilization 80%, vhost utilization 13%, throughput ~9400 (line rate)

I don't know why vhost is behaving that way for certain message sizes.
Do you have any insight to why this is happening?
Thank you,
Razya
 



RE: [PATCH] vhost: Add polling mode

2014-08-21 Thread David Laight
From: Razya Ladelsky
 Michael S. Tsirkin m...@redhat.com wrote on 20/08/2014 01:57:10 PM:
 
   Results:
  
   Netperf, 1 vm:
   The polling patch improved throughput by ~33% (1516 MB/sec -> 2046 MB/sec).
   Number of exits/sec decreased 6x.
   The same improvement was shown when I tested with 3 vms running netperf
   (4086 MB/sec -> 5545 MB/sec).
  
   filebench, 1 vm:
   ops/sec improved by 13% with the polling patch. Number of exits
   was reduced by 31%.
   The same experiment with 3 vms running filebench showed similar numbers.
  
   Signed-off-by: Razya Ladelsky ra...@il.ibm.com
 
  This really needs a more thorough benchmarking report, including
  system data.  One good example for a related patch:
  http://lwn.net/Articles/551179/
  though for virtualization, we need data about host as well, and if you
  want to look at streaming benchmarks, you need to test different message
  sizes and measure packet size.
 
 
 Hi Michael,
 I have already tried running netperf with several message sizes:
 64,128,256,512,600,800...
 But the results are inconsistent even in the baseline/unpatched
 configuration.
 For smaller msg sizes, I get consistent numbers. However, at some point,
 when I increase the msg size
 I get unstable results. For example, for a 512B msg, I get two scenarios:
 vm utilization 100%, vhost utilization 75%, throughput ~6300
 vm utilization 80%, vhost utilization 13%, throughput ~9400 (line rate)
 
 I don't know why vhost is behaving that way for certain message sizes.
 Do you have any insight to why this is happening?

Have you tried looking at the actual ethernet packet sizes.
It may well jump between using small packets (the size of the writes)
and full sized ones.

If you are trying to measure ethernet packet 'cost' you need to use UDP.
However that probably uses different code paths.

David





Re: [PATCH] vhost: Add polling mode

2014-08-20 Thread Christian Borntraeger
On 10/08/14 10:30, Razya Ladelsky wrote:
 From: Razya Ladelsky ra...@il.ibm.com
 Date: Thu, 31 Jul 2014 09:47:20 +0300
 Subject: [PATCH] vhost: Add polling mode
 
 When vhost is waiting for buffers from the guest driver (e.g., more packets to
 send in vhost-net's transmit queue), it normally goes to sleep and waits for 
 the
 guest to kick it. This kick involves a PIO in the guest, and therefore an 
 exit
 (and possibly userspace involvement in translating this PIO exit into a file
 descriptor event), all of which hurts performance.
 
 If the system is under-utilized (has cpu time to spare), vhost can 
 continuously
 poll the virtqueues for new buffers, and avoid asking the guest to kick us.
 This patch adds an optional polling mode to vhost, that can be enabled via a
 kernel module parameter, poll_start_rate.
 
 When polling is active for a virtqueue, the guest is asked to disable
 notification (kicks), and the worker thread continuously checks for new 
 buffers.
 When it does discover new buffers, it simulates a kick by invoking the
 underlying backend driver (such as vhost-net), which thinks it got a real kick
 from the guest, and acts accordingly. If the underlying driver asks not to be
 kicked, we disable polling on this virtqueue.
 
 We start polling on a virtqueue when we notice it has work to do. Polling on
 this virtqueue is later disabled after 3 seconds of polling turning up no new
 work, as in this case we are better off returning to the exit-based 
 notification
 mechanism. The default timeout of 3 seconds can be changed with the
 poll_stop_idle kernel module parameter.
 
 This polling approach makes a lot of sense for new HW with posted-interrupts for
 which we have exitless host-to-guest notifications. But even with support for
 posted interrupts, guest-to-host communication still causes exits. Polling 
 adds
 the missing part.
 
 When systems are overloaded, there won't be enough cpu time for the various
 vhost threads to poll their guests' devices. For these scenarios, we plan to 
 add
 support for vhost threads that can be shared by multiple devices, even of
 multiple vms.
 Our ultimate goal is to implement the I/O acceleration features described in:
 KVM Forum 2013: Efficient and Scalable Virtio (by Abel Gordon)
 https://www.youtube.com/watch?v=9EyweibHfEs
 and
 https://www.mail-archive.com/kvm@vger.kernel.org/msg98179.html
 
 I ran some experiments with TCP stream netperf and filebench (having 2 threads
 performing random reads) benchmarks on an IBM System x3650 M4.
 I have two machines, A and B. A hosts the vms, B runs the netserver.
 The vms (on A) run netperf, its destination server is running on B.
 All runs loaded the guests in a way that they were (cpu) saturated. For 
 example,
 I ran netperf with 64B messages, which is heavily loading the vm (which is why
 its throughput is low).
 The idea was to get it 100% loaded, so we can see that the polling is getting 
 it
 to produce higher throughput.
 
 The system had two cores per guest, as to allow for both the vcpu and the 
 vhost
 thread to run concurrently for maximum throughput (but I didn't pin the 
 threads
 to specific cores).
 My experiments were fair in a sense that for both cases, with or without
 polling, I run both threads, vcpu and vhost, on 2 cores (set their affinity 
 that
 way). The only difference was whether polling was enabled/disabled.
 
 Results:
 
 Netperf, 1 vm:
 The polling patch improved throughput by ~33% (1516 MB/sec -> 2046 MB/sec).
 Number of exits/sec decreased 6x.
 The same improvement was shown when I tested with 3 vms running netperf
 (4086 MB/sec -> 5545 MB/sec).
 
 filebench, 1 vm:
 ops/sec improved by 13% with the polling patch. Number of exits was reduced by
 31%.
 The same experiment with 3 vms running filebench showed similar numbers.
 
 Signed-off-by: Razya Ladelsky ra...@il.ibm.com

Gave it a quick try on s390/kvm. As expected it makes no difference for big 
streaming workload like iperf.
uperf with a 1-1 round robin got indeed faster by about 30%.
The high CPU consumption is something that bothers me though, as virtualized 
systems tend to be full.


 +static int poll_start_rate = 0;
 +module_param(poll_start_rate, int, S_IRUGO|S_IWUSR);
 +MODULE_PARM_DESC(poll_start_rate, "Start continuous polling of virtqueue 
 when rate of events is at least this number per jiffy. If 0, never start 
 polling.");
 +
 +static int poll_stop_idle = 3*HZ; /* 3 seconds */
 +module_param(poll_stop_idle, int, S_IRUGO|S_IWUSR);
 +MODULE_PARM_DESC(poll_stop_idle, "Stop continuous polling of virtqueue after 
 this many jiffies of no work.");

This seems ridiculously high. Even one jiffy is an eternity, so setting it to 1 
as a default would reduce the CPU overhead for most cases.
If we don't have a packet in one millisecond, we can surely go back to the kick 
approach, I think.

Christian
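
[For reference, poll_stop_idle is expressed in jiffies, so the wall-clock
meaning of any value depends on CONFIG_HZ.  The snippet below is only an
illustrative conversion, not part of the patch.]

#include <linux/jiffies.h>

/* With HZ=1000 one jiffy is 1 ms, so the default of 3*HZ is 3000 ms and the
 * suggested value of 1 is about 1 ms; with HZ=250 a single jiffy is 4 ms. */
static inline unsigned int poll_stop_idle_ms(int idle_jiffies)
{
	return jiffies_to_msecs(idle_jiffies);
}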


Re: [PATCH] vhost: Add polling mode

2014-08-20 Thread Michael S. Tsirkin
On Wed, Aug 20, 2014 at 10:41:32AM +0200, Christian Borntraeger wrote:
 On 10/08/14 10:30, Razya Ladelsky wrote:
  From: Razya Ladelsky ra...@il.ibm.com
  Date: Thu, 31 Jul 2014 09:47:20 +0300
  Subject: [PATCH] vhost: Add polling mode
  
  When vhost is waiting for buffers from the guest driver (e.g., more packets 
  to
  send in vhost-net's transmit queue), it normally goes to sleep and waits 
  for the
  guest to kick it. This kick involves a PIO in the guest, and therefore an 
  exit
  (and possibly userspace involvement in translating this PIO exit into a file
  descriptor event), all of which hurts performance.
  
  If the system is under-utilized (has cpu time to spare), vhost can 
  continuously
  poll the virtqueues for new buffers, and avoid asking the guest to kick us.
  This patch adds an optional polling mode to vhost, that can be enabled via a
  kernel module parameter, poll_start_rate.
  
  When polling is active for a virtqueue, the guest is asked to disable
  notification (kicks), and the worker thread continuously checks for new 
  buffers.
  When it does discover new buffers, it simulates a kick by invoking the
  underlying backend driver (such as vhost-net), which thinks it got a real 
  kick
  from the guest, and acts accordingly. If the underlying driver asks not to 
  be
  kicked, we disable polling on this virtqueue.
  
  We start polling on a virtqueue when we notice it has work to do. Polling on
  this virtqueue is later disabled after 3 seconds of polling turning up no 
  new
  work, as in this case we are better off returning to the exit-based 
  notification
  mechanism. The default timeout of 3 seconds can be changed with the
  poll_stop_idle kernel module parameter.
  
  This polling approach makes a lot of sense for new HW with posted-interrupts 
  for
  which we have exitless host-to-guest notifications. But even with support 
  for
  posted interrupts, guest-to-host communication still causes exits. Polling 
  adds
  the missing part.
  
  When systems are overloaded, there won't be enough cpu time for the various
  vhost threads to poll their guests' devices. For these scenarios, we plan 
  to add
  support for vhost threads that can be shared by multiple devices, even of
  multiple vms.
  Our ultimate goal is to implement the I/O acceleration features described 
  in:
  KVM Forum 2013: Efficient and Scalable Virtio (by Abel Gordon)
  https://www.youtube.com/watch?v=9EyweibHfEs
  and
  https://www.mail-archive.com/kvm@vger.kernel.org/msg98179.html
  
  I ran some experiments with TCP stream netperf and filebench (having 2 
  threads
  performing random reads) benchmarks on an IBM System x3650 M4.
  I have two machines, A and B. A hosts the vms, B runs the netserver.
  The vms (on A) run netperf, its destination server is running on B.
  All runs loaded the guests in a way that they were (cpu) saturated. For 
  example,
  I ran netperf with 64B messages, which is heavily loading the vm (which is 
  why
  its throughput is low).
  The idea was to get it 100% loaded, so we can see that the polling is 
  getting it
  to produce higher throughput.
  
  The system had two cores per guest, as to allow for both the vcpu and the 
  vhost
  thread to run concurrently for maximum throughput (but I didn't pin the 
  threads
  to specific cores).
  My experiments were fair in a sense that for both cases, with or without
  polling, I run both threads, vcpu and vhost, on 2 cores (set their affinity 
  that
  way). The only difference was whether polling was enabled/disabled.
  
  Results:
  
  Netperf, 1 vm:
  The polling patch improved throughput by ~33% (1516 MB/sec -> 2046 MB/sec).
  Number of exits/sec decreased 6x.
  The same improvement was shown when I tested with 3 vms running netperf
  (4086 MB/sec -> 5545 MB/sec).
  
  filebench, 1 vm:
  ops/sec improved by 13% with the polling patch. Number of exits was reduced 
  by
  31%.
  The same experiment with 3 vms running filebench showed similar numbers.
  
  Signed-off-by: Razya Ladelsky ra...@il.ibm.com
 
 Gave it a quick try on s390/kvm. As expected it makes no difference for big 
 streaming workload like iperf.
 uperf with a 1-1 round robin got indeed faster by about 30%.
 The high CPU consumption is something that bothers me though, as virtualized 
 systems tend to be full.
 
 
  +static int poll_start_rate = 0;
  +module_param(poll_start_rate, int, S_IRUGO|S_IWUSR);
  +MODULE_PARM_DESC(poll_start_rate, "Start continuous polling of virtqueue 
  when rate of events is at least this number per jiffy. If 0, never start 
  polling.");
  +
  +static int poll_stop_idle = 3*HZ; /* 3 seconds */
  +module_param(poll_stop_idle, int, S_IRUGO|S_IWUSR);
  +MODULE_PARM_DESC(poll_stop_idle, "Stop continuous polling of virtqueue 
  after this many jiffies of no work.");
 
 This seems ridiculously high. Even one jiffy is an eternity, so setting it to 
 1 as a default would reduce the CPU overhead for most cases.
 If we don't have a packet in one

Re: [PATCH] vhost: Add polling mode

2014-08-20 Thread Michael S. Tsirkin
On Sun, Aug 10, 2014 at 11:30:35AM +0300, Razya Ladelsky wrote:
 From: Razya Ladelsky ra...@il.ibm.com
 Date: Thu, 31 Jul 2014 09:47:20 +0300
 Subject: [PATCH] vhost: Add polling mode
 
 When vhost is waiting for buffers from the guest driver (e.g., more packets to
 send in vhost-net's transmit queue), it normally goes to sleep and waits for 
 the
 guest to kick it. This kick involves a PIO in the guest, and therefore an 
 exit
 (and possibly userspace involvement in translating this PIO exit into a file
 descriptor event), all of which hurts performance.
 
 If the system is under-utilized (has cpu time to spare), vhost can 
 continuously
 poll the virtqueues for new buffers, and avoid asking the guest to kick us.
 This patch adds an optional polling mode to vhost, that can be enabled via a
 kernel module parameter, poll_start_rate.
 
 When polling is active for a virtqueue, the guest is asked to disable
 notification (kicks), and the worker thread continuously checks for new 
 buffers.
 When it does discover new buffers, it simulates a kick by invoking the
 underlying backend driver (such as vhost-net), which thinks it got a real kick
 from the guest, and acts accordingly. If the underlying driver asks not to be
 kicked, we disable polling on this virtqueue.
 
 We start polling on a virtqueue when we notice it has work to do. Polling on
 this virtqueue is later disabled after 3 seconds of polling turning up no new
 work, as in this case we are better off returning to the exit-based 
 notification
 mechanism. The default timeout of 3 seconds can be changed with the
 poll_stop_idle kernel module parameter.
 
 This polling approach makes a lot of sense for new HW with posted-interrupts for
 which we have exitless host-to-guest notifications. But even with support for
 posted interrupts, guest-to-host communication still causes exits. Polling 
 adds
 the missing part.
 
 When systems are overloaded, there won't be enough cpu time for the various
 vhost threads to poll their guests' devices. For these scenarios, we plan to 
 add
 support for vhost threads that can be shared by multiple devices, even of
 multiple vms.
 Our ultimate goal is to implement the I/O acceleration features described in:
 KVM Forum 2013: Efficient and Scalable Virtio (by Abel Gordon)
 https://www.youtube.com/watch?v=9EyweibHfEs
 and
 https://www.mail-archive.com/kvm@vger.kernel.org/msg98179.html
 
 I ran some experiments with TCP stream netperf and filebench (having 2 threads
 performing random reads) benchmarks on an IBM System x3650 M4.
 I have two machines, A and B. A hosts the vms, B runs the netserver.
 The vms (on A) run netperf, its destination server is running on B.
 All runs loaded the guests in a way that they were (cpu) saturated. For 
 example,
 I ran netperf with 64B messages, which is heavily loading the vm (which is why
 its throughput is low).
 The idea was to get it 100% loaded, so we can see that the polling is getting 
 it
 to produce higher throughput.
 
 The system had two cores per guest, as to allow for both the vcpu and the 
 vhost
 thread to run concurrently for maximum throughput (but I didn't pin the 
 threads
 to specific cores).
 My experiments were fair in a sense that for both cases, with or without
 polling, I run both threads, vcpu and vhost, on 2 cores (set their affinity 
 that
 way). The only difference was whether polling was enabled/disabled.
 
 Results:
 
 Netperf, 1 vm:
 The polling patch improved throughput by ~33% (1516 MB/sec -> 2046 MB/sec).
 Number of exits/sec decreased 6x.
 The same improvement was shown when I tested with 3 vms running netperf
 (4086 MB/sec -> 5545 MB/sec).
 
 filebench, 1 vm:
 ops/sec improved by 13% with the polling patch. Number of exits was reduced by
 31%.
 The same experiment with 3 vms running filebench showed similar numbers.
 
 Signed-off-by: Razya Ladelsky ra...@il.ibm.com

This really needs a more thorough benchmarking report, including
system data.  One good example for a related patch:
http://lwn.net/Articles/551179/
though for virtualization, we need data about host as well, and if you
want to look at streaming benchmarks, you need to test different message
sizes and measure packet size.

For now, commenting on the patches assuming that will be forthcoming.

 ---
  drivers/vhost/net.c   |6 +-
  drivers/vhost/scsi.c  |6 +-
  drivers/vhost/vhost.c |  245 
 +++--
  drivers/vhost/vhost.h |   38 +++-
  4 files changed, 277 insertions(+), 18 deletions(-)
 
 diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
 index 971a760..558aecb 100644
 --- a/drivers/vhost/net.c
 +++ b/drivers/vhost/net.c
 @@ -742,8 +742,10 @@ static int vhost_net_open(struct inode *inode, struct 
 file *f)
   }
   vhost_dev_init(dev, vqs, VHOST_NET_VQ_MAX);
  
 - vhost_poll_init(n->poll + VHOST_NET_VQ_TX, handle_tx_net, POLLOUT, dev);
 - vhost_poll_init(n->poll + VHOST_NET_VQ_RX, handle_rx_net, POLLIN, dev

Re: [PATCH] vhost: Add polling mode

2014-08-20 Thread Michael S. Tsirkin
On Tue, Aug 19, 2014 at 11:36:31AM +0300, Razya Ladelsky wrote:
  That was just one example. There are many other possibilities.  Either
  actually make the systems load all host CPUs equally, or divide
  throughput by host CPU.
  
 
 The polling patch adds this capability to vhost, reducing costly exit 
 overhead when the vm is loaded.
 
 In order to load the vm I ran netperf  with msg size of 256:
 
 Without polling:  2480 Mbits/sec,  utilization: vm - 100%   vhost - 64% 
 With Polling: 4160 Mbits/sec,  utilization: vm - 100%   vhost - 100% 
 
 Therefore, throughput/cpu without polling is 15.1, and 20.8 with polling.
 

Can you please present results in a form that makes
it possible to see the effect on various configurations
and workloads?

Here's one example where this was done:
https://lkml.org/lkml/2014/8/14/495

You really should also provide data about your host
configuration (missing in the above link).

 My intention was to load vhost as close as possible to 100% utilization 
 without polling, in order to compare it to the polling utilization case 
 (where vhost is always 100%). 
 The best use case, of course, would be when the shared vhost thread work 
 (TBD) is integrated and then vhost will actually be using its polling 
 cycles to handle requests of multiple devices (even from multiple vms).
 
 Thanks,
 Razya


-- 
MST


Re: [PATCH] vhost: Add polling mode

2014-08-19 Thread Razya Ladelsky
 That was just one example. There are many other possibilities.  Either
 actually make the systems load all host CPUs equally, or divide
 throughput by host CPU.
 

The polling patch adds this capability to vhost, reducing costly exit 
overhead when the vm is loaded.

In order to load the vm I ran netperf  with msg size of 256:

Without polling:  2480 Mbits/sec,  utilization: vm - 100%   vhost - 64% 
With Polling: 4160 Mbits/sec,  utilization: vm - 100%   vhost - 100% 

Therefore, throughput/cpu without polling is 15.1, and 20.8 with polling.
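
(For clarity, the per-cpu figures above divide throughput in Mbits/sec by the
combined vm+vhost utilization in percentage points: 2480 / (100 + 64) = 15.1
without polling, and 4160 / (100 + 100) = 20.8 with polling.)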

My intention was to load vhost as close as possible to 100% utilization 
without polling, in order to compare it to the polling utilization case 
(where vhost is always 100%). 
The best use case, of course, would be when the shared vhost thread work 
(TBD) is integrated and then vhost will actually be using its polling 
cycles to handle requests of multiple devices (even from multiple vms).

Thanks,
Razya




Re: [PATCH] vhost: Add polling mode

2014-08-17 Thread Razya Ladelsky
  
  Hi Michael,
  
  Sorry for the delay, had some problems with my mailbox, and I realized 
  just now that my reply wasn't sent.
  The vm indeed ALWAYS utilized 100% cpu, whether polling was enabled or not.
  The vhost thread utilized less than 100% (of the other cpu) when polling 
  was disabled.
  Enabling polling increased its utilization to 100% (in which case both 
  cpus were 100% utilized). 
 
 Hmm this means the testing wasn't successful then, as you said:
 
The idea was to get it 100% loaded, so we can see that the polling is
getting it to produce higher throughput.
 
 in fact here you are producing more throughput but spending more power
 to produce it, which can have any number of explanations besides polling
 improving the efficiency. For example, increasing system load might
 disable host power management.


Hi Michael,
I re-ran the tests, this time with the turbo mode and C-states features off.

No polling:
1 VM running netperf (msg size 64B): 1107 Mbits/sec
Polling:
1 VM running netperf (msg size 64B): 1572 Mbits/sec

As you can see from the new results, the numbers are lower, 
but relatively (polling on/off) there's no change.
Thank you,
Razya


 


 
 
   -- 
   MST
   


Re: [PATCH] vhost: Add polling mode

2014-08-17 Thread Michael S. Tsirkin
On Sun, Aug 17, 2014 at 03:35:39PM +0300, Razya Ladelsky wrote:
   
   Hi Michael,
   
   Sorry for the delay, had some problems with my mailbox, and I realized 
   just now that my reply wasn't sent.
   The vm indeed ALWAYS utilized 100% cpu, whether polling was enabled or not.
   The vhost thread utilized less than 100% (of the other cpu) when polling 
   was disabled.
   Enabling polling increased its utilization to 100% (in which case both 
   cpus were 100% utilized). 
  
  Hmm this means the testing wasn't successful then, as you said:
  
 The idea was to get it 100% loaded, so we can see that the polling is
 getting it to produce higher throughput.
  
  in fact here you are producing more throughput but spending more power
  to produce it, which can have any number of explanations besides polling
  improving the efficiency. For example, increasing system load might
  disable host power management.
 
 
 Hi Michael,
 I re-ran the tests, this time with the turbo mode and C-states features off.
 
 No polling:
 1 VM running netperf (msg size 64B): 1107 Mbits/sec
 Polling:
 1 VM running netperf (msg size 64B): 1572 Mbits/sec
 
 As you can see from the new results, the numbers are lower, 
 but relatively (polling on/off) there's no change.
 Thank you,
 Razya

That was just one example. There are many other possibilities.  Either
actually make the systems load all host CPUs equally, or divide
throughput by host CPU.

 
  
 
 
  
  
-- 
MST



Re: [PATCH] vhost: Add polling mode

2014-08-13 Thread Michael S. Tsirkin
On Tue, Aug 12, 2014 at 01:57:05PM +0300, Razya Ladelsky wrote:
 Michael S. Tsirkin m...@redhat.com wrote on 12/08/2014 12:18:50 PM:
 
  From: Michael S. Tsirkin m...@redhat.com
  To: David Miller da...@davemloft.net
  Cc: Razya Ladelsky/Haifa/IBM@IBMIL, kvm@vger.kernel.org, Alex 
  Glikson/Haifa/IBM@IBMIL, Eran Raichstein/Haifa/IBM@IBMIL, Yossi 
  Kuperman1/Haifa/IBM@IBMIL, Joel Nider/Haifa/IBM@IBMIL, 
  abel.gor...@gmail.com, linux-ker...@vger.kernel.org, 
  net...@vger.kernel.org, virtualizat...@lists.linux-foundation.org
  Date: 12/08/2014 12:18 PM
  Subject: Re: [PATCH] vhost: Add polling mode
  
  On Mon, Aug 11, 2014 at 12:46:21PM -0700, David Miller wrote:
   From: Michael S. Tsirkin m...@redhat.com
   Date: Sun, 10 Aug 2014 21:45:59 +0200
   
On Sun, Aug 10, 2014 at 11:30:35AM +0300, Razya Ladelsky wrote:
...
And, did your tests actually produce 100% load on both host CPUs?
...
   
   Michael, please do not quote an entire patch just to ask a one line
   question.
   
   I truly, truly, wish it was simpler in modern email clients to delete
   the unrelated quoted material because I bet when people do this they
   are simply being lazy.
   
   Thank you.
  
  Lazy - mea culpa, though I'm using mutt so it isn't even hard.
  
  The question still stands: the test results are only valid
  if CPU was at 100% in all configurations.
  This is the reason I generally prefer it when people report
  throughput divided by CPU (power would be good too but it still
  isn't easy for people to get that number).
  
 
 Hi Michael,
 
 Sorry for the delay, had some problems with my mailbox, and I realized 
 just now that 
 my reply wasn't sent.
 The vm indeed ALWAYS utilized 100% cpu, whether polling was enabled or 
 not.
 The vhost thread utilized less than 100% (of the other cpu) when polling 
 was disabled.
 Enabling polling increased its utilization to 100% (in which case both 
 cpus were 100% utilized). 

Hmm this means the testing wasn't successful then, as you said:

The idea was to get it 100% loaded, so we can see that the polling is
getting it to produce higher throughput.

in fact here you are producing more throughput but spending more power
to produce it, which can have any number of explanations besides polling
improving the efficiency. For example, increasing system load might
disable host power management.


  -- 
  MST
  


Re: [PATCH] vhost: Add polling mode

2014-08-12 Thread Michael S. Tsirkin
On Mon, Aug 11, 2014 at 12:46:21PM -0700, David Miller wrote:
 From: Michael S. Tsirkin m...@redhat.com
 Date: Sun, 10 Aug 2014 21:45:59 +0200
 
  On Sun, Aug 10, 2014 at 11:30:35AM +0300, Razya Ladelsky wrote:
  ...
  And, did your tests actually produce 100% load on both host CPUs?
  ...
 
 Michael, please do not quote an entire patch just to ask a one line
 question.
 
 I truly, truly, wish it was simpler in modern email clients to delete
 the unrelated quoted material because I bet when people do this they
 are simply being lazy.
 
 Thank you.

Lazy - mea culpa, though I'm using mutt so it isn't even hard.

The question still stands: the test results are only valid
if CPU was at 100% in all configurations.
This is the reason I generally prefer it when people report
throughput divided by CPU (power would be good too but it still
isn't easy for people to get that number).

-- 
MST



Re: [PATCH] vhost: Add polling mode

2014-08-12 Thread Razya Ladelsky
Michael S. Tsirkin m...@redhat.com wrote on 12/08/2014 12:18:50 PM:

 From: Michael S. Tsirkin m...@redhat.com
 To: David Miller da...@davemloft.net
 Cc: Razya Ladelsky/Haifa/IBM@IBMIL, kvm@vger.kernel.org, Alex 
 Glikson/Haifa/IBM@IBMIL, Eran Raichstein/Haifa/IBM@IBMIL, Yossi 
 Kuperman1/Haifa/IBM@IBMIL, Joel Nider/Haifa/IBM@IBMIL, 
 abel.gor...@gmail.com, linux-ker...@vger.kernel.org, 
 net...@vger.kernel.org, virtualizat...@lists.linux-foundation.org
 Date: 12/08/2014 12:18 PM
 Subject: Re: [PATCH] vhost: Add polling mode
 
 On Mon, Aug 11, 2014 at 12:46:21PM -0700, David Miller wrote:
  From: Michael S. Tsirkin m...@redhat.com
  Date: Sun, 10 Aug 2014 21:45:59 +0200
  
   On Sun, Aug 10, 2014 at 11:30:35AM +0300, Razya Ladelsky wrote:
   ...
   And, did your tests actually produce 100% load on both host CPUs?
   ...
  
  Michael, please do not quote an entire patch just to ask a one line
  question.
  
  I truly, truly, wish it was simpler in modern email clients to delete
  the unrelated quoted material because I bet when people do this they
  are simply being lazy.
  
  Thank you.
 
 Lazy - mea culpa, though I'm using mutt so it isn't even hard.
 
 The question still stands: the test results are only valid
 if CPU was at 100% in all configurations.
 This is the reason I generally prefer it when people report
 throughput divided by CPU (power would be good too but it still
 isn't easy for people to get that number).
 

Hi Michael,

Sorry for the delay, had some problems with my mailbox, and I realized 
just now that 
my reply wasn't sent.
The vm indeed ALWAYS utilized 100% cpu, whether polling was enabled or 
not.
The vhost thread utilized less than 100% (of the other cpu) when polling 
was disabled.
Enabling polling increased its utilization to 100% (in which case both 
cpus were 100% utilized). 


 -- 
 MST
 



Re: [PATCH] vhost: Add polling mode

2014-08-11 Thread David Miller
From: Michael S. Tsirkin m...@redhat.com
Date: Sun, 10 Aug 2014 21:45:59 +0200

 On Sun, Aug 10, 2014 at 11:30:35AM +0300, Razya Ladelsky wrote:
 ...
 And, did your tests actually produce 100% load on both host CPUs?
 ...

Michael, please do not quote an entire patch just to ask a one line
question.

I truly, truly, wish it was simpler in modern email clients to delete
the unrelated quoted material because I bet when people do this they
are simply being lazy.

Thank you.


[PATCH] vhost: Add polling mode

2014-08-10 Thread Razya Ladelsky
From: Razya Ladelsky ra...@il.ibm.com
Date: Thu, 31 Jul 2014 09:47:20 +0300
Subject: [PATCH] vhost: Add polling mode

When vhost is waiting for buffers from the guest driver (e.g., more packets to
send in vhost-net's transmit queue), it normally goes to sleep and waits for the
guest to kick it. This kick involves a PIO in the guest, and therefore an exit
(and possibly userspace involvement in translating this PIO exit into a file
descriptor event), all of which hurts performance.

If the system is under-utilized (has cpu time to spare), vhost can continuously
poll the virtqueues for new buffers, and avoid asking the guest to kick us.
This patch adds an optional polling mode to vhost, that can be enabled via a
kernel module parameter, poll_start_rate.

When polling is active for a virtqueue, the guest is asked to disable
notification (kicks), and the worker thread continuously checks for new buffers.
When it does discover new buffers, it simulates a kick by invoking the
underlying backend driver (such as vhost-net), which thinks it got a real kick
from the guest, and acts accordingly. If the underlying driver asks not to be
kicked, we disable polling on this virtqueue.

We start polling on a virtqueue when we notice it has work to do. Polling on
this virtqueue is later disabled after 3 seconds of polling turning up no new
work, as in this case we are better off returning to the exit-based notification
mechanism. The default timeout of 3 seconds can be changed with the
poll_stop_idle kernel module parameter.

This polling approach makes a lot of sense for new HW with posted-interrupts for
which we have exitless host-to-guest notifications. But even with support for
posted interrupts, guest-to-host communication still causes exits. Polling adds
the missing part.
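
[A simplified sketch of the control flow described above, for illustration
only.  The helper name vhost_vq_work_pending and the vq fields polling,
work_rate and last_work are invented here; only poll_start_rate and
poll_stop_idle correspond to the module parameters this patch adds, and the
real patch integrates this logic into the vhost worker rather than a
standalone function.]

/* Start polling when the event rate justifies it, simulate kicks while
 * polling, and fall back to guest kicks after poll_stop_idle idle jiffies. */
static void vhost_consider_polling(struct vhost_virtqueue *vq)
{
	if (!vq->polling) {
		/* poll_start_rate is events per jiffy; 0 means never poll */
		if (poll_start_rate && vq->work_rate >= poll_start_rate) {
			vhost_disable_notify(vq->dev, vq); /* ask guest not to kick */
			vq->polling = true;
			vq->last_work = jiffies;
		}
		return;
	}

	if (vhost_vq_work_pending(vq)) {
		/* Simulate a kick: run the backend handler (e.g. handle_tx_net)
		 * as if the guest had notified us. */
		vq->handle_kick(&vq->poll.work);
		vq->last_work = jiffies;
	} else if (time_after(jiffies, vq->last_work + poll_stop_idle)) {
		/* No work for poll_stop_idle jiffies (default 3*HZ): return to
		 * the exit-based notification mechanism. */
		vhost_enable_notify(vq->dev, vq); /* re-enable guest kicks */
		vq->polling = false;
	}
}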

When systems are overloaded, there won't be enough cpu time for the various
vhost threads to poll their guests' devices. For these scenarios, we plan to add
support for vhost threads that can be shared by multiple devices, even of
multiple vms.
Our ultimate goal is to implement the I/O acceleration features described in:
KVM Forum 2013: Efficient and Scalable Virtio (by Abel Gordon)
https://www.youtube.com/watch?v=9EyweibHfEs
and
https://www.mail-archive.com/kvm@vger.kernel.org/msg98179.html

I ran some experiments with TCP stream netperf and filebench (having 2 threads
performing random reads) benchmarks on an IBM System x3650 M4.
I have two machines, A and B. A hosts the vms, B runs the netserver.
The vms (on A) run netperf, its destination server is running on B.
All runs loaded the guests in a way that they were (cpu) saturated. For example,
I ran netperf with 64B messages, which is heavily loading the vm (which is why
its throughput is low).
The idea was to get it 100% loaded, so we can see that the polling is getting it
to produce higher throughput.

The system had two cores per guest, so as to allow for both the vcpu and the vhost
thread to run concurrently for maximum throughput (but I didn't pin the threads
to specific cores).
My experiments were fair in the sense that for both cases, with or without
polling, I ran both threads, vcpu and vhost, on 2 cores (set their affinity that
way). The only difference was whether polling was enabled/disabled.

Results:

Netperf, 1 vm:
The polling patch improved throughput by ~33% (1516 MB/sec -> 2046 MB/sec).
Number of exits/sec decreased 6x.
The same improvement was shown when I tested with 3 vms running netperf
(4086 MB/sec -> 5545 MB/sec).

filebench, 1 vm:
ops/sec improved by 13% with the polling patch. Number of exits was reduced by
31%.
The same experiment with 3 vms running filebench showed similar numbers.

Signed-off-by: Razya Ladelsky ra...@il.ibm.com
---
 drivers/vhost/net.c   |6 +-
 drivers/vhost/scsi.c  |6 +-
 drivers/vhost/vhost.c |  245 +++--
 drivers/vhost/vhost.h |   38 +++-
 4 files changed, 277 insertions(+), 18 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 971a760..558aecb 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -742,8 +742,10 @@ static int vhost_net_open(struct inode *inode, struct file 
*f)
}
vhost_dev_init(dev, vqs, VHOST_NET_VQ_MAX);
 
-   vhost_poll_init(n->poll + VHOST_NET_VQ_TX, handle_tx_net, POLLOUT, dev);
-   vhost_poll_init(n->poll + VHOST_NET_VQ_RX, handle_rx_net, POLLIN, dev);
+   vhost_poll_init(n->poll + VHOST_NET_VQ_TX, handle_tx_net, POLLOUT,
+   vqs[VHOST_NET_VQ_TX]);
+   vhost_poll_init(n->poll + VHOST_NET_VQ_RX, handle_rx_net, POLLIN,
+   vqs[VHOST_NET_VQ_RX]);
 
	f->private_data = n;
 
diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index 4f4ffa4..665eeeb 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -1528,9 +1528,9 @@ static int vhost_scsi_open(struct inode *inode, struct 
file *f)
if (!vqs)
goto err_vqs;
 
-   vhost_work_init(vs

Re: [PATCH] vhost: Add polling mode

2014-08-10 Thread Michael S. Tsirkin
On Sun, Aug 10, 2014 at 11:30:35AM +0300, Razya Ladelsky wrote:
 From: Razya Ladelsky ra...@il.ibm.com
 Date: Thu, 31 Jul 2014 09:47:20 +0300
 Subject: [PATCH] vhost: Add polling mode
 
 When vhost is waiting for buffers from the guest driver (e.g., more packets to
 send in vhost-net's transmit queue), it normally goes to sleep and waits for 
 the
 guest to kick it. This kick involves a PIO in the guest, and therefore an 
 exit
 (and possibly userspace involvement in translating this PIO exit into a file
 descriptor event), all of which hurts performance.
 
 If the system is under-utilized (has cpu time to spare), vhost can 
 continuously
 poll the virtqueues for new buffers, and avoid asking the guest to kick us.
 This patch adds an optional polling mode to vhost, that can be enabled via a
 kernel module parameter, poll_start_rate.
 
 When polling is active for a virtqueue, the guest is asked to disable
 notification (kicks), and the worker thread continuously checks for new 
 buffers.
 When it does discover new buffers, it simulates a kick by invoking the
 underlying backend driver (such as vhost-net), which thinks it got a real kick
 from the guest, and acts accordingly. If the underlying driver asks not to be
 kicked, we disable polling on this virtqueue.
 
 We start polling on a virtqueue when we notice it has work to do. Polling on
 this virtqueue is later disabled after 3 seconds of polling turning up no new
 work, as in this case we are better off returning to the exit-based 
 notification
 mechanism. The default timeout of 3 seconds can be changed with the
 poll_stop_idle kernel module parameter.
 
 This polling approach makes a lot of sense for new HW with posted-interrupts for
 which we have exitless host-to-guest notifications. But even with support for
 posted interrupts, guest-to-host communication still causes exits. Polling 
 adds
 the missing part.
 
 When systems are overloaded, there won't be enough cpu time for the various
 vhost threads to poll their guests' devices. For these scenarios, we plan to 
 add
 support for vhost threads that can be shared by multiple devices, even of
 multiple vms.
 Our ultimate goal is to implement the I/O acceleration features described in:
 KVM Forum 2013: Efficient and Scalable Virtio (by Abel Gordon)
 https://www.youtube.com/watch?v=9EyweibHfEs
 and
 https://www.mail-archive.com/kvm@vger.kernel.org/msg98179.html
 
 I ran some experiments with TCP stream netperf and filebench (having 2 threads
 performing random reads) benchmarks on an IBM System x3650 M4.
 I have two machines, A and B. A hosts the vms, B runs the netserver.
 The vms (on A) run netperf, its destination server is running on B.
 All runs loaded the guests in a way that they were (cpu) saturated. For 
 example,
 I ran netperf with 64B messages, which is heavily loading the vm (which is why
 its throughput is low).
 The idea was to get it 100% loaded, so we can see that the polling is getting 
 it
 to produce higher throughput.

And, did your tests actually produce 100% load on both host CPUs?

 The system had two cores per guest, as to allow for both the vcpu and the 
 vhost
 thread to run concurrently for maximum throughput (but I didn't pin the 
 threads
 to specific cores).
 My experiments were fair in a sense that for both cases, with or without
 polling, I run both threads, vcpu and vhost, on 2 cores (set their affinity 
 that
 way). The only difference was whether polling was enabled/disabled.
 
 Results:
 
 Netperf, 1 vm:
 The polling patch improved throughput by ~33% (1516 MB/sec -> 2046 MB/sec).
 Number of exits/sec decreased 6x.
 The same improvement was shown when I tested with 3 vms running netperf
 (4086 MB/sec -> 5545 MB/sec).
 
 filebench, 1 vm:
 ops/sec improved by 13% with the polling patch. Number of exits was reduced by
 31%.
 The same experiment with 3 vms running filebench showed similar numbers.
 
 Signed-off-by: Razya Ladelsky ra...@il.ibm.com
 ---
  drivers/vhost/net.c   |6 +-
  drivers/vhost/scsi.c  |6 +-
  drivers/vhost/vhost.c |  245 
 +++--
  drivers/vhost/vhost.h |   38 +++-
  4 files changed, 277 insertions(+), 18 deletions(-)
 
 diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
 index 971a760..558aecb 100644
 --- a/drivers/vhost/net.c
 +++ b/drivers/vhost/net.c
 @@ -742,8 +742,10 @@ static int vhost_net_open(struct inode *inode, struct 
 file *f)
   }
   vhost_dev_init(dev, vqs, VHOST_NET_VQ_MAX);
  
 - vhost_poll_init(n->poll + VHOST_NET_VQ_TX, handle_tx_net, POLLOUT, dev);
 - vhost_poll_init(n->poll + VHOST_NET_VQ_RX, handle_rx_net, POLLIN, dev);
 + vhost_poll_init(n->poll + VHOST_NET_VQ_TX, handle_tx_net, POLLOUT,
 + vqs[VHOST_NET_VQ_TX]);
 + vhost_poll_init(n->poll + VHOST_NET_VQ_RX, handle_rx_net, POLLIN,
 + vqs[VHOST_NET_VQ_RX]);
  
   f->private_data = n;
  
 diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
 index 4f4ffa4

Re: [PATCH] vhost: Add polling mode

2014-07-30 Thread Razya Ladelsky
kvm-ow...@vger.kernel.org wrote on 29/07/2014 03:40:18 PM:

 From: Michael S. Tsirkin m...@redhat.com
 To: Razya Ladelsky/Haifa/IBM@IBMIL, 
 Cc: abel.gor...@gmail.com, Alex Glikson/Haifa/IBM@IBMIL, Eran 
 Raichstein/Haifa/IBM@IBMIL, Joel Nider/Haifa/IBM@IBMIL, 
 kvm@vger.kernel.org, kvm-ow...@vger.kernel.org, Yossi Kuperman1/
 Haifa/IBM@IBMIL
 Date: 29/07/2014 03:40 PM
 Subject: Re: [PATCH] vhost: Add polling mode
 Sent by: kvm-ow...@vger.kernel.org
 
 On Tue, Jul 29, 2014 at 03:23:59PM +0300, Razya Ladelsky wrote:
   
   Hmm there aren't a lot of numbers there :(. Speed increased by 33% but
   by how much?  E.g. maybe you are getting from 1Mbyte/sec to 1.3,
   if so it's hard to get excited about it. 
  
  Netperf 1 VM: 1516 MB/sec -> 2046 MB/sec
  and for 3 VMs: 4086 MB/sec -> 5545 MB/sec
 
 What do you mean by 1 VM? Streaming TCP host to vm?
 Also, your throughput is somewhat low, it's worth seeing
 why you can't hit higher speeds.
 

My configuration is this:
I have two machines, A and B.
A hosts the vms, B runs the netserver.
One vm (on A) runs netperf, where its destination server is running on 
B. 

I ran netperf with 64B messages, which is heavily loading the vm, which is 
why its throughput is low.
The idea was to get it 100% loaded, so we can see that the polling is 
getting it to produce higher throughput. 



   Some questions that come to
   mind: what was the message size? I would expect several measurements
   with different values.  How did host CPU utilization change?
   
  
  message size was 64B in order to get the VM to be cpu saturated. 
  so the vm had 99% cpu and vhost 38%; with the polling patch both had 99%.
 
 Hmm so a net loss in throughput/CPU.
 

Actually, my experiments were fair in the sense that for both cases, 
with or without polling, I ran both threads, vcpu and vhost, on 2 cores 
(set their affinity that way).
The only difference was whether polling was enabled/disabled. 


  
  
   What about latency? As we are competing with guest for host CPU,
   would worst-case or average latency suffer?
   
  
  Polling indeed doesn't make a lot of sense if there aren't enough 
  available cores.
  In these cases polling should not be used.
  
  Thank you,
  Razya
 
 OK but scheduler might run vm and vhost on the same cpu
 even if cores are available.
 This needs to be detected somehow and polling disabled.
 
 
  
  
   Thanks,
   
   -- 
   MST


Re: [PATCH] vhost: Add polling mode

2014-07-29 Thread Razya Ladelsky
kvm-ow...@vger.kernel.org wrote on 29/07/2014 04:30:34 AM:

 From: Zhang Haoyu zhan...@sangfor.com
 To: Jason Wang jasow...@redhat.com, Abel Gordon 
 abel.gor...@gmail.com, 
 Cc: Razya Ladelsky/Haifa/IBM@IBMIL, Alex Glikson/Haifa/IBM@IBMIL, 
 Eran Raichstein/Haifa/IBM@IBMIL, Joel Nider/Haifa/IBM@IBMIL, kvm 
 kvm@vger.kernel.org, Michael S. Tsirkin m...@redhat.com, Yossi 
 Kuperman1/Haifa/IBM@IBMIL
 Date: 29/07/2014 04:35 AM
 Subject: Re: [PATCH] vhost: Add polling mode
 Sent by: kvm-ow...@vger.kernel.org
 
 Maybe tying "vhost-net scalability tuning: threading for many VMs" and 
 "vhost: Add polling mode" together is a good marriage, because there is 
 a better chance of finding work to do with less polling time, so fewer 
 cpu cycles are wasted.
 

Hi Zhang,
Indeed, having one vhost thread shared by multiple vms, polling for their 
requests, is the ultimate goal of this plan.
The current challenge with it is that the cgroup mechanism needs to be 
supported/incorporated somehow by this shared vhost thread, as it now 
serves multiple vms (processes).
B.T.W. - if someone wants to help with this effort (mainly the cgroup 
issue),
it would be greatly appreciated...! 
 
Thank you,
Razya 

 Thanks,
 Zhang Haoyu
 
   Hello All,
   
   When vhost is waiting for buffers from the guest driver (e.g., more
   packets to send in vhost-net's transmit queue), it normally goes to
   sleep and waits for the guest to kick it. This kick involves a PIO in
   the guest, and therefore an exit (and possibly userspace involvement
   in translating this PIO exit into a file descriptor event), all of
   which hurts performance.
   
   If the system is under-utilized (has cpu time to spare), vhost can
   continuously poll the virtqueues for new buffers, and avoid asking
   the guest to kick us.
   This patch adds an optional polling mode to vhost, that can be enabled
   via a kernel module parameter, poll_start_rate.
   
   When polling is active for a virtqueue, the guest is asked to
   disable notification (kicks), and the worker thread continuously checks
   for new buffers. When it does discover new buffers, it simulates a kick
   by invoking the underlying backend driver (such as vhost-net), which
   thinks it got a real kick from the guest, and acts accordingly. If the
   underlying driver asks not to be kicked, we disable polling on this
   virtqueue.
   
   We start polling on a virtqueue when we notice it has work to do.
   Polling on this virtqueue is later disabled after 3 seconds of polling
   turning up no new work, as in this case we are better off returning to
   the exit-based notification mechanism. The default timeout of 3 seconds
   can be changed with the poll_stop_idle kernel module parameter.
   
   This polling approach makes lot of sense for new HW with
   posted-interrupts for which we have exitless host-to-guest
   notifications. But even with support for posted interrupts,
   guest-to-host communication still causes exits. Polling adds the
   missing part.
   
   When systems are overloaded, there won't be enough cpu time for the
   various vhost threads to poll their guests' devices. For these
   scenarios, we plan to add support for vhost threads that can be shared
   by multiple devices, even of multiple vms.
   Our ultimate goal is to implement the I/O acceleration features
   described in:
   KVM Forum 2013: Efficient and Scalable Virtio (by Abel Gordon)
   https://www.youtube.com/watch?v=9EyweibHfEs
   and
   https://www.mail-archive.com/kvm@vger.kernel.org/msg98179.html
   
   
   Comments are welcome,
   Thank you,
   Razya
   Thanks for the work. Do you have perf numbers for this?
   
   Hi Jason,
   Thanks for reviewing. I ran some experiments with TCP stream netperf and
   filebench (having 2 threads performing random reads) benchmarks on an
   IBM System x3650 M4.
   All runs loaded the guests in a way that they were (cpu) saturated.
   The system had two cores per guest, as to allow for both the vcpu and
   the vhost thread to run concurrently for maximum throughput (but I
   didn't pin the threads to specific cores)
   I get:
   
   Netperf, 1 vm:
   The polling patch improved throughput by ~33%. Number of exits/sec
   decreased 6x.
   The same improvement was shown when I tested with 3 vms running netperf.
   
   filebench, 1 vm:
   ops/sec improved by 13% with the polling patch. Number of exits was
   reduced by 31%.
   The same experiment with 3 vms running filebench showed similar numbers.
  
  Looks good, may worth to add the result in the commit log.
  
   And looks like the patch only poll for virtqueue. In the future, may
   worth to add callbacks for vhost_net to poll socket. Then it could be
   used with rx busy polling in host which may speedup the rx also.
   Did you mean polling the network device to avoid interrupts?

Re: [PATCH] vhost: Add polling mode

2014-07-29 Thread Michael S. Tsirkin
On Mon, Jul 21, 2014 at 04:23:44PM +0300, Razya Ladelsky wrote:
 Hello All,
 
 When vhost is waiting for buffers from the guest driver (e.g., more 
 packets
 to send in vhost-net's transmit queue), it normally goes to sleep and 
 waits
 for the guest to kick it. This kick involves a PIO in the guest, and
 therefore an exit (and possibly userspace involvement in translating this 
 PIO
 exit into a file descriptor event), all of which hurts performance.
 
 If the system is under-utilized (has cpu time to spare), vhost can 
 continuously poll the virtqueues for new buffers, and avoid asking 
 the guest to kick us.
 This patch adds an optional polling mode to vhost, that can be enabled 
 via a kernel module parameter, poll_start_rate.
 
 When polling is active for a virtqueue, the guest is asked to
 disable notification (kicks), and the worker thread continuously checks 
 for
 new buffers. When it does discover new buffers, it simulates a kick by
 invoking the underlying backend driver (such as vhost-net), which thinks 
 it
 got a real kick from the guest, and acts accordingly. If the underlying
 driver asks not to be kicked, we disable polling on this virtqueue.
 
 We start polling on a virtqueue when we notice it has
 work to do. Polling on this virtqueue is later disabled after 3 seconds of
 polling turning up no new work, as in this case we are better off 
 returning
 to the exit-based notification mechanism. The default timeout of 3 seconds
 can be changed with the poll_stop_idle kernel module parameter.
 
 This polling approach makes lot of sense for new HW with posted-interrupts
 for which we have exitless host-to-guest notifications. But even with 
 support 
 for posted interrupts, guest-to-host communication still causes exits. 
 Polling adds the missing part.
 
 When systems are overloaded, there won't be enough cpu time for the 
 various 
 vhost threads to poll their guests' devices. For these scenarios, we plan 
 to add support for vhost threads that can be shared by multiple devices, 
 even of multiple vms. 
 Our ultimate goal is to implement the I/O acceleration features described 
 in:
 KVM Forum 2013: Efficient and Scalable Virtio (by Abel Gordon) 
 https://www.youtube.com/watch?v=9EyweibHfEs
 and
 https://www.mail-archive.com/kvm@vger.kernel.org/msg98179.html
 
  
 Comments are welcome, 
 Thank you,
 Razya
 
 From: Razya Ladelsky ra...@il.ibm.com
 
 Add an optional polling mode to continuously poll the virtqueues
 for new buffers, and avoid asking the guest to kick us.
 
 Signed-off-by: Razya Ladelsky ra...@il.ibm.com

This is an optimization patch, isn't it?
Could you please include some numbers showing its
effect?


 ---
  drivers/vhost/net.c   |6 +-
  drivers/vhost/scsi.c  |5 +-
  drivers/vhost/vhost.c |  247 
 +++--
  drivers/vhost/vhost.h |   37 +++-
  4 files changed, 277 insertions(+), 18 deletions(-)


Whitespace seems mangled to the point of making patch
unreadable. Can you pls repost?

 diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
 index 971a760..558aecb 100644
 --- a/drivers/vhost/net.c
 +++ b/drivers/vhost/net.c
 @@ -742,8 +742,10 @@ static int vhost_net_open(struct inode *inode, struct 
 file *f)
 }
 vhost_dev_init(dev, vqs, VHOST_NET_VQ_MAX);
  
 -   vhost_poll_init(n-poll + VHOST_NET_VQ_TX, handle_tx_net, POLLOUT, 
 dev);
 -   vhost_poll_init(n-poll + VHOST_NET_VQ_RX, handle_rx_net, POLLIN, 
 dev);
 +   vhost_poll_init(n-poll + VHOST_NET_VQ_TX, handle_tx_net, POLLOUT,
 +   vqs[VHOST_NET_VQ_TX]);
 +   vhost_poll_init(n-poll + VHOST_NET_VQ_RX, handle_rx_net, POLLIN,
 +   vqs[VHOST_NET_VQ_RX]);
  
 f-private_data = n;
  
 diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
 index 4f4ffa4..56f0233 100644
 --- a/drivers/vhost/scsi.c
 +++ b/drivers/vhost/scsi.c
 @@ -1528,9 +1528,8 @@ static int vhost_scsi_open(struct inode *inode, 
 struct file *f)
 if (!vqs)
 goto err_vqs;
  
 -   vhost_work_init(vs-vs_completion_work, 
 vhost_scsi_complete_cmd_work);
 -   vhost_work_init(vs-vs_event_work, tcm_vhost_evt_work);
 -
 +   vhost_work_init(vs-vs_completion_work, NULL, 
 vhost_scsi_complete_cmd_work);
 +   vhost_work_init(vs-vs_event_work, NULL, tcm_vhost_evt_work);
 vs-vs_events_nr = 0;
 vs-vs_events_missed = false;
  
 diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
 index c90f437..678d766 100644
 --- a/drivers/vhost/vhost.c
 +++ b/drivers/vhost/vhost.c
 @@ -24,9 +24,17 @@
  #include linux/slab.h
  #include linux/kthread.h
  #include linux/cgroup.h
 +#include linux/jiffies.h
  #include linux/module.h
  
  #include vhost.h
 +static int poll_start_rate = 0;
 +module_param(poll_start_rate, int, S_IRUGO|S_IWUSR);
 +MODULE_PARM_DESC(poll_start_rate, Start continuous polling of virtqueue 
 when rate of events is at least this number per jiffy. If 0, never start 
 polling.);
 +
 

Re: [PATCH] vhost: Add polling mode

2014-07-29 Thread Razya Ladelsky
Michael S. Tsirkin m...@redhat.com wrote on 29/07/2014 11:06:40 AM:

 From: Michael S. Tsirkin m...@redhat.com
 To: Razya Ladelsky/Haifa/IBM@IBMIL, 
 Cc: kvm@vger.kernel.org, abel.gor...@gmail.com, Joel Nider/Haifa/
 IBM@IBMIL, Yossi Kuperman1/Haifa/IBM@IBMIL, Eran Raichstein/Haifa/
 IBM@IBMIL, Alex Glikson/Haifa/IBM@IBMIL
 Date: 29/07/2014 11:06 AM
 Subject: Re: [PATCH] vhost: Add polling mode
 
 On Mon, Jul 21, 2014 at 04:23:44PM +0300, Razya Ladelsky wrote:
  Hello All,
  
  When vhost is waiting for buffers from the guest driver (e.g., more 
  packets
  to send in vhost-net's transmit queue), it normally goes to sleep and 
  waits
  for the guest to kick it. This kick involves a PIO in the guest, and
  therefore an exit (and possibly userspace involvement in translating 
this 
  PIO
  exit into a file descriptor event), all of which hurts performance.
  
  If the system is under-utilized (has cpu time to spare), vhost can 
  continuously poll the virtqueues for new buffers, and avoid asking 
  the guest to kick us.
  This patch adds an optional polling mode to vhost, that can be enabled 

  via a kernel module parameter, poll_start_rate.
  
  When polling is active for a virtqueue, the guest is asked to
  disable notification (kicks), and the worker thread continuously 
checks 
  for
  new buffers. When it does discover new buffers, it simulates a kick 
by
  invoking the underlying backend driver (such as vhost-net), which 
thinks 
  it
  got a real kick from the guest, and acts accordingly. If the 
underlying
  driver asks not to be kicked, we disable polling on this virtqueue.
  
  We start polling on a virtqueue when we notice it has
  work to do. Polling on this virtqueue is later disabled after 3 
seconds of
  polling turning up no new work, as in this case we are better off 
  returning
  to the exit-based notification mechanism. The default timeout of 3 
seconds
  can be changed with the poll_stop_idle kernel module parameter.
  
  This polling approach makes lot of sense for new HW with 
posted-interrupts
  for which we have exitless host-to-guest notifications. But even with 
  support 
  for posted interrupts, guest-to-host communication still causes exits. 

  Polling adds the missing part.
  
  When systems are overloaded, there won't be enough cpu time for the 
  various 
  vhost threads to poll their guests' devices. For these scenarios, we 
plan 
  to add support for vhost threads that can be shared by multiple 
devices, 
  even of multiple vms. 
  Our ultimate goal is to implement the I/O acceleration features 
described 
  in:
  KVM Forum 2013: Efficient and Scalable Virtio (by Abel Gordon) 
  https://www.youtube.com/watch?v=9EyweibHfEs
  and
  https://www.mail-archive.com/kvm@vger.kernel.org/msg98179.html
  
  
  Comments are welcome, 
  Thank you,
  Razya
  
  From: Razya Ladelsky ra...@il.ibm.com
  
  Add an optional polling mode to continuously poll the virtqueues
  for new buffers, and avoid asking the guest to kick us.
  
  Signed-off-by: Razya Ladelsky ra...@il.ibm.com
 
 This is an optimization patch, isn't it?
 Could you please include some numbers showing its
 effect?
 
 

Hi Michael,
Sure. I included them in a reply to Jason Wang in this thread.
Here it is:
http://www.spinics.net/linux/lists/kvm/msg106049.html




  ---
   drivers/vhost/net.c   |6 +-
   drivers/vhost/scsi.c  |5 +-
   drivers/vhost/vhost.c |  247 
  +++--
   drivers/vhost/vhost.h |   37 +++-
   4 files changed, 277 insertions(+), 18 deletions(-)
 
 
 Whitespace seems mangled to the point of making patch
 unreadable. Can you pls repost?
 

Sure.

  diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
  index 971a760..558aecb 100644
  --- a/drivers/vhost/net.c
  +++ b/drivers/vhost/net.c
  @@ -742,8 +742,10 @@ static int vhost_net_open(struct inode *inode, 
struct 
  file *f)
  }
  vhost_dev_init(dev, vqs, VHOST_NET_VQ_MAX);
  
  -   vhost_poll_init(n-poll + VHOST_NET_VQ_TX, handle_tx_net, 
POLLOUT, 
  dev);
  -   vhost_poll_init(n-poll + VHOST_NET_VQ_RX, handle_rx_net, 
POLLIN, 
  dev);
  +   vhost_poll_init(n-poll + VHOST_NET_VQ_TX, handle_tx_net, 
POLLOUT,
  +   vqs[VHOST_NET_VQ_TX]);
  +   vhost_poll_init(n-poll + VHOST_NET_VQ_RX, handle_rx_net, 
POLLIN,
  +   vqs[VHOST_NET_VQ_RX]);
  
  f-private_data = n;
  
  diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
  index 4f4ffa4..56f0233 100644
  --- a/drivers/vhost/scsi.c
  +++ b/drivers/vhost/scsi.c
  @@ -1528,9 +1528,8 @@ static int vhost_scsi_open(struct inode *inode, 
  struct file *f)
  if (!vqs)
  goto err_vqs;
  
  -   vhost_work_init(vs-vs_completion_work, 
  vhost_scsi_complete_cmd_work);
  -   vhost_work_init(vs-vs_event_work, tcm_vhost_evt_work);
  -
  +   vhost_work_init(vs-vs_completion_work, NULL, 
  vhost_scsi_complete_cmd_work);
  +   vhost_work_init(vs

Re: [PATCH] vhost: Add polling mode

2014-07-29 Thread Michael S. Tsirkin
On Tue, Jul 29, 2014 at 01:30:03PM +0300, Razya Ladelsky wrote:

[..] had to snip off the quoted text, it's mangled up to
 unreadability.
Please take a look at Documentation/email-clients.txt
to fix this.

  This is an optimization patch, isn't it?
  Could you please include some numbers showing its
  effect?
  
  
 
 Hi Michael,
 Sure. I included them in a reply to Jason Wang in this thread,
 Here it is:
 http://www.spinics.net/linux/lists/kvm/msg106049.html
 

Hmm there aren't a lot of numbers there :(. Speed increased by 33% but
by how much?  E.g. maybe you are getting from 1Mbyte/sec to 1.3,
if so it's hard to get excited about it. Some questions that come to
mind: what was the message size? I would expect several measurements
with different values.  How did host CPU utilization change?

What about latency? As we are competing with guest for host CPU,
would worst-case or average latency suffer?

Thanks,

-- 
MST


Re: [PATCH] vhost: Add polling mode

2014-07-29 Thread Razya Ladelsky
 
 Hmm there aren't a lot of numbers there :(. Speed increased by 33% but
 by how much?  E.g. maybe you are getting from 1Mbyte/sec to 1.3,
 if so it's hard to get excited about it. 

Netperf 1 VM: 1516 MB/sec - 2046 MB/sec
and for 3 VMs: 4086 MB/sec - 5545 MB/sec

 Some questions that come to
 mind: what was the message size? I would expect several measurements
 with different values.  How did host CPU utilization change?
 

message size was 64B in order to get the VM to be cpu saturated. 
so the vm had 99% cpu and vhost 38%; with the polling patch both had 99%.



 What about latency? As we are competing with guest for host CPU,
 would worst-case or average latency suffer?
 

Polling indeed doesn't make a lot of sense if there aren't enough 
available cores.
In these cases polling should not be used.

Thank you,
Razya



 Thanks,
 
 -- 
 MST


Re: [PATCH] vhost: Add polling mode

2014-07-29 Thread Michael S. Tsirkin
On Tue, Jul 29, 2014 at 03:23:59PM +0300, Razya Ladelsky wrote:
  
  Hmm there aren't a lot of numbers there :(. Speed increased by 33% but
  by how much?  E.g. maybe you are getting from 1Mbyte/sec to 1.3,
  if so it's hard to get excited about it. 
 
 Netperf 1 VM: 1516 MB/sec - 2046 MB/sec
 and for 3 VMs: 4086 MB/sec - 5545 MB/sec

What do you mean by 1 VM? Streaming TCP host to vm?
Also, your throughput is somewhat low, it's worth seeing
why you can't hit higher speeds.

  Some questions that come to
  mind: what was the message size? I would expect several measurements
  with different values.  How did host CPU utilization change?
  
 
 message size was 64B in order to get the VM to be cpu saturated. 
 so the vm had 99% cpu and vhost 38%; with the polling patch both had 99%.

Hmm so a net loss in throughput/CPU.

 
 
  What about latency? As we are competing with guest for host CPU,
  would worst-case or average latency suffer?
  
 
 Polling indeed doesn't make a lot of sense if there aren't enough 
 available cores.
 In these cases polling should not be used.
 
 Thank you,
 Razya

OK but scheduler might run vm and vhost on the same cpu
even if cores are available.
This needs to be detected somehow and polling disabled.
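
One illustrative way to detect that (a sketch only; it assumes the vhost
worker could learn the vcpu thread's task_struct, which the posted patch does
not track, so vq->vcpu_task below is a hypothetical field):

/* Sketch: skip polling when the vcpu task and the vhost worker (current)
 * were last scheduled on the same cpu.
 */
static bool vhost_poll_colocated(struct vhost_virtqueue *vq)
{
        return vq->vcpu_task &&
               task_cpu(vq->vcpu_task) == task_cpu(current);
}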


 
 
  Thanks,
  
  -- 
  MST


Re: [PATCH] vhost: Add polling mode

2014-07-28 Thread Zhang Haoyu
Maybe tying a knot between "vhost-net scalability tuning: threading for many VMs" 
and "vhost: Add polling mode" would be a good marriage, 
because there is a better chance of finding work to do with less polling time, so 
fewer cpu cycles are wasted.

Thanks,
Zhang Haoyu

   Hello All,
  
   When vhost is waiting for buffers from the guest driver (e.g., more
   packets
   to send in vhost-net's transmit queue), it normally goes to sleep 
   and
   waits
   for the guest to kick it. This kick involves a PIO in the guest, 
   and
   therefore an exit (and possibly userspace involvement in 
   translating
   this
   PIO
   exit into a file descriptor event), all of which hurts performance.
  
   If the system is under-utilized (has cpu time to spare), vhost can
   continuously poll the virtqueues for new buffers, and avoid asking
   the guest to kick us.
   This patch adds an optional polling mode to vhost, that can be 
   enabled
   via a kernel module parameter, poll_start_rate.
  
   When polling is active for a virtqueue, the guest is asked to
   disable notification (kicks), and the worker thread continuously
   checks
   for
   new buffers. When it does discover new buffers, it simulates a 
   kick
   by
   invoking the underlying backend driver (such as vhost-net), which
   thinks
   it
   got a real kick from the guest, and acts accordingly. If the
   underlying
   driver asks not to be kicked, we disable polling on this virtqueue.
  
   We start polling on a virtqueue when we notice it has
   work to do. Polling on this virtqueue is later disabled after 3
   seconds of
   polling turning up no new work, as in this case we are better off
   returning
   to the exit-based notification mechanism. The default timeout of 3
   seconds
   can be changed with the poll_stop_idle kernel module parameter.
  
   This polling approach makes lot of sense for new HW with
   posted-interrupts
   for which we have exitless host-to-guest notifications. But even 
   with
   support
   for posted interrupts, guest-to-host communication still causes 
   exits.
   Polling adds the missing part.
  
   When systems are overloaded, there won't be enough cpu time for the
   various
   vhost threads to poll their guests' devices. For these scenarios, 
   we
   plan
   to add support for vhost threads that can be shared by multiple
   devices,
   even of multiple vms.
   Our ultimate goal is to implement the I/O acceleration features
   described
   in:
   KVM Forum 2013: Efficient and Scalable Virtio (by Abel Gordon)
   https://www.youtube.com/watch?v=9EyweibHfEs
   and
   https://www.mail-archive.com/kvm@vger.kernel.org/msg98179.html
  
  
   Comments are welcome,
   Thank you,
   Razya
   Thanks for the work. Do you have perf numbers for this?
  
   Hi Jason,
   Thanks for reviewing. I ran some experiments with TCP stream netperf 
   and
   filebench (having 2 threads performing random reads) benchmarks on an 
   IBM
   System x3650 M4.
   All runs loaded the guests in a way that they were (cpu) saturated.
   The system had two cores per guest, as to allow for both the vcpu and 
   the
   vhost thread to
   run concurrently for maximum throughput (but I didn't pin the threads 
   to
   specific cores)
   I get:
  
   Netperf, 1 vm:
   The polling patch improved throughput by ~33%. Number of exits/sec
   decreased 6x.
   The same improvement was shown when I tested with 3 vms running 
   netperf.
  
   filebench, 1 vm:
   ops/sec improved by 13% with the polling patch. Number of exits was
   reduced by 31%.
   The same experiment with 3 vms running filebench showed similar 
   numbers.
 
  Looks good, may worth to add the result in the commit log.
  
   And looks like the patch only poll for virtqueue. In the future, may
   worth to add callbacks for vhost_net to poll socket. Then it could be
   used with rx busy polling in host which may speedup the rx also.
   Did you mean polling the network device to avoid interrupts?
 
  Yes, recent linux host support rx busy polling which can reduce the
  interrupts. If vhost can utilize this, it can also reduce the latency
  caused by vhost thread wakeups.
 
  And I'm also working on virtio-net busy polling in guest, if vhost can
  poll socket, it can also help in guest rx polling.
 Nice :)  Note that you may want to check if if the processor support
 posted interrupts. I guess that if CPU supports posted interrupts then
 benefits of polling in the front-end (from performance perspective)
 may not worth the cpu cycles wasted in the guest.


Yes it's worth to check. But I think busy polling in guest may still
help since it may reduce the overhead of irq and NAPI in guest, also can
reduce the latency by eliminating wakeups of both vcpu thread in host
and userspace process in guest.



Re: [PATCH] vhost: Add polling mode

2014-07-23 Thread Razya Ladelsky
Jason Wang jasow...@redhat.com wrote on 23/07/2014 08:26:36 AM:

 From: Jason Wang jasow...@redhat.com
 To: Razya Ladelsky/Haifa/IBM@IBMIL, kvm@vger.kernel.org, Michael S.
 Tsirkin m...@redhat.com, 
 Cc: abel.gor...@gmail.com, Joel Nider/Haifa/IBM@IBMIL, Yossi 
 Kuperman1/Haifa/IBM@IBMIL, Eran Raichstein/Haifa/IBM@IBMIL, Alex 
 Glikson/Haifa/IBM@IBMIL
 Date: 23/07/2014 08:26 AM
 Subject: Re: [PATCH] vhost: Add polling mode
 
 On 07/21/2014 09:23 PM, Razya Ladelsky wrote:
  Hello All,
 
  When vhost is waiting for buffers from the guest driver (e.g., more 
  packets
  to send in vhost-net's transmit queue), it normally goes to sleep and 
  waits
  for the guest to kick it. This kick involves a PIO in the guest, and
  therefore an exit (and possibly userspace involvement in translating 
this 
  PIO
  exit into a file descriptor event), all of which hurts performance.
 
  If the system is under-utilized (has cpu time to spare), vhost can 
  continuously poll the virtqueues for new buffers, and avoid asking 
  the guest to kick us.
  This patch adds an optional polling mode to vhost, that can be enabled 

  via a kernel module parameter, poll_start_rate.
 
  When polling is active for a virtqueue, the guest is asked to
  disable notification (kicks), and the worker thread continuously 
checks 
  for
  new buffers. When it does discover new buffers, it simulates a kick 
by
  invoking the underlying backend driver (such as vhost-net), which 
thinks 
  it
  got a real kick from the guest, and acts accordingly. If the 
underlying
  driver asks not to be kicked, we disable polling on this virtqueue.
 
  We start polling on a virtqueue when we notice it has
  work to do. Polling on this virtqueue is later disabled after 3 
seconds of
  polling turning up no new work, as in this case we are better off 
  returning
  to the exit-based notification mechanism. The default timeout of 3 
seconds
  can be changed with the poll_stop_idle kernel module parameter.
 
  This polling approach makes lot of sense for new HW with 
posted-interrupts
  for which we have exitless host-to-guest notifications. But even with 
  support 
  for posted interrupts, guest-to-host communication still causes exits. 

  Polling adds the missing part.
 
  When systems are overloaded, there won't be enough cpu time for the 
  various 
  vhost threads to poll their guests' devices. For these scenarios, we 
plan 
  to add support for vhost threads that can be shared by multiple 
devices, 
  even of multiple vms. 
  Our ultimate goal is to implement the I/O acceleration features 
described 
  in:
  KVM Forum 2013: Efficient and Scalable Virtio (by Abel Gordon) 
  https://www.youtube.com/watch?v=9EyweibHfEs
  and
  https://www.mail-archive.com/kvm@vger.kernel.org/msg98179.html
 
  
  Comments are welcome, 
  Thank you,
  Razya
 
 Thanks for the work. Do you have perf numbers for this?
 

Hi Jason,
Thanks for reviewing. I ran some experiments with TCP stream netperf and 
filebench (having 2 threads performing random reads) benchmarks on an IBM 
System x3650 M4.
All runs loaded the guests in a way that kept them (cpu) saturated.
The system had two cores per guest, so as to allow for both the vcpu and the 
vhost thread to run concurrently for maximum throughput (but I didn't pin the 
threads to specific cores).
I get:

Netperf, 1 vm:
The polling patch improved throughput by ~33%. Number of exits/sec 
decreased 6x.
The same improvement was shown when I tested with 3 vms running netperf.

filebench, 1 vm:
ops/sec improved by 13% with the polling patch. Number of exits was 
reduced by 31%.
The same experiment with 3 vms running filebench showed similar numbers.


 And looks like the patch only poll for virtqueue. In the future, may
 worth to add callbacks for vhost_net to poll socket. Then it could be
 used with rx busy polling in host which may speedup the rx also.

Did you mean polling the network device to avoid interrupts?

  
  diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
  index c90f437..678d766 100644
  --- a/drivers/vhost/vhost.c
  +++ b/drivers/vhost/vhost.c
  @@ -24,9 +24,17 @@
   #include linux/slab.h
   #include linux/kthread.h
   #include linux/cgroup.h
  +#include linux/jiffies.h
   #include linux/module.h
  
   #include vhost.h
  +static int poll_start_rate = 0;
  +module_param(poll_start_rate, int, S_IRUGO|S_IWUSR);
  +MODULE_PARM_DESC(poll_start_rate, Start continuous polling of 
virtqueue 
  when rate of events is at least this number per jiffy. If 0, never 
start 
  polling.);
  +
  +static int poll_stop_idle = 3*HZ; /* 3 seconds */
  +module_param(poll_stop_idle, int, S_IRUGO|S_IWUSR);
  +MODULE_PARM_DESC(poll_stop_idle, Stop continuous polling of 
virtqueue 
  after this many jiffies of no work.);
  
 
  I'm not sure using jiffies is good enough, since the user needs to know the
  HZ value. It may be worth looking at sk_busy_loop(), which uses sched_clock()
  and microseconds. 

Ok, Will look into it, thanks.
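
For illustration, a sched_clock()-based idle cutoff along the lines Jason
suggests might look roughly like this (a sketch only, not part of the patch;
poll_stop_idle_ns and vq->poll_last_work_ns are hypothetical names):

/* Sketch: stop polling after poll_stop_idle_ns nanoseconds without work,
 * using sched_clock() so the cutoff does not depend on HZ.
 */
static bool vhost_poll_idle_expired(struct vhost_virtqueue *vq,
                                    unsigned long long poll_stop_idle_ns)
{
        return sched_clock() - vq->poll_last_work_ns > poll_stop_idle_ns;
}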

  
  +/* Enable or disable virtqueue polling

Re: [PATCH] vhost: Add polling mode

2014-07-23 Thread Jason Wang
On 07/23/2014 04:12 PM, Razya Ladelsky wrote:
 Jason Wang jasow...@redhat.com wrote on 23/07/2014 08:26:36 AM:

 From: Jason Wang jasow...@redhat.com
 To: Razya Ladelsky/Haifa/IBM@IBMIL, kvm@vger.kernel.org, Michael S.
 Tsirkin m...@redhat.com, 
 Cc: abel.gor...@gmail.com, Joel Nider/Haifa/IBM@IBMIL, Yossi 
 Kuperman1/Haifa/IBM@IBMIL, Eran Raichstein/Haifa/IBM@IBMIL, Alex 
 Glikson/Haifa/IBM@IBMIL
 Date: 23/07/2014 08:26 AM
 Subject: Re: [PATCH] vhost: Add polling mode

 On 07/21/2014 09:23 PM, Razya Ladelsky wrote:
 Hello All,

 When vhost is waiting for buffers from the guest driver (e.g., more 
 packets
 to send in vhost-net's transmit queue), it normally goes to sleep and 
 waits
 for the guest to kick it. This kick involves a PIO in the guest, and
 therefore an exit (and possibly userspace involvement in translating 
 this 
 PIO
 exit into a file descriptor event), all of which hurts performance.

 If the system is under-utilized (has cpu time to spare), vhost can 
 continuously poll the virtqueues for new buffers, and avoid asking 
 the guest to kick us.
 This patch adds an optional polling mode to vhost, that can be enabled 
 via a kernel module parameter, poll_start_rate.

 When polling is active for a virtqueue, the guest is asked to
 disable notification (kicks), and the worker thread continuously 
 checks 
 for
 new buffers. When it does discover new buffers, it simulates a kick 
 by
 invoking the underlying backend driver (such as vhost-net), which 
 thinks 
 it
 got a real kick from the guest, and acts accordingly. If the 
 underlying
 driver asks not to be kicked, we disable polling on this virtqueue.

 We start polling on a virtqueue when we notice it has
 work to do. Polling on this virtqueue is later disabled after 3 
 seconds of
 polling turning up no new work, as in this case we are better off 
 returning
 to the exit-based notification mechanism. The default timeout of 3 
 seconds
 can be changed with the poll_stop_idle kernel module parameter.

 This polling approach makes lot of sense for new HW with 
 posted-interrupts
 for which we have exitless host-to-guest notifications. But even with 
 support 
 for posted interrupts, guest-to-host communication still causes exits. 
 Polling adds the missing part.

 When systems are overloaded, there won't be enough cpu time for the 
 various 
 vhost threads to poll their guests' devices. For these scenarios, we 
 plan 
 to add support for vhost threads that can be shared by multiple 
 devices, 
 even of multiple vms. 
 Our ultimate goal is to implement the I/O acceleration features 
 described 
 in:
 KVM Forum 2013: Efficient and Scalable Virtio (by Abel Gordon) 
 https://www.youtube.com/watch?v=9EyweibHfEs
 and
 https://www.mail-archive.com/kvm@vger.kernel.org/msg98179.html


 Comments are welcome, 
 Thank you,
 Razya
 Thanks for the work. Do you have perf numbers for this?

 Hi Jason,
 Thanks for reviewing. I ran some experiments with TCP stream netperf and 
 filebench (having 2 threads performing random reads) benchmarks on an IBM 
 System x3650 M4.
 All runs loaded the guests in a way that they were (cpu) saturated.
 The system had two cores per guest, as to allow for both the vcpu and the 
 vhost thread to
 run concurrently for maximum throughput (but I didn't pin the threads to 
 specific cores)
 I get:

 Netperf, 1 vm:
 The polling patch improved throughput by ~33%. Number of exits/sec 
 decreased 6x.
 The same improvement was shown when I tested with 3 vms running netperf.

 filebench, 1 vm:
 ops/sec improved by 13% with the polling patch. Number of exits was 
 reduced by 31%.
 The same experiment with 3 vms running filebench showed similar numbers.

Looks good; it may be worth adding the result to the commit log.

 And looks like the patch only poll for virtqueue. In the future, may
 worth to add callbacks for vhost_net to poll socket. Then it could be
 used with rx busy polling in host which may speedup the rx also.
 Did you mean polling the network device to avoid interrupts?

Yes, recent Linux hosts support rx busy polling, which can reduce the
interrupts. If vhost can utilize this, it can also reduce the latency
caused by vhost thread wakeups.

And I'm also working on virtio-net busy polling in the guest; if vhost can
poll the socket, it can also help with guest rx polling.
 diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
 index c90f437..678d766 100644
 --- a/drivers/vhost/vhost.c
 +++ b/drivers/vhost/vhost.c
 @@ -24,9 +24,17 @@
  #include linux/slab.h
  #include linux/kthread.h
  #include linux/cgroup.h
 +#include linux/jiffies.h
  #include linux/module.h

  #include vhost.h
 +static int poll_start_rate = 0;
 +module_param(poll_start_rate, int, S_IRUGO|S_IWUSR);
 +MODULE_PARM_DESC(poll_start_rate, Start continuous polling of 
 virtqueue 
 when rate of events is at least this number per jiffy. If 0, never 
 start 
 polling.);
 +
 +static int poll_stop_idle = 3*HZ; /* 3 seconds */
 +module_param(poll_stop_idle, int, S_IRUGO|S_IWUSR

Re: [PATCH] vhost: Add polling mode

2014-07-23 Thread Abel Gordon
On Wed, Jul 23, 2014 at 11:42 AM, Jason Wang jasow...@redhat.com wrote:

 On 07/23/2014 04:12 PM, Razya Ladelsky wrote:
  Jason Wang jasow...@redhat.com wrote on 23/07/2014 08:26:36 AM:
 
  From: Jason Wang jasow...@redhat.com
  To: Razya Ladelsky/Haifa/IBM@IBMIL, kvm@vger.kernel.org, Michael S.
  Tsirkin m...@redhat.com,
  Cc: abel.gor...@gmail.com, Joel Nider/Haifa/IBM@IBMIL, Yossi
  Kuperman1/Haifa/IBM@IBMIL, Eran Raichstein/Haifa/IBM@IBMIL, Alex
  Glikson/Haifa/IBM@IBMIL
  Date: 23/07/2014 08:26 AM
  Subject: Re: [PATCH] vhost: Add polling mode
 
  On 07/21/2014 09:23 PM, Razya Ladelsky wrote:
  Hello All,
 
  When vhost is waiting for buffers from the guest driver (e.g., more
  packets
  to send in vhost-net's transmit queue), it normally goes to sleep and
  waits
  for the guest to kick it. This kick involves a PIO in the guest, and
  therefore an exit (and possibly userspace involvement in translating
  this
  PIO
  exit into a file descriptor event), all of which hurts performance.
 
  If the system is under-utilized (has cpu time to spare), vhost can
  continuously poll the virtqueues for new buffers, and avoid asking
  the guest to kick us.
  This patch adds an optional polling mode to vhost, that can be enabled
  via a kernel module parameter, poll_start_rate.
 
  When polling is active for a virtqueue, the guest is asked to
  disable notification (kicks), and the worker thread continuously
  checks
  for
  new buffers. When it does discover new buffers, it simulates a kick
  by
  invoking the underlying backend driver (such as vhost-net), which
  thinks
  it
  got a real kick from the guest, and acts accordingly. If the
  underlying
  driver asks not to be kicked, we disable polling on this virtqueue.
 
  We start polling on a virtqueue when we notice it has
  work to do. Polling on this virtqueue is later disabled after 3
  seconds of
  polling turning up no new work, as in this case we are better off
  returning
  to the exit-based notification mechanism. The default timeout of 3
  seconds
  can be changed with the poll_stop_idle kernel module parameter.
 
  This polling approach makes lot of sense for new HW with
  posted-interrupts
  for which we have exitless host-to-guest notifications. But even with
  support
  for posted interrupts, guest-to-host communication still causes exits.
  Polling adds the missing part.
 
  When systems are overloaded, there won't be enough cpu time for the
  various
  vhost threads to poll their guests' devices. For these scenarios, we
  plan
  to add support for vhost threads that can be shared by multiple
  devices,
  even of multiple vms.
  Our ultimate goal is to implement the I/O acceleration features
  described
  in:
  KVM Forum 2013: Efficient and Scalable Virtio (by Abel Gordon)
  https://www.youtube.com/watch?v=9EyweibHfEs
  and
  https://www.mail-archive.com/kvm@vger.kernel.org/msg98179.html
 
 
  Comments are welcome,
  Thank you,
  Razya
  Thanks for the work. Do you have perf numbers for this?
 
  Hi Jason,
  Thanks for reviewing. I ran some experiments with TCP stream netperf and
  filebench (having 2 threads performing random reads) benchmarks on an IBM
  System x3650 M4.
  All runs loaded the guests in a way that they were (cpu) saturated.
  The system had two cores per guest, as to allow for both the vcpu and the
  vhost thread to
  run concurrently for maximum throughput (but I didn't pin the threads to
  specific cores)
  I get:
 
  Netperf, 1 vm:
  The polling patch improved throughput by ~33%. Number of exits/sec
  decreased 6x.
  The same improvement was shown when I tested with 3 vms running netperf.
 
  filebench, 1 vm:
  ops/sec improved by 13% with the polling patch. Number of exits was
  reduced by 31%.
  The same experiment with 3 vms running filebench showed similar numbers.

 Looks good, may worth to add the result in the commit log.
 
  And looks like the patch only poll for virtqueue. In the future, may
  worth to add callbacks for vhost_net to poll socket. Then it could be
  used with rx busy polling in host which may speedup the rx also.
  Did you mean polling the network device to avoid interrupts?

 Yes, recent linux host support rx busy polling which can reduce the
 interrupts. If vhost can utilize this, it can also reduce the latency
 caused by vhost thread wakeups.

 And I'm also working on virtio-net busy polling in guest, if vhost can
 poll socket, it can also help in guest rx polling.

Nice :)  Note that you may want to check if the processor supports
posted interrupts. I guess that if the CPU supports posted interrupts, then
the benefits of polling in the front-end (from a performance perspective)
may not be worth the cpu cycles wasted in the guest.


  diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
  index c90f437..678d766 100644
  --- a/drivers/vhost/vhost.c
  +++ b/drivers/vhost/vhost.c
  @@ -24,9 +24,17 @@
   #include linux/slab.h
   #include linux/kthread.h
   #include linux/cgroup.h
  +#include linux/jiffies.h

Re: [PATCH] vhost: Add polling mode

2014-07-23 Thread Jason Wang
On 07/23/2014 04:48 PM, Abel Gordon wrote:
 On Wed, Jul 23, 2014 at 11:42 AM, Jason Wang jasow...@redhat.com wrote:
 
  On 07/23/2014 04:12 PM, Razya Ladelsky wrote:
   Jason Wang jasow...@redhat.com wrote on 23/07/2014 08:26:36 AM:
  
   From: Jason Wang jasow...@redhat.com
   To: Razya Ladelsky/Haifa/IBM@IBMIL, kvm@vger.kernel.org, Michael S.
   Tsirkin m...@redhat.com,
   Cc: abel.gor...@gmail.com, Joel Nider/Haifa/IBM@IBMIL, Yossi
   Kuperman1/Haifa/IBM@IBMIL, Eran Raichstein/Haifa/IBM@IBMIL, Alex
   Glikson/Haifa/IBM@IBMIL
   Date: 23/07/2014 08:26 AM
   Subject: Re: [PATCH] vhost: Add polling mode
  
   On 07/21/2014 09:23 PM, Razya Ladelsky wrote:
   Hello All,
  
   When vhost is waiting for buffers from the guest driver (e.g., more
   packets
   to send in vhost-net's transmit queue), it normally goes to sleep 
   and
   waits
   for the guest to kick it. This kick involves a PIO in the guest, 
   and
   therefore an exit (and possibly userspace involvement in translating
   this
   PIO
   exit into a file descriptor event), all of which hurts performance.
  
   If the system is under-utilized (has cpu time to spare), vhost can
   continuously poll the virtqueues for new buffers, and avoid asking
   the guest to kick us.
   This patch adds an optional polling mode to vhost, that can be 
   enabled
   via a kernel module parameter, poll_start_rate.
  
   When polling is active for a virtqueue, the guest is asked to
   disable notification (kicks), and the worker thread continuously
   checks
   for
   new buffers. When it does discover new buffers, it simulates a 
   kick
   by
   invoking the underlying backend driver (such as vhost-net), which
   thinks
   it
   got a real kick from the guest, and acts accordingly. If the
   underlying
   driver asks not to be kicked, we disable polling on this virtqueue.
  
   We start polling on a virtqueue when we notice it has
   work to do. Polling on this virtqueue is later disabled after 3
   seconds of
   polling turning up no new work, as in this case we are better off
   returning
   to the exit-based notification mechanism. The default timeout of 3
   seconds
   can be changed with the poll_stop_idle kernel module parameter.
  
   This polling approach makes lot of sense for new HW with
   posted-interrupts
   for which we have exitless host-to-guest notifications. But even 
   with
   support
   for posted interrupts, guest-to-host communication still causes 
   exits.
   Polling adds the missing part.
  
   When systems are overloaded, there won't be enough cpu time for the
   various
   vhost threads to poll their guests' devices. For these scenarios, we
   plan
   to add support for vhost threads that can be shared by multiple
   devices,
   even of multiple vms.
   Our ultimate goal is to implement the I/O acceleration features
   described
   in:
   KVM Forum 2013: Efficient and Scalable Virtio (by Abel Gordon)
   https://www.youtube.com/watch?v=9EyweibHfEs
   and
   https://www.mail-archive.com/kvm@vger.kernel.org/msg98179.html
  
  
   Comments are welcome,
   Thank you,
   Razya
   Thanks for the work. Do you have perf numbers for this?
  
   Hi Jason,
   Thanks for reviewing. I ran some experiments with TCP stream netperf and
   filebench (having 2 threads performing random reads) benchmarks on an 
   IBM
   System x3650 M4.
   All runs loaded the guests in a way that they were (cpu) saturated.
   The system had two cores per guest, as to allow for both the vcpu and 
   the
   vhost thread to
   run concurrently for maximum throughput (but I didn't pin the threads to
   specific cores)
   I get:
  
   Netperf, 1 vm:
   The polling patch improved throughput by ~33%. Number of exits/sec
   decreased 6x.
   The same improvement was shown when I tested with 3 vms running netperf.
  
   filebench, 1 vm:
   ops/sec improved by 13% with the polling patch. Number of exits was
   reduced by 31%.
   The same experiment with 3 vms running filebench showed similar numbers.
 
  Looks good, may worth to add the result in the commit log.
  
   And looks like the patch only poll for virtqueue. In the future, may
   worth to add callbacks for vhost_net to poll socket. Then it could be
   used with rx busy polling in host which may speedup the rx also.
   Did you mean polling the network device to avoid interrupts?
 
  Yes, recent linux host support rx busy polling which can reduce the
  interrupts. If vhost can utilize this, it can also reduce the latency
  caused by vhost thread wakeups.
 
  And I'm also working on virtio-net busy polling in guest, if vhost can
  poll socket, it can also help in guest rx polling.
 Nice :)  Note that you may want to check if if the processor support
 posted interrupts. I guess that if CPU supports posted interrupts then
 benefits of polling in the front-end (from performance perspective)
 may not worth the cpu cycles wasted in the guest.


Yes it's worth to check. But I think busy polling in guest may still
help since it may

Re: [PATCH] vhost: Add polling mode

2014-07-22 Thread Jason Wang
On 07/21/2014 09:23 PM, Razya Ladelsky wrote:
 Hello All,

 When vhost is waiting for buffers from the guest driver (e.g., more 
 packets
 to send in vhost-net's transmit queue), it normally goes to sleep and 
 waits
 for the guest to kick it. This kick involves a PIO in the guest, and
 therefore an exit (and possibly userspace involvement in translating this 
 PIO
 exit into a file descriptor event), all of which hurts performance.

 If the system is under-utilized (has cpu time to spare), vhost can 
 continuously poll the virtqueues for new buffers, and avoid asking 
 the guest to kick us.
 This patch adds an optional polling mode to vhost, that can be enabled 
 via a kernel module parameter, poll_start_rate.

 When polling is active for a virtqueue, the guest is asked to
 disable notification (kicks), and the worker thread continuously checks 
 for
 new buffers. When it does discover new buffers, it simulates a kick by
 invoking the underlying backend driver (such as vhost-net), which thinks 
 it
 got a real kick from the guest, and acts accordingly. If the underlying
 driver asks not to be kicked, we disable polling on this virtqueue.

 We start polling on a virtqueue when we notice it has
 work to do. Polling on this virtqueue is later disabled after 3 seconds of
 polling turning up no new work, as in this case we are better off 
 returning
 to the exit-based notification mechanism. The default timeout of 3 seconds
 can be changed with the poll_stop_idle kernel module parameter.

 This polling approach makes lot of sense for new HW with posted-interrupts
 for which we have exitless host-to-guest notifications. But even with 
 support 
 for posted interrupts, guest-to-host communication still causes exits. 
 Polling adds the missing part.

 When systems are overloaded, there won't be enough cpu time for the 
 various 
 vhost threads to poll their guests' devices. For these scenarios, we plan 
 to add support for vhost threads that can be shared by multiple devices, 
 even of multiple vms. 
 Our ultimate goal is to implement the I/O acceleration features described 
 in:
 KVM Forum 2013: Efficient and Scalable Virtio (by Abel Gordon) 
 https://www.youtube.com/watch?v=9EyweibHfEs
 and
 https://www.mail-archive.com/kvm@vger.kernel.org/msg98179.html

  
 Comments are welcome, 
 Thank you,
 Razya

Thanks for the work. Do you have perf numbers for this?

And it looks like the patch only polls the virtqueue. In the future, it may
be worth adding callbacks for vhost_net to poll the socket. Then it could be
used with rx busy polling in the host, which may speed up rx as well.
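
As a rough illustration only (not part of the patch, and assuming the rx
virtqueue's backing tun/macvtap socket is made available to the poller), such
a socket-poll hook could check for pending data along these lines:

/* Sketch: does the socket backing the rx virtqueue have queued packets? */
static bool vhost_net_rx_work_pending(struct socket *sock)
{
        return sock && !skb_queue_empty(&sock->sk->sk_receive_queue);
}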

 From: Razya Ladelsky ra...@il.ibm.com

 Add an optional polling mode to continuously poll the virtqueues
 for new buffers, and avoid asking the guest to kick us.

 Signed-off-by: Razya Ladelsky ra...@il.ibm.com
 ---
  drivers/vhost/net.c   |6 +-
  drivers/vhost/scsi.c  |5 +-
  drivers/vhost/vhost.c |  247 
 +++--
  drivers/vhost/vhost.h |   37 +++-
  4 files changed, 277 insertions(+), 18 deletions(-)

 diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
 index 971a760..558aecb 100644
 --- a/drivers/vhost/net.c
 +++ b/drivers/vhost/net.c
 @@ -742,8 +742,10 @@ static int vhost_net_open(struct inode *inode, struct 
 file *f)
 }
 vhost_dev_init(dev, vqs, VHOST_NET_VQ_MAX);
  
 -   vhost_poll_init(n-poll + VHOST_NET_VQ_TX, handle_tx_net, POLLOUT, 
 dev);
 -   vhost_poll_init(n-poll + VHOST_NET_VQ_RX, handle_rx_net, POLLIN, 
 dev);
 +   vhost_poll_init(n-poll + VHOST_NET_VQ_TX, handle_tx_net, POLLOUT,
 +   vqs[VHOST_NET_VQ_TX]);
 +   vhost_poll_init(n-poll + VHOST_NET_VQ_RX, handle_rx_net, POLLIN,
 +   vqs[VHOST_NET_VQ_RX]);
  
 f-private_data = n;
  
 diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
 index 4f4ffa4..56f0233 100644
 --- a/drivers/vhost/scsi.c
 +++ b/drivers/vhost/scsi.c
 @@ -1528,9 +1528,8 @@ static int vhost_scsi_open(struct inode *inode, 
 struct file *f)
 if (!vqs)
 goto err_vqs;
  
 -   vhost_work_init(vs-vs_completion_work, 
 vhost_scsi_complete_cmd_work);
 -   vhost_work_init(vs-vs_event_work, tcm_vhost_evt_work);
 -
 +   vhost_work_init(vs-vs_completion_work, NULL, 
 vhost_scsi_complete_cmd_work);
 +   vhost_work_init(vs-vs_event_work, NULL, tcm_vhost_evt_work);
 vs-vs_events_nr = 0;
 vs-vs_events_missed = false;
  
 diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
 index c90f437..678d766 100644
 --- a/drivers/vhost/vhost.c
 +++ b/drivers/vhost/vhost.c
 @@ -24,9 +24,17 @@
  #include linux/slab.h
  #include linux/kthread.h
  #include linux/cgroup.h
 +#include linux/jiffies.h
  #include linux/module.h
  
  #include vhost.h
 +static int poll_start_rate = 0;
 +module_param(poll_start_rate, int, S_IRUGO|S_IWUSR);
 +MODULE_PARM_DESC(poll_start_rate, Start continuous polling of virtqueue 
 when rate of events is at least this number per 

[PATCH] vhost: Add polling mode

2014-07-21 Thread Razya Ladelsky
Hello All,

When vhost is waiting for buffers from the guest driver (e.g., more 
packets
to send in vhost-net's transmit queue), it normally goes to sleep and 
waits
for the guest to kick it. This kick involves a PIO in the guest, and
therefore an exit (and possibly userspace involvement in translating this 
PIO
exit into a file descriptor event), all of which hurts performance.

If the system is under-utilized (has cpu time to spare), vhost can 
continuously poll the virtqueues for new buffers, and avoid asking 
the guest to kick us.
This patch adds an optional polling mode to vhost, that can be enabled 
via a kernel module parameter, poll_start_rate.

When polling is active for a virtqueue, the guest is asked to
disable notification (kicks), and the worker thread continuously checks 
for
new buffers. When it does discover new buffers, it simulates a kick by
invoking the underlying backend driver (such as vhost-net), which thinks 
it
got a real kick from the guest, and acts accordingly. If the underlying
driver asks not to be kicked, we disable polling on this virtqueue.

We start polling on a virtqueue when we notice it has
work to do. Polling on this virtqueue is later disabled after 3 seconds of
polling turning up no new work, as in this case we are better off 
returning
to the exit-based notification mechanism. The default timeout of 3 seconds
can be changed with the poll_stop_idle kernel module parameter.
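
To make the start/stop conditions concrete, one polling iteration can be
thought of roughly as follows (an illustrative sketch only, not the patch
code; vhost_vq_work_pending(), vhost_disable_polling() and vq->poll_last_work
are made-up names, and the call through vq->handle_kick merely stands in for
invoking the backend handler):

/* Sketch of one per-virtqueue polling iteration as described above. */
static void vhost_poll_iteration(struct vhost_virtqueue *vq)
{
        if (vhost_vq_work_pending(vq)) {        /* hypothetical: new buffers? */
                vq->poll_last_work = jiffies;   /* remember when we last saw work */
                vq->handle_kick(&vq->poll.work);        /* simulate the guest's kick */
        } else if (time_after(jiffies, vq->poll_last_work + poll_stop_idle)) {
                /* no new work for poll_stop_idle jiffies: fall back to kicks */
                vhost_disable_polling(vq);      /* hypothetical */
        }
}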

This polling approach makes a lot of sense for new HW with posted-interrupts
for which we have exitless host-to-guest notifications. But even with 
support 
for posted interrupts, guest-to-host communication still causes exits. 
Polling adds the missing part.

When systems are overloaded, there won't be enough cpu time for the 
various 
vhost threads to poll their guests' devices. For these scenarios, we plan 
to add support for vhost threads that can be shared by multiple devices, 
even of multiple vms. 
Our ultimate goal is to implement the I/O acceleration features described 
in:
KVM Forum 2013: Efficient and Scalable Virtio (by Abel Gordon) 
https://www.youtube.com/watch?v=9EyweibHfEs
and
https://www.mail-archive.com/kvm@vger.kernel.org/msg98179.html

 
Comments are welcome, 
Thank you,
Razya

From: Razya Ladelsky ra...@il.ibm.com

Add an optional polling mode to continuously poll the virtqueues
for new buffers, and avoid asking the guest to kick us.

Signed-off-by: Razya Ladelsky ra...@il.ibm.com
---
 drivers/vhost/net.c   |6 +-
 drivers/vhost/scsi.c  |5 +-
 drivers/vhost/vhost.c |  247 
+++--
 drivers/vhost/vhost.h |   37 +++-
 4 files changed, 277 insertions(+), 18 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 971a760..558aecb 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -742,8 +742,10 @@ static int vhost_net_open(struct inode *inode, struct file *f)
 	}
 	vhost_dev_init(dev, vqs, VHOST_NET_VQ_MAX);
 
-	vhost_poll_init(n->poll + VHOST_NET_VQ_TX, handle_tx_net, POLLOUT, dev);
-	vhost_poll_init(n->poll + VHOST_NET_VQ_RX, handle_rx_net, POLLIN, dev);
+	vhost_poll_init(n->poll + VHOST_NET_VQ_TX, handle_tx_net, POLLOUT,
+			vqs[VHOST_NET_VQ_TX]);
+	vhost_poll_init(n->poll + VHOST_NET_VQ_RX, handle_rx_net, POLLIN,
+			vqs[VHOST_NET_VQ_RX]);
 
 	f->private_data = n;
 
diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index 4f4ffa4..56f0233 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -1528,9 +1528,8 @@ static int vhost_scsi_open(struct inode *inode, struct file *f)
 	if (!vqs)
 		goto err_vqs;
 
-	vhost_work_init(&vs->vs_completion_work, vhost_scsi_complete_cmd_work);
-	vhost_work_init(&vs->vs_event_work, tcm_vhost_evt_work);
-
+	vhost_work_init(&vs->vs_completion_work, NULL, vhost_scsi_complete_cmd_work);
+	vhost_work_init(&vs->vs_event_work, NULL, tcm_vhost_evt_work);
 	vs->vs_events_nr = 0;
 	vs->vs_events_missed = false;
 
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index c90f437..678d766 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -24,9 +24,17 @@
 #include <linux/slab.h>
 #include <linux/kthread.h>
 #include <linux/cgroup.h>
+#include <linux/jiffies.h>
 #include <linux/module.h>
 
 #include "vhost.h"
+static int poll_start_rate = 0;
+module_param(poll_start_rate, int, S_IRUGO|S_IWUSR);
+MODULE_PARM_DESC(poll_start_rate, "Start continuous polling of virtqueue when rate of events is at least this number per jiffy. If 0, never start polling.");
+
+static int poll_stop_idle = 3*HZ; /* 3 seconds */
+module_param(poll_stop_idle, int, S_IRUGO|S_IWUSR);
+MODULE_PARM_DESC(poll_stop_idle, "Stop continuous polling of virtqueue after this many jiffies of no work.");
 
 enum {
 	VHOST_MEMORY_MAX_NREGIONS = 64,
@@ -58,27 +66,27 @@ static int vhost_poll_wakeup(wait_queue_t *wait, unsigned mode, int sync,
 	return 0;
 }