[PATCH v5 01/27] kvm: stop including asm-generic/bitops/le.h directly

2011-01-22 Thread Akinobu Mita
asm-generic/bitops/le.h is only intended to be included directly from
asm-generic/bitops/ext2-non-atomic.h or asm-generic/bitops/minix-le.h,
which implement the generic ext2 and minix bit operations.

This stops including asm-generic/bitops/le.h directly and uses the ext2
non-atomic bit operations instead.

It seems odd to use ext2_set_bit() in kvm, but it will be replaced with
__set_bit_le() after little-endian bit operations are introduced
for all architectures.  This indirect step is necessary to maintain
bisectability for some architectures which have their own little-endian
bit operations.
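
For reference, here is a minimal sketch of what __set_bit_le() eventually
boils down to, assuming the usual swizzle-based definition from asm-generic
(illustration only, not part of this patch; kernel context is assumed so
BITS_PER_LONG and the endian macros come from the usual headers):

/* Sketch only: on little-endian hosts the LE bit operations collapse to
 * the plain non-atomic ones; on big-endian hosts the bit index has its
 * byte order within the word flipped so the resulting bitmap layout
 * matches what a little-endian machine would produce. */
#if defined(__BIG_ENDIAN)
# define LE_BIT_SWIZZLE	((BITS_PER_LONG - 1) & ~0x7)
#else
# define LE_BIT_SWIZZLE	0
#endif

static inline void sketch_set_bit_le(int nr, unsigned long *addr)
{
	nr ^= LE_BIT_SWIZZLE;
	addr[nr / BITS_PER_LONG] |= 1UL << (nr % BITS_PER_LONG);
}

ext2_set_bit() effectively performs the same little-endian set on the
dirty bitmap, which is why it can serve as the intermediate step here.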

Signed-off-by: Akinobu Mita akinobu.m...@gmail.com
Cc: Avi Kivity a...@redhat.com
Cc: Marcelo Tosatti mtosa...@redhat.com
Cc: kvm@vger.kernel.org
---

Change from v4:
 - split into two patches to fix a bisection hole

The whole series is available in the git branch at:
 git://git.kernel.org/pub/scm/linux/kernel/git/mita/linux-2.6.git le-bitops-v5

 virt/kvm/kvm_main.c |3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index f29abeb..3461001 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -52,7 +52,6 @@
 #include <asm/io.h>
 #include <asm/uaccess.h>
 #include <asm/pgtable.h>
-#include <asm-generic/bitops/le.h>
 
 #include "coalesced_mmio.h"
 #include "async_pf.h"
@@ -1421,7 +1420,7 @@ void mark_page_dirty_in_slot(struct kvm *kvm, struct kvm_memory_slot *memslot,
 	if (memslot && memslot->dirty_bitmap) {
 		unsigned long rel_gfn = gfn - memslot->base_gfn;
 
-		generic___set_le_bit(rel_gfn, memslot->dirty_bitmap);
+		ext2_set_bit(rel_gfn, memslot->dirty_bitmap);
 	}
 }
 
-- 
1.7.3.4



[PATCH v5 14/27] kvm: use little-endian bitops

2011-01-22 Thread Akinobu Mita
In preparation for removing the ext2 non-atomic bit operations from
asm/bitops.h, this converts them to little-endian bit operations.
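
One reason the explicit little-endian layout matters for this particular
bitmap is that it is not purely kernel-internal: userspace fetches it with
KVM_GET_DIRTY_LOG, so its bit layout should presumably not depend on the
host's word size or byte order.  A small userspace sketch for context
(vm_fd, the slot number and the bitmap sizing are illustrative, not part
of this series):

#include <linux/kvm.h>
#include <sys/ioctl.h>

/* Illustrative helper: fetch the dirty bitmap of one memory slot.
 * The caller supplies a buffer with one bit per page in the slot. */
static int get_dirty_log(int vm_fd, __u32 slot, void *bitmap)
{
	struct kvm_dirty_log log = {
		.slot = slot,
		.dirty_bitmap = bitmap,
	};

	return ioctl(vm_fd, KVM_GET_DIRTY_LOG, &log);
}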

Signed-off-by: Akinobu Mita akinobu.m...@gmail.com
Cc: Avi Kivity a...@redhat.com
Cc: Marcelo Tosatti mtosa...@redhat.com
Cc: kvm@vger.kernel.org
---

Change from v4:
 - split into two patches to fix a bisection hole

The whole series is available in the git branch at:
 git://git.kernel.org/pub/scm/linux/kernel/git/mita/linux-2.6.git le-bitops-v5

 virt/kvm/kvm_main.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 3461001..508fdb1 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1420,7 +1420,7 @@ void mark_page_dirty_in_slot(struct kvm *kvm, struct kvm_memory_slot *memslot,
 	if (memslot && memslot->dirty_bitmap) {
 		unsigned long rel_gfn = gfn - memslot->base_gfn;
 
-		ext2_set_bit(rel_gfn, memslot->dirty_bitmap);
+		__set_bit_le(rel_gfn, memslot->dirty_bitmap);
 	}
 }
 
-- 
1.7.3.4



Re: [PATCH 2/3] kvm hypervisor : Add hypercalls to support pv-ticketlock

2011-01-22 Thread Rik van Riel

On 01/22/2011 01:14 AM, Srivatsa Vaddagiri wrote:


 Also it may be possible for the pv-ticketlocks to track the owning vcpu and
 make use of a yield-to interface as a further optimization to avoid the
 others-get-more-time problem, but Peterz rightly pointed out that PI would be
 a better solution there than yield-to. So overall IMO kvm_vcpu_on_spin+yield_to
 could be the best solution for unmodified guests, while paravirtualized
 ticketlocks + some sort of PI would be a better solution where we have the
 luxury of modifying guest sources!


Agreed, for unmodified guests (which is what people will mostly be
running for the next couple of years), we have little choice but
to use PLE + kvm_vcpu_on_spin + yield_to.

The main question that remains is whether the PV ticketlocks are
a large enough improvement to merge as well.  I expect they will be,
and the benchmark numbers should bear that out.
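
For readers following along, here is a rough sketch of the shape of the
pv-ticketlock slow path being discussed.  All names and the threshold are
hypothetical stand-ins rather than the proposed patches, and pv_wait()
merely yields instead of issuing the real halt/kick hypercalls:

#include <sched.h>

#define SPIN_THRESHOLD	(1 << 11)

struct ticketlock {
	volatile unsigned short head;	/* ticket currently being served */
	unsigned short tail;		/* next ticket to hand out */
};

/* Stand-in for a "halt this vcpu until its ticket is kicked" hypercall. */
static void pv_wait(struct ticketlock *lock, unsigned short ticket)
{
	(void)lock;
	(void)ticket;
	sched_yield();
}

static void ticket_lock(struct ticketlock *lock)
{
	unsigned short me = __sync_fetch_and_add(&lock->tail, 1);
	unsigned int spins;

	for (;;) {
		for (spins = 0; spins < SPIN_THRESHOLD; spins++)
			if (lock->head == me)
				return;
		/* The lock holder is probably preempted: stop burning
		 * cycles and let the hypervisor run something useful. */
		pv_wait(lock, me);
	}
}

static void ticket_unlock(struct ticketlock *lock)
{
	lock->head++;	/* real code would also kick the next waiter */
}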

--
All rights reversed


Re: Flow Control and Port Mirroring Revisited

2011-01-22 Thread Michael S. Tsirkin
On Sat, Jan 22, 2011 at 10:11:52AM +1100, Simon Horman wrote:
 On Fri, Jan 21, 2011 at 11:59:30AM +0200, Michael S. Tsirkin wrote:
  On Thu, Jan 20, 2011 at 05:38:33PM +0900, Simon Horman wrote:
   [ Trimmed Eric from CC list as vger was complaining that it is too long ]
   
   On Tue, Jan 18, 2011 at 11:41:22AM -0800, Rick Jones wrote:
So it won't be all that simple to implement well, and before we try,
I'd like to know whether there are applications that are helped
by it. For example, we could try to measure latency at various
pps and see whether the backpressure helps. netperf has -b, -w
flags which might help these measurements.

Those options are enabled when one adds --enable-burst to the
pre-compilation ./configure  of netperf (one doesn't have to
recompile netserver).  However, if one is also looking at latency
statistics via the -j option in the top-of-trunk, or simply at the
histogram with --enable-histogram on the ./configure and a verbosity
level of 2 (global -v 2) then one wants the very top of trunk
netperf from:
   
   Hi,
   
   I have constructed a test where I run an un-paced  UDP_STREAM test in
   one guest and a paced omni rr test in another guest at the same time.
  
  Hmm, what is this supposed to measure?  Basically each time you run an
  un-paced UDP_STREAM you get some random load on the network.
  You can't tell what it was exactly, only that it was between
  the send and receive throughput.
 
 Rick mentioned in another email that I messed up my test parameters a bit,
 so I will re-run the tests, incorporating his suggestions.
 
 What I was attempting to measure was the effect of an unpaced UDP_STREAM
 on the latency of more moderated traffic, because I am interested in
 what effect an abusive guest has on other guests and how that may be
 mitigated.
 
 Could you suggest some tests that you feel are more appropriate?

Yes. To rephrase my concern in these terms: besides the malicious guest,
you have other software in the host (netperf) that interferes with
the traffic, and it cooperates with the malicious guest.
Right?

IMO, for a malicious guest you would send
UDP packets that then get dropped by the host.

For example, block netperf in the host so that
it does not consume packets from the socket.





Re: [RFC PATCH 0/2] Expose available KVM free memory slot count to help avoid aborts

2011-01-22 Thread Michael S. Tsirkin
On Fri, Jan 21, 2011 at 04:48:02PM -0700, Alex Williamson wrote:
 When doing device assignment, we use cpu_register_physical_memory() to
 directly map the qemu mmap of the device resource into the address
 space of the guest.  The unadvertised feature of the register physical
 memory code path on kvm, at least for this type of mapping, is that it
 needs to allocate an index from a small, fixed array of memory slots.
 Even better, if it can't get an index, the code aborts deep in the
 kvm specific bits, preventing the caller from having a chance to
 recover.
 
 It's really easy to hit this by hot adding too many assigned devices
 to a guest (pretty easy to hit with too many devices at instantiation
 time too, but the abort is slightly more bearable there).
 
 I'm assuming it's pretty difficult to make the memory slot array
 dynamically sized.  If that's not the case, please let me know as
 that would be a much better solution.
 
 I'm not terribly happy with the solution in this series; it doesn't
 provide any guarantee that a cpu_register_physical_memory() will
 succeed, only slightly better educated guesses.
 
 Are there better ideas how we could solve this?  Thanks,
 
 Alex

Put the table in qemu memory, make kvm access it with copy from/to user?
It can then be any size ...

 ---
 
 Alex Williamson (2):
   device-assignment: Count required kvm memory slots
   kvm: Allow querying free slots
 
 
  hw/device-assignment.c |   59 +++-
  hw/device-assignment.h |3 ++
  kvm-all.c  |   16 +
  kvm.h  |2 ++
  4 files changed, 79 insertions(+), 1 deletions(-)
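
For what it's worth, the usage pattern the series seems to be aiming for
might look roughly like the sketch below; kvm_get_free_slots() is a
hypothetical stand-in for whatever "kvm: Allow querying free slots"
exposes, and the surrounding calls assume that era's QEMU internal API:

/* Sketch only: guard a directly mapped device BAR registration so hotplug
 * can fail cleanly instead of aborting inside the kvm slot allocator. */
static int map_assigned_bar(target_phys_addr_t gpa, ram_addr_t size,
                            ram_addr_t offset)
{
    if (kvm_enabled() && kvm_get_free_slots() < 1) {
        fprintf(stderr, "device assignment: no free kvm memory slots\n");
        return -1;
    }

    cpu_register_physical_memory(gpa, size, offset);
    return 0;
}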


Re: MIPS, io-thread, icount and wfi

2011-01-22 Thread Edgar E. Iglesias
On Wed, Jan 19, 2011 at 08:02:28PM +0100, Edgar E. Iglesias wrote:
 On Wed, Jan 19, 2011 at 03:02:26PM -0200, Marcelo Tosatti wrote:
  On Tue, Jan 18, 2011 at 11:00:57AM +0100, Jan Kiszka wrote:
   On 2011-01-18 01:19, Edgar E. Iglesias wrote:
On Mon, Jan 17, 2011 at 11:03:08AM +0100, Edgar E. Iglesias wrote:
Hi,
   
I'm running an io-thread enabled qemu-system-mipsel with icount.
When the guest (linux) goes to sleep through the wait insn (waiting
to be woken up by future timer interrupts), the thing deadlocks.
   
IIUC, this is because vm timers are driven by icount, but the CPU is
halted so icount makes no progress and time stands still.
   
I've locally disabled vcpu halting when icount is enabled; that
works around my problem but of course makes qemu consume 100% host cpu.

I don't know why I only see this problem with io-thread builds;
it could be related to timing and luck.
   
It would be interesting to know if someone has any info on how this was
intended to work (if it was), and whether there are ideas for better
workarounds or fixes that don't disable vcpu halting entirely.
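
The dependency described here comes from the virtual clock being derived
from the instruction counter when -icount is in use; simplified, it behaves
roughly like this (a sketch, not the actual qemu-timer.c code):

static int64_t sketch_vm_clock(void)
{
    if (use_icount) {
        /* Advances only while a vcpu executes instructions, so a halted
         * guest freezes vm_clock and its timers can never expire on
         * their own. */
        return cpu_get_icount();
    }
    return cpu_get_clock();    /* otherwise tracks host time */
}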

Hi,

I've found the problem. For some reason io-thread builds use a
static timeout for wait loops. The entire chunk of code that
makes sure qemu_icount makes forward progress when the CPUs
are idle has been ifdef'ed away...

This fixes the problem for me, hopefully without affecting
io-thread runs without icount.

commit 0f4f3a919952500b487b438c5520f07a1c6be35b
Author: Edgar E. Iglesias ed...@axis.com
Date:   Tue Jan 18 01:01:57 2011 +0100

qemu-timer: Fix timeout calc for io-thread with icount

Make sure we always make forward progress with qemu_icount to
avoid deadlocks. For io-thread, use the static 1000 timeout
only if icount is disabled.

Signed-off-by: Edgar E. Iglesias ed...@axis.com

diff --git a/qemu-timer.c b/qemu-timer.c
index 95814af..db1ec49 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -110,7 +110,6 @@ static int64_t cpu_get_clock(void)
     }
 }
 
-#ifndef CONFIG_IOTHREAD
 static int64_t qemu_icount_delta(void)
 {
     if (!use_icount) {
@@ -124,7 +123,6 @@ static int64_t qemu_icount_delta(void)
         return cpu_get_icount() - cpu_get_clock();
     }
 }
-#endif
 
 /* enable cpu_get_ticks() */
 void cpu_enable_ticks(void)
@@ -1077,9 +1075,17 @@ void quit_timers(void)
 
 int qemu_calculate_timeout(void)
 {
-#ifndef CONFIG_IOTHREAD
     int timeout;
 
+#ifdef CONFIG_IOTHREAD
+    /* When using icount, making forward progress with qemu_icount when the
+       guest CPU is idle is critical. We only use the static io-thread timeout
+       for non icount runs.  */
+    if (!use_icount) {
+        return 1000;
+    }
+#endif
+
     if (!vm_running)
         timeout = 5000;
     else {
@@ -1110,8 +1116,5 @@ int qemu_calculate_timeout(void)
     }
 
     return timeout;
-#else /* CONFIG_IOTHREAD */
-    return 1000;
-#endif
 }
 


   
   This logic and these timeout values were imported with the iothread merge.
   And I bet at least the timeout value of 1s (vs. 5s) can still be found in
   qemu-kvm. Maybe someone over there can remember the rationale behind
   choosing this value.
   
   Jan
  
  This timeout is for the main select() call, so there is not a lot
  of reasoning behind it; it's just how long to wait when there's no
  activity on the file descriptors.
 
 OK, I suspected something like that. Thanks to both of you for the info.
 I'll give people a couple of days to complain about the patch; if no one
 does, I'll apply it.

Silence - so I've applied this one, thanks.

Cheers


Re: Flow Control and Port Mirroring Revisited

2011-01-22 Thread Simon Horman
On Sat, Jan 22, 2011 at 11:57:42PM +0200, Michael S. Tsirkin wrote:
 On Sat, Jan 22, 2011 at 10:11:52AM +1100, Simon Horman wrote:
  On Fri, Jan 21, 2011 at 11:59:30AM +0200, Michael S. Tsirkin wrote:
   On Thu, Jan 20, 2011 at 05:38:33PM +0900, Simon Horman wrote:
[ Trimmed Eric from CC list as vger was complaining that it is too long ]

On Tue, Jan 18, 2011 at 11:41:22AM -0800, Rick Jones wrote:
 So it won't be all that simple to implement well, and before we try,
 I'd like to know whether there are applications that are helped
 by it. For example, we could try to measure latency at various
 pps and see whether the backpressure helps. netperf has -b, -w
 flags which might help these measurements.
 
 Those options are enabled when one adds --enable-burst to the
 pre-compilation ./configure  of netperf (one doesn't have to
 recompile netserver).  However, if one is also looking at latency
 statistics via the -j option in the top-of-trunk, or simply at the
 histogram with --enable-histogram on the ./configure and a verbosity
 level of 2 (global -v 2) then one wants the very top of trunk
 netperf from:

Hi,

I have constructed a test where I run an un-paced  UDP_STREAM test in
one guest and a paced omni rr test in another guest at the same time.
   
   Hmm, what is this supposed to measure?  Basically each time you run an
   un-paced UDP_STREAM you get some random load on the network.
   You can't tell what it was exactly, only that it was between
   the send and receive throughput.
  
  Rick mentioned in another email that I messed up my test parameters a bit,
  so I will re-run the tests, incorporating his suggestions.
  
  What I was attempting to measure was the effect of an unpaced UDP_STREAM
  on the latency of more moderated traffic, because I am interested in
  what effect an abusive guest has on other guests and how that may be
  mitigated.
  
  Could you suggest some tests that you feel are more appropriate?
 
 Yes. To rephrase my concern in these terms: besides the malicious guest,
 you have other software in the host (netperf) that interferes with
 the traffic, and it cooperates with the malicious guest.
 Right?

Yes, that is the scenario in this test.

 IMO, for a malicious guest you would send
 UDP packets that then get dropped by the host.
 
 For example, block netperf in the host so that
 it does not consume packets from the socket.

I'm more interested in rate-limiting netperf than blocking it.
But in any case, do you mean using iptables or tc based on
classification made by net_cls?
