Re: [RFC][PATCH 0/9] rework KVM mmu_shrink() code

2010-06-17 Thread Avi Kivity

On 06/16/2010 06:03 PM, Dave Hansen wrote:
> On Wed, 2010-06-16 at 11:38 +0300, Avi Kivity wrote:
>> On 06/15/2010 04:55 PM, Dave Hansen wrote:
>>> These seem to boot and run fine.  I'm running about 40 VMs at
>>> once, while doing echo 3 > /proc/sys/vm/drop_caches, and
>>> killing/restarting VMs constantly.
>>
>> Will drop_caches actually shrink the kvm caches too?  If so we probably
>> need to add that to autotest since it's a really good stress test for
>> the mmu.
>
> I'm completely sure.

Yes, easily seen from the code as well.

> I crashed my machines several times this way
> during testing.

Hopefully only with your patches applied?

I'll try to run autotest from time to time with drop_caches running in 
the background.  Looks like an excellent way to stress out the mmu.
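
The "easily seen from the code" path, abridged from the 2.6.3x-era
sources (exact callback signatures moved around between releases), is
roughly the following: writing 3 to /proc/sys/vm/drop_caches ends up in
drop_slab(), which keeps calling shrink_slab(), and shrink_slab() walks
every registered shrinker, one of which kvm registers for its shadow
pages at module init.

  /* arch/x86/kvm/mmu.c (excerpt): hook the shadow-page cache into the
   * generic slab-shrinking machinery. */
  register_shrinker(&mmu_shrinker);

  /* fs/drop_caches.c (abridged sketch): "echo 3" drops the page cache
   * and then pushes on the shrinkers until little is left to free. */
  static void drop_slab(void)
  {
          int nr_objects;

          do {
                  /* walks shrinker_list and calls each ->shrink()
                   * callback, mmu_shrink() among them */
                  nr_objects = shrink_slab(1000, GFP_KERNEL, 1000);
          } while (nr_objects > 10);
  }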



--
error compiling committee.c: too many arguments to function



Re: [RFC][PATCH 0/9] rework KVM mmu_shrink() code

2010-06-16 Thread Avi Kivity

On 06/15/2010 04:55 PM, Dave Hansen wrote:
> This is a big RFC for the moment.  These need a bunch more
> runtime testing.
>
> --
>
> We've seen contention in the mmu_shrink() function.


First of all, that's surprising.  I tried to configure the shrinker so 
it would stay away from kvm unless memory was really tight.  The reason 
is that kvm mmu pages can cost as much as 1-2 ms of cpu time to build, 
perhaps even more, so we shouldn't drop them lightly.
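
The knob being referred to is presumably the shrinker's .seeks value:
in the mmu code of that era the kvm shrinker is registered with a seeks
value well above the default, which makes shrink_slab() ask it to scan
proportionally fewer objects than other caches.  A sketch:

  /* arch/x86/kvm/mmu.c (excerpt, circa 2.6.34): the high .seeks value
   * biases shrink_slab() away from kvm, since the scan target handed
   * to a shrinker scales inversely with .seeks; this cache only gets
   * pressed hard once everything else has been. */
  static struct shrinker mmu_shrinker = {
          .shrink = mmu_shrink,
          .seeks  = 10 * DEFAULT_SEEKS,
  };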


It's certainly a neglected area that needs attention, though.


> This patch
> set reworks it to hopefully be more scalable to large numbers
> of CPUs, as well as large numbers of running VMs.
>
> The patches are ordered with increasing invasiveness.
>
> These seem to boot and run fine.  I'm running about 40 VMs at
> once, while doing echo 3 > /proc/sys/vm/drop_caches, and
> killing/restarting VMs constantly.


Will drop_caches actually shrink the kvm caches too?  If so we probably 
need to add that to autotest since it's a really good stress test for 
the mmu.



> Seems to be relatively stable, and seems to keep the numbers
> of kvm_mmu_page_header objects down.


That's not necessarily a good thing; those things are expensive to
recreate.  Of course, when we do need to reclaim them, that should be
efficient.


We also do a very bad job of selecting which page to reclaim.  We need 
to start using the accessed bit on sptes that point to shadow page 
tables, and then look those up and reclaim unreferenced pages sooner.  
With shadow paging there can be tons of unsync pages that are basically 
unused and can be reclaimed at no cost to future runtime.
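
What that could look like, as a purely illustrative sketch (the
sp_referenced()/sp_clear_referenced()/zap_page() helpers below are
made-up placeholders, not existing kvm functions): periodically test
and clear the accessed bits in the sptes that point at each shadow
page, and prefer zapping pages that nobody has referenced since the
last pass.

  /* Hypothetical reclaim pass over one VM's shadow pages; the caller
   * would hold kvm->mmu_lock.  sp_referenced() stands in for "test the
   * accessed bit in the parent sptes", sp_clear_referenced() for
   * clearing it, and zap_page() for the existing zap path. */
  static void reclaim_cold_shadow_pages(struct kvm *kvm, int nr)
  {
          struct kvm_mmu_page *sp, *tmp;

          list_for_each_entry_safe(sp, tmp,
                                   &kvm->arch.active_mmu_pages, link) {
                  if (sp_referenced(sp)) {
                          /* referenced since the last pass: keep it and
                           * clear the bit for next time */
                          sp_clear_referenced(sp);
                          continue;
                  }
                  /* unreferenced (for instance an unsync page the guest
                   * no longer uses): cheap to drop now */
                  zap_page(kvm, sp);
                  if (--nr <= 0)
                          break;
          }
  }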


--
error compiling committee.c: too many arguments to function



Re: [RFC][PATCH 0/9] rework KVM mmu_shrink() code

2010-06-16 Thread Dave Hansen
On Wed, 2010-06-16 at 11:38 +0300, Avi Kivity wrote:
> On 06/15/2010 04:55 PM, Dave Hansen wrote:
>> These seem to boot and run fine.  I'm running about 40 VMs at
>> once, while doing echo 3 > /proc/sys/vm/drop_caches, and
>> killing/restarting VMs constantly.
>
> Will drop_caches actually shrink the kvm caches too?  If so we probably
> need to add that to autotest since it's a really good stress test for
> the mmu.

I'm completely sure.  I crashed my machines several times this way
during testing.

>> Seems to be relatively stable, and seems to keep the numbers
>> of kvm_mmu_page_header objects down.
>
> That's not necessarily a good thing; those things are expensive to
> recreate.  Of course, when we do need to reclaim them, that should be
> efficient.

Oh, I meant that I didn't break the shrinker completely.

> We also do a very bad job of selecting which page to reclaim.  We need
> to start using the accessed bit on sptes that point to shadow page
> tables, and then look those up and reclaim unreferenced pages sooner.
> With shadow paging there can be tons of unsync pages that are basically
> unused and can be reclaimed at no cost to future runtime.

Sounds like a good next step.

-- Dave



[RFC][PATCH 0/9] rework KVM mmu_shrink() code

2010-06-15 Thread Dave Hansen
This is a big RFC for the moment.  These need a bunch more
runtime testing.

--

We've seen contention in the mmu_shrink() function.  This patch
set reworks it to hopefully be more scalable to large numbers
of CPUs, as well as large numbers of running VMs.
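
For context on where that contention comes from: as the code stood
around 2.6.34 (abridged here, so treat the exact signature and locking
details as approximate), every invocation of the shrinker serializes on
the global kvm_lock and walks the global VM list, taking each VM's
mmu_lock along the way.

  /* arch/x86/kvm/mmu.c (abridged sketch): with many CPUs entering
   * reclaim and many VMs on vm_list, everyone piles up on kvm_lock. */
  static int mmu_shrink(int nr_to_scan, gfp_t gfp_mask)
  {
          struct kvm *kvm;

          spin_lock(&kvm_lock);           /* global, one per host */
          list_for_each_entry(kvm, &vm_list, vm_list) {
                  spin_lock(&kvm->mmu_lock);
                  /* pick a victim VM and zap some of its shadow pages
                   * (details elided) */
                  spin_unlock(&kvm->mmu_lock);
          }
          spin_unlock(&kvm_lock);

          return 0;                       /* abridged: the real code
                                             returns an object count */
  }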

The patches are ordered with increasing invasiveness.

These seem to boot and run fine.  I'm running about 40 VMs at
once, while doing echo 3 > /proc/sys/vm/drop_caches, and
killing/restarting VMs constantly.

Seems to be relatively stable, and seems to keep the numbers
of kvm_mmu_page_header objects down.