Andi Kleen wrote:
On Sat, Nov 29, 2008 at 08:43:35PM +0200, Avi Kivity wrote:
Andi Kleen wrote:
It depends -- it's not necessarily an improvement. E.g. if the pinning
leads to some CPUs being idle while others are oversubscribed, you
typically lose more than you win. In general, default pinning is a bad
idea in my experience.

Alternative more flexible strategies:

- Do a mapping from CPU to node at runtime by using getcpu() (a rough
sketch of this follows below)
- Migrate to complete nodes using migrate_pages when qemu detects
node migration on the host.
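
As a minimal illustration of the getcpu()-based mapping, assuming
libnuma is available (link with -lnuma); the numa_set_preferred() call
at the end is just one way a vcpu thread might act on the answer, not
anything qemu actually does today:

#define _GNU_SOURCE
#include <sched.h>
#include <numa.h>
#include <stdio.h>

int main(void)
{
    if (numa_available() < 0)
        return 1;                       /* kernel has no NUMA support */

    int cpu = sched_getcpu();           /* CPU this thread runs on right now */
    int node = numa_node_of_cpu(cpu);   /* node that CPU belongs to */

    printf("running on cpu %d, node %d\n", cpu, node);
    if (node >= 0)
        numa_set_preferred(node);       /* prefer local memory from now on */
    return 0;
}

This is only approximate, of course: the scheduler may move the thread
again right after the lookup, which is the trade-off discussed below.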
Wouldn't that cause lots of migrations? Migrating a 1GB guest can take

I assume you mean the second one (the two points were orthogonal).
The first one is an approximate method; it also has advantages
and disadvantages.


I don't think the first one works without the second. Calling getcpu() on startup is meaningless since the initial placement doesn't take the current workload into account.

a huge amount of cpu time (tens or even hundreds of milliseconds?) compared to very high frequency activity like the scheduler.

Yes, migration is expensive, although you can do it on demand, of course; but the scheduler typically has fairly strong CPU affinity, so it shouldn't happen too often. Also, it's only a temporary cost compared to the endless overhead of running non-locally forever, or running forever with some cores idle.
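
For concreteness, a hedged sketch of the migrate_pages side of this,
again assuming libnuma; old_node/new_node are placeholders for whatever
detection logic qemu would actually use, not qemu's real code:

#include <numa.h>

/* Move all of this process's pages from old_node to new_node.
 * The node numbers would come from whatever logic notices that the
 * vcpus have been moved to a different node on the host. */
static int follow_to_node(int old_node, int new_node)
{
    struct bitmask *from = numa_allocate_nodemask();
    struct bitmask *to   = numa_allocate_nodemask();
    int ret;

    numa_bitmask_setbit(from, old_node);
    numa_bitmask_setbit(to, new_node);

    /* pid 0 means "the calling process".  For a large guest this can
     * take a long time, which is exactly the cost discussed above. */
    ret = numa_migrate_pages(0, from, to);

    numa_free_nodemask(from);
    numa_free_nodemask(to);
    return ret;
}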

Another strategy would be to tune the load balancer in the scheduler
for this case and make it migrate only in extreme situations.

Anyway, it's not ideal either, but to my mind it would all be preferable
to default CPU pinning.

I agree we need something dynamic, and that we need to tie CPU affinity and memory affinity together.

This could happen completely in the kernel (not an easy task), or by having a second-level scheduler in userspace polling for CPU usage and rebalancing processes across NUMA nodes. Given that with virtualization you have a few long-lived processes, this does not seem too difficult.
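
A rough sketch of what such a userspace second-level scheduler could
look like, once more assuming libnuma; pick_node_for() is a
hypothetical policy hook standing in for the actual load measurement,
not an existing API:

#include <numa.h>
#include <unistd.h>
#include <sys/types.h>

/* Hypothetical policy hook: decide which node a guest should occupy,
 * based on per-node CPU and memory load.  Not implemented here. */
extern int pick_node_for(pid_t pid);

static void place_on_node(pid_t pid, int node)
{
    /* Restrict the process to the CPUs of one node... */
    struct bitmask *cpus = numa_allocate_cpumask();
    numa_node_to_cpus(node, cpus);
    numa_sched_setaffinity(pid, cpus);
    numa_free_cpumask(cpus);

    /* ...and pull its memory over to the same node. */
    struct bitmask *from = numa_allocate_nodemask();
    struct bitmask *to   = numa_allocate_nodemask();
    numa_bitmask_setall(from);
    numa_bitmask_setbit(to, node);
    numa_migrate_pages(pid, from, to);
    numa_free_nodemask(from);
    numa_free_nodemask(to);
}

void rebalance_loop(pid_t *guests, int n)
{
    for (;;) {
        for (int i = 0; i < n; i++)
            place_on_node(guests[i], pick_node_for(guests[i]));
        sleep(10);      /* rebalance rarely; migration is expensive */
    }
}

The long polling interval reflects the point above: migration is
expensive, so rebalancing should stay rare.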


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
