Andi Kleen wrote:
> On Sat, Nov 29, 2008 at 08:43:35PM +0200, Avi Kivity wrote:
>> Andi Kleen wrote:
>>> It depends -- it's not necessarily an improvement.  E.g. if it leads to
>>> some CPUs being idle while others are oversubscribed because of the
>>> pinning, you typically lose more than you win.  In general, default
>>> pinning is a bad idea in my experience.
>>>
>>> Alternative, more flexible strategies:
>>> - Do a mapping from CPU to node at runtime by using getcpu()
>>> - Migrate to complete nodes using migrate_pages when qemu detects
>>>   node migration on the host.
>>
>> Wouldn't that cause lots of migrations?  Migrating a 1GB guest can take
>
> I assume you mean the second one (the two points were orthogonal).
> The first one is an approximate method; it also has advantages
> and disadvantages.
I don't think the first one works without the second. Calling getcpu()
on startup is meaningless since the initial placement doesn't take the
current workload into account.
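
To make the two strategies concrete, here is a rough sketch of what a
qemu-side helper could look like (an illustration only, not existing qemu
code: it assumes glibc's sched_getcpu() and libnuma 2.x's
numa_node_of_cpu()/numa_migrate_pages(); the home-node bookkeeping and the
function name are made up, and error handling is omitted):

/*
 * Rough sketch only -- not existing qemu code.  Assumes glibc's
 * sched_getcpu() and libnuma 2.x; error handling omitted, and the
 * "home node" bookkeeping is hypothetical.
 */
#define _GNU_SOURCE
#include <sched.h>      /* sched_getcpu() */
#include <numa.h>       /* numa_node_of_cpu(), numa_migrate_pages() */
#include <unistd.h>     /* getpid() */

static int follow_vcpu_node(int home_node)
{
    int cpu  = sched_getcpu();            /* strategy 1: cpu -> node mapping */
    int node = numa_node_of_cpu(cpu);

    if (node != home_node) {
        /* strategy 2: pull this process' pages over to the new node */
        struct bitmask *from = numa_allocate_nodemask();
        struct bitmask *to   = numa_allocate_nodemask();

        numa_bitmask_setbit(from, home_node);
        numa_bitmask_setbit(to, node);
        numa_migrate_pages(getpid(), from, to);

        numa_bitmask_free(from);
        numa_bitmask_free(to);
        home_node = node;
    }
    return home_node;
}

Calling something like this from a vcpu thread every so often would give
the on-demand migration mentioned below; the open question is how much
drift to tolerate before paying for the copy.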
>> a huge amount of cpu time (tens or even hundreds of milliseconds?)
>> compared to very high frequency activity like the scheduler.
>
> Yes, migration is expensive, although you can do it on demand of course,
> but the scheduler typically has pretty strong cpu affinity, so it shouldn't
> happen too often.  Also, it's only a temporary cost compared to the
> endless overhead of running forever non-local or running forever with
> some cores idle.
>
> Another strategy would be to tune the load balancer in the scheduler
> for this case and make it only migrate in extreme situations.
>
> Anyway, it's not ideal either, but in my mind all of these would be
> preferable to default CPU pinning.
I agree we need something dynamic, and that we need to tie cpu affinity
and memory affinity together.

This could happen completely in the kernel (not an easy task), or by
having a second-level scheduler in userspace polling for cpu usage and
rebalancing processes across numa nodes.  Given that with virtualization
you have a few long-lived processes, this does not seem too difficult.
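
As a very rough sketch of the mechanism such a userspace rebalancer could
use (the policy loop is elided; it assumes libnuma 2.x's
numa_node_to_cpus(), numa_sched_setaffinity() and numa_migrate_pages();
rebalance_guest() is a hypothetical name and error handling is omitted):

/*
 * Sketch of the userspace second-level scheduler idea.  The policy
 * (which guest should move to which node) is left out; only the
 * mechanism is shown.  Assumes libnuma 2.x; rebalance_guest() is a
 * hypothetical helper; error handling omitted.
 */
#include <numa.h>
#include <unistd.h>

/* Move guest 'pid' so that it runs and keeps its memory on 'new_node'. */
static void rebalance_guest(pid_t pid, int old_node, int new_node)
{
    /* cpu affinity: restrict the guest to the cpus of its new home node */
    struct bitmask *cpus = numa_allocate_cpumask();
    numa_node_to_cpus(new_node, cpus);
    numa_sched_setaffinity(pid, cpus);
    numa_free_cpumask(cpus);

    /* memory affinity: pull the pages still sitting on the old node */
    struct bitmask *from = numa_allocate_nodemask();
    struct bitmask *to   = numa_allocate_nodemask();
    numa_bitmask_setbit(from, old_node);
    numa_bitmask_setbit(to, new_node);
    numa_migrate_pages(pid, from, to);
    numa_bitmask_free(from);
    numa_bitmask_free(to);
}

int main(void)
{
    for (;;) {
        /* poll cpu usage per guest, decide which guests should move,
           and call rebalance_guest() for those that should ... */
        sleep(10);
    }
}

The hard part is the policy -- deciding when a guest has really moved and
whether the target node has cpu and memory headroom -- rather than the
mechanism.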
--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.