Andi Kleen wrote:
On Sun, Nov 30, 2008 at 10:07:01PM +0200, Avi Kivity wrote:
Right. Allocated from the guest kernel's perspective. This may be different from the host kernel's perspective.

Linux will delay touching memory until the last moment; Windows will not (it likely zeroes pages on their own nodes, but who knows?).

The problem on Linux is that the first touch is clear_page(), and that unfortunately happens through the direct mapping before the page is mapped into the process, so the "detect the mapping" trick doesn't quite work (unless it's a 32-bit highmem page).

It should still be on the same cpu.
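For concreteness, here is a minimal user-space sketch of the first-touch behaviour being discussed (illustrative only, not code from this thread): bind the thread to one node, touch freshly mmap'd anonymous memory, then ask the kernel where the pages actually landed via the move_pages() status query (libnuma's numa_move_pages() wrapper).

/* first_touch.c: pages land on the node of the CPU that first touches
 * (and hence clears) them.  Build with -lnuma. */
#define _GNU_SOURCE
#include <numa.h>
#include <numaif.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	enum { NPAGES = 16 };
	long pagesz = sysconf(_SC_PAGESIZE);
	void *pages[NPAGES];
	int status[NPAGES];
	char *buf;
	int i;

	if (numa_available() < 0)
		return 1;

	numa_run_on_node(0);		/* run, and therefore fault, on node 0 */

	buf = mmap(NULL, NPAGES * pagesz, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	memset(buf, 1, NPAGES * pagesz);	/* first touch: pages allocated (and cleared) here */

	for (i = 0; i < NPAGES; i++)
		pages[i] = buf + i * pagesz;

	/* nodes == NULL: don't migrate, just report each page's current node */
	numa_move_pages(0, NPAGES, pages, NULL, status, 0);
	for (i = 0; i < NPAGES; i++)
		printf("page %2d -> node %d\n", i, status[i]);
	return 0;
}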

OK, one could migrate it on mapping. While the data is still cache-hot that shouldn't be too expensive. Thinking about it again, it might actually be a reasonable approach.
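A rough user-space analogue of that migrate-on-mapping idea (again just a sketch, not the kvm implementation; migrate_here() is a hypothetical hook one would call when a thread first maps or uses the page):

/* migrate_on_map.c: pull a page to the node of the CPU that is about
 * to use it, via the move_pages() syscall.  Build with -lnuma. */
#define _GNU_SOURCE
#include <numa.h>
#include <numaif.h>
#include <sched.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical hook: called when 'addr' is first mapped by this thread. */
static int migrate_here(void *addr)
{
	int here = numa_node_of_cpu(sched_getcpu());
	void *pages[1] = { addr };
	int nodes[1] = { here };
	int status[1];
	long rc;

	/* pid 0 == current process; the data should still be cache hot,
	 * so the copy is comparatively cheap. */
	rc = numa_move_pages(0, 1, pages, nodes, status, MPOL_MF_MOVE);
	if (rc == 0)
		printf("page now on node %d\n", status[0]);
	return (int)rc;
}

int main(void)
{
	long pagesz = sysconf(_SC_PAGESIZE);
	char *p;

	if (numa_available() < 0)
		return 1;

	p = mmap(NULL, pagesz, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	p[0] = 1;			/* allocated wherever we happen to run now */
	return migrate_here(p);		/* pull it to the node we are "mapping" from */
}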


Could also work for normal apps: move code and data to the local node.

But again, we don't have any guest mapping information when we're running under npt; only the first access. If we're willing to sacrifice memory, we can get the first access per virtual node.

In our case, the application is the guest kernel, which does know.

It knows but it doesn't really care all that much.  The only thing
that counts is the end performance in this case.

Well, testing is the only way to know. I'm particularly interested in how Windows will perform, since we know so little about its internals.

From some light googling, it looks like Windows has a home node for a thread, and will allocate pages from the home node even when the thread is executing on some other node temporarily. It also does automatic page migration in some cases.


The difference is, Linux (as a guest) will try to reuse freed pages from an application or pagecache, knowing which node they belong to.

I agree that if all you do is HPC style computation (boot a kernel and one app with one process per cpu), then the heuristics work well.

Or if there's a way to detect unmapping/remapping.

Sure, if you're willing to drop npt.

It is certainly not perfect and has holes (like any heuristic),
but it has the advantage of being fully dynamic.
It also has the advantage of being already implemented (apart from fake SRAT tables; and that isn't necessary for HPC apps).

What do you mean?

Which part? Being already implemented? Like I said earlier, right now kvm allocates memory in the context of the vcpu thread that first touched it. Given that Linux prefers allocating from the current node, we already implement the first-touch heuristic.
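As a sanity check of that claim, something like the following sketch can be used (illustrative only; ordinary pthreads stand in for qemu's vcpu threads, and the buffer stands in for guest RAM): pin one thread per node, let each fault in its own slice of a shared mapping, and query where each slice ended up.

/* vcpu_first_touch.c: per-thread first touch of a shared "guest RAM"
 * mapping places each slice on the toucher's node.
 * Build with -lnuma -lpthread; assumes at most 64 nodes. */
#define _GNU_SOURCE
#include <numa.h>
#include <numaif.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define SLICE (4UL << 20)		/* 4 MB of "guest RAM" per vcpu */

static char *guest_ram;

static void *vcpu_thread(void *arg)
{
	int node = (int)(long)arg;

	numa_run_on_node(node);			/* stand-in for pinning a vcpu */
	memset(guest_ram + node * SLICE, 0, SLICE);	/* first touch */
	return NULL;
}

int main(void)
{
	pthread_t tid[64];
	int nodes, n;

	if (numa_available() < 0)
		return 1;

	nodes = numa_max_node() + 1;
	guest_ram = mmap(NULL, nodes * SLICE, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	for (n = 0; n < nodes; n++)
		pthread_create(&tid[n], NULL, vcpu_thread, (void *)(long)n);
	for (n = 0; n < nodes; n++)
		pthread_join(tid[n], NULL);

	for (n = 0; n < nodes; n++) {
		int where = -1;
		/* MPOL_F_NODE | MPOL_F_ADDR: report the node backing this address */
		get_mempolicy(&where, NULL, 0, guest_ram + n * SLICE,
			      MPOL_F_NODE | MPOL_F_ADDR);
		printf("vcpu %d slice -> node %d\n", n, where);
	}
	return 0;
}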


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
