On Wed, 14 May 2008, Christoph Lameter wrote:
>
> The problem is that the code in rmap.c try_to_unmap() and friends loops
> over reverse maps after taking a spinlock. The mm_struct is only known
> after the rmap has been accessed. This means *inside* the spinlock.

So you queue them. That's what we do with things like the dirty bit. We
need to hold various spinlocks to look up pages, but then we can't
actually call the filesystem with the spinlock held.

Converting a spinlock to a waiting lock for things like that is simply not
acceptable. You have to work with the system.

Yeah, there's only a single bit worth of information on whether a page is
dirty or not, so "queueing" that information is trivial (it's just the
return value from "page_mkclean_file()"). Some things are harder than
others, and I suspect you need some kind of "gather" structure to queue up
all the vma's that can be affected.

But it sounds like for the case of rmap, the approach of:

 - the page lock is the higher-level "sleeping lock" (which makes sense,
   since this is very close to an IO event, and that is what the page lock
   is generally used for)

   But hey, it could be anything else - maybe you have some other even
   bigger lock to allow you to handle lots of pages in one go.

 - with that lock held, you do the whole rmap dance (which requires
   spinlocks) and gather up the vma's and the struct mm's involved.

 - outside the spinlocks you then do whatever it is you need to do.

This doesn't sound all that different from TLB shoot-down in SMP, and the
"mmu_gather" structure.

Now, admittedly we can do the TLB shoot-down while holding the spinlocks,
but if we couldn't that's how we'd still do it: it would get more involved
(because we'd need to guarantee that the gather can hold *all* the pages -
right now we can just flush in the middle if we need to), but it wouldn't
be all that fundamentally different.

And no, I really haven't even wanted to look at what XPMEM really needs to
do, so maybe the above thing doesn't work for you, and you have other
issues. I'm just pointing you in a general direction, not trying to say
"this is exactly how to get there".

		Linus
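
For illustration only, the gather-then-act pattern described above might look roughly like the following userspace C sketch. It is not kernel code and not anything from the thread: `struct mm`, `struct vma`, `struct rmap_gather`, `GATHER_MAX`, `gather_mms()`, `notify_gathered()` and the toy `spin_lock()`/`spin_unlock()` built on C11 `atomic_flag` are all hypothetical stand-ins. The sketch assumes the caller already holds the higher-level sleeping lock (the page lock in the email), and it fails rather than flushing midway when the gather fills up, since the point is that the callbacks cannot run under the spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define GATHER_MAX 16                 /* hypothetical fixed capacity */

/* Stand-ins for the kernel's struct mm_struct and an rmap chain entry. */
struct mm  { int id; int notified; };
struct vma { struct mm *mm; struct vma *next; };

/* Hypothetical "gather" structure: the mm's found during the rmap walk. */
struct rmap_gather {
    struct mm *mms[GATHER_MAX];
    size_t nr;
};

/* Toy spinlock standing in for the lock that protects the reverse map. */
static atomic_flag rmap_lock = ATOMIC_FLAG_INIT;

static void spin_lock(atomic_flag *l)
{
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
        ;                             /* busy-wait: no sleeping in here */
}

static void spin_unlock(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

/*
 * Phase 1: walk the chain under the spinlock and only *record* each
 * distinct mm. Returns 0 on success, -1 if the gather is too small
 * (we cannot "flush in the middle" because the work may sleep).
 */
static int gather_mms(struct vma *chain, struct rmap_gather *g)
{
    int ret = 0;

    g->nr = 0;
    spin_lock(&rmap_lock);
    for (struct vma *v = chain; v; v = v->next) {
        size_t i;

        for (i = 0; i < g->nr; i++)
            if (g->mms[i] == v->mm)
                break;
        if (i < g->nr)
            continue;                 /* this mm is already queued */
        if (g->nr == GATHER_MAX) {
            ret = -1;
            break;
        }
        g->mms[g->nr++] = v->mm;
    }
    spin_unlock(&rmap_lock);
    return ret;
}

/* Phase 2: spinlock dropped, so the callback is free to sleep. */
static void notify_gathered(struct rmap_gather *g, void (*cb)(struct mm *))
{
    for (size_t i = 0; i < g->nr; i++)
        cb(g->mms[i]);
}

/* Example callback; a real one would do the sleeping work. */
static void mark_notified(struct mm *mm)
{
    mm->notified++;
}
```

A caller would take the sleeping lock, call `gather_mms()`, then `notify_gathered()` with its callback, and finally release the sleeping lock. A real implementation would presumably also have to keep each gathered mm alive between the two phases (for example by taking a reference while still under the spinlock); the sketch leaves that out.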