Benjamin Herrenschmidt wrote:

>>OK. I was referring to another approach: Copying _to_ VRAM / AGP:
>>
>>lock_mmap_sems()
>>unmap_mapping_range() (or similar)
>>copy() / flip()
>>foreach_affected_vma{
>>    io_remap_pfn_range() /* Map vram / AGP space */
>>}
>>unlock_mmap_sems()
>>
>>This works like a charm in the drm memory manager but it requires the
>>lock of the mmap sems from all affected processes, and the locking
>>order must be the same all the time otherwise deadlocks will occur.
>
>Yes, and that's what I think we can "fix" using do_no_page() and
>unmap_mapping_ranges(). That is, we don't io_remap_pfn_range(), we just
>"fault" in pages either from VRAM or from memory depending on where an
>object sits at a given point in time, and we use
>unmap_mapping_ranges() to invalidate current mappings of an object
>when we "move" it. That can be done with the minimal approach I
>described with the only limitation (though a pretty big one today) that
>you need struct page for VRAM for no_page() to be usable in those
>conditions.
>
>>do_no_page() is smart enough to recheck the pte when it retakes the
>>page table spinlock, so if the pte has been populated by someone
>>while in the driver nopage(), the returned struct page will simply be
>>discarded.
>
>Yup, indeed. It has to, to avoid races, since no_page() has to be called
>without the PTE lock. The NOPAGE_RETRY approach would still be slightly
>more efficient though.
>
>>io_remap_pfn_range() should do the job of setting up the new ptes, but
>>it needs the mmap_sem, so if that one is held while blocked in
>>nopage(), a deadlock will occur. Here, the NOPAGE_RETRY will obviously
>>do the job. When io_remap_pfn_range() has finished setting up the
>>ptes, one can simply return a bogus page to nopage() if it insists on
>>retrying. Until NOPAGE_RETRY is implemented, I'm afraid I'm stuck with
>>the approach outlined above.
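
As an aside on the pte recheck mentioned in the quoted text: the pattern can be sketched in userspace roughly as below. This is only a model under stated assumptions. A pthread mutex stands in for the page table spinlock, a pointer stands in for the pte, and all names (pte_slot, fault_in, plain_nopage) are made up for illustration and are not kernel API. The real do_no_page() is considerably more involved.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel API. */
struct pte_slot {
    pthread_mutex_t lock;   /* stands in for the page table spinlock */
    const void *page;       /* NULL means "pte not populated yet" */
};

/* Stand-in for a driver nopage() callback. */
typedef const void *(*nopage_fn)(void *ctx);

/* Trivial example callback: the context pointer is the "page". */
const void *plain_nopage(void *ctx)
{
    return ctx;
}

/* Returns whichever page ends up installed in the slot. */
const void *fault_in(struct pte_slot *slot, nopage_fn nopage, void *ctx)
{
    const void *candidate;

    assert(slot != NULL && nopage != NULL);

    /* The callback runs without the lock held, so it is free to block. */
    candidate = nopage(ctx);

    pthread_mutex_lock(&slot->lock);
    if (slot->page == NULL)
        slot->page = candidate;     /* nobody got there first: install it */
    /* Otherwise the slot was populated meanwhile; candidate is dropped. */
    candidate = slot->page;
    pthread_mutex_unlock(&slot->lock);

    return candidate;
}
```

Calling fault_in() twice on the same slot with two different pages leaves the first page installed and returns it both times. That is the "returned struct page will simply be discarded" behaviour, in miniature.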
>
>It's not completely clear to me if we need the mmap_sem for writing to
>call io_remap_pfn_range()... We can certainly populate PTEs with only
>the read semaphore and we happen to have it in no_page(), so that would
>just work being called just within no_page().
>
>So this approach would work today imho:
>
>* objects have rwsem to protect migration.
>* no_page() does:
>   - takes that object read sem
>   - if object is in vram or other non-memory location then do
>     io_remap_pfn_range() and get a dummy page struct pointer
>   - else get the struct page of the object page in memory
>   - release the object read sem and return whatever struct page we got
>* migration does:
>   - take that object write sem
>   - copy the data to the new location
>   - call unmap_mapping_ranges() for that object
>   - release the object write sem
>
>With 2.6.19, hopefully, NOPAGE_RETRY will get in, which means that
>no_page() can be optimized for the case where it calls
>io_remap_pfn_range() to not return a bogus page and have a faster return
>path to userland. It's also possible to provide an io_remap_one_page()
>that would be faster than having to call the whole 4 level
>io_remap_pfn_range() for every page faulted in (though we might just
>remap the entire object on the first fault, might well work ...)
>
>Or do you think I missed something ?

No, that's probably the safest approach we can use until NOPAGE_RETRY
arrives. Only I was not sure it'd be safe to call io_remap_pfn_range()
from within nopage, in case it modifies some internal mm structs that the
kernel nopage() code expects to be untouched.

Once NOPAGE_RETRY arrives, (hopefully with a schedule() call attached to
it), it's possible, however, that repopulating the whole vma using
io_remap_pfn_range() outside nopage, just after doing the copying is
more efficient. Although this means keeping track of vmas, the mmap_sems
can be taken and released one at a time, without any locking problems.

I agree the single-page approach looks nicer, though. It's somewhat ugly
to force one's way into another process' memory space.

/Thomas

>Cheers,
>Ben