> OK. I was referring to another approach: copying _to_ VRAM / AGP:
>
>     lock_mmap_sems()
>     unmap_mapping_range()    (or similar)
>     copy() / flip()
>     foreach_affected_vma {
>             io_remap_pfn_range()    /* Map VRAM / AGP space */
>     }
>     unlock_mmap_sems()
>
> This works like a charm in the drm memory manager, but it requires
> taking the mmap sems of all affected processes, and the locking order
> must be the same all the time, otherwise deadlocks will occur.
Yes, and that's what I think we can "fix" using do_no_page() and
unmap_mapping_range(). That is, we don't io_remap_pfn_range(), we just
"fault" in pages either from VRAM or from memory depending on where an
object sits at a given point in time, and we use unmap_mapping_range()
to invalidate current mappings of an object when we "move" it.

That can be done with the minimal approach I described, with the only
limitation (though a pretty big one today) that you need struct page
for VRAM for nopage() to be usable in those conditions.

> do_no_page() is smart enough to recheck the pte when it retakes the
> page table spinlock(), so if the pte has been populated by someone
> while in the driver nopage(), the returned struct page will simply be
> discarded.

Yup, indeed. It has to, to avoid races, since nopage() has to be called
without the PTE lock. The NOPAGE_RETRY approach would still be slightly
more efficient though.

> io_remap_pfn_range() should do the job of setting up the new ptes, but
> it needs the mmap_sem, so if that one is held while blocked in
> nopage(), a deadlock will occur. Here, NOPAGE_RETRY will obviously do
> the job. When io_remap_pfn_range() has finished setting up the ptes,
> one can simply return a bogus page to nopage() if it insists on
> retrying. Until NOPAGE_RETRY is implemented, I'm afraid I'm stuck with
> the approach outlined above.

It's not completely clear to me that we need the mmap_sem for writing
to call io_remap_pfn_range()... We can certainly populate PTEs with
only the read semaphore, and we happen to have it in nopage(), so that
would just work when called from within nopage().

So this approach would work today imho (rough, untested sketch at the
bottom of this mail):

 * objects have an rwsem protecting migration
 * nopage() does:
   - take that object's read sem
   - if the object is in VRAM or some other non-memory location, do
     io_remap_pfn_range() and get a dummy struct page pointer
   - else get the struct page of the object's page in memory
   - release the object's read sem and return whatever struct page we
     got
 * migration does:
   - take that object's write sem
   - copy the data to the new location
   - call unmap_mapping_range() for that object
   - release the object's write sem

With 2.6.19, hopefully, NOPAGE_RETRY will get in, which means that
nopage() can be optimized, for the case where it calls
io_remap_pfn_range(), to not return a bogus page and to have a faster
return path back to userland.

It's also possible to provide an io_remap_one_page() that would be
faster than having to call the whole 4-level io_remap_pfn_range() for
every page faulted in (though we might just remap the entire object on
the first fault, which might well work...).

Or do you think I missed something?

Cheers,
Ben
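
Here is roughly what I mean, as an untested sketch against the current
nopage() interface. struct drm_buffer_object and all of its fields
(migrate_sem, in_vram, vram_pfn, pages, mapping, dummy_page) are
made-up names for illustration, not the real drm memory manager types:

/*
 * Rough, untested sketch of the nopage()-based scheme described above.
 * All type and field names here are invented for illustration.
 */
#include <linux/mm.h>
#include <linux/rwsem.h>
#include <asm/pgtable.h>

struct drm_buffer_object {
	struct rw_semaphore	migrate_sem;	/* protects object location */
	int			in_vram;	/* non-zero: backed by VRAM/AGP */
	unsigned long		vram_pfn;	/* first pfn when in VRAM/AGP */
	struct page		**pages;	/* backing pages when in RAM */
	unsigned long		num_pages;
	struct address_space	*mapping;	/* mapping used for mmap() */
	struct page		*dummy_page;	/* returned after remapping */
};

/* nopage() handler: fault pages in from wherever the object lives now. */
static struct page *drm_bo_nopage(struct vm_area_struct *vma,
				  unsigned long address, int *type)
{
	struct drm_buffer_object *bo = vma->vm_private_data;
	unsigned long offset = (address - vma->vm_start) >> PAGE_SHIFT;
	struct page *page;

	down_read(&bo->migrate_sem);

	if (bo->in_vram) {
		/*
		 * Remap the entire object on the first fault.  We only
		 * hold mmap_sem for reading here, which is assumed to be
		 * enough to populate the ptes.  Assumes the vma maps the
		 * whole object from offset 0.
		 */
		if (io_remap_pfn_range(vma, vma->vm_start, bo->vram_pfn,
				       vma->vm_end - vma->vm_start,
				       vma->vm_page_prot)) {
			up_read(&bo->migrate_sem);
			return NOPAGE_SIGBUS;
		}
		/*
		 * The ptes are already in place, so do_no_page() will just
		 * drop this bogus page when it rechecks the pte under the
		 * page table lock.
		 */
		page = bo->dummy_page;
	} else {
		page = bo->pages[offset];
	}

	get_page(page);
	up_read(&bo->migrate_sem);

	if (type)
		*type = VM_FAULT_MINOR;
	return page;
}

/* Migration: move the object, then shoot down every existing mapping. */
static void drm_bo_migrate(struct drm_buffer_object *bo)
{
	down_write(&bo->migrate_sem);

	/* ... copy / flip the object to its new location here ... */
	bo->in_vram = !bo->in_vram;

	unmap_mapping_range(bo->mapping, 0,
			    (loff_t)bo->num_pages << PAGE_SHIFT, 1);

	up_write(&bo->migrate_sem);
}

A real version would have to respect the vma's vm_pgoff and partial
mappings instead of blindly remapping from vma->vm_start, but the point
is just the read-side / write-side split on the object semaphore.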