> OK. I was referring to another approach: copying _to_ VRAM/AGP:
> 
> lock_mmap_sems()
> unmap_mapping_range() (or similar)
> copy() / flip()
> foreach_affected_vma{
>    io_remap_pfn_range() /* Map vram / AGP space */
> }
> unlock_mmap_sems()
> 
> This works like a charm in the drm memory manager but it requires the
> lock of the mmap sems from all affected processes, and the locking
> order must be the same all the time otherwise deadlocks will occur.

Yes, and that's what I think we can "fix" using do_no_page() and
unmap_mapping_range(). That is, we don't io_remap_pfn_range(); we just
"fault" in pages either from VRAM or from memory depending on where an
object sits at a given point in time, and we use
unmap_mapping_range() to invalidate current mappings of an object
when we "move" it. That can be done with the minimal approach I
described, with the only limitation (though a pretty big one today) being
that you need struct page for VRAM for no_page() to be usable in those
conditions.

> do_no_page() is smart enough to recheck the pte when it retakes the
> page table spinlock(), so if the pte has been populated by someone
> while in the driver nopage(), the returned struct page will simply be
> discarded. 

Yup, indeed. It has to, to avoid races, since no_page() has to be called
without the PTE lock. The NOPAGE_RETRY approach would still be slightly
more efficient though.
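
To illustrate the recheck described above, here is a minimal userspace
sketch. It is not kernel code: a pthread mutex stands in for the page
table spinlock, and struct pte_slot, handle_fault(), plain_nopage() and
racing_nopage() are hypothetical names made up for this example. It also
skips the initial "is the PTE already populated?" check a real fault path
would do first.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a single PTE. */
struct pte_slot {
    pthread_mutex_t lock;   /* models the page table spinlock */
    const void *page;       /* NULL means "not populated" */
};

/* Hypothetical driver callback; it runs with the lock dropped. */
typedef const void *(*nopage_fn)(void *ctx);

/* Models the tail of do_no_page(): call the driver, retake the lock,
 * recheck the slot, and discard the driver's page if we lost the race. */
const void *handle_fault(struct pte_slot *pte, nopage_fn nopage, void *ctx)
{
    const void *candidate = nopage(ctx);    /* no lock held here */

    assert(candidate != NULL);
    pthread_mutex_lock(&pte->lock);
    if (pte->page == NULL)
        pte->page = candidate;              /* still empty: install it */
    /* otherwise someone populated it meanwhile; candidate is dropped */
    const void *result = pte->page;
    pthread_mutex_unlock(&pte->lock);
    return result;
}

/* Simple callback: the context pointer itself is the "page". */
const void *plain_nopage(void *ctx)
{
    return ctx;
}

/* Callback that simulates another path populating the slot while the
 * driver is still inside its nopage handler. */
struct race_ctx {
    struct pte_slot *pte;
    const void *winner;     /* page installed by the "other" path */
    const void *mine;       /* page this callback returns */
};

const void *racing_nopage(void *ctx)
{
    struct race_ctx *rc = ctx;

    pthread_mutex_lock(&rc->pte->lock);
    rc->pte->page = rc->winner;
    pthread_mutex_unlock(&rc->pte->lock);
    return rc->mine;
}
```

In the racing case handle_fault() returns the page that was already
installed rather than the one the callback produced, which is the
"simply discarded" behaviour mentioned in the quote.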

> io_remap_pfn_range() should do the job of setting up the new ptes, but
> it needs the mmap_sem, so if that one is held while blocked in
> nopage(), a deadlock will occur. Here, the NOPAGE_RETRY will obviously
> do the job. When io_remap_pfn_range() has finished setting up the
> ptes, one can simply return a bogus page to nopage() if it insists on
> retrying. Until NOPAGE_RETRY is implemented, I'm afraid I'm stuck with
> the approach outlined above.

It's not completely clear to me whether we need the mmap_sem for writing to
call io_remap_pfn_range()... We can certainly populate PTEs with only
the read semaphore, and we happen to hold it in no_page(), so that would
just work when called from within no_page().

So this approach would work today imho:

* objects have rwsem to protect migration.
* no_page() does:
   - takes that object read sem
   - if object is in vram or other non-memory location then do
     io_remap_pfn_range() and get a dummy page struct pointer
   - else get the struct page of the object page in memory
   - release the object read sem and return whatever struct page we got
* migration does:
   - take that object write sem
   - copy the data to the new location
   - call unmap_mapping_range() for that object
   - release the object write sem

With 2.6.19, hopefully, NOPAGE_RETRY will get in, which means that
no_page() can be optimized for the case where it calls
io_remap_pfn_range(): it no longer has to return a bogus page and gets a
faster return path to userland. It's also possible to provide an
io_remap_one_page() that would be faster than having to call the whole
4-level io_remap_pfn_range() for every page faulted in (though we might
just remap the entire object on the first fault, which might well work...)

Or do you think I missed something?

Cheers,
Ben


_______________________________________________
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel
