> OK. I was referring to another approach: Copying _to_ VRAM / AGP:
>
>     lock_mmap_sems()
>     unmap_mapping_range()  (or similar)
>     copy() / flip()
>     foreach_affected_vma {
>             io_remap_pfn_range()  /* Map VRAM / AGP space */
>     }
>     unlock_mmap_sems()
>
> This works like a charm in the drm memory manager, but it requires
> taking the mmap sems of all affected processes, and the locking
> order must be the same every time, otherwise deadlocks will occur.
Yes, and that's what I think we can "fix" using do_no_page() and
unmap_mapping_range(). That is, we don't io_remap_pfn_range(), we just
"fault" in pages either from VRAM or from memory, depending on where an
object sits at a given point in time, and we use unmap_mapping_range()
to invalidate the current mappings of an object when we "move" it. That
can be done with the minimal approach I described, with the only
limitation (though a pretty big one today) that you need a struct page
for VRAM for no_page() to be usable in those conditions.
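To illustrate the invalidation half, a minimal sketch; only
unmap_mapping_range() is the real kernel API here, the object ("obj")
and its fields are made-up names for illustration:

    /* Zap every PTE currently mapping the object in every process
     * that has it mmap'ed; the next access faults back into the
     * driver's nopage(). */
    unmap_mapping_range(obj->mapping,    /* the shared address_space */
                        obj->offset,     /* byte offset of the object */
                        (loff_t)obj->num_pages << PAGE_SHIFT,
                        1 /* even_cows */);
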
> do_no_page() is smart enough to recheck the pte when it retakes the
> page table spinlock, so if the pte has been populated by someone
> while we were in the driver's nopage(), the returned struct page will
> simply be discarded.
Yup, indeed. It has to, in order to avoid races, since no_page() has to
be called without the PTE lock held. The NOPAGE_RETRY approach would
still be slightly more efficient, though.
> io_remap_pfn_range() should do the job of setting up the new ptes,
> but it needs the mmap_sem, so if that one is held while blocked in
> nopage(), a deadlock will occur. Here, NOPAGE_RETRY would obviously
> do the job. When io_remap_pfn_range() has finished setting up the
> ptes, one can simply return a bogus page from nopage() if it insists
> on retrying. Until NOPAGE_RETRY is implemented, I'm afraid I'm stuck
> with the approach outlined above.
It's not completely clear to me whether we need the mmap_sem held for
writing to call io_remap_pfn_range()... We can certainly populate PTEs
with only the read semaphore, and we happen to have it in no_page(), so
that would just work when called from within no_page().
So this approach would work today imho (rough sketch below):
 * objects have an rwsem protecting migration
 * no_page() does:
   - take that object's read sem
   - if the object is in VRAM or another non-memory location, do
     io_remap_pfn_range() and get a dummy struct page pointer
   - else get the struct page of the object's page in memory
   - release the object's read sem and return whatever struct page
     we got
 * migration does:
   - take that object's write sem
   - copy the data to the new location
   - call unmap_mapping_range() for that object
   - release the object's write sem
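Something like this, against the current 2.6.1x nopage() interface. The
object structure and all obj_* names are made up for illustration; only
unmap_mapping_range(), io_remap_pfn_range() and the nopage() /
do_no_page() machinery are real kernel interfaces, and error handling
is omitted:

    #include <linux/mm.h>
    #include <linux/rwsem.h>
    #include <linux/pagemap.h>

    struct obj {
            struct rw_semaphore sem;       /* protects migration */
            struct address_space *mapping; /* mapping user VMAs use */
            loff_t offset;                 /* byte offset of the object */
            unsigned long num_pages;
            int in_vram;                   /* current backing store */
            unsigned long vram_pfn;        /* first PFN when in VRAM */
            struct page **pages;           /* backing pages when in RAM */
            struct page *dummy_page;       /* preallocated bogus page */
    };

    static struct page *obj_nopage(struct vm_area_struct *vma,
                                   unsigned long address, int *type)
    {
            struct obj *obj = vma->vm_private_data;
            /* assumes the VMA maps exactly this object */
            unsigned long pgoff = (address - vma->vm_start) >> PAGE_SHIFT;
            struct page *page;

            down_read(&obj->sem);
            if (obj->in_vram) {
                    /* Populate the PTE ourselves; we only hold the
                     * mmap_sem for read here, which is enough to fill
                     * page tables. */
                    io_remap_pfn_range(vma, address & PAGE_MASK,
                                       obj->vram_pfn + pgoff,
                                       PAGE_SIZE, vma->vm_page_prot);
                    /* Until NOPAGE_RETRY exists we must hand back some
                     * struct page; do_no_page() rechecks the PTE, sees
                     * it populated, and discards this one. */
                    page = obj->dummy_page;
            } else {
                    page = obj->pages[pgoff];
            }
            get_page(page);
            up_read(&obj->sem);

            if (type)
                    *type = VM_FAULT_MINOR;
            return page;
    }

    static void obj_migrate(struct obj *obj /* , new backing ... */)
    {
            down_write(&obj->sem);
            /* 1. copy the object's data to its new location (not shown) */
            /* 2. zap every existing user mapping so the next touch goes
             *    back through obj_nopage() */
            unmap_mapping_range(obj->mapping, obj->offset,
                                (loff_t)obj->num_pages << PAGE_SHIFT, 1);
            /* 3. update obj->in_vram / obj->vram_pfn / obj->pages */
            up_write(&obj->sem);
    }

The dummy page (allocated at object creation, not shown) is only there
to keep do_no_page() happy; it either gets discarded when the PTE
recheck finds the entry already populated, or gets zapped again by the
next unmap_mapping_range().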
With 2.6.19, hopefully, NOPAGE_RETRY will get in, which means that
no_page() can be optimized, for the case where it calls
io_remap_pfn_range(), to not return a bogus page and to have a faster
return path to userland. It's also possible to provide an
io_remap_one_page() that would be faster than calling the whole 4-level
io_remap_pfn_range() for every page faulted in (though we might just
remap the entire object on the first fault, which might well work...).
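For completeness, a purely hypothetical sketch of that optimized path,
assuming NOPAGE_RETRY gets merged as a nopage() return value (it has
not at the time of writing) and reusing the made-up obj structure from
the sketch above:

    #ifdef NOPAGE_RETRY /* hypothetical, not in mainline */
    static struct page *obj_nopage_fast(struct vm_area_struct *vma,
                                        unsigned long address, int *type)
    {
            struct obj *obj = vma->vm_private_data;

            down_read(&obj->sem);
            if (obj->in_vram) {
                    /* Remap the entire object on the first fault
                     * (assuming the VMA covers exactly this object),
                     * then tell do_no_page() to return straight to
                     * userland and retry the access; no bogus page
                     * is needed. */
                    io_remap_pfn_range(vma, vma->vm_start, obj->vram_pfn,
                                       obj->num_pages << PAGE_SHIFT,
                                       vma->vm_page_prot);
                    up_read(&obj->sem);
                    return NOPAGE_RETRY;
            }
            up_read(&obj->sem);

            /* memory-backed case: fall back to the normal path above */
            return obj_nopage(vma, address, type);
    }
    #endif
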
Or do you think I missed something?
Cheers,
Ben