Re: [Intel-gfx] mmu_notifier and i915_gem_userptr.c

2014-06-25 Thread Joerg Roedel
On Fri, Jun 20, 2014 at 01:43:50PM +0200, Joerg Roedel wrote:
 Change_pte is also called when the underlying page of an address
 changes in the kernel which would matter for DMA. But that can only
 happen in KSM and uprobes code which is probably not of interest for the
 i915 driver.
 
 The other case where I think it matters is the do_wp_page() path for
 COW. The code works by calling invalidate_range_start - change_pte -
 invalidate_range_end. Your driver would react to this by unbinding the
 vma from itself internally (after a fork for example).
 
 But I have to check whether this really matters here.

Okay, I think it does not matter for the i915 driver. The code-paths
which map pages read-only for COW invoke invalidate_range_start/end on
the page-ranges which causes the driver to unbind the pages.

When get_user_pages() is called again later it will do the COW by
itself, so the driver doesn't need to care.

So I tend to say that the i915 driver does not need a change_pte()
call-back at all. But probably someone should double-check to make sure
I didn't miss something.


Joerg


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] mmu_notifier and i915_gem_userptr.c

2014-06-20 Thread Joerg Roedel
On Thu, Jun 19, 2014 at 05:02:57PM +0100, Chris Wilson wrote:
 Maybe it should hook the invalidate_page notifier. I picked
 invalidate_range() as it seemed to be called first and seemed to cover
 all operations that affected an mm's addr range.  The invalidate_page
 seemed to be only called during writeback, which aiui will not affect gup.
 Should we be hooking more notifiers?

For what you are interested in the invalidate_range_start notifier
should be sufficient. Invalidate_page is only called for page-aging and
swapping (and xip, but that doesn't matter either).

The problem with invalidate_range_start/end is that its semantics make
it somewhat expensive to implement. So it would be good to reduce the
call-sites and get rid of it at least around the change_pte calls in a
first step.

  I am not familiar with the i915 hardware and the driver implementation
  details, so I wanted to ask whether the driver
  
  1) Cares about the change_pte event?
 
 I don't think so... If I understand correctly, change_pte only gets
 called when the PTE is updated, not the physical page itself. If that is
 true, then the dma mapping of the page is unaffected, and so the GPU
 view of the page is not affected either. So, as long as the GPU
 maintains a remapped address of a page that is associated by a process
 address, we don't need to change anything.

Change_pte is also called when the underlying page of an address
changes in the kernel which would matter for DMA. But that can only
happen in KSM and uprobes code which is probably not of interest for the
i915 driver.

The other case where I think it matters is the do_wp_page() path for
COW. The code works by calling invalidate_range_start - change_pte -
invalidate_range_end. Your driver would react to this by unbinding the
vma from itself internally (after a fork for example).

But I have to check whether this really matters here.

 If need be, start with overkill. I'd rather drop everything and rebuild
 it through the layers of mappings we have, as this is a new area for us
 and we want some solid foundations before trying to be smart.

The main problem I have seen so far is the mutex taken in the
invalidate_range_start call-back. That can't be used in a change_pte
function because change_pte is not preemptible.


Joerg


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] mmu_notifier and i915_gem_userptr.c

2014-06-19 Thread Chris Wilson
On Thu, Jun 19, 2014 at 05:36:55PM +0200, Joerg Roedel wrote:
 Hey Chris,
 
 recently I had a look at i915_gem_userptr.c in order to extend the
 mmu_notifier call-backs implemented there. My goal is to implement the
 change_pte call-back where it is necessary to get rid of it being
 wrapped mn_invalidate_range_start/end() calls (for the reason see
 commit 6bdb913f).
 
 For most users of mmu_notifiers this is easy, except the i915 driver :)
 The invalidate_range_start notifier implemented there can sleep, so it
 can't be reused for a change_pte implementation (because change_pte is
 called under ptl spin_lock and is not allowed to sleep). On the other
 hand you also didn't implement the invalidate_page notifier, so I am not
 sure whether the code actually cares about the somewhat similar
 change_pte events?

Maybe it should hook the invalidate_page notifier. I picked
invalidate_range() as it seemed to be called first and seemed to cover
all operations that affected an mm's addr range.  The invalidate_page
seemed to be only called during writeback, which aiui will not affect gup.
Should we be hooking more notifiers?
 
 Here is where change_pte is called from:
 
   * In KSM code when pages are merged (shouldn't be relevant
 because KSM doesn't merged pages returned by get_user_pages())
   * In uprobes code when a user-page is replaced by a kernel page
 (should only handle .text sections, so probaly not relevant
  here)
   * When someone writes to a COW page in mm/memory.c (this looks
 relevant looking at forked processes, on the other hand,
 this is currently handled by unbinding the vma from the
 object list in the i915 driver)
 
 I am not familiar with the i915 hardware and the driver implementation
 details, so I wanted to ask whether the driver
 
   1) Cares about the change_pte event?

I don't think so... If I understand correctly, change_pte only gets
called when the PTE is updated, not the physical page itself. If that is
true, then the dma mapping of the page is unaffected, and so the GPU
view of the page is not affected either. So, as long as the GPU
maintains a remapped address of a page that is associated by a process
address, we don't need to change anything.

   2) If it cares, what is the best way to implement it? What the
  invalidate_range_start() notifier does seems a bit overkill,
  since for the change_pte event nothing is unmapped (but maybe
  remapped)

If need be, start with overkill. I'd rather drop everything and rebuild
it through the layers of mappings we have, as this is a new area for us
and we want some solid foundations before trying to be smart.
 
 So any insight you could provide here would be useful :)

Likewise. :-p
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] mmu_notifier and i915_gem_userptr.c

2014-06-19 Thread Joerg Roedel
Hey Chris,

recently I had a look at i915_gem_userptr.c in order to extend the
mmu_notifier call-backs implemented there. My goal is to implement the
change_pte call-back where it is necessary to get rid of it being
wrapped mn_invalidate_range_start/end() calls (for the reason see
commit 6bdb913f).

For most users of mmu_notifiers this is easy, except the i915 driver :)
The invalidate_range_start notifier implemented there can sleep, so it
can't be reused for a change_pte implementation (because change_pte is
called under ptl spin_lock and is not allowed to sleep). On the other
hand you also didn't implement the invalidate_page notifier, so I am not
sure whether the code actually cares about the somewhat similar
change_pte events?

Here is where change_pte is called from:

* In KSM code when pages are merged (shouldn't be relevant
  because KSM doesn't merged pages returned by get_user_pages())
* In uprobes code when a user-page is replaced by a kernel page
  (should only handle .text sections, so probaly not relevant
   here)
* When someone writes to a COW page in mm/memory.c (this looks
  relevant looking at forked processes, on the other hand,
  this is currently handled by unbinding the vma from the
  object list in the i915 driver)

I am not familiar with the i915 hardware and the driver implementation
details, so I wanted to ask whether the driver

1) Cares about the change_pte event?
2) If it cares, what is the best way to implement it? What the
   invalidate_range_start() notifier does seems a bit overkill,
   since for the change_pte event nothing is unmapped (but maybe
   remapped)

So any insight you could provide here would be useful :)


Thanks,

Joerg


___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx