Re: [Intel-gfx] [PATCH 2/2] drm/i915: Initialise userptr mmu_notifier serial to 1

2014-07-15 Thread Daniel Vetter
On Fri, Jul 11, 2014 at 03:06:55PM +0100, Tvrtko Ursulin wrote:
 
 On 07/11/2014 03:02 PM, Daniel Vetter wrote:
 On Fri, Jul 11, 2014 at 12:06:02PM +0100, Chris Wilson wrote:
 On Fri, Jul 11, 2014 at 12:00:26PM +0100, Tvrtko Ursulin wrote:
 
 On 07/11/2014 11:28 AM, Chris Wilson wrote:
 During the range invalidate, we walk the list of buffers associated with
 the mmu_notifer and find the ones that overlap the range. An
 optimisation is made to speed up the iteration by assuming the previous
 iter is still valid whilst the tree is unmodified. This exposes a bug
 when a range invalidate is triggered after we have just created the
 mmu_notifier, but before attaching any buffers. In that case, we presume
 we have an unmodified list and start walking from the last iter which is
 NULL. Oops.
 
 The easiest fix is then to initialise the serial of the tree to 1.
 
 Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
 Cc: Tvrtko Ursulin tvrtko.ursu...@linux.intel.com
 ---
   drivers/gpu/drm/i915/i915_gem_userptr.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c 
 b/drivers/gpu/drm/i915/i915_gem_userptr.c
 index 7c38f50014db..8e9e91029aed 100644
 --- a/drivers/gpu/drm/i915/i915_gem_userptr.c
 +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
 @@ -197,7 +197,7 @@ i915_mmu_notifier_get(struct drm_device *dev, struct 
 mm_struct *mm)
   mmu-mm = mm;
   mmu-objects = RB_ROOT;
   mmu-count = 0;
 - mmu-serial = 0;
 + mmu-serial = 1;
   INIT_LIST_HEAD(mmu-linear);
   mmu-is_linear = false;
 
 
 Looks good to me and I think safe to merge in any case, so:
 
 Reviewed-by: Tvrtko Ursulin tvrtko.ursu...@intel.com
 
 But it will be interesting to know what code managed to trigger this
 race, because as we discussed on IRC it would indicate some pretty
 wild userspace behaviour. Or lack of imagination on our part?
 
 A threaded client. One thread using userptr, the other doing munmap or
 free. Given enough embarrassment, it will happen every time.
 
 Can I have an igt please to confirm this and ensure it stays fixed?
 
 Sure, give me a day or two.

Now that we have the testcase both patches merged, thanks.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 2/2] drm/i915: Initialise userptr mmu_notifier serial to 1

2014-07-11 Thread Chris Wilson
During the range invalidate, we walk the list of buffers associated with
the mmu_notifer and find the ones that overlap the range. An
optimisation is made to speed up the iteration by assuming the previous
iter is still valid whilst the tree is unmodified. This exposes a bug
when a range invalidate is triggered after we have just created the
mmu_notifier, but before attaching any buffers. In that case, we presume
we have an unmodified list and start walking from the last iter which is
NULL. Oops.

The easiest fix is then to initialise the serial of the tree to 1.

Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
Cc: Tvrtko Ursulin tvrtko.ursu...@linux.intel.com
---
 drivers/gpu/drm/i915/i915_gem_userptr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c 
b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 7c38f50014db..8e9e91029aed 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -197,7 +197,7 @@ i915_mmu_notifier_get(struct drm_device *dev, struct 
mm_struct *mm)
mmu-mm = mm;
mmu-objects = RB_ROOT;
mmu-count = 0;
-   mmu-serial = 0;
+   mmu-serial = 1;
INIT_LIST_HEAD(mmu-linear);
mmu-is_linear = false;
 
-- 
2.0.1

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] drm/i915: Initialise userptr mmu_notifier serial to 1

2014-07-11 Thread Tvrtko Ursulin


On 07/11/2014 11:28 AM, Chris Wilson wrote:

During the range invalidate, we walk the list of buffers associated with
the mmu_notifer and find the ones that overlap the range. An
optimisation is made to speed up the iteration by assuming the previous
iter is still valid whilst the tree is unmodified. This exposes a bug
when a range invalidate is triggered after we have just created the
mmu_notifier, but before attaching any buffers. In that case, we presume
we have an unmodified list and start walking from the last iter which is
NULL. Oops.

The easiest fix is then to initialise the serial of the tree to 1.

Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
Cc: Tvrtko Ursulin tvrtko.ursu...@linux.intel.com
---
  drivers/gpu/drm/i915/i915_gem_userptr.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c 
b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 7c38f50014db..8e9e91029aed 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -197,7 +197,7 @@ i915_mmu_notifier_get(struct drm_device *dev, struct 
mm_struct *mm)
mmu-mm = mm;
mmu-objects = RB_ROOT;
mmu-count = 0;
-   mmu-serial = 0;
+   mmu-serial = 1;
INIT_LIST_HEAD(mmu-linear);
mmu-is_linear = false;



Looks good to me and I think safe to merge in any case, so:

Reviewed-by: Tvrtko Ursulin tvrtko.ursu...@intel.com

But it will be interesting to know what code managed to trigger this 
race, because as we discussed on IRC it would indicate some pretty wild 
userspace behaviour. Or lack of imagination on our part?


Tvrtko
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] drm/i915: Initialise userptr mmu_notifier serial to 1

2014-07-11 Thread Chris Wilson
On Fri, Jul 11, 2014 at 12:00:26PM +0100, Tvrtko Ursulin wrote:
 
 On 07/11/2014 11:28 AM, Chris Wilson wrote:
 During the range invalidate, we walk the list of buffers associated with
 the mmu_notifer and find the ones that overlap the range. An
 optimisation is made to speed up the iteration by assuming the previous
 iter is still valid whilst the tree is unmodified. This exposes a bug
 when a range invalidate is triggered after we have just created the
 mmu_notifier, but before attaching any buffers. In that case, we presume
 we have an unmodified list and start walking from the last iter which is
 NULL. Oops.
 
 The easiest fix is then to initialise the serial of the tree to 1.
 
 Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
 Cc: Tvrtko Ursulin tvrtko.ursu...@linux.intel.com
 ---
   drivers/gpu/drm/i915/i915_gem_userptr.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c 
 b/drivers/gpu/drm/i915/i915_gem_userptr.c
 index 7c38f50014db..8e9e91029aed 100644
 --- a/drivers/gpu/drm/i915/i915_gem_userptr.c
 +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
 @@ -197,7 +197,7 @@ i915_mmu_notifier_get(struct drm_device *dev, struct 
 mm_struct *mm)
  mmu-mm = mm;
  mmu-objects = RB_ROOT;
  mmu-count = 0;
 -mmu-serial = 0;
 +mmu-serial = 1;
  INIT_LIST_HEAD(mmu-linear);
  mmu-is_linear = false;
 
 
 Looks good to me and I think safe to merge in any case, so:
 
 Reviewed-by: Tvrtko Ursulin tvrtko.ursu...@intel.com
 
 But it will be interesting to know what code managed to trigger this
 race, because as we discussed on IRC it would indicate some pretty
 wild userspace behaviour. Or lack of imagination on our part?

A threaded client. One thread using userptr, the other doing munmap or
free. Given enough embarrassment, it will happen every time.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] drm/i915: Initialise userptr mmu_notifier serial to 1

2014-07-11 Thread Tvrtko Ursulin


On 07/11/2014 12:06 PM, Chris Wilson wrote:

On Fri, Jul 11, 2014 at 12:00:26PM +0100, Tvrtko Ursulin wrote:


On 07/11/2014 11:28 AM, Chris Wilson wrote:

During the range invalidate, we walk the list of buffers associated with
the mmu_notifer and find the ones that overlap the range. An
optimisation is made to speed up the iteration by assuming the previous
iter is still valid whilst the tree is unmodified. This exposes a bug
when a range invalidate is triggered after we have just created the
mmu_notifier, but before attaching any buffers. In that case, we presume
we have an unmodified list and start walking from the last iter which is
NULL. Oops.

The easiest fix is then to initialise the serial of the tree to 1.

Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
Cc: Tvrtko Ursulin tvrtko.ursu...@linux.intel.com
---
  drivers/gpu/drm/i915/i915_gem_userptr.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c 
b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 7c38f50014db..8e9e91029aed 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -197,7 +197,7 @@ i915_mmu_notifier_get(struct drm_device *dev, struct 
mm_struct *mm)
mmu-mm = mm;
mmu-objects = RB_ROOT;
mmu-count = 0;
-   mmu-serial = 0;
+   mmu-serial = 1;
INIT_LIST_HEAD(mmu-linear);
mmu-is_linear = false;



Looks good to me and I think safe to merge in any case, so:

Reviewed-by: Tvrtko Ursulin tvrtko.ursu...@intel.com

But it will be interesting to know what code managed to trigger this
race, because as we discussed on IRC it would indicate some pretty
wild userspace behaviour. Or lack of imagination on our part?


A threaded client. One thread using userptr, the other doing munmap or
free. Given enough embarrassment, it will happen every time.


Yes fine, but I struggle to imagine what would be the intention of such 
code or how did it manage to fail in such way. I hope the only 
difference is not that userptr upgraded the failure mode for heap 
corruption or memory management races in general.


Tvrtko
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] drm/i915: Initialise userptr mmu_notifier serial to 1

2014-07-11 Thread Chris Wilson
On Fri, Jul 11, 2014 at 12:31:12PM +0100, Tvrtko Ursulin wrote:
 
 On 07/11/2014 12:06 PM, Chris Wilson wrote:
 On Fri, Jul 11, 2014 at 12:00:26PM +0100, Tvrtko Ursulin wrote:
 But it will be interesting to know what code managed to trigger this
 race, because as we discussed on IRC it would indicate some pretty
 wild userspace behaviour. Or lack of imagination on our part?
 
 A threaded client. One thread using userptr, the other doing munmap or
 free. Given enough embarrassment, it will happen every time.
 
 Yes fine, but I struggle to imagine what would be the intention of
 such code or how did it manage to fail in such way. I hope the only
 difference is not that userptr upgraded the failure mode for heap
 corruption or memory management races in general.

The mmu notifier is called everytime a process sneezes. It does not
imply that our object is being invalidated, just that some portion of
the current-mm is being modified.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] drm/i915: Initialise userptr mmu_notifier serial to 1

2014-07-11 Thread Tvrtko Ursulin


On 07/11/2014 12:35 PM, Chris Wilson wrote:

On Fri, Jul 11, 2014 at 12:31:12PM +0100, Tvrtko Ursulin wrote:


On 07/11/2014 12:06 PM, Chris Wilson wrote:

On Fri, Jul 11, 2014 at 12:00:26PM +0100, Tvrtko Ursulin wrote:

But it will be interesting to know what code managed to trigger this
race, because as we discussed on IRC it would indicate some pretty
wild userspace behaviour. Or lack of imagination on our part?


A threaded client. One thread using userptr, the other doing munmap or
free. Given enough embarrassment, it will happen every time.


Yes fine, but I struggle to imagine what would be the intention of
such code or how did it manage to fail in such way. I hope the only
difference is not that userptr upgraded the failure mode for heap
corruption or memory management races in general.


The mmu notifier is called everytime a process sneezes. It does not
imply that our object is being invalidated, just that some portion of
the current-mm is being modified.


Ah yes, I did not see the big picture.

Tvrtko
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] drm/i915: Initialise userptr mmu_notifier serial to 1

2014-07-11 Thread Daniel Vetter
On Fri, Jul 11, 2014 at 12:06:02PM +0100, Chris Wilson wrote:
 On Fri, Jul 11, 2014 at 12:00:26PM +0100, Tvrtko Ursulin wrote:
  
  On 07/11/2014 11:28 AM, Chris Wilson wrote:
  During the range invalidate, we walk the list of buffers associated with
  the mmu_notifer and find the ones that overlap the range. An
  optimisation is made to speed up the iteration by assuming the previous
  iter is still valid whilst the tree is unmodified. This exposes a bug
  when a range invalidate is triggered after we have just created the
  mmu_notifier, but before attaching any buffers. In that case, we presume
  we have an unmodified list and start walking from the last iter which is
  NULL. Oops.
  
  The easiest fix is then to initialise the serial of the tree to 1.
  
  Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
  Cc: Tvrtko Ursulin tvrtko.ursu...@linux.intel.com
  ---
drivers/gpu/drm/i915/i915_gem_userptr.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
  
  diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c 
  b/drivers/gpu/drm/i915/i915_gem_userptr.c
  index 7c38f50014db..8e9e91029aed 100644
  --- a/drivers/gpu/drm/i915/i915_gem_userptr.c
  +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
  @@ -197,7 +197,7 @@ i915_mmu_notifier_get(struct drm_device *dev, struct 
  mm_struct *mm)
 mmu-mm = mm;
 mmu-objects = RB_ROOT;
 mmu-count = 0;
  -  mmu-serial = 0;
  +  mmu-serial = 1;
 INIT_LIST_HEAD(mmu-linear);
 mmu-is_linear = false;
  
  
  Looks good to me and I think safe to merge in any case, so:
  
  Reviewed-by: Tvrtko Ursulin tvrtko.ursu...@intel.com
  
  But it will be interesting to know what code managed to trigger this
  race, because as we discussed on IRC it would indicate some pretty
  wild userspace behaviour. Or lack of imagination on our part?
 
 A threaded client. One thread using userptr, the other doing munmap or
 free. Given enough embarrassment, it will happen every time.

Can I have an igt please to confirm this and ensure it stays fixed?
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 2/2] drm/i915: Initialise userptr mmu_notifier serial to 1

2014-07-11 Thread Tvrtko Ursulin


On 07/11/2014 03:02 PM, Daniel Vetter wrote:

On Fri, Jul 11, 2014 at 12:06:02PM +0100, Chris Wilson wrote:

On Fri, Jul 11, 2014 at 12:00:26PM +0100, Tvrtko Ursulin wrote:


On 07/11/2014 11:28 AM, Chris Wilson wrote:

During the range invalidate, we walk the list of buffers associated with
the mmu_notifer and find the ones that overlap the range. An
optimisation is made to speed up the iteration by assuming the previous
iter is still valid whilst the tree is unmodified. This exposes a bug
when a range invalidate is triggered after we have just created the
mmu_notifier, but before attaching any buffers. In that case, we presume
we have an unmodified list and start walking from the last iter which is
NULL. Oops.

The easiest fix is then to initialise the serial of the tree to 1.

Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
Cc: Tvrtko Ursulin tvrtko.ursu...@linux.intel.com
---
  drivers/gpu/drm/i915/i915_gem_userptr.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c 
b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 7c38f50014db..8e9e91029aed 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -197,7 +197,7 @@ i915_mmu_notifier_get(struct drm_device *dev, struct 
mm_struct *mm)
mmu-mm = mm;
mmu-objects = RB_ROOT;
mmu-count = 0;
-   mmu-serial = 0;
+   mmu-serial = 1;
INIT_LIST_HEAD(mmu-linear);
mmu-is_linear = false;



Looks good to me and I think safe to merge in any case, so:

Reviewed-by: Tvrtko Ursulin tvrtko.ursu...@intel.com

But it will be interesting to know what code managed to trigger this
race, because as we discussed on IRC it would indicate some pretty
wild userspace behaviour. Or lack of imagination on our part?


A threaded client. One thread using userptr, the other doing munmap or
free. Given enough embarrassment, it will happen every time.


Can I have an igt please to confirm this and ensure it stays fixed?


Sure, give me a day or two.

Tvrtko
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx