Re: [Intel-gfx] [PATCH 2/2] drm/i915: Initialise userptr mmu_notifier serial to 1
On Fri, Jul 11, 2014 at 03:06:55PM +0100, Tvrtko Ursulin wrote: On 07/11/2014 03:02 PM, Daniel Vetter wrote: On Fri, Jul 11, 2014 at 12:06:02PM +0100, Chris Wilson wrote: On Fri, Jul 11, 2014 at 12:00:26PM +0100, Tvrtko Ursulin wrote: On 07/11/2014 11:28 AM, Chris Wilson wrote: During the range invalidate, we walk the list of buffers associated with the mmu_notifer and find the ones that overlap the range. An optimisation is made to speed up the iteration by assuming the previous iter is still valid whilst the tree is unmodified. This exposes a bug when a range invalidate is triggered after we have just created the mmu_notifier, but before attaching any buffers. In that case, we presume we have an unmodified list and start walking from the last iter which is NULL. Oops. The easiest fix is then to initialise the serial of the tree to 1. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Cc: Tvrtko Ursulin tvrtko.ursu...@linux.intel.com --- drivers/gpu/drm/i915/i915_gem_userptr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c index 7c38f50014db..8e9e91029aed 100644 --- a/drivers/gpu/drm/i915/i915_gem_userptr.c +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c @@ -197,7 +197,7 @@ i915_mmu_notifier_get(struct drm_device *dev, struct mm_struct *mm) mmu-mm = mm; mmu-objects = RB_ROOT; mmu-count = 0; - mmu-serial = 0; + mmu-serial = 1; INIT_LIST_HEAD(mmu-linear); mmu-is_linear = false; Looks good to me and I think safe to merge in any case, so: Reviewed-by: Tvrtko Ursulin tvrtko.ursu...@intel.com But it will be interesting to know what code managed to trigger this race, because as we discussed on IRC it would indicate some pretty wild userspace behaviour. Or lack of imagination on our part? A threaded client. One thread using userptr, the other doing munmap or free. Given enough embarrassment, it will happen every time. Can I have an igt please to confirm this and ensure it stays fixed? Sure, give me a day or two. Now that we have the testcase both patches merged, thanks. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 2/2] drm/i915: Initialise userptr mmu_notifier serial to 1
During the range invalidate, we walk the list of buffers associated with the mmu_notifer and find the ones that overlap the range. An optimisation is made to speed up the iteration by assuming the previous iter is still valid whilst the tree is unmodified. This exposes a bug when a range invalidate is triggered after we have just created the mmu_notifier, but before attaching any buffers. In that case, we presume we have an unmodified list and start walking from the last iter which is NULL. Oops. The easiest fix is then to initialise the serial of the tree to 1. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Cc: Tvrtko Ursulin tvrtko.ursu...@linux.intel.com --- drivers/gpu/drm/i915/i915_gem_userptr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c index 7c38f50014db..8e9e91029aed 100644 --- a/drivers/gpu/drm/i915/i915_gem_userptr.c +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c @@ -197,7 +197,7 @@ i915_mmu_notifier_get(struct drm_device *dev, struct mm_struct *mm) mmu-mm = mm; mmu-objects = RB_ROOT; mmu-count = 0; - mmu-serial = 0; + mmu-serial = 1; INIT_LIST_HEAD(mmu-linear); mmu-is_linear = false; -- 2.0.1 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 2/2] drm/i915: Initialise userptr mmu_notifier serial to 1
On 07/11/2014 11:28 AM, Chris Wilson wrote: During the range invalidate, we walk the list of buffers associated with the mmu_notifer and find the ones that overlap the range. An optimisation is made to speed up the iteration by assuming the previous iter is still valid whilst the tree is unmodified. This exposes a bug when a range invalidate is triggered after we have just created the mmu_notifier, but before attaching any buffers. In that case, we presume we have an unmodified list and start walking from the last iter which is NULL. Oops. The easiest fix is then to initialise the serial of the tree to 1. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Cc: Tvrtko Ursulin tvrtko.ursu...@linux.intel.com --- drivers/gpu/drm/i915/i915_gem_userptr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c index 7c38f50014db..8e9e91029aed 100644 --- a/drivers/gpu/drm/i915/i915_gem_userptr.c +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c @@ -197,7 +197,7 @@ i915_mmu_notifier_get(struct drm_device *dev, struct mm_struct *mm) mmu-mm = mm; mmu-objects = RB_ROOT; mmu-count = 0; - mmu-serial = 0; + mmu-serial = 1; INIT_LIST_HEAD(mmu-linear); mmu-is_linear = false; Looks good to me and I think safe to merge in any case, so: Reviewed-by: Tvrtko Ursulin tvrtko.ursu...@intel.com But it will be interesting to know what code managed to trigger this race, because as we discussed on IRC it would indicate some pretty wild userspace behaviour. Or lack of imagination on our part? Tvrtko ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 2/2] drm/i915: Initialise userptr mmu_notifier serial to 1
On Fri, Jul 11, 2014 at 12:00:26PM +0100, Tvrtko Ursulin wrote: On 07/11/2014 11:28 AM, Chris Wilson wrote: During the range invalidate, we walk the list of buffers associated with the mmu_notifer and find the ones that overlap the range. An optimisation is made to speed up the iteration by assuming the previous iter is still valid whilst the tree is unmodified. This exposes a bug when a range invalidate is triggered after we have just created the mmu_notifier, but before attaching any buffers. In that case, we presume we have an unmodified list and start walking from the last iter which is NULL. Oops. The easiest fix is then to initialise the serial of the tree to 1. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Cc: Tvrtko Ursulin tvrtko.ursu...@linux.intel.com --- drivers/gpu/drm/i915/i915_gem_userptr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c index 7c38f50014db..8e9e91029aed 100644 --- a/drivers/gpu/drm/i915/i915_gem_userptr.c +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c @@ -197,7 +197,7 @@ i915_mmu_notifier_get(struct drm_device *dev, struct mm_struct *mm) mmu-mm = mm; mmu-objects = RB_ROOT; mmu-count = 0; -mmu-serial = 0; +mmu-serial = 1; INIT_LIST_HEAD(mmu-linear); mmu-is_linear = false; Looks good to me and I think safe to merge in any case, so: Reviewed-by: Tvrtko Ursulin tvrtko.ursu...@intel.com But it will be interesting to know what code managed to trigger this race, because as we discussed on IRC it would indicate some pretty wild userspace behaviour. Or lack of imagination on our part? A threaded client. One thread using userptr, the other doing munmap or free. Given enough embarrassment, it will happen every time. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 2/2] drm/i915: Initialise userptr mmu_notifier serial to 1
On 07/11/2014 12:06 PM, Chris Wilson wrote: On Fri, Jul 11, 2014 at 12:00:26PM +0100, Tvrtko Ursulin wrote: On 07/11/2014 11:28 AM, Chris Wilson wrote: During the range invalidate, we walk the list of buffers associated with the mmu_notifer and find the ones that overlap the range. An optimisation is made to speed up the iteration by assuming the previous iter is still valid whilst the tree is unmodified. This exposes a bug when a range invalidate is triggered after we have just created the mmu_notifier, but before attaching any buffers. In that case, we presume we have an unmodified list and start walking from the last iter which is NULL. Oops. The easiest fix is then to initialise the serial of the tree to 1. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Cc: Tvrtko Ursulin tvrtko.ursu...@linux.intel.com --- drivers/gpu/drm/i915/i915_gem_userptr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c index 7c38f50014db..8e9e91029aed 100644 --- a/drivers/gpu/drm/i915/i915_gem_userptr.c +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c @@ -197,7 +197,7 @@ i915_mmu_notifier_get(struct drm_device *dev, struct mm_struct *mm) mmu-mm = mm; mmu-objects = RB_ROOT; mmu-count = 0; - mmu-serial = 0; + mmu-serial = 1; INIT_LIST_HEAD(mmu-linear); mmu-is_linear = false; Looks good to me and I think safe to merge in any case, so: Reviewed-by: Tvrtko Ursulin tvrtko.ursu...@intel.com But it will be interesting to know what code managed to trigger this race, because as we discussed on IRC it would indicate some pretty wild userspace behaviour. Or lack of imagination on our part? A threaded client. One thread using userptr, the other doing munmap or free. Given enough embarrassment, it will happen every time. Yes fine, but I struggle to imagine what would be the intention of such code or how did it manage to fail in such way. I hope the only difference is not that userptr upgraded the failure mode for heap corruption or memory management races in general. Tvrtko ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 2/2] drm/i915: Initialise userptr mmu_notifier serial to 1
On Fri, Jul 11, 2014 at 12:31:12PM +0100, Tvrtko Ursulin wrote: On 07/11/2014 12:06 PM, Chris Wilson wrote: On Fri, Jul 11, 2014 at 12:00:26PM +0100, Tvrtko Ursulin wrote: But it will be interesting to know what code managed to trigger this race, because as we discussed on IRC it would indicate some pretty wild userspace behaviour. Or lack of imagination on our part? A threaded client. One thread using userptr, the other doing munmap or free. Given enough embarrassment, it will happen every time. Yes fine, but I struggle to imagine what would be the intention of such code or how did it manage to fail in such way. I hope the only difference is not that userptr upgraded the failure mode for heap corruption or memory management races in general. The mmu notifier is called everytime a process sneezes. It does not imply that our object is being invalidated, just that some portion of the current-mm is being modified. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 2/2] drm/i915: Initialise userptr mmu_notifier serial to 1
On 07/11/2014 12:35 PM, Chris Wilson wrote: On Fri, Jul 11, 2014 at 12:31:12PM +0100, Tvrtko Ursulin wrote: On 07/11/2014 12:06 PM, Chris Wilson wrote: On Fri, Jul 11, 2014 at 12:00:26PM +0100, Tvrtko Ursulin wrote: But it will be interesting to know what code managed to trigger this race, because as we discussed on IRC it would indicate some pretty wild userspace behaviour. Or lack of imagination on our part? A threaded client. One thread using userptr, the other doing munmap or free. Given enough embarrassment, it will happen every time. Yes fine, but I struggle to imagine what would be the intention of such code or how did it manage to fail in such way. I hope the only difference is not that userptr upgraded the failure mode for heap corruption or memory management races in general. The mmu notifier is called everytime a process sneezes. It does not imply that our object is being invalidated, just that some portion of the current-mm is being modified. Ah yes, I did not see the big picture. Tvrtko ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 2/2] drm/i915: Initialise userptr mmu_notifier serial to 1
On Fri, Jul 11, 2014 at 12:06:02PM +0100, Chris Wilson wrote: On Fri, Jul 11, 2014 at 12:00:26PM +0100, Tvrtko Ursulin wrote: On 07/11/2014 11:28 AM, Chris Wilson wrote: During the range invalidate, we walk the list of buffers associated with the mmu_notifer and find the ones that overlap the range. An optimisation is made to speed up the iteration by assuming the previous iter is still valid whilst the tree is unmodified. This exposes a bug when a range invalidate is triggered after we have just created the mmu_notifier, but before attaching any buffers. In that case, we presume we have an unmodified list and start walking from the last iter which is NULL. Oops. The easiest fix is then to initialise the serial of the tree to 1. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Cc: Tvrtko Ursulin tvrtko.ursu...@linux.intel.com --- drivers/gpu/drm/i915/i915_gem_userptr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c index 7c38f50014db..8e9e91029aed 100644 --- a/drivers/gpu/drm/i915/i915_gem_userptr.c +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c @@ -197,7 +197,7 @@ i915_mmu_notifier_get(struct drm_device *dev, struct mm_struct *mm) mmu-mm = mm; mmu-objects = RB_ROOT; mmu-count = 0; - mmu-serial = 0; + mmu-serial = 1; INIT_LIST_HEAD(mmu-linear); mmu-is_linear = false; Looks good to me and I think safe to merge in any case, so: Reviewed-by: Tvrtko Ursulin tvrtko.ursu...@intel.com But it will be interesting to know what code managed to trigger this race, because as we discussed on IRC it would indicate some pretty wild userspace behaviour. Or lack of imagination on our part? A threaded client. One thread using userptr, the other doing munmap or free. Given enough embarrassment, it will happen every time. Can I have an igt please to confirm this and ensure it stays fixed? -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 2/2] drm/i915: Initialise userptr mmu_notifier serial to 1
On 07/11/2014 03:02 PM, Daniel Vetter wrote: On Fri, Jul 11, 2014 at 12:06:02PM +0100, Chris Wilson wrote: On Fri, Jul 11, 2014 at 12:00:26PM +0100, Tvrtko Ursulin wrote: On 07/11/2014 11:28 AM, Chris Wilson wrote: During the range invalidate, we walk the list of buffers associated with the mmu_notifer and find the ones that overlap the range. An optimisation is made to speed up the iteration by assuming the previous iter is still valid whilst the tree is unmodified. This exposes a bug when a range invalidate is triggered after we have just created the mmu_notifier, but before attaching any buffers. In that case, we presume we have an unmodified list and start walking from the last iter which is NULL. Oops. The easiest fix is then to initialise the serial of the tree to 1. Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk Cc: Tvrtko Ursulin tvrtko.ursu...@linux.intel.com --- drivers/gpu/drm/i915/i915_gem_userptr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c index 7c38f50014db..8e9e91029aed 100644 --- a/drivers/gpu/drm/i915/i915_gem_userptr.c +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c @@ -197,7 +197,7 @@ i915_mmu_notifier_get(struct drm_device *dev, struct mm_struct *mm) mmu-mm = mm; mmu-objects = RB_ROOT; mmu-count = 0; - mmu-serial = 0; + mmu-serial = 1; INIT_LIST_HEAD(mmu-linear); mmu-is_linear = false; Looks good to me and I think safe to merge in any case, so: Reviewed-by: Tvrtko Ursulin tvrtko.ursu...@intel.com But it will be interesting to know what code managed to trigger this race, because as we discussed on IRC it would indicate some pretty wild userspace behaviour. Or lack of imagination on our part? A threaded client. One thread using userptr, the other doing munmap or free. Given enough embarrassment, it will happen every time. Can I have an igt please to confirm this and ensure it stays fixed? Sure, give me a day or two. Tvrtko ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx