Re: [Intel-gfx] [PATCH] RFC: drm/i915: Switch obj->mm.lock lockdep annotations on its head
On Thu, Aug 15, 2019 at 9:35 PM Tang, CQ wrote:
>
> > -----Original Message-----
> > From: Intel-gfx [mailto:intel-gfx-boun...@lists.freedesktop.org] On Behalf
> > Of Daniel Vetter
> > Sent: Wednesday, August 14, 2019 12:25 PM
> > To: Intel Graphics Development
> > Cc: Daniel Vetter ; Vetter, Daniel
> > Subject: [Intel-gfx] [PATCH] RFC: drm/i915: Switch obj->mm.lock lockdep
> > annotations on its head
> >
> > The trouble with having a plain nesting flag for locks which do not
> > naturally nest (unlike block devices and their partitions, which is the
> > original motivation for nesting levels) is that lockdep will never spot a
> > true deadlock if you screw up.
> >
> > This patch is an attempt at doing better, by highlighting a bit more the
> > actual nature of the nesting that's going on. Essentially we have two
> > kinds of objects:
> >
> > - objects without pages allocated, which cannot be on any lru and are
> >   hence inaccessible to the shrinker.
> >
> > - objects which have pages allocated, which are on an lru, and which
> >   the shrinker can decide to throw out.
> >
> > For the former type of object, memory allocations while holding
> > obj->mm.lock are permissible. For the latter they are not. And
> > get/put_pages transitions between the two types of objects.
> >
> > This is still not entirely fool-proof since the rules might change.
> > But as long as we ever run such code at runtime lockdep should be able to
> > observe the inconsistency and complain (like with any other lockdep class
> > that we've split up in multiple classes). But there are a few clear
> > benefits:
> >
> > - We can drop the nesting flag parameter from
> >   __i915_gem_object_put_pages, because that function by definition is
> >   never going to allocate memory, and calling it on an object which
> >   doesn't have its pages allocated would be a bug.
> >
> > - We strictly catch more bugs, since there's now only one place in the
> >   entire tree which is annotated with the special class. All the
> >   other places that had explicit lockdep nesting annotations we're now
> >   going to leave up to lockdep again.
> >
> > - Specifically this catches stuff like calling get_pages from
> >   put_pages (which isn't really a good idea, if we can call put_pages
> >   so could the shrinker). I've seen patches do exactly that.
> >
> > Of course I fully expect CI will show me for the fool I am with this one
> > here :-)
> >
> > v2: There can only be one (lockdep only has a cache for the first
> > subclass, not for deeper ones, and we don't want to make these locks even
> > slower). Still separate enums for better documentation.
> >
> > Real fix: don't forget about phys objs and pin_map(), and fix the
> > shrinker to have the right annotations ... silly me.
> >
> > v3: Forgot userptr too ...
>
> I eventually looked at this patch. My question is about the shrinker call
> stack:
>
> Pin_pages(A)-->get_page(A)-->lock(objA->mm.lock,
> I915_MM_GET_PAGES)-->i915_gem_shrink()-->
> Lock(struct_mutex)-->put_pages(B)-->lock(objB->mm.lock)
>
> objA is locked with:
> mutex_lock_interruptible_nested(&obj->mm.lock, I915_MM_GET_PAGES);
>
> objB is locked with: mutex_lock(&obj->mm.lock);
> My understanding is that the objB locking is equivalent to:
> mutex_lock_nested(&obj->mm.lock, I915_MM_NORMAL);
>
> So you lock subclass=2 first on A, then subclass=0 next on B, which is the
> reverse order.
> Doesn't this cause a lockdep warning?

So mutex_lock_nested really is misnamed, it should be called
mutex_lock_subclass. There are no guarantees/promises about a certain
nesting, just that lockdep should use the 1st/2nd/... subclass of the
lockdep class the lock uses (there's a limit of 8 total, including the
main/default subclass).

From lockdep's pov a subclass works exactly like allocating a new lockdep
key. But with the difference that you can change it at runtime, i.e. when
an object makes a state transition, you can change the subclass you're
using. Like this patch tries to do here.

But ofc like every time you're creating a new lockdep class, you need to
spend a few words on why it's not hiding real issues from lockdep.
-Daniel

> --CQ
>
> > Cc: Chris Wilson
> > Cc: Tvrtko Ursulin
> > Cc: Joonas Lahtinen
> > Signed-off-by: Daniel Vetter
> > ---
> >  drivers/gpu/drm/i915/gem/i915_gem_object.c | 2 +-
> >  drivers/gpu/drm/i915/gem/i915_gem_object.h | 16 +---
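To make the "a subclass works like a separate lockdep key" point concrete, here is a minimal userspace sketch. It is a toy model only, not how the kernel's lockdep is implemented: the names toy_lockdep, toy_lockdep_init, toy_acquire_nested, MM_NORMAL and MM_GET_PAGES are hypothetical and exist purely for illustration. The only detail taken from the thread is the limit of 8 subclasses per class. The model records which subclass was acquired while which other subclass was held, and refuses an acquisition that repeats the held subclass (what would look like recursive locking) or that reverses an order seen earlier.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy model only; hypothetical names, not the kernel's lockdep code. */
#define TOY_MAX_SUBCLASSES 8 /* the thread mentions a limit of 8 subclasses */

/* Hypothetical stand-ins for the two subclasses used on obj->mm.lock. */
enum mm_subclass { MM_NORMAL = 0, MM_GET_PAGES = 1 };

struct toy_lockdep {
	/* seen[a][b]: subclass b was once acquired while subclass a was held. */
	bool seen[TOY_MAX_SUBCLASSES][TOY_MAX_SUBCLASSES];
};

static void toy_lockdep_init(struct toy_lockdep *ld)
{
	memset(ld, 0, sizeof(*ld));
}

/*
 * Record "acquire 'next' while holding 'held'".
 * Returns false where this toy checker would complain:
 *  - held == next: same key taken twice, indistinguishable from recursion;
 *  - the reverse order (next held, then held acquired) was seen before.
 */
static bool toy_acquire_nested(struct toy_lockdep *ld,
			       enum mm_subclass held, enum mm_subclass next)
{
	if (held == next)
		return false;
	if (ld->seen[next][held])
		return false;
	ld->seen[held][next] = true;
	return true;
}
```

In this model, the shrinker path from the question (hold A with MM_GET_PAGES, then take B with MM_NORMAL) is accepted every time, because the two subclasses are distinct keys and the order is always the same. Calling get_pages from put_pages would take MM_GET_PAGES while holding MM_NORMAL, which is the reverse order and is rejected. Taking MM_NORMAL while already holding MM_NORMAL is rejected as well.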
Re: [Intel-gfx] [PATCH] RFC: drm/i915: Switch obj->mm.lock lockdep annotations on its head
> -----Original Message-----
> From: Intel-gfx [mailto:intel-gfx-boun...@lists.freedesktop.org] On Behalf
> Of Daniel Vetter
> Sent: Wednesday, August 14, 2019 12:25 PM
> To: Intel Graphics Development
> Cc: Daniel Vetter ; Vetter, Daniel
> Subject: [Intel-gfx] [PATCH] RFC: drm/i915: Switch obj->mm.lock lockdep
> annotations on its head
>
> The trouble with having a plain nesting flag for locks which do not
> naturally nest (unlike block devices and their partitions, which is the
> original motivation for nesting levels) is that lockdep will never spot a
> true deadlock if you screw up.
>
> This patch is an attempt at doing better, by highlighting a bit more the
> actual nature of the nesting that's going on. Essentially we have two
> kinds of objects:
>
> - objects without pages allocated, which cannot be on any lru and are
>   hence inaccessible to the shrinker.
>
> - objects which have pages allocated, which are on an lru, and which
>   the shrinker can decide to throw out.
>
> For the former type of object, memory allocations while holding
> obj->mm.lock are permissible. For the latter they are not. And
> get/put_pages transitions between the two types of objects.
>
> This is still not entirely fool-proof since the rules might change.
> But as long as we ever run such code at runtime lockdep should be able to
> observe the inconsistency and complain (like with any other lockdep class
> that we've split up in multiple classes). But there are a few clear
> benefits:
>
> - We can drop the nesting flag parameter from
>   __i915_gem_object_put_pages, because that function by definition is
>   never going to allocate memory, and calling it on an object which
>   doesn't have its pages allocated would be a bug.
>
> - We strictly catch more bugs, since there's now only one place in the
>   entire tree which is annotated with the special class. All the
>   other places that had explicit lockdep nesting annotations we're now
>   going to leave up to lockdep again.
>
> - Specifically this catches stuff like calling get_pages from
>   put_pages (which isn't really a good idea, if we can call put_pages
>   so could the shrinker). I've seen patches do exactly that.
>
> Of course I fully expect CI will show me for the fool I am with this one
> here :-)
>
> v2: There can only be one (lockdep only has a cache for the first
> subclass, not for deeper ones, and we don't want to make these locks even
> slower). Still separate enums for better documentation.
>
> Real fix: don't forget about phys objs and pin_map(), and fix the shrinker
> to have the right annotations ... silly me.
>
> v3: Forgot userptr too ...

I eventually looked at this patch. My question is about the shrinker call
stack:

Pin_pages(A)-->get_page(A)-->lock(objA->mm.lock,
I915_MM_GET_PAGES)-->i915_gem_shrink()-->
Lock(struct_mutex)-->put_pages(B)-->lock(objB->mm.lock)

objA is locked with:
mutex_lock_interruptible_nested(&obj->mm.lock, I915_MM_GET_PAGES);

objB is locked with: mutex_lock(&obj->mm.lock);
My understanding is that the objB locking is equivalent to:
mutex_lock_nested(&obj->mm.lock, I915_MM_NORMAL);

So you lock subclass=2 first on A, then subclass=0 next on B, which is the
reverse order.
Doesn't this cause a lockdep warning?

--CQ

> Cc: Chris Wilson
> Cc: Tvrtko Ursulin
> Cc: Joonas Lahtinen
> Signed-off-by: Daniel Vetter
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_object.c | 2 +-
>  drivers/gpu/drm/i915/gem/i915_gem_object.h | 16 +---
>  drivers/gpu/drm/i915/gem/i915_gem_object_types.h | 10 +-
>  drivers/gpu/drm/i915/gem/i915_gem_pages.c | 9 -
>  drivers/gpu/drm/i915/gem/i915_gem_phys.c | 2 +-
>  drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 5 ++---
>  drivers/gpu/drm/i915/gem/i915_gem_userptr.c | 4 ++--
>  drivers/gpu/drm/i915/gem/selftests/huge_pages.c | 12 ++--
>  8 files changed, 38 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c
> b/drivers/gpu/drm/i915/gem/i915_gem_object.c
> index 3929c3a6b281..a1a835539e09 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
> @@ -191,7 +191,7 @@ static void __i915_gem_free_objects(struct
> drm_i915_private *i915,
>                 GEM_BUG_ON(!list_empty(&obj->lut_list));
>
>                 atomic_set(&obj->mm.pages_pin_count, 0);
> -               __i915_gem_object_put_pages(obj, I915_MM_NORMAL);
> +               __i915_gem_object_put_pages(obj);
>                 GEM_BUG_ON(i915_gem_object_has_pages(obj));
>                 bitmap_free(obj->bit_17);
>
> diff --git a/drivers/gpu/drm
Re: [Intel-gfx] [PATCH] RFC: drm/i915: Switch obj->mm.lock lockdep annotations on its head
Quoting Daniel Vetter (2019-08-15 08:23:01)
> On Wed, Aug 14, 2019 at 07:57:57PM +0100, Chris Wilson wrote:
> > Quoting Daniel Vetter (2019-08-14 19:49:41)
> > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> > > b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> > > index d474c6ac4100..1ea3c3c96a5a 100644
> > > --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> > > +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> > > @@ -157,7 +157,15 @@ struct drm_i915_gem_object {
> > >         unsigned int pin_global;
> > >
> > >         struct {
> > > -               struct mutex lock; /* protects the pages and their use */
> > > +               /*
> > > +                * Protects the pages and their use.
> >
> > "Their use" is still a misleading suggestion of mine. This should be
> > "protects the pinning of pages". The couple of other things it is used
> > for are tied to the concept of the pages being pinned; further use should
> > be discouraged; direct use prohibited.
>
> So something like "Protects the pinning of pages. Do not use directly, but
> instead go through the get/put_pages and pin/unpin interfaces." There's a
> few too many of those to list them all imo ...

I would say only pin/unpin. get/put is for the implementation.

Fwiw, elsewhere we adopted the name pin_mutex / pin_count. But there are
also variations on that naming scheme. The pattern of "that who is being
pinned may unpin others" is repeating.
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] RFC: drm/i915: Switch obj->mm.lock lockdep annotations on its head
On Wed, Aug 14, 2019 at 07:57:57PM +0100, Chris Wilson wrote:
> Quoting Daniel Vetter (2019-08-14 19:49:41)
> > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> > b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> > index d474c6ac4100..1ea3c3c96a5a 100644
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> > @@ -157,7 +157,15 @@ struct drm_i915_gem_object {
> >         unsigned int pin_global;
> >
> >         struct {
> > -               struct mutex lock; /* protects the pages and their use */
> > +               /*
> > +                * Protects the pages and their use.
>
> "Their use" is still a misleading suggestion of mine. This should be
> "protects the pinning of pages". The couple of other things it is used
> for are tied to the concept of the pages being pinned; further use should
> be discouraged; direct use prohibited.

So something like "Protects the pinning of pages. Do not use directly, but
instead go through the get/put_pages and pin/unpin interfaces." There's a
few too many of those to list them all imo ...
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
[Intel-gfx] [PATCH] RFC: drm/i915: Switch obj->mm.lock lockdep annotations on its head
The trouble with having a plain nesting flag for locks which do not
naturally nest (unlike block devices and their partitions, which is the
original motivation for nesting levels) is that lockdep will never spot a
true deadlock if you screw up.

This patch is an attempt at doing better, by highlighting a bit more the
actual nature of the nesting that's going on. Essentially we have two
kinds of objects:

- objects without pages allocated, which cannot be on any lru and are
  hence inaccessible to the shrinker.

- objects which have pages allocated, which are on an lru, and which
  the shrinker can decide to throw out.

For the former type of object, memory allocations while holding
obj->mm.lock are permissible. For the latter they are not. And
get/put_pages transitions between the two types of objects.

This is still not entirely fool-proof since the rules might change.
But as long as we ever run such code at runtime lockdep should be able to
observe the inconsistency and complain (like with any other lockdep class
that we've split up in multiple classes). But there are a few clear
benefits:

- We can drop the nesting flag parameter from
  __i915_gem_object_put_pages, because that function by definition is
  never going to allocate memory, and calling it on an object which
  doesn't have its pages allocated would be a bug.

- We strictly catch more bugs, since there's now only one place in the
  entire tree which is annotated with the special class. All the
  other places that had explicit lockdep nesting annotations we're now
  going to leave up to lockdep again.

- Specifically this catches stuff like calling get_pages from
  put_pages (which isn't really a good idea, if we can call put_pages
  so could the shrinker). I've seen patches do exactly that.

Of course I fully expect CI will show me for the fool I am with this one
here :-)

v2: There can only be one (lockdep only has a cache for the first
subclass, not for deeper ones, and we don't want to make these locks even
slower). Still separate enums for better documentation.

Real fix: don't forget about phys objs and pin_map(), and fix the shrinker
to have the right annotations ... silly me.

v3: Forgot userptr too ...

Cc: Chris Wilson
Cc: Tvrtko Ursulin
Cc: Joonas Lahtinen
Signed-off-by: Daniel Vetter
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c | 2 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.h | 16 +---
 drivers/gpu/drm/i915/gem/i915_gem_object_types.h | 10 +-
 drivers/gpu/drm/i915/gem/i915_gem_pages.c | 9 -
 drivers/gpu/drm/i915/gem/i915_gem_phys.c | 2 +-
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 5 ++---
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c | 4 ++--
 drivers/gpu/drm/i915/gem/selftests/huge_pages.c | 12 ++--
 8 files changed, 38 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 3929c3a6b281..a1a835539e09 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -191,7 +191,7 @@ static void __i915_gem_free_objects(struct drm_i915_private *i915,
                GEM_BUG_ON(!list_empty(&obj->lut_list));

                atomic_set(&obj->mm.pages_pin_count, 0);
-               __i915_gem_object_put_pages(obj, I915_MM_NORMAL);
+               __i915_gem_object_put_pages(obj);
                GEM_BUG_ON(i915_gem_object_has_pages(obj));
                bitmap_free(obj->bit_17);

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 3714cf234d64..5ce511ca7fa8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -281,11 +281,21 @@ i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)

 enum i915_mm_subclass { /* lockdep subclass for obj->mm.lock/struct_mutex */
        I915_MM_NORMAL = 0,
-       I915_MM_SHRINKER /* called "recursively" from direct-reclaim-esque */
+       /*
+        * Only used by struct_mutex, when called "recursively" from
+        * direct-reclaim-esque. Safe because there is only ever one
+        * struct_mutex in the entire system.
+        */
+       I915_MM_SHRINKER = 1,
+       /*
+        * Used for obj->mm.lock when allocating pages. Safe because the object
+        * isn't yet on any LRU, and therefore the shrinker can't deadlock on
+        * it. As soon as the object has pages, obj->mm.lock nests within
+        * fs_reclaim.
+        */
+       I915_MM_GET_PAGES = 1,
 };

-int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
-                               enum i915_mm_subclass subclass);
+int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj);
 void i915_gem_object_truncate(struct drm_i915_gem_object *obj);
 void i915_gem_object_writeback(struct drm_i915_gem_object *obj);

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
Re: [Intel-gfx] [PATCH] RFC: drm/i915: Switch obj->mm.lock lockdep annotations on its head
Quoting Daniel Vetter (2019-08-14 19:49:41)
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> index d474c6ac4100..1ea3c3c96a5a 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
> @@ -157,7 +157,15 @@ struct drm_i915_gem_object {
>         unsigned int pin_global;
>
>         struct {
> -               struct mutex lock; /* protects the pages and their use */
> +               /*
> +                * Protects the pages and their use.

"Their use" is still a misleading suggestion of mine. This should be
"protects the pinning of pages". The couple of other things it is used
for are tied to the concept of the pages being pinned; further use should
be discouraged; direct use prohibited.
-Chris
[Intel-gfx] [PATCH] RFC: drm/i915: Switch obj->mm.lock lockdep annotations on its head
The trouble with having a plain nesting flag for locks which do not
naturally nest (unlike block devices and their partitions, which is the
original motivation for nesting levels) is that lockdep will never spot a
true deadlock if you screw up.

This patch is an attempt at doing better, by highlighting a bit more the
actual nature of the nesting that's going on. Essentially we have two
kinds of objects:

- objects without pages allocated, which cannot be on any lru and are
  hence inaccessible to the shrinker.

- objects which have pages allocated, which are on an lru, and which
  the shrinker can decide to throw out.

For the former type of object, memory allocations while holding
obj->mm.lock are permissible. For the latter they are not. And
get/put_pages transitions between the two types of objects.

This is still not entirely fool-proof since the rules might change.
But as long as we ever run such code at runtime lockdep should be able to
observe the inconsistency and complain (like with any other lockdep class
that we've split up in multiple classes). But there are a few clear
benefits:

- We can drop the nesting flag parameter from
  __i915_gem_object_put_pages, because that function by definition is
  never going to allocate memory, and calling it on an object which
  doesn't have its pages allocated would be a bug.

- We strictly catch more bugs, since there's now only one place in the
  entire tree which is annotated with the special class. All the
  other places that had explicit lockdep nesting annotations we're now
  going to leave up to lockdep again.

- Specifically this catches stuff like calling get_pages from
  put_pages (which isn't really a good idea, if we can call put_pages
  so could the shrinker). I've seen patches do exactly that.

Of course I fully expect CI will show me for the fool I am with this one
here :-)

v2: There can only be one (lockdep only has a cache for the first
subclass, not for deeper ones, and we don't want to make these locks even
slower). Still separate enums for better documentation.

Real fix: don't forget about phys objs and pin_map(), and fix the shrinker
to have the right annotations ... silly me.

Cc: Chris Wilson
Cc: Tvrtko Ursulin
Cc: Joonas Lahtinen
Signed-off-by: Daniel Vetter
---
 drivers/gpu/drm/i915/gem/i915_gem_object.c | 2 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.h | 16 +---
 drivers/gpu/drm/i915/gem/i915_gem_object_types.h | 10 +-
 drivers/gpu/drm/i915/gem/i915_gem_pages.c | 9 -
 drivers/gpu/drm/i915/gem/i915_gem_phys.c | 2 +-
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 5 ++---
 drivers/gpu/drm/i915/gem/i915_gem_userptr.c | 2 +-
 drivers/gpu/drm/i915/gem/selftests/huge_pages.c | 12 ++--
 8 files changed, 37 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c
index 3929c3a6b281..a1a835539e09 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c
@@ -191,7 +191,7 @@ static void __i915_gem_free_objects(struct drm_i915_private *i915,
                GEM_BUG_ON(!list_empty(&obj->lut_list));

                atomic_set(&obj->mm.pages_pin_count, 0);
-               __i915_gem_object_put_pages(obj, I915_MM_NORMAL);
+               __i915_gem_object_put_pages(obj);
                GEM_BUG_ON(i915_gem_object_has_pages(obj));
                bitmap_free(obj->bit_17);

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index 3714cf234d64..5ce511ca7fa8 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -281,11 +281,21 @@ i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)

 enum i915_mm_subclass { /* lockdep subclass for obj->mm.lock/struct_mutex */
        I915_MM_NORMAL = 0,
-       I915_MM_SHRINKER /* called "recursively" from direct-reclaim-esque */
+       /*
+        * Only used by struct_mutex, when called "recursively" from
+        * direct-reclaim-esque. Safe because there is only ever one
+        * struct_mutex in the entire system.
+        */
+       I915_MM_SHRINKER = 1,
+       /*
+        * Used for obj->mm.lock when allocating pages. Safe because the object
+        * isn't yet on any LRU, and therefore the shrinker can't deadlock on
+        * it. As soon as the object has pages, obj->mm.lock nests within
+        * fs_reclaim.
+        */
+       I915_MM_GET_PAGES = 1,
 };

-int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
-                               enum i915_mm_subclass subclass);
+int __i915_gem_object_put_pages(struct drm_i915_gem_object *obj);
 void i915_gem_object_truncate(struct drm_i915_gem_object *obj);
 void i915_gem_object_writeback(struct drm_i915_gem_object *obj);

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index