[PATCH 4/4] drm/ttm: Optimize reservation slightly
Reservation locking currently always takes place under the LRU spinlock.
Hence, strictly there is no need for an atomic_cmpxchg call; we can use
atomic_read followed by atomic_write since nobody else will ever reserve
without the lru spinlock held.
At least on Intel this should remove a locked bus cycle on successful
reserve.

Note that this commit may be obsoleted by the cross-device reservation work.

Signed-off-by: Thomas Hellstrom
---
 drivers/gpu/drm/ttm/ttm_bo.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index bf6e4b5..46008ea 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -220,7 +220,7 @@ int ttm_bo_reserve_locked(struct ttm_buffer_object *bo,
 	struct ttm_bo_global *glob = bo->glob;
 	int ret;

-	while (unlikely(atomic_cmpxchg(&bo->reserved, 0, 1) != 0)) {
+	while (unlikely(atomic_read(&bo->reserved) != 0)) {
 		/**
 		 * Deadlock avoidance for multi-bo reserving.
 		 */
@@ -249,6 +249,7 @@ int ttm_bo_reserve_locked(struct ttm_buffer_object *bo,
 		return ret;
 	}

+	atomic_set(&bo->reserved, 1);
 	if (use_sequence) {
 		/**
 		 * Wake up waiters that may need to recheck for deadlock,
--
1.7.4.4
Re: [PATCH 4/4] drm/ttm: Optimize reservation slightly
On 11/05/2012 03:01 PM, Maarten Lankhorst wrote:
> Hey,
>
> Op 05-11-12 14:31, Thomas Hellstrom schreef:
>> Reservation locking currently always takes place under the LRU spinlock.
>> Hence, strictly there is no need for an atomic_cmpxchg call; we can use
>> atomic_read followed by atomic_write since nobody else will ever reserve
>> without the lru spinlock held.
>> At least on Intel this should remove a locked bus cycle on successful
>> reserve.
>>
>> Signed-off-by: Thomas Hellstrom
>>
> Is that really a good thing to submit when I am busy killing lru lock
> around reserve? :-)
>
If your patch series makes it into the same kernel, let's kill this
patch. Otherwise it may live at least for a kernel release. It's not a
big thing to rebase against, and I won't complain if your patch adds
another atomic read-modify-write op here. :)

/Thomas
Re: [PATCH 4/4] drm/ttm: Optimize reservation slightly
Hey,

Op 05-11-12 14:31, Thomas Hellstrom schreef:
> Reservation locking currently always takes place under the LRU spinlock.
> Hence, strictly there is no need for an atomic_cmpxchg call; we can use
> atomic_read followed by atomic_write since nobody else will ever reserve
> without the lru spinlock held.
> At least on Intel this should remove a locked bus cycle on successful
> reserve.
>
> Signed-off-by: Thomas Hellstrom
>
Is that really a good thing to submit when I am busy killing lru lock
around reserve? :-)

-	while (unlikely(atomic_cmpxchg(&bo->reserved, 0, 1) != 0)) {
+	while (unlikely(atomic_xchg(&bo->reserved, 1) != 0)) {

Works without lru lock too! In fact mutexes are done in a similar
way[1], except with some more magic, and unlocked state is 1, not 0.

However I do think that to get that right (saves a irq disable in
unlock path, and less wakeups in contended case), I should really just
post the mutex extension patches for reservations and ride the flames.
It's getting too close to real mutexes so I really want it to be a
mutex in that case. So lets convert it.. Soon! :-)

~Maarten

[1] See linux/include/asm-generic/mutex-xchg.h and
linux/include/asm-generic/mutex-dec.h for how archs generally
implement mutex fastpaths.