Re: [PATCH 3/4] drm/ttm, drm/vmwgfx: Use RCU locking for object lookups v3

2012-11-20 Thread Dave Airlie
On Tue, Nov 20, 2012 at 4:44 PM, Thomas Hellstrom wrote:
> On 11/20/2012 07:19 AM, Dave Airlie wrote:
>>
>> On Tue, Nov 6, 2012 at 9:31 PM, Thomas Hellstrom wrote:
>>>
>>> The most commonly used lookup+get put+potential_destroy path of TTM objects
>>> is converted to use RCU locks. This will substantially decrease the amount
>>> of locked bus cycles during normal operation.
>>> Since we use kfree_rcu to free the objects, no RCU synchronization is
>>> needed at module unload time.
>>
>> As this is the first use of RCU in a drm driver from what I can see,
>> let me remind you that the RCU patent agreement AFAIK only covers GPL works.
>>
>> So non-GPL works or other OSes porting this code should take note of this.
>>
>> Dave.
>
>
> From VMware's side this won't be a problem, since other VMware kernel
> modules (VMCI, IIRC) use RCU.
>
> In any case, I have a new version of the "vmwgfx optimization" patch series
> that mostly adds documentation and annotation (via a drm_ht_xxx_rcu
> interface for hashtab), after an internal review by Dmitry Torkov. I see
> you've already applied the original patch series. Do you want me to send
> out the new one or rebase it against current drm-next?

Can you rebase it on top of -next?

Dave.


Re: [PATCH 3/4] drm/ttm, drm/vmwgfx: Use RCU locking for object lookups v3

2012-11-20 Thread Dave Airlie
On Tue, Nov 6, 2012 at 9:31 PM, Thomas Hellstrom wrote:
> The most commonly used lookup+get put+potential_destroy path of TTM objects
> is converted to use RCU locks. This will substantially decrease the amount
> of locked bus cycles during normal operation.
> Since we use kfree_rcu to free the objects, no RCU synchronization is needed
> at module unload time.

As this is the first use of RCU in a drm driver from what I can see,
let me remind you that the RCU patent agreement AFAIK only covers GPL works.

So non-GPL works or other OSes porting this code should take note of this.

Dave.


Re: [PATCH 3/4] drm/ttm, drm/vmwgfx: Use RCU locking for object lookups v3

2012-11-20 Thread Thomas Hellstrom
On 11/20/2012 07:19 AM, Dave Airlie wrote:
> On Tue, Nov 6, 2012 at 9:31 PM, Thomas Hellstrom wrote:
>> The most commonly used lookup+get put+potential_destroy path of TTM objects
>> is converted to use RCU locks. This will substantially decrease the amount
>> of locked bus cycles during normal operation.
>> Since we use kfree_rcu to free the objects, no RCU synchronization is needed
>> at module unload time.
> As this is the first use of RCU in a drm driver from what I can see,
> let me remind you that the RCU patent agreement AFAIK only covers GPL works.
>
> So non-GPL works or other OSes porting this code should take note of this.
>
> Dave.

From VMware's side this won't be a problem, since other VMware kernel
modules (VMCI, IIRC) use RCU.

In any case, I have a new version of the "vmwgfx optimization" patch series
that mostly adds documentation and annotation (via a drm_ht_xxx_rcu
interface for hashtab), after an internal review by Dmitry Torkov. I see
you've already applied the original patch series. Do you want me to send out
the new one or rebase it against current drm-next?

Thanks,
Thomas





[PATCH 3/4] drm/ttm, drm/vmwgfx: Use RCU locking for object lookups v3

2012-11-06 Thread Thomas Hellstrom
The most commonly used lookup+get put+potential_destroy path of TTM objects
is converted to use RCU locks. This will substantially decrease the amount
of locked bus cycles during normal operation.
Since we use kfree_rcu to free the objects, no RCU synchronization is needed
at module unload time.

v2: Don't touch include/linux/kref.h
v3: Adapt to kref_get_unless_zero return value change

Signed-off-by: Thomas Hellstrom 
---
 drivers/gpu/drm/ttm/ttm_object.c |   30 +++---
 drivers/gpu/drm/vmwgfx/vmwgfx_resource.c |8 
 include/drm/ttm/ttm_object.h |4 
 3 files changed, 19 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_object.c b/drivers/gpu/drm/ttm/ttm_object.c
index c785787..f18eeb4 100644
--- a/drivers/gpu/drm/ttm/ttm_object.c
+++ b/drivers/gpu/drm/ttm/ttm_object.c
@@ -80,7 +80,7 @@ struct ttm_object_file {
  */

 struct ttm_object_device {
-   rwlock_t object_lock;
+   spinlock_t object_lock;
struct drm_open_hash object_hash;
atomic_t object_count;
struct ttm_mem_global *mem_glob;
@@ -157,12 +157,12 @@ int ttm_base_object_init(struct ttm_object_file *tfile,
	base->refcount_release = refcount_release;
	base->ref_obj_release = ref_obj_release;
	base->object_type = object_type;
-	write_lock(&tdev->object_lock);
+	spin_lock(&tdev->object_lock);
	kref_init(&base->refcount);
	ret = drm_ht_just_insert_please(&tdev->object_hash,
					&base->hash,
					(unsigned long)base, 31, 0, 0);
-	write_unlock(&tdev->object_lock);
+	spin_unlock(&tdev->object_lock);
	if (unlikely(ret != 0))
		goto out_err0;

@@ -186,30 +186,22 @@ static void ttm_release_base(struct kref *kref)
	    container_of(kref, struct ttm_base_object, refcount);
	struct ttm_object_device *tdev = base->tfile->tdev;

+	spin_lock(&tdev->object_lock);
	(void)drm_ht_remove_item(&tdev->object_hash, &base->hash);
-	write_unlock(&tdev->object_lock);
+	spin_unlock(&tdev->object_lock);
	if (base->refcount_release) {
		ttm_object_file_unref(&base->tfile);
		base->refcount_release(&base);
	}
-	write_lock(&tdev->object_lock);
 }

 void ttm_base_object_unref(struct ttm_base_object **p_base)
 {
	struct ttm_base_object *base = *p_base;
-	struct ttm_object_device *tdev = base->tfile->tdev;

	*p_base = NULL;

-	/*
-	 * Need to take the lock here to avoid racing with
-	 * users trying to look up the object.
-	 */
-
-	write_lock(&tdev->object_lock);
	kref_put(&base->refcount, ttm_release_base);
-	write_unlock(&tdev->object_lock);
 }
 EXPORT_SYMBOL(ttm_base_object_unref);

@@ -221,14 +213,14 @@ struct ttm_base_object *ttm_base_object_lookup(struct ttm_object_file *tfile,
	struct drm_hash_item *hash;
	int ret;

-	read_lock(&tdev->object_lock);
+	rcu_read_lock();
	ret = drm_ht_find_item(&tdev->object_hash, key, &hash);

	if (likely(ret == 0)) {
		base = drm_hash_entry(hash, struct ttm_base_object, hash);
-		kref_get(&base->refcount);
+		ret = kref_get_unless_zero(&base->refcount) ? 0 : -EINVAL;
	}
-	read_unlock(&tdev->object_lock);
+	rcu_read_unlock();

	if (unlikely(ret != 0))
		return NULL;
@@ -426,7 +418,7 @@ struct ttm_object_device *ttm_object_device_init(struct ttm_mem_global
	return NULL;

	tdev->mem_glob = mem_glob;
-	rwlock_init(&tdev->object_lock);
+	spin_lock_init(&tdev->object_lock);
	atomic_set(&tdev->object_count, 0);
	ret = drm_ht_create(&tdev->object_hash, hash_order);

@@ -444,9 +436,9 @@ void ttm_object_device_release(struct ttm_object_device **p_tdev)

	*p_tdev = NULL;

-	write_lock(&tdev->object_lock);
+	spin_lock(&tdev->object_lock);
	drm_ht_remove(&tdev->object_hash);
-	write_unlock(&tdev->object_lock);
+	spin_unlock(&tdev->object_lock);

	kfree(tdev);
 }
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
index da3c6b5..ae675c6 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
@@ -351,7 +351,7 @@ static void vmw_user_context_free(struct vmw_resource *res)
container_of(res, struct vmw_user_context, res);
struct vmw_private *dev_priv = res->dev_priv;

-   kfree(ctx);
+   ttm_base_object_kfree(ctx, base);
ttm_mem_global_free(vmw_mem_glob(dev_priv),
vmw_user_context_size);
 }
@@ -1147,7 +1147,7 @@ static void vmw_user_surface_free(struct vmw_resource *res)
kfree(srf->offsets);
kfree(srf->sizes);
kfree(srf->snooper.image);
-   kfree(user_srf);
+   ttm_base_object_kfree(user_srf, base);
ttm_mem_global_free(vmw_mem_glob(dev_priv), size);
 }

@@ -1575,7 +1575,7 @@ static void vmw_user_dmabuf_destroy(struct ttm_buffer_object *bo)
 {


[PATCH 3/4] drm/ttm, drm/vmwgfx: Use RCU locking for object lookups v2

2012-11-05 Thread Thomas Hellstrom
The most commonly used lookup+get put+potential_destroy path of TTM objects
is converted to use RCU locks. This will substantially decrease the amount
of locked bus cycles during normal operation.
Since we use kfree_rcu to free the objects, no RCU synchronization is needed
at module unload time.

v2: Don't touch include/linux/kref.h

Signed-off-by: Thomas Hellstrom 
---
 drivers/gpu/drm/ttm/ttm_object.c |   30 +++---
 drivers/gpu/drm/vmwgfx/vmwgfx_resource.c |8 
 include/drm/ttm/ttm_object.h |4 
 3 files changed, 19 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_object.c b/drivers/gpu/drm/ttm/ttm_object.c
index c785787..9d7f674 100644
--- a/drivers/gpu/drm/ttm/ttm_object.c
+++ b/drivers/gpu/drm/ttm/ttm_object.c
@@ -80,7 +80,7 @@ struct ttm_object_file {
  */

 struct ttm_object_device {
-   rwlock_t object_lock;
+   spinlock_t object_lock;
struct drm_open_hash object_hash;
atomic_t object_count;
struct ttm_mem_global *mem_glob;
@@ -157,12 +157,12 @@ int ttm_base_object_init(struct ttm_object_file *tfile,
	base->refcount_release = refcount_release;
	base->ref_obj_release = ref_obj_release;
	base->object_type = object_type;
-	write_lock(&tdev->object_lock);
+	spin_lock(&tdev->object_lock);
	kref_init(&base->refcount);
	ret = drm_ht_just_insert_please(&tdev->object_hash,
					&base->hash,
					(unsigned long)base, 31, 0, 0);
-	write_unlock(&tdev->object_lock);
+	spin_unlock(&tdev->object_lock);
	if (unlikely(ret != 0))
		goto out_err0;

@@ -186,30 +186,22 @@ static void ttm_release_base(struct kref *kref)
	    container_of(kref, struct ttm_base_object, refcount);
	struct ttm_object_device *tdev = base->tfile->tdev;

+	spin_lock(&tdev->object_lock);
	(void)drm_ht_remove_item(&tdev->object_hash, &base->hash);
-	write_unlock(&tdev->object_lock);
+	spin_unlock(&tdev->object_lock);
	if (base->refcount_release) {
		ttm_object_file_unref(&base->tfile);
		base->refcount_release(&base);
	}
-	write_lock(&tdev->object_lock);
 }

 void ttm_base_object_unref(struct ttm_base_object **p_base)
 {
	struct ttm_base_object *base = *p_base;
-	struct ttm_object_device *tdev = base->tfile->tdev;

	*p_base = NULL;

-	/*
-	 * Need to take the lock here to avoid racing with
-	 * users trying to look up the object.
-	 */
-
-	write_lock(&tdev->object_lock);
	kref_put(&base->refcount, ttm_release_base);
-	write_unlock(&tdev->object_lock);
 }
 EXPORT_SYMBOL(ttm_base_object_unref);

@@ -221,14 +213,14 @@ struct ttm_base_object *ttm_base_object_lookup(struct ttm_object_file *tfile,
	struct drm_hash_item *hash;
	int ret;

-	read_lock(&tdev->object_lock);
+	rcu_read_lock();
	ret = drm_ht_find_item(&tdev->object_hash, key, &hash);

	if (likely(ret == 0)) {
		base = drm_hash_entry(hash, struct ttm_base_object, hash);
-		kref_get(&base->refcount);
+		ret = kref_get_unless_zero(&base->refcount);
	}
-	read_unlock(&tdev->object_lock);
+	rcu_read_unlock();

	if (unlikely(ret != 0))
		return NULL;
@@ -426,7 +418,7 @@ struct ttm_object_device *ttm_object_device_init(struct ttm_mem_global
	return NULL;

	tdev->mem_glob = mem_glob;
-	rwlock_init(&tdev->object_lock);
+	spin_lock_init(&tdev->object_lock);
	atomic_set(&tdev->object_count, 0);
	ret = drm_ht_create(&tdev->object_hash, hash_order);

@@ -444,9 +436,9 @@ void ttm_object_device_release(struct ttm_object_device **p_tdev)

	*p_tdev = NULL;

-	write_lock(&tdev->object_lock);
+	spin_lock(&tdev->object_lock);
	drm_ht_remove(&tdev->object_hash);
-	write_unlock(&tdev->object_lock);
+	spin_unlock(&tdev->object_lock);

	kfree(tdev);
 }
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
index da3c6b5..ae675c6 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
@@ -351,7 +351,7 @@ static void vmw_user_context_free(struct vmw_resource *res)
container_of(res, struct vmw_user_context, res);
struct vmw_private *dev_priv = res->dev_priv;

-   kfree(ctx);
+   ttm_base_object_kfree(ctx, base);
ttm_mem_global_free(vmw_mem_glob(dev_priv),
vmw_user_context_size);
 }
@@ -1147,7 +1147,7 @@ static void vmw_user_surface_free(struct vmw_resource *res)
kfree(srf->offsets);
kfree(srf->sizes);
kfree(srf->snooper.image);
-   kfree(user_srf);
+   ttm_base_object_kfree(user_srf, base);
ttm_mem_global_free(vmw_mem_glob(dev_priv), size);
 }

@@ -1575,7 +1575,7 @@ static void vmw_user_dmabuf_destroy(struct ttm_buffer_object *bo)
 {
struct vmw_user_dma_buffer *vmw_user_bo = vmw_user_dma_buffer(bo);


[PATCH 3/4] drm/ttm, drm/vmwgfx: Use RCU locking for object lookups

2012-11-05 Thread Thomas Hellstrom
On 11/05/2012 02:31 PM, Thomas Hellstrom wrote:
> The most commonly used lookup+get put+potential_destroy path of TTM objects
> is converted to use RCU locks. This will substantially decrease the amount
> of locked bus cycles during normal operation.
> Since we use kfree_rcu to free the objects, no RCU synchronization is needed
> at module unload time.
>
> Signed-off-by: Thomas Hellstrom 
> ---
>   drivers/gpu/drm/ttm/ttm_object.c |   30 +++---
>   drivers/gpu/drm/vmwgfx/vmwgfx_resource.c |8 
>   include/drm/ttm/ttm_object.h |4 
>   include/linux/kref.h |2 +-
>   4 files changed, 20 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_object.c b/drivers/gpu/drm/ttm/ttm_object.c
> index c785787..9d7f674 100644
> --- a/drivers/gpu/drm/ttm/ttm_object.c
> +++ b/drivers/gpu/drm/ttm/ttm_object.c
> @@ -80,7 +80,7 @@ struct ttm_object_file {
>*/
>   
>   struct ttm_object_device {
> - rwlock_t object_lock;
> + spinlock_t object_lock;
>   struct drm_open_hash object_hash;
>   atomic_t object_count;
>   struct ttm_mem_global *mem_glob;
> @@ -157,12 +157,12 @@ int ttm_base_object_init(struct ttm_object_file *tfile,
>   base->refcount_release = refcount_release;
>   base->ref_obj_release = ref_obj_release;
>   base->object_type = object_type;
> - write_lock(&tdev->object_lock);
> + spin_lock(&tdev->object_lock);
>   kref_init(&base->refcount);
>   ret = drm_ht_just_insert_please(&tdev->object_hash,
>   &base->hash,
>   (unsigned long)base, 31, 0, 0);
> - write_unlock(&tdev->object_lock);
> + spin_unlock(&tdev->object_lock);
>   if (unlikely(ret != 0))
>   goto out_err0;
>   
> @@ -186,30 +186,22 @@ static void ttm_release_base(struct kref *kref)
>   container_of(kref, struct ttm_base_object, refcount);
>   struct ttm_object_device *tdev = base->tfile->tdev;
>   
> + spin_lock(&tdev->object_lock);
>   (void)drm_ht_remove_item(&tdev->object_hash, &base->hash);
> - write_unlock(&tdev->object_lock);
> + spin_unlock(&tdev->object_lock);
>   if (base->refcount_release) {
>   ttm_object_file_unref(&base->tfile);
>   base->refcount_release(&base);
>   }
> - write_lock(&tdev->object_lock);
>   }
>   
>   void ttm_base_object_unref(struct ttm_base_object **p_base)
>   {
>   struct ttm_base_object *base = *p_base;
> - struct ttm_object_device *tdev = base->tfile->tdev;
>   
>   *p_base = NULL;
>   
> - /*
> -  * Need to take the lock here to avoid racing with
> -  * users trying to look up the object.
> -  */
> -
> - write_lock(&tdev->object_lock);
>   kref_put(&base->refcount, ttm_release_base);
> - write_unlock(&tdev->object_lock);
>   }
>   EXPORT_SYMBOL(ttm_base_object_unref);
>   
> @@ -221,14 +213,14 @@ struct ttm_base_object *ttm_base_object_lookup(struct ttm_object_file *tfile,
>   struct drm_hash_item *hash;
>   int ret;
>   
> - read_lock(&tdev->object_lock);
> + rcu_read_lock();
>   ret = drm_ht_find_item(&tdev->object_hash, key, &hash);
>   
>   if (likely(ret == 0)) {
>   base = drm_hash_entry(hash, struct ttm_base_object, hash);
> - kref_get(&base->refcount);
> + ret = kref_get_unless_zero(&base->refcount);
>   }
> - read_unlock(&tdev->object_lock);
> + rcu_read_unlock();
>   
>   if (unlikely(ret != 0))
>   return NULL;
> @@ -426,7 +418,7 @@ struct ttm_object_device *ttm_object_device_init(struct ttm_mem_global
>   return NULL;
>   
>   tdev->mem_glob = mem_glob;
> - rwlock_init(&tdev->object_lock);
> + spin_lock_init(&tdev->object_lock);
>   atomic_set(&tdev->object_count, 0);
>   ret = drm_ht_create(&tdev->object_hash, hash_order);
>   
> @@ -444,9 +436,9 @@ void ttm_object_device_release(struct ttm_object_device **p_tdev)
>   
>   *p_tdev = NULL;
>   
> - write_lock(&tdev->object_lock);
> + spin_lock(&tdev->object_lock);
>   drm_ht_remove(&tdev->object_hash);
> - write_unlock(&tdev->object_lock);
> + spin_unlock(&tdev->object_lock);
>   
>   kfree(tdev);
>   }
> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
> index da3c6b5..ae675c6 100644
> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
> @@ -351,7 +351,7 @@ static void vmw_user_context_free(struct vmw_resource *res)
>   container_of(res, struct vmw_user_context, res);
>   struct vmw_private *dev_priv = res->dev_priv;
>   
> - kfree(ctx);
> + ttm_base_object_kfree(ctx, base);
>   ttm_mem_global_free(vmw_mem_glob(dev_priv),
>   vmw_user_context_size);
>   }
> @@ -1147,7 +1147,7 @@ static void vmw_user_surface_free(struct vmw_resource *res)
>   kfree(srf->offsets);
>   kfree(srf->sizes);
>   kfree(srf->snooper.image);
> - kfree(user_srf);
> + ttm_base_object_kfree(user_srf, 

[PATCH 3/4] drm/ttm, drm/vmwgfx: Use RCU locking for object lookups

2012-11-05 Thread Thomas Hellstrom
The most commonly used lookup+get put+potential_destroy path of TTM objects
is converted to use RCU locks. This will substantially decrease the amount
of locked bus cycles during normal operation.
Since we use kfree_rcu to free the objects, no RCU synchronization is needed
at module unload time.

Signed-off-by: Thomas Hellstrom 
---
 drivers/gpu/drm/ttm/ttm_object.c |   30 +++---
 drivers/gpu/drm/vmwgfx/vmwgfx_resource.c |8 
 include/drm/ttm/ttm_object.h |4 
 include/linux/kref.h |2 +-
 4 files changed, 20 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_object.c b/drivers/gpu/drm/ttm/ttm_object.c
index c785787..9d7f674 100644
--- a/drivers/gpu/drm/ttm/ttm_object.c
+++ b/drivers/gpu/drm/ttm/ttm_object.c
@@ -80,7 +80,7 @@ struct ttm_object_file {
  */

 struct ttm_object_device {
-   rwlock_t object_lock;
+   spinlock_t object_lock;
struct drm_open_hash object_hash;
atomic_t object_count;
struct ttm_mem_global *mem_glob;
@@ -157,12 +157,12 @@ int ttm_base_object_init(struct ttm_object_file *tfile,
	base->refcount_release = refcount_release;
	base->ref_obj_release = ref_obj_release;
	base->object_type = object_type;
-	write_lock(&tdev->object_lock);
+	spin_lock(&tdev->object_lock);
	kref_init(&base->refcount);
	ret = drm_ht_just_insert_please(&tdev->object_hash,
					&base->hash,
					(unsigned long)base, 31, 0, 0);
-	write_unlock(&tdev->object_lock);
+	spin_unlock(&tdev->object_lock);
	if (unlikely(ret != 0))
		goto out_err0;

@@ -186,30 +186,22 @@ static void ttm_release_base(struct kref *kref)
	    container_of(kref, struct ttm_base_object, refcount);
	struct ttm_object_device *tdev = base->tfile->tdev;

+	spin_lock(&tdev->object_lock);
	(void)drm_ht_remove_item(&tdev->object_hash, &base->hash);
-	write_unlock(&tdev->object_lock);
+	spin_unlock(&tdev->object_lock);
	if (base->refcount_release) {
		ttm_object_file_unref(&base->tfile);
		base->refcount_release(&base);
	}
-	write_lock(&tdev->object_lock);
 }

 void ttm_base_object_unref(struct ttm_base_object **p_base)
 {
	struct ttm_base_object *base = *p_base;
-	struct ttm_object_device *tdev = base->tfile->tdev;

	*p_base = NULL;

-	/*
-	 * Need to take the lock here to avoid racing with
-	 * users trying to look up the object.
-	 */
-
-	write_lock(&tdev->object_lock);
	kref_put(&base->refcount, ttm_release_base);
-	write_unlock(&tdev->object_lock);
 }
 EXPORT_SYMBOL(ttm_base_object_unref);

@@ -221,14 +213,14 @@ struct ttm_base_object *ttm_base_object_lookup(struct ttm_object_file *tfile,
	struct drm_hash_item *hash;
	int ret;

-	read_lock(&tdev->object_lock);
+	rcu_read_lock();
	ret = drm_ht_find_item(&tdev->object_hash, key, &hash);

	if (likely(ret == 0)) {
		base = drm_hash_entry(hash, struct ttm_base_object, hash);
-		kref_get(&base->refcount);
+		ret = kref_get_unless_zero(&base->refcount);
	}
-	read_unlock(&tdev->object_lock);
+	rcu_read_unlock();

	if (unlikely(ret != 0))
		return NULL;
@@ -426,7 +418,7 @@ struct ttm_object_device *ttm_object_device_init(struct ttm_mem_global
	return NULL;

	tdev->mem_glob = mem_glob;
-	rwlock_init(&tdev->object_lock);
+	spin_lock_init(&tdev->object_lock);
	atomic_set(&tdev->object_count, 0);
	ret = drm_ht_create(&tdev->object_hash, hash_order);

@@ -444,9 +436,9 @@ void ttm_object_device_release(struct ttm_object_device **p_tdev)

	*p_tdev = NULL;

-	write_lock(&tdev->object_lock);
+	spin_lock(&tdev->object_lock);
	drm_ht_remove(&tdev->object_hash);
-	write_unlock(&tdev->object_lock);
+	spin_unlock(&tdev->object_lock);

	kfree(tdev);
 }
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
index da3c6b5..ae675c6 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
@@ -351,7 +351,7 @@ static void vmw_user_context_free(struct vmw_resource *res)
container_of(res, struct vmw_user_context, res);
struct vmw_private *dev_priv = res->dev_priv;

-   kfree(ctx);
+   ttm_base_object_kfree(ctx, base);
ttm_mem_global_free(vmw_mem_glob(dev_priv),
vmw_user_context_size);
 }
@@ -1147,7 +1147,7 @@ static void vmw_user_surface_free(struct vmw_resource *res)
kfree(srf->offsets);
kfree(srf->sizes);
kfree(srf->snooper.image);
-   kfree(user_srf);
+   ttm_base_object_kfree(user_srf, base);
ttm_mem_global_free(vmw_mem_glob(dev_priv), size);
 }

@@ -1575,7 +1575,7 @@ static void vmw_user_dmabuf_destroy(struct ttm_buffer_object *bo)
 {
struct vmw_user_dma_buffer *vmw_user_bo = 

[PATCH 3/4] drm/ttm, drm/vmwgfx: Use RCU locking for object lookups

2012-11-05 Thread Thomas Hellstrom
The mostly used lookup+get put+potential_destroy path of TTM objects
is converted to use RCU locks. This will substantially decrease the amount
of locked bus cycles during normal operation.
Since we use kfree_rcu to free the objects, no rcu synchronization is needed
at module unload time.

Signed-off-by: Thomas Hellstrom thellst...@vmware.com
---
 drivers/gpu/drm/ttm/ttm_object.c |   30 +++---
 drivers/gpu/drm/vmwgfx/vmwgfx_resource.c |8 
 include/drm/ttm/ttm_object.h |4 
 include/linux/kref.h |2 +-
 4 files changed, 20 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_object.c b/drivers/gpu/drm/ttm/ttm_object.c
index c785787..9d7f674 100644
--- a/drivers/gpu/drm/ttm/ttm_object.c
+++ b/drivers/gpu/drm/ttm/ttm_object.c
@@ -80,7 +80,7 @@ struct ttm_object_file {
  */
 
 struct ttm_object_device {
-   rwlock_t object_lock;
+   spinlock_t object_lock;
struct drm_open_hash object_hash;
atomic_t object_count;
struct ttm_mem_global *mem_glob;
@@ -157,12 +157,12 @@ int ttm_base_object_init(struct ttm_object_file *tfile,
 	base->refcount_release = refcount_release;
 	base->ref_obj_release = ref_obj_release;
 	base->object_type = object_type;
-	write_lock(&tdev->object_lock);
+	spin_lock(&tdev->object_lock);
 	kref_init(&base->refcount);
 	ret = drm_ht_just_insert_please(&tdev->object_hash,
 					&base->hash,
 					(unsigned long)base, 31, 0, 0);
-	write_unlock(&tdev->object_lock);
+	spin_unlock(&tdev->object_lock);
 	if (unlikely(ret != 0))
 		goto out_err0;
 
@@ -186,30 +186,22 @@ static void ttm_release_base(struct kref *kref)
 	    container_of(kref, struct ttm_base_object, refcount);
 	struct ttm_object_device *tdev = base->tfile->tdev;
 
+	spin_lock(&tdev->object_lock);
 	(void)drm_ht_remove_item(&tdev->object_hash, &base->hash);
-	write_unlock(&tdev->object_lock);
+	spin_unlock(&tdev->object_lock);
 	if (base->refcount_release) {
 		ttm_object_file_unref(&base->tfile);
 		base->refcount_release(&base);
 	}
-	write_lock(&tdev->object_lock);
 }
 
 void ttm_base_object_unref(struct ttm_base_object **p_base)
 {
 	struct ttm_base_object *base = *p_base;
-	struct ttm_object_device *tdev = base->tfile->tdev;
 
 	*p_base = NULL;
 
-	/*
-	 * Need to take the lock here to avoid racing with
-	 * users trying to look up the object.
-	 */
-
-	write_lock(&tdev->object_lock);
 	kref_put(&base->refcount, ttm_release_base);
-	write_unlock(&tdev->object_lock);
 }
 EXPORT_SYMBOL(ttm_base_object_unref);
 
@@ -221,14 +213,14 @@ struct ttm_base_object *ttm_base_object_lookup(struct ttm_object_file *tfile,
 	struct drm_hash_item *hash;
 	int ret;
 
-	read_lock(&tdev->object_lock);
+	rcu_read_lock();
 	ret = drm_ht_find_item(&tdev->object_hash, key, &hash);
 
 	if (likely(ret == 0)) {
 		base = drm_hash_entry(hash, struct ttm_base_object, hash);
-		kref_get(&base->refcount);
+		ret = kref_get_unless_zero(&base->refcount);
 	}
-	read_unlock(&tdev->object_lock);
+	rcu_read_unlock();
 
 	if (unlikely(ret != 0))
 		return NULL;
@@ -426,7 +418,7 @@ struct ttm_object_device *ttm_object_device_init(struct ttm_mem_global
 		return NULL;
 
 	tdev->mem_glob = mem_glob;
-	rwlock_init(&tdev->object_lock);
+	spin_lock_init(&tdev->object_lock);
 	atomic_set(&tdev->object_count, 0);
 	ret = drm_ht_create(&tdev->object_hash, hash_order);
 
@@ -444,9 +436,9 @@ void ttm_object_device_release(struct ttm_object_device **p_tdev)
 
 	*p_tdev = NULL;
 
-	write_lock(&tdev->object_lock);
+	spin_lock(&tdev->object_lock);
 	drm_ht_remove(&tdev->object_hash);
-	write_unlock(&tdev->object_lock);
+	spin_unlock(&tdev->object_lock);
 
 	kfree(tdev);
 }
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
index da3c6b5..ae675c6 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
@@ -351,7 +351,7 @@ static void vmw_user_context_free(struct vmw_resource *res)
 	    container_of(res, struct vmw_user_context, res);
 	struct vmw_private *dev_priv = res->dev_priv;
 
-	kfree(ctx);
+	ttm_base_object_kfree(ctx, base);
 	ttm_mem_global_free(vmw_mem_glob(dev_priv),
 			    vmw_user_context_size);
 }
@@ -1147,7 +1147,7 @@ static void vmw_user_surface_free(struct vmw_resource *res)
 	kfree(srf->offsets);
 	kfree(srf->sizes);
 	kfree(srf->snooper.image);
-	kfree(user_srf);
+	ttm_base_object_kfree(user_srf, base);
 	ttm_mem_global_free(vmw_mem_glob(dev_priv), size);
 }
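The lookup hunk above swaps kref_get() for kref_get_unless_zero(): under rcu_read_lock() the hash entry may still be visible while its last reference is being dropped, so the lookup must refuse to take a reference on an object whose count has already reached zero. A minimal user-space sketch of those semantics, using C11 atomics (illustrative only, not the kernel implementation):

```c
#include <stdatomic.h>

/* Sketch of kref_get_unless_zero() semantics: take a reference only if
 * the count has not already dropped to zero (object not being torn down). */
static int get_unless_zero(atomic_int *refcount)
{
	int old = atomic_load(refcount);

	while (old != 0) {
		/* CAS loop: bump the count only if it is still 'old'.
		 * On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(refcount, &old, old + 1))
			return 1;	/* reference taken */
	}
	return 0;	/* already zero: caller must treat the lookup as a miss */
}
```

A plain kref_get() would unconditionally increment, resurrecting a dying object that the release path is about to free; the compare-and-swap loop closes that window.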
 

Re: [PATCH 3/4] drm/ttm, drm/vmwgfx: Use RCU locking for object lookups

2012-11-05 Thread Thomas Hellstrom

On 11/05/2012 02:31 PM, Thomas Hellstrom wrote:

The most commonly used lookup+get / put+potential-destroy path of TTM objects
is converted to use RCU locks. This will substantially decrease the amount
of locked bus cycles during normal operation.
Since we use kfree_rcu to free the objects, no RCU synchronization is needed
at module unload time.

Signed-off-by: Thomas Hellstrom thellst...@vmware.com
---
  drivers/gpu/drm/ttm/ttm_object.c |   30 +++---
  drivers/gpu/drm/vmwgfx/vmwgfx_resource.c |8 
  include/drm/ttm/ttm_object.h |4 
  include/linux/kref.h |2 +-
  4 files changed, 20 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_object.c b/drivers/gpu/drm/ttm/ttm_object.c
index c785787..9d7f674 100644
--- a/drivers/gpu/drm/ttm/ttm_object.c
+++ b/drivers/gpu/drm/ttm/ttm_object.c
@@ -80,7 +80,7 @@ struct ttm_object_file {
  */
 
 struct ttm_object_device {
-	rwlock_t object_lock;
+	spinlock_t object_lock;
 	struct drm_open_hash object_hash;
 	atomic_t object_count;
 	struct ttm_mem_global *mem_glob;
@@ -157,12 +157,12 @@ int ttm_base_object_init(struct ttm_object_file *tfile,
 	base->refcount_release = refcount_release;
 	base->ref_obj_release = ref_obj_release;
 	base->object_type = object_type;
-	write_lock(&tdev->object_lock);
+	spin_lock(&tdev->object_lock);
 	kref_init(&base->refcount);
 	ret = drm_ht_just_insert_please(&tdev->object_hash,
 					&base->hash,
 					(unsigned long)base, 31, 0, 0);
-	write_unlock(&tdev->object_lock);
+	spin_unlock(&tdev->object_lock);
 	if (unlikely(ret != 0))
 		goto out_err0;
 
@@ -186,30 +186,22 @@ static void ttm_release_base(struct kref *kref)
 	    container_of(kref, struct ttm_base_object, refcount);
 	struct ttm_object_device *tdev = base->tfile->tdev;
 
+	spin_lock(&tdev->object_lock);
 	(void)drm_ht_remove_item(&tdev->object_hash, &base->hash);
-	write_unlock(&tdev->object_lock);
+	spin_unlock(&tdev->object_lock);
 	if (base->refcount_release) {
 		ttm_object_file_unref(&base->tfile);
 		base->refcount_release(&base);
 	}
-	write_lock(&tdev->object_lock);
 }
 
 void ttm_base_object_unref(struct ttm_base_object **p_base)
 {
 	struct ttm_base_object *base = *p_base;
-	struct ttm_object_device *tdev = base->tfile->tdev;
 
 	*p_base = NULL;
 
-	/*
-	 * Need to take the lock here to avoid racing with
-	 * users trying to look up the object.
-	 */
-
-	write_lock(&tdev->object_lock);
 	kref_put(&base->refcount, ttm_release_base);
-	write_unlock(&tdev->object_lock);
 }
 EXPORT_SYMBOL(ttm_base_object_unref);
 
@@ -221,14 +213,14 @@ struct ttm_base_object *ttm_base_object_lookup(struct ttm_object_file *tfile,
 	struct drm_hash_item *hash;
 	int ret;
 
-	read_lock(&tdev->object_lock);
+	rcu_read_lock();
 	ret = drm_ht_find_item(&tdev->object_hash, key, &hash);
 
 	if (likely(ret == 0)) {
 		base = drm_hash_entry(hash, struct ttm_base_object, hash);
-		kref_get(&base->refcount);
+		ret = kref_get_unless_zero(&base->refcount);
 	}
-	read_unlock(&tdev->object_lock);
+	rcu_read_unlock();
 
 	if (unlikely(ret != 0))
 		return NULL;
@@ -426,7 +418,7 @@ struct ttm_object_device *ttm_object_device_init(struct ttm_mem_global
 		return NULL;
 
 	tdev->mem_glob = mem_glob;
-	rwlock_init(&tdev->object_lock);
+	spin_lock_init(&tdev->object_lock);
 	atomic_set(&tdev->object_count, 0);
 	ret = drm_ht_create(&tdev->object_hash, hash_order);
 
@@ -444,9 +436,9 @@ void ttm_object_device_release(struct ttm_object_device **p_tdev)
 
 	*p_tdev = NULL;
 
-	write_lock(&tdev->object_lock);
+	spin_lock(&tdev->object_lock);
 	drm_ht_remove(&tdev->object_hash);
-	write_unlock(&tdev->object_lock);
+	spin_unlock(&tdev->object_lock);
 
 	kfree(tdev);
 }
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
index da3c6b5..ae675c6 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
@@ -351,7 +351,7 @@ static void vmw_user_context_free(struct vmw_resource *res)
 	    container_of(res, struct vmw_user_context, res);
 	struct vmw_private *dev_priv = res->dev_priv;
 
-	kfree(ctx);
+	ttm_base_object_kfree(ctx, base);
 	ttm_mem_global_free(vmw_mem_glob(dev_priv),
 			    vmw_user_context_size);
 }
@@ -1147,7 +1147,7 @@ static void vmw_user_surface_free(struct vmw_resource *res)
 	kfree(srf->offsets);
 	kfree(srf->sizes);
 	kfree(srf->snooper.image);
-	kfree(user_srf);
+	ttm_base_object_kfree(user_srf, base);
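The release path in the patch above also reorders the locking: ttm_base_object_unref() now drops the reference with no lock held, and ttm_release_base() takes the spinlock only around the hash removal, so a concurrent RCU lookup either still finds the entry (and then fails kref_get_unless_zero()) or misses it entirely. A user-space analogue of that put path (names are illustrative; a plain free() stands in for kfree_rcu(), which in the kernel defers the free past a grace period):

```c
#include <stdatomic.h>
#include <stdlib.h>

struct object {
	atomic_int refcount;
	int hashed;		/* stand-in for membership in object_hash */
};

/* Sketch of ttm_release_base() after the patch: the hash removal would
 * take the spinlock itself; no lock is held around the release. */
static void release(struct object *obj)
{
	obj->hashed = 0;	/* spin_lock(); drm_ht_remove_item(); spin_unlock(); */
	free(obj);		/* the real code defers this with kfree_rcu() */
}

/* Sketch of ttm_base_object_unref(): kref_put() with no external lock.
 * Returns 1 when this was the final reference and release() ran. */
static int put_ref(struct object *obj)
{
	if (atomic_fetch_sub(&obj->refcount, 1) == 1) {
		release(obj);
		return 0 + 1;
	}
	return 0;
}
```

The grace-period deferral is what makes the unlocked put safe: an RCU reader that raced with the final put can still dereference the (unhashed, zero-refcount) object until it leaves its read-side critical section, and kfree_rcu() guarantees the memory outlives that window.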

[PATCH 3/4] drm/ttm, drm/vmwgfx: Use RCU locking for object lookups v2

2012-11-05 Thread Thomas Hellstrom
The most commonly used lookup+get / put+potential-destroy path of TTM objects
is converted to use RCU locks. This will substantially decrease the amount
of locked bus cycles during normal operation.
Since we use kfree_rcu to free the objects, no RCU synchronization is needed
at module unload time.

v2: Don't touch include/linux/kref.h

Signed-off-by: Thomas Hellstrom thellst...@vmware.com
---
 drivers/gpu/drm/ttm/ttm_object.c |   30 +++---
 drivers/gpu/drm/vmwgfx/vmwgfx_resource.c |8 
 include/drm/ttm/ttm_object.h |4 
 3 files changed, 19 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_object.c b/drivers/gpu/drm/ttm/ttm_object.c
index c785787..9d7f674 100644
--- a/drivers/gpu/drm/ttm/ttm_object.c
+++ b/drivers/gpu/drm/ttm/ttm_object.c
@@ -80,7 +80,7 @@ struct ttm_object_file {
  */
 
 struct ttm_object_device {
-	rwlock_t object_lock;
+	spinlock_t object_lock;
 	struct drm_open_hash object_hash;
 	atomic_t object_count;
 	struct ttm_mem_global *mem_glob;
@@ -157,12 +157,12 @@ int ttm_base_object_init(struct ttm_object_file *tfile,
 	base->refcount_release = refcount_release;
 	base->ref_obj_release = ref_obj_release;
 	base->object_type = object_type;
-	write_lock(&tdev->object_lock);
+	spin_lock(&tdev->object_lock);
 	kref_init(&base->refcount);
 	ret = drm_ht_just_insert_please(&tdev->object_hash,
 					&base->hash,
 					(unsigned long)base, 31, 0, 0);
-	write_unlock(&tdev->object_lock);
+	spin_unlock(&tdev->object_lock);
 	if (unlikely(ret != 0))
 		goto out_err0;
 
@@ -186,30 +186,22 @@ static void ttm_release_base(struct kref *kref)
 	    container_of(kref, struct ttm_base_object, refcount);
 	struct ttm_object_device *tdev = base->tfile->tdev;
 
+	spin_lock(&tdev->object_lock);
 	(void)drm_ht_remove_item(&tdev->object_hash, &base->hash);
-	write_unlock(&tdev->object_lock);
+	spin_unlock(&tdev->object_lock);
 	if (base->refcount_release) {
 		ttm_object_file_unref(&base->tfile);
 		base->refcount_release(&base);
 	}
-	write_lock(&tdev->object_lock);
 }
 
 void ttm_base_object_unref(struct ttm_base_object **p_base)
 {
 	struct ttm_base_object *base = *p_base;
-	struct ttm_object_device *tdev = base->tfile->tdev;
 
 	*p_base = NULL;
 
-	/*
-	 * Need to take the lock here to avoid racing with
-	 * users trying to look up the object.
-	 */
-
-	write_lock(&tdev->object_lock);
 	kref_put(&base->refcount, ttm_release_base);
-	write_unlock(&tdev->object_lock);
 }
 EXPORT_SYMBOL(ttm_base_object_unref);
 
@@ -221,14 +213,14 @@ struct ttm_base_object *ttm_base_object_lookup(struct ttm_object_file *tfile,
 	struct drm_hash_item *hash;
 	int ret;
 
-	read_lock(&tdev->object_lock);
+	rcu_read_lock();
 	ret = drm_ht_find_item(&tdev->object_hash, key, &hash);
 
 	if (likely(ret == 0)) {
 		base = drm_hash_entry(hash, struct ttm_base_object, hash);
-		kref_get(&base->refcount);
+		ret = kref_get_unless_zero(&base->refcount);
 	}
-	read_unlock(&tdev->object_lock);
+	rcu_read_unlock();
 
 	if (unlikely(ret != 0))
 		return NULL;
@@ -426,7 +418,7 @@ struct ttm_object_device *ttm_object_device_init(struct ttm_mem_global
 		return NULL;
 
 	tdev->mem_glob = mem_glob;
-	rwlock_init(&tdev->object_lock);
+	spin_lock_init(&tdev->object_lock);
 	atomic_set(&tdev->object_count, 0);
 	ret = drm_ht_create(&tdev->object_hash, hash_order);
 
@@ -444,9 +436,9 @@ void ttm_object_device_release(struct ttm_object_device **p_tdev)
 
 	*p_tdev = NULL;
 
-	write_lock(&tdev->object_lock);
+	spin_lock(&tdev->object_lock);
 	drm_ht_remove(&tdev->object_hash);
-	write_unlock(&tdev->object_lock);
+	spin_unlock(&tdev->object_lock);
 
 	kfree(tdev);
 }
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
index da3c6b5..ae675c6 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
@@ -351,7 +351,7 @@ static void vmw_user_context_free(struct vmw_resource *res)
 	    container_of(res, struct vmw_user_context, res);
 	struct vmw_private *dev_priv = res->dev_priv;
 
-	kfree(ctx);
+	ttm_base_object_kfree(ctx, base);
 	ttm_mem_global_free(vmw_mem_glob(dev_priv),
 			    vmw_user_context_size);
 }
@@ -1147,7 +1147,7 @@ static void vmw_user_surface_free(struct vmw_resource *res)
 	kfree(srf->offsets);
 	kfree(srf->sizes);
 	kfree(srf->snooper.image);
-	kfree(user_srf);
+	ttm_base_object_kfree(user_srf, base);
 	ttm_mem_global_free(vmw_mem_glob(dev_priv), size);
 }
 
@@ -1575,7