On 25 February 2015 at 15:43, Maarten Lankhorst
<maarten.lankho...@ubuntu.com> wrote:
> Op 25-02-15 om 16:28 schreef Patrick Baggett:
>> On Wed, Feb 25, 2015 at 9:07 AM, Maarten Lankhorst <
>> maarten.lankho...@ubuntu.com> wrote:
>>
>>> Op 25-02-15 om 16:04 schreef Patrick Baggett:
>>>> On Wed, Feb 25, 2015 at 8:59 AM, Maarten Lankhorst <
>>>> maarten.lankho...@ubuntu.com> wrote:
>>>>
>>>>> Op 25-02-15 om 15:11 schreef Emil Velikov:
>>>>>> On 24 February 2015 at 09:01, Maarten Lankhorst
>>>>>> <maarten.lankho...@ubuntu.com> wrote:
>>>>>>> Only add wrapped bo's and bo's that have been exported through flink
>>> or
>>>>> dma-buf.
>>>>>>> This avoids a lock in the common case, and decreases traversal needed
>>>>> for importing
>>>>>>> a dma-buf or flink.
>>>>>>>
>>>>>>> Signed-off-by: Maarten Lankhorst <maarten.lankho...@ubuntu.com>
>>>>>>> ---
>>>>>>>  nouveau/nouveau.c | 47
>>> +++++++++++++++++++++++------------------------
>>>>>>>  1 file changed, 23 insertions(+), 24 deletions(-)
>>>>>>>
>>>>>>> diff --git a/nouveau/nouveau.c b/nouveau/nouveau.c
>>>>>>> index 1c723b9..d411523 100644
>>>>>>> --- a/nouveau/nouveau.c
>>>>>>> +++ b/nouveau/nouveau.c
>>>>>>> @@ -349,8 +349,8 @@ nouveau_bo_del(struct nouveau_bo *bo)
>>>>>>>         struct nouveau_bo_priv *nvbo = nouveau_bo(bo);
>>>>>>>         struct drm_gem_close req = { bo->handle };
>>>>>>>
>>>>>>> -       pthread_mutex_lock(&nvdev->lock);
>>>>>>> -       if (nvbo->name) {
>>>>>>> +       if (nvbo->head.next) {
>>>>>>> +               pthread_mutex_lock(&nvdev->lock);
>>>>>>>                 if (atomic_read(&nvbo->refcnt) == 0) {
>>>>>>>                         DRMLISTDEL(&nvbo->head);
>>>>>>>                         /*
>>>>>>> @@ -365,8 +365,6 @@ nouveau_bo_del(struct nouveau_bo *bo)
>>>>>>>                 }
>>>>>>>                 pthread_mutex_unlock(&nvdev->lock);
>>>>>>>         } else {
>>>>>>> -               DRMLISTDEL(&nvbo->head);
>>>>>>> -               pthread_mutex_unlock(&nvdev->lock);
>>>>>>>                 drmIoctl(bo->device->fd, DRM_IOCTL_GEM_CLOSE, &req);
>>>>>>>         }
>>>>>>>         if (bo->map)
>>>>>>> @@ -379,7 +377,6 @@ nouveau_bo_new(struct nouveau_device *dev,
>>> uint32_t
>>>>> flags, uint32_t align,
>>>>>>>                uint64_t size, union nouveau_bo_config *config,
>>>>>>>                struct nouveau_bo **pbo)
>>>>>>>  {
>>>>>>> -       struct nouveau_device_priv *nvdev = nouveau_device(dev);
>>>>>>>         struct nouveau_bo_priv *nvbo = calloc(1, sizeof(*nvbo));
>>>>>>>         struct nouveau_bo *bo = &nvbo->base;
>>>>>>>         int ret;
>>>>>>> @@ -397,10 +394,6 @@ nouveau_bo_new(struct nouveau_device *dev,
>>>>> uint32_t flags, uint32_t align,
>>>>>>>                 return ret;
>>>>>>>         }
>>>>>>>
>>>>>>> -       pthread_mutex_lock(&nvdev->lock);
>>>>>>> -       DRMLISTADD(&nvbo->head, &nvdev->bo_list);
>>>>>>> -       pthread_mutex_unlock(&nvdev->lock);
>>>>>>> -
>>>>>>>         *pbo = bo;
>>>>>>>         return 0;
>>>>>>>  }
>>>>>>> @@ -457,6 +450,18 @@ nouveau_bo_wrap_locked(struct nouveau_device
>>> *dev,
>>>>> uint32_t handle,
>>>>>>>         return -ENOMEM;
>>>>>>>  }
>>>>>>>
>>>>>>> +static void
>>>>>>> +nouveau_bo_make_global(struct nouveau_bo_priv *nvbo)
>>>>>>> +{
>>>>>>> +       if (!nvbo->head.next) {
>>>>>>> +               struct nouveau_device_priv *nvdev =
>>>>> nouveau_device(nvbo->base.device);
>>>>>>> +               pthread_mutex_lock(&nvdev->lock);
>>>>>>> +               if (!nvbo->head.next)
>>>> I guess the bo_make_global call is not particularly sensitive, so
>>>>> removing's fine with me.
>>>>>
>>>> I would be worried about the duplicated check. It seems like a "smart"
>>>> compiler could cache the value of "nvbo->head.next" (unless marked as
>>>> volatile), rendering the second if() useless. If the field is marked
>>>> volatile, then of course, this does not apply.
>>>>
>>> In that case I would be worried about a compiler that ignores the acquire
>>> semantics of pthread_mutex_lock..
>>>
>> This isn't about the acquire semantics here, because you're reading it on
>> both sides of the lock. The compiler need not reorder the reads at all,
>> merely just save the value of the first read. Unless the compiler knows
>> that the value can change after the lock is acquired, it can simply cache
>> the result. Consider the nearly identical transformation:
>>
>> const int condition = (!nvbo->head.next);
>> if(condition){
>>    pthread_mutex_lock(...);
>>     if(condition){ //redundant, already can assert it is true
>>         ...
>>
> So you're saying a compiler can optimize:
>
> - statement with memory access
> - read memory barrier
> - statement with memory access
>
> To drop the second statement? I would worry about your definition of memory 
> barrier then..
>
Patrick thanks for elaborating my concern in a more complete manner.

Maarten, I would assume that the overhead of dropping the first if
statement will be negligible wrt the time spent debugging when the
compiler does the silly thing. Afaict even with a single if, your
patch does reduce the time spend in libdrm_nouveau quite a bit.

That said, it's up-to you and others to draw the final call.

Cheers,
Emil
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau

Reply via email to