Re: [Intel-gfx] [PATCH 4/4] dma-buf: nuke reservation_object seq number

2019-08-14 Thread Chris Wilson
Quoting Koenig, Christian (2019-08-14 18:58:32)
> On 14.08.19 at 19:48, Chris Wilson wrote:
> > Quoting Chris Wilson (2019-08-14 18:38:20)
> >> Quoting Chris Wilson (2019-08-14 18:22:53)
> >>> Quoting Chris Wilson (2019-08-14 18:06:18)
>  Quoting Chris Wilson (2019-08-14 17:42:48)
> > Quoting Daniel Vetter (2019-08-14 16:39:08)
> > +   } while (rcu_access_pointer(obj->fence_excl) != *excl);
> >> What if someone is real fast (like really real fast) and recycles the
> >> exclusive fence so you read the same pointer twice, but everything else
> >> changed? reused fence pointer is a lot more likely than seqlock 
> >> wrapping
> >> around.
> > It's an exclusive fence. If it is replaced, it must be later than all
> > the shared fences (and dependent on them directly or indirectly), and
> > so still a consistent snapshot.
>  An extension of that argument says we don't even need to loop,
> 
>  *list = rcu_dereference(obj->fence);
>  *shared_count = *list ? (*list)->shared_count : 0;
>  smp_rmb();
>  *excl = rcu_dereference(obj->fence_excl);
> 
>  Gives a consistent snapshot. It doesn't matter if the fence_excl is
>  before or after the shared_list -- if it's after, it's a superset of all
>  fences, if it's before, we have a correct list of shared fences and the
>  earlier fence_excl.
> >>> The problem is that the point of the loop is that we do need a check on
> >>> the fences after the full memory barrier.
> >>>
> >>> Thinking of the rationale beaten out for dma_fence_get_excl_rcu_safe()
> >>>
> >>> We don't have a full memory barrier here, so this cannot be used safely
> >>> in light of fence reallocation.
> >> i.e. we need to restore the loops in the callers,
> >>
> >> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_busy.c 
> >> b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
> >> index a2aff1d8290e..f019062c8cd7 100644
> >> --- a/drivers/gpu/drm/i915/gem/i915_gem_busy.c
> >> +++ b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
> >> @@ -110,6 +110,7 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data,
> >>   * to report the overall busyness. This is what the wait-ioctl 
> >> does.
> >>   *
> >>   */
> >> +retry:
> >>  dma_resv_fences(obj->base.resv, &excl, &list, &shared_count);
> >>
> >>  /* Translate the exclusive fence to the READ *and* WRITE engine */
> >> @@ -122,6 +123,10 @@ i915_gem_busy_ioctl(struct drm_device *dev, void 
> >> *data,
> >>  args->busy |= busy_check_reader(fence);
> >>  }
> >>
> >> +   smp_rmb();
> >> +   if (excl != rcu_access_pointer(obj->base.resv->fence_excl))
> >> +   goto retry;
> >> +
> >>
> >> wrap that up as
> >>
> >> static inline bool
> >> dma_resv_fences_retry(struct dma_resv *resv, struct dma_fence *excl)
> >> {
> >>  smp_rmb();
> >>  return excl != rcu_access_pointer(resv->fence_excl);
> >> }
> > I give up. It's not just the fence_excl that's an issue here.
> >
> > Any of the shared fences may be replaced after dma_resv_fences()
> > and so the original shared fence pointer may be reassigned (even under
> > RCU).
> 
> Yeah, but this should be harmless. See, fences are always replaced either 
> when they are signaled anyway or by later fences from the same context.
> 
> And existing fences shouldn't be re-used while under RCU, or is anybody 
> still using SLAB_TYPESAFE_BY_RCU?

Yes. We go through enough fences that the freelist is a noticeable
improvement.
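
(For anyone unfamiliar with the flag: a minimal sketch of what
SLAB_TYPESAFE_BY_RCU implies -- the cache name and init function below are
hypothetical, not the i915 code. The allocator may hand a freed object
straight back out without waiting for an RCU grace period, so an RCU reader
can see the same pointer value suddenly refer to a brand-new fence.)

#include <linux/dma-fence.h>
#include <linux/slab.h>

static struct kmem_cache *fence_cache;	/* hypothetical fence slab */

static int __init fence_cache_init(void)
{
	/*
	 * SLAB_TYPESAFE_BY_RCU: freed objects keep their type and may be
	 * reallocated immediately; only the memory stays valid for RCU
	 * readers, not the object's identity.
	 */
	fence_cache = kmem_cache_create("fence_cache_sketch",
					sizeof(struct dma_fence), 0,
					SLAB_TYPESAFE_BY_RCU, NULL);
	return fence_cache ? 0 : -ENOMEM;
}
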
-Chris

Re: [Intel-gfx] [PATCH 4/4] dma-buf: nuke reservation_object seq number

2019-08-14 Thread Koenig, Christian
On 14.08.19 at 19:48, Chris Wilson wrote:
> Quoting Chris Wilson (2019-08-14 18:38:20)
>> Quoting Chris Wilson (2019-08-14 18:22:53)
>>> Quoting Chris Wilson (2019-08-14 18:06:18)
 Quoting Chris Wilson (2019-08-14 17:42:48)
> Quoting Daniel Vetter (2019-08-14 16:39:08)
> +   } while (rcu_access_pointer(obj->fence_excl) != *excl);
>> What if someone is real fast (like really real fast) and recycles the
>> exclusive fence so you read the same pointer twice, but everything else
>> changed? reused fence pointer is a lot more likely than seqlock wrapping
>> around.
> It's an exclusive fence. If it is replaced, it must be later than all
> the shared fences (and dependent on them directly or indirectly), and
> so still a consistent snapshot.
 An extension of that argument says we don't even need to loop,

 *list = rcu_dereference(obj->fence);
 *shared_count = *list ? (*list)->shared_count : 0;
 smp_rmb();
 *excl = rcu_dereference(obj->fence_excl);

 Gives a consistent snapshot. It doesn't matter if the fence_excl is
 before or after the shared_list -- if it's after, it's a superset of all
 fences, if it's before, we have a correct list of shared fences and the
 earlier fence_excl.
>>> The problem is that the point of the loop is that we do need a check on
>>> the fences after the full memory barrier.
>>>
>>> Thinking of the rationale beaten out for dma_fence_get_excl_rcu_safe()
>>>
>>> We don't have a full memory barrier here, so this cannot be used safely
>>> in light of fence reallocation.
>> i.e. we need to restore the loops in the callers,
>>
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_busy.c 
>> b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
>> index a2aff1d8290e..f019062c8cd7 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_busy.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
>> @@ -110,6 +110,7 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data,
>>   * to report the overall busyness. This is what the wait-ioctl does.
>>   *
>>   */
>> +retry:
>>  dma_resv_fences(obj->base.resv, &excl, &list, &shared_count);
>>
>>  /* Translate the exclusive fence to the READ *and* WRITE engine */
>> @@ -122,6 +123,10 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data,
>>  args->busy |= busy_check_reader(fence);
>>  }
>>
>> +   smp_rmb();
>> +   if (excl != rcu_access_pointer(obj->base.resv->fence_excl))
>> +   goto retry;
>> +
>>
>> wrap that up as
>>
>> static inline bool
>> dma_resv_fences_retry(struct dma_resv *resv, struct dma_fence *excl)
>> {
>>  smp_rmb();
>>  return excl != rcu_access_pointer(resv->fence_excl);
>> }
> I give up. It's not just the fence_excl that's an issue here.
>
> Any of the shared fences may be replaced after dma_resv_fences()
> and so the original shared fence pointer may be reassigned (even under
> RCU).

Yeah, but this should be harmless. See, fences are always replaced either 
when they are signaled anyway or by later fences from the same context.

And existing fences shouldn't be re-used while under RCU, or is anybody 
still using SLAB_TYPESAFE_BY_RCU?

Christian.

>   The only defense against that is the seqcount.
>
> I totally screwed that up.
> -Chris


Re: [Intel-gfx] [PATCH 4/4] dma-buf: nuke reservation_object seq number

2019-08-14 Thread Daniel Vetter
On Wed, Aug 14, 2019 at 06:20:28PM +0100, Chris Wilson wrote:
> Quoting Daniel Vetter (2019-08-14 18:06:26)
> > On Wed, Aug 14, 2019 at 05:42:48PM +0100, Chris Wilson wrote:
> > > Quoting Daniel Vetter (2019-08-14 16:39:08)
> [snip]
> > > > > > >  if (old)
> > > > > > > -   old->shared_count = 0;
> > > > > > > -   write_seqcount_end(>seq);
> > > > > > > +   smp_store_mb(old->shared_count, 0);
> > > > 
> > > > So your comment and the kerneldoc don't match up. Quoting
> > > > Documentation/memory-barriers.txt:
> > > > 
> > > >  This assigns the value to the variable and then inserts a full 
> > > > memory
> > > >  barrier after it.  It isn't guaranteed to insert anything more 
> > > > than a
> > > >  compiler barrier in a UP compilation.
> > > > 
> > > > So order is 1. store 2. fence, but your comment suggests you want it the
> > > > other way round.
> > > 
> > > What's more weird is that it is a fully serialising instruction that is
> > > used to fence first as part of the update. If that's the way PeterZ uses
> > > it...
> > 
> > I haven't looked at the implementations tbh, just going with the text. Or
> > do you mean in the write_seqlock that we're replacing?
> 
> Nah, I misremembered set_current_state(), all that implies is the fence
> is before the following instructions. I have some recollection that it
> can be used as a RELEASE operation (if only because it is a locked xchg).
> If all else fails, make it an xchg_release(). Or normal assignment +
> smp_wmb().

Yeah that one is called smp_store_release, not smp_store_mb. I think
smp_store_release is the right one here.
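
Side by side, on the write side being discussed (a rough sketch with
obj/old/fence as in the patch, not the committed code):

	/* variant in the patch: smp_store_mb() is "store, then full barrier",
	 * so the barrier lands after shared_count is cleared and does not
	 * order the fence_excl publish before the clear */
	rcu_assign_pointer(obj->fence_excl, fence);
	if (old)
		smp_store_mb(old->shared_count, 0);

	/* suggested variant: smp_store_release() orders all prior loads and
	 * stores before the store itself, which is what the comment in the
	 * patch asks for */
	rcu_assign_pointer(obj->fence_excl, fence);
	if (old)
		smp_store_release(&old->shared_count, 0);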

> > > It's an exclusive fence. If it is replaced, it must be later than all
> > > the shared fences (and dependent on them directly or indirectly), and
> > > so still a consistent snapshot.
> > 
> > I'm not worried about the fence, that part is fine. But we're de facto
> > using the fence as a fancy seqlock-of-sorts. And if the fence gets reused
> > and the pointers match, then our seqlock-of-sorts breaks. But I haven't
> > looked around whether there's more in the code that makes this an
> > irrelevant issue.
> 
> No, it should not break if we replace the fence with the same pointer.
> If the fence pointer expires, is reused and assigned back as the excl_fence
> -- it is still the excl_fence and by the properties of that
> excl_fence construction, it is later than the shared_fences.

So I thought the rules are that if we have an exclusive fence and shared
fences both present, then the shared fences are after the exclusive one.

But if we race here, then I think we could accidentally sample shared
fences from _before_ the exclusive fences. Rough timeline:

exclusive fence 1 -> shared fence 2 -> exclusive fence, but reuses memory
of fence 1

Then we sample fence 1, capture the shared fence 2, and notice that the
exclusive fence pointer is the same (but not the fence on the timeline)
and conclude that we got a consistent sample.

But the only consistent sample with the new fence state would be only the
exclusive fence.

Reminds me I forgot to look for the usual kref_get_unless_zero trickery we
also need to do here to make sure we have the right fence.
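
For reference, that trickery is roughly the dma_fence_get_rcu_safe()
pattern; a simplified sketch (the helper name below is made up, see
include/linux/dma-fence.h for the real thing), to be called under
rcu_read_lock():

static struct dma_fence *get_fence_sketch(struct dma_fence __rcu **fencep)
{
	struct dma_fence *fence;

	do {
		fence = rcu_dereference(*fencep);
		if (!fence)
			return NULL;

		/* kref_get_unless_zero(): fails if the fence is already
		 * being released */
		if (!dma_fence_get_rcu(fence))
			continue;

		/* now that we hold a reference the object cannot be
		 * recycled, so if the slot still points at it we really
		 * have the currently installed fence */
		if (fence == rcu_access_pointer(*fencep))
			return fence;

		dma_fence_put(fence);
	} while (1);
}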
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Re: [Intel-gfx] [PATCH 4/4] dma-buf: nuke reservation_object seq number

2019-08-14 Thread Chris Wilson
Quoting Chris Wilson (2019-08-14 18:38:20)
> Quoting Chris Wilson (2019-08-14 18:22:53)
> > Quoting Chris Wilson (2019-08-14 18:06:18)
> > > Quoting Chris Wilson (2019-08-14 17:42:48)
> > > > Quoting Daniel Vetter (2019-08-14 16:39:08)
> > > > > > > > +   } while (rcu_access_pointer(obj->fence_excl) != *excl);
> > > > > 
> > > > > What if someone is real fast (like really real fast) and recycles the
> > > > > exclusive fence so you read the same pointer twice, but everything 
> > > > > else
> > > > > changed? reused fence pointer is a lot more likely than seqlock 
> > > > > wrapping
> > > > > around.
> > > > 
> > > > It's an exclusive fence. If it is replaced, it must be later than all
> > > > the shared fences (and dependent on them directly or indirectly), and
> > > > so still a consistent snapshot.
> > > 
> > > An extension of that argument says we don't even need to loop,
> > > 
> > > *list = rcu_dereference(obj->fence);
> > > *shared_count = *list ? (*list)->shared_count : 0;
> > > smp_rmb();
> > > *excl = rcu_dereference(obj->fence_excl);
> > > 
> > > Gives a consistent snapshot. It doesn't matter if the fence_excl is
> > > before or after the shared_list -- if it's after, it's a superset of all
> > > fences, if it's before, we have a correct list of shared fences and the
> > > earlier fence_excl.
> > 
> > The problem is that the point of the loop is that we do need a check on
> > the fences after the full memory barrier.
> > 
> > Thinking of the rationale beaten out for dma_fence_get_excl_rcu_safe()
> > 
> > We don't have a full memory barrier here, so this cannot be used safely
> > in light of fence reallocation.
> 
> i.e. we need to restore the loops in the callers,
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_busy.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
> index a2aff1d8290e..f019062c8cd7 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_busy.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
> @@ -110,6 +110,7 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data,
>  * to report the overall busyness. This is what the wait-ioctl does.
>  *
>  */
> +retry:
> dma_resv_fences(obj->base.resv, &excl, &list, &shared_count);
> 
> /* Translate the exclusive fence to the READ *and* WRITE engine */
> @@ -122,6 +123,10 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data,
> args->busy |= busy_check_reader(fence);
> }
> 
> +   smp_rmb();
> +   if (excl != rcu_access_pointer(obj->base.resv->fence_excl))
> +   goto retry;
> +
> 
> wrap that up as
> 
> static inline bool
> dma_resv_fences_retry(struct dma_resv *resv, struct dma_fence *excl)
> {
> smp_rmb();
> return excl != rcu_access_pointer(resv->fence_excl);
> }

I give up. It's not just the fence_excl that's an issue here.

Any of the shared fences may be replaced after dma_resv_fences()
and so the original shared fence pointer may be reassigned (even under
RCU). The only defense against that is the seqcount.
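
(That is, the read side the patch removes -- reconstructed from the quoted
diff -- where any concurrent writer, including one replacing a shared
fence, bumps the seqcount and forces a retry:)

	do {
		seq = read_seqcount_begin(&obj->seq);
		*excl = rcu_dereference(obj->fence_excl);
		*list = rcu_dereference(obj->fence);
		*shared_count = *list ? (*list)->shared_count : 0;
	} while (read_seqcount_retry(&obj->seq, seq));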

I totally screwed that up.
-Chris

Re: [Intel-gfx] [PATCH 4/4] dma-buf: nuke reservation_object seq number

2019-08-14 Thread Chris Wilson
Quoting Chris Wilson (2019-08-14 18:22:53)
> Quoting Chris Wilson (2019-08-14 18:06:18)
> > Quoting Chris Wilson (2019-08-14 17:42:48)
> > > Quoting Daniel Vetter (2019-08-14 16:39:08)
> > > > > > > +   } while (rcu_access_pointer(obj->fence_excl) != *excl);
> > > > 
> > > > What if someone is real fast (like really real fast) and recycles the
> > > > exclusive fence so you read the same pointer twice, but everything else
> > > > changed? reused fence pointer is a lot more likely than seqlock wrapping
> > > > around.
> > > 
> > > It's an exclusive fence. If it is replaced, it must be later than all
> > > the shared fences (and dependent on them directly or indirectly), and
> > > so still a consistent snapshot.
> > 
> > An extension of that argument says we don't even need to loop,
> > 
> > *list = rcu_dereference(obj->fence);
> > *shared_count = *list ? (*list)->shared_count : 0;
> > smp_rmb();
> > *excl = rcu_dereference(obj->fence_excl);
> > 
> > Gives a consistent snapshot. It doesn't matter if the fence_excl is
> > before or after the shared_list -- if it's after, it's a superset of all
> > fences, if it's before, we have a correct list of shared fences and the
> > earlier fence_excl.
> 
> The problem is that the point of the loop is that we do need a check on
> the fences after the full memory barrier.
> 
> Thinking of the rationale beaten out for dma_fence_get_excl_rcu_safe()
> 
> We don't have a full memory barrier here, so this cannot be used safely
> in light of fence reallocation.

i.e. we need to restore the loops in the callers,

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_busy.c 
b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
index a2aff1d8290e..f019062c8cd7 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_busy.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_busy.c
@@ -110,6 +110,7 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data,
 * to report the overall busyness. This is what the wait-ioctl does.
 *
 */
+retry:
dma_resv_fences(obj->base.resv, &excl, &list, &shared_count);

/* Translate the exclusive fence to the READ *and* WRITE engine */
@@ -122,6 +123,10 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data,
args->busy |= busy_check_reader(fence);
}

+   smp_rmb();
+   if (excl != rcu_access_pointer(obj->base.resv->fence_excl))
+   goto retry;
+

wrap that up as

static inline bool
dma_resv_fences_retry(struct dma_resv *resv, struct dma_fence *excl)
{
smp_rmb();
return excl != rcu_access_pointer(resv->fence_excl);
}
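
and a caller would then look roughly like this (illustrative sketch only;
obj/args, the locals and the busy_check_*() helpers are the ones from
i915_gem_busy.c, used here under rcu_read_lock() as in that ioctl):

	struct dma_fence *excl;
	struct dma_resv_list *list;
	unsigned int shared_count, i;

	do {
		dma_resv_fences(obj->base.resv, &excl, &list, &shared_count);

		args->busy = excl ? busy_check_writer(excl) : 0;
		for (i = 0; i < shared_count; i++)
			args->busy |= busy_check_reader(
					rcu_dereference(list->shared[i]));
	} while (dma_resv_fences_retry(obj->base.resv, excl));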
-Chris

Re: [Intel-gfx] [PATCH 4/4] dma-buf: nuke reservation_object seq number

2019-08-14 Thread Chris Wilson
Quoting Chris Wilson (2019-08-14 18:06:18)
> Quoting Chris Wilson (2019-08-14 17:42:48)
> > Quoting Daniel Vetter (2019-08-14 16:39:08)
> > > > > > +   } while (rcu_access_pointer(obj->fence_excl) != *excl);
> > > 
> > > What if someone is real fast (like really real fast) and recycles the
> > > exclusive fence so you read the same pointer twice, but everything else
> > > changed? reused fence pointer is a lot more likely than seqlock wrapping
> > > around.
> > 
> > It's an exclusive fence. If it is replaced, it must be later than all
> > the shared fences (and dependent on them directly or indirectly), and
> > so still a consistent snapshot.
> 
> An extension of that argument says we don't even need to loop,
> 
> *list = rcu_dereference(obj->fence);
> *shared_count = *list ? (*list)->shared_count : 0;
> smp_rmb();
> *excl = rcu_dereference(obj->fence_excl);
> 
> Gives a consistent snapshot. It doesn't matter if the fence_excl is
> before or after the shared_list -- if it's after, it's a superset of all
> > > fences, if it's before, we have a correct list of shared fences and the
> earlier fence_excl.

The problem is that the point of the loop is that we do need a check on
the fences after the full memory barrier.

Thinking of the rationale beaten out for dma_fence_get_excl_rcu_safe()

We don't have a full memory barrier here, so this cannot be used safely
in light of fence reallocation.
-Chris

Re: [Intel-gfx] [PATCH 4/4] dma-buf: nuke reservation_object seq number

2019-08-14 Thread Chris Wilson
Quoting Daniel Vetter (2019-08-14 18:06:26)
> On Wed, Aug 14, 2019 at 05:42:48PM +0100, Chris Wilson wrote:
> > Quoting Daniel Vetter (2019-08-14 16:39:08)
[snip]
> > > > > >  if (old)
> > > > > > -   old->shared_count = 0;
> > > > > > -   write_seqcount_end(>seq);
> > > > > > +   smp_store_mb(old->shared_count, 0);
> > > 
> > > So your comment and the kerneldoc don't match up. Quoting
> > > Documentation/memory-barriers.txt:
> > > 
> > >  This assigns the value to the variable and then inserts a full memory
> > >  barrier after it.  It isn't guaranteed to insert anything more than a
> > >  compiler barrier in a UP compilation.
> > > 
> > > So order is 1. store 2. fence, but your comment suggests you want it the
> > > other way round.
> > 
> > What's more weird is that it is a fully serialising instruction that is
> > used to fence first as part of the update. If that's the way PeterZ uses
> > it...
> 
> I haven't looked at the implementations tbh, just going with the text. Or
> do you mean in the write_seqlock that we're replacing?

Nah, I misremembered set_current_state(), all that implies is the fence
is before the following instructions. I have some recollection that it
can be used as a RELEASE operation (if only because it is a locked xchg).
If all else fails, make it an xchg_release(). Or normal assignment +
smp_wmb().
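
The "normal assignment + smp_wmb()" variant would be, roughly (a sketch,
not the committed code; the barrier pairs with the smp_rmb() on the reader
side):

	rcu_assign_pointer(obj->fence_excl, fence);
	if (old) {
		/* make the fence_excl update visible before the count reset */
		smp_wmb();
		old->shared_count = 0;
	}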

> > It's an exclusive fence. If it is replaced, it must be later than all
> > the shared fences (and dependent on them directly or indirectly), and
> > so still a consistent snapshot.
> 
> I'm not worried about the fence, that part is fine. But we're de facto
> using the fence as a fancy seqlock-of-sorts. And if the fence gets reused
> and the pointers match, then our seqlock-of-sorts breaks. But I haven't
> looked around whether there's more in the code that makes this an
> irrelevant issue.

No, it should not break if we replace the fence with the same pointer.
If the fence pointer expires, is reused and assigned back as the excl_fence
-- it is still the excl_fence and by the properties of that
excl_fence construction, it is later than the shared_fences.
-Chris

Re: [Intel-gfx] [PATCH 4/4] dma-buf: nuke reservation_object seq number

2019-08-14 Thread Chris Wilson
Quoting Chris Wilson (2019-08-14 17:42:48)
> Quoting Daniel Vetter (2019-08-14 16:39:08)
> > > > > +   } while (rcu_access_pointer(obj->fence_excl) != *excl);
> > 
> > What if someone is real fast (like really real fast) and recycles the
> > exclusive fence so you read the same pointer twice, but everything else
> > changed? reused fence pointer is a lot more likely than seqlock wrapping
> > around.
> 
> It's an exclusive fence. If it is replaced, it must be later than all
> the shared fences (and dependent on them directly or indirectly), and
> so still a consistent snapshot.

An extension of that argument says we don't even need to loop,

*list = rcu_dereference(obj->fence);
*shared_count = *list ? (*list)->shared_count : 0;
smp_rmb();
*excl = rcu_dereference(obj->fence_excl);

Gives a consistent snapshot. It doesn't matter if the fence_excl is
before or after the shared_list -- if it's after, it's a superset of all
fences, if it's before, we have a correct list of shared fences and the
earlier fence_excl.
-Chris

Re: [Intel-gfx] [PATCH 4/4] dma-buf: nuke reservation_object seq number

2019-08-14 Thread Daniel Vetter
On Wed, Aug 14, 2019 at 05:42:48PM +0100, Chris Wilson wrote:
> Quoting Daniel Vetter (2019-08-14 16:39:08)
> > Sorry I buried myself in some other stuff ...
> > 
> > On Sat, Aug 10, 2019 at 12:51:00PM +0200, Christian König wrote:
> > > On 07.08.19 at 16:17, Chris Wilson wrote:
> > > > Quoting Christian König (2019-08-07 14:53:12)
> > > > > The only remaining use for this is to protect against setting a new 
> > > > > exclusive
> > > > > fence while we grab both exclusive and shared. That can also be 
> > > > > achieved by
> > > > > looking if the exclusive fence has changed or not after completing the
> > > > > operation.
> > > > > 
> > > > > v2: switch setting excl fence to rcu_assign_pointer
> > > > > 
> > > > > Signed-off-by: Christian König 
> > > > > ---
> > > > >   drivers/dma-buf/reservation.c | 24 +---
> > > > >   include/linux/reservation.h   |  9 ++---
> > > > >   2 files changed, 7 insertions(+), 26 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/dma-buf/reservation.c 
> > > > > b/drivers/dma-buf/reservation.c
> > > > > index 90bc6ef03598..f7f4a0858c2a 100644
> > > > > --- a/drivers/dma-buf/reservation.c
> > > > > +++ b/drivers/dma-buf/reservation.c
> > > > > @@ -49,12 +49,6 @@
> > > > >   DEFINE_WD_CLASS(reservation_ww_class);
> > > > >   EXPORT_SYMBOL(reservation_ww_class);
> > > > > -struct lock_class_key reservation_seqcount_class;
> > > > > -EXPORT_SYMBOL(reservation_seqcount_class);
> > > > > -
> > > > > -const char reservation_seqcount_string[] = "reservation_seqcount";
> > > > > -EXPORT_SYMBOL(reservation_seqcount_string);
> > > > > -
> > > > >   /**
> > > > >* reservation_object_list_alloc - allocate fence list
> > > > >* @shared_max: number of fences we need space for
> > > > > @@ -103,9 +97,6 @@ static void reservation_object_list_free(struct 
> > > > > reservation_object_list *list)
> > > > >   void reservation_object_init(struct reservation_object *obj)
> > > > >   {
> > > > >  ww_mutex_init(&obj->lock, &reservation_ww_class);
> > > > > -
> > > > > -   __seqcount_init(&obj->seq, reservation_seqcount_string,
> > > > > -   &reservation_seqcount_class);
> > > > >  RCU_INIT_POINTER(obj->fence, NULL);
> > > > >  RCU_INIT_POINTER(obj->fence_excl, NULL);
> > > > >   }
> > > > > @@ -282,12 +273,10 @@ void reservation_object_add_excl_fence(struct 
> > > > > reservation_object *obj,
> > > > >  dma_fence_get(fence);
> > > > >  preempt_disable();
> > > > > -   write_seqcount_begin(&obj->seq);
> > > > > -   /* write_seqcount_begin provides the necessary memory barrier 
> > > > > */
> > > > > -   RCU_INIT_POINTER(obj->fence_excl, fence);
> > > > > +   rcu_assign_pointer(obj->fence_excl, fence);
> > > > > +   /* pointer update must be visible before we modify the 
> > > > > shared_count */
> > 
> > Pls add a "see reservation_object_fence()" here or similar.
> > 
> > > > >  if (old)
> > > > > -   old->shared_count = 0;
> > > > > -   write_seqcount_end(&obj->seq);
> > > > > +   smp_store_mb(old->shared_count, 0);
> > 
> > So your comment and the kerneldoc don't match up. Quoting
> > Documentation/memory-barriers.txt:
> > 
> >  This assigns the value to the variable and then inserts a full memory
> >  barrier after it.  It isn't guaranteed to insert anything more than a
> >  compiler barrier in a UP compilation.
> > 
> > So order is 1. store 2. fence, but your comment suggests you want it the
> > other way round.
> 
> What's more weird is that it is a fully serialising instruction that is
> used to fence first as part of the update. If that's the way PeterZ uses
> it...

I haven't looked at the implementations tbh, just going with the text. Or
do you mean in the write_seqlock that we're replacing?

> 
> > > > >  preempt_enable();
> > > > >  /* inplace update, no shared fences */
> > > > > @@ -368,11 +357,8 @@ int reservation_object_copy_fences(struct 
> > > > > reservation_object *dst,
> > > > >  old = reservation_object_get_excl(dst);
> > > > >  preempt_disable();
> > > > > -   write_seqcount_begin(&dst->seq);
> > > > > -   /* write_seqcount_begin provides the necessary memory barrier 
> > > > > */
> > > > > -   RCU_INIT_POINTER(dst->fence_excl, new);
> > > > > -   RCU_INIT_POINTER(dst->fence, dst_list);
> > > > > -   write_seqcount_end(&dst->seq);
> > > > > +   rcu_assign_pointer(dst->fence_excl, new);
> > > > > +   rcu_assign_pointer(dst->fence, dst_list);
> > > > >  preempt_enable();
> > > > >  reservation_object_list_free(src_list);
> > > > > diff --git a/include/linux/reservation.h b/include/linux/reservation.h
> > > > > index 044a5cd4af50..fd29baad0be3 100644
> > > > > --- a/include/linux/reservation.h
> > > > > +++ b/include/linux/reservation.h
> > > > > @@ -46,8 +46,6 @@
> > > > >   #include 
> > > > >   extern struct ww_class reservation_ww_class;
> > > > > -extern struct lock_class_key 

Re: [Intel-gfx] [PATCH 4/4] dma-buf: nuke reservation_object seq number

2019-08-14 Thread Chris Wilson
Quoting Daniel Vetter (2019-08-14 16:39:08)
> Sorry I buried myself in some other stuff ...
> 
> On Sat, Aug 10, 2019 at 12:51:00PM +0200, Christian König wrote:
> > On 07.08.19 at 16:17, Chris Wilson wrote:
> > > Quoting Christian König (2019-08-07 14:53:12)
> > > > The only remaining use for this is to protect against setting a new 
> > > > exclusive
> > > > fence while we grab both exclusive and shared. That can also be 
> > > > achieved by
> > > > looking if the exclusive fence has changed or not after completing the
> > > > operation.
> > > > 
> > > > v2: switch setting excl fence to rcu_assign_pointer
> > > > 
> > > > Signed-off-by: Christian König 
> > > > ---
> > > >   drivers/dma-buf/reservation.c | 24 +---
> > > >   include/linux/reservation.h   |  9 ++---
> > > >   2 files changed, 7 insertions(+), 26 deletions(-)
> > > > 
> > > > diff --git a/drivers/dma-buf/reservation.c 
> > > > b/drivers/dma-buf/reservation.c
> > > > index 90bc6ef03598..f7f4a0858c2a 100644
> > > > --- a/drivers/dma-buf/reservation.c
> > > > +++ b/drivers/dma-buf/reservation.c
> > > > @@ -49,12 +49,6 @@
> > > >   DEFINE_WD_CLASS(reservation_ww_class);
> > > >   EXPORT_SYMBOL(reservation_ww_class);
> > > > -struct lock_class_key reservation_seqcount_class;
> > > > -EXPORT_SYMBOL(reservation_seqcount_class);
> > > > -
> > > > -const char reservation_seqcount_string[] = "reservation_seqcount";
> > > > -EXPORT_SYMBOL(reservation_seqcount_string);
> > > > -
> > > >   /**
> > > >* reservation_object_list_alloc - allocate fence list
> > > >* @shared_max: number of fences we need space for
> > > > @@ -103,9 +97,6 @@ static void reservation_object_list_free(struct 
> > > > reservation_object_list *list)
> > > >   void reservation_object_init(struct reservation_object *obj)
> > > >   {
> > > >  ww_mutex_init(&obj->lock, &reservation_ww_class);
> > > > -
> > > > -   __seqcount_init(&obj->seq, reservation_seqcount_string,
> > > > -   &reservation_seqcount_class);
> > > >  RCU_INIT_POINTER(obj->fence, NULL);
> > > >  RCU_INIT_POINTER(obj->fence_excl, NULL);
> > > >   }
> > > > @@ -282,12 +273,10 @@ void reservation_object_add_excl_fence(struct 
> > > > reservation_object *obj,
> > > >  dma_fence_get(fence);
> > > >  preempt_disable();
> > > > -   write_seqcount_begin(&obj->seq);
> > > > -   /* write_seqcount_begin provides the necessary memory barrier */
> > > > -   RCU_INIT_POINTER(obj->fence_excl, fence);
> > > > +   rcu_assign_pointer(obj->fence_excl, fence);
> > > > +   /* pointer update must be visible before we modify the 
> > > > shared_count */
> 
> Pls add a "see reservation_object_fence()" here or similar.
> 
> > > >  if (old)
> > > > -   old->shared_count = 0;
> > > > -   write_seqcount_end(&obj->seq);
> > > > +   smp_store_mb(old->shared_count, 0);
> 
> So your comment and the kerneldoc don't match up. Quoting
> Documentation/memory-barriers.txt:
> 
>  This assigns the value to the variable and then inserts a full memory
>  barrier after it.  It isn't guaranteed to insert anything more than a
>  compiler barrier in a UP compilation.
> 
> So order is 1. store 2. fence, but your comment suggests you want it the
> other way round.

What's more weird is that it is a fully serialising instruction that is
used to fence first as part of the update. If that's the way PeterZ uses
it...

> > > >  preempt_enable();
> > > >  /* inplace update, no shared fences */
> > > > @@ -368,11 +357,8 @@ int reservation_object_copy_fences(struct 
> > > > reservation_object *dst,
> > > >  old = reservation_object_get_excl(dst);
> > > >  preempt_disable();
> > > > -   write_seqcount_begin(&dst->seq);
> > > > -   /* write_seqcount_begin provides the necessary memory barrier */
> > > > -   RCU_INIT_POINTER(dst->fence_excl, new);
> > > > -   RCU_INIT_POINTER(dst->fence, dst_list);
> > > > -   write_seqcount_end(&dst->seq);
> > > > +   rcu_assign_pointer(dst->fence_excl, new);
> > > > +   rcu_assign_pointer(dst->fence, dst_list);
> > > >  preempt_enable();
> > > >  reservation_object_list_free(src_list);
> > > > diff --git a/include/linux/reservation.h b/include/linux/reservation.h
> > > > index 044a5cd4af50..fd29baad0be3 100644
> > > > --- a/include/linux/reservation.h
> > > > +++ b/include/linux/reservation.h
> > > > @@ -46,8 +46,6 @@
> > > >   #include 
> > > >   extern struct ww_class reservation_ww_class;
> > > > -extern struct lock_class_key reservation_seqcount_class;
> > > > -extern const char reservation_seqcount_string[];
> > > >   /**
> > > >* struct reservation_object_list - a list of shared fences
> > > > @@ -71,7 +69,6 @@ struct reservation_object_list {
> > > >*/
> > > >   struct reservation_object {
> > > >  struct ww_mutex lock;
> > > > -   seqcount_t seq;
> > > >  struct dma_fence __rcu *fence_excl;
> > > >  struct 

Re: [Intel-gfx] [PATCH 4/4] dma-buf: nuke reservation_object seq number

2019-08-14 Thread Daniel Vetter
Sorry I buried myself in some other stuff ...

On Sat, Aug 10, 2019 at 12:51:00PM +0200, Christian König wrote:
> On 07.08.19 at 16:17, Chris Wilson wrote:
> > Quoting Christian König (2019-08-07 14:53:12)
> > > The only remaining use for this is to protect against setting a new 
> > > exclusive
> > > fence while we grab both exclusive and shared. That can also be achieved 
> > > by
> > > looking if the exclusive fence has changed or not after completing the
> > > operation.
> > > 
> > > v2: switch setting excl fence to rcu_assign_pointer
> > > 
> > > Signed-off-by: Christian König 
> > > ---
> > >   drivers/dma-buf/reservation.c | 24 +---
> > >   include/linux/reservation.h   |  9 ++---
> > >   2 files changed, 7 insertions(+), 26 deletions(-)
> > > 
> > > diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c
> > > index 90bc6ef03598..f7f4a0858c2a 100644
> > > --- a/drivers/dma-buf/reservation.c
> > > +++ b/drivers/dma-buf/reservation.c
> > > @@ -49,12 +49,6 @@
> > >   DEFINE_WD_CLASS(reservation_ww_class);
> > >   EXPORT_SYMBOL(reservation_ww_class);
> > > -struct lock_class_key reservation_seqcount_class;
> > > -EXPORT_SYMBOL(reservation_seqcount_class);
> > > -
> > > -const char reservation_seqcount_string[] = "reservation_seqcount";
> > > -EXPORT_SYMBOL(reservation_seqcount_string);
> > > -
> > >   /**
> > >* reservation_object_list_alloc - allocate fence list
> > >* @shared_max: number of fences we need space for
> > > @@ -103,9 +97,6 @@ static void reservation_object_list_free(struct 
> > > reservation_object_list *list)
> > >   void reservation_object_init(struct reservation_object *obj)
> > >   {
> > >  ww_mutex_init(&obj->lock, &reservation_ww_class);
> > > -
> > > -   __seqcount_init(&obj->seq, reservation_seqcount_string,
> > > -   &reservation_seqcount_class);
> > >  RCU_INIT_POINTER(obj->fence, NULL);
> > >  RCU_INIT_POINTER(obj->fence_excl, NULL);
> > >   }
> > > @@ -282,12 +273,10 @@ void reservation_object_add_excl_fence(struct 
> > > reservation_object *obj,
> > >  dma_fence_get(fence);
> > >  preempt_disable();
> > > -   write_seqcount_begin(&obj->seq);
> > > -   /* write_seqcount_begin provides the necessary memory barrier */
> > > -   RCU_INIT_POINTER(obj->fence_excl, fence);
> > > +   rcu_assign_pointer(obj->fence_excl, fence);
> > > +   /* pointer update must be visible before we modify the 
> > > shared_count */

Pls add a "see reservation_object_fence()" here or similar.

> > >  if (old)
> > > -   old->shared_count = 0;
> > > -   write_seqcount_end(&obj->seq);
> > > +   smp_store_mb(old->shared_count, 0);

So your comment and the kerneldoc don't match up. Quoting
Documentation/memory-barriers.txt:

 This assigns the value to the variable and then inserts a full memory
 barrier after it.  It isn't guaranteed to insert anything more than a
 compiler barrier in a UP compilation.

So order is 1. store 2. fence, but your comment suggests you want it the
other way round.

> > >  preempt_enable();
> > >  /* inplace update, no shared fences */
> > > @@ -368,11 +357,8 @@ int reservation_object_copy_fences(struct 
> > > reservation_object *dst,
> > >  old = reservation_object_get_excl(dst);
> > >  preempt_disable();
> > > -   write_seqcount_begin(&dst->seq);
> > > -   /* write_seqcount_begin provides the necessary memory barrier */
> > > -   RCU_INIT_POINTER(dst->fence_excl, new);
> > > -   RCU_INIT_POINTER(dst->fence, dst_list);
> > > -   write_seqcount_end(&dst->seq);
> > > +   rcu_assign_pointer(dst->fence_excl, new);
> > > +   rcu_assign_pointer(dst->fence, dst_list);
> > >  preempt_enable();
> > >  reservation_object_list_free(src_list);
> > > diff --git a/include/linux/reservation.h b/include/linux/reservation.h
> > > index 044a5cd4af50..fd29baad0be3 100644
> > > --- a/include/linux/reservation.h
> > > +++ b/include/linux/reservation.h
> > > @@ -46,8 +46,6 @@
> > >   #include 
> > >   extern struct ww_class reservation_ww_class;
> > > -extern struct lock_class_key reservation_seqcount_class;
> > > -extern const char reservation_seqcount_string[];
> > >   /**
> > >* struct reservation_object_list - a list of shared fences
> > > @@ -71,7 +69,6 @@ struct reservation_object_list {
> > >*/
> > >   struct reservation_object {
> > >  struct ww_mutex lock;
> > > -   seqcount_t seq;
> > >  struct dma_fence __rcu *fence_excl;
> > >  struct reservation_object_list __rcu *fence;
> > > @@ -156,14 +153,12 @@ reservation_object_fences(struct reservation_object 
> > > *obj,
> > >struct reservation_object_list **list,
> > >u32 *shared_count)
> > >   {
> > > -   unsigned int seq;
> > > -
> > >  do {
> > > -   seq = read_seqcount_begin(&obj->seq);
> > >  *excl = 

Re: [Intel-gfx] [PATCH 4/4] dma-buf: nuke reservation_object seq number

2019-08-10 Thread Christian König

On 07.08.19 at 16:17, Chris Wilson wrote:

Quoting Christian König (2019-08-07 14:53:12)

The only remaining use for this is to protect against setting a new exclusive
fence while we grab both exclusive and shared. That can also be achieved by
looking if the exclusive fence has changed or not after completing the
operation.

v2: switch setting excl fence to rcu_assign_pointer

Signed-off-by: Christian König 
---
  drivers/dma-buf/reservation.c | 24 +---
  include/linux/reservation.h   |  9 ++---
  2 files changed, 7 insertions(+), 26 deletions(-)

diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c
index 90bc6ef03598..f7f4a0858c2a 100644
--- a/drivers/dma-buf/reservation.c
+++ b/drivers/dma-buf/reservation.c
@@ -49,12 +49,6 @@
  DEFINE_WD_CLASS(reservation_ww_class);
  EXPORT_SYMBOL(reservation_ww_class);
  
-struct lock_class_key reservation_seqcount_class;

-EXPORT_SYMBOL(reservation_seqcount_class);
-
-const char reservation_seqcount_string[] = "reservation_seqcount";
-EXPORT_SYMBOL(reservation_seqcount_string);
-
  /**
   * reservation_object_list_alloc - allocate fence list
   * @shared_max: number of fences we need space for
@@ -103,9 +97,6 @@ static void reservation_object_list_free(struct 
reservation_object_list *list)
  void reservation_object_init(struct reservation_object *obj)
  {
 ww_mutex_init(&obj->lock, &reservation_ww_class);
-
-   __seqcount_init(&obj->seq, reservation_seqcount_string,
-   &reservation_seqcount_class);
 RCU_INIT_POINTER(obj->fence, NULL);
 RCU_INIT_POINTER(obj->fence_excl, NULL);
  }
@@ -282,12 +273,10 @@ void reservation_object_add_excl_fence(struct 
reservation_object *obj,
 dma_fence_get(fence);
  
 preempt_disable();

-   write_seqcount_begin(&obj->seq);
-   /* write_seqcount_begin provides the necessary memory barrier */
-   RCU_INIT_POINTER(obj->fence_excl, fence);
+   rcu_assign_pointer(obj->fence_excl, fence);
+   /* pointer update must be visible before we modify the shared_count */
 if (old)
-   old->shared_count = 0;
-   write_seqcount_end(&obj->seq);
+   smp_store_mb(old->shared_count, 0);
 preempt_enable();
  
 /* inplace update, no shared fences */

@@ -368,11 +357,8 @@ int reservation_object_copy_fences(struct 
reservation_object *dst,
 old = reservation_object_get_excl(dst);
  
 preempt_disable();

-   write_seqcount_begin(&dst->seq);
-   /* write_seqcount_begin provides the necessary memory barrier */
-   RCU_INIT_POINTER(dst->fence_excl, new);
-   RCU_INIT_POINTER(dst->fence, dst_list);
-   write_seqcount_end(&dst->seq);
+   rcu_assign_pointer(dst->fence_excl, new);
+   rcu_assign_pointer(dst->fence, dst_list);
 preempt_enable();
  
 reservation_object_list_free(src_list);

diff --git a/include/linux/reservation.h b/include/linux/reservation.h
index 044a5cd4af50..fd29baad0be3 100644
--- a/include/linux/reservation.h
+++ b/include/linux/reservation.h
@@ -46,8 +46,6 @@
  #include 
  
  extern struct ww_class reservation_ww_class;

-extern struct lock_class_key reservation_seqcount_class;
-extern const char reservation_seqcount_string[];
  
  /**

   * struct reservation_object_list - a list of shared fences
@@ -71,7 +69,6 @@ struct reservation_object_list {
   */
  struct reservation_object {
 struct ww_mutex lock;
-   seqcount_t seq;
  
 struct dma_fence __rcu *fence_excl;

 struct reservation_object_list __rcu *fence;
@@ -156,14 +153,12 @@ reservation_object_fences(struct reservation_object *obj,
   struct reservation_object_list **list,
   u32 *shared_count)
  {
-   unsigned int seq;
-
 do {
-   seq = read_seqcount_begin(&obj->seq);
 *excl = rcu_dereference(obj->fence_excl);
 *list = rcu_dereference(obj->fence);
 *shared_count = *list ? (*list)->shared_count : 0;
-   } while (read_seqcount_retry(&obj->seq, seq));
+   smp_rmb(); /* See reservation_object_add_excl_fence */
+   } while (rcu_access_pointer(obj->fence_excl) != *excl);
  }

Reviewed-by: Chris Wilson 

I think this is correct. Now see if we can convince Daniel!


Daniel, any objections to this? IGTs look good as well, so if not I'm 
going to push it.


Christian.


-Chris



Re: [Intel-gfx] [PATCH 4/4] dma-buf: nuke reservation_object seq number

2019-08-07 Thread Chris Wilson
Quoting Christian König (2019-08-07 14:53:12)
> The only remaining use for this is to protect against setting a new exclusive
> fence while we grab both exclusive and shared. That can also be achieved by
> looking if the exclusive fence has changed or not after completing the
> operation.
> 
> v2: switch setting excl fence to rcu_assign_pointer
> 
> Signed-off-by: Christian König 
> ---
>  drivers/dma-buf/reservation.c | 24 +---
>  include/linux/reservation.h   |  9 ++---
>  2 files changed, 7 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c
> index 90bc6ef03598..f7f4a0858c2a 100644
> --- a/drivers/dma-buf/reservation.c
> +++ b/drivers/dma-buf/reservation.c
> @@ -49,12 +49,6 @@
>  DEFINE_WD_CLASS(reservation_ww_class);
>  EXPORT_SYMBOL(reservation_ww_class);
>  
> -struct lock_class_key reservation_seqcount_class;
> -EXPORT_SYMBOL(reservation_seqcount_class);
> -
> -const char reservation_seqcount_string[] = "reservation_seqcount";
> -EXPORT_SYMBOL(reservation_seqcount_string);
> -
>  /**
>   * reservation_object_list_alloc - allocate fence list
>   * @shared_max: number of fences we need space for
> @@ -103,9 +97,6 @@ static void reservation_object_list_free(struct 
> reservation_object_list *list)
>  void reservation_object_init(struct reservation_object *obj)
>  {
> ww_mutex_init(&obj->lock, &reservation_ww_class);
> -
> -   __seqcount_init(&obj->seq, reservation_seqcount_string,
> -   &reservation_seqcount_class);
> RCU_INIT_POINTER(obj->fence, NULL);
> RCU_INIT_POINTER(obj->fence_excl, NULL);
>  }
> @@ -282,12 +273,10 @@ void reservation_object_add_excl_fence(struct 
> reservation_object *obj,
> dma_fence_get(fence);
>  
> preempt_disable();
> -   write_seqcount_begin(&obj->seq);
> -   /* write_seqcount_begin provides the necessary memory barrier */
> -   RCU_INIT_POINTER(obj->fence_excl, fence);
> +   rcu_assign_pointer(obj->fence_excl, fence);
> +   /* pointer update must be visible before we modify the shared_count */
> if (old)
> -   old->shared_count = 0;
> -   write_seqcount_end(&obj->seq);
> +   smp_store_mb(old->shared_count, 0);
> preempt_enable();
>  
> /* inplace update, no shared fences */
> @@ -368,11 +357,8 @@ int reservation_object_copy_fences(struct 
> reservation_object *dst,
> old = reservation_object_get_excl(dst);
>  
> preempt_disable();
> -   write_seqcount_begin(&dst->seq);
> -   /* write_seqcount_begin provides the necessary memory barrier */
> -   RCU_INIT_POINTER(dst->fence_excl, new);
> -   RCU_INIT_POINTER(dst->fence, dst_list);
> -   write_seqcount_end(&dst->seq);
> +   rcu_assign_pointer(dst->fence_excl, new);
> +   rcu_assign_pointer(dst->fence, dst_list);
> preempt_enable();
>  
> reservation_object_list_free(src_list);
> diff --git a/include/linux/reservation.h b/include/linux/reservation.h
> index 044a5cd4af50..fd29baad0be3 100644
> --- a/include/linux/reservation.h
> +++ b/include/linux/reservation.h
> @@ -46,8 +46,6 @@
>  #include 
>  
>  extern struct ww_class reservation_ww_class;
> -extern struct lock_class_key reservation_seqcount_class;
> -extern const char reservation_seqcount_string[];
>  
>  /**
>   * struct reservation_object_list - a list of shared fences
> @@ -71,7 +69,6 @@ struct reservation_object_list {
>   */
>  struct reservation_object {
> struct ww_mutex lock;
> -   seqcount_t seq;
>  
> struct dma_fence __rcu *fence_excl;
> struct reservation_object_list __rcu *fence;
> @@ -156,14 +153,12 @@ reservation_object_fences(struct reservation_object 
> *obj,
>   struct reservation_object_list **list,
>   u32 *shared_count)
>  {
> -   unsigned int seq;
> -
> do {
> -   seq = read_seqcount_begin(&obj->seq);
> *excl = rcu_dereference(obj->fence_excl);
> *list = rcu_dereference(obj->fence);
> *shared_count = *list ? (*list)->shared_count : 0;
> -   } while (read_seqcount_retry(&obj->seq, seq));
> +   smp_rmb(); /* See reservation_object_add_excl_fence */
> +   } while (rcu_access_pointer(obj->fence_excl) != *excl);
>  }

Reviewed-by: Chris Wilson 

I think this is correct. Now see if we can convince Daniel!
-Chris