Re: [PATCH 1/5] dma-buf: add optional invalidate_mappings callback v2

2018-03-19 Thread Christian König

On 19.03.2018 16:53, Chris Wilson wrote:

Quoting Christian König (2018-03-16 14:22:32)
[snip, probably lost too much context]

This allows for full-grown pipelining, e.g. the exporter can say it needs
to move the buffer for some operation. Then let the move operation wait
for all existing fences in the reservation object and install the fence
of the move operation as exclusive fence.

Ok, the situation I have in mind is the non-pipelined case: revoking the
dma-buf for mmu_invalidate_range or shrink_slab. I would need a
completion event that the CPU can wait on for all the invalidate
callbacks. (Essentially an atomic_t counter plus struct completion; a
lighter version of dma_fence, I wonder where I've seen that before ;)


Actually that is harmless.

When you need to unmap a DMA-buf because of mmu_invalidate_range or 
shrink_slab you need to wait for its reservation object anyway.


This needs to be done to make sure that the backing memory is now idle; 
it doesn't matter whether the jobs were submitted by DMA-buf importers 
or by your own driver.
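
For illustration, a minimal sketch of that wait (my_wait_for_idle is a
made-up helper name; the reservation_object call is the existing API):

    /* Wait for all fences, shared and exclusive, on the buffer's
     * reservation object before tearing down the backing store. */
    static long my_wait_for_idle(struct dma_buf *dmabuf)
    {
            return reservation_object_wait_timeout_rcu(dmabuf->resv,
                                                       true,  /* wait_all */
                                                       false, /* !intr */
                                                       MAX_SCHEDULE_TIMEOUT);
    }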


The sg tables pointing to the now-released memory might live a bit 
longer, but that is unproblematic and actually intended.


If we tried to destroy the sg tables in an mmu_invalidate_range or 
shrink_slab callback we would run into a lockdep horror.


Regards,
Christian.



Even so, it basically means passing a fence object down to the async
callbacks for them to signal when they are complete. Just to handle the
non-pipelined version. :|
-Chris


Re: [PATCH 1/5] dma-buf: add optional invalidate_mappings callback v2

2018-03-19 Thread Chris Wilson
Quoting Christian König (2018-03-16 14:22:32)
[snip, probably lost too much context]
> This allows for full-grown pipelining, e.g. the exporter can say it needs 
> to move the buffer for some operation. Then let the move operation wait 
> for all existing fences in the reservation object and install the fence 
> of the move operation as exclusive fence.

Ok, the situation I have in mind is the non-pipelined case: revoking the
dma-buf for mmu_invalidate_range or shrink_slab. I would need a
completion event that the CPU can wait on for all the invalidate
callbacks. (Essentially an atomic_t counter plus struct completion; a
lighter version of dma_fence, I wonder where I've seen that before ;)
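
A minimal sketch of that construct, all names hypothetical:

    struct revoke_ctx {
            atomic_t pending;
            struct completion done;
    };

    static void revoke_ctx_init(struct revoke_ctx *ctx, int nr_callbacks)
    {
            atomic_set(&ctx->pending, nr_callbacks);
            init_completion(&ctx->done);
    }

    /* Each async invalidate callback calls this when it has finished. */
    static void revoke_ctx_put(struct revoke_ctx *ctx)
    {
            if (atomic_dec_and_test(&ctx->pending))
                    complete(&ctx->done);
    }

    /* The revoking path (mmu_invalidate_range / shrink_slab) waits here. */
    static void revoke_ctx_wait(struct revoke_ctx *ctx)
    {
            wait_for_completion(&ctx->done);
    }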

Even so, it basically means passing a fence object down to the async
callbacks for them to signal when they are complete. Just to handle the
non-pipelined version. :|
-Chris


Re: [PATCH 1/5] dma-buf: add optional invalidate_mappings callback v2

2018-03-19 Thread Daniel Vetter
On Fri, Mar 16, 2018 at 02:20:45PM +0100, Christian König wrote:
> Each importer can now provide an invalidate_mappings callback.
> 
> This allows the exporter to provide the mappings without the need to pin
> the backing store.
> 
> v2: don't try to invalidate mappings when the callback is NULL,
> lock the reservation obj while using the attachments,
> add helper to set the callback
> 
> Signed-off-by: Christian König 

Replying here to avoid thread split, but not entirely related.

I thought some more about the lockdep splat discussion, and specifically
that amdgpu needs the reservations for the vm objects when doing a gpu
reset.

Since they're in the same ww_class as all other dma-buf reservations I'm
pretty sure lockdep will complain, at least when cross-release lockdep and
cross-release annotations for dma_fence are merged.

And as long as there's some case where amdgpu needs both the vm object
reservation and other reservations (CS?) then we must have them in the
same class, and in that case the deadlock is real. It'll require an
impressive set of circumstances most likely (the minimal number of threads
we generally needed to actually hit the cross-release stuff was 4+ or
something nuts like that, all doing something else), but it'll be real.

Otoh I think the invalidate stuff here doesn't actually make this worse,
so we can bang our heads against the gpu reset problem at leisure :-)

This stuff here has much more potential to interact badly with core mm
paths, and anything related to that (which ime also means all the cpu
hotplug machinery, which includes suspend/resume, and anything related to
the vfs because someone always manages to drag sysfs into the picture).

It's going to be fun times.

Cheers, Daniel
> ---
>  drivers/dma-buf/dma-buf.c | 60 +++
>  include/linux/dma-buf.h   | 38 ++
>  2 files changed, 98 insertions(+)
> 
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index d78d5fc173dc..ed2b3559ba25 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -572,7 +572,9 @@ struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
>   if (ret)
>   goto err_attach;
>   }
> + reservation_object_lock(dmabuf->resv, NULL);
>   list_add(&attach->node, &dmabuf->attachments);
> + reservation_object_unlock(dmabuf->resv);
>  
>   mutex_unlock(&dmabuf->lock);
>   return attach;
> @@ -598,7 +600,9 @@ void dma_buf_detach(struct dma_buf *dmabuf, struct dma_buf_attachment *attach)
>   return;
>  
>   mutex_lock(&dmabuf->lock);
> + reservation_object_lock(dmabuf->resv, NULL);
>   list_del(&attach->node);
> + reservation_object_unlock(dmabuf->resv);
>   if (dmabuf->ops->detach)
>   dmabuf->ops->detach(dmabuf, attach);
>  
> @@ -632,10 +636,23 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach,
>   if (WARN_ON(!attach || !attach->dmabuf))
>   return ERR_PTR(-EINVAL);
>  
> + /*
> +  * Mapping a DMA-buf can trigger its invalidation; prevent sending this
> +  * event to the caller by temporarily removing this attachment from the
> +  * list.
> +  */
> + if (attach->invalidate_mappings) {
> + reservation_object_assert_held(attach->dmabuf->resv);
> + list_del(&attach->node);
> + }
> +
>   sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction);
>   if (!sg_table)
>   sg_table = ERR_PTR(-ENOMEM);
>  
> + if (attach->invalidate_mappings)
> + list_add(&attach->node, &attach->dmabuf->attachments);
> +
>   return sg_table;
>  }
>  EXPORT_SYMBOL_GPL(dma_buf_map_attachment);
> @@ -656,6 +673,9 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach,
>  {
>   might_sleep();
>  
> + if (attach->invalidate_mappings)
> + reservation_object_assert_held(attach->dmabuf->resv);
> +
>   if (WARN_ON(!attach || !attach->dmabuf || !sg_table))
>   return;
>  
> @@ -664,6 +684,44 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach,
>  }
>  EXPORT_SYMBOL_GPL(dma_buf_unmap_attachment);
>  
> +/**
> + * dma_buf_set_invalidate_callback - set the invalidate_mappings callback
> + *
> + * @attach:  [in]attachment where to set the callback
> + * @cb:  [in]the callback to set
> + *
> + * Makes sure to take the appropriate locks when updating the invalidate
> + * mappings callback.
> + */
> +void dma_buf_set_invalidate_callback(struct dma_buf_attachment *attach,
> +  void (*cb)(struct dma_buf_attachment *))
> +{
> + reservation_object_lock(attach->dmabuf->resv, NULL);
> + attach->invalidate_mappings = cb;
> + reservation_object_unlock(attach->dmabuf->resv);
> +}
> +EXPORT_SYMBOL_GPL(dma_buf_set_invalidate_callback);
> +
> +/**
> + * dma_buf_invalidate_mappings - invalidate 

Re: [PATCH 1/5] dma-buf: add optional invalidate_mappings callback v2

2018-03-16 Thread Christian König

On 16.03.2018 14:51, Chris Wilson wrote:

Quoting Christian König (2018-03-16 13:20:45)

@@ -326,6 +338,29 @@ struct dma_buf_attachment {
 struct device *dev;
 struct list_head node;
 void *priv;
+
+   /**
+* @invalidate_mappings:
+*
+* Optional callback provided by the importer of the attachment which
+* must be set before mappings are created.
+*
+* If provided, the exporter can avoid pinning the backing store while
+* mappings exist.

Hmm, no I don't think it avoids the pinning issue entirely. As it stands,
the importer doesn't have a page refcount and so they all rely on the
exporter keeping the dmabuf pages pinned while attached. What can happen
is that given the invalidate cb, the importers can revoke their
attachments, letting the exporter recover the pages/sg, and then start
again from scratch.


Yes, exactly. The wording is just not 100% precise and I haven't found 
anything better so far.



That also neatly answers what happens if not all importers provide an
invalidate cb, or one fails: the dmabuf remains pinned and the exporter
must retreat.


Yes, exactly as well. As soon as at least one importer says "I can't do 
this", we must fall back to the old behavior.



+*
+* The function is called with the lock of the reservation object
+* associated with the dma_buf held and the mapping function must be
+* called with this lock held as well. This makes sure that no mapping
+* is created concurrently with an ongoing invalidation.
+*
+* After the callback all existing mappings are still valid until all
+* fences in the dma_buf's reservation object are signaled, but should be
+* destroyed by the importer as soon as possible.
+*
+* New mappings can be created immediately, but can't be used before the
+* exclusive fence in the dma_buf's reservation object is signaled.
+*/
+   void (*invalidate_mappings)(struct dma_buf_attachment *attach);

The intent is that the invalidate is synchronous and immediate, while
locked? We are looking at recursing back into the dma_buf functions to
remove each attachment from the invalidate cb (as well as waiting for
dma), won't that cause some nasty issues?


No, with this idea invalidation is asynchronous. I already discussed that 
with Daniel as well, and yes, Daniel also pointed out that I need to 
document this better.


When the exporter calls invalidate_mappings() it just means that all 
importers can no longer use their sg tables for new submissions; old 
ones stay active.


The existing sg tables are guaranteed to stay valid until all fences in 
the reservation object have signaled, and the importer should also delay 
its call to dma_buf_unmap_attachment() until all the fences have 
signaled.
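
A rough importer-side sketch of that delayed teardown (my_release_stale_sgt
is a made-up name):

    /* Tear a stale sg table down only once every fence on the
     * reservation object has signaled.  With invalidate_mappings set,
     * the unmap must be called with the reservation lock held. */
    static void my_release_stale_sgt(struct dma_buf_attachment *attach,
                                     struct sg_table *stale_sgt)
    {
            struct reservation_object *resv = attach->dmabuf->resv;

            reservation_object_wait_timeout_rcu(resv, true, false,
                                                MAX_SCHEDULE_TIMEOUT);

            reservation_object_lock(resv, NULL);
            dma_buf_unmap_attachment(attach, stale_sgt, DMA_BIDIRECTIONAL);
            reservation_object_unlock(resv);
    }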


When the importer has new work to do, e.g. wants to attach a new fence 
to the reservation object, it must grab a new sg table for that. The 
importer also needs to make sure that all new work touching the dma-buf 
doesn't start before the exclusive fence in the reservation object signals.


This allows for full-grown pipelining, e.g. the exporter can say it needs 
to move the buffer for some operation. Then let the move operation wait 
for all existing fences in the reservation object and install the fence 
of the move operation as exclusive fence.


The importer can then immediately grab a new sg table for the new 
location of the buffer and use it to prepare the next operation.
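
On the importer side that could look roughly like this (my_refresh_mapping
is a made-up name, only meant to sketch the scheme):

    /* Before attaching a new fence: grab a fresh sg table under the
     * reservation lock.  Work using it must then wait for the exclusive
     * fence in the reservation object before touching the buffer. */
    static struct sg_table *my_refresh_mapping(struct dma_buf_attachment *attach)
    {
            struct reservation_object *resv = attach->dmabuf->resv;
            struct sg_table *sgt;

            reservation_object_lock(resv, NULL);
            sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
            reservation_object_unlock(resv);

            return sgt; /* may be an ERR_PTR on failure */
    }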


Regards,
Christian.


-Chris


Re: [PATCH 1/5] dma-buf: add optional invalidate_mappings callback v2

2018-03-16 Thread Chris Wilson
Quoting Christian König (2018-03-16 13:20:45)
> @@ -326,6 +338,29 @@ struct dma_buf_attachment {
> struct device *dev;
> struct list_head node;
> void *priv;
> +
> +   /**
> +* @invalidate_mappings:
> +*
> +* Optional callback provided by the importer of the attachment which
> +* must be set before mappings are created.
> +*
> +* If provided, the exporter can avoid pinning the backing store while
> +* mappings exist.

Hmm, no I don't think it avoids the pinning issue entirely. As it stands,
the importer doesn't have a page refcount and so they all rely on the
exporter keeping the dmabuf pages pinned while attached. What can happen
is that given the invalidate cb, the importers can revoke their
attachments, letting the exporter recover the pages/sg, and then start
again from scratch.

That also neatly answers what happens if not all importers provide an
invalidate cb, or one fails: the dmabuf remains pinned and the exporter
must retreat.

> +*
> +* The function is called with the lock of the reservation object
> +* associated with the dma_buf held and the mapping function must be
> +* called with this lock held as well. This makes sure that no mapping
> +* is created concurrently with an ongoing invalidation.
> +*
> +* After the callback all existing mappings are still valid until all
> +* fences in the dma_buf's reservation object are signaled, but should be
> +* destroyed by the importer as soon as possible.
> +*
> +* New mappings can be created immediately, but can't be used before the
> +* exclusive fence in the dma_buf's reservation object is signaled.
> +*/
> +   void (*invalidate_mappings)(struct dma_buf_attachment *attach);

The intent is that the invalidate is synchronous and immediate, while
locked? We are looking at recursing back into the dma_buf functions to
remove each attachment from the invalidate cb (as well as waiting for
dma), won't that cause some nasty issues?
-Chris