Re: [PATCH] dma-fence: Make ->enable_signaling optional
Op 04-05-18 om 16:10 schreef Daniel Vetter:
> Many drivers have a trivial implementation for ->enable_signaling.
> Let's make it optional by assuming that signalling is already
> available when the callback isn't present.
>
> v2: Don't do the trick to set the ENABLE_SIGNAL_BIT unconditionally,
> it results in an expensive spinlock take for everyone. Instead just
> check if the callback is present. Suggested by Maarten.
>
> Also move misplaced kerneldoc hunk to the right patch.
>
> Cc: Maarten Lankhorst <maarten.lankho...@linux.intel.com>
> Reviewed-by: Christian König <christian.koe...@amd.com> (v1)
> Signed-off-by: Daniel Vetter <daniel.vet...@intel.com>
> Cc: Sumit Semwal <sumit.sem...@linaro.org>
> Cc: Gustavo Padovan <gust...@padovan.org>
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-...@lists.linaro.org
> ---
>  drivers/dma-buf/dma-fence.c | 9 +++++----
>  include/linux/dma-fence.h   | 3 ++-
>  2 files changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> index 4edb9fd3cf47..dd01a1720be9 100644
> --- a/drivers/dma-buf/dma-fence.c
> +++ b/drivers/dma-buf/dma-fence.c
> @@ -200,7 +200,8 @@ void dma_fence_enable_sw_signaling(struct dma_fence *fence)
>
>  	if (!test_and_set_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags) &&
> -	    !test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) {
> +	    !test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags) &&
> +	    fence->ops->enable_signaling) {
>  		trace_dma_fence_enable_signal(fence);
>
>  		spin_lock_irqsave(fence->lock, flags);
> @@ -260,7 +261,7 @@ int dma_fence_add_callback(struct dma_fence *fence, struct dma_fence_cb *cb,
>
>  	if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
>  		ret = -ENOENT;
> -	else if (!was_set) {
> +	else if (!was_set && fence->ops->enable_signaling) {
>  		trace_dma_fence_enable_signal(fence);
>
>  		if (!fence->ops->enable_signaling(fence)) {
> @@ -388,7 +389,7 @@ dma_fence_default_wait(struct dma_fence *fence, bool intr, signed long timeout)
>  	if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
>  		goto out;
>
> -	if (!was_set) {
> +	if (!was_set && fence->ops->enable_signaling) {
>  		trace_dma_fence_enable_signal(fence);
>
>  		if (!fence->ops->enable_signaling(fence)) {
> @@ -560,7 +561,7 @@ dma_fence_init(struct dma_fence *fence, const struct dma_fence_ops *ops,
>  	       spinlock_t *lock, u64 context, unsigned seqno)
>  {
>  	BUG_ON(!lock);
> -	BUG_ON(!ops || !ops->wait || !ops->enable_signaling ||
> +	BUG_ON(!ops || !ops->wait ||
>  	       !ops->get_driver_name || !ops->get_timeline_name);
>
>  	kref_init(&fence->refcount);
> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
> index 111aefe1c956..c053d19e1e24 100644
> --- a/include/linux/dma-fence.h
> +++ b/include/linux/dma-fence.h
> @@ -166,7 +166,8 @@ struct dma_fence_ops {
>  	 * released when the fence is signalled (through e.g. the interrupt
>  	 * handler).
>  	 *
> -	 * This callback is mandatory.
> +	 * This callback is optional. If this callback is not present, then the
> +	 * driver must always have signaling enabled.
>  	 */
>  	bool (*enable_signaling)(struct dma_fence *fence);

Much better. :)

Reviewed-by: Maarten Lankhorst <maarten.lankho...@linux.intel.com>
Re: [PATCH] dma-fence: fix dma_fence_get_rcu_safe
Op 21-09-17 om 09:00 schreef Christian König:
> Am 20.09.2017 um 20:20 schrieb Daniel Vetter:
>> On Mon, Sep 11, 2017 at 01:06:32PM +0200, Christian König wrote:
>>> Am 11.09.2017 um 12:01 schrieb Chris Wilson:
[SNIP]
> Yeah, but that is illegal with a fence objects.
>
> When anybody allocates fences this way it breaks at least
> reservation_object_get_fences_rcu(),
> reservation_object_wait_timeout_rcu() and
> reservation_object_test_signaled_single().

Many, many months ago I sent patches to fix them all.

>>> Found those after a bit of searching. Yeah, those patches were proposed
>>> more than a year ago, but never pushed upstream.
>>>
>>> Not sure if we really should go this way. dma_fence objects are shared
>>> between drivers and since we can't judge if it's the correct fence based
>>> on a criteria in the object (only the read counter which is outside) all
>>> drivers need to be correct for this.
>>>
>>> I would rather go the way and change dma_fence_release() to wrap
>>> fence->ops->release into call_rcu() to keep the whole RCU handling
>>> outside of the individual drivers.
>> Hm, I entirely dropped the ball on this, I kinda assumed that we managed
>> to get some agreement on this between i915 and dma_fence. Adding a pile
>> more people.
> For the meantime I've sent a v2 of this patch to fix at least the buggy
> return of NULL when we fail to grab the RCU reference, but keeping the
> extra checking for now.
>
> Can I get an rb on this please so that we fix at least the bug at hand?
>
> Thanks,
> Christian.

Done.
Re: [PATCH] dma-fence: fix dma_fence_get_rcu_safe v2
Op 15-09-17 om 11:53 schreef Christian König:
> From: Christian König <christian.koe...@amd.com>
>
> When dma_fence_get_rcu() fails to acquire a reference it doesn't
> necessarily mean that there is no fence at all.
>
> It usually means that the fence was replaced by a new one and in this
> situation we certainly want to have the new one as result and *NOT* NULL.
>
> v2: Keep extra check after dma_fence_get_rcu().
>
> Signed-off-by: Christian König <christian.koe...@amd.com>
> Cc: Chris Wilson <ch...@chris-wilson.co.uk>
> Cc: Daniel Vetter <daniel.vet...@ffwll.ch>
> Cc: Sumit Semwal <sumit.sem...@linaro.org>
> Cc: linux-media@vger.kernel.org
> Cc: dri-de...@lists.freedesktop.org
> Cc: linaro-mm-...@lists.linaro.org
> ---
>  include/linux/dma-fence.h | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
> index 0a186c4..f4f23cb 100644
> --- a/include/linux/dma-fence.h
> +++ b/include/linux/dma-fence.h
> @@ -248,9 +248,12 @@ dma_fence_get_rcu_safe(struct dma_fence * __rcu *fencep)
>  		struct dma_fence *fence;
>
>  		fence = rcu_dereference(*fencep);
> -		if (!fence || !dma_fence_get_rcu(fence))
> +		if (!fence)
>  			return NULL;
>
> +		if (!dma_fence_get_rcu(fence))
> +			continue;
> +
>  		/* The atomic_inc_not_zero() inside dma_fence_get_rcu()
>  		 * provides a full memory barrier upon success (such as now).
>  		 * This is paired with the write barrier from assigning

Should be safe from an infinite loop, since the old fence is only
unreffed after the new pointer is written, so we'll always make
progress. :)

Reviewed-by: Maarten Lankhorst <maarten.lankho...@linux.intel.com>
Re: [PATCH] android: fix warning when releasing active sync point
Op 15-12-15 om 18:19 schreef Dmitry Torokhov:
> On Tue, Dec 15, 2015 at 2:01 AM, Maarten Lankhorst
> <maarten.lankho...@linux.intel.com> wrote:
>> Op 15-12-15 om 02:29 schreef Dmitry Torokhov:
>>> Userspace can close the sync device while there are still active fence
>>> points, in which case kernel produces the following warning:
>>>
>>> [   43.853176] [ cut here ]
>>> [   43.857834] WARNING: CPU: 0 PID: 892 at /mnt/host/source/src/third_party/kernel/v3.18/drivers/staging/android/sync.c:439 android_fence_release+0x88/0x104()
>>> [   43.871741] CPU: 0 PID: 892 Comm: Binder_5 Tainted: G U 3.18.0-07661-g0550ce9 #1
>>> [   43.880176] Hardware name: Google Tegra210 Smaug Rev 1+ (DT)
>>> [   43.885834] Call trace:
>>> [   43.888294] [] dump_backtrace+0x0/0x10c
>>> [   43.893697] [] show_stack+0x10/0x1c
>>> [   43.898756] [] dump_stack+0x74/0xb8
>>> [   43.903814] [] warn_slowpath_common+0x84/0xb0
>>> [   43.909736] [] warn_slowpath_null+0x14/0x20
>>> [   43.915482] [] android_fence_release+0x84/0x104
>>> [   43.921582] [] fence_release+0x104/0x134
>>> [   43.927066] [] sync_fence_free+0x74/0x9c
>>> [   43.932552] [] sync_fence_release+0x34/0x48
>>> [   43.938304] [] __fput+0x100/0x1b8
>>> [   43.943185] [] fput+0x8/0x14
>>> [   43.947982] [] task_work_run+0xb0/0xe4
>>> [   43.953297] [] do_notify_resume+0x44/0x5c
>>> [   43.958867] ---[ end trace 5a2aa4027cc5d171 ]---
>>>
>>> Let's fix it by introducing a new optional callback (disable_signaling)
>>> to fence operations so that drivers can do proper clean ups when we
>>> remove last callback for given fence.
>>>
>>> Reviewed-by: Andrew Bresticker <abres...@chromium.org>
>>> Signed-off-by: Dmitry Torokhov <d...@chromium.org>
>>>
>> NACK! There's no way to do this race free.
> Can you please explain the race, because as far as I can see there is
> not one.

The entire code in fence.c assumes that a fence can only go from
signaling disabled to signaling enabled, never back. .enable_signaling
is not refcounted, so two calls to .enable_signaling and one to
.disable_signaling would mess things up. Furthermore we try to make
sure that fence_signal doesn't take locks if it's unneeded. With a
disable_signaling callback you would always need locks.

To get rid of these warnings make sure that there's a refcount on the
fence until it's signaled.

>> The driver should hold a refcount until fence is signaled.
> If we are no longer interested in fence why do we need to wait for the
> fence to be signaled?

It's part of the design. A driver tracks its outstanding requests and
submissions, and every submission has its own fence. Before the driver
releases its final ref the request should be completed or aborted. In
either case it must call fence_signal.
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] android: fix warning when releasing active sync point
Op 15-12-15 om 02:29 schreef Dmitry Torokhov:
> Userspace can close the sync device while there are still active fence
> points, in which case kernel produces the following warning:
>
> [   43.853176] [ cut here ]
> [   43.857834] WARNING: CPU: 0 PID: 892 at /mnt/host/source/src/third_party/kernel/v3.18/drivers/staging/android/sync.c:439 android_fence_release+0x88/0x104()
> [   43.871741] CPU: 0 PID: 892 Comm: Binder_5 Tainted: G U 3.18.0-07661-g0550ce9 #1
> [   43.880176] Hardware name: Google Tegra210 Smaug Rev 1+ (DT)
> [   43.885834] Call trace:
> [   43.888294] [] dump_backtrace+0x0/0x10c
> [   43.893697] [] show_stack+0x10/0x1c
> [   43.898756] [] dump_stack+0x74/0xb8
> [   43.903814] [] warn_slowpath_common+0x84/0xb0
> [   43.909736] [] warn_slowpath_null+0x14/0x20
> [   43.915482] [] android_fence_release+0x84/0x104
> [   43.921582] [] fence_release+0x104/0x134
> [   43.927066] [] sync_fence_free+0x74/0x9c
> [   43.932552] [] sync_fence_release+0x34/0x48
> [   43.938304] [] __fput+0x100/0x1b8
> [   43.943185] [] fput+0x8/0x14
> [   43.947982] [] task_work_run+0xb0/0xe4
> [   43.953297] [] do_notify_resume+0x44/0x5c
> [   43.958867] ---[ end trace 5a2aa4027cc5d171 ]---
>
> Let's fix it by introducing a new optional callback (disable_signaling)
> to fence operations so that drivers can do proper clean ups when we
> remove last callback for given fence.
>
> Reviewed-by: Andrew Bresticker <abres...@chromium.org>
> Signed-off-by: Dmitry Torokhov <d...@chromium.org>

NACK! There's no way to do this race free.

The driver should hold a refcount until fence is signaled.
Re: [RFCv3 2/2] dma-buf: add helpers for sharing attacher constraints with dma-parms
Op 27-01-15 om 09:25 schreef Sumit Semwal:
> Add some helpers to share the constraints of devices while attaching
> to the dmabuf buffer.
>
> At each attach, the constraints are calculated based on the following:
> - max_segment_size, max_segment_count, segment_boundary_mask from
>   device_dma_parameters.
>
> In case the attaching device's constraints don't match up, attach() fails.
>
> At detach, the constraints are recalculated based on the remaining
> attached devices.
>
> Two helpers are added:
> - dma_buf_get_constraints - which gives the current constraints as
>   calculated during each attach on the buffer till the time,
> - dma_buf_recalc_constraints - which recalculates the constraints for
>   all currently attached devices for the 'paranoid' ones amongst us.
>
> The idea of this patch is largely taken from Rob Clark's RFC at
> https://lkml.org/lkml/2012/7/19/285, and the comments received on it.
>
> Cc: Rob Clark <robdcl...@gmail.com>
> Signed-off-by: Sumit Semwal <sumit.sem...@linaro.org>
> ---
> v3:
> - Thanks to Russell's comment, remove dma_mask and coherent_dma_mask
>   from constraints' calculation; has a nice side effect of letting us
>   use device_dma_parameters directly to list constraints.
> - update the debugfs output to show constraint info as well.
>
> v2: split constraints-sharing and allocation helpers
>
>  drivers/dma-buf/dma-buf.c | 126 +++++++++++++++++++++++++++++++++++++++++-
>  include/linux/dma-buf.h   |   7 +++
>  2 files changed, 132 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index 5be225c2ba98..f363f1440803 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -264,6 +264,66 @@ static inline int is_dma_buf_file(struct file *file)
>  	return file->f_op == &dma_buf_fops;
>  }
>
> +static inline void init_constraints(struct device_dma_parameters *cons)
> +{
> +	cons->max_segment_count = (unsigned int)-1;
> +	cons->max_segment_size = (unsigned int)-1;
> +	cons->segment_boundary_mask = (unsigned long)-1;
> +}

Use DMA_SEGMENTS_MAX_SEG_COUNT or UINT/ULONG_MAX here instead?

Patches look sane,
Reviewed-By: Maarten Lankhorst <maarten.lankho...@canonical.com>

> +/*
> + * calc_constraints - calculates if the new attaching device's constraints
> + * match, with the constraints of already attached devices; if yes, returns
> + * the constraints; else return ERR_PTR(-EINVAL)
> + */
> +static int calc_constraints(struct device *dev,
> +			    struct device_dma_parameters *calc_cons)
> +{
> +	struct device_dma_parameters cons = *calc_cons;
> +
> +	cons.max_segment_count = min(cons.max_segment_count,
> +				     dma_get_max_seg_count(dev));
> +	cons.max_segment_size = min(cons.max_segment_size,
> +				    dma_get_max_seg_size(dev));
> +	cons.segment_boundary_mask &= dma_get_seg_boundary(dev);
> +
> +	if (!cons.max_segment_count ||
> +	    !cons.max_segment_size ||
> +	    !cons.segment_boundary_mask) {
> +		pr_err("Dev: %s's constraints don't match\n", dev_name(dev));
> +		return -EINVAL;
> +	}
> +
> +	*calc_cons = cons;
> +
> +	return 0;
> +}
> +
> +/*
> + * recalc_constraints - recalculates constraints for all attached devices;
> + * useful for detach() recalculation, and for dma_buf_recalc_constraints()
> + * helper.
> + * Returns recalculated constraints in recalc_cons, or error in the unlikely
> + * case when constraints of attached devices might have changed.
> + */
> +static int recalc_constraints(struct dma_buf *dmabuf,
> +			      struct device_dma_parameters *recalc_cons)
> +{
> +	struct device_dma_parameters calc_cons;
> +	struct dma_buf_attachment *attach;
> +	int ret = 0;
> +
> +	init_constraints(&calc_cons);
> +
> +	list_for_each_entry(attach, &dmabuf->attachments, node) {
> +		ret = calc_constraints(attach->dev, &calc_cons);
> +		if (ret)
> +			return ret;
> +	}
> +
> +	*recalc_cons = calc_cons;
> +
> +	return 0;
> +}
> +
>  /**
>   * dma_buf_export_named - Creates a new dma_buf, and associates an anon file
>   * with this buffer, so it can be exported.
> @@ -313,6 +373,9 @@ struct dma_buf *dma_buf_export_named(void *priv, const struct dma_buf_ops *ops,
>  	dmabuf->ops = ops;
>  	dmabuf->size = size;
>  	dmabuf->exp_name = exp_name;
> +
> +	init_constraints(&dmabuf->constraints);
> +
>  	init_waitqueue_head(&dmabuf->poll);
>  	dmabuf->cb_excl.poll = dmabuf->cb_shared.poll = &dmabuf->poll;
>  	dmabuf->cb_excl.active = dmabuf->cb_shared.active = 0;
> @@ -422,7 +485,7 @@ struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
>  					  struct device *dev)
>  {
>  	struct dma_buf_attachment *attach;
> -	int ret;
> +	int ret = 0;
>
>  	if (WARN_ON(!dmabuf || !dev))
>  		return ERR_PTR(-EINVAL);
> @@ -436,6 +499,9 @@ struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf
Re: [PATCH v3] dma-buf: cleanup dma_buf_export() to make it easily extensible
Op 28-01-15 om 13:54 schreef Sumit Semwal: At present, dma_buf_export() takes a series of parameters, which makes it difficult to add any new parameters for exporters, if required. Make it simpler by moving all these parameters into a struct, and pass the struct * as parameter to dma_buf_export(). While at it, unite dma_buf_export_named() with dma_buf_export(), and change all callers accordingly. Signed-off-by: Sumit Semwal sumit.sem...@linaro.org --- v3: Daniel Thompson caught the C99 warning issue w/ using {0}; using {.exp_name = xxx} instead. v2: add macro to zero out local struct, and fill KBUILD_MODNAME by default drivers/dma-buf/dma-buf.c | 47 +- drivers/gpu/drm/armada/armada_gem.c| 10 -- drivers/gpu/drm/drm_prime.c| 12 --- drivers/gpu/drm/exynos/exynos_drm_dmabuf.c | 9 +++-- drivers/gpu/drm/i915/i915_gem_dmabuf.c | 10 -- drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c | 9 - drivers/gpu/drm/tegra/gem.c| 10 -- drivers/gpu/drm/ttm/ttm_object.c | 9 +++-- drivers/gpu/drm/udl/udl_dmabuf.c | 9 - drivers/media/v4l2-core/videobuf2-dma-contig.c | 8 - drivers/media/v4l2-core/videobuf2-dma-sg.c | 8 - drivers/media/v4l2-core/videobuf2-vmalloc.c| 8 - drivers/staging/android/ion/ion.c | 9 +++-- include/linux/dma-buf.h| 34 +++ 14 files changed, 142 insertions(+), 50 deletions(-) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 5be225c2ba98..6d3df3dd9310 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -265,7 +265,7 @@ static inline int is_dma_buf_file(struct file *file) } /** - * dma_buf_export_named - Creates a new dma_buf, and associates an anon file + * dma_buf_export - Creates a new dma_buf, and associates an anon file * with this buffer, so it can be exported. * Also connect the allocator specific data and ops to the buffer. * Additionally, provide a name string for exporter; useful in debugging. 
@@ -277,31 +277,32 @@ static inline int is_dma_buf_file(struct file *file) * @exp_name:[in]name of the exporting module - useful for debugging. * @resv:[in]reservation-object, NULL to allocate default one. * + * All the above info comes from struct dma_buf_export_info. + * * Returns, on success, a newly created dma_buf object, which wraps the * supplied private data and operations for dma_buf_ops. On either missing * ops, or error in allocating struct dma_buf, will return negative error. * */ -struct dma_buf *dma_buf_export_named(void *priv, const struct dma_buf_ops *ops, - size_t size, int flags, const char *exp_name, - struct reservation_object *resv) +struct dma_buf *dma_buf_export(struct dma_buf_export_info *exp_info) This function should probably take a const struct dma_buf_export_info here.. Rest looks sane. ~Maarten { struct dma_buf *dmabuf; struct file *file; size_t alloc_size = sizeof(struct dma_buf); - if (!resv) + if (!exp_info-resv) alloc_size += sizeof(struct reservation_object); else /* prevent dma_buf[1] == dma_buf-resv */ alloc_size += 1; - if (WARN_ON(!priv || !ops - || !ops-map_dma_buf - || !ops-unmap_dma_buf - || !ops-release - || !ops-kmap_atomic - || !ops-kmap - || !ops-mmap)) { + if (WARN_ON(!exp_info-priv + || !exp_info-ops + || !exp_info-ops-map_dma_buf + || !exp_info-ops-unmap_dma_buf + || !exp_info-ops-release + || !exp_info-ops-kmap_atomic + || !exp_info-ops-kmap + || !exp_info-ops-mmap)) { return ERR_PTR(-EINVAL); } @@ -309,21 +310,22 @@ struct dma_buf *dma_buf_export_named(void *priv, const struct dma_buf_ops *ops, if (dmabuf == NULL) return ERR_PTR(-ENOMEM); - dmabuf-priv = priv; - dmabuf-ops = ops; - dmabuf-size = size; - dmabuf-exp_name = exp_name; + dmabuf-priv = exp_info-priv; + dmabuf-ops = exp_info-ops; + dmabuf-size = exp_info-size; + dmabuf-exp_name = exp_info-exp_name; init_waitqueue_head(dmabuf-poll); dmabuf-cb_excl.poll = dmabuf-cb_shared.poll = dmabuf-poll; dmabuf-cb_excl.active = dmabuf-cb_shared.active = 0; - if (!resv) { 
- resv = (struct reservation_object *)dmabuf[1]; - reservation_object_init(resv); + if (!exp_info-resv) { + exp_info-resv =
Re: [PATCH v2 0/9] Updated fence patch series
op 02-07-14 07:37, Greg KH schreef:
> On Tue, Jul 01, 2014 at 12:57:02PM +0200, Maarten Lankhorst wrote:
>> So after some more hacking I've moved dma-buf to its own subdirectory,
>> drivers/dma-buf, and applied the fence patches to its new place. I
>> believe that the first patch should be applied regardless, and the
>> rest should be ready now. :-)
>>
>> Changes to the fence api:
>> - release_fence -> fence_release etc.
>> - __fence_init -> fence_init
>> - __fence_signal -> fence_signal_locked
>> - __fence_is_signaled -> fence_is_signaled_locked
>> - Changing BUG_ON to WARN_ON in fence_later, and return NULL if it
>>   triggers.
>>
>> Android can expose fences to userspace. It's possible to make the new
>> fence mechanism expose the same fences to userspace by changing
>> sync_fence_create to take a struct fence instead of a struct sync_pt.
>> No other change is needed, because only the fence parts of struct
>> sync_pt are used. But because the userspace fences are a separate
>> problem and I haven't really looked at it yet I feel it should stay
>> in staging, for now.
> Ok, that's reasonable.
>
> At first glance, this all looks sane to me, any objection from anyone
> if I merge this through my driver-core tree for 3.17?

Sounds good to me, let me know when you pull it in, so I can rebase my
drm conversion on top of it. :-)

~Maarten
[PATCH v2 7/9] dma-buf: add poll support, v3
Thanks to Fengguang Wu for spotting a missing static cast. v2: - Kill unused variable need_shared. v3: - Clarify the BUG() in dma_buf_release some more. (Rob Clark) Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com --- drivers/dma-buf/dma-buf.c | 108 + include/linux/dma-buf.h | 12 + 2 files changed, 120 insertions(+) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index cd40ca22911f..25e8c4165936 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -30,6 +30,7 @@ #include linux/export.h #include linux/debugfs.h #include linux/seq_file.h +#include linux/poll.h #include linux/reservation.h static inline int is_dma_buf_file(struct file *); @@ -52,6 +53,16 @@ static int dma_buf_release(struct inode *inode, struct file *file) BUG_ON(dmabuf-vmapping_counter); + /* +* Any fences that a dma-buf poll can wait on should be signaled +* before releasing dma-buf. This is the responsibility of each +* driver that uses the reservation objects. +* +* If you hit this BUG() it means someone dropped their ref to the +* dma-buf while still having pending operation to the buffer. 
+*/ + BUG_ON(dmabuf-cb_shared.active || dmabuf-cb_excl.active); + dmabuf-ops-release(dmabuf); mutex_lock(db_list.lock); @@ -108,10 +119,103 @@ static loff_t dma_buf_llseek(struct file *file, loff_t offset, int whence) return base + offset; } +static void dma_buf_poll_cb(struct fence *fence, struct fence_cb *cb) +{ + struct dma_buf_poll_cb_t *dcb = (struct dma_buf_poll_cb_t *)cb; + unsigned long flags; + + spin_lock_irqsave(dcb-poll-lock, flags); + wake_up_locked_poll(dcb-poll, dcb-active); + dcb-active = 0; + spin_unlock_irqrestore(dcb-poll-lock, flags); +} + +static unsigned int dma_buf_poll(struct file *file, poll_table *poll) +{ + struct dma_buf *dmabuf; + struct reservation_object *resv; + unsigned long events; + + dmabuf = file-private_data; + if (!dmabuf || !dmabuf-resv) + return POLLERR; + + resv = dmabuf-resv; + + poll_wait(file, dmabuf-poll, poll); + + events = poll_requested_events(poll) (POLLIN | POLLOUT); + if (!events) + return 0; + + ww_mutex_lock(resv-lock, NULL); + + if (resv-fence_excl (!(events POLLOUT) || +resv-fence_shared_count == 0)) { + struct dma_buf_poll_cb_t *dcb = dmabuf-cb_excl; + unsigned long pevents = POLLIN; + + if (resv-fence_shared_count == 0) + pevents |= POLLOUT; + + spin_lock_irq(dmabuf-poll.lock); + if (dcb-active) { + dcb-active |= pevents; + events = ~pevents; + } else + dcb-active = pevents; + spin_unlock_irq(dmabuf-poll.lock); + + if (events pevents) { + if (!fence_add_callback(resv-fence_excl, + dcb-cb, dma_buf_poll_cb)) + events = ~pevents; + else + /* +* No callback queued, wake up any additional +* waiters. 
+*/ + dma_buf_poll_cb(NULL, dcb-cb); + } + } + + if ((events POLLOUT) resv-fence_shared_count 0) { + struct dma_buf_poll_cb_t *dcb = dmabuf-cb_shared; + int i; + + /* Only queue a new callback if no event has fired yet */ + spin_lock_irq(dmabuf-poll.lock); + if (dcb-active) + events = ~POLLOUT; + else + dcb-active = POLLOUT; + spin_unlock_irq(dmabuf-poll.lock); + + if (!(events POLLOUT)) + goto out; + + for (i = 0; i resv-fence_shared_count; ++i) + if (!fence_add_callback(resv-fence_shared[i], + dcb-cb, dma_buf_poll_cb)) { + events = ~POLLOUT; + break; + } + + /* No callback queued, wake up any additional waiters. */ + if (i == resv-fence_shared_count) + dma_buf_poll_cb(NULL, dcb-cb); + } + +out: + ww_mutex_unlock(resv-lock); + return events; +} + static const struct file_operations dma_buf_fops = { .release= dma_buf_release, .mmap = dma_buf_mmap_internal, .llseek = dma_buf_llseek, + .poll = dma_buf_poll, }; /* @@ -171,6 +275,10 @@ struct
[PATCH v2 6/9] reservation: add support for fences to enable cross-device synchronisation
Signed-off-by: Maarten Lankhorst <maarten.lankho...@canonical.com>
Reviewed-by: Rob Clark <robdcl...@gmail.com>
---
 include/linux/reservation.h | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/include/linux/reservation.h b/include/linux/reservation.h
index 813dae960ebd..f3f57460a205 100644
--- a/include/linux/reservation.h
+++ b/include/linux/reservation.h
@@ -6,7 +6,7 @@
  * Copyright (C) 2012 Texas Instruments
  *
  * Authors:
- * Rob Clark <rob.cl...@linaro.org>
+ * Rob Clark <robdcl...@gmail.com>
  * Maarten Lankhorst <maarten.lankho...@canonical.com>
  * Thomas Hellstrom <thellstrom-at-vmware-dot-com>
  *
@@ -40,22 +40,40 @@
 #define _LINUX_RESERVATION_H

 #include <linux/ww_mutex.h>
+#include <linux/fence.h>
+#include <linux/slab.h>

 extern struct ww_class reservation_ww_class;

 struct reservation_object {
 	struct ww_mutex lock;
+
+	struct fence *fence_excl;
+	struct fence **fence_shared;
+	u32 fence_shared_count, fence_shared_max;
 };

 static inline void
 reservation_object_init(struct reservation_object *obj)
 {
 	ww_mutex_init(&obj->lock, &reservation_ww_class);
+
+	obj->fence_shared_count = obj->fence_shared_max = 0;
+	obj->fence_shared = NULL;
+	obj->fence_excl = NULL;
 }

 static inline void
 reservation_object_fini(struct reservation_object *obj)
 {
+	int i;
+
+	if (obj->fence_excl)
+		fence_put(obj->fence_excl);
+	for (i = 0; i < obj->fence_shared_count; ++i)
+		fence_put(obj->fence_shared[i]);
+	kfree(obj->fence_shared);
+
 	ww_mutex_destroy(&obj->lock);
 }
[PATCH v2 9/9] reservation: add support for read-only access using rcu
This adds some extra functions to deal with rcu. reservation_object_get_fences_rcu() will obtain the list of shared and exclusive fences without obtaining the ww_mutex. reservation_object_wait_timeout_rcu() will wait on all fences of the reservation_object, without obtaining the ww_mutex. reservation_object_test_signaled_rcu() will test if all fences of the reservation_object are signaled without using the ww_mutex. reservation_object_get_excl and reservation_object_get_list require the reservation object to be held, updating requires write_seqcount_begin/end. If only the exclusive fence is needed, rcu_dereference followed by fence_get_rcu can be used, if the shared fences are needed it's recommended to use the supplied functions. Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com Reviewed-By: Thomas Hellstrom thellst...@vmware.com --- drivers/dma-buf/dma-buf.c | 47 -- drivers/dma-buf/fence.c |2 drivers/dma-buf/reservation.c | 336 ++--- include/linux/fence.h | 17 ++ include/linux/reservation.h | 52 -- 5 files changed, 400 insertions(+), 54 deletions(-) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index cb8379dfeed5..f3014c448e1e 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -137,7 +137,7 @@ static unsigned int dma_buf_poll(struct file *file, poll_table *poll) struct reservation_object_list *fobj; struct fence *fence_excl; unsigned long events; - unsigned shared_count; + unsigned shared_count, seq; dmabuf = file-private_data; if (!dmabuf || !dmabuf-resv) @@ -151,14 +151,20 @@ static unsigned int dma_buf_poll(struct file *file, poll_table *poll) if (!events) return 0; - ww_mutex_lock(resv-lock, NULL); +retry: + seq = read_seqcount_begin(resv-seq); + rcu_read_lock(); - fobj = resv-fence; - if (!fobj) - goto out; - - shared_count = fobj-shared_count; - fence_excl = resv-fence_excl; + fobj = rcu_dereference(resv-fence); + if (fobj) + shared_count = fobj-shared_count; + else + shared_count = 0; + fence_excl 
= rcu_dereference(resv-fence_excl); + if (read_seqcount_retry(resv-seq, seq)) { + rcu_read_unlock(); + goto retry; + } if (fence_excl (!(events POLLOUT) || shared_count == 0)) { struct dma_buf_poll_cb_t *dcb = dmabuf-cb_excl; @@ -176,14 +182,20 @@ static unsigned int dma_buf_poll(struct file *file, poll_table *poll) spin_unlock_irq(dmabuf-poll.lock); if (events pevents) { - if (!fence_add_callback(fence_excl, dcb-cb, + if (!fence_get_rcu(fence_excl)) { + /* force a recheck */ + events = ~pevents; + dma_buf_poll_cb(NULL, dcb-cb); + } else if (!fence_add_callback(fence_excl, dcb-cb, dma_buf_poll_cb)) { events = ~pevents; + fence_put(fence_excl); } else { /* * No callback queued, wake up any additional * waiters. */ + fence_put(fence_excl); dma_buf_poll_cb(NULL, dcb-cb); } } @@ -205,13 +217,26 @@ static unsigned int dma_buf_poll(struct file *file, poll_table *poll) goto out; for (i = 0; i shared_count; ++i) { - struct fence *fence = fobj-shared[i]; + struct fence *fence = rcu_dereference(fobj-shared[i]); + if (!fence_get_rcu(fence)) { + /* +* fence refcount dropped to zero, this means +* that fobj has been freed +* +* call dma_buf_poll_cb and force a recheck! +*/ + events = ~POLLOUT; + dma_buf_poll_cb(NULL, dcb-cb); + break; + } if (!fence_add_callback(fence, dcb-cb, dma_buf_poll_cb)) { + fence_put(fence); events = ~POLLOUT; break; } + fence_put(fence); } /* No callback queued, wake up any additional
[PATCH v2 8/9] reservation: update api and add some helpers
Move the list of shared fences to a struct, and return it in reservation_object_get_list(). Add reservation_object_get_excl to get the exclusive fence. Add reservation_object_reserve_shared(), which reserves space in the reservation_object for 1 more shared fence. reservation_object_add_shared_fence() and reservation_object_add_excl_fence() are used to assign a new fence to a reservation_object pointer, to complete a reservation.

Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com

Changes since v1:
- Add reservation_object_get_excl, reorder code a bit.
---
 Documentation/DocBook/device-drivers.tmpl |    1
 drivers/dma-buf/dma-buf.c                 |   35 ---
 drivers/dma-buf/reservation.c             |  156 +
 include/linux/reservation.h               |   56 +-
 4 files changed, 229 insertions(+), 19 deletions(-)

diff --git a/Documentation/DocBook/device-drivers.tmpl b/Documentation/DocBook/device-drivers.tmpl
index ed0ef00cd7bc..dd3f278faa8a 100644
--- a/Documentation/DocBook/device-drivers.tmpl
+++ b/Documentation/DocBook/device-drivers.tmpl
@@ -133,6 +133,7 @@ X!Edrivers/base/interface.c
 !Edrivers/dma-buf/seqno-fence.c
 !Iinclude/linux/fence.h
 !Iinclude/linux/seqno-fence.h
+!Edrivers/dma-buf/reservation.c
 !Iinclude/linux/reservation.h
 !Edrivers/base/dma-coherent.c
 !Edrivers/base/dma-mapping.c
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 25e8c4165936..cb8379dfeed5 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -134,7 +134,10 @@ static unsigned int dma_buf_poll(struct file *file, poll_table *poll)
 {
 	struct dma_buf *dmabuf;
 	struct reservation_object *resv;
+	struct reservation_object_list *fobj;
+	struct fence *fence_excl;
 	unsigned long events;
+	unsigned shared_count;
 
 	dmabuf = file->private_data;
 	if (!dmabuf || !dmabuf->resv)
@@ -150,12 +153,18 @@ static unsigned int dma_buf_poll(struct file *file, poll_table *poll)
 	ww_mutex_lock(&resv->lock, NULL);
 
-	if (resv->fence_excl && (!(events & POLLOUT) ||
-				 resv->fence_shared_count == 0)) {
+	fobj = resv->fence;
+	if (!fobj)
+		goto out;
+
+	shared_count = fobj->shared_count;
+	fence_excl = resv->fence_excl;
+
+	if (fence_excl && (!(events & POLLOUT) || shared_count == 0)) {
 		struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_excl;
 		unsigned long pevents = POLLIN;
 
-		if (resv->fence_shared_count == 0)
+		if (shared_count == 0)
 			pevents |= POLLOUT;
 
 		spin_lock_irq(&dmabuf->poll.lock);
@@ -167,19 +176,20 @@ static unsigned int dma_buf_poll(struct file *file, poll_table *poll)
 		spin_unlock_irq(&dmabuf->poll.lock);
 
 		if (events & pevents) {
-			if (!fence_add_callback(resv->fence_excl,
-						&dcb->cb, dma_buf_poll_cb))
+			if (!fence_add_callback(fence_excl, &dcb->cb,
+						dma_buf_poll_cb)) {
 				events &= ~pevents;
-			else
+			} else {
 				/*
 				 * No callback queued, wake up any additional
 				 * waiters.
 				 */
 				dma_buf_poll_cb(NULL, &dcb->cb);
+			}
 		}
 	}
 
-	if ((events & POLLOUT) && resv->fence_shared_count > 0) {
+	if ((events & POLLOUT) && shared_count > 0) {
 		struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_shared;
 		int i;
 
@@ -194,15 +204,18 @@ static unsigned int dma_buf_poll(struct file *file, poll_table *poll)
 		if (!(events & POLLOUT))
 			goto out;
 
-		for (i = 0; i < resv->fence_shared_count; ++i)
-			if (!fence_add_callback(resv->fence_shared[i],
-						&dcb->cb, dma_buf_poll_cb)) {
+		for (i = 0; i < shared_count; ++i) {
+			struct fence *fence = fobj->shared[i];
+
+			if (!fence_add_callback(fence, &dcb->cb,
+						dma_buf_poll_cb)) {
 				events &= ~POLLOUT;
 				break;
 			}
+		}
 
 		/* No callback queued, wake up any additional waiters. */
-		if (i == resv->fence_shared_count)
+		if (i == shared_count)
 			dma_buf_poll_cb(NULL, &dcb->cb);
 	}
 
diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c
index a73fbf3b8e56..e6166723a9ae 100644
--- a/drivers/dma-buf/reservation.c
+++ b/drivers/dma-buf/reservation.c
@@ -1,5 +1,5
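[Editorial sketch] The intended calling convention for the helpers described above — reserve the shared slot first (that step can fail with -ENOMEM), then publish the fence under the ww_mutex once the work is committed — might look like this in a driver. This is a hedged illustration, not code from the series; my_driver_emit_fence() is a hypothetical helper:

```c
/* Sketch only: a typical caller of the updated reservation API. */
int my_driver_queue_work(struct reservation_object *resv, bool write)
{
	struct fence *fence;
	int ret;

	ww_mutex_lock(&resv->lock, NULL);

	if (!write) {
		/* Can fail with -ENOMEM, so do it before emitting the fence. */
		ret = reservation_object_reserve_shared(resv);
		if (ret) {
			ww_mutex_unlock(&resv->lock);
			return ret;
		}
	}

	fence = my_driver_emit_fence();		/* hypothetical helper */
	if (!fence) {
		ww_mutex_unlock(&resv->lock);
		return -ENOMEM;
	}

	if (write)
		reservation_object_add_excl_fence(resv, fence);
	else
		reservation_object_add_shared_fence(resv, fence);

	ww_mutex_unlock(&resv->lock);
	return 0;
}
```

The point of the reserve/add split is that the add functions cannot fail, so they are safe to call after the hardware has already been committed to the work.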
[PATCH v2 5/9] android: convert sync to fence api, v6
Just to show it's easy. Android syncpoints can be mapped to a timeline. This removes the need to maintain a separate api for synchronization. I've left the android trace events in place, but the core fence events should already be sufficient for debugging. v2: - Call fence_remove_callback in sync_fence_free if not all fences have fired. v3: - Merge Colin Cross' bugfixes, and the android fence merge optimization. v4: - Merge with the upstream fixes. v5: - Fix small style issues pointed out by Thomas Hellstrom. v6: - Fix for updates to fence api. Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com Acked-by: John Stultz john.stu...@linaro.org --- drivers/staging/android/Kconfig |1 drivers/staging/android/Makefile |2 drivers/staging/android/sw_sync.c|6 drivers/staging/android/sync.c | 913 +++--- drivers/staging/android/sync.h | 79 ++- drivers/staging/android/sync_debug.c | 247 + drivers/staging/android/trace/sync.h | 12 7 files changed, 609 insertions(+), 651 deletions(-) create mode 100644 drivers/staging/android/sync_debug.c diff --git a/drivers/staging/android/Kconfig b/drivers/staging/android/Kconfig index 99e484f845f2..51607e9aa049 100644 --- a/drivers/staging/android/Kconfig +++ b/drivers/staging/android/Kconfig @@ -88,6 +88,7 @@ config SYNC bool Synchronization framework default n select ANON_INODES + select DMA_SHARED_BUFFER ---help--- This option enables the framework for synchronization between multiple drivers. 
Sync implementations can take advantage of hardware diff --git a/drivers/staging/android/Makefile b/drivers/staging/android/Makefile index 0a01e1914905..517ad5ffa429 100644 --- a/drivers/staging/android/Makefile +++ b/drivers/staging/android/Makefile @@ -9,5 +9,5 @@ obj-$(CONFIG_ANDROID_TIMED_OUTPUT) += timed_output.o obj-$(CONFIG_ANDROID_TIMED_GPIO) += timed_gpio.o obj-$(CONFIG_ANDROID_LOW_MEMORY_KILLER)+= lowmemorykiller.o obj-$(CONFIG_ANDROID_INTF_ALARM_DEV) += alarm-dev.o -obj-$(CONFIG_SYNC) += sync.o +obj-$(CONFIG_SYNC) += sync.o sync_debug.o obj-$(CONFIG_SW_SYNC) += sw_sync.o diff --git a/drivers/staging/android/sw_sync.c b/drivers/staging/android/sw_sync.c index 12a136ec1cec..a76db3ff87cb 100644 --- a/drivers/staging/android/sw_sync.c +++ b/drivers/staging/android/sw_sync.c @@ -50,7 +50,7 @@ static struct sync_pt *sw_sync_pt_dup(struct sync_pt *sync_pt) { struct sw_sync_pt *pt = (struct sw_sync_pt *) sync_pt; struct sw_sync_timeline *obj = - (struct sw_sync_timeline *)sync_pt-parent; + (struct sw_sync_timeline *)sync_pt_parent(sync_pt); return (struct sync_pt *) sw_sync_pt_create(obj, pt-value); } @@ -59,7 +59,7 @@ static int sw_sync_pt_has_signaled(struct sync_pt *sync_pt) { struct sw_sync_pt *pt = (struct sw_sync_pt *)sync_pt; struct sw_sync_timeline *obj = - (struct sw_sync_timeline *)sync_pt-parent; + (struct sw_sync_timeline *)sync_pt_parent(sync_pt); return sw_sync_cmp(obj-value, pt-value) = 0; } @@ -97,7 +97,6 @@ static void sw_sync_pt_value_str(struct sync_pt *sync_pt, char *str, int size) { struct sw_sync_pt *pt = (struct sw_sync_pt *)sync_pt; - snprintf(str, size, %d, pt-value); } @@ -157,7 +156,6 @@ static int sw_sync_open(struct inode *inode, struct file *file) static int sw_sync_release(struct inode *inode, struct file *file) { struct sw_sync_timeline *obj = file-private_data; - sync_timeline_destroy(obj-obj); return 0; } diff --git a/drivers/staging/android/sync.c b/drivers/staging/android/sync.c index 18174f7c871c..c9a0c2cdc81a 100644 --- 
a/drivers/staging/android/sync.c +++ b/drivers/staging/android/sync.c @@ -31,22 +31,13 @@ #define CREATE_TRACE_POINTS #include trace/sync.h -static void sync_fence_signal_pt(struct sync_pt *pt); -static int _sync_pt_has_signaled(struct sync_pt *pt); -static void sync_fence_free(struct kref *kref); -static void sync_dump(void); - -static LIST_HEAD(sync_timeline_list_head); -static DEFINE_SPINLOCK(sync_timeline_list_lock); - -static LIST_HEAD(sync_fence_list_head); -static DEFINE_SPINLOCK(sync_fence_list_lock); +static const struct fence_ops android_fence_ops; +static const struct file_operations sync_fence_fops; struct sync_timeline *sync_timeline_create(const struct sync_timeline_ops *ops, int size, const char *name) { struct sync_timeline *obj; - unsigned long flags; if (size sizeof(struct sync_timeline)) return NULL; @@ -57,17 +48,14 @@ struct sync_timeline *sync_timeline_create(const struct sync_timeline_ops *ops, kref_init(obj-kref); obj-ops = ops; + obj-context
[PATCH v2 4/9] dma-buf: use reservation objects
This allows reservation objects to be used in dma-buf. it's required for implementing polling support on the fences that belong to a dma-buf. Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com Acked-by: Mauro Carvalho Chehab m.che...@samsung.com #drivers/media/v4l2-core/ Acked-by: Thomas Hellstrom thellst...@vmware.com #drivers/gpu/drm/ttm Signed-off-by: Vincent Stehlé vincent.ste...@laposte.net #drivers/gpu/drm/armada/ --- drivers/dma-buf/dma-buf.c | 22 -- drivers/gpu/drm/armada/armada_gem.c|2 +- drivers/gpu/drm/drm_prime.c|8 +++- drivers/gpu/drm/exynos/exynos_drm_dmabuf.c |2 +- drivers/gpu/drm/i915/i915_gem_dmabuf.c |3 ++- drivers/gpu/drm/nouveau/nouveau_drm.c |1 + drivers/gpu/drm/nouveau/nouveau_gem.h |1 + drivers/gpu/drm/nouveau/nouveau_prime.c|7 +++ drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c |2 +- drivers/gpu/drm/radeon/radeon_drv.c|2 ++ drivers/gpu/drm/radeon/radeon_prime.c |8 drivers/gpu/drm/tegra/gem.c|2 +- drivers/gpu/drm/ttm/ttm_object.c |2 +- drivers/media/v4l2-core/videobuf2-dma-contig.c |2 +- drivers/staging/android/ion/ion.c |3 ++- include/drm/drmP.h |3 +++ include/linux/dma-buf.h|9 ++--- 17 files changed, 65 insertions(+), 14 deletions(-) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 840c7fa80983..cd40ca22911f 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -25,10 +25,12 @@ #include linux/fs.h #include linux/slab.h #include linux/dma-buf.h +#include linux/fence.h #include linux/anon_inodes.h #include linux/export.h #include linux/debugfs.h #include linux/seq_file.h +#include linux/reservation.h static inline int is_dma_buf_file(struct file *); @@ -56,6 +58,9 @@ static int dma_buf_release(struct inode *inode, struct file *file) list_del(dmabuf-list_node); mutex_unlock(db_list.lock); + if (dmabuf-resv == (struct reservation_object *)dmabuf[1]) + reservation_object_fini(dmabuf-resv); + kfree(dmabuf); return 0; } @@ -128,6 +133,7 @@ static inline int is_dma_buf_file(struct file *file) * 
@size: [in]Size of the buffer * @flags: [in]mode flags for the file. * @exp_name: [in]name of the exporting module - useful for debugging. + * @resv: [in]reservation-object, NULL to allocate default one. * * Returns, on success, a newly created dma_buf object, which wraps the * supplied private data and operations for dma_buf_ops. On either missing @@ -135,10 +141,17 @@ static inline int is_dma_buf_file(struct file *file) * */ struct dma_buf *dma_buf_export_named(void *priv, const struct dma_buf_ops *ops, - size_t size, int flags, const char *exp_name) + size_t size, int flags, const char *exp_name, + struct reservation_object *resv) { struct dma_buf *dmabuf; struct file *file; + size_t alloc_size = sizeof(struct dma_buf); + if (!resv) + alloc_size += sizeof(struct reservation_object); + else + /* prevent dma_buf[1] == dma_buf-resv */ + alloc_size += 1; if (WARN_ON(!priv || !ops || !ops-map_dma_buf @@ -150,7 +163,7 @@ struct dma_buf *dma_buf_export_named(void *priv, const struct dma_buf_ops *ops, return ERR_PTR(-EINVAL); } - dmabuf = kzalloc(sizeof(struct dma_buf), GFP_KERNEL); + dmabuf = kzalloc(alloc_size, GFP_KERNEL); if (dmabuf == NULL) return ERR_PTR(-ENOMEM); @@ -158,6 +171,11 @@ struct dma_buf *dma_buf_export_named(void *priv, const struct dma_buf_ops *ops, dmabuf-ops = ops; dmabuf-size = size; dmabuf-exp_name = exp_name; + if (!resv) { + resv = (struct reservation_object *)dmabuf[1]; + reservation_object_init(resv); + } + dmabuf-resv = resv; file = anon_inode_getfile(dmabuf, dma_buf_fops, dmabuf, flags); if (IS_ERR(file)) { diff --git a/drivers/gpu/drm/armada/armada_gem.c b/drivers/gpu/drm/armada/armada_gem.c index bb9b642d8485..7496f55611a5 100644 --- a/drivers/gpu/drm/armada/armada_gem.c +++ b/drivers/gpu/drm/armada/armada_gem.c @@ -539,7 +539,7 @@ armada_gem_prime_export(struct drm_device *dev, struct drm_gem_object *obj, int flags) { return dma_buf_export(obj, armada_gem_prime_dmabuf_ops, obj-size, - O_RDWR); + O_RDWR, NULL); } struct drm_gem_object * 
diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c index 304ca8cacbc4..99d578bad17e
[PATCH v2 2/9] fence: dma-buf cross-device synchronization (v18)
A fence can be attached to a buffer which is being filled or consumed by hw, to allow userspace to pass the buffer without waiting to another device. For example, userspace can call the page_flip ioctl to display the next frame of graphics after kicking the GPU but while the GPU is still rendering. The display device sharing the buffer with the GPU would attach a callback to get notified when the GPU's rendering-complete IRQ fires, to update the scan-out address of the display, without having to wake up userspace.

A driver must allocate a fence context for each execution ring that can run in parallel. The function for this takes an argument with how many contexts to allocate:
  + fence_context_alloc()

A fence is a transient, one-shot deal. It is allocated and attached to one or more dma-bufs. When the one that attached it is done with the pending operation, it can signal the fence:
  + fence_signal()

To have a rough approximation whether a fence is fired, call:
  + fence_is_signaled()

The dma-buf-mgr handles tracking, and waiting on, the fences associated with a dma-buf. The one pending on the fence can add an async callback:
  + fence_add_callback()

The callback can optionally be cancelled with:
  + fence_remove_callback()

To wait synchronously, optionally with a timeout:
  + fence_wait()
  + fence_wait_timeout()

When emitting a fence, call:
  + trace_fence_emit()

To annotate that a fence is blocking on another fence, call:
  + trace_fence_annotate_wait_on(fence, on_fence)

A default software-only implementation is provided, which can be used by drivers attaching a fence to a buffer when they have no other means for hw sync. But a memory-backed fence is also envisioned, because it is common that GPUs can write to, or poll on, some memory location for synchronization. For example:

  fence = custom_get_fence(...);
  if ((seqno_fence = to_seqno_fence(fence)) != NULL) {
    struct dma_buf *fence_buf = seqno_fence->sync_buf;
    get_dma_buf(fence_buf);

    ... tell the hw the memory location to wait ...
    custom_wait_on(fence_buf, seqno_fence->seqno_ofs, fence->seqno);
  } else {
    /* fall-back to sw sync */
    fence_add_callback(fence, my_cb);
  }

On SoC platforms, if some other hw mechanism is provided for synchronizing between IP blocks, it could be supported as an alternate implementation with its own fence ops in a similar way.

The enable_signaling callback is used to provide sw signaling in case a cpu waiter is requested or no compatible hardware signaling could be used. The intention is to provide a userspace interface (presumably via eventfd) later, to be used in conjunction with dma-buf's mmap support for sw access to buffers (or for userspace apps that would prefer to do their own synchronization).

v1: Original
v2: After discussion w/ danvet and mlankhorst on #dri-devel, we decided that dma-fence didn't need to care about the sw->hw signaling path (it can be handled the same as the sw->sw case), and therefore the fence->ops can be simplified and more handled in the core. So remove the signal, add_callback, cancel_callback, and wait ops, and replace with a simple enable_signaling() op which can be used to inform a fence supporting hw->hw signaling that one or more devices which do not support hw signaling are waiting (and therefore it should enable an irq or do whatever is necessary in order that the CPU is notified when the fence is passed).
v3: Fix locking fail in attach_fence() and get_fence()
v4: Remove tie-in w/ dma-buf.. after discussion w/ danvet and mlankhorst we decided that we need to be able to attach one fence to N dma-bufs, so using the list_head in the dma-fence struct would be problematic.
v5: [ Maarten Lankhorst ] Updated for dma-bikeshed-fence and dma-buf-manager.
v6: [ Maarten Lankhorst ] I removed dma_fence_cancel_callback and some comments about checking if the fence fired or not. This is broken by design. waitqueue_active during destruction is now fatal, since the signaller should be holding a reference in enable_signaling until it signalled the fence. Pass the original dma_fence_cb along, and call __remove_wait in the dma_fence_callback handler, so that no cleanup needs to be performed.
v7: [ Maarten Lankhorst ] Set cb->func and only enable sw signaling if the fence wasn't signaled yet, for example for hardware fences that may choose to signal blindly.
v8: [ Maarten Lankhorst ] Tons of tiny fixes, moved __dma_fence_init to header and fixed include mess. dma-fence.h now includes dma-buf.h. All members are now initialized, so kmalloc can be used for allocating a dma-fence. More documentation added.
v9: Change compiler bitfields to flags, change return type of enable_signaling to bool. Rework dma_fence_wait. Added dma_fence_is_signaled and dma_fence_wait_timeout. s/dma// and change exports to non GPL. Added fence_is_signaled and fence_enable_sw_signaling calls, add ability to override default
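[Editorial sketch] A minimal waiter built on the calls listed in the commit message might look like this; the return-value handling follows the conventions described there (remaining jiffies on success, 0 on timeout, negative errno on interruption) and should be read as an assumption, not kernel code from the patch:

```c
/* Hedged sketch: a CPU-side waiter using the core fence API above. */
static int wait_for_render(struct fence *fence)
{
	long ret;

	/* Rough, lock-free check; no signaling is enabled by this. */
	if (fence_is_signaled(fence))
		return 0;

	/* Block for at most one second; true = interruptible by signals. */
	ret = fence_wait_timeout(fence, true, HZ);
	if (ret == 0)
		return -ETIMEDOUT;	/* timed out, fence still pending */

	return ret < 0 ? ret : 0;	/* e.g. -ERESTARTSYS, or success */
}
```

The producer side simply calls fence_signal() from its completion interrupt; fence_add_callback()/fence_remove_callback() cover the async case where blocking is not an option.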
[PATCH v2 3/9] seqno-fence: Hardware dma-buf implementation of fencing (v6)
This type of fence can be used with hardware synchronization for simple hardware that can block execution until the condition (dma_buf[offset] - value) >= 0 has been met when WAIT_GEQUAL is used, or (dma_buf[offset] != 0) has been met when WAIT_NONZERO is set.

A software fallback still has to be provided in case the fence is used with a device that doesn't support this mechanism. It is useful to expose this for graphics cards that have an op to support this. Some cards like i915 can export those, but don't have an option to wait, so they need the software fallback.

I extended the original patch by Rob Clark.

v1: Original
v2: Renamed from bikeshed to seqno, moved into dma-fence.c since not much was left of the file. Lots of documentation added.
v3: Use fence_ops instead of custom callbacks. Moved to own file to avoid circular dependency between dma-buf.h and fence.h
v4: Add spinlock pointer to seqno_fence_init
v5: Add condition member to allow wait for != 0. Fix small style errors pointed out by checkpatch.
v6: Move to a separate file. Fix up api changes in fences.
Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com Reviewed-by: Rob Clark robdcl...@gmail.com #v4 --- Documentation/DocBook/device-drivers.tmpl |2 + MAINTAINERS |2 - drivers/dma-buf/Makefile |2 - drivers/dma-buf/seqno-fence.c | 73 ++ include/linux/seqno-fence.h | 116 + 5 files changed, 193 insertions(+), 2 deletions(-) create mode 100644 drivers/dma-buf/seqno-fence.c create mode 100644 include/linux/seqno-fence.h diff --git a/Documentation/DocBook/device-drivers.tmpl b/Documentation/DocBook/device-drivers.tmpl index e634657efb52..ed0ef00cd7bc 100644 --- a/Documentation/DocBook/device-drivers.tmpl +++ b/Documentation/DocBook/device-drivers.tmpl @@ -130,7 +130,9 @@ X!Edrivers/base/interface.c sect1titleDevice Drivers DMA Management/title !Edrivers/dma-buf/dma-buf.c !Edrivers/dma-buf/fence.c +!Edrivers/dma-buf/seqno-fence.c !Iinclude/linux/fence.h +!Iinclude/linux/seqno-fence.h !Iinclude/linux/reservation.h !Edrivers/base/dma-coherent.c !Edrivers/base/dma-mapping.c diff --git a/MAINTAINERS b/MAINTAINERS index ebc1ebf6f542..135929f6cf6a 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -2883,7 +2883,7 @@ L:linux-media@vger.kernel.org L: dri-de...@lists.freedesktop.org L: linaro-mm-...@lists.linaro.org F: drivers/dma-buf/ -F: include/linux/dma-buf* include/linux/reservation.h include/linux/fence.h +F: include/linux/dma-buf* include/linux/reservation.h include/linux/*fence.h F: Documentation/dma-buf-sharing.txt T: git git://git.linaro.org/people/sumitsemwal/linux-dma-buf.git diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile index d7825bfe630e..57a675f90cd0 100644 --- a/drivers/dma-buf/Makefile +++ b/drivers/dma-buf/Makefile @@ -1 +1 @@ -obj-y := dma-buf.o fence.o reservation.o +obj-y := dma-buf.o fence.o reservation.o seqno-fence.o diff --git a/drivers/dma-buf/seqno-fence.c b/drivers/dma-buf/seqno-fence.c new file mode 100644 index ..7d12a39a4b57 --- /dev/null +++ b/drivers/dma-buf/seqno-fence.c @@ -0,0 +1,73 @@ +/* + * seqno-fence, using a 
dma-buf to synchronize fencing
+ *
+ * Copyright (C) 2012 Texas Instruments
+ * Copyright (C) 2012-2014 Canonical Ltd
+ * Authors:
+ *   Rob Clark robdcl...@gmail.com
+ *   Maarten Lankhorst maarten.lankho...@canonical.com
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published by
+ * the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+
+#include <linux/slab.h>
+#include <linux/export.h>
+#include <linux/seqno-fence.h>
+
+static const char *seqno_fence_get_driver_name(struct fence *fence)
+{
+	struct seqno_fence *seqno_fence = to_seqno_fence(fence);
+
+	return seqno_fence->ops->get_driver_name(fence);
+}
+
+static const char *seqno_fence_get_timeline_name(struct fence *fence)
+{
+	struct seqno_fence *seqno_fence = to_seqno_fence(fence);
+
+	return seqno_fence->ops->get_timeline_name(fence);
+}
+
+static bool seqno_enable_signaling(struct fence *fence)
+{
+	struct seqno_fence *seqno_fence = to_seqno_fence(fence);
+
+	return seqno_fence->ops->enable_signaling(fence);
+}
+
+static bool seqno_signaled(struct fence *fence)
+{
+	struct seqno_fence *seqno_fence = to_seqno_fence(fence);
+
+	return seqno_fence->ops->signaled && seqno_fence->ops->signaled(fence);
+}
+
+static void seqno_release(struct fence *fence)
+{
+	struct seqno_fence *f = to_seqno_fence(fence);
+
+	dma_buf_put(f->sync_buf);
+	if (f->ops
[PATCH v2 1/9] dma-buf: move to drivers/dma-buf
Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com --- Documentation/DocBook/device-drivers.tmpl |3 MAINTAINERS |4 drivers/Makefile |1 drivers/base/Makefile |1 drivers/base/dma-buf.c| 743 - drivers/base/reservation.c| 39 -- drivers/dma-buf/Makefile |1 drivers/dma-buf/dma-buf.c | 743 + drivers/dma-buf/reservation.c | 39 ++ 9 files changed, 787 insertions(+), 787 deletions(-) delete mode 100644 drivers/base/dma-buf.c delete mode 100644 drivers/base/reservation.c create mode 100644 drivers/dma-buf/Makefile create mode 100644 drivers/dma-buf/dma-buf.c create mode 100644 drivers/dma-buf/reservation.c diff --git a/Documentation/DocBook/device-drivers.tmpl b/Documentation/DocBook/device-drivers.tmpl index cc63f30de166..ac61ebd92875 100644 --- a/Documentation/DocBook/device-drivers.tmpl +++ b/Documentation/DocBook/device-drivers.tmpl @@ -128,8 +128,7 @@ X!Edrivers/base/interface.c !Edrivers/base/bus.c /sect1 sect1titleDevice Drivers DMA Management/title -!Edrivers/base/dma-buf.c -!Edrivers/base/reservation.c +!Edrivers/dma-buf/dma-buf.c !Iinclude/linux/reservation.h !Edrivers/base/dma-coherent.c !Edrivers/base/dma-mapping.c diff --git a/MAINTAINERS b/MAINTAINERS index 134483f206e4..c948e53a4ee6 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -2882,8 +2882,8 @@ S:Maintained L: linux-media@vger.kernel.org L: dri-de...@lists.freedesktop.org L: linaro-mm-...@lists.linaro.org -F: drivers/base/dma-buf* -F: include/linux/dma-buf* +F: drivers/dma-buf/ +F: include/linux/dma-buf* include/linux/reservation.h F: Documentation/dma-buf-sharing.txt T: git git://git.linaro.org/people/sumitsemwal/linux-dma-buf.git diff --git a/drivers/Makefile b/drivers/Makefile index f98b50d8251d..c00337be5351 100644 --- a/drivers/Makefile +++ b/drivers/Makefile @@ -61,6 +61,7 @@ obj-$(CONFIG_FB_INTEL) += video/fbdev/intelfb/ obj-$(CONFIG_PARPORT) += parport/ obj-y += base/ block/ misc/ mfd/ nfc/ +obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf/ obj-$(CONFIG_NUBUS)+= nubus/ obj-y += macintosh/ 
obj-$(CONFIG_IDE) += ide/ diff --git a/drivers/base/Makefile b/drivers/base/Makefile index 04b314e0fa51..4aab26ec0292 100644 --- a/drivers/base/Makefile +++ b/drivers/base/Makefile @@ -10,7 +10,6 @@ obj-$(CONFIG_DMA_CMA) += dma-contiguous.o obj-y += power/ obj-$(CONFIG_HAS_DMA) += dma-mapping.o obj-$(CONFIG_HAVE_GENERIC_DMA_COHERENT) += dma-coherent.o -obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf.o reservation.o obj-$(CONFIG_ISA) += isa.o obj-$(CONFIG_FW_LOADER)+= firmware_class.o obj-$(CONFIG_NUMA) += node.o diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c deleted file mode 100644 index 840c7fa80983.. --- a/drivers/base/dma-buf.c +++ /dev/null @@ -1,743 +0,0 @@ -/* - * Framework for buffer objects that can be shared across devices/subsystems. - * - * Copyright(C) 2011 Linaro Limited. All rights reserved. - * Author: Sumit Semwal sumit.sem...@ti.com - * - * Many thanks to linaro-mm-sig list, and specially - * Arnd Bergmann a...@arndb.de, Rob Clark r...@ti.com and - * Daniel Vetter dan...@ffwll.ch for their support in creation and - * refining of this idea. - * - * This program is free software; you can redistribute it and/or modify it - * under the terms of the GNU General Public License version 2 as published by - * the Free Software Foundation. - * - * This program is distributed in the hope that it will be useful, but WITHOUT - * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or - * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for - * more details. - * - * You should have received a copy of the GNU General Public License along with - * this program. If not, see http://www.gnu.org/licenses/. 
- */ - -#include linux/fs.h -#include linux/slab.h -#include linux/dma-buf.h -#include linux/anon_inodes.h -#include linux/export.h -#include linux/debugfs.h -#include linux/seq_file.h - -static inline int is_dma_buf_file(struct file *); - -struct dma_buf_list { - struct list_head head; - struct mutex lock; -}; - -static struct dma_buf_list db_list; - -static int dma_buf_release(struct inode *inode, struct file *file) -{ - struct dma_buf *dmabuf; - - if (!is_dma_buf_file(file)) - return -EINVAL; - - dmabuf = file-private_data; - - BUG_ON(dmabuf-vmapping_counter); - - dmabuf-ops-release(dmabuf); - - mutex_lock(db_list.lock); - list_del(dmabuf-list_node); - mutex_unlock(db_list.lock); - - kfree(dmabuf); - return 0; -} - -static int
Re: [PATCH v2 1/9] dma-buf: move to drivers/dma-buf
op 01-07-14 13:06, Arend van Spriel schreef: On 01-07-14 12:57, Maarten Lankhorst wrote: Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com It would help to use '-M' option with format-patch for this kind of rework. Regards, Arend Thanks, was looking for some option but didn't find it. Have a rediff below. :-) 8 Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com --- Documentation/DocBook/device-drivers.tmpl | 3 +-- MAINTAINERS | 4 ++-- drivers/Makefile | 1 + drivers/base/Makefile | 1 - drivers/dma-buf/Makefile | 1 + drivers/{base = dma-buf}/dma-buf.c | 0 drivers/{base = dma-buf}/reservation.c | 0 7 files changed, 5 insertions(+), 5 deletions(-) create mode 100644 drivers/dma-buf/Makefile rename drivers/{base = dma-buf}/dma-buf.c (100%) rename drivers/{base = dma-buf}/reservation.c (100%) diff --git a/Documentation/DocBook/device-drivers.tmpl b/Documentation/DocBook/device-drivers.tmpl index cc63f30de166..ac61ebd92875 100644 --- a/Documentation/DocBook/device-drivers.tmpl +++ b/Documentation/DocBook/device-drivers.tmpl @@ -128,8 +128,7 @@ X!Edrivers/base/interface.c !Edrivers/base/bus.c /sect1 sect1titleDevice Drivers DMA Management/title -!Edrivers/base/dma-buf.c -!Edrivers/base/reservation.c +!Edrivers/dma-buf/dma-buf.c !Iinclude/linux/reservation.h !Edrivers/base/dma-coherent.c !Edrivers/base/dma-mapping.c diff --git a/MAINTAINERS b/MAINTAINERS index 134483f206e4..c948e53a4ee6 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -2882,8 +2882,8 @@ S:Maintained L: linux-media@vger.kernel.org L: dri-de...@lists.freedesktop.org L: linaro-mm-...@lists.linaro.org -F: drivers/base/dma-buf* -F: include/linux/dma-buf* +F: drivers/dma-buf/ +F: include/linux/dma-buf* include/linux/reservation.h F: Documentation/dma-buf-sharing.txt T: git git://git.linaro.org/people/sumitsemwal/linux-dma-buf.git diff --git a/drivers/Makefile b/drivers/Makefile index f98b50d8251d..c00337be5351 100644 --- a/drivers/Makefile +++ b/drivers/Makefile @@ -61,6 +61,7 @@ 
obj-$(CONFIG_FB_INTEL) += video/fbdev/intelfb/ obj-$(CONFIG_PARPORT) += parport/ obj-y += base/ block/ misc/ mfd/ nfc/ +obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf/ obj-$(CONFIG_NUBUS)+= nubus/ obj-y += macintosh/ obj-$(CONFIG_IDE) += ide/ diff --git a/drivers/base/Makefile b/drivers/base/Makefile index 04b314e0fa51..4aab26ec0292 100644 --- a/drivers/base/Makefile +++ b/drivers/base/Makefile @@ -10,7 +10,6 @@ obj-$(CONFIG_DMA_CMA) += dma-contiguous.o obj-y += power/ obj-$(CONFIG_HAS_DMA) += dma-mapping.o obj-$(CONFIG_HAVE_GENERIC_DMA_COHERENT) += dma-coherent.o -obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf.o reservation.o obj-$(CONFIG_ISA) += isa.o obj-$(CONFIG_FW_LOADER)+= firmware_class.o obj-$(CONFIG_NUMA) += node.o diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile new file mode 100644 index ..4a4f4c9bacd0 --- /dev/null +++ b/drivers/dma-buf/Makefile @@ -0,0 +1 @@ +obj-y := dma-buf.o reservation.o diff --git a/drivers/base/dma-buf.c b/drivers/dma-buf/dma-buf.c similarity index 100% rename from drivers/base/dma-buf.c rename to drivers/dma-buf/dma-buf.c diff --git a/drivers/base/reservation.c b/drivers/dma-buf/reservation.c similarity index 100% rename from drivers/base/reservation.c rename to drivers/dma-buf/reservation.c -- 2.0.0 -- To unsubscribe from this list: send the line unsubscribe linux-media in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [REPOST PATCH 4/8] android: convert sync to fence api, v5
Hey, op 20-06-14 22:52, Thierry Reding schreef: On Thu, Jun 19, 2014 at 02:28:14PM +0200, Daniel Vetter wrote: On Thu, Jun 19, 2014 at 1:48 PM, Thierry Reding thierry.red...@gmail.com wrote: With these changes, can we pull the android sync logic out of drivers/staging/ now? Afaik the google guys never really looked at this and acked it. So I'm not sure whether they'll follow along. The other issue I have as the maintainer of gfx driver is that I don't want to implement support for two different sync object primitives (once for dma-buf and once for android syncpts), and my impression thus far has been that even with this we're not there. I'm trying to get our own android guys to upstream their i915 syncpts support, but thus far I haven't managed to convince them to throw people's time at this. This has been discussed a fair bit internally recently and some of our GPU experts have raised concerns that this may result in seriously degraded performance in our proprietary graphics stack. Now I don't care very much for the proprietary graphics stack, but by extension I would assume that the same restrictions are relevant for any open-source driver as well. I'm still trying to fully understand all the implications and at the same time get some of the people who raised concerns to join in this discussion. As I understand it the concern is mostly about explicit vs. implicit synchronization and having this mechanism in the kernel will implicitly synchronize all accesses to these buffers even in cases where it's not needed (read vs. write locks, etc.). In one particular instance it was even mentioned that this kind of implicit synchronization can lead to deadlocks in some use-cases (this was mentioned for Android compositing, but I suspect that the same may happen for Wayland or X compositors). Well the implicit fences here actually can't deadlock. That's the entire point behind using ww mutexes. 
I've also heard tons of complaints about implicit enforced syncing (especially from opencl people), but in the end drivers can always expose unsynchronized access for specific cases. We do that in i915 for upload buffers and other fun stuff.

This is about shared stuff across different drivers and different processes. Tegra K1 needs to share buffers across different drivers even for very basic use-cases since the GPU and display drivers are separate. So while I agree that the GPU driver can still use explicit synchronization for internal work, things aren't that simple in general.

Let me try to reconstruct the use-case that caused the lock on Android: the compositor uses a hardware overlay to display an image. The system detects that there's little activity and instructs the compositor to put everything into one image and scan out only that (for power efficiency). Now with implicit locking the display driver has a lock on the image, so the GPU (used for compositing) needs to wait for it before it can composite everything into one image. But the display driver cannot release the lock on the image until the final image has been composited and can be displayed instead. This may not be technically a deadlock, but it's still a stalemate. Unless I'm missing something fundamental about DMA fences and ww mutexes I don't see how you can get out of this situation.

This sounds like a case for implicit shared fences. ;-) Reading and scanning out would only wait for the last 'exclusive' fence, not on each other. But in drivers/drm I can encounter a similar issue: people expect to be able to overwrite the contents of the currently displayed buffer, so I 'solved' it by not adding a fence on the buffer, only by waiting for buffer idle before page flipping. The rationale is that the buffer is pinned internally, and the backing storage cannot go away until dma_buf_unmap_attachment is called.
So when you render to the current front buffer without queuing a page flip you get exactly what you expect. ;-)

Explicit vs. implicit synchronization may also become more of an issue as buffers are imported from other sources (such as cameras).

Yeah, but the kernel space primitives would in both cases be the same, so drivers don't need to implement 2 separate fencing mechanisms for that. :-)

~Maarten
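[Editorial sketch] The shared-vs-exclusive distinction made in this exchange can be stated concretely: readers (including scanout) only order after the last writer, while a new writer must wait for everyone. A hedged sketch against the reservation layout from patch 8/9 of this series, with locking, RCU and reference counting omitted:

```c
/* Sketch of the access rule discussed above; not code from the series. */
long wait_for_access(struct reservation_object *resv, bool write)
{
	struct reservation_object_list *fobj = resv->fence;
	unsigned i;
	long ret;

	/* Everyone orders after the last exclusive (write) fence. */
	if (resv->fence_excl) {
		ret = fence_wait(resv->fence_excl, true);
		if (ret < 0)
			return ret;
	}

	/* Only a new writer must also wait for all shared (read) fences;
	 * readers never wait on each other, avoiding the stalemate above. */
	if (write && fobj) {
		for (i = 0; i < fobj->shared_count; ++i) {
			ret = fence_wait(fobj->shared[i], true);
			if (ret < 0)
				return ret;
		}
	}
	return 0;
}
```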
Re: [REPOST PATCH 4/8] android: convert sync to fence api, v5
On 19-06-14 17:22, Colin Cross wrote:
> On Wed, Jun 18, 2014 at 11:37 PM, Daniel Vetter dan...@ffwll.ch wrote:
>> On Wed, Jun 18, 2014 at 06:15:56PM -0700, Greg KH wrote:
>>> On Wed, Jun 18, 2014 at 12:37:11PM +0200, Maarten Lankhorst wrote:
>>>> Just to show it's easy. Android syncpoints can be mapped to a timeline. This removes the need to maintain a separate api for synchronization. I've left the android trace events in place, but the core fence events should already be sufficient for debugging.
>>>>
>>>> v2: - Call fence_remove_callback in sync_fence_free if not all fences have fired.
>>>> v3: - Merge Colin Cross' bugfixes, and the android fence merge optimization.
>>>> v4: - Merge with the upstream fixes.
>>>> v5: - Fix small style issues pointed out by Thomas Hellstrom.
>>>>
>>>> Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com
>>>> Acked-by: John Stultz john.stu...@linaro.org
>>>> ---
>>>>  drivers/staging/android/Kconfig      |   1
>>>>  drivers/staging/android/Makefile     |   2
>>>>  drivers/staging/android/sw_sync.c    |   6
>>>>  drivers/staging/android/sync.c       | 913 +++---
>>>>  drivers/staging/android/sync.h       |  79 ++-
>>>>  drivers/staging/android/sync_debug.c | 247 +
>>>>  drivers/staging/android/trace/sync.h |  12
>>>>  7 files changed, 609 insertions(+), 651 deletions(-)
>>>>  create mode 100644 drivers/staging/android/sync_debug.c
>>>
>>> With these changes, can we pull the android sync logic out of drivers/staging/ now?
>>
>> Afaik the google guys never really looked at this and acked it. So I'm not sure whether they'll follow along. The other issue I have as the maintainer of a gfx driver is that I don't want to implement support for two different sync object primitives (once for dma-buf and once for android syncpts), and my impression thus far has been that even with this we're not there.
>
> We have tested these patches to use dma fences to back the android sync driver and not found any major issues. However, my understanding is that dma fences are designed for implicit sync, and explicit sync through the android sync driver is bolted on the side to share code.
> Android is not moving away from explicit sync, but we do wrap all of our userspace sync accesses through libsync (https://android.googlesource.com/platform/system/core/+/master/libsync/sync.c, ignore the sw_sync parts), so if the kernel supported a slightly different userspace explicit sync interface we could adapt to it fairly easily. All we require is that individual kernel drivers need to be able to accept work alongside an fd to wait on, and to return an fd that will signal when the work is done, and that userspace has some way to merge two of those fds, wait on an fd, and get some debugging info from an fd. However, this patch set doesn't do that, it has no way to export a dma fence as an fd except through the android sync driver, so it is not yet ready to fully replace android sync.

Dma fences can be exported as android fences, I just didn't see a need for it yet. :-) To wait on all implicit fences attached to a dma-buf, one could simply poll the dma-buf directly, or use something like an Android userspace fence.

sync_fence_create takes a sync_pt as function argument, but I kept that for source code compatibility, not because it uses any sync_pt functions. Here's a patch to create a userspace fd for a dma-fence instead of an Android fence, applied on top of "android: convert sync to fence api":
diff --git a/drivers/staging/android/sw_sync.c b/drivers/staging/android/sw_sync.c
index a76db3ff87cb..afc3c63b0438 100644
--- a/drivers/staging/android/sw_sync.c
+++ b/drivers/staging/android/sw_sync.c
@@ -184,7 +184,7 @@ static long sw_sync_ioctl_create_fence(struct sw_sync_timeline *obj,
 	}
 	data.name[sizeof(data.name) - 1] = '\0';
-	fence = sync_fence_create(data.name, pt);
+	fence = sync_fence_create(data.name, &pt->base);
 	if (fence == NULL) {
 		sync_pt_free(pt);
 		err = -ENOMEM;
diff --git a/drivers/staging/android/sync.c b/drivers/staging/android/sync.c
index 70b09b5001ba..c89a6f954e41 100644
--- a/drivers/staging/android/sync.c
+++ b/drivers/staging/android/sync.c
@@ -188,7 +188,7 @@ static void fence_check_cb_func(struct fence *f, struct fence_cb *cb)
 }
 
 /* TODO: implement a create which takes more that one sync_pt */
-struct sync_fence *sync_fence_create(const char *name, struct sync_pt *pt)
+struct sync_fence *sync_fence_create(const char *name, struct fence *pt)
 {
 	struct sync_fence *fence;
 
@@ -199,10 +199,10 @@ struct sync_fence *sync_fence_create(const char *name, struct sync_pt *pt)
 	fence->num_fences = 1;
 	atomic_set(&fence->status, 1);
 
-	fence_get(&pt->base);
-	fence->cbs[0].sync_pt = &pt->base;
+	fence_get(pt);
+	fence->cbs[0].sync_pt = pt;
 	fence->cbs[0].fence = fence;
-	if (fence_add_callback(&pt->base, &fence->cbs[0].cb,
+	if (fence_add_callback(pt, &fence->cbs[0].cb
[REPOST PATCH 2/8] seqno-fence: Hardware dma-buf implementation of fencing (v5)
This type of fence can be used with hardware synchronization for simple hardware that can block execution until the condition (dma_buf[offset] - value) >= 0 has been met when WAIT_GEQUAL is used, or (dma_buf[offset] != 0) has been met when WAIT_NONZERO is set. A software fallback still has to be provided in case the fence is used with a device that doesn't support this mechanism. It is useful to expose this for graphics cards that have an op to support it. Some cards like i915 can export those, but don't have an option to wait, so they need the software fallback. I extended the original patch by Rob Clark.

v1: Original
v2: Renamed from bikeshed to seqno, moved into dma-fence.c since not much was left of the file. Lots of documentation added.
v3: Use fence_ops instead of custom callbacks. Moved to own file to avoid circular dependency between dma-buf.h and fence.h
v4: Add spinlock pointer to seqno_fence_init
v5: Add condition member to allow wait for != 0. Fix small style errors pointed out by checkpatch.
Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com
Reviewed-by: Rob Clark robdcl...@gmail.com #v4
---
 Documentation/DocBook/device-drivers.tmpl |   1
 drivers/base/fence.c                      |  52 +
 include/linux/seqno-fence.h               | 119 +
 3 files changed, 172 insertions(+)
 create mode 100644 include/linux/seqno-fence.h

diff --git a/Documentation/DocBook/device-drivers.tmpl b/Documentation/DocBook/device-drivers.tmpl
index 7eef81069d1b..6ca7a11fb893 100644
--- a/Documentation/DocBook/device-drivers.tmpl
+++ b/Documentation/DocBook/device-drivers.tmpl
@@ -131,6 +131,7 @@ X!Edrivers/base/interface.c
 !Edrivers/base/dma-buf.c
 !Edrivers/base/fence.c
 !Iinclude/linux/fence.h
+!Iinclude/linux/seqno-fence.h
 !Edrivers/base/reservation.c
 !Iinclude/linux/reservation.h
 !Edrivers/base/dma-coherent.c
diff --git a/drivers/base/fence.c b/drivers/base/fence.c
index 1da7f4d6542a..752a2dfa505f 100644
--- a/drivers/base/fence.c
+++ b/drivers/base/fence.c
@@ -25,6 +25,7 @@
 #include <linux/export.h>
 #include <linux/atomic.h>
 #include <linux/fence.h>
+#include <linux/seqno-fence.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/fence.h>
@@ -414,3 +415,54 @@ __fence_init(struct fence *fence, const struct fence_ops *ops,
 	trace_fence_init(fence);
 }
 EXPORT_SYMBOL(__fence_init);
+
+static const char *seqno_fence_get_driver_name(struct fence *fence)
+{
+	struct seqno_fence *seqno_fence = to_seqno_fence(fence);
+	return seqno_fence->ops->get_driver_name(fence);
+}
+
+static const char *seqno_fence_get_timeline_name(struct fence *fence)
+{
+	struct seqno_fence *seqno_fence = to_seqno_fence(fence);
+	return seqno_fence->ops->get_timeline_name(fence);
+}
+
+static bool seqno_enable_signaling(struct fence *fence)
+{
+	struct seqno_fence *seqno_fence = to_seqno_fence(fence);
+	return seqno_fence->ops->enable_signaling(fence);
+}
+
+static bool seqno_signaled(struct fence *fence)
+{
+	struct seqno_fence *seqno_fence = to_seqno_fence(fence);
+	return seqno_fence->ops->signaled && seqno_fence->ops->signaled(fence);
+}
+
+static void seqno_release(struct fence *fence)
+{
+	struct seqno_fence *f = to_seqno_fence(fence);
+
+	dma_buf_put(f->sync_buf);
+	if (f->ops->release)
+		f->ops->release(fence);
+	else
+		kfree(f);
+}
+
+static long seqno_wait(struct fence *fence, bool intr, signed long timeout)
+{
+	struct seqno_fence *f = to_seqno_fence(fence);
+	return f->ops->wait(fence, intr, timeout);
+}
+
+const struct fence_ops seqno_fence_ops = {
+	.get_driver_name = seqno_fence_get_driver_name,
+	.get_timeline_name = seqno_fence_get_timeline_name,
+	.enable_signaling = seqno_enable_signaling,
+	.signaled = seqno_signaled,
+	.wait = seqno_wait,
+	.release = seqno_release,
+};
+EXPORT_SYMBOL(seqno_fence_ops);
diff --git a/include/linux/seqno-fence.h b/include/linux/seqno-fence.h
new file mode 100644
index 000000000000..b4d4aad3cadc
--- /dev/null
+++ b/include/linux/seqno-fence.h
@@ -0,0 +1,119 @@
+/*
+ * seqno-fence, using a dma-buf to synchronize fencing
+ *
+ * Copyright (C) 2012 Texas Instruments
+ * Copyright (C) 2012 Canonical Ltd
+ * Authors:
+ *   Rob Clark robdcl...@gmail.com
+ *   Maarten Lankhorst maarten.lankho...@canonical.com
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published by
+ * the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see http://www.gnu.org/licenses
[REPOST PATCH 6/8] dma-buf: add poll support, v3
Thanks to Fengguang Wu for spotting a missing static cast.

v2: - Kill unused variable need_shared.
v3: - Clarify the BUG() in dma_buf_release some more. (Rob Clark)

Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com
---
 drivers/base/dma-buf.c  | 108 +++
 include/linux/dma-buf.h |  12 +
 2 files changed, 120 insertions(+)

diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c
index cd40ca22911f..25e8c4165936 100644
--- a/drivers/base/dma-buf.c
+++ b/drivers/base/dma-buf.c
@@ -30,6 +30,7 @@
 #include <linux/export.h>
 #include <linux/debugfs.h>
 #include <linux/seq_file.h>
+#include <linux/poll.h>
 #include <linux/reservation.h>
 
 static inline int is_dma_buf_file(struct file *);
@@ -52,6 +53,16 @@ static int dma_buf_release(struct inode *inode, struct file *file)
 
 	BUG_ON(dmabuf->vmapping_counter);
 
+	/*
+	 * Any fences that a dma-buf poll can wait on should be signaled
+	 * before releasing dma-buf. This is the responsibility of each
+	 * driver that uses the reservation objects.
+	 *
+	 * If you hit this BUG() it means someone dropped their ref to the
+	 * dma-buf while still having pending operation to the buffer.
+	 */
+	BUG_ON(dmabuf->cb_shared.active || dmabuf->cb_excl.active);
+
 	dmabuf->ops->release(dmabuf);
 
 	mutex_lock(&db_list.lock);
@@ -108,10 +119,103 @@ static loff_t dma_buf_llseek(struct file *file, loff_t offset, int whence)
 	return base + offset;
 }
 
+static void dma_buf_poll_cb(struct fence *fence, struct fence_cb *cb)
+{
+	struct dma_buf_poll_cb_t *dcb = (struct dma_buf_poll_cb_t *)cb;
+	unsigned long flags;
+
+	spin_lock_irqsave(&dcb->poll->lock, flags);
+	wake_up_locked_poll(dcb->poll, dcb->active);
+	dcb->active = 0;
+	spin_unlock_irqrestore(&dcb->poll->lock, flags);
+}
+
+static unsigned int dma_buf_poll(struct file *file, poll_table *poll)
+{
+	struct dma_buf *dmabuf;
+	struct reservation_object *resv;
+	unsigned long events;
+
+	dmabuf = file->private_data;
+	if (!dmabuf || !dmabuf->resv)
+		return POLLERR;
+
+	resv = dmabuf->resv;
+
+	poll_wait(file, &dmabuf->poll, poll);
+
+	events = poll_requested_events(poll) & (POLLIN | POLLOUT);
+	if (!events)
+		return 0;
+
+	ww_mutex_lock(&resv->lock, NULL);
+
+	if (resv->fence_excl && (!(events & POLLOUT) ||
+				 resv->fence_shared_count == 0)) {
+		struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_excl;
+		unsigned long pevents = POLLIN;
+
+		if (resv->fence_shared_count == 0)
+			pevents |= POLLOUT;
+
+		spin_lock_irq(&dmabuf->poll.lock);
+		if (dcb->active) {
+			dcb->active |= pevents;
+			events &= ~pevents;
+		} else
+			dcb->active = pevents;
+		spin_unlock_irq(&dmabuf->poll.lock);
+
+		if (events & pevents) {
+			if (!fence_add_callback(resv->fence_excl,
+						&dcb->cb, dma_buf_poll_cb))
+				events &= ~pevents;
+			else
+				/*
+				 * No callback queued, wake up any additional
+				 * waiters.
+				 */
+				dma_buf_poll_cb(NULL, &dcb->cb);
+		}
+	}
+
+	if ((events & POLLOUT) && resv->fence_shared_count > 0) {
+		struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_shared;
+		int i;
+
+		/* Only queue a new callback if no event has fired yet */
+		spin_lock_irq(&dmabuf->poll.lock);
+		if (dcb->active)
+			events &= ~POLLOUT;
+		else
+			dcb->active = POLLOUT;
+		spin_unlock_irq(&dmabuf->poll.lock);
+
+		if (!(events & POLLOUT))
+			goto out;
+
+		for (i = 0; i < resv->fence_shared_count; ++i)
+			if (!fence_add_callback(resv->fence_shared[i],
+						&dcb->cb, dma_buf_poll_cb)) {
+				events &= ~POLLOUT;
+				break;
+			}
+
+		/* No callback queued, wake up any additional waiters. */
+		if (i == resv->fence_shared_count)
+			dma_buf_poll_cb(NULL, &dcb->cb);
+	}
+
+out:
+	ww_mutex_unlock(&resv->lock);
+	return events;
+}
+
 static const struct file_operations dma_buf_fops = {
 	.release	= dma_buf_release,
 	.mmap		= dma_buf_mmap_internal,
 	.llseek		= dma_buf_llseek,
+	.poll		= dma_buf_poll,
 };
 
 /*
@@ -171,6 +275,10 @@ struct dma_buf
[REPOST PATCH 4/8] android: convert sync to fence api, v5
Just to show it's easy. Android syncpoints can be mapped to a timeline. This removes the need to maintain a separate api for synchronization. I've left the android trace events in place, but the core fence events should already be sufficient for debugging. v2: - Call fence_remove_callback in sync_fence_free if not all fences have fired. v3: - Merge Colin Cross' bugfixes, and the android fence merge optimization. v4: - Merge with the upstream fixes. v5: - Fix small style issues pointed out by Thomas Hellstrom. Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com Acked-by: John Stultz john.stu...@linaro.org --- drivers/staging/android/Kconfig |1 drivers/staging/android/Makefile |2 drivers/staging/android/sw_sync.c|6 drivers/staging/android/sync.c | 913 +++--- drivers/staging/android/sync.h | 79 ++- drivers/staging/android/sync_debug.c | 247 + drivers/staging/android/trace/sync.h | 12 7 files changed, 609 insertions(+), 651 deletions(-) create mode 100644 drivers/staging/android/sync_debug.c diff --git a/drivers/staging/android/Kconfig b/drivers/staging/android/Kconfig index 99e484f845f2..51607e9aa049 100644 --- a/drivers/staging/android/Kconfig +++ b/drivers/staging/android/Kconfig @@ -88,6 +88,7 @@ config SYNC bool Synchronization framework default n select ANON_INODES + select DMA_SHARED_BUFFER ---help--- This option enables the framework for synchronization between multiple drivers. 
Sync implementations can take advantage of hardware diff --git a/drivers/staging/android/Makefile b/drivers/staging/android/Makefile index 0a01e1914905..517ad5ffa429 100644 --- a/drivers/staging/android/Makefile +++ b/drivers/staging/android/Makefile @@ -9,5 +9,5 @@ obj-$(CONFIG_ANDROID_TIMED_OUTPUT) += timed_output.o obj-$(CONFIG_ANDROID_TIMED_GPIO) += timed_gpio.o obj-$(CONFIG_ANDROID_LOW_MEMORY_KILLER)+= lowmemorykiller.o obj-$(CONFIG_ANDROID_INTF_ALARM_DEV) += alarm-dev.o -obj-$(CONFIG_SYNC) += sync.o +obj-$(CONFIG_SYNC) += sync.o sync_debug.o obj-$(CONFIG_SW_SYNC) += sw_sync.o diff --git a/drivers/staging/android/sw_sync.c b/drivers/staging/android/sw_sync.c index 12a136ec1cec..a76db3ff87cb 100644 --- a/drivers/staging/android/sw_sync.c +++ b/drivers/staging/android/sw_sync.c @@ -50,7 +50,7 @@ static struct sync_pt *sw_sync_pt_dup(struct sync_pt *sync_pt) { struct sw_sync_pt *pt = (struct sw_sync_pt *) sync_pt; struct sw_sync_timeline *obj = - (struct sw_sync_timeline *)sync_pt-parent; + (struct sw_sync_timeline *)sync_pt_parent(sync_pt); return (struct sync_pt *) sw_sync_pt_create(obj, pt-value); } @@ -59,7 +59,7 @@ static int sw_sync_pt_has_signaled(struct sync_pt *sync_pt) { struct sw_sync_pt *pt = (struct sw_sync_pt *)sync_pt; struct sw_sync_timeline *obj = - (struct sw_sync_timeline *)sync_pt-parent; + (struct sw_sync_timeline *)sync_pt_parent(sync_pt); return sw_sync_cmp(obj-value, pt-value) = 0; } @@ -97,7 +97,6 @@ static void sw_sync_pt_value_str(struct sync_pt *sync_pt, char *str, int size) { struct sw_sync_pt *pt = (struct sw_sync_pt *)sync_pt; - snprintf(str, size, %d, pt-value); } @@ -157,7 +156,6 @@ static int sw_sync_open(struct inode *inode, struct file *file) static int sw_sync_release(struct inode *inode, struct file *file) { struct sw_sync_timeline *obj = file-private_data; - sync_timeline_destroy(obj-obj); return 0; } diff --git a/drivers/staging/android/sync.c b/drivers/staging/android/sync.c index 18174f7c871c..70b09b5001ba 100644 --- 
a/drivers/staging/android/sync.c +++ b/drivers/staging/android/sync.c @@ -31,22 +31,13 @@ #define CREATE_TRACE_POINTS #include trace/sync.h -static void sync_fence_signal_pt(struct sync_pt *pt); -static int _sync_pt_has_signaled(struct sync_pt *pt); -static void sync_fence_free(struct kref *kref); -static void sync_dump(void); - -static LIST_HEAD(sync_timeline_list_head); -static DEFINE_SPINLOCK(sync_timeline_list_lock); - -static LIST_HEAD(sync_fence_list_head); -static DEFINE_SPINLOCK(sync_fence_list_lock); +static const struct fence_ops android_fence_ops; +static const struct file_operations sync_fence_fops; struct sync_timeline *sync_timeline_create(const struct sync_timeline_ops *ops, int size, const char *name) { struct sync_timeline *obj; - unsigned long flags; if (size sizeof(struct sync_timeline)) return NULL; @@ -57,17 +48,14 @@ struct sync_timeline *sync_timeline_create(const struct sync_timeline_ops *ops, kref_init(obj-kref); obj-ops = ops; + obj-context = fence_context_alloc(1); strlcpy(obj
[REPOST PATCH 5/8] reservation: add support for fences to enable cross-device synchronisation
Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com
Reviewed-by: Rob Clark robdcl...@gmail.com
---
 include/linux/reservation.h | 20 +++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/include/linux/reservation.h b/include/linux/reservation.h
index 813dae960ebd..f3f57460a205 100644
--- a/include/linux/reservation.h
+++ b/include/linux/reservation.h
@@ -6,7 +6,7 @@
  * Copyright (C) 2012 Texas Instruments
  *
  * Authors:
- * Rob Clark rob.cl...@linaro.org
+ * Rob Clark robdcl...@gmail.com
  * Maarten Lankhorst maarten.lankho...@canonical.com
  * Thomas Hellstrom thellstrom-at-vmware-dot-com
  *
@@ -40,22 +40,40 @@
 #define _LINUX_RESERVATION_H
 
 #include <linux/ww_mutex.h>
+#include <linux/fence.h>
+#include <linux/slab.h>
 
 extern struct ww_class reservation_ww_class;
 
 struct reservation_object {
 	struct ww_mutex lock;
+
+	struct fence *fence_excl;
+	struct fence **fence_shared;
+	u32 fence_shared_count, fence_shared_max;
 };
 
 static inline void
 reservation_object_init(struct reservation_object *obj)
 {
 	ww_mutex_init(&obj->lock, &reservation_ww_class);
+
+	obj->fence_shared_count = obj->fence_shared_max = 0;
+	obj->fence_shared = NULL;
+	obj->fence_excl = NULL;
 }
 
 static inline void
 reservation_object_fini(struct reservation_object *obj)
 {
+	int i;
+
+	if (obj->fence_excl)
+		fence_put(obj->fence_excl);
+	for (i = 0; i < obj->fence_shared_count; ++i)
+		fence_put(obj->fence_shared[i]);
+	kfree(obj->fence_shared);
+
 	ww_mutex_destroy(&obj->lock);
 }
[REPOST PATCH 0/8] fence synchronization patches
The following series implements fence and converts dma-buf and android sync to use it. Patch 5 and 6 add support for polling to dma-buf, blocking until all fences are signaled. Patch 7 and 8 provide some helpers, and allow use of RCU in the reservation api. The helpers make it easier to convert ttm, and make dealing with rcu less painful. Patches slightly updated to fix compilation with armada and new atomic primitives, but otherwise identical. --- Maarten Lankhorst (8): fence: dma-buf cross-device synchronization (v17) seqno-fence: Hardware dma-buf implementation of fencing (v5) dma-buf: use reservation objects android: convert sync to fence api, v5 reservation: add support for fences to enable cross-device synchronisation dma-buf: add poll support, v3 reservation: update api and add some helpers reservation: add suppport for read-only access using rcu Documentation/DocBook/device-drivers.tmpl |3 drivers/base/Kconfig |9 drivers/base/Makefile |2 drivers/base/dma-buf.c | 168 drivers/base/fence.c | 468 drivers/base/reservation.c | 440 drivers/gpu/drm/armada/armada_gem.c|2 drivers/gpu/drm/drm_prime.c|8 drivers/gpu/drm/exynos/exynos_drm_dmabuf.c |2 drivers/gpu/drm/i915/i915_gem_dmabuf.c |3 drivers/gpu/drm/nouveau/nouveau_drm.c |1 drivers/gpu/drm/nouveau/nouveau_gem.h |1 drivers/gpu/drm/nouveau/nouveau_prime.c|7 drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c |2 drivers/gpu/drm/radeon/radeon_drv.c|2 drivers/gpu/drm/radeon/radeon_prime.c |8 drivers/gpu/drm/tegra/gem.c|2 drivers/gpu/drm/ttm/ttm_object.c |2 drivers/media/v4l2-core/videobuf2-dma-contig.c |2 drivers/staging/android/Kconfig|1 drivers/staging/android/Makefile |2 drivers/staging/android/ion/ion.c |3 drivers/staging/android/sw_sync.c |6 drivers/staging/android/sync.c | 913 drivers/staging/android/sync.h | 79 +- drivers/staging/android/sync_debug.c | 247 ++ drivers/staging/android/trace/sync.h | 12 include/drm/drmP.h |3 include/linux/dma-buf.h| 21 - include/linux/fence.h | 355 + include/linux/reservation.h| 82 ++ 
 include/linux/seqno-fence.h                    | 119 +++
 include/trace/events/fence.h                   | 128 +++
 33 files changed, 2435 insertions(+), 668 deletions(-)
 create mode 100644 drivers/base/fence.c
 create mode 100644 drivers/staging/android/sync_debug.c
 create mode 100644 include/linux/fence.h
 create mode 100644 include/linux/seqno-fence.h
 create mode 100644 include/trace/events/fence.h
[REPOST PATCH 7/8] reservation: update api and add some helpers
Move the list of shared fences to a struct, and return it in reservation_object_get_list(). Add reservation_object_get_excl to get the exclusive fence. Add reservation_object_reserve_shared(), which reserves space in the reservation_object for 1 more shared fence. reservation_object_add_shared_fence() and reservation_object_add_excl_fence() are used to assign a new fence to a reservation_object pointer, to complete a reservation. Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com Changes since v1: - Add reservation_object_get_excl, reorder code a bit. --- drivers/base/dma-buf.c | 35 +++--- drivers/base/fence.c|4 + drivers/base/reservation.c | 156 +++ include/linux/fence.h |6 ++ include/linux/reservation.h | 56 ++- 5 files changed, 236 insertions(+), 21 deletions(-) diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c index 25e8c4165936..cb8379dfeed5 100644 --- a/drivers/base/dma-buf.c +++ b/drivers/base/dma-buf.c @@ -134,7 +134,10 @@ static unsigned int dma_buf_poll(struct file *file, poll_table *poll) { struct dma_buf *dmabuf; struct reservation_object *resv; + struct reservation_object_list *fobj; + struct fence *fence_excl; unsigned long events; + unsigned shared_count; dmabuf = file-private_data; if (!dmabuf || !dmabuf-resv) @@ -150,12 +153,18 @@ static unsigned int dma_buf_poll(struct file *file, poll_table *poll) ww_mutex_lock(resv-lock, NULL); - if (resv-fence_excl (!(events POLLOUT) || -resv-fence_shared_count == 0)) { + fobj = resv-fence; + if (!fobj) + goto out; + + shared_count = fobj-shared_count; + fence_excl = resv-fence_excl; + + if (fence_excl (!(events POLLOUT) || shared_count == 0)) { struct dma_buf_poll_cb_t *dcb = dmabuf-cb_excl; unsigned long pevents = POLLIN; - if (resv-fence_shared_count == 0) + if (shared_count == 0) pevents |= POLLOUT; spin_lock_irq(dmabuf-poll.lock); @@ -167,19 +176,20 @@ static unsigned int dma_buf_poll(struct file *file, poll_table *poll) spin_unlock_irq(dmabuf-poll.lock); if (events pevents) { - if 
(!fence_add_callback(resv-fence_excl, - dcb-cb, dma_buf_poll_cb)) + if (!fence_add_callback(fence_excl, dcb-cb, + dma_buf_poll_cb)) { events = ~pevents; - else + } else { /* * No callback queued, wake up any additional * waiters. */ dma_buf_poll_cb(NULL, dcb-cb); + } } } - if ((events POLLOUT) resv-fence_shared_count 0) { + if ((events POLLOUT) shared_count 0) { struct dma_buf_poll_cb_t *dcb = dmabuf-cb_shared; int i; @@ -194,15 +204,18 @@ static unsigned int dma_buf_poll(struct file *file, poll_table *poll) if (!(events POLLOUT)) goto out; - for (i = 0; i resv-fence_shared_count; ++i) - if (!fence_add_callback(resv-fence_shared[i], - dcb-cb, dma_buf_poll_cb)) { + for (i = 0; i shared_count; ++i) { + struct fence *fence = fobj-shared[i]; + + if (!fence_add_callback(fence, dcb-cb, + dma_buf_poll_cb)) { events = ~POLLOUT; break; } + } /* No callback queued, wake up any additional waiters. */ - if (i == resv-fence_shared_count) + if (i == shared_count) dma_buf_poll_cb(NULL, dcb-cb); } diff --git a/drivers/base/fence.c b/drivers/base/fence.c index 752a2dfa505f..74d1f7bcb467 100644 --- a/drivers/base/fence.c +++ b/drivers/base/fence.c @@ -170,7 +170,7 @@ void release_fence(struct kref *kref) if (fence-ops-release) fence-ops-release(fence); else - kfree(fence); + free_fence(fence); } EXPORT_SYMBOL(release_fence); @@ -448,7 +448,7 @@ static void seqno_release(struct fence *fence) if (f-ops-release) f-ops-release(fence); else - kfree(f); + free_fence(fence); } static long seqno_wait(struct fence *fence, bool intr, signed long timeout) diff --git a/drivers
[REPOST PATCH 8/8] reservation: add suppport for read-only access using rcu
This adds 4 more functions to deal with rcu.

reservation_object_get_fences_rcu() will obtain the list of shared and exclusive fences without obtaining the ww_mutex.

reservation_object_wait_timeout_rcu() will wait on all fences of the reservation_object, without obtaining the ww_mutex.

reservation_object_test_signaled_rcu() will test if all fences of the reservation_object are signaled without using the ww_mutex.

reservation_object_get_excl() is added because touching the fence_excl member directly will trigger a sparse warning.

Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com
Reviewed-By: Thomas Hellstrom thellst...@vmware.com
---
 drivers/base/dma-buf.c      |  47 +-
 drivers/base/reservation.c  | 336 ---
 include/linux/fence.h       |  20 ++-
 include/linux/reservation.h |  52 +--
 4 files changed, 400 insertions(+), 55 deletions(-)

diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c
index cb8379dfeed5..f3014c448e1e 100644
--- a/drivers/base/dma-buf.c
+++ b/drivers/base/dma-buf.c
@@ -137,7 +137,7 @@ static unsigned int dma_buf_poll(struct file *file, poll_table *poll)
 	struct reservation_object_list *fobj;
 	struct fence *fence_excl;
 	unsigned long events;
-	unsigned shared_count;
+	unsigned shared_count, seq;
 
 	dmabuf = file->private_data;
 	if (!dmabuf || !dmabuf->resv)
@@ -151,14 +151,20 @@ static unsigned int dma_buf_poll(struct file *file, poll_table *poll)
 	if (!events)
 		return 0;
 
-	ww_mutex_lock(&resv->lock, NULL);
+retry:
+	seq = read_seqcount_begin(&resv->seq);
+	rcu_read_lock();
 
-	fobj = resv->fence;
-	if (!fobj)
-		goto out;
-
-	shared_count = fobj->shared_count;
-	fence_excl = resv->fence_excl;
+	fobj = rcu_dereference(resv->fence);
+	if (fobj)
+		shared_count = fobj->shared_count;
+	else
+		shared_count = 0;
+	fence_excl = rcu_dereference(resv->fence_excl);
+	if (read_seqcount_retry(&resv->seq, seq)) {
+		rcu_read_unlock();
+		goto retry;
+	}
 
 	if (fence_excl && (!(events & POLLOUT) || shared_count == 0)) {
 		struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_excl;
@@ -176,14 +182,20 @@ static unsigned int dma_buf_poll(struct file *file, poll_table *poll)
 		spin_unlock_irq(&dmabuf->poll.lock);
 
 		if (events & pevents) {
-			if (!fence_add_callback(fence_excl, &dcb->cb,
+			if (!fence_get_rcu(fence_excl)) {
+				/* force a recheck */
+				events &= ~pevents;
+				dma_buf_poll_cb(NULL, &dcb->cb);
+			} else if (!fence_add_callback(fence_excl, &dcb->cb,
 						dma_buf_poll_cb)) {
 				events &= ~pevents;
+				fence_put(fence_excl);
 			} else {
 				/*
 				 * No callback queued, wake up any additional
 				 * waiters.
 				 */
+				fence_put(fence_excl);
 				dma_buf_poll_cb(NULL, &dcb->cb);
 			}
 		}
@@ -205,13 +217,26 @@ static unsigned int dma_buf_poll(struct file *file, poll_table *poll)
 			goto out;
 
 		for (i = 0; i < shared_count; ++i) {
-			struct fence *fence = fobj->shared[i];
+			struct fence *fence = rcu_dereference(fobj->shared[i]);
 
+			if (!fence_get_rcu(fence)) {
+				/*
+				 * fence refcount dropped to zero, this means
+				 * that fobj has been freed
+				 *
+				 * call dma_buf_poll_cb and force a recheck!
+				 */
+				events &= ~POLLOUT;
+				dma_buf_poll_cb(NULL, &dcb->cb);
+				break;
+			}
 			if (!fence_add_callback(fence, &dcb->cb,
 						dma_buf_poll_cb)) {
+				fence_put(fence);
 				events &= ~POLLOUT;
 				break;
 			}
+			fence_put(fence);
 		}
 
 		/* No callback queued, wake up any additional waiters. */
@@ -220,7 +245,7 @@ static unsigned int dma_buf_poll(struct file *file, poll_table *poll)
 	}
 
 out:
-	ww_mutex_unlock(&resv->lock);
+	rcu_read_unlock();
 	return events;
 }
 
diff --git a/drivers/base/reservation.c b/drivers/base
[REPOST PATCH 1/8] fence: dma-buf cross-device synchronization (v17)
A fence can be attached to a buffer which is being filled or consumed by hw, to allow userspace to pass the buffer to another device without waiting. For example, userspace can call the page_flip ioctl to display the next frame of graphics after kicking the GPU, but while the GPU is still rendering. The display device sharing the buffer with the GPU would attach a callback to get notified when the GPU's rendering-complete IRQ fires, to update the scan-out address of the display, without having to wake up userspace.

A driver must allocate a fence context for each execution ring that can run in parallel. The function for this takes an argument with how many contexts to allocate:
+ fence_context_alloc()

A fence is a transient, one-shot deal. It is allocated and attached to one or more dma-buf's. When the one that attached it is done with the pending operation, it can signal the fence:
+ fence_signal()

To have a rough approximation whether a fence has fired, call:
+ fence_is_signaled()

The dma-buf-mgr handles tracking, and waiting on, the fences associated with a dma-buf. The one pending on the fence can add an async callback:
+ fence_add_callback()

The callback can optionally be cancelled with:
+ fence_remove_callback()

To wait synchronously, optionally with a timeout:
+ fence_wait()
+ fence_wait_timeout()

When emitting a fence, call:
+ trace_fence_emit()

To annotate that a fence is blocking on another fence, call:
+ trace_fence_annotate_wait_on(fence, on_fence)

A default software-only implementation is provided, which can be used by drivers attaching a fence to a buffer when they have no other means for hw sync. But a memory-backed fence is also envisioned, because it is common that GPUs can write to, or poll on, some memory location for synchronization. For example:

    fence = custom_get_fence(...);
    if ((seqno_fence = to_seqno_fence(fence)) != NULL) {
        dma_buf *fence_buf = seqno_fence->sync_buf;
        get_dma_buf(fence_buf);

        ... tell the hw the memory location to wait ...
        custom_wait_on(fence_buf, seqno_fence->seqno_ofs, fence->seqno);
    } else {
        /* fall-back to sw sync */
        fence_add_callback(fence, my_cb);
    }

On SoC platforms, if some other hw mechanism is provided for synchronizing between IP blocks, it could be supported as an alternate implementation with its own fence ops in a similar way.

The enable_signaling callback is used to provide sw signaling in case a cpu waiter is requested or no compatible hardware signaling could be used. The intention is to provide a userspace interface (presumably via eventfd) later, to be used in conjunction with dma-buf's mmap support for sw access to buffers (or for userspace apps that would prefer to do their own synchronization).

v1: Original
v2: After discussion w/ danvet and mlankhorst on #dri-devel, we decided that dma-fence didn't need to care about the sw->hw signaling path (it can be handled same as sw->sw case), and therefore the fence ops can be simplified and more handled in the core. So remove the signal, add_callback, cancel_callback, and wait ops, and replace with a simple enable_signaling() op which can be used to inform a fence supporting hw->hw signaling that one or more devices which do not support hw signaling are waiting (and therefore it should enable an irq or do whatever is necessary in order that the CPU is notified when the fence is passed).
v3: Fix locking fail in attach_fence() and get_fence()
v4: Remove tie-in w/ dma-buf.. after discussion w/ danvet and mlankhorst we decided that we need to be able to attach one fence to N dma-buf's, so using the list_head in dma-fence struct would be problematic.
v5: [ Maarten Lankhorst ] Updated for dma-bikeshed-fence and dma-buf-manager.
v6: [ Maarten Lankhorst ] I removed dma_fence_cancel_callback and some comments about checking if fence fired or not. This is broken by design. waitqueue_active during destruction is now fatal, since the signaller should be holding a reference in enable_signalling until it signalled the fence. Pass the original dma_fence_cb along, and call __remove_wait in the dma_fence_callback handler, so that no cleanup needs to be performed.
v7: [ Maarten Lankhorst ] Set cb->func and only enable sw signaling if fence wasn't signaled yet, for example for hardware fences that may choose to signal blindly.
v8: [ Maarten Lankhorst ] Tons of tiny fixes, moved __dma_fence_init to header and fixed include mess. dma-fence.h now includes dma-buf.h. All members are now initialized, so kmalloc can be used for allocating a dma-fence. More documentation added.
v9: Change compiler bitfields to flags, change return type of enable_signaling to bool. Rework dma_fence_wait. Added dma_fence_is_signaled and dma_fence_wait_timeout. s/dma// and change exports to non GPL. Added fence_is_signaled and fence_enable_sw_signaling calls, add ability to override default
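The one-shot callback flow described above — including the optional enable_signaling from the patch at the top of this thread — can be sketched as a small userspace model. This is a toy, not the kernel API: all `toy_*` names are invented, there is no locking, and -1 stands in for -ENOENT/-EINVAL.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of the dma-fence flow: a fence is a one-shot object;
 * add_callback enables signaling (if the op exists) and signal fires
 * each pending callback exactly once. */

struct toy_fence;

struct toy_fence_ops {
	/* Optional: a NULL enable_signaling means signaling is always
	 * available, mirroring the patch this thread starts with. */
	bool (*enable_signaling)(struct toy_fence *fence);
};

struct toy_cb {
	void (*func)(struct toy_fence *fence, struct toy_cb *cb);
	struct toy_cb *next;
};

struct toy_fence {
	const struct toy_fence_ops *ops;
	bool signaled;
	bool signaling_enabled;
	struct toy_cb *cb_list;
};

/* Returns -1 (like -ENOENT) if the fence already signaled. */
int toy_fence_add_callback(struct toy_fence *fence, struct toy_cb *cb,
			   void (*func)(struct toy_fence *, struct toy_cb *))
{
	if (fence->signaled)
		return -1;
	if (!fence->signaling_enabled && fence->ops->enable_signaling) {
		fence->signaling_enabled = true;
		if (!fence->ops->enable_signaling(fence)) {
			/* Fence signaled while enabling it. */
			fence->signaled = true;
			return -1;
		}
	}
	cb->func = func;
	cb->next = fence->cb_list;
	fence->cb_list = cb;
	return 0;
}

int toy_fence_signal(struct toy_fence *fence)
{
	struct toy_cb *cb;

	if (fence->signaled)
		return -1;	/* already signaled, like -EINVAL */
	fence->signaled = true;
	for (cb = fence->cb_list; cb; cb = cb->next)
		cb->func(fence, cb);
	fence->cb_list = NULL;
	return 0;
}
```

The second signal and any add_callback after signaling both fail, which is the "transient, one-shot deal" semantics described above.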
[REPOST PATCH 3/8] dma-buf: use reservation objects
This allows reservation objects to be used in dma-buf. It's required for implementing polling support on the fences that belong to a dma-buf.

Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com
Acked-by: Mauro Carvalho Chehab m.che...@samsung.com #drivers/media/v4l2-core/
Acked-by: Thomas Hellstrom thellst...@vmware.com #drivers/gpu/drm/ttm
Signed-off-by: Vincent Stehlé vincent.ste...@laposte.net #drivers/gpu/drm/armada/
---
 drivers/base/dma-buf.c                         | 22 --
 drivers/gpu/drm/armada/armada_gem.c            |  2 +-
 drivers/gpu/drm/drm_prime.c                    |  8 +++-
 drivers/gpu/drm/exynos/exynos_drm_dmabuf.c     |  2 +-
 drivers/gpu/drm/i915/i915_gem_dmabuf.c         |  3 ++-
 drivers/gpu/drm/nouveau/nouveau_drm.c          |  1 +
 drivers/gpu/drm/nouveau/nouveau_gem.h          |  1 +
 drivers/gpu/drm/nouveau/nouveau_prime.c        |  7 +++
 drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c      |  2 +-
 drivers/gpu/drm/radeon/radeon_drv.c            |  2 ++
 drivers/gpu/drm/radeon/radeon_prime.c          |  8
 drivers/gpu/drm/tegra/gem.c                    |  2 +-
 drivers/gpu/drm/ttm/ttm_object.c               |  2 +-
 drivers/media/v4l2-core/videobuf2-dma-contig.c |  2 +-
 drivers/staging/android/ion/ion.c              |  3 ++-
 include/drm/drmP.h                             |  3 +++
 include/linux/dma-buf.h                        |  9 ++---
 17 files changed, 65 insertions(+), 14 deletions(-)

diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c
index 840c7fa80983..cd40ca22911f 100644
--- a/drivers/base/dma-buf.c
+++ b/drivers/base/dma-buf.c
@@ -25,10 +25,12 @@
 #include <linux/fs.h>
 #include <linux/slab.h>
 #include <linux/dma-buf.h>
+#include <linux/fence.h>
 #include <linux/anon_inodes.h>
 #include <linux/export.h>
 #include <linux/debugfs.h>
 #include <linux/seq_file.h>
+#include <linux/reservation.h>

 static inline int is_dma_buf_file(struct file *);

@@ -56,6 +58,9 @@ static int dma_buf_release(struct inode *inode, struct file *file)
 	list_del(&dmabuf->list_node);
 	mutex_unlock(&db_list.lock);

+	if (dmabuf->resv == (struct reservation_object *)&dmabuf[1])
+		reservation_object_fini(dmabuf->resv);
+
 	kfree(dmabuf);
 	return 0;
 }
@@ -128,6 +133,7 @@ static inline int is_dma_buf_file(struct file *file)
  * @size:	[in]	Size of the buffer
  * @flags:	[in]	mode flags for the file.
  * @exp_name:	[in]	name of the exporting module - useful for debugging.
+ * @resv:	[in]	reservation-object, NULL to allocate default one.
  *
  * Returns, on success, a newly created dma_buf object, which wraps the
  * supplied private data and operations for dma_buf_ops. On either missing
@@ -135,10 +141,17 @@ static inline int is_dma_buf_file(struct file *file)
  *
  */
 struct dma_buf *dma_buf_export_named(void *priv, const struct dma_buf_ops *ops,
-				size_t size, int flags, const char *exp_name)
+				size_t size, int flags, const char *exp_name,
+				struct reservation_object *resv)
 {
 	struct dma_buf *dmabuf;
 	struct file *file;
+	size_t alloc_size = sizeof(struct dma_buf);
+	if (!resv)
+		alloc_size += sizeof(struct reservation_object);
+	else
+		/* prevent &dma_buf[1] == dma_buf->resv */
+		alloc_size += 1;

 	if (WARN_ON(!priv || !ops
 		  || !ops->map_dma_buf
@@ -150,7 +163,7 @@ struct dma_buf *dma_buf_export_named(void *priv, const struct dma_buf_ops *ops,
 		return ERR_PTR(-EINVAL);
 	}

-	dmabuf = kzalloc(sizeof(struct dma_buf), GFP_KERNEL);
+	dmabuf = kzalloc(alloc_size, GFP_KERNEL);
 	if (dmabuf == NULL)
 		return ERR_PTR(-ENOMEM);

@@ -158,6 +171,11 @@ struct dma_buf *dma_buf_export_named(void *priv, const struct dma_buf_ops *ops,
 	dmabuf->ops = ops;
 	dmabuf->size = size;
 	dmabuf->exp_name = exp_name;
+	if (!resv) {
+		resv = (struct reservation_object *)&dmabuf[1];
+		reservation_object_init(resv);
+	}
+	dmabuf->resv = resv;

 	file = anon_inode_getfile("dmabuf", &dma_buf_fops, dmabuf, flags);
 	if (IS_ERR(file)) {
diff --git a/drivers/gpu/drm/armada/armada_gem.c b/drivers/gpu/drm/armada/armada_gem.c
index bb9b642d8485..7496f55611a5 100644
--- a/drivers/gpu/drm/armada/armada_gem.c
+++ b/drivers/gpu/drm/armada/armada_gem.c
@@ -539,7 +539,7 @@ armada_gem_prime_export(struct drm_device *dev, struct drm_gem_object *obj,
 	int flags)
 {
 	return dma_buf_export(obj, &armada_gem_prime_dmabuf_ops, obj->size,
-		O_RDWR);
+		O_RDWR, NULL);
 }

 struct drm_gem_object *
diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 304ca8cacbc4..99d578bad17e 100644
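The allocation trick in the patch above — embedding the default reservation object in the same kzalloc as the dma_buf, and detecting that case at release time by comparing against `&dmabuf[1]` — can be illustrated with a small userspace sketch. The `toy_*` types are stand-ins for the kernel structures, and calloc/free replace kzalloc/kfree.

```c
#include <stdlib.h>

/* Sketch of the embedded-trailing-object pattern: when the caller passes
 * no reservation object, one extra struct is tacked onto the end of the
 * same allocation, and release recognizes that case by pointer identity. */

struct toy_resv { int initialized; };

struct toy_buf {
	struct toy_resv *resv;
	/* ... other members ... */
};

struct toy_buf *toy_buf_export(struct toy_resv *resv)
{
	size_t alloc_size = sizeof(struct toy_buf);
	struct toy_buf *buf;

	if (!resv)
		alloc_size += sizeof(struct toy_resv);
	else
		alloc_size += 1;	/* mirrors the patch's padding byte */

	buf = calloc(1, alloc_size);
	if (!buf)
		return NULL;
	if (!resv) {
		/* Embed the default object right after the buf struct. */
		resv = (struct toy_resv *)&buf[1];
		resv->initialized = 1;
	}
	buf->resv = resv;
	return buf;
}

void toy_buf_release(struct toy_buf *buf)
{
	/* Only tear down a reservation object we embedded ourselves;
	 * a caller-provided one is the caller's to clean up. */
	if (buf->resv == (struct toy_resv *)&buf[1])
		buf->resv->initialized = 0;
	free(buf);
}
```

The pointer comparison is what lets a single `resv` member serve both cases without an extra "owned" flag.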
Re: [RFC PATCH 2/2 with seqcount v3] reservation: add support for read-only access using rcu
On 20-05-14 17:13, Thomas Hellstrom wrote:
On 05/19/2014 03:13 PM, Maarten Lankhorst wrote:
On 19-05-14 15:42, Thomas Hellstrom wrote:
Hi, Maarten! Some nitpicks, and that krealloc within rcu lock still worries me. Otherwise looks good. /Thomas

On 04/23/2014 12:15 PM, Maarten Lankhorst wrote:

@@ -55,8 +60,8 @@ int reservation_object_reserve_shared(struct reservation_object *obj)
 		kfree(obj->staged);
 		obj->staged = NULL;
 		return 0;
-	}
-	max = old->shared_max * 2;
+	} else
+		max = old->shared_max * 2;

Perhaps as a separate reformatting patch?

I'll fold it in to the patch that added reservation_object_reserve_shared.

+int reservation_object_get_fences_rcu(struct reservation_object *obj,
+				      struct fence **pfence_excl,
+				      unsigned *pshared_count,
+				      struct fence ***pshared)
+{
+	unsigned shared_count = 0;
+	unsigned retry = 1;
+	struct fence **shared = NULL, *fence_excl = NULL;
+	int ret = 0;
+
+	while (retry) {
+		struct reservation_object_list *fobj;
+		unsigned seq;
+
+		seq = read_seqcount_begin(&obj->seq);
+
+		rcu_read_lock();
+
+		fobj = rcu_dereference(obj->fence);
+		if (fobj) {
+			struct fence **nshared;
+
+			shared_count = ACCESS_ONCE(fobj->shared_count);

ACCESS_ONCE() shouldn't be needed inside the seqlock?

Yes it is, shared_count may be increased, leading to potential different sizes for krealloc and memcpy if the ACCESS_ONCE is removed. I could use shared_max here instead, which stays the same, but it would waste more memory.

Maarten, another perhaps ignorant question WRT this: does ACCESS_ONCE() guarantee that the value accessed is read atomically?

Well I've reworked the code to use shared_max, so this point is moot. :-) On any archs I'm aware of it would work, either the old or new value would be visible, as long as natural alignment is used. rcu uses the same trick in the rcu_dereference macro, so if this didn't work rcu wouldn't work either.
~Maarten -- To unsubscribe from this list: send the line unsubscribe linux-media in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH 2/2 with seqcount v3] reservation: add support for read-only access using rcu
On 19-05-14 15:42, Thomas Hellstrom wrote:
Hi, Maarten! Some nitpicks, and that krealloc within rcu lock still worries me. Otherwise looks good. /Thomas

On 04/23/2014 12:15 PM, Maarten Lankhorst wrote:

@@ -55,8 +60,8 @@ int reservation_object_reserve_shared(struct reservation_object *obj)
 		kfree(obj->staged);
 		obj->staged = NULL;
 		return 0;
-	}
-	max = old->shared_max * 2;
+	} else
+		max = old->shared_max * 2;

Perhaps as a separate reformatting patch?

I'll fold it in to the patch that added reservation_object_reserve_shared.

+int reservation_object_get_fences_rcu(struct reservation_object *obj,
+				      struct fence **pfence_excl,
+				      unsigned *pshared_count,
+				      struct fence ***pshared)
+{
+	unsigned shared_count = 0;
+	unsigned retry = 1;
+	struct fence **shared = NULL, *fence_excl = NULL;
+	int ret = 0;
+
+	while (retry) {
+		struct reservation_object_list *fobj;
+		unsigned seq;
+
+		seq = read_seqcount_begin(&obj->seq);
+
+		rcu_read_lock();
+
+		fobj = rcu_dereference(obj->fence);
+		if (fobj) {
+			struct fence **nshared;
+
+			shared_count = ACCESS_ONCE(fobj->shared_count);

ACCESS_ONCE() shouldn't be needed inside the seqlock?

Yes it is, shared_count may be increased, leading to potential different sizes for krealloc and memcpy if the ACCESS_ONCE is removed. I could use shared_max here instead, which stays the same, but it would waste more memory.

+			nshared = krealloc(shared, sizeof(*shared) * shared_count, GFP_KERNEL);

Again, krealloc should be a sleeping function, and not suitable within a RCU read lock? I still think this krealloc should be moved to the start of the retry loop, and we should start with a suitable guess of shared_count (perhaps 0?). It's not like we're going to waste a lot of memory.

But shared_count is only known when holding the rcu lock. What about this change?
@@ -254,16 +254,27 @@ int reservation_object_get_fences_rcu(struct reservation_object *obj,
 		fobj = rcu_dereference(obj->fence);
 		if (fobj) {
 			struct fence **nshared;
+			size_t sz;

 			shared_count = ACCESS_ONCE(fobj->shared_count);
-			nshared = krealloc(shared, sizeof(*shared) * shared_count, GFP_KERNEL);
+			sz = sizeof(*shared) * shared_count;
+
+			nshared = krealloc(shared, sz,
+					   GFP_NOWAIT | __GFP_NOWARN);
 			if (!nshared) {
+				rcu_read_unlock();
+				nshared = krealloc(shared, sz, GFP_KERNEL);
+				if (nshared) {
+					shared = nshared;
+					continue;
+				}
+
 				ret = -ENOMEM;
-				shared_count = retry = 0;
-				goto unlock;
+				shared_count = 0;
+				break;
 			}
 			shared = nshared;
-			memcpy(shared, fobj->shared, sizeof(*shared) * shared_count);
+			memcpy(shared, fobj->shared, sz);
 		} else
 			shared_count = 0;
 		fence_excl = rcu_dereference(obj->fence_excl);
+
+		/*
+		 * There could be a read_seqcount_retry here, but nothing cares
+		 * about whether it's the old or newer fence pointers that are
+		 * signale. That race could still have happened after checking

Typo.

Oops.
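The control flow of the GFP_NOWAIT/GFP_KERNEL fallback discussed above — try a non-blocking allocation inside the read-side section, and on failure drop out, allocate with a blocking call, then restart the loop so the protected data is re-read — can be modeled in userspace. The atomic-context flag and the injected failure below are simulations for illustration; none of this is kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

static bool in_atomic;		/* models "inside rcu_read_lock()" */
static int nowait_failures;	/* inject this many non-blocking failures */

/* Non-blocking allocation: legal in the (simulated) atomic section. */
static void *alloc_nowait(void *old, size_t sz)
{
	assert(in_atomic);	/* in this model, only used inside the section */
	if (nowait_failures > 0) {
		nowait_failures--;
		return NULL;
	}
	return realloc(old, sz);
}

/* Blocking allocation: must never run in the atomic section. */
static void *alloc_blocking(void *old, size_t sz)
{
	assert(!in_atomic);
	return realloc(old, sz);
}

/* Copy src[0..count) into a freshly sized snapshot buffer, retrying the
 * whole read if only the blocking allocator could satisfy the request. */
int *snapshot_array(const int *src, size_t count)
{
	int *shared = NULL;

	for (;;) {
		int *nshared;
		size_t sz = count * sizeof(*shared);

		in_atomic = true;		/* "rcu_read_lock()" */
		nshared = alloc_nowait(shared, sz);
		if (!nshared) {
			in_atomic = false;	/* "rcu_read_unlock()" */
			nshared = alloc_blocking(shared, sz);
			if (!nshared) {
				free(shared);
				return NULL;
			}
			shared = nshared;
			continue;	/* re-enter and re-read the data */
		}
		shared = nshared;
		memcpy(shared, src, sz);
		in_atomic = false;
		return shared;
	}
}
```

The key property, same as in the diff, is that after the blocking allocation the loop restarts: the data is never trusted across the window where the read-side section was dropped.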
Re: [RFC PATCH 2/2 with seqcount v3] reservation: add support for read-only access using rcu
On 23-04-14 13:15, Maarten Lankhorst wrote:
This adds 4 more functions to deal with rcu. reservation_object_get_fences_rcu() will obtain the list of shared and exclusive fences without obtaining the ww_mutex. reservation_object_wait_timeout_rcu() will wait on all fences of the reservation_object, without obtaining the ww_mutex. reservation_object_test_signaled_rcu() will test if all fences of the reservation_object are signaled without using the ww_mutex. reservation_object_get_excl() is added because touching the fence_excl member directly will trigger a sparse warning. Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com --- Using seqcount and fixing some lockdep bugs. Changes since v2: - Fix some crashes, remove some unneeded barriers when provided by seqcount writes - Fix code to work correctly with sparse's RCU annotations. - Create a global string for the seqcount lock to make lockdep happy. Can I get this version reviewed? If it looks correct I'll mail the full series because it's intertwined with the TTM conversion to use this code.

Ping, can anyone review this?
[RFC PATCH 2/2 with seqcount v3] reservation: add support for read-only access using rcu
This adds 4 more functions to deal with rcu. reservation_object_get_fences_rcu() will obtain the list of shared and exclusive fences without obtaining the ww_mutex. reservation_object_wait_timeout_rcu() will wait on all fences of the reservation_object, without obtaining the ww_mutex. reservation_object_test_signaled_rcu() will test if all fences of the reservation_object are signaled without using the ww_mutex. reservation_object_get_excl() is added because touching the fence_excl member directly will trigger a sparse warning. Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com --- Using seqcount and fixing some lockdep bugs. Changes since v2: - Fix some crashes, remove some unneeded barriers when provided by seqcount writes - Fix code to work correctly with sparse's RCU annotations. - Create a global string for the seqcount lock to make lockdep happy. Can I get this version reviewed? If it looks correct I'll mail the full series because it's intertwined with the TTM conversion to use this code. 
See http://cgit.freedesktop.org/~mlankhorst/linux/log/?h=vmwgfx_wip
---
diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c
index d89a98d2c37b..0df673f812eb 100644
--- a/drivers/base/dma-buf.c
+++ b/drivers/base/dma-buf.c
@@ -137,7 +137,7 @@ static unsigned int dma_buf_poll(struct file *file, poll_table *poll)
 	struct reservation_object_list *fobj;
 	struct fence *fence_excl;
 	unsigned long events;
-	unsigned shared_count;
+	unsigned shared_count, seq;

 	dmabuf = file->private_data;
 	if (!dmabuf || !dmabuf->resv)
@@ -151,14 +151,20 @@ static unsigned int dma_buf_poll(struct file *file, poll_table *poll)
 	if (!events)
 		return 0;

-	ww_mutex_lock(&resv->lock, NULL);
+retry:
+	seq = read_seqcount_begin(&resv->seq);
+	rcu_read_lock();

-	fobj = resv->fence;
-	if (!fobj)
-		goto out;
-
-	shared_count = fobj->shared_count;
-	fence_excl = resv->fence_excl;
+	fobj = rcu_dereference(resv->fence);
+	if (fobj)
+		shared_count = fobj->shared_count;
+	else
+		shared_count = 0;
+	fence_excl = rcu_dereference(resv->fence_excl);
+	if (read_seqcount_retry(&resv->seq, seq)) {
+		rcu_read_unlock();
+		goto retry;
+	}

 	if (fence_excl && (!(events & POLLOUT) || shared_count == 0)) {
 		struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_excl;
@@ -176,14 +182,20 @@ static unsigned int dma_buf_poll(struct file *file, poll_table *poll)
 		spin_unlock_irq(&dmabuf->poll.lock);

 		if (events & pevents) {
-			if (!fence_add_callback(fence_excl, &dcb->cb,
+			if (!fence_get_rcu(fence_excl)) {
+				/* force a recheck */
+				events &= ~pevents;
+				dma_buf_poll_cb(NULL, &dcb->cb);
+			} else if (!fence_add_callback(fence_excl, &dcb->cb,
 						dma_buf_poll_cb)) {
 				events &= ~pevents;
+				fence_put(fence_excl);
 			} else {
 				/*
 				 * No callback queued, wake up any additional
 				 * waiters.
 				 */
+				fence_put(fence_excl);
 				dma_buf_poll_cb(NULL, &dcb->cb);
 			}
 		}
@@ -205,13 +217,26 @@ static unsigned int dma_buf_poll(struct file *file, poll_table *poll)
 			goto out;

 		for (i = 0; i < shared_count; ++i) {
-			struct fence *fence = fobj->shared[i];
+			struct fence *fence = rcu_dereference(fobj->shared[i]);
+
+			if (!fence_get_rcu(fence)) {
+				/*
+				 * fence refcount dropped to zero, this means
+				 * that fobj has been freed
+				 *
+				 * call dma_buf_poll_cb and force a recheck!
+				 */
+				events &= ~POLLOUT;
+				dma_buf_poll_cb(NULL, &dcb->cb);
+				break;
+			}
 			if (!fence_add_callback(fence, &dcb->cb,
 						dma_buf_poll_cb)) {
+				fence_put(fence);
 				events &= ~POLLOUT;
 				break;
 			}
+			fence_put(fence);
 		}

 		/* No callback queued, wake up any additional waiters. */
@@ -220,7 +245,7 @@ static unsigned int dma_buf_poll(struct file *file, poll_table *poll)
 	}

 out:
-	ww_mutex_unlock(&resv->lock
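The fence_get_rcu() failure handling in the poll code above relies on take-a-reference-unless-it-is-already-zero semantics (kernel `atomic_inc_not_zero()` / `kref_get_unless_zero()`): a zero count means the object is already on its way to being freed after the RCU grace period, so the reader must back off and recheck. A userspace sketch with C11 atomics, assuming the usual CAS-loop formulation:

```c
#include <stdatomic.h>
#include <stdbool.h>

struct toy_ref {
	atomic_uint refcount;
};

/* Take a reference only if the count hasn't already dropped to zero. */
bool toy_get_unless_zero(struct toy_ref *r)
{
	unsigned int old = atomic_load(&r->refcount);

	while (old != 0) {
		/* CAS loop: only ever move old -> old + 1, never 0 -> 1.
		 * On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&r->refcount, &old, old + 1))
			return true;
	}
	return false;
}
```

This is exactly why the poll code can treat a failed fence_get_rcu() as "the fence list went away under us" and force a recheck instead of touching freed memory.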
Re: [PATCH 2/2] [RFC v2 with seqcount] reservation: add support for read-only access using rcu
op 11-04-14 21:30, Thomas Hellstrom schreef: Hi! On 04/11/2014 08:09 PM, Maarten Lankhorst wrote: op 11-04-14 12:11, Thomas Hellstrom schreef: On 04/11/2014 11:24 AM, Maarten Lankhorst wrote: op 11-04-14 10:38, Thomas Hellstrom schreef: Hi, Maarten. Here I believe we encounter a lot of locking inconsistencies. First, it seems you're use a number of pointers as RCU pointers without annotating them as such and use the correct rcu macros when assigning those pointers. Some pointers (like the pointers in the shared fence list) are both used as RCU pointers (in dma_buf_poll()) for example, or considered protected by the seqlock (reservation_object_get_fences_rcu()), which I believe is OK, but then the pointers must be assigned using the correct rcu macros. In the memcpy in reservation_object_get_fences_rcu() we might get away with an ugly typecast, but with a verbose comment that the pointers are considered protected by the seqlock at that location. So I've updated (attached) the headers with proper __rcu annotation and locking comments according to how they are being used in the various reading functions. I believe if we want to get rid of this we need to validate those pointers using the seqlock as well. This will generate a lot of sparse warnings in those places needing rcu_dereference() rcu_assign_pointer() rcu_dereference_protected() With this I think we can get rid of all ACCESS_ONCE macros: It's not needed when the rcu_x() macros are used, and it's never needed for the members protected by the seqlock, (provided that the seq is tested). The only place where I think that's *not* the case is at the krealloc in reservation_object_get_fences_rcu(). Also I have some more comments in the reservation_object_get_fences_rcu() function below: I felt that the barriers needed for rcu were already provided by checking the seqcount lock. But looking at rcu_dereference makes it seem harmless to add it in more places, it handles the ACCESS_ONCE and barrier() for us. 
And it makes the code more maintainable, and helps sparse doing a lot of checking for us. I guess we can tolerate a couple of extra barriers for that. We could probably get away with using RCU_INIT_POINTER on the writer side, because the smp_wmb is already done by arranging seqcount updates correctly. Hmm. yes, probably. At least in the replace function. I think if we do it in other places, we should add comments as to where the smp_wmb() is located, for future reference. Also I saw in a couple of places where you're checking the shared pointers, you're not checking for NULL pointers, which I guess may happen if shared_count and pointers are not in full sync? No, because shared_count is protected with seqcount. I only allow appending to the array, so when shared_count is validated by seqcount it means that the [0...shared_count) indexes are valid and non-null. What could happen though is that the fence at a specific index is updated with another one from the same context, but that's harmless. Hmm. Shouldn't we have a way to clean signaled fences from reservation objects? Perhaps when we attach a new fence, or after a wait with ww_mutex held? Otherwise we'd have a lot of completely unused fence objects hanging around for no reason. I don't think we need to be as picky as TTM, but I think we should do something? Calling reservation_object_add_excl_fence with a NULL fence works, I do this in ttm_bo_wait(). It requires ww_mutex. ~Maarten -- To unsubscribe from this list: send the line unsubscribe linux-media in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] [RFC v2 with seqcount] reservation: add support for read-only access using rcu
op 11-04-14 21:35, Thomas Hellstrom schreef: On 04/11/2014 08:09 PM, Maarten Lankhorst wrote: op 11-04-14 12:11, Thomas Hellstrom schreef: On 04/11/2014 11:24 AM, Maarten Lankhorst wrote: op 11-04-14 10:38, Thomas Hellstrom schreef: Hi, Maarten. Here I believe we encounter a lot of locking inconsistencies. First, it seems you're use a number of pointers as RCU pointers without annotating them as such and use the correct rcu macros when assigning those pointers. Some pointers (like the pointers in the shared fence list) are both used as RCU pointers (in dma_buf_poll()) for example, or considered protected by the seqlock (reservation_object_get_fences_rcu()), which I believe is OK, but then the pointers must be assigned using the correct rcu macros. In the memcpy in reservation_object_get_fences_rcu() we might get away with an ugly typecast, but with a verbose comment that the pointers are considered protected by the seqlock at that location. So I've updated (attached) the headers with proper __rcu annotation and locking comments according to how they are being used in the various reading functions. I believe if we want to get rid of this we need to validate those pointers using the seqlock as well. This will generate a lot of sparse warnings in those places needing rcu_dereference() rcu_assign_pointer() rcu_dereference_protected() With this I think we can get rid of all ACCESS_ONCE macros: It's not needed when the rcu_x() macros are used, and it's never needed for the members protected by the seqlock, (provided that the seq is tested). The only place where I think that's *not* the case is at the krealloc in reservation_object_get_fences_rcu(). Also I have some more comments in the reservation_object_get_fences_rcu() function below: I felt that the barriers needed for rcu were already provided by checking the seqcount lock. But looking at rcu_dereference makes it seem harmless to add it in more places, it handles the ACCESS_ONCE and barrier() for us. 
And it makes the code more maintainable, and helps sparse doing a lot of checking for us. I guess we can tolerate a couple of extra barriers for that. We could probably get away with using RCU_INIT_POINTER on the writer side, because the smp_wmb is already done by arranging seqcount updates correctly. Hmm. yes, probably. At least in the replace function. I think if we do it in other places, we should add comments as to where the smp_wmb() is located, for future reference. Also I saw in a couple of places where you're checking the shared pointers, you're not checking for NULL pointers, which I guess may happen if shared_count and pointers are not in full sync? No, because shared_count is protected with seqcount. I only allow appending to the array, so when shared_count is validated by seqcount it means that the [0...shared_count) indexes are valid and non-null. What could happen though is that the fence at a specific index is updated with another one from the same context, but that's harmless. Hmm, doesn't attaching an exclusive fence clear all shared fence pointers from under a reader? No, for that reason. It only resets shared_count to 0. This is harmless because the shared fence pointers are still valid long enough because of RCU delayed deletion. fence_get_rcu will fail when the refcount has dropped to zero. This is enough of a check to prevent errors, so there's no need to explicitly clear the fence pointers. ~Maarten -- To unsubscribe from this list: send the line unsubscribe linux-media in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] [RFC v2 with seqcount] reservation: add support for read-only access using rcu
op 11-04-14 10:38, Thomas Hellstrom schreef: Hi, Maarten. Here I believe we encounter a lot of locking inconsistencies. First, it seems you're use a number of pointers as RCU pointers without annotating them as such and use the correct rcu macros when assigning those pointers. Some pointers (like the pointers in the shared fence list) are both used as RCU pointers (in dma_buf_poll()) for example, or considered protected by the seqlock (reservation_object_get_fences_rcu()), which I believe is OK, but then the pointers must be assigned using the correct rcu macros. In the memcpy in reservation_object_get_fences_rcu() we might get away with an ugly typecast, but with a verbose comment that the pointers are considered protected by the seqlock at that location. So I've updated (attached) the headers with proper __rcu annotation and locking comments according to how they are being used in the various reading functions. I believe if we want to get rid of this we need to validate those pointers using the seqlock as well. This will generate a lot of sparse warnings in those places needing rcu_dereference() rcu_assign_pointer() rcu_dereference_protected() With this I think we can get rid of all ACCESS_ONCE macros: It's not needed when the rcu_x() macros are used, and it's never needed for the members protected by the seqlock, (provided that the seq is tested). The only place where I think that's *not* the case is at the krealloc in reservation_object_get_fences_rcu(). Also I have some more comments in the reservation_object_get_fences_rcu() function below: I felt that the barriers needed for rcu were already provided by checking the seqcount lock. But looking at rcu_dereference makes it seem harmless to add it in more places, it handles the ACCESS_ONCE and barrier() for us. We could probably get away with using RCU_INIT_POINTER on the writer side, because the smp_wmb is already done by arranging seqcount updates correctly. 
diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c
index d89a98d2c37b..ca6ef0c4b358 100644
--- a/drivers/base/dma-buf.c
+++ b/drivers/base/dma-buf.c

+int reservation_object_get_fences_rcu(struct reservation_object *obj,
+				      struct fence **pfence_excl,
+				      unsigned *pshared_count,
+				      struct fence ***pshared)
+{
+	unsigned shared_count = 0;
+	unsigned retry = 1;
+	struct fence **shared = NULL, *fence_excl = NULL;
+	int ret = 0;
+
+	while (retry) {
+		struct reservation_object_list *fobj;
+		unsigned seq, retry;

You're shadowing retry?

Oops.

+
+		seq = read_seqcount_begin(&obj->seq);
+
+		rcu_read_lock();
+
+		fobj = ACCESS_ONCE(obj->fence);
+		if (fobj) {
+			struct fence **nshared;
+
+			shared_count = ACCESS_ONCE(fobj->shared_count);
+			nshared = krealloc(shared, sizeof(*shared) * shared_count, GFP_KERNEL);

krealloc inside rcu_read_lock(). Better to put this first in the loop.

Except that shared_count isn't known until the rcu_read_lock is taken.

Thanks, Thomas

~Maarten
Re: [PATCH 2/2] [RFC v2 with seqcount] reservation: add support for read-only access using rcu
op 11-04-14 12:11, Thomas Hellstrom schreef: On 04/11/2014 11:24 AM, Maarten Lankhorst wrote: op 11-04-14 10:38, Thomas Hellstrom schreef: Hi, Maarten. Here I believe we encounter a lot of locking inconsistencies. First, it seems you're use a number of pointers as RCU pointers without annotating them as such and use the correct rcu macros when assigning those pointers. Some pointers (like the pointers in the shared fence list) are both used as RCU pointers (in dma_buf_poll()) for example, or considered protected by the seqlock (reservation_object_get_fences_rcu()), which I believe is OK, but then the pointers must be assigned using the correct rcu macros. In the memcpy in reservation_object_get_fences_rcu() we might get away with an ugly typecast, but with a verbose comment that the pointers are considered protected by the seqlock at that location. So I've updated (attached) the headers with proper __rcu annotation and locking comments according to how they are being used in the various reading functions. I believe if we want to get rid of this we need to validate those pointers using the seqlock as well. This will generate a lot of sparse warnings in those places needing rcu_dereference() rcu_assign_pointer() rcu_dereference_protected() With this I think we can get rid of all ACCESS_ONCE macros: It's not needed when the rcu_x() macros are used, and it's never needed for the members protected by the seqlock, (provided that the seq is tested). The only place where I think that's *not* the case is at the krealloc in reservation_object_get_fences_rcu(). Also I have some more comments in the reservation_object_get_fences_rcu() function below: I felt that the barriers needed for rcu were already provided by checking the seqcount lock. But looking at rcu_dereference makes it seem harmless to add it in more places, it handles the ACCESS_ONCE and barrier() for us. And it makes the code more maintainable, and helps sparse doing a lot of checking for us. 
I guess we can tolerate a couple of extra barriers for that. We could probably get away with using RCU_INIT_POINTER on the writer side, because the smp_wmb is already done by arranging seqcount updates correctly. Hmm. yes, probably. At least in the replace function. I think if we do it in other places, we should add comments as to where the smp_wmb() is located, for future reference. Also I saw in a couple of places where you're checking the shared pointers, you're not checking for NULL pointers, which I guess may happen if shared_count and pointers are not in full sync? No, because shared_count is protected with seqcount. I only allow appending to the array, so when shared_count is validated by seqcount it means that the [0...shared_count) indexes are valid and non-null. What could happen though is that the fence at a specific index is updated with another one from the same context, but that's harmless. ~Maarten -- To unsubscribe from this list: send the line unsubscribe linux-media in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] [RFC] reservation: add support for read-only access using rcu
Hey,

On 10-04-14 10:46, Thomas Hellstrom wrote:
Hi! Ugh. This became more complicated than I thought, but I'm OK with moving TTM over to fence while we sort out how / if we're going to use this. While reviewing, it struck me that this is kind of error-prone, and hard to follow since we're operating on a structure that may be continually updated under us, needing a lot of RCU-specific macros and barriers.

Yeah, but with the exception of dma_buf_poll I don't think there is anything else outside drivers/base/reservation.c that has to deal with rcu.

Also the rcu wait appears to not complete until there are no busy fences left (new ones can be added while we wait) rather than waiting on a snapshot of busy fences.

This has been by design, because 'wait for bo idle' type of functions only care if the bo is completely idle or not. It would be easy to make a snapshot even without seqlocks, just copy reservation_object_test_signaled_rcu to return a shared list if test_all is set, or return pointer to exclusive otherwise.

I wonder if these issues can be addressed by having a function that provides a snapshot of all busy fences: This can be accomplished either by including the exclusive fence in the fence_list structure and allocate a new such structure each time it is updated. The RCU reader could then just make a copy of the current fence_list structure pointed to by obj->fence, but I'm not sure we want to reallocate *each* time we update the fence pointer.

No, the most common operation is updating fence pointers, which is why the current design makes that cheap. It's also why doing rcu reads is more expensive.

The other approach uses a seqlock to obtain a consistent snapshot, and I've attached an incomplete outline, and I'm not 100% sure whether it's OK to combine RCU and seqlocks in this way... Both these approaches have the benefit of hiding the RCU snapshotting in a single function, that can then be used by any waiting or polling function.
I think the middle way, using seqlocks to protect the fence_excl pointer and shared list combination, and using RCU to protect the refcounts for fences and the availability of the list, could work for our usecase and might remove a bunch of memory barriers. But yeah, that depends on layering rcu and seqlocks. No idea if that is allowed. But I suppose it is.

Also, you're being overly paranoid with seqlock reading, we would only need something like this:

	rcu_read_lock()
	preempt_disable()
	seq = read_seqcount_begin();
	read fence_excl, shared_count = ACCESS_ONCE(fence->shared_count)
	copy shared to a struct.
	if (read_seqcount_retry()) { unlock and retry }
	preempt_enable();
	use fence_get_rcu() to bump refcount on everything, if that fails unlock, put, and retry
	rcu_read_unlock()

But the shared list would still need to be RCU'd, to make sure we're not reading freed garbage.

~Maarten
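The read-side protocol sketched above (begin, snapshot, retry if the sequence moved) can be modeled with C11 atomics. This is a single-threaded illustration of the seqcount protocol only — the spin in read_begin, the memory ordering, and the member names are simplifications, not the kernel's seqcount_t.

```c
#include <stdatomic.h>

/* Writer bumps the sequence to an odd value, updates the data, then bumps
 * it to the next even value; readers snapshot the data and retry if the
 * sequence was odd or changed in between. */

struct snap_state {
	atomic_uint seq;
	int fence_excl;		/* stand-ins for the protected members */
	int shared_count;
};

static unsigned int read_begin(struct snap_state *s)
{
	unsigned int seq;

	while ((seq = atomic_load(&s->seq)) & 1)
		;	/* writer in progress: wait for an even value */
	return seq;
}

static int read_retry(struct snap_state *s, unsigned int seq)
{
	atomic_thread_fence(memory_order_acquire);
	return atomic_load(&s->seq) != seq;
}

static void write_update(struct snap_state *s, int excl, int count)
{
	atomic_fetch_add(&s->seq, 1);	/* odd: readers will retry */
	s->fence_excl = excl;
	s->shared_count = count;
	atomic_fetch_add(&s->seq, 1);	/* even again: update visible */
}

/* Take a consistent snapshot of both members. */
static void snapshot(struct snap_state *s, int *excl, int *count)
{
	unsigned int seq;

	do {
		seq = read_begin(s);
		*excl = s->fence_excl;
		*count = s->shared_count;
	} while (read_retry(s, seq));
}
```

The point of the pattern, as in the mail above, is that readers never block writers: they just loop until they observe a sequence number that was even and unchanged around the whole snapshot.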
[PATCH 2/2] [RFC v2 with seqcount] reservation: add support for read-only access using rcu
On 10-04-14 13:08, Thomas Hellstrom wrote: On 04/10/2014 12:07 PM, Maarten Lankhorst wrote: Hey, on 10-04-14 10:46, Thomas Hellstrom wrote: Hi! Ugh. This became more complicated than I thought, but I'm OK with moving TTM over to fence while we sort out how / if we're going to use this. While reviewing, it struck me that this is kind of error-prone, and hard to follow, since we're operating on a structure that may be continually updated under us, needing a lot of RCU-specific macros and barriers.

Yeah, but with the exception of dma_buf_poll I don't think there is anything else outside drivers/base/reservation.c that has to deal with RCU.

Also the RCU wait appears not to complete until there are no busy fences left (new ones can be added while we wait) rather than waiting on a snapshot of busy fences.

This is by design, because 'wait for bo idle' type functions only care whether the bo is completely idle or not.

No, not when using RCU, because the bo may be busy again before the function returns :) Complete idleness can only be guaranteed if holding the reservation, or otherwise making sure that no new rendering is submitted to the buffer, so it's overkill to wait for complete idleness here.

You're probably right, but it makes waiting a lot easier if I don't have to deal with memory allocations. :P

It would be easy to make a snapshot even without seqlocks: just copy reservation_object_test_signaled_rcu to return the shared list if test_all is set, or a pointer to the exclusive fence otherwise.

I wonder if these issues can be addressed by having a function that provides a snapshot of all busy fences: this can be accomplished either by including the exclusive fence in the fence_list structure and allocating a new such structure each time it is updated. The RCU reader could then just make a copy of the current fence_list structure pointed to by obj->fence, but I'm not sure we want to reallocate *each* time we update the fence pointer.
No, the most common operation is updating fence pointers, which is why the current design makes that cheap. It's also why doing RCU reads is more expensive. The other approach uses a seqlock to obtain a consistent snapshot, and I've attached an incomplete outline, and I'm not 100% sure whether it's OK to combine RCU and seqlocks in this way... Both these approaches have the benefit of hiding the RCU snapshotting in a single function, that can then be used by any waiting or polling function.

I think the middle way, using seqlocks to protect the fence_excl pointer and shared list combination, and using RCU to protect the refcounts for fences and the availability of the list, could work for our use case and might remove a bunch of memory barriers. But yeah, that depends on layering RCU and seqlocks. No idea if that is allowed. But I suppose it is.

Also, you're being overly paranoid with seqlock reading; we would only need something like this:

    rcu_read_lock()
    preempt_disable()
    seq = read_seqcount_begin()
    read fence_excl, shared_count = ACCESS_ONCE(fence->shared_count)
    copy shared to a struct
    if (read_seqcount_retry()) { unlock and retry }
    preempt_enable()
    use fence_get_rcu() to bump refcount on everything; if that fails, unlock, put, and retry
    rcu_read_unlock()

But the shared list would still need to be RCU'd, to make sure we're not reading freed garbage.

Ah. OK. But I think we should use RCU inside seqcount, because read_seqcount_begin() may spin for a long time if there are many writers. Also I don't think the preempt_disable() is needed for read_seq critical sections, other than that it might decrease the risk of retries.

Reading the seqlock code makes me suspect that's the case too. The lockdep code calls local_irq_disable, so it's probably safe without preemption disabled.

~Maarten

I like the ability of not allocating memory, so I kept reservation_object_wait_timeout_rcu mostly the way it was.
This code appears to fail on nouveau when using the shared members, but I'm not completely sure whether the error is in nouveau or this code yet.

--8<----

[RFC v2] reservation: add support for read-only access using rcu

This adds 4 more functions to deal with RCU.

reservation_object_get_fences_rcu() will obtain the list of shared and exclusive fences without obtaining the ww_mutex.

reservation_object_wait_timeout_rcu() will wait on all fences of the reservation_object, without obtaining the ww_mutex.

reservation_object_test_signaled_rcu() will test if all fences of the reservation_object are signaled without using the ww_mutex.

reservation_object_get_excl() is added because touching the fence_excl member directly will trigger a sparse warning.

Signed-off-by: Maarten Lankhorst <maarten.lankho...@canonical.com>

diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c
index d89a98d2c37b..ca6ef0c4b358 100644
--- a/drivers/base/dma-buf.c
+++ b/drivers/base/dma-buf.c
@@ -137,7 +137,7 @@ static unsigned int dma_buf_poll(struct file
[PATCH 2/2] [RFC] reservation: add support for read-only access using rcu
This adds 3 more functions to deal with RCU.

reservation_object_wait_timeout_rcu() will wait on all fences of the reservation_object, without obtaining the ww_mutex.

reservation_object_test_signaled_rcu() will test if all fences of the reservation_object are signaled without using the ww_mutex.

reservation_object_get_excl() is added because touching the fence_excl member directly will trigger a sparse warning.

Signed-off-by: Maarten Lankhorst <maarten.lankho...@canonical.com>
---
 drivers/base/dma-buf.c      |  46 +++--
 drivers/base/reservation.c  | 147 +--
 include/linux/fence.h       |  22 ++
 include/linux/reservation.h |  40
 4 files changed, 224 insertions(+), 31 deletions(-)

diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c
index d89a98d2c37b..fc2d7546b8b0 100644
--- a/drivers/base/dma-buf.c
+++ b/drivers/base/dma-buf.c
@@ -151,14 +151,22 @@ static unsigned int dma_buf_poll(struct file *file, poll_table *poll)
 	if (!events)
 		return 0;
 
-	ww_mutex_lock(&resv->lock, NULL);
+	rcu_read_lock();
 
-	fobj = resv->fence;
-	if (!fobj)
-		goto out;
+	fobj = rcu_dereference(resv->fence);
+	if (fobj) {
+		shared_count = ACCESS_ONCE(fobj->shared_count);
+		smp_mb(); /* shared_count needs transitivity wrt fence_excl */
+	} else
+		shared_count = 0;
+	fence_excl = rcu_dereference(resv->fence_excl);
 
-	shared_count = fobj->shared_count;
-	fence_excl = resv->fence_excl;
+	/*
+	 * This would have needed a smp_read_barrier_depends()
+	 * because shared_count needs to be read before shared[i], but
+	 * spin_lock_irq and spin_unlock_irq provide even stronger
+	 * guarantees.
+	 */
 
 	if (fence_excl && (!(events & POLLOUT) || shared_count == 0)) {
 		struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_excl;
@@ -176,14 +184,20 @@ static unsigned int dma_buf_poll(struct file *file, poll_table *poll)
 		spin_unlock_irq(&dmabuf->poll.lock);
 
 		if (events & pevents) {
-			if (!fence_add_callback(fence_excl, &dcb->cb,
+			if (!fence_get_rcu(fence_excl)) {
+				/* force a recheck */
+				events &= ~pevents;
+				dma_buf_poll_cb(NULL, &dcb->cb);
+			} else if (!fence_add_callback(fence_excl, &dcb->cb,
 						dma_buf_poll_cb)) {
 				events &= ~pevents;
+				fence_put(fence_excl);
 			} else {
 				/*
 				 * No callback queued, wake up any additional
 				 * waiters.
 				 */
+				fence_put(fence_excl);
 				dma_buf_poll_cb(NULL, &dcb->cb);
 			}
 		}
@@ -205,13 +219,25 @@ static unsigned int dma_buf_poll(struct file *file, poll_table *poll)
 			goto out;
 
 		for (i = 0; i < shared_count; ++i) {
-			struct fence *fence = fobj->shared[i];
-
+			struct fence *fence = fence_get_rcu(fobj->shared[i]);
+			if (!fence) {
+				/*
+				 * fence refcount dropped to zero, this means
+				 * that fobj has been freed
+				 *
+				 * call dma_buf_poll_cb and force a recheck!
+				 */
+				events &= ~POLLOUT;
+				dma_buf_poll_cb(NULL, &dcb->cb);
+				break;
+			}
 			if (!fence_add_callback(fence, &dcb->cb,
 						dma_buf_poll_cb)) {
+				fence_put(fence);
 				events &= ~POLLOUT;
 				break;
 			}
+			fence_put(fence);
 		}
 
 		/* No callback queued, wake up any additional waiters. */
@@ -220,7 +246,7 @@ static unsigned int dma_buf_poll(struct file *file, poll_table *poll)
 	}
 
 out:
-	ww_mutex_unlock(&resv->lock);
+	rcu_read_unlock();
 
 	return events;
 }
diff --git a/drivers/base/reservation.c b/drivers/base/reservation.c
index b82a5b630a8e..4cdce63140b8 100644
--- a/drivers/base/reservation.c
+++ b/drivers/base/reservation.c
@@ -87,9 +87,13 @@ reservation_object_add_shared_inplace(struct reservation_object *obj,
 		struct fence *old_fence = fobj->shared[i];
 
 		fence_get(fence
[PATCH 0/2] Updates to fence api
The following series implements small updates to the fence api. I've found them useful when implementing the fence API in ttm and i915. The last patch enables RCU on top of the api. I've found this less useful, but it was the condition on which Thomas Hellstrom was ok with converting TTM to fence, so I had to keep it in.

If nobody objects I'll probably merge that patch through drm, because some care is needed in ttm before it can flip the switch on rcu.

---

Maarten Lankhorst (2):
      reservation: update api and add some helpers
      [RFC] reservation: add support for read-only access using rcu
[PATCH 1/2] reservation: update api and add some helpers
Move the list of shared fences to a struct, and return it in reservation_object_get_list().

Add reservation_object_reserve_shared(), which reserves space in the reservation_object for 1 more shared fence.

reservation_object_add_shared_fence() and reservation_object_add_excl_fence() are used to assign a new fence to a reservation_object pointer, to complete a reservation.

Signed-off-by: Maarten Lankhorst <maarten.lankho...@canonical.com>
---
 drivers/base/dma-buf.c      |  35 +++---
 drivers/base/fence.c        |   4 +
 drivers/base/reservation.c  | 154 +++
 include/linux/fence.h       |   6 ++
 include/linux/reservation.h |  48 +++--
 kernel/sched/core.c         |   1
 6 files changed, 228 insertions(+), 20 deletions(-)

diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c
index 96338bf7f457..d89a98d2c37b 100644
--- a/drivers/base/dma-buf.c
+++ b/drivers/base/dma-buf.c
@@ -134,7 +134,10 @@ static unsigned int dma_buf_poll(struct file *file, poll_table *poll)
 {
 	struct dma_buf *dmabuf;
 	struct reservation_object *resv;
+	struct reservation_object_list *fobj;
+	struct fence *fence_excl;
 	unsigned long events;
+	unsigned shared_count;
 
 	dmabuf = file->private_data;
 	if (!dmabuf || !dmabuf->resv)
@@ -150,12 +153,18 @@ static unsigned int dma_buf_poll(struct file *file, poll_table *poll)
 
 	ww_mutex_lock(&resv->lock, NULL);
 
-	if (resv->fence_excl && (!(events & POLLOUT) ||
-				 resv->fence_shared_count == 0)) {
+	fobj = resv->fence;
+	if (!fobj)
+		goto out;
+
+	shared_count = fobj->shared_count;
+	fence_excl = resv->fence_excl;
+
+	if (fence_excl && (!(events & POLLOUT) || shared_count == 0)) {
 		struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_excl;
 		unsigned long pevents = POLLIN;
 
-		if (resv->fence_shared_count == 0)
+		if (shared_count == 0)
 			pevents |= POLLOUT;
 
 		spin_lock_irq(&dmabuf->poll.lock);
@@ -167,19 +176,20 @@ static unsigned int dma_buf_poll(struct file *file, poll_table *poll)
 		spin_unlock_irq(&dmabuf->poll.lock);
 
 		if (events & pevents) {
-			if (!fence_add_callback(resv->fence_excl,
-						&dcb->cb, dma_buf_poll_cb))
+			if (!fence_add_callback(fence_excl, &dcb->cb,
+						dma_buf_poll_cb)) {
 				events &= ~pevents;
-			else
+			} else {
 				/*
 				 * No callback queued, wake up any additional
 				 * waiters.
 				 */
 				dma_buf_poll_cb(NULL, &dcb->cb);
+			}
 		}
 	}
 
-	if ((events & POLLOUT) && resv->fence_shared_count > 0) {
+	if ((events & POLLOUT) && shared_count > 0) {
 		struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_shared;
 		int i;
 
@@ -194,15 +204,18 @@ static unsigned int dma_buf_poll(struct file *file, poll_table *poll)
 		if (!(events & POLLOUT))
 			goto out;
 
-		for (i = 0; i < resv->fence_shared_count; ++i)
-			if (!fence_add_callback(resv->fence_shared[i],
-						&dcb->cb, dma_buf_poll_cb)) {
+		for (i = 0; i < shared_count; ++i) {
+			struct fence *fence = fobj->shared[i];
+
+			if (!fence_add_callback(fence, &dcb->cb,
+						dma_buf_poll_cb)) {
 				events &= ~POLLOUT;
 				break;
 			}
+		}
 
 		/* No callback queued, wake up any additional waiters. */
-		if (i == resv->fence_shared_count)
+		if (i == shared_count)
 			dma_buf_poll_cb(NULL, &dcb->cb);
 	}
 
diff --git a/drivers/base/fence.c b/drivers/base/fence.c
index 8fff13fb86cf..f780f9b3d418 100644
--- a/drivers/base/fence.c
+++ b/drivers/base/fence.c
@@ -170,7 +170,7 @@ void release_fence(struct kref *kref)
 	if (fence->ops->release)
 		fence->ops->release(fence);
 	else
-		kfree(fence);
+		free_fence(fence);
 }
 EXPORT_SYMBOL(release_fence);
 
@@ -448,7 +448,7 @@ static void seqno_release(struct fence *fence)
 	if (f->ops->release)
 		f->ops->release(fence);
 	else
-		kfree(f);
+		free_fence(fence);
 }
 
 static long seqno_wait(struct fence *fence, bool intr, signed long timeout)
diff --git a/drivers/base/reservation.c b/drivers/base/reservation.c
index a73fbf3b8e56..b82a5b630a8e 100644
Re: [PATCH 4/6] android: convert sync to fence api, v4
op 04-03-14 09:14, Daniel Vetter schreef: On Tue, Mar 04, 2014 at 08:50:38AM +0100, Maarten Lankhorst wrote: op 03-03-14 22:11, Daniel Vetter schreef: On Mon, Feb 17, 2014 at 04:57:19PM +0100, Maarten Lankhorst wrote: Android syncpoints can be mapped to a timeline. This removes the need to maintain a separate api for synchronization. I've left the android trace events in place, but the core fence events should already be sufficient for debugging. v2: - Call fence_remove_callback in sync_fence_free if not all fences have fired. v3: - Merge Colin Cross' bugfixes, and the android fence merge optimization. v4: - Merge with the upstream fixes. Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com --- Snipped everything but headers - Ian Lister from our android team is signed up to have a more in-depth look at proper integration with android syncpoints. Adding him to cc. diff --git a/drivers/staging/android/sync.h b/drivers/staging/android/sync.h index 62e2255b1c1e..6036dbdc8e6f 100644 --- a/drivers/staging/android/sync.h +++ b/drivers/staging/android/sync.h @@ -21,6 +21,7 @@ #include linux/list.h #include linux/spinlock.h #include linux/wait.h +#include linux/fence.h struct sync_timeline; struct sync_pt; @@ -40,8 +41,6 @@ struct sync_fence; * -1 if a will signal before b * @free_pt: called before sync_pt is freed * @release_obj: called before sync_timeline is freed - * @print_obj: deprecated - * @print_pt: deprecated * @fill_driver_data: write implementation specific driver data to data. * should return an error if there is not enough room * as specified by size. 
This information is returned @@ -67,13 +66,6 @@ struct sync_timeline_ops { /* optional */ void (*release_obj)(struct sync_timeline *sync_timeline); - /* deprecated */ - void (*print_obj)(struct seq_file *s, - struct sync_timeline *sync_timeline); - - /* deprecated */ - void (*print_pt)(struct seq_file *s, struct sync_pt *sync_pt); - /* optional */ int (*fill_driver_data)(struct sync_pt *syncpt, void *data, int size); @@ -104,42 +96,48 @@ struct sync_timeline { /* protected by child_list_lock */ bool destroyed; + int context, value; struct list_head child_list_head; spinlock_t child_list_lock; struct list_head active_list_head; - spinlock_t active_list_lock; +#ifdef CONFIG_DEBUG_FS struct list_head sync_timeline_list; +#endif }; /** * struct sync_pt - sync point - * @parent: sync_timeline to which this sync_pt belongs + * @fence: base fence class * @child_list: membership in sync_timeline.child_list_head * @active_list: membership in sync_timeline.active_list_head + current * @signaled_list: membership in temporary signaled_list on stack * @fence: sync_fence to which the sync_pt belongs * @pt_list: membership in sync_fence.pt_list_head * @status: 1: signaled, 0:active, 0: error * @timestamp: time which sync_pt status transitioned from active to * signaled or error. +=== + patched Conflict markers ... Oops. */ struct sync_pt { - struct sync_timeline *parent; - struct list_head child_list; + struct fence base; Hm, embedding feels wrong, since that still means that I'll need to implement two kinds of fences in i915 - one using the seqno fence to make dma-buf sync work, and one to implmenent sync_pt to make the android folks happy. If I can dream I think we should have a pointer to an underlying fence here, i.e. a struct sync_pt would just be a userspace interface wrapper to do explicit syncing using native fences, instead of implicit syncing like with dma-bufs. But this is all drive-by comments from a very cursory high-level look. 
I might be full of myself again ;-) -Daniel

No, the idea is that because an android syncpoint is simply another type of dma-fence, if you deal with normal fences then android can automatically be handled too. The userspace fence api android exposes could very easily be made to work for dma-fence: just pass a dma-fence to sync_fence_create. So exposing dma-fence would probably work for android too.

Hm, then why do we still have struct sync_pt around? Since it's just the internal bit, with the userspace-facing object being struct sync_fence, I'd opt to shuffle any useful features into the core struct fence. -Daniel

To keep compatibility with the android api. I think gradually converting them is going to be more useful than forcing all drivers to use a new api all at once. They could keep the android syncpoint api for exporting, as long as they accept dma-fence for importing/waiting.

~Maarten
Re: [PATCH 4/6] android: convert sync to fence api, v4
op 04-03-14 11:00, Daniel Vetter schreef: On Tue, Mar 04, 2014 at 09:20:58AM +0100, Maarten Lankhorst wrote: op 04-03-14 09:14, Daniel Vetter schreef: On Tue, Mar 04, 2014 at 08:50:38AM +0100, Maarten Lankhorst wrote: op 03-03-14 22:11, Daniel Vetter schreef: On Mon, Feb 17, 2014 at 04:57:19PM +0100, Maarten Lankhorst wrote: Android syncpoints can be mapped to a timeline. This removes the need to maintain a separate api for synchronization. I've left the android trace events in place, but the core fence events should already be sufficient for debugging. v2: - Call fence_remove_callback in sync_fence_free if not all fences have fired. v3: - Merge Colin Cross' bugfixes, and the android fence merge optimization. v4: - Merge with the upstream fixes. Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com --- Snipped everything but headers - Ian Lister from our android team is signed up to have a more in-depth look at proper integration with android syncpoints. Adding him to cc. diff --git a/drivers/staging/android/sync.h b/drivers/staging/android/sync.h index 62e2255b1c1e..6036dbdc8e6f 100644 --- a/drivers/staging/android/sync.h +++ b/drivers/staging/android/sync.h @@ -21,6 +21,7 @@ #include linux/list.h #include linux/spinlock.h #include linux/wait.h +#include linux/fence.h struct sync_timeline; struct sync_pt; @@ -40,8 +41,6 @@ struct sync_fence; * -1 if a will signal before b * @free_pt: called before sync_pt is freed * @release_obj: called before sync_timeline is freed - * @print_obj: deprecated - * @print_pt: deprecated * @fill_driver_data: write implementation specific driver data to data. * should return an error if there is not enough room * as specified by size. 
This information is returned @@ -67,13 +66,6 @@ struct sync_timeline_ops { /* optional */ void (*release_obj)(struct sync_timeline *sync_timeline); - /* deprecated */ - void (*print_obj)(struct seq_file *s, - struct sync_timeline *sync_timeline); - - /* deprecated */ - void (*print_pt)(struct seq_file *s, struct sync_pt *sync_pt); - /* optional */ int (*fill_driver_data)(struct sync_pt *syncpt, void *data, int size); @@ -104,42 +96,48 @@ struct sync_timeline { /* protected by child_list_lock */ bool destroyed; + int context, value; struct list_head child_list_head; spinlock_t child_list_lock; struct list_head active_list_head; - spinlock_t active_list_lock; +#ifdef CONFIG_DEBUG_FS struct list_head sync_timeline_list; +#endif }; /** * struct sync_pt - sync point - * @parent: sync_timeline to which this sync_pt belongs + * @fence: base fence class * @child_list: membership in sync_timeline.child_list_head * @active_list: membership in sync_timeline.active_list_head + current * @signaled_list: membership in temporary signaled_list on stack * @fence: sync_fence to which the sync_pt belongs * @pt_list: membership in sync_fence.pt_list_head * @status: 1: signaled, 0:active, 0: error * @timestamp: time which sync_pt status transitioned from active to * signaled or error. +=== + patched Conflict markers ... Oops. */ struct sync_pt { - struct sync_timeline *parent; - struct list_head child_list; + struct fence base; Hm, embedding feels wrong, since that still means that I'll need to implement two kinds of fences in i915 - one using the seqno fence to make dma-buf sync work, and one to implmenent sync_pt to make the android folks happy. If I can dream I think we should have a pointer to an underlying fence here, i.e. a struct sync_pt would just be a userspace interface wrapper to do explicit syncing using native fences, instead of implicit syncing like with dma-bufs. But this is all drive-by comments from a very cursory high-level look. 
I might be full of myself again ;-) -Daniel No, the idea is that because android syncpoint is simply another type of dma-fence, that if you deal with normal fences then android can automatically be handled too. The userspace fence api android exposes could be very easily made to work for dma-fence, just pass a dma-fence to sync_fence_create. So exposing dma-fence would probably work for android too. Hm, then why do we still have struct sync_pt around? Since it's just the internal bit, with the userspace facing object being struct sync_fence, I'd opt to shuffle any useful features into the core struct fence. -Daniel To keep compatibility with the android api. I think that gradually converting them is going to be more useful than to force all drivers to use a new api all at once. They could keep android syncpoint api for exporting, as long as they accept dma-fence for importing/waiting. We don't have any users of the android sync_pt stuff (outside of the framework itself). So any considerations for existing drivers for upstreaming are imo moot. At least for the in-kernel interfaces used. For the actual userspace interface I guess keeping the android syncpt ioctls as-is has value
Re: [PATCH 4/6] android: convert sync to fence api, v4
On 03-03-14 22:11, Daniel Vetter wrote: On Mon, Feb 17, 2014 at 04:57:19PM +0100, Maarten Lankhorst wrote: Android syncpoints can be mapped to a timeline. This removes the need to maintain a separate api for synchronization. I've left the android trace events in place, but the core fence events should already be sufficient for debugging.

v2: - Call fence_remove_callback in sync_fence_free if not all fences have fired.
v3: - Merge Colin Cross' bugfixes, and the android fence merge optimization.
v4: - Merge with the upstream fixes.

Signed-off-by: Maarten Lankhorst <maarten.lankho...@canonical.com>
---

Snipped everything but headers - Ian Lister from our android team is signed up to have a more in-depth look at proper integration with android syncpoints. Adding him to cc.

diff --git a/drivers/staging/android/sync.h b/drivers/staging/android/sync.h
index 62e2255b1c1e..6036dbdc8e6f 100644
--- a/drivers/staging/android/sync.h
+++ b/drivers/staging/android/sync.h
@@ -21,6 +21,7 @@
 #include <linux/list.h>
 #include <linux/spinlock.h>
 #include <linux/wait.h>
+#include <linux/fence.h>
 
 struct sync_timeline;
 struct sync_pt;
@@ -40,8 +41,6 @@ struct sync_fence;
  *			  -1 if a will signal before b
  * @free_pt:		called before sync_pt is freed
  * @release_obj:	called before sync_timeline is freed
- * @print_obj:		deprecated
- * @print_pt:		deprecated
  * @fill_driver_data:	write implementation specific driver data to data.
  *			  should return an error if there is not enough room
  *			  as specified by size. This information is returned
@@ -67,13 +66,6 @@ struct sync_timeline_ops {
 	/* optional */
 	void (*release_obj)(struct sync_timeline *sync_timeline);
 
-	/* deprecated */
-	void (*print_obj)(struct seq_file *s,
-			  struct sync_timeline *sync_timeline);
-
-	/* deprecated */
-	void (*print_pt)(struct seq_file *s, struct sync_pt *sync_pt);
-
 	/* optional */
 	int (*fill_driver_data)(struct sync_pt *syncpt, void *data, int size);
@@ -104,42 +96,48 @@ struct sync_timeline {
 	/* protected by child_list_lock */
 	bool destroyed;
+	int context, value;
 
 	struct list_head	child_list_head;
 	spinlock_t		child_list_lock;
 
 	struct list_head	active_list_head;
-	spinlock_t		active_list_lock;
 
+#ifdef CONFIG_DEBUG_FS
 	struct list_head	sync_timeline_list;
+#endif
 };
 
 /**
  * struct sync_pt - sync point
- * @parent:		sync_timeline to which this sync_pt belongs
+ * @fence:		base fence class
  * @child_list:		membership in sync_timeline.child_list_head
  * @active_list:	membership in sync_timeline.active_list_head
+<<<<<<< current
  * @signaled_list:	membership in temporary signaled_list on stack
  * @fence:		sync_fence to which the sync_pt belongs
  * @pt_list:		membership in sync_fence.pt_list_head
  * @status:		1: signaled, 0:active, <0: error
  * @timestamp:		time which sync_pt status transitioned from active to
  *			  signaled or error.
+=======
+>>>>>>> patched

Conflict markers ... Oops.

 */
struct sync_pt {
-	struct sync_timeline	*parent;
-	struct list_head	child_list;
+	struct fence base;

Hm, embedding feels wrong, since that still means that I'll need to implement two kinds of fences in i915 - one using the seqno fence to make dma-buf sync work, and one to implement sync_pt to make the android folks happy. If I can dream I think we should have a pointer to an underlying fence here, i.e. a struct sync_pt would just be a userspace interface wrapper to do explicit syncing using native fences, instead of implicit syncing like with dma-bufs. But this is all drive-by comments from a very cursory high-level look.
I might be full of myself again ;-) -Daniel

No, the idea is that because an android syncpoint is simply another type of dma-fence, if you deal with normal fences then android can automatically be handled too. The userspace fence api android exposes could very easily be made to work for dma-fence: just pass a dma-fence to sync_fence_create. So exposing dma-fence would probably work for android too.

~Maarten
Re: [PATCH 4/6] android: convert sync to fence api, v4
On 19-02-14 14:56, Thomas Hellstrom wrote:

+static void fence_check_cb_func(struct fence *f, struct fence_cb *cb)
+{
+	struct sync_fence_cb *check = container_of(cb, struct sync_fence_cb, cb);
+	struct sync_fence *fence = check->fence;
+
+	// TODO: Add a fence->status member and check it

Hmm, C++ / C99 style comments make checkpatch.pl complain. Did you run this series through checkpatch? /Thomas

Actually I used C99 here because it shouldn't have been in the sent patch. ;-) Right below that comment I use fence->status, so the right thing to do was to zap the comment. Thanks for catching it!

~Maarten
[PATCH 0/6] dma-buf synchronization patches (updated)
The following series implements fence and converts dma-buf and android sync to use it. Patch 5 and 6 add support for polling to dma-buf, blocking until all fences are signaled.

Patches that received some minor updates:
- seqno fence (wait condition member added)
- android (whitespace changes and a comment removed)
- add poll support to dma-buf (added comment)

---

Maarten Lankhorst (6):
      fence: dma-buf cross-device synchronization (v17)
      seqno-fence: Hardware dma-buf implementation of fencing (v5)
      dma-buf: use reservation objects
      android: convert sync to fence api, v5
      reservation: add support for fences to enable cross-device synchronisation
      dma-buf: add poll support, v3

 Documentation/DocBook/device-drivers.tmpl      |   3
 drivers/base/Kconfig                           |   9
 drivers/base/Makefile                          |   2
 drivers/base/dma-buf.c                         | 130 +++
 drivers/base/fence.c                           | 467
 drivers/gpu/drm/drm_prime.c                    |   8
 drivers/gpu/drm/exynos/exynos_drm_dmabuf.c     |   2
 drivers/gpu/drm/i915/i915_gem_dmabuf.c         |   2
 drivers/gpu/drm/nouveau/nouveau_drm.c          |   1
 drivers/gpu/drm/nouveau/nouveau_gem.h          |   1
 drivers/gpu/drm/nouveau/nouveau_prime.c        |   7
 drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c      |   2
 drivers/gpu/drm/radeon/radeon_drv.c            |   2
 drivers/gpu/drm/radeon/radeon_prime.c          |   8
 drivers/gpu/drm/ttm/ttm_object.c               |   2
 drivers/media/v4l2-core/videobuf2-dma-contig.c |   2
 drivers/staging/android/Kconfig                |   1
 drivers/staging/android/Makefile               |   2
 drivers/staging/android/ion/ion.c              |   2
 drivers/staging/android/sw_sync.c              |   4
 drivers/staging/android/sync.c                 | 903
 drivers/staging/android/sync.h                 |  82 +-
 drivers/staging/android/sync_debug.c           | 247 +++
 drivers/staging/android/trace/sync.h           |  12
 include/drm/drmP.h                             |   2
 include/linux/dma-buf.h                        |  21 -
 include/linux/fence.h                          | 329 +
 include/linux/reservation.h                    |  20 +
 include/linux/seqno-fence.h                    | 119 +++
 include/trace/events/fence.h                   | 125 +++
 30 files changed, 1863 insertions(+), 654 deletions(-)
 create mode 100644 drivers/base/fence.c
 create mode 100644 drivers/staging/android/sync_debug.c
 create mode 100644 include/linux/fence.h
 create mode 100644 include/linux/seqno-fence.h
 create mode 100644 include/trace/events/fence.h

--
Signature
[PATCH 2/6] seqno-fence: Hardware dma-buf implementation of fencing (v5)
This type of fence can be used with hardware synchronization for simple hardware that can block execution until the condition (dma_buf[offset] - value) >= 0 has been met when WAIT_GEQUAL is used, or (dma_buf[offset] != 0) has been met when WAIT_NONZERO is set.

A software fallback still has to be provided in case the fence is used with a device that doesn't support this mechanism. It is useful to expose this for graphics cards that have an op to support this. Some cards like i915 can export those, but don't have an option to wait, so they need the software fallback.

I extended the original patch by Rob Clark.

v1: Original
v2: Renamed from bikeshed to seqno, moved into dma-fence.c since not much was left of the file. Lots of documentation added.
v3: Use fence_ops instead of custom callbacks. Moved to own file to avoid circular dependency between dma-buf.h and fence.h
v4: Add spinlock pointer to seqno_fence_init
v5: Add condition member to allow wait for != 0. Fix small style errors pointed out by checkpatch.
Signed-off-by: Maarten Lankhorst <maarten.lankho...@canonical.com>
Reviewed-by: Rob Clark <robdcl...@gmail.com> #v4
---
 Documentation/DocBook/device-drivers.tmpl |   1
 drivers/base/fence.c                      |  52 +
 include/linux/seqno-fence.h               | 119 +
 3 files changed, 172 insertions(+)
 create mode 100644 include/linux/seqno-fence.h

diff --git a/Documentation/DocBook/device-drivers.tmpl b/Documentation/DocBook/device-drivers.tmpl
index 7a0c9ddb4818..8c85c20942c2 100644
--- a/Documentation/DocBook/device-drivers.tmpl
+++ b/Documentation/DocBook/device-drivers.tmpl
@@ -131,6 +131,7 @@ X!Edrivers/base/interface.c
 !Edrivers/base/dma-buf.c
 !Edrivers/base/fence.c
 !Iinclude/linux/fence.h
+!Iinclude/linux/seqno-fence.h
 !Edrivers/base/reservation.c
 !Iinclude/linux/reservation.h
 !Edrivers/base/dma-coherent.c
diff --git a/drivers/base/fence.c b/drivers/base/fence.c
index 12df2bf62034..be33981ba2a2 100644
--- a/drivers/base/fence.c
+++ b/drivers/base/fence.c
@@ -25,6 +25,7 @@
 #include <linux/export.h>
 #include <linux/atomic.h>
 #include <linux/fence.h>
+#include <linux/seqno-fence.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/fence.h>
@@ -413,3 +414,54 @@ __fence_init(struct fence *fence, const struct fence_ops *ops,
 	trace_fence_init(fence);
 }
 EXPORT_SYMBOL(__fence_init);
+
+static const char *seqno_fence_get_driver_name(struct fence *fence)
+{
+	struct seqno_fence *seqno_fence = to_seqno_fence(fence);
+	return seqno_fence->ops->get_driver_name(fence);
+}
+
+static const char *seqno_fence_get_timeline_name(struct fence *fence)
+{
+	struct seqno_fence *seqno_fence = to_seqno_fence(fence);
+	return seqno_fence->ops->get_timeline_name(fence);
+}
+
+static bool seqno_enable_signaling(struct fence *fence)
+{
+	struct seqno_fence *seqno_fence = to_seqno_fence(fence);
+	return seqno_fence->ops->enable_signaling(fence);
+}
+
+static bool seqno_signaled(struct fence *fence)
+{
+	struct seqno_fence *seqno_fence = to_seqno_fence(fence);
+	return seqno_fence->ops->signaled && seqno_fence->ops->signaled(fence);
+}
+
+static void seqno_release(struct fence *fence)
+{
+	struct seqno_fence *f = to_seqno_fence(fence);
+
+	dma_buf_put(f->sync_buf);
+	if (f->ops->release)
+		f->ops->release(fence);
+	else
+		kfree(f);
+}
+
+static long seqno_wait(struct fence *fence, bool intr, signed long timeout)
+{
+	struct seqno_fence *f = to_seqno_fence(fence);
+	return f->ops->wait(fence, intr, timeout);
+}
+
+const struct fence_ops seqno_fence_ops = {
+	.get_driver_name = seqno_fence_get_driver_name,
+	.get_timeline_name = seqno_fence_get_timeline_name,
+	.enable_signaling = seqno_enable_signaling,
+	.signaled = seqno_signaled,
+	.wait = seqno_wait,
+	.release = seqno_release,
+};
+EXPORT_SYMBOL(seqno_fence_ops);
diff --git a/include/linux/seqno-fence.h b/include/linux/seqno-fence.h
new file mode 100644
index ..b4d4aad3cadc
--- /dev/null
+++ b/include/linux/seqno-fence.h
@@ -0,0 +1,119 @@
+/*
+ * seqno-fence, using a dma-buf to synchronize fencing
+ *
+ * Copyright (C) 2012 Texas Instruments
+ * Copyright (C) 2012 Canonical Ltd
+ * Authors:
+ *   Rob Clark <robdcl...@gmail.com>
+ *   Maarten Lankhorst <maarten.lankho...@canonical.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published by
+ * the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program. If not, see http://www.gnu.org/licenses
[PATCH 6/6] dma-buf: add poll support, v3
Thanks to Fengguang Wu for spotting a missing static cast.

v2:
- Kill unused variable need_shared.
v3:
- Clarify the BUG() in dma_buf_release some more. (Rob Clark)

Signed-off-by: Maarten Lankhorst <maarten.lankho...@canonical.com>
---
 drivers/base/dma-buf.c  | 108 ++++++++++++++++++++++++
 include/linux/dma-buf.h |  12 +++
 2 files changed, 120 insertions(+)

diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c
index 65d0f6201db4..84a9d0b66c99 100644
--- a/drivers/base/dma-buf.c
+++ b/drivers/base/dma-buf.c
@@ -30,6 +30,7 @@
 #include <linux/export.h>
 #include <linux/debugfs.h>
 #include <linux/seq_file.h>
+#include <linux/poll.h>
 #include <linux/reservation.h>
 
 static inline int is_dma_buf_file(struct file *);
@@ -52,6 +53,16 @@ static int dma_buf_release(struct inode *inode, struct file *file)
 
 	BUG_ON(dmabuf->vmapping_counter);
 
+	/*
+	 * Any fences that a dma-buf poll can wait on should be signaled
+	 * before releasing dma-buf. This is the responsibility of each
+	 * driver that uses the reservation objects.
+	 *
+	 * If you hit this BUG() it means someone dropped their ref to the
+	 * dma-buf while still having pending operation to the buffer.
+	 */
+	BUG_ON(dmabuf->cb_shared.active || dmabuf->cb_excl.active);
+
 	dmabuf->ops->release(dmabuf);
 
 	mutex_lock(&db_list.lock);
@@ -108,10 +119,103 @@ static loff_t dma_buf_llseek(struct file *file, loff_t offset, int whence)
 	return base + offset;
 }
 
+static void dma_buf_poll_cb(struct fence *fence, struct fence_cb *cb)
+{
+	struct dma_buf_poll_cb_t *dcb = (struct dma_buf_poll_cb_t *)cb;
+	unsigned long flags;
+
+	spin_lock_irqsave(&dcb->poll->lock, flags);
+	wake_up_locked_poll(dcb->poll, dcb->active);
+	dcb->active = 0;
+	spin_unlock_irqrestore(&dcb->poll->lock, flags);
+}
+
+static unsigned int dma_buf_poll(struct file *file, poll_table *poll)
+{
+	struct dma_buf *dmabuf;
+	struct reservation_object *resv;
+	unsigned long events;
+
+	dmabuf = file->private_data;
+	if (!dmabuf || !dmabuf->resv)
+		return POLLERR;
+
+	resv = dmabuf->resv;
+
+	poll_wait(file, &dmabuf->poll, poll);
+
+	events = poll_requested_events(poll) & (POLLIN | POLLOUT);
+	if (!events)
+		return 0;
+
+	ww_mutex_lock(&resv->lock, NULL);
+
+	if (resv->fence_excl && (!(events & POLLOUT) ||
+				 resv->fence_shared_count == 0)) {
+		struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_excl;
+		unsigned long pevents = POLLIN;
+
+		if (resv->fence_shared_count == 0)
+			pevents |= POLLOUT;
+
+		spin_lock_irq(&dmabuf->poll.lock);
+		if (dcb->active) {
+			dcb->active |= pevents;
+			events &= ~pevents;
+		} else
+			dcb->active = pevents;
+		spin_unlock_irq(&dmabuf->poll.lock);
+
+		if (events & pevents) {
+			if (!fence_add_callback(resv->fence_excl,
+						&dcb->cb, dma_buf_poll_cb))
+				events &= ~pevents;
+			else
+				/*
+				 * No callback queued, wake up any additional
+				 * waiters.
+				 */
+				dma_buf_poll_cb(NULL, &dcb->cb);
+		}
+	}
+
+	if ((events & POLLOUT) && resv->fence_shared_count > 0) {
+		struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_shared;
+		int i;
+
+		/* Only queue a new callback if no event has fired yet */
+		spin_lock_irq(&dmabuf->poll.lock);
+		if (dcb->active)
+			events &= ~POLLOUT;
+		else
+			dcb->active = POLLOUT;
+		spin_unlock_irq(&dmabuf->poll.lock);
+
+		if (!(events & POLLOUT))
+			goto out;
+
+		for (i = 0; i < resv->fence_shared_count; ++i)
+			if (!fence_add_callback(resv->fence_shared[i],
+						&dcb->cb, dma_buf_poll_cb)) {
+				events &= ~POLLOUT;
+				break;
+			}
+
+		/* No callback queued, wake up any additional waiters. */
+		if (i == resv->fence_shared_count)
+			dma_buf_poll_cb(NULL, &dcb->cb);
+	}
+
+out:
+	ww_mutex_unlock(&resv->lock);
+	return events;
+}
+
 static const struct file_operations dma_buf_fops = {
 	.release	= dma_buf_release,
 	.mmap		= dma_buf_mmap_internal,
 	.llseek		= dma_buf_llseek,
+	.poll		= dma_buf_poll,
 };
 
 /*
@@ -171,6 +275,10 @@ struct dma_buf
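From userspace, the effect of wiring up `.poll` is that a dma-buf fd can be handed straight to poll()/select(): in this patch, POLLIN is gated on the exclusive fence and POLLOUT additionally on the shared fences. No dma-buf device is assumed here, so this sketch only demonstrates the call shape on an ordinary file descriptor (the helper name is hypothetical):

```c
#include <poll.h>
#include <unistd.h>

/*
 * Non-blocking, single-fd poll query: returns the revents mask, or -1 on
 * error. With the patch above applied, the fd could be a dma-buf; here
 * any pollable fd (e.g. a pipe end) works the same way.
 */
static short poll_events_once(int fd, short requested)
{
	struct pollfd pfd = { .fd = fd, .events = requested };

	if (poll(&pfd, 1, 0) < 0)	/* timeout 0: just query readiness */
		return -1;
	return pfd.revents;
}
```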
[PATCH 5/6] reservation: add support for fences to enable cross-device synchronisation
Signed-off-by: Maarten Lankhorst <maarten.lankho...@canonical.com>
Reviewed-by: Rob Clark <robdcl...@gmail.com>
---
 include/linux/reservation.h | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/include/linux/reservation.h b/include/linux/reservation.h
index 813dae960ebd..f3f57460a205 100644
--- a/include/linux/reservation.h
+++ b/include/linux/reservation.h
@@ -6,7 +6,7 @@
  * Copyright (C) 2012 Texas Instruments
  *
  * Authors:
- * Rob Clark <rob.cl...@linaro.org>
+ * Rob Clark <robdcl...@gmail.com>
  * Maarten Lankhorst <maarten.lankho...@canonical.com>
  * Thomas Hellstrom <thellstrom-at-vmware-dot-com>
  *
@@ -40,22 +40,40 @@
 #define _LINUX_RESERVATION_H
 
 #include <linux/ww_mutex.h>
+#include <linux/fence.h>
+#include <linux/slab.h>
 
 extern struct ww_class reservation_ww_class;
 
 struct reservation_object {
 	struct ww_mutex lock;
+
+	struct fence *fence_excl;
+	struct fence **fence_shared;
+	u32 fence_shared_count, fence_shared_max;
 };
 
 static inline void
 reservation_object_init(struct reservation_object *obj)
 {
 	ww_mutex_init(&obj->lock, &reservation_ww_class);
+
+	obj->fence_shared_count = obj->fence_shared_max = 0;
+	obj->fence_shared = NULL;
+	obj->fence_excl = NULL;
 }
 
 static inline void
 reservation_object_fini(struct reservation_object *obj)
 {
+	int i;
+
+	if (obj->fence_excl)
+		fence_put(obj->fence_excl);
+	for (i = 0; i < obj->fence_shared_count; ++i)
+		fence_put(obj->fence_shared[i]);
+	kfree(obj->fence_shared);
+
 	ww_mutex_destroy(&obj->lock);
 }
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
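The exclusive/shared split added here encodes a readers-writer rule: shared (read) fences may be pending in parallel, while the exclusive (write) fence stands alone. A reader must wait for the exclusive fence; a writer must wait for the exclusive fence and every shared fence. A minimal model of just that rule (not the kernel API — all names here are illustrative):

```c
#include <stdbool.h>

/*
 * Toy model of the reservation-object fence slots: readers run in
 * parallel (shared fences), a writer is exclusive.
 */
struct model_resv {
	int excl_pending;	/* 1 while an exclusive (write) fence is unsignaled */
	int shared_pending;	/* number of unsignaled shared (read) fences */
};

static bool model_can_read(const struct model_resv *r)
{
	/* reading only conflicts with a pending write */
	return !r->excl_pending;
}

static bool model_can_write(const struct model_resv *r)
{
	/* writing conflicts with everything still pending */
	return !r->excl_pending && r->shared_pending == 0;
}
```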
[PATCH 1/6] fence: dma-buf cross-device synchronization (v17)
A fence can be attached to a buffer which is being filled or consumed by hw, to allow userspace to pass the buffer without waiting to another device. For example, userspace can call page_flip ioctl to display the next frame of graphics after kicking the GPU but while the GPU is still rendering. The display device sharing the buffer with the GPU would attach a callback to get notified when the GPU's rendering-complete IRQ fires, to update the scan-out address of the display, without having to wake up userspace.

A driver must allocate a fence context for each execution ring that can run in parallel. The function for this takes an argument with how many contexts to allocate:
+ fence_context_alloc()

A fence is a transient, one-shot deal. It is allocated and attached to one or more dma-bufs. When the one that attached it is done with the pending operation, it can signal the fence:
+ fence_signal()

To have a rough approximation whether a fence is fired, call:
+ fence_is_signaled()

The dma-buf-mgr handles tracking, and waiting on, the fences associated with a dma-buf. The one pending on the fence can add an async callback:
+ fence_add_callback()

The callback can optionally be cancelled with:
+ fence_remove_callback()

To wait synchronously, optionally with a timeout:
+ fence_wait()
+ fence_wait_timeout()

When emitting a fence, call:
+ trace_fence_emit()

To annotate that a fence is blocking on another fence, call:
+ trace_fence_annotate_wait_on(fence, on_fence)

A default software-only implementation is provided, which can be used by drivers attaching a fence to a buffer when they have no other means for hw sync. But a memory-backed fence is also envisioned, because it is common that GPUs can write to, or poll on, some memory location for synchronization. For example:

	fence = custom_get_fence(...);
	if ((seqno_fence = to_seqno_fence(fence)) != NULL) {
		dma_buf *fence_buf = seqno_fence->sync_buf;
		get_dma_buf(fence_buf);

		... tell the hw the memory location to wait ...
		custom_wait_on(fence_buf, seqno_fence->seqno_ofs, fence->seqno);
	} else {
		/* fall-back to sw sync */
		fence_add_callback(fence, my_cb);
	}

On SoC platforms, if some other hw mechanism is provided for synchronizing between IP blocks, it could be supported as an alternate implementation with its own fence ops in a similar way.

The enable_signaling callback is used to provide sw signaling in case a cpu waiter is requested or no compatible hardware signaling could be used. The intention is to provide a userspace interface (presumably via eventfd) later, to be used in conjunction with dma-buf's mmap support for sw access to buffers (or for userspace apps that would prefer to do their own synchronization).

v1: Original
v2: After discussion w/ danvet and mlankhorst on #dri-devel, we decided that dma-fence didn't need to care about the sw->hw signaling path (it can be handled the same as the sw->sw case), and therefore the fence-ops can be simplified and more handled in the core. So remove the signal, add_callback, cancel_callback, and wait ops, and replace with a simple enable_signaling() op which can be used to inform a fence supporting hw->hw signaling that one or more devices which do not support hw signaling are waiting (and therefore it should enable an irq or do whatever is necessary in order that the CPU is notified when the fence is passed).
v3: Fix locking fail in attach_fence() and get_fence()
v4: Remove tie-in w/ dma-buf.. after discussion w/ danvet and mlankhorst we decided that we need to be able to attach one fence to N dma-bufs, so using the list_head in dma-fence struct would be problematic.
v5: [ Maarten Lankhorst ] Updated for dma-bikeshed-fence and dma-buf-manager.
v6: [ Maarten Lankhorst ] I removed dma_fence_cancel_callback and some comments about checking if fence fired or not. This is broken by design. waitqueue_active during destruction is now fatal, since the signaller should be holding a reference in enable_signalling until it signalled the fence. Pass the original dma_fence_cb along, and call __remove_wait in the dma_fence_callback handler, so that no cleanup needs to be performed.
v7: [ Maarten Lankhorst ] Set cb->func and only enable sw signaling if fence wasn't signaled yet, for example for hardware fences that may choose to signal blindly.
v8: [ Maarten Lankhorst ] Tons of tiny fixes, moved __dma_fence_init to header and fixed include mess. dma-fence.h now includes dma-buf.h. All members are now initialized, so kmalloc can be used for allocating a dma-fence. More documentation added.
v9: Change compiler bitfields to flags, change return type of enable_signaling to bool. Rework dma_fence_wait. Added dma_fence_is_signaled and dma_fence_wait_timeout. s/dma// and change exports to non GPL. Added fence_is_signaled and fence_enable_sw_signaling calls, add ability to override default
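The signal/add-callback/is-signaled flow described above can be modeled in a few lines of plain C. This is a single-threaded toy (no locking, no refcounting, fixed callback slots) purely to illustrate the semantics; none of these names are the kernel's:

```c
#include <stdbool.h>

/*
 * Toy fence: signaled once; callbacks added before signaling run exactly
 * once when it fires; adding a callback to an already-signaled fence
 * fails (like fence_add_callback() returning -ENOENT).
 */
struct model_fence;
typedef void (*model_cb_t)(struct model_fence *f);

struct model_fence {
	bool signaled;
	model_cb_t cbs[8];
	int n_cbs;
};

static int model_cb_count;	/* bumped by the sample callback below */

static void model_count_cb(struct model_fence *f)
{
	(void)f;
	model_cb_count++;
}

static bool model_fence_is_signaled(struct model_fence *f)
{
	return f->signaled;
}

/* false means "already signaled, run your callback path yourself" */
static bool model_fence_add_callback(struct model_fence *f, model_cb_t cb)
{
	if (f->signaled || f->n_cbs >= 8)
		return false;
	f->cbs[f->n_cbs++] = cb;
	return true;
}

static void model_fence_signal(struct model_fence *f)
{
	if (f->signaled)	/* one-shot: second signal is a no-op */
		return;
	f->signaled = true;
	for (int i = 0; i < f->n_cbs; i++)
		f->cbs[i](f);
	f->n_cbs = 0;
}
```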
[PATCH 4/6] android: convert sync to fence api, v5
Just to show it's easy.

Android syncpoints can be mapped to a timeline. This removes the need to maintain a separate api for synchronization. I've left the android trace events in place, but the core fence events should already be sufficient for debugging.

v2:
- Call fence_remove_callback in sync_fence_free if not all fences have fired.
v3:
- Merge Colin Cross' bugfixes, and the android fence merge optimization.
v4:
- Merge with the upstream fixes.
v5:
- Fix small style issues pointed out by Thomas Hellstrom.

Signed-off-by: Maarten Lankhorst <maarten.lankho...@canonical.com>
---
 drivers/staging/android/Kconfig      |   1 +
 drivers/staging/android/Makefile     |   2 +-
 drivers/staging/android/sw_sync.c    |   4 +-
 drivers/staging/android/sync.c       | 903 ++++++++++++-----------------
 drivers/staging/android/sync.h       |  82 ++-
 drivers/staging/android/sync_debug.c | 247 ++++++++
 drivers/staging/android/trace/sync.h |  12 +-
 7 files changed, 611 insertions(+), 640 deletions(-)
 create mode 100644 drivers/staging/android/sync_debug.c

diff --git a/drivers/staging/android/Kconfig b/drivers/staging/android/Kconfig
index b91c758883bf..ecc8194242b5 100644
--- a/drivers/staging/android/Kconfig
+++ b/drivers/staging/android/Kconfig
@@ -77,6 +77,7 @@ config SYNC
 	bool "Synchronization framework"
 	default n
 	select ANON_INODES
+	select DMA_SHARED_BUFFER
 	---help---
 	  This option enables the framework for synchronization between multiple
 	  drivers. Sync implementations can take advantage of hardware
diff --git a/drivers/staging/android/Makefile b/drivers/staging/android/Makefile
index 0a01e1914905..517ad5ffa429 100644
--- a/drivers/staging/android/Makefile
+++ b/drivers/staging/android/Makefile
@@ -9,5 +9,5 @@ obj-$(CONFIG_ANDROID_TIMED_OUTPUT)	+= timed_output.o
 obj-$(CONFIG_ANDROID_TIMED_GPIO)	+= timed_gpio.o
 obj-$(CONFIG_ANDROID_LOW_MEMORY_KILLER)	+= lowmemorykiller.o
 obj-$(CONFIG_ANDROID_INTF_ALARM_DEV)	+= alarm-dev.o
-obj-$(CONFIG_SYNC)	+= sync.o
+obj-$(CONFIG_SYNC)	+= sync.o sync_debug.o
 obj-$(CONFIG_SW_SYNC)	+= sw_sync.o
diff --git a/drivers/staging/android/sw_sync.c b/drivers/staging/android/sw_sync.c
index f24493ac65e3..a76db3ff87cb 100644
--- a/drivers/staging/android/sw_sync.c
+++ b/drivers/staging/android/sw_sync.c
@@ -50,7 +50,7 @@ static struct sync_pt *sw_sync_pt_dup(struct sync_pt *sync_pt)
 {
 	struct sw_sync_pt *pt = (struct sw_sync_pt *) sync_pt;
 	struct sw_sync_timeline *obj =
-		(struct sw_sync_timeline *)sync_pt->parent;
+		(struct sw_sync_timeline *)sync_pt_parent(sync_pt);
 
 	return (struct sync_pt *) sw_sync_pt_create(obj, pt->value);
 }
@@ -59,7 +59,7 @@ static int sw_sync_pt_has_signaled(struct sync_pt *sync_pt)
 {
 	struct sw_sync_pt *pt = (struct sw_sync_pt *)sync_pt;
 	struct sw_sync_timeline *obj =
-		(struct sw_sync_timeline *)sync_pt->parent;
+		(struct sw_sync_timeline *)sync_pt_parent(sync_pt);
 
 	return sw_sync_cmp(obj->value, pt->value) >= 0;
 }
diff --git a/drivers/staging/android/sync.c b/drivers/staging/android/sync.c
index 3d05f662110b..b2254e5a8b70 100644
--- a/drivers/staging/android/sync.c
+++ b/drivers/staging/android/sync.c
@@ -31,22 +31,13 @@
 #define CREATE_TRACE_POINTS
 #include "trace/sync.h"
 
-static void sync_fence_signal_pt(struct sync_pt *pt);
-static int _sync_pt_has_signaled(struct sync_pt *pt);
-static void sync_fence_free(struct kref *kref);
-static void sync_dump(void);
-
-static LIST_HEAD(sync_timeline_list_head);
-static DEFINE_SPINLOCK(sync_timeline_list_lock);
-
-static LIST_HEAD(sync_fence_list_head);
-static DEFINE_SPINLOCK(sync_fence_list_lock);
+static const struct fence_ops android_fence_ops;
+static const struct file_operations sync_fence_fops;
 
 struct sync_timeline *sync_timeline_create(const struct sync_timeline_ops *ops,
					   int size, const char *name)
 {
 	struct sync_timeline *obj;
-	unsigned long flags;
 
 	if (size < sizeof(struct sync_timeline))
 		return NULL;
@@ -57,17 +48,14 @@ struct sync_timeline *sync_timeline_create(const struct sync_timeline_ops *ops,
 
 	kref_init(&obj->kref);
 	obj->ops = ops;
+	obj->context = fence_context_alloc(1);
 	strlcpy(obj->name, name, sizeof(obj->name));
 
 	INIT_LIST_HEAD(&obj->child_list_head);
-	spin_lock_init(&obj->child_list_lock);
-
 	INIT_LIST_HEAD(&obj->active_list_head);
-	spin_lock_init(&obj->active_list_lock);
+	spin_lock_init(&obj->child_list_lock);
 
-	spin_lock_irqsave(&sync_timeline_list_lock, flags);
-	list_add_tail(&obj->sync_timeline_list, &sync_timeline_list_head);
-	spin_unlock_irqrestore(&sync_timeline_list_lock, flags);
+	sync_timeline_debug_add(obj);
 
 	return obj;
 }
@@ -77,11 +65,8 @@ static void sync_timeline_free
[PATCH 3/6] dma-buf: use reservation objects
This allows reservation objects to be used in dma-buf. It's required for implementing polling support on the fences that belong to a dma-buf.

Signed-off-by: Maarten Lankhorst <maarten.lankho...@canonical.com>
Acked-by: Mauro Carvalho Chehab <m.che...@samsung.com> #drivers/media/v4l2-core/
Acked-by: Thomas Hellstrom <thellst...@vmware.com> #drivers/gpu/drm/ttm
---
 drivers/base/dma-buf.c                         | 22 ++++++++++++++++++++--
 drivers/gpu/drm/drm_prime.c                    |  8 +++++++-
 drivers/gpu/drm/exynos/exynos_drm_dmabuf.c     |  2 +-
 drivers/gpu/drm/i915/i915_gem_dmabuf.c         |  2 +-
 drivers/gpu/drm/nouveau/nouveau_drm.c          |  1 +
 drivers/gpu/drm/nouveau/nouveau_gem.h          |  1 +
 drivers/gpu/drm/nouveau/nouveau_prime.c        |  7 +++++++
 drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c      |  2 +-
 drivers/gpu/drm/radeon/radeon_drv.c            |  2 ++
 drivers/gpu/drm/radeon/radeon_prime.c          |  8 ++++++++
 drivers/gpu/drm/ttm/ttm_object.c               |  2 +-
 drivers/media/v4l2-core/videobuf2-dma-contig.c |  2 +-
 drivers/staging/android/ion/ion.c              |  2 +-
 include/drm/drmP.h                             |  2 ++
 include/linux/dma-buf.h                        |  9 ++++---
 15 files changed, 60 insertions(+), 12 deletions(-)

diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c
index 1e16cbd61da2..65d0f6201db4 100644
--- a/drivers/base/dma-buf.c
+++ b/drivers/base/dma-buf.c
@@ -25,10 +25,12 @@
 #include <linux/fs.h>
 #include <linux/slab.h>
 #include <linux/dma-buf.h>
+#include <linux/fence.h>
 #include <linux/anon_inodes.h>
 #include <linux/export.h>
 #include <linux/debugfs.h>
 #include <linux/seq_file.h>
+#include <linux/reservation.h>
 
 static inline int is_dma_buf_file(struct file *);
@@ -56,6 +58,9 @@ static int dma_buf_release(struct inode *inode, struct file *file)
 	list_del(&dmabuf->list_node);
 	mutex_unlock(&db_list.lock);
 
+	if (dmabuf->resv == (struct reservation_object *)&dmabuf[1])
+		reservation_object_fini(dmabuf->resv);
+
 	kfree(dmabuf);
 	return 0;
 }
@@ -128,6 +133,7 @@ static inline int is_dma_buf_file(struct file *file)
 * @size:	[in]	Size of the buffer
 * @flags:	[in]	mode flags for the file.
 * @exp_name:	[in]	name of the exporting module - useful for debugging.
+ * @resv:	[in]	reservation-object, NULL to allocate default one.
 *
 * Returns, on success, a newly created dma_buf object, which wraps the
 * supplied private data and operations for dma_buf_ops. On either missing
@@ -135,10 +141,17 @@ static inline int is_dma_buf_file(struct file *file)
 *
 */
 struct dma_buf *dma_buf_export_named(void *priv, const struct dma_buf_ops *ops,
-				size_t size, int flags, const char *exp_name)
+				size_t size, int flags, const char *exp_name,
+				struct reservation_object *resv)
 {
 	struct dma_buf *dmabuf;
 	struct file *file;
+	size_t alloc_size = sizeof(struct dma_buf);
+	if (!resv)
+		alloc_size += sizeof(struct reservation_object);
+	else
+		/* prevent &dma_buf[1] == dma_buf->resv */
+		alloc_size += 1;
 
 	if (WARN_ON(!priv || !ops
 		  || !ops->map_dma_buf
@@ -150,7 +163,7 @@ struct dma_buf *dma_buf_export_named(void *priv, const struct dma_buf_ops *ops,
 		return ERR_PTR(-EINVAL);
 	}
 
-	dmabuf = kzalloc(sizeof(struct dma_buf), GFP_KERNEL);
+	dmabuf = kzalloc(alloc_size, GFP_KERNEL);
 	if (dmabuf == NULL)
 		return ERR_PTR(-ENOMEM);
@@ -158,6 +171,11 @@ struct dma_buf *dma_buf_export_named(void *priv, const struct dma_buf_ops *ops,
 	dmabuf->ops = ops;
 	dmabuf->size = size;
 	dmabuf->exp_name = exp_name;
+	if (!resv) {
+		resv = (struct reservation_object *)&dmabuf[1];
+		reservation_object_init(resv);
+	}
+	dmabuf->resv = resv;
 
 	file = anon_inode_getfile("dmabuf", &dma_buf_fops, dmabuf, flags);
 	if (IS_ERR(file)) {
diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index 56805c39c906..a13e90245adf 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -318,7 +318,13 @@ static const struct dma_buf_ops drm_gem_prime_dmabuf_ops = {
 struct dma_buf *drm_gem_prime_export(struct drm_device *dev,
				     struct drm_gem_object *obj, int flags)
 {
-	return dma_buf_export(obj, &drm_gem_prime_dmabuf_ops, obj->size, flags);
+	struct reservation_object *robj = NULL;
+
+	if (dev->driver->gem_prime_res_obj)
+		robj = dev->driver->gem_prime_res_obj(obj);
+
+	return dma_buf_export(obj, &drm_gem_prime_dmabuf_ops, obj->size,
+			      flags, robj);
 }
 EXPORT_SYMBOL(drm_gem_prime_export);
diff --git a/drivers/gpu/drm/exynos
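When no reservation object is supplied, dma_buf_export_named() embeds the default one in the same allocation and the release path later recognizes ownership by comparing against &dmabuf[1]; the extra byte added when a caller does supply one keeps that address from ever matching an external pointer. A userspace-runnable miniature of that allocation trick (all names hypothetical):

```c
#include <stdlib.h>

/* Stand-ins for struct dma_buf / struct reservation_object. */
struct model_resv2 { int initialized; };

struct model_buf {
	struct model_resv2 *resv;
	/* ... other members ... */
};

static struct model_buf *model_buf_create(struct model_resv2 *resv)
{
	size_t alloc_size = sizeof(struct model_buf);
	struct model_buf *buf;

	if (!resv)
		alloc_size += sizeof(struct model_resv2);
	else
		/* the kernel version pads by one byte here so &buf[1] stays
		 * inside this allocation and can never equal a caller-
		 * supplied pointer */
		alloc_size += 1;

	buf = calloc(1, alloc_size);
	if (!buf)
		return NULL;

	if (!resv) {
		/* embedded object lives just past the main struct */
		resv = (struct model_resv2 *)&buf[1];
		resv->initialized = 1;
	}
	buf->resv = resv;
	return buf;
}
```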
Re: [PATCH 2/6] seqno-fence: Hardware dma-buf implementation of fencing (v4)
op 17-02-14 19:41, Christian König schreef:
Am 17.02.2014 19:24, schrieb Rob Clark:
On Mon, Feb 17, 2014 at 12:36 PM, Christian König <deathsim...@vodafone.de> wrote:
Am 17.02.2014 18:27, schrieb Rob Clark:
On Mon, Feb 17, 2014 at 11:56 AM, Christian König <deathsim...@vodafone.de> wrote:
Am 17.02.2014 16:56, schrieb Maarten Lankhorst:

This type of fence can be used with hardware synchronization for simple hardware that can block execution until the condition (dma_buf[offset] - value) >= 0 has been met.

Can't we make that just dma_buf[offset] != 0 instead? As far as I know this way it would match the definition M$ uses in their WDDM specification and so make it much more likely that hardware supports it.

well 'buf[offset] >= value' at least means the same slot can be used for multiple operations (with increasing values of 'value').. not sure if that is something people care about. >=value seems to be possible with adreno and radeon. I'm not really sure about others (although I presume it is at least supported for nv desktop stuff). For hw that cannot do >=value, we can either have a different fence implementation which uses the !=0 approach, or change the seqno-fence implementation later if needed. But if someone has hw that can do !=0 but not >=value, speak up now ;-)

Here! Radeon can only do >=value on the DMA and 3D engine, but not with UVD or VCE. And for the 3D engine it means draining the pipe, which isn't really a good idea.

hmm, ok.. forgot you have a few extra rings compared to me. Is UVD re-ordering from decode-order to display-order for you in hw? If not, I guess you need sw intervention anyways when a frame is done for frame re-ordering, so maybe hw->hw sync doesn't really matter as much as compared to gpu/3d->display. For dma->3d interactions, seems like you would care more about hw->hw sync, but I guess you aren't likely to use GPU A to do a resolve blit for GPU B..

No, UVD isn't reordering, but since frame reordering is predictable you usually end up pipelining everything to the hardware. E.g. you send the decode commands in decode order to the UVD block, and if you have overlay active one of the frames is going to be the first to display, and then you want to wait for it on the display side.

For the 3D ring, I assume you probably want a CP_WAIT_FOR_IDLE before a CP_MEM_WRITE to update the fence value in memory (for the one signalling the fence). But why would you need that before a CP_WAIT_REG_MEM (for the one waiting for the fence)? I don't exactly have documentation for the adreno version of CP_WAIT_REG_{MEM,EQ,GTE}.. but PFP and ME appear to be the same instruction set as r600, so I'm pretty sure they should have similar capabilities.. CP_WAIT_REG_MEM appears to be the same but with 32-bit gpu addresses vs 64-bit.

You shouldn't use any of the CP commands for engine synchronization (neither for wait nor for signal). The PFP and ME are just the top of a quite deep pipeline, and when you use any of the CP_WAIT functions you block them for something, and that's draining the pipeline. With the semaphore and fence commands the values are just attached as a prerequisite to the draw command, e.g. the CP sets up the draw environment and issues the command, but the actual execution of it is delayed until the != 0 condition hits. And in the meantime the CP already prepares the next draw operation. But at least for compute queues wait semaphores aren't the perfect solution either. What you need then is a GPU scheduler that uses a kernel task to set up the command submission for you when all prerequisites are met.

nouveau has sort of a scheduler in hardware. It can yield when waiting on a semaphore. And each process gets its own context and the timeslices can be adjusted. ;-)

But I don't mind changing this patch when an actual user pops up. Nouveau can do a wait for (*sema & mask) != 0 only on nvc0 and newer, where the mask can be chosen. But it can do == somevalue and >= somevalue on older relevant optimus hardware, so if we know that it was zero before and we know the sign of the new value, that could work too. Adding ops and a separate mask later on when users pop up is fine with me; the original design here was chosen so I could map the intel status page read-only into the process-specific nvidia vm.

~Maarten
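The three memory-wait predicates being compared in this thread can be written out as plain C (illustrative only; which of them a given engine evaluates in hardware is exactly what is being debated above). The contrast also shows why Rob's point about slot reuse matters: with `==` a waiter misses the event once the location has advanced past its value, while `>=` still passes.

```c
#include <stdbool.h>
#include <stdint.h>

/* wait until mem == value (the strictest form) */
static bool hw_wait_eq(uint32_t mem, uint32_t value)
{
	return mem == value;
}

/* wait until mem >= value; signed difference survives wraparound and
 * lets the same slot be reused with increasing values */
static bool hw_wait_gte(uint32_t mem, uint32_t value)
{
	return (int32_t)(mem - value) >= 0;
}

/* wait until (*sema & mask) != 0, the nvc0+ nouveau form */
static bool hw_wait_mask_nz(uint32_t mem, uint32_t mask)
{
	return (mem & mask) != 0;
}
```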
[PATCH 2/6] seqno-fence: Hardware dma-buf implementation of fencing (v4)
This type of fence can be used with hardware synchronization for simple hardware that can block execution until the condition (dma_buf[offset] - value) = 0 has been met. A software fallback still has to be provided in case the fence is used with a device that doesn't support this mechanism. It is useful to expose this for graphics cards that have an op to support this. Some cards like i915 can export those, but don't have an option to wait, so they need the software fallback. I extended the original patch by Rob Clark. v1: Original v2: Renamed from bikeshed to seqno, moved into dma-fence.c since not much was left of the file. Lots of documentation added. v3: Use fence_ops instead of custom callbacks. Moved to own file to avoid circular dependency between dma-buf.h and fence.h v4: Add spinlock pointer to seqno_fence_init Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com --- Documentation/DocBook/device-drivers.tmpl |1 drivers/base/fence.c | 50 + include/linux/seqno-fence.h | 109 + 3 files changed, 160 insertions(+) create mode 100644 include/linux/seqno-fence.h diff --git a/Documentation/DocBook/device-drivers.tmpl b/Documentation/DocBook/device-drivers.tmpl index 7a0c9ddb4818..8c85c20942c2 100644 --- a/Documentation/DocBook/device-drivers.tmpl +++ b/Documentation/DocBook/device-drivers.tmpl @@ -131,6 +131,7 @@ X!Edrivers/base/interface.c !Edrivers/base/dma-buf.c !Edrivers/base/fence.c !Iinclude/linux/fence.h +!Iinclude/linux/seqno-fence.h !Edrivers/base/reservation.c !Iinclude/linux/reservation.h !Edrivers/base/dma-coherent.c diff --git a/drivers/base/fence.c b/drivers/base/fence.c index 12df2bf62034..cd0937127a89 100644 --- a/drivers/base/fence.c +++ b/drivers/base/fence.c @@ -25,6 +25,7 @@ #include linux/export.h #include linux/atomic.h #include linux/fence.h +#include linux/seqno-fence.h #define CREATE_TRACE_POINTS #include trace/events/fence.h @@ -413,3 +414,52 @@ __fence_init(struct fence *fence, const struct fence_ops *ops, 
trace_fence_init(fence); } EXPORT_SYMBOL(__fence_init); + +static const char *seqno_fence_get_driver_name(struct fence *fence) { + struct seqno_fence *seqno_fence = to_seqno_fence(fence); + return seqno_fence->ops->get_driver_name(fence); +} + +static const char *seqno_fence_get_timeline_name(struct fence *fence) { + struct seqno_fence *seqno_fence = to_seqno_fence(fence); + return seqno_fence->ops->get_timeline_name(fence); +} + +static bool seqno_enable_signaling(struct fence *fence) +{ + struct seqno_fence *seqno_fence = to_seqno_fence(fence); + return seqno_fence->ops->enable_signaling(fence); +} + +static bool seqno_signaled(struct fence *fence) +{ + struct seqno_fence *seqno_fence = to_seqno_fence(fence); + return seqno_fence->ops->signaled && seqno_fence->ops->signaled(fence); +} + +static void seqno_release(struct fence *fence) +{ + struct seqno_fence *f = to_seqno_fence(fence); + + dma_buf_put(f->sync_buf); + if (f->ops->release) + f->ops->release(fence); + else + kfree(f); +} + +static long seqno_wait(struct fence *fence, bool intr, signed long timeout) +{ + struct seqno_fence *f = to_seqno_fence(fence); + return f->ops->wait(fence, intr, timeout); +} + +const struct fence_ops seqno_fence_ops = { + .get_driver_name = seqno_fence_get_driver_name, + .get_timeline_name = seqno_fence_get_timeline_name, + .enable_signaling = seqno_enable_signaling, + .signaled = seqno_signaled, + .wait = seqno_wait, + .release = seqno_release, +}; +EXPORT_SYMBOL(seqno_fence_ops); diff --git a/include/linux/seqno-fence.h b/include/linux/seqno-fence.h new file mode 100644 index ..952f7909128c --- /dev/null +++ b/include/linux/seqno-fence.h @@ -0,0 +1,109 @@ +/* + * seqno-fence, using a dma-buf to synchronize fencing + * + * Copyright (C) 2012 Texas Instruments + * Copyright (C) 2012 Canonical Ltd + * Authors: + * Rob Clark robdcl...@gmail.com + * Maarten Lankhorst maarten.lankho...@canonical.com + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of
the GNU General Public License version 2 as published by + * the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program. If not, see http://www.gnu.org/licenses/. + */ + +#ifndef __LINUX_SEQNO_FENCE_H +#define __LINUX_SEQNO_FENCE_H + +#include linux/fence.h +#include linux/dma-buf.h + +struct seqno_fence { + struct fence base; + + const struct fence_ops *ops; + struct dma_buf
[PATCH 4/6] android: convert sync to fence api, v4
Android syncpoints can be mapped to a timeline. This removes the need to maintain a separate api for synchronization. I've left the android trace events in place, but the core fence events should already be sufficient for debugging. v2: - Call fence_remove_callback in sync_fence_free if not all fences have fired. v3: - Merge Colin Cross' bugfixes, and the android fence merge optimization. v4: - Merge with the upstream fixes. Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com --- drivers/staging/android/Kconfig |1 drivers/staging/android/Makefile |2 drivers/staging/android/sw_sync.c|4 drivers/staging/android/sync.c | 892 +++--- drivers/staging/android/sync.h | 80 ++- drivers/staging/android/sync_debug.c | 245 + drivers/staging/android/trace/sync.h | 12 7 files changed, 592 insertions(+), 644 deletions(-) create mode 100644 drivers/staging/android/sync_debug.c diff --git a/drivers/staging/android/Kconfig b/drivers/staging/android/Kconfig index b91c758883bf..ecc8194242b5 100644 --- a/drivers/staging/android/Kconfig +++ b/drivers/staging/android/Kconfig @@ -77,6 +77,7 @@ config SYNC bool Synchronization framework default n select ANON_INODES + select DMA_SHARED_BUFFER ---help--- This option enables the framework for synchronization between multiple drivers. 
Sync implementations can take advantage of hardware diff --git a/drivers/staging/android/Makefile b/drivers/staging/android/Makefile index 0a01e1914905..517ad5ffa429 100644 --- a/drivers/staging/android/Makefile +++ b/drivers/staging/android/Makefile @@ -9,5 +9,5 @@ obj-$(CONFIG_ANDROID_TIMED_OUTPUT) += timed_output.o obj-$(CONFIG_ANDROID_TIMED_GPIO) += timed_gpio.o obj-$(CONFIG_ANDROID_LOW_MEMORY_KILLER)+= lowmemorykiller.o obj-$(CONFIG_ANDROID_INTF_ALARM_DEV) += alarm-dev.o -obj-$(CONFIG_SYNC) += sync.o +obj-$(CONFIG_SYNC) += sync.o sync_debug.o obj-$(CONFIG_SW_SYNC) += sw_sync.o diff --git a/drivers/staging/android/sw_sync.c b/drivers/staging/android/sw_sync.c index f24493ac65e3..a76db3ff87cb 100644 --- a/drivers/staging/android/sw_sync.c +++ b/drivers/staging/android/sw_sync.c @@ -50,7 +50,7 @@ static struct sync_pt *sw_sync_pt_dup(struct sync_pt *sync_pt) { struct sw_sync_pt *pt = (struct sw_sync_pt *) sync_pt; struct sw_sync_timeline *obj = - (struct sw_sync_timeline *)sync_pt->parent; + (struct sw_sync_timeline *)sync_pt_parent(sync_pt); return (struct sync_pt *) sw_sync_pt_create(obj, pt->value); } @@ -59,7 +59,7 @@ static int sw_sync_pt_has_signaled(struct sync_pt *sync_pt) { struct sw_sync_pt *pt = (struct sw_sync_pt *)sync_pt; struct sw_sync_timeline *obj = - (struct sw_sync_timeline *)sync_pt->parent; + (struct sw_sync_timeline *)sync_pt_parent(sync_pt); return sw_sync_cmp(obj->value, pt->value) >= 0; } diff --git a/drivers/staging/android/sync.c b/drivers/staging/android/sync.c index 3d05f662110b..8e77cd73b739 100644 --- a/drivers/staging/android/sync.c +++ b/drivers/staging/android/sync.c @@ -31,22 +31,13 @@ #define CREATE_TRACE_POINTS #include "trace/sync.h" -static void sync_fence_signal_pt(struct sync_pt *pt); -static int _sync_pt_has_signaled(struct sync_pt *pt); -static void sync_fence_free(struct kref *kref); -static void sync_dump(void); - -static LIST_HEAD(sync_timeline_list_head); -static DEFINE_SPINLOCK(sync_timeline_list_lock); - -static
LIST_HEAD(sync_fence_list_head); -static DEFINE_SPINLOCK(sync_fence_list_lock); +static const struct fence_ops android_fence_ops; +static const struct file_operations sync_fence_fops; struct sync_timeline *sync_timeline_create(const struct sync_timeline_ops *ops, int size, const char *name) { struct sync_timeline *obj; - unsigned long flags; if (size < sizeof(struct sync_timeline)) return NULL; @@ -57,17 +48,14 @@ struct sync_timeline *sync_timeline_create(const struct sync_timeline_ops *ops, kref_init(&obj->kref); obj->ops = ops; + obj->context = fence_context_alloc(1); strlcpy(obj->name, name, sizeof(obj->name)); INIT_LIST_HEAD(&obj->child_list_head); - spin_lock_init(&obj->child_list_lock); - INIT_LIST_HEAD(&obj->active_list_head); - spin_lock_init(&obj->active_list_lock); + spin_lock_init(&obj->child_list_lock); - spin_lock_irqsave(&sync_timeline_list_lock, flags); - list_add_tail(&obj->sync_timeline_list, &sync_timeline_list_head); - spin_unlock_irqrestore(&sync_timeline_list_lock, flags); + sync_timeline_debug_add(obj); return obj; } @@ -77,11 +65,8 @@ static void sync_timeline_free(struct kref *kref) { struct sync_timeline *obj = container_of(kref
[PATCH 5/6] reservation: add support for fences to enable cross-device synchronisation
Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com --- include/linux/reservation.h | 18 +- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/include/linux/reservation.h b/include/linux/reservation.h index 813dae960ebd..92c4851b5a39 100644 --- a/include/linux/reservation.h +++ b/include/linux/reservation.h @@ -6,7 +6,7 @@ * Copyright (C) 2012 Texas Instruments * * Authors: - * Rob Clark rob.cl...@linaro.org + * Rob Clark robdcl...@gmail.com * Maarten Lankhorst maarten.lankho...@canonical.com * Thomas Hellstrom thellstrom-at-vmware-dot-com * @@ -40,22 +40,38 @@ #define _LINUX_RESERVATION_H #include <linux/ww_mutex.h> +#include <linux/fence.h> extern struct ww_class reservation_ww_class; struct reservation_object { struct ww_mutex lock; + + struct fence *fence_excl; + struct fence **fence_shared; + u32 fence_shared_count, fence_shared_max; }; static inline void reservation_object_init(struct reservation_object *obj) { ww_mutex_init(&obj->lock, &reservation_ww_class); + + obj->fence_shared_count = obj->fence_shared_max = 0; + obj->fence_shared = NULL; + obj->fence_excl = NULL; } static inline void reservation_object_fini(struct reservation_object *obj) { + int i; + + if (obj->fence_excl) + fence_put(obj->fence_excl); + for (i = 0; i < obj->fence_shared_count; ++i) + fence_put(obj->fence_shared[i]); + ww_mutex_destroy(&obj->lock); } -- To unsubscribe from this list: send the line unsubscribe linux-media in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/6] dma-buf synchronization patches
The following series implements fence and converts dma-buf and android sync to use it. Patch 6 and 7 add support for polling to dma-buf, blocking until all fences are signaled. I've dropped the extra patch to copy an export from the core, and instead use the public version of it. I've had to fix some fallout from the rebase, hopefully everything's clean now, and ready for -next. --- Maarten Lankhorst (6): fence: dma-buf cross-device synchronization (v17) seqno-fence: Hardware dma-buf implementation of fencing (v4) dma-buf: use reservation objects android: convert sync to fence api, v3 reservation: add support for fences to enable cross-device synchronisation dma-buf: add poll support, v2 Documentation/DocBook/device-drivers.tmpl |3 drivers/base/Kconfig |9 drivers/base/Makefile |2 drivers/base/dma-buf.c | 123 +++ drivers/base/fence.c | 465 + drivers/gpu/drm/drm_prime.c|8 drivers/gpu/drm/exynos/exynos_drm_dmabuf.c |2 drivers/gpu/drm/i915/i915_gem_dmabuf.c |2 drivers/gpu/drm/nouveau/nouveau_drm.c |1 drivers/gpu/drm/nouveau/nouveau_gem.h |1 drivers/gpu/drm/nouveau/nouveau_prime.c|7 drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c |2 drivers/gpu/drm/radeon/radeon_drv.c|2 drivers/gpu/drm/radeon/radeon_prime.c |8 drivers/gpu/drm/ttm/ttm_object.c |2 drivers/media/v4l2-core/videobuf2-dma-contig.c |2 drivers/staging/android/Kconfig|1 drivers/staging/android/Makefile |2 drivers/staging/android/sw_sync.c |4 drivers/staging/android/sync.c | 892 drivers/staging/android/sync.h | 80 +- drivers/staging/android/sync_debug.c | 245 +++ drivers/staging/android/trace/sync.h | 12 include/drm/drmP.h |2 include/linux/dma-buf.h| 21 - include/linux/fence.h | 329 + include/linux/reservation.h| 18 include/linux/seqno-fence.h| 109 +++ include/trace/events/fence.h | 125 +++ 29 files changed, 1822 insertions(+), 657 deletions(-) create mode 100644 drivers/base/fence.c create mode 100644 drivers/staging/android/sync_debug.c create mode 100644 include/linux/fence.h create mode 100644 
include/linux/seqno-fence.h create mode 100644 include/trace/events/fence.h
Re: [PATCH 1/7] sched: allow try_to_wake_up to be used internally outside of core.c
On 13-01-14 19:50, Colin Cross wrote: On Mon, Jan 13, 2014 at 4:31 AM, Maarten Lankhorst maarten.lankho...@canonical.com wrote: The kernel fence implementation doesn't use event queues, but needs to perform the same wake up. The symbol is not exported, since the fence implementation is not built as a module. Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com --- include/linux/wait.h | 1 + kernel/sched/core.c | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/include/linux/wait.h b/include/linux/wait.h index eaa00b10abaa..c54e3ef50134 100644 --- a/include/linux/wait.h +++ b/include/linux/wait.h @@ -12,6 +12,7 @@ typedef struct __wait_queue wait_queue_t; typedef int (*wait_queue_func_t)(wait_queue_t *wait, unsigned mode, int flags, void *key); int default_wake_function(wait_queue_t *wait, unsigned mode, int flags, void *key); +int try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags); struct __wait_queue { unsigned int flags; diff --git a/kernel/sched/core.c b/kernel/sched/core.c index a88f4a485c5e..f41d317042dd 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -1578,7 +1578,7 @@ static void ttwu_queue(struct task_struct *p, int cpu) * Return: %true if @p was woken up, %false if it was already running. * or @state didn't match @p's state. */ -static int +int try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags) { unsigned long flags; wake_up_state is already available in linux/sched.h, can you use that? Indeed! Thanks for the catch. ~Maarten
[PATCH 3/7] seqno-fence: Hardware dma-buf implementation of fencing (v4)
This type of fence can be used with hardware synchronization for simple hardware that can block execution until the condition (dma_buf[offset] - value) >= 0 has been met. A software fallback still has to be provided in case the fence is used with a device that doesn't support this mechanism. It is useful to expose this for graphics cards that have an op to support this. Some cards like i915 can export those, but don't have an option to wait, so they need the software fallback. I extended the original patch by Rob Clark. v1: Original v2: Renamed from bikeshed to seqno, moved into dma-fence.c since not much was left of the file. Lots of documentation added. v3: Use fence_ops instead of custom callbacks. Moved to own file to avoid circular dependency between dma-buf.h and fence.h v4: Add spinlock pointer to seqno_fence_init Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com --- Documentation/DocBook/device-drivers.tmpl | 1 drivers/base/fence.c | 50 + include/linux/seqno-fence.h | 109 + 3 files changed, 160 insertions(+) create mode 100644 include/linux/seqno-fence.h diff --git a/Documentation/DocBook/device-drivers.tmpl b/Documentation/DocBook/device-drivers.tmpl index 7a0c9ddb4818..8c85c20942c2 100644 --- a/Documentation/DocBook/device-drivers.tmpl +++ b/Documentation/DocBook/device-drivers.tmpl @@ -131,6 +131,7 @@ X!Edrivers/base/interface.c !Edrivers/base/dma-buf.c !Edrivers/base/fence.c !Iinclude/linux/fence.h +!Iinclude/linux/seqno-fence.h !Edrivers/base/reservation.c !Iinclude/linux/reservation.h !Edrivers/base/dma-coherent.c diff --git a/drivers/base/fence.c b/drivers/base/fence.c index ac5f68020436..25bd5813aa90 100644 --- a/drivers/base/fence.c +++ b/drivers/base/fence.c @@ -25,6 +25,7 @@ #include <linux/export.h> #include <linux/atomic.h> #include <linux/fence.h> +#include <linux/seqno-fence.h> #define CREATE_TRACE_POINTS #include <trace/events/fence.h> @@ -413,3 +414,52 @@ __fence_init(struct fence *fence, const struct fence_ops *ops,
trace_fence_init(fence); } EXPORT_SYMBOL(__fence_init); + +static const char *seqno_fence_get_driver_name(struct fence *fence) { + struct seqno_fence *seqno_fence = to_seqno_fence(fence); + return seqno_fence->ops->get_driver_name(fence); +} + +static const char *seqno_fence_get_timeline_name(struct fence *fence) { + struct seqno_fence *seqno_fence = to_seqno_fence(fence); + return seqno_fence->ops->get_timeline_name(fence); +} + +static bool seqno_enable_signaling(struct fence *fence) +{ + struct seqno_fence *seqno_fence = to_seqno_fence(fence); + return seqno_fence->ops->enable_signaling(fence); +} + +static bool seqno_signaled(struct fence *fence) +{ + struct seqno_fence *seqno_fence = to_seqno_fence(fence); + return seqno_fence->ops->signaled && seqno_fence->ops->signaled(fence); +} + +static void seqno_release(struct fence *fence) +{ + struct seqno_fence *f = to_seqno_fence(fence); + + dma_buf_put(f->sync_buf); + if (f->ops->release) + f->ops->release(fence); + else + kfree(f); +} + +static long seqno_wait(struct fence *fence, bool intr, signed long timeout) +{ + struct seqno_fence *f = to_seqno_fence(fence); + return f->ops->wait(fence, intr, timeout); +} + +const struct fence_ops seqno_fence_ops = { + .get_driver_name = seqno_fence_get_driver_name, + .get_timeline_name = seqno_fence_get_timeline_name, + .enable_signaling = seqno_enable_signaling, + .signaled = seqno_signaled, + .wait = seqno_wait, + .release = seqno_release, +}; +EXPORT_SYMBOL(seqno_fence_ops); diff --git a/include/linux/seqno-fence.h b/include/linux/seqno-fence.h new file mode 100644 index ..952f7909128c --- /dev/null +++ b/include/linux/seqno-fence.h @@ -0,0 +1,109 @@ +/* + * seqno-fence, using a dma-buf to synchronize fencing + * + * Copyright (C) 2012 Texas Instruments + * Copyright (C) 2012 Canonical Ltd + * Authors: + * Rob Clark robdcl...@gmail.com + * Maarten Lankhorst maarten.lankho...@canonical.com + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of
the GNU General Public License version 2 as published by + * the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program. If not, see http://www.gnu.org/licenses/. + */ + +#ifndef __LINUX_SEQNO_FENCE_H +#define __LINUX_SEQNO_FENCE_H + +#include linux/fence.h +#include linux/dma-buf.h + +struct seqno_fence { + struct fence base; + + const struct fence_ops *ops; + struct dma_buf
[PATCH 0/7] dma-buf synchronization patches
The following series implements fence and converts dma-buf and android sync to use it. Patch 6 and 7 add support for polling to dma-buf, blocking until all fences are signaled. --- Maarten Lankhorst (7): sched: allow try_to_wake_up to be used internally outside of core.c fence: dma-buf cross-device synchronization (v16) seqno-fence: Hardware dma-buf implementation of fencing (v4) dma-buf: use reservation objects android: convert sync to fence api, v3 reservation: add support for fences to enable cross-device synchronisation dma-buf: add poll support Documentation/DocBook/device-drivers.tmpl |3 drivers/base/Kconfig |9 drivers/base/Makefile |2 drivers/base/dma-buf.c | 124 +++ drivers/base/fence.c | 465 drivers/gpu/drm/drm_prime.c|8 drivers/gpu/drm/exynos/exynos_drm_dmabuf.c |2 drivers/gpu/drm/i915/i915_gem_dmabuf.c |2 drivers/gpu/drm/nouveau/nouveau_drm.c |1 drivers/gpu/drm/nouveau/nouveau_gem.h |1 drivers/gpu/drm/nouveau/nouveau_prime.c|7 drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c |2 drivers/gpu/drm/radeon/radeon_drv.c|2 drivers/gpu/drm/radeon/radeon_prime.c |8 drivers/gpu/drm/ttm/ttm_object.c |2 drivers/media/v4l2-core/videobuf2-dma-contig.c |2 drivers/staging/android/Kconfig|1 drivers/staging/android/Makefile |2 drivers/staging/android/sw_sync.c |4 drivers/staging/android/sync.c | 895 drivers/staging/android/sync.h | 85 +- drivers/staging/android/sync_debug.c | 245 +++ drivers/staging/android/trace/sync.h | 12 include/drm/drmP.h |2 include/linux/dma-buf.h| 21 - include/linux/fence.h | 329 + include/linux/reservation.h| 18 include/linux/seqno-fence.h| 109 +++ include/linux/wait.h |1 include/trace/events/fence.h | 125 +++ kernel/sched/core.c|2 31 files changed, 1825 insertions(+), 666 deletions(-) create mode 100644 drivers/base/fence.c create mode 100644 drivers/staging/android/sync_debug.c create mode 100644 include/linux/fence.h create mode 100644 include/linux/seqno-fence.h create mode 100644 include/trace/events/fence.h -- Signature -- To unsubscribe from this 
list: send the line unsubscribe linux-media in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/7] sched: allow try_to_wake_up to be used internally outside of core.c
The kernel fence implementation doesn't use event queues, but needs to perform the same wake up. The symbol is not exported, since the fence implementation is not built as a module. Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com --- include/linux/wait.h | 1 + kernel/sched/core.c | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/include/linux/wait.h b/include/linux/wait.h index eaa00b10abaa..c54e3ef50134 100644 --- a/include/linux/wait.h +++ b/include/linux/wait.h @@ -12,6 +12,7 @@ typedef struct __wait_queue wait_queue_t; typedef int (*wait_queue_func_t)(wait_queue_t *wait, unsigned mode, int flags, void *key); int default_wake_function(wait_queue_t *wait, unsigned mode, int flags, void *key); +int try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags); struct __wait_queue { unsigned int flags; diff --git a/kernel/sched/core.c b/kernel/sched/core.c index a88f4a485c5e..f41d317042dd 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -1578,7 +1578,7 @@ static void ttwu_queue(struct task_struct *p, int cpu) * Return: %true if @p was woken up, %false if it was already running. * or @state didn't match @p's state. */ -static int +int try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags) { unsigned long flags;
[PATCH 2/7] fence: dma-buf cross-device synchronization (v16)
A fence can be attached to a buffer which is being filled or consumed by hw, to allow userspace to pass the buffer without waiting to another device. For example, userspace can call page_flip ioctl to display the next frame of graphics after kicking the GPU but while the GPU is still rendering. The display device sharing the buffer with the GPU would attach a callback to get notified when the GPU's rendering-complete IRQ fires, to update the scan-out address of the display, without having to wake up userspace. A driver must allocate a fence context for each execution ring that can run in parallel. The function for this takes an argument with how many contexts to allocate: + fence_context_alloc() A fence is transient, one-shot deal. It is allocated and attached to one or more dma-buf's. When the one that attached it is done, with the pending operation, it can signal the fence: + fence_signal() To have a rough approximation whether a fence is fired, call: + fence_is_signaled() The dma-buf-mgr handles tracking, and waiting on, the fences associated with a dma-buf. The one pending on the fence can add an async callback: + fence_add_callback() The callback can optionally be cancelled with: + fence_remove_callback() To wait synchronously, optionally with a timeout: + fence_wait() + fence_wait_timeout() When emitting a fence, call: + trace_fence_emit() To annotate that a fence is blocking on another fence, call: + trace_fence_annotate_wait_on(fence, on_fence) A default software-only implementation is provided, which can be used by drivers attaching a fence to a buffer when they have no other means for hw sync. But a memory backed fence is also envisioned, because it is common that GPU's can write to, or poll on some memory location for synchronization. For example: fence = custom_get_fence(...); if ((seqno_fence = to_seqno_fence(fence)) != NULL) { dma_buf *fence_buf = fence-sync_buf; get_dma_buf(fence_buf); ... tell the hw the memory location to wait ... 
custom_wait_on(fence_buf, fence->seqno_ofs, fence->seqno); } else { /* fall-back to sw sync */ fence_add_callback(fence, my_cb); } On SoC platforms, if some other hw mechanism is provided for synchronizing between IP blocks, it could be supported as an alternate implementation with its own fence ops in a similar way. enable_signaling callback is used to provide sw signaling in case a cpu waiter is requested or no compatible hardware signaling could be used. The intention is to provide a userspace interface (presumably via eventfd) later, to be used in conjunction with dma-buf's mmap support for sw access to buffers (or for userspace apps that would prefer to do their own synchronization). v1: Original v2: After discussion w/ danvet and mlankhorst on #dri-devel, we decided that dma-fence didn't need to care about the sw-hw signaling path (it can be handled same as sw-sw case), and therefore the fence-ops can be simplified and more handled in the core. So remove the signal, add_callback, cancel_callback, and wait ops, and replace with a simple enable_signaling() op which can be used to inform a fence supporting hw-hw signaling that one or more devices which do not support hw signaling are waiting (and therefore it should enable an irq or do whatever is necessary in order that the CPU is notified when the fence is passed). v3: Fix locking fail in attach_fence() and get_fence() v4: Remove tie-in w/ dma-buf.. after discussion w/ danvet and mlankhorst we decided that we need to be able to attach one fence to N dma-buf's, so using the list_head in dma-fence struct would be problematic. v5: [ Maarten Lankhorst ] Updated for dma-bikeshed-fence and dma-buf-manager. v6: [ Maarten Lankhorst ] I removed dma_fence_cancel_callback and some comments about checking if fence fired or not. This is broken by design. waitqueue_active during destruction is now fatal, since the signaller should be holding a reference in enable_signalling until it signalled the fence.
Pass the original dma_fence_cb along, and call __remove_wait in the dma_fence_callback handler, so that no cleanup needs to be performed. v7: [ Maarten Lankhorst ] Set cb-func and only enable sw signaling if fence wasn't signaled yet, for example for hardware fences that may choose to signal blindly. v8: [ Maarten Lankhorst ] Tons of tiny fixes, moved __dma_fence_init to header and fixed include mess. dma-fence.h now includes dma-buf.h All members are now initialized, so kmalloc can be used for allocating a dma-fence. More documentation added. v9: Change compiler bitfields to flags, change return type of enable_signaling to bool. Rework dma_fence_wait. Added dma_fence_is_signaled and dma_fence_wait_timeout. s/dma// and change exports to non GPL. Added fence_is_signaled and fence_enable_sw_signaling calls, add ability to override default wait
[PATCH 4/7] dma-buf: use reservation objects
This allows reservation objects to be used in dma-buf. It's required for implementing polling support on the fences that belong to a dma-buf. Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com --- drivers/base/dma-buf.c | 22 -- drivers/gpu/drm/drm_prime.c | 8 +++- drivers/gpu/drm/exynos/exynos_drm_dmabuf.c | 2 +- drivers/gpu/drm/i915/i915_gem_dmabuf.c | 2 +- drivers/gpu/drm/nouveau/nouveau_drm.c | 1 + drivers/gpu/drm/nouveau/nouveau_gem.h | 1 + drivers/gpu/drm/nouveau/nouveau_prime.c | 7 +++ drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c | 2 +- drivers/gpu/drm/radeon/radeon_drv.c | 2 ++ drivers/gpu/drm/radeon/radeon_prime.c | 8 drivers/gpu/drm/ttm/ttm_object.c | 2 +- drivers/media/v4l2-core/videobuf2-dma-contig.c | 2 +- include/drm/drmP.h | 2 ++ include/linux/dma-buf.h | 9 ++--- 14 files changed, 59 insertions(+), 11 deletions(-) diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c index 1e16cbd61da2..85e792c2c909 100644 --- a/drivers/base/dma-buf.c +++ b/drivers/base/dma-buf.c @@ -25,10 +25,12 @@ #include <linux/fs.h> #include <linux/slab.h> #include <linux/dma-buf.h> +#include <linux/fence.h> #include <linux/anon_inodes.h> #include <linux/export.h> #include <linux/debugfs.h> #include <linux/seq_file.h> +#include <linux/reservation.h> static inline int is_dma_buf_file(struct file *); @@ -56,6 +58,9 @@ static int dma_buf_release(struct inode *inode, struct file *file) list_del(&dmabuf->list_node); mutex_unlock(&db_list.lock); + if (dmabuf->resv == (struct reservation_object *)&dmabuf[1]) + reservation_object_fini(dmabuf->resv); + kfree(dmabuf); return 0; } @@ -128,6 +133,7 @@ static inline int is_dma_buf_file(struct file *file) * @size: [in] Size of the buffer * @flags: [in] mode flags for the file. * @exp_name: [in] name of the exporting module - useful for debugging. + * @resv: [in] reservation-object, NULL to allocate default one. * * Returns, on success, a newly created dma_buf object, which wraps the * supplied private data and operations for dma_buf_ops.
On either missing @@ -135,10 +141,17 @@ static inline int is_dma_buf_file(struct file *file) * */ struct dma_buf *dma_buf_export_named(void *priv, const struct dma_buf_ops *ops, - size_t size, int flags, const char *exp_name) + size_t size, int flags, const char *exp_name, + struct reservation_object *resv) { struct dma_buf *dmabuf; struct file *file; + size_t alloc_size = sizeof(struct dma_buf); + if (!resv) + alloc_size += sizeof(struct reservation_object); + else + /* prevent &dma_buf[1] == dma_buf->resv */ + alloc_size += 1; if (WARN_ON(!priv || !ops || !ops->map_dma_buf @@ -150,7 +163,7 @@ struct dma_buf *dma_buf_export_named(void *priv, const struct dma_buf_ops *ops, return ERR_PTR(-EINVAL); } - dmabuf = kzalloc(sizeof(struct dma_buf), GFP_KERNEL); + dmabuf = kzalloc(alloc_size, GFP_KERNEL); if (dmabuf == NULL) return ERR_PTR(-ENOMEM); @@ -158,6 +171,11 @@ struct dma_buf *dma_buf_export_named(void *priv, const struct dma_buf_ops *ops, dmabuf->ops = ops; dmabuf->size = size; dmabuf->exp_name = exp_name; + if (!resv) { + resv = (struct reservation_object *)&dmabuf[1]; + reservation_object_init(resv); + } + dmabuf->resv = resv; file = anon_inode_getfile("dmabuf", &dma_buf_fops, dmabuf, flags); if (IS_ERR(file)) { diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c index 56805c39c906..a13e90245adf 100644 --- a/drivers/gpu/drm/drm_prime.c +++ b/drivers/gpu/drm/drm_prime.c @@ -318,7 +318,13 @@ static const struct dma_buf_ops drm_gem_prime_dmabuf_ops = { struct dma_buf *drm_gem_prime_export(struct drm_device *dev, struct drm_gem_object *obj, int flags) { - return dma_buf_export(obj, &drm_gem_prime_dmabuf_ops, obj->size, flags); + struct reservation_object *robj = NULL; + + if (dev->driver->gem_prime_res_obj) + robj = dev->driver->gem_prime_res_obj(obj); + + return dma_buf_export(obj, &drm_gem_prime_dmabuf_ops, obj->size, + flags, robj); } EXPORT_SYMBOL(drm_gem_prime_export); diff --git a/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
b/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c index 59827cc5e770..b5e89f46326e 100644 --- a/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c +++ b/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c @@ -187,7 +187,7
[PATCH 6/7] reservation: add support for fences to enable cross-device synchronisation
Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com --- include/linux/reservation.h | 18 +- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/include/linux/reservation.h b/include/linux/reservation.h index 813dae960ebd..92c4851b5a39 100644 --- a/include/linux/reservation.h +++ b/include/linux/reservation.h @@ -6,7 +6,7 @@ * Copyright (C) 2012 Texas Instruments * * Authors: - * Rob Clark rob.cl...@linaro.org + * Rob Clark robdcl...@gmail.com * Maarten Lankhorst maarten.lankho...@canonical.com * Thomas Hellstrom thellstrom-at-vmware-dot-com * @@ -40,22 +40,38 @@ #define _LINUX_RESERVATION_H #include <linux/ww_mutex.h> +#include <linux/fence.h> extern struct ww_class reservation_ww_class; struct reservation_object { struct ww_mutex lock; + + struct fence *fence_excl; + struct fence **fence_shared; + u32 fence_shared_count, fence_shared_max; }; static inline void reservation_object_init(struct reservation_object *obj) { ww_mutex_init(&obj->lock, &reservation_ww_class); + + obj->fence_shared_count = obj->fence_shared_max = 0; + obj->fence_shared = NULL; + obj->fence_excl = NULL; } static inline void reservation_object_fini(struct reservation_object *obj) { + int i; + + if (obj->fence_excl) + fence_put(obj->fence_excl); + for (i = 0; i < obj->fence_shared_count; ++i) + fence_put(obj->fence_shared[i]); + ww_mutex_destroy(&obj->lock); }
[PATCH 5/7] android: convert sync to fence api, v3
Android syncpoints can be mapped to a timeline. This removes the need to maintain a separate api for synchronization. I've left the android trace events in place, but the core fence events should already be sufficient for debugging. v2: - Call fence_remove_callback in sync_fence_free if not all fences have fired. v3: - Merge Colin Cross' bugfixes, and the android fence merge optimization. Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com --- drivers/staging/android/Kconfig |1 drivers/staging/android/Makefile |2 drivers/staging/android/sw_sync.c|4 drivers/staging/android/sync.c | 895 +++--- drivers/staging/android/sync.h | 85 ++- drivers/staging/android/sync_debug.c | 245 + drivers/staging/android/trace/sync.h | 12 7 files changed, 592 insertions(+), 652 deletions(-) create mode 100644 drivers/staging/android/sync_debug.c diff --git a/drivers/staging/android/Kconfig b/drivers/staging/android/Kconfig index 1e9ab6dfc90d..c8d28dc3050c 100644 --- a/drivers/staging/android/Kconfig +++ b/drivers/staging/android/Kconfig @@ -77,6 +77,7 @@ config SYNC bool Synchronization framework default n select ANON_INODES + select DMA_SHARED_BUFFER ---help--- This option enables the framework for synchronization between multiple drivers. 
Sync implementations can take advantage of hardware diff --git a/drivers/staging/android/Makefile b/drivers/staging/android/Makefile index c136299e05af..81b8293ba1a9 100644 --- a/drivers/staging/android/Makefile +++ b/drivers/staging/android/Makefile @@ -7,5 +7,5 @@ obj-$(CONFIG_ANDROID_TIMED_OUTPUT) += timed_output.o obj-$(CONFIG_ANDROID_TIMED_GPIO) += timed_gpio.o obj-$(CONFIG_ANDROID_LOW_MEMORY_KILLER)+= lowmemorykiller.o obj-$(CONFIG_ANDROID_INTF_ALARM_DEV) += alarm-dev.o -obj-$(CONFIG_SYNC) += sync.o +obj-$(CONFIG_SYNC) += sync.o sync_debug.o obj-$(CONFIG_SW_SYNC) += sw_sync.o diff --git a/drivers/staging/android/sw_sync.c b/drivers/staging/android/sw_sync.c index f24493ac65e3..a76db3ff87cb 100644 --- a/drivers/staging/android/sw_sync.c +++ b/drivers/staging/android/sw_sync.c @@ -50,7 +50,7 @@ static struct sync_pt *sw_sync_pt_dup(struct sync_pt *sync_pt) { struct sw_sync_pt *pt = (struct sw_sync_pt *) sync_pt; struct sw_sync_timeline *obj = - (struct sw_sync_timeline *)sync_pt-parent; + (struct sw_sync_timeline *)sync_pt_parent(sync_pt); return (struct sync_pt *) sw_sync_pt_create(obj, pt-value); } @@ -59,7 +59,7 @@ static int sw_sync_pt_has_signaled(struct sync_pt *sync_pt) { struct sw_sync_pt *pt = (struct sw_sync_pt *)sync_pt; struct sw_sync_timeline *obj = - (struct sw_sync_timeline *)sync_pt-parent; + (struct sw_sync_timeline *)sync_pt_parent(sync_pt); return sw_sync_cmp(obj-value, pt-value) = 0; } diff --git a/drivers/staging/android/sync.c b/drivers/staging/android/sync.c index 38e5d3b5ed9b..ba0d69e269b5 100644 --- a/drivers/staging/android/sync.c +++ b/drivers/staging/android/sync.c @@ -31,22 +31,13 @@ #define CREATE_TRACE_POINTS #include trace/sync.h -static void sync_fence_signal_pt(struct sync_pt *pt); -static int _sync_pt_has_signaled(struct sync_pt *pt); -static void sync_fence_free(struct kref *kref); -static void sync_dump(void); - -static LIST_HEAD(sync_timeline_list_head); -static DEFINE_SPINLOCK(sync_timeline_list_lock); - -static 
LIST_HEAD(sync_fence_list_head); -static DEFINE_SPINLOCK(sync_fence_list_lock); +static const struct fence_ops android_fence_ops; +static const struct file_operations sync_fence_fops; struct sync_timeline *sync_timeline_create(const struct sync_timeline_ops *ops, int size, const char *name) { struct sync_timeline *obj; - unsigned long flags; if (size sizeof(struct sync_timeline)) return NULL; @@ -57,17 +48,14 @@ struct sync_timeline *sync_timeline_create(const struct sync_timeline_ops *ops, kref_init(obj-kref); obj-ops = ops; + obj-context = fence_context_alloc(1); strlcpy(obj-name, name, sizeof(obj-name)); INIT_LIST_HEAD(obj-child_list_head); - spin_lock_init(obj-child_list_lock); - INIT_LIST_HEAD(obj-active_list_head); - spin_lock_init(obj-active_list_lock); + spin_lock_init(obj-child_list_lock); - spin_lock_irqsave(sync_timeline_list_lock, flags); - list_add_tail(obj-sync_timeline_list, sync_timeline_list_head); - spin_unlock_irqrestore(sync_timeline_list_lock, flags); + sync_timeline_debug_add(obj); return obj; } @@ -77,18 +65,25 @@ static void sync_timeline_free(struct kref *kref) { struct sync_timeline *obj = container_of(kref, struct sync_timeline, kref
[PATCH 7/7] dma-buf: add poll support
Thanks to Fengguang Wu for spotting a missing static cast.

Signed-off-by: Maarten Lankhorst <maarten.lankho...@canonical.com>
---
 drivers/base/dma-buf.c  | 102 +++
 include/linux/dma-buf.h |  12 ++
 2 files changed, 114 insertions(+)

diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c
index 85e792c2c909..e0898d49530e 100644
--- a/drivers/base/dma-buf.c
+++ b/drivers/base/dma-buf.c
@@ -30,6 +30,7 @@
 #include <linux/export.h>
 #include <linux/debugfs.h>
 #include <linux/seq_file.h>
+#include <linux/poll.h>
 #include <linux/reservation.h>
 
 static inline int is_dma_buf_file(struct file *);
@@ -52,6 +53,13 @@ static int dma_buf_release(struct inode *inode, struct file *file)
 
 	BUG_ON(dmabuf->vmapping_counter);
 
+	/*
+	 * Any fences that a dma-buf poll can wait on should be signaled
+	 * before releasing dma-buf. This is the responsibility of each
+	 * driver that uses the reservation objects.
+	 */
+	BUG_ON(dmabuf->cb_shared.active || dmabuf->cb_excl.active);
+
 	dmabuf->ops->release(dmabuf);
 
 	mutex_lock(&db_list.lock);
@@ -108,10 +116,100 @@ static loff_t dma_buf_llseek(struct file *file, loff_t offset, int whence)
 	return base + offset;
 }
 
+static void dma_buf_poll_cb(struct fence *fence, struct fence_cb *cb)
+{
+	struct dma_buf_poll_cb_t *dcb = (struct dma_buf_poll_cb_t *)cb;
+	unsigned long flags;
+
+	spin_lock_irqsave(&dcb->poll->lock, flags);
+	wake_up_locked_poll(dcb->poll, dcb->active);
+	dcb->active = 0;
+	spin_unlock_irqrestore(&dcb->poll->lock, flags);
+}
+
+static unsigned int dma_buf_poll(struct file *file, poll_table *poll)
+{
+	struct dma_buf *dmabuf;
+	struct reservation_object *resv;
+	unsigned long events;
+	int need_shared;
+
+	dmabuf = file->private_data;
+	if (!dmabuf || !dmabuf->resv)
+		return POLLERR;
+
+	resv = dmabuf->resv;
+
+	poll_wait(file, &dmabuf->poll, poll);
+
+	events = poll_requested_events(poll) & (POLLIN | POLLOUT);
+	if (!events)
+		return 0;
+
+	ww_mutex_lock(&resv->lock, NULL);
+
+	if (resv->fence_excl && (!(events & POLLOUT) ||
+				 resv->fence_shared_count == 0)) {
+		struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_excl;
+		unsigned long pevents = POLLIN;
+
+		if (resv->fence_shared_count == 0)
+			pevents |= POLLOUT;
+
+		spin_lock_irq(&dmabuf->poll.lock);
+		if (dcb->active) {
+			dcb->active |= pevents;
+			events &= ~pevents;
+		} else
+			dcb->active = pevents;
+		spin_unlock_irq(&dmabuf->poll.lock);
+
+		if (events & pevents) {
+			if (!fence_add_callback(resv->fence_excl,
+						&dcb->cb, dma_buf_poll_cb))
+				events &= ~pevents;
+			else
+				/* No callback queued, wake up any additional waiters. */
+				dma_buf_poll_cb(NULL, &dcb->cb);
+		}
+	}
+
+	if ((events & POLLOUT) && resv->fence_shared_count > 0) {
+		struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_shared;
+		int i;
+
+		/* Only queue a new callback if no event has fired yet */
+		spin_lock_irq(&dmabuf->poll.lock);
+		if (dcb->active)
+			events &= ~POLLOUT;
+		else
+			dcb->active = POLLOUT;
+		spin_unlock_irq(&dmabuf->poll.lock);
+
+		if (!(events & POLLOUT))
+			goto out;
+
+		for (i = 0; i < resv->fence_shared_count; ++i)
+			if (!fence_add_callback(resv->fence_shared[i],
+						&dcb->cb, dma_buf_poll_cb)) {
+				events &= ~POLLOUT;
+				break;
+			}
+
+		/* No callback queued, wake up any additional waiters. */
+		if (i == resv->fence_shared_count)
+			dma_buf_poll_cb(NULL, &dcb->cb);
+	}
+
+out:
+	ww_mutex_unlock(&resv->lock);
+	return events;
+}
+
 static const struct file_operations dma_buf_fops = {
 	.release	= dma_buf_release,
 	.mmap		= dma_buf_mmap_internal,
 	.llseek		= dma_buf_llseek,
+	.poll		= dma_buf_poll,
 };
 
 /*
@@ -171,6 +269,10 @@ struct dma_buf *dma_buf_export_named(void *priv, const struct dma_buf_ops *ops,
 	dmabuf->ops = ops;
 	dmabuf->size = size;
 	dmabuf->exp_name = exp_name;
+	init_waitqueue_head(&dmabuf->poll);
+	dmabuf->cb_excl.poll = dmabuf->cb_shared.poll = &dmabuf->poll;
+	dmabuf->cb_excl.active = dmabuf->cb_shared.active = 0;
+
 	if (!resv) {
 		resv = (struct
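The poll logic in this patch boils down to a simple rule: POLLIN (readable) is blocked while the exclusive fence is unsignaled, and POLLOUT (writable) is blocked while the exclusive fence or any shared fence is unsignaled. A small userspace C model of just that event-gating decision — the callback/waitqueue machinery is deliberately left out, and the function name is illustrative:

```c
#include <assert.h>
#include <poll.h>

/*
 * Which of the requested POLLIN/POLLOUT bits must still block, given the
 * reservation state: an unsignaled exclusive (write) fence gates both
 * readers and writers; unsignaled shared (read) fences gate writers only.
 */
static unsigned poll_blocked(unsigned requested,
                             int has_unsignaled_excl,
                             unsigned unsignaled_shared)
{
    unsigned blocked = 0;

    if (has_unsignaled_excl)
        blocked |= requested & (POLLIN | POLLOUT);
    if (unsignaled_shared > 0)
        blocked |= requested & POLLOUT;

    return blocked;
}
```

This also explains the two callback slots in the patch: `cb_excl` fires when the exclusive fence signals (unblocking POLLIN, and POLLOUT when there are no shared fences), while `cb_shared` fires for the shared fences gating POLLOUT.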
Re: [RFC PATCH] drm/radeon: rework to new fence interface
On 20-08-13 at 10:37, Christian König wrote: On 19.08.2013 at 21:37, Maarten Lankhorst wrote: On 19-08-13 at 14:35, Christian König wrote: On 19.08.2013 at 12:17, Maarten Lankhorst wrote: [SNIP] @@ -190,25 +225,24 @@ void radeon_fence_process(struct radeon_device *rdev, int ring) } } while (atomic64_xchg(&rdev->fence_drv[ring].last_seq, seq) > seq); - if (wake) { + if (wake) rdev->fence_drv[ring].last_activity = jiffies; - wake_up_all(&rdev->fence_queue); - } + return wake; } Very bad idea: when sequence numbers change, you always want to wake up the whole fence queue here. Yes, and the callers of this function call wake_up_all or wake_up_all_locked themselves, based on the return value. And as I said, that's a very bad idea. The fence processing shouldn't be called with any locks held, and should itself be responsible for activating any waiters. The call point (enable_signaling) only needs to know whether its own counter has passed or not. This prevents the race where the counter has elapsed but the irq was not yet enabled. I don't really care whether enable_signaling updates last_seq or not; it only needs to check whether its own fence has been signaled after enabling sw irqs. [SNIP] +/** + * radeon_fence_enable_signaling - enable signalling on fence + * @fence: fence + * + * This function is called with the fence_queue lock held, and adds a callback + * to fence_queue that checks if this fence is signaled, and if so it + * signals the fence and removes itself. + */ +static bool radeon_fence_enable_signaling(struct fence *f) +{ + struct radeon_fence *fence = to_radeon_fence(f); + + if (atomic64_read(&fence->rdev->fence_drv[fence->ring].last_seq) >= fence->seq || + !fence->rdev->ddev->irq_enabled) + return false; + Do I get that right, that you rely on IRQs to be enabled and working here? Because that would be quite a bad idea from the conceptual side.
For cross-device synchronization it would be nice to have working irqs: it allows signalling fences faster, and it allows completion callbacks to be called. For internal usage it's no more required than it was before. That's a big NAK. The fence processing is actually very finely tuned to avoid IRQs, and as far as I can see you just leave them enabled by decrementing the atomic from IRQ context. In addition to that, we need a lot of special handling in case of a hardware lockup here, which isn't done if you abuse the fence interface like this. I think it's not needed to leave the irq enabled; it's a leftover from when I was debugging on the Mac and no interrupt occurred at all. Also, your approach of leaking the IRQ context outside of the driver is a very bad idea from the conceptual side. Please don't modify the fence interface at all and instead use the wait functions already exposed by radeon_fence.c. If you need some kind of signaling mechanism, then wait inside a workqueue instead. The fence takes up the role of a single-shot workqueue here. Manually resetting the counter and calling wake_up_all would end up waking all active fences; there's no special handling needed inside radeon for this. The fence api does provide a synchronous wait function, but this causes a stall of whoever waits on it. When I was testing this with Intel I used the fence callback to poke a register in i915; this allowed it not to block until it hits the wait op in the command stream, and even then only if the callback was not called first. It's documented that the callbacks can be called from any context and will be called with irqs disabled, so nothing scary should be done. The kernel provides enough debug mechanisms to find any violators: PROVE_LOCKING and DEBUG_ATOMIC_SLEEP, for example. ~Maarten
Re: [RFC PATCH] drm/radeon: rework to new fence interface
On 20-08-13 at 11:51, Christian König wrote: On 20.08.2013 at 11:36, Maarten Lankhorst wrote: [SNIP] [SNIP] +/** + * radeon_fence_enable_signaling - enable signalling on fence + * @fence: fence + * + * This function is called with the fence_queue lock held, and adds a callback + * to fence_queue that checks if this fence is signaled, and if so it + * signals the fence and removes itself. + */ +static bool radeon_fence_enable_signaling(struct fence *f) +{ + struct radeon_fence *fence = to_radeon_fence(f); + + if (atomic64_read(&fence->rdev->fence_drv[fence->ring].last_seq) >= fence->seq || + !fence->rdev->ddev->irq_enabled) + return false; + Do I get that right, that you rely on IRQs to be enabled and working here? Because that would be quite a bad idea from the conceptual side. For cross-device synchronization it would be nice to have working irqs: it allows signalling fences faster, and it allows completion callbacks to be called. For internal usage it's no more required than it was before. That's a big NAK. The fence processing is actually very finely tuned to avoid IRQs, and as far as I can see you just leave them enabled by decrementing the atomic from IRQ context. In addition to that, we need a lot of special handling in case of a hardware lockup here, which isn't done if you abuse the fence interface like this. I think it's not needed to leave the irq enabled; it's a leftover from when I was debugging on the Mac and no interrupt occurred at all. Also, your approach of leaking the IRQ context outside of the driver is a very bad idea from the conceptual side. Please don't modify the fence interface at all and instead use the wait functions already exposed by radeon_fence.c. If you need some kind of signaling mechanism, then wait inside a workqueue instead. The fence takes up the role of a single-shot workqueue here. Manually resetting the counter and calling wake_up_all would end up waking all active fences; there's no special handling needed inside radeon for this.
Yeah, that's actually the point here: you NEED to activate ALL fences, otherwise the fence handling inside the driver won't work. It's done in a lazy fashion. If there's no need for an activated fence, the interrupt will not be enabled. The fence api does provide a synchronous wait function, but this causes a stall of whoever waits on it. Which is perfectly fine. What actually is the use case of not stalling a process that wants to wait for something? Does radeon call ttm_bo_wait on all bo's before doing a command submission? No? Why should other drivers do that? When I was testing this with Intel I used the fence callback to poke a register in i915; this allowed it not to block until it hits the wait op in the command stream, and even then only if the callback was not called first. It's documented that the callbacks can be called from any context and will be called with irqs disabled, so nothing scary should be done. The kernel provides enough debug mechanisms to find any violators: PROVE_LOCKING and DEBUG_ATOMIC_SLEEP, for example. No thanks, we even abandoned that concept internally in the driver. Please use the blocking wait functions instead. No, this just stalls all gpu's that share a bo. The idea is to provide a standardized api so bo's can be synchronized without stalling. The first step to this is ww_mutex. If this lock is shared between multiple gpu's, the same object can be reserved between multiple devices without causing a deadlock with circular dependencies. With some small patches it's possible to do this already between multiple drivers that use ttm. ttm_bo_reserve, ttm_bo_unreserve and all the other code dealing with ttm reservations have been converted to use ww_mutex locking. Fencing is the next step. When all buffers are locked, a callback should be added to any previous fence, and a single new fence signaling completion of the command submission should be placed on all locked objects.
Because the common path is that no objects are shared, the callback and FIFO stalling will only be needed for dma-bufs. When all callbacks have fired, the FIFO can be unblocked. This prevents having to sync the gpu to the cpu. If a bo is submitted to one gpu and then immediately to another, it will not stall unless needed. For example, in an Optimus configuration an application could copy a rendered frame from VRAM to a shared dma-buf (xorg's buffer), then have Xorg copy it again (on Intel's gpu) from the dma-buf to a framebuffer. ~Maarten
[RFC PATCH] drm/nouveau: rework to new fence interface
nouveau was a bit tricky, it has no support for interrupts on nv84, so I added an extra call to nouveau_fence_update in nouveau_fence_emit to increase the chance slightly that deferred work gets triggered. This patch depends on the vblank locking fix for the definitions of nouveau_event_enable_locked and nouveau_event_disable_locked. Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com --- diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c index be31499..78714e4 100644 --- a/drivers/gpu/drm/nouveau/nouveau_fence.c +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c @@ -35,88 +35,115 @@ #include engine/fifo.h -struct fence_work { - struct work_struct base; - struct list_head head; - void (*func)(void *); - void *data; -}; +static const struct fence_ops nouveau_fence_ops_uevent; +static const struct fence_ops nouveau_fence_ops_legacy; static void nouveau_fence_signal(struct nouveau_fence *fence) { - struct fence_work *work, *temp; + __fence_signal(fence-base); + list_del(fence-head); - list_for_each_entry_safe(work, temp, fence-work, head) { - schedule_work(work-base); - list_del(work-head); + if (fence-base.ops == nouveau_fence_ops_uevent + fence-event.head.next) { + struct nouveau_event *event; + + list_del(fence-event.head); + fence-event.head.next = NULL; + + event = container_of(fence-base.lock, typeof(*event), lock); + if (!--event-index[0].refs) + event-disable(event, 0); } - fence-channel = NULL; - list_del(fence-head); + fence_put(fence-base); } void nouveau_fence_context_del(struct nouveau_fence_chan *fctx) { struct nouveau_fence *fence, *fnext; - spin_lock(fctx-lock); - list_for_each_entry_safe(fence, fnext, fctx-pending, head) { + + spin_lock_irq(fctx-lock); + list_for_each_entry_safe(fence, fnext, fctx-pending, head) nouveau_fence_signal(fence); - } - spin_unlock(fctx-lock); + spin_unlock_irq(fctx-lock); } void -nouveau_fence_context_new(struct nouveau_fence_chan *fctx) +nouveau_fence_context_new(struct 
nouveau_channel *chan, struct nouveau_fence_chan *fctx) { + struct nouveau_fifo *pfifo = nouveau_fifo(chan-drm-device); + + fctx-lock = pfifo-uevent-lock; INIT_LIST_HEAD(fctx-flip); INIT_LIST_HEAD(fctx-pending); - spin_lock_init(fctx-lock); } +struct nouveau_fence_work { + struct work_struct work; + struct fence_cb cb; + void (*func)(void *); + void *data; +}; + static void nouveau_fence_work_handler(struct work_struct *kwork) { - struct fence_work *work = container_of(kwork, typeof(*work), base); + struct nouveau_fence_work *work = container_of(kwork, typeof(*work), work); work-func(work-data); kfree(work); } +static void nouveau_fence_work_cb(struct fence *fence, struct fence_cb *cb) +{ + struct nouveau_fence_work *work = container_of(cb, typeof(*work), cb); + + schedule_work(work-work); +} + +/* + * In an ideal world, read would not assume the channel context is still alive. + * This function may be called from another device, running into free memory as a + * result. The drm node should still be there, so we can derive the index from + * the fence context. 
+ */ +static bool nouveau_fence_is_signaled(struct fence *f) +{ + struct nouveau_fence *fence = container_of(f, struct nouveau_fence, base); + struct nouveau_channel *chan = fence-channel; + struct nouveau_fence_chan *fctx = chan-fence; + + return (int)(fctx-read(chan) - fence-base.seqno) = 0; +} + void nouveau_fence_work(struct nouveau_fence *fence, void (*func)(void *), void *data) { - struct nouveau_channel *chan = fence-channel; - struct nouveau_fence_chan *fctx; - struct fence_work *work = NULL; + struct nouveau_fence_work *work; - if (nouveau_fence_done(fence)) { - func(data); - return; - } + if (fence_is_signaled(fence-base)) + goto err; - fctx = chan-fence; work = kmalloc(sizeof(*work), GFP_KERNEL); if (!work) { WARN_ON(nouveau_fence_wait(fence, false, false)); - func(data); - return; - } - - spin_lock(fctx-lock); - if (!fence-channel) { - spin_unlock(fctx-lock); - kfree(work); - func(data); - return; + goto err; } - INIT_WORK(work-base, nouveau_fence_work_handler); + INIT_WORK(work-work, nouveau_fence_work_handler); work-func = func; work-data = data; - list_add(work-head, fence-work); - spin_unlock(fctx-lock
[RFC PATCH] drm/radeon: rework to new fence interface
Signed-off-by: Maarten Lankhorst <maarten.lankho...@canonical.com> --- diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index 9f19259..971284e 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -64,6 +64,7 @@ #include <linux/wait.h> #include <linux/list.h> #include <linux/kref.h> +#include <linux/fence.h> #include <ttm/ttm_bo_api.h> #include <ttm/ttm_bo_driver.h> @@ -114,9 +115,6 @@ extern int radeon_aspm; /* max number of rings */ #define RADEON_NUM_RINGS 6 -/* fence seq are set to this number when signaled */ -#define RADEON_FENCE_SIGNALED_SEQ 0LL - /* internal ring indices */ /* r1xx+ has gfx CP ring */ #define RADEON_RING_TYPE_GFX_INDEX 0 @@ -285,12 +283,15 @@ struct radeon_fence_driver { }; struct radeon_fence { + struct fence base; + struct radeon_device *rdev; - struct kref kref; /* protected by radeon_fence.lock */ uint64_t seq; /* RB, DMA, etc. */ unsigned ring; + + wait_queue_t fence_wake; }; int radeon_fence_driver_start_ring(struct radeon_device *rdev, int ring); @@ -2039,6 +2040,7 @@ struct radeon_device { struct radeon_mman mman; struct radeon_fence_driver fence_drv[RADEON_NUM_RINGS]; wait_queue_head_t fence_queue; + unsigned fence_context; struct mutex ring_lock; struct radeon_ring ring[RADEON_NUM_RINGS]; bool ib_pool_ready; @@ -2117,11 +2119,6 @@ u32 cik_mm_rdoorbell(struct radeon_device *rdev, u32 offset); void cik_mm_wdoorbell(struct radeon_device *rdev, u32 offset, u32 v); /* - * Cast helper - */ -#define to_radeon_fence(p) ((struct radeon_fence *)(p)) - -/* * Registers read write functions.
*/ #define RREG8(reg) readb((rdev->rmmio) + (reg)) diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c index 63398ae..d76a187 100644 --- a/drivers/gpu/drm/radeon/radeon_device.c +++ b/drivers/gpu/drm/radeon/radeon_device.c @@ -1150,6 +1150,7 @@ int radeon_device_init(struct radeon_device *rdev, for (i = 0; i < RADEON_NUM_RINGS; i++) { rdev->ring[i].idx = i; } + rdev->fence_context = fence_context_alloc(RADEON_NUM_RINGS); DRM_INFO("initializing kernel modesetting (%s 0x%04X:0x%04X 0x%04X:0x%04X).\n", radeon_family_name[rdev->family], pdev->vendor, pdev->device, diff --git a/drivers/gpu/drm/radeon/radeon_fence.c b/drivers/gpu/drm/radeon/radeon_fence.c index ddb8f8e..92a1576 100644 --- a/drivers/gpu/drm/radeon/radeon_fence.c +++ b/drivers/gpu/drm/radeon/radeon_fence.c @@ -39,6 +39,15 @@ #include "radeon.h" #include "radeon_trace.h" +static const struct fence_ops radeon_fence_ops; + +#define to_radeon_fence(p) \ + ({ \ + struct radeon_fence *__f; \ + __f = container_of((p), struct radeon_fence, base); \ + __f->base.ops == &radeon_fence_ops ?
__f : NULL; \ + }) + /* * Fences * Fences mark an event in the GPUs pipeline and are used @@ -111,14 +120,17 @@ int radeon_fence_emit(struct radeon_device *rdev, struct radeon_fence **fence, int ring) { + u64 seq = ++rdev->fence_drv[ring].sync_seq[ring]; + /* we are protected by the ring emission mutex */ *fence = kmalloc(sizeof(struct radeon_fence), GFP_KERNEL); if ((*fence) == NULL) { return -ENOMEM; } - kref_init(&((*fence)->kref)); + __fence_init(&(*fence)->base, &radeon_fence_ops, + &rdev->fence_queue.lock, rdev->fence_context + ring, seq); (*fence)->rdev = rdev; - (*fence)->seq = ++rdev->fence_drv[ring].sync_seq[ring]; + (*fence)->seq = seq; (*fence)->ring = ring; radeon_fence_ring_emit(rdev, ring, *fence); trace_radeon_fence_emit(rdev->ddev, (*fence)->seq); @@ -126,15 +138,38 @@ int radeon_fence_emit(struct radeon_device *rdev, } /** - * radeon_fence_process - process a fence + * radeon_fence_check_signaled - callback from fence_queue - * - * @rdev: radeon_device pointer - * @ring: ring index the fence is associated with - * - * Checks the current fence value and wakes the fence queue - * if the sequence number has increased (all asics). + * this function is called with fence_queue lock held, which is also used + * for the fence locking itself, so unlocked variants are used for + * fence_signal, and remove_wait_queue. */ -void radeon_fence_process(struct radeon_device *rdev, int ring) +static int radeon_fence_check_signaled(wait_queue_t *wait, unsigned mode, int flags, void *key) +{ + struct radeon_fence *fence
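The to_radeon_fence() macro in this patch replaces the old raw cast with a checked downcast: container_of() plus a comparison of the ops pointer, so that a fence created by another driver yields NULL instead of a bogus pointer. Here is a self-contained userspace C sketch of the same pattern — the struct and ops names are illustrative, and GCC statement expressions are assumed, as in the kernel:

```c
#include <assert.h>
#include <stddef.h>

/* Generic container_of, as in the kernel. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct fence_ops { const char *name; };
struct fence { const struct fence_ops *ops; };

static const struct fence_ops radeon_fence_ops = { "radeon" };
static const struct fence_ops other_fence_ops  = { "other" };

/* Driver-specific fence embedding the generic one. */
struct radeon_fence {
    struct fence base;
    int ring;
};

/*
 * Checked downcast: only fences whose ops pointer identifies this
 * driver are converted; anything else yields NULL.
 */
#define to_radeon_fence(p)                                        \
    ({                                                            \
        struct radeon_fence *__f =                                \
            container_of((p), struct radeon_fence, base);         \
        __f->base.ops == &radeon_fence_ops ? __f : NULL;          \
    })
```

The ops pointer doubles as a cheap type tag here: every fence already carries it for dispatch, so the check costs one pointer comparison and no extra storage.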
Re: [RFC PATCH] drm/radeon: rework to new fence interface
On 19-08-13 at 14:35, Christian König wrote: On 19.08.2013 at 12:17, Maarten Lankhorst wrote: [SNIP] @@ -190,25 +225,24 @@ void radeon_fence_process(struct radeon_device *rdev, int ring) } } while (atomic64_xchg(&rdev->fence_drv[ring].last_seq, seq) > seq); - if (wake) { + if (wake) rdev->fence_drv[ring].last_activity = jiffies; - wake_up_all(&rdev->fence_queue); - } + return wake; } Very bad idea: when sequence numbers change, you always want to wake up the whole fence queue here. Yes, and the callers of this function call wake_up_all or wake_up_all_locked themselves, based on the return value. [SNIP] +/** + * radeon_fence_enable_signaling - enable signalling on fence + * @fence: fence + * + * This function is called with the fence_queue lock held, and adds a callback + * to fence_queue that checks if this fence is signaled, and if so it + * signals the fence and removes itself. + */ +static bool radeon_fence_enable_signaling(struct fence *f) +{ + struct radeon_fence *fence = to_radeon_fence(f); + + if (atomic64_read(&fence->rdev->fence_drv[fence->ring].last_seq) >= fence->seq || + !fence->rdev->ddev->irq_enabled) + return false; + Do I get that right, that you rely on IRQs to be enabled and working here? Because that would be quite a bad idea from the conceptual side. For cross-device synchronization it would be nice to have working irqs: it allows signalling fences faster, and it allows completion callbacks to be called. For internal usage it's no more required than it was before. + radeon_irq_kms_sw_irq_get(fence->rdev, fence->ring); + + if (__radeon_fence_process(fence->rdev, fence->ring)) + wake_up_all_locked(&fence->rdev->fence_queue); + + /* did fence get signaled after we enabled the sw irq?
*/ + if (atomic64_read(&fence->rdev->fence_drv[fence->ring].last_seq) >= fence->seq) { + radeon_irq_kms_sw_irq_put(fence->rdev, fence->ring); + return false; + } + + fence->fence_wake.flags = 0; + fence->fence_wake.private = NULL; + fence->fence_wake.func = radeon_fence_check_signaled; + __add_wait_queue(&fence->rdev->fence_queue, &fence->fence_wake); + fence_get(f); + + return true; +} + /** * radeon_fence_signaled - check if a fence has signaled * Christian. ~Maarten
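The ordering being debated in the snippet above — enable the sw irq first, then re-check the sequence number — is what closes the race where the fence passes before the interrupt is on. Below is a single-threaded userspace C model of that ordering; all names are illustrative stand-ins (the real code uses atomic64 reads, irq refcounting via radeon_irq_kms_sw_irq_get/put, and the fence_queue lock):

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative per-ring state. */
struct ring {
    uint64_t last_seq;   /* stands in for atomic64 last_seq */
    int irq_refs;        /* stands in for the sw-irq refcount */
    int waiters;         /* stands in for entries on the fence_queue */
};

/*
 * Model of enable_signaling: returns 1 if a waiter was installed,
 * 0 if the fence had already passed and no signaling is needed.
 */
static int enable_signaling(struct ring *r, uint64_t fence_seq)
{
    if (r->last_seq >= fence_seq)
        return 0;            /* already signaled, nothing to enable */

    r->irq_refs++;           /* radeon_irq_kms_sw_irq_get() */

    /*
     * Re-check AFTER enabling the irq: the seqno may have passed in
     * the window before the interrupt was on, in which case no irq
     * will ever fire for this fence and we must not wait for one.
     */
    if (r->last_seq >= fence_seq) {
        r->irq_refs--;       /* radeon_irq_kms_sw_irq_put() */
        return 0;
    }

    r->waiters++;            /* __add_wait_queue(...) + fence_get() */
    return 1;
}
```

The check-enable-recheck shape is the point: checking only before enabling leaves a lost-wakeup window, and checking only after enables the irq needlessly for fences that already completed.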
Re: [RFC PATCH] fence: dma-buf cross-device synchronization (v12)
On 12-08-13 at 17:43, Rob Clark wrote: On Mon, Jul 29, 2013 at 10:05 AM, Maarten Lankhorst maarten.lankho...@canonical.com wrote: A fence can be attached to a buffer which is being filled or consumed by hw, to allow userspace to pass the buffer without waiting to another device. For example, userspace can call the page_flip ioctl to display the next frame of graphics after kicking the GPU, but while the GPU is still rendering. The display device sharing the buffer with the GPU would attach a callback to get notified when the GPU's rendering-complete IRQ fires, to update the scan-out address of the display, without having to wake up userspace. A driver must allocate a fence context for each execution ring that can run in parallel. The function for this takes an argument with how many contexts to allocate: + fence_context_alloc() A fence is a transient, one-shot deal. It is allocated and attached to one or more dma-buf's. When the one that attached it is done with the pending operation, it can signal the fence: + fence_signal() To have a rough approximation whether a fence is fired, call: + fence_is_signaled() The dma-buf-mgr handles tracking, and waiting on, the fences associated with a dma-buf. The one pending on the fence can add an async callback: + fence_add_callback() The callback can optionally be cancelled with: + fence_remove_callback() To wait synchronously, optionally with a timeout: + fence_wait() + fence_wait_timeout() A default software-only implementation is provided, which can be used by drivers attaching a fence to a buffer when they have no other means for hw sync. But a memory-backed fence is also envisioned, because it is common that GPUs can write to, or poll on, some memory location for synchronization. For example: fence = custom_get_fence(...); if ((seqno_fence = to_seqno_fence(fence)) != NULL) { dma_buf *fence_buf = fence->sync_buf; get_dma_buf(fence_buf); ... tell the hw the memory location to wait ...
custom_wait_on(fence_buf, fence->seqno_ofs, fence->seqno); } else { /* fall back to sw sync */ fence_add_callback(fence, my_cb); } On SoC platforms, if some other hw mechanism is provided for synchronizing between IP blocks, it could be supported as an alternate implementation with its own fence ops in a similar way. The enable_signaling callback is used to provide sw signaling in case a cpu waiter is requested or no compatible hardware signaling could be used. The intention is to provide a userspace interface (presumably via eventfd) later, to be used in conjunction with dma-buf's mmap support for sw access to buffers (or for userspace apps that would prefer to do their own synchronization). v1: Original v2: After discussion w/ danvet and mlankhorst on #dri-devel, we decided that dma-fence didn't need to care about the sw->hw signaling path (it can be handled the same as the sw->sw case), and therefore the fence-ops can be simplified and more handled in the core. So remove the signal, add_callback, cancel_callback, and wait ops, and replace them with a simple enable_signaling() op which can be used to inform a fence supporting hw->hw signaling that one or more devices which do not support hw signaling are waiting (and that it should therefore enable an irq or do whatever is necessary so that the CPU is notified when the fence is passed). v3: Fix locking fail in attach_fence() and get_fence() v4: Remove tie-in w/ dma-buf.. after discussion w/ danvet and mlankhorst we decided that we need to be able to attach one fence to N dma-buf's, so using the list_head in the dma-fence struct would be problematic. v5: [ Maarten Lankhorst ] Updated for dma-bikeshed-fence and dma-buf-manager. v6: [ Maarten Lankhorst ] I removed dma_fence_cancel_callback and some comments about checking if the fence fired or not. This is broken by design: waitqueue_active during destruction is now fatal, since the signaller should be holding a reference in enable_signalling until it has signalled the fence.
Pass the original dma_fence_cb along, and call __remove_wait in the dma_fence_callback handler, so that no cleanup needs to be performed. v7: [ Maarten Lankhorst ] Set cb->func and only enable sw signaling if the fence wasn't signaled yet, for example for hardware fences that may choose to signal blindly. v8: [ Maarten Lankhorst ] Tons of tiny fixes, moved __dma_fence_init to the header and fixed the include mess. dma-fence.h now includes dma-buf.h. All members are now initialized, so kmalloc can be used for allocating a dma-fence. More documentation added. v9: Change compiler bitfields to flags, change return type of enable_signaling to bool. Rework dma_fence_wait. Added dma_fence_is_signaled and dma_fence_wait_timeout. s/dma// and change exports to non-GPL. Added fence_is_signaled
[PATCH] fence: dma-buf cross-device synchronization (v13)
A fence can be attached to a buffer which is being filled or consumed by hw, to allow userspace to pass the buffer without waiting to another device. For example, userspace can call the page_flip ioctl to display the next frame of graphics after kicking the GPU, but while the GPU is still rendering. The display device sharing the buffer with the GPU would attach a callback to get notified when the GPU's rendering-complete IRQ fires, to update the scan-out address of the display, without having to wake up userspace.

A driver must allocate a fence context for each execution ring that can run in parallel. The function for this takes an argument with how many contexts to allocate:
+ fence_context_alloc()

A fence is a transient, one-shot deal. It is allocated and attached to one or more dma-buf's. When the one that attached it is done with the pending operation, it can signal the fence:
+ fence_signal()

To have a rough approximation whether a fence is fired, call:
+ fence_is_signaled()

The dma-buf-mgr handles tracking, and waiting on, the fences associated with a dma-buf. The one pending on the fence can add an async callback:
+ fence_add_callback()

The callback can optionally be cancelled with:
+ fence_remove_callback()

To wait synchronously, optionally with a timeout:
+ fence_wait()
+ fence_wait_timeout()

A default software-only implementation is provided, which can be used by drivers attaching a fence to a buffer when they have no other means for hw sync. But a memory-backed fence is also envisioned, because it is common that GPUs can write to, or poll on, some memory location for synchronization. For example:

    fence = custom_get_fence(...);
    if ((seqno_fence = to_seqno_fence(fence)) != NULL) {
        struct dma_buf *fence_buf = fence->sync_buf;
        get_dma_buf(fence_buf);

        ... tell the hw the memory location to wait ...
        custom_wait_on(fence_buf, fence->seqno_ofs, fence->seqno);
    } else {
        /* fall back to sw sync */
        fence_add_callback(fence, my_cb);
    }

On SoC platforms, if some other hw mechanism is provided for synchronizing between IP blocks, it could be supported as an alternate implementation with its own fence ops in a similar way.

The enable_signaling callback is used to provide sw signaling in case a cpu waiter is requested or no compatible hardware signaling could be used. The intention is to provide a userspace interface (presumably via eventfd) later, to be used in conjunction with dma-buf's mmap support for sw access to buffers (or for userspace apps that would prefer to do their own synchronization).

v1: Original
v2: After discussion w/ danvet and mlankhorst on #dri-devel, we decided that dma-fence didn't need to care about the sw->hw signaling path (it can be handled same as the sw->sw case), and therefore the fence->ops can be simplified and more handled in the core. So remove the signal, add_callback, cancel_callback, and wait ops, and replace with a simple enable_signaling() op which can be used to inform a fence supporting hw->hw signaling that one or more devices which do not support hw signaling are waiting (and therefore it should enable an irq or do whatever is necessary in order that the CPU is notified when the fence is passed).
v3: Fix locking fail in attach_fence() and get_fence()
v4: Remove tie-in w/ dma-buf.. after discussion w/ danvet and mlankhorst we decided that we need to be able to attach one fence to N dma-buf's, so using the list_head in the dma-fence struct would be problematic.
v5: [ Maarten Lankhorst ] Updated for dma-bikeshed-fence and dma-buf-manager.
v6: [ Maarten Lankhorst ] I removed dma_fence_cancel_callback and some comments about checking if the fence fired or not. This is broken by design. waitqueue_active during destruction is now fatal, since the signaller should be holding a reference in enable_signalling until it signalled the fence. Pass the original dma_fence_cb along, and call __remove_wait in the dma_fence_callback handler, so that no cleanup needs to be performed.
v7: [ Maarten Lankhorst ] Set cb->func and only enable sw signaling if the fence wasn't signaled yet, for example for hardware fences that may choose to signal blindly.
v8: [ Maarten Lankhorst ] Tons of tiny fixes, moved __dma_fence_init to the header and fixed the include mess. dma-fence.h now includes dma-buf.h. All members are now initialized, so kmalloc can be used for allocating a dma-fence. More documentation added.
v9: Change compiler bitfields to flags, change return type of enable_signaling to bool. Rework dma_fence_wait. Added dma_fence_is_signaled and dma_fence_wait_timeout. s/dma// and change exports to non GPL. Added fence_is_signaled and fence_enable_sw_signaling calls, add ability to override the default wait operation.
v10: remove event_queue, use a custom list, export try_to_wake_up from scheduler. Remove fence lock and use a global spinlock instead, this should
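The one-shot contract described above — fence_signal() fires registered callbacks exactly once, and fence_add_callback() refuses an already-signaled fence with -ENOENT — can be sketched as a toy, single-threaded userspace model. This is only an illustration of the contract, not the kernel implementation (no locking, no irq safety), and all toy_* names are invented here:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model of the fence lifecycle described in the changelog. */
struct toy_fence_cb;
typedef void (*toy_fence_func_t)(struct toy_fence_cb *cb);

struct toy_fence_cb {
	toy_fence_func_t func;
	struct toy_fence_cb *next;
};

struct toy_fence {
	bool signaled;
	struct toy_fence_cb *cb_list;
};

static bool toy_fence_is_signaled(struct toy_fence *f)
{
	return f->signaled;
}

/* Returns -1 (standing in for -ENOENT) if the fence already signaled,
 * mirroring the fence_add_callback() behavior described above. */
static int toy_fence_add_callback(struct toy_fence *f,
				  struct toy_fence_cb *cb,
				  toy_fence_func_t func)
{
	if (f->signaled)
		return -1;
	cb->func = func;
	cb->next = f->cb_list;
	f->cb_list = cb;
	return 0;
}

/* One-shot: fire every pending callback once, then drop the list. */
static void toy_fence_signal(struct toy_fence *f)
{
	struct toy_fence_cb *cb;

	f->signaled = true;
	for (cb = f->cb_list; cb; cb = cb->next)
		cb->func(cb);
	f->cb_list = NULL;
}
```

A caller that gets -ENOENT back knows the work already completed and can proceed synchronously instead of waiting for a callback.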
[PATCH] fence: dma-buf cross-device synchronization (v14)
Re: [PATCH] fence: dma-buf cross-device synchronization (v13)
Op 15-08-13 14:45, Marcin Ślusarz schreef: 2013/8/15 Maarten Lankhorst maarten.lankho...@canonical.com:
Re: [RFC PATCH] fence: dma-buf cross-device synchronization (v12)
Op 15-08-13 15:14, Rob Clark schreef:
On Thu, Aug 15, 2013 at 7:16 AM, Maarten Lankhorst maarten.lankho...@canonical.com wrote:
Op 12-08-13 17:43, Rob Clark schreef:
On Mon, Jul 29, 2013 at 10:05 AM, Maarten Lankhorst maarten.lankho...@canonical.com wrote:

+ [snip]
+/**
+ * fence_add_callback - add a callback to be called when the fence
+ * is signaled
+ * @fence: [in] the fence to wait on
+ * @cb: [in] the callback to register
+ * @func: [in] the function to call
+ * @priv: [in] the argument to pass to function
+ *
+ * cb will be initialized by fence_add_callback, no initialization
+ * by the caller is required. Any number of callbacks can be registered
+ * to a fence, but a callback can only be registered to one fence at a time.
+ *
+ * Note that the callback can be called from an atomic context. If
+ * fence is already signaled, this function will return -ENOENT (and
+ * *not* call the callback)
+ *
+ * Add a software callback to the fence. Same restrictions apply to
+ * refcount as it does to fence_wait, however the caller doesn't need to
+ * keep a refcount to fence afterwards: when software access is enabled,
+ * the creator of the fence is required to keep the fence alive until
+ * after it signals with fence_signal. The callback itself can be called
+ * from irq context.
+ *
+ */
+int fence_add_callback(struct fence *fence, struct fence_cb *cb,
+		       fence_func_t func, void *priv)
+{
+	unsigned long flags;
+	int ret = 0;
+	bool was_set;
+
+	if (WARN_ON(!fence || !func))
+		return -EINVAL;
+
+	if (test_bit(FENCE_FLAG_SIGNALED_BIT, &fence->flags))
+		return -ENOENT;
+
+	spin_lock_irqsave(&fence->lock, flags);
+
+	was_set = test_and_set_bit(FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags);
+
+	if (test_bit(FENCE_FLAG_SIGNALED_BIT, &fence->flags))
+		ret = -ENOENT;
+	else if (!was_set && !fence->ops->enable_signaling(fence)) {
+		__fence_signal(fence);
+		ret = -ENOENT;
+	}
+
+	if (!ret) {
+		cb->func = func;
+		cb->priv = priv;
+		list_add_tail(&cb->node, &fence->cb_list);

since the user is providing the 'struct fence_cb', why not drop the priv & func args, and have some cb-initialize macro, ie. INIT_FENCE_CB(&foo->fence, cbfxn); and I guess we can just drop priv and let the user embed fence in whatever structure they like. Ie. make it look a bit like how work_struct works.

I don't mind killing priv. But an INIT_FENCE_CB macro is silly, when all it would do is set cb->func. So passing it as an argument to fence_add_callback is fine, unless you have a better reason to do so. INIT_WORK seems to have a bit more initialization than us; it seems work can be more complicated than callbacks, because callbacks can only be called once and work can be rescheduled multiple times.

yeah, INIT_WORK does more.. although maybe some day we want INIT_FENCE_CB to do more (ie. if we add some debug features to help catch misuse of fence/fence_cb's). And if nothing else, having it look a bit like other constructs that we have in the kernel seems useful. And with my point below, you'd want INIT_FENCE_CB to do an INIT_LIST_HEAD(), so it is (very) slightly more than just setting the fxn ptr. I don't think a list is a good idea for that.

maybe also, if (!list_empty(&cb->node)) return -EBUSY?

I think checking for list_empty(&cb->node) is a terrible idea. This is no different from any other list corruption, and it's a programming error. Not a runtime error. :-)

I was thinking for crtc and page-flip, embed the fence_cb in the crtc. You should only use the cb once at a time, but in this case you might want to re-use it for the next page flip. Having something to catch cb misuse in this sort of scenario seems useful. maybe how I am thinking to use fence_cb is not quite what you had in mind. I'm not sure. I was trying to think how I could just directly use fence/fence_cb in msm for everything (imported dmabuf or just regular 'ol gem buffers).

cb->node.next/prev may be NULL, which would fail with this check. The contents of cb->node are undefined before fence_add_callback is called. Calling fence_remove_callback on a fence that hasn't been added is undefined too. Calling fence_remove_callback works, but I'm thinking of changing the list_del_init to list_del, which would make calling fence_remove_callback twice a fatal error if CONFIG_DEBUG_LIST is enabled, and a possible memory corruption otherwise.

...
+ [snip]
+
+/**
+ * fence context counter: each execution context should have its own
+ * fence context, this allows checking if fences belong to the same
+ * context or not. One device can have multiple separate contexts,
+ * and they're used if some engine can run independently of another.
+ */
+extern
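The "embed the cb and drop priv" idea discussed above is the usual work_struct pattern: the callback recovers its enclosing object with container_of instead of a separate priv pointer. A minimal userspace sketch of just that pattern — struct my_crtc and flip_done are hypothetical names, and this fence_cb is a stand-in for the kernel type:

```c
#include <assert.h>
#include <stddef.h>

/* Userspace re-implementation of the kernel's container_of macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for the kernel's struct fence_cb. */
struct fence_cb {
	void (*func)(struct fence_cb *cb);
};

/* Hypothetical driver object embedding the callback, no priv needed. */
struct my_crtc {
	int pending_flip;
	struct fence_cb flip_cb;
};

/* The callback recovers the enclosing crtc from the embedded cb. */
static void flip_done(struct fence_cb *cb)
{
	struct my_crtc *crtc = container_of(cb, struct my_crtc, flip_cb);

	crtc->pending_flip = 0;
}
```

This is why the cb can be re-used for the next page flip: the object it lives in outlives any single fence, which is exactly the scenario where accidental double-registration is worth catching.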
[RFC PATCH] fence: dma-buf cross-device synchronization (v12)
[PATCH v6 2/7] mutex: add support for wound/wait style locks
Op 20-06-13 14:23, Ingo Molnar schreef:
* Maarten Lankhorst maarten.lankho...@canonical.com wrote:
Well they've helped me with some of the changes and contributed some code and/or fixes, but if acked-by is preferred I'll use that..

Such contributions can be credited in the changelog, and/or copyright notices, and/or the code itself. The signoff chain on the other hand is strictly defined as a 'route the patch took', with a single point of origin, the main author. See Documentation/SubmittingPatches, pt 12. [ A signoff chain _can_ signal multi-authored code where the code got written by someone and then further fixed/developed by someone else - who adds a SOB to the end - but in that case I expect to get the patch from the last person in the signoff chain. ]

Thanks, Ingo

Is this better? I added some more to the changelog entry, clarified ttm and fixed the sob's.

8<----------------

Wound/wait mutexes are used when other multiple lock acquisitions of a similar type can be done in an arbitrary order. The deadlock handling used here is called wait/wound in the RDBMS literature: the older task waits until it can acquire the contended lock. The younger task needs to back off and drop all the locks it is currently holding, i.e. the younger task is wounded. For full documentation please read Documentation/ww-mutex-design.txt.

Changes since RFC patch v1:
- Updated to use atomic_long instead of atomic, since the reservation_id was a long.
- added mutex_reserve_lock_slow and mutex_reserve_lock_intr_slow
- removed mutex_locked_set_reservation_id (or w/e it was called)
Changes since RFC patch v2:
- remove use of __mutex_lock_retval_arg, add warnings when using wrong combination of mutex_(,reserve_)lock/unlock.
Changes since v1:
- Add __always_inline to __mutex_lock_common, otherwise reservation paths can be triggered from normal locks, because __builtin_constant_p might evaluate to false for the constant 0 in that case. Tests for this have been added in the next patch.
- Updated documentation slightly.
Changes since v2:
- Renamed everything to ww_mutex. (mlankhorst)
- Added ww_acquire_ctx and ww_class. (mlankhorst)
- Added a lot of checks for wrong api usage. (mlankhorst)
- Documentation updates. (danvet)
Changes since v3:
- Small documentation fixes (robclark)
- Memory barrier fix (danvet)
Changes since v4:
- Remove ww_mutex_unlock_single and ww_mutex_lock_single.
- Rename ww_mutex_trylock_single to ww_mutex_trylock.
- Remove separate implementations of ww_mutex_lock_slow*, normal functions can be used. Inline versions still exist for extra debugging.
- Cleanup unneeded memory barriers, add comment to the remaining smp_mb().
Changes since v5:
- Clarify TTM -> TTM graphics subsystem

References: https://lwn.net/Articles/548909/
Acked-by: Daniel Vetter daniel.vet...@ffwll.ch
Acked-by: Rob Clark robdcl...@gmail.com
Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com
---
 Documentation/ww-mutex-design.txt | 344
 include/linux/mutex-debug.h       |   1 +
 include/linux/mutex.h             | 355 +-
 kernel/mutex.c                    | 318 --
 lib/debug_locks.c                 |   2 +
 5 files changed, 1003 insertions(+), 17 deletions(-)
 create mode 100644 Documentation/ww-mutex-design.txt

diff --git a/Documentation/ww-mutex-design.txt b/Documentation/ww-mutex-design.txt
new file mode 100644
index 000..8a112dc
--- /dev/null
+++ b/Documentation/ww-mutex-design.txt
@@ -0,0 +1,344 @@
+Wait/Wound Deadlock-Proof Mutex Design
+======================================
+
+Please read mutex-design.txt first, as it applies to wait/wound mutexes too.
+
+Motivation for WW-Mutexes
+-------------------------
+
+GPU's do operations that commonly involve many buffers. Those buffers
+can be shared across contexts/processes, exist in different memory
+domains (for example VRAM vs system memory), and so on. And with
+PRIME / dmabuf, they can even be shared across devices. So there are
+a handful of situations where the driver needs to wait for buffers to
+become ready. If you think about this in terms of waiting on a buffer
+mutex for it to become available, this presents a problem because
+there is no way to guarantee that buffers appear in an execbuf/batch in
+the same order in all contexts. That is directly under control of
+userspace, and a result of the sequence of GL calls that an application
+makes, which results in the potential for deadlock. The problem gets
+more complex when you consider that the kernel may need to migrate the
+buffer(s) into VRAM before the GPU operates on the buffer(s), which
+may in turn require evicting some other buffers (and you don't want to
+evict other buffers which are already queued up to the GPU), but for a
+simplified understanding of the problem you can ignore this.
+
+The algorithm that the TTM graphics
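The wait/wound rule stated in the changelog — the older task waits, the younger task is wounded and must back off — reduces to a comparison of acquire-context stamps. A toy userspace model of just that decision rule (not the kernel ww_mutex implementation; struct ctx and must_back_off are invented names):

```c
#include <assert.h>
#include <stdbool.h>

/* Each acquire context gets a stamp from a monotonic counter when it
 * starts acquiring locks; a lower stamp means an older task. */
struct ctx {
	unsigned long stamp;
};

/* When @waiter finds a lock held by @holder: the older task simply
 * waits for the lock, the younger task must drop everything it holds
 * and retry (it is "wounded"). This asymmetry is what breaks the
 * circular-wait condition and makes deadlock impossible. */
static bool must_back_off(const struct ctx *waiter, const struct ctx *holder)
{
	return waiter->stamp > holder->stamp;
}
```

Because exactly one side of any contended pair backs off, two tasks locking the same buffers in opposite orders can never deadlock; the younger one restarts its acquisition with its original (old) stamp, so it eventually wins and cannot starve.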
[PATCH v5 5/7] mutex: add more tests to lib/locking-selftest.c
None of the ww_mutex codepaths should be taken in the 'normal' mutex calls. The easiest way to verify this is by using the normal mutex calls, and making sure o.ctx is unmodified.

Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com
---
 lib/locking-selftest.c | 62
 1 file changed, 62 insertions(+)

diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c
index 9962262..37faefd 100644
--- a/lib/locking-selftest.c
+++ b/lib/locking-selftest.c
@@ -1162,6 +1162,67 @@ static void ww_test_fail_acquire(void)
 #endif
 }

+static void ww_test_normal(void)
+{
+	int ret;
+
+	WWAI(&t);
+
+	/*
+	 * None of the ww_mutex codepaths should be taken in the 'normal'
+	 * mutex calls. The easiest way to verify this is by using the
+	 * normal mutex calls, and making sure o.ctx is unmodified.
+	 */
+
+	/* mutex_lock (and indirectly, mutex_lock_nested) */
+	o.ctx = (void *)~0UL;
+	mutex_lock(&o.base);
+	mutex_unlock(&o.base);
+	WARN_ON(o.ctx != (void *)~0UL);
+
+	/* mutex_lock_interruptible (and *_nested) */
+	o.ctx = (void *)~0UL;
+	ret = mutex_lock_interruptible(&o.base);
+	if (!ret)
+		mutex_unlock(&o.base);
+	else
+		WARN_ON(1);
+	WARN_ON(o.ctx != (void *)~0UL);
+
+	/* mutex_lock_killable (and *_nested) */
+	o.ctx = (void *)~0UL;
+	ret = mutex_lock_killable(&o.base);
+	if (!ret)
+		mutex_unlock(&o.base);
+	else
+		WARN_ON(1);
+	WARN_ON(o.ctx != (void *)~0UL);
+
+	/* trylock, succeeding */
+	o.ctx = (void *)~0UL;
+	ret = mutex_trylock(&o.base);
+	WARN_ON(!ret);
+	if (ret)
+		mutex_unlock(&o.base);
+	else
+		WARN_ON(1);
+	WARN_ON(o.ctx != (void *)~0UL);
+
+	/* trylock, failing */
+	o.ctx = (void *)~0UL;
+	mutex_lock(&o.base);
+	ret = mutex_trylock(&o.base);
+	WARN_ON(ret);
+	mutex_unlock(&o.base);
+	WARN_ON(o.ctx != (void *)~0UL);
+
+	/* nest_lock */
+	o.ctx = (void *)~0UL;
+	mutex_lock_nest_lock(&o.base, &t);
+	mutex_unlock(&o.base);
+	WARN_ON(o.ctx != (void *)~0UL);
+}
+
 static void ww_test_two_contexts(void)
 {
 	WWAI(&t);
@@ -1415,6 +1476,7 @@ static void ww_tests(void)
 	print_testname("ww api failures");
 	dotest(ww_test_fail_acquire, SUCCESS, LOCKTYPE_WW);
+	dotest(ww_test_normal, SUCCESS, LOCKTYPE_WW);
 	dotest(ww_test_unneeded_slow, FAILURE, LOCKTYPE_WW);
 	printk("\n");
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v5 6/7] mutex: add more ww tests to test EDEADLK path handling
Signed-off-by: Maarten Lankhorst <maarten.lankho...@canonical.com>
---
 lib/locking-selftest.c | 264 +++-
 1 file changed, 261 insertions(+), 3 deletions(-)

diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c
index 37faefd..d554f3f 100644
--- a/lib/locking-selftest.c
+++ b/lib/locking-selftest.c
@@ -47,7 +47,7 @@ __setup("debug_locks_verbose=", setup_debug_locks_verbose);
 #define LOCKTYPE_WW	0x10

 static struct ww_acquire_ctx t, t2;
-static struct ww_mutex o, o2;
+static struct ww_mutex o, o2, o3;

 /*
  * Normal standalone locks, for the circular and irq-context
@@ -947,12 +947,12 @@ static void reset_locks(void)
 	I1(A); I1(B); I1(C); I1(D);
 	I1(X1); I1(X2); I1(Y1); I1(Y2); I1(Z1); I1(Z2);
-	I_WW(t); I_WW(t2); I_WW(o.base); I_WW(o2.base);
+	I_WW(t); I_WW(t2); I_WW(o.base); I_WW(o2.base); I_WW(o3.base);
 	lockdep_reset();
 	I2(A); I2(B); I2(C); I2(D);
 	init_shared_classes();
-	ww_mutex_init(&o, &ww_lockdep); ww_mutex_init(&o2, &ww_lockdep);
+	ww_mutex_init(&o, &ww_lockdep); ww_mutex_init(&o2, &ww_lockdep); ww_mutex_init(&o3, &ww_lockdep);
 	memset(&t, 0, sizeof(t)); memset(&t2, 0, sizeof(t2));
 	memset(&ww_lockdep.acquire_key, 0, sizeof(ww_lockdep.acquire_key));
 	memset(&ww_lockdep.mutex_key, 0, sizeof(ww_lockdep.mutex_key));
@@ -1292,6 +1292,251 @@ static void ww_test_object_lock_stale_context(void)
 	WWL(&o, &t);
 }

+static void ww_test_edeadlk_normal(void)
+{
+	int ret;
+
+	mutex_lock(&o2.base);
+	o2.ctx = &t2;
+	mutex_release(&o2.base.dep_map, 1, _THIS_IP_);
+
+	WWAI(&t);
+	t2 = t;
+	t2.stamp--;
+
+	ret = WWL(&o, &t);
+	WARN_ON(ret);
+
+	ret = WWL(&o2, &t);
+	WARN_ON(ret != -EDEADLK);
+
+	o2.ctx = NULL;
+	mutex_acquire(&o2.base.dep_map, 0, 1, _THIS_IP_);
+	mutex_unlock(&o2.base);
+	WWU(&o);
+
+	WWL(&o2, &t);
+}
+
+static void ww_test_edeadlk_normal_slow(void)
+{
+	int ret;
+
+	mutex_lock(&o2.base);
+	mutex_release(&o2.base.dep_map, 1, _THIS_IP_);
+	o2.ctx = &t2;
+
+	WWAI(&t);
+	t2 = t;
+	t2.stamp--;
+
+	ret = WWL(&o, &t);
+	WARN_ON(ret);
+
+	ret = WWL(&o2, &t);
+	WARN_ON(ret != -EDEADLK);
+
+	o2.ctx = NULL;
+	mutex_acquire(&o2.base.dep_map, 0, 1, _THIS_IP_);
+	mutex_unlock(&o2.base);
+	WWU(&o);
+
+	ww_mutex_lock_slow(&o2, &t);
+}
+
+static void ww_test_edeadlk_no_unlock(void)
+{
+	int ret;
+
+	mutex_lock(&o2.base);
+	o2.ctx = &t2;
+	mutex_release(&o2.base.dep_map, 1, _THIS_IP_);
+
+	WWAI(&t);
+	t2 = t;
+	t2.stamp--;
+
+	ret = WWL(&o, &t);
+	WARN_ON(ret);
+
+	ret = WWL(&o2, &t);
+	WARN_ON(ret != -EDEADLK);
+
+	o2.ctx = NULL;
+	mutex_acquire(&o2.base.dep_map, 0, 1, _THIS_IP_);
+	mutex_unlock(&o2.base);
+
+	WWL(&o2, &t);
+}
+
+static void ww_test_edeadlk_no_unlock_slow(void)
+{
+	int ret;
+
+	mutex_lock(&o2.base);
+	mutex_release(&o2.base.dep_map, 1, _THIS_IP_);
+	o2.ctx = &t2;
+
+	WWAI(&t);
+	t2 = t;
+	t2.stamp--;
+
+	ret = WWL(&o, &t);
+	WARN_ON(ret);
+
+	ret = WWL(&o2, &t);
+	WARN_ON(ret != -EDEADLK);
+
+	o2.ctx = NULL;
+	mutex_acquire(&o2.base.dep_map, 0, 1, _THIS_IP_);
+	mutex_unlock(&o2.base);
+
+	ww_mutex_lock_slow(&o2, &t);
+}
+
+static void ww_test_edeadlk_acquire_more(void)
+{
+	int ret;
+
+	mutex_lock(&o2.base);
+	mutex_release(&o2.base.dep_map, 1, _THIS_IP_);
+	o2.ctx = &t2;
+
+	WWAI(&t);
+	t2 = t;
+	t2.stamp--;
+
+	ret = WWL(&o, &t);
+	WARN_ON(ret);
+
+	ret = WWL(&o2, &t);
+	WARN_ON(ret != -EDEADLK);
+
+	ret = WWL(&o3, &t);
+}
+
+static void ww_test_edeadlk_acquire_more_slow(void)
+{
+	int ret;
+
+	mutex_lock(&o2.base);
+	mutex_release(&o2.base.dep_map, 1, _THIS_IP_);
+	o2.ctx = &t2;
+
+	WWAI(&t);
+	t2 = t;
+	t2.stamp--;
+
+	ret = WWL(&o, &t);
+	WARN_ON(ret);
+
+	ret = WWL(&o2, &t);
+	WARN_ON(ret != -EDEADLK);
+
+	ww_mutex_lock_slow(&o3, &t);
+}
+
+static void ww_test_edeadlk_acquire_more_edeadlk(void)
+{
+	int ret;
+
+	mutex_lock(&o2.base);
+	mutex_release(&o2.base.dep_map, 1, _THIS_IP_);
+	o2.ctx = &t2;
+
+	mutex_lock(&o3.base);
+	mutex_release(&o3.base.dep_map, 1, _THIS_IP_);
+	o3.ctx = &t2;
+
+	WWAI(&t);
+	t2 = t;
+	t2.stamp--;
+
+	ret = WWL(&o, &t);
+	WARN_ON(ret);
+
+	ret = WWL(&o2, &t);
+	WARN_ON(ret != -EDEADLK);
+
+	ret = WWL(&o3, &t);
+	WARN_ON(ret != -EDEADLK);
+}
+
+static void ww_test_edeadlk_acquire_more_edeadlk_slow(void)
+{
+	int ret;
+
+	mutex_lock(&o2.base);
+	mutex_release(&o2.base.dep_map, 1, _THIS_IP_);
+	o2.ctx = &t2;
+
+	mutex_lock(&o3.base);
+	mutex_release(&o3.base.dep_map, 1, _THIS_IP_
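All of these tests exercise the same driver-facing contract: when a ww_mutex lock attempt returns -EDEADLK, the caller must drop every lock it holds in that acquire context before touching the contended lock again. A minimal userspace model of that contract, with hypothetical names (`model_ww_lock` etc. are illustrative, not the kernel API), looks like this:

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-in for -EDEADLK; not the kernel errno machinery. */
#define MODEL_EDEADLK (-35)

struct model_ctx { unsigned stamp; };          /* acquire context with its ticket */
struct model_ww  { struct model_ctx *holder; };

/* Wound rule: if the lock is held by an *older* context (smaller stamp),
 * the younger caller is wounded and must back off with -EDEADLK. */
static int model_ww_lock(struct model_ww *m, struct model_ctx *ctx)
{
	if (m->holder && m->holder->stamp < ctx->stamp)
		return MODEL_EDEADLK;
	m->holder = ctx;   /* free, or we are the older contender */
	return 0;
}

static void model_ww_unlock(struct model_ww *m)
{
	m->holder = NULL;
}
```

This mirrors ww_test_edeadlk_normal: t2 gets the older stamp and fake-holds o2, so locking o2 from t must fail with the model's EDEADLK, and may only succeed after everything is dropped.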
[PATCH v5 7/7] locking-selftests: handle unexpected failures more strictly
When CONFIG_PROVE_LOCKING is not enabled, more tests are expected to pass
unexpectedly, but no test that passes with CONFIG_PROVE_LOCKING enabled
should start to fail.

Signed-off-by: Maarten Lankhorst <maarten.lankho...@canonical.com>
---
 lib/locking-selftest.c |    8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c
index d554f3f..aad024d 100644
--- a/lib/locking-selftest.c
+++ b/lib/locking-selftest.c
@@ -976,16 +976,18 @@ static void dotest(void (*testcase_fn)(void), int expected, int lockclass_mask)
 	/*
 	 * Filter out expected failures:
 	 */
-	if (debug_locks != expected) {
 #ifndef CONFIG_PROVE_LOCKING
+	if (expected == FAILURE && debug_locks) {
 		expected_testcase_failures++;
 		printk("failed|");
-#else
+	}
+	else
+#endif
+	if (debug_locks != expected) {
 		unexpected_testcase_failures++;
 		printk("FAILED|");
 		dump_stack();
-#endif
 	} else {
 		testcase_successes++;
 		printk("  ok  |");
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
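The rewritten filter in dotest() boils down to a small decision table; a standalone model of it (the names below are mine, not from locking-selftest.c) can be tested directly. `debug_locks` stays nonzero when a testcase did *not* trigger a lockdep complaint:

```c
#include <assert.h>

enum { FAILURE = 0, SUCCESS = 1 };

/* Model of the dotest() filter: returns 0 = ok,
 * 1 = expected (tolerated) failure, 2 = unexpected failure.
 * prove_locking mirrors CONFIG_PROVE_LOCKING. */
static int classify(int prove_locking, int expected, int debug_locks)
{
	/* Without PROVE_LOCKING, a test that was supposed to fail but
	 * passed is the only tolerated mismatch. */
	if (!prove_locking && expected == FAILURE && debug_locks)
		return 1;
	/* Any other mismatch is a real failure, in both configurations. */
	if (debug_locks != expected)
		return 2;
	return 0;
}
```

This captures the commit's point: with PROVE_LOCKING enabled, nothing is filtered, so a previously passing test that starts to fail is always flagged.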
[PATCH v5 3/7] mutex: w/w mutex slowpath debugging
From: Daniel Vetter <daniel.vet...@ffwll.ch>

Injects EDEADLK conditions at pseudo-random intervals, with exponential backoff up to UINT_MAX (to ensure that every lock operation still completes in a reasonable time). This way we can test the wound slowpath even for ww mutex users where contention is never expected, and the ww deadlock avoidance algorithm is only needed for correctness against malicious userspace. An example would be protecting kernel modesetting properties, which thanks to single-threaded X isn't really expected to contend, ever.

I've looked into using the CONFIG_FAULT_INJECTION infrastructure, but decided against it for two reasons:

- EDEADLK handling is mandatory for ww mutex users and should never affect the outcome of a syscall. This is in contrast to -ENOMEM injection. So fine-grained configurability isn't required.

- The fault injection framework only allows setting a simple probability for failure. Now the probability that a ww mutex acquire stage with N locks will never complete (due to too many injected EDEADLK backoffs) is zero. But the expected number of ww_mutex_lock operations for the completely uncontended case would be O(exp(N)). The per-acquire-ctx exponential backoff solution chosen here only results in O(log N) overhead due to injection and so O(log N * N) lock operations. This way we can fail with high probability (and so have good test coverage even for fancy backoff and lock acquisition paths) without running into pathological cases.

Note that EDEADLK will only ever be injected when we managed to acquire the lock. This prevents any behaviour changes for users which rely on the EALREADY semantics.

v2: Drop the cargo-culted __sched (I should read docs next time around) and annotate the non-debug case with inline to prevent gcc from doing something horrible.

v3: Rebase on top of Maarten's latest patches.

v4: Actually make this stuff compile; I'd misplaced the hunk in the wrong #ifdef block.
v5: Simplify the ww_mutex_deadlock_injection definition, and fix lib/locking-selftest.c warnings. Fix the lib/Kconfig.debug definition to work correctly. (mlankhorst)

v6: Do not inject -EDEADLK when ctx->acquired == 0, because the _slow paths are merged now. (mlankhorst)

Cc: Steven Rostedt <rost...@goodmis.org>
Signed-off-by: Daniel Vetter <daniel.vet...@ffwll.ch>
Signed-off-by: Maarten Lankhorst <maarten.lankho...@canonical.com>
---
 include/linux/mutex.h |    8 
 kernel/mutex.c        |   44 +---
 lib/Kconfig.debug     |   13 +
 3 files changed, 62 insertions(+), 3 deletions(-)

diff --git a/include/linux/mutex.h b/include/linux/mutex.h
index f3ad181..2ff9178 100644
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -95,6 +95,10 @@ struct ww_acquire_ctx {
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 	struct lockdep_map dep_map;
 #endif
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+	unsigned deadlock_inject_interval;
+	unsigned deadlock_inject_countdown;
+#endif
 };

 struct ww_mutex {
@@ -280,6 +284,10 @@ static inline void ww_acquire_init(struct ww_acquire_ctx *ctx,
 			 &ww_class->acquire_key, 0);
 	mutex_acquire(&ctx->dep_map, 0, 0, _RET_IP_);
 #endif
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+	ctx->deadlock_inject_interval = 1;
+	ctx->deadlock_inject_countdown = ctx->stamp & 0xf;
+#endif
 }

 /**
diff --git a/kernel/mutex.c b/kernel/mutex.c
index 75fc7c4..e40004b 100644
--- a/kernel/mutex.c
+++ b/kernel/mutex.c
@@ -508,22 +508,60 @@ mutex_lock_interruptible_nested(struct mutex *lock, unsigned int subclass)
 EXPORT_SYMBOL_GPL(mutex_lock_interruptible_nested);

+static inline int
+ww_mutex_deadlock_injection(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
+{
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+	unsigned tmp;
+
+	if (ctx->deadlock_inject_countdown-- == 0) {
+		tmp = ctx->deadlock_inject_interval;
+		if (tmp > UINT_MAX/4)
+			tmp = UINT_MAX;
+		else
+			tmp = tmp*2 + tmp + tmp/2;
+
+		ctx->deadlock_inject_interval = tmp;
+		ctx->deadlock_inject_countdown = tmp;
+		ctx->contending_lock = lock;
+
+		ww_mutex_unlock(lock);
+
+		return -EDEADLK;
+	}
+#endif
+
+	return 0;
+}

 int __sched
 __ww_mutex_lock(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
 {
+	int ret;
+
 	might_sleep();
-	return __mutex_lock_common(&lock->base, TASK_UNINTERRUPTIBLE,
+	ret = __mutex_lock_common(&lock->base, TASK_UNINTERRUPTIBLE,
 				  0, &ctx->dep_map, _RET_IP_, ctx);
+	if (!ret && ctx->acquired > 0)
+		return ww_mutex_deadlock_injection(lock, ctx);
+
+	return ret;
 }
 EXPORT_SYMBOL_GPL(__ww_mutex_lock);

 int __sched
 __ww_mutex_lock_interruptible(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
 {
+	int ret;
+
 	might_sleep();
-	return
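The injection interval grows by roughly 3.5x per injected backoff (`tmp*2 + tmp + tmp/2`) and saturates once it exceeds UINT_MAX/4, which is what makes the injection overhead logarithmic over a run. The arithmetic can be checked in isolation, lifted almost verbatim from ww_mutex_deadlock_injection():

```c
#include <assert.h>
#include <limits.h>

/* Interval update from ww_mutex_deadlock_injection(): grow ~3.5x per
 * injection, saturating at UINT_MAX so the countdown stays finite and
 * every acquire sequence eventually stops being interrupted. */
static unsigned next_interval(unsigned tmp)
{
	if (tmp > UINT_MAX / 4)
		tmp = UINT_MAX;       /* avoid overflow in the multiply below */
	else
		tmp = tmp * 2 + tmp + tmp / 2;
	return tmp;
}
```

Starting from the initial interval of 1, the sequence runs 1, 3, 10, 35, ... and hits the UINT_MAX ceiling after fewer than twenty updates, so even long-running tests see only a bounded number of injected -EDEADLKs.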
[PATCH v5 2/7] mutex: add support for wound/wait style locks, v5
Changes since RFC patch v1: - Updated to use atomic_long instead of atomic, since the reservation_id was a long. - added mutex_reserve_lock_slow and mutex_reserve_lock_intr_slow - removed mutex_locked_set_reservation_id (or w/e it was called) Changes since RFC patch v2: - remove use of __mutex_lock_retval_arg, add warnings when using wrong combination of mutex_(,reserve_)lock/unlock. Changes since v1: - Add __always_inline to __mutex_lock_common, otherwise reservation paths can be triggered from normal locks, because __builtin_constant_p might evaluate to false for the constant 0 in that case. Tests for this have been added in the next patch. - Updated documentation slightly. Changes since v2: - Renamed everything to ww_mutex. (mlankhorst) - Added ww_acquire_ctx and ww_class. (mlankhorst) - Added a lot of checks for wrong api usage. (mlankhorst) - Documentation updates. (danvet) Changes since v3: - Small documentation fixes (robclark) - Memory barrier fix (danvet) Changes since v4: - Remove ww_mutex_unlock_single and ww_mutex_lock_single. - Rename ww_mutex_trylock_single to ww_mutex_trylock. - Remove separate implementations of ww_mutex_lock_slow*, normal functions can be used. Inline versions still exist for extra debugging. - Cleanup unneeded memory barriers, add comment to the remaining smp_mb(). 
Signed-off-by: Maarten Lankhorst <maarten.lankho...@canonical.com>
Signed-off-by: Daniel Vetter <daniel.vet...@ffwll.ch>
Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 Documentation/ww-mutex-design.txt |  343 
 include/linux/mutex-debug.h       |    1 
 include/linux/mutex.h             |  355 +
 kernel/mutex.c                    |  318 +++--
 lib/debug_locks.c                 |    2 
 5 files changed, 1002 insertions(+), 17 deletions(-)
 create mode 100644 Documentation/ww-mutex-design.txt

diff --git a/Documentation/ww-mutex-design.txt b/Documentation/ww-mutex-design.txt
new file mode 100644
index 000..379739c
--- /dev/null
+++ b/Documentation/ww-mutex-design.txt
@@ -0,0 +1,343 @@
+Wait/Wound Deadlock-Proof Mutex Design
+======================================
+
+Please read mutex-design.txt first, as it applies to wait/wound mutexes too.
+
+Motivation for WW-Mutexes
+-------------------------
+
+GPUs do operations that commonly involve many buffers. Those buffers
+can be shared across contexts/processes, exist in different memory
+domains (for example VRAM vs system memory), and so on. And with
+PRIME / dmabuf, they can even be shared across devices. So there are
+a handful of situations where the driver needs to wait for buffers to
+become ready. If you think about this in terms of waiting on a buffer
+mutex for it to become available, this presents a problem because
+there is no way to guarantee that buffers appear in an execbuf/batch in
+the same order in all contexts. That is directly under control of
+userspace, and a result of the sequence of GL calls that an application
+makes, which results in the potential for deadlock. The problem gets
+more complex when you consider that the kernel may need to migrate the
+buffer(s) into VRAM before the GPU operates on the buffer(s), which
+may in turn require evicting some other buffers (and you don't want to
+evict other buffers which are already queued up to the GPU), but for a
+simplified understanding of the problem you can ignore this.
+
+The algorithm that TTM came up with for dealing with this problem is
+quite simple. For each group of buffers (execbuf) that need to be locked,
+the caller would be assigned a unique reservation id/ticket, from a global
+counter. In case of deadlock while locking all the buffers associated with
+an execbuf, the one with the lowest reservation ticket (i.e. the oldest
+task) wins, and the one with the higher reservation id (i.e. the younger
+task) unlocks all of the buffers that it has already locked, and then
+tries again.
+
+In the RDBMS literature this deadlock handling approach is called
+wait/wound: The older task waits until it can acquire the contended lock.
+The younger task needs to back off and drop all the locks it is currently
+holding, i.e. the younger task is wounded.
+
+Concepts
+--------
+
+Compared to normal mutexes two additional concepts/objects show up in the
+lock interface for w/w mutexes:
+
+Acquire context: To ensure eventual forward progress it is important that
+a task trying to acquire locks doesn't grab a new reservation id, but
+keeps the one it acquired when starting the lock acquisition. This ticket
+is stored in the acquire context. Furthermore the acquire context keeps
+track of debugging state to catch w/w mutex interface abuse.
+
+W/w class: In contrast to normal mutexes the lock class needs to be
+explicit for w/w mutexes, since it is required to initialize the acquire
+context.
+
+Furthermore there are three different classes of w/w lock
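The wait/wound acquisition described above follows one canonical loop: try to lock every buffer in the group under one ticket, and on a wound drop everything and retry with the same ticket. A compact userspace model of that loop (all names here are illustrative, not the kernel interface):

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_EDEADLK (-35)   /* illustrative stand-in for -EDEADLK */

struct ctx { unsigned stamp; };         /* ticket from the global counter */
struct buf { struct ctx *holder; };

/* Wound rule: a buffer held by an older task (smaller stamp) forces the
 * younger caller to back off. */
static int trylock_ww(struct buf *b, struct ctx *ctx)
{
	if (b->holder && b->holder->stamp < ctx->stamp)
		return MODEL_EDEADLK;
	b->holder = ctx;
	return 0;
}

static void unlock_all(struct buf **bufs, size_t n, struct ctx *ctx)
{
	for (size_t i = 0; i < n; i++)
		if (bufs[i]->holder == ctx)
			bufs[i]->holder = NULL;
}

/* Lock every buffer of the group; on a wound, drop all locks and retry
 * with the SAME ticket, so the task eventually becomes the oldest. */
static int lock_group(struct buf **bufs, size_t n, struct ctx *ctx,
		      int *retries)
{
	for (;;) {
		size_t i;

		for (i = 0; i < n; i++)
			if (trylock_ww(bufs[i], ctx))
				break;
		if (i == n)
			return 0;              /* whole group locked */
		unlock_all(bufs, n, ctx);      /* wounded: back off */
		(*retries)++;
		bufs[i]->holder = NULL;        /* model: older holder finishes */
	}
}
```

Keeping the ticket across retries is the forward-progress guarantee: a backed-off task never gets younger, so it cannot be wounded forever.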
[PATCH v5 1/7] arch: make __mutex_fastpath_lock_retval return whether fastpath succeeded or not.
This will allow me to call functions that have multiple arguments if the fastpath fails. This is required to support ticket mutexes, because they need to be able to pass an extra argument to the fail function.

Originally I duplicated the functions, by adding __mutex_fastpath_lock_retval_arg. This ended up being just a duplication of the existing function, so a way to test if the fastpath was called ended up being better.

This also cleaned up the reservation mutex patch some by being able to call an atomic_set instead of atomic_xchg, and making it easier to detect if the wrong unlock function was previously used.

Changes since v1, pointed out by Francesco Lavra:
- fix a small comment issue in mutex_32.h
- fix the __mutex_fastpath_lock_retval macro for mutex-null.h

Signed-off-by: Maarten Lankhorst <maarten.lankho...@canonical.com>
---
 arch/ia64/include/asm/mutex.h    |   10 --
 arch/powerpc/include/asm/mutex.h |   10 --
 arch/sh/include/asm/mutex-llsc.h |    4 ++--
 arch/x86/include/asm/mutex_32.h  |   11 ---
 arch/x86/include/asm/mutex_64.h  |   11 ---
 include/asm-generic/mutex-dec.h  |   10 --
 include/asm-generic/mutex-null.h |    2 +-
 include/asm-generic/mutex-xchg.h |   10 --
 kernel/mutex.c                   |   32 ++--
 9 files changed, 41 insertions(+), 59 deletions(-)

diff --git a/arch/ia64/include/asm/mutex.h b/arch/ia64/include/asm/mutex.h
index bed73a6..f41e66d 100644
--- a/arch/ia64/include/asm/mutex.h
+++ b/arch/ia64/include/asm/mutex.h
@@ -29,17 +29,15 @@ __mutex_fastpath_lock(atomic_t *count, void (*fail_fn)(atomic_t *))
  * __mutex_fastpath_lock_retval - try to take the lock by moving the count
  *                                from 1 to a 0 value
  * @count: pointer of type atomic_t
- * @fail_fn: function to call if the original value was not 1
  *
- * Change the count from 1 to a value lower than 1, and call fail_fn if
- * it wasn't 1 originally. This function returns 0 if the fastpath succeeds,
- * or anything the slow path function returns.
+ * Change the count from 1 to a value lower than 1. This function returns 0
+ * if the fastpath succeeds, or -1 otherwise.
  */
 static inline int
-__mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
+__mutex_fastpath_lock_retval(atomic_t *count)
 {
 	if (unlikely(ia64_fetchadd4_acq(count, -1) != 1))
-		return fail_fn(count);
+		return -1;
 	return 0;
 }

diff --git a/arch/powerpc/include/asm/mutex.h b/arch/powerpc/include/asm/mutex.h
index 5399f7e..127ab23 100644
--- a/arch/powerpc/include/asm/mutex.h
+++ b/arch/powerpc/include/asm/mutex.h
@@ -82,17 +82,15 @@ __mutex_fastpath_lock(atomic_t *count, void (*fail_fn)(atomic_t *))
  * __mutex_fastpath_lock_retval - try to take the lock by moving the count
  *                                from 1 to a 0 value
  * @count: pointer of type atomic_t
- * @fail_fn: function to call if the original value was not 1
  *
- * Change the count from 1 to a value lower than 1, and call fail_fn if
- * it wasn't 1 originally. This function returns 0 if the fastpath succeeds,
- * or anything the slow path function returns.
+ * Change the count from 1 to a value lower than 1. This function returns 0
+ * if the fastpath succeeds, or -1 otherwise.
  */
 static inline int
-__mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
+__mutex_fastpath_lock_retval(atomic_t *count)
 {
 	if (unlikely(__mutex_dec_return_lock(count) < 0))
-		return fail_fn(count);
+		return -1;
 	return 0;
 }

diff --git a/arch/sh/include/asm/mutex-llsc.h b/arch/sh/include/asm/mutex-llsc.h
index 090358a..dad29b6 100644
--- a/arch/sh/include/asm/mutex-llsc.h
+++ b/arch/sh/include/asm/mutex-llsc.h
@@ -37,7 +37,7 @@ __mutex_fastpath_lock(atomic_t *count, void (*fail_fn)(atomic_t *))
 }

 static inline int
-__mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
+__mutex_fastpath_lock_retval(atomic_t *count)
 {
 	int __done, __res;

@@ -51,7 +51,7 @@ __mutex_fastpath_lock_retval(atomic_t *count)
 		: "t");

 	if (unlikely(!__done || __res != 0))
-		__res = fail_fn(count);
+		__res = -1;

 	return __res;
 }

diff --git a/arch/x86/include/asm/mutex_32.h b/arch/x86/include/asm/mutex_32.h
index 03f90c8..0208c3c 100644
--- a/arch/x86/include/asm/mutex_32.h
+++ b/arch/x86/include/asm/mutex_32.h
@@ -42,17 +42,14 @@ do {								\
  * __mutex_fastpath_lock_retval - try to take the lock by moving the count
  *                                from 1 to a 0 value
  * @count: pointer of type atomic_t
- * @fail_fn: function to call if the original value was not 1
  *
- * Change the count from 1 to a value lower than 1, and call fail_fn if it
- * wasn't 1 originally. This function returns 0
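The conversion changes the contract from "call fail_fn on contention" to "return -1 on contention and let the caller pick the slowpath", which is what lets a ww slowpath take extra arguments (the acquire context) without duplicating every per-arch fastpath. The resulting control flow, modeled with C11 atomics in userspace (names are hypothetical, not the kernel symbols):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* New-style fastpath: report contention instead of calling a fail_fn.
 * A 1 -> 0 transition means the lock was taken on the fast path. */
static int fastpath_lock_retval(atomic_int *count)
{
	return atomic_fetch_sub(count, 1) == 1 ? 0 : -1;
}

static int slowpath_stub(atomic_int *count, void *ctx, int *calls)
{
	(void)count; (void)ctx;
	(*calls)++;          /* record that the slowpath ran */
	return 0;
}

/* The caller, not the fastpath, dispatches to a slowpath that can take
 * however many extra arguments it needs. */
static int lock_common(atomic_int *count, void *ctx, int *slowpath_calls,
		       int (*slowpath)(atomic_int *, void *, int *))
{
	if (fastpath_lock_retval(count) == 0)
		return 0;
	return slowpath(count, ctx, slowpath_calls);
}
```

With the old convention, passing `ctx` through would have required threading it into every architecture's fail_fn signature; with the new one, only the generic caller changes.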
[PATCH v5 0/7] add mutex wait/wound style locks
Changes since v4:
- Some documentation cleanups.
- Added a lot more tests to cover all the DEBUG_LOCKS_WARN_ON cases.
- Added EDEADLK tests.
- Split off the normal mutex tests to a separate patch.
- Added a patch to not allow tests to fail that succeed with PROVE_LOCKING enabled.

---

Daniel Vetter (1):
      mutex: w/w mutex slowpath debugging

Maarten Lankhorst (6):
      arch: make __mutex_fastpath_lock_retval return whether fastpath succeeded or not.
      mutex: add support for wound/wait style locks, v5
      mutex: Add ww tests to lib/locking-selftest.c. v5
      mutex: add more tests to lib/locking-selftest.c
      mutex: add more ww tests to test EDEADLK path handling
      locking-selftests: handle unexpected failures more strictly

 Documentation/ww-mutex-design.txt |  343 ++
 arch/ia64/include/asm/mutex.h     |   10 -
 arch/powerpc/include/asm/mutex.h  |   10 -
 arch/sh/include/asm/mutex-llsc.h  |    4 
 arch/x86/include/asm/mutex_32.h   |   11 -
 arch/x86/include/asm/mutex_64.h   |   11 -
 include/asm-generic/mutex-dec.h   |   10 -
 include/asm-generic/mutex-null.h  |    2 
 include/asm-generic/mutex-xchg.h  |   10 -
 include/linux/mutex-debug.h       |    1 
 include/linux/mutex.h             |  363 +++
 kernel/mutex.c                    |  384 ++--
 lib/Kconfig.debug                 |   13 +
 lib/debug_locks.c                 |    2 
 lib/locking-selftest.c            |  720 -
 15 files changed, 1802 insertions(+), 92 deletions(-)
 create mode 100644 Documentation/ww-mutex-design.txt

-- Signature
[PATCH v5 4/7] mutex: Add ww tests to lib/locking-selftest.c. v5
This stresses the lockdep code in some ways specifically useful to ww_mutexes. It adds checks for most of the common locking errors.

Changes since v1:
- Add tests to verify reservation_id is untouched.
- Use L() and U() macros where possible.

Changes since v2:
- Use the ww_mutex api directly.
- Use macros for most of the code.

Changes since v3:
- Rework tests for the api changes.

Changes since v4:
- Added a test to cover ww_acquire_done being called multiple times.
- Added a test for ww_acquire_fini being called before all locks have been unlocked.
- Added a test for locking after ww_acquire_done has been called.
- Added a test for unbalanced ctx->acquired dropping below zero.
- Added a test for unlocked ww_mutex with ctx != NULL.

Signed-off-by: Maarten Lankhorst <maarten.lankho...@canonical.com>
---
 lib/locking-selftest.c |  400 ++--
 1 file changed, 381 insertions(+), 19 deletions(-)

diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c
index c3eb261..9962262 100644
--- a/lib/locking-selftest.c
+++ b/lib/locking-selftest.c
@@ -26,6 +26,8 @@
  */
 static unsigned int debug_locks_verbose;

+static DEFINE_WW_CLASS(ww_lockdep);
+
 static int __init setup_debug_locks_verbose(char *str)
 {
 	get_option(&str, &debug_locks_verbose);
@@ -42,6 +44,10 @@ __setup("debug_locks_verbose=", setup_debug_locks_verbose);
 #define LOCKTYPE_RWLOCK	0x2
 #define LOCKTYPE_MUTEX	0x4
 #define LOCKTYPE_RWSEM	0x8
+#define LOCKTYPE_WW	0x10
+
+static struct ww_acquire_ctx t, t2;
+static struct ww_mutex o, o2;

 /*
  * Normal standalone locks, for the circular and irq-context
@@ -193,6 +199,20 @@ static void init_shared_classes(void)
 #define RSU(x)			up_read(&rwsem_##x)
 #define RWSI(x)			init_rwsem(&rwsem_##x)

+#ifndef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+#define WWAI(x)			ww_acquire_init(x, &ww_lockdep)
+#else
+#define WWAI(x)			do { ww_acquire_init(x, &ww_lockdep); (x)->deadlock_inject_countdown = ~0U; } while (0)
+#endif
+#define WWAD(x)			ww_acquire_done(x)
+#define WWAF(x)			ww_acquire_fini(x)
+
+#define WWL(x, c)		ww_mutex_lock(x, c)
+#define WWT(x)			ww_mutex_trylock(x)
+#define WWL1(x)			ww_mutex_lock(x, NULL)
+#define WWU(x)			ww_mutex_unlock(x)
+
+
 #define LOCK_UNLOCK_2(x,y)	LOCK(x); LOCK(y); UNLOCK(y); UNLOCK(x)

 /*
@@ -894,11 +914,13 @@ GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_soft)
 # define I_RWLOCK(x)	lockdep_reset_lock(&rwlock_##x.dep_map)
 # define I_MUTEX(x)	lockdep_reset_lock(&mutex_##x.dep_map)
 # define I_RWSEM(x)	lockdep_reset_lock(&rwsem_##x.dep_map)
+# define I_WW(x)	lockdep_reset_lock(&x.dep_map)
 #else
 # define I_SPINLOCK(x)
 # define I_RWLOCK(x)
 # define I_MUTEX(x)
 # define I_RWSEM(x)
+# define I_WW(x)
 #endif

 #define I1(x)					\
@@ -920,11 +942,20 @@ GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_soft)
 static void reset_locks(void)
 {
 	local_irq_disable();
+	lockdep_free_key_range(&ww_lockdep.acquire_key, 1);
+	lockdep_free_key_range(&ww_lockdep.mutex_key, 1);
+
 	I1(A); I1(B); I1(C); I1(D);
 	I1(X1); I1(X2); I1(Y1); I1(Y2); I1(Z1); I1(Z2);
+	I_WW(t); I_WW(t2); I_WW(o.base); I_WW(o2.base);
 	lockdep_reset();
 	I2(A); I2(B); I2(C); I2(D);
 	init_shared_classes();
+
+	ww_mutex_init(&o, &ww_lockdep); ww_mutex_init(&o2, &ww_lockdep);
+	memset(&t, 0, sizeof(t)); memset(&t2, 0, sizeof(t2));
+	memset(&ww_lockdep.acquire_key, 0, sizeof(ww_lockdep.acquire_key));
+	memset(&ww_lockdep.mutex_key, 0, sizeof(ww_lockdep.mutex_key));
 	local_irq_enable();
 }

@@ -938,7 +969,6 @@ static int unexpected_testcase_failures;
 static void dotest(void (*testcase_fn)(void), int expected, int lockclass_mask)
 {
 	unsigned long saved_preempt_count = preempt_count();
-	int expected_failure = 0;

 	WARN_ON(irqs_disabled());

@@ -946,26 +976,16 @@ static void dotest(void (*testcase_fn)(void), int expected, int lockclass_mask)
 	/*
 	 * Filter out expected failures:
 	 */
+	if (debug_locks != expected) {
 #ifndef CONFIG_PROVE_LOCKING
-	if ((lockclass_mask & LOCKTYPE_SPIN) && debug_locks != expected)
-		expected_failure = 1;
-	if ((lockclass_mask & LOCKTYPE_RWLOCK) && debug_locks != expected)
-		expected_failure = 1;
-	if ((lockclass_mask & LOCKTYPE_MUTEX) && debug_locks != expected)
-		expected_failure = 1;
-	if ((lockclass_mask & LOCKTYPE_RWSEM) && debug_locks != expected)
-		expected_failure = 1;
+		expected_testcase_failures++;
+		printk("failed|");
+#else
+		unexpected_testcase_failures++;
+		printk("FAILED|");
+
+		dump_stack();
 #endif
-	if (debug_locks
Re: [PATCH v5 2/7] mutex: add support for wound/wait style locks, v5
Op 20-06-13 13:55, Ingo Molnar schreef: * Maarten Lankhorst maarten.lankho...@canonical.com wrote: Changes since RFC patch v1: - Updated to use atomic_long instead of atomic, since the reservation_id was a long. - added mutex_reserve_lock_slow and mutex_reserve_lock_intr_slow - removed mutex_locked_set_reservation_id (or w/e it was called) Changes since RFC patch v2: - remove use of __mutex_lock_retval_arg, add warnings when using wrong combination of mutex_(,reserve_)lock/unlock. Changes since v1: - Add __always_inline to __mutex_lock_common, otherwise reservation paths can be triggered from normal locks, because __builtin_constant_p might evaluate to false for the constant 0 in that case. Tests for this have been added in the next patch. - Updated documentation slightly. Changes since v2: - Renamed everything to ww_mutex. (mlankhorst) - Added ww_acquire_ctx and ww_class. (mlankhorst) - Added a lot of checks for wrong api usage. (mlankhorst) - Documentation updates. (danvet) Changes since v3: - Small documentation fixes (robclark) - Memory barrier fix (danvet) Changes since v4: - Remove ww_mutex_unlock_single and ww_mutex_lock_single. - Rename ww_mutex_trylock_single to ww_mutex_trylock. - Remove separate implementations of ww_mutex_lock_slow*, normal functions can be used. Inline versions still exist for extra debugging. - Cleanup unneeded memory barriers, add comment to the remaining smp_mb(). That's not a proper changelog. It should be a short description of what it does, possibly referring to the new Documentation/ww-mutex-design.txt file for more details. Well they've helped me with some of the changes and contributed some code and/or fixes, but if acked-by is preferred I'll use that.. Signed-off-by: Maarten Lankhorst maarten.lankho...@canonical.com Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch Signed-off-by: Rob Clark robdcl...@gmail.com That's not a valid signoff chain: the last signoff in the chain is the person sending me the patch. 
The first signoff is the person who wrote the patch. The other two gents should be Acked-by I suspect? I guess so.
Re: [RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization framework
Op 17-06-13 13:15, Inki Dae schreef:

This patch adds a buffer synchronization framework based on DMA BUF[1] and reservation[2] to use dma-buf resources, and based on ww-mutexes[3] for its lock mechanism. The purpose of this framework is not only to couple cache operations and buffer access control to CPU and DMA, but also to provide easy-to-use interfaces for device drivers and potentially user applications (not implemented for user applications, yet). And this framework can be used for all dma devices using system memory as dma buffer, especially for most ARM based SoCs.

Changelog v2:
- use atomic_add_unless to avoid a potential bug.
- add a macro for checking valid access type.
- code clean.

The mechanism of this framework has the following steps:

1. Register dmabufs to a sync object - A task gets a new sync object and can add one or more dmabufs that the task wants to access. This registering should be performed when a device context or an event context such as a page flip event is created, or before the CPU accesses a shared buffer.

	dma_buf_sync_get(a sync object, a dmabuf);

2. Lock a sync object - A task tries to lock all dmabufs added in its own sync object. Basically, the lock mechanism uses ww-mutexes[1] to avoid deadlock issues and race conditions between CPU and CPU, CPU and DMA, and DMA and DMA. Taking a lock means that others cannot access all locked dmabufs until the task that locked the corresponding dmabufs unlocks all the locked dmabufs. This locking should be performed before DMA or CPU accesses these dmabufs.

	dma_buf_sync_lock(a sync object);

3. Unlock a sync object - The task unlocks all dmabufs added in its own sync object. The unlock means that the DMA or CPU accesses to the dmabufs have been completed so that others may access them. This unlocking should be performed after DMA or CPU has completed accesses to the dmabufs.

	dma_buf_sync_unlock(a sync object);

4. Unregister one or all dmabufs from a sync object - A task unregisters the given dmabufs from the sync object. This means that the task doesn't want to lock the dmabufs. The unregistering should be performed after DMA or CPU has completed accesses to the dmabufs or when dma_buf_sync_lock() has failed.

	dma_buf_sync_put(a sync object, a dmabuf);
	dma_buf_sync_put_all(a sync object);

The described steps may be summarized as:

	get -> lock -> CPU or DMA access to a buffer/s -> unlock -> put

This framework includes the following two features.

1. read (shared) and write (exclusive) locks - A task is required to declare the access type when the task tries to register a dmabuf: READ, WRITE, READ DMA, or WRITE DMA. Example code:

	struct dmabuf_sync *sync;

	sync = dmabuf_sync_init(NULL, "test sync");
	dmabuf_sync_get(sync, dmabuf, DMA_BUF_ACCESS_READ);
	...

And the below can be used as access types:

	DMA_BUF_ACCESS_READ - CPU will access a buffer for read.
	DMA_BUF_ACCESS_WRITE - CPU will access a buffer for read or write.
	DMA_BUF_ACCESS_READ | DMA_BUF_ACCESS_DMA - DMA will access a buffer for read.
	DMA_BUF_ACCESS_WRITE | DMA_BUF_ACCESS_DMA - DMA will access a buffer for read or write.

2. Mandatory resource releasing - a task cannot hold a lock indefinitely. A task may never try to unlock a buffer after taking a lock to the buffer. In this case, a timer handler to the corresponding sync object is called in five (default) seconds, and then the timed-out buffer is unlocked by a work queue handler to avoid lockups and to enforce the release of the buffer's resources.

The below is how to use:

1. Allocate and initialize a sync object:

	struct dmabuf_sync *sync;

	sync = dmabuf_sync_init(NULL, "test sync");
	...

2. Add a dmabuf to the sync object when setting up dma buffer relevant registers:

	dmabuf_sync_get(sync, dmabuf, DMA_BUF_ACCESS_READ);
	...

3. Lock all dmabufs of the sync object before DMA or CPU accesses the dmabufs:

	dmabuf_sync_lock(sync);
	...

4. Now CPU or DMA can access all dmabufs locked in step 3.

5. Unlock all dmabufs added in a sync object after DMA or CPU access to these dmabufs is completed:

	dmabuf_sync_unlock(sync);

   And call the following functions to release all resources:

	dmabuf_sync_put_all(sync);
	dmabuf_sync_fini(sync);

You can refer to actual example codes:

	https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-exynos.git/commit/?h=dmabuf-sync&id=4030bdee9bab5841ad32faade528d04cc0c5fc94
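The declared access types form a small compatibility matrix: reads share, and any write is exclusive, regardless of whether the accessor is the CPU or a DMA device. A userspace model of that check (the flag values below are purely illustrative; the RFC defines the real ones):

```c
#include <assert.h>

/* Illustrative flag layout modeled on the RFC's DMA_BUF_ACCESS_* types. */
#define ACC_READ  (1u << 0)
#define ACC_WRITE (1u << 1)
#define ACC_DMA   (1u << 2)

/* Reads share; any write is exclusive.  The DMA bit only records *who*
 * is accessing and does not change compatibility. */
static int accesses_conflict(unsigned a, unsigned b)
{
	return (a & ACC_WRITE) || (b & ACC_WRITE);
}
```

This is the property the ww-mutex based locking has to enforce across tasks: two readers may hold the buffer concurrently, while a writer must wait for (or wound) every other accessor.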
Re: [RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization framework
Op 17-06-13 15:04, Inki Dae schreef:

-----Original Message-----
From: Maarten Lankhorst [mailto:maarten.lankho...@canonical.com]
Sent: Monday, June 17, 2013 8:35 PM
To: Inki Dae
Cc: dri-de...@lists.freedesktop.org; linux-fb...@vger.kernel.org; linux-arm-ker...@lists.infradead.org; linux-media@vger.kernel.org; dan...@ffwll.ch; robdcl...@gmail.com; kyungmin.p...@samsung.com; myungjoo@samsung.com; yj44@samsung.com
Subject: Re: [RFC PATCH v2] dmabuf-sync: Introduce buffer synchronization framework

[quoted text snipped; identical to the message above]
Re: [PATCH v4 0/4] add mutex wound/wait style locks
Op 28-05-13 16:48, Maarten Lankhorst schreef:

Version 4 already? Small api changes since v3:
- Remove ww_mutex_unlock_single and ww_mutex_lock_single.
- Rename ww_mutex_trylock_single to ww_mutex_trylock.
- Remove separate implementations of ww_mutex_lock_slow*, normal functions can be used. Inline versions still exist for extra debugging, and to annotate.
- Cleanup unneeded memory barriers, add comment to the remaining smp_mb().

Thanks to Daniel Vetter, Rob Clark and Peter Zijlstra for their feedback.

---
Daniel Vetter (1):
  mutex: w/w mutex slowpath debugging

Maarten Lankhorst (3):
  arch: make __mutex_fastpath_lock_retval return whether fastpath succeeded or not.
  mutex: add support for wound/wait style locks, v5
  mutex: Add ww tests to lib/locking-selftest.c. v4

 Documentation/ww-mutex-design.txt | 344 +++
 arch/ia64/include/asm/mutex.h     |  10 -
 arch/powerpc/include/asm/mutex.h  |  10 -
 arch/sh/include/asm/mutex-llsc.h  |   4 
 arch/x86/include/asm/mutex_32.h   |  11 -
 arch/x86/include/asm/mutex_64.h   |  11 -
 include/asm-generic/mutex-dec.h   |  10 -
 include/asm-generic/mutex-null.h  |   2 
 include/asm-generic/mutex-xchg.h  |  10 -
 include/linux/mutex-debug.h       |   1 
 include/linux/mutex.h             | 363 +
 kernel/mutex.c                    | 384 ---
 lib/Kconfig.debug                 |  13 +
 lib/debug_locks.c                 |   2 
 lib/locking-selftest.c            | 410 +++--
 15 files changed, 1492 insertions(+), 93 deletions(-)
 create mode 100644 Documentation/ww-mutex-design.txt

Bump, do you have any feedback peterz?
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC][PATCH 1/2] dma-buf: add importer private data to attachment
Op 07-06-13 04:32, Seung-Woo Kim schreef:

Hello Maarten,

On 2013-06-05 22:23, Maarten Lankhorst wrote:

Op 31-05-13 10:54, Seung-Woo Kim schreef:

dma-buf attachment has only exporter private data, but importer private data can be useful for an importer, especially to re-import the same dma-buf. To use importer private data in the attachment of a device, a function to search for the attachment in the attachment list of a dma-buf is also added.

Signed-off-by: Seung-Woo Kim <sw0312@samsung.com>
---
 drivers/base/dma-buf.c  | 31 +++
 include/linux/dma-buf.h |  4 
 2 files changed, 35 insertions(+), 0 deletions(-)

diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c
index 08fe897..a1eaaf2 100644
--- a/drivers/base/dma-buf.c
+++ b/drivers/base/dma-buf.c
@@ -259,6 +259,37 @@ err_attach:
 EXPORT_SYMBOL_GPL(dma_buf_attach);
 
 /**
+ * dma_buf_get_attachment - Get attachment with the device from dma_buf's
+ * attachments list
+ * @dmabuf:	[in]	buffer to find device from.
+ * @dev:	[in]	device to be found.
+ *
+ * Returns struct dma_buf_attachment * attaching the device; may return
+ * negative error codes.
+ *
+ */
+struct dma_buf_attachment *dma_buf_get_attachment(struct dma_buf *dmabuf,
+						  struct device *dev)
+{
+	struct dma_buf_attachment *attach;
+
+	if (WARN_ON(!dmabuf || !dev))
+		return ERR_PTR(-EINVAL);
+
+	mutex_lock(&dmabuf->lock);
+	list_for_each_entry(attach, &dmabuf->attachments, node) {
+		if (attach->dev == dev) {
+			mutex_unlock(&dmabuf->lock);
+			return attach;
+		}
+	}
+	mutex_unlock(&dmabuf->lock);
+
+	return ERR_PTR(-ENODEV);
+}
+EXPORT_SYMBOL_GPL(dma_buf_get_attachment);

NAK in any form.. Spot the race condition between dma_buf_get_attachment and dma_buf_attach

Both dma_buf_get_attachment and dma_buf_attach take the mutex dmabuf->lock, and dma_buf_get_attachment is used to get the attachment from the same device before calling dma_buf_attach.

hint: what happens if 2 threads do this:

	if (IS_ERR(attach = dma_buf_get_attachment(buf, dev)))
		attach = dma_buf_attach(buf, dev);

There really is no correct usecase for this kind of thing, so please don't do it.

Meanwhile, dma_buf_detach can remove an attachment because it does not have a refcount. So an importer should check the refcount in its importer private data before calling dma_buf_detach if the importer wants to use dma_buf_get_attachment. And dma_buf_get_attachment is for the importer, so an exporter's attach and detach callback operations should not call it, just as an exporter's detach callback should not call dma_buf_attach, if that is the kind of race you mean. If you have other considerations here, please describe them more specifically.

Thanks and Best Regards,
- Seung-Woo Kim

~Maarten
Re: [RFC][PATCH 1/2] dma-buf: add importer private data to attachment
Op 31-05-13 10:54, Seung-Woo Kim schreef:

dma-buf attachment has only exporter private data, but importer private data can be useful for an importer, especially to re-import the same dma-buf. To use importer private data in the attachment of a device, a function to search for the attachment in the attachment list of a dma-buf is also added.

Signed-off-by: Seung-Woo Kim <sw0312@samsung.com>
---
 drivers/base/dma-buf.c  | 31 +++
 include/linux/dma-buf.h |  4 
 2 files changed, 35 insertions(+), 0 deletions(-)

diff --git a/drivers/base/dma-buf.c b/drivers/base/dma-buf.c
index 08fe897..a1eaaf2 100644
--- a/drivers/base/dma-buf.c
+++ b/drivers/base/dma-buf.c
@@ -259,6 +259,37 @@ err_attach:
 EXPORT_SYMBOL_GPL(dma_buf_attach);
 
 /**
+ * dma_buf_get_attachment - Get attachment with the device from dma_buf's
+ * attachments list
+ * @dmabuf:	[in]	buffer to find device from.
+ * @dev:	[in]	device to be found.
+ *
+ * Returns struct dma_buf_attachment * attaching the device; may return
+ * negative error codes.
+ *
+ */
+struct dma_buf_attachment *dma_buf_get_attachment(struct dma_buf *dmabuf,
+						  struct device *dev)
+{
+	struct dma_buf_attachment *attach;
+
+	if (WARN_ON(!dmabuf || !dev))
+		return ERR_PTR(-EINVAL);
+
+	mutex_lock(&dmabuf->lock);
+	list_for_each_entry(attach, &dmabuf->attachments, node) {
+		if (attach->dev == dev) {
+			mutex_unlock(&dmabuf->lock);
+			return attach;
+		}
+	}
+	mutex_unlock(&dmabuf->lock);
+
+	return ERR_PTR(-ENODEV);
+}
+EXPORT_SYMBOL_GPL(dma_buf_get_attachment);

NAK in any form.. Spot the race condition between dma_buf_get_attachment and dma_buf_attach

~Maarten
Re: [PATCH v4 2/4] mutex: add support for wound/wait style locks, v5
Op 29-05-13 12:33, Inki Dae schreef:

Hi,

Just minor comments

+Usage
+-----
+
+Three different ways to acquire locks within the same w/w class. Common
+definitions for methods #1 and #2:
+
+static DEFINE_WW_CLASS(ww_class);
+
+struct obj {
+	struct ww_mutex lock;
+	/* obj data */
+};
+
+struct obj_entry {
+	struct list_head *list;
+	struct obj *obj;
+};
+
+Method 1, using a list in execbuf->buffers that's not allowed to be reordered.
+This is useful if a list of required objects is already tracked somewhere.
+Furthermore the lock helper can propagate the -EALREADY return code back to
+the caller as a signal that an object is twice on the list. This is useful if
+the list is constructed from userspace input and the ABI requires userspace to
+not have duplicate entries (e.g. for a gpu commandbuffer submission ioctl).
+
+int lock_objs(struct list_head *list, struct ww_acquire_ctx *ctx)
+{
+	struct obj *res_obj = NULL;
+	struct obj_entry *contended_entry = NULL;
+	struct obj_entry *entry;
+
+	ww_acquire_init(ctx, &ww_class);
+
+retry:
+	list_for_each_entry (list, entry) {
+		if (entry == res_obj) {

Indeed, documentation was wrong. With the below diff it should almost compile now. I really don't want to know if it really does, it's meant to be documentation!
diff --git a/Documentation/ww-mutex-design.txt b/Documentation/ww-mutex-design.txt
index 8bd1761..379739c 100644
--- a/Documentation/ww-mutex-design.txt
+++ b/Documentation/ww-mutex-design.txt
@@ -100,7 +100,7 @@ struct obj {
 };
 
 struct obj_entry {
-	struct list_head *list;
+	struct list_head head;
 	struct obj *obj;
 };
 
@@ -120,14 +120,14 @@ int lock_objs(struct list_head *list, struct ww_acquire_ctx *ctx)
 	ww_acquire_init(ctx, &ww_class);
 
 retry:
-	list_for_each_entry (list, entry) {
-		if (entry == res_obj) {
+	list_for_each_entry (entry, list, head) {
+		if (entry->obj == res_obj) {
 			res_obj = NULL;
 			continue;
 		}
 		ret = ww_mutex_lock(&entry->obj->lock, ctx);
 		if (ret < 0) {
-			contended_obj = entry;
+			contended_entry = entry;
 			goto err;
 		}
 	}
@@ -136,7 +136,7 @@ retry:
 	return 0;
 
 err:
-	list_for_each_entry_continue_reverse (list, contended_entry, entry)
+	list_for_each_entry_continue_reverse (entry, list, head)
 		ww_mutex_unlock(&entry->obj->lock);
 
 	if (res_obj)
@@ -163,13 +163,13 @@ int lock_objs(struct list_head *list, struct ww_acquire_ctx *ctx)
 
 	ww_acquire_init(ctx, &ww_class);
 
-	list_for_each_entry (list, entry) {
+	list_for_each_entry (entry, list, head) {
 		ret = ww_mutex_lock(&entry->obj->lock, ctx);
 		if (ret < 0) {
 			entry2 = entry;
-			list_for_each_entry_continue_reverse (list, entry2)
-				ww_mutex_unlock(&entry->obj->lock);
+			list_for_each_entry_continue_reverse (entry2, list, head)
+				ww_mutex_unlock(&entry2->obj->lock);
 
 			if (ret != -EDEADLK) {
 				ww_acquire_fini(ctx);
@@ -184,8 +184,8 @@ int lock_objs(struct list_head *list, struct ww_acquire_ctx *ctx)
 			 * buf->next to the first unlocked entry,
 			 * restarting the for loop.
 			 */
-			list_del(&entry->list);
-			list_add(&entry->list, list);
+			list_del(&entry->head);
+			list_add(&entry->head, list);
 		}
 	}
 
@@ -199,7 +199,7 @@ void unlock_objs(struct list_head *list, struct ww_acquire_ctx *ctx)
 {
 	struct obj_entry *entry;
 
-	list_for_each_entry (list, entry)
+	list_for_each_entry (entry, list, head)
 		ww_mutex_unlock(&entry->obj->lock);
 
 	ww_acquire_fini(ctx);
@@ -244,22 +244,21 @@ struct obj {
 
 static DEFINE_WW_CLASS(ww_class);
 
-void __unlock_objs(struct list_head *list, struct ww_acquire_ctx *ctx)
+void __unlock_objs(struct list_head *list)
 {
-	struct obj entry;
+	struct obj *entry, *temp;
 
-	for_each_safe (list, entry) {
+	list_for_each_entry_safe (entry, temp, list, locked_list) {
 		/* need to do that before unlocking, since only the current
 		 * lock holder is allowed to use object */
-		list_del(entry->locked_list);
+		list_del(&entry->locked_list);
 		ww_mutex_unlock(&entry->ww_mutex)
 	}
 }
 
 void lock_objs(struct list_head *list, struct ww_acquire_ctx *ctx)
 {
-	struct list_head locked_buffers;
-	struct obj obj = NULL, entry;
+	struct obj *obj;
Re: Introduce a new helper framework for buffer synchronization
Hey,

Op 28-05-13 04:49, Inki Dae schreef:

-----Original Message-----
From: Maarten Lankhorst [mailto:maarten.lankho...@canonical.com]
Sent: Tuesday, May 28, 2013 12:23 AM
To: Inki Dae
Cc: 'Daniel Vetter'; 'Rob Clark'; 'linux-fbdev'; 'YoungJun Cho'; 'Kyungmin Park'; 'myungjoo.ham'; 'DRI mailing list'; linux-arm-ker...@lists.infradead.org; linux-media@vger.kernel.org
Subject: Re: Introduce a new helper framework for buffer synchronization

Hey,

Op 27-05-13 12:38, Inki Dae schreef:

Hi all,

I have removed the previous branch and added a new one with more cleanup. This time, the fence helper doesn't include user-side interfaces and cache-operation-relevant code anymore, because we are not yet sure that coupling those two things (synchronizing caches and buffer access between CPU and CPU, CPU and DMA, and DMA and DMA) with fences on the kernel side is a good idea, and the existing code for the user side also has problems with badly behaved or crashing userspace. So this could be discussed further later.

The below is a new branch,

https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-exynos.git/?h=dma-fence-helper

And fence helper codes,

https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-exynos.git/commit/?h=dma-fence-helper&id=adcbc0fe7e285ce866e5816e5e21443dcce01005

And example codes for device driver,

https://git.kernel.org/cgit/linux/kernel/git/daeinki/drm-exynos.git/commit/?h=dma-fence-helper&id=d2ce7af23835789602a99d0ccef1f53cdd5caaae

I think the time is not yet ripe for an RFC posting: maybe the existing dma-fence and reservation code needs more review and additional work. So I'd be glad for somebody to give other opinions and advice in advance before the RFC posting.

NAK. For examples of how to handle locking properly, see Documentation/ww-mutex-design.txt in my recent tree. I could list what I believe is wrong with your implementation, but the real problem is that the approach you're taking is wrong.

I just removed the ticket stubs to show my approach to you guys as simply as possible, and I wanted to show that we could use a buffer synchronization mechanism without ticket stubs.

The tickets have been removed in favor of a ww_context. Moving it in as a base primitive allows more locking abuse to be detected, and makes some other things easier too.

Question: could ww-mutexes be used for all devices? I guess this has a dependence on x86 GPUs: a GPU has VRAM, and that means a different memory domain. And could you tell me why a shared fence should have only eight objects? I think we could need more than eight objects for read access. Anyway, I don't think I fully understand this yet, so I might be missing a point.

Yes, ww mutexes are not limited in any way to x86. They're a locking mechanism. When you have acquired the ww mutexes for all buffer objects, all it does is say that at that point in time you have exclusively acquired the locks of all bo's. After locking everything you can read the fence pointers safely, queue waits, and set a new fence pointer on all reservation_objects. You only need a single fence on all those objects, so 8 is plenty. Nonetheless this was a limitation of my earlier design, and I'll dynamically allocate fence_shared in the future.

~Maarten
[PATCH v4 1/4] arch: make __mutex_fastpath_lock_retval return whether fastpath succeeded or not.
This will allow me to call functions that have multiple arguments if fastpath fails. This is required to support ticket mutexes, because they need to be able to pass an extra argument to the fail function.

Originally I duplicated the functions, by adding __mutex_fastpath_lock_retval_arg. This ended up being just a duplication of the existing function, so a way to test if fastpath was called ended up being better.

This also cleaned up the reservation mutex patch some by being able to call an atomic_set instead of atomic_xchg, and making it easier to detect if the wrong unlock function was previously used.

Changes since v1, pointed out by Francesco Lavra:
- fix a small comment issue in mutex_32.h
- fix the __mutex_fastpath_lock_retval macro for mutex-null.h

Signed-off-by: Maarten Lankhorst <maarten.lankho...@canonical.com>
---
 arch/ia64/include/asm/mutex.h    | 10 --
 arch/powerpc/include/asm/mutex.h | 10 --
 arch/sh/include/asm/mutex-llsc.h |  4 ++--
 arch/x86/include/asm/mutex_32.h  | 11 ---
 arch/x86/include/asm/mutex_64.h  | 11 ---
 include/asm-generic/mutex-dec.h  | 10 --
 include/asm-generic/mutex-null.h |  2 +-
 include/asm-generic/mutex-xchg.h | 10 --
 kernel/mutex.c                   | 32 ++--
 9 files changed, 41 insertions(+), 59 deletions(-)

diff --git a/arch/ia64/include/asm/mutex.h b/arch/ia64/include/asm/mutex.h
index bed73a6..f41e66d 100644
--- a/arch/ia64/include/asm/mutex.h
+++ b/arch/ia64/include/asm/mutex.h
@@ -29,17 +29,15 @@ __mutex_fastpath_lock(atomic_t *count, void (*fail_fn)(atomic_t *))
  * __mutex_fastpath_lock_retval - try to take the lock by moving the count
  * from 1 to a 0 value
  * @count: pointer of type atomic_t
- * @fail_fn: function to call if the original value was not 1
  *
- * Change the count from 1 to a value lower than 1, and call fail_fn if
- * it wasn't 1 originally. This function returns 0 if the fastpath succeeds,
- * or anything the slow path function returns.
+ * Change the count from 1 to a value lower than 1. This function returns 0
+ * if the fastpath succeeds, or -1 otherwise.
  */
 static inline int
-__mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
+__mutex_fastpath_lock_retval(atomic_t *count)
 {
 	if (unlikely(ia64_fetchadd4_acq(count, -1) != 1))
-		return fail_fn(count);
+		return -1;
 	return 0;
 }
 
diff --git a/arch/powerpc/include/asm/mutex.h b/arch/powerpc/include/asm/mutex.h
index 5399f7e..127ab23 100644
--- a/arch/powerpc/include/asm/mutex.h
+++ b/arch/powerpc/include/asm/mutex.h
@@ -82,17 +82,15 @@ __mutex_fastpath_lock(atomic_t *count, void (*fail_fn)(atomic_t *))
  * __mutex_fastpath_lock_retval - try to take the lock by moving the count
  * from 1 to a 0 value
  * @count: pointer of type atomic_t
- * @fail_fn: function to call if the original value was not 1
  *
- * Change the count from 1 to a value lower than 1, and call fail_fn if
- * it wasn't 1 originally. This function returns 0 if the fastpath succeeds,
- * or anything the slow path function returns.
+ * Change the count from 1 to a value lower than 1. This function returns 0
+ * if the fastpath succeeds, or -1 otherwise.
  */
 static inline int
-__mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
+__mutex_fastpath_lock_retval(atomic_t *count)
 {
 	if (unlikely(__mutex_dec_return_lock(count) < 0))
-		return fail_fn(count);
+		return -1;
 	return 0;
 }
 
diff --git a/arch/sh/include/asm/mutex-llsc.h b/arch/sh/include/asm/mutex-llsc.h
index 090358a..dad29b6 100644
--- a/arch/sh/include/asm/mutex-llsc.h
+++ b/arch/sh/include/asm/mutex-llsc.h
@@ -37,7 +37,7 @@ __mutex_fastpath_lock(atomic_t *count, void (*fail_fn)(atomic_t *))
 }
 
 static inline int
-__mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
+__mutex_fastpath_lock_retval(atomic_t *count)
 {
 	int __done, __res;
 
@@ -51,7 +51,7 @@ __mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
 		: "t");
 
 	if (unlikely(!__done || __res != 0))
-		__res = fail_fn(count);
+		__res = -1;
 
 	return __res;
 }
 
diff --git a/arch/x86/include/asm/mutex_32.h b/arch/x86/include/asm/mutex_32.h
index 03f90c8..0208c3c 100644
--- a/arch/x86/include/asm/mutex_32.h
+++ b/arch/x86/include/asm/mutex_32.h
@@ -42,17 +42,14 @@ do {								\
  * __mutex_fastpath_lock_retval - try to take the lock by moving the count
  * from 1 to a 0 value
  * @count: pointer of type atomic_t
- * @fail_fn: function to call if the original value was not 1
  *
- * Change the count from 1 to a value lower than 1, and call fail_fn if it
- * wasn't 1 originally. This function returns 0
[PATCH v4 4/4] mutex: w/w mutex slowpath debugging
From: Daniel Vetter <daniel.vet...@ffwll.ch>

Injects EDEADLK conditions at pseudo-random intervals, with exponential backoff up to UINT_MAX (to ensure that every lock operation still completes in a reasonable time).

This way we can test the wound slowpath even for ww mutex users where contention is never expected, and the ww deadlock avoidance algorithm is only needed for correctness against malicious userspace. An example would be protecting kernel modesetting properties, which thanks to single-threaded X isn't really expected to contend, ever.

I've looked into using the CONFIG_FAULT_INJECTION infrastructure, but decided against it for two reasons:

- EDEADLK handling is mandatory for ww mutex users and should never affect the outcome of a syscall. This is in contrast to -ENOMEM injection. So fine configurability isn't required.

- The fault injection framework only allows setting a simple probability for failure. Now the probability that a ww mutex acquire stage with N locks will never complete (due to too many injected EDEADLK backoffs) is zero. But the expected number of ww_mutex_lock operations for the completely uncontended case would be O(exp(N)). The per-acquire-ctx exponential backoff solution chosen here only results in O(log N) overhead due to injection, and so O(log N * N) lock operations. This way we can fail with high probability (and so have good test coverage even for fancy backoff and lock acquisition paths) without running into pathological cases.

Note that EDEADLK will only ever be injected when we managed to acquire the lock. This prevents any behaviour changes for users which rely on the EALREADY semantics.

v2: Drop the cargo-culted __sched (I should read docs next time around) and annotate the non-debug case with inline to prevent gcc from doing something horrible.

v3: Rebase on top of Maarten's latest patches.

v4: Actually make this stuff compile, I'd misplaced the hunk in the wrong #ifdef block.

v5: Simplify the ww_mutex_deadlock_injection definition, and fix lib/locking-selftest.c warnings. Fix the lib/Kconfig.debug definition to work correctly. (mlankhorst)

v6: Do not inject -EDEADLK when ctx->acquired == 0, because the _slow paths are merged now. (mlankhorst)

Cc: Steven Rostedt <rost...@goodmis.org>
Signed-off-by: Daniel Vetter <daniel.vet...@ffwll.ch>
Signed-off-by: Maarten Lankhorst <maarten.lankho...@canonical.com>
---
 include/linux/mutex.h  |  8 ++++
 kernel/mutex.c         | 44 +++++++++++++++---
 lib/Kconfig.debug      | 13 +++++
 lib/locking-selftest.c |  5 +
 4 files changed, 67 insertions(+), 3 deletions(-)

diff --git a/include/linux/mutex.h b/include/linux/mutex.h
index f3ad181..2ff9178 100644
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -95,6 +95,10 @@ struct ww_acquire_ctx {
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 	struct lockdep_map dep_map;
 #endif
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+	unsigned deadlock_inject_interval;
+	unsigned deadlock_inject_countdown;
+#endif
 };
 
 struct ww_mutex {
@@ -280,6 +284,10 @@ static inline void ww_acquire_init(struct ww_acquire_ctx *ctx,
 			 &ww_class->acquire_key, 0);
 	mutex_acquire(&ctx->dep_map, 0, 0, _RET_IP_);
 #endif
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+	ctx->deadlock_inject_interval = 1;
+	ctx->deadlock_inject_countdown = ctx->stamp & 0xf;
+#endif
 }
 
 /**
diff --git a/kernel/mutex.c b/kernel/mutex.c
index 75fc7c4..e40004b 100644
--- a/kernel/mutex.c
+++ b/kernel/mutex.c
@@ -508,22 +508,60 @@ mutex_lock_interruptible_nested(struct mutex *lock, unsigned int subclass)
 EXPORT_SYMBOL_GPL(mutex_lock_interruptible_nested);
 
+static inline int
+ww_mutex_deadlock_injection(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
+{
+#ifdef CONFIG_DEBUG_WW_MUTEX_SLOWPATH
+	unsigned tmp;
+
+	if (ctx->deadlock_inject_countdown-- == 0) {
+		tmp = ctx->deadlock_inject_interval;
+		if (tmp > UINT_MAX/4)
+			tmp = UINT_MAX;
+		else
+			tmp = tmp*2 + tmp + tmp/2;
+
+		ctx->deadlock_inject_interval = tmp;
+		ctx->deadlock_inject_countdown = tmp;
+		ctx->contending_lock = lock;
+
+		ww_mutex_unlock(lock);
+
+		return -EDEADLK;
+	}
+#endif
+
+	return 0;
+}
 
 int __sched
 __ww_mutex_lock(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
 {
+	int ret;
+
 	might_sleep();
-	return __mutex_lock_common(&lock->base, TASK_UNINTERRUPTIBLE,
+	ret = __mutex_lock_common(&lock->base, TASK_UNINTERRUPTIBLE,
 				   0, &ctx->dep_map, _RET_IP_, ctx);
+	if (!ret && ctx->acquired > 0)
+		return ww_mutex_deadlock_injection(lock, ctx);
+
+	return ret;
 }
 EXPORT_SYMBOL_GPL(__ww_mutex_lock);
 
 int __sched
 __ww_mutex_lock_interruptible(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
 {
+	int ret
[PATCH v4 3/4] mutex: Add ww tests to lib/locking-selftest.c. v4
This stresses the lockdep code in some ways specifically useful to ww_mutexes. It adds checks for most of the common locking errors.

Changes since v1:
- Add tests to verify reservation_id is untouched.
- Use L() and U() macros where possible.

Changes since v2:
- Use the ww_mutex api directly.
- Use macros for most of the code.

Changes since v3:
- Rework tests for the api changes.

Signed-off-by: Maarten Lankhorst <maarten.lankho...@canonical.com>
---
 lib/locking-selftest.c | 405 ++--
 1 file changed, 386 insertions(+), 19 deletions(-)

diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c
index c3eb261..b18f1d3 100644
--- a/lib/locking-selftest.c
+++ b/lib/locking-selftest.c
@@ -26,6 +26,8 @@
  */
 static unsigned int debug_locks_verbose;
 
+static DEFINE_WW_CLASS(ww_lockdep);
+
 static int __init setup_debug_locks_verbose(char *str)
 {
 	get_option(&str, &debug_locks_verbose);
@@ -42,6 +44,10 @@ __setup("debug_locks_verbose=", setup_debug_locks_verbose);
 #define LOCKTYPE_RWLOCK	0x2
 #define LOCKTYPE_MUTEX	0x4
 #define LOCKTYPE_RWSEM	0x8
+#define LOCKTYPE_WW	0x10
+
+static struct ww_acquire_ctx t, t2;
+static struct ww_mutex o, o2;
 
 /*
  * Normal standalone locks, for the circular and irq-context
@@ -193,6 +199,16 @@ static void init_shared_classes(void)
 #define RSU(x)			up_read(&rwsem_##x)
 #define RWSI(x)			init_rwsem(&rwsem_##x)
 
+#define WWAI(x)			ww_acquire_init(x, &ww_lockdep)
+#define WWAD(x)			ww_acquire_done(x)
+#define WWAF(x)			ww_acquire_fini(x)
+
+#define WWL(x, c)		ww_mutex_lock(x, c)
+#define WWT(x)			ww_mutex_trylock(x)
+#define WWL1(x)			ww_mutex_lock(x, NULL)
+#define WWU(x)			ww_mutex_unlock(x)
+
+
 #define LOCK_UNLOCK_2(x,y)	LOCK(x); LOCK(y); UNLOCK(y); UNLOCK(x)
 
 /*
@@ -894,11 +910,13 @@ GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_soft)
 # define I_RWLOCK(x)	lockdep_reset_lock(&rwlock_##x.dep_map)
 # define I_MUTEX(x)	lockdep_reset_lock(&mutex_##x.dep_map)
 # define I_RWSEM(x)	lockdep_reset_lock(&rwsem_##x.dep_map)
+# define I_WW(x)	lockdep_reset_lock(&x.dep_map)
 #else
 # define I_SPINLOCK(x)
 # define I_RWLOCK(x)
 # define I_MUTEX(x)
 # define I_RWSEM(x)
+# define I_WW(x)
 #endif
 
 #define I1(x) \
@@ -920,11 +938,20 @@ GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_soft)
 static void reset_locks(void)
 {
 	local_irq_disable();
+	lockdep_free_key_range(&ww_lockdep.acquire_key, 1);
+	lockdep_free_key_range(&ww_lockdep.mutex_key, 1);
+
 	I1(A); I1(B); I1(C); I1(D);
 	I1(X1); I1(X2); I1(Y1); I1(Y2); I1(Z1); I1(Z2);
+	I_WW(t); I_WW(t2); I_WW(o.base); I_WW(o2.base);
 	lockdep_reset();
 	I2(A); I2(B); I2(C); I2(D);
 	init_shared_classes();
+
+	ww_mutex_init(&o, &ww_lockdep); ww_mutex_init(&o2, &ww_lockdep);
+	memset(&t, 0, sizeof(t)); memset(&t2, 0, sizeof(t2));
+	memset(&ww_lockdep.acquire_key, 0, sizeof(ww_lockdep.acquire_key));
+	memset(&ww_lockdep.mutex_key, 0, sizeof(ww_lockdep.mutex_key));
 	local_irq_enable();
 }
 
@@ -938,7 +965,6 @@ static int unexpected_testcase_failures;
 static void dotest(void (*testcase_fn)(void), int expected, int lockclass_mask)
 {
 	unsigned long saved_preempt_count = preempt_count();
-	int expected_failure = 0;
 
 	WARN_ON(irqs_disabled());
 
@@ -946,26 +972,16 @@ static void dotest(void (*testcase_fn)(void), int expected, int lockclass_mask)
 	/*
 	 * Filter out expected failures:
 	 */
+	if (debug_locks != expected) {
 #ifndef CONFIG_PROVE_LOCKING
-	if ((lockclass_mask & LOCKTYPE_SPIN) && debug_locks != expected)
-		expected_failure = 1;
-	if ((lockclass_mask & LOCKTYPE_RWLOCK) && debug_locks != expected)
-		expected_failure = 1;
-	if ((lockclass_mask & LOCKTYPE_MUTEX) && debug_locks != expected)
-		expected_failure = 1;
-	if ((lockclass_mask & LOCKTYPE_RWSEM) && debug_locks != expected)
-		expected_failure = 1;
+		expected_testcase_failures++;
+		printk("failed|");
+#else
+		unexpected_testcase_failures++;
+		printk("FAILED|");
+
+		dump_stack();
 #endif
-	if (debug_locks != expected) {
-		if (expected_failure) {
-			expected_testcase_failures++;
-			printk("failed|");
-		} else {
-			unexpected_testcase_failures++;
-
-			printk("FAILED|");
-			dump_stack();
-		}
 	} else {
 		testcase_successes++;
 		printk("  ok  |");
@@ -1108,6 +1124,355 @@ static inline void print_testname(const char *testname)
 	DO_TESTCASE_6IRW(desc, name, 312
[PATCH v4 2/4] mutex: add support for wound/wait style locks, v5
Changes since RFC patch v1:
- Updated to use atomic_long instead of atomic, since the reservation_id was a long.
- added mutex_reserve_lock_slow and mutex_reserve_lock_intr_slow
- removed mutex_locked_set_reservation_id (or w/e it was called)

Changes since RFC patch v2:
- remove use of __mutex_lock_retval_arg, add warnings when using wrong combination of mutex_(,reserve_)lock/unlock.

Changes since v1:
- Add __always_inline to __mutex_lock_common, otherwise reservation paths can be triggered from normal locks, because __builtin_constant_p might evaluate to false for the constant 0 in that case. Tests for this have been added in the next patch.
- Updated documentation slightly.

Changes since v2:
- Renamed everything to ww_mutex. (mlankhorst)
- Added ww_acquire_ctx and ww_class. (mlankhorst)
- Added a lot of checks for wrong api usage. (mlankhorst)
- Documentation updates. (danvet)

Changes since v3:
- Small documentation fixes (robclark)
- Memory barrier fix (danvet)

Changes since v4:
- Remove ww_mutex_unlock_single and ww_mutex_lock_single.
- Rename ww_mutex_trylock_single to ww_mutex_trylock.
- Remove separate implementations of ww_mutex_lock_slow*, normal functions can be used. Inline versions still exist for extra debugging.
- Cleanup unneeded memory barriers, add comment to the remaining smp_mb().
Signed-off-by: Maarten Lankhorst <maarten.lankho...@canonical.com>
Signed-off-by: Daniel Vetter <daniel.vet...@ffwll.ch>
Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 Documentation/ww-mutex-design.txt | 344 
 include/linux/mutex-debug.h       |   1 
 include/linux/mutex.h             | 355 +
 kernel/mutex.c                    | 318 +++--
 lib/debug_locks.c                 |   2 
 5 files changed, 1003 insertions(+), 17 deletions(-)
 create mode 100644 Documentation/ww-mutex-design.txt

diff --git a/Documentation/ww-mutex-design.txt b/Documentation/ww-mutex-design.txt
new file mode 100644
index 0000000..8bd1761
--- /dev/null
+++ b/Documentation/ww-mutex-design.txt
@@ -0,0 +1,344 @@
+Wait/Wound Deadlock-Proof Mutex Design
+======================================
+
+Please read mutex-design.txt first, as it applies to wait/wound mutexes too.
+
+Motivation for WW-Mutexes
+-------------------------
+
+GPUs do operations that commonly involve many buffers. Those buffers
+can be shared across contexts/processes, exist in different memory
+domains (for example VRAM vs system memory), and so on. And with
+PRIME / dmabuf, they can even be shared across devices. So there are
+a handful of situations where the driver needs to wait for buffers to
+become ready. If you think about this in terms of waiting on a buffer
+mutex for it to become available, this presents a problem because
+there is no way to guarantee that buffers appear in an execbuf/batch in
+the same order in all contexts. That is directly under control of
+userspace, and a result of the sequence of GL calls that an application
+makes. Which results in the potential for deadlock. The problem gets
+more complex when you consider that the kernel may need to migrate the
+buffer(s) into VRAM before the GPU operates on the buffer(s), which
+may in turn require evicting some other buffers (and you don't want to
+evict other buffers which are already queued up to the GPU), but for a
+simplified understanding of the problem you can ignore this.
+
+The algorithm that TTM came up with for dealing with this problem is quite
+simple. For each group of buffers (execbuf) that need to be locked, the caller
+would be assigned a unique reservation id/ticket, from a global counter. In
+case of deadlock while locking all the buffers associated with an execbuf, the
+one with the lowest reservation ticket (i.e. the oldest task) wins, and the one
+with the higher reservation id (i.e. the younger task) unlocks all of the
+buffers that it has already locked, and then tries again.
+
+In the RDBMS literature this deadlock handling approach is called wait/wound:
+The older task waits until it can acquire the contended lock. The younger task
+needs to back off and drop all the locks it is currently holding, i.e. the
+younger task is wounded.
+
+Concepts
+--------
+
+Compared to normal mutexes two additional concepts/objects show up in the lock
+interface for w/w mutexes:
+
+Acquire context: To ensure eventual forward progress it is important that a
+task trying to acquire locks doesn't grab a new reservation id, but keeps the
+one it acquired when starting the lock acquisition. This ticket is stored in
+the acquire context. Furthermore the acquire context keeps track of debugging
+state to catch w/w mutex interface abuse.
+
+W/w class: In contrast to normal mutexes the lock class needs to be explicit
+for w/w mutexes, since it is required to initialize the acquire context.
+
+Furthermore there are three different classes of w/w lock