On Wed, Sep 14, 2016 at 07:52:37AM +0100, Chris Wilson wrote:
> In preparation to support many distinct timelines, we need to expand the
> activity tracking on the GEM object to handle more than just a request
> per engine. We already use the struct reservation_object on the dma-buf
> to handle many fence contexts, so integrating that into the GEM object
> itself is the preferred solution. (For example, we can now share the same
> reservation_object between every consumer/producer using this buffer and
> skip the manual import/export via dma-buf.)
> Caveats:
>  * busy-ioctl: the modern interface is completely thrown away,
> regressing us back to before
> commit e9808edd98679680804dfbc42c5ee8f1aa91f617 [v3.10]
> Author: Chris Wilson <ch...@chris-wilson.co.uk>
> Date:   Wed Jul 4 12:25:08 2012 +0100
>     drm/i915: Return a mask of the active rings in the high word of busy_ioctl

There is an alternative here, as we can walk through the obj->resv and
convert *i915* fences to their respective read/write engines.

However, this means that busy-ioctl and wait-ioctl(.timeout=0) are no
longer equivalent, as the busy-ioctl may report an object as idle while
it still has external rendering outstanding (and so wait-ioctl,
set-to-domain-ioctl, pread and pwrite may stall).

Walking the resv object is also more expensive than doing a simple test,
though that can be mitigated later by open-coding the test (which makes
more sense once we can do the whole busy-ioctl under only RCU
protection). Still, it makes busy-ioctl slower than ideal, and
busy-ioctl is called at high frequency (so long as we keep avoiding
struct_mutex, the difference is likely only visible in microbenchmarks).

Pro: we keep the finesse of being able to report which engines are busy
Con: not equivalent to wait-ioctl, polling the dmabuf, etc.

That we have alternative mechanisms to check may be considered a good
thing, i.e. we can migrate to using wait-ioctl for the general "am I
going to stall?" test, and to using busy-ioctl for engine selection. I
think I am leaning towards keeping the ABI more or less intact for the
time being, though we are getting pretty close to its limit on
NUM_ENGINES.

static int
i915_gem_busy_ioctl(struct drm_device *dev, void *data,
                    struct drm_file *file)
{
        struct drm_i915_gem_busy *args = data;
        struct drm_i915_gem_object *obj;
        struct fence **shared, *excl;
        unsigned int count, i;
        int ret;

        ret = -ENOENT;
        obj = i915_gem_object_lookup(file, args->handle);
        if (obj) {
                ret = reservation_object_get_fences_rcu(obj->resv,
                                                        &excl, &count, &shared);
                i915_gem_object_put_unlocked(obj);
        }
        if (unlikely(ret))
                return ret;

        /* A discrepancy here is that we do not report the status of
         * non-i915 fences, i.e. even though we may report the object as idle,
         * a call to set-domain may still stall waiting for foreign rendering.
         * This also means that wait-ioctl may report an object as busy,
         * where busy-ioctl considers it idle.
         * We trade the ability to warn of foreign fences to report on which
         * i915 engines are active for the object.
         * Alternatively, we can trade that extra information on read/write
         * activity with
         *      args->busy =
         *              !reservation_object_test_signaled_rcu(obj->resv, true);
         * to report the overall busyness. This is what the wait-ioctl does.
         */
        args->busy = 0;

        /* Translate shared fences to READ set of engines */
        for (i = 0; i < count; i++) {
                args->busy |= busy_check_reader(shared[i]);
                fence_put(shared[i]);
        }
        kfree(shared);

        /* Translate the exclusive fence to the READ *and* WRITE engine */
        if (excl) {
                args->busy |= busy_check_writer(excl);
                fence_put(excl);
        }

        return 0;
}


Chris Wilson, Intel Open Source Technology Centre