Quoting Tvrtko Ursulin (2018-02-19 18:13:13)
> On 19/02/2018 16:19, Chris Wilson wrote:
> > During igt, we frequently call into the driver to reset both HW and
> > driver state (idling the device, waiting for it to become idle and
> > freeing off old objects) to ensure that we start each test/subtest/pass
> > from known state. This process incurs an RCU barrier or two to ensure
> > that any such pending frees are indeed flushed before we return.
> > However, unconditionally waiting on the RCU barrier adds needless delay
> > to many callers, which adds up to several seconds when repeated thousands
> > of times. We can skip the rcu_barrier() if, by tracking how many
> > outstanding frees we have, we know there are none.
> To be pedantic it is not skipping the rcu_barrier, but skipping the
> drain altogether.
> So theoretically there is a tiny difference in behaviour where today
> drain would wait for all frees currently executing, where after the
> patch it will ignore these and only process the ones which got to the
> end of the function.
> Perhaps if atomic_inc was at the very top of i915_gem_free_object it
> would be closer to today. But such suggestions feel extremely iffy.
That's a smallish window. And it exists even today, with a race after
the RCU grace period (if you let userspace race with itself). I think
it's fair to say that we are dependent upon single-threaded client
operation here (either igt or suspend) for defined behaviour.
> Nosing around the code base suggests the change is completely fine. The
> only potentially relevant site which might care about the subtle
> difference is i915_gem_freeze_late, which actually doesn't care since
> everything has
> been frozen at that point. So all frees have presumably exited and
> incremented the new counter.
That's the idea at least :)
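
For anyone following along, the scheme can be sketched roughly as below.
This is a userspace illustration only, not the i915 code: C11 atomics
stand in for the kernel's atomic_t, stub_rcu_barrier() stands in for
rcu_barrier(), and every name here (free_count, queue_free,
flush_free_list, drain_freed_objects) is made up for the example.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical counter of frees that have been queued but not yet processed. */
static atomic_uint free_count;

/* Counts how often the (stubbed) barrier ran, so the skip is observable. */
static unsigned int barriers_run;

/* Stand-in for rcu_barrier(); the real call waits for pending RCU callbacks. */
static void stub_rcu_barrier(void)
{
	barriers_run++;
}

/*
 * Stand-in for the free path. The counter is bumped at the very end, so a
 * free that is still executing is invisible to the drain until it gets here.
 * That is the small window discussed above.
 */
static void queue_free(void)
{
	/* ... object teardown would happen here ... */
	atomic_fetch_add(&free_count, 1);
}

/* Stand-in for processing the deferred free list. */
static void flush_free_list(void)
{
	while (atomic_load(&free_count) > 0)
		atomic_fetch_sub(&free_count, 1);
}

/* Returns true if a drain was performed, false if it was skipped entirely. */
static bool drain_freed_objects(void)
{
	if (atomic_load(&free_count) == 0)
		return false;	/* nothing outstanding: skip barrier and flush */

	stub_rcu_barrier();
	flush_free_list();
	return true;
}
```

With nothing queued, drain_freed_objects() returns without touching the
barrier; after a queue_free() it runs the barrier once and the counter
drops back to zero. As noted above, this only gives defined behaviour
with a single-threaded caller.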