Re: [Intel-gfx] [PATCH 6/6] drm/i915: Only call tasklet_kill() on the first prepare_reset

2018-03-09 Thread Chris Wilson
Quoting Chris Wilson (2018-03-09 14:10:34)
> Quoting Chris Wilson (2018-03-07 13:42:26)
> > tasklet_kill() will spin waiting for the current tasklet to be executed.
> > However, if tasklet_disable() has been called, then the tasklet is never
> > executed but permanently put back onto the runlist until
> > tasklet_enable() is called. Ergo, we cannot use tasklet_kill() inside a
> > disable/enable pair. This is the case when we call set-wedge from inside
> > i915_reset(), and another request was submitted to us concurrent to the
> > reset.
> > 
> > Fixes: 963ddd63c314 ("drm/i915: Suspend submission tasklets around wedging")
> > Signed-off-by: Chris Wilson 
> > Cc: Mika Kuoppala 
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c 
> > b/drivers/gpu/drm/i915/i915_gem.c
> > index 3b44952e089f..e75af06904b7 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -2941,7 +2941,8 @@ i915_gem_reset_prepare_engine(struct intel_engine_cs 
> > *engine)
> >  * Turning off the execlists->tasklet until the reset is over
> >  * prevents the race.
> >  */
> > -   tasklet_kill(&engine->execlists.tasklet);
> > +   if (!atomic_read(&engine->execlists.tasklet.count))
> > +   tasklet_kill(&engine->execlists.tasklet);
> 
> Note this is racy; two separate atomic operations. The only race we have
> is with set-wedge vs reset, which is a rare and already contentious
> issue. The upside of preventing the lockup is that we don't lose the
> machine.
> 
> +* Note that this needs to be a single atomic operation on the
> +* tasklet (flush existing tasks, prevent new tasks) to prevent
> +* a race between reset and set-wedged. It is not, so we do the best
> +* we can atm and make sure we don't lock the machine up in the more
> +* common case of recursing calling set-wedged from inside i915_reset.

The alternative is to not try and do tasklet_kill() here. But that has
the side-effect of leaving it spinning if it was running at the same
time as the suspend. Hmm, for the moment I'll go with the comment and
see how inspired we are once gem_eio stops dieing.
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 6/6] drm/i915: Only call tasklet_kill() on the first prepare_reset

2018-03-09 Thread Chris Wilson
Quoting Chris Wilson (2018-03-07 13:42:26)
> tasklet_kill() will spin waiting for the current tasklet to be executed.
> However, if tasklet_disable() has been called, then the tasklet is never
> executed but permanently put back onto the runlist until
> tasklet_enable() is called. Ergo, we cannot use tasklet_kill() inside a
> disable/enable pair. This is the case when we call set-wedge from inside
> i915_reset(), and another request was submitted to us concurrent to the
> reset.
> 
> Fixes: 963ddd63c314 ("drm/i915: Suspend submission tasklets around wedging")
> Signed-off-by: Chris Wilson 
> Cc: Mika Kuoppala 
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 3b44952e089f..e75af06904b7 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2941,7 +2941,8 @@ i915_gem_reset_prepare_engine(struct intel_engine_cs 
> *engine)
>  * Turning off the execlists->tasklet until the reset is over
>  * prevents the race.
>  */
> -   tasklet_kill(&engine->execlists.tasklet);
> +   if (!atomic_read(&engine->execlists.tasklet.count))
> +   tasklet_kill(&engine->execlists.tasklet);

Note this is racy; two separate atomic operations. The only race we have
is with set-wedge vs reset, which is a rare and already contentious
issue. The upside of preventing the lockup is that we don't lose the
machine.

+* Note that this needs to be a single atomic operation on the
+* tasklet (flush existing tasks, prevent new tasks) to prevent
+* a race between reset and set-wedged. It is not, so we do the best
+* we can atm and make sure we don't lock the machine up in the more
+* common case of recursing calling set-wedged from inside i915_reset.
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH 6/6] drm/i915: Only call tasklet_kill() on the first prepare_reset

2018-03-09 Thread Mika Kuoppala
Chris Wilson  writes:

> tasklet_kill() will spin waiting for the current tasklet to be executed.
> However, if tasklet_disable() has been called, then the tasklet is never
> executed but permanently put back onto the runlist until
> tasklet_enable() is called. Ergo, we cannot use tasklet_kill() inside a
> disable/enable pair. This is the case when we call set-wedge from inside
> i915_reset(), and another request was submitted to us concurrent to the
> reset.
>
> Fixes: 963ddd63c314 ("drm/i915: Suspend submission tasklets around wedging")
> Signed-off-by: Chris Wilson 
> Cc: Mika Kuoppala 
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 3b44952e089f..e75af06904b7 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2941,7 +2941,8 @@ i915_gem_reset_prepare_engine(struct intel_engine_cs 
> *engine)
>* Turning off the execlists->tasklet until the reset is over
>* prevents the race.
>*/
> - tasklet_kill(&engine->execlists.tasklet);
> + if (!atomic_read(&engine->execlists.tasklet.count))
> + tasklet_kill(&engine->execlists.tasklet);

As discussed in irc, this might warrant  comment that it is
our tasklet of which count we are racily investigating here,
so the race does not matter in the path we avoid killing.

Reviewed-by: Mika Kuoppala 

>   tasklet_disable(&engine->execlists.tasklet);
>  
>   /*
> -- 
> 2.16.2
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


[Intel-gfx] [PATCH 6/6] drm/i915: Only call tasklet_kill() on the first prepare_reset

2018-03-07 Thread Chris Wilson
tasklet_kill() will spin waiting for the current tasklet to be executed.
However, if tasklet_disable() has been called, then the tasklet is never
executed but permanently put back onto the runlist until
tasklet_enable() is called. Ergo, we cannot use tasklet_kill() inside a
disable/enable pair. This is the case when we call set-wedge from inside
i915_reset(), and another request was submitted to us concurrent to the
reset.

Fixes: 963ddd63c314 ("drm/i915: Suspend submission tasklets around wedging")
Signed-off-by: Chris Wilson 
Cc: Mika Kuoppala 
---
 drivers/gpu/drm/i915/i915_gem.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 3b44952e089f..e75af06904b7 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2941,7 +2941,8 @@ i915_gem_reset_prepare_engine(struct intel_engine_cs 
*engine)
 * Turning off the execlists->tasklet until the reset is over
 * prevents the race.
 */
-   tasklet_kill(&engine->execlists.tasklet);
+   if (!atomic_read(&engine->execlists.tasklet.count))
+   tasklet_kill(&engine->execlists.tasklet);
tasklet_disable(&engine->execlists.tasklet);
 
/*
-- 
2.16.2

___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx