Re: [Intel-gfx] [PATCH] drm/i915: Skip waking the device to service pwrite

2017-09-04 Thread Daniel Vetter
On Mon, Sep 04, 2017 at 11:18:07AM +0100, Chris Wilson wrote:
> Quoting Daniel Vetter (2017-09-04 09:12:12)
> > On Wed, Aug 30, 2017 at 06:48:19PM +0100, Chris Wilson wrote:
> > > If the device is in runtime suspend, resuming takes time and reduces our
> > > powersaving. If this was for a small write into an object, that resume
> > > will take longer than any savings in using the indirect GGTT access to
> > > avoid the cpu cache.
> > > 
> > > Signed-off-by: Chris Wilson 
> > > ---
> > >  drivers/gpu/drm/i915/i915_gem.c | 21 ++---
> > >  1 file changed, 18 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > index 93dfa793975a..8940a6873ca5 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > @@ -1229,7 +1229,21 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object *obj,
> > >  	if (ret)
> > >  		return ret;
> > >  
> > > -	intel_runtime_pm_get(i915);
> > > +	if (i915_gem_object_has_struct_page(obj)) {
> > 
> > I don't really see why we need to check for has_struct_page here (we do
> > already outside the lock grabbing), and why if that's not the case we hit
> > the slow-path?
> 
> We can't use the alternate paths if we don't have struct_page, hence we
> have to force use of GTT if !i915_gem_object_has_struct_page. The
> previous test is to also make sure we come down this path and not fail.

Ow, I've entirely misread all the code, I thought this checks for
obj->pages. With clue gained on my side:

Reviewed-by: Daniel Vetter 
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


Re: [Intel-gfx] [PATCH] drm/i915: Skip waking the device to service pwrite

2017-09-04 Thread Chris Wilson
Quoting Daniel Vetter (2017-09-04 09:12:12)
> On Wed, Aug 30, 2017 at 06:48:19PM +0100, Chris Wilson wrote:
> > If the device is in runtime suspend, resuming takes time and reduces our
> > powersaving. If this was for a small write into an object, that resume
> > will take longer than any savings in using the indirect GGTT access to
> > avoid the cpu cache.
> > 
> > Signed-off-by: Chris Wilson 
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c | 21 ++---
> >  1 file changed, 18 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 93dfa793975a..8940a6873ca5 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -1229,7 +1229,21 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object *obj,
> >  	if (ret)
> >  		return ret;
> >  
> > -	intel_runtime_pm_get(i915);
> > +	if (i915_gem_object_has_struct_page(obj)) {
> 
> I don't really see why we need to check for has_struct_page here (we do
> already outside the lock grabbing), and why if that's not the case we hit
> the slow-path?

We can't use the alternate paths if we don't have struct_page, hence we
have to force use of GTT if !i915_gem_object_has_struct_page. The
previous test is to also make sure we come down this path and not fail.
-Chris
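
The rule described above can be sketched as a small decision function. This is only an illustration under stated assumptions, not code from the patch: the enum, the function name, and the single `device_in_use` flag are hypothetical stand-ins, and "in use" is simplified to one boolean rather than the real runtime-pm usage count.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; none of these exist in i915. */
enum fake_pwrite_path {
	FAKE_PATH_GTT,           /* device already in use: write via the GGTT */
	FAKE_PATH_WAKE_THEN_GTT, /* no alternate path: wake the device, then GGTT */
	FAKE_PATH_CPU_FALLBACK,  /* suspended but page-backed: skip the GTT path */
};

/*
 * Objects without struct pages cannot use the alternate (CPU) paths, so
 * they must always go through the GTT, waking the device if necessary.
 * Page-backed objects only take the GTT path when that costs no wakeup.
 */
static enum fake_pwrite_path
fake_choose_pwrite_path(bool has_struct_page, bool device_in_use)
{
	if (device_in_use)
		return FAKE_PATH_GTT;
	if (has_struct_page)
		return FAKE_PATH_CPU_FALLBACK;
	return FAKE_PATH_WAKE_THEN_GTT;
}
```

In the patch itself the fallback case corresponds to returning -EFAULT from i915_gem_gtt_pwrite_fast() so that the caller tries another path.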


Re: [Intel-gfx] [PATCH] drm/i915: Skip waking the device to service pwrite

2017-09-04 Thread Daniel Vetter
On Wed, Aug 30, 2017 at 06:48:19PM +0100, Chris Wilson wrote:
> If the device is in runtime suspend, resuming takes time and reduces our
> powersaving. If this was for a small write into an object, that resume
> will take longer than any savings in using the indirect GGTT access to
> avoid the cpu cache.
> 
> Signed-off-by: Chris Wilson 
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 21 ++---
>  1 file changed, 18 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 93dfa793975a..8940a6873ca5 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1229,7 +1229,21 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object *obj,
>  	if (ret)
>  		return ret;
>  
> -	intel_runtime_pm_get(i915);
> +	if (i915_gem_object_has_struct_page(obj)) {

I don't really see why we need to check for has_struct_page here (we do
already outside the lock grabbing), and why if that's not the case we hit
the slow-path?

I'd have expected a simple s/pm_get/pm_get_if_in_use/ ...
-Daniel

> +		/* Avoid waking the device up if we can fallback, as
> +		 * waking/resuming is very slow (10-100 ms depending
> +		 * on PCI sleeps and our own resume time). This easily
> +		 * dwarfs any performance advantage from using the
> +		 * cache bypass of indirect GGTT access.
> +		 */
> +		if (!intel_runtime_pm_get_if_in_use(i915)) {
> +			ret = -EFAULT;
> +			goto out_unlock;
> +		}
> +	} else {
> +		intel_runtime_pm_get(i915);
> +	}
> +
>  	vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0,
>  				       PIN_MAPPABLE | PIN_NONBLOCK);
>  	if (!IS_ERR(vma)) {
> @@ -1244,7 +1258,7 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object *obj,
>  	if (IS_ERR(vma)) {
>  		ret = insert_mappable_node(ggtt, &node, PAGE_SIZE);
>  		if (ret)
> -			goto out_unlock;
> +			goto out_rpm;
>  		GEM_BUG_ON(!node.allocated);
>  	}
>  
> @@ -1307,8 +1321,9 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object *obj,
>  	} else {
>  		i915_vma_unpin(vma);
>  	}
> -out_unlock:
> +out_rpm:
>  	intel_runtime_pm_put(i915);
> +out_unlock:
>  	mutex_unlock(&i915->drm.struct_mutex);
>  	return ret;
>  }
> -- 
> 2.14.1
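
The last two hunks split the exit path into two labels so that a failed get_if_in_use does not drop a wakeref it never took. A minimal userspace sketch of that unwinding order follows, assuming hypothetical names throughout: `struct fake_state`, `fake_pwrite()` and its counters only model the lock and the wakeref, and the -ENOSPC error is an arbitrary choice for the example.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for driver state; none of these names are i915 API. */
struct fake_state {
	bool in_use;    /* device is active (the sketch never re-suspends it) */
	int  wakerefs;  /* wakerefs currently held by this sketch */
	int  locks;     /* models struct_mutex: 1 while held */
};

static int fake_pwrite(struct fake_state *s, bool has_struct_page,
		       bool node_insert_fails)
{
	int ret = 0;

	s->locks++;			/* models mutex_lock() */

	if (has_struct_page) {
		/* Models get_if_in_use: never wakes the device. */
		if (!s->in_use) {
			ret = -EFAULT;
			goto out_unlock;	/* no wakeref was taken */
		}
		s->wakerefs++;
	} else {
		/* Models an unconditional get: wakes the device if needed. */
		s->in_use = true;
		s->wakerefs++;
	}

	if (node_insert_fails) {
		ret = -ENOSPC;		/* arbitrary error for the sketch */
		goto out_rpm;		/* wakeref held, so it must be dropped */
	}

	/* ... the GGTT write itself would go here ... */

out_rpm:
	s->wakerefs--;			/* models the runtime-pm put */
out_unlock:
	s->locks--;			/* models mutex_unlock() */
	return ret;
}
```

Every path leaves both counters balanced: the early -EFAULT exit skips the put, while any failure after the wakeref is taken goes through out_rpm first.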
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch