Re: [Intel-gfx] [PATCH] drm/i915: Skip waking the device to service pwrite
On Mon, Sep 04, 2017 at 11:18:07AM +0100, Chris Wilson wrote:
> Quoting Daniel Vetter (2017-09-04 09:12:12)
> > On Wed, Aug 30, 2017 at 06:48:19PM +0100, Chris Wilson wrote:
> > > If the device is in runtime suspend, resuming takes time and reduces our
> > > powersaving. If this was for a small write into an object, that resume
> > > will take longer than any savings in using the indirect GGTT access to
> > > avoid the cpu cache.
> > >
> > > Signed-off-by: Chris Wilson
> > > ---
> > >  drivers/gpu/drm/i915/i915_gem.c | 21 ++---
> > >  1 file changed, 18 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/i915_gem.c
> > > b/drivers/gpu/drm/i915/i915_gem.c
> > > index 93dfa793975a..8940a6873ca5 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > @@ -1229,7 +1229,21 @@ i915_gem_gtt_pwrite_fast(struct
> > > drm_i915_gem_object *obj,
> > >  	if (ret)
> > >  		return ret;
> > >
> > > -	intel_runtime_pm_get(i915);
> > > +	if (i915_gem_object_has_struct_page(obj)) {
> >
> > I don't really see why we need to check for has_struct_page here (we do
> > already outside the lock grabbing), and why if that's not the case we hit
> > the slow-path?
>
> We can't use the alternate paths if we don't have struct_page, hence we
> have to force use of GTT if !i915_gem_object_has_struct_page. The
> previous test is to also make sure we come down this path and not fail.

Ow, I've entirely misread all the code, I thought this checks for
obj->pages. With clue gained on my side:

Reviewed-by: Daniel Vetter
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Re: [Intel-gfx] [PATCH] drm/i915: Skip waking the device to service pwrite
Quoting Daniel Vetter (2017-09-04 09:12:12)
> On Wed, Aug 30, 2017 at 06:48:19PM +0100, Chris Wilson wrote:
> > If the device is in runtime suspend, resuming takes time and reduces our
> > powersaving. If this was for a small write into an object, that resume
> > will take longer than any savings in using the indirect GGTT access to
> > avoid the cpu cache.
> >
> > Signed-off-by: Chris Wilson
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c | 21 ++---
> >  1 file changed, 18 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c
> > b/drivers/gpu/drm/i915/i915_gem.c
> > index 93dfa793975a..8940a6873ca5 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -1229,7 +1229,21 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object
> > *obj,
> >  	if (ret)
> >  		return ret;
> >
> > -	intel_runtime_pm_get(i915);
> > +	if (i915_gem_object_has_struct_page(obj)) {
>
> I don't really see why we need to check for has_struct_page here (we do
> already outside the lock grabbing), and why if that's not the case we hit
> the slow-path?

We can't use the alternate paths if we don't have struct_page, hence we
have to force use of GTT if !i915_gem_object_has_struct_page. The
previous test is to also make sure we come down this path and not fail.
-Chris

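The decision Chris describes can be sketched in isolation. This is only an illustrative model, not the driver code: every `mock_*` name and both structs are hypothetical stand-ins invented for the example. It assumes that objects with struct pages have another write path to fall back to, while objects without them must go through the GTT and therefore must wake the device.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for driver state; not the real i915 types. */
struct mock_device {
	bool awake;	/* device is currently runtime-active */
	int wakerefs;	/* runtime-PM references currently held */
	int resumes;	/* number of forced (slow) resumes */
};

struct mock_object {
	bool has_struct_page;	/* page-backed, so a non-GTT path exists */
};

/* Models an unconditional get: wakes the device if it is asleep. */
static void mock_pm_get(struct mock_device *dev)
{
	if (!dev->awake) {
		dev->awake = true;
		dev->resumes++;
	}
	dev->wakerefs++;
}

/* Models get_if_in_use: takes a reference only if already awake. */
static bool mock_pm_get_if_in_use(struct mock_device *dev)
{
	if (!dev->awake)
		return false;
	dev->wakerefs++;
	return true;
}

static void mock_pm_put(struct mock_device *dev)
{
	assert(dev->wakerefs > 0);
	dev->wakerefs--;
}

/* Returns 0 if the GTT path ran, -EFAULT if the caller should fall back. */
static int mock_gtt_pwrite(struct mock_device *dev,
			   const struct mock_object *obj)
{
	if (obj->has_struct_page) {
		/* A fallback exists, so never pay for a resume here. */
		if (!mock_pm_get_if_in_use(dev))
			return -EFAULT;
	} else {
		/* No fallback: the GTT is the only way in, so wake up. */
		mock_pm_get(dev);
	}

	/* ... the write through the GTT would happen here ... */

	mock_pm_put(dev);
	return 0;
}
```

With a suspended mock device, a page-backed object gets -EFAULT without any resume being counted, whereas an object without struct pages forces exactly one resume and then succeeds.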
Re: [Intel-gfx] [PATCH] drm/i915: Skip waking the device to service pwrite
On Wed, Aug 30, 2017 at 06:48:19PM +0100, Chris Wilson wrote:
> If the device is in runtime suspend, resuming takes time and reduces our
> powersaving. If this was for a small write into an object, that resume
> will take longer than any savings in using the indirect GGTT access to
> avoid the cpu cache.
>
> Signed-off-by: Chris Wilson
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 21 ++---
>  1 file changed, 18 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 93dfa793975a..8940a6873ca5 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1229,7 +1229,21 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object
> *obj,
>  	if (ret)
>  		return ret;
>
> -	intel_runtime_pm_get(i915);
> +	if (i915_gem_object_has_struct_page(obj)) {

I don't really see why we need to check for has_struct_page here (we do
already outside the lock grabbing), and why if that's not the case we hit
the slow-path?

I'd have expected a simple s/pm_get/pm_get_if_in_use/ ...
-Daniel

> +		/* Avoid waking the device up if we can fallback, as
> +		 * waking/resuming is very slow (10-100 ms depending
> +		 * on PCI sleeps and our own resume time). This easily
> +		 * dwarfs any performance advantage from using the
> +		 * cache bypass of indirect GGTT access.
> +		 */
> +		if (!intel_runtime_pm_get_if_in_use(i915)) {
> +			ret = -EFAULT;
> +			goto out_unlock;
> +		}
> +	} else {
> +		intel_runtime_pm_get(i915);
> +	}
> +
>  	vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0,
>  				       PIN_MAPPABLE | PIN_NONBLOCK);
>  	if (!IS_ERR(vma)) {
> @@ -1244,7 +1258,7 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object
> *obj,
>  	if (IS_ERR(vma)) {
>  		ret = insert_mappable_node(ggtt, &node, PAGE_SIZE);
>  		if (ret)
> -			goto out_unlock;
> +			goto out_rpm;
>  		GEM_BUG_ON(!node.allocated);
>  	}
>
> @@ -1307,8 +1321,9 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_gem_object
> *obj,
>  	} else {
>  		i915_vma_unpin(vma);
>  	}
> -out_unlock:
> +out_rpm:
>  	intel_runtime_pm_put(i915);
> +out_unlock:
>  	mutex_unlock(&i915->drm.struct_mutex);
>  	return ret;
>  }
> -- 
> 2.14.1

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
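
The label split in the last hunk (out_rpm ahead of out_unlock) follows from the new early exit: once the wakeref may not have been taken, the error path has to skip the put. Below is a minimal sketch of that unwind order, assuming a simplified model in which counters stand in for the mutex and the wakeref. `mock_state`, `mock_pwrite_unwind`, and the use of -ENOSPC for the node-insertion failure are hypothetical choices made for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model: counters stand in for struct_mutex and the wakeref. */
struct mock_state {
	bool awake;
	int lock_depth;
	int wakerefs;
};

static int mock_pwrite_unwind(struct mock_state *s, bool can_fallback,
			      bool insert_fails)
{
	int ret = 0;

	s->lock_depth++;		/* stands in for taking the mutex */

	if (can_fallback) {
		if (!s->awake) {
			ret = -EFAULT;
			goto out_unlock; /* no wakeref held: skip the put */
		}
		s->wakerefs++;
	} else {
		s->awake = true;	/* forced resume */
		s->wakerefs++;
	}

	if (insert_fails) {
		ret = -ENOSPC;		/* illustrative error code only */
		goto out_rpm;		/* wakeref held: must drop it */
	}

	/* ... the write itself would happen here ... */

out_rpm:
	assert(s->wakerefs > 0);
	s->wakerefs--;
out_unlock:
	s->lock_depth--;
	return ret;
}
```

Each label releases exactly what was acquired before the corresponding jump, in reverse order of acquisition, so every exit leaves both counters balanced.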