On Tue, Jun 10, 2014 at 04:14:40AM -0700, Chris Wilson wrote:
> Inserting additional PTEs has no side-effect for us as the pfn are fixed
> for the entire time the object is resident in the global GTT. The
> downside is that we pay the entire cost of faulting the object upon the
> first hit, for which we in return receive the benefit of removing the
> per-page faulting overhead.
> 
> On an Ivybridge i7-3720qm with 1600MHz DDR3, with 32 fences,
> Upload rate for 2 linear surfaces:    8127MiB/s -> 8134MiB/s
> Upload rate for 2 tiled surfaces:     8607MiB/s -> 8625MiB/s
> Upload rate for 4 linear surfaces:    8127MiB/s -> 8127MiB/s
> Upload rate for 4 tiled surfaces:     8611MiB/s -> 8602MiB/s
> Upload rate for 8 linear surfaces:    8114MiB/s -> 8124MiB/s
> Upload rate for 8 tiled surfaces:     8601MiB/s -> 8603MiB/s
> Upload rate for 16 linear surfaces:   8110MiB/s -> 8123MiB/s
> Upload rate for 16 tiled surfaces:    8595MiB/s -> 8606MiB/s
> Upload rate for 32 linear surfaces:   8104MiB/s -> 8121MiB/s
> Upload rate for 32 tiled surfaces:    8589MiB/s -> 8605MiB/s
> Upload rate for 64 linear surfaces:   8107MiB/s -> 8121MiB/s
> Upload rate for 64 tiled surfaces:    2013MiB/s -> 3017MiB/s
> 
> Signed-off-by: Chris Wilson <[email protected]>
> Cc: "Goel, Akash" <[email protected]>

For reproducibility, it would be nice to have the testcase info included,
assuming it's something from i-g-t. Other than that, the change looks good.
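
To make the trade-off in the commit message concrete, here is a minimal
user-space sketch of the two fault strategies. It is only a model under
stated assumptions, not the driver code: struct model_obj, model_fault(),
model_touch_all() and the MODEL_* constants are hypothetical names invented
for illustration, and a pte[] value of 0 stands in for "no PTE inserted".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SHIFT 12   /* assumption: 4 KiB pages */
#define MODEL_MAX_PAGES  64   /* assumption: arbitrary cap for the model */

/* Hypothetical stand-in for an object mapped through the mappable aperture. */
struct model_obj {
	size_t size;                        /* object size in bytes */
	unsigned long base_pfn;             /* first pfn of the object, must be non-zero */
	bool fault_mappable;                /* set once the whole object was prefaulted */
	unsigned long pte[MODEL_MAX_PAGES]; /* 0 means "no PTE inserted yet" */
};

/*
 * Models one fault on page index `page`.
 * With prefault set, the first fault inserts a PTE for every page of the
 * object; otherwise only the faulting page gets a PTE.
 * Returns the number of PTEs inserted by this fault.
 */
static size_t model_fault(struct model_obj *obj, size_t page, bool prefault)
{
	size_t npages = obj->size >> MODEL_PAGE_SHIFT;
	size_t inserted = 0;

	assert(npages <= MODEL_MAX_PAGES && page < npages);

	if (prefault && !obj->fault_mappable) {
		for (size_t i = 0; i < npages; i++) {
			obj->pte[i] = obj->base_pfn + i;
			inserted++;
		}
		obj->fault_mappable = true;
	} else {
		obj->pte[page] = obj->base_pfn + page;
		inserted = 1;
	}
	return inserted;
}

/* Touches every page in order and returns how many faults were taken. */
static size_t model_touch_all(struct model_obj *obj, bool prefault)
{
	size_t npages = obj->size >> MODEL_PAGE_SHIFT;
	size_t faults = 0;

	for (size_t i = 0; i < npages; i++) {
		if (obj->pte[i] == 0) { /* no PTE yet: this access would fault */
			model_fault(obj, i, prefault);
			faults++;
		}
	}
	return faults;
}
```

In this model, touching all pages of a 16-page object takes 16 faults with
per-page insertion and a single, more expensive fault with prefaulting. It
says nothing about the actual cost of each fault, which is what the upload
rates above measure.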

Reviewed-by: Brad Volkin <[email protected]>

> ---
>  drivers/gpu/drm/i915/i915_gem.c | 22 +++++++++++++++++-----
>  1 file changed, 17 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 3aaf7e01235e..e1f68f06c2ef 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1704,14 +1704,26 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
>       if (ret)
>               goto unpin;
>  
> -     obj->fault_mappable = true;
> -
> +     /* Finally, remap it using the new GTT offset */
>       pfn = dev_priv->gtt.mappable_base + i915_gem_obj_ggtt_offset(obj);
>       pfn >>= PAGE_SHIFT;
> -     pfn += page_offset;
>  
> -     /* Finally, remap it using the new GTT offset */
> -     ret = vm_insert_pfn(vma, (unsigned long)vmf->virtual_address, pfn);
> +     if (!obj->fault_mappable) {
> +             int i;
> +
> +             for (i = 0; i < obj->base.size >> PAGE_SHIFT; i++) {
> +                     ret = vm_insert_pfn(vma,
> +                                         (unsigned long)vma->vm_start + i * PAGE_SIZE,
> +                                         pfn + i);
> +                     if (ret)
> +                             break;
> +             }
> +
> +             obj->fault_mappable = true;
> +     } else
> +             ret = vm_insert_pfn(vma,
> +                                 (unsigned long)vmf->virtual_address,
> +                                 pfn + page_offset);
>  unpin:
>       i915_gem_object_ggtt_unpin(obj);
>  unlock:
> -- 
> 2.0.0
> 
> _______________________________________________
> Intel-gfx mailing list
> [email protected]
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx