> -----Original Message-----
> From: Intel-gfx <[email protected]> On Behalf Of Ville
> Syrjala
> Sent: Monday, October 9, 2023 6:52 PM
> To: [email protected]
> Subject: [Intel-gfx] [PATCH 1/4] drm/i915/dsb: Allocate command buffer from
> local memory
> 
> From: Ville Syrjälä <[email protected]>
> 
> Using system memory for the DSB command buffer doesn't appear to work.
> On DG2 it seems like the hardware internally replaces the actual memory reads
> with zeroes, and so we end up executing a bunch of NOOPs instead of whatever
> commands we put in the buffer. To determine that I measured the time it takes
> to execute the instructions, and the results are always more or less
> consistent with executing a buffer full of NOOPs from local memory.
> 
> Another theory I considered was some kind of cache coherency issue.
> Looks like i915_gem_object_pin_map_unlocked() will in fact give you a WB
> mapping for system memory on DGFX regardless of what mapping mode was
> requested (WC in case of the DSB code). But clflush did not change the
> behaviour at all, so that theory seems moot.
> 
> On DG1 it looks like the hardware might actually be fetching data from system
> memory as the logs indicate that we just get underruns. But that is equally
> bad, so doesn't look like we can really use system memory on DG1 either.
> 
> Thus always allocate the DSB command buffer from local memory on discrete
> GPUs.

This seems fair to do,
Reviewed-by: Uma Shankar <[email protected]>

> Signed-off-by: Ville Syrjälä <[email protected]>
> ---
>  drivers/gpu/drm/i915/display/intel_dsb.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_dsb.c b/drivers/gpu/drm/i915/display/intel_dsb.c
> index 3e32aa49b8eb..7410ba3126f9 100644
> --- a/drivers/gpu/drm/i915/display/intel_dsb.c
> +++ b/drivers/gpu/drm/i915/display/intel_dsb.c
> @@ -5,6 +5,7 @@
>   */
> 
>  #include "gem/i915_gem_internal.h"
> +#include "gem/i915_gem_lmem.h"
> 
>  #include "i915_drv.h"
>  #include "i915_irq.h"
> @@ -461,7 +462,11 @@ struct intel_dsb *intel_dsb_prepare(const struct intel_crtc_state *crtc_state,
>       /* ~1 qword per instruction, full cachelines */
>       size = ALIGN(max_cmds * 8, CACHELINE_BYTES);
> 
> -     obj = i915_gem_object_create_internal(i915, PAGE_ALIGN(size));
> +     if (HAS_LMEM(i915))
> +             obj = i915_gem_object_create_lmem(i915, PAGE_ALIGN(size),
> +                                               I915_BO_ALLOC_CONTIGUOUS);
> +     else
> +             obj = i915_gem_object_create_internal(i915, PAGE_ALIGN(size));
>       if (IS_ERR(obj))
>               goto out_put_rpm;
> 
> --
> 2.41.0
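
For anyone following the size arithmetic in the hunk above (ALIGN(max_cmds * 8, CACHELINE_BYTES), then PAGE_ALIGN(size)), here is a rough userspace sketch of the rounding. It is not the driver code: the sketch_* names are made up for illustration, and the 64-byte cacheline and 4 KiB page values are assumptions, not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values, not taken from the patch: 64-byte cachelines, 4 KiB pages. */
#define SKETCH_CACHELINE_BYTES 64u
#define SKETCH_PAGE_SIZE 4096u

/* Round x up to the next multiple of a; a must be a power of two. */
static size_t sketch_align(size_t x, size_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/*
 * Hypothetical helper mirroring the two steps in the patch:
 * ~1 qword (8 bytes) per instruction rounded up to full cachelines,
 * then rounded up to whole pages for the allocation.
 */
static size_t sketch_dsb_alloc_size(size_t max_cmds)
{
	size_t size = sketch_align(max_cmds * 8, SKETCH_CACHELINE_BYTES);

	return sketch_align(size, SKETCH_PAGE_SIZE);
}
```

Under those assumptions, 512 commands fit exactly in one page, while 513 commands round up to two pages.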
