On Mon, Jul 17, 2023 at 07:30:58PM +0200, Andi Shyti wrote:
> From: Jonathan Cavitt <jonathan.cav...@intel.com>
> 
> For platforms that use Aux CCS, wait for aux invalidation to
> complete by checking the aux invalidation register bit is
> cleared.
> 
> Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
> Signed-off-by: Jonathan Cavitt <jonathan.cav...@intel.com>
> Signed-off-by: Andi Shyti <andi.sh...@linux.intel.com>
> Cc: <sta...@vger.kernel.org> # v5.8+
> Reviewed-by: Nirmoy Das <nirmoy....@intel.com>
> ---
>  drivers/gpu/drm/i915/gt/gen8_engine_cs.c     | 17 +++++++++++++----
>  drivers/gpu/drm/i915/gt/intel_gpu_commands.h |  1 +
>  2 files changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> index aa2fb9d72745a..fbc70f3b7f2fd 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> @@ -174,6 +174,16 @@ u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 *cs, const i915_reg_t inv
>       *cs++ = AUX_INV;
>       *cs++ = MI_NOOP;

We only need qword alignment for sequences of commands, not each
individual command, right?  So technically we could drop this noop...

>  
> +     *cs++ = MI_SEMAPHORE_WAIT_TOKEN |
> +             MI_SEMAPHORE_REGISTER_POLL |
> +             MI_SEMAPHORE_POLL |
> +             MI_SEMAPHORE_SAD_EQ_SDD;
> +     *cs++ = 0;
> +     *cs++ = i915_mmio_reg_offset(inv_reg) + gsi_offset;
> +     *cs++ = 0;
> +     *cs++ = 0;
> +     *cs++ = MI_NOOP;

...and then we wouldn't need an extra one here.

If we drop the pair of noops, that would also change the dword counts
farther down (count += 10 and cmd += 10).
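
A rough sketch of the dword arithmetic behind that remark. The counts are
read off the quoted hunk; the enum and function names are hypothetical and
not part of i915, and the 3-dword size of the existing register write is an
assumption inferred from the old "count = 8 + 4" minus its trailing MI_NOOP.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Dword counts read off the quoted hunk. The names are hypothetical and
 * exist only for this illustration.
 */
enum {
	AUX_INV_LRI_DWORDS  = 3, /* assumed: old total of 4 minus its trailing MI_NOOP */
	AUX_INV_SEMA_DWORDS = 5, /* MI_SEMAPHORE_WAIT_TOKEN header plus four operand dwords */
};

/* Total dwords emitted for one aux invalidation, with or without per-command padding. */
static unsigned int aux_inv_dwords(bool pad_each_command)
{
	unsigned int n = AUX_INV_LRI_DWORDS + AUX_INV_SEMA_DWORDS;

	if (pad_each_command)
		n += 2; /* one MI_NOOP after each of the two commands */

	/* Either way the sequence as a whole stays qword (2-dword) aligned. */
	assert(n % 2 == 0);
	return n;
}
```

With padding this gives 10, matching the patch as posted. Without padding it
gives 8, so both "count += 10" and "cmd += 10" would become "+= 8".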

> +
>       return cs;
>  }
>  
> @@ -284,10 +294,9 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
>               else if (engine->class == COMPUTE_CLASS)
>                       flags &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
>  
> +             count = 8;
>               if (!HAS_FLAT_CCS(rq->engine->i915))

As noted on the earlier patch, this check should probably test that the
platform actually has AuxCCS, rather than only that it lacks FlatCCS.
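
To illustrate the difference between the two checks, here is a minimal
sketch. The struct and both helpers are hypothetical stand-ins, not the i915
device-info layout or any existing macro. It assumes there can be a platform
with neither FlatCCS nor AuxCCS, which is the case the negative check would
get wrong.

```c
#include <stdbool.h>

/* Hypothetical capability flags, for illustration only. */
struct platform_caps {
	bool has_flat_ccs;
	bool has_aux_ccs;
};

/* Check as written in the hunk: anything without FlatCCS gets the invalidation. */
static bool needs_aux_inv_by_exclusion(const struct platform_caps *p)
{
	return !p->has_flat_ccs;
}

/* Suggested check: only platforms that actually have AuxCCS get it. */
static bool needs_aux_inv_explicit(const struct platform_caps *p)
{
	return p->has_aux_ccs;
}
```

The two helpers agree for an AuxCCS-only or FlatCCS-only platform and differ
only when both flags are false.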

Anyway, up to you whether you want to make that change or not.  The
extra noops don't actually hurt anything.

Reviewed-by: Matt Roper <matthew.d.ro...@intel.com>

> -                     count = 8 + 4;
> -             else
> -                     count = 8;
> +                     count += 10;
>  
>               cs = intel_ring_begin(rq, count);
>               if (IS_ERR(cs))
> @@ -330,7 +339,7 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
>                       aux_inv = rq->engine->mask &
>                               ~GENMASK(_BCS(I915_MAX_BCS - 1), BCS0);
>                       if (aux_inv)
> -                             cmd += 4;
> +                             cmd += 10;
>               }
>       }
>  
> diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> index 5df7cce23197c..2bd8d98d21102 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> @@ -121,6 +121,7 @@
>  #define   MI_SEMAPHORE_TARGET(engine)        ((engine)<<15)
>  #define MI_SEMAPHORE_WAIT    MI_INSTR(0x1c, 2) /* GEN8+ */
>  #define MI_SEMAPHORE_WAIT_TOKEN      MI_INSTR(0x1c, 3) /* GEN12+ */
> +#define   MI_SEMAPHORE_REGISTER_POLL (1 << 16)
>  #define   MI_SEMAPHORE_POLL          (1 << 15)
>  #define   MI_SEMAPHORE_SAD_GT_SDD    (0 << 12)
>  #define   MI_SEMAPHORE_SAD_GTE_SDD   (1 << 12)
> -- 
> 2.40.1
> 

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation
