On Mon, Aug 23, 2021 at 10:26:06AM +0100, Tvrtko Ursulin wrote:
> 
> On 05/08/2021 17:36, Matt Roper wrote:
> > For tgl+, the per-context setting of MI_MODE[12] determines whether
> > the bits of a nested MI_BATCH_BUFFER_START instruction should be
> > interpreted in the traditional manner or whether they should
> > instead use a new tgl+ meaning that breaks backward compatibility, but
> > allows nesting into 3rd-level batchbuffers.  For previous platforms,
> > the hardware default for this register bit is to maintain
> > backward-compatible behavior unless a context intentionally opts into
> > the new behavior; however Xe_HPG flips the hardware default behavior.
> > 
> > From a SW perspective, we want to maintain the backward-compatible
> > behavior for userspace, so we'll apply a fake workaround to set it back
> > to the legacy behavior on platforms where the hardware default is to
> > break compatibility.  At the moment there is no Linux userspace that
> > utilizes third-level batchbuffers, so this spares userspace from
> > needing to make any changes, and using the legacy meaning is the correct
> > thing to do.  If/when we have userspace consumers that want to utilize
> > third-level batch nesting, we can provide a context parameter to allow
> > them to opt-in.
> > 
> > Bspec: 45974, 45718
> > Cc: John Harrison <[email protected]>
> > Signed-off-by: Matt Roper <[email protected]>
> > ---
> >   drivers/gpu/drm/i915/gt/intel_workarounds.c | 39 +++++++++++++++++++--
> >   drivers/gpu/drm/i915/i915_reg.h             |  1 +
> >   2 files changed, 38 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > index aae609d7d85d..97b3cd81b721 100644
> > --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> > @@ -644,6 +644,37 @@ static void dg1_ctx_workarounds_init(struct intel_engine_cs *engine,
> >                  DG1_HZ_READ_SUPPRESSION_OPTIMIZATION_DISABLE);
> >   }
> > +static void fakewa_disable_nestedbb_mode(struct intel_engine_cs *engine,
> > +                                    struct i915_wa_list *wal)
> > +{
> > +   /*
> > +    * This is a "fake" workaround defined by software to ensure we
> > +    * maintain reliable, backward-compatible behavior for userspace with
> > +    * regard to how nested MI_BATCH_BUFFER_START commands are handled.
> > +    *
> > +    * The per-context setting of MI_MODE[12] determines whether the bits
> > +    * of a nested MI_BATCH_BUFFER_START instruction should be interpreted
> > +    * in the traditional manner or whether they should instead use a new
> > +    * tgl+ meaning that breaks backward compatibility, but allows nesting
> > +    * into 3rd-level batchbuffers.  When this new capability was first
> > +    * added in TGL, it remained off by default unless a context
> > +    * intentionally opted in to the new behavior.  However Xe_HPG now
> > +    * flips this on by default and requires that we explicitly opt out if
> > +    * we don't want the new behavior.
> > +    *
> > +    * From a SW perspective, we want to maintain the backward-compatible
> > +    * behavior for userspace, so we'll apply a fake workaround to set it
> > +    * back to the legacy behavior on platforms where the hardware default
> > +    * is to break compatibility.  At the moment there is no Linux
> > +    * userspace that utilizes third-level batchbuffers, so this spares
> > +    * userspace from needing to make any changes, and using the legacy
> > +    * meaning is the correct thing to do.  If/when we have userspace
> > +    * consumers that want to utilize third-level batch nesting, we can
> > +    * provide a context parameter to allow them to opt-in.
> > +    */
> > +   wa_masked_dis(wal, RING_MI_MODE(engine->mmio_base), TGL_NESTED_BB_EN);
> > +}
> > +
> >   static void
> >   __intel_engine_init_ctx_wa(struct intel_engine_cs *engine,
> >                        struct i915_wa_list *wal,
> > @@ -651,11 +682,15 @@ __intel_engine_init_ctx_wa(struct intel_engine_cs *engine,
> >   {
> >     struct drm_i915_private *i915 = engine->i915;
> > +   wa_init_start(wal, name, engine->name);
> > +
> > +   /* Applies to all engines */
> > +   if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55))
> > +           fakewa_disable_nestedbb_mode(engine, wal);
> > +
> >     if (engine->class != RENDER_CLASS)
> >             return;
> 
> Is it intentional to skip wa_init_finish on non-render engines? Would be a
> bit odd although granted no significant functional difference apart from not
> logging and maybe not trimming the list storage.

No, just an oversight.  Like you said, it doesn't look like it really
matters too much, but I'll write up a fix for it tomorrow.

Thanks.


Matt

> 
> Regards,
> 
> Tvrtko
> 
> > -   wa_init_start(wal, name, engine->name);
> > -
> >     if (IS_DG1(i915))
> >             dg1_ctx_workarounds_init(engine, wal);
> >     else if (GRAPHICS_VER(i915) == 12)
> > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> > index 77f6dcaba2b9..269685955fbd 100644
> > --- a/drivers/gpu/drm/i915/i915_reg.h
> > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > @@ -2821,6 +2821,7 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
> >   #define MI_MODE           _MMIO(0x209c)
> >   # define VS_TIMER_DISPATCH                                (1 << 6)
> >   # define MI_FLUSH_ENABLE                          (1 << 12)
> > +# define TGL_NESTED_BB_EN                          (1 << 12)
> >   # define ASYNC_FLIP_PERF_DISABLE                  (1 << 14)
> >   # define MODE_IDLE                                        (1 << 9)
> >   # define STOP_RING                                        (1 << 8)
> > 

-- 
Matt Roper
Graphics Software Engineer
VTT-OSGC Platform Enablement
Intel Corporation
(916) 356-2795
