Quoting Govindapillai, Vinod (2025-11-04 13:35:43-03:00)
>Hi Matt,
>
>
>On Mon, 2025-11-03 at 16:15 -0800, Matt Roper wrote:
>> On Mon, Nov 03, 2025 at 02:18:12PM -0300, Gustavo Sousa wrote:
>> > From: Vinod Govindapillai <[email protected]>
>> > 
>> > Configure one of the FBC instances to use system caching. FBC
>> > read/write requests are tagged as cacheable till a programmed
>> > limit is reached by the hw.
>> 
>> What exactly is "system caching?"  We have lots of different caches
>> in
>> current platforms, and it's not really obvious to me from the
>> description here (or the bspec page) exactly which cache(s) are
>> involved
>> here.
>> 
>> Is turning this on always a win or is it situational?  I.e., is there
>> any potential for display memory traffic to fill a cache with FBC
>> data
>> by evicting data that was part of the CPU or GT's working set?  If
>> so,
>> that seems like it could potentially harm the performance of other
>> workloads running on the platform.
>> 
>> Or is this whole thing about a completely new cache (unrelated to
>> and unusable by anything else) which is devoted solely to FBC?
>> 
>> > 
>> > Bspec: 74722
>> 
>> You might want to add 68881 here since it has a bit more information
>> about how we're actually supposed to set the fields documented on
>> 74722.
>
>Okay I will include that. I guess, the HAS also talks about "system
>cache" - no further explanation. But only a fixed portion is allocated
>specifically for the display use and is "configurable". Motivation is
>to reduce to memory subsystem power especially in idle scenarios.
>
>> 
>> > Signed-off-by: Vinod Govindapillai <[email protected]>
>> > Signed-off-by: Gustavo Sousa <[email protected]>
>> > ---
>> >  drivers/gpu/drm/i915/display/intel_fbc.c      | 47
>> > +++++++++++++++++++++++++++
>> >  drivers/gpu/drm/i915/display/intel_fbc_regs.h |  9 +++++
>> >  2 files changed, 56 insertions(+)
>> > 
>> > diff --git a/drivers/gpu/drm/i915/display/intel_fbc.c
>> > b/drivers/gpu/drm/i915/display/intel_fbc.c
>> > index 24b72951ea3c..e2e55c58ddbc 100644
>> > --- a/drivers/gpu/drm/i915/display/intel_fbc.c
>> > +++ b/drivers/gpu/drm/i915/display/intel_fbc.c
>> > @@ -127,6 +127,9 @@ struct intel_fbc {
>> >           */
>> >          struct intel_fbc_state state;
>> >          const char *no_fbc_reason;
>> > +
>> > +        /* Only one of FBC instances can use the system cache */
>> > +        bool own_sys_cache;
>> >  };
>> >  
>> >  static char fbc_name(enum intel_fbc_id fbc_id)
>> > @@ -569,12 +572,51 @@ static bool ilk_fbc_is_compressing(struct
>> > intel_fbc *fbc)
>> >          return intel_de_read(fbc->display, ILK_DPFC_STATUS(fbc-
>> > >id)) & DPFC_COMP_SEG_MASK;
>> >  }
>> >  
>> > +static void nvl_fbc_program_system_cache(struct intel_fbc *fbc,
>> > bool enable)
>> > +{
>> > +        struct intel_display *display = fbc->display;
>> > +        u32 cfb_offset, usage;
>> > +
>> > +        lockdep_assert_held(&fbc->lock);
>> > +
>> > +        usage = intel_de_read(display,
>> > NVL_FBC_SYS_CACHE_USAGE_CFG);
>> > +
>> > +        /* System cache already being used by another pipe */
>> > +        if (enable && (usage & FBC_SYS_CACHE_TAG_USE_RES_SPACE))
>> > +                return;
>> 
>> Rather than relying on the current register contents, should we be
>> sanitizing this on driver probe (in case the pre-OS firmware already
>> set
>> this up) and then making our own decisions (as part of an atomic
>> transaction) about which pipe to prioritize after that?
>
>
>> 
>> > +
>> > +        /* Only the fbc instance which owns system cache can
>> > disable it */
>> > +        if (!enable && !fbc->own_sys_cache)
>> > +                return;
>> > +
>> > +        /*
>> > +         * Not programming the cache limit and cache reading
>> > enable bits explicitly
>> > +         * here. The default values should take care of those and
>> > that could leave
>> > +         * adjustments of those bits to the system hw policy
>> > +         *
>> > +         * TODO: check if we need to explicitly program these?
>> 
>> There's no hardware default documented for the range field, so unless
>> the pre-OS firmware sets it up (which we probably shouldn't rely on),
>> I'd expect this to be 0; I don't think that's what we want.
>
>The Bspec says it is 2MB. But according to the HAS it is "configurable"
>and I clarified that this is at the moment 2MB but can change. So I
>read it as something already configured and set as the default value to
>the register and it could be changed by the soc policy. Thats the
>reason I thought it be kept untouched. Will forward on email
>conversation I had.

FWIW, I checked it on a NVL system and it was zero prior to loading the
driver.

--
Gustavo Sousa

>
>
>> 
>> > +         */
>> > +        cfb_offset = enable ? i915_gem_stolen_node_offset(fbc-
>> > >compressed_fb) : 0;
>> > +        usage |= FBC_SYS_CACHE_START_BASE(cfb_offset);
>> 
>> And if something *did* set this up already, then OR'ing a new address
>> over the old one isn't going to work.  We'd need "(old & ~mask) |
>> new"
>> to ensure we don't have leftover bits still set by accident.  But it
>> would probably be better to just avoid RMW-style handling in general
>> and
>> build a complete register value with exactly what we want rather than
>> trying to modify the pre-existing value.
>> 
>> > +        usage |= enable ? FBC_SYS_CACHE_TAG_USE_RES_SPACE :
>> > FBC_SYS_CACHE_TAG_DONT_CACHE;
>> > +
>> > +        intel_de_write(display, NVL_FBC_SYS_CACHE_USAGE_CFG,
>> > usage);
>> > +
>> > +        fbc->own_sys_cache = enable;
>
>Okay. Thanks. Will fix that!
>
>> 
>> It feels like instead of having this as a boolean flag in fbc, this
>> should be a pointer/ID tracked at the intel_display level.  E.g.,
>> 
>>         display->sys_cache_fbc = fbc;
>> 
>> or possibly converted over to something tracked with atomic state so
>> that we can make better high-level decisions about which FBC we want
>> to
>> enable this on as various displays get enabled/disabled.
>
>Okay. Will check this and get rid of the bool from the intel_fbc
>structure! At the moment we can allocate only based on the firt pipe
>enabling the fbc. But may be in future we could have some logic for
>this I guess.
>
>> Matt
>> 
>> > +
>> > +        drm_dbg_kms(display->drm, "System caching for FBC[%d]
>> > %s\n",
>> > +                    fbc->id, enable ? "configured" : "cleared");
>> > +}
>> > +
>> >  static void ilk_fbc_program_cfb(struct intel_fbc *fbc)
>> >  {
>> >          struct intel_display *display = fbc->display;
>> >  
>> >          intel_de_write(display, ILK_DPFC_CB_BASE(fbc->id),
>> >                         i915_gem_stolen_node_offset(fbc-
>> > >compressed_fb));
>> > +
>> > +        if (DISPLAY_VER(display) >= 35)
>> > +                nvl_fbc_program_system_cache(fbc, true);
>> >  }
>> >  
>> >  static const struct intel_fbc_funcs ilk_fbc_funcs = {
>> > @@ -952,6 +994,8 @@ static void
>> > intel_fbc_program_workarounds(struct intel_fbc *fbc)
>> >  
>> >  static void __intel_fbc_cleanup_cfb(struct intel_fbc *fbc)
>> >  {
>> > +        struct intel_display *display = fbc->display;
>> > +
>> >          if (WARN_ON(intel_fbc_hw_is_active(fbc)))
>> >                  return;
>> >  
>> > @@ -959,6 +1003,9 @@ static void __intel_fbc_cleanup_cfb(struct
>> > intel_fbc *fbc)
>> >                  i915_gem_stolen_remove_node(fbc->compressed_llb);
>> >          if (i915_gem_stolen_node_allocated(fbc->compressed_fb))
>> >                  i915_gem_stolen_remove_node(fbc->compressed_fb);
>> > +
>> > +        if (DISPLAY_VER(display) >= 35)
>> > +                nvl_fbc_program_system_cache(fbc, false);
>> >  }
>> >  
>> >  void intel_fbc_cleanup(struct intel_display *display)
>> > diff --git a/drivers/gpu/drm/i915/display/intel_fbc_regs.h
>> > b/drivers/gpu/drm/i915/display/intel_fbc_regs.h
>> > index 77d8321c4fb3..592cd2384255 100644
>> > --- a/drivers/gpu/drm/i915/display/intel_fbc_regs.h
>> > +++ b/drivers/gpu/drm/i915/display/intel_fbc_regs.h
>> > @@ -128,4 +128,13 @@
>> >  #define   FBC_REND_NUKE                        REG_BIT(2)
>> >  #define   FBC_REND_CACHE_CLEAN                REG_BIT(1)
>> >  
>> > +#define NVL_FBC_SYS_CACHE_USAGE_CFG             _MMIO(0x1344E0)
>> > +#define   FBC_SYS_CACHE_START_BASE_MASK         REG_GENMASK(31,
>> > 16)
>> > +#define   FBC_SYS_CACHE_START_BASE(base)       
>> > REG_FIELD_PREP(FBC_SYS_CACHE_START_BASE_MASK, (base))
>> > +#define   FBC_SYS_CACHEABLE_RANGE_MASK          REG_GENMASK(15, 4)
>> > +#define   FBC_SYS_CACHEABLE_RANGE(range)       
>> > REG_FIELD_PREP(FBC_SYS_CACHEABLE_RANGE_MASK, (range))
>> > +#define   FBC_SYS_CACHE_TAG_MASK                REG_GENMASK(3, 2)
>> > +#define   FBC_SYS_CACHE_TAG_DONT_CACHE         
>> > REG_FIELD_PREP(FBC_SYS_CACHE_TAG_MASK, 0)
>> > +#define   FBC_SYS_CACHE_TAG_USE_RES_SPACE      
>> > REG_FIELD_PREP(FBC_SYS_CACHE_TAG_MASK, 3)
>> > +
>> >  #endif /* __INTEL_FBC_REGS__ */
>> > 
>> > -- 
>> > 2.51.0
>> > 
>> 
>

Reply via email to