Quoting Govindapillai, Vinod (2025-11-04 13:35:43-03:00)
>Hi Matt,
>
>
>On Mon, 2025-11-03 at 16:15 -0800, Matt Roper wrote:
>> On Mon, Nov 03, 2025 at 02:18:12PM -0300, Gustavo Sousa wrote:
>> > From: Vinod Govindapillai <[email protected]>
>> >
>> > Configure one of the FBC instances to use system caching. FBC
>> > read/write requests are tagged as cacheable till a programmed
>> > limit is reached by the hw.
>>
>> What exactly is "system caching?" We have lots of different caches
>> in
>> current platforms, and it's not really obvious to me from the
>> description here (or the bspec page) exactly which cache(s) are
>> involved
>> here.
>>
>> Is turning this on always a win or is it situational? I.e., is there
>> any potential for display memory traffic to fill a cache with FBC
>> data
>> by evicting data that was part of the CPU or GT's working set? If
>> so,
>> that seems like it could potentially harm the performance of other
>> workloads running on the platform.
>>
>> Or is this whole thing about a completely new cache (unrelated to
>> and unusable by anything else) which is devoted solely to FBC?
>>
>> >
>> > Bspec: 74722
>>
>> You might want to add 68881 here since it has a bit more information
>> about how we're actually supposed to set the fields documented on
>> 74722.
>
>Okay I will include that. I guess, the HAS also talks about "system
>cache" - no further explanation. But only a fixed portion is allocated
>specifically for the display use and is "configurable". Motivation is
>to reduce to memory subsystem power especially in idle scenarios.
>
>>
>> > Signed-off-by: Vinod Govindapillai <[email protected]>
>> > Signed-off-by: Gustavo Sousa <[email protected]>
>> > ---
>> > drivers/gpu/drm/i915/display/intel_fbc.c | 47
>> > +++++++++++++++++++++++++++
>> > drivers/gpu/drm/i915/display/intel_fbc_regs.h | 9 +++++
>> > 2 files changed, 56 insertions(+)
>> >
>> > diff --git a/drivers/gpu/drm/i915/display/intel_fbc.c
>> > b/drivers/gpu/drm/i915/display/intel_fbc.c
>> > index 24b72951ea3c..e2e55c58ddbc 100644
>> > --- a/drivers/gpu/drm/i915/display/intel_fbc.c
>> > +++ b/drivers/gpu/drm/i915/display/intel_fbc.c
>> > @@ -127,6 +127,9 @@ struct intel_fbc {
>> > */
>> > struct intel_fbc_state state;
>> > const char *no_fbc_reason;
>> > +
>> > + /* Only one of FBC instances can use the system cache */
>> > + bool own_sys_cache;
>> > };
>> >
>> > static char fbc_name(enum intel_fbc_id fbc_id)
>> > @@ -569,12 +572,51 @@ static bool ilk_fbc_is_compressing(struct
>> > intel_fbc *fbc)
>> > return intel_de_read(fbc->display, ILK_DPFC_STATUS(fbc-
>> > >id)) & DPFC_COMP_SEG_MASK;
>> > }
>> >
>> > +static void nvl_fbc_program_system_cache(struct intel_fbc *fbc,
>> > bool enable)
>> > +{
>> > + struct intel_display *display = fbc->display;
>> > + u32 cfb_offset, usage;
>> > +
>> > + lockdep_assert_held(&fbc->lock);
>> > +
>> > + usage = intel_de_read(display,
>> > NVL_FBC_SYS_CACHE_USAGE_CFG);
>> > +
>> > + /* System cache already being used by another pipe */
>> > + if (enable && (usage & FBC_SYS_CACHE_TAG_USE_RES_SPACE))
>> > + return;
>>
>> Rather than relying on the current register contents, should we be
>> sanitizing this on driver probe (in case the pre-OS firmware already
>> set
>> this up) and then making our own decisions (as part of an atomic
>> transaction) about which pipe to prioritize after that?
>
>
>>
>> > +
>> > + /* Only the fbc instance which owns system cache can
>> > disable it */
>> > + if (!enable && !fbc->own_sys_cache)
>> > + return;
>> > +
>> > + /*
>> > + * Not programming the cache limit and cache reading
>> > enable bits explicitly
>> > + * here. The default values should take care of those and
>> > that could leave
>> > + * adjustments of those bits to the system hw policy
>> > + *
>> > + * TODO: check if we need to explicitly program these?
>>
>> There's no hardware default documented for the range field, so unless
>> the pre-OS firmware sets it up (which we probably shouldn't rely on),
>> I'd expect this to be 0; I don't think that's what we want.
>
>The Bspec says it is 2MB. But according to the HAS it is "configurable"
>and I clarified that this is at the moment 2MB but can change. So I
>read it as something already configured and set as the default value to
>the register and it could be changed by the soc policy. Thats the
>reason I thought it be kept untouched. Will forward on email
>conversation I had.
>
>
>>
>> > + */
>> > + cfb_offset = enable ? i915_gem_stolen_node_offset(fbc-
>> > >compressed_fb) : 0;
>> > + usage |= FBC_SYS_CACHE_START_BASE(cfb_offset);
>>
>> And if something *did* set this up already, then OR'ing a new address
>> over the old one isn't going to work. We'd need "(old & ~mask) |
>> new"
>> to ensure we don't have leftover bits still set by accident. But it
>> would probably be better to just avoid RMW-style handling in general
>> and
>> build a complete register value with exactly what we want rather than
>> trying to modify the pre-existing value.
>>
>> > + usage |= enable ? FBC_SYS_CACHE_TAG_USE_RES_SPACE :
>> > FBC_SYS_CACHE_TAG_DONT_CACHE;
>> > +
>> > + intel_de_write(display, NVL_FBC_SYS_CACHE_USAGE_CFG,
>> > usage);
>> > +
>> > + fbc->own_sys_cache = enable;
>
>Okay. Thanks. Will fix that!
>
>>
>> It feels like instead of having this as a boolean flag in fbc, this
>> should be a pointer/ID tracked at the intel_display level. E.g.,
>>
>> display->sys_cache_fbc = fbc;
>>
>> or possibly converted over to something tracked with atomic state so
>> that we can make better high-level decisions about which FBC we want
>> to
>> enable this on as various displays get enabled/disabled.
>
>Okay. Will check this and get rid of the bool from the intel_fbc
>structure! At the moment we can allocate only based on the firt pipe
>enabling the fbc. But may be in future we could have some logic for
>this I guess.
Vinod, based on your replies here, I'm assuming you are going to send an
updated version of this patch. I'll drop it in the v4 of this series.
--
Gustavo Sousa
>
>> Matt
>>
>> > +
>> > + drm_dbg_kms(display->drm, "System caching for FBC[%d]
>> > %s\n",
>> > + fbc->id, enable ? "configured" : "cleared");
>> > +}
>> > +
>> > static void ilk_fbc_program_cfb(struct intel_fbc *fbc)
>> > {
>> > struct intel_display *display = fbc->display;
>> >
>> > intel_de_write(display, ILK_DPFC_CB_BASE(fbc->id),
>> > i915_gem_stolen_node_offset(fbc-
>> > >compressed_fb));
>> > +
>> > + if (DISPLAY_VER(display) >= 35)
>> > + nvl_fbc_program_system_cache(fbc, true);
>> > }
>> >
>> > static const struct intel_fbc_funcs ilk_fbc_funcs = {
>> > @@ -952,6 +994,8 @@ static void
>> > intel_fbc_program_workarounds(struct intel_fbc *fbc)
>> >
>> > static void __intel_fbc_cleanup_cfb(struct intel_fbc *fbc)
>> > {
>> > + struct intel_display *display = fbc->display;
>> > +
>> > if (WARN_ON(intel_fbc_hw_is_active(fbc)))
>> > return;
>> >
>> > @@ -959,6 +1003,9 @@ static void __intel_fbc_cleanup_cfb(struct
>> > intel_fbc *fbc)
>> > i915_gem_stolen_remove_node(fbc->compressed_llb);
>> > if (i915_gem_stolen_node_allocated(fbc->compressed_fb))
>> > i915_gem_stolen_remove_node(fbc->compressed_fb);
>> > +
>> > + if (DISPLAY_VER(display) >= 35)
>> > + nvl_fbc_program_system_cache(fbc, false);
>> > }
>> >
>> > void intel_fbc_cleanup(struct intel_display *display)
>> > diff --git a/drivers/gpu/drm/i915/display/intel_fbc_regs.h
>> > b/drivers/gpu/drm/i915/display/intel_fbc_regs.h
>> > index 77d8321c4fb3..592cd2384255 100644
>> > --- a/drivers/gpu/drm/i915/display/intel_fbc_regs.h
>> > +++ b/drivers/gpu/drm/i915/display/intel_fbc_regs.h
>> > @@ -128,4 +128,13 @@
>> > #define FBC_REND_NUKE REG_BIT(2)
>> > #define FBC_REND_CACHE_CLEAN REG_BIT(1)
>> >
>> > +#define NVL_FBC_SYS_CACHE_USAGE_CFG _MMIO(0x1344E0)
>> > +#define FBC_SYS_CACHE_START_BASE_MASK REG_GENMASK(31,
>> > 16)
>> > +#define FBC_SYS_CACHE_START_BASE(base)
>> > REG_FIELD_PREP(FBC_SYS_CACHE_START_BASE_MASK, (base))
>> > +#define FBC_SYS_CACHEABLE_RANGE_MASK REG_GENMASK(15, 4)
>> > +#define FBC_SYS_CACHEABLE_RANGE(range)
>> > REG_FIELD_PREP(FBC_SYS_CACHEABLE_RANGE_MASK, (range))
>> > +#define FBC_SYS_CACHE_TAG_MASK REG_GENMASK(3, 2)
>> > +#define FBC_SYS_CACHE_TAG_DONT_CACHE
>> > REG_FIELD_PREP(FBC_SYS_CACHE_TAG_MASK, 0)
>> > +#define FBC_SYS_CACHE_TAG_USE_RES_SPACE
>> > REG_FIELD_PREP(FBC_SYS_CACHE_TAG_MASK, 3)
>> > +
>> > #endif /* __INTEL_FBC_REGS__ */
>> >
>> > --
>> > 2.51.0
>> >
>>
>