Re: [PATCH] drm/syncobj: add DRM_IOCTL_SYNCOBJ_IMPORT/EXPORT_SYNC_FILE

2023-07-20 Thread Simon Ser
I had a look at the IGT and I'm not sure about the approach. It seems
like the patch replaces occurrences of the old FLAGS_IMPORT_SYNC_FILE
and FLAGS_EXPORT_SYNC_FILE plus TRANSFER with the new IOCTLs. However
we still want to test the functionality of that old codepath: we need
to continue to test that the old IOCTLs work as expected.

Are the old IOCTLs still sufficiently tested elsewhere? If not, we need
to either duplicate the tests or add a flag to the test function
to select between old and new.
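
As a rough illustration of the flag option, here is a minimal sketch of one
test body parameterised by a boolean. Every name in it is hypothetical
(export_via_old_flags, export_via_new_ioctl, test_export are placeholders,
not IGT or kernel API), and the stubs only count calls where a real test
would issue the DRM ioctls on an open device fd.

```c
#include <stdbool.h>

/* Hypothetical call counters so a caller can see which path ran. */
static int old_calls, new_calls;

/* Hypothetical stand-in for the old FLAGS_EXPORT_SYNC_FILE path. */
static int export_via_old_flags(int syncobj)
{
	old_calls++;
	return syncobj + 100;	/* pretend this is a sync_file fd */
}

/* Hypothetical stand-in for the new dedicated IOCTL path. */
static int export_via_new_ioctl(int syncobj)
{
	new_calls++;
	return syncobj + 100;	/* pretend this is a sync_file fd */
}

/* One test body, run once per path so the old codepath stays covered. */
static int test_export(int syncobj, bool use_new_ioctl)
{
	int fd = use_new_ioctl ? export_via_new_ioctl(syncobj)
			       : export_via_old_flags(syncobj);

	return fd >= 0 ? 0 : -1;
}
```

Running the same body with the flag set to false and then true keeps both
paths under identical assertions without duplicating the test code.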

Also, it would be good to have some basic tests for invalid cases, e.g.
for the invalid zero syncobj handle, for timeline points which haven't
materialized yet, etc.

I wonder if we need to detect at runtime whether the IOCTL is available.
I'm not sure what the IGT requirements are: is it supposed to run on
any Linux version, or does it require drm-next?

It would help to post the IGT patches on the mailing list so that we
can do a proper review there.


[PATCH] drm/radeon: ERROR: "foo * bar" should be "foo *bar"

2023-07-20 Thread sunran001

Fix two occurrences of the checkpatch.pl error:
ERROR: "foo * bar" should be "foo *bar"

Signed-off-by: Jianghui Xu 
---
 drivers/gpu/drm/radeon/atom.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/radeon/atom.c b/drivers/gpu/drm/radeon/atom.c
index 11a1940bb26d..93acb0e42bd6 100644
--- a/drivers/gpu/drm/radeon/atom.c
+++ b/drivers/gpu/drm/radeon/atom.c
@@ -68,8 +68,8 @@ typedef struct {
 } atom_exec_context;

 int atom_debug = 0;
-static int atom_execute_table_locked(struct atom_context *ctx, int index, uint32_t * params);
-int atom_execute_table(struct atom_context *ctx, int index, uint32_t * params);
+static int atom_execute_table_locked(struct atom_context *ctx, int index, uint32_t *params);
+int atom_execute_table(struct atom_context *ctx, int index, uint32_t *params);


 static uint32_t atom_arg_mask[8] = {
0x, 0x, 0x0000, 0x,


[PATCH] drm/radeon: ERROR: "foo * bar" should be "foo *bar"

2023-07-20 Thread sunran001

Fix two occurrences of the checkpatch.pl error:
ERROR: "foo * bar" should be "foo *bar"

Signed-off-by: Ran Sun 
---
 drivers/gpu/drm/radeon/atom.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/radeon/atom.c b/drivers/gpu/drm/radeon/atom.c
index 11a1940bb26d..93acb0e42bd6 100644
--- a/drivers/gpu/drm/radeon/atom.c
+++ b/drivers/gpu/drm/radeon/atom.c
@@ -68,8 +68,8 @@ typedef struct {
 } atom_exec_context;

 int atom_debug = 0;
-static int atom_execute_table_locked(struct atom_context *ctx, int index, uint32_t * params);
-int atom_execute_table(struct atom_context *ctx, int index, uint32_t * params);
+static int atom_execute_table_locked(struct atom_context *ctx, int index, uint32_t *params);
+int atom_execute_table(struct atom_context *ctx, int index, uint32_t *params);


 static uint32_t atom_arg_mask[8] = {
0x, 0x, 0x0000, 0x,


Re: [PATCH] backlight: gpio_backlight: Drop output gpio direction check for initial power state

2023-07-20 Thread Andy Shevchenko
On Fri, Jul 21, 2023 at 8:17 AM Ying Liu wrote:
> On Thursday, July 20, 2023 7:28 PM Daniel Thompson wrote:
> > On Thu, Jul 20, 2023 at 06:06:27AM +, Ying Liu wrote:
> > > Bootloader may leave gpio direction as input and gpio value as logical low.
> > > It hints that initial backlight power state should be FB_BLANK_POWERDOWN
> > > since the gpio value is literally logical low.
> >
> > To be honest this probably "hints" that the bootloader simply didn't
> > consider the backlight at all :-) . I'd rather the patch description
> > focus on what circumstances lead to the current code making a bad
> > decision. More like:
> >
> >   If the GPIO pin is in the input state but the backlight is currently
> >   off due to default pull downs then ...
>
> How about this patch description?
>
> -8<---
> Without this patch, if gpio pin is in input state but backlight is currently

s/Without this patch, if/If/

> off due to default pull downs, then initial power state is set to
> FB_BLANK_UNBLANK in DT boot mode with phandle link and the backlight is
> effectively turned on in gpio_backlight_probe(), which is undesirable
> according to patch description of commit ec665b756e6f ("backlight:
> gpio-backlight: Correct initial power state handling").
>
> Quote:
> ---
> If in DT boot we have phandle link then leave the GPIO in a state which the
> bootloader left it and let the user of the backlight to configure it further.
> ---
>
> So, let's drop output gpio direction check and only check gpio value to set
> the initial power state.
> -8<---

-- 
With Best Regards,
Andy Shevchenko


[PATCH] drm/radeon: ERROR: "foo * bar" should be "foo *bar"

2023-07-20 Thread sunran001

Fix two occurrences of the checkpatch.pl error:
ERROR: "foo * bar" should be "foo *bar"

Signed-off-by: Ran Sun 
---
 drivers/gpu/drm/radeon/atom.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/radeon/atom.c b/drivers/gpu/drm/radeon/atom.c
index 11a1940bb26d..93acb0e42bd6 100644
--- a/drivers/gpu/drm/radeon/atom.c
+++ b/drivers/gpu/drm/radeon/atom.c
@@ -68,8 +68,8 @@ typedef struct {
 } atom_exec_context;

 int atom_debug = 0;
-static int atom_execute_table_locked(struct atom_context *ctx, int index, uint32_t * params);
-int atom_execute_table(struct atom_context *ctx, int index, uint32_t * params);
+static int atom_execute_table_locked(struct atom_context *ctx, int index, uint32_t *params);
+int atom_execute_table(struct atom_context *ctx, int index, uint32_t *params);


 static uint32_t atom_arg_mask[8] = {
0x, 0x, 0x0000, 0x,


RE: [PATCH] backlight: gpio_backlight: Drop output gpio direction check for initial power state

2023-07-20 Thread Ying Liu
Hi Daniel,

On Thursday, July 20, 2023 7:28 PM Daniel Thompson wrote:
>
> On Thu, Jul 20, 2023 at 06:06:27AM +, Ying Liu wrote:
> > Bootloader may leave gpio direction as input and gpio value as logical low.
> > It hints that initial backlight power state should be FB_BLANK_POWERDOWN
> > since the gpio value is literally logical low.
> 
> To be honest this probably "hints" that the bootloader simply didn't
> consider the backlight at all :-) . I'd rather the patch description
> focus on what circumstances lead to the current code making a bad
> decision. More like:
> 
>   If the GPIO pin is in the input state but the backlight is currently
>   off due to default pull downs then ...

How about this patch description?

-8<---
Without this patch, if gpio pin is in input state but backlight is currently
off due to default pull downs, then initial power state is set to
FB_BLANK_UNBLANK in DT boot mode with phandle link and the backlight is
effectively turned on in gpio_backlight_probe(), which is undesirable
according to patch description of commit ec665b756e6f ("backlight:
gpio-backlight: Correct initial power state handling").

Quote:
---
If in DT boot we have phandle link then leave the GPIO in a state which the
bootloader left it and let the user of the backlight to configure it further.
---

So, let's drop output gpio direction check and only check gpio value to set
the initial power state.
-8<---

Regards,
Liu Ying
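
For readers following the thread, the difference between the current and the
proposed decision can be sketched as a small userspace model. This is only a
sketch under stated assumptions: it covers the DT boot with phandle link case
alone, the struct and function names are hypothetical, and the two enum
values are local stand-ins for FB_BLANK_UNBLANK and FB_BLANK_POWERDOWN rather
than the kernel definitions.

```c
#include <stdbool.h>

/* Local stand-ins for FB_BLANK_UNBLANK / FB_BLANK_POWERDOWN (illustrative). */
enum blank_state { BLANK_UNBLANK, BLANK_POWERDOWN };

/* Hypothetical snapshot of what the bootloader left behind. */
struct gpio_state {
	bool is_output;		/* GPIO direction: true = output, false = input */
	bool value_high;	/* logical GPIO value */
};

/*
 * Behaviour described as current in the thread: POWERDOWN is chosen only
 * when the pin is an output AND reads low, so an input pin pulled low
 * still ends up UNBLANK.
 */
static enum blank_state initial_power_current(const struct gpio_state *s)
{
	if (s->is_output && !s->value_high)
		return BLANK_POWERDOWN;
	return BLANK_UNBLANK;
}

/* Proposed behaviour: drop the direction check, use the value only. */
static enum blank_state initial_power_proposed(const struct gpio_state *s)
{
	return s->value_high ? BLANK_UNBLANK : BLANK_POWERDOWN;
}
```

With an input pin that reads low, the first function returns the unblank
state while the second returns powerdown, which is the case the proposed
description is about.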


Re: [PATCH v2 1/9] vgacon: rework Kconfig dependencies

2023-07-20 Thread Michael Ellerman
Arnd Bergmann  writes:
> From: Arnd Bergmann 
>
> The list of dependencies here is phrased as an opt-out, but this is missing
> a lot of architectures that don't actually support VGA consoles, and some
> of the entries are stale:
>
>  - powerpc used to support VGA consoles in the old arch/ppc codebase, but
>    the merged arch/powerpc never did

Not disputing this, but how did you come to that conclusion? I grepped
around and couldn't convince myself whether it can work on powerpc or
not. ie. currently it's possible to enable CONFIG_VGA_CONSOLE and
powerpc does have a struct screen_info defined which seems like it would
allow vgacon_startup() to complete.

My only concern is that someone could be using it with Qemu?

cheers


RE: [PATCH v3] drm/i915: Refactor PAT/object cache handling

2023-07-20 Thread Yang, Fei
>>> [snip]
> @@ -27,15 +28,8 @@ static bool gpu_write_needs_clflush(struct
> drm_i915_gem_object *obj)

 The code change here looks accurate, but while we're here, I have a
 side question about this function in general...it was originally
 introduced in commit 48004881f693 ("drm/i915: Mark CPU cache as
 dirty when used for
 rendering") which states that GPU rendering ends up in the CPU cache
 (and thus needs a clflush later to make sure it lands in memory).
 That makes sense to me for LLC platforms, but is it really true for
 non-LLC snooping platforms (like MTL) as the commit states?
>>>
>>> For non-LLC platforms objects can be set to 1-way coherent which
>>> means GPU rendering ending up in CPU cache as well, so for non-LLC
>>> platform the logic here should be checking 1-way coherent flag.
>>
>> That's the part that I'm questioning (and not just for MTL, but for
>> all of our other non-LLC platforms too).  Just because there's
>> coherency doesn't mean that device writes landed in the CPU cache.
>> Coherency is also achieved if device writes invalidate the contents of the 
>> CPU cache.
>> I thought our non-LLC snooping platforms were coherent due to
>> write-invalidate rather than write-update, but I can't find it
>> specifically documented anywhere at the moment.  If write-invalidate
>> was used, then there shouldn't be a need for a later clflush either.
>
> [Trying to consolidate by doing a combined reply to the discussion so far.]
>
> On the write-invalidate vs write-update I don't know. If you did not
> find it in bspec then I doubt I would. I can have a browse still.

Matt was correct. Quote Ron Silvas from SW ARCH, "MTL GPU doesn't write to
CPU cache, it simply snoop CPU cache on its way to RAM."

 My understanding
 was that snooping platforms just invalidated the CPU cache to
 prevent future CPU reads from seeing stale data but didn't actually
 stick any new data in there?  Am I off track or is the original
 logic of this function not quite right?

 Anyway, even if the logic of this function is wrong, it's a mistake
 that would only hurt performance
>>>
>>> Yes, this logic will introduce performance impact because it's
>>> missing the checking for obj->pat_set_by_user. For objects with
>>> pat_set_by_user==true, even if the object is snooping or 1-way
>>> coherent, we don't want to enforce a clflush here since the
>>> coherency is supposed to be handled by user space.
>
> What should I add you think to fix it?

I think the simplest would be

	if (obj->pat_set_by_user)
		return false;

because even checking for incoherent WB is unnecessary, simply no
need for the KMD to initiate a flush if PAT is set by user.
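
To show where such an early return would sit relative to the other checks,
here is a minimal userspace sketch. The struct obj_model type and the
function name are hypothetical and reduced to the three conditions discussed
in this thread; it is not the driver's actual function.

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down object; only the fields the sketch needs. */
struct obj_model {
	bool is_dgfx;		/* discrete GPU: the function returns false */
	bool pat_set_by_user;	/* userspace manages coherency for this object */
	bool uncached_or_wt;	/* cache mode is uncached or write-through */
};

static bool gpu_write_needs_clflush_model(const struct obj_model *obj)
{
	if (obj->is_dgfx)
		return false;

	/* Suggested early exit: no KMD-initiated flush when PAT is user-set. */
	if (obj->pat_set_by_user)
		return false;

	/* Otherwise a flush is needed unless the object is UC or WT. */
	return !obj->uncached_or_wt;
}
```

Placing the check before the cache-mode test means the cache mode is never
consulted for objects whose PAT index was chosen by userspace.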

> Add a check for non-coherent WB in gpu_write_needs_clflush as an additional 
> condition for returning false?
>
> And then if Matt is correct write-invalidate is used also !HAS_LLC should 
> just return false?
>
 (flushing more often than we truly need to) rather than
 functionality, so not something we really need to dig into right now
 as part of this patch.

>   if (IS_DGFX(i915))
>   return false;
>
> -/*
> - * For objects created by userspace through GEM_CREATE with pat_index
> - * set by set_pat extension, i915_gem_object_has_cache_level() will
> - * always return true, because the coherency of such object is 
> managed
> - * by userspace. Othereise the call here would fall back to checking
> - * whether the object is un-cached or write-through.
> - */
> -return !(i915_gem_object_has_cache_level(obj, I915_CACHE_NONE) ||
> - i915_gem_object_has_cache_level(obj, I915_CACHE_WT));
> +return i915_gem_object_has_cache_mode(obj, I915_CACHE_MODE_UC) != 1 
> &&
> +   i915_gem_object_has_cache_mode(obj, I915_CACHE_MODE_WT) != 1;
>   }
>>>
>>> [snip]
> @@ -640,15 +640,9 @@ static inline int use_cpu_reloc(const struct 
> reloc_cache *cache,
>   if (DBG_FORCE_RELOC == FORCE_GTT_RELOC)
>   return false;
>
> -/*
> - * For objects created by userspace through GEM_CREATE with pat_index
> - * set by set_pat extension, i915_gem_object_has_cache_level() always
> - * return true, otherwise the call would fall back to checking 
> whether
> - * the object is un-cached.
> - */
>   return (cache->has_llc ||
>   obj->cache_dirty ||
> -!i915_gem_object_has_cache_level(obj, I915_CACHE_NONE));
> +i915_gem_object_has_cache_mode(obj,
> + I915_CACHE_MODE_UC) != 1);

 Platforms with relocations and platforms with user-specified PAT
 have no overlap, right?  So a -1 return should be impossible here
 and this is one case where we could just treat the return value as
 a boolean, right?
>>>
>
> Hm no, or m

[PATCH] drm/amd/pm: open brace '{' following struct go on the same line

2023-07-20 Thread sunran001

ERROR: open brace '{' following struct go on the same line

Signed-off-by: Ran Sun 
---
 drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h
index ddc488251313..0cf564ea1ed8 100644
--- a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h
+++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h
@@ -429,10 +429,10 @@ int amdgpu_pm_load_smu_firmware(struct amdgpu_device *adev, uint32_t *smu_versio
 int amdgpu_dpm_handle_passthrough_sbr(struct amdgpu_device *adev, bool enable);
 int amdgpu_dpm_send_hbm_bad_pages_num(struct amdgpu_device *adev, uint32_t size);
 int amdgpu_dpm_send_hbm_bad_channel_flag(struct amdgpu_device *adev, uint32_t size);
-int amdgpu_dpm_get_dpm_freq_range(struct amdgpu_device *adev,enum pp_clock_type type,
- uint32_t *min,uint32_t *max);
-int amdgpu_dpm_set_soft_freq_range(struct amdgpu_device *adev,enum pp_clock_type type,
-  uint32_t min,uint32_t max);
+int amdgpu_dpm_get_dpm_freq_range(struct amdgpu_device *adev, enum pp_clock_type type,
+ uint32_t *min, uint32_t *max);
+int amdgpu_dpm_set_soft_freq_range(struct amdgpu_device *adev, enum pp_clock_type type,
+  uint32_t min, uint32_t max);
 int amdgpu_dpm_write_watermarks_table(struct amdgpu_device *adev);
 int amdgpu_dpm_wait_for_event(struct amdgpu_device *adev, enum smu_event_type event,
   uint64_t event_arg);


Re: [git pull] drm fixes for 6.5-rc3

2023-07-20 Thread pr-tracker-bot
The pull request you sent on Fri, 21 Jul 2023 13:14:05 +1000:

> git://anongit.freedesktop.org/drm/drm tags/drm-fixes-2023-07-21

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/f7e3a1bafdea735050dfde00523cf505dc7fd309

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


[PATCH] drm/amdgpu: open brace '{' following struct go on the same line

2023-07-20 Thread sunran001

ERROR: open brace '{' following struct go on the same line

Signed-off-by: Ran Sun 
---
 drivers/gpu/drm/amd/pm/inc/amdgpu_pm.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_pm.h b/drivers/gpu/drm/amd/pm/inc/amdgpu_pm.h
index 52045ad59bed..eec816f0cbf9 100644
--- a/drivers/gpu/drm/amd/pm/inc/amdgpu_pm.h
+++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_pm.h
@@ -24,8 +24,7 @@
 #ifndef __AMDGPU_PM_H__
 #define __AMDGPU_PM_H__

-struct cg_flag_name
-{
+struct cg_flag_name {
u64 flag;
const char *name;
 };


[PATCH] drm/amd/pm: open brace '{' following struct go on the same line

2023-07-20 Thread sunran001

ERROR: open brace '{' following struct go on the same line

Signed-off-by: Ran Sun 
---
 .../gpu/drm/amd/pm/inc/smu_v13_0_0_pptable.h  | 21 +++
 1 file changed, 7 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/inc/smu_v13_0_0_pptable.h b/drivers/gpu/drm/amd/pm/inc/smu_v13_0_0_pptable.h
index 1dc7a065a6d4..251ed011b3b0 100644
--- a/drivers/gpu/drm/amd/pm/inc/smu_v13_0_0_pptable.h
+++ b/drivers/gpu/drm/amd/pm/inc/smu_v13_0_0_pptable.h
@@ -41,8 +41,7 @@
 #define SMU_13_0_0_PP_OVERDRIVE_VERSION 0x83// OverDrive 8 Table Version 0.2
 #define SMU_13_0_0_PP_POWERSAVINGCLOCK_VERSION 0x01 // Power Saving Clock Table Version 1.00


-enum SMU_13_0_0_ODFEATURE_CAP
-{
+enum SMU_13_0_0_ODFEATURE_CAP {
 SMU_13_0_0_ODCAP_GFXCLK_LIMITS = 0,
 SMU_13_0_0_ODCAP_UCLK_LIMITS,
 SMU_13_0_0_ODCAP_POWER_LIMIT,
@@ -62,8 +61,7 @@ enum SMU_13_0_0_ODFEATURE_CAP
 SMU_13_0_0_ODCAP_COUNT,
 };

-enum SMU_13_0_0_ODFEATURE_ID
-{
+enum SMU_13_0_0_ODFEATURE_ID {
 SMU_13_0_0_ODFEATURE_GFXCLK_LIMITS   = 1 << SMU_13_0_0_ODCAP_GFXCLK_LIMITS,   //GFXCLK Limit feature
 SMU_13_0_0_ODFEATURE_UCLK_LIMITS = 1 << SMU_13_0_0_ODCAP_UCLK_LIMITS, //UCLK Limit feature
 SMU_13_0_0_ODFEATURE_POWER_LIMIT = 1 << SMU_13_0_0_ODCAP_POWER_LIMIT, //Power Limit feature

@@ -85,8 +83,7 @@ enum SMU_13_0_0_ODFEATURE_ID

 #define SMU_13_0_0_MAX_ODFEATURE 32 //Maximum Number of OD Features

-enum SMU_13_0_0_ODSETTING_ID
-{
+enum SMU_13_0_0_ODSETTING_ID {
 SMU_13_0_0_ODSETTING_GFXCLKFMAX = 0,
 SMU_13_0_0_ODSETTING_GFXCLKFMIN,
 SMU_13_0_0_ODSETTING_UCLKFMIN,
@@ -123,8 +120,7 @@ enum SMU_13_0_0_ODSETTING_ID
 };
 #define SMU_13_0_0_MAX_ODSETTING 64 //Maximum Number of ODSettings

-enum SMU_13_0_0_PWRMODE_SETTING
-{
+enum SMU_13_0_0_PWRMODE_SETTING {
 SMU_13_0_0_PMSETTING_POWER_LIMIT_QUIET = 0,
 SMU_13_0_0_PMSETTING_POWER_LIMIT_BALANCE,
 SMU_13_0_0_PMSETTING_POWER_LIMIT_TURBO,
@@ -144,8 +140,7 @@ enum SMU_13_0_0_PWRMODE_SETTING
 };
 #define SMU_13_0_0_MAX_PMSETTING 32 //Maximum Number of PowerMode Settings


-struct smu_13_0_0_overdrive_table
-{
+struct smu_13_0_0_overdrive_table {
 uint8_t revision; //Revision = SMU_13_0_0_PP_OVERDRIVE_VERSION
 uint8_t reserve[3];   //Zero filled field reserved for future use
 uint32_t feature_count;   //Total number of supported features

@@ -156,8 +151,7 @@ struct smu_13_0_0_overdrive_table
 int16_t pm_setting[SMU_13_0_0_MAX_PMSETTING]; //Optimized power mode feature settings

 };

-enum SMU_13_0_0_PPCLOCK_ID
-{
+enum SMU_13_0_0_PPCLOCK_ID {
 SMU_13_0_0_PPCLOCK_GFXCLK = 0,
 SMU_13_0_0_PPCLOCK_SOCCLK,
 SMU_13_0_0_PPCLOCK_UCLK,
@@ -175,8 +169,7 @@ enum SMU_13_0_0_PPCLOCK_ID
 };
 #define SMU_13_0_0_MAX_PPCLOCK 16 //Maximum Number of PP Clocks

-struct smu_13_0_0_powerplay_table
-{
+struct smu_13_0_0_powerplay_table {
 struct atom_common_table_header header; //For SMU13, header.format_revision = 15, header.content_revision = 0
 uint8_t table_revision; //For SMU13, table_revision = 2

 uint8_t padding;


[git pull] drm fixes for 6.5-rc3

2023-07-20 Thread Dave Airlie
Hi Linus,

This week's fixes pull. I just realised the tag says it is for rc2, but
it's meant to be rc3. Mostly amdgpu fixes, a couple of i915 fixes, some
nouveau and then a few misc accel and other fixes.

Dave.

drm-fixes-2023-07-21:
drm fixes for 6.5-rc2

client:
- memory leak fix

dma-buf:
- memory leak fix

qaic:
- bound check fixes
- map_user_pages leak
- int overflow fixes

habanalabs:
- debugfs stub helper

nouveau:
- aux event slot fixes
- anx9805 cards fixes

i915:
- Add sentinel to xehp_oa_b_counters
- Revert "drm/i915: use localized __diag_ignore_all() instead of per file"

amdgpu:
- More PCIe DPM fixes for Intel platforms
- DCN3.0.1 fixes
- Virtual display timer fix
- Async flip fix
- SMU13 clock reporting fixes
- Add missing PSP firmware declaration
- DP MST fix
- DCN3.1.x fixes
- Slab out of bounds fix

The following changes since commit fdf0eaf11452d72945af31804e2a1048ee1b574c:

  Linux 6.5-rc2 (2023-07-16 15:10:37 -0700)

are available in the Git repository at:

  git://anongit.freedesktop.org/drm/drm tags/drm-fixes-2023-07-21

for you to fetch changes up to 28801cc85906ea62043e62c71def7f9daaf1c168:

  Merge tag 'amd-drm-fixes-6.5-2023-07-20' of
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes (2023-07-21
12:16:47 +1000)


Alex Deucher (2):
  drm/amdgpu/pm: make gfxclock consistent for sienna cichlid
  drm/amdgpu/pm: make mclk consistent for smu 13.0.7

Andrzej Hajda (1):
  drm/i915/perf: add sentinel to xehp_oa_b_counters

Arnd Bergmann (1):
  accel/habanalabs: add more debugfs stub helpers

Aurabindo Pillai (2):
  drm/amd/display: export some optc function for reuse
  drm/amd/display: add DCN301 specific logic for OTG programming

Ben Skeggs (3):
  drm/nouveau/i2c: fix number of aux event slots
  drm/nouveau/disp: PIOR DP uses GPIO for HPD, not PMGR AUX interrupts
  drm/nouveau/kms/nv50-: init hpd_irq_lock for PIOR DP

Candice Li (1):
  drm/amdgpu: Allow the initramfs generator to include psp_13_0_6_ta

Dan Carpenter (4):
  accel/qaic: tighten bounds checking in encode_message()
  accel/qaic: tighten bounds checking in decode_message()
  accel/qaic: Add consistent integer overflow checks
  accel/qaic: Fix a leak in map_user_pages()

Daniel Miess (1):
  drm/amd/display: Prevent vtotal from being set to 0

Dave Airlie (3):
  Merge tag 'drm-misc-fixes-2023-07-20' of
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
  Merge tag 'drm-intel-fixes-2023-07-20' of
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
  Merge tag 'amd-drm-fixes-6.5-2023-07-20' of
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

Guchun Chen (5):
  drm/amdgpu/vkms: relax timer deactivation by hrtimer_try_to_cancel
  drm/amdgpu: Allocate root PD on correct partition
  drm/amdgpu: fix slab-out-of-bounds issue in amdgpu_vm_pt_create
  drm/amdgpu/vm: use the same xcp_id from root PD
  drm/amdgpu: use a macro to define no xcp partition case

Jani Nikula (1):
  Revert "drm/i915: use localized __diag_ignore_all() instead of per file"

Jocelyn Falempe (2):
  drm/client: Fix memory leak in drm_client_target_cloned
  drm/client: Fix memory leak in drm_client_modeset_probe

Mario Limonciello (1):
  drm/amd: Use amdgpu_device_pcie_dynamic_switching_supported() for SMU7

Nicholas Kazlauskas (1):
  drm/amd/display: Keep PHY active for DP displays on DCN31

Simon Ser (1):
  drm/amd/display: only accept async flips for fast updates

Srinivasan Shanmugam (1):
  drm/amd/display: Clean up errors & warnings in amdgpu_dm.c

Taimur Hassan (1):
  drm/amd/display: check TG is non-null before checking if enabled

Ville Syrjälä (1):
  dma-buf/dma-resv: Stop leaking on krealloc() failure

Wayne Lin (1):
  drm/amd/display: Add polling method to handle MST reply packet

Zhikai Zhai (1):
  drm/amd/display: Disable MPC split by default on special asic

 drivers/accel/habanalabs/common/habanalabs.h   |   9 +
 drivers/accel/qaic/qaic_control.c  |  39 ++--
 drivers/dma-buf/dma-resv.c |  13 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c   |   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c|   6 +-
 drivers/gpu/drm/amd/amdgpu/am

[Bug 217690] New: consistent amdgpu failures on Lenovo ThinkPad Z16: "[drm:amdgpu_dm_process_dmub_aux_transfer_sync [amdgpu]] *ERROR* wait_for_completion_timeout timeout!"

2023-07-20 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=217690

Bug ID: 217690
   Summary: consistent amdgpu failures on Lenovo ThinkPad Z16:
"[drm:amdgpu_dm_process_dmub_aux_transfer_sync
[amdgpu]] *ERROR* wait_for_completion_timeout
timeout!"
   Product: Drivers
   Version: 2.5
  Hardware: AMD
OS: Linux
Status: NEW
  Severity: high
  Priority: P3
 Component: Video(DRI - non Intel)
  Assignee: drivers_video-...@kernel-bugs.osdl.org
  Reporter: mh...@mhcomputing.net
Regression: No

I am using the following system:

System Information
Manufacturer: LENOVO
Product Name: 21D4000HUS
Version: ThinkPad Z16 Gen 1
Serial Number: PF3XPEBD
UUID: 59e137b9-fc54-11ec-80f2-6c2408eab813
Wake-up Type: Power Switch
SKU Number: LENOVO_MT_21D4_BU_Think_FM_ThinkPad Z16 Gen 1
Family: ThinkPad Z16 Gen 1

It has the following graphics controller:

Advanced Micro Devices, Inc. [AMD/ATI] Navi 24 [Radeon RX 6400/6500 XT/6500M]
(rev c3)

I see different bizarre AMD GPU errors from 6.2.0, 6.2.16, and 6.4.3. I have
never been able to find a kernel that works 100% stable, so I am not sure how
to narrow this down or work around it without some professional advice.

Here are the errors from Linux version 6.4.3-060403-generic (kernel@kathleen)
(x86_64-linux-gnu-gcc-12 (Ubuntu 12.3.0-1ubuntu1) 12.3.0, GNU ld (GNU Binutils
for Ubuntu) 2.40) #202307110536 SMP PREEMPT_DYNAMIC Tue Jul 11 05:43:58 UTC
2023.

When the errors happen, X / Wayland freezes up. The problem happens more often
in Wayland than X, from what I can see. Sometimes, you can mode-switch to the
TTY and try to recover some work, but other times it locks up any local
interaction with the machine, so you can only get in via SSH. It does not seem
to fully cripple the machine, but the machine usually refuses to complete the
normal shutdown process when the infinite timeout message loop described below
occurs.

I am not 100% sure what information would be most helpful for debugging this,
but let me know and I will provide whatever is needed ASAP.

2023-07-14T13:12:00.727393-07:00 mhall-xps-01 kernel: [601793.214603]
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_0.0.0 timeout, signaled
seq=9336145, emitted seq=9336147
2023-07-14T13:12:00.727411-07:00 mhall-xps-01 kernel: [601793.215114]
[drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process
kwin_wayland pid 2368 thread kwin_wayla:cs0 pid 2378
2023-07-14T13:12:00.727413-07:00 mhall-xps-01 kernel: [601793.215552] amdgpu
:67:00.0: amdgpu: GPU reset begin!
2023-07-14T13:12:01.439388-07:00 mhall-xps-01 kernel: [601793.927226] amdgpu
:67:00.0: amdgpu: MODE2 reset
2023-07-14T13:12:01.452637-07:00 mhall-xps-01 kernel: [601793.936646] amdgpu
:67:00.0: amdgpu: GPU reset succeeded, trying to resume
2023-07-14T13:12:01.452643-07:00 mhall-xps-01 kernel: [601793.936835] [drm]
PCIE GART of 1024M enabled (table at 0x00F41FC0).
2023-07-14T13:12:01.455390-07:00 mhall-xps-01 kernel: [601793.941616] [drm] PSP
is resuming...
2023-07-14T13:12:01.475649-07:00 mhall-xps-01 kernel: [601793.963877] [drm]
reserve 0xa0 from 0xf41e00 for PSP TMR
2023-07-14T13:12:01.799694-07:00 mhall-xps-01 kernel: [601794.288086] amdgpu
:67:00.0: amdgpu: RAS: optional ras ta ucode is not available
2023-07-14T13:12:01.811380-07:00 mhall-xps-01 kernel: [601794.300087] amdgpu
:67:00.0: amdgpu: RAP: optional rap ta ucode is not available
2023-07-14T13:12:01.811389-07:00 mhall-xps-01 kernel: [601794.300089] amdgpu
:67:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
2023-07-14T13:12:01.811390-07:00 mhall-xps-01 kernel: [601794.300094] amdgpu
:67:00.0: amdgpu: SMU is resuming...
2023-07-14T13:12:01.815388-07:00 mhall-xps-01 kernel: [601794.301356] amdgpu
:67:00.0: amdgpu: SMU is resumed successfully!
2023-07-14T13:12:01.815403-07:00 mhall-xps-01 kernel: [601794.303468] [drm]
DMUB hardware initialized: version=0x042E
2023-07-14T13:12:02.839400-07:00 mhall-xps-01 kernel: [601794.311151] [drm]
Watermarks table not configured properly by SMU
2023-07-14T13:12:02.839419-07:00 mhall-xps-01 kernel: [601795.324614] [drm] kiq
ring mec 2 pipe 1 q 0
2023-07-14T13:12:02.843825-07:00 mhall-xps-01 kernel: [601795.330387] [drm] VCN
decode and encode initialized successfully(under DPG Mode).
2023-07-14T13:12:02.843842-07:00 mhall-xps-01 kernel: [601795.330870] [drm]
JPEG decode initialized successfully.
2023-07-14T13:12:02.843845-07:00 mhall-xps-01 kernel: [601795.330877] amdgpu
:67:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
2023-07-14T13:12:02.843846-07:00 mhall-xps-01 kernel: [601795.330882] amdgpu
:67:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
2023-07-14T13:12:02.843848-07:00 mhall-xps-01 kernel: [601795.330886] amdgpu
:67:00.0: amdgpu: r

linux-next: manual merge of the accel tree with the drm-fixes tree

2023-07-20 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the accel tree got a conflict in:

  drivers/accel/habanalabs/common/habanalabs.h

between commit:

  78e9b217d78e ("accel/habanalabs: add more debugfs stub helpers")

from the drm-fixes tree and commit:

  32055d39a2bb ("accel/habanalabs: register compute device as an accel device")

from the accel tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc drivers/accel/habanalabs/common/habanalabs.h
index 2f027d5a8206,834f8cbf080a..
--- a/drivers/accel/habanalabs/common/habanalabs.h
+++ b/drivers/accel/habanalabs/common/habanalabs.h
@@@ -3972,23 -4002,6 +4002,15 @@@ void hl_debugfs_set_state_dump(struct h
  
  #else
  
- static inline void __init hl_debugfs_init(void)
- {
- }
- 
- static inline void hl_debugfs_fini(void)
- {
- }
- 
 +static inline int hl_debugfs_device_init(struct hl_device *hdev)
 +{
 +  return 0;
 +}
 +
 +static inline void hl_debugfs_device_fini(struct hl_device *hdev)
 +{
 +}
 +
  static inline void hl_debugfs_add_device(struct hl_device *hdev)
  {
  }
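
For context, the stubs touched by this conflict follow the usual pattern of
providing empty static inline replacements when a feature is compiled out.
A minimal sketch of that pattern, assuming a made-up HAVE_DEBUGFS_MODEL
switch and hypothetical names (none of this is the habanalabs header):

```c
/* Hypothetical device type; counts how many debugfs nodes exist. */
struct device_model {
	int debugfs_nodes;
};

#ifdef HAVE_DEBUGFS_MODEL
/* "Real" branch: pretend to create and remove one node. */
static inline int debugfs_device_init(struct device_model *dev)
{
	dev->debugfs_nodes = 1;
	return 0;
}

static inline void debugfs_device_fini(struct device_model *dev)
{
	dev->debugfs_nodes = 0;
}
#else
/* Stub branch: same signatures, no side effects, init reports success. */
static inline int debugfs_device_init(struct device_model *dev)
{
	(void)dev;
	return 0;
}

static inline void debugfs_device_fini(struct device_model *dev)
{
	(void)dev;
}
#endif
```

Callers can then invoke the init and fini helpers unconditionally, and the
compiler drops the calls when the feature is disabled.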




[PATCH] drm/amd: open brace '{' following struct go on the same line

2023-07-20 Thread sunran001

Fix the checkpatch error as open brace '{' following struct should
go on the same line.

Signed-off-by: Ran Sun 
---
 drivers/gpu/drm/amd/include/yellow_carp_offset.h | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/include/yellow_carp_offset.h b/drivers/gpu/drm/amd/include/yellow_carp_offset.h
index 0fea6a746611..a2c8dca2425e 100644
--- a/drivers/gpu/drm/amd/include/yellow_carp_offset.h
+++ b/drivers/gpu/drm/amd/include/yellow_carp_offset.h
@@ -7,13 +7,11 @@
 #define MAX_SEGMENT 6


-struct IP_BASE_INSTANCE
-{
+struct IP_BASE_INSTANCE {
 unsigned int segment[MAX_SEGMENT];
 } __maybe_unused;

-struct IP_BASE
-{
+struct IP_BASE {
 struct IP_BASE_INSTANCE instance[MAX_INSTANCE];
 } __maybe_unused;


[PATCH] drm/i915/huc: silence injected failure in the load via GSC path

2023-07-20 Thread Daniele Ceraolo Spurio
If we can't load the HuC due to an injected failure, we don't want
to throw an error and trip CI. Using the gt_probe_error macro for
logging ensures that the error is only printed if it wasn't explicitly
injected.

Link: https://gitlab.freedesktop.org/drm/intel/-/issues/7061
Signed-off-by: Daniele Ceraolo Spurio 
---
 drivers/gpu/drm/i915/pxp/intel_pxp_tee.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c b/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c
index 1ce07d7e8769..88215b5efe72 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp_tee.c
@@ -9,6 +9,7 @@
 #include 
 
 #include "gem/i915_gem_lmem.h"
+#include "gt/intel_gt_print.h"
 
 #include "i915_drv.h"
 
@@ -174,7 +175,7 @@ static int i915_pxp_tee_component_bind(struct device *i915_kdev,
/* load huc via pxp */
ret = intel_huc_fw_load_and_auth_via_gsc(&uc->huc);
if (ret < 0)
-   drm_err(&i915->drm, "failed to load huc via gsc %d\n", ret);
+   gt_probe_error(pxp->ctrl_gt, "failed to load huc via gsc %d\n", ret);
}
}
 
-- 
2.41.0
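
The idea of a probe-error logger that stays quiet for injected failures can
be modelled in userspace roughly as below. This is a sketch, not the i915
macro: the error_injected flag and both function names are hypothetical, and
it assumes that an injected failure is demoted to a debug-level message
rather than dropped.

```c
#include <stdbool.h>
#include <stdio.h>

enum log_level { LOG_DEBUG, LOG_ERROR };

/* Hypothetical flag standing in for "this failure was injected by a test". */
static bool error_injected;

/* Injected failures are demoted so that CI does not flag them as errors. */
static enum log_level probe_error_level(void)
{
	return error_injected ? LOG_DEBUG : LOG_ERROR;
}

/* Log the message at the chosen level and report which level was used. */
static enum log_level probe_error(const char *msg, int ret)
{
	enum log_level lvl = probe_error_level();

	fprintf(stderr, "%s: %s %d\n",
		lvl == LOG_ERROR ? "ERROR" : "debug", msg, ret);
	return lvl;
}
```

A genuine load failure still produces an error-level line, so only the
deliberately injected case changes behaviour.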



[PATCH v7] drm/i915/selftest/gsc: Ensure GSC Proxy init completes before selftests

2023-07-20 Thread Alan Previn
On MTL, if the GSC Proxy init flows haven't completed, submissions to the
GSC engine will fail. Those init flows are dependent on the mei's
gsc_proxy component that is loaded in parallel with i915 and a
worker that could potentially start after i915 driver init is done.

That said, all subsystems that access the GSC engine today do check
for such init flow completion before using the GSC engine. However,
selftests currently don't wait on anything before starting.

To fix this, add a waiter function at the start of __run_selftests
that waits for gsc-proxy init flows to complete. Selftests shouldn't
care if the proxy-init failed as that should be flagged elsewhere.

Difference from prior versions:
   v7: - Change the fw status to INTEL_UC_FIRMWARE_LOAD_FAIL if the
 proxy-init fails so that intel_gsc_uc_fw_proxy_get_status
 catches it. (Daniele)
   v6: - Add a helper that returns something more than a boolean
 so the selftest can stop waiting if proxy-init hadn't
 completed but failed (Daniele).
   v5: - Move the call to __wait_gsc_proxy_completed from common
 __run_selftests dispatcher to the group-level selftest
 function (Tvrtko).
   - change the pr_info to pr_warn if we hit the timeout.
   v4: - Remove generalized waiters function table framework (Tvrtko).
   - Remove mention of CI-framework-timeout from comments (Tvrtko).
   v3: - Rebase to latest drm-tip.
   v2: - Based on internal testing, increase the timeout for gsc-proxy
 specific case to 8 seconds.

Signed-off-by: Alan Previn 
---
 drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c | 14 +
 drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.h |  1 +
 drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.c | 13 +++-
 .../gpu/drm/i915/selftests/i915_selftest.c| 31 +++
 4 files changed, 58 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c 
b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
index ab1a456f833d..163021705210 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
@@ -45,6 +45,20 @@ bool intel_gsc_uc_fw_proxy_init_done(struct intel_gsc_uc 
*gsc, bool needs_wakere
   HECI1_FWSTS1_PROXY_STATE_NORMAL;
 }
 
+int intel_gsc_uc_fw_proxy_get_status(struct intel_gsc_uc *gsc)
+{
+   if (!(IS_ENABLED(CONFIG_INTEL_MEI_GSC_PROXY)))
+   return -ENODEV;
+   if (!intel_uc_fw_is_loadable(&gsc->fw))
+   return -ENODEV;
+   if (__intel_uc_fw_status(&gsc->fw) == INTEL_UC_FIRMWARE_LOAD_FAIL)
+   return -ENOLINK;
+   if (!intel_gsc_uc_fw_proxy_init_done(gsc, true))
+   return -EAGAIN;
+
+   return 0;
+}
+
 bool intel_gsc_uc_fw_init_done(struct intel_gsc_uc *gsc)
 {
return gsc_uc_get_fw_status(gsc_uc_to_gt(gsc)->uncore, false) &
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.h 
b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.h
index ad2167ce9137..bc9dd0de8aaf 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.h
@@ -16,5 +16,6 @@ int intel_gsc_fw_get_binary_info(struct intel_uc_fw *gsc_fw, 
const void *data, s
 int intel_gsc_uc_fw_upload(struct intel_gsc_uc *gsc);
 bool intel_gsc_uc_fw_init_done(struct intel_gsc_uc *gsc);
 bool intel_gsc_uc_fw_proxy_init_done(struct intel_gsc_uc *gsc, bool 
needs_wakeref);
+int intel_gsc_uc_fw_proxy_get_status(struct intel_gsc_uc *gsc);
 
 #endif
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.c
index 034b53a71541..0d3b22a74365 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.c
@@ -62,8 +62,18 @@ static void gsc_work(struct work_struct *work)
}
 
ret = intel_gsc_proxy_request_handler(gsc);
-   if (ret)
+   if (ret) {
+   if (actions & GSC_ACTION_FW_LOAD) {
+   /*
+* A proxy failure right after firmware load 
means the proxy-init
+* step has failed so mark GSC as not usable 
after this
+*/
+   drm_err(&gt->i915->drm,
+   "GSC proxy handler failed to init\n");
+   intel_uc_fw_change_status(&gsc->fw, 
INTEL_UC_FIRMWARE_LOAD_FAIL);
+   }
goto out_put;
+   }
 
/* mark the GSC FW init as done the first time we run this */
if (actions & GSC_ACTION_FW_LOAD) {
@@ -78,6 +88,7 @@ static void gsc_work(struct work_struct *work)
} else {
drm_err(&gt->i915->drm,
"GSC status reports proxy init not 
complete\n");
+   intel_uc_fw_change_status(&gsc->fw, 
INTEL_UC_FIRMWARE_LOA

Re: [PATCH v6] drm/i915/selftest/gsc: Ensure GSC Proxy init completes before selftests

2023-07-20 Thread Teres Alexis, Alan Previn
On Thu, 2023-07-20 at 14:52 -0700, Ceraolo Spurio, Daniele wrote:
> 
> On 7/20/2023 2:40 PM, Alan Previn wrote:
> > On MTL, if the GSC Proxy init flows haven't completed, submissions to the
> > GSC engine will fail. Those init flows are dependent on the mei's
> > gsc_proxy component that is loaded in parallel with i915 and a
> > worker that could potentially start after i915 driver init is done.
> > 
> > That said, all subsystems that access the GSC engine today do check
> > for such init flow completion before using the GSC engine. However,
> > selftests currently don't wait on anything before starting.
> > 
> > To fix this, add a waiter function at the start of __run_selftests
> > that waits for gsc-proxy init flows to complete. Selftests shouldn't
> > care if the proxy-init failed as that should be flagged elsewhere.
> > 
> > Difference from prior versions:
> > v6: - Add a helper that returns something more than a boolean
> >   so the selftest can stop waiting if proxy-init hadn't
> >   completed but failed (Daniele).

alan:snip
> 

> > +int intel_gsc_uc_fw_proxy_get_status(struct intel_gsc_uc *gsc)
> > +{
> > +   if (!(IS_ENABLED(CONFIG_INTEL_MEI_GSC_PROXY)))
> > +   return -ENODEV;
> > +   if (!intel_uc_fw_is_loadable(&gsc->fw))
> > +   return -ENODEV;
> > +   if (__intel_uc_fw_status(&gsc->fw) == INTEL_UC_FIRMWARE_LOAD_FAIL)
> 
> You're missing the change to move the FW status to LOAD_FAIL if the 
> proxy fails to initialize. Or are you expecting 
> https://patchwork.freedesktop.org/series/118723/, which included that 
> change, to be merged first?
> 
> Daniele

alan: as per our offline sync, I'll respin this one and move it away from the
other patch (since this is more critical) and we can respin the other after
this is done so we get a smooth merge. Also, as I move that "change fw status
to fail" from that PXP patch to this patch, I'll fix that issue where I missed
the 2nd failure point in the proxy init flow.

Thanks for your help. :)
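
For readers following the thread, the reason for a helper that returns more
than a boolean can be sketched in userspace C: the waiter keeps polling only
while the status is -EAGAIN and stops as soon as the proxy either succeeds or
fails. This is a simplified model under stated assumptions: struct proxy_model
and both functions are hypothetical, the poll count replaces the driver's
time-based wait_for(), and only the errno values come from the patch.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model: the proxy settles after a number of polls. */
struct proxy_model {
	int polls_until_settled;	/* polls that still report "in progress" */
	int settled_status;		/* 0, -ENOLINK or -ENODEV once settled */
};

/* Report -EAGAIN while init is still in progress, then the final status. */
static int proxy_get_status(struct proxy_model *p)
{
	if (p->polls_until_settled > 0) {
		p->polls_until_settled--;
		return -EAGAIN;
	}
	return p->settled_status;
}

/*
 * Poll until the status is anything other than -EAGAIN.
 * Returns true if max_polls ran out first (a timeout).
 */
static bool wait_proxy_settled(struct proxy_model *p, unsigned int max_polls)
{
	while (max_polls--) {
		if (proxy_get_status(p) != -EAGAIN)
			return false;
	}
	return true;
}
```

With a plain boolean "init done?" check, a failed proxy would look the same as
a pending one and the waiter would always burn the full timeout.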



Re: [PATCH v3] drm/i915/pxp: Optimize GET_PARAM:PXP_STATUS

2023-07-20 Thread Ceraolo Spurio, Daniele




On 6/29/2023 6:44 PM, Alan Previn wrote:

After recent discussions with Mesa folks, it was requested
that we optimize i915's GET_PARAM for the PXP_STATUS without
changing the UAPI spec.

Add these additional optimizations:
- If any PXP initialization flow failed, then ensure that
  we catch it so that we can change the returned PXP_STATUS
  from "2" (i.e. 'PXP is supported but not yet ready')
  to "-ENODEV". This typically should not happen and if it
  does, we have a platform configuration issue.
- If a PXP arbitration session creation event failed
  due to incorrect firmware version or blocking SOC fusing
  or blocking BIOS configuration (platform reasons that won't
  change if we retry), then reflect that blockage by also
  returning -ENODEV in the GET_PARAM:PXP_STATUS.
- GET_PARAM:PXP_STATUS should not wait at all if PXP is
  supported but non-i915 dependencies (component-driver /
  firmware) have yet to complete their init flows.
  In this case, just return "2" immediately (i.e. 'PXP is
  supported but not yet ready').
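
The three rules listed above can be condensed into a small decision function.
The sketch below is a userspace model, not the driver code: struct pxp_model
and its fields are hypothetical, and the return value 1 for "supported and
ready" is an assumption taken from the i915 UAPI description of PXP_STATUS,
since only "2" and -ENODEV are spelled out in this message.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical snapshot of the inputs; field names are illustrative only. */
struct pxp_model {
	bool supported;			/* PXP is enabled on this platform */
	bool required_fw_failed;	/* a firmware PXP depends on failed to init */
	bool platform_blocked;		/* fusing, BIOS or firmware-version blockage */
	bool dependencies_ready;	/* component driver / firmware flows finished */
};

/* Never waits: every answer is derived from state that is already known. */
static int pxp_readiness_status(const struct pxp_model *pxp)
{
	if (!pxp->supported)
		return -ENODEV;
	/* Failures that a retry cannot fix are reported as "no device". */
	if (pxp->required_fw_failed || pxp->platform_blocked)
		return -ENODEV;
	/* Supported, but dependencies outside i915 are still initializing. */
	if (!pxp->dependencies_ready)
		return 2;
	/* Assumed UAPI value for "supported and ready". */
	return 1;
}
```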

Difference from prior revs:
   v2: - Use a #define for the default readiness timeout (Vivaik).
   - Improve comments around the failing of proxy-init.
   v1: - Change the commit msg style to be imperative. (Jani)
   - Rename timeout to timeout_ms. (Jani)
   - Fix is_fw_err_platform_config to use higher order
 param (pxp) first. (Jani)

Signed-off-by: Alan Previn 
---
  drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.c  | 10 +-
  drivers/gpu/drm/i915/i915_getparam.c   |  2 +-
  drivers/gpu/drm/i915/pxp/intel_pxp.c   | 40 ++
  drivers/gpu/drm/i915/pxp/intel_pxp.h   |  2 +-
  drivers/gpu/drm/i915/pxp/intel_pxp_gsccs.c |  7 ++--
  drivers/gpu/drm/i915/pxp/intel_pxp_tee.c   |  7 ++--
  drivers/gpu/drm/i915/pxp/intel_pxp_types.h |  9 +
  7 files changed, 61 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.c 
b/drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.c
index 034b53a71541..21c2b7cce335 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.c
@@ -62,8 +62,16 @@ static void gsc_work(struct work_struct *work)
}
  
  		ret = intel_gsc_proxy_request_handler(gsc);

-   if (ret)
+   if (ret) {
+   if (actions & GSC_ACTION_FW_LOAD) {
+   /*
+* A failure right after firmware load means 
the proxy-init
+* step has failed so mark GSC as not usable 
after this
+*/
+   intel_uc_fw_change_status(&gsc->fw, 
INTEL_UC_FIRMWARE_LOAD_FAIL);


Note that proxy init can also fail below if the init_done check returns 
false, so the status needs to be changed in both cases.


Daniele


+   }
goto out_put;
+   }
  
  		/* mark the GSC FW init as done the first time we run this */

if (actions & GSC_ACTION_FW_LOAD) {
diff --git a/drivers/gpu/drm/i915/i915_getparam.c 
b/drivers/gpu/drm/i915/i915_getparam.c
index 890f2b382bee..5c3fec63cb4c 100644
--- a/drivers/gpu/drm/i915/i915_getparam.c
+++ b/drivers/gpu/drm/i915/i915_getparam.c
@@ -109,7 +109,7 @@ int i915_getparam_ioctl(struct drm_device *dev, void *data,
return value;
break;
case I915_PARAM_PXP_STATUS:
-   value = intel_pxp_get_readiness_status(i915->pxp);
+   value = intel_pxp_get_readiness_status(i915->pxp, 0);
if (value < 0)
return value;
break;
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp.c 
b/drivers/gpu/drm/i915/pxp/intel_pxp.c
index bb2e15329f34..e3b47525dc60 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp.c
+++ b/drivers/gpu/drm/i915/pxp/intel_pxp.c
@@ -359,22 +359,46 @@ void intel_pxp_end(struct intel_pxp *pxp)
intel_runtime_pm_put(&i915->runtime_pm, wakeref);
  }
  
+static bool pxp_required_fw_failed(struct intel_pxp *pxp)

+{
+   if (__intel_uc_fw_status(&pxp->ctrl_gt->uc.huc.fw) == 
INTEL_UC_FIRMWARE_LOAD_FAIL)
+   return true;
+   if (HAS_ENGINE(pxp->ctrl_gt, GSC0) &&
+   __intel_uc_fw_status(&pxp->ctrl_gt->uc.gsc.fw) == 
INTEL_UC_FIRMWARE_LOAD_FAIL)
+   return true;
+
+   return false;
+}
+
+static bool pxp_fw_dependencies_completed(struct intel_pxp *pxp)
+{
+   if (HAS_ENGINE(pxp->ctrl_gt, GSC0))
+   return intel_pxp_gsccs_is_ready_for_sessions(pxp);
+
+   return pxp_component_bound(pxp);
+}
+
  /*
   * this helper is used by both intel_pxp_start and by
   * the GET_PARAM IOCTL that user space calls. Thus, the
   * return values here should match the UAPI spec.
   */
-int intel_pxp_get_readiness_status(struct intel_pxp *pxp)
+int intel_pxp_get_readiness_status(struct in

Re: [PATCH v1 5/5] drm/msm/dp: move of_dp_aux_populate_bus() to probe for eDP

2023-07-20 Thread Dmitry Baryshkov

On 20/07/2023 23:27, Kuogee Hsieh wrote:


On 7/10/2023 11:24 AM, Dmitry Baryshkov wrote:

[Restored CC list]

On Mon, 10 Jul 2023 at 20:08, Kuogee Hsieh  
wrote:


On 7/7/2023 5:32 PM, Dmitry Baryshkov wrote:

On 08/07/2023 02:52, Kuogee Hsieh wrote:

Move of_dp_aux_populate_bus() to dp_display_probe() for eDP
from dp_display_bind() so that probe deferral cases can be
handled effectively

Signed-off-by: Kuogee Hsieh 
---
   drivers/gpu/drm/msm/dp/dp_aux.c | 25 
   drivers/gpu/drm/msm/dp/dp_display.c | 79
+++--
   2 files changed, 65 insertions(+), 39 deletions(-)

diff --git a/drivers/gpu/drm/msm/dp/dp_aux.c
b/drivers/gpu/drm/msm/dp/dp_aux.c
index c592064..c1baffb 100644
--- a/drivers/gpu/drm/msm/dp/dp_aux.c
+++ b/drivers/gpu/drm/msm/dp/dp_aux.c
@@ -505,6 +505,21 @@ void dp_aux_unregister(struct drm_dp_aux *dp_aux)
   drm_dp_aux_unregister(dp_aux);
   }
   +static int dp_wait_hpd_asserted(struct drm_dp_aux *dp_aux,
+ unsigned long wait_us)
+{
+    int ret;
+    struct dp_aux_private *aux;
+
+    aux = container_of(dp_aux, struct dp_aux_private, dp_aux);
+
+    pm_runtime_get_sync(aux->dev);
+    ret = dp_catalog_aux_wait_for_hpd_connect_state(aux->catalog);
+    pm_runtime_put_sync(aux->dev);
+
+    return ret;
+}
+
   struct drm_dp_aux *dp_aux_get(struct device *dev, struct dp_catalog
*catalog,
 bool is_edp)
   {
@@ -528,6 +543,16 @@ struct drm_dp_aux *dp_aux_get(struct device
*dev, struct dp_catalog *catalog,
   aux->catalog = catalog;
   aux->retry_cnt = 0;
   +    /*
+ * Use the drm_dp_aux_init() to use the aux adapter
+ * before registering aux with the DRM device.
+ */
+    aux->dp_aux.name = "dpu_dp_aux";
+    aux->dp_aux.dev = dev;
+    aux->dp_aux.transfer = dp_aux_transfer;
+    aux->dp_aux.wait_hpd_asserted = dp_wait_hpd_asserted;
+    drm_dp_aux_init(&aux->dp_aux);
+
   return &aux->dp_aux;
   }
   diff --git a/drivers/gpu/drm/msm/dp/dp_display.c
b/drivers/gpu/drm/msm/dp/dp_display.c
index 185f1eb..7ed4bea 100644
--- a/drivers/gpu/drm/msm/dp/dp_display.c
+++ b/drivers/gpu/drm/msm/dp/dp_display.c
@@ -302,10 +302,6 @@ static int dp_display_bind(struct device *dev,
struct device *master,
   goto end;
   }
   -    pm_runtime_enable(dev);
-    pm_runtime_set_autosuspend_delay(dev, 1000);
-    pm_runtime_use_autosuspend(dev);
-
   return 0;
   end:
   return rc;
@@ -322,8 +318,6 @@ static void dp_display_unbind(struct device *dev,
struct device *master,
 kthread_stop(dp->ev_tsk);
   -    of_dp_aux_depopulate_bus(dp->aux);
-
   dp_power_client_deinit(dp->power);
   dp_unregister_audio_driver(dev, dp->audio);
   dp_aux_unregister(dp->aux);
@@ -1245,6 +1239,29 @@ static const struct msm_dp_desc
*dp_display_get_desc(struct platform_device *pde
   return NULL;
   }
   +static void of_dp_aux_depopulate_bus_void(void *data)
+{
+    of_dp_aux_depopulate_bus(data);
+}
+
+static int dp_display_auxbus_emulation(struct dp_display_private *dp)

Why is it called emulation?


+{
+    struct device *dev = &dp->pdev->dev;
+    struct device_node *aux_bus;
+    int ret = 0;
+
+    aux_bus = of_get_child_by_name(dev->of_node, "aux-bus");
+
+    if (aux_bus) {
+    ret = devm_of_dp_aux_populate_bus(dp->aux, NULL);

And here you missed the whole point of what we have been asking for.
Please add a sensible `done_probing' callback, which will call
component_add(). This way the DP component will only be registered
when the panel has been probed. Keeping us from the component binding
retries and corresponding side effects.
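
The ordering requested in the comment above can be illustrated with a small
userspace model of the callback pattern. It is only a sketch: struct aux_model,
populate_bus(), panel_probed() and dp_auxbus_done_probe() are hypothetical
stand-ins, and the assignment inside the callback marks the spot where the
driver would call component_add(). Nothing here is the actual msm/dp code.

```c
#include <stdbool.h>
#include <stddef.h>

struct aux_model;
typedef int (*done_probing_fn)(struct aux_model *aux);

/* Hypothetical model of an aux bus with one panel behind it. */
struct aux_model {
	bool component_registered;	/* set once the DP component is added */
	done_probing_fn done_probing;	/* invoked after the panel has probed */
};

/* Callback: register the component only once the panel is known to exist. */
static int dp_auxbus_done_probe(struct aux_model *aux)
{
	aux->component_registered = true;	/* stands in for component_add() */
	return 0;
}

/* Stand-in for populating the bus: it only records the callback. */
static int populate_bus(struct aux_model *aux, done_probing_fn cb)
{
	aux->done_probing = cb;
	return 0;
}

/* Stand-in for the panel driver finishing its probe at some later time. */
static int panel_probed(struct aux_model *aux)
{
	return aux->done_probing ? aux->done_probing(aux) : 0;
}
```

The point of the design is that registration is deferred until the callback
runs, so the component never binds while the panel is still missing.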


+
+    devm_add_action_or_reset(dev, of_dp_aux_depopulate_bus_void,
+ dp->aux);

Useless, it's already handled by the devm_ part of the
devm_of_dp_aux_populate_bus().


+    }
+
+    return ret;
+}
+
   static int dp_display_probe(struct platform_device *pdev)
   {
   int rc = 0;
@@ -1290,8 +1307,18 @@ static int dp_display_probe(struct
platform_device *pdev)
 platform_set_drvdata(pdev, &dp->dp_display);
   +    pm_runtime_enable(&pdev->dev);
+    pm_runtime_set_autosuspend_delay(&pdev->dev, 1000);
+    pm_runtime_use_autosuspend(&pdev->dev);

Can we have this in probe right from the patch #2?

No, at patch #2, devm_of_dp_aux_populate_bus() is done at bind time.

The device used by pm_runtime_get_sync() of generic_edp_panel_probe(),
which is derived from devm_of_dp_aux_populate_bus(), is different from the
&pdev->dev here.

Excuse me, I don't get your answer. In patch #2 you have added
pm_runtime_enable() / etc to dp_display_bind().
In this patch you are moving these calls to dp_display_probe(). I
think that the latter is a better place for enabling runtime PM and as
such I've asked you to squash this chunk into patch #2.
Why isn't that going to work?

If I'm not mistaken here, the panel's call to pm_runtime_get_sync()
will wake up the panel and all the parent devices, including the DP.
That's what I meant in my comm

Re: [PATCH drm-misc-next v8 01/12] drm: manager to keep track of GPUs VA mappings

2023-07-20 Thread Danilo Krummrich

On 7/20/23 12:44, Steven Price wrote:

On 20/07/2023 01:14, Danilo Krummrich wrote:

Add infrastructure to keep track of GPU virtual address (VA) mappings
with a dedicated VA space manager implementation.

New UAPIs, motivated by Vulkan sparse memory bindings graphics drivers
start implementing, allow userspace applications to request multiple and
arbitrary GPU VA mappings of buffer objects. The DRM GPU VA manager is
intended to serve the following purposes in this context.

1) Provide infrastructure to track GPU VA allocations and mappings,
making use of an interval tree (RB-tree).

2) Generically connect GPU VA mappings to their backing buffers, in
particular DRM GEM objects.

3) Provide a common implementation to perform more complex mapping
operations on the GPU VA space. In particular splitting and merging
of GPU VA mappings, e.g. for intersecting mapping requests or partial
unmap requests.

Acked-by: Thomas Hellström 
Acked-by: Matthew Brost 
Reviewed-by: Boris Brezillon 
Tested-by: Matthew Brost 
Tested-by: Donald Robson 
Suggested-by: Dave Airlie 
Signed-off-by: Danilo Krummrich 


[...]


diff --git a/drivers/gpu/drm/drm_gpuva_mgr.c b/drivers/gpu/drm/drm_gpuva_mgr.c
new file mode 100644
index ..dee2235530d6
--- /dev/null
+++ b/drivers/gpu/drm/drm_gpuva_mgr.c


[...]


+static bool
+drm_gpuva_check_overflow(u64 addr, u64 range)
+{
+   u64 end;
+
+   return WARN(check_add_overflow(addr, range, &end),
+   "GPUVA address limited to %lu bytes.\n", sizeof(end));
+}


This produces a warning on 32 bit systems as sizeof() isn't necessarily
an unsigned long. The fix below silences the warning.


Thank you for fixing this up! Applied both of your patches to drm-misc-next.

- Danilo



Thanks,

Steve

---8<-
 From 9c7356580362b6ac4673724f18ea6e8453b52913 Mon Sep 17 00:00:00 2001
From: Steven Price 
Date: Thu, 20 Jul 2023 10:58:09 +0100
Subject: [PATCH] drm: manager: Fix printk format for size_t

sizeof() returns a size_t which may be different to an unsigned long.
Use the correct format specifier of '%zu' to prevent compiler warnings.

Fixes: e6303f323b1a ("drm: manager to keep track of GPUs VA mappings")
Signed-off-by: Steven Price 
---
  drivers/gpu/drm/drm_gpuva_mgr.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/drm_gpuva_mgr.c b/drivers/gpu/drm/drm_gpuva_mgr.c
index 0b80177592a6..f86bfad74ff8 100644
--- a/drivers/gpu/drm/drm_gpuva_mgr.c
+++ b/drivers/gpu/drm/drm_gpuva_mgr.c
@@ -619,7 +619,7 @@ drm_gpuva_check_overflow(u64 addr, u64 range)
u64 end;
  
  	return WARN(check_add_overflow(addr, range, &end),

-   "GPUVA address limited to %lu bytes.\n", sizeof(end));
+   "GPUVA address limited to %zu bytes.\n", sizeof(end));
  }
  
  static bool
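
The same fix can be reproduced outside the kernel. The sketch below is a
userspace approximation under stated assumptions: va_range_overflows() is a
hypothetical name, __builtin_add_overflow (a GCC/Clang builtin) stands in for
the kernel's check_add_overflow(), and snprintf() stands in for WARN(). The
part that carries over directly is the use of %zu for a sizeof() result.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Return true when addr + range wraps past 64 bits, and write the
 * diagnostic into buf. sizeof() yields a size_t, so %zu is the
 * matching conversion on both 32-bit and 64-bit targets.
 */
static bool va_range_overflows(uint64_t addr, uint64_t range,
			       char *buf, size_t len)
{
	uint64_t end;

	if (!__builtin_add_overflow(addr, range, &end))
		return false;

	snprintf(buf, len, "GPUVA address limited to %zu bytes.\n",
		 sizeof(end));
	return true;
}
```

With %lu the format and argument types agree only on targets where size_t
happens to be unsigned long, which is why the warning showed up on 32-bit
builds.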




Re: [PATCH v5 05/11] drm/amdgpu: Use RMW accessors for changing LNKCTL

2023-07-20 Thread Bjorn Helgaas
On Mon, Jul 17, 2023 at 03:04:57PM +0300, Ilpo Järvinen wrote:
> Don't assume that only the driver would be accessing LNKCTL. ASPM
> policy changes can trigger write to LNKCTL outside of driver's control.
> And in the case of upstream bridge, the driver does not even own the
> device it's changing the registers for.
> 
> Use RMW capability accessors which do proper locking to avoid losing
> concurrent updates to the register value.
> 
> Fixes: a2e73f56fa62 ("drm/amdgpu: Add support for CIK parts")
> Fixes: 62a37553414a ("drm/amdgpu: add si implementation v10")
> Suggested-by: Lukas Wunner 
> Signed-off-by: Ilpo Järvinen 
> Cc: sta...@vger.kernel.org

Do we have any reports of problems that are fixed by this patch (or by
others in the series)?  If not, I'm not sure it really fits the usual
stable kernel criteria:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/stable-kernel-rules.rst?id=v6.4

> ---
>  drivers/gpu/drm/amd/amdgpu/cik.c | 36 +---
>  drivers/gpu/drm/amd/amdgpu/si.c  | 36 +---
>  2 files changed, 20 insertions(+), 52 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/cik.c 
> b/drivers/gpu/drm/amd/amdgpu/cik.c
> index 5641cf05d856..e63abdf52b6c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/cik.c
> +++ b/drivers/gpu/drm/amd/amdgpu/cik.c
> @@ -1574,17 +1574,8 @@ static void cik_pcie_gen3_enable(struct amdgpu_device 
> *adev)
>   u16 bridge_cfg2, gpu_cfg2;
>   u32 max_lw, current_lw, tmp;
>  
> - pcie_capability_read_word(root, PCI_EXP_LNKCTL,
> -   &bridge_cfg);
> - pcie_capability_read_word(adev->pdev, PCI_EXP_LNKCTL,
> -   &gpu_cfg);
> -
> - tmp16 = bridge_cfg | PCI_EXP_LNKCTL_HAWD;
> - pcie_capability_write_word(root, PCI_EXP_LNKCTL, tmp16);
> -
> - tmp16 = gpu_cfg | PCI_EXP_LNKCTL_HAWD;
> - pcie_capability_write_word(adev->pdev, PCI_EXP_LNKCTL,
> -tmp16);
> + pcie_capability_set_word(root, PCI_EXP_LNKCTL, 
> PCI_EXP_LNKCTL_HAWD);
> + pcie_capability_set_word(adev->pdev, PCI_EXP_LNKCTL, 
> PCI_EXP_LNKCTL_HAWD);
>  
>   tmp = RREG32_PCIE(ixPCIE_LC_STATUS1);
>   max_lw = (tmp & 
> PCIE_LC_STATUS1__LC_DETECTED_LINK_WIDTH_MASK) >>
> @@ -1637,21 +1628,14 @@ static void cik_pcie_gen3_enable(struct amdgpu_device 
> *adev)
>   msleep(100);
>  
>   /* linkctl */
> - pcie_capability_read_word(root, PCI_EXP_LNKCTL,
> -   &tmp16);
> - tmp16 &= ~PCI_EXP_LNKCTL_HAWD;
> - tmp16 |= (bridge_cfg & PCI_EXP_LNKCTL_HAWD);
> - pcie_capability_write_word(root, PCI_EXP_LNKCTL,
> -tmp16);
> -
> - pcie_capability_read_word(adev->pdev,
> -   PCI_EXP_LNKCTL,
> -   &tmp16);
> - tmp16 &= ~PCI_EXP_LNKCTL_HAWD;
> - tmp16 |= (gpu_cfg & PCI_EXP_LNKCTL_HAWD);
> - pcie_capability_write_word(adev->pdev,
> -PCI_EXP_LNKCTL,
> -tmp16);
> + pcie_capability_clear_and_set_word(root, 
> PCI_EXP_LNKCTL,
> +
> PCI_EXP_LNKCTL_HAWD,
> +bridge_cfg &
> +
> PCI_EXP_LNKCTL_HAWD);
> + pcie_capability_clear_and_set_word(adev->pdev, 
> PCI_EXP_LNKCTL,
> +
> PCI_EXP_LNKCTL_HAWD,
> +gpu_cfg &
> +
> PCI_EXP_LNKCTL_HAWD);

Wow, there's a lot of pointless-looking work going on here:

  set root PCI_EXP_LNKCTL_HAWD
  set GPU  PCI_EXP_LNKCTL_HAWD

  for (i = 0; i < 10; i++) {
read root PCI_EXP_LNKCTL
read GPU  PCI_EXP_LNKCTL

clear root PCI_EXP_LNKCTL_HAWD
if (root PCI_EXP_LNKCTL_HAWD was set)
  set root PCI_EXP_LNKCTL_HAWD

clear GPU  PCI_EXP_LNKCTL_HAWD
if (GPU  PCI_EXP_LNKCTL_HAWD was set)
  set GPU  PCI_EXP_LNKCTL_HAWD
  }

If it really *is* pointless, it would be nice to clean it up, but that
wouldn't be material for this 
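
As background for the thread, the difference between an open-coded
read/modify/write and a locked accessor can be sketched in userspace C. This
is an illustration only: struct cfg_reg, the spin lock built on atomic_flag
and the reg_* helpers are hypothetical, the two bit masks are assumed values
for the sketch, and the code does not reproduce the PCI core implementation
behind pcie_capability_clear_and_set_word().

```c
#include <stdatomic.h>
#include <stdint.h>

#define LNKCTL_ASPM_MASK 0x0003	/* assumed: bits owned by ASPM policy code */
#define LNKCTL_HAWD      0x0200	/* assumed: bit toggled by the driver */

/* Hypothetical shared register guarded by a tiny spin lock. */
struct cfg_reg {
	atomic_flag lock;
	uint16_t value;
};

static void reg_lock(struct cfg_reg *reg)
{
	while (atomic_flag_test_and_set_explicit(&reg->lock,
						 memory_order_acquire))
		;	/* spin until the holder releases the lock */
}

static void reg_unlock(struct cfg_reg *reg)
{
	atomic_flag_clear_explicit(&reg->lock, memory_order_release);
}

/* Read, modify and write back as one critical section. */
static void reg_clear_and_set(struct cfg_reg *reg, uint16_t clear,
			      uint16_t set)
{
	reg_lock(reg);
	reg->value = (uint16_t)((reg->value & ~clear) | set);
	reg_unlock(reg);
}

static void reg_set(struct cfg_reg *reg, uint16_t set)
{
	reg_clear_and_set(reg, 0, set);
}
```

Because the whole update happens under the lock, a concurrent writer that
changes the ASPM bits cannot have its update overwritten by a stale value that
the driver read earlier.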

Re: [PATCH v6] drm/i915/selftest/gsc: Ensure GSC Proxy init completes before selftests

2023-07-20 Thread Ceraolo Spurio, Daniele




On 7/20/2023 2:40 PM, Alan Previn wrote:

On MTL, if the GSC Proxy init flows haven't completed, submissions to the
GSC engine will fail. Those init flows are dependent on the mei's
gsc_proxy component that is loaded in parallel with i915 and a
worker that could potentially start after i915 driver init is done.

That said, all subsystems that access the GSC engine today do check
for such init flow completion before using the GSC engine. However,
selftests currently don't wait on anything before starting.

To fix this, add a waiter function at the start of __run_selftests
that waits for gsc-proxy init flows to complete. Selftests shouldn't
care if the proxy-init failed as that should be flagged elsewhere.

Difference from prior versions:
v6: - Add a helper that returns something more than a boolean
  so the selftest can stop waiting if proxy-init hadn't
  completed but failed (Daniele).
v5: - Move the call to __wait_gsc_proxy_completed from common
  __run_selftests dispatcher to the group-level selftest
  function (Tvrtko).
- change the pr_info to pr_warn if we hit the timeout.
v4: - Remove generalized waiters function table framework (Tvrtko).
- Remove mention of CI-framework-timeout from comments (Tvrtko).
v3: - Rebase to latest drm-tip.
v2: - Based on internal testing, increase the timeout for gsc-proxy
  specific case to 8 seconds.

Signed-off-by: Alan Previn 
---
  drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c | 14 +
  drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.h |  1 +
  .../gpu/drm/i915/selftests/i915_selftest.c| 31 +++
  3 files changed, 46 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c 
b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
index ab1a456f833d..163021705210 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
@@ -45,6 +45,20 @@ bool intel_gsc_uc_fw_proxy_init_done(struct intel_gsc_uc 
*gsc, bool needs_wakere
   HECI1_FWSTS1_PROXY_STATE_NORMAL;
  }
  
+int intel_gsc_uc_fw_proxy_get_status(struct intel_gsc_uc *gsc)

+{
+   if (!(IS_ENABLED(CONFIG_INTEL_MEI_GSC_PROXY)))
+   return -ENODEV;
+   if (!intel_uc_fw_is_loadable(&gsc->fw))
+   return -ENODEV;
+   if (__intel_uc_fw_status(&gsc->fw) == INTEL_UC_FIRMWARE_LOAD_FAIL)


You're missing the change to move the FW status to LOAD_FAIL if the 
proxy fails to initialize. Or are you expecting 
https://patchwork.freedesktop.org/series/118723/, which included that 
change, to be merged first?


Daniele


+   return -ENOLINK;
+   if (!intel_gsc_uc_fw_proxy_init_done(gsc, true))
+   return -EAGAIN;
+
+   return 0;
+}
+
  bool intel_gsc_uc_fw_init_done(struct intel_gsc_uc *gsc)
  {
return gsc_uc_get_fw_status(gsc_uc_to_gt(gsc)->uncore, false) &
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.h 
b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.h
index ad2167ce9137..bc9dd0de8aaf 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.h
@@ -16,5 +16,6 @@ int intel_gsc_fw_get_binary_info(struct intel_uc_fw *gsc_fw, 
const void *data, s
  int intel_gsc_uc_fw_upload(struct intel_gsc_uc *gsc);
  bool intel_gsc_uc_fw_init_done(struct intel_gsc_uc *gsc);
  bool intel_gsc_uc_fw_proxy_init_done(struct intel_gsc_uc *gsc, bool 
needs_wakeref);
+int intel_gsc_uc_fw_proxy_get_status(struct intel_gsc_uc *gsc);
  
  #endif

diff --git a/drivers/gpu/drm/i915/selftests/i915_selftest.c 
b/drivers/gpu/drm/i915/selftests/i915_selftest.c
index 39da0fb0d6d2..ee79e0809a6d 100644
--- a/drivers/gpu/drm/i915/selftests/i915_selftest.c
+++ b/drivers/gpu/drm/i915/selftests/i915_selftest.c
@@ -24,6 +24,8 @@
  #include 
  
  #include "gt/intel_gt_pm.h"

+#include "gt/uc/intel_gsc_fw.h"
+
  #include "i915_driver.h"
  #include "i915_drv.h"
  #include "i915_selftest.h"
@@ -127,6 +129,31 @@ static void set_default_test_all(struct selftest *st, 
unsigned int count)
st[i].enabled = true;
  }
  
+static bool

+__gsc_proxy_init_progressing(struct intel_gsc_uc *gsc)
+{
+   return intel_gsc_uc_fw_proxy_get_status(gsc) == -EAGAIN;
+}
+
+static void
+__wait_gsc_proxy_completed(struct drm_i915_private *i915)
+{
+   bool need_to_wait = (IS_ENABLED(CONFIG_INTEL_MEI_GSC_PROXY) &&
+i915->media_gt &&
+HAS_ENGINE(i915->media_gt, GSC0) &&
+
intel_uc_fw_is_loadable(&i915->media_gt->uc.gsc.fw));
+   /*
+* The gsc proxy component depends on the kernel component driver load 
ordering
+* and in corner cases (the first time after an IFWI flash), 
init-completion
+* firmware flows take longer.
+*/
+   unsigned long timeout_ms = 8000;
+
+   if (need_to_wait && 
wait_for(!__gsc_proxy_init_progressing(&i915->media_gt->uc.gsc),
+ 

[PATCH v6] drm/i915/selftest/gsc: Ensure GSC Proxy init completes before selftests

2023-07-20 Thread Alan Previn
On MTL, if the GSC Proxy init flows haven't completed, submissions to the
GSC engine will fail. Those init flows are dependent on the mei's
gsc_proxy component that is loaded in parallel with i915 and a
worker that could potentially start after i915 driver init is done.

That said, all subsystems that access the GSC engine today do check
for such init flow completion before using the GSC engine. However,
selftests currently don't wait on anything before starting.

To fix this, add a waiter function at the start of __run_selftests
that waits for gsc-proxy init flows to complete. Selftests shouldn't
care if the proxy-init failed as that should be flagged elsewhere.

Difference from prior versions:
   v6: - Add a helper that returns something more than a boolean
 so the selftest can stop waiting if proxy-init hadn't
 completed but failed (Daniele).
   v5: - Move the call to __wait_gsc_proxy_completed from common
 __run_selftests dispatcher to the group-level selftest
 function (Tvrtko).
   - change the pr_info to pr_warn if we hit the timeout.
   v4: - Remove generalized waiters function table framework (Tvrtko).
   - Remove mention of CI-framework-timeout from comments (Tvrtko).
   v3: - Rebase to latest drm-tip.
   v2: - Based on internal testing, increase the timeout for gsc-proxy
 specific case to 8 seconds.

Signed-off-by: Alan Previn 
---
 drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c | 14 +
 drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.h |  1 +
 .../gpu/drm/i915/selftests/i915_selftest.c| 31 +++
 3 files changed, 46 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c 
b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
index ab1a456f833d..163021705210 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
@@ -45,6 +45,20 @@ bool intel_gsc_uc_fw_proxy_init_done(struct intel_gsc_uc 
*gsc, bool needs_wakere
   HECI1_FWSTS1_PROXY_STATE_NORMAL;
 }
 
+int intel_gsc_uc_fw_proxy_get_status(struct intel_gsc_uc *gsc)
+{
+   if (!(IS_ENABLED(CONFIG_INTEL_MEI_GSC_PROXY)))
+   return -ENODEV;
+   if (!intel_uc_fw_is_loadable(&gsc->fw))
+   return -ENODEV;
+   if (__intel_uc_fw_status(&gsc->fw) == INTEL_UC_FIRMWARE_LOAD_FAIL)
+   return -ENOLINK;
+   if (!intel_gsc_uc_fw_proxy_init_done(gsc, true))
+   return -EAGAIN;
+
+   return 0;
+}
+
 bool intel_gsc_uc_fw_init_done(struct intel_gsc_uc *gsc)
 {
return gsc_uc_get_fw_status(gsc_uc_to_gt(gsc)->uncore, false) &
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.h 
b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.h
index ad2167ce9137..bc9dd0de8aaf 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.h
@@ -16,5 +16,6 @@ int intel_gsc_fw_get_binary_info(struct intel_uc_fw *gsc_fw, 
const void *data, s
 int intel_gsc_uc_fw_upload(struct intel_gsc_uc *gsc);
 bool intel_gsc_uc_fw_init_done(struct intel_gsc_uc *gsc);
 bool intel_gsc_uc_fw_proxy_init_done(struct intel_gsc_uc *gsc, bool 
needs_wakeref);
+int intel_gsc_uc_fw_proxy_get_status(struct intel_gsc_uc *gsc);
 
 #endif
diff --git a/drivers/gpu/drm/i915/selftests/i915_selftest.c 
b/drivers/gpu/drm/i915/selftests/i915_selftest.c
index 39da0fb0d6d2..ee79e0809a6d 100644
--- a/drivers/gpu/drm/i915/selftests/i915_selftest.c
+++ b/drivers/gpu/drm/i915/selftests/i915_selftest.c
@@ -24,6 +24,8 @@
 #include 
 
 #include "gt/intel_gt_pm.h"
+#include "gt/uc/intel_gsc_fw.h"
+
 #include "i915_driver.h"
 #include "i915_drv.h"
 #include "i915_selftest.h"
@@ -127,6 +129,31 @@ static void set_default_test_all(struct selftest *st, 
unsigned int count)
st[i].enabled = true;
 }
 
+static bool
+__gsc_proxy_init_progressing(struct intel_gsc_uc *gsc)
+{
+   return intel_gsc_uc_fw_proxy_get_status(gsc) == -EAGAIN;
+}
+
+static void
+__wait_gsc_proxy_completed(struct drm_i915_private *i915)
+{
+   bool need_to_wait = (IS_ENABLED(CONFIG_INTEL_MEI_GSC_PROXY) &&
+i915->media_gt &&
+HAS_ENGINE(i915->media_gt, GSC0) &&
+
intel_uc_fw_is_loadable(&i915->media_gt->uc.gsc.fw));
+   /*
+* The gsc proxy component depends on the kernel component driver load 
ordering
+* and in corner cases (the first time after an IFWI flash), 
init-completion
+* firmware flows take longer.
+*/
+   unsigned long timeout_ms = 8000;
+
+   if (need_to_wait && 
wait_for(!__gsc_proxy_init_progressing(&i915->media_gt->uc.gsc),
+timeout_ms))
+   pr_warn(DRIVER_NAME "Timed out waiting for 
gsc_proxy_completion!\n");
+}
+
 static int __run_selftests(const char *name,
   struct selftest *st,
   unsigned int count,
@@ -206,6 +233,8 @@ int i915_live_selftests(struct pci_dev *pdev)

[PATCH v7 9/9] drm/i915/gt: Support aux invalidation on all engines

2023-07-20 Thread Andi Shyti
Perform some refactoring with the purpose of keeping in one
single place all the operations around the aux table
invalidation.

With this refactoring add more engines where the invalidation
should be performed.

Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
Signed-off-by: Andi Shyti 
Cc: Jonathan Cavitt 
Cc: Matt Roper 
Cc:  # v5.8+
---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 58 +++-
 drivers/gpu/drm/i915/gt/gen8_engine_cs.h |  3 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c  | 17 +--
 3 files changed, 41 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c 
b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 3ded597f002a2..30fb4e0af6134 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -165,9 +165,36 @@ static u32 preparser_disable(bool state)
return MI_ARB_CHECK | 1 << 8 | state;
 }
 
-u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 *cs, const i915_reg_t inv_reg)
+static i915_reg_t gen12_get_aux_inv_reg(struct intel_engine_cs *engine)
 {
-   u32 gsi_offset = gt->uncore->gsi_offset;
+   if (!HAS_AUX_CCS(engine->i915))
+   return INVALID_MMIO_REG;
+
+   switch (engine->id) {
+   case RCS0:
+   return GEN12_CCS_AUX_INV;
+   case BCS0:
+   return GEN12_BCS0_AUX_INV;
+   case VCS0:
+   return GEN12_VD0_AUX_INV;
+   case VCS2:
+   return GEN12_VD2_AUX_INV;
+   case VECS0:
+   return GEN12_VE0_AUX_INV;
+   case CCS0:
+   return GEN12_CCS0_AUX_INV;
+   default:
+   return INVALID_MMIO_REG;
+   }
+}
+
+u32 *gen12_emit_aux_table_inv(struct intel_engine_cs *engine, u32 *cs)
+{
+   i915_reg_t inv_reg = gen12_get_aux_inv_reg(engine);
+   u32 gsi_offset = engine->gt->uncore->gsi_offset;
+
+   if (!i915_mmio_reg_valid(inv_reg))
+   return cs;
 
*cs++ = MI_LOAD_REGISTER_IMM(1) | MI_LRI_MMIO_REMAP_EN;
*cs++ = i915_mmio_reg_offset(inv_reg) + gsi_offset;
@@ -201,6 +228,11 @@ static u32 *intel_emit_pipe_control_cs(struct i915_request 
*rq, u32 bit_group_0,
return cs;
 }
 
+static bool gen12_engine_has_aux_inv(struct intel_engine_cs *engine)
+{
+   return i915_mmio_reg_valid(gen12_get_aux_inv_reg(engine));
+}
+
 static int mtl_dummy_pipe_control(struct i915_request *rq)
 {
/* Wa_14016712196 */
@@ -307,11 +339,7 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
 
cs = gen8_emit_pipe_control(cs, flags, LRC_PPHWSP_SCRATCH_ADDR);
 
-   if (!HAS_FLAT_CCS(rq->engine->i915)) {
-   /* hsdes: 1809175790 */
-   cs = gen12_emit_aux_table_inv(rq->engine->gt, cs,
- GEN12_CCS_AUX_INV);
-   }
+   cs = gen12_emit_aux_table_inv(engine, cs);
 
*cs++ = preparser_disable(false);
intel_ring_advance(rq, cs);
@@ -322,7 +350,6 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
 
 int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
 {
-   intel_engine_mask_t aux_inv = 0;
u32 cmd_flush = 0;
u32 cmd = 4;
u32 *cs;
@@ -330,15 +357,11 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 
mode)
if (mode & EMIT_INVALIDATE)
cmd += 2;
 
-   if (HAS_AUX_CCS(rq->engine->i915))
-   aux_inv = rq->engine->mask &
- ~GENMASK(_BCS(I915_MAX_BCS - 1), BCS0);
-
/*
 * On Aux CCS platforms the invalidation of the Aux
 * table requires quiescing memory traffic beforehand
 */
-   if (aux_inv) {
+   if (gen12_engine_has_aux_inv(rq->engine)) {
cmd += 8; /* for the AUX invalidation */
cmd += 2; /* for the engine quiescing */
 
@@ -381,14 +404,7 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
*cs++ = 0; /* upper addr */
*cs++ = 0; /* value */
 
-   if (aux_inv) { /* hsdes: 1809175790 */
-   if (rq->engine->class == VIDEO_DECODE_CLASS)
-   cs = gen12_emit_aux_table_inv(rq->engine->gt,
- cs, GEN12_VD0_AUX_INV);
-   else
-   cs = gen12_emit_aux_table_inv(rq->engine->gt,
- cs, GEN12_VE0_AUX_INV);
-   }
+   cs = gen12_emit_aux_table_inv(rq->engine, cs);
 
if (mode & EMIT_INVALIDATE)
*cs++ = preparser_disable(false);
diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.h 
b/drivers/gpu/drm/i915/gt/gen8_engine_cs.h
index a44eda096557c..867ba697aceb8 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.h
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.h
@@ -13,6 +13,7 @@
 #include "intel_gt_regs.h"
 #include "intel_gpu_commands.h"
 
+st
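
The per-engine register lookup in this patch can be illustrated outside
the driver. The following is only a minimal sketch, assuming that an
engine without an invalidation register should emit nothing; every
sk_* name is a hypothetical stand-in rather than an i915 symbol, the
single emitted dword is a placeholder for the real command sequence,
and the register offsets are the ones listed in the intel_gt_regs.h
hunk of patch 1/9.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for i915_reg_t and the engine ids. */
typedef struct { uint32_t reg; } sk_reg_t;
#define SK_INVALID_REG ((sk_reg_t){ 0 })

enum sk_engine_id { SK_RCS0, SK_BCS0, SK_VCS0, SK_VCS1, SK_VCS2, SK_VECS0, SK_CCS0 };

/* Offset 0 is treated as "no register" in this sketch. */
static bool sk_reg_valid(sk_reg_t r)
{
	return r.reg != 0;
}

/* Map an engine to its aux invalidation register, if it has one. */
static sk_reg_t sk_get_aux_inv_reg(bool has_aux_ccs, enum sk_engine_id id)
{
	if (!has_aux_ccs)
		return SK_INVALID_REG;

	switch (id) {
	case SK_RCS0:  return (sk_reg_t){ 0x4208 };
	case SK_BCS0:  return (sk_reg_t){ 0x4248 };
	case SK_VCS0:  return (sk_reg_t){ 0x4218 };
	case SK_VCS2:  return (sk_reg_t){ 0x4298 };
	case SK_VECS0: return (sk_reg_t){ 0x4238 };
	case SK_CCS0:  return (sk_reg_t){ 0x42c8 };
	default:       return SK_INVALID_REG;
	}
}

/*
 * Emit only when the engine has a valid register; otherwise hand the
 * cursor back untouched. The single dword written here is a placeholder.
 */
static uint32_t *sk_emit_aux_table_inv(uint32_t *cs, bool has_aux_ccs,
				       enum sk_engine_id id)
{
	sk_reg_t reg = sk_get_aux_inv_reg(has_aux_ccs, id);

	assert(cs != NULL);

	if (!sk_reg_valid(reg))
		return cs;

	*cs++ = reg.reg;
	return cs;
}
```

With this shape the caller never needs to know which engines take part
in the invalidation: it can compare the returned cursor with the one it
passed in.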

[PATCH v7 7/9] drm/i915/gt: Ensure memory quiesced before invalidation for all engines

2023-07-20 Thread Andi Shyti
Commit af9e423a8aae ("drm/i915/gt: Ensure memory quiesced before
invalidation") has made sure that the memory is quiesced before
invalidating the AUX CCS table. Do it for all the other engines
and not just RCS.

Signed-off-by: Andi Shyti 
Cc: Jonathan Cavitt 
Cc: Matt Roper 
Cc:  # v5.8+
---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 36 
 1 file changed, 25 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c 
b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 202d6ff8b5264..b6dd22eb2d9b2 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -316,26 +316,40 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 
mode)
 int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
 {
intel_engine_mask_t aux_inv = 0;
-   u32 cmd, *cs;
+   u32 cmd_flush = 0;
+   u32 cmd = 4;
+   u32 *cs;
 
-   cmd = 4;
-   if (mode & EMIT_INVALIDATE) {
+   if (mode & EMIT_INVALIDATE)
cmd += 2;
 
-   if (HAS_AUX_CCS(rq->engine->i915) &&
-   (rq->engine->class == VIDEO_DECODE_CLASS ||
-rq->engine->class == VIDEO_ENHANCEMENT_CLASS)) {
-   aux_inv = rq->engine->mask &
-   ~GENMASK(_BCS(I915_MAX_BCS - 1), BCS0);
-   if (aux_inv)
-   cmd += 4;
-   }
+   if (HAS_AUX_CCS(rq->engine->i915))
+   aux_inv = rq->engine->mask &
+ ~GENMASK(_BCS(I915_MAX_BCS - 1), BCS0);
+
+   /*
+* On Aux CCS platforms the invalidation of the Aux
+* table requires quiescing memory traffic beforehand
+*/
+   if (aux_inv) {
+   cmd += 4; /* for the AUX invalidation */
+   cmd += 2; /* for the engine quiescing */
+
+   cmd_flush = MI_FLUSH_DW;
+
+   if (rq->engine->class == COPY_ENGINE_CLASS)
+   cmd_flush |= MI_FLUSH_DW_CCS;
}
 
cs = intel_ring_begin(rq, cmd);
if (IS_ERR(cs))
return PTR_ERR(cs);
 
+   if (cmd_flush) {
+   *cs++ = cmd_flush;
+   *cs++ = 0;
+   }
+
if (mode & EMIT_INVALIDATE)
*cs++ = preparser_disable(true);
 
-- 
2.40.1
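
The ring space accounting in gen12_emit_flush_xcs() above can be
checked in isolation. This is a minimal sketch, assuming the dword
budget as of this patch (4 base dwords, 2 for the preparser toggles, 4
for the aux invalidation and 2 for the quiescing flush). The function
name and the flag value are hypothetical, and the even-count assertion
reflects my assumption that ring reservations are made in qword
multiples.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag value, used only for this sketch. */
#define SK_EMIT_INVALIDATE (1u << 0)

/* Number of dwords to reserve in the ring for one xcs flush. */
static uint32_t sk_xcs_flush_dwords(uint32_t mode, bool needs_aux_inv)
{
	uint32_t cmd = 4;

	if (mode & SK_EMIT_INVALIDATE)
		cmd += 2; /* preparser disable/enable pair */

	if (needs_aux_inv) {
		cmd += 4; /* for the AUX invalidation */
		cmd += 2; /* for the engine quiescing */
	}

	/* Assumed: the ring only accepts an even number of dwords. */
	assert((cmd & 1) == 0);

	return cmd;
}
```

Keeping the count in one helper makes it easier to spot a mismatch
between the reserved space and the dwords that are later written.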



[PATCH v7 8/9] drm/i915/gt: Poll aux invalidation register bit on invalidation

2023-07-20 Thread Andi Shyti
From: Jonathan Cavitt 

For platforms that use Aux CCS, wait for the aux invalidation to
complete by polling the aux invalidation register until its bit
is cleared.

Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
Signed-off-by: Jonathan Cavitt 
Signed-off-by: Andi Shyti 
Cc:  # v5.8+
Reviewed-by: Nirmoy Das 
Reviewed-by: Andrzej Hajda 
Reviewed-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 17 -
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h |  1 +
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c 
b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index b6dd22eb2d9b2..3ded597f002a2 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -172,7 +172,15 @@ u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 
*cs, const i915_reg_t inv
*cs++ = MI_LOAD_REGISTER_IMM(1) | MI_LRI_MMIO_REMAP_EN;
*cs++ = i915_mmio_reg_offset(inv_reg) + gsi_offset;
*cs++ = AUX_INV;
-   *cs++ = MI_NOOP;
+
+   *cs++ = MI_SEMAPHORE_WAIT_TOKEN |
+   MI_SEMAPHORE_REGISTER_POLL |
+   MI_SEMAPHORE_POLL |
+   MI_SEMAPHORE_SAD_EQ_SDD;
+   *cs++ = 0;
+   *cs++ = i915_mmio_reg_offset(inv_reg) + gsi_offset;
+   *cs++ = 0;
+   *cs++ = 0;
 
return cs;
 }
@@ -282,10 +290,9 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
else if (engine->class == COMPUTE_CLASS)
flags &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
 
+   count = 8;
if (HAS_AUX_CCS(rq->engine->i915))
-   count = 8 + 4;
-   else
-   count = 8;
+   count += 8;
 
cs = intel_ring_begin(rq, count);
if (IS_ERR(cs))
@@ -332,7 +339,7 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
 * table requires quiescing memory traffic beforehand
 */
if (aux_inv) {
-   cmd += 4; /* for the AUX invalidation */
+   cmd += 8; /* for the AUX invalidation */
cmd += 2; /* for the engine quiescing */
 
cmd_flush = MI_FLUSH_DW;
diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h 
b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
index 5df7cce23197c..2bd8d98d21102 100644
--- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
+++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
@@ -121,6 +121,7 @@
 #define   MI_SEMAPHORE_TARGET(engine)  ((engine)<<15)
 #define MI_SEMAPHORE_WAIT  MI_INSTR(0x1c, 2) /* GEN8+ */
 #define MI_SEMAPHORE_WAIT_TOKENMI_INSTR(0x1c, 3) /* GEN12+ */
+#define   MI_SEMAPHORE_REGISTER_POLL   (1 << 16)
 #define   MI_SEMAPHORE_POLL(1 << 15)
 #define   MI_SEMAPHORE_SAD_GT_SDD  (0 << 12)
 #define   MI_SEMAPHORE_SAD_GTE_SDD (1 << 12)
-- 
2.40.1
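
The eight dwords emitted by gen12_emit_aux_table_inv() after this patch
can be reproduced in a standalone buffer. This is a minimal sketch
under stated assumptions: the sk_* macros are local stand-ins, the
encodings marked "assumed" are my reading of intel_gpu_commands.h and
should be verified there, and the per-dword comments on the semaphore
packet describe how I read the patch rather than the hardware
specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed encodings; verify against intel_gpu_commands.h. */
#define SK_MI_INSTR(opcode, flags)     (((uint32_t)(opcode) << 23) | (uint32_t)(flags))
#define SK_MI_LOAD_REGISTER_IMM(n)     SK_MI_INSTR(0x22, 2 * (n) - 1)
#define SK_MI_LRI_MMIO_REMAP_EN        (1u << 17)
#define SK_MI_SEMAPHORE_SAD_EQ_SDD     (4u << 12)

/* Values that appear in the hunks posted in this thread. */
#define SK_MI_SEMAPHORE_WAIT_TOKEN     SK_MI_INSTR(0x1c, 3)
#define SK_MI_SEMAPHORE_REGISTER_POLL  (1u << 16)
#define SK_MI_SEMAPHORE_POLL           (1u << 15)
#define SK_AUX_INV                     (1u << 0)

/* Write the invalidation request followed by a poll on the same register. */
static uint32_t *sk_emit_aux_table_inv(uint32_t *cs, uint32_t reg_offset)
{
	assert(cs != NULL);

	/* Request the invalidation by setting AUX_INV in the register. */
	*cs++ = SK_MI_LOAD_REGISTER_IMM(1) | SK_MI_LRI_MMIO_REMAP_EN;
	*cs++ = reg_offset;
	*cs++ = SK_AUX_INV;

	/* Poll the register until it reads back as zero. */
	*cs++ = SK_MI_SEMAPHORE_WAIT_TOKEN |
		SK_MI_SEMAPHORE_REGISTER_POLL |
		SK_MI_SEMAPHORE_POLL |
		SK_MI_SEMAPHORE_SAD_EQ_SDD;
	*cs++ = 0;          /* value to compare against */
	*cs++ = reg_offset; /* register being polled */
	*cs++ = 0;
	*cs++ = 0;

	return cs;
}
```

The sequence grows from four to eight dwords, which is why the callers
in this patch bump their ring reservation by 8 instead of 4.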



[PATCH v7 4/9] drm/i915/gt: Rename flags with bit_group_X according to the datasheet

2023-07-20 Thread Andi Shyti
In preparation for the next patch, align the naming of the pipe
control flag sets with the datasheet (BSPEC 47112). The variable
"flags" in gen12_emit_flush_rcs() holds the set of flags that the
datasheet calls Bit Group 1.

Also define Bit Group 0 as bit_group_0, where currently only the
PIPE_CONTROL0_HDC_PIPELINE_FLUSH bit is set.

Signed-off-by: Andi Shyti 
Cc:  # v5.8+
Reviewed-by: Matt Roper 
Reviewed-by: Andrzej Hajda 
Reviewed-by: Nirmoy Das 
---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 34 +---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.h | 18 -
 2 files changed, 29 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c 
b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 5fbc3f630f32b..7566c89d9def3 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -207,7 +207,8 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
 * table requires quiescing memory traffic beforehand
 */
if (mode & EMIT_FLUSH || HAS_AUX_CCS(engine->i915)) {
-   u32 flags = 0;
+   u32 bit_group_0 = 0;
+   u32 bit_group_1 = 0;
int err;
u32 *cs;
 
@@ -215,32 +216,33 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 
mode)
if (err)
return err;
 
-   flags |= PIPE_CONTROL_TILE_CACHE_FLUSH;
-   flags |= PIPE_CONTROL_FLUSH_L3;
-   flags |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
-   flags |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
+   bit_group_0 |= PIPE_CONTROL0_HDC_PIPELINE_FLUSH;
+
+   bit_group_1 |= PIPE_CONTROL_TILE_CACHE_FLUSH;
+   bit_group_1 |= PIPE_CONTROL_FLUSH_L3;
+   bit_group_1 |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
+   bit_group_1 |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
/* Wa_1409600907:tgl,adl-p */
-   flags |= PIPE_CONTROL_DEPTH_STALL;
-   flags |= PIPE_CONTROL_DC_FLUSH_ENABLE;
-   flags |= PIPE_CONTROL_FLUSH_ENABLE;
+   bit_group_1 |= PIPE_CONTROL_DEPTH_STALL;
+   bit_group_1 |= PIPE_CONTROL_DC_FLUSH_ENABLE;
+   bit_group_1 |= PIPE_CONTROL_FLUSH_ENABLE;
 
-   flags |= PIPE_CONTROL_STORE_DATA_INDEX;
-   flags |= PIPE_CONTROL_QW_WRITE;
+   bit_group_1 |= PIPE_CONTROL_STORE_DATA_INDEX;
+   bit_group_1 |= PIPE_CONTROL_QW_WRITE;
 
-   flags |= PIPE_CONTROL_CS_STALL;
+   bit_group_1 |= PIPE_CONTROL_CS_STALL;
 
if (!HAS_3D_PIPELINE(engine->i915))
-   flags &= ~PIPE_CONTROL_3D_ARCH_FLAGS;
+   bit_group_1 &= ~PIPE_CONTROL_3D_ARCH_FLAGS;
else if (engine->class == COMPUTE_CLASS)
-   flags &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
+   bit_group_1 &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
 
cs = intel_ring_begin(rq, 6);
if (IS_ERR(cs))
return PTR_ERR(cs);
 
-   cs = gen12_emit_pipe_control(cs,
-PIPE_CONTROL0_HDC_PIPELINE_FLUSH,
-flags, LRC_PPHWSP_SCRATCH_ADDR);
+   cs = gen12_emit_pipe_control(cs, bit_group_0, bit_group_1,
+LRC_PPHWSP_SCRATCH_ADDR);
intel_ring_advance(rq, cs);
}
 
diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.h 
b/drivers/gpu/drm/i915/gt/gen8_engine_cs.h
index 655e5c00ddc27..a44eda096557c 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.h
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.h
@@ -49,25 +49,29 @@ u32 *gen12_emit_fini_breadcrumb_rcs(struct i915_request 
*rq, u32 *cs);
 u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 *cs, const i915_reg_t 
inv_reg);
 
 static inline u32 *
-__gen8_emit_pipe_control(u32 *batch, u32 flags0, u32 flags1, u32 offset)
+__gen8_emit_pipe_control(u32 *batch, u32 bit_group_0,
+u32 bit_group_1, u32 offset)
 {
memset(batch, 0, 6 * sizeof(u32));
 
-   batch[0] = GFX_OP_PIPE_CONTROL(6) | flags0;
-   batch[1] = flags1;
+   batch[0] = GFX_OP_PIPE_CONTROL(6) | bit_group_0;
+   batch[1] = bit_group_1;
batch[2] = offset;
 
return batch + 6;
 }
 
-static inline u32 *gen8_emit_pipe_control(u32 *batch, u32 flags, u32 offset)
+static inline u32 *gen8_emit_pipe_control(u32 *batch,
+ u32 bit_group_1, u32 offset)
 {
-   return __gen8_emit_pipe_control(batch, 0, flags, offset);
+   return __gen8_emit_pipe_control(batch, 0, bit_group_1, offset);
 }
 
-static inline u32 *gen12_emit_pipe_control(u32 *batch, u32 flags0, u32 flags1, u32 offset)
+static inline u32 *gen12_emit_pipe_control(u32 *batch, u32 bit_group_0,
+   

[PATCH v7 6/9] drm/i915/gt: Refactor intel_emit_pipe_control_cs() in a single function

2023-07-20 Thread Andi Shyti
Just a trivial refactoring to reduce code duplication. This will
come in handy in the next commits.

Signed-off-by: Andi Shyti 
Cc:  # v5.8+
---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 44 +---
 1 file changed, 23 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c 
b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 9d050b9a19194..202d6ff8b5264 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -177,23 +177,31 @@ u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 
*cs, const i915_reg_t inv
return cs;
 }
 
+static u32 *intel_emit_pipe_control_cs(struct i915_request *rq, u32 bit_group_0,
+  u32 bit_group_1, u32 offset)
+{
+   u32 *cs;
+
+   cs = intel_ring_begin(rq, 6);
+   if (IS_ERR(cs))
+   return cs;
+
+   cs = gen12_emit_pipe_control(cs, bit_group_0, bit_group_1,
+offset);
+   intel_ring_advance(rq, cs);
+
+   return cs;
+}
+
 static int mtl_dummy_pipe_control(struct i915_request *rq)
 {
/* Wa_14016712196 */
if (IS_MTL_GRAPHICS_STEP(rq->engine->i915, M, STEP_A0, STEP_B0) ||
-   IS_MTL_GRAPHICS_STEP(rq->engine->i915, P, STEP_A0, STEP_B0)) {
-   u32 *cs;
-
-   /* dummy PIPE_CONTROL + depth flush */
-   cs = intel_ring_begin(rq, 6);
-   if (IS_ERR(cs))
-   return PTR_ERR(cs);
-   cs = gen12_emit_pipe_control(cs,
-0,
-PIPE_CONTROL_DEPTH_CACHE_FLUSH,
-LRC_PPHWSP_SCRATCH_ADDR);
-   intel_ring_advance(rq, cs);
-   }
+   IS_MTL_GRAPHICS_STEP(rq->engine->i915, P, STEP_A0, STEP_B0))
+   intel_emit_pipe_control_cs(rq,
+  0,
+  PIPE_CONTROL_DEPTH_CACHE_FLUSH,
+  LRC_PPHWSP_SCRATCH_ADDR);
 
return 0;
 }
@@ -210,7 +218,6 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
u32 bit_group_0 = 0;
u32 bit_group_1 = 0;
int err;
-   u32 *cs;
 
err = mtl_dummy_pipe_control(rq);
if (err)
@@ -244,13 +251,8 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
else if (engine->class == COMPUTE_CLASS)
bit_group_1 &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
 
-   cs = intel_ring_begin(rq, 6);
-   if (IS_ERR(cs))
-   return PTR_ERR(cs);
-
-   cs = gen12_emit_pipe_control(cs, bit_group_0, bit_group_1,
-LRC_PPHWSP_SCRATCH_ADDR);
-   intel_ring_advance(rq, cs);
+   intel_emit_pipe_control_cs(rq, bit_group_0, bit_group_1,
+  LRC_PPHWSP_SCRATCH_ADDR);
}
 
if (mode & EMIT_INVALIDATE) {
-- 
2.40.1



[PATCH v7 5/9] drm/i915/gt: Enable the CCS_FLUSH bit in the pipe control

2023-07-20 Thread Andi Shyti
Enable the CCS_FLUSH bit 13 in the pipe control for render and
compute engines on platforms starting from Meteor Lake (BSPEC
43904 and 47112).

Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
Signed-off-by: Andi Shyti 
Cc: Jonathan Cavitt 
Cc: Nirmoy Das 
Cc:  # v5.8+
---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 7 +++
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h | 1 +
 2 files changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c 
b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 7566c89d9def3..9d050b9a19194 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -218,6 +218,13 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
 
bit_group_0 |= PIPE_CONTROL0_HDC_PIPELINE_FLUSH;
 
+   /*
+* When required, in MTL+ platforms we need to
+* set the CCS_FLUSH bit in the pipe control
+*/
+   if (GRAPHICS_VER_FULL(rq->i915) >= IP_VER(12, 70))
+   bit_group_0 |= PIPE_CONTROL_CCS_FLUSH;
+
bit_group_1 |= PIPE_CONTROL_TILE_CACHE_FLUSH;
bit_group_1 |= PIPE_CONTROL_FLUSH_L3;
bit_group_1 |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h 
b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
index 5d143e2a8db03..5df7cce23197c 100644
--- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
+++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
@@ -299,6 +299,7 @@
 #define   PIPE_CONTROL_QW_WRITE(1<<14)
 #define   PIPE_CONTROL_POST_SYNC_OP_MASK(3<<14)
 #define   PIPE_CONTROL_DEPTH_STALL (1<<13)
+#define   PIPE_CONTROL_CCS_FLUSH   (1<<13) /* MTL+ */
 #define   PIPE_CONTROL_WRITE_FLUSH (1<<12)
 #define   PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH   (1<<12) /* gen6+ */
 #define   PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE(1<<11) /* MBZ on ILK */
-- 
2.40.1
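
The version gate added by this patch can be tried out on its own. This
is a minimal sketch, assuming that the full graphics version packs the
major number above an 8-bit release number (my reading of IP_VER()) and
that the HDC pipeline flush sits at bit 9 of Bit Group 0; both are
assumptions to verify in the driver headers. Only the CCS flush bit
position comes from the hunk above, and all sk_* names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout of the packed graphics version. */
#define SK_IP_VER(ver, rel)                 (((uint32_t)(ver) << 8) | (uint32_t)(rel))
/* Assumed bit position; check intel_gpu_commands.h. */
#define SK_PIPE_CONTROL0_HDC_PIPELINE_FLUSH (1u << 9)
/* Bit position taken from the hunk in this patch. */
#define SK_PIPE_CONTROL_CCS_FLUSH           (1u << 13)

/* Build Bit Group 0 of the render/compute flush for a given IP version. */
static uint32_t sk_rcs_bit_group_0(uint32_t graphics_ver_full)
{
	uint32_t bit_group_0 = SK_PIPE_CONTROL0_HDC_PIPELINE_FLUSH;

	/* The CCS flush is only requested from version 12.70 onwards. */
	if (graphics_ver_full >= SK_IP_VER(12, 70))
		bit_group_0 |= SK_PIPE_CONTROL_CCS_FLUSH;

	return bit_group_0;
}
```

Because the release number occupies the low bits, a plain integer
comparison orders 12.55 before 12.70 as expected.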



[PATCH v7 3/9] drm/i915/gt: Ensure memory quiesced before invalidation

2023-07-20 Thread Andi Shyti
From: Jonathan Cavitt 

All memory traffic must be quiesced before requesting
an aux invalidation on platforms that use Aux CCS.

Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
Signed-off-by: Jonathan Cavitt 
Signed-off-by: Andi Shyti 
Cc:  # v5.8+
Reviewed-by: Nirmoy Das 
Reviewed-by: Andrzej Hajda 
---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c 
b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 0d4d5e0407a2d..5fbc3f630f32b 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -202,7 +202,11 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
 {
struct intel_engine_cs *engine = rq->engine;
 
-   if (mode & EMIT_FLUSH) {
+   /*
+* On Aux CCS platforms the invalidation of the Aux
+* table requires quiescing memory traffic beforehand
+*/
+   if (mode & EMIT_FLUSH || HAS_AUX_CCS(engine->i915)) {
u32 flags = 0;
int err;
u32 *cs;
-- 
2.40.1



[PATCH v7 2/9] drm/i915: Add the has_aux_ccs device property

2023-07-20 Thread Andi Shyti
We have always assumed that a device has either AUX CCS or FLAT
CCS, but this is only an approximation: telling the two apart
requires some further per-device checks.

Add the "has_aux_ccs" flag to the intel_device_info structure so
that each device explicitly indicates whether it has AUX CCS.

Signed-off-by: Andi Shyti 
Cc: Matt Roper 
Cc: Jonathan Cavitt 
Cc:  # v5.8+
---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 4 ++--
 drivers/gpu/drm/i915/i915_drv.h  | 1 +
 drivers/gpu/drm/i915/i915_pci.c  | 5 -
 drivers/gpu/drm/i915/intel_device_info.h | 1 +
 4 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c 
b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 563efee055602..0d4d5e0407a2d 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -267,7 +267,7 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
else if (engine->class == COMPUTE_CLASS)
flags &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
 
-   if (!HAS_FLAT_CCS(rq->engine->i915))
+   if (HAS_AUX_CCS(rq->engine->i915))
count = 8 + 4;
else
count = 8;
@@ -307,7 +307,7 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
if (mode & EMIT_INVALIDATE) {
cmd += 2;
 
-   if (!HAS_FLAT_CCS(rq->engine->i915) &&
+   if (HAS_AUX_CCS(rq->engine->i915) &&
(rq->engine->class == VIDEO_DECODE_CLASS ||
 rq->engine->class == VIDEO_ENHANCEMENT_CLASS)) {
aux_inv = rq->engine->mask &
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 682ef2b5c7d59..e9cc048b5727a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -848,6 +848,7 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
  * stored in lmem to support the 3D and media compression formats.
  */
 #define HAS_FLAT_CCS(i915)   (INTEL_INFO(i915)->has_flat_ccs)
+#define HAS_AUX_CCS(i915)(INTEL_INFO(i915)->has_aux_ccs)
 
 #define HAS_GT_UC(i915)(INTEL_INFO(i915)->has_gt_uc)
 
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index fcacdc21643cf..c9ff1d11a9fce 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -643,7 +643,8 @@ static const struct intel_device_info jsl_info = {
TGL_CACHELEVEL, \
.has_global_mocs = 1, \
.has_pxp = 1, \
-   .max_pat_index = 3
+   .max_pat_index = 3, \
+   .has_aux_ccs = 1
 
 static const struct intel_device_info tgl_info = {
GEN12_FEATURES,
@@ -775,6 +776,7 @@ static const struct intel_device_info dg2_info = {
 
 static const struct intel_device_info ats_m_info = {
DG2_FEATURES,
+   .has_aux_ccs = 1,
.require_force_probe = 1,
.tuning_thread_rr_after_dep = 1,
 };
@@ -827,6 +829,7 @@ static const struct intel_device_info mtl_info = {
.__runtime.media.ip.ver = 13,
PLATFORM(INTEL_METEORLAKE),
.extra_gt_list = xelpmp_extra_gt,
+   .has_aux_ccs = 1,
.has_flat_ccs = 0,
.has_gmd_id = 1,
.has_guc_deprivilege = 1,
diff --git a/drivers/gpu/drm/i915/intel_device_info.h 
b/drivers/gpu/drm/i915/intel_device_info.h
index dbfe6443457b5..93485507506cc 100644
--- a/drivers/gpu/drm/i915/intel_device_info.h
+++ b/drivers/gpu/drm/i915/intel_device_info.h
@@ -151,6 +151,7 @@ enum intel_ppgtt_type {
func(has_reset_engine); \
func(has_3d_pipeline); \
func(has_4tile); \
+   func(has_aux_ccs); \
func(has_flat_ccs); \
func(has_global_mocs); \
func(has_gmd_id); \
-- 
2.40.1



[PATCH v7 1/9] drm/i915/gt: Cleanup aux invalidation registers

2023-07-20 Thread Andi Shyti
Fix the 'NV' definition postfix that is supposed to be INV.

Take the chance to also order the registers properly by address
and to rename GEN12_GFX_CCS_AUX_NV to GEN12_CCS_AUX_INV, like all
the other similar registers.

Also remove the VD1, VD3 and VE1 registers, which don't exist,
and add BCS0 and CCS0.

Signed-off-by: Andi Shyti 
Cc:  # v5.8+
Reviewed-by: Nirmoy Das 
Reviewed-by: Andrzej Hajda 
---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c |  8 
 drivers/gpu/drm/i915/gt/intel_gt_regs.h  | 16 
 drivers/gpu/drm/i915/gt/intel_lrc.c  |  6 +++---
 3 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c 
b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 23857cc08eca1..563efee055602 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -287,8 +287,8 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
 
if (!HAS_FLAT_CCS(rq->engine->i915)) {
/* hsdes: 1809175790 */
-   cs = gen12_emit_aux_table_inv(rq->engine->gt,
- cs, GEN12_GFX_CCS_AUX_NV);
+   cs = gen12_emit_aux_table_inv(rq->engine->gt, cs,
+ GEN12_CCS_AUX_INV);
}
 
*cs++ = preparser_disable(false);
@@ -348,10 +348,10 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 
mode)
if (aux_inv) { /* hsdes: 1809175790 */
if (rq->engine->class == VIDEO_DECODE_CLASS)
cs = gen12_emit_aux_table_inv(rq->engine->gt,
- cs, GEN12_VD0_AUX_NV);
+ cs, GEN12_VD0_AUX_INV);
else
cs = gen12_emit_aux_table_inv(rq->engine->gt,
- cs, GEN12_VE0_AUX_NV);
+ cs, GEN12_VE0_AUX_INV);
}
 
if (mode & EMIT_INVALIDATE)
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h 
b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index 718cb2c80f79e..2cdfb2f713d02 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -332,9 +332,11 @@
 #define GEN8_PRIVATE_PAT_HI_MMIO(0x40e0 + 4)
 #define GEN10_PAT_INDEX(index) _MMIO(0x40e0 + (index) * 4)
 #define BSD_HWS_PGA_GEN7   _MMIO(0x4180)
-#define GEN12_GFX_CCS_AUX_NV   _MMIO(0x4208)
-#define GEN12_VD0_AUX_NV   _MMIO(0x4218)
-#define GEN12_VD1_AUX_NV   _MMIO(0x4228)
+
+#define GEN12_CCS_AUX_INV  _MMIO(0x4208)
+#define GEN12_VD0_AUX_INV  _MMIO(0x4218)
+#define GEN12_VE0_AUX_INV  _MMIO(0x4238)
+#define GEN12_BCS0_AUX_INV _MMIO(0x4248)
 
 #define GEN8_RTCR  _MMIO(0x4260)
 #define GEN8_M1TCR _MMIO(0x4264)
@@ -342,14 +344,12 @@
 #define GEN8_BTCR  _MMIO(0x426c)
 #define GEN8_VTCR  _MMIO(0x4270)
 
-#define GEN12_VD2_AUX_NV   _MMIO(0x4298)
-#define GEN12_VD3_AUX_NV   _MMIO(0x42a8)
-#define GEN12_VE0_AUX_NV   _MMIO(0x4238)
-
 #define BLT_HWS_PGA_GEN7   _MMIO(0x4280)
 
-#define GEN12_VE1_AUX_NV   _MMIO(0x42b8)
+#define GEN12_VD2_AUX_INV  _MMIO(0x4298)
+#define GEN12_CCS0_AUX_INV _MMIO(0x42c8)
 #define   AUX_INV  REG_BIT(0)
+
 #define VEBOX_HWS_PGA_GEN7 _MMIO(0x4380)
 
 #define GEN12_AUX_ERR_DBG  _MMIO(0x43f4)
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 1b710102390bf..235f3fab60a98 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1374,7 +1374,7 @@ gen12_emit_indirect_ctx_rcs(const struct intel_context 
*ce, u32 *cs)
/* hsdes: 1809175790 */
if (!HAS_FLAT_CCS(ce->engine->i915))
cs = gen12_emit_aux_table_inv(ce->engine->gt,
- cs, GEN12_GFX_CCS_AUX_NV);
+ cs, GEN12_CCS_AUX_INV);
 
/* Wa_16014892111 */
if (IS_MTL_GRAPHICS_STEP(ce->engine->i915, M, STEP_A0, STEP_B0) ||
@@ -1403,10 +1403,10 @@ gen12_emit_indirect_ctx_xcs(const struct intel_context 
*ce, u32 *cs)
if (!HAS_FLAT_CCS(ce->engine->i915)) {
if (ce->engine->class == VIDEO_DECODE_CLASS)
cs = gen12_emit_aux_table_inv(ce->engine->gt,
- cs, GEN12_VD0_AUX_NV);
+   

[PATCH v7 0/9] Update AUX invalidation sequence

2023-07-20 Thread Andi Shyti
Hi,

as there are new hardware directives, we need a little adaptation
for the AUX invalidation sequence.

In this version we support all the engines affected by this
change.

The stable backport has some challenges because the original
patch that this series fixes has had more changes in between.

This series is slowly growing, with code refactorings and
features added and fixed along the way.

Thanks a lot Nirmoy, Andrzej and Matt for your review and for the
fruitful discussions!

Thanks,
Andi

Changelog:
=
v6 -> v7
 - Apply the correct sequence to the correct engine: the sequence
   belonging to the compute engines was being applied to the VD
   engine, a confusion promptly caught by Nirmoy. Thanks a lot,
   Nirmoy!

v5 -> v6
 - Fixed ccs flush in the engines VE and BCS. They are sent as a
   separate command instead of added in the pipe control.
 - Separated the CCS flushing in the pipe control patch from the
   quiescing of the memory. They were meant to be in separate
   patches already in the previous version, but apparently I
   squashed them by mistake.

v4 -> v5
 - The AUX CCS is added as a device property instead of checking
   against FLAT CCS. This adds the new HAS_AUX_CCS check
   (Patch 2, new).
 - little and trivial refactoring here and there.
 - extended the flags{0,1}/bit_group_{0,1} renaming to other
   functions.
 - Created an intel_emit_pipe_control_cs() wrapper for submitting
   the pipe control.
 - Quiesce memory for all the engines, not just RCS (Patch 6,
   new).
 - The PIPE_CONTROL_CCS_FLUSH is added to all the engines.
 - Remove redundant EMIT_FLUSH_CCS mode flag.
 - Remove unnecessary NOOPs from the command streamer for
   invalidating the CCS table.
 - Use INVALID_MMIO_REG and gen12_get_aux_inv_reg() instead of
   __MMIO(0) and reg.reg.
 - Remove useless wrapper and just use gen12_get_aux_inv_reg().

v3 -> v4
 - A trivial patch 3 is added to rename the flags with
   bit_group_{0,1} to align with the datasheet naming.
 - Patch 4 fixes a confusion I made where the CCS flag was
   applied to the wrong bit group.

v2 -> v3
 - added r-b from Nirmoy in patch 1 and 4.
 - added patch 3 which enables the ccs_flush in the control pipe
   for mtl+ compute and render engines.
 - added redundant checks in patch 2 for enabling the EMIT_FLUSH
   flag.

v1 -> v2
 - add a clean up preliminary patch for the existing registers
 - add support for more engines
 - add the Fixes tag

Andi Shyti (7):
  drm/i915/gt: Cleanup aux invalidation registers
  drm/i915: Add the has_aux_ccs device property
  drm/i915/gt: Rename flags with bit_group_X according to the datasheet
  drm/i915/gt: Enable the CCS_FLUSH bit in the pipe control
  drm/i915/gt: Refactor intel_emit_pipe_control_cs() in a single
function
  drm/i915/gt: Ensure memory quiesced before invalidation for all
engines
  drm/i915/gt: Support aux invalidation on all engines

Jonathan Cavitt (2):
  drm/i915/gt: Ensure memory quiesced before invalidation
  drm/i915/gt: Poll aux invalidation register bit on invalidation

 drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 188 ---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.h |  21 ++-
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h |   2 +
 drivers/gpu/drm/i915/gt/intel_gt_regs.h  |  16 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c  |  17 +-
 drivers/gpu/drm/i915/i915_drv.h  |   1 +
 drivers/gpu/drm/i915/i915_pci.c  |   5 +-
 drivers/gpu/drm/i915/intel_device_info.h |   1 +
 8 files changed, 151 insertions(+), 100 deletions(-)

-- 
2.40.1



Re: [PATCH v6 0/9] Update AUX invalidation sequence

2023-07-20 Thread Andi Shyti
Oops... sorry... I just realized that I sent V6 again... please
ignore this series!

Andi

On Thu, Jul 20, 2023 at 06:44:45PM +0200, Andi Shyti wrote:
> Hi,
> 
> as there are new hardware directives, we need a little adaptation
> for the AUX invalidation sequence.
> 
> In this version we support all the engines affected by this
> change.
> 
> The stable backport has some challenges because the original
> patch that this series fixes has had more changes in between.
> 
> This patch is slowly exploding with code refactorings and
> features added and fixed.
> 
> Thanks a lot Nirmoy, Andrzej and Matt for your review and for the
> fruitful discussions!
> 
> Thanks,
> Andi
> 
> Changelog:
> =
> v5 -> v6
>  - Fixed ccs flush in the engines VE and BCS. They are sent as a
>separate command instead of added in the pipe control.
>  - Separated the CCS flushing in the pipe control patch from the
>quiescing of the memory. They were meant to be in separate
>patches already in the previous version, but apparently I
>squashed them by mistake.
> 
> v4 -> v5
>  - The AUX CCS is added as a device property instead of checking
>against FLAT CCS. This adds the new HAS_AUX_CCS check
>(Patch 2, new).
>  - little and trivial refactoring here and there.
>  - extended the flags{0,1}/bit_group_{0,1} renaming to other
>functions.
>  - Created an intel_emit_pipe_control_cs() wrapper for submitting
>the pipe control.
>  - Quiesce memory for all the engines, not just RCS (Patch 6,
>new).
>  - The PIPE_CONTROL_CCS_FLUSH is added to all the engines.
>  - Remove redundant EMIT_FLUSH_CCS mode flag.
>  - Remove unnecessary NOOPs from the command streamer for
>invalidating the CCS table.
>  - Use INVALID_MMIO_REG and gen12_get_aux_inv_reg() instead of
>__MMIO(0) and reg.reg.
>  - Remove useless wrapper and just use gen12_get_aux_inv_reg().
> 
> v3 -> v4
>  - A trivial patch 3 is added to rename the flags with
>bit_group_{0,1} to align with the datasheet naming.
>  - Patch 4 fixes a confusion I made where the CCS flag was
>applied to the wrong bit group.
> 
> v2 -> v3
>  - added r-b from Nirmoy in patch 1 and 4.
>  - added patch 3 which enables the ccs_flush in the control pipe
>for mtl+ compute and render engines.
>  - added redundant checks in patch 2 for enabling the EMIT_FLUSH
>flag.
> 
> v1 -> v2
>  - add a clean up preliminary patch for the existing registers
>  - add support for more engines
>  - add the Fixes tag
> 
> Andi Shyti (7):
>   drm/i915/gt: Cleanup aux invalidation registers
>   drm/i915: Add the has_aux_ccs device property
>   drm/i915/gt: Rename flags with bit_group_X according to the datasheet
>   drm/i915/gt: Refactor intel_emit_pipe_control_cs() in a single
> function
>   drm/i915/gt: Ensure memory quiesced before invalidation for all
> engines
>   drm/i915/gt: Enable the CCS_FLUSH bit in the pipe control
>   drm/i915/gt: Support aux invalidation on all engines
> 
> Jonathan Cavitt (2):
>   drm/i915/gt: Ensure memory quiesced before invalidation
>   drm/i915/gt: Poll aux invalidation register bit on invalidation
> 
>  drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 222 +--
>  drivers/gpu/drm/i915/gt/gen8_engine_cs.h |  21 +-
>  drivers/gpu/drm/i915/gt/intel_gpu_commands.h |   2 +
>  drivers/gpu/drm/i915/gt/intel_gt_regs.h  |  16 +-
>  drivers/gpu/drm/i915/gt/intel_lrc.c  |  17 +-
>  drivers/gpu/drm/i915/i915_drv.h  |   1 +
>  drivers/gpu/drm/i915/i915_pci.c  |   5 +-
>  drivers/gpu/drm/i915/intel_device_info.h |   1 +
>  8 files changed, 186 insertions(+), 99 deletions(-)
> 
> -- 
> 2.40.1


Re: [PATCH v6 7/9] drm/i915/gt: Enable the CCS_FLUSH bit in the pipe control

2023-07-20 Thread Matt Roper
On Wed, Jul 19, 2023 at 01:07:27PM +0200, Andi Shyti wrote:
> Enable the CCS_FLUSH bit 13 in the control pipe for render and
> compute engines in platforms starting from Meteor Lake (BSPEC
> 43904 and 47112). The VE and BCS engines need to add the flush
> part in their command streamer.
> 
> Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all 
> engines")
> Signed-off-by: Andi Shyti 
> Cc: Jonathan Cavitt 
> Cc: Nirmoy Das 
> Cc:  # v5.8+
> ---
>  drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 31 
>  drivers/gpu/drm/i915/gt/intel_gpu_commands.h |  1 +
>  2 files changed, 32 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c 
> b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> index 3bedab8d61db1..78bbd55262a2d 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> @@ -225,6 +225,13 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 
> mode)
>  
>   bit_group_0 |= PIPE_CONTROL0_HDC_PIPELINE_FLUSH;
>  
> + /*
> +  * When required, in MTL+ platforms we need to

Nitpick:  let's avoid using "FOO+" as "FOO and beyond."  We already have
formal IP names that include + signs (Xe_LPM+, Xe_LPD+, etc.), so using
it this way can cause confusion.

> +  * set the CCS_FLUSH bit in the pipe control
> +  */
> + if (GRAPHICS_VER_FULL(rq->i915) >= IP_VER(12, 70))
> + bit_group_0 |= PIPE_CONTROL_CCS_FLUSH;
> +
>   bit_group_1 |= PIPE_CONTROL_TILE_CACHE_FLUSH;
>   bit_group_1 |= PIPE_CONTROL_FLUSH_L3;
>   bit_group_1 |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
> @@ -309,6 +316,7 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 
> mode)
>  int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
>  {
>   intel_engine_mask_t aux_inv = 0;
> + u32 cmd_flush = 0;
>   u32 cmd = 4;
>   u32 *cs;
>  
> @@ -339,6 +347,13 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 
> mode)
>   bit_group_1 |= PIPE_CONTROL_FLUSH_L3;
>   bit_group_1 |= PIPE_CONTROL_CS_STALL;
>  
> + /*
> +  * When required, in MTL+ platforms we need to
> +  * set the CCS_FLUSH bit in the pipe control
> +  */
> + if (GRAPHICS_VER_FULL(rq->i915) >= IP_VER(12, 70))
> + bit_group_0 |= PIPE_CONTROL_CCS_FLUSH;
> +
>   intel_emit_pipe_control_cs(rq, bit_group_0, bit_group_1,
>  LRC_PPHWSP_SCRATCH_ADDR);
>  
> @@ -346,7 +361,18 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 
> mode)
>  
>   case VIDEO_ENHANCEMENT_CLASS:
>   case COMPUTE_CLASS:
> + cmd += 2;
> + cmd_flush = MI_FLUSH_DW;
> +
> + break;
> +

It looks like some of these changes wound up in the wrong patch?
And as Nirmoy pointed out on the other patch, some of the functions and
engine instructions are mixed around here too.


Matt

>   case COPY_ENGINE_CLASS:
> + cmd += 2;
> + /*
> +  * When required, in MTL+ platforms we need to
> +  * set the CCS_FLUSH bit in the pipe control
> +  */
> + cmd_flush = MI_FLUSH_DW | MI_FLUSH_DW_CCS;
>   break;
>   }
>   }
> @@ -355,6 +381,11 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 
> mode)
>   if (IS_ERR(cs))
>   return PTR_ERR(cs);
>  
> + if (cmd_flush) {
> + *cs++ = cmd_flush;
> + *cs++ = 0;
> + }
> +
>   if (mode & EMIT_INVALIDATE)
>   *cs++ = preparser_disable(true);
>  
> diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h 
> b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> index 5d143e2a8db03..5df7cce23197c 100644
> --- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> +++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
> @@ -299,6 +299,7 @@
>  #define   PIPE_CONTROL_QW_WRITE  (1<<14)
>  #define   PIPE_CONTROL_POST_SYNC_OP_MASK(3<<14)
>  #define   PIPE_CONTROL_DEPTH_STALL   (1<<13)
> +#define   PIPE_CONTROL_CCS_FLUSH (1<<13) /* MTL+ */
>  #define   PIPE_CONTROL_WRITE_FLUSH   (1<<12)
>  #define   PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH (1<<12) /* gen6+ */
>  #define   PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE  (1<<11) /* MBZ on ILK */
> -- 
> 2.40.1
> 

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation


Re: [PATCH v5] Documentation/gpu: Add a VM_BIND async draft document

2023-07-20 Thread Matthew Brost
On Sat, Jul 15, 2023 at 05:45:43PM +0200, Thomas Hellström wrote:
> Add a motivation for and description of asynchronous VM_BIND operation
> 
> v2:
> - Fix typos (Nirmoy Das)
> - Improve the description of a memory fence (Oak Zeng)
> - Add a reference to the document in the Xe RFC.
> - Add pointers to sample uAPI suggestions
> v3:
> - Address review comments (Danilo Krummrich)
> - Formatting fixes
> v4:
> - Address typos (Francois Dugast)
> - Explain why in-fences are not allowed for VM_BIND operations for long-
>   running workloads (Matthew Brost)
> v5:
> - More typo- and style fixing
> - Further clarify the implications of disallowing in-fences for VM_BIND
>   operations for long-running workloads (Matthew Brost)
> 
> Signed-off-by: Thomas Hellström 

Reviewed-by: Matthew Brost 

> Acked-by: Nirmoy Das 
> ---
>  Documentation/gpu/drm-vm-bind-async.rst | 171 
>  Documentation/gpu/rfc/xe.rst|   4 +-
>  2 files changed, 173 insertions(+), 2 deletions(-)
>  create mode 100644 Documentation/gpu/drm-vm-bind-async.rst
> 
> diff --git a/Documentation/gpu/drm-vm-bind-async.rst 
> b/Documentation/gpu/drm-vm-bind-async.rst
> new file mode 100644
> index ..d2b02a38198a
> --- /dev/null
> +++ b/Documentation/gpu/drm-vm-bind-async.rst
> @@ -0,0 +1,171 @@
> +
> +Asynchronous VM_BIND
> +
> +
> +Nomenclature:
> +=
> +
> +* ``VRAM``: On-device memory. Sometimes referred to as device local memory.
> +
> +* ``gpu_vm``: A GPU address space. Typically per process, but can be shared 
> by
> +  multiple processes.
> +
> +* ``VM_BIND``: An operation or a list of operations to modify a gpu_vm using
> +  an IOCTL. The operations include mapping and unmapping system- or
> +  VRAM memory.
> +
> +* ``syncobj``: A container that abstracts synchronization objects. The
> +  synchronization objects can be either generic, like dma-fences or
> +  driver specific. A syncobj typically indicates the type of the
> +  underlying synchronization object.
> +
> +* ``in-syncobj``: Argument to a VM_BIND IOCTL, the VM_BIND operation waits
> +  for these before starting.
> +
> +* ``out-syncobj``: Argument to a VM_BIND_IOCTL, the VM_BIND operation
> +  signals these when the bind operation is complete.
> +
> +* ``memory fence``: A synchronization object, different from a dma-fence.
> +  A memory fence uses the value of a specified memory location to determine
> +  signaled status. A memory fence can be awaited and signaled by both
> +  the GPU and CPU. Memory fences are sometimes referred to as
> +  user-fences, userspace-fences or gpu futexes and do not necessarily obey
> +  the dma-fence rule of signaling within a "reasonable amount of time".
> +  The kernel should thus avoid waiting for memory fences with locks held.
> +
> +* ``long-running workload``: A workload that may take more than the
> +  current stipulated dma-fence maximum signal delay to complete and
> +  which therefore needs to set the gpu_vm or the GPU execution context in
> +  a certain mode that disallows completion dma-fences.
> +
> +* ``exec function``: An exec function is a function that revalidates all
> +  affected gpu_vmas, submits a GPU command batch and registers the
> +  dma_fence representing the GPU command's activity with all affected
> +  dma_resvs. For completeness, although not covered by this document,
> +  it's worth mentioning that an exec function may also be the
> +  revalidation worker that is used by some drivers in compute /
> +  long-running mode.
> +
> +* ``bind context``: A context identifier used for the VM_BIND
> +  operation. VM_BIND operations that use the same bind context can be
> +  assumed, where it matters, to complete in order of submission. No such
> +  assumptions can be made for VM_BIND operations using separate bind 
> contexts.
> +
> +* ``UMD``: User-mode driver.
> +
> +* ``KMD``: Kernel-mode driver.
> +
> +
> +Synchronous / Asynchronous VM_BIND operation
> +
> +
> +Synchronous VM_BIND
> +___
> +With Synchronous VM_BIND, the VM_BIND operations all complete before the
> +IOCTL returns. A synchronous VM_BIND takes neither in-fences nor
> +out-fences. Synchronous VM_BIND may block and wait for GPU operations;
> +for example swap-in or clearing, or even previous binds.
> +
> +Asynchronous VM_BIND
> +
> +Asynchronous VM_BIND accepts both in-syncobjs and out-syncobjs. While the
> +IOCTL may return immediately, the VM_BIND operations wait for the in-syncobjs
> +before modifying the GPU page-tables, and signal the out-syncobjs when
> +the modification is done in the sense that the next exec function that
> +awaits for the out-syncobjs will see the change. Errors are reported
> +synchronously assuming that the asynchronous part of the job never errors.
> +In low-memory situations the implementation may block, performing the
> +VM_BIND synchronously, because there might not be 

Re: [PATCH v6 5/9] drm/i915/gt: Refactor intel_emit_pipe_control_cs() in a single function

2023-07-20 Thread Matt Roper
On Wed, Jul 19, 2023 at 01:07:25PM +0200, Andi Shyti wrote:
> Just a trivial refactoring to reduce code duplication. This will
> come in handy in the next commits.
> 
> Signed-off-by: Andi Shyti 
> ---
>  drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 44 +---
>  1 file changed, 23 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c 
> b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> index 7566c89d9def3..1b1dadacfbf42 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> @@ -177,23 +177,31 @@ u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 
> *cs, const i915_reg_t inv
>   return cs;
>  }
>  
> +static u32 *intel_emit_pipe_control_cs(struct i915_request *rq, u32 
> bit_group_0,

This is another case where it gets confusing because this function name
sounds like it's something generic, but it actually only applies to a
small subset of platforms (gen12).

> +u32 bit_group_1, u32 offset)
> +{
> + u32 *cs;
> +
> + cs = intel_ring_begin(rq, 6);
> + if (IS_ERR(cs))
> + return cs;

We're not actually checking for this error at the callsites.  Should we
be checking for it and propagating it farther up the call stack?

> +
> + cs = gen12_emit_pipe_control(cs, bit_group_0, bit_group_1,
> +  LRC_PPHWSP_SCRATCH_ADDR);
> + intel_ring_advance(rq, cs);
> +
> + return cs;

This cursor never gets used for anything.  We can probably just make
this function return an int error code.
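To illustrate the two points above, here is a minimal userspace sketch of
a helper that returns an int error code and a caller that propagates it.
Everything prefixed with mock_ is a hypothetical stand-in (including the
tiny 8-dword ring), the ERR_PTR helpers are simplified copies of the
kernel ones, and the emitted dwords are placeholders rather than the real
PIPE_CONTROL encoding.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's ERR_PTR helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long err)
{
	return (void *)(intptr_t)err;
}

static inline long PTR_ERR(const void *p)
{
	return (long)(intptr_t)p;
}

static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical request with a tiny ring, so that running out of space is easy to hit. */
struct mock_request {
	uint32_t ring[8];
	size_t used;
};

/* Stand-in for intel_ring_begin(): reserve dwords or return ERR_PTR(-ENOSPC). */
static uint32_t *mock_ring_begin(struct mock_request *rq, size_t dwords)
{
	if (rq->used + dwords > sizeof(rq->ring) / sizeof(rq->ring[0]))
		return ERR_PTR(-ENOSPC);

	return &rq->ring[rq->used];
}

/* Stand-in for intel_ring_advance(): commit everything written up to cs. */
static void mock_ring_advance(struct mock_request *rq, uint32_t *cs)
{
	rq->used = (size_t)(cs - rq->ring);
}

/* Returns 0 or a negative errno; the cursor never leaves the helper. */
static int emit_pipe_control(struct mock_request *rq, uint32_t bit_group_0,
			     uint32_t bit_group_1, uint32_t offset)
{
	uint32_t *cs;

	cs = mock_ring_begin(rq, 6);
	if (IS_ERR(cs))
		return (int)PTR_ERR(cs);

	/* Placeholder payload: six dwords, not the real command layout. */
	*cs++ = bit_group_0;
	*cs++ = bit_group_1;
	*cs++ = offset;
	*cs++ = 0;
	*cs++ = 0;
	*cs++ = 0;
	mock_ring_advance(rq, cs);

	return 0;
}

/* Caller that checks the result and passes the error up the stack. */
static int emit_flush(struct mock_request *rq)
{
	int err;

	err = emit_pipe_control(rq, 0, 0x1, 0);
	if (err)
		return err;

	return emit_pipe_control(rq, 0, 0x2, 0);
}
```

With the 8-dword mock ring the second emit cannot fit, so emit_flush()
returns -ENOSPC instead of silently dropping the failure.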


Matt

> +}
> +
>  static int mtl_dummy_pipe_control(struct i915_request *rq)
>  {
>   /* Wa_14016712196 */
>   if (IS_MTL_GRAPHICS_STEP(rq->engine->i915, M, STEP_A0, STEP_B0) ||
> - IS_MTL_GRAPHICS_STEP(rq->engine->i915, P, STEP_A0, STEP_B0)) {
> - u32 *cs;
> -
> - /* dummy PIPE_CONTROL + depth flush */
> - cs = intel_ring_begin(rq, 6);
> - if (IS_ERR(cs))
> - return PTR_ERR(cs);
> - cs = gen12_emit_pipe_control(cs,
> -  0,
> -  PIPE_CONTROL_DEPTH_CACHE_FLUSH,
> -  LRC_PPHWSP_SCRATCH_ADDR);
> - intel_ring_advance(rq, cs);
> - }
> + IS_MTL_GRAPHICS_STEP(rq->engine->i915, P, STEP_A0, STEP_B0))
> + intel_emit_pipe_control_cs(rq,
> +0,
> +PIPE_CONTROL_DEPTH_CACHE_FLUSH,
> +LRC_PPHWSP_SCRATCH_ADDR);
>  
>   return 0;
>  }
> @@ -210,7 +218,6 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 
> mode)
>   u32 bit_group_0 = 0;
>   u32 bit_group_1 = 0;
>   int err;
> - u32 *cs;
>  
>   err = mtl_dummy_pipe_control(rq);
>   if (err)
> @@ -237,13 +244,8 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 
> mode)
>   else if (engine->class == COMPUTE_CLASS)
>   bit_group_1 &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
>  
> - cs = intel_ring_begin(rq, 6);
> - if (IS_ERR(cs))
> - return PTR_ERR(cs);
> -
> - cs = gen12_emit_pipe_control(cs, bit_group_0, bit_group_1,
> -  LRC_PPHWSP_SCRATCH_ADDR);
> - intel_ring_advance(rq, cs);
> + intel_emit_pipe_control_cs(rq, bit_group_0, bit_group_1,
> +LRC_PPHWSP_SCRATCH_ADDR);
>   }
>  
>   if (mode & EMIT_INVALIDATE) {
> -- 
> 2.40.1
> 

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation


Re: [PATCH v5] drm/i915/selftest/gsc: Ensure GSC Proxy init completes before selftests

2023-07-20 Thread Ceraolo Spurio, Daniele




On 7/12/2023 4:12 PM, Alan Previn wrote:

On MTL, if the GSC Proxy init flows haven't completed, submissions to the
GSC engine will fail. Those init flows are dependent on the mei's
gsc_proxy component that is loaded in parallel with i915 and a
worker that could potentially start after i915 driver init is done.

That said, all subsystems that access the GSC engine today do check
for such init flow completion before using the GSC engine. However,
selftests currently don't wait on anything before starting.

To fix this, add a waiter function at the start of __run_selftests
that waits for gsc-proxy init flows to complete.

Difference from prior versions:
v5: - Move the call to __wait_gsc_proxy_completed from common
  __run_selftests dispatcher to the group-level selftest
  function (Tvrtko).
- change the pr_info to pr_warn if we hit the timeout.
v4: - Remove generalized waiters function table framework (Tvrtko).
- Remove mention of CI-framework-timeout from comments (Tvrtko).
v3: - Rebase to latest drm-tip.
v2: - Based on internal testing, increase the timeout for gsc-proxy
  specific case to 8 seconds.

Signed-off-by: Alan Previn 
---
  .../gpu/drm/i915/selftests/i915_selftest.c| 26 +++
  1 file changed, 26 insertions(+)

diff --git a/drivers/gpu/drm/i915/selftests/i915_selftest.c 
b/drivers/gpu/drm/i915/selftests/i915_selftest.c
index 39da0fb0d6d2..b03d03eac3d6 100644
--- a/drivers/gpu/drm/i915/selftests/i915_selftest.c
+++ b/drivers/gpu/drm/i915/selftests/i915_selftest.c
@@ -24,6 +24,8 @@
  #include 
  
  #include "gt/intel_gt_pm.h"

+#include "gt/uc/intel_gsc_fw.h"
+
  #include "i915_driver.h"
  #include "i915_drv.h"
  #include "i915_selftest.h"
@@ -127,6 +129,26 @@ static void set_default_test_all(struct selftest *st, 
unsigned int count)
st[i].enabled = true;
  }
  
+static void

+__wait_gsc_proxy_completed(struct drm_i915_private *i915)
+{
+   bool need_to_wait = (IS_ENABLED(CONFIG_INTEL_MEI_GSC_PROXY) &&
+i915->media_gt &&
+HAS_ENGINE(i915->media_gt, GSC0) &&
+
intel_uc_fw_is_loadable(&i915->media_gt->uc.gsc.fw));
+   /*
+* The gsc proxy component depends on the kernel component driver load 
ordering
+* and in corner cases (the first time after an IFWI flash), 
init-completion
+* firmware flows take longer.
+*/
+   unsigned long timeout_ms = 8000;
+
+   if (need_to_wait &&
+   (wait_for(intel_gsc_uc_fw_proxy_init_done(&i915->media_gt->uc.gsc, 
true),


Small issue here: if proxy init fails, intel_gsc_uc_fw_proxy_init_done 
will keep returning false, so we'll wait for the full 8 secs. Maybe we 
can instead have a proxy_init_status function to differentiate between 
pending/failed/done. This would basically be a generalization of the 
checks you already have in https://patchwork.freedesktop.org/series/118723/.
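
A rough userspace sketch of what such a tri-state wait could look like.
The enum, the function names and the poll-count budget (used here in
place of a millisecond timeout so the example stays deterministic) are
all hypothetical and not taken from the driver.

```c
#include <stddef.h>

/* Hypothetical tri-state result; the names are illustrative only. */
enum proxy_init_status {
	PROXY_INIT_PENDING,
	PROXY_INIT_FAILED,
	PROXY_INIT_DONE,
};

typedef enum proxy_init_status (*proxy_status_fn)(void *ctx);

/*
 * Poll until the status leaves PENDING or the budget is exhausted.
 * Returns the last status seen, so a failure ends the wait early
 * instead of consuming the whole budget.
 */
static enum proxy_init_status
wait_proxy_init(proxy_status_fn get_status, void *ctx,
		unsigned int max_polls, unsigned int *polls_used)
{
	enum proxy_init_status st = PROXY_INIT_PENDING;
	unsigned int i;

	for (i = 0; i < max_polls; i++) {
		st = get_status(ctx);
		if (st != PROXY_INIT_PENDING)
			break;
	}

	if (polls_used)
		*polls_used = (i < max_polls) ? i + 1 : i;

	return st;
}

/* Mock status source: reports PENDING a few times, then a fixed result. */
struct mock_proxy {
	unsigned int polls_until_result;
	enum proxy_init_status result;
};

static enum proxy_init_status mock_status(void *ctx)
{
	struct mock_proxy *p = ctx;

	if (p->polls_until_result > 0) {
		p->polls_until_result--;
		return PROXY_INIT_PENDING;
	}

	return p->result;
}
```

A caller would warn about a timeout only when the returned status is
still PROXY_INIT_PENDING, and could report a failed init separately.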

Patch LGTM apart from this.

Daniele


+   timeout_ms)))
+   pr_warn(DRIVER_NAME "Timed out waiting for 
gsc_proxy_completion!\n");
+}
+
  static int __run_selftests(const char *name,
   struct selftest *st,
   unsigned int count,
@@ -206,6 +228,8 @@ int i915_live_selftests(struct pci_dev *pdev)
if (!i915_selftest.live)
return 0;
  
+	__wait_gsc_proxy_completed(pdev_to_i915(pdev));

+
err = run_selftests(live, pdev_to_i915(pdev));
if (err) {
i915_selftest.live = err;
@@ -227,6 +251,8 @@ int i915_perf_selftests(struct pci_dev *pdev)
if (!i915_selftest.perf)
return 0;
  
+	__wait_gsc_proxy_completed(pdev_to_i915(pdev));

+
err = run_selftests(perf, pdev_to_i915(pdev));
if (err) {
i915_selftest.perf = err;

base-commit: 57ea1a97c50c63c77e3bfa46ee486e8a451be5e7




Re: [PATCH v6 2/9] drm/i915: Add the has_aux_ccs device property

2023-07-20 Thread Matt Roper
On Wed, Jul 19, 2023 at 01:07:22PM +0200, Andi Shyti wrote:
> We always assumed that a device might either have AUX or FLAT
> CCS, but this is an approximation that is not always true as it
> requires some further per device checks.
> 
> Add the "has_aux_ccs" flag in the intel_device_info structure in
> order to have a per device flag indicating of the AUX CCS.

I think this flag is a bit misnamed/inaccurate at the moment.  AuxCCS in
general has been around for ages.  Bspec 14276 indicates the GT side of
the hardware has had AuxCCS since at least SNB (gen6).  You seem to just
be setting this flag on the platforms where we need to do TLB
invalidation for the AUX (gen12), which is a small subset of the
platforms that had this compression in general.

I kind of feel like the helper function approach might still be simpler
than using a device flag, but if you want to stick with the flag it's
probably best to rename it slightly so that it more accurately reflects
what we're using it for.
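
For comparison, a minimal sketch of what the helper function approach
might look like. The struct, the helper name and the exact condition
(graphics version 12 without flat CCS) are assumptions made for
illustration, not the driver's actual definitions.

```c
#include <stdbool.h>

/* Hypothetical, reduced stand-in for the device info the driver consults. */
struct mock_device {
	int graphics_ver;
	bool has_flat_ccs;
};

/*
 * Named after what it is used for (aux table invalidation) rather than
 * after the presence of AuxCCS in general. The condition is an
 * assumption for this sketch.
 */
static bool needs_aux_inv(const struct mock_device *dev)
{
	return dev->graphics_ver == 12 && !dev->has_flat_ccs;
}

/* Mirrors the "count = 8 + 4" / "count = 8" choice in the quoted diff. */
static int flush_dword_count(const struct mock_device *dev)
{
	return needs_aux_inv(dev) ? 8 + 4 : 8;
}
```

The check then lives in one place and no per-platform flag has to be
kept in sync in the device info tables.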


Matt

> 
> Signed-off-by: Andi Shyti 
> Cc: Matt Roper 
> Cc: Jonathan Cavitt 
> Cc:  # v5.8+
> ---
>  drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 4 ++--
>  drivers/gpu/drm/i915/i915_drv.h  | 1 +
>  drivers/gpu/drm/i915/i915_pci.c  | 5 -
>  drivers/gpu/drm/i915/intel_device_info.h | 1 +
>  4 files changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c 
> b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> index 563efee055602..0d4d5e0407a2d 100644
> --- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> +++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
> @@ -267,7 +267,7 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 
> mode)
>   else if (engine->class == COMPUTE_CLASS)
>   flags &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
>  
> - if (!HAS_FLAT_CCS(rq->engine->i915))
> + if (HAS_AUX_CCS(rq->engine->i915))
>   count = 8 + 4;
>   else
>   count = 8;
> @@ -307,7 +307,7 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 
> mode)
>   if (mode & EMIT_INVALIDATE) {
>   cmd += 2;
>  
> - if (!HAS_FLAT_CCS(rq->engine->i915) &&
> + if (HAS_AUX_CCS(rq->engine->i915) &&
>   (rq->engine->class == VIDEO_DECODE_CLASS ||
>rq->engine->class == VIDEO_ENHANCEMENT_CLASS)) {
>   aux_inv = rq->engine->mask &
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 682ef2b5c7d59..e9cc048b5727a 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -848,6 +848,7 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
>   * stored in lmem to support the 3D and media compression formats.
>   */
>  #define HAS_FLAT_CCS(i915)   (INTEL_INFO(i915)->has_flat_ccs)
> +#define HAS_AUX_CCS(i915)(INTEL_INFO(i915)->has_aux_ccs)
>  
>  #define HAS_GT_UC(i915)  (INTEL_INFO(i915)->has_gt_uc)
>  
> diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
> index fcacdc21643cf..c9ff1d11a9fce 100644
> --- a/drivers/gpu/drm/i915/i915_pci.c
> +++ b/drivers/gpu/drm/i915/i915_pci.c
> @@ -643,7 +643,8 @@ static const struct intel_device_info jsl_info = {
>   TGL_CACHELEVEL, \
>   .has_global_mocs = 1, \
>   .has_pxp = 1, \
> - .max_pat_index = 3
> + .max_pat_index = 3, \
> + .has_aux_ccs = 1
>  
>  static const struct intel_device_info tgl_info = {
>   GEN12_FEATURES,
> @@ -775,6 +776,7 @@ static const struct intel_device_info dg2_info = {
>  
>  static const struct intel_device_info ats_m_info = {
>   DG2_FEATURES,
> + .has_aux_ccs = 1,
>   .require_force_probe = 1,
>   .tuning_thread_rr_after_dep = 1,
>  };
> @@ -827,6 +829,7 @@ static const struct intel_device_info mtl_info = {
>   .__runtime.media.ip.ver = 13,
>   PLATFORM(INTEL_METEORLAKE),
>   .extra_gt_list = xelpmp_extra_gt,
> + .has_aux_ccs = 1,
>   .has_flat_ccs = 0,
>   .has_gmd_id = 1,
>   .has_guc_deprivilege = 1,
> diff --git a/drivers/gpu/drm/i915/intel_device_info.h 
> b/drivers/gpu/drm/i915/intel_device_info.h
> index dbfe6443457b5..93485507506cc 100644
> --- a/drivers/gpu/drm/i915/intel_device_info.h
> +++ b/drivers/gpu/drm/i915/intel_device_info.h
> @@ -151,6 +151,7 @@ enum intel_ppgtt_type {
>   func(has_reset_engine); \
>   func(has_3d_pipeline); \
>   func(has_4tile); \
> + func(has_aux_ccs); \
>   func(has_flat_ccs); \
>   func(has_global_mocs); \
>   func(has_gmd_id); \
> -- 
> 2.40.1
> 

-- 
Matt Roper
Graphics Software Engineer
Linux GPU Platform Enablement
Intel Corporation


Re: [PATCH v1 5/5] drm/msm/dp: move of_dp_aux_populate_bus() to probe for eDP

2023-07-20 Thread Kuogee Hsieh



On 7/10/2023 11:24 AM, Dmitry Baryshkov wrote:

[Restored CC list]

On Mon, 10 Jul 2023 at 20:08, Kuogee Hsieh  wrote:


On 7/7/2023 5:32 PM, Dmitry Baryshkov wrote:

On 08/07/2023 02:52, Kuogee Hsieh wrote:

Move of_dp_aux_populate_bus() to dp_display_probe() for eDP
from dp_display_bind() so that probe deferral cases can be
handled effectively

Signed-off-by: Kuogee Hsieh 
---
   drivers/gpu/drm/msm/dp/dp_aux.c | 25 
   drivers/gpu/drm/msm/dp/dp_display.c | 79
+++--
   2 files changed, 65 insertions(+), 39 deletions(-)

diff --git a/drivers/gpu/drm/msm/dp/dp_aux.c
b/drivers/gpu/drm/msm/dp/dp_aux.c
index c592064..c1baffb 100644
--- a/drivers/gpu/drm/msm/dp/dp_aux.c
+++ b/drivers/gpu/drm/msm/dp/dp_aux.c
@@ -505,6 +505,21 @@ void dp_aux_unregister(struct drm_dp_aux *dp_aux)
   drm_dp_aux_unregister(dp_aux);
   }
   +static int dp_wait_hpd_asserted(struct drm_dp_aux *dp_aux,
+ unsigned long wait_us)
+{
+int ret;
+struct dp_aux_private *aux;
+
+aux = container_of(dp_aux, struct dp_aux_private, dp_aux);
+
+pm_runtime_get_sync(aux->dev);
+ret = dp_catalog_aux_wait_for_hpd_connect_state(aux->catalog);
+pm_runtime_put_sync(aux->dev);
+
+return ret;
+}
+
   struct drm_dp_aux *dp_aux_get(struct device *dev, struct dp_catalog
*catalog,
 bool is_edp)
   {
@@ -528,6 +543,16 @@ struct drm_dp_aux *dp_aux_get(struct device
*dev, struct dp_catalog *catalog,
   aux->catalog = catalog;
   aux->retry_cnt = 0;
   +/*
+ * Use the drm_dp_aux_init() to use the aux adapter
+ * before registering aux with the DRM device.
+ */
+aux->dp_aux.name = "dpu_dp_aux";
+aux->dp_aux.dev = dev;
+aux->dp_aux.transfer = dp_aux_transfer;
+aux->dp_aux.wait_hpd_asserted = dp_wait_hpd_asserted;
+drm_dp_aux_init(&aux->dp_aux);
+
   return &aux->dp_aux;
   }
   diff --git a/drivers/gpu/drm/msm/dp/dp_display.c
b/drivers/gpu/drm/msm/dp/dp_display.c
index 185f1eb..7ed4bea 100644
--- a/drivers/gpu/drm/msm/dp/dp_display.c
+++ b/drivers/gpu/drm/msm/dp/dp_display.c
@@ -302,10 +302,6 @@ static int dp_display_bind(struct device *dev,
struct device *master,
   goto end;
   }
   -pm_runtime_enable(dev);
-pm_runtime_set_autosuspend_delay(dev, 1000);
-pm_runtime_use_autosuspend(dev);
-
   return 0;
   end:
   return rc;
@@ -322,8 +318,6 @@ static void dp_display_unbind(struct device *dev,
struct device *master,
 kthread_stop(dp->ev_tsk);
   -of_dp_aux_depopulate_bus(dp->aux);
-
   dp_power_client_deinit(dp->power);
   dp_unregister_audio_driver(dev, dp->audio);
   dp_aux_unregister(dp->aux);
@@ -1245,6 +1239,29 @@ static const struct msm_dp_desc
*dp_display_get_desc(struct platform_device *pde
   return NULL;
   }
   +static void of_dp_aux_depopulate_bus_void(void *data)
+{
+of_dp_aux_depopulate_bus(data);
+}
+
+static int dp_display_auxbus_emulation(struct dp_display_private *dp)

Why is it called emulation?


+{
+struct device *dev = &dp->pdev->dev;
+struct device_node *aux_bus;
+int ret = 0;
+
+aux_bus = of_get_child_by_name(dev->of_node, "aux-bus");
+
+if (aux_bus) {
+ret = devm_of_dp_aux_populate_bus(dp->aux, NULL);

And here you missed the whole point of what we have been asking for.
Please add a sensible `done_probing' callback, which will call
component_add(). This way the DP component will only be registered
when the panel has been probed. Keeping us from the component binding
retries and corresponding side effects.
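
The requested flow can be sketched in plain userspace C as follows. All
mock_ functions are hypothetical stand-ins for
devm_of_dp_aux_populate_bus(), the panel probe and component_add(); only
the ordering is meant to be illustrative, not the real driver code.

```c
#include <stddef.h>

struct mock_aux;
typedef int (*done_probing_fn)(struct mock_aux *aux);

/* Hypothetical stand-in for the aux adapter plus the state we want to observe. */
struct mock_aux {
	done_probing_fn done_probing;
	int panel_probed;
	int component_added;
};

/* Stand-in for populating the aux bus: only remembers the callback. */
static int mock_populate_bus(struct mock_aux *aux, done_probing_fn cb)
{
	aux->done_probing = cb;
	return 0;
}

/* Stand-in for the panel finishing its probe at some later point. */
static int mock_panel_probe_finished(struct mock_aux *aux)
{
	aux->panel_probed = 1;
	if (aux->done_probing)
		return aux->done_probing(aux);

	return 0;
}

/* Stand-in for component_add(). */
static int mock_component_add(struct mock_aux *aux)
{
	aux->component_added = 1;
	return 0;
}

/* The done_probing callback: register the component only now. */
static int dp_auxbus_done_probe(struct mock_aux *aux)
{
	return mock_component_add(aux);
}

/* Probe path: with an aux-bus node, defer registration to the callback. */
static int dp_display_probe_sketch(struct mock_aux *aux, int has_aux_bus)
{
	if (has_aux_bus)
		return mock_populate_bus(aux, dp_auxbus_done_probe);

	/* No aux bus (plain DP): nothing to wait for, register right away. */
	return mock_component_add(aux);
}
```

In this sketch the component is never registered before the panel has
probed, which is what avoids the bind retries mentioned above.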


+
+devm_add_action_or_reset(dev, of_dp_aux_depopulate_bus_void,
+ dp->aux);

Useless, it's already handled by the devm_ part of the
devm_of_dp_aux_populate_bus().


+}
+
+return ret;
+}
+
   static int dp_display_probe(struct platform_device *pdev)
   {
   int rc = 0;
@@ -1290,8 +1307,18 @@ static int dp_display_probe(struct
platform_device *pdev)
 platform_set_drvdata(pdev, &dp->dp_display);
   +pm_runtime_enable(&pdev->dev);
+pm_runtime_set_autosuspend_delay(&pdev->dev, 1000);
+pm_runtime_use_autosuspend(&pdev->dev);

Can we have this in probe right from the patch #2?

No, at patch #2, devm_of_dp_aux_populate_bus() is done at bind time.

The device used by pm_runtime_get_sync() of generic_edp_panel_probe()
which is derived from devm_of_dp_aux_populate_bus() is different from the
&pdev->dev here.

Excuse me, I don't get your answer. In patch #2 you have added
pm_runtime_enable() / etc to dp_display_bind().
In this patch you are moving these calls to dp_display_probe(). I
think that the latter is a better place for enabling runtime PM and as
such I've asked you to squash this chunk into patch #2.
Why isn't that going to work?

If I'm not mistaken here, the panel's call to pm_runtime_get_sync()
will wake up the panel and all the parent devices, including the DP.
That's what I meant in my comment regarding PM calls in the patch #1.
pm

Re: [PATCH v5 9/9] drm: Introduce documentation for hotspot properties

2023-07-20 Thread Simon Ser
On Thursday, July 20th, 2023 at 21:28, Zack Rusin  wrote:

> On Thu, 2023-07-20 at 09:07 +, Simon Ser wrote:
> 
> > On Thursday, July 20th, 2023 at 10:50, Javier Martinez Canillas
> > javi...@redhat.com wrote:
> > 
> > > > On Thursday, July 20th, 2023 at 07:03, Zack Rusin za...@vmware.com 
> > > > wrote:
> > > > 
> > > > > I'll give this series a few more hours on the list and if no one 
> > > > > objects
> > > > > I'll push
> > > > > it to drm-misc later today. Thanks!
> > > > 
> > > > Sorry, but this doesn't seem to be enough to satisfy the DRM merge
> > > > requirements. This introduces a new uAPI but is missing user-space
> > > > patches and IGT. See 1 and 2.
> > > 
> > > Albert (Cc'ed) wrote IGT tests for this new uAPI and was waiting for
> > > Zack's patches to land to post them. I believe his branch is [0] but
> > > he can correct me if I'm wrong on that.
> > > 
> > > Zack also has mutter patches and Albert has been testing those too.
> > 
> > Ah, nice. Please do post all of these (without merging them) and
> > include links to them in the commit message. Posting is important
> > to make sure there are no gaps/mistakes in the tests and user-space
> > impl.
> 
> What should those links point to? Because my private mutter repository is 
> definitely
> not going to last long so I'm not sure if there's any point in putting that 
> in a
> kernel commit log. Or would you like the links to those in the cover letter?

The kernel docs say: "The userspace side must be fully reviewed and
tested to the standards of that userspace project".

So you need to open a merge request for mutter. Same for IGT.


Re: [PATCH v5 9/9] drm: Introduce documentation for hotspot properties

2023-07-20 Thread Zack Rusin
On Thu, 2023-07-20 at 09:07 +, Simon Ser wrote:
> On Thursday, July 20th, 2023 at 10:50, Javier Martinez Canillas
>  wrote:
>
> > > On Thursday, July 20th, 2023 at 07:03, Zack Rusin za...@vmware.com wrote:
> > >
> > > > I'll give this series a few more hours on the list and if no one objects
> > > > I'll push
> > > > it to drm-misc later today. Thanks!
> > >
> > > Sorry, but this doesn't seem to be enough to satisfy the DRM merge
> > > requirements. This introduces a new uAPI but is missing user-space
> > > patches and IGT. See 1 and 2.
> >
> >
> > Albert (Cc'ed) wrote IGT tests for this new uAPI and was waiting for
> > Zack's patches to land to post them. I believe his branch is [0] but
> > he can correct me if I'm wrong on that.
> >
> > Zack also has mutter patches and Albert has been testing those too.
> >
> > [0]:
> > https://gitlab.freedesktop.org/aesteve/igt-gpu-tools/-/commits/modeset-cursor-hotspot-test/
>
> Ah, nice. Please do post all of these (without merging them) and
> include links to them in the commit message. Posting is important
> to make sure there are no gaps/mistakes in the tests and user-space
> impl.

What should those links point to? Because my private mutter repository is 
definitely
not going to last long so I'm not sure if there's any point in putting that in a
kernel commit log. Or would you like the links to those in the cover letter?

z


Re: [PATCH v3 06/15] dt-bindings: display/msm: sc7180-dpu: Describe SM6125

2023-07-20 Thread Marijn Suijten
On 2023-07-20 01:24:27, Dmitry Baryshkov wrote:
> On Thu, 20 Jul 2023 at 01:09, Marijn Suijten
>  wrote:
> >
> > On 2023-07-19 01:06:03, Dmitry Baryshkov wrote:
> > > On 19/07/2023 00:24, Marijn Suijten wrote:
> > > > SM6125 is identical to SM6375 except that while downstream also defines
> > > > a throttle clock, its presence results in timeouts whereas SM6375
> > > > requires it to not observe any timeouts.  This is represented by
> > > > reducing the clock array length to 6 so that it cannot be passed.  Note
> > > > that any SoC other than SM6375 (currently SC7180 and SM6350) are
> > > > unconstrained and could either pass or leave out this "throttle" clock.
> > >
> > > Could you please describe, what kind of timeouts do you observe? Is this
> > > the DSI underruns issue?
> >
> > Ping-pong timeouts and low(er) framerate.  However, they were previously
> > not happening on a random boot out of tens... and now I can no longer
> > reproduce the timeout on 4 consecutive boots after adding the throttle
> > clock.  Could it perhaps be the power domains and opps that we added in
> > v2 and v3?
> 
> Quite unlikely, but who knows. My main question is whether we should
> continue skipping the throttle clocks or if it should be enabled now.

And that "main question" could ... drum roll please ... only be answered
by knowing if this got "accidentally" fixed by providing the right PMs
or some other change entirely while I changed base branch and defconfig.
Or if this is just a fluke that persisted multiple boots but will fall
apart in some time and/or when someone else runs this on their device?

- Marijn




Re: [Freedreno] [PATCH] drm/msm: Check for the GPU IOMMU during bind

2023-07-20 Thread Bjorn Andersson
On Mon, Jul 10, 2023 at 03:20:44AM +0530, Akhil P Oommen wrote:
> On Fri, Jul 07, 2023 at 08:27:18PM +0300, Dmitry Baryshkov wrote:
> > 
> > On 07/07/2023 18:03, Jordan Crouse wrote:
> > > On Thu, Jul 06, 2023 at 09:55:13PM +0300, Dmitry Baryshkov wrote:
> > > > 
> > > > On 10/03/2023 00:20, Jordan Crouse wrote:
> > > > > While booting with amd,imageon on a headless target the GPU probe was
> > > > > failing with -ENOSPC in get_pages() from msm_gem.c.
> > > > > 
> > > > > > Investigation showed that the driver was using the default 16MB VRAM
> > > > > > carveout because msm_use_mmu() was returning false since headless
> > > > > > devices use a dummy parent device. Avoid this by extending the
> > > > > > existing is_a2xx priv member to check the GPU IOMMU state on all
> > > > > > platforms and use that check in msm_use_mmu().
> > > > > > 
> > > > > > This works for memory allocations but it doesn't prevent the VRAM
> > > > > > carveout from being created because that happens before we have a
> > > > > > chance to check the GPU IOMMU state in adreno_bind.
> > > > > > 
> > > > > > There are a number of possible options to resolve this but none of
> > > > > > them are very clean. The easiest way is to likely specify vram=0 as
> > > > > > module parameter on headless devices so that the memory doesn't get
> > > > > > wasted.
> > > > 
> > > > This patch was on my plate for quite a while, please excuse me for
> > > > taking it so long.
> > > 
> > > No worries. I'm also chasing a bunch of other stuff too.
> > > 
> > > > I see the following problem with the current code. We have two different
> > > > instances that can access memory: MDP/DPU and GPU. And each of them can
> > > > either have or miss the MMU.
> > > > 
> > > > For some time I toyed with the idea of determining whether the allocated
> > > > BO is going to be used by display or by GPU, but then I abandoned it. We
> > > > can have display BOs being filled by GPU, so handling it this way would
> > > > complicate things a lot.
> > > > 
> > > > This actually rings a tiny bell in my head with the idea of splitting
> > > > the display and GPU parts to two different drivers, but I'm not sure
> > > > what would be the overall impact.
> > > 
> > > As I now exclusively work on headless devices I would be 100% for this,
> > > but I'm sure that our laptop friends might not agree :)
> > 
> > I do not know here. This is probably a question to Rob, as he better
> > understands the interaction between GPU and display parts of the userspace.
> 
> I fully support this if it is feasible.
> 

I second this.

> In our architecture, display and GPU are completely independent subsystems.
> Like Jordan mentioned, there are IOT products without display. And I wouldn't
> be surprised if there is a product with just display that uses software
> rendering.
> 

And we have SA8295P/SA8540P with two MDSS instances and one GPU.

Regards,
Bjorn


Patch "drm/client: Send hotplug event after registering a client" has been added to the 6.4-stable tree

2023-07-20 Thread gregkh


This is a note to let you know that I've just added the patch titled

drm/client: Send hotplug event after registering a client

to the 6.4-stable tree which can be found at:

http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
 drm-client-send-hotplug-event-after-registering-a-client.patch
and it can be found in the queue-6.4 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let  know about it.


From 27655b9bb9f0d9c32b8de8bec649b676898c52d5 Mon Sep 17 00:00:00 2001
From: Thomas Zimmermann 
Date: Mon, 10 Jul 2023 11:10:17 +0200
Subject: drm/client: Send hotplug event after registering a client
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

From: Thomas Zimmermann 

commit 27655b9bb9f0d9c32b8de8bec649b676898c52d5 upstream.

Generate a hotplug event after registering a client to allow the
client to configure its display. Remove the hotplug calls from the
existing clients for fbdev emulation. This change fixes a concurrency
bug between registering a client and receiving events from the DRM
core. The bug is present in the fbdev emulation of all drivers.

The fbdev emulation currently generates a hotplug event before
registering the client to the device. For each new output, the DRM
core sends an additional hotplug event to each registered client.

If the DRM core detects first output between sending the artificial
hotplug and registering the device, the output's hotplug event gets
lost. If this is the first output, the fbdev console display remains
dark. This has been observed with amdgpu and fbdev-generic.

Fix this by adding hotplug generation directly to the client's
register helper drm_client_register(). Registering the client and
receiving events are serialized by struct drm_device.clientlist_mutex.
So an output is either configured by the initial hotplug event, or
the client has already been registered.
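
As an illustration only (not part of the patch), the ordering problem
described above can be modeled in a few lines of single-threaded C. All
names here (model_dev, client_hotplug, core_output_appears, old_order,
new_order) are hypothetical stand-ins rather than DRM API, and the mutex is
represented simply by keeping registration and the initial hotplug adjacent:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a device with a single client slot. */
struct model_dev {
	bool client_registered;  /* client is on the device's client list */
	bool output_connected;   /* an output exists on the device */
	bool client_saw_output;  /* client has configured a display */
};

/* Deliver a hotplug event: the client looks at the current outputs. */
static void client_hotplug(struct model_dev *dev)
{
	if (dev->output_connected)
		dev->client_saw_output = true;
}

/* The core detects a new output and notifies registered clients only. */
static void core_output_appears(struct model_dev *dev)
{
	dev->output_connected = true;
	if (dev->client_registered)
		client_hotplug(dev);
}

/* Old order: artificial hotplug first, registration second. */
static bool old_order(void)
{
	struct model_dev dev = { 0 };

	client_hotplug(&dev);       /* no output yet, nothing configured */
	core_output_appears(&dev);  /* arrives in the gap: event is lost */
	dev.client_registered = true;
	return dev.client_saw_output;
}

/*
 * New order: register, then send the initial hotplug. In this model the
 * output can only appear before or after that pair, never in between.
 */
static bool new_order(bool output_before_register)
{
	struct model_dev dev = { 0 };

	if (output_before_register)
		core_output_appears(&dev);

	dev.client_registered = true;
	client_hotplug(&dev);

	if (!output_before_register)
		core_output_appears(&dev);

	return dev.client_saw_output;
}

/* Sanity check of the model itself. */
static void check_model(void)
{
	assert(!old_order());
	assert(new_order(true));
	assert(new_order(false));
}
```

In the model the old order leaves the client without a display, while the
new order configures it whether the output shows up before or after
registration.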

The bug was originally added in commit 6e3f17ee73f7 ("drm/fb-helper:
generic: Call drm_client_add() after setup is done"), in which adding
a client and receiving a hotplug event switched order. It was hidden,
as most hardware and drivers have at least one static output configured.
Other drivers didn't use the internal DRM client or still had struct
drm_mode_config_funcs.output_poll_changed set. That callback handled
hotplug events as well. After not setting the callback in amdgpu in
commit 0e3172bac3f4 ("drm/amdgpu: Don't set struct
drm_driver.output_poll_changed"), amdgpu did not show a framebuffer
console if output events got lost. The bug got copy-pasted from
fbdev-generic into the other fbdev emulation.

Reported-by: Moritz Duge 
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2649
Fixes: 6e3f17ee73f7 ("drm/fb-helper: generic: Call drm_client_add() after setup is done")
Fixes: 8ab59da26bc0 ("drm/fb-helper: Move generic fbdev emulation into separate source file")
Fixes: b79fe9abd58b ("drm/fbdev-dma: Implement fbdev emulation for GEM DMA helpers")
Fixes: 63c381552f69 ("drm/armada: Implement fbdev emulation as in-kernel client")
Fixes: 49953b70e7d3 ("drm/exynos: Implement fbdev emulation as in-kernel client")
Fixes: 8f1aaccb04b7 ("drm/gma500: Implement client-based fbdev emulation")
Fixes: 940b869c2f2f ("drm/msm: Implement fbdev emulation as in-kernel client")
Fixes: 9e69bcd88e45 ("drm/omapdrm: Implement fbdev emulation as in-kernel client")
Fixes: e317a69fe891 ("drm/radeon: Implement client-based fbdev emulation")
Fixes: 71ec16f45ef8 ("drm/tegra: Implement fbdev emulation as in-kernel client")
Fixes: 0e3172bac3f4 ("drm/amdgpu: Don't set struct drm_driver.output_poll_changed")
Signed-off-by: Thomas Zimmermann 
Tested-by: Moritz Duge 
Tested-by: Torsten Krah 
Tested-by: Paul Schyska 
Cc: Daniel Vetter 
Cc: David Airlie 
Cc: Noralf Trønnes 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Javier Martinez Canillas 
Cc: Russell King 
Cc: Inki Dae 
Cc: Seung-Woo Kim 
Cc: Kyungmin Park 
Cc: Krzysztof Kozlowski 
Cc: Patrik Jakobsson 
Cc: Rob Clark 
Cc: Abhinav Kumar 
Cc: Dmitry Baryshkov 
Cc: Tomi Valkeinen 
Cc: Alex Deucher 
Cc: "Christian König" 
Cc: "Pan, Xinhui" 
Cc: Thierry Reding 
Cc: Mikko Perttunen 
Cc: dri-devel@lists.freedesktop.org
Cc: linux-ker...@vger.kernel.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-samsung-...@vger.kernel.org
Cc: linux-arm-...@vger.kernel.org
Cc: freedr...@lists.freedesktop.org
Cc: amd-...@lists.freedesktop.org
Cc: linux-te...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc:  # v5.2+
Reviewed-by: Javier Martinez Canillas 
Reviewed-by: Dmitry Baryshkov  # msm
Link: https://patchwork.freedesktop.org/patch/msgid/20230710091029.27503-1-tzimmerm...@suse.de
[ Dropped changes to drivers/gpu/drm/armada/armada_fbdev.c as
  174c3c38e3a2 drm/armada: Initialize fbdev DRM client
  was introduced in 6.5-rc1 ]
Signed-off-by: Mario Limonciello 
Signed-off-by: Greg Kroah-Hartman 
---
 drivers/gpu/drm/drm

Patch "drm/client: Send hotplug event after registering a client" has been added to the 6.1-stable tree

2023-07-20 Thread gregkh


This is a note to let you know that I've just added the patch titled

drm/client: Send hotplug event after registering a client

to the 6.1-stable tree which can be found at:

http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
 drm-client-send-hotplug-event-after-registering-a-client.patch
and it can be found in the queue-6.1 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let  know about it.


From 27655b9bb9f0d9c32b8de8bec649b676898c52d5 Mon Sep 17 00:00:00 2001
From: Thomas Zimmermann 
Date: Mon, 10 Jul 2023 11:10:17 +0200
Subject: drm/client: Send hotplug event after registering a client
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

From: Thomas Zimmermann 

commit 27655b9bb9f0d9c32b8de8bec649b676898c52d5 upstream.

Generate a hotplug event after registering a client to allow the
client to configure its display. Remove the hotplug calls from the
existing clients for fbdev emulation. This change fixes a concurrency
bug between registering a client and receiving events from the DRM
core. The bug is present in the fbdev emulation of all drivers.

The fbdev emulation currently generates a hotplug event before
registering the client to the device. For each new output, the DRM
core sends an additional hotplug event to each registered client.

If the DRM core detects first output between sending the artificial
hotplug and registering the device, the output's hotplug event gets
lost. If this is the first output, the fbdev console display remains
dark. This has been observed with amdgpu and fbdev-generic.

Fix this by adding hotplug generation directly to the client's
register helper drm_client_register(). Registering the client and
receiving events are serialized by struct drm_device.clientlist_mutex.
So an output is either configured by the initial hotplug event, or
the client has already been registered.

The bug was originally added in commit 6e3f17ee73f7 ("drm/fb-helper:
generic: Call drm_client_add() after setup is done"), in which adding
a client and receiving a hotplug event switched order. It was hidden,
as most hardware and drivers have at least one static output configured.
Other drivers didn't use the internal DRM client or still had struct
drm_mode_config_funcs.output_poll_changed set. That callback handled
hotplug events as well. After not setting the callback in amdgpu in
commit 0e3172bac3f4 ("drm/amdgpu: Don't set struct
drm_driver.output_poll_changed"), amdgpu did not show a framebuffer
console if output events got lost. The bug got copy-pasted from
fbdev-generic into the other fbdev emulation.

Reported-by: Moritz Duge 
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2649
Fixes: 6e3f17ee73f7 ("drm/fb-helper: generic: Call drm_client_add() after setup is done")
Fixes: 8ab59da26bc0 ("drm/fb-helper: Move generic fbdev emulation into separate source file")
Fixes: b79fe9abd58b ("drm/fbdev-dma: Implement fbdev emulation for GEM DMA helpers")
Fixes: 63c381552f69 ("drm/armada: Implement fbdev emulation as in-kernel client")
Fixes: 49953b70e7d3 ("drm/exynos: Implement fbdev emulation as in-kernel client")
Fixes: 8f1aaccb04b7 ("drm/gma500: Implement client-based fbdev emulation")
Fixes: 940b869c2f2f ("drm/msm: Implement fbdev emulation as in-kernel client")
Fixes: 9e69bcd88e45 ("drm/omapdrm: Implement fbdev emulation as in-kernel client")
Fixes: e317a69fe891 ("drm/radeon: Implement client-based fbdev emulation")
Fixes: 71ec16f45ef8 ("drm/tegra: Implement fbdev emulation as in-kernel client")
Fixes: 0e3172bac3f4 ("drm/amdgpu: Don't set struct drm_driver.output_poll_changed")
Signed-off-by: Thomas Zimmermann 
Tested-by: Moritz Duge 
Tested-by: Torsten Krah 
Tested-by: Paul Schyska 
Cc: Daniel Vetter 
Cc: David Airlie 
Cc: Noralf Trønnes 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Javier Martinez Canillas 
Cc: Russell King 
Cc: Inki Dae 
Cc: Seung-Woo Kim 
Cc: Kyungmin Park 
Cc: Krzysztof Kozlowski 
Cc: Patrik Jakobsson 
Cc: Rob Clark 
Cc: Abhinav Kumar 
Cc: Dmitry Baryshkov 
Cc: Tomi Valkeinen 
Cc: Alex Deucher 
Cc: "Christian König" 
Cc: "Pan, Xinhui" 
Cc: Thierry Reding 
Cc: Mikko Perttunen 
Cc: dri-devel@lists.freedesktop.org
Cc: linux-ker...@vger.kernel.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-samsung-...@vger.kernel.org
Cc: linux-arm-...@vger.kernel.org
Cc: freedr...@lists.freedesktop.org
Cc: amd-...@lists.freedesktop.org
Cc: linux-te...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc:  # v5.2+
Reviewed-by: Javier Martinez Canillas 
Reviewed-by: Dmitry Baryshkov  # msm
Link: https://patchwork.freedesktop.org/patch/msgid/20230710091029.27503-1-tzimmerm...@suse.de
(cherry picked from commit 27655b9bb9f0d9c32b8de8bec649b676898c52d5)
[ Dropped changes to drivers/gpu/drm/armada/armada_fbdev.c as
  174c3c38e3a2 drm/armada: Initialize fbdev DRM client
  was introduced in 6.5-rc1.

  Dropped changes to exy

Re: [PATCH v2 06/11] drm/tests: helpers: Create a helper to allocate an atomic state

2023-07-20 Thread kernel test robot
Hi Maxime,

kernel test robot noticed the following build warnings:

[auto build test WARNING on c58c49dd89324b18a812762a2bfa5a0458e4f252]

url:    https://github.com/intel-lab-lkp/linux/commits/Maxime-Ripard/drm-tests-helpers-Switch-to-kunit-actions/20230720-191901
base:   c58c49dd89324b18a812762a2bfa5a0458e4f252
patch link:    https://lore.kernel.org/r/20230720-kms-kunit-actions-rework-v2-6-175017bd56ab%40kernel.org
patch subject: [PATCH v2 06/11] drm/tests: helpers: Create a helper to allocate an atomic state
config: arm64-randconfig-r022-20230720 (https://download.01.org/0day-ci/archive/20230721/202307210124.ur3unuxz-...@intel.com/config)
compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project.git 4a5ac14ee968ff0ad5d2cc1ffa0299048db4c88a)
reproduce: (https://download.01.org/0day-ci/archive/20230721/202307210124.ur3unuxz-...@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot 
| Closes: https://lore.kernel.org/oe-kbuild-all/202307210124.ur3unuxz-...@intel.com/

All warnings (new ones prefixed by >>):

   drivers/gpu/drm/tests/drm_kunit_helpers.c:54:6: warning: cast from 'void (*)(struct platform_driver *)' to 'kunit_action_t *' (aka 'void (*)(void *)') converts to incompatible function type [-Wcast-function-type-strict]
      54 |         (kunit_action_t *)platform_driver_unregister,
         |         ^~~~
   drivers/gpu/drm/tests/drm_kunit_helpers.c:62:6: warning: cast from 'void (*)(struct platform_device *)' to 'kunit_action_t *' (aka 'void (*)(void *)') converts to incompatible function type [-Wcast-function-type-strict]
      62 |         (kunit_action_t *)platform_device_put,
         |         ^
   drivers/gpu/drm/tests/drm_kunit_helpers.c:70:6: warning: cast from 'void (*)(struct platform_device *)' to 'kunit_action_t *' (aka 'void (*)(void *)') converts to incompatible function type [-Wcast-function-type-strict]
      70 |         (kunit_action_t *)platform_device_del,
         |         ^
   drivers/gpu/drm/tests/drm_kunit_helpers.c:90:9: warning: cast from 'void (*)(struct platform_device *)' to 'kunit_action_t *' (aka 'void (*)(void *)') converts to incompatible function type [-Wcast-function-type-strict]
      90 |          (kunit_action_t *)platform_device_unregister,
         |          ^~~~
   drivers/gpu/drm/tests/drm_kunit_helpers.c:94:9: warning: cast from 'void (*)(struct platform_driver *)' to 'kunit_action_t *' (aka 'void (*)(void *)') converts to incompatible function type [-Wcast-function-type-strict]
      94 |          (kunit_action_t *)platform_driver_unregister,
         |          ^~~~
>> drivers/gpu/drm/tests/drm_kunit_helpers.c:192:6: warning: cast from 'void (*)(struct drm_atomic_state *)' to 'kunit_action_t *' (aka 'void (*)(void *)') converts to incompatible function type [-Wcast-function-type-strict]
     192 |         (kunit_action_t *)drm_atomic_state_put,
         |         ^~
   6 warnings generated.
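
One common way to avoid this class of warning is to register a small wrapper
that already has the expected 'void (*)(void *)' signature instead of casting
the typed function. The sketch below is plain userspace C, not the DRM test
code: action_t stands in for kunit_action_t, and fake_state, fake_state_put
and run_action are hypothetical names used only for illustration.

```c
#include <assert.h>

/* Stand-in for kunit_action_t, i.e. a function taking void *. */
typedef void (action_t)(void *);

/* Hypothetical refcounted object with a typed release function. */
struct fake_state {
	int refcount;
};

static void fake_state_put(struct fake_state *state)
{
	state->refcount--;
}

/*
 * Wrapper with exactly the signature the action type expects. The
 * void * to struct pointer conversion happens on the argument, so no
 * function-pointer cast is needed.
 */
static void fake_state_put_action(void *ptr)
{
	fake_state_put(ptr);
}

/* Stand-in for the deferred-action machinery: just calls the action. */
static void run_action(action_t *action, void *ctx)
{
	action(ctx);
}

/* Returns the refcount left after the action ran on a fresh object. */
static int demo_refcount_after_action(void)
{
	struct fake_state state = { .refcount = 1 };

	run_action(fake_state_put_action, &state);
	assert(state.refcount == 0);
	return state.refcount;
}
```

The wrapper costs one extra function per cleanup routine, but the call
through the action pointer then matches the callee's real type.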


vim +192 drivers/gpu/drm/tests/drm_kunit_helpers.c

   164  
   165  /**
   166   * drm_kunit_helper_atomic_state_alloc - Allocates an atomic state
   167   * @test: The test context object
   168   * @drm: The device to alloc the state for
   169   * @ctx: Locking context for that atomic update
   170   *
   171   * Allocates a empty atomic state.
   172   *
   173   * The state is tied to the kunit test context, so we must not call
   174   * drm_atomic_state_put() on it, it will be done so automatically.
   175   *
   176   * Returns:
   177   * An ERR_PTR on error, a pointer to the newly allocated state otherwise
   178   */
   179  struct drm_atomic_state *
   180  drm_kunit_helper_atomic_state_alloc(struct kunit *test,
   181  struct drm_device *drm,
   182  struct drm_modeset_acquire_ctx *ctx)
   183  {
   184  struct drm_atomic_state *state;
   185  int ret;
   186  
   187  state = drm_atomic_state_alloc(drm);
   188  if (!state)
   189  return ERR_PTR(-ENOMEM);
   190  
   191  ret = kunit_a

[drm-misc:for-linux-next 2/2] drivers/gpu/drm/drm_debugfs.c:212:33: sparse: sparse: non size-preserving pointer to integer cast

2023-07-20 Thread kernel test robot
tree:   git://anongit.freedesktop.org/drm/drm-misc for-linux-next
head:   c7a472297169156252a50d76965eb36b081186e2
commit: 4f66feeab173bd73e71028b8c2e1dcea07e32dd5 [2/2] drm: debugfs: provide infrastructure to dump a DRM GPU VA space
config: i386-randconfig-r092-20230720 (https://download.01.org/0day-ci/archive/20230721/202307210230.t2onm5g0-...@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce: (https://download.01.org/0day-ci/archive/20230721/202307210230.t2onm5g0-...@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot 
| Closes: https://lore.kernel.org/oe-kbuild-all/202307210230.t2onm5g0-...@intel.com/

sparse warnings: (new ones prefixed by >>)
>> drivers/gpu/drm/drm_debugfs.c:212:33: sparse: sparse: non size-preserving pointer to integer cast

vim +212 drivers/gpu/drm/drm_debugfs.c

   178  
   179  /**
   180   * drm_debugfs_gpuva_info - dump the given DRM GPU VA space
   181   * @m: pointer to the &seq_file to write
   182   * @mgr: the &drm_gpuva_manager representing the GPU VA space
   183   *
   184   * Dumps the GPU VA mappings of a given DRM GPU VA manager.
   185   *
   186   * For each DRM GPU VA space drivers should call this function from their
   187   * &drm_info_list's show callback.
   188   *
   189   * Returns: 0 on success, -ENODEV if the &mgr is not initialized
   190   */
   191  int drm_debugfs_gpuva_info(struct seq_file *m,
   192 struct drm_gpuva_manager *mgr)
   193  {
   194  struct drm_gpuva *va, *kva = &mgr->kernel_alloc_node;
   195  
   196  if (!mgr->name)
   197  return -ENODEV;
   198  
   199  seq_printf(m, "DRM GPU VA space (%s) [0x%016llx;0x%016llx]\n",
   200 mgr->name, mgr->mm_start, mgr->mm_start + mgr->mm_range);
   201  seq_printf(m, "Kernel reserved node [0x%016llx;0x%016llx]\n",
   202 kva->va.addr, kva->va.addr + kva->va.range);
   203  seq_puts(m, "\n");
   204  seq_puts(m, " VAs | start  | range  | end| object | object offset\n");
   205  seq_puts(m, "-\n");
   206  drm_gpuva_for_each_va(va, mgr) {
   207  if (unlikely(va == kva))
   208  continue;
   209  
   210  seq_printf(m, " | 0x%016llx | 0x%016llx | 0x%016llx | 0x%016llx | 0x%016llx\n",
   211 va->va.addr, va->va.range, va->va.addr + va->va.range,
 > 212 (u64)va->gem.obj, va->gem.offset);
   213  }
   214  
   215  return 0;
   216  }
   217  EXPORT_SYMBOL(drm_debugfs_gpuva_info);
   218  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


Re: [PATCH] drm/panel: simple: Drop prepared_time

2023-07-20 Thread Doug Anderson
Hi,

On Tue, Jul 18, 2023 at 7:23 AM Doug Anderson  wrote:
>
> Hi,
>
> On Sun, Jul 9, 2023 at 9:23 AM Marek Vasut  wrote:
> >
> > This has been superseded by RPM in commit
> > 3235b0f20a0a ("drm/panel: panel-simple: Use runtime pm to avoid excessive unprepare / prepare")
> > and the last user of the non-RPM case has been dropped in commit
> > b6d5ffce11dd ("drm/panel-simple: Non-eDP panels don't need "HPD" handling")
> > whatever is left in this driver is just assigned and never used.
> > Drop the remaining parts.
> >
> > Signed-off-by: Marek Vasut 
> > ---
> > Cc: Daniel Vetter 
> > Cc: David Airlie 
> > Cc: Douglas Anderson 
> > Cc: Neil Armstrong 
> > Cc: Sam Ravnborg 
> > Cc: dri-devel@lists.freedesktop.org
> > ---
> >  drivers/gpu/drm/panel/panel-simple.c | 4 
> >  1 file changed, 4 deletions(-)
>
> Thanks for cleaning this up.
>
> Reviewed-by: Douglas Anderson 
>
> I'm happy to apply this or happy for others to. I'll wait a few more
> days and then apply it if nobody else has.

As promised, I've pushed this to drm-misc-next.

c942e935ff3f drm/panel: simple: Drop prepared_time

I had to manually resolve the context conflict since this patch only
applied cleanly atop your other patch [1]. However, the resolution was
trivial.

[1] https://lore.kernel.org/all/20230709135231.449636-1-ma...@denx.de/


Re: [PATCH] drm/syncobj: add DRM_IOCTL_SYNCOBJ_IMPORT/EXPORT_SYNC_FILE

2023-07-20 Thread Erik Kurzinger



On 7/20/23 02:43, Simon Ser wrote:
> On Wednesday, July 19th, 2023 at 19:05, Erik Kurzinger  wrote:
> 
>> These new ioctls perform a task similar to
>> DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD/FD_TO_HANDLE with the
>> IMPORT/EXPORT_SYNC_FILE flag set, except that they allow specifying the
>> timeline point to import or export the fence to or from on a timeline
>> syncobj.
>>
>> This eliminates the need to use a temporary binary syncobj along with
>> DRM_IOCTL_SYNCOBJ_TRANSFER to achieve such a thing, which is the
>> technique userspace has had to employ up to this point. While that does
>> work, it is rather awkward from the programmer's perspective.  Since DRM
>> syncobjs have been proposed as the basis for display server explicit
>> synchronization protocols, e.g. [1] and [2], providing a more
>> streamlined interface now seems worthwhile.
> 
> This looks like a good idea to me! The patch looks good as well, apart
> from one tricky issue, see below...
> 
>> [1] https://gitlab.freedesktop.org/wayland/wayland-protocols/-/merge_requests/90
>> [2] https://gitlab.freedesktop.org/xorg/xserver/-/merge_requests/967
>>
>> Accompanying userspace patches...
>> IGT: https://gitlab.freedesktop.org/ekurzinger/igt-gpu-tools/-/commit/241e7f379aeaa9b22a32277e77ad4011c8717a57
>> libdrm: https://gitlab.freedesktop.org/ekurzinger/drm/-/commit/b3961a592fc6f8b05f7e3a12413fb58eca2dbfa2
> 
> (Unfortunately this isn't enough when it comes to user-space patches: the
> kernel rules require a "real" user of the new IOCTL, not just a libdrm IOCTL
> wrapper. I will post a patch to make use of this from wlroots if that helps.)
> 

Thanks for taking a look, Simon. If that's the case I could also update my
Xwayland MR to use the new functions.

>> Signed-off-by: Erik Kurzinger 
>> ---
>>  drivers/gpu/drm/drm_internal.h |  4 +++
>>  drivers/gpu/drm/drm_ioctl.c|  4 +++
>>  drivers/gpu/drm/drm_syncobj.c  | 60 ++
>>  include/uapi/drm/drm.h |  9 +
>>  4 files changed, 71 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
>> index d7e023bbb0d5..64a28ed26a16 100644
>> --- a/drivers/gpu/drm/drm_internal.h
>> +++ b/drivers/gpu/drm/drm_internal.h
>> @@ -253,6 +253,10 @@ int drm_syncobj_timeline_signal_ioctl(struct drm_device *dev, void *data,
>>struct drm_file *file_private);
>>  int drm_syncobj_query_ioctl(struct drm_device *dev, void *data,
>>  struct drm_file *file_private);
>> +int drm_syncobj_import_sync_file_ioctl(struct drm_device *dev, void *data,
>> +   struct drm_file *file_private);
>> +int drm_syncobj_export_sync_file_ioctl(struct drm_device *dev, void *data,
>> +   struct drm_file *file_private);
>>
>>  /* drm_framebuffer.c */
>>  void drm_framebuffer_print_info(struct drm_printer *p, unsigned int indent,
>> diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c
>> index 7c9d66ee917d..0344e8e447bc 100644
>> --- a/drivers/gpu/drm/drm_ioctl.c
>> +++ b/drivers/gpu/drm/drm_ioctl.c
>> @@ -710,6 +710,10 @@ static const struct drm_ioctl_desc drm_ioctls[] = {
>>DRM_RENDER_ALLOW),
>>  DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_QUERY, drm_syncobj_query_ioctl,
>>DRM_RENDER_ALLOW),
>> +DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_IMPORT_SYNC_FILE, drm_syncobj_import_sync_file_ioctl,
>> +  DRM_RENDER_ALLOW),
>> +DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_EXPORT_SYNC_FILE, drm_syncobj_export_sync_file_ioctl,
>> +  DRM_RENDER_ALLOW),
>>  DRM_IOCTL_DEF(DRM_IOCTL_CRTC_GET_SEQUENCE, drm_crtc_get_sequence_ioctl, 0),
>>  DRM_IOCTL_DEF(DRM_IOCTL_CRTC_QUEUE_SEQUENCE, drm_crtc_queue_sequence_ioctl, 0),
>>  DRM_IOCTL_DEF(DRM_IOCTL_MODE_CREATE_LEASE, drm_mode_create_lease_ioctl, DRM_MASTER),
>> diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
>> index 0c2be8360525..bf0c1eae353a 100644
>> --- a/drivers/gpu/drm/drm_syncobj.c
>> +++ b/drivers/gpu/drm/drm_syncobj.c
>> @@ -181,6 +181,13 @@
>>   * Note that if you want to transfer a struct &dma_fence_chain from a given
>>   * point on a timeline syncobj from/into a binary syncobj, you can use the
>>   * point 0 to mean take/replace the fence in the syncobj.
>> + *
>> + * &DRM_IOCTL_SYNCOBJ_IMPORT_SYNC_FILE and &DRM_IOCTL_SYNCOBJ_EXPORT_SYNC_FILE
>> + * let the client import or export the struct &dma_fence_chain of a syncobj
>> + * at a particular timeline point from or to a sync file.
>> + * These behave similarly to &DRM_SYNCOBJ_FD_TO_HANDLE_FLAGS_IMPORT_SYNC_FILE
>> + * and &DRM_SYNCOBJ_HANDLE_TO_FD_FLAGS_EXPORT_SYNC_FILE described above, except
>> + * that they accommodate timeline syncobjs in addition to binary syncobjs.
>>   */
>>
>>  #include 
>> @@ -682,10 +689,11 @@ static int drm_syncobj_fd_to_handle(struct drm_file 
>> *file_priva

Re: [RFC v5 00/17] DRM cgroup controller with scheduling control and memory stats

2023-07-20 Thread T.J. Mercier
On Thu, Jul 20, 2023 at 3:55 AM Tvrtko Ursulin
 wrote:
>
>
> Hi,
>
> On 19/07/2023 21:31, T.J. Mercier wrote:
> > On Wed, Jul 12, 2023 at 4:47 AM Tvrtko Ursulin
> >  wrote:
> >>
> >>drm.memory.stat
> >>  A nested file containing cumulative memory statistics for the whole
> >>  sub-hierarchy, broken down into separate GPUs and separate memory
> >>  regions supported by the latter.
> >>
> >>  For example::
> >>
> >>$ cat drm.memory.stat
> >>card0 region=system total=12898304 shared=0 active=0 resident=12111872 purgeable=167936
> >>card0 region=stolen-system total=0 shared=0 active=0 resident=0 purgeable=0
> >>
> >>  Card designation corresponds to the DRM device names and multiple line
> >>  entries can be present per card.
> >>
> >>  Memory region names should be expected to be driver specific with the
> >>  exception of 'system' which is standardised and applicable for GPUs
> >>  which can operate on system memory buffers.
> >>
> >>  Sub-keys 'resident' and 'purgeable' are optional.
> >>
> >>  Per category region usage is reported in bytes.
> >>
> >>   * Feedback from people interested in drm.active_us and drm.memory.stat is
> >> required to understand the use cases and their usefulness (of the fields).
> >>
> >> Memory stats are something which was easy to add to my series, since I was
> >> already working on the fdinfo memory stats patches, but the question is how
> >> useful it is.
> >>
> > Hi Tvrtko,
> >
> > I think this style of driver-defined categories for reporting of
> > memory could potentially allow us to eliminate the GPU memory tracking
> > tracepoint used on Android (gpu_mem_total). This would involve reading
> > drm.memory.stat at the root cgroup (I see it's currently disabled on
>
> I can put it available under root too, don't think there is any
> technical reason to not have it. In fact, now that I look at it again,
> memory.stat is present on root so that would align with my general
> guideline to keep the two as similar as possible.
>
> > the root), which means traversing the whole cgroup tree under the
> > cgroup lock to generate the values on-demand. This would be done
> > rarely, but I still wonder what the cost of that would turn out to be.
>
> Yeah that's ugly. I could eliminate cgroup_lock by being a bit smarter.
> Just didn't think it worth it for the RFC.
>
> Basically to account memory stats for any sub-tree I need the equivalent
> one struct drm_memory_stats per DRM device present in the hierarchy. So
> I could pre-allocate a few and restart if run out of spares, or
> something. They are really small so pre-allocating a good number, based
> on past state or something, should be good enough. Or even total
> number of DRM devices in a system as a pessimistic and safe option for
> most reasonable deployments.
>
> > The drm_memory_stats categories in the output don't seem like a big
> > value-add for this use-case, but no real objection to them being
>
> You mean the fact there are different categories is not a value add for
> your use case because you would only use one?
>
Exactly, I guess that'd be just "private" (or pick another one) for
the driver-defined "regions" where
shared/private/resident/purgeable/active aren't really applicable.
That doesn't seem like a big problem to me since you already need an
understanding of what a driver-defined region means. It's just adding
a requirement to understand what fields are used, and a driver can
document that in the same place as the region itself. That does mean
performing arithmetic on values from different drivers might not make
sense. But this is just my perspective from trying to fit the
gpu_mem_total tracepoint here. I think we could probably change the
way drivers that use it report memory to fit closer into the
drm_memory_stats categories.
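
To make the "which fields are used" point concrete, here is a rough sketch
of how a userspace consumer might pull one sub-key out of a drm.memory.stat
line in the format quoted above. The helper names (stat_get_u64, demo_total)
are made up for illustration, and the line format is taken from the RFC
text, so it may still change.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look up "key=<number>" in one space-separated drm.memory.stat line.
 * Returns 0 and stores the value on success, -1 if the key is absent
 * or not numeric (sub-keys such as 'resident' are optional).
 */
static int stat_get_u64(const char *line, const char *key, uint64_t *out)
{
	size_t klen = strlen(key);
	const char *p = line;

	assert(line && key && out);

	for (;;) {
		/* Skip separators, then measure the current token. */
		while (*p == ' ' || *p == '\n')
			p++;
		size_t tlen = strcspn(p, " \n");

		if (tlen == 0)
			return -1;	/* end of line, key not found */

		if (tlen > klen && p[klen] == '=' &&
		    strncmp(p, key, klen) == 0) {
			char *end;
			unsigned long long v = strtoull(p + klen + 1, &end, 10);

			if (end == p + klen + 1)
				return -1;	/* value is not a number */
			*out = v;
			return 0;
		}
		p += tlen;
	}
}

/* Sums 'total' over the two example lines from the RFC text. */
static uint64_t demo_total(void)
{
	const char *lines[] = {
		"card0 region=system total=12898304 shared=0 active=0 "
		"resident=12111872 purgeable=167936",
		"card0 region=stolen-system total=0 shared=0 active=0 "
		"resident=0 purgeable=0",
	};
	uint64_t sum = 0, v;

	for (size_t i = 0; i < sizeof(lines) / sizeof(lines[0]); i++)
		if (stat_get_u64(lines[i], "total", &v) == 0)
			sum += v;
	return sum;
}
```

A consumer that only cares about a single figure per region would call
stat_get_u64() with whichever key the driver documents for that region.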

> The idea was to align 1:1 with DRM memory stats fdinfo and somewhat
> emulate how memory.stat also offers a breakdown.
>
> > there. I know it's called the DRM cgroup controller, but it'd be nice
> > if there were a way to make the mem tracking part work for any driver
> > that wishes to participate as many of our devices don't use a DRM
> > driver. But making that work doesn't look like it would fit very
>
> Ah that would be a challenge indeed to which I don't have any answers
> right now.
>
> Hm if you have a DRM device somewhere in the chain memory stats would
> still show up. Like if you had a dma-buf producer which is not a DRM
> driver, but then that buffer was imported by a DRM driver, it would show
> up in a cgroup. Or vice-versa. But if there aren't any in the whole
> chain then it would not.
>
Creating a dummy DRM driver underneath an existing driver as an
adaptation layer also came to mind, but yeah... probably not. :)

By the way I still want to try to add tracking for dma-bufs ba

Re: [PATCH v2 01/11] drm/tests: helpers: Switch to kunit actions

2023-07-20 Thread kernel test robot
Hi Maxime,

kernel test robot noticed the following build warnings:

[auto build test WARNING on c58c49dd89324b18a812762a2bfa5a0458e4f252]

url:    https://github.com/intel-lab-lkp/linux/commits/Maxime-Ripard/drm-tests-helpers-Switch-to-kunit-actions/20230720-191901
base:   c58c49dd89324b18a812762a2bfa5a0458e4f252
patch link:    https://lore.kernel.org/r/20230720-kms-kunit-actions-rework-v2-1-175017bd56ab%40kernel.org
patch subject: [PATCH v2 01/11] drm/tests: helpers: Switch to kunit actions
config: arm64-randconfig-r022-20230720 (https://download.01.org/0day-ci/archive/20230721/202307210148.7gwzlotn-...@intel.com/config)
compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project.git 4a5ac14ee968ff0ad5d2cc1ffa0299048db4c88a)
reproduce: (https://download.01.org/0day-ci/archive/20230721/202307210148.7gwzlotn-...@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot 
| Closes: https://lore.kernel.org/oe-kbuild-all/202307210148.7gwzlotn-...@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/gpu/drm/tests/drm_kunit_helpers.c:53:6: warning: cast from 'void (*)(struct platform_driver *)' to 'kunit_action_t *' (aka 'void (*)(void *)') converts to incompatible function type [-Wcast-function-type-strict]
      53 | (kunit_action_t *)platform_driver_unregister,
         | ^~~~
>> drivers/gpu/drm/tests/drm_kunit_helpers.c:61:6: warning: cast from 'void (*)(struct platform_device *)' to 'kunit_action_t *' (aka 'void (*)(void *)') converts to incompatible function type [-Wcast-function-type-strict]
      61 | (kunit_action_t *)platform_device_put,
         | ^
   drivers/gpu/drm/tests/drm_kunit_helpers.c:69:6: warning: cast from 'void (*)(struct platform_device *)' to 'kunit_action_t *' (aka 'void (*)(void *)') converts to incompatible function type [-Wcast-function-type-strict]
      69 | (kunit_action_t *)platform_device_del,
         | ^
   drivers/gpu/drm/tests/drm_kunit_helpers.c:89:9: warning: cast from 'void (*)(struct platform_device *)' to 'kunit_action_t *' (aka 'void (*)(void *)') converts to incompatible function type [-Wcast-function-type-strict]
      89 |  (kunit_action_t *)platform_device_unregister,
         |  ^~~~
   drivers/gpu/drm/tests/drm_kunit_helpers.c:93:9: warning: cast from 'void (*)(struct platform_driver *)' to 'kunit_action_t *' (aka 'void (*)(void *)') converts to incompatible function type [-Wcast-function-type-strict]
      93 |  (kunit_action_t *)platform_driver_unregister,
         |  ^~~~
   5 warnings generated.
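
One common way to avoid -Wcast-function-type-strict is to register a small wrapper whose type is exactly 'void (*)(void *)' instead of casting the typed function. What follows is only a minimal userspace sketch of that pattern: action_t, struct fake_driver and all helper names are hypothetical stand-ins, not the KUnit or platform-bus API, and it is not necessarily how this series ends up addressing the warning.

```c
/* Stand-in for kunit_action_t: a callback taking an untyped context. */
typedef void (action_t)(void *);

/* Hypothetical object with a typed teardown function. */
struct fake_driver {
	int registered;
};

static void fake_driver_unregister(struct fake_driver *drv)
{
	drv->registered = 0;
}

/*
 * Wrapper whose type matches action_t exactly. The conversion from
 * void * to struct fake_driver * is implicit in C, so no function
 * pointer cast is needed where the action is registered.
 */
static void fake_driver_unregister_action(void *ptr)
{
	fake_driver_unregister(ptr);
}

/* Stand-in for the deferred cleanup machinery: just invokes the action. */
static void run_action(action_t *action, void *ctx)
{
	action(ctx);
}
```

Calling run_action(fake_driver_unregister_action, &drv) then goes through a function pointer of the matching type, which is what the strict variant of the warning checks for.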


vim +53 drivers/gpu/drm/tests/drm_kunit_helpers.c

28  
29  /**
30   * drm_kunit_helper_alloc_device - Allocate a mock device for a KUnit test
31   * @test: The test context object
32   *
33   * This allocates a fake struct &device to create a mock for a KUnit
34   * test. The device will also be bound to a fake driver. It will thus be
35   * able to leverage the usual infrastructure and most notably the
36   * device-managed resources just like a "real" device.
37   *
38   * Resources will be cleaned up automatically, but the removal can be
39   * forced using @drm_kunit_helper_free_device.
40   *
41   * Returns:
42   * A pointer to the new device, or an ERR_PTR() otherwise.
43   */
44  struct device *drm_kunit_helper_alloc_device(struct kunit *test)
45  {
46  struct platform_device *pdev;
47  int ret;
48  
49  ret = platform_driver_register(&fake_platform_driver);
50  KUNIT_ASSERT_EQ(test, ret, 0);
51  
52  ret = kunit_add_action_or_reset(test,
  > 53  (kunit_action_t *)platform_driver_unregister,
54  &fake_platform_driver);
55  KUNIT_ASSERT_EQ(test, ret, 0);
56  
57  pdev = platform_device_alloc(KUNIT_DEVICE_NAME, PLATFORM_DEVID_NONE);
58  KUNIT_ASSERT_NOT_ERR_OR_NULL(test, pdev);
59  
60  ret = kunit_add_action_or_reset(test,
  > 61   

Re: [PATCH] drm/msm: Check for the GPU IOMMU during bind

2023-07-20 Thread Dmitry Baryshkov

On 20/07/2023 18:52, Rob Clark wrote:

On Thu, Jul 6, 2023 at 11:55 AM Dmitry Baryshkov
 wrote:


On 10/03/2023 00:20, Jordan Crouse wrote:

While booting with amd,imageon on a headless target the GPU probe was
failing with -ENOSPC in get_pages() from msm_gem.c.

Investigation showed that the driver was using the default 16MB VRAM
carveout because msm_use_mmu() was returning false since headless devices
use a dummy parent device. Avoid this by extending the existing is_a2xx
priv member to check the GPU IOMMU state on all platforms and use that
check in msm_use_mmu().

This works for memory allocations but it doesn't prevent the VRAM carveout
from being created because that happens before we have a chance to check
the GPU IOMMU state in adreno_bind.

There are a number of possible options to resolve this but none of them are
very clean. The easiest way is to likely specify vram=0 as module parameter
on headless devices so that the memory doesn't get wasted.


This patch was on my plate for quite a while, please excuse me for
taking it so long.

I see the following problem with the current code. We have two different
instances that can access memory: MDP/DPU and GPU. And each of them can
either have or lack an MMU.

For some time I toyed with the idea of determining whether the allocated
BO is going to be used by display or by GPU, but then I abandoned it. We
can have display BOs being filled by GPU, so handling it this way would
complicate things a lot.


There is MSM_BO_SCANOUT .. but it wouldn't completely surprise me if
it is missing in some place where it should be set.  But
that is the hint that contiguous allocation should be used if the
display doesn't support some sort of iommu.  (Using a GPU without some
sort of mmu/iommu isn't something sane to do.. the only reason the
support for that exists at all is to aid bringup.  I wouldn't call
that a "supported" configuration.)


This actually rings a tiny bell in my head with the idea of splitting
the display and GPU parts to two different drivers, but I'm not sure
what would be the overall impact.


Userspace does have better support for split display/gpu these days
than it did when drm/msm was first merged.  It _might_ just work if
one device only advertised DRIVER_RENDER and the other
MODESET/ATOMIC.. but I'd be a bit concerned about breaking things.  I
guess you could try some sort of kconfig knob to have two "msm"
devices and see what breaks, but I'm a bit skeptical that we could
make this the default anytime soon.


Thanks. Yes, breaking userspace would be a bad thing. I do not know if 
we should consider a single GPU+KMS driver to be an ABI and thus set in 
stone.




For now, just addressing the only-display and only-gpu cases
(continuing with the single device arrangement when you have both
display and gpu), maybe split up drm_dev_alloc() and drm_dev_init() so
that we could use drm_device::driver_features to mask out
DRIVER_RENDER if needed.


Yep. I'll continue following that path.



BR,
-R


More on the msm_use_mmu() below.



Signed-off-by: Jordan Crouse 
---

   drivers/gpu/drm/msm/adreno/adreno_device.c | 6 +-
   drivers/gpu/drm/msm/msm_drv.c  | 7 +++
   drivers/gpu/drm/msm/msm_drv.h  | 2 +-
   3 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c 
b/drivers/gpu/drm/msm/adreno/adreno_device.c
index 36f062c7582f..4f19da28f80f 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_device.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
@@ -539,7 +539,11 @@ static int adreno_bind(struct device *dev, struct device 
*master, void *data)
   DBG("Found GPU: %u.%u.%u.%u", config.rev.core, config.rev.major,
   config.rev.minor, config.rev.patchid);

- priv->is_a2xx = config.rev.core == 2;
+ /*
+  * A2xx has a built in IOMMU and all other IOMMU enabled targets will
+  * have an ARM IOMMU attached
+  */
+ priv->has_gpu_iommu = config.rev.core == 2 || device_iommu_mapped(dev);
   priv->has_cached_coherent = config.rev.core >= 6;

   gpu = info->init(drm);
diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index aca48c868c14..a125a351ec90 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -318,11 +318,10 @@ bool msm_use_mmu(struct drm_device *dev)
   struct msm_drm_private *priv = dev->dev_private;

   /*
-  * a2xx comes with its own MMU
-  * On other platforms IOMMU can be declared specified either for the
-  * MDP/DPU device or for its parent, MDSS device.
+  * Return true if the GPU or the MDP/DPU or parent MDSS device has an
+  * IOMMU
*/
- return priv->is_a2xx ||
+ return priv->has_gpu_iommu ||
   device_iommu_mapped(dev->dev) ||
   device_iommu_mapped(dev->dev->parent);
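
The decision made by the hunk above can be modeled in isolation. This is a minimal userspace sketch, assuming a hypothetical struct mmu_state that snapshots the inputs the driver reads (the GPU core revision and the three device_iommu_mapped() results); it is not driver code.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the state the driver consults. */
struct mmu_state {
	unsigned int gpu_core;		/* config.rev.core in the patch */
	bool gpu_iommu_mapped;		/* IOMMU attached to the GPU device */
	bool display_iommu_mapped;	/* IOMMU attached to the MDP/DPU device */
	bool parent_iommu_mapped;	/* IOMMU attached to the parent MDSS device */
};

/* Mirrors priv->has_gpu_iommu: a2xx has a built-in MMU. */
static bool has_gpu_iommu(const struct mmu_state *s)
{
	return s->gpu_core == 2 || s->gpu_iommu_mapped;
}

/* Mirrors the patched msm_use_mmu(): any of the three sources suffices. */
static bool use_mmu(const struct mmu_state *s)
{
	return has_gpu_iommu(s) ||
	       s->display_iommu_mapped ||
	       s->parent_iommu_mapped;
}
```

In this model a headless target with a GPU IOMMU and no display IOMMU yields true, which is the case the commit message describes as previously falling back to the VRAM carveout.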


I have a general feeling that neither the old nor the new code is fully correct.
Please correct me if I'm wrong:

We should be 

[linux-next:master] BUILD REGRESSION c58c49dd89324b18a812762a2bfa5a0458e4f252

2023-07-20 Thread kernel test robot
tree/branch: 
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
branch HEAD: c58c49dd89324b18a812762a2bfa5a0458e4f252  Add linux-next specific 
files for 20230720

Error/Warning reports:

https://lore.kernel.org/oe-kbuild-all/202306260401.qzlyqpv2-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202307160328.p79qwzob-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202307181450.sfbuvmf5-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202307201439.a9marfeq-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202307202051.kikxdex1-...@intel.com
https://lore.kernel.org/oe-kbuild-all/202307210041.jt6femhs-...@intel.com

Error/Warning: (recently discovered and may have been fixed)

../lib/gcc/loongarch64-linux/12.3.0/plugin/include/config/loongarch/loongarch-opts.h:31:10:
 fatal error: loongarch-def.h: No such file or directory
drivers/mfd/max77541.c:176:18: warning: cast to smaller integer type 'enum 
max7754x_ids' from 'const void *' [-Wvoid-pointer-to-enum-cast]
drivers/regulator/max77857-regulator.c:312:16: error: initializer element is 
not a compile-time constant
include/asm-generic/io.h:1137:20: error: static declaration of 'ioport_map' 
follows non-static declaration
include/asm-generic/io.h:1147:22: error: static declaration of 'ioport_unmap' 
follows non-static declaration
include/asm-generic/io.h:636:15: error: redefinition of 'inb_p'
include/asm-generic/io.h:644:15: error: redefinition of 'inw_p'
include/asm-generic/io.h:652:15: error: redefinition of 'inl_p'
include/asm-generic/io.h:660:16: error: redefinition of 'outb_p'
include/asm-generic/io.h:668:16: error: redefinition of 'outw_p'
include/asm-generic/io.h:676:16: error: redefinition of 'outl_p'
include/asm-generic/io.h:689:14: error: redefinition of 'insb'
include/asm-generic/io.h:697:14: error: redefinition of 'insw'
include/asm-generic/io.h:705:14: error: redefinition of 'insl'
include/asm-generic/io.h:713:15: error: redefinition of 'outsb'
include/asm-generic/io.h:722:15: error: redefinition of 'outsw'
include/asm-generic/io.h:731:15: error: redefinition of 'outsl'

Unverified Error/Warning (likely false positive, please contact us if 
interested):

drivers/gpu/drm/tests/drm_exec_test.c:137 test_prepare_array() error: 
uninitialized symbol 'ret'.
drivers/regulator/max77857-regulator.c:428:28: sparse: sparse: symbol 
'max77857_id' was not declared. Should it be static?
drivers/regulator/max77857-regulator.c:446:19: sparse: sparse: symbol 
'max77857_driver' was not declared. Should it be static?
drivers/regulator/max77857-regulator.c:70:22: sparse: sparse: symbol 
'max77857_regmap_config' was not declared. Should it be static?
mm/khugepaged.c:2137 collapse_file() warn: variable dereferenced before check 
'cc' (see line 1787)
net/wireless/scan.c:373 cfg80211_gen_new_ie() warn: potential spectre issue 
'sub->data' [r]
net/wireless/scan.c:397 cfg80211_gen_new_ie() warn: possible spectre second 
half.  'ext_id'
{standard input}:13: Error: symbol `__export_symbol_alpha_mv' is already defined

Error/Warning ids grouped by kconfigs:

gcc_recent_errors
|-- alpha-randconfig-r025-20230720
|   `-- standard-input:Error:symbol-__export_symbol_alpha_mv-is-already-defined
|-- arc-randconfig-r083-20230720
|   |-- 
drivers-gpu-drm-loongson-lsdc_benchmark.c:sparse:sparse:incorrect-type-in-argument-(different-address-spaces)-expected-void-const-volatile-noderef-__iomem-addr-got-void-kptr
|   `-- 
drivers-gpu-drm-loongson-lsdc_benchmark.c:sparse:sparse:incorrect-type-in-argument-(different-address-spaces)-expected-void-volatile-noderef-__iomem-addr-got-void-kptr
|-- arc-randconfig-r093-20230720
|   |-- 
drivers-iio-adc-max14001.c:sparse:sparse:incorrect-type-in-assignment-(different-base-types)-expected-restricted-__be16-usertype-spi_tx_buffer-got-int
|   `-- 
drivers-iio-adc-max14001.c:sparse:sparse:incorrect-type-in-initializer-(different-base-types)-expected-unsigned-short-usertype-__x-got-restricted-__be16-usertype
|-- arm-randconfig-r091-20230720
|   |-- 
drivers-regulator-max77857-regulator.c:sparse:sparse:symbol-max77857_driver-was-not-declared.-Should-it-be-static
|   |-- 
drivers-regulator-max77857-regulator.c:sparse:sparse:symbol-max77857_id-was-not-declared.-Should-it-be-static
|   `-- 
drivers-regulator-max77857-regulator.c:sparse:sparse:symbol-max77857_regmap_config-was-not-declared.-Should-it-be-static
|-- i386-randconfig-m021-20230720
|   `-- 
mm-khugepaged.c-collapse_file()-warn:variable-dereferenced-before-check-cc-(see-line-)
|-- loongarch-randconfig-r001-20230720
|   `-- 
lib-gcc-loongarch64-linux-..-plugin-include-config-loongarch-loongarch-opts.h:fatal-error:loongarch-def.h:No-such-file-or-directory
|-- parisc-randconfig-m041-20230720
|   |-- 
drivers-gpu-drm-tests-drm_exec

[PATCH v6 9/9] drm/i915/gt: Support aux invalidation on all engines

2023-07-20 Thread Andi Shyti
Refactor the code so that all the operations around the aux table
invalidation are kept in one single place.

With this refactoring in place, add more engines on which the
invalidation should be performed.

Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
Signed-off-by: Andi Shyti 
Cc: Jonathan Cavitt 
Cc: Matt Roper 
Cc:  # v5.8+
---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 58 +++-
 drivers/gpu/drm/i915/gt/gen8_engine_cs.h |  3 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c  | 17 +--
 3 files changed, 41 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c 
b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index bedd1586c978f..4fab07de1ab4a 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -165,9 +165,36 @@ static u32 preparser_disable(bool state)
return MI_ARB_CHECK | 1 << 8 | state;
 }
 
-u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 *cs, const i915_reg_t 
inv_reg)
+static i915_reg_t gen12_get_aux_inv_reg(struct intel_engine_cs *engine)
 {
-   u32 gsi_offset = gt->uncore->gsi_offset;
+   if (!HAS_AUX_CCS(engine->i915))
+   return INVALID_MMIO_REG;
+
+   switch (engine->id) {
+   case RCS0:
+   return GEN12_CCS_AUX_INV;
+   case BCS0:
+   return GEN12_BCS0_AUX_INV;
+   case VCS0:
+   return GEN12_VD0_AUX_INV;
+   case VCS2:
+   return GEN12_VD2_AUX_INV;
+   case VECS0:
+   return GEN12_VE0_AUX_INV;
+   case CCS0:
+   return GEN12_CCS0_AUX_INV;
+   default:
+   return INVALID_MMIO_REG;
+   }
+}
+
+u32 *gen12_emit_aux_table_inv(struct intel_engine_cs *engine, u32 *cs)
+{
+   i915_reg_t inv_reg = gen12_get_aux_inv_reg(engine);
+   u32 gsi_offset = engine->gt->uncore->gsi_offset;
+
+   if (!i915_mmio_reg_valid(inv_reg))
+   return cs;
 
*cs++ = MI_LOAD_REGISTER_IMM(1) | MI_LRI_MMIO_REMAP_EN;
*cs++ = i915_mmio_reg_offset(inv_reg) + gsi_offset;
@@ -201,6 +228,11 @@ static u32 *intel_emit_pipe_control_cs(struct i915_request 
*rq, u32 bit_group_0,
return cs;
 }
 
+static bool gen12_engine_has_aux_inv(struct intel_engine_cs *engine)
+{
+   return i915_mmio_reg_valid(gen12_get_aux_inv_reg(engine));
+}
+
 static int mtl_dummy_pipe_control(struct i915_request *rq)
 {
/* Wa_14016712196 */
@@ -307,11 +339,7 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
 
cs = gen8_emit_pipe_control(cs, flags, LRC_PPHWSP_SCRATCH_ADDR);
 
-   if (!HAS_FLAT_CCS(rq->engine->i915)) {
-   /* hsdes: 1809175790 */
-   cs = gen12_emit_aux_table_inv(rq->engine->gt, cs,
- GEN12_CCS_AUX_INV);
-   }
+   cs = gen12_emit_aux_table_inv(engine, cs);
 
*cs++ = preparser_disable(false);
intel_ring_advance(rq, cs);
@@ -322,7 +350,6 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
 
 int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
 {
-   intel_engine_mask_t aux_inv = 0;
u32 cmd_flush = 0;
u32 cmd = 4;
u32 *cs;
@@ -330,15 +357,11 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 
mode)
if (mode & EMIT_INVALIDATE)
cmd += 2;
 
-   if (HAS_AUX_CCS(rq->engine->i915))
-   aux_inv = rq->engine->mask &
- ~GENMASK(_BCS(I915_MAX_BCS - 1), BCS0);
-
/*
 * On Aux CCS platforms the invalidation of the Aux
 * table requires quiescing memory traffic beforehand
 */
-   if (aux_inv) {
+   if (gen12_engine_has_aux_inv(rq->engine)) {
u32 bit_group_0 = 0;
u32 bit_group_1 = 0;
 
@@ -417,14 +440,7 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
*cs++ = 0; /* upper addr */
*cs++ = 0; /* value */
 
-   if (aux_inv) { /* hsdes: 1809175790 */
-   if (rq->engine->class == VIDEO_DECODE_CLASS)
-   cs = gen12_emit_aux_table_inv(rq->engine->gt,
- cs, GEN12_VD0_AUX_INV);
-   else
-   cs = gen12_emit_aux_table_inv(rq->engine->gt,
- cs, GEN12_VE0_AUX_INV);
-   }
+   cs = gen12_emit_aux_table_inv(rq->engine, cs);
 
if (mode & EMIT_INVALIDATE)
*cs++ = preparser_disable(false);
diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.h 
b/drivers/gpu/drm/i915/gt/gen8_engine_cs.h
index a44eda096557c..867ba697aceb8 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.h
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.h
@@ -13,6 +13,7 @@
 #include "intel_gt_regs.h"
 #include "intel_gpu_commands.h"
 
+struct intel_engine_cs;
 struct intel_gt;
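
The lookup-then-skip pattern this patch introduces can be sketched outside the driver. The following is a minimal model under stated assumptions: the enum values reuse the engine names from the patch, but the register offsets, SKETCH_INVALID_REG and the two-dword payload are made-up placeholders, not the i915 definitions.

```c
#include <stdbool.h>
#include <stdint.h>

enum engine_id { RCS0, BCS0, VCS0, VCS2, VECS0, CCS0, OTHER_ENGINE };

/* Placeholder: offset 0 stands for "no invalidation register". */
#define SKETCH_INVALID_REG 0u

/* Per-engine register lookup; the offsets are invented for the sketch. */
static uint32_t aux_inv_reg(enum engine_id id, bool has_aux_ccs)
{
	if (!has_aux_ccs)
		return SKETCH_INVALID_REG;

	switch (id) {
	case RCS0:  return 0x1000;
	case BCS0:  return 0x1100;
	case VCS0:  return 0x1200;
	case VCS2:  return 0x1300;
	case VECS0: return 0x1400;
	case CCS0:  return 0x1500;
	default:    return SKETCH_INVALID_REG;
	}
}

static bool reg_valid(uint32_t reg)
{
	return reg != SKETCH_INVALID_REG;
}

/*
 * Emit the invalidation only for engines that have a register;
 * otherwise hand the stream pointer back untouched.
 */
static uint32_t *emit_aux_table_inv(enum engine_id id, bool has_aux_ccs,
				    uint32_t *cs)
{
	uint32_t reg = aux_inv_reg(id, has_aux_ccs);

	if (!reg_valid(reg))
		return cs;

	*cs++ = reg;	/* register to write */
	*cs++ = 1;	/* placeholder for the invalidate bit */

	return cs;
}
```

Because the emitter itself decides whether to write anything, callers such as the flush paths can call it unconditionally, which is the simplification visible in the hunks above.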

[PATCH v6 8/9] drm/i915/gt: Poll aux invalidation register bit on invalidation

2023-07-20 Thread Andi Shyti
From: Jonathan Cavitt 

For platforms that use Aux CCS, wait for aux invalidation to
complete by checking that the aux invalidation register bit is
cleared.

Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
Signed-off-by: Jonathan Cavitt 
Signed-off-by: Andi Shyti 
Cc:  # v5.8+
Reviewed-by: Nirmoy Das 
Reviewed-by: Andrzej Hajda 
Reviewed-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 17 -
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h |  1 +
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c 
b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 78bbd55262a2d..bedd1586c978f 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -172,7 +172,15 @@ u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 
*cs, const i915_reg_t inv
*cs++ = MI_LOAD_REGISTER_IMM(1) | MI_LRI_MMIO_REMAP_EN;
*cs++ = i915_mmio_reg_offset(inv_reg) + gsi_offset;
*cs++ = AUX_INV;
-   *cs++ = MI_NOOP;
+
+   *cs++ = MI_SEMAPHORE_WAIT_TOKEN |
+   MI_SEMAPHORE_REGISTER_POLL |
+   MI_SEMAPHORE_POLL |
+   MI_SEMAPHORE_SAD_EQ_SDD;
+   *cs++ = 0;
+   *cs++ = i915_mmio_reg_offset(inv_reg) + gsi_offset;
+   *cs++ = 0;
+   *cs++ = 0;
 
return cs;
 }
@@ -282,10 +290,9 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
else if (engine->class == COMPUTE_CLASS)
flags &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
 
+   count = 8;
if (HAS_AUX_CCS(rq->engine->i915))
-   count = 8 + 4;
-   else
-   count = 8;
+   count += 8;
 
cs = intel_ring_begin(rq, count);
if (IS_ERR(cs))
@@ -335,7 +342,7 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
u32 bit_group_0 = 0;
u32 bit_group_1 = 0;
 
-   cmd += 4;
+   cmd += 8;
 
bit_group_0 |= PIPE_CONTROL0_HDC_PIPELINE_FLUSH;
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h 
b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
index 5df7cce23197c..2bd8d98d21102 100644
--- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
+++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
@@ -121,6 +121,7 @@
 #define   MI_SEMAPHORE_TARGET(engine)  ((engine)<<15)
 #define MI_SEMAPHORE_WAIT  MI_INSTR(0x1c, 2) /* GEN8+ */
 #define MI_SEMAPHORE_WAIT_TOKENMI_INSTR(0x1c, 3) /* GEN12+ */
+#define   MI_SEMAPHORE_REGISTER_POLL   (1 << 16)
 #define   MI_SEMAPHORE_POLL(1 << 15)
 #define   MI_SEMAPHORE_SAD_GT_SDD  (0 << 12)
 #define   MI_SEMAPHORE_SAD_GTE_SDD (1 << 12)
-- 
2.40.1
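
The ring-space accounting in this patch follows from the dword count of the emitted sequence: three dwords for the register write plus five for the semaphore wait, replacing the earlier three plus one NOOP. A minimal sketch of that bookkeeping, assuming placeholder opcode values (the SKETCH_* constants are invented and do not match the real encodings in intel_gpu_commands.h):

```c
#include <stdint.h>

/* Placeholder opcodes, for illustration only. */
#define SKETCH_LRI		0x11000001u
#define SKETCH_SEM_WAIT_POLL	0x0e000003u
#define SKETCH_AUX_INV		0x1u

/* Dwords the caller must reserve for the whole sequence. */
#define SKETCH_AUX_INV_DWORDS	8

static uint32_t *emit_aux_inv_and_poll(uint32_t *cs, uint32_t reg_offset)
{
	/* Register write: 3 dwords. */
	*cs++ = SKETCH_LRI;
	*cs++ = reg_offset;
	*cs++ = SKETCH_AUX_INV;

	/* Register poll: 5 dwords, laid out as in the patch. */
	*cs++ = SKETCH_SEM_WAIT_POLL;
	*cs++ = 0;		/* second dword, 0 in the patch */
	*cs++ = reg_offset;	/* register being polled */
	*cs++ = 0;
	*cs++ = 0;

	return cs;
}
```

If the number of dwords written ever drifts from the number reserved, the mismatch shows up as cs - start != SKETCH_AUX_INV_DWORDS, which is why the "count" and "cmd" adjustments change from 4 to 8 together with the emitter.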



[PATCH v6 6/9] drm/i915/gt: Ensure memory quiesced before invalidation for all engines

2023-07-20 Thread Andi Shyti
Commit af9e423a8aae ("drm/i915/gt: Ensure memory quiesced before
invalidation") has made sure that the memory is quiesced before
invalidating the AUX CCS table. Do it for all the other engines
and not just RCS.

Signed-off-by: Andi Shyti 
Cc: Jonathan Cavitt 
Cc: Matt Roper 
---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 46 ++--
 1 file changed, 36 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c 
b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 1b1dadacfbf42..3bedab8d61db1 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -309,19 +309,45 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 
mode)
 int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
 {
intel_engine_mask_t aux_inv = 0;
-   u32 cmd, *cs;
+   u32 cmd = 4;
+   u32 *cs;
 
-   cmd = 4;
-   if (mode & EMIT_INVALIDATE) {
+   if (mode & EMIT_INVALIDATE)
cmd += 2;
 
-   if (HAS_AUX_CCS(rq->engine->i915) &&
-   (rq->engine->class == VIDEO_DECODE_CLASS ||
-rq->engine->class == VIDEO_ENHANCEMENT_CLASS)) {
-   aux_inv = rq->engine->mask &
-   ~GENMASK(_BCS(I915_MAX_BCS - 1), BCS0);
-   if (aux_inv)
-   cmd += 4;
+   if (HAS_AUX_CCS(rq->engine->i915))
+   aux_inv = rq->engine->mask &
+ ~GENMASK(_BCS(I915_MAX_BCS - 1), BCS0);
+
+   /*
+* On Aux CCS platforms the invalidation of the Aux
+* table requires quiescing memory traffic beforehand
+*/
+   if (aux_inv) {
+   u32 bit_group_0 = 0;
+   u32 bit_group_1 = 0;
+
+   cmd += 4;
+
+   bit_group_0 |= PIPE_CONTROL0_HDC_PIPELINE_FLUSH;
+
+   switch (rq->engine->class) {
+   case VIDEO_DECODE_CLASS:
+   bit_group_1 |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
+   bit_group_1 |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
+   bit_group_1 |= PIPE_CONTROL_DC_FLUSH_ENABLE;
+   bit_group_1 |= PIPE_CONTROL_FLUSH_L3;
+   bit_group_1 |= PIPE_CONTROL_CS_STALL;
+
+   intel_emit_pipe_control_cs(rq, bit_group_0, bit_group_1,
+  LRC_PPHWSP_SCRATCH_ADDR);
+
+   break;
+
+   case VIDEO_ENHANCEMENT_CLASS:
+   case COMPUTE_CLASS:
+   case COPY_ENGINE_CLASS:
+   break;
}
}
 
-- 
2.40.1
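
The aux_inv mask and the dword budget computed at the top of gen12_emit_flush_xcs() can be modeled as follows. This is a rough sketch only: SKETCH_GENMASK32 imitates the kernel's GENMASK() for 32-bit values, and the blitter bit range passed by the caller is a hypothetical layout, not the real engine numbering.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits l..h set, both inclusive, for 0 <= l <= h <= 31. */
#define SKETCH_GENMASK32(h, l) \
	((~0u << (l)) & (~0u >> (31 - (h))))

/*
 * Mask of engines needing aux invalidation: the engine's own bit,
 * unless it falls inside the blitter range or the platform has no
 * Aux CCS.
 */
static uint32_t aux_inv_mask(uint32_t engine_mask, bool has_aux_ccs,
			     unsigned int bcs_first, unsigned int bcs_last)
{
	if (!has_aux_ccs)
		return 0;

	return engine_mask & ~SKETCH_GENMASK32(bcs_last, bcs_first);
}

/* Dword budget as computed in this version of the patch. */
static unsigned int flush_dwords(bool invalidate, uint32_t aux_inv)
{
	unsigned int cmd = 4;

	if (invalidate)
		cmd += 2;
	if (aux_inv)
		cmd += 4;

	return cmd;
}
```

With blitter engines assumed to occupy bits 1 to 9, an engine on bit 0 keeps its bit while an engine on bit 3 is masked out, so only the former pays for the extra four dwords.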



[PATCH v6 7/9] drm/i915/gt: Enable the CCS_FLUSH bit in the pipe control

2023-07-20 Thread Andi Shyti
Enable the CCS_FLUSH bit (bit 13) in the pipe control for render
and compute engines on platforms starting from Meteor Lake (BSPEC
43904 and 47112). The VE and BCS engines need to add the flush
part in their command streamer.

Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
Signed-off-by: Andi Shyti 
Cc: Jonathan Cavitt 
Cc: Nirmoy Das 
Cc:  # v5.8+
---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 31 
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h |  1 +
 2 files changed, 32 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c 
b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 3bedab8d61db1..78bbd55262a2d 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -225,6 +225,13 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
 
bit_group_0 |= PIPE_CONTROL0_HDC_PIPELINE_FLUSH;
 
+   /*
+* When required, in MTL+ platforms we need to
+* set the CCS_FLUSH bit in the pipe control
+*/
+   if (GRAPHICS_VER_FULL(rq->i915) >= IP_VER(12, 70))
+   bit_group_0 |= PIPE_CONTROL_CCS_FLUSH;
+
bit_group_1 |= PIPE_CONTROL_TILE_CACHE_FLUSH;
bit_group_1 |= PIPE_CONTROL_FLUSH_L3;
bit_group_1 |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
@@ -309,6 +316,7 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
 int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
 {
intel_engine_mask_t aux_inv = 0;
+   u32 cmd_flush = 0;
u32 cmd = 4;
u32 *cs;
 
@@ -339,6 +347,13 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
bit_group_1 |= PIPE_CONTROL_FLUSH_L3;
bit_group_1 |= PIPE_CONTROL_CS_STALL;
 
+   /*
+* When required, in MTL+ platforms we need to
+* set the CCS_FLUSH bit in the pipe control
+*/
+   if (GRAPHICS_VER_FULL(rq->i915) >= IP_VER(12, 70))
+   bit_group_0 |= PIPE_CONTROL_CCS_FLUSH;
+
intel_emit_pipe_control_cs(rq, bit_group_0, bit_group_1,
   LRC_PPHWSP_SCRATCH_ADDR);
 
@@ -346,7 +361,18 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
 
case VIDEO_ENHANCEMENT_CLASS:
case COMPUTE_CLASS:
+   cmd += 2;
+   cmd_flush = MI_FLUSH_DW;
+
+   break;
+
case COPY_ENGINE_CLASS:
+   cmd += 2;
+   /*
+* When required, in MTL+ platforms we need to
+* set the CCS_FLUSH bit in the pipe control
+*/
+   cmd_flush = MI_FLUSH_DW | MI_FLUSH_DW_CCS;
break;
}
}
@@ -355,6 +381,11 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
if (IS_ERR(cs))
return PTR_ERR(cs);
 
+   if (cmd_flush) {
+   *cs++ = cmd_flush;
+   *cs++ = 0;
+   }
+
if (mode & EMIT_INVALIDATE)
*cs++ = preparser_disable(true);
 
diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h 
b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
index 5d143e2a8db03..5df7cce23197c 100644
--- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
+++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h
@@ -299,6 +299,7 @@
 #define   PIPE_CONTROL_QW_WRITE(1<<14)
 #define   PIPE_CONTROL_POST_SYNC_OP_MASK(3<<14)
 #define   PIPE_CONTROL_DEPTH_STALL (1<<13)
+#define   PIPE_CONTROL_CCS_FLUSH   (1<<13) /* MTL+ */
 #define   PIPE_CONTROL_WRITE_FLUSH (1<<12)
 #define   PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH   (1<<12) /* gen6+ */
 #define   PIPE_CONTROL_INSTRUCTION_CACHE_INVALIDATE(1<<11) /* MBZ on ILK */
-- 
2.40.1
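
The version gate used in this patch relies on comparing packed (version, release) pairs. A minimal sketch, assuming the packing is version in the upper byte and release in the lower byte; SKETCH_IP_VER and bit_group_0_for() are hypothetical names for illustration, while the bit position comes from the PIPE_CONTROL_CCS_FLUSH define above.

```c
#include <stdint.h>

/* Assumed packing: version in bits 15:8, release in bits 7:0. */
#define SKETCH_IP_VER(ver, rel)	(((ver) << 8) | (rel))

/* Bit 13, as in the PIPE_CONTROL_CCS_FLUSH define added by the patch. */
#define SKETCH_CCS_FLUSH	(1u << 13)

/* Add the CCS flush bit only on graphics IP 12.70 and later. */
static uint32_t bit_group_0_for(unsigned int ver, unsigned int rel,
				uint32_t base)
{
	if (SKETCH_IP_VER(ver, rel) >= SKETCH_IP_VER(12, 70))
		base |= SKETCH_CCS_FLUSH;

	return base;
}
```

Packing the pair into one integer makes the ">=" comparison order first by version and then by release, so 12.55 stays below the threshold and any later major version passes it.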



[PATCH v6 5/9] drm/i915/gt: Refactor intel_emit_pipe_control_cs() in a single function

2023-07-20 Thread Andi Shyti
Just a trivial refactoring to reduce the amount of duplicated
code. This will come in handy in the next commits.

Signed-off-by: Andi Shyti 
---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 44 +---
 1 file changed, 23 insertions(+), 21 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c 
b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 7566c89d9def3..1b1dadacfbf42 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -177,23 +177,31 @@ u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 
*cs, const i915_reg_t inv
return cs;
 }
 
+static u32 *intel_emit_pipe_control_cs(struct i915_request *rq, u32 
bit_group_0,
+  u32 bit_group_1, u32 offset)
+{
+   u32 *cs;
+
+   cs = intel_ring_begin(rq, 6);
+   if (IS_ERR(cs))
+   return cs;
+
+   cs = gen12_emit_pipe_control(cs, bit_group_0, bit_group_1,
+LRC_PPHWSP_SCRATCH_ADDR);
+   intel_ring_advance(rq, cs);
+
+   return cs;
+}
+
 static int mtl_dummy_pipe_control(struct i915_request *rq)
 {
/* Wa_14016712196 */
if (IS_MTL_GRAPHICS_STEP(rq->engine->i915, M, STEP_A0, STEP_B0) ||
-   IS_MTL_GRAPHICS_STEP(rq->engine->i915, P, STEP_A0, STEP_B0)) {
-   u32 *cs;
-
-   /* dummy PIPE_CONTROL + depth flush */
-   cs = intel_ring_begin(rq, 6);
-   if (IS_ERR(cs))
-   return PTR_ERR(cs);
-   cs = gen12_emit_pipe_control(cs,
-0,
-PIPE_CONTROL_DEPTH_CACHE_FLUSH,
-LRC_PPHWSP_SCRATCH_ADDR);
-   intel_ring_advance(rq, cs);
-   }
+   IS_MTL_GRAPHICS_STEP(rq->engine->i915, P, STEP_A0, STEP_B0))
+   intel_emit_pipe_control_cs(rq,
+  0,
+  PIPE_CONTROL_DEPTH_CACHE_FLUSH,
+  LRC_PPHWSP_SCRATCH_ADDR);
 
return 0;
 }
@@ -210,7 +218,6 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
u32 bit_group_0 = 0;
u32 bit_group_1 = 0;
int err;
-   u32 *cs;
 
err = mtl_dummy_pipe_control(rq);
if (err)
@@ -237,13 +244,8 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
else if (engine->class == COMPUTE_CLASS)
bit_group_1 &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
 
-   cs = intel_ring_begin(rq, 6);
-   if (IS_ERR(cs))
-   return PTR_ERR(cs);
-
-   cs = gen12_emit_pipe_control(cs, bit_group_0, bit_group_1,
-LRC_PPHWSP_SCRATCH_ADDR);
-   intel_ring_advance(rq, cs);
+   intel_emit_pipe_control_cs(rq, bit_group_0, bit_group_1,
+  LRC_PPHWSP_SCRATCH_ADDR);
}
 
if (mode & EMIT_INVALIDATE) {
-- 
2.40.1



[PATCH v6 4/9] drm/i915/gt: Rename flags with bit_group_X according to the datasheet

2023-07-20 Thread Andi Shyti
In preparation for the next patch, align the naming of the pipe
control sets of flag values with the datasheet (BSPEC 47112).
The variable "flags" in gen12_emit_flush_rcs() is applied as a
set of flags called Bit Group 1.

Also define Bit Group 0 as bit_group_0, where currently only the
PIPE_CONTROL0_HDC_PIPELINE_FLUSH bit is set.

Signed-off-by: Andi Shyti 
Cc:  # v5.8+
Reviewed-by: Matt Roper 
Reviewed-by: Andrzej Hajda 
Reviewed-by: Nirmoy Das 
---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 34 +---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.h | 18 -
 2 files changed, 29 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c 
b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 5fbc3f630f32b..7566c89d9def3 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -207,7 +207,8 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
 * table requires quiescing memory traffic beforehand
 */
if (mode & EMIT_FLUSH || HAS_AUX_CCS(engine->i915)) {
-   u32 flags = 0;
+   u32 bit_group_0 = 0;
+   u32 bit_group_1 = 0;
int err;
u32 *cs;
 
@@ -215,32 +216,33 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 
mode)
if (err)
return err;
 
-   flags |= PIPE_CONTROL_TILE_CACHE_FLUSH;
-   flags |= PIPE_CONTROL_FLUSH_L3;
-   flags |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
-   flags |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
+   bit_group_0 |= PIPE_CONTROL0_HDC_PIPELINE_FLUSH;
+
+   bit_group_1 |= PIPE_CONTROL_TILE_CACHE_FLUSH;
+   bit_group_1 |= PIPE_CONTROL_FLUSH_L3;
+   bit_group_1 |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
+   bit_group_1 |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
/* Wa_1409600907:tgl,adl-p */
-   flags |= PIPE_CONTROL_DEPTH_STALL;
-   flags |= PIPE_CONTROL_DC_FLUSH_ENABLE;
-   flags |= PIPE_CONTROL_FLUSH_ENABLE;
+   bit_group_1 |= PIPE_CONTROL_DEPTH_STALL;
+   bit_group_1 |= PIPE_CONTROL_DC_FLUSH_ENABLE;
+   bit_group_1 |= PIPE_CONTROL_FLUSH_ENABLE;
 
-   flags |= PIPE_CONTROL_STORE_DATA_INDEX;
-   flags |= PIPE_CONTROL_QW_WRITE;
+   bit_group_1 |= PIPE_CONTROL_STORE_DATA_INDEX;
+   bit_group_1 |= PIPE_CONTROL_QW_WRITE;
 
-   flags |= PIPE_CONTROL_CS_STALL;
+   bit_group_1 |= PIPE_CONTROL_CS_STALL;
 
if (!HAS_3D_PIPELINE(engine->i915))
-   flags &= ~PIPE_CONTROL_3D_ARCH_FLAGS;
+   bit_group_1 &= ~PIPE_CONTROL_3D_ARCH_FLAGS;
else if (engine->class == COMPUTE_CLASS)
-   flags &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
+   bit_group_1 &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
 
cs = intel_ring_begin(rq, 6);
if (IS_ERR(cs))
return PTR_ERR(cs);
 
-   cs = gen12_emit_pipe_control(cs,
-PIPE_CONTROL0_HDC_PIPELINE_FLUSH,
-flags, LRC_PPHWSP_SCRATCH_ADDR);
+   cs = gen12_emit_pipe_control(cs, bit_group_0, bit_group_1,
+LRC_PPHWSP_SCRATCH_ADDR);
intel_ring_advance(rq, cs);
}
 
diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.h 
b/drivers/gpu/drm/i915/gt/gen8_engine_cs.h
index 655e5c00ddc27..a44eda096557c 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.h
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.h
@@ -49,25 +49,29 @@ u32 *gen12_emit_fini_breadcrumb_rcs(struct i915_request 
*rq, u32 *cs);
 u32 *gen12_emit_aux_table_inv(struct intel_gt *gt, u32 *cs, const i915_reg_t 
inv_reg);
 
 static inline u32 *
-__gen8_emit_pipe_control(u32 *batch, u32 flags0, u32 flags1, u32 offset)
+__gen8_emit_pipe_control(u32 *batch, u32 bit_group_0,
+u32 bit_group_1, u32 offset)
 {
memset(batch, 0, 6 * sizeof(u32));
 
-   batch[0] = GFX_OP_PIPE_CONTROL(6) | flags0;
-   batch[1] = flags1;
+   batch[0] = GFX_OP_PIPE_CONTROL(6) | bit_group_0;
+   batch[1] = bit_group_1;
batch[2] = offset;
 
return batch + 6;
 }
 
-static inline u32 *gen8_emit_pipe_control(u32 *batch, u32 flags, u32 offset)
+static inline u32 *gen8_emit_pipe_control(u32 *batch,
+ u32 bit_group_1, u32 offset)
 {
-   return __gen8_emit_pipe_control(batch, 0, flags, offset);
+   return __gen8_emit_pipe_control(batch, 0, bit_group_1, offset);
 }
 
-static inline u32 *gen12_emit_pipe_control(u32 *batch, u32 flags0, u32 flags1, 
u32 offset)
+static inline u32 *gen12_emit_pipe_control(u32 *batch, u32 bit_group_0,
+   

[PATCH v6 3/9] drm/i915/gt: Ensure memory quiesced before invalidation

2023-07-20 Thread Andi Shyti
From: Jonathan Cavitt 

All memory traffic must be quiesced before requesting
an aux invalidation on platforms that use Aux CCS.

Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
Signed-off-by: Jonathan Cavitt 
Signed-off-by: Andi Shyti 
Cc:  # v5.8+
Reviewed-by: Nirmoy Das 
Reviewed-by: Andrzej Hajda 
---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c 
b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 0d4d5e0407a2d..5fbc3f630f32b 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -202,7 +202,11 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
 {
struct intel_engine_cs *engine = rq->engine;
 
-   if (mode & EMIT_FLUSH) {
+   /*
+* On Aux CCS platforms the invalidation of the Aux
+* table requires quiescing memory traffic beforehand
+*/
+   if (mode & EMIT_FLUSH || HAS_AUX_CCS(engine->i915)) {
u32 flags = 0;
int err;
u32 *cs;
-- 
2.40.1



[PATCH v6 0/9] Update AUX invalidation sequence

2023-07-20 Thread Andi Shyti
Hi,

as there are new hardware directives, we need a little adaptation
for the AUX invalidation sequence.

In this version we support all the engines affected by this
change.

The stable backport has some challenges because the original
patch that this series fixes has had more changes in between.

This patch is slowly exploding with code refactorings and
features added and fixed.

Thanks a lot Nirmoy, Andrzej and Matt for your review and for the
fruitful discussions!

Thanks,
Andi

Changelog:
==========
v5 -> v6
 - Fixed ccs flush in the engines VE and BCS. They are sent as a
   separate command instead of added in the pipe control.
 - Separated the CCS flushing in the pipe control patch from the
   quiescing of the memory. They were meant to be in separate
   patches already in the previous version, but apparently I
   squashed them by mistake.

v4 -> v5
 - The AUX CCS is added as a device property instead of checking
   against FLAT CCS. This adds the new HAS_AUX_CCS check
   (Patch 2, new).
 - little and trivial refactoring here and there.
 - extended the flags{0,1}/bit_group_{0,1} renaming to other
   functions.
 - Created an intel_emit_pipe_control_cs() wrapper for submitting
   the pipe control.
 - Quiesce memory for all the engines, not just RCS (Patch 6,
   new).
 - The PIPE_CONTROL_CCS_FLUSH is added to all the engines.
 - Remove redundant EMIT_FLUSH_CCS mode flag.
 - Remove unnecessary NOOPs from the command streamer for
   invalidating the CCS table.
 - Use INVALID_MMIO_REG and gen12_get_aux_inv_reg() instead of
   __MMIO(0) and reg.reg.
 - Remove useless wrapper and just use gen12_get_aux_inv_reg().

v3 -> v4
 - A trivial patch 3 is added to rename the flags with
   bit_group_{0,1} to align with the datasheet naming.
 - Patch 4 fixes a confusion I made where the CCS flag was
   applied to the wrong bit group.

v2 -> v3
 - added r-b from Nirmoy in patch 1 and 4.
 - added patch 3 which enables the ccs_flush in the control pipe
   for mtl+ compute and render engines.
 - added redundant checks in patch 2 for enabling the EMIT_FLUSH
   flag.

v1 -> v2
 - add a clean up preliminary patch for the existing registers
 - add support for more engines
 - add the Fixes tag

Andi Shyti (7):
  drm/i915/gt: Cleanup aux invalidation registers
  drm/i915: Add the has_aux_ccs device property
  drm/i915/gt: Rename flags with bit_group_X according to the datasheet
  drm/i915/gt: Refactor intel_emit_pipe_control_cs() in a single
function
  drm/i915/gt: Ensure memory quiesced before invalidation for all
engines
  drm/i915/gt: Enable the CCS_FLUSH bit in the pipe control
  drm/i915/gt: Support aux invalidation on all engines

Jonathan Cavitt (2):
  drm/i915/gt: Ensure memory quiesced before invalidation
  drm/i915/gt: Poll aux invalidation register bit on invalidation

 drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 222 +--
 drivers/gpu/drm/i915/gt/gen8_engine_cs.h |  21 +-
 drivers/gpu/drm/i915/gt/intel_gpu_commands.h |   2 +
 drivers/gpu/drm/i915/gt/intel_gt_regs.h  |  16 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c  |  17 +-
 drivers/gpu/drm/i915/i915_drv.h  |   1 +
 drivers/gpu/drm/i915/i915_pci.c  |   5 +-
 drivers/gpu/drm/i915/intel_device_info.h |   1 +
 8 files changed, 186 insertions(+), 99 deletions(-)

-- 
2.40.1



[PATCH v6 2/9] drm/i915: Add the has_aux_ccs device property

2023-07-20 Thread Andi Shyti
We always assumed that a device has either AUX or FLAT CCS, but
this is an approximation that does not always hold: some devices
need further per-device checks.

Add the "has_aux_ccs" flag to the intel_device_info structure so
that each device states explicitly whether it has AUX CCS.

Signed-off-by: Andi Shyti 
Cc: Matt Roper 
Cc: Jonathan Cavitt 
Cc:  # v5.8+
---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c | 4 ++--
 drivers/gpu/drm/i915/i915_drv.h  | 1 +
 drivers/gpu/drm/i915/i915_pci.c  | 5 -
 drivers/gpu/drm/i915/intel_device_info.h | 1 +
 4 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c 
b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 563efee055602..0d4d5e0407a2d 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -267,7 +267,7 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
else if (engine->class == COMPUTE_CLASS)
flags &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
 
-   if (!HAS_FLAT_CCS(rq->engine->i915))
+   if (HAS_AUX_CCS(rq->engine->i915))
count = 8 + 4;
else
count = 8;
@@ -307,7 +307,7 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
if (mode & EMIT_INVALIDATE) {
cmd += 2;
 
-   if (!HAS_FLAT_CCS(rq->engine->i915) &&
+   if (HAS_AUX_CCS(rq->engine->i915) &&
(rq->engine->class == VIDEO_DECODE_CLASS ||
 rq->engine->class == VIDEO_ENHANCEMENT_CLASS)) {
aux_inv = rq->engine->mask &
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 682ef2b5c7d59..e9cc048b5727a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -848,6 +848,7 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915,
  * stored in lmem to support the 3D and media compression formats.
  */
 #define HAS_FLAT_CCS(i915)   (INTEL_INFO(i915)->has_flat_ccs)
+#define HAS_AUX_CCS(i915)(INTEL_INFO(i915)->has_aux_ccs)
 
 #define HAS_GT_UC(i915)(INTEL_INFO(i915)->has_gt_uc)
 
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index fcacdc21643cf..c9ff1d11a9fce 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -643,7 +643,8 @@ static const struct intel_device_info jsl_info = {
TGL_CACHELEVEL, \
.has_global_mocs = 1, \
.has_pxp = 1, \
-   .max_pat_index = 3
+   .max_pat_index = 3, \
+   .has_aux_ccs = 1
 
 static const struct intel_device_info tgl_info = {
GEN12_FEATURES,
@@ -775,6 +776,7 @@ static const struct intel_device_info dg2_info = {
 
 static const struct intel_device_info ats_m_info = {
DG2_FEATURES,
+   .has_aux_ccs = 1,
.require_force_probe = 1,
.tuning_thread_rr_after_dep = 1,
 };
@@ -827,6 +829,7 @@ static const struct intel_device_info mtl_info = {
.__runtime.media.ip.ver = 13,
PLATFORM(INTEL_METEORLAKE),
.extra_gt_list = xelpmp_extra_gt,
+   .has_aux_ccs = 1,
.has_flat_ccs = 0,
.has_gmd_id = 1,
.has_guc_deprivilege = 1,
diff --git a/drivers/gpu/drm/i915/intel_device_info.h 
b/drivers/gpu/drm/i915/intel_device_info.h
index dbfe6443457b5..93485507506cc 100644
--- a/drivers/gpu/drm/i915/intel_device_info.h
+++ b/drivers/gpu/drm/i915/intel_device_info.h
@@ -151,6 +151,7 @@ enum intel_ppgtt_type {
func(has_reset_engine); \
func(has_3d_pipeline); \
func(has_4tile); \
+   func(has_aux_ccs); \
func(has_flat_ccs); \
func(has_global_mocs); \
func(has_gmd_id); \
-- 
2.40.1
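
For readers following along, here is a minimal, self-contained sketch of
why "!HAS_FLAT_CCS" and "HAS_AUX_CCS" are not interchangeable. Everything
prefixed with demo_/DEMO_ is hypothetical and only mimics the flag-list
style used in intel_device_info.h; the 8 and 8 + 4 dword counts are taken
from the hunk above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced flag list written in the X-macro style. */
#define DEMO_FOR_EACH_FLAG(func) \
	func(has_aux_ccs); \
	func(has_flat_ccs)

struct demo_device_info {
#define DEFINE_FLAG(name) unsigned int name:1
	DEMO_FOR_EACH_FLAG(DEFINE_FLAG);
#undef DEFINE_FLAG
};

#define DEMO_HAS_AUX_CCS(info)  ((info)->has_aux_ccs)
#define DEMO_HAS_FLAT_CCS(info) ((info)->has_flat_ccs)

/* Three made-up devices: AUX CCS only, FLAT CCS only, and neither. */
static const struct demo_device_info demo_aux = { .has_aux_ccs = 1 };
static const struct demo_device_info demo_flat = { .has_flat_ccs = 1 };
static const struct demo_device_info demo_neither = { 0 };

/* Old check: anything without FLAT CCS is assumed to have AUX CCS. */
static int demo_flush_dwords_old(const struct demo_device_info *info)
{
	assert(info);
	return !DEMO_HAS_FLAT_CCS(info) ? 8 + 4 : 8;
}

/* New check: reserve the extra dwords only when AUX CCS is present. */
static int demo_flush_dwords_new(const struct demo_device_info *info)
{
	assert(info);
	return DEMO_HAS_AUX_CCS(info) ? 8 + 4 : 8;
}
```

The two helpers agree for demo_aux and demo_flat and differ only for
demo_neither, which is the case the explicit flag is meant to cover.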



[PATCH v6 1/9] drm/i915/gt: Cleanup aux invalidation registers

2023-07-20 Thread Andi Shyti
Fix the 'NV' definition postfix that is supposed to be INV.

Take the chance to also order the registers properly by address
and name the GEN12_GFX_CCS_AUX_INV register GEN12_CCS_AUX_INV,
like all the other similar registers.

Also remove the VD1, VD3 and VE1 registers, which don't exist, and
add BCS0 and CCS0.

Signed-off-by: Andi Shyti 
Cc:  # v5.8+
Reviewed-by: Nirmoy Das 
Reviewed-by: Andrzej Hajda 
---
 drivers/gpu/drm/i915/gt/gen8_engine_cs.c |  8 
 drivers/gpu/drm/i915/gt/intel_gt_regs.h  | 16 
 drivers/gpu/drm/i915/gt/intel_lrc.c  |  6 +++---
 3 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c 
b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
index 23857cc08eca1..563efee055602 100644
--- a/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/gen8_engine_cs.c
@@ -287,8 +287,8 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
 
if (!HAS_FLAT_CCS(rq->engine->i915)) {
/* hsdes: 1809175790 */
-   cs = gen12_emit_aux_table_inv(rq->engine->gt,
- cs, GEN12_GFX_CCS_AUX_NV);
+   cs = gen12_emit_aux_table_inv(rq->engine->gt, cs,
+ GEN12_CCS_AUX_INV);
}
 
*cs++ = preparser_disable(false);
@@ -348,10 +348,10 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 
mode)
if (aux_inv) { /* hsdes: 1809175790 */
if (rq->engine->class == VIDEO_DECODE_CLASS)
cs = gen12_emit_aux_table_inv(rq->engine->gt,
- cs, GEN12_VD0_AUX_NV);
+ cs, GEN12_VD0_AUX_INV);
else
cs = gen12_emit_aux_table_inv(rq->engine->gt,
- cs, GEN12_VE0_AUX_NV);
+ cs, GEN12_VE0_AUX_INV);
}
 
if (mode & EMIT_INVALIDATE)
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h 
b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index 718cb2c80f79e..2cdfb2f713d02 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -332,9 +332,11 @@
 #define GEN8_PRIVATE_PAT_HI_MMIO(0x40e0 + 4)
 #define GEN10_PAT_INDEX(index) _MMIO(0x40e0 + (index) * 4)
 #define BSD_HWS_PGA_GEN7   _MMIO(0x4180)
-#define GEN12_GFX_CCS_AUX_NV   _MMIO(0x4208)
-#define GEN12_VD0_AUX_NV   _MMIO(0x4218)
-#define GEN12_VD1_AUX_NV   _MMIO(0x4228)
+
+#define GEN12_CCS_AUX_INV  _MMIO(0x4208)
+#define GEN12_VD0_AUX_INV  _MMIO(0x4218)
+#define GEN12_VE0_AUX_INV  _MMIO(0x4238)
+#define GEN12_BCS0_AUX_INV _MMIO(0x4248)
 
 #define GEN8_RTCR  _MMIO(0x4260)
 #define GEN8_M1TCR _MMIO(0x4264)
@@ -342,14 +344,12 @@
 #define GEN8_BTCR  _MMIO(0x426c)
 #define GEN8_VTCR  _MMIO(0x4270)
 
-#define GEN12_VD2_AUX_NV   _MMIO(0x4298)
-#define GEN12_VD3_AUX_NV   _MMIO(0x42a8)
-#define GEN12_VE0_AUX_NV   _MMIO(0x4238)
-
 #define BLT_HWS_PGA_GEN7   _MMIO(0x4280)
 
-#define GEN12_VE1_AUX_NV   _MMIO(0x42b8)
+#define GEN12_VD2_AUX_INV  _MMIO(0x4298)
+#define GEN12_CCS0_AUX_INV _MMIO(0x42c8)
 #define   AUX_INV  REG_BIT(0)
+
 #define VEBOX_HWS_PGA_GEN7 _MMIO(0x4380)
 
 #define GEN12_AUX_ERR_DBG  _MMIO(0x43f4)
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c 
b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 1b710102390bf..235f3fab60a98 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -1374,7 +1374,7 @@ gen12_emit_indirect_ctx_rcs(const struct intel_context 
*ce, u32 *cs)
/* hsdes: 1809175790 */
if (!HAS_FLAT_CCS(ce->engine->i915))
cs = gen12_emit_aux_table_inv(ce->engine->gt,
- cs, GEN12_GFX_CCS_AUX_NV);
+ cs, GEN12_CCS_AUX_INV);
 
/* Wa_16014892111 */
if (IS_MTL_GRAPHICS_STEP(ce->engine->i915, M, STEP_A0, STEP_B0) ||
@@ -1403,10 +1403,10 @@ gen12_emit_indirect_ctx_xcs(const struct intel_context 
*ce, u32 *cs)
if (!HAS_FLAT_CCS(ce->engine->i915)) {
if (ce->engine->class == VIDEO_DECODE_CLASS)
cs = gen12_emit_aux_table_inv(ce->engine->gt,
- cs, GEN12_VD0_AUX_NV);
+   

Re: [PATCH] backlight: gpio_backlight: Drop output gpio direction check for initial power state

2023-07-20 Thread Andy Shevchenko
On Thu, Jul 20, 2023 at 2:27 PM Daniel Thompson
 wrote:
>
> On Thu, Jul 20, 2023 at 06:06:27AM +, Ying Liu wrote:
> > Bootloader may leave gpio direction as input and gpio value as logical low.
> > It hints that initial backlight power state should be FB_BLANK_POWERDOWN
> > since the gpio value is literally logical low.
>
> To be honest this probably "hints" that the bootloader simply didn't
> consider the backlight at all :-) . I'd rather the patch description
> focus on what circumstances lead to the current code making a bad
> decision. More like:
>
>   If the GPIO pin is in the input state but the backlight is currently
>   off due to default pull downs then ...
>
> > So, let's drop output gpio
> > direction check and only check gpio value to set the initial power state.
>
> This check was specifically added by Bartosz so I'd be interested in his
> opinion of this change (especially since he is now a GPIO maintainer)!
>
> What motivates (or motivated) the need to check the direction rather
> than just read that current logic level on the pin?

...

> > - else if (gpiod_get_direction(gbl->gpiod) == 0 &&
> > -  gpiod_get_value_cansleep(gbl->gpiod) == 0)
> > + else if (gpiod_get_value_cansleep(gbl->gpiod) == 0)
> >   bl->props.power = FB_BLANK_POWERDOWN;

The code before this patch needs a bit of elaboration. There is no
prohibition on reading the value of a pin, whatever its direction.
I.o.w. if the direction here is a problem it should have been
configured beforehand.
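
To make the behavioural difference of the quoted hunk concrete, here is a
small stand-alone model. It is only a sketch: struct demo_pin and the
demo_* helpers are hypothetical stand-ins for the GPIO descriptor and the
probe logic, and only the single branch touched by the patch is modelled.
The direction encoding (0 for output, 1 for input) follows what
gpiod_get_direction() returns.

```c
#include <assert.h>

enum demo_dir { DEMO_DIR_OUT = 0, DEMO_DIR_IN = 1 };
enum demo_power { DEMO_UNBLANK = 0, DEMO_POWERDOWN = 4 };

/* Hypothetical snapshot of the pin state left behind by the bootloader. */
struct demo_pin {
	enum demo_dir dir;
	int value;	/* logical level read back from the pin */
};

/* Before the patch: power down only if the pin is an output driving low. */
static enum demo_power demo_initial_power_old(const struct demo_pin *pin)
{
	assert(pin);
	if (pin->dir == DEMO_DIR_OUT && pin->value == 0)
		return DEMO_POWERDOWN;
	return DEMO_UNBLANK;
}

/* After the patch: only the logical level matters. */
static enum demo_power demo_initial_power_new(const struct demo_pin *pin)
{
	assert(pin);
	if (pin->value == 0)
		return DEMO_POWERDOWN;
	return DEMO_UNBLANK;
}
```

The two versions differ only for an input pin that reads low, which is
exactly the bootloader case described in the patch.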

-- 
With Best Regards,
Andy Shevchenko


Re: [PATCH -next] drm/amdgpu: Fix one kernel-doc comment

2023-07-20 Thread Alex Deucher
Applied.  Thanks!

On Wed, Jul 19, 2023 at 9:27 PM Randy Dunlap  wrote:
>
>
>
> On 7/19/23 18:05, Yang Li wrote:
> > Use a colon to separate the parameter name from its description.
> > This silences the warning:
> >
> > drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c:793: warning: Function parameter 
> > or member 'adev' not described in 'amdgpu_vm_pte_update_noretry_flags'
> >
> > Signed-off-by: Yang Li 
>
> Reviewed-by: Randy Dunlap 
> Thanks.
>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c 
> > b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c
> > index 83e1923f6775..96d601e209b8 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c
> > @@ -783,7 +783,7 @@ int amdgpu_vm_pde_update(struct amdgpu_vm_update_params 
> > *params,
> >  /**
> >   * amdgpu_vm_pte_update_noretry_flags - Update PTE no-retry flags
> >   *
> > - * @adev - amdgpu_device pointer
> > + * @adev: amdgpu_device pointer
> >   * @flags: pointer to PTE flags
> >   *
> >   * Update PTE no-retry flags when TF is enabled.
>
> --
> ~Randy


Re: [PATCH v6 05/11] drm/mediatek: dp: Move AUX_P0 setting to mtk_dp_initialize_aux_settings()

2023-07-20 Thread Alexandre Mergnat

Reviewed-by: Alexandre Mergnat 

On 17/07/2023 16:14, AngeloGioacchino Del Regno wrote:

Move the register write to MTK_DP_AUX_P0_3690 to set the AUX reply mode
to function mtk_dp_initialize_aux_settings(), as this is effectively
part of the DPTX AUX setup sequence.


--
Regards,
Alexandre


Re: [PATCH v6 04/11] drm/mediatek: dp: Use devm variant of drm_bridge_add()

2023-07-20 Thread Alexandre Mergnat

Reviewed-by: Alexandre Mergnat 

On 17/07/2023 16:14, AngeloGioacchino Del Regno wrote:

In preparation for adding support for aux-bus, which will add a code
path that may fail after the drm_bridge_add() call, change that to
devm_drm_bridge_add() to simplify failure paths later.


--
Regards,
Alexandre


Re: [PATCH v6 03/11] drm/mediatek: dp: Change logging to dev for mtk_dp_aux_transfer()

2023-07-20 Thread Alexandre Mergnat

Reviewed-by: Alexandre Mergnat 

On 17/07/2023 16:14, AngeloGioacchino Del Regno wrote:

Change logging from drm_{err,info}() to dev_{err,info}() in functions
mtk_dp_aux_transfer() and mtk_dp_aux_do_transfer(): this will be
essential to avoid getting NULL pointer kernel panics if any kind
of error happens during AUX transfers happening before the bridge
is attached.

This may potentially start happening in a later commit implementing
aux-bus support, as AUX transfers will be triggered from the panel
driver (for EDID) before the mtk-dp bridge gets attached, and it's
done in preparation for the same.


--
Regards,
Alexandre


Re: [PATCH v6 02/11] drm/mediatek: dp: Move AUX and panel poweron/off sequence to function

2023-07-20 Thread Alexandre Mergnat

Reviewed-by: Alexandre Mergnat 

On 17/07/2023 16:14, AngeloGioacchino Del Regno wrote:

Every time we run bridge detection and/or EDID read we run a poweron
and poweroff sequence for both the AUX and the panel; moreover, this
is also done when enabling the bridge in the .atomic_enable() callback.

Move this power on/off sequence to a new mtk_dp_aux_panel_poweron()
function so that it can be shared.
Note that, before this commit, in mtk_dp_bridge_atomic_enable() only
the AUX was getting powered on but the panel was left powered off if
the DP cable wasn't plugged in while now we unconditionally send a D0
request and this is done for two reasons:
  - First, whether this request fails or not, it takes the same time
    and anyway the DP hardware won't produce any error (or, if it
    does, it's ignorable because it won't block further commands)
  - Second, training the link with a sleeping/standby/unpowered
    display makes little sense.


--
Regards,
Alexandre


Re: [PATCH v6 01/11] drm/mediatek: dp: Add missing error checks in mtk_dp_parse_capabilities

2023-07-20 Thread Alexandre Mergnat

Reviewed-by: Alexandre Mergnat 

On 17/07/2023 16:14, AngeloGioacchino Del Regno wrote:

If reading the RX capabilities fails the training pattern will be set
wrongly: add error checking for drm_dp_read_dpcd_caps() and return if
anything went wrong with it.

While at it, also add a less critical error check when writing to
clear the ESI0 IRQ vector.


--
Regards,
Alexandre


Re: [PATCH v6 11/11] drm/mediatek: dp: Don't register HPD interrupt handler for eDP case

2023-07-20 Thread Alexandre Mergnat

Reviewed-by: Alexandre Mergnat 

On 17/07/2023 16:14, AngeloGioacchino Del Regno wrote:

The interrupt handler for HPD is useful only if a display is actually
supposed to be hotpluggable, as that manages the machinery to perform
cable (un)plug detection, debouncing and setup for re-training.

Since eDP panels are not supposed to be hotpluggable we can avoid
using the HPD interrupts altogether and rely on HPD polling only
for the suspend/resume case, saving us some spinlocking action and
the overhead of interrupts firing at every suspend/resume cycle,
achieving a faster (even if just slightly) display resume.


--
Regards,
Alexandre


Re: [PATCH] drm/msm: Check for the GPU IOMMU during bind

2023-07-20 Thread Rob Clark
On Thu, Jul 6, 2023 at 11:55 AM Dmitry Baryshkov
 wrote:
>
> On 10/03/2023 00:20, Jordan Crouse wrote:
> > While booting with amd,imageon on a headless target the GPU probe was
> > failing with -ENOSPC in get_pages() from msm_gem.c.
> >
> > Investigation showed that the driver was using the default 16MB VRAM
> > carveout because msm_use_mmu() was returning false since headless devices
> > use a dummy parent device. Avoid this by extending the existing is_a2xx
> > priv member to check the GPU IOMMU state on all platforms and use that
> > check in msm_use_mmu().
> >
> > This works for memory allocations but it doesn't prevent the VRAM carveout
> > from being created because that happens before we have a chance to check
> > the GPU IOMMU state in adreno_bind.
> >
> > There are a number of possible options to resolve this but none of them are
> > very clean. The easiest way is likely to specify vram=0 as a module parameter
> > on headless devices so that the memory doesn't get wasted.
>
> This patch was on my plate for quite a while, please excuse me for
> taking it so long.
>
> I see the following problem with the current code. We have two different
> instances that can access memory: MDP/DPU and GPU. And each of them can
> either have or miss the MMU.
>
> For some time I toyed with the idea of determining whether the allocated
> BO is going to be used by display or by GPU, but then I abandoned it. We
> can have display BOs being filled by GPU, so handling it this way would
> complicate things a lot.

There is MSM_BO_SCANOUT .. but it wouldn't completely surprise me if
it isn't used in some place or other where it should be.  But
that is the hint that contiguous allocation should be used if the
display doesn't support some sort of iommu.  (Using a GPU without some
sort of mmu/iommu isn't something sane to do.. the only reason the
support for that exists at all is to aid bringup.  I wouldn't call
that a "supported" configuration.)

> This actually rings a tiny bell in my head with the idea of splitting
> the display and GPU parts to two different drivers, but I'm not sure
> what would be the overall impact.

Userspace does have better support for split display/gpu these days
than it did when drm/msm was first merged.  It _might_ just work if
one device only advertised DRIVER_RENDER and the other
MODESET/ATOMIC.. but I'd be a bit concerned about breaking things.  I
guess you could try some sort of kconfig knob to have two "msm"
devices and see what breaks, but I'm a bit skeptical that we could
make this the default anytime soon.

For now, just addressing the only-display and only-gpu cases
(continuing with the single device arrangement when you have both
display and gpu), maybe split up drm_dev_alloc() and drm_dev_init() so
that we could use drm_device::driver_features to mask out
DRIVER_RENDER if needed.

BR,
-R
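
A rough, stand-alone model of the feature-masking idea above, for
illustration only. The demo_* types are hypothetical and merely imitate
the relationship between the driver's feature bits and the per-device
drm_device::driver_features mask; this is not DRM code and says nothing
about how drm_dev_alloc()/drm_dev_init() would actually be split.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits standing in for the DRIVER_* flags. */
enum demo_feature {
	DEMO_RENDER  = 1u << 0,
	DEMO_MODESET = 1u << 1,
	DEMO_ATOMIC  = 1u << 2,
};

struct demo_driver {
	uint32_t driver_features;	/* what the driver can do in general */
};

struct demo_device {
	const struct demo_driver *driver;
	uint32_t driver_features;	/* per-device mask, all ones by default */
};

/* A feature is usable only if both the driver and the device allow it. */
static bool demo_check_feature(const struct demo_device *dev, uint32_t feature)
{
	assert(dev && dev->driver);
	return (dev->driver->driver_features & dev->driver_features & feature) != 0;
}

/* Init step that masks out rendering when no GPU was found. */
static void demo_device_init(struct demo_device *dev,
			     const struct demo_driver *drv, bool has_gpu)
{
	assert(dev && drv);
	dev->driver = drv;
	dev->driver_features = ~0u;
	if (!has_gpu)
		dev->driver_features &= ~(uint32_t)DEMO_RENDER;
}
```

With has_gpu == false the device keeps the modeset and atomic bits but
stops advertising render, which is the display-only case.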

> More on the msm_use_mmu() below.
>
> >
> > Signed-off-by: Jordan Crouse 
> > ---
> >
> >   drivers/gpu/drm/msm/adreno/adreno_device.c | 6 +-
> >   drivers/gpu/drm/msm/msm_drv.c  | 7 +++
> >   drivers/gpu/drm/msm/msm_drv.h  | 2 +-
> >   3 files changed, 9 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c 
> > b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > index 36f062c7582f..4f19da28f80f 100644
> > --- a/drivers/gpu/drm/msm/adreno/adreno_device.c
> > +++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
> > @@ -539,7 +539,11 @@ static int adreno_bind(struct device *dev, struct 
> > device *master, void *data)
> >   DBG("Found GPU: %u.%u.%u.%u", config.rev.core, config.rev.major,
> >   config.rev.minor, config.rev.patchid);
> >
> > - priv->is_a2xx = config.rev.core == 2;
> > + /*
> > +  * A2xx has a built in IOMMU and all other IOMMU enabled targets will
> > +  * have an ARM IOMMU attached
> > +  */
> > + priv->has_gpu_iommu = config.rev.core == 2 || 
> > device_iommu_mapped(dev);
> >   priv->has_cached_coherent = config.rev.core >= 6;
> >
> >   gpu = info->init(drm);
> > diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
> > index aca48c868c14..a125a351ec90 100644
> > --- a/drivers/gpu/drm/msm/msm_drv.c
> > +++ b/drivers/gpu/drm/msm/msm_drv.c
> > @@ -318,11 +318,10 @@ bool msm_use_mmu(struct drm_device *dev)
> >   struct msm_drm_private *priv = dev->dev_private;
> >
> >   /*
> > -  * a2xx comes with its own MMU
> > -  * On other platforms IOMMU can be declared specified either for the
> > -  * MDP/DPU device or for its parent, MDSS device.
> > +  * Return true if the GPU or the MDP/DPU or parent MDSS device has an
> > +  * IOMMU
> >*/
> > - return priv->is_a2xx ||
> > + return priv->has_gpu_iommu ||
> >   device_iommu_mapped(dev->dev) ||
> >   device_iommu_mapped(dev->dev->parent);
>
> I have a generic feeling that both the old and new code are not fully correct.
> Please correct me if I'm wr

[PATCH 6.4] drm/client: Send hotplug event after registering a client

2023-07-20 Thread Mario Limonciello
From: Thomas Zimmermann 

Generate a hotplug event after registering a client to allow the
client to configure its display. Remove the hotplug calls from the
existing clients for fbdev emulation. This change fixes a concurrency
bug between registering a client and receiving events from the DRM
core. The bug is present in the fbdev emulation of all drivers.

The fbdev emulation currently generates a hotplug event before
registering the client to the device. For each new output, the DRM
core sends an additional hotplug event to each registered client.

If the DRM core detects a new output between sending the artificial
hotplug and registering the device, the output's hotplug event gets
lost. If this is the first output, the fbdev console display remains
dark. This has been observed with amdgpu and fbdev-generic.

Fix this by adding hotplug generation directly to the client's
register helper drm_client_register(). Registering the client and
receiving events are serialized by struct drm_device.clientlist_mutex.
So an output is either configured by the initial hotplug event, or
the client has already been registered.
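
The ordering problem can be illustrated with a deliberately simplified,
single-threaded model. All demo_* names are hypothetical; the real code
serializes these steps with struct drm_device.clientlist_mutex, which
this sketch only imitates by running each step to completion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_client {
	int outputs_configured;	/* outputs the client has set up so far */
};

struct demo_dev {
	int outputs;			/* outputs known to the core */
	struct demo_client *client;	/* registered client, or NULL */
};

/* Client-side hotplug handler: configure for all currently known outputs. */
static void demo_client_hotplug(struct demo_client *c, const struct demo_dev *d)
{
	c->outputs_configured = d->outputs;
}

/* Core detects a new output and notifies the registered client, if any. */
static void demo_core_new_output(struct demo_dev *d)
{
	d->outputs++;
	if (d->client)
		demo_client_hotplug(d->client, d);
}

/* Old ordering: artificial hotplug first, registration afterwards. */
static void demo_setup_old(struct demo_dev *d, struct demo_client *c,
			   bool output_in_window)
{
	assert(d && c);
	demo_client_hotplug(c, d);
	if (output_in_window)
		demo_core_new_output(d);	/* nobody registered: event lost */
	d->client = c;
}

/* New ordering: register, then send the hotplug, as one step. */
static void demo_setup_new(struct demo_dev *d, struct demo_client *c,
			   bool output_in_window)
{
	assert(d && c);
	if (output_in_window)
		demo_core_new_output(d);	/* picked up by the hotplug below */
	d->client = c;
	demo_client_hotplug(c, d);
}
```

In the old ordering an output that shows up inside the window is never
configured; in the new one it is covered either by the initial hotplug or
by the regular event delivered to the already registered client.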

The bug was originally added in commit 6e3f17ee73f7 ("drm/fb-helper:
generic: Call drm_client_add() after setup is done"), in which adding
a client and receiving a hotplug event switched order. It was hidden,
as most hardware and drivers have at least one static output configured.
Other drivers didn't use the internal DRM client or still had struct
drm_mode_config_funcs.output_poll_changed set. That callback handled
hotplug events as well. After not setting the callback in amdgpu in
commit 0e3172bac3f4 ("drm/amdgpu: Don't set struct
drm_driver.output_poll_changed"), amdgpu did not show a framebuffer
console if output events got lost. The bug got copy-pasted from
fbdev-generic into the other fbdev emulation.

Reported-by: Moritz Duge 
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2649
Fixes: 6e3f17ee73f7 ("drm/fb-helper: generic: Call drm_client_add() after setup 
is done")
Fixes: 8ab59da26bc0 ("drm/fb-helper: Move generic fbdev emulation into separate 
source file")
Fixes: b79fe9abd58b ("drm/fbdev-dma: Implement fbdev emulation for GEM DMA 
helpers")
Fixes: 63c381552f69 ("drm/armada: Implement fbdev emulation as in-kernel 
client")
Fixes: 49953b70e7d3 ("drm/exynos: Implement fbdev emulation as in-kernel 
client")
Fixes: 8f1aaccb04b7 ("drm/gma500: Implement client-based fbdev emulation")
Fixes: 940b869c2f2f ("drm/msm: Implement fbdev emulation as in-kernel client")
Fixes: 9e69bcd88e45 ("drm/omapdrm: Implement fbdev emulation as in-kernel 
client")
Fixes: e317a69fe891 ("drm/radeon: Implement client-based fbdev emulation")
Fixes: 71ec16f45ef8 ("drm/tegra: Implement fbdev emulation as in-kernel client")
Fixes: 0e3172bac3f4 ("drm/amdgpu: Don't set struct 
drm_driver.output_poll_changed")
Signed-off-by: Thomas Zimmermann 
Tested-by: Moritz Duge 
Tested-by: Torsten Krah 
Tested-by: Paul Schyska 
Cc: Daniel Vetter 
Cc: David Airlie 
Cc: Noralf Trønnes 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Javier Martinez Canillas 
Cc: Russell King 
Cc: Inki Dae 
Cc: Seung-Woo Kim 
Cc: Kyungmin Park 
Cc: Krzysztof Kozlowski 
Cc: Patrik Jakobsson 
Cc: Rob Clark 
Cc: Abhinav Kumar 
Cc: Dmitry Baryshkov 
Cc: Tomi Valkeinen 
Cc: Alex Deucher 
Cc: "Christian König" 
Cc: "Pan, Xinhui" 
Cc: Thierry Reding 
Cc: Mikko Perttunen 
Cc: dri-devel@lists.freedesktop.org
Cc: linux-ker...@vger.kernel.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-samsung-...@vger.kernel.org
Cc: linux-arm-...@vger.kernel.org
Cc: freedr...@lists.freedesktop.org
Cc: amd-...@lists.freedesktop.org
Cc: linux-te...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc:  # v5.2+
Reviewed-by: Javier Martinez Canillas 
Reviewed-by: Dmitry Baryshkov  # msm
Link: 
https://patchwork.freedesktop.org/patch/msgid/20230710091029.27503-1-tzimmerm...@suse.de
(cherry picked from commit 27655b9bb9f0d9c32b8de8bec649b676898c52d5)
Dropped changes to drivers/gpu/drm/armada/armada_fbdev.c as
174c3c38e3a2 drm/armada: Initialize fbdev DRM client
was introduced in 6.5-rc1
Signed-off-by: Mario Limonciello 
---
 drivers/gpu/drm/drm_client.c  | 21 +
 drivers/gpu/drm/drm_fbdev_dma.c   |  4 
 drivers/gpu/drm/drm_fbdev_generic.c   |  4 
 drivers/gpu/drm/exynos/exynos_drm_fbdev.c |  4 
 drivers/gpu/drm/gma500/fbdev.c|  4 
 drivers/gpu/drm/msm/msm_fbdev.c   |  4 
 drivers/gpu/drm/omapdrm/omap_fbdev.c  |  4 
 drivers/gpu/drm/radeon/radeon_fbdev.c |  4 
 drivers/gpu/drm/tegra/fbdev.c |  4 
 9 files changed, 21 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/drm_client.c b/drivers/gpu/drm/drm_client.c
index f6292ba0e6fc..037e36f2049c 100644
--- a/drivers/gpu/drm/drm_client.c
+++ b/drivers/gpu/drm/drm_client.c
@@ -122,13 +122,34 @@ EXPORT_SYMBOL(drm_client_init);
  * drm_client_register() it is no longer permissible to call 
drm_client_release

[PATCH 6.1] drm/client: Send hotplug event after registering a client

2023-07-20 Thread Mario Limonciello
From: Thomas Zimmermann 

Generate a hotplug event after registering a client to allow the
client to configure its display. Remove the hotplug calls from the
existing clients for fbdev emulation. This change fixes a concurrency
bug between registering a client and receiving events from the DRM
core. The bug is present in the fbdev emulation of all drivers.

The fbdev emulation currently generates a hotplug event before
registering the client to the device. For each new output, the DRM
core sends an additional hotplug event to each registered client.

If the DRM core detects a new output between sending the artificial
hotplug and registering the device, the output's hotplug event gets
lost. If this is the first output, the fbdev console display remains
dark. This has been observed with amdgpu and fbdev-generic.

Fix this by adding hotplug generation directly to the client's
register helper drm_client_register(). Registering the client and
receiving events are serialized by struct drm_device.clientlist_mutex.
So an output is either configured by the initial hotplug event, or
the client has already been registered.

The bug was originally added in commit 6e3f17ee73f7 ("drm/fb-helper:
generic: Call drm_client_add() after setup is done"), in which adding
a client and receiving a hotplug event switched order. It was hidden,
as most hardware and drivers have at least one static output configured.
Other drivers didn't use the internal DRM client or still had struct
drm_mode_config_funcs.output_poll_changed set. That callback handled
hotplug events as well. After not setting the callback in amdgpu in
commit 0e3172bac3f4 ("drm/amdgpu: Don't set struct
drm_driver.output_poll_changed"), amdgpu did not show a framebuffer
console if output events got lost. The bug got copy-pasted from
fbdev-generic into the other fbdev emulation.

Reported-by: Moritz Duge 
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2649
Fixes: 6e3f17ee73f7 ("drm/fb-helper: generic: Call drm_client_add() after setup 
is done")
Fixes: 8ab59da26bc0 ("drm/fb-helper: Move generic fbdev emulation into separate 
source file")
Fixes: b79fe9abd58b ("drm/fbdev-dma: Implement fbdev emulation for GEM DMA 
helpers")
Fixes: 63c381552f69 ("drm/armada: Implement fbdev emulation as in-kernel 
client")
Fixes: 49953b70e7d3 ("drm/exynos: Implement fbdev emulation as in-kernel 
client")
Fixes: 8f1aaccb04b7 ("drm/gma500: Implement client-based fbdev emulation")
Fixes: 940b869c2f2f ("drm/msm: Implement fbdev emulation as in-kernel client")
Fixes: 9e69bcd88e45 ("drm/omapdrm: Implement fbdev emulation as in-kernel 
client")
Fixes: e317a69fe891 ("drm/radeon: Implement client-based fbdev emulation")
Fixes: 71ec16f45ef8 ("drm/tegra: Implement fbdev emulation as in-kernel client")
Fixes: 0e3172bac3f4 ("drm/amdgpu: Don't set struct 
drm_driver.output_poll_changed")
Signed-off-by: Thomas Zimmermann 
Tested-by: Moritz Duge 
Tested-by: Torsten Krah 
Tested-by: Paul Schyska 
Cc: Daniel Vetter 
Cc: David Airlie 
Cc: Noralf Trønnes 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Javier Martinez Canillas 
Cc: Russell King 
Cc: Inki Dae 
Cc: Seung-Woo Kim 
Cc: Kyungmin Park 
Cc: Krzysztof Kozlowski 
Cc: Patrik Jakobsson 
Cc: Rob Clark 
Cc: Abhinav Kumar 
Cc: Dmitry Baryshkov 
Cc: Tomi Valkeinen 
Cc: Alex Deucher 
Cc: "Christian König" 
Cc: "Pan, Xinhui" 
Cc: Thierry Reding 
Cc: Mikko Perttunen 
Cc: dri-devel@lists.freedesktop.org
Cc: linux-ker...@vger.kernel.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-samsung-...@vger.kernel.org
Cc: linux-arm-...@vger.kernel.org
Cc: freedr...@lists.freedesktop.org
Cc: amd-...@lists.freedesktop.org
Cc: linux-te...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc:  # v5.2+
Reviewed-by: Javier Martinez Canillas 
Reviewed-by: Dmitry Baryshkov  # msm
Link: 
https://patchwork.freedesktop.org/patch/msgid/20230710091029.27503-1-tzimmerm...@suse.de
(cherry picked from commit 27655b9bb9f0d9c32b8de8bec649b676898c52d5)
Dropped changes to drivers/gpu/drm/armada/armada_fbdev.c as
174c3c38e3a2 drm/armada: Initialize fbdev DRM client
was introduced in 6.5-rc1.

Dropped changes to exynos, msm, omapdrm, radeon, tegra drivers
as 6.1.y is missing the code these commits introduced:

99286486d674 drm/exynos: Initialize fbdev DRM client
841ef552b141 drm/msm: Initialize fbdev DRM client
9e69bcd88e45 drm/omapdrm: Implement fbdev emulation as in-kernel client
e317a69fe891 drm/radeon: Implement client-based fbdev emulation
9b926bcf2636 drm/radeon: Only build fbdev if DRM_FBDEV_EMULATION is set
25dda38e0b07 drm/tegra: Initialize fbdev DRM client
8f1aaccb04b7 drm/gma500: Implement client-based fbdev emulation
b79fe9abd58b drm/fbdev-dma: Implement fbdev emulation for GEM DMA helpers

Move code for drm-fbdev-generic.c to matching file in 6.1.y because
these commits haven't happened in 6.1.y.
8ab59da26bc0 drm/fb-helper: Move generic fbdev emulation into separate source file
b9c93f4ec737 drm/fbdev-generic: Rename symbols

Cc: alexandru.gagn...@hp.com
Link: 
https://lore.ker

Re: [PATCH 2/2] drm/i915: Avoid -Wconstant-logical-operand in nsecs_to_jiffies_timeout()

2023-07-20 Thread Nathan Chancellor
On Thu, Jul 20, 2023 at 09:43:05AM +0100, Tvrtko Ursulin wrote:
> 
> On 18/07/2023 22:44, Nathan Chancellor wrote:
> > A proposed update to clang's -Wconstant-logical-operand to warn when the
> > left hand side is a constant shows the following instance in
> > nsecs_to_jiffies_timeout() when NSEC_PER_SEC is not a multiple of HZ,
> > such as CONFIG_HZ=300:
> > 
> > >drivers/gpu/drm/i915/gem/i915_gem_wait.c:189:24: warning: use of logical '&&' with constant operand [-Wconstant-logical-operand]
> > >  189 | if (NSEC_PER_SEC % HZ &&
> > >  | ~ ^
> > >drivers/gpu/drm/i915/gem/i915_gem_wait.c:189:24: note: use '&' for a bitwise operation
> > >  189 | if (NSEC_PER_SEC % HZ &&
> > >  |   ^~
> > >  |   &
> > >drivers/gpu/drm/i915/gem/i915_gem_wait.c:189:24: note: remove constant to silence this warning
> > >1 warning generated.
> > 
> > Turn this into an explicit comparison against zero to make the
> > expression a boolean to make it clear this should be a logical check,
> > not a bitwise one.
> 
> So -Wconstant-logical-operand only triggers when it is a
> constant but not a zero constant? Why does that make sense, is it not
> a kludge to avoid too much noise?

Yes, the warning purposefully does not trigger when the constant is a 1
or 0 (as those are usually indicative of an intentional logical
operation):

https://github.com/llvm/llvm-project/blob/dfdfd306cfaf54fbc43e2d5eb36489dac3eb9976/clang/lib/Sema/SemaExpr.cpp#L13917-L13919

In this case, it is 100, so I kind of understand why this might be
ambiguous to the compiler.
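
To make the arithmetic concrete, here is a minimal userspace sketch, not the
i915 code. It assumes HZ is 300 as in the report; NSEC_PER_SEC_SKETCH,
HZ_SKETCH, hz_remainder() and overflow_guard_applies() are names made up for
this illustration. With these values the remainder is 100, a constant that is
neither 0 nor 1, and the explicit "!= 0" turns it into a boolean operand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration only; they mirror CONFIG_HZ=300. */
#define NSEC_PER_SEC_SKETCH 1000000000LL
#define HZ_SKETCH 300LL

/* Compile-time check of the remainder quoted in the thread. */
static_assert(NSEC_PER_SEC_SKETCH % HZ_SKETCH == 100,
	      "remainder for HZ=300 is expected to be 100");

/* The constant that ends up as the left-hand operand of "&&". */
long long hz_remainder(void)
{
	return NSEC_PER_SEC_SKETCH % HZ_SKETCH;
}

/*
 * Hypothetical stand-in for the check in nsecs_to_jiffies_timeout():
 * comparing the remainder against zero makes the constant operand an
 * explicit boolean, so the "&&" reads as a logical test rather than a
 * mistyped bitwise "&".
 */
bool overflow_guard_applies(uint64_t secs, uint64_t limit)
{
	return NSEC_PER_SEC_SKETCH % HZ_SKETCH != 0 && secs >= limit;
}
```

When the remainder is 0 (for example HZ=250 or HZ=1000) the whole condition
folds to false at compile time, which is why the implicit and explicit forms
generate the same code.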

> Personally, it all feels a bit over the top as a warning,
> since code in both cases should optimise away. And we may end

I do not necessarily disagree, as you can see from the differential
review that I linked in the message, but I also understand it is a fine
line to tread when writing compiler warnings between wanting to catch
as many potential problems as possible and having too much noise for
developers to sift through. I think this is erring on the side of
caution.

> up papering over it if it becomes a default.

diagtool tree tells me this warning is already on by default.

> Then again this patch IMO does make the code more readable, so

I think so too.

> I am happy to take this one via our tree. Or either give ack to
> bring it in via drm-misc-next:
> 
> Acked-by: Tvrtko Ursulin 
> 
> Let me know which route works best.

Thanks for the feedback! Either route is fine with me but if the v3d
patch is going to go in via drm-misc-next, it seems like it would not be
too much trouble to push this one with it.

Cheers,
Nathan


Re: [PATCH v6 10/11] drm/mediatek: dp: Add .wait_hpd_asserted() for AUX bus

2023-07-20 Thread Alexandre Mergnat

Reviewed-by: Alexandre Mergnat 

On 17/07/2023 16:14, AngeloGioacchino Del Regno wrote:

In order to support usecases in which the panel regulator can be
switched on and off to save power, and usecases in which the panel
regulator is off at boot, add a .wait_hpd_asserted() callback for
the AUX bus: this will make sure to wait until the panel is fully
ready after power-on before trying to communicate with it.

Also, parse the eDP display capabilities in that callback, so that
we can also avoid using the .get_edid() callback from this bridge.

Since at this point the hpd machinery is performed in the new hpd
callback and the detection and edid reading are done outside of
this driver, assign the DRM_BRIDGE_OP_{DETECT, EDID, HPD} ops and
register the bridge unconditionally at probe time only if we are
probing full DisplayPort and not eDP while, for the latter, we
register the bridge in the .done_probing() callback and only if
the panel was found and triggered HPD.


--
Regards,
Alexandre


Re: [PATCH v6 6/9] drm/i915/gt: Ensure memory quiesced before invalidation for all engines

2023-07-20 Thread Andi Shyti
Hi Nirmoy,

> +   if (aux_inv) {
> +   u32 bit_group_0 = 0;
> +   u32 bit_group_1 = 0;
> +
> +   cmd += 4;
> +
> +   bit_group_0 |= PIPE_CONTROL0_HDC_PIPELINE_FLUSH;
> +
> +   switch (rq->engine->class) {
> +   case VIDEO_DECODE_CLASS:
> +   bit_group_1 |= PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH;
> +   bit_group_1 |= PIPE_CONTROL_DEPTH_CACHE_FLUSH;
> +   bit_group_1 |= PIPE_CONTROL_DC_FLUSH_ENABLE;
> +   bit_group_1 |= PIPE_CONTROL_FLUSH_L3;
> +   bit_group_1 |= PIPE_CONTROL_CS_STALL;
> +
> +   intel_emit_pipe_control_cs(rq, bit_group_0, bit_group_1,
> +  LRC_PPHWSP_SCRATCH_ADDR);
> 
> 
> I think pipe control is only for compute and render engines.
> 
> +
> +   break;
> +
> +   case VIDEO_ENHANCEMENT_CLASS:
> +   case COMPUTE_CLASS:
> 
> Don't think gen12_emit_flush_xcs() will get called for compute engine.
> 
> intel_guc_submission_setup() --> rcs_submission_override() replaces
> gen12_emit_flush_xcs() with gen12_emit_flush_rcs()
> 
> for compute and render.

yes, I made some confusion here... this part is bogus... will try
to clean things up and try again.

Andi


Re: [PATCH v2 05/11] drm/tests: helpers: Create a helper to allocate a locking ctx

2023-07-20 Thread kernel test robot
Hi Maxime,

kernel test robot noticed the following build warnings:

[auto build test WARNING on c58c49dd89324b18a812762a2bfa5a0458e4f252]

url:    https://github.com/intel-lab-lkp/linux/commits/Maxime-Ripard/drm-tests-helpers-Switch-to-kunit-actions/20230720-191901
base:   c58c49dd89324b18a812762a2bfa5a0458e4f252
patch link:    https://lore.kernel.org/r/20230720-kms-kunit-actions-rework-v2-5-175017bd56ab%40kernel.org
patch subject: [PATCH v2 05/11] drm/tests: helpers: Create a helper to allocate a locking ctx
config: alpha-allyesconfig (https://download.01.org/0day-ci/archive/20230720/202307202244.26vyezkj-...@intel.com/config)
compiler: alpha-linux-gcc (GCC) 12.3.0
reproduce: (https://download.01.org/0day-ci/archive/20230720/202307202244.26vyezkj-...@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot 
| Closes: https://lore.kernel.org/oe-kbuild-all/202307202244.26vyezkj-...@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/gpu/drm/tests/drm_kunit_helpers.c:145: warning: expecting prototype for drm_kunit_helper_context_alloc(). Prototype was for drm_kunit_helper_acquire_ctx_alloc() instead


vim +145 drivers/gpu/drm/tests/drm_kunit_helpers.c

   130  
   131  /**
   132   * drm_kunit_helper_context_alloc - Allocates an acquire context
   133   * @test: The test context object
   134   *
   135   * Allocates and initializes a modeset acquire context.
   136   *
   137   * The context is tied to the kunit test context, so we must not call
   138   * drm_modeset_acquire_fini() on it, it will be done so automatically.
   139   *
   140   * Returns:
   141   * An ERR_PTR on error, a pointer to the newly allocated context otherwise
   142   */
   143  struct drm_modeset_acquire_ctx *
   144  drm_kunit_helper_acquire_ctx_alloc(struct kunit *test)
 > 145  {
   146  struct drm_modeset_acquire_ctx *ctx;
   147  int ret;
   148  
   149  ctx = kunit_kzalloc(test, sizeof(*ctx), GFP_KERNEL);
   150  KUNIT_ASSERT_NOT_NULL(test, ctx);
   151  
   152  drm_modeset_acquire_init(ctx, 0);
   153  
   154  ret = kunit_add_action_or_reset(test,
   155  action_drm_release_context,
   156  ctx);
   157  if (ret)
   158  return ERR_PTR(ret);
   159  
   160  return ctx;
   161  }
   162  EXPORT_SYMBOL_GPL(drm_kunit_helper_acquire_ctx_alloc);
   163  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


Re: [PATCH v6 09/11] drm/mediatek: dp: Add support for embedded DisplayPort aux-bus

2023-07-20 Thread Alexandre Mergnat

Reviewed-by: Alexandre Mergnat 

On 17/07/2023 16:14, AngeloGioacchino Del Regno wrote:

For the eDP case we can support using aux-bus on MediaTek DP: this
gives us the possibility to declare our panel as generic "panel-edp"
which will automatically configure the timings and available modes
via the EDID that we read from it.

To do this, move the panel parsing at the end of the probe function
so that the hardware is initialized beforehand and also initialize
the DPTX AUX block and power both on as, when we populate the
aux-bus, the panel driver will trigger an EDID read to perform
panel detection.

Last but not least, since now the AUX transfers can happen in the
separated aux-bus, it was necessary to add an exclusion for the
cable_plugged_in check in `mtk_dp_aux_transfer()` and the easiest
way to do this is to simply ignore checking that when the bridge
type is eDP.
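
The exclusion described in the last paragraph can be sketched in plain C
outside the kernel. This is only an illustration: struct sketch_dp, the enum
and aux_transfer_allowed() are hypothetical stand-ins for the driver state,
and returning -EAGAIN for the unplugged case is an assumption of this sketch.
Only the idea of skipping the cable check for eDP comes from the commit
message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bridge types; the real driver uses DRM connector types. */
enum sketch_bridge_type { SKETCH_BRIDGE_DP, SKETCH_BRIDGE_EDP };

/* Hypothetical subset of the driver state relevant to the check. */
struct sketch_dp {
	enum sketch_bridge_type bridge_type;
	bool cable_plugged_in;
};

/*
 * Returns 0 if an AUX transfer may proceed, or a negative errno.
 * For eDP the panel sits on the aux-bus and transfers happen before
 * any hotplug event, so the cable check is skipped for that type.
 */
int aux_transfer_allowed(const struct sketch_dp *dp)
{
	assert(dp != NULL);

	if (dp->bridge_type != SKETCH_BRIDGE_EDP && !dp->cable_plugged_in)
		return -EAGAIN; /* assumed error code for "not connected yet" */

	return 0;
}
```

With this shape, an eDP bridge with cable_plugged_in still false is allowed
to perform the EDID read that the panel driver triggers during detection.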


--
Regards,
Alexandre


Re: [PATCH v3] drm/i915: Refactor PAT/object cache handling

2023-07-20 Thread Tvrtko Ursulin



[Here let me just focus on the points which did not get further discussion in follow ups yet.]

On 19/07/2023 23:31, Matt Roper wrote:

On Wed, Jul 19, 2023 at 01:37:30PM +0100, Tvrtko Ursulin wrote:

From: Tvrtko Ursulin 

Commit 9275277d5324 ("drm/i915: use pat_index instead of cache_level") has
introduced PAT indices to i915 internal APIs, partially replacing the
usage of driver internal cache_level, but has also added a few
questionable design decisions which this patch tries to improve upon.

Principal change is to invert the per platform cache level to PAT index
table which was added by the referenced commit, and by doing so enable
i915 to understand the cache mode between PAT indices, changing them from
opaque to transparent.

Once we have the inverted table we are able to remove the hidden false
"return true" from i915_gem_object_has_cache_level.

Other changes/fixes/improvements we are able to do:

1)
Replace the enum i915_cache_level with i915_cache_t, composed of a more
detailed representation of each cache mode (base mode plus flags).

For instance this way we are able to express the difference between WB and
1-way coherent WB on Meteorlake. Which in turn enables us to map the i915
"cached" mode to the correct Meteorlake PAT index.

2)
We can cache PAT indices of the caching modes used by the driver itself in
struct drm_i915_private, which eliminates the runtime calls to
i915_gem_get_pat_index from both high- and low-level i915 components.

3)
We can also cache the caching modes used by the driver for coherent
access and for display buffers.

4)
Remove the incorrect references to enum i915_cache_level from low level
PTE encode vfuncs, since those are actually given PAT indices by their
callers.

5)
Because i915 now understands PAT indices, we can remove the overly
aggressive flushing triggered from i915_gem_object_can_bypass_llc() and
limit it to non-coherent write-back mode only.

6)
Finally we are able to replace the platform dependent cache mode to string
code in debugfs and elsewhere by the single implementation based on
i915_cache_t.
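
As a rough illustration of points 1) and 2), the sketch below shows one way a
"base mode plus flags" value and a per-platform table indexed by PAT index
could fit together. It is not the code from this patch: the bit layout, the
sketch_* names and the table contents are all assumptions made up for this
example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed encoding: base mode in bits 7:0, flags in bits 15:8. */
typedef uint16_t sketch_cache_t;

enum sketch_cache_mode {
	SKETCH_MODE_UNKNOWN = 0,
	SKETCH_MODE_UC,
	SKETCH_MODE_WB,
	SKETCH_MODE_WT,
};

#define SKETCH_FLAG_COH1W (1u << 0) /* 1-way coherent */

#define SKETCH_CACHE(mode, flags) \
	((sketch_cache_t)((mode) | ((flags) << 8)))
#define SKETCH_CACHE_MODE(c)  ((enum sketch_cache_mode)((c) & 0xff))
#define SKETCH_CACHE_FLAGS(c) ((unsigned int)((c) >> 8))

/* Made-up table indexed by PAT index; not taken from any real platform. */
static const sketch_cache_t sketch_pat_table[] = {
	SKETCH_CACHE(SKETCH_MODE_WB, 0),
	SKETCH_CACHE(SKETCH_MODE_WT, 0),
	SKETCH_CACHE(SKETCH_MODE_UC, 0),
	SKETCH_CACHE(SKETCH_MODE_WB, SKETCH_FLAG_COH1W),
};

#define SKETCH_PAT_COUNT \
	(sizeof(sketch_pat_table) / sizeof(sketch_pat_table[0]))

/* PAT index -> cache description: the lookup the inverted table enables. */
sketch_cache_t sketch_pat_to_cache(unsigned int pat_index)
{
	assert(pat_index < SKETCH_PAT_COUNT);
	return sketch_pat_table[pat_index];
}

/* Cache description -> PAT index, or -1 if the platform has no match. */
int sketch_cache_to_pat(sketch_cache_t cache)
{
	for (size_t i = 0; i < SKETCH_PAT_COUNT; i++)
		if (sketch_pat_table[i] == cache)
			return (int)i;
	return -1;
}
```

A driver could run the reverse lookup once at init time for the few modes it
uses itself and store the results, which is the kind of caching point 2)
describes.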

v2:
  * Fix PAT-to-cache-mode table for PVC. (Fei)
  * Cache display caching mode too. (Fei)
  * Improve and document criteria in i915_gem_object_can_bypass_llc() (Matt)

v3:
  * Checkpatch issues.
  * Cache mode flags check fixed.

Signed-off-by: Tvrtko Ursulin 
Fixes: 9275277d5324 ("drm/i915: use pat_index instead of cache_level")
Cc: Chris Wilson 
Cc: Fei Yang 
Cc: Andi Shyti 
Cc: Matt Roper 
---
  drivers/gpu/drm/i915/Makefile |   1 +
  .../drm/i915/display/intel_plane_initial.c|   3 +-
  drivers/gpu/drm/i915/gem/i915_gem_domain.c|  56 ---
  drivers/gpu/drm/i915/gem/i915_gem_domain.h|   5 +-
  .../gpu/drm/i915/gem/i915_gem_execbuffer.c|  13 +-
  drivers/gpu/drm/i915/gem/i915_gem_internal.c  |   4 +-
  drivers/gpu/drm/i915/gem/i915_gem_mman.c  |  12 +-
  drivers/gpu/drm/i915/gem/i915_gem_object.c| 152 +++---
  drivers/gpu/drm/i915/gem/i915_gem_object.h|  11 +-
  .../gpu/drm/i915/gem/i915_gem_object_types.h  | 116 +
  drivers/gpu/drm/i915/gem/i915_gem_shmem.c |   8 +-
  drivers/gpu/drm/i915/gem/i915_gem_stolen.c|  11 +-
  drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c  |  44 ++---
  drivers/gpu/drm/i915/gem/i915_gem_userptr.c   |   2 +-
  .../drm/i915/gem/selftests/huge_gem_object.c  |   4 +-
  .../gpu/drm/i915/gem/selftests/huge_pages.c   |   6 +-
  drivers/gpu/drm/i915/gt/gen6_ppgtt.c  |   4 +-
  drivers/gpu/drm/i915/gt/gen8_ppgtt.c  |  19 +--
  drivers/gpu/drm/i915/gt/intel_engine_cs.c |   2 +-
  drivers/gpu/drm/i915/gt/intel_ggtt.c  |  33 ++--
  drivers/gpu/drm/i915/gt/intel_ggtt_gmch.c |   4 +-
  drivers/gpu/drm/i915/gt/intel_gtt.c   |   2 +-
  drivers/gpu/drm/i915/gt/intel_gtt.h   |   3 +-
  drivers/gpu/drm/i915/gt/intel_migrate.c   |  11 +-
  drivers/gpu/drm/i915/gt/intel_ppgtt.c |   6 +-
  .../gpu/drm/i915/gt/intel_ring_submission.c   |   4 +-
  drivers/gpu/drm/i915/gt/intel_timeline.c  |   2 +-
  drivers/gpu/drm/i915/gt/selftest_hangcheck.c  |   2 +-
  drivers/gpu/drm/i915/gt/selftest_migrate.c|   9 +-
  drivers/gpu/drm/i915/gt/selftest_reset.c  |  14 +-
  drivers/gpu/drm/i915/gt/selftest_tlb.c|   5 +-
  .../gpu/drm/i915/gt/selftest_workarounds.c|   2 +-
  drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c  |   8 +-
  drivers/gpu/drm/i915/i915_cache.c |  91 +++
  drivers/gpu/drm/i915/i915_cache.h |  60 +++
  drivers/gpu/drm/i915/i915_debugfs.c   |  53 +-
  drivers/gpu/drm/i915/i915_driver.c|   5 +
  drivers/gpu/drm/i915/i915_drv.h   |   5 +
  drivers/gpu/drm/i915/i915_gem.c   |  21 +--
  drivers/gpu/drm/i915/i915_gpu_error.c |   7 +-
  drivers/gpu/drm/i915/i915_pci.c   |  82 +-
  drivers/gpu/drm/i915/i915_perf.c  |   2 +-
  drivers/gpu/drm/i915/intel_device_info.h  |

Re: [PATCH v3] drm/i915: Refactor PAT/object cache handling

2023-07-20 Thread Tvrtko Ursulin



On 20/07/2023 01:22, Matt Roper wrote:

On Wed, Jul 19, 2023 at 05:07:15PM -0700, Yang, Fei wrote:

[snip]

@@ -27,15 +28,8 @@ static bool gpu_write_needs_clflush(struct drm_i915_gem_object *obj)


The code change here looks accurate, but while we're here, I have a side
question about this function in general...it was originally introduced
in commit 48004881f693 ("drm/i915: Mark CPU cache as dirty when used for
rendering") which states that GPU rendering ends up in the CPU cache
(and thus needs a clflush later to make sure it lands in memory).  That
makes sense to me for LLC platforms, but is it really true for non-LLC
snooping platforms (like MTL) as the commit states?


For non-LLC platforms, objects can be set to 1-way coherent, which means
GPU rendering ends up in the CPU cache as well, so for non-LLC platforms
the logic here should be checking the 1-way coherent flag.


That's the part that I'm questioning (and not just for MTL, but for all
of our other non-LLC platforms too).  Just because there's coherency
doesn't mean that device writes landed in the CPU cache.  Coherency is
also achieved if device writes invalidate the contents of the CPU cache.
I thought our non-LLC snooping platforms were coherent due to
write-invalidate rather than write-update, but I can't find it
specifically documented anywhere at the moment.  If write-invalidate was
used, then there shouldn't be a need for a later clflush either.


[Trying to consolidate by doing a combined reply to the discussion so far.]

On the write-invalidate vs write-update I don't know. If you did not 
find it in bspec then I doubt I would. I can have a browse still.



My understanding
was that snooping platforms just invalidated the CPU cache to prevent
future CPU reads from seeing stale data but didn't actually stick any
new data in there?  Am I off track or is the original logic of this
function not quite right?

Anyway, even if the logic of this function is wrong, it's a mistake that
would only hurt performance


Yes, this logic will have a performance impact because it is missing the
check for obj->pat_set_by_user. For objects with pat_set_by_user==true,
even if the object is snooping or 1-way coherent, we don't want to enforce
a clflush here since the coherency is supposed to be handled by user space.


What do you think I should add to fix it?

Add a check for non-coherent WB in gpu_write_needs_clflush as an 
additional condition for returning false?


And then, if Matt is correct that write-invalidate is used, should !HAS_LLC
also just return false?



(flushing more often than we truly need to)
rather than functionality, so not something we really need to dig into
right now as part of this patch.


  if (IS_DGFX(i915))
  return false;

-/*
- * For objects created by userspace through GEM_CREATE with pat_index
- * set by set_pat extension, i915_gem_object_has_cache_level() will
- * always return true, because the coherency of such object is managed
- * by userspace. Othereise the call here would fall back to checking
- * whether the object is un-cached or write-through.
- */
-return !(i915_gem_object_has_cache_level(obj, I915_CACHE_NONE) ||
- i915_gem_object_has_cache_level(obj, I915_CACHE_WT));
+return i915_gem_object_has_cache_mode(obj, I915_CACHE_MODE_UC) != 1 &&
+   i915_gem_object_has_cache_mode(obj, I915_CACHE_MODE_WT) != 1;
  }


[snip]

@@ -640,15 +640,9 @@ static inline int use_cpu_reloc(const struct reloc_cache *cache,
  if (DBG_FORCE_RELOC == FORCE_GTT_RELOC)
  return false;

-/*
- * For objects created by userspace through GEM_CREATE with pat_index
- * set by set_pat extension, i915_gem_object_has_cache_level() always
- * return true, otherwise the call would fall back to checking whether
- * the object is un-cached.
- */
  return (cache->has_llc ||
  obj->cache_dirty ||
-!i915_gem_object_has_cache_level(obj, I915_CACHE_NONE));
+i915_gem_object_has_cache_mode(obj, I915_CACHE_MODE_UC) != 1);


Platforms with relocations and platforms with user-specified PAT have no
overlap, right?  So a -1 return should be impossible here and this is
one case where we could just treat the return value as a boolean, right?




Hm no, or maybe. My thinking behind tri-state is to allow a safe option
for "don't know", in case the PAT index to cache mode table is not fully
populated on some future platform.
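
A minimal sketch of that tri-state idea, under stated assumptions: the table,
the sketch_* names and the use of a raw PAT index instead of an object are
made up for this illustration. The only thing taken from the patch is the
"!= 1" style of comparison, where an unknown answer is treated the same as
"does not have that mode".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_mode { SKETCH_UNKNOWN = 0, SKETCH_UC, SKETCH_WT, SKETCH_WB };

/* Made-up PAT table; index 3 is deliberately left unpopulated. */
static const enum sketch_mode sketch_pat_modes[4] = {
	SKETCH_WB, SKETCH_WT, SKETCH_UC, SKETCH_UNKNOWN,
};

/* Tri-state result: 1 = has the mode, 0 = does not, -1 = cannot tell. */
int sketch_has_cache_mode(unsigned int pat_index, enum sketch_mode mode)
{
	assert(pat_index < 4);

	if (sketch_pat_modes[pat_index] == SKETCH_UNKNOWN)
		return -1;
	return sketch_pat_modes[pat_index] == mode ? 1 : 0;
}

/*
 * Mirrors the "!= 1" comparisons in the hunk above: a flush is skipped
 * only when the mode is positively known to be UC or WT, so an unknown
 * entry falls on the conservative side and still flushes.
 */
bool sketch_gpu_write_needs_clflush(unsigned int pat_index)
{
	return sketch_has_cache_mode(pat_index, SKETCH_UC) != 1 &&
	       sketch_has_cache_mode(pat_index, SKETCH_WT) != 1;
}
```

Treating the return value as a plain boolean would turn the -1 case into
"true", which is the kind of silent misreading the tri-state is meant to
avoid.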



My understanding is that the condition here means to say that, if GPU
access is uncached, don't use CPU reloc because the CPU cache might
contain stale data. This condition is sufficient for snooping platforms.
But from MTL onward, the condition should be whether the GPU access is
coherent with CPU. So, we should be checking 1-way coherent flag instead
of UC mode, because even if the GPU access is WB, it's still non-coherent,
thus CPU cache could be out-dated.


Honestly the matrix of caching decision/logic i

Re: [PATCH v6 08/11] drm/mediatek: dp: Move PHY registration to new function

2023-07-20 Thread Alexandre Mergnat

Reviewed-by: Alexandre Mergnat 

On 17/07/2023 16:14, AngeloGioacchino Del Regno wrote:

In preparation for adding support for eDP, move the PHY registration
code to a new mtk_dp_register_phy() function for better readability.

This commit brings no functional changes.


--
Regards,
Alexandre


Re: [PATCH 2/3] dt-bindings: display: panel: Add panels based on ILITEK ILI9806E

2023-07-20 Thread Luca Ceresoli
Hello Rob,

thanks for reviewing.

On Wed, 19 Jul 2023 13:02:54 -0600
Rob Herring  wrote:

> On Wed, Jul 19, 2023 at 05:21:46PM +0200, Luca Ceresoli wrote:
> > Add bindings for LCD panels based on the ILITEK ILI9806E RGB controller
> > connected over SPI and the "ShenZhen New Display Co NDS040480800-V3"
> > 480x800 panel based on it.
> > 
> > Signed-off-by: Luca Ceresoli 
> > ---
> >  .../display/panel/ilitek,ili9806e.yaml| 69 +++
> >  MAINTAINERS   |  6 ++
> >  2 files changed, 75 insertions(+)
> > >  create mode 100644 Documentation/devicetree/bindings/display/panel/ilitek,ili9806e.yaml
> > > 
> > > diff --git a/Documentation/devicetree/bindings/display/panel/ilitek,ili9806e.yaml b/Documentation/devicetree/bindings/display/panel/ilitek,ili9806e.yaml
> > new file mode 100644
> > index ..42abc6923065
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/display/panel/ilitek,ili9806e.yaml
> > @@ -0,0 +1,69 @@
> > +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> > +%YAML 1.2
> > +---
> > +$id: http://devicetree.org/schemas/display/panel/ilitek,ili9806e.yaml#
> > +$schema: http://devicetree.org/meta-schemas/core.yaml#
> > +
> > +title: Ilitek ILI9806E display panels
> > +
> > +maintainers:
> > +  - Luca Ceresoli 
> > +
> > +description:
> > +  This binding is for display panels using an Ilitek ILI9806E controller in
> > +  SPI mode.
> > +
> > +allOf:
> > +  - $ref: panel-common.yaml#  
> 
> A SPI device should reference spi-peripheral-props.yaml as well.
> 
> > +
> > +properties:
> > +  compatible:
> > +items:
> > +  - enum:
> > +  # ShenZhen New Display Co 3.97" 480x800 RGB a-SI TFT LCD
> > +  - newdisplay,nds040480800-v3
> > +  - const: ilitek,ili9806e
> > +
> > +  reg: true  
> 
> maxItems: 1
> 
> > +  spi-max-frequency: true
> > +  reset-gpios: true
> > +  backlight: true
> > +  port: true  
> 
> Drop all these and ...
> 
> > +
> > +required:
> > +  - compatible
> > +  - reg
> > +  - port
> > +
> > +additionalProperties: false  
> 
> ... use "unevaluatedProperties" instead.
> 
> > +
> > +examples:
> > +  - |
> > +#include 
> > +
> > +backlight: backlight {
> > +compatible = "gpio-backlight";
> > +gpios = <&gpio 22 GPIO_ACTIVE_HIGH>;
> > +};  
> 
> The exact backlight is outside the scope of this binding and should be 
> dropped from the example.

As this comes from copy-pasting from the bindings yaml for another
panel, would it be useful if I send a patch to remove it?

Requested changes queued for v2.

Luca

-- 
Luca Ceresoli, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


Re: [PATCH 2/3] dt-bindings: display: panel: Add panels based on ILITEK ILI9806E

2023-07-20 Thread Luca Ceresoli
Hello Krzysztof,

thanks for reviewing.

On Wed, 19 Jul 2023 21:12:14 +0200
Krzysztof Kozlowski  wrote:

> On 19/07/2023 17:21, Luca Ceresoli wrote:
> > Add bindings for LCD panels based on the ILITEK ILI9806E RGB controller
> > connected over SPI and the "ShenZhen New Display Co NDS040480800-V3"
> > 480x800 panel based on it.  
> 
> 
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index aee340630eca..3c38699ee821 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -6515,6 +6515,12 @@ T:   git git://anongit.freedesktop.org/drm/drm-misc
> >  F: Documentation/devicetree/bindings/display/ilitek,ili9486.yaml
> >  F: drivers/gpu/drm/tiny/ili9486.c
> >  
> > +DRM DRIVER FOR ILITEK ILI9806E PANELS
> > +M: Luca Ceresoli 
> > +S: Maintained
> > +T: git git://anongit.freedesktop.org/drm/drm-misc  
> 
> Nope, same for recent one-driver-subsystem. It's like a second try...
> You do not have git tree for one driver. The git tree is for subsystem,
> not driver.

I see, no problem, I'm removing the T: line in v2.

As this came from copy-pasting another panel driver, should the T: line
be removed from every "DRM DRIVER FOR * PANELS" entry? I could send a
patch to do that, if that makes sense.

Luca

-- 
Luca Ceresoli, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


Re: [PATCH 2/3] dt-bindings: display: panel: Add panels based on ILITEK ILI9806E

2023-07-20 Thread Luca Ceresoli
Hello Conor,

thanks for reviewing.

On Wed, 19 Jul 2023 17:44:45 +0100
Conor Dooley  wrote:

> Hey Luca,
> 
> On Wed, Jul 19, 2023 at 05:21:46PM +0200, Luca Ceresoli wrote:
> > Add bindings for LCD panels based on the ILITEK ILI9806E RGB controller
> > connected over SPI and the "ShenZhen New Display Co NDS040480800-V3"
> > 480x800 panel based on it.
> > 
> > Signed-off-by: Luca Ceresoli 
> > ---
> >  .../display/panel/ilitek,ili9806e.yaml| 69 +++
> >  MAINTAINERS   |  6 ++
> >  2 files changed, 75 insertions(+)
> >  create mode 100644 Documentation/devicetree/bindings/display/panel/ilitek,ili9806e.yaml
> > 
> > diff --git a/Documentation/devicetree/bindings/display/panel/ilitek,ili9806e.yaml b/Documentation/devicetree/bindings/display/panel/ilitek,ili9806e.yaml
> > new file mode 100644
> > index ..42abc6923065
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/display/panel/ilitek,ili9806e.yaml
> > @@ -0,0 +1,69 @@
> > +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> > +%YAML 1.2
> > +---
> > +$id: http://devicetree.org/schemas/display/panel/ilitek,ili9806e.yaml#
> > +$schema: http://devicetree.org/meta-schemas/core.yaml#
> > +
> > +title: Ilitek ILI9806E display panels
> > +
> > +maintainers:
> > +  - Luca Ceresoli 
> > +
> > +description:
> > +  This binding is for display panels using an Ilitek ILI9806E controller in
> > +  SPI mode.  
> 
> I figure you explicitly mention SPI mode here because it also supports
> D{P,S}I?

Exactly, DSI is supported as well by the chip.

> > +allOf:
> > +  - $ref: panel-common.yaml#
> > +
> > +properties:
> > +  compatible:
> > +items:
> > +  - enum:
> > +  # ShenZhen New Display Co 3.97" 480x800 RGB a-SI TFT LCD
> > +  - newdisplay,nds040480800-v3
> > +  - const: ilitek,ili9806e
> > +
> > +  reg: true
> > +  spi-max-frequency: true
> > +  reset-gpios: true
> > +  backlight: true
> > +  port: true
> > +
> > +required:
> > +  - compatible
> > +  - reg
> > +  - port
> > +
> > +additionalProperties: false
> > +
> > +examples:
> > +  - |
> > +#include 
> > +
> > +backlight: backlight {
> > +compatible = "gpio-backlight";
> > +gpios = <&gpio 22 GPIO_ACTIVE_HIGH>;
> > +};
> > +spi {  
> 
> Just a nit, a blank line between properties please. Clearly no
> respinning needed for that...

I agree. This comes from copy-pasting another bindings file, so I will
probably send a separate series to fix it and avoid the same issue
in the future.

Luca

-- 
Luca Ceresoli, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


[pull] amdgpu drm-fixes-6.5

2023-07-20 Thread Alex Deucher
Hi Dave, Daniel,

Fixes for 6.5.

The following changes since commit fdf0eaf11452d72945af31804e2a1048ee1b574c:

  Linux 6.5-rc2 (2023-07-16 15:10:37 -0700)

are available in the Git repository at:

  https://gitlab.freedesktop.org/agd5f/linux.git tags/amd-drm-fixes-6.5-2023-07-20

for you to fetch changes up to b13d3e9c6b62597a5c31fdc74febb3bc588893bf:

  drm/amdgpu: use a macro to define no xcp partition case (2023-07-18 14:42:54 -0400)


amd-drm-fixes-6.5-2023-07-20:

amdgpu:
- More PCIe DPM fixes for Intel platforms
- DCN3.0.1 fixes
- Virtual display timer fix
- Async flip fix
- SMU13 clock reporting fixes
- Add missing PSP firmware declaration
- DP MST fix
- DCN3.1.x fixes
- Slab out of bounds fix


Alex Deucher (2):
  drm/amdgpu/pm: make gfxclock consistent for sienna cichlid
  drm/amdgpu/pm: make mclk consistent for smu 13.0.7

Aurabindo Pillai (2):
  drm/amd/display: export some optc function for reuse
  drm/amd/display: add DCN301 specific logic for OTG programming

Candice Li (1):
  drm/amdgpu: Allow the initramfs generator to include psp_13_0_6_ta

Daniel Miess (1):
  drm/amd/display: Prevent vtotal from being set to 0

Guchun Chen (5):
  drm/amdgpu/vkms: relax timer deactivation by hrtimer_try_to_cancel
  drm/amdgpu: Allocate root PD on correct partition
  drm/amdgpu: fix slab-out-of-bounds issue in amdgpu_vm_pt_create
  drm/amdgpu/vm: use the same xcp_id from root PD
  drm/amdgpu: use a macro to define no xcp partition case

Mario Limonciello (1):
  drm/amd: Use amdgpu_device_pcie_dynamic_switching_supported() for SMU7

Nicholas Kazlauskas (1):
  drm/amd/display: Keep PHY active for DP displays on DCN31

Simon Ser (1):
  drm/amd/display: only accept async flips for fast updates

Srinivasan Shanmugam (1):
  drm/amd/display: Clean up errors & warnings in amdgpu_dm.c

Taimur Hassan (1):
  drm/amd/display: check TG is non-null before checking if enabled

Wayne Lin (1):
  drm/amd/display: Add polling method to handle MST reply packet

Zhikai Zhai (1):
  drm/amd/display: Disable MPC split by default on special asic

 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c   |   3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c|   6 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c|   2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c   |   5 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c |   5 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h |   5 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c  |  12 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.c|   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.h|   2 +
 .../gpu/drm/amd/amdgpu/aqua_vanjaram_reg_init.c|   4 +-
 drivers/gpu/drm/amd/amdgpu/psp_v13_0.c |   1 +
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c  | 256 +
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h  |   7 +
 .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crtc.c |  12 +
 .../amd/display/amdgpu_dm/amdgpu_dm_mst_types.c| 110 +
 .../amd/display/amdgpu_dm/amdgpu_dm_mst_types.h|  11 +
 .../amd/display/dc/clk_mgr/dcn31/dcn31_clk_mgr.c   |   5 +
 .../drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c  |   3 +-
 drivers/gpu/drm/amd/display/dc/dcn30/dcn30_optc.c  |   4 +-
 drivers/gpu/drm/amd/display/dc/dcn30/dcn30_optc.h  |   3 +
 drivers/gpu/drm/amd/display/dc/dcn301/Makefile |   3 +-
 .../gpu/drm/amd/display/dc/dcn301/dcn301_optc.c| 185 +++
 .../gpu/drm/amd/display/dc/dcn301/dcn301_optc.h|  36 +++
 .../drm/amd/display/dc/dcn301/dcn301_resource.c|   4 +-
 .../drm/amd/display/dc/dcn303/dcn303_resource.c|   2 +-
 .../gpu/drm/amd/display/dc/dml/dcn314/dcn314_fpu.c |   6 +-
 .../gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c|  14 +-
 .../drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c|   8 +-
 .../gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_7_ppt.c   |   2 +-
 29 files changed, 524 insertions(+), 196 deletions(-)
 create mode 100644 drivers/gpu/drm/amd/display/dc/dcn301/dcn301_optc.c
 create mode 100644 drivers/gpu/drm/amd/display/dc/dcn301/dcn301_optc.h


Re: [PATCH] backlight: gpio_backlight: Drop output gpio direction check for initial power state

2023-07-20 Thread Bartosz Golaszewski
On Thu, Jul 20, 2023 at 3:10 PM Daniel Thompson
 wrote:
>
> On Thu, Jul 20, 2023 at 02:56:32PM +0200, Bartosz Golaszewski wrote:
> > On Thu, Jul 20, 2023 at 1:27 PM Daniel Thompson
> >  wrote:
> > >
> > > On Thu, Jul 20, 2023 at 06:06:27AM +, Ying Liu wrote:
> > > > Bootloader may leave gpio direction as input and gpio value as logical low.
> > > > It hints that initial backlight power state should be FB_BLANK_POWERDOWN
> > > > since the gpio value is literally logical low.
> > >
> > > To be honest this probably "hints" that the bootloader simply didn't
> > > consider the backlight at all :-) . I'd rather the patch description
> > > focus on what circumstances lead to the current code making a bad
> > > decision. More like:
> > >
> > >   If the GPIO pin is in the input state but the backlight is currently
> > >   off due to default pull downs then ...
> > >
> > > > So, let's drop output gpio
> > > > direction check and only check gpio value to set the initial power state.
> > >
> > > This check was specifically added by Bartosz so I'd be interested in his
> > > opinion of this change (especially since he is now a GPIO maintainer)!
> > >
> > > What motivates (or motivated) the need to check the direction rather
> > > than just read that current logic level on the pin?
> > >
> > >
> > > Daniel.
> > > [I'm done but since Bartosz and Linus were not on copy of the original
> > > thread I've left the rest of the patch below as a convenience ;-) ]
> > >
> >
> > This was done in commit: 706dc68102bc ("backlight: gpio: Explicitly
> > set the direction of the GPIO").
> >
> > Let me quote myself from it:
> > --
> > The GPIO backlight driver currently requests the line 'as is', without
> > actively setting its direction. This can lead to problems: if the line
> > is in input mode by default, we won't be able to drive it later when
> > updating the status and also reading its initial value doesn't make
> > sense for backlight setting.
> > --
>
> You are perhaps quoting the wrong bit here ;-). The currently proposed
> patch leaves the code to put the pin into output mode unmodified. However
> there was an extra line at the bottom of your commit message:
> --
> Also: check the current direction and only read the value if it's output.
> --

Yeah I'm no longer sure why I did this. The commit doesn't look harmful though.

Bart

>
> This was the bit I wanted to check on, since the proposed patch
> literally reverses this!
>
> However...
>
>
> > I agree with Thomas that it's highly unlikely the bootloader "hints"
> > at any specific backlight settings. That being said, the change itself
> > looks correct to me. The other branch of that if will always unblank
> > the backlight if the GPIO is in input mode which may not be desirable.
>
> ... if you're happy the proposed change is OK then I'm happy too!
> I came to the same conclusion after reviewing the GPIO code this morning,
> however I copied you in because I was worried I might have overlooked
> something.
>
>
> > I don't see any obvious problem with this change, just make sure the
> > commit message makes more sense.
>
> Agreed.
>
>
> Daniel.


Re: [PATCH] backlight: gpio_backlight: Drop output gpio direction check for initial power state

2023-07-20 Thread Daniel Thompson
On Thu, Jul 20, 2023 at 02:56:32PM +0200, Bartosz Golaszewski wrote:
> On Thu, Jul 20, 2023 at 1:27 PM Daniel Thompson
>  wrote:
> >
> > On Thu, Jul 20, 2023 at 06:06:27AM +, Ying Liu wrote:
> > > > Bootloader may leave gpio direction as input and gpio value as logical low.
> > > It hints that initial backlight power state should be FB_BLANK_POWERDOWN
> > > since the gpio value is literally logical low.
> >
> > To be honest this probably "hints" that the bootloader simply didn't
> > consider the backlight at all :-) . I'd rather the patch description
> > focus on what circumstances lead to the current code making a bad
> > decision. More like:
> >
> >   If the GPIO pin is in the input state but the backlight is currently
> >   off due to default pull downs then ...
> >
> > > So, let's drop output gpio
> > > direction check and only check gpio value to set the initial power state.
> >
> > This check was specifically added by Bartosz so I'd be interested in his
> > opinion of this change (especially since he is now a GPIO maintainer)!
> >
> > What motivates (or motivated) the need to check the direction rather
> > than just read that current logic level on the pin?
> >
> >
> > Daniel.
> > [I'm done but since Bartosz and Linus were not on copy of the original
> > thread I've left the rest of the patch below as a convenience ;-) ]
> >
>
> This was done in commit: 706dc68102bc ("backlight: gpio: Explicitly
> set the direction of the GPIO").
>
> Let me quote myself from it:
> --
> The GPIO backlight driver currently requests the line 'as is', without
> actively setting its direction. This can lead to problems: if the line
> is in input mode by default, we won't be able to drive it later when
> updating the status and also reading its initial value doesn't make
> sense for backlight setting.
> --

You are perhaps quoting the wrong bit here ;-). The currently proposed
patch leaves the code to put the pin into output mode unmodified. However
there was an extra line at the bottom of your commit message:
--
Also: check the current direction and only read the value if it's output.
--

This was the bit I wanted to check on, since the proposed patch
literally reverses this!

However...


> I agree with Thomas that it's highly unlikely the bootloader "hints"
> at any specific backlight settings. That being said, the change itself
> looks correct to me. The other branch of that if will always unblank
> the backlight if the GPIO is in input mode which may not be desirable.

... if you're happy the proposed change is OK then I'm happy too!
I came to the same conclusion after reviewing the GPIO code this morning,
however I copied you in because I was worried I might have overlooked
something.


> I don't see any obvious problem with this change, just make sure the
> commit message makes more sense.

Agreed.


Daniel.


Re: [PATCH v6 07/11] drm/mediatek: dp: Avoid mutex locks if audio is not supported/enabled

2023-07-20 Thread Alexandre Mergnat




On 17/07/2023 16:14, AngeloGioacchino Del Regno wrote:

If a controller (usually, eDP!) does not support audio, or audio is not
enabled because the endpoint has no audio support, it's useless to lock
a mutex only to unlock it right after because there's no .plugged_cb().

Check if the audio is supported and enabled before locking the mutex in
mtk_dp_update_plugged_status(): if not, we simply return immediately.

While at it, since the update_plugged_status_lock mutex would not be
used if the controller doesn't support audio at all, initialize it
only if `audio_supported` is true.


Reviewed-by: Alexandre Mergnat 

--
Regards,
Alexandre
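
For readers skimming the archive, the early-return idea in this patch can be modelled in plain userspace C. This is only a sketch under assumptions, not the driver code: `struct dp_model`, its `audio_enable` field and the counters are hypothetical stand-ins, and the lock is represented by a counter rather than a real mutex.

```c
#include <stdbool.h>

/* Hypothetical model of the controller state; not the real struct mtk_dp. */
struct dp_model {
	bool audio_supported;   /* controller can do audio at all */
	bool audio_enable;      /* assumed name: endpoint has audio enabled */
	int lock_ops;           /* counts lock + unlock operations */
	int plugged_cb_calls;   /* counts calls to the plugged callback */
};

/* Return before touching the lock when there is nothing to notify. */
void model_update_plugged_status(struct dp_model *dp)
{
	if (!dp->audio_supported || !dp->audio_enable)
		return;

	dp->lock_ops++;         /* stands in for mutex_lock() */
	dp->plugged_cb_calls++; /* stands in for the .plugged_cb() call */
	dp->lock_ops++;         /* stands in for mutex_unlock() */
}
```

With audio unsupported or disabled, the model never touches the lock, which is the saving the commit message describes.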


Re: [PATCH v6 06/11] drm/mediatek: dp: Enable event interrupt only when bridge attached

2023-07-20 Thread Alexandre Mergnat




On 17/07/2023 16:14, AngeloGioacchino Del Regno wrote:

It is useless and error-prone to enable the DisplayPort event interrupt
before finishing to probe and install the driver, as the DP training
cannot happen before the entire pipeline is correctly set up, as the
interrupt handler also requires the full hardware to be initialized by
mtk_dp_bridge_attach().

Anyway, depending in which state the controller is left from the
bootloader, this may cause an interrupt storm and consequently hang
the kernel during boot, so, avoid enabling the interrupt until we
reach a clean state by adding the IRQ_NOAUTOEN flag before requesting
it at probe time and manage the enablement of the ISR in the .attach()
and .detach() handlers for the DP bridge.


Reviewed-by: Alexandre Mergnat 

--
Regards,
Alexandre


Re: [PATCH] backlight: gpio_backlight: Drop output gpio direction check for initial power state

2023-07-20 Thread Bartosz Golaszewski
On Thu, Jul 20, 2023 at 1:27 PM Daniel Thompson
 wrote:
>
> On Thu, Jul 20, 2023 at 06:06:27AM +, Ying Liu wrote:
> > Bootloader may leave gpio direction as input and gpio value as logical low.
> > It hints that initial backlight power state should be FB_BLANK_POWERDOWN
> > since the gpio value is literally logical low.
>
> To be honest this probably "hints" that the bootloader simply didn't
> consider the backlight at all :-) . I'd rather the patch description
> focus on what circumstances lead to the current code making a bad
> decision. More like:
>
>   If the GPIO pin is in the input state but the backlight is currently
>   off due to default pull downs then ...
>
> > So, let's drop output gpio
> > direction check and only check gpio value to set the initial power state.
>
> This check was specifically added by Bartosz so I'd be interested in his
> opinion of this change (especially since he is now a GPIO maintainer)!
>
> What motivates (or motivated) the need to check the direction rather
> than just read that current logic level on the pin?
>
>
> Daniel.
> [I'm done but since Bartosz and Linus were not on copy of the original
> thread I've left the rest of the patch below as a convenience ;-) ]
>

This was done in commit: 706dc68102bc ("backlight: gpio: Explicitly
set the direction of the GPIO").

Let me quote myself from it:

--
The GPIO backlight driver currently requests the line 'as is', without
actively setting its direction. This can lead to problems: if the line
is in input mode by default, we won't be able to drive it later when
updating the status and also reading its initial value doesn't make
sense for backlight setting.
--

I agree with Thomas that it's highly unlikely the bootloader "hints"
at any specific backlight settings. That being said, the change itself
looks correct to me. The other branch of that if will always unblank
the backlight if the GPIO is in input mode which may not be desirable.
I don't see any obvious problem with this change, just make sure the
commit message makes more sense.

Bartosz
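
The difference between the current and the proposed initial power state decision, as described in this thread, can be sketched as a small userspace model. Everything here is an assumption for illustration: `struct pin_state`, `enum power_state` and the two helper functions are hypothetical and only mirror the behaviour discussed above, not the actual driver code.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the FB_BLANK_* power states. */
enum power_state { POWER_UNBLANK, POWER_POWERDOWN };

/* Hypothetical snapshot of how the bootloader left the pin. */
struct pin_state {
	bool is_output; /* direction of the line at probe time */
	int value;      /* logic level read from the line */
};

/* Current behaviour: only trust a low value if the line is an output. */
enum power_state initial_power_old(struct pin_state pin)
{
	if (pin.is_output && pin.value == 0)
		return POWER_POWERDOWN;
	return POWER_UNBLANK; /* input mode always ends up unblanked */
}

/* Proposed behaviour: drop the direction check, look at the value only. */
enum power_state initial_power_new(struct pin_state pin)
{
	return pin.value == 0 ? POWER_POWERDOWN : POWER_UNBLANK;
}
```

The two versions only disagree for a line left as input and reading low, which is exactly the case the patch description is about.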


[PATCH v2 3/5] drm/i915: Fix HPD polling, reenabling the output poll work as needed

2023-07-20 Thread Imre Deak
After the commit in the Fixes: line below, HPD polling stopped working
on i915, since after that change calling drm_kms_helper_poll_enable()
doesn't restart drm_mode_config::output_poll_work if the work was
stopped (no connectors needing polling) and enabling polling for a
connector (during runtime suspend or detecting an HPD IRQ storm).

After the above change calling drm_kms_helper_poll_enable() is a nop
after it's been called already and polling for some connectors was
disabled/re-enabled.

Fix this by calling drm_kms_helper_poll_reschedule() added in the
previous patch instead, which reschedules the work whenever expected.

Fixes: d33a54e3991d ("drm/probe_helper: sort out poll_running vs poll_enabled")
Cc: Dmitry Baryshkov 
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Imre Deak 
---
 drivers/gpu/drm/i915/display/intel_hotplug.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/display/intel_hotplug.c b/drivers/gpu/drm/i915/display/intel_hotplug.c
index dd7eb9fc78610..d9f0ab1d953b9 100644
--- a/drivers/gpu/drm/i915/display/intel_hotplug.c
+++ b/drivers/gpu/drm/i915/display/intel_hotplug.c
@@ -212,7 +212,7 @@ intel_hpd_irq_storm_switch_to_polling(struct drm_i915_private *dev_priv)
 
 	/* Enable polling and queue hotplug re-enabling. */
 	if (hpd_disabled) {
-		drm_kms_helper_poll_enable(&dev_priv->drm);
+		drm_kms_helper_poll_reschedule(&dev_priv->drm);
 		mod_delayed_work(dev_priv->unordered_wq,
 				 &dev_priv->display.hotplug.reenable_work,
 				 msecs_to_jiffies(HPD_STORM_REENABLE_DELAY));
@@ -676,7 +676,7 @@ static void i915_hpd_poll_init_work(struct work_struct *work)
 	drm_connector_list_iter_end(&conn_iter);
 
 	if (enabled)
-		drm_kms_helper_poll_enable(&dev_priv->drm);
+		drm_kms_helper_poll_reschedule(&dev_priv->drm);
 
 	mutex_unlock(&dev_priv->drm.mode_config.mutex);
 
-- 
2.37.2



[PATCH v2 2/5] drm: Add an HPD poll helper to reschedule the poll work

2023-07-20 Thread Imre Deak
Add a helper to reschedule drm_mode_config::output_poll_work after
polling has been enabled for a connector (and needing a reschedule,
since previously polling was disabled for all connectors and hence
output_poll_work was not running).

This is needed by the next patch fixing HPD polling on i915.

Cc: Dmitry Baryshkov 
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Imre Deak 
---
 drivers/gpu/drm/drm_probe_helper.c | 68 --
 include/drm/drm_probe_helper.h |  1 +
 2 files changed, 47 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/drm_probe_helper.c b/drivers/gpu/drm/drm_probe_helper.c
index 2fb9bf901a2cc..3f479483d7d80 100644
--- a/drivers/gpu/drm/drm_probe_helper.c
+++ b/drivers/gpu/drm/drm_probe_helper.c
@@ -262,6 +262,26 @@ static bool drm_kms_helper_enable_hpd(struct drm_device *dev)
 }
 
 #define DRM_OUTPUT_POLL_PERIOD (10*HZ)
+static void reschedule_output_poll_work(struct drm_device *dev)
+{
+	unsigned long delay = DRM_OUTPUT_POLL_PERIOD;
+
+	if (dev->mode_config.delayed_event)
+		/*
+		 * FIXME:
+		 *
+		 * Use short (1s) delay to handle the initial delayed event.
+		 * This delay should not be needed, but Optimus/nouveau will
+		 * fail in a mysterious way if the delayed event is handled as
+		 * soon as possible like it is done in
+		 * drm_helper_probe_single_connector_modes() in case the poll
+		 * was enabled before.
+		 */
+		delay = HZ;
+
+	schedule_delayed_work(&dev->mode_config.output_poll_work, delay);
+}
+
 /**
  * drm_kms_helper_poll_enable - re-enable output polling.
  * @dev: drm_device
@@ -279,37 +299,41 @@ static bool drm_kms_helper_enable_hpd(struct drm_device *dev)
  */
 void drm_kms_helper_poll_enable(struct drm_device *dev)
 {
-	bool poll = false;
-	unsigned long delay = DRM_OUTPUT_POLL_PERIOD;
-
 	if (!dev->mode_config.poll_enabled || !drm_kms_helper_poll ||
 	    dev->mode_config.poll_running)
 		return;
 
-	poll = drm_kms_helper_enable_hpd(dev);
-
-	if (dev->mode_config.delayed_event) {
-		/*
-		 * FIXME:
-		 *
-		 * Use short (1s) delay to handle the initial delayed event.
-		 * This delay should not be needed, but Optimus/nouveau will
-		 * fail in a mysterious way if the delayed event is handled as
-		 * soon as possible like it is done in
-		 * drm_helper_probe_single_connector_modes() in case the poll
-		 * was enabled before.
-		 */
-		poll = true;
-		delay = HZ;
-	}
-
-	if (poll)
-		schedule_delayed_work(&dev->mode_config.output_poll_work, delay);
+	if (drm_kms_helper_enable_hpd(dev) ||
+	    dev->mode_config.delayed_event)
+		reschedule_output_poll_work(dev);
 
 	dev->mode_config.poll_running = true;
 }
 EXPORT_SYMBOL(drm_kms_helper_poll_enable);
 
+/**
+ * drm_kms_helper_poll_reschedule - reschedule the output polling work
+ * @dev: drm_device
+ *
+ * This function reschedules the output polling work, after polling for a
+ * connector has been enabled.
+ *
+ * Drivers must call this helper after enabling polling for a connector by
+ * setting %DRM_CONNECTOR_POLL_CONNECT / %DRM_CONNECTOR_POLL_DISCONNECT flags
+ * in drm_connector::polled. Note that after disabling polling by clearing these
+ * flags for a connector will stop the output polling work automatically if
+ * the polling is disabled for all other connectors as well.
+ *
+ * The function can be called only after polling has been enabled by calling
+ * drm_kms_helper_poll_init() / drm_kms_helper_poll_enable().
+ */
+void drm_kms_helper_poll_reschedule(struct drm_device *dev)
+{
+	if (dev->mode_config.poll_running)
+		reschedule_output_poll_work(dev);
+}
+EXPORT_SYMBOL(drm_kms_helper_poll_reschedule);
+
 static enum drm_connector_status
 drm_helper_probe_detect_ctx(struct drm_connector *connector, bool force)
 {
diff --git a/include/drm/drm_probe_helper.h b/include/drm/drm_probe_helper.h
index 4977e0ab72dbb..fad3c4003b2b5 100644
--- a/include/drm/drm_probe_helper.h
+++ b/include/drm/drm_probe_helper.h
@@ -25,6 +25,7 @@ void drm_kms_helper_connector_hotplug_event(struct drm_connector *connector);
 
 void drm_kms_helper_poll_disable(struct drm_device *dev);
 void drm_kms_helper_poll_enable(struct drm_device *dev);
+void drm_kms_helper_poll_reschedule(struct drm_device *dev);
 bool drm_kms_helper_is_poll_worker(void);
 
 enum drm_mode_status drm_crtc_helper_mode_valid_fixed(struct drm_crtc *crtc,
-- 
2.37.2
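
Why a separate reschedule helper is needed can be seen in a simplified userspace model of the two entry points. This is a sketch under assumptions, not the DRM code: `struct poll_model` and both functions are hypothetical, the `drm_kms_helper_poll` module parameter and the delayed-event handling are left out, and scheduling the work is reduced to setting a flag.

```c
#include <stdbool.h>

/* Hypothetical, reduced view of the polling state in drm_mode_config. */
struct poll_model {
	bool poll_enabled;         /* polling was initialised */
	bool poll_running;         /* enable was already called once */
	bool any_connector_polled; /* at least one connector needs polling */
	bool work_scheduled;       /* stands in for output_poll_work */
};

/* Models the enable path: a nop once poll_running is set. */
void model_poll_enable(struct poll_model *m)
{
	if (!m->poll_enabled || m->poll_running)
		return;

	if (m->any_connector_polled)
		m->work_scheduled = true;

	m->poll_running = true;
}

/* Models the new helper: reschedules whenever polling is running. */
void model_poll_reschedule(struct poll_model *m)
{
	if (m->poll_running)
		m->work_scheduled = true;
}
```

In the model, enabling polling for a connector after the first enable call leaves the work unscheduled until the reschedule helper is used, which matches the i915 regression described in the next patch.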



Re: [PATCH v3,3/3] drm/mediatek: dp: Add the audio divider to mtk_dp_data struct

2023-07-20 Thread AngeloGioacchino Del Regno

Il 20/07/23 14:29, Alexandre Mergnat ha scritto:



On 20/07/2023 14:08, AngeloGioacchino Del Regno wrote:

Il 20/07/23 14:07, Alexandre Mergnat ha scritto:



On 20/07/2023 13:54, AngeloGioacchino Del Regno wrote:

Il 20/07/23 12:14, Alexandre Mergnat ha scritto:



On 20/07/2023 10:26, Shuijing Li wrote:

Due to the difference of HW, different dividers need to be set.

Signed-off-by: Shuijing Li 
Signed-off-by: Jitao Shi 
---
Changes in v3:
Separate these two things into two different patches.
per suggestion from the previous thread:
https://lore.kernel.org/lkml/e2ad22bcba31797f38a12a488d4246a01bf0cb2e.ca...@mediatek.com/
Changes in v2:
- change the variables' name to be more descriptive
- add a comment that describes the function of mtk_dp_audio_sample_arrange
- reduce indentation by doing the inverse check
- add a definition of some bits
- add support for mediatek, mt8188-edp-tx
per suggestion from the previous thread:
https://lore.kernel.org/lkml/ac0fcec9-a2fe-06cc-c727-189ef7bab...@collabora.com/
---
  drivers/gpu/drm/mediatek/mtk_dp.c | 7 ++-
  drivers/gpu/drm/mediatek/mtk_dp_reg.h | 1 +
  2 files changed, 7 insertions(+), 1 deletion(-)


...

b/drivers/gpu/drm/mediatek/mtk_dp_reg.h
index f38d6ff12afe..6d7f0405867e 100644
--- a/drivers/gpu/drm/mediatek/mtk_dp_reg.h
+++ b/drivers/gpu/drm/mediatek/mtk_dp_reg.h
@@ -162,6 +162,7 @@
  #define AUDIO_M_CODE_MULT_DIV_SEL_DP_ENC0_P0_MUL_2    (1 << 8)
  #define AUDIO_M_CODE_MULT_DIV_SEL_DP_ENC0_P0_MUL_4    (2 << 8)
  #define AUDIO_M_CODE_MULT_DIV_SEL_DP_ENC0_P0_MUL_8    (3 << 8)
+#define MT8188_AUDIO_M_CODE_MULT_DIV_SEL_DP_ENC0_P0_DIV_2    (4 << 8)


IMO, it's a bit weird to have SoC specific define in the generic header.
Are you sure this bit is only available for MT8188 ?



Eh, the P0_DIV2 bit is 5<<8 for MT8195, while for 8188 it's 4<<8, clearly :-)



Ok then, to avoid this kind of issue for other SoCs in the future, does it make
sense to you to add a SoC-specific header file beside the generic one?




For just one definition? That's a bit overkill :-)



You're right, but we must start somewhere ^^, and show the proper way for future 
patches. Actually, I gave my Reviewed-by because it's only one definition. This 
will be fixed later (I hope).




I'm confident that *if* and when "a bunch" of SoC-specific definitions
appear, those will be split into different headers. :-)


  #define AUDIO_M_CODE_MULT_DIV_SEL_DP_ENC0_P0_DIV_2    (5 << 8)
  #define AUDIO_M_CODE_MULT_DIV_SEL_DP_ENC0_P0_DIV_4    (6 << 8)
  #define AUDIO_M_CODE_MULT_DIV_SEL_DP_ENC0_P0_DIV_8    (7 << 8)


Reviewed-by: Alexandre Mergnat 
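
The approach discussed here, keeping the SoC-specific divider value in per-SoC data rather than in code paths, can be sketched as follows. The two DIV_2 values are the ones quoted in this thread; `struct soc_dp_data`, its field names and `audio_div2_sel()` are hypothetical and are not the driver's actual `mtk_dp_data` layout.

```c
#include <stdint.h>

/* Register field values as quoted in the thread. */
#define MT8188_AUDIO_M_CODE_MULT_DIV_SEL_DP_ENC0_P0_DIV_2	(4 << 8)
#define AUDIO_M_CODE_MULT_DIV_SEL_DP_ENC0_P0_DIV_2		(5 << 8)

/* Hypothetical per-SoC data; field names are assumptions. */
struct soc_dp_data {
	const char *name;
	uint32_t audio_m_div2_bit; /* divide-by-2 selector for this SoC */
};

const struct soc_dp_data mt8188_data = {
	.name = "mt8188",
	.audio_m_div2_bit = MT8188_AUDIO_M_CODE_MULT_DIV_SEL_DP_ENC0_P0_DIV_2,
};

const struct soc_dp_data mt8195_data = {
	.name = "mt8195",
	.audio_m_div2_bit = AUDIO_M_CODE_MULT_DIV_SEL_DP_ENC0_P0_DIV_2,
};

/* Common code reads the value from the matched SoC data. */
uint32_t audio_div2_sel(const struct soc_dp_data *data)
{
	return data->audio_m_div2_bit;
}
```

Common code then needs no SoC checks; it only reads the field from whichever data structure matched at probe time.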


[PATCH v3 2/2] drm/tiny: add display driver for philips pcd8544 display controller

2023-07-20 Thread Viktar Simanenka
Add support for monochrome LCD SPI displays (such as Nokia 5110/3310 LCD)
based on PCD8544 LCD display controller.

Signed-off-by: Viktar Simanenka 
---
v3:nothing changed from v2
v2:checked and fixed with sparse and smatch
   changed param prefixes
v2 link: 
https://lore.kernel.org/linux-devicetree/20230719154450.620410-2-viteo...@gmail.com/

 drivers/gpu/drm/tiny/Kconfig   |  11 +
 drivers/gpu/drm/tiny/Makefile  |   1 +
 drivers/gpu/drm/tiny/pcd8544.c | 506 +
 3 files changed, 518 insertions(+)
 create mode 100644 drivers/gpu/drm/tiny/pcd8544.c

diff --git a/drivers/gpu/drm/tiny/Kconfig b/drivers/gpu/drm/tiny/Kconfig
index f6889f649bc1..10caa0818253 100644
--- a/drivers/gpu/drm/tiny/Kconfig
+++ b/drivers/gpu/drm/tiny/Kconfig
@@ -172,6 +172,17 @@ config TINYDRM_MI0283QT
  DRM driver for the Multi-Inno MI0283QT display panel
  If M is selected the module will be called mi0283qt.
 
+config TINYDRM_PCD8544
+   tristate "DRM support for PCD8544 displays"
+   depends on DRM && SPI
+   select DRM_KMS_HELPER
+   select DRM_GEM_DMA_HELPER
+   select BACKLIGHT_CLASS_DEVICE
+   help
+ DRM driver for PCD8544 (Nokia 5110/3310) 84x48 LCD displays.
+
+ If M is selected the module will be called pcd8544.
+
 config TINYDRM_REPAPER
tristate "DRM support for Pervasive Displays RePaper panels (V231)"
depends on DRM && SPI
diff --git a/drivers/gpu/drm/tiny/Makefile b/drivers/gpu/drm/tiny/Makefile
index 76dde89a044b..75bc112a02f9 100644
--- a/drivers/gpu/drm/tiny/Makefile
+++ b/drivers/gpu/drm/tiny/Makefile
@@ -13,6 +13,7 @@ obj-$(CONFIG_TINYDRM_ILI9225) += ili9225.o
 obj-$(CONFIG_TINYDRM_ILI9341)  += ili9341.o
 obj-$(CONFIG_TINYDRM_ILI9486)  += ili9486.o
 obj-$(CONFIG_TINYDRM_MI0283QT) += mi0283qt.o
+obj-$(CONFIG_TINYDRM_PCD8544)  += pcd8544.o
 obj-$(CONFIG_TINYDRM_REPAPER)  += repaper.o
 obj-$(CONFIG_TINYDRM_ST7586)   += st7586.o
 obj-$(CONFIG_TINYDRM_ST7735R)  += st7735r.o
diff --git a/drivers/gpu/drm/tiny/pcd8544.c b/drivers/gpu/drm/tiny/pcd8544.c
new file mode 100644
index ..73958b302a36
--- /dev/null
+++ b/drivers/gpu/drm/tiny/pcd8544.c
@@ -0,0 +1,506 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * DRM driver for Philips PCD8544 LCD controller/driver.
+ * Compatible with Nokia 5110/3310 84x48 LCD displays.
+ *
+ * Copyright 2023 Viktar Simanenka 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/*
+ * The display is monochrome, every bit in buffer is a pixel.
+ * Display RAM divided into 6 banks along y-axis, each bank 84 bytes along x-axis.
+ * Driver uses horizontal addressing.
+ */
+
+#define PCD8544_FUNCTIONSET		0x20
+#define PCD8544_DISPLAYCONTROL		0x08
+#define PCD8544_SETYADDR		0x40
+#define PCD8544_SETXADDR		0x80
+#define PCD8544_SETBIAS			0x10
+#define PCD8544_SETTEMPCOEF		0x04
+#define PCD8544_SETVOP			0x80
+
+#define PCD8544_EXTENDED_INSTRUCTION	0x01
+#define PCD8544_VERTICAL_ADDRESSING	0x02
+#define PCD8544_DISPLAYNORMAL		0x04
+#define PCD8544_DISPLAYINVERTED		0x05
+
+struct pcd8544_device {
+   struct drm_device drm;
+   struct drm_simple_display_pipe pipe;
+   struct drm_connector connector;
+   struct drm_display_mode mode;
+   struct spi_device *spi;
+
+   u32 width;
+   u32 height;
+   u8 *tx_buf; // Buffer used for transfer
+   size_t tx_buflen;
+
+   struct backlight_device *backlight;
+   struct gpio_desc *reset;
+   struct gpio_desc *dc;
+
+   u32 inverted;
+   u32 temperature_coeff;
+   u32 bias;
+   u32 voltage_op;
+};
+
+MODULE_PARM_DESC(inverted, "Invert display colors: 1 - enable, 0 - disable");
+MODULE_PARM_DESC(voltage_op, "Vop[6:0] LCD voltage operation coefficient: 0-127 (default: 0)");
+MODULE_PARM_DESC(temperature_coeff, "TC[1:0] Temperature compensation coefficient: 0-3 (default: 0)");
+MODULE_PARM_DESC(bias, "BS[2:0] Bias system coefficient: 0-7 (default: 4)");
+
+#define drm_to_dev(__dev) container_of(__dev, struct pcd8544_device, drm)
+
+static int pcd8544_spi_transfer(struct spi_device *spi, const void *buf, size_t len)
+{
+	size_t max_chunk = spi_max_transfer_size(spi);
+	struct spi_transfer tr = {
+		.bits_per_word = 8,
+		.speed_hz = 0,
+	};
+	struct spi_message m;
+	size_t chunk;
+	int ret;
+
+	max_chunk = ALIGN_DOWN(max_chunk, 2);
+
+	spi_message_init_with_transfers(&m, &tr, 1);
+
+	while (len) {
+		chunk = min(len, max_chunk);
+
+		tr.tx_buf = buf;
+		tr.len = chunk;
+		buf += chunk;
+		len -= chunk;
+
+		ret = spi_sync(spi, &m);
+   

[PATCH v3 1/2] dt-bindings: display: add bindings for pcd8544 displays

2023-07-20 Thread Viktar Simanenka
Add device tree binding documentation for PCD8544 LCD display controller.

Signed-off-by: Viktar Simanenka 
---
v3:add a little more description to the exposed vendor properties
   add commit message finally
v2 link: 
https://lore.kernel.org/linux-devicetree/20230719154450.620410-1-viteo...@gmail.com/

 .../bindings/display/nxp,pcd8544.yaml | 95 +++
 1 file changed, 95 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/display/nxp,pcd8544.yaml

diff --git a/Documentation/devicetree/bindings/display/nxp,pcd8544.yaml b/Documentation/devicetree/bindings/display/nxp,pcd8544.yaml
new file mode 100644
index ..bacdeff9776e
--- /dev/null
+++ b/Documentation/devicetree/bindings/display/nxp,pcd8544.yaml
@@ -0,0 +1,95 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/display/nxp,pcd8544.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Philips Semiconductors PCD8544 LCD Display Controller
+
+maintainers:
+  - Viktar Simanenka 
+
+description: |
+  Philips Semiconductors PCD8544 LCD Display Controller with SPI control bus.
+  Designed to drive a graphic display of 48 rows and 84 columns,
+  such as Nokia 5110/3310 LCDs.
+
+allOf:
+  - $ref: panel/panel-common.yaml#
+  - $ref: /schemas/spi/spi-peripheral-props.yaml#
+
+properties:
+  compatible:
+    enum:
+      - nxp,pcd8544
+
+  dc-gpios:
+    maxItems: 1
+    description: Data/Command selection pin (D/CX)
+
+  reset-gpios:
+    maxItems: 1
+    description: Display Reset pin (RST)
+
+  nxp,inverted:
+    type: boolean
+    description: Display color inversion
+
+  nxp,voltage-op:
+    $ref: /schemas/types.yaml#/definitions/uint32
+    minimum: 0
+    maximum: 127
+    description: |
+      Liquid crystal voltage operation coefficient. Determines the LCD
+      controlling voltage on display segments. Should be adjusted according
+      to the ambient temperature. Adjusts the contrast of the display.
+
+  nxp,temperature-coeff:
+    $ref: /schemas/types.yaml#/definitions/uint32
+    minimum: 0
+    maximum: 3
+    description: |
+      Display temperature compensation coefficient. Increases LCD controlling
+      voltage at lower temperatures to maintain optimum contrast.
+
+  nxp,bias:
+    $ref: /schemas/types.yaml#/definitions/uint32
+    minimum: 0
+    maximum: 7
+    description: |
+      Display bias system coefficient. Should only be changed if an external
+      oscillator is used for the display.
+
+required:
+  - compatible
+  - reg
+  - dc-gpios
+  - reset-gpios
+
+unevaluatedProperties: false
+
+examples:
+  - |
+    #include 
+
+    spi {
+        #address-cells = <1>;
+        #size-cells = <0>;
+
+        display@0 {
+            compatible = "nxp,pcd8544";
+            reg = <0>;
+            spi-max-frequency = <800>;
+
+            dc-gpios = <&pio 0 3 GPIO_ACTIVE_HIGH>; /* DC=PA3 */
+            reset-gpios = <&pio 0 1 GPIO_ACTIVE_HIGH>; /* RESET=PA1 */
+            backlight = <&backlight>;
+
+            nxp,inverted;
+            nxp,voltage-op = <0>;
+            nxp,bias = <4>;
+            nxp,temperature-coeff = <0>;
+        };
+    };
+
+...
-- 
2.34.1



[PATCH next] drm/loongson: Fix error handling in lsdc_pixel_pll_setup()

2023-07-20 Thread Harshit Mogalapalli
There are two problems in lsdc_pixel_pll_setup():
1. If kzalloc() fails, iounmap() is not called to release the mapping.
2. Neither kzalloc() nor ioremap() returns error pointers on failure, so
   the IS_ERR_OR_NULL() checks are confusing and not quite right;
   fix this by changing them to NULL checks instead.

Fixes: f39db26c5428 ("drm: Add kms driver for loongson display controller")
Signed-off-by: Harshit Mogalapalli 
---
This was found by static analysis with smatch and is only compile tested.
---
 drivers/gpu/drm/loongson/lsdc_pixpll.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/loongson/lsdc_pixpll.c b/drivers/gpu/drm/loongson/lsdc_pixpll.c
index 04c15b4697e2..2609a2256da4 100644
--- a/drivers/gpu/drm/loongson/lsdc_pixpll.c
+++ b/drivers/gpu/drm/loongson/lsdc_pixpll.c
@@ -120,12 +120,14 @@ static int lsdc_pixel_pll_setup(struct lsdc_pixpll * const this)
 	struct lsdc_pixpll_parms *pparms;
 
 	this->mmio = ioremap(this->reg_base, this->reg_size);
-	if (IS_ERR_OR_NULL(this->mmio))
+	if (!this->mmio)
 		return -ENOMEM;
 
 	pparms = kzalloc(sizeof(*pparms), GFP_KERNEL);
-	if (IS_ERR_OR_NULL(pparms))
+	if (!pparms) {
+		iounmap(this->mmio);
 		return -ENOMEM;
+	}
 
 	pparms->ref_clock = LSDC_PLL_REF_CLK_KHZ;
 
-- 
2.39.3
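
The unwind-on-failure pattern this patch introduces can be tried out in userspace with stand-ins for the kernel calls. This is a sketch under assumptions, not the loongson code: `model_ioremap()`, `model_iounmap()`, `model_kzalloc()`, `struct pll_model` and the `live_mappings` counter are all hypothetical, and failures are injected through parameters.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the PLL object. */
struct pll_model {
	void *mmio;
	void *parms;
};

int live_mappings; /* number of mappings not yet released */

/* Stand-ins for ioremap()/iounmap()/kzalloc(); they return NULL on failure. */
void *model_ioremap(int fail)
{
	void *p = fail ? NULL : malloc(16);

	if (p)
		live_mappings++;
	return p;
}

void model_iounmap(void *p)
{
	free(p);
	live_mappings--;
}

void *model_kzalloc(size_t n, int fail)
{
	return fail ? NULL : calloc(1, n);
}

/* Plain NULL checks, and the mapping is released if the allocation fails. */
int model_pll_setup(struct pll_model *pll, int fail_map, int fail_alloc)
{
	pll->mmio = model_ioremap(fail_map);
	if (!pll->mmio)
		return -ENOMEM;

	pll->parms = model_kzalloc(32, fail_alloc);
	if (!pll->parms) {
		model_iounmap(pll->mmio);
		pll->mmio = NULL;
		return -ENOMEM;
	}

	return 0;
}
```

Without the `model_iounmap()` call in the second failure branch, `live_mappings` would stay at one after a failed setup, which is the leak the patch addresses.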


