Re: [PATCH] drm/i2c: Switch i2c drivers back to use .probe()

2023-06-15 Thread Uwe Kleine-König
Hello,

On Sun, Jun 11, 2023 at 10:27:40PM +0200, Uwe Kleine-König wrote:
> After commit b8a1a4cd5a98 ("i2c: Provide a temporary .probe_new()
> call-back type"), all drivers being converted to .probe_new() and then
> commit 03c835f498b5 ("i2c: Switch .probe() to not take an id parameter")
> convert back to (the new) .probe() to be able to eventually drop
> .probe_new() from struct i2c_driver.

It would be great if this patch made it into 6.5-rc1, as I intend to
send a patch series to Wolfram after the upcoming merge window to drop
.probe_new to go in via the i2c tree. There are a few remaining
driver instances that I will have to fix in this series, but I'm happy
about every patch that goes in via its designated tree beforehand.

Best regards
Uwe

-- 
Pengutronix e.K.   | Uwe Kleine-König|
Industrial Linux Solutions | https://www.pengutronix.de/ |




Re: [PATCH 0/3] drm: Allow PRIME 'self-import' for all drivers

2023-06-15 Thread Zack Rusin
On Thu, 2023-06-15 at 11:31 +0200, Thomas Zimmermann wrote:
> Set drm_gem_prime_handle_to_fd() and drm_gem_prime_fd_to_handle()
> for all DRM drivers. Even drivers that do not support PRIME import
> or export of dma-bufs can now import their own buffer objects. This
> is required by some userspace, such as wlroots-based compositors, to
> share buffers among processes.
> 
> The only driver that does not use the drm_gem_prime_*() helpers is
> vmwgfx. Once it has been converted, the callbacks in struct drm_driver
> can be removed.

Hmm, I'm not sure that's ever going to be possible on vmwgfx. Or at least not
while Xorg is still used by anything. Some things in the vmwgfx stack create
"surfaces" which are not GEMs (they can be backed by a GEM object, but don't
have to be) and the prime implementation on vmwgfx has to be able to export and
import those.

In an ideal world I'd just delete the ioctls that allow creating those
"surfaces", but of course that's a no-no, so the possibility of introducing
vmwgfx2 with a saner/modern ioctl model has been floating around internally.
That would at least open a path to removing vmwgfx at some point in the future,
but there's not much that can be done about vmwgfx having to be able to prime
import/export GEMs and wonky non-GEM objects.

z



Re: [PATCH v3] drm/vkms: Add support to 1D gamma LUT

2023-06-15 Thread kernel test robot
Hi Arthur,

kernel test robot noticed the following build warnings:

[auto build test WARNING on drm-misc/drm-misc-next]
[also build test WARNING on drm/drm-next drm-exynos/exynos-drm-next 
drm-intel/for-linux-next drm-intel/for-linux-next-fixes drm-tip/drm-tip 
linus/master v6.4-rc6 next-20230615]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:
https://github.com/intel-lab-lkp/linux/commits/Arthur-Grillo/drm-vkms-Add-support-to-1D-gamma-LUT/20230616-040349
base:   git://anongit.freedesktop.org/drm/drm-misc drm-misc-next
patch link:
https://lore.kernel.org/r/20230615200157.960630-1-arthurgrillo%40riseup.net
patch subject: [PATCH v3] drm/vkms: Add support to 1D gamma LUT
config: x86_64-randconfig-r022-20230615 
(https://download.01.org/0day-ci/archive/20230616/202306161055.obaa9nf1-...@intel.com/config)
compiler: clang version 15.0.7 (https://github.com/llvm/llvm-project.git 
8dfdcc7b7bf66834a761bd8de445840ef68e4d1a)
reproduce (this is a W=1 build):
mkdir -p ~/bin
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
git remote add drm-misc git://anongit.freedesktop.org/drm/drm-misc
git fetch drm-misc drm-misc-next
git checkout drm-misc/drm-misc-next
b4 shazam https://lore.kernel.org/r/20230615200157.960630-1-arthurgri...@riseup.net
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang ~/bin/make.cross W=1 O=build_dir ARCH=x86_64 olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang ~/bin/make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash drivers/gpu/drm/vkms/

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot 
| Closes: 
https://lore.kernel.org/oe-kbuild-all/202306161055.obaa9nf1-...@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/gpu/drm/vkms/vkms_crtc.c:251:25: warning: variable 'gamma_lut' set but not used [-Wunused-but-set-variable]
           struct vkms_color_lut *gamma_lut;
                                  ^
   1 warning generated.


vim +/gamma_lut +251 drivers/gpu/drm/vkms/vkms_crtc.c

   246  
   247  static void vkms_crtc_atomic_flush(struct drm_crtc *crtc,
   248                                     struct drm_atomic_state *state)
   249  {
   250          struct vkms_output *vkms_output = drm_crtc_to_vkms_output(crtc);
 > 251          struct vkms_color_lut *gamma_lut;
   252  
   253          if (crtc->state->event) {
   254                  spin_lock(&crtc->dev->event_lock);
   255  
   256                  if (drm_crtc_vblank_get(crtc) != 0)
   257                          drm_crtc_send_vblank_event(crtc, crtc->state->event);
   258                  else
   259                          drm_crtc_arm_vblank_event(crtc, crtc->state->event);
   260  
   261                  spin_unlock(&crtc->dev->event_lock);
   262  
   263                  crtc->state->event = NULL;
   264          }
   265  
   266          vkms_output->composer_state = to_vkms_crtc_state(crtc->state);
   267          gamma_lut = &vkms_output->composer_state->gamma_lut;
   268  
   269          spin_unlock_irq(&vkms_output->lock);
   270  }
   271  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


Re: [PATCH] drm/panel: move some dsi commands from unprepare to disable

2023-06-15 Thread Doug Anderson
Hi,

On Thu, Jun 15, 2023 at 12:49 AM Neil Armstrong
 wrote:
>
> On 14/06/2023 22:58, Linus Walleij wrote:
> > On Tue, Jun 13, 2023 at 11:08 PM Stephan Gerhold  
> > wrote:
> >
> >> I'm still quite confused about what exactly is supposed to be in
> >> (un)prepare and what in enable/disable. I've seen some related
> >> discussion every now and then but it's still quite inconsistent across
> >> different panel drivers... Can someone clarify this?
> >
> > It is somewhat clarified in commit 45527d435c5e39b6eec4aa0065a562e7cf05d503
> > that added the callbacks:
> >
> > Author: Ajay Kumar 
> > Date:   Fri Jul 18 02:13:48 2014 +0530
> >
> >  drm/panel: add .prepare() and .unprepare() functions
> >
> >  Panels often require an initialization sequence that consists of three
> >  steps: a) powering up the panel, b) starting transmission of video data
> >  and c) enabling the panel (e.g. turn on backlight). This is usually
> >  necessary to avoid visual glitches at the beginning of video data
> >  transmission.
> >
> >  Similarly, the shutdown sequence is typically done in three steps as
> >  well: a) disable the panel (e.g. turn off backlight), b) cease video
> >  data transmission and c) power down the panel.
> >
> >  Currently drivers can only implement .enable() and .disable() 
> > functions,
> >  which is not enough to implement the above sequences. This commit adds 
> > a
> >  second pair of functions, .prepare() and .unprepare() to allow more
> >  fine-grained control over when the above steps are performed.
> >
> >  Signed-off-by: Ajay Kumar 
> >  [treding: rewrite changelog, add kerneldoc]
> >  Signed-off-by: Thierry Reding 
> >
> > My interpretation is that .enable/.disable is for enabling/disabling
> > backlight and well, make things show up on the display, and that
> > happens quickly.
> >
> > prepare/unprepare is for everything else setting up/tearing down
> > the data transmission pipeline.
> >
> > In the clock subsystem the enable/disable could be called in fastpath
> > and prepare/unprepare only from slowpath so e.g an IRQ handler
> > can gate a simple gated clock. This semantic seems to have nothing
> > to do with the display semantic. :/
>
> It had to do, .prepare is called when the DSI link is at LP11 state
> before it has switched to the VIDEO mode, and .unprepare is the
> reverse when VIDEO mode has been disabled and before the DSI link
> is totally disabled.
>
> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c#L938
>
> then
>
> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c#L855
>
> but Doug has started changing this starting with MSM DSI driver, leading to
> current panel drivers not working anymore with the current DSI init mode
> and requires setting pre_enable_prev_first for only some DSI hosts
> who switched out of set_mode().
>
> The DSI init model doesn't fit at all with the atomic bridge model and
> some DSI controllers doesn't support the same features like the allwinner
> DSI controller not support sending LP commands when in HS video mode
> for example.
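
For reference, the three-step power-up and power-down ordering described in the quoted commit message can be sketched as a small user-space model. This is only an illustration of one reading of that text (prepare = power up, enable = backlight on); the `step`, `trace`, `display_on` and `display_off` names are hypothetical and are not part of the DRM panel API.

```c
#include <stddef.h>

/* Hypothetical step names, following the a)/b)/c) steps in the commit text. */
enum step {
	STEP_PREPARE,    /* a) power up the panel */
	STEP_VIDEO_ON,   /* b) start transmission of video data */
	STEP_ENABLE,     /* c) enable the panel, e.g. turn on the backlight */
	STEP_DISABLE,    /* a) disable the panel, e.g. turn off the backlight */
	STEP_VIDEO_OFF,  /* b) cease video data transmission */
	STEP_UNPREPARE   /* c) power down the panel */
};

/* Records the order in which steps were executed. */
struct trace {
	enum step steps[8];
	size_t n;
};

static void record(struct trace *t, enum step s)
{
	if (t->n < sizeof(t->steps) / sizeof(t->steps[0]))
		t->steps[t->n++] = s;
}

/* Power-up: prepare first, then video, then enable. */
static void display_on(struct trace *t)
{
	record(t, STEP_PREPARE);
	record(t, STEP_VIDEO_ON);
	record(t, STEP_ENABLE);
}

/* Power-down: the exact reverse of display_on(). */
static void display_off(struct trace *t)
{
	record(t, STEP_DISABLE);
	record(t, STEP_VIDEO_OFF);
	record(t, STEP_UNPREPARE);
}
```

The model says nothing about what state the DSI link is in at each step, which is the part under discussion in this thread.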

Summary of the history here as I understand it:

1. Before the switch to DRM_PANEL_BRIDGE, things worked OK.

2. After the switch to DRM_PANEL_BRIDGE, things broke for tc358762.
That led to commit 7d8e9a90509f ("drm/msm/dsi: move DSI host powerup
to modeset time"), which was a little ugly but sorta OK, except ...

3. Moving the DSI host powerup to modeset time broke ps8640. That led
to commit ec7981e6c614 ("drm/msm/dsi: don't powerup at modeset time
for parade-ps8640"), which was a hack.

4. We fixed tc358762 using the new "pre_enable_prev_first" in commit
55cac10739d5 ("drm/bridge: tc358762: Set pre_enable_prev_first") and
thus were able to undo moving DSI host powerup to modeset time and
then undo the ps8640 hack. I talk about this a bit more in the message
for commit 9e15123eca79 ("drm/msm/dsi: Stop unconditionally powering
up DSI hosts at modeset").

If there are other things that need "pre_enable_prev_first" we could
do that. If I understand Dave Stevenson [1], though, this doesn't hurt
but technically shouldn't be required. He says that "It is documented
that the mipi_dsi_host_ops transfer function should be called with the
host in any state [1], so the host driver is failing there." Even if
it shouldn't be required, though, "pre_enable_prev_first" can still
have a benefit as Dave says [2] because it would mean that the DSI
controller doesn't have to power itself up and down for each
transfer...

If I understand, if the MSM DSI driver did what Dave said (proactive
turn on if someone sends commands) then we'd actually be OK even with
ps8640, since we don't send any commands in the ps8640 pre_enable()
function.

I guess one other point of reference is commit a3ee9e0b57f8
("drm/panel: boe-tv101wum-nl6: Ensure DSI writes succeed during
disable"). I think Stephen made that change 

[PATCH v9 20/20] drm/msm/a6xx: Add A610 speedbin support

2023-06-15 Thread Konrad Dybcio
A610 is implemented on at least three SoCs: SM6115 (bengal), SM6125
(trinket) and SM6225 (khaje). Trinket does not support speed binning
(only a single SKU exists) and we don't yet support khaje upstream.
Hence, add a fuse mapping table for bengal to allow for per-chip
frequency limiting.

Reviewed-by: Dmitry Baryshkov 
Reviewed-by: Akhil P Oommen 
Signed-off-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index ff9a8d342c77..b3ada1e7b598 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -2204,6 +2204,30 @@ static bool a6xx_progress(struct msm_gpu *gpu, struct 
msm_ringbuffer *ring)
return progress;
 }
 
+static u32 a610_get_speed_bin(u32 fuse)
+{
+   /*
+    * There are (at least) three SoCs implementing A610: SM6125 (trinket),
+    * SM6115 (bengal) and SM6225 (khaje). Trinket does not have speedbinning,
+    * as only a single SKU exists and we don't support khaje upstream yet.
+    * Hence, this matching table is only valid for bengal and can be easily
+    * expanded if need be.
+    */
+
+   if (fuse == 0)
+   return 0;
+   else if (fuse == 206)
+   return 1;
+   else if (fuse == 200)
+   return 2;
+   else if (fuse == 157)
+   return 3;
+   else if (fuse == 127)
+   return 4;
+
+   return UINT_MAX;
+}
+
 static u32 a618_get_speed_bin(u32 fuse)
 {
if (fuse == 0)
@@ -2301,6 +2325,9 @@ static u32 fuse_to_supp_hw(struct device *dev, struct 
adreno_gpu *adreno_gpu, u3
 {
u32 val = UINT_MAX;
 
+   if (adreno_is_a610(adreno_gpu))
+   val = a610_get_speed_bin(fuse);
+
if (adreno_is_a618(adreno_gpu))
val = a618_get_speed_bin(fuse);
 

-- 
2.41.0
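
As an aside, the fuse matching in this patch can be modelled outside the kernel as a table lookup followed by a one-hot mask. The sketch below is a user-space illustration only: the fuse/bin pairs are copied from a610_get_speed_bin() above, while `speedbin_entry`, `a610_bin_lookup`, `bin_to_supp_hw` and `BIN_UNKNOWN` are hypothetical names. Passing the unknown marker through unchanged is an assumption about the error path, which the hunk does not show in full.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the UINT_MAX "no match" marker used in the patch. */
#define BIN_UNKNOWN UINT32_MAX

/* Hypothetical table entry: one fuse value and the speed bin it selects. */
struct speedbin_entry {
	uint32_t fuse;
	uint32_t bin;
};

/* Fuse/bin pairs copied from a610_get_speed_bin() (valid for SM6115 only). */
static const struct speedbin_entry a610_bins[] = {
	{ 0, 0 }, { 206, 1 }, { 200, 2 }, { 157, 3 }, { 127, 4 },
};

/* Return the speed bin for a fuse value, or BIN_UNKNOWN if it is not listed. */
static uint32_t a610_bin_lookup(uint32_t fuse)
{
	for (size_t i = 0; i < sizeof(a610_bins) / sizeof(a610_bins[0]); i++) {
		if (a610_bins[i].fuse == fuse)
			return a610_bins[i].bin;
	}
	return BIN_UNKNOWN;
}

/*
 * Turn a bin index into a one-hot mask, mirroring the (1 << val) at the end
 * of fuse_to_supp_hw(). Assumption: an unknown bin is passed through as-is.
 */
static uint32_t bin_to_supp_hw(uint32_t bin)
{
	if (bin == BIN_UNKNOWN)
		return BIN_UNKNOWN;
	return UINT32_C(1) << bin;
}
```

With this model, fuse 200 selects bin 2 and therefore a supported-hw mask of 0x4.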



[PATCH v9 19/20] drm/msm/a6xx: Add A619_holi speedbin support

2023-06-15 Thread Konrad Dybcio
A619_holi is implemented on at least two SoCs: SM4350 (holi) and SM6375
(blair). This is what seems to be a first occurrence of this happening,
but it's easy to overcome by guarding the SoC-specific fuse values with
of_machine_is_compatible(). Do just that to enable frequency limiting
on these SoCs.

Reviewed-by: Dmitry Baryshkov 
Reviewed-by: Akhil P Oommen 
Signed-off-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 31 +++
 1 file changed, 31 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index d7139eae0f73..ff9a8d342c77 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -2216,6 +2216,34 @@ static u32 a618_get_speed_bin(u32 fuse)
return UINT_MAX;
 }
 
+static u32 a619_holi_get_speed_bin(u32 fuse)
+{
+   /*
+* There are (at least) two SoCs implementing A619_holi: SM4350 (holi)
+* and SM6375 (blair). Limit the fuse matching to the corresponding
+* SoC to prevent bogus frequency setting (as improbable as it may be,
+* given unexpected fuse values are.. unexpected! But still possible.)
+*/
+
+   if (fuse == 0)
+   return 0;
+
+   if (of_machine_is_compatible("qcom,sm4350")) {
+   if (fuse == 138)
+   return 1;
+   else if (fuse == 92)
+   return 2;
+   } else if (of_machine_is_compatible("qcom,sm6375")) {
+   if (fuse == 190)
+   return 1;
+   else if (fuse == 177)
+   return 2;
+   } else
+   pr_warn("Unknown SoC implementing A619_holi!\n");
+
+   return UINT_MAX;
+}
+
 static u32 a619_get_speed_bin(u32 fuse)
 {
if (fuse == 0)
@@ -2276,6 +2304,9 @@ static u32 fuse_to_supp_hw(struct device *dev, struct 
adreno_gpu *adreno_gpu, u3
if (adreno_is_a618(adreno_gpu))
val = a618_get_speed_bin(fuse);
 
+   else if (adreno_is_a619_holi(adreno_gpu))
+   val = a619_holi_get_speed_bin(fuse);
+
else if (adreno_is_a619(adreno_gpu))
val = a619_get_speed_bin(fuse);
 

-- 
2.41.0
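
To illustrate why the fuse values are guarded per SoC, here is a user-space sketch of the same lookup. In the kernel the SoC is identified with of_machine_is_compatible(); in this sketch the compatible string is simply passed in as a parameter, and `a619_holi_bin_lookup` and `BIN_UNKNOWN` are hypothetical names. The fuse values are the ones from the patch above.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for the UINT_MAX "no match" marker used in the patch. */
#define BIN_UNKNOWN UINT32_MAX

/*
 * Map a fuse value to a speed bin, but only within the SoC it belongs to.
 * 'soc' is the machine compatible string, e.g. "qcom,sm4350".
 */
static uint32_t a619_holi_bin_lookup(const char *soc, uint32_t fuse)
{
	/* Fuse 0 means bin 0 on every SoC. */
	if (fuse == 0)
		return 0;

	if (strcmp(soc, "qcom,sm4350") == 0) {
		if (fuse == 138)
			return 1;
		if (fuse == 92)
			return 2;
	} else if (strcmp(soc, "qcom,sm6375") == 0) {
		if (fuse == 190)
			return 1;
		if (fuse == 177)
			return 2;
	}

	/* Unknown SoC, or a fuse value that is not listed for this SoC. */
	return BIN_UNKNOWN;
}
```

A fuse value of 138 resolves to bin 1 on "qcom,sm4350" but is rejected on "qcom,sm6375", which is the cross-SoC mismatch the guard is meant to catch.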



[PATCH v9 15/20] drm/msm/a6xx: Add A610 support

2023-06-15 Thread Konrad Dybcio
A610 is one of (if not the) lowest-tier SKUs in the A6XX family. It
features no GMU, as it's implemented solely on SoCs with SMD_RPM.
What's more interesting is that it does not feature a VDDGX line
either, being powered solely by VDDCX and has an unfortunate hardware
quirk that makes its reset line broken - after a couple of assert/
deassert cycles, it will hang for good and will not wake up again.

This GPU requires mesa changes for proper rendering, and lots of them
at that. The command streams are quite far away from any other A6XX
GPU and hence it needs special care. This patch was validated both
by running an (incomplete) downstream mesa with some hacks (frames
rendered correctly, though some instructions made the GPU hangcheck
which is expected - garbage in, garbage out) and by replaying RD
traces captured with the downstream KGSL driver - no crashes there,
ever.

Add support for this GPU on the kernel side, which comes down to
pretty simply adding A612 HWCG tables, altering a few values and
adding a special case for handling the reset line.

Reviewed-by: Dmitry Baryshkov 
Signed-off-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c  | 99 ++
 drivers/gpu/drm/msm/adreno/adreno_device.c | 12 
 drivers/gpu/drm/msm/adreno/adreno_gpu.h|  8 ++-
 3 files changed, 107 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 2ca9e0440396..47aafc9deaf8 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -252,6 +252,56 @@ static void a6xx_submit(struct msm_gpu *gpu, struct 
msm_gem_submit *submit)
a6xx_flush(gpu, ring);
 }
 
+const struct adreno_reglist a612_hwcg[] = {
+   {REG_A6XX_RBBM_CLOCK_CNTL_SP0, 0x},
+   {REG_A6XX_RBBM_CLOCK_CNTL2_SP0, 0x0220},
+   {REG_A6XX_RBBM_CLOCK_DELAY_SP0, 0x0081},
+   {REG_A6XX_RBBM_CLOCK_HYST_SP0, 0xf3cf},
+   {REG_A6XX_RBBM_CLOCK_CNTL_TP0, 0x},
+   {REG_A6XX_RBBM_CLOCK_CNTL2_TP0, 0x},
+   {REG_A6XX_RBBM_CLOCK_CNTL3_TP0, 0x},
+   {REG_A6XX_RBBM_CLOCK_CNTL4_TP0, 0x0002},
+   {REG_A6XX_RBBM_CLOCK_DELAY_TP0, 0x},
+   {REG_A6XX_RBBM_CLOCK_DELAY2_TP0, 0x},
+   {REG_A6XX_RBBM_CLOCK_DELAY3_TP0, 0x},
+   {REG_A6XX_RBBM_CLOCK_DELAY4_TP0, 0x0001},
+   {REG_A6XX_RBBM_CLOCK_HYST_TP0, 0x},
+   {REG_A6XX_RBBM_CLOCK_HYST2_TP0, 0x},
+   {REG_A6XX_RBBM_CLOCK_HYST3_TP0, 0x},
+   {REG_A6XX_RBBM_CLOCK_HYST4_TP0, 0x0007},
+   {REG_A6XX_RBBM_CLOCK_CNTL_RB0, 0x},
+   {REG_A6XX_RBBM_CLOCK_CNTL2_RB0, 0x0120},
+   {REG_A6XX_RBBM_CLOCK_CNTL_CCU0, 0x2220},
+   {REG_A6XX_RBBM_CLOCK_HYST_RB_CCU0, 0x00040f00},
+   {REG_A6XX_RBBM_CLOCK_CNTL_RAC, 0x05522022},
+   {REG_A6XX_RBBM_CLOCK_CNTL2_RAC, 0x},
+   {REG_A6XX_RBBM_CLOCK_DELAY_RAC, 0x0011},
+   {REG_A6XX_RBBM_CLOCK_HYST_RAC, 0x00445044},
+   {REG_A6XX_RBBM_CLOCK_CNTL_TSE_RAS_RBBM, 0x0422},
+   {REG_A6XX_RBBM_CLOCK_MODE_VFD, 0x},
+   {REG_A6XX_RBBM_CLOCK_MODE_GPC, 0x0222},
+   {REG_A6XX_RBBM_CLOCK_DELAY_HLSQ_2, 0x0002},
+   {REG_A6XX_RBBM_CLOCK_MODE_HLSQ, 0x},
+   {REG_A6XX_RBBM_CLOCK_DELAY_TSE_RAS_RBBM, 0x4000},
+   {REG_A6XX_RBBM_CLOCK_DELAY_VFD, 0x},
+   {REG_A6XX_RBBM_CLOCK_DELAY_GPC, 0x0200},
+   {REG_A6XX_RBBM_CLOCK_DELAY_HLSQ, 0x},
+   {REG_A6XX_RBBM_CLOCK_HYST_TSE_RAS_RBBM, 0x},
+   {REG_A6XX_RBBM_CLOCK_HYST_VFD, 0x},
+   {REG_A6XX_RBBM_CLOCK_HYST_GPC, 0x04104004},
+   {REG_A6XX_RBBM_CLOCK_HYST_HLSQ, 0x},
+   {REG_A6XX_RBBM_CLOCK_CNTL_UCHE, 0x},
+   {REG_A6XX_RBBM_CLOCK_HYST_UCHE, 0x0004},
+   {REG_A6XX_RBBM_CLOCK_DELAY_UCHE, 0x0002},
+   {REG_A6XX_RBBM_ISDB_CNT, 0x0182},
+   {REG_A6XX_RBBM_RAC_THRESHOLD_CNT, 0x},
+   {REG_A6XX_RBBM_SP_HYST_CNT, 0x},
+   {REG_A6XX_RBBM_CLOCK_CNTL_GMU_GX, 0x0222},
+   {REG_A6XX_RBBM_CLOCK_DELAY_GMU_GX, 0x0111},
+   {REG_A6XX_RBBM_CLOCK_HYST_GMU_GX, 0x0555},
+   {},
+};
+
 /* For a615 family (a615, a616, a618 and a619) */
 const struct adreno_reglist a615_hwcg[] = {
{REG_A6XX_RBBM_CLOCK_CNTL_SP0,  0x0222},
@@ -659,6 +709,8 @@ static void a6xx_set_hwcg(struct msm_gpu *gpu, bool state)
 
if (adreno_is_a630(adreno_gpu))
clock_cntl_on = 0x8aa8aa02;
+   else if (adreno_is_a610(adreno_gpu))
+   clock_cntl_on = 0xaaa8aa82;
else
clock_cntl_on = 0x8aa8aa82;
 
@@ -669,13 +721,15 @@ static void a6xx_set_hwcg(struct msm_gpu *gpu, bool state)
return;
 
/* Disable SP clock before programming HWCG registers */
-   gmu_rmw(gmu, REG_A6XX_GPU_GMU_GX_SPTPRAC_CLOCK_CONTROL, 1, 0);
+   if (!adreno_is_a610(adreno_gpu))
+  

[PATCH v9 14/20] drm/msm/a6xx: Add support for A619_holi

2023-06-15 Thread Konrad Dybcio
A619_holi is a GMU-less variant of the already-supported A619 GPU.
It's present on at least SM4350 (holi) and SM6375 (blair). No mesa
changes are required. Add the required kernel-side support for it.

Signed-off-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 27 +--
 drivers/gpu/drm/msm/adreno/adreno_gpu.h |  5 +
 2 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index b91fc02eb08c..2ca9e0440396 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -911,6 +911,9 @@ static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
if (adreno_is_a618(adreno_gpu))
return;
 
+   if (adreno_is_a619_holi(adreno_gpu))
+   hbb_lo = 0;
+
if (adreno_is_a640_family(adreno_gpu))
amsbc = 1;
 
@@ -1135,7 +1138,12 @@ static int hw_init(struct msm_gpu *gpu)
}
 
/* Clear GBIF halt in case GX domain was not collapsed */
-   if (a6xx_has_gbif(adreno_gpu)) {
+   if (adreno_is_a619_holi(adreno_gpu)) {
+   gpu_write(gpu, REG_A6XX_GBIF_HALT, 0);
+   gpu_write(gpu, REG_A6XX_RBBM_GPR0_CNTL, 0);
+   /* Let's make extra sure that the GPU can access the memory.. */
+   mb();
+   } else if (a6xx_has_gbif(adreno_gpu)) {
gpu_write(gpu, REG_A6XX_GBIF_HALT, 0);
gpu_write(gpu, REG_A6XX_RBBM_GBIF_HALT, 0);
/* Let's make extra sure that the GPU can access the memory.. */
@@ -1144,6 +1152,9 @@ static int hw_init(struct msm_gpu *gpu)
 
gpu_write(gpu, REG_A6XX_RBBM_SECVID_TSB_CNTL, 0);
 
+   if (adreno_is_a619_holi(adreno_gpu))
+   a6xx_sptprac_enable(gmu);
+
/*
 * Disable the trusted memory range - we don't actually supported secure
 * memory rendering at this point in time and we don't want to block off
@@ -1760,12 +1771,18 @@ static void a6xx_llc_slices_init(struct platform_device 
*pdev,
 #define GBIF_CLIENT_HALT_MASK  BIT(0)
 #define GBIF_ARB_HALT_MASK BIT(1)
 #define VBIF_XIN_HALT_CTRL0_MASK   GENMASK(3, 0)
+#define VBIF_RESET_ACK_MASK    0xF0
+#define GPR0_GBIF_HALT_REQUEST 0x1E0
 
 void a6xx_bus_clear_pending_transactions(struct adreno_gpu *adreno_gpu, bool gx_off)
 {
    struct msm_gpu *gpu = &adreno_gpu->base;
 
-   if (!a6xx_has_gbif(adreno_gpu)) {
+   if (adreno_is_a619_holi(adreno_gpu)) {
+   gpu_write(gpu, REG_A6XX_RBBM_GPR0_CNTL, GPR0_GBIF_HALT_REQUEST);
+   spin_until((gpu_read(gpu, REG_A6XX_RBBM_VBIF_GX_RESET_STATUS) &
+   (VBIF_RESET_ACK_MASK)) == VBIF_RESET_ACK_MASK);
+   } else if (!a6xx_has_gbif(adreno_gpu)) {
gpu_write(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL0, 
VBIF_XIN_HALT_CTRL0_MASK);
spin_until((gpu_read(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL1) &
(VBIF_XIN_HALT_CTRL0_MASK)) == 
VBIF_XIN_HALT_CTRL0_MASK);
@@ -1861,6 +1878,9 @@ static int a6xx_pm_resume(struct msm_gpu *gpu)
if (ret)
goto err_bulk_clk;
 
+   if (adreno_is_a619_holi(adreno_gpu))
+   a6xx_sptprac_enable(gmu);
+
/* If anything goes south, tear the GPU down piece by piece.. */
if (ret) {
 err_bulk_clk:
@@ -1920,6 +1940,9 @@ static int a6xx_pm_suspend(struct msm_gpu *gpu)
/* Drain the outstanding traffic on memory buses */
a6xx_bus_clear_pending_transactions(adreno_gpu, true);
 
+   if (adreno_is_a619_holi(adreno_gpu))
+   a6xx_sptprac_disable(gmu);
+
clk_bulk_disable_unprepare(gpu->nr_clocks, gpu->grp_clks);
 
pm_runtime_put_sync(gmu->gxpd);
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h 
b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index de0b03a4b594..efd35b7bc4cf 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -263,6 +263,11 @@ static inline int adreno_is_a619(const struct adreno_gpu 
*gpu)
return adreno_is_revn(gpu, 619);
 }
 
+static inline int adreno_is_a619_holi(const struct adreno_gpu *gpu)
+{
+   return adreno_is_a619(gpu) && adreno_has_gmu_wrapper(gpu);
+}
+
 static inline int adreno_is_a630(const struct adreno_gpu *gpu)
 {
return adreno_is_revn(gpu, 630);

-- 
2.41.0



[PATCH v9 18/20] drm/msm/a6xx: Use adreno_is_aXYZ macros in speedbin matching

2023-06-15 Thread Konrad Dybcio
Before transitioning to using per-SoC and not per-Adreno speedbin
fuse values (need another patchset to land elsewhere), a good
improvement/stopgap solution is to use adreno_is_aXYZ macros in
place of explicit revision matching. Do so to allow differentiating
between A619 and A619_holi.

Reviewed-by: Dmitry Baryshkov 
Reviewed-by: Akhil P Oommen 
Signed-off-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 18 +-
 drivers/gpu/drm/msm/adreno/adreno_gpu.h | 15 ---
 2 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index d0ba0844079c..d7139eae0f73 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -2269,23 +2269,23 @@ static u32 adreno_7c3_get_speed_bin(u32 fuse)
return UINT_MAX;
 }
 
-static u32 fuse_to_supp_hw(struct device *dev, struct adreno_rev rev, u32 fuse)
+static u32 fuse_to_supp_hw(struct device *dev, struct adreno_gpu *adreno_gpu, 
u32 fuse)
 {
u32 val = UINT_MAX;
 
-   if (adreno_cmp_rev(ADRENO_REV(6, 1, 8, ANY_ID), rev))
+   if (adreno_is_a618(adreno_gpu))
val = a618_get_speed_bin(fuse);
 
-   else if (adreno_cmp_rev(ADRENO_REV(6, 1, 9, ANY_ID), rev))
+   else if (adreno_is_a619(adreno_gpu))
val = a619_get_speed_bin(fuse);
 
-   else if (adreno_cmp_rev(ADRENO_REV(6, 3, 5, ANY_ID), rev))
+   else if (adreno_is_7c3(adreno_gpu))
val = adreno_7c3_get_speed_bin(fuse);
 
-   else if (adreno_cmp_rev(ADRENO_REV(6, 4, 0, ANY_ID), rev))
+   else if (adreno_is_a640(adreno_gpu))
val = a640_get_speed_bin(fuse);
 
-   else if (adreno_cmp_rev(ADRENO_REV(6, 5, 0, ANY_ID), rev))
+   else if (adreno_is_a650(adreno_gpu))
val = a650_get_speed_bin(fuse);
 
if (val == UINT_MAX) {
@@ -2298,7 +2298,7 @@ static u32 fuse_to_supp_hw(struct device *dev, struct 
adreno_rev rev, u32 fuse)
return (1 << val);
 }
 
-static int a6xx_set_supported_hw(struct device *dev, struct adreno_rev rev)
+static int a6xx_set_supported_hw(struct device *dev, struct adreno_gpu 
*adreno_gpu)
 {
u32 supp_hw;
u32 speedbin;
@@ -2317,7 +2317,7 @@ static int a6xx_set_supported_hw(struct device *dev, 
struct adreno_rev rev)
return ret;
}
 
-   supp_hw = fuse_to_supp_hw(dev, rev, speedbin);
+   supp_hw = fuse_to_supp_hw(dev, adreno_gpu, speedbin);
 
    ret = devm_pm_opp_set_supported_hw(dev, &supp_hw, 1);
if (ret)
@@ -2438,7 +2438,7 @@ struct msm_gpu *a6xx_gpu_init(struct drm_device *dev)
 
a6xx_llc_slices_init(pdev, a6xx_gpu);
 
-   ret = a6xx_set_supported_hw(&pdev->dev, config->rev);
+   ret = a6xx_set_supported_hw(&pdev->dev, adreno_gpu);
if (ret) {
a6xx_destroy(&(a6xx_gpu->base.base));
return ERR_PTR(ret);
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h 
b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index 3a8af5fdaea8..d8c9e8cc3753 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -279,10 +279,9 @@ static inline int adreno_is_a630(const struct adreno_gpu 
*gpu)
return adreno_is_revn(gpu, 630);
 }
 
-static inline int adreno_is_a640_family(const struct adreno_gpu *gpu)
+static inline int adreno_is_a640(const struct adreno_gpu *gpu)
 {
-   return adreno_is_revn(gpu, 640) ||
-   adreno_is_revn(gpu, 680);
+   return adreno_is_revn(gpu, 640);
 }
 
 static inline int adreno_is_a650(const struct adreno_gpu *gpu)
@@ -301,6 +300,11 @@ static inline int adreno_is_a660(const struct adreno_gpu 
*gpu)
return adreno_is_revn(gpu, 660);
 }
 
+static inline int adreno_is_a680(const struct adreno_gpu *gpu)
+{
+   return adreno_is_revn(gpu, 680);
+}
+
 static inline int adreno_is_a690(const struct adreno_gpu *gpu)
 {
return adreno_is_revn(gpu, 690);
@@ -328,6 +332,11 @@ static inline int adreno_is_a650_family(const struct 
adreno_gpu *gpu)
adreno_is_a660_family(gpu);
 }
 
+static inline int adreno_is_a640_family(const struct adreno_gpu *gpu)
+{
+   return adreno_is_a640(gpu) || adreno_is_a680(gpu);
+}
+
 u64 adreno_private_address_space_size(struct msm_gpu *gpu);
 int adreno_get_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
 uint32_t param, uint64_t *value, uint32_t *len);

-- 
2.41.0



[PATCH v9 17/20] drm/msm/a6xx: Use "else if" in GPU speedbin rev matching

2023-06-15 Thread Konrad Dybcio
A GPU can only match one revision at a time. Turn the series of ifs into
an if / else if chain to save some CPU cycles.

Reviewed-by: Dmitry Baryshkov 
Reviewed-by: Akhil P Oommen 
Signed-off-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 97e261d33312..d0ba0844079c 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -2276,16 +2276,16 @@ static u32 fuse_to_supp_hw(struct device *dev, struct 
adreno_rev rev, u32 fuse)
if (adreno_cmp_rev(ADRENO_REV(6, 1, 8, ANY_ID), rev))
val = a618_get_speed_bin(fuse);
 
-   if (adreno_cmp_rev(ADRENO_REV(6, 1, 9, ANY_ID), rev))
+   else if (adreno_cmp_rev(ADRENO_REV(6, 1, 9, ANY_ID), rev))
val = a619_get_speed_bin(fuse);
 
-   if (adreno_cmp_rev(ADRENO_REV(6, 3, 5, ANY_ID), rev))
+   else if (adreno_cmp_rev(ADRENO_REV(6, 3, 5, ANY_ID), rev))
val = adreno_7c3_get_speed_bin(fuse);
 
-   if (adreno_cmp_rev(ADRENO_REV(6, 4, 0, ANY_ID), rev))
+   else if (adreno_cmp_rev(ADRENO_REV(6, 4, 0, ANY_ID), rev))
val = a640_get_speed_bin(fuse);
 
-   if (adreno_cmp_rev(ADRENO_REV(6, 5, 0, ANY_ID), rev))
+   else if (adreno_cmp_rev(ADRENO_REV(6, 5, 0, ANY_ID), rev))
val = a650_get_speed_bin(fuse);
 
if (val == UINT_MAX) {

-- 
2.41.0



[PATCH v9 06/20] drm/msm/a6xx: Move a6xx_bus_clear_pending_transactions to a6xx_gpu

2023-06-15 Thread Konrad Dybcio
This function is responsible for telling the GPU to halt transactions
on all of its relevant buses, drain them and leave them in a predictable
state, so that the GPU can be e.g. reset cleanly.

Move the function to a6xx_gpu.c, remove the static keyword and add a
prototype in a6xx_gpu.h to accommodate the move.

Reviewed-by: Akhil P Oommen 
Signed-off-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 37 ---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 36 ++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h |  2 ++
 3 files changed, 38 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 32852c161aab..6402544f6849 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -868,43 +868,6 @@ static void a6xx_gmu_rpmh_off(struct a6xx_gmu *gmu)
(val & 1), 100, 1000);
 }
 
-#define GBIF_CLIENT_HALT_MASK BIT(0)
-#define GBIF_ARB_HALT_MASK    BIT(1)
-
-static void a6xx_bus_clear_pending_transactions(struct adreno_gpu *adreno_gpu,
-   bool gx_off)
-{
-   struct msm_gpu *gpu = &adreno_gpu->base;
-
-   if (!a6xx_has_gbif(adreno_gpu)) {
-   gpu_write(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL0, 0xf);
-   spin_until((gpu_read(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL1) &
-   0xf) == 0xf);
-   gpu_write(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL0, 0);
-
-   return;
-   }
-
-   if (gx_off) {
-   /* Halt the gx side of GBIF */
-   gpu_write(gpu, REG_A6XX_RBBM_GBIF_HALT, 1);
-   spin_until(gpu_read(gpu, REG_A6XX_RBBM_GBIF_HALT_ACK) & 1);
-   }
-
-   /* Halt new client requests on GBIF */
-   gpu_write(gpu, REG_A6XX_GBIF_HALT, GBIF_CLIENT_HALT_MASK);
-   spin_until((gpu_read(gpu, REG_A6XX_GBIF_HALT_ACK) &
-   (GBIF_CLIENT_HALT_MASK)) == GBIF_CLIENT_HALT_MASK);
-
-   /* Halt all AXI requests on GBIF */
-   gpu_write(gpu, REG_A6XX_GBIF_HALT, GBIF_ARB_HALT_MASK);
-   spin_until((gpu_read(gpu,  REG_A6XX_GBIF_HALT_ACK) &
-   (GBIF_ARB_HALT_MASK)) == GBIF_ARB_HALT_MASK);
-
-   /* The GBIF halt needs to be explicitly cleared */
-   gpu_write(gpu, REG_A6XX_GBIF_HALT, 0x0);
-}
-
 /* Force the GMU off in case it isn't responsive */
 static void a6xx_gmu_force_off(struct a6xx_gmu *gmu)
 {
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index eebb4bc7c0f9..a48f4e3a754a 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1705,6 +1705,42 @@ static void a6xx_llc_slices_init(struct platform_device 
*pdev,
a6xx_gpu->llc_mmio = ERR_PTR(-EINVAL);
 }
 
+#define GBIF_CLIENT_HALT_MASK BIT(0)
+#define GBIF_ARB_HALT_MASK    BIT(1)
+
+void a6xx_bus_clear_pending_transactions(struct adreno_gpu *adreno_gpu, bool gx_off)
+{
+   struct msm_gpu *gpu = &adreno_gpu->base;
+
+   if (!a6xx_has_gbif(adreno_gpu)) {
+   gpu_write(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL0, 0xf);
+   spin_until((gpu_read(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL1) &
+   0xf) == 0xf);
+   gpu_write(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL0, 0);
+
+   return;
+   }
+
+   if (gx_off) {
+   /* Halt the gx side of GBIF */
+   gpu_write(gpu, REG_A6XX_RBBM_GBIF_HALT, 1);
+   spin_until(gpu_read(gpu, REG_A6XX_RBBM_GBIF_HALT_ACK) & 1);
+   }
+
+   /* Halt new client requests on GBIF */
+   gpu_write(gpu, REG_A6XX_GBIF_HALT, GBIF_CLIENT_HALT_MASK);
+   spin_until((gpu_read(gpu, REG_A6XX_GBIF_HALT_ACK) &
+   (GBIF_CLIENT_HALT_MASK)) == GBIF_CLIENT_HALT_MASK);
+
+   /* Halt all AXI requests on GBIF */
+   gpu_write(gpu, REG_A6XX_GBIF_HALT, GBIF_ARB_HALT_MASK);
+   spin_until((gpu_read(gpu,  REG_A6XX_GBIF_HALT_ACK) &
+   (GBIF_ARB_HALT_MASK)) == GBIF_ARB_HALT_MASK);
+
+   /* The GBIF halt needs to be explicitly cleared */
+   gpu_write(gpu, REG_A6XX_GBIF_HALT, 0x0);
+}
+
 static int a6xx_pm_resume(struct msm_gpu *gpu)
 {
struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
index eea2e60ce3b7..9580def06d45 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
@@ -88,4 +88,6 @@ void a6xx_show(struct msm_gpu *gpu, struct msm_gpu_state 
*state,
 struct msm_gpu_state *a6xx_gpu_state_get(struct msm_gpu *gpu);
 int a6xx_gpu_state_put(struct msm_gpu_state *state);
 
+void a6xx_bus_clear_pending_transactions(struct adreno_gpu *adreno_gpu, bool 
gx_off);
+
 #endif /* __A6XX_GPU_H__ */

-- 
2.41.0



[PATCH v9 11/20] drm/msm/a6xx: Move CX GMU power counter enablement to hw_init

2023-06-15 Thread Konrad Dybcio
Since the introduction of A6xx support, we've been enabling the CX GMU
power counter 0 in a bit of a weird spot. Move it to hw_init so that
GMU wrapper GPUs can reuse the same code paths. As a bonus, this order
makes it easier to compare mainline and downstream register access traces.
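
For readers comparing the removed and added hunks: the counter select value is written as 0x20 in the old location and as BIT(5) in the new one, which is the same number. The snippet below is a minimal userspace sketch of the read-modify-write step, assuming gmu_rmw() clears the masked bits and then ORs in the new ones. All sketch_* names are hypothetical stand-ins and not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for the kernel's BIT() macro (hypothetical name). */
#define SKETCH_BIT(n) (UINT32_C(1) << (n))

/* Plain variable standing in for the counter select register. */
static uint32_t sketch_counter_select;

/* Read-modify-write: clear the bits in mask, then OR in the new bits. */
static void sketch_rmw(uint32_t *reg, uint32_t mask, uint32_t or_bits)
{
	uint32_t val = *reg;

	/* In this sketch the new bits are expected to lie inside the mask. */
	assert((or_bits & ~mask) == 0);

	val &= ~mask;
	*reg = val | or_bits;
}

/* Program the low byte with BIT(5) and leave the upper bytes untouched. */
uint32_t sketch_select_busy_counter(uint32_t initial)
{
	sketch_counter_select = initial;
	sketch_rmw(&sketch_counter_select, 0xff, SKETCH_BIT(5));
	return sketch_counter_select;
}
```

Only the low byte of the stand-in register changes, so whatever the other select fields held before the call is preserved.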

Signed-off-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 6 --
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 7 +++
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 906bed49f27d..aae7ea651607 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -479,12 +479,6 @@ static int a6xx_rpmh_start(struct a6xx_gmu *gmu)
 
gmu_write(gmu, REG_A6XX_GMU_RSCC_CONTROL_REQ, 0);
 
-   /* Set up CX GMU counter 0 to count busy ticks */
-   gmu_write(gmu, REG_A6XX_GPU_GMU_AO_GPU_CX_BUSY_MASK, 0xff00);
-   gmu_rmw(gmu, REG_A6XX_GMU_CX_GMU_POWER_COUNTER_SELECT_0, 0xff, 0x20);
-
-   /* Enable the power counter */
-   gmu_write(gmu, REG_A6XX_GMU_CX_GMU_POWER_COUNTER_ENABLE, 1);
return 0;
 }
 
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 8aa4670b4308..0efecde2af1a 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1256,6 +1256,13 @@ static int hw_init(struct msm_gpu *gpu)
0x3f0243f0);
}
 
+   /* Set up the CX GMU counter 0 to count busy ticks */
+   gmu_write(gmu, REG_A6XX_GPU_GMU_AO_GPU_CX_BUSY_MASK, 0xff00);
+
+   /* Enable the power counter */
+   gmu_rmw(gmu, REG_A6XX_GMU_CX_GMU_POWER_COUNTER_SELECT_0, 0xff, BIT(5));
+   gmu_write(gmu, REG_A6XX_GMU_CX_GMU_POWER_COUNTER_ENABLE, 1);
+
/* Protect registers from the CP */
a6xx_set_cp_protect(gpu);
 

-- 
2.41.0



[PATCH v9 16/20] drm/msm/a6xx: Fix some A619 tunables

2023-06-15 Thread Konrad Dybcio
Adreno 619 expects some tunables to be set differently. Make up for it.

Fixes: b7616b5c69e6 ("drm/msm/adreno: Add A619 support")
Reviewed-by: Dmitry Baryshkov 
Reviewed-by: Akhil P Oommen 
Signed-off-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 47aafc9deaf8..97e261d33312 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1306,6 +1306,8 @@ static int hw_init(struct msm_gpu *gpu)
gpu_write(gpu, REG_A6XX_PC_DBG_ECO_CNTL, 0x00200200);
else if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu))
gpu_write(gpu, REG_A6XX_PC_DBG_ECO_CNTL, 0x00300200);
+   else if (adreno_is_a619(adreno_gpu))
+   gpu_write(gpu, REG_A6XX_PC_DBG_ECO_CNTL, 0x00018000);
else if (adreno_is_a610(adreno_gpu))
gpu_write(gpu, REG_A6XX_PC_DBG_ECO_CNTL, 0x0008);
else
@@ -1323,7 +1325,9 @@ static int hw_init(struct msm_gpu *gpu)
a6xx_set_ubwc_config(gpu);
 
/* Enable fault detection */
-   if (adreno_is_a610(adreno_gpu))
+   if (adreno_is_a619(adreno_gpu))
+   gpu_write(gpu, REG_A6XX_RBBM_INTERFACE_HANG_INT_CNTL, (1 << 30) 
| 0x3f);
+   else if (adreno_is_a610(adreno_gpu))
gpu_write(gpu, REG_A6XX_RBBM_INTERFACE_HANG_INT_CNTL, (1 << 30) 
| 0x3);
else
gpu_write(gpu, REG_A6XX_RBBM_INTERFACE_HANG_INT_CNTL, (1 << 30) 
| 0x1f);

-- 
2.41.0



[PATCH v9 13/20] drm/msm/adreno: Disable has_cached_coherent in GMU wrapper configurations

2023-06-15 Thread Konrad Dybcio
A610 and A619_holi don't support the feature. Disable it to make the GPU stop
crashing after almost each and every submission - the received data on
the GPU end was simply incomplete and garbled, resulting in almost nothing
being executed properly. Extend the disablement to adreno_has_gmu_wrapper,
as none of the GMU wrapper Adrenos that we don't support yet seem to feature it.
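
The resulting condition can be restated as a single predicate. This is a minimal sketch, assuming a trimmed-down, hypothetical struct in place of the driver's config and adreno_gpu types; it only mirrors the nested if in the diff below.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced description of a GPU, for illustration only. */
struct sketch_gpu_info {
	unsigned int core;     /* major generation, e.g. 6 for A6xx */
	bool has_gmu_wrapper;  /* true for GMU wrapper parts such as A610 */
};

/* Cached-coherent is enabled only for core >= 6 parts with a real GMU. */
bool sketch_has_cached_coherent(const struct sketch_gpu_info *info)
{
	assert(info);
	return info->core >= 6 && !info->has_gmu_wrapper;
}
```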

Reviewed-by: Akhil P Oommen 
Signed-off-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/adreno_device.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c 
b/drivers/gpu/drm/msm/adreno/adreno_device.c
index e5a865024e94..6ea24b8ddcf8 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_device.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
@@ -565,7 +565,6 @@ static int adreno_bind(struct device *dev, struct device 
*master, void *data)
config.rev.minor, config.rev.patchid);
 
priv->is_a2xx = config.rev.core == 2;
-   priv->has_cached_coherent = config.rev.core >= 6;
 
gpu = info->init(drm);
if (IS_ERR(gpu)) {
@@ -577,6 +576,10 @@ static int adreno_bind(struct device *dev, struct device 
*master, void *data)
if (ret)
return ret;
 
+   if (config.rev.core >= 6)
+   if (!adreno_has_gmu_wrapper(to_adreno_gpu(gpu)))
+   priv->has_cached_coherent = true;
+
return 0;
 }
 

-- 
2.41.0



[PATCH v9 12/20] drm/msm/a6xx: Introduce GMU wrapper support

2023-06-15 Thread Konrad Dybcio
Some (particularly SMD_RPM, a.k.a non-RPMh) SoCs implement A6XX GPUs
but don't implement the associated GMUs. This is due to the fact that
the GMU directly pokes at RPMh. Sadly, this means we have to take care
of enabling & scaling power rails, clocks and bandwidth ourselves.

Reuse existing Adreno-common code and modify the deeply-GMU-infused
A6XX code to facilitate these GPUs. This involves if-ing out lots
of GMU callbacks and introducing a new type of GMU - GMU wrapper (it's
the actual name that Qualcomm uses in their downstream kernels).

This is essentially a register region which is convenient to model
as a device. We'll use it for managing the GDSCs. The register
layout matches the actual GMU_CX/GX regions on the "real GMU" devices
and lets us reuse quite a bit of gmu_read/write/rmw calls.

Signed-off-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c   |  72 +-
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 201 
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h   |   1 +
 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c |  14 +-
 drivers/gpu/drm/msm/adreno/adreno_gpu.c |   8 +-
 drivers/gpu/drm/msm/adreno/adreno_gpu.h |   6 +
 6 files changed, 266 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index aae7ea651607..5deb79924897 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -1431,6 +1431,7 @@ static int a6xx_gmu_get_irq(struct a6xx_gmu *gmu, struct 
platform_device *pdev,
 
 void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
 {
+   struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
struct platform_device *pdev = to_platform_device(gmu->dev);
 
@@ -1456,10 +1457,12 @@ void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
gmu->mmio = NULL;
gmu->rscc = NULL;
 
-   a6xx_gmu_memory_free(gmu);
+   if (!adreno_has_gmu_wrapper(adreno_gpu)) {
+   a6xx_gmu_memory_free(gmu);
 
-   free_irq(gmu->gmu_irq, gmu);
-   free_irq(gmu->hfi_irq, gmu);
+   free_irq(gmu->gmu_irq, gmu);
+   free_irq(gmu->hfi_irq, gmu);
+   }
 
/* Drop reference taken in of_find_device_by_node */
put_device(gmu->dev);
@@ -1478,6 +1481,69 @@ static int cxpd_notifier_cb(struct notifier_block *nb,
return 0;
 }
 
+int a6xx_gmu_wrapper_init(struct a6xx_gpu *a6xx_gpu, struct device_node *node)
+{
+   struct platform_device *pdev = of_find_device_by_node(node);
+   struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
+   int ret;
+
+   if (!pdev)
+   return -ENODEV;
+
+   gmu->dev = &pdev->dev;
+
+   of_dma_configure(gmu->dev, node, true);
+
+   pm_runtime_enable(gmu->dev);
+
+   /* Mark legacy for manual SPTPRAC control */
+   gmu->legacy = true;
+
+   /* Map the GMU registers */
+   gmu->mmio = a6xx_gmu_get_mmio(pdev, "gmu");
+   if (IS_ERR(gmu->mmio)) {
+   ret = PTR_ERR(gmu->mmio);
+   goto err_mmio;
+   }
+
+   gmu->cxpd = dev_pm_domain_attach_by_name(gmu->dev, "cx");
+   if (IS_ERR(gmu->cxpd)) {
+   ret = PTR_ERR(gmu->cxpd);
+   goto err_mmio;
+   }
+
+   if (!device_link_add(gmu->dev, gmu->cxpd, DL_FLAG_PM_RUNTIME)) {
+   ret = -ENODEV;
+   goto detach_cxpd;
+   }
+
+   init_completion(&gmu->pd_gate);
+   complete_all(&gmu->pd_gate);
+   gmu->pd_nb.notifier_call = cxpd_notifier_cb;
+
+   /* Get a link to the GX power domain to reset the GPU */
+   gmu->gxpd = dev_pm_domain_attach_by_name(gmu->dev, "gx");
+   if (IS_ERR(gmu->gxpd)) {
+   ret = PTR_ERR(gmu->gxpd);
+   goto err_mmio;
+   }
+
+   gmu->initialized = true;
+
+   return 0;
+
+detach_cxpd:
+   dev_pm_domain_detach(gmu->cxpd, false);
+
+err_mmio:
+   iounmap(gmu->mmio);
+
+   /* Drop reference taken in of_find_device_by_node */
+   put_device(gmu->dev);
+
+   return ret;
+}
+
 int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct device_node *node)
 {
struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 0efecde2af1a..b91fc02eb08c 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -21,7 +21,7 @@ static inline bool _a6xx_check_idle(struct msm_gpu *gpu)
struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
 
/* Check that the GMU is idle */
-   if (!a6xx_gmu_isidle(&a6xx_gpu->gmu))
+   if (!adreno_has_gmu_wrapper(adreno_gpu) && 
!a6xx_gmu_isidle(&a6xx_gpu->gmu))
return false;
 
/* Check tha the CX master is idle */
@@ -1126,10 +1126,13 @@ static int hw_init(struct msm_gpu *gpu)
 {
struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
+   

[PATCH v9 09/20] drm/msm/a6xx: Remove both GBIF and RBBM GBIF halt on hw init

2023-06-15 Thread Konrad Dybcio
Currently we're only deasserting REG_A6XX_RBBM_GBIF_HALT, but we also
need REG_A6XX_GBIF_HALT to be set to 0.

This is typically done automatically on successful GX collapse, but in
case that fails, we should take care of it.

Also, add a memory barrier to ensure it's gone through before jumping
to further initialization.

Reviewed-by: Dmitry Baryshkov 
Signed-off-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index b627be3f6360..7e0d1dfcd993 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -,8 +,12 @@ static int hw_init(struct msm_gpu *gpu)
a6xx_gmu_set_oob(&a6xx_gpu->gmu, GMU_OOB_GPU_SET);
 
/* Clear GBIF halt in case GX domain was not collapsed */
-   if (a6xx_has_gbif(adreno_gpu))
+   if (a6xx_has_gbif(adreno_gpu)) {
+   gpu_write(gpu, REG_A6XX_GBIF_HALT, 0);
gpu_write(gpu, REG_A6XX_RBBM_GBIF_HALT, 0);
+   /* Let's make extra sure that the GPU can access the memory.. */
+   mb();
+   }
 
gpu_write(gpu, REG_A6XX_RBBM_SECVID_TSB_CNTL, 0);
 

-- 
2.41.0



[PATCH v9 10/20] drm/msm/a6xx: Extend and explain UBWC config

2023-06-15 Thread Konrad Dybcio
Rename lower_bit to hbb_lo and explain what it signifies.
Add explanations (wherever possible) to other tunables.

Port setting min_access_length, ubwc_mode and hbb_hi from downstream.
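
The hbb_hi/hbb_lo split that the new code comment describes can be sketched in plain C. This is a minimal sketch, assuming the description in the diff comment (subtract 13, low two bits to hbb_lo, the next bit to hbb_hi) and the RB_NC_MODE_CNTL bit positions used in the diff below. The sketch_* names are hypothetical and not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Result of splitting (highest_bank_bit - 13) into its two fields. */
struct sketch_hbb {
	uint32_t hi; /* bit 2 of (highest_bank_bit - 13) */
	uint32_t lo; /* bits 1:0 of (highest_bank_bit - 13) */
};

struct sketch_hbb sketch_split_hbb(uint32_t highest_bank_bit)
{
	struct sketch_hbb out;
	uint32_t hbb;

	/* 13 is the minimum value allowed by the hardware per the comment. */
	assert(highest_bank_bit >= 13);

	hbb = highest_bank_bit - 13;
	out.lo = hbb & 0x3;
	out.hi = (hbb >> 2) & 0x1;
	return out;
}

/* Pack the fields the way the patch writes REG_A6XX_RB_NC_MODE_CNTL. */
uint32_t sketch_rb_nc_mode_cntl(struct sketch_hbb hbb,
				uint32_t rgb565_predicator, uint32_t amsbc,
				uint32_t min_acc_len, uint32_t ubwc_mode)
{
	return rgb565_predicator << 11 | hbb.hi << 10 | amsbc << 4 |
	       min_acc_len << 3 | hbb.lo << 1 | ubwc_mode;
}
```

Under these assumptions, the default hbb_lo = 2 corresponds to a highest bank bit of 15, and hbb_hi only becomes non-zero from a highest bank bit of 17 upwards.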

Reviewed-by: Rob Clark 
Reviewed-by: Akhil P Oommen 
Signed-off-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 41 ++-
 1 file changed, 31 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 7e0d1dfcd993..8aa4670b4308 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -887,10 +887,25 @@ static void a6xx_set_cp_protect(struct msm_gpu *gpu)
 static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
 {
struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
-   u32 lower_bit = 2;
-   u32 amsbc = 0;
+   /* Unknown, introduced with A650 family, related to UBWC mode/ver 4 */
u32 rgb565_predicator = 0;
+   /* Unknown, introduced with A650 family */
u32 uavflagprd_inv = 0;
+   /* Whether the minimum access length is 64 bits */
+   u32 min_acc_len = 0;
+   /* Entirely magic, per-GPU-gen value */
+   u32 ubwc_mode = 0;
+   /*
+* The Highest Bank Bit value represents the bit of the highest DDR 
bank.
+* We then subtract 13 from it (13 is the minimum value allowed by hw) 
and
+* write the lowest two bits of the remaining value as hbb_lo and the
+* one above it as hbb_hi to the hardware. This should ideally use DRAM
+* type detection.
+*/
+   u32 hbb_hi = 0;
+   u32 hbb_lo = 2;
+   /* Unknown, introduced with A640/680 */
+   u32 amsbc = 0;
 
/* a618 is using the hw default values */
if (adreno_is_a618(adreno_gpu))
@@ -901,32 +916,38 @@ static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
 
if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu)) {
/* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
-   lower_bit = 3;
+   hbb_lo = 3;
amsbc = 1;
rgb565_predicator = 1;
uavflagprd_inv = 2;
}
 
if (adreno_is_a690(adreno_gpu)) {
-   lower_bit = 2;
+   hbb_lo = 2;
amsbc = 1;
rgb565_predicator = 1;
uavflagprd_inv = 2;
}
 
if (adreno_is_7c3(adreno_gpu)) {
-   lower_bit = 1;
+   hbb_lo = 1;
amsbc = 1;
rgb565_predicator = 1;
uavflagprd_inv = 2;
}
 
gpu_write(gpu, REG_A6XX_RB_NC_MODE_CNTL,
-   rgb565_predicator << 11 | amsbc << 4 | lower_bit << 1);
-   gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, lower_bit << 1);
-   gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL,
-   uavflagprd_inv << 4 | lower_bit << 1);
-   gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, lower_bit << 21);
+ rgb565_predicator << 11 | hbb_hi << 10 | amsbc << 4 |
+ min_acc_len << 3 | hbb_lo << 1 | ubwc_mode);
+
+   gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, hbb_hi << 4 |
+ min_acc_len << 3 | hbb_lo << 1 | ubwc_mode);
+
+   gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL, hbb_hi << 10 |
+ uavflagprd_inv << 4 | min_acc_len << 3 |
+ hbb_lo << 1 | ubwc_mode);
+
+   gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, min_acc_len << 23 | hbb_lo << 
21);
 }
 
 static int a6xx_cp_init(struct msm_gpu *gpu)

-- 
2.41.0



[PATCH v9 04/20] drm/msm/a6xx: Remove static keyword from sptprac en/disable functions

2023-06-15 Thread Konrad Dybcio
These two will be reused by at least A619_holi in the non-GMU
paths. Make them non-static to allow that.

Reviewed-by: Dmitry Baryshkov 
Signed-off-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 4 ++--
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h | 2 ++
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 8914992378f2..a6fa273d700e 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -354,7 +354,7 @@ void a6xx_gmu_clear_oob(struct a6xx_gmu *gmu, enum 
a6xx_gmu_oob_state state)
 }
 
 /* Enable CPU control of SPTP power power collapse */
-static int a6xx_sptprac_enable(struct a6xx_gmu *gmu)
+int a6xx_sptprac_enable(struct a6xx_gmu *gmu)
 {
int ret;
u32 val;
@@ -376,7 +376,7 @@ static int a6xx_sptprac_enable(struct a6xx_gmu *gmu)
 }
 
 /* Disable CPU control of SPTP power power collapse */
-static void a6xx_sptprac_disable(struct a6xx_gmu *gmu)
+void a6xx_sptprac_disable(struct a6xx_gmu *gmu)
 {
u32 val;
int ret;
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
index 4759a8ce51e4..236f81a43caa 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.h
@@ -193,5 +193,7 @@ int a6xx_hfi_set_freq(struct a6xx_gmu *gmu, int index);
 
 bool a6xx_gmu_gx_is_on(struct a6xx_gmu *gmu);
 bool a6xx_gmu_sptprac_is_on(struct a6xx_gmu *gmu);
+void a6xx_sptprac_disable(struct a6xx_gmu *gmu);
+int a6xx_sptprac_enable(struct a6xx_gmu *gmu);
 
 #endif

-- 
2.41.0



[PATCH v9 07/20] drm/msm/a6xx: Improve a6xx_bus_clear_pending_transactions()

2023-06-15 Thread Konrad Dybcio
Unify the indentation and explain the cryptic 0xF value.
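
For reference, GENMASK(3, 0) covers bits 3 down to 0, i.e. the same 0xf that was previously open-coded. The snippet below is a minimal userspace sketch, assuming a 32-bit re-implementation of the kernel macro as a function; sketch_genmask() and sketch_all_halted() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit stand-in for the kernel's GENMASK(high, low). */
uint32_t sketch_genmask(unsigned int high, unsigned int low)
{
	assert(high <= 31 && low <= high);
	return (UINT32_MAX >> (31 - high)) & (UINT32_MAX << low);
}

/* True once every bit of the 4-bit mask is set in the status value. */
int sketch_all_halted(uint32_t status)
{
	uint32_t mask = sketch_genmask(3, 0);

	return (status & mask) == mask;
}
```

The comparison has the same shape as the spin_until() condition in the diff: mask the status register and wait until all masked bits read back as set.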

Reviewed-by: Akhil P Oommen 
Signed-off-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index a48f4e3a754a..d5bd008c2947 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1705,17 +1705,18 @@ static void a6xx_llc_slices_init(struct platform_device 
*pdev,
a6xx_gpu->llc_mmio = ERR_PTR(-EINVAL);
 }
 
-#define GBIF_CLIENT_HALT_MASK BIT(0)
-#define GBIF_ARB_HALT_MASKBIT(1)
+#define GBIF_CLIENT_HALT_MASK  BIT(0)
+#define GBIF_ARB_HALT_MASK BIT(1)
+#define VBIF_XIN_HALT_CTRL0_MASK   GENMASK(3, 0)
 
 void a6xx_bus_clear_pending_transactions(struct adreno_gpu *adreno_gpu, bool 
gx_off)
 {
struct msm_gpu *gpu = &adreno_gpu->base;
 
if (!a6xx_has_gbif(adreno_gpu)) {
-   gpu_write(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL0, 0xf);
+   gpu_write(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL0, 
VBIF_XIN_HALT_CTRL0_MASK);
spin_until((gpu_read(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL1) &
-   0xf) == 0xf);
+   (VBIF_XIN_HALT_CTRL0_MASK)) == 
VBIF_XIN_HALT_CTRL0_MASK);
gpu_write(gpu, REG_A6XX_VBIF_XIN_HALT_CTRL0, 0);
 
return;

-- 
2.41.0



[PATCH v9 08/20] drm/msm/a6xx: Add a helper for software-resetting the GPU

2023-06-15 Thread Konrad Dybcio
Introduce a6xx_gpu_sw_reset() in preparation for adding GMU wrapper
GPUs and reuse it in a6xx_gmu_force_off().

This helper, contrary to the original usage in GMU code paths, adds
a readback+delay sequence to ensure that the reset is never deasserted
too quickly due to e.g. OoO execution going crazy.
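
The write, readback and delay sequence can be sketched outside the kernel as follows. This is a minimal sketch, assuming a plain variable in place of REG_A6XX_RBBM_SW_RESET_CMD and a delay stub that only counts microseconds instead of waiting. All sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static uint32_t sketch_reset_reg;     /* stand-in for the reset register */
static unsigned int sketch_reads;     /* number of readbacks performed */
static unsigned int sketch_delay_us;  /* total delay requested so far */

static void sketch_write(uint32_t val) { sketch_reset_reg = val; }
static uint32_t sketch_read(void) { sketch_reads++; return sketch_reset_reg; }
static void sketch_udelay(unsigned int us) { sketch_delay_us += us; }

void sketch_sw_reset(bool assert_reset)
{
	sketch_write(assert_reset);

	/* Read the register back and wait briefly so the write has landed. */
	(void)sketch_read();
	sketch_udelay(1);

	/* Hold the reset asserted for at least 100 us. */
	if (assert_reset)
		sketch_udelay(100);
}
```

Asserting the reset costs 101 us in total in this sketch, while deasserting only pays for the short post-readback delay.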

Signed-off-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c |  3 +--
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 12 
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h |  1 +
 3 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 6402544f6849..906bed49f27d 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -899,8 +899,7 @@ static void a6xx_gmu_force_off(struct a6xx_gmu *gmu)
a6xx_bus_clear_pending_transactions(adreno_gpu, true);
 
/* Reset GPU core blocks */
-   gpu_write(gpu, REG_A6XX_RBBM_SW_RESET_CMD, 1);
-   udelay(100);
+   a6xx_gpu_sw_reset(gpu, true);
 }
 
 static void a6xx_gmu_set_initial_freq(struct msm_gpu *gpu, struct a6xx_gmu 
*gmu)
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index d5bd008c2947..b627be3f6360 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1742,6 +1742,18 @@ void a6xx_bus_clear_pending_transactions(struct 
adreno_gpu *adreno_gpu, bool gx_
gpu_write(gpu, REG_A6XX_GBIF_HALT, 0x0);
 }
 
+void a6xx_gpu_sw_reset(struct msm_gpu *gpu, bool assert)
+{
+   gpu_write(gpu, REG_A6XX_RBBM_SW_RESET_CMD, assert);
+   /* Perform a bogus read and add a brief delay to ensure ordering. */
+   gpu_read(gpu, REG_A6XX_RBBM_SW_RESET_CMD);
+   udelay(1);
+
+   /* The reset line needs to be asserted for at least 100 us */
+   if (assert)
+   udelay(100);
+}
+
 static int a6xx_pm_resume(struct msm_gpu *gpu)
 {
struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
index 9580def06d45..aa70390ee1c6 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
@@ -89,5 +89,6 @@ struct msm_gpu_state *a6xx_gpu_state_get(struct msm_gpu *gpu);
 int a6xx_gpu_state_put(struct msm_gpu_state *state);
 
 void a6xx_bus_clear_pending_transactions(struct adreno_gpu *adreno_gpu, bool 
gx_off);
+void a6xx_gpu_sw_reset(struct msm_gpu *gpu, bool assert);
 
 #endif /* __A6XX_GPU_H__ */

-- 
2.41.0



[PATCH v9 05/20] drm/msm/a6xx: Move force keepalive vote removal to a6xx_gmu_force_off()

2023-06-15 Thread Konrad Dybcio
As pointed out by Akhil during the review process of GMU wrapper
introduction [1], it makes sense to move this write into the function
that's responsible for forcibly shutting the GMU off.

It is also very convenient to move this to GMU-specific code, so that
it does not have to be guarded by an if-condition to avoid calling it
on GMU wrapper targets.

Move the write to the aforementioned a6xx_gmu_force_off() to achieve
that. No effective functional change.

[1] 
https://lore.kernel.org/linux-arm-msm/20230501194022.ga18...@akhilpo-linux.qualcomm.com/

Reviewed-by: Akhil P Oommen 
Signed-off-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 6 ++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 6 --
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index a6fa273d700e..32852c161aab 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -912,6 +912,12 @@ static void a6xx_gmu_force_off(struct a6xx_gmu *gmu)
struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
struct msm_gpu *gpu = &adreno_gpu->base;
 
+   /*
+* Turn off keep alive that might have been enabled by the hang
+* interrupt
+*/
+   gmu_write(&a6xx_gpu->gmu, REG_A6XX_GMU_GMU_PWR_COL_KEEPALIVE, 0);
+
/* Flush all the queues */
a6xx_hfi_stop(gmu);
 
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index ab5c446e4409..eebb4bc7c0f9 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1382,12 +1382,6 @@ static void a6xx_recover(struct msm_gpu *gpu)
/* Halt SQE first */
gpu_write(gpu, REG_A6XX_CP_SQE_CNTL, 3);
 
-   /*
-* Turn off keep alive that might have been enabled by the hang
-* interrupt
-*/
-   gmu_write(&a6xx_gpu->gmu, REG_A6XX_GMU_GMU_PWR_COL_KEEPALIVE, 0);
-
pm_runtime_dont_use_autosuspend(&gpu->pdev->dev);
 
/* active_submit won't change until we make a submission */

-- 
2.41.0



[PATCH v9 01/20] dt-bindings: display/msm: gpu: Document GMU wrapper-equipped A6xx

2023-06-15 Thread Konrad Dybcio
The "GMU Wrapper" is Qualcomm's name for "let's treat the GPU blocks
we'd normally assign to the GMU as if they were a part of the GMU, even
though they are not". It's a (good) software representation of the GMU_CX
and GMU_GX register spaces within the GPUSS that helps us programmatically
treat these de-facto GMU-less parts in a way that's very similar to their
GMU-equipped cousins, massively saving up on code duplication.

The "wrapper" register space was specifically designed to mimic the layout
of a real GMU, though it rather obviously does not have the M3 core et al.

GMU wrapper-equipped A6xx GPUs require clocks and clock-names to be
specified under the GPU node, just like their older cousins. Account
for that.

Acked-by: Rob Herring 
Reviewed-by: Krzysztof Kozlowski 
Signed-off-by: Konrad Dybcio 
---
 .../devicetree/bindings/display/msm/gpu.yaml   | 61 ++
 1 file changed, 52 insertions(+), 9 deletions(-)

diff --git a/Documentation/devicetree/bindings/display/msm/gpu.yaml 
b/Documentation/devicetree/bindings/display/msm/gpu.yaml
index 5dabe7b6794b..58ca8912a8c3 100644
--- a/Documentation/devicetree/bindings/display/msm/gpu.yaml
+++ b/Documentation/devicetree/bindings/display/msm/gpu.yaml
@@ -36,10 +36,7 @@ properties:
 
   reg-names:
 minItems: 1
-items:
-  - const: kgsl_3d0_reg_memory
-  - const: cx_mem
-  - const: cx_dbgc
+maxItems: 3
 
   interrupts:
 maxItems: 1
@@ -157,16 +154,62 @@ allOf:
   required:
 - clocks
 - clock-names
+
   - if:
   properties:
 compatible:
   contains:
-pattern: '^qcom,adreno-6[0-9][0-9]\.[0-9]$'
-
-then: # Since Adreno 6xx series clocks should be defined in GMU
+enum:
+  - qcom,adreno-610.0
+  - qcom,adreno-619.1
+then:
   properties:
-clocks: false
-clock-names: false
+clocks:
+  minItems: 6
+  maxItems: 6
+
+clock-names:
+  items:
+- const: core
+  description: GPU Core clock
+- const: iface
+  description: GPU Interface clock
+- const: mem_iface
+  description: GPU Memory Interface clock
+- const: alt_mem_iface
+  description: GPU Alternative Memory Interface clock
+- const: gmu
+  description: CX GMU clock
+- const: xo
+  description: GPUCC clocksource clock
+
+reg-names:
+  minItems: 1
+  items:
+- const: kgsl_3d0_reg_memory
+- const: cx_dbgc
+
+  required:
+- clocks
+- clock-names
+else:
+  if:
+properties:
+  compatible:
+contains:
+  pattern: '^qcom,adreno-6[0-9][0-9]\.[0-9]$'
+
+  then: # Starting with A6xx, the clocks are usually defined in the GMU 
node
+properties:
+  clocks: false
+  clock-names: false
+
+  reg-names:
+minItems: 1
+items:
+  - const: kgsl_3d0_reg_memory
+  - const: cx_mem
+  - const: cx_dbgc
 
 examples:
   - |

-- 
2.41.0



[PATCH v9 03/20] drm/msm/adreno: Use adreno_is_revn for A690

2023-06-15 Thread Konrad Dybcio
The adreno_is_revn rework came at the same time as A690 introduction
and that resulted in it not covering all cases. Fix it.

Signed-off-by: Konrad Dybcio 
---
 drivers/gpu/drm/msm/adreno/adreno_gpu.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h 
b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index 9a7626c7ac4d..5a26c8a2de7c 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -286,7 +286,7 @@ static inline int adreno_is_a660(const struct adreno_gpu 
*gpu)
 
 static inline int adreno_is_a690(const struct adreno_gpu *gpu)
 {
-   return gpu->revn == 690;
+   return adreno_is_revn(gpu, 690);
 };
 
 /* check for a615, a616, a618, a619 or any derivatives */

-- 
2.41.0



[PATCH v9 02/20] dt-bindings: display/msm/gmu: Add GMU wrapper

2023-06-15 Thread Konrad Dybcio
The "GMU Wrapper" is Qualcomm's name for "let's treat the GPU blocks
we'd normally assign to the GMU as if they were a part of the GMU, even
though they are not". It's a (good) software representation of the GMU_CX
and GMU_GX register spaces within the GPUSS that helps us programmatically
treat these de-facto GMU-less parts in a way that's very similar to their
GMU-equipped cousins, massively saving up on code duplication.

The "wrapper" register space was specifically designed to mimic the layout
of a real GMU, though it rather obviously does not have the M3 core et al.

To sum it all up, the GMU wrapper is essentially a register space within
the GPU, which Linux sees as a dumbed-down regular GMU: there's no clocks,
interrupts, multiple reg spaces, iommus and OPP. Document it.

Reviewed-by: Krzysztof Kozlowski 
Signed-off-by: Konrad Dybcio 
---
 .../devicetree/bindings/display/msm/gmu.yaml   | 50 --
 1 file changed, 38 insertions(+), 12 deletions(-)

diff --git a/Documentation/devicetree/bindings/display/msm/gmu.yaml 
b/Documentation/devicetree/bindings/display/msm/gmu.yaml
index f31a26305ca9..5fc4106110ad 100644
--- a/Documentation/devicetree/bindings/display/msm/gmu.yaml
+++ b/Documentation/devicetree/bindings/display/msm/gmu.yaml
@@ -19,16 +19,18 @@ description: |
 
 properties:
   compatible:
-items:
-  - pattern: '^qcom,adreno-gmu-6[0-9][0-9]\.[0-9]$'
-  - const: qcom,adreno-gmu
+oneOf:
+  - items:
+  - pattern: '^qcom,adreno-gmu-6[0-9][0-9]\.[0-9]$'
+  - const: qcom,adreno-gmu
+  - const: qcom,adreno-gmu-wrapper
 
   reg:
-minItems: 3
+minItems: 1
 maxItems: 4
 
   reg-names:
-minItems: 3
+minItems: 1
 maxItems: 4
 
   clocks:
@@ -44,7 +46,6 @@ properties:
   - description: GMU HFI interrupt
   - description: GMU interrupt
 
-
   interrupt-names:
 items:
   - const: hfi
@@ -72,14 +73,8 @@ required:
   - compatible
   - reg
   - reg-names
-  - clocks
-  - clock-names
-  - interrupts
-  - interrupt-names
   - power-domains
   - power-domain-names
-  - iommus
-  - operating-points-v2
 
 additionalProperties: false
 
@@ -218,6 +213,28 @@ allOf:
 - const: axi
 - const: memnoc
 
+  - if:
+  properties:
+compatible:
+  contains:
+const: qcom,adreno-gmu-wrapper
+then:
+  properties:
+reg:
+  items:
+- description: GMU wrapper register space
+reg-names:
+  items:
+- const: gmu
+else:
+  required:
+- clocks
+- clock-names
+- interrupts
+- interrupt-names
+- iommus
+- operating-points-v2
+
 examples:
   - |
 #include 
@@ -250,3 +267,12 @@ examples:
 iommus = <_smmu 5>;
 operating-points-v2 = <_opp_table>;
 };
+
+gmu_wrapper: gmu@596a000 {
+compatible = "qcom,adreno-gmu-wrapper";
+reg = <0x0596a000 0x3>;
+reg-names = "gmu";
+power-domains = < GPU_CX_GDSC>,
+< GPU_GX_GDSC>;
+power-domain-names = "cx", "gx";
+};

-- 
2.41.0



[PATCH v9 00/20] GMU-less A6xx support (A610, A619_holi)

2023-06-15 Thread Konrad Dybcio
v8 -> v9:
- Re-pick-up Krzysztof's lost r-b tag (I messed up, sorry)
- Rebase on constifying-adreno_is_aXYZ and A690 changes
- Fix A610 inactive period
- Move the stray A619 register write from A610 patch to the A619 patch
- Add one more commit, cleaning up A690 addition for git context (for adding
  adreno_is_a680 in "Use adreno_is_aXYZ macros in speedbin matching")
- Use a readback+delay combo instead of a memory barrier in sw reset
- Separate out GMU CX power counter moving
- Pick up tags from v8
v8: 
https://lore.kernel.org/r/20230223-topic-gmuwrapper-v8-0-69c682066...@linaro.org

v7 -> v8:
- Fix up resume/suspend (icc now correctly parks to 0, don't abuse
  OPP & genpd throughout system-wide suspend)
- Don't handle ebi1_clk separately, the bulk ops handle it just fine
- Rebase on next-20230525 (no meaningful changes)

v7: 
https://lore.kernel.org/linux-arm-msm/20230223-topic-gmuwrapper-v7-0-ecc7aab83...@linaro.org/

v6 -> v7:
- Rebase on next-20230519 (A640/650 speedbin merged already)

- separate out the .get_timestamp cb for gmu wrapper

- check for gmu presence inside a6xx_llc_slices_(init|destroy) instead
  of before calling them

- use REG_A6XX_RBBM_GPR0_CNTL instead of literal 0x18

- move a6xx_bus_clear_pending_transactions to a6xx_gpu, clean it up
  and reuse it for gmu wrapper gpus

- drop clearing RBBM_GBIF (GBIF from GX's POV) as part of draining the
  buses, it's not necessary

- introduce a helper for gpu softreset

- sw-reset the gmu wrapper GPUs *after* draining GBIF and only reset
  it if it's hung

- reword the commit message in "Remove both GBIF and RBBM GBIF halt
  on hw init" and move it before gmu wrapper-specific changes

- drop set_rate logic from a6xx_pm_suspend as the clock simply gets
  disabled and we don't have to worry about scaling problems as OPP
  and devfreq take care of that, validated with debugcc

- drop a level of indentation in _a6xx_check_idle() to hopefully
  improve readability

- check for !a610 instead of gmu_wrapper||a619_holi in sptprac cc
  toggling in a6xx_set_hwcg()

- pick up krzk's rb on bindings

All external dependencies have been merged since the last revision.

v6: 
https://lore.kernel.org/r/20230223-topic-gmuwrapper-v6-0-2034115bb...@linaro.org

v5 -> v6:
- Rebase on 8ead96783163 ("drm/msm/gpu: Move BO allocation out of hw_init")
  (Add .ucode_load to funcs_gmuwrapper)
- Drop A6[45]0 speedbin deps, merged into msm-next

Dependencies:
- 
https://lore.kernel.org/linux-arm-msm/20230330231517.2747024-1-konrad.dyb...@linaro.org/
 (to work properly)

v5: 
https://lore.kernel.org/linux-arm-msm/20230223-topic-gmuwrapper-v5-0-bf774b9a9...@linaro.org/

v4 -> v5:
- Add a newline before the new allOf:if: [3/15]
- Enforce 6 clocks on A619_holi/A610 [2/15]
- Pick up tags
- Improve error handling in a6xx_pm_resume [6/15]
- Add patch [1/15] (fix an existing issue) which can be picked
  separately and account for it in [6/15]
- Rebase atop Akhil's CX shutdown patches and incorporate analogous logic
- Fix a regression introduced in v3 that made the fw loader expect
  GMU fw on GMU wrapper GPUs

Dependencies:
- 
https://lore.kernel.org/linux-arm-msm/20230120172233.1905761-1-konrad.dyb...@linaro.org/
 (to apply)
- 
https://lore.kernel.org/linux-arm-msm/20230330231517.2747024-1-konrad.dyb...@linaro.org/
 (to work properly)

v4: 
https://lore.kernel.org/r/20230223-topic-gmuwrapper-v4-0-e987eb79d...@linaro.org

v3 -> v4:
- Drop the mistakenly-included and wrong A3xx-A5xx bindings changes
- Improve bindings commit messages to better explain what GMU Wrapper is
- Drop the A680 highest bank bit value adjustment patch
- Sort UBWC config variables in a reverse-Christmas-tree fashion [4/14]
- Don't alter any UBWC config values in [4/14]
  - Do so for a619_holi in [8/14]
- Rebase on next-20230314 (shouldn't matter at all)

v3: 
https://lore.kernel.org/r/20230223-topic-gmuwrapper-v3-0-5be55a336...@linaro.org

v2 -> v3:
New dependencies:
- 
https://lore.kernel.org/linux-arm-msm/20230223-topic-opp-v3-0-5f22163cd...@linaro.org/T/#t
- 
https://lore.kernel.org/linux-arm-msm/20230120172233.1905761-1-konrad.dyb...@linaro.org/

Sidenote: A speedbin rework is in progress, the of_machine_is_compatible
calls in A619_holi are ugly (but well, necessary..) but they'll be
replaced with socid matching in this or the next kernel cycle.

Due to the new way of identifying GMU wrapper GPUs, configuring 6350
to use wrapper would cause the wrong fuse values to be checked, but that
will be solved by the conversion + the ultimate goal is to use the GMU
whenever possible with the wrapper left for GMU-less Adrenos and early
bringup debugging of GMU-equipped ones.

- Ship dt-bindings in this series as we're referencing the compatible now

- "De-staticize" -> "remove static keyword" [3/15]

- Track down all the values in [4/15]

- Add many comments and explanations in [4/15]

- Fix possible return-before-mutex-unlock [5/15]

- Explain the GMU wrapper a bit more in the commit msg [5/15]

- Separate out 

Re: [PATCH v2 06/22] drm/msm/dpu: simplify peer LM handling

2023-06-15 Thread Marijn Suijten
On 2023-06-13 03:09:45, Dmitry Baryshkov wrote:
> For each LM there is at max 1 peer LM which can be driven by the same
> CTL, so there is no need to have a mask instead of just an ID of the peer
> LM.
> 
> Signed-off-by: Dmitry Baryshkov 

Nit: I think you can describe the patch contents in the title:

Replace LM peer mask with index

Instead of the vague (IMHO) "simplify handling".

> ---
>  .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c|  2 +-
>  .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h|  4 +--
>  drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c| 34 +++
>  3 files changed, 15 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c 
> b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
> index 0de507d4d7b7..30fb5b1f3966 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
> @@ -394,7 +394,7 @@ static const struct dpu_sspp_sub_blks qcm2290_dma_sblk_0 
> = _DMA_SBLK("8", 1);
>   .features = _fmask, \
>   .sblk = _sblk, \
>   .pingpong = _pp, \
> - .lm_pair_mask = (1 << _lmpair), \
> + .lm_pair = _lmpair, \
>   .dspp = _dspp \
>   }
>  
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h 
> b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
> index b860784ade72..b07caa4b867e 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h
> @@ -554,14 +554,14 @@ struct dpu_sspp_cfg {
>   * @features   bit mask identifying sub-blocks/features
>   * @sblk:  LM Sub-blocks information
>   * @pingpong:  ID of connected PingPong, PINGPONG_NONE if unsupported
> - * @lm_pair_mask:  Bitmask of LMs that can be controlled by same CTL
> + * @lm_pair:   ID of LM that can be controlled by same CTL

Of *the* LM
By *the* same CTL

But then the rest of these comments have this borked hard-to-read style
as well.

>   */
>  struct dpu_lm_cfg {
>   DPU_HW_BLK_INFO;
>   const struct dpu_lm_sub_blks *sblk;
>   u32 pingpong;
>   u32 dspp;
> - unsigned long lm_pair_mask;
> + unsigned long lm_pair;
>  };
>  
>  /**
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c 
> b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
> index 471842bbb950..e333f4eeafc1 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
> @@ -253,28 +253,19 @@ static bool _dpu_rm_needs_split_display(const struct 
> msm_display_topology *top)
>  }
>  
>  /**
> - * _dpu_rm_check_lm_peer - check if a mixer is a peer of the primary
> + * _dpu_rm_get_lm_peer - get the id of a mixer which is a peer of the primary

... mixer?

>   * @rm: dpu resource manager handle
>   * @primary_idx: index of primary mixer in rm->mixer_blks[]
> - * @peer_idx: index of other mixer in rm->mixer_blks[]
> - * Return: true if rm->mixer_blks[peer_idx] is a peer of
> - *  rm->mixer_blks[primary_idx]
>   */
> -static bool _dpu_rm_check_lm_peer(struct dpu_rm *rm, int primary_idx,
> - int peer_idx)
> +static int _dpu_rm_get_lm_peer(struct dpu_rm *rm, int primary_idx)
>  {
>   const struct dpu_lm_cfg *prim_lm_cfg;
> - const struct dpu_lm_cfg *peer_cfg;
>  
>   prim_lm_cfg = to_dpu_hw_mixer(rm->mixer_blks[primary_idx])->cap;
> - peer_cfg = to_dpu_hw_mixer(rm->mixer_blks[peer_idx])->cap;
>  
> - if (!test_bit(peer_cfg->id, &prim_lm_cfg->lm_pair_mask)) {
> - DPU_DEBUG("lm %d not peer of lm %d\n", peer_cfg->id,
> - peer_cfg->id);
> - return false;
> - }
> - return true;
> + if (prim_lm_cfg->lm_pair >= LM_0 && prim_lm_cfg->lm_pair < LM_MAX)
> + return prim_lm_cfg->lm_pair - LM_0;
> + return -EINVAL;
>  }
>  
>  /**
> @@ -351,7 +342,7 @@ static int _dpu_rm_reserve_lms(struct dpu_rm *rm,
>   int lm_idx[MAX_BLOCKS];
>   int pp_idx[MAX_BLOCKS];
>   int dspp_idx[MAX_BLOCKS] = {0};
> - int i, j, lm_count = 0;
> + int i, lm_count = 0;
>  
>   if (!reqs->topology.num_lm) {
>   DPU_ERROR("invalid number of lm: %d\n", reqs->topology.num_lm);
> @@ -376,16 +367,15 @@ static int _dpu_rm_reserve_lms(struct dpu_rm *rm,
>   ++lm_count;
>  
>   /* Valid primary mixer found, find matching peers */
> - for (j = i + 1; j < ARRAY_SIZE(rm->mixer_blks) &&
> - lm_count < reqs->topology.num_lm; j++) {
> - if (!rm->mixer_blks[j])
> + if (lm_count < reqs->topology.num_lm) {
> + int j = _dpu_rm_get_lm_peer(rm, i);
> +
> + /* ignore the peer if there is an error or if the peer 
> was already processed */

I would not call this an "error" (though it is -EINVAL): 0 (out of range
of LM_0 <= x < LM_MAX) is a valid value meaning "LM has no peer" and
maybe another error code is more fitting?

> + if (j < 0 || j < i)
> 
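
The review point above, that 0 means "LM has no peer" rather than an error,
can be sketched in standalone C. This is only an illustration: the demo_*
names are hypothetical stand-ins for the driver's types, and -ENOENT is an
assumed choice of return code, not something the thread settled on.

```c
#include <errno.h>

/* Hypothetical stand-in for enum dpu_lm; 0 is reserved for "no peer". */
enum demo_lm { DEMO_LM_NONE = 0, DEMO_LM_0 = 1, DEMO_LM_1, DEMO_LM_2, DEMO_LM_MAX };

/*
 * Map a catalog lm_pair value to an array index.
 * Returns -ENOENT for "no peer" (assumed code) and -EINVAL for any other
 * value outside DEMO_LM_0 <= x < DEMO_LM_MAX.
 */
static int demo_get_lm_peer(unsigned long lm_pair)
{
	if (lm_pair == DEMO_LM_NONE)
		return -ENOENT;
	if (lm_pair >= DEMO_LM_0 && lm_pair < DEMO_LM_MAX)
		return (int)(lm_pair - DEMO_LM_0);
	return -EINVAL;
}
```

A caller could then skip the peer on -ENOENT without logging anything and
treat -EINVAL as a broken catalog entry.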

[PATCH] Remove incorrect hard coded cache coherency setting

2023-06-15 Thread Zhanjun Dong
The preceding i915_gem_object_create_internal() call already sets the cache
coherency to the proper value before it returns. This hard-coded setting is
incorrect for platforms like MTL and thus needs to be removed.

Signed-off-by: Zhanjun Dong 
---
 drivers/gpu/drm/i915/gt/intel_timeline.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c 
b/drivers/gpu/drm/i915/gt/intel_timeline.c
index b9640212d659..693d18e14b00 100644
--- a/drivers/gpu/drm/i915/gt/intel_timeline.c
+++ b/drivers/gpu/drm/i915/gt/intel_timeline.c
@@ -26,8 +26,6 @@ static struct i915_vma *hwsp_alloc(struct intel_gt *gt)
if (IS_ERR(obj))
return ERR_CAST(obj);
 
-   i915_gem_object_set_cache_coherency(obj, I915_CACHE_LLC);
-
vma = i915_vma_instance(obj, &gt->ggtt->vm, NULL);
if (IS_ERR(vma))
i915_gem_object_put(obj);
-- 
2.34.1



Re: [PATCH v2 00/22]drm/msm/dpu: another catalog rework

2023-06-15 Thread Marijn Suijten
On 2023-06-15 14:31:22, Dmitry Baryshkov wrote:
> 
> On Tue, 13 Jun 2023 03:09:39 +0300, Dmitry Baryshkov wrote:
> > Having a macro with 10 arguments doesn't seem like a good idea. It makes
> > it inherently harder to compare the actual structure values. Also this
> > leads to adding macros covering varieties of the block.
> > 
> > As it was previously discussed, inline all foo_BLK macros in order to
> > ease performing changes to the catalog data.
> > 
> > [...]
> 
> Applied, thanks!
> 
> [01/22] drm/msm/dpu: fix sc7280 and sc7180 PINGPONG done interrupts
> https://gitlab.freedesktop.org/lumag/msm/-/commit/5efc0fec31d8
> [02/22] drm/msm/dpu: correct MERGE_3D length
> https://gitlab.freedesktop.org/lumag/msm/-/commit/f01fb5e211fd
> [03/22] drm/msm/dpu: remove unused INTF_NONE interfaces
> https://gitlab.freedesktop.org/lumag/msm/-/commit/17bf6f8efc50

The first two patches are fixes, the third one is not?

- Marijn

> 
> Best regards,
> -- 
> Dmitry Baryshkov 


Re: [PATCH v2 05/22] drm/msm/dpu: always use MSM_DP/DSI_CONTROLLER_n

2023-06-15 Thread Marijn Suijten
On 2023-06-13 03:09:44, Dmitry Baryshkov wrote:
> In several catalog entries we did not use existing MSM_DP_CONTROLLER_n
> constants. Fill them in. Also use freshly defined MSM_DSI_CONTROLLER_n
> for DSI interfaces.
> 
> Signed-off-by: Dmitry Baryshkov 
> ---
>  drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h  | 6 +++---
>  drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h   | 8 
>  drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h   | 8 
>  drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h  | 4 ++--
>  drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_0_sm8250.h   | 8 
>  drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_2_sc7180.h   | 2 +-
>  drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h   | 2 +-
>  drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_5_qcm2290.h  | 2 +-

6_4_sm6350 and 6_9_sm6375 are missing from this series.

For the rest:

Reviewed-by: Marijn Suijten 

>  drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_0_sm8350.h   | 4 ++--
>  drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h   | 2 +-
>  drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h | 4 ++--
>  drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_8_1_sm8450.h   | 4 ++--
>  drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_9_0_sm8550.h   | 4 ++--
>  13 files changed, 29 insertions(+), 29 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h 
> b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
> index 7d0d0e74c3b0..be0514bf27ec 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
> @@ -139,13 +139,13 @@ static const struct dpu_dspp_cfg msm8998_dspp[] = {
>  };
>  
>  static const struct dpu_intf_cfg msm8998_intf[] = {
> - INTF_BLK("intf_0", INTF_0, 0x6a000, 0x280, INTF_DP, 0, 21, 
> INTF_SDM845_MASK,
> + INTF_BLK("intf_0", INTF_0, 0x6a000, 0x280, INTF_DP, 
> MSM_DP_CONTROLLER_0, 21, INTF_SDM845_MASK,
>   DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 24),
>   DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 25)),
> - INTF_BLK("intf_1", INTF_1, 0x6a800, 0x280, INTF_DSI, 0, 21, 
> INTF_SDM845_MASK,
> + INTF_BLK("intf_1", INTF_1, 0x6a800, 0x280, INTF_DSI, 
> MSM_DSI_CONTROLLER_0, 21, INTF_SDM845_MASK,
>   DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 26),
>   DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 27)),
> - INTF_BLK("intf_2", INTF_2, 0x6b000, 0x280, INTF_DSI, 1, 21, 
> INTF_SDM845_MASK,
> + INTF_BLK("intf_2", INTF_2, 0x6b000, 0x280, INTF_DSI, 
> MSM_DSI_CONTROLLER_1, 21, INTF_SDM845_MASK,
>   DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 28),
>   DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 29)),
>   INTF_BLK("intf_3", INTF_3, 0x6b800, 0x280, INTF_HDMI, 0, 21, 
> INTF_SDM845_MASK,
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h 
> b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h
> index b6098141bb9b..b33472625fcb 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h
> @@ -143,16 +143,16 @@ static const struct dpu_dsc_cfg sdm845_dsc[] = {
>  };
>  
>  static const struct dpu_intf_cfg sdm845_intf[] = {
> - INTF_BLK("intf_0", INTF_0, 0x6a000, 0x280, INTF_DP, 0, 24, 
> INTF_SDM845_MASK,
> + INTF_BLK("intf_0", INTF_0, 0x6a000, 0x280, INTF_DP, 
> MSM_DP_CONTROLLER_0, 24, INTF_SDM845_MASK,
>   DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 24),
>   DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 25)),
> - INTF_BLK("intf_1", INTF_1, 0x6a800, 0x280, INTF_DSI, 0, 24, 
> INTF_SDM845_MASK,
> + INTF_BLK("intf_1", INTF_1, 0x6a800, 0x280, INTF_DSI, 
> MSM_DSI_CONTROLLER_0, 24, INTF_SDM845_MASK,
>   DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 26),
>   DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 27)),
> - INTF_BLK("intf_2", INTF_2, 0x6b000, 0x280, INTF_DSI, 1, 24, 
> INTF_SDM845_MASK,
> + INTF_BLK("intf_2", INTF_2, 0x6b000, 0x280, INTF_DSI, 
> MSM_DSI_CONTROLLER_1, 24, INTF_SDM845_MASK,
>   DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 28),
>   DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 29)),
> - INTF_BLK("intf_3", INTF_3, 0x6b800, 0x280, INTF_DP, 1, 24, 
> INTF_SDM845_MASK,
> + INTF_BLK("intf_3", INTF_3, 0x6b800, 0x280, INTF_DP, 
> MSM_DP_CONTROLLER_1, 24, INTF_SDM845_MASK,
>   DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 30),
>   DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 31)),
>  };
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h 
> b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h
> index b5f751354267..64ed10da1b73 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h
> @@ -162,18 +162,18 @@ static const struct dpu_dsc_cfg sm8150_dsc[] = {
>  };
>  
>  static const struct dpu_intf_cfg sm8150_intf[] = {
> - INTF_BLK("intf_0", INTF_0, 

Re: [PATCH v2 04/22] drm/msm: enumerate DSI interfaces

2023-06-15 Thread Marijn Suijten
On 2023-06-13 03:09:43, Dmitry Baryshkov wrote:
> Follow the DP example and define MSM_DSI_CONTROLLER_n enumeration.
> 
> Signed-off-by: Dmitry Baryshkov 

Nice, that'll be cleaner.

Reviewed-by: Marijn Suijten 

> ---
>  drivers/gpu/drm/msm/msm_drv.h | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
> index e13a8cbd61c9..ad4fad2bcdc8 100644
> --- a/drivers/gpu/drm/msm/msm_drv.h
> +++ b/drivers/gpu/drm/msm/msm_drv.h
> @@ -65,6 +65,12 @@ enum msm_dp_controller {
>   MSM_DP_CONTROLLER_COUNT,
>  };
>  
> +enum msm_dsi_controller {
> + MSM_DSI_CONTROLLER_0,
> + MSM_DSI_CONTROLLER_1,
> + MSM_DSI_CONTROLLER_COUNT,
> +};
> +
>  #define MSM_GPU_MAX_RINGS 4
>  #define MAX_H_TILES_PER_DISPLAY 2
>  
> @@ -117,7 +123,7 @@ struct msm_drm_private {
>   struct hdmi *hdmi;
>  
>   /* DSI is shared by mdp4 and mdp5 */
> - struct msm_dsi *dsi[2];
> + struct msm_dsi *dsi[MSM_DSI_CONTROLLER_COUNT];
>  
>   struct msm_dp *dp[MSM_DP_CONTROLLER_COUNT];
>  
> -- 
> 2.39.2
> 
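
The pattern in the patch above, a trailing _COUNT enumerator that sizes the
array, can be sketched outside the kernel as follows. The enum mirrors the
quoted patch; struct demo_dsi, struct demo_priv and demo_dsi_slots() are
hypothetical stand-ins used only for illustration.

```c
#include <stddef.h>

/* Mirrors the enum from the patch; the trailing _COUNT entry is the array size. */
enum msm_dsi_controller {
	MSM_DSI_CONTROLLER_0,
	MSM_DSI_CONTROLLER_1,
	MSM_DSI_CONTROLLER_COUNT,
};

/* Hypothetical stand-ins for struct msm_dsi and struct msm_drm_private. */
struct demo_dsi { int id; };

struct demo_priv {
	struct demo_dsi *dsi[MSM_DSI_CONTROLLER_COUNT];
};

/* Number of slots in dsi[], derived from the enum rather than a literal 2. */
static size_t demo_dsi_slots(void)
{
	return sizeof(((struct demo_priv *)0)->dsi) / sizeof(struct demo_dsi *);
}
```

Adding a third controller to the enum would grow the array without touching
the struct definition.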


Re: [PATCH v2 03/22] drm/msm/dpu: remove unused INTF_NONE interfaces

2023-06-15 Thread Marijn Suijten
On 2023-06-13 03:09:42, Dmitry Baryshkov wrote:
> sm6115 and qcm2290 do not have INTF_0. Drop corresponding interface
> definitions.

As Abhinav said, add sm6375.

If it wasn't for sc8280xp using INTF_NONE for fake MST, we could have
dropped INTF_NONE and the special-cases in dpu_hw_interrupts.c and
dpu_hw_intf.c entirely!  Is that your plan?

> 
> Signed-off-by: Dmitry Baryshkov 

Reviewed-by: Marijn Suijten 

> ---
>  drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h  | 1 -
>  drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_5_qcm2290.h | 1 -
>  drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_9_sm6375.h  | 1 -
>  3 files changed, 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h 
> b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h
> index ba9de008519b..031fc8dae3c6 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_3_sm6115.h
> @@ -66,7 +66,6 @@ static const struct dpu_pingpong_cfg sm6115_pp[] = {
>  };
>  
>  static const struct dpu_intf_cfg sm6115_intf[] = {
> - INTF_BLK("intf_0", INTF_0, 0x0, 0x280, INTF_NONE, 0, 0, 0, 0, 0),
>   INTF_BLK_DSI_TE("intf_1", INTF_1, 0x6a800, 0x2c0, INTF_DSI, 0, 24, 
> INTF_SC7180_MASK,
>   DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 26),
>   DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 27),
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_5_qcm2290.h 
> b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_5_qcm2290.h
> index 92ac348eea6b..f2808098af39 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_5_qcm2290.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_5_qcm2290.h
> @@ -63,7 +63,6 @@ static const struct dpu_pingpong_cfg qcm2290_pp[] = {
>  };
>  
>  static const struct dpu_intf_cfg qcm2290_intf[] = {
> - INTF_BLK("intf_0", INTF_0, 0x0, 0x280, INTF_NONE, 0, 0, 0, 0, 0),
>   INTF_BLK_DSI_TE("intf_1", INTF_1, 0x6a800, 0x2c0, INTF_DSI, 0, 24, 
> INTF_SC7180_MASK,
>   DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 26),
>   DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 27),
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_9_sm6375.h 
> b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_9_sm6375.h
> index d7aae45e3e66..241fa6746674 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_9_sm6375.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_9_sm6375.h
> @@ -71,7 +71,6 @@ static const struct dpu_dsc_cfg sm6375_dsc[] = {
>  };
>  
>  static const struct dpu_intf_cfg sm6375_intf[] = {
> - INTF_BLK("intf_0", INTF_0, 0x0, 0x280, INTF_NONE, 0, 0, 0, 0, 0),
>   INTF_BLK_DSI_TE("intf_1", INTF_1, 0x6a800, 0x2c0, INTF_DSI, 0, 24, 
> INTF_SC7180_MASK,
>   DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 26),
>   DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 27),
> -- 
> 2.39.2
> 


Re: [PATCH v2 02/22] drm/msm/dpu: correct MERGE_3D length

2023-06-15 Thread Marijn Suijten
On 2023-06-13 03:09:41, Dmitry Baryshkov wrote:
> Each MERGE_3D block has just two registers. Correct the block length
> accordingly.
> 
> Fixes: 4369c93cf36b ("drm/msm/dpu: initial support for merge3D hardware 
> block")
> Signed-off-by: Dmitry Baryshkov 

Indeed, and that patch wasn't even introducing the register writes -
this only happened in commit 9ffd0e8569937 ("drm/msm/dpu: setup merge
modes in merge_3d block").

Reviewed-by: Marijn Suijten 

> ---
>  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c 
> b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
> index 36ba3f58dcdf..0de507d4d7b7 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c
> @@ -508,7 +508,7 @@ static const struct dpu_pingpong_sub_blks sc7280_pp_sblk 
> = {
>  #define MERGE_3D_BLK(_name, _id, _base) \
>   {\
>   .name = _name, .id = _id, \
> - .base = _base, .len = 0x100, \
> + .base = _base, .len = 0x8, \
>   .features = MERGE_3D_SM8150_MASK, \
>   .sblk = NULL \
>   }
> -- 
> 2.39.2
> 
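
The new length follows from simple arithmetic, sketched here in standalone
C. The 32-bit register width is an assumption of this sketch, and the DEMO_*
and demo_* names are hypothetical.

```c
#include <stdint.h>

/* Assumption: each register in the block is 32 bits wide. */
#define DEMO_REG_SIZE        sizeof(uint32_t)
/* From the commit message: each MERGE_3D block has just two registers. */
#define DEMO_MERGE_3D_NREGS  2

/* Byte length needed to cover the block's registers. */
static unsigned int demo_merge_3d_len(void)
{
	return (unsigned int)(DEMO_MERGE_3D_NREGS * DEMO_REG_SIZE);
}
```

Under that assumption the result is 0x8, which matches the value the patch
puts into .len.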


Re: [PATCH v2 01/22] drm/msm/dpu: fix sc7280 and sc7180 PINGPONG done interrupts

2023-06-15 Thread Marijn Suijten
On 2023-06-13 03:09:40, Dmitry Baryshkov wrote:
> During IRQ conversion we have lost the PP_DONE interrupts for sc7280
> platform. This was left unnoticed, because this interrupt is only used
> for CMD outputs and probably no sc7[12]80 systems use DSI CMD panels.
> 
> Fixes: 667e9985ee24 ("drm/msm/dpu: replace IRQ lookup with the data in hw 
> catalog")
> Signed-off-by: Dmitry Baryshkov 

Agreed.  I never really understood why these were missing, as if there
was no cmd-mode support.  The code prior to the Fixes: commit was indeed
returning interrupts without looking at hardware support at all.

Reviewed-by: Marijn Suijten 

> ---
>  .../drm/msm/disp/dpu1/catalog/dpu_6_2_sc7180.h   |  8 ++--
>  .../drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h   | 16 
>  2 files changed, 18 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_2_sc7180.h 
> b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_2_sc7180.h
> index 0b05da2592c0..67566b07195a 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_2_sc7180.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_6_2_sc7180.h
> @@ -80,8 +80,12 @@ static const struct dpu_dspp_cfg sc7180_dspp[] = {
>  };
>  
>  static const struct dpu_pingpong_cfg sc7180_pp[] = {
> - PP_BLK("pingpong_0", PINGPONG_0, 0x7, PINGPONG_SM8150_MASK, 0, 
> sdm845_pp_sblk, -1, -1),
> - PP_BLK("pingpong_1", PINGPONG_1, 0x70800, PINGPONG_SM8150_MASK, 0, 
> sdm845_pp_sblk, -1, -1),
> + PP_BLK("pingpong_0", PINGPONG_0, 0x7, PINGPONG_SM8150_MASK, 0, 
> sdm845_pp_sblk,
> + DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 8),
> + -1),
> + PP_BLK("pingpong_1", PINGPONG_1, 0x70800, PINGPONG_SM8150_MASK, 0, 
> sdm845_pp_sblk,
> + DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 9),
> + -1),
>  };
>  
>  static const struct dpu_intf_cfg sc7180_intf[] = {
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h 
> b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h
> index 9c5a3fe9cfde..6ea1cb551348 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_7_2_sc7280.h
> @@ -88,10 +88,18 @@ static const struct dpu_dspp_cfg sc7280_dspp[] = {
>  };
>  
>  static const struct dpu_pingpong_cfg sc7280_pp[] = {
> - PP_BLK_DITHER("pingpong_0", PINGPONG_0, 0x69000, 0, sc7280_pp_sblk, -1, 
> -1),
> - PP_BLK_DITHER("pingpong_1", PINGPONG_1, 0x6a000, 0, sc7280_pp_sblk, -1, 
> -1),
> - PP_BLK_DITHER("pingpong_2", PINGPONG_2, 0x6b000, 0, sc7280_pp_sblk, -1, 
> -1),
> - PP_BLK_DITHER("pingpong_3", PINGPONG_3, 0x6c000, 0, sc7280_pp_sblk, -1, 
> -1),
> + PP_BLK_DITHER("pingpong_0", PINGPONG_0, 0x69000, 0, sc7280_pp_sblk,
> + DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 8),
> + -1),
> + PP_BLK_DITHER("pingpong_1", PINGPONG_1, 0x6a000, 0, sc7280_pp_sblk,
> + DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 9),
> + -1),
> + PP_BLK_DITHER("pingpong_2", PINGPONG_2, 0x6b000, 0, sc7280_pp_sblk,
> + DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 10),
> + -1),
> + PP_BLK_DITHER("pingpong_3", PINGPONG_3, 0x6c000, 0, sc7280_pp_sblk,
> + DPU_IRQ_IDX(MDP_SSPP_TOP0_INTR, 11),
> + -1),
>  };
>  
>  static const struct dpu_wb_cfg sc7280_wb[] = {
> -- 
> 2.39.2
> 


Re: [PATCH 02/11] drm/i915/mst: Remove broken MST DSC support

2023-06-15 Thread Dave Airlie
On Wed, 3 May 2023 at 22:23, Lisovskiy, Stanislav
 wrote:
>
> On Wed, May 03, 2023 at 02:07:04PM +0300, Ville Syrjälä wrote:
> > On Wed, May 03, 2023 at 10:36:42AM +0300, Lisovskiy, Stanislav wrote:
> > > On Tue, May 02, 2023 at 05:38:57PM +0300, Ville Syrjala wrote:
> > > > From: Ville Syrjälä 
> > > >
> > > > The MST DSC code has a myriad of issues:
> > > > - Platform checks are wrong (MST+DSC is TGL+ only IIRC)
> > > > - Return values of .mode_valid_ctx() are wrong
> > > > - .mode_valid_ctx() assumes bigjoiner might be used, but the rest
> > > >   of the code doesn't agree
> > > > - compressed bpp calculations don't make sense
> > > > - FEC handling needs to consider the entire link as opposed to just
> > > >   the single stream. Currently FEC would only get enabled if the
> > > >   first enabled stream is compressed. Also I'm not seeing anything
> > > >   that would account for the FEC overhead in any bandwidth calculations
> > > > - PPS SDP is only handled for the first stream via the dig_port
> > > >   hooks, other streams will not be transmitting any PPS SDPs
> > > > - PPS SDP readout is missing (also missing for SST!)
> > > > - VDSC readout is missing (also missing for SST!)
> > > >
> > > > The FEC issue is really the big one since we have no way currently
> > > > to apply such link wide configuration constraints. Changing that is
> > > > going to require a much bigger rework of the higher level modeset
> > > > .compute_config() logic. We will also need such a rework to properly
> > > > distribute the available bandwidth across all the streams on the
> > > > same link (which is a must to eg. enable deep color).
> > >
> > > Also all the things you mentioned are subject for discussion, for example
> > > I see that FEC overhead is actually accounted for bpp calculation for 
> > > instance.
> >
> > AFAICS FEC is only accounted for in the data M/N calculations,
> > assuming that particular stream happened to be compressed. I'm
> > not sure if that actually matters since at least the link M/N
> > are not even used by the MST sink. I suppose the data M/N might
> > still be used for something though. For any uncompressed stream
> > on the same link the data M/N values will be calculated
> > incorrectly without FEC.
> >
> > And as mentioned, the FEC bandwidth overhead doesn't seem to
> > be accounted anywhere so no guarantee that we won't try to
> > oversubscribe the link.
> >
> > And FEC will only be enabled if the first stream to be enabled
> > is compressed, otherwise we will enable the link without FEC
> > and still try to cram other compressed streams through it
> > (albeit without the PPS SDP so who knows what will happen)
> > and that is illegal.
> >
> > > We usually improve things by gradually fixing, because if we act same way 
> > > towards
> > > all wrong code in the driver, we could end up removing the whole i915.
> >
> > We usually don't merge code that has this many obvious and/or
> > fundamental issues.
>
> Well, this is arguable and subjective judgement. Fact is that, so far we had 
> more MST hubs
> working with that code than without. Also no regressions or anything like 
> that.
> Moreover we usually merge code after code review, in particular those patches
> did spend lots of time in review, where you could comment also.
>
> Regarding merging code with fundamental issues, just recently you had 
> admitted yourself
> that bigjoiner issue for instance, we had recently, was partly caused by your 
> code, because
> we don't anymore copy the pll state to slave crtc.
> I would say that words like "obvious" and "fundamental"
> issues can be applied to many things, however I thought that we always fix 
> things in constructive,
> but not destructive/negative way.
> Should I call also all code completely broken and remove it, once we discover 
> some flaws
> there? Oh, we had many regressions, where I could say the same.
>
> And once again I'm completely okay, if you did introduce better functionality 
> instead
> AND I know you have some valid points there, but now we are just removing 
> everything completely,
> without providing anything better.
>
> But okay, I've mentioned what I think about this and from my side this is a nak.
> And once the guys to whom those patches helped will pop up from gitlab,
> asking why their MST hubs stopped working - I will just refer them here.
>
> >
> > Now, most of the issues I listed above are probably fixable
> > in a way that could be backported to stable kernels, but
> > unfortunately the FEC issue is not one of those. That one
> > will likely need massive amounts of work all over the driver
> > modeset code, making a backport impossible.
> >
> > > So from my side I would nack it, at least until you have a code which 
> > > handles
> > > all of this better - I have no doubt you probably have some ideas in your 
> > > mind, so lets be constructive at least and propose something better first.
> > > This code doesn't cause any regressions, but still provides 

Re: [PATCH 1/2] drm/panel: boe-tv101wum-nl6: Drop macros and open code sequences

2023-06-15 Thread kernel test robot
Hi Linus,

kernel test robot noticed the following build warnings:

[auto build test WARNING on ac9a78681b921877518763ba0e89202254349d1b]

url:
https://github.com/intel-lab-lkp/linux/commits/Linus-Walleij/drm-panel-boe-tv101wum-nl6-Drop-macros-and-open-code-sequences/20230616-042312
base:   ac9a78681b921877518763ba0e89202254349d1b
patch link:
https://lore.kernel.org/r/20230615-fix-boe-tv101wum-nl6-v1-1-8ac378405fb7%40linaro.org
patch subject: [PATCH 1/2] drm/panel: boe-tv101wum-nl6: Drop macros and open 
code sequences
config: alpha-allyesconfig 
(https://download.01.org/0day-ci/archive/20230616/202306160538.b7hkwlko-...@intel.com/config)
compiler: alpha-linux-gcc (GCC) 12.3.0
reproduce (this is a W=1 build):
mkdir -p ~/bin
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
git checkout ac9a78681b921877518763ba0e89202254349d1b
b4 shazam 
https://lore.kernel.org/r/20230615-fix-boe-tv101wum-nl6-v1-1-8ac378405...@linaro.org
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.3.0 ~/bin/make.cross 
W=1 O=build_dir ARCH=alpha olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.3.0 ~/bin/make.cross 
W=1 O=build_dir ARCH=alpha SHELL=/bin/bash drivers/gpu/drm/panel/

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot 
| Closes: 
https://lore.kernel.org/oe-kbuild-all/202306160538.b7hkwlko-...@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c:776:5: warning: no previous 
>> prototype for 'boe_init' [-Wmissing-prototypes]
 776 | int boe_init(struct mipi_dsi_device *dsi)
 | ^~~~
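
A common way to address this class of warning is to give the function
internal linkage. The sketch below assumes the function is only used within
its own file, which this report does not confirm; struct demo_dsi_device and
demo_boe_init() are hypothetical stand-ins, not the driver's code.

```c
/* Hypothetical stand-in for struct mipi_dsi_device. */
struct demo_dsi_device { int lanes; };

/*
 * A non-static function defined without a prior declaration triggers
 * -Wmissing-prototypes. Marking a file-local function static gives it
 * internal linkage, so no prototype is expected.
 */
static int demo_boe_init(struct demo_dsi_device *dsi)
{
	return dsi ? 0 : -1;
}
```

If the function were needed from another file, the alternative would be a
prototype in a shared header instead of the static keyword.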


vim +/boe_init +776 drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c

   775  
 > 776  int boe_init(struct mipi_dsi_device *dsi)
   777  {
   778  msleep(24);
   779  mipi_dsi_dcs_write_seq(dsi, 0xB0, 0x05);
   780  mipi_dsi_dcs_write_seq(dsi, 0xB1, 0xE5);
   781  mipi_dsi_dcs_write_seq(dsi, 0xB3, 0x52);
   782  mipi_dsi_dcs_write_seq(dsi, 0xB0, 0x00);
   783  mipi_dsi_dcs_write_seq(dsi, 0xB3, 0x88);
   784  mipi_dsi_dcs_write_seq(dsi, 0xB0, 0x04);
   785  mipi_dsi_dcs_write_seq(dsi, 0xB8, 0x00);
   786  mipi_dsi_dcs_write_seq(dsi, 0xB0, 0x00);
   787  mipi_dsi_dcs_write_seq(dsi, 0xB6, 0x03);
   788  mipi_dsi_dcs_write_seq(dsi, 0xBA, 0x8B);
   789  mipi_dsi_dcs_write_seq(dsi, 0xBF, 0x1A);
   790  mipi_dsi_dcs_write_seq(dsi, 0xC0, 0x0F);
   791  mipi_dsi_dcs_write_seq(dsi, 0xC2, 0x0C);
   792  mipi_dsi_dcs_write_seq(dsi, 0xC3, 0x02);
   793  mipi_dsi_dcs_write_seq(dsi, 0xC4, 0x0C);
   794  mipi_dsi_dcs_write_seq(dsi, 0xC5, 0x02);
   795  mipi_dsi_dcs_write_seq(dsi, 0xB0, 0x01);
   796  mipi_dsi_dcs_write_seq(dsi, 0xE0, 0x26);
   797  mipi_dsi_dcs_write_seq(dsi, 0xE1, 0x26);
   798  mipi_dsi_dcs_write_seq(dsi, 0xDC, 0x00);
   799  mipi_dsi_dcs_write_seq(dsi, 0xDD, 0x00);
   800  mipi_dsi_dcs_write_seq(dsi, 0xCC, 0x26);
   801  mipi_dsi_dcs_write_seq(dsi, 0xCD, 0x26);
   802  mipi_dsi_dcs_write_seq(dsi, 0xC8, 0x00);
   803  mipi_dsi_dcs_write_seq(dsi, 0xC9, 0x00);
   804  mipi_dsi_dcs_write_seq(dsi, 0xD2, 0x03);
   805  mipi_dsi_dcs_write_seq(dsi, 0xD3, 0x03);
   806  mipi_dsi_dcs_write_seq(dsi, 0xE6, 0x04);
   807  mipi_dsi_dcs_write_seq(dsi, 0xE7, 0x04);
   808  mipi_dsi_dcs_write_seq(dsi, 0xC4, 0x09);
   809  mipi_dsi_dcs_write_seq(dsi, 0xC5, 0x09);
   810  mipi_dsi_dcs_write_seq(dsi, 0xD8, 0x0A);
   811  mipi_dsi_dcs_write_seq(dsi, 0xD9, 0x0A);
   812  mipi_dsi_dcs_write_seq(dsi, 0xC2, 0x0B);
   813  mipi_dsi_dcs_write_seq(dsi, 0xC3, 0x0B);
   814  mipi_dsi_dcs_write_seq(dsi, 0xD6, 0x0C);
   815  mipi_dsi_dcs_write_seq(dsi, 0xD7, 0x0C);
   816  mipi_dsi_dcs_write_seq(dsi, 0xC0, 0x05);
   817  mipi_dsi_dcs_write_seq(dsi, 0xC1, 0x05);
   818  mipi_dsi_dcs_write_seq(dsi, 0xD4, 0x06);
   819  mipi_dsi_dcs_write_seq(dsi, 0xD5, 0x06);
   820  mipi_dsi_dcs_write_seq(dsi, 0xCA, 0x07);
   821  mipi_dsi_dcs_write_seq(dsi, 0xCB, 0x07);
   822  mipi_dsi_dcs_write_seq(dsi, 0xDE, 0x08);
   823  mipi_dsi_dcs_write_seq(dsi, 0xDF, 0x08);
   824  mipi_dsi_dcs_write_seq(dsi, 0xB0, 0x02);
   825  mipi_dsi_dcs_write_seq(dsi, 0xC0, 0x00);
   826  mipi_dsi_dcs_write_seq(dsi, 0xC1, 0x0D);
   827  mipi_dsi_dcs_write_seq(dsi, 0xC2, 0x17);
   828  mipi_dsi_dcs_write_seq(dsi, 0xC3, 0x26);
   829  mipi_dsi_dcs_write

Re: [PATCH v2 00/22]drm/msm/dpu: another catalog rework

2023-06-15 Thread Marijn Suijten
On 2023-06-13 03:09:39, Dmitry Baryshkov wrote:
> Having a macro with 10 arguments doesn't seem like a good idea. It makes
> it inherently harder to compare the actual structure values. Also this
> leads to adding macros covering varieties of the block.
> 
> As it was previously discussed, inline all foo_BLK macros in order to
> ease performing changes to the catalog data.
> 
> Major part of the conversion was performed using vim script found at
> [1]. Then some manual cleanups were applied, like dropping fields set to
> 0.
> 
> Dependencies: [2].
> 
> Changes since v1:
>  - Rebased on top of msm-next
>  - Dropped dependency on interrupt rework
> 
> [1] https://pastebin.ubuntu.com/p/26cdW5VpYB/
> [2] https://patchwork.freedesktop.org/patch/542142/?series=119220&rev=1
> 
> Dmitry Baryshkov (22):
>   drm/msm/dpu: fix sc7280 and sc7180 PINGPONG done interrupts
>   drm/msm/dpu: correct MERGE_3D length
>   drm/msm/dpu: remove unused INTF_NONE interfaces
>   drm/msm: enumerate DSI interfaces
>   drm/msm/dpu: always use MSM_DP/DSI_CONTROLLER_n
>   drm/msm/dpu: simplify peer LM handling
>   drm/msm/dpu: drop dpu_mdss_cfg::mdp_count field
>   drm/msm/dpu: drop enum dpu_mdp and MDP_TOP value
>   drm/msm/dpu: expand .clk_ctrls definitions
>   drm/msm/dpu: drop zero features from dpu_mdp_cfg data
>   drm/msm/dpu: drop zero features from dpu_ctl_cfg data
>   drm/msm/dpu: correct indentation for CTL definitions
>   drm/msm/dpu: inline SSPP_BLK macros
>   drm/msm/dpu: inline DSPP_BLK macros
>   drm/msm/dpu: inline LM_BLK macros
>   drm/msm/dpu: inline DSC_BLK and DSC_BLK_1_2 macros
>   drm/msm/dpu: inline MERGE_3D_BLK macros
>   drm/msm/dpu: inline various PP_BLK_* macros
>   drm/msm/dpu: inline WB_BLK macros
>   drm/msm/dpu: inline INTF_BLK and INTF_BLK_DSI_TE macros
>   drm/msm/dpu: drop empty features mask MERGE_3D_SM8150_MASK
>   drm/msm/dpu: drop empty features mask INTF_SDM845_MASK

I am not sure how to process this series; it seems like something went
wrong during sending.  Besides patch 16 being duplicated, albeit with
different content, patches 20, 21 and 22 are duplicated as well and show an
interesting pattern where 21 and 22 are sent in reply to 20?

6587 N T 2023-06-13 02:09:54 Dmitry Baryshko (   0) ├─>[PATCH v2 15/22] drm/msm/dpu: inline LM_BLK macros
6588 N T 2023-06-13 02:09:55 Dmitry Baryshko (   0) ├─>[PATCH v2 16/22] drm/msm/dpu: inline DSC_BLK and DSC_BLK_1_2 macros
6589 N T 2023-06-13 02:09:57 Dmitry Baryshko (   0) ├─>[PATCH v2 17/22] drm/msm/dpu: inline MERGE_3D_BLK macros
6590 N T 2023-06-13 02:12:37 Dmitry Baryshko (   0) ├─>[PATCH v2 20/22] drm/msm/dpu: inline INTF_BLK and INTF_BLK_DSI_TE macros
6591 N T 2023-06-13 02:12:38 Dmitry Baryshko (   0) │ ├─>[PATCH v2 21/22] drm/msm/dpu: drop empty features mask MERGE_3D_SM8150_MASK
6592 N T 2023-06-13 02:12:39 Dmitry Baryshko (   0) │ └─>[PATCH v2 22/22] drm/msm/dpu: drop empty features mask INTF_SDM845_MASK

^ Here are 21 and 22 in reply to 20 ^

6593 N T 2023-06-13 02:14:49 Dmitry Baryshko (   0) ├─>[PATCH v2 18/22] drm/msm/dpu: inline various PP_BLK_* macros
6594 N T 2023-06-13 02:14:50 Dmitry Baryshko (   0) ├─>[PATCH v2 19/22] drm/msm/dpu: inline WB_BLK macros
6595 N T 2023-06-13 02:14:51 Dmitry Baryshko (   0) ├─>[PATCH v2 20/22] drm/msm/dpu: inline INTF_BLK and INTF_BLK_DSI_TE macros
6596 N T 2023-06-13 02:14:52 Dmitry Baryshko (   0) ├─>[PATCH v2 21/22] drm/msm/dpu: drop empty features mask MERGE_3D_SM8150_MASK
6597 N T 2023-06-13 02:14:53 Dmitry Baryshko (   0) ├─>[PATCH v2 22/22] drm/msm/dpu: drop empty features mask INTF_SDM845_MASK

And here they are again.

See the same on Lore: 
https://lore.kernel.org/linux-arm-msm/20230613001004.3426676-1-dmitry.barysh...@linaro.org/

- Marijn

> 
>  .../msm/disp/dpu1/catalog/dpu_3_0_msm8998.h   | 329 
>  .../msm/disp/dpu1/catalog/dpu_4_0_sdm845.h| 348 +
>  .../msm/disp/dpu1/catalog/dpu_5_0_sm8150.h| 411 ++-
>  .../msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h   | 448 +++-
>  .../msm/disp/dpu1/catalog/dpu_6_0_sm8250.h| 430 +++-
>  .../msm/disp/dpu1/catalog/dpu_6_2_sc7180.h| 180 +--
>  .../msm/disp/dpu1/catalog/dpu_6_3_sm6115.h|  89 +++-
>  .../msm/disp/dpu1/catalog/dpu_6_4_sm6350.h| 188 ---
>  .../msm/disp/dpu1/catalog/dpu_6_5_qcm2290.h   |  89 +++-
>  .../msm/disp/dpu1/catalog/dpu_6_9_sm6375.h|  96 ++--
>  .../msm/disp/dpu1/catalog/dpu_7_0_sm8350.h| 418 ++-
>  .../msm/disp/dpu1/catalog/dpu_7_2_sc7280.h| 236 ++---
>  .../msm/disp/dpu1/catalog/dpu_8_0_sc8280xp.h  | 484 +-
>  .../msm/disp/dpu1/catalog/dpu_8_1_sm8450.h| 445 +++-
>  .../msm/disp/dpu1/catalog/dpu_9_0_sm8550.h| 467 -
>  .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c| 130 -
>  .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.h|   5 +-
>  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h   |   5 -
>  drivers/gpu/drm/msm/disp/dpu1/dpu_hw_top.c|  34 +-
>  

Re: [PATCH v2 16/21] drm/msm/dpu: inline DSC_BLK macros

2023-06-15 Thread Dmitry Baryshkov

On 16/06/2023 01:05, Marijn Suijten wrote:

On 2023-06-13 03:09:56, Dmitry Baryshkov wrote:

To simplify making changes to the hardware block definitions, expand
corresponding macros. This way all the changes are more obvious
and visible in the source files.
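
For readers skimming the thread, the conversion described above can be sketched standalone as follows. This is only an illustration: the struct layout, the EXAMPLE_DSC_BLK macro, the enum and the register values are simplified stand-ins invented for this sketch, not the driver's definitions.

```c
#include <assert.h>
#include <string.h>

/* Simplified stand-ins for the catalog types; not the real DPU definitions. */
enum example_dsc_id { EX_DSC_0 = 1, EX_DSC_1 };

struct example_dsc_cfg {
	const char *name;
	enum example_dsc_id id;
	unsigned int base;
	unsigned int len;
	unsigned long features;
};

/* Macro form: the reader has to remember what each positional argument means. */
#define EXAMPLE_DSC_BLK(_name, _id, _base, _features) \
	{ .name = _name, .id = _id, .base = _base, .len = 0x1800, .features = _features }

static const struct example_dsc_cfg with_macro[] = {
	EXAMPLE_DSC_BLK("dsc_0", EX_DSC_0, 0x80000, 0),
	EXAMPLE_DSC_BLK("dsc_1", EX_DSC_1, 0x80400, 0),
};

/* Inlined form: every field is named where it is set; zero fields are omitted. */
static const struct example_dsc_cfg inlined[] = {
	{
		.name = "dsc_0", .id = EX_DSC_0,
		.base = 0x80000, .len = 0x1800,
	}, {
		.name = "dsc_1", .id = EX_DSC_1,
		.base = 0x80400, .len = 0x1800,
	},
};

/* Returns 1 when both entries describe the same block. */
static int example_dsc_cfg_equal(const struct example_dsc_cfg *a,
				 const struct example_dsc_cfg *b)
{
	return strcmp(a->name, b->name) == 0 && a->id == b->id &&
	       a->base == b->base && a->len == b->len &&
	       a->features == b->features;
}
```

Both tables compare equal entry by entry, since fields left out of a designated initializer are zero-initialized in C.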

Signed-off-by: Dmitry Baryshkov 
---
  .../msm/disp/dpu1/catalog/dpu_3_0_msm8998.h   | 11 +--
  .../msm/disp/dpu1/catalog/dpu_4_0_sdm845.h| 17 +++---
  .../msm/disp/dpu1/catalog/dpu_5_0_sm8150.h| 21 ++---
  .../msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h   | 31 +++
  .../msm/disp/dpu1/catalog/dpu_6_0_sm8250.h| 21 ++---
  .../msm/disp/dpu1/catalog/dpu_6_4_sm6350.h|  6 +++-
  .../msm/disp/dpu1/catalog/dpu_6_9_sm6375.h|  6 +++-
  .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c| 10 --
  8 files changed, 91 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
index a07c68744b29..7c3da4033c46 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
@@ -200,8 +200,15 @@ static const struct dpu_pingpong_cfg msm8998_pp[] = {
  };
  
  static const struct dpu_dsc_cfg msm8998_dsc[] = {

-   DSC_BLK("dsc_0", DSC_0, 0x80000, 0),
-   DSC_BLK("dsc_1", DSC_1, 0x80400, 0),
+   {
+   .name = "dsc_0", .id = DSC_0,
+   .base = 0x80000, .len = 0x1800,
+   .features = 0,
+   }, {
+   .name = "dsc_1", .id = DSC_1,
+   .base = 0x80400, .len = 0x1800,
+   .features = 0,
+   },
  };
  
  static const struct dpu_dspp_cfg msm8998_dspp[] = {

diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h
index 786263ed1ef2..ca3bb6a1a93a 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h
@@ -224,10 +224,19 @@ static const struct dpu_pingpong_cfg sdm845_pp[] = {
  };
  
  static const struct dpu_dsc_cfg sdm845_dsc[] = {

-   DSC_BLK("dsc_0", DSC_0, 0x80000, 0),
-   DSC_BLK("dsc_1", DSC_1, 0x80400, 0),
-   DSC_BLK("dsc_2", DSC_2, 0x80800, 0),
-   DSC_BLK("dsc_3", DSC_3, 0x80c00, 0),
+   {
+   .name = "dsc_0", .id = DSC_0,
+   .base = 0x80000, .len = 0x1800,
+   }, {
+   .name = "dsc_1", .id = DSC_1,
+   .base = 0x80400, .len = 0x1800,
+   }, {
+   .name = "dsc_2", .id = DSC_2,
+   .base = 0x80800, .len = 0x1800,
+   }, {
+   .name = "dsc_3", .id = DSC_3,
+   .base = 0x80c00, .len = 0x1800,
+   },
  };
  
  static const struct dpu_intf_cfg sdm845_intf[] = {

diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h
index 6b9bfeac6e0a..5b068521de13 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h
@@ -245,10 +245,23 @@ static const struct dpu_merge_3d_cfg sm8150_merge_3d[] = {
  };
  
  static const struct dpu_dsc_cfg sm8150_dsc[] = {

-   DSC_BLK("dsc_0", DSC_0, 0x80000, BIT(DPU_DSC_OUTPUT_CTRL)),
-   DSC_BLK("dsc_1", DSC_1, 0x80400, BIT(DPU_DSC_OUTPUT_CTRL)),
-   DSC_BLK("dsc_2", DSC_2, 0x80800, BIT(DPU_DSC_OUTPUT_CTRL)),
-   DSC_BLK("dsc_3", DSC_3, 0x80c00, BIT(DPU_DSC_OUTPUT_CTRL)),
+   {
+   .name = "dsc_0", .id = DSC_0,
+   .base = 0x80000, .len = 0x1800,
+   .features = BIT(DPU_DSC_OUTPUT_CTRL),
+   }, {
+   .name = "dsc_1", .id = DSC_1,
+   .base = 0x80400, .len = 0x1800,
+   .features = BIT(DPU_DSC_OUTPUT_CTRL),
+   }, {
+   .name = "dsc_2", .id = DSC_2,
+   .base = 0x80800, .len = 0x1800,
+   .features = BIT(DPU_DSC_OUTPUT_CTRL),
+   }, {
+   .name = "dsc_3", .id = DSC_3,
+   .base = 0x80c00, .len = 0x1800,
+   .features = BIT(DPU_DSC_OUTPUT_CTRL),
+   },
  };
  
  static const struct dpu_intf_cfg sm8150_intf[] = {

diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h
index 414f0db3306c..ba5420f334ec 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h
@@ -244,12 +244,31 @@ static const struct dpu_merge_3d_cfg sc8180x_merge_3d[] = {
  };
  
  static const struct dpu_dsc_cfg sc8180x_dsc[] = {

-   DSC_BLK("dsc_0", DSC_0, 0x80000, BIT(DPU_DSC_OUTPUT_CTRL)),
-   DSC_BLK("dsc_1", DSC_1, 0x80400, BIT(DPU_DSC_OUTPUT_CTRL)),
-   DSC_BLK("dsc_2", DSC_2, 0x80800, BIT(DPU_DSC_OUTPUT_CTRL)),
-   DSC_BLK("dsc_3", DSC_3, 0x80c00, BIT(DPU_DSC_OUTPUT_CTRL)),
-   DSC_BLK("dsc_4", DSC_4, 0x81000, 

Re: [PATCH v2 16/21] drm/msm/dpu: inline DSC_BLK macros

2023-06-15 Thread Marijn Suijten
On 2023-06-13 03:09:56, Dmitry Baryshkov wrote:
> To simplify making changes to the hardware block definitions, expand
> corresponding macros. This way all the changes are more obvious
> and visible in the source files.
> 
> Signed-off-by: Dmitry Baryshkov 
> ---
>  .../msm/disp/dpu1/catalog/dpu_3_0_msm8998.h   | 11 +--
>  .../msm/disp/dpu1/catalog/dpu_4_0_sdm845.h| 17 +++---
>  .../msm/disp/dpu1/catalog/dpu_5_0_sm8150.h| 21 ++---
>  .../msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h   | 31 +++
>  .../msm/disp/dpu1/catalog/dpu_6_0_sm8250.h| 21 ++---
>  .../msm/disp/dpu1/catalog/dpu_6_4_sm6350.h|  6 +++-
>  .../msm/disp/dpu1/catalog/dpu_6_9_sm6375.h|  6 +++-
>  .../gpu/drm/msm/disp/dpu1/dpu_hw_catalog.c| 10 --
>  8 files changed, 91 insertions(+), 32 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
> index a07c68744b29..7c3da4033c46 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_3_0_msm8998.h
> @@ -200,8 +200,15 @@ static const struct dpu_pingpong_cfg msm8998_pp[] = {
>  };
>  
>  static const struct dpu_dsc_cfg msm8998_dsc[] = {
> - DSC_BLK("dsc_0", DSC_0, 0x80000, 0),
> - DSC_BLK("dsc_1", DSC_1, 0x80400, 0),
> + {
> + .name = "dsc_0", .id = DSC_0,
> + .base = 0x80000, .len = 0x1800,
> + .features = 0,
> + }, {
> + .name = "dsc_1", .id = DSC_1,
> + .base = 0x80400, .len = 0x1800,
> + .features = 0,
> + },
>  };
>  
>  static const struct dpu_dspp_cfg msm8998_dspp[] = {
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h
> index 786263ed1ef2..ca3bb6a1a93a 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_4_0_sdm845.h
> @@ -224,10 +224,19 @@ static const struct dpu_pingpong_cfg sdm845_pp[] = {
>  };
>  
>  static const struct dpu_dsc_cfg sdm845_dsc[] = {
> - DSC_BLK("dsc_0", DSC_0, 0x80000, 0),
> - DSC_BLK("dsc_1", DSC_1, 0x80400, 0),
> - DSC_BLK("dsc_2", DSC_2, 0x80800, 0),
> - DSC_BLK("dsc_3", DSC_3, 0x80c00, 0),
> + {
> + .name = "dsc_0", .id = DSC_0,
> + .base = 0x80000, .len = 0x1800,
> + }, {
> + .name = "dsc_1", .id = DSC_1,
> + .base = 0x80400, .len = 0x1800,
> + }, {
> + .name = "dsc_2", .id = DSC_2,
> + .base = 0x80800, .len = 0x1800,
> + }, {
> + .name = "dsc_3", .id = DSC_3,
> + .base = 0x80c00, .len = 0x1800,
> + },
>  };
>  
>  static const struct dpu_intf_cfg sdm845_intf[] = {
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h
> index 6b9bfeac6e0a..5b068521de13 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_0_sm8150.h
> @@ -245,10 +245,23 @@ static const struct dpu_merge_3d_cfg sm8150_merge_3d[] = {
>  };
>  
>  static const struct dpu_dsc_cfg sm8150_dsc[] = {
> - DSC_BLK("dsc_0", DSC_0, 0x80000, BIT(DPU_DSC_OUTPUT_CTRL)),
> - DSC_BLK("dsc_1", DSC_1, 0x80400, BIT(DPU_DSC_OUTPUT_CTRL)),
> - DSC_BLK("dsc_2", DSC_2, 0x80800, BIT(DPU_DSC_OUTPUT_CTRL)),
> - DSC_BLK("dsc_3", DSC_3, 0x80c00, BIT(DPU_DSC_OUTPUT_CTRL)),
> + {
> + .name = "dsc_0", .id = DSC_0,
> + .base = 0x80000, .len = 0x1800,
> + .features = BIT(DPU_DSC_OUTPUT_CTRL),
> + }, {
> + .name = "dsc_1", .id = DSC_1,
> + .base = 0x80400, .len = 0x1800,
> + .features = BIT(DPU_DSC_OUTPUT_CTRL),
> + }, {
> + .name = "dsc_2", .id = DSC_2,
> + .base = 0x80800, .len = 0x1800,
> + .features = BIT(DPU_DSC_OUTPUT_CTRL),
> + }, {
> + .name = "dsc_3", .id = DSC_3,
> + .base = 0x80c00, .len = 0x1800,
> + .features = BIT(DPU_DSC_OUTPUT_CTRL),
> + },
>  };
>  
>  static const struct dpu_intf_cfg sm8150_intf[] = {
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h
> index 414f0db3306c..ba5420f334ec 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h
> +++ b/drivers/gpu/drm/msm/disp/dpu1/catalog/dpu_5_1_sc8180x.h
> @@ -244,12 +244,31 @@ static const struct dpu_merge_3d_cfg sc8180x_merge_3d[] = {
>  };
>  
>  static const struct dpu_dsc_cfg sc8180x_dsc[] = {
> - DSC_BLK("dsc_0", DSC_0, 0x80000, BIT(DPU_DSC_OUTPUT_CTRL)),
> - DSC_BLK("dsc_1", DSC_1, 0x80400, BIT(DPU_DSC_OUTPUT_CTRL)),
> - DSC_BLK("dsc_2", DSC_2, 0x80800, BIT(DPU_DSC_OUTPUT_CTRL)),
> - DSC_BLK("dsc_3", DSC_3, 0x80c00, 

Re: [PATCH v3] drm/vkms: Add support to 1D gamma LUT

2023-06-15 Thread kernel test robot
Hi Arthur,

kernel test robot noticed the following build warnings:

[auto build test WARNING on drm-misc/drm-misc-next]
[also build test WARNING on drm/drm-next drm-exynos/exynos-drm-next 
drm-intel/for-linux-next drm-intel/for-linux-next-fixes drm-tip/drm-tip 
linus/master v6.4-rc6 next-20230615]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:
https://github.com/intel-lab-lkp/linux/commits/Arthur-Grillo/drm-vkms-Add-support-to-1D-gamma-LUT/20230616-040349
base:   git://anongit.freedesktop.org/drm/drm-misc drm-misc-next
patch link:
https://lore.kernel.org/r/20230615200157.960630-1-arthurgrillo%40riseup.net
patch subject: [PATCH v3] drm/vkms: Add support to 1D gamma LUT
config: alpha-allyesconfig 
(https://download.01.org/0day-ci/archive/20230616/202306160524.qcbf0knr-...@intel.com/config)
compiler: alpha-linux-gcc (GCC) 12.3.0
reproduce (this is a W=1 build):
mkdir -p ~/bin
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
git remote add drm-misc git://anongit.freedesktop.org/drm/drm-misc
git fetch drm-misc drm-misc-next
git checkout drm-misc/drm-misc-next
b4 shazam https://lore.kernel.org/r/20230615200157.960630-1-arthurgri...@riseup.net
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.3.0 ~/bin/make.cross W=1 O=build_dir ARCH=alpha olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.3.0 ~/bin/make.cross W=1 O=build_dir ARCH=alpha SHELL=/bin/bash drivers/gpu/

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot 
| Closes: 
https://lore.kernel.org/oe-kbuild-all/202306160524.qcbf0knr-...@intel.com/

All warnings (new ones prefixed by >>):

   drivers/gpu/drm/vkms/vkms_crtc.c: In function 'vkms_crtc_atomic_flush':
>> drivers/gpu/drm/vkms/vkms_crtc.c:251:32: warning: variable 'gamma_lut' set but not used [-Wunused-but-set-variable]
 251 | struct vkms_color_lut *gamma_lut;
 |^


vim +/gamma_lut +251 drivers/gpu/drm/vkms/vkms_crtc.c

   246  
   247  static void vkms_crtc_atomic_flush(struct drm_crtc *crtc,
   248 struct drm_atomic_state *state)
   249  {
   250  struct vkms_output *vkms_output = drm_crtc_to_vkms_output(crtc);
 > 251  struct vkms_color_lut *gamma_lut;
   252  
   253  if (crtc->state->event) {
   254  spin_lock(&crtc->dev->event_lock);
   255  
   256  if (drm_crtc_vblank_get(crtc) != 0)
   257  drm_crtc_send_vblank_event(crtc, crtc->state->event);
   258  else
   259  drm_crtc_arm_vblank_event(crtc, crtc->state->event);
   260  
   261  spin_unlock(&crtc->dev->event_lock);
   262  
   263  crtc->state->event = NULL;
   264  }
   265  
   266  vkms_output->composer_state = to_vkms_crtc_state(crtc->state);
   267  gamma_lut = &vkms_output->composer_state->gamma_lut;
   268  
   269  spin_unlock_irq(&vkms_output->lock);
   270  }
   271  
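
For reference, the pattern behind this class of warning can be reproduced outside the kernel. The sketch below is a hedged illustration only: struct example_lut, struct example_state and example_lut_len() are hypothetical names made up here, and the fix shown is one generic option, not a statement about how the patch should be changed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the vkms types. */
struct example_lut { size_t len; };
struct example_state { struct example_lut gamma_lut; };

/*
 * Pattern that triggers -Wunused-but-set-variable:
 *
 *	struct example_lut *lut;
 *	lut = &state->gamma_lut;	(assigned, but never read afterwards)
 *
 * The function below avoids it by actually consuming the value. Dropping
 * the variable altogether is the other common way out.
 */
static size_t example_lut_len(struct example_state *state)
{
	struct example_lut *lut = &state->gamma_lut;

	assert(state != NULL);
	return lut->len;
}
```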

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


Re: [PATCH v2 0/5] drm/ssd130x: A few enhancements and cleanups

2023-06-15 Thread Javier Martinez Canillas
Javier Martinez Canillas  writes:

> Hello,
>
> While working on adding support for the SSD132X family of 4-bit grayscale
> Solomon OLED panel controllers, I noticed a few things in the driver that
> can be improved and make extending to support other chip families easier.
>
> I've split the preparatory patches in this series and will post the actual
> SSD132X support as a separate patch-set once this one is merged.
>
> Best regards,
> Javier
>
> Changes in v2:
> - List per controller default width/height values in DT schema (Maxime 
> Ripard).
>

Pushed to drm-misc (drm-misc-next). Thanks!

-- 
Best regards,

Javier Martinez Canillas
Core Platforms
Red Hat



Re: [RFC] Plane color pipeline KMS uAPI

2023-06-15 Thread Christopher Braga




On 6/14/2023 5:00 AM, Pekka Paalanen wrote:

On Tue, 13 Jun 2023 12:29:55 -0400
Christopher Braga  wrote:


On 6/13/2023 4:23 AM, Pekka Paalanen wrote:

On Mon, 12 Jun 2023 12:56:57 -0400
Christopher Braga  wrote:
   

On 6/12/2023 5:21 AM, Pekka Paalanen wrote:

On Fri, 9 Jun 2023 19:11:25 -0400
Christopher Braga  wrote:
  

On 6/9/2023 12:30 PM, Simon Ser wrote:

Hi Christopher,

On Friday, June 9th, 2023 at 17:52, Christopher Braga  wrote:
 

The new COLOROP objects also expose a number of KMS properties. Each has a
type, a reference to the next COLOROP object in the linked list, and other
type-specific properties. Here is an example for a 1D LUT operation:

 Color operation 42
 ├─ "type": enum {Bypass, 1D curve} = 1D curve
 ├─ "1d_curve_type": enum {LUT, sRGB, PQ, BT.709, HLG, …} = LUT

The options sRGB / PQ / BT.709 / HLG would select hard-coded 1D
curves? Will different hardware be allowed to expose a subset of these
enum values?


Yes. Only hardcoded LUTs supported by the HW are exposed as enum entries.
 

 ├─ "lut_size": immutable range = 4096
 ├─ "lut_data": blob
 └─ "next": immutable color operation ID = 43


Some hardware has per channel 1D LUT values, while others use the same
LUT for all channels.  We will definitely need to expose this in the
UAPI in some form.


Hm, I was assuming per-channel 1D LUTs here, just like the existing GAMMA_LUT/
DEGAMMA_LUT properties work. If some hardware can't support that, it'll need
to get exposed as another color operation block.
 

To configure this hardware block, user-space can fill a KMS blob with 4096 u32
entries, then set "lut_data" to the blob ID. Other color operation types might
have different properties.
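
As a rough sketch of the user-space side described above, the following fills a 4096-entry u32 table with an identity ramp. The helper name and the one-value-per-entry layout are assumptions made for this illustration; the proposal itself does not spell out the blob layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_LUT_SIZE 4096u	/* matches the "lut_size" value in the example */

/*
 * Fill 'lut' with an identity ramp: entry 0 maps to 0 and the last entry
 * maps to UINT32_MAX. One u32 per entry is an assumption of this sketch.
 */
static void example_fill_identity_lut(uint32_t *lut, size_t n)
{
	assert(n >= 2);
	for (size_t i = 0; i < n; i++)
		lut[i] = (uint32_t)(((uint64_t)i * UINT32_MAX) / (n - 1));
}
```

The filled array (EXAMPLE_LUT_SIZE * sizeof(uint32_t) bytes) would then be turned into a blob, for instance with libdrm's drmModeCreatePropertyBlob(), and the returned blob ID set on "lut_data".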


The bit-depth of the LUT is an important piece of information we should
include by default. Are we assuming that the DRM driver will always
reduce the input values to the resolution supported by the pipeline?
This could result in differences between the hardware behavior
and the shader behavior.

Additionally, some pipelines are floating point while others are fixed.
How would user space know if it needs to pack 32 bit integer values vs
32 bit float values?


Again, I'm deferring to the existing GAMMA_LUT/DEGAMMA_LUT. These use a common
definition of LUT blob (u16 elements) and it's up to the driver to convert.

Using a very precise format for the uAPI has the nice property of making the
uAPI much simpler to use. User-space sends high precision data and it's up to
drivers to map that to whatever the hardware accepts.


Conversion from a larger uint type to a smaller type sounds low effort,
however if a block works in a floating point space things are going to
get messy really quickly. If the block operates in FP16 space and the
interface is 16 bits we are good, but going from 32 bits to FP16 (such
as in the matrix case or 3DLUT) is less than ideal.


Hi Christopher,

are you thinking of precision loss, or the overhead of conversion?

Conversion from N-bit fixed point to N-bit floating-point is generally
lossy, too, and the other direction as well.

What exactly would be messy?
  

Overhead of conversion is the primary concern here. Having to extract
and / or calculate the significand + exponent components in the kernel
is burdensome and imo a task better suited for user space. This also has
to be done every blob set, meaning that if user space is re-using
pre-calculated blobs we would be repeating the same conversion
operations in kernel space unnecessarily.


What is burdensome in that calculation? I don't think you would need to
use any actual floating-point instructions. Logarithm for finding the
exponent is about finding the highest bit set in an integer and
everything is conveniently expressed in base-2. Finding significand is
just masking the integer based on the exponent.
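
To make the argument above concrete, here is a minimal integer-only sketch. It assumes the u32 is read as an unsigned 0.32 fixed-point fraction (v / 2^32) and that truncation is acceptable; both are assumptions of this example, and the function names are hypothetical. A kernel implementation would more likely use fls() than the open-coded loop.

```c
#include <assert.h>
#include <stdint.h>

/* Position of the highest set bit of a non-zero value (0..31). */
static unsigned int example_highest_bit(uint32_t v)
{
	unsigned int pos = 0;

	assert(v != 0);
	while (v >>= 1)
		pos++;
	return pos;
}

/*
 * Convert v, read as an unsigned 0.32 fixed-point fraction (v / 2^32),
 * to IEEE 754 binary16 bits. Extra low bits are truncated, not rounded.
 */
static uint16_t example_ufixed32_to_fp16(uint32_t v)
{
	unsigned int p, biased_exp;
	uint32_t mantissa;

	if (v == 0)
		return 0;

	p = example_highest_bit(v);	/* value is in [2^(p-32), 2^(p-31)) */

	/* Below 2^-14 the result is a subnormal: a plain count of 2^-24 units. */
	if (p < 18)
		return (uint16_t)(v >> 8);

	biased_exp = p - 17;			/* (p - 32) plus the FP16 bias of 15 */
	mantissa = (v >> (p - 10)) & 0x3ff;	/* 10 bits below the leading one */

	return (uint16_t)((biased_exp << 10) | mantissa);
}
```

For example, 0x80000000 (one half) maps to 0x3800 and 0xC0000000 (three quarters) maps to 0x3A00, with no floating-point instruction involved.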
   

Oh it definitely can be done, but I think this is just a difference of
opinion at this point. At the end of the day we will do it if we have
to, but it is just more optimal if a more agreeable common type is used.


Can you not cache the converted data, keyed by the DRM blob unique
identity vs. the KMS property it is attached to?

If the userspace compositor has N common transforms (ex: standard P3 ->
sRGB matrix), they would likely have N unique blobs. Obviously from the
kernel end we wouldn't want to cache the transform of every blob passed
down through the UAPI.


Hi Christopher,

as long as the blob exists, why not?


Generally because this is an unbounded number of blobs. I'm not 100% 
sure what the typical behavior is upstream, but in our driver we have 
scenarios where we can have per-frame blob updates (unique per-frame blobs).


Speaking of per-frame blob updates, there is one concern I neglected to 
bring up. Internally we have seen scenarios where frequent blob 
allocation can lead to memory allocation delays of two frames or higher. 
This 

Re: [PATCH v8 10/18] drm/msm/a6xx: Introduce GMU wrapper support

2023-06-15 Thread Konrad Dybcio
On 10.06.2023 00:06, Akhil P Oommen wrote:
> On Mon, May 29, 2023 at 03:52:29PM +0200, Konrad Dybcio wrote:
>>
>> Some (particularly SMD_RPM, a.k.a non-RPMh) SoCs implement A6XX GPUs
>> but don't implement the associated GMUs. This is due to the fact that
>> the GMU directly pokes at RPMh. Sadly, this means we have to take care
>> of enabling & scaling power rails, clocks and bandwidth ourselves.
>>
>> Reuse existing Adreno-common code and modify the deeply-GMU-infused
>> A6XX code to facilitate these GPUs. This involves if-ing out lots
>> of GMU callbacks and introducing a new type of GMU - GMU wrapper (it's
>> the actual name that Qualcomm uses in their downstream kernels).
>>
>> This is essentially a register region which is convenient to model
>> as a device. We'll use it for managing the GDSCs. The register
>> layout matches the actual GMU_CX/GX regions on the "real GMU" devices
>> and lets us reuse quite a bit of gmu_read/write/rmw calls.
>>
>> Signed-off-by: Konrad Dybcio 
>> ---
>>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c   |  72 +-
>>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c   | 211 
>> 
>>  drivers/gpu/drm/msm/adreno/a6xx_gpu.h   |   1 +
>>  drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c |  14 +-
>>  drivers/gpu/drm/msm/adreno/adreno_gpu.c |   8 +-
>>  drivers/gpu/drm/msm/adreno/adreno_gpu.h |   6 +
>>  6 files changed, 277 insertions(+), 35 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
>> index 5ba8cba69383..385ca3a12462 100644
>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
>> @@ -1437,6 +1437,7 @@ static int a6xx_gmu_get_irq(struct a6xx_gmu *gmu, struct platform_device *pdev,
>>  
>>  void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
>>  {
>> +struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
>>  struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
>>  struct platform_device *pdev = to_platform_device(gmu->dev);
>>  
>> @@ -1462,10 +1463,12 @@ void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
>>  gmu->mmio = NULL;
>>  gmu->rscc = NULL;
>>  
>> -a6xx_gmu_memory_free(gmu);
>> +if (!adreno_has_gmu_wrapper(adreno_gpu)) {
>> +a6xx_gmu_memory_free(gmu);
>>  
>> -free_irq(gmu->gmu_irq, gmu);
>> -free_irq(gmu->hfi_irq, gmu);
>> +free_irq(gmu->gmu_irq, gmu);
>> +free_irq(gmu->hfi_irq, gmu);
>> +}
>>  
>>  /* Drop reference taken in of_find_device_by_node */
>>  put_device(gmu->dev);
>> @@ -1484,6 +1487,69 @@ static int cxpd_notifier_cb(struct notifier_block *nb,
>>  return 0;
>>  }
>>  
>> +int a6xx_gmu_wrapper_init(struct a6xx_gpu *a6xx_gpu, struct device_node *node)
>> +{
>> +struct platform_device *pdev = of_find_device_by_node(node);
>> +struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
>> +int ret;
>> +
>> +if (!pdev)
>> +return -ENODEV;
>> +
>> +gmu->dev = &pdev->dev;
>> +
>> +of_dma_configure(gmu->dev, node, true);
>> +
>> +pm_runtime_enable(gmu->dev);
>> +
>> +/* Mark legacy for manual SPTPRAC control */
>> +gmu->legacy = true;
>> +
>> +/* Map the GMU registers */
>> +gmu->mmio = a6xx_gmu_get_mmio(pdev, "gmu");
>> +if (IS_ERR(gmu->mmio)) {
>> +ret = PTR_ERR(gmu->mmio);
>> +goto err_mmio;
>> +}
>> +
>> +gmu->cxpd = dev_pm_domain_attach_by_name(gmu->dev, "cx");
>> +if (IS_ERR(gmu->cxpd)) {
>> +ret = PTR_ERR(gmu->cxpd);
>> +goto err_mmio;
>> +}
>> +
>> +if (!device_link_add(gmu->dev, gmu->cxpd, DL_FLAG_PM_RUNTIME)) {
>> +ret = -ENODEV;
>> +goto detach_cxpd;
>> +}
>> +
>> +init_completion(&gmu->pd_gate);
>> +complete_all(&gmu->pd_gate);
>> +gmu->pd_nb.notifier_call = cxpd_notifier_cb;
>> +
>> +/* Get a link to the GX power domain to reset the GPU */
>> +gmu->gxpd = dev_pm_domain_attach_by_name(gmu->dev, "gx");
>> +if (IS_ERR(gmu->gxpd)) {
>> +ret = PTR_ERR(gmu->gxpd);
>> +goto err_mmio;
>> +}
>> +
>> +gmu->initialized = true;
>> +
>> +return 0;
>> +
>> +detach_cxpd:
>> +dev_pm_domain_detach(gmu->cxpd, false);
>> +
>> +err_mmio:
>> +iounmap(gmu->mmio);
>> +
>> +/* Drop reference taken in of_find_device_by_node */
>> +put_device(gmu->dev);
>> +
>> +return ret;
>> +}
>> +
>>  int a6xx_gmu_init(struct a6xx_gpu *a6xx_gpu, struct device_node *node)
>>  {
>>  struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>> index 58bf405b85d8..0a44762dbb6d 100644
>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>> @@ -21,7 +21,7 @@ static inline bool _a6xx_check_idle(struct msm_gpu *gpu)
>>  struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
>>  
>>  /* Check that the GMU is idle */
>> -if (!a6xx_gmu_isidle(&a6xx_gpu->gmu))
>> +if 

Re: [PATCH v3] drm/i915: Avoid circular locking dependency when flush delayed work on gt reset

2023-06-15 Thread Dong, Zhanjun
V3 follows option 1 of John's suggestion. The better option is still under
discussion and might have broader impact.

Meanwhile we can start with option 1, check the CI system report and see if
the issue gets better.



Regards,

Zhanjun Dong

On 2023-06-15 5:15 p.m., Zhanjun Dong wrote:

This attempts to avoid a circular locking dependency between flushing the
delayed work and intel_gt_reset.
Switch from cancel_delayed_work_sync to cancel_delayed_work, the non-sync
version, for the reset path. This is safe as the worker has trylock code to
handle the lock. Meanwhile, keep the sync version for park/fini to ensure the
worker is not still running during suspend or shutdown.

WARNING: possible circular locking dependency detected
6.4.0-rc1-drmtip_1340-g31e3463b0edb+ #1 Not tainted
--
kms_pipe_crc_ba/6415 is trying to acquire lock:
88813e6cc640 ((work_completion)(&(&guc->timestamp.work)->work)){+.+.}-{0:0}, at: __flush_work+0x42/0x530

but task is already holding lock:
88813e6cce90 (&gt->reset.mutex){+.+.}-{3:3}, at: intel_gt_reset+0x19e/0x470 [i915]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #3 (&gt->reset.mutex){+.+.}-{3:3}:
 lock_acquire+0xd8/0x2d0
 i915_gem_shrinker_taints_mutex+0x31/0x50 [i915]
 intel_gt_init_reset+0x65/0x80 [i915]
 intel_gt_common_init_early+0xe1/0x170 [i915]
 intel_root_gt_init_early+0x48/0x60 [i915]
 i915_driver_probe+0x671/0xcb0 [i915]
 i915_pci_probe+0xdc/0x210 [i915]
 pci_device_probe+0x95/0x120
 really_probe+0x164/0x3c0
 __driver_probe_device+0x73/0x160
 driver_probe_device+0x19/0xa0
 __driver_attach+0xb6/0x180
 bus_for_each_dev+0x77/0xd0
 bus_add_driver+0x114/0x210
 driver_register+0x5b/0x110
 __pfx_vgem_open+0x3/0x10 [vgem]
 do_one_initcall+0x57/0x270
 do_init_module+0x5f/0x220
 load_module+0x1ca4/0x1f00
 __do_sys_finit_module+0xb4/0x130
 do_syscall_64+0x3c/0x90
 entry_SYSCALL_64_after_hwframe+0x72/0xdc

-> #2 (fs_reclaim){+.+.}-{0:0}:
 lock_acquire+0xd8/0x2d0
 fs_reclaim_acquire+0xac/0xe0
 kmem_cache_alloc+0x32/0x260
 i915_vma_instance+0xb2/0xc60 [i915]
 i915_gem_object_ggtt_pin_ww+0x175/0x370 [i915]
 vm_fault_gtt+0x22d/0xf60 [i915]
 __do_fault+0x2f/0x1d0
 do_pte_missing+0x4a/0xd20
 __handle_mm_fault+0x5b0/0x790
 handle_mm_fault+0xa2/0x230
 do_user_addr_fault+0x3ea/0xa10
 exc_page_fault+0x68/0x1a0
 asm_exc_page_fault+0x26/0x30

-> #1 (&gt->reset.backoff_srcu){}-{0:0}:
 lock_acquire+0xd8/0x2d0
 _intel_gt_reset_lock+0x57/0x330 [i915]
 guc_timestamp_ping+0x35/0x130 [i915]
 process_one_work+0x250/0x510
 worker_thread+0x4f/0x3a0
 kthread+0xff/0x130
 ret_from_fork+0x29/0x50

-> #0 ((work_completion)(&(&guc->timestamp.work)->work)){+.+.}-{0:0}:
 check_prev_add+0x90/0xc60
 __lock_acquire+0x1998/0x2590
 lock_acquire+0xd8/0x2d0
 __flush_work+0x74/0x530
 __cancel_work_timer+0x14f/0x1f0
 intel_guc_submission_reset_prepare+0x81/0x4b0 [i915]
 intel_uc_reset_prepare+0x9c/0x120 [i915]
 reset_prepare+0x21/0x60 [i915]
 intel_gt_reset+0x1dd/0x470 [i915]
 intel_gt_reset_global+0xfb/0x170 [i915]
 intel_gt_handle_error+0x368/0x420 [i915]
 intel_gt_debugfs_reset_store+0x5c/0xc0 [i915]
 i915_wedged_set+0x29/0x40 [i915]
 simple_attr_write_xsigned.constprop.0+0xb4/0x110
 full_proxy_write+0x52/0x80
 vfs_write+0xc5/0x4f0
 ksys_write+0x64/0xe0
 do_syscall_64+0x3c/0x90
 entry_SYSCALL_64_after_hwframe+0x72/0xdc

other info that might help us debug this:
  Chain exists of:
   (work_completion)(&(&guc->timestamp.work)->work) --> fs_reclaim --> &gt->reset.mutex
   Possible unsafe locking scenario:
         CPU0                    CPU1
         ----                    ----
    lock(&gt->reset.mutex);
                                 lock(fs_reclaim);
                                 lock(&gt->reset.mutex);
    lock((work_completion)(&(&guc->timestamp.work)->work));

  *** DEADLOCK ***
  3 locks held by kms_pipe_crc_ba/6415:
   #0: 888101541430 (sb_writers#15){.+.+}-{0:0}, at: ksys_write+0x64/0xe0
   #1: 888136c7eab8 (&attr->mutex){+.+.}-{3:3}, at: simple_attr_write_xsigned.constprop.0+0x47/0x110
   #2: 88813e6cce90 (&gt->reset.mutex){+.+.}-{3:3}, at: intel_gt_reset+0x19e/0x470 [i915]

v2: Add sync flag to guc_cancel_busyness_worker to ensure reset path calls 
asynchronous cancel.
v3: Add sync flag to intel_guc_submission_disable to ensure reset path calls 
asynchronous cancel.

Signed-off-by: Zhanjun Dong 
---
  .../gpu/drm/i915/gt/uc/intel_guc_submission.c   | 17 ++---
  .../gpu/drm/i915/gt/uc/intel_guc_submission.h   |  2 +-
  

[PATCH v1] drm/i915/gsc: Fix intel_gsc_uc_fw_proxy_init_done with directed wakerefs

2023-06-15 Thread Alan Previn
intel_gsc_uc_fw_proxy_init_done is called from several code paths.
Some of them need to take a wakeref, while others cannot, for
example when called from the runtime_pm_resume callstack.

Add a param to this helper so that callers can direct whether
to take the wakeref or not. This resolves the following bug:

   INFO: task sh:2607 blocked for more than 61 seconds.
   Not tainted 6.3.0-pxp-gsc-final-jun14+ #297
   "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
   task:sh  state:D stack:13016 pid:2607  ppid:2602   flags:0x4000
   Call Trace:
  
  __schedule+0x47b/0xe10
  schedule+0x58/0xd0
  rpm_resume+0x1cc/0x800
  ? __pfx_autoremove_wake_function+0x10/0x10
  __pm_runtime_resume+0x42/0x80
  __intel_runtime_pm_get+0x19/0x80 [i915]
  gsc_uc_get_fw_status+0x10/0x50 [i915]
  intel_gsc_uc_fw_init_done+0x9/0x20 [i915]
  intel_gsc_uc_load_start+0x5b/0x130 [i915]
  __uc_resume+0xa5/0x280 [i915]
  intel_runtime_resume+0xd4/0x250 [i915]
  ? __pfx_pci_pm_runtime_resume+0x10/0x10
   __rpm_callback+0x3c/0x160

Fixes: 8c33c3755b75 ("drm/i915/gsc: take a wakeref for the proxy-init-completion check")
Signed-off-by: Alan Previn 
---
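The caller-directed wakeref described above can be illustrated with a minimal
user-space sketch. Everything named toy_* is hypothetical and is not the i915
API; the assert in toy_rpm_get() only models the assumption that taking a
wakeref from inside the resume callback would wait on the resume itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for runtime PM state; not the i915 structures. */
struct toy_rpm {
	int wakeref_count;
	bool resuming;	/* true while the (toy) resume callback runs */
};

static void toy_rpm_get(struct toy_rpm *rpm)
{
	/* Assumption of this sketch: a get during resume would block. */
	assert(!rpm->resuming);
	rpm->wakeref_count++;
}

static void toy_rpm_put(struct toy_rpm *rpm)
{
	assert(rpm->wakeref_count > 0);
	rpm->wakeref_count--;
}

/*
 * Read a status value, taking a wakeref only when the caller asks for it.
 * reg_value stands in for the MMIO read done by the real helper.
 */
static uint32_t toy_read_fw_status(struct toy_rpm *rpm, uint32_t reg_value,
				   bool needs_wakeref)
{
	uint32_t status;

	if (needs_wakeref)
		toy_rpm_get(rpm);

	status = reg_value;

	if (needs_wakeref)
		toy_rpm_put(rpm);

	return status;
}
```

A caller on the resume path would pass false, while a caller that may run
with the device suspended would pass true; the wakeref count is balanced in
both cases.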
 drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c  | 17 +++--
 drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.h  |  2 +-
 drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.c  |  2 +-
 drivers/gpu/drm/i915/pxp/intel_pxp_gsccs.c |  2 +-
 4 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
index 856de9af1e3a..ab1a456f833d 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.c
@@ -22,27 +22,32 @@ static bool gsc_is_in_reset(struct intel_uncore *uncore)
HECI1_FWSTS1_CURRENT_STATE_RESET;
 }
 
-static u32 gsc_uc_get_fw_status(struct intel_uncore *uncore)
+static u32 gsc_uc_get_fw_status(struct intel_uncore *uncore, bool needs_wakeref)
 {
 	intel_wakeref_t wakeref;
 	u32 fw_status = 0;
 
-	with_intel_runtime_pm(uncore->rpm, wakeref)
-		fw_status = intel_uncore_read(uncore, HECI_FWSTS(MTL_GSC_HECI1_BASE, 1));
+	if (needs_wakeref)
+		wakeref = intel_runtime_pm_get(uncore->rpm);
 
+	fw_status = intel_uncore_read(uncore, HECI_FWSTS(MTL_GSC_HECI1_BASE, 1));
+
+	if (needs_wakeref)
+		intel_runtime_pm_put(uncore->rpm, wakeref);
 	return fw_status;
 }
 
-bool intel_gsc_uc_fw_proxy_init_done(struct intel_gsc_uc *gsc)
+bool intel_gsc_uc_fw_proxy_init_done(struct intel_gsc_uc *gsc, bool needs_wakeref)
 {
return REG_FIELD_GET(HECI1_FWSTS1_CURRENT_STATE,
-gsc_uc_get_fw_status(gsc_uc_to_gt(gsc)->uncore)) ==
+gsc_uc_get_fw_status(gsc_uc_to_gt(gsc)->uncore,
+ needs_wakeref)) ==
   HECI1_FWSTS1_PROXY_STATE_NORMAL;
 }
 
 bool intel_gsc_uc_fw_init_done(struct intel_gsc_uc *gsc)
 {
-   return gsc_uc_get_fw_status(gsc_uc_to_gt(gsc)->uncore) &
+   return gsc_uc_get_fw_status(gsc_uc_to_gt(gsc)->uncore, false) &
   HECI1_FWSTS1_INIT_COMPLETE;
 }
 
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.h b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.h
index 8d7b9e4f1ffc..ad2167ce9137 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.h
+++ b/drivers/gpu/drm/i915/gt/uc/intel_gsc_fw.h
@@ -15,6 +15,6 @@ struct intel_uncore;
 int intel_gsc_fw_get_binary_info(struct intel_uc_fw *gsc_fw, const void *data, size_t size);
 int intel_gsc_uc_fw_upload(struct intel_gsc_uc *gsc);
 bool intel_gsc_uc_fw_init_done(struct intel_gsc_uc *gsc);
-bool intel_gsc_uc_fw_proxy_init_done(struct intel_gsc_uc *gsc);
+bool intel_gsc_uc_fw_proxy_init_done(struct intel_gsc_uc *gsc, bool needs_wakeref);
 
 #endif
diff --git a/drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.c b/drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.c
index 85d90f0a15e3..75a3a0790ef3 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_gsc_uc.c
@@ -72,7 +72,7 @@ static void gsc_work(struct work_struct *work)
 * complete the request handling cleanly, so we need to check the
 * status register to check if the proxy init was actually successful
 */
-   if (intel_gsc_uc_fw_proxy_init_done(gsc)) {
+   if (intel_gsc_uc_fw_proxy_init_done(gsc, false)) {
drm_dbg(&gt->i915->drm, "GSC Proxy initialized\n");
intel_uc_fw_change_status(&gsc->fw, INTEL_UC_FIRMWARE_RUNNING);
} else {
diff --git a/drivers/gpu/drm/i915/pxp/intel_pxp_gsccs.c b/drivers/gpu/drm/i915/pxp/intel_pxp_gsccs.c
index f13890ec7db1..c7df47364013 100644
--- a/drivers/gpu/drm/i915/pxp/intel_pxp_gsccs.c
+++ 

Re: [PATCH 1/2] fbdev/offb: Update expected device name

2023-06-15 Thread Cyril Brulebois
Hi Rob,

Rob Herring  (2023-06-15):
> On Thu, Jun 15, 2023 at 03:21:07PM +0200, Michal Suchánek wrote:
> > At the time this was proposed it was said that "of-display", is wrong,
> > and that "of-display.0" must be used for the first device instead, and
> > if something breaks an alias can be provided.
> > 
> > So how does one provide an alias so that offb can find "of-display.0"
> > as "of-display"?
> 
> I'm not aware of any way. There isn't because device names and paths are 
> not considered ABI. There are mechanisms for getting stable class device 
> indices (e.g. i2c0, mmcblk0, fb0, fb1, etc.) though not implemented for 
> fbN (and please don't add it). 
> 
> In any case, this should be an easy fix. Though if "linux,opened" or 
> "linux,boot-display" is not set, then you'd still get "of-display.0":
> 
> diff --git a/drivers/of/platform.c b/drivers/of/platform.c
> index 78ae84187449..e46482cef9c7 100644
> --- a/drivers/of/platform.c
> +++ b/drivers/of/platform.c
> @@ -553,7 +553,7 @@ static int __init of_platform_default_populate_init(void)
> if (!of_get_property(node, "linux,opened", NULL) ||
> !of_get_property(node, "linux,boot-display", NULL))
> continue;
> -   dev = of_platform_device_create(node, "of-display.0", NULL);
> +   dev = of_platform_device_create(node, "of-display", NULL);
> of_node_put(node);
> if (WARN_ON(!dev))
> return -ENOMEM;

I've just replaced my clueless workaround with this patch on top of the
kernel found in Debian 12 (Bookworm), i.e. 6.1.27 at this point, and it
indeed fixes the black screen problem in the installer's context.

I didn't run a full installation to check whether this kernel is also fine
after rebooting into the installed system, but as far as I understood for
the original bug report[1], it wasn't affected in the first place.

 1. https://bugs.debian.org/1033058

Will somebody else pick up the torch from here, and submit that for
inclusion in master? Or should I re-submit the above patch on my own?

I see my Debian colleagues have already pushed an updated v6.4-rc6 in
experimental, so it should be rather easy to combine checking latest
master with the distribution's packaging. Once that's done, I'm quite
familiar with building an updated installer image on top of it…


Thanks,
-- 
Cyril Brulebois -- Debian Consultant @ DEBAMAX -- https://debamax.com/


signature.asc
Description: PGP signature


Re: [PATCH v8 07/18] drm/msm/a6xx: Add a helper for software-resetting the GPU

2023-06-15 Thread Akhil P Oommen
On Thu, Jun 15, 2023 at 10:59:23PM +0200, Konrad Dybcio wrote:
> 
> On 15.06.2023 22:11, Akhil P Oommen wrote:
> > On Thu, Jun 15, 2023 at 12:34:06PM +0200, Konrad Dybcio wrote:
> >>
> >> On 6.06.2023 19:18, Akhil P Oommen wrote:
> >>> On Mon, May 29, 2023 at 03:52:26PM +0200, Konrad Dybcio wrote:
> 
> >>>> Introduce a6xx_gpu_sw_reset() in preparation for adding GMU wrapper
> >>>> GPUs and reuse it in a6xx_gmu_force_off().
> >>>>
> >>>> This helper, contrary to the original usage in GMU code paths, adds
> >>>> a write memory barrier which together with the necessary delay should
> >>>> ensure that the reset is never deasserted too quickly due to e.g. OoO
> >>>> execution going crazy.
> >>>>
> >>>> Signed-off-by: Konrad Dybcio 
> >>>> ---
> >>>>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c |  3 +--
> >>>>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 11 +++
> >>>>  drivers/gpu/drm/msm/adreno/a6xx_gpu.h |  1 +
> >>>>  3 files changed, 13 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> >>>> index b86be123ecd0..5ba8cba69383 100644
> >>>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> >>>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> >>>> @@ -899,8 +899,7 @@ static void a6xx_gmu_force_off(struct a6xx_gmu *gmu)
> >>>>  a6xx_bus_clear_pending_transactions(adreno_gpu, true);
> >>>>  
> >>>>  /* Reset GPU core blocks */
> >>>> -gpu_write(gpu, REG_A6XX_RBBM_SW_RESET_CMD, 1);
> >>>> -udelay(100);
> >>>> +a6xx_gpu_sw_reset(gpu, true);
> >>>>  }
> >>>>  
> >>>>  static void a6xx_gmu_set_initial_freq(struct msm_gpu *gpu, struct a6xx_gmu *gmu)
> >>>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> >>>> index e3ac3f045665..083ccb5bcb4e 100644
> >>>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> >>>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> >>>> @@ -1634,6 +1634,17 @@ void a6xx_bus_clear_pending_transactions(struct adreno_gpu *adreno_gpu, bool gx_
> >>>>  gpu_write(gpu, REG_A6XX_GBIF_HALT, 0x0);
> >>>>  }
> >>>>  
> >>>> +void a6xx_gpu_sw_reset(struct msm_gpu *gpu, bool assert)
> >>>> +{
> >>>> +gpu_write(gpu, REG_A6XX_RBBM_SW_RESET_CMD, assert);
> >>>> +/* Add a barrier to avoid bad surprises */
> >>> Can you please make this comment a bit more clear? Highlight that we
> >>> should ensure the register is posted at hw before polling.
> >>>
> >>> I think this barrier is required only during assert.
> >> Generally it should not be strictly required at all, but I'm thinking
> >> that it'd be good to keep it in both cases, so that:
> >>
> >> if (assert)
> >>we don't keep writing things to the GPU if it's in reset
> >> else
> >>we don't start writing things to the GPU before it comes
> >>out of reset
> >>
> >> Also, if you squint hard enough at the commit message, you'll notice
> >> I intended for this to only be a wmb, but for some reason generalized
> >> it.. Perhaps that's another thing I should fix!
> >> for v9..
> > 
> > wmb() doesn't provide any ordering guarantee with the delay loop.
> Hm, fair.. I'm still not as fluent with memory access knowledge as I'd
> like to be..
> 
> > A common practice is to just read back the same register before
> > the loop because a readl followed by delay() is guaranteed to be ordered.
> So, how should I proceed? Keep the r/w barrier, or add a readback and
> a tiiiny (perhaps even using ndelay instead of udelay?) delay on de-assert?

readback + delay (similar value as downstream). This path is exercised
rarely.

-Akhil.

> 
> Konrad
> > 
> > -Akhil.
> >>
> >> Konrad
> >>>
> >>> -Akhil.
> >>>> +mb();
> >>>> +
> >>>> +/* The reset line needs to be asserted for at least 100 us */
> >>>> +if (assert)
> >>>> +udelay(100);
> >>>> +}
> >>>> +
> >>>>  static int a6xx_pm_resume(struct msm_gpu *gpu)
> >>>>  {
> >>>>  struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> >>>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> >>>> index 9580def06d45..aa70390ee1c6 100644
> >>>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> >>>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> >>>> @@ -89,5 +89,6 @@ struct msm_gpu_state *a6xx_gpu_state_get(struct msm_gpu *gpu);
> >>>>  int a6xx_gpu_state_put(struct msm_gpu_state *state);
> >>>>  
> >>>>  void a6xx_bus_clear_pending_transactions(struct adreno_gpu *adreno_gpu, bool gx_off);
> >>>> +void a6xx_gpu_sw_reset(struct msm_gpu *gpu, bool assert);
> >>>>  
> >>>>  #endif /* __A6XX_GPU_H__ */
> >>>>
> >>>> -- 
> >>>> 2.40.1
> 
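The "readback + delay" ordering suggested in this thread can be sketched in
user space as follows. The register array, the accessors and the delay
recorder are hypothetical stand-ins (real code would use the driver's own
register accessors and udelay()), and the de-assert delay is left as a
parameter because the thread does not state the downstream value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical fake register file standing in for MMIO space. */
enum { TOY_REG_SW_RESET_CMD = 0, TOY_REG_COUNT = 1 };

static volatile uint32_t toy_regs[TOY_REG_COUNT];
static unsigned int toy_delay_us_total;	/* records the requested delays */

static void toy_reg_write(unsigned int reg, uint32_t val)
{
	assert(reg < TOY_REG_COUNT);
	toy_regs[reg] = val;
}

static uint32_t toy_reg_read(unsigned int reg)
{
	assert(reg < TOY_REG_COUNT);
	return toy_regs[reg];
}

static void toy_udelay(unsigned int us)
{
	toy_delay_us_total += us;
}

/*
 * Write the reset command, read the same register back so that the write
 * is known to have reached the device, and only then start the delay.
 * Returns the value read back.
 */
static uint32_t toy_sw_reset(bool assert_reset, unsigned int delay_us)
{
	uint32_t readback;

	toy_reg_write(TOY_REG_SW_RESET_CMD, assert_reset);
	readback = toy_reg_read(TOY_REG_SW_RESET_CMD);
	toy_udelay(delay_us);

	return readback;
}
```

The point of the pattern is only the order of operations: the readback sits
between the write and the delay, so the delay cannot begin before the write
has been observed.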


[PATCH v3] drm/i915: Avoid circular locking dependency when flush delayed work on gt reset

2023-06-15 Thread Zhanjun Dong
This attempts to avoid a circular locking dependency between flushing the
delayed work and intel_gt_reset.
Switch from cancel_delayed_work_sync to cancel_delayed_work, the non-sync
version, on the reset path. This is safe because the worker has trylock code
to handle the lock. Meanwhile, keep the sync version for park/fini to ensure
the worker is not still running during suspend or shutdown.

WARNING: possible circular locking dependency detected
6.4.0-rc1-drmtip_1340-g31e3463b0edb+ #1 Not tainted
--
kms_pipe_crc_ba/6415 is trying to acquire lock:
88813e6cc640 ((work_completion)(&(&guc->timestamp.work)->work)){+.+.}-{0:0}, at: __flush_work+0x42/0x530

but task is already holding lock:
88813e6cce90 (&gt->reset.mutex){+.+.}-{3:3}, at: intel_gt_reset+0x19e/0x470 [i915]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #3 (&gt->reset.mutex){+.+.}-{3:3}:
lock_acquire+0xd8/0x2d0
i915_gem_shrinker_taints_mutex+0x31/0x50 [i915]
intel_gt_init_reset+0x65/0x80 [i915]
intel_gt_common_init_early+0xe1/0x170 [i915]
intel_root_gt_init_early+0x48/0x60 [i915]
i915_driver_probe+0x671/0xcb0 [i915]
i915_pci_probe+0xdc/0x210 [i915]
pci_device_probe+0x95/0x120
really_probe+0x164/0x3c0
__driver_probe_device+0x73/0x160
driver_probe_device+0x19/0xa0
__driver_attach+0xb6/0x180
bus_for_each_dev+0x77/0xd0
bus_add_driver+0x114/0x210
driver_register+0x5b/0x110
__pfx_vgem_open+0x3/0x10 [vgem]
do_one_initcall+0x57/0x270
do_init_module+0x5f/0x220
load_module+0x1ca4/0x1f00
__do_sys_finit_module+0xb4/0x130
do_syscall_64+0x3c/0x90
entry_SYSCALL_64_after_hwframe+0x72/0xdc

-> #2 (fs_reclaim){+.+.}-{0:0}:
lock_acquire+0xd8/0x2d0
fs_reclaim_acquire+0xac/0xe0
kmem_cache_alloc+0x32/0x260
i915_vma_instance+0xb2/0xc60 [i915]
i915_gem_object_ggtt_pin_ww+0x175/0x370 [i915]
vm_fault_gtt+0x22d/0xf60 [i915]
__do_fault+0x2f/0x1d0
do_pte_missing+0x4a/0xd20
__handle_mm_fault+0x5b0/0x790
handle_mm_fault+0xa2/0x230
do_user_addr_fault+0x3ea/0xa10
exc_page_fault+0x68/0x1a0
asm_exc_page_fault+0x26/0x30

-> #1 (&gt->reset.backoff_srcu){}-{0:0}:
lock_acquire+0xd8/0x2d0
_intel_gt_reset_lock+0x57/0x330 [i915]
guc_timestamp_ping+0x35/0x130 [i915]
process_one_work+0x250/0x510
worker_thread+0x4f/0x3a0
kthread+0xff/0x130
ret_from_fork+0x29/0x50

-> #0 ((work_completion)(&(&guc->timestamp.work)->work)){+.+.}-{0:0}:
check_prev_add+0x90/0xc60
__lock_acquire+0x1998/0x2590
lock_acquire+0xd8/0x2d0
__flush_work+0x74/0x530
__cancel_work_timer+0x14f/0x1f0
intel_guc_submission_reset_prepare+0x81/0x4b0 [i915]
intel_uc_reset_prepare+0x9c/0x120 [i915]
reset_prepare+0x21/0x60 [i915]
intel_gt_reset+0x1dd/0x470 [i915]
intel_gt_reset_global+0xfb/0x170 [i915]
intel_gt_handle_error+0x368/0x420 [i915]
intel_gt_debugfs_reset_store+0x5c/0xc0 [i915]
i915_wedged_set+0x29/0x40 [i915]
simple_attr_write_xsigned.constprop.0+0xb4/0x110
full_proxy_write+0x52/0x80
vfs_write+0xc5/0x4f0
ksys_write+0x64/0xe0
do_syscall_64+0x3c/0x90
entry_SYSCALL_64_after_hwframe+0x72/0xdc

other info that might help us debug this:
 Chain exists of:
  (work_completion)(&(&guc->timestamp.work)->work) --> fs_reclaim --> &gt->reset.mutex
  Possible unsafe locking scenario:
        CPU0                    CPU1
        ----                    ----
   lock(&gt->reset.mutex);
                                lock(fs_reclaim);
                                lock(&gt->reset.mutex);
   lock((work_completion)(&(&guc->timestamp.work)->work));

 *** DEADLOCK ***
 3 locks held by kms_pipe_crc_ba/6415:
  #0: 888101541430 (sb_writers#15){.+.+}-{0:0}, at: ksys_write+0x64/0xe0
  #1: 888136c7eab8 (&attr->mutex){+.+.}-{3:3}, at: simple_attr_write_xsigned.constprop.0+0x47/0x110
  #2: 88813e6cce90 (&gt->reset.mutex){+.+.}-{3:3}, at: intel_gt_reset+0x19e/0x470 [i915]

v2: Add sync flag to guc_cancel_busyness_worker to ensure reset path calls 
asynchronous cancel.
v3: Add sync flag to intel_guc_submission_disable to ensure reset path calls 
asynchronous cancel.

Signed-off-by: Zhanjun Dong 
---
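The reasoning in the commit message (the reset path cancels without waiting,
and the worker only trylocks) can be illustrated with a toy, single-threaded
sketch. The atomic flag stands in for the reset mutex and all toy_* names are
hypothetical; this is not the i915 code, only the shape of the argument.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for the reset lock: flag set means "held". */
static atomic_flag toy_reset_lock = ATOMIC_FLAG_INIT;
static int toy_pings_done;

static bool toy_trylock(void)
{
	return !atomic_flag_test_and_set(&toy_reset_lock);
}

static void toy_unlock(void)
{
	atomic_flag_clear(&toy_reset_lock);
}

/*
 * Worker body: it never blocks on the reset lock. If a reset is in
 * progress it skips this round. Returns true if it did its work.
 */
static bool toy_timestamp_worker(void)
{
	if (!toy_trylock())
		return false;

	toy_pings_done++;
	toy_unlock();
	return true;
}

/*
 * Reset path: it holds the lock and does not wait for the worker (the
 * non-sync cancel). The worker is simulated as firing while the lock is
 * held. Returns whether the worker managed to do any work meanwhile.
 */
static bool toy_reset_path(void)
{
	bool got_lock = toy_trylock();
	bool worker_ran;

	assert(got_lock);	/* sketch assumes no concurrent reset */
	worker_ran = toy_timestamp_worker();
	toy_unlock();

	return worker_ran;
}
```

Because the worker backs off instead of sleeping on the lock, and the reset
path never waits for the worker while holding the lock, neither side can end
up waiting on the other in this model.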
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c   | 17 ++---
 .../gpu/drm/i915/gt/uc/intel_guc_submission.h   |  2 +-
 drivers/gpu/drm/i915/gt/uc/intel_uc.c   |  4 ++--
 3 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index a0e3ef1c65d2..ef4300246ce1 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ 

Re: [PATCH v7 2/8] PCI/VGA: Deal only with VGA class devices

2023-06-15 Thread Alex Deucher
On Wed, Jun 14, 2023 at 6:50 AM Sui Jingfeng  wrote:
>
> Hi,
>
> On 2023/6/13 11:01, Sui Jingfeng wrote:
> > From: Sui Jingfeng 
> >
> > Deal only with the VGA device (pdev->class == 0x0300), so replace the
> > pci_get_subsys() function with pci_get_class(). Filter the non-PCI display
> > device (pdev->class != 0x0300) out. There is no need to process the
> > non-display PCI device.
> >
> > Cc: Bjorn Helgaas 
> > Signed-off-by: Sui Jingfeng 
> > ---
> >   drivers/pci/vgaarb.c | 22 --
> >   1 file changed, 12 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c
> > index c1bc6c983932..22a505e877dc 100644
> > --- a/drivers/pci/vgaarb.c
> > +++ b/drivers/pci/vgaarb.c
> > @@ -754,10 +754,6 @@ static bool vga_arbiter_add_pci_device(struct pci_dev *pdev)
> >   struct pci_dev *bridge;
> >   u16 cmd;
> >
> > - /* Only deal with VGA class devices */
> > - if ((pdev->class >> 8) != PCI_CLASS_DISPLAY_VGA)
> > - return false;
> > -
>
> Hi, this is probably a bug fix.
>
> For example, an NVIDIA render-only GPU typically has 0x0380 as its PCI
> class number, but a render-only GPU should not participate in the
> arbitration, as it shouldn't snoop the legacy fixed VGA addresses.
>
> It (the render-only GPU) cannot display anything.
>
> But 0x0380 >> 8 = 0x03, so the filter failed.
>
>
> >   /* Allocate structure */
> >   vgadev = kzalloc(sizeof(struct vga_device), GFP_KERNEL);
> >   if (vgadev == NULL) {
> > @@ -1500,7 +1496,9 @@ static int pci_notify(struct notifier_block *nb, unsigned long action,
> >   struct pci_dev *pdev = to_pci_dev(dev);
> >   bool notify = false;
> >
> > - vgaarb_dbg(dev, "%s\n", __func__);
> > + /* Only deal with VGA class devices */
> > + if (pdev->class != PCI_CLASS_DISPLAY_VGA << 8)
> > + return 0;
>
> So here we only care about 0x0300. My initial intent was to make an
> optimization:
>
> nowadays any sane display graphics card should have 0x0300 as its PCI
> class number. Is this completely right?
>
> ```
>
> #define PCI_BASE_CLASS_DISPLAY     0x03
> #define PCI_CLASS_DISPLAY_VGA      0x0300
> #define PCI_CLASS_DISPLAY_XGA      0x0301
> #define PCI_CLASS_DISPLAY_3D       0x0302
> #define PCI_CLASS_DISPLAY_OTHER    0x0380
>
> ```
>
> Any ideas ?

I'm not quite sure what you are asking about here.  For vga_arb, we
only care about VGA class devices since those should be the only
ones that might have VGA routed to them.  However, as VGA gets
deprecated, you'll have more non-VGA PCI classes for devices which
could be the pre-OS console device.

Alex

>
> >   /* For now we're only intereted in devices added and removed. I didn't
> >* test this thing here, so someone needs to double check for the
> > @@ -1510,6 +1508,8 @@ static int pci_notify(struct notifier_block *nb, unsigned long action,
> >   else if (action == BUS_NOTIFY_DEL_DEVICE)
> >   notify = vga_arbiter_del_pci_device(pdev);
> >
> > + vgaarb_dbg(dev, "%s: action = %lu\n", __func__, action);
> > +
> >   if (notify)
> >   vga_arbiter_notify_clients();
> >   return 0;
> > @@ -1534,8 +1534,8 @@ static struct miscdevice vga_arb_device = {
> >
> >   static int __init vga_arb_device_init(void)
> >   {
> > + struct pci_dev *pdev = NULL;
> >   int rc;
> > - struct pci_dev *pdev;
> >
> >   rc = misc_register(&vga_arb_device);
> >   if (rc < 0)
> > @@ -1545,11 +1545,13 @@ static int __init vga_arb_device_init(void)
> >
> >   /* We add all PCI devices satisfying VGA class in the arbiter by
> >* default */
> > - pdev = NULL;
> > - while ((pdev =
> > - pci_get_subsys(PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID,
> > -PCI_ANY_ID, pdev)) != NULL)
> > + while (1) {
> > + pdev = pci_get_class(PCI_CLASS_DISPLAY_VGA << 8, pdev);
> > + if (!pdev)
> > + break;
> > +
> >   vga_arbiter_add_pci_device(pdev);
> > + }
> >
> >   pr_info("loaded\n");
> >   return rc;
>
> --
> Jingfeng
>
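The two class checks discussed in this thread can be compared with a small
worked example. It assumes that pdev->class carries the 24-bit class code
(base class, sub-class, programming interface), which is how struct pci_dev
documents the field; the toy_* helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PCI_CLASS_DISPLAY_VGA	0x0300
#define PCI_CLASS_DISPLAY_OTHER	0x0380

/* Build a 24-bit class code from the 16-bit class and the prog-if byte. */
static uint32_t toy_make_class(uint16_t base_sub, uint8_t prog_if)
{
	return ((uint32_t)base_sub << 8) | prog_if;
}

/* The check removed from vga_arbiter_add_pci_device() by the patch. */
static bool toy_is_vga_old_check(uint32_t class_code)
{
	assert(class_code <= 0xffffff);
	return (class_code >> 8) == PCI_CLASS_DISPLAY_VGA;
}

/* The check added to pci_notify() by the patch (inverted to "is VGA"). */
static bool toy_is_vga_new_check(uint32_t class_code)
{
	assert(class_code <= 0xffffff);
	return class_code == (PCI_CLASS_DISPLAY_VGA << 8);
}
```

Under that assumption a 0x0380 device has class code 0x038000, and
0x038000 >> 8 is 0x0380, so both checks reject it. The two checks differ
only when the programming-interface byte is nonzero: the old one ignores
that byte, the new one requires it to be zero.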


Re: [PATCH v8 07/18] drm/msm/a6xx: Add a helper for software-resetting the GPU

2023-06-15 Thread Konrad Dybcio
On 15.06.2023 22:11, Akhil P Oommen wrote:
> On Thu, Jun 15, 2023 at 12:34:06PM +0200, Konrad Dybcio wrote:
>>
>> On 6.06.2023 19:18, Akhil P Oommen wrote:
>>> On Mon, May 29, 2023 at 03:52:26PM +0200, Konrad Dybcio wrote:

>>>> Introduce a6xx_gpu_sw_reset() in preparation for adding GMU wrapper
>>>> GPUs and reuse it in a6xx_gmu_force_off().
>>>>
>>>> This helper, contrary to the original usage in GMU code paths, adds
>>>> a write memory barrier which together with the necessary delay should
>>>> ensure that the reset is never deasserted too quickly due to e.g. OoO
>>>> execution going crazy.
>>>>
>>>> Signed-off-by: Konrad Dybcio 
>>>> ---
>>>>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c |  3 +--
>>>>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 11 +++
>>>>  drivers/gpu/drm/msm/adreno/a6xx_gpu.h |  1 +
>>>>  3 files changed, 13 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
>>>> index b86be123ecd0..5ba8cba69383 100644
>>>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
>>>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
>>>> @@ -899,8 +899,7 @@ static void a6xx_gmu_force_off(struct a6xx_gmu *gmu)
>>>>  a6xx_bus_clear_pending_transactions(adreno_gpu, true);
>>>>  
>>>>  /* Reset GPU core blocks */
>>>> -  gpu_write(gpu, REG_A6XX_RBBM_SW_RESET_CMD, 1);
>>>> -  udelay(100);
>>>> +  a6xx_gpu_sw_reset(gpu, true);
>>>>  }
>>>>  
>>>>  static void a6xx_gmu_set_initial_freq(struct msm_gpu *gpu, struct a6xx_gmu *gmu)
>>>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>>> index e3ac3f045665..083ccb5bcb4e 100644
>>>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>>>> @@ -1634,6 +1634,17 @@ void a6xx_bus_clear_pending_transactions(struct adreno_gpu *adreno_gpu, bool gx_
>>>>  gpu_write(gpu, REG_A6XX_GBIF_HALT, 0x0);
>>>>  }
>>>>  
>>>> +void a6xx_gpu_sw_reset(struct msm_gpu *gpu, bool assert)
>>>> +{
>>>> +  gpu_write(gpu, REG_A6XX_RBBM_SW_RESET_CMD, assert);
>>>> +  /* Add a barrier to avoid bad surprises */
>>> Can you please make this comment a bit more clear? Highlight that we
>>> should ensure the register is posted at hw before polling.
>>>
>>> I think this barrier is required only during assert.
>> Generally it should not be strictly required at all, but I'm thinking
>> that it'd be good to keep it in both cases, so that:
>>
>> if (assert)
>>  we don't keep writing things to the GPU if it's in reset
>> else
>>  we don't start writing things to the GPU before it comes
>>  out of reset
>>
>> Also, if you squint hard enough at the commit message, you'll notice
>> I intended for this to only be a wmb, but for some reason generalized
>> it.. Perhaps that's another thing I should fix!
>> for v9..
> 
> wmb() doesn't provide any ordering guarantee with the delay loop.
Hm, fair.. I'm still not as fluent with memory access knowledge as I'd
like to be..

> A common practice is to just read back the same register before
> the loop because a readl followed by delay() is guaranteed to be ordered.
So, how should I proceed? Keep the r/w barrier, or add a readback and
a tiiiny (perhaps even using ndelay instead of udelay?) delay on de-assert?

Konrad
> 
> -Akhil.
>>
>> Konrad
>>>
>>> -Akhil.
>>>> +  mb();
>>>> +
>>>> +  /* The reset line needs to be asserted for at least 100 us */
>>>> +  if (assert)
>>>> +  udelay(100);
>>>> +}
>>>> +
>>>>  static int a6xx_pm_resume(struct msm_gpu *gpu)
>>>>  {
>>>>  struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
>>>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
>>>> index 9580def06d45..aa70390ee1c6 100644
>>>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
>>>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
>>>> @@ -89,5 +89,6 @@ struct msm_gpu_state *a6xx_gpu_state_get(struct msm_gpu *gpu);
>>>>  int a6xx_gpu_state_put(struct msm_gpu_state *state);
>>>>  
>>>>  void a6xx_bus_clear_pending_transactions(struct adreno_gpu *adreno_gpu, bool gx_off);
>>>> +void a6xx_gpu_sw_reset(struct msm_gpu *gpu, bool assert);
>>>>  
>>>>  #endif /* __A6XX_GPU_H__ */
>>>>
>>>> -- 
>>>> 2.40.1



Re: [PATCH v3 02/17] dt-bindings: gpu: Add Imagination Technologies PowerVR GPU

2023-06-15 Thread Rob Herring
On Tue, Jun 13, 2023 at 9:20 AM Sarah Walker  wrote:
>
> Add the device tree binding documentation for the Series AXE GPU used in
> TI AM62 SoCs.
>
> Signed-off-by: Sarah Walker 
> ---
>  .../devicetree/bindings/gpu/img,powervr.yaml  | 71 +++
>  MAINTAINERS   |  7 ++
>  2 files changed, 78 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/gpu/img,powervr.yaml

Please use get_maintainers.pl and send your patches to the correct
people and lists or they won't get reviewed.

Rob


Re: [PATCH v9 02/14] mm: move page zone helpers from mm.h to mmzone.h

2023-06-15 Thread Peter Xu
On Thu, Jun 15, 2023 at 09:15:26PM +0100, Matthew Wilcox wrote:
> On Thu, Jun 15, 2023 at 03:33:12PM -0400, Peter Xu wrote:
> > My question is whether page_zonenum() is ready for taking all kinds of tail
> > pages?
> > 
> > Zone device tail pages all look fine, per memmap_init_zone_device().  The
> > question was other kinds of usual compound pages, like either thp or
> > hugetlb.  IIUC page->flags can be uninitialized for those tail pages.
> 
> I don't think that's true.  It's my understanding that page->flags is
> initialised for all pages in memmap at boot / hotplug / delayed-init
> time.  So you can check things like zone, node, etc on literally any
> page.  Contrariwise, those flags are not available in tail pages for
> use by the entity that has allocated a compound page / large folio.

Oh so the zone mask is special.  Fair enough.

> 
> Also, I don't believe zone device pages support compound allocation.
> I think they're always allocated as order-0.

Totally not familiar with zone device pages, but memmap_init_zone_device()
has pfns_per_compound which can be >1.  From there, memmap_init_compound()
does go ahead and setup pages as compound ones.

Thanks!

-- 
Peter Xu



Re: [PATCH v9 02/14] mm: move page zone helpers from mm.h to mmzone.h

2023-06-15 Thread Matthew Wilcox
On Thu, Jun 15, 2023 at 03:33:12PM -0400, Peter Xu wrote:
> My question is whether page_zonenum() is ready for taking all kinds of tail
> pages?
> 
> Zone device tail pages all look fine, per memmap_init_zone_device().  The
> question was other kinds of usual compound pages, like either thp or
> hugetlb.  IIUC page->flags can be uninitialized for those tail pages.

I don't think that's true.  It's my understanding that page->flags is
initialised for all pages in memmap at boot / hotplug / delayed-init
time.  So you can check things like zone, node, etc on literally any
page.  Contrariwise, those flags are not available in tail pages for
use by the entity that has allocated a compound page / large folio.

Also, I don't believe zone device pages support compound allocation.
I think they're always allocated as order-0.
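
The claim that the zone can be read from any page, head or tail, can be
pictured with a toy model of page->flags. The bit layout below is purely
illustrative (the real shift and width depend on the kernel configuration),
and the toy_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative layout only: zone number kept in the top two bits. */
#define TOY_ZONES_WIDTH		2
#define TOY_ZONES_PGSHIFT	(64 - TOY_ZONES_WIDTH)
#define TOY_ZONES_MASK		((((uint64_t)1) << TOY_ZONES_WIDTH) - 1)

struct toy_page {
	uint64_t flags;
};

static void toy_set_zone(struct toy_page *page, unsigned int zone)
{
	assert(zone <= TOY_ZONES_MASK);
	page->flags &= ~(TOY_ZONES_MASK << TOY_ZONES_PGSHIFT);
	page->flags |= (uint64_t)zone << TOY_ZONES_PGSHIFT;
}

/* Same shape as a zone lookup: shift the flags down and mask. */
static unsigned int toy_page_zonenum(const struct toy_page *page)
{
	return (unsigned int)((page->flags >> TOY_ZONES_PGSHIFT) & TOY_ZONES_MASK);
}

/*
 * Models memmap initialisation: every struct page in the range gets its
 * zone bits set, regardless of whether it later becomes a head or a tail.
 */
static void toy_memmap_init(struct toy_page *pages, size_t n, unsigned int zone)
{
	for (size_t i = 0; i < n; i++) {
		pages[i].flags = 0;
		toy_set_zone(&pages[i], zone);
	}
}
```

In this model, other users may scribble on the low flag bits of a tail page
without affecting the zone lookup, since the zone bits were written once at
init time and live in a separate bit range.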



[PATCH 1/2] drm/panel: boe-tv101wum-nl6: Drop macros and open code sequences

2023-06-15 Thread Linus Walleij
The boe-tv101wum-nl6 is reinventing the mechanism to send command
sequences that we usually nix during review, but I missed this one
so fixing it up myself.

Also use the explicit function calls to mipi_dsi_dcs_exit_sleep_mode()
and mipi_dsi_dcs_set_display_on() instead of reimplementing them
with homegrown sequences.

Signed-off-by: Linus Walleij 
---
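The change of shape described above (a per-panel init function reached
through a function pointer, instead of a table of command structs walked by
a homegrown interpreter) can be sketched in user space. The fake device, the
TOY_DCS_WRITE_SEQ macro and all toy_* names are hypothetical and only loosely
modelled on the write-sequence helper used in the patch; the two byte pairs
are the first two commands from the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fake device that records every byte "sent" to it. */
struct toy_dsi {
	uint8_t log[64];
	size_t len;
};

static int toy_dcs_write(struct toy_dsi *dsi, const uint8_t *data, size_t n)
{
	assert(dsi != NULL);
	if (dsi->len + n > sizeof(dsi->log))
		return -1;
	memcpy(dsi->log + dsi->len, data, n);
	dsi->len += n;
	return 0;
}

/* Build a byte array from the arguments, send it, bail out on error. */
#define TOY_DCS_WRITE_SEQ(dsi, ...)					\
	do {								\
		static const uint8_t d[] = { __VA_ARGS__ };		\
		int err = toy_dcs_write(dsi, d, sizeof(d));		\
		if (err < 0)						\
			return err;					\
	} while (0)

/* Open-coded init sequence: plain function calls, no command table. */
static int toy_panel_a_init(struct toy_dsi *dsi)
{
	TOY_DCS_WRITE_SEQ(dsi, 0xFF, 0x20);
	TOY_DCS_WRITE_SEQ(dsi, 0xFB, 0x01);
	return 0;
}

/* The descriptor carries a function pointer instead of a table. */
struct toy_panel_desc {
	int (*init)(struct toy_dsi *dsi);
};

static const struct toy_panel_desc toy_panel_a = {
	.init = toy_panel_a_init,
};
```

With this shape, delays or conditional steps can be written as ordinary C
inside the init function rather than being encoded as special command types.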
 drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c | 2408 
 1 file changed, 1193 insertions(+), 1215 deletions(-)

diff --git a/drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c b/drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c
index 783234ae0f57..d19d30e134dd 100644
--- a/drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c
+++ b/drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c
@@ -33,7 +33,7 @@ struct panel_desc {
 
unsigned long mode_flags;
enum mipi_dsi_pixel_format format;
-   const struct panel_init_cmd *init_cmds;
+   int (*init)(struct mipi_dsi_device *dsi);
unsigned int lanes;
bool discharge_on_disable;
 };
@@ -54,1224 +54,1200 @@ struct boe_panel {
bool prepared;
 };
 
-enum dsi_cmd_type {
-   INIT_DCS_CMD,
-   DELAY_CMD,
-};
+static int boe_tv110c9m_init(struct mipi_dsi_device *dsi)
+{
+   int ret;
 
-struct panel_init_cmd {
-   enum dsi_cmd_type type;
-   size_t len;
-   const char *data;
-};
+   mipi_dsi_dcs_write_seq(dsi, 0xFF, 0x20);
+   mipi_dsi_dcs_write_seq(dsi, 0xFB, 0x01);
+   mipi_dsi_dcs_write_seq(dsi, 0x05, 0xD9);
+   mipi_dsi_dcs_write_seq(dsi, 0x07, 0x78);
+   mipi_dsi_dcs_write_seq(dsi, 0x08, 0x5A);
+   mipi_dsi_dcs_write_seq(dsi, 0x0D, 0x63);
+   mipi_dsi_dcs_write_seq(dsi, 0x0E, 0x91);
+   mipi_dsi_dcs_write_seq(dsi, 0x0F, 0x73);
+   mipi_dsi_dcs_write_seq(dsi, 0x95, 0xE6);
+   mipi_dsi_dcs_write_seq(dsi, 0x96, 0xF0);
+   mipi_dsi_dcs_write_seq(dsi, 0x30, 0x00);
+   mipi_dsi_dcs_write_seq(dsi, 0x6D, 0x66);
+   mipi_dsi_dcs_write_seq(dsi, 0x75, 0xA2);
+   mipi_dsi_dcs_write_seq(dsi, 0x77, 0x3B);
 
-#define _INIT_DCS_CMD(...) { \
-   .type = INIT_DCS_CMD, \
-   .len = sizeof((char[]){__VA_ARGS__}), \
-   .data = (char[]){__VA_ARGS__} }
-
-#define _INIT_DELAY_CMD(...) { \
-   .type = DELAY_CMD,\
-   .len = sizeof((char[]){__VA_ARGS__}), \
-   .data = (char[]){__VA_ARGS__} }
-
-static const struct panel_init_cmd boe_tv110c9m_init_cmd[] = {
-   _INIT_DCS_CMD(0xFF, 0x20),
-   _INIT_DCS_CMD(0xFB, 0x01),
-   _INIT_DCS_CMD(0x05, 0xD9),
-   _INIT_DCS_CMD(0x07, 0x78),
-   _INIT_DCS_CMD(0x08, 0x5A),
-   _INIT_DCS_CMD(0x0D, 0x63),
-   _INIT_DCS_CMD(0x0E, 0x91),
-   _INIT_DCS_CMD(0x0F, 0x73),
-   _INIT_DCS_CMD(0x95, 0xE6),
-   _INIT_DCS_CMD(0x96, 0xF0),
-   _INIT_DCS_CMD(0x30, 0x00),
-   _INIT_DCS_CMD(0x6D, 0x66),
-   _INIT_DCS_CMD(0x75, 0xA2),
-   _INIT_DCS_CMD(0x77, 0x3B),
+   mipi_dsi_dcs_write_seq(dsi, 0xB0, 0x00, 0x08, 0x00, 0x23, 0x00, 0x4D, 0x00, 0x6D, 0x00, 0x89, 0x00, 0xA1, 0x00, 0xB6, 0x00, 0xC9);
+   mipi_dsi_dcs_write_seq(dsi, 0xB1, 0x00, 0xDA, 0x01, 0x13, 0x01, 0x3C, 0x01, 0x7E, 0x01, 0xAB, 0x01, 0xF7, 0x02, 0x2F, 0x02, 0x31);
+   mipi_dsi_dcs_write_seq(dsi, 0xB2, 0x02, 0x67, 0x02, 0xA6, 0x02, 0xD1, 0x03, 0x08, 0x03, 0x2E, 0x03, 0x5B, 0x03, 0x6B, 0x03, 0x7B);
+   mipi_dsi_dcs_write_seq(dsi, 0xB3, 0x03, 0x8E, 0x03, 0xA2, 0x03, 0xB7, 0x03, 0xE7, 0x03, 0xFD, 0x03, 0xFF);
 
-   _INIT_DCS_CMD(0xB0, 0x00, 0x08, 0x00, 0x23, 0x00, 0x4D, 0x00, 0x6D, 0x00, 0x89, 0x00, 0xA1, 0x00, 0xB6, 0x00, 0xC9),
-   _INIT_DCS_CMD(0xB1, 0x00, 0xDA, 0x01, 0x13, 0x01, 0x3C, 0x01, 0x7E, 0x01, 0xAB, 0x01, 0xF7, 0x02, 0x2F, 0x02, 0x31),
-   _INIT_DCS_CMD(0xB2, 0x02, 0x67, 0x02, 0xA6, 0x02, 0xD1, 0x03, 0x08, 0x03, 0x2E, 0x03, 0x5B, 0x03, 0x6B, 0x03, 0x7B),
-   _INIT_DCS_CMD(0xB3, 0x03, 0x8E, 0x03, 0xA2, 0x03, 0xB7, 0x03, 0xE7, 0x03, 0xFD, 0x03, 0xFF),
-
-   _INIT_DCS_CMD(0xB4, 0x00, 0x08, 0x00, 0x23, 0x00, 0x4D, 0x00, 0x6D, 0x00, 0x89, 0x00, 0xA1, 0x00, 0xB6, 0x00, 0xC9),
-   _INIT_DCS_CMD(0xB5, 0x00, 0xDA, 0x01, 0x13, 0x01, 0x3C, 0x01, 0x7E, 0x01, 0xAB, 0x01, 0xF7, 0x02, 0x2F, 0x02, 0x31),
-   _INIT_DCS_CMD(0xB6, 0x02, 0x67, 0x02, 0xA6, 0x02, 0xD1, 0x03, 0x08, 0x03, 0x2E, 0x03, 0x5B, 0x03, 0x6B, 0x03, 0x7B),
-   _INIT_DCS_CMD(0xB7, 0x03, 0x8E, 0x03, 0xA2, 0x03, 0xB7, 0x03, 0xE7, 0x03, 0xFD, 0x03, 0xFF),
-   _INIT_DCS_CMD(0xB8, 0x00, 0x08, 0x00, 0x23, 0x00, 0x4D, 0x00, 0x6D, 0x00, 0x89, 0x00, 0xA1, 0x00, 0xB6, 0x00, 0xC9),
-   _INIT_DCS_CMD(0xB9, 0x00, 0xDA, 0x01, 0x13, 0x01, 0x3C, 0x01, 0x7E, 0x01, 0xAB, 0x01, 0xF7, 0x02, 0x2F, 0x02, 0x31),
-   _INIT_DCS_CMD(0xBA, 0x02, 0x67, 0x02, 0xA6, 0x02, 0xD1, 0x03, 0x08, 0x03, 0x2E, 0x03, 0x5B, 0x03, 0x6B, 0x03, 0x7B),
-   _INIT_DCS_CMD(0xBB, 0x03, 0x8E, 0x03, 0xA2, 0x03, 0xB7, 0x03, 0xE7, 0x03, 0xFD, 0x03, 0xFF),
-
-   _INIT_DCS_CMD(0xFF, 0x21),
-   _INIT_DCS_CMD(0xFB, 0x01),
-
-   _INIT_DCS_CMD(0xB0, 0x00, 0x00, 

[PATCH 2/2] drm/panel: boe-tv101wum-nl6: Drop surplus prepare tracking

2023-06-15 Thread Linus Walleij
The DRM panel core already keeps track of whether the panel is
prepared, so do not reimplement this.

Signed-off-by: Linus Walleij 
---
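Assuming, as the commit message states, that the core keeps the prepared
state, the division of labour can be sketched like this. All toy_* names are
hypothetical and this is not the DRM panel API; the call counters exist only
to make the effect observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_panel;

struct toy_panel_funcs {
	int (*prepare)(struct toy_panel *panel);
	int (*unprepare)(struct toy_panel *panel);
};

struct toy_panel {
	const struct toy_panel_funcs *funcs;
	bool prepared;		/* owned by the core, not by the driver */
	int hw_prepare_calls;	/* instrumentation for this sketch only */
	int hw_unprepare_calls;
};

/* Core wrapper: skips the driver callback if already prepared. */
static int toy_core_prepare(struct toy_panel *panel)
{
	int ret;

	assert(panel != NULL && panel->funcs != NULL);
	if (panel->prepared)
		return 0;

	ret = panel->funcs->prepare(panel);
	if (ret == 0)
		panel->prepared = true;
	return ret;
}

/* Core wrapper: skips the driver callback if not prepared. */
static int toy_core_unprepare(struct toy_panel *panel)
{
	int ret;

	assert(panel != NULL && panel->funcs != NULL);
	if (!panel->prepared)
		return 0;

	ret = panel->funcs->unprepare(panel);
	if (ret == 0)
		panel->prepared = false;
	return ret;
}

/* Driver callbacks: no private "prepared" flag, just the hardware work. */
static int toy_driver_prepare(struct toy_panel *panel)
{
	panel->hw_prepare_calls++;
	return 0;
}

static int toy_driver_unprepare(struct toy_panel *panel)
{
	panel->hw_unprepare_calls++;
	return 0;
}

static const struct toy_panel_funcs toy_driver_funcs = {
	.prepare = toy_driver_prepare,
	.unprepare = toy_driver_unprepare,
};
```

Calling the core wrappers twice in a row reaches the driver only once, which
is why a second flag inside the driver adds nothing in this model.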
 drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c | 12 
 1 file changed, 12 deletions(-)

diff --git a/drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c b/drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c
index d19d30e134dd..412d4d99aec6 100644
--- a/drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c
+++ b/drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c
@@ -50,8 +50,6 @@ struct boe_panel {
struct regulator *avee;
struct regulator *avdd;
struct gpio_desc *enable_gpio;
-
-   bool prepared;
 };
 
 static int boe_tv110c9m_init(struct mipi_dsi_device *dsi)
@@ -1286,9 +1284,6 @@ static int boe_panel_unprepare(struct drm_panel *panel)
 {
struct boe_panel *boe = to_boe_panel(panel);
 
-   if (!boe->prepared)
-   return 0;
-
if (boe->desc->discharge_on_disable) {
regulator_disable(boe->avee);
regulator_disable(boe->avdd);
@@ -1307,8 +1302,6 @@ static int boe_panel_unprepare(struct drm_panel *panel)
regulator_disable(boe->pp3300);
}
 
-   boe->prepared = false;
-
return 0;
 }
 
@@ -1317,9 +1310,6 @@ static int boe_panel_prepare(struct drm_panel *panel)
struct boe_panel *boe = to_boe_panel(panel);
int ret;
 
-   if (boe->prepared)
-   return 0;
-
gpiod_set_value(boe->enable_gpio, 0);
usleep_range(1000, 1500);
 
@@ -1357,8 +1347,6 @@ static int boe_panel_prepare(struct drm_panel *panel)
}
}
 
-   boe->prepared = true;
-
return 0;
 
 poweroff:

-- 
2.34.1



[PATCH 0/2] Fix up the boe-tv101wum-nl6 panel driver

2023-06-15 Thread Linus Walleij
This is two patches fixing things I would normally complain about
in reviews, but alas I missed this one, so I go in and fix it up
myself.

Signed-off-by: Linus Walleij 
---
Linus Walleij (2):
  drm/panel: boe-tv101wum-nl6: Drop macros and open code sequences
  drm/panel: boe-tv101wum-nl6: Drop surplus prepare tracking

 drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c | 2420 
 1 file changed, 1193 insertions(+), 1227 deletions(-)
---
base-commit: ac9a78681b921877518763ba0e89202254349d1b
change-id: 20230615-fix-boe-tv101wum-nl6-6aa3fab22b44

Best regards,
-- 
Linus Walleij 



Re: [PATCH v2 2/8] dt-bindings: display: panel: mipi-dbi-spi: add shineworld lh133k compatible

2023-06-15 Thread Rob Herring
On Thu, Jun 15, 2023 at 12:35:25PM +0200, Noralf Trønnes wrote:
> 
> 
> On 6/14/23 14:32, Leonard Göhrs wrote:
> > The Shineworld LH133K is a 1.3" 240x240px RGB LCD with a MIPI DBI
> > compatible SPI interface.
> > The initialization procedure is quite basic with the exception of
> > requiring inverted colors.
> > A basic mipi-dbi-cmd[1] script to get the display running thus looks
> > like this:
> > 
> > $ cat shineworld,lh133k.txt
> > command 0x11 # exit sleep mode
> > delay 120
> > 
> > # The display seems to require display color inversion, so enable it.
> > command 0x21 # INVON
> > 
> > # Enable normal display mode (in contrast to partial display mode).
> > command 0x13 # NORON
> > command 0x29 # MIPI_DCS_SET_DISPLAY_ON
> > 
> > $ mipi-dbi-cmd shineworld,lh133k.bin shineworld,lh133k.txt
> > 
> > [1]: https://github.com/notro/panel-mipi-dbi
> > 
> > Signed-off-by: Leonard Göhrs 
> > Acked-by: Conor Dooley 
> > ---
> 
> Normally I would take this through drm-misc-next but -rc6 is the cutoff
> so if I do that it won't make it to 6.5. If the other patches make it to
> 6.5 the dtb checks will fail. I'm okay with the patches going through
> another tree if that's preferred. Let me know if I should apply the
> mipi-dbi-spi patches.

I've applied patches 1, 2, and 3. The netdev folks should pick up the 
other bindings.

Rob


[PATCH 5/5] drm/bridge: tc358762: Handle HS/VS polarity

2023-06-15 Thread Marek Vasut
Add support for handling the HS/VS sync signals polarity in the bridge
driver, otherwise e.g. DSIM bridge feeds the TC358762 inverted polarity
sync signals and the image is shifted to the left, up, and wobbly.

Signed-off-by: Marek Vasut 
---
Cc: Andrzej Hajda 
Cc: Daniel Vetter 
Cc: David Airlie 
Cc: Jernej Skrabec 
Cc: Jonas Karlman 
Cc: Laurent Pinchart 
Cc: Neil Armstrong 
Cc: Robert Foss 
Cc: dri-devel@lists.freedesktop.org
---
 drivers/gpu/drm/bridge/tc358762.c | 27 +--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/bridge/tc358762.c 
b/drivers/gpu/drm/bridge/tc358762.c
index a092e2096074f..46198af9eebbf 100644
--- a/drivers/gpu/drm/bridge/tc358762.c
+++ b/drivers/gpu/drm/bridge/tc358762.c
@@ -74,6 +74,7 @@ struct tc358762 {
struct regulator *regulator;
struct drm_bridge *panel_bridge;
struct gpio_desc *reset_gpio;
+   struct drm_display_mode mode;
bool pre_enabled;
int error;
 };
@@ -114,6 +115,8 @@ static inline struct tc358762 *bridge_to_tc358762(struct 
drm_bridge *bridge)
 
 static int tc358762_init(struct tc358762 *ctx)
 {
+   u32 lcdctrl;
+
tc358762_write(ctx, DSI_LANEENABLE,
   LANEENABLE_L0EN | LANEENABLE_CLEN);
tc358762_write(ctx, PPI_D0S_CLRSIPOCOUNT, 5);
@@ -123,8 +126,18 @@ static int tc358762_init(struct tc358762 *ctx)
tc358762_write(ctx, PPI_LPTXTIMECNT, LPX_PERIOD);
 
tc358762_write(ctx, SPICMR, 0x00);
-   tc358762_write(ctx, LCDCTRL, LCDCTRL_VSDELAY(1) | LCDCTRL_RGB888 |
-LCDCTRL_UNK6 | LCDCTRL_VTGEN);
+
+   lcdctrl = LCDCTRL_VSDELAY(1) | LCDCTRL_RGB888 |
+ LCDCTRL_UNK6 | LCDCTRL_VTGEN;
+
+   if (ctx->mode.flags & DRM_MODE_FLAG_NHSYNC)
+   lcdctrl |= LCDCTRL_HSPOL;
+
+   if (ctx->mode.flags & DRM_MODE_FLAG_NVSYNC)
+   lcdctrl |= LCDCTRL_VSPOL;
+
+   tc358762_write(ctx, LCDCTRL, lcdctrl);
+
tc358762_write(ctx, SYSCTRL, 0x040f);
msleep(100);
 
@@ -194,6 +207,15 @@ static int tc358762_attach(struct drm_bridge *bridge,
 bridge, flags);
 }
 
+static void tc358762_bridge_mode_set(struct drm_bridge *bridge,
+const struct drm_display_mode *mode,
+const struct drm_display_mode *adj)
+{
+   struct tc358762 *ctx = bridge_to_tc358762(bridge);
+
+   drm_mode_copy(&ctx->mode, mode);
+}
+
 static const struct drm_bridge_funcs tc358762_bridge_funcs = {
.atomic_post_disable = tc358762_post_disable,
.atomic_pre_enable = tc358762_pre_enable,
@@ -202,6 +224,7 @@ static const struct drm_bridge_funcs tc358762_bridge_funcs 
= {
.atomic_destroy_state = drm_atomic_helper_bridge_destroy_state,
.atomic_reset = drm_atomic_helper_bridge_reset,
.attach = tc358762_attach,
+   .mode_set = tc358762_bridge_mode_set,
 };
 
 static int tc358762_parse_dt(struct tc358762 *ctx)
-- 
2.39.2
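
The polarity handling added by this patch can be sketched in user-space C as
below. This is only an illustration under stated assumptions: the helper name
lcdctrl_apply_polarity() is hypothetical, the LCDCTRL bit positions are the
guessed ones from patch 4/5 of this series, and the DRM_MODE_FLAG_* values are
taken to match the DRM UAPI header.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n)			(UINT32_C(1) << (n))

/* Mode flag values, assumed to match the DRM UAPI header (drm_mode.h). */
#define DRM_MODE_FLAG_NHSYNC	BIT(1)
#define DRM_MODE_FLAG_NVSYNC	BIT(3)

/* Polarity bits as guessed in patch 4/5 of this series. */
#define LCDCTRL_HSPOL		BIT(17)
#define LCDCTRL_VSPOL		BIT(19)

/* The two polarity bits must be independent of each other. */
static_assert((LCDCTRL_HSPOL & LCDCTRL_VSPOL) == 0, "polarity bits overlap");

/*
 * Hypothetical helper: set the polarity bits in an LCDCTRL value when the
 * mode asks for negative HSYNC and/or VSYNC, as the patched init code does.
 */
static uint32_t lcdctrl_apply_polarity(uint32_t lcdctrl, uint32_t mode_flags)
{
	if (mode_flags & DRM_MODE_FLAG_NHSYNC)
		lcdctrl |= LCDCTRL_HSPOL;

	if (mode_flags & DRM_MODE_FLAG_NVSYNC)
		lcdctrl |= LCDCTRL_VSPOL;

	return lcdctrl;
}
```

With no negative-sync flags set, the value passes through unchanged, so modes
that do not set these flags keep the previous register contents.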



[PATCH 4/5] drm/bridge: tc358762: Guess the meaning of LCDCTRL bits

2023-06-15 Thread Marek Vasut
The register content and behavior are very similar to TC358764 VP_CTRL.
All the bits except for unknown bit 6 also seem to match, even though
the datasheet is just not available. Add a comment and reuse the bit
definitions.

Signed-off-by: Marek Vasut 
---
Cc: Andrzej Hajda 
Cc: Daniel Vetter 
Cc: David Airlie 
Cc: Jernej Skrabec 
Cc: Jonas Karlman 
Cc: Laurent Pinchart 
Cc: Neil Armstrong 
Cc: Robert Foss 
Cc: dri-devel@lists.freedesktop.org
---
 drivers/gpu/drm/bridge/tc358762.c | 16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/bridge/tc358762.c 
b/drivers/gpu/drm/bridge/tc358762.c
index 77f2ec9de9e59..a092e2096074f 100644
--- a/drivers/gpu/drm/bridge/tc358762.c
+++ b/drivers/gpu/drm/bridge/tc358762.c
@@ -41,8 +41,17 @@
 #define DSI_LANEENABLE 0x0210 /* Enables each lane */
 #define DSI_RX_START   1
 
-/* LCDC/DPI Host Registers */
-#define LCDCTRL 0x0420
+/* LCDC/DPI Host Registers, based on guesswork that this matches TC358764 */
+#define LCDCTRL 0x0420 /* Video Path Control */
+#define LCDCTRL_MSF BIT(0) /* Magic square in RGB666 */
+#define LCDCTRL_VTGEN  BIT(4) /* Use chip clock for timing */
+#define LCDCTRL_UNK6   BIT(6) /* Unknown */
+#define LCDCTRL_EVTMODE BIT(5) /* Event mode */
+#define LCDCTRL_RGB888 BIT(8) /* RGB888 mode */
+#define LCDCTRL_HSPOL  BIT(17) /* Polarity of HSYNC signal */
+#define LCDCTRL_DEPOL  BIT(18) /* Polarity of DE signal */
+#define LCDCTRL_VSPOL  BIT(19) /* Polarity of VSYNC signal */
+#define LCDCTRL_VSDELAY(v) (((v) & 0xfff) << 20) /* VSYNC delay */
 
 /* SPI Master Registers */
 #define SPICMR 0x0450
@@ -114,7 +123,8 @@ static int tc358762_init(struct tc358762 *ctx)
tc358762_write(ctx, PPI_LPTXTIMECNT, LPX_PERIOD);
 
tc358762_write(ctx, SPICMR, 0x00);
-   tc358762_write(ctx, LCDCTRL, 0x00100150);
+   tc358762_write(ctx, LCDCTRL, LCDCTRL_VSDELAY(1) | LCDCTRL_RGB888 |
+LCDCTRL_UNK6 | LCDCTRL_VTGEN);
tc358762_write(ctx, SYSCTRL, 0x040f);
msleep(100);
 
-- 
2.39.2
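
As a quick sanity check that replacing the magic number is not a functional
change, the arithmetic can be reproduced in user-space C. This is a sketch
only: BIT() is redefined locally, the bit definitions are copied from the
patch above, and lcdctrl_default() is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n)			(UINT32_C(1) << (n))

/* Bit definitions copied from the patch above. */
#define LCDCTRL_VTGEN		BIT(4)
#define LCDCTRL_UNK6		BIT(6)
#define LCDCTRL_RGB888		BIT(8)
#define LCDCTRL_VSDELAY(v)	(((uint32_t)(v) & 0xfff) << 20)

/*
 * 0x00100000 (VSDELAY = 1) | 0x100 (RGB888) | 0x40 (UNK6) | 0x10 (VTGEN)
 * adds up to the old magic value 0x00100150; checked at compile time.
 */
static_assert((LCDCTRL_VSDELAY(1) | LCDCTRL_RGB888 |
	       LCDCTRL_UNK6 | LCDCTRL_VTGEN) == 0x00100150,
	      "symbolic LCDCTRL value differs from the old constant");

/* Hypothetical helper returning the value the patched init code writes. */
static uint32_t lcdctrl_default(void)
{
	return LCDCTRL_VSDELAY(1) | LCDCTRL_RGB888 |
	       LCDCTRL_UNK6 | LCDCTRL_VTGEN;
}
```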



[PATCH 2/5] drm/bridge: tc358762: Switch to atomic ops

2023-06-15 Thread Marek Vasut
Switch the bridge driver over to atomic ops. No functional change.

Signed-off-by: Marek Vasut 
---
Cc: Andrzej Hajda 
Cc: Daniel Vetter 
Cc: David Airlie 
Cc: Jernej Skrabec 
Cc: Jonas Karlman 
Cc: Laurent Pinchart 
Cc: Neil Armstrong 
Cc: Robert Foss 
Cc: dri-devel@lists.freedesktop.org
---
 drivers/gpu/drm/bridge/tc358762.c | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/bridge/tc358762.c 
b/drivers/gpu/drm/bridge/tc358762.c
index df9703eacab1f..5e00c08b99540 100644
--- a/drivers/gpu/drm/bridge/tc358762.c
+++ b/drivers/gpu/drm/bridge/tc358762.c
@@ -126,7 +126,7 @@ static int tc358762_init(struct tc358762 *ctx)
return tc358762_clear_error(ctx);
 }
 
-static void tc358762_post_disable(struct drm_bridge *bridge)
+static void tc358762_post_disable(struct drm_bridge *bridge, struct 
drm_bridge_state *state)
 {
struct tc358762 *ctx = bridge_to_tc358762(bridge);
int ret;
@@ -148,7 +148,7 @@ static void tc358762_post_disable(struct drm_bridge *bridge)
dev_err(ctx->dev, "error disabling regulators (%d)\n", ret);
 }
 
-static void tc358762_pre_enable(struct drm_bridge *bridge)
+static void tc358762_pre_enable(struct drm_bridge *bridge, struct 
drm_bridge_state *state)
 {
struct tc358762 *ctx = bridge_to_tc358762(bridge);
int ret;
@@ -165,7 +165,7 @@ static void tc358762_pre_enable(struct drm_bridge *bridge)
ctx->pre_enabled = true;
 }
 
-static void tc358762_enable(struct drm_bridge *bridge)
+static void tc358762_enable(struct drm_bridge *bridge, struct drm_bridge_state 
*state)
 {
struct tc358762 *ctx = bridge_to_tc358762(bridge);
int ret;
@@ -185,9 +185,12 @@ static int tc358762_attach(struct drm_bridge *bridge,
 }
 
 static const struct drm_bridge_funcs tc358762_bridge_funcs = {
-   .post_disable = tc358762_post_disable,
-   .pre_enable = tc358762_pre_enable,
-   .enable = tc358762_enable,
+   .atomic_post_disable = tc358762_post_disable,
+   .atomic_pre_enable = tc358762_pre_enable,
+   .atomic_enable = tc358762_enable,
+   .atomic_duplicate_state = drm_atomic_helper_bridge_duplicate_state,
+   .atomic_destroy_state = drm_atomic_helper_bridge_destroy_state,
+   .atomic_reset = drm_atomic_helper_bridge_reset,
.attach = tc358762_attach,
 };
 
-- 
2.39.2



[PATCH 3/5] drm/bridge: tc358762: Instruct DSI host to generate HSE packets

2023-06-15 Thread Marek Vasut
This bridge seems to need the HSE packet, otherwise the image is
shifted up and corrupted at the bottom. This makes the bridge
work with Samsung DSIM on i.MX8MM and i.MX8MP.

Signed-off-by: Marek Vasut 
---
Cc: Andrzej Hajda 
Cc: Daniel Vetter 
Cc: David Airlie 
Cc: Jernej Skrabec 
Cc: Jonas Karlman 
Cc: Laurent Pinchart 
Cc: Neil Armstrong 
Cc: Robert Foss 
Cc: dri-devel@lists.freedesktop.org
---
 drivers/gpu/drm/bridge/tc358762.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/bridge/tc358762.c 
b/drivers/gpu/drm/bridge/tc358762.c
index 5e00c08b99540..77f2ec9de9e59 100644
--- a/drivers/gpu/drm/bridge/tc358762.c
+++ b/drivers/gpu/drm/bridge/tc358762.c
@@ -241,7 +241,7 @@ static int tc358762_probe(struct mipi_dsi_device *dsi)
dsi->lanes = 1;
dsi->format = MIPI_DSI_FMT_RGB888;
dsi->mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_SYNC_PULSE |
- MIPI_DSI_MODE_LPM;
+ MIPI_DSI_MODE_LPM | MIPI_DSI_MODE_VIDEO_HSE;
 
ret = tc358762_parse_dt(ctx);
if (ret < 0)
-- 
2.39.2



[PATCH 1/5] drm/bridge: tc358762: Split register programming from pre-enable to enable

2023-06-15 Thread Marek Vasut
Move the register programming part, which actually enables the bridge and
makes it push data out of its DPI side, into the enable callback. A DSI
host such as DSIM may not be able to transmit commands in pre_enable;
moving the register programming into enable ensures that it can.

Signed-off-by: Marek Vasut 
---
Cc: Andrzej Hajda 
Cc: Daniel Vetter 
Cc: David Airlie 
Cc: Jernej Skrabec 
Cc: Jonas Karlman 
Cc: Laurent Pinchart 
Cc: Neil Armstrong 
Cc: Robert Foss 
Cc: dri-devel@lists.freedesktop.org
---
 drivers/gpu/drm/bridge/tc358762.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/bridge/tc358762.c 
b/drivers/gpu/drm/bridge/tc358762.c
index 5641395fd310e..df9703eacab1f 100644
--- a/drivers/gpu/drm/bridge/tc358762.c
+++ b/drivers/gpu/drm/bridge/tc358762.c
@@ -162,11 +162,17 @@ static void tc358762_pre_enable(struct drm_bridge *bridge)
usleep_range(5000, 10000);
}
 
+   ctx->pre_enabled = true;
+}
+
+static void tc358762_enable(struct drm_bridge *bridge)
+{
+   struct tc358762 *ctx = bridge_to_tc358762(bridge);
+   int ret;
+
ret = tc358762_init(ctx);
if (ret < 0)
dev_err(ctx->dev, "error initializing bridge (%d)\n", ret);
-
-   ctx->pre_enabled = true;
 }
 
 static int tc358762_attach(struct drm_bridge *bridge,
@@ -181,6 +187,7 @@ static int tc358762_attach(struct drm_bridge *bridge,
 static const struct drm_bridge_funcs tc358762_bridge_funcs = {
.post_disable = tc358762_post_disable,
.pre_enable = tc358762_pre_enable,
+   .enable = tc358762_enable,
.attach = tc358762_attach,
 };
 
-- 
2.39.2



[PATCH] drm/bridge: tc358764: Use BIT() macro for actual bits

2023-06-15 Thread Marek Vasut
None of these four bits are bitfields, use BIT() macro and treat
them as bits. No functional change.

Signed-off-by: Marek Vasut 
---
Cc: Andrzej Hajda 
Cc: Daniel Vetter 
Cc: David Airlie 
Cc: Jernej Skrabec 
Cc: Jonas Karlman 
Cc: Laurent Pinchart 
Cc: Neil Armstrong 
Cc: Robert Foss 
Cc: dri-devel@lists.freedesktop.org
---
 drivers/gpu/drm/bridge/tc358764.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/bridge/tc358764.c 
b/drivers/gpu/drm/bridge/tc358764.c
index f85654f1b1045..6a4cd313f5281 100644
--- a/drivers/gpu/drm/bridge/tc358764.c
+++ b/drivers/gpu/drm/bridge/tc358764.c
@@ -42,10 +42,10 @@
 
 /* Video path registers */
 #define VP_CTRL 0x0450 /* Video Path Control */
-#define VP_CTRL_MSF(v) FLD_VAL(v, 0, 0) /* Magic square in RGB666 */
-#define VP_CTRL_VTGEN(v)   FLD_VAL(v, 4, 4) /* Use chip clock for timing */
-#define VP_CTRL_EVTMODE(v) FLD_VAL(v, 5, 5) /* Event mode */
-#define VP_CTRL_RGB888(v)  FLD_VAL(v, 8, 8) /* RGB888 mode */
+#define VP_CTRL_MSF BIT(0) /* Magic square in RGB666 */
+#define VP_CTRL_VTGEN  BIT(4) /* Use chip clock for timing */
+#define VP_CTRL_EVTMODE BIT(5) /* Event mode */
+#define VP_CTRL_RGB888 BIT(8) /* RGB888 mode */
 #define VP_CTRL_VSDELAY(v) FLD_VAL(v, 31, 20) /* VSYNC delay */
 #define VP_CTRL_HSPOL  BIT(17) /* Polarity of HSYNC signal */
 #define VP_CTRL_DEPOL  BIT(18) /* Polarity of DE signal */
@@ -233,8 +233,8 @@ static int tc358764_init(struct tc358764 *ctx)
tc358764_write(ctx, DSI_STARTDSI, DSI_RX_START);
 
/* configure video path */
-   tc358764_write(ctx, VP_CTRL, VP_CTRL_VSDELAY(15) | VP_CTRL_RGB888(1) |
-  VP_CTRL_EVTMODE(1) | VP_CTRL_HSPOL | VP_CTRL_VSPOL);
+   tc358764_write(ctx, VP_CTRL, VP_CTRL_VSDELAY(15) | VP_CTRL_RGB888 |
+  VP_CTRL_EVTMODE | VP_CTRL_HSPOL | VP_CTRL_VSPOL);
 
/* reset PHY */
tc358764_write(ctx, LV_PHY0, LV_PHY0_RST(1) |
-- 
2.39.2



[PATCH] drm/panel: simple: Add Powertip PH800480T013 drm_display_mode flags

2023-06-15 Thread Marek Vasut
Add missing drm_display_mode DRM_MODE_FLAG_NVSYNC | DRM_MODE_FLAG_NHSYNC
flags. Those are used by various bridges in the pipeline to correctly
configure their sync signal polarity.

Fixes: d69de69f2be1 ("drm/panel: simple: Add Powertip PH800480T013 panel")
Signed-off-by: Marek Vasut 
---
Cc: Daniel Vetter 
Cc: David Airlie 
Cc: Neil Armstrong 
Cc: Sam Ravnborg 
Cc: dri-devel@lists.freedesktop.org
---
 drivers/gpu/drm/panel/panel-simple.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/panel/panel-simple.c 
b/drivers/gpu/drm/panel/panel-simple.c
index b81b21901940b..a0f4302de130a 100644
--- a/drivers/gpu/drm/panel/panel-simple.c
+++ b/drivers/gpu/drm/panel/panel-simple.c
@@ -3202,6 +3202,7 @@ static const struct drm_display_mode 
powertip_ph800480t013_idf02_mode = {
.vsync_start = 480 + 49,
.vsync_end = 480 + 49 + 2,
.vtotal = 480 + 49 + 2 + 22,
+   .flags = DRM_MODE_FLAG_NVSYNC | DRM_MODE_FLAG_NHSYNC,
 };
 
 static const struct panel_desc powertip_ph800480t013_idf02  = {
-- 
2.39.2



[PATCH] drm: bridge: samsung-dsim: Drain command transfer FIFO before transfer

2023-06-15 Thread Marek Vasut
Wait until the command transfer FIFO is empty before loading in the next
command. The previous behavior where the code waited until command transfer
FIFO was not full suffered from transfer corruption, where the last command
in the FIFO could be overwritten in case the FIFO indicates not full, but
also does not have enough space to store another transfer yet.

Signed-off-by: Marek Vasut 
---
Cc: Andrzej Hajda 
Cc: Daniel Vetter 
Cc: David Airlie 
Cc: Inki Dae 
Cc: Jagan Teki 
Cc: Jernej Skrabec 
Cc: Jonas Karlman 
Cc: Laurent Pinchart 
Cc: Marek Szyprowski 
Cc: Neil Armstrong 
Cc: Robert Foss 
Cc: dri-devel@lists.freedesktop.org
---
 drivers/gpu/drm/bridge/samsung-dsim.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/bridge/samsung-dsim.c 
b/drivers/gpu/drm/bridge/samsung-dsim.c
index 043b8109e64aa..9b7a00bafeaaa 100644
--- a/drivers/gpu/drm/bridge/samsung-dsim.c
+++ b/drivers/gpu/drm/bridge/samsung-dsim.c
@@ -1009,7 +1009,7 @@ static int samsung_dsim_wait_for_hdr_fifo(struct 
samsung_dsim *dsi)
do {
u32 reg = samsung_dsim_read(dsi, DSIM_FIFOCTRL_REG);
 
-   if (!(reg & DSIM_SFR_HEADER_FULL))
+   if (reg & DSIM_SFR_HEADER_EMPTY)
return 0;
 
if (!cond_resched())
-- 
2.39.2



Re: [PATCH v8 07/18] drm/msm/a6xx: Add a helper for software-resetting the GPU

2023-06-15 Thread Akhil P Oommen
On Thu, Jun 15, 2023 at 12:34:06PM +0200, Konrad Dybcio wrote:
> 
> On 6.06.2023 19:18, Akhil P Oommen wrote:
> > On Mon, May 29, 2023 at 03:52:26PM +0200, Konrad Dybcio wrote:
> >>
> >> Introduce a6xx_gpu_sw_reset() in preparation for adding GMU wrapper
> >> GPUs and reuse it in a6xx_gmu_force_off().
> >>
> >> This helper, contrary to the original usage in GMU code paths, adds
> >> a write memory barrier which together with the necessary delay should
> >> ensure that the reset is never deasserted too quickly due to e.g. OoO
> >> execution going crazy.
> >>
> >> Signed-off-by: Konrad Dybcio 
> >> ---
> >>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c |  3 +--
> >>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 11 +++
> >>  drivers/gpu/drm/msm/adreno/a6xx_gpu.h |  1 +
> >>  3 files changed, 13 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c 
> >> b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> >> index b86be123ecd0..5ba8cba69383 100644
> >> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> >> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> >> @@ -899,8 +899,7 @@ static void a6xx_gmu_force_off(struct a6xx_gmu *gmu)
> >>a6xx_bus_clear_pending_transactions(adreno_gpu, true);
> >>  
> >>/* Reset GPU core blocks */
> >> -  gpu_write(gpu, REG_A6XX_RBBM_SW_RESET_CMD, 1);
> >> -  udelay(100);
> >> +  a6xx_gpu_sw_reset(gpu, true);
> >>  }
> >>  
> >>  static void a6xx_gmu_set_initial_freq(struct msm_gpu *gpu, struct 
> >> a6xx_gmu *gmu)
> >> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> >> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> >> index e3ac3f045665..083ccb5bcb4e 100644
> >> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> >> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> >> @@ -1634,6 +1634,17 @@ void a6xx_bus_clear_pending_transactions(struct 
> >> adreno_gpu *adreno_gpu, bool gx_
> >>gpu_write(gpu, REG_A6XX_GBIF_HALT, 0x0);
> >>  }
> >>  
> >> +void a6xx_gpu_sw_reset(struct msm_gpu *gpu, bool assert)
> >> +{
> >> +  gpu_write(gpu, REG_A6XX_RBBM_SW_RESET_CMD, assert);
> >> +  /* Add a barrier to avoid bad surprises */
> > Can you please make this comment a bit more clear? Highlight that we
> > should ensure the register is posted at hw before polling.
> > 
> > I think this barrier is required only during assert.
> Generally it should not be strictly required at all, but I'm thinking
> that it'd be good to keep it in both cases, so that:
> 
> if (assert)
>   we don't keep writing things to the GPU if it's in reset
> else
>   we don't start writing things to the GPU before it comes
>   out of reset
> 
> Also, if you squint hard enough at the commit message, you'll notice
> I intended for this to only be a wmb, but for some reason generalized
> it.. Perhaps that's another thing I should fix!
> for v9..

wmb() doesn't provide any ordering guarantee with the delay loop.
A common practice is to just read back the same register before
the loop because a readl followed by delay() is guaranteed to be ordered.
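
A minimal user-space sketch of that read-back pattern, for illustration only.
The register, the accessors and the delay are hypothetical stand-ins
(fake_reg_write(), fake_reg_read(), fake_udelay()); the real helper would use
the driver's own register accessors and udelay().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for a memory-mapped register and udelay(). */
static volatile uint32_t fake_sw_reset_reg;
static unsigned int fake_delay_us_total;

static void fake_reg_write(volatile uint32_t *reg, uint32_t val)
{
	*reg = val;
}

static uint32_t fake_reg_read(volatile uint32_t *reg)
{
	return *reg;
}

static void fake_udelay(unsigned int us)
{
	fake_delay_us_total += us;
}

/* Write the reset bit, read it back, and only then start the delay. */
static void sw_reset_sketch(int assert_reset)
{
	uint32_t want = assert_reset ? 1 : 0;
	uint32_t readback;

	fake_reg_write(&fake_sw_reset_reg, want);

	/* Read back the same register so the write is posted before waiting. */
	readback = fake_reg_read(&fake_sw_reset_reg);
	assert(readback == want);

	/* The reset line is held for at least 100 us when asserting. */
	if (assert_reset)
		fake_udelay(100);
}
```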

-Akhil.
> 
> Konrad
> > 
> > -Akhil.
> >> +  mb();
> >> +
> >> +  /* The reset line needs to be asserted for at least 100 us */
> >> +  if (assert)
> >> +  udelay(100);
> >> +}
> >> +
> >>  static int a6xx_pm_resume(struct msm_gpu *gpu)
> >>  {
> >>struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> >> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h 
> >> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> >> index 9580def06d45..aa70390ee1c6 100644
> >> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> >> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
> >> @@ -89,5 +89,6 @@ struct msm_gpu_state *a6xx_gpu_state_get(struct msm_gpu 
> >> *gpu);
> >>  int a6xx_gpu_state_put(struct msm_gpu_state *state);
> >>  
> >>  void a6xx_bus_clear_pending_transactions(struct adreno_gpu *adreno_gpu, 
> >> bool gx_off);
> >> +void a6xx_gpu_sw_reset(struct msm_gpu *gpu, bool assert);
> >>  
> >>  #endif /* __A6XX_GPU_H__ */
> >>
> >> -- 
> >> 2.40.1
> >>


Re: [PATCH 1/2] fbdev/offb: Update expected device name

2023-06-15 Thread Rob Herring
On Thu, Jun 15, 2023 at 03:21:07PM +0200, Michal Suchánek wrote:
> Hello,
> 
> On Thu, Jun 15, 2023 at 03:06:28PM +0200, Thomas Zimmermann wrote:
> > Hi
> > 
> > Am 15.06.23 um 15:03 schrieb Linux regression tracking (Thorsten Leemhuis):
> > > On 16.04.23 14:34, Salvatore Bonaccorso wrote:
> > > > 
> > > > On Wed, Apr 12, 2023 at 11:55:08AM +0200, Cyril Brulebois wrote:
> > > > > Since commit 241d2fb56a18 ("of: Make OF framebuffer device names 
> > > > > unique"),
> > > > > as spotted by Frédéric Bonnard, the historical "of-display" device is
> > > > > gone: the updated logic creates "of-display.0" instead, then as many
> > > > > "of-display.N" as required.
> > > > > 
> > > > > This means that offb no longer finds the expected device, which 
> > > > > prevents
> > > > > the Debian Installer from setting up its interface, at least on 
> > > > > ppc64el.
> > > > > 
> > > > > It might be better to iterate on all possible nodes, but updating the
> > > > > hardcoded device from "of-display" to "of-display.0" is confirmed to 
> > > > > fix
> > > > > the Debian Installer at the very least.
> 
> At the time this was proposed it was said that "of-display", is wrong,
> and that "of-display.0" must be used for the first device instead, and
> if something breaks an alias can be provided.
> 
> So how does one provide an alias so that offb can find "of-display.0" as
> "of-display"?

I'm not aware of any way. There isn't because device names and paths are 
not considered ABI. There are mechanisms for getting stable class device 
indices (e.g. i2c0, mmcblk0, fb0, fb1, etc.) though not implemented for 
fbN (and please don't add it). 

In any case, this should be an easy fix. Though if "linux,opened" or 
"linux,boot-display" is not set, then you'd still get "of-display.0":

diff --git a/drivers/of/platform.c b/drivers/of/platform.c
index 78ae84187449..e46482cef9c7 100644
--- a/drivers/of/platform.c
+++ b/drivers/of/platform.c
@@ -553,7 +553,7 @@ static int __init of_platform_default_populate_init(void)
if (!of_get_property(node, "linux,opened", NULL) ||
!of_get_property(node, "linux,boot-display", NULL))
continue;
-   dev = of_platform_device_create(node, "of-display.0", 
NULL);
+   dev = of_platform_device_create(node, "of-display", 
NULL);
of_node_put(node);
if (WARN_ON(!dev))
return -ENOMEM;


[PATCH v3] drm/vkms: Add support to 1D gamma LUT

2023-06-15 Thread Arthur Grillo
Support a 1D gamma LUT with interpolation for each color channel on the
VKMS driver. Add a check for the LUT length by creating
vkms_atomic_check().

Tested with:
igt@kms_color@gamma
igt@kms_color@legacy-gamma
igt@kms_color@invalid-gamma-lut-sizes

v2:
- Add interpolation between the values of the LUT (Simon Ser)

v3:
- s/ratio/delta (Pekka)
- s/color_channel/channel_value (Pekka)
- s/lut_area/lut_channel
- Store the `drm_color_lut`, `lut_length`, and
  `channel_value2index_ratio` inside a struct called `vkms_lut`
  (Pekka)
- Pre-compute some constants values used through the LUT procedure
  (Pekka)
- Change the switch statement to a cast to __u16* (Pekka)
- Make the apply_lut_to_channel_value return the computation result
  (Pekka)

Signed-off-by: Arthur Grillo 
---
 drivers/gpu/drm/vkms/vkms_composer.c | 82 
 drivers/gpu/drm/vkms/vkms_crtc.c |  5 ++
 drivers/gpu/drm/vkms/vkms_drv.c  | 20 ++-
 drivers/gpu/drm/vkms/vkms_drv.h  |  9 +++
 4 files changed, 115 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/vkms/vkms_composer.c 
b/drivers/gpu/drm/vkms/vkms_composer.c
index 906d3df40cdb..9e735a963b81 100644
--- a/drivers/gpu/drm/vkms/vkms_composer.c
+++ b/drivers/gpu/drm/vkms/vkms_composer.c
@@ -6,6 +6,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -89,6 +90,69 @@ static void fill_background(const struct pixel_argb_u16 
*background_color,
output_buffer->pixels[i] = *background_color;
 }
 
+// lerp(a, b, t) = a + (b - a) * t
+static u16 lerp_u16(u16 a, u16 b, s64 t)
+{
+   s64 a_fp = drm_int2fixp(a);
+   s64 b_fp = drm_int2fixp(b);
+
+   s64 delta = drm_fixp_mul(b_fp - a_fp,  t);
+
+   return drm_fixp2int(a_fp + delta);
+}
+
+static s64 get_lut_index(const struct vkms_color_lut *lut, u16 channel_value)
+{
+   s64 color_channel_fp = drm_int2fixp(channel_value);
+
+   return drm_fixp_mul(color_channel_fp, lut->channel_value2index_ratio);
+}
+
+enum lut_channel {
+   LUT_RED = 0,
+   LUT_GREEN,
+   LUT_BLUE,
+   LUT_RESERVED
+};
+
+static u16 apply_lut_to_channel_value(const struct vkms_color_lut *lut, u16 
channel_value,
+ enum lut_channel channel)
+{
+   s64 lut_index = get_lut_index(lut, channel_value);
+
+   /*
+* This checks if `struct drm_color_lut` had any gap added by the 
compiler
+* between the struct fields.
+*/
+   static_assert(sizeof(struct drm_color_lut) == sizeof(__u16) * 4);
+
+   u16 *floor_lut_value = (__u16 *)&lut->base[drm_fixp2int(lut_index)];
+   u16 *ceil_lut_value = (__u16 *)&lut->base[drm_fixp2int_ceil(lut_index)];
+
+   u16 floor_channel_value = floor_lut_value[channel];
+   u16 ceil_channel_value = ceil_lut_value[channel];
+
+   return lerp_u16(floor_channel_value, ceil_channel_value,
+   lut_index & DRM_FIXED_DECIMAL_MASK);
+}
+
+static void apply_lut(const struct vkms_crtc_state *crtc_state, struct 
line_buffer *output_buffer)
+{
+   if (!crtc_state->gamma_lut.base)
+   return;
+
+   if (!crtc_state->gamma_lut.lut_length)
+   return;
+
+   for (size_t x = 0; x < output_buffer->n_pixels; x++) {
+   struct pixel_argb_u16 *pixel = &output_buffer->pixels[x];
+
+   pixel->r = apply_lut_to_channel_value(&crtc_state->gamma_lut, 
pixel->r, LUT_RED);
+   pixel->g = apply_lut_to_channel_value(&crtc_state->gamma_lut, 
pixel->g, LUT_GREEN);
+   pixel->b = apply_lut_to_channel_value(&crtc_state->gamma_lut, 
pixel->b, LUT_BLUE);
+   }
+}
+
 /**
  * @wb_frame_info: The writeback frame buffer metadata
  * @crtc_state: The crtc state
@@ -128,6 +192,8 @@ static void blend(struct vkms_writeback_job *wb,
output_buffer);
}
 
+   apply_lut(crtc_state, output_buffer);
+
*crc32 = crc32_le(*crc32, (void *)output_buffer->pixels, 
row_size);
 
if (wb)
@@ -242,6 +308,22 @@ void vkms_composer_worker(struct work_struct *work)
crtc_state->frame_start = 0;
crtc_state->frame_end = 0;
crtc_state->crc_pending = false;
+
+   if (crtc->state->gamma_lut) {
+   s64 max_lut_index_fp;
+   s64 u16_max_fp = drm_int2fixp(0xffff);
+
+   crtc_state->gamma_lut.base = (struct drm_color_lut 
*)crtc->state->gamma_lut->data;
+   crtc_state->gamma_lut.lut_length =
+   crtc->state->gamma_lut->length / sizeof(struct 
drm_color_lut);
+   max_lut_index_fp = 
drm_int2fixp(crtc_state->gamma_lut.lut_length  - 1);
+   crtc_state->gamma_lut.channel_value2index_ratio = 
drm_fixp_div(max_lut_index_fp,
+  
u16_max_fp);
+
+   } else {
+   crtc_state->gamma_lut.base = NULL;
+  
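
The interpolation idea in this patch can be sketched in plain user-space C.
This is not the patch's code: it swaps the drm_fixed.h fixed-point helpers for
exact integer arithmetic, works on a single flat channel table, and
lut_lookup_lerp() is a hypothetical name used for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Map a 16-bit channel value onto a LUT with lut_length entries and
 * linearly interpolate between the two neighbouring entries.
 */
static uint16_t lut_lookup_lerp(const uint16_t *lut, size_t lut_length,
				uint16_t channel_value)
{
	assert(lut && lut_length >= 2);

	/* Position in the LUT, scaled by 65535 to stay in integers. */
	uint64_t pos = (uint64_t)channel_value * (lut_length - 1);
	size_t floor_idx = pos / UINT16_MAX;
	uint64_t frac = pos % UINT16_MAX;	/* numerator over 65535 */
	size_t ceil_idx = frac ? floor_idx + 1 : floor_idx;

	int64_t a = lut[floor_idx];
	int64_t b = lut[ceil_idx];

	/* lerp(a, b, t) = a + (b - a) * t, with t = frac / 65535 */
	return (uint16_t)(a + (b - a) * (int64_t)frac / UINT16_MAX);
}
```

With a two-entry table { 0, 65535 } this behaves as an identity mapping, and
the largest input always lands exactly on the last LUT entry, so the ceiling
index never runs past the end of the table.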

Re: [PATCH v4 3/3] drm/panel-fannal-c3004: Add fannal c3004 DSI panel

2023-06-15 Thread Linus Walleij
Hi Paulo,

thanks for your patch!

Overall this looks very good.

I doubt that the display controller is actually by Fannal, but I guess
you tried to find out? We usually try to identify the underlying display
controller so the driver can be named after it and reused for more
display panels.

Some minor questions/nitpicks below.

On Wed, Jun 7, 2023 at 5:11 PM Paulo Pavacic  wrote:

> +static int fannal_panel_enable(struct drm_panel *panel)
> +{
> +   struct mipi_dsi_device *dsi = to_mipi_dsi_device(panel->dev);
> +
> +   mipi_dsi_generic_write_seq(dsi, 0xFF, 0x77, 0x01, 0x00, 0x00, 0x13);
> +   mipi_dsi_generic_write_seq(dsi, 0xEF, 0x08);

Why is everything using mipi_dsi_generic_write_seq() instead of
mipi_dsi_dcs_write_seq()?

Especially here, because 0x11 is certainly a command:

> +   mipi_dsi_generic_write_seq(dsi, 0x11); //MIPI_DCS_EXIT_SLEEP_MODE

Instead use:

ret = mipi_dsi_dcs_exit_sleep_mode(dsi);
if (ret)
return ret;


> +   mipi_dsi_generic_write_seq(dsi, 0x29); //MIPI_DCS_SET_DISPLAY_ON

Instead use:

ret = mipi_dsi_dcs_set_display_on(dsi);
if (ret)
return ret;

Yours,
Linus Walleij


Re: [PATCH v9 02/14] mm: move page zone helpers from mm.h to mmzone.h

2023-06-15 Thread Peter Xu
Hello, all,

On Fri, Jul 15, 2022 at 10:05:09AM -0500, Alex Sierra wrote:
> +static inline enum zone_type page_zonenum(const struct page *page)
> +{
> + ASSERT_EXCLUSIVE_BITS(page->flags, ZONES_MASK << ZONES_PGSHIFT);
> + return (page->flags >> ZONES_PGSHIFT) & ZONES_MASK;
> +}

Sorry to hijack this patch - not directly relevant to the movement, but
relevant to this helper, so maybe I can leverage the cc list..

My question is whether page_zonenum() is ready for taking all kinds of tail
pages?

Zone device tail pages all look fine, per memmap_init_zone_device().  The
question was other kinds of usual compound pages, like either thp or
hugetlb.  IIUC page->flags can be uninitialized for those tail pages.

Asking because I noticed it seems possible that page_zonenum() can just
take any random tail page as input, e.g.:

try_grab_folio -> is_pci_p2pdma_page -> is_zone_device_page -> page_zonenum

I'm worried it'll just read fake things, but maybe I just missed something?

Thanks,

-- 
Peter Xu



Re: [PATCH 3/3] drm/amdgpu: use new scheduler accounting

2023-06-15 Thread Luben Tuikov
On 2023-06-15 07:56, Christian König wrote:
> Instead of implementing this ourself.

Spellcheck: "ourselves".

Acked-by: Luben Tuikov 

Regards,
Luben

> 
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 52 -
>  1 file changed, 8 insertions(+), 44 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> index 1445e030d788..f787a9b06d62 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c
> @@ -163,41 +163,6 @@ static unsigned int amdgpu_ctx_get_hw_prio(struct 
> amdgpu_ctx *ctx, u32 hw_ip)
>   return hw_prio;
>  }
>  
> -/* Calculate the time spend on the hw */
> -static ktime_t amdgpu_ctx_fence_time(struct dma_fence *fence)
> -{
> - struct drm_sched_fence *s_fence;
> -
> - if (!fence)
> - return ns_to_ktime(0);
> -
> - /* When the fence is not even scheduled it can't have spend time */
> - s_fence = to_drm_sched_fence(fence);
> - if (!test_bit(DMA_FENCE_FLAG_TIMESTAMP_BIT, &s_fence->scheduled.flags))
> - return ns_to_ktime(0);
> -
> - /* When it is still running account how much already spend */
> - if (!test_bit(DMA_FENCE_FLAG_TIMESTAMP_BIT, &s_fence->finished.flags))
> - return ktime_sub(ktime_get(), s_fence->scheduled.timestamp);
> -
> - return ktime_sub(s_fence->finished.timestamp,
> -  s_fence->scheduled.timestamp);
> -}
> -
> -static ktime_t amdgpu_ctx_entity_time(struct amdgpu_ctx *ctx,
> -   struct amdgpu_ctx_entity *centity)
> -{
> - ktime_t res = ns_to_ktime(0);
> - uint32_t i;
> -
> - spin_lock(&ctx->ring_lock);
> - for (i = 0; i < amdgpu_sched_jobs; i++) {
> - res = ktime_add(res, amdgpu_ctx_fence_time(centity->fences[i]));
> - }
> - spin_unlock(&ctx->ring_lock);
> - return res;
> -}
> -
>  static int amdgpu_ctx_init_entity(struct amdgpu_ctx *ctx, u32 hw_ip,
> const u32 ring)
>  {
> @@ -257,16 +222,15 @@ static int amdgpu_ctx_init_entity(struct amdgpu_ctx 
> *ctx, u32 hw_ip,
>  
>  static ktime_t amdgpu_ctx_fini_entity(struct amdgpu_ctx_entity *entity)
>  {
> - ktime_t res = ns_to_ktime(0);
> + ktime_t res;
>   int i;
>  
>   if (!entity)
> - return res;
> + return ns_to_ktime(0);
>  
> - for (i = 0; i < amdgpu_sched_jobs; ++i) {
> - res = ktime_add(res, amdgpu_ctx_fence_time(entity->fences[i]));
> + for (i = 0; i < amdgpu_sched_jobs; ++i)
>   dma_fence_put(entity->fences[i]);
> - }
> + res = drm_sched_entity_time_spend(&entity->entity);
>   drm_sched_entity_destroy(&entity->entity);
>   kfree(entity);
>   return res;
> @@ -718,9 +682,6 @@ uint64_t amdgpu_ctx_add_fence(struct amdgpu_ctx *ctx,
>   centity->sequence++;
>   spin_unlock(&ctx->ring_lock);
>  
> - atomic64_add(ktime_to_ns(amdgpu_ctx_fence_time(other)),
> -  &ctx->mgr->time_spend[centity->hw_ip]);
> -
>   dma_fence_put(other);
>   return seq;
>  }
> @@ -900,12 +861,15 @@ void amdgpu_ctx_mgr_usage(struct amdgpu_ctx_mgr *mgr,
>   for (hw_ip = 0; hw_ip < AMDGPU_HW_IP_NUM; ++hw_ip) {
>   for (i = 0; i < amdgpu_ctx_num_entities[hw_ip]; ++i) {
>   struct amdgpu_ctx_entity *centity;
> + struct drm_sched_entity *entity;
>   ktime_t spend;
>  
>   centity = ctx->entities[hw_ip][i];
>   if (!centity)
>   continue;
> - spend = amdgpu_ctx_entity_time(ctx, centity);
> +
> + entity = &centity->entity;
> + spend = drm_sched_entity_time_spend(entity);
>   usage[hw_ip] = ktime_add(usage[hw_ip], spend);
>   }
>   }



Re: [PATCH 1/3] drm/scheduler: implement hw time accounting

2023-06-15 Thread Luben Tuikov
On 2023-06-15 07:56, Christian König wrote:
> Multiple drivers came up with the requirement to measure how
> much time each submission spend on the hw.

"spends"

> 
> A previous attempt of accounting this had to be reverted because
> hw submissions can live longer than the entity originally
> issuing them.
> 
> Amdgpu on the other hand solves this task by keeping track of
> all the submissions and calculating how much time they have used
> on demand.
> 
> Move this approach over into the scheduler to provide an easy to
> use interface for all drivers.

Yeah, that'd be helpful.

> 
> Signed-off-by: Christian König 
> ---
>  drivers/gpu/drm/scheduler/sched_entity.c | 89 ++--
>  drivers/gpu/drm/scheduler/sched_fence.c  | 19 +
>  include/drm/gpu_scheduler.h  | 29 
>  3 files changed, 133 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_entity.c 
> b/drivers/gpu/drm/scheduler/sched_entity.c
> index 68e807ae136a..67307022a505 100644
> --- a/drivers/gpu/drm/scheduler/sched_entity.c
> +++ b/drivers/gpu/drm/scheduler/sched_entity.c
> @@ -62,7 +62,9 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
> unsigned int num_sched_list,
> atomic_t *guilty)
>  {
> - if (!(entity && sched_list && (num_sched_list == 0 || sched_list[0])))
> + unsigned int i, num_submissions;
> +
> + if (!entity || !sched_list)
>   return -EINVAL;
>  
>   memset(entity, 0, sizeof(struct drm_sched_entity));
> @@ -75,9 +77,6 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
>   RCU_INIT_POINTER(entity->last_scheduled, NULL);
>   RB_CLEAR_NODE(&entity->rb_tree_node);
>  
> - if(num_sched_list)
> - entity->rq = &sched_list[0]->sched_rq[entity->priority];
> -
>   init_completion(&entity->entity_idle);
>  
>   /* We start in an idle state. */
> @@ -88,11 +87,68 @@ int drm_sched_entity_init(struct drm_sched_entity *entity,
>  
>   atomic_set(&entity->fence_seq, 0);
>   entity->fence_context = dma_fence_context_alloc(2);
> + spin_lock_init(&entity->accounting_lock);
> +
> + /* We need to be able to init even unused entities */
> + if (!num_sched_list)
> + return 0;
> +
> + entity->rq = &sched_list[0]->sched_rq[entity->priority];
> +
> + num_submissions = 0;
> + for (i = 0; i < num_sched_list; ++i) {
> + if (!sched_list[i])
> + return -EINVAL;
> +
> + num_submissions = max(num_submissions,
> +   sched_list[i]->hw_submission_limit);
> + }
> +
> + /* FIXME: Entity initialized before the scheduler. */
> + if (!num_submissions)
> + return 0;
> +
> + entity->num_hw_submissions = num_submissions;
> + entity->hw_submissions = kcalloc(num_submissions, sizeof(void *),
> +  GFP_KERNEL);
> + if (!entity->hw_submissions)
> + return -ENOMEM;
>  
>   return 0;
>  }
>  EXPORT_SYMBOL(drm_sched_entity_init);
>  
> +/**
> + * drm_sched_entity_time_spend - Accumulated hw time used by this entity

Use the adjective form to "time", "time spent":

drm_sched_entity_time_spent

> + * @entity: scheduler entity to check
> + *
> + * Get the current accumulated hw time used by all submissions made through this
> + * entity.
> + */
> +ktime_t drm_sched_entity_time_spend(struct drm_sched_entity *entity)

"time_spend" --> "time_spent"

drm_sched_entity_time_spent

> +{
> + ktime_t result;
> + unsigned int i;
> +
> + if (!entity->num_hw_submissions)
> + return ns_to_ktime(0);
> +
> + spin_lock(&entity->accounting_lock);
> + result = entity->hw_time_used;
> + for (i = 0; i < entity->num_hw_submissions; ++i) {
> + struct drm_sched_fence *fence = entity->hw_submissions[i];
> +
> + if (!fence)
> + continue;
> +
> + result = ktime_add(result, drm_sched_fence_time_spend(fence));
> + }
> + spin_unlock(&entity->accounting_lock);
> +
> + return result;
> +}
> +EXPORT_SYMBOL(drm_sched_entity_time_spend);

This is a good show-and-tell. Ideally, we want to add to entity->hw_time_used,
so that that quantity is updated.
Otherwise, since this returns result = "new time spent" + entity->hw_time_used,
we want to make sure that "result" is accounted for somewhere more permanent.
But as I said, this is a good show-and-tell. I'd imagine we'd develop this
function more once it lands in the tree.
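
As a rough illustration of the distinction drawn above, the following
userspace sketch separates a read-only query (the accumulated counter plus
whatever is still tracked) from a step that folds a finished submission into
the permanent counter. All toy_* names, the fixed slot count and the plain
nanosecond integers are assumptions made for the example; this is one possible
reading of the suggestion, not the code from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_SLOTS 4

/* Hypothetical stand-in for a scheduler fence with two timestamps. */
struct toy_fence {
	bool scheduled;		/* the "scheduled" timestamp is valid */
	bool finished;		/* the "finished" timestamp is valid */
	int64_t scheduled_ns;
	int64_t finished_ns;
};

/* Hypothetical stand-in for an entity with accounting state. */
struct toy_entity {
	int64_t hw_time_used_ns;		/* time of retired submissions */
	struct toy_fence *slots[TOY_MAX_SLOTS];	/* submissions still tracked */
};

/* Time one submission has spent on the hardware as of now_ns. */
static int64_t toy_fence_time(const struct toy_fence *f, int64_t now_ns)
{
	/* Not scheduled yet: it cannot have spent any time. */
	if (!f || !f->scheduled)
		return 0;
	/* Still running: account for the time spent so far. */
	if (!f->finished)
		return now_ns - f->scheduled_ns;
	return f->finished_ns - f->scheduled_ns;
}

/* Read-only query: retired time plus everything still tracked. */
static int64_t toy_entity_time(const struct toy_entity *e, int64_t now_ns)
{
	int64_t total = e->hw_time_used_ns;
	size_t i;

	for (i = 0; i < TOY_MAX_SLOTS; i++)
		total += toy_fence_time(e->slots[i], now_ns);
	return total;
}

/* Fold a finished submission into the permanent counter and free its slot. */
static void toy_entity_retire(struct toy_entity *e, size_t slot)
{
	struct toy_fence *f;

	assert(slot < TOY_MAX_SLOTS);
	f = e->slots[slot];
	if (!f || !f->finished)
		return;
	e->hw_time_used_ns += f->finished_ns - f->scheduled_ns;
	e->slots[slot] = NULL;
}
```

With this split, the query returns the same total before and after a finished
submission is retired, while the permanent counter only ever grows.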

> +
>  /**
>   * drm_sched_entity_modify_sched - Modify sched of an entity
>   * @entity: scheduler entity to init
> @@ -288,6 +344,8 @@ EXPORT_SYMBOL(drm_sched_entity_flush);
>   */
>  void drm_sched_entity_fini(struct drm_sched_entity *entity)
>  {
> + unsigned int i;
> +
>   /*
>* If consumption of existing IBs wasn't completed. Forcefully remove
>* them here. Also makes sure that the scheduler won't touch this entity
> @@ -303,6 

Re: [PATCH 2/2] drm/amdgpu: Move clocks closer to its only usage in amdgpu_parse_cg_state()

2023-06-15 Thread Alex Deucher
Applied the series.  Thanks!

Alex

On Thu, Jun 15, 2023 at 1:06 PM Nathan Chancellor  wrote:
>
> After commit a25a9dae2067 ("drm/amd/amdgpu: enable W=1 for amdgpu"),
> there is an instance of -Wunused-const-variable when CONFIG_DEBUG_FS is
> disabled:
>
>   drivers/gpu/drm/amd/amdgpu/../pm/amdgpu_pm.c:38:34: error: unused variable 'clocks' [-Werror,-Wunused-const-variable]
>      38 | static const struct cg_flag_name clocks[] = {
>         |                                  ^
>   1 error generated.
>
> clocks is only used when CONFIG_DEBUG_FS is set, so move the definition
> into the CONFIG_DEBUG_FS block right above its only usage to clear up
> the warning.
>
> Signed-off-by: Nathan Chancellor 
> ---
>  drivers/gpu/drm/amd/pm/amdgpu_pm.c | 76 
> +++---
>  1 file changed, 38 insertions(+), 38 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c 
> b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
> index a57952b93e73..386ccf11e657 100644
> --- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
> +++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
> @@ -35,44 +35,6 @@
>  #include 
>  #include 
>
> -static const struct cg_flag_name clocks[] = {
> -   {AMD_CG_SUPPORT_GFX_FGCG, "Graphics Fine Grain Clock Gating"},
> -   {AMD_CG_SUPPORT_GFX_MGCG, "Graphics Medium Grain Clock Gating"},
> -   {AMD_CG_SUPPORT_GFX_MGLS, "Graphics Medium Grain memory Light Sleep"},
> -   {AMD_CG_SUPPORT_GFX_CGCG, "Graphics Coarse Grain Clock Gating"},
> -   {AMD_CG_SUPPORT_GFX_CGLS, "Graphics Coarse Grain memory Light Sleep"},
> -   {AMD_CG_SUPPORT_GFX_CGTS, "Graphics Coarse Grain Tree Shader Clock 
> Gating"},
> -   {AMD_CG_SUPPORT_GFX_CGTS_LS, "Graphics Coarse Grain Tree Shader Light 
> Sleep"},
> -   {AMD_CG_SUPPORT_GFX_CP_LS, "Graphics Command Processor Light Sleep"},
> -   {AMD_CG_SUPPORT_GFX_RLC_LS, "Graphics Run List Controller Light 
> Sleep"},
> -   {AMD_CG_SUPPORT_GFX_3D_CGCG, "Graphics 3D Coarse Grain Clock Gating"},
> -   {AMD_CG_SUPPORT_GFX_3D_CGLS, "Graphics 3D Coarse Grain memory Light 
> Sleep"},
> -   {AMD_CG_SUPPORT_MC_LS, "Memory Controller Light Sleep"},
> -   {AMD_CG_SUPPORT_MC_MGCG, "Memory Controller Medium Grain Clock 
> Gating"},
> -   {AMD_CG_SUPPORT_SDMA_LS, "System Direct Memory Access Light Sleep"},
> -   {AMD_CG_SUPPORT_SDMA_MGCG, "System Direct Memory Access Medium Grain 
> Clock Gating"},
> -   {AMD_CG_SUPPORT_BIF_MGCG, "Bus Interface Medium Grain Clock Gating"},
> -   {AMD_CG_SUPPORT_BIF_LS, "Bus Interface Light Sleep"},
> -   {AMD_CG_SUPPORT_UVD_MGCG, "Unified Video Decoder Medium Grain Clock 
> Gating"},
> -   {AMD_CG_SUPPORT_VCE_MGCG, "Video Compression Engine Medium Grain 
> Clock Gating"},
> -   {AMD_CG_SUPPORT_HDP_LS, "Host Data Path Light Sleep"},
> -   {AMD_CG_SUPPORT_HDP_MGCG, "Host Data Path Medium Grain Clock Gating"},
> -   {AMD_CG_SUPPORT_DRM_MGCG, "Digital Right Management Medium Grain 
> Clock Gating"},
> -   {AMD_CG_SUPPORT_DRM_LS, "Digital Right Management Light Sleep"},
> -   {AMD_CG_SUPPORT_ROM_MGCG, "Rom Medium Grain Clock Gating"},
> -   {AMD_CG_SUPPORT_DF_MGCG, "Data Fabric Medium Grain Clock Gating"},
> -   {AMD_CG_SUPPORT_VCN_MGCG, "VCN Medium Grain Clock Gating"},
> -   {AMD_CG_SUPPORT_HDP_DS, "Host Data Path Deep Sleep"},
> -   {AMD_CG_SUPPORT_HDP_SD, "Host Data Path Shutdown"},
> -   {AMD_CG_SUPPORT_IH_CG, "Interrupt Handler Clock Gating"},
> -   {AMD_CG_SUPPORT_JPEG_MGCG, "JPEG Medium Grain Clock Gating"},
> -   {AMD_CG_SUPPORT_REPEATER_FGCG, "Repeater Fine Grain Clock Gating"},
> -   {AMD_CG_SUPPORT_GFX_PERF_CLK, "Perfmon Clock Gating"},
> -   {AMD_CG_SUPPORT_ATHUB_MGCG, "Address Translation Hub Medium Grain 
> Clock Gating"},
> -   {AMD_CG_SUPPORT_ATHUB_LS, "Address Translation Hub Light Sleep"},
> -   {0, NULL},
> -};
> -
>  static const struct hwmon_temp_label {
> enum PP_HWMON_TEMP channel;
> const char *label;
> @@ -3684,6 +3646,44 @@ static int amdgpu_debugfs_pm_info_pp(struct seq_file 
> *m, struct amdgpu_device *a
> return 0;
>  }
>
> +static const struct cg_flag_name clocks[] = {
> +   {AMD_CG_SUPPORT_GFX_FGCG, "Graphics Fine Grain Clock Gating"},
> +   {AMD_CG_SUPPORT_GFX_MGCG, "Graphics Medium Grain Clock Gating"},
> +   {AMD_CG_SUPPORT_GFX_MGLS, "Graphics Medium Grain memory Light Sleep"},
> +   {AMD_CG_SUPPORT_GFX_CGCG, "Graphics Coarse Grain Clock Gating"},
> +   {AMD_CG_SUPPORT_GFX_CGLS, "Graphics Coarse Grain memory Light Sleep"},
> +   {AMD_CG_SUPPORT_GFX_CGTS, "Graphics Coarse Grain Tree Shader Clock 
> Gating"},
> +   {AMD_CG_SUPPORT_GFX_CGTS_LS, "Graphics Coarse Grain Tree Shader Light 
> Sleep"},
> +   {AMD_CG_SUPPORT_GFX_CP_LS, "Graphics Command Processor Light Sleep"},
> +   {AMD_CG_SUPPORT_GFX_RLC_LS, "Graphics Run List Controller Light 
> Sleep"},
> +   {AMD_CG_SUPPORT_GFX_3D_CGCG, "Graphics 3D Coarse Grain Clock Gating"},
> +   

Re: [PATCH 1/2] fbdev/offb: Update expected device name

2023-06-15 Thread Cyril Brulebois
Linux regression tracking (Thorsten Leemhuis) (2023-06-15):
> No reply to my status inquiry[1] a few weeks ago, so I have to assume
> nobody cares anymore. If somebody still cares, holler!

I still care about a proper bugfix, for upstream and for the Debian
distribution, and so does Salvatore. But fixing kernel regressions isn't
my day job, so I haven't got around to working on it.


Cheers,
-- 
Cyril Brulebois -- Debian Consultant @ DEBAMAX -- https://debamax.com/




Re: [RFC PATCH v2 00/18] Add DRM CRTC 3D LUT interface

2023-06-15 Thread Jacopo Mondi
Hi Pekka
thanks for the reply

On Thu, Jun 15, 2023 at 10:14:05AM +0300, Pekka Paalanen wrote:
> On Tue, 13 Jun 2023 17:43:55 +0200
> Jacopo Mondi  wrote:
>
> > Hello
> >
> >I'm completing the support for 3D LUT on R-Car DU peripheral
> > and I have used this series as a base.
> >
> > I'm wondering, since quite some time has passed without any update if
> > this series is still a thing and it makes any sense for me to try to
> > bring it forward.
> >
> > I'm asking as I've noticed:
> > "[PATCH 00/36] drm/amd/display: add AMD driver-specific properties for 
> > color mgmt"
> >
> > which seems to supersede this proposal with driver-specific
> > properties.
> >
> > I asked Melissa privately but I wasn't able to get a hold of her, so
> > if anyone has any clue feel free to reply :)
>
> Hi,
>
> since no-one else replied, I'll point you to the thread starting at
> https://lists.freedesktop.org/archives/dri-devel/2023-May/403173.html

Yes, Melissa pointed me to that series privately yesterday.

However, and here I might be missing something, per-plane properties do
not apply well to the HW pipeline I'm looking at.

The R-Car DU has a 1D LUT and a 3D LUT at the CRTC level (I guess
'post-blending' is the right term here?). A per-plane property
doesn't seem to match how the HW works, but please feel free to correct
me, as this is all rather new to me and I might be overlooking
something.

My plan at the moment is to base my work on Melissa's RFC
and re-send it to prompt discussion, unless it is certainly a dead end and
I have missed how to properly use per-plane properties on our HW.

Thank you!

> and continuing to June. That is the plan of getting a common UAPI for
> these things.
>
>
> Thanks,
> pq
>
>
> >
> > Thanks
> >   j
> >
> > On Mon, Jan 09, 2023 at 01:38:28PM -0100, Melissa Wen wrote:
> > > Hi,
> > >
> > > After collecting comments in different places, here is a second version
> > > of the work on adding DRM CRTC 3D LUT support to the current DRM color
> > > mgmt interface. In comparison to previous proposals [1][2][3], here we
> > > add 3D LUT before gamma 1D LUT, but also a shaper 1D LUT before 3D LUT,
> > > that means the following DRM CRTC color correction pipeline:
> > >
> > > Blend -> Degamma 1D LUT -> CTM -> Shaper 1D LUT -> 3D LUT -> Gamma 1D LUT
> > >
> > > and we also add a DRM CRTC LUT3D_MODE property, based on Alex Hung
> > > proposal for pre-blending 3D LUT [4] (Thanks!), instead of just a
> > > LUT3D_SIZE, that allows userspace to use different supported settings of
> > > 3D LUT, fitting VA-API and new color API better. In this sense, I
> > > adjusted the pre-blending proposal for post-blending usage.
> > >
> > > Patches 1-6 targets the addition of shaper LUT and 3D LUT properties to
> > > the current DRM CRTC color mgmt pipeline. Patch 6 can be considered an
> > > extra/optional patch to define a default value for LUT3D_MODE, inspired
> > > by what we do for the plane blend mode property (pre-multiplied).
> > >
> > > Patches 7-18 targets AMD display code to enable shaper and 3D LUT usage
> > > on DCN 301 (our HW case). Patches 7-9 performs code cleanups on current
> > > AMD DM colors code, patch 10 updates AMD stream in case of user 3D LUT
> > > changes, patch 11/12 rework AMD MPC 3D LUT resource handling by context
> > > for DCN 301 (easily extendible to other DCN families). Finally, from
> > > 13-18, we wire up SHAPER LUT, LUT3D and LUT3D MODE to AMD display
> > > driver, exposing modes supported by HW and programming user shaper and
> > > 3D LUT accordingly.
> > >
> > > Our target userspace is Gamescope/SteamOS.
> > >
> > > Basic IGT tests were based on [5][6] and are available here (in-progress):
> > > https://gitlab.freedesktop.org/mwen/igt-gpu-tools/-/commits/crtc-lut3d-api
> > >
> > > [1] 
> > > https://lore.kernel.org/all/20201221015730.28333-1-laurent.pinchart+rene...@ideasonboard.com/
> > > [2] 
> > > https://github.com/vsyrjala/linux/commit/4d28e8ddf2a076f30f9e5bdc17cbb4656fe23e69
> > > [3] 
> > > https://lore.kernel.org/amd-gfx/20220619223104.667413-1-m...@igalia.com/
> > > [4] 
> > > https://lore.kernel.org/dri-devel/20221004211451.1475215-1-alex.h...@amd.com/
> > > [5] https://patchwork.freedesktop.org/series/90165/
> > > [6] https://patchwork.freedesktop.org/series/109402/
> > > [VA_API] 
> > > http://intel.github.io/libva/structVAProcFilterParameterBuffer3DLUT.html
> > > [KMS_pipe_API] https://gitlab.freedesktop.org/pq/color-and-hdr/-/issues/11
> > >
> > > Let me know your thoughts.
> > >
> > > Thanks,
> > >
> > > Melissa
> > >
> > > Alex Hung (2):
> > >   drm: Add 3D LUT mode and its attributes
> > >   drm/amd/display: Define 3D LUT struct for HDR planes
> > >
> > > Melissa Wen (16):
> > >   drm/drm_color_mgmt: add shaper LUT to color mgmt properties
> > >   drm/drm_color_mgmt: add 3D LUT props to DRM color mgmt
> > >   drm/drm_color_mgmt: add function to create 3D LUT modes supported
> > >   drm/drm_color_mgmt: add function to attach 3D LUT props
> > >   

Re: [PATCH] dt-bindings: display: Add missing property types

2023-06-15 Thread Rob Herring


On Tue, 13 Jun 2023 14:11:14 -0600, Rob Herring wrote:
> A couple of display bridge properties are missing a type definition. Add
> the types to them.
> 
> Signed-off-by: Rob Herring 
> ---
>  .../devicetree/bindings/display/bridge/analogix,dp.yaml  | 1 +
>  .../devicetree/bindings/display/bridge/nxp,tda998x.yaml  | 1 +
>  2 files changed, 2 insertions(+)
> 

Applied, thanks!



Re: [PATCH] drm/i915/guc/slpc: Apply min softlimit correctly

2023-06-15 Thread Dixit, Ashutosh
On Fri, 09 Jun 2023 15:02:52 -0700, Vinay Belgaumkar wrote:
>

Hi Vinay,

> We were skipping when min_softlimit was equal to RPn. We need to apply
> it regardless as efficient frequency will push the SLPC min to RPe.
> This will break scenarios where user sets a min softlimit < RPe before
> reset and then performs a GT reset.
>
> Fixes: 95ccf312a1e4 ("drm/i915/guc/slpc: Allow SLPC to use efficient 
> frequency")
>
> Signed-off-by: Vinay Belgaumkar 
> ---
>  drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
> index 01b75529311c..ee9f83af7cf6 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_slpc.c
> @@ -606,7 +606,7 @@ static int slpc_set_softlimits(struct intel_guc_slpc 
> *slpc)
>   if (unlikely(ret))
>   return ret;
>   slpc_to_gt(slpc)->defaults.min_freq = slpc->min_freq_softlimit;
> - } else if (slpc->min_freq_softlimit != slpc->min_freq) {
> + } else {
>   return intel_guc_slpc_set_min_freq(slpc,
>  slpc->min_freq_softlimit);

IMO the current code is unnecessarily complicated and confusing and similar
changes (with a little tweaking) should be made for max_freq too. But at
least this is a step in the right direction so:

Reviewed-by: Ashutosh Dixit 



>   }
> --
> 2.38.1
>


Re: [PATCH] drm/amd/pm: remove unneeded variable

2023-06-15 Thread Alex Deucher
Applied, but please check your mailer.  I had to manually fix this up.

Alex

On Wed, Jun 14, 2023 at 3:21 AM  wrote:
>
> fix the following coccicheck warning:
>
> drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c:1657:14-18: Unneeded
> variable: "size".
>
> Signed-off-by: Mingtong Bao 
> ---
>   drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c | 10 +-
>   1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> index 275f708db636..c94d825a871b 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c
> @@ -1654,7 +1654,7 @@ static int navi10_force_clk_levels(struct
> smu_context *smu,
>  enum smu_clk_type clk_type, uint32_t mask)
>   {
>
> -int ret = 0, size = 0;
> +int ret = 0;
>   uint32_t soft_min_level = 0, soft_max_level = 0, min_freq = 0,
> max_freq = 0;
>
>   soft_min_level = mask ? (ffs(mask) - 1) : 0;
> @@ -1675,15 +1675,15 @@ static int navi10_force_clk_levels(struct
> smu_context *smu,
>
>   ret = smu_v11_0_get_dpm_freq_by_index(smu, clk_type,
> soft_min_level, &min_freq);
>   if (ret)
> -return size;
> +return 0;
>
>   ret = smu_v11_0_get_dpm_freq_by_index(smu, clk_type,
> soft_max_level, &max_freq);
>   if (ret)
> -return size;
> +return 0;
>
>   ret = smu_v11_0_set_soft_freq_limited_range(smu, clk_type,
> min_freq, max_freq);
>   if (ret)
> -return size;
> +return 0;
>   break;
>   case SMU_DCEFCLK:
>   dev_info(smu->adev->dev,"Setting DCEFCLK min/max dpm level is
> not supported!\n");
> @@ -1693,7 +1693,7 @@ static int navi10_force_clk_levels(struct
> smu_context *smu,
>   break;
>   }
>
> -return size;
> +return 0;
>   }
>
>   static int navi10_populate_umd_state_clk(struct smu_context *smu)


Re: [PATCH] drm/amdgpu: remove unneeded variable

2023-06-15 Thread Alex Deucher
On Wed, Jun 14, 2023 at 3:20 AM  wrote:
>
> fix the following coccicheck warning:
>
> drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_powertune.c:1156:5-11:
> Unneeded variable: "result".
>

This variable is used and the code fails to compile with it removed.

Alex

> Signed-off-by: Mingtong Bao 
> ---
>   drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_powertune.c | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_powertune.c
> b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_powertune.c
> index 21be23ec3c79..36aa7657c66d 100644
> --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_powertune.c
> +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_powertune.c
> @@ -1153,7 +1153,6 @@ int smu7_enable_power_containment(struct pp_hwmgr
> *hwmgr)
>   struct phm_ppt_v1_information *table_info =
>   (struct phm_ppt_v1_information *)(hwmgr->pptable);
>   int smc_result;
> -int result = 0;
>   struct phm_cac_tdp_table *cac_table;
>
>   data->power_containment_features = 0;
> @@ -1191,7 +1190,7 @@ int smu7_enable_power_containment(struct pp_hwmgr
> *hwmgr)
>   }
>   }
>   }
> -return result;
> +return 0;
>   }
>
>   int smu7_disable_power_containment(struct pp_hwmgr *hwmgr)


Re: [PATCH v2 4/4] drm/mgag200: Use DMA to copy the framebuffer to the VRAM

2023-06-15 Thread Jocelyn Falempe

On 15/06/2023 16:24, Thomas Zimmermann wrote:

Hi Jocelyn

Am 31.05.23 um 11:21 schrieb Jocelyn Falempe:

Even if the transfer is not faster, it brings significant
improvement in latencies and CPU usage.

CPU usage drops from 100% of one core to 3% when continuously
refreshing the screen.


I tried your patchset on a HP Proliant server with a G200EH. I can see 
that the CPU usage goes down, but the time until the screen update 
reaches the hardware's video memory has increased significantly.


Thanks for taking time to test it.
Can you check whether there is anything in dmesg?

The 1s looks suspicious: if the IRQ is not working, there is a 1s
timeout, which would explain why it displays only one frame per
second. (The logs should be filled with "DMA transfer timed out".)


I will see if I can get access to a G200EH, and if I can reproduce this.

Best regards,

--

Jocelyn



Any display update that is more than just moving the mouse results in
tearing. I can see how the individual scanlines are updated from top to
bottom. That takes ~1 sec per full frame. So this patch takes the
display from slow to barely usable.


Best regards
Thomas



The primary DMA is used to send commands (register writes), and
the secondary DMA to send the pixel data.
It uses the ILOAD command (chapter 4.5.7 in the G200 specification),
which allows loading an image, or part of an image, from system
memory to VRAM.
The last command sent is a softrap command, which triggers an IRQ
when the DMA transfer is complete.
For 16-bit and 24-bit pixel widths, each line is padded to make sure
the DMA transfer is a multiple of 32 bits. The padded bytes won't be
drawn on the screen, so they don't need to be initialized.
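
The padding rule can be sketched as a small helper. This assumes the padding
is applied per line and that the pixel width is a whole number of bytes; the
function name is hypothetical and not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes needed for one line of pixels, rounded up to a whole number of
 * 32-bit words. toy_padded_line_bytes() is an illustrative name only.
 */
static uint32_t toy_padded_line_bytes(uint32_t width_px, uint32_t bits_per_pixel)
{
	uint32_t bytes;

	/* Only whole-byte pixel formats (16, 24, 32 bpp) are modelled. */
	assert(bits_per_pixel % 8 == 0);

	bytes = width_px * (bits_per_pixel / 8);

	/* Round up to the next multiple of 4 bytes (32 bits). */
	return (bytes + 3u) & ~3u;
}
```

For example, a 5-pixel line at 24 bits per pixel needs 15 bytes of pixel data
and is padded to 16, while 32-bit formats never need padding.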

Signed-off-by: Jocelyn Falempe 
---
  drivers/gpu/drm/mgag200/Makefile  |   3 +-
  drivers/gpu/drm/mgag200/mgag200_dma.c | 237 ++
  drivers/gpu/drm/mgag200/mgag200_drv.c |   4 +-
  drivers/gpu/drm/mgag200/mgag200_drv.h |  29 +++
  drivers/gpu/drm/mgag200/mgag200_g200.c    |   4 +
  drivers/gpu/drm/mgag200/mgag200_g200eh.c  |   4 +
  drivers/gpu/drm/mgag200/mgag200_g200eh3.c |   4 +
  drivers/gpu/drm/mgag200/mgag200_g200er.c  |   4 +
  drivers/gpu/drm/mgag200/mgag200_g200ev.c  |   4 +
  drivers/gpu/drm/mgag200/mgag200_g200ew3.c |   4 +
  drivers/gpu/drm/mgag200/mgag200_g200se.c  |   4 +
  drivers/gpu/drm/mgag200/mgag200_g200wb.c  |   4 +
  drivers/gpu/drm/mgag200/mgag200_mode.c    |  15 +-
  drivers/gpu/drm/mgag200/mgag200_reg.h |  25 +++
  14 files changed, 333 insertions(+), 12 deletions(-)
  create mode 100644 drivers/gpu/drm/mgag200/mgag200_dma.c

diff --git a/drivers/gpu/drm/mgag200/Makefile 
b/drivers/gpu/drm/mgag200/Makefile

index 182e224c460d..96e23dc5572c 100644
--- a/drivers/gpu/drm/mgag200/Makefile
+++ b/drivers/gpu/drm/mgag200/Makefile
@@ -11,6 +11,7 @@ mgag200-y := \
  mgag200_g200se.o \
  mgag200_g200wb.o \
  mgag200_i2c.o \
-    mgag200_mode.o
+    mgag200_mode.o \
+    mgag200_dma.o
  obj-$(CONFIG_DRM_MGAG200) += mgag200.o
diff --git a/drivers/gpu/drm/mgag200/mgag200_dma.c 
b/drivers/gpu/drm/mgag200/mgag200_dma.c

new file mode 100644
index ..7e9b59ef08d9
--- /dev/null
+++ b/drivers/gpu/drm/mgag200/mgag200_dma.c
@@ -0,0 +1,237 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright 2023 Red Hat
+ *
+ * Authors: Jocelyn Falempe
+ *
+ */
+
+#include 
+#include 
+#include 
+
+#include 
+
+#include "mgag200_drv.h"
+#include "mgag200_reg.h"
+
+/* Maximum number of command block for one DMA transfer
+ * iload should only use 4 blocks
+ */
+#define MGA_MAX_CMD    50
+
+#define MGA_DMA_SIZE    (4 * 1024 * 1024)
+#define MGA_MIN_DMA_SIZE    (64 * 1024)
+
+/*
+ * Allocate coherent buffers for DMA transfer.
+ * We need two buffers, one for the commands, and one for the data.
+ */
+int mgag200_dma_init(struct mga_device *mdev)
+{
+    struct drm_device *dev = &mdev->base;
+    struct mga_dma *dma = &mdev->dma;
+    int size;
+    /* Allocate the command buffer */
+    dma->cmd = dmam_alloc_coherent(dev->dev, MGA_MAX_CMD * sizeof(*dma->cmd),
+    &dma->cmd_handle, GFP_KERNEL);
+
+    if (!dma->cmd) {
+    drm_err(dev, "DMA command buffer allocation failed\n");
+    return -ENOMEM;
+    }
+
+    for (size = MGA_DMA_SIZE; size >= MGA_MIN_DMA_SIZE; size = size >> 1) {
+    dma->buf = dmam_alloc_coherent(dev->dev, size, &dma->handle, GFP_KERNEL);
+    if (dma->buf)
+    break;
+    }
+    if (!dma->buf) {
+    drm_err(dev, "DMA data buffer allocation failed\n");
+    return -ENOMEM;
+    }
+    dma->size = size;
+    drm_info(dev, "Using DMA with a %dk data buffer\n", size / 1024);
+
+    init_waitqueue_head(&dma->waitq);
+    return 0;
+}
+
+/*
+ * Matrox uses a command block to program the hardware through DMA.
+ * Each command is a register write, and each block contains 4 commands
+ * packed in 5 dwords(u32).
+ * First dword is the 4 register index (8bit) to write for the 4 commands,
+ * followed by the four 

Re: [PATCH] amd/display/dc:remove repeating expression

2023-06-15 Thread Alex Deucher
Applied.  Thanks!

Alex

On Wed, Jun 14, 2023 at 1:36 AM Ammar Faizi  wrote:
>
> On 6/14/23 10:49 AM, Wang Ming wrote:
> > Identify issues that arise by using the tests/doubletest.cocci
> > semantic patch. Need to remove duplicate expression in if statement.
> >
> > Signed-off-by: Wang Ming 
>
> Reviewed-by: Ammar Faizi 
>
> --
> Ammar Faizi


[PATCH 0/2] drm/amdgpu: Fix instances of -Wunused-const-variable with CONFIG_DEBUG_FS=n

2023-06-15 Thread Nathan Chancellor
Hi all,

After commit a25a9dae2067 ("drm/amd/amdgpu: enable W=1 for amdgpu"),
I see a few instances of -Wunused-const-variable with configurations
that do not enable CONFIG_DEBUG_FS, such as Alpine Linux's. This series
includes two patches to resolve each warning I see.

---
Nathan Chancellor (2):
  drm/amdgpu: Remove CONFIG_DEBUG_FS guard around body of 
amdgpu_rap_debugfs_init()
  drm/amdgpu: Move clocks closer to its only usage in 
amdgpu_parse_cg_state()

 drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c |  2 -
 drivers/gpu/drm/amd/pm/amdgpu_pm.c  | 76 -
 2 files changed, 38 insertions(+), 40 deletions(-)
---
base-commit: d297eedf83f5af96751c0da1e4355c19244a55a2
change-id: 20230615-amdgpu-wunused-const-variable-wo-debugfs-308ce8e17329

Best regards,
-- 
Nathan Chancellor 



[PATCH 2/2] drm/amdgpu: Move clocks closer to its only usage in amdgpu_parse_cg_state()

2023-06-15 Thread Nathan Chancellor
After commit a25a9dae2067 ("drm/amd/amdgpu: enable W=1 for amdgpu"),
there is an instance of -Wunused-const-variable when CONFIG_DEBUG_FS is
disabled:

  drivers/gpu/drm/amd/amdgpu/../pm/amdgpu_pm.c:38:34: error: unused variable 'clocks' [-Werror,-Wunused-const-variable]
     38 | static const struct cg_flag_name clocks[] = {
        |                                  ^
  1 error generated.

clocks is only used when CONFIG_DEBUG_FS is set, so move the definition
into the CONFIG_DEBUG_FS block right above its only usage to clear up
the warning.

Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/pm/amdgpu_pm.c | 76 +++---
 1 file changed, 38 insertions(+), 38 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c 
b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
index a57952b93e73..386ccf11e657 100644
--- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
+++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
@@ -35,44 +35,6 @@
 #include 
 #include 
 
-static const struct cg_flag_name clocks[] = {
-   {AMD_CG_SUPPORT_GFX_FGCG, "Graphics Fine Grain Clock Gating"},
-   {AMD_CG_SUPPORT_GFX_MGCG, "Graphics Medium Grain Clock Gating"},
-   {AMD_CG_SUPPORT_GFX_MGLS, "Graphics Medium Grain memory Light Sleep"},
-   {AMD_CG_SUPPORT_GFX_CGCG, "Graphics Coarse Grain Clock Gating"},
-   {AMD_CG_SUPPORT_GFX_CGLS, "Graphics Coarse Grain memory Light Sleep"},
-   {AMD_CG_SUPPORT_GFX_CGTS, "Graphics Coarse Grain Tree Shader Clock 
Gating"},
-   {AMD_CG_SUPPORT_GFX_CGTS_LS, "Graphics Coarse Grain Tree Shader Light 
Sleep"},
-   {AMD_CG_SUPPORT_GFX_CP_LS, "Graphics Command Processor Light Sleep"},
-   {AMD_CG_SUPPORT_GFX_RLC_LS, "Graphics Run List Controller Light Sleep"},
-   {AMD_CG_SUPPORT_GFX_3D_CGCG, "Graphics 3D Coarse Grain Clock Gating"},
-   {AMD_CG_SUPPORT_GFX_3D_CGLS, "Graphics 3D Coarse Grain memory Light 
Sleep"},
-   {AMD_CG_SUPPORT_MC_LS, "Memory Controller Light Sleep"},
-   {AMD_CG_SUPPORT_MC_MGCG, "Memory Controller Medium Grain Clock Gating"},
-   {AMD_CG_SUPPORT_SDMA_LS, "System Direct Memory Access Light Sleep"},
-   {AMD_CG_SUPPORT_SDMA_MGCG, "System Direct Memory Access Medium Grain 
Clock Gating"},
-   {AMD_CG_SUPPORT_BIF_MGCG, "Bus Interface Medium Grain Clock Gating"},
-   {AMD_CG_SUPPORT_BIF_LS, "Bus Interface Light Sleep"},
-   {AMD_CG_SUPPORT_UVD_MGCG, "Unified Video Decoder Medium Grain Clock 
Gating"},
-   {AMD_CG_SUPPORT_VCE_MGCG, "Video Compression Engine Medium Grain Clock 
Gating"},
-   {AMD_CG_SUPPORT_HDP_LS, "Host Data Path Light Sleep"},
-   {AMD_CG_SUPPORT_HDP_MGCG, "Host Data Path Medium Grain Clock Gating"},
-   {AMD_CG_SUPPORT_DRM_MGCG, "Digital Right Management Medium Grain Clock 
Gating"},
-   {AMD_CG_SUPPORT_DRM_LS, "Digital Right Management Light Sleep"},
-   {AMD_CG_SUPPORT_ROM_MGCG, "Rom Medium Grain Clock Gating"},
-   {AMD_CG_SUPPORT_DF_MGCG, "Data Fabric Medium Grain Clock Gating"},
-   {AMD_CG_SUPPORT_VCN_MGCG, "VCN Medium Grain Clock Gating"},
-   {AMD_CG_SUPPORT_HDP_DS, "Host Data Path Deep Sleep"},
-   {AMD_CG_SUPPORT_HDP_SD, "Host Data Path Shutdown"},
-   {AMD_CG_SUPPORT_IH_CG, "Interrupt Handler Clock Gating"},
-   {AMD_CG_SUPPORT_JPEG_MGCG, "JPEG Medium Grain Clock Gating"},
-   {AMD_CG_SUPPORT_REPEATER_FGCG, "Repeater Fine Grain Clock Gating"},
-   {AMD_CG_SUPPORT_GFX_PERF_CLK, "Perfmon Clock Gating"},
-   {AMD_CG_SUPPORT_ATHUB_MGCG, "Address Translation Hub Medium Grain Clock 
Gating"},
-   {AMD_CG_SUPPORT_ATHUB_LS, "Address Translation Hub Light Sleep"},
-   {0, NULL},
-};
-
 static const struct hwmon_temp_label {
enum PP_HWMON_TEMP channel;
const char *label;
@@ -3684,6 +3646,44 @@ static int amdgpu_debugfs_pm_info_pp(struct seq_file *m, 
struct amdgpu_device *a
return 0;
 }
 
+static const struct cg_flag_name clocks[] = {
+   {AMD_CG_SUPPORT_GFX_FGCG, "Graphics Fine Grain Clock Gating"},
+   {AMD_CG_SUPPORT_GFX_MGCG, "Graphics Medium Grain Clock Gating"},
+   {AMD_CG_SUPPORT_GFX_MGLS, "Graphics Medium Grain memory Light Sleep"},
+   {AMD_CG_SUPPORT_GFX_CGCG, "Graphics Coarse Grain Clock Gating"},
+   {AMD_CG_SUPPORT_GFX_CGLS, "Graphics Coarse Grain memory Light Sleep"},
+   {AMD_CG_SUPPORT_GFX_CGTS, "Graphics Coarse Grain Tree Shader Clock 
Gating"},
+   {AMD_CG_SUPPORT_GFX_CGTS_LS, "Graphics Coarse Grain Tree Shader Light 
Sleep"},
+   {AMD_CG_SUPPORT_GFX_CP_LS, "Graphics Command Processor Light Sleep"},
+   {AMD_CG_SUPPORT_GFX_RLC_LS, "Graphics Run List Controller Light Sleep"},
+   {AMD_CG_SUPPORT_GFX_3D_CGCG, "Graphics 3D Coarse Grain Clock Gating"},
+   {AMD_CG_SUPPORT_GFX_3D_CGLS, "Graphics 3D Coarse Grain memory Light 
Sleep"},
+   {AMD_CG_SUPPORT_MC_LS, "Memory Controller Light Sleep"},
+   {AMD_CG_SUPPORT_MC_MGCG, "Memory Controller Medium Grain Clock Gating"},
+   {AMD_CG_SUPPORT_SDMA_LS, "System Direct Memory Access Light 

[PATCH 1/2] drm/amdgpu: Remove CONFIG_DEBUG_FS guard around body of amdgpu_rap_debugfs_init()

2023-06-15 Thread Nathan Chancellor
After commit a25a9dae2067 ("drm/amd/amdgpu: enable W=1 for amdgpu"),
there is an instance of -Wunused-const-variable when CONFIG_DEBUG_FS is
disabled:

  drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c:110:37: error: unused variable 
'amdgpu_rap_debugfs_ops' [-Werror,-Wunused-const-variable]
110 | static const struct file_operations amdgpu_rap_debugfs_ops = {
| ^
  1 error generated.

There is no reason for the body of this function to be guarded when
CONFIG_DEBUG_FS is disabled, as debugfs_create_file() is a stub that
just returns an error pointer in that situation. Remove the preprocessor
guards so that the variable never appears unused, while not changing
anything at run time.

Signed-off-by: Nathan Chancellor 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c
index 12010c988c8b..123bcf5c2bb1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c
@@ -116,7 +116,6 @@ static const struct file_operations amdgpu_rap_debugfs_ops 
= {
 
 void amdgpu_rap_debugfs_init(struct amdgpu_device *adev)
 {
-#if defined(CONFIG_DEBUG_FS)
struct drm_minor *minor = adev_to_drm(adev)->primary;
 
if (!adev->psp.rap_context.context.initialized)
@@ -124,5 +123,4 @@ void amdgpu_rap_debugfs_init(struct amdgpu_device *adev)
 
debugfs_create_file("rap_test", S_IWUSR, minor->debugfs_root,
adev, &amdgpu_rap_debugfs_ops);
-#endif
 }

-- 
2.41.0
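
The stub pattern the commit message relies on can be sketched in plain C as follows. This is only an illustration: every name here (sketch_fops, sketch_create_file, SKETCH_DEBUG_FS) is made up, and the stub returns a plain error code, whereas the commit message says the kernel stub returns an error pointer.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for kernel types; none of this is kernel code. */
struct sketch_fops {
	int (*write)(int value);
};

static int rap_write(int value)
{
	return value;
}

/* Referenced unconditionally below, so it can never look unused. */
static const struct sketch_fops rap_ops = { .write = rap_write };

#ifdef SKETCH_DEBUG_FS
static const struct sketch_fops *registered;

/* "Real" variant: records the ops table it was handed. */
static int sketch_create_file(const char *name, const struct sketch_fops *ops)
{
	assert(name && ops);
	registered = ops;
	return 0;
}
#else
/* Stub variant: same signature, only reports that debugfs is absent. */
static inline int sketch_create_file(const char *name,
				     const struct sketch_fops *ops)
{
	assert(name && ops);
	return -ENODEV;
}
#endif

/* No #if guard in the body; with the stub the call is simply a no-op. */
int sketch_debugfs_init(void)
{
	return sketch_create_file("rap_test", &rap_ops);
}
```

Because the caller is compiled in both configurations, the ops table is always referenced and the compiler has nothing to warn about.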



Re: [PATCH drm-next v4 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

2023-06-15 Thread Danilo Krummrich

On 6/7/23 00:31, Danilo Krummrich wrote:


   Maple Tree:
 - Maple tree uses the 'unsigned long' type for node entries. While this
   works for 64bit, it's incompatible with the DRM GPUVA Manager on 32bit,
   since the DRM GPUVA Manager uses the u64 type and so do drivers using it.
   While it's questionable whether a 32bit kernel and a > 32bit GPU address
   space make any sense, it creates tons of compiler warnings when compiling
   for 32bit. Maybe it makes sense to expand the maple tree API to let users
   decide which size to pick - other ideas / proposals are welcome.


I remember you told me that the filesystem folks had some interest in a 
64-bit maple tree for a 32-bit kernel as well. Is there any news or 
are there any plans for such a feature?


For the short term I'd probably add a feature flag to the GPUVA manager, 
where drivers explicitly need to promise not to pass in addresses 
exceeding 32-bit on a 32-bit kernel, and if they don't, refuse to 
initialize the GPUVA manager on 32-bit kernels - or something similar...
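
A rough sketch of what such an opt-in check could look like, in plain C. None of these names exist in the GPUVA manager; the flag, the struct and the idx_bits parameter (which stands in for the width of 'unsigned long') are assumptions made only so that the idea can be tried on any host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag and struct names, used only for this sketch. */
#define VA_MGR_PROMISE_32BIT (1u << 0)

struct va_mgr {
	uint64_t start;
	uint64_t range;
	unsigned int flags;
};

/* True if [start, start + range) fits into an index of idx_bits bits. */
static bool range_fits(uint64_t start, uint64_t range, unsigned int idx_bits)
{
	uint64_t max = idx_bits >= 64 ? UINT64_MAX : (1ull << idx_bits) - 1;

	if (range == 0 || start > max)
		return false;
	return range - 1 <= max - start;
}

/* Returns 0 on success, -1 if the manager refuses to initialize. */
int va_mgr_init(struct va_mgr *mgr, uint64_t start, uint64_t range,
		unsigned int flags, unsigned int idx_bits)
{
	assert(mgr);

	if (idx_bits < 64) {
		/* Narrow index type: the driver must opt in explicitly. */
		if (!(flags & VA_MGR_PROMISE_32BIT))
			return -1;
		if (!range_fits(start, range, idx_bits))
			return -1;
	}

	mgr->start = start;
	mgr->range = range;
	mgr->flags = flags;
	return 0;
}
```

With a 64-bit index nothing changes for existing users; with a narrower index, initialization fails unless the driver both sets the flag and stays within the addressable range.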





Re: (subset) [PATCH v2 0/4] video: backlight: lp855x: modernize bindings

2023-06-15 Thread Bjorn Andersson
On Tue, Jun 13, 2023 at 03:30:10PM -0700, Bjorn Andersson wrote:
> On Fri, 19 May 2023 20:07:24 +0200, Artur Weber wrote:
> > Convert TI LP855X backlight controller bindings from TXT to YAML and,
> > while we're at it, rework some of the code related to PWM handling.
> > Also correct existing DTS files to avoid introducing new dtb_check
> > errors.
> > 
> > Signed-off-by: Artur Weber 
> > 
> > [...]
> 
> Applied, thanks!
> 
> [4/4] arm64: dts: adapt to LP855X bindings changes
>   commit: ebdcfc8c42c2b9d5ca1b27d8ee558eefb3e904d8
> 

Sorry, that was not for me to pick up. So I've dropped this change
again.

Please note that all other changes to the affected file are prefixed
"arm64: tegra:". Following this is a good idea, and would have helped me
not accidentally pick this change.

Regards,
Bjorn

> Best regards,
> -- 
> Bjorn Andersson 


Re: [Intel-xe] [RFC PATCH 1/1] drm/xe: Introduce function pointers for MMIO functions

2023-06-15 Thread Matt Roper
On Thu, Jun 15, 2023 at 04:04:18PM +0300, Oded Gabbay wrote:
> On Thu, Jun 15, 2023 at 3:01 AM Matt Roper  wrote:
> >
> > On Mon, Jun 12, 2023 at 06:31:57PM +0200, Francois Dugast wrote:
> > > On Thu, Jun 08, 2023 at 10:35:29AM -0700, Lucas De Marchi wrote:
> > > > On Fri, Jun 02, 2023 at 02:25:01PM +, Francois Dugast wrote:
> > > > > A local structure of function pointers is used as a minimal hardware
> > > > > abstraction layer to prepare for platform independent MMIO calls.
> > > > >
> > > > > Cc: Oded Gabbay 
> > > > > Cc: Ofir Bitton 
> > > > > Cc: Ohad Sharabi 
> > > > > Signed-off-by: Francois Dugast 
> > > > > ---
> > > > > drivers/gpu/drm/xe/xe_device_types.h |  3 ++
> > > > > drivers/gpu/drm/xe/xe_mmio.c | 81 
> > > > > drivers/gpu/drm/xe/xe_mmio.h | 35 ++--
> > > > > 3 files changed, 99 insertions(+), 20 deletions(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/xe/xe_device_types.h 
> > > > > b/drivers/gpu/drm/xe/xe_device_types.h
> > > > > index 17b6b1cc5adb..3f8fd0d8129b 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_device_types.h
> > > > > +++ b/drivers/gpu/drm/xe/xe_device_types.h
> > > > > @@ -378,6 +378,9 @@ struct xe_device {
> > > > >   /** @d3cold_allowed: Indicates if d3cold is a valid device state */
> > > > >   bool d3cold_allowed;
> > > > >
> > > > > + /** @mmio_funcs: function pointers for MMIO related functions */
> > > > > + const struct xe_mmio_funcs *mmio_funcs;
> > > > > +
> > > > >   /* private: */
> > > > >
> > > > > #if IS_ENABLED(CONFIG_DRM_XE_DISPLAY)
> > > > > diff --git a/drivers/gpu/drm/xe/xe_mmio.c 
> > > > > b/drivers/gpu/drm/xe/xe_mmio.c
> > > > > index 475b14fe4356..f3d08676a77a 100644
> > > > > --- a/drivers/gpu/drm/xe/xe_mmio.c
> > > > > +++ b/drivers/gpu/drm/xe/xe_mmio.c
> > > > > @@ -25,6 +25,62 @@
> > > > >
> > > > > #define BAR_SIZE_SHIFT 20
> > > > >
> > > > > +static void xe_mmio_write32_device(struct xe_gt *gt,
> > > > > +struct xe_reg reg, u32 val);
> > > > > +static u32 xe_mmio_read32_device(struct xe_gt *gt, struct xe_reg 
> > > > > reg);
> > > > > +static void xe_mmio_write64_device(struct xe_gt *gt,
> > > > > +struct xe_reg reg, u64 val);
> > > > > +static u64 xe_mmio_read64_device(struct xe_gt *gt, struct xe_reg 
> > > > > reg);
> > > > > +
> > > > > +static const struct xe_mmio_funcs xe_mmio_funcs_device = {
> > > > > + .write32 = xe_mmio_write32_device,
> > > > > + .read32 = xe_mmio_read32_device,
> > > > > + .write64 = xe_mmio_write64_device,
> > > > > + .read64 = xe_mmio_read64_device,
> > > > > +};
> > > > > +
> > > > > +static inline void xe_mmio_write32_device(struct xe_gt *gt,
> > > > > +struct xe_reg reg, u32 val)
> > > > > +{
> > > > > + struct xe_tile *tile = gt_to_tile(gt);
> > > > > +
> > > > > + if (reg.addr < gt->mmio.adj_limit)
> > > > > + reg.addr += gt->mmio.adj_offset;
> > > > > +
> > > > > + writel(val, tile->mmio.regs + reg.addr);
> > > > > +}
> > > > > +
> > > > > +static inline u32 xe_mmio_read32_device(struct xe_gt *gt, struct 
> > > > > xe_reg reg)
> > > > > +{
> > > > > + struct xe_tile *tile = gt_to_tile(gt);
> > > > > +
> > > > > + if (reg.addr < gt->mmio.adj_limit)
> > > > > + reg.addr += gt->mmio.adj_offset;
> > > > > +
> > > > > + return readl(tile->mmio.regs + reg.addr);
> > > > > +}
> > > > > +
> > > > > +static inline void xe_mmio_write64_device(struct xe_gt *gt,
> > > > > +struct xe_reg reg, u64 val)
> > > > > +{
> > > > > + struct xe_tile *tile = gt_to_tile(gt);
> > > > > +
> > > > > + if (reg.addr < gt->mmio.adj_limit)
> > > > > + reg.addr += gt->mmio.adj_offset;
> > > > > +
> > > > > + writeq(val, tile->mmio.regs + reg.addr);
> > > > > +}
> > > > > +
> > > > > +static inline u64 xe_mmio_read64_device(struct xe_gt *gt, struct 
> > > > > xe_reg reg)
> > > > > +{
> > > > > + struct xe_tile *tile = gt_to_tile(gt);
> > > > > +
> > > > > + if (reg.addr < gt->mmio.adj_limit)
> > > > > + reg.addr += gt->mmio.adj_offset;
> > > > > +
> > > > > + return readq(tile->mmio.regs + reg.addr);
> > > > > +}
> > > > > +
> > > > > static int xe_set_dma_info(struct xe_device *xe)
> > > > > {
> > > > >   unsigned int mask_size = xe->info.dma_mask_size;
> > > > > @@ -377,6 +433,29 @@ static void mmio_fini(struct drm_device *drm, 
> > > > > void *arg)
> > > > >   iounmap(xe->mem.vram.mapping);
> > > > > }
> > > > >
> > > > > +static void xe_mmio_set_funcs(struct xe_device *xe)
> > > > > +{
> > > > > + /* For now all platforms use the set of MMIO functions for a
> > > > > +  * physical device.
> > > > > +  */
> > > >
> > > >
> > > > what is "device" in this context? that seems confusing as we always ever
> > > > just support reading/writing to a real device (physical here may also
> > > > add to the confusion when thinking about SR-IOV and VFs).
> > > > We shouldn't add abstractions that are then never used and all platforms
> > 

Re: [PATCH v6 3/8] drm: bridge: Cadence: Add MHDP8501 DP driver

2023-06-15 Thread Sam Ravnborg
Hi Sandor,

On Thu, Jun 15, 2023 at 09:38:13AM +0800, Sandor Yu wrote:
> Add a new DRM DisplayPort bridge driver for Cadence MHDP8501
> used in i.MX8MQ SOC. MHDP8501 could support HDMI or DisplayPort
> standards according to the embedded firmware running in the uCPU.
> 
> For iMX8MQ SOC, the DisplayPort FW was loaded and activated by SOC
> ROM code. Bootload binary included HDMI FW was required for the driver.

The bridge driver supports creating a connector, but is this really
necessary?

This part:
> +static const struct drm_connector_funcs cdns_dp_connector_funcs = {
> + .fill_modes = drm_helper_probe_single_connector_modes,
> + .destroy = drm_connector_cleanup,
> + .reset = drm_atomic_helper_connector_reset,
> + .atomic_duplicate_state = drm_atomic_helper_connector_duplicate_state,
> + .atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
> +};
> +
> +static const struct drm_connector_helper_funcs 
> cdns_dp_connector_helper_funcs = {
> + .get_modes = cdns_dp_connector_get_modes,
> +};
> +
> +static int cdns_dp_bridge_attach(struct drm_bridge *bridge,
> +  enum drm_bridge_attach_flags flags)
> +{
> + struct cdns_mhdp_device *mhdp = bridge->driver_private;
> + struct drm_encoder *encoder = bridge->encoder;
> + struct drm_connector *connector = &mhdp->connector;
> + int ret;
> +
> + if (!(flags & DRM_BRIDGE_ATTACH_NO_CONNECTOR)) {
> + connector->interlace_allowed = 0;
> +
> + connector->polled = DRM_CONNECTOR_POLL_HPD;
> +
> + drm_connector_helper_add(connector, 
> &cdns_dp_connector_helper_funcs);
> +
> + drm_connector_init(bridge->dev, connector, 
> &cdns_dp_connector_funcs,
> +DRM_MODE_CONNECTOR_DisplayPort);
> +
> + drm_connector_attach_encoder(connector, encoder);
> + }

Unless you have a display driver that does not create its own connector,
drop the above and error out if DRM_BRIDGE_ATTACH_NO_CONNECTOR is
not set.
Display drivers are encouraged to create their own connector.

This was the only detail I looked for in the driver; I hope someone else
volunteers to review it.

Sam
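
A minimal sketch of the "error out" variant suggested above, written as standalone C so it can be compiled outside the kernel. The enum is a stand-in for the DRM flag of the same name and the function name is made up; a real attach hook takes the bridge and the flags and would do its bridge-specific setup where the comment is.

```c
#include <assert.h>
#include <errno.h>

/* Stand-in for the DRM attach flag; the value is chosen for the sketch. */
enum sketch_attach_flags {
	SKETCH_ATTACH_NO_CONNECTOR = 1 << 0,
};

static_assert(SKETCH_ATTACH_NO_CONNECTOR != 0, "flag must be a non-zero bit");

/* Hypothetical attach hook that never creates a connector itself. */
int sketch_bridge_attach(unsigned int flags)
{
	/* The display driver is expected to create the connector. */
	if (!(flags & SKETCH_ATTACH_NO_CONNECTOR))
		return -EINVAL;

	/* ... bridge-specific setup would go here ... */
	return 0;
}
```

With this shape the connector funcs tables and the connector init calls in the bridge driver are no longer needed.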


Re: [PATCH drm-next v4 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

2023-06-15 Thread Danilo Krummrich

On 6/14/23 09:58, Donald Robson wrote:

On Tue, 2023-06-13 at 16:20 +0200, Danilo Krummrich wrote:


I'm definitely up for improving the existing documentation. Anything in
particular you think should be described in more detail?

- Danilo


Hi Danilo,

As I said, with inexperience it's possible I missed what I was
looking for in the existing documentation, which is highly detailed
in regard to how it deals with operations, but usage was where I fell
down.

If I understand correctly, there are three ways to use this, which are:
1) Using drm_gpuva_insert() and drm_gpuva_remove() directly using
stack va objects.


What do you mean by stack va objects?


2) Using drm_gpuva_insert() and drm_gpuva_remove() in a callback
context, after having created ops lists using
drm_gpuva_sm_[un]map_ops_create().
3) Using drm_gpuva_[un]map() in callback context after having
prealloced a node and va objects for map/remap function use,
which must be forwarded in as the 'priv' argument to
drm_gpuva_sm_[un]map().


Right, and I think it might be worth concretely mentioning this in the 
documentation.




The first of these is pretty self-explanatory.  The second was also
fairly easy to understand, it has an example in your own driver, and
since it takes care of allocs in drm_gpuva_sm_map_ops_create() it
leads to pretty clean code too.

The third case, which I am using in the new PowerVR driver did not
have an example of usage and the approach is quite different to 2)
in that you have to prealloc everything explicitly.  I didn't realise
this, so it led to a fair amount of frustration.


Yeah, I think it is not entirely obvious why this is the case. I 
should maybe add a comment on how the callback way of using this 
interface is motivated.


The requirement of pre-allocation arises out of two circumstances.
First, having a single callback for every drm_gpuva_op on the GPUVA 
space implies that we're not allowed to fail the operation, because 
processing the drm_gpuva_ops directly implies that we can't unwind them 
on failure.


I know that the API functions the documentation guides you to use in 
this case actually can return error codes, but those are just range 
checks. If they fail, it's clearly a bug. However, I did not use WARN() 
for those cases, since the driver could still decide to use the 
callbacks to keep track of the operations in a driver specific way, 
although I would not recommend doing this and would rather try to 
cover the driver's use case within the regular way of creating a list of 
operations.


Second, most (other) drivers when using the callback way of this 
interface would need to execute the GPUVA space updates asynchronously 
in a dma_fence signalling critical path, where no memory allocations are 
permitted.
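
To illustrate the two points above, here is a generic "reserve first, consume later" sketch in plain C. It does not use the GPUVA manager API at all; the struct and function names are invented, and calloc() merely stands in for whatever allocation the driver would do before entering the path that must not fail or allocate.

```c
#include <assert.h>
#include <stdlib.h>

struct va_node {
	unsigned long addr, range;
	struct va_node *next;
};

/* Hypothetical context carrying memory reserved ahead of time. */
struct prealloc_ctx {
	struct va_node *spare;
};

/* Step 1, may fail: runs where allocating and returning errors is fine. */
int ctx_prepare(struct prealloc_ctx *ctx)
{
	ctx->spare = calloc(1, sizeof(*ctx->spare));
	return ctx->spare ? 0 : -1;
}

/* Step 2, must not fail: only consumes what step 1 reserved. */
void ctx_map_cb(struct prealloc_ctx *ctx, struct va_node **list,
		unsigned long addr, unsigned long range)
{
	struct va_node *n = ctx->spare;

	assert(n); /* a missing node here is a caller bug, not a runtime error */
	ctx->spare = NULL;
	n->addr = addr;
	n->range = range;
	n->next = *list;
	*list = n;
}
```

The callback has no error path because everything that could fail was moved into the preparation step, which is the property the callback interface depends on.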




I think if you're willing, it would help inexperienced implementers a
lot if there were some brief 'how to' snippets for each of the three
use cases.


Yes, I can definitely add some.



Thanks,
Donald




Re: [PATCH 2/3] linux/bits.h: Add fixed-width GENMASK and BIT macros

2023-06-15 Thread Andy Shevchenko
On Fri, May 12, 2023 at 09:29:23AM -0700, Lucas De Marchi wrote:
> On Fri, May 12, 2023 at 02:14:19PM +0300, Andy Shevchenko wrote:
> > On Mon, May 08, 2023 at 10:14:02PM -0700, Lucas De Marchi wrote:
> > > Add GENMASK_U32(), GENMASK_U16() and GENMASK_U8()  macros to create
> > > masks for fixed-width types and also the corresponding BIT_U32(),
> > > BIT_U16() and BIT_U8().
> > 
> > Why?
> 
> to create the masks/values for device registers that are
> of a certain width, preventing mistakes like:
> 
>   #define REG1          0x10
>   #define REG1_ENABLE   BIT(17)
>   #define REG1_FOO      GENMASK(16, 15);
> 
>   register_write(REG1_ENABLE, REG1);
> 
> 
> ... if REG1 is a 16bit register for example. There were mistakes in the
> past in the i915 source leading to the creation of the REG_* variants on
> top of normal GENMASK/BIT (see last patch and commit 09b434d4f6d2
> ("drm/i915: introduce REG_BIT() and REG_GENMASK() to define register
> contents")

Doesn't this look like a candidate for bitfield.h?
If your definition doesn't fit the given mask, bail out.
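
For readers following the thread, the kind of compile-time check being discussed can be sketched in standalone C as below. These are not the macros from the patch; the SKETCH_* names and the negative-array-size trick are assumptions for illustration, and the check only works when the arguments are non-negative build-time constants.

```c
#include <assert.h>
#include <stdint.h>

/* Compile-time guard: negative array size if h or l is out of range. */
#define MASK_CHECK(bits, h, l) \
	(0 * sizeof(char[((h) < (bits) && (l) <= (h)) ? 1 : -1]))

/* Hypothetical fixed-width mask; not the macro from the patch. */
#define SKETCH_GENMASK(type, bits, h, l)				\
	((type)(MASK_CHECK(bits, h, l) +				\
		((((type)~(type)0) >> ((bits) - 1 - (h))) &		\
		 (((type)~(type)0) << (l)))))

#define SKETCH_GENMASK_U16(h, l) SKETCH_GENMASK(uint16_t, 16, h, l)
#define SKETCH_GENMASK_U32(h, l) SKETCH_GENMASK(uint32_t, 32, h, l)
#define SKETCH_BIT_U16(n) SKETCH_GENMASK_U16(n, n)

static_assert(SKETCH_GENMASK_U16(15, 14) == 0xc000, "top two bits of a u16");
```

With these definitions, the mistake from the quoted example, SKETCH_BIT_U16(17) for a 16-bit register, fails to build instead of silently producing a value that does not fit the register.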

-- 
With Best Regards,
Andy Shevchenko




Re: [PATCH 2/3] linux/bits.h: Add fixed-width GENMASK and BIT macros

2023-06-15 Thread Andy Shevchenko
On Fri, May 12, 2023 at 02:45:19PM +0300, Jani Nikula wrote:
> On Fri, 12 May 2023, Andy Shevchenko  
> wrote:
> > On Fri, May 12, 2023 at 02:25:18PM +0300, Jani Nikula wrote:
> >> On Fri, 12 May 2023, Andy Shevchenko  
> >> wrote:
> >> > On Mon, May 08, 2023 at 10:14:02PM -0700, Lucas De Marchi wrote:
> >> >> Add GENMASK_U32(), GENMASK_U16() and GENMASK_U8()  macros to create
> >> >> masks for fixed-width types and also the corresponding BIT_U32(),
> >> >> BIT_U16() and BIT_U8().
> >> >
> >> > Why?
> >> 
> >> The main reason is that GENMASK() and BIT() size varies for 32/64 bit
> >> builds.
> >
> > When needed GENMASK_ULL() can be used (with respective castings perhaps)
> > and BIT_ULL(), no?
> 
> How does that help with making them the same 32-bit size on both 32 and
> 64 bit builds?

u32 x = GENMASK();
u64 y = GENMASK_ULL();

No? Then use in your code either x or y. Note that I assume that the parameters
to GENMASK*() are build-time constants. Is it the case for you?

-- 
With Best Regards,
Andy Shevchenko




[PATCH] drm/bridge: tc358764: Fix debug print parameter order

2023-06-15 Thread Marek Vasut
The debug print parameters were swapped in the output and they were
printed as decimal values, both the hardware address and the value.
Update the debug print to print the parameters in correct order, and
use hexadecimal print for both address and value.

Fixes: f38b7cca6d0e ("drm/bridge: tc358764: Add DSI to LVDS bridge driver")
Signed-off-by: Marek Vasut 
---
Cc: Andrzej Hajda 
Cc: Daniel Vetter 
Cc: David Airlie 
Cc: Jernej Skrabec 
Cc: Jonas Karlman 
Cc: Laurent Pinchart 
Cc: Neil Armstrong 
Cc: Robert Foss 
Cc: dri-devel@lists.freedesktop.org
---
 drivers/gpu/drm/bridge/tc358764.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/bridge/tc358764.c 
b/drivers/gpu/drm/bridge/tc358764.c
index f85654f1b1045..8e938a7480f37 100644
--- a/drivers/gpu/drm/bridge/tc358764.c
+++ b/drivers/gpu/drm/bridge/tc358764.c
@@ -176,7 +176,7 @@ static void tc358764_read(struct tc358764 *ctx, u16 addr, 
u32 *val)
if (ret >= 0)
le32_to_cpus(val);
 
-   dev_dbg(ctx->dev, "read: %d, addr: %d\n", addr, *val);
+   dev_dbg(ctx->dev, "read: addr=0x%04x data=0x%08x\n", addr, *val);
 }
 
 static void tc358764_write(struct tc358764 *ctx, u16 addr, u32 val)
-- 
2.39.2
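
The effect of the new format string can be checked outside the kernel with a small userspace sketch. snprintf() stands in for dev_dbg() here, and the function name is made up; only the format string is taken from the patch.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Formats a register read with the format string used by the fix. */
int format_read(char *buf, size_t len, uint16_t addr, uint32_t val)
{
	assert(buf && len > 0);
	return snprintf(buf, len, "read: addr=0x%04x data=0x%08" PRIx32 "\n",
			(unsigned int)addr, val);
}
```

The address is now printed first and zero-padded to four hex digits, the data second and zero-padded to eight, so both line up with how registers are usually written in a datasheet.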



Re: [PATCH] accel/qaic: Call DRM helper function to destroy prime GEM

2023-06-15 Thread Jeffrey Hugo

On 6/15/2023 1:05 AM, Christian König wrote:



Am 14.06.23 um 18:15 schrieb Jeffrey Hugo:

From: Pranjal Ramajor Asha Kanojiya 

smatch warning:
drivers/accel/qaic/qaic_data.c:620 qaic_free_object() error:
    dereferencing freed memory 'obj->import_attach'

obj->import_attach is detached and freed using dma_buf_detach(),
but it is then used after the free to decrease the dmabuf ref count
using dma_buf_put().

drm_prime_gem_destroy() handles this issue and performs the proper clean
up instead of open coding it in the driver.

Fixes: ff13be830333 ("accel/qaic: Add datapath")
Reported-by: Sukrut Bellary 
Closes: 
https://lore.kernel.org/all/20230610021200.377452-1-sukrut.bell...@linux.com/ 


Suggested-by: Christian König 
Signed-off-by: Pranjal Ramajor Asha Kanojiya 
Reviewed-by: Carl Vanderlip 
Reviewed-by: Jeffrey Hugo 
Signed-off-by: Jeffrey Hugo 


Reviewed-by: Christian König 


Thanks for the guidance and review!


Re: [PATCH 0/3] drm: Allow PRIME 'self-import' for all drivers

2023-06-15 Thread Thomas Zimmermann

Hi

Am 15.06.23 um 16:50 schrieb Simon Ser:

On Thursday, June 15th, 2023 at 11:31, Thomas Zimmermann  
wrote:


Set drm_gem_prime_handle_to_fd() and drm_gem_prime_fd_to_handle()
for all DRM drivers. Even drivers that do not support PRIME import
or export of dma-bufs can now import their own buffer objects. This
is required by some userspace, such as wlroots-based compositors, to
share buffers among processes.

The only driver that does not use the drm_gem_prime_*() helpers is
vmwgfx. Once it has been converted, the callbacks in struct drm_driver
can be removed.

Simon Ser implemented the feature for drivers based on GEM VRAM helpers
in [1]. This patchset generalizes the code for all drivers that do not
otherwise support PRIME. Tested by running sway with gma500 hardware.


Very nice! Thanks a lot for doing this!

Just one minor comment about docs. I think there are also some remaining
references to drm_gem_prime_handle_to_fd() and drm_gem_prime_fd_to_handle()
in the drm_prime.c overview. These become stale since this series unexports
these functions.


I'll address the documentation issue.



With that fixed:

Reviewed-by: Simon Ser 


Thanks a lot.

Best regards
Thomas

--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)




Re: [PATCH 0/3] drm: Allow PRIME 'self-import' for all drivers

2023-06-15 Thread Simon Ser
On Thursday, June 15th, 2023 at 11:31, Thomas Zimmermann  
wrote:

> Set drm_gem_prime_handle_to_fd() and drm_gem_prime_fd_to_handle()
> for all DRM drivers. Even drivers that do not support PRIME import
> or export of dma-bufs can now import their own buffer objects. This
> is required by some userspace, such as wlroots-based compositors, to
> share buffers among processes.
> 
> The only driver that does not use the drm_gem_prime_*() helpers is
> vmwgfx. Once it has been converted, the callbacks in struct drm_driver
> can be removed.
> 
> Simon Ser implemented the feature for drivers based on GEM VRAM helpers
> in [1]. This patchset generalizes the code for all drivers that do not
> otherwise support PRIME. Tested by running sway with gma500 hardware.

Very nice! Thanks a lot for doing this!

Just one minor comment about docs. I think there are also some remaining
references to drm_gem_prime_handle_to_fd() and drm_gem_prime_fd_to_handle()
in the drm_prime.c overview. These become stale since this series unexports
these functions.

With that fixed:

Reviewed-by: Simon Ser 


Re: [PATCH 1/3] drm: Enable PRIME import/export for all drivers

2023-06-15 Thread Simon Ser
On Thursday, June 15th, 2023 at 11:31, Thomas Zimmermann  
wrote:

> diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
> index 89e2706cac561..10af1899236a0 100644
> --- a/include/drm/drm_drv.h
> +++ b/include/drm/drm_drv.h
> @@ -309,6 +309,9 @@ struct drm_driver {
>*
>* For an in-depth discussion see :ref:`PRIME buffer sharing
>* documentation `.
> +  *
> +  * TODO: Convert remaining drivers to drm_gem_prime_handle_to_fd()
> +  *   and remove this callback.
>*/

The docs right above still state: "Should be implemented with
drm_gem_prime_handle_to_fd() for GEM based drivers". Maybe we can replace that
and state that leaving this NULL will use a default implementation?

>   int (*prime_handle_to_fd)(struct drm_device *dev, struct drm_file 
> *file_priv,
>   uint32_t handle, uint32_t flags, int *prime_fd);
> @@ -320,6 +323,9 @@ struct drm_driver {
>*
>* For an in-depth discussion see :ref:`PRIME buffer sharing
>* documentation `.
> +  *
> +  * TODO: Convert remaining drivers to drm_gem_prime_fd_to_handle()
> +  *   and remove this callback.
>*/

Ditto.

>   int (*prime_fd_to_handle)(struct drm_device *dev, struct drm_file 
> *file_priv,
>   int prime_fd, uint32_t *handle);
> --
> 2.41.0


[PATCH v15 2/2] MAINTAINERS: add maintainers for DRM LOONGSON driver

2023-06-15 Thread Sui Jingfeng
From: Sui Jingfeng 

This patch adds Sui Jingfeng as maintainer of the drm/loongson driver.

Signed-off-by: Sui Jingfeng 
---
 MAINTAINERS | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 225e20582a96..70262eb6e614 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6956,6 +6956,13 @@ T:   git git://anongit.freedesktop.org/drm/drm-misc
 F: drivers/gpu/drm/lima/
 F: include/uapi/drm/lima_drm.h
 
+DRM DRIVERS FOR LOONGSON
+M: Sui Jingfeng 
+L: dri-devel@lists.freedesktop.org
+S: Supported
+T: git git://anongit.freedesktop.org/drm/drm-misc
+F: drivers/gpu/drm/loongson/
+
 DRM DRIVERS FOR MEDIATEK
 M: Chun-Kuang Hu 
 M: Philipp Zabel 
-- 
2.25.1



[PATCH v15 0/2] drm: Add kms driver for loongson display controller

2023-06-15 Thread Sui Jingfeng
From: Sui Jingfeng 

Loongson display controller IP has been integrated in both Loongson north
bridge chipset (ls7a1000/ls7a2000) and Loongson SoCs (ls2k1000/ls2k2000).
It has even been included in Loongson's BMC products. It has two display
pipes, and each display pipe supports a primary plane and a cursor plane.

For the DC in the LS7A1000, each display pipe has a DVO output interface,
which is able to support 1920x1080@60Hz. For the DC in the LS7A2000, each
display pipe is equipped with a built-in HDMI encoder, which is compliant
with the HDMI 1.4 specification. The first display pipe is also equipped
with a transparent VGA encoder, which is parallel with the HDMI encoder.
To get decent performance when writing framebuffer data to VRAM,
write-combine support should be enabled.

v1 -> v2:
 1) Use hpd status reg when polling for ls7a2000.
 2) Fix all warnings that emerged when compiling with W=1.

v2 -> v3:
 1) Add COMPILE_TEST to Kconfig and make the driver off by default
 2) Alphabetical sorting headers (Thomas)
 3) Untangle register access functions as much as possible (Thomas)
 4) Switch to TTM-based memory manager (Thomas)
 5) Add the chip ID detection function which can be used to distinguish
chip models
 6) Revise the built-in HDMI phy driver; nearly all mainstream modes below
4K@30Hz are tested, and this driver supports clone (mirror) display mode
and extend (joint) display mode.

v3 -> v4:
 1) Quickly fix a small mistake.

v4 -> v5:
 1) Add per display pipe debugfs support to the builtin HDMI encoder.

v5 -> v6:
 1) Remove stray code which didn't get used, such as lsdc_of_get_reserved_ram
 2) Fix all typos I could find, make sentences and code more readable
 3) Untangle lsdc_hdmi*_connector_detect() function according to the pipe
 4) Rename this driver as loongson.

v6 -> v7:
1) Add prime support for buffer self-sharing, sharing buffer with
   drm/etnaviv is also tested and it works with limitations.
2) Implement buffer object tracking with list_head.
3) Add S3(sleep to RAM) support
4) Rewrite lsdc_bo_move since the TTM core stopped allocating resources
    during BO creation. Versions v1 to v6 of this series no longer work.
    Thus, we send v7.

v7 -> v8:
 1) Zero a compile warning on a 32-bit platform, compile with W=1
 2) Revise lsdc_bo_gpu_offset() and make minor cleanups.
 3) Pageflip tested on the virtual terminal with the following commands:

modetest -M loongson -s 32:1920x1080 -v
modetest -M loongson -s 34:1920x1080 -v -F tiles

   It works like a charm. When running the pageflip test with a dual-screen
   configuration, two additional BOs were created by modetest, bringing
   VRAM usage up to 40+ MB; we have at least 64MB, so this is still enough.

   # cat bos

   bo[]: size: 8112kB VRAM
   bo[0001]: size:   16kB VRAM
   bo[0002]: size:   16kB VRAM
   bo[0003]: size:16208kB VRAM
   bo[0004]: size: 8112kB VRAM
   bo[0005]: size: 8112kB VRAM

v8 -> v9:
 1) Select I2C and I2C_ALGOBIT in Kconfig, should depend on MMU.
 2) Using pci_get_domain_bus_and_slot to get the GPU device.

v9 -> v10:
 1) Revise lsdc_drm_freeze() to implement S3 correctly. We realized that
a pinned BO cannot be moved and that the VRAM loses power when sleeping
to RAM. Thus, the data in a buffer that is pinned in VRAM is lost on
resume. This is not a big problem because this driver relies on the
CPU to update the front framebuffer: garbage data is visible when
resuming from S3, but the screen shows the right image as soon as the
cursor moves, due to the CPU repaint. v10 of this patch makes S3 work
properly by unpinning all of the BOs in VRAM and evicting them all to
system RAM in lsdc_drm_freeze().

v10 -> v11:
 1) In the double-screen case, the buffer object backing the single giant
framebuffer is referenced by two GEM objects; hence, it is pinned at
least twice by the prepare_fb() function, which makes its pin count
greater than 1. V10 of this patch only unpins VRAM BOs once on suspend,
which is not correct in the double-screen case. V11 of this patch unpins
each BO until its pin count reaches zero on suspend, which finally makes
the S3 support complete. With v11, I can't see any garbage data on
resume.

 2) Fix vblank wait timeout when disable CRTC.
 3) Test against IGT, at least fbdev test and kms_flip test passed.
 4) Rewrite pixel PLL update function, magic numbers eliminated (Emil)
 5) Drop a few common hardware features description in lsdc_desc (Emil)
 6) Drop lsdc_mode_config_mode_valid(), instead add restrictions in dumb
create function. (Emil)
 7) Untangle the ls7a1000 case and ls7a2000 case completely (Thomas)

v11 -> v12:
 none

v12 -> v13:
 1) Add benchmarks to figure out the bandwidth of the hardware platform.
Usage:
# cd /sys/kernel/debug/dri/0/
# cat benchmark

 2) VRAM is filled with garbage data if uninitialized, add a buffer
clearing procedure (lsdc_bo_clear), clear the 

Re: [PATCH] drm/bridge: ps8640: Drop the ability of ps8640 to fetch the EDID

2023-06-15 Thread Doug Anderson
Hi,

On Thu, Jun 15, 2023 at 1:47 AM Pin-yen Lin  wrote:
>
> Hi Doug,
>
> On Thu, Jun 15, 2023 at 5:31 AM Doug Anderson  wrote:
> >
> > Hi,
> >
> > On Wed, Jun 14, 2023 at 1:22 AM AngeloGioacchino Del Regno
> >  wrote:
> > >
> > > Il 13/06/23 01:32, Douglas Anderson ha scritto:
> > > > In order to read the EDID from an eDP panel, you not only need to
> > > > power on the bridge chip itself but also the panel. In the ps8640
> > > > driver, this was made to work by having the bridge chip manually power
> > > > the panel on by calling pre_enable() on everything connectorward on
> > > > the bridge chain. This worked OK, but...
> > > >
> > > > ...when trying to do the same thing on ti-sn65dsi86, feedback was that
> > > > this wasn't a great idea. As a result, we designed the "DP AUX"
> > > > bus. With the design we ended up with the panel driver itself was in
> > > > charge of reading the EDID. The panel driver could power itself on and
> > > > the bridge chip was able to power itself on because it implemented the
> > > > DP AUX bus.
> > > >
> > > > Despite the fact that we came up with a new scheme, implemented it on
> > > > ti-sn65dsi86, and even implemented it on parade-ps8640, we still kept
> > > > the old code around. This was because the new scheme required a DT
> > > > change. Previously the panel was a simple "platform_device" and in DT
> > > > at the top level. With the new design the panel needs to be listed in
> > > > DT under the DP controller node. The old code allowed us to properly
> > > > fetch EDIDs with ps8640 with the old DTs.
> > > >
> > > > Unfortunately, the old code stopped working as of commit 102e80d1fa2c
> > > > ("drm/bridge: ps8640: Use atomic variants of drm_bridge_funcs"). There
> > > > are cases at bootup where connector->state->state is NULL and the
> > > > kernel crashed at:
> > > > * drm_atomic_bridge_chain_pre_enable
> > > > * drm_atomic_get_old_bridge_state
> > > > * drm_atomic_get_old_private_obj_state
> > > >
> > > > A bit of digging was done to see if there was an easy fix but there
> > > > was nothing obvious. Instead, the only device using ps8640 the "old"
> > > > way had its DT updated so that the panel was no longer a simple
> > > > "platform_device". See commit c2d94f72140a ("arm64: dts: mediatek:
> > > > mt8173-elm: Move display to ps8640 auxiliary bus") and commit
> > > > 113b5cc06f44 ("arm64: dts: mediatek: mt8173-elm: remove panel model
> > > > number in DT").
> > > >
> > > > Let's delete the old, crashing code so nobody gets tempted to copy it
> > > > or figure out how it works (since it doesn't).
> > > >
> > > > NOTE: from a device tree "purist" point of view, we're supposed to
> > > > keep old device trees working and this patch is technically "against
> > > > policy". Reasons I'm still proposing it anyway:
> > > > 1. Officially, old mt8173-elm device trees worked via the "little
> > > > white lie" approach. The DT would list an arbitrary/representative
> > > > panel that would be used for power sequencing. The mode information
> > > > in the panel driver would then be ignored / overridden by the EDID
> > > > reading code in ps8640. I don't feel too terrible breaking DTs that
> > > > contained the wrong "compatible" string to begin with. NOTE that
> > > > any old device trees that _didn't_ lie about their compatible will
> > > > still work because the mode information will come from the
> > > > hardcoded panels in panel-edp.
> > > > 2. The only users of the old code were Chromebooks and Chromebooks
> > > > don't bake their DTs into the BIOS (they are bundled with the
> > > > kernel). Thus we don't need to worry about breaking someone using
> > > > an old DT with a new kernel.
> > > > 3. The old code was crashing anyway. If someone wants to fix the old
> > > > code instead of deleting it then they have my blessing, but without
> > > > a proper fix the old code isn't useful.
> > > >
> > > > I'll list this as "Fixing" the code that made the old code start
> > > > failing. There's not much reason to backport this any further
> > > > than that.
> > >
> > > Hoping to see removal of non-aux EDID reading code from all drivers that can
> > > support aux-bus is exactly why I moved Elm to the proper... aux-bus.. so...
> > >
> > > Yes! Let's go!
> > >
> > > >
> > > > Fixes: 102e80d1fa2c ("drm/bridge: ps8640: Use atomic variants of drm_bridge_funcs")
> > >
> > > ...but this Fixes tag will cause this commit to be backported to kernel versions
> > > before my commit moving Elm to aux-bus, and break display on those.
> > >
> > > I would suggest either finding a different Fixes tag or not adding one, since
> > > technically this is a deprecation commit. We could imply that the old technique
> > > is deprecated since kernel version X.Y and get away with it.
> > >
> > > Otherwise, if you want it backported *anyway*, the safest way would be to Cc it
> > > to stable and 

Re: [PATCH drm-next v4 03/14] drm: manager to keep track of GPUs VA mappings

2023-06-15 Thread Danilo Krummrich
On Tue, Jun 13, 2023 at 08:29:35PM -0400, Liam R. Howlett wrote:
> * Danilo Krummrich  [230606 18:32]:
> > Add infrastructure to keep track of GPU virtual address (VA) mappings
> > > with a dedicated VA space manager implementation.
> > 
> > New UAPIs, motivated by Vulkan sparse memory bindings graphics drivers
> > start implementing, allow userspace applications to request multiple and
> > arbitrary GPU VA mappings of buffer objects. The DRM GPU VA manager is
> > intended to serve the following purposes in this context.
> > 
> > > 1) Provide infrastructure to track GPU VA allocations and mappings,
> > >    making use of the maple_tree.
> > > 
> > > 2) Generically connect GPU VA mappings to their backing buffers, in
> > >    particular DRM GEM objects.
> > > 
> > > 3) Provide a common implementation to perform more complex mapping
> > >    operations on the GPU VA space. In particular splitting and merging
> > >    of GPU VA mappings, e.g. for intersecting mapping requests or partial
> > >    unmap requests.
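
The splitting described in point 3 can be illustrated with a minimal, self-contained sketch. It is not the code from this patch: `struct va_mapping` and `va_split_on_unmap()` are hypothetical names, and the sketch only assumes that a mapping is an address range with an offset into its backing buffer. A partial unmap then leaves zero, one, or two remainders of the original mapping.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one tracked mapping: [addr, addr + range). */
struct va_mapping {
	uint64_t addr;      /* start of the GPU virtual address range */
	uint64_t range;     /* length of the range in bytes */
	uint64_t bo_offset; /* offset into the backing buffer object */
};

/*
 * Work out what is left of mapping @m after [req_addr, req_addr + req_range)
 * has been unmapped. Up to two remainders are written to @out: the part
 * before the hole and the part after it. Returns the number of remainders.
 * If the request does not overlap @m at all, @m is returned unchanged.
 */
static int va_split_on_unmap(const struct va_mapping *m,
			     uint64_t req_addr, uint64_t req_range,
			     struct va_mapping out[2])
{
	uint64_t m_end = m->addr + m->range;
	uint64_t req_end = req_addr + req_range;
	int n = 0;

	assert(m->range > 0 && req_range > 0);

	/* No intersection: the mapping survives as-is. */
	if (req_end <= m->addr || req_addr >= m_end) {
		out[n++] = *m;
		return n;
	}

	/* Remainder in front of the unmapped hole keeps the old offset. */
	if (req_addr > m->addr) {
		out[n].addr = m->addr;
		out[n].range = req_addr - m->addr;
		out[n].bo_offset = m->bo_offset;
		n++;
	}

	/* Remainder behind the hole: its buffer offset advances with it. */
	if (req_end < m_end) {
		out[n].addr = req_end;
		out[n].range = m_end - req_end;
		out[n].bo_offset = m->bo_offset + (req_end - m->addr);
		n++;
	}

	return n;
}
```

Unmapping 0x1000 bytes at 0x2000 from a mapping covering 0x1000..0x5000 yields two remainders under these assumptions, 0x1000..0x2000 and 0x3000..0x5000, with the second one starting 0x2000 bytes into the backing buffer.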
> > 
> > Suggested-by: Dave Airlie 
> > Signed-off-by: Danilo Krummrich 
> > ---
> > >  Documentation/gpu/drm-mm.rst    |   31 +
> > >  drivers/gpu/drm/Makefile        |    1 +
> > >  drivers/gpu/drm/drm_gem.c       |    3 +
> > >  drivers/gpu/drm/drm_gpuva_mgr.c | 1687 +++
> > >  include/drm/drm_drv.h           |    6 +
> > >  include/drm/drm_gem.h           |   75 ++
> > >  include/drm/drm_gpuva_mgr.h     |  681 +
> >  7 files changed, 2484 insertions(+)
> >  create mode 100644 drivers/gpu/drm/drm_gpuva_mgr.c
> >  create mode 100644 include/drm/drm_gpuva_mgr.h
> > 
> > diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
> > index a52e6f4117d6..c9f120cfe730 100644
> > --- a/Documentation/gpu/drm-mm.rst
> > +++ b/Documentation/gpu/drm-mm.rst
> > @@ -466,6 +466,37 @@ DRM MM Range Allocator Function References
> >  .. kernel-doc:: drivers/gpu/drm/drm_mm.c
> > :export:
> >  
> > > +DRM GPU VA Manager
> > > +==================
> > > +
> > > +Overview
> > > +--------
> > +
> > +.. kernel-doc:: drivers/gpu/drm/drm_gpuva_mgr.c
> > +   :doc: Overview
> > +
> > > +Split and Merge
> > > +---------------
> > +
> > +.. kernel-doc:: drivers/gpu/drm/drm_gpuva_mgr.c
> > +   :doc: Split and Merge
> > +
> > > +Locking
> > > +-------
> > +
> > +.. kernel-doc:: drivers/gpu/drm/drm_gpuva_mgr.c
> > +   :doc: Locking
> > +
> > +
> > > +DRM GPU VA Manager Function References
> > > +--------------------------------------
> > +
> > +.. kernel-doc:: include/drm/drm_gpuva_mgr.h
> > +   :internal:
> > +
> > +.. kernel-doc:: drivers/gpu/drm/drm_gpuva_mgr.c
> > +   :export:
> > +
> > >  DRM Buddy Allocator
> > >  ===================
> >  
> > diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
> > index 9c6446eb3c83..8eeed446a078 100644
> > --- a/drivers/gpu/drm/Makefile
> > +++ b/drivers/gpu/drm/Makefile
> > @@ -45,6 +45,7 @@ drm-y := \
> > drm_vblank.o \
> > drm_vblank_work.o \
> > drm_vma_manager.o \
> > +   drm_gpuva_mgr.o \
> > drm_writeback.o
> >  drm-$(CONFIG_DRM_LEGACY) += \
> > drm_agpsupport.o \
> > diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> > index 1a5a2cd0d4ec..cd878ebddbd0 100644
> > --- a/drivers/gpu/drm/drm_gem.c
> > +++ b/drivers/gpu/drm/drm_gem.c
> > @@ -164,6 +164,9 @@ void drm_gem_private_object_init(struct drm_device *dev,
> > if (!obj->resv)
> > > obj->resv = &obj->_resv;
> >  
> > +   if (drm_core_check_feature(dev, DRIVER_GEM_GPUVA))
> > +   drm_gem_gpuva_init(obj);
> > +
> > > drm_vma_node_reset(&obj->vma_node);
> > > INIT_LIST_HEAD(&obj->lru_node);
> >  }
> > > diff --git a/drivers/gpu/drm/drm_gpuva_mgr.c b/drivers/gpu/drm/drm_gpuva_mgr.c
> > > new file mode 100644
> > > index 000000000000..dd8dd7fef14b
> > --- /dev/null
> > +++ b/drivers/gpu/drm/drm_gpuva_mgr.c
> > @@ -0,0 +1,1687 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (c) 2022 Red Hat.
> > + *
> > + * Permission is hereby granted, free of charge, to any person obtaining a
> > > + * copy of this software and associated documentation files (the "Software"),
> > > + * to deal in the Software without restriction, including without limitation
> > + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> > + * and/or sell copies of the Software, and to permit persons to whom the
> > + * Software is furnished to do so, subject to the following conditions:
> > + *
> > > + * The above copyright notice and this permission notice shall be included in
> > + * all copies or substantial portions of the Software.
> > + *
> > > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> > + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
> > + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> > + * ARISING 

Re: [PATCH v2 4/4] drm/mgag200: Use DMA to copy the framebuffer to the VRAM

2023-06-15 Thread Thomas Zimmermann

Hi Jocelyn

On 31.05.23 at 11:21, Jocelyn Falempe wrote:

Even if the transfer is not faster, it brings significant
improvement in latencies and CPU usage.

CPU usage drops from 100% of one core to 3% when continuously
refreshing the screen.


I tried your patchset on an HP ProLiant server with a G200EH. I can see
that the CPU usage goes down, but the time until the screen update
reaches the hardware's video memory has increased significantly.


Any display update that is more than just moving the mouse results in
tearing. I can see how the individual scanlines are updated from top to
bottom. That takes ~1 sec per full frame. So this patch takes the
display from slow to barely usable.


Best regards
Thomas



The primary DMA is used to send commands (register write), and
the secondary DMA to send the pixel data.
It uses the ILOAD command (chapter 4.5.7 in the G200 specification),
which allows loading an image, or part of an image, from system
memory to VRAM.
The last command sent is a softrap command, which triggers an IRQ
when the DMA transfer is complete.
For 16-bit and 24-bit pixel widths, each line is padded to make sure
the DMA transfer is a multiple of 32 bits. The padded bytes won't be
drawn on the screen, so they don't need to be initialized.
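
The line padding described above amounts to rounding each scanline up to the next multiple of 4 bytes. The following is a minimal sketch of that arithmetic only, not code from the patch; the helper name `padded_line_bytes()` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of bytes one scanline occupies in the DMA buffer once it has been
 * rounded up to a whole number of 32-bit words. Hypothetical helper, used
 * here only to illustrate the padding rule.
 */
static size_t padded_line_bytes(size_t width_px, size_t bytes_per_px)
{
	size_t raw = width_px * bytes_per_px;

	assert(bytes_per_px >= 1 && bytes_per_px <= 4);

	/* Round up to the next multiple of 4 bytes (32 bits). */
	return (raw + 3) & ~(size_t)3;
}
```

With these assumptions, a 1366-pixel line at 24 bits per pixel is 4098 bytes of pixel data and is padded to 4100 bytes, while any line at 32 bits per pixel needs no padding at all.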

Signed-off-by: Jocelyn Falempe 
---
  drivers/gpu/drm/mgag200/Makefile  |   3 +-
  drivers/gpu/drm/mgag200/mgag200_dma.c | 237 ++
  drivers/gpu/drm/mgag200/mgag200_drv.c |   4 +-
  drivers/gpu/drm/mgag200/mgag200_drv.h |  29 +++
  drivers/gpu/drm/mgag200/mgag200_g200.c|   4 +
  drivers/gpu/drm/mgag200/mgag200_g200eh.c  |   4 +
  drivers/gpu/drm/mgag200/mgag200_g200eh3.c |   4 +
  drivers/gpu/drm/mgag200/mgag200_g200er.c  |   4 +
  drivers/gpu/drm/mgag200/mgag200_g200ev.c  |   4 +
  drivers/gpu/drm/mgag200/mgag200_g200ew3.c |   4 +
  drivers/gpu/drm/mgag200/mgag200_g200se.c  |   4 +
  drivers/gpu/drm/mgag200/mgag200_g200wb.c  |   4 +
  drivers/gpu/drm/mgag200/mgag200_mode.c|  15 +-
  drivers/gpu/drm/mgag200/mgag200_reg.h |  25 +++
  14 files changed, 333 insertions(+), 12 deletions(-)
  create mode 100644 drivers/gpu/drm/mgag200/mgag200_dma.c

diff --git a/drivers/gpu/drm/mgag200/Makefile b/drivers/gpu/drm/mgag200/Makefile
index 182e224c460d..96e23dc5572c 100644
--- a/drivers/gpu/drm/mgag200/Makefile
+++ b/drivers/gpu/drm/mgag200/Makefile
@@ -11,6 +11,7 @@ mgag200-y := \
mgag200_g200se.o \
mgag200_g200wb.o \
mgag200_i2c.o \
-   mgag200_mode.o
+   mgag200_mode.o \
+   mgag200_dma.o
  
  obj-$(CONFIG_DRM_MGAG200) += mgag200.o

diff --git a/drivers/gpu/drm/mgag200/mgag200_dma.c b/drivers/gpu/drm/mgag200/mgag200_dma.c
new file mode 100644
index 000000000000..7e9b59ef08d9
--- /dev/null
+++ b/drivers/gpu/drm/mgag200/mgag200_dma.c
@@ -0,0 +1,237 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright 2023 Red Hat
+ *
+ * Authors: Jocelyn Falempe
+ *
+ */
+
+#include 
+#include 
+#include 
+
+#include 
+
+#include "mgag200_drv.h"
+#include "mgag200_reg.h"
+
+/* Maximum number of command blocks for one DMA transfer
+ * iload should only use 4 blocks
+ */
+#define MGA_MAX_CMD   50
+
+#define MGA_DMA_SIZE   (4 * 1024 * 1024)
+#define MGA_MIN_DMA_SIZE   (64 * 1024)
+
+/*
+ * Allocate coherent buffers for DMA transfer.
+ * We need two buffers, one for the commands, and one for the data.
+ */
+int mgag200_dma_init(struct mga_device *mdev)
+{
+   struct drm_device *dev = &mdev->base;
+   struct mga_dma *dma = &mdev->dma;
+   int size;
+   /* Allocate the command buffer */
+   dma->cmd = dmam_alloc_coherent(dev->dev, MGA_MAX_CMD * sizeof(*dma->cmd),
+   &dma->cmd_handle, GFP_KERNEL);
+
+   if (!dma->cmd) {
+   drm_err(dev, "DMA command buffer allocation failed\n");
+   return -ENOMEM;
+   }
+
+   for (size = MGA_DMA_SIZE; size >= MGA_MIN_DMA_SIZE; size = size >> 1) {
+   dma->buf = dmam_alloc_coherent(dev->dev, size, &dma->handle, GFP_KERNEL);
+   if (dma->buf)
+   break;
+   }
+   if (!dma->buf) {
+   drm_err(dev, "DMA data buffer allocation failed\n");
+   return -ENOMEM;
+   }
+   dma->size = size;
+   drm_info(dev, "Using DMA with a %dk data buffer\n", size / 1024);
+
+   init_waitqueue_head(&dma->waitq);
+   return 0;
+}
+
+/*
+ * Matrox uses a command block to program the hardware through DMA.
+ * Each command is a register write, and each block contains 4 commands
+ * packed in 5 dwords (u32).
+ * The first dword holds the four 8-bit register indices for the 4 commands,
+ * followed by the four values to be written.
+ */
+static void mgag200_dma_add_block(struct mga_device *mdev,
+  u32 reg0, u32 val0,
+  u32 reg1, u32 val1,
+  u32 reg2, u32 val2,
+  u32 reg3, u32 val3)
+{
+   if 
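
The command block layout described in the comment above (one dword of four 8-bit register indices followed by four value dwords) can be sketched as follows. This is an illustration rather than the patch's own code: `struct cmd_block` and `pack_cmd_block()` are hypothetical names, and placing the first command's index in the lowest byte of the first dword is an assumption that the quoted text does not confirm.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical in-memory view of one command block: 5 dwords. */
struct cmd_block {
	uint32_t dw[5];
};

/*
 * Pack four register writes into one block. dw[0] carries the four 8-bit
 * register indices, dw[1]..dw[4] carry the values to write.
 * ASSUMPTION: index i sits in byte i of dw[0], lowest byte first.
 */
static struct cmd_block pack_cmd_block(const uint8_t idx[4],
				       const uint32_t val[4])
{
	struct cmd_block b;
	int i;

	assert(idx && val);

	b.dw[0] = 0;
	for (i = 0; i < 4; i++) {
		b.dw[0] |= (uint32_t)idx[i] << (8 * i);
		b.dw[1 + i] = val[i];
	}

	return b;
}
```

Each block is therefore 20 bytes, so the 50-block command buffer defined by MGA_MAX_CMD would occupy 1000 bytes if a command entry matches this layout.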
