date:20230201

Re: [Intel-gfx] [PATCH v3] drm/i915: Consolidate TLB invalidation flow

2023-02-01 Thread Andrzej Hajda


On 01.02.2023 17:51, Tvrtko Ursulin wrote:

From: Tvrtko Ursulin 

As the logic for selecting the register and corresponding values grew, the
code became a bit unsightly. Consolidate by storing the required values at
engine init time in the engine itself, and by doing so minimise the amount
of invariant platform and engine checks during each and every TLB
invalidation.

v2:
  * Fail engine probe if TLB invalidation registers are unknown.

v3:
  * Rebase.

Signed-off-by: Tvrtko Ursulin 
Cc: Andrzej Hajda 
Cc: Matt Roper 
Reviewed-by: Andrzej Hajda  # v1
---
  drivers/gpu/drm/i915/gt/intel_engine_cs.c|  96 +
  drivers/gpu/drm/i915/gt/intel_engine_types.h |  15 ++
  drivers/gpu/drm/i915/gt/intel_gt.c   | 138 +++
  3 files changed, 133 insertions(+), 116 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index d4e29da74612..e430945743ec 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -9,6 +9,7 @@
  
  #include "gem/i915_gem_context.h"

  #include "gem/i915_gem_internal.h"
+#include "gt/intel_gt_print.h"
  #include "gt/intel_gt_regs.h"
  
  #include "i915_cmd_parser.h"

@@ -1143,12 +1144,107 @@ static int init_status_page(struct intel_engine_cs 
*engine)
return ret;
  }
  
+static int intel_engine_init_tlb_invalidation(struct intel_engine_cs *engine)

+{
+   static const union intel_engine_tlb_inv_reg gen8_regs[] = {
+   [RENDER_CLASS].reg  = GEN8_RTCR,
+   [VIDEO_DECODE_CLASS].reg= GEN8_M1TCR, /* , GEN8_M2TCR */
+   [VIDEO_ENHANCEMENT_CLASS].reg   = GEN8_VTCR,
+   [COPY_ENGINE_CLASS].reg = GEN8_BTCR,
+   };
+   static const union intel_engine_tlb_inv_reg gen12_regs[] = {
+   [RENDER_CLASS].reg  = GEN12_GFX_TLB_INV_CR,
+   [VIDEO_DECODE_CLASS].reg= GEN12_VD_TLB_INV_CR,
+   [VIDEO_ENHANCEMENT_CLASS].reg   = GEN12_VE_TLB_INV_CR,
+   [COPY_ENGINE_CLASS].reg = GEN12_BLT_TLB_INV_CR,
+   [COMPUTE_CLASS].reg = GEN12_COMPCTX_TLB_INV_CR,
+   };
+   static const union intel_engine_tlb_inv_reg xehp_regs[] = {
+   [RENDER_CLASS].mcr_reg= XEHP_GFX_TLB_INV_CR,
+   [VIDEO_DECODE_CLASS].mcr_reg  = XEHP_VD_TLB_INV_CR,
+   [VIDEO_ENHANCEMENT_CLASS].mcr_reg = XEHP_VE_TLB_INV_CR,
+   [COPY_ENGINE_CLASS].mcr_reg   = XEHP_BLT_TLB_INV_CR,
+   [COMPUTE_CLASS].mcr_reg   = XEHP_COMPCTX_TLB_INV_CR,
+   };
+   struct drm_i915_private *i915 = engine->i915;
+   const union intel_engine_tlb_inv_reg *regs;
+   union intel_engine_tlb_inv_reg reg;
+   unsigned int class = engine->class;
+   unsigned int num = 0;
+   u32 val;
+
+   /*
+* New platforms should not be added with catch-all-newer (>=)
+* condition so that any later platform added triggers the below warning
+* and in turn mandates a human cross-check of whether the invalidation
+* flows have compatible semantics.
+*
+* For instance with the 11.00 -> 12.00 transition three out of five
+* respective engine registers were moved to masked type. Then after the
+* 12.00 -> 12.50 transition multi cast handling is required too.
+*/
+
+   if (GRAPHICS_VER_FULL(i915) == IP_VER(12, 50) ||
+   GRAPHICS_VER_FULL(i915) == IP_VER(12, 55)) {
+   regs = xehp_regs;
+   num = ARRAY_SIZE(xehp_regs);
+   } else if (GRAPHICS_VER_FULL(i915) == IP_VER(12, 0) ||
+  GRAPHICS_VER_FULL(i915) == IP_VER(12, 10)) {
+   regs = gen12_regs;
+   num = ARRAY_SIZE(gen12_regs);
+   } else if (GRAPHICS_VER(i915) >= 8 && GRAPHICS_VER(i915) <= 11) {
+   regs = gen8_regs;
+   num = ARRAY_SIZE(gen8_regs);
+   } else if (GRAPHICS_VER(i915) < 8) {
+   return 0;
+   }
+
+   if (gt_WARN_ONCE(engine->gt, !num,
+"Platform does not implement TLB invalidation!"))
+   return -ENODEV;
+
+   if (gt_WARN_ON_ONCE(engine->gt,
+class >= num ||
+(!regs[class].reg.reg &&
+ !regs[class].mcr_reg.reg)))
+   return -ERANGE;
+
+   reg = regs[class];
+
+   if (GRAPHICS_VER(i915) == 8 && class == VIDEO_DECODE_CLASS) {


As the selftest pointed out, it should also cover gen 9-11.
Btw, maybe it is worth converting this pseudo array indexing to a direct
assignment:
if (GRAPHICS_VER(i915) <= 11 && class == VIDEO_DECODE_CLASS &&
    engine->instance == 1) {
	reg.reg = GEN8_M2TCR;
	val = 0;
}

Regards
Andrzej


+   reg.reg.reg += 4 * engine->instance; /* GEN8_M2TCR */
+   val = 0;
+   } else {
+
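
For readers following the review comment, the idea of a per-class register
table with a direct assignment for the second video decode instance can be
sketched in standalone C. This is only an illustration under stated
assumptions, not the driver code: the enum values, the register offsets, and
the names struct engine and tlb_inv_reg_lookup() are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical engine classes; the values are illustrative only. */
enum engine_class {
	CLASS_RENDER,
	CLASS_VIDEO_DECODE,
	CLASS_VIDEO_ENHANCE,
	CLASS_COPY,
	CLASS_COMPUTE,	/* deliberately absent from the table below */
};

/* Made-up register offsets standing in for the per-class registers. */
#define REG_RENDER	0x4260u
#define REG_VD0		0x4264u
#define REG_VD1		0x4268u
#define REG_VE		0x4270u
#define REG_COPY	0x4274u

struct engine {
	unsigned int class;
	unsigned int instance;
};

/* Table indexed by engine class, built with designated initializers. */
static const uint32_t regs[] = {
	[CLASS_RENDER]		= REG_RENDER,
	[CLASS_VIDEO_DECODE]	= REG_VD0,
	[CLASS_VIDEO_ENHANCE]	= REG_VE,
	[CLASS_COPY]		= REG_COPY,
};

/* Returns 0 and stores the offset in *out, or -1 if no register is known. */
static int tlb_inv_reg_lookup(const struct engine *e, uint32_t *out)
{
	size_t num = sizeof(regs) / sizeof(regs[0]);

	/* Reject classes beyond the table or with an empty slot. */
	if (e->class >= num || !regs[e->class])
		return -1;

	*out = regs[e->class];

	/* Direct assignment for the second decode instance, no offset math. */
	if (e->class == CLASS_VIDEO_DECODE && e->instance == 1)
		*out = REG_VD1;

	return 0;
}
```

The bounds check mirrors the "class >= num" test in the patch: an engine class
with no table entry fails the lookup instead of reading past the array.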

[PATCH v2] drm/mediatek: dp: Only trigger DRM HPD events if bridge is attached

2023-02-01 Thread Chen-Yu Tsai

The MediaTek DisplayPort interface bridge driver starts its interrupts
as soon as it is probed. However, when the interrupts trigger, the bridge
might not have been attached to a DRM device. As drm_helper_hpd_irq_event()
does not check whether the passed-in drm_device is valid, a NULL pointer
passed in results in a kernel NULL pointer dereference in it.

Check whether the bridge is attached and only trigger an HPD event if
it is.

Fixes: f70ac097a2cf ("drm/mediatek: Add MT8195 Embedded DisplayPort driver")
Signed-off-by: Chen-Yu Tsai 
Reviewed-by: Guillaume Ranquet 
---
Changes since v1
- Dropped prerequisite-patch-ids
- Added Guillaume's Reviewed-by

This applies on top of mediatek-drm-next.

 drivers/gpu/drm/mediatek/mtk_dp.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/mediatek/mtk_dp.c 
b/drivers/gpu/drm/mediatek/mtk_dp.c
index 1f94fcc144d3..a82f53e1a146 100644
--- a/drivers/gpu/drm/mediatek/mtk_dp.c
+++ b/drivers/gpu/drm/mediatek/mtk_dp.c
@@ -1823,7 +1823,8 @@ static irqreturn_t mtk_dp_hpd_event_thread(int hpd, void 
*dev)
spin_unlock_irqrestore(&mtk_dp->irq_thread_lock, flags);
 
if (status & MTK_DP_THREAD_CABLE_STATE_CHG) {
-   drm_helper_hpd_irq_event(mtk_dp->bridge.dev);
+   if (mtk_dp->bridge.dev)
+   drm_helper_hpd_irq_event(mtk_dp->bridge.dev);
 
if (!mtk_dp->train_info.cable_plugged_in) {
mtk_dp_disable_sdp_aui(mtk_dp);
-- 
2.39.1.456.gfc5497dd1b-goog

[pull] amdgpu drm-fixes-6.2

2023-02-01 Thread Alex Deucher

Hi Dave, Daniel,

Fixes for 6.2.

The following changes since commit 6d796c50f84ca79f1722bb131799e5a5710c4700:

  Linux 6.2-rc6 (2023-01-29 13:59:43 -0800)

are available in the Git repository at:

  https://gitlab.freedesktop.org/agd5f/linux.git tags/amd-drm-fixes-6.2-2023-02-01

for you to fetch changes up to 6fc547a5a2ef5ce05b16924106663ab92f8f87a7:

  drm/amd/display: Properly handle additional cases where DCN is not supported (2023-02-01 22:45:51 -0500)


amd-drm-fixes-6.2-2023-02-01:

amdgpu:
- GC11 fixes
- DCN 3.1.4 fixes
- NBIO 4.3 fix
- DCN 3.2 fixes
- Properly handle additional cases where DCN is not supported
- SMU13 fixes


Alex Deucher (1):
  drm/amd/display: Properly handle additional cases where DCN is not supported

Daniel Miess (2):
  drm/amd/display: Add missing brackets in calculation
  drm/amd/display: Adjust downscaling limits for dcn314

Evan Quan (1):
  drm/amdgpu: enable HDP SD for gfx 11.0.3

George Shen (1):
  drm/amd/display: Unassign does_plane_fit_in_mall function from dcn3.2

Graham Sider (1):
  drm/amdgpu: update wave data type to 3 for gfx11

Mario Limonciello (1):
  drm/amd: Fix initialization for nbio 4.3.0

Nicholas Kazlauskas (1):
  drm/amd/display: Reset DMUB mailbox SW state after HW reset

Tim Huang (1):
  drm/amd/pm: drop unneeded dpm features disablement for SMU 13.0.4/11

Yiqing Yao (1):
  drm/amdgpu: Enable vclk dclk node for gc11.0.3

 drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c |  4 ++--
 drivers/gpu/drm/amd/amdgpu/nbio_v4_3.c |  8 +++-
 drivers/gpu/drm/amd/amdgpu/soc21.c |  3 ++-
 drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c  | 11 +++
 drivers/gpu/drm/amd/display/dc/dcn314/dcn314_resource.c|  5 +++--
 drivers/gpu/drm/amd/display/dc/dcn32/dcn32_init.c  |  2 +-
 .../drm/amd/display/dc/dml/dcn314/display_mode_vba_314.c   |  2 +-
 drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c| 12 
 drivers/gpu/drm/amd/pm/amdgpu_pm.c |  6 --
 drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c  | 14 ++
 10 files changed, 57 insertions(+), 10 deletions(-)

Re: [v1 3/3] drm/msm/disp/dpu1: reserve the resources on topology change

2023-02-01 Thread kernel test robot

Hi Kalyan,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on drm-misc/drm-misc-next]
[also build test ERROR on drm/drm-next drm-exynos/exynos-drm-next 
drm-intel/for-linux-next drm-intel/for-linux-next-fixes drm-tip/drm-tip 
linus/master v6.2-rc6 next-20230201]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:
https://github.com/intel-lab-lkp/linux/commits/Kalyan-Thota/drm-msm-disp-dpu1-clear-dspp-reservations-in-rm-release/20230130-232224
base:   git://anongit.freedesktop.org/drm/drm-misc drm-misc-next
patch link:
https://lore.kernel.org/r/1675092092-26412-4-git-send-email-quic_kalyant%40quicinc.com
patch subject: [v1 3/3] drm/msm/disp/dpu1: reserve the resources on topology change
config: riscv-randconfig-r042-20230130 (https://download.01.org/0day-ci/archive/20230202/202302021238.o9yx7mks-...@intel.com/config)
compiler: clang version 16.0.0 (https://github.com/llvm/llvm-project 4196ca3278f78c6e19246e54ab0ecb364e37d66a)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install riscv cross compiling tool for clang build
        # apt-get install binutils-riscv64-linux-gnu
        # https://github.com/intel-lab-lkp/linux/commit/4c49c3233fc18f3b746a96b5ff4ce5008da3bfec
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Kalyan-Thota/drm-msm-disp-dpu1-clear-dspp-reservations-in-rm-release/20230130-232224
        git checkout 4c49c3233fc18f3b746a96b5ff4ce5008da3bfec
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=riscv olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=riscv SHELL=/bin/bash drivers/gpu/drm/msm/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot 

All errors (new ones prefixed by >>):

>> drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c:2091:6: error: conflicting types for 'dpu_encoder_prepare_commit'
   void dpu_encoder_prepare_commit(struct drm_encoder *drm_enc)
        ^
   drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h:155:6: note: previous declaration is here
   void dpu_encoder_prepare_commit(struct drm_encoder *drm_enc,
        ^
   1 error generated.
--
>> drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c:443:38: error: too few arguments to function call, expected 2, have 1
   dpu_encoder_prepare_commit(encoder);
   ~~~~~~~~~~~~~~~~~~~~~~~~~~        ^
   drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h:155:6: note: 'dpu_encoder_prepare_commit' declared here
   void dpu_encoder_prepare_commit(struct drm_encoder *drm_enc,
        ^
   1 error generated.


vim +/dpu_encoder_prepare_commit +2091 drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c

ae4d721ce10057a Abhinav Kumar     2022-04-26  2090  
25fdd5933e4c0f5 Jeykumar Sankaran 2018-06-27 @2091  void dpu_encoder_prepare_commit(struct drm_encoder *drm_enc)
25fdd5933e4c0f5 Jeykumar Sankaran 2018-06-27  2092  {
25fdd5933e4c0f5 Jeykumar Sankaran 2018-06-27  2093  	struct dpu_encoder_virt *dpu_enc;
25fdd5933e4c0f5 Jeykumar Sankaran 2018-06-27  2094  	struct dpu_encoder_phys *phys;
25fdd5933e4c0f5 Jeykumar Sankaran 2018-06-27  2095  	int i;
25fdd5933e4c0f5 Jeykumar Sankaran 2018-06-27  2096  
25fdd5933e4c0f5 Jeykumar Sankaran 2018-06-27  2097  	if (!drm_enc) {
25fdd5933e4c0f5 Jeykumar Sankaran 2018-06-27  2098  		DPU_ERROR("invalid encoder\n");
25fdd5933e4c0f5 Jeykumar Sankaran 2018-06-27  2099  		return;
25fdd5933e4c0f5 Jeykumar Sankaran 2018-06-27  2100  	}
25fdd5933e4c0f5 Jeykumar Sankaran 2018-06-27  2101  	dpu_enc = to_dpu_encoder_virt(drm_enc);
25fdd5933e4c0f5 Jeykumar Sankaran 2018-06-27  2102  
25fdd5933e4c0f5 Jeykumar Sankaran 2018-06-27  2103  	for (i = 0; i < dpu_enc->num_phys_encs; i++) {
25fdd5933e4c0f5 Jeykumar Sankaran 2018-06-27  2104  		phys = dpu_enc->phys_encs[i];
b6fadcade627040 Drew Davenport    2019-12-06  2105  		if (phys->ops.prepare_commit)
25fdd5933e4c0f5 Jeykumar Sankaran 2018-06-27  2106  			phys->ops.prepare_commit(phys);
25fdd5933e4c0f5 Jeykumar Sankaran 2018-06-27  2107  	}
25fdd5933e4c0f5 Jeykumar Sankaran 2018-06-27  2108  }
25fdd5933e4c0f5 Jeykumar Sankaran 2018-06-27  2109  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

[PATCH] drm/i915/hwmon: Enable PL1 power limit

2023-02-01 Thread Ashutosh Dixit

Previous documentation suggested that PL1 power limit is always
enabled. However we now find this not to be the case on some
platforms (such as ATSM). Therefore enable PL1 power limit during hwmon
initialization.

Signed-off-by: Ashutosh Dixit 
---
 drivers/gpu/drm/i915/i915_hwmon.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_hwmon.c 
b/drivers/gpu/drm/i915/i915_hwmon.c
index 1225bc432f0d5..4683a5b96eff1 100644
--- a/drivers/gpu/drm/i915/i915_hwmon.c
+++ b/drivers/gpu/drm/i915/i915_hwmon.c
@@ -687,6 +687,11 @@ hwm_get_preregistration_info(struct drm_i915_private *i915)
for_each_gt(gt, i915, i)
hwm_energy(&hwmon->ddat_gt[i], &energy);
}
+
+   /* Enable PL1 power limit */
+   if (i915_mmio_reg_valid(hwmon->rg.pkg_rapl_limit))
+           hwm_locked_with_pm_intel_uncore_rmw(ddat, hwmon->rg.pkg_rapl_limit,
+                                               PKG_PWR_LIM_1_EN, PKG_PWR_LIM_1_EN);
 }
 
 void i915_hwmon_register(struct drm_i915_private *i915)
-- 
2.38.0
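
The call above passes PKG_PWR_LIM_1_EN as both the clear mask and the set
value. Assuming the usual read-modify-write semantics, new = (old & ~clear) |
set, this forces the enable bit on and leaves all other bits untouched. The
sketch below models that arithmetic only; rmw() and FAKE_PWR_LIM_1_EN
(including its bit position) are assumptions for illustration, not the i915
definitions.

```c
#include <stdint.h>

/* Assumed bit position, chosen only for this illustration. */
#define FAKE_PWR_LIM_1_EN	(1u << 15)

/* Model of a read-modify-write: clear the masked bits, then OR in set. */
static uint32_t rmw(uint32_t old, uint32_t clear, uint32_t set)
{
	return (old & ~clear) | set;
}

/* Enable the limit while preserving every other field in the register. */
static uint32_t enable_pl1(uint32_t old)
{
	return rmw(old, FAKE_PWR_LIM_1_EN, FAKE_PWR_LIM_1_EN);
}
```

The operation is idempotent: applying it to a value that already has the bit
set returns the same value.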

Re: [PATCH] drm/virtio: exbuf->fence_fd unmodified on interrupted wait

2023-02-01 Thread Dmitry Osipenko

On 1/27/23 01:58, Ryan Neph wrote:
> An interrupted dma_fence_wait() becomes an -ERESTARTSYS returned
> to userspace ioctl(DRM_IOCTL_VIRTGPU_EXECBUFFER) calls, prompting to
> retry the ioctl(), but the passed exbuf->fence_fd has been reset to -1,
> making the retry attempt fail at sync_file_get_fence().
> 
> The uapi for DRM_IOCTL_VIRTGPU_EXECBUFFER is changed to retain the
> passed value for exbuf->fence_fd when returning ERESTARTSYS or EINTR.
> 
> Fixes: 2cd7b6f08bc4 ("drm/virtio: add in/out fence support for explicit 
> synchronization")
> Signed-off-by: Ryan Neph 
> ---
> 
>  drivers/gpu/drm/virtio/virtgpu_ioctl.c | 9 ++---
>  include/uapi/drm/virtgpu_drm.h | 3 +++
>  2 files changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c 
> b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
> index 9f4a90493aea..ffce4e2a409a 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c
> +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
> @@ -132,6 +132,8 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device 
> *dev, void *data,
>   uint64_t fence_ctx;
>   uint32_t ring_idx;
>  
> + exbuf->fence_fd = -1;
> +
>   fence_ctx = vgdev->fence_drv.context;
>   ring_idx = 0;
>  
> @@ -152,8 +154,6 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device 
> *dev, void *data,
>   ring_idx = exbuf->ring_idx;
>   }
>  
> - exbuf->fence_fd = -1;
> -
>   virtio_gpu_create_context(dev, file);
>   if (exbuf->flags & VIRTGPU_EXECBUF_FENCE_FD_IN) {
>   struct dma_fence *in_fence;
> @@ -173,7 +173,7 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device 
> *dev, void *data,
>  
>   dma_fence_put(in_fence);
>   if (ret)
> - return ret;
> + goto out_err;
>   }
>  
>   if (exbuf->flags & VIRTGPU_EXECBUF_FENCE_FD_OUT) {
> @@ -259,6 +259,9 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device 
> *dev, void *data,
>  
>   if (out_fence_fd >= 0)
>   put_unused_fd(out_fence_fd);
> +out_err:
> + if (ret == -EINTR || ret == -ERESTARTSYS)
> + exbuf->fence_fd = in_fence_fd;
>  
>   return ret;
>  }
> diff --git a/include/uapi/drm/virtgpu_drm.h b/include/uapi/drm/virtgpu_drm.h
> index 0512fde5e697..ac8d1eed12ab 100644
> --- a/include/uapi/drm/virtgpu_drm.h
> +++ b/include/uapi/drm/virtgpu_drm.h
> @@ -64,6 +64,9 @@ struct drm_virtgpu_map {
>   __u32 pad;
>  };
>  
> +/* For ioctl() returning ERESTARTSYS or EINTR, fence_fd is unmodified.
> + * For all other errors it is set to -1.
> + */
>  struct drm_virtgpu_execbuffer {
>   __u32 flags;
>   __u32 size;

Reviewed-by: Dmitry Osipenko 

-- 
Best regards,
Dmitry
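
The failure mode described in the commit message can be reproduced with a
small userspace model. This is a simplified sketch, not the virtio-gpu code:
struct fake_execbuf, fake_execbuffer() and submit_with_retry() are
hypothetical, out fences are ignored, and -EINVAL stands in for the
sync_file_get_fence() failure.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the ioctl argument; only fence_fd is modelled. */
struct fake_execbuf {
	int32_t fence_fd;
};

/*
 * Model of the ioctl. It is interrupted while *pending_signals > 0.
 * preserve_fd selects the patched behaviour (restore fence_fd on -EINTR).
 */
static int fake_execbuffer(struct fake_execbuf *eb, int *pending_signals,
			   int preserve_fd)
{
	int in_fd = eb->fence_fd;

	eb->fence_fd = -1;		/* reset, as the driver does */
	if (in_fd < 0)
		return -EINVAL;		/* models the fence lookup failing */

	if (*pending_signals > 0) {
		(*pending_signals)--;
		if (preserve_fd)
			eb->fence_fd = in_fd;	/* patched: keep the input fd */
		return -EINTR;
	}

	return 0;
}

/* Typical userspace loop: retry the call for as long as it is interrupted. */
static int submit_with_retry(struct fake_execbuf *eb, int pending_signals,
			     int preserve_fd)
{
	int ret;

	do {
		ret = fake_execbuffer(eb, &pending_signals, preserve_fd);
	} while (ret == -EINTR);

	return ret;
}
```

With preserve_fd set to 0 the retry sees fence_fd == -1 and fails; with it set
to 1 the retry succeeds, which is the behaviour the patch documents in the
uapi header.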

Re: [PATCH] drm/virtio: exbuf->fence_fd unmodified on interrupted wait

2023-02-01 Thread Dmitry Osipenko

On 2/2/23 05:17, Dmitry Osipenko wrote:
> On 2/1/23 18:48, Rob Clark wrote:
>> On Wed, Feb 1, 2023 at 5:28 AM Dmitry Osipenko
>>  wrote:
>>>
>>> On 1/27/23 01:58, Ryan Neph wrote:
>>>> An interrupted dma_fence_wait() becomes an -ERESTARTSYS returned
>>>> to userspace ioctl(DRM_IOCTL_VIRTGPU_EXECBUFFER) calls, prompting to
>>>> retry the ioctl(), but the passed exbuf->fence_fd has been reset to -1,
>>>> making the retry attempt fail at sync_file_get_fence().
>>>>
>>>> The uapi for DRM_IOCTL_VIRTGPU_EXECBUFFER is changed to retain the
>>>> passed value for exbuf->fence_fd when returning ERESTARTSYS or EINTR.
>>>>
>>>> Fixes: 2cd7b6f08bc4 ("drm/virtio: add in/out fence support for explicit
>>>> synchronization")
>>>> Signed-off-by: Ryan Neph
>>>> ---
>>>>
>>>>  drivers/gpu/drm/virtio/virtgpu_ioctl.c | 9 ++---
>>>>  include/uapi/drm/virtgpu_drm.h | 3 +++
>>>>  2 files changed, 9 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c
>>>> b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
>>>> index 9f4a90493aea..ffce4e2a409a 100644
>>>> --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c
>>>> +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
>>>> @@ -132,6 +132,8 @@ static int virtio_gpu_execbuffer_ioctl(struct
>>>> drm_device *dev, void *data,
>>>>   uint64_t fence_ctx;
>>>>   uint32_t ring_idx;
>>>>
>>>> + exbuf->fence_fd = -1;
>>>> +
>>>>   fence_ctx = vgdev->fence_drv.context;
>>>>   ring_idx = 0;
>>>>
>>>> @@ -152,8 +154,6 @@ static int virtio_gpu_execbuffer_ioctl(struct
>>>> drm_device *dev, void *data,
>>>>   ring_idx = exbuf->ring_idx;
>>>>   }
>>>>
>>>> - exbuf->fence_fd = -1;
>>>
>>> Is there any userspace relying on this -1 behaviour? Wouldn't be better
>>> to remove this offending assignment?
>>
>> Looking at current mesa, removing the assignment should be ok (and
>> more consistent with other drivers).  But I can't say if this was
>> always true, or that there aren't other non-mesa users, so I can see
>> the argument for the more conservative uabi change that this patch
>> went with.
> 
> Realistically, Mesa is the only user of this IOCTL. In general, in such
> a case of doubt, I'll do the UABI change and then wait for complaints.
> If there is a complaint, then the change is reverted. It would also be
> good to learn about the existence of other users :)
> 
> Given that -1 already wasn't consistently set for all error code paths,
> it's tempting to see it removed.
> 
> The code change of this patch is trivial, hence it should be fine to keep
> the -1 if you prefer that, but the patch won't apply cleanly to the stable
> kernels because of the "exbuf->fence_fd = -1" movement. If stable
> maintainers won't put effort into rebasing the patch, then it's better to
> do the removal and live with cleaner driver code, IMO.

Although there will be a merge conflict either way. I'll give the r-b;
still, removing the -1 feels more attractive to me.

-- 
Best regards,
Dmitry

Re: [PATCH] drm/virtio: exbuf->fence_fd unmodified on interrupted wait

2023-02-01 Thread Dmitry Osipenko

On 2/1/23 18:48, Rob Clark wrote:
> On Wed, Feb 1, 2023 at 5:28 AM Dmitry Osipenko
>  wrote:
>>
>> On 1/27/23 01:58, Ryan Neph wrote:
>>> An interrupted dma_fence_wait() becomes an -ERESTARTSYS returned
>>> to userspace ioctl(DRM_IOCTL_VIRTGPU_EXECBUFFER) calls, prompting to
>>> retry the ioctl(), but the passed exbuf->fence_fd has been reset to -1,
>>> making the retry attempt fail at sync_file_get_fence().
>>>
>>> The uapi for DRM_IOCTL_VIRTGPU_EXECBUFFER is changed to retain the
>>> passed value for exbuf->fence_fd when returning ERESTARTSYS or EINTR.
>>>
>>> Fixes: 2cd7b6f08bc4 ("drm/virtio: add in/out fence support for explicit 
>>> synchronization")
>>> Signed-off-by: Ryan Neph 
>>> ---
>>>
>>>  drivers/gpu/drm/virtio/virtgpu_ioctl.c | 9 ++---
>>>  include/uapi/drm/virtgpu_drm.h | 3 +++
>>>  2 files changed, 9 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c 
>>> b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
>>> index 9f4a90493aea..ffce4e2a409a 100644
>>> --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c
>>> +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
>>> @@ -132,6 +132,8 @@ static int virtio_gpu_execbuffer_ioctl(struct 
>>> drm_device *dev, void *data,
>>>   uint64_t fence_ctx;
>>>   uint32_t ring_idx;
>>>
>>> + exbuf->fence_fd = -1;
>>> +
>>>   fence_ctx = vgdev->fence_drv.context;
>>>   ring_idx = 0;
>>>
>>> @@ -152,8 +154,6 @@ static int virtio_gpu_execbuffer_ioctl(struct 
>>> drm_device *dev, void *data,
>>>   ring_idx = exbuf->ring_idx;
>>>   }
>>>
>>> - exbuf->fence_fd = -1;
>>
>> Is there any userspace relying on this -1 behaviour? Wouldn't be better
>> to remove this offending assignment?
> 
> Looking at current mesa, removing the assignment should be ok (and
> more consistent with other drivers).  But I can't say if this was
> always true, or that there aren't other non-mesa users, so I can see
> the argument for the more conservative uabi change that this patch
> went with.

Realistically, Mesa is the only user of this IOCTL. In general, in such
a case of doubt, I'll do the UABI change and then wait for complaints.
If there is a complaint, then the change is reverted. It would also be
good to learn about the existence of other users :)

Given that -1 already wasn't consistently set for all error code paths,
it's tempting to see it removed.

The code change of this patch is trivial, hence it should be fine to keep
the -1 if you prefer that, but the patch won't apply cleanly to the stable
kernels because of the "exbuf->fence_fd = -1" movement. If stable
maintainers won't put effort into rebasing the patch, then it's better to
do the removal and live with cleaner driver code, IMO.

-- 
Best regards,
Dmitry

Re: [RFC PATCH v3 0/3] Support for Solid Fill Planes

2023-02-01 Thread Jessica Zhang





On 1/31/2023 4:49 AM, Pekka Paalanen wrote:
> On Tue, 31 Jan 2023 11:21:18 +
> Simon Ser wrote:
>
>> On Tuesday, January 31st, 2023 at 12:13, Pekka Paalanen wrote:
>>
>>> On Tue, 31 Jan 2023 10:06:39 +
>>> Simon Ser wrote:
>>>
>>>> On Tuesday, January 31st, 2023 at 10:25, Pekka Paalanen wrote:
>>>>
>>>>> indeed, what about simply using a 1x1 framebuffer for real? Why was that
>>>>> approach rejected?
>>>>
>>>> Ideally we don't want to allocate any GPU memory for the solid-fill
>>>> stuff. And if we special-case 1x1 FB creation to not be backed by real
>>>> GPU memory then we hit several situations where user-space expects a
>>>> real FB but there isn't: for instance, GETFB2 converts from FB object
>>>> ID to GEM handles. Even if we make GETFB2 fail and accept that this
>>>> breaks user-space, then there is no way for user-space to recover the
>>>> FB color for flicker-free transitions and such.
>>>>
>>>> This is all purely from a uAPI PoV, completely ignoring the potential
>>>> issues with the internal kernel abstractions which might not be suitable
>>>> for this either.
>>>
>>> I mean a real 1x1 buffer: a dumb buffer.
>>>
>>> It would be absolutely compatible with anything existing, because it is
>>> a real FB. As a dumb buffer it would be trivial to write into and read
>>> out. As 1x1 it would be tiny (one page?). Even if something needs to
>>> raw-access uncached memory over 33 MHz PCI bus or whatever the worst
>>> case is, it's just one pixel, so it's fast enough, right? And it only
>>> needs to be read once when set, like USB display drivers do. The driver
>>> does not need to manually apply any color operations, because none are
>>> supported in this special case.
>>>
>>> One can put all these limitations and even pixel format in the plane
>>> property that tells userspace that a 1x1 FB works here.
>>>
>>> To recap, the other alternatives under discussion I see right now are:
>>>
>>> - this proposal of dedicated fill color property
>>> - stuffing something new into FB_ID property
>>>
>>> There is also the question of other kinds of plane content sources like
>>> live camera feeds where userspace won't be shovelling each frame
>>> individually like we do now.
>>>
>>> 1x1 dumb buffer is not as small and lean as a dedicated fill color
>>> property, but the UAPI design questions seem to be much less. What's
>>> the best trade-off and for whom?
>>
>> By "real memory" yes I mean the 1 page.
>>
>> Using a real buffer also brings back other discussions, e.g. the one about
>> which pixel formats to accept.
>
> Yeah, which is why I wrote: "One can put all these limitations and even
> pixel format in the plane property". It doesn't even need to be a
> variable in the UAPI, it can be hardcoded in the UAPI doc.
>
> Please, do not understand this as me strongly advocating for the real FB
> approach! I just don't want that option to be misunderstood.
>
> I don't really care which design is chosen, but I do care about
> documenting why other designs were rejected. If the rejection reasons
> were false, they should be revised, even if the decision does not
> change.

Hi Pekka/Daniel,

Looks like the general sentiment is to keep solid fill as a separate
property, so I will stick with that implementation for v4.

I can document the reason why we chose this approach over 1x1 FB in the
cover letter, but to summarize here:

Allocating an FB for solid_fill brings in unnecessary overhead (e.g.
having to allocate memory for the FB). In addition, since memory fetch
is disabled when solid fill is enabled, having a separate property that
doesn't do any memory allocation for solid fill better reflects the
behavior of this feature within the driver.

We also wanted to avoid having FB_ID accept a property blob as it would
involve loosening some drm_property checks, which could cause issues
with other property ioctls.

Also, re: other plane sources -- FWIW, I have tried implementing a
source enum as Ville suggested, but ultimately dropped the change as it
would require userspace to set properties in a specific order (i.e. to
enable solid_fill, userspace would have to first set FB_ID to NULL then
set SOLID_FILL).

I'm not sure how much of a can of worms that would be for userspace, but
if you're fine with having that as a requirement then I can re-add the code.

Thanks,

Jessica Zhang




> Thanks,
> pq

Re: [PATCH v10 00/23] drm/i915/vm_bind: Add VM_BIND functionality

2023-02-01 Thread Zanoni, Paulo R

On Tue, 2023-01-17 at 23:15 -0800, Niranjana Vishwanathapura wrote:
> DRM_I915_GEM_VM_BIND/UNBIND ioctls allows UMD to bind/unbind GEM
> buffer objects (BOs) or sections of a BOs at specified GPU virtual
> addresses on a specified address space (VM). Multiple mappings can map
> to the same physical pages of an object (aliasing). These mappings (also
> referred to as persistent mappings) will be persistent across multiple
> GPU submissions (execbuf calls) issued by the UMD, without user having
> to provide a list of all required mappings during each submission (as
> required by older execbuf mode).
> 
> This patch series support VM_BIND version 1, as described by the param
> I915_PARAM_VM_BIND_VERSION.
> 
> Add new execbuf3 ioctl (I915_GEM_EXECBUFFER3) which only works in
> vm_bind mode. The vm_bind mode only works with this new execbuf3 ioctl.
> The new execbuf3 ioctl will not have any execlist support and all the
> legacy support like relocations etc., are removed.
> 
> NOTEs:
> * It is based on below VM_BIND design+uapi rfc.
>   Documentation/gpu/rfc/i915_vm_bind.rst
> 
> * The IGT RFC series is posted as,
>   [PATCH i-g-t v10 0/19] vm_bind: Add VM_BIND validation support

FYI, I created a Draft MR for the Mesa implementation:

https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21057

A Draft MR should be easier to track than simply a branch on a personal
tree. Feel free to put this link in the next cover letters for v11 and
above.

> 
> v2: Address various review comments
> v3: Address review comments and other fixes
> v4: Remove vm_unbind out fence uapi which is not supported yet,
> replace vm->vm_bind_mode check with i915_gem_vm_is_vm_bind_mode()
> v5: Render kernel-doc, use PIN_NOEVICT, limit vm_bind support to
> non-recoverable faults
> v6: Rebased, minor fixes, add reserved fields to drm_i915_gem_vm_bind,
> add new patch for async vm_unbind support
> v7: Rebased, minor cleanups as per review feedback
> v8: Rebased, add capture support
> v9: Address capture support feedback from v8
> v10: Properly handle vma->resource for mappings with capture request
> 
> Test-with: 20230118071350.17498-1-niranjana.vishwanathap...@intel.com
> 
> Signed-off-by: Niranjana Vishwanathapura 
> 
> Niranjana Vishwanathapura (23):
>   drm/i915/vm_bind: Expose vm lookup function
>   drm/i915/vm_bind: Add __i915_sw_fence_await_reservation()
>   drm/i915/vm_bind: Expose i915_gem_object_max_page_size()
>   drm/i915/vm_bind: Support partially mapped vma resource
>   drm/i915/vm_bind: Add support to create persistent vma
>   drm/i915/vm_bind: Implement bind and unbind of object
>   drm/i915/vm_bind: Support for VM private BOs
>   drm/i915/vm_bind: Add support to handle object evictions
>   drm/i915/vm_bind: Support persistent vma activeness tracking
>   drm/i915/vm_bind: Add out fence support
>   drm/i915/vm_bind: Abstract out common execbuf functions
>   drm/i915/vm_bind: Use common execbuf functions in execbuf path
>   drm/i915/vm_bind: Implement I915_GEM_EXECBUFFER3 ioctl
>   drm/i915/vm_bind: Update i915_vma_verify_bind_complete()
>   drm/i915/vm_bind: Expose i915_request_await_bind()
>   drm/i915/vm_bind: Handle persistent vmas in execbuf3
>   drm/i915/vm_bind: userptr dma-resv changes
>   drm/i915/vm_bind: Limit vm_bind mode to non-recoverable contexts
>   drm/i915/vm_bind: Add uapi for user to enable vm_bind_mode
>   drm/i915/vm_bind: Render VM_BIND documentation
>   drm/i915/vm_bind: Async vm_unbind support
>   drm/i915/vm_bind: Properly build persistent map sg table
>   drm/i915/vm_bind: Support capture of persistent mappings
> 
>  Documentation/gpu/i915.rst|  78 +-
>  drivers/gpu/drm/i915/Makefile |   3 +
>  drivers/gpu/drm/i915/gem/i915_gem_context.c   |  43 +-
>  drivers/gpu/drm/i915/gem/i915_gem_context.h   |  17 +
>  drivers/gpu/drm/i915/gem/i915_gem_create.c|  72 +-
>  drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c|   6 +
>  .../gpu/drm/i915/gem/i915_gem_execbuffer.c| 522 +--
>  .../gpu/drm/i915/gem/i915_gem_execbuffer3.c   | 872 ++
>  .../drm/i915/gem/i915_gem_execbuffer_common.c | 671 ++
>  .../drm/i915/gem/i915_gem_execbuffer_common.h |  76 ++
>  drivers/gpu/drm/i915/gem/i915_gem_ioctls.h|   2 +
>  drivers/gpu/drm/i915/gem/i915_gem_object.c|   3 +
>  drivers/gpu/drm/i915/gem/i915_gem_object.h|   2 +
>  .../gpu/drm/i915/gem/i915_gem_object_types.h  |   6 +
>  drivers/gpu/drm/i915/gem/i915_gem_userptr.c   |  19 +
>  drivers/gpu/drm/i915/gem/i915_gem_vm_bind.h   |  30 +
>  .../drm/i915/gem/i915_gem_vm_bind_object.c| 463 ++
>  drivers/gpu/drm/i915/gt/intel_gtt.c   |  22 +
>  drivers/gpu/drm/i915/gt/intel_gtt.h   |  28 +
>  drivers/gpu/drm/i915/i915_driver.c|   4 +
>  drivers/gpu/drm/i915/i915_drv.h   |   2 +
>  drivers/gpu/drm/i915/i915_gem.c   |  14 +-
>  drivers/gpu/drm/i915/i915_gem_gtt.c   |  39 +
>

RE: [PATCH v2] drm/amdgpu/fence: Fix oops due to non-matching drm_sched init/fini

2023-02-01 Thread Chen, Guchun

Reviewed-by: Guchun Chen 

Regards,
Guchun

-Original Message-
From: Guilherme G. Piccoli  
Sent: Thursday, February 2, 2023 12:48 AM
To: amd-...@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org; Deucher, Alexander 
; Koenig, Christian ; Pan, 
Xinhui ; ker...@gpiccoli.net; kernel-...@igalia.com; 
Guilherme G. Piccoli ; Chen, Guchun ; 
Tuikov, Luben ; Limonciello, Mario 

Subject: [PATCH v2] drm/amdgpu/fence: Fix oops due to non-matching drm_sched 
init/fini

Currently amdgpu calls drm_sched_fini() from the fence driver sw fini routine - 
such function is expected to be called only after the respective init function 
- drm_sched_init() - was executed successfully.

It happens that we faced a driver probe failure on the Steam Deck recently, and
the function drm_sched_fini() was called even though its counterpart had not
been called previously, causing the following oops:

amdgpu: probe of :04:00.0 failed with error -110
BUG: kernel NULL pointer dereference, address: 0090
PGD 0 P4D 0
Oops: 0002 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 609 Comm: systemd-udevd Not tainted 6.2.0-rc3-gpiccoli #338 
Hardware name: Valve Jupiter/Jupiter, BIOS F7A0113 11/04/2022
RIP: 0010:drm_sched_fini+0x84/0xa0 [gpu_sched]
[...]
Call Trace:
 
 amdgpu_fence_driver_sw_fini+0xc8/0xd0 [amdgpu]
 amdgpu_device_fini_sw+0x2b/0x3b0 [amdgpu]
 amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
 devm_drm_dev_init_release+0x49/0x70
 [...]

To prevent that, check if the drm_sched was properly initialized for a given 
ring before calling its fini counter-part.

Ideally we'd use sched.ready for that, since this field is set as the last 
step of drm_sched_init(). But amdgpu seems to "override" the meaning of this 
field: in the above oops, for example, it was a GFX ring causing the crash, and 
the sched.ready field was set to true in the ring init routine, regardless of 
the state of the DRM scheduler. Hence, we ended up using sched.ops, as per 
Christian's suggestion [0].

[0] 
https://lore.kernel.org/amd-gfx/984ee981-2906-0eaf-ccec-9f80975cb...@amd.com/

Fixes: 067f44c8b459 ("drm/amdgpu: avoid over-handle of fence driver fini in s3 
test (v2)")
Suggested-by: Christian König 
Cc: Guchun Chen 
Cc: Luben Tuikov 
Cc: Mario Limonciello 
Signed-off-by: Guilherme G. Piccoli 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index 00444203220d..3b962cb680a6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -618,7 +618,13 @@ void amdgpu_fence_driver_sw_fini(struct amdgpu_device 
*adev)
if (!ring || !ring->fence_drv.initialized)
continue;
 
-   if (!ring->no_scheduler)
+   /*
+* Notice we check for sched.ops since there's some
+* override on the meaning of sched.ready by amdgpu.
+* The natural check would be sched.ready, which is
+* set as drm_sched_init() finishes...
+*/
+   if (!ring->no_scheduler && ring->sched.ops)
drm_sched_fini(&ring->sched);
 
for (j = 0; j <= ring->fence_drv.num_fences_mask; ++j)
--
2.39.0
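
The guard described in the commit message above can be sketched in plain C.
This is only an illustration under stated assumptions: every demo_* name is
hypothetical and stands in for the real DRM scheduler and amdgpu ring types,
which are not reproduced here. The point it shows is that teardown is keyed on
a field that only a successful init sets (ops), not on a flag the driver may
set on its own (ready).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real DRM/amdgpu structures. */
struct demo_sched_ops { int unused; };

struct demo_sched {
    const struct demo_sched_ops *ops; /* non-NULL only after a successful init */
    bool ready;                       /* may be set by the driver independently */
    int fini_calls;                   /* counts teardown calls, for the demo only */
};

struct demo_ring {
    bool no_scheduler;
    struct demo_sched sched;
};

static const struct demo_sched_ops demo_ops = { 0 };

/* Models a successful scheduler init. */
static void demo_sched_init(struct demo_sched *s)
{
    s->ops = &demo_ops;
    s->ready = true;
}

/* Models the teardown that must not run without a prior init. */
static void demo_sched_fini(struct demo_sched *s)
{
    s->fini_calls++;
}

/* Tear down only what was actually set up, keyed on ops rather than ready. */
static void demo_ring_fini(struct demo_ring *r)
{
    if (!r->no_scheduler && r->sched.ops)
        demo_sched_fini(&r->sched);
}

/* Usage: a ring whose init never ran is skipped, an initialized one is not. */
static void demo_fini_guard_selfcheck(void)
{
    struct demo_ring failed = { 0 };
    struct demo_ring ok = { 0 };

    failed.sched.ready = true;          /* flag set without any init */
    demo_ring_fini(&failed);
    assert(failed.sched.fini_calls == 0);

    demo_sched_init(&ok.sched);
    demo_ring_fini(&ok);
    assert(ok.sched.fini_calls == 1);
}
```

A check on ready alone would have called the teardown for the first ring in
this sketch, which mirrors the failure mode the patch describes.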

Re: [PATCH] gpu: host1x: fix uninitialized variable use

2023-02-01 Thread Nick Desaulniers

On Fri, Jan 27, 2023 at 11:14:00PM +0100, Arnd Bergmann wrote:
> From: Arnd Bergmann 
> 
> The error handling for platform_get_irq() failing no longer
> works after a recent change, clang now points this out with
> a warning:
> 
> drivers/gpu/host1x/dev.c:520:6: error: variable 'syncpt_irq' is uninitialized 
> when used here [-Werror,-Wuninitialized]
> if (syncpt_irq < 0)
> ^~
> 
> Fix this by removing the variable and checking the correct
> error status.
> 
> Fixes: 625d4ffb438c ("gpu: host1x: Rewrite syncpoint interrupt handling")
> Signed-off-by: Arnd Bergmann 

Thanks Arnd, I saw some reports from kernelci about this, too.
https://lore.kernel.org/linux-next/?q=warning%3A+variable+%27syncpt_irq%27+is+uninitialized+when+used+here

Reported-by: "kernelci.org bot" 
Reviewed-by: Nick Desaulniers 

> ---
>  drivers/gpu/host1x/dev.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c
> index 4872d183d860..aae2efeef503 100644
> --- a/drivers/gpu/host1x/dev.c
> +++ b/drivers/gpu/host1x/dev.c
> @@ -487,7 +487,6 @@ static int host1x_get_resets(struct host1x *host)
>  static int host1x_probe(struct platform_device *pdev)
>  {
>   struct host1x *host;
> - int syncpt_irq;
>   int err;
>  
>   host = devm_kzalloc(&pdev->dev, sizeof(*host), GFP_KERNEL);
> @@ -517,8 +516,8 @@ static int host1x_probe(struct platform_device *pdev)
>   }
>  
>   host->syncpt_irq = platform_get_irq(pdev, 0);
> - if (syncpt_irq < 0)
> - return syncpt_irq;
> + if (host->syncpt_irq < 0)
> + return host->syncpt_irq;
>  
>   mutex_init(&host->devices_lock);
>   INIT_LIST_HEAD(&host->devices);
> -- 
> 2.39.0
> 
>

Re: [REGRESSION] [Bug 216859] New: PCI bridge to bus boot hang at enumeration

2023-02-01 Thread Bjorn Helgaas

[+cc Geert]

On Thu, Jan 26, 2023 at 06:11:24AM -0600, Bjorn Helgaas wrote:
> On Thu, Jan 12, 2023 at 02:08:19PM -0600, Bjorn Helgaas wrote:
> > On Wed, Dec 28, 2022 at 06:02:48AM -0600, Bjorn Helgaas wrote:
> > > On Wed, Dec 28, 2022 at 08:37:52AM +, bugzilla-dae...@kernel.org 
> > > wrote:
> > > > https://bugzilla.kernel.org/show_bug.cgi?id=216859
> > > 
> > > >Summary: PCI bridge to bus boot hang at enumeration
> > > > Kernel Version: 6.1-rc1
> > > > ...
> > > 
> > > > With Kernel 6.1-rc1 the enumeration process stopped working for me,
> > > > see attachments.
> > > > 
> > > > The enumeration works fine with Kernel 6.0 and below.
> > > > 
> > > > Same problem still exists with v6.1. and v6.2.-rc1
> 
> This is a regression between v6.0 and v6.1-rc1.  Console output during
> boot freezes after nvidiafb deactivates the VGA console.
> 
> It was a lot of work for Zeno, but we finally isolated this console
> hang to 145eed48de27 ("fbdev: Remove conflicting devices on PCI bus").
> 
> The system actually does continue to boot and is accessible via ssh, 
> but the console appears hung, at least for output.  More details in
> the bugzilla starting at
> https://bugzilla.kernel.org/show_bug.cgi?id=216859#c47 .

145eed48de27 ("fbdev: Remove conflicting devices on PCI bus") doesn't
say what the benefit is, or what would break if we reverted it.

Does anybody have any clues?  It would be nice to resolve this
regression before v6.2, which will probably be released 2/12 or 2/19.

Bjorn

[PATCH 4/4] drm/i915/selftest: Use forcewake to sanity check engine wa lists

2023-02-01 Thread Matt Roper

Although register information in the bspec includes a field that is
supposed to reflect a register's reset characteristics (i.e., whether a
register maintains its value through engine resets), it's been
discovered that this information is incorrect for some register ranges
(i.e., registers that are not affected by engine resets are tagged in a
way that indicates they would be).

We can sanity check workaround registers placed on the RCS/CCS engine
workaround lists (including those placed there via the
general_render_compute_wa_init() function) by comparing against the
forcewake table.  As far as we know, there's never a case where a
register that lives outside the RENDER powerwell will be reset by an
RCS/CCS engine reset.

Signed-off-by: Matt Roper 
---
 .../gpu/drm/i915/gt/selftest_workarounds.c| 52 +++
 1 file changed, 52 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/selftest_workarounds.c 
b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
index 14a8b25b6204..1bc8febc5c1d 100644
--- a/drivers/gpu/drm/i915/gt/selftest_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/selftest_workarounds.c
@@ -1362,12 +1362,64 @@ live_engine_reset_workarounds(void *arg)
return ret;
 }
 
+/*
+ * The bspec's documentation for register reset behavior can be unreliable for
+ * some MMIO ranges.  But in general we do not expect registers outside the
+ * RENDER forcewake domain to be reset by RCS/CCS engine resets.  If we find
+ * workaround registers on an RCS or CCS engine's list, it likely indicates
+ * the register is misdocumented in the bspec and the workaround implementation
+ * should be moved to the GT workaround list instead.
+ */
+static int
+live_check_engine_workarounds_fw(void *arg)
+{
+   struct intel_gt *gt = arg;
+   struct intel_engine_cs *engine;
+   struct wa_lists *lists;
+   enum intel_engine_id id;
+   int ret = 0;
+
+   lists = kzalloc(sizeof(*lists), GFP_KERNEL);
+   if (!lists)
+   return -ENOMEM;
+
+   reference_lists_init(gt, lists);
+
+   for_each_engine(engine, gt, id) {
+   struct i915_wa_list *wal = &lists->engine[id].wa_list;
+   struct i915_wa *wa;
+   int i;
+
+   if (engine->class != RENDER_CLASS &&
+   engine->class != COMPUTE_CLASS)
+   continue;
+
+   for (i = 0, wa = wal->list; i < wal->count; i++, wa++) {
+   enum forcewake_domains fw;
+
+   fw = intel_uncore_forcewake_for_reg(gt->uncore, wa->reg,
+   FW_REG_READ | 
FW_REG_WRITE);
+   if ((fw & FORCEWAKE_RENDER) == 0) {
+   pr_err("%s: Register %#x not in RENDER 
forcewake domain!\n",
+  engine->name, 
i915_mmio_reg_offset(wa->reg));
+   ret = -EINVAL;
+   }
+   }
+   }
+
+   reference_lists_fini(gt, lists);
+   kfree(lists);
+
+   return ret;
+}
+
 int intel_workarounds_live_selftests(struct drm_i915_private *i915)
 {
static const struct i915_subtest tests[] = {
SUBTEST(live_dirty_whitelist),
SUBTEST(live_reset_whitelist),
SUBTEST(live_isolated_whitelist),
+   SUBTEST(live_check_engine_workarounds_fw),
SUBTEST(live_gpu_reset_workarounds),
SUBTEST(live_engine_reset_workarounds),
};
-- 
2.39.1
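
As a rough model of the check this selftest performs, the sketch below walks a
list of register offsets and flags any entry that does not belong to a
"render" domain. All demo_* names are hypothetical, and the offset-to-domain
mapping is invented purely for illustration; the real driver looks the domain
up in its per-platform forcewake tables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_FW_RENDER (1u << 0)
#define DEMO_FW_GT     (1u << 1)

struct demo_wa {
    uint32_t reg_offset;
};

/*
 * Invented mapping for the demo only: offsets below 0x8000 are treated as
 * RENDER, everything else as GT. Real hardware tables look nothing like this.
 */
static unsigned int demo_fw_domains_for_reg(uint32_t offset)
{
    return offset < 0x8000 ? DEMO_FW_RENDER : DEMO_FW_GT;
}

/*
 * Returns 0 when every entry is in the RENDER domain and -1 otherwise.
 * Like the selftest, it keeps scanning after the first mismatch so that
 * all offenders could be reported in one pass.
 */
static int demo_check_engine_list(const struct demo_wa *list, size_t count)
{
    int ret = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        unsigned int fw = demo_fw_domains_for_reg(list[i].reg_offset);

        if ((fw & DEMO_FW_RENDER) == 0)
            ret = -1;
    }

    return ret;
}

/* Usage: one clean list and one list containing a non-RENDER register. */
static void demo_fw_check_selfcheck(void)
{
    const struct demo_wa good[] = { { 0x2000 }, { 0x7ffc } };
    const struct demo_wa bad[] = { { 0x2000 }, { 0xb01c } };

    assert(demo_check_engine_list(good, 2) == 0);
    assert(demo_check_engine_list(bad, 2) == -1);
}
```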

[PATCH 2/4] drm/i915/gen11: Wa_1408615072/Wa_1407596294 should be on GT list

2023-02-01 Thread Matt Roper

The UNSLICE_UNIT_LEVEL_CLKGATE register programmed by this workaround
has 'BUS' style reset, indicating that it does not lose its value on
engine resets.  Furthermore, this register is part of the GT forcewake
domain rather than the RENDER domain, so it should not be impacted by
RCS engine resets.  As such, we should implement this on the GT
workaround list rather than an engine list.

Bspec: 19219
Fixes: 3551ff928744 ("drm/i915/gen11: Moving WAs to rcs_engine_wa_init()")
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c 
b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index f45ca3d4a07c..7e93ba6b3208 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -1405,6 +1405,13 @@ icl_gt_workarounds_init(struct intel_gt *gt, struct 
i915_wa_list *wal)
GAMT_CHKN_BIT_REG,
GAMT_CHKN_DISABLE_L3_COH_PIPE);
 
+   /*
+* Wa_1408615072:icl,ehl  (vsunit)
+* Wa_1407596294:icl,ehl  (hsunit)
+*/
+   wa_write_or(wal, UNSLICE_UNIT_LEVEL_CLKGATE,
+   VSUNIT_CLKGATE_DIS | HSUNIT_CLKGATE_DIS);
+
/* Wa_1407352427:icl,ehl */
wa_write_or(wal, UNSLICE_UNIT_LEVEL_CLKGATE2,
PSDUNIT_CLKGATE_DIS);
@@ -2536,13 +2543,6 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, 
struct i915_wa_list *wal)
wa_masked_en(wal, GEN9_CSFE_CHICKEN1_RCS,
 GEN11_ENABLE_32_PLANE_MODE);
 
-   /*
-* Wa_1408615072:icl,ehl  (vsunit)
-* Wa_1407596294:icl,ehl  (hsunit)
-*/
-   wa_write_or(wal, UNSLICE_UNIT_LEVEL_CLKGATE,
-   VSUNIT_CLKGATE_DIS | HSUNIT_CLKGATE_DIS);
-
/*
 * Wa_1408767742:icl[a2..forever],ehl[all]
 * Wa_1605460711:icl[a0..c0]
-- 
2.39.1

[PATCH 3/4] drm/i915/xehp: LNCF/LBCF workarounds should be on the GT list

2023-02-01 Thread Matt Roper

Although registers in the L3 bank/node configuration ranges are marked
as having "DEV" reset characteristics in the bspec, this appears to be a
hold-over from pre-Xe_HP platforms.  In reality, these registers
maintain their values across engine resets, meaning that workarounds
and tuning settings targeting them should be placed on the GT
workaround list rather than an engine workaround list.

Note that an extra clue here is that these registers moved from the
RENDER forcewake domain to the GT forcewake domain in Xe_HP; generally
RCS/CCS engine resets should not lead to the reset of a register that
lives outside the RENDER domain.

Re-applying these registers on engine resets wouldn't actually hurt
anything, but is unnecessary and just makes it more confusing to anyone
trying to decipher how these registers really work.

Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 61 +
 1 file changed, 38 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c 
b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 7e93ba6b3208..09c9837458b5 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -1499,6 +1499,12 @@ xehpsdv_gt_workarounds_init(struct intel_gt *gt, struct 
i915_wa_list *wal)
/* Wa_1409757795:xehpsdv */
wa_mcr_write_or(wal, SCCGCTL94DC, CG3DDISURB);
 
+   /* Wa_18011725039:xehpsdv */
+   if (IS_XEHPSDV_GRAPHICS_STEP(i915, STEP_A1, STEP_B0)) {
+   wa_mcr_masked_dis(wal, MLTICTXCTL, TDONRENDER);
+   wa_mcr_write_or(wal, L3SQCREG1_CCS0, FLUSHALLNONCOH);
+   }
+
/* Wa_16011155590:xehpsdv */
if (IS_XEHPSDV_GRAPHICS_STEP(i915, STEP_A0, STEP_B0))
wa_write_or(wal, UNSLICE_UNIT_LEVEL_CLKGATE,
@@ -1548,6 +1554,9 @@ xehpsdv_gt_workarounds_init(struct intel_gt *gt, struct 
i915_wa_list *wal)
/* Wa_14014368820:xehpsdv */
wa_mcr_write_or(wal, XEHP_GAMCNTRL_CTRL,
INVALIDATION_BROADCAST_MODE_DIS | 
GLOBAL_INVALIDATION_MODE);
+
+   /* Wa_14010670810:xehpsdv */
+   wa_mcr_write_or(wal, XEHP_L3NODEARBCFG, XEHP_LNESPARE);
 }
 
 static void
@@ -1684,6 +1693,9 @@ pvc_gt_workarounds_init(struct intel_gt *gt, struct 
i915_wa_list *wal)
wa_mcr_write_or(wal, COMP_MOD_CTRL, FORCE_MISS_FTLB);
wa_mcr_write_or(wal, XEHP_VDBX_MOD_CTRL, FORCE_MISS_FTLB);
wa_mcr_write_or(wal, XEHP_VEBX_MOD_CTRL, FORCE_MISS_FTLB);
+
+   /* Wa_16016694945 */
+   wa_mcr_masked_en(wal, XEHPC_LNCFMISCCFGREG0, XEHPC_OVRLSCCC);
 }
 
 static void
@@ -1724,11 +1736,36 @@ xelpmp_gt_workarounds_init(struct intel_gt *gt, struct 
i915_wa_list *wal)
debug_dump_steering(gt);
 }
 
+/*
+ * The bspec performance guide has recommended MMIO tuning settings.  These
+ * aren't truly "workarounds" but we want to program them through the
+ * workaround infrastructure to make sure they're (re)applied at the proper
+ * times.
+ *
+ * The settings in this function are for settings that persist through
+ * engine resets and also are not part of any engine's register state context.
+ * I.e., settings that only need to be re-applied in the event of a full GT
+ * reset.
+ */
+static void gt_tuning_settings(struct intel_gt *gt, struct i915_wa_list *wal)
+{
+   if (IS_PONTEVECCHIO(gt->i915)) {
+   wa_mcr_write(wal, XEHPC_L3SCRUB,
+SCRUB_CL_DWNGRADE_SHARED | SCRUB_RATE_4B_PER_CLK);
+   wa_mcr_masked_en(wal, XEHPC_LNCFMISCCFGREG0, XEHPC_HOSTCACHEEN);
+   }
+
+   if (IS_DG2(gt->i915))
+   wa_mcr_write_or(wal, XEHP_L3SCQREG7, 
BLEND_FILL_CACHING_OPT_DIS);
+}
+
 static void
 gt_init_workarounds(struct intel_gt *gt, struct i915_wa_list *wal)
 {
struct drm_i915_private *i915 = gt->i915;
 
+   gt_tuning_settings(gt, wal);
+
if (gt->type == GT_MEDIA) {
if (MEDIA_VER(i915) >= 13)
xelpmp_gt_workarounds_init(gt, wal);
@@ -2897,16 +2934,8 @@ static void
 add_render_compute_tuning_settings(struct drm_i915_private *i915,
   struct i915_wa_list *wal)
 {
-   if (IS_PONTEVECCHIO(i915)) {
-   wa_mcr_write(wal, XEHPC_L3SCRUB,
-SCRUB_CL_DWNGRADE_SHARED | SCRUB_RATE_4B_PER_CLK);
-   wa_mcr_masked_en(wal, XEHPC_LNCFMISCCFGREG0, XEHPC_HOSTCACHEEN);
-   }
-
-   if (IS_DG2(i915)) {
-   wa_mcr_write_or(wal, XEHP_L3SCQREG7, 
BLEND_FILL_CACHING_OPT_DIS);
+   if (IS_DG2(i915))
wa_mcr_write_clr_set(wal, RT_CTRL, STACKID_CTRL, 
STACKID_CTRL_512);
-   }
 
/*
 * This tuning setting proves beneficial only on ATS-M designs; the
@@ -2988,11 +3017,6 @@ general_render_compute_wa_init(struct intel_engine_cs 
*engine, struct i915_wa_li
   0, false);
}
 
-   if (IS_PONTEVECCHIO(i915)) {
-   /*

[PATCH 1/4] drm/i915/pvc: Annotate two more workaround/tuning registers as MCR

2023-02-01 Thread Matt Roper

XEHPC_LNCFMISCCFGREG0 and XEHPC_L3SCRUB are both in MCR register ranges
on PVC (with HALFBSLICE and L3BANK replication respectively), so they
should be explicitly declared as MCR registers and use MCR-aware
workaround handlers.

The workarounds/tuning settings should still be applied properly on PVC
even without the MCR annotation, but readback verification on
CONFIG_DRM_I915_DEBUG_GEM builds could potentially give false positive
"workaround lost on load" warnings on parts fused such that a unicast
read targets a terminated register instance.

Fixes: a9e69428b1b4 ("drm/i915: Define MCR registers explicitly")
Signed-off-by: Matt Roper 
---
 drivers/gpu/drm/i915/gt/intel_gt_regs.h |  4 ++--
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 12 +---
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_gt_regs.h 
b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
index 7fa18a3b3957..928698c621e5 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_regs.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_regs.h
@@ -979,7 +979,7 @@
 #define   GEN7_WA_FOR_GEN7_L3_CONTROL  0x3C47FF8C
 #define   GEN7_L3AGDIS (1 << 19)
 
-#define XEHPC_LNCFMISCCFGREG0  _MMIO(0xb01c)
+#define XEHPC_LNCFMISCCFGREG0  MCR_REG(0xb01c)
 #define   XEHPC_HOSTCACHEEN REG_BIT(1)
 #define   XEHPC_OVRLSCCC   REG_BIT(0)
 
@@ -1042,7 +1042,7 @@
 #define XEHP_L3SCQREG7 MCR_REG(0xb188)
 #define   BLEND_FILL_CACHING_OPT_DIS   REG_BIT(3)
 
-#define XEHPC_L3SCRUB  _MMIO(0xb18c)
+#define XEHPC_L3SCRUB  MCR_REG(0xb18c)
 #define   SCRUB_CL_DWNGRADE_SHARED REG_BIT(12)
 #define   SCRUB_RATE_PER_BANK_MASK REG_GENMASK(2, 0)
 #define   SCRUB_RATE_4B_PER_CLK
REG_FIELD_PREP(SCRUB_RATE_PER_BANK_MASK, 0x6)
diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c 
b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 29718d0595f4..f45ca3d4a07c 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -240,6 +240,12 @@ wa_write(struct i915_wa_list *wal, i915_reg_t reg, u32 set)
wa_write_clr_set(wal, reg, ~0, set);
 }
 
+static void
+wa_mcr_write(struct i915_wa_list *wal, i915_mcr_reg_t reg, u32 set)
+{
+   wa_mcr_write_clr_set(wal, reg, ~0, set);
+}
+
 static void
 wa_write_or(struct i915_wa_list *wal, i915_reg_t reg, u32 set)
 {
@@ -2892,9 +2898,9 @@ add_render_compute_tuning_settings(struct 
drm_i915_private *i915,
   struct i915_wa_list *wal)
 {
if (IS_PONTEVECCHIO(i915)) {
-   wa_write(wal, XEHPC_L3SCRUB,
+   wa_mcr_write(wal, XEHPC_L3SCRUB,
 SCRUB_CL_DWNGRADE_SHARED | SCRUB_RATE_4B_PER_CLK);
-   wa_masked_en(wal, XEHPC_LNCFMISCCFGREG0, XEHPC_HOSTCACHEEN);
+   wa_mcr_masked_en(wal, XEHPC_LNCFMISCCFGREG0, XEHPC_HOSTCACHEEN);
}
 
if (IS_DG2(i915)) {
@@ -2984,7 +2990,7 @@ general_render_compute_wa_init(struct intel_engine_cs 
*engine, struct i915_wa_li
 
if (IS_PONTEVECCHIO(i915)) {
/* Wa_16016694945 */
-   wa_masked_en(wal, XEHPC_LNCFMISCCFGREG0, XEHPC_OVRLSCCC);
+   wa_mcr_masked_en(wal, XEHPC_LNCFMISCCFGREG0, XEHPC_OVRLSCCC);
}
 
if (IS_XEHPSDV(i915)) {
-- 
2.39.1
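
The new wa_mcr_write() helper in the patch above is a thin wrapper that passes
~0 as the clear mask. A small model of that clear-then-set arithmetic follows.
The demo_* helpers are hypothetical and operate on a plain value rather than
on hardware, and the formula (old & ~clr) | set is an assumption about how a
clear/set pair is applied, not something taken from the patch itself.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed semantics: clear the bits in clr, then OR in the bits in set. */
static uint32_t demo_apply_clr_set(uint32_t old, uint32_t clr, uint32_t set)
{
    return (old & ~clr) | set;
}

/* "write": every bit is cleared first, so the result is exactly set. */
static uint32_t demo_apply_write(uint32_t old, uint32_t set)
{
    return demo_apply_clr_set(old, ~UINT32_C(0), set);
}

/* "write_or": nothing is cleared, so previously set bits survive. */
static uint32_t demo_apply_write_or(uint32_t old, uint32_t set)
{
    return demo_apply_clr_set(old, 0, set);
}

/* Usage: the same set value gives different results for the two flavours. */
static void demo_clr_set_selfcheck(void)
{
    assert(demo_apply_write(0xdeadbeefu, 0x1006u) == 0x1006u);
    assert(demo_apply_write_or(0x00f0u, 0x000fu) == 0x00ffu);
    assert(demo_apply_clr_set(0x00ffu, 0x000fu, 0x0001u) == 0x00f1u);
}
```

Under this model a full write discards whatever the register held before,
while a write-or only adds bits.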

Re: [PATCH v12 00/18] drm: Add Samsung MIPI DSIM bridge

2023-02-01 Thread Marek Vasut


On 1/30/23 13:45, Rasmus Villemoes wrote:

On 27/01/2023 12.30, Marek Vasut wrote:

On 1/27/23 12:04, Jagan Teki wrote:



Thanks, but that's exactly what I'm doing, and I don't see any
modification of imx8mp.dtsi in that branch. I'm basically looking for
help to do the equivalent of

    88775338cd58 - arm64: dts: imx8mm: Add MIPI DSI pipeline
    f964f67dd6ee - arm64: dts: imx8mm: Add eLCDIF node support

for imx8mp in order to test those patches on our boards (we have two
variants).


Marek, any help here, thanks.


Try attached patch.


Thanks. I removed the lcdif2 and ldb nodes I had added from Alexander's
patch (94e6197dadc9 in linux-next) in order to apply it. I get a couple
of errors during boot:

   clk: /soc@0/bus@32c0/mipi_dsi@32e6: failed to reparent
media_apb to sys_pll1_266m: -22

and enabling a pr_debug in clk_core_set_parent_nolock() shows that this
is because

   clk_core_set_parent_nolock: clk sys_pll1_266m can not be parent of clk
media_apb

Further, the mipi_dsi fails to probe due to

   /soc@0/bus@32c0/mipi_dsi@32e6: failed to get
'samsung,burst-clock-frequency' property

All other .dtsi files seem to have those samsung,burst-clock-frequency
and samsung,esc-clock-frequency properties, so I suppose those should
also go into the imx8mp.dtsi and are not something that the board .dts
file should supply(?).


No, samsung,esc-clock-frequency (should be some 10-20 MHz, based on 
your panel/bridge) and samsung,burst-clock-frequency (that's the HS 
clock) should go into the board DT, as those are properties of the attached 
panel/bridge.



[There's also some differences between your patch and Alexander's
regarding the lcdif2 and ldb nodes, so while my lvds display still sorta
works, I get

   fsl-ldb 32ec.blk-ctrl:lvds-ldb: Configured LDB clock (29700
Hz) does not match requested LVDS clock: 34650 Hz

and the image is oddly distorted/shifted. But I suppose that's
orthogonal to getting the lcdif1 -> mipi-dsi -> ... pipeline working.]


Alexander is right in the reply below: you have to configure the LVDS 
serializer clock in DT, and it must match the LCDIF2 pixel clock, which is 
also configured in DT; otherwise you won't get the correct LVDS clock.

Re: linux-next: manual merge of the usb tree with the drm-intel-fixes tree

2023-02-01 Thread Rodrigo Vivi

On Wed, Feb 01, 2023 at 10:37:06AM -0800, John Harrison wrote:
> On 2/1/2023 07:31, Rodrigo Vivi wrote:
> > On Wed, Feb 01, 2023 at 03:11:31PM +1100, Stephen Rothwell wrote:
> > > Hi all,
> > > 
> > > On Tue, 31 Jan 2023 10:27:29 -0800 John Harrison 
> > >  wrote:
> > > > On 1/31/2023 04:44, Andy Shevchenko wrote:
> > > > > On Tue, Jan 31, 2023 at 01:03:05PM +1100, Stephen Rothwell wrote:
> > > > > > Today's linux-next merge of the usb tree got a conflict in:
> > > > > > 
> > > > > > drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > > > > > 
> > > > > > between commit:
> > > > > > 
> > > > > > 5bc4b43d5c6c ("drm/i915: Fix up locking around dumping requests 
> > > > > > lists")
> > > > > > 
> > > > > > from the drm-intel-fixes tree and commit:
> > > > > > 
> > > > > > 4d70c74659d9 ("i915: Move list_count() to list.h as 
> > > > > > list_count_nodes() for broader use")
> > > > > > 
> > > > > > from the usb tree.
> > > > > > 
> > > > > > I fixed it up (the former removed the code changed by the latter)
> > > > > Hmm... Currently I see that 
> > > > > 20230127002842.3169194-4-john.c.harri...@intel.com
> > > > > moves the code to the 
> > > > > drivers/gpu/drm/i915/gt/intel_execlists_submission.c.
> > > > > 
> > > > > Is there any new series beside the above mentioned that touches that 
> > > > > file and
> > > > > actually _removes_ that code?
> > > > As long as the removal is limited to list_count/list_count_nodes,
> > > > that's fine. I only moved it from one file to another because the one
> > > > and only function that was using it was being moved to the other
> > > > file. If someone else has found a use for the same and wants to move
> > > > it to a more common place then great. I assume there was no conflict
> > > > happening in the i915 specific code.
> > > I have added this fix up patch to linux-next today (more or less - this
> > > is a hand hacked version, but you get the idea):
> > > 
> > > From: Stephen Rothwell 
> > > Date: Wed, 1 Feb 2023 13:13:01 +1100
> > > Subject: [PATCH] i915: fix up for "drm/i915: Fix up locking around 
> > > dumping requests lists"
> > > 
> > > interacting with "i915: Move list_count() to list.h as list_count_nodes() 
> > > for broader use"
> > > 
> > > Signed-off-by: Stephen Rothwell 
> > > ---
> > >   .../gpu/drm/i915/gt/intel_execlists_submission.c| 15 +
> > >   1 file changed, 2 insertion(+), 13 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
> > > b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > index 3c573d41d404..e919d41a48d9 100644
> > > --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> > > @@ -4150,17 +4150,6 @@ void intel_execlists_show_requests(struct 
> > > intel_engine_cs *engine,
> > >   spin_unlock_irqrestore(&sched_engine->lock, flags);
> > >   }
> > > -static unsigned long list_count(struct list_head *list)
> > > -{
> > > - struct list_head *pos;
> > > - unsigned long count = 0;
> > > -
> > > - list_for_each(pos, list)
> > > - count++;
> > > -
> > > - return count;
> > > -}
> > > -
> > >   void intel_execlists_dump_active_requests(struct intel_engine_cs 
> > > *engine,
> > > struct i915_request *hung_rq,
> > > struct drm_printer *m)
> > > @@ -4172,7 +4161,7 @@ void intel_execlists_dump_active_requests(struct 
> > > intel_engine_cs *engine,
> > >   
> > > intel_engine_dump_active_requests(&engine->sched_engine->requests, 
> > > hung_rq, m);
> > > - drm_printf(m, "\tOn hold?: %lu\n",
> > > -list_count(&engine->sched_engine->hold));
> > > + drm_printf(m, "\tOn hold?: %zu\n",
> > > +list_count_nodes(&engine->sched_engine->hold));
> > something awkward here.
> > The resolution on linux-next should align with the resolution on drm-tip
> > where we have the list_count still there as we preferred the version
> > on drm-intel-gt-next as the resolution of the conflict instead of the
> > fixes one.
> Not following why you want to keep list_count as a local function in the
> i915 driver? Surely the correct fix is to move it to the common header and
> share the code? In which case, the correct name is list_count_nodes() as
> that is what got merged to the common header.

Right. Please ignore my previous email and accept my apologies for the
unnecessary noise. I had just read the commit '4d70c74659d9 ("i915: Move
 list_count() to list.h as list_count_nodes() for broader use")'
and now the final resolution makes total sense. And that patch had
been reviewed and acked by us, so everything is good.

I just confused it with another conflict that we have between our gt-next
and -fixes trees, but with an easier resolution.

Sorry,
Rodrigo.

> 
> John.
> 
> > 
> > > >   spin_unlock_irqrestore(&engine->sched_engine->lock, flags);
> > >   }
> > > -- 
> > > 2.35.1
> > > 
> > > -- 
> > > Cheers,
> > > Stephen Rothwell
> > 
>
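
For readers unfamiliar with the helper being discussed in this thread, here is
a user-space sketch of counting the entries of a circular doubly linked list.
The demo_* names are hypothetical and only modelled on the kernel's list
idiom; this is not the code in list.h. The count is returned as size_t, which
fits the %zu format used in the fix-up quoted above.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list node, modelled on the kernel idiom. */
struct demo_list_node {
    struct demo_list_node *next, *prev;
};

/* An empty list is a head that points at itself in both directions. */
static void demo_list_init(struct demo_list_node *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert node just before head, i.e. at the tail of the list. */
static void demo_list_add_tail(struct demo_list_node *node,
                               struct demo_list_node *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Walk from head until we arrive back at it; the head is not counted. */
static size_t demo_list_count_nodes(const struct demo_list_node *head)
{
    const struct demo_list_node *pos;
    size_t count = 0;

    for (pos = head->next; pos != head; pos = pos->next)
        count++;

    return count;
}

/* Usage: an empty list counts as zero, three appended nodes count as three. */
static void demo_list_count_selfcheck(void)
{
    struct demo_list_node head, a, b, c;

    demo_list_init(&head);
    assert(demo_list_count_nodes(&head) == 0);

    demo_list_add_tail(&a, &head);
    demo_list_add_tail(&b, &head);
    demo_list_add_tail(&c, &head);
    assert(demo_list_count_nodes(&head) == 3);
}
```

The walk is linear in the list length, which is acceptable for a debug dump
such as the "On hold?" line but would be a poor fit for a hot path.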

Re: (subset) [PATCH v2 00/13] spi: Add support for stacked/parallel memories

2023-02-01 Thread Mark Brown

On Fri, 20 Jan 2023 00:23:29 +0530, Amit Kumar Mahapatra wrote:
> This patch is in the continuation to the discussions which happened on
> 'commit f89504300e94 ("spi: Stacked/parallel memories bindings")' for
> adding dt-binding support for stacked/parallel memories.
> 
> This patch series updated the spi-nor, spi core and the spi drivers
> to add stacked and parallel memories support.
> 
> [...]

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-next

Thanks!

[01/13] spi: Add APIs in spi core to set/get spi->chip_select and spi->cs_gpiod
commit: 303feb3cc06ac0665d0ee9c1414941200e60e8a3
[02/13] spi: Replace all spi->chip_select and spi->cs_gpiod references with 
function call
(no commit info)
[03/13] net: Replace all spi->chip_select and spi->cs_gpiod references with 
function call
(no commit info)
[04/13] iio: imu: Replace all spi->chip_select and spi->cs_gpiod references 
with function call
(no commit info)
[05/13] mtd: devices: Replace all spi->chip_select and spi->cs_gpiod references 
with function call
(no commit info)
[06/13] staging: Replace all spi->chip_select and spi->cs_gpiod references with 
function call
(no commit info)
[07/13] platform/x86: serial-multi-instantiate: Replace all spi->chip_select 
and spi->cs_gpiod references with function call
(no commit info)

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

[PATCH v2 1/2] dt-bindings: display: simple: add support for InnoLux G070ACE-L01

2023-02-01 Thread richard . leitner

From: Richard Leitner 

Add Innolux G070ACE-L01 7" WVGA (800x480) TFT LCD panel compatible
string.

Acked-by: Krzysztof Kozlowski 
Signed-off-by: Richard Leitner 
---
 Documentation/devicetree/bindings/display/panel/panel-simple.yaml | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Documentation/devicetree/bindings/display/panel/panel-simple.yaml 
b/Documentation/devicetree/bindings/display/panel/panel-simple.yaml
index 18241f4051d2..fd3e5ad769dc 100644
--- a/Documentation/devicetree/bindings/display/panel/panel-simple.yaml
+++ b/Documentation/devicetree/bindings/display/panel/panel-simple.yaml
@@ -174,6 +174,8 @@ properties:
   - innolux,at043tn24
 # Innolux AT070TN92 7.0" WQVGA TFT LCD panel
   - innolux,at070tn92
+# Innolux G070ACE-L01 7" WVGA (800x480) TFT LCD panel
+  - innolux,g070ace-l01
 # Innolux G070Y2-L01 7" WVGA (800x480) TFT LCD panel
   - innolux,g070y2-l01
 # Innolux G070Y2-T02 7" WVGA (800x480) TFT LCD TTL panel

-- 
2.39.1

[PATCH v2 2/2] drm/panel: simple: Add InnoLux G070ACE-L01

2023-02-01 Thread richard . leitner

From: Richard Leitner 

Add InnoLux G070ACE-L01 7" 800x480 TFT LCD with WLED backlight panel
support. Timing data was extracted from datasheet and vendor provided
EDID file.

Signed-off-by: Richard Leitner 
---
 drivers/gpu/drm/panel/panel-simple.c | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/drivers/gpu/drm/panel/panel-simple.c 
b/drivers/gpu/drm/panel/panel-simple.c
index 8a3b685c2fcc..d90f86ad3789 100644
--- a/drivers/gpu/drm/panel/panel-simple.c
+++ b/drivers/gpu/drm/panel/panel-simple.c
@@ -2142,6 +2142,38 @@ static const struct panel_desc innolux_at070tn92 = {
.bus_format = MEDIA_BUS_FMT_RGB888_1X24,
 };
 
+static const struct display_timing innolux_g070ace_l01_timing = {
+   .pixelclock = { 2520, 3500, 3570 },
+   .hactive = { 800, 800, 800 },
+   .hfront_porch = { 30, 32, 87 },
+   .hback_porch = { 30, 32, 87 },
+   .hsync_len = { 1, 1, 1 },
+   .vactive = { 480, 480, 480 },
+   .vfront_porch = { 3, 3, 3 },
+   .vback_porch = { 13, 13, 13 },
+   .vsync_len = { 1, 1, 4 },
+   .flags = DISPLAY_FLAGS_DE_HIGH,
+};
+
+static const struct panel_desc innolux_g070ace_l01 = {
+   .timings = &innolux_g070ace_l01_timing,
+   .num_timings = 1,
+   .bpc = 8,
+   .size = {
+   .width = 152,
+   .height = 91,
+   },
+   .delay = {
+   .prepare = 10,
+   .enable = 50,
+   .disable = 50,
+   .unprepare = 500,
+   },
+   .bus_format = MEDIA_BUS_FMT_RGB888_1X7X4_SPWG,
+   .bus_flags = DRM_BUS_FLAG_DE_HIGH,
+   .connector_type = DRM_MODE_CONNECTOR_LVDS,
+};
+
 static const struct display_timing innolux_g070y2_l01_timing = {
.pixelclock = { 2800, 2950, 3200 },
.hactive = { 800, 800, 800 },
@@ -4098,6 +4130,9 @@ static const struct of_device_id platform_of_match[] = {
}, {
.compatible = "innolux,at070tn92",
.data = &innolux_at070tn92,
+   }, {
+   .compatible = "innolux,g070ace-l01",
+   .data = &innolux_g070ace_l01,
}, {
.compatible = "innolux,g070y2-l01",
.data = &innolux_g070y2_l01,

-- 
2.39.1

Re: [PATCH v2 02/13] spi: Replace all spi->chip_select and spi->cs_gpiod references with function call

2023-02-01 Thread Mark Brown

On Fri, Jan 20, 2023 at 12:23:31AM +0530, Amit Kumar Mahapatra wrote:
> Supporting multi-cs in spi drivers would require the chip_select & cs_gpiod
> members of struct spi_device to be an array. But changing the type of these
> members to array would break the spi driver functionality. To make the
> transition smoother introduced four new APIs to get/set the
> spi->chip_select & spi->cs_gpiod and replaced all spi->chip_select and
> spi->cs_gpiod references with get or set API calls.
> While adding multi-cs support in further patches the chip_select & cs_gpiod
> members of the spi_device structure would be converted to arrays & the
> "idx" parameter of the APIs would be used as array index i.e.,
> spi->chip_select[idx] & spi->cs_gpiod[idx] respectively.

This doesn't apply against current code, please check and resend.
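
The accessor approach described in the quoted commit message can be sketched
as follows. The demo_* names and the struct are hypothetical and are not the
actual SPI core API. The sketch only shows why routing every access through a
get/set pair with an idx parameter lets the member turn into an array later
without touching callers again.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical device; today it holds a single chip select. */
struct demo_spi_device {
    uint8_t chip_select; /* could later become chip_select[N], indexed by idx */
};

/* Callers pass idx already, even though it is ignored for now. */
static uint8_t demo_get_chipselect(const struct demo_spi_device *spi,
                                   uint8_t idx)
{
    (void)idx; /* unused while there is only one chip select */
    return spi->chip_select;
}

static void demo_set_chipselect(struct demo_spi_device *spi, uint8_t idx,
                                uint8_t cs)
{
    (void)idx; /* unused while there is only one chip select */
    spi->chip_select = cs;
}

/* Usage: drivers never touch the member directly, only the accessors. */
static void demo_chipselect_selfcheck(void)
{
    struct demo_spi_device dev = { 0 };

    demo_set_chipselect(&dev, 0, 3);
    assert(demo_get_chipselect(&dev, 0) == 3);
}
```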



[PATCH v2 0/2] panel-simple: Add InnoLux G070ACE-L01 support

2023-02-01 Thread richard . leitner

This series adds support for the InnoLux G070ACE-L01 7" 800x480 TFT LCD
panel with WLED backlight.

Signed-off-by: Richard Leitner 
---
Richard Leitner (2):
  dt-bindings: display: simple: add support for InnoLux G070ACE-L01
  drm/panel: simple: Add InnoLux G070ACE-L01

 .../bindings/display/panel/panel-simple.yaml   |  2 ++
 drivers/gpu/drm/panel/panel-simple.c   | 35 ++
 2 files changed, 37 insertions(+)
---
base-commit: c0b67534c95c537f7a506a06b98e5e85d72e2b7d
change-id: 20230201-innolux-g070ace-fda21c89efe2

Best regards,
-- 
Richard Leitner

RE: [PATCH] Revert "drm/display/dp_mst: Move all payload info into the atomic state"

2023-02-01 Thread Limonciello, Mario

[AMD Official Use Only - General]



> -Original Message-
> From: Greg KH 
> Sent: Sunday, January 29, 2023 07:32
> To: Limonciello, Mario 
> Cc: Linux regressions mailing list ; dri-
> de...@lists.freedesktop.org; sta...@vger.kernel.org;
> stanislav.lisovs...@intel.com; Zuo, Jerry ; amd-
> g...@lists.freedesktop.org; Lin, Wayne ; Guenter
> Roeck ; bske...@redhat.com
> Subject: Re: [PATCH] Revert "drm/display/dp_mst: Move all payload info into
> the atomic state"
> 
> On Fri, Jan 27, 2023 at 03:02:41PM +0000, Limonciello, Mario wrote:
> > [Public]
> >
> >
> >
> > > -Original Message-
> > > From: Linux kernel regression tracking (Thorsten Leemhuis)
> > > 
> > > Sent: Friday, January 27, 2023 03:15
> > > To: Greg KH ; Limonciello, Mario
> > > 
> > > Cc: dri-devel@lists.freedesktop.org; sta...@vger.kernel.org;
> > > stanislav.lisovs...@intel.com; Zuo, Jerry ; amd-
> > > g...@lists.freedesktop.org; Lin, Wayne ; Guenter
> > > Roeck ; bske...@redhat.com
> > > Subject: Re: [PATCH] Revert "drm/display/dp_mst: Move all payload info
> into
> > > the atomic state"
> > >
> > > On 27.01.23 08:39, Greg KH wrote:
> > > > On Fri, Jan 20, 2023 at 11:51:04AM -0600, Limonciello, Mario wrote:
> > > >> On 1/20/2023 11:46, Guenter Roeck wrote:
> > > >>> On Thu, Jan 12, 2023 at 04:50:44PM +0800, Wayne Lin wrote:
> > >  This reverts commit 4d07b0bc403403438d9cf88450506240c5faf92f.
> > > 
> > >  [Why]
> > >  Changes cause regression on amdgpu mst.
> > >  E.g.
> > >  In fill_dc_mst_payload_table_from_drm(), amdgpu expects to
> > > add/remove payload
> > >  one by one and call fill_dc_mst_payload_table_from_drm() to
> update
> > > the HW
> > >  maintained payload table. But previous change tries to go through
> all
> > > the
> > >  payloads in mst_state and update amdpug hw maintained table in
> once
> > > everytime
> > >  driver only tries to add/remove a specific payload stream only. The
> > > newly
> > >  design idea conflicts with the implementation in amdgpu nowadays.
> > > 
> > >  [How]
> > >  Revert this patch first. After addressing all regression problems
> caused
> > > by
> > >  this previous patch, will add it back and adjust it.
> > > >>>
> > > >>> Has there been any progress on this revert, or on fixing the
> underlying
> > > >>> problem ?
> > > >>>
> > > >>> Thanks,
> > > >>> Guenter
> > > >>
> > > >> Hi Guenter,
> > > >>
> > > >> Wayne is OOO for CNY, but let me update you.
> > > >>
> > > >> Harry has sent out this series which is a collection of proper fixes.
> > > >> https://patchwork.freedesktop.org/series/113125/
> > > >>
> > > >> Once that's reviewed and accepted, 4 of them are applicable for 6.1.
> > > >
> > > > > Any hint on when those will be reviewed and accepted?  patchwork doesn't
> > > > > show any activity on them, or at least I can't figure it out...
> > >
> > > I didn't look closer (hence please correct me if I'm wrong), but the
> > > core changes afaics are in the DRM pull airlied send a few hours ago to
> > > Linus (note the "amdgpu […] DP MST fixes" line):
> > >
> > >
> https://lore.kernel.org/all/CAPM%3D9tzuu4xnx6T5v7sKsK%2BA5HEaPOc1ie
> > > myznsyqzgztj%3d...@mail.gmail.com/
> >
> > That's right.  There are 4 commits in that PR with the appropriate stable tags
> > that should fix the majority of the MST issues introduced in 6.1 by 4d07b0bc40340
> > ("drm/display/dp_mst: Move all payload info into the atomic state"):
> >
> >   drm/amdgpu/display/mst: Fix mst_state->pbn_div and slot count assignments
> >   drm/amdgpu/display/mst: limit payload to be updated one by one
> >   drm/amdgpu/display/mst: update mst_mgr relevant variable when long HPD
> >   drm/display/dp_mst: Correct the kref of port.
> >
> > There will be follow ups for any remaining corner cases.
> 
> Great, thanks for this, all are now queued up in the 6.1.y queue.
> 
> greg k-h

Greg,

My apologies if this has been covered elsewhere and I missed it but I was
wondering if there was a decision made for whether 6.1.y will be an LTS kernel
release or not?

Re: [PATCH] drm/i915/guc: Improve debug message on context reset notification

2023-02-01 Thread John Harrison


On 1/31/2023 13:44, Michal Wajdeczko wrote:

Just recently we switched over to new GuC oriented log macros but in
the meantime yet another message was added that we missed to update.

While around improve that new message by adding engine name and use
existing helpers to check for context state.

Signed-off-by: Michal Wajdeczko 
Cc: John Harrison 

Reviewed-by: John Harrison 


---
  drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 7 ---
  1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
index 53f3ed3244d5..be495e657d66 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c
@@ -4660,9 +4660,10 @@ static void guc_handle_context_reset(struct intel_guc *guc,
  {
trace_intel_context_reset(ce);
  
-	drm_dbg(&guc_to_gt(guc)->i915->drm, "Got GuC reset of 0x%04X, exiting = %d, banned = %d\n",
-   ce->guc_id.id, test_bit(CONTEXT_EXITING, &ce->flags),
-   test_bit(CONTEXT_BANNED, &ce->flags));
+   guc_dbg(guc, "Got context reset notification: 0x%04X on %s, exiting = %s, banned = %s\n",
+   ce->guc_id.id, ce->engine->name,
+   str_yes_no(intel_context_is_exiting(ce)),
+   str_yes_no(intel_context_is_banned(ce)));
  
  	if (likely(intel_context_is_schedulable(ce))) {
capture_error_state(guc, ce);

Re: linux-next: manual merge of the usb tree with the drm-intel-fixes tree

2023-02-01 Thread John Harrison


On 2/1/2023 07:31, Rodrigo Vivi wrote:

On Wed, Feb 01, 2023 at 03:11:31PM +1100, Stephen Rothwell wrote:

Hi all,

On Tue, 31 Jan 2023 10:27:29 -0800 John Harrison  
wrote:

On 1/31/2023 04:44, Andy Shevchenko wrote:

On Tue, Jan 31, 2023 at 01:03:05PM +1100, Stephen Rothwell wrote:

Today's linux-next merge of the usb tree got a conflict in:

drivers/gpu/drm/i915/gt/intel_engine_cs.c

between commit:

5bc4b43d5c6c ("drm/i915: Fix up locking around dumping requests lists")

from the drm-intel-fixes tree and commit:

4d70c74659d9 ("i915: Move list_count() to list.h as list_count_nodes() for broader use")

from the usb tree.

I fixed it up (the former removed the code changed by the latter)

Hmm... Currently I see that 20230127002842.3169194-4-john.c.harri...@intel.com
moves the code to the drivers/gpu/drm/i915/gt/intel_execlists_submission.c.

Is there any new series beside the above mentioned that touches that file and
actually _removes_ that code?

As long as the removal is limited to list_count/list_count_nodes,
that's fine. I only moved it from one file to another because the one
and only function that was using it was being moved to the other
file. If someone else has found a use for the same and wants to move
it to a more common place then great. I assume there was no conflict
happening in the i915 specific code.

I have added this fix up patch to linux-next today (more or less - this
is a hand hacked version, but you get the idea):

From: Stephen Rothwell 
Date: Wed, 1 Feb 2023 13:13:01 +1100
Subject: [PATCH] i915: fix up for "drm/i915: Fix up locking around dumping requests lists"

interacting with "i915: Move list_count() to list.h as list_count_nodes() for broader use"

Signed-off-by: Stephen Rothwell 
---
  .../gpu/drm/i915/gt/intel_execlists_submission.c| 15 +
  1 file changed, 2 insertion(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
index 3c573d41d404..e919d41a48d9 100644
--- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
+++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
@@ -4150,17 +4150,6 @@ void intel_execlists_show_requests(struct intel_engine_cs *engine,
spin_unlock_irqrestore(&sched_engine->lock, flags);
  }
  
-static unsigned long list_count(struct list_head *list)
-{
-   struct list_head *pos;
-   unsigned long count = 0;
-
-   list_for_each(pos, list)
-   count++;
-
-   return count;
-}
-
  void intel_execlists_dump_active_requests(struct intel_engine_cs *engine,
  struct i915_request *hung_rq,
  struct drm_printer *m)
@@ -4172,7 +4161,7 @@ void intel_execlists_dump_active_requests(struct intel_engine_cs *engine,
intel_engine_dump_active_requests(&engine->sched_engine->requests, hung_rq, m);
  
-	drm_printf(m, "\tOn hold?: %lu\n",
-  list_count(&engine->sched_engine->hold));
+   drm_printf(m, "\tOn hold?: %zu\n",
+  list_count_nodes(&engine->sched_engine->hold));

something awkward here.
The resolution on linux-next should align with the resolution on drm-tip
where we have the list_count still there as we preferred the version
on drm-intel-gt-next as the resolution of the conflict instead of the
fixes one.

Not following why you want to keep list_count as a local function in the 
i915 driver? Surely the correct fix is to move it to the common header 
and share the code? In which case, the correct name is 
list_count_nodes() as that is what got merged to the common header.


John.



  
  	spin_unlock_irqrestore(&engine->sched_engine->lock, flags);
  }
--
2.35.1

--
Cheers,
Stephen Rothwell
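
For context, the helper at the centre of this merge conflict simply walks a circular doubly linked list and counts its entries. A minimal user-space sketch follows; struct list_head is mocked here, list_init() and list_add_tail_node() are hypothetical helpers for the example only, and the size_t return type is inferred from the %zu format specifier in the fix-up above.

```c
#include <stddef.h>

/* User-space mock of the kernel's circular doubly linked list node. */
struct list_head {
	struct list_head *next, *prev;
};

/* An empty list is a head that points at itself. */
static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Hypothetical helper for this example: append node before head. */
static void list_add_tail_node(struct list_head *node, struct list_head *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Count entries by walking from head->next until we are back at head. */
static size_t list_count_nodes(struct list_head *head)
{
	size_t count = 0;

	for (struct list_head *pos = head->next; pos != head; pos = pos->next)
		count++;

	return count;
}
```

The walk is O(n), which is acceptable for a debug dump such as the "On hold?" line but is one reason such a helper is not meant for hot paths.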

Re: [PATCH v2 0/8] drm/vmwgfx: Refactor the buffer object code

2023-02-01 Thread Zack Rusin

On Tue, 2023-01-31 at 20:28 +0100, Thomas Zimmermann wrote:
> Hi,
> 
> I cannot comment much on the internal workings of the vmwgfx driver, but 
> feel free to add
> 
> Acked-by: Thomas Zimmermann 
> 
> to the patches that don't already have my r-b.

Thanks so much for taking a look at this series. I really appreciate it!

z

Re: [PATCH v10 04/23] drm/i915/vm_bind: Support partially mapped vma resource

2023-02-01 Thread Andi Shyti

Hi Niranjana,

On Tue, Jan 17, 2023 at 11:15:50PM -0800, Niranjana Vishwanathapura wrote:
> As persistent vmas can be partially mapped to an object,
> remove the restriction that requires the vma resource sg table
> to be just a pointer to the object's sg table.
> 
> Reviewed-by: Matthew Auld 
> Signed-off-by: Niranjana Vishwanathapura 

Reviewed-by: Andi Shyti 

Andi

> ---
>  drivers/gpu/drm/i915/i915_vma.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
> index 34f0e6c923c2..79b2e19a299f 100644
> --- a/drivers/gpu/drm/i915/i915_vma.c
> +++ b/drivers/gpu/drm/i915/i915_vma.c
> @@ -2060,8 +2060,7 @@ static struct dma_fence *__i915_vma_unbind_async(struct i915_vma *vma)
>   if (!drm_mm_node_allocated(&vma->node))
>   return NULL;
>  
> - if (i915_vma_is_pinned(vma) ||
> - &vma->obj->mm.rsgt->table != vma->resource->bi.pages)
> + if (i915_vma_is_pinned(vma))
>   return ERR_PTR(-EAGAIN);
>  
>   /*
> -- 
> 2.21.0.rc0.32.g243a4c7e27

Re: [PATCH v10 03/23] drm/i915/vm_bind: Expose i915_gem_object_max_page_size()

2023-02-01 Thread Andi Shyti

Hi Niranjana,

On Tue, Jan 17, 2023 at 11:15:49PM -0800, Niranjana Vishwanathapura wrote:
> Make i915_gem_object_max_page_size() non-static so that it can be
> used by the vm_bind feature.
> 
> Reviewed-by: Matthew Auld 
> Signed-off-by: Niranjana Vishwanathapura 
> Signed-off-by: Andi Shyti 
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_create.c | 18 +-
>  drivers/gpu/drm/i915/gem/i915_gem_object.h |  2 ++
>  2 files changed, 15 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_create.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_create.c
> index 005a7f842784..86469710bd59 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_create.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_create.c
> @@ -15,10 +15,18 @@
>  #include "i915_trace.h"
>  #include "i915_user_extensions.h"
>  
> -static u32 object_max_page_size(struct intel_memory_region **placements,
> - unsigned int n_placements)
> +/**
> + * i915_gem_object_max_page_size() - max of min_page_size of the regions
> + * @placements:  list of regions
> + * @n_placements: number of the placements
> + *
> + * Returns the largest of min_page_size of the @placements,
> + * or I915_GTT_PAGE_SIZE_4K if @n_placements is 0.
> + */
> +u32 i915_gem_object_max_page_size(struct intel_memory_region **placements,
> +   unsigned int n_placements)
>  {
> - u32 max_page_size = 0;
> + u32 max_page_size = I915_GTT_PAGE_SIZE_4K;
>   int i;
>  
>   for (i = 0; i < n_placements; i++) {
> @@ -28,7 +36,6 @@ static u32 object_max_page_size(struct intel_memory_region 
> **placements,
>   max_page_size = max_t(u32, max_page_size, mr->min_page_size);
>   }
>  
> - GEM_BUG_ON(!max_page_size);

this change, though, is not described in the commit log. Can you
please add two lines?

Thanks,
Andi

>   return max_page_size;
>  }
>  
> @@ -99,7 +106,8 @@ __i915_gem_object_create_user_ext(struct drm_i915_private 
> *i915, u64 size,
>  
>   i915_gem_flush_free_objects(i915);
>  
> - size = round_up(size, object_max_page_size(placements, n_placements));
> + size = round_up(size, i915_gem_object_max_page_size(placements,
> + n_placements));
>   if (size == 0)
>   return ERR_PTR(-EINVAL);
>  
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h 
> b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> index 3db53769864c..5455ca0eabe9 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
> @@ -47,6 +47,8 @@ static inline bool i915_gem_object_size_2big(u64 size)
>  }
>  
>  void i915_gem_init__objects(struct drm_i915_private *i915);
> +u32 i915_gem_object_max_page_size(struct intel_memory_region **placements,
> +   unsigned int n_placements);
>  
>  void i915_objects_module_exit(void);
>  int i915_objects_module_init(void);
> -- 
> 2.21.0.rc0.32.g243a4c7e27
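
The behaviour documented in the kernel-doc comment above (largest min_page_size, with a 4K floor) can be sketched in user space as follows. This is an illustration under assumptions: the region struct is a one-field mock, MOCK_PAGE_SIZE_4K stands in for I915_GTT_PAGE_SIZE_4K and is assumed to be 4096, and round_up_pow2() is a hypothetical helper that is only valid for power-of-two alignments.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed value for the 4K GTT page size used as the floor. */
#define MOCK_PAGE_SIZE_4K 4096u

/* One-field mock of struct intel_memory_region. */
struct intel_memory_region {
	uint32_t min_page_size;
};

/* Largest min_page_size among the placements, never below 4K. */
static uint32_t max_min_page_size(struct intel_memory_region **placements,
				  unsigned int n_placements)
{
	uint32_t max_page_size = MOCK_PAGE_SIZE_4K;

	for (unsigned int i = 0; i < n_placements; i++) {
		if (placements[i]->min_page_size > max_page_size)
			max_page_size = placements[i]->min_page_size;
	}

	return max_page_size;
}

/* Round size up to a multiple of align; align must be a power of two. */
static uint64_t round_up_pow2(uint64_t size, uint64_t align)
{
	return (size + align - 1) & ~(align - 1);
}
```

Because the accumulator now starts at 4K instead of 0, the result can never be zero, which is presumably why the GEM_BUG_ON(!max_page_size) check became redundant; as the review comment notes, that reasoning is not stated in the commit log.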

Re: [PATCH v10 02/23] drm/i915/vm_bind: Add __i915_sw_fence_await_reservation()

2023-02-01 Thread Andi Shyti

Hi Niranjana,

On Tue, Jan 17, 2023 at 11:15:48PM -0800, Niranjana Vishwanathapura wrote:
> Add function __i915_sw_fence_await_reservation() for
> asynchronous wait on a dma-resv object with specified
> dma_resv_usage. This is required for async vma unbind
> with vm_bind.
> 
> Reviewed-by: Matthew Auld 
> Signed-off-by: Niranjana Vishwanathapura 

Reviewed-by: Andi Shyti 

Andi

> ---
>  drivers/gpu/drm/i915/i915_sw_fence.c | 28 +---
>  drivers/gpu/drm/i915/i915_sw_fence.h | 23 +--
>  2 files changed, 38 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c 
> b/drivers/gpu/drm/i915/i915_sw_fence.c
> index cc2a8821d22a..ae06d35db056 100644
> --- a/drivers/gpu/drm/i915/i915_sw_fence.c
> +++ b/drivers/gpu/drm/i915/i915_sw_fence.c
> @@ -7,7 +7,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  
>  #include "i915_sw_fence.h"
>  #include "i915_selftest.h"
> @@ -569,11 +568,26 @@ int __i915_sw_fence_await_dma_fence(struct 
> i915_sw_fence *fence,
>   return ret;
>  }
>  
> -int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
> - struct dma_resv *resv,
> - bool write,
> - unsigned long timeout,
> - gfp_t gfp)
> +/**
> + * __i915_sw_fence_await_reservation() - Setup a fence to wait on a dma-resv
> + * object with specified usage.
> + * @fence: the fence that needs to wait
> + * @resv: dma-resv object
> + * @usage: dma_resv_usage (See enum dma_resv_usage)
> + * @timeout: how long to wait in jiffies
> + * @gfp: allocation mode
> + *
> + * Setup the @fence to asynchronously wait on dma-resv object @resv for
> + * @usage to complete before signaling.
> + *
> + * Returns 0 if there is nothing to wait on, -ve error code upon error
> + * and >0 upon successfully setting up the wait.
> + */
> +int __i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
> +   struct dma_resv *resv,
> +   enum dma_resv_usage usage,
> +   unsigned long timeout,
> +   gfp_t gfp)
>  {
>   struct dma_resv_iter cursor;
>   struct dma_fence *f;
> @@ -582,7 +596,7 @@ int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
>   debug_fence_assert(fence);
>   might_sleep_if(gfpflags_allow_blocking(gfp));
>  
> - dma_resv_iter_begin(&cursor, resv, dma_resv_usage_rw(write));
> + dma_resv_iter_begin(&cursor, resv, usage);
>   dma_resv_for_each_fence_unlocked(&cursor, f) {
>   pending = i915_sw_fence_await_dma_fence(fence, f, timeout,
>   gfp);
> diff --git a/drivers/gpu/drm/i915/i915_sw_fence.h 
> b/drivers/gpu/drm/i915/i915_sw_fence.h
> index f752bfc7c6e1..9c4859dc4c0d 100644
> --- a/drivers/gpu/drm/i915/i915_sw_fence.h
> +++ b/drivers/gpu/drm/i915/i915_sw_fence.h
> @@ -10,13 +10,13 @@
>  #define _I915_SW_FENCE_H_
>  
>  #include 
> +#include 
>  #include 
>  #include 
>  #include  /* for NOTIFY_DONE */
>  #include 
>  
>  struct completion;
> -struct dma_resv;
>  struct i915_sw_fence;
>  
>  enum i915_sw_fence_notify {
> @@ -89,11 +89,22 @@ int i915_sw_fence_await_dma_fence(struct i915_sw_fence 
> *fence,
> unsigned long timeout,
> gfp_t gfp);
>  
> -int i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
> - struct dma_resv *resv,
> - bool write,
> - unsigned long timeout,
> - gfp_t gfp);
> +int __i915_sw_fence_await_reservation(struct i915_sw_fence *fence,
> +   struct dma_resv *resv,
> +   enum dma_resv_usage usage,
> +   unsigned long timeout,
> +   gfp_t gfp);
> +
> +static inline int i915_sw_fence_await_reservation(struct i915_sw_fence 
> *fence,
> +   struct dma_resv *resv,
> +   bool write,
> +   unsigned long timeout,
> +   gfp_t gfp)
> +{
> + return __i915_sw_fence_await_reservation(fence, resv,
> +  dma_resv_usage_rw(write),
> +  timeout, gfp);
> +}
>  
>  bool i915_sw_fence_await(struct i915_sw_fence *fence);
>  void i915_sw_fence_complete(struct i915_sw_fence *fence);
> -- 
> 2.21.0.rc0.32.g243a4c7e27
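
The refactoring pattern in this patch, a general function taking an explicit usage plus a thin inline wrapper that keeps the old bool-based signature, can be sketched outside the kernel as below. The fence struct and both function names are mocks invented for the example, and the mock only records which usage was requested instead of iterating over real fences. The enum and the dma_resv_usage_rw() mapping mirror the kernel definitions as I understand them: a writer must wait for readers and writers, a reader only for writers.

```c
#include <stdbool.h>

/* Mirrors the ordering of the kernel's enum dma_resv_usage. */
enum dma_resv_usage {
	DMA_RESV_USAGE_KERNEL,
	DMA_RESV_USAGE_WRITE,
	DMA_RESV_USAGE_READ,
	DMA_RESV_USAGE_BOOKKEEP,
};

/* A writer waits on READ (which includes writers); a reader waits on WRITE. */
static inline enum dma_resv_usage dma_resv_usage_rw(bool write)
{
	return write ? DMA_RESV_USAGE_READ : DMA_RESV_USAGE_WRITE;
}

/* Mock fence that just remembers the usage it was asked to wait on. */
struct mock_sw_fence {
	enum dma_resv_usage waited_usage;
};

/* General form: the caller chooses the usage class directly. */
static int mock_await_reservation_usage(struct mock_sw_fence *fence,
					enum dma_resv_usage usage)
{
	fence->waited_usage = usage;
	return 0; /* the mock never has anything to wait on */
}

/* Compatibility wrapper: existing callers keep passing a bool. */
static inline int mock_await_reservation(struct mock_sw_fence *fence, bool write)
{
	return mock_await_reservation_usage(fence, dma_resv_usage_rw(write));
}
```

Existing callers are untouched by this split, while new code such as the async vma unbind path mentioned in the commit message can request a usage class that the bool cannot express, for example DMA_RESV_USAGE_BOOKKEEP.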

[PATCH] drm: add kms driver for loongson display controller

2023-02-01 Thread suijingfeng

From: suijingfeng 

The display controller is a PCI device with two display pipes.
Loongson display controllers are simple and require scanout buffers
to be physically contiguous.

For the LS7A1000 bridge chip, each pipe has a DVO interface which provides
RGB888 signals, vertical and horizontal synchronisation, and the pixel
clock. Each CRTC supports 1920x1080 at 60Hz; the maximum resolution is
2048x2048 according to the hardware spec.

For the DC in the LS7A2000 bridge chip, each pipe is connected to a
built-in HDMI encoder which supports 3840x2160@30.
Signed-off-by: suijingfeng 
Signed-off-by: suijingfeng <15330273...@189.cn>
---
 drivers/gpu/drm/Kconfig |   2 +
 drivers/gpu/drm/Makefile|   1 +
 drivers/gpu/drm/lsdc/Kconfig|  18 +
 drivers/gpu/drm/lsdc/Makefile   |  14 +
 drivers/gpu/drm/lsdc/lsdc_crtc.c| 437 +
 drivers/gpu/drm/lsdc/lsdc_debugfs.c | 243 
 drivers/gpu/drm/lsdc/lsdc_drv.c | 535 ++
 drivers/gpu/drm/lsdc/lsdc_drv.h | 274 +
 drivers/gpu/drm/lsdc/lsdc_i2c.c | 187 +
 drivers/gpu/drm/lsdc/lsdc_irq.c |  81 
 drivers/gpu/drm/lsdc/lsdc_output.c  | 366 ++
 drivers/gpu/drm/lsdc/lsdc_plane.c   | 447 ++
 drivers/gpu/drm/lsdc/lsdc_pll.c | 571 
 drivers/gpu/drm/lsdc/lsdc_pll.h |  78 
 drivers/gpu/drm/lsdc/lsdc_regs.c|  29 ++
 drivers/gpu/drm/lsdc/lsdc_regs.h| 294 ++
 16 files changed, 3577 insertions(+)
 create mode 100644 drivers/gpu/drm/lsdc/Kconfig
 create mode 100644 drivers/gpu/drm/lsdc/Makefile
 create mode 100644 drivers/gpu/drm/lsdc/lsdc_crtc.c
 create mode 100644 drivers/gpu/drm/lsdc/lsdc_debugfs.c
 create mode 100644 drivers/gpu/drm/lsdc/lsdc_drv.c
 create mode 100644 drivers/gpu/drm/lsdc/lsdc_drv.h
 create mode 100644 drivers/gpu/drm/lsdc/lsdc_i2c.c
 create mode 100644 drivers/gpu/drm/lsdc/lsdc_irq.c
 create mode 100644 drivers/gpu/drm/lsdc/lsdc_output.c
 create mode 100644 drivers/gpu/drm/lsdc/lsdc_plane.c
 create mode 100644 drivers/gpu/drm/lsdc/lsdc_pll.c
 create mode 100644 drivers/gpu/drm/lsdc/lsdc_pll.h
 create mode 100644 drivers/gpu/drm/lsdc/lsdc_regs.c
 create mode 100644 drivers/gpu/drm/lsdc/lsdc_regs.h

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 663ea8f9966d..bb256a1ae24a 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -368,6 +368,8 @@ source "drivers/gpu/drm/solomon/Kconfig"
 
 source "drivers/gpu/drm/sprd/Kconfig"
 
+source "drivers/gpu/drm/lsdc/Kconfig"
+
 config DRM_HYPERV
tristate "DRM Support for Hyper-V synthetic video device"
depends on DRM && PCI && MMU && HYPERV
diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index 496fa5a6147a..7fd6412ffbfa 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -197,3 +197,4 @@ obj-y   += gud/
 obj-$(CONFIG_DRM_HYPERV) += hyperv/
 obj-y  += solomon/
 obj-$(CONFIG_DRM_SPRD) += sprd/
+obj-$(CONFIG_DRM_LSDC) += lsdc/
diff --git a/drivers/gpu/drm/lsdc/Kconfig b/drivers/gpu/drm/lsdc/Kconfig
new file mode 100644
index ..56a50014df9d
--- /dev/null
+++ b/drivers/gpu/drm/lsdc/Kconfig
@@ -0,0 +1,18 @@
+# SPDX-License-Identifier: GPL-2.0
+
+config DRM_LSDC
+   tristate "DRM support for loongson display controller"
+   depends on DRM && PCI && (MACH_LOONGSON64 || LOONGARCH || MIPS)
+   select OF
+   select DRM_KMS_HELPER
+   select DRM_TTM
+   select DRM_TTM_HELPER
+   select DRM_VRAM_HELPER
+   select VIDEOMODE_HELPERS
+   default m
+   help
+ This is a KMS driver for loongson display controller in the
+ LS7A1000/LS7A2000 bridge chip and LS2K1000/LS2K0500 SoC.
+ If "M" is selected, the module will be called lsdc.
+
+ If in doubt, say "Y".
diff --git a/drivers/gpu/drm/lsdc/Makefile b/drivers/gpu/drm/lsdc/Makefile
new file mode 100644
index ..d4b901ec09cf
--- /dev/null
+++ b/drivers/gpu/drm/lsdc/Makefile
@@ -0,0 +1,14 @@
+# SPDX-License-Identifier: GPL-2.0
+
+lsdc-y := \
+   lsdc_drv.o \
+   lsdc_crtc.o \
+   lsdc_irq.o \
+   lsdc_plane.o \
+   lsdc_pll.o \
+   lsdc_i2c.o \
+   lsdc_output.o \
+   lsdc_regs.o \
+   lsdc_debugfs.o
+
+obj-$(CONFIG_DRM_LSDC) += lsdc.o
diff --git a/drivers/gpu/drm/lsdc/lsdc_crtc.c b/drivers/gpu/drm/lsdc/lsdc_crtc.c
new file mode 100644
index ..0737b3c3084f
--- /dev/null
+++ b/drivers/gpu/drm/lsdc/lsdc_crtc.c
@@ -0,0 +1,437 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include 
+#include 
+#include 
+
+#include "lsdc_drv.h"
+#include "lsdc_regs.h"
+#include "lsdc_pll.h"
+
+static int lsdc_crtc_enable_vblank(struct drm_crtc *crtc)
+{
+   struct lsdc_device *ldev = to_lsdc(crtc->dev);
+   unsigned int index = drm_crtc_index(crtc);
+   u32 val;
+
+   val = lsdc_rreg32(ldev, LSDC_INT_REG);
+
+

Re: [PATCH v2] drm/amdgpu/fence: Fix oops due to non-matching drm_sched init/fini

2023-02-01 Thread Luben Tuikov

Reviewed-by: Luben Tuikov 

Regards,
Luben

On 2023-02-01 11:48, Guilherme G. Piccoli wrote:
> Currently amdgpu calls drm_sched_fini() from the fence driver sw fini
> routine - such function is expected to be called only after the
> respective init function - drm_sched_init() - was executed successfully.
>
> It happens that we faced a driver probe failure on the Steam Deck
> recently, and drm_sched_fini() was called even though its counterpart
> had not been called previously, causing the following oops:
>
> amdgpu: probe of 0000:04:00.0 failed with error -110
> BUG: kernel NULL pointer dereference, address: 0000000000000090
> PGD 0 P4D 0
> Oops: 0002 [#1] PREEMPT SMP NOPTI
> CPU: 0 PID: 609 Comm: systemd-udevd Not tainted 6.2.0-rc3-gpiccoli #338
> Hardware name: Valve Jupiter/Jupiter, BIOS F7A0113 11/04/2022
> RIP: 0010:drm_sched_fini+0x84/0xa0 [gpu_sched]
> [...]
> Call Trace:
>  
>  amdgpu_fence_driver_sw_fini+0xc8/0xd0 [amdgpu]
>  amdgpu_device_fini_sw+0x2b/0x3b0 [amdgpu]
>  amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
>  devm_drm_dev_init_release+0x49/0x70
>  [...]
>
> To prevent that, check if the drm_sched was properly initialized for a
> given ring before calling its fini counter-part.
>
> Notice ideally we'd use sched.ready for that; such field is set as the latest
> thing on drm_sched_init(). But amdgpu seems to "override" the meaning of such
> field - in the above oops for example, it was a GFX ring causing the crash, and
> the sched.ready field was set to true in the ring init routine, regardless of
> the state of the DRM scheduler. Hence, we ended-up using sched.ops as per
> Christian's suggestion [0].
>
> [0] 
> https://lore.kernel.org/amd-gfx/984ee981-2906-0eaf-ccec-9f80975cb...@amd.com/
>
> Fixes: 067f44c8b459 ("drm/amdgpu: avoid over-handle of fence driver fini in s3 test (v2)")
> Suggested-by: Christian König 
> Cc: Guchun Chen 
> Cc: Luben Tuikov 
> Cc: Mario Limonciello 
> Signed-off-by: Guilherme G. Piccoli 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> index 00444203220d..3b962cb680a6 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> @@ -618,7 +618,13 @@ void amdgpu_fence_driver_sw_fini(struct amdgpu_device *adev)
>   if (!ring || !ring->fence_drv.initialized)
>   continue;
>  
> - if (!ring->no_scheduler)
> + /*
> +  * Notice we check for sched.ops since there's some
> +  * override on the meaning of sched.ready by amdgpu.
> +  * The natural check would be sched.ready, which is
> +  * set as drm_sched_init() finishes...
> +  */
> + if (!ring->no_scheduler && ring->sched.ops)
>   drm_sched_fini(&ring->sched);
>  
>   for (j = 0; j <= ring->fence_drv.num_fences_mask; ++j)
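
The guard added by this patch can be illustrated with a small user-space mock. Everything below is hypothetical scaffolding (struct names, mock_sched_init(), the fini_calls counter); it only models the reasoning in the commit message: ops is set exclusively by the scheduler's init, while ready may also be set by driver code, so ops is the safer indicator that init really ran.

```c
#include <stdbool.h>
#include <stddef.h>

struct mock_sched_ops {
	int unused;
};

struct mock_gpu_scheduler {
	const struct mock_sched_ops *ops; /* set only by mock_sched_init() */
	bool ready;                       /* may also be set by driver code */
	int fini_calls;                   /* bookkeeping for the example */
};

struct mock_ring {
	bool no_scheduler;
	struct mock_gpu_scheduler sched;
};

/* Stand-in for scheduler init: ops first, ready as the last step. */
static void mock_sched_init(struct mock_gpu_scheduler *sched,
			    const struct mock_sched_ops *ops)
{
	sched->ops = ops;
	sched->ready = true;
}

/* The check from the patch: tear down only what init actually set up. */
static bool mock_ring_needs_sched_fini(const struct mock_ring *ring)
{
	return !ring->no_scheduler && ring->sched.ops != NULL;
}

static void mock_ring_sw_fini(struct mock_ring *ring)
{
	if (mock_ring_needs_sched_fini(ring)) {
		ring->sched.fini_calls++;
		ring->sched.ready = false;
	}
}
```

A ring whose ready flag was set by the driver but whose scheduler was never initialised (the failed-probe case from the oops) is skipped, whereas a check on ready alone would have attempted the teardown.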

[PATCH 3/3] drm/mediatek: Enable AR30 and BA30 overlays on MT8195

2023-02-01 Thread Justin Green

Tested using "modetest -P" on an MT8195 device.

Signed-off-by: Justin Green 
---
 drivers/gpu/drm/mediatek/mtk_disp_ovl.c | 21 +++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/mediatek/mtk_disp_ovl.c 
b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
index a6255e847104..7d26f7055751 100644
--- a/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
+++ b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
@@ -85,6 +85,22 @@ static const u32 mt8173_formats[] = {
DRM_FORMAT_YUYV,
 };
 
+static const u32 mt8195_formats[] = {
+   DRM_FORMAT_XRGB8888,
+   DRM_FORMAT_ARGB8888,
+   DRM_FORMAT_ARGB2101010,
+   DRM_FORMAT_BGRX8888,
+   DRM_FORMAT_BGRA8888,
+   DRM_FORMAT_BGRA1010102,
+   DRM_FORMAT_ABGR8888,
+   DRM_FORMAT_XBGR8888,
+   DRM_FORMAT_RGB888,
+   DRM_FORMAT_BGR888,
+   DRM_FORMAT_RGB565,
+   DRM_FORMAT_UYVY,
+   DRM_FORMAT_YUYV,
+};
+
 struct mtk_disp_ovl_data {
unsigned int addr;
unsigned int gmc_bits;
@@ -616,8 +632,9 @@ static const struct mtk_disp_ovl_data mt8195_ovl_driver_data = {
.fmt_rgb565_is_0 = true,
.smi_id_en = true,
.supports_afbc = true,
-   .formats = mt8173_formats,
-   .num_formats = ARRAY_SIZE(mt8173_formats),
+   .formats = mt8195_formats,
+   .num_formats = ARRAY_SIZE(mt8195_formats),
+   .supports_clrfmt_ext = true,
 };
 
 static const struct of_device_id mtk_disp_ovl_driver_dt_match[] = {
-- 
2.39.1.456.gfc5497dd1b-goog

[PATCH 2/3] drm/mediatek: Add support for AR30 and BA30 overlays

2023-02-01 Thread Justin Green

Tested using "modetest -P" on an MT8195 device.

Signed-off-by: Justin Green 
---
 drivers/gpu/drm/mediatek/mtk_disp_ovl.c | 33 +
 1 file changed, 33 insertions(+)

diff --git a/drivers/gpu/drm/mediatek/mtk_disp_ovl.c 
b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
index 8743c8047dc9..a6255e847104 100644
--- a/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
+++ b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
@@ -41,6 +41,7 @@
 #define DISP_REG_OVL_RDMA_CTRL(n)  (0x00c0 + 0x20 * (n))
 #define DISP_REG_OVL_RDMA_GMC(n)   (0x00c8 + 0x20 * (n))
 #define DISP_REG_OVL_ADDR_MT2701   0x0040
+#define DISP_REG_OVL_CLRFMT_EXT 0x02D0
 #define DISP_REG_OVL_ADDR_MT8173   0x0f40
 #define DISP_REG_OVL_ADDR(ovl, n)  ((ovl)->data->addr + 0x20 * (n))
 #define DISP_REG_OVL_HDR_ADDR(ovl, n)  ((ovl)->data->addr + 0x20 * (n) + 0x04)
@@ -61,6 +62,10 @@
0 : OVL_CON_CLRFMT_RGB)
 #define OVL_CON_CLRFMT_RGB888(ovl) ((ovl)->data->fmt_rgb565_is_0 ? \
OVL_CON_CLRFMT_RGB : 0)
+#define OVL_CON_CLRFMT_BIT_DEPTH_MASK(ovl) (0xFF << 4 * (ovl))
+#define OVL_CON_CLRFMT_BIT_DEPTH(depth, ovl)   (depth << 4 * (ovl))
+#define OVL_CON_CLRFMT_8_BIT   0x00
+#define OVL_CON_CLRFMT_10_BIT  0x01
 #define OVL_CON_AEN BIT(8)
 #define OVL_CON_ALPHA   0xff
 #define OVL_CON_VIRT_FLIP   BIT(9)
@@ -89,6 +94,7 @@ struct mtk_disp_ovl_data {
bool supports_afbc;
const u32 *formats;
size_t num_formats;
+   bool supports_clrfmt_ext;
 };
 
 /*
@@ -218,6 +224,30 @@ static void mtk_ovl_set_afbc(struct mtk_disp_ovl *ovl, 
struct cmdq_pkt *cmdq_pkt
   DISP_REG_OVL_DATAPATH_CON, OVL_LAYER_AFBC_EN(idx));
 }
 
+static void mtk_ovl_set_bit_depth(struct device *dev, int idx, u32 format,
+ struct cmdq_pkt *cmdq_pkt)
+{
+   struct mtk_disp_ovl *ovl = dev_get_drvdata(dev);
+   unsigned int reg;
+   unsigned int bit_depth = OVL_CON_CLRFMT_8_BIT;
+
+   if (!ovl->data->supports_clrfmt_ext)
+   return;
+
+   reg = readl(ovl->regs + DISP_REG_OVL_CLRFMT_EXT);
+   reg &= ~OVL_CON_CLRFMT_BIT_DEPTH_MASK(idx);
+
+   if (format == DRM_FORMAT_RGBA1010102 ||
+   format == DRM_FORMAT_BGRA1010102 ||
+   format == DRM_FORMAT_ARGB2101010)
+   bit_depth = OVL_CON_CLRFMT_10_BIT;
+
+   reg |= OVL_CON_CLRFMT_BIT_DEPTH(bit_depth, idx);
+
+   mtk_ddp_write(cmdq_pkt, reg, &ovl->cmdq_reg,
+ ovl->regs, DISP_REG_OVL_CLRFMT_EXT);
+}
+
 void mtk_ovl_config(struct device *dev, unsigned int w,
unsigned int h, unsigned int vrefresh,
unsigned int bpc, struct cmdq_pkt *cmdq_pkt)
@@ -332,9 +362,11 @@ static unsigned int ovl_fmt_convert(struct mtk_disp_ovl *ovl, unsigned int fmt)
return OVL_CON_CLRFMT_ARGB8888;
case DRM_FORMAT_BGRX8888:
case DRM_FORMAT_BGRA8888:
+   case DRM_FORMAT_BGRA1010102:
return OVL_CON_CLRFMT_ARGB8888 | OVL_CON_BYTE_SWAP;
case DRM_FORMAT_XRGB8888:
case DRM_FORMAT_ARGB8888:
+   case DRM_FORMAT_ARGB2101010:
return OVL_CON_CLRFMT_RGBA8888;
case DRM_FORMAT_XBGR8888:
case DRM_FORMAT_ABGR8888:
@@ -418,6 +450,7 @@ void mtk_ovl_layer_config(struct device *dev, unsigned int idx,
  &ovl->cmdq_reg, ovl->regs, DISP_REG_OVL_PITCH_MSB(idx));
}
 
+   mtk_ovl_set_bit_depth(dev, idx, fmt, cmdq_pkt);
mtk_ovl_layer_on(dev, idx, cmdq_pkt);
 }
 
-- 
2.39.1.456.gfc5497dd1b-goog
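
The read-modify-write performed by mtk_ovl_set_bit_depth() can be sketched as a pure function on a register value. This is an illustration under assumptions, not the driver's code: it assumes each layer owns a 4-bit field in the register, which is what the "4 * (ovl)" shift in the posted macros suggests (the posted mask itself is 0xFF), and all names below are hypothetical.

```c
#include <stdint.h>

/* Assumed layout: one 4-bit depth field per layer, layer 0 in the low bits. */
#define MOCK_CLRFMT_FIELD_BITS 4u
#define MOCK_CLRFMT_FIELD_MASK 0xFu
#define MOCK_CLRFMT_8_BIT      0x0u
#define MOCK_CLRFMT_10_BIT     0x1u

/*
 * Return reg with the depth field of the given layer replaced and every
 * other layer's field preserved.
 */
static uint32_t mock_clrfmt_set_depth(uint32_t reg, unsigned int layer,
				      uint32_t depth)
{
	unsigned int shift = MOCK_CLRFMT_FIELD_BITS * layer;

	reg &= ~(MOCK_CLRFMT_FIELD_MASK << shift);   /* clear this layer's field */
	reg |= (depth & MOCK_CLRFMT_FIELD_MASK) << shift; /* set the new depth */

	return reg;
}
```

Masking the depth argument and parenthesising the shift keeps one layer's update from spilling into its neighbour, which matters when several layers are reconfigured in sequence.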

[PATCH 1/3] drm/mediatek: Refactor pixel format logic

2023-02-01 Thread Justin Green

Add a DDP component interface for querying pixel format support and move the
list of supported pixel formats into the DDP components instead of
mtk_drm_plane.c.

Tested by running Chrome on an MT8195.

Signed-off-by: Justin Green 
---
 drivers/gpu/drm/mediatek/mtk_disp_drv.h |  4 ++
 drivers/gpu/drm/mediatek/mtk_disp_ovl.c | 44 +
 drivers/gpu/drm/mediatek/mtk_disp_rdma.c| 38 ++
 drivers/gpu/drm/mediatek/mtk_drm_crtc.c |  4 +-
 drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c |  4 ++
 drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h | 20 ++
 drivers/gpu/drm/mediatek/mtk_drm_plane.c| 24 ---
 drivers/gpu/drm/mediatek/mtk_drm_plane.h|  3 +-
 8 files changed, 123 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/mediatek/mtk_disp_drv.h 
b/drivers/gpu/drm/mediatek/mtk_disp_drv.h
index 33e61a136bbc..0df6a06defb8 100644
--- a/drivers/gpu/drm/mediatek/mtk_disp_drv.h
+++ b/drivers/gpu/drm/mediatek/mtk_disp_drv.h
@@ -96,6 +96,8 @@ void mtk_ovl_register_vblank_cb(struct device *dev,
 void mtk_ovl_unregister_vblank_cb(struct device *dev);
 void mtk_ovl_enable_vblank(struct device *dev);
 void mtk_ovl_disable_vblank(struct device *dev);
+const u32 *mtk_ovl_get_formats(struct device *dev);
+size_t mtk_ovl_get_num_formats(struct device *dev);
 
 void mtk_rdma_bypass_shadow(struct device *dev);
 int mtk_rdma_clk_enable(struct device *dev);
@@ -115,6 +117,8 @@ void mtk_rdma_register_vblank_cb(struct device *dev,
 void mtk_rdma_unregister_vblank_cb(struct device *dev);
 void mtk_rdma_enable_vblank(struct device *dev);
 void mtk_rdma_disable_vblank(struct device *dev);
+const u32 *mtk_rdma_get_formats(struct device *dev);
+size_t mtk_rdma_get_num_formats(struct device *dev);
 
 int mtk_mdp_rdma_clk_enable(struct device *dev);
 void mtk_mdp_rdma_clk_disable(struct device *dev);
diff --git a/drivers/gpu/drm/mediatek/mtk_disp_ovl.c 
b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
index 84daeaffab6a..8743c8047dc9 100644
--- a/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
+++ b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
@@ -66,6 +66,20 @@
 #defineOVL_CON_VIRT_FLIP   BIT(9)
 #defineOVL_CON_HORZ_FLIP   BIT(10)
 
+static const u32 mt8173_formats[] = {
+   DRM_FORMAT_XRGB8888,
+   DRM_FORMAT_ARGB8888,
+   DRM_FORMAT_BGRX8888,
+   DRM_FORMAT_BGRA8888,
+   DRM_FORMAT_ABGR8888,
+   DRM_FORMAT_XBGR8888,
+   DRM_FORMAT_RGB888,
+   DRM_FORMAT_BGR888,
+   DRM_FORMAT_RGB565,
+   DRM_FORMAT_UYVY,
+   DRM_FORMAT_YUYV,
+};
+
 struct mtk_disp_ovl_data {
unsigned int addr;
unsigned int gmc_bits;
@@ -73,6 +87,8 @@ struct mtk_disp_ovl_data {
bool fmt_rgb565_is_0;
bool smi_id_en;
bool supports_afbc;
+   const u32 *formats;
+   size_t num_formats;
 };
 
 /*
@@ -138,6 +154,20 @@ void mtk_ovl_disable_vblank(struct device *dev)
writel_relaxed(0x0, ovl->regs + DISP_REG_OVL_INTEN);
 }
 
+const u32 *mtk_ovl_get_formats(struct device *dev)
+{
+   struct mtk_disp_ovl *ovl = dev_get_drvdata(dev);
+
+   return ovl->data->formats;
+}
+
+size_t mtk_ovl_get_num_formats(struct device *dev)
+{
+   struct mtk_disp_ovl *ovl = dev_get_drvdata(dev);
+
+   return ovl->data->num_formats;
+}
+
 int mtk_ovl_clk_enable(struct device *dev)
 {
struct mtk_disp_ovl *ovl = dev_get_drvdata(dev);
@@ -495,6 +525,8 @@ static const struct mtk_disp_ovl_data 
mt2701_ovl_driver_data = {
.gmc_bits = 8,
.layer_nr = 4,
.fmt_rgb565_is_0 = false,
+   .formats = mt8173_formats,
+   .num_formats = ARRAY_SIZE(mt8173_formats),
 };
 
 static const struct mtk_disp_ovl_data mt8173_ovl_driver_data = {
@@ -502,6 +534,8 @@ static const struct mtk_disp_ovl_data 
mt8173_ovl_driver_data = {
.gmc_bits = 8,
.layer_nr = 4,
.fmt_rgb565_is_0 = true,
+   .formats = mt8173_formats,
+   .num_formats = ARRAY_SIZE(mt8173_formats),
 };
 
 static const struct mtk_disp_ovl_data mt8183_ovl_driver_data = {
@@ -509,6 +543,8 @@ static const struct mtk_disp_ovl_data 
mt8183_ovl_driver_data = {
.gmc_bits = 10,
.layer_nr = 4,
.fmt_rgb565_is_0 = true,
+   .formats = mt8173_formats,
+   .num_formats = ARRAY_SIZE(mt8173_formats),
 };
 
 static const struct mtk_disp_ovl_data mt8183_ovl_2l_driver_data = {
@@ -516,6 +552,8 @@ static const struct mtk_disp_ovl_data 
mt8183_ovl_2l_driver_data = {
.gmc_bits = 10,
.layer_nr = 2,
.fmt_rgb565_is_0 = true,
+   .formats = mt8173_formats,
+   .num_formats = ARRAY_SIZE(mt8173_formats),
 };
 
 static const struct mtk_disp_ovl_data mt8192_ovl_driver_data = {
@@ -524,6 +562,8 @@ static const struct mtk_disp_ovl_data 
mt8192_ovl_driver_data = {
.layer_nr = 4,
.fmt_rgb565_is_0 = true,
.smi_id_en = true,
+   .formats = mt8173_formats,
+   .num_formats = ARRAY_SIZE(mt8173_formats),
 };
 
 static const struct mtk_disp_ovl_data

[PATCH 0/3 v7] drm/mediatek: Add support for 10-bit overlays

2023-02-01 Thread Justin Green

This patch series adds support for 10-bit overlays to the Mediatek DRM driver.
Specifically, we add support for AR30 and BA30 overlays on MT8195 devices and
lay the groundwork for supporting more 10-bit formats on more devices.

1. Refactor plane initialization logic to allow individual DDP components to
provide their supported pixel formats.

2. Add AR30 and BA30 support to overlay driver.

3. Enable AR30 and BA30 overlays on MT8195.
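
As a rough illustration of step 1, the per-component format query could be modelled as below. This is a standalone sketch, not the driver code: `struct comp`, `struct comp_funcs`, the `FMT_*` values and the helper names are hypothetical stand-ins for the DDP component structures and DRM_FORMAT_* codes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for DRM fourcc codes. */
enum { FMT_A = 1, FMT_B = 2, FMT_C = 3 };

struct comp;

/* Optional hooks a component may implement (assumed shape). */
struct comp_funcs {
	const uint32_t *(*get_formats)(const struct comp *c);
	size_t (*get_num_formats)(const struct comp *c);
};

struct comp {
	const struct comp_funcs *funcs;
	const uint32_t *formats;
	size_t num_formats;
};

static const uint32_t *generic_get_formats(const struct comp *c)
{
	return c->formats;
}

static size_t generic_get_num_formats(const struct comp *c)
{
	return c->num_formats;
}

static const struct comp_funcs generic_funcs = {
	.get_formats = generic_get_formats,
	.get_num_formats = generic_get_num_formats,
};

/* Wrappers tolerate components that do not implement the hooks. */
static const uint32_t *comp_get_formats(const struct comp *c)
{
	if (c->funcs && c->funcs->get_formats)
		return c->funcs->get_formats(c);
	return NULL;
}

static size_t comp_get_num_formats(const struct comp *c)
{
	if (c->funcs && c->funcs->get_num_formats)
		return c->funcs->get_num_formats(c);
	return 0;
}

/* Plane setup can then ask the component instead of a global list. */
static int comp_supports_format(const struct comp *c, uint32_t fmt)
{
	const uint32_t *formats = comp_get_formats(c);
	size_t i, n = comp_get_num_formats(c);

	for (i = 0; formats && i < n; i++)
		if (formats[i] == fmt)
			return 1;
	return 0;
}
```

A component that leaves the hooks unset reports zero formats here, which is the kind of case a caller would need to handle explicitly.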


Version history:
v2:
 * Rebase and resolve merge conflicts with the AFBC patch.
v3:
 * Moved 10-bit support detection to mtk_disp_ovl.c
v4:
 * Moved formats to mtk_disp_ovl.c and mtk_disp_rdma.c
v5:
 * Minor style adjustments per checkpatch.pl
v6:
 * Refactor patch into patch series.
 * Add formats directly to private data.
v7:
 * Gate setting OVL_CLRFMT_EXT register on compatibility.
 * Split patches for adding 10-bit support and enabling 10-bit support on
   MT8195.


Justin Green (3):
  drm/mediatek: Refactor pixel format logic
  drm/mediatek: Add support for AR30 and BA30 overlays
  drm/mediatek: Enable AR30 and BA30 overlays on MT8195

 drivers/gpu/drm/mediatek/mtk_disp_drv.h |  4 +
 drivers/gpu/drm/mediatek/mtk_disp_ovl.c | 94 +
 drivers/gpu/drm/mediatek/mtk_disp_rdma.c| 38 +
 drivers/gpu/drm/mediatek/mtk_drm_crtc.c |  4 +-
 drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c |  4 +
 drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h | 20 +
 drivers/gpu/drm/mediatek/mtk_drm_plane.c| 24 ++
 drivers/gpu/drm/mediatek/mtk_drm_plane.h|  3 +-
 8 files changed, 173 insertions(+), 18 deletions(-)

-- 
2.39.1.456.gfc5497dd1b-goog

Re: [PATCH] drm/amdgpu/fence: Fix oops due to non-matching drm_sched init/fini

2023-02-01 Thread Guilherme G. Piccoli

On 01/02/2023 13:21, Luben Tuikov wrote:
> Hi Guilherme,
> 
> Since setting sched->ready to false, seems to be taking place in, directly 
> amdgpu_ring_fini()
> and in amdgpu_fence_driver_sw_fini() indirectly as that function calls 
> drm_sched_fini()
> which sets it to false, we seem to have two competing policies of,
> "set ready to false to show that _fini() was called, and set to false to 
> disable IB submissions".
> 
> To that effect, your patch is generally correct, as it would be the case of 
> an early failure
> and unroll from (indirectly) amdgpu_device_init_schedulers().
> 
> Please resubmit your patch but using .ops as Christian suggested, as .name is 
> sufficient,
> but .ops is necessary.
> 
> On a side-note: in the future we should probably discern between
> "this ring has an initialized and working scheduler" (looking up at DRM), from
> "this ring can take on IBs to send them down to the hardware" (looking down 
> at hardware).
> Sched->ready seems to be overloaded with these disparate states, and this is 
> why you need
> to use .ops to guard calling drm_sched_fini().
> 
> Regards,
> Luben

Thanks a lot Luben, makes perfect sense!

Also, thanks to everyone who provided feedback here, very interesting
discussion.

Submitted V2:
https://lore.kernel.org/dri-devel/20230201164814.1353383-1-gpicc...@igalia.com/
Cheers,


Guilherme

[PATCH v3] drm/i915: Consolidate TLB invalidation flow

2023-02-01 Thread Tvrtko Ursulin

From: Tvrtko Ursulin 

As the logic for selecting the register and corresponding values grew, the
code became a bit unsightly. Consolidate by storing the required values at
engine init time in the engine itself, and by doing so minimise the amount
of invariant platform and engine checks during each and every TLB
invalidation.
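
A minimal standalone sketch of that idea follows: resolve the register once at init, cache it in the engine, and keep the per-invalidation path free of platform checks. All names, the version checks and the register offsets are made up for illustration and are not the i915 definitions.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

enum engine_class { CLASS_RENDER, CLASS_VIDEO, CLASS_COPY };

struct engine {
	unsigned int version;
	enum engine_class cls;
	unsigned int instance;
	struct {
		uint32_t reg;	/* cached register offset, 0 if unused */
		uint32_t bit;	/* cached per-instance bit */
	} tlb_inv;
};

/* Done once at engine init: all platform checks live here. */
static int engine_init_tlb_inv(struct engine *e)
{
	/* Hypothetical offsets, for illustration only. */
	static const uint32_t v8_regs[] = {
		[CLASS_RENDER] = 0x2000,
		[CLASS_VIDEO]  = 0x2004,
		[CLASS_COPY]   = 0x2008,
	};
	static const uint32_t v12_regs[] = {
		[CLASS_RENDER] = 0x1000,
		[CLASS_VIDEO]  = 0x1004,
		[CLASS_COPY]   = 0x1008,
	};
	const uint32_t *regs = NULL;
	size_t num = 0;

	/* Exact matches only, so an unknown newer version fails loudly. */
	if (e->version == 12) {
		regs = v12_regs;
		num = sizeof(v12_regs) / sizeof(v12_regs[0]);
	} else if (e->version >= 8 && e->version <= 11) {
		regs = v8_regs;
		num = sizeof(v8_regs) / sizeof(v8_regs[0]);
	} else if (e->version < 8) {
		return 0;	/* nothing to invalidate */
	}

	if (!num)
		return -ENODEV;
	if ((size_t)e->cls >= num || !regs[e->cls])
		return -ERANGE;

	e->tlb_inv.reg = regs[e->cls];
	e->tlb_inv.bit = UINT32_C(1) << e->instance;
	return 0;
}

/* Fake MMIO sink so the sketch runs without hardware. */
static uint32_t last_reg, last_val;

static void fake_mmio_write(uint32_t reg, uint32_t val)
{
	last_reg = reg;
	last_val = val;
}

/* Hot path: only uses the values cached at init. */
static void engine_tlb_invalidate(const struct engine *e)
{
	if (!e->tlb_inv.reg)
		return;
	fake_mmio_write(e->tlb_inv.reg, e->tlb_inv.bit);
}
```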

v2:
 * Fail engine probe if TLB invalidation registers are unknown.

v3:
 * Rebase.

Signed-off-by: Tvrtko Ursulin 
Cc: Andrzej Hajda 
Cc: Matt Roper 
Reviewed-by: Andrzej Hajda  # v1
---
 drivers/gpu/drm/i915/gt/intel_engine_cs.c|  96 +
 drivers/gpu/drm/i915/gt/intel_engine_types.h |  15 ++
 drivers/gpu/drm/i915/gt/intel_gt.c   | 138 +++
 3 files changed, 133 insertions(+), 116 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c 
b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
index d4e29da74612..e430945743ec 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c
@@ -9,6 +9,7 @@
 
 #include "gem/i915_gem_context.h"
 #include "gem/i915_gem_internal.h"
+#include "gt/intel_gt_print.h"
 #include "gt/intel_gt_regs.h"
 
 #include "i915_cmd_parser.h"
@@ -1143,12 +1144,107 @@ static int init_status_page(struct intel_engine_cs 
*engine)
return ret;
 }
 
+static int intel_engine_init_tlb_invalidation(struct intel_engine_cs *engine)
+{
+   static const union intel_engine_tlb_inv_reg gen8_regs[] = {
+   [RENDER_CLASS].reg  = GEN8_RTCR,
+   [VIDEO_DECODE_CLASS].reg= GEN8_M1TCR, /* , GEN8_M2TCR */
+   [VIDEO_ENHANCEMENT_CLASS].reg   = GEN8_VTCR,
+   [COPY_ENGINE_CLASS].reg = GEN8_BTCR,
+   };
+   static const union intel_engine_tlb_inv_reg gen12_regs[] = {
+   [RENDER_CLASS].reg  = GEN12_GFX_TLB_INV_CR,
+   [VIDEO_DECODE_CLASS].reg= GEN12_VD_TLB_INV_CR,
+   [VIDEO_ENHANCEMENT_CLASS].reg   = GEN12_VE_TLB_INV_CR,
+   [COPY_ENGINE_CLASS].reg = GEN12_BLT_TLB_INV_CR,
+   [COMPUTE_CLASS].reg = GEN12_COMPCTX_TLB_INV_CR,
+   };
+   static const union intel_engine_tlb_inv_reg xehp_regs[] = {
+   [RENDER_CLASS].mcr_reg= XEHP_GFX_TLB_INV_CR,
+   [VIDEO_DECODE_CLASS].mcr_reg  = XEHP_VD_TLB_INV_CR,
+   [VIDEO_ENHANCEMENT_CLASS].mcr_reg = XEHP_VE_TLB_INV_CR,
+   [COPY_ENGINE_CLASS].mcr_reg   = XEHP_BLT_TLB_INV_CR,
+   [COMPUTE_CLASS].mcr_reg   = XEHP_COMPCTX_TLB_INV_CR,
+   };
+   struct drm_i915_private *i915 = engine->i915;
+   const union intel_engine_tlb_inv_reg *regs;
+   union intel_engine_tlb_inv_reg reg;
+   unsigned int class = engine->class;
+   unsigned int num = 0;
+   u32 val;
+
+   /*
+* New platforms should not be added with catch-all-newer (>=)
+* condition so that any later platform added triggers the below warning
+* and in turn mandates a human cross-check of whether the invalidation
+* flows have compatible semantics.
+*
+* For instance with the 11.00 -> 12.00 transition three out of five
+* respective engine registers were moved to masked type. Then after the
+* 12.00 -> 12.50 transition multi cast handling is required too.
+*/
+
+   if (GRAPHICS_VER_FULL(i915) == IP_VER(12, 50) ||
+   GRAPHICS_VER_FULL(i915) == IP_VER(12, 55)) {
+   regs = xehp_regs;
+   num = ARRAY_SIZE(xehp_regs);
+   } else if (GRAPHICS_VER_FULL(i915) == IP_VER(12, 0) ||
+  GRAPHICS_VER_FULL(i915) == IP_VER(12, 10)) {
+   regs = gen12_regs;
+   num = ARRAY_SIZE(gen12_regs);
+   } else if (GRAPHICS_VER(i915) >= 8 && GRAPHICS_VER(i915) <= 11) {
+   regs = gen8_regs;
+   num = ARRAY_SIZE(gen8_regs);
+   } else if (GRAPHICS_VER(i915) < 8) {
+   return 0;
+   }
+
+   if (gt_WARN_ONCE(engine->gt, !num,
+"Platform does not implement TLB invalidation!"))
+   return -ENODEV;
+
+   if (gt_WARN_ON_ONCE(engine->gt,
+class >= num ||
+(!regs[class].reg.reg &&
+ !regs[class].mcr_reg.reg)))
+   return -ERANGE;
+
+   reg = regs[class];
+
+   if (GRAPHICS_VER(i915) == 8 && class == VIDEO_DECODE_CLASS) {
+   reg.reg.reg += 4 * engine->instance; /* GEN8_M2TCR */
+   val = 0;
+   } else {
+   val = engine->instance;
+   }
+
+   val = BIT(val);
+
+   engine->tlb_inv.mcr = regs == xehp_regs;
+   engine->tlb_inv.reg = reg;
+   engine->tlb_inv.done = val;
+
+   if (GRAPHICS_VER(i915) >= 12 &&
+   (engine->class == VIDEO_DECODE_CLASS ||
+engine->class == VIDEO_ENHANCEMENT_CLASS ||
+

Re: [PATCH v3 3/3] drm/etnaviv: export client GPU usage statistics via fdinfo

2023-02-01 Thread Philipp Zabel

On Wed, Feb 01, 2023 at 04:26:09PM +0100, Lucas Stach wrote:
> This exposes an accumulated GPU active time per client via the
> fdinfo infrastructure.
> 
> Signed-off-by: Lucas Stach 
> ---
> v3: handle NPU cores

Reviewed-by: Philipp Zabel 

regards
Philipp

Re: [PATCH v3 2/3] drm/etnaviv: allocate unique ID per drm_file

2023-02-01 Thread Philipp Zabel

On Wed, Feb 01, 2023 at 04:26:08PM +0100, Lucas Stach wrote:
> Allows to easily track if several fd are pointing to the same
> execution context due to being dup'ed.
> 
> Signed-off-by: Lucas Stach 
> ---
> v3: use xarray to track the active contexts to avoid issues
> on rollover

Reviewed-by: Philipp Zabel 

regards
Philipp

[PATCH v2] drm/amdgpu/fence: Fix oops due to non-matching drm_sched init/fini

2023-02-01 Thread Guilherme G. Piccoli

Currently amdgpu calls drm_sched_fini() from the fence driver sw fini
routine - such function is expected to be called only after the
respective init function - drm_sched_init() - was executed successfully.

It happens that we recently hit a driver probe failure on the Steam Deck,
and drm_sched_fini() was called even though its counterpart had not been
called beforehand, causing the following oops:

amdgpu: probe of :04:00.0 failed with error -110
BUG: kernel NULL pointer dereference, address: 0090
PGD 0 P4D 0
Oops: 0002 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 609 Comm: systemd-udevd Not tainted 6.2.0-rc3-gpiccoli #338
Hardware name: Valve Jupiter/Jupiter, BIOS F7A0113 11/04/2022
RIP: 0010:drm_sched_fini+0x84/0xa0 [gpu_sched]
[...]
Call Trace:
 
 amdgpu_fence_driver_sw_fini+0xc8/0xd0 [amdgpu]
 amdgpu_device_fini_sw+0x2b/0x3b0 [amdgpu]
 amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
 devm_drm_dev_init_release+0x49/0x70
 [...]

To prevent that, check if the drm_sched was properly initialized for a
given ring before calling its fini counter-part.

Ideally we'd use sched.ready for that; that field is set as the last step of
drm_sched_init(). But amdgpu seems to "override" the meaning of that field -
in the above oops, for example, it was a GFX ring causing the crash, and the
sched.ready field was set to true in the ring init routine, regardless of the
state of the DRM scheduler. Hence, we ended up using sched.ops as per
Christian's suggestion [0].

[0] 
https://lore.kernel.org/amd-gfx/984ee981-2906-0eaf-ccec-9f80975cb...@amd.com/
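
To illustrate why the guard keys on sched.ops rather than sched.ready, here is a small standalone model. The structs and functions are simplified, hypothetical stand-ins for the DRM scheduler and the amdgpu ring, not the real definitions.

```c
#include <stdbool.h>
#include <stddef.h>

struct sched_ops {
	int unused;
};

struct sched {
	const struct sched_ops *ops;	/* set only by a successful init */
	bool ready;			/* may also be set by the driver */
	int fini_calls;			/* instrumentation for the sketch */
};

struct ring {
	bool no_scheduler;
	struct sched sched;
};

static const struct sched_ops default_ops = { 0 };

static int sched_init(struct sched *s)
{
	s->ops = &default_ops;
	s->ready = true;
	return 0;
}

static void sched_fini(struct sched *s)
{
	s->fini_calls++;
	s->ready = false;
}

/*
 * Teardown keys on ops: ready can be true even when sched_init()
 * never ran, ops cannot.
 */
static void ring_teardown(struct ring *r)
{
	if (!r->no_scheduler && r->sched.ops)
		sched_fini(&r->sched);
}
```

In this model, a ring whose ready flag was set by ring init but whose scheduler init never ran is skipped at teardown, which corresponds to the early probe failure described above.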

Fixes: 067f44c8b459 ("drm/amdgpu: avoid over-handle of fence driver fini in s3 
test (v2)")
Suggested-by: Christian König 
Cc: Guchun Chen 
Cc: Luben Tuikov 
Cc: Mario Limonciello 
Signed-off-by: Guilherme G. Piccoli 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
index 00444203220d..3b962cb680a6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
@@ -618,7 +618,13 @@ void amdgpu_fence_driver_sw_fini(struct amdgpu_device 
*adev)
if (!ring || !ring->fence_drv.initialized)
continue;
 
-   if (!ring->no_scheduler)
+   /*
+* Notice we check for sched.ops since there's some
+* override on the meaning of sched.ready by amdgpu.
+* The natural check would be sched.ready, which is
+* set as drm_sched_init() finishes...
+*/
+   if (!ring->no_scheduler && ring->sched.ops)
drm_sched_fini(&ring->sched);
 
for (j = 0; j <= ring->fence_drv.num_fences_mask; ++j)
-- 
2.39.0

Re: [PATCH 2/2] drm/mediatek: Add support for AR30 and BA30 overlays

2023-02-01 Thread Justin Green

Hi Chun-Kuang,

> Do all SoCs have this register? If not, you should write this register
> only on the SoCs that have it.

I can't confirm this from the documentation that I have; I can only
confirm that this register exists on MT8195 and MT8186. I will send out
another patch.

Thanks,
Justin
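
One possible shape for that follow-up is sketched below: gate the register update on a per-SoC capability flag. This is only a sketch under assumptions. The flag name `supports_clrfmt_ext` and the helper are hypothetical, and the 4-bit-per-layer field width (mask 0xF) is an assumption that differs from the 0xFF mask in the posted macro; the hardware documentation is the authority here.

```c
#include <stdbool.h>
#include <stdint.h>

#define CLRFMT_8_BIT		0x0u
#define CLRFMT_10_BIT		0x1u
/* Assumed layout: one 4-bit field per layer. */
#define BIT_DEPTH_SHIFT(layer)	(4u * (layer))
#define BIT_DEPTH_MASK(layer)	(0xFu << BIT_DEPTH_SHIFT(layer))

struct ovl_data {
	bool supports_clrfmt_ext;	/* hypothetical per-SoC flag */
};

/*
 * Update the bit-depth field for one layer in *reg.
 * Returns false and leaves *reg untouched when the SoC is not known
 * to have the register.
 */
static bool ovl_update_bit_depth(const struct ovl_data *data, uint32_t *reg,
				 unsigned int layer, bool is_10bit)
{
	uint32_t depth = is_10bit ? CLRFMT_10_BIT : CLRFMT_8_BIT;

	if (!data->supports_clrfmt_ext)
		return false;

	*reg &= ~BIT_DEPTH_MASK(layer);
	*reg |= depth << BIT_DEPTH_SHIFT(layer);
	return true;
}
```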

Re: [PATCH] drm/amdgpu/fence: Fix oops due to non-matching drm_sched init/fini

2023-02-01 Thread Luben Tuikov

Hi Guilherme,

Since sched->ready is set to false both directly in amdgpu_ring_fini() and
indirectly in amdgpu_fence_driver_sw_fini(), which calls drm_sched_fini(), we
seem to have two competing policies: "set ready to false to show that _fini()
was called" and "set ready to false to disable IB submissions".

To that effect, your patch is generally correct, as it covers the case of an
early failure and unroll from (indirectly) amdgpu_device_init_schedulers().

Please resubmit your patch, but using .ops as Christian suggested: .name is
sufficient, but .ops is necessary.

On a side note: in the future we should probably distinguish between
"this ring has an initialized and working scheduler" (looking up at DRM) and
"this ring can take on IBs to send them down to the hardware" (looking down at
hardware). sched->ready seems to be overloaded with these disparate states, and
this is why you need to use .ops to guard the call to drm_sched_fini().

Regards,
Luben

On 2023-02-01 09:35, Christian König wrote:
> Am 01.02.23 um 15:24 schrieb Alex Deucher:
>> On Wed, Feb 1, 2023 at 2:18 AM Christian König
>>  wrote:
>>> Hi Guchun,
>>>
>>> no, that doesn't make any sense at all.
>>>
>>> The ready flag indicates that the scheduler is fully prepared for hw
>>> submissions from userspace and is unrelated to the initialization
>>> status. It's set to true after IB testing was successful and set to
>>> false only when a GPU reset fails and we can't get the hardware to work
>>> any more.
>> That might have been the original intention, but right now sched.ready
>> gets set to true when we finish setting up the ring, but before we do
>> ring or IB tests.
> WHAT? Please not again.
>
> I'm really tired of fixing this over and over again. The meaning of 
> ring->sched.ready is to block submissions when a GPU reset fails. AND 
> NOTHING ELSE!
>
> The problem is that people seem to abuse it, and I have to fix it for the 
> fourth or fifth time now.
>
> I'm going to send out patches,
> Christian.
>
>> Alex
>>
>>> Please use sched.ops instead as I suggested before.
>>>
>>> Regards,
>>> Christian.
>>>
>>> Am 31.01.23 um 14:58 schrieb Chen, Guchun:
 Hi Christian,

 Do you think if it makes sense that we can set 'ring->sched.ready' to be 
 true in each ring init, even if before executing/setting up drm_sched_init 
 in amdgpu_device_init_schedulers? As 'ready' is a member of gpu scheduler 
 structure.

 Regards,
 Guchun

 -Original Message-
 From: Koenig, Christian 
 Sent: Tuesday, January 31, 2023 6:59 PM
 To: Chen, Guchun ; Alex Deucher 
 ; Guilherme G. Piccoli 
 Cc: amd-...@lists.freedesktop.org; ker...@gpiccoli.net; Pan, Xinhui 
 ; dri-devel@lists.freedesktop.org; Tuikov, Luben 
 ; Limonciello, Mario ; 
 kernel-...@igalia.com; Deucher, Alexander 
 Subject: Re: [PATCH] drm/amdgpu/fence: Fix oops due to non-matching 
 drm_sched init/fini

 Am 31.01.23 um 10:17 schrieb Chen, Guchun:
> Hi Piccoli,
>
> Please ignore my request of full dmesg log. I can reproduce the issue and 
> get the same failure callstack by returning early with an error code 
> prior to amdgpu_device_init_schedulers.
>
> Regards,
> Guchun
>
> -Original Message-
> From: Chen, Guchun
> Sent: Tuesday, January 31, 2023 2:37 PM
> To: Alex Deucher ; Guilherme G. Piccoli
> 
> Cc: amd-...@lists.freedesktop.org; ker...@gpiccoli.net; Pan, Xinhui
> ; dri-devel@lists.freedesktop.org; Tuikov, Luben
> ; Limonciello, Mario
> ; kernel-...@igalia.com; Deucher, Alexander
> ; Koenig, Christian
> 
> Subject: RE: [PATCH] drm/amdgpu/fence: Fix oops due to non-matching
> drm_sched init/fini
>
> Hi Piccoli,
>
> I agree with Alex's point: using ring->sched.name for such a check is not a 
> good way. BTW, can you please attach a full dmesg log from the bad case to 
> help me understand more?
>
> Regards,
> Guchun
>
> -Original Message-
> From: Alex Deucher 
> Sent: Tuesday, January 31, 2023 6:30 AM
> To: Guilherme G. Piccoli 
> Cc: amd-...@lists.freedesktop.org; ker...@gpiccoli.net; Chen, Guchun
> ; Pan, Xinhui ;
> dri-devel@lists.freedesktop.org; Tuikov, Luben ;
> Limonciello, Mario ; kernel-...@igalia.com;
> Deucher, Alexander ; Koenig, Christian
> 
> Subject: Re: [PATCH] drm/amdgpu/fence: Fix oops due to non-matching
> drm_sched init/fini
>
> On Mon, Jan 30, 2023 at 4:51 PM Guilherme G. Piccoli 
>  wrote:
>> + Luben
>>
>> (sorry, missed that in the first submission).
>>
>> On 30/01/2023 18:45, Guilherme G. Piccoli wrote:
>>> Currently amdgpu calls drm_sched_fini() from the fence driver sw
>>> fini routine - such function is expected to be called only after the
>>> respective init function - drm_sched_init() - was

Re: [PATCH 2/2] drm/mediatek: Add support for AR30 and BA30 overlays

2023-02-01 Thread Chun-Kuang Hu

Hi, Justin:

Justin Green  於 2023年2月1日 週三 上午4:09寫道：
>
> Add support for overlays with pixel formats AR30 and BA30 on MT8195.

I would like this patch split into two patches: one adding AR30 and BA30
support to ovl, and the other enabling AR30 and BA30 on the mt8195 ovl.

>
> Tested using "modetest -P" on an MT8195 device.
>
> Signed-off-by: Justin Green 
> ---
>  drivers/gpu/drm/mediatek/mtk_disp_ovl.c | 49 -
>  1 file changed, 47 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/mediatek/mtk_disp_ovl.c 
> b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
> index 8743c8047dc9..cd2f9a156456 100644
> --- a/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
> +++ b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
> @@ -41,6 +41,7 @@
>  #define DISP_REG_OVL_RDMA_CTRL(n)  (0x00c0 + 0x20 * (n))
>  #define DISP_REG_OVL_RDMA_GMC(n)   (0x00c8 + 0x20 * (n))
>  #define DISP_REG_OVL_ADDR_MT2701   0x0040
> +#define DISP_REG_OVL_CLRFMT_EXT0x02D0
>  #define DISP_REG_OVL_ADDR_MT8173   0x0f40
>  #define DISP_REG_OVL_ADDR(ovl, n)  ((ovl)->data->addr + 0x20 * 
> (n))
>  #define DISP_REG_OVL_HDR_ADDR(ovl, n)  ((ovl)->data->addr + 0x20 * 
> (n) + 0x04)
> @@ -61,6 +62,10 @@
> 0 : OVL_CON_CLRFMT_RGB)
>  #define OVL_CON_CLRFMT_RGB888(ovl) ((ovl)->data->fmt_rgb565_is_0 ? \
> OVL_CON_CLRFMT_RGB : 0)
> +#define OVL_CON_CLRFMT_BIT_DEPTH_MASK(ovl) (0xFF << 4 * (ovl))
> +#define OVL_CON_CLRFMT_BIT_DEPTH(depth, ovl)   (depth << 4 * (ovl))
> +#define OVL_CON_CLRFMT_8_BIT   0x00
> +#define OVL_CON_CLRFMT_10_BIT  0x01
>  #defineOVL_CON_AEN BIT(8)
>  #defineOVL_CON_ALPHA   0xff
>  #defineOVL_CON_VIRT_FLIP   BIT(9)
> @@ -80,6 +85,22 @@ static const u32 mt8173_formats[] = {
> DRM_FORMAT_YUYV,
>  };
>
> +static const u32 mt8195_formats[] = {
> +   DRM_FORMAT_XRGB8888,
> +   DRM_FORMAT_ARGB8888,
> +   DRM_FORMAT_ARGB2101010,
> +   DRM_FORMAT_BGRX8888,
> +   DRM_FORMAT_BGRA8888,
> +   DRM_FORMAT_BGRA1010102,
> +   DRM_FORMAT_ABGR8888,
> +   DRM_FORMAT_XBGR8888,
> +   DRM_FORMAT_RGB888,
> +   DRM_FORMAT_BGR888,
> +   DRM_FORMAT_RGB565,
> +   DRM_FORMAT_UYVY,
> +   DRM_FORMAT_YUYV,
> +};
> +
>  struct mtk_disp_ovl_data {
> unsigned int addr;
> unsigned int gmc_bits;
> @@ -218,6 +239,27 @@ static void mtk_ovl_set_afbc(struct mtk_disp_ovl *ovl, 
> struct cmdq_pkt *cmdq_pkt
>DISP_REG_OVL_DATAPATH_CON, OVL_LAYER_AFBC_EN(idx));
>  }
>
> +static void mtk_ovl_set_bit_depth(struct device *dev, int idx, u32 format,
> + struct cmdq_pkt *cmdq_pkt)
> +{
> +   struct mtk_disp_ovl *ovl = dev_get_drvdata(dev);
> +   unsigned int reg;
> +   unsigned int bit_depth = OVL_CON_CLRFMT_8_BIT;
> +
> +   reg = readl(ovl->regs + DISP_REG_OVL_CLRFMT_EXT);
> +   reg &= ~OVL_CON_CLRFMT_BIT_DEPTH_MASK(idx);
> +
> +   if (format == DRM_FORMAT_RGBA1010102 ||
> +   format == DRM_FORMAT_BGRA1010102 ||
> +   format == DRM_FORMAT_ARGB2101010)
> +   bit_depth = OVL_CON_CLRFMT_10_BIT;
> +
> +   reg |= OVL_CON_CLRFMT_BIT_DEPTH(bit_depth, idx);
> +
> +   mtk_ddp_write(cmdq_pkt, reg, &ovl->cmdq_reg,
> + ovl->regs, DISP_REG_OVL_CLRFMT_EXT);

Do all SoCs have this register? If not, you should write this register
only on the SoCs that have it.

Regards,
Chun-Kuang.

> +}
> +
>  void mtk_ovl_config(struct device *dev, unsigned int w,
> unsigned int h, unsigned int vrefresh,
> unsigned int bpc, struct cmdq_pkt *cmdq_pkt)
> @@ -332,9 +374,11 @@ static unsigned int ovl_fmt_convert(struct mtk_disp_ovl 
> *ovl, unsigned int fmt)
> return OVL_CON_CLRFMT_ARGB8888;
> case DRM_FORMAT_BGRX8888:
> case DRM_FORMAT_BGRA8888:
> +   case DRM_FORMAT_BGRA1010102:
> return OVL_CON_CLRFMT_ARGB8888 | OVL_CON_BYTE_SWAP;
> case DRM_FORMAT_XRGB8888:
> case DRM_FORMAT_ARGB8888:
> +   case DRM_FORMAT_ARGB2101010:
> return OVL_CON_CLRFMT_RGBA8888;
> case DRM_FORMAT_XBGR8888:
> case DRM_FORMAT_ABGR8888:
> @@ -418,6 +462,7 @@ void mtk_ovl_layer_config(struct device *dev, unsigned 
> int idx,
>   &ovl->cmdq_reg, ovl->regs, 
> DISP_REG_OVL_PITCH_MSB(idx));
> }
>
> +   mtk_ovl_set_bit_depth(dev, idx, fmt, cmdq_pkt);
> mtk_ovl_layer_on(dev, idx, cmdq_pkt);
>  }
>
> @@ -583,8 +628,8 @@ static const struct mtk_disp_ovl_data 
> mt8195_ovl_driver_data = {
> .fmt_rgb565_is_0 = true,
> .smi_id_en = true,
> .supports_afbc = true,
> -   .formats = mt8173_formats,
> -   .num_formats = ARRAY_SIZE(mt8173_formats),
> +   .formats =

Re: [PATCH] drm/virtio: exbuf->fence_fd unmodified on interrupted wait

2023-02-01 Thread Rob Clark

On Wed, Feb 1, 2023 at 5:28 AM Dmitry Osipenko
 wrote:
>
> On 1/27/23 01:58, Ryan Neph wrote:
> > An interrupted dma_fence_wait() becomes an -ERESTARTSYS returned
> > to userspace ioctl(DRM_IOCTL_VIRTGPU_EXECBUFFER) calls, prompting to
> > retry the ioctl(), but the passed exbuf->fence_fd has been reset to -1,
> > making the retry attempt fail at sync_file_get_fence().
> >
> > The uapi for DRM_IOCTL_VIRTGPU_EXECBUFFER is changed to retain the
> > passed value for exbuf->fence_fd when returning ERESTARTSYS or EINTR.
> >
> > Fixes: 2cd7b6f08bc4 ("drm/virtio: add in/out fence support for explicit 
> > synchronization")
> > Signed-off-by: Ryan Neph 
> > ---
> >
> >  drivers/gpu/drm/virtio/virtgpu_ioctl.c | 9 ++---
> >  include/uapi/drm/virtgpu_drm.h | 3 +++
> >  2 files changed, 9 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c 
> > b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
> > index 9f4a90493aea..ffce4e2a409a 100644
> > --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c
> > +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
> > @@ -132,6 +132,8 @@ static int virtio_gpu_execbuffer_ioctl(struct 
> > drm_device *dev, void *data,
> >   uint64_t fence_ctx;
> >   uint32_t ring_idx;
> >
> > + exbuf->fence_fd = -1;
> > +
> >   fence_ctx = vgdev->fence_drv.context;
> >   ring_idx = 0;
> >
> > @@ -152,8 +154,6 @@ static int virtio_gpu_execbuffer_ioctl(struct 
> > drm_device *dev, void *data,
> >   ring_idx = exbuf->ring_idx;
> >   }
> >
> > - exbuf->fence_fd = -1;
>
> Is there any userspace relying on this -1 behaviour? Wouldn't it be better
> to remove this offending assignment?

Looking at current mesa, removing the assignment should be ok (and
more consistent with other drivers).  But I can't say if this was
always true, or that there aren't other non-mesa users, so I can see
the argument for the more conservative uabi change that this patch
went with.

BR,
-R
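
For readers following along, the behaviour described in the commit message could be modelled roughly as below. This is a standalone sketch of the described uapi semantics, not the actual patch: `struct execbuf`, `execbuffer_model()` and the simulated wait result are hypothetical, and `SKETCH_ERESTARTSYS` mirrors the kernel-internal ERESTARTSYS value.

```c
#include <errno.h>

/* ERESTARTSYS is kernel-internal; mirrored here for the sketch. */
#define SKETCH_ERESTARTSYS 512

struct execbuf {
	int fence_fd;	/* in: fence to wait on, out: new fence or -1 */
};

/*
 * wait_result simulates the outcome of waiting on the in-fence,
 * new_out_fd simulates the fd that a successful submit would return.
 */
static int execbuffer_model(struct execbuf *exbuf, int wait_result,
			    int new_out_fd)
{
	int in_fence_fd = exbuf->fence_fd;

	exbuf->fence_fd = -1;

	if (wait_result == -SKETCH_ERESTARTSYS || wait_result == -EINTR) {
		/* Keep the caller's value so a retried ioctl still works. */
		exbuf->fence_fd = in_fence_fd;
		return wait_result;
	}
	if (wait_result)
		return wait_result;

	exbuf->fence_fd = new_out_fd;
	return 0;
}
```

The conservative variant keeps the -1 reset for every other path, so userspace that relies on it would see no change outside the interrupted case.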

Re: [PATCH v5] drm/mediatek: Add support for AR30 and BA30

2023-02-01 Thread Justin Green

Hi Chun-Kuang,
These two series are independent, so my preference would be to apply
whichever series becomes ready first. There won't technically be a
merge conflict, but I think the ovl-adaptor component will fail to
initialize correctly once the first patch of this series lands,
because the ovl-adaptor patch series does not yet implement the
get_formats() interface. I can send a follow-up patch after the
ovl-adaptor series lands, or let Nancy know about the potential
runtime issue.

Regards,
Justin

On Wed, Feb 1, 2023 at 10:30 AM Chun-Kuang Hu  wrote:
>
> Hi, Justin:
>
> Justin Green  於 2023年1月31日 週二 下午11:04寫道：
> >
> > Hi Chun-Kuang,
> > Thanks for the review! Will try to have a new patch out later today.
> >
> > Re MT8195 RDMA: Yes, the MT8195 RDMA has a 10 bit mode, but I left
> > this unimplemented because I have no means of testing it
> > unfortunately; ChromeOS does not use this hardware.
> >
> > I also wanted to mention, I think this patch might conflict with the
> > in-flight ovl-adaptor patch. Is there a protocol for a situation like
> > this?
>
> If I had a preferred order for the two series, I would tell you. But I have
> no preference between them, so you may decide whether the two series
> are independent or dependent. If independent, I would apply whichever
> is ready first, and the other one should rebase. If dependent,
> you should discuss the order of the two series with Nancy; the
> second series should rebase on the first.
>
> Regards,
> Chun-Kuang.
>
> >
> > Regards,
> > Justin
> >
> > On Mon, Jan 30, 2023 at 7:28 PM Chun-Kuang Hu  
> > wrote:
> > >
> > > Hi, Justin:
> > >
> > > Justin Green  於 2023年1月31日 週二 上午4:36寫道：
> > > >
> > > > Add support for AR30 and BA30 pixel formats to the Mediatek DRM driver.
> > > >
> > > > Tested using "modetest -P" on an MT8195.
> > > >
> > > > Signed-off-by: Justin Green 
> > > > ---
> > > > v2:
> > > >  * Rebase and resolve merge conflicts with the AFBC patch.
> > > > v3:
> > > >  * Moved 10-bit support detection to mtk_disp_ovl.c
> > > > v4:
> > > >  * Moved formats to mtk_disp_ovl.c and mtk_disp_rdma.c
> > > > v5:
> > > >  * Minor style adjustments per checkpatch.pl
> > > >
> > > >  drivers/gpu/drm/mediatek/mtk_disp_drv.h |  4 ++
> > > >  drivers/gpu/drm/mediatek/mtk_disp_ovl.c | 75 +
> > > >  drivers/gpu/drm/mediatek/mtk_disp_rdma.c| 27 
> > > >  drivers/gpu/drm/mediatek/mtk_drm_crtc.c |  4 +-
> > > >  drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c |  4 ++
> > > >  drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h | 20 ++
> > > >  drivers/gpu/drm/mediatek/mtk_drm_plane.c| 28 +++-
> > > >  drivers/gpu/drm/mediatek/mtk_drm_plane.h|  3 +-
> > > >  8 files changed, 146 insertions(+), 19 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/mediatek/mtk_disp_drv.h 
> > > > b/drivers/gpu/drm/mediatek/mtk_disp_drv.h
> > > > index 33e61a136bbc..6ad22ce75b81 100644
> > > > --- a/drivers/gpu/drm/mediatek/mtk_disp_drv.h
> > > > +++ b/drivers/gpu/drm/mediatek/mtk_disp_drv.h
> > > > @@ -96,6 +96,8 @@ void mtk_ovl_register_vblank_cb(struct device *dev,
> > > >  void mtk_ovl_unregister_vblank_cb(struct device *dev);
> > > >  void mtk_ovl_enable_vblank(struct device *dev);
> > > >  void mtk_ovl_disable_vblank(struct device *dev);
> > > > +const u32 *mtk_ovl_get_formats(struct device *dev);
> > > > +size_t mtk_ovl_get_num_formats(struct device *dev);
> > > >
> > > >  void mtk_rdma_bypass_shadow(struct device *dev);
> > > >  int mtk_rdma_clk_enable(struct device *dev);
> > > > @@ -122,4 +124,6 @@ void mtk_mdp_rdma_start(struct device *dev, struct 
> > > > cmdq_pkt *cmdq_pkt);
> > > >  void mtk_mdp_rdma_stop(struct device *dev, struct cmdq_pkt *cmdq_pkt);
> > > >  void mtk_mdp_rdma_config(struct device *dev, struct mtk_mdp_rdma_cfg 
> > > > *cfg,
> > > >  struct cmdq_pkt *cmdq_pkt);
> > > > +const u32 *mtk_rdma_get_formats(struct device *dev);
> > > > +size_t mtk_rdma_get_num_formats(struct device *dev);
> > > >  #endif
> > > > diff --git a/drivers/gpu/drm/mediatek/mtk_disp_ovl.c 
> > > > b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
> > > > index 84daeaffab6a..1db70a77560f 100644
> > > > --- a/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
> > > > +++ b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
> > > > @@ -41,6 +41,7 @@
> > > >  #define DISP_REG_OVL_RDMA_CTRL(n)  (0x00c0 + 0x20 * (n))
> > > >  #define DISP_REG_OVL_RDMA_GMC(n)   (0x00c8 + 0x20 * (n))
> > > >  #define DISP_REG_OVL_ADDR_MT2701   0x0040
> > > > +#define DISP_REG_OVL_CLRFMT_EXT0x02D0
> > > >  #define DISP_REG_OVL_ADDR_MT8173   0x0f40
> > > >  #define DISP_REG_OVL_ADDR(ovl, n)  ((ovl)->data->addr + 
> > > > 0x20 * (n))
> > > >  #define DISP_REG_OVL_HDR_ADDR(ovl, n)  ((ovl)->data->addr + 
> > > > 0x20 * (n) + 0x04)
> > > > @@ -61,11 +62,45 @@
> > > > 0 : OVL_CON_CLRFMT_RGB)
> > > >  #define

Re: [PATCH 1/2] drm/mediatek: Refactor pixel format logic

2023-02-01 Thread Chun-Kuang Hu

Justin Green  於 2023年2月1日 週三 上午4:09寫道：
>
> Add a DDP component interface for querying pixel format support and move the
> list of supported pixel formats into the DDP components instead of
> mtk_drm_plane.c.
>
> Tested by running Chrome on an MT8195.
>
> Signed-off-by: Justin Green 
>
> edit1

After removing this word,

Reviewed-by: Chun-Kuang Hu 

> ---
>  drivers/gpu/drm/mediatek/mtk_disp_drv.h |  4 ++
>  drivers/gpu/drm/mediatek/mtk_disp_ovl.c | 44 +
>  drivers/gpu/drm/mediatek/mtk_disp_rdma.c| 38 ++
>  drivers/gpu/drm/mediatek/mtk_drm_crtc.c |  4 +-
>  drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c |  4 ++
>  drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h | 20 ++
>  drivers/gpu/drm/mediatek/mtk_drm_plane.c| 24 ---
>  drivers/gpu/drm/mediatek/mtk_drm_plane.h|  3 +-
>  8 files changed, 123 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/gpu/drm/mediatek/mtk_disp_drv.h 
> b/drivers/gpu/drm/mediatek/mtk_disp_drv.h
> index 33e61a136bbc..0df6a06defb8 100644
> --- a/drivers/gpu/drm/mediatek/mtk_disp_drv.h
> +++ b/drivers/gpu/drm/mediatek/mtk_disp_drv.h
> @@ -96,6 +96,8 @@ void mtk_ovl_register_vblank_cb(struct device *dev,
>  void mtk_ovl_unregister_vblank_cb(struct device *dev);
>  void mtk_ovl_enable_vblank(struct device *dev);
>  void mtk_ovl_disable_vblank(struct device *dev);
> +const u32 *mtk_ovl_get_formats(struct device *dev);
> +size_t mtk_ovl_get_num_formats(struct device *dev);
>
>  void mtk_rdma_bypass_shadow(struct device *dev);
>  int mtk_rdma_clk_enable(struct device *dev);
> @@ -115,6 +117,8 @@ void mtk_rdma_register_vblank_cb(struct device *dev,
>  void mtk_rdma_unregister_vblank_cb(struct device *dev);
>  void mtk_rdma_enable_vblank(struct device *dev);
>  void mtk_rdma_disable_vblank(struct device *dev);
> +const u32 *mtk_rdma_get_formats(struct device *dev);
> +size_t mtk_rdma_get_num_formats(struct device *dev);
>
>  int mtk_mdp_rdma_clk_enable(struct device *dev);
>  void mtk_mdp_rdma_clk_disable(struct device *dev);
> diff --git a/drivers/gpu/drm/mediatek/mtk_disp_ovl.c 
> b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
> index 84daeaffab6a..8743c8047dc9 100644
> --- a/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
> +++ b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
> @@ -66,6 +66,20 @@
>  #define OVL_CON_VIRT_FLIP   BIT(9)
>  #define OVL_CON_HORZ_FLIP   BIT(10)
>
> +static const u32 mt8173_formats[] = {
> +   DRM_FORMAT_XRGB8888,
> +   DRM_FORMAT_ARGB8888,
> +   DRM_FORMAT_BGRX8888,
> +   DRM_FORMAT_BGRA8888,
> +   DRM_FORMAT_ABGR8888,
> +   DRM_FORMAT_XBGR8888,
> +   DRM_FORMAT_RGB888,
> +   DRM_FORMAT_BGR888,
> +   DRM_FORMAT_RGB565,
> +   DRM_FORMAT_UYVY,
> +   DRM_FORMAT_YUYV,
> +};
> +
>  struct mtk_disp_ovl_data {
> unsigned int addr;
> unsigned int gmc_bits;
> @@ -73,6 +87,8 @@ struct mtk_disp_ovl_data {
> bool fmt_rgb565_is_0;
> bool smi_id_en;
> bool supports_afbc;
> +   const u32 *formats;
> +   size_t num_formats;
>  };
>
>  /*
> @@ -138,6 +154,20 @@ void mtk_ovl_disable_vblank(struct device *dev)
> writel_relaxed(0x0, ovl->regs + DISP_REG_OVL_INTEN);
>  }
>
> +const u32 *mtk_ovl_get_formats(struct device *dev)
> +{
> +   struct mtk_disp_ovl *ovl = dev_get_drvdata(dev);
> +
> +   return ovl->data->formats;
> +}
> +
> +size_t mtk_ovl_get_num_formats(struct device *dev)
> +{
> +   struct mtk_disp_ovl *ovl = dev_get_drvdata(dev);
> +
> +   return ovl->data->num_formats;
> +}
> +
>  int mtk_ovl_clk_enable(struct device *dev)
>  {
> struct mtk_disp_ovl *ovl = dev_get_drvdata(dev);
> @@ -495,6 +525,8 @@ static const struct mtk_disp_ovl_data 
> mt2701_ovl_driver_data = {
> .gmc_bits = 8,
> .layer_nr = 4,
> .fmt_rgb565_is_0 = false,
> +   .formats = mt8173_formats,
> +   .num_formats = ARRAY_SIZE(mt8173_formats),
>  };
>
>  static const struct mtk_disp_ovl_data mt8173_ovl_driver_data = {
> @@ -502,6 +534,8 @@ static const struct mtk_disp_ovl_data 
> mt8173_ovl_driver_data = {
> .gmc_bits = 8,
> .layer_nr = 4,
> .fmt_rgb565_is_0 = true,
> +   .formats = mt8173_formats,
> +   .num_formats = ARRAY_SIZE(mt8173_formats),
>  };
>
>  static const struct mtk_disp_ovl_data mt8183_ovl_driver_data = {
> @@ -509,6 +543,8 @@ static const struct mtk_disp_ovl_data 
> mt8183_ovl_driver_data = {
> .gmc_bits = 10,
> .layer_nr = 4,
> .fmt_rgb565_is_0 = true,
> +   .formats = mt8173_formats,
> +   .num_formats = ARRAY_SIZE(mt8173_formats),
>  };
>
>  static const struct mtk_disp_ovl_data mt8183_ovl_2l_driver_data = {
> @@ -516,6 +552,8 @@ static const struct mtk_disp_ovl_data 
> mt8183_ovl_2l_driver_data = {
> .gmc_bits = 10,
> .layer_nr = 2,
> .fmt_rgb565_is_0 = true,
> +   .formats = mt8173_formats,
> +   .num_formats = ARRAY_SIZE(mt8173_formats),
>  };
>
>  static const struct
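The pattern in the quoted patch (each component's driver data carries its own format table, and callers query it through an accessor) can be sketched in plain userspace C. This is only an illustration under stated assumptions: `comp_data`, `comp_supports_format()` and the `FMT_*` codes are hypothetical stand-ins, not names from the Mediatek driver, and the real code uses `DRM_FORMAT_*` fourcc values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical format codes standing in for DRM_FORMAT_* values. */
enum { FMT_XRGB = 1, FMT_ARGB, FMT_RGB565, FMT_AR30 };

static const uint32_t example_formats[] = { FMT_XRGB, FMT_ARGB, FMT_RGB565 };

/* Per-component data: the table and its length travel together. */
struct comp_data {
	const uint32_t *formats;
	size_t num_formats;
};

static const struct comp_data example_data = {
	.formats = example_formats,
	.num_formats = ARRAY_SIZE(example_formats),
};

/* Linear search is adequate for tables of a dozen or so entries. */
static bool comp_supports_format(const struct comp_data *data, uint32_t fmt)
{
	for (size_t i = 0; i < data->num_formats; i++)
		if (data->formats[i] == fmt)
			return true;
	return false;
}

void example_format_query(void)
{
	assert(comp_supports_format(&example_data, FMT_RGB565));
	assert(!comp_supports_format(&example_data, FMT_AR30));
}
```

Keeping the count next to the pointer lets several SoC variants share one table, as the patch does with mt8173_formats, without hard-coding a length anywhere.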

Re: [PATCH -next] drm/client: fix kernel-doc warning in drm_client.h

2023-02-01 Thread Thomas Zimmermann




On 31.01.23 at 02:21, Randy Dunlap wrote:

scripts/kernel-doc complains about the comment for hotplug_failed,
so fix it:

include/drm/drm_client.h:111: warning: Incorrect use of kernel-doc format:  
* @hotplug failed:

Fixes: 6a9d5ad3af65 ("drm/client: Add hotplug_failed flag")
Signed-off-by: Randy Dunlap 
Cc: Thomas Zimmermann 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Javier Martinez Canillas 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: dri-devel@lists.freedesktop.org


Reviewed-by: Thomas Zimmermann 

I've added your patch to drm-misc-next-fixes.


---
  include/drm/drm_client.h |2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff -- a/include/drm/drm_client.h b/include/drm/drm_client.h
--- a/include/drm/drm_client.h
+++ b/include/drm/drm_client.h
@@ -108,7 +108,7 @@ struct drm_client_dev {
struct drm_mode_set *modesets;
  
  	/**

-* @hotplug failed:
+* @hotplug_failed:
 *
 * Set by client hotplug helpers if the hotplugging failed
 * before. It is usually not tried again.


--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Managing Director: Ivo Totev



Re: linux-next: manual merge of the usb tree with the drm-intel-fixes tree

2023-02-01 Thread Rodrigo Vivi

On Wed, Feb 01, 2023 at 03:11:31PM +1100, Stephen Rothwell wrote:
> Hi all,
> 
> On Tue, 31 Jan 2023 10:27:29 -0800 John Harrison  
> wrote:
> >
> > On 1/31/2023 04:44, Andy Shevchenko wrote:
> > > On Tue, Jan 31, 2023 at 01:03:05PM +1100, Stephen Rothwell wrote:  
> > >>
> > >> Today's linux-next merge of the usb tree got a conflict in:
> > >>
> > >>drivers/gpu/drm/i915/gt/intel_engine_cs.c
> > >>
> > >> between commit:
> > >>
> > >>5bc4b43d5c6c ("drm/i915: Fix up locking around dumping requests 
> > >> lists")
> > >>
> > >> from the drm-intel-fixes tree and commit:
> > >>
> > >>4d70c74659d9 ("i915: Move list_count() to list.h as 
> > >> list_count_nodes() for broader use")
> > >>
> > >> from the usb tree.
> > >>
> > >> I fixed it up (the former removed the code changed by the latter)  
> > > Hmm... Currently I see that 
> > > 20230127002842.3169194-4-john.c.harri...@intel.com
> > > moves the code to the 
> > > drivers/gpu/drm/i915/gt/intel_execlists_submission.c.
> > >
> > > Is there any new series beside the above mentioned that touches that file 
> > > and
> > > actually _removes_ that code?  
> > As long as the removal is limited to list_count/list_count_nodes,
> > that's fine. I only moved it from one file to another because the one
> > and only function that was using it was being moved to the other
> > file. If someone else has found a use for the same and wants to move
> > it to a more common place then great. I assume there was no conflict
> > happening in the i915 specific code.
> 
> I have added this fix up patch to linux-next today (more or less - this
> is a hand hacked version, but you get the idea):
> 
> From: Stephen Rothwell 
> Date: Wed, 1 Feb 2023 13:13:01 +1100
> Subject: [PATCH] i915: fix up for "drm/i915: Fix up locking around dumping 
> requests lists"
> 
> interacting with "i915: Move list_count() to list.h as list_count_nodes() for 
> broader use"
> 
> Signed-off-by: Stephen Rothwell 
> ---
>  .../gpu/drm/i915/gt/intel_execlists_submission.c| 15 +
>  1 file changed, 2 insertion(+), 13 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c 
> b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> index 3c573d41d404..e919d41a48d9 100644
> --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c
> @@ -4150,17 +4150,6 @@ void intel_execlists_show_requests(struct 
> intel_engine_cs *engine,
>   spin_unlock_irqrestore(&sched_engine->lock, flags);
>  }
>  
> -static unsigned long list_count(struct list_head *list)
> -{
> - struct list_head *pos;
> - unsigned long count = 0;
> -
> - list_for_each(pos, list)
> - count++;
> -
> - return count;
> -}
> -
>  void intel_execlists_dump_active_requests(struct intel_engine_cs *engine,
> struct i915_request *hung_rq,
> struct drm_printer *m)
> @@ -4172,7 +4161,7 @@ void intel_execlists_dump_active_requests(struct 
> intel_engine_cs *engine,
>   intel_engine_dump_active_requests(&engine->sched_engine->requests, 
> hung_rq, m);
>  
> - drm_printf(m, "\tOn hold?: %lu\n",
> -list_count(&engine->sched_engine->hold));
> + drm_printf(m, "\tOn hold?: %zu\n",
> +list_count_nodes(&engine->sched_engine->hold));

something awkward here.
The resolution on linux-next should align with the resolution on drm-tip
where we have the list_count still there as we preferred the version
on drm-intel-gt-next as the resolution of the conflict instead of the
fixes one.

>  
>   spin_unlock_irqrestore(&engine->sched_engine->lock, flags);
>  }
> -- 
> 2.35.1
> 
> -- 
> Cheers,
> Stephen Rothwell
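For readers who have not looked at the helper being moved, the following is a userspace sketch of what a node-counting helper over a circular doubly linked list computes. The `struct list_head` here is a minimal local re-creation and `count_nodes()` is a hypothetical name; only the `size_t` return type is taken from the `%zu` format used in the fix-up above.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal local stand-in for the kernel's circular list node. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail(struct list_head *node, struct list_head *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Walk from the head until we come back around; the head is not counted. */
static size_t count_nodes(const struct list_head *head)
{
	size_t count = 0;

	for (const struct list_head *pos = head->next; pos != head; pos = pos->next)
		count++;
	return count;
}

void example_count_nodes(void)
{
	struct list_head head, a, b, c;

	list_init(&head);
	assert(count_nodes(&head) == 0);
	list_add_tail(&a, &head);
	list_add_tail(&b, &head);
	list_add_tail(&c, &head);
	assert(count_nodes(&head) == 3);
}
```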

Re: [PATCH] drm/etnaviv: show number of NN cores in GPU debugfs info

2023-02-01 Thread Tomeu Vizoso


On 2/1/23 16:27, Lucas Stach wrote:

For NPUs the number of NN cores is an interesting property, which is useful
to show in the debugfs information.

Signed-off-by: Lucas Stach 
---
  drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c 
b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
index 27c10584773d..de8c9894967c 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
@@ -961,6 +961,8 @@ int etnaviv_gpu_debugfs(struct etnaviv_gpu *gpu, struct 
seq_file *m)
gpu->identity.vertex_cache_size);
seq_printf(m, "\t shader_core_count: %d\n",
gpu->identity.shader_core_count);
+   seq_printf(m, "\t nn_core_count: %d\n",
+   gpu->identity.nn_core_count);
seq_printf(m, "\t pixel_pipes: %d\n",
gpu->identity.pixel_pipes);
seq_printf(m, "\t vertex_output_buffer_size: %d\n",


Hi Lucas,

That looks good to me.

Reviewed-by: Tomeu Vizoso 

Cheers,

Tomeu

Re: [PATCH v5] drm/mediatek: Add support for AR30 and BA30

2023-02-01 Thread Chun-Kuang Hu

Hi, Justin:

Justin Green wrote on Tue, Jan 31, 2023 at 11:04 PM:
>
> Hi Chun-Kuang,
> Thanks for the review! Will try to have a new patch out later today.
>
> Re MT8195 RDMA: Yes, the MT8195 RDMA has a 10 bit mode, but I left
> this unimplemented because I have no means of testing it
> unfortunately; ChromeOS does not use this hardware.
>
> I also wanted to mention, I think this patch might conflict with the
> in-flight ovl-adaptor patch. Is there a protocol for a situation like
> this?

If I preferred an order for the two series, I would tell you. But I have
no preference between them, so you may decide whether these two series
are independent or dependent. If independent, I would apply whichever one
is ready first, and the other one should then be rebased. If dependent,
you should discuss the order of the two series with Nancy; the
second series should be rebased on the first.

Regards,
Chun-Kuang.

>
> Regards,
> Justin
>
> On Mon, Jan 30, 2023 at 7:28 PM Chun-Kuang Hu  wrote:
> >
> > Hi, Justin:
> >
> > Justin Green wrote on Tue, Jan 31, 2023 at 4:36 AM:
> > >
> > > Add support for AR30 and BA30 pixel formats to the Mediatek DRM driver.
> > >
> > > Tested using "modetest -P" on an MT8195.
> > >
> > > Signed-off-by: Justin Green 
> > > ---
> > > v2:
> > >  * Rebase and resolve merge conflicts with the AFBC patch.
> > > v3:
> > >  * Moved 10-bit support detection to mtk_disp_ovl.c
> > > v4:
> > >  * Moved formats to mtk_disp_ovl.c and mtk_disp_rdma.c
> > > v5:
> > >  * Minor style adjustments per checkpatch.pl
> > >
> > >  drivers/gpu/drm/mediatek/mtk_disp_drv.h |  4 ++
> > >  drivers/gpu/drm/mediatek/mtk_disp_ovl.c | 75 +
> > >  drivers/gpu/drm/mediatek/mtk_disp_rdma.c| 27 
> > >  drivers/gpu/drm/mediatek/mtk_drm_crtc.c |  4 +-
> > >  drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c |  4 ++
> > >  drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h | 20 ++
> > >  drivers/gpu/drm/mediatek/mtk_drm_plane.c| 28 +++-
> > >  drivers/gpu/drm/mediatek/mtk_drm_plane.h|  3 +-
> > >  8 files changed, 146 insertions(+), 19 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/mediatek/mtk_disp_drv.h 
> > > b/drivers/gpu/drm/mediatek/mtk_disp_drv.h
> > > index 33e61a136bbc..6ad22ce75b81 100644
> > > --- a/drivers/gpu/drm/mediatek/mtk_disp_drv.h
> > > +++ b/drivers/gpu/drm/mediatek/mtk_disp_drv.h
> > > @@ -96,6 +96,8 @@ void mtk_ovl_register_vblank_cb(struct device *dev,
> > >  void mtk_ovl_unregister_vblank_cb(struct device *dev);
> > >  void mtk_ovl_enable_vblank(struct device *dev);
> > >  void mtk_ovl_disable_vblank(struct device *dev);
> > > +const u32 *mtk_ovl_get_formats(struct device *dev);
> > > +size_t mtk_ovl_get_num_formats(struct device *dev);
> > >
> > >  void mtk_rdma_bypass_shadow(struct device *dev);
> > >  int mtk_rdma_clk_enable(struct device *dev);
> > > @@ -122,4 +124,6 @@ void mtk_mdp_rdma_start(struct device *dev, struct 
> > > cmdq_pkt *cmdq_pkt);
> > >  void mtk_mdp_rdma_stop(struct device *dev, struct cmdq_pkt *cmdq_pkt);
> > >  void mtk_mdp_rdma_config(struct device *dev, struct mtk_mdp_rdma_cfg 
> > > *cfg,
> > >  struct cmdq_pkt *cmdq_pkt);
> > > +const u32 *mtk_rdma_get_formats(struct device *dev);
> > > +size_t mtk_rdma_get_num_formats(struct device *dev);
> > >  #endif
> > > diff --git a/drivers/gpu/drm/mediatek/mtk_disp_ovl.c 
> > > b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
> > > index 84daeaffab6a..1db70a77560f 100644
> > > --- a/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
> > > +++ b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
> > > @@ -41,6 +41,7 @@
> > >  #define DISP_REG_OVL_RDMA_CTRL(n)  (0x00c0 + 0x20 * (n))
> > >  #define DISP_REG_OVL_RDMA_GMC(n)   (0x00c8 + 0x20 * (n))
> > >  #define DISP_REG_OVL_ADDR_MT2701   0x0040
> > > > +#define DISP_REG_OVL_CLRFMT_EXT    0x02D0
> > >  #define DISP_REG_OVL_ADDR_MT8173   0x0f40
> > >  #define DISP_REG_OVL_ADDR(ovl, n)  ((ovl)->data->addr + 0x20 
> > > * (n))
> > >  #define DISP_REG_OVL_HDR_ADDR(ovl, n)  ((ovl)->data->addr + 0x20 
> > > * (n) + 0x04)
> > > @@ -61,11 +62,45 @@
> > > 0 : OVL_CON_CLRFMT_RGB)
> > >  #define OVL_CON_CLRFMT_RGB888(ovl) ((ovl)->data->fmt_rgb565_is_0 ? \
> > > OVL_CON_CLRFMT_RGB : 0)
> > > +#define OVL_CON_CLRFMT_BIT_DEPTH_MASK(ovl) (0xFF << 4 * (ovl))
> > > +#define OVL_CON_CLRFMT_BIT_DEPTH(depth, ovl)   (depth << 4 * (ovl))
> > > +#define OVL_CON_CLRFMT_8_BIT   0x00
> > > +#define OVL_CON_CLRFMT_10_BIT  0x01
> > > >  #define OVL_CON_AEN BIT(8)
> > > >  #define OVL_CON_ALPHA   0xff
> > > >  #define OVL_CON_VIRT_FLIP   BIT(9)
> > > >  #define OVL_CON_HORZ_FLIP   BIT(10)
> > >
> > > +static const u32 formats_mt8173[] = {
> > > > +   DRM_FORMAT_XRGB8888,
> > > > +   DRM_FORMAT_ARGB8888,
> > > > +   DRM_FORMAT_BGRX8888,
> > > +

[PATCH] drm/etnaviv: add HWDB entry for VIP8000 Nano r8002

2023-02-01 Thread Lucas Stach

This is the NPU found on the NXP i.MX8MP SoC. Feature bits taken
from the downstream kernel driver 6.4.3.p4.4.

Signed-off-by: Lucas Stach 
---
 drivers/gpu/drm/etnaviv/etnaviv_hwdb.c | 32 ++
 1 file changed, 32 insertions(+)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_hwdb.c 
b/drivers/gpu/drm/etnaviv/etnaviv_hwdb.c
index 383a1a97ea6a..7f700d79d74d 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_hwdb.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_hwdb.c
@@ -196,6 +196,38 @@ static const struct etnaviv_chip_identity 
etnaviv_chip_identities[] = {
.minor_features10 = 0x108048c0,
.minor_features11 = 0x0010,
},
+   {
+   .model = 0x8000,
+   .revision = 0x8002,
+   .product_id = 0x5080009,
+   .customer_id = 0x9f,
+   .eco_id = 0x600,
+   .stream_count = 8,
+   .register_max = 64,
+   .thread_count = 256,
+   .shader_core_count = 1,
+   .nn_core_count = 6,
+   .vertex_cache_size = 16,
+   .vertex_output_buffer_size = 1024,
+   .pixel_pipes = 1,
+   .instruction_count = 512,
+   .num_constants = 320,
+   .buffer_size = 0,
+   .varyings_count = 16,
+   .features = 0xe0287cac,
+   .minor_features0 = 0xc1799eff,
+   .minor_features1 = 0xfefbfadb,
+   .minor_features2 = 0xeb9d6fbf,
+   .minor_features3 = 0xedfffced,
+   .minor_features4 = 0xd30dafc7,
+   .minor_features5 = 0x7b5ac333,
+   .minor_features6 = 0xfc8ee200,
+   .minor_features7 = 0x03fffa6f,
+   .minor_features8 = 0x00fe0ef0,
+   .minor_features9 = 0x0088003c,
+   .minor_features10 = 0x108048c0,
+   .minor_features11 = 0x0010,
+   },
 };
 
 bool etnaviv_fill_identity_from_hwdb(struct etnaviv_gpu *gpu)
-- 
2.39.1
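As a rough illustration of how a hardware database table like the one extended above might be consulted, here is a userspace sketch that matches on model and revision only. That matching rule is an assumption made for brevity; the body of etnaviv_fill_identity_from_hwdb() is not shown in this thread and may compare further fields. `chip_identity` and `lookup_identity()` are hypothetical names; the model, revision and nn_core_count values are the ones from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Trimmed-down, hypothetical identity record; the real struct has many more fields. */
struct chip_identity {
	uint32_t model;
	uint32_t revision;
	uint32_t nn_core_count;
};

static const struct chip_identity chip_identities[] = {
	/* Values taken from the VIP8000 Nano r8002 entry in the patch. */
	{ .model = 0x8000, .revision = 0x8002, .nn_core_count = 6 },
};

/* Assumption: model + revision identify an entry; returns NULL if unknown. */
static const struct chip_identity *lookup_identity(uint32_t model, uint32_t revision)
{
	size_t n = sizeof(chip_identities) / sizeof(chip_identities[0]);

	for (size_t i = 0; i < n; i++) {
		const struct chip_identity *id = &chip_identities[i];

		if (id->model == model && id->revision == revision)
			return id;
	}
	return NULL;
}

void example_hwdb_lookup(void)
{
	const struct chip_identity *id = lookup_identity(0x8000, 0x8002);

	assert(id && id->nn_core_count == 6);
	assert(lookup_identity(0x8000, 0x1234) == NULL);
}
```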

[PATCH] drm/etnaviv: show number of NN cores in GPU debugfs info

2023-02-01 Thread Lucas Stach

For NPUs the number of NN cores is an interesting property, which is useful
to show in the debugfs information.

Signed-off-by: Lucas Stach 
---
 drivers/gpu/drm/etnaviv/etnaviv_gpu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c 
b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
index 27c10584773d..de8c9894967c 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
@@ -961,6 +961,8 @@ int etnaviv_gpu_debugfs(struct etnaviv_gpu *gpu, struct 
seq_file *m)
gpu->identity.vertex_cache_size);
seq_printf(m, "\t shader_core_count: %d\n",
gpu->identity.shader_core_count);
+   seq_printf(m, "\t nn_core_count: %d\n",
+   gpu->identity.nn_core_count);
seq_printf(m, "\t pixel_pipes: %d\n",
gpu->identity.pixel_pipes);
seq_printf(m, "\t vertex_output_buffer_size: %d\n",
-- 
2.39.1

[PATCH v3 3/3] drm/etnaviv: export client GPU usage statistics via fdinfo

2023-02-01 Thread Lucas Stach

This exposes an accumulated GPU active time per client via the
fdinfo infrastructure.

Signed-off-by: Lucas Stach 
---
v3: handle NPU cores
---
 drivers/gpu/drm/etnaviv/etnaviv_drv.c | 43 ++-
 1 file changed, 42 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.c 
b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
index 31a7f59ccb49..44ca803237a5 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_drv.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
@@ -22,6 +22,7 @@
 #include "etnaviv_gem.h"
 #include "etnaviv_mmu.h"
 #include "etnaviv_perfmon.h"
+#include "common.xml.h"
 
 /*
  * DRM operations:
@@ -475,7 +476,47 @@ static const struct drm_ioctl_desc etnaviv_ioctls[] = {
ETNA_IOCTL(PM_QUERY_SIG, pm_query_sig, DRM_RENDER_ALLOW),
 };
 
-DEFINE_DRM_GEM_FOPS(fops);
+static void etnaviv_fop_show_fdinfo(struct seq_file *m, struct file *f)
+{
+   struct drm_file *file = f->private_data;
+   struct drm_device *dev = file->minor->dev;
+   struct etnaviv_drm_private *priv = dev->dev_private;
+   struct etnaviv_file_private *ctx = file->driver_priv;
+
+   /*
+* For a description of the text output format used here, see
+* Documentation/gpu/drm-usage-stats.rst.
+*/
+   seq_printf(m, "drm-driver:\t%s\n", dev->driver->name);
+   seq_printf(m, "drm-client-id:\t%u\n", ctx->id);
+
+   for (int i = 0; i < ETNA_MAX_PIPES; i++) {
+   struct etnaviv_gpu *gpu = priv->gpu[i];
+   char engine[10] = "UNK";
+   int cur = 0;
+
+   if (!gpu)
+   continue;
+
+   if (gpu->identity.features & chipFeatures_PIPE_2D)
+   cur = snprintf(engine, sizeof(engine), "2D");
+   if (gpu->identity.features & chipFeatures_PIPE_3D)
+   cur = snprintf(engine + cur, sizeof(engine) - cur,
+  "%s3D", cur ? "/" : "");
+   if (gpu->identity.nn_core_count > 0)
+   cur = snprintf(engine + cur, sizeof(engine) - cur,
+  "%sNN", cur ? "/" : "");
+
+   seq_printf(m, "drm-engine-%s:\t%llu ns\n", engine,
+  ctx->sched_entity[i].elapsed_ns);
+   }
+}
+
+static const struct file_operations fops = {
+   .owner = THIS_MODULE,
+   DRM_GEM_FOPS,
+   .show_fdinfo = etnaviv_fop_show_fdinfo,
+};
 
 static const struct drm_driver etnaviv_drm_driver = {
.driver_features= DRIVER_GEM | DRIVER_RENDER,
-- 
2.39.1
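The engine-name construction in etnaviv_fop_show_fdinfo() can be tried out in userspace with the sketch below. `HAS_2D`, `HAS_3D`, `HAS_NN` and `build_engine_name()` are hypothetical stand-ins for the feature checks in the patch. Note one deliberate difference: the sketch accumulates the write offset with `+=`, whereas the patch as posted assigns the snprintf() return value to `cur`, which appears to put the third component at the wrong offset when 2D, 3D and NN are all present. The sketch also assumes the buffer is large enough that snprintf() never truncates.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical capability flags standing in for the driver's feature checks. */
#define HAS_2D (1u << 0)
#define HAS_3D (1u << 1)
#define HAS_NN (1u << 2)

/*
 * Build a name such as "2D/3D/NN" into buf. Assumes size is large enough
 * for the longest name (9 bytes including the terminator).
 */
static void build_engine_name(char *buf, size_t size, unsigned int caps)
{
	int cur = 0;

	snprintf(buf, size, "UNK"); /* default when no capability is set */

	if (caps & HAS_2D)
		cur += snprintf(buf + cur, size - cur, "2D");
	if (caps & HAS_3D)
		cur += snprintf(buf + cur, size - cur, "%s3D", cur ? "/" : "");
	if (caps & HAS_NN)
		cur += snprintf(buf + cur, size - cur, "%sNN", cur ? "/" : "");
}

void example_engine_name(void)
{
	char engine[10];

	build_engine_name(engine, sizeof(engine), HAS_2D | HAS_3D | HAS_NN);
	assert(strcmp(engine, "2D/3D/NN") == 0);

	build_engine_name(engine, sizeof(engine), 0);
	assert(strcmp(engine, "UNK") == 0);
}
```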

[PATCH v3 1/3] drm/scheduler: track GPU active time per entity

2023-02-01 Thread Lucas Stach

Track the accumulated time that jobs from this entity were active
on the GPU. This allows drivers using the scheduler to trivially
implement the DRM fdinfo when the hardware doesn't provide more
specific information than signalling job completion anyways.

Signed-off-by: Lucas Stach 
Reviewed-by: Andrey Grodzovsky 
---
 drivers/gpu/drm/scheduler/sched_main.c | 6 ++
 include/drm/gpu_scheduler.h| 7 +++
 2 files changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 31f3a1267be4..cc5e77a30a66 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -895,6 +895,12 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
 
    spin_unlock(&sched->job_list_lock);
 
+   if (job) {
+   job->entity->elapsed_ns += ktime_to_ns(
+   ktime_sub(job->s_fence->finished.timestamp,
+ job->s_fence->scheduled.timestamp));
+   }
+
return job;
 }
 
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index ca857ec9e7eb..f88a3fc14fb1 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -228,6 +228,13 @@ struct drm_sched_entity {
 */
struct rb_node  rb_tree_node;
 
+   /**
+* @elapsed_ns
+*
+* Records the amount of time where jobs from this entity were active
+* on the GPU.
+*/
+   uint64_t elapsed_ns;
 };
 
 /**
-- 
2.39.1
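The accounting added to drm_sched_get_cleanup_job() amounts to adding "finished minus scheduled" to a per-entity counter each time a job is cleaned up. A userspace sketch of that arithmetic follows; `struct entity`, `struct job` and `account_job()` are hypothetical simplifications, with plain nanosecond integers in place of the fence timestamps and ktime helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the scheduler entity: only the counter is kept. */
struct entity {
	uint64_t elapsed_ns;
};

/* Hypothetical job record with timestamps already converted to nanoseconds. */
struct job {
	struct entity *entity;
	int64_t scheduled_ns;
	int64_t finished_ns;
};

/* Add the time between scheduling and completion to the owning entity. */
static void account_job(struct job *job)
{
	if (!job)
		return;
	job->entity->elapsed_ns += (uint64_t)(job->finished_ns - job->scheduled_ns);
}

void example_accounting(void)
{
	struct entity e = { 0 };
	struct job a = { &e, 100, 350 };
	struct job b = { &e, 400, 1000 };

	account_job(&a);
	account_job(&b);
	account_job(NULL); /* no finished job: nothing to account */
	assert(e.elapsed_ns == 850);
}
```

Because the only inputs are the two timestamps, the counter measures time between a job being scheduled and its completion being signalled, which is the granularity the commit message describes.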

[PATCH v3 2/3] drm/etnaviv: allocate unique ID per drm_file

2023-02-01 Thread Lucas Stach

Allows easily tracking whether several fds point to the same
execution context due to being dup'ed.

Signed-off-by: Lucas Stach 
---
v3: use xarray to track the active contexts to avoid issues
on rollover
---
 drivers/gpu/drm/etnaviv/etnaviv_drv.c | 11 +++
 drivers/gpu/drm/etnaviv/etnaviv_drv.h |  4 
 2 files changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.c 
b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
index 1d2b4fb4bcf8..31a7f59ccb49 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_drv.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
@@ -56,6 +56,11 @@ static int etnaviv_open(struct drm_device *dev, struct 
drm_file *file)
if (!ctx)
return -ENOMEM;
 
+   ret = xa_alloc_cyclic(&priv->active_contexts, &ctx->id, ctx,
+ xa_limit_32b, &priv->next_context_id, GFP_KERNEL);
+   if (ret < 0)
+   goto out_free;
+
ctx->mmu = etnaviv_iommu_context_init(priv->mmu_global,
  priv->cmdbuf_suballoc);
if (!ctx->mmu) {
@@ -99,6 +104,8 @@ static void etnaviv_postclose(struct drm_device *dev, struct 
drm_file *file)
 
etnaviv_iommu_context_put(ctx->mmu);
 
+   xa_erase(&priv->active_contexts, ctx->id);
+
kfree(ctx);
 }
 
@@ -514,6 +521,8 @@ static int etnaviv_bind(struct device *dev)
 
dma_set_max_seg_size(dev, SZ_2G);
 
+   xa_init_flags(&priv->active_contexts, XA_FLAGS_ALLOC);
+
    mutex_init(&priv->gem_lock);
    INIT_LIST_HEAD(&priv->gem_list);
priv->num_gpus = 0;
@@ -563,6 +572,8 @@ static void etnaviv_unbind(struct device *dev)
 
etnaviv_cmdbuf_suballoc_destroy(priv->cmdbuf_suballoc);
 
+   xa_destroy(&priv->active_contexts);
+
drm->dev_private = NULL;
kfree(priv);
 
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.h 
b/drivers/gpu/drm/etnaviv/etnaviv_drv.h
index 0b311af04f1d..b3eb1662e90c 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_drv.h
+++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.h
@@ -29,6 +29,7 @@ struct etnaviv_iommu_global;
 #define ETNAVIV_SOFTPIN_START_ADDRESS  SZ_4M /* must be >= SUBALLOC_SIZE */
 
 struct etnaviv_file_private {
+   int id;
struct etnaviv_iommu_context*mmu;
struct drm_sched_entity sched_entity[ETNA_MAX_PIPES];
 };
@@ -41,6 +42,9 @@ struct etnaviv_drm_private {
struct etnaviv_cmdbuf_suballoc *cmdbuf_suballoc;
struct etnaviv_iommu_global *mmu_global;
 
+   struct xarray active_contexts;
+   u32 next_context_id;
+
/* list of GEM objects: */
struct mutex gem_lock;
struct list_head gem_list;
-- 
2.39.1

Re: [PATCH v5 7/7] drm/etnaviv: add HWDB entry for VIPNano-QI.7120.0055

2023-02-01 Thread Tomeu Vizoso


On 2/1/23 14:26, Lucas Stach wrote:

Hi Tomeu,

On Thursday, 01.12.2022 at 11:30 +0100, Tomeu Vizoso wrote:

This is a compute-only module marketed towards AI and vision
acceleration. This particular version can be found on the Amlogic A311D
SoC.

The feature bits are taken from the Khadas downstream kernel driver
6.4.4.3.310723AAA.


I just noticed that you seem to have dropped the nn_core_count property
from this HWDB entry when reworking the series according to my
feedback. I assume that you are fine with me fixing this up in the
etnaviv tree by using the hunk from your previous patch "[PATCH v4 6/7]
drm/etnaviv: Add nn_core_count to chip feature struct".

If you see any issue with that, please let me know.


I have checked and I think you are right.

Thanks!

Tomeu


Regards,
Lucas


Signed-off-by: Tomeu Vizoso 
---
  drivers/gpu/drm/etnaviv/etnaviv_hwdb.c | 31 ++
  1 file changed, 31 insertions(+)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_hwdb.c 
b/drivers/gpu/drm/etnaviv/etnaviv_hwdb.c
index 44df273a5aae..66b8ad6c7d26 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_hwdb.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_hwdb.c
@@ -134,6 +134,37 @@ static const struct etnaviv_chip_identity 
etnaviv_chip_identities[] = {
.minor_features10 = 0x90044250,
.minor_features11 = 0x0024,
},
+   {
+   .model = 0x8000,
+   .revision = 0x7120,
+   .product_id = 0x45080009,
+   .customer_id = 0x88,
+   .eco_id = 0,
+   .stream_count = 8,
+   .register_max = 64,
+   .thread_count = 256,
+   .shader_core_count = 1,
+   .vertex_cache_size = 16,
+   .vertex_output_buffer_size = 1024,
+   .pixel_pipes = 1,
+   .instruction_count = 512,
+   .num_constants = 320,
+   .buffer_size = 0,
+   .varyings_count = 16,
+   .features = 0xe0287cac,
+   .minor_features0 = 0xc1799eff,
+   .minor_features1 = 0xfefbfadb,
+   .minor_features2 = 0xeb9d6fbf,
+   .minor_features3 = 0xedfffced,
+   .minor_features4 = 0xd30dafc7,
+   .minor_features5 = 0x7b5ac333,
+   .minor_features6 = 0xfc8ee200,
+   .minor_features7 = 0x03fffa6f,
+   .minor_features8 = 0x00fe0ef0,
+   .minor_features9 = 0x0088003c,
+   .minor_features10 = 0x108048c0,
+   .minor_features11 = 0x0010,
+   },
  };
  
  bool etnaviv_fill_identity_from_hwdb(struct etnaviv_gpu *gpu)




--
Tomeu Vizoso
Consultant Principal Software Engineer

Collabora Ltd.
Platinum Building, St John's Innovation Park, Cambridge CB4 0DS, United 
Kingdom

Registered in England & Wales, no. 5513718

DRM accel and debugfs/sysfs

2023-02-01 Thread Stanislaw Gruszka

Hi

I was about to send debugfs support for ivpu and noticed that there
are current changes that deprecate drm_devel->debugfs_init callback.

Further, I looked at this commit [1], which states that we should not
use drm_minor for debugfs and sysfs. That is quite contrary to
what the drm accel framework did in the first place.

So my question is: how should we use debugfs/sysfs in accel?
Use it in the old-fashioned minor-centric way, or change
the framework somehow?

[1] 
https://cgit.freedesktop.org/drm/drm-misc/commit/?id=99845faae7099cd704ebf67514c1157c26960a26

Regards
Stanislaw

Re: [v1 2/3] drm/msm/disp/dpu1: add dspps into reservation if there is a ctm request

2023-02-01 Thread Marijn Suijten

On 2023-02-01 15:48:02, Dmitry Baryshkov wrote:
> On 01/02/2023 13:16, Marijn Suijten wrote:
> > On 2023-01-30 07:21:31, Kalyan Thota wrote:
> >> Add dspp blocks into the topology for reservation, if there is a ctm
> >> request for that composition.
> > 
> > DSPP
> > 
> >> Changes in v1:
> >> - Minor nits (Dmitry)
> > 
> > This should go below the triple dashes, so that it /does not/ become
> > part of the patch/commit that is applied to the tree (where review
> > history is irrelevant as it can be searched for separately).
> 
> This is one of DRM peculiarities which we have to live with.

Not sure I follow.  Keeping "changes since vXX" out of commit messages
seems to be a kernel-wide convention, after all the title doesn't
include which revision of the patch ended up being applied to the tree
either.  Having the changelog checked in to the tree has no relevance.

> >> Signed-off-by: Kalyan Thota 
> >> Reviewed-by: Dmitry Baryshkov 
> >> ---
> >>   drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 13 ++---
> >>   1 file changed, 6 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c 
> >> b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> >> index 9c6817b..3bd46b4 100644
> >> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> >> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> >> @@ -545,7 +545,8 @@ bool dpu_encoder_use_dsc_merge(struct drm_encoder 
> >> *drm_enc)
> >>   static struct msm_display_topology dpu_encoder_get_topology(
> >>struct dpu_encoder_virt *dpu_enc,
> >>struct dpu_kms *dpu_kms,
> >> -  struct drm_display_mode *mode)
> >> +  struct drm_display_mode *mode,
> >> +  struct drm_crtc_state *crtc_state)
> >>   {
> >>struct msm_display_topology topology = {0};
> >>int i, intf_count = 0;
> >> @@ -573,11 +574,9 @@ static struct msm_display_topology 
> >> dpu_encoder_get_topology(
> >>else
> >>topology.num_lm = (mode->hdisplay > MAX_HDISPLAY_SPLIT) ? 2 : 1;
> >>   
> >> -  if (dpu_enc->disp_info.intf_type == DRM_MODE_ENCODER_DSI) {
> >> -  if (dpu_kms->catalog->dspp &&
> >> -  (dpu_kms->catalog->dspp_count >= topology.num_lm))
> >> -  topology.num_dspp = topology.num_lm;
> >> -  }
> >> +  if (dpu_kms->catalog->dspp &&
> >> +  crtc_state->ctm && (dpu_kms->catalog->dspp_count >= 
> >> topology.num_lm))
> > 
> > Multiline-if-clause is typically indented with two tabs, not a half tab
> > (4 spaces).
> 
> I tend to disagree here. Lately I have mostly seen it being indented to 
> the opening parenthesis, so that nested statements also indent nicely.

Ack, hence double-checked in a followup message; there's no consistency
in dpu1 now but I agree that for ts=8 a 4-space-indented wraparound
neatly aligns with the expression on the first line /and/ prevents
inadvertently aligning with the conditional body on the next line.

Will fix up in my own series too, thanks!

> > Nit: swap the && here?  dspp and dspp_count are related, so check ctm
> > first or last but not in the middle - makes reading easier.
> 
> I think we can ignore dpu_kms->catalog->dspp completely. checking 
> dspp_count should be enough for the purpose of the check (and note, the 
> check for dspp/dspp_count is misleading and should be omitted).

Ack, thanks!

- Marijn
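The condition being discussed, reduced to what Dmitry suggests (check only the DSPP count, and only when a CTM is requested), can be sketched as a small pure function. `num_dspp_to_reserve()` and its parameters are hypothetical names; in the driver the inputs would come from the catalog, the topology and the CRTC state.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Reserve one DSPP per layer mixer, but only when a CTM was requested and
 * the hardware has at least as many DSPPs as mixers in use; otherwise none.
 */
static int num_dspp_to_reserve(int dspp_count, int num_lm, bool has_ctm)
{
	if (has_ctm && dspp_count >= num_lm)
		return num_lm;
	return 0;
}

void example_dspp_reservation(void)
{
	assert(num_dspp_to_reserve(2, 2, true) == 2);  /* CTM set, enough DSPPs */
	assert(num_dspp_to_reserve(1, 2, true) == 0);  /* CTM set, too few DSPPs */
	assert(num_dspp_to_reserve(2, 1, false) == 0); /* no CTM, nothing reserved */
}
```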

Re: [PATCH v2 02/17] drm/display/dp_mst: Handle old/new payload states in drm_dp_remove_payload()

2023-02-01 Thread Imre Deak

On Tue, Jan 31, 2023 at 06:13:10PM -0500, Lyude Paul wrote:
> On Tue, 2023-01-31 at 17:05 +0200, Imre Deak wrote:
> > Atm, drm_dp_remove_payload() uses the same payload state to both get the
> > vc_start_slot required for the payload removal DPCD message and to
> > deduct time_slots from vc_start_slot of all payloads after the one being
> > removed.
> > 
> > The above isn't always correct, as vc_start_slot must be the up-to-date
> > version contained in the new payload state, but time_slots must be the
> > one used when the payload was previously added, contained in the old
> > payload state. The new payload's time_slots can change vs. the old one
> > if the current atomic commit changes the corresponding mode.
> > 
> > This patch lets drivers pass the old and new payload states to
> > drm_dp_remove_payload(), but keeps these the same for now in all drivers
> > not to change the behavior. A follow-up i915 patch will pass in that
> > driver the correct old and new states to the function.
> 
> Oh wow, this was definitely a mistake on my part, thanks for catching this!
> TBH, I think this behavior is correct so (now that I actually have a setup
> capable of testing amdgpu's MST fully thanks to gitlab issue 2171…) if you'd
> like to change it on other drivers as well I can test it fully.

I only checked that the other drivers pass the new payload state to
drm_dp_remove_payload(), so not sure how that works atm if the same
commit has to both remove the payload (with the old time_slots value)
and add it back (with a new time_slots value). Maybe that can't happen
in those drivers, or time_slots get updated between remove and readd.
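To make the old/new distinction concrete, here is a userspace sketch of the slot bookkeeping described in the commit message: the start slot is read from the new payload state, while the amount deducted from later payloads comes from the old one. `struct payload` and `remove_payload()` are hypothetical simplifications, not the helper's real types, and the DPCD messaging is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced payload state: only the two fields under discussion. */
struct payload {
	int vc_start_slot;
	int time_slots;
};

/*
 * Remove new_payload from the table. Payloads that start after it move
 * towards the start by the number of slots it occupied when it was added,
 * i.e. the value recorded in old_payload, not the one in new_payload.
 */
static void remove_payload(struct payload *payloads, size_t n,
			   const struct payload *old_payload,
			   struct payload *new_payload)
{
	for (size_t i = 0; i < n; i++) {
		struct payload *pos = &payloads[i];

		if (pos != new_payload &&
		    pos->vc_start_slot > new_payload->vc_start_slot)
			pos->vc_start_slot -= old_payload->time_slots;
	}
	new_payload->vc_start_slot = -1; /* marks the payload as unallocated */
}

void example_remove_payload(void)
{
	/* New state: the commit changed payload 0 from 10 to 20 time slots. */
	struct payload table[2] = { { 1, 20 }, { 11, 5 } };
	const struct payload old0 = { 1, 10 };

	remove_payload(table, 2, &old0, &table[0]);
	assert(table[1].vc_start_slot == 1); /* 11 - 10, not 11 - 20 */
	assert(table[0].vc_start_slot == -1);
}
```

In this example, deducting the new time_slots value (20) would have left the second payload at slot -9, which is the kind of mismatch the commit message describes.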

> Or feel free to leave it to me, shouldn't be too difficult I think :).

Yes, this patch should have no functional change, so please check what
would apply to other drivers as well.

Could you also check Ville's comment about storing start_slot elsewhere
than the atomic state (leaving only time_slots there). I wonder if that
would work, at least it would simplify things I think.

> For 0-2:
> 
> Reviewed-by: Lyude Paul 

Thanks.

> 
> > 
> > Cc: Lyude Paul 
> > Cc: Ville Syrjälä 
> > Cc: Ben Skeggs 
> > Cc: Karol Herbst 
> > Cc: Harry Wentland 
> > Cc: Alex Deucher 
> > Cc: Wayne Lin 
> > Cc: sta...@vger.kernel.org # 6.1
> > Cc: dri-devel@lists.freedesktop.org
> > Reviewed-by: Ville Syrjälä 
> > Signed-off-by: Imre Deak 
> > ---
> >  .../amd/display/amdgpu_dm/amdgpu_dm_helpers.c |  2 +-
> >  drivers/gpu/drm/display/drm_dp_mst_topology.c | 26 ++-
> >  drivers/gpu/drm/i915/display/intel_dp_mst.c   |  4 ++-
> >  drivers/gpu/drm/nouveau/dispnv50/disp.c   |  2 +-
> >  include/drm/display/drm_dp_mst_helper.h   |  3 ++-
> >  5 files changed, 21 insertions(+), 16 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c 
> > b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
> > index a50319fc42b11..180d3893b68da 100644
> > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
> > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_helpers.c
> > @@ -208,7 +208,7 @@ bool dm_helpers_dp_mst_write_payload_allocation_table(
> > if (enable)
> > drm_dp_add_payload_part1(mst_mgr, mst_state, payload);
> > else
> > -   drm_dp_remove_payload(mst_mgr, mst_state, payload);
> > +   drm_dp_remove_payload(mst_mgr, mst_state, payload, payload);
> >  
> > /* mst_mgr->->payloads are VC payload notify MST branch using DPCD or
> >  * AUX message. The sequence is slot 1-63 allocated sequence for each
> > diff --git a/drivers/gpu/drm/display/drm_dp_mst_topology.c 
> > b/drivers/gpu/drm/display/drm_dp_mst_topology.c
> > index 847c10aa2098c..1990ff5dc7ddd 100644
> > --- a/drivers/gpu/drm/display/drm_dp_mst_topology.c
> > +++ b/drivers/gpu/drm/display/drm_dp_mst_topology.c
> > @@ -3342,7 +3342,8 @@ EXPORT_SYMBOL(drm_dp_add_payload_part1);
> >   * drm_dp_remove_payload() - Remove an MST payload
> >   * @mgr: Manager to use.
> >   * @mst_state: The MST atomic state
> > - * @payload: The payload to write
> > + * @old_payload: The payload with its old state
> > + * @new_payload: The payload to write
> >   *
> >   * Removes a payload from an MST topology if it was successfully assigned 
> > a start slot. Also updates
> >   * the starting time slots of all other payloads which would have been 
> > shifted towards the start of
> > @@ -3350,36 +3351,37 @@ EXPORT_SYMBOL(drm_dp_add_payload_part1);
> >   */
> >  void drm_dp_remove_payload(struct drm_dp_mst_topology_mgr *mgr,
> >struct drm_dp_mst_topology_state *mst_state,
> > -  struct drm_dp_mst_atomic_payload *payload)
> > +  const struct drm_dp_mst_atomic_payload *old_payload,
> > +  struct drm_dp_mst_atomic_payload *new_payload)
> >  {
> > struct drm_dp_mst_atomic_payload *pos;
> > bool send_remove = false;
> >  
> > /* We failed to make the payload, so

Re: [PATCH v2 1/4] memcg: Track exported dma-buffers

2023-02-01 Thread Tvrtko Ursulin




On 01/02/2023 14:23, Tvrtko Ursulin wrote:


On 01/02/2023 01:49, T.J. Mercier wrote:

On Tue, Jan 31, 2023 at 6:01 AM Tvrtko Ursulin
 wrote:



On 25/01/2023 20:04, T.J. Mercier wrote:

On Wed, Jan 25, 2023 at 9:31 AM Tvrtko Ursulin
 wrote:



Hi,

On 25/01/2023 11:52, Michal Hocko wrote:

On Tue 24-01-23 19:46:28, Shakeel Butt wrote:

On Tue, Jan 24, 2023 at 03:59:58PM +0100, Michal Hocko wrote:

On Mon 23-01-23 19:17:23, T.J. Mercier wrote:

When a buffer is exported to userspace, use memcg to attribute the
buffer to the allocating cgroup until all buffer references are
released.


Is there any reason why this memory cannot be charged during the
allocation (__GFP_ACCOUNT used)?
Also you do charge and account the memory but underlying pages do not
know about their memcg (this is normally done with commit_charge for
user mapped pages). This would become a problem if the memory is
migrated for example.


I don't think this is movable memory.


This also means that you have to maintain memcg
reference outside of the memcg proper which is not really nice either.
This mimics tcp kmem limit implementation which I really have to say I
am not a great fan of and this pattern shouldn't be copied.



I think we should keep the discussion on technical merits instead of
personal preference. To me using skmem like interface is totally fine
but the pros/cons need to be very explicit and the clear reasons to
select that option should be included.


I do agree with that. I didn't want to sound personal wrt tcp kmem
accounting but the overall code maintenance cost is higher because
of how the tcp take on accounting differs from anything else in the memcg
proper. I would prefer to not grow another example like that.


To me there are two options:

1. Using skmem like interface as this patch series:

The main pro of this option is that it is very simple. Let me list down
the cons of this approach:

a. There is a time window between the actual memory allocation/free and
the charge and uncharge, and [un]charge happens when the whole memory is
allocated or freed. I think for the charge path that might not be a big
issue but on the uncharge, this can cause issues. The application and
the potential shrinkers have freed some of this dmabuf memory but until
the whole dmabuf is freed, the memcg uncharge will not happen. This can
have consequences on reclaim and oom behavior of the application.

b. Due to the usage model i.e. a central daemon allocating the dmabuf
memory upfront, there is a requirement to have a memcg charge transfer
functionality to transfer the charge from the central daemon to the
client applications. This does introduce complexity and avenues of weird
reclaim and oom behavior.


2. Allocate and charge the memory on page fault by actual user

In this approach, the memory is not allocated upfront by the central
daemon but rather on the page fault by the client application and the
memcg charge happens at the same time.

The only con I can think of is that this approach is more involved and may
need some clever tricks to track the page on the free path i.e. we need to
decrement the dmabuf memcg stat on free path. Maybe a page flag.

The pro of this approach is that there is no need to have a charge transfer
functionality and the charge/uncharge is closely tied to the actual
memory allocation and free.

Personally I would prefer the second approach but I don't want to just
block this work if the dmabuf folks are ok with the cons mentioned of
the first approach.


I am not familiar with dmabuf internals to judge complexity on their end
but I fully agree that charge-when-used is much easier to reason
about and it should have fewer subtle surprises.


Disclaimer that I don't seem to see patches 3&4 on dri-devel so maybe I
am missing something, but in principle yes, I agree that the 2nd option
(charge the user, not exporter) should be preferred. Thing being that at
export time there may not be any backing store allocated, plus if the
series is restricting the charge transfer to just Android clients then
it seems it has the potential to miss many other use cases. At least
needs to outline a description on how the feature will be useful outside
Android.


There is no restriction like that. It's available to anybody who wants
to call dma_buf_charge_transfer if they actually have a need for that,
which I don't really expect to be common since most users/owners of
the buffers will be the ones causing the export in the first place.
It's just not like that on Android with the extra allocator process in
the middle most of the time.


Yeah I used the wrong term "restrict", apologies. What I meant was, if
the idea was to allow spotting memory leaks, with the charge transfer
being optional and in the series only wired up for Android Binder, then
it obviously only fully works for that one case. So a step back..


Oh, spotting kernel memory leaks is a side-benefit of accounting
kernel-only buffers in the

Re: [PATCH] drm/amdgpu/fence: Fix oops due to non-matching drm_sched init/fini

2023-02-01 Thread Christian König


Am 01.02.23 um 15:24 schrieb Alex Deucher:

On Wed, Feb 1, 2023 at 2:18 AM Christian König
 wrote:

Hi Guchun,

no, that doesn't make any sense at all.

The ready flag indicates that the scheduler is fully prepared for hw
submissions from userspace and is unrelated to the initialization
status. It's set to true after IB testing was successful and only set to
false when a GPU reset fails and we can't get the hardware to work
any more.

That might have been the original intention, but right now sched.ready
gets set to true when we finish setting up the ring, but before we do
ring or IB tests.


WHAT? Please not again.

I'm really tired of fixing this over and over again, the meaning of
ring->sched.ready is to block submissions when a GPU reset fails. AND
NOTHING ELSE!

The problem is people seem to abuse it and I have to fix it for the
fourth or fifth time now.


I'm going to send out patches,
Christian.



Alex


Please use sched.ops instead as I suggested before.

Regards,
Christian.
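
The suggestion above can be sketched as follows. This is a simplified standalone model, not amdgpu or drm_sched code: toy_sched, toy_sched_ops, toy_sched_init() and toy_sched_fini_checked() are hypothetical names. It assumes only what the thread states, namely that the ops pointer is populated by the init routine, so a NULL ops pointer means init never ran, while the ready flag is reserved for "may accept submissions" and is not consulted for teardown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified types; not the real drm_sched structures. */
struct toy_sched_ops {
	const char *name;
};

struct toy_sched {
	const struct toy_sched_ops *ops; /* set only by toy_sched_init() */
	bool ready;                      /* "may accept submissions", nothing else */
	int fini_calls;                  /* counts real teardowns, for illustration */
};

static void toy_sched_init(struct toy_sched *sched,
			   const struct toy_sched_ops *ops)
{
	assert(ops != NULL);
	sched->ops = ops;
	sched->ready = true;
}

/* Tear down only what init actually set up; ready is not consulted. */
static void toy_sched_fini_checked(struct toy_sched *sched)
{
	if (!sched->ops)
		return; /* init never ran for this ring */

	sched->fini_calls++;
	sched->ops = NULL;
	sched->ready = false;
}
```

A ring whose ready flag was set early by driver code, but whose scheduler was never initialized, is skipped by the checked fini because its ops pointer is still NULL.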

Am 31.01.23 um 14:58 schrieb Chen, Guchun:

Hi Christian,

Do you think it makes sense to set 'ring->sched.ready' to true
in each ring init, even before drm_sched_init is executed/set up in
amdgpu_device_init_schedulers? As 'ready' is a member of the gpu scheduler structure.

Regards,
Guchun

-Original Message-
From: Koenig, Christian 
Sent: Tuesday, January 31, 2023 6:59 PM
To: Chen, Guchun ; Alex Deucher ; 
Guilherme G. Piccoli 
Cc: amd-...@lists.freedesktop.org; ker...@gpiccoli.net; Pan, Xinhui ; 
dri-devel@lists.freedesktop.org; Tuikov, Luben ; Limonciello, Mario 
; kernel-...@igalia.com; Deucher, Alexander 

Subject: Re: [PATCH] drm/amdgpu/fence: Fix oops due to non-matching drm_sched 
init/fini

Am 31.01.23 um 10:17 schrieb Chen, Guchun:

Hi Piccoli,

Please ignore my request of full dmesg log. I can reproduce the issue and get 
the same failure callstack by returning early with an error code prior to 
amdgpu_device_init_schedulers.

Regards,
Guchun

-Original Message-
From: Chen, Guchun
Sent: Tuesday, January 31, 2023 2:37 PM
To: Alex Deucher ; Guilherme G. Piccoli

Cc: amd-...@lists.freedesktop.org; ker...@gpiccoli.net; Pan, Xinhui
; dri-devel@lists.freedesktop.org; Tuikov, Luben
; Limonciello, Mario
; kernel-...@igalia.com; Deucher, Alexander
; Koenig, Christian

Subject: RE: [PATCH] drm/amdgpu/fence: Fix oops due to non-matching
drm_sched init/fini

Hi Piccoli,

I agree with Alex's point, using ring->sched.name for such a check is not a good
way. BTW, can you please attach a full dmesg log from the bad case to help me
understand more?

Regards,
Guchun

-Original Message-
From: Alex Deucher 
Sent: Tuesday, January 31, 2023 6:30 AM
To: Guilherme G. Piccoli 
Cc: amd-...@lists.freedesktop.org; ker...@gpiccoli.net; Chen, Guchun
; Pan, Xinhui ;
dri-devel@lists.freedesktop.org; Tuikov, Luben ;
Limonciello, Mario ; kernel-...@igalia.com;
Deucher, Alexander ; Koenig, Christian

Subject: Re: [PATCH] drm/amdgpu/fence: Fix oops due to non-matching
drm_sched init/fini

On Mon, Jan 30, 2023 at 4:51 PM Guilherme G. Piccoli  
wrote:

+ Luben

(sorry, missed that in the first submission).

On 30/01/2023 18:45, Guilherme G. Piccoli wrote:

Currently amdgpu calls drm_sched_fini() from the fence driver sw
fini routine - such function is expected to be called only after the
respective init function - drm_sched_init() - was executed successfully.

It happens that we faced a driver probe failure on the Steam Deck
recently, and the function drm_sched_fini() was called even though
its counterpart had not been called previously, causing the following oops:

amdgpu: probe of :04:00.0 failed with error -110
BUG: kernel NULL pointer dereference, address: 0090 PGD
0 P4D 0
Oops: 0002 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 609 Comm: systemd-udevd Not tainted 6.2.0-rc3-gpiccoli
#338 Hardware name: Valve Jupiter/Jupiter, BIOS F7A0113 11/04/2022
RIP: 0010:drm_sched_fini+0x84/0xa0 [gpu_sched] [...] Call Trace:

amdgpu_fence_driver_sw_fini+0xc8/0xd0 [amdgpu]
amdgpu_device_fini_sw+0x2b/0x3b0 [amdgpu]
amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
devm_drm_dev_init_release+0x49/0x70
[...]

To prevent that, check if the drm_sched was properly initialized for
a given ring before calling its fini counter-part.

Notice ideally we'd use sched.ready for that; such field is set as
the last thing in drm_sched_init(). But amdgpu seems to "override"
the meaning of such field - in the above oops for example, it was a
GFX ring causing the crash, and the sched.ready field was set to
true in the ring init routine, regardless of the state of the DRM scheduler.
Hence, we ended up using another sched field.

Fixes: 067f44c8b459 ("drm/amdgpu: avoid over-handle of fence
driver fini in s3 test (v2)")

Cc: Andrey Grodzovsky 
Cc: Guchun Chen 
Cc: Mario Limonciello 
Signed-off-by: Guilherme G. Piccoli 
---


Hi folks, first of all thanks in advance for reviews / comments!
Notice that I've used the

Re: [RFT PATCH v2 2/3] drm/msm/dsi: Stop unconditionally powering up DSI hosts at modeset

2023-02-01 Thread Doug Anderson

Hi,

On Tue, Jan 31, 2023 at 3:32 PM Abhinav Kumar  wrote:
>
> On 1/31/2023 2:18 PM, Douglas Anderson wrote:
> > In commit 7d8e9a90509f ("drm/msm/dsi: move DSI host powerup to modeset
> > time"), we moved powering up DSI hosts to modeset time. This wasn't
> > because it was an elegant design, but there were no better options.
> >
> > That commit actually ended up breaking ps8640, and thus was born
> > commit ec7981e6c614 ("drm/msm/dsi: don't powerup at modeset time for
> > parade-ps8640") as a temporary hack to un-break ps8640 by moving it to
> > the old way of doing things. It turns out that ps8640 _really_ doesn't
> > like its pre_enable() function to be called after
> > dsi_mgr_bridge_power_on(). Specifically (from experimentation, not
> > because I have any inside knowledge), it looks like the assertion of
> > "RST#" in the ps8640 runtime resume handler seems like it's not
> > allowed to happen after dsi_mgr_bridge_power_on()
> >
> > Recently, Dave Stevenson's series landed allowing bridges some control
> > over pre_enable ordering. The meaty commit for our purposes is commit
> > 4fb912e5e190 ("drm/bridge: Introduce pre_enable_prev_first to alter
> > bridge init order"). As documented by that series, if a bridge doesn't
> > set "pre_enable_prev_first" then we should use the old ordering.
> >
> > Now that we have the commit ("drm/bridge: tc358762: Set
> > pre_enable_prev_first") we can go back to the old ordering, which also
> > allows us to remove the ps8640 special case.
> >
> > One last note is that even without reverting commit 7d8e9a90509f
> > ("drm/msm/dsi: move DSI host powerup to modeset time"), if you _just_
> > revert the ps8640 special case and try it out then it doesn't seem to
> > fail anymore. I spent time bisecting / debugging this and it turns out
> > to be mostly luck, so we still want this patch to make sure it's
> > solid. Specifically the reason it sorta works these days is because
> > we implemented wait_hpd_asserted() in ps8640 now, plus the magic of
> > "pm_runtime" autosuspend. The fact that we have wait_hpd_asserted()
> > implemented means that we actually power the bridge chip up just a wee
> > bit earlier and then the bridge happens to stay on because of
> > autosuspend and thus ends up powered before dsi_mgr_bridge_power_on().
> >
> > Cc: Dave Stevenson 
> > Cc: Dmitry Baryshkov 
> > Cc: Abhinav Kumar 
> > Signed-off-by: Douglas Anderson 
> > ---
> >
> > Changes in v2:
> > - Don't fold dsi_mgr_bridge_power_on() back into dsi_mgr_bridge_pre_enable()
> >
> >   drivers/gpu/drm/msm/dsi/dsi_manager.c | 38 +--
> >   1 file changed, 1 insertion(+), 37 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/msm/dsi/dsi_manager.c 
> > b/drivers/gpu/drm/msm/dsi/dsi_manager.c
> > index 1bbac72dad35..2197a54b9b96 100644
> > --- a/drivers/gpu/drm/msm/dsi/dsi_manager.c
> > +++ b/drivers/gpu/drm/msm/dsi/dsi_manager.c
> > @@ -34,32 +34,6 @@ static struct msm_dsi_manager msm_dsim_glb;
> >   #define IS_SYNC_NEEDED()(msm_dsim_glb.is_sync_needed)
> >   #define IS_MASTER_DSI_LINK(id)  (msm_dsim_glb.master_dsi_link_id == 
> > id)
> >
> > -#ifdef CONFIG_OF
> > -static bool dsi_mgr_power_on_early(struct drm_bridge *bridge)
> > -{
> > - struct drm_bridge *next_bridge = drm_bridge_get_next_bridge(bridge);
> > -
> > - /*
> > -  * If the next bridge in the chain is the Parade ps8640 bridge chip
> > -  * then don't power on early since it seems to violate the 
> > expectations
> > -  * of the firmware that the bridge chip is running.
> > -  *
> > -  * NOTE: this is expected to be a temporary special case. It's 
> > expected
> > -  * that we'll eventually have a framework that allows the next level
> > -  * bridge to indicate whether it needs us to power on before it or
> > -  * after it. When that framework is in place then we'll use it and
> > -  * remove this special case.
> > -  */
> > - return !(next_bridge && next_bridge->of_node &&
> > -  of_device_is_compatible(next_bridge->of_node, 
> > "parade,ps8640"));
> > -}
> > -#else
> > -static inline bool dsi_mgr_power_on_early(struct drm_bridge *bridge)
> > -{
> > - return true;
> > -}
> > -#endif
> > -
> >   static inline struct msm_dsi *dsi_mgr_get_dsi(int id)
> >   {
> >   return msm_dsim_glb.dsi[id];
> > @@ -265,12 +239,6 @@ static void dsi_mgr_bridge_power_on(struct drm_bridge 
> > *bridge)
> >   int ret;
> >
> >   DBG("id=%d", id);
> > - if (!msm_dsi_device_connected(msm_dsi))
> > - return;
> > -
> > - /* Do nothing with the host if it is slave-DSI in case of bonded DSI 
> > */
> > - if (is_bonded_dsi && !IS_MASTER_DSI_LINK(id))
> > - return;
> >
>
> Why are these two checks removed?

After this patch there is now one caller to this function and the one
caller does those exact same two checks immediately before calling
this function. Thus, they no longer do anything useful.

-Doug

Re: [PATCH] drm/amdgpu/fence: Fix oops due to non-matching drm_sched init/fini

2023-02-01 Thread Alex Deucher

On Wed, Feb 1, 2023 at 2:18 AM Christian König
 wrote:
>
> Hi Guchun,
>
> no, that doesn't make any sense at all.
>
> The ready flag indicates that the scheduler is fully prepared for hw
> submissions from userspace and is unrelated to the initialization
> status. It's set to true after IB testing was successful and only set to
> false when a GPU reset fails and we can't get the hardware to work
> any more.

That might have been the original intention, but right now sched.ready
gets set to true when we finish setting up the ring, but before we do
ring or IB tests.

Alex

>
> Please use sched.ops instead as I suggested before.
>
> Regards,
> Christian.
>
> Am 31.01.23 um 14:58 schrieb Chen, Guchun:
> > Hi Christian,
> >
> > Do you think it makes sense to set 'ring->sched.ready' to true
> > in each ring init, even before drm_sched_init is executed/set up
> > in amdgpu_device_init_schedulers? As 'ready' is a member of the gpu
> > scheduler structure.
> >
> > Regards,
> > Guchun
> >
> > -Original Message-
> > From: Koenig, Christian 
> > Sent: Tuesday, January 31, 2023 6:59 PM
> > To: Chen, Guchun ; Alex Deucher 
> > ; Guilherme G. Piccoli 
> > Cc: amd-...@lists.freedesktop.org; ker...@gpiccoli.net; Pan, Xinhui 
> > ; dri-devel@lists.freedesktop.org; Tuikov, Luben 
> > ; Limonciello, Mario ; 
> > kernel-...@igalia.com; Deucher, Alexander 
> > Subject: Re: [PATCH] drm/amdgpu/fence: Fix oops due to non-matching 
> > drm_sched init/fini
> >
> > Am 31.01.23 um 10:17 schrieb Chen, Guchun:
> >> Hi Piccoli,
> >>
> >> Please ignore my request of full dmesg log. I can reproduce the issue and 
> >> get the same failure callstack by returning early with an error code prior 
> >> to amdgpu_device_init_schedulers.
> >>
> >> Regards,
> >> Guchun
> >>
> >> -Original Message-
> >> From: Chen, Guchun
> >> Sent: Tuesday, January 31, 2023 2:37 PM
> >> To: Alex Deucher ; Guilherme G. Piccoli
> >> 
> >> Cc: amd-...@lists.freedesktop.org; ker...@gpiccoli.net; Pan, Xinhui
> >> ; dri-devel@lists.freedesktop.org; Tuikov, Luben
> >> ; Limonciello, Mario
> >> ; kernel-...@igalia.com; Deucher, Alexander
> >> ; Koenig, Christian
> >> 
> >> Subject: RE: [PATCH] drm/amdgpu/fence: Fix oops due to non-matching
> >> drm_sched init/fini
> >>
> >> Hi Piccoli,
> >>
> >> I agree with Alex's point, using ring->sched.name for such a check is not a
> >> good way. BTW, can you please attach a full dmesg log from the bad case to
> >> help me understand more?
> >>
> >> Regards,
> >> Guchun
> >>
> >> -Original Message-
> >> From: Alex Deucher 
> >> Sent: Tuesday, January 31, 2023 6:30 AM
> >> To: Guilherme G. Piccoli 
> >> Cc: amd-...@lists.freedesktop.org; ker...@gpiccoli.net; Chen, Guchun
> >> ; Pan, Xinhui ;
> >> dri-devel@lists.freedesktop.org; Tuikov, Luben ;
> >> Limonciello, Mario ; kernel-...@igalia.com;
> >> Deucher, Alexander ; Koenig, Christian
> >> 
> >> Subject: Re: [PATCH] drm/amdgpu/fence: Fix oops due to non-matching
> >> drm_sched init/fini
> >>
> >> On Mon, Jan 30, 2023 at 4:51 PM Guilherme G. Piccoli  
> >> wrote:
> >>> + Luben
> >>>
> >>> (sorry, missed that in the first submission).
> >>>
> >>> On 30/01/2023 18:45, Guilherme G. Piccoli wrote:
>  Currently amdgpu calls drm_sched_fini() from the fence driver sw
>  fini routine - such function is expected to be called only after the
>  respective init function - drm_sched_init() - was executed successfully.
> 
>  Happens that we faced a driver probe failure in the Steam Deck
>  recently, and the function drm_sched_fini() was called even without
>  its counter-part had been previously called, causing the following oops:
> 
>  amdgpu: probe of :04:00.0 failed with error -110
>  BUG: kernel NULL pointer dereference, address: 0090 PGD
>  0 P4D 0
>  Oops: 0002 [#1] PREEMPT SMP NOPTI
>  CPU: 0 PID: 609 Comm: systemd-udevd Not tainted 6.2.0-rc3-gpiccoli
>  #338 Hardware name: Valve Jupiter/Jupiter, BIOS F7A0113 11/04/2022
>  RIP: 0010:drm_sched_fini+0x84/0xa0 [gpu_sched] [...] Call Trace:
> 
> amdgpu_fence_driver_sw_fini+0xc8/0xd0 [amdgpu]
> amdgpu_device_fini_sw+0x2b/0x3b0 [amdgpu]
> amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
> devm_drm_dev_init_release+0x49/0x70
> [...]
> 
>  To prevent that, check if the drm_sched was properly initialized for
>  a given ring before calling its fini counter-part.
> 
>  Notice ideally we'd use sched.ready for that; such field is set as
>  the latest thing on drm_sched_init(). But amdgpu seems to "override"
>  the meaning of such field - in the above oops for example, it was a
>  GFX ring causing the crash, and the sched.ready field was set to
>  true in the ring init routine, regardless of the state of the DRM 
>  scheduler. Hence, we ended-up using another sched field.
> >> Fixes: 067f44c8b459 ("drm/amdgpu: avoid over-handle of fence
> >>

Re: [PATCH v2 1/4] memcg: Track exported dma-buffers

2023-02-01 Thread Tvrtko Ursulin




On 01/02/2023 01:49, T.J. Mercier wrote:

On Tue, Jan 31, 2023 at 6:01 AM Tvrtko Ursulin
 wrote:



On 25/01/2023 20:04, T.J. Mercier wrote:

On Wed, Jan 25, 2023 at 9:31 AM Tvrtko Ursulin
 wrote:



Hi,

On 25/01/2023 11:52, Michal Hocko wrote:

On Tue 24-01-23 19:46:28, Shakeel Butt wrote:

On Tue, Jan 24, 2023 at 03:59:58PM +0100, Michal Hocko wrote:

On Mon 23-01-23 19:17:23, T.J. Mercier wrote:

When a buffer is exported to userspace, use memcg to attribute the
buffer to the allocating cgroup until all buffer references are
released.


Is there any reason why this memory cannot be charged during the
allocation (__GFP_ACCOUNT used)?
Also you do charge and account the memory but underlying pages do not
know about their memcg (this is normally done with commit_charge for
user mapped pages). This would become a problem if the memory is
migrated for example.


I don't think this is movable memory.


This also means that you have to maintain memcg
reference outside of the memcg proper which is not really nice either.
This mimics tcp kmem limit implementation which I really have to say I
am not a great fan of and this pattern shouldn't be copied.



I think we should keep the discussion on technical merits instead of
personal preference. To me using skmem like interface is totally fine
but the pros/cons need to be very explicit and the clear reasons to
select that option should be included.


I do agree with that. I didn't want to sound personal wrt tcp kmem
accounting but the overall code maintenance cost is higher because
of how the tcp take on accounting differs from anything else in the memcg
proper. I would prefer to not grow another example like that.


To me there are two options:

1. Using skmem like interface as this patch series:

The main pro of this option is that it is very simple. Let me list down
the cons of this approach:

a. There is a time window between the actual memory allocation/free and
the charge and uncharge, and [un]charge happens when the whole memory is
allocated or freed. I think for the charge path that might not be a big
issue but on the uncharge, this can cause issues. The application and
the potential shrinkers have freed some of this dmabuf memory but until
the whole dmabuf is freed, the memcg uncharge will not happen. This can
have consequences on reclaim and oom behavior of the application.

b. Due to the usage model i.e. a central daemon allocating the dmabuf
memory upfront, there is a requirement to have a memcg charge transfer
functionality to transfer the charge from the central daemon to the
client applications. This does introduce complexity and avenues of weird
reclaim and oom behavior.


2. Allocate and charge the memory on page fault by actual user

In this approach, the memory is not allocated upfront by the central
daemon but rather on the page fault by the client application and the
memcg charge happens at the same time.

The only con I can think of is that this approach is more involved and may
need some clever tricks to track the page on the free path i.e. we need to
decrement the dmabuf memcg stat on free path. Maybe a page flag.

The pro of this approach is that there is no need to have a charge transfer
functionality and the charge/uncharge is closely tied to the actual
memory allocation and free.

Personally I would prefer the second approach but I don't want to just
block this work if the dmabuf folks are ok with the cons mentioned of
the first approach.
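
The two options described above can be contrasted with a small standalone model. This is only a sketch under stated assumptions, not memcg or dma-buf code: toy_cgroup, toy_buf and all toy_* functions are hypothetical, and pages are represented by array slots. Option 1 charges the whole buffer to one group up front and needs a transfer step to move the charge to the real user; option 2 charges each page to whoever faults it in and uncharges when that page is freed.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGES 4

/* Hypothetical accounting domain standing in for a memcg. */
struct toy_cgroup {
	long charged_pages;
};

/* Hypothetical buffer; owner[i] records which group page i is charged to. */
struct toy_buf {
	struct toy_cgroup *owner[TOY_PAGES];
};

/* Option 1: charge every page to one group up front (at export time). */
static void toy_charge_all(struct toy_buf *buf, struct toy_cgroup *cg)
{
	size_t i;

	for (i = 0; i < TOY_PAGES; i++) {
		assert(buf->owner[i] == NULL);
		buf->owner[i] = cg;
		cg->charged_pages++;
	}
}

/* Option 1 also needs a way to move the whole charge to the real user. */
static void toy_transfer_all(struct toy_buf *buf, struct toy_cgroup *to)
{
	size_t i;

	for (i = 0; i < TOY_PAGES; i++) {
		if (!buf->owner[i])
			continue;
		buf->owner[i]->charged_pages--;
		buf->owner[i] = to;
		to->charged_pages++;
	}
}

/* Option 2: charge a single page to whoever faults it in. */
static void toy_charge_on_fault(struct toy_buf *buf, size_t page,
				struct toy_cgroup *cg)
{
	assert(page < TOY_PAGES);
	if (buf->owner[page])
		return; /* already backed and charged */
	buf->owner[page] = cg;
	cg->charged_pages++;
}

/* Option 2: uncharge as soon as that page is freed. */
static void toy_uncharge_page(struct toy_buf *buf, size_t page)
{
	assert(page < TOY_PAGES);
	if (!buf->owner[page])
		return;
	buf->owner[page]->charged_pages--;
	buf->owner[page] = NULL;
}
```

In this model the per-page owner array plays the role of the "clever trick" needed to find the right group on the free path; in option 1 no such per-page state is needed, at the cost of the transfer step.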


I am not familiar with dmabuf internals to judge complexity on their end
but I fully agree that charge-when-used is much easier to reason
about and it should have fewer subtle surprises.


Disclaimer that I don't seem to see patches 3&4 on dri-devel so maybe I
am missing something, but in principle yes, I agree that the 2nd option
(charge the user, not exporter) should be preferred. Thing being that at
export time there may not be any backing store allocated, plus if the
series is restricting the charge transfer to just Android clients then
it seems it has the potential to miss many other use cases. At least
needs to outline a description on how the feature will be useful outside
Android.


There is no restriction like that. It's available to anybody who wants
to call dma_buf_charge_transfer if they actually have a need for that,
which I don't really expect to be common since most users/owners of
the buffers will be the ones causing the export in the first place.
It's just not like that on Android with the extra allocator process in
the middle most of the time.


Yeah I used the wrong term "restrict", apologies. What I meant was, if
the idea was to allow spotting memory leaks, with the charge transfer
being optional and in the series only wired up for Android Binder, then
it obviously only fully works for that one case. So a step back..


Oh, spotting kernel memory leaks is a side-benefit of accounting
kernel-only buffers in the root cgroup. The primary goal is to
attribute buffers to applications that originated

Re: linux-6.2-rc4+ hangs on poweroff/reboot: Bisected

2023-02-01 Thread Chris Clayton




On 30/01/2023 23:27, Ben Skeggs wrote:
> On Tue, 31 Jan 2023 at 09:09, Chris Clayton  wrote:
>>
>> Hi again.
>>
>> On 30/01/2023 20:19, Chris Clayton wrote:
>>> Thanks, Ben.
>>
>> 
>>
 Hey,

 This is a complete shot-in-the-dark, as I don't see this behaviour on
 *any* of my boards.  Could you try the attached patch please?
>>>
>>> Unfortunately, the patch made no difference.
>>>
>>> I've been looking at how the graphics on my laptop is set up, and have a
>>> bit of a worry about whether the firmware might be playing a part in this
>>> problem. In order to offload video decoding to the NVidia TU117 GPU, it
>>> seems the scrubber firmware must be available, but as far as I know, that
>>> has not been released by NVidia. To get it to work, I followed what ubuntu
>>> have done and the scrubber in /lib/firmware/nvidia/tu117/nvdec/ is a
>>> symlink to ../../tu116/nvdev/scrubber.bin. That, of course, means that some
>>> of the firmware being loaded is for a different card. I note that
>>> processing related to firmware is being changed in the patch. Might my set
>>> up be at the root of my problem?
>>>
>>> I'll have a fiddle and see what I can work out.
>>>
>>> Chris
>>>

 Thanks,
 Ben.

>
>>
>> Well, my fiddling has got my system rebooting and shutting down successfully
>> again. I found that if I delete the symlink to the scrubber firmware, reboot
>> and shutdown work again. There are, however, a number of other files in the
>> tu117 firmware directory tree that are symlinks to actual files in its tu116
>> counterpart. So I deleted all of those too. Unfortunately, the absence of one
>> or more of those symlinks causes Xorg to fail to start. I've reinstated all
>> the links except scrubber and I now have a system that works as it did until
>> I tried to run a kernel that includes the bad commit I identified in my
>> bisection. That includes offloading video decoding to the NVidia card, so
>> whatever I read that said the scrubber firmware was needed seems to have been
>> wrong. I get a new message that (nouveau :01:00.0: fb: VPR locked, but no
>> scrubber binary!), but, hey, we can't have everything.
>>
>> If you still want to get to the bottom of this, let me know what you need me
>> to provide and I'll do my best. I suspect you might want to because there
>> will be an awful lot of Ubuntu-based systems out there with that scrubber.bin
>> symlink in place. On the other hand, it could be quite a while before ubuntu
>> are deploying 6.2 or later kernels.
> The symlinks are correct - whole groups of GPUs share the same FW, and
> we use symlinks in linux-firmware to represent this.
> 
> I don't really have any ideas how/why this patch causes issues with
> shutdown - it's a path that only gets executed during initialisation.
> Can you try and capture the kernel log during shutdown ("dmesg -w"
> over ssh? netconsole?), and see if there's any relevant messages
> providing a hint at what's going on?  Alternatively, you could try
> unloading the module (you will have to stop X/wayland/gdm/etc/etc
> first) and seeing if that hangs too.
> 
> Ben.

Sorry for the delay - I've been learning about netconsole and netcat. However,
I had no success with ssh, and netconsole produced a log with nothing unusual
in it.

Simply stopping Xorg and removing the nouveau module succeeds.

So, I rebuilt rc6+ after a pull from Linus' tree this morning and set the
nouveau debug level to 7. I then booted to a console before doing a reboot
(with Ctrl+Alt+Del). As expected the machine locked up just before it would
ordinarily restart. The last few lines on the console might be helpful:

...
nouveau :01:00:0  fifo: preinit running...
nouveau :01:00:0  fifo: preinit completed in 4us
nouveau :01:00:0  gr: preinit running...
nouveau :01:00:0  gr: preinit completed in 0us
nouveau :01:00:0  nvdec0: preinit running...
nouveau :01:00:0  nvdec0: preinit completed in 0us
nouveau :01:00:0  nvdec0: preinit running...
nouveau :01:00:0  nvdec0: preinit completed in 0us
nouveau :01:00:0  sec2: preinit running...
nouveau :01:00:0  sec2: preinit completed in 0us
nouveau :01:00:0  fb:.VPR locked, running scrubber binary

These messages appear after the "sd 4:0:0:0 [sda] Stopping disk" I reported in
my initial email.

After the "running scrubber" line appears the machine is locked and I have to
hold down the power button to recover. I get the same outcome from running
"halt -dip", "poweroff -di" and "shutdown -h -P now". I guess it's no surprise
that all three result in the same outcome because invocations of halt, poweroff
and reboot (without the -f argument) from a runlevel other than 0 result in
shutdown being run. Switching to runlevel 0 with "telinit 0" results in the
same messages from nouveau followed by the lockup.

Let me know if you need any additional diagnostics.

Chris

> 
>>
>> Thanks,
>>
>> Chris
>>
>>

Re: [v1 2/3] drm/msm/disp/dpu1: add dspps into reservation if there is a ctm request

2023-02-01 Thread Dmitry Baryshkov


On 01/02/2023 13:26, Marijn Suijten wrote:

On 2023-02-01 12:16:05, Marijn Suijten wrote:


+   if (dpu_kms->catalog->dspp &&
+   crtc_state->ctm && (dpu_kms->catalog->dspp_count >= 
topology.num_lm))


Multiline-if-clause is typically indented with two tabs, not a half tab
(4 spaces).


Hmm, Dmitry requested indent-to-opening-parenthesis in v1 instead; and
the majority of dpu1 uses the worst version of all: indent with a single
tab so that the contents line up with the code block below.  Dmitry,
I'll leave final say to you (and fix it up in my own DPU series
accordingly too).


Well,

:set cino=(0



- Marijn


--
With best wishes
Dmitry

Re: [v1 2/3] drm/msm/disp/dpu1: add dspps into reservation if there is a ctm request

2023-02-01 Thread Dmitry Baryshkov


On 01/02/2023 13:16, Marijn Suijten wrote:

On 2023-01-30 07:21:31, Kalyan Thota wrote:

Add dspp blocks into the topology for reservation, if there is a ctm
request for that composition.


DSPP


Changes in v1:
- Minor nits (Dmitry)


This should go below the triple dashes, so that it /does not/ become
part of the patch/commit that is applied to the tree (where review
history is irrelevant as it can be searched for separately).


This is one of DRM peculiarities which we have to live with.




Signed-off-by: Kalyan Thota 
Reviewed-by: Dmitry Baryshkov 
---
  drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 13 ++---
  1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
index 9c6817b..3bd46b4 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
@@ -545,7 +545,8 @@ bool dpu_encoder_use_dsc_merge(struct drm_encoder *drm_enc)
  static struct msm_display_topology dpu_encoder_get_topology(
struct dpu_encoder_virt *dpu_enc,
struct dpu_kms *dpu_kms,
-   struct drm_display_mode *mode)
+   struct drm_display_mode *mode,
+   struct drm_crtc_state *crtc_state)
  {
struct msm_display_topology topology = {0};
int i, intf_count = 0;
@@ -573,11 +574,9 @@ static struct msm_display_topology 
dpu_encoder_get_topology(
else
topology.num_lm = (mode->hdisplay > MAX_HDISPLAY_SPLIT) ? 2 : 1;
  
-	if (dpu_enc->disp_info.intf_type == DRM_MODE_ENCODER_DSI) {

-   if (dpu_kms->catalog->dspp &&
-   (dpu_kms->catalog->dspp_count >= topology.num_lm))
-   topology.num_dspp = topology.num_lm;
-   }
+   if (dpu_kms->catalog->dspp &&
+   crtc_state->ctm && (dpu_kms->catalog->dspp_count >= 
topology.num_lm))


Multiline-if-clause is typically indented with two tabs, not a half tab
(4 spaces).


I tend to disagree here. Lately I have mostly seen it being indented to 
the opening parenthesis, so that nested statements also indent nicely.



Nit: swap the && here?  dspp and dspp_count are related, so check ctm
first or last but not in the middle - makes reading easier.


I think we can ignore dpu_kms->catalog->dspp completely. Checking
dspp_count should be enough for the purpose of the check (and note, the
check for dspp/dspp_count is misleading and should be omitted).





+   topology.num_dspp = topology.num_lm;
  
  	topology.num_enc = 0;

topology.num_intf = intf_count;
@@ -643,7 +642,7 @@ static int dpu_encoder_virt_atomic_check(
}
}
  
-	topology = dpu_encoder_get_topology(dpu_enc, dpu_kms, adj_mode);

+   topology = dpu_encoder_get_topology(dpu_enc, dpu_kms, adj_mode, 
crtc_state);
  
  	/* Reserve dynamic resources now. */

if (!ret) {
--
2.7.4



--
With best wishes
Dmitry
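
For illustration, the two review points in this message (keep the related
catalog checks together, or rely on dspp_count alone) can be sketched as a
standalone helper. This is only a sketch under stated assumptions, not the
driver code: demo_catalog and demo_num_dspp() are hypothetical stand-ins for
the DPU catalog and the topology computation, and a dspp_count of zero is
assumed to mean that no DSPP blocks exist.

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-in for the DPU hardware catalog. */
struct demo_catalog {
	unsigned int dspp_count;	/* number of DSPP blocks, 0 if none */
};

/*
 * Return how many DSPPs to reserve: one per layer mixer, but only when
 * userspace requested a CTM and the catalog has enough DSPP blocks.
 */
static unsigned int demo_num_dspp(const struct demo_catalog *cat,
				  bool has_ctm, unsigned int num_lm)
{
	/* The CTM check comes first; the catalog check stays in one piece. */
	if (has_ctm && cat->dspp_count >= num_lm)
		return num_lm;

	return 0;
}
```

With two DSPPs in the catalog and two layer mixers, the helper returns 2 when a
CTM is set and 0 otherwise.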

Re: [PATCH] drm/virtio: exbuf->fence_fd unmodified on interrupted wait

2023-02-01 Thread Dmitry Osipenko

On 1/27/23 01:58, Ryan Neph wrote:
> An interrupted dma_fence_wait() becomes an -ERESTARTSYS returned
> to userspace ioctl(DRM_IOCTL_VIRTGPU_EXECBUFFER) calls, prompting to
> retry the ioctl(), but the passed exbuf->fence_fd has been reset to -1,
> making the retry attempt fail at sync_file_get_fence().
> 
> The uapi for DRM_IOCTL_VIRTGPU_EXECBUFFER is changed to retain the
> passed value for exbuf->fence_fd when returning ERESTARTSYS or EINTR.
> 
> Fixes: 2cd7b6f08bc4 ("drm/virtio: add in/out fence support for explicit 
> synchronization")
> Signed-off-by: Ryan Neph 
> ---
> 
>  drivers/gpu/drm/virtio/virtgpu_ioctl.c | 9 ++---
>  include/uapi/drm/virtgpu_drm.h | 3 +++
>  2 files changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c 
> b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
> index 9f4a90493aea..ffce4e2a409a 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c
> +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
> @@ -132,6 +132,8 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device 
> *dev, void *data,
>   uint64_t fence_ctx;
>   uint32_t ring_idx;
>  
> + exbuf->fence_fd = -1;
> +
>   fence_ctx = vgdev->fence_drv.context;
>   ring_idx = 0;
>  
> @@ -152,8 +154,6 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device 
> *dev, void *data,
>   ring_idx = exbuf->ring_idx;
>   }
>  
> - exbuf->fence_fd = -1;

Is there any userspace relying on this -1 behaviour? Wouldn't it be better
to remove this offending assignment?

-- 
Best regards,
Dmitry
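
The retry problem described in the quoted commit message can be modelled in a
few lines of plain C. This is a hedged sketch, not the virtio-gpu code:
demo_exbuf, demo_execbuffer() and demo_retry() are hypothetical names, the
interrupted wait is simulated with a counter, and the model assumes that a
negative in-fence fd is rejected with EINVAL.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical miniature of the ioctl argument: fence_fd is in/out. */
struct demo_exbuf {
	int fence_fd;
};

/* How many upcoming calls should be "interrupted" (simulation only). */
static int demo_interrupts_left;

/* Model of the ioctl. keep_fd_on_restart selects the patched behaviour. */
static int demo_execbuffer(struct demo_exbuf *exbuf, bool keep_fd_on_restart)
{
	if (exbuf->fence_fd < 0) {
		errno = EINVAL;		/* no usable in-fence was passed */
		return -1;
	}

	if (!keep_fd_on_restart)
		exbuf->fence_fd = -1;	/* old flow: reset before waiting */

	if (demo_interrupts_left > 0) {
		demo_interrupts_left--;
		errno = EINTR;		/* the wait was interrupted */
		return -1;
	}

	exbuf->fence_fd = -1;		/* success: no out-fence in this model */
	return 0;
}

/* Typical userspace wrapper: call again while the call was interrupted. */
static int demo_retry(struct demo_exbuf *exbuf, bool keep_fd_on_restart)
{
	int ret;

	do {
		ret = demo_execbuffer(exbuf, keep_fd_on_restart);
	} while (ret == -1 && errno == EINTR);

	return ret;
}
```

In this model, one interruption makes the retry fail when the fd is reset
early, and succeed when the passed value is retained across the restart.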

Re: [PATCH v5 7/7] drm/etnaviv: add HWDB entry for VIPNano-QI.7120.0055

2023-02-01 Thread Lucas Stach

Hi Tomeu,

Am Donnerstag, dem 01.12.2022 um 11:30 +0100 schrieb Tomeu Vizoso:
> This is a compute-only module marketed towards AI and vision
> acceleration. This particular version can be found on the Amlogic A311D
> SoC.
> 
> The feature bits are taken from the Khadas downstream kernel driver
> 6.4.4.3.310723AAA.
> 
I just noticed that you seem to have dropped the nn_core_count property
from this HWDB entry when reworking the series according to my
feedback. I assume that you are fine with me fixing this up in the
etnaviv tree by using the hunk from your previous patch "[PATCH v4 6/7]
drm/etnaviv: Add nn_core_count to chip feature struct".

If you see any issue with that, please let me know.

Regards,
Lucas

> Signed-off-by: Tomeu Vizoso 
> ---
>  drivers/gpu/drm/etnaviv/etnaviv_hwdb.c | 31 ++
>  1 file changed, 31 insertions(+)
> 
> diff --git a/drivers/gpu/drm/etnaviv/etnaviv_hwdb.c 
> b/drivers/gpu/drm/etnaviv/etnaviv_hwdb.c
> index 44df273a5aae..66b8ad6c7d26 100644
> --- a/drivers/gpu/drm/etnaviv/etnaviv_hwdb.c
> +++ b/drivers/gpu/drm/etnaviv/etnaviv_hwdb.c
> @@ -134,6 +134,37 @@ static const struct etnaviv_chip_identity 
> etnaviv_chip_identities[] = {
>   .minor_features10 = 0x90044250,
>   .minor_features11 = 0x0024,
>   },
> + {
> + .model = 0x8000,
> + .revision = 0x7120,
> + .product_id = 0x45080009,
> + .customer_id = 0x88,
> + .eco_id = 0,
> + .stream_count = 8,
> + .register_max = 64,
> + .thread_count = 256,
> + .shader_core_count = 1,
> + .vertex_cache_size = 16,
> + .vertex_output_buffer_size = 1024,
> + .pixel_pipes = 1,
> + .instruction_count = 512,
> + .num_constants = 320,
> + .buffer_size = 0,
> + .varyings_count = 16,
> + .features = 0xe0287cac,
> + .minor_features0 = 0xc1799eff,
> + .minor_features1 = 0xfefbfadb,
> + .minor_features2 = 0xeb9d6fbf,
> + .minor_features3 = 0xedfffced,
> + .minor_features4 = 0xd30dafc7,
> + .minor_features5 = 0x7b5ac333,
> + .minor_features6 = 0xfc8ee200,
> + .minor_features7 = 0x03fffa6f,
> + .minor_features8 = 0x00fe0ef0,
> + .minor_features9 = 0x0088003c,
> + .minor_features10 = 0x108048c0,
> + .minor_features11 = 0x0010,
> + },
>  };
>  
>  bool etnaviv_fill_identity_from_hwdb(struct etnaviv_gpu *gpu)

[PULL] drm-intel-gt-next

2023-02-01 Thread Tvrtko Ursulin

Hi Dave, Daniel,

Here goes the final pull request for 6.3.

Aside from a few fixes, the rest is split between refactoring of the
workarounds code and correcting some workaround placement to correctly
align for new platforms, and converting the GuC code to use dedicated
logging macros, as was done for the whole of the GT in the previous pull.

Some minor random future platform prep as well, and a back merge to fix
a conflict between drm-intel-next and drm-intel-gt-next.

Regards,

Tvrtko

drm-intel-gt-next-2023-02-01:
Driver Changes:

Fixes/improvements/new stuff:

- Fix bcs default context on Meteorlake (Lucas De Marchi)
- GAM registers don't need to be re-applied on engine resets (Matt Roper)
- Correct implementation of Wa_18018781329 (Matt Roper)
- Avoid potential vm use-after-free (Rob Clark)
- GuC error capture fixes (John Harrison)
- Fix potential bit_17 double-free (Rob Clark)
- Don't complain about missing regs on MTL (John Harrison)

Future platform enablement:

- Convert PSS_MODE2 to multicast register (Gustavo Sousa)
- Move/adjust register definitions related to Wa_22011450934 (Matt Roper)
- Move LSC_CHICKEN_BIT* workarounds to correct function (Gustavo Sousa)
- Document where to implement register workarounds (Gustavo Sousa)
- Use uabi engines for the default engine map (Tvrtko Ursulin)
- Flush all tiles on test exit (Tvrtko Ursulin)
- Annotate a couple more workaround registers as MCR (Matt Roper)

Driver refactors:

- Add and use GuC oriented print macros (Michal Wajdeczko)

Miscellaneous:

- Fix intel_selftest_modify_policy argument types (Arnd Bergmann)

Backmerges:

Merge drm/drm-next into drm-intel-gt-next (for conflict resolution) (Tvrtko Ursulin)

The following changes since commit 045e8d102f44ad75dca0b0ec9eede15ea89da673:

  Merge tag 'drm-intel-gt-next-2023-01-18' of 
git://anongit.freedesktop.org/drm/drm-intel into drm-next (2023-01-24 16:20:43 
+0100)

are available in the Git repository at:

  git://anongit.freedesktop.org/drm/drm-intel tags/drm-intel-gt-next-2023-02-01

for you to fetch changes up to 003e11ed2ef4af01b808f0f193eaa5a32f32383b:

  drm/i915/mtl: Wa_22011802037: don't complain about missing regs on MTL 
(2023-01-31 15:17:30 -0800)




Arnd Bergmann (1):
  drm/i915/selftest: fix intel_selftest_modify_policy argument types

Gustavo Sousa (3):
  drm/i915/doc: Document where to implement register workarounds
  drm/i915/gt: Move LSC_CHICKEN_BIT* workarounds to correct function
  drm/i915/gt: Convert PSS_MODE2 to multicast register

John Harrison (9):
  drm/i915/guc: Fix locking when searching for a hung request
  drm/i915: Fix request ref counting during error capture & debugfs dump
  drm/i915: Fix up locking around dumping requests lists
  drm/i915: Allow error capture without a request
  drm/i915: Allow error capture of a pending request
  drm/i915/guc: Look for a guilty context when an engine reset fails
  drm/i915/guc: Add a debug print on GuC triggered reset
  drm/i915/guc: Rename GuC register state capture node to be more obvious
  drm/i915/mtl: Wa_22011802037: don't complain about missing regs on MTL

Lucas De Marchi (1):
  drm/i915/mtl: Fix bcs default context

Matt Roper (4):
  drm/i915: Move/adjust register definitions related to Wa_22011450934
  drm/i915/xehp: GAM registers don't need to be re-applied on engine resets
  drm/i915/mtl: Correct implementation of Wa_18018781329
  drm/i915/xehp: Annotate a couple more workaround registers as MCR

Michal Wajdeczko (8):
  drm/i915/guc: Add GuC oriented print macros
  drm/i915/guc: Update GuC messages in intel_guc.c
  drm/i915/guc: Update GuC messages in

Re: [v1 2/3] drm/msm/disp/dpu1: add dspps into reservation if there is a ctm request

2023-02-01 Thread Marijn Suijten

On 2023-02-01 12:16:05, Marijn Suijten wrote:

> > +   if (dpu_kms->catalog->dspp &&
> > +   crtc_state->ctm && (dpu_kms->catalog->dspp_count >= 
> > topology.num_lm))
> 
> Multiline-if-clause is typically indented with two tabs, not a half tab
> (4 spaces).

Hmm, Dmitry requested indent-to-opening-parenthesis in v1 instead; and
the majority of dpu1 uses the worst version of all: indent with a single
tab so that the contents line up with the code block below.  Dmitry,
I'll leave final say to you (and fix it up in my own DPU series
accordingly too).

- Marijn

Re: [v1 1/3] drm/msm/disp/dpu1: clear dspp reservations in rm release

2023-02-01 Thread Marijn Suijten

On 2023-02-01 12:10:33, Marijn Suijten wrote:
> On 2023-01-30 07:21:30, Kalyan Thota wrote:
> > Clear dspp reservations from the global state during
> > rm release
> 
> DSPP, and a period at the end of a sentence.  Also noticing inconsistent
> linebreaks across these patches, stick to 72 chars.
> 
> > Signed-off-by: Kalyan Thota 
> > Reviewed-by: Dmitry Baryshkov 
> 
> Reviewed-by: Marijn Suijten 
> 
> Should this be considered a fix to be backported, or is it harmless?  If
> so:
> 
> Fixes: e47616df008b ("drm/msm/dpu: add support for color processing blocks in 
> dpu driver")

Right, it should, Dmitry also requested this in v1.

- Marijn

Re: [v1 2/3] drm/msm/disp/dpu1: add dspps into reservation if there is a ctm request

2023-02-01 Thread Marijn Suijten

On 2023-01-30 07:21:31, Kalyan Thota wrote:
> Add dspp blocks into the topology for reservation, if there is a ctm
> request for that composition.

DSPP

> Changes in v1:
> - Minor nits (Dmitry)

This should go below the triple dashes, so that it /does not/ become
part of the patch/commit that is applied to the tree (where review
history is irrelevant as it can be searched for separately).

> Signed-off-by: Kalyan Thota 
> Reviewed-by: Dmitry Baryshkov 
> ---
>  drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 13 ++---
>  1 file changed, 6 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c 
> b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> index 9c6817b..3bd46b4 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c
> @@ -545,7 +545,8 @@ bool dpu_encoder_use_dsc_merge(struct drm_encoder 
> *drm_enc)
>  static struct msm_display_topology dpu_encoder_get_topology(
>   struct dpu_encoder_virt *dpu_enc,
>   struct dpu_kms *dpu_kms,
> - struct drm_display_mode *mode)
> + struct drm_display_mode *mode,
> + struct drm_crtc_state *crtc_state)
>  {
>   struct msm_display_topology topology = {0};
>   int i, intf_count = 0;
> @@ -573,11 +574,9 @@ static struct msm_display_topology 
> dpu_encoder_get_topology(
>   else
>   topology.num_lm = (mode->hdisplay > MAX_HDISPLAY_SPLIT) ? 2 : 1;
>  
> - if (dpu_enc->disp_info.intf_type == DRM_MODE_ENCODER_DSI) {
> - if (dpu_kms->catalog->dspp &&
> - (dpu_kms->catalog->dspp_count >= topology.num_lm))
> - topology.num_dspp = topology.num_lm;
> - }
> + if (dpu_kms->catalog->dspp &&
> + crtc_state->ctm && (dpu_kms->catalog->dspp_count >= 
> topology.num_lm))

Multiline-if-clause is typically indented with two tabs, not a half tab
(4 spaces).

Nit: swap the && here?  dspp and dspp_count are related, so check ctm
first or last but not in the middle - makes reading easier.

> + topology.num_dspp = topology.num_lm;
>  
>   topology.num_enc = 0;
>   topology.num_intf = intf_count;
> @@ -643,7 +642,7 @@ static int dpu_encoder_virt_atomic_check(
>   }
>   }
>  
> - topology = dpu_encoder_get_topology(dpu_enc, dpu_kms, adj_mode);
> + topology = dpu_encoder_get_topology(dpu_enc, dpu_kms, adj_mode, 
> crtc_state);
>  
>   /* Reserve dynamic resources now. */
>   if (!ret) {
> -- 
> 2.7.4
>

Re: [v1 1/3] drm/msm/disp/dpu1: clear dspp reservations in rm release

2023-02-01 Thread Marijn Suijten

On 2023-01-30 07:21:30, Kalyan Thota wrote:
> Clear dspp reservations from the global state during
> rm release

DSPP, and a period at the end of a sentence.  Also noticing inconsistent
linebreaks across these patches, stick to 72 chars.

> Signed-off-by: Kalyan Thota 
> Reviewed-by: Dmitry Baryshkov 

Reviewed-by: Marijn Suijten 

Should this be considered a fix to be backported, or is it harmless?  If
so:

Fixes: e47616df008b ("drm/msm/dpu: add support for color processing blocks in 
dpu driver")

> ---
>  drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c 
> b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
> index 73b3442..718ea0a 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
> @@ -572,6 +572,8 @@ void dpu_rm_release(struct dpu_global_state *global_state,
>   ARRAY_SIZE(global_state->ctl_to_enc_id), enc->base.id);
>   _dpu_rm_clear_mapping(global_state->dsc_to_enc_id,
>   ARRAY_SIZE(global_state->dsc_to_enc_id), enc->base.id);
> + _dpu_rm_clear_mapping(global_state->dspp_to_enc_id,
> + ARRAY_SIZE(global_state->dspp_to_enc_id), enc->base.id);
>  }
>  
>  int dpu_rm_reserve(
> -- 
> 2.7.4
>
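
As a rough illustration of what the added call does on release, here is a
self-contained sketch. It is not the dpu_rm.c implementation: the demo_ names
are hypothetical, the arrays are shortened to four entries, and the sketch
assumes that an entry of 0 means "not reserved by any encoder".

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical miniature of the global state: block index -> encoder id. */
struct demo_global_state {
	uint32_t dsc_to_enc_id[4];
	uint32_t dspp_to_enc_id[4];
};

/* Drop every reservation in one mapping that belongs to enc_id. */
static void demo_clear_mapping(uint32_t *map, size_t cnt, uint32_t enc_id)
{
	for (size_t i = 0; i < cnt; i++) {
		if (map[i] == enc_id)
			map[i] = 0;	/* assumed "unreserved" marker */
	}
}

/* Release everything an encoder holds, now including its DSPPs. */
static void demo_release(struct demo_global_state *gs, uint32_t enc_id)
{
	demo_clear_mapping(gs->dsc_to_enc_id,
			   DEMO_ARRAY_SIZE(gs->dsc_to_enc_id), enc_id);
	demo_clear_mapping(gs->dspp_to_enc_id,
			   DEMO_ARRAY_SIZE(gs->dspp_to_enc_id), enc_id);
}
```

Without the second call, DSPP entries owned by the released encoder would stay
marked as reserved and could not be handed to another encoder.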

Re: [PATCH] drm/panel: boe-tv101wum-nl6: Ensure DSI writes succeed during disable

2023-02-01 Thread Thomas Zimmermann




Am 31.01.23 um 22:27 schrieb Doug Anderson:

Hi,

On Thu, Jan 26, 2023 at 4:52 PM Doug Anderson  wrote:


Hi,

On Wed, Jan 18, 2023 at 1:34 PM Doug Anderson  wrote:


Hi,

On Thu, Jan 5, 2023 at 7:01 PM Stephen Boyd  wrote:


The unprepare sequence has started to fail after moving to panel bridge
code in the msm drm driver (commit 007ac0262b0d ("drm/msm/dsi: switch to
DRM_PANEL_BRIDGE")). You'll see messages like this in the kernel logs:

panel-boe-tv101wum-nl6 ae94000.dsi.0: failed to set panel off: -22

This is because boe_panel_enter_sleep_mode() needs an operating DSI link
to set the panel into sleep mode. Performing those writes in the
unprepare phase of bridge ops is too late, because the link has already
been torn down by the DSI controller in post_disable, i.e. the PHY has
been disabled, etc. See dsi_mgr_bridge_post_disable() for more details
on the DSI .

Split the unprepare function into a disable part and an unprepare part.
For now, just the DSI writes to enter sleep mode are put in the disable
function. This fixes the panel off routine and keeps the panel happy.

My Wormdingler has an integrated touchscreen that stops responding to
touch if the panel is only half disabled too. This patch fixes it. And
finally, this saves power when the screen is off because without this
fix the regulators for the panel are left enabled when nothing is being
displayed on the screen.

Fixes: 007ac0262b0d ("drm/msm/dsi: switch to DRM_PANEL_BRIDGE")
Fixes: a869b9db7adf ("drm/panel: support for boe tv101wum-nl6 wuxga dsi video mode 
panel")
Cc: yangcong 
Cc: Douglas Anderson 
Cc: Jitao Shi 
Cc: Sam Ravnborg 
Cc: Rob Clark 
Cc: Dmitry Baryshkov 
Signed-off-by: Stephen Boyd 
---
  drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c | 16 
  1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c 
b/drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c
index 857a2f0420d7..c924f1124ebc 100644
--- a/drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c
+++ b/drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c
@@ -1193,14 +1193,11 @@ static int boe_panel_enter_sleep_mode(struct boe_panel 
*boe)
 return 0;
  }

-static int boe_panel_unprepare(struct drm_panel *panel)
+static int boe_panel_disable(struct drm_panel *panel)
  {
 struct boe_panel *boe = to_boe_panel(panel);
 int ret;

-   if (!boe->prepared)
-   return 0;
-
 ret = boe_panel_enter_sleep_mode(boe);
 if (ret < 0) {
 dev_err(panel->dev, "failed to set panel off: %d\n", ret);
@@ -1209,6 +1206,16 @@ static int boe_panel_unprepare(struct drm_panel *panel)

 msleep(150);

+   return 0;
+}
+
+static int boe_panel_unprepare(struct drm_panel *panel)
+{
+   struct boe_panel *boe = to_boe_panel(panel);
+
+   if (!boe->prepared)
+   return 0;
+
 if (boe->desc->discharge_on_disable) {
 regulator_disable(boe->avee);
 regulator_disable(boe->avdd);
@@ -1528,6 +1535,7 @@ static enum drm_panel_orientation 
boe_panel_get_orientation(struct drm_panel *pa
  }

  static const struct drm_panel_funcs boe_panel_funcs = {
+   .disable = boe_panel_disable,
 .unprepare = boe_panel_unprepare,
 .prepare = boe_panel_prepare,
 .enable = boe_panel_enable,
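
The ordering problem that the commit message describes can be modelled outside
the kernel. The following is a sketch under stated assumptions, not the panel
driver: demo_panel and the demo_ functions are hypothetical, the DSI link is
reduced to a boolean, and a write on a torn-down link is assumed to fail with
-EINVAL (22 on Linux, matching the -22 in the log above).

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the panel and DSI link state; not driver code. */
struct demo_panel {
	bool link_up;	/* DSI link usable for command writes */
	bool asleep;	/* panel accepted the sleep-mode commands */
	bool powered;	/* panel regulators enabled */
};

/* Model of the sleep-mode DSI writes: they need a live link. */
static int demo_enter_sleep_mode(struct demo_panel *p)
{
	if (!p->link_up)
		return -EINVAL;
	p->asleep = true;
	return 0;
}

/* Model of the DSI host post_disable step tearing the link down. */
static void demo_host_post_disable(struct demo_panel *p)
{
	p->link_up = false;
}

/* Before the patch: sleep commands are sent from unprepare, too late. */
static int demo_teardown_old(struct demo_panel *p)
{
	int ret;

	demo_host_post_disable(p);
	ret = demo_enter_sleep_mode(p);
	if (ret < 0)
		return ret;	/* early return: regulators stay enabled */
	p->powered = false;
	return 0;
}

/* After the patch: disable sends the commands while the link is up. */
static int demo_teardown_new(struct demo_panel *p)
{
	int ret = demo_enter_sleep_mode(p);	/* .disable step */

	demo_host_post_disable(p);
	p->powered = false;			/* .unprepare step */
	return ret;
}
```

In the old ordering the model returns -EINVAL and leaves the panel powered,
which corresponds to the power and touchscreen symptoms mentioned above; in the
new ordering the panel goes to sleep and is powered off.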


As mentioned by Stephen, my initial reaction was that this felt
asymmetric. We were moving some stuff from unprepare() to disable()
and it felt like that would mean we would also need to move something
from prepare() to enable. Initially I thought maybe that "something"
was all of boe_panel_init_dcs_cmd() but I guess that didn't work.

I don't truly have a reason that this _has_ to be symmetric. I was
initially worried that there might be some place where we call
pre_enable(), then never call enable() / disable(), and then call
post_disable(). That could have us in a bad state because we'd never
enter sleep mode / turn the display off. However (as I think I've
discovered before and just forgot), I don't think this is possible
because we always call pre-enable() and enable() together. Also, as
mentioned by Sam, we're about to fully shut the panel's power off so
(unless it's on a shared rail) it probably doesn't really matter.

Thus, I'd be OK with:

Reviewed-by: Douglas Anderson 

I'm also happy to land this (adding Cc: stable) to drm-misc-fixes if
nobody has any objections (also happy if someone else wants to land
it). I guess the one worry I have is that somehow this could break
something for one of the other 8 panels that this driver supports (or
it could have bad interactions with the display controller used on a
board with one of these panels?). Maybe we should have "Cc: stable"
off just to give it extra bake time? ...and maybe even push to
drm-misc-next instead of -fixes again to give extra bake time?


This thread has gone silent. I'll plan to land the patch in
drm-misc-next early next week, maybe Monday, _without_

Re: [PATCH 0/3] Reserve dspps based on user request

2023-02-01 Thread Marijn Suijten

On 2023-01-30 21:18:30, Dmitry Baryshkov wrote:
> On 30/01/2023 17:21, Kalyan Thota wrote:
> > This series will enable color features on sc7280 target which has primary 
> > panel as eDP
> > 
> > The series removes dspp allocation based on encoder type and allows the 
> > dspp reservation
> > based on user request via ctm.
> > 
> > The series will release/reserve the dpu resources when ever there is a 
> > topology change
> > to suit the new requirements.
> 
> Nit: the subject of the cover letter should include the version, if you 
> are including one into the individual patches Subject.

Indeed this makes it hard to tell the versions apart, and lore also
confusingly bundles both series in "loose matches on Subject: below".

Nit ^2: and individual patches should still have the PATCH moniker, i.e.
[PATCH v2 1/3].  git format-patch -v2 --cover-letter ... takes care of
/all this/ for you.

And one more: as DSPP is an abbreviation, can we capitalize it?  So
DSPP / DSPPs in these titles?

> > 
> > Kalyan Thota (3):
> >drm/msm/disp/dpu1: clear dspp reservations in rm release
> >drm/msm/disp/dpu1: add dspps into reservation if there is a ctm
> >  request
> >drm/msm/disp/dpu1: reserve the resources on topology change

We just discussed in the DSC series that the subsystem prefix is
drm/msm/dpu.

- Marijn

> >   drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.h|  1 +
> >   drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.c | 54 
> > +++--
> >   drivers/gpu/drm/msm/disp/dpu1/dpu_encoder.h |  4 ++-
> >   drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c |  2 +-
> >   drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c  |  6 ++--
> >   5 files changed, 50 insertions(+), 17 deletions(-)
> > 
> 
> -- 
> With best wishes
> Dmitry
>

Re: [PATCH 02/14] drm/msm/a6xx: Extend UBWC config

2023-02-01 Thread Konrad Dybcio




On 1.02.2023 10:30, Akhil P Oommen wrote:
> On 1/26/2023 8:46 PM, Konrad Dybcio wrote:
>> Port setting min_access_length, ubwc_mode and upper_bit from downstream.
>> Values were validated using downstream device trees for SM8[123]50 and
>> left default (as per downstream) elsewhere.
>>
>> Signed-off-by: Konrad Dybcio 
>> ---
>>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 26 ++
>>  1 file changed, 18 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
>> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>> index c5f5d0bb3fdc..ad5d791b804c 100644
>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
>> @@ -786,17 +786,22 @@ static void a6xx_set_cp_protect(struct msm_gpu *gpu)
>>  static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
>>  {
>>  struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
>> -u32 lower_bit = 2;
>> +u32 lower_bit = 1;
> Wouldn't this break a630?

// highest_bank_bit = 15 on 845, from "qcom,highest-bank-bit" in dt

bit = adreno_dev->highest_bank_bit ? adreno_dev->highest_bank_bit - 13 : 0;
// => bit = 2

lower_bit = bit & 0x3;
// => lower_bit = 2

Yes it would! Thanks for catching that, I'll add the A630 case in v2.

The 1 default value comes from the fact that highest_bank_bit is 13
when it's unset in dt, which makes lower_bit 1.
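
The downstream expression quoted above can be wrapped in a small function to
check the SDM845 case by hand. This is only a sketch: demo_ubwc_lower_bit() is
a hypothetical name, and it assumes a non-zero highest_bank_bit is never below
13.

```c
/*
 * Mirror of the two downstream statements quoted above:
 *   bit = highest_bank_bit ? highest_bank_bit - 13 : 0;
 *   lower_bit = bit & 0x3;
 * Assumes highest_bank_bit is either 0 or at least 13.
 */
static unsigned int demo_ubwc_lower_bit(unsigned int highest_bank_bit)
{
	unsigned int bit = highest_bank_bit ? highest_bank_bit - 13 : 0;

	return bit & 0x3;
}
```

Calling demo_ubwc_lower_bit(15) yields 2, the value that a630 needs and that
the patch changed away from.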


Konrad

> 
> -Akhil.
>> +u32 upper_bit = 0;
>>  u32 amsbc = 0;
>>  u32 rgb565_predicator = 0;
>>  u32 uavflagprd_inv = 0;
>> +u32 min_acc_len = 0;
>> +u32 ubwc_mode = 0;
>>  
>>  /* a618 is using the hw default values */
>>  if (adreno_is_a618(adreno_gpu))
>>  return;
>>  
>> -if (adreno_is_a640_family(adreno_gpu))
>> +if (adreno_is_a640_family(adreno_gpu)) {
>>  amsbc = 1;
>> +lower_bit = 2;
>> +}
>>  
>>  if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu)) {
>>  /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
>> @@ -807,18 +812,23 @@ static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
>>  }
>>  
>>  if (adreno_is_7c3(adreno_gpu)) {
>> -lower_bit = 1;
>>  amsbc = 1;
>>  rgb565_predicator = 1;
>>  uavflagprd_inv = 2;
>>  }
>>  
>>  gpu_write(gpu, REG_A6XX_RB_NC_MODE_CNTL,
>> -rgb565_predicator << 11 | amsbc << 4 | lower_bit << 1);
>> -gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, lower_bit << 1);
>> -gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL,
>> -uavflagprd_inv << 4 | lower_bit << 1);
>> -gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, lower_bit << 21);
>> +  rgb565_predicator << 11 | upper_bit << 10 | amsbc << 4 |
>> +  min_acc_len << 3 | lower_bit << 1 | ubwc_mode);
>> +
>> +gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, upper_bit << 4 |
>> +  min_acc_len << 3 | lower_bit << 1 | ubwc_mode);
>> +
>> +gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL, upper_bit << 10 |
>> +  uavflagprd_inv << 4 | min_acc_len << 3 |
>> +  lower_bit << 1 | ubwc_mode);
>> +
>> +gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, min_acc_len << 23 | lower_bit 
>> << 21);
>>  }
>>  
>>  static int a6xx_cp_init(struct msm_gpu *gpu)
>

Re: [PATCH 2/2] accel/ivpu: avoid duplciate assignment

2023-02-01 Thread Thomas Zimmermann




Am 31.01.23 um 15:00 schrieb Jacek Lawrynowicz:

Applied to drm-misc-next. Thanks.


I have cherry-picked the patch into drm-misc-next-fixes.



On 26.01.2023 17:37, Arnd Bergmann wrote:

From: Arnd Bergmann 

With extra warnings enabled, gcc warns about two assignments
of the same .mmap callback:

In file included from drivers/accel/ivpu/ivpu_drv.c:10:
include/drm/drm_accel.h:31:27: error: initialized field overwritten 
[-Werror=override-init]
31 | .mmap   = drm_gem_mmap
   |   ^~~~
drivers/accel/ivpu/ivpu_drv.c:360:9: note: in expansion of macro 
'DRM_ACCEL_FOPS'
   360 | DRM_ACCEL_FOPS,
   | ^~

Remove the unused local assignment.

Fixes: 20709aa9435b ("accel: Add .mmap to DRM_ACCEL_FOPS")
Signed-off-by: Arnd Bergmann 
---
  drivers/accel/ivpu/ivpu_drv.c | 1 -
  1 file changed, 1 deletion(-)

diff --git a/drivers/accel/ivpu/ivpu_drv.c b/drivers/accel/ivpu/ivpu_drv.c
index 2bc2f1b90671..a29e8ee0dce6 100644
--- a/drivers/accel/ivpu/ivpu_drv.c
+++ b/drivers/accel/ivpu/ivpu_drv.c
@@ -356,7 +356,6 @@ int ivpu_shutdown(struct ivpu_device *vdev)
  
  static const struct file_operations ivpu_fops = {

.owner  = THIS_MODULE,
-   .mmap   = drm_gem_mmap,
DRM_ACCEL_FOPS,
  };
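
For readers unfamiliar with the warning, the following standalone sketch
reproduces the situation. It is not the ivpu code: demo_fops, the demo_
callbacks and DEMO_SHARED_FOPS are hypothetical stand-ins for struct
file_operations and DRM_ACCEL_FOPS. In C, when the same member is initialized
twice, the later initializer wins, so the local .mmap assignment has no effect.
The pragma only silences the warning so that the sketch builds with strict
flags; remove it to see the diagnostic.

```c
/* Hypothetical miniature of struct file_operations. */
struct demo_fops {
	const char *owner;
	int (*open)(void);
	int (*mmap)(void);
};

static int demo_shared_open(void) { return 0; }
static int demo_shared_mmap(void) { return 2; }
static int demo_local_mmap(void) { return 1; }

/* Shared initializer fragment, playing the role of DRM_ACCEL_FOPS. */
#define DEMO_SHARED_FOPS \
	.open = demo_shared_open, \
	.mmap = demo_shared_mmap

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Woverride-init"
static const struct demo_fops demo_ops = {
	.owner = "demo",
	.mmap = demo_local_mmap,	/* replaced by the macro's .mmap below */
	DEMO_SHARED_FOPS,
};
#pragma GCC diagnostic pop
```

Here demo_ops.mmap points at demo_shared_mmap, which is why dropping the local
assignment, as the patch does, changes nothing at run time.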
  


--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev



Re: [PATCH] [v2] accel: fix CONFIG_DRM dependencies

2023-02-01 Thread Thomas Zimmermann




Am 31.01.23 um 10:20 schrieb Jacek Lawrynowicz:

On 31.01.2023 09:53, Oded Gabbay wrote:

On Sat, Jan 28, 2023 at 12:15 AM Arnd Bergmann  wrote:


From: Arnd Bergmann 

At the moment, accel drivers can be built-in even with CONFIG_DRM=m,
but this causes a link failure:

x86_64-linux-ld: drivers/accel/ivpu/ivpu_drv.o: in function `ivpu_dev_init':
ivpu_drv.c:(.text+0x1535): undefined reference to `drmm_kmalloc'
x86_64-linux-ld: ivpu_drv.c:(.text+0x1562): undefined reference to 
`drmm_kmalloc'
x86_64-linux-ld: drivers/accel/ivpu/ivpu_drv.o: in function `ivpu_remove':
ivpu_drv.c:(.text+0x1faa): undefined reference to `drm_dev_unregister'
x86_64-linux-ld: drivers/accel/ivpu/ivpu_drv.o: in function `ivpu_probe':
ivpu_drv.c:(.text+0x1fef): undefined reference to `__devm_drm_dev_alloc'

The problem is that DRM_ACCEL is a 'bool' symbol, so drivers that
only depend on DRM_ACCEL but not also on DRM do not see the restriction
to =m configs.

To ensure that each accel driver has an implied dependency on CONFIG_DRM,
enclose the entire Kconfig file in an if/endif check.

Fixes: 8bf4889762a8 ("drivers/accel: define kconfig and register a new major")
Signed-off-by: Arnd Bergmann 
---
v2: rearrange the DRM dependency rather than requiring DRM to be built-in
---
  drivers/accel/Kconfig | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/accel/Kconfig b/drivers/accel/Kconfig
index 834863902e16..c437206aa3f1 100644
--- a/drivers/accel/Kconfig
+++ b/drivers/accel/Kconfig
@@ -6,9 +6,10 @@
  # as, but not limited to, Machine-Learning and Deep-Learning acceleration
  # devices
  #
+if DRM
+
  menuconfig DRM_ACCEL
 bool "Compute Acceleration Framework"
-   depends on DRM
 help
   Framework for device drivers of compute acceleration devices, such
   as, but not limited to, Machine-Learning and Deep-Learning
@@ -25,3 +26,5 @@ menuconfig DRM_ACCEL

  source "drivers/accel/habanalabs/Kconfig"
  source "drivers/accel/ivpu/Kconfig"
+
+endif
--
2.39.0


This patch is:
Reviewed-by: Oded Gabbay 


Applied to drm-misc-next.


I have cherry-picked the patch into drm-misc-next-fixes.


Thanks.


--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev



Re: [PATCH v3] drm/shmem: Cleanup drm_gem_shmem_create_with_handle()

2023-02-01 Thread Thomas Zimmermann


I have cherry-picked the patch into drm-misc-next-fixes.

Am 23.01.23 um 16:48 schrieb Rob Clark:

From: Rob Clark 

Once we create the handle, the handle owns the reference.  Currently
nothing was doing anything with the shmem ptr after the handle was
created, but let's change drm_gem_shmem_create_with_handle() to not
return the pointer, so-as to not encourage problematic use of this
function in the future.  As a bonus, it makes the code a bit cleaner.

Signed-off-by: Rob Clark 
---
  drivers/gpu/drm/drm_gem_shmem_helper.c | 13 -
  1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index f21f47737817..42c496c5f92c 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -415,7 +415,7 @@ void drm_gem_shmem_vunmap(struct drm_gem_shmem_object *shmem,
 }
 EXPORT_SYMBOL(drm_gem_shmem_vunmap);
 
-static struct drm_gem_shmem_object *
+static int
 drm_gem_shmem_create_with_handle(struct drm_file *file_priv,
 				 struct drm_device *dev, size_t size,
 				 uint32_t *handle)
@@ -425,7 +425,7 @@ drm_gem_shmem_create_with_handle(struct drm_file *file_priv,
 
 	shmem = drm_gem_shmem_create(dev, size);
 	if (IS_ERR(shmem))
-		return shmem;
+		return PTR_ERR(shmem);
 
 	/*
 	 * Allocate an id of idr table where the obj is registered
@@ -434,10 +434,8 @@ drm_gem_shmem_create_with_handle(struct drm_file *file_priv,
 	ret = drm_gem_handle_create(file_priv, &shmem->base, handle);
 	/* drop reference from allocate - handle holds it now. */
 	drm_gem_object_put(&shmem->base);
-	if (ret)
-		return ERR_PTR(ret);
 
-	return shmem;
+	return ret;
 }
 
 /* Update madvise status, returns true if not purged, else
@@ -520,7 +518,6 @@ int drm_gem_shmem_dumb_create(struct drm_file *file, struct drm_device *dev,
 			      struct drm_mode_create_dumb *args)
 {
 	u32 min_pitch = DIV_ROUND_UP(args->width * args->bpp, 8);
-	struct drm_gem_shmem_object *shmem;
 
 	if (!args->pitch || !args->size) {
 		args->pitch = min_pitch;
@@ -533,9 +530,7 @@ int drm_gem_shmem_dumb_create(struct drm_file *file, struct drm_device *dev,
 		args->size = PAGE_ALIGN(args->pitch * args->height);
 	}
 
-	shmem = drm_gem_shmem_create_with_handle(file, dev, args->size, &args->handle);
-
-	return PTR_ERR_OR_ZERO(shmem);
+	return drm_gem_shmem_create_with_handle(file, dev, args->size, &args->handle);
 }
 EXPORT_SYMBOL_GPL(drm_gem_shmem_dumb_create);
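
The calling convention this patch moves to can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel code: `ERR_PTR()`, `PTR_ERR()` and `IS_ERR()` are simplified local re-implementations, and `fake_obj`, `fake_create()`, `fake_put()`, `fake_handle_create()`, `handle_table` and `create_with_handle()` are hypothetical stand-ins for the GEM object, its refcounting and the handle table.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long err) { return (void *)err; }
static inline long PTR_ERR(const void *p) { return (long)p; }
static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical refcounted object standing in for a GEM object. */
struct fake_obj { int refcount; };

static struct fake_obj *fake_create(size_t size)
{
	struct fake_obj *obj;

	if (size == 0)
		return ERR_PTR(-EINVAL);
	obj = malloc(sizeof(*obj));
	if (!obj)
		return ERR_PTR(-ENOMEM);
	obj->refcount = 1;	/* reference held by the creator */
	return obj;
}

static void fake_put(struct fake_obj *obj)
{
	if (--obj->refcount == 0)
		free(obj);
}

/* Hypothetical one-slot handle table; a handle takes its own reference. */
static struct fake_obj *handle_table[1];

static int fake_handle_create(struct fake_obj *obj, uint32_t *handle)
{
	if (handle_table[0])
		return -ENOSPC;
	obj->refcount++;
	handle_table[0] = obj;
	*handle = 1;
	return 0;
}

/* Returns 0 or a negative errno; the object pointer never escapes. */
static int create_with_handle(size_t size, uint32_t *handle)
{
	struct fake_obj *obj = fake_create(size);
	int ret;

	if (IS_ERR(obj))
		return (int)PTR_ERR(obj);

	ret = fake_handle_create(obj, handle);
	/* Drop the creation reference; on success the handle holds the object. */
	fake_put(obj);
	return ret;
}
```

Because the caller only ever sees an `int`, it has no pointer it could mistakenly keep using after the creation reference has been dropped.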
  



Re: [PATCH] dma-buf: actually set signaling bit for private sub fences

2023-02-01 Thread Thomas Zimmermann




Am 27.01.23 um 08:49 schrieb Christian König:

Am 26.01.23 um 23:25 schrieb Danilo Krummrich:

On 1/26/23 07:58, Christian König wrote:

Am 26.01.23 um 01:28 schrieb Danilo Krummrich:

In dma_fence_allocate_private_stub() set the signaling bit of the newly
allocated private stub fence rather than the signaling bit of the
shared dma_fence_stub.

Fixes: c85d00d4fd8b ("dma-buf: set signaling bit for the stub fence")
Signed-off-by: Danilo Krummrich 


Good catch, Reviewed-by: Christian König 

Should I push it upstream as well or do you have commit access?


Thanks, I can push it to drm-misc-next.


This is clearly a bug fix so better push this to drm-misc-fixes and 
maybe even add a CC:stable tag.


I have cherry-picked the patch into drm-misc-next-fixes.



Christian.






---
  drivers/dma-buf/dma-fence.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 406b4e26f538..0de0482cd36e 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -167,7 +167,7 @@ struct dma_fence *dma_fence_allocate_private_stub(void)
 		       0, 0);
 
 	set_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT,
-		&dma_fence_stub.flags);
+		&fence->flags);
 
 	dma_fence_signal(fence);
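
The difference between marking the shared stub and marking the newly allocated fence can be modelled in a few lines of userspace C. This is a hedged sketch only: `fake_fence`, `shared_stub`, `fake_set_bit()`, `allocate_private_stub()` and the two bit numbers are hypothetical stand-ins chosen for illustration, not the kernel definitions.

```c
#include <stdlib.h>

/* Bit numbers are assumptions made for this sketch. */
#define FLAG_SIGNALED_BIT	0
#define FLAG_ENABLE_SIGNAL_BIT	1

struct fake_fence { unsigned long flags; };

/* Shared, statically allocated stub; stands in for dma_fence_stub. */
static struct fake_fence shared_stub;

static void fake_set_bit(int nr, unsigned long *addr)
{
	*addr |= 1UL << nr;
}

/* Allocate a private stub and mark that fence, not the shared one. */
static struct fake_fence *allocate_private_stub(void)
{
	struct fake_fence *fence = calloc(1, sizeof(*fence));

	if (!fence)
		return NULL;
	/* The fix: the flag goes on the private fence. */
	fake_set_bit(FLAG_ENABLE_SIGNAL_BIT, &fence->flags);
	/* Models the effect of signalling the fence right away. */
	fake_set_bit(FLAG_SIGNALED_BIT, &fence->flags);
	return fence;
}
```

With the pre-fix code the first `fake_set_bit()` call would have targeted `shared_stub.flags`, leaving the private fence without the signaling bit.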









Re: [PATCH] drm/simpledrm: Fix an NULL vs IS_ERR() bug

2023-02-01 Thread Thomas Zimmermann




Am 26.01.23 um 17:18 schrieb Thierry Reding:

On Thu, Jan 26, 2023 at 12:28:44PM +0300, Dan Carpenter wrote:

The devm_memremap() function doesn't return NULL, it returns error
pointers.

Fixes: 9a10c7e6519b ("drm/simpledrm: Add support for system memory 
framebuffers")
Signed-off-by: Dan Carpenter 
---
  drivers/gpu/drm/tiny/simpledrm.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)


Good catch! Applied to drm-misc-next, thanks.


I have cherry-picked the patch into drm-misc-next-fixes.



Thierry
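
For readers unfamiliar with this bug class, here is a minimal userspace sketch of why a NULL check misses an error pointer. It is not the simpledrm code: `ERR_PTR()`, `PTR_ERR()` and `IS_ERR()` are simplified local re-implementations, and `fake_memremap()`, `map_buggy()` and `map_fixed()` are hypothetical names.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long err) { return (void *)err; }
static inline long PTR_ERR(const void *p) { return (long)p; }
static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

static char backing[16];

/* Hypothetical mapper that reports failure via an error pointer, never NULL. */
static void *fake_memremap(size_t size)
{
	if (size == 0 || size > sizeof(backing))
		return ERR_PTR(-ENOMEM);
	return backing;
}

/* Buggy check: the NULL test never fires because failure is not NULL. */
static int map_buggy(size_t size)
{
	void *p = fake_memremap(size);

	if (!p)
		return -ENOMEM;
	return 0;	/* an error pointer is treated as a valid mapping */
}

/* Fixed check: test with IS_ERR() and propagate the encoded errno. */
static int map_fixed(size_t size)
{
	void *p = fake_memremap(size);

	if (IS_ERR(p))
		return (int)PTR_ERR(p);
	return 0;
}
```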



PSA: drm-misc-next-fixes is open

2023-02-01 Thread Thomas Zimmermann


Hi,

the email goes out a bit late, as -rc6 has already been tagged for a few 
days. This means that drm-misc-next-fixes is now open for bug fixes, as 
drm-next is in feature freeze until the next -rc1 comes out.


Some rules of thumb on where to land your patch:

   * if your patch fixes a bug in upstream, please put it into 
drm-misc-fixes,


   * if your patch fixes a bug in drm-next, please put it into 
drm-misc-next-fixes,


   * anything else, especially new features, should go into drm-misc-next.

The flow chart is at [1].
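
The three rules of thumb can be written as a small lookup. This is only an illustrative sketch: `enum patch_kind` and `target_branch()` are hypothetical names, and the mapping describes the period while drm-misc-next-fixes is open, as stated in this mail.

```c
/* What the patch fixes; NOT_A_FIX covers new features and everything else. */
enum patch_kind {
	FIXES_UPSTREAM,
	FIXES_DRM_NEXT,
	NOT_A_FIX,
};

/* Return the drm-misc branch a patch of the given kind should land in. */
static const char *target_branch(enum patch_kind kind)
{
	switch (kind) {
	case FIXES_UPSTREAM:
		return "drm-misc-fixes";
	case FIXES_DRM_NEXT:
		return "drm-misc-next-fixes";
	default:
		return "drm-misc-next";
	}
}
```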

The transition from/to drm-misc-next-fixes sometimes results in patches 
that are applied to the wrong tree and get stuck there for a long time. 
If you have fixes in drm-misc-next that must go into drm-next soon, 
please cherry-pick them into drm-misc-next-fixes. We have


   dim cherry-pick

to help you with that.

Best regards
Thomas

[1] 
https://drm.pages.freedesktop.org/maintainer-tools/committer-drm-misc.html#where-do-i-apply-my-patch



Re: [RFT PATCH v2 1/3] drm/bridge: tc358762: Set pre_enable_prev_first

2023-02-01 Thread Dave Stevenson

On Tue, 31 Jan 2023 at 22:22, Douglas Anderson  wrote:
>
> Set the "pre_enable_prev_first" as provided by commit 4fb912e5e190
> ("drm/bridge: Introduce pre_enable_prev_first to alter bridge init
> order"). This should allow us to revert commit ec7981e6c614
> ("drm/msm/dsi: don't powerup at modeset time for parade-ps8640") and
> commit 7d8e9a90509f ("drm/msm/dsi: move DSI host powerup to modeset
> time").

I see no reference in the TC358762 datasheet to requiring the DSI
interface to be in any particular state.
However, setting this flag does mean that the DSI host doesn't need to
power up and down for each host_transfer request from
tc358762_pre_enable/tc358762_init, so on that basis I'm good with it.

Reviewed-by: Dave Stevenson 

> Cc: Dave Stevenson 
> Cc: Dmitry Baryshkov 
> Cc: Abhinav Kumar 
> Signed-off-by: Douglas Anderson 
> ---
>
> (no changes since v1)
>
>  drivers/gpu/drm/bridge/tc358762.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/gpu/drm/bridge/tc358762.c b/drivers/gpu/drm/bridge/tc358762.c
> index 0b6a28436885..77f7f7f54757 100644
> --- a/drivers/gpu/drm/bridge/tc358762.c
> +++ b/drivers/gpu/drm/bridge/tc358762.c
> @@ -229,6 +229,7 @@ static int tc358762_probe(struct mipi_dsi_device *dsi)
> 	ctx->bridge.funcs = &tc358762_bridge_funcs;
> 	ctx->bridge.type = DRM_MODE_CONNECTOR_DPI;
> 	ctx->bridge.of_node = dev->of_node;
> +	ctx->bridge.pre_enable_prev_first = true;
>
> 	drm_bridge_add(&ctx->bridge);
>
> --
> 2.39.1.456.gfc5497dd1b-goog
>

Re: [PATCH 02/14] drm/msm/a6xx: Extend UBWC config

2023-02-01 Thread Akhil P Oommen

On 1/26/2023 8:46 PM, Konrad Dybcio wrote:
> Port setting min_access_length, ubwc_mode and upper_bit from downstream.
> Values were validated using downstream device trees for SM8[123]50 and
> left default (as per downstream) elsewhere.
>
> Signed-off-by: Konrad Dybcio 
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 26 ++
>  1 file changed, 18 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c 
> b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> index c5f5d0bb3fdc..ad5d791b804c 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> @@ -786,17 +786,22 @@ static void a6xx_set_cp_protect(struct msm_gpu *gpu)
>  static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
>  {
>   struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> - u32 lower_bit = 2;
> + u32 lower_bit = 1;
Wouldn't this break a630?

-Akhil.
> + u32 upper_bit = 0;
>   u32 amsbc = 0;
>   u32 rgb565_predicator = 0;
>   u32 uavflagprd_inv = 0;
> + u32 min_acc_len = 0;
> + u32 ubwc_mode = 0;
>  
>   /* a618 is using the hw default values */
>   if (adreno_is_a618(adreno_gpu))
>   return;
>  
> - if (adreno_is_a640_family(adreno_gpu))
> + if (adreno_is_a640_family(adreno_gpu)) {
>   amsbc = 1;
> + lower_bit = 2;
> + }
>  
>   if (adreno_is_a650(adreno_gpu) || adreno_is_a660(adreno_gpu)) {
>   /* TODO: get ddr type from bootloader and use 2 for LPDDR4 */
> @@ -807,18 +812,23 @@ static void a6xx_set_ubwc_config(struct msm_gpu *gpu)
>   }
>  
>   if (adreno_is_7c3(adreno_gpu)) {
> - lower_bit = 1;
>   amsbc = 1;
>   rgb565_predicator = 1;
>   uavflagprd_inv = 2;
>   }
>  
>   gpu_write(gpu, REG_A6XX_RB_NC_MODE_CNTL,
> - rgb565_predicator << 11 | amsbc << 4 | lower_bit << 1);
> - gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, lower_bit << 1);
> - gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL,
> - uavflagprd_inv << 4 | lower_bit << 1);
> - gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, lower_bit << 21);
> +   rgb565_predicator << 11 | upper_bit << 10 | amsbc << 4 |
> +   min_acc_len << 3 | lower_bit << 1 | ubwc_mode);
> +
> + gpu_write(gpu, REG_A6XX_TPL1_NC_MODE_CNTL, upper_bit << 4 |
> +   min_acc_len << 3 | lower_bit << 1 | ubwc_mode);
> +
> + gpu_write(gpu, REG_A6XX_SP_NC_MODE_CNTL, upper_bit << 10 |
> +   uavflagprd_inv << 4 | min_acc_len << 3 |
> +   lower_bit << 1 | ubwc_mode);
> +
> + gpu_write(gpu, REG_A6XX_UCHE_MODE_CNTL, min_acc_len << 23 | lower_bit << 21);
>  }
>  
>  static int a6xx_cp_init(struct msm_gpu *gpu)
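
To make the question about the changed default concrete, the RB_NC_MODE_CNTL value can be computed outside the driver. This is a hedged sketch: `pack_rb_nc_mode_cntl()` is a hypothetical helper, and the shift amounts are copied from the `gpu_write()` call in the quoted patch rather than from register documentation.

```c
#include <stdint.h>

/* Pack the fields the way the quoted RB_NC_MODE_CNTL write does. */
static uint32_t pack_rb_nc_mode_cntl(uint32_t rgb565_predicator,
				     uint32_t upper_bit, uint32_t amsbc,
				     uint32_t min_acc_len, uint32_t lower_bit,
				     uint32_t ubwc_mode)
{
	return rgb565_predicator << 11 | upper_bit << 10 | amsbc << 4 |
	       min_acc_len << 3 | lower_bit << 1 | ubwc_mode;
}
```

For a GPU that only takes the function's defaults, the old default `lower_bit = 2` packs to 0x4 and the new default `lower_bit = 1` packs to 0x2, so any such GPU sees a different register value after the patch unless one of the per-GPU branches restores the old setting.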

[PATCH v9 16/25] dept: Apply sdt_might_sleep_{start, end}() to dma fence wait

2023-02-01 Thread Byungchul Park

Makes Dept able to track dma fence waits.

Signed-off-by: Byungchul Park 
---
 drivers/dma-buf/dma-fence.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 406b4e2..1db4bc0 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define CREATE_TRACE_POINTS
 #include 
@@ -782,6 +783,7 @@ struct default_wait_cb {
 	cb.task = current;
 	list_add(&cb.base.node, &fence->cb_list);
 
+	sdt_might_sleep_start(NULL);
 	while (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags) && ret > 0) {
 		if (intr)
 			__set_current_state(TASK_INTERRUPTIBLE);
@@ -795,6 +797,7 @@ struct default_wait_cb {
 		if (ret > 0 && intr && signal_pending(current))
 			ret = -ERESTARTSYS;
 	}
+	sdt_might_sleep_end();
 
 	if (!list_empty(&cb.base.node))
 		list_del(&cb.base.node);
@@ -884,6 +887,7 @@ struct default_wait_cb {
}
}
 
+   sdt_might_sleep_start(NULL);
while (ret > 0) {
if (intr)
set_current_state(TASK_INTERRUPTIBLE);
@@ -898,6 +902,7 @@ struct default_wait_cb {
if (ret > 0 && intr && signal_pending(current))
ret = -ERESTARTSYS;
}
+   sdt_might_sleep_end();
 
__set_current_state(TASK_RUNNING);
 
-- 
1.9.1

[PATCH v9 10/25] dept: Apply sdt_might_sleep_{start, end}() to waitqueue wait

2023-02-01 Thread Byungchul Park

Makes Dept able to track dependencies by waitqueue waits.

Signed-off-by: Byungchul Park 
---
 include/linux/wait.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/linux/wait.h b/include/linux/wait.h
index a0307b5..ff349e6 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -7,6 +7,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -303,6 +304,7 @@ static inline void wake_up_pollfree(struct wait_queue_head *wq_head)
 	struct wait_queue_entry __wq_entry;					\
 	long __ret = ret;	/* explicit shadow */				\
 										\
+	sdt_might_sleep_start(NULL);						\
 	init_wait_entry(&__wq_entry, exclusive ? WQ_FLAG_EXCLUSIVE : 0);	\
 	for (;;) {								\
 		long __int = prepare_to_wait_event(&wq_head, &__wq_entry, state);\
@@ -318,6 +320,7 @@ static inline void wake_up_pollfree(struct wait_queue_head *wq_head)
 		cmd;								\
 	}									\
 	finish_wait(&wq_head, &__wq_entry);					\
+	sdt_might_sleep_end();							\
 __out:	__ret;									\
 })
 
-- 
1.9.1

[PATCH v9 09/25] dept: Apply sdt_might_sleep_{start,end}() to swait

2023-02-01 Thread Byungchul Park

Makes Dept able to track dependencies by swaits.

Signed-off-by: Byungchul Park 
---
 include/linux/swait.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/linux/swait.h b/include/linux/swait.h
index 6a8c22b..0284821 100644
--- a/include/linux/swait.h
+++ b/include/linux/swait.h
@@ -6,6 +6,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 /*
@@ -161,6 +162,7 @@ static inline bool swq_has_sleeper(struct swait_queue_head *wq)
 	struct swait_queue __wait;					\
 	long __ret = ret;						\
 									\
+	sdt_might_sleep_start(NULL);					\
 	INIT_LIST_HEAD(&__wait.task_list);				\
 	for (;;) {							\
 		long __int = prepare_to_swait_event(&wq, &__wait, state);\
@@ -176,6 +178,7 @@ static inline bool swq_has_sleeper(struct swait_queue_head *wq)
 		cmd;							\
 	}								\
 	finish_swait(&wq, &__wait);					\
+	sdt_might_sleep_end();						\
 __out:	__ret;								\
 })
 
-- 
1.9.1

[PATCH v9 24/25] dept: Make Dept able to work with an external wgen

2023-02-01 Thread Byungchul Park

There are cases where the maps for a wait/event would take too much
memory in total. struct page with PG_locked and PG_writeback is one such
case: if each struct page kept its own map, the additional memory would
be 'the # of pages * sizeof(struct dept_map)', which might be too big to
accept.

In such cases it is better to keep only the minimum data Dept needs,
which is the timestamp called 'wgen'. So make Dept able to work with an
external wgen when needed.
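
The size argument above is simple arithmetic and can be checked with a tiny helper. This is a sketch under loudly stated assumptions: `per_page_overhead()` is a hypothetical function, and the 64-byte figure used in the note below for sizeof(struct dept_map) is a made-up illustrative value, not a measurement.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes of per-page tracking state for a machine with mem_bytes of RAM. */
static uint64_t per_page_overhead(uint64_t mem_bytes, uint64_t page_size,
				  size_t per_page_bytes)
{
	return (mem_bytes / page_size) * per_page_bytes;
}
```

With 16 GiB of RAM and 4 KiB pages there are 4194304 pages. An assumed 64-byte map per page would cost 256 MiB, while a 4-byte wgen per page costs 16 MiB.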

Signed-off-by: Byungchul Park 
---
 include/linux/dept.h | 18 ++
 include/linux/dept_sdt.h |  4 ++--
 kernel/dependency/dept.c | 30 +-
 3 files changed, 37 insertions(+), 15 deletions(-)

diff --git a/include/linux/dept.h b/include/linux/dept.h
index 0aa8d90..ad32ea7 100644
--- a/include/linux/dept.h
+++ b/include/linux/dept.h
@@ -487,6 +487,13 @@ struct dept_task {
 	bool				in_sched;
 };
 
+/*
+ * for subsystems that requires compact use of memory e.g. struct page
+ */
+struct dept_ext_wgen{
+   unsigned int wgen;
+};
+
 #define DEPT_TASK_INITIALIZER(t)   \
 {  \
.wait_hist = { { .wait = NULL, } }, \
@@ -518,6 +525,7 @@ struct dept_task {
 extern void dept_free_range(void *start, unsigned int sz);
 extern void dept_map_init(struct dept_map *m, struct dept_key *k, int sub_u, 
const char *n);
 extern void dept_map_reinit(struct dept_map *m, struct dept_key *k, int sub_u, 
const char *n);
+extern void dept_ext_wgen_init(struct dept_ext_wgen *ewg);
 extern void dept_map_copy(struct dept_map *to, struct dept_map *from);
 
 extern void dept_wait(struct dept_map *m, unsigned long w_f, unsigned long ip, 
const char *w_fn, int sub_l, long timeout);
@@ -527,8 +535,8 @@ struct dept_task {
 extern void dept_stage_event(struct task_struct *t, unsigned long ip);
 extern void dept_ecxt_enter(struct dept_map *m, unsigned long e_f, unsigned 
long ip, const char *c_fn, const char *e_fn, int sub_l);
 extern bool dept_ecxt_holding(struct dept_map *m, unsigned long e_f);
-extern void dept_request_event(struct dept_map *m);
-extern void dept_event(struct dept_map *m, unsigned long e_f, unsigned long 
ip, const char *e_fn);
+extern void dept_request_event(struct dept_map *m, struct dept_ext_wgen *ewg);
+extern void dept_event(struct dept_map *m, unsigned long e_f, unsigned long 
ip, const char *e_fn, struct dept_ext_wgen *ewg);
 extern void dept_ecxt_exit(struct dept_map *m, unsigned long e_f, unsigned 
long ip);
 extern void dept_sched_enter(void);
 extern void dept_sched_exit(void);
@@ -559,6 +567,7 @@ static inline void dept_ecxt_enter_nokeep(struct dept_map 
*m)
 struct dept_key  { };
 struct dept_map  { };
 struct dept_task { };
+struct dept_ext_wgen { };
 
 #define DEPT_MAP_INITIALIZER(n, k) { }
 #define DEPT_TASK_INITIALIZER(t)   { }
@@ -571,6 +580,7 @@ static inline void dept_ecxt_enter_nokeep(struct dept_map 
*m)
 #define dept_free_range(s, sz) do { } while (0)
 #define dept_map_init(m, k, su, n) do { (void)(n); 
(void)(k); } while (0)
 #define dept_map_reinit(m, k, su, n)   do { (void)(n); 
(void)(k); } while (0)
+#define dept_ext_wgen_init(wg) do { } while (0)
 #define dept_map_copy(t, f)do { } while (0)
 
 #define dept_wait(m, w_f, ip, w_fn, sl, t) do { (void)(w_fn); } 
while (0)
@@ -580,8 +590,8 @@ static inline void dept_ecxt_enter_nokeep(struct dept_map 
*m)
 #define dept_stage_event(t, ip)do { } while (0)
 #define dept_ecxt_enter(m, e_f, ip, c_fn, e_fn, sl)do { (void)(c_fn); 
(void)(e_fn); } while (0)
 #define dept_ecxt_holding(m, e_f)  false
-#define dept_request_event(m)  do { } while (0)
-#define dept_event(m, e_f, ip, e_fn)   do { (void)(e_fn); } 
while (0)
+#define dept_request_event(m, wg)  do { } while (0)
+#define dept_event(m, e_f, ip, e_fn, wg)   do { (void)(e_fn); } 
while (0)
 #define dept_ecxt_exit(m, e_f, ip) do { } while (0)
 #define dept_sched_enter() do { } while (0)
 #define dept_sched_exit()  do { } while (0)
diff --git a/include/linux/dept_sdt.h b/include/linux/dept_sdt.h
index 21fce52..8cdac79 100644
--- a/include/linux/dept_sdt.h
+++ b/include/linux/dept_sdt.h
@@ -24,7 +24,7 @@
 
 #define sdt_wait_timeout(m, t) \
do {\
-   dept_request_event(m);  \
+   dept_request_event(m, NULL);\
dept_wait(m, 1UL, _THIS_IP_, __func__, 0, t);   \
} while (0)
 #define sdt_wait(m)

[PATCH v9 15/25] locking/lockdep, cpu/hotplug: Use a weaker annotation in AP thread

2023-02-01 Thread Byungchul Park

cb92173d1f0 ("locking/lockdep, cpu/hotplug: Annotate AP thread") was
introduced to make lockdep_assert_cpus_held() work in AP thread.

However, the annotation is stronger than that purpose needs. A try-lock
annotation is sufficient.

Furthermore, now that Dept has been introduced, it reports false
positive alarms because of the annotation. Replace it with a try-lock
annotation.

Signed-off-by: Byungchul Park 
---
 kernel/cpu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 6c0a92c..6a9b9c3 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -356,7 +356,7 @@ int lockdep_is_cpus_held(void)
 
 static void lockdep_acquire_cpus_lock(void)
 {
-	rwsem_acquire(&cpu_hotplug_lock.dep_map, 0, 0, _THIS_IP_);
+	rwsem_acquire(&cpu_hotplug_lock.dep_map, 0, 1, _THIS_IP_);
 }
 
 static void lockdep_release_cpus_lock(void)
-- 
1.9.1

[PATCH v9 18/25] dept: Apply timeout consideration to wait_for_completion()/complete()

2023-02-01 Thread Byungchul Park

Now that CONFIG_DEPT_AGGRESSIVE_TIMEOUT_WAIT was introduced, apply the
consideration to wait_for_completion()/complete().

Signed-off-by: Byungchul Park 
---
 include/linux/completion.h | 4 ++--
 kernel/sched/completion.c  | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/completion.h b/include/linux/completion.h
index 32d535a..15eede0 100644
--- a/include/linux/completion.h
+++ b/include/linux/completion.h
@@ -41,9 +41,9 @@ struct completion {
  */
 #define init_completion_map(x, m) init_completion(x)
 
-static inline void complete_acquire(struct completion *x)
+static inline void complete_acquire(struct completion *x, long timeout)
 {
-	sdt_might_sleep_start(&x->dmap);
+	sdt_might_sleep_start_timeout(&x->dmap, timeout);
 }
 
 static inline void complete_release(struct completion *x)
diff --git a/kernel/sched/completion.c b/kernel/sched/completion.c
index d57a5c1..261807f 100644
--- a/kernel/sched/completion.c
+++ b/kernel/sched/completion.c
@@ -100,7 +100,7 @@ void complete_all(struct completion *x)
 {
might_sleep();
 
-   complete_acquire(x);
+   complete_acquire(x, timeout);
 
 	raw_spin_lock_irq(&x->wait.lock);
timeout = do_wait_for_common(x, action, timeout, state);
-- 
1.9.1

[PATCH v9 13/25] dept: Distinguish each work from another

2023-02-01 Thread Byungchul Park

Workqueue already provides concurrency control. With that control
enabled, a wait in one work does not prevent events in other works.
Thus, each work is better considered a different context.

So let Dept assign a different context id to each work.

Signed-off-by: Byungchul Park 
---
 include/linux/dept.h |  2 ++
 kernel/dependency/dept.c | 10 ++
 kernel/workqueue.c   |  3 +++
 3 files changed, 15 insertions(+)

diff --git a/include/linux/dept.h b/include/linux/dept.h
index f62c7b6..d9ca9dd 100644
--- a/include/linux/dept.h
+++ b/include/linux/dept.h
@@ -515,6 +515,7 @@ struct dept_task {
 extern void dept_sched_enter(void);
 extern void dept_sched_exit(void);
 extern void dept_kernel_enter(void);
+extern void dept_work_enter(void);
 
 static inline void dept_ecxt_enter_nokeep(struct dept_map *m)
 {
@@ -567,6 +568,7 @@ static inline void dept_ecxt_enter_nokeep(struct dept_map 
*m)
 #define dept_sched_enter() do { } while (0)
 #define dept_sched_exit()  do { } while (0)
 #define dept_kernel_enter()do { } while (0)
+#define dept_work_enter()  do { } while (0)
 #define dept_ecxt_enter_nokeep(m)  do { } while (0)
 #define dept_key_init(k)   do { (void)(k); } while 
(0)
 #define dept_key_destroy(k)do { (void)(k); } while 
(0)
diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c
index dd90746..1f1aca4 100644
--- a/kernel/dependency/dept.c
+++ b/kernel/dependency/dept.c
@@ -1977,6 +1977,16 @@ void dept_hardirqs_off_ip(unsigned long ip)
 }
 EXPORT_SYMBOL_GPL(dept_hardirqs_off_ip);
 
+/*
+ * Assign a different context id to each work.
+ */
+void dept_work_enter(void)
+{
+   struct dept_task *dt = dept_task();
+
+   dt->cxt_id[DEPT_CXT_PROCESS] += 1UL << DEPT_CXTS_NR;
+}
+
 void dept_kernel_enter(void)
 {
struct dept_task *dt = dept_task();
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 07895de..69c4f46 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -51,6 +51,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "workqueue_internal.h"
 
@@ -2199,6 +2200,8 @@ static void process_one_work(struct worker *worker, struct work_struct *work)
 
 	lockdep_copy_map(&lockdep_map, &work->lockdep_map);
 #endif
+   dept_work_enter();
+
/* ensure we're on the correct CPU */
WARN_ON_ONCE(!(pool->flags & POOL_DISASSOCIATED) &&
 raw_smp_processor_id() != pool->cpu);
-- 
1.9.1
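
The effect of the increment in `dept_work_enter()` can be seen in isolation: adding `1UL << N` leaves the low N bits of the id untouched and only advances the part above them. The sketch below is illustrative; `CXTS_NR`, `next_work_cxt_id()` and `low_bits()` are hypothetical names, the value 3 is an assumption, and what Dept stores in the low bits is not described in this mail.

```c
/* Assumed number of low bits reserved in a context id (illustrative value). */
#define CXTS_NR 3

/* Mirror of the patch's increment: bump the id by one unit above the low bits. */
static unsigned long next_work_cxt_id(unsigned long cxt_id)
{
	return cxt_id + (1UL << CXTS_NR);
}

/* Extract the low CXTS_NR bits, which the increment does not change. */
static unsigned long low_bits(unsigned long cxt_id)
{
	return cxt_id & ((1UL << CXTS_NR) - 1);
}
```

Each call therefore yields an id that differs from the previous one while whatever is encoded in the low bits stays the same.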

[PATCH v9 14/25] dept: Add a mechanism to refill the internal memory pools on running out

2023-02-01 Thread Byungchul Park

The Dept engine works in a constrained environment. For example, Dept
cannot make use of dynamic allocation, e.g. kmalloc(). So Dept has been
using static pools to hold the memory chunks it uses.

However, Dept would barely work once any of the pools runs out. So
implement a mechanism to refill a pool when it runs short, using irq
work and a workqueue, which fit the constrained environment.
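
The general idea of a pool with a reserve that is promoted on exhaustion and refilled later can be sketched as follows. This is an assumption-laden illustration, not the patch's implementation: `struct pool`, `pool_get()`, `pool_refill()` and the `need_refill` flag are hypothetical, `malloc()` stands in for the allocation done from the deferred work, and locking and per-CPU lists are omitted.

```c
#include <stdlib.h>

struct pool {
	size_t obj_sz;
	int obj_nr;		/* objects still unused in spool */
	int tot_nr;		/* capacity of one pool */
	char *spool;		/* pool currently handing out objects */
	char *rpool;		/* reserve, promoted when spool is exhausted */
	int need_refill;	/* stands in for queueing the deferred work */
};

/* Hand out one object; safe to call where allocation is not allowed. */
static void *pool_get(struct pool *p)
{
	if (p->obj_nr == 0) {
		if (!p->rpool)
			return NULL;	/* nothing left at all */
		p->spool = p->rpool;	/* promote the reserve */
		p->rpool = NULL;
		p->obj_nr = p->tot_nr;
		p->need_refill = 1;	/* ask for a new reserve later */
	}
	p->obj_nr--;
	return p->spool + (size_t)p->obj_nr * p->obj_sz;
}

/* Deferred refill; runs in a context where allocation is permitted. */
static int pool_refill(struct pool *p)
{
	if (p->rpool)
		return 0;
	p->rpool = malloc((size_t)p->tot_nr * p->obj_sz);
	if (!p->rpool)
		return -1;
	p->need_refill = 0;
	return 0;
}
```

The exhausted pool is not freed in this sketch because objects handed out from it may still be in use.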

Signed-off-by: Byungchul Park 
---
 include/linux/dept.h|  19 ++--
 kernel/dependency/dept.c| 104 ++--
 kernel/dependency/dept_object.h |  10 ++--
 kernel/dependency/dept_proc.c   |   8 ++--
 4 files changed, 112 insertions(+), 29 deletions(-)

diff --git a/include/linux/dept.h b/include/linux/dept.h
index d9ca9dd..583e8fe 100644
--- a/include/linux/dept.h
+++ b/include/linux/dept.h
@@ -336,9 +336,19 @@ struct dept_pool {
size_t  obj_sz;
 
/*
-* the number of the static array
+* the remaining number of the object in spool
 */
-	atomic_t			obj_nr;
+   int obj_nr;
+
+   /*
+* the number of the object in spool
+*/
+   int tot_nr;
+
+   /*
+* accumulated amount of memory used by the object in byte
+*/
+	atomic_t			acc_sz;
 
/*
 * offset of ->pool_node
@@ -348,9 +358,10 @@ struct dept_pool {
/*
 * pointer to the pool
 */
-   void*spool;
+   void*spool; /* static pool */
+   void*rpool; /* reserved pool */
struct llist_head   boot_pool;
-   struct llist_head __percpu  *lpool;
+   struct llist_head __percpu  *lpool; /* local pool */
 };
 
 struct dept_ecxt_held {
diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c
index 1f1aca4..0ac6442 100644
--- a/kernel/dependency/dept.c
+++ b/kernel/dependency/dept.c
@@ -74,6 +74,9 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
 #include "dept_internal.h"
 
 static int dept_stop;
@@ -122,10 +125,12 @@
WARN(1, "DEPT_STOP: " s);   \
})
 
-#define DEPT_INFO_ONCE(s...) pr_warn_once("DEPT_INFO_ONCE: " s)
+#define DEPT_INFO_ONCE(s...)   pr_warn_once("DEPT_INFO_ONCE: " s)
+#define DEPT_INFO(s...)pr_warn("DEPT_INFO: " s)
 
 static arch_spinlock_t dept_spin = (arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;
 static arch_spinlock_t stage_spin = (arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;
+static arch_spinlock_t dept_pool_spin = 
(arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;
 
 /*
  * DEPT internal engine should be careful in using outside functions
@@ -264,6 +269,7 @@ static inline bool valid_key(struct dept_key *k)
 
 #define OBJECT(id, nr)							\
 static struct dept_##id spool_##id[nr];				\
+static struct dept_##id rpool_##id[nr];				\
 static DEFINE_PER_CPU(struct llist_head, lpool_##id);
 #include "dept_object.h"
 #undef  OBJECT
@@ -272,14 +278,70 @@ struct dept_pool dept_pool[OBJECT_NR] = {
 #define OBJECT(id, nr) {   \
.name = #id,\
.obj_sz = sizeof(struct dept_##id), \
-   .obj_nr = ATOMIC_INIT(nr),  \
+   .obj_nr = nr,   \
+   .tot_nr = nr,   \
+   .acc_sz = ATOMIC_INIT(sizeof(spool_##id) + sizeof(rpool_##id)), \
.node_off = offsetof(struct dept_##id, pool_node),  \
.spool = spool_##id,\
+   .rpool = rpool_##id,\
 	.lpool = &lpool_##id, },
#include "dept_object.h"
 #undef  OBJECT
 };
 
+static void dept_wq_work_fn(struct work_struct *work)
+{
+   int i;
+
+   for (i = 0; i < OBJECT_NR; i++) {
+   struct dept_pool *p = dept_pool + i;
+   int sz = p->tot_nr * p->obj_sz;
+   void *rpool;
+   bool need;
+
+		arch_spin_lock(&dept_pool_spin);
+		need = !p->rpool;
+		arch_spin_unlock(&dept_pool_spin);
+
+   if (!need)
+   continue;
+
+   rpool = vmalloc(sz);
+
+   if (!rpool) {
+   DEPT_STOP("Failed to extend internal resources.\n");
+   break;
+   }
+
+		arch_spin_lock(&dept_pool_spin);
+   if (!p->rpool) {
+   p->rpool = rpool;
+   rpool = NULL;
+

[PATCH v9 02/25] dept: Implement Dept(Dependency Tracker)

2023-02-01 Thread Byungchul Park

CURRENT STATUS
--
Lockdep tracks the acquisition order of locks in order to detect deadlock,
and also tracks IRQ and IRQ enable/disable state to take accidental
acquisitions into account.

Lockdep has to be turned off once it detects and reports a deadlock,
since its data structures and algorithm are not reusable after detection
because of the complex design.

PROBLEM
---
*Waits* and their *events* that can never be reached eventually cause
deadlock. However, Lockdep is only interested in lock acquisition order,
forcing us to emulate lock acquisition even for plain waits and events
that have nothing to do with a real lock.

Even worse, no one likes Lockdep's false positive detection because it
prevents further reports that might be more valuable. That's why all
kernel developers are sensitive to Lockdep's false positives.

Besides those, by tracking acquisition order, it cannot correctly deal
with read locks and cross-events, e.g. wait_for_completion()/complete(),
for deadlock detection. Lockdep is no longer a good tool for that purpose.

SOLUTION

Again, *waits* and their *events* that can never be reached eventually
cause deadlock. The new solution, Dept (DEPendency Tracker), focuses on
waits and events themselves. Dept tracks waits and events and reports if
any event would never be reachable.

Dept:
   . Works with read locks in the right way.
   . Works with any wait and event, i.e. cross-events.
   . Continues to work even after reporting multiple times.
   . Provides simple and intuitive APIs.
   . Does exactly what a dependency checker should do.
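
As a rough illustration of the wait/event view, a circular dependency between waits can be found with an ordinary graph search. This is explicitly not Dept's algorithm or data structures: `dep`, `add_dep()`, `dfs()` and `has_cycle()` are hypothetical names in a toy model, where an edge a -> b means "the event that would wake wait a cannot fire until wait b has been woken".

```c
#include <stdbool.h>

#define MAX_NODES 8

/* dep[a][b]: the event that wakes wait a cannot fire until wait b is woken. */
static bool dep[MAX_NODES][MAX_NODES];

static void add_dep(int a, int b)
{
	dep[a][b] = true;
}

/* Depth-first search; state: 0 = unvisited, 1 = on current path, 2 = done. */
static bool dfs(int node, int *state)
{
	state[node] = 1;
	for (int next = 0; next < MAX_NODES; next++) {
		if (!dep[node][next])
			continue;
		if (state[next] == 1)
			return true;	/* back edge: circular dependency */
		if (state[next] == 0 && dfs(next, state))
			return true;
	}
	state[node] = 2;
	return false;
}

/* True if some wait can never be woken because of a dependency cycle. */
static bool has_cycle(void)
{
	int state[MAX_NODES] = { 0 };

	for (int n = 0; n < MAX_NODES; n++)
		if (state[n] == 0 && dfs(n, state))
			return true;
	return false;
}
```

For example, let node 0 be a wait on a mutex and node 1 a wait_for_completion(). If one context holds the mutex while waiting for the completion, and the context that would call complete() first needs the mutex, the model gets edges 0 -> 1 and 1 -> 0, and `has_cycle()` reports the deadlock even though only one lock is involved.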

Q & A
-
Q. Is this the first try ever to address the problem?
A. No. The cross-release feature (b09be676e0ff2 locking/lockdep: Implement
   the 'crossrelease' feature) addressed it 2 years ago. It was a
   Lockdep extension that was merged but reverted shortly after because:

   Cross-release started to report valuable hidden problems but also
   started to give false positive reports. For sure, no one
   likes Lockdep's false positive reports since they make Lockdep stop,
   preventing it from reporting further real problems.

Q. Why wasn't Dept developed as an extension of Lockdep?
A. Lockdep definitely includes all the efforts great developers have
   made for a long time, so it is quite stable. But I had to
   design and implement Dept from scratch because of the following:

   1) Lockdep was designed to track lock acquisition order. The APIs and
      implementation do not fit the wait-event model.
   2) Lockdep is turned off on detection, including on a false positive,
      which is terrible and prevents developing any extension for
      stronger detection.

Q. Do you intend to totally replace Lockdep?
A. No. Lockdep also checks if lock usage is correct. Of course, the
   dependency check routine should be replaced but the other functions
   should be still there.

Q. Do you mean the dependency check routine should be replaced right
   away?
A. No. I admit Lockdep is stable enough thanks to great efforts kernel
   developers have made. Lockdep and Dept, both should be in the kernel
   until Dept gets considered stable.

Q. Stronger detection capability would give more false positive reports,
   which was a big problem when cross-release was introduced. Is it ok
   with Dept?
A. It's ok. Dept allows multiple reports thanks to its simple and quite
   generalized design. Of course, false positive reports should be fixed
   anyway, but it's no longer as critical a problem as it was.

Signed-off-by: Byungchul Park 
---
 include/linux/dept.h|  577 
 include/linux/hardirq.h |3 +
 include/linux/sched.h   |3 +
 init/init_task.c|2 +
 init/main.c |2 +
 kernel/Makefile |1 +
 kernel/dependency/Makefile  |3 +
 kernel/dependency/dept.c| 3009 +++
 kernel/dependency/dept_hash.h   |   10 +
 kernel/dependency/dept_object.h |   13 +
 kernel/exit.c   |1 +
 kernel/fork.c   |2 +
 kernel/module/main.c|2 +
 kernel/sched/core.c |9 +
 lib/Kconfig.debug   |   27 +
 lib/locking-selftest.c  |2 +
 16 files changed, 3666 insertions(+)
 create mode 100644 include/linux/dept.h
 create mode 100644 kernel/dependency/Makefile
 create mode 100644 kernel/dependency/dept.c
 create mode 100644 kernel/dependency/dept_hash.h
 create mode 100644 kernel/dependency/dept_object.h

diff --git a/include/linux/dept.h b/include/linux/dept.h
new file mode 100644
index ..b6d45b4
--- /dev/null
+++ b/include/linux/dept.h
@@ -0,0 +1,577 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * DEPT(DEPendency Tracker) - runtime dependency tracker
+ *
+ * Started by Byungchul Park :
+ *
+ *  Copyright (c) 2020 LG Electronics, Inc., Byungchul Park
+ */
+
+#ifndef __LINUX_DEPT_H
+#define __LINUX_DEPT_H
+
+#ifdef CONFIG_DEPT
+
+#include 
+
+struct task_struct;
+
+#define DEPT_MAX_STACK_ENTRY

[PATCH v9 00/25] DEPT(Dependency Tracker)

2023-02-01 Thread Byungchul Park

I'm happy to see that DEPT reports a real problem in practice. See:

   
https://lore.kernel.org/lkml/6383cde5-cf4b-facf-6e07-1378a4856...@i-love.sakura.ne.jp/#t
   
https://lore.kernel.org/lkml/1674268856-31807-1-git-send-email-byungchul.p...@lge.com/

Nevertheless, I apologize for the lack of documentation. I promise to add
it before users need to use DEPT's APIs directly. For now, you can use
DEPT just by turning CONFIG_DEPT on.

---

Hi Linus and folks,

I've been developing a tool for detecting deadlock possibilities by
tracking wait/event rather than lock(?) acquisition order to try to
cover all synchronization mechanisms. It's based on v6.2-rc2.

https://github.com/lgebyungchulpark/linux-dept/commits/dept2.4_on_v6.2-rc2

Benefits:

0. Works with all lock primitives.
1. Works with wait_for_completion()/complete().
2. Works with 'wait' on PG_locked.
3. Works with 'wait' on PG_writeback.
4. Works with swait/wakeup.
5. Works with waitqueue.
6. Works with wait_bit.
7. Multiple reports are allowed.
8. Deduplication control on multiple reports.
9. Withstand false positives thanks to 6.
10. Easy to tag any wait/event.

Future work:

0. To make it more stable.
1. To separate Dept from Lockdep.
2. To improve performance in terms of time and space.
3. To use Dept as a dependency engine for Lockdep.
4. To add any missing tags of wait/event in the kernel.
5. To deduplicate stack trace.

How to interpret reports:

1. E(event) in each context cannot be triggered because of the
   W(wait) that cannot be woken.
2. The stack trace that helps find the problematic code is located
   in each context's detail.
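
To picture the two rules above, here is a minimal userspace sketch of the
idea, not DEPT's actual implementation. It assumes a dependency "E -> W"
means "event E cannot be triggered until wait W is woken" and reports a
deadlock possibility when those dependencies form a cycle. All names in it
(dep_add(), dep_has_cycle(), MAX_CLASSES, the class enum) are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define MAX_CLASSES 8	/* hypothetical upper bound on tracked classes */

/* Hypothetical classes: a mutex and a completion. */
enum { CLASS_MUTEX_A = 0, CLASS_COMPLETION_C = 1 };

/* dep[e][w] != 0: event e cannot be triggered until wait w is woken. */
static unsigned char dep[MAX_CLASSES][MAX_CLASSES];

void dep_reset(void)
{
	memset(dep, 0, sizeof(dep));
}

/* Record that a wait on class `wait` happened before event `event` fired. */
void dep_add(int event, int wait)
{
	assert(event >= 0 && event < MAX_CLASSES);
	assert(wait >= 0 && wait < MAX_CLASSES);
	dep[event][wait] = 1;
}

/* Depth-first search: can `target` be reached from `from`? */
static int reaches(int from, int target, unsigned char *seen)
{
	int next;

	if (from == target)
		return 1;
	if (seen[from])
		return 0;
	seen[from] = 1;

	for (next = 0; next < MAX_CLASSES; next++)
		if (dep[from][next] && reaches(next, target, seen))
			return 1;
	return 0;
}

/* A cycle means every event in it waits on another one: nothing can wake. */
int dep_has_cycle(void)
{
	int n, s;

	for (n = 0; n < MAX_CLASSES; n++)
		for (s = 0; s < MAX_CLASSES; s++) {
			unsigned char seen[MAX_CLASSES] = { 0 };

			if (dep[n][s] && reaches(s, n, seen))
				return 1;
		}
	return 0;
}
```

For example, one context holding mutex A while waiting for completion C
gives dep_add(CLASS_MUTEX_A, CLASS_COMPLETION_C). Another context that
must take mutex A before it can complete C gives
dep_add(CLASS_COMPLETION_C, CLASS_MUTEX_A). dep_has_cycle() then returns
nonzero, although no lock acquisition order was ever violated.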

Thanks,
Byungchul

---

Changes from v8:

1. Fix build error by adding EXPORT_SYMBOL(PG_locked_map) and
   EXPORT_SYMBOL(PG_writeback_map) for kernel module build -
   apologies for that. (reported by kernel test robot)
2. Fix build error by removing header file's circular dependency
   that was caused by "atomic.h", "kernel.h" and "irqflags.h",
   which I introduced - apologies for that. (reported by kernel
   test robot)

Changes from v7:

1. Fix a bug, introduced in v7, that prevented rwlock dependencies
   from being tracked properly. (reported by Boqun and lockdep selftest)
2. Track wait/event of PG_{locked,writeback} more aggressively
   assuming that when a bit of PG_{locked,writeback} is cleared
   there might be waits on the bit. (reported by Linus, Hillf
   and syzbot)
3. Fix and clean up badly styled code, i.e. an unnecessarily introduced
   random pattern and so on. (pointed out by Linus)
4. Clean code for applying DEPT to wait_for_completion().

Changes from v6:

1. Tie to task scheduler code to track sleep and try_to_wake_up()
   assuming sleeps cause waits, try_to_wake_up()s would be the
   events that those are waiting for, of course with proper DEPT
   annotations, sdt_might_sleep_weak(), sdt_might_sleep_strong()
   and so on. For these cases, class is classified at sleep
   entrance rather than the synchronization initialization code.
   This should greatly reduce false alarms.
2. Remove the DEPT associated instance in each page struct for
   tracking dependencies by PG_locked and PG_writeback thanks to
   the 1. work above.
3. Introduce CONFIG_DEPT_AGGRESIVE_TIMEOUT_WAIT to suppress
   reports that involve waits with a timeout set, for those
   who don't like verbose reporting.
4. Add a mechanism to refill the internal memory pools on
   running out so that DEPT could keep working as long as free
   memory is available in the system.
5. Re-enable tracking hashed-waitqueue wait. That no longer
   generates false positives because class is classified
   at sleep entrance rather than the waitqueue initialization.
6. Refactor to make it easier to port onto each new version of
   the kernel.
7. Apply DEPT to dma fence.
8. Do trivial optimizations.

Changes from v5:

1. Use just pr_warn_once() rather than WARN_ONCE() on the lack
   of internal resources because WARN_*() printing stacktrace is
   too much for informing the lack. (feedback from Ted, Hyeonggon)
2. Fix trivial bugs such as failing to initialize a struct before
   using it.
3. Assign a different class per task when handling onstack
   variables for waitqueue or the like. This makes Dept
   distinguish between onstack variables of different tasks so
   as to prevent false positives. (reported by Hyeonggon)
4. Make Dept aware of even raw_local_irq_*() to prevent false
   positives. (reported by Hyeonggon)

[PATCH v9 12/25] dept: Distinguish each syscall context from another

2023-02-01 Thread Byungchul Park

The kernel is entered on each syscall, and the handling of each syscall
should be considered independent from Dept's point of view. Otherwise,
Dept may wrongly track dependencies across different syscalls.

Such a dependency might be real when seen from user mode. However, since
Dept has only just started to work, conservatively make Dept not track
dependencies across different syscalls.

Signed-off-by: Byungchul Park 
---
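
As a rough illustration of the idea in the commit message, the sketch
below models a per-task sequential id for each context, with the process
context id bumped on every syscall entry. It is a simplified userspace
model under my own assumptions, not the code in this patch: the names
task_model, wait_rec, model_kernel_enter(), model_record_wait() and
model_same_context() are hypothetical, and the enum only mirrors the
DEPT_CXT_* layout introduced in the diff below.

```c
#include <assert.h>

/* Mirrors the DEPT_CXT_* enum in this patch; names here are hypothetical. */
enum {
	CXT_SIRQ = 0,
	CXT_HIRQ,
	CXT_IRQS_NR,
	CXT_PROCESS = CXT_IRQS_NR,
	CXTS_NR
};

/* Per-task state: one sequential id per kind of context. */
struct task_model {
	unsigned int cxt_id[CXTS_NR];
};

/* A recorded wait remembers which context instance it happened in. */
struct wait_rec {
	int cxt;
	unsigned int cxt_id;
};

/* Called on every syscall entry: start a fresh process context. */
void model_kernel_enter(struct task_model *t)
{
	t->cxt_id[CXT_PROCESS]++;
}

struct wait_rec model_record_wait(const struct task_model *t, int cxt)
{
	struct wait_rec w;

	assert(cxt >= 0 && cxt < CXTS_NR);
	w.cxt = cxt;
	w.cxt_id = t->cxt_id[cxt];
	return w;
}

/* A wait may be linked to a later event only within the same instance. */
int model_same_context(const struct task_model *t, const struct wait_rec *w)
{
	return w->cxt_id == t->cxt_id[w->cxt];
}
```

With this model, a wait recorded during one syscall no longer matches the
task's process context id after the next model_kernel_enter(), so no
dependency is built across the two syscalls.
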
 arch/arm64/kernel/syscall.c |  2 ++
 arch/x86/entry/common.c |  4 +++
 include/linux/dept.h| 39 +++---
 kernel/dependency/dept.c| 67 +++--
 4 files changed, 63 insertions(+), 49 deletions(-)

diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
index a5de47e..e26d0ca 100644
--- a/arch/arm64/kernel/syscall.c
+++ b/arch/arm64/kernel/syscall.c
@@ -7,6 +7,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -105,6 +106,7 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
 */
 
local_daif_restore(DAIF_PROCCTX);
+   dept_kernel_enter();
 
if (flags & _TIF_MTE_ASYNC_FAULT) {
/*
diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 6c28264..7cdd27a 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_XEN_PV
 #include 
@@ -72,6 +73,7 @@ static __always_inline bool do_syscall_x32(struct pt_regs *regs, int nr)
 
 __visible noinstr void do_syscall_64(struct pt_regs *regs, int nr)
 {
+   dept_kernel_enter();
add_random_kstack_offset();
nr = syscall_enter_from_user_mode(regs, nr);
 
@@ -120,6 +122,7 @@ __visible noinstr void do_int80_syscall_32(struct pt_regs *regs)
 {
int nr = syscall_32_enter(regs);
 
+   dept_kernel_enter();
add_random_kstack_offset();
/*
 * Subtlety here: if ptrace pokes something larger than 2^31-1 into
@@ -140,6 +143,7 @@ static noinstr bool __do_fast_syscall_32(struct pt_regs *regs)
int nr = syscall_32_enter(regs);
int res;
 
+   dept_kernel_enter();
add_random_kstack_offset();
/*
 * This cannot use syscall_enter_from_user_mode() as it has to
diff --git a/include/linux/dept.h b/include/linux/dept.h
index b6d45b4..f62c7b6 100644
--- a/include/linux/dept.h
+++ b/include/linux/dept.h
@@ -25,11 +25,16 @@
 #define DEPT_MAX_SUBCLASSES_USR    (DEPT_MAX_SUBCLASSES / DEPT_MAX_SUBCLASSES_EVT)
 #define DEPT_MAX_SUBCLASSES_CACHE  2
 
-#define DEPT_SIRQ  0
-#define DEPT_HIRQ  1
-#define DEPT_IRQS_NR   2
-#define DEPT_SIRQF (1UL << DEPT_SIRQ)
-#define DEPT_HIRQF (1UL << DEPT_HIRQ)
+enum {
+   DEPT_CXT_SIRQ = 0,
+   DEPT_CXT_HIRQ,
+   DEPT_CXT_IRQS_NR,
+   DEPT_CXT_PROCESS = DEPT_CXT_IRQS_NR,
+   DEPT_CXTS_NR
+};
+
+#define DEPT_SIRQF (1UL << DEPT_CXT_SIRQ)
+#define DEPT_HIRQF (1UL << DEPT_CXT_HIRQ)
 
 struct dept_ecxt;
 struct dept_iecxt {
@@ -94,8 +99,8 @@ struct dept_class {
/*
 * for tracking IRQ dependencies
 */
-   struct dept_iecxt iecxt[DEPT_IRQS_NR];
-   struct dept_iwait iwait[DEPT_IRQS_NR];
+   struct dept_iecxt iecxt[DEPT_CXT_IRQS_NR];
+   struct dept_iwait iwait[DEPT_CXT_IRQS_NR];
 
/*
 * classified by a map embedded in task_struct,
@@ -207,8 +212,8 @@ struct dept_ecxt {
/*
 * where the IRQ-enabled happened
 */
-   unsigned long   enirq_ip[DEPT_IRQS_NR];
-   struct dept_stack *enirq_stack[DEPT_IRQS_NR];
+   unsigned long   enirq_ip[DEPT_CXT_IRQS_NR];
+   struct dept_stack *enirq_stack[DEPT_CXT_IRQS_NR];
 
/*
 * where the event context started
@@ -252,8 +257,8 @@ struct dept_wait {
/*
 * where the IRQ wait happened
 */
-   unsigned long   irq_ip[DEPT_IRQS_NR];
-   struct dept_stack *irq_stack[DEPT_IRQS_NR];
+   unsigned long   irq_ip[DEPT_CXT_IRQS_NR];
+   struct dept_stack *irq_stack[DEPT_CXT_IRQS_NR];
 
/*
 * where the wait happened
@@ -406,19 +411,19 @@ struct dept_task {
int wait_hist_pos;
 
/*
-* sequential id to identify each IRQ context
+* sequential id to identify each context
 */
-   unsigned int    irq_id[DEPT_IRQS_NR];
+   unsigned int

[PATCH v9 08/25] dept: Apply sdt_might_sleep_{start, end}() to PG_{locked, writeback} wait

2023-02-01 Thread Byungchul Park

Makes Dept able to track dependencies by PG_{locked,writeback} waits.

Signed-off-by: Byungchul Park 
---
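
The annotation pattern used in the diff below can be sketched in
userspace as follows: pick a map according to the bit being waited on,
mark the start of the possible sleep, and mark its end once the wait is
over. This is only a model under my own assumptions. The *_model names
are hypothetical stand-ins for sdt_might_sleep_start(),
sdt_might_sleep_end() and struct dept_map, whose real behaviour is not
reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dept_map. */
struct map_model {
	const char *name;
};

struct map_model locked_map_model = { "PG_locked_map" };
struct map_model writeback_map_model = { "PG_writeback_map" };

/* Hypothetical stand-ins for PG_locked, PG_writeback and any other bit. */
enum { BIT_LOCKED, BIT_WRITEBACK, BIT_OTHER };

/* The wait the current task is annotated as possibly sleeping on. */
static const struct map_model *cur_wait;

static void might_sleep_start_model(const struct map_model *m)
{
	assert(cur_wait == NULL);	/* no nesting in this simple model */
	cur_wait = m;
}

static void might_sleep_end_model(void)
{
	cur_wait = NULL;
}

/* Returns the map the wait was tracked under, or NULL if untracked. */
const struct map_model *wait_bit_model(int bit_nr)
{
	const struct map_model *tracked;

	if (bit_nr == BIT_LOCKED)
		might_sleep_start_model(&locked_map_model);
	else if (bit_nr == BIT_WRITEBACK)
		might_sleep_start_model(&writeback_map_model);

	tracked = cur_wait;	/* the real code would sleep around here */

	might_sleep_end_model();
	return tracked;
}
```

As in the diff, the end marker is called unconditionally, while the
start marker is only called for the two tracked bits.
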
 mm/filemap.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/mm/filemap.c b/mm/filemap.c
index c4d4ace..adc49cb 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -42,6 +42,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include "internal.h"
@@ -1215,6 +1216,9 @@ static inline bool folio_trylock_flag(struct folio *folio, int bit_nr,
 /* How many times do we accept lock stealing from under a waiter? */
 int sysctl_page_lock_unfairness = 5;
 
+static struct dept_map __maybe_unused PG_locked_map = DEPT_MAP_INITIALIZER(PG_locked_map, NULL);
+static struct dept_map __maybe_unused PG_writeback_map = DEPT_MAP_INITIALIZER(PG_writeback_map, NULL);
+
 static inline int folio_wait_bit_common(struct folio *folio, int bit_nr,
int state, enum behavior behavior)
 {
@@ -1226,6 +1230,11 @@ static inline int folio_wait_bit_common(struct folio *folio, int bit_nr,
unsigned long pflags;
bool in_thrashing;
 
+   if (bit_nr == PG_locked)
+   sdt_might_sleep_start(&PG_locked_map);
+   else if (bit_nr == PG_writeback)
+   sdt_might_sleep_start(&PG_writeback_map);
+
if (bit_nr == PG_locked &&
!folio_test_uptodate(folio) && folio_test_workingset(folio)) {
delayacct_thrashing_start(&in_thrashing);
@@ -1327,6 +1336,8 @@ static inline int folio_wait_bit_common(struct folio *folio, int bit_nr,
 */
finish_wait(q, wait);
 
+   sdt_might_sleep_end();
+
if (thrashing) {
delayacct_thrashing_end(&in_thrashing);
psi_memstall_leave(&pflags);
-- 
1.9.1

[PATCH v9 23/25] dept: Record the latest one out of consecutive waits of the same class

2023-02-01 Thread Byungchul Park

The current code records every wait for later use, to track the relation
between waits and events in each context. However, since waits of the
same class are handled the same way, it is enough to record only one on
behalf of the others when they all have the same class.

Ideally the whole history buffer would be searched for such waits, but
that would cost too much. As an alternative, keep only the latest one, at
least when waits of the same class appear consecutively.

Signed-off-by: Byungchul Park 
---
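
The behaviour described above can be modelled in userspace with a small
ring buffer: a new wait overwrites the most recent slot when it has the
same class and the same context id, and otherwise takes a new slot. This
is a simplified sketch under my own assumptions, not the patch itself.
MAX_WAIT_HIST, struct wait_hist, struct hist_ring, hist_init() and
hist_count() are hypothetical, an integer class_id stands in for
wait->class, and the reference counting done by put_wait() is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_WAIT_HIST 4	/* hypothetical; stands in for DEPT_MAX_WAIT_HIST */

struct wait_hist {
	int class_id;		/* stand-in for wait->class */
	unsigned int ctxt_id;
	unsigned int wgen;
	int used;
};

struct hist_ring {
	struct wait_hist slot[MAX_WAIT_HIST];
	int pos;		/* next slot to hand out */
};

void hist_init(struct hist_ring *r)
{
	memset(r, 0, sizeof(*r));
}

/* Most recently written slot, or NULL if nothing was recorded yet. */
static struct wait_hist *last_hist(struct hist_ring *r)
{
	int last = (r->pos + MAX_WAIT_HIST - 1) % MAX_WAIT_HIST;

	return r->slot[last].used ? &r->slot[last] : NULL;
}

/* Hand out the next slot, overwriting the oldest one when full. */
static struct wait_hist *new_hist(struct hist_ring *r)
{
	struct wait_hist *wh = &r->slot[r->pos];

	r->pos = (r->pos + 1) % MAX_WAIT_HIST;
	return wh;
}

void add_hist(struct hist_ring *r, int class_id, unsigned int wgen,
	      unsigned int ctxt_id)
{
	struct wait_hist *wh;

	assert(r != NULL);
	wh = last_hist(r);

	/* Reuse the latest slot only for the same class in the same context. */
	if (!wh || wh->class_id != class_id || wh->ctxt_id != ctxt_id)
		wh = new_hist(r);

	wh->class_id = class_id;
	wh->ctxt_id = ctxt_id;
	wh->wgen = wgen;
	wh->used = 1;
}

/* Number of slots currently holding a recorded wait. */
int hist_count(const struct hist_ring *r)
{
	int i, n = 0;

	for (i = 0; i < MAX_WAIT_HIST; i++)
		n += r->slot[i].used;
	return n;
}
```

Two consecutive waits of the same class in the same context therefore
occupy a single slot holding the newer one, which keeps older entries of
other classes in the buffer for longer.
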
 kernel/dependency/dept.c | 21 -
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c
index 3ab999f..2887858 100644
--- a/kernel/dependency/dept.c
+++ b/kernel/dependency/dept.c
@@ -1522,9 +1522,28 @@ static inline struct dept_wait_hist *new_hist(void)
return wh;
 }
 
+static inline struct dept_wait_hist *last_hist(void)
+{
+   int pos_n = hist_pos_next();
+   struct dept_wait_hist *wh_n = hist(pos_n);
+
+   /*
+* This is the first try.
+*/
+   if (!pos_n && !wh_n->wait)
+   return NULL;
+
+   return hist(pos_n + DEPT_MAX_WAIT_HIST - 1);
+}
+
 static void add_hist(struct dept_wait *w, unsigned int wg, unsigned int ctxt_id)
 {
-   struct dept_wait_hist *wh = new_hist();
+   struct dept_wait_hist *wh;
+
+   wh = last_hist();
+
+   if (!wh || wh->wait->class != w->class || wh->ctxt_id != ctxt_id)
+   wh = new_hist();
 
if (likely(wh->wait))
put_wait(wh->wait);
-- 
1.9.1
