[v2] drm/panel: Fine tune Starry-ili9882t panel HFP and HBP

2023-06-26 Thread Cong Yang
Because the setting of hproch is too small, there will be warning in
kernel log[1]. After fine tune the HFP and HBP, this warning can be
solved. The actual measurement frame rate is 60.1Hz.

[1]: WARNING kernel:[drm] HFP + HBP less than d-phy, FPS will under 60Hz

Fixes: 8716a6473e6c ("drm/panel: Support for Starry-ili9882t TDDI
MIPI-DSI panel")
Signed-off-by: Cong Yang 
Reviewed-by: Neil Armstrong 
---
v2:
  - Update commit add Fixes tag
---
 drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c 
b/drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c
index 3cc9fb0d4f5d..dc276c346fd1 100644
--- a/drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c
+++ b/drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c
@@ -2139,9 +2139,9 @@ static const struct panel_desc starry_himax83102_j02_desc 
= {
 static const struct drm_display_mode starry_ili9882t_default_mode = {
.clock = 165280,
.hdisplay = 1200,
-   .hsync_start = 1200 + 32,
-   .hsync_end = 1200 + 32 + 30,
-   .htotal = 1200 + 32 + 30 + 32,
+   .hsync_start = 1200 + 72,
+   .hsync_end = 1200 + 72 + 30,
+   .htotal = 1200 + 72 + 30 + 72,
.vdisplay = 1920,
.vsync_start = 1920 + 68,
.vsync_end = 1920 + 68 + 2,
-- 
2.25.1



[PATCH next] drm/msm/a5xx: Fix a NULL dereference bug in a5xx_gpu_init()

2023-06-26 Thread Harshit Mogalapalli
Smatch complains:
drivers/gpu/drm/msm/adreno/a5xx_gpu.c:1753
a5xx_gpu_init() warn: variable dereferenced before
check 'pdev' (see line 1746)

When no device is defined, dereferencing pdev is a NULL dereference.
Fix this by dereferencing pdev to get the config post the NULL check.

Fixes: 736a93273656 ("drm/msm/a5xx: really check for A510 in a5xx_gpu_init")
Signed-off-by: Harshit Mogalapalli 
---
This is based on static analysis, only compile tested.
---
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c 
b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index a99310b68793..adaf8f8e7f2d 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -1743,7 +1743,7 @@ struct msm_gpu *a5xx_gpu_init(struct drm_device *dev)
 {
struct msm_drm_private *priv = dev->dev_private;
struct platform_device *pdev = priv->gpu_pdev;
-   struct adreno_platform_config *config = pdev->dev.platform_data;
+   struct adreno_platform_config *config;
struct a5xx_gpu *a5xx_gpu = NULL;
struct adreno_gpu *adreno_gpu;
struct msm_gpu *gpu;
@@ -1755,6 +1755,8 @@ struct msm_gpu *a5xx_gpu_init(struct drm_device *dev)
return ERR_PTR(-ENXIO);
}
 
+   config = pdev->dev.platform_data;
+
a5xx_gpu = kzalloc(sizeof(*a5xx_gpu), GFP_KERNEL);
if (!a5xx_gpu)
return ERR_PTR(-ENOMEM);
-- 
2.39.3



Re: [Intel-gfx] [PATCH] drm/i915/guc: Dump perf_limit_reasons for debug

2023-06-26 Thread Belgaumkar, Vinay



On 6/26/2023 8:17 PM, Dixit, Ashutosh wrote:

On Mon, 26 Jun 2023 19:12:18 -0700, Vinay Belgaumkar wrote:

GuC load takes longer sometimes due to GT frequency not ramping up.
Add perf_limit_reasons to the existing warn print to see if frequency
is being throttled.

Signed-off-by: Vinay Belgaumkar 
---
  drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
index 364d0d546ec8..73911536a8e7 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
@@ -254,6 +254,8 @@ static int guc_wait_ucode(struct intel_guc *guc)
guc_warn(guc, "excessive init time: %lldms! [freq = %dMHz, before = 
%dMHz, status = 0x%08X, count = %d, ret = %d]\n",
 delta_ms, 
intel_rps_read_actual_frequency(>gt->rps),
 before_freq, status, count, ret);
+   guc_warn(guc, "perf limit reasons = 0x%08X\n",
+intel_uncore_read(uncore, 
intel_gt_perf_limit_reasons_reg(gt)));

Maybe just add at the end of the previous guc_warn?


Its already too long a line. If I try adding on the next line checkpatch 
complains about splitting double quotes.


Thanks,

Vinay.




} else {
guc_dbg(guc, "init took %lldms, freq = %dMHz, before = %dMHz, status 
= 0x%08X, count = %d, ret = %d\n",
delta_ms, 
intel_rps_read_actual_frequency(>gt->rps),
--
2.38.1



[PATCH v3 7/8] drm: Remove legacy cursor hotspot code

2023-06-26 Thread Zack Rusin
From: Zack Rusin 

Atomic modesetting support mouse cursor offsets via the hotspot
properties that are creates on cursor planes. All drivers which
support hotspot are atomic and the legacy code has been implemented
in terms of the atomic properties as well.

Due to the above the lagacy cursor hotspot code is no longer used or
needed and can be removed.

Signed-off-by: Zack Rusin 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Thomas Zimmermann 
Cc: David Airlie 
Cc: Daniel Vetter 
---
 drivers/gpu/drm/drm_plane.c   |  3 ---
 include/drm/drm_framebuffer.h | 12 
 2 files changed, 15 deletions(-)

diff --git a/drivers/gpu/drm/drm_plane.c b/drivers/gpu/drm/drm_plane.c
index ff1cc810d8f8..539fe0d94300 100644
--- a/drivers/gpu/drm/drm_plane.c
+++ b/drivers/gpu/drm/drm_plane.c
@@ -1069,9 +1069,6 @@ static int drm_mode_cursor_universal(struct drm_crtc 
*crtc,
return PTR_ERR(fb);
}
 
-   fb->hot_x = req->hot_x;
-   fb->hot_y = req->hot_y;
-
if (plane->hotspot_x_property && plane->state)
plane->state->hotspot_x = req->hot_x;
if (plane->hotspot_y_property && plane->state)
diff --git a/include/drm/drm_framebuffer.h b/include/drm/drm_framebuffer.h
index 0dcc07b68654..1e108c1789b1 100644
--- a/include/drm/drm_framebuffer.h
+++ b/include/drm/drm_framebuffer.h
@@ -188,18 +188,6 @@ struct drm_framebuffer {
 * DRM_MODE_FB_MODIFIERS.
 */
int flags;
-   /**
-* @hot_x: X coordinate of the cursor hotspot. Used by the legacy cursor
-* IOCTL when the driver supports cursor through a DRM_PLANE_TYPE_CURSOR
-* universal plane.
-*/
-   int hot_x;
-   /**
-* @hot_y: Y coordinate of the cursor hotspot. Used by the legacy cursor
-* IOCTL when the driver supports cursor through a DRM_PLANE_TYPE_CURSOR
-* universal plane.
-*/
-   int hot_y;
/**
 * @filp_head: Placed on _file.fbs, protected by _file.fbs_lock.
 */
-- 
2.39.2



[PATCH v3 2/8] drm/atomic: Add support for mouse hotspots

2023-06-26 Thread Zack Rusin
From: Zack Rusin 

Atomic modesetting code lacked support for specifying mouse cursor
hotspots. The legacy kms DRM_IOCTL_MODE_CURSOR2 had support for setting
the hotspot but the functionality was not implemented in the new atomic
paths.

Due to the lack of hotspots in the atomic paths userspace compositors
completely disable atomic modesetting for drivers that require it (i.e.
all paravirtualized drivers).

This change adds hotspot properties to the atomic codepaths throughtout
the DRM core and will allow enabling atomic modesetting for virtualized
drivers in the userspace.

Signed-off-by: Zack Rusin 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Thomas Zimmermann 
Cc: David Airlie 
Cc: Daniel Vetter 
---
 drivers/gpu/drm/drm_atomic_state_helper.c | 14 +++
 drivers/gpu/drm/drm_atomic_uapi.c | 20 +
 drivers/gpu/drm/drm_plane.c   | 51 +++
 include/drm/drm_plane.h   | 15 +++
 4 files changed, 100 insertions(+)

diff --git a/drivers/gpu/drm/drm_atomic_state_helper.c 
b/drivers/gpu/drm/drm_atomic_state_helper.c
index 784e63d70a42..54975de44a0e 100644
--- a/drivers/gpu/drm/drm_atomic_state_helper.c
+++ b/drivers/gpu/drm/drm_atomic_state_helper.c
@@ -275,6 +275,20 @@ void __drm_atomic_helper_plane_state_reset(struct 
drm_plane_state *plane_state,
plane_state->normalized_zpos = val;
}
}
+
+   if (plane->hotspot_x_property) {
+   if (!drm_object_property_get_default_value(>base,
+  
plane->hotspot_x_property,
+  ))
+   plane_state->hotspot_x = val;
+   }
+
+   if (plane->hotspot_y_property) {
+   if (!drm_object_property_get_default_value(>base,
+  
plane->hotspot_y_property,
+  ))
+   plane_state->hotspot_y = val;
+   }
 }
 EXPORT_SYMBOL(__drm_atomic_helper_plane_state_reset);
 
diff --git a/drivers/gpu/drm/drm_atomic_uapi.c 
b/drivers/gpu/drm/drm_atomic_uapi.c
index 98d3b10c08ae..07a7b3f18df2 100644
--- a/drivers/gpu/drm/drm_atomic_uapi.c
+++ b/drivers/gpu/drm/drm_atomic_uapi.c
@@ -593,6 +593,22 @@ static int drm_atomic_plane_set_property(struct drm_plane 
*plane,
} else if (plane->funcs->atomic_set_property) {
return plane->funcs->atomic_set_property(plane, state,
property, val);
+   } else if (property == plane->hotspot_x_property) {
+   if (plane->type != DRM_PLANE_TYPE_CURSOR) {
+   drm_dbg_atomic(plane->dev,
+  "[PLANE:%d:%s] is not a cursor plane: 
0x%llx\n",
+  plane->base.id, plane->name, val);
+   return -EINVAL;
+   }
+   state->hotspot_x = val;
+   } else if (property == plane->hotspot_y_property) {
+   if (plane->type != DRM_PLANE_TYPE_CURSOR) {
+   drm_dbg_atomic(plane->dev,
+  "[PLANE:%d:%s] is not a cursor plane: 
0x%llx\n",
+  plane->base.id, plane->name, val);
+   return -EINVAL;
+   }
+   state->hotspot_y = val;
} else {
drm_dbg_atomic(plane->dev,
   "[PLANE:%d:%s] unknown property [PROP:%d:%s]\n",
@@ -653,6 +669,10 @@ drm_atomic_plane_get_property(struct drm_plane *plane,
*val = state->scaling_filter;
} else if (plane->funcs->atomic_get_property) {
return plane->funcs->atomic_get_property(plane, state, 
property, val);
+   } else if (property == plane->hotspot_x_property) {
+   *val = state->hotspot_x;
+   } else if (property == plane->hotspot_y_property) {
+   *val = state->hotspot_y;
} else {
drm_dbg_atomic(dev,
   "[PLANE:%d:%s] unknown property [PROP:%d:%s]\n",
diff --git a/drivers/gpu/drm/drm_plane.c b/drivers/gpu/drm/drm_plane.c
index a4a39f4834e2..ff1cc810d8f8 100644
--- a/drivers/gpu/drm/drm_plane.c
+++ b/drivers/gpu/drm/drm_plane.c
@@ -348,6 +348,10 @@ static int __drm_universal_plane_init(struct drm_device 
*dev,
drm_object_attach_property(>base, config->prop_src_w, 0);
drm_object_attach_property(>base, config->prop_src_h, 0);
}
+   if (drm_core_check_feature(dev, DRIVER_CURSOR_HOTSPOT) &&
+   type == DRM_PLANE_TYPE_CURSOR) {
+   drm_plane_create_hotspot_properties(plane);
+   }
 
if (format_modifier_count)
create_in_format_blob(dev, plane);
@@ -1067,6 +1071,11 @@ static int drm_mode_cursor_universal(struct drm_crtc 
*crtc,
 

[PATCH v3 8/8] drm: Introduce DRM_CLIENT_CAP_VIRTUALIZED_CURSOR_PLANE

2023-06-26 Thread Zack Rusin
From: Zack Rusin 

Virtualized drivers place additional restrictions on the cursor plane
which breaks the contract of universal planes. To allow atomic
modesettings with virtualized drivers the clients need to advertise
that they're capable of dealing with those extra restrictions.

To do that introduce DRM_CLIENT_CAP_VIRTUALIZED_CURSOR_PLANE which
lets DRM know that the client is aware of and capable of dealing with
the extra restrictions on the virtual cursor plane.

Setting this option to true makes DRM expose the cursor plane on
virtualized drivers. The userspace is expected to set the hotspots
and handle mouse events on that plane.

Signed-off-by: Zack Rusin 
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Thomas Zimmermann 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: dri-devel@lists.freedesktop.org
---
 drivers/gpu/drm/drm_ioctl.c |  9 +
 include/uapi/drm/drm.h  | 26 ++
 2 files changed, 35 insertions(+)

diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c
index 8e9afe7af19c..6fd17ff14656 100644
--- a/drivers/gpu/drm/drm_ioctl.c
+++ b/drivers/gpu/drm/drm_ioctl.c
@@ -361,6 +361,15 @@ drm_setclientcap(struct drm_device *dev, void *data, 
struct drm_file *file_priv)
return -EINVAL;
file_priv->writeback_connectors = req->value;
break;
+   case DRM_CLIENT_CAP_VIRTUALIZED_CURSOR_PLANE:
+   if (!drm_core_check_feature(dev, DRIVER_CURSOR_HOTSPOT))
+   return -EOPNOTSUPP;
+   if (!file_priv->atomic)
+   return -EINVAL;
+   if (req->value > 1)
+   return -EINVAL;
+   file_priv->supports_virtualized_cursor_plane = req->value;
+   break;
default:
return -EINVAL;
}
diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h
index a87ca2d4..057ef2a16d31 100644
--- a/include/uapi/drm/drm.h
+++ b/include/uapi/drm/drm.h
@@ -836,6 +836,32 @@ struct drm_get_cap {
  */
 #define DRM_CLIENT_CAP_WRITEBACK_CONNECTORS5
 
+/**
+ * DRM_CLIENT_CAP_VIRTUALIZED_CURSOR_PLANE
+ *
+ * Drivers for para-virtualized hardware (e.g. vmwgfx, qxl, virtio and
+ * virtualbox) have additional restrictions for cursor planes (thus
+ * making cursor planes on those drivers not truly universal,) e.g.
+ * they need cursor planes to act like one would expect from a mouse
+ * cursor and have correctly set hotspot properties.
+ * If this client cap is not set the DRM core will hide cursor plane on
+ * those virtualized drivers because not setting it implies that the
+ * client is not capable of dealing with those extra restictions.
+ * Clients which do set cursor hotspot and treat the cursor plane
+ * like a mouse cursor should set this property.
+ * The client must enable _CLIENT_CAP_ATOMIC first.
+ *
+ * Setting this property on drivers which do not special case
+ * cursor planes (i.e. non-virtualized drivers) will return
+ * EOPNOTSUPP, which can be used by userspace to gauge
+ * requirements of the hardware/drivers they're running on.
+ *
+ * This capability is always supported for atomic-capable virtualized
+ * drivers starting from kernel version 6.5.
+ */
+#define DRM_CLIENT_CAP_VIRTUALIZED_CURSOR_PLANE6
+
+
 /* DRM_IOCTL_SET_CLIENT_CAP ioctl argument type */
 struct drm_set_client_cap {
__u64 capability;
-- 
2.39.2



[PATCH v3 5/8] drm/vboxvideo: Use the hotspot properties from cursor planes

2023-06-26 Thread Zack Rusin
From: Zack Rusin 

Atomic modesetting got support for mouse hotspots via the hotspot
properties. Port the legacy kms hotspot handling to the new properties
on cursor planes.

Signed-off-by: Zack Rusin 
Cc: Hans de Goede 
Cc: David Airlie 
Cc: Daniel Vetter 
---
 drivers/gpu/drm/vboxvideo/vbox_mode.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/vboxvideo/vbox_mode.c 
b/drivers/gpu/drm/vboxvideo/vbox_mode.c
index 341edd982cb3..9ff3bade9795 100644
--- a/drivers/gpu/drm/vboxvideo/vbox_mode.c
+++ b/drivers/gpu/drm/vboxvideo/vbox_mode.c
@@ -429,8 +429,8 @@ static void vbox_cursor_atomic_update(struct drm_plane 
*plane,
flags = VBOX_MOUSE_POINTER_VISIBLE | VBOX_MOUSE_POINTER_SHAPE |
VBOX_MOUSE_POINTER_ALPHA;
hgsmi_update_pointer_shape(vbox->guest_pool, flags,
-  min_t(u32, max(fb->hot_x, 0), width),
-  min_t(u32, max(fb->hot_y, 0), height),
+  min_t(u32, max(new_state->hotspot_x, 0), 
width),
+  min_t(u32, max(new_state->hotspot_y, 0), 
height),
   width, height, vbox->cursor_data, data_size);
 
mutex_unlock(>hw_mutex);
-- 
2.39.2



[PATCH v3 3/8] drm/vmwgfx: Use the hotspot properties from cursor planes

2023-06-26 Thread Zack Rusin
From: Zack Rusin 

Atomic modesetting got support for mouse hotspots via the hotspot
properties. Port the legacy kms hotspot handling to the new properties
on cursor planes.

Signed-off-by: Zack Rusin 
Cc: Martin Krastev 
Cc: Maaz Mombasawala 
---
 drivers/gpu/drm/vmwgfx/vmwgfx_kms.c | 9 ++---
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
index b62207be3363..de294dfe05d0 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
@@ -768,13 +768,8 @@ vmw_du_cursor_plane_atomic_update(struct drm_plane *plane,
struct vmw_plane_state *old_vps = vmw_plane_state_to_vps(old_state);
s32 hotspot_x, hotspot_y;
 
-   hotspot_x = du->hotspot_x;
-   hotspot_y = du->hotspot_y;
-
-   if (new_state->fb) {
-   hotspot_x += new_state->fb->hot_x;
-   hotspot_y += new_state->fb->hot_y;
-   }
+   hotspot_x = du->hotspot_x + new_state->hotspot_x;
+   hotspot_y = du->hotspot_y + new_state->hotspot_y;
 
du->cursor_surface = vps->surf;
du->cursor_bo = vps->bo;
-- 
2.39.2



[PATCH v3 6/8] drm/virtio: Use the hotspot properties from cursor planes

2023-06-26 Thread Zack Rusin
From: Zack Rusin 

Atomic modesetting got support for mouse hotspots via the hotspot
properties. Port the legacy kms hotspot handling to the new properties
on cursor planes.

Signed-off-by: Zack Rusin 
Reviewed-by: Gerd Hoffmann 
Cc: David Airlie 
Cc: Gurchetan Singh 
Cc: Chia-I Wu 
Cc: Daniel Vetter 
Cc: virtualizat...@lists.linux-foundation.org
---
 drivers/gpu/drm/virtio/virtgpu_plane.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_plane.c 
b/drivers/gpu/drm/virtio/virtgpu_plane.c
index a2e045f3a000..20de599658c1 100644
--- a/drivers/gpu/drm/virtio/virtgpu_plane.c
+++ b/drivers/gpu/drm/virtio/virtgpu_plane.c
@@ -323,16 +323,16 @@ static void virtio_gpu_cursor_plane_update(struct 
drm_plane *plane,
DRM_DEBUG("update, handle %d, pos +%d+%d, hot %d,%d\n", handle,
  plane->state->crtc_x,
  plane->state->crtc_y,
- plane->state->fb ? plane->state->fb->hot_x : 0,
- plane->state->fb ? plane->state->fb->hot_y : 0);
+ plane->state->hotspot_x,
+ plane->state->hotspot_y);
output->cursor.hdr.type =
cpu_to_le32(VIRTIO_GPU_CMD_UPDATE_CURSOR);
output->cursor.resource_id = cpu_to_le32(handle);
if (plane->state->fb) {
output->cursor.hot_x =
-   cpu_to_le32(plane->state->fb->hot_x);
+   cpu_to_le32(plane->state->hotspot_x);
output->cursor.hot_y =
-   cpu_to_le32(plane->state->fb->hot_y);
+   cpu_to_le32(plane->state->hotspot_y);
} else {
output->cursor.hot_x = cpu_to_le32(0);
output->cursor.hot_y = cpu_to_le32(0);
-- 
2.39.2



[PATCH v3 4/8] drm/qxl: Use the hotspot properties from cursor planes

2023-06-26 Thread Zack Rusin
From: Zack Rusin 

Atomic modesetting got support for mouse hotspots via the hotspot
properties. Port the legacy kms hotspot handling to the new properties
on cursor planes.

Signed-off-by: Zack Rusin 
Reviewed-by: Gerd Hoffmann 
Cc: Dave Airlie 
Cc: Daniel Vetter 
Cc: virtualizat...@lists.linux-foundation.org
Cc: spice-de...@lists.freedesktop.org
---
 drivers/gpu/drm/qxl/qxl_display.c | 14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/qxl/qxl_display.c 
b/drivers/gpu/drm/qxl/qxl_display.c
index 6492a70e3c39..5d689e0d3586 100644
--- a/drivers/gpu/drm/qxl/qxl_display.c
+++ b/drivers/gpu/drm/qxl/qxl_display.c
@@ -485,7 +485,6 @@ static int qxl_primary_atomic_check(struct drm_plane *plane,
 static int qxl_primary_apply_cursor(struct qxl_device *qdev,
struct drm_plane_state *plane_state)
 {
-   struct drm_framebuffer *fb = plane_state->fb;
struct qxl_crtc *qcrtc = to_qxl_crtc(plane_state->crtc);
struct qxl_cursor_cmd *cmd;
struct qxl_release *release;
@@ -510,8 +509,8 @@ static int qxl_primary_apply_cursor(struct qxl_device *qdev,
 
cmd = (struct qxl_cursor_cmd *)qxl_release_map(qdev, release);
cmd->type = QXL_CURSOR_SET;
-   cmd->u.set.position.x = plane_state->crtc_x + fb->hot_x;
-   cmd->u.set.position.y = plane_state->crtc_y + fb->hot_y;
+   cmd->u.set.position.x = plane_state->crtc_x + plane_state->hotspot_x;
+   cmd->u.set.position.y = plane_state->crtc_y + plane_state->hotspot_y;
 
cmd->u.set.shape = qxl_bo_physical_address(qdev, qcrtc->cursor_bo, 0);
 
@@ -531,7 +530,6 @@ static int qxl_primary_apply_cursor(struct qxl_device *qdev,
 static int qxl_primary_move_cursor(struct qxl_device *qdev,
   struct drm_plane_state *plane_state)
 {
-   struct drm_framebuffer *fb = plane_state->fb;
struct qxl_crtc *qcrtc = to_qxl_crtc(plane_state->crtc);
struct qxl_cursor_cmd *cmd;
struct qxl_release *release;
@@ -554,8 +552,8 @@ static int qxl_primary_move_cursor(struct qxl_device *qdev,
 
cmd = (struct qxl_cursor_cmd *)qxl_release_map(qdev, release);
cmd->type = QXL_CURSOR_MOVE;
-   cmd->u.position.x = plane_state->crtc_x + fb->hot_x;
-   cmd->u.position.y = plane_state->crtc_y + fb->hot_y;
+   cmd->u.position.x = plane_state->crtc_x + plane_state->hotspot_x;
+   cmd->u.position.y = plane_state->crtc_y + plane_state->hotspot_y;
qxl_release_unmap(qdev, release, >release_info);
 
qxl_release_fence_buffer_objects(release);
@@ -851,8 +849,8 @@ static int qxl_plane_prepare_fb(struct drm_plane *plane,
struct qxl_bo *old_cursor_bo = qcrtc->cursor_bo;
 
qcrtc->cursor_bo = qxl_create_cursor(qdev, user_bo,
-new_state->fb->hot_x,
-new_state->fb->hot_y);
+new_state->hotspot_x,
+new_state->hotspot_y);
qxl_free_cursor(old_cursor_bo);
}
 
-- 
2.39.2



[PATCH v3 1/8] drm: Disable the cursor plane on atomic contexts with virtualized drivers

2023-06-26 Thread Zack Rusin
From: Zack Rusin 

Cursor planes on virtualized drivers have special meaning and require
that the clients handle them in specific ways, e.g. the cursor plane
should react to the mouse movement the way a mouse cursor would be
expected to and the client is required to set hotspot properties on it
in order for the mouse events to be routed correctly.

This breaks the contract as specified by the "universal planes". Fix it
by disabling the cursor planes on virtualized drivers while adding
a foundation on top of which it's possible to special case mouse cursor
planes for clients that want it.

Disabling the cursor planes makes some kms compositors which were broken,
e.g. Weston, fallback to software cursor which works fine or at least
better than currently while having no effect on others, e.g. gnome-shell
or kwin, which put virtualized drivers on a deny-list when running in
atomic context to make them fallback to legacy kms and avoid this issue.

Signed-off-by: Zack Rusin 
Fixes: 681e7ec73044 ("drm: Allow userspace to ask for universal plane list 
(v2)")
Cc:  # v5.4+
Cc: Maarten Lankhorst 
Cc: Maxime Ripard 
Cc: Thomas Zimmermann 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Dave Airlie 
Cc: Gerd Hoffmann 
Cc: Hans de Goede 
Cc: Gurchetan Singh 
Cc: Chia-I Wu 
Cc: dri-devel@lists.freedesktop.org
Cc: virtualizat...@lists.linux-foundation.org
Cc: spice-de...@lists.freedesktop.org
---
 drivers/gpu/drm/drm_plane.c  | 13 +
 drivers/gpu/drm/qxl/qxl_drv.c|  2 +-
 drivers/gpu/drm/vboxvideo/vbox_drv.c |  2 +-
 drivers/gpu/drm/virtio/virtgpu_drv.c |  2 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.c  |  2 +-
 include/drm/drm_drv.h|  9 +
 include/drm/drm_file.h   | 12 
 7 files changed, 38 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/drm_plane.c b/drivers/gpu/drm/drm_plane.c
index 24e7998d1731..a4a39f4834e2 100644
--- a/drivers/gpu/drm/drm_plane.c
+++ b/drivers/gpu/drm/drm_plane.c
@@ -678,6 +678,19 @@ int drm_mode_getplane_res(struct drm_device *dev, void 
*data,
!file_priv->universal_planes)
continue;
 
+   /*
+* If we're running on a virtualized driver then,
+* unless userspace advertizes support for the
+* virtualized cursor plane, disable cursor planes
+* because they'll be broken due to missing cursor
+* hotspot info.
+*/
+   if (plane->type == DRM_PLANE_TYPE_CURSOR &&
+   drm_core_check_feature(dev, DRIVER_CURSOR_HOTSPOT)  &&
+   file_priv->atomic &&
+   !file_priv->supports_virtualized_cursor_plane)
+   continue;
+
if (drm_lease_held(file_priv, plane->base.id)) {
if (count < plane_resp->count_planes &&
put_user(plane->base.id, plane_ptr + count))
diff --git a/drivers/gpu/drm/qxl/qxl_drv.c b/drivers/gpu/drm/qxl/qxl_drv.c
index b30ede1cf62d..91930e84a9cd 100644
--- a/drivers/gpu/drm/qxl/qxl_drv.c
+++ b/drivers/gpu/drm/qxl/qxl_drv.c
@@ -283,7 +283,7 @@ static const struct drm_ioctl_desc qxl_ioctls[] = {
 };
 
 static struct drm_driver qxl_driver = {
-   .driver_features = DRIVER_GEM | DRIVER_MODESET | DRIVER_ATOMIC,
+   .driver_features = DRIVER_GEM | DRIVER_MODESET | DRIVER_ATOMIC | 
DRIVER_CURSOR_HOTSPOT,
 
.dumb_create = qxl_mode_dumb_create,
.dumb_map_offset = drm_gem_ttm_dumb_map_offset,
diff --git a/drivers/gpu/drm/vboxvideo/vbox_drv.c 
b/drivers/gpu/drm/vboxvideo/vbox_drv.c
index 4fee15c97c34..8ecd0863fad7 100644
--- a/drivers/gpu/drm/vboxvideo/vbox_drv.c
+++ b/drivers/gpu/drm/vboxvideo/vbox_drv.c
@@ -172,7 +172,7 @@ DEFINE_DRM_GEM_FOPS(vbox_fops);
 
 static const struct drm_driver driver = {
.driver_features =
-   DRIVER_MODESET | DRIVER_GEM | DRIVER_ATOMIC,
+   DRIVER_MODESET | DRIVER_GEM | DRIVER_ATOMIC | DRIVER_CURSOR_HOTSPOT,
 
.fops = _fops,
.name = DRIVER_NAME,
diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.c 
b/drivers/gpu/drm/virtio/virtgpu_drv.c
index a7ec5a3770da..8f4bb8a4e952 100644
--- a/drivers/gpu/drm/virtio/virtgpu_drv.c
+++ b/drivers/gpu/drm/virtio/virtgpu_drv.c
@@ -176,7 +176,7 @@ static const struct drm_driver driver = {
 * If KMS is disabled DRIVER_MODESET and DRIVER_ATOMIC are masked
 * out via drm_device::driver_features:
 */
-   .driver_features = DRIVER_MODESET | DRIVER_GEM | DRIVER_RENDER | 
DRIVER_ATOMIC,
+   .driver_features = DRIVER_MODESET | DRIVER_GEM | DRIVER_RENDER | 
DRIVER_ATOMIC | DRIVER_CURSOR_HOTSPOT,
.open = virtio_gpu_driver_open,
.postclose = virtio_gpu_driver_postclose,
 
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c 
b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
index 8b24ecf60e3e..d3e308fdfd5b 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
@@ 

[PATCH v3 0/8] Fix cursor planes with virtualized drivers

2023-06-26 Thread Zack Rusin
From: Zack Rusin 

v3: Renames, fixes and cleanups suggested by Daniel, Simon and Pekka
after v2. There's no major changes in functionality. Please let me know
if I missed anything, it's been a while since v2.

Virtualized drivers have had a lot of issues with cursor support on top
of atomic modesetting. This set both fixes the long standing problems
with atomic kms and virtualized drivers and adds code to let userspace
use atomic kms on virtualized drivers while preserving functioning
seamless cursors between the host and guest.

The first change in the set is one that should be backported as far as
possible, likely 5.4 stable, because earlier stable kernels do not have
virtualbox driver. The change makes virtualized drivers stop exposing
a cursor plane for atomic clients, this fixes mouse cursor on all well
formed compositors which will automatically fallback to software cursor.

The rest of the changes until the last one ports the legacy hotspot code
to atomic plane properties.

Finally the last change introduces userspace API to let userspace
clients advertise the fact that they are aware of additional restrictions
placed upon the cursor plane by virtualized drivers and lets them use
atomic kms with virtualized drivers (the clients are expected to set
hotspots correctly when advertising support for virtual cursor plane).

Zack Rusin (8):
  drm: Disable the cursor plane on atomic contexts with virtualized
drivers
  drm/atomic: Add support for mouse hotspots
  drm/vmwgfx: Use the hotspot properties from cursor planes
  drm/qxl: Use the hotspot properties from cursor planes
  drm/vboxvideo: Use the hotspot properties from cursor planes
  drm/virtio: Use the hotspot properties from cursor planes
  drm: Remove legacy cursor hotspot code
  drm: Introduce DRM_CLIENT_CAP_VIRTUALIZED_CURSOR_PLANE

 drivers/gpu/drm/drm_atomic_state_helper.c | 14 +
 drivers/gpu/drm/drm_atomic_uapi.c | 20 +++
 drivers/gpu/drm/drm_ioctl.c   |  9 
 drivers/gpu/drm/drm_plane.c   | 65 ++-
 drivers/gpu/drm/qxl/qxl_display.c | 14 +++--
 drivers/gpu/drm/qxl/qxl_drv.c |  2 +-
 drivers/gpu/drm/vboxvideo/vbox_drv.c  |  2 +-
 drivers/gpu/drm/vboxvideo/vbox_mode.c |  4 +-
 drivers/gpu/drm/virtio/virtgpu_drv.c  |  2 +-
 drivers/gpu/drm/virtio/virtgpu_plane.c|  8 +--
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.c   |  2 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_kms.c   |  9 +---
 include/drm/drm_drv.h |  9 
 include/drm/drm_file.h| 12 +
 include/drm/drm_framebuffer.h | 12 -
 include/drm/drm_plane.h   | 15 ++
 include/uapi/drm/drm.h| 26 +
 17 files changed, 186 insertions(+), 39 deletions(-)

-- 
2.39.2



Re: [PATCH v5 03/11] drm/mediatek: gamma: Support SoC specific LUT size

2023-06-26 Thread 胡俊光


Re: [Intel-gfx] [PATCH] drm/i915/guc: Dump perf_limit_reasons for debug

2023-06-26 Thread Dixit, Ashutosh
On Mon, 26 Jun 2023 19:12:18 -0700, Vinay Belgaumkar wrote:
> 
> GuC load takes longer sometimes due to GT frequency not ramping up.
> Add perf_limit_reasons to the existing warn print to see if frequency
> is being throttled.
> 
> Signed-off-by: Vinay Belgaumkar 
> ---
>  drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c 
> b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
> index 364d0d546ec8..73911536a8e7 100644
> --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
> +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
> @@ -254,6 +254,8 @@ static int guc_wait_ucode(struct intel_guc *guc)
>   guc_warn(guc, "excessive init time: %lldms! [freq = %dMHz, 
> before = %dMHz, status = 0x%08X, count = %d, ret = %d]\n",
>delta_ms, 
> intel_rps_read_actual_frequency(>gt->rps),
>before_freq, status, count, ret);
> + guc_warn(guc, "perf limit reasons = 0x%08X\n",
> +  intel_uncore_read(uncore, 
> intel_gt_perf_limit_reasons_reg(gt)));

Maybe just add at the end of the previous guc_warn?

>   } else {
>   guc_dbg(guc, "init took %lldms, freq = %dMHz, before = %dMHz, 
> status = 0x%08X, count = %d, ret = %d\n",
>   delta_ms, 
> intel_rps_read_actual_frequency(>gt->rps),
> -- 
> 2.38.1
> 


[PATCH] drm/i915/guc: Dump perf_limit_reasons for debug

2023-06-26 Thread Vinay Belgaumkar
GuC load takes longer sometimes due to GT frequency not ramping up.
Add perf_limit_reasons to the existing warn print to see if frequency
is being throttled.

Signed-off-by: Vinay Belgaumkar 
---
 drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c 
b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
index 364d0d546ec8..73911536a8e7 100644
--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c
@@ -254,6 +254,8 @@ static int guc_wait_ucode(struct intel_guc *guc)
guc_warn(guc, "excessive init time: %lldms! [freq = %dMHz, 
before = %dMHz, status = 0x%08X, count = %d, ret = %d]\n",
 delta_ms, 
intel_rps_read_actual_frequency(>gt->rps),
 before_freq, status, count, ret);
+   guc_warn(guc, "perf limit reasons = 0x%08X\n",
+intel_uncore_read(uncore, 
intel_gt_perf_limit_reasons_reg(gt)));
} else {
guc_dbg(guc, "init took %lldms, freq = %dMHz, before = %dMHz, 
status = 0x%08X, count = %d, ret = %d\n",
delta_ms, 
intel_rps_read_actual_frequency(>gt->rps),
-- 
2.38.1



linux-next: manual merge of the drm tree with the drm-misc-fixes tree

2023-06-26 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the drm tree got a conflict in:

  include/drm/gpu_scheduler.h

between commit:

  db8b4968a8d0 ("drm/sched: Call drm_sched_fence_set_parent() from 
drm_sched_fence_scheduled()")

from the drm-misc-fixes tree and commit:

  539f9ee4b52a ("drm/scheduler: properly forward fence errors")

from the drm tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc include/drm/gpu_scheduler.h
index b29e347b10a9,e95b4837e5a3..
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@@ -581,16 -581,18 +581,17 @@@ void drm_sched_entity_push_job(struct d
  void drm_sched_entity_set_priority(struct drm_sched_entity *entity,
   enum drm_sched_priority priority);
  bool drm_sched_entity_is_ready(struct drm_sched_entity *entity);
+ int drm_sched_entity_error(struct drm_sched_entity *entity);
  
 -void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence,
 -  struct dma_fence *fence);
  struct drm_sched_fence *drm_sched_fence_alloc(
struct drm_sched_entity *s_entity, void *owner);
  void drm_sched_fence_init(struct drm_sched_fence *fence,
  struct drm_sched_entity *entity);
  void drm_sched_fence_free(struct drm_sched_fence *fence);
  
 -void drm_sched_fence_scheduled(struct drm_sched_fence *fence);
 +void drm_sched_fence_scheduled(struct drm_sched_fence *fence,
 + struct dma_fence *parent);
- void drm_sched_fence_finished(struct drm_sched_fence *fence);
+ void drm_sched_fence_finished(struct drm_sched_fence *fence, int result);
  
  unsigned long drm_sched_suspend_timeout(struct drm_gpu_scheduler *sched);
  void drm_sched_resume_timeout(struct drm_gpu_scheduler *sched,


pgpUI0jSomqvI.pgp
Description: OpenPGP digital signature


Re: [RFC PATCH 0/3] Support for Solid Fill Planes

2023-06-26 Thread Dmitry Baryshkov
On Tue, 27 Jun 2023 at 03:45, Jessica Zhang  wrote:
>
>
>
> On 6/26/2023 5:06 PM, Dmitry Baryshkov wrote:
> > On 27/06/2023 02:02, Jessica Zhang wrote:
> >>
> >>
> >> On 11/7/2022 11:37 AM, Ville Syrjälä wrote:
> >>> On Fri, Oct 28, 2022 at 03:59:49PM -0700, Jessica Zhang wrote:
>  Introduce and add support for COLOR_FILL and COLOR_FILL_FORMAT
>  properties. When the color fill value is set, and the framebuffer is
>  set
>  to NULL, memory fetch will be disabled.
> >>>
> >>> Thinking a bit more universally I wonder if there should be
> >>> some kind of enum property:
> >>>
> >>> enum plane_pixel_source {
> >>> FB,
> >>> COLOR,
> >>> LIVE_FOO,
> >>> LIVE_BAR,
> >>> ...
> >>> }
> >>
> >> Reviving this thread as this was the initial comment suggesting to
> >> implement pixel_source as an enum.
> >>
> >> I think the issue with having pixel_source as an enum is how to decide
> >> what counts as a NULL commit.
> >>
> >> Currently, setting the FB to NULL will disable the plane. So I'm
> >> guessing we will extend that logic to "if there's no pixel_source set
> >> for the plane, then it will be a NULL commit and disable the plane".
> >>
> >> In that case, the question then becomes when to set the pixel_source
> >> to NONE. Because if we do that when setting a NULL FB (or NULL
> >> solid_fill blob), it then forces userspace to set one property before
> >> the other.
> >
> > Why? The userspace should use atomic commits and as such it should all
> > properties at the same time.
>
> Correct, userspace will set all the properties within the same atomic
> commit. The issue happens when the driver iterates through each property
> within the MODE_ATOMIC ioctl [1].
>
> For reference, I'm thinking of an implementation where we're setting the
> pixel_source within drm_atomic_plane_set_property().
>
> So something like:
>
> drm_atomic_plane_set_property( ... )
> {
>  if (property == config->prop_fb_id) {
>  if (fb)
>  state->pixel_source = FB;
>  else
>  state->pixel_source = NONE;
>  } else if (property == config->prop_solid_fill) {
>  if (solid_fill_blob)
>  state->pixel_source = SOLID_FILL;
>  }
>
>  // ...
> }

I think this is somewhat overcomplicated. Allow userspace to set these
properties as it sees fit and then in
drm_atomic_helper_check_plane_state() consider all of them to set
plane_state->visible.

We still have to remain compatible with older userspace (esp. with a
non-atomic one). It expects that a plane is enabled after setting both
CRTC and FB. So maybe you are right and we should force pixel_source
to FB if FB is set.

>
> If userspace sets solid_fill to a valid blob and FB to NULL, it's
> possible for driver to first set the solid_fill property then set the
> fb_id property later. This would cause pixel_source to be set to NONE
> after all properties have been set.
>
> I've also considered an implementation without the `pixel_source = NONE`
> line in the prop_fb_id case, but we would still need to find somewhere
> to set the pixel_source to NONE in order to allow userspace to disable a
> plane.

Good point. I don't think we would need NONE (just setting CRTC to
none or FB to none and pixel_source to FB would disable the plane),
but I might be missing something here.

>
> [1]
> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_atomic_uapi.c#L1385
>
> >
> >> Because of that, I'm thinking of having pixel_source be represented by
> >> a bitmask instead. That way, we will simply unset the corresponding
> >> pixel_source bit when passing in a NULL FB/solid_fill blob. Then, in
> >> order to detect whether a commit is NULL or has a valid pixel source,
> >> we can just check if pixel_source == 0.
> >>
> >> Would be interested in any feedback on this.
> >
> > This is an interesting idea. Frankly speaking, I'd consider it
> > counter-intuitive at the first glance.
> >
> > Consider it to act as a curtain. Setup the curtain (by writing the fill
> > colour property). Then one can close the curtain (by switching source to
> > colour), or open it (by switching to any other source). Bitmask wouldn't
> > complicate this.
>
> So if I'm understanding your analogy correctly, pixel_source won't
> necessarily be set whenever the FB or solid_fill properties are set. So
> that way we can have both FB *and* solid_fill set at the same time, but
> only the source that pixel_source is set to would be displayed.

Yes. And if the source is not configured, the plane will be marked as
not visible.

>
> Thanks,
>
> Jessica Zhang
>
> >
> >>
> >> Thanks,
> >>
> >> Jessica Zhang
> >>
> >>>
>  In addition, loosen the NULL FB checks within the atomic commit
>  callstack
>  to allow a NULL FB when color_fill is nonzero and add FB checks in
>  methods where the FB was previously assumed to be non-NULL.
> 
>  Finally, have the DPU driver use drm_plane_state.color_fill and
>  

Re: [RFC PATCH 0/3] Support for Solid Fill Planes

2023-06-26 Thread Jessica Zhang




On 6/26/2023 5:06 PM, Dmitry Baryshkov wrote:

On 27/06/2023 02:02, Jessica Zhang wrote:



On 11/7/2022 11:37 AM, Ville Syrjälä wrote:

On Fri, Oct 28, 2022 at 03:59:49PM -0700, Jessica Zhang wrote:

Introduce and add support for COLOR_FILL and COLOR_FILL_FORMAT
properties. When the color fill value is set, and the framebuffer is 
set

to NULL, memory fetch will be disabled.


Thinking a bit more universally I wonder if there should be
some kind of enum property:

enum plane_pixel_source {
FB,
COLOR,
LIVE_FOO,
LIVE_BAR,
...
}


Reviving this thread as this was the initial comment suggesting to 
implement pixel_source as an enum.


I think the issue with having pixel_source as an enum is how to decide 
what counts as a NULL commit.


Currently, setting the FB to NULL will disable the plane. So I'm 
guessing we will extend that logic to "if there's no pixel_source set 
for the plane, then it will be a NULL commit and disable the plane".


In that case, the question then becomes when to set the pixel_source 
to NONE. Because if we do that when setting a NULL FB (or NULL 
solid_fill blob), it then forces userspace to set one property before 
the other.


Why? The userspace should use atomic commits and as such it should all 
properties at the same time.


Correct, userspace will set all the properties within the same atomic 
commit. The issue happens when the driver iterates through each property 
within the MODE_ATOMIC ioctl [1].


For reference, I'm thinking of an implementation where we're setting the 
pixel_source within drm_atomic_plane_set_property().


So something like:

drm_atomic_plane_set_property( ... )
{
if (property == config->prop_fb_id) {
if (fb)
state->pixel_source = FB;
else
state->pixel_source = NONE;
} else if (property == config->prop_solid_fill) {
if (solid_fill_blob)
state->pixel_source = SOLID_FILL;
}

// ...
}

If userspace sets solid_fill to a valid blob and FB to NULL, it's 
possible for driver to first set the solid_fill property then set the 
fb_id property later. This would cause pixel_source to be set to NONE 
after all properties have been set.


I've also considered an implementation without the `pixel_source = NONE` 
line in the prop_fb_id case, but we would still need to find somewhere 
to set the pixel_source to NONE in order to allow userspace to disable a 
plane.


[1] 
https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_atomic_uapi.c#L1385




Because of that, I'm thinking of having pixel_source be represented by 
a bitmask instead. That way, we will simply unset the corresponding 
pixel_source bit when passing in a NULL FB/solid_fill blob. Then, in 
order to detect whether a commit is NULL or has a valid pixel source, 
we can just check if pixel_source == 0.


Would be interested in any feedback on this.


This is an interesting idea. Frankly speaking, I'd consider it 
counter-intuitive at the first glance.


Consider it to act as a curtain. Setup the curtain (by writing the fill 
colour property). Then one can close the curtain (by switching source to 
colour), or open it (by switching to any other source). Bitmask wouldn't 
complicate this.


So if I'm understanding your analogy correctly, pixel_source won't 
necessarily be set whenever the FB or solid_fill properties are set. So 
that way we can have both FB *and* solid_fill set at the same time, but 
only the source that pixel_source is set to would be displayed.


Thanks,

Jessica Zhang





Thanks,

Jessica Zhang



In addition, loosen the NULL FB checks within the atomic commit 
callstack

to allow a NULL FB when color_fill is nonzero and add FB checks in
methods where the FB was previously assumed to be non-NULL.

Finally, have the DPU driver use drm_plane_state.color_fill and
drm_plane_state.color_fill_format instead of 
dpu_plane_state.color_fill,
and add extra checks in the DPU atomic commit callstack to account 
for a

NULL FB in cases where color_fill is set.

Some drivers support hardware that have optimizations for solid fill
planes. This series aims to expose these capabilities to userspace as
some compositors have a solid fill flag (ex. SOLID_COLOR in the Android
hardware composer HAL) that can be set by apps like the Android Gears
app.

Userspace can set the color_fill value by setting COLOR_FILL_FORMAT 
to a

DRM format, setting COLOR_FILL to a color fill value, and setting the
framebuffer to NULL.


Is there some real reason for the format property? Ie. why not
just do what was the plan for the crttc background color and
specify the color in full 16bpc format and just pick as many
msbs from that as the hw can use?



Jessica Zhang (3):
   drm: Introduce color fill properties for drm plane
   drm: Adjust atomic checks for solid fill color
   drm/msm/dpu: Use color_fill property for DPU planes

  drivers/gpu/drm/drm_atomic.c  | 68 
---

  

Re: [PATCH v10 00/25] DEPT(Dependency Tracker)

2023-06-26 Thread Byungchul Park
On Mon, Jun 26, 2023 at 03:02:22PM +0200, Greg KH wrote:
> On Mon, Jun 26, 2023 at 08:56:35PM +0900, Byungchul Park wrote:
> > >From now on, I can work on LKML again! I'm wondering if DEPT has been
> > helping kernel debugging well even though it's a form of patches yet.
> > 
> > I'm happy to see that DEPT reports a real problem in practice. See:
> > 
> >
> > https://lore.kernel.org/lkml/6383cde5-cf4b-facf-6e07-1378a4856...@i-love.sakura.ne.jp/#t
> >
> > https://lore.kernel.org/lkml/1674268856-31807-1-git-send-email-byungchul.p...@lge.com/
> > 
> > Nevertheless, I apologize for the lack of document. I promise to add it
> > before it gets needed to use DEPT's APIs by users. For now, you can use
> > DEPT just with CONFIG_DEPT on.
> > 
> > ---
> > 
> > Hi Linus and folks,
> > 
> > I've been developing a tool for detecting deadlock possibilities by
> > tracking wait/event rather than lock(?) acquisition order to try to
> > cover all synchonization machanisms. It's done on v6.2-rc2.
> > 
> > Benifit:
> > 
> > 0. Works with all lock primitives.
> > 1. Works with wait_for_completion()/complete().
> > 2. Works with 'wait' on PG_locked.
> > 3. Works with 'wait' on PG_writeback.
> > 4. Works with swait/wakeup.
> > 5. Works with waitqueue.
> > 6. Works with wait_bit.
> > 7. Multiple reports are allowed.
> > 8. Deduplication control on multiple reports.
> > 9. Withstand false positives thanks to 6.
> > 10. Easy to tag any wait/event.
> > 
> > Future work:
> > 
> > 0. To make it more stable.
> > 1. To separates Dept from Lockdep.
> > 2. To improves performance in terms of time and space.
> > 3. To use Dept as a dependency engine for Lockdep.
> > 4. To add any missing tags of wait/event in the kernel.
> > 5. To deduplicate stack trace.
> 
> If you run this today, does it find any issues with any subsystems /
> drivers that the current lockdep code does not find?  Have you run your

Yes, it found some deadlocks. The following issue was about a deadlock
by PG_locked detected by DEPT which lockdep couldn't. Check it out:

   
https://lore.kernel.org/lkml/6383cde5-cf4b-facf-6e07-1378a4856...@i-love.sakura.ne.jp/#t
   
https://lore.kernel.org/lkml/1674268856-31807-1-git-send-email-byungchul.p...@lge.com/

> tool on patches sent to the different mailing lists for new drivers and
> code added to the tree to verify that it can find issues easily?

I had been co-working with GPU driver developers for their new drivers
adding to their CI system to verify that it can find issues easily. Now
that I've almost organized my stuff, I will re-start it.

> In other words, why do we need this at all?  What makes it 'better' than
> what we already have that works for us today?  What benifit is it?

AS IS : It can detect deadlocks by wrong lock usage e.g. acqusition order.
Once it reports a issue, you must resolve it or work around to
see further reports even if it's not one you are into.

TO BE : It can detect deadlocks by not only locks but also any waits e.g.
wait_for_completion(), PG_locked, PG_writeback, dma fence and
so on. Last but not least, DEPT can report issues multiple times
at a single system-up so that any issues that you are not into,
no longer prevent further reports that is valuable to you.

However, yes. DEPT needs to be more matured. I'd like to do that together.

Byungchul

> thanks,
> 
> greg k-h


Re: [RFC PATCH 0/3] Support for Solid Fill Planes

2023-06-26 Thread Dmitry Baryshkov

On 27/06/2023 02:02, Jessica Zhang wrote:



On 11/7/2022 11:37 AM, Ville Syrjälä wrote:

On Fri, Oct 28, 2022 at 03:59:49PM -0700, Jessica Zhang wrote:

Introduce and add support for COLOR_FILL and COLOR_FILL_FORMAT
properties. When the color fill value is set, and the framebuffer is set
to NULL, memory fetch will be disabled.


Thinking a bit more universally I wonder if there should be
some kind of enum property:

enum plane_pixel_source {
FB,
COLOR,
LIVE_FOO,
LIVE_BAR,
...
}


Reviving this thread as this was the initial comment suggesting to 
implement pixel_source as an enum.


I think the issue with having pixel_source as an enum is how to decide 
what counts as a NULL commit.


Currently, setting the FB to NULL will disable the plane. So I'm 
guessing we will extend that logic to "if there's no pixel_source set 
for the plane, then it will be a NULL commit and disable the plane".


In that case, the question then becomes when to set the pixel_source to 
NONE. Because if we do that when setting a NULL FB (or NULL solid_fill 
blob), it then forces userspace to set one property before the other.


Why? The userspace should use atomic commits and as such it should all 
properties at the same time.


Because of that, I'm thinking of having pixel_source be represented by a 
bitmask instead. That way, we will simply unset the corresponding 
pixel_source bit when passing in a NULL FB/solid_fill blob. Then, in 
order to detect whether a commit is NULL or has a valid pixel source, we 
can just check if pixel_source == 0.


Would be interested in any feedback on this.


This is an interesting idea. Frankly speaking, I'd consider it 
counter-intuitive at the first glance.


Consider it to act as a curtain. Setup the curtain (by writing the fill 
colour property). Then one can close the curtain (by switching source to 
colour), or open it (by switching to any other source). Bitmask wouldn't 
complicate this.




Thanks,

Jessica Zhang



In addition, loosen the NULL FB checks within the atomic commit 
callstack

to allow a NULL FB when color_fill is nonzero and add FB checks in
methods where the FB was previously assumed to be non-NULL.

Finally, have the DPU driver use drm_plane_state.color_fill and
drm_plane_state.color_fill_format instead of dpu_plane_state.color_fill,
and add extra checks in the DPU atomic commit callstack to account for a
NULL FB in cases where color_fill is set.

Some drivers support hardware that have optimizations for solid fill
planes. This series aims to expose these capabilities to userspace as
some compositors have a solid fill flag (ex. SOLID_COLOR in the Android
hardware composer HAL) that can be set by apps like the Android Gears
app.

Userspace can set the color_fill value by setting COLOR_FILL_FORMAT to a
DRM format, setting COLOR_FILL to a color fill value, and setting the
framebuffer to NULL.


Is there some real reason for the format property? Ie. why not
just do what was the plan for the crttc background color and
specify the color in full 16bpc format and just pick as many
msbs from that as the hw can use?



Jessica Zhang (3):
   drm: Introduce color fill properties for drm plane
   drm: Adjust atomic checks for solid fill color
   drm/msm/dpu: Use color_fill property for DPU planes

  drivers/gpu/drm/drm_atomic.c  | 68 ---
  drivers/gpu/drm/drm_atomic_helper.c   | 34 +++-
  drivers/gpu/drm/drm_atomic_uapi.c |  8 +++
  drivers/gpu/drm/drm_blend.c   | 38 +
  drivers/gpu/drm/drm_plane.c   |  8 +--
  drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c  |  7 ++-
  drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 66 ++
  include/drm/drm_atomic_helper.h   |  5 +-
  include/drm/drm_blend.h   |  2 +
  include/drm/drm_plane.h   | 28 ++
  10 files changed, 188 insertions(+), 76 deletions(-)

--
2.38.0


--
Ville Syrjälä
Intel


--
With best wishes
Dmitry



Re: [RFC PATCH 0/3] Support for Solid Fill Planes

2023-06-26 Thread Jessica Zhang




On 11/7/2022 11:37 AM, Ville Syrjälä wrote:

On Fri, Oct 28, 2022 at 03:59:49PM -0700, Jessica Zhang wrote:

Introduce and add support for COLOR_FILL and COLOR_FILL_FORMAT
properties. When the color fill value is set, and the framebuffer is set
to NULL, memory fetch will be disabled.


Thinking a bit more universally I wonder if there should be
some kind of enum property:

enum plane_pixel_source {
FB,
COLOR,
LIVE_FOO,
LIVE_BAR,
...
}


Reviving this thread as this was the initial comment suggesting to 
implement pixel_source as an enum.


I think the issue with having pixel_source as an enum is how to decide 
what counts as a NULL commit.


Currently, setting the FB to NULL will disable the plane. So I'm 
guessing we will extend that logic to "if there's no pixel_source set 
for the plane, then it will be a NULL commit and disable the plane".


In that case, the question then becomes when to set the pixel_source to 
NONE. Because if we do that when setting a NULL FB (or NULL solid_fill 
blob), it then forces userspace to set one property before the other.


Because of that, I'm thinking of having pixel_source be represented by a 
bitmask instead. That way, we will simply unset the corresponding 
pixel_source bit when passing in a NULL FB/solid_fill blob. Then, in 
order to detect whether a commit is NULL or has a valid pixel source, we 
can just check if pixel_source == 0.


Would be interested in any feedback on this.

Thanks,

Jessica Zhang




In addition, loosen the NULL FB checks within the atomic commit callstack
to allow a NULL FB when color_fill is nonzero and add FB checks in
methods where the FB was previously assumed to be non-NULL.

Finally, have the DPU driver use drm_plane_state.color_fill and
drm_plane_state.color_fill_format instead of dpu_plane_state.color_fill,
and add extra checks in the DPU atomic commit callstack to account for a
NULL FB in cases where color_fill is set.

Some drivers support hardware that have optimizations for solid fill
planes. This series aims to expose these capabilities to userspace as
some compositors have a solid fill flag (ex. SOLID_COLOR in the Android
hardware composer HAL) that can be set by apps like the Android Gears
app.

Userspace can set the color_fill value by setting COLOR_FILL_FORMAT to a
DRM format, setting COLOR_FILL to a color fill value, and setting the
framebuffer to NULL.


Is there some real reason for the format property? Ie. why not
just do what was the plan for the crttc background color and
specify the color in full 16bpc format and just pick as many
msbs from that as the hw can use?



Jessica Zhang (3):
   drm: Introduce color fill properties for drm plane
   drm: Adjust atomic checks for solid fill color
   drm/msm/dpu: Use color_fill property for DPU planes

  drivers/gpu/drm/drm_atomic.c  | 68 ---
  drivers/gpu/drm/drm_atomic_helper.c   | 34 +++-
  drivers/gpu/drm/drm_atomic_uapi.c |  8 +++
  drivers/gpu/drm/drm_blend.c   | 38 +
  drivers/gpu/drm/drm_plane.c   |  8 +--
  drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c  |  7 ++-
  drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 66 ++
  include/drm/drm_atomic_helper.h   |  5 +-
  include/drm/drm_blend.h   |  2 +
  include/drm/drm_plane.h   | 28 ++
  10 files changed, 188 insertions(+), 76 deletions(-)

--
2.38.0


--
Ville Syrjälä
Intel


Re: [PATCH v2 08/10] HID: i2c-hid: Support being a panel follower

2023-06-26 Thread Doug Anderson
Benjamin,

On Thu, Jun 8, 2023 at 8:37 AM Benjamin Tissoires
 wrote:
>
> > +static const struct drm_panel_follower_funcs 
> > i2c_hid_core_panel_follower_funcs = {
> > + .panel_prepared = i2c_hid_core_panel_prepared,
> > + .panel_unpreparing = i2c_hid_core_panel_unpreparing,
> > +};
>
> Can we make that above block at least behind a Kconfig?
>
> i2c-hid is often used for touchpads, and the notion of drm panel has
> nothing to do with them. So I'd be more confident if we could disable
> that code if not required.

Now that other concerns are addressed, I started trying to write up a
v3 and I found myself writing this as the description of the Kconfig
entry:

--
config I2C_HID_SUPPORT_PANEL_FOLLOWER
bool "Support i2c-hid devices that must be power sequenced with a panel"

Say Y here if you want support for i2c-hid devices that need to
coordinate power sequencing with a panel. This is typically important
when you have a panel and a touchscreen that share power rails or
reset GPIOs. If you say N here then the kernel will not try to honor
any shared power sequencing for your hardware. In the best case,
ignoring power sequencing when it's needed will draw extra power. In
the worst case this will prevent your hardware from functioning or
could even damage your hardware.

If unsure, say Y.

--

I can certainly go that way, but I just wanted to truly make sure
that's what we want. Specifically:

1. If we put the panel follower code behind a Kconfig then we actually
have no idea if a touchscreen was intended to be a panel follower.
Specifically the panel follower API is the one that detects the
connection between the panel and the i2c-hid device, so without being
able to call the panel follower API we have no idea that an i2c-hid
device was supposed to be a panel follower.

2. It is conceivable that power sequencing a device incorrectly could
truly cause hardware damage.

Together, those points mean that if you turn off the Kconfig entry and
then try to boot on a device that needed that Kconfig setting that you
might damage hardware. I can code it up that way if you want, but it
worries me...


Alternatives that I can think of:

a) I could change the panel follower API so that panel followers are
in charge of detecting the panel that they follow. Today, that looks
like:

   panel_np = of_parse_phandle(dev->of_node, "panel", 0);
   if (panel_np)
   /* It's a panel follower */
   of_node_put(panel_np);

...so we could put that code in each touchscreen driver and then fail
to probe i2c-hid if we detect that we're supposed to be a panel
follower but the Kconfig is turned off. The above doesn't seem
massively ideal since it duplicates code. Also, one reason why I put
that code in drm_panel_add_follower() is that I think this concept
will eventually be needed even for non-DT cases. I don't know how to
write the non-DT code right now, though...


b) I could open-code detect the panel follower case but leave the
actual linking to the panel follower API. AKA add to i2c-hid:

   if (of_property_read_bool(dev->of_node, "panel"))
   /* It's a panel follower */

...that's a smaller bit of code, but feels like an abstraction
violation. It also would need to be updated if/when we added support
for non-DT panel followers.


c) I could add a "static inline" implementation of b) to "drm_panel.h".

That sounds great and I started doing it. ...but then realized that it
means adding to drm_panel.h:

#include 
#include 

...because otherwise of_property_read_bool() isn't defined and "struct
device" can't be dereferenced. That might be OK, but it looks as if
folks have been working hard to avoid things like this in header
files. Presumably it would get uglier if/when we added the non-DT
case, as well. That being said, I can give it a shot...

--

At this point, I'm hoping for some advice. How important is it for you
to have a Kconfig for "I2C_HID_SUPPORT_PANEL_FOLLOWER"?

NOTE: even if I don't add the Kconfig, I could at least create a
function for registering the panel follower that would get most of the
panel follower logic out of the probe function. Would that be enough?

Thanks!

-Doug


Re: [PATCH 06/15] dt-bindings: display/msm: sc7180-dpu: Describe SM6125

2023-06-26 Thread Konrad Dybcio
On 26.06.2023 22:28, Marijn Suijten wrote:
> On 2023-06-26 20:57:51, Konrad Dybcio wrote:
>> On 26.06.2023 19:54, Marijn Suijten wrote:
>>> On 2023-06-26 18:16:58, Krzysztof Kozlowski wrote:
 On 25/06/2023 21:52, Marijn Suijten wrote:
> On 2023-06-24 11:12:52, Krzysztof Kozlowski wrote:
>> On 24/06/2023 02:41, Marijn Suijten wrote:
>>> SM6125 is identical to SM6375 except that while downstream also defines
>>> a throttle clock, its presence results in timeouts whereas SM6375
>>> requires it to not observe any timeouts.
>>
>> Then it should not be allowed, so you need either "else:" block or
>> another "if: properties: compatible:" to disallow it. Because in current
>> patch it would be allowed.
>
> That means this binding is wrong/incomplete for all other SoCs then.
> clock(-name)s has 6 items, and sets `minItems: 6`.  Only for sm6375-dpu
>>>
>>> Of course meant to say that clock(-name)s has **7** items, not 6.
>>>
> does it set `minItems: 7`, but an else case is missing.

 Ask the author why it is done like this.
>>>
>>> Konrad, can you clarify why other 
> 
> (Looks like I forgot to complete this sentence before sending,
> apologies)
> 
>> 6375 needs the throttle clk and the clock(-names) are strongly ordered
>> so having minItems: 6 discards the last entry
> 
> The question is whether or not we should have maxItems: 6 to disallow
> the clock from being passed: right now it is optional and either is
> allowed for any !6375 SoC.
That's a very good question. I don't have a 7180 to test, but for
you it seems to cause inexplicable issues on 6125..

Konrad
> 
> - Marijn
> 
>>
>> Konrad
>>>
> Shall I send a Fixes: ed41005f5b7c ("dt-bindings: display/msm:
> sc7180-dpu: Describe SM6350 and SM6375") for that, and should maxItems:
> 6 be the default under clock(-name)s or in an else:?

 There is no bug to fix. Or at least it is not yet known. Whether other
 devices should be constrained as well - sure, sounds reasonable, but I
 did not check the code exactly.
>>>
>>> I don't know either, but we need this information to decide whether to
>>> use `maxItems: 6`:
>>>
>>> 1. Directly on the property;
>>> 2. In an `else:` case on the current `if: sm6375-dpu` (should have the
>>>same effect as 1., afaik);
>>> 3. In a second `if:` case that lists all SoCS explicitly.
>>>
>>> Since we don't have this information, I think option 3. is the right way
>>> to go, setting `maxItems: 6` for qcom,sm6125-dpu.
>>>
>>> However, it is not yet understood why downstream is able to use the
>>> throttle clock without repercussions.
>>>
 We talk here about this patch.
>>>
>>> We used this patch to discover that other SoCs are similarly
>>> unconstrained.  But if you don't want me to look into it, by all means!
>>> Saves me a lot of time.  So I will go with option 3.
>>>
>>> - Marijn


Re: [PATCH drm-next v5 03/14] drm: manager to keep track of GPUs VA mappings

2023-06-26 Thread Dave Airlie
> > As pointed out by Christian, this would optimize the "get all mappings
> > backed by a specific BO from a given VM" use case.
> >
> > The question for me is, do other drivers than amdgpu commonly need this?
>
> I have no idea.
>
> >
> > And what does amdgpu need this for? Maybe amdgpu does something we're
> > not doing (yet)?
>
> Basically when we do a CS we need to make sure that the VM used by this
> CS is up to date. For this we walk over the relocation list of BOs and
> check the status of each BO+VM structure.
>
> This is done because we don't want to update all VMs at the same time,
> but rather just those who needs the update.

This seems like a legacy from GL and possibly older vulkan, going
forward vulkan can't rely on passing lists of objects into the kernel
due to things like buffer_device_address, I'm not sure we should
optimise for this situation, and moving the tracking list into the
drivers is going to mean having a bunch of drivers all having the same
boilerplate, to do the same thing just so amdgpu can't avoid doing it.

Now there might be some benchmark somewhere we can produce a benefit
in this, and if there is then we should consider going this way for
all drivers and not just allowing drivers to choose their own path
here.

> >
> > Christian - I know you didn't ask for "do it the way amdgpu does",
> > instead you voted for keeping it entirely driver specific. But I think
> > everyone is pretty close and I'm still optimistic that we could just
> > generalize this.
>
> Well, you should *not* necessarily do it like amdgpu does! Basically the
> implementation in amdgpu was driven by requirements, e.g. we need that,
> let's do it like this.
>
> It's perfectly possible that other requirements (e.g. focus on Vulkan)
> lead to a completely different implementation.
>
> It's just that ideally I would like to have an implementation where I
> can apply at least the basics to amdgpu as well.
>

I think we can still do that just either have an option to disable
using the list internally in the gpuva or have the driver keep it's
own tracking alongside, there may still be use cases where it can use
the gpuva tracking instead of it's own.

I don't think we should forklift what is pretty likely to be common
code across every driver that uses this going forward just to benefit
an amdgpu design decision for older stacks.

Dave.


Re: [PATCH v4 1/1] drm/doc: Document DRM device reset expectations

2023-06-26 Thread Randy Dunlap
Hi,

On 6/26/23 11:33, André Almeida wrote:
> Create a section that specifies how to deal with DRM device resets for
> kernel and userspace drivers.
> 
> Signed-off-by: André Almeida 
> ---
>  Documentation/gpu/drm-uapi.rst | 68 ++
>  1 file changed, 68 insertions(+)
> 
> diff --git a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-uapi.rst
> index 65fb3036a580..25a11b9b98fa 100644
> --- a/Documentation/gpu/drm-uapi.rst
> +++ b/Documentation/gpu/drm-uapi.rst
> @@ -285,6 +285,74 @@ for GPU1 and GPU2 from different vendors, and a third 
> handler for
>  mmapped regular files. Threads cause additional pain with signal
>  handling as well.
>  
> +Device reset
> +
> +
> +The GPU stack is really complex and is prone to errors, from hardware bugs,
> +faulty applications and everything in between the many layers. Some errors
> +require resetting the device in order to make the device usable again. This
> +section describes what is the expectations for DRM and usermode drivers when 
> a

   sections describes the expectations for DRM and usermode drivers when a

> +device resets and how to propagate the reset status.
> +
> +Kernel Mode Driver
> +--
> +
> +The KMD is responsible for checking if the device needs a reset, and to 
> perform
> +it as needed. Usually a hang is detected when a job gets stuck executing. KMD
> +should keep track of resets, because userspace can query any time about the
> +reset stats for an specific context. This is needed to propagate to the rest 
> of
> +the stack that a reset has happened. Currently, this is implemented by each
> +driver separately, with no common DRM interface.
> +
> +User Mode Driver
> +
> +
> +The UMD should check before submitting new commands to the KMD if the device 
> has
> +been reset, and this can be checked more often if it requires to. After

   more often if the UMD requires it. After

> +detecting a reset, UMD will then proceed to report it to the application 
> using
> +the appropriated API error code, as explained in the below section about

   appropriate  the section below about

> +robustness.
> +
> +Robustness
> +--
> +
> +The only way to try to keep an application working after a reset is if it
> +complies with the robustness aspects of the graphical API that it is using.
> +
> +Graphical APIs provide ways to application to deal with device resets. 
> However,

   to applications

> +there is no guarantee that the app will be correctly using such features, and

  will use such features correctly, and a 
// or "and the"


> +UMD can implement policies to close the app if it is a repeating offender,
> +likely in a broken loop. This is done to ensure that it does not keeps 
> blocking

keep

> +the user interface from being correctly displayed. This should be done even 
> if
> +the app is correct but happens to trigger some bug in the hardware/driver.
> +
> +OpenGL
> +~~
> +
> +Apps using OpenGL should use the available robust interfaces, like the
> +extension ``GL_ARB_robustness`` (or ``GL_EXT_robustness`` for OpenGL ES). 
> This
> +interface tells if a reset has happened, and if so, all the context state is
> +considered lost and the app proceeds by creating new ones. If is possible to

  If it is possible 
to

> +determine that robustness is not in use, UMD will terminate the app when a 
> reset

the UMD

> +is detected, giving that the contexts are lost and the app won't be able to
> +figure this out and recreate the contexts.
> +
> +Vulkan
> +~~
> +
> +Apps using Vulkan should check for ``VK_ERROR_DEVICE_LOST`` for submissions.
> +This error code means, among other things, that a device reset has happened 
> and
> +it needs to recreate the contexts to keep going.
> +
> +Reporting resets causes

That's an awkward heading. How about:
Reporting causes of resets
--

> +---
> +
> +Apart from propagating the reset through the stack so apps can recover, it's
> +really useful for driver developers to learn more about what caused the 
> reset in
> +first place. DRM devices should make use of devcoredump to store relevant
> +information about the reset, so this information can be added to user bug
> +reports.
> +
>  .. _drm_driver_ioctl:
>  
>  IOCTL Support on Device Nodes

thanks for the documentation.
-- 
~Randy


[PATCH v10 21/25] dept: Apply timeout consideration to hashed-waitqueue wait

2023-06-26 Thread Byungchul Park
Now that CONFIG_DEPT_AGGRESSIVE_TIMEOUT_WAIT was introduced, apply the
consideration to hashed-waitqueue wait, assuming an input 'ret' in
___wait_var_event() macro is used as a timeout value.

Signed-off-by: Byungchul Park 
---
 include/linux/wait_bit.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/wait_bit.h b/include/linux/wait_bit.h
index fe89282c3e96..3ef450d9a7c5 100644
--- a/include/linux/wait_bit.h
+++ b/include/linux/wait_bit.h
@@ -247,7 +247,7 @@ extern wait_queue_head_t *__var_waitqueue(void *p);
struct wait_bit_queue_entry __wbq_entry;\
long __ret = ret; /* explicit shadow */ \
\
-   sdt_might_sleep_start(NULL);\
+   sdt_might_sleep_start_timeout(NULL, __ret); \
init_wait_var_entry(&__wbq_entry, var,  \
exclusive ? WQ_FLAG_EXCLUSIVE : 0); \
for (;;) {  \
-- 
2.17.1



[PATCH v10 23/25] dept: Record the latest one out of consecutive waits of the same class

2023-06-26 Thread Byungchul Park
The current code records all the waits for later use to track relation
between waits and events in each context. However, since the same class
is handled the same way, it'd be okay to record only one on behalf of
the others if they all have the same class.

Even though it's the ideal to search the whole history buffer for that,
since it'd cost too high, alternatively, let's keep the latest one at
least when the same class'ed waits consecutively appear.

Signed-off-by: Byungchul Park 
---
 kernel/dependency/dept.c | 21 -
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c
index 52537c099b68..cdfda4acff58 100644
--- a/kernel/dependency/dept.c
+++ b/kernel/dependency/dept.c
@@ -1522,9 +1522,28 @@ static inline struct dept_wait_hist *new_hist(void)
return wh;
 }
 
+static inline struct dept_wait_hist *last_hist(void)
+{
+   int pos_n = hist_pos_next();
+   struct dept_wait_hist *wh_n = hist(pos_n);
+
+   /*
+* This is the first try.
+*/
+   if (!pos_n && !wh_n->wait)
+   return NULL;
+
+   return hist(pos_n + DEPT_MAX_WAIT_HIST - 1);
+}
+
 static void add_hist(struct dept_wait *w, unsigned int wg, unsigned int 
ctxt_id)
 {
-   struct dept_wait_hist *wh = new_hist();
+   struct dept_wait_hist *wh;
+
+   wh = last_hist();
+
+   if (!wh || wh->wait->class != w->class || wh->ctxt_id != ctxt_id)
+   wh = new_hist();
 
if (likely(wh->wait))
put_wait(wh->wait);
-- 
2.17.1



[PATCH v10 18/25] dept: Apply timeout consideration to wait_for_completion()/complete()

2023-06-26 Thread Byungchul Park
Now that CONFIG_DEPT_AGGRESSIVE_TIMEOUT_WAIT was introduced, apply the
consideration to wait_for_completion()/complete().

Signed-off-by: Byungchul Park 
---
 include/linux/completion.h | 4 ++--
 kernel/sched/completion.c  | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/completion.h b/include/linux/completion.h
index 32d535abebf3..15eede01a451 100644
--- a/include/linux/completion.h
+++ b/include/linux/completion.h
@@ -41,9 +41,9 @@ do {  \
  */
 #define init_completion_map(x, m) init_completion(x)
 
-static inline void complete_acquire(struct completion *x)
+static inline void complete_acquire(struct completion *x, long timeout)
 {
-   sdt_might_sleep_start(>dmap);
+   sdt_might_sleep_start_timeout(>dmap, timeout);
 }
 
 static inline void complete_release(struct completion *x)
diff --git a/kernel/sched/completion.c b/kernel/sched/completion.c
index d57a5c1c1cd9..261807fa7118 100644
--- a/kernel/sched/completion.c
+++ b/kernel/sched/completion.c
@@ -100,7 +100,7 @@ __wait_for_common(struct completion *x,
 {
might_sleep();
 
-   complete_acquire(x);
+   complete_acquire(x, timeout);
 
raw_spin_lock_irq(>wait.lock);
timeout = do_wait_for_common(x, action, timeout, state);
-- 
2.17.1



[PATCH v10 10/25] dept: Apply sdt_might_sleep_{start, end}() to waitqueue wait

2023-06-26 Thread Byungchul Park
Makes Dept able to track dependencies by waitqueue waits.

Signed-off-by: Byungchul Park 
---
 include/linux/wait.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/linux/wait.h b/include/linux/wait.h
index a0307b516b09..ff349e609da7 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -7,6 +7,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -303,6 +304,7 @@ extern void init_wait_entry(struct wait_queue_entry 
*wq_entry, int flags);
struct wait_queue_entry __wq_entry; 
\
long __ret = ret;   /* explicit shadow */   
\

\
+   sdt_might_sleep_start(NULL);
\
init_wait_entry(&__wq_entry, exclusive ? WQ_FLAG_EXCLUSIVE : 0);
\
for (;;) {  
\
long __int = prepare_to_wait_event(_head, &__wq_entry, 
state);\
@@ -318,6 +320,7 @@ extern void init_wait_entry(struct wait_queue_entry 
*wq_entry, int flags);
cmd;
\
}   
\
finish_wait(_head, &__wq_entry); 
\
+   sdt_might_sleep_end();  
\
 __out: __ret;  
\
 })
 
-- 
2.17.1



[PATCH v10 15/25] locking/lockdep, cpu/hotplus: Use a weaker annotation in AP thread

2023-06-26 Thread Byungchul Park
cb92173d1f0 ("locking/lockdep, cpu/hotplug: Annotate AP thread") was
introduced to make lockdep_assert_cpus_held() work in AP thread.

However, the annotation is too strong for that purpose. We don't have to
use more than try lock annotation for that.

Furthermore, now that Dept was introduced, false positive alarms was
reported by that. Replaced it with try lock annotation.

Signed-off-by: Byungchul Park 
---
 kernel/cpu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 6c0a92ca6bb5..6a9b9c3d90a1 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -356,7 +356,7 @@ int lockdep_is_cpus_held(void)
 
 static void lockdep_acquire_cpus_lock(void)
 {
-   rwsem_acquire(_hotplug_lock.dep_map, 0, 0, _THIS_IP_);
+   rwsem_acquire(_hotplug_lock.dep_map, 0, 1, _THIS_IP_);
 }
 
 static void lockdep_release_cpus_lock(void)
-- 
2.17.1



[PATCH v10 20/25] dept: Apply timeout consideration to waitqueue wait

2023-06-26 Thread Byungchul Park
Now that CONFIG_DEPT_AGGRESSIVE_TIMEOUT_WAIT was introduced, apply the
consideration to waitqueue wait, assuming an input 'ret' in
___wait_event() macro is used as a timeout value.

Signed-off-by: Byungchul Park 
---
 include/linux/wait.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/wait.h b/include/linux/wait.h
index ff349e609da7..aa1bd964be1e 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -304,7 +304,7 @@ extern void init_wait_entry(struct wait_queue_entry 
*wq_entry, int flags);
struct wait_queue_entry __wq_entry; 
\
long __ret = ret;   /* explicit shadow */   
\

\
-   sdt_might_sleep_start(NULL);
\
+   sdt_might_sleep_start_timeout(NULL, __ret); 
\
init_wait_entry(&__wq_entry, exclusive ? WQ_FLAG_EXCLUSIVE : 0);
\
for (;;) {  
\
long __int = prepare_to_wait_event(_head, &__wq_entry, 
state);\
-- 
2.17.1



[PATCH v10 03/25] dept: Add single event dependency tracker APIs

2023-06-26 Thread Byungchul Park
Wrapped the base APIs for easier annotation on wait and event. Start
with supporting waiters on each single event. More general support for
multiple events is a future work. Do more when the need arises.

How to annotate (the simplest way):

1. Initaialize a map for the interesting wait.

   /*
* Recommand to place along with the wait instance.
*/
   struct dept_map my_wait;

   /*
* Recommand to place in the initialization code.
*/
   sdt_map_init(_wait);

2. Place the following at the wait code.

   sdt_wait(_wait);

3. Place the following at the event code.

   sdt_event(_wait);

That's it!

Signed-off-by: Byungchul Park 
---
 include/linux/dept_sdt.h | 62 
 1 file changed, 62 insertions(+)
 create mode 100644 include/linux/dept_sdt.h

diff --git a/include/linux/dept_sdt.h b/include/linux/dept_sdt.h
new file mode 100644
index ..12a793b90c7e
--- /dev/null
+++ b/include/linux/dept_sdt.h
@@ -0,0 +1,62 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Single-event Dependency Tracker
+ *
+ * Started by Byungchul Park :
+ *
+ *  Copyright (c) 2020 LG Electronics, Inc., Byungchul Park
+ */
+
+#ifndef __LINUX_DEPT_SDT_H
+#define __LINUX_DEPT_SDT_H
+
+#include 
+#include 
+
+#ifdef CONFIG_DEPT
+#define sdt_map_init(m)
\
+   do {\
+   static struct dept_key __key;   \
+   dept_map_init(m, &__key, 0, #m);\
+   } while (0)
+
+#define sdt_map_init_key(m, k) dept_map_init(m, k, 0, #m)
+
+#define sdt_wait(m)\
+   do {\
+   dept_request_event(m);  \
+   dept_wait(m, 1UL, _THIS_IP_, __func__, 0);  \
+   } while (0)
+
+/*
+ * sdt_might_sleep() and its family will be committed in __schedule()
+ * when it actually gets to __schedule(). Both dept_request_event() and
+ * dept_wait() will be performed on the commit.
+ */
+
+/*
+ * Use the code location as the class key if an explicit map is not used.
+ */
+#define sdt_might_sleep_start(m)   \
+   do {\
+   struct dept_map *__m = m;   \
+   static struct dept_key __key;   \
+   dept_stage_wait(__m, __m ? NULL : &__key, _THIS_IP_, __func__);\
+   } while (0)
+
+#define sdt_might_sleep_end()  dept_clean_stage()
+
+#define sdt_ecxt_enter(m)  dept_ecxt_enter(m, 1UL, _THIS_IP_, 
"start", "event", 0)
+#define sdt_event(m)   dept_event(m, 1UL, _THIS_IP_, __func__)
+#define sdt_ecxt_exit(m)   dept_ecxt_exit(m, 1UL, _THIS_IP_)
+#else /* !CONFIG_DEPT */
+#define sdt_map_init(m)do { } while (0)
+#define sdt_map_init_key(m, k) do { (void)(k); } while (0)
+#define sdt_wait(m)do { } while (0)
+#define sdt_might_sleep_start(m)   do { } while (0)
+#define sdt_might_sleep_end()  do { } while (0)
+#define sdt_ecxt_enter(m)  do { } while (0)
+#define sdt_event(m)   do { } while (0)
+#define sdt_ecxt_exit(m)   do { } while (0)
+#endif
+#endif /* __LINUX_DEPT_SDT_H */
-- 
2.17.1



[PATCH v10 05/25] dept: Tie to Lockdep and IRQ tracing

2023-06-26 Thread Byungchul Park
Yes. How to place Dept in here looks so ugly. But it's inevitable as
long as relying on Lockdep. The way should be enhanced gradually.

   1. Basically relies on Lockdep to track typical locks and IRQ things.

   2. Dept fails to recognize IRQ situation so it generates false alarms
  when raw_local_irq_*() APIs are used. So made it track those too.

   3. Lockdep doesn't track the outmost {hard,soft}irq entracnes but
  Dept makes use of it. So made it track those too.

Signed-off-by: Byungchul Park 
---
 include/linux/irqflags.h|  22 +-
 include/linux/local_lock_internal.h |   1 +
 include/linux/lockdep.h | 102 ++--
 include/linux/lockdep_types.h   |   3 +
 include/linux/mutex.h   |   1 +
 include/linux/percpu-rwsem.h|   2 +-
 include/linux/rtmutex.h |   1 +
 include/linux/rwlock_types.h|   1 +
 include/linux/rwsem.h   |   1 +
 include/linux/seqlock.h |   2 +-
 include/linux/spinlock_types_raw.h  |   3 +
 include/linux/srcu.h|   2 +-
 kernel/dependency/dept.c|   4 +-
 kernel/locking/lockdep.c|  23 +++
 14 files changed, 139 insertions(+), 29 deletions(-)

diff --git a/include/linux/irqflags.h b/include/linux/irqflags.h
index 5ec0fa71399e..0ebc5ec2dbd4 100644
--- a/include/linux/irqflags.h
+++ b/include/linux/irqflags.h
@@ -13,6 +13,7 @@
 #define _LINUX_TRACE_IRQFLAGS_H
 
 #include 
+#include 
 #include 
 #include 
 
@@ -60,8 +61,10 @@ extern void trace_hardirqs_off(void);
 # define lockdep_softirqs_enabled(p)   ((p)->softirqs_enabled)
 # define lockdep_hardirq_enter()   \
 do {   \
-   if (__this_cpu_inc_return(hardirq_context) == 1)\
+   if (__this_cpu_inc_return(hardirq_context) == 1) { \
current->hardirq_threaded = 0;  \
+   dept_hardirq_enter();   \
+   }   \
 } while (0)
 # define lockdep_hardirq_threaded()\
 do {   \
@@ -136,6 +139,8 @@ do {\
 # define lockdep_softirq_enter()   \
 do {   \
current->softirq_context++; \
+   if (current->softirq_context == 1)  \
+   dept_softirq_enter();   \
 } while (0)
 # define lockdep_softirq_exit()\
 do {   \
@@ -170,17 +175,28 @@ extern void warn_bogus_irq_restore(void);
 /*
  * Wrap the arch provided IRQ routines to provide appropriate checks.
  */
-#define raw_local_irq_disable()arch_local_irq_disable()
-#define raw_local_irq_enable() arch_local_irq_enable()
+#define raw_local_irq_disable()\
+   do {\
+   arch_local_irq_disable();   \
+   dept_hardirqs_off();\
+   } while (0)
+#define raw_local_irq_enable() \
+   do {\
+   dept_hardirqs_on(); \
+   arch_local_irq_enable();\
+   } while (0)
 #define raw_local_irq_save(flags)  \
do {\
typecheck(unsigned long, flags);\
flags = arch_local_irq_save();  \
+   dept_hardirqs_off();\
} while (0)
 #define raw_local_irq_restore(flags)   \
do {\
typecheck(unsigned long, flags);\
raw_check_bogus_irq_restore();  \
+   if (!arch_irqs_disabled_flags(flags))   \
+   dept_hardirqs_on(); \
arch_local_irq_restore(flags);  \
} while (0)
 #define raw_local_save_flags(flags)\
diff --git a/include/linux/local_lock_internal.h 
b/include/linux/local_lock_internal.h
index 975e33b793a7..39f67788fd95 100644
--- a/include/linux/local_lock_internal.h
+++ b/include/linux/local_lock_internal.h
@@ -21,6 +21,7 @@ typedef struct {
.name = #lockname,  \
.wait_type_inner = LD_WAIT_CONFIG,  \
.lock_type = LD_LOCK_PERCPU,\
+   .dmap = DEPT_MAP_INITIALIZER(lockname, NULL),\
},  \
.owner = NULL,
 
diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 1f1099dac3f0..99961026ba43 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -12,6 +12,7 @@
 
 #include 
 #include 
+#include 
 #include 
 
 struct 

[PATCH v10 24/25] dept: Make Dept able to work with an external wgen

2023-06-26 Thread Byungchul Park
There is a case where total maps for its wait/event is so large in size.
For instance, struct page for PG_locked and PG_writeback is the case.
The additional memory size for the maps would be 'the # of pages *
sizeof(struct dept_map)' if each struct page keeps its map all the way,
which might be too big to accept.

It'd be better to keep the minimum data in the case, which is timestamp
called 'wgen' that Dept makes use of. So made Dept able to work with an
external wgen when needed.

Signed-off-by: Byungchul Park 
---
 include/linux/dept.h | 18 ++
 include/linux/dept_sdt.h |  4 ++--
 kernel/dependency/dept.c | 30 +-
 3 files changed, 37 insertions(+), 15 deletions(-)

diff --git a/include/linux/dept.h b/include/linux/dept.h
index 0aa8d90558a9..ad32ea7b57bb 100644
--- a/include/linux/dept.h
+++ b/include/linux/dept.h
@@ -487,6 +487,13 @@ struct dept_task {
boolin_sched;
 };
 
+/*
+ * for subsystems that requires compact use of memory e.g. struct page
+ */
+struct dept_ext_wgen{
+   unsigned int wgen;
+};
+
 #define DEPT_TASK_INITIALIZER(t)   \
 {  \
.wait_hist = { { .wait = NULL, } }, \
@@ -518,6 +525,7 @@ extern void dept_task_exit(struct task_struct *t);
 extern void dept_free_range(void *start, unsigned int sz);
 extern void dept_map_init(struct dept_map *m, struct dept_key *k, int sub_u, 
const char *n);
 extern void dept_map_reinit(struct dept_map *m, struct dept_key *k, int sub_u, 
const char *n);
+extern void dept_ext_wgen_init(struct dept_ext_wgen *ewg);
 extern void dept_map_copy(struct dept_map *to, struct dept_map *from);
 
 extern void dept_wait(struct dept_map *m, unsigned long w_f, unsigned long ip, 
const char *w_fn, int sub_l, long timeout);
@@ -527,8 +535,8 @@ extern void dept_clean_stage(void);
 extern void dept_stage_event(struct task_struct *t, unsigned long ip);
 extern void dept_ecxt_enter(struct dept_map *m, unsigned long e_f, unsigned 
long ip, const char *c_fn, const char *e_fn, int sub_l);
 extern bool dept_ecxt_holding(struct dept_map *m, unsigned long e_f);
-extern void dept_request_event(struct dept_map *m);
-extern void dept_event(struct dept_map *m, unsigned long e_f, unsigned long 
ip, const char *e_fn);
+extern void dept_request_event(struct dept_map *m, struct dept_ext_wgen *ewg);
+extern void dept_event(struct dept_map *m, unsigned long e_f, unsigned long 
ip, const char *e_fn, struct dept_ext_wgen *ewg);
 extern void dept_ecxt_exit(struct dept_map *m, unsigned long e_f, unsigned 
long ip);
 extern void dept_sched_enter(void);
 extern void dept_sched_exit(void);
@@ -559,6 +567,7 @@ extern void dept_hardirqs_off_ip(unsigned long ip);
 struct dept_key  { };
 struct dept_map  { };
 struct dept_task { };
+struct dept_ext_wgen { };
 
 #define DEPT_MAP_INITIALIZER(n, k) { }
 #define DEPT_TASK_INITIALIZER(t)   { }
@@ -571,6 +580,7 @@ struct dept_task { };
 #define dept_free_range(s, sz) do { } while (0)
 #define dept_map_init(m, k, su, n) do { (void)(n); 
(void)(k); } while (0)
 #define dept_map_reinit(m, k, su, n)   do { (void)(n); 
(void)(k); } while (0)
+#define dept_ext_wgen_init(wg) do { } while (0)
 #define dept_map_copy(t, f)do { } while (0)
 
 #define dept_wait(m, w_f, ip, w_fn, sl, t) do { (void)(w_fn); } 
while (0)
@@ -580,8 +590,8 @@ struct dept_task { };
 #define dept_stage_event(t, ip)do { } while (0)
 #define dept_ecxt_enter(m, e_f, ip, c_fn, e_fn, sl)do { (void)(c_fn); 
(void)(e_fn); } while (0)
 #define dept_ecxt_holding(m, e_f)  false
-#define dept_request_event(m)  do { } while (0)
-#define dept_event(m, e_f, ip, e_fn)   do { (void)(e_fn); } 
while (0)
+#define dept_request_event(m, wg)  do { } while (0)
+#define dept_event(m, e_f, ip, e_fn, wg)   do { (void)(e_fn); } 
while (0)
 #define dept_ecxt_exit(m, e_f, ip) do { } while (0)
 #define dept_sched_enter() do { } while (0)
 #define dept_sched_exit()  do { } while (0)
diff --git a/include/linux/dept_sdt.h b/include/linux/dept_sdt.h
index 21fce525f031..8cdac7982036 100644
--- a/include/linux/dept_sdt.h
+++ b/include/linux/dept_sdt.h
@@ -24,7 +24,7 @@
 
 #define sdt_wait_timeout(m, t) \
do {\
-   dept_request_event(m);  \
+   dept_request_event(m, NULL);\
dept_wait(m, 1UL, _THIS_IP_, __func__, 0, t);   \
} while (0)
 #define sdt_wait(m) sdt_wait_timeout(m, -1L)
@@ -49,7 

[PATCH v10 06/25] dept: Add proc knobs to show stats and dependency graph

2023-06-26 Thread Byungchul Park
It'd be useful to show Dept internal stats and dependency graph on
runtime via proc for better information. Introduced the knobs.

Signed-off-by: Byungchul Park 
---
 kernel/dependency/Makefile|  1 +
 kernel/dependency/dept.c  | 24 +++-
 kernel/dependency/dept_internal.h | 26 +
 kernel/dependency/dept_proc.c | 95 +++
 4 files changed, 131 insertions(+), 15 deletions(-)
 create mode 100644 kernel/dependency/dept_internal.h
 create mode 100644 kernel/dependency/dept_proc.c

diff --git a/kernel/dependency/Makefile b/kernel/dependency/Makefile
index b5cfb8a03c0c..92f165400187 100644
--- a/kernel/dependency/Makefile
+++ b/kernel/dependency/Makefile
@@ -1,3 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0
 
 obj-$(CONFIG_DEPT) += dept.o
+obj-$(CONFIG_DEPT) += dept_proc.o
diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c
index d3b6d2f4cd7b..c5e23e9184b8 100644
--- a/kernel/dependency/dept.c
+++ b/kernel/dependency/dept.c
@@ -74,6 +74,7 @@
 #include 
 #include 
 #include 
+#include "dept_internal.h"
 
 static int dept_stop;
 static int dept_per_cpu_ready;
@@ -261,20 +262,13 @@ static inline bool valid_key(struct dept_key *k)
  *   have been freed will be placed.
  */
 
-enum object_t {
-#define OBJECT(id, nr) OBJECT_##id,
-   #include "dept_object.h"
-#undef  OBJECT
-   OBJECT_NR,
-};
-
 #define OBJECT(id, nr) \
 static struct dept_##id spool_##id[nr];
\
 static DEFINE_PER_CPU(struct llist_head, lpool_##id);
#include "dept_object.h"
 #undef  OBJECT
 
-static struct dept_pool pool[OBJECT_NR] = {
+struct dept_pool dept_pool[OBJECT_NR] = {
 #define OBJECT(id, nr) {   \
.name = #id,\
.obj_sz = sizeof(struct dept_##id), \
@@ -304,7 +298,7 @@ static void *from_pool(enum object_t t)
if (DEPT_WARN_ON(!irqs_disabled()))
return NULL;
 
-   p = [t];
+   p = _pool[t];
 
/*
 * Try local pool first.
@@ -339,7 +333,7 @@ static void *from_pool(enum object_t t)
 
 static void to_pool(void *o, enum object_t t)
 {
-   struct dept_pool *p = [t];
+   struct dept_pool *p = _pool[t];
struct llist_head *h;
 
preempt_disable();
@@ -2136,7 +2130,7 @@ void dept_map_copy(struct dept_map *to, struct dept_map 
*from)
clean_classes_cache(>map_key);
 }
 
-static LIST_HEAD(classes);
+LIST_HEAD(dept_classes);
 
 static inline bool within(const void *addr, void *start, unsigned long size)
 {
@@ -2168,7 +2162,7 @@ void dept_free_range(void *start, unsigned int sz)
while (unlikely(!dept_lock()))
cpu_relax();
 
-   list_for_each_entry_safe(c, n, , all_node) {
+   list_for_each_entry_safe(c, n, _classes, all_node) {
if (!within((void *)c->key, start, sz) &&
!within(c->name, start, sz))
continue;
@@ -2244,7 +2238,7 @@ static struct dept_class *check_new_class(struct dept_key 
*local,
c->sub_id = sub_id;
c->key = (unsigned long)(k->base + sub_id);
hash_add_class(c);
-   list_add(>all_node, );
+   list_add(>all_node, _classes);
 unlock:
dept_unlock();
 caching:
@@ -2958,8 +2952,8 @@ static void migrate_per_cpu_pool(void)
struct llist_head *from;
struct llist_head *to;
 
-   from = [i].boot_pool;
-   to = per_cpu_ptr(pool[i].lpool, boot_cpu);
+   from = _pool[i].boot_pool;
+   to = per_cpu_ptr(dept_pool[i].lpool, boot_cpu);
move_llist(to, from);
}
 }
diff --git a/kernel/dependency/dept_internal.h 
b/kernel/dependency/dept_internal.h
new file mode 100644
index ..007c1eec6bab
--- /dev/null
+++ b/kernel/dependency/dept_internal.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Dept(DEPendency Tracker) - runtime dependency tracker internal header
+ *
+ * Started by Byungchul Park :
+ *
+ *  Copyright (c) 2020 LG Electronics, Inc., Byungchul Park
+ */
+
+#ifndef __DEPT_INTERNAL_H
+#define __DEPT_INTERNAL_H
+
+#ifdef CONFIG_DEPT
+
+enum object_t {
+#define OBJECT(id, nr) OBJECT_##id,
+   #include "dept_object.h"
+#undef  OBJECT
+   OBJECT_NR,
+};
+
+extern struct list_head dept_classes;
+extern struct dept_pool dept_pool[];
+
+#endif
+#endif /* __DEPT_INTERNAL_H */
diff --git a/kernel/dependency/dept_proc.c b/kernel/dependency/dept_proc.c
new file mode 100644
index ..7d61dfbc5865
--- /dev/null
+++ b/kernel/dependency/dept_proc.c
@@ -0,0 +1,95 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Procfs knobs for Dept(DEPendency Tracker)
+ *
+ * Started by Byungchul Park :
+ *
+ *  Copyright (C) 2021 LG Electronics, Inc. , Byungchul Park
+ */
+#include 
+#include 
+#include 
+#include 

[PATCH v10 08/25] dept: Apply sdt_might_sleep_{start, end}() to PG_{locked, writeback} wait

2023-06-26 Thread Byungchul Park
Makes Dept able to track dependencies by PG_{locked,writeback} waits.

Signed-off-by: Byungchul Park 
---
 mm/filemap.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/mm/filemap.c b/mm/filemap.c
index c4d4ace9cc70..adc49cb59db6 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -42,6 +42,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include "internal.h"
@@ -1215,6 +1216,9 @@ static inline bool folio_trylock_flag(struct folio 
*folio, int bit_nr,
 /* How many times do we accept lock stealing from under a waiter? */
 int sysctl_page_lock_unfairness = 5;
 
+static struct dept_map __maybe_unused PG_locked_map = 
DEPT_MAP_INITIALIZER(PG_locked_map, NULL);
+static struct dept_map __maybe_unused PG_writeback_map = 
DEPT_MAP_INITIALIZER(PG_writeback_map, NULL);
+
 static inline int folio_wait_bit_common(struct folio *folio, int bit_nr,
int state, enum behavior behavior)
 {
@@ -1226,6 +1230,11 @@ static inline int folio_wait_bit_common(struct folio 
*folio, int bit_nr,
unsigned long pflags;
bool in_thrashing;
 
+   if (bit_nr == PG_locked)
+   sdt_might_sleep_start(_locked_map);
+   else if (bit_nr == PG_writeback)
+   sdt_might_sleep_start(_writeback_map);
+
if (bit_nr == PG_locked &&
!folio_test_uptodate(folio) && folio_test_workingset(folio)) {
delayacct_thrashing_start(_thrashing);
@@ -1327,6 +1336,8 @@ static inline int folio_wait_bit_common(struct folio 
*folio, int bit_nr,
 */
finish_wait(q, wait);
 
+   sdt_might_sleep_end();
+
if (thrashing) {
delayacct_thrashing_end(_thrashing);
psi_memstall_leave();
-- 
2.17.1



[PATCH v10 13/25] dept: Distinguish each work from another

2023-06-26 Thread Byungchul Park
Workqueue already provides concurrency control. By that, any wait in a
work doesn't prevents events in other works with the control enabled.
Thus, each work would better be considered a different context.

So let Dept assign a different context id to each work.

Signed-off-by: Byungchul Park 
---
 include/linux/dept.h |  2 ++
 kernel/dependency/dept.c | 10 ++
 kernel/workqueue.c   |  3 +++
 3 files changed, 15 insertions(+)

diff --git a/include/linux/dept.h b/include/linux/dept.h
index f62c7b6f42c6..d9ca9dd50219 100644
--- a/include/linux/dept.h
+++ b/include/linux/dept.h
@@ -515,6 +515,7 @@ extern void dept_ecxt_exit(struct dept_map *m, unsigned 
long e_f, unsigned long
 extern void dept_sched_enter(void);
 extern void dept_sched_exit(void);
 extern void dept_kernel_enter(void);
+extern void dept_work_enter(void);
 
 static inline void dept_ecxt_enter_nokeep(struct dept_map *m)
 {
@@ -567,6 +568,7 @@ struct dept_task { };
 #define dept_sched_enter() do { } while (0)
 #define dept_sched_exit()  do { } while (0)
 #define dept_kernel_enter()do { } while (0)
+#define dept_work_enter()  do { } while (0)
 #define dept_ecxt_enter_nokeep(m)  do { } while (0)
 #define dept_key_init(k)   do { (void)(k); } while 
(0)
 #define dept_key_destroy(k)do { (void)(k); } while 
(0)
diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c
index 4165cacf4ebb..6cf17f206b78 100644
--- a/kernel/dependency/dept.c
+++ b/kernel/dependency/dept.c
@@ -1977,6 +1977,16 @@ void dept_hardirqs_off_ip(unsigned long ip)
 }
 EXPORT_SYMBOL_GPL(dept_hardirqs_off_ip);
 
+/*
+ * Assign a different context id to each work.
+ */
+void dept_work_enter(void)
+{
+   struct dept_task *dt = dept_task();
+
+   dt->cxt_id[DEPT_CXT_PROCESS] += 1UL << DEPT_CXTS_NR;
+}
+
 void dept_kernel_enter(void)
 {
struct dept_task *dt = dept_task();
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 07895deca271..69c4f464d017 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -51,6 +51,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "workqueue_internal.h"
 
@@ -2199,6 +2200,8 @@ __acquires(>lock)
 
lockdep_copy_map(_map, >lockdep_map);
 #endif
+   dept_work_enter();
+
/* ensure we're on the correct CPU */
WARN_ON_ONCE(!(pool->flags & POOL_DISASSOCIATED) &&
 raw_smp_processor_id() != pool->cpu);
-- 
2.17.1



[PATCH v10 07/25] dept: Apply sdt_might_sleep_{start, end}() to wait_for_completion()/complete()

2023-06-26 Thread Byungchul Park
Makes Dept able to track dependencies by
wait_for_completion()/complete().

Signed-off-by: Byungchul Park 
---
 include/linux/completion.h | 30 +-
 1 file changed, 25 insertions(+), 5 deletions(-)

diff --git a/include/linux/completion.h b/include/linux/completion.h
index 62b32b19e0a8..32d535abebf3 100644
--- a/include/linux/completion.h
+++ b/include/linux/completion.h
@@ -10,6 +10,7 @@
  */
 
 #include 
+#include 
 
 /*
  * struct completion - structure used to maintain state for a "completion"
@@ -26,14 +27,33 @@
 struct completion {
unsigned int done;
struct swait_queue_head wait;
+   struct dept_map dmap;
 };
 
+#define init_completion(x) \
+do {   \
+   sdt_map_init(&(x)->dmap);   \
+   __init_completion(x);   \
+} while (0)
+
+/*
+ * XXX: No use cases for now. Fill the body when needed.
+ */
 #define init_completion_map(x, m) init_completion(x)
-static inline void complete_acquire(struct completion *x) {}
-static inline void complete_release(struct completion *x) {}
+
+static inline void complete_acquire(struct completion *x)
+{
+   sdt_might_sleep_start(>dmap);
+}
+
+static inline void complete_release(struct completion *x)
+{
+   sdt_might_sleep_end();
+}
 
 #define COMPLETION_INITIALIZER(work) \
-   { 0, __SWAIT_QUEUE_HEAD_INITIALIZER((work).wait) }
+   { 0, __SWAIT_QUEUE_HEAD_INITIALIZER((work).wait), \
+ .dmap = DEPT_MAP_INITIALIZER(work, NULL), }
 
 #define COMPLETION_INITIALIZER_ONSTACK_MAP(work, map) \
(*({ init_completion_map(&(work), &(map)); &(work); }))
@@ -75,13 +95,13 @@ static inline void complete_release(struct completion *x) {}
 #endif
 
 /**
- * init_completion - Initialize a dynamically allocated completion
+ * __init_completion - Initialize a dynamically allocated completion
  * @x:  pointer to completion structure that is to be initialized
  *
  * This inline function will initialize a dynamically created completion
  * structure.
  */
-static inline void init_completion(struct completion *x)
+static inline void __init_completion(struct completion *x)
 {
x->done = 0;
init_swait_queue_head(>wait);
-- 
2.17.1



[PATCH v10 22/25] dept: Apply timeout consideration to dma fence wait

2023-06-26 Thread Byungchul Park
Now that CONFIG_DEPT_AGGRESSIVE_TIMEOUT_WAIT was introduced, apply the
consideration to dma fence wait.

Signed-off-by: Byungchul Park 
---
 drivers/dma-buf/dma-fence.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 1db4bc0e8adc..a1ede7b467cd 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -783,7 +783,7 @@ dma_fence_default_wait(struct dma_fence *fence, bool intr, 
signed long timeout)
cb.task = current;
list_add(, >cb_list);
 
-   sdt_might_sleep_start(NULL);
+   sdt_might_sleep_start_timeout(NULL, timeout);
while (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, >flags) && ret > 
0) {
if (intr)
__set_current_state(TASK_INTERRUPTIBLE);
@@ -887,7 +887,7 @@ dma_fence_wait_any_timeout(struct dma_fence **fences, 
uint32_t count,
}
}
 
-   sdt_might_sleep_start(NULL);
+   sdt_might_sleep_start_timeout(NULL, timeout);
while (ret > 0) {
if (intr)
set_current_state(TASK_INTERRUPTIBLE);
-- 
2.17.1



[PATCH v10 11/25] dept: Apply sdt_might_sleep_{start, end}() to hashed-waitqueue wait

2023-06-26 Thread Byungchul Park
Makes Dept able to track dependencies by hashed-waitqueue waits.

Signed-off-by: Byungchul Park 
---
 include/linux/wait_bit.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/linux/wait_bit.h b/include/linux/wait_bit.h
index 7725b7579b78..fe89282c3e96 100644
--- a/include/linux/wait_bit.h
+++ b/include/linux/wait_bit.h
@@ -6,6 +6,7 @@
  * Linux wait-bit related types and methods:
  */
 #include 
+#include 
 
 struct wait_bit_key {
void*flags;
@@ -246,6 +247,7 @@ extern wait_queue_head_t *__var_waitqueue(void *p);
struct wait_bit_queue_entry __wbq_entry;\
long __ret = ret; /* explicit shadow */ \
\
+   sdt_might_sleep_start(NULL);\
init_wait_var_entry(&__wbq_entry, var,  \
exclusive ? WQ_FLAG_EXCLUSIVE : 0); \
for (;;) {  \
@@ -263,6 +265,7 @@ extern wait_queue_head_t *__var_waitqueue(void *p);
cmd;\
}   \
finish_wait(__wq_head, &__wbq_entry.wq_entry);  \
+   sdt_might_sleep_end();  \
 __out: __ret;  \
 })
 
-- 
2.17.1



[PATCH v10 02/25] dept: Implement Dept(Dependency Tracker)

2023-06-26 Thread Byungchul Park
CURRENT STATUS
--
Lockdep tracks acquisition order of locks in order to detect deadlock,
and IRQ and IRQ enable/disable state as well to take accident
acquisitions into account.

Lockdep should be turned off once it detects and reports a deadlock
since the data structure and algorithm are not reusable after detection
because of the complex design.

PROBLEM
---
*Waits* and their *events* that never reach eventually cause deadlock.
However, Lockdep is only interested in lock acquisition order, forcing
to emulate lock acqusition even for just waits and events that have
nothing to do with real lock.

Even worse, no one likes Lockdep's false positive detection because that
prevents further one that might be more valuable. That's why all the
kernel developers are sensitive to Lockdep's false positive.

Besides those, by tracking acquisition order, it cannot correctly deal
with read lock and cross-event e.g. wait_for_completion()/complete() for
deadlock detection. Lockdep is no longer a good tool for that purpose.

SOLUTION

Again, *waits* and their *events* that never reach eventually cause
deadlock. The new solution, Dept(DEPendency Tracker), focuses on waits
and events themselves. Dept tracks waits and events and report it if
any event would be never reachable.

Dept does:
   . Works with read lock in the right way.
   . Works with any wait and event e.i. cross-event.
   . Continue to work even after reporting multiple times.
   . Provides simple and intuitive APIs.
   . Does exactly what dependency checker should do.

Q & A
-
Q. Is this the first try ever to address the problem?
A. No. Cross-release feature (b09be676e0ff2 locking/lockdep: Implement
   the 'crossrelease' feature) addressed it 2 years ago that was a
   Lockdep extension and merged but reverted shortly because:

   Cross-release started to report valuable hidden problems but started
   to give report false positive reports as well. For sure, no one
   likes Lockdep's false positive reports since it makes Lockdep stop,
   preventing reporting further real problems.

Q. Why not Dept was developed as an extension of Lockdep?
A. Lockdep definitely includes all the efforts great developers have
   made for a long time so as to be quite stable enough. But I had to
   design and implement newly because of the following:

   1) Lockdep was designed to track lock acquisition order. The APIs and
  implementation do not fit on wait-event model.
   2) Lockdep is turned off on detection including false positive. Which
  is terrible and prevents developing any extension for stronger
  detection.

Q. Do you intend to totally replace Lockdep?
A. No. Lockdep also checks if lock usage is correct. Of course, the
   dependency check routine should be replaced but the other functions
   should be still there.

Q. Do you mean the dependency check routine should be replaced right
   away?
A. No. I admit Lockdep is stable enough thanks to great efforts kernel
   developers have made. Lockdep and Dept, both should be in the kernel
   until Dept gets considered stable.

Q. Stronger detection capability would give more false positive report.
   Which was a big problem when cross-release was introduced. Is it ok
   with Dept?
A. It's ok. Dept allows multiple reporting thanks to simple and quite
   generalized design. Of course, false positive reports should be fixed
   anyway but it's no longer as a critical problem as it was.

Signed-off-by: Byungchul Park 
---
 include/linux/dept.h|  577 ++
 include/linux/hardirq.h |3 +
 include/linux/sched.h   |3 +
 init/init_task.c|2 +
 init/main.c |2 +
 kernel/Makefile |1 +
 kernel/dependency/Makefile  |3 +
 kernel/dependency/dept.c| 3009 +++
 kernel/dependency/dept_hash.h   |   10 +
 kernel/dependency/dept_object.h |   13 +
 kernel/exit.c   |1 +
 kernel/fork.c   |2 +
 kernel/module/main.c|2 +
 kernel/sched/core.c |9 +
 lib/Kconfig.debug   |   27 +
 lib/locking-selftest.c  |2 +
 16 files changed, 3666 insertions(+)
 create mode 100644 include/linux/dept.h
 create mode 100644 kernel/dependency/Makefile
 create mode 100644 kernel/dependency/dept.c
 create mode 100644 kernel/dependency/dept_hash.h
 create mode 100644 kernel/dependency/dept_object.h

diff --git a/include/linux/dept.h b/include/linux/dept.h
new file mode 100644
index ..b6d45b4b1fd6
--- /dev/null
+++ b/include/linux/dept.h
@@ -0,0 +1,577 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * DEPT(DEPendency Tracker) - runtime dependency tracker
+ *
+ * Started by Byungchul Park :
+ *
+ *  Copyright (c) 2020 LG Electronics, Inc., Byungchul Park
+ */
+
+#ifndef __LINUX_DEPT_H
+#define __LINUX_DEPT_H
+
+#ifdef CONFIG_DEPT
+
+#include 
+
+struct task_struct;
+
+#define DEPT_MAX_STACK_ENTRY  

[PATCH v10 14/25] dept: Add a mechanism to refill the internal memory pools on running out

2023-06-26 Thread Byungchul Park
Dept engine works in a constrained environment. For example, Dept cannot
make use of dynamic allocation e.g. kmalloc(). So Dept has been using
static pools to keep memory chunks Dept uses.

However, Dept would barely work once any of the pools gets run out. So
implemented a mechanism for the refill on the lack by any chance, using
irq work and workqueue that fits on the contrained environment.

Signed-off-by: Byungchul Park 
---
 include/linux/dept.h|  19 --
 kernel/dependency/dept.c| 104 +++-
 kernel/dependency/dept_object.h |  10 +--
 kernel/dependency/dept_proc.c   |   8 +--
 4 files changed, 112 insertions(+), 29 deletions(-)

diff --git a/include/linux/dept.h b/include/linux/dept.h
index d9ca9dd50219..583e8fe2dd7b 100644
--- a/include/linux/dept.h
+++ b/include/linux/dept.h
@@ -336,9 +336,19 @@ struct dept_pool {
size_t  obj_sz;
 
/*
-* the number of the static array
+* the remaining number of the object in spool
 */
-   atomic_tobj_nr;
+   int obj_nr;
+
+   /*
+* the number of the object in spool
+*/
+   int tot_nr;
+
+   /*
+* accumulated amount of memory used by the object in byte
+*/
+   atomic_tacc_sz;
 
/*
 * offset of ->pool_node
@@ -348,9 +358,10 @@ struct dept_pool {
/*
 * pointer to the pool
 */
-   void*spool;
+   void*spool; /* static pool */
+   void*rpool; /* reserved pool */
struct llist_head   boot_pool;
-   struct llist_head __percpu  *lpool;
+   struct llist_head __percpu  *lpool; /* local pool */
 };
 
 struct dept_ecxt_held {
diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c
index 6cf17f206b78..8454f0a14d67 100644
--- a/kernel/dependency/dept.c
+++ b/kernel/dependency/dept.c
@@ -74,6 +74,9 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
 #include "dept_internal.h"
 
 static int dept_stop;
@@ -122,10 +125,12 @@ static int dept_per_cpu_ready;
WARN(1, "DEPT_STOP: " s);   \
})
 
-#define DEPT_INFO_ONCE(s...) pr_warn_once("DEPT_INFO_ONCE: " s)
+#define DEPT_INFO_ONCE(s...)   pr_warn_once("DEPT_INFO_ONCE: " s)
+#define DEPT_INFO(s...)pr_warn("DEPT_INFO: " s)
 
 static arch_spinlock_t dept_spin = (arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;
 static arch_spinlock_t stage_spin = (arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;
+static arch_spinlock_t dept_pool_spin = 
(arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;
 
 /*
  * DEPT internal engine should be careful in using outside functions
@@ -264,6 +269,7 @@ static inline bool valid_key(struct dept_key *k)
 
 #define OBJECT(id, nr) \
 static struct dept_##id spool_##id[nr];
\
+static struct dept_##id rpool_##id[nr];
\
 static DEFINE_PER_CPU(struct llist_head, lpool_##id);
#include "dept_object.h"
 #undef  OBJECT
@@ -272,14 +278,70 @@ struct dept_pool dept_pool[OBJECT_NR] = {
 #define OBJECT(id, nr) {   \
.name = #id,\
.obj_sz = sizeof(struct dept_##id), \
-   .obj_nr = ATOMIC_INIT(nr),  \
+   .obj_nr = nr,   \
+   .tot_nr = nr,   \
+   .acc_sz = ATOMIC_INIT(sizeof(spool_##id) + sizeof(rpool_##id)), \
.node_off = offsetof(struct dept_##id, pool_node),  \
.spool = spool_##id,\
+   .rpool = rpool_##id,\
.lpool = _##id, },
#include "dept_object.h"
 #undef  OBJECT
 };
 
+static void dept_wq_work_fn(struct work_struct *work)
+{
+   int i;
+
+   for (i = 0; i < OBJECT_NR; i++) {
+   struct dept_pool *p = dept_pool + i;
+   int sz = p->tot_nr * p->obj_sz;
+   void *rpool;
+   bool need;
+
+   arch_spin_lock(_pool_spin);
+   need = !p->rpool;
+   arch_spin_unlock(_pool_spin);
+
+   if (!need)
+   continue;
+
+   rpool = vmalloc(sz);
+
+   if (!rpool) {
+   DEPT_STOP("Failed to extend internal resources.\n");
+   break;
+   }
+
+   arch_spin_lock(_pool_spin);
+   if (!p->rpool) {
+   p->rpool = rpool;
+   

[PATCH v10 25/25] dept: Track the potential waits of PG_{locked, writeback}

2023-06-26 Thread Byungchul Park
Currently, Dept only tracks the real waits of PG_{locked,writeback} that
actually happened having gone through __schedule() to avoid false
positives. However, it ends in limited capacity for deadlock detection,
because anyway there might be still way more potential dependencies by
the waits that have yet to happen but may happen in the future so as to
cause a deadlock.

So let Dept assume that when PG_{locked,writeback} bit gets cleared,
there might be waits on the bit to be woken up.

Even though false positives may increase with the aggressive tracking,
it's worth doing it because it's going to be useful in practice. See the
following link for instance:

   
https://lore.kernel.org/lkml/1674268856-31807-1-git-send-email-byungchul.p...@lge.com/

Signed-off-by: Byungchul Park 
---
 include/linux/mm_types.h   |   3 +
 include/linux/page-flags.h | 112 +
 include/linux/pagemap.h|   7 ++-
 mm/filemap.c   |  11 +++-
 mm/page_alloc.c|   3 +
 5 files changed, 121 insertions(+), 15 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 3b8475007734..61d982eea8d1 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -252,6 +253,8 @@ struct page {
 #ifdef LAST_CPUPID_NOT_IN_PAGE_FLAGS
int _last_cpupid;
 #endif
+   struct dept_ext_wgen PG_locked_wgen;
+   struct dept_ext_wgen PG_writeback_wgen;
 } _struct_page_alignment;
 
 /*
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 69e93a0c1277..d6ca1148d21d 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -202,6 +202,50 @@ enum pageflags {
 
 #ifndef __GENERATING_BOUNDS_H
 
+#ifdef CONFIG_DEPT
+#include 
+#include 
+
+extern struct dept_map PG_locked_map;
+extern struct dept_map PG_writeback_map;
+
+/*
+ * Place the following annotations in its suitable point in code:
+ *
+ * Annotate dept_page_set_bit() around firstly set_bit*()
+ * Annotate dept_page_clear_bit() around clear_bit*()
+ * Annotate dept_page_wait_on_bit() around wait_on_bit*()
+ */
+
+static inline void dept_page_set_bit(struct page *p, int bit_nr)
+{
+   if (bit_nr == PG_locked)
+   dept_request_event(_locked_map, >PG_locked_wgen);
+   else if (bit_nr == PG_writeback)
+   dept_request_event(_writeback_map, >PG_writeback_wgen);
+}
+
+static inline void dept_page_clear_bit(struct page *p, int bit_nr)
+{
+   if (bit_nr == PG_locked)
+   dept_event(_locked_map, 1UL, _RET_IP_, __func__, 
>PG_locked_wgen);
+   else if (bit_nr == PG_writeback)
+   dept_event(_writeback_map, 1UL, _RET_IP_, __func__, 
>PG_writeback_wgen);
+}
+
+static inline void dept_page_wait_on_bit(struct page *p, int bit_nr)
+{
+   if (bit_nr == PG_locked)
+   dept_wait(_locked_map, 1UL, _RET_IP_, __func__, 0, -1L);
+   else if (bit_nr == PG_writeback)
+   dept_wait(_writeback_map, 1UL, _RET_IP_, __func__, 0, -1L);
+}
+#else
+#define dept_page_set_bit(p, bit_nr)   do { } while (0)
+#define dept_page_clear_bit(p, bit_nr) do { } while (0)
+#define dept_page_wait_on_bit(p, bit_nr)   do { } while (0)
+#endif
+
 #ifdef CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
 DECLARE_STATIC_KEY_FALSE(hugetlb_optimize_vmemmap_key);
 
@@ -383,44 +427,88 @@ static __always_inline int Page##uname(struct page *page) 
\
 #define SETPAGEFLAG(uname, lname, policy)  \
 static __always_inline \
 void folio_set_##lname(struct folio *folio)\
-{ set_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); }   \
+{  \
+   set_bit(PG_##lname, folio_flags(folio, FOLIO_##policy));\
+   dept_page_set_bit(>page, PG_##lname);\
+}  \
 static __always_inline void SetPage##uname(struct page *page)  \
-{ set_bit(PG_##lname, (page, 1)->flags); }
+{  \
+   set_bit(PG_##lname, (page, 1)->flags);   \
+   dept_page_set_bit(page, PG_##lname);\
+}
 
 #define CLEARPAGEFLAG(uname, lname, policy)\
 static __always_inline \
 void folio_clear_##lname(struct folio *folio)  \
-{ clear_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); } \
+{  \
+   clear_bit(PG_##lname, folio_flags(folio, FOLIO_##policy));  \
+   dept_page_clear_bit(>page, PG_##lname);  \
+}  

[PATCH v10 19/25] dept: Apply timeout consideration to swait

2023-06-26 Thread Byungchul Park
Now that CONFIG_DEPT_AGGRESSIVE_TIMEOUT_WAIT was introduced, apply the
consideration to swait, assuming an input 'ret' in ___swait_event()
macro is used as a timeout value.

Signed-off-by: Byungchul Park 
---
 include/linux/swait.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/swait.h b/include/linux/swait.h
index 02848211cef5..def1e47bb678 100644
--- a/include/linux/swait.h
+++ b/include/linux/swait.h
@@ -162,7 +162,7 @@ extern void finish_swait(struct swait_queue_head *q, struct 
swait_queue *wait);
struct swait_queue __wait;  \
long __ret = ret;   \
\
-   sdt_might_sleep_start(NULL);\
+   sdt_might_sleep_start_timeout(NULL, __ret); \
INIT_LIST_HEAD(&__wait.task_list);  \
for (;;) {  \
long __int = prepare_to_swait_event(, &__wait, state);\
-- 
2.17.1



[PATCH v10 01/25] llist: Move llist_{head, node} definition to types.h

2023-06-26 Thread Byungchul Park
llist_head and llist_node can be used by very primitives. For example,
Dept for tracking dependency uses llist things in its header. To avoid
header dependency, move those to types.h.

Signed-off-by: Byungchul Park 
---
 include/linux/llist.h | 8 
 include/linux/types.h | 8 
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/linux/llist.h b/include/linux/llist.h
index 85bda2d02d65..99cc3c30f79c 100644
--- a/include/linux/llist.h
+++ b/include/linux/llist.h
@@ -53,14 +53,6 @@
 #include 
 #include 
 
-struct llist_head {
-   struct llist_node *first;
-};
-
-struct llist_node {
-   struct llist_node *next;
-};
-
 #define LLIST_HEAD_INIT(name)  { NULL }
 #define LLIST_HEAD(name)   struct llist_head name = LLIST_HEAD_INIT(name)
 
diff --git a/include/linux/types.h b/include/linux/types.h
index ea8cf60a8a79..b12a44400877 100644
--- a/include/linux/types.h
+++ b/include/linux/types.h
@@ -187,6 +187,14 @@ struct hlist_node {
struct hlist_node *next, **pprev;
 };
 
+struct llist_head {
+   struct llist_node *first;
+};
+
+struct llist_node {
+   struct llist_node *next;
+};
+
 struct ustat {
__kernel_daddr_tf_tfree;
 #ifdef CONFIG_ARCH_32BIT_USTAT_F_TINODE
-- 
2.17.1



[PATCH v10 16/25] dept: Apply sdt_might_sleep_{start, end}() to dma fence wait

2023-06-26 Thread Byungchul Park
Makes Dept able to track dma fence waits.

Signed-off-by: Byungchul Park 
---
 drivers/dma-buf/dma-fence.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 406b4e26f538..1db4bc0e8adc 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define CREATE_TRACE_POINTS
 #include 
@@ -782,6 +783,7 @@ dma_fence_default_wait(struct dma_fence *fence, bool intr, 
signed long timeout)
cb.task = current;
list_add(, >cb_list);
 
+   sdt_might_sleep_start(NULL);
while (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, >flags) && ret > 
0) {
if (intr)
__set_current_state(TASK_INTERRUPTIBLE);
@@ -795,6 +797,7 @@ dma_fence_default_wait(struct dma_fence *fence, bool intr, 
signed long timeout)
if (ret > 0 && intr && signal_pending(current))
ret = -ERESTARTSYS;
}
+   sdt_might_sleep_end();
 
if (!list_empty())
list_del();
@@ -884,6 +887,7 @@ dma_fence_wait_any_timeout(struct dma_fence **fences, 
uint32_t count,
}
}
 
+   sdt_might_sleep_start(NULL);
while (ret > 0) {
if (intr)
set_current_state(TASK_INTERRUPTIBLE);
@@ -898,6 +902,7 @@ dma_fence_wait_any_timeout(struct dma_fence **fences, 
uint32_t count,
if (ret > 0 && intr && signal_pending(current))
ret = -ERESTARTSYS;
}
+   sdt_might_sleep_end();
 
__set_current_state(TASK_RUNNING);
 
-- 
2.17.1



[PATCH v10 12/25] dept: Distinguish each syscall context from another

2023-06-26 Thread Byungchul Park
It enters kernel mode on each syscall and each syscall handling should
be considered independently from the point of view of Dept. Otherwise,
Dept may wrongly track dependencies across different syscalls.

That might be a real dependency from user mode. However, now that Dept
just started to work, conservatively let Dept not track dependencies
across different syscalls.

Signed-off-by: Byungchul Park 
---
 arch/arm64/kernel/syscall.c |  2 ++
 arch/x86/entry/common.c |  4 +++
 include/linux/dept.h| 39 -
 kernel/dependency/dept.c| 67 +++--
 4 files changed, 63 insertions(+), 49 deletions(-)

diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
index a5de47e3df2b..e26d0cab0657 100644
--- a/arch/arm64/kernel/syscall.c
+++ b/arch/arm64/kernel/syscall.c
@@ -7,6 +7,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -105,6 +106,7 @@ static void el0_svc_common(struct pt_regs *regs, int scno, 
int sc_nr,
 */
 
local_daif_restore(DAIF_PROCCTX);
+   dept_kernel_enter();
 
if (flags & _TIF_MTE_ASYNC_FAULT) {
/*
diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 6c2826417b33..7cdd27abe529 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_XEN_PV
 #include 
@@ -72,6 +73,7 @@ static __always_inline bool do_syscall_x32(struct pt_regs 
*regs, int nr)
 
 __visible noinstr void do_syscall_64(struct pt_regs *regs, int nr)
 {
+   dept_kernel_enter();
add_random_kstack_offset();
nr = syscall_enter_from_user_mode(regs, nr);
 
@@ -120,6 +122,7 @@ __visible noinstr void do_int80_syscall_32(struct pt_regs 
*regs)
 {
int nr = syscall_32_enter(regs);
 
+   dept_kernel_enter();
add_random_kstack_offset();
/*
 * Subtlety here: if ptrace pokes something larger than 2^31-1 into
@@ -140,6 +143,7 @@ static noinstr bool __do_fast_syscall_32(struct pt_regs 
*regs)
int nr = syscall_32_enter(regs);
int res;
 
+   dept_kernel_enter();
add_random_kstack_offset();
/*
 * This cannot use syscall_enter_from_user_mode() as it has to
diff --git a/include/linux/dept.h b/include/linux/dept.h
index b6d45b4b1fd6..f62c7b6f42c6 100644
--- a/include/linux/dept.h
+++ b/include/linux/dept.h
@@ -25,11 +25,16 @@ struct task_struct;
 #define DEPT_MAX_SUBCLASSES_USR(DEPT_MAX_SUBCLASSES / 
DEPT_MAX_SUBCLASSES_EVT)
 #define DEPT_MAX_SUBCLASSES_CACHE  2
 
-#define DEPT_SIRQ  0
-#define DEPT_HIRQ  1
-#define DEPT_IRQS_NR   2
-#define DEPT_SIRQF (1UL << DEPT_SIRQ)
-#define DEPT_HIRQF (1UL << DEPT_HIRQ)
+enum {
+   DEPT_CXT_SIRQ = 0,
+   DEPT_CXT_HIRQ,
+   DEPT_CXT_IRQS_NR,
+   DEPT_CXT_PROCESS = DEPT_CXT_IRQS_NR,
+   DEPT_CXTS_NR
+};
+
+#define DEPT_SIRQF (1UL << DEPT_CXT_SIRQ)
+#define DEPT_HIRQF (1UL << DEPT_CXT_HIRQ)
 
 struct dept_ecxt;
 struct dept_iecxt {
@@ -94,8 +99,8 @@ struct dept_class {
/*
 * for tracking IRQ dependencies
 */
-   struct dept_iecxt iecxt[DEPT_IRQS_NR];
-   struct dept_iwait iwait[DEPT_IRQS_NR];
+   struct dept_iecxt iecxt[DEPT_CXT_IRQS_NR];
+   struct dept_iwait iwait[DEPT_CXT_IRQS_NR];
 
/*
 * classified by a map embedded in task_struct,
@@ -207,8 +212,8 @@ struct dept_ecxt {
/*
 * where the IRQ-enabled happened
 */
-   unsigned long   enirq_ip[DEPT_IRQS_NR];
-   struct dept_stack *enirq_stack[DEPT_IRQS_NR];
+   unsigned long   enirq_ip[DEPT_CXT_IRQS_NR];
+   struct dept_stack *enirq_stack[DEPT_CXT_IRQS_NR];
 
/*
 * where the event context started
@@ -252,8 +257,8 @@ struct dept_wait {
/*
 * where the IRQ wait happened
 */
-   unsigned long   irq_ip[DEPT_IRQS_NR];
-   struct dept_stack *irq_stack[DEPT_IRQS_NR];
+   unsigned long   irq_ip[DEPT_CXT_IRQS_NR];
+   struct dept_stack *irq_stack[DEPT_CXT_IRQS_NR];
 
/*
 * where the wait happened
@@ -406,19 +411,19 @@ struct dept_task {
int wait_hist_pos;
 
/*
-* sequential id to identify each IRQ context
+* sequential id to identify each context
 */
-   unsigned int

[PATCH v10 00/25] DEPT(Dependency Tracker)

2023-06-26 Thread Byungchul Park
>From now on, I can work on LKML again! I'm wondering if DEPT has been
helping kernel debugging well even though it's a form of patches yet.

I'm happy to see that DEPT reports a real problem in practice. See:

   
https://lore.kernel.org/lkml/6383cde5-cf4b-facf-6e07-1378a4856...@i-love.sakura.ne.jp/#t
   
https://lore.kernel.org/lkml/1674268856-31807-1-git-send-email-byungchul.p...@lge.com/

Nevertheless, I apologize for the lack of document. I promise to add it
before it gets needed to use DEPT's APIs by users. For now, you can use
DEPT just with CONFIG_DEPT on.

---

Hi Linus and folks,

I've been developing a tool for detecting deadlock possibilities by
tracking wait/event rather than lock(?) acquisition order to try to
cover all synchonization machanisms. It's done on v6.2-rc2.

Benifit:

0. Works with all lock primitives.
1. Works with wait_for_completion()/complete().
2. Works with 'wait' on PG_locked.
3. Works with 'wait' on PG_writeback.
4. Works with swait/wakeup.
5. Works with waitqueue.
6. Works with wait_bit.
7. Multiple reports are allowed.
8. Deduplication control on multiple reports.
9. Withstand false positives thanks to 6.
10. Easy to tag any wait/event.

Future work:

0. To make it more stable.
1. To separates Dept from Lockdep.
2. To improves performance in terms of time and space.
3. To use Dept as a dependency engine for Lockdep.
4. To add any missing tags of wait/event in the kernel.
5. To deduplicate stack trace.

How to interpret reports:

1. E(event) in each context cannot be triggered because of the
   W(wait) that cannot be woken.
2. The stack trace helping find the problematic code is located
   in each conext's detail.

Thanks,
Byungchul

---

Changes from v9:

1. Fix a bug. SDT tracking didn't work well because of my big
   mistake that I should've used waiter's map to indentify its
   class but it had been working with waker's one. FYI,
   PG_locked and PG_writeback weren't affected. They still
   worked well. (reported by YoungJun)

Changes from v8:

1. Fix build error by adding EXPORT_SYMBOL(PG_locked_map) and
   EXPORT_SYMBOL(PG_writeback_map) for kernel module build -
   appologize for that. (reported by kernel test robot)
2. Fix build error by removing header file's circular dependency
   that was caused by "atomic.h", "kernel.h" and "irqflags.h",
   which I introduced - appolgize for that. (reported by kernel
   test robot)

Changes from v7:

1. Fix a bug that cannot track rwlock dependency properly,
   introduced in v7. (reported by Boqun and lockdep selftest)
2. Track wait/event of PG_{locked,writeback} more aggressively
   assuming that when a bit of PG_{locked,writeback} is cleared
   there might be waits on the bit. (reported by Linus, Hillf
   and syzbot)
3. Fix and clean bad style code e.i. unnecessarily introduced
   a randome pattern and so on. (pointed out by Linux)
4. Clean code for applying DEPT to wait_for_completion().

Changes from v6:

1. Tie to task scheduler code to track sleep and try_to_wake_up()
   assuming sleeps cause waits, try_to_wake_up()s would be the
   events that those are waiting for, of course with proper DEPT
   annotations, sdt_might_sleep_weak(), sdt_might_sleep_strong()
   and so on. For these cases, class is classified at sleep
   entrance rather than the synchronization initialization code.
   Which would extremely reduce false alarms.
2. Remove the DEPT associated instance in each page struct for
   tracking dependencies by PG_locked and PG_writeback thanks to
   the 1. work above.
3. Introduce CONFIG_DEPT_AGGRESIVE_TIMEOUT_WAIT to suppress
   reports that waits with timeout set are involved, for those
   who don't like verbose reporting.
4. Add a mechanism to refill the internal memory pools on
   running out so that DEPT could keep working as long as free
   memory is available in the system.
5. Re-enable tracking hashed-waitqueue wait. That's going to no
   longer generate false positives because class is classified
   at sleep entrance rather than the waitqueue initailization.
6. Refactor to make it easier to port onto each new version of
   the kernel.
7. Apply DEPT to dma fence.
8. Do trivial optimizaitions.

Changes from v5:

1. Use just pr_warn_once() rather than WARN_ONCE() on the lack
   of internal resources because WARN_*() printing stacktrace is
   too much for informing the lack. (feedback from Ted, Hyeonggon)
2. Fix trivial bugs like missing initializing a struct 

[PATCH v10 09/25] dept: Apply sdt_might_sleep_{start,end}() to swait

2023-06-26 Thread Byungchul Park
Makes Dept able to track dependencies by swaits.

Signed-off-by: Byungchul Park 
---
 include/linux/swait.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/linux/swait.h b/include/linux/swait.h
index 6a8c22b8c2a5..02848211cef5 100644
--- a/include/linux/swait.h
+++ b/include/linux/swait.h
@@ -6,6 +6,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 /*
@@ -161,6 +162,7 @@ extern void finish_swait(struct swait_queue_head *q, struct 
swait_queue *wait);
struct swait_queue __wait;  \
long __ret = ret;   \
\
+   sdt_might_sleep_start(NULL);\
INIT_LIST_HEAD(&__wait.task_list);  \
for (;;) {  \
long __int = prepare_to_swait_event(, &__wait, state);\
@@ -176,6 +178,7 @@ extern void finish_swait(struct swait_queue_head *q, struct 
swait_queue *wait);
cmd;\
}   \
finish_swait(, &__wait); \
+   sdt_might_sleep_end();  \
 __out: __ret;  \
 })
 
-- 
2.17.1



[PATCH v10 17/25] dept: Track timeout waits separately with a new Kconfig

2023-06-26 Thread Byungchul Park
Waits with valid timeouts don't actually cause deadlocks. However, Dept
has been reporting the cases as well because it's worth informing the
circular dependency for some cases where, for example, timeout is used
to avoid a deadlock but not meant to be expired.

However, yes, there are also a lot of, even more, cases where timeout
is used for its clear purpose and meant to be expired.

Let Dept report these as an information rather than shouting DEADLOCK.
Plus, introduced CONFIG_DEPT_AGGRESSIVE_TIMEOUT_WAIT Kconfig to make it
optional so that any reports involving waits with timeouts can be turned
on/off depending on the purpose.

Signed-off-by: Byungchul Park 
---
 include/linux/dept.h | 15 ++---
 include/linux/dept_ldt.h |  6 ++--
 include/linux/dept_sdt.h | 12 +---
 kernel/dependency/dept.c | 66 ++--
 lib/Kconfig.debug| 10 ++
 5 files changed, 89 insertions(+), 20 deletions(-)

diff --git a/include/linux/dept.h b/include/linux/dept.h
index 583e8fe2dd7b..0aa8d90558a9 100644
--- a/include/linux/dept.h
+++ b/include/linux/dept.h
@@ -270,6 +270,11 @@ struct dept_wait {
 * whether this wait is for commit in scheduler
 */
boolsched_sleep;
+
+   /*
+* whether a timeout is set
+*/
+   booltimeout;
};
};
 };
@@ -458,6 +463,7 @@ struct dept_task {
boolstage_sched_map;
const char  *stage_w_fn;
unsigned long   stage_ip;
+   boolstage_timeout;
 
/*
 * the number of missing ecxts
@@ -496,6 +502,7 @@ struct dept_task {
.stage_sched_map = false,   \
.stage_w_fn = NULL, \
.stage_ip = 0UL,\
+   .stage_timeout = false, \
.missing_ecxt = 0,  \
.hardirqs_enabled = false,  \
.softirqs_enabled = false,  \
@@ -513,8 +520,8 @@ extern void dept_map_init(struct dept_map *m, struct 
dept_key *k, int sub_u, con
 extern void dept_map_reinit(struct dept_map *m, struct dept_key *k, int sub_u, 
const char *n);
 extern void dept_map_copy(struct dept_map *to, struct dept_map *from);
 
-extern void dept_wait(struct dept_map *m, unsigned long w_f, unsigned long ip, 
const char *w_fn, int sub_l);
-extern void dept_stage_wait(struct dept_map *m, struct dept_key *k, unsigned 
long ip, const char *w_fn);
+extern void dept_wait(struct dept_map *m, unsigned long w_f, unsigned long ip, 
const char *w_fn, int sub_l, long timeout);
+extern void dept_stage_wait(struct dept_map *m, struct dept_key *k, unsigned 
long ip, const char *w_fn, long timeout);
 extern void dept_request_event_wait_commit(void);
 extern void dept_clean_stage(void);
 extern void dept_stage_event(struct task_struct *t, unsigned long ip);
@@ -566,8 +573,8 @@ struct dept_task { };
 #define dept_map_reinit(m, k, su, n)   do { (void)(n); 
(void)(k); } while (0)
 #define dept_map_copy(t, f)do { } while (0)
 
-#define dept_wait(m, w_f, ip, w_fn, sl)do { 
(void)(w_fn); } while (0)
-#define dept_stage_wait(m, k, ip, w_fn)do { (void)(k); 
(void)(w_fn); } while (0)
+#define dept_wait(m, w_f, ip, w_fn, sl, t) do { (void)(w_fn); } 
while (0)
+#define dept_stage_wait(m, k, ip, w_fn, t) do { (void)(k); 
(void)(w_fn); } while (0)
 #define dept_request_event_wait_commit()   do { } while (0)
 #define dept_clean_stage() do { } while (0)
 #define dept_stage_event(t, ip)do { } while (0)
diff --git a/include/linux/dept_ldt.h b/include/linux/dept_ldt.h
index 062613e89fc3..8adf298dfcb8 100644
--- a/include/linux/dept_ldt.h
+++ b/include/linux/dept_ldt.h
@@ -27,7 +27,7 @@
else if (t) \
dept_ecxt_enter(m, LDT_EVT_L, i, "trylock", "unlock", 
sl);\
else {  \
-   dept_wait(m, LDT_EVT_L, i, "lock", sl); \
+   dept_wait(m, LDT_EVT_L, i, "lock", sl, false);  \
dept_ecxt_enter(m, LDT_EVT_L, i, "lock", "unlock", sl);\
}   \
} while (0)
@@ -39,7 +39,7 @@
else if (t) \
dept_ecxt_enter(m, LDT_EVT_R, i, "read_trylock", 
"read_unlock", sl);\
else { 

[PATCH v10 04/25] dept: Add lock dependency tracker APIs

2023-06-26 Thread Byungchul Park
Wrapped the base APIs for easier annotation on typical lock.

Signed-off-by: Byungchul Park 
---
 include/linux/dept_ldt.h | 77 
 1 file changed, 77 insertions(+)
 create mode 100644 include/linux/dept_ldt.h

diff --git a/include/linux/dept_ldt.h b/include/linux/dept_ldt.h
new file mode 100644
index ..062613e89fc3
--- /dev/null
+++ b/include/linux/dept_ldt.h
@@ -0,0 +1,77 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Lock Dependency Tracker
+ *
+ * Started by Byungchul Park :
+ *
+ *  Copyright (c) 2020 LG Electronics, Inc., Byungchul Park
+ */
+
+#ifndef __LINUX_DEPT_LDT_H
+#define __LINUX_DEPT_LDT_H
+
+#include 
+
+#ifdef CONFIG_DEPT
+#define LDT_EVT_L  1UL
+#define LDT_EVT_R  2UL
+#define LDT_EVT_W  1UL
+#define LDT_EVT_RW (LDT_EVT_R | LDT_EVT_W)
+#define LDT_EVT_ALL(LDT_EVT_L | LDT_EVT_RW)
+
+#define ldt_init(m, k, su, n)  dept_map_init(m, k, su, n)
+#define ldt_lock(m, sl, t, n, i)   \
+   do {\
+   if (n)  \
+   dept_ecxt_enter_nokeep(m);  \
+   else if (t) \
+   dept_ecxt_enter(m, LDT_EVT_L, i, "trylock", "unlock", 
sl);\
+   else {  \
+   dept_wait(m, LDT_EVT_L, i, "lock", sl); \
+   dept_ecxt_enter(m, LDT_EVT_L, i, "lock", "unlock", sl);\
+   }   \
+   } while (0)
+
+#define ldt_rlock(m, sl, t, n, i, q)   \
+   do {\
+   if (n)  \
+   dept_ecxt_enter_nokeep(m);  \
+   else if (t) \
+   dept_ecxt_enter(m, LDT_EVT_R, i, "read_trylock", 
"read_unlock", sl);\
+   else {  \
+   dept_wait(m, q ? LDT_EVT_RW : LDT_EVT_W, i, 
"read_lock", sl);\
+   dept_ecxt_enter(m, LDT_EVT_R, i, "read_lock", 
"read_unlock", sl);\
+   }   \
+   } while (0)
+
+#define ldt_wlock(m, sl, t, n, i)  \
+   do {\
+   if (n)  \
+   dept_ecxt_enter_nokeep(m);  \
+   else if (t) \
+   dept_ecxt_enter(m, LDT_EVT_W, i, "write_trylock", 
"write_unlock", sl);\
+   else {  \
+   dept_wait(m, LDT_EVT_RW, i, "write_lock", sl);  \
+   dept_ecxt_enter(m, LDT_EVT_W, i, "write_lock", 
"write_unlock", sl);\
+   }   \
+   } while (0)
+
+#define ldt_unlock(m, i)   dept_ecxt_exit(m, LDT_EVT_ALL, i)
+
+#define ldt_downgrade(m, i)\
+   do {\
+   if (dept_ecxt_holding(m, LDT_EVT_W))\
+   dept_map_ecxt_modify(m, LDT_EVT_W, NULL, LDT_EVT_R, i, 
"downgrade", "read_unlock", -1);\
+   } while (0)
+
+#define ldt_set_class(m, n, k, sl, i)  dept_map_ecxt_modify(m, LDT_EVT_ALL, k, 
0UL, i, "lock_set_class", "(any)unlock", sl)
+#else /* !CONFIG_DEPT */
+#define ldt_init(m, k, su, n)  do { (void)(k); } while (0)
+#define ldt_lock(m, sl, t, n, i)   do { } while (0)
+#define ldt_rlock(m, sl, t, n, i, q)   do { } while (0)
+#define ldt_wlock(m, sl, t, n, i)  do { } while (0)
+#define ldt_unlock(m, i)   do { } while (0)
+#define ldt_downgrade(m, i)do { } while (0)
+#define ldt_set_class(m, n, k, sl, i)  do { } while (0)
+#endif
+#endif /* __LINUX_DEPT_LDT_H */
-- 
2.17.1



Re: [PATCH 6/9] drm: atmel_hlcdc: Add support for XLCDC in atmel LCD driver

2023-06-26 Thread Sam Ravnborg
Hi Manikandan,

On Tue, Jun 13, 2023 at 12:34:23PM +0530, Manikandan Muralidharan wrote:
> - XLCDC in SAM9X7 has different sets of registers and additional
> configuration bits when compared to previous HLCDC IP. Read/write
> operation on the controller registers is now separated using the
> XLCDC status flag.
>   - HEO scaling, window resampling, Alpha blending, YUV-to-RGB
> conversion in XLCDC is derived and handled using additional
> configuration bits and registers.
>   - Writing one to the Enable fields of each layer in LCD_ATTRE
> is required to reflect the values set in Configuration, FBA, Enable
> registers of each layer

In general things would benefit from a more clear separation between
hlcdc and xlcdc. In several cases two functions would be better than
testing as done today.

See some more specific comments in the following.

Sam

> 
> Signed-off-by: Manikandan Muralidharan 
> [hari.prasat...@microchip.com: update the attribute field for each layer]
> Signed-off-by: Hari Prasath Gujulan Elango 
> [durai.manicka...@microchip.com: implement status flag to seprate register 
> update]
> Signed-off-by: Durai Manickam KR 
> ---
>  .../gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c|  28 +-
>  .../gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c   | 297 ++
>  2 files changed, 256 insertions(+), 69 deletions(-)
> 
> diff --git a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c 
> b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c
> index 58184cd6ab0b..7c9cf7c0c75d 100644
> --- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c
> +++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c
> @@ -139,10 +139,10 @@ static void atmel_hlcdc_crtc_mode_set_nofb(struct 
> drm_crtc *c)
>   state = drm_crtc_state_to_atmel_hlcdc_crtc_state(c->state);
>   cfg = state->output_mode << 8;
>  
> - if (adj->flags & DRM_MODE_FLAG_NVSYNC)
> + if (!crtc->dc->is_xlcdc && (adj->flags & DRM_MODE_FLAG_NVSYNC))
>   cfg |= ATMEL_HLCDC_VSPOL;
>  
> - if (adj->flags & DRM_MODE_FLAG_NHSYNC)
> + if (!crtc->dc->is_xlcdc && (adj->flags & DRM_MODE_FLAG_NHSYNC))
>   cfg |= ATMEL_HLCDC_HSPOL;

>From the above code I assume ATMEL_HLCDC_VSPOL and ATMEL_HLCDC_HSPOL are
specific to HLCDC and not part of XLCDC register set.

We can identify XLCDC specific registers as thy use "XLCDC" in the name.
But the "HLCDC" specific registers are not easy to spot.

Would it make sense to update the register definitions so we can see in
the register names which at XLCDC, which are HLCDC and which a common
(LCDC)?

It would require a little code crunch to do so as all users would need
an update. Dunno if this is worth it.
But then at least a comment in the register definition file.


>  
>   regmap_update_bits(regmap, ATMEL_HLCDC_CFG(5),
> @@ -177,6 +177,18 @@ static void atmel_hlcdc_crtc_atomic_disable(struct 
> drm_crtc *c,
>  
>   pm_runtime_get_sync(dev->dev);
>  
> + if (crtc->dc->is_xlcdc) {
> + regmap_write(regmap, ATMEL_HLCDC_DIS, ATMEL_XLCDC_CM);
> + while (!regmap_read(regmap, ATMEL_HLCDC_SR, ) &&
> +(status & ATMEL_XLCDC_CM))
> + cpu_relax();
> +
> + regmap_write(regmap, ATMEL_HLCDC_DIS, ATMEL_XLCDC_SD);
> + while (!regmap_read(regmap, ATMEL_HLCDC_SR, ) &&
> +!(status & ATMEL_XLCDC_SD))
> + cpu_relax();
> + }

A small helper atmel_xlcdc_disable()?

> +
>   regmap_write(regmap, ATMEL_HLCDC_DIS, ATMEL_HLCDC_DISP);
>   while (!regmap_read(regmap, ATMEL_HLCDC_SR, ) &&
>  (status & ATMEL_HLCDC_DISP))
> @@ -231,6 +243,18 @@ static void atmel_hlcdc_crtc_atomic_enable(struct 
> drm_crtc *c,
>  !(status & ATMEL_HLCDC_DISP))
>   cpu_relax();
>  
> + if (crtc->dc->is_xlcdc) {
> + regmap_write(regmap, ATMEL_HLCDC_EN, ATMEL_XLCDC_CM);
> + while (!regmap_read(regmap, ATMEL_HLCDC_SR, ) &&
> +!(status & ATMEL_XLCDC_CM))
> + cpu_relax();
> +
> + regmap_write(regmap, ATMEL_HLCDC_EN, ATMEL_XLCDC_SD);
> + while (!regmap_read(regmap, ATMEL_HLCDC_SR, ) &&
> +(status & ATMEL_XLCDC_SD))
> + cpu_relax();
> + }

A small helper atmel_xlcdc_enable()?

> +
>   pm_runtime_put_sync(dev->dev);
>  
>  }
> diff --git a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c 
> b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c
> index daa508504f47..fe33476818c4 100644
> --- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c
> +++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c
> @@ -330,11 +330,59 @@ static void atmel_hlcdc_plane_setup_scaler(struct 
> atmel_hlcdc_plane *plane,
>yfactor));
>  }
>  
> +static void atmel_xlcdc_plane_setup_scaler(struct atmel_hlcdc_plane *plane,
> +struct 

Re: [PATCH 06/15] dt-bindings: display/msm: sc7180-dpu: Describe SM6125

2023-06-26 Thread Marijn Suijten
On 2023-06-26 20:57:51, Konrad Dybcio wrote:
> On 26.06.2023 19:54, Marijn Suijten wrote:
> > On 2023-06-26 18:16:58, Krzysztof Kozlowski wrote:
> >> On 25/06/2023 21:52, Marijn Suijten wrote:
> >>> On 2023-06-24 11:12:52, Krzysztof Kozlowski wrote:
>  On 24/06/2023 02:41, Marijn Suijten wrote:
> > SM6125 is identical to SM6375 except that while downstream also defines
> > a throttle clock, its presence results in timeouts whereas SM6375
> > requires it to not observe any timeouts.
> 
>  Then it should not be allowed, so you need either "else:" block or
>  another "if: properties: compatible:" to disallow it. Because in current
>  patch it would be allowed.
> >>>
> >>> That means this binding is wrong/incomplete for all other SoCs then.
> >>> clock(-name)s has 6 items, and sets `minItems: 6`.  Only for sm6375-dpu
> > 
> > Of course meant to say that clock(-name)s has **7** items, not 6.
> > 
> >>> does it set `minItems: 7`, but an else case is missing.
> >>
> >> Ask the author why it is done like this.
> > 
> > Konrad, can you clarify why other 

(Looks like I forgot to complete this sentence before sending,
apologies)

> 6375 needs the throttle clk and the clock(-names) are strongly ordered
> so having minItems: 6 discards the last entry

The question is whether or not we should have maxItems: 6 to disallow
the clock from being passed: right now it is optional and either is
allowed for any !6375 SoC.

- Marijn

> 
> Konrad
> > 
> >>> Shall I send a Fixes: ed41005f5b7c ("dt-bindings: display/msm:
> >>> sc7180-dpu: Describe SM6350 and SM6375") for that, and should maxItems:
> >>> 6 be the default under clock(-name)s or in an else:?
> >>
> >> There is no bug to fix. Or at least it is not yet known. Whether other
> >> devices should be constrained as well - sure, sounds reasonable, but I
> >> did not check the code exactly.
> > 
> > I don't know either, but we need this information to decide whether to
> > use `maxItems: 6`:
> > 
> > 1. Directly on the property;
> > 2. In an `else:` case on the current `if: sm6375-dpu` (should have the
> >same effect as 1., afaik);
> > 3. In a second `if:` case that lists all SoCS explicitly.
> > 
> > Since we don't have this information, I think option 3. is the right way
> > to go, setting `maxItems: 6` for qcom,sm6125-dpu.
> > 
> > However, it is not yet understood why downstream is able to use the
> > throttle clock without repercussions.
> > 
> >> We talk here about this patch.
> > 
> > We used this patch to discover that other SoCs are similarly
> > unconstrained.  But if you don't want me to look into it, by all means!
> > Saves me a lot of time.  So I will go with option 3.
> > 
> > - Marijn


Re: [PATCH 3/9] drm: atmel-hlcdc: add LCD controller layer definition for SAM9X7

2023-06-26 Thread Sam Ravnborg
Hi Manikandan,

On Tue, Jun 13, 2023 at 12:34:20PM +0530, Manikandan Muralidharan wrote:
> Add the LCD controller layer definition and descriptor structure for SAM9X7
> for the following layers,
> - Base Layer
> - Overlay1 Layer
> - Overlay2 Layer
> - High End Overlay
> 
> Signed-off-by: Manikandan Muralidharan 
> ---
>  drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c | 96 
>  1 file changed, 96 insertions(+)
> 
> diff --git a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c 
> b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c
> index fa0f9a93d50d..d7ad828e9e8c 100644
> --- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c
> +++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c
> @@ -462,6 +462,98 @@ static const struct atmel_hlcdc_dc_desc 
> atmel_hlcdc_dc_sam9x60 = {
>   .layers = atmel_hlcdc_sam9x60_layers,
>  };
>  
> +static const struct atmel_hlcdc_layer_desc atmel_xlcdc_sam9x7_layers[] = {
> + {
> + .name = "base",
> + .formats = _hlcdc_plane_rgb_formats,
> + .regs_offset = 0x60,
> + .id = 0,
> + .type = ATMEL_HLCDC_BASE_LAYER,
> + .cfgs_offset = 0x1c,
> + .layout = {
> + .xstride = { 2 },
> + .default_color = 3,
> + .general_config = 4,
> + .disc_pos = 5,
> + .disc_size = 6,
> + },
> + .clut_offset = 0x700,
> + },
> + {
> + .name = "overlay1",
> + .formats = _hlcdc_plane_rgb_formats,
> + .regs_offset = 0x160,
> + .id = 1,
> + .type = ATMEL_HLCDC_OVERLAY_LAYER,
> + .cfgs_offset = 0x1c,
> + .layout = {
> + .pos = 2,
> + .size = 3,
> + .xstride = { 4 },
> + .pstride = { 5 },
> + .default_color = 6,
> + .chroma_key = 7,
> + .chroma_key_mask = 8,
> + .general_config = 9,
> + },
> + .clut_offset = 0xb00,
> + },
> + {
> + .name = "overlay2",
> + .formats = _hlcdc_plane_rgb_formats,
> + .regs_offset = 0x260,
> + .id = 2,
> + .type = ATMEL_HLCDC_OVERLAY_LAYER,
> + .cfgs_offset = 0x1c,
> + .layout = {
> + .pos = 2,
> + .size = 3,
> + .xstride = { 4 },
> + .pstride = { 5 },
> + .default_color = 6,
> + .chroma_key = 7,
> + .chroma_key_mask = 8,
> + .general_config = 9,
> + },
> + .clut_offset = 0xf00,
> + },
> + {
> + .name = "high-end-overlay",
> + .formats = _hlcdc_plane_rgb_and_yuv_formats,
> + .regs_offset = 0x360,
> + .id = 3,
> + .type = ATMEL_HLCDC_OVERLAY_LAYER,
> + .cfgs_offset = 0x30,
> + .layout = {
> + .pos = 2,
> + .size = 3,
> + .memsize = 4,
> + .xstride = { 5, 7 },
> + .pstride = { 6, 8 },
> + .default_color = 9,
> + .chroma_key = 10,
> + .chroma_key_mask = 11,
> + .general_config = 12,
> + .csc = 16,
> + .scaler_config = 23,
> + },
> + .clut_offset = 0x1300,
> + },
> +};
> +
> +static const struct atmel_hlcdc_dc_desc atmel_xlcdc_dc_sam9x7 = {
> + .min_width = 0,
> + .min_height = 0,
> + .max_width = 2048,
> + .max_height = 2048,
> + .max_spw = 0xff,
> + .max_vpw = 0xff,
> + .max_hpw = 0x3ff,
> + .fixed_clksrc = true,
> + .nlayers = ARRAY_SIZE(atmel_xlcdc_sam9x7_layers),
> + .layers = atmel_xlcdc_sam9x7_layers,
> +};

As already suggested by someone else, add is_xlcdc to struct
atmel_hlcdc_dc_desc, so the check for the compatible can be dropped.
It would be better to put it here.


Sam


Re: [PATCH v1 0/2] udmabuf: Add back support for mapping hugetlb pages

2023-06-26 Thread Peter Xu
On Mon, Jun 26, 2023 at 03:18:48PM -0300, Jason Gunthorpe wrote:
> On Mon, Jun 26, 2023 at 08:14:27PM +0200, David Hildenbrand wrote:
> 
> > So we might have to implement the same page migration as gup does on
> > FOLL_LONGTERM here ... maybe there are more such cases/drivers that actually
> > require that handling when simply taking pages out of the memfd, believing
> > they can hold on to them forever.
> 
> In general I would like to see an interface to FOLL_LONGTERM pin pages
> from a memfd. I would quite happily use that in iommufd as well.
> 
> It solves some problems we have there with fork/exec/etc if the pages
> are not linked to a mm_struct.

Afaiu any fd based approach should mean it'll never work with private
memories, while mm-based should be able to work on any kind.  Maybe that's
not a problem - I assume at least udmabuf should just only work with shared
memories; not sure on iommufd, though.

-- 
Peter Xu



[PATCH] drm/mediatek: Fix potential memory leak if vmap() fail

2023-06-26 Thread Sui Jingfeng
Also return -ENOMEM if such a failure happens, the implement should take
responsibility for the error handling.

Signed-off-by: Sui Jingfeng 
---
 drivers/gpu/drm/mediatek/mtk_drm_gem.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/mediatek/mtk_drm_gem.c 
b/drivers/gpu/drm/mediatek/mtk_drm_gem.c
index a25b28d3ee90..9f364df52478 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_gem.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_gem.c
@@ -247,7 +247,11 @@ int mtk_drm_gem_prime_vmap(struct drm_gem_object *obj, 
struct iosys_map *map)
 
mtk_gem->kvaddr = vmap(mtk_gem->pages, npages, VM_MAP,
   pgprot_writecombine(PAGE_KERNEL));
-
+   if (!mtk_gem->kvaddr) {
+   kfree(sgt);
+   kfree(mtk_gem->pages);
+   return -ENOMEM;
+   }
 out:
kfree(sgt);
iosys_map_set_vaddr(map, mtk_gem->kvaddr);
-- 
2.25.1



Re: [PATCH 06/15] dt-bindings: display/msm: sc7180-dpu: Describe SM6125

2023-06-26 Thread Konrad Dybcio
On 26.06.2023 19:54, Marijn Suijten wrote:
> On 2023-06-26 18:16:58, Krzysztof Kozlowski wrote:
>> On 25/06/2023 21:52, Marijn Suijten wrote:
>>> On 2023-06-24 11:12:52, Krzysztof Kozlowski wrote:
 On 24/06/2023 02:41, Marijn Suijten wrote:
> SM6125 is identical to SM6375 except that while downstream also defines
> a throttle clock, its presence results in timeouts whereas SM6375
> requires it to not observe any timeouts.

 Then it should not be allowed, so you need either "else:" block or
 another "if: properties: compatible:" to disallow it. Because in current
 patch it would be allowed.
>>>
>>> That means this binding is wrong/incomplete for all other SoCs then.
>>> clock(-name)s has 6 items, and sets `minItems: 6`.  Only for sm6375-dpu
> 
> Of course meant to say that clock(-name)s has **7** items, not 6.
> 
>>> does it set `minItems: 7`, but an else case is missing.
>>
>> Ask the author why it is done like this.
> 
> Konrad, can you clarify why other 
6375 needs the throttle clk and the clock(-names) are strongly ordered
so having minItems: 6 discards the last entry

Konrad
> 
>>> Shall I send a Fixes: ed41005f5b7c ("dt-bindings: display/msm:
>>> sc7180-dpu: Describe SM6350 and SM6375") for that, and should maxItems:
>>> 6 be the default under clock(-name)s or in an else:?
>>
>> There is no bug to fix. Or at least it is not yet known. Whether other
>> devices should be constrained as well - sure, sounds reasonable, but I
>> did not check the code exactly.
> 
> I don't know either, but we need this information to decide whether to
> use `maxItems: 6`:
> 
> 1. Directly on the property;
> 2. In an `else:` case on the current `if: sm6375-dpu` (should have the
>same effect as 1., afaik);
> 3. In a second `if:` case that lists all SoCS explicitly.
> 
> Since we don't have this information, I think option 3. is the right way
> to go, setting `maxItems: 6` for qcom,sm6125-dpu.
> 
> However, it is not yet understood why downstream is able to use the
> throttle clock without repercussions.
> 
>> We talk here about this patch.
> 
> We used this patch to discover that other SoCs are similarly
> unconstrained.  But if you don't want me to look into it, by all means!
> Saves me a lot of time.  So I will go with option 3.
> 
> - Marijn


Re: [PATCH 03/15] dt-bindings: clock: qcom,dispcc-sm6125: Require GCC PLL0 DIV clock

2023-06-26 Thread Marijn Suijten
On 2023-06-26 20:51:38, Marijn Suijten wrote:

> > Not really, binding also defines the list of clocks - their order and
> > specific entries. This changes.
> 
> And so it does in "dt-bindings: clock: qcom,dispcc-sm6125: Remove unused
> GCC_DISP_AHB_CLK"?

Never mind: it is the last item so the order of the other items doesn't
change.  The total number of items decreases though, which sounds like
an ABI-break too?

- Marijn


Re: [PATCH 03/15] dt-bindings: clock: qcom,dispcc-sm6125: Require GCC PLL0 DIV clock

2023-06-26 Thread Marijn Suijten
On 2023-06-26 20:29:37, Krzysztof Kozlowski wrote:
> On 26/06/2023 19:49, Marijn Suijten wrote:
> > On 2023-06-26 18:10:44, Krzysztof Kozlowski wrote:
> >> On 25/06/2023 21:48, Marijn Suijten wrote:
> >>> On 2023-06-24 11:08:54, Krzysztof Kozlowski wrote:
>  On 24/06/2023 03:45, Konrad Dybcio wrote:
> > On 24.06.2023 02:41, Marijn Suijten wrote:
> >> The "gcc_disp_gpll0_div_clk_src" clock is consumed by the driver, will
> >> be passed from DT, and should be required by the bindings.
> >>
> >> Fixes: 8397c9c0c26b ("dt-bindings: clock: add QCOM SM6125 display 
> >> clock bindings")
> >> Signed-off-by: Marijn Suijten 
> >> ---
> > Ideally, you'd stick it at the bottom of the list, as the items: order
> > is part of the ABI
> 
>  Yes, please add them to the end. Order is fixed.
> >>>
> >>> Disagreed for bindings that declare clock-names and when the driver
> >>> adheres to it, see my reply to Konrad's message.
> >>
> >> That's the generic rule, with some exceptions of course. Whether one
> >> chosen driver (chosen system and chosen version of that system) adheres
> >> or not, does not change it. Other driver behaves differently and ABI is
> >> for everyone, not only for your specific version of Linux driver.
> >>
> >> Follow the rule.
> > 
> > This has no relation to the driver (just that our driver adheres to the
> > bindings, as it is supposed to be).  The bindings define a mapping from
> > a clock-names=<> entry to a clock on the same index in the clocks=<>
> > array.  That relation remains the same with this change.
> 
> Not really, binding also defines the list of clocks - their order and
> specific entries. This changes.

And so it does in "dt-bindings: clock: qcom,dispcc-sm6125: Remove unused
GCC_DISP_AHB_CLK"?

- Marijn


Re: [PATCH v3 00/17] Imagination Technologies PowerVR DRM driver

2023-06-26 Thread H. Nikolaus Schaller
Hi Frank,
I have added Merlijn who is doing a lot with PVRSGX for Maemo-Leste and the
phone-devel list since most SoC we find using a PVRSGX are smartphone 
processors.

> Am 26.06.2023 um 15:45 schrieb Frank Binns :
> 
> Hi Nikolaus,
> 
>> 
>> Some questions to the author of the new driver:
>> - are there plans to support SGX5 (the predecessor of Rogue6)?
> 
> We don't currently have any plans to support SGX. Our main focus is currently 
> on
> Rogue and then we'll move onto Volcanic.

Okay, that's completely understandable from a commercial perspective.

> 
>> - will this be able to run the existing firmware and user-space code of 
>> pvrsrvkm?
> 
> I'm afraid not. The interface for existing firmware and userspace code were
> designed with different requirements in mind and don't cater to the kernel's
> strict compatibility guarantees. As such, the uapi for this new driver is
> very different to pvrsrvkm's, although naturally there are similarities:
> https://gitlab.freedesktop.org/sarah-walker-imgtec/powervr/-/blob/dev/v3/include/uapi/drm/pvr_drm.h

This makes sense. So the new Rogue/Volcanic and the older PVRSGX drivers should
be able to coexist (at least in source code as there is no hardware having 
both).

> We've also worked with our firmware team to make changes to the firmware
> interface to better support this new driver. Specifically, parts of the 
> firmware
> interface are no longer conditional on the GPU feature set / hardware
> workarounds, meaning we now have a single interface which can work across all
> existing Rogue GPUs, which makes things a lot easier for this new kernel 
> driver.

That is what I have dreamed of for SGX as well.

We could have replaced all the #if for specific versions and errata by some code
to runtime check with the device tree for the specific SGX version.

But this was never done because it is complex, difficult to automate and our 
means
for testing things are limited. And we could not decide which DDK version we
should build on as there is no common denominator for all SoC.

> 
>> - or will it have new firmware and user-space code for these older chips?
>> - or will there be open user-space code for SGX (and Rogue)?
> 
> We're using the same Rogue firmware as our closed source driver, just with
> modifications to the interface to cater for this new kernel driver. In terms 
> of

Ok. Well, it was to be expected that SGX and Rogue firmware are quite different.

> userspace, we already have a Vulkan driver upstreamed to Mesa:
> https://gitlab.freedesktop.org/mesa/mesa/-/tree/main/src/imagination/vulkan

Nice!

> 
> and will be working to enable GLES support via Mesa's Zink GL(ES)-to-Vulkan
> translation layer. This naturally limits support to Series 6 onwards, as
> anything older isn't capable of supporting Vulkan.

I see.

> 
>> 
>>> Specifically I would ask that the DT bindings include all old and new 
>>> PowerVR
>>> hardware in one go, unless they have very specific hardware definition 
>>> needs,
>>> which I doubt.
>> 
>> Our current bindings for all SoC with a SGX5 GPU inside and which have at 
>> least
>> some official Linux support are here:
>> 
>> https://github.com/openpvrsgx-devgroup/linux_openpvrsgx/blob/letux/omap-pvr-soc-glue-v8/Documentation/devicetree/bindings/gpu/img%2Cpvrsgx.yaml
>> 
>> There was an attempt to upstream at least this plus glue code (i.e. device 
>> tree
>> sources) some years ago but there was no consensus about the number and 
>> names of
>> clocks that should be included in such a bindings document.
> 
> I've taken a look and your bindings look very similar to the ones we've come 
> up
> with. If you decide to attempt to upstream these again, please feel free to CC
> me, Sarah and Donald (all on this email chain) so we can provide some 
> feedback.

That is good!

It would be a good moment to give it another try as we can have even more
reviewers than before...

> 
>> 
>>> Also I think they could use your help to get the proper firmware for the 
>>> older
>>> hardware licensed properly from Imagination and included into linux-firmware
>>> so they do not need to ship files on the side.
>> 
>> Indeed, this and some "universal" user-space code would help a lot. Currently
>> we have collected a lot of binaries for several architectures (e.g. Intel, 
>> OMAP, JZ4780),
>> but all from different DDK versions and very different assumptions about 
>> system
>> library versions.
> 
> The way the SGX driver was designed means that it has to be built for a 
> specific
> GPU, the version of the firmware, userspace driver(s) & kernel driver have to
> exactly match and the build configuration has to match as well. Essentially, 
> we
> don't have "universal" userspace code ourselves. With our focus being on Rogue
> and beyond, we don't currently have any plans to work on this.

Hm. This makes me wonder if it could be possible to open source the SGX code
since it is a different architecture than Rogue, is no longer your focus and
AFAIK the 

[syzbot] Monthly dri report (Jun 2023)

2023-06-26 Thread syzbot
Hello dri maintainers/developers,

This is a 31-day syzbot report for the dri subsystem.
All related reports/information can be found at:
https://syzkaller.appspot.com/upstream/s/dri

During the period, 3 new issues were detected and 0 were fixed.
In total, 7 issues are still open and 30 have been fixed so far.

Some of the still happening issues:

Ref Crashes Repro Title
<1> 297 Yes   WARNING in drm_wait_one_vblank
  https://syzkaller.appspot.com/bug?extid=6f7fe2dbc479dca0ed17
<2> 32  Yes   inconsistent lock state in sync_info_debugfs_show
  https://syzkaller.appspot.com/bug?extid=007bfe0f3330f6e1e7d1
<3> 16  NoWARNING in vkms_get_vblank_timestamp (2)
  https://syzkaller.appspot.com/bug?extid=93bd128a383695391534
<4> 2   Yes   divide error in drm_mode_vrefresh
  https://syzkaller.appspot.com/bug?extid=622bba18029bcde672e1

---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkal...@googlegroups.com.

To disable reminders for individual bugs, reply with the following command:
#syz set  no-reminders

To change bug's subsystems, reply with:
#syz set  subsystems: new-subsystem

You may send multiple commands in a single email message.


[PATCH v4 1/1] drm/doc: Document DRM device reset expectations

2023-06-26 Thread André Almeida
Create a section that specifies how to deal with DRM device resets for
kernel and userspace drivers.

Signed-off-by: André Almeida 
---
 Documentation/gpu/drm-uapi.rst | 68 ++
 1 file changed, 68 insertions(+)

diff --git a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-uapi.rst
index 65fb3036a580..25a11b9b98fa 100644
--- a/Documentation/gpu/drm-uapi.rst
+++ b/Documentation/gpu/drm-uapi.rst
@@ -285,6 +285,74 @@ for GPU1 and GPU2 from different vendors, and a third 
handler for
 mmapped regular files. Threads cause additional pain with signal
 handling as well.
 
+Device reset
+
+
+The GPU stack is really complex and is prone to errors, from hardware bugs,
+faulty applications and everything in between the many layers. Some errors
+require resetting the device in order to make the device usable again. This
+section describes what is the expectations for DRM and usermode drivers when a
+device resets and how to propagate the reset status.
+
+Kernel Mode Driver
+--
+
+The KMD is responsible for checking if the device needs a reset, and to perform
+it as needed. Usually a hang is detected when a job gets stuck executing. KMD
+should keep track of resets, because userspace can query any time about the
+reset stats for an specific context. This is needed to propagate to the rest of
+the stack that a reset has happened. Currently, this is implemented by each
+driver separately, with no common DRM interface.
+
+User Mode Driver
+
+
+The UMD should check before submitting new commands to the KMD if the device 
has
+been reset, and this can be checked more often if it requires to. After
+detecting a reset, UMD will then proceed to report it to the application using
+the appropriated API error code, as explained in the below section about
+robustness.
+
+Robustness
+--
+
+The only way to try to keep an application working after a reset is if it
+complies with the robustness aspects of the graphical API that it is using.
+
+Graphical APIs provide ways to application to deal with device resets. However,
+there is no guarantee that the app will be correctly using such features, and
+UMD can implement policies to close the app if it is a repeating offender,
+likely in a broken loop. This is done to ensure that it does not keeps blocking
+the user interface from being correctly displayed. This should be done even if
+the app is correct but happens to trigger some bug in the hardware/driver.
+
+OpenGL
+~~
+
+Apps using OpenGL should use the available robust interfaces, like the
+extension ``GL_ARB_robustness`` (or ``GL_EXT_robustness`` for OpenGL ES). This
+interface tells if a reset has happened, and if so, all the context state is
+considered lost and the app proceeds by creating new ones. If is possible to
+determine that robustness is not in use, UMD will terminate the app when a 
reset
+is detected, giving that the contexts are lost and the app won't be able to
+figure this out and recreate the contexts.
+
+Vulkan
+~~
+
+Apps using Vulkan should check for ``VK_ERROR_DEVICE_LOST`` for submissions.
+This error code means, among other things, that a device reset has happened and
+it needs to recreate the contexts to keep going.
+
+Reporting resets causes
+---
+
+Apart from propagating the reset through the stack so apps can recover, it's
+really useful for driver developers to learn more about what caused the reset 
in
+first place. DRM devices should make use of devcoredump to store relevant
+information about the reset, so this information can be added to user bug
+reports.
+
 .. _drm_driver_ioctl:
 
 IOCTL Support on Device Nodes
-- 
2.41.0



[PATCH v4 0/1] drm/doc: Document DRM device reset expectations

2023-06-26 Thread André Almeida
This v4 removes the common DRM ioctl, and adds just the documentation for now,
giving the lack of a common "DRM context" infrascture make it hard to implement.

v3: https://lore.kernel.org/lkml/20230621005719.836857-1-andrealm...@igalia.com/

Changes: 
 - Drop the ioctl
 - Addresed comments com Pekka, as making the documentation more clear and
 consistent.

André Almeida (1):
  drm/doc: Document DRM device reset expectations

 Documentation/gpu/drm-uapi.rst | 68 ++
 1 file changed, 68 insertions(+)

-- 
2.41.0



Re: [PATCH 03/15] dt-bindings: clock: qcom, dispcc-sm6125: Require GCC PLL0 DIV clock

2023-06-26 Thread Krzysztof Kozlowski
On 26/06/2023 19:49, Marijn Suijten wrote:
> On 2023-06-26 18:10:44, Krzysztof Kozlowski wrote:
>> On 25/06/2023 21:48, Marijn Suijten wrote:
>>> On 2023-06-24 11:08:54, Krzysztof Kozlowski wrote:
 On 24/06/2023 03:45, Konrad Dybcio wrote:
> On 24.06.2023 02:41, Marijn Suijten wrote:
>> The "gcc_disp_gpll0_div_clk_src" clock is consumed by the driver, will
>> be passed from DT, and should be required by the bindings.
>>
>> Fixes: 8397c9c0c26b ("dt-bindings: clock: add QCOM SM6125 display clock 
>> bindings")
>> Signed-off-by: Marijn Suijten 
>> ---
> Ideally, you'd stick it at the bottom of the list, as the items: order
> is part of the ABI

 Yes, please add them to the end. Order is fixed.
>>>
>>> Disagreed for bindings that declare clock-names and when the driver
>>> adheres to it, see my reply to Konrad's message.
>>
>> That's the generic rule, with some exceptions of course. Whether one
>> chosen driver (chosen system and chosen version of that system) adheres
>> or not, does not change it. Other driver behaves differently and ABI is
>> for everyone, not only for your specific version of Linux driver.
>>
>> Follow the rule.
> 
> This has no relation to the driver (just that our driver adheres to the
> bindings, as it is supposed to be).  The bindings define a mapping from
> a clock-names=<> entry to a clock on the same index in the clocks=<>
> array.  That relation remains the same with this change.

Not really, binding also defines the list of clocks - their order and
specific entries. This changes.

Best regards,
Krzysztof



Re: [PATCH v1 0/2] udmabuf: Add back support for mapping hugetlb pages

2023-06-26 Thread Jason Gunthorpe
On Mon, Jun 26, 2023 at 08:14:27PM +0200, David Hildenbrand wrote:

> So we might have to implement the same page migration as gup does on
> FOLL_LONGTERM here ... maybe there are more such cases/drivers that actually
> require that handling when simply taking pages out of the memfd, believing
> they can hold on to them forever.

In general I would like to see an interface to FOLL_LONGTERM pin pages
from a memfd. I would quite happily use that in iommufd as well.

It solves some problems we have there with fork/exec/etc if the pages
are not linked to a mm_struct.

Jason


Re: [PATCH v1 0/2] udmabuf: Add back support for mapping hugetlb pages

2023-06-26 Thread David Hildenbrand

On 26.06.23 19:52, Peter Xu wrote:

On Mon, Jun 26, 2023 at 07:45:37AM +, Kasireddy, Vivek wrote:

Hi Peter,



On Fri, Jun 23, 2023 at 06:13:02AM +, Kasireddy, Vivek wrote:

Hi David,


The first patch ensures that the mappings needed for handling mmap
operation would be managed by using the pfn instead of struct page.
The second patch restores support for mapping hugetlb pages where
subpages of a hugepage are not directly used anymore (main reason
for revert) and instead the hugetlb pages and the relevant offsets
are used to populate the scatterlist for dma-buf export and for
mmap operation.

Testcase: default_hugepagesz=2M hugepagesz=2M hugepages=2500

options

were passed to the Host kernel and Qemu was launched with these
relevant options: qemu-system-x86_64 -m 4096m
-device virtio-gpu-pci,max_outputs=1,blob=true,xres=1920,yres=1080
-display gtk,gl=on
-object memory-backend-memfd,hugetlb=on,id=mem1,size=4096M
-machine memory-backend=mem1

Replacing -display gtk,gl=on with -display gtk,gl=off above would
exercise the mmap handler.



While I think the VM_PFNMAP approach is much better and should fix

that

issue at hand, I thought more about missing memlock support and

realized

that we might have to fix something else. SO I'm going to raise the
issue here.

I think udmabuf chose the wrong interface to do what it's doing, that
makes it harder to fix it eventually.

Instead of accepting a range in a memfd, it should just have accepted a
user space address range and then used
pin_user_pages(FOLL_WRITE|FOLL_LONGTERM) to longterm-pin the

pages

"officially".

Udmabuf indeed started off by using user space address range and GUP

but

the dma-buf subsystem maintainer had concerns with that approach in v2.
It also had support for mlock in that version. Here is v2 and the relevant
conversation:
https://patchwork.freedesktop.org/patch/210992/?series=39879=2



So what's the issue? Udma effectively pins pages longterm ("possibly
forever") simply by grabbing a reference on them. These pages might
easily reside in ZONE_MOVABLE or in MIGRATE_CMA pageblocks.

So what udmabuf does is break memory hotunplug and CMA, because it
turns
pages that have to remain movable unmovable.

In the pin_user_pages(FOLL_LONGTERM) case we make sure to migrate
these
pages. See mm/gup.c:check_and_migrate_movable_pages() and

especially

folio_is_longterm_pinnable(). We'd probably have to implement

something

similar for udmabuf, where we detect such unpinnable pages and

migrate

them.

The pages udmabuf pins are only those associated with Guest (GPU

driver/virtio-gpu)

resources (or buffers allocated and pinned from shmem via drm GEM).

Some

resources are short-lived, and some are long-lived and whenever a

resource

gets destroyed, the pages are unpinned. And, not all resources have their

pages

pinned. The resource that is pinned for the longest duration is the FB and

that's

because it is updated every ~16ms (assuming 1920x1080@60) by the Guest
GPU driver. We can certainly pin/unpin the FB after it is accessed on the

Host

as a workaround, but I guess that may not be very efficient given the

amount

of churn it would create.

Also, as far as migration or S3/S4 is concerned, my understanding is that all
the Guest resources are destroyed and recreated again. So, wouldn't

something

similar happen during memory hotunplug?




For example, pairing udmabuf with vfio (which pins pages using
pin_user_pages(FOLL_LONGTERM)) in QEMU will most probably not work

in

all cases: if udmabuf longterm pinned the pages "the wrong way", vfio
will fail to migrate them during FOLL_LONGTERM and consequently fail
pin_user_pages(). As long as udmabuf holds a reference on these pages,
that will never succeed.

Dma-buf rules (for exporters) indicate that the pages only need to be

pinned

during the map_attachment phase (and until unmap attachment happens).
In other words, only when the sg_table is created by udmabuf. I guess one
option would be to not hold any references during UDMABUF_CREATE and
only grab references to the pages (as and when it gets used) during this

step.

Would this help?


IIUC the refcount is needed, otherwise I don't see what to protect the page
from being freed and even reused elsewhere before map_attachment().

It seems the previous concern on using gup was majorly fork(), if this is it:

https://patchwork.freedesktop.org/patch/210992/?series=39879=2#co
mment_414213

Could it also be guarded by just making sure the pages are MAP_SHARED
when
creating the udmabuf, if fork() is a requirement of the feature?

I had a feeling that the userspace still needs to always do the right thing
to make it work, even using pure PFN mappings.

For instance, what if the userapp just punchs a hole in the shmem/hugetlbfs
file after creating the udmabuf (I see that F_SEAL_SHRINK is required, but
at least not F_SEAL_WRITE with current impl), and fault a new page into the
page cache?

IIUC, Qemu creates and owns the memfd that is 

Re: [PATCH v2] drm/bridge: ti-sn65dsi86: Fix auxiliary bus lifetime

2023-06-26 Thread Doug Anderson
Hi,

On Tue, Jun 13, 2023 at 11:41 AM Stephen Boyd  wrote:
>
> Quoting Douglas Anderson (2023-06-13 06:58:13)
> > Memory for the "struct device" for any given device isn't supposed to
> > be released until the device's release() is called. This is important
> > because someone might be holding a kobject reference to the "struct
> > device" and might try to access one of its members even after any
> > other cleanup/uninitialization has happened.
> >
> > Code analysis of ti-sn65dsi86 shows that this isn't quite right. When
> > the code was written, it was believed that we could rely on the fact
> > that the child devices would all be freed before the parent devices
> > and thus we didn't need to worry about a release() function. While I
> > still believe that the parent's "struct device" is guaranteed to
> > outlive the child's "struct device" (because the child holds a kobject
> > reference to the parent), the parent's "devm" allocated memory is a
> > different story. That appears to be freed much earlier.
> >
> > Let's make this better for ti-sn65dsi86 by allocating each auxiliary
> > with kzalloc and then free that memory in the release().
> >
> > Fixes: bf73537f411b ("drm/bridge: ti-sn65dsi86: Break GPIO and MIPI-to-eDP 
> > bridge into sub-drivers")
> > Suggested-by: Stephen Boyd 
> > Signed-off-by: Douglas Anderson 
> > ---
>
> Reviewed-by: Stephen Boyd 

Pushed to drm-misc-fixes:

7aa83fbd712a drm/bridge: ti-sn65dsi86: Fix auxiliary bus lifetime


[PATCH] drm/tegra: Fix potential memory leak if vmap() fail

2023-06-26 Thread Sui Jingfeng
The vmap function called in the tegra_fbdev_probe() function could fail,
It doesn't matther. But if the error handling is necessary, it should
also free the resources allocated by drm_fb_helper_alloc_info() function
and the gem buffer object allocated by tegra_bo_create().

Signed-off-by: Sui Jingfeng 
---
 drivers/gpu/drm/tegra/fbdev.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/tegra/fbdev.c b/drivers/gpu/drm/tegra/fbdev.c
index e74d9be981c7..d2152b5eb77b 100644
--- a/drivers/gpu/drm/tegra/fbdev.c
+++ b/drivers/gpu/drm/tegra/fbdev.c
@@ -141,6 +141,8 @@ static int tegra_fbdev_probe(struct drm_fb_helper *helper,
 
 destroy:
drm_framebuffer_remove(fb);
+   drm_fb_helper_release_info(helper);
+   drm_gem_object_put(>gem);
return err;
 }
 
-- 
2.25.1



Re: [PATCH 06/15] dt-bindings: display/msm: sc7180-dpu: Describe SM6125

2023-06-26 Thread Marijn Suijten
On 2023-06-26 18:16:58, Krzysztof Kozlowski wrote:
> On 25/06/2023 21:52, Marijn Suijten wrote:
> > On 2023-06-24 11:12:52, Krzysztof Kozlowski wrote:
> >> On 24/06/2023 02:41, Marijn Suijten wrote:
> >>> SM6125 is identical to SM6375 except that while downstream also defines
> >>> a throttle clock, its presence results in timeouts whereas SM6375
> >>> requires it to not observe any timeouts.
> >>
> >> Then it should not be allowed, so you need either "else:" block or
> >> another "if: properties: compatible:" to disallow it. Because in current
> >> patch it would be allowed.
> > 
> > That means this binding is wrong/incomplete for all other SoCs then.
> > clock(-name)s has 6 items, and sets `minItems: 6`.  Only for sm6375-dpu

Of course meant to say that clock(-name)s has **7** items, not 6.

> > does it set `minItems: 7`, but an else case is missing.
> 
> Ask the author why it is done like this.

Konrad, can you clarify why other 

> > Shall I send a Fixes: ed41005f5b7c ("dt-bindings: display/msm:
> > sc7180-dpu: Describe SM6350 and SM6375") for that, and should maxItems:
> > 6 be the default under clock(-name)s or in an else:?
> 
> There is no bug to fix. Or at least it is not yet known. Whether other
> devices should be constrained as well - sure, sounds reasonable, but I
> did not check the code exactly.

I don't know either, but we need this information to decide whether to
use `maxItems: 6`:

1. Directly on the property;
2. In an `else:` case on the current `if: sm6375-dpu` (should have the
   same effect as 1., afaik);
3. In a second `if:` case that lists all SoCS explicitly.

Since we don't have this information, I think option 3. is the right way
to go, setting `maxItems: 6` for qcom,sm6125-dpu.

However, it is not yet understood why downstream is able to use the
throttle clock without repercussions.

> We talk here about this patch.

We used this patch to discover that other SoCs are similarly
unconstrained.  But if you don't want me to look into it, by all means!
Saves me a lot of time.  So I will go with option 3.

- Marijn


Re: [PATCH v1 0/2] udmabuf: Add back support for mapping hugetlb pages

2023-06-26 Thread Peter Xu
On Mon, Jun 26, 2023 at 07:45:37AM +, Kasireddy, Vivek wrote:
> Hi Peter,
> 
> > 
> > On Fri, Jun 23, 2023 at 06:13:02AM +, Kasireddy, Vivek wrote:
> > > Hi David,
> > >
> > > > > The first patch ensures that the mappings needed for handling mmap
> > > > > operation would be managed by using the pfn instead of struct page.
> > > > > The second patch restores support for mapping hugetlb pages where
> > > > > subpages of a hugepage are not directly used anymore (main reason
> > > > > for revert) and instead the hugetlb pages and the relevant offsets
> > > > > are used to populate the scatterlist for dma-buf export and for
> > > > > mmap operation.
> > > > >
> > > > > Testcase: default_hugepagesz=2M hugepagesz=2M hugepages=2500
> > > > options
> > > > > were passed to the Host kernel and Qemu was launched with these
> > > > > relevant options: qemu-system-x86_64 -m 4096m
> > > > > -device virtio-gpu-pci,max_outputs=1,blob=true,xres=1920,yres=1080
> > > > > -display gtk,gl=on
> > > > > -object memory-backend-memfd,hugetlb=on,id=mem1,size=4096M
> > > > > -machine memory-backend=mem1
> > > > >
> > > > > Replacing -display gtk,gl=on with -display gtk,gl=off above would
> > > > > exercise the mmap handler.
> > > > >
> > > >
> > > > While I think the VM_PFNMAP approach is much better and should fix
> > that
> > > > issue at hand, I thought more about missing memlock support and
> > realized
> > > > that we might have to fix something else. SO I'm going to raise the
> > > > issue here.
> > > >
> > > > I think udmabuf chose the wrong interface to do what it's doing, that
> > > > makes it harder to fix it eventually.
> > > >
> > > > Instead of accepting a range in a memfd, it should just have accepted a
> > > > user space address range and then used
> > > > pin_user_pages(FOLL_WRITE|FOLL_LONGTERM) to longterm-pin the
> > pages
> > > > "officially".
> > > Udmabuf indeed started off by using user space address range and GUP
> > but
> > > the dma-buf subsystem maintainer had concerns with that approach in v2.
> > > It also had support for mlock in that version. Here is v2 and the relevant
> > > conversation:
> > > https://patchwork.freedesktop.org/patch/210992/?series=39879=2
> > >
> > > >
> > > > So what's the issue? Udma effectively pins pages longterm ("possibly
> > > > forever") simply by grabbing a reference on them. These pages might
> > > > easily reside in ZONE_MOVABLE or in MIGRATE_CMA pageblocks.
> > > >
> > > > So what udmabuf does is break memory hotunplug and CMA, because it
> > > > turns
> > > > pages that have to remain movable unmovable.
> > > >
> > > > In the pin_user_pages(FOLL_LONGTERM) case we make sure to migrate
> > > > these
> > > > pages. See mm/gup.c:check_and_migrate_movable_pages() and
> > especially
> > > > folio_is_longterm_pinnable(). We'd probably have to implement
> > something
> > > > similar for udmabuf, where we detect such unpinnable pages and
> > migrate
> > > > them.
> > > The pages udmabuf pins are only those associated with Guest (GPU
> > driver/virtio-gpu)
> > > resources (or buffers allocated and pinned from shmem via drm GEM).
> > Some
> > > resources are short-lived, and some are long-lived and whenever a
> > resource
> > > gets destroyed, the pages are unpinned. And, not all resources have their
> > pages
> > > pinned. The resource that is pinned for the longest duration is the FB and
> > that's
> > > because it is updated every ~16ms (assuming 1920x1080@60) by the Guest
> > > GPU driver. We can certainly pin/unpin the FB after it is accessed on the
> > Host
> > > as a workaround, but I guess that may not be very efficient given the
> > amount
> > > of churn it would create.
> > >
> > > Also, as far as migration or S3/S4 is concerned, my understanding is that 
> > > all
> > > the Guest resources are destroyed and recreated again. So, wouldn't
> > something
> > > similar happen during memory hotunplug?
> > >
> > > >
> > > >
> > > > For example, pairing udmabuf with vfio (which pins pages using
> > > > pin_user_pages(FOLL_LONGTERM)) in QEMU will most probably not work
> > in
> > > > all cases: if udmabuf longterm pinned the pages "the wrong way", vfio
> > > > will fail to migrate them during FOLL_LONGTERM and consequently fail
> > > > pin_user_pages(). As long as udmabuf holds a reference on these pages,
> > > > that will never succeed.
> > > Dma-buf rules (for exporters) indicate that the pages only need to be
> > pinned
> > > during the map_attachment phase (and until unmap attachment happens).
> > > In other words, only when the sg_table is created by udmabuf. I guess one
> > > option would be to not hold any references during UDMABUF_CREATE and
> > > only grab references to the pages (as and when it gets used) during this
> > step.
> > > Would this help?
> > 
> > IIUC the refcount is needed, otherwise I don't see what to protect the page
> > from being freed and even reused elsewhere before map_attachment().
> > 
> > It seems the previous concern on using gup 

Re: [PATCH 03/15] dt-bindings: clock: qcom,dispcc-sm6125: Require GCC PLL0 DIV clock

2023-06-26 Thread Marijn Suijten
On 2023-06-26 18:10:44, Krzysztof Kozlowski wrote:
> On 25/06/2023 21:48, Marijn Suijten wrote:
> > On 2023-06-24 11:08:54, Krzysztof Kozlowski wrote:
> >> On 24/06/2023 03:45, Konrad Dybcio wrote:
> >>> On 24.06.2023 02:41, Marijn Suijten wrote:
>  The "gcc_disp_gpll0_div_clk_src" clock is consumed by the driver, will
>  be passed from DT, and should be required by the bindings.
> 
>  Fixes: 8397c9c0c26b ("dt-bindings: clock: add QCOM SM6125 display clock 
>  bindings")
>  Signed-off-by: Marijn Suijten 
>  ---
> >>> Ideally, you'd stick it at the bottom of the list, as the items: order
> >>> is part of the ABI
> >>
> >> Yes, please add them to the end. Order is fixed.
> > 
> > Disagreed for bindings that declare clock-names and when the driver
> > adheres to it, see my reply to Konrad's message.
> 
> That's the generic rule, with some exceptions of course. Whether one
> chosen driver (chosen system and chosen version of that system) adheres
> or not, does not change it. Other driver behaves differently and ABI is
> for everyone, not only for your specific version of Linux driver.
> 
> Follow the rule.

This has no relation to the driver (just that our driver adheres to the
bindings, as it is supposed to be).  The bindings define a mapping from
a clock-names=<> entry to a clock on the same index in the clocks=<>
array.  That relation remains the same with this change.

- Marijn


Re: [PATCH 03/15] dt-bindings: clock: qcom,dispcc-sm6125: Require GCC PLL0 DIV clock

2023-06-26 Thread Marijn Suijten
On 2023-06-26 18:15:13, Krzysztof Kozlowski wrote:
> On 26/06/2023 16:26, Marijn Suijten wrote:
> > On 2023-06-26 11:43:39, Konrad Dybcio wrote:
> >> On 25.06.2023 21:48, Marijn Suijten wrote:
> >>> On 2023-06-24 03:45:02, Konrad Dybcio wrote:
>  On 24.06.2023 02:41, Marijn Suijten wrote:
> > The "gcc_disp_gpll0_div_clk_src" clock is consumed by the driver, will
> > be passed from DT, and should be required by the bindings.
> >
> > Fixes: 8397c9c0c26b ("dt-bindings: clock: add QCOM SM6125 display clock 
> > bindings")
> > Signed-off-by: Marijn Suijten 
> > ---
>  Ideally, you'd stick it at the bottom of the list, as the items: order
>  is part of the ABI
> >>>
> >>> This isn't an ABI break, as this driver nor its bindings require/declare
> >>> a fixed order: they declare a relation between clocks and clock-names.
> >> Bindings describe the ABI, drivers implement compliant code flow.
> > 
> > That is how bindings are supposed to be...  However typically the driver
> > is written/ported first and then the bindings are simply created to
> 
> Your development process does not matter for the bindings. Whatever you
> decide to do "typically" is your choice, although of course I understand
> why you do it like that. You can argument the same that "I never create
> bindings in my process, so the driver defines the ABI".

This is not my process, nor my choice.

> > reflect this, and sometimes (as is the case with this patch)
> > incorrectly.
> > 
> > That, together with a lack of DTS and known-working device with it
> > (which is why I'm submitting driver+bindings+dts in one series now!)
> > makes us shoot ourselves in the foot by locking everyone into an ABI
> > that makes no sense.
> 
> No one is locked into the ABI. SoC maintainer decides on this. However
> unjustified ABI breaking or not caring about it at all is not the way to
> go. It is not the correct process.

It definitely sounds like it.

> >>> This orders the GCC clock just like other dispccs.  And the previous
> >>> patch dropped the unused cfg_ahb_clk from the bindings, so all bets are
> >>> off anyway.
> >> Thinking about it again, the binding has not been consumed by any upstream
> >> DT to date, so it should (tm) be fine to let it slide..
> > 
> > Exactly, I hope/doubt anyone was already using these incomplete
> > bindings.  And again: the ABI here is the name->phandle mapping, the
> > order Does Not Matter™.
> 
> No, it's not. Your one driver does not define the ABI. There are many
> different drivers, many different operating systems and other software
> components.

You missed the point entirely.

The point is that someone contributed a dt-bindings patch that does not
represent the hardware (hence not the driver for that hardware either).
It was taken without objection.  So now what?  We are stuck with a
broken ABI that does not allow us to describe the hardware?

If there are many different drivers and other OSes, why are we solely
responsible for describing broken bindings?  Why were there no
objections elsewhere that this does not allow us to describe the
hardware in question?

Note that the person signing off on and sending that initial dt-bindings
patch for qcom,dispcc-sm6125.yaml is me, by the way.

- Marijn


Re: [PATCH v4] drm/vkms: Add support to 1D gamma LUT

2023-06-26 Thread Maira Canal

Hi Pekka,

On 6/26/23 05:17, Pekka Paalanen wrote:

On Sat, 24 Jun 2023 18:48:08 -0300
Maira Canal  wrote:


Hi Arthur,

Thanks for working on this feature for the VKMS!

On 6/21/23 16:41, Arthur Grillo wrote:

Support a 1D gamma LUT with interpolation for each color channel on the
VKMS driver. Add a check for the LUT length by creating
vkms_atomic_check().

Tested with:
igt@kms_color@gamma
igt@kms_color@legacy-gamma
igt@kms_color@invalid-gamma-lut-sizes


Could you also mention that this will make it possible to run the test
igt@kms_plane@pixel-format?

Also, you mentioned to me that the performance degraded with this new
feature, but I wasn't able to see it while running the VKMS CI. I
performed a couple of tests and I didn't see any significant performance
issue.

Could you please run a benchmark and share the results with us? This way
we can atest that this new feature will not affect significantly the
VKMS performance. It would be nice to have a small brief of this
benchmark on the commit message as well.

Attesting that there isn't a performance issue and adding those nits to
the commit message, you can add my

Reviewed-by: Maíra Canal 

on the next version.


Hi,

perfomance testing is good indeed. As future work, could there be a
document describing how to test VKMS performance?


I'll try to select a couple of more meaningful IGT tests to describe how
to test the VKMS performance and also add a document to describe how to
run this tests.

Recently, I added a VKMS must-pass testlist to IGT. This testlist
tries to assure that regressions will not be introduced into VKMS. But,
I failed to introduce a documentation on the kernel side pointing to
this new testlist... I'll also work on it.



"I ran IGT@blah 100 times and it took xx seconds before and yy seconds
after" does not really give someone like me an idea of what was
actually measured. For example blending overhead increase could be
completely lost in opaque pixel copying noise if the test case has only
few pixels to blend, e.g. a cursor plane, not to mention the overhead
of launching an IGT test in the first place.


About the IGT overhead, I don't know exactly how we could escape from
it. Maybe writing KUnit tests to the VKMS's composition functions, such
as blend(). Anyway, we would have the overhead of the KUnit framework.
I mean, for whatever framework we choose, there'll be an overhead...

Do you have any other ideas on how to test VKMS with less overhead?

Best Regards,
- Maíra



Something that would guide new developers in running meaningful
benchmarks would be nice.

Should e.g. IGT have explicit (VKMS) performance tests that need to be
run manually, since evaluation of the result is not feasible
automatically? Or a benchmark mode in correctness tests that would run
the identical operation N times and measure the time before checking
for correctness?

The correctness verification in IGT tests, if done by image comparison
which they undoubtedly will need to be in the future, may dominate the
CPU run time measurements if included.


Thanks,
pq


Re: [PATCH 1/9] dt-bindings: mfd: Add bindings for SAM9X7 LCD controller

2023-06-26 Thread Conor Dooley
On Mon, Jun 26, 2023 at 05:31:59AM +, manikanda...@microchip.com wrote:
> On 21/06/23 13:17, Nicolas Ferre wrote:
> > On 16/06/2023 at 08:44, Manikandan M - I67131 wrote:
> >> On 14/06/23 20:10, Nicolas Ferre wrote:
> >>> On 13/06/2023 at 20:21, Conor Dooley wrote:
>  On Tue, Jun 13, 2023 at 08:17:13PM +0200, Krzysztof Kozlowski wrote:
> > On 13/06/2023 09:04, Manikandan Muralidharan wrote:
> >> Add new compatible string for the XLCD controller on SAM9X7 SoC.
> >>
> >> Signed-off-by: Manikandan Muralidharan
> >> ---
> >>    Documentation/devicetree/bindings/mfd/atmel-hlcdc.txt | 1 +
> >>    1 file changed, 1 insertion(+)
> >>
> >> diff --git a/Documentation/devicetree/bindings/mfd/atmel-hlcdc.txt
> >> b/Documentation/devicetree/bindings/mfd/atmel-hlcdc.txt
> >> index 5f8880cc757e..7c77b6bf4adb 100644
> >> --- a/Documentation/devicetree/bindings/mfd/atmel-hlcdc.txt
> >> +++ b/Documentation/devicetree/bindings/mfd/atmel-hlcdc.txt
> >> @@ -8,6 +8,7 @@ Required properties:
> >>   "atmel,sama5d3-hlcdc"
> >>   "atmel,sama5d4-hlcdc"
> >>   "microchip,sam9x60-hlcdc"
> >> +   "microchip,sam9x7-xlcdc"
> > Google says sam9x7 is a series, not a SoC. Please add compatibles for
> > specific SoCs, not for series.
>  We had this one a few weeks ago, see
>  https://lore.kernel.org/all/add5e49e-8416-ba9f-819a-da944938c...@microchip.com/
>  and its parents. Outcome of that seemed to be that using "sam9x7" was
>  fine.
> >>>
> >>> And it's where it begins to be funny, as the LCD is precisely one aspect
> >>> where we differentiate between sam9x75, sam9x72 and sam9x70...
> >>> So please Manikandan sort this out if difference between these chips
> >>> will be better handled with different compatibility string, in
> >>> particular about //, LVDS and MIPI-DSI variants!
> >> Yes Sure, I will replace the compatible as s/sam9x7/sam9x75/g to handle
> >> the different variants of sam9x7 better.
> > 
> > Moving to sam9x75 is probably good. But what is your plan for 
> > differentiating parallel and LVDS (on sam9x72) and precisely this 
> > sam9x75 variant which in addition has MIPI-DSI?
> In sam9x75 with support for LVDS and MIPI, Parallel interfacing 
> peripherals, the additions performed on the LCD controller driver is the 
> same.Considering the LCDC IP used in sam9x75, there are no registers 
> sets that needs modification per connecting peripheral variants, only 
> the clock and DRM_ENCODER_MODE_XXX (set by connecting peripheral driver) 
> differs, which can be handled in DT, atmel-lcdc MFD driver and 
> peripheral driver.
> 
> In future, sam9x72 with XLCD controller can be differentiated with 
> sam9x72 compatible string and with a IP version flag(is_xlcdc_v2) in its 
> driver data if an upgraded XLCD IP is used with difference in bits or 
> register set exist compared to current IP.

Trying to covert that into what the binding will look like...
sam9x72 & sam9x75 each get their own compatibles for the lcd controller.
From there, we permit `compatible = "microchip,sam9x75-foo"` in
isolation. It *sounds* like the basic featureset of the sam9x75 is
compatible with the sam9x72, so for that we permit
`compatible = "microchip,sam9x72-foo", "microchip,sam9x75-foo"`.
Although, if the hardware for the sam9x72 isn't set in stone yet, it
might be best to hold off on documenting it until things settle down,
and only add the sam9x75 for now.

Cheers,
Conor.


signature.asc
Description: PGP signature


Re: [PATCH v2 0/2] Add MHI quirk for QAIC

2023-06-26 Thread Jeffrey Hugo

On 6/8/2023 5:59 AM, Manivannan Sadhasivam wrote:

On Fri, May 19, 2023 at 10:39:00AM -0600, Jeffrey Hugo wrote:

With the QAIC driver in -next, I'd like to suggest some MHI changes that
specific to AIC100 devices, but perhaps provide a framework for other
device oddities.

AIC100 devices technically violate the MHI spec in two ways. Sadly, these
issues comes from the device hardware, so host SW needs to work around
them.

Thie first issue, presented in this series, has to do with the
SOC_HW_VERSION register. This register is suposed to be initialized by the
hardware prior to the MHI being accessable by the host to contain a
version string for the SoC of the device. This could be used by the host
MHI controller software to identify and handle version to version changes.
The AIC100 hardware does not initialize this register, and thus it
contains garbage.

This would not be much of a problem normally - the QAIC driver would just
never use it. However the MHI stack uses this register as part of the init
sequence and if the controller reports that the register is inaccessable
then the init sequence fails.  On some AIC100 cards, the garbage value
ends up being 0x which is PCIe spec defined to be a special value
indicating the access failed.  The MHI controller cannot tell if that
value is a PCIe link issue, or just garbage.

QAIC needs a way to tell MHI not to use this register. Other buses have a
quirk mechanism - a way to describe oddities in a particular
implementation that have some kind of workaround. Since this seems to be
the first need for such a thing in MHI, introduce a quirk framework.

The second issue AIC100 has involves the PK Hash registers. A solution for
this is expected to be proposed in the near future and is anticipated to
make use of the quirk framework proposed here. With PK Hash, there are two
oddities to handle. AIC100 does not initialize these registers until the
SBL is running, which is later than the spec indicates, and in practice
is after MHI reads/caches them. Also, AIC100 does not have enough
registers defined to fully report the 5 PK Hash slots, so a custom
reporting format is defined by the device.



Looking at the two issues you reported above, it looks to me that they can be
handled inside the aic100 mhi_controller driver itself. Since the MHI stack
exports the read_reg callback to controller drivers, if some registers are not
supported by the device, then the callback can provide some fixed dummy data
emulating the register until the issue is fixed in the device (if at all).

Quirk framework could be useful if the device misbehaves against the protocol
itself but for the register issues like this, I think the controller driver can
handle itself.

What do you think?


I think for the HW_VERSION register, your suggestion is very good, and 
something I plan to adopt.


For the PK Hash registers, I don't think it quite works.

HW_VERSION I can hard code to a valid value, or just stub out to 0 since 
that appears to be only consumed by the MHI Controller, and we don't use it.


The PK Hash registers are programmed into the SoC, and can be unique 
from SoC to SoC.  I don't see how the driver can provide valid, but 
faked information for them.  Also, the user consumes this data via 
sysfs.  We'd like to give the data to the user, and we can't fake it. 
Also the data is dynamic.


Lets start with the dynamic data issue.  Right now MHI reads these 
registers once, and caches the values.  I would propose a quirk to 
change that behavior for AIC100, but does MHI really need to operate in 
a "read once" mode?  Would something actually break if MHI read the 
registers every time the sysfs node is accessed?  Then sysfs would 
display the latest data, which would be beneficial to AIC100 and should 
not be a behavior change for other devices which have static data (MHI 
just displays the same data because it hasn't changed).


Do you recall the reason behind making the PK Hash registers read once 
and cached?




- Mani


v2:
-Fix build error
-Fix typo in commit text

Jeffrey Hugo (2):
   bus: mhi: host: Add quirk framework and initial quirk
   accel/qaic: Add MHI_QUIRK_SOC_HW_VERSION_UNRELIABLE

  drivers/accel/qaic/mhi_controller.c |  1 +
  drivers/bus/mhi/host/init.c | 13 +
  include/linux/mhi.h | 18 ++
  3 files changed, 28 insertions(+), 4 deletions(-)

--
2.40.1








[PATCH] drm/omap: Checking the mapping returned by vmap()

2023-06-26 Thread Sui Jingfeng
Because vmap() function could fail.
Also don't let omap_gem_vaddr() function signature (declaration) dangling
there, as it will only get defined when CONFIG_DRM_FBDEV_EMULATION=y.

Signed-off-by: Sui Jingfeng 
---
 drivers/gpu/drm/omapdrm/omap_fbdev.c | 10 --
 drivers/gpu/drm/omapdrm/omap_gem.h   |  3 +++
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/omapdrm/omap_fbdev.c 
b/drivers/gpu/drm/omapdrm/omap_fbdev.c
index b7ccce0704a3..2c88aa1008d8 100644
--- a/drivers/gpu/drm/omapdrm/omap_fbdev.c
+++ b/drivers/gpu/drm/omapdrm/omap_fbdev.c
@@ -197,6 +197,11 @@ static int omap_fbdev_create(struct drm_fb_helper *helper,
drm_fb_helper_fill_info(fbi, helper, sizes);
 
fbi->screen_buffer = omap_gem_vaddr(bo);
+   if (!fbi->screen_buffer) {
+   ret = -ENOMEM;
+   goto err_release_fbi;
+   }
+
fbi->screen_size = bo->size;
fbi->fix.smem_start = dma_addr;
fbi->fix.smem_len = bo->size;
@@ -210,14 +215,15 @@ static int omap_fbdev_create(struct drm_fb_helper *helper,
fbi->fix.ywrapstep = 1;
}
 
-
DBG("par=%p, %dx%d", fbi->par, fbi->var.xres, fbi->var.yres);
DBG("allocated %dx%d fb", fb->width, fb->height);
 
return 0;
 
-fail:
+err_release_fbi:
+   drm_fb_helper_release_info(helper);
 
+fail:
if (ret) {
if (fb)
drm_framebuffer_remove(fb);
diff --git a/drivers/gpu/drm/omapdrm/omap_gem.h 
b/drivers/gpu/drm/omapdrm/omap_gem.h
index 4d4488939f6b..7e8c41b72aae 100644
--- a/drivers/gpu/drm/omapdrm/omap_gem.h
+++ b/drivers/gpu/drm/omapdrm/omap_gem.h
@@ -48,7 +48,10 @@ struct drm_gem_object *omap_gem_new_dmabuf(struct drm_device 
*dev, size_t size,
struct sg_table *sgt);
 int omap_gem_new_handle(struct drm_device *dev, struct drm_file *file,
union omap_gem_size gsize, u32 flags, u32 *handle);
+
+#ifdef CONFIG_DRM_FBDEV_EMULATION
 void *omap_gem_vaddr(struct drm_gem_object *obj);
+#endif
 
 /* Dumb Buffers Interface */
 int omap_gem_dumb_map_offset(struct drm_file *file, struct drm_device *dev,
-- 
2.25.1



Re: [PATCH 1/5] drm/panfrost: Stop using drm_gem_shmem_put_pages()

2023-06-26 Thread Boris Brezillon
On Mon, 26 Jun 2023 19:06:55 +0300
Dmitry Osipenko  wrote:

> On 6/26/23 18:43, Boris Brezillon wrote:
> > On Mon, 26 Jun 2023 16:20:53 +0300
> > Dmitry Osipenko  wrote:
> >   
> >> On 6/26/23 15:02, Boris Brezillon wrote:  
> >>> -err_pages:
> >>> - drm_gem_shmem_put_pages(>base);
> >>>  err_unlock:
> >>>   dma_resv_unlock(obj->resv);
> >>> +
> >>> + if (ret && pinned)
> >>> + drm_gem_shmem_unpin(>base);
> >>
> >> The drm_gem_shmem_unpin() was supposed to be used only in conjunction
> >> with drm_gem_shmem_pin(). I've a pending patch to enable the pin/unpin
> >> refcounting needed by drm-shmem shrinker, it will prohibit invocation of
> >> unpin without a previous pin.
> >>
> >> I'm wondering whether it will be okay to simply remove
> >> drm_gem_shmem_put_pages() from the Panfrost code, letting pages to be
> >> kept allocated in a error case. They will be freed once BO is destroyed.
> >>  
> > 
> > Okay, so after looking at your shmem-shrinker series, I confirm we need
> > to take a pin ref here (hard-pin), otherwise the buffer might be
> > evicted before the GPU is done, especially after you drop gpu_usecount
> > and use only pin_count to check whether a GEM object can be evicted or
> > not.  
> 
> See the drm_gem_evict() [1], it checks whether GEM is busy, preventing
> BO eviction while it is in-use by GPU. Note that in case of Panfrost,
> shrinker isn't enabled for growable BOs.

Okay, we should be good then, sorry for the confusion.


Re: [PATCH 06/15] dt-bindings: display/msm: sc7180-dpu: Describe SM6125

2023-06-26 Thread Krzysztof Kozlowski
On 25/06/2023 21:52, Marijn Suijten wrote:
> On 2023-06-24 11:12:52, Krzysztof Kozlowski wrote:
>> On 24/06/2023 02:41, Marijn Suijten wrote:
>>> SM6125 is identical to SM6375 except that while downstream also defines
>>> a throttle clock, its presence results in timeouts whereas SM6375
>>> requires it to not observe any timeouts.
>>
>> Then it should not be allowed, so you need either "else:" block or
>> another "if: properties: compatible:" to disallow it. Because in current
>> patch it would be allowed.
> 
> That means this binding is wrong/incomplete for all other SoCs then.
> clock(-name)s has 6 items, and sets `minItems: 6`.  Only for sm6375-dpu
> does it set `minItems: 7`, but an else case is missing.

Ask the author why it is done like this.

> 
> Shall I send a Fixes: ed41005f5b7c ("dt-bindings: display/msm:
> sc7180-dpu: Describe SM6350 and SM6375") for that, and should maxItems:
> 6 be the default under clock(-name)s or in an else:?

There is no bug to fix. Or at least it is not yet known. Whether other
devices should be constrained as well - sure, sounds reasonable, but I
did not check the code exactly.

We talk here about this patch.

Best regards,
Krzysztof



Re: [RFC PATCH v3 1/4] drm/doc: Document DRM device reset expectations

2023-06-26 Thread André Almeida

Em 22/06/2023 05:12, Pekka Paalanen escreveu:

On Wed, 21 Jun 2023 13:28:34 -0300
André Almeida  wrote:


Em 21/06/2023 04:58, Pekka Paalanen escreveu:

On Tue, 20 Jun 2023 21:57:16 -0300
André Almeida  wrote:
   

Create a section that specifies how to deal with DRM device resets for
kernel and userspace drivers.

Signed-off-by: André Almeida 


Hi André,

nice to see this! I ended up giving lots of grammar comments, but I'm
not a native speaker. Generally it looks good to me.


Thank you for your feedback :)

   

---
   Documentation/gpu/drm-uapi.rst | 65 ++
   1 file changed, 65 insertions(+)

diff --git a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-uapi.rst
index 65fb3036a580..da4f8a694d8d 100644
--- a/Documentation/gpu/drm-uapi.rst
+++ b/Documentation/gpu/drm-uapi.rst
@@ -285,6 +285,71 @@ for GPU1 and GPU2 from different vendors, and a third 
handler for
   mmapped regular files. Threads cause additional pain with signal
   handling as well.
   
+Device reset

+
+
+The GPU stack is really complex and is prone to errors, from hardware bugs,
+faulty applications and everything in between the many layers. To recover
+from this kind of state, sometimes is needed to reset the device. This section


It seems unclear what "this kind of state" refers to, so maybe just write 
"errors"?

Maybe:

Some errors require resetting the device in order to make the
device usable again.

I presume that recovery does not mean that the failed job could recover.
   

+describes what's the expectations for DRM and usermode drivers when a device
+resets and how to propagate the reset status.
+
+Kernel Mode Driver
+--
+
+The KMD is responsible for checking if the device needs a reset, and to perform
+it as needed. Usually a hung is detected when a job gets stuck executing. KMD


s/hung/hang/ ?
   

+then update it's internal reset tracking to be ready when userspace asks the


updates its

"update reset tracking"... do you mean that KMD records information
about the reset in case userspace asks for it later?


Yes, kernel drivers do annotate whenever a reset happens, so it can
report to userspace when it asks about resets.

For instance, this is the amdgpu implementation of
AMDGPU_CTX_OP_QUERY_STATE2:

https://elixir.bootlin.com/linux/v6.3.8/source/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c#L548


You can see there stored information about resets.


Hi André,

right. What I mean is, if I have to ask this, then that implies that
the wording could be more clear.

I don't know if "reset tracking" is some sub-system that is turned on
and off as needed or what updating it would mean.



Understood, I'll rewrite it to be more clear.

   

+kernel about reset information. Drivers should implement the 
DRM_IOCTL_GET_RESET
+for that.


At this point, I'm not sure what "reset tracking" or "reset
information" entails. Could something be said about those?
  >> +

+User Mode Driver
+
+
+The UMD should check before submitting new commands to the KMD if the device 
has
+been reset, and this can be checked more often if it requires to. The
+DRM_IOCTL_GET_RESET is the default interface for those kind of checks. After
+detecting a reset, UMD will then proceed to report it to the application using
+the appropriated API error code, as explained in the bellow section about


s/bellow/below/
   

+robustness.
+
+Robustness
+--
+
+The only way to try to keep an application working after a reset is if it
+complies with the robustness aspects of the graphical API that is using.


that it is using.
   

+
+Graphical APIs provide ways to application to deal with device resets. However,


provide ways for applications to deal with
   

+there's no guarantee that the app will be correctly using such features, and 
UMD
+can implement policies to close the app if it's a repeating offender, likely in
+a broken loop. This is done to ensure that it doesn't keeps blocking the user


does not keep

I think contractions are usually avoided in documents, but I'm not
bothering to flag them all.
   

+interface to be correctly displayed.


interface from being correctly displayed.
   

+
+OpenGL
+~~
+
+Apps using OpenGL can rely on ``GL_ARB_robustness`` to be robust. This 
extension
+tells if a reset has happened, and if so, all the context state is considered
+lost and the app proceeds by creating new ones. If robustness isn't in use, UMD
+will terminate the app when a reset is detected, giving that the contexts are
+lost and the app won't be able to figure this out and recreate the contexts.


What about GL ES? Is GL_ARB_robustness implemented or even defined there?
   


I found this:
https://registry.khronos.org/OpenGL/extensions/EXT/EXT_robustness.txt

"Since this is intended to be a version of ARB_robustness for OpenGL ES,
it should be named accordingly."

I can add this to this paragraph.


Yes, please!

I suppose there could be even more 

Re: [PATCH 03/15] dt-bindings: clock: qcom, dispcc-sm6125: Require GCC PLL0 DIV clock

2023-06-26 Thread Krzysztof Kozlowski
On 26/06/2023 16:26, Marijn Suijten wrote:
> On 2023-06-26 11:43:39, Konrad Dybcio wrote:
>> On 25.06.2023 21:48, Marijn Suijten wrote:
>>> On 2023-06-24 03:45:02, Konrad Dybcio wrote:
 On 24.06.2023 02:41, Marijn Suijten wrote:
> The "gcc_disp_gpll0_div_clk_src" clock is consumed by the driver, will
> be passed from DT, and should be required by the bindings.
>
> Fixes: 8397c9c0c26b ("dt-bindings: clock: add QCOM SM6125 display clock 
> bindings")
> Signed-off-by: Marijn Suijten 
> ---
 Ideally, you'd stick it at the bottom of the list, as the items: order
 is part of the ABI
>>>
>>> This isn't an ABI break, as this driver nor its bindings require/declare
>>> a fixed order: they declare a relation between clocks and clock-names.
>> Bindings describe the ABI, drivers implement compliant code flow.
> 
> That is how bindings are supposed to be...  However typically the driver
> is written/ported first and then the bindings are simply created to

Your development process does not matter for the bindings. Whatever you
decide to do "typically" is your choice, although of course I understand
why you do it like that. You can argument the same that "I never create
bindings in my process, so the driver defines the ABI".

> reflect this, and sometimes (as is the case with this patch)
> incorrectly.
> 
> That, together with a lack of DTS and known-working device with it
> (which is why I'm submitting driver+bindings+dts in one series now!)
> makes us shoot ourselves in the foot by locking everyone into an ABI
> that makes no sense.

No one is locked into the ABI. SoC maintainer decides on this. However
unjustified ABI breaking or not caring about it at all is not the way to
go. It is not the correct process.

> 
>>> This orders the GCC clock just like other dispccs.  And the previous
>>> patch dropped the unused cfg_ahb_clk from the bindings, so all bets are
>>> off anyway.
>> Thinking about it again, the binding has not been consumed by any upstream
>> DT to date, so it should (tm) be fine to let it slide..
> 
> Exactly, I hope/doubt anyone was already using these incomplete
> bindings.  And again: the ABI here is the name->phandle mapping, the
> order Does Not Matter™.

No, it's not. Your one driver does not define the ABI. There are many
different drivers, many different operating systems and other software
components.

Best regards,
Krzysztof



Re: [PATCH v6 3/3] drm/virtio: Support sync objects

2023-06-26 Thread Dmitry Osipenko
On 6/25/23 18:36, Geert Uytterhoeven wrote:
> Hi Dmitry,
> 
> On Sun, Jun 25, 2023 at 2:41 PM Dmitry Osipenko
>  wrote:
>> On 6/25/23 11:47, Geert Uytterhoeven wrote:
>>> On Sun, Apr 16, 2023 at 1:55 PM Dmitry Osipenko
>>>  wrote:
 Add sync object DRM UAPI support to VirtIO-GPU driver. Sync objects
 support is needed by native context VirtIO-GPU Mesa drivers, it also will
 be used by Venus and Virgl contexts.

 Reviewed-by; Emil Velikov 
 Signed-off-by: Dmitry Osipenko 
>>>
>>> Thanks for your patch!
>>>
 --- a/drivers/gpu/drm/virtio/virtgpu_submit.c
 +++ b/drivers/gpu/drm/virtio/virtgpu_submit.c
>>>
 +static int
 +virtio_gpu_parse_deps(struct virtio_gpu_submit *submit)
 +{
 +   struct drm_virtgpu_execbuffer *exbuf = submit->exbuf;
 +   struct drm_virtgpu_execbuffer_syncobj syncobj_desc;
 +   size_t syncobj_stride = exbuf->syncobj_stride;
 +   u32 num_in_syncobjs = exbuf->num_in_syncobjs;
 +   struct drm_syncobj **syncobjs;
 +   int ret = 0, i;
 +
 +   if (!num_in_syncobjs)
 +   return 0;
 +
 +   /*
 +* kvalloc at first tries to allocate memory using kmalloc and
 +* falls back to vmalloc only on failure. It also uses GFP_NOWARN
>>>
>>> GFP_NOWARN does not exist.
>>
>> https://elixir.bootlin.com/linux/v6.4-rc7/source/include/linux/gfp_types.h#L38
> 
> That line defines "__GFP_NOWARN", not "GFP_NOWARN".
> C is case- and underscore-sensitive. as is "git grep -w" ;-)

The removal of underscores was done intentionally for improving
readability of the comment

-- 
Best regards,
Dmitry



Re: [PATCH 03/15] dt-bindings: clock: qcom, dispcc-sm6125: Require GCC PLL0 DIV clock

2023-06-26 Thread Krzysztof Kozlowski
On 25/06/2023 21:48, Marijn Suijten wrote:
> On 2023-06-24 11:08:54, Krzysztof Kozlowski wrote:
>> On 24/06/2023 03:45, Konrad Dybcio wrote:
>>> On 24.06.2023 02:41, Marijn Suijten wrote:
 The "gcc_disp_gpll0_div_clk_src" clock is consumed by the driver, will
 be passed from DT, and should be required by the bindings.

 Fixes: 8397c9c0c26b ("dt-bindings: clock: add QCOM SM6125 display clock 
 bindings")
 Signed-off-by: Marijn Suijten 
 ---
>>> Ideally, you'd stick it at the bottom of the list, as the items: order
>>> is part of the ABI
>>
>> Yes, please add them to the end. Order is fixed.
> 
> Disagreed for bindings that declare clock-names and when the driver
> adheres to it, see my reply to Konrad's message.

That's the generic rule, with some exceptions of course. Whether one
chosen driver (chosen system and chosen version of that system) adheres
or not, does not change it. Other driver behaves differently and ABI is
for everyone, not only for your specific version of Linux driver.

Follow the rule.

Best regards,
Krzysztof



Re: [PATCH 1/5] drm/panfrost: Stop using drm_gem_shmem_put_pages()

2023-06-26 Thread Dmitry Osipenko
On 6/26/23 18:43, Boris Brezillon wrote:
> On Mon, 26 Jun 2023 16:20:53 +0300
> Dmitry Osipenko  wrote:
> 
>> On 6/26/23 15:02, Boris Brezillon wrote:
>>> -err_pages:
>>> -   drm_gem_shmem_put_pages(>base);
>>>  err_unlock:
>>> dma_resv_unlock(obj->resv);
>>> +
>>> +   if (ret && pinned)
>>> +   drm_gem_shmem_unpin(>base);  
>>
>> The drm_gem_shmem_unpin() was supposed to be used only in conjunction
>> with drm_gem_shmem_pin(). I've a pending patch to enable the pin/unpin
>> refcounting needed by drm-shmem shrinker, it will prohibit invocation of
>> unpin without a previous pin.
>>
>> I'm wondering whether it will be okay to simply remove
>> drm_gem_shmem_put_pages() from the Panfrost code, letting pages to be
>> kept allocated in a error case. They will be freed once BO is destroyed.
>>
> 
> Okay, so after looking at your shmem-shrinker series, I confirm we need
> to take a pin ref here (hard-pin), otherwise the buffer might be
> evicted before the GPU is done, especially after you drop gpu_usecount
> and use only pin_count to check whether a GEM object can be evicted or
> not.

See the drm_gem_evict() [1], it checks whether GEM is busy, preventing
BO eviction while it is in-use by GPU. Note that in case of Panfrost,
shrinker isn't enabled for growable BOs.

[1]
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/gpu/drm/drm_gem.c?h=next-20230626#n1495

-- 
Best regards,
Dmitry



Re: [PATCH 1/5] drm/panfrost: Stop using drm_gem_shmem_put_pages()

2023-06-26 Thread Boris Brezillon
On Mon, 26 Jun 2023 16:20:53 +0300
Dmitry Osipenko  wrote:

> On 6/26/23 15:02, Boris Brezillon wrote:
> > -err_pages:
> > -   drm_gem_shmem_put_pages(>base);
> >  err_unlock:
> > dma_resv_unlock(obj->resv);
> > +
> > +   if (ret && pinned)
> > +   drm_gem_shmem_unpin(>base);  
> 
> The drm_gem_shmem_unpin() was supposed to be used only in conjunction
> with drm_gem_shmem_pin(). I've a pending patch to enable the pin/unpin
> refcounting needed by drm-shmem shrinker, it will prohibit invocation of
> unpin without a previous pin.
> 
> I'm wondering whether it will be okay to simply remove
> drm_gem_shmem_put_pages() from the Panfrost code, letting pages to be
> kept allocated in a error case. They will be freed once BO is destroyed.
> 

Okay, so after looking at your shmem-shrinker series, I confirm we need
to take a pin ref here (hard-pin), otherwise the buffer might be
evicted before the GPU is done, especially after you drop gpu_usecount
and use only pin_count to check whether a GEM object can be evicted or
not.


Re: [PATCH v3 07/17] drm/imagination: Add GPU ID parsing and firmware loading

2023-06-26 Thread Adam Ford
On Mon, Jun 26, 2023 at 8:22 AM Frank Binns  wrote:
>
> Hi Adam,
>
> On Sat, 2023-06-17 at 07:48 -0500, Adam Ford wrote:
> > On Tue, Jun 13, 2023 at 10:20 AM Sarah Walker  
> > wrote:
> > > Read the GPU ID register at probe time and select the correct
> > > features/quirks/enhancements. Use the GPU ID to form the firmware
> > > file name and load the firmware.
> >
> > I have a Rogue 6250 variant, but the BVNC is returning a slightly
> > different revision than the firmware that's currently available, and
> > the older firmware for the vendor driver doesn't work with this new
> > code.
> >
> > Linux responds with Unsupported BVNC: 4.45.2.58.  From what I can
> > tell, the closest available firmware is 4.40.2.51.
> >
> > Will there be more firmware variants in the future or will there be
> > some options to build the firmware somehow?
>
> We don't plan to support the SoC vendor provided firmware binaries as this 
> will
> mean having to deal with many different versions of the firmware and its
> interface. Specifically, the interface can change in backwards incompatible 
> ways
> between releases, it changes based on the hardware feature set & bugs and it's
> split across userspace & the kernel. This makes these SoC provided firmware
> binaries very difficult to support in this new driver.

Thanks for the response.

That makes sense.  I would hope that various SoC vendors would jump on
the bandwagon to work with your group to get their hardware supported.
>
> As an alternative, we'll be releasing firmware binaries as we add support for
> more Rogue GPUs. We'll also release binaries upon request, in case others in 
> the
> community want to work on support in the meantime - we're just getting things
> set up for this at the moment.

The Mesa side of things appears to be missing some documentation, and
the power VR stuff still appears listed as experimental.  Is there
some documentation somewhere that would explain to someone how to go
about porting the Rogue 6250 to a different hardware variant of the
6250?  I don't really know the difference between BVNC of 4.45.2.58
and 4.40.2.51, but I can't imagine they are drastically different.

adam
>
> Thanks
> Frank
>
> >
> > adam
> >
> >
> >
> >
> > > The features/quirks/enhancements arrays are currently hardcoded in
> > > the driver for the supported GPUs. We are looking at moving this
> > > information to the firmware image.
> > >
> > > Signed-off-by: Sarah Walker 
> > > ---
> > >  drivers/gpu/drm/imagination/Makefile  |   1 +
> > >  drivers/gpu/drm/imagination/pvr_device.c  | 359 
> > >  drivers/gpu/drm/imagination/pvr_device.h  | 221 +++
> > >  drivers/gpu/drm/imagination/pvr_device_info.c | 223 +++
> > >  drivers/gpu/drm/imagination/pvr_device_info.h | 133 +
> > >  drivers/gpu/drm/imagination/pvr_drv.c | 553 +-
> > >  drivers/gpu/drm/imagination/pvr_drv.h | 108 
> > >  drivers/gpu/drm/imagination/pvr_fw.h  |  20 +
> > >  8 files changed, 1617 insertions(+), 1 deletion(-)
> > >  create mode 100644 drivers/gpu/drm/imagination/pvr_device_info.c
> > >  create mode 100644 drivers/gpu/drm/imagination/pvr_device_info.h
> > >  create mode 100644 drivers/gpu/drm/imagination/pvr_fw.h
> > >
> > > diff --git a/drivers/gpu/drm/imagination/Makefile 
> > > b/drivers/gpu/drm/imagination/Makefile
> > > index 186f920d615b..d713b1280776 100644
> > > --- a/drivers/gpu/drm/imagination/Makefile
> > > +++ b/drivers/gpu/drm/imagination/Makefile
> > > @@ -5,6 +5,7 @@ subdir-ccflags-y := -I$(srctree)/$(src)
> > >
> > >  powervr-y := \
> > > pvr_device.o \
> > > +   pvr_device_info.o \
> > > pvr_drv.o \
> > >
> > >  obj-$(CONFIG_DRM_POWERVR) += powervr.o
> > > diff --git a/drivers/gpu/drm/imagination/pvr_device.c 
> > > b/drivers/gpu/drm/imagination/pvr_device.c
> > > index 790c36cebec1..2e03763f2eb7 100644
> > > --- a/drivers/gpu/drm/imagination/pvr_device.c
> > > +++ b/drivers/gpu/drm/imagination/pvr_device.c
> > > @@ -2,20 +2,32 @@
> > >  /* Copyright (c) 2022 Imagination Technologies Ltd. */
> > >
> > >  #include "pvr_device.h"
> > > +#include "pvr_device_info.h"
> > > +
> > > +#include "pvr_fw.h"
> > > +#include "pvr_rogue_cr_defs.h"
> > >
> > >  #include 
> > >
> > > +#include 
> > >  #include 
> > >  #include 
> > >  #include 
> > >  #include 
> > >  #include 
> > > +#include 
> > >  #include 
> > > +#include 
> > >  #include 
> > > +#include 
> > >  #include 
> > >  #include 
> > >  #include 
> > >  #include 
> > > +#include 
> > > +
> > > +/* Major number for the supported version of the firmware. */
> > > +#define PVR_FW_VERSION_MAJOR 1
> > >
> > >  /**
> > >   * pvr_device_reg_init() - Initialize kernel access to a PowerVR device's
> > > @@ -205,6 +217,246 @@ pvr_device_regulator_init(struct pvr_device 
> > > *pvr_dev)
> > > return err;
> > >  }
> > >
> > > +/**
> > > + * pvr_device_clk_core_get_freq - Get current PowerVR device core clock 
> > > frequency
> > > + * 

Re: [PATCH 3/5] drm/shmem-helper: Inline drm_gem_shmem_{get,put}_pages()

2023-06-26 Thread Boris Brezillon
On Mon, 26 Jun 2023 14:02:45 +0200
Boris Brezillon  wrote:

> Move code drm_gem_shmem_{get,put}_pages() code to
> drm_gem_shmem_{pin,unpin}_locked().

After having a closer look at 'Add generic memory shrinker to VirtIO-GPU
and  Panfrost DRM drivers', I realize that's not what we want. We must
differentiate hard-pinning (as in, can't be evicted until all users
give up the ref they have) and soft-pinning (users can survive a
swapout, basically userspace mappings created through
drm_gem_shmem_mmap()).

> 
> Signed-off-by: Boris Brezillon 
> Cc: Daniel Vetter 
> Cc: Thomas Zimmermann 
> Cc: Emil Velikov 
> Cc: Dmitry Osipenko 
> ---
>  drivers/gpu/drm/drm_gem_shmem_helper.c | 108 ++---
>  1 file changed, 41 insertions(+), 67 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
> b/drivers/gpu/drm/drm_gem_shmem_helper.c
> index d6fc034164c0..f406556e42e0 100644
> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> @@ -128,46 +128,7 @@ struct drm_gem_shmem_object *drm_gem_shmem_create(struct 
> drm_device *dev, size_t
>  }
>  EXPORT_SYMBOL_GPL(drm_gem_shmem_create);
>  
> -static void drm_gem_shmem_put_pages(struct drm_gem_shmem_object *shmem);
> -
> -/**
> - * drm_gem_shmem_free - Free resources associated with a shmem GEM object
> - * @shmem: shmem GEM object to free
> - *
> - * This function cleans up the GEM object state and frees the memory used to
> - * store the object itself.
> - */
> -void drm_gem_shmem_free(struct drm_gem_shmem_object *shmem)
> -{
> - struct drm_gem_object *obj = >base;
> -
> - if (obj->import_attach) {
> - drm_prime_gem_destroy(obj, shmem->sgt);
> - } else {
> - dma_resv_lock(shmem->base.resv, NULL);
> -
> - drm_WARN_ON(obj->dev, shmem->vmap_use_count);
> -
> - if (shmem->sgt) {
> - dma_unmap_sgtable(obj->dev->dev, shmem->sgt,
> -   DMA_BIDIRECTIONAL, 0);
> - sg_free_table(shmem->sgt);
> - kfree(shmem->sgt);
> - }
> - if (shmem->pages)
> - drm_gem_shmem_put_pages(shmem);
> -
> - drm_WARN_ON(obj->dev, shmem->pages_use_count);
> -
> - dma_resv_unlock(shmem->base.resv);
> - }
> -
> - drm_gem_object_release(obj);
> - kfree(shmem);
> -}
> -EXPORT_SYMBOL_GPL(drm_gem_shmem_free);
> -
> -static int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem)
> +static int drm_gem_shmem_pin_locked(struct drm_gem_shmem_object *shmem)
>  {
>   struct drm_gem_object *obj = >base;
>   struct page **pages;
> @@ -200,13 +161,7 @@ static int drm_gem_shmem_get_pages(struct 
> drm_gem_shmem_object *shmem)
>   return 0;
>  }
>  
> -/*
> - * drm_gem_shmem_put_pages - Decrease use count on the backing pages for a 
> shmem GEM object
> - * @shmem: shmem GEM object
> - *
> - * This function decreases the use count and puts the backing pages when use 
> drops to zero.
> - */
> -static void drm_gem_shmem_put_pages(struct drm_gem_shmem_object *shmem)
> +static void drm_gem_shmem_unpin_locked(struct drm_gem_shmem_object *shmem)
>  {
>   struct drm_gem_object *obj = >base;
>  
> @@ -229,23 +184,42 @@ static void drm_gem_shmem_put_pages(struct 
> drm_gem_shmem_object *shmem)
>   shmem->pages = NULL;
>  }
>  
> -static int drm_gem_shmem_pin_locked(struct drm_gem_shmem_object *shmem)
> +/**
> + * drm_gem_shmem_free - Free resources associated with a shmem GEM object
> + * @shmem: shmem GEM object to free
> + *
> + * This function cleans up the GEM object state and frees the memory used to
> + * store the object itself.
> + */
> +void drm_gem_shmem_free(struct drm_gem_shmem_object *shmem)
>  {
> - int ret;
> + struct drm_gem_object *obj = >base;
>  
> - dma_resv_assert_held(shmem->base.resv);
> + if (obj->import_attach) {
> + drm_prime_gem_destroy(obj, shmem->sgt);
> + } else {
> + dma_resv_lock(shmem->base.resv, NULL);
>  
> - ret = drm_gem_shmem_get_pages(shmem);
> + drm_WARN_ON(obj->dev, shmem->vmap_use_count);
>  
> - return ret;
> -}
> -
> -static void drm_gem_shmem_unpin_locked(struct drm_gem_shmem_object *shmem)
> -{
> - dma_resv_assert_held(shmem->base.resv);
> -
> - drm_gem_shmem_put_pages(shmem);
> + if (shmem->sgt) {
> + dma_unmap_sgtable(obj->dev->dev, shmem->sgt,
> +   DMA_BIDIRECTIONAL, 0);
> + sg_free_table(shmem->sgt);
> + kfree(shmem->sgt);
> + }
> + if (shmem->pages)
> + drm_gem_shmem_unpin_locked(shmem);
> +
> + drm_WARN_ON(obj->dev, shmem->pages_use_count);
> +
> + dma_resv_unlock(shmem->base.resv);
> + }
> +
> + drm_gem_object_release(obj);
> + kfree(shmem);
>  }
> +EXPORT_SYMBOL_GPL(drm_gem_shmem_free);
>  
> 

Re: [PATCH v13 03/10] drm/shmem-helper: Add pages_pin_count field

2023-06-26 Thread Dmitry Osipenko
On 6/26/23 18:21, Boris Brezillon wrote:
> On Mon, 26 Jun 2023 17:04:57 +0200
> Boris Brezillon  wrote:
> 
>> Hi Dmitry,
>>
>> Sorry for chiming in only now :-/.
>>
>> On Tue, 14 Mar 2023 05:26:52 +0300
>> Dmitry Osipenko  wrote:
>>
>>> And new pages_pin_count field to struct drm_gem_shmem_object that will
>>> determine whether pages are evictable by memory shrinker. The pages will
>>> be evictable only when pages_pin_count=0. This patch prepares code for
>>> addition of the memory shrinker that will utilize the new field.
>>>
>>> Signed-off-by: Dmitry Osipenko 
>>> ---
>>>  drivers/gpu/drm/drm_gem_shmem_helper.c | 7 +++
>>>  include/drm/drm_gem_shmem_helper.h | 9 +
>>>  2 files changed, 16 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
>>> b/drivers/gpu/drm/drm_gem_shmem_helper.c
>>> index 4da9c9c39b9a..81d61791f874 100644
>>> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
>>> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
>>> @@ -277,6 +277,8 @@ static int drm_gem_shmem_pin_locked(struct 
>>> drm_gem_shmem_object *shmem)
>>> drm_WARN_ON(obj->dev, obj->import_attach);
>>>  
>>> ret = drm_gem_shmem_get_pages(shmem);
>>> +   if (!ret)
>>> +   shmem->pages_pin_count++;
>>>  
>>> return ret;
>>>  }
>>> @@ -289,7 +291,12 @@ static void drm_gem_shmem_unpin_locked(struct 
>>> drm_gem_shmem_object *shmem)
>>>  
>>> drm_WARN_ON(obj->dev, obj->import_attach);
>>>  
>>> +   if (drm_WARN_ON_ONCE(obj->dev, !shmem->pages_pin_count))
>>> +   return;
>>> +
>>> drm_gem_shmem_put_pages(shmem);
>>> +
>>> +   shmem->pages_pin_count--;
>>>  }
>>>  
>>>  /**
>>> diff --git a/include/drm/drm_gem_shmem_helper.h 
>>> b/include/drm/drm_gem_shmem_helper.h
>>> index 20ddcd799df9..7d823c9fc480 100644
>>> --- a/include/drm/drm_gem_shmem_helper.h
>>> +++ b/include/drm/drm_gem_shmem_helper.h
>>> @@ -39,6 +39,15 @@ struct drm_gem_shmem_object {
>>>  */
>>> unsigned int pages_use_count;
>>>  
>>> +   /**
>>> +* @pages_pin_count:
>>> +*
>>> +* Reference count on the pinned pages table.
>>> +* The pages allowed to be evicted by memory shrinker
>>> +* only when the count is zero.
>>> +*/
>>> +   unsigned int pages_pin_count;  
>>
>> s/pages_pin_count/pin_count/ ?
>>
>> And do we really need both pages_pin_count and pages_use_count. Looks
>> like they both serve the same purpose, with one exception:
>> pages_use_count is also incremented in the get_pages_sgt_locked() path,
>> but you probably don't want it to prevent GEM eviction. Assuming
>> your goal with this pin_count field is to check if a GEM object is
>> evictable, it can be done with something like
>>
>> bool
>> drm_gem_shmem_is_evictable_locked(struct drm_gem_shmem_object *shmem)
>> {
>>  dma_resv_assert_held(shmem->base.resv);
>>
>>  return shmem->pages_use_count == (shmem->sgt ? 1 : 0);
>> }
>>
>> I mean, I'm not against renaming pages_use_count into pin_count, but,
>> unless I'm missing something, I don't see a good reason to keep both.
> 
> My bad, I think I found one place calling drm_gem_shmem_get_pages()
> where we want pin_count and pages_use_count to differ:
> drm_gem_shmem_mmap(). We certainly don't want userspace mappings to
> prevent eviction.

That's correct, thanks for the review :)

-- 
Best regards,
Dmitry



Re: [PATCH] backlight: led_bl: take led_access lock when required

2023-06-26 Thread Lee Jones
On Mon, 19 Jun 2023, Mans Rullgard wrote:

> The led_access lock must be held when calling led_sysfs_enable() and
> led_sysfs_disable().  This fixes warnings such as this:
> 
> [2.432495] [ cut here ]
> [2.437316] WARNING: CPU: 0 PID: 22 at drivers/leds/led-core.c:349 
> led_sysfs_disable+0x54/0x58
> [2.446105] Modules linked in:
> [2.449218] CPU: 0 PID: 22 Comm: kworker/u2:1 Not tainted 6.3.8+ #1
> [2.456268] Hardware name: Generic AM3517 (Flattened Device Tree)
> [2.462402] Workqueue: events_unbound deferred_probe_work_func
> [2.468353]  unwind_backtrace from show_stack+0x10/0x14
> [2.473632]  show_stack from dump_stack_lvl+0x24/0x2c
> [2.478759]  dump_stack_lvl from __warn+0x9c/0xc4
> [2.483551]  __warn from warn_slowpath_fmt+0x64/0xc0
> [2.488586]  warn_slowpath_fmt from led_sysfs_disable+0x54/0x58
> [2.494567]  led_sysfs_disable from led_bl_probe+0x20c/0x3b0
> [2.500305]  led_bl_probe from platform_probe+0x5c/0xb8
> [2.505615]  platform_probe from really_probe+0xc8/0x2a0
> [2.510986]  really_probe from __driver_probe_device+0x88/0x19c
> [2.516967]  __driver_probe_device from driver_probe_device+0x30/0xcc
> [2.523498]  driver_probe_device from __device_attach_driver+0x94/0xc4
> [2.530090]  __device_attach_driver from bus_for_each_drv+0x80/0xcc
> [2.536437]  bus_for_each_drv from __device_attach+0xf8/0x19c
> [2.542236]  __device_attach from bus_probe_device+0x8c/0x90
> [2.547973]  bus_probe_device from deferred_probe_work_func+0x80/0xb0
> [2.554504]  deferred_probe_work_func from process_one_work+0x228/0x4c0
> [2.561187]  process_one_work from worker_thread+0x1fc/0x4d0
> [2.566925]  worker_thread from kthread+0xb4/0xd0
> [2.571685]  kthread from ret_from_fork+0x14/0x2c
> [2.576446] Exception stack(0xd0079fb0 to 0xd0079ff8)
> [2.581573] 9fa0:   
>  
> [2.589813] 9fc0:       
>  
> [2.598052] 9fe0:     0013 
> [2.604888] ---[ end trace  ]---
> 
> 
> Signed-off-by: Mans Rullgard 
> ---
>  drivers/video/backlight/led_bl.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)

Applied, thanks

-- 
Lee Jones [李琼斯]


Re: [PATCH] kunit: drm: make DRM buddy test compatible with other pages sizes

2023-06-26 Thread Nico Pache
Hi Christian,

Thanks for the information! I am not very familiar with the inner
workings of DRM, so I'm not really in a position to make any large or
systematic changes to the test regarding the points you made. I am
mainly trying to allow the tests to be run on more diverse hardware.
>From the looks of it this test has been adapted from an older test, so
perhaps this rule was set in place in the past.

Either way, I dont think my changes are going to break anything, so
for the time being I think this small change is the best approach.
Please let me know if you think otherwise.

David, do you still have this on your radar? We've been carrying this
as a RHEL-only since I originally posted it and have not noticed any
issues due to it.

Cheers,
-- Nico

On Wed, Apr 19, 2023 at 4:30 AM Christian König
 wrote:
>
> Am 18.04.23 um 19:15 schrieb Nico Pache:
> > The DRM buddy test uses a fixed 12 bit shift to covert from pages to
> > bytes. This number is then used to confirm that (chunk_size < PAGE_SIZE)
> > which can lead to a failing drm_buddy_init on systems with PAGE_SIZE > 4k.
>
> Since the buddy allocator is used for resources which are independent of
> the CPU PAGE size the later check is actually the broken one.
>
> E.g. neither in the buddy allocator nor in it's test cases we should
> have any of PAGE_SHIFT or PAGE_SIZE.
>
> Otherwise the allocator wouldn't work correctly on systems with a
> PAGE_SIZE different than 4k.
>
> Regards,
> Christian.
>
> >
> > Fixes: 92937f170d3f ("drm/selftests: add drm buddy alloc range testcase")
> > Signed-off-by: Nico Pache 
> > ---
> >   drivers/gpu/drm/tests/drm_buddy_test.c | 4 ++--
> >   1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/tests/drm_buddy_test.c 
> > b/drivers/gpu/drm/tests/drm_buddy_test.c
> > index 09ee6f6af896..a62b2690d3c2 100644
> > --- a/drivers/gpu/drm/tests/drm_buddy_test.c
> > +++ b/drivers/gpu/drm/tests/drm_buddy_test.c
> > @@ -318,8 +318,8 @@ static void mm_config(u64 *size, u64 *chunk_size)
> >   s &= -ms;
> >
> >   /* Convert from pages to bytes */
> > - *chunk_size = (u64)ms << 12;
> > - *size = (u64)s << 12;
> > + *chunk_size = (u64)ms << PAGE_SHIFT;
> > + *size = (u64)s << PAGE_SHIFT;
> >   }
> >
> >   static void drm_test_buddy_alloc_pathological(struct kunit *test)
>



Re: [Intel-xe] [PATCH v2 2/4] drm/ttm: Don't shadow the operation context

2023-06-26 Thread Thomas Hellström
On Mon, 2023-06-26 at 17:18 +0200, Thomas Hellström wrote:
> On Mon, 2023-06-26 at 17:15 +0200, Christian König wrote:
> > Am 26.06.23 um 11:14 schrieb Thomas Hellström:
> > > ttm_bo_swapout() shadows the ttm operation context which may
> > > cause
> > > major confusion in driver callbacks when swapping out
> > > !TTM_PL_SYSTEM
> > > memory. Fix this by reusing the operation context argument to
> > > ttm_bo_swapout().
> > > 
> > > Cc: "Christian König" 
> > > Cc: Roger He 
> > > Cc: 
> > > Cc: 
> > > Cc:  # v4.16+
> > > Fixes: dc947770cf34 ("drm/ttm: enable swapout for reserved BOs
> > > during allocation")
> > > Signed-off-by: Thomas Hellström
> > > 
> > > Acked-by: Matthew Brost 
> > 
> > We intentionally didn't used the parameter here, but I absolutely
> > can't 
> > figure out why.
> > 
> > Feel free to add my rb, but let's give it some time upstream before
> > you 
> > base anything on top of this. Just in case we missed something.
> 
> Sure. Thanks for reviewing,

BTW, I'll remove the Fixes: tag as well.

/Thomas


> /Thomas
> 
> > 
> > Regards,
> > Christian.
> > 
> > > ---
> > >   drivers/gpu/drm/ttm/ttm_bo.c | 3 +--
> > >   1 file changed, 1 insertion(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
> > > b/drivers/gpu/drm/ttm/ttm_bo.c
> > > index bd5dae4d1624..615d30c4262d 100644
> > > --- a/drivers/gpu/drm/ttm/ttm_bo.c
> > > +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> > > @@ -1154,7 +1154,6 @@ int ttm_bo_swapout(struct ttm_buffer_object
> > > *bo, struct ttm_operation_ctx *ctx,
> > >  * Move to system cached
> > >  */
> > > if (bo->resource->mem_type != TTM_PL_SYSTEM) {
> > > -   struct ttm_operation_ctx ctx = { false, false };
> > > struct ttm_resource *evict_mem;
> > > struct ttm_place hop;
> > >   
> > > @@ -1164,7 +1163,7 @@ int ttm_bo_swapout(struct ttm_buffer_object
> > > *bo, struct ttm_operation_ctx *ctx,
> > > if (unlikely(ret))
> > > goto out;
> > >   
> > > -   ret = ttm_bo_handle_move_mem(bo, evict_mem, true,
> > > , );
> > > +   ret = ttm_bo_handle_move_mem(bo, evict_mem, true,
> > > ctx, );
> > > if (unlikely(ret != 0)) {
> > > WARN(ret == -EMULTIHOP, "Unexpected
> > > multihop in swaput - likely driver bug.\n");
> > > goto out;
> > 
> 



Re: [PATCH v13 03/10] drm/shmem-helper: Add pages_pin_count field

2023-06-26 Thread Boris Brezillon
On Mon, 26 Jun 2023 17:04:57 +0200
Boris Brezillon  wrote:

> Hi Dmitry,
> 
> Sorry for chiming in only now :-/.
> 
> On Tue, 14 Mar 2023 05:26:52 +0300
> Dmitry Osipenko  wrote:
> 
> > And new pages_pin_count field to struct drm_gem_shmem_object that will
> > determine whether pages are evictable by memory shrinker. The pages will
> > be evictable only when pages_pin_count=0. This patch prepares code for
> > addition of the memory shrinker that will utilize the new field.
> > 
> > Signed-off-by: Dmitry Osipenko 
> > ---
> >  drivers/gpu/drm/drm_gem_shmem_helper.c | 7 +++
> >  include/drm/drm_gem_shmem_helper.h | 9 +
> >  2 files changed, 16 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
> > b/drivers/gpu/drm/drm_gem_shmem_helper.c
> > index 4da9c9c39b9a..81d61791f874 100644
> > --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> > +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> > @@ -277,6 +277,8 @@ static int drm_gem_shmem_pin_locked(struct 
> > drm_gem_shmem_object *shmem)
> > drm_WARN_ON(obj->dev, obj->import_attach);
> >  
> > ret = drm_gem_shmem_get_pages(shmem);
> > +   if (!ret)
> > +   shmem->pages_pin_count++;
> >  
> > return ret;
> >  }
> > @@ -289,7 +291,12 @@ static void drm_gem_shmem_unpin_locked(struct 
> > drm_gem_shmem_object *shmem)
> >  
> > drm_WARN_ON(obj->dev, obj->import_attach);
> >  
> > +   if (drm_WARN_ON_ONCE(obj->dev, !shmem->pages_pin_count))
> > +   return;
> > +
> > drm_gem_shmem_put_pages(shmem);
> > +
> > +   shmem->pages_pin_count--;
> >  }
> >  
> >  /**
> > diff --git a/include/drm/drm_gem_shmem_helper.h 
> > b/include/drm/drm_gem_shmem_helper.h
> > index 20ddcd799df9..7d823c9fc480 100644
> > --- a/include/drm/drm_gem_shmem_helper.h
> > +++ b/include/drm/drm_gem_shmem_helper.h
> > @@ -39,6 +39,15 @@ struct drm_gem_shmem_object {
> >  */
> > unsigned int pages_use_count;
> >  
> > +   /**
> > +* @pages_pin_count:
> > +*
> > +* Reference count on the pinned pages table.
> > +* The pages allowed to be evicted by memory shrinker
> > +* only when the count is zero.
> > +*/
> > +   unsigned int pages_pin_count;  
> 
> s/pages_pin_count/pin_count/ ?
> 
> And do we really need both pages_pin_count and pages_use_count. Looks
> like they both serve the same purpose, with one exception:
> pages_use_count is also incremented in the get_pages_sgt_locked() path,
> but you probably don't want it to prevent GEM eviction. Assuming
> your goal with this pin_count field is to check if a GEM object is
> evictable, it can be done with something like
> 
> bool
> drm_gem_shmem_is_evictable_locked(struct drm_gem_shmem_object *shmem)
> {
>   dma_resv_assert_held(shmem->base.resv);
> 
>   return shmem->pages_use_count == (shmem->sgt ? 1 : 0);
> }
> 
> I mean, I'm not against renaming pages_use_count into pin_count, but,
> unless I'm missing something, I don't see a good reason to keep both.

My bad, I think I found one place calling drm_gem_shmem_get_pages()
where we want pin_count and pages_use_count to differ:
drm_gem_shmem_mmap(). We certainly don't want userspace mappings to
prevent eviction.


Re: [PATCH v2 2/4] drm/ttm: Don't shadow the operation context

2023-06-26 Thread Thomas Hellström
On Mon, 2023-06-26 at 17:15 +0200, Christian König wrote:
> Am 26.06.23 um 11:14 schrieb Thomas Hellström:
> > ttm_bo_swapout() shadows the ttm operation context which may cause
> > major confusion in driver callbacks when swapping out
> > !TTM_PL_SYSTEM
> > memory. Fix this by reusing the operation context argument to
> > ttm_bo_swapout().
> > 
> > Cc: "Christian König" 
> > Cc: Roger He 
> > Cc: 
> > Cc: 
> > Cc:  # v4.16+
> > Fixes: dc947770cf34 ("drm/ttm: enable swapout for reserved BOs
> > during allocation")
> > Signed-off-by: Thomas Hellström 
> > Acked-by: Matthew Brost 
> 
> We intentionally didn't used the parameter here, but I absolutely
> can't 
> figure out why.
> 
> Feel free to add my rb, but let's give it some time upstream before
> you 
> base anything on top of this. Just in case we missed something.

Sure. Thanks for reviewing,
/Thomas

> 
> Regards,
> Christian.
> 
> > ---
> >   drivers/gpu/drm/ttm/ttm_bo.c | 3 +--
> >   1 file changed, 1 insertion(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
> > b/drivers/gpu/drm/ttm/ttm_bo.c
> > index bd5dae4d1624..615d30c4262d 100644
> > --- a/drivers/gpu/drm/ttm/ttm_bo.c
> > +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> > @@ -1154,7 +1154,6 @@ int ttm_bo_swapout(struct ttm_buffer_object
> > *bo, struct ttm_operation_ctx *ctx,
> >  * Move to system cached
> >  */
> > if (bo->resource->mem_type != TTM_PL_SYSTEM) {
> > -   struct ttm_operation_ctx ctx = { false, false };
> > struct ttm_resource *evict_mem;
> > struct ttm_place hop;
> >   
> > @@ -1164,7 +1163,7 @@ int ttm_bo_swapout(struct ttm_buffer_object
> > *bo, struct ttm_operation_ctx *ctx,
> > if (unlikely(ret))
> > goto out;
> >   
> > -   ret = ttm_bo_handle_move_mem(bo, evict_mem, true,
> > , );
> > +   ret = ttm_bo_handle_move_mem(bo, evict_mem, true,
> > ctx, );
> > if (unlikely(ret != 0)) {
> > WARN(ret == -EMULTIHOP, "Unexpected
> > multihop in swaput - likely driver bug.\n");
> > goto out;
> 



Re: [PATCH v2 2/4] drm/ttm: Don't shadow the operation context

2023-06-26 Thread Christian König

Am 26.06.23 um 11:14 schrieb Thomas Hellström:

ttm_bo_swapout() shadows the ttm operation context which may cause
major confusion in driver callbacks when swapping out !TTM_PL_SYSTEM
memory. Fix this by reusing the operation context argument to
ttm_bo_swapout().

Cc: "Christian König" 
Cc: Roger He 
Cc: 
Cc: 
Cc:  # v4.16+
Fixes: dc947770cf34 ("drm/ttm: enable swapout for reserved BOs during 
allocation")
Signed-off-by: Thomas Hellström 
Acked-by: Matthew Brost 


We intentionally didn't used the parameter here, but I absolutely can't 
figure out why.


Feel free to add my rb, but let's give it some time upstream before you 
base anything on top of this. Just in case we missed something.


Regards,
Christian.


---
  drivers/gpu/drm/ttm/ttm_bo.c | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index bd5dae4d1624..615d30c4262d 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -1154,7 +1154,6 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, struct 
ttm_operation_ctx *ctx,
 * Move to system cached
 */
if (bo->resource->mem_type != TTM_PL_SYSTEM) {
-   struct ttm_operation_ctx ctx = { false, false };
struct ttm_resource *evict_mem;
struct ttm_place hop;
  
@@ -1164,7 +1163,7 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx,

if (unlikely(ret))
goto out;
  
-		ret = ttm_bo_handle_move_mem(bo, evict_mem, true, , );

+   ret = ttm_bo_handle_move_mem(bo, evict_mem, true, ctx, );
if (unlikely(ret != 0)) {
WARN(ret == -EMULTIHOP, "Unexpected multihop in swaput - 
likely driver bug.\n");
goto out;




Re: [PATCH] backlight: pwm_bl: Avoid backlight flicker applying initial PWM state

2023-06-26 Thread Daniel Thompson
On Thu, Jun 08, 2023 at 04:11:14PM +0200, Philipp Zabel wrote:
> The initial PWM state returned by pwm_init_state() has a duty cycle
> of 0 ns. To avoid backlight flicker when taking over an enabled
> display from the bootloader, skip the initial pwm_apply_state()
> and leave the PWM be until backlight_update_state() will apply the
> state with the desired brightness.

backlight_update_state() uses pwm_get_state() to update the PWM.

Without applying something that came from pwm_init_state() then
we will never adopt the reference values from pwm->args.


Daniel.


Re: [PATCH v13 03/10] drm/shmem-helper: Add pages_pin_count field

2023-06-26 Thread Boris Brezillon
Hi Dmitry,

Sorry for chiming in only now :-/.

On Tue, 14 Mar 2023 05:26:52 +0300
Dmitry Osipenko  wrote:

> And new pages_pin_count field to struct drm_gem_shmem_object that will
> determine whether pages are evictable by memory shrinker. The pages will
> be evictable only when pages_pin_count=0. This patch prepares code for
> addition of the memory shrinker that will utilize the new field.
> 
> Signed-off-by: Dmitry Osipenko 
> ---
>  drivers/gpu/drm/drm_gem_shmem_helper.c | 7 +++
>  include/drm/drm_gem_shmem_helper.h | 9 +
>  2 files changed, 16 insertions(+)
> 
> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c 
> b/drivers/gpu/drm/drm_gem_shmem_helper.c
> index 4da9c9c39b9a..81d61791f874 100644
> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> @@ -277,6 +277,8 @@ static int drm_gem_shmem_pin_locked(struct 
> drm_gem_shmem_object *shmem)
>   drm_WARN_ON(obj->dev, obj->import_attach);
>  
>   ret = drm_gem_shmem_get_pages(shmem);
> + if (!ret)
> + shmem->pages_pin_count++;
>  
>   return ret;
>  }
> @@ -289,7 +291,12 @@ static void drm_gem_shmem_unpin_locked(struct 
> drm_gem_shmem_object *shmem)
>  
>   drm_WARN_ON(obj->dev, obj->import_attach);
>  
> + if (drm_WARN_ON_ONCE(obj->dev, !shmem->pages_pin_count))
> + return;
> +
>   drm_gem_shmem_put_pages(shmem);
> +
> + shmem->pages_pin_count--;
>  }
>  
>  /**
> diff --git a/include/drm/drm_gem_shmem_helper.h 
> b/include/drm/drm_gem_shmem_helper.h
> index 20ddcd799df9..7d823c9fc480 100644
> --- a/include/drm/drm_gem_shmem_helper.h
> +++ b/include/drm/drm_gem_shmem_helper.h
> @@ -39,6 +39,15 @@ struct drm_gem_shmem_object {
>*/
>   unsigned int pages_use_count;
>  
> + /**
> +  * @pages_pin_count:
> +  *
> +  * Reference count on the pinned pages table.
> +  * The pages allowed to be evicted by memory shrinker
> +  * only when the count is zero.
> +  */
> + unsigned int pages_pin_count;

s/pages_pin_count/pin_count/ ?

And do we really need both pages_pin_count and pages_use_count. Looks
like they both serve the same purpose, with one exception:
pages_use_count is also incremented in the get_pages_sgt_locked() path,
but you probably don't want it to prevent GEM eviction. Assuming
your goal with this pin_count field is to check if a GEM object is
evictable, it can be done with something like

bool
drm_gem_shmem_is_evictable_locked(struct drm_gem_shmem_object *shmem)
{
dma_resv_assert_held(shmem->base.resv);

return shmem->pages_use_count == (shmem->sgt ? 1 : 0);
}

I mean, I'm not against renaming pages_use_count into pin_count, but,
unless I'm missing something, I don't see a good reason to keep both.

Regards,

Boris


[PATCH] drm/tegra: Kconfig: Express the dependency by using the depends on

2023-06-26 Thread Sui Jingfeng
Because using the 'depends on' is easy to understand and more obvious.

Signed-off-by: Sui Jingfeng 
---
 drivers/gpu/drm/tegra/Kconfig | 6 ++
 drivers/gpu/host1x/Kconfig| 5 +
 2 files changed, 3 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/tegra/Kconfig b/drivers/gpu/drm/tegra/Kconfig
index 498313778175..e7e3856665b1 100644
--- a/drivers/gpu/drm/tegra/Kconfig
+++ b/drivers/gpu/drm/tegra/Kconfig
@@ -26,19 +26,17 @@ config DRM_TEGRA
  To compile this driver as a module, choose M here: the module
  will be called tegra-drm.
 
-if DRM_TEGRA
-
 config DRM_TEGRA_DEBUG
bool "NVIDIA Tegra DRM debug support"
+   depends on DRM_TEGRA
help
  Say yes here to enable debugging support.
 
 config DRM_TEGRA_STAGING
bool "Enable HOST1X interface"
+   depends on DRM_TEGRA
depends on STAGING
help
  Say yes if HOST1X should be available for userspace DRM users.
 
  If unsure, choose N.
-
-endif
diff --git a/drivers/gpu/host1x/Kconfig b/drivers/gpu/host1x/Kconfig
index e6c78ae2003a..36eb98d93637 100644
--- a/drivers/gpu/host1x/Kconfig
+++ b/drivers/gpu/host1x/Kconfig
@@ -17,14 +17,11 @@ config TEGRA_HOST1X
  by host1x are referred to as clients. host1x includes some other
  functionality, such as synchronization.
 
-if TEGRA_HOST1X
-
 config TEGRA_HOST1X_FIREWALL
bool "Enable HOST1X security firewall"
+   depends on TEGRA_HOST1X
default y
help
  Say yes if kernel should protect command streams from tampering.
 
  If unsure, choose Y.
-
-endif
-- 
2.25.1



Re: [PATCH 1/3] backlight: lm3630a: add support for changing the boost frequency

2023-06-26 Thread Daniel Thompson
On Wed, Jun 14, 2023 at 09:08:52PM +0200, Maximilian Weigand wrote:
> From: Maximilian Weigand 
>
> The led driver supports changing the switching frequency of the boost
> converter by two means: the base switching frequency can be changed from
> 500 kHz to 1 MHz, and a frequency shift can be activated, leading to
> switching frequencies of 560 kHz or 1.12 Mhz, respectively.
>
> Add this functionality to the led driver by introducing two dts entries
> that control the boost frequency (500 kHz by default) and the frequency
> shift (no shift by default).
>
> Signed-off-by: Maximilian Weigand 

Driver changes look ok (or at least will be when the DT bindings are
finalized).

However... I think patches 1 and 2 of this series are in the wrong
order. See #5 in
https://docs.kernel.org/devicetree/bindings/submitting-patches.html
for details.


Daniel.


[PATCH 2/2] drm/tegra: Remove surplus else after return

2023-06-26 Thread Sui Jingfeng
else is not generally useful after return

Signed-off-by: Sui Jingfeng 
---
 drivers/gpu/drm/tegra/gem.c | 19 ++-
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
index 0ce22935fbd3..26d64f4545bd 100644
--- a/drivers/gpu/drm/tegra/gem.c
+++ b/drivers/gpu/drm/tegra/gem.c
@@ -180,15 +180,15 @@ static void *tegra_bo_mmap(struct host1x_bo *bo)
struct iosys_map map;
int ret;
 
-   if (obj->vaddr) {
+   if (obj->vaddr)
return obj->vaddr;
-   } else if (obj->gem.import_attach) {
+
+   if (obj->gem.import_attach) {
ret = dma_buf_vmap_unlocked(obj->gem.import_attach->dmabuf, 
);
return ret ? NULL : map.vaddr;
-   } else {
-   return vmap(obj->pages, obj->num_pages, VM_MAP,
-   pgprot_writecombine(PAGE_KERNEL));
}
+
+   return vmap(obj->pages, obj->num_pages, VM_MAP, 
pgprot_writecombine(PAGE_KERNEL));
 }
 
 static void tegra_bo_munmap(struct host1x_bo *bo, void *addr)
@@ -198,10 +198,11 @@ static void tegra_bo_munmap(struct host1x_bo *bo, void 
*addr)
 
if (obj->vaddr)
return;
-   else if (obj->gem.import_attach)
-   dma_buf_vunmap_unlocked(obj->gem.import_attach->dmabuf, );
-   else
-   vunmap(addr);
+
+   if (obj->gem.import_attach)
+   return dma_buf_vunmap_unlocked(obj->gem.import_attach->dmabuf, 
);
+
+   vunmap(addr);
 }
 
 static struct host1x_bo *tegra_bo_get(struct host1x_bo *bo)
-- 
2.25.1



[PATCH 1/2] drm/tegra: Return an error code if fails

2023-06-26 Thread Sui Jingfeng
Return -ENOMEM if tegra_bo_mmap() fails.

Signed-off-by: Sui Jingfeng 
---
 drivers/gpu/drm/tegra/gem.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
index dea38892d6e6..0ce22935fbd3 100644
--- a/drivers/gpu/drm/tegra/gem.c
+++ b/drivers/gpu/drm/tegra/gem.c
@@ -710,6 +710,8 @@ static int tegra_gem_prime_vmap(struct dma_buf *buf, struct 
iosys_map *map)
void *vaddr;
 
vaddr = tegra_bo_mmap(>base);
+   if (!vaddr)
+   return -ENOMEM;
if (IS_ERR(vaddr))
return PTR_ERR(vaddr);
 
-- 
2.25.1



Re: [PATCH] backlight: led_bl: take led_access lock when required

2023-06-26 Thread Daniel Thompson
On Mon, Jun 19, 2023 at 05:02:49PM +0100, Mans Rullgard wrote:
> The led_access lock must be held when calling led_sysfs_enable() and
> led_sysfs_disable().  This fixes warnings such as this:
>
> [2.432495] [ cut here ]
> [2.437316] WARNING: CPU: 0 PID: 22 at drivers/leds/led-core.c:349 
> led_sysfs_disable+0x54/0x58
> [2.446105] Modules linked in:
> [2.449218] CPU: 0 PID: 22 Comm: kworker/u2:1 Not tainted 6.3.8+ #1
> [2.456268] Hardware name: Generic AM3517 (Flattened Device Tree)
> [2.462402] Workqueue: events_unbound deferred_probe_work_func
> [2.468353]  unwind_backtrace from show_stack+0x10/0x14
> [2.473632]  show_stack from dump_stack_lvl+0x24/0x2c
> [2.478759]  dump_stack_lvl from __warn+0x9c/0xc4
> [2.483551]  __warn from warn_slowpath_fmt+0x64/0xc0
> [2.488586]  warn_slowpath_fmt from led_sysfs_disable+0x54/0x58
> [2.494567]  led_sysfs_disable from led_bl_probe+0x20c/0x3b0
> [2.500305]  led_bl_probe from platform_probe+0x5c/0xb8
> [2.505615]  platform_probe from really_probe+0xc8/0x2a0
> [2.510986]  really_probe from __driver_probe_device+0x88/0x19c
> [2.516967]  __driver_probe_device from driver_probe_device+0x30/0xcc
> [2.523498]  driver_probe_device from __device_attach_driver+0x94/0xc4
> [2.530090]  __device_attach_driver from bus_for_each_drv+0x80/0xcc
> [2.536437]  bus_for_each_drv from __device_attach+0xf8/0x19c
> [2.542236]  __device_attach from bus_probe_device+0x8c/0x90
> [2.547973]  bus_probe_device from deferred_probe_work_func+0x80/0xb0
> [2.554504]  deferred_probe_work_func from process_one_work+0x228/0x4c0
> [2.561187]  process_one_work from worker_thread+0x1fc/0x4d0
> [2.566925]  worker_thread from kthread+0xb4/0xd0
> [2.571685]  kthread from ret_from_fork+0x14/0x2c
> [2.576446] Exception stack(0xd0079fb0 to 0xd0079ff8)
> [2.581573] 9fa0:   
>  
> [2.589813] 9fc0:       
>  
> [2.598052] 9fe0:     0013 
> [2.604888] ---[ end trace  ]---
>
>
> Signed-off-by: Mans Rullgard 

Reviewed-by: Daniel Thompson 


Daniel.


  1   2   >