[v2] drm/panel: Fine tune Starry-ili9882t panel HFP and HBP
Because the setting of hproch is too small, there will be warning in kernel log[1]. After fine tune the HFP and HBP, this warning can be solved. The actual measurement frame rate is 60.1Hz. [1]: WARNING kernel:[drm] HFP + HBP less than d-phy, FPS will under 60Hz Fixes: 8716a6473e6c ("drm/panel: Support for Starry-ili9882t TDDI MIPI-DSI panel") Signed-off-by: Cong Yang Reviewed-by: Neil Armstrong --- v2: - Update commit add Fixes tag --- drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c b/drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c index 3cc9fb0d4f5d..dc276c346fd1 100644 --- a/drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c +++ b/drivers/gpu/drm/panel/panel-boe-tv101wum-nl6.c @@ -2139,9 +2139,9 @@ static const struct panel_desc starry_himax83102_j02_desc = { static const struct drm_display_mode starry_ili9882t_default_mode = { .clock = 165280, .hdisplay = 1200, - .hsync_start = 1200 + 32, - .hsync_end = 1200 + 32 + 30, - .htotal = 1200 + 32 + 30 + 32, + .hsync_start = 1200 + 72, + .hsync_end = 1200 + 72 + 30, + .htotal = 1200 + 72 + 30 + 72, .vdisplay = 1920, .vsync_start = 1920 + 68, .vsync_end = 1920 + 68 + 2, -- 2.25.1
[PATCH next] drm/msm/a5xx: Fix a NULL dereference bug in a5xx_gpu_init()
Smatch complains: drivers/gpu/drm/msm/adreno/a5xx_gpu.c:1753 a5xx_gpu_init() warn: variable dereferenced before check 'pdev' (see line 1746) When no device is defined, dereferencing pdev is a NULL dereference. Fix this by dereferencing pdev to get the config post the NULL check. Fixes: 736a93273656 ("drm/msm/a5xx: really check for A510 in a5xx_gpu_init") Signed-off-by: Harshit Mogalapalli --- This is based on static analysis, only compile tested. --- drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c index a99310b68793..adaf8f8e7f2d 100644 --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c @@ -1743,7 +1743,7 @@ struct msm_gpu *a5xx_gpu_init(struct drm_device *dev) { struct msm_drm_private *priv = dev->dev_private; struct platform_device *pdev = priv->gpu_pdev; - struct adreno_platform_config *config = pdev->dev.platform_data; + struct adreno_platform_config *config; struct a5xx_gpu *a5xx_gpu = NULL; struct adreno_gpu *adreno_gpu; struct msm_gpu *gpu; @@ -1755,6 +1755,8 @@ struct msm_gpu *a5xx_gpu_init(struct drm_device *dev) return ERR_PTR(-ENXIO); } + config = pdev->dev.platform_data; + a5xx_gpu = kzalloc(sizeof(*a5xx_gpu), GFP_KERNEL); if (!a5xx_gpu) return ERR_PTR(-ENOMEM); -- 2.39.3
Re: [Intel-gfx] [PATCH] drm/i915/guc: Dump perf_limit_reasons for debug
On 6/26/2023 8:17 PM, Dixit, Ashutosh wrote: On Mon, 26 Jun 2023 19:12:18 -0700, Vinay Belgaumkar wrote: GuC load takes longer sometimes due to GT frequency not ramping up. Add perf_limit_reasons to the existing warn print to see if frequency is being throttled. Signed-off-by: Vinay Belgaumkar --- drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c index 364d0d546ec8..73911536a8e7 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c @@ -254,6 +254,8 @@ static int guc_wait_ucode(struct intel_guc *guc) guc_warn(guc, "excessive init time: %lldms! [freq = %dMHz, before = %dMHz, status = 0x%08X, count = %d, ret = %d]\n", delta_ms, intel_rps_read_actual_frequency(>gt->rps), before_freq, status, count, ret); + guc_warn(guc, "perf limit reasons = 0x%08X\n", +intel_uncore_read(uncore, intel_gt_perf_limit_reasons_reg(gt))); Maybe just add at the end of the previous guc_warn? Its already too long a line. If I try adding on the next line checkpatch complains about splitting double quotes. Thanks, Vinay. } else { guc_dbg(guc, "init took %lldms, freq = %dMHz, before = %dMHz, status = 0x%08X, count = %d, ret = %d\n", delta_ms, intel_rps_read_actual_frequency(>gt->rps), -- 2.38.1
[PATCH v3 7/8] drm: Remove legacy cursor hotspot code
From: Zack Rusin Atomic modesetting support mouse cursor offsets via the hotspot properties that are creates on cursor planes. All drivers which support hotspot are atomic and the legacy code has been implemented in terms of the atomic properties as well. Due to the above the lagacy cursor hotspot code is no longer used or needed and can be removed. Signed-off-by: Zack Rusin Cc: Maarten Lankhorst Cc: Maxime Ripard Cc: Thomas Zimmermann Cc: David Airlie Cc: Daniel Vetter --- drivers/gpu/drm/drm_plane.c | 3 --- include/drm/drm_framebuffer.h | 12 2 files changed, 15 deletions(-) diff --git a/drivers/gpu/drm/drm_plane.c b/drivers/gpu/drm/drm_plane.c index ff1cc810d8f8..539fe0d94300 100644 --- a/drivers/gpu/drm/drm_plane.c +++ b/drivers/gpu/drm/drm_plane.c @@ -1069,9 +1069,6 @@ static int drm_mode_cursor_universal(struct drm_crtc *crtc, return PTR_ERR(fb); } - fb->hot_x = req->hot_x; - fb->hot_y = req->hot_y; - if (plane->hotspot_x_property && plane->state) plane->state->hotspot_x = req->hot_x; if (plane->hotspot_y_property && plane->state) diff --git a/include/drm/drm_framebuffer.h b/include/drm/drm_framebuffer.h index 0dcc07b68654..1e108c1789b1 100644 --- a/include/drm/drm_framebuffer.h +++ b/include/drm/drm_framebuffer.h @@ -188,18 +188,6 @@ struct drm_framebuffer { * DRM_MODE_FB_MODIFIERS. */ int flags; - /** -* @hot_x: X coordinate of the cursor hotspot. Used by the legacy cursor -* IOCTL when the driver supports cursor through a DRM_PLANE_TYPE_CURSOR -* universal plane. -*/ - int hot_x; - /** -* @hot_y: Y coordinate of the cursor hotspot. Used by the legacy cursor -* IOCTL when the driver supports cursor through a DRM_PLANE_TYPE_CURSOR -* universal plane. -*/ - int hot_y; /** * @filp_head: Placed on _file.fbs, protected by _file.fbs_lock. */ -- 2.39.2
[PATCH v3 2/8] drm/atomic: Add support for mouse hotspots
From: Zack Rusin Atomic modesetting code lacked support for specifying mouse cursor hotspots. The legacy kms DRM_IOCTL_MODE_CURSOR2 had support for setting the hotspot but the functionality was not implemented in the new atomic paths. Due to the lack of hotspots in the atomic paths userspace compositors completely disable atomic modesetting for drivers that require it (i.e. all paravirtualized drivers). This change adds hotspot properties to the atomic codepaths throughtout the DRM core and will allow enabling atomic modesetting for virtualized drivers in the userspace. Signed-off-by: Zack Rusin Cc: Maarten Lankhorst Cc: Maxime Ripard Cc: Thomas Zimmermann Cc: David Airlie Cc: Daniel Vetter --- drivers/gpu/drm/drm_atomic_state_helper.c | 14 +++ drivers/gpu/drm/drm_atomic_uapi.c | 20 + drivers/gpu/drm/drm_plane.c | 51 +++ include/drm/drm_plane.h | 15 +++ 4 files changed, 100 insertions(+) diff --git a/drivers/gpu/drm/drm_atomic_state_helper.c b/drivers/gpu/drm/drm_atomic_state_helper.c index 784e63d70a42..54975de44a0e 100644 --- a/drivers/gpu/drm/drm_atomic_state_helper.c +++ b/drivers/gpu/drm/drm_atomic_state_helper.c @@ -275,6 +275,20 @@ void __drm_atomic_helper_plane_state_reset(struct drm_plane_state *plane_state, plane_state->normalized_zpos = val; } } + + if (plane->hotspot_x_property) { + if (!drm_object_property_get_default_value(>base, + plane->hotspot_x_property, + )) + plane_state->hotspot_x = val; + } + + if (plane->hotspot_y_property) { + if (!drm_object_property_get_default_value(>base, + plane->hotspot_y_property, + )) + plane_state->hotspot_y = val; + } } EXPORT_SYMBOL(__drm_atomic_helper_plane_state_reset); diff --git a/drivers/gpu/drm/drm_atomic_uapi.c b/drivers/gpu/drm/drm_atomic_uapi.c index 98d3b10c08ae..07a7b3f18df2 100644 --- a/drivers/gpu/drm/drm_atomic_uapi.c +++ b/drivers/gpu/drm/drm_atomic_uapi.c @@ -593,6 +593,22 @@ static int drm_atomic_plane_set_property(struct drm_plane *plane, } else if (plane->funcs->atomic_set_property) { return plane->funcs->atomic_set_property(plane, state, property, val); + } else if (property == plane->hotspot_x_property) { + if (plane->type != DRM_PLANE_TYPE_CURSOR) { + drm_dbg_atomic(plane->dev, + "[PLANE:%d:%s] is not a cursor plane: 0x%llx\n", + plane->base.id, plane->name, val); + return -EINVAL; + } + state->hotspot_x = val; + } else if (property == plane->hotspot_y_property) { + if (plane->type != DRM_PLANE_TYPE_CURSOR) { + drm_dbg_atomic(plane->dev, + "[PLANE:%d:%s] is not a cursor plane: 0x%llx\n", + plane->base.id, plane->name, val); + return -EINVAL; + } + state->hotspot_y = val; } else { drm_dbg_atomic(plane->dev, "[PLANE:%d:%s] unknown property [PROP:%d:%s]\n", @@ -653,6 +669,10 @@ drm_atomic_plane_get_property(struct drm_plane *plane, *val = state->scaling_filter; } else if (plane->funcs->atomic_get_property) { return plane->funcs->atomic_get_property(plane, state, property, val); + } else if (property == plane->hotspot_x_property) { + *val = state->hotspot_x; + } else if (property == plane->hotspot_y_property) { + *val = state->hotspot_y; } else { drm_dbg_atomic(dev, "[PLANE:%d:%s] unknown property [PROP:%d:%s]\n", diff --git a/drivers/gpu/drm/drm_plane.c b/drivers/gpu/drm/drm_plane.c index a4a39f4834e2..ff1cc810d8f8 100644 --- a/drivers/gpu/drm/drm_plane.c +++ b/drivers/gpu/drm/drm_plane.c @@ -348,6 +348,10 @@ static int __drm_universal_plane_init(struct drm_device *dev, drm_object_attach_property(>base, config->prop_src_w, 0); drm_object_attach_property(>base, config->prop_src_h, 0); } + if (drm_core_check_feature(dev, DRIVER_CURSOR_HOTSPOT) && + type == DRM_PLANE_TYPE_CURSOR) { + drm_plane_create_hotspot_properties(plane); + } if (format_modifier_count) create_in_format_blob(dev, plane); @@ -1067,6 +1071,11 @@ static int drm_mode_cursor_universal(struct drm_crtc *crtc,
[PATCH v3 8/8] drm: Introduce DRM_CLIENT_CAP_VIRTUALIZED_CURSOR_PLANE
From: Zack Rusin Virtualized drivers place additional restrictions on the cursor plane which breaks the contract of universal planes. To allow atomic modesettings with virtualized drivers the clients need to advertise that they're capable of dealing with those extra restrictions. To do that introduce DRM_CLIENT_CAP_VIRTUALIZED_CURSOR_PLANE which lets DRM know that the client is aware of and capable of dealing with the extra restrictions on the virtual cursor plane. Setting this option to true makes DRM expose the cursor plane on virtualized drivers. The userspace is expected to set the hotspots and handle mouse events on that plane. Signed-off-by: Zack Rusin Cc: Maarten Lankhorst Cc: Maxime Ripard Cc: Thomas Zimmermann Cc: David Airlie Cc: Daniel Vetter Cc: dri-devel@lists.freedesktop.org --- drivers/gpu/drm/drm_ioctl.c | 9 + include/uapi/drm/drm.h | 26 ++ 2 files changed, 35 insertions(+) diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c index 8e9afe7af19c..6fd17ff14656 100644 --- a/drivers/gpu/drm/drm_ioctl.c +++ b/drivers/gpu/drm/drm_ioctl.c @@ -361,6 +361,15 @@ drm_setclientcap(struct drm_device *dev, void *data, struct drm_file *file_priv) return -EINVAL; file_priv->writeback_connectors = req->value; break; + case DRM_CLIENT_CAP_VIRTUALIZED_CURSOR_PLANE: + if (!drm_core_check_feature(dev, DRIVER_CURSOR_HOTSPOT)) + return -EOPNOTSUPP; + if (!file_priv->atomic) + return -EINVAL; + if (req->value > 1) + return -EINVAL; + file_priv->supports_virtualized_cursor_plane = req->value; + break; default: return -EINVAL; } diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h index a87ca2d4..057ef2a16d31 100644 --- a/include/uapi/drm/drm.h +++ b/include/uapi/drm/drm.h @@ -836,6 +836,32 @@ struct drm_get_cap { */ #define DRM_CLIENT_CAP_WRITEBACK_CONNECTORS5 +/** + * DRM_CLIENT_CAP_VIRTUALIZED_CURSOR_PLANE + * + * Drivers for para-virtualized hardware (e.g. vmwgfx, qxl, virtio and + * virtualbox) have additional restrictions for cursor planes (thus + * making cursor planes on those drivers not truly universal,) e.g. + * they need cursor planes to act like one would expect from a mouse + * cursor and have correctly set hotspot properties. + * If this client cap is not set the DRM core will hide cursor plane on + * those virtualized drivers because not setting it implies that the + * client is not capable of dealing with those extra restictions. + * Clients which do set cursor hotspot and treat the cursor plane + * like a mouse cursor should set this property. + * The client must enable _CLIENT_CAP_ATOMIC first. + * + * Setting this property on drivers which do not special case + * cursor planes (i.e. non-virtualized drivers) will return + * EOPNOTSUPP, which can be used by userspace to gauge + * requirements of the hardware/drivers they're running on. + * + * This capability is always supported for atomic-capable virtualized + * drivers starting from kernel version 6.5. + */ +#define DRM_CLIENT_CAP_VIRTUALIZED_CURSOR_PLANE6 + + /* DRM_IOCTL_SET_CLIENT_CAP ioctl argument type */ struct drm_set_client_cap { __u64 capability; -- 2.39.2
[PATCH v3 5/8] drm/vboxvideo: Use the hotspot properties from cursor planes
From: Zack Rusin Atomic modesetting got support for mouse hotspots via the hotspot properties. Port the legacy kms hotspot handling to the new properties on cursor planes. Signed-off-by: Zack Rusin Cc: Hans de Goede Cc: David Airlie Cc: Daniel Vetter --- drivers/gpu/drm/vboxvideo/vbox_mode.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/vboxvideo/vbox_mode.c b/drivers/gpu/drm/vboxvideo/vbox_mode.c index 341edd982cb3..9ff3bade9795 100644 --- a/drivers/gpu/drm/vboxvideo/vbox_mode.c +++ b/drivers/gpu/drm/vboxvideo/vbox_mode.c @@ -429,8 +429,8 @@ static void vbox_cursor_atomic_update(struct drm_plane *plane, flags = VBOX_MOUSE_POINTER_VISIBLE | VBOX_MOUSE_POINTER_SHAPE | VBOX_MOUSE_POINTER_ALPHA; hgsmi_update_pointer_shape(vbox->guest_pool, flags, - min_t(u32, max(fb->hot_x, 0), width), - min_t(u32, max(fb->hot_y, 0), height), + min_t(u32, max(new_state->hotspot_x, 0), width), + min_t(u32, max(new_state->hotspot_y, 0), height), width, height, vbox->cursor_data, data_size); mutex_unlock(>hw_mutex); -- 2.39.2
[PATCH v3 3/8] drm/vmwgfx: Use the hotspot properties from cursor planes
From: Zack Rusin Atomic modesetting got support for mouse hotspots via the hotspot properties. Port the legacy kms hotspot handling to the new properties on cursor planes. Signed-off-by: Zack Rusin Cc: Martin Krastev Cc: Maaz Mombasawala --- drivers/gpu/drm/vmwgfx/vmwgfx_kms.c | 9 ++--- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c index b62207be3363..de294dfe05d0 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c @@ -768,13 +768,8 @@ vmw_du_cursor_plane_atomic_update(struct drm_plane *plane, struct vmw_plane_state *old_vps = vmw_plane_state_to_vps(old_state); s32 hotspot_x, hotspot_y; - hotspot_x = du->hotspot_x; - hotspot_y = du->hotspot_y; - - if (new_state->fb) { - hotspot_x += new_state->fb->hot_x; - hotspot_y += new_state->fb->hot_y; - } + hotspot_x = du->hotspot_x + new_state->hotspot_x; + hotspot_y = du->hotspot_y + new_state->hotspot_y; du->cursor_surface = vps->surf; du->cursor_bo = vps->bo; -- 2.39.2
[PATCH v3 6/8] drm/virtio: Use the hotspot properties from cursor planes
From: Zack Rusin Atomic modesetting got support for mouse hotspots via the hotspot properties. Port the legacy kms hotspot handling to the new properties on cursor planes. Signed-off-by: Zack Rusin Reviewed-by: Gerd Hoffmann Cc: David Airlie Cc: Gurchetan Singh Cc: Chia-I Wu Cc: Daniel Vetter Cc: virtualizat...@lists.linux-foundation.org --- drivers/gpu/drm/virtio/virtgpu_plane.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/virtio/virtgpu_plane.c b/drivers/gpu/drm/virtio/virtgpu_plane.c index a2e045f3a000..20de599658c1 100644 --- a/drivers/gpu/drm/virtio/virtgpu_plane.c +++ b/drivers/gpu/drm/virtio/virtgpu_plane.c @@ -323,16 +323,16 @@ static void virtio_gpu_cursor_plane_update(struct drm_plane *plane, DRM_DEBUG("update, handle %d, pos +%d+%d, hot %d,%d\n", handle, plane->state->crtc_x, plane->state->crtc_y, - plane->state->fb ? plane->state->fb->hot_x : 0, - plane->state->fb ? plane->state->fb->hot_y : 0); + plane->state->hotspot_x, + plane->state->hotspot_y); output->cursor.hdr.type = cpu_to_le32(VIRTIO_GPU_CMD_UPDATE_CURSOR); output->cursor.resource_id = cpu_to_le32(handle); if (plane->state->fb) { output->cursor.hot_x = - cpu_to_le32(plane->state->fb->hot_x); + cpu_to_le32(plane->state->hotspot_x); output->cursor.hot_y = - cpu_to_le32(plane->state->fb->hot_y); + cpu_to_le32(plane->state->hotspot_y); } else { output->cursor.hot_x = cpu_to_le32(0); output->cursor.hot_y = cpu_to_le32(0); -- 2.39.2
[PATCH v3 4/8] drm/qxl: Use the hotspot properties from cursor planes
From: Zack Rusin Atomic modesetting got support for mouse hotspots via the hotspot properties. Port the legacy kms hotspot handling to the new properties on cursor planes. Signed-off-by: Zack Rusin Reviewed-by: Gerd Hoffmann Cc: Dave Airlie Cc: Daniel Vetter Cc: virtualizat...@lists.linux-foundation.org Cc: spice-de...@lists.freedesktop.org --- drivers/gpu/drm/qxl/qxl_display.c | 14 ++ 1 file changed, 6 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/qxl/qxl_display.c b/drivers/gpu/drm/qxl/qxl_display.c index 6492a70e3c39..5d689e0d3586 100644 --- a/drivers/gpu/drm/qxl/qxl_display.c +++ b/drivers/gpu/drm/qxl/qxl_display.c @@ -485,7 +485,6 @@ static int qxl_primary_atomic_check(struct drm_plane *plane, static int qxl_primary_apply_cursor(struct qxl_device *qdev, struct drm_plane_state *plane_state) { - struct drm_framebuffer *fb = plane_state->fb; struct qxl_crtc *qcrtc = to_qxl_crtc(plane_state->crtc); struct qxl_cursor_cmd *cmd; struct qxl_release *release; @@ -510,8 +509,8 @@ static int qxl_primary_apply_cursor(struct qxl_device *qdev, cmd = (struct qxl_cursor_cmd *)qxl_release_map(qdev, release); cmd->type = QXL_CURSOR_SET; - cmd->u.set.position.x = plane_state->crtc_x + fb->hot_x; - cmd->u.set.position.y = plane_state->crtc_y + fb->hot_y; + cmd->u.set.position.x = plane_state->crtc_x + plane_state->hotspot_x; + cmd->u.set.position.y = plane_state->crtc_y + plane_state->hotspot_y; cmd->u.set.shape = qxl_bo_physical_address(qdev, qcrtc->cursor_bo, 0); @@ -531,7 +530,6 @@ static int qxl_primary_apply_cursor(struct qxl_device *qdev, static int qxl_primary_move_cursor(struct qxl_device *qdev, struct drm_plane_state *plane_state) { - struct drm_framebuffer *fb = plane_state->fb; struct qxl_crtc *qcrtc = to_qxl_crtc(plane_state->crtc); struct qxl_cursor_cmd *cmd; struct qxl_release *release; @@ -554,8 +552,8 @@ static int qxl_primary_move_cursor(struct qxl_device *qdev, cmd = (struct qxl_cursor_cmd *)qxl_release_map(qdev, release); cmd->type = QXL_CURSOR_MOVE; - cmd->u.position.x = plane_state->crtc_x + fb->hot_x; - cmd->u.position.y = plane_state->crtc_y + fb->hot_y; + cmd->u.position.x = plane_state->crtc_x + plane_state->hotspot_x; + cmd->u.position.y = plane_state->crtc_y + plane_state->hotspot_y; qxl_release_unmap(qdev, release, >release_info); qxl_release_fence_buffer_objects(release); @@ -851,8 +849,8 @@ static int qxl_plane_prepare_fb(struct drm_plane *plane, struct qxl_bo *old_cursor_bo = qcrtc->cursor_bo; qcrtc->cursor_bo = qxl_create_cursor(qdev, user_bo, -new_state->fb->hot_x, -new_state->fb->hot_y); +new_state->hotspot_x, +new_state->hotspot_y); qxl_free_cursor(old_cursor_bo); } -- 2.39.2
[PATCH v3 1/8] drm: Disable the cursor plane on atomic contexts with virtualized drivers
From: Zack Rusin Cursor planes on virtualized drivers have special meaning and require that the clients handle them in specific ways, e.g. the cursor plane should react to the mouse movement the way a mouse cursor would be expected to and the client is required to set hotspot properties on it in order for the mouse events to be routed correctly. This breaks the contract as specified by the "universal planes". Fix it by disabling the cursor planes on virtualized drivers while adding a foundation on top of which it's possible to special case mouse cursor planes for clients that want it. Disabling the cursor planes makes some kms compositors which were broken, e.g. Weston, fallback to software cursor which works fine or at least better than currently while having no effect on others, e.g. gnome-shell or kwin, which put virtualized drivers on a deny-list when running in atomic context to make them fallback to legacy kms and avoid this issue. Signed-off-by: Zack Rusin Fixes: 681e7ec73044 ("drm: Allow userspace to ask for universal plane list (v2)") Cc: # v5.4+ Cc: Maarten Lankhorst Cc: Maxime Ripard Cc: Thomas Zimmermann Cc: David Airlie Cc: Daniel Vetter Cc: Dave Airlie Cc: Gerd Hoffmann Cc: Hans de Goede Cc: Gurchetan Singh Cc: Chia-I Wu Cc: dri-devel@lists.freedesktop.org Cc: virtualizat...@lists.linux-foundation.org Cc: spice-de...@lists.freedesktop.org --- drivers/gpu/drm/drm_plane.c | 13 + drivers/gpu/drm/qxl/qxl_drv.c| 2 +- drivers/gpu/drm/vboxvideo/vbox_drv.c | 2 +- drivers/gpu/drm/virtio/virtgpu_drv.c | 2 +- drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 2 +- include/drm/drm_drv.h| 9 + include/drm/drm_file.h | 12 7 files changed, 38 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/drm_plane.c b/drivers/gpu/drm/drm_plane.c index 24e7998d1731..a4a39f4834e2 100644 --- a/drivers/gpu/drm/drm_plane.c +++ b/drivers/gpu/drm/drm_plane.c @@ -678,6 +678,19 @@ int drm_mode_getplane_res(struct drm_device *dev, void *data, !file_priv->universal_planes) continue; + /* +* If we're running on a virtualized driver then, +* unless userspace advertizes support for the +* virtualized cursor plane, disable cursor planes +* because they'll be broken due to missing cursor +* hotspot info. +*/ + if (plane->type == DRM_PLANE_TYPE_CURSOR && + drm_core_check_feature(dev, DRIVER_CURSOR_HOTSPOT) && + file_priv->atomic && + !file_priv->supports_virtualized_cursor_plane) + continue; + if (drm_lease_held(file_priv, plane->base.id)) { if (count < plane_resp->count_planes && put_user(plane->base.id, plane_ptr + count)) diff --git a/drivers/gpu/drm/qxl/qxl_drv.c b/drivers/gpu/drm/qxl/qxl_drv.c index b30ede1cf62d..91930e84a9cd 100644 --- a/drivers/gpu/drm/qxl/qxl_drv.c +++ b/drivers/gpu/drm/qxl/qxl_drv.c @@ -283,7 +283,7 @@ static const struct drm_ioctl_desc qxl_ioctls[] = { }; static struct drm_driver qxl_driver = { - .driver_features = DRIVER_GEM | DRIVER_MODESET | DRIVER_ATOMIC, + .driver_features = DRIVER_GEM | DRIVER_MODESET | DRIVER_ATOMIC | DRIVER_CURSOR_HOTSPOT, .dumb_create = qxl_mode_dumb_create, .dumb_map_offset = drm_gem_ttm_dumb_map_offset, diff --git a/drivers/gpu/drm/vboxvideo/vbox_drv.c b/drivers/gpu/drm/vboxvideo/vbox_drv.c index 4fee15c97c34..8ecd0863fad7 100644 --- a/drivers/gpu/drm/vboxvideo/vbox_drv.c +++ b/drivers/gpu/drm/vboxvideo/vbox_drv.c @@ -172,7 +172,7 @@ DEFINE_DRM_GEM_FOPS(vbox_fops); static const struct drm_driver driver = { .driver_features = - DRIVER_MODESET | DRIVER_GEM | DRIVER_ATOMIC, + DRIVER_MODESET | DRIVER_GEM | DRIVER_ATOMIC | DRIVER_CURSOR_HOTSPOT, .fops = _fops, .name = DRIVER_NAME, diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.c b/drivers/gpu/drm/virtio/virtgpu_drv.c index a7ec5a3770da..8f4bb8a4e952 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.c +++ b/drivers/gpu/drm/virtio/virtgpu_drv.c @@ -176,7 +176,7 @@ static const struct drm_driver driver = { * If KMS is disabled DRIVER_MODESET and DRIVER_ATOMIC are masked * out via drm_device::driver_features: */ - .driver_features = DRIVER_MODESET | DRIVER_GEM | DRIVER_RENDER | DRIVER_ATOMIC, + .driver_features = DRIVER_MODESET | DRIVER_GEM | DRIVER_RENDER | DRIVER_ATOMIC | DRIVER_CURSOR_HOTSPOT, .open = virtio_gpu_driver_open, .postclose = virtio_gpu_driver_postclose, diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c index 8b24ecf60e3e..d3e308fdfd5b 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c @@
[PATCH v3 0/8] Fix cursor planes with virtualized drivers
From: Zack Rusin v3: Renames, fixes and cleanups suggested by Daniel, Simon and Pekka after v2. There's no major changes in functionality. Please let me know if I missed anything, it's been a while since v2. Virtualized drivers have had a lot of issues with cursor support on top of atomic modesetting. This set both fixes the long standing problems with atomic kms and virtualized drivers and adds code to let userspace use atomic kms on virtualized drivers while preserving functioning seamless cursors between the host and guest. The first change in the set is one that should be backported as far as possible, likely 5.4 stable, because earlier stable kernels do not have virtualbox driver. The change makes virtualized drivers stop exposing a cursor plane for atomic clients, this fixes mouse cursor on all well formed compositors which will automatically fallback to software cursor. The rest of the changes until the last one ports the legacy hotspot code to atomic plane properties. Finally the last change introduces userspace API to let userspace clients advertise the fact that they are aware of additional restrictions placed upon the cursor plane by virtualized drivers and lets them use atomic kms with virtualized drivers (the clients are expected to set hotspots correctly when advertising support for virtual cursor plane). Zack Rusin (8): drm: Disable the cursor plane on atomic contexts with virtualized drivers drm/atomic: Add support for mouse hotspots drm/vmwgfx: Use the hotspot properties from cursor planes drm/qxl: Use the hotspot properties from cursor planes drm/vboxvideo: Use the hotspot properties from cursor planes drm/virtio: Use the hotspot properties from cursor planes drm: Remove legacy cursor hotspot code drm: Introduce DRM_CLIENT_CAP_VIRTUALIZED_CURSOR_PLANE drivers/gpu/drm/drm_atomic_state_helper.c | 14 + drivers/gpu/drm/drm_atomic_uapi.c | 20 +++ drivers/gpu/drm/drm_ioctl.c | 9 drivers/gpu/drm/drm_plane.c | 65 ++- drivers/gpu/drm/qxl/qxl_display.c | 14 +++-- drivers/gpu/drm/qxl/qxl_drv.c | 2 +- drivers/gpu/drm/vboxvideo/vbox_drv.c | 2 +- drivers/gpu/drm/vboxvideo/vbox_mode.c | 4 +- drivers/gpu/drm/virtio/virtgpu_drv.c | 2 +- drivers/gpu/drm/virtio/virtgpu_plane.c| 8 +-- drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 2 +- drivers/gpu/drm/vmwgfx/vmwgfx_kms.c | 9 +--- include/drm/drm_drv.h | 9 include/drm/drm_file.h| 12 + include/drm/drm_framebuffer.h | 12 - include/drm/drm_plane.h | 15 ++ include/uapi/drm/drm.h| 26 + 17 files changed, 186 insertions(+), 39 deletions(-) -- 2.39.2
Re: [PATCH v5 03/11] drm/mediatek: gamma: Support SoC specific LUT size
Re: [Intel-gfx] [PATCH] drm/i915/guc: Dump perf_limit_reasons for debug
On Mon, 26 Jun 2023 19:12:18 -0700, Vinay Belgaumkar wrote: > > GuC load takes longer sometimes due to GT frequency not ramping up. > Add perf_limit_reasons to the existing warn print to see if frequency > is being throttled. > > Signed-off-by: Vinay Belgaumkar > --- > drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c > b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c > index 364d0d546ec8..73911536a8e7 100644 > --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c > +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c > @@ -254,6 +254,8 @@ static int guc_wait_ucode(struct intel_guc *guc) > guc_warn(guc, "excessive init time: %lldms! [freq = %dMHz, > before = %dMHz, status = 0x%08X, count = %d, ret = %d]\n", >delta_ms, > intel_rps_read_actual_frequency(>gt->rps), >before_freq, status, count, ret); > + guc_warn(guc, "perf limit reasons = 0x%08X\n", > + intel_uncore_read(uncore, > intel_gt_perf_limit_reasons_reg(gt))); Maybe just add at the end of the previous guc_warn? > } else { > guc_dbg(guc, "init took %lldms, freq = %dMHz, before = %dMHz, > status = 0x%08X, count = %d, ret = %d\n", > delta_ms, > intel_rps_read_actual_frequency(>gt->rps), > -- > 2.38.1 >
[PATCH] drm/i915/guc: Dump perf_limit_reasons for debug
GuC load takes longer sometimes due to GT frequency not ramping up. Add perf_limit_reasons to the existing warn print to see if frequency is being throttled. Signed-off-by: Vinay Belgaumkar --- drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c index 364d0d546ec8..73911536a8e7 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_fw.c @@ -254,6 +254,8 @@ static int guc_wait_ucode(struct intel_guc *guc) guc_warn(guc, "excessive init time: %lldms! [freq = %dMHz, before = %dMHz, status = 0x%08X, count = %d, ret = %d]\n", delta_ms, intel_rps_read_actual_frequency(>gt->rps), before_freq, status, count, ret); + guc_warn(guc, "perf limit reasons = 0x%08X\n", +intel_uncore_read(uncore, intel_gt_perf_limit_reasons_reg(gt))); } else { guc_dbg(guc, "init took %lldms, freq = %dMHz, before = %dMHz, status = 0x%08X, count = %d, ret = %d\n", delta_ms, intel_rps_read_actual_frequency(>gt->rps), -- 2.38.1
linux-next: manual merge of the drm tree with the drm-misc-fixes tree
Hi all, Today's linux-next merge of the drm tree got a conflict in: include/drm/gpu_scheduler.h between commit: db8b4968a8d0 ("drm/sched: Call drm_sched_fence_set_parent() from drm_sched_fence_scheduled()") from the drm-misc-fixes tree and commit: 539f9ee4b52a ("drm/scheduler: properly forward fence errors") from the drm tree. I fixed it up (see below) and can carry the fix as necessary. This is now fixed as far as linux-next is concerned, but any non trivial conflicts should be mentioned to your upstream maintainer when your tree is submitted for merging. You may also want to consider cooperating with the maintainer of the conflicting tree to minimise any particularly complex conflicts. -- Cheers, Stephen Rothwell diff --cc include/drm/gpu_scheduler.h index b29e347b10a9,e95b4837e5a3.. --- a/include/drm/gpu_scheduler.h +++ b/include/drm/gpu_scheduler.h @@@ -581,16 -581,18 +581,17 @@@ void drm_sched_entity_push_job(struct d void drm_sched_entity_set_priority(struct drm_sched_entity *entity, enum drm_sched_priority priority); bool drm_sched_entity_is_ready(struct drm_sched_entity *entity); + int drm_sched_entity_error(struct drm_sched_entity *entity); -void drm_sched_fence_set_parent(struct drm_sched_fence *s_fence, - struct dma_fence *fence); struct drm_sched_fence *drm_sched_fence_alloc( struct drm_sched_entity *s_entity, void *owner); void drm_sched_fence_init(struct drm_sched_fence *fence, struct drm_sched_entity *entity); void drm_sched_fence_free(struct drm_sched_fence *fence); -void drm_sched_fence_scheduled(struct drm_sched_fence *fence); +void drm_sched_fence_scheduled(struct drm_sched_fence *fence, + struct dma_fence *parent); - void drm_sched_fence_finished(struct drm_sched_fence *fence); + void drm_sched_fence_finished(struct drm_sched_fence *fence, int result); unsigned long drm_sched_suspend_timeout(struct drm_gpu_scheduler *sched); void drm_sched_resume_timeout(struct drm_gpu_scheduler *sched, pgpUI0jSomqvI.pgp Description: OpenPGP digital signature
Re: [RFC PATCH 0/3] Support for Solid Fill Planes
On Tue, 27 Jun 2023 at 03:45, Jessica Zhang wrote: > > > > On 6/26/2023 5:06 PM, Dmitry Baryshkov wrote: > > On 27/06/2023 02:02, Jessica Zhang wrote: > >> > >> > >> On 11/7/2022 11:37 AM, Ville Syrjälä wrote: > >>> On Fri, Oct 28, 2022 at 03:59:49PM -0700, Jessica Zhang wrote: > Introduce and add support for COLOR_FILL and COLOR_FILL_FORMAT > properties. When the color fill value is set, and the framebuffer is > set > to NULL, memory fetch will be disabled. > >>> > >>> Thinking a bit more universally I wonder if there should be > >>> some kind of enum property: > >>> > >>> enum plane_pixel_source { > >>> FB, > >>> COLOR, > >>> LIVE_FOO, > >>> LIVE_BAR, > >>> ... > >>> } > >> > >> Reviving this thread as this was the initial comment suggesting to > >> implement pixel_source as an enum. > >> > >> I think the issue with having pixel_source as an enum is how to decide > >> what counts as a NULL commit. > >> > >> Currently, setting the FB to NULL will disable the plane. So I'm > >> guessing we will extend that logic to "if there's no pixel_source set > >> for the plane, then it will be a NULL commit and disable the plane". > >> > >> In that case, the question then becomes when to set the pixel_source > >> to NONE. Because if we do that when setting a NULL FB (or NULL > >> solid_fill blob), it then forces userspace to set one property before > >> the other. > > > > Why? The userspace should use atomic commits and as such it should all > > properties at the same time. > > Correct, userspace will set all the properties within the same atomic > commit. The issue happens when the driver iterates through each property > within the MODE_ATOMIC ioctl [1]. > > For reference, I'm thinking of an implementation where we're setting the > pixel_source within drm_atomic_plane_set_property(). > > So something like: > > drm_atomic_plane_set_property( ... ) > { > if (property == config->prop_fb_id) { > if (fb) > state->pixel_source = FB; > else > state->pixel_source = NONE; > } else if (property == config->prop_solid_fill) { > if (solid_fill_blob) > state->pixel_source = SOLID_FILL; > } > > // ... > } I think this is somewhat overcomplicated. Allow userspace to set these properties as it sees fit and then in drm_atomic_helper_check_plane_state() consider all of them to set plane_state->visible. We still have to remain compatible with older userspace (esp. with a non-atomic one). It expects that a plane is enabled after setting both CRTC and FB. So maybe you are right and we should force pixel_source to FB if FB is set. > > If userspace sets solid_fill to a valid blob and FB to NULL, it's > possible for driver to first set the solid_fill property then set the > fb_id property later. This would cause pixel_source to be set to NONE > after all properties have been set. > > I've also considered an implementation without the `pixel_source = NONE` > line in the prop_fb_id case, but we would still need to find somewhere > to set the pixel_source to NONE in order to allow userspace to disable a > plane. Good point. I don't think we would need NONE (just setting CRTC to none or FB to none and pixel_source to FB would disable the plane), but I might be missing something here. > > [1] > https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_atomic_uapi.c#L1385 > > > > >> Because of that, I'm thinking of having pixel_source be represented by > >> a bitmask instead. That way, we will simply unset the corresponding > >> pixel_source bit when passing in a NULL FB/solid_fill blob. Then, in > >> order to detect whether a commit is NULL or has a valid pixel source, > >> we can just check if pixel_source == 0. > >> > >> Would be interested in any feedback on this. > > > > This is an interesting idea. Frankly speaking, I'd consider it > > counter-intuitive at the first glance. > > > > Consider it to act as a curtain. Setup the curtain (by writing the fill > > colour property). Then one can close the curtain (by switching source to > > colour), or open it (by switching to any other source). Bitmask wouldn't > > complicate this. > > So if I'm understanding your analogy correctly, pixel_source won't > necessarily be set whenever the FB or solid_fill properties are set. So > that way we can have both FB *and* solid_fill set at the same time, but > only the source that pixel_source is set to would be displayed. Yes. And if the source is not configured, the plane will be marked as not visible. > > Thanks, > > Jessica Zhang > > > > >> > >> Thanks, > >> > >> Jessica Zhang > >> > >>> > In addition, loosen the NULL FB checks within the atomic commit > callstack > to allow a NULL FB when color_fill is nonzero and add FB checks in > methods where the FB was previously assumed to be non-NULL. > > Finally, have the DPU driver use drm_plane_state.color_fill and >
Re: [RFC PATCH 0/3] Support for Solid Fill Planes
On 6/26/2023 5:06 PM, Dmitry Baryshkov wrote: On 27/06/2023 02:02, Jessica Zhang wrote: On 11/7/2022 11:37 AM, Ville Syrjälä wrote: On Fri, Oct 28, 2022 at 03:59:49PM -0700, Jessica Zhang wrote: Introduce and add support for COLOR_FILL and COLOR_FILL_FORMAT properties. When the color fill value is set, and the framebuffer is set to NULL, memory fetch will be disabled. Thinking a bit more universally I wonder if there should be some kind of enum property: enum plane_pixel_source { FB, COLOR, LIVE_FOO, LIVE_BAR, ... } Reviving this thread as this was the initial comment suggesting to implement pixel_source as an enum. I think the issue with having pixel_source as an enum is how to decide what counts as a NULL commit. Currently, setting the FB to NULL will disable the plane. So I'm guessing we will extend that logic to "if there's no pixel_source set for the plane, then it will be a NULL commit and disable the plane". In that case, the question then becomes when to set the pixel_source to NONE. Because if we do that when setting a NULL FB (or NULL solid_fill blob), it then forces userspace to set one property before the other. Why? The userspace should use atomic commits and as such it should all properties at the same time. Correct, userspace will set all the properties within the same atomic commit. The issue happens when the driver iterates through each property within the MODE_ATOMIC ioctl [1]. For reference, I'm thinking of an implementation where we're setting the pixel_source within drm_atomic_plane_set_property(). So something like: drm_atomic_plane_set_property( ... ) { if (property == config->prop_fb_id) { if (fb) state->pixel_source = FB; else state->pixel_source = NONE; } else if (property == config->prop_solid_fill) { if (solid_fill_blob) state->pixel_source = SOLID_FILL; } // ... } If userspace sets solid_fill to a valid blob and FB to NULL, it's possible for driver to first set the solid_fill property then set the fb_id property later. This would cause pixel_source to be set to NONE after all properties have been set. I've also considered an implementation without the `pixel_source = NONE` line in the prop_fb_id case, but we would still need to find somewhere to set the pixel_source to NONE in order to allow userspace to disable a plane. [1] https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_atomic_uapi.c#L1385 Because of that, I'm thinking of having pixel_source be represented by a bitmask instead. That way, we will simply unset the corresponding pixel_source bit when passing in a NULL FB/solid_fill blob. Then, in order to detect whether a commit is NULL or has a valid pixel source, we can just check if pixel_source == 0. Would be interested in any feedback on this. This is an interesting idea. Frankly speaking, I'd consider it counter-intuitive at the first glance. Consider it to act as a curtain. Setup the curtain (by writing the fill colour property). Then one can close the curtain (by switching source to colour), or open it (by switching to any other source). Bitmask wouldn't complicate this. So if I'm understanding your analogy correctly, pixel_source won't necessarily be set whenever the FB or solid_fill properties are set. So that way we can have both FB *and* solid_fill set at the same time, but only the source that pixel_source is set to would be displayed. Thanks, Jessica Zhang Thanks, Jessica Zhang In addition, loosen the NULL FB checks within the atomic commit callstack to allow a NULL FB when color_fill is nonzero and add FB checks in methods where the FB was previously assumed to be non-NULL. Finally, have the DPU driver use drm_plane_state.color_fill and drm_plane_state.color_fill_format instead of dpu_plane_state.color_fill, and add extra checks in the DPU atomic commit callstack to account for a NULL FB in cases where color_fill is set. Some drivers support hardware that have optimizations for solid fill planes. This series aims to expose these capabilities to userspace as some compositors have a solid fill flag (ex. SOLID_COLOR in the Android hardware composer HAL) that can be set by apps like the Android Gears app. Userspace can set the color_fill value by setting COLOR_FILL_FORMAT to a DRM format, setting COLOR_FILL to a color fill value, and setting the framebuffer to NULL. Is there some real reason for the format property? Ie. why not just do what was the plan for the crttc background color and specify the color in full 16bpc format and just pick as many msbs from that as the hw can use? Jessica Zhang (3): drm: Introduce color fill properties for drm plane drm: Adjust atomic checks for solid fill color drm/msm/dpu: Use color_fill property for DPU planes drivers/gpu/drm/drm_atomic.c | 68 ---
Re: [PATCH v10 00/25] DEPT(Dependency Tracker)
On Mon, Jun 26, 2023 at 03:02:22PM +0200, Greg KH wrote: > On Mon, Jun 26, 2023 at 08:56:35PM +0900, Byungchul Park wrote: > > >From now on, I can work on LKML again! I'm wondering if DEPT has been > > helping kernel debugging well even though it's a form of patches yet. > > > > I'm happy to see that DEPT reports a real problem in practice. See: > > > > > > https://lore.kernel.org/lkml/6383cde5-cf4b-facf-6e07-1378a4856...@i-love.sakura.ne.jp/#t > > > > https://lore.kernel.org/lkml/1674268856-31807-1-git-send-email-byungchul.p...@lge.com/ > > > > Nevertheless, I apologize for the lack of document. I promise to add it > > before it gets needed to use DEPT's APIs by users. For now, you can use > > DEPT just with CONFIG_DEPT on. > > > > --- > > > > Hi Linus and folks, > > > > I've been developing a tool for detecting deadlock possibilities by > > tracking wait/event rather than lock(?) acquisition order to try to > > cover all synchonization machanisms. It's done on v6.2-rc2. > > > > Benifit: > > > > 0. Works with all lock primitives. > > 1. Works with wait_for_completion()/complete(). > > 2. Works with 'wait' on PG_locked. > > 3. Works with 'wait' on PG_writeback. > > 4. Works with swait/wakeup. > > 5. Works with waitqueue. > > 6. Works with wait_bit. > > 7. Multiple reports are allowed. > > 8. Deduplication control on multiple reports. > > 9. Withstand false positives thanks to 6. > > 10. Easy to tag any wait/event. > > > > Future work: > > > > 0. To make it more stable. > > 1. To separates Dept from Lockdep. > > 2. To improves performance in terms of time and space. > > 3. To use Dept as a dependency engine for Lockdep. > > 4. To add any missing tags of wait/event in the kernel. > > 5. To deduplicate stack trace. > > If you run this today, does it find any issues with any subsystems / > drivers that the current lockdep code does not find? Have you run your Yes, it found some deadlocks. The following issue was about a deadlock by PG_locked detected by DEPT which lockdep couldn't. Check it out: https://lore.kernel.org/lkml/6383cde5-cf4b-facf-6e07-1378a4856...@i-love.sakura.ne.jp/#t https://lore.kernel.org/lkml/1674268856-31807-1-git-send-email-byungchul.p...@lge.com/ > tool on patches sent to the different mailing lists for new drivers and > code added to the tree to verify that it can find issues easily? I had been co-working with GPU driver developers for their new drivers adding to their CI system to verify that it can find issues easily. Now that I've almost organized my stuff, I will re-start it. > In other words, why do we need this at all? What makes it 'better' than > what we already have that works for us today? What benifit is it? AS IS : It can detect deadlocks by wrong lock usage e.g. acqusition order. Once it reports a issue, you must resolve it or work around to see further reports even if it's not one you are into. TO BE : It can detect deadlocks by not only locks but also any waits e.g. wait_for_completion(), PG_locked, PG_writeback, dma fence and so on. Last but not least, DEPT can report issues multiple times at a single system-up so that any issues that you are not into, no longer prevent further reports that is valuable to you. However, yes. DEPT needs to be more matured. I'd like to do that together. Byungchul > thanks, > > greg k-h
Re: [RFC PATCH 0/3] Support for Solid Fill Planes
On 27/06/2023 02:02, Jessica Zhang wrote: On 11/7/2022 11:37 AM, Ville Syrjälä wrote: On Fri, Oct 28, 2022 at 03:59:49PM -0700, Jessica Zhang wrote: Introduce and add support for COLOR_FILL and COLOR_FILL_FORMAT properties. When the color fill value is set, and the framebuffer is set to NULL, memory fetch will be disabled. Thinking a bit more universally I wonder if there should be some kind of enum property: enum plane_pixel_source { FB, COLOR, LIVE_FOO, LIVE_BAR, ... } Reviving this thread as this was the initial comment suggesting to implement pixel_source as an enum. I think the issue with having pixel_source as an enum is how to decide what counts as a NULL commit. Currently, setting the FB to NULL will disable the plane. So I'm guessing we will extend that logic to "if there's no pixel_source set for the plane, then it will be a NULL commit and disable the plane". In that case, the question then becomes when to set the pixel_source to NONE. Because if we do that when setting a NULL FB (or NULL solid_fill blob), it then forces userspace to set one property before the other. Why? The userspace should use atomic commits and as such it should all properties at the same time. Because of that, I'm thinking of having pixel_source be represented by a bitmask instead. That way, we will simply unset the corresponding pixel_source bit when passing in a NULL FB/solid_fill blob. Then, in order to detect whether a commit is NULL or has a valid pixel source, we can just check if pixel_source == 0. Would be interested in any feedback on this. This is an interesting idea. Frankly speaking, I'd consider it counter-intuitive at the first glance. Consider it to act as a curtain. Setup the curtain (by writing the fill colour property). Then one can close the curtain (by switching source to colour), or open it (by switching to any other source). Bitmask wouldn't complicate this. Thanks, Jessica Zhang In addition, loosen the NULL FB checks within the atomic commit callstack to allow a NULL FB when color_fill is nonzero and add FB checks in methods where the FB was previously assumed to be non-NULL. Finally, have the DPU driver use drm_plane_state.color_fill and drm_plane_state.color_fill_format instead of dpu_plane_state.color_fill, and add extra checks in the DPU atomic commit callstack to account for a NULL FB in cases where color_fill is set. Some drivers support hardware that have optimizations for solid fill planes. This series aims to expose these capabilities to userspace as some compositors have a solid fill flag (ex. SOLID_COLOR in the Android hardware composer HAL) that can be set by apps like the Android Gears app. Userspace can set the color_fill value by setting COLOR_FILL_FORMAT to a DRM format, setting COLOR_FILL to a color fill value, and setting the framebuffer to NULL. Is there some real reason for the format property? Ie. why not just do what was the plan for the crttc background color and specify the color in full 16bpc format and just pick as many msbs from that as the hw can use? Jessica Zhang (3): drm: Introduce color fill properties for drm plane drm: Adjust atomic checks for solid fill color drm/msm/dpu: Use color_fill property for DPU planes drivers/gpu/drm/drm_atomic.c | 68 --- drivers/gpu/drm/drm_atomic_helper.c | 34 +++- drivers/gpu/drm/drm_atomic_uapi.c | 8 +++ drivers/gpu/drm/drm_blend.c | 38 + drivers/gpu/drm/drm_plane.c | 8 +-- drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c | 7 ++- drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 66 ++ include/drm/drm_atomic_helper.h | 5 +- include/drm/drm_blend.h | 2 + include/drm/drm_plane.h | 28 ++ 10 files changed, 188 insertions(+), 76 deletions(-) -- 2.38.0 -- Ville Syrjälä Intel -- With best wishes Dmitry
Re: [RFC PATCH 0/3] Support for Solid Fill Planes
On 11/7/2022 11:37 AM, Ville Syrjälä wrote: On Fri, Oct 28, 2022 at 03:59:49PM -0700, Jessica Zhang wrote: Introduce and add support for COLOR_FILL and COLOR_FILL_FORMAT properties. When the color fill value is set, and the framebuffer is set to NULL, memory fetch will be disabled. Thinking a bit more universally I wonder if there should be some kind of enum property: enum plane_pixel_source { FB, COLOR, LIVE_FOO, LIVE_BAR, ... } Reviving this thread as this was the initial comment suggesting to implement pixel_source as an enum. I think the issue with having pixel_source as an enum is how to decide what counts as a NULL commit. Currently, setting the FB to NULL will disable the plane. So I'm guessing we will extend that logic to "if there's no pixel_source set for the plane, then it will be a NULL commit and disable the plane". In that case, the question then becomes when to set the pixel_source to NONE. Because if we do that when setting a NULL FB (or NULL solid_fill blob), it then forces userspace to set one property before the other. Because of that, I'm thinking of having pixel_source be represented by a bitmask instead. That way, we will simply unset the corresponding pixel_source bit when passing in a NULL FB/solid_fill blob. Then, in order to detect whether a commit is NULL or has a valid pixel source, we can just check if pixel_source == 0. Would be interested in any feedback on this. Thanks, Jessica Zhang In addition, loosen the NULL FB checks within the atomic commit callstack to allow a NULL FB when color_fill is nonzero and add FB checks in methods where the FB was previously assumed to be non-NULL. Finally, have the DPU driver use drm_plane_state.color_fill and drm_plane_state.color_fill_format instead of dpu_plane_state.color_fill, and add extra checks in the DPU atomic commit callstack to account for a NULL FB in cases where color_fill is set. Some drivers support hardware that have optimizations for solid fill planes. This series aims to expose these capabilities to userspace as some compositors have a solid fill flag (ex. SOLID_COLOR in the Android hardware composer HAL) that can be set by apps like the Android Gears app. Userspace can set the color_fill value by setting COLOR_FILL_FORMAT to a DRM format, setting COLOR_FILL to a color fill value, and setting the framebuffer to NULL. Is there some real reason for the format property? Ie. why not just do what was the plan for the crttc background color and specify the color in full 16bpc format and just pick as many msbs from that as the hw can use? Jessica Zhang (3): drm: Introduce color fill properties for drm plane drm: Adjust atomic checks for solid fill color drm/msm/dpu: Use color_fill property for DPU planes drivers/gpu/drm/drm_atomic.c | 68 --- drivers/gpu/drm/drm_atomic_helper.c | 34 +++- drivers/gpu/drm/drm_atomic_uapi.c | 8 +++ drivers/gpu/drm/drm_blend.c | 38 + drivers/gpu/drm/drm_plane.c | 8 +-- drivers/gpu/drm/msm/disp/dpu1/dpu_crtc.c | 7 ++- drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 66 ++ include/drm/drm_atomic_helper.h | 5 +- include/drm/drm_blend.h | 2 + include/drm/drm_plane.h | 28 ++ 10 files changed, 188 insertions(+), 76 deletions(-) -- 2.38.0 -- Ville Syrjälä Intel
Re: [PATCH v2 08/10] HID: i2c-hid: Support being a panel follower
Benjamin, On Thu, Jun 8, 2023 at 8:37 AM Benjamin Tissoires wrote: > > > +static const struct drm_panel_follower_funcs > > i2c_hid_core_panel_follower_funcs = { > > + .panel_prepared = i2c_hid_core_panel_prepared, > > + .panel_unpreparing = i2c_hid_core_panel_unpreparing, > > +}; > > Can we make that above block at least behind a Kconfig? > > i2c-hid is often used for touchpads, and the notion of drm panel has > nothing to do with them. So I'd be more confident if we could disable > that code if not required. Now that other concerns are addressed, I started trying to write up a v3 and I found myself writing this as the description of the Kconfig entry: -- config I2C_HID_SUPPORT_PANEL_FOLLOWER bool "Support i2c-hid devices that must be power sequenced with a panel" Say Y here if you want support for i2c-hid devices that need to coordinate power sequencing with a panel. This is typically important when you have a panel and a touchscreen that share power rails or reset GPIOs. If you say N here then the kernel will not try to honor any shared power sequencing for your hardware. In the best case, ignoring power sequencing when it's needed will draw extra power. In the worst case this will prevent your hardware from functioning or could even damage your hardware. If unsure, say Y. -- I can certainly go that way, but I just wanted to truly make sure that's what we want. Specifically: 1. If we put the panel follower code behind a Kconfig then we actually have no idea if a touchscreen was intended to be a panel follower. Specifically the panel follower API is the one that detects the connection between the panel and the i2c-hid device, so without being able to call the panel follower API we have no idea that an i2c-hid device was supposed to be a panel follower. 2. It is conceivable that power sequencing a device incorrectly could truly cause hardware damage. Together, those points mean that if you turn off the Kconfig entry and then try to boot on a device that needed that Kconfig setting that you might damage hardware. I can code it up that way if you want, but it worries me... Alternatives that I can think of: a) I could change the panel follower API so that panel followers are in charge of detecting the panel that they follow. Today, that looks like: panel_np = of_parse_phandle(dev->of_node, "panel", 0); if (panel_np) /* It's a panel follower */ of_node_put(panel_np); ...so we could put that code in each touchscreen driver and then fail to probe i2c-hid if we detect that we're supposed to be a panel follower but the Kconfig is turned off. The above doesn't seem massively ideal since it duplicates code. Also, one reason why I put that code in drm_panel_add_follower() is that I think this concept will eventually be needed even for non-DT cases. I don't know how to write the non-DT code right now, though... b) I could open-code detect the panel follower case but leave the actual linking to the panel follower API. AKA add to i2c-hid: if (of_property_read_bool(dev->of_node, "panel")) /* It's a panel follower */ ...that's a smaller bit of code, but feels like an abstraction violation. It also would need to be updated if/when we added support for non-DT panel followers. c) I could add a "static inline" implementation of b) to "drm_panel.h". That sounds great and I started doing it. ...but then realized that it means adding to drm_panel.h: #include #include ...because otherwise of_property_read_bool() isn't defined and "struct device" can't be dereferenced. That might be OK, but it looks as if folks have been working hard to avoid things like this in header files. Presumably it would get uglier if/when we added the non-DT case, as well. That being said, I can give it a shot... -- At this point, I'm hoping for some advice. How important is it for you to have a Kconfig for "I2C_HID_SUPPORT_PANEL_FOLLOWER"? NOTE: even if I don't add the Kconfig, I could at least create a function for registering the panel follower that would get most of the panel follower logic out of the probe function. Would that be enough? Thanks! -Doug
Re: [PATCH 06/15] dt-bindings: display/msm: sc7180-dpu: Describe SM6125
On 26.06.2023 22:28, Marijn Suijten wrote: > On 2023-06-26 20:57:51, Konrad Dybcio wrote: >> On 26.06.2023 19:54, Marijn Suijten wrote: >>> On 2023-06-26 18:16:58, Krzysztof Kozlowski wrote: On 25/06/2023 21:52, Marijn Suijten wrote: > On 2023-06-24 11:12:52, Krzysztof Kozlowski wrote: >> On 24/06/2023 02:41, Marijn Suijten wrote: >>> SM6125 is identical to SM6375 except that while downstream also defines >>> a throttle clock, its presence results in timeouts whereas SM6375 >>> requires it to not observe any timeouts. >> >> Then it should not be allowed, so you need either "else:" block or >> another "if: properties: compatible:" to disallow it. Because in current >> patch it would be allowed. > > That means this binding is wrong/incomplete for all other SoCs then. > clock(-name)s has 6 items, and sets `minItems: 6`. Only for sm6375-dpu >>> >>> Of course meant to say that clock(-name)s has **7** items, not 6. >>> > does it set `minItems: 7`, but an else case is missing. Ask the author why it is done like this. >>> >>> Konrad, can you clarify why other > > (Looks like I forgot to complete this sentence before sending, > apologies) > >> 6375 needs the throttle clk and the clock(-names) are strongly ordered >> so having minItems: 6 discards the last entry > > The question is whether or not we should have maxItems: 6 to disallow > the clock from being passed: right now it is optional and either is > allowed for any !6375 SoC. That's a very good question. I don't have a 7180 to test, but for you it seems to cause inexplicable issues on 6125.. Konrad > > - Marijn > >> >> Konrad >>> > Shall I send a Fixes: ed41005f5b7c ("dt-bindings: display/msm: > sc7180-dpu: Describe SM6350 and SM6375") for that, and should maxItems: > 6 be the default under clock(-name)s or in an else:? There is no bug to fix. Or at least it is not yet known. Whether other devices should be constrained as well - sure, sounds reasonable, but I did not check the code exactly. >>> >>> I don't know either, but we need this information to decide whether to >>> use `maxItems: 6`: >>> >>> 1. Directly on the property; >>> 2. In an `else:` case on the current `if: sm6375-dpu` (should have the >>>same effect as 1., afaik); >>> 3. In a second `if:` case that lists all SoCS explicitly. >>> >>> Since we don't have this information, I think option 3. is the right way >>> to go, setting `maxItems: 6` for qcom,sm6125-dpu. >>> >>> However, it is not yet understood why downstream is able to use the >>> throttle clock without repercussions. >>> We talk here about this patch. >>> >>> We used this patch to discover that other SoCs are similarly >>> unconstrained. But if you don't want me to look into it, by all means! >>> Saves me a lot of time. So I will go with option 3. >>> >>> - Marijn
Re: [PATCH drm-next v5 03/14] drm: manager to keep track of GPUs VA mappings
> > As pointed out by Christian, this would optimize the "get all mappings > > backed by a specific BO from a given VM" use case. > > > > The question for me is, do other drivers than amdgpu commonly need this? > > I have no idea. > > > > > And what does amdgpu need this for? Maybe amdgpu does something we're > > not doing (yet)? > > Basically when we do a CS we need to make sure that the VM used by this > CS is up to date. For this we walk over the relocation list of BOs and > check the status of each BO+VM structure. > > This is done because we don't want to update all VMs at the same time, > but rather just those who needs the update. This seems like a legacy from GL and possibly older vulkan, going forward vulkan can't rely on passing lists of objects into the kernel due to things like buffer_device_address, I'm not sure we should optimise for this situation, and moving the tracking list into the drivers is going to mean having a bunch of drivers all having the same boilerplate, to do the same thing just so amdgpu can't avoid doing it. Now there might be some benchmark somewhere we can produce a benefit in this, and if there is then we should consider going this way for all drivers and not just allowing drivers to choose their own path here. > > > > Christian - I know you didn't ask for "do it the way amdgpu does", > > instead you voted for keeping it entirely driver specific. But I think > > everyone is pretty close and I'm still optimistic that we could just > > generalize this. > > Well, you should *not* necessarily do it like amdgpu does! Basically the > implementation in amdgpu was driven by requirements, e.g. we need that, > let's do it like this. > > It's perfectly possible that other requirements (e.g. focus on Vulkan) > lead to a completely different implementation. > > It's just that ideally I would like to have an implementation where I > can apply at least the basics to amdgpu as well. > I think we can still do that just either have an option to disable using the list internally in the gpuva or have the driver keep it's own tracking alongside, there may still be use cases where it can use the gpuva tracking instead of it's own. I don't think we should forklift what is pretty likely to be common code across every driver that uses this going forward just to benefit an amdgpu design decision for older stacks. Dave.
Re: [PATCH v4 1/1] drm/doc: Document DRM device reset expectations
Hi, On 6/26/23 11:33, André Almeida wrote: > Create a section that specifies how to deal with DRM device resets for > kernel and userspace drivers. > > Signed-off-by: André Almeida > --- > Documentation/gpu/drm-uapi.rst | 68 ++ > 1 file changed, 68 insertions(+) > > diff --git a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-uapi.rst > index 65fb3036a580..25a11b9b98fa 100644 > --- a/Documentation/gpu/drm-uapi.rst > +++ b/Documentation/gpu/drm-uapi.rst > @@ -285,6 +285,74 @@ for GPU1 and GPU2 from different vendors, and a third > handler for > mmapped regular files. Threads cause additional pain with signal > handling as well. > > +Device reset > + > + > +The GPU stack is really complex and is prone to errors, from hardware bugs, > +faulty applications and everything in between the many layers. Some errors > +require resetting the device in order to make the device usable again. This > +section describes what is the expectations for DRM and usermode drivers when > a sections describes the expectations for DRM and usermode drivers when a > +device resets and how to propagate the reset status. > + > +Kernel Mode Driver > +-- > + > +The KMD is responsible for checking if the device needs a reset, and to > perform > +it as needed. Usually a hang is detected when a job gets stuck executing. KMD > +should keep track of resets, because userspace can query any time about the > +reset stats for an specific context. This is needed to propagate to the rest > of > +the stack that a reset has happened. Currently, this is implemented by each > +driver separately, with no common DRM interface. > + > +User Mode Driver > + > + > +The UMD should check before submitting new commands to the KMD if the device > has > +been reset, and this can be checked more often if it requires to. After more often if the UMD requires it. After > +detecting a reset, UMD will then proceed to report it to the application > using > +the appropriated API error code, as explained in the below section about appropriate the section below about > +robustness. > + > +Robustness > +-- > + > +The only way to try to keep an application working after a reset is if it > +complies with the robustness aspects of the graphical API that it is using. > + > +Graphical APIs provide ways to application to deal with device resets. > However, to applications > +there is no guarantee that the app will be correctly using such features, and will use such features correctly, and a // or "and the" > +UMD can implement policies to close the app if it is a repeating offender, > +likely in a broken loop. This is done to ensure that it does not keeps > blocking keep > +the user interface from being correctly displayed. This should be done even > if > +the app is correct but happens to trigger some bug in the hardware/driver. > + > +OpenGL > +~~ > + > +Apps using OpenGL should use the available robust interfaces, like the > +extension ``GL_ARB_robustness`` (or ``GL_EXT_robustness`` for OpenGL ES). > This > +interface tells if a reset has happened, and if so, all the context state is > +considered lost and the app proceeds by creating new ones. If is possible to If it is possible to > +determine that robustness is not in use, UMD will terminate the app when a > reset the UMD > +is detected, giving that the contexts are lost and the app won't be able to > +figure this out and recreate the contexts. > + > +Vulkan > +~~ > + > +Apps using Vulkan should check for ``VK_ERROR_DEVICE_LOST`` for submissions. > +This error code means, among other things, that a device reset has happened > and > +it needs to recreate the contexts to keep going. > + > +Reporting resets causes That's an awkward heading. How about: Reporting causes of resets -- > +--- > + > +Apart from propagating the reset through the stack so apps can recover, it's > +really useful for driver developers to learn more about what caused the > reset in > +first place. DRM devices should make use of devcoredump to store relevant > +information about the reset, so this information can be added to user bug > +reports. > + > .. _drm_driver_ioctl: > > IOCTL Support on Device Nodes thanks for the documentation. -- ~Randy
[PATCH v10 21/25] dept: Apply timeout consideration to hashed-waitqueue wait
Now that CONFIG_DEPT_AGGRESSIVE_TIMEOUT_WAIT was introduced, apply the consideration to hashed-waitqueue wait, assuming an input 'ret' in ___wait_var_event() macro is used as a timeout value. Signed-off-by: Byungchul Park --- include/linux/wait_bit.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/wait_bit.h b/include/linux/wait_bit.h index fe89282c3e96..3ef450d9a7c5 100644 --- a/include/linux/wait_bit.h +++ b/include/linux/wait_bit.h @@ -247,7 +247,7 @@ extern wait_queue_head_t *__var_waitqueue(void *p); struct wait_bit_queue_entry __wbq_entry;\ long __ret = ret; /* explicit shadow */ \ \ - sdt_might_sleep_start(NULL);\ + sdt_might_sleep_start_timeout(NULL, __ret); \ init_wait_var_entry(&__wbq_entry, var, \ exclusive ? WQ_FLAG_EXCLUSIVE : 0); \ for (;;) { \ -- 2.17.1
[PATCH v10 23/25] dept: Record the latest one out of consecutive waits of the same class
The current code records all the waits for later use to track relation between waits and events in each context. However, since the same class is handled the same way, it'd be okay to record only one on behalf of the others if they all have the same class. Even though it's the ideal to search the whole history buffer for that, since it'd cost too high, alternatively, let's keep the latest one at least when the same class'ed waits consecutively appear. Signed-off-by: Byungchul Park --- kernel/dependency/dept.c | 21 - 1 file changed, 20 insertions(+), 1 deletion(-) diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c index 52537c099b68..cdfda4acff58 100644 --- a/kernel/dependency/dept.c +++ b/kernel/dependency/dept.c @@ -1522,9 +1522,28 @@ static inline struct dept_wait_hist *new_hist(void) return wh; } +static inline struct dept_wait_hist *last_hist(void) +{ + int pos_n = hist_pos_next(); + struct dept_wait_hist *wh_n = hist(pos_n); + + /* +* This is the first try. +*/ + if (!pos_n && !wh_n->wait) + return NULL; + + return hist(pos_n + DEPT_MAX_WAIT_HIST - 1); +} + static void add_hist(struct dept_wait *w, unsigned int wg, unsigned int ctxt_id) { - struct dept_wait_hist *wh = new_hist(); + struct dept_wait_hist *wh; + + wh = last_hist(); + + if (!wh || wh->wait->class != w->class || wh->ctxt_id != ctxt_id) + wh = new_hist(); if (likely(wh->wait)) put_wait(wh->wait); -- 2.17.1
[PATCH v10 18/25] dept: Apply timeout consideration to wait_for_completion()/complete()
Now that CONFIG_DEPT_AGGRESSIVE_TIMEOUT_WAIT was introduced, apply the consideration to wait_for_completion()/complete(). Signed-off-by: Byungchul Park --- include/linux/completion.h | 4 ++-- kernel/sched/completion.c | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/include/linux/completion.h b/include/linux/completion.h index 32d535abebf3..15eede01a451 100644 --- a/include/linux/completion.h +++ b/include/linux/completion.h @@ -41,9 +41,9 @@ do { \ */ #define init_completion_map(x, m) init_completion(x) -static inline void complete_acquire(struct completion *x) +static inline void complete_acquire(struct completion *x, long timeout) { - sdt_might_sleep_start(>dmap); + sdt_might_sleep_start_timeout(>dmap, timeout); } static inline void complete_release(struct completion *x) diff --git a/kernel/sched/completion.c b/kernel/sched/completion.c index d57a5c1c1cd9..261807fa7118 100644 --- a/kernel/sched/completion.c +++ b/kernel/sched/completion.c @@ -100,7 +100,7 @@ __wait_for_common(struct completion *x, { might_sleep(); - complete_acquire(x); + complete_acquire(x, timeout); raw_spin_lock_irq(>wait.lock); timeout = do_wait_for_common(x, action, timeout, state); -- 2.17.1
[PATCH v10 10/25] dept: Apply sdt_might_sleep_{start, end}() to waitqueue wait
Makes Dept able to track dependencies by waitqueue waits. Signed-off-by: Byungchul Park --- include/linux/wait.h | 3 +++ 1 file changed, 3 insertions(+) diff --git a/include/linux/wait.h b/include/linux/wait.h index a0307b516b09..ff349e609da7 100644 --- a/include/linux/wait.h +++ b/include/linux/wait.h @@ -7,6 +7,7 @@ #include #include #include +#include #include #include @@ -303,6 +304,7 @@ extern void init_wait_entry(struct wait_queue_entry *wq_entry, int flags); struct wait_queue_entry __wq_entry; \ long __ret = ret; /* explicit shadow */ \ \ + sdt_might_sleep_start(NULL); \ init_wait_entry(&__wq_entry, exclusive ? WQ_FLAG_EXCLUSIVE : 0); \ for (;;) { \ long __int = prepare_to_wait_event(_head, &__wq_entry, state);\ @@ -318,6 +320,7 @@ extern void init_wait_entry(struct wait_queue_entry *wq_entry, int flags); cmd; \ } \ finish_wait(_head, &__wq_entry); \ + sdt_might_sleep_end(); \ __out: __ret; \ }) -- 2.17.1
[PATCH v10 15/25] locking/lockdep, cpu/hotplus: Use a weaker annotation in AP thread
cb92173d1f0 ("locking/lockdep, cpu/hotplug: Annotate AP thread") was introduced to make lockdep_assert_cpus_held() work in AP thread. However, the annotation is too strong for that purpose. We don't have to use more than try lock annotation for that. Furthermore, now that Dept was introduced, false positive alarms was reported by that. Replaced it with try lock annotation. Signed-off-by: Byungchul Park --- kernel/cpu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/cpu.c b/kernel/cpu.c index 6c0a92ca6bb5..6a9b9c3d90a1 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -356,7 +356,7 @@ int lockdep_is_cpus_held(void) static void lockdep_acquire_cpus_lock(void) { - rwsem_acquire(_hotplug_lock.dep_map, 0, 0, _THIS_IP_); + rwsem_acquire(_hotplug_lock.dep_map, 0, 1, _THIS_IP_); } static void lockdep_release_cpus_lock(void) -- 2.17.1
[PATCH v10 20/25] dept: Apply timeout consideration to waitqueue wait
Now that CONFIG_DEPT_AGGRESSIVE_TIMEOUT_WAIT was introduced, apply the consideration to waitqueue wait, assuming an input 'ret' in ___wait_event() macro is used as a timeout value. Signed-off-by: Byungchul Park --- include/linux/wait.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/wait.h b/include/linux/wait.h index ff349e609da7..aa1bd964be1e 100644 --- a/include/linux/wait.h +++ b/include/linux/wait.h @@ -304,7 +304,7 @@ extern void init_wait_entry(struct wait_queue_entry *wq_entry, int flags); struct wait_queue_entry __wq_entry; \ long __ret = ret; /* explicit shadow */ \ \ - sdt_might_sleep_start(NULL); \ + sdt_might_sleep_start_timeout(NULL, __ret); \ init_wait_entry(&__wq_entry, exclusive ? WQ_FLAG_EXCLUSIVE : 0); \ for (;;) { \ long __int = prepare_to_wait_event(_head, &__wq_entry, state);\ -- 2.17.1
[PATCH v10 03/25] dept: Add single event dependency tracker APIs
Wrapped the base APIs for easier annotation on wait and event. Start with supporting waiters on each single event. More general support for multiple events is a future work. Do more when the need arises. How to annotate (the simplest way): 1. Initaialize a map for the interesting wait. /* * Recommand to place along with the wait instance. */ struct dept_map my_wait; /* * Recommand to place in the initialization code. */ sdt_map_init(_wait); 2. Place the following at the wait code. sdt_wait(_wait); 3. Place the following at the event code. sdt_event(_wait); That's it! Signed-off-by: Byungchul Park --- include/linux/dept_sdt.h | 62 1 file changed, 62 insertions(+) create mode 100644 include/linux/dept_sdt.h diff --git a/include/linux/dept_sdt.h b/include/linux/dept_sdt.h new file mode 100644 index ..12a793b90c7e --- /dev/null +++ b/include/linux/dept_sdt.h @@ -0,0 +1,62 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Single-event Dependency Tracker + * + * Started by Byungchul Park : + * + * Copyright (c) 2020 LG Electronics, Inc., Byungchul Park + */ + +#ifndef __LINUX_DEPT_SDT_H +#define __LINUX_DEPT_SDT_H + +#include +#include + +#ifdef CONFIG_DEPT +#define sdt_map_init(m) \ + do {\ + static struct dept_key __key; \ + dept_map_init(m, &__key, 0, #m);\ + } while (0) + +#define sdt_map_init_key(m, k) dept_map_init(m, k, 0, #m) + +#define sdt_wait(m)\ + do {\ + dept_request_event(m); \ + dept_wait(m, 1UL, _THIS_IP_, __func__, 0); \ + } while (0) + +/* + * sdt_might_sleep() and its family will be committed in __schedule() + * when it actually gets to __schedule(). Both dept_request_event() and + * dept_wait() will be performed on the commit. + */ + +/* + * Use the code location as the class key if an explicit map is not used. + */ +#define sdt_might_sleep_start(m) \ + do {\ + struct dept_map *__m = m; \ + static struct dept_key __key; \ + dept_stage_wait(__m, __m ? NULL : &__key, _THIS_IP_, __func__);\ + } while (0) + +#define sdt_might_sleep_end() dept_clean_stage() + +#define sdt_ecxt_enter(m) dept_ecxt_enter(m, 1UL, _THIS_IP_, "start", "event", 0) +#define sdt_event(m) dept_event(m, 1UL, _THIS_IP_, __func__) +#define sdt_ecxt_exit(m) dept_ecxt_exit(m, 1UL, _THIS_IP_) +#else /* !CONFIG_DEPT */ +#define sdt_map_init(m)do { } while (0) +#define sdt_map_init_key(m, k) do { (void)(k); } while (0) +#define sdt_wait(m)do { } while (0) +#define sdt_might_sleep_start(m) do { } while (0) +#define sdt_might_sleep_end() do { } while (0) +#define sdt_ecxt_enter(m) do { } while (0) +#define sdt_event(m) do { } while (0) +#define sdt_ecxt_exit(m) do { } while (0) +#endif +#endif /* __LINUX_DEPT_SDT_H */ -- 2.17.1
[PATCH v10 05/25] dept: Tie to Lockdep and IRQ tracing
Yes. How to place Dept in here looks so ugly. But it's inevitable as long as relying on Lockdep. The way should be enhanced gradually. 1. Basically relies on Lockdep to track typical locks and IRQ things. 2. Dept fails to recognize IRQ situation so it generates false alarms when raw_local_irq_*() APIs are used. So made it track those too. 3. Lockdep doesn't track the outmost {hard,soft}irq entracnes but Dept makes use of it. So made it track those too. Signed-off-by: Byungchul Park --- include/linux/irqflags.h| 22 +- include/linux/local_lock_internal.h | 1 + include/linux/lockdep.h | 102 ++-- include/linux/lockdep_types.h | 3 + include/linux/mutex.h | 1 + include/linux/percpu-rwsem.h| 2 +- include/linux/rtmutex.h | 1 + include/linux/rwlock_types.h| 1 + include/linux/rwsem.h | 1 + include/linux/seqlock.h | 2 +- include/linux/spinlock_types_raw.h | 3 + include/linux/srcu.h| 2 +- kernel/dependency/dept.c| 4 +- kernel/locking/lockdep.c| 23 +++ 14 files changed, 139 insertions(+), 29 deletions(-) diff --git a/include/linux/irqflags.h b/include/linux/irqflags.h index 5ec0fa71399e..0ebc5ec2dbd4 100644 --- a/include/linux/irqflags.h +++ b/include/linux/irqflags.h @@ -13,6 +13,7 @@ #define _LINUX_TRACE_IRQFLAGS_H #include +#include #include #include @@ -60,8 +61,10 @@ extern void trace_hardirqs_off(void); # define lockdep_softirqs_enabled(p) ((p)->softirqs_enabled) # define lockdep_hardirq_enter() \ do { \ - if (__this_cpu_inc_return(hardirq_context) == 1)\ + if (__this_cpu_inc_return(hardirq_context) == 1) { \ current->hardirq_threaded = 0; \ + dept_hardirq_enter(); \ + } \ } while (0) # define lockdep_hardirq_threaded()\ do { \ @@ -136,6 +139,8 @@ do {\ # define lockdep_softirq_enter() \ do { \ current->softirq_context++; \ + if (current->softirq_context == 1) \ + dept_softirq_enter(); \ } while (0) # define lockdep_softirq_exit()\ do { \ @@ -170,17 +175,28 @@ extern void warn_bogus_irq_restore(void); /* * Wrap the arch provided IRQ routines to provide appropriate checks. */ -#define raw_local_irq_disable()arch_local_irq_disable() -#define raw_local_irq_enable() arch_local_irq_enable() +#define raw_local_irq_disable()\ + do {\ + arch_local_irq_disable(); \ + dept_hardirqs_off();\ + } while (0) +#define raw_local_irq_enable() \ + do {\ + dept_hardirqs_on(); \ + arch_local_irq_enable();\ + } while (0) #define raw_local_irq_save(flags) \ do {\ typecheck(unsigned long, flags);\ flags = arch_local_irq_save(); \ + dept_hardirqs_off();\ } while (0) #define raw_local_irq_restore(flags) \ do {\ typecheck(unsigned long, flags);\ raw_check_bogus_irq_restore(); \ + if (!arch_irqs_disabled_flags(flags)) \ + dept_hardirqs_on(); \ arch_local_irq_restore(flags); \ } while (0) #define raw_local_save_flags(flags)\ diff --git a/include/linux/local_lock_internal.h b/include/linux/local_lock_internal.h index 975e33b793a7..39f67788fd95 100644 --- a/include/linux/local_lock_internal.h +++ b/include/linux/local_lock_internal.h @@ -21,6 +21,7 @@ typedef struct { .name = #lockname, \ .wait_type_inner = LD_WAIT_CONFIG, \ .lock_type = LD_LOCK_PERCPU,\ + .dmap = DEPT_MAP_INITIALIZER(lockname, NULL),\ }, \ .owner = NULL, diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h index 1f1099dac3f0..99961026ba43 100644 --- a/include/linux/lockdep.h +++ b/include/linux/lockdep.h @@ -12,6 +12,7 @@ #include #include +#include #include struct
[PATCH v10 24/25] dept: Make Dept able to work with an external wgen
There is a case where total maps for its wait/event is so large in size. For instance, struct page for PG_locked and PG_writeback is the case. The additional memory size for the maps would be 'the # of pages * sizeof(struct dept_map)' if each struct page keeps its map all the way, which might be too big to accept. It'd be better to keep the minimum data in the case, which is timestamp called 'wgen' that Dept makes use of. So made Dept able to work with an external wgen when needed. Signed-off-by: Byungchul Park --- include/linux/dept.h | 18 ++ include/linux/dept_sdt.h | 4 ++-- kernel/dependency/dept.c | 30 +- 3 files changed, 37 insertions(+), 15 deletions(-) diff --git a/include/linux/dept.h b/include/linux/dept.h index 0aa8d90558a9..ad32ea7b57bb 100644 --- a/include/linux/dept.h +++ b/include/linux/dept.h @@ -487,6 +487,13 @@ struct dept_task { boolin_sched; }; +/* + * for subsystems that requires compact use of memory e.g. struct page + */ +struct dept_ext_wgen{ + unsigned int wgen; +}; + #define DEPT_TASK_INITIALIZER(t) \ { \ .wait_hist = { { .wait = NULL, } }, \ @@ -518,6 +525,7 @@ extern void dept_task_exit(struct task_struct *t); extern void dept_free_range(void *start, unsigned int sz); extern void dept_map_init(struct dept_map *m, struct dept_key *k, int sub_u, const char *n); extern void dept_map_reinit(struct dept_map *m, struct dept_key *k, int sub_u, const char *n); +extern void dept_ext_wgen_init(struct dept_ext_wgen *ewg); extern void dept_map_copy(struct dept_map *to, struct dept_map *from); extern void dept_wait(struct dept_map *m, unsigned long w_f, unsigned long ip, const char *w_fn, int sub_l, long timeout); @@ -527,8 +535,8 @@ extern void dept_clean_stage(void); extern void dept_stage_event(struct task_struct *t, unsigned long ip); extern void dept_ecxt_enter(struct dept_map *m, unsigned long e_f, unsigned long ip, const char *c_fn, const char *e_fn, int sub_l); extern bool dept_ecxt_holding(struct dept_map *m, unsigned long e_f); -extern void dept_request_event(struct dept_map *m); -extern void dept_event(struct dept_map *m, unsigned long e_f, unsigned long ip, const char *e_fn); +extern void dept_request_event(struct dept_map *m, struct dept_ext_wgen *ewg); +extern void dept_event(struct dept_map *m, unsigned long e_f, unsigned long ip, const char *e_fn, struct dept_ext_wgen *ewg); extern void dept_ecxt_exit(struct dept_map *m, unsigned long e_f, unsigned long ip); extern void dept_sched_enter(void); extern void dept_sched_exit(void); @@ -559,6 +567,7 @@ extern void dept_hardirqs_off_ip(unsigned long ip); struct dept_key { }; struct dept_map { }; struct dept_task { }; +struct dept_ext_wgen { }; #define DEPT_MAP_INITIALIZER(n, k) { } #define DEPT_TASK_INITIALIZER(t) { } @@ -571,6 +580,7 @@ struct dept_task { }; #define dept_free_range(s, sz) do { } while (0) #define dept_map_init(m, k, su, n) do { (void)(n); (void)(k); } while (0) #define dept_map_reinit(m, k, su, n) do { (void)(n); (void)(k); } while (0) +#define dept_ext_wgen_init(wg) do { } while (0) #define dept_map_copy(t, f)do { } while (0) #define dept_wait(m, w_f, ip, w_fn, sl, t) do { (void)(w_fn); } while (0) @@ -580,8 +590,8 @@ struct dept_task { }; #define dept_stage_event(t, ip)do { } while (0) #define dept_ecxt_enter(m, e_f, ip, c_fn, e_fn, sl)do { (void)(c_fn); (void)(e_fn); } while (0) #define dept_ecxt_holding(m, e_f) false -#define dept_request_event(m) do { } while (0) -#define dept_event(m, e_f, ip, e_fn) do { (void)(e_fn); } while (0) +#define dept_request_event(m, wg) do { } while (0) +#define dept_event(m, e_f, ip, e_fn, wg) do { (void)(e_fn); } while (0) #define dept_ecxt_exit(m, e_f, ip) do { } while (0) #define dept_sched_enter() do { } while (0) #define dept_sched_exit() do { } while (0) diff --git a/include/linux/dept_sdt.h b/include/linux/dept_sdt.h index 21fce525f031..8cdac7982036 100644 --- a/include/linux/dept_sdt.h +++ b/include/linux/dept_sdt.h @@ -24,7 +24,7 @@ #define sdt_wait_timeout(m, t) \ do {\ - dept_request_event(m); \ + dept_request_event(m, NULL);\ dept_wait(m, 1UL, _THIS_IP_, __func__, 0, t); \ } while (0) #define sdt_wait(m) sdt_wait_timeout(m, -1L) @@ -49,7
[PATCH v10 06/25] dept: Add proc knobs to show stats and dependency graph
It'd be useful to show Dept internal stats and dependency graph on runtime via proc for better information. Introduced the knobs. Signed-off-by: Byungchul Park --- kernel/dependency/Makefile| 1 + kernel/dependency/dept.c | 24 +++- kernel/dependency/dept_internal.h | 26 + kernel/dependency/dept_proc.c | 95 +++ 4 files changed, 131 insertions(+), 15 deletions(-) create mode 100644 kernel/dependency/dept_internal.h create mode 100644 kernel/dependency/dept_proc.c diff --git a/kernel/dependency/Makefile b/kernel/dependency/Makefile index b5cfb8a03c0c..92f165400187 100644 --- a/kernel/dependency/Makefile +++ b/kernel/dependency/Makefile @@ -1,3 +1,4 @@ # SPDX-License-Identifier: GPL-2.0 obj-$(CONFIG_DEPT) += dept.o +obj-$(CONFIG_DEPT) += dept_proc.o diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c index d3b6d2f4cd7b..c5e23e9184b8 100644 --- a/kernel/dependency/dept.c +++ b/kernel/dependency/dept.c @@ -74,6 +74,7 @@ #include #include #include +#include "dept_internal.h" static int dept_stop; static int dept_per_cpu_ready; @@ -261,20 +262,13 @@ static inline bool valid_key(struct dept_key *k) * have been freed will be placed. */ -enum object_t { -#define OBJECT(id, nr) OBJECT_##id, - #include "dept_object.h" -#undef OBJECT - OBJECT_NR, -}; - #define OBJECT(id, nr) \ static struct dept_##id spool_##id[nr]; \ static DEFINE_PER_CPU(struct llist_head, lpool_##id); #include "dept_object.h" #undef OBJECT -static struct dept_pool pool[OBJECT_NR] = { +struct dept_pool dept_pool[OBJECT_NR] = { #define OBJECT(id, nr) { \ .name = #id,\ .obj_sz = sizeof(struct dept_##id), \ @@ -304,7 +298,7 @@ static void *from_pool(enum object_t t) if (DEPT_WARN_ON(!irqs_disabled())) return NULL; - p = [t]; + p = _pool[t]; /* * Try local pool first. @@ -339,7 +333,7 @@ static void *from_pool(enum object_t t) static void to_pool(void *o, enum object_t t) { - struct dept_pool *p = [t]; + struct dept_pool *p = _pool[t]; struct llist_head *h; preempt_disable(); @@ -2136,7 +2130,7 @@ void dept_map_copy(struct dept_map *to, struct dept_map *from) clean_classes_cache(>map_key); } -static LIST_HEAD(classes); +LIST_HEAD(dept_classes); static inline bool within(const void *addr, void *start, unsigned long size) { @@ -2168,7 +2162,7 @@ void dept_free_range(void *start, unsigned int sz) while (unlikely(!dept_lock())) cpu_relax(); - list_for_each_entry_safe(c, n, , all_node) { + list_for_each_entry_safe(c, n, _classes, all_node) { if (!within((void *)c->key, start, sz) && !within(c->name, start, sz)) continue; @@ -2244,7 +2238,7 @@ static struct dept_class *check_new_class(struct dept_key *local, c->sub_id = sub_id; c->key = (unsigned long)(k->base + sub_id); hash_add_class(c); - list_add(>all_node, ); + list_add(>all_node, _classes); unlock: dept_unlock(); caching: @@ -2958,8 +2952,8 @@ static void migrate_per_cpu_pool(void) struct llist_head *from; struct llist_head *to; - from = [i].boot_pool; - to = per_cpu_ptr(pool[i].lpool, boot_cpu); + from = _pool[i].boot_pool; + to = per_cpu_ptr(dept_pool[i].lpool, boot_cpu); move_llist(to, from); } } diff --git a/kernel/dependency/dept_internal.h b/kernel/dependency/dept_internal.h new file mode 100644 index ..007c1eec6bab --- /dev/null +++ b/kernel/dependency/dept_internal.h @@ -0,0 +1,26 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Dept(DEPendency Tracker) - runtime dependency tracker internal header + * + * Started by Byungchul Park : + * + * Copyright (c) 2020 LG Electronics, Inc., Byungchul Park + */ + +#ifndef __DEPT_INTERNAL_H +#define __DEPT_INTERNAL_H + +#ifdef CONFIG_DEPT + +enum object_t { +#define OBJECT(id, nr) OBJECT_##id, + #include "dept_object.h" +#undef OBJECT + OBJECT_NR, +}; + +extern struct list_head dept_classes; +extern struct dept_pool dept_pool[]; + +#endif +#endif /* __DEPT_INTERNAL_H */ diff --git a/kernel/dependency/dept_proc.c b/kernel/dependency/dept_proc.c new file mode 100644 index ..7d61dfbc5865 --- /dev/null +++ b/kernel/dependency/dept_proc.c @@ -0,0 +1,95 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Procfs knobs for Dept(DEPendency Tracker) + * + * Started by Byungchul Park : + * + * Copyright (C) 2021 LG Electronics, Inc. , Byungchul Park + */ +#include +#include +#include +#include
[PATCH v10 08/25] dept: Apply sdt_might_sleep_{start, end}() to PG_{locked, writeback} wait
Makes Dept able to track dependencies by PG_{locked,writeback} waits. Signed-off-by: Byungchul Park --- mm/filemap.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/mm/filemap.c b/mm/filemap.c index c4d4ace9cc70..adc49cb59db6 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -42,6 +42,7 @@ #include #include #include +#include #include #include #include "internal.h" @@ -1215,6 +1216,9 @@ static inline bool folio_trylock_flag(struct folio *folio, int bit_nr, /* How many times do we accept lock stealing from under a waiter? */ int sysctl_page_lock_unfairness = 5; +static struct dept_map __maybe_unused PG_locked_map = DEPT_MAP_INITIALIZER(PG_locked_map, NULL); +static struct dept_map __maybe_unused PG_writeback_map = DEPT_MAP_INITIALIZER(PG_writeback_map, NULL); + static inline int folio_wait_bit_common(struct folio *folio, int bit_nr, int state, enum behavior behavior) { @@ -1226,6 +1230,11 @@ static inline int folio_wait_bit_common(struct folio *folio, int bit_nr, unsigned long pflags; bool in_thrashing; + if (bit_nr == PG_locked) + sdt_might_sleep_start(_locked_map); + else if (bit_nr == PG_writeback) + sdt_might_sleep_start(_writeback_map); + if (bit_nr == PG_locked && !folio_test_uptodate(folio) && folio_test_workingset(folio)) { delayacct_thrashing_start(_thrashing); @@ -1327,6 +1336,8 @@ static inline int folio_wait_bit_common(struct folio *folio, int bit_nr, */ finish_wait(q, wait); + sdt_might_sleep_end(); + if (thrashing) { delayacct_thrashing_end(_thrashing); psi_memstall_leave(); -- 2.17.1
[PATCH v10 13/25] dept: Distinguish each work from another
Workqueue already provides concurrency control. By that, any wait in a work doesn't prevents events in other works with the control enabled. Thus, each work would better be considered a different context. So let Dept assign a different context id to each work. Signed-off-by: Byungchul Park --- include/linux/dept.h | 2 ++ kernel/dependency/dept.c | 10 ++ kernel/workqueue.c | 3 +++ 3 files changed, 15 insertions(+) diff --git a/include/linux/dept.h b/include/linux/dept.h index f62c7b6f42c6..d9ca9dd50219 100644 --- a/include/linux/dept.h +++ b/include/linux/dept.h @@ -515,6 +515,7 @@ extern void dept_ecxt_exit(struct dept_map *m, unsigned long e_f, unsigned long extern void dept_sched_enter(void); extern void dept_sched_exit(void); extern void dept_kernel_enter(void); +extern void dept_work_enter(void); static inline void dept_ecxt_enter_nokeep(struct dept_map *m) { @@ -567,6 +568,7 @@ struct dept_task { }; #define dept_sched_enter() do { } while (0) #define dept_sched_exit() do { } while (0) #define dept_kernel_enter()do { } while (0) +#define dept_work_enter() do { } while (0) #define dept_ecxt_enter_nokeep(m) do { } while (0) #define dept_key_init(k) do { (void)(k); } while (0) #define dept_key_destroy(k)do { (void)(k); } while (0) diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c index 4165cacf4ebb..6cf17f206b78 100644 --- a/kernel/dependency/dept.c +++ b/kernel/dependency/dept.c @@ -1977,6 +1977,16 @@ void dept_hardirqs_off_ip(unsigned long ip) } EXPORT_SYMBOL_GPL(dept_hardirqs_off_ip); +/* + * Assign a different context id to each work. + */ +void dept_work_enter(void) +{ + struct dept_task *dt = dept_task(); + + dt->cxt_id[DEPT_CXT_PROCESS] += 1UL << DEPT_CXTS_NR; +} + void dept_kernel_enter(void) { struct dept_task *dt = dept_task(); diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 07895deca271..69c4f464d017 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -51,6 +51,7 @@ #include #include #include +#include #include "workqueue_internal.h" @@ -2199,6 +2200,8 @@ __acquires(>lock) lockdep_copy_map(_map, >lockdep_map); #endif + dept_work_enter(); + /* ensure we're on the correct CPU */ WARN_ON_ONCE(!(pool->flags & POOL_DISASSOCIATED) && raw_smp_processor_id() != pool->cpu); -- 2.17.1
[PATCH v10 07/25] dept: Apply sdt_might_sleep_{start, end}() to wait_for_completion()/complete()
Makes Dept able to track dependencies by wait_for_completion()/complete(). Signed-off-by: Byungchul Park --- include/linux/completion.h | 30 +- 1 file changed, 25 insertions(+), 5 deletions(-) diff --git a/include/linux/completion.h b/include/linux/completion.h index 62b32b19e0a8..32d535abebf3 100644 --- a/include/linux/completion.h +++ b/include/linux/completion.h @@ -10,6 +10,7 @@ */ #include +#include /* * struct completion - structure used to maintain state for a "completion" @@ -26,14 +27,33 @@ struct completion { unsigned int done; struct swait_queue_head wait; + struct dept_map dmap; }; +#define init_completion(x) \ +do { \ + sdt_map_init(&(x)->dmap); \ + __init_completion(x); \ +} while (0) + +/* + * XXX: No use cases for now. Fill the body when needed. + */ #define init_completion_map(x, m) init_completion(x) -static inline void complete_acquire(struct completion *x) {} -static inline void complete_release(struct completion *x) {} + +static inline void complete_acquire(struct completion *x) +{ + sdt_might_sleep_start(>dmap); +} + +static inline void complete_release(struct completion *x) +{ + sdt_might_sleep_end(); +} #define COMPLETION_INITIALIZER(work) \ - { 0, __SWAIT_QUEUE_HEAD_INITIALIZER((work).wait) } + { 0, __SWAIT_QUEUE_HEAD_INITIALIZER((work).wait), \ + .dmap = DEPT_MAP_INITIALIZER(work, NULL), } #define COMPLETION_INITIALIZER_ONSTACK_MAP(work, map) \ (*({ init_completion_map(&(work), &(map)); &(work); })) @@ -75,13 +95,13 @@ static inline void complete_release(struct completion *x) {} #endif /** - * init_completion - Initialize a dynamically allocated completion + * __init_completion - Initialize a dynamically allocated completion * @x: pointer to completion structure that is to be initialized * * This inline function will initialize a dynamically created completion * structure. */ -static inline void init_completion(struct completion *x) +static inline void __init_completion(struct completion *x) { x->done = 0; init_swait_queue_head(>wait); -- 2.17.1
[PATCH v10 22/25] dept: Apply timeout consideration to dma fence wait
Now that CONFIG_DEPT_AGGRESSIVE_TIMEOUT_WAIT was introduced, apply the consideration to dma fence wait. Signed-off-by: Byungchul Park --- drivers/dma-buf/dma-fence.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c index 1db4bc0e8adc..a1ede7b467cd 100644 --- a/drivers/dma-buf/dma-fence.c +++ b/drivers/dma-buf/dma-fence.c @@ -783,7 +783,7 @@ dma_fence_default_wait(struct dma_fence *fence, bool intr, signed long timeout) cb.task = current; list_add(, >cb_list); - sdt_might_sleep_start(NULL); + sdt_might_sleep_start_timeout(NULL, timeout); while (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, >flags) && ret > 0) { if (intr) __set_current_state(TASK_INTERRUPTIBLE); @@ -887,7 +887,7 @@ dma_fence_wait_any_timeout(struct dma_fence **fences, uint32_t count, } } - sdt_might_sleep_start(NULL); + sdt_might_sleep_start_timeout(NULL, timeout); while (ret > 0) { if (intr) set_current_state(TASK_INTERRUPTIBLE); -- 2.17.1
[PATCH v10 11/25] dept: Apply sdt_might_sleep_{start, end}() to hashed-waitqueue wait
Makes Dept able to track dependencies by hashed-waitqueue waits. Signed-off-by: Byungchul Park --- include/linux/wait_bit.h | 3 +++ 1 file changed, 3 insertions(+) diff --git a/include/linux/wait_bit.h b/include/linux/wait_bit.h index 7725b7579b78..fe89282c3e96 100644 --- a/include/linux/wait_bit.h +++ b/include/linux/wait_bit.h @@ -6,6 +6,7 @@ * Linux wait-bit related types and methods: */ #include +#include struct wait_bit_key { void*flags; @@ -246,6 +247,7 @@ extern wait_queue_head_t *__var_waitqueue(void *p); struct wait_bit_queue_entry __wbq_entry;\ long __ret = ret; /* explicit shadow */ \ \ + sdt_might_sleep_start(NULL);\ init_wait_var_entry(&__wbq_entry, var, \ exclusive ? WQ_FLAG_EXCLUSIVE : 0); \ for (;;) { \ @@ -263,6 +265,7 @@ extern wait_queue_head_t *__var_waitqueue(void *p); cmd;\ } \ finish_wait(__wq_head, &__wbq_entry.wq_entry); \ + sdt_might_sleep_end(); \ __out: __ret; \ }) -- 2.17.1
[PATCH v10 02/25] dept: Implement Dept(Dependency Tracker)
CURRENT STATUS -- Lockdep tracks acquisition order of locks in order to detect deadlock, and IRQ and IRQ enable/disable state as well to take accident acquisitions into account. Lockdep should be turned off once it detects and reports a deadlock since the data structure and algorithm are not reusable after detection because of the complex design. PROBLEM --- *Waits* and their *events* that never reach eventually cause deadlock. However, Lockdep is only interested in lock acquisition order, forcing to emulate lock acqusition even for just waits and events that have nothing to do with real lock. Even worse, no one likes Lockdep's false positive detection because that prevents further one that might be more valuable. That's why all the kernel developers are sensitive to Lockdep's false positive. Besides those, by tracking acquisition order, it cannot correctly deal with read lock and cross-event e.g. wait_for_completion()/complete() for deadlock detection. Lockdep is no longer a good tool for that purpose. SOLUTION Again, *waits* and their *events* that never reach eventually cause deadlock. The new solution, Dept(DEPendency Tracker), focuses on waits and events themselves. Dept tracks waits and events and report it if any event would be never reachable. Dept does: . Works with read lock in the right way. . Works with any wait and event e.i. cross-event. . Continue to work even after reporting multiple times. . Provides simple and intuitive APIs. . Does exactly what dependency checker should do. Q & A - Q. Is this the first try ever to address the problem? A. No. Cross-release feature (b09be676e0ff2 locking/lockdep: Implement the 'crossrelease' feature) addressed it 2 years ago that was a Lockdep extension and merged but reverted shortly because: Cross-release started to report valuable hidden problems but started to give report false positive reports as well. For sure, no one likes Lockdep's false positive reports since it makes Lockdep stop, preventing reporting further real problems. Q. Why not Dept was developed as an extension of Lockdep? A. Lockdep definitely includes all the efforts great developers have made for a long time so as to be quite stable enough. But I had to design and implement newly because of the following: 1) Lockdep was designed to track lock acquisition order. The APIs and implementation do not fit on wait-event model. 2) Lockdep is turned off on detection including false positive. Which is terrible and prevents developing any extension for stronger detection. Q. Do you intend to totally replace Lockdep? A. No. Lockdep also checks if lock usage is correct. Of course, the dependency check routine should be replaced but the other functions should be still there. Q. Do you mean the dependency check routine should be replaced right away? A. No. I admit Lockdep is stable enough thanks to great efforts kernel developers have made. Lockdep and Dept, both should be in the kernel until Dept gets considered stable. Q. Stronger detection capability would give more false positive report. Which was a big problem when cross-release was introduced. Is it ok with Dept? A. It's ok. Dept allows multiple reporting thanks to simple and quite generalized design. Of course, false positive reports should be fixed anyway but it's no longer as a critical problem as it was. Signed-off-by: Byungchul Park --- include/linux/dept.h| 577 ++ include/linux/hardirq.h |3 + include/linux/sched.h |3 + init/init_task.c|2 + init/main.c |2 + kernel/Makefile |1 + kernel/dependency/Makefile |3 + kernel/dependency/dept.c| 3009 +++ kernel/dependency/dept_hash.h | 10 + kernel/dependency/dept_object.h | 13 + kernel/exit.c |1 + kernel/fork.c |2 + kernel/module/main.c|2 + kernel/sched/core.c |9 + lib/Kconfig.debug | 27 + lib/locking-selftest.c |2 + 16 files changed, 3666 insertions(+) create mode 100644 include/linux/dept.h create mode 100644 kernel/dependency/Makefile create mode 100644 kernel/dependency/dept.c create mode 100644 kernel/dependency/dept_hash.h create mode 100644 kernel/dependency/dept_object.h diff --git a/include/linux/dept.h b/include/linux/dept.h new file mode 100644 index ..b6d45b4b1fd6 --- /dev/null +++ b/include/linux/dept.h @@ -0,0 +1,577 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * DEPT(DEPendency Tracker) - runtime dependency tracker + * + * Started by Byungchul Park : + * + * Copyright (c) 2020 LG Electronics, Inc., Byungchul Park + */ + +#ifndef __LINUX_DEPT_H +#define __LINUX_DEPT_H + +#ifdef CONFIG_DEPT + +#include + +struct task_struct; + +#define DEPT_MAX_STACK_ENTRY
[PATCH v10 14/25] dept: Add a mechanism to refill the internal memory pools on running out
Dept engine works in a constrained environment. For example, Dept cannot make use of dynamic allocation e.g. kmalloc(). So Dept has been using static pools to keep memory chunks Dept uses. However, Dept would barely work once any of the pools gets run out. So implemented a mechanism for the refill on the lack by any chance, using irq work and workqueue that fits on the contrained environment. Signed-off-by: Byungchul Park --- include/linux/dept.h| 19 -- kernel/dependency/dept.c| 104 +++- kernel/dependency/dept_object.h | 10 +-- kernel/dependency/dept_proc.c | 8 +-- 4 files changed, 112 insertions(+), 29 deletions(-) diff --git a/include/linux/dept.h b/include/linux/dept.h index d9ca9dd50219..583e8fe2dd7b 100644 --- a/include/linux/dept.h +++ b/include/linux/dept.h @@ -336,9 +336,19 @@ struct dept_pool { size_t obj_sz; /* -* the number of the static array +* the remaining number of the object in spool */ - atomic_tobj_nr; + int obj_nr; + + /* +* the number of the object in spool +*/ + int tot_nr; + + /* +* accumulated amount of memory used by the object in byte +*/ + atomic_tacc_sz; /* * offset of ->pool_node @@ -348,9 +358,10 @@ struct dept_pool { /* * pointer to the pool */ - void*spool; + void*spool; /* static pool */ + void*rpool; /* reserved pool */ struct llist_head boot_pool; - struct llist_head __percpu *lpool; + struct llist_head __percpu *lpool; /* local pool */ }; struct dept_ecxt_held { diff --git a/kernel/dependency/dept.c b/kernel/dependency/dept.c index 6cf17f206b78..8454f0a14d67 100644 --- a/kernel/dependency/dept.c +++ b/kernel/dependency/dept.c @@ -74,6 +74,9 @@ #include #include #include +#include +#include +#include #include "dept_internal.h" static int dept_stop; @@ -122,10 +125,12 @@ static int dept_per_cpu_ready; WARN(1, "DEPT_STOP: " s); \ }) -#define DEPT_INFO_ONCE(s...) pr_warn_once("DEPT_INFO_ONCE: " s) +#define DEPT_INFO_ONCE(s...) pr_warn_once("DEPT_INFO_ONCE: " s) +#define DEPT_INFO(s...)pr_warn("DEPT_INFO: " s) static arch_spinlock_t dept_spin = (arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED; static arch_spinlock_t stage_spin = (arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED; +static arch_spinlock_t dept_pool_spin = (arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED; /* * DEPT internal engine should be careful in using outside functions @@ -264,6 +269,7 @@ static inline bool valid_key(struct dept_key *k) #define OBJECT(id, nr) \ static struct dept_##id spool_##id[nr]; \ +static struct dept_##id rpool_##id[nr]; \ static DEFINE_PER_CPU(struct llist_head, lpool_##id); #include "dept_object.h" #undef OBJECT @@ -272,14 +278,70 @@ struct dept_pool dept_pool[OBJECT_NR] = { #define OBJECT(id, nr) { \ .name = #id,\ .obj_sz = sizeof(struct dept_##id), \ - .obj_nr = ATOMIC_INIT(nr), \ + .obj_nr = nr, \ + .tot_nr = nr, \ + .acc_sz = ATOMIC_INIT(sizeof(spool_##id) + sizeof(rpool_##id)), \ .node_off = offsetof(struct dept_##id, pool_node), \ .spool = spool_##id,\ + .rpool = rpool_##id,\ .lpool = _##id, }, #include "dept_object.h" #undef OBJECT }; +static void dept_wq_work_fn(struct work_struct *work) +{ + int i; + + for (i = 0; i < OBJECT_NR; i++) { + struct dept_pool *p = dept_pool + i; + int sz = p->tot_nr * p->obj_sz; + void *rpool; + bool need; + + arch_spin_lock(_pool_spin); + need = !p->rpool; + arch_spin_unlock(_pool_spin); + + if (!need) + continue; + + rpool = vmalloc(sz); + + if (!rpool) { + DEPT_STOP("Failed to extend internal resources.\n"); + break; + } + + arch_spin_lock(_pool_spin); + if (!p->rpool) { + p->rpool = rpool; +
[PATCH v10 25/25] dept: Track the potential waits of PG_{locked, writeback}
Currently, Dept only tracks the real waits of PG_{locked,writeback} that actually happened having gone through __schedule() to avoid false positives. However, it ends in limited capacity for deadlock detection, because anyway there might be still way more potential dependencies by the waits that have yet to happen but may happen in the future so as to cause a deadlock. So let Dept assume that when PG_{locked,writeback} bit gets cleared, there might be waits on the bit to be woken up. Even though false positives may increase with the aggressive tracking, it's worth doing it because it's going to be useful in practice. See the following link for instance: https://lore.kernel.org/lkml/1674268856-31807-1-git-send-email-byungchul.p...@lge.com/ Signed-off-by: Byungchul Park --- include/linux/mm_types.h | 3 + include/linux/page-flags.h | 112 + include/linux/pagemap.h| 7 ++- mm/filemap.c | 11 +++- mm/page_alloc.c| 3 + 5 files changed, 121 insertions(+), 15 deletions(-) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 3b8475007734..61d982eea8d1 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -19,6 +19,7 @@ #include #include #include +#include #include @@ -252,6 +253,8 @@ struct page { #ifdef LAST_CPUPID_NOT_IN_PAGE_FLAGS int _last_cpupid; #endif + struct dept_ext_wgen PG_locked_wgen; + struct dept_ext_wgen PG_writeback_wgen; } _struct_page_alignment; /* diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h index 69e93a0c1277..d6ca1148d21d 100644 --- a/include/linux/page-flags.h +++ b/include/linux/page-flags.h @@ -202,6 +202,50 @@ enum pageflags { #ifndef __GENERATING_BOUNDS_H +#ifdef CONFIG_DEPT +#include +#include + +extern struct dept_map PG_locked_map; +extern struct dept_map PG_writeback_map; + +/* + * Place the following annotations in its suitable point in code: + * + * Annotate dept_page_set_bit() around firstly set_bit*() + * Annotate dept_page_clear_bit() around clear_bit*() + * Annotate dept_page_wait_on_bit() around wait_on_bit*() + */ + +static inline void dept_page_set_bit(struct page *p, int bit_nr) +{ + if (bit_nr == PG_locked) + dept_request_event(_locked_map, >PG_locked_wgen); + else if (bit_nr == PG_writeback) + dept_request_event(_writeback_map, >PG_writeback_wgen); +} + +static inline void dept_page_clear_bit(struct page *p, int bit_nr) +{ + if (bit_nr == PG_locked) + dept_event(_locked_map, 1UL, _RET_IP_, __func__, >PG_locked_wgen); + else if (bit_nr == PG_writeback) + dept_event(_writeback_map, 1UL, _RET_IP_, __func__, >PG_writeback_wgen); +} + +static inline void dept_page_wait_on_bit(struct page *p, int bit_nr) +{ + if (bit_nr == PG_locked) + dept_wait(_locked_map, 1UL, _RET_IP_, __func__, 0, -1L); + else if (bit_nr == PG_writeback) + dept_wait(_writeback_map, 1UL, _RET_IP_, __func__, 0, -1L); +} +#else +#define dept_page_set_bit(p, bit_nr) do { } while (0) +#define dept_page_clear_bit(p, bit_nr) do { } while (0) +#define dept_page_wait_on_bit(p, bit_nr) do { } while (0) +#endif + #ifdef CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP DECLARE_STATIC_KEY_FALSE(hugetlb_optimize_vmemmap_key); @@ -383,44 +427,88 @@ static __always_inline int Page##uname(struct page *page) \ #define SETPAGEFLAG(uname, lname, policy) \ static __always_inline \ void folio_set_##lname(struct folio *folio)\ -{ set_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); } \ +{ \ + set_bit(PG_##lname, folio_flags(folio, FOLIO_##policy));\ + dept_page_set_bit(>page, PG_##lname);\ +} \ static __always_inline void SetPage##uname(struct page *page) \ -{ set_bit(PG_##lname, (page, 1)->flags); } +{ \ + set_bit(PG_##lname, (page, 1)->flags); \ + dept_page_set_bit(page, PG_##lname);\ +} #define CLEARPAGEFLAG(uname, lname, policy)\ static __always_inline \ void folio_clear_##lname(struct folio *folio) \ -{ clear_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); } \ +{ \ + clear_bit(PG_##lname, folio_flags(folio, FOLIO_##policy)); \ + dept_page_clear_bit(>page, PG_##lname); \ +}
[PATCH v10 19/25] dept: Apply timeout consideration to swait
Now that CONFIG_DEPT_AGGRESSIVE_TIMEOUT_WAIT was introduced, apply the consideration to swait, assuming an input 'ret' in ___swait_event() macro is used as a timeout value. Signed-off-by: Byungchul Park --- include/linux/swait.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/swait.h b/include/linux/swait.h index 02848211cef5..def1e47bb678 100644 --- a/include/linux/swait.h +++ b/include/linux/swait.h @@ -162,7 +162,7 @@ extern void finish_swait(struct swait_queue_head *q, struct swait_queue *wait); struct swait_queue __wait; \ long __ret = ret; \ \ - sdt_might_sleep_start(NULL);\ + sdt_might_sleep_start_timeout(NULL, __ret); \ INIT_LIST_HEAD(&__wait.task_list); \ for (;;) { \ long __int = prepare_to_swait_event(, &__wait, state);\ -- 2.17.1
[PATCH v10 01/25] llist: Move llist_{head, node} definition to types.h
llist_head and llist_node can be used by very primitives. For example, Dept for tracking dependency uses llist things in its header. To avoid header dependency, move those to types.h. Signed-off-by: Byungchul Park --- include/linux/llist.h | 8 include/linux/types.h | 8 2 files changed, 8 insertions(+), 8 deletions(-) diff --git a/include/linux/llist.h b/include/linux/llist.h index 85bda2d02d65..99cc3c30f79c 100644 --- a/include/linux/llist.h +++ b/include/linux/llist.h @@ -53,14 +53,6 @@ #include #include -struct llist_head { - struct llist_node *first; -}; - -struct llist_node { - struct llist_node *next; -}; - #define LLIST_HEAD_INIT(name) { NULL } #define LLIST_HEAD(name) struct llist_head name = LLIST_HEAD_INIT(name) diff --git a/include/linux/types.h b/include/linux/types.h index ea8cf60a8a79..b12a44400877 100644 --- a/include/linux/types.h +++ b/include/linux/types.h @@ -187,6 +187,14 @@ struct hlist_node { struct hlist_node *next, **pprev; }; +struct llist_head { + struct llist_node *first; +}; + +struct llist_node { + struct llist_node *next; +}; + struct ustat { __kernel_daddr_tf_tfree; #ifdef CONFIG_ARCH_32BIT_USTAT_F_TINODE -- 2.17.1
[PATCH v10 16/25] dept: Apply sdt_might_sleep_{start, end}() to dma fence wait
Makes Dept able to track dma fence waits. Signed-off-by: Byungchul Park --- drivers/dma-buf/dma-fence.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c index 406b4e26f538..1db4bc0e8adc 100644 --- a/drivers/dma-buf/dma-fence.c +++ b/drivers/dma-buf/dma-fence.c @@ -16,6 +16,7 @@ #include #include #include +#include #define CREATE_TRACE_POINTS #include @@ -782,6 +783,7 @@ dma_fence_default_wait(struct dma_fence *fence, bool intr, signed long timeout) cb.task = current; list_add(, >cb_list); + sdt_might_sleep_start(NULL); while (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, >flags) && ret > 0) { if (intr) __set_current_state(TASK_INTERRUPTIBLE); @@ -795,6 +797,7 @@ dma_fence_default_wait(struct dma_fence *fence, bool intr, signed long timeout) if (ret > 0 && intr && signal_pending(current)) ret = -ERESTARTSYS; } + sdt_might_sleep_end(); if (!list_empty()) list_del(); @@ -884,6 +887,7 @@ dma_fence_wait_any_timeout(struct dma_fence **fences, uint32_t count, } } + sdt_might_sleep_start(NULL); while (ret > 0) { if (intr) set_current_state(TASK_INTERRUPTIBLE); @@ -898,6 +902,7 @@ dma_fence_wait_any_timeout(struct dma_fence **fences, uint32_t count, if (ret > 0 && intr && signal_pending(current)) ret = -ERESTARTSYS; } + sdt_might_sleep_end(); __set_current_state(TASK_RUNNING); -- 2.17.1
[PATCH v10 12/25] dept: Distinguish each syscall context from another
It enters kernel mode on each syscall and each syscall handling should be considered independently from the point of view of Dept. Otherwise, Dept may wrongly track dependencies across different syscalls. That might be a real dependency from user mode. However, now that Dept just started to work, conservatively let Dept not track dependencies across different syscalls. Signed-off-by: Byungchul Park --- arch/arm64/kernel/syscall.c | 2 ++ arch/x86/entry/common.c | 4 +++ include/linux/dept.h| 39 - kernel/dependency/dept.c| 67 +++-- 4 files changed, 63 insertions(+), 49 deletions(-) diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c index a5de47e3df2b..e26d0cab0657 100644 --- a/arch/arm64/kernel/syscall.c +++ b/arch/arm64/kernel/syscall.c @@ -7,6 +7,7 @@ #include #include #include +#include #include #include @@ -105,6 +106,7 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr, */ local_daif_restore(DAIF_PROCCTX); + dept_kernel_enter(); if (flags & _TIF_MTE_ASYNC_FAULT) { /* diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c index 6c2826417b33..7cdd27abe529 100644 --- a/arch/x86/entry/common.c +++ b/arch/x86/entry/common.c @@ -19,6 +19,7 @@ #include #include #include +#include #ifdef CONFIG_XEN_PV #include @@ -72,6 +73,7 @@ static __always_inline bool do_syscall_x32(struct pt_regs *regs, int nr) __visible noinstr void do_syscall_64(struct pt_regs *regs, int nr) { + dept_kernel_enter(); add_random_kstack_offset(); nr = syscall_enter_from_user_mode(regs, nr); @@ -120,6 +122,7 @@ __visible noinstr void do_int80_syscall_32(struct pt_regs *regs) { int nr = syscall_32_enter(regs); + dept_kernel_enter(); add_random_kstack_offset(); /* * Subtlety here: if ptrace pokes something larger than 2^31-1 into @@ -140,6 +143,7 @@ static noinstr bool __do_fast_syscall_32(struct pt_regs *regs) int nr = syscall_32_enter(regs); int res; + dept_kernel_enter(); add_random_kstack_offset(); /* * This cannot use syscall_enter_from_user_mode() as it has to diff --git a/include/linux/dept.h b/include/linux/dept.h index b6d45b4b1fd6..f62c7b6f42c6 100644 --- a/include/linux/dept.h +++ b/include/linux/dept.h @@ -25,11 +25,16 @@ struct task_struct; #define DEPT_MAX_SUBCLASSES_USR(DEPT_MAX_SUBCLASSES / DEPT_MAX_SUBCLASSES_EVT) #define DEPT_MAX_SUBCLASSES_CACHE 2 -#define DEPT_SIRQ 0 -#define DEPT_HIRQ 1 -#define DEPT_IRQS_NR 2 -#define DEPT_SIRQF (1UL << DEPT_SIRQ) -#define DEPT_HIRQF (1UL << DEPT_HIRQ) +enum { + DEPT_CXT_SIRQ = 0, + DEPT_CXT_HIRQ, + DEPT_CXT_IRQS_NR, + DEPT_CXT_PROCESS = DEPT_CXT_IRQS_NR, + DEPT_CXTS_NR +}; + +#define DEPT_SIRQF (1UL << DEPT_CXT_SIRQ) +#define DEPT_HIRQF (1UL << DEPT_CXT_HIRQ) struct dept_ecxt; struct dept_iecxt { @@ -94,8 +99,8 @@ struct dept_class { /* * for tracking IRQ dependencies */ - struct dept_iecxt iecxt[DEPT_IRQS_NR]; - struct dept_iwait iwait[DEPT_IRQS_NR]; + struct dept_iecxt iecxt[DEPT_CXT_IRQS_NR]; + struct dept_iwait iwait[DEPT_CXT_IRQS_NR]; /* * classified by a map embedded in task_struct, @@ -207,8 +212,8 @@ struct dept_ecxt { /* * where the IRQ-enabled happened */ - unsigned long enirq_ip[DEPT_IRQS_NR]; - struct dept_stack *enirq_stack[DEPT_IRQS_NR]; + unsigned long enirq_ip[DEPT_CXT_IRQS_NR]; + struct dept_stack *enirq_stack[DEPT_CXT_IRQS_NR]; /* * where the event context started @@ -252,8 +257,8 @@ struct dept_wait { /* * where the IRQ wait happened */ - unsigned long irq_ip[DEPT_IRQS_NR]; - struct dept_stack *irq_stack[DEPT_IRQS_NR]; + unsigned long irq_ip[DEPT_CXT_IRQS_NR]; + struct dept_stack *irq_stack[DEPT_CXT_IRQS_NR]; /* * where the wait happened @@ -406,19 +411,19 @@ struct dept_task { int wait_hist_pos; /* -* sequential id to identify each IRQ context +* sequential id to identify each context */ - unsigned int
[PATCH v10 00/25] DEPT(Dependency Tracker)
>From now on, I can work on LKML again! I'm wondering if DEPT has been helping kernel debugging well even though it's a form of patches yet. I'm happy to see that DEPT reports a real problem in practice. See: https://lore.kernel.org/lkml/6383cde5-cf4b-facf-6e07-1378a4856...@i-love.sakura.ne.jp/#t https://lore.kernel.org/lkml/1674268856-31807-1-git-send-email-byungchul.p...@lge.com/ Nevertheless, I apologize for the lack of document. I promise to add it before it gets needed to use DEPT's APIs by users. For now, you can use DEPT just with CONFIG_DEPT on. --- Hi Linus and folks, I've been developing a tool for detecting deadlock possibilities by tracking wait/event rather than lock(?) acquisition order to try to cover all synchonization machanisms. It's done on v6.2-rc2. Benifit: 0. Works with all lock primitives. 1. Works with wait_for_completion()/complete(). 2. Works with 'wait' on PG_locked. 3. Works with 'wait' on PG_writeback. 4. Works with swait/wakeup. 5. Works with waitqueue. 6. Works with wait_bit. 7. Multiple reports are allowed. 8. Deduplication control on multiple reports. 9. Withstand false positives thanks to 6. 10. Easy to tag any wait/event. Future work: 0. To make it more stable. 1. To separates Dept from Lockdep. 2. To improves performance in terms of time and space. 3. To use Dept as a dependency engine for Lockdep. 4. To add any missing tags of wait/event in the kernel. 5. To deduplicate stack trace. How to interpret reports: 1. E(event) in each context cannot be triggered because of the W(wait) that cannot be woken. 2. The stack trace helping find the problematic code is located in each conext's detail. Thanks, Byungchul --- Changes from v9: 1. Fix a bug. SDT tracking didn't work well because of my big mistake that I should've used waiter's map to indentify its class but it had been working with waker's one. FYI, PG_locked and PG_writeback weren't affected. They still worked well. (reported by YoungJun) Changes from v8: 1. Fix build error by adding EXPORT_SYMBOL(PG_locked_map) and EXPORT_SYMBOL(PG_writeback_map) for kernel module build - appologize for that. (reported by kernel test robot) 2. Fix build error by removing header file's circular dependency that was caused by "atomic.h", "kernel.h" and "irqflags.h", which I introduced - appolgize for that. (reported by kernel test robot) Changes from v7: 1. Fix a bug that cannot track rwlock dependency properly, introduced in v7. (reported by Boqun and lockdep selftest) 2. Track wait/event of PG_{locked,writeback} more aggressively assuming that when a bit of PG_{locked,writeback} is cleared there might be waits on the bit. (reported by Linus, Hillf and syzbot) 3. Fix and clean bad style code e.i. unnecessarily introduced a randome pattern and so on. (pointed out by Linux) 4. Clean code for applying DEPT to wait_for_completion(). Changes from v6: 1. Tie to task scheduler code to track sleep and try_to_wake_up() assuming sleeps cause waits, try_to_wake_up()s would be the events that those are waiting for, of course with proper DEPT annotations, sdt_might_sleep_weak(), sdt_might_sleep_strong() and so on. For these cases, class is classified at sleep entrance rather than the synchronization initialization code. Which would extremely reduce false alarms. 2. Remove the DEPT associated instance in each page struct for tracking dependencies by PG_locked and PG_writeback thanks to the 1. work above. 3. Introduce CONFIG_DEPT_AGGRESIVE_TIMEOUT_WAIT to suppress reports that waits with timeout set are involved, for those who don't like verbose reporting. 4. Add a mechanism to refill the internal memory pools on running out so that DEPT could keep working as long as free memory is available in the system. 5. Re-enable tracking hashed-waitqueue wait. That's going to no longer generate false positives because class is classified at sleep entrance rather than the waitqueue initailization. 6. Refactor to make it easier to port onto each new version of the kernel. 7. Apply DEPT to dma fence. 8. Do trivial optimizaitions. Changes from v5: 1. Use just pr_warn_once() rather than WARN_ONCE() on the lack of internal resources because WARN_*() printing stacktrace is too much for informing the lack. (feedback from Ted, Hyeonggon) 2. Fix trivial bugs like missing initializing a struct
[PATCH v10 09/25] dept: Apply sdt_might_sleep_{start,end}() to swait
Makes Dept able to track dependencies by swaits. Signed-off-by: Byungchul Park --- include/linux/swait.h | 3 +++ 1 file changed, 3 insertions(+) diff --git a/include/linux/swait.h b/include/linux/swait.h index 6a8c22b8c2a5..02848211cef5 100644 --- a/include/linux/swait.h +++ b/include/linux/swait.h @@ -6,6 +6,7 @@ #include #include #include +#include #include /* @@ -161,6 +162,7 @@ extern void finish_swait(struct swait_queue_head *q, struct swait_queue *wait); struct swait_queue __wait; \ long __ret = ret; \ \ + sdt_might_sleep_start(NULL);\ INIT_LIST_HEAD(&__wait.task_list); \ for (;;) { \ long __int = prepare_to_swait_event(, &__wait, state);\ @@ -176,6 +178,7 @@ extern void finish_swait(struct swait_queue_head *q, struct swait_queue *wait); cmd;\ } \ finish_swait(, &__wait); \ + sdt_might_sleep_end(); \ __out: __ret; \ }) -- 2.17.1
[PATCH v10 17/25] dept: Track timeout waits separately with a new Kconfig
Waits with valid timeouts don't actually cause deadlocks. However, Dept has been reporting the cases as well because it's worth informing the circular dependency for some cases where, for example, timeout is used to avoid a deadlock but not meant to be expired. However, yes, there are also a lot of, even more, cases where timeout is used for its clear purpose and meant to be expired. Let Dept report these as an information rather than shouting DEADLOCK. Plus, introduced CONFIG_DEPT_AGGRESSIVE_TIMEOUT_WAIT Kconfig to make it optional so that any reports involving waits with timeouts can be turned on/off depending on the purpose. Signed-off-by: Byungchul Park --- include/linux/dept.h | 15 ++--- include/linux/dept_ldt.h | 6 ++-- include/linux/dept_sdt.h | 12 +--- kernel/dependency/dept.c | 66 ++-- lib/Kconfig.debug| 10 ++ 5 files changed, 89 insertions(+), 20 deletions(-) diff --git a/include/linux/dept.h b/include/linux/dept.h index 583e8fe2dd7b..0aa8d90558a9 100644 --- a/include/linux/dept.h +++ b/include/linux/dept.h @@ -270,6 +270,11 @@ struct dept_wait { * whether this wait is for commit in scheduler */ boolsched_sleep; + + /* +* whether a timeout is set +*/ + booltimeout; }; }; }; @@ -458,6 +463,7 @@ struct dept_task { boolstage_sched_map; const char *stage_w_fn; unsigned long stage_ip; + boolstage_timeout; /* * the number of missing ecxts @@ -496,6 +502,7 @@ struct dept_task { .stage_sched_map = false, \ .stage_w_fn = NULL, \ .stage_ip = 0UL,\ + .stage_timeout = false, \ .missing_ecxt = 0, \ .hardirqs_enabled = false, \ .softirqs_enabled = false, \ @@ -513,8 +520,8 @@ extern void dept_map_init(struct dept_map *m, struct dept_key *k, int sub_u, con extern void dept_map_reinit(struct dept_map *m, struct dept_key *k, int sub_u, const char *n); extern void dept_map_copy(struct dept_map *to, struct dept_map *from); -extern void dept_wait(struct dept_map *m, unsigned long w_f, unsigned long ip, const char *w_fn, int sub_l); -extern void dept_stage_wait(struct dept_map *m, struct dept_key *k, unsigned long ip, const char *w_fn); +extern void dept_wait(struct dept_map *m, unsigned long w_f, unsigned long ip, const char *w_fn, int sub_l, long timeout); +extern void dept_stage_wait(struct dept_map *m, struct dept_key *k, unsigned long ip, const char *w_fn, long timeout); extern void dept_request_event_wait_commit(void); extern void dept_clean_stage(void); extern void dept_stage_event(struct task_struct *t, unsigned long ip); @@ -566,8 +573,8 @@ struct dept_task { }; #define dept_map_reinit(m, k, su, n) do { (void)(n); (void)(k); } while (0) #define dept_map_copy(t, f)do { } while (0) -#define dept_wait(m, w_f, ip, w_fn, sl)do { (void)(w_fn); } while (0) -#define dept_stage_wait(m, k, ip, w_fn)do { (void)(k); (void)(w_fn); } while (0) +#define dept_wait(m, w_f, ip, w_fn, sl, t) do { (void)(w_fn); } while (0) +#define dept_stage_wait(m, k, ip, w_fn, t) do { (void)(k); (void)(w_fn); } while (0) #define dept_request_event_wait_commit() do { } while (0) #define dept_clean_stage() do { } while (0) #define dept_stage_event(t, ip)do { } while (0) diff --git a/include/linux/dept_ldt.h b/include/linux/dept_ldt.h index 062613e89fc3..8adf298dfcb8 100644 --- a/include/linux/dept_ldt.h +++ b/include/linux/dept_ldt.h @@ -27,7 +27,7 @@ else if (t) \ dept_ecxt_enter(m, LDT_EVT_L, i, "trylock", "unlock", sl);\ else { \ - dept_wait(m, LDT_EVT_L, i, "lock", sl); \ + dept_wait(m, LDT_EVT_L, i, "lock", sl, false); \ dept_ecxt_enter(m, LDT_EVT_L, i, "lock", "unlock", sl);\ } \ } while (0) @@ -39,7 +39,7 @@ else if (t) \ dept_ecxt_enter(m, LDT_EVT_R, i, "read_trylock", "read_unlock", sl);\ else {
[PATCH v10 04/25] dept: Add lock dependency tracker APIs
Wrapped the base APIs for easier annotation on typical lock. Signed-off-by: Byungchul Park --- include/linux/dept_ldt.h | 77 1 file changed, 77 insertions(+) create mode 100644 include/linux/dept_ldt.h diff --git a/include/linux/dept_ldt.h b/include/linux/dept_ldt.h new file mode 100644 index ..062613e89fc3 --- /dev/null +++ b/include/linux/dept_ldt.h @@ -0,0 +1,77 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Lock Dependency Tracker + * + * Started by Byungchul Park : + * + * Copyright (c) 2020 LG Electronics, Inc., Byungchul Park + */ + +#ifndef __LINUX_DEPT_LDT_H +#define __LINUX_DEPT_LDT_H + +#include + +#ifdef CONFIG_DEPT +#define LDT_EVT_L 1UL +#define LDT_EVT_R 2UL +#define LDT_EVT_W 1UL +#define LDT_EVT_RW (LDT_EVT_R | LDT_EVT_W) +#define LDT_EVT_ALL(LDT_EVT_L | LDT_EVT_RW) + +#define ldt_init(m, k, su, n) dept_map_init(m, k, su, n) +#define ldt_lock(m, sl, t, n, i) \ + do {\ + if (n) \ + dept_ecxt_enter_nokeep(m); \ + else if (t) \ + dept_ecxt_enter(m, LDT_EVT_L, i, "trylock", "unlock", sl);\ + else { \ + dept_wait(m, LDT_EVT_L, i, "lock", sl); \ + dept_ecxt_enter(m, LDT_EVT_L, i, "lock", "unlock", sl);\ + } \ + } while (0) + +#define ldt_rlock(m, sl, t, n, i, q) \ + do {\ + if (n) \ + dept_ecxt_enter_nokeep(m); \ + else if (t) \ + dept_ecxt_enter(m, LDT_EVT_R, i, "read_trylock", "read_unlock", sl);\ + else { \ + dept_wait(m, q ? LDT_EVT_RW : LDT_EVT_W, i, "read_lock", sl);\ + dept_ecxt_enter(m, LDT_EVT_R, i, "read_lock", "read_unlock", sl);\ + } \ + } while (0) + +#define ldt_wlock(m, sl, t, n, i) \ + do {\ + if (n) \ + dept_ecxt_enter_nokeep(m); \ + else if (t) \ + dept_ecxt_enter(m, LDT_EVT_W, i, "write_trylock", "write_unlock", sl);\ + else { \ + dept_wait(m, LDT_EVT_RW, i, "write_lock", sl); \ + dept_ecxt_enter(m, LDT_EVT_W, i, "write_lock", "write_unlock", sl);\ + } \ + } while (0) + +#define ldt_unlock(m, i) dept_ecxt_exit(m, LDT_EVT_ALL, i) + +#define ldt_downgrade(m, i)\ + do {\ + if (dept_ecxt_holding(m, LDT_EVT_W))\ + dept_map_ecxt_modify(m, LDT_EVT_W, NULL, LDT_EVT_R, i, "downgrade", "read_unlock", -1);\ + } while (0) + +#define ldt_set_class(m, n, k, sl, i) dept_map_ecxt_modify(m, LDT_EVT_ALL, k, 0UL, i, "lock_set_class", "(any)unlock", sl) +#else /* !CONFIG_DEPT */ +#define ldt_init(m, k, su, n) do { (void)(k); } while (0) +#define ldt_lock(m, sl, t, n, i) do { } while (0) +#define ldt_rlock(m, sl, t, n, i, q) do { } while (0) +#define ldt_wlock(m, sl, t, n, i) do { } while (0) +#define ldt_unlock(m, i) do { } while (0) +#define ldt_downgrade(m, i)do { } while (0) +#define ldt_set_class(m, n, k, sl, i) do { } while (0) +#endif +#endif /* __LINUX_DEPT_LDT_H */ -- 2.17.1
Re: [PATCH 6/9] drm: atmel_hlcdc: Add support for XLCDC in atmel LCD driver
Hi Manikandan, On Tue, Jun 13, 2023 at 12:34:23PM +0530, Manikandan Muralidharan wrote: > - XLCDC in SAM9X7 has different sets of registers and additional > configuration bits when compared to previous HLCDC IP. Read/write > operation on the controller registers is now separated using the > XLCDC status flag. > - HEO scaling, window resampling, Alpha blending, YUV-to-RGB > conversion in XLCDC is derived and handled using additional > configuration bits and registers. > - Writing one to the Enable fields of each layer in LCD_ATTRE > is required to reflect the values set in Configuration, FBA, Enable > registers of each layer In general things would benefit from a more clear separation between hlcdc and xlcdc. In several cases two functions would be better than testing as done today. See some more specific comments in the following. Sam > > Signed-off-by: Manikandan Muralidharan > [hari.prasat...@microchip.com: update the attribute field for each layer] > Signed-off-by: Hari Prasath Gujulan Elango > [durai.manicka...@microchip.com: implement status flag to seprate register > update] > Signed-off-by: Durai Manickam KR > --- > .../gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c| 28 +- > .../gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c | 297 ++ > 2 files changed, 256 insertions(+), 69 deletions(-) > > diff --git a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c > b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c > index 58184cd6ab0b..7c9cf7c0c75d 100644 > --- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c > +++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c > @@ -139,10 +139,10 @@ static void atmel_hlcdc_crtc_mode_set_nofb(struct > drm_crtc *c) > state = drm_crtc_state_to_atmel_hlcdc_crtc_state(c->state); > cfg = state->output_mode << 8; > > - if (adj->flags & DRM_MODE_FLAG_NVSYNC) > + if (!crtc->dc->is_xlcdc && (adj->flags & DRM_MODE_FLAG_NVSYNC)) > cfg |= ATMEL_HLCDC_VSPOL; > > - if (adj->flags & DRM_MODE_FLAG_NHSYNC) > + if (!crtc->dc->is_xlcdc && (adj->flags & DRM_MODE_FLAG_NHSYNC)) > cfg |= ATMEL_HLCDC_HSPOL; >From the above code I assume ATMEL_HLCDC_VSPOL and ATMEL_HLCDC_HSPOL are specific to HLCDC and not part of XLCDC register set. We can identify XLCDC specific registers as thy use "XLCDC" in the name. But the "HLCDC" specific registers are not easy to spot. Would it make sense to update the register definitions so we can see in the register names which at XLCDC, which are HLCDC and which a common (LCDC)? It would require a little code crunch to do so as all users would need an update. Dunno if this is worth it. But then at least a comment in the register definition file. > > regmap_update_bits(regmap, ATMEL_HLCDC_CFG(5), > @@ -177,6 +177,18 @@ static void atmel_hlcdc_crtc_atomic_disable(struct > drm_crtc *c, > > pm_runtime_get_sync(dev->dev); > > + if (crtc->dc->is_xlcdc) { > + regmap_write(regmap, ATMEL_HLCDC_DIS, ATMEL_XLCDC_CM); > + while (!regmap_read(regmap, ATMEL_HLCDC_SR, ) && > +(status & ATMEL_XLCDC_CM)) > + cpu_relax(); > + > + regmap_write(regmap, ATMEL_HLCDC_DIS, ATMEL_XLCDC_SD); > + while (!regmap_read(regmap, ATMEL_HLCDC_SR, ) && > +!(status & ATMEL_XLCDC_SD)) > + cpu_relax(); > + } A small helper atmel_xlcdc_disable()? > + > regmap_write(regmap, ATMEL_HLCDC_DIS, ATMEL_HLCDC_DISP); > while (!regmap_read(regmap, ATMEL_HLCDC_SR, ) && > (status & ATMEL_HLCDC_DISP)) > @@ -231,6 +243,18 @@ static void atmel_hlcdc_crtc_atomic_enable(struct > drm_crtc *c, > !(status & ATMEL_HLCDC_DISP)) > cpu_relax(); > > + if (crtc->dc->is_xlcdc) { > + regmap_write(regmap, ATMEL_HLCDC_EN, ATMEL_XLCDC_CM); > + while (!regmap_read(regmap, ATMEL_HLCDC_SR, ) && > +!(status & ATMEL_XLCDC_CM)) > + cpu_relax(); > + > + regmap_write(regmap, ATMEL_HLCDC_EN, ATMEL_XLCDC_SD); > + while (!regmap_read(regmap, ATMEL_HLCDC_SR, ) && > +(status & ATMEL_XLCDC_SD)) > + cpu_relax(); > + } A small helper atmel_xlcdc_enable()? > + > pm_runtime_put_sync(dev->dev); > > } > diff --git a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c > b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c > index daa508504f47..fe33476818c4 100644 > --- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c > +++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c > @@ -330,11 +330,59 @@ static void atmel_hlcdc_plane_setup_scaler(struct > atmel_hlcdc_plane *plane, >yfactor)); > } > > +static void atmel_xlcdc_plane_setup_scaler(struct atmel_hlcdc_plane *plane, > +struct
Re: [PATCH 06/15] dt-bindings: display/msm: sc7180-dpu: Describe SM6125
On 2023-06-26 20:57:51, Konrad Dybcio wrote: > On 26.06.2023 19:54, Marijn Suijten wrote: > > On 2023-06-26 18:16:58, Krzysztof Kozlowski wrote: > >> On 25/06/2023 21:52, Marijn Suijten wrote: > >>> On 2023-06-24 11:12:52, Krzysztof Kozlowski wrote: > On 24/06/2023 02:41, Marijn Suijten wrote: > > SM6125 is identical to SM6375 except that while downstream also defines > > a throttle clock, its presence results in timeouts whereas SM6375 > > requires it to not observe any timeouts. > > Then it should not be allowed, so you need either "else:" block or > another "if: properties: compatible:" to disallow it. Because in current > patch it would be allowed. > >>> > >>> That means this binding is wrong/incomplete for all other SoCs then. > >>> clock(-name)s has 6 items, and sets `minItems: 6`. Only for sm6375-dpu > > > > Of course meant to say that clock(-name)s has **7** items, not 6. > > > >>> does it set `minItems: 7`, but an else case is missing. > >> > >> Ask the author why it is done like this. > > > > Konrad, can you clarify why other (Looks like I forgot to complete this sentence before sending, apologies) > 6375 needs the throttle clk and the clock(-names) are strongly ordered > so having minItems: 6 discards the last entry The question is whether or not we should have maxItems: 6 to disallow the clock from being passed: right now it is optional and either is allowed for any !6375 SoC. - Marijn > > Konrad > > > >>> Shall I send a Fixes: ed41005f5b7c ("dt-bindings: display/msm: > >>> sc7180-dpu: Describe SM6350 and SM6375") for that, and should maxItems: > >>> 6 be the default under clock(-name)s or in an else:? > >> > >> There is no bug to fix. Or at least it is not yet known. Whether other > >> devices should be constrained as well - sure, sounds reasonable, but I > >> did not check the code exactly. > > > > I don't know either, but we need this information to decide whether to > > use `maxItems: 6`: > > > > 1. Directly on the property; > > 2. In an `else:` case on the current `if: sm6375-dpu` (should have the > >same effect as 1., afaik); > > 3. In a second `if:` case that lists all SoCS explicitly. > > > > Since we don't have this information, I think option 3. is the right way > > to go, setting `maxItems: 6` for qcom,sm6125-dpu. > > > > However, it is not yet understood why downstream is able to use the > > throttle clock without repercussions. > > > >> We talk here about this patch. > > > > We used this patch to discover that other SoCs are similarly > > unconstrained. But if you don't want me to look into it, by all means! > > Saves me a lot of time. So I will go with option 3. > > > > - Marijn
Re: [PATCH 3/9] drm: atmel-hlcdc: add LCD controller layer definition for SAM9X7
Hi Manikandan, On Tue, Jun 13, 2023 at 12:34:20PM +0530, Manikandan Muralidharan wrote: > Add the LCD controller layer definition and descriptor structure for SAM9X7 > for the following layers, > - Base Layer > - Overlay1 Layer > - Overlay2 Layer > - High End Overlay > > Signed-off-by: Manikandan Muralidharan > --- > drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c | 96 > 1 file changed, 96 insertions(+) > > diff --git a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c > b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c > index fa0f9a93d50d..d7ad828e9e8c 100644 > --- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c > +++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c > @@ -462,6 +462,98 @@ static const struct atmel_hlcdc_dc_desc > atmel_hlcdc_dc_sam9x60 = { > .layers = atmel_hlcdc_sam9x60_layers, > }; > > +static const struct atmel_hlcdc_layer_desc atmel_xlcdc_sam9x7_layers[] = { > + { > + .name = "base", > + .formats = _hlcdc_plane_rgb_formats, > + .regs_offset = 0x60, > + .id = 0, > + .type = ATMEL_HLCDC_BASE_LAYER, > + .cfgs_offset = 0x1c, > + .layout = { > + .xstride = { 2 }, > + .default_color = 3, > + .general_config = 4, > + .disc_pos = 5, > + .disc_size = 6, > + }, > + .clut_offset = 0x700, > + }, > + { > + .name = "overlay1", > + .formats = _hlcdc_plane_rgb_formats, > + .regs_offset = 0x160, > + .id = 1, > + .type = ATMEL_HLCDC_OVERLAY_LAYER, > + .cfgs_offset = 0x1c, > + .layout = { > + .pos = 2, > + .size = 3, > + .xstride = { 4 }, > + .pstride = { 5 }, > + .default_color = 6, > + .chroma_key = 7, > + .chroma_key_mask = 8, > + .general_config = 9, > + }, > + .clut_offset = 0xb00, > + }, > + { > + .name = "overlay2", > + .formats = _hlcdc_plane_rgb_formats, > + .regs_offset = 0x260, > + .id = 2, > + .type = ATMEL_HLCDC_OVERLAY_LAYER, > + .cfgs_offset = 0x1c, > + .layout = { > + .pos = 2, > + .size = 3, > + .xstride = { 4 }, > + .pstride = { 5 }, > + .default_color = 6, > + .chroma_key = 7, > + .chroma_key_mask = 8, > + .general_config = 9, > + }, > + .clut_offset = 0xf00, > + }, > + { > + .name = "high-end-overlay", > + .formats = _hlcdc_plane_rgb_and_yuv_formats, > + .regs_offset = 0x360, > + .id = 3, > + .type = ATMEL_HLCDC_OVERLAY_LAYER, > + .cfgs_offset = 0x30, > + .layout = { > + .pos = 2, > + .size = 3, > + .memsize = 4, > + .xstride = { 5, 7 }, > + .pstride = { 6, 8 }, > + .default_color = 9, > + .chroma_key = 10, > + .chroma_key_mask = 11, > + .general_config = 12, > + .csc = 16, > + .scaler_config = 23, > + }, > + .clut_offset = 0x1300, > + }, > +}; > + > +static const struct atmel_hlcdc_dc_desc atmel_xlcdc_dc_sam9x7 = { > + .min_width = 0, > + .min_height = 0, > + .max_width = 2048, > + .max_height = 2048, > + .max_spw = 0xff, > + .max_vpw = 0xff, > + .max_hpw = 0x3ff, > + .fixed_clksrc = true, > + .nlayers = ARRAY_SIZE(atmel_xlcdc_sam9x7_layers), > + .layers = atmel_xlcdc_sam9x7_layers, > +}; As already suggested by someone else, add is_xlcdc to struct atmel_hlcdc_dc_desc, so the check for the compatible can be dropped. It would be better to put it here. Sam
Re: [PATCH v1 0/2] udmabuf: Add back support for mapping hugetlb pages
On Mon, Jun 26, 2023 at 03:18:48PM -0300, Jason Gunthorpe wrote: > On Mon, Jun 26, 2023 at 08:14:27PM +0200, David Hildenbrand wrote: > > > So we might have to implement the same page migration as gup does on > > FOLL_LONGTERM here ... maybe there are more such cases/drivers that actually > > require that handling when simply taking pages out of the memfd, believing > > they can hold on to them forever. > > In general I would like to see an interface to FOLL_LONGTERM pin pages > from a memfd. I would quite happily use that in iommufd as well. > > It solves some problems we have there with fork/exec/etc if the pages > are not linked to a mm_struct. Afaiu any fd based approach should mean it'll never work with private memories, while mm-based should be able to work on any kind. Maybe that's not a problem - I assume at least udmabuf should just only work with shared memories; not sure on iommufd, though. -- Peter Xu
[PATCH] drm/mediatek: Fix potential memory leak if vmap() fail
Also return -ENOMEM if such a failure happens, the implement should take responsibility for the error handling. Signed-off-by: Sui Jingfeng --- drivers/gpu/drm/mediatek/mtk_drm_gem.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/mediatek/mtk_drm_gem.c b/drivers/gpu/drm/mediatek/mtk_drm_gem.c index a25b28d3ee90..9f364df52478 100644 --- a/drivers/gpu/drm/mediatek/mtk_drm_gem.c +++ b/drivers/gpu/drm/mediatek/mtk_drm_gem.c @@ -247,7 +247,11 @@ int mtk_drm_gem_prime_vmap(struct drm_gem_object *obj, struct iosys_map *map) mtk_gem->kvaddr = vmap(mtk_gem->pages, npages, VM_MAP, pgprot_writecombine(PAGE_KERNEL)); - + if (!mtk_gem->kvaddr) { + kfree(sgt); + kfree(mtk_gem->pages); + return -ENOMEM; + } out: kfree(sgt); iosys_map_set_vaddr(map, mtk_gem->kvaddr); -- 2.25.1
Re: [PATCH 06/15] dt-bindings: display/msm: sc7180-dpu: Describe SM6125
On 26.06.2023 19:54, Marijn Suijten wrote: > On 2023-06-26 18:16:58, Krzysztof Kozlowski wrote: >> On 25/06/2023 21:52, Marijn Suijten wrote: >>> On 2023-06-24 11:12:52, Krzysztof Kozlowski wrote: On 24/06/2023 02:41, Marijn Suijten wrote: > SM6125 is identical to SM6375 except that while downstream also defines > a throttle clock, its presence results in timeouts whereas SM6375 > requires it to not observe any timeouts. Then it should not be allowed, so you need either "else:" block or another "if: properties: compatible:" to disallow it. Because in current patch it would be allowed. >>> >>> That means this binding is wrong/incomplete for all other SoCs then. >>> clock(-name)s has 6 items, and sets `minItems: 6`. Only for sm6375-dpu > > Of course meant to say that clock(-name)s has **7** items, not 6. > >>> does it set `minItems: 7`, but an else case is missing. >> >> Ask the author why it is done like this. > > Konrad, can you clarify why other 6375 needs the throttle clk and the clock(-names) are strongly ordered so having minItems: 6 discards the last entry Konrad > >>> Shall I send a Fixes: ed41005f5b7c ("dt-bindings: display/msm: >>> sc7180-dpu: Describe SM6350 and SM6375") for that, and should maxItems: >>> 6 be the default under clock(-name)s or in an else:? >> >> There is no bug to fix. Or at least it is not yet known. Whether other >> devices should be constrained as well - sure, sounds reasonable, but I >> did not check the code exactly. > > I don't know either, but we need this information to decide whether to > use `maxItems: 6`: > > 1. Directly on the property; > 2. In an `else:` case on the current `if: sm6375-dpu` (should have the >same effect as 1., afaik); > 3. In a second `if:` case that lists all SoCS explicitly. > > Since we don't have this information, I think option 3. is the right way > to go, setting `maxItems: 6` for qcom,sm6125-dpu. > > However, it is not yet understood why downstream is able to use the > throttle clock without repercussions. > >> We talk here about this patch. > > We used this patch to discover that other SoCs are similarly > unconstrained. But if you don't want me to look into it, by all means! > Saves me a lot of time. So I will go with option 3. > > - Marijn
Re: [PATCH 03/15] dt-bindings: clock: qcom,dispcc-sm6125: Require GCC PLL0 DIV clock
On 2023-06-26 20:51:38, Marijn Suijten wrote: > > Not really, binding also defines the list of clocks - their order and > > specific entries. This changes. > > And so it does in "dt-bindings: clock: qcom,dispcc-sm6125: Remove unused > GCC_DISP_AHB_CLK"? Never mind: it is the last item so the order of the other items doesn't change. The total number of items decreases though, which sounds like an ABI-break too? - Marijn
Re: [PATCH 03/15] dt-bindings: clock: qcom,dispcc-sm6125: Require GCC PLL0 DIV clock
On 2023-06-26 20:29:37, Krzysztof Kozlowski wrote: > On 26/06/2023 19:49, Marijn Suijten wrote: > > On 2023-06-26 18:10:44, Krzysztof Kozlowski wrote: > >> On 25/06/2023 21:48, Marijn Suijten wrote: > >>> On 2023-06-24 11:08:54, Krzysztof Kozlowski wrote: > On 24/06/2023 03:45, Konrad Dybcio wrote: > > On 24.06.2023 02:41, Marijn Suijten wrote: > >> The "gcc_disp_gpll0_div_clk_src" clock is consumed by the driver, will > >> be passed from DT, and should be required by the bindings. > >> > >> Fixes: 8397c9c0c26b ("dt-bindings: clock: add QCOM SM6125 display > >> clock bindings") > >> Signed-off-by: Marijn Suijten > >> --- > > Ideally, you'd stick it at the bottom of the list, as the items: order > > is part of the ABI > > Yes, please add them to the end. Order is fixed. > >>> > >>> Disagreed for bindings that declare clock-names and when the driver > >>> adheres to it, see my reply to Konrad's message. > >> > >> That's the generic rule, with some exceptions of course. Whether one > >> chosen driver (chosen system and chosen version of that system) adheres > >> or not, does not change it. Other driver behaves differently and ABI is > >> for everyone, not only for your specific version of Linux driver. > >> > >> Follow the rule. > > > > This has no relation to the driver (just that our driver adheres to the > > bindings, as it is supposed to be). The bindings define a mapping from > > a clock-names=<> entry to a clock on the same index in the clocks=<> > > array. That relation remains the same with this change. > > Not really, binding also defines the list of clocks - their order and > specific entries. This changes. And so it does in "dt-bindings: clock: qcom,dispcc-sm6125: Remove unused GCC_DISP_AHB_CLK"? - Marijn
Re: [PATCH v3 00/17] Imagination Technologies PowerVR DRM driver
Hi Frank, I have added Merlijn who is doing a lot with PVRSGX for Maemo-Leste and the phone-devel list since most SoC we find using a PVRSGX are smartphone processors. > Am 26.06.2023 um 15:45 schrieb Frank Binns : > > Hi Nikolaus, > >> >> Some questions to the author of the new driver: >> - are there plans to support SGX5 (the predecessor of Rogue6)? > > We don't currently have any plans to support SGX. Our main focus is currently > on > Rogue and then we'll move onto Volcanic. Okay, that's completely understandable from a commercial perspective. > >> - will this be able to run the existing firmware and user-space code of >> pvrsrvkm? > > I'm afraid not. The interface for existing firmware and userspace code were > designed with different requirements in mind and don't cater to the kernel's > strict compatibility guarantees. As such, the uapi for this new driver is > very different to pvrsrvkm's, although naturally there are similarities: > https://gitlab.freedesktop.org/sarah-walker-imgtec/powervr/-/blob/dev/v3/include/uapi/drm/pvr_drm.h This makes sense. So the new Rogue/Volcanic and the older PVRSGX drivers should be able to coexist (at least in source code as there is no hardware having both). > We've also worked with our firmware team to make changes to the firmware > interface to better support this new driver. Specifically, parts of the > firmware > interface are no longer conditional on the GPU feature set / hardware > workarounds, meaning we now have a single interface which can work across all > existing Rogue GPUs, which makes things a lot easier for this new kernel > driver. That is what I have dreamed of for SGX as well. We could have replaced all the #if for specific versions and errata by some code to runtime check with the device tree for the specific SGX version. But this was never done because it is complex, difficult to automate and our means for testing things are limited. And we could not decide which DDK version we should build on as there is no common denominator for all SoC. > >> - or will it have new firmware and user-space code for these older chips? >> - or will there be open user-space code for SGX (and Rogue)? > > We're using the same Rogue firmware as our closed source driver, just with > modifications to the interface to cater for this new kernel driver. In terms > of Ok. Well, it was to be expected that SGX and Rogue firmware are quite different. > userspace, we already have a Vulkan driver upstreamed to Mesa: > https://gitlab.freedesktop.org/mesa/mesa/-/tree/main/src/imagination/vulkan Nice! > > and will be working to enable GLES support via Mesa's Zink GL(ES)-to-Vulkan > translation layer. This naturally limits support to Series 6 onwards, as > anything older isn't capable of supporting Vulkan. I see. > >> >>> Specifically I would ask that the DT bindings include all old and new >>> PowerVR >>> hardware in one go, unless they have very specific hardware definition >>> needs, >>> which I doubt. >> >> Our current bindings for all SoC with a SGX5 GPU inside and which have at >> least >> some official Linux support are here: >> >> https://github.com/openpvrsgx-devgroup/linux_openpvrsgx/blob/letux/omap-pvr-soc-glue-v8/Documentation/devicetree/bindings/gpu/img%2Cpvrsgx.yaml >> >> There was an attempt to upstream at least this plus glue code (i.e. device >> tree >> sources) some years ago but there was no consensus about the number and >> names of >> clocks that should be included in such a bindings document. > > I've taken a look and your bindings look very similar to the ones we've come > up > with. If you decide to attempt to upstream these again, please feel free to CC > me, Sarah and Donald (all on this email chain) so we can provide some > feedback. That is good! It would be a good moment to give it another try as we can have even more reviewers than before... > >> >>> Also I think they could use your help to get the proper firmware for the >>> older >>> hardware licensed properly from Imagination and included into linux-firmware >>> so they do not need to ship files on the side. >> >> Indeed, this and some "universal" user-space code would help a lot. Currently >> we have collected a lot of binaries for several architectures (e.g. Intel, >> OMAP, JZ4780), >> but all from different DDK versions and very different assumptions about >> system >> library versions. > > The way the SGX driver was designed means that it has to be built for a > specific > GPU, the version of the firmware, userspace driver(s) & kernel driver have to > exactly match and the build configuration has to match as well. Essentially, > we > don't have "universal" userspace code ourselves. With our focus being on Rogue > and beyond, we don't currently have any plans to work on this. Hm. This makes me wonder if it could be possible to open source the SGX code since it is a different architecture than Rogue, is no longer your focus and AFAIK the
[syzbot] Monthly dri report (Jun 2023)
Hello dri maintainers/developers, This is a 31-day syzbot report for the dri subsystem. All related reports/information can be found at: https://syzkaller.appspot.com/upstream/s/dri During the period, 3 new issues were detected and 0 were fixed. In total, 7 issues are still open and 30 have been fixed so far. Some of the still happening issues: Ref Crashes Repro Title <1> 297 Yes WARNING in drm_wait_one_vblank https://syzkaller.appspot.com/bug?extid=6f7fe2dbc479dca0ed17 <2> 32 Yes inconsistent lock state in sync_info_debugfs_show https://syzkaller.appspot.com/bug?extid=007bfe0f3330f6e1e7d1 <3> 16 NoWARNING in vkms_get_vblank_timestamp (2) https://syzkaller.appspot.com/bug?extid=93bd128a383695391534 <4> 2 Yes divide error in drm_mode_vrefresh https://syzkaller.appspot.com/bug?extid=622bba18029bcde672e1 --- This report is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkal...@googlegroups.com. To disable reminders for individual bugs, reply with the following command: #syz set no-reminders To change bug's subsystems, reply with: #syz set subsystems: new-subsystem You may send multiple commands in a single email message.
[PATCH v4 1/1] drm/doc: Document DRM device reset expectations
Create a section that specifies how to deal with DRM device resets for kernel and userspace drivers. Signed-off-by: André Almeida --- Documentation/gpu/drm-uapi.rst | 68 ++ 1 file changed, 68 insertions(+) diff --git a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-uapi.rst index 65fb3036a580..25a11b9b98fa 100644 --- a/Documentation/gpu/drm-uapi.rst +++ b/Documentation/gpu/drm-uapi.rst @@ -285,6 +285,74 @@ for GPU1 and GPU2 from different vendors, and a third handler for mmapped regular files. Threads cause additional pain with signal handling as well. +Device reset + + +The GPU stack is really complex and is prone to errors, from hardware bugs, +faulty applications and everything in between the many layers. Some errors +require resetting the device in order to make the device usable again. This +section describes what is the expectations for DRM and usermode drivers when a +device resets and how to propagate the reset status. + +Kernel Mode Driver +-- + +The KMD is responsible for checking if the device needs a reset, and to perform +it as needed. Usually a hang is detected when a job gets stuck executing. KMD +should keep track of resets, because userspace can query any time about the +reset stats for an specific context. This is needed to propagate to the rest of +the stack that a reset has happened. Currently, this is implemented by each +driver separately, with no common DRM interface. + +User Mode Driver + + +The UMD should check before submitting new commands to the KMD if the device has +been reset, and this can be checked more often if it requires to. After +detecting a reset, UMD will then proceed to report it to the application using +the appropriated API error code, as explained in the below section about +robustness. + +Robustness +-- + +The only way to try to keep an application working after a reset is if it +complies with the robustness aspects of the graphical API that it is using. + +Graphical APIs provide ways to application to deal with device resets. However, +there is no guarantee that the app will be correctly using such features, and +UMD can implement policies to close the app if it is a repeating offender, +likely in a broken loop. This is done to ensure that it does not keeps blocking +the user interface from being correctly displayed. This should be done even if +the app is correct but happens to trigger some bug in the hardware/driver. + +OpenGL +~~ + +Apps using OpenGL should use the available robust interfaces, like the +extension ``GL_ARB_robustness`` (or ``GL_EXT_robustness`` for OpenGL ES). This +interface tells if a reset has happened, and if so, all the context state is +considered lost and the app proceeds by creating new ones. If is possible to +determine that robustness is not in use, UMD will terminate the app when a reset +is detected, giving that the contexts are lost and the app won't be able to +figure this out and recreate the contexts. + +Vulkan +~~ + +Apps using Vulkan should check for ``VK_ERROR_DEVICE_LOST`` for submissions. +This error code means, among other things, that a device reset has happened and +it needs to recreate the contexts to keep going. + +Reporting resets causes +--- + +Apart from propagating the reset through the stack so apps can recover, it's +really useful for driver developers to learn more about what caused the reset in +first place. DRM devices should make use of devcoredump to store relevant +information about the reset, so this information can be added to user bug +reports. + .. _drm_driver_ioctl: IOCTL Support on Device Nodes -- 2.41.0
[PATCH v4 0/1] drm/doc: Document DRM device reset expectations
This v4 removes the common DRM ioctl, and adds just the documentation for now, giving the lack of a common "DRM context" infrascture make it hard to implement. v3: https://lore.kernel.org/lkml/20230621005719.836857-1-andrealm...@igalia.com/ Changes: - Drop the ioctl - Addresed comments com Pekka, as making the documentation more clear and consistent. André Almeida (1): drm/doc: Document DRM device reset expectations Documentation/gpu/drm-uapi.rst | 68 ++ 1 file changed, 68 insertions(+) -- 2.41.0
Re: [PATCH 03/15] dt-bindings: clock: qcom, dispcc-sm6125: Require GCC PLL0 DIV clock
On 26/06/2023 19:49, Marijn Suijten wrote: > On 2023-06-26 18:10:44, Krzysztof Kozlowski wrote: >> On 25/06/2023 21:48, Marijn Suijten wrote: >>> On 2023-06-24 11:08:54, Krzysztof Kozlowski wrote: On 24/06/2023 03:45, Konrad Dybcio wrote: > On 24.06.2023 02:41, Marijn Suijten wrote: >> The "gcc_disp_gpll0_div_clk_src" clock is consumed by the driver, will >> be passed from DT, and should be required by the bindings. >> >> Fixes: 8397c9c0c26b ("dt-bindings: clock: add QCOM SM6125 display clock >> bindings") >> Signed-off-by: Marijn Suijten >> --- > Ideally, you'd stick it at the bottom of the list, as the items: order > is part of the ABI Yes, please add them to the end. Order is fixed. >>> >>> Disagreed for bindings that declare clock-names and when the driver >>> adheres to it, see my reply to Konrad's message. >> >> That's the generic rule, with some exceptions of course. Whether one >> chosen driver (chosen system and chosen version of that system) adheres >> or not, does not change it. Other driver behaves differently and ABI is >> for everyone, not only for your specific version of Linux driver. >> >> Follow the rule. > > This has no relation to the driver (just that our driver adheres to the > bindings, as it is supposed to be). The bindings define a mapping from > a clock-names=<> entry to a clock on the same index in the clocks=<> > array. That relation remains the same with this change. Not really, binding also defines the list of clocks - their order and specific entries. This changes. Best regards, Krzysztof
Re: [PATCH v1 0/2] udmabuf: Add back support for mapping hugetlb pages
On Mon, Jun 26, 2023 at 08:14:27PM +0200, David Hildenbrand wrote: > So we might have to implement the same page migration as gup does on > FOLL_LONGTERM here ... maybe there are more such cases/drivers that actually > require that handling when simply taking pages out of the memfd, believing > they can hold on to them forever. In general I would like to see an interface to FOLL_LONGTERM pin pages from a memfd. I would quite happily use that in iommufd as well. It solves some problems we have there with fork/exec/etc if the pages are not linked to a mm_struct. Jason
Re: [PATCH v1 0/2] udmabuf: Add back support for mapping hugetlb pages
On 26.06.23 19:52, Peter Xu wrote: On Mon, Jun 26, 2023 at 07:45:37AM +, Kasireddy, Vivek wrote: Hi Peter, On Fri, Jun 23, 2023 at 06:13:02AM +, Kasireddy, Vivek wrote: Hi David, The first patch ensures that the mappings needed for handling mmap operation would be managed by using the pfn instead of struct page. The second patch restores support for mapping hugetlb pages where subpages of a hugepage are not directly used anymore (main reason for revert) and instead the hugetlb pages and the relevant offsets are used to populate the scatterlist for dma-buf export and for mmap operation. Testcase: default_hugepagesz=2M hugepagesz=2M hugepages=2500 options were passed to the Host kernel and Qemu was launched with these relevant options: qemu-system-x86_64 -m 4096m -device virtio-gpu-pci,max_outputs=1,blob=true,xres=1920,yres=1080 -display gtk,gl=on -object memory-backend-memfd,hugetlb=on,id=mem1,size=4096M -machine memory-backend=mem1 Replacing -display gtk,gl=on with -display gtk,gl=off above would exercise the mmap handler. While I think the VM_PFNMAP approach is much better and should fix that issue at hand, I thought more about missing memlock support and realized that we might have to fix something else. SO I'm going to raise the issue here. I think udmabuf chose the wrong interface to do what it's doing, that makes it harder to fix it eventually. Instead of accepting a range in a memfd, it should just have accepted a user space address range and then used pin_user_pages(FOLL_WRITE|FOLL_LONGTERM) to longterm-pin the pages "officially". Udmabuf indeed started off by using user space address range and GUP but the dma-buf subsystem maintainer had concerns with that approach in v2. It also had support for mlock in that version. Here is v2 and the relevant conversation: https://patchwork.freedesktop.org/patch/210992/?series=39879=2 So what's the issue? Udma effectively pins pages longterm ("possibly forever") simply by grabbing a reference on them. These pages might easily reside in ZONE_MOVABLE or in MIGRATE_CMA pageblocks. So what udmabuf does is break memory hotunplug and CMA, because it turns pages that have to remain movable unmovable. In the pin_user_pages(FOLL_LONGTERM) case we make sure to migrate these pages. See mm/gup.c:check_and_migrate_movable_pages() and especially folio_is_longterm_pinnable(). We'd probably have to implement something similar for udmabuf, where we detect such unpinnable pages and migrate them. The pages udmabuf pins are only those associated with Guest (GPU driver/virtio-gpu) resources (or buffers allocated and pinned from shmem via drm GEM). Some resources are short-lived, and some are long-lived and whenever a resource gets destroyed, the pages are unpinned. And, not all resources have their pages pinned. The resource that is pinned for the longest duration is the FB and that's because it is updated every ~16ms (assuming 1920x1080@60) by the Guest GPU driver. We can certainly pin/unpin the FB after it is accessed on the Host as a workaround, but I guess that may not be very efficient given the amount of churn it would create. Also, as far as migration or S3/S4 is concerned, my understanding is that all the Guest resources are destroyed and recreated again. So, wouldn't something similar happen during memory hotunplug? For example, pairing udmabuf with vfio (which pins pages using pin_user_pages(FOLL_LONGTERM)) in QEMU will most probably not work in all cases: if udmabuf longterm pinned the pages "the wrong way", vfio will fail to migrate them during FOLL_LONGTERM and consequently fail pin_user_pages(). As long as udmabuf holds a reference on these pages, that will never succeed. Dma-buf rules (for exporters) indicate that the pages only need to be pinned during the map_attachment phase (and until unmap attachment happens). In other words, only when the sg_table is created by udmabuf. I guess one option would be to not hold any references during UDMABUF_CREATE and only grab references to the pages (as and when it gets used) during this step. Would this help? IIUC the refcount is needed, otherwise I don't see what to protect the page from being freed and even reused elsewhere before map_attachment(). It seems the previous concern on using gup was majorly fork(), if this is it: https://patchwork.freedesktop.org/patch/210992/?series=39879=2#co mment_414213 Could it also be guarded by just making sure the pages are MAP_SHARED when creating the udmabuf, if fork() is a requirement of the feature? I had a feeling that the userspace still needs to always do the right thing to make it work, even using pure PFN mappings. For instance, what if the userapp just punchs a hole in the shmem/hugetlbfs file after creating the udmabuf (I see that F_SEAL_SHRINK is required, but at least not F_SEAL_WRITE with current impl), and fault a new page into the page cache? IIUC, Qemu creates and owns the memfd that is
Re: [PATCH v2] drm/bridge: ti-sn65dsi86: Fix auxiliary bus lifetime
Hi, On Tue, Jun 13, 2023 at 11:41 AM Stephen Boyd wrote: > > Quoting Douglas Anderson (2023-06-13 06:58:13) > > Memory for the "struct device" for any given device isn't supposed to > > be released until the device's release() is called. This is important > > because someone might be holding a kobject reference to the "struct > > device" and might try to access one of its members even after any > > other cleanup/uninitialization has happened. > > > > Code analysis of ti-sn65dsi86 shows that this isn't quite right. When > > the code was written, it was believed that we could rely on the fact > > that the child devices would all be freed before the parent devices > > and thus we didn't need to worry about a release() function. While I > > still believe that the parent's "struct device" is guaranteed to > > outlive the child's "struct device" (because the child holds a kobject > > reference to the parent), the parent's "devm" allocated memory is a > > different story. That appears to be freed much earlier. > > > > Let's make this better for ti-sn65dsi86 by allocating each auxiliary > > with kzalloc and then free that memory in the release(). > > > > Fixes: bf73537f411b ("drm/bridge: ti-sn65dsi86: Break GPIO and MIPI-to-eDP > > bridge into sub-drivers") > > Suggested-by: Stephen Boyd > > Signed-off-by: Douglas Anderson > > --- > > Reviewed-by: Stephen Boyd Pushed to drm-misc-fixes: 7aa83fbd712a drm/bridge: ti-sn65dsi86: Fix auxiliary bus lifetime
[PATCH] drm/tegra: Fix potential memory leak if vmap() fail
The vmap function called in the tegra_fbdev_probe() function could fail, It doesn't matther. But if the error handling is necessary, it should also free the resources allocated by drm_fb_helper_alloc_info() function and the gem buffer object allocated by tegra_bo_create(). Signed-off-by: Sui Jingfeng --- drivers/gpu/drm/tegra/fbdev.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/tegra/fbdev.c b/drivers/gpu/drm/tegra/fbdev.c index e74d9be981c7..d2152b5eb77b 100644 --- a/drivers/gpu/drm/tegra/fbdev.c +++ b/drivers/gpu/drm/tegra/fbdev.c @@ -141,6 +141,8 @@ static int tegra_fbdev_probe(struct drm_fb_helper *helper, destroy: drm_framebuffer_remove(fb); + drm_fb_helper_release_info(helper); + drm_gem_object_put(>gem); return err; } -- 2.25.1
Re: [PATCH 06/15] dt-bindings: display/msm: sc7180-dpu: Describe SM6125
On 2023-06-26 18:16:58, Krzysztof Kozlowski wrote: > On 25/06/2023 21:52, Marijn Suijten wrote: > > On 2023-06-24 11:12:52, Krzysztof Kozlowski wrote: > >> On 24/06/2023 02:41, Marijn Suijten wrote: > >>> SM6125 is identical to SM6375 except that while downstream also defines > >>> a throttle clock, its presence results in timeouts whereas SM6375 > >>> requires it to not observe any timeouts. > >> > >> Then it should not be allowed, so you need either "else:" block or > >> another "if: properties: compatible:" to disallow it. Because in current > >> patch it would be allowed. > > > > That means this binding is wrong/incomplete for all other SoCs then. > > clock(-name)s has 6 items, and sets `minItems: 6`. Only for sm6375-dpu Of course meant to say that clock(-name)s has **7** items, not 6. > > does it set `minItems: 7`, but an else case is missing. > > Ask the author why it is done like this. Konrad, can you clarify why other > > Shall I send a Fixes: ed41005f5b7c ("dt-bindings: display/msm: > > sc7180-dpu: Describe SM6350 and SM6375") for that, and should maxItems: > > 6 be the default under clock(-name)s or in an else:? > > There is no bug to fix. Or at least it is not yet known. Whether other > devices should be constrained as well - sure, sounds reasonable, but I > did not check the code exactly. I don't know either, but we need this information to decide whether to use `maxItems: 6`: 1. Directly on the property; 2. In an `else:` case on the current `if: sm6375-dpu` (should have the same effect as 1., afaik); 3. In a second `if:` case that lists all SoCS explicitly. Since we don't have this information, I think option 3. is the right way to go, setting `maxItems: 6` for qcom,sm6125-dpu. However, it is not yet understood why downstream is able to use the throttle clock without repercussions. > We talk here about this patch. We used this patch to discover that other SoCs are similarly unconstrained. But if you don't want me to look into it, by all means! Saves me a lot of time. So I will go with option 3. - Marijn
Re: [PATCH v1 0/2] udmabuf: Add back support for mapping hugetlb pages
On Mon, Jun 26, 2023 at 07:45:37AM +, Kasireddy, Vivek wrote: > Hi Peter, > > > > > On Fri, Jun 23, 2023 at 06:13:02AM +, Kasireddy, Vivek wrote: > > > Hi David, > > > > > > > > The first patch ensures that the mappings needed for handling mmap > > > > > operation would be managed by using the pfn instead of struct page. > > > > > The second patch restores support for mapping hugetlb pages where > > > > > subpages of a hugepage are not directly used anymore (main reason > > > > > for revert) and instead the hugetlb pages and the relevant offsets > > > > > are used to populate the scatterlist for dma-buf export and for > > > > > mmap operation. > > > > > > > > > > Testcase: default_hugepagesz=2M hugepagesz=2M hugepages=2500 > > > > options > > > > > were passed to the Host kernel and Qemu was launched with these > > > > > relevant options: qemu-system-x86_64 -m 4096m > > > > > -device virtio-gpu-pci,max_outputs=1,blob=true,xres=1920,yres=1080 > > > > > -display gtk,gl=on > > > > > -object memory-backend-memfd,hugetlb=on,id=mem1,size=4096M > > > > > -machine memory-backend=mem1 > > > > > > > > > > Replacing -display gtk,gl=on with -display gtk,gl=off above would > > > > > exercise the mmap handler. > > > > > > > > > > > > > While I think the VM_PFNMAP approach is much better and should fix > > that > > > > issue at hand, I thought more about missing memlock support and > > realized > > > > that we might have to fix something else. SO I'm going to raise the > > > > issue here. > > > > > > > > I think udmabuf chose the wrong interface to do what it's doing, that > > > > makes it harder to fix it eventually. > > > > > > > > Instead of accepting a range in a memfd, it should just have accepted a > > > > user space address range and then used > > > > pin_user_pages(FOLL_WRITE|FOLL_LONGTERM) to longterm-pin the > > pages > > > > "officially". > > > Udmabuf indeed started off by using user space address range and GUP > > but > > > the dma-buf subsystem maintainer had concerns with that approach in v2. > > > It also had support for mlock in that version. Here is v2 and the relevant > > > conversation: > > > https://patchwork.freedesktop.org/patch/210992/?series=39879=2 > > > > > > > > > > > So what's the issue? Udma effectively pins pages longterm ("possibly > > > > forever") simply by grabbing a reference on them. These pages might > > > > easily reside in ZONE_MOVABLE or in MIGRATE_CMA pageblocks. > > > > > > > > So what udmabuf does is break memory hotunplug and CMA, because it > > > > turns > > > > pages that have to remain movable unmovable. > > > > > > > > In the pin_user_pages(FOLL_LONGTERM) case we make sure to migrate > > > > these > > > > pages. See mm/gup.c:check_and_migrate_movable_pages() and > > especially > > > > folio_is_longterm_pinnable(). We'd probably have to implement > > something > > > > similar for udmabuf, where we detect such unpinnable pages and > > migrate > > > > them. > > > The pages udmabuf pins are only those associated with Guest (GPU > > driver/virtio-gpu) > > > resources (or buffers allocated and pinned from shmem via drm GEM). > > Some > > > resources are short-lived, and some are long-lived and whenever a > > resource > > > gets destroyed, the pages are unpinned. And, not all resources have their > > pages > > > pinned. The resource that is pinned for the longest duration is the FB and > > that's > > > because it is updated every ~16ms (assuming 1920x1080@60) by the Guest > > > GPU driver. We can certainly pin/unpin the FB after it is accessed on the > > Host > > > as a workaround, but I guess that may not be very efficient given the > > amount > > > of churn it would create. > > > > > > Also, as far as migration or S3/S4 is concerned, my understanding is that > > > all > > > the Guest resources are destroyed and recreated again. So, wouldn't > > something > > > similar happen during memory hotunplug? > > > > > > > > > > > > > > > For example, pairing udmabuf with vfio (which pins pages using > > > > pin_user_pages(FOLL_LONGTERM)) in QEMU will most probably not work > > in > > > > all cases: if udmabuf longterm pinned the pages "the wrong way", vfio > > > > will fail to migrate them during FOLL_LONGTERM and consequently fail > > > > pin_user_pages(). As long as udmabuf holds a reference on these pages, > > > > that will never succeed. > > > Dma-buf rules (for exporters) indicate that the pages only need to be > > pinned > > > during the map_attachment phase (and until unmap attachment happens). > > > In other words, only when the sg_table is created by udmabuf. I guess one > > > option would be to not hold any references during UDMABUF_CREATE and > > > only grab references to the pages (as and when it gets used) during this > > step. > > > Would this help? > > > > IIUC the refcount is needed, otherwise I don't see what to protect the page > > from being freed and even reused elsewhere before map_attachment(). > > > > It seems the previous concern on using gup
Re: [PATCH 03/15] dt-bindings: clock: qcom,dispcc-sm6125: Require GCC PLL0 DIV clock
On 2023-06-26 18:10:44, Krzysztof Kozlowski wrote: > On 25/06/2023 21:48, Marijn Suijten wrote: > > On 2023-06-24 11:08:54, Krzysztof Kozlowski wrote: > >> On 24/06/2023 03:45, Konrad Dybcio wrote: > >>> On 24.06.2023 02:41, Marijn Suijten wrote: > The "gcc_disp_gpll0_div_clk_src" clock is consumed by the driver, will > be passed from DT, and should be required by the bindings. > > Fixes: 8397c9c0c26b ("dt-bindings: clock: add QCOM SM6125 display clock > bindings") > Signed-off-by: Marijn Suijten > --- > >>> Ideally, you'd stick it at the bottom of the list, as the items: order > >>> is part of the ABI > >> > >> Yes, please add them to the end. Order is fixed. > > > > Disagreed for bindings that declare clock-names and when the driver > > adheres to it, see my reply to Konrad's message. > > That's the generic rule, with some exceptions of course. Whether one > chosen driver (chosen system and chosen version of that system) adheres > or not, does not change it. Other driver behaves differently and ABI is > for everyone, not only for your specific version of Linux driver. > > Follow the rule. This has no relation to the driver (just that our driver adheres to the bindings, as it is supposed to be). The bindings define a mapping from a clock-names=<> entry to a clock on the same index in the clocks=<> array. That relation remains the same with this change. - Marijn
Re: [PATCH 03/15] dt-bindings: clock: qcom,dispcc-sm6125: Require GCC PLL0 DIV clock
On 2023-06-26 18:15:13, Krzysztof Kozlowski wrote: > On 26/06/2023 16:26, Marijn Suijten wrote: > > On 2023-06-26 11:43:39, Konrad Dybcio wrote: > >> On 25.06.2023 21:48, Marijn Suijten wrote: > >>> On 2023-06-24 03:45:02, Konrad Dybcio wrote: > On 24.06.2023 02:41, Marijn Suijten wrote: > > The "gcc_disp_gpll0_div_clk_src" clock is consumed by the driver, will > > be passed from DT, and should be required by the bindings. > > > > Fixes: 8397c9c0c26b ("dt-bindings: clock: add QCOM SM6125 display clock > > bindings") > > Signed-off-by: Marijn Suijten > > --- > Ideally, you'd stick it at the bottom of the list, as the items: order > is part of the ABI > >>> > >>> This isn't an ABI break, as this driver nor its bindings require/declare > >>> a fixed order: they declare a relation between clocks and clock-names. > >> Bindings describe the ABI, drivers implement compliant code flow. > > > > That is how bindings are supposed to be... However typically the driver > > is written/ported first and then the bindings are simply created to > > Your development process does not matter for the bindings. Whatever you > decide to do "typically" is your choice, although of course I understand > why you do it like that. You can argument the same that "I never create > bindings in my process, so the driver defines the ABI". This is not my process, nor my choice. > > reflect this, and sometimes (as is the case with this patch) > > incorrectly. > > > > That, together with a lack of DTS and known-working device with it > > (which is why I'm submitting driver+bindings+dts in one series now!) > > makes us shoot ourselves in the foot by locking everyone into an ABI > > that makes no sense. > > No one is locked into the ABI. SoC maintainer decides on this. However > unjustified ABI breaking or not caring about it at all is not the way to > go. It is not the correct process. It definitely sounds like it. > >>> This orders the GCC clock just like other dispccs. And the previous > >>> patch dropped the unused cfg_ahb_clk from the bindings, so all bets are > >>> off anyway. > >> Thinking about it again, the binding has not been consumed by any upstream > >> DT to date, so it should (tm) be fine to let it slide.. > > > > Exactly, I hope/doubt anyone was already using these incomplete > > bindings. And again: the ABI here is the name->phandle mapping, the > > order Does Not Matter™. > > No, it's not. Your one driver does not define the ABI. There are many > different drivers, many different operating systems and other software > components. You missed the point entirely. The point is that someone contributed a dt-bindings patch that does not represent the hardware (hence not the driver for that hardware either). It was taken without objection. So now what? We are stuck with a broken ABI that does not allow us to describe the hardware? If there are many different drivers and other OSes, why are we solely responsible for describing broken bindings? Why were there no objections elsewhere that this does not allow us to describe the hardware in question? Note that the person signing off on and sending that initial dt-bindings patch for qcom,dispcc-sm6125.yaml is me, by the way. - Marijn
Re: [PATCH v4] drm/vkms: Add support to 1D gamma LUT
Hi Pekka, On 6/26/23 05:17, Pekka Paalanen wrote: On Sat, 24 Jun 2023 18:48:08 -0300 Maira Canal wrote: Hi Arthur, Thanks for working on this feature for the VKMS! On 6/21/23 16:41, Arthur Grillo wrote: Support a 1D gamma LUT with interpolation for each color channel on the VKMS driver. Add a check for the LUT length by creating vkms_atomic_check(). Tested with: igt@kms_color@gamma igt@kms_color@legacy-gamma igt@kms_color@invalid-gamma-lut-sizes Could you also mention that this will make it possible to run the test igt@kms_plane@pixel-format? Also, you mentioned to me that the performance degraded with this new feature, but I wasn't able to see it while running the VKMS CI. I performed a couple of tests and I didn't see any significant performance issue. Could you please run a benchmark and share the results with us? This way we can atest that this new feature will not affect significantly the VKMS performance. It would be nice to have a small brief of this benchmark on the commit message as well. Attesting that there isn't a performance issue and adding those nits to the commit message, you can add my Reviewed-by: Maíra Canal on the next version. Hi, perfomance testing is good indeed. As future work, could there be a document describing how to test VKMS performance? I'll try to select a couple of more meaningful IGT tests to describe how to test the VKMS performance and also add a document to describe how to run this tests. Recently, I added a VKMS must-pass testlist to IGT. This testlist tries to assure that regressions will not be introduced into VKMS. But, I failed to introduce a documentation on the kernel side pointing to this new testlist... I'll also work on it. "I ran IGT@blah 100 times and it took xx seconds before and yy seconds after" does not really give someone like me an idea of what was actually measured. For example blending overhead increase could be completely lost in opaque pixel copying noise if the test case has only few pixels to blend, e.g. a cursor plane, not to mention the overhead of launching an IGT test in the first place. About the IGT overhead, I don't know exactly how we could escape from it. Maybe writing KUnit tests to the VKMS's composition functions, such as blend(). Anyway, we would have the overhead of the KUnit framework. I mean, for whatever framework we choose, there'll be an overhead... Do you have any other ideas on how to test VKMS with less overhead? Best Regards, - Maíra Something that would guide new developers in running meaningful benchmarks would be nice. Should e.g. IGT have explicit (VKMS) performance tests that need to be run manually, since evaluation of the result is not feasible automatically? Or a benchmark mode in correctness tests that would run the identical operation N times and measure the time before checking for correctness? The correctness verification in IGT tests, if done by image comparison which they undoubtedly will need to be in the future, may dominate the CPU run time measurements if included. Thanks, pq
Re: [PATCH 1/9] dt-bindings: mfd: Add bindings for SAM9X7 LCD controller
On Mon, Jun 26, 2023 at 05:31:59AM +, manikanda...@microchip.com wrote: > On 21/06/23 13:17, Nicolas Ferre wrote: > > On 16/06/2023 at 08:44, Manikandan M - I67131 wrote: > >> On 14/06/23 20:10, Nicolas Ferre wrote: > >>> On 13/06/2023 at 20:21, Conor Dooley wrote: > On Tue, Jun 13, 2023 at 08:17:13PM +0200, Krzysztof Kozlowski wrote: > > On 13/06/2023 09:04, Manikandan Muralidharan wrote: > >> Add new compatible string for the XLCD controller on SAM9X7 SoC. > >> > >> Signed-off-by: Manikandan Muralidharan > >> --- > >> Documentation/devicetree/bindings/mfd/atmel-hlcdc.txt | 1 + > >> 1 file changed, 1 insertion(+) > >> > >> diff --git a/Documentation/devicetree/bindings/mfd/atmel-hlcdc.txt > >> b/Documentation/devicetree/bindings/mfd/atmel-hlcdc.txt > >> index 5f8880cc757e..7c77b6bf4adb 100644 > >> --- a/Documentation/devicetree/bindings/mfd/atmel-hlcdc.txt > >> +++ b/Documentation/devicetree/bindings/mfd/atmel-hlcdc.txt > >> @@ -8,6 +8,7 @@ Required properties: > >> "atmel,sama5d3-hlcdc" > >> "atmel,sama5d4-hlcdc" > >> "microchip,sam9x60-hlcdc" > >> + "microchip,sam9x7-xlcdc" > > Google says sam9x7 is a series, not a SoC. Please add compatibles for > > specific SoCs, not for series. > We had this one a few weeks ago, see > https://lore.kernel.org/all/add5e49e-8416-ba9f-819a-da944938c...@microchip.com/ > and its parents. Outcome of that seemed to be that using "sam9x7" was > fine. > >>> > >>> And it's where it begins to be funny, as the LCD is precisely one aspect > >>> where we differentiate between sam9x75, sam9x72 and sam9x70... > >>> So please Manikandan sort this out if difference between these chips > >>> will be better handled with different compatibility string, in > >>> particular about //, LVDS and MIPI-DSI variants! > >> Yes Sure, I will replace the compatible as s/sam9x7/sam9x75/g to handle > >> the different variants of sam9x7 better. > > > > Moving to sam9x75 is probably good. But what is your plan for > > differentiating parallel and LVDS (on sam9x72) and precisely this > > sam9x75 variant which in addition has MIPI-DSI? > In sam9x75 with support for LVDS and MIPI, Parallel interfacing > peripherals, the additions performed on the LCD controller driver is the > same.Considering the LCDC IP used in sam9x75, there are no registers > sets that needs modification per connecting peripheral variants, only > the clock and DRM_ENCODER_MODE_XXX (set by connecting peripheral driver) > differs, which can be handled in DT, atmel-lcdc MFD driver and > peripheral driver. > > In future, sam9x72 with XLCD controller can be differentiated with > sam9x72 compatible string and with a IP version flag(is_xlcdc_v2) in its > driver data if an upgraded XLCD IP is used with difference in bits or > register set exist compared to current IP. Trying to covert that into what the binding will look like... sam9x72 & sam9x75 each get their own compatibles for the lcd controller. From there, we permit `compatible = "microchip,sam9x75-foo"` in isolation. It *sounds* like the basic featureset of the sam9x75 is compatible with the sam9x72, so for that we permit `compatible = "microchip,sam9x72-foo", "microchip,sam9x75-foo"`. Although, if the hardware for the sam9x72 isn't set in stone yet, it might be best to hold off on documenting it until things settle down, and only add the sam9x75 for now. Cheers, Conor. signature.asc Description: PGP signature
Re: [PATCH v2 0/2] Add MHI quirk for QAIC
On 6/8/2023 5:59 AM, Manivannan Sadhasivam wrote: On Fri, May 19, 2023 at 10:39:00AM -0600, Jeffrey Hugo wrote: With the QAIC driver in -next, I'd like to suggest some MHI changes that specific to AIC100 devices, but perhaps provide a framework for other device oddities. AIC100 devices technically violate the MHI spec in two ways. Sadly, these issues comes from the device hardware, so host SW needs to work around them. Thie first issue, presented in this series, has to do with the SOC_HW_VERSION register. This register is suposed to be initialized by the hardware prior to the MHI being accessable by the host to contain a version string for the SoC of the device. This could be used by the host MHI controller software to identify and handle version to version changes. The AIC100 hardware does not initialize this register, and thus it contains garbage. This would not be much of a problem normally - the QAIC driver would just never use it. However the MHI stack uses this register as part of the init sequence and if the controller reports that the register is inaccessable then the init sequence fails. On some AIC100 cards, the garbage value ends up being 0x which is PCIe spec defined to be a special value indicating the access failed. The MHI controller cannot tell if that value is a PCIe link issue, or just garbage. QAIC needs a way to tell MHI not to use this register. Other buses have a quirk mechanism - a way to describe oddities in a particular implementation that have some kind of workaround. Since this seems to be the first need for such a thing in MHI, introduce a quirk framework. The second issue AIC100 has involves the PK Hash registers. A solution for this is expected to be proposed in the near future and is anticipated to make use of the quirk framework proposed here. With PK Hash, there are two oddities to handle. AIC100 does not initialize these registers until the SBL is running, which is later than the spec indicates, and in practice is after MHI reads/caches them. Also, AIC100 does not have enough registers defined to fully report the 5 PK Hash slots, so a custom reporting format is defined by the device. Looking at the two issues you reported above, it looks to me that they can be handled inside the aic100 mhi_controller driver itself. Since the MHI stack exports the read_reg callback to controller drivers, if some registers are not supported by the device, then the callback can provide some fixed dummy data emulating the register until the issue is fixed in the device (if at all). Quirk framework could be useful if the device misbehaves against the protocol itself but for the register issues like this, I think the controller driver can handle itself. What do you think? I think for the HW_VERSION register, your suggestion is very good, and something I plan to adopt. For the PK Hash registers, I don't think it quite works. HW_VERSION I can hard code to a valid value, or just stub out to 0 since that appears to be only consumed by the MHI Controller, and we don't use it. The PK Hash registers are programmed into the SoC, and can be unique from SoC to SoC. I don't see how the driver can provide valid, but faked information for them. Also, the user consumes this data via sysfs. We'd like to give the data to the user, and we can't fake it. Also the data is dynamic. Lets start with the dynamic data issue. Right now MHI reads these registers once, and caches the values. I would propose a quirk to change that behavior for AIC100, but does MHI really need to operate in a "read once" mode? Would something actually break if MHI read the registers every time the sysfs node is accessed? Then sysfs would display the latest data, which would be beneficial to AIC100 and should not be a behavior change for other devices which have static data (MHI just displays the same data because it hasn't changed). Do you recall the reason behind making the PK Hash registers read once and cached? - Mani v2: -Fix build error -Fix typo in commit text Jeffrey Hugo (2): bus: mhi: host: Add quirk framework and initial quirk accel/qaic: Add MHI_QUIRK_SOC_HW_VERSION_UNRELIABLE drivers/accel/qaic/mhi_controller.c | 1 + drivers/bus/mhi/host/init.c | 13 + include/linux/mhi.h | 18 ++ 3 files changed, 28 insertions(+), 4 deletions(-) -- 2.40.1
[PATCH] drm/omap: Checking the mapping returned by vmap()
Because vmap() function could fail. Also don't let omap_gem_vaddr() function signature (declaration) dangling there, as it will only get defined when CONFIG_DRM_FBDEV_EMULATION=y. Signed-off-by: Sui Jingfeng --- drivers/gpu/drm/omapdrm/omap_fbdev.c | 10 -- drivers/gpu/drm/omapdrm/omap_gem.h | 3 +++ 2 files changed, 11 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/omapdrm/omap_fbdev.c b/drivers/gpu/drm/omapdrm/omap_fbdev.c index b7ccce0704a3..2c88aa1008d8 100644 --- a/drivers/gpu/drm/omapdrm/omap_fbdev.c +++ b/drivers/gpu/drm/omapdrm/omap_fbdev.c @@ -197,6 +197,11 @@ static int omap_fbdev_create(struct drm_fb_helper *helper, drm_fb_helper_fill_info(fbi, helper, sizes); fbi->screen_buffer = omap_gem_vaddr(bo); + if (!fbi->screen_buffer) { + ret = -ENOMEM; + goto err_release_fbi; + } + fbi->screen_size = bo->size; fbi->fix.smem_start = dma_addr; fbi->fix.smem_len = bo->size; @@ -210,14 +215,15 @@ static int omap_fbdev_create(struct drm_fb_helper *helper, fbi->fix.ywrapstep = 1; } - DBG("par=%p, %dx%d", fbi->par, fbi->var.xres, fbi->var.yres); DBG("allocated %dx%d fb", fb->width, fb->height); return 0; -fail: +err_release_fbi: + drm_fb_helper_release_info(helper); +fail: if (ret) { if (fb) drm_framebuffer_remove(fb); diff --git a/drivers/gpu/drm/omapdrm/omap_gem.h b/drivers/gpu/drm/omapdrm/omap_gem.h index 4d4488939f6b..7e8c41b72aae 100644 --- a/drivers/gpu/drm/omapdrm/omap_gem.h +++ b/drivers/gpu/drm/omapdrm/omap_gem.h @@ -48,7 +48,10 @@ struct drm_gem_object *omap_gem_new_dmabuf(struct drm_device *dev, size_t size, struct sg_table *sgt); int omap_gem_new_handle(struct drm_device *dev, struct drm_file *file, union omap_gem_size gsize, u32 flags, u32 *handle); + +#ifdef CONFIG_DRM_FBDEV_EMULATION void *omap_gem_vaddr(struct drm_gem_object *obj); +#endif /* Dumb Buffers Interface */ int omap_gem_dumb_map_offset(struct drm_file *file, struct drm_device *dev, -- 2.25.1
Re: [PATCH 1/5] drm/panfrost: Stop using drm_gem_shmem_put_pages()
On Mon, 26 Jun 2023 19:06:55 +0300 Dmitry Osipenko wrote: > On 6/26/23 18:43, Boris Brezillon wrote: > > On Mon, 26 Jun 2023 16:20:53 +0300 > > Dmitry Osipenko wrote: > > > >> On 6/26/23 15:02, Boris Brezillon wrote: > >>> -err_pages: > >>> - drm_gem_shmem_put_pages(>base); > >>> err_unlock: > >>> dma_resv_unlock(obj->resv); > >>> + > >>> + if (ret && pinned) > >>> + drm_gem_shmem_unpin(>base); > >> > >> The drm_gem_shmem_unpin() was supposed to be used only in conjunction > >> with drm_gem_shmem_pin(). I've a pending patch to enable the pin/unpin > >> refcounting needed by drm-shmem shrinker, it will prohibit invocation of > >> unpin without a previous pin. > >> > >> I'm wondering whether it will be okay to simply remove > >> drm_gem_shmem_put_pages() from the Panfrost code, letting pages to be > >> kept allocated in a error case. They will be freed once BO is destroyed. > >> > > > > Okay, so after looking at your shmem-shrinker series, I confirm we need > > to take a pin ref here (hard-pin), otherwise the buffer might be > > evicted before the GPU is done, especially after you drop gpu_usecount > > and use only pin_count to check whether a GEM object can be evicted or > > not. > > See the drm_gem_evict() [1], it checks whether GEM is busy, preventing > BO eviction while it is in-use by GPU. Note that in case of Panfrost, > shrinker isn't enabled for growable BOs. Okay, we should be good then, sorry for the confusion.
Re: [PATCH 06/15] dt-bindings: display/msm: sc7180-dpu: Describe SM6125
On 25/06/2023 21:52, Marijn Suijten wrote: > On 2023-06-24 11:12:52, Krzysztof Kozlowski wrote: >> On 24/06/2023 02:41, Marijn Suijten wrote: >>> SM6125 is identical to SM6375 except that while downstream also defines >>> a throttle clock, its presence results in timeouts whereas SM6375 >>> requires it to not observe any timeouts. >> >> Then it should not be allowed, so you need either "else:" block or >> another "if: properties: compatible:" to disallow it. Because in current >> patch it would be allowed. > > That means this binding is wrong/incomplete for all other SoCs then. > clock(-name)s has 6 items, and sets `minItems: 6`. Only for sm6375-dpu > does it set `minItems: 7`, but an else case is missing. Ask the author why it is done like this. > > Shall I send a Fixes: ed41005f5b7c ("dt-bindings: display/msm: > sc7180-dpu: Describe SM6350 and SM6375") for that, and should maxItems: > 6 be the default under clock(-name)s or in an else:? There is no bug to fix. Or at least it is not yet known. Whether other devices should be constrained as well - sure, sounds reasonable, but I did not check the code exactly. We talk here about this patch. Best regards, Krzysztof
Re: [RFC PATCH v3 1/4] drm/doc: Document DRM device reset expectations
Em 22/06/2023 05:12, Pekka Paalanen escreveu: On Wed, 21 Jun 2023 13:28:34 -0300 André Almeida wrote: Em 21/06/2023 04:58, Pekka Paalanen escreveu: On Tue, 20 Jun 2023 21:57:16 -0300 André Almeida wrote: Create a section that specifies how to deal with DRM device resets for kernel and userspace drivers. Signed-off-by: André Almeida Hi André, nice to see this! I ended up giving lots of grammar comments, but I'm not a native speaker. Generally it looks good to me. Thank you for your feedback :) --- Documentation/gpu/drm-uapi.rst | 65 ++ 1 file changed, 65 insertions(+) diff --git a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-uapi.rst index 65fb3036a580..da4f8a694d8d 100644 --- a/Documentation/gpu/drm-uapi.rst +++ b/Documentation/gpu/drm-uapi.rst @@ -285,6 +285,71 @@ for GPU1 and GPU2 from different vendors, and a third handler for mmapped regular files. Threads cause additional pain with signal handling as well. +Device reset + + +The GPU stack is really complex and is prone to errors, from hardware bugs, +faulty applications and everything in between the many layers. To recover +from this kind of state, sometimes is needed to reset the device. This section It seems unclear what "this kind of state" refers to, so maybe just write "errors"? Maybe: Some errors require resetting the device in order to make the device usable again. I presume that recovery does not mean that the failed job could recover. +describes what's the expectations for DRM and usermode drivers when a device +resets and how to propagate the reset status. + +Kernel Mode Driver +-- + +The KMD is responsible for checking if the device needs a reset, and to perform +it as needed. Usually a hung is detected when a job gets stuck executing. KMD s/hung/hang/ ? +then update it's internal reset tracking to be ready when userspace asks the updates its "update reset tracking"... do you mean that KMD records information about the reset in case userspace asks for it later? Yes, kernel drivers do annotate whenever a reset happens, so it can report to userspace when it asks about resets. For instance, this is the amdgpu implementation of AMDGPU_CTX_OP_QUERY_STATE2: https://elixir.bootlin.com/linux/v6.3.8/source/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c#L548 You can see there stored information about resets. Hi André, right. What I mean is, if I have to ask this, then that implies that the wording could be more clear. I don't know if "reset tracking" is some sub-system that is turned on and off as needed or what updating it would mean. Understood, I'll rewrite it to be more clear. +kernel about reset information. Drivers should implement the DRM_IOCTL_GET_RESET +for that. At this point, I'm not sure what "reset tracking" or "reset information" entails. Could something be said about those? >> + +User Mode Driver + + +The UMD should check before submitting new commands to the KMD if the device has +been reset, and this can be checked more often if it requires to. The +DRM_IOCTL_GET_RESET is the default interface for those kind of checks. After +detecting a reset, UMD will then proceed to report it to the application using +the appropriated API error code, as explained in the bellow section about s/bellow/below/ +robustness. + +Robustness +-- + +The only way to try to keep an application working after a reset is if it +complies with the robustness aspects of the graphical API that is using. that it is using. + +Graphical APIs provide ways to application to deal with device resets. However, provide ways for applications to deal with +there's no guarantee that the app will be correctly using such features, and UMD +can implement policies to close the app if it's a repeating offender, likely in +a broken loop. This is done to ensure that it doesn't keeps blocking the user does not keep I think contractions are usually avoided in documents, but I'm not bothering to flag them all. +interface to be correctly displayed. interface from being correctly displayed. + +OpenGL +~~ + +Apps using OpenGL can rely on ``GL_ARB_robustness`` to be robust. This extension +tells if a reset has happened, and if so, all the context state is considered +lost and the app proceeds by creating new ones. If robustness isn't in use, UMD +will terminate the app when a reset is detected, giving that the contexts are +lost and the app won't be able to figure this out and recreate the contexts. What about GL ES? Is GL_ARB_robustness implemented or even defined there? I found this: https://registry.khronos.org/OpenGL/extensions/EXT/EXT_robustness.txt "Since this is intended to be a version of ARB_robustness for OpenGL ES, it should be named accordingly." I can add this to this paragraph. Yes, please! I suppose there could be even more
Re: [PATCH 03/15] dt-bindings: clock: qcom, dispcc-sm6125: Require GCC PLL0 DIV clock
On 26/06/2023 16:26, Marijn Suijten wrote: > On 2023-06-26 11:43:39, Konrad Dybcio wrote: >> On 25.06.2023 21:48, Marijn Suijten wrote: >>> On 2023-06-24 03:45:02, Konrad Dybcio wrote: On 24.06.2023 02:41, Marijn Suijten wrote: > The "gcc_disp_gpll0_div_clk_src" clock is consumed by the driver, will > be passed from DT, and should be required by the bindings. > > Fixes: 8397c9c0c26b ("dt-bindings: clock: add QCOM SM6125 display clock > bindings") > Signed-off-by: Marijn Suijten > --- Ideally, you'd stick it at the bottom of the list, as the items: order is part of the ABI >>> >>> This isn't an ABI break, as this driver nor its bindings require/declare >>> a fixed order: they declare a relation between clocks and clock-names. >> Bindings describe the ABI, drivers implement compliant code flow. > > That is how bindings are supposed to be... However typically the driver > is written/ported first and then the bindings are simply created to Your development process does not matter for the bindings. Whatever you decide to do "typically" is your choice, although of course I understand why you do it like that. You can argument the same that "I never create bindings in my process, so the driver defines the ABI". > reflect this, and sometimes (as is the case with this patch) > incorrectly. > > That, together with a lack of DTS and known-working device with it > (which is why I'm submitting driver+bindings+dts in one series now!) > makes us shoot ourselves in the foot by locking everyone into an ABI > that makes no sense. No one is locked into the ABI. SoC maintainer decides on this. However unjustified ABI breaking or not caring about it at all is not the way to go. It is not the correct process. > >>> This orders the GCC clock just like other dispccs. And the previous >>> patch dropped the unused cfg_ahb_clk from the bindings, so all bets are >>> off anyway. >> Thinking about it again, the binding has not been consumed by any upstream >> DT to date, so it should (tm) be fine to let it slide.. > > Exactly, I hope/doubt anyone was already using these incomplete > bindings. And again: the ABI here is the name->phandle mapping, the > order Does Not Matter™. No, it's not. Your one driver does not define the ABI. There are many different drivers, many different operating systems and other software components. Best regards, Krzysztof
Re: [PATCH v6 3/3] drm/virtio: Support sync objects
On 6/25/23 18:36, Geert Uytterhoeven wrote: > Hi Dmitry, > > On Sun, Jun 25, 2023 at 2:41 PM Dmitry Osipenko > wrote: >> On 6/25/23 11:47, Geert Uytterhoeven wrote: >>> On Sun, Apr 16, 2023 at 1:55 PM Dmitry Osipenko >>> wrote: Add sync object DRM UAPI support to VirtIO-GPU driver. Sync objects support is needed by native context VirtIO-GPU Mesa drivers, it also will be used by Venus and Virgl contexts. Reviewed-by; Emil Velikov Signed-off-by: Dmitry Osipenko >>> >>> Thanks for your patch! >>> --- a/drivers/gpu/drm/virtio/virtgpu_submit.c +++ b/drivers/gpu/drm/virtio/virtgpu_submit.c >>> +static int +virtio_gpu_parse_deps(struct virtio_gpu_submit *submit) +{ + struct drm_virtgpu_execbuffer *exbuf = submit->exbuf; + struct drm_virtgpu_execbuffer_syncobj syncobj_desc; + size_t syncobj_stride = exbuf->syncobj_stride; + u32 num_in_syncobjs = exbuf->num_in_syncobjs; + struct drm_syncobj **syncobjs; + int ret = 0, i; + + if (!num_in_syncobjs) + return 0; + + /* +* kvalloc at first tries to allocate memory using kmalloc and +* falls back to vmalloc only on failure. It also uses GFP_NOWARN >>> >>> GFP_NOWARN does not exist. >> >> https://elixir.bootlin.com/linux/v6.4-rc7/source/include/linux/gfp_types.h#L38 > > That line defines "__GFP_NOWARN", not "GFP_NOWARN". > C is case- and underscore-sensitive. as is "git grep -w" ;-) The removal of underscores was done intentionally for improving readability of the comment -- Best regards, Dmitry
Re: [PATCH 03/15] dt-bindings: clock: qcom, dispcc-sm6125: Require GCC PLL0 DIV clock
On 25/06/2023 21:48, Marijn Suijten wrote: > On 2023-06-24 11:08:54, Krzysztof Kozlowski wrote: >> On 24/06/2023 03:45, Konrad Dybcio wrote: >>> On 24.06.2023 02:41, Marijn Suijten wrote: The "gcc_disp_gpll0_div_clk_src" clock is consumed by the driver, will be passed from DT, and should be required by the bindings. Fixes: 8397c9c0c26b ("dt-bindings: clock: add QCOM SM6125 display clock bindings") Signed-off-by: Marijn Suijten --- >>> Ideally, you'd stick it at the bottom of the list, as the items: order >>> is part of the ABI >> >> Yes, please add them to the end. Order is fixed. > > Disagreed for bindings that declare clock-names and when the driver > adheres to it, see my reply to Konrad's message. That's the generic rule, with some exceptions of course. Whether one chosen driver (chosen system and chosen version of that system) adheres or not, does not change it. Other driver behaves differently and ABI is for everyone, not only for your specific version of Linux driver. Follow the rule. Best regards, Krzysztof
Re: [PATCH 1/5] drm/panfrost: Stop using drm_gem_shmem_put_pages()
On 6/26/23 18:43, Boris Brezillon wrote: > On Mon, 26 Jun 2023 16:20:53 +0300 > Dmitry Osipenko wrote: > >> On 6/26/23 15:02, Boris Brezillon wrote: >>> -err_pages: >>> - drm_gem_shmem_put_pages(>base); >>> err_unlock: >>> dma_resv_unlock(obj->resv); >>> + >>> + if (ret && pinned) >>> + drm_gem_shmem_unpin(>base); >> >> The drm_gem_shmem_unpin() was supposed to be used only in conjunction >> with drm_gem_shmem_pin(). I've a pending patch to enable the pin/unpin >> refcounting needed by drm-shmem shrinker, it will prohibit invocation of >> unpin without a previous pin. >> >> I'm wondering whether it will be okay to simply remove >> drm_gem_shmem_put_pages() from the Panfrost code, letting pages to be >> kept allocated in a error case. They will be freed once BO is destroyed. >> > > Okay, so after looking at your shmem-shrinker series, I confirm we need > to take a pin ref here (hard-pin), otherwise the buffer might be > evicted before the GPU is done, especially after you drop gpu_usecount > and use only pin_count to check whether a GEM object can be evicted or > not. See the drm_gem_evict() [1], it checks whether GEM is busy, preventing BO eviction while it is in-use by GPU. Note that in case of Panfrost, shrinker isn't enabled for growable BOs. [1] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/gpu/drm/drm_gem.c?h=next-20230626#n1495 -- Best regards, Dmitry
Re: [PATCH 1/5] drm/panfrost: Stop using drm_gem_shmem_put_pages()
On Mon, 26 Jun 2023 16:20:53 +0300 Dmitry Osipenko wrote: > On 6/26/23 15:02, Boris Brezillon wrote: > > -err_pages: > > - drm_gem_shmem_put_pages(>base); > > err_unlock: > > dma_resv_unlock(obj->resv); > > + > > + if (ret && pinned) > > + drm_gem_shmem_unpin(>base); > > The drm_gem_shmem_unpin() was supposed to be used only in conjunction > with drm_gem_shmem_pin(). I've a pending patch to enable the pin/unpin > refcounting needed by drm-shmem shrinker, it will prohibit invocation of > unpin without a previous pin. > > I'm wondering whether it will be okay to simply remove > drm_gem_shmem_put_pages() from the Panfrost code, letting pages to be > kept allocated in a error case. They will be freed once BO is destroyed. > Okay, so after looking at your shmem-shrinker series, I confirm we need to take a pin ref here (hard-pin), otherwise the buffer might be evicted before the GPU is done, especially after you drop gpu_usecount and use only pin_count to check whether a GEM object can be evicted or not.
Re: [PATCH v3 07/17] drm/imagination: Add GPU ID parsing and firmware loading
On Mon, Jun 26, 2023 at 8:22 AM Frank Binns wrote: > > Hi Adam, > > On Sat, 2023-06-17 at 07:48 -0500, Adam Ford wrote: > > On Tue, Jun 13, 2023 at 10:20 AM Sarah Walker > > wrote: > > > Read the GPU ID register at probe time and select the correct > > > features/quirks/enhancements. Use the GPU ID to form the firmware > > > file name and load the firmware. > > > > I have a Rogue 6250 variant, but the BVNC is returning a slightly > > different revision than the firmware that's currently available, and > > the older firmware for the vendor driver doesn't work with this new > > code. > > > > Linux responds with Unsupported BVNC: 4.45.2.58. From what I can > > tell, the closest available firmware is 4.40.2.51. > > > > Will there be more firmware variants in the future or will there be > > some options to build the firmware somehow? > > We don't plan to support the SoC vendor provided firmware binaries as this > will > mean having to deal with many different versions of the firmware and its > interface. Specifically, the interface can change in backwards incompatible > ways > between releases, it changes based on the hardware feature set & bugs and it's > split across userspace & the kernel. This makes these SoC provided firmware > binaries very difficult to support in this new driver. Thanks for the response. That makes sense. I would hope that various SoC vendors would jump on the bandwagon to work with your group to get their hardware supported. > > As an alternative, we'll be releasing firmware binaries as we add support for > more Rogue GPUs. We'll also release binaries upon request, in case others in > the > community want to work on support in the meantime - we're just getting things > set up for this at the moment. The Mesa side of things appears to be missing some documentation, and the power VR stuff still appears listed as experimental. Is there some documentation somewhere that would explain to someone how to go about porting the Rogue 6250 to a different hardware variant of the 6250? I don't really know the difference between BVNC of 4.45.2.58 and 4.40.2.51, but I can't imagine they are drastically different. adam > > Thanks > Frank > > > > > adam > > > > > > > > > > > The features/quirks/enhancements arrays are currently hardcoded in > > > the driver for the supported GPUs. We are looking at moving this > > > information to the firmware image. > > > > > > Signed-off-by: Sarah Walker > > > --- > > > drivers/gpu/drm/imagination/Makefile | 1 + > > > drivers/gpu/drm/imagination/pvr_device.c | 359 > > > drivers/gpu/drm/imagination/pvr_device.h | 221 +++ > > > drivers/gpu/drm/imagination/pvr_device_info.c | 223 +++ > > > drivers/gpu/drm/imagination/pvr_device_info.h | 133 + > > > drivers/gpu/drm/imagination/pvr_drv.c | 553 +- > > > drivers/gpu/drm/imagination/pvr_drv.h | 108 > > > drivers/gpu/drm/imagination/pvr_fw.h | 20 + > > > 8 files changed, 1617 insertions(+), 1 deletion(-) > > > create mode 100644 drivers/gpu/drm/imagination/pvr_device_info.c > > > create mode 100644 drivers/gpu/drm/imagination/pvr_device_info.h > > > create mode 100644 drivers/gpu/drm/imagination/pvr_fw.h > > > > > > diff --git a/drivers/gpu/drm/imagination/Makefile > > > b/drivers/gpu/drm/imagination/Makefile > > > index 186f920d615b..d713b1280776 100644 > > > --- a/drivers/gpu/drm/imagination/Makefile > > > +++ b/drivers/gpu/drm/imagination/Makefile > > > @@ -5,6 +5,7 @@ subdir-ccflags-y := -I$(srctree)/$(src) > > > > > > powervr-y := \ > > > pvr_device.o \ > > > + pvr_device_info.o \ > > > pvr_drv.o \ > > > > > > obj-$(CONFIG_DRM_POWERVR) += powervr.o > > > diff --git a/drivers/gpu/drm/imagination/pvr_device.c > > > b/drivers/gpu/drm/imagination/pvr_device.c > > > index 790c36cebec1..2e03763f2eb7 100644 > > > --- a/drivers/gpu/drm/imagination/pvr_device.c > > > +++ b/drivers/gpu/drm/imagination/pvr_device.c > > > @@ -2,20 +2,32 @@ > > > /* Copyright (c) 2022 Imagination Technologies Ltd. */ > > > > > > #include "pvr_device.h" > > > +#include "pvr_device_info.h" > > > + > > > +#include "pvr_fw.h" > > > +#include "pvr_rogue_cr_defs.h" > > > > > > #include > > > > > > +#include > > > #include > > > #include > > > #include > > > #include > > > #include > > > +#include > > > #include > > > +#include > > > #include > > > +#include > > > #include > > > #include > > > #include > > > #include > > > +#include > > > + > > > +/* Major number for the supported version of the firmware. */ > > > +#define PVR_FW_VERSION_MAJOR 1 > > > > > > /** > > > * pvr_device_reg_init() - Initialize kernel access to a PowerVR device's > > > @@ -205,6 +217,246 @@ pvr_device_regulator_init(struct pvr_device > > > *pvr_dev) > > > return err; > > > } > > > > > > +/** > > > + * pvr_device_clk_core_get_freq - Get current PowerVR device core clock > > > frequency > > > + *
Re: [PATCH 3/5] drm/shmem-helper: Inline drm_gem_shmem_{get,put}_pages()
On Mon, 26 Jun 2023 14:02:45 +0200 Boris Brezillon wrote: > Move code drm_gem_shmem_{get,put}_pages() code to > drm_gem_shmem_{pin,unpin}_locked(). After having a closer look at 'Add generic memory shrinker to VirtIO-GPU and Panfrost DRM drivers', I realize that's not what we want. We must differentiate hard-pinning (as in, can't be evicted until all users give up the ref they have) and soft-pinning (users can survive a swapout, basically userspace mappings created through drm_gem_shmem_mmap()). > > Signed-off-by: Boris Brezillon > Cc: Daniel Vetter > Cc: Thomas Zimmermann > Cc: Emil Velikov > Cc: Dmitry Osipenko > --- > drivers/gpu/drm/drm_gem_shmem_helper.c | 108 ++--- > 1 file changed, 41 insertions(+), 67 deletions(-) > > diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c > b/drivers/gpu/drm/drm_gem_shmem_helper.c > index d6fc034164c0..f406556e42e0 100644 > --- a/drivers/gpu/drm/drm_gem_shmem_helper.c > +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c > @@ -128,46 +128,7 @@ struct drm_gem_shmem_object *drm_gem_shmem_create(struct > drm_device *dev, size_t > } > EXPORT_SYMBOL_GPL(drm_gem_shmem_create); > > -static void drm_gem_shmem_put_pages(struct drm_gem_shmem_object *shmem); > - > -/** > - * drm_gem_shmem_free - Free resources associated with a shmem GEM object > - * @shmem: shmem GEM object to free > - * > - * This function cleans up the GEM object state and frees the memory used to > - * store the object itself. > - */ > -void drm_gem_shmem_free(struct drm_gem_shmem_object *shmem) > -{ > - struct drm_gem_object *obj = >base; > - > - if (obj->import_attach) { > - drm_prime_gem_destroy(obj, shmem->sgt); > - } else { > - dma_resv_lock(shmem->base.resv, NULL); > - > - drm_WARN_ON(obj->dev, shmem->vmap_use_count); > - > - if (shmem->sgt) { > - dma_unmap_sgtable(obj->dev->dev, shmem->sgt, > - DMA_BIDIRECTIONAL, 0); > - sg_free_table(shmem->sgt); > - kfree(shmem->sgt); > - } > - if (shmem->pages) > - drm_gem_shmem_put_pages(shmem); > - > - drm_WARN_ON(obj->dev, shmem->pages_use_count); > - > - dma_resv_unlock(shmem->base.resv); > - } > - > - drm_gem_object_release(obj); > - kfree(shmem); > -} > -EXPORT_SYMBOL_GPL(drm_gem_shmem_free); > - > -static int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem) > +static int drm_gem_shmem_pin_locked(struct drm_gem_shmem_object *shmem) > { > struct drm_gem_object *obj = >base; > struct page **pages; > @@ -200,13 +161,7 @@ static int drm_gem_shmem_get_pages(struct > drm_gem_shmem_object *shmem) > return 0; > } > > -/* > - * drm_gem_shmem_put_pages - Decrease use count on the backing pages for a > shmem GEM object > - * @shmem: shmem GEM object > - * > - * This function decreases the use count and puts the backing pages when use > drops to zero. > - */ > -static void drm_gem_shmem_put_pages(struct drm_gem_shmem_object *shmem) > +static void drm_gem_shmem_unpin_locked(struct drm_gem_shmem_object *shmem) > { > struct drm_gem_object *obj = >base; > > @@ -229,23 +184,42 @@ static void drm_gem_shmem_put_pages(struct > drm_gem_shmem_object *shmem) > shmem->pages = NULL; > } > > -static int drm_gem_shmem_pin_locked(struct drm_gem_shmem_object *shmem) > +/** > + * drm_gem_shmem_free - Free resources associated with a shmem GEM object > + * @shmem: shmem GEM object to free > + * > + * This function cleans up the GEM object state and frees the memory used to > + * store the object itself. > + */ > +void drm_gem_shmem_free(struct drm_gem_shmem_object *shmem) > { > - int ret; > + struct drm_gem_object *obj = >base; > > - dma_resv_assert_held(shmem->base.resv); > + if (obj->import_attach) { > + drm_prime_gem_destroy(obj, shmem->sgt); > + } else { > + dma_resv_lock(shmem->base.resv, NULL); > > - ret = drm_gem_shmem_get_pages(shmem); > + drm_WARN_ON(obj->dev, shmem->vmap_use_count); > > - return ret; > -} > - > -static void drm_gem_shmem_unpin_locked(struct drm_gem_shmem_object *shmem) > -{ > - dma_resv_assert_held(shmem->base.resv); > - > - drm_gem_shmem_put_pages(shmem); > + if (shmem->sgt) { > + dma_unmap_sgtable(obj->dev->dev, shmem->sgt, > + DMA_BIDIRECTIONAL, 0); > + sg_free_table(shmem->sgt); > + kfree(shmem->sgt); > + } > + if (shmem->pages) > + drm_gem_shmem_unpin_locked(shmem); > + > + drm_WARN_ON(obj->dev, shmem->pages_use_count); > + > + dma_resv_unlock(shmem->base.resv); > + } > + > + drm_gem_object_release(obj); > + kfree(shmem); > } > +EXPORT_SYMBOL_GPL(drm_gem_shmem_free); > >
Re: [PATCH v13 03/10] drm/shmem-helper: Add pages_pin_count field
On 6/26/23 18:21, Boris Brezillon wrote: > On Mon, 26 Jun 2023 17:04:57 +0200 > Boris Brezillon wrote: > >> Hi Dmitry, >> >> Sorry for chiming in only now :-/. >> >> On Tue, 14 Mar 2023 05:26:52 +0300 >> Dmitry Osipenko wrote: >> >>> And new pages_pin_count field to struct drm_gem_shmem_object that will >>> determine whether pages are evictable by memory shrinker. The pages will >>> be evictable only when pages_pin_count=0. This patch prepares code for >>> addition of the memory shrinker that will utilize the new field. >>> >>> Signed-off-by: Dmitry Osipenko >>> --- >>> drivers/gpu/drm/drm_gem_shmem_helper.c | 7 +++ >>> include/drm/drm_gem_shmem_helper.h | 9 + >>> 2 files changed, 16 insertions(+) >>> >>> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c >>> b/drivers/gpu/drm/drm_gem_shmem_helper.c >>> index 4da9c9c39b9a..81d61791f874 100644 >>> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c >>> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c >>> @@ -277,6 +277,8 @@ static int drm_gem_shmem_pin_locked(struct >>> drm_gem_shmem_object *shmem) >>> drm_WARN_ON(obj->dev, obj->import_attach); >>> >>> ret = drm_gem_shmem_get_pages(shmem); >>> + if (!ret) >>> + shmem->pages_pin_count++; >>> >>> return ret; >>> } >>> @@ -289,7 +291,12 @@ static void drm_gem_shmem_unpin_locked(struct >>> drm_gem_shmem_object *shmem) >>> >>> drm_WARN_ON(obj->dev, obj->import_attach); >>> >>> + if (drm_WARN_ON_ONCE(obj->dev, !shmem->pages_pin_count)) >>> + return; >>> + >>> drm_gem_shmem_put_pages(shmem); >>> + >>> + shmem->pages_pin_count--; >>> } >>> >>> /** >>> diff --git a/include/drm/drm_gem_shmem_helper.h >>> b/include/drm/drm_gem_shmem_helper.h >>> index 20ddcd799df9..7d823c9fc480 100644 >>> --- a/include/drm/drm_gem_shmem_helper.h >>> +++ b/include/drm/drm_gem_shmem_helper.h >>> @@ -39,6 +39,15 @@ struct drm_gem_shmem_object { >>> */ >>> unsigned int pages_use_count; >>> >>> + /** >>> +* @pages_pin_count: >>> +* >>> +* Reference count on the pinned pages table. >>> +* The pages allowed to be evicted by memory shrinker >>> +* only when the count is zero. >>> +*/ >>> + unsigned int pages_pin_count; >> >> s/pages_pin_count/pin_count/ ? >> >> And do we really need both pages_pin_count and pages_use_count. Looks >> like they both serve the same purpose, with one exception: >> pages_use_count is also incremented in the get_pages_sgt_locked() path, >> but you probably don't want it to prevent GEM eviction. Assuming >> your goal with this pin_count field is to check if a GEM object is >> evictable, it can be done with something like >> >> bool >> drm_gem_shmem_is_evictable_locked(struct drm_gem_shmem_object *shmem) >> { >> dma_resv_assert_held(shmem->base.resv); >> >> return shmem->pages_use_count == (shmem->sgt ? 1 : 0); >> } >> >> I mean, I'm not against renaming pages_use_count into pin_count, but, >> unless I'm missing something, I don't see a good reason to keep both. > > My bad, I think I found one place calling drm_gem_shmem_get_pages() > where we want pin_count and pages_use_count to differ: > drm_gem_shmem_mmap(). We certainly don't want userspace mappings to > prevent eviction. That's correct, thanks for the review :) -- Best regards, Dmitry
Re: [PATCH] backlight: led_bl: take led_access lock when required
On Mon, 19 Jun 2023, Mans Rullgard wrote: > The led_access lock must be held when calling led_sysfs_enable() and > led_sysfs_disable(). This fixes warnings such as this: > > [2.432495] [ cut here ] > [2.437316] WARNING: CPU: 0 PID: 22 at drivers/leds/led-core.c:349 > led_sysfs_disable+0x54/0x58 > [2.446105] Modules linked in: > [2.449218] CPU: 0 PID: 22 Comm: kworker/u2:1 Not tainted 6.3.8+ #1 > [2.456268] Hardware name: Generic AM3517 (Flattened Device Tree) > [2.462402] Workqueue: events_unbound deferred_probe_work_func > [2.468353] unwind_backtrace from show_stack+0x10/0x14 > [2.473632] show_stack from dump_stack_lvl+0x24/0x2c > [2.478759] dump_stack_lvl from __warn+0x9c/0xc4 > [2.483551] __warn from warn_slowpath_fmt+0x64/0xc0 > [2.488586] warn_slowpath_fmt from led_sysfs_disable+0x54/0x58 > [2.494567] led_sysfs_disable from led_bl_probe+0x20c/0x3b0 > [2.500305] led_bl_probe from platform_probe+0x5c/0xb8 > [2.505615] platform_probe from really_probe+0xc8/0x2a0 > [2.510986] really_probe from __driver_probe_device+0x88/0x19c > [2.516967] __driver_probe_device from driver_probe_device+0x30/0xcc > [2.523498] driver_probe_device from __device_attach_driver+0x94/0xc4 > [2.530090] __device_attach_driver from bus_for_each_drv+0x80/0xcc > [2.536437] bus_for_each_drv from __device_attach+0xf8/0x19c > [2.542236] __device_attach from bus_probe_device+0x8c/0x90 > [2.547973] bus_probe_device from deferred_probe_work_func+0x80/0xb0 > [2.554504] deferred_probe_work_func from process_one_work+0x228/0x4c0 > [2.561187] process_one_work from worker_thread+0x1fc/0x4d0 > [2.566925] worker_thread from kthread+0xb4/0xd0 > [2.571685] kthread from ret_from_fork+0x14/0x2c > [2.576446] Exception stack(0xd0079fb0 to 0xd0079ff8) > [2.581573] 9fa0: > > [2.589813] 9fc0: > > [2.598052] 9fe0: 0013 > [2.604888] ---[ end trace ]--- > > > Signed-off-by: Mans Rullgard > --- > drivers/video/backlight/led_bl.c | 5 - > 1 file changed, 4 insertions(+), 1 deletion(-) Applied, thanks -- Lee Jones [李琼斯]
Re: [PATCH] kunit: drm: make DRM buddy test compatible with other pages sizes
Hi Christian, Thanks for the information! I am not very familiar with the inner workings of DRM, so I'm not really in a position to make any large or systematic changes to the test regarding the points you made. I am mainly trying to allow the tests to be run on more diverse hardware. >From the looks of it this test has been adapted from an older test, so perhaps this rule was set in place in the past. Either way, I dont think my changes are going to break anything, so for the time being I think this small change is the best approach. Please let me know if you think otherwise. David, do you still have this on your radar? We've been carrying this as a RHEL-only since I originally posted it and have not noticed any issues due to it. Cheers, -- Nico On Wed, Apr 19, 2023 at 4:30 AM Christian König wrote: > > Am 18.04.23 um 19:15 schrieb Nico Pache: > > The DRM buddy test uses a fixed 12 bit shift to covert from pages to > > bytes. This number is then used to confirm that (chunk_size < PAGE_SIZE) > > which can lead to a failing drm_buddy_init on systems with PAGE_SIZE > 4k. > > Since the buddy allocator is used for resources which are independent of > the CPU PAGE size the later check is actually the broken one. > > E.g. neither in the buddy allocator nor in it's test cases we should > have any of PAGE_SHIFT or PAGE_SIZE. > > Otherwise the allocator wouldn't work correctly on systems with a > PAGE_SIZE different than 4k. > > Regards, > Christian. > > > > > Fixes: 92937f170d3f ("drm/selftests: add drm buddy alloc range testcase") > > Signed-off-by: Nico Pache > > --- > > drivers/gpu/drm/tests/drm_buddy_test.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/gpu/drm/tests/drm_buddy_test.c > > b/drivers/gpu/drm/tests/drm_buddy_test.c > > index 09ee6f6af896..a62b2690d3c2 100644 > > --- a/drivers/gpu/drm/tests/drm_buddy_test.c > > +++ b/drivers/gpu/drm/tests/drm_buddy_test.c > > @@ -318,8 +318,8 @@ static void mm_config(u64 *size, u64 *chunk_size) > > s &= -ms; > > > > /* Convert from pages to bytes */ > > - *chunk_size = (u64)ms << 12; > > - *size = (u64)s << 12; > > + *chunk_size = (u64)ms << PAGE_SHIFT; > > + *size = (u64)s << PAGE_SHIFT; > > } > > > > static void drm_test_buddy_alloc_pathological(struct kunit *test) >
Re: [Intel-xe] [PATCH v2 2/4] drm/ttm: Don't shadow the operation context
On Mon, 2023-06-26 at 17:18 +0200, Thomas Hellström wrote: > On Mon, 2023-06-26 at 17:15 +0200, Christian König wrote: > > Am 26.06.23 um 11:14 schrieb Thomas Hellström: > > > ttm_bo_swapout() shadows the ttm operation context which may > > > cause > > > major confusion in driver callbacks when swapping out > > > !TTM_PL_SYSTEM > > > memory. Fix this by reusing the operation context argument to > > > ttm_bo_swapout(). > > > > > > Cc: "Christian König" > > > Cc: Roger He > > > Cc: > > > Cc: > > > Cc: # v4.16+ > > > Fixes: dc947770cf34 ("drm/ttm: enable swapout for reserved BOs > > > during allocation") > > > Signed-off-by: Thomas Hellström > > > > > > Acked-by: Matthew Brost > > > > We intentionally didn't used the parameter here, but I absolutely > > can't > > figure out why. > > > > Feel free to add my rb, but let's give it some time upstream before > > you > > base anything on top of this. Just in case we missed something. > > Sure. Thanks for reviewing, BTW, I'll remove the Fixes: tag as well. /Thomas > /Thomas > > > > > Regards, > > Christian. > > > > > --- > > > drivers/gpu/drm/ttm/ttm_bo.c | 3 +-- > > > 1 file changed, 1 insertion(+), 2 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/ttm/ttm_bo.c > > > b/drivers/gpu/drm/ttm/ttm_bo.c > > > index bd5dae4d1624..615d30c4262d 100644 > > > --- a/drivers/gpu/drm/ttm/ttm_bo.c > > > +++ b/drivers/gpu/drm/ttm/ttm_bo.c > > > @@ -1154,7 +1154,6 @@ int ttm_bo_swapout(struct ttm_buffer_object > > > *bo, struct ttm_operation_ctx *ctx, > > > * Move to system cached > > > */ > > > if (bo->resource->mem_type != TTM_PL_SYSTEM) { > > > - struct ttm_operation_ctx ctx = { false, false }; > > > struct ttm_resource *evict_mem; > > > struct ttm_place hop; > > > > > > @@ -1164,7 +1163,7 @@ int ttm_bo_swapout(struct ttm_buffer_object > > > *bo, struct ttm_operation_ctx *ctx, > > > if (unlikely(ret)) > > > goto out; > > > > > > - ret = ttm_bo_handle_move_mem(bo, evict_mem, true, > > > , ); > > > + ret = ttm_bo_handle_move_mem(bo, evict_mem, true, > > > ctx, ); > > > if (unlikely(ret != 0)) { > > > WARN(ret == -EMULTIHOP, "Unexpected > > > multihop in swaput - likely driver bug.\n"); > > > goto out; > > >
Re: [PATCH v13 03/10] drm/shmem-helper: Add pages_pin_count field
On Mon, 26 Jun 2023 17:04:57 +0200 Boris Brezillon wrote: > Hi Dmitry, > > Sorry for chiming in only now :-/. > > On Tue, 14 Mar 2023 05:26:52 +0300 > Dmitry Osipenko wrote: > > > And new pages_pin_count field to struct drm_gem_shmem_object that will > > determine whether pages are evictable by memory shrinker. The pages will > > be evictable only when pages_pin_count=0. This patch prepares code for > > addition of the memory shrinker that will utilize the new field. > > > > Signed-off-by: Dmitry Osipenko > > --- > > drivers/gpu/drm/drm_gem_shmem_helper.c | 7 +++ > > include/drm/drm_gem_shmem_helper.h | 9 + > > 2 files changed, 16 insertions(+) > > > > diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c > > b/drivers/gpu/drm/drm_gem_shmem_helper.c > > index 4da9c9c39b9a..81d61791f874 100644 > > --- a/drivers/gpu/drm/drm_gem_shmem_helper.c > > +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c > > @@ -277,6 +277,8 @@ static int drm_gem_shmem_pin_locked(struct > > drm_gem_shmem_object *shmem) > > drm_WARN_ON(obj->dev, obj->import_attach); > > > > ret = drm_gem_shmem_get_pages(shmem); > > + if (!ret) > > + shmem->pages_pin_count++; > > > > return ret; > > } > > @@ -289,7 +291,12 @@ static void drm_gem_shmem_unpin_locked(struct > > drm_gem_shmem_object *shmem) > > > > drm_WARN_ON(obj->dev, obj->import_attach); > > > > + if (drm_WARN_ON_ONCE(obj->dev, !shmem->pages_pin_count)) > > + return; > > + > > drm_gem_shmem_put_pages(shmem); > > + > > + shmem->pages_pin_count--; > > } > > > > /** > > diff --git a/include/drm/drm_gem_shmem_helper.h > > b/include/drm/drm_gem_shmem_helper.h > > index 20ddcd799df9..7d823c9fc480 100644 > > --- a/include/drm/drm_gem_shmem_helper.h > > +++ b/include/drm/drm_gem_shmem_helper.h > > @@ -39,6 +39,15 @@ struct drm_gem_shmem_object { > > */ > > unsigned int pages_use_count; > > > > + /** > > +* @pages_pin_count: > > +* > > +* Reference count on the pinned pages table. > > +* The pages allowed to be evicted by memory shrinker > > +* only when the count is zero. > > +*/ > > + unsigned int pages_pin_count; > > s/pages_pin_count/pin_count/ ? > > And do we really need both pages_pin_count and pages_use_count. Looks > like they both serve the same purpose, with one exception: > pages_use_count is also incremented in the get_pages_sgt_locked() path, > but you probably don't want it to prevent GEM eviction. Assuming > your goal with this pin_count field is to check if a GEM object is > evictable, it can be done with something like > > bool > drm_gem_shmem_is_evictable_locked(struct drm_gem_shmem_object *shmem) > { > dma_resv_assert_held(shmem->base.resv); > > return shmem->pages_use_count == (shmem->sgt ? 1 : 0); > } > > I mean, I'm not against renaming pages_use_count into pin_count, but, > unless I'm missing something, I don't see a good reason to keep both. My bad, I think I found one place calling drm_gem_shmem_get_pages() where we want pin_count and pages_use_count to differ: drm_gem_shmem_mmap(). We certainly don't want userspace mappings to prevent eviction.
Re: [PATCH v2 2/4] drm/ttm: Don't shadow the operation context
On Mon, 2023-06-26 at 17:15 +0200, Christian König wrote: > Am 26.06.23 um 11:14 schrieb Thomas Hellström: > > ttm_bo_swapout() shadows the ttm operation context which may cause > > major confusion in driver callbacks when swapping out > > !TTM_PL_SYSTEM > > memory. Fix this by reusing the operation context argument to > > ttm_bo_swapout(). > > > > Cc: "Christian König" > > Cc: Roger He > > Cc: > > Cc: > > Cc: # v4.16+ > > Fixes: dc947770cf34 ("drm/ttm: enable swapout for reserved BOs > > during allocation") > > Signed-off-by: Thomas Hellström > > Acked-by: Matthew Brost > > We intentionally didn't used the parameter here, but I absolutely > can't > figure out why. > > Feel free to add my rb, but let's give it some time upstream before > you > base anything on top of this. Just in case we missed something. Sure. Thanks for reviewing, /Thomas > > Regards, > Christian. > > > --- > > drivers/gpu/drm/ttm/ttm_bo.c | 3 +-- > > 1 file changed, 1 insertion(+), 2 deletions(-) > > > > diff --git a/drivers/gpu/drm/ttm/ttm_bo.c > > b/drivers/gpu/drm/ttm/ttm_bo.c > > index bd5dae4d1624..615d30c4262d 100644 > > --- a/drivers/gpu/drm/ttm/ttm_bo.c > > +++ b/drivers/gpu/drm/ttm/ttm_bo.c > > @@ -1154,7 +1154,6 @@ int ttm_bo_swapout(struct ttm_buffer_object > > *bo, struct ttm_operation_ctx *ctx, > > * Move to system cached > > */ > > if (bo->resource->mem_type != TTM_PL_SYSTEM) { > > - struct ttm_operation_ctx ctx = { false, false }; > > struct ttm_resource *evict_mem; > > struct ttm_place hop; > > > > @@ -1164,7 +1163,7 @@ int ttm_bo_swapout(struct ttm_buffer_object > > *bo, struct ttm_operation_ctx *ctx, > > if (unlikely(ret)) > > goto out; > > > > - ret = ttm_bo_handle_move_mem(bo, evict_mem, true, > > , ); > > + ret = ttm_bo_handle_move_mem(bo, evict_mem, true, > > ctx, ); > > if (unlikely(ret != 0)) { > > WARN(ret == -EMULTIHOP, "Unexpected > > multihop in swaput - likely driver bug.\n"); > > goto out; >
Re: [PATCH v2 2/4] drm/ttm: Don't shadow the operation context
Am 26.06.23 um 11:14 schrieb Thomas Hellström: ttm_bo_swapout() shadows the ttm operation context which may cause major confusion in driver callbacks when swapping out !TTM_PL_SYSTEM memory. Fix this by reusing the operation context argument to ttm_bo_swapout(). Cc: "Christian König" Cc: Roger He Cc: Cc: Cc: # v4.16+ Fixes: dc947770cf34 ("drm/ttm: enable swapout for reserved BOs during allocation") Signed-off-by: Thomas Hellström Acked-by: Matthew Brost We intentionally didn't used the parameter here, but I absolutely can't figure out why. Feel free to add my rb, but let's give it some time upstream before you base anything on top of this. Just in case we missed something. Regards, Christian. --- drivers/gpu/drm/ttm/ttm_bo.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c index bd5dae4d1624..615d30c4262d 100644 --- a/drivers/gpu/drm/ttm/ttm_bo.c +++ b/drivers/gpu/drm/ttm/ttm_bo.c @@ -1154,7 +1154,6 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx, * Move to system cached */ if (bo->resource->mem_type != TTM_PL_SYSTEM) { - struct ttm_operation_ctx ctx = { false, false }; struct ttm_resource *evict_mem; struct ttm_place hop; @@ -1164,7 +1163,7 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx, if (unlikely(ret)) goto out; - ret = ttm_bo_handle_move_mem(bo, evict_mem, true, , ); + ret = ttm_bo_handle_move_mem(bo, evict_mem, true, ctx, ); if (unlikely(ret != 0)) { WARN(ret == -EMULTIHOP, "Unexpected multihop in swaput - likely driver bug.\n"); goto out;
Re: [PATCH] backlight: pwm_bl: Avoid backlight flicker applying initial PWM state
On Thu, Jun 08, 2023 at 04:11:14PM +0200, Philipp Zabel wrote: > The initial PWM state returned by pwm_init_state() has a duty cycle > of 0 ns. To avoid backlight flicker when taking over an enabled > display from the bootloader, skip the initial pwm_apply_state() > and leave the PWM be until backlight_update_state() will apply the > state with the desired brightness. backlight_update_state() uses pwm_get_state() to update the PWM. Without applying something that came from pwm_init_state() then we will never adopt the reference values from pwm->args. Daniel.
Re: [PATCH v13 03/10] drm/shmem-helper: Add pages_pin_count field
Hi Dmitry, Sorry for chiming in only now :-/. On Tue, 14 Mar 2023 05:26:52 +0300 Dmitry Osipenko wrote: > And new pages_pin_count field to struct drm_gem_shmem_object that will > determine whether pages are evictable by memory shrinker. The pages will > be evictable only when pages_pin_count=0. This patch prepares code for > addition of the memory shrinker that will utilize the new field. > > Signed-off-by: Dmitry Osipenko > --- > drivers/gpu/drm/drm_gem_shmem_helper.c | 7 +++ > include/drm/drm_gem_shmem_helper.h | 9 + > 2 files changed, 16 insertions(+) > > diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c > b/drivers/gpu/drm/drm_gem_shmem_helper.c > index 4da9c9c39b9a..81d61791f874 100644 > --- a/drivers/gpu/drm/drm_gem_shmem_helper.c > +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c > @@ -277,6 +277,8 @@ static int drm_gem_shmem_pin_locked(struct > drm_gem_shmem_object *shmem) > drm_WARN_ON(obj->dev, obj->import_attach); > > ret = drm_gem_shmem_get_pages(shmem); > + if (!ret) > + shmem->pages_pin_count++; > > return ret; > } > @@ -289,7 +291,12 @@ static void drm_gem_shmem_unpin_locked(struct > drm_gem_shmem_object *shmem) > > drm_WARN_ON(obj->dev, obj->import_attach); > > + if (drm_WARN_ON_ONCE(obj->dev, !shmem->pages_pin_count)) > + return; > + > drm_gem_shmem_put_pages(shmem); > + > + shmem->pages_pin_count--; > } > > /** > diff --git a/include/drm/drm_gem_shmem_helper.h > b/include/drm/drm_gem_shmem_helper.h > index 20ddcd799df9..7d823c9fc480 100644 > --- a/include/drm/drm_gem_shmem_helper.h > +++ b/include/drm/drm_gem_shmem_helper.h > @@ -39,6 +39,15 @@ struct drm_gem_shmem_object { >*/ > unsigned int pages_use_count; > > + /** > + * @pages_pin_count: > + * > + * Reference count on the pinned pages table. > + * The pages allowed to be evicted by memory shrinker > + * only when the count is zero. > + */ > + unsigned int pages_pin_count; s/pages_pin_count/pin_count/ ? And do we really need both pages_pin_count and pages_use_count. Looks like they both serve the same purpose, with one exception: pages_use_count is also incremented in the get_pages_sgt_locked() path, but you probably don't want it to prevent GEM eviction. Assuming your goal with this pin_count field is to check if a GEM object is evictable, it can be done with something like bool drm_gem_shmem_is_evictable_locked(struct drm_gem_shmem_object *shmem) { dma_resv_assert_held(shmem->base.resv); return shmem->pages_use_count == (shmem->sgt ? 1 : 0); } I mean, I'm not against renaming pages_use_count into pin_count, but, unless I'm missing something, I don't see a good reason to keep both. Regards, Boris
[PATCH] drm/tegra: Kconfig: Express the dependency by using the depends on
Because using the 'depends on' is easy to understand and more obvious. Signed-off-by: Sui Jingfeng --- drivers/gpu/drm/tegra/Kconfig | 6 ++ drivers/gpu/host1x/Kconfig| 5 + 2 files changed, 3 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/tegra/Kconfig b/drivers/gpu/drm/tegra/Kconfig index 498313778175..e7e3856665b1 100644 --- a/drivers/gpu/drm/tegra/Kconfig +++ b/drivers/gpu/drm/tegra/Kconfig @@ -26,19 +26,17 @@ config DRM_TEGRA To compile this driver as a module, choose M here: the module will be called tegra-drm. -if DRM_TEGRA - config DRM_TEGRA_DEBUG bool "NVIDIA Tegra DRM debug support" + depends on DRM_TEGRA help Say yes here to enable debugging support. config DRM_TEGRA_STAGING bool "Enable HOST1X interface" + depends on DRM_TEGRA depends on STAGING help Say yes if HOST1X should be available for userspace DRM users. If unsure, choose N. - -endif diff --git a/drivers/gpu/host1x/Kconfig b/drivers/gpu/host1x/Kconfig index e6c78ae2003a..36eb98d93637 100644 --- a/drivers/gpu/host1x/Kconfig +++ b/drivers/gpu/host1x/Kconfig @@ -17,14 +17,11 @@ config TEGRA_HOST1X by host1x are referred to as clients. host1x includes some other functionality, such as synchronization. -if TEGRA_HOST1X - config TEGRA_HOST1X_FIREWALL bool "Enable HOST1X security firewall" + depends on TEGRA_HOST1X default y help Say yes if kernel should protect command streams from tampering. If unsure, choose Y. - -endif -- 2.25.1
Re: [PATCH 1/3] backlight: lm3630a: add support for changing the boost frequency
On Wed, Jun 14, 2023 at 09:08:52PM +0200, Maximilian Weigand wrote: > From: Maximilian Weigand > > The led driver supports changing the switching frequency of the boost > converter by two means: the base switching frequency can be changed from > 500 kHz to 1 MHz, and a frequency shift can be activated, leading to > switching frequencies of 560 kHz or 1.12 Mhz, respectively. > > Add this functionality to the led driver by introducing two dts entries > that control the boost frequency (500 kHz by default) and the frequency > shift (no shift by default). > > Signed-off-by: Maximilian Weigand Driver changes look ok (or at least will be when the DT bindings are finalized). However... I think patches 1 and 2 of this series are in the wrong order. See #5 in https://docs.kernel.org/devicetree/bindings/submitting-patches.html for details. Daniel.
[PATCH 2/2] drm/tegra: Remove surplus else after return
else is not generally useful after return Signed-off-by: Sui Jingfeng --- drivers/gpu/drm/tegra/gem.c | 19 ++- 1 file changed, 10 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c index 0ce22935fbd3..26d64f4545bd 100644 --- a/drivers/gpu/drm/tegra/gem.c +++ b/drivers/gpu/drm/tegra/gem.c @@ -180,15 +180,15 @@ static void *tegra_bo_mmap(struct host1x_bo *bo) struct iosys_map map; int ret; - if (obj->vaddr) { + if (obj->vaddr) return obj->vaddr; - } else if (obj->gem.import_attach) { + + if (obj->gem.import_attach) { ret = dma_buf_vmap_unlocked(obj->gem.import_attach->dmabuf, ); return ret ? NULL : map.vaddr; - } else { - return vmap(obj->pages, obj->num_pages, VM_MAP, - pgprot_writecombine(PAGE_KERNEL)); } + + return vmap(obj->pages, obj->num_pages, VM_MAP, pgprot_writecombine(PAGE_KERNEL)); } static void tegra_bo_munmap(struct host1x_bo *bo, void *addr) @@ -198,10 +198,11 @@ static void tegra_bo_munmap(struct host1x_bo *bo, void *addr) if (obj->vaddr) return; - else if (obj->gem.import_attach) - dma_buf_vunmap_unlocked(obj->gem.import_attach->dmabuf, ); - else - vunmap(addr); + + if (obj->gem.import_attach) + return dma_buf_vunmap_unlocked(obj->gem.import_attach->dmabuf, ); + + vunmap(addr); } static struct host1x_bo *tegra_bo_get(struct host1x_bo *bo) -- 2.25.1
[PATCH 1/2] drm/tegra: Return an error code if fails
Return -ENOMEM if tegra_bo_mmap() fails. Signed-off-by: Sui Jingfeng --- drivers/gpu/drm/tegra/gem.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c index dea38892d6e6..0ce22935fbd3 100644 --- a/drivers/gpu/drm/tegra/gem.c +++ b/drivers/gpu/drm/tegra/gem.c @@ -710,6 +710,8 @@ static int tegra_gem_prime_vmap(struct dma_buf *buf, struct iosys_map *map) void *vaddr; vaddr = tegra_bo_mmap(>base); + if (!vaddr) + return -ENOMEM; if (IS_ERR(vaddr)) return PTR_ERR(vaddr); -- 2.25.1
Re: [PATCH] backlight: led_bl: take led_access lock when required
On Mon, Jun 19, 2023 at 05:02:49PM +0100, Mans Rullgard wrote: > The led_access lock must be held when calling led_sysfs_enable() and > led_sysfs_disable(). This fixes warnings such as this: > > [2.432495] [ cut here ] > [2.437316] WARNING: CPU: 0 PID: 22 at drivers/leds/led-core.c:349 > led_sysfs_disable+0x54/0x58 > [2.446105] Modules linked in: > [2.449218] CPU: 0 PID: 22 Comm: kworker/u2:1 Not tainted 6.3.8+ #1 > [2.456268] Hardware name: Generic AM3517 (Flattened Device Tree) > [2.462402] Workqueue: events_unbound deferred_probe_work_func > [2.468353] unwind_backtrace from show_stack+0x10/0x14 > [2.473632] show_stack from dump_stack_lvl+0x24/0x2c > [2.478759] dump_stack_lvl from __warn+0x9c/0xc4 > [2.483551] __warn from warn_slowpath_fmt+0x64/0xc0 > [2.488586] warn_slowpath_fmt from led_sysfs_disable+0x54/0x58 > [2.494567] led_sysfs_disable from led_bl_probe+0x20c/0x3b0 > [2.500305] led_bl_probe from platform_probe+0x5c/0xb8 > [2.505615] platform_probe from really_probe+0xc8/0x2a0 > [2.510986] really_probe from __driver_probe_device+0x88/0x19c > [2.516967] __driver_probe_device from driver_probe_device+0x30/0xcc > [2.523498] driver_probe_device from __device_attach_driver+0x94/0xc4 > [2.530090] __device_attach_driver from bus_for_each_drv+0x80/0xcc > [2.536437] bus_for_each_drv from __device_attach+0xf8/0x19c > [2.542236] __device_attach from bus_probe_device+0x8c/0x90 > [2.547973] bus_probe_device from deferred_probe_work_func+0x80/0xb0 > [2.554504] deferred_probe_work_func from process_one_work+0x228/0x4c0 > [2.561187] process_one_work from worker_thread+0x1fc/0x4d0 > [2.566925] worker_thread from kthread+0xb4/0xd0 > [2.571685] kthread from ret_from_fork+0x14/0x2c > [2.576446] Exception stack(0xd0079fb0 to 0xd0079ff8) > [2.581573] 9fa0: > > [2.589813] 9fc0: > > [2.598052] 9fe0: 0013 > [2.604888] ---[ end trace ]--- > > > Signed-off-by: Mans Rullgard Reviewed-by: Daniel Thompson Daniel.