From: Tzung-Bi Shih <[email protected]> [ Upstream commit d935187cfb27fc4168f78f3959aef4eafaae76bb ]
A potential circular locking dependency (ABBA deadlock) exists between `ec_dev->lock` and the clock framework's `prepare_lock`. The first order (A -> B) occurs when scp_ipi_send() is called while `ec_dev->lock` is held (e.g., within cros_ec_cmd_xfer()): 1. cros_ec_cmd_xfer() acquires `ec_dev->lock` and calls scp_ipi_send(). 2. scp_ipi_send() calls clk_prepare_enable(), which acquires `prepare_lock`. See #0 in the following example calling trace. (Lock Order: `ec_dev->lock` -> `prepare_lock`) The reverse order (B -> A) is more complex and has been observed (learned) by lockdep. It involves the clock prepare operation triggering power domain changes, which then propagates through sysfs and power supply uevents, eventually calling back into the ChromeOS EC driver and attempting to acquire `ec_dev->lock`: 1. Something calls clk_prepare(), which acquires `prepare_lock`. It then triggers genpd operations like genpd_runtime_resume(), which takes `&genpd->mlock`. 2. Power domain changes can trigger regulator changes; regulator changes can then trigger device link changes; device link changes can then trigger sysfs changes. Eventually, power_supply_uevent() is called. 3. This leads to calls like cros_usbpd_charger_get_prop(), which calls cros_ec_cmd_xfer_status(), which then attempts to acquire `ec_dev->lock`. See #1 ~ #6 in the following example calling trace. (Lock Order: `prepare_lock` -> `&genpd->mlock` -> ... -> `&ec_dev->lock`) Move the clk_prepare()/clk_unprepare() operations for `scp->clk` to the remoteproc prepare()/unprepare() callbacks. This ensures `prepare_lock` is only acquired in prepare()/unprepare() callbacks. Since `ec_dev->lock` is not involved in the callbacks, the dependency loop is broken. This means the clock is always "prepared" when the SCP is running. The prolonged "prepared time" for the clock should be acceptable as SCP is designed to be a very power efficient processor. The power consumption impact can be negligible. A simplified calling trace reported by lockdep: > -> #6 (&ec_dev->lock) > cros_ec_cmd_xfer > cros_ec_cmd_xfer_status > cros_usbpd_charger_get_port_status > cros_usbpd_charger_get_prop > power_supply_get_property > power_supply_show_property > power_supply_uevent > dev_uevent > uevent_show > dev_attr_show > sysfs_kf_seq_show > kernfs_seq_show > -> #5 (kn->active#2) > kernfs_drain > __kernfs_remove > kernfs_remove_by_name_ns > sysfs_remove_file_ns > device_del > __device_link_del > device_links_driver_bound > -> #4 (device_links_lock) > device_link_remove > _regulator_put > regulator_put > -> #3 (regulator_list_mutex) > regulator_lock_dependent > regulator_disable > scpsys_power_off > _genpd_power_off > genpd_power_off > -> #2 (&genpd->mlock/1) > genpd_add_subdomain > pm_genpd_add_subdomain > scpsys_add_subdomain > scpsys_probe > -> #1 (&genpd->mlock) > genpd_runtime_resume > __rpm_callback > rpm_callback > rpm_resume > __pm_runtime_resume > clk_core_prepare > clk_prepare > -> #0 (prepare_lock) > clk_prepare > scp_ipi_send > scp_send_ipi > mtk_rpmsg_send > rpmsg_send > cros_ec_pkt_xfer_rpmsg Signed-off-by: Tzung-Bi Shih <[email protected]> Reviewed-by: Chen-Yu Tsai <[email protected]> Tested-by: Chen-Yu Tsai <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mathieu Poirier <[email protected]> Signed-off-by: Sasha Levin <[email protected]> --- LLM Generated explanations, may be completely bogus: Good - the `prepare`/`unprepare` ops were added in 2020, well before any current stable tree branch point. They exist in 5.10, 5.15, 6.1, and all newer stable trees. ### 5. User Impact Assessment This affects **MediaTek SCP on ChromeOS devices** (Chromebooks with MediaTek SoCs). The deadlock scenario involves: - ChromeOS EC communication via rpmsg (SCP IPI) - Clock framework interactions - Power domain/regulator/device link chains This is a real-world deadlock that lockdep detected. ChromeOS devices are widely deployed, and a deadlock in the EC communication path could cause a complete system hang. ### 6. Risk Assessment **Changes are mechanical and low-risk:** - All `clk_prepare_enable()` → `clk_enable()` (7 call sites) - All `clk_disable_unprepare()` → `clk_disable()` (7 call sites) - New `scp_prepare()`/`scp_unprepare()` callbacks added to `scp_ops` (simple wrappers) **The pattern is well-established** - `clk_prepare()` is meant to be called in sleepable context, `clk_enable()` can be called from any context including IRQ. Separating them is a standard clock framework pattern. **Total: 30 insertions, 13 deletions** across 2 files - small and contained. ### 7. Dependency Check The commit is self-contained: - No subsequent commits in these files depend on it - The `rproc_ops.prepare`/`.unprepare` callbacks exist since 2020 (present in all stable trees: 5.10+) - No other infrastructure changes needed ### Verification - **git log master** confirmed commit d935187cfb27f is the exact commit being analyzed - **git log --oneline d935187cfb27f..master** confirmed no subsequent commits modify the same files (no dependencies) - **git log/show 33467ac3c8dc8** confirmed `prepare`/`unprepare` ops were added to `rproc_ops` in April 2020, present in stable trees 5.10+ - **Code review of remoteproc_internal.h** confirmed `rproc_prepare_device()`/`rproc_unprepare_device()` wrappers exist and are called in `rproc_fw_boot()` and `rproc_shutdown()` paths - **Reviewed-by and Tested-by** from Chen-Yu Tsai ([email protected]) confirms the fix was tested on real hardware - **Lockdep call trace** in commit message confirms this is a real detected lock ordering issue, not theoretical - The commit is a standalone single patch (not part of a series) - confirmed by lore.kernel.org link format (single patch, not X/Y) ### Summary This commit fixes a **lockdep-detected ABBA deadlock** in the MediaTek SCP remoteproc driver. The deadlock involves the clock framework's `prepare_lock` and the ChromeOS EC's `ec_dev->lock`. The fix is mechanically simple (splitting `clk_prepare_enable` into separate `clk_prepare` + `clk_enable` calls), uses existing remoteproc framework infrastructure (`prepare`/`unprepare` callbacks available since 2020), and has been reviewed and tested. The scope is limited to 2 files with 30 insertions/13 deletions. The risk of regression is very low as it uses a well-established clock framework pattern, and the benefit is preventing a real deadlock on ChromeOS devices with MediaTek SoCs. **YES** drivers/remoteproc/mtk_scp.c | 39 +++++++++++++++++++++++--------- drivers/remoteproc/mtk_scp_ipi.c | 4 ++-- 2 files changed, 30 insertions(+), 13 deletions(-) diff --git a/drivers/remoteproc/mtk_scp.c b/drivers/remoteproc/mtk_scp.c index db8fd045468d9..98d00bd5200cc 100644 --- a/drivers/remoteproc/mtk_scp.c +++ b/drivers/remoteproc/mtk_scp.c @@ -283,7 +283,7 @@ static irqreturn_t scp_irq_handler(int irq, void *priv) struct mtk_scp *scp = priv; int ret; - ret = clk_prepare_enable(scp->clk); + ret = clk_enable(scp->clk); if (ret) { dev_err(scp->dev, "failed to enable clocks\n"); return IRQ_NONE; @@ -291,7 +291,7 @@ static irqreturn_t scp_irq_handler(int irq, void *priv) scp->data->scp_irq_handler(scp); - clk_disable_unprepare(scp->clk); + clk_disable(scp->clk); return IRQ_HANDLED; } @@ -665,7 +665,7 @@ static int scp_load(struct rproc *rproc, const struct firmware *fw) struct device *dev = scp->dev; int ret; - ret = clk_prepare_enable(scp->clk); + ret = clk_enable(scp->clk); if (ret) { dev_err(dev, "failed to enable clocks\n"); return ret; @@ -680,7 +680,7 @@ static int scp_load(struct rproc *rproc, const struct firmware *fw) ret = scp_elf_load_segments(rproc, fw); leave: - clk_disable_unprepare(scp->clk); + clk_disable(scp->clk); return ret; } @@ -691,14 +691,14 @@ static int scp_parse_fw(struct rproc *rproc, const struct firmware *fw) struct device *dev = scp->dev; int ret; - ret = clk_prepare_enable(scp->clk); + ret = clk_enable(scp->clk); if (ret) { dev_err(dev, "failed to enable clocks\n"); return ret; } ret = scp_ipi_init(scp, fw); - clk_disable_unprepare(scp->clk); + clk_disable(scp->clk); return ret; } @@ -709,7 +709,7 @@ static int scp_start(struct rproc *rproc) struct scp_run *run = &scp->run; int ret; - ret = clk_prepare_enable(scp->clk); + ret = clk_enable(scp->clk); if (ret) { dev_err(dev, "failed to enable clocks\n"); return ret; @@ -734,14 +734,14 @@ static int scp_start(struct rproc *rproc) goto stop; } - clk_disable_unprepare(scp->clk); + clk_disable(scp->clk); dev_info(dev, "SCP is ready. FW version %s\n", run->fw_ver); return 0; stop: scp->data->scp_reset_assert(scp); - clk_disable_unprepare(scp->clk); + clk_disable(scp->clk); return ret; } @@ -909,7 +909,7 @@ static int scp_stop(struct rproc *rproc) struct mtk_scp *scp = rproc->priv; int ret; - ret = clk_prepare_enable(scp->clk); + ret = clk_enable(scp->clk); if (ret) { dev_err(scp->dev, "failed to enable clocks\n"); return ret; @@ -917,12 +917,29 @@ static int scp_stop(struct rproc *rproc) scp->data->scp_reset_assert(scp); scp->data->scp_stop(scp); - clk_disable_unprepare(scp->clk); + clk_disable(scp->clk); return 0; } +static int scp_prepare(struct rproc *rproc) +{ + struct mtk_scp *scp = rproc->priv; + + return clk_prepare(scp->clk); +} + +static int scp_unprepare(struct rproc *rproc) +{ + struct mtk_scp *scp = rproc->priv; + + clk_unprepare(scp->clk); + return 0; +} + static const struct rproc_ops scp_ops = { + .prepare = scp_prepare, + .unprepare = scp_unprepare, .start = scp_start, .stop = scp_stop, .load = scp_load, diff --git a/drivers/remoteproc/mtk_scp_ipi.c b/drivers/remoteproc/mtk_scp_ipi.c index c068227e251e7..7a37e273b3af8 100644 --- a/drivers/remoteproc/mtk_scp_ipi.c +++ b/drivers/remoteproc/mtk_scp_ipi.c @@ -171,7 +171,7 @@ int scp_ipi_send(struct mtk_scp *scp, u32 id, void *buf, unsigned int len, WARN_ON(len > scp_sizes->ipi_share_buffer_size) || WARN_ON(!buf)) return -EINVAL; - ret = clk_prepare_enable(scp->clk); + ret = clk_enable(scp->clk); if (ret) { dev_err(scp->dev, "failed to enable clock\n"); return ret; @@ -211,7 +211,7 @@ int scp_ipi_send(struct mtk_scp *scp, u32 id, void *buf, unsigned int len, unlock_mutex: mutex_unlock(&scp->send_lock); - clk_disable_unprepare(scp->clk); + clk_disable(scp->clk); return ret; } -- 2.51.0

