When a DSP hangs without triggering its own crash handler, a graceful
shutdown via sysmon times out. Before this series, two things went
wrong:
1. The stop() and unprepare() callbacks in struct rproc_subdev were
void. Implementations had no way to surface failures, and callers
discarded any internal error state silently.
2. Even if an error had been detectable, rproc_stop_subdevices() kept
iterating after a failed stop. This meant glink and ssr subdevices
were torn down regardless, causing HLOS to unregister and unmap the
shared memory regions. If the remote still had DMA in flight
against those regions, as is often the case with a hung DSP, the
result was an SMMU fault.
This series fixes both problems in three steps.
Patch 1 changes stop() and unprepare() from void to int, matching
prepare() and start(). Most implementations gain a trivial "return 0",
while rproc_vdev_do_stop() now returns the error it was already
computing. Callers warn on failure but continue iterating (best-effort).
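
For illustration only, the patch 1 behaviour can be modelled in userspace
roughly as below. This is a sketch, not the kernel code: struct
subdev_model, stop_ok(), stop_fail() and stop_all_best_effort() are
hypothetical names, and the iteration order is an assumption of the model.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct rproc_subdev; only stop() is modelled. */
struct subdev_model {
	const char *name;
	int (*stop)(struct subdev_model *sd, bool crashed); /* int, no longer void */
	int stop_calls;
};

/* A stop() with nothing to report gains a trivial "return 0". */
static int stop_ok(struct subdev_model *sd, bool crashed)
{
	(void)crashed;
	sd->stop_calls++;
	return 0;
}

/* A stop() that can now surface an internal failure to its caller. */
static int stop_fail(struct subdev_model *sd, bool crashed)
{
	(void)crashed;
	sd->stop_calls++;
	return -EIO;
}

/* Best-effort caller: warn on each failure, keep going, report the count. */
static int stop_all_best_effort(struct subdev_model **subdevs, size_t n,
				bool crashed)
{
	int failures = 0;

	for (size_t i = 0; i < n; i++) {
		int ret = subdevs[i]->stop(subdevs[i], crashed);

		if (ret) {
			fprintf(stderr, "%s: stop failed: %d\n",
				subdevs[i]->name, ret);
			failures++;
		}
	}
	return failures;
}
```

In this model a failing subdevice is reported, but every later subdevice
is still stopped, which is the behaviour patch 2 then changes.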
Patch 2 changes rproc_stop_subdevices() to abort on the first failing
subdev and return its error, which then propagates through rproc_stop()
and __rproc_detach().
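
A minimal userspace sketch of the abort-on-first-failure idea follows.
The names struct stop_step, stop_steps_abort_on_error() and
stop_remote_model() are hypothetical, and the early return in the caller
is only the model's reading of "propagating"; the real rproc_stop() may
do more around it.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one subdevice stop step. */
struct stop_step {
	int ret;	/* value this step's stop() would return */
	bool called;	/* set once the step has been visited */
};

/* Stop steps in order; bail out on the first non-zero return. */
static int stop_steps_abort_on_error(struct stop_step *steps, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		steps[i].called = true;
		if (steps[i].ret)
			return steps[i].ret; /* later steps are never reached */
	}
	return 0;
}

/* Hypothetical caller standing in for rproc_stop(): propagate the error. */
static int stop_remote_model(struct stop_step *steps, size_t n, bool *rest_ran)
{
	int ret = stop_steps_abort_on_error(steps, n);

	if (ret)
		return ret;

	*rest_ran = true; /* remainder of the stop path, reached on success only */
	return 0;
}
```

The point of the design is that a subdevice which cannot confirm the
remote is quiescent prevents every later teardown step from running.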
Patch 3 makes sysmon_stop() return -ETIMEDOUT when the remote does not
acknowledge a graceful shutdown request. Combined with patch 2, this
prevents glink and ssr from unmapping shared memory against a hung DSP.
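
The combined effect of patches 2 and 3 can be sketched in userspace as
follows. Everything here is a hypothetical model: struct remote_model and
the *_model() functions are invented for illustration, the shutdown
handshake is reduced to a boolean, the early return for a crashed remote
is an assumption, and the model assumes sysmon is stopped before glink.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the remote and of the memory shared with it. */
struct remote_model {
	bool acks_shutdown;	/* false models a hung DSP */
	bool shmem_mapped;	/* shared memory still mapped for the remote */
};

/* Model of sysmon stop: report a timeout when no acknowledgement arrives. */
static int sysmon_stop_model(struct remote_model *r, bool crashed)
{
	if (crashed)
		return 0; /* assumption: no graceful request after a crash */
	if (!r->acks_shutdown)
		return -ETIMEDOUT;
	return 0;
}

/* Model of glink stop: tearing down the transport unmaps shared memory. */
static int glink_stop_model(struct remote_model *r, bool crashed)
{
	(void)crashed;
	r->shmem_mapped = false;
	return 0;
}

/* Abort-on-first-failure sequence: sysmon first, then glink. */
static int stop_sequence_model(struct remote_model *r, bool crashed)
{
	int ret = sysmon_stop_model(r, crashed);

	if (ret)
		return ret; /* glink is not torn down, memory stays mapped */
	return glink_stop_model(r, crashed);
}
```

With a hung remote the sequence returns -ETIMEDOUT and leaves the shared
memory mapped, so DMA still in flight does not hit an unmapped region.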
Mukesh Ojha (3):
  remoteproc: check return value of subdev stop and unprepare callbacks
  remoteproc: abort subdev stop sequence on first failure
  remoteproc: qcom_sysmon: abort stop on unacknowledged shutdown

 drivers/remoteproc/qcom_common.c       | 26 ++++++++++++++-----
 drivers/remoteproc/qcom_sysmon.c       | 15 ++++++++---
 drivers/remoteproc/remoteproc_core.c   | 36 +++++++++++++++++++++-----
 drivers/remoteproc/remoteproc_virtio.c |  4 ++-
 drivers/rpmsg/mtk_rpmsg.c              |  8 ++++--
 include/linux/remoteproc.h             |  4 +--
 6 files changed, 71 insertions(+), 22 deletions(-)
--
2.53.0