On Thu, Nov 13, 2025 at 07:44:03AM -0800, Tanmay Shah wrote: > Current attach on recovery mechanism loads the clean resource table > during recovery, but doesn't re-allocate the resources. RPMsg > communication will fail after recovery due to this. Fix this > incorrect behavior by doing the full detach and attach of remote > processor during the recovery. This will load the clean resource table > and re-allocate all the resources, which will set up correct vring > information in the resource table. > > Signed-off-by: Tanmay Shah <[email protected]> > --- > > Changes in v2: > - use rproc_boot instead of rproc_attach > - move debug message early in the function > > drivers/remoteproc/remoteproc_core.c | 26 +++++++++++++++----------- > 1 file changed, 15 insertions(+), 11 deletions(-) > > diff --git a/drivers/remoteproc/remoteproc_core.c > b/drivers/remoteproc/remoteproc_core.c > index aada2780b343..f65e8bc2d1e1 100644 > --- a/drivers/remoteproc/remoteproc_core.c > +++ b/drivers/remoteproc/remoteproc_core.c > @@ -1777,11 +1777,11 @@ static int rproc_attach_recovery(struct rproc *rproc) > { > int ret; > > - ret = __rproc_detach(rproc); > + ret = rproc_detach(rproc); > if (ret) > return ret; > > - return __rproc_attach(rproc); > + return rproc_boot(rproc); > } > > static int rproc_boot_recovery(struct rproc *rproc) > @@ -1829,6 +1829,11 @@ int rproc_trigger_recovery(struct rproc *rproc) > struct device *dev = &rproc->dev; > int ret; > > + dev_err(dev, "recovering %s\n", rproc->name); > + > + if (rproc_has_feature(rproc, RPROC_FEAT_ATTACH_ON_RECOVERY)) > + return rproc_attach_recovery(rproc); > +
Humm... I find this a little messy. Taking [1] as an example, I suggest moving the "unlock_mutex" block to line 1846 and add mutex calls to rproc_boot_recovery(). That way both rproc_attach_recovery() and rproc_boot_recovery() are called the same way. [1] https://elixir.bootlin.com/linux/v6.17.8/source/drivers/remoteproc/remoteproc_core.c#L1832 > ret = mutex_lock_interruptible(&rproc->lock); > if (ret) > return ret; > @@ -1837,12 +1842,7 @@ int rproc_trigger_recovery(struct rproc *rproc) > if (rproc->state != RPROC_CRASHED) > goto unlock_mutex; > > - dev_err(dev, "recovering %s\n", rproc->name); > - > - if (rproc_has_feature(rproc, RPROC_FEAT_ATTACH_ON_RECOVERY)) > - ret = rproc_attach_recovery(rproc); > - else > - ret = rproc_boot_recovery(rproc); > + ret = rproc_boot_recovery(rproc); > > unlock_mutex: > mutex_unlock(&rproc->lock); > @@ -1860,6 +1860,7 @@ static void rproc_crash_handler_work(struct work_struct > *work) > { > struct rproc *rproc = container_of(work, struct rproc, crash_handler); > struct device *dev = &rproc->dev; > + int ret; > > dev_dbg(dev, "enter %s\n", __func__); > > @@ -1883,8 +1884,11 @@ static void rproc_crash_handler_work(struct > work_struct *work) > > mutex_unlock(&rproc->lock); > > - if (!rproc->recovery_disabled) > - rproc_trigger_recovery(rproc); > + if (!rproc->recovery_disabled) { > + ret = rproc_trigger_recovery(rproc); > + if (ret) > + dev_warn(dev, "rproc recovery failed, err %d\n", ret); I would prefer a patch on its own for this one. I'm running out of time for today, I'll review patch 3/3 tomorrow. Thanks, Mathieu > + } > > out: > pm_relax(rproc->dev.parent); > @@ -2057,7 +2061,7 @@ int rproc_detach(struct rproc *rproc) > return ret; > } > > - if (rproc->state != RPROC_ATTACHED) { > + if (rproc->state != RPROC_ATTACHED && rproc->state != RPROC_CRASHED) { > ret = -EINVAL; > goto out; > } > -- > 2.34.1 >

