On 3/11/2026 4:28 PM, Stephan Gerhold wrote:
On Tue, Mar 10, 2026 at 03:03:22AM -0700, Jingyi Wang wrote:
From: Gokul Krishna Krishnakumar <[email protected]>

Subsystems can be brought out of reset by entities such as bootloaders.
As the irq enablement could be later than subsystem bring up, the state
of subsystem should be checked by reading SMP2P bits and performing ping
test.

A new qcom_pas_attach() function is introduced. if a crash state is
detected for the subsystem, rproc_report_crash() is called. If the
subsystem is ready either at the first check or within a 5-second timeout
and the ping is successful, it will be marked as "attached". The ready
state could be set by either ready interrupt or handover interrupt.

If "early_boot" is set by kernel but "subsys_booted" is not completed
within the timeout, It could be the early boot feature is not supported
by other entities. In this case, the state will be marked as RPROC_OFFLINE
so that the PAS driver can load the firmware and start the remoteproc. As
the running state is set once attach function is called, the watchdog or
fatal interrupt received can be handled correctly.

Signed-off-by: Gokul Krishna Krishnakumar <[email protected]>
Co-developed-by: Jingyi Wang <[email protected]>
Signed-off-by: Jingyi Wang <[email protected]>
---
  drivers/remoteproc/qcom_q6v5.c      |  88 +++++++++++++++++++++++++++++-
  drivers/remoteproc/qcom_q6v5.h      |  17 +++++-
  drivers/remoteproc/qcom_q6v5_adsp.c |   2 +-
  drivers/remoteproc/qcom_q6v5_mss.c  |   2 +-
  drivers/remoteproc/qcom_q6v5_pas.c  | 103 ++++++++++++++++++++++++++++++++++--
  drivers/remoteproc/qcom_q6v5_wcss.c |   2 +-
  6 files changed, 204 insertions(+), 10 deletions(-)

[...]
diff --git a/drivers/remoteproc/qcom_q6v5_pas.c 
b/drivers/remoteproc/qcom_q6v5_pas.c
index 46204da046fa..4700d111e058 100644
--- a/drivers/remoteproc/qcom_q6v5_pas.c
+++ b/drivers/remoteproc/qcom_q6v5_pas.c
@@ -36,6 +36,8 @@
#define MAX_ASSIGN_COUNT 3 +#define EARLY_ATTACH_TIMEOUT_MS 5000
+
  struct qcom_pas_data {
        int crash_reason_smem;
        const char *firmware_name;
[...]
@@ -510,6 +521,80 @@ static unsigned long qcom_pas_panic(struct rproc *rproc)
        return qcom_q6v5_panic(&pas->q6v5);
  }
+static int qcom_pas_attach(struct rproc *rproc)
+{
+       int ret;
+       struct qcom_pas *pas = rproc->priv;
+       bool ready_state;
+       bool crash_state;
+
+       pas->q6v5.running = true;
+       ret = irq_get_irqchip_state(pas->q6v5.fatal_irq,
+                                   IRQCHIP_STATE_LINE_LEVEL, &crash_state);
+
+       if (!ret && crash_state) {
+               dev_err(pas->dev, "Sub system has crashed before driver 
probe\n");
+               rproc_report_crash(rproc, RPROC_FATAL_ERROR);
+               ret = -EINVAL;
+               goto disable_running;
+       }
+
+       if (!ret)
+               ret = irq_get_irqchip_state(pas->q6v5.ready_irq,
+                                           IRQCHIP_STATE_LINE_LEVEL, 
&ready_state);
+
+       /*
+        * smp2p allocate irq entry can be delayed, irq_get_irqchip_state will 
get -ENODEV,
+        * the 5 seconds timeout is set to wait for this, after the entry is 
allocated, smp2p
+        * will call the qcom_smp2p_intr and complete the timeout in the ISR.
+        */
+       if (unlikely(ret == -ENODEV) || unlikely(!ready_state)) {
+               ret = wait_for_completion_timeout(&pas->q6v5.subsys_booted,
+                                                 
msecs_to_jiffies(EARLY_ATTACH_TIMEOUT_MS));

I have asked this back in October for v2 [1] and again in December for
v3 [2], but you still haven't really answered it. Please answer all
of the following questions:

  1. What is the use case for this timeout?
  2. In which situations will the start of the remoteproc be delayed?
  3. Why does the boot firmware not wait until the remoteproc is fully
     started before it continues booting?
  4. If the boot firmware gives up control before the remoteproc is fully
     started, how do you ensure that the handover resources are
     maintained until the remoteproc signals handover?

v4 looks a bit less dangerous now since you don't enable the handover
IRQ anymore. Still, I don't understand how this would work in practice.
Removing this timeout would be preferable because then we could actually
support firmware versions that do not automatically start the remoteproc
without having to delay the boot process for 5s.

Thanks,
Stephan

[1]: https://lore.kernel.org/r/[email protected]/
[2]: https://lore.kernel.org/r/[email protected]/

Hi Stephan,

For question [1] and [2],
We tried to answer the reason why the timeout is added in the comments,
but seems it is still not clear enough, it is a design in downstream
code to avoid the irq is not received when we check the irq state, but
indeed it seems like a redundant design and may block firmware start.
So we will remove this timeout in next series.

For question [3] and [4],
I think it related to how to deal with the "start" if "attach" fail?
As we will remove the timeout, we have 2 proposals for this:

a. attach fail ->  keep the state "RPROC_DETACHED" -> if user echo "start" -> 
call attach() function again
                                                   -> if user echo "stop" to set 
RPROC_OFFLINE -> user echo "start" to do firmware load
If attach fail, user need to do "stop" first to set the RPROC_OFFLINE
and then "start" the remoteproc for firmware loading.

b. attach fail -> change state to RPROC_OFFLINE -> user echo "start" to do 
firmware load
this is what current code do, if attach fail, RPROC_OFFLINE will be set,
next time user echo start will go to the firmware load path.

Could you please share your comments on which one is better?

And qcom_pas_attach() is called by rproc_add() with auto_boot enabled.
if qcom_pas_attach() return false, rproc_add() will teardown all the
resources, so in rproc_trigger_auto_boot(), instead of calling rproc_boot
directly:
        if (rproc->state == RPROC_DETACHED)
                return rproc_boot(rproc);
we will schedule work to call rproc_boot asynchronously, like what firmware
loading is doing now, to make sure rproc_add() can success even if attach
failed, and we can do the firmware loading in the next step.
Do you think it is okay to do this change?

Thanks,
Jingyi









Reply via email to