On Sat, 6 Dec 2025 21:26:12 -0500
Joel Fernandes <[email protected]> wrote:

> Hi Zhi,
> 
> On 12/6/2025 7:42 AM, Zhi Wang wrote:

snip

> 
> boot() already returns -ETIMEDOUT via
> wait_till_halted()->read_poll_timeout().
> 
> The wait there is 2 seconds. I assume the scrubber would have
> completed by then.
> 
> > +
> > +            dev_dbg!(
> > +                pdev.as_ref(),
> > +                "SEC2 MBOX0: {:#x}, MBOX1{:#x}\n",
> > +                mbox0,
> > +                mbox1
> > +            );
> > +
> > +            if !regs::NV_PGC6_BSI_SECURE_SCRATCH_15::read(bar).scrubber_completed() {
> > +                return Err(ETIMEDOUT);  
> 
> So under which situation do you get to this point
> (!scrubber_completed) ? Basically I am not sure if ETIMEDOUT is the
> right error to return here, because boot() already returns ETIMEDOUT
> by waiting for the halt.
>
> If you still want to return ETIMEDOUT here, then it sounds like you're
> waiting for scrubbing beyond the waiting already done by boot(). If
> so, then shouldn't you need to use read_poll_timeout() here?
> 
> perhaps something like:
> 
>  read_poll_timeout(
>      || Ok(regs::NV_PGC6_BSI_SECURE_SCRATCH_15::read(bar).scrubber_completed()),
>      |val: &bool| *val,
>      Delta::from_millis(10),
>      Delta::from_secs(5),
>  )?;
> 

This is identical to the OpenRM implementation [1]. According to that
part of the code, I think the scrubber runs as part of the binary's boot
process. By the time it signals that the firmware booted successfully,
the scrubbing should be done, so no extra wait is needed here. Let me
change this to another errno.
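
To make the intended flow concrete, here is a rough userspace sketch of
the idea: the only wait is the halt poll done during boot, and the
scrubber flag is then checked exactly once, failing with a different
error than the timeout. This is not the nova-core code. Error, poll_until
and run_scrubber are hypothetical names, Error::Io merely stands in for
whichever errno ends up being used, and the 50 ms timeout is an arbitrary
value for the model.

```rust
use std::time::{Duration, Instant};

/// Hypothetical error type standing in for kernel errnos.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Error {
    /// Stands in for ETIMEDOUT (the halt never happened).
    TimedOut,
    /// Placeholder for whichever errno replaces ETIMEDOUT (assumption).
    Io,
}

/// Poll `read` until `done` accepts the value or `timeout` elapses.
/// A simplified userspace model of a poll-with-timeout helper.
fn poll_until<T>(
    mut read: impl FnMut() -> T,
    done: impl Fn(&T) -> bool,
    sleep: Duration,
    timeout: Duration,
) -> Result<T, Error> {
    let start = Instant::now();
    loop {
        let val = read();
        if done(&val) {
            return Ok(val);
        }
        if start.elapsed() >= timeout {
            return Err(Error::TimedOut);
        }
        std::thread::sleep(sleep);
    }
}

/// Wait for the halt, then check the scrubber flag a single time.
fn run_scrubber(
    halted: impl FnMut() -> bool,
    scrubber_completed: impl FnOnce() -> bool,
) -> Result<(), Error> {
    // The only wait: boot polls until the falcon reports it has halted.
    poll_until(
        halted,
        |h: &bool| *h,
        Duration::from_millis(1),
        Duration::from_millis(50),
    )?;

    // No second wait: once boot succeeded, scrubbing is expected to be
    // done, so an unset flag is reported as a failure, not a timeout.
    if !scrubber_completed() {
        return Err(Error::Io);
    }
    Ok(())
}
```

With this split, a timeout error can only come from the halt wait, so
the caller can tell "never halted" apart from "halted but did not
scrub".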

[1] https://github.com/NVIDIA/open-gpu-kernel-modules/blob/a5bfb10e75a4046c5d991c65f49b5d29151e68cf/src/nvidia/src/kernel/gpu/gsp/arch/ada/kernel_gsp_ad102.c#L49
> Thanks.
> 
