Hello Chaoyi,

Following up on the need_regulator suggestion -- I implemented and tested it
on the board, and unfortunately it doesn't avoid the deadlock on RK3568; it
moves it from boot to the NPU job submit.

What I did: gave the RK3568 NPU power domain a regulator (a DOMAIN_M_R variant
with need_regulator = true), wired domain-supply = <&vdd_npu>, and dropped the
regulator-always-on workaround.
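
For reference, the DT side of that change is roughly the fragment below. This
is a sketch, not the exact diff: the pd_npu label is a placeholder for however
the board tree references power-domain@6, and only domain-supply and vdd_npu
come from the description above.

  /* pd_npu is a hypothetical label for the RK3568 NPU power-domain node */
  &pd_npu {
  	domain-supply = <&vdd_npu>;
  };

  /* regulator-always-on is also removed from the vdd_npu regulator node */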

Boot now completes and the NPU probes, but there is one warning during boot:

  rockchip-pm-domain ...: Failed to create device link (0x180) with supplier
  0-0020 for .../power-domain@6

(0-0020 is the rk809 PMIC that supplies vdd_npu.) Then on the first NPU job
submit the board hard-hangs with an RCU stall:

  rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
  rcu:     3-...!: (1 GPs behind) ...
  rcu: rcu_preempt kthread starved for 5115 jiffies! ... RCU_GP_WAIT_FQS(5)
  rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected

My reading: vdd_npu is on the rk809 *I2C* PMIC, so when genpd enables/disables
the regulator during the NPU's runtime-PM power transition, the I2C transfer
runs in a context that starves RCU and the box freezes. (I suspect
need_regulator is fine on the RK3588 NPU because its supply isn't behind an
I2C PMIC.) The always-on workaround avoids this precisely because genpd never
touches the I2C regulator in that path.

So: for an NPU domain whose supply is an I2C PMIC, is there a supported way to
let genpd own the regulator without performing the I2C op in the
power-transition path (a deferred/async regulator enable, or a flag), or
should RK3568 keep vdd_npu as regulator-always-on? For v4 I'll keep always-on
unless there's a cleaner path.


Thanks,
Midgy

On Mon, 8 Jun 2026 at 10:05, Midgy Balon <[email protected]> wrote:
>
> Hello Chaoyi,
>
> Thanks -- this is exactly what I needed.
>
> - v2/DTE: will do. I'll keep building on Simon's per-device-ops series -- with
>   that in place the NPU MMU can use the 32-bit-DTE ops (the per-ops GFP_DMA32
>   that's already in mainline) without the global rk_ops conflict. I'll keep it
>   as a stated dependency in the v4 cover letter.
>
> - vdd_npu: I'll switch the RK3568 NPU power domain to need_regulator +
>   domain-supply = <&vdd_npu> and drop the regulator-always-on workaround. I
>   suspect that's also the right fix for the power-off/on de-idle issue I
>   described -- the always-on was really just papering over the domain not
>   being modelled with a regulator. I'll confirm on the board.
>
> - AUTO_GATING: thanks for the commit references -- I'll keep the bit-31
>   read-modify-write form with your Suggested-by and write the comment from
>   those. For the record: on v7.1-rc6 the NPU MMU also completes translations
>   on the reset value (I couldn't reproduce a page-walk stall without the
>   write), so I'll note in the commit that it matches the vendor clock-gating
>   handling rather than fixing a failure I can reproduce here -- happy to drop
>   it if the iommu maintainers would prefer.
>
> - PVTPLL/NoC: I'll follow up with Finley. First I'll check whether the
>   need_regulator change resolves the NoC re-power de-idle on its own; if it
>   still fails I'll bring him the details (the genpd power-on de-idle ack and
>   the BUS_IDLE_ST state).
>
> I'll send a v4 with these. Thanks again for the quick, detailed answers.
>
> Kind regards,
> Midgy
>
> On Mon, 8 Jun 2026 at 03:40, Chaoyi Chen <[email protected]> wrote:
> >
> > Hi Midgy,
> >
> > On 6/8/2026 5:03 AM, Midgy Balon wrote:
> > > Hi Chaoyi,
> > >
> > > Thanks a lot for looking at this -- input from Rockchip is exactly what
> > > this series needs.
> > >
> > >> Hmmm. If I understand correctly, the NPU IOMMU should be v2 rather than
> > >> v1, implying it should support 40-bit PAs. Nevertheless, please note that
> > >> the upper limit for DTE is 32 bits.
> > >
> > > Understood, and that 32-bit-DTE note is the crux of the trouble I had, so
> > > let me lay out what I see and ask how you'd prefer to solve it.
> > >
> > > The mainline node is already v2 (rockchip,rk3568-iommu in
> > > rk356x-base.dtsi). The problem on this 8 GiB board: with the v2 ops the
> > > page-table allocations (gfp_flags == 0) can land above 4 GiB, so the DTE
> > > ends up > 32 bits and the NPU's first translation faults with
> > > DMA_READ_ERROR. To work around that I had switched the NPU MMU to the v1
> > > compatible (rockchip,iommu), whose ops set GFP_DMA32 and keep the DTE
> > > sub-4 GiB. That works in isolation, but because the driver keeps a single
> > > global rk_ops, a v1 NPU MMU then trips WARN_ON(rk_ops != ops) against the
> > > SoC's v2 instances (VOP/VDEC), which is why I based the series on Simon's
> > > per-device-ops work.
> > >
> > > So my question: with per-device ops in place, what's the intended way to
> > > keep the NPU MMU on v2 *and* cap its DTE at 32 bits on boards with >4 GiB
> > > of RAM? A v2 ops variant carrying GFP_DMA32 for this device, or is there a
> > > register/config bit that constrains the DTE address? I'd rather follow the
> > > Rockchip intent here than carry the v1 workaround. (Simon, cc'd -- this is
> > > right next to your per-device-ops series.)
> > >
> >
> > If Simon's method works, please use it :)
> >
> > >> Can these operations not be completed via the pmdomain driver?
> > >> If some operations are controlled by TF-A, are you using open source
> > >> TF-A?
> > >
> > > Most of it is in pmdomain already. Power-on and NoC de-idle are done by
> > > the RK3568 NPU power domain (genpd) at power-on -- the driver no longer
> > > pokes the PMU directly. Two things remain outside it:
> > >
> > >  - vdd_npu: I mark it regulator-always-on in DT rather than wiring it as
> > >    the domain's domain-supply, because as a domain-supply it created a
> > >    device-link to the I2C PMIC (rk809) and genpd's power-off QoS-save path
> > >    then hung reading the NPU QoS registers behind the (gated) NoC. If
> > >    there's a clean way to let genpd own vdd_npu without that I2C ordering
> > >    deadlock I'd much prefer that -- pointers welcome.
> > >
> >
> > Please refer to the patch below regarding the RK3588 NPU pmdomain.
> > In short, you need to set a "need_regulator" for the RK3568 NPU pmdomain.
> >
> > https://lore.kernel.org/all/[email protected]/
> >
> > >  - the NPU compute clock (PVTPLL): set from the driver via SCMI, and only
> > >    needed for actual compute, not for bring-up.
> > >
> > > One more pmdomain observation from testing, possibly relevant to how the
> > > NPU domain should be modelled: the domain's power-off/on cycle doesn't
> > > reliably re-de-idle the NoC. If the NPU is probed after genpd has already
> > > powered the (unused) domain off, the power-on de-idle fails ("failed to
> > > set idle on domain 'npu'") and the NPU IOMMU then takes an external abort
> > > on its first MMIO access. Probing the NPU before the unused-domain
> > > power-off, or marking the domain always-on, both avoid it. Is the NoC
> > > de-idle expected to work on a genpd re-power here, or should this domain
> > > effectively stay on?
> > >
> >
> > Not quite sure what's going on with PVTPLL and NOC.
> > Maybe @Finley knows about this?
> >
> > > On TF-A: yes -- bl31 is built from upstream arm-trusted-firmware
> > > (github.com/ARM-software/arm-trusted-firmware, RK3568 platform),
> > > providing PSCI and the SCMI clock service. The only closed blob in the
> > > boot chain is Rockchip's DDR init (rkbin), which is the standard situation
> > > for mainline RK356x.
> >
> > --
> > Best,
> > Chaoyi
