On 2/13/26 23:52, Marek Vasut wrote:
> On 2/12/26 4:56 PM, Thorsten Leemhuis wrote:
>> On 2/12/26 15:38, Marek Vasut wrote:
>>> On 2/12/26 10:00 AM, Matt Coster wrote:
>>>> On 11/02/2026 19:17, Marek Vasut wrote:
>>>>> On 1/23/26 2:50 PM, Geert Uytterhoeven wrote:
>>>>>> On Fri, 23 Jan 2026 at 14:36, Matt Coster <[email protected]>
>>>>>> wrote:
>>>>>>> On 22/01/2026 16:08, Geert Uytterhoeven wrote:
>>>>>>>> Call the dev_pm_domain_attach_list() and
>>>>>>>> dev_pm_domain_detach_list()
>>>>>>>> helpers instead of open-coding multi PM Domain handling.
>>>>>>>>
>>>>>>>> This changes behavior slightly:
>>>>>>>> - The new handling is also applied in case of a single PM
>>>>>>>> Domain,
>>>>>>>> - PM Domains are now referred to by index instead of by
>>>>>>>> name, but
>>>>>>>> "make dtbs_check" enforces the actual naming and ordering
>>>>>>>> anyway,
>>>>>>>> - There are no longer device links created between virtual
>>>>>>>> domain
>>>>>>>> devices, only between virtual devices and the parent device.
>>>>>>>
>>>>>>> We still need this guarantee, both at start and end of day. In the
>>>>>>> current implementation dev_pm_domain_attach_list() iterates
>>>>>>> forwards,
>>>>>>> but so does dev_pm_domain_detach_list(). Even if we changed that,
>>>>>>> I'd
>>>>>>> prefer not to rely on the implementation details when we can
>>>>>>> declare the
>>>>>>> dependencies explicitly.
>>>>>>
>>>>>> Note that on R-Car, the PM Domains are nested (see e.g.
>>>>>> r8a7795_areas[]),
>>>>>> so they are always (un)powered in the correct order. But that may
>>>>>> not
>>>>>> be the case in the integration on other SoCs.
>>>>>>
>>>>>>> We had/have a patch (attached) kicking around internally to use the
>>>>>>> *_list() functions but keep the inter-domain links in place; it got
>>>>>>> held
>>>>>>> up by discussions as to whether we actually need those dependencies
>>>>>>> for
>>>>>>> the hardware to behave correctly. Your patch spurred me to run
>>>>>>> around
>>>>>>> the office and nag people a bit, and it seems we really do need to
>>>>>>> care
>>>>>>> about the ordering.
>>>>>>
>>>>>> OK.
>>>>>>
>>>>>>> Can you add the links back in for a V2 or I can properly send the
>>>>>>> attached patch instead, I don't mind either way.
>>>>>>
>>>>>> Please move forward with your patch, you are the expert.
>>>>>> I prefer not to be blamed for any breakage ;-)
>>>>>
>>>>> Has there been any progress on fixing this kernel crash ?
>>>>>
>>>>> There are already two proposed solutions, but no fix is upstream.
>>>>
>>>> Yes and no. Our patch to use dev_pm_domain_attach_list() has landed in
>>>> drm-misc-next as commit e19cc5ab347e3 ("drm/imagination: Use
>>>> dev_pm_domain_attach_list()"), but this does not fix the underlying
>>>> issue of missing synchronization in the PM core[1], which is still
>>>> unresolved as far as I'm aware.
>>>
>>> OK, but the pvr driver can currently easily crash the kernel on boot if
>>> firmware is missing, so that should be fixed soon, right ?
>>
>> Well, drm-misc-next afaik means that the above mentioned fix would only
>> be merged in 7.1, which is ~4 months away, which is not really "soon"
>> I'd say. Or did I misjudge this?
>
> The PM domain issue here crashes the kernel, so I think this would be
> material for drm-misc-fixes .

Yeah, sounds a lot like it.

>>> I added the regressions list onto CC, because this seems like a problem
>>> worth tracking.
>>
>> Noticed that and wondered what change caused the regression.
>
> I think this one:
>
> 330e76d31697 ("drm/imagination: Add power domain control")

Thx; FWIW, that was merged for v6.16-rc1.

>> Did not
>> find an answer in a quick search on lore[1]. Because if it's a
>> regression, we maybe should just revert the culprit for now according to
>> Linus:
>> https://lore.kernel.org/lkml/CAHk-=wi86AosXs66-
>> [email protected]/
>>
>> [1] I guess this was the initial report from Geert?
>> https://lore.kernel.org/all/
>> camuhmdwapt40hv3c+csbqfow05awcv1a6v_nijygoyi0i9_...@mail.gmail.com/
>
> It is.
>
> I think there are other SoCs which depend on the power domain commit, so
> revert is not so clear cut anymore.

Well, it's a judgement call. 330e76d31697 was merged less than a year
ago, so I'd not be surprised at all if Linus would revert it in a case
like this. But it seems it doesn't revert clearly anymore, which
complicates things.

> But SoCs which have hierarchical
> power domains and which manage to probe this driver without having a
> firmware available for the GPU will simply end with crashed kernel,
> which is really not good.

Does the patch Matt mentioned fix the crash? His "this does not fix the
underlying issue [...]" (see quote earlier) makes it sound like the
crash or some other problem (theoretical or practical? regression or
not?) remains. If that's the case and no quick fix is in sight, I guess
it would be best if someone affected could post a revert and then we can
ask Linus if he wants to pick it up.

Ciao, Thorsten