On 29/10/2020 09:53, Jan Kiszka wrote:
> On 29.10.20 09:39, Andrea Bastoni wrote:
>> On 29/10/2020 07:36, Jan Kiszka wrote:
>>> On 28.10.20 22:29, Andrea Bastoni wrote:
>>>> Hi,
>>>>
>>>> On 28/10/2020 21:14, Jan Kiszka wrote:
>>>>> On 27.10.20 10:22, Jan Kiszka wrote:
>>>>>> On 27.10.20 02:25, Peng Fan wrote:
>>>>>>> Jan,
>>>>>>>
>>>>>>>> Subject: Re: [PATCH v2 00/46] arm64: Rework SMMUv2 support
>>>>>>>>
>>>>>>>> On 14.10.20 10:28, Jan Kiszka wrote:
>>>>>>>>> Changes in v2:
>>>>>>>>>  - map 52-bit parange to 48
>>>>>>>>>
>>>>>>>>> That wasn't the plan when I started, but the more I dug into the
>>>>>>>>> details and started to understand the hardware, the more issues I
>>>>>>>>> found and the more dead code fragments from the Linux usage became
>>>>>>>> visible.
>>>>>>>>>
>>>>>>>>> Highlights of the outcome:
>>>>>>>>>  - Fix stall of SMMU due to unhandled stalled contexts (took me a 
>>>>>>>>> while
>>>>>>>>>    to understand that...)
>>>>>>>>>  - Fix programming of CBn_TCR and TTBR
>>>>>>>>>  - Fix TLB flush on cell exit
>>>>>>>>>  - Fix bogus handling of Extended StreamID support
>>>>>>>>>  - Do not pass-through unknown streams
>>>>>>>>>  - Disable SMMU on shutdown
>>>>>>>>>  - Reassign StreamIDs to the root cell
>>>>>>>>>  - 225 insertions(+), 666 deletions(-)
>>>>>>>>>
>>>>>>>>> The code works as expected on the Ultra96-v2 here, but due to all the
>>>>>>>>> time that went into the rework, I had no chance to bring up my MX8QM
>>>>>>>>> so far. I'm fairly optimistic that things are not broken there as
>>>>>>>>> well, but if they are, bisecting should be rather simple with this
>>>>>>>>> series. So please test and review.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Alice, Peng, already had a chance to review or test (ie. next)?
>>>>>>>
>>>>>>> I gave a test, sometimes I met SDHC ADMA error when
>>>>>>> `jailhouse enable imx8qm.cell`, sometimes it work well.
>>>>>>>
>>>>>>> I suspect when during jailhouse enable phase, there might be
>>>>>>> ongoing sdhc transactions not finished, not sure.
>>>>>>>
>>>>>>> I have not find time to look into details.
>>>>>>>
>>>>>>> Anyway, you could check in to master I think, we could address
>>>>>>> the issue later when I have time.
>>>>>>>
>>>>>>
>>>>>> Hmm, I would still like to understand this first... Do you have the
>>>>>> chance to bisect this effect to a commit? Otherwise, I guess I finally
>>>>>> need to get my board running.
>>>>>>
>>>>>
>>>>> It's running now (quite some effort due to the incomplete upstream state
>>>>> - e.g. upstream u-boot runs but cannot boot all downstream kernels...),
>>>>> but I wasn't able to reproduce startup issues. Shutting down Jailhouse
>>>>> often hangs, though, at least restarting does all the time. And that
>>>>> even with next. Seems we still do not properly turn off/on something here.
>>>>>
>>>>> Interestingly, this issue was not present on the zynqmp.
>>>>
>>>> On a different version of the SMMUv2 developed @ Boston University (Renato 
>>>> in
>>>> CC), re-using the same root page table as the cell created problems due to
>>>> different attributes (uncached) needed by some devices.
>>>
>>> Why are so many folks working downstream on such essential things? Not
>>> helpful, for everyone, even if the goal should be "only" experimental
>>> results.
>>>
>>>>
>>>>> diff --git a/hypervisor/arch/arm64/smmu.c b/hypervisor/arch/arm64/smmu.c
>>>>> index 41c0ffb4..60743bc0 100644
>>>>> --- a/hypervisor/arch/arm64/smmu.c
>>>>> +++ b/hypervisor/arch/arm64/smmu.c
>>>>> @@ -220,6 +220,7 @@ static void arm_smmu_setup_context_bank(struct 
>>>>> arm_smmu_device *smmu,
>>>>>         mmio_write32(cb_base + ARM_SMMU_CB_TCR, VTCR_CELL & ~TCR_RES0);
>>>>>  
>>>>>         /* TTBR0 */
>>>>> +       /* Here */
>>>>>         mmio_write64(cb_base + ARM_SMMU_CB_TTBR0,
>>>>>                      paging_hvirt2phys(cell->arch.mm.root_table) & 
>>>>> TTBR_MASK);
>>>>
>>>> The issue in the BU version was solved by allocating a new page for this.
>>>>
>>>
>>> Only the root level? How were those entries different?
>>
>> Only the root level. IIRC, NC by default, instead of Normal.
>>
>>>> I wanted to check this effect for the version on next, but didn't find the 
>>>> time
>>>> to do it so far :/
>>>>
>>>
>>> How was the issue triggered?
>>
>> From the discussions I had, on the ZCU102, devices were randomly triggering
>> erros/ stopped working.
>>
> 
> I just ran a enable/disable loop aside flood-ping + dd on the Ultra96-v2
> (I would expect it to be identical to the ZCU102 in this regard), and
> that did not trigger any (visible) issues yet. I'll retry with lowering
> the enable frequency.

I extended the configuration of the zynqmp-zcu102 to use the SMMU and I've
started similar tests (enable/disable + flood ping + find /).

With the flooding ping I can regularly trigger ethernet errors in the diable ->
enable interval e.g.,:

[  373.470078] The Jailhouse was closed.
[  374.957052] macb ff0e0000.ethernet eth0: DMA bus error: HRESP not OK
[  374.966376] The Jailhouse is opening.

Maybe just outstanding transactions.

I got once an extended error that included the SD card

[  112.215426] macb ff0e0000.ethernet eth0: DMA bus error: HRESP not OK
[  112.223243] mmc0: ADMA error: 0x02000000
... full dump ...

But I cannot detect from the log if it was after the disable or while jailhouse
was enabled.

I didn't have time to debug much further. I want to double check (also with a
colleague) my current stream_id configuration because I only covered the LPD
masters and I want to check the other TBUs. (I'll post the configuration once
I've checked it.)

Thanks,
Andrea


> 
> Jan
> 
>>
>>>
>>>
>>> I made some progress meanwhile: Linux was also using the SMMU. I'll send
>>> a patch shortly that detects that, like we already in VT-d at least.
>>> Interestingly, this should have been broken on the Ultra96 as well, just
>>> didn't notice.
>>>
>>> With that, I'm running enable/disable loops now, but I lose my Ethernet
>>> link after a while. Returns after ifdown/up, and the system looks fine
>>> otherwise. Seems as if we drop transactions in the transition phase.
>>> However, a dd running in parallel was not triggering any issues.
>>>
>>> Jan
>>>
>>
> 

-- 
Thanks,
Andrea Bastoni

-- 
You received this message because you are subscribed to the Google Groups 
"Jailhouse" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jailhouse-dev/72e5d7a9-e647-e0c5-62a6-9572cbdeee67%40tum.de.

Reply via email to