On 17.09.20 09:16, Oliver Schwartz wrote:
On 15 Sep 2020, at 11:00, Jan Kiszka wrote:
On 15.09.20 09:07, Oliver Schwartz wrote:
I’m currently trying out the arm64-zero-exits branch and got stuck.
System is a Xilinx ZU9EG on a custom board, similar to zcu102. I’ve
brought ATF up to date and patched it with Jan’s patch to enable SDEI.
If I don’t enable SDEI in ATF everything works as expected (with VM
exits for interrupts, of course). Jailhouse source is the tip of
branch arm64-zero-exits.
If I enable SDEI in ATF, jailhouse works most of the time, except for
when it doesn’t. Sometimes, ‘jailhouse enable’ results in:
Initializing processors:
CPU 1... OK
CPU 0...
/home/oliver/0.12-gitAUTOINC+98061469d0-r0/git/hypervisor/arch/arm64/setup.c:73:
returning error -EIO
Weird - that's the SDEI event enable call.
FAILED
JAILHOUSE_ENABLE: Input/output error
I’ve seen this error only when I enable jailhouse through some init
script during the boot process, when the system is also busy
otherwise. When starting jailhouse on an idle system I haven’t seen this.
Possibly a regression of my recent refactoring which I didn't manage
to test yet. Could you check whether
https://github.com/siemens/jailhouse/commits/e0ef829c85895dc6387d5ea11b08aa65a456255f
is any better?
Sometimes it may hang later during ‘jailhouse enable’:
Initializing processors:
CPU 1... OK
CPU 0... OK
CPU 2... OK
CPU 3... OK
Initializing unit: irqchip
Using SDEI-based management interrupt
Initializing unit: ARM SMMU v3
Initializing unit: PVU IOMMU
Initializing unit: PCI
Adding virtual PCI device 00:00.0 to cell "root"
Page pool usage after late setup: mem 67/992, remap 5/131072
Activating hypervisor
[ 5.847540] The Jailhouse is opening.
Using a JTAG debugger I see that one or more cores are stuck in
hypervisor/arch/arm-common/psci.c, line 105.
It may also succeed in stopping one or more CPUs and then hang (again
with one or more cores stuck in psci.c, line 105):
[ 5.810220] The Jailhouse is opening.
[ 5.860054] CPU1: shutdown
[ 5.862677] psci: CPU1 killed.
Now, with the first problem solved I’ve dug into the second one. It’s
actually a bit worse than in my initial description: If I just do
‘jailhouse enable’ the system will always hang a few milliseconds after
the command completes - the only exception is when ‘jailhouse create’ is
executed immediately afterwards (which creates an inmate that uses 3 of
4 CPU cores, leaving just one for Linux), which succeeds roughly on
every second try. I didn’t notice this initially because I usually start
jailhouse with a script that does ‘enable’ and ‘create’.
The reason for the hangs seems to be the PSCI emulation in Jailhouse, in
particular the CPU_SUSPEND calls. These are issued frequently by my (Xilinx)
kernel if Linux has more than one core available. With SDEI
disabled the core can be woken up again by some interrupt. With SDEI
enabled, the core waits forever on the wfi instruction. Because a
suspended core never wakes up again, the whole system hangs at some point.
Any ideas why no interrupts are seen anymore in psci? My guess is that
it’s because the inmate (Linux) now has full control over the GIC, so it
may disable any interrupts before suspending a core, without Jailhouse
noticing. If this is the case, it may be necessary to re-enable the IRQs
before executing wfi. But I’m missing the big picture here - what
interrupt is the core waiting for in the first place? Any other thoughts?
You likely found a bug in the SDEI feature of Jailhouse. The CPU_SUSPEND
emulation assumes non-SDEI operation, i.e. interception of interrupts by
the hypervisor, but that is not true in this mode.
We need a way to wait for interrupts without actually receiving them
when they arrive, but rather return to EL1 then. Maybe re-enabling
interception, waiting, and then disabling it again before returning
would do the trick. But then I also do not understand yet why
https://github.com/bao-project/bao-hypervisor/blob/master/src/arch/armv8/psci.c
gets away with wfi. Possibly, they run with interrupts on through the
hypervisor, though that would not be straightforward either.
Jan
--
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux
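
The "re-enable interception, wait, disable it again" idea from the reply
above can be sketched roughly as follows. This is only a host-side
simulation, not Jailhouse code: mock_hcr_el2, mock_wait_for_interrupt() and
suspend_with_temporary_interception() are hypothetical names, and the wait
hook merely stands in for the wfi instruction. The only architectural facts
assumed are the HCR_EL2 bit positions of FMO (bit 3) and IMO (bit 4).
Whether this sequence actually wakes the core and leaves the interrupt
pending for EL1 on real hardware is untested.

```c
#include <stdint.h>

/* HCR_EL2 routing bits: FIQ and IRQ are routed to EL2 when set. */
#define HCR_FMO_BIT (1ULL << 3)
#define HCR_IMO_BIT (1ULL << 4)

/* Hypothetical stand-in for the real HCR_EL2 system register. */
static uint64_t mock_hcr_el2;

/* Bookkeeping so the behaviour can be checked on a host machine. */
static uint64_t hcr_during_wait;
static int wait_calls;

/* Hypothetical stand-in for wfi: records the routing state it saw. */
static void mock_wait_for_interrupt(void)
{
	hcr_during_wait = mock_hcr_el2;
	wait_calls++;
}

/*
 * Sketch of a CPU_SUSPEND path for SDEI mode: route interrupts to the
 * hypervisor only for the duration of the wait, then restore the
 * previous routing before returning to the guest.
 */
static void suspend_with_temporary_interception(void)
{
	uint64_t saved = mock_hcr_el2;

	mock_hcr_el2 = saved | HCR_IMO_BIT | HCR_FMO_BIT;
	mock_wait_for_interrupt();
	mock_hcr_el2 = saved;
}
```

Calling suspend_with_temporary_interception() with mock_hcr_el2 set to 0
runs the wait hook once with both routing bits set and leaves the mock
register at 0 afterwards. Saving and restoring the whole register value,
rather than clearing the two bits unconditionally, keeps the sketch
neutral about what the routing state was before the call.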