Hi Karim,

On 20/11/2022 19:47, Karim Manaouil wrote:
Hi Ralf,

Thanks a lot for the help!

The missing entries are for AMD IOMMU (they appear
in the /proc/iomem tree, but for some reason the jailhouse
config script is not adding them to sysconfig.c).

I also had to add amd_iommu=false to the kernel cmdline.
Now the jailhouse is correctly created with no issues, and
the Linux on the root cell works perfectly.

Perfect, sounds good! Yep, that's required. Afair, we don't have support for the AMD IOMMU yet.


I wanted to try creating another cell using the demo provided
by configs/x86/apic-demo.c (since it's very simple), but it always
generates a PIO read error.

The PIO address is for UART (0x2f9). This address exists in a
PIO_RANGE in apic-demo.c and it also exists on the root cell's
sysconfig.c, so it should work, but here we are.

When you create a non-root cell, you take away that PIO_RANGE from the root cell. If the root cell then tries to access that port again, Jailhouse will stop the cell due to a port access violation. This is exactly what happened, and in fact it works as intended.
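
The ownership rule described above can be illustrated with a small Python model. This is only a sketch and not Jailhouse code: the port ranges and the helper names shrink_ranges and may_access are made up for the illustration.

```python
def shrink_ranges(owner_ranges, taken):
    """Remove the half-open port range `taken` from a list of half-open ranges."""
    t_start, t_end = taken
    result = []
    for start, end in owner_ranges:
        if end <= t_start or start >= t_end:
            # No overlap: the owner keeps this range unchanged.
            result.append((start, end))
            continue
        if start < t_start:
            result.append((start, t_start))   # part below the taken range
        if end > t_end:
            result.append((t_end, end))       # part above the taken range
    return result


def may_access(ranges, port):
    """True if `port` lies inside one of the half-open ranges."""
    return any(start <= port < end for start, end in ranges)


# Hypothetical root-cell port ranges: secondary and primary UART windows.
root = [(0x2F8, 0x300), (0x3F8, 0x400)]

# Creating the non-root cell hands 0x2f8..0x2ff over to it.
root_after = shrink_ranges(root, (0x2F8, 0x300))

print(may_access(root, 0x2F9))        # before cell creation
print(may_access(root_after, 0x2F9))  # after cell creation
```

In this model the root cell may touch port 0x2f9 before the cell is created and may not afterwards, which matches the "Invalid PIO read" reported in the log.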

What you need to do is to ensure that Linux won't claim that port, but I wonder why Linux actually accesses that port… 0x2f8 is the secondary UART (0x3f8 is the primary, also used by Jailhouse for its debug output). Maybe there's a TTY allocated on that port?
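
To see which UART and which register the faulting ports belong to, a short Python sketch can help. It assumes the conventional PC port assignments for COM1 to COM4, the usual Linux ttyS numbering, and the standard 16550 register layout; describe_port is a hypothetical helper name.

```python
# Conventional legacy PC UART base ports and their usual Linux device names.
LEGACY_UART_BASES = {
    0x3F8: "COM1/ttyS0",
    0x2F8: "COM2/ttyS1",
    0x3E8: "COM3/ttyS2",
    0x2E8: "COM4/ttyS3",
}

# 16550 register names by offset from the base port (with DLAB cleared).
UART_REGS = ["RBR/THR", "IER", "IIR/FCR", "LCR", "MCR", "LSR", "MSR", "SCR"]


def describe_port(port):
    """Return (uart_name, register_name), or None if the port is not in a legacy UART window."""
    base = port & ~0x7  # each UART occupies 8 consecutive ports
    if base not in LEGACY_UART_BASES:
        return None
    return LEGACY_UART_BASES[base], UART_REGS[port - base]


for port in (0x2F9, 0x2FA):
    print(hex(port), describe_port(port))
```

Both ports from the log (0x2f9 and 0x2fa) fall into the 0x2f8 window, so something in the root cell is still talking to the secondary UART.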

There are two solutions to address your issue (you should do both, just to have your system under control):

1. You can limit the number of UARTs with the nr_uarts
   kernel parameter and/or identify and stop the application that
   accesses the UART. 'lsof | grep ttyS1' or checking systemd's services
   (via systemctl) might be a good starting point.

2. Don't assign PIO_RANGE(0x2f8, ...) to the non-root cell if you (a)
   don't have a secondary UART at all (which I would expect), and (b)
   don't need it inside that cell. You can use the primary UART for
   the non-root cell.

As far as I remember, the non-root cell will use the primary UART in any case by default.

  Ralf


Here is the log that I get after running:
jailhouse cell create configs/x86/apic-demo.cell
It shows both hypervisor and Linux output. I also
attached apic-demo.c for reference.

Page pool usage after late setup: mem 1934/7635, remap 65703/131072
Activating hypervisor
[  698.582280] jailhouse: enter_hypvisor called on every cpu
[  698.587918] jailhouse: console unmapped
[  698.591942] jailhouse: firmware released
[  698.595998] jailhouse: root cell registered
[  698.600313] jailhouse: pci virtual root device added
[  698.604973] hpet_rtc_timer_reinit: 5 callbacks suppressed
[  698.604977] hpet: Lost 5719 RTC interrupts
[  698.605434] The Jailhouse is opening.
[  733.204370] jailhouse: pci setup done
[  733.283290] IRQ fixup: irq 789 move in progress, old vector 39
[  733.289163] IRQ 789: no longer affine to CPU3
[  733.294892] smpboot: CPU 3 is now offline
Created cell "apic-demo"
Page pool usage after cell creation: mem 1949/7635, remap 65703/131072
AFATAL: Invalid PIO read, port: 2f9 size: 1
RIP: 0xffffffff999b2683 RSP: 0xffffa0f68d627c98 FLAGS: 6
RAX: 0xffffffff999b2670 RBX: 0x0000000000000247 RCX: 0x0000000000000000
RDX: 0x00000000000002f9 RSI: 0x0000000000000001 RDI: 0xffffffff9b3da9f8
CS: 10 BASE: 0x0000000000000000 AR-BYTES: 29b EFER.LMA 1
CR0: 0x0000000080050033 CR3: 0x0000003082bea000 CR4: 0x00000000003506a0
EFER: 0x0000000000001d01
Parking CPU 29 (Cell: "RootCell")
FATAL: Invalid PIO read, port: 2fa size: 1
RIP: 0xffffffff999b2683 RSP: 0xffffa0f68d450f08 FLAGS: 6
RAX: 0xffffffff999b2670 RBX: 0xffffffff9b3da9f8 RCX: 0x0000000000000000
RDX: 0x00000000000002fa RSI: 0x0000000000000002 RDI: 0xffffffff9b3da9f8
CS: 10 BASE: 0x0000000000000000 AR-BYTES: 29b EFER.LMA 1
CR0: 0x0000000080050033 CR3: 0x0000002085c36000 CR4: 0x00000000003506a0
EFER: 0x0000000000001d01
Parking CPU 64 (Cell: "RootCell")


Cheers
Karim
------------------------------------------------------------------------
*From:* Ralf Ramsauer <[email protected]>
*Sent:* 18 November 2022 20:18
*To:* Karim Manaouil <[email protected]>
*Cc:* [email protected] <[email protected]>; Henning Schild <[email protected]>; [email protected] <[email protected]>
*Subject:* Re: [EXT] Re: Jailhouse: enter_hypervisor returns -ENOMEM
This email was sent to you by someone outside the University.
You should only click on links or attachments if you are certain that the email is genuine and the content is safe.

Hi Karim,

On 18/11/2022 19:27, Karim Manaouil wrote:
Hi Ralf,

Thanks! I appreciate your help!

I disabled MCE to get rid of the unhandled MSR read error. It works.

I also fixed the PCIe 04:00.0 invalid write to reg 0xb4 by manually adding

Okay, but take care: if you manually add an entry, you also need to
adjust the array size and any references to it.

a capability entry to sysconfig.c giving it write permissions (btw, the
entry for that register was not generated by the config tool).

Not everything is covered by the generator, some parts require manual
inspection.


Now, I am still getting the invalid MMIO/RAM read and write (see log below).
I first get the read error immediately after the page pool message
is printed. Then after a little while, the write error follows up.

Now we need to inspect /proc/iomem. You need to check what is behind
address 0x90482020, and add an appropriate memory region entry to your
config.
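
A quick way to find what sits behind a faulting address is to scan /proc/iomem for every range that contains it. The following Python sketch does that on a text blob; find_regions is a hypothetical helper, and the SAMPLE excerpt is invented for illustration and is not taken from the machine in this thread. On a real system, read /proc/iomem as root, because unprivileged reads show zeroed addresses.

```python
import re

# Matches lines such as "  90400000-904fffff : PCI Bus 0000:01".
LINE_RE = re.compile(r"^(\s*)([0-9a-fA-F]+)-([0-9a-fA-F]+) : (.*)$")


def find_regions(iomem_text, addr):
    """Return [(depth, start, end, name)] for every entry containing addr, outermost first."""
    hits = []
    for line in iomem_text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue
        indent, start, end, name = m.groups()
        start, end = int(start, 16), int(end, 16)
        if start <= addr <= end:  # /proc/iomem ranges are inclusive
            hits.append((len(indent) // 2, start, end, name))
    return hits


# Hypothetical excerpt in /proc/iomem format (two spaces per nesting level).
SAMPLE = """\
00000000-00000fff : Reserved
90000000-a1ffffff : PCI Bus 0000:00
  90400000-904fffff : PCI Bus 0000:01
    90480000-9048ffff : 0000:01:00.0
"""

for depth, start, end, name in find_regions(SAMPLE, 0x90482020):
    print("  " * depth + f"{start:08x}-{end:08x} : {name}")
```

The innermost hit is the one that usually needs a matching memory region entry in the root cell config.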

    Ralf


It is always the same: an immediate read error followed by a write error.
It happens every time (but the addresses change).

Below is the last log from Jailhouse, copy-pasted.
I also attached the output of lspci -vvv as well as sysconfig.c.

Cheers
Karim

Page pool usage after late setup: mem 1927/7635, remap 65703/131072
FATAL: Invalid MMIO/RAM read, addr: 0x0000000090482020 size: 4
RIP: 0xffffffff915d1735 RSP: 0xffffa9b08e97be18 FLAGS: 296
RAX: 0xffffa9b080780000 RBX: 0xffff934f4262a7c0 RCX: 0x0000000000000000
RDX: 0xffff934f47e32f10 RSI: 0xffff934f4004e800 RDI: 0x0000000000000021
CS: 10 BASE: 0x0000000000000000 AR-BYTES: 29b EFER.LMA 1
CR0: 0x0000000080050033 CR3: 0x00000020c2682000 CR4: 0x00000000003506a0
EFER: 0x0000000000001d01
Parking CPU 7 (Cell: "RootCell")
Ignoring NMI IPI to CPU 88
Ignoring NMI IPI to CPU 88
FATAL: Invalid MMIO/RAM write, addr: 0x0000000093a82008 size: 4
RIP: 0xffffffff915ccce7 RSP: 0xffffa9b08db54da0 FLAGS: 2
RAX: 0xffffa9b080380000 RBX: 0x0000000000000001 RCX: 0x0000000000001a70
RDX: 0xffff9376c0004000 RSI: 0x3000004500000000 RDI: 0x7ffffffffffff003
CS: 10 BASE: 0x0000000000000000 AR-BYTES: 29b EFER.LMA 1
CR0: 0x0000000080050033 CR3: 0x00000001267d0000 CR4: 0x00000000003506a0
EFER: 0x0000000000001d01
Parking CPU 81 (Cell: "RootCell")
Ignoring NMI IPI to CPU 88
Ignoring NMI IPI to CPU 88
Ignoring NMI IPI to CPU 88


------------------------------------------------------------------------
*From:* Ralf Ramsauer <[email protected]>
*Sent:* 18 November 2022 16:23
*To:* Karim Manaouil <[email protected]>; Henning Schild
<[email protected]>
*Cc:* [email protected] <[email protected]>;
[email protected] <[email protected]>
*Subject:* Re: [EXT] Re: Jailhouse: enter_hypervisor returns -ENOMEM

Hi,

On 18/11/2022 02:19, Karim Manaouil wrote:
Hi Henning,

I spent some more time debugging the issue.
I am getting a "FATAL: Invalid MMIO/RAM write".
It probably happens right after the first CPU
calls arch_cpu_activate_mm() in hypervisor/setup.c:entry().

Not sure why, but maybe you have some pointers.

Here is the jailhouse output copy-pasted below.

Cheers

Initializing Jailhouse hypervisor v0.12 (314-gc7a1b697-dirty) on CPU 6
Code location: 0xfffffffff0000050
Using xAPIC
Page pool usage after early setup: mem 813/7635, remap 1/131072
Initializing processors:

Ok, having read your log, you have (at the moment) at least two issues:

First:

   > FATAL: Invalid PCI config write, device 04:00.0, reg: 0xb4, size: 2

For the moment, go to your config and allow write access to that
capability. I could maybe help you with this if you share your config.
Please also attach lspci -vvv.
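
As a rough mental model of why the write to reg 0xb4 is rejected, the Python sketch below checks an access against a list of capability entries. It is a simplified illustration and not Jailhouse's actual logic or data layout: the Cap class, its fields, and the offsets 0xb0/8 are assumptions chosen only so that reg 0xb4 falls inside the entry.

```python
from dataclasses import dataclass


@dataclass
class Cap:
    start: int       # first config-space offset covered by the entry
    length: int      # number of bytes covered
    writable: bool   # whether the cell may write to this range


def write_allowed(caps, reg, size):
    """True if the whole access [reg, reg + size) lies inside one writable entry."""
    for cap in caps:
        if cap.start <= reg and reg + size <= cap.start + cap.length:
            return cap.writable
    return False  # no entry covers the access


# Hypothetical entries covering offsets 0xb0..0xb7, before and after the fix.
caps_before = [Cap(start=0xB0, length=8, writable=False)]
caps_after = [Cap(start=0xB0, length=8, writable=True)]

print(write_allowed(caps_before, 0xB4, 2))
print(write_allowed(caps_after, 0xB4, 2))
```

The point of the model is only that a 2-byte write at 0xb4 needs an entry that both covers 0xb4..0xb5 and is marked writable.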

I can send you the config diff, but in the long run, you need to
understand what the changes mean.

Second:

   > FATAL: Unhandled MSR read: c0002001

That's MSR_AMD64_SMCA_MC0_STATUS. For the moment, disable Machine Check
Events (MCE) in your kernel config or add appropriate parameters to your
kernel to disable them.

Disable CONFIG_X86_MCE_{INTEL,AMD} in .config, or try adding mce=off to
your kernel parameters.
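
For reference, the MSR number can be split into an MCA bank and a register within that bank. The Python sketch below assumes the layout used by the MSR_AMD64_SMCA_MC0_* definitions in Linux's msr-index.h (base 0xc0002000, 0x10 registers per bank); the table is worth double-checking against your kernel tree, and decode_smca_msr is a hypothetical helper name.

```python
SMCA_BASE = 0xC0002000   # MSR_AMD64_SMCA_MC0_CTL
SMCA_STRIDE = 0x10       # MSR numbers reserved per MCA bank

# Register offsets within one bank (assumed from the Linux definitions).
SMCA_REGS = {
    0x0: "CTL",
    0x1: "STATUS",
    0x2: "ADDR",
    0x3: "MISC0",
    0x4: "CONFIG",
    0x5: "IPID",
    0x6: "SYND",
}


def decode_smca_msr(msr):
    """Return (bank, register_name), or None if the MSR is not a known SMCA register."""
    if msr < SMCA_BASE:
        return None
    bank, reg = divmod(msr - SMCA_BASE, SMCA_STRIDE)
    name = SMCA_REGS.get(reg)
    if name is None:
        return None
    return bank, name


print(decode_smca_msr(0xC0002001))
```

For the value from the log this yields bank 0 and the STATUS register, which is what the machine check code reads while polling.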

HTH,

     Ralf

   CPU 6... (APIC ID 48) OK
   CPU 64... (APIC ID 1) OK
   CPU 0... (APIC ID 0) OK
   CPU 16... (APIC ID 2) OK
   CPU 112... (APIC ID 7) OK
   CPU 104... (APIC ID 13) OK
   CPU 40... (APIC ID 12) OK
   CPU 72... (APIC ID 9) OK
   CPU 8... (APIC ID 8) OK
   CPU 56... (APIC ID 14) OK
   CPU 120... (APIC ID 15) OK
   CPU 110... (APIC ID 61) OK
   CPU 46... (APIC ID 60) OK
   CPU 14... (APIC ID 56) OK
   CPU 78... (APIC ID 57) OK
   CPU 94... (APIC ID 59) OK
   CPU 30... (APIC ID 58) OK
   CPU 126... (APIC ID 63) OK
   CPU 62... (APIC ID 62) OK
   CPU 2... (APIC ID 16) OK
   CPU 66... (APIC ID 17) OK
   CPU 18... (APIC ID 18) OK
   CPU 82... (APIC ID 19) OK
   CPU 114... (APIC ID 23) OK
   CPU 50... (APIC ID 22) OK
   CPU 98... (APIC ID 21) OK
   CPU 34... (APIC ID 20) OK
   CPU 12... (APIC ID 40) OK
   CPU 76... (APIC ID 41) OK
   CPU 60... (APIC ID 46) OK
   CPU 124... (APIC ID 47) OK
   CPU 44... (APIC ID 44) OK
   CPU 108... (APIC ID 45) OK
   CPU 92... (APIC ID 43) OK
   CPU 28... (APIC ID 42) OK
   CPU 26... (APIC ID 26) OK
   CPU 90... (APIC ID 27) OK
   CPU 74... (APIC ID 25) OK
   CPU 10... (APIC ID 24) OK
   CPU 106... (APIC ID 29) OK
   CPU 42... (APIC ID 28) OK
   CPU 58... (APIC ID 30) OK
   CPU 122... (APIC ID 31) OK
   CPU 20... (APIC ID 34) OK
   CPU 84... (APIC ID 35) OK
   CPU 36... (APIC ID 36) OK
   CPU 100... (APIC ID 37) OK
   CPU 116... (APIC ID 39) OK
   CPU 52... (APIC ID 38) OK
   CPU 4... (APIC ID 32) OK
   CPU 68... (APIC ID 33) OK
   CPU 96... (APIC ID 5) OK
   CPU 32... (APIC ID 4) OK
   CPU 88... (APIC ID 11) OK
   CPU 55... (APIC ID 118) OK
   CPU 119... (APIC ID 119) OK
   CPU 87... (APIC ID 115) OK
   CPU 23... (APIC ID 114) OK
   CPU 71... (APIC ID 113) OK
   CPU 7... (APIC ID 112) OK
   CPU 39... (APIC ID 116) OK
   CPU 103... (APIC ID 117) OK
   CPU 47... (APIC ID 124) OK
   CPU 111... (APIC ID 125) OK
   CPU 15... (APIC ID 120) OK
   CPU 79... (APIC ID 121) OK
   CPU 31... (APIC ID 122) OK
   CPU 95... (APIC ID 123) OK
   CPU 127... (APIC ID 127) OK
   CPU 63... (APIC ID 126) OK
   CPU 86... (APIC ID 51) OK
   CPU 22... (APIC ID 50) OK
   CPU 38... (APIC ID 52) OK
   CPU 102... (APIC ID 53) OK
   CPU 118... (APIC ID 55) OK
   CPU 54... (APIC ID 54) OK
   CPU 70... (APIC ID 49) OK
   CPU 109... (APIC ID 109) OK
   CPU 45... (APIC ID 108) OK
   CPU 93... (APIC ID 107) OK
   CPU 29... (APIC ID 106) OK
   CPU 13... (APIC ID 104) OK
   CPU 77... (APIC ID 105) OK
   CPU 61... (APIC ID 110) OK
   CPU 125... (APIC ID 111) OK
   CPU 5... (APIC ID 96) OK
   CPU 101... (APIC ID 101) OK
   CPU 37... (APIC ID 100) OK
   CPU 85... (APIC ID 99) OK
   CPU 21... (APIC ID 98) OK
   CPU 117... (APIC ID 103) OK
   CPU 53... (APIC ID 102) OK
   CPU 69... (APIC ID 97) OK
   CPU 49... (APIC ID 70) OK
   CPU 1... (APIC ID 64) OK
   CPU 65... (APIC ID 65) OK
   CPU 81... (APIC ID 67) OK
   CPU 17... (APIC ID 66) OK
   CPU 97... (APIC ID 69) OK
   CPU 33... (APIC ID 68) OK
   CPU 113... (APIC ID 71) OK
   CPU 25... (APIC ID 74) OK
   CPU 89... (APIC ID 75) OK
   CPU 41... (APIC ID 76) OK
   CPU 105... (APIC ID 77) OK
   CPU 9... (APIC ID 72) OK
   CPU 73... (APIC ID 73) OK
   CPU 121... (APIC ID 79) OK
   CPU 57... (APIC ID 78) OK
   CPU 3... (APIC ID 80) OK
   CPU 67... (APIC ID 81) OK
   CPU 35... (APIC ID 84) OK
   CPU 99... (APIC ID 85) OK
   CPU 115... (APIC ID 87) OK
   CPU 51... (APIC ID 86) OK
   CPU 19... (APIC ID 82) OK
   CPU 83... (APIC ID 83) OK
   CPU 107... (APIC ID 93) OK
   CPU 43... (APIC ID 92) OK
   CPU 11... (APIC ID 88) OK
   CPU 75... (APIC ID 89) OK
   CPU 123... (APIC ID 95) OK
   CPU 59... (APIC ID 94) OK
   CPU 27... (APIC ID 90) OK
   CPU 24... (APIC ID 10) OK
   CPU 80... (APIC ID 3) OK
   CPU 48... (APIC ID 6) OK
   CPU 91... (APIC ID 91) OK
Initializing unit: AMD IOMMU
AMD IOMMU @0xa1700000/0x80000
Initializing unit: IOAPIC
Initializing unit: PCI
Adding PCI device 00:00.0 to cell "RootCell"
Adding PCI device 00:01.0 to cell "RootCell"
Adding PCI device 00:01.1 to cell "RootCell"
Adding PCI device 00:01.3 to cell "RootCell"
Adding PCI device 00:01.4 to cell "RootCell"
Adding PCI device 00:02.0 to cell "RootCell"
Adding PCI device 00:03.0 to cell "RootCell"
Adding PCI device 00:04.0 to cell "RootCell"
Adding PCI device 00:07.0 to cell "RootCell"
Adding PCI device 00:07.1 to cell "RootCell"
Adding PCI device 00:08.0 to cell "RootCell"
Adding PCI device 00:08.1 to cell "RootCell"
Adding PCI device 00:14.0 to cell "RootCell"
Adding PCI device 00:14.3 to cell "RootCell"
Adding PCI device 00:18.0 to cell "RootCell"
Adding PCI device 00:18.1 to cell "RootCell"
Adding PCI device 00:18.2 to cell "RootCell"
Adding PCI device 00:18.3 to cell "RootCell"
Adding PCI device 00:18.4 to cell "RootCell"
Adding PCI device 00:18.5 to cell "RootCell"
Adding PCI device 00:18.6 to cell "RootCell"
Adding PCI device 00:18.7 to cell "RootCell"
Adding PCI device 00:19.0 to cell "RootCell"
Adding PCI device 00:19.1 to cell "RootCell"
Adding PCI device 00:19.2 to cell "RootCell"
Adding PCI device 00:19.3 to cell "RootCell"
Adding PCI device 00:19.4 to cell "RootCell"
Adding PCI device 00:19.5 to cell "RootCell"
Adding PCI device 00:19.6 to cell "RootCell"
Adding PCI device 00:19.7 to cell "RootCell"
Adding PCI device 00:1a.0 to cell "RootCell"
Adding PCI device 00:1a.1 to cell "RootCell"
Adding PCI device 00:1a.2 to cell "RootCell"
Adding PCI device 00:1a.3 to cell "RootCell"
Adding PCI device 00:1a.4 to cell "RootCell"
Adding PCI device 00:1a.5 to cell "RootCell"
Adding PCI device 00:1a.6 to cell "RootCell"
Adding PCI device 00:1a.7 to cell "RootCell"
Adding PCI device 00:1b.0 to cell "RootCell"
Adding PCI device 00:1b.1 to cell "RootCell"
Adding PCI device 00:1b.2 to cell "RootCell"
Adding PCI device 00:1b.3 to cell "RootCell"
Adding PCI device 00:1b.4 to cell "RootCell"
Adding PCI device 00:1b.5 to cell "RootCell"
Adding PCI device 00:1b.6 to cell "RootCell"
Adding PCI device 00:1b.7 to cell "RootCell"
Adding PCI device 00:1c.0 to cell "RootCell"
Adding PCI device 00:1c.1 to cell "RootCell"
Adding PCI device 00:1c.2 to cell "RootCell"
Adding PCI device 00:1c.3 to cell "RootCell"
Adding PCI device 00:1c.4 to cell "RootCell"
Adding PCI device 00:1c.5 to cell "RootCell"
Adding PCI device 00:1c.6 to cell "RootCell"
Adding PCI device 00:1c.7 to cell "RootCell"
Adding PCI device 00:1d.0 to cell "RootCell"
Adding PCI device 00:1d.1 to cell "RootCell"
Adding PCI device 00:1d.2 to cell "RootCell"
Adding PCI device 00:1d.3 to cell "RootCell"
Adding PCI device 00:1d.4 to cell "RootCell"
Adding PCI device 00:1d.5 to cell "RootCell"
Adding PCI device 00:1d.6 to cell "RootCell"
Adding PCI device 00:1d.7 to cell "RootCell"
Adding PCI device 00:1e.0 to cell "RootCell"
Adding PCI device 00:1e.1 to cell "RootCell"
Adding PCI device 00:1e.2 to cell "RootCell"
Adding PCI device 00:1e.3 to cell "RootCell"
Adding PCI device 00:1e.4 to cell "RootCell"
Adding PCI device 00:1e.5 to cell "RootCell"
Adding PCI device 00:1e.6 to cell "RootCell"
Adding PCI device 00:1e.7 to cell "RootCell"
Adding PCI device 00:1f.0 to cell "RootCell"
Adding PCI device 00:1f.1 to cell "RootCell"
Adding PCI device 00:1f.2 to cell "RootCell"
Adding PCI device 00:1f.3 to cell "RootCell"
Adding PCI device 00:1f.4 to cell "RootCell"
Adding PCI device 00:1f.5 to cell "RootCell"
Adding PCI device 00:1f.6 to cell "RootCell"
Adding PCI device 00:1f.7 to cell "RootCell"
Adding PCI device 01:00.0 to cell "RootCell"
Adding PCI device 01:00.1 to cell "RootCell"
Adding PCI device 02:00.0 to cell "RootCell"
Adding PCI device 03:00.0 to cell "RootCell"
Adding PCI device 04:00.0 to cell "RootCell"
Adding PCI device 04:00.1 to cell "RootCell"
Adding PCI device 05:00.0 to cell "RootCell"
Adding PCI device 05:00.2 to cell "RootCell"
Adding PCI device 05:00.3 to cell "RootCell"
Adding PCI device 06:00.0 to cell "RootCell"
Adding PCI device 06:00.1 to cell "RootCell"
Adding PCI device 06:00.2 to cell "RootCell"
Adding PCI device 20:00.0 to cell "RootCell"
Adding PCI device 20:00.2 to cell "RootCell"
Adding PCI device 20:01.0 to cell "RootCell"
Adding PCI device 20:01.1 to cell "RootCell"
Adding PCI device 20:02.0 to cell "RootCell"
Adding PCI device 20:03.0 to cell "RootCell"
Adding PCI device 20:04.0 to cell "RootCell"
Adding PCI device 20:07.0 to cell "RootCell"
Adding PCI device 20:07.1 to cell "RootCell"
Adding PCI device 20:08.0 to cell "RootCell"
Adding PCI device 20:08.1 to cell "RootCell"
Adding PCI device 21:00.0 to cell "RootCell"
Adding PCI device 21:00.1 to cell "RootCell"
Adding PCI device 22:08.0 to cell "RootCell"
Adding PCI device 23:00.0 to cell "RootCell"
Adding PCI device 24:00.0 to cell "RootCell"
Adding PCI device 24:00.2 to cell "RootCell"
Adding PCI device 24:00.3 to cell "RootCell"
Adding PCI device 25:00.0 to cell "RootCell"
Adding PCI device 25:00.1 to cell "RootCell"
Adding PCI device 40:00.0 to cell "RootCell"
Adding PCI device 40:00.2 to cell "RootCell"
Adding PCI device 40:01.0 to cell "RootCell"
Adding PCI device 40:02.0 to cell "RootCell"
Adding PCI device 40:03.0 to cell "RootCell"
Adding PCI device 40:03.1 to cell "RootCell"
Adding PCI device 40:04.0 to cell "RootCell"
Adding PCI device 40:07.0 to cell "RootCell"
Adding PCI device 40:07.1 to cell "RootCell"
Adding PCI device 40:08.0 to cell "RootCell"
Adding PCI device 40:08.1 to cell "RootCell"
Adding PCI device 41:00.0 to cell "RootCell"
Adding PCI device 41:00.1 to cell "RootCell"
Adding PCI device 42:00.0 to cell "RootCell"
Adding PCI device 42:00.2 to cell "RootCell"
Adding PCI device 43:00.0 to cell "RootCell"
Adding PCI device 43:00.1 to cell "RootCell"
Adding PCI device 60:00.0 to cell "RootCell"
Adding PCI device 60:00.2 to cell "RootCell"
Adding PCI device 60:01.0 to cell "RootCell"
Adding PCI device 60:02.0 to cell "RootCell"
Adding PCI device 60:03.0 to cell "RootCell"
Adding PCI device 60:03.1 to cell "RootCell"
Adding PCI device 60:04.0 to cell "RootCell"
Adding PCI device 60:07.0 to cell "RootCell"
Adding PCI device 60:07.1 to cell "RootCell"
Adding PCI device 60:08.0 to cell "RootCell"
Adding PCI device 60:08.1 to cell "RootCell"
Adding PCI device 61:00.0 to cell "RootCell"
Adding PCI device 62:00.0 to cell "RootCell"
Adding PCI device 62:00.2 to cell "RootCell"
Adding PCI device 63:00.0 to cell "RootCell"
Adding PCI device 63:00.1 to cell "RootCell"
Adding PCI device 80:00.0 to cell "RootCell"
Adding PCI device 80:00.2 to cell "RootCell"
Adding PCI device 80:01.0 to cell "RootCell"
Adding PCI device 80:02.0 to cell "RootCell"
Adding PCI device 80:03.0 to cell "RootCell"
Adding PCI device 80:04.0 to cell "RootCell"
Adding PCI device 80:07.0 to cell "RootCell"
Adding PCI device 80:07.1 to cell "RootCell"
Adding PCI device 80:08.0 to cell "RootCell"
Adding PCI device 80:08.1 to cell "RootCell"
Adding PCI device 81:00.0 to cell "RootCell"
Adding PCI device 81:00.2 to cell "RootCell"
Adding PCI device 82:00.0 to cell "RootCell"
Adding PCI device 82:00.1 to cell "RootCell"
Adding PCI device a0:00.0 to cell "RootCell"
Adding PCI device a0:00.2 to cell "RootCell"
Adding PCI device a0:01.0 to cell "RootCell"
Adding PCI device a0:02.0 to cell "RootCell"
Adding PCI device a0:03.0 to cell "RootCell"
Adding PCI device a0:04.0 to cell "RootCell"
Adding PCI device a0:07.0 to cell "RootCell"
Adding PCI device a0:07.1 to cell "RootCell"
Adding PCI device a0:08.0 to cell "RootCell"
Adding PCI device a0:08.1 to cell "RootCell"
Adding PCI device a1:00.0 to cell "RootCell"
Adding PCI device a1:00.2 to cell "RootCell"
Adding PCI device a2:00.0 to cell "RootCell"
Adding PCI device a2:00.1 to cell "RootCell"
Adding PCI device c0:00.0 to cell "RootCell"
Adding PCI device c0:00.2 to cell "RootCell"
Adding PCI device c0:01.0 to cell "RootCell"
Adding PCI device c0:02.0 to cell "RootCell"
Adding PCI device c0:03.0 to cell "RootCell"
Adding PCI device c0:03.1 to cell "RootCell"
Adding PCI device c0:03.2 to cell "RootCell"
Adding PCI device c0:03.3 to cell "RootCell"
Adding PCI device c0:03.4 to cell "RootCell"
Adding PCI device c0:04.0 to cell "RootCell"
Adding PCI device c0:07.0 to cell "RootCell"
Adding PCI device c0:07.1 to cell "RootCell"
Adding PCI device c0:08.0 to cell "RootCell"
Adding PCI device c0:08.1 to cell "RootCell"
Adding PCI device c1:00.0 to cell "RootCell"
Adding PCI device c2:00.0 to cell "RootCell"
Adding PCI device c3:00.0 to cell "RootCell"
Adding PCI device c4:00.0 to cell "RootCell"
Adding PCI device c5:00.0 to cell "RootCell"
Adding PCI device c5:00.2 to cell "RootCell"
Adding PCI device c6:00.0 to cell "RootCell"
Adding PCI device c6:00.1 to cell "RootCell"
Adding PCI device e0:00.0 to cell "RootCell"
Adding PCI device e0:00.2 to cell "RootCell"
Adding PCI device e0:01.0 to cell "RootCell"
Adding PCI device e0:02.0 to cell "RootCell"
Adding PCI device e0:03.0 to cell "RootCell"
Adding PCI device e0:04.0 to cell "RootCell"
Adding PCI device e0:07.0 to cell "RootCell"
Adding PCI device e0:07.1 to cell "RootCell"
Adding PCI device e0:08.0 to cell "RootCell"
Adding PCI device e0:08.1 to cell "RootCell"
Adding PCI device e1:00.0 to cell "RootCell"
Adding PCI device e1:00.2 to cell "RootCell"
Adding PCI device e2:00.0 to cell "RootCell"
Adding PCI device e2:00.1 to cell "RootCell"
Page pool usage after late setup: mem 1927/7635, remap 65703/131072
FATAL: Invalid MMIO/RAM write, addr: 0x00000000a1702008 size: 4
RIP: 0xffffffffa79d7777 RSP: 0xffffa2f7cda78de0 FLAGS: 6
RAX: 0xffffa2f7c0080000 RBX: 0x0000000000000001 RCX: 0x0000000000000030
RDX: 0xffff90d18000a000 RSI: 0x3000001700000000 RDI: 0x7ffffffffffff003
CS: 10 BASE: 0x0000000000000000 AR-BYTES: 29b EFER.LMA 1
CR0: 0x0000000080050033 CR3: 0x000000014487c000 CR4: 0x00000000003506e0
EFER: 0x0000000000001d01
Parking CPU 76 (Cell: "RootCell")
Ignoring NMI IPI to CPU 1
Ignoring NMI IPI to CPU 2
Ignoring NMI IPI to CPU 3
Ignoring NMI IPI to CPU 4
Ignoring NMI IPI to CPU 5
Ignoring NMI IPI to CPU 6
Ignoring NMI IPI to CPU 7
Ignoring NMI IPI to CPU 76
Ignoring NMI IPI to CPU 1
Ignoring NMI IPI to CPU 2
Ignoring NMI IPI to CPU 3
Ignoring NMI IPI to CPU 4
Ignoring NMI IPI to CPU 5
Ignoring NMI IPI to CPU 6
Ignoring NMI IPI to CPU 7
Ignoring NMI IPI to CPU 76
Ignoring NMI IPI to CPU 1
Ignoring NMI IPI to CPU 2
Ignoring NMI IPI to CPU 3
Ignoring NMI IPI to CPU 4
Ignoring NMI IPI to CPU 5
FATAL: Invalid PCI config write, device 04:00.0, reg: 0xb4, size: 2
RIP: 0xffffffffa7c52b3d RSP: 0xffffa2f7ce99bd98 FLAGS: 46
RAX: 0x000000000000242e RBX: 0x0000000000000000 RCX: 0x00000000000000b4
RDX: 0x0000000000000cfc RSI: 0x0000000000000216 RDI: 0xffffffffa9401790
CS: 10 BASE: 0x0000000000000000 AR-BYTES: 29b EFER.LMA 1
CR0: 0x0000000080050033 CR3: 0x0000003b99810000 CR4: 0x00000000003506e0
EFER: 0x0000000000001d01
Parking CPU 24 (Cell: "RootCell")
Ignoring NMI IPI to CPU 6
Ignoring NMI IPI to CPU 7
FATAL: Unhandled MSR read: c0002001
RIP: 0xffffffffa7c951cd RSP: 0xffffa2f7cd918e08 FLAGS: 246
RAX: 0x00000000c0002000 RBX: 0xffff90e15fc11020 RCX: 0x00000000c0002001
RDX: 0x0000000000000000 RSI: 0xffffa2f7cd918df0 RDI: 0x00000000c0002001
CS: 10 BASE: 0x0000000000000000 AR-BYTES: 29b EFER.LMA 1
CR0: 0x0000000080050033 CR3: 0x00000018a6a46000 CR4: 0x00000000003506e0
EFER: 0x0000000000001d01
Parking CPU 68 (Cell: "RootCell")
FATAL: Unhandled MSR read: c0002001
RIP: 0xffffffffa7c951cd RSP: 0xffffa2f7cd4f8e08 FLAGS: 246
RAX: 0x00000000c0002000 RBX: 0xffff90e15fb51020 RCX: 0x00000000c0002001
RDX: 0x0000000000000000 RSI: 0xffffa2f7cd4f8df0 RDI: 0x00000000c0002001
CS: 10 BASE: 0x0000000000000000 AR-BYTES: 29b EFER.LMA 1
CR0: 0x0000000080050033 CR3: 0x0000002081eaa000 CR4: 0x00000000003506e0
EFER: 0x0000000000001d01
Parking CPU 44 (Cell: "RootCell")

------------------------------------------------------------------------
*From:* Henning Schild <[email protected]>
*Sent:* 14 November 2022 09:22
*To:* Karim Manaouil <[email protected]>
*Cc:* Ralf Ramsauer <[email protected]>;
[email protected] <[email protected]>;
[email protected] <[email protected]>
*Subject:* Re: Jailhouse: enter_hypervisor returns -ENOMEM

On Sun, 13 Nov 2022 22:24:45 +0000,
Karim Manaouil <[email protected]> wrote:

Hi Ralf,

Thanks for the reply!

>Did you use jailhouse-config-create?

I am using `jailhouse config create` to generate the sysconfig.c file.

>You can use the --mem-hv option to
increase the memory. Try, for example, 32MiB and see if it works.

I tried with 32MiB. It worked. I am not getting -ENOMEM anymore.
The driver prints "The Jailhouse is opening" on dmesg. However, right
after that the CPUs get stuck, and I get rcu_sched detected stalls.
The system is completely unresponsive.

I attached a text file containing the full output from dmesg. Here is
the initial part:

I guess the output of the hypervisor might also be valuable here.
According to its spec, that machine should have a serial port, and with
the default config from the generator script you should see logs coming
out of there, with the usual 115200 8n1 settings.

Henning

[  434.792008] The Jailhouse is opening.
[  455.787315] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[  455.793303] rcu:     1-...0: (839 GPs behind) idle=c2a/1/0x4000000000000000 softirq=681/681 fqs=1827
[  455.802292] rcu:     2-...0: (144 GPs behind) idle=812/1/0x4000000000000000 softirq=905/905 fqs=1827
[  455.811276] rcu:     3-...0: (144 GPs behind) idle=eaa/1/0x4000000000000000 softirq=719/719 fqs=1827
[  455.820266] rcu:     4-...0: (1 GPs behind) idle=c2e/1/0x4000000000000000 softirq=1324/1324 fqs=1827
[  455.829252] rcu:     5-...0: (144 GPs behind) idle=41a/1/0x4000000000000000 softirq=556/556 fqs=1827
[  455.838238] rcu:     6-...0: (144 GPs behind) idle=912/1/0x4000000000000000 softirq=777/777 fqs=1827
[  455.847218] rcu:     7-...0: (144 GPs behind) idle=5e6/1/0x4000000000000000 softirq=2409/2410 fqs=1827
[  455.856404]  (detected by 87, t=5253 jiffies, g=48537, q=364)
[  455.862170] Sending NMI from CPU 87 to CPUs 1:
[  465.776884] Sending NMI from CPU 87 to CPUs 2:
[  467.182686] watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [kworker/0:1:7]
[  467.189857] Modules linked in: jailhouse(O) nf_conntrack_netlink xfrm_user xt_addrtype br_netfilter xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat nf_natp
[  467.189928]  binfmt_misc configfs efivarfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 raid10 raid456 libcrc32c crc32c_generic async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq ]
[  467.320567] CPU: 0 PID: 7 Comm: kworker/0:1 Tainted: G           O      5.10.0 #3
[  467.328767] Hardware name: Dell Inc. PowerEdge R7425/08V001, BIOS 1.15.0 09/11/2020
[  467.337154] Workqueue: events drm_fb_helper_dirty_work [drm_kms_helper]
[  467.344501] RIP: 0010:smp_call_function_many_cond+0x289/0x2d0
[  467.350979] Code: e8 1c 8a 39 00 3b 05 0a c1 74 01 89 c7 0f 83 0b fe ff ff 48 63 c7 49 8b 16 48 03 14 c5 00 d9 99 9c 8b 42 08 a8 01 74 09 f3 90 <8b> 42 08 a8 01 75 f7 eb c9 48 c7 c2 20 cf 07 9d 4c 89 fe 44 7
[  467.371232] RSP: 0018:ffffa7d78015fcd8 EFLAGS: 00000202
[  467.377220] RAX: 0000000000000011 RBX: 0000000000031280 RCX: 0000000000000001
[  467.385123] RDX: ffff964f1fa31280 RSI: 0000000000000000 RDI: 0000000000000001
[  467.393024] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000001
[  467.400928] R10: 0000000000000002 R11: 0000000000000002 R12: 0000000000000000
[  467.408836] R13: 000000000000007f R14: ffff962f1f42c9c0 R15: 0000000000000080
[  467.416737] FS:  0000000000000000(0000) GS:ffff962f1f400000(0000) knlGS:0000000000000000
[  467.425604] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  467.432127] CR2: 0000000000000000 CR3: 00000010987ea000 CR4: 00000000003506f0
[  467.440045] Call Trace:
[  467.443289]  ? tlbflush_read_file+0x70/0x70
[  467.448266]  ? tlbflush_read_file+0x70/0x70
[  467.453242]  on_each_cpu+0x2b/0x60
[  467.457437]  __purge_vmap_area_lazy+0x5d/0x680
[  467.462679]  ? _cond_resched+0x16/0x40
[  467.467224]  ? unmap_kernel_range_noflush+0x2fa/0x380
[  467.473072]  free_vmap_area_noflush+0xe7/0x100
[  467.478315]  remove_vm_area+0x96/0xa0
[  467.482770]  __vunmap+0x8d/0x290
[  467.486792]  drm_gem_shmem_vunmap+0x8b/0xa0 [drm]
[  467.492299]  drm_client_buffer_vunmap+0x16/0x20 [drm]
[  467.498144]  drm_fb_helper_dirty_work+0x187/0x1b0 [drm_kms_helper]
[  467.505118]  process_one_work+0x1b6/0x350
[  467.509912]  worker_thread+0x53/0x3e0
[  467.514361]  ? process_one_work+0x350/0x350
[  467.519338]  kthread+0x11b/0x140
[  467.523342]  ? __kthread_bind_mask+0x60/0x60
[  467.528389]  ret_from_fork+0x22/0x30

Cheers
Karim
________________________________
From: Ralf Ramsauer <[email protected]>
Sent: 12 November 2022 17:47
To: Karim Manaouil <[email protected]>; [email protected] <[email protected]>
Cc: [email protected] <[email protected]>
Subject: Re: Jailhouse: enter_hypervisor returns -ENOMEM


On 12/11/2022 18:15, Karim Manaouil wrote:
> Hi Jan,
>
> I am trying to deploy Jailhouse on an AMD EPYC with 128 CPUs (8 NUMA
> nodes), running Linux kernel v5.10 (same version used by jailhouse
> CI with same patches applied).
>
> `jailhouse hardware check` reports that everything is OK and prints
> "Check passed!".
>
> Memory was reserved via `memmap=0x5200000$0x3a000000`
>
> However, enter_hypervisor() [1] fails when entry() is called on
> every CPU and returns -ENOMEM as error_code.

Try to reserve more memory. Maybe the default size of 6MiB for HV
memory is insufficient for 128 CPUs.

Did you use jailhouse-config-create? You can use the --mem-hv option
to increase the memory. Try, for example, 32MiB and see if it works.
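
To sanity-check how much memory a memmap= reservation actually covers, the parameter can be decoded with a few lines of Python. This is a minimal sketch that assumes the plain SIZE$START form with numeric values and no K/M/G suffixes; parse_memmap is a hypothetical helper name.

```python
def parse_memmap(param):
    """Parse 'memmap=SIZE$START' into (start, size); numbers may be hex or decimal."""
    key, _, value = param.partition("=")
    if key != "memmap" or "$" not in value:
        raise ValueError(f"not a memmap=SIZE$START parameter: {param!r}")
    size_s, start_s = value.split("$", 1)
    # Base 0 lets int() accept both "0x..." and plain decimal strings.
    return int(start_s, 0), int(size_s, 0)


start, size = parse_memmap("memmap=0x5200000$0x3a000000")
summary = f"reserved {size / 2**20:.0f} MiB at {start:#x}..{start + size - 1:#x}"
print(summary)
```

For the parameter quoted in this thread the reservation comes out at 82 MiB in total. How that total is split between the hypervisor and the cells is decided by the generated config, so a larger hypervisor share may also require a larger reservation.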

    Ralf

>
> Do you possibly know where the issue could come from?
>
> Best
> Karim
>
> [1]
> https://github.com/siemens/jailhouse/blob/c7a1b6971ac15e4be8a0918b9bef6e2cbd99f9fc/driver/main.c#L251 
>
> --
> You received this message because you are subscribed to the Google
> Groups "Jailhouse" group.
> To unsubscribe from this group and stop receiving emails from it,
> send an email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/jailhouse-dev/AM0PR05MB6018F1663ABE61DA3C697CA4A9039%40AM0PR05MB6018.eurprd05.prod.outlook.com
>



--
You received this message because you are subscribed to the Google Groups 
"Jailhouse" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jailhouse-dev/22e926f8-c036-0e15-81a8-154eb74bb6f9%40oth-regensburg.de.
