On Sun, 13 Nov 2022 22:24:45 +0000,
Karim Manaouil <[email protected]> wrote:

> Hi Ralf,
> 
> Thanks for the reply!
> 
> > Did you use jailhouse-config-create?
> 
> I am using `jailhouse config create` to generate the sysconfig.c file.
> 
> > You can use the --mem-hv option to increase the memory. Try, for
> > example, 32MiB and see if it works.
> 
> I tried with 32MiB. It worked. I am not getting -ENOMEM anymore.
> The driver prints "The Jailhouse is opening" on dmesg. However, right
> after that the CPUs get stuck, and I get rcu_sched detected stalls.
> The system is completely unresponsive.
> 
> I attached a text file containing the full output from dmesg. Here is
> the initial part:

I guess the hypervisor's own output would also be valuable here.
According to its spec, that machine should have a serial port, and with
the default config from the generator script you should see logs coming
out of there, at the usual 115200 8n1.
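
For reference, a minimal sketch of capturing that output on a second
machine wired to the serial port. It only uses Python's standard termios
module. The device path /dev/ttyUSB0 and the helper names configure_8n1
and dump_console are assumptions made for illustration, not anything
Jailhouse provides; adjust the path to whatever your adapter shows up as.

```python
import os
import termios


def configure_8n1(fd, baud=termios.B115200):
    """Put an open tty file descriptor into raw 8N1 mode at the given speed."""
    iflag, oflag, cflag, lflag, ispeed, ospeed, cc = termios.tcgetattr(fd)
    # Raw mode: no input/output translation, no echo, no line editing.
    iflag = 0
    oflag = 0
    lflag = 0
    # 8 data bits, no parity, 1 stop bit; enable receiver, ignore modem lines.
    cflag &= ~(termios.CSIZE | termios.PARENB | termios.CSTOPB)
    cflag |= termios.CS8 | termios.CREAD | termios.CLOCAL
    # Block until at least one byte is available, with no inter-byte timeout.
    cc[termios.VMIN] = 1
    cc[termios.VTIME] = 0
    termios.tcsetattr(
        fd, termios.TCSANOW, [iflag, oflag, cflag, lflag, baud, baud, cc]
    )


def dump_console(device="/dev/ttyUSB0"):
    """Print everything arriving on the serial line (device path is assumed)."""
    fd = os.open(device, os.O_RDONLY | os.O_NOCTTY)
    try:
        configure_8n1(fd)
        while True:
            data = os.read(fd, 4096)
            if not data:
                break
            print(data.decode("ascii", errors="replace"), end="", flush=True)
    finally:
        os.close(fd)
```

Calling dump_console() before running `jailhouse enable` should then show
whatever the hypervisor prints while the root cell is being set up. Any
terminal program configured for 115200 8n1 does the same job.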

Henning

> [  434.792008] The Jailhouse is opening.
> [  455.787315] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
> [  455.793303] rcu:     1-...0: (839 GPs behind) idle=c2a/1/0x4000000000000000 softirq=681/681 fqs=1827
> [  455.802292] rcu:     2-...0: (144 GPs behind) idle=812/1/0x4000000000000000 softirq=905/905 fqs=1827
> [  455.811276] rcu:     3-...0: (144 GPs behind) idle=eaa/1/0x4000000000000000 softirq=719/719 fqs=1827
> [  455.820266] rcu:     4-...0: (1 GPs behind) idle=c2e/1/0x4000000000000000 softirq=1324/1324 fqs=1827
> [  455.829252] rcu:     5-...0: (144 GPs behind) idle=41a/1/0x4000000000000000 softirq=556/556 fqs=1827
> [  455.838238] rcu:     6-...0: (144 GPs behind) idle=912/1/0x4000000000000000 softirq=777/777 fqs=1827
> [  455.847218] rcu:     7-...0: (144 GPs behind) idle=5e6/1/0x4000000000000000 softirq=2409/2410 fqs=1827
> [  455.856404]  (detected by 87, t=5253 jiffies, g=48537, q=364)
> [  455.862170] Sending NMI from CPU 87 to CPUs 1:
> [  465.776884] Sending NMI from CPU 87 to CPUs 2:
> [  467.182686] watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [kworker/0:1:7]
> [  467.189857] Modules linked in: jailhouse(O) nf_conntrack_netlink xfrm_user xt_addrtype br_netfilter xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat nf_natp
> [  467.189928]  binfmt_misc configfs efivarfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 raid10 raid456 libcrc32c crc32c_generic async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq ]
> [  467.320567] CPU: 0 PID: 7 Comm: kworker/0:1 Tainted: G           O      5.10.0 #3
> [  467.328767] Hardware name: Dell Inc. PowerEdge R7425/08V001, BIOS 1.15.0 09/11/2020
> [  467.337154] Workqueue: events drm_fb_helper_dirty_work [drm_kms_helper]
> [  467.344501] RIP: 0010:smp_call_function_many_cond+0x289/0x2d0
> [  467.350979] Code: e8 1c 8a 39 00 3b 05 0a c1 74 01 89 c7 0f 83 0b fe ff ff 48 63 c7 49 8b 16 48 03 14 c5 00 d9 99 9c 8b 42 08 a8 01 74 09 f3 90 <8b> 42 08 a8 01 75 f7 eb c9 48 c7 c2 20 cf 07 9d 4c 89 fe 44 7
> [  467.371232] RSP: 0018:ffffa7d78015fcd8 EFLAGS: 00000202
> [  467.377220] RAX: 0000000000000011 RBX: 0000000000031280 RCX: 0000000000000001
> [  467.385123] RDX: ffff964f1fa31280 RSI: 0000000000000000 RDI: 0000000000000001
> [  467.393024] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000001
> [  467.400928] R10: 0000000000000002 R11: 0000000000000002 R12: 0000000000000000
> [  467.408836] R13: 000000000000007f R14: ffff962f1f42c9c0 R15: 0000000000000080
> [  467.416737] FS:  0000000000000000(0000) GS:ffff962f1f400000(0000) knlGS:0000000000000000
> [  467.425604] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  467.432127] CR2: 0000000000000000 CR3: 00000010987ea000 CR4: 00000000003506f0
> [  467.440045] Call Trace:
> [  467.443289]  ? tlbflush_read_file+0x70/0x70
> [  467.448266]  ? tlbflush_read_file+0x70/0x70
> [  467.453242]  on_each_cpu+0x2b/0x60
> [  467.457437]  __purge_vmap_area_lazy+0x5d/0x680
> [  467.462679]  ? _cond_resched+0x16/0x40
> [  467.467224]  ? unmap_kernel_range_noflush+0x2fa/0x380
> [  467.473072]  free_vmap_area_noflush+0xe7/0x100
> [  467.478315]  remove_vm_area+0x96/0xa0
> [  467.482770]  __vunmap+0x8d/0x290
> [  467.486792]  drm_gem_shmem_vunmap+0x8b/0xa0 [drm]
> [  467.492299]  drm_client_buffer_vunmap+0x16/0x20 [drm]
> [  467.498144]  drm_fb_helper_dirty_work+0x187/0x1b0 [drm_kms_helper]
> [  467.505118]  process_one_work+0x1b6/0x350
> [  467.509912]  worker_thread+0x53/0x3e0
> [  467.514361]  ? process_one_work+0x350/0x350
> [  467.519338]  kthread+0x11b/0x140
> [  467.523342]  ? __kthread_bind_mask+0x60/0x60
> [  467.528389]  ret_from_fork+0x22/0x30
> 
> Cheers
> Karim
> ________________________________
> From: Ralf Ramsauer <[email protected]>
> Sent: 12 November 2022 17:47
> To: Karim Manaouil <[email protected]>; [email protected] <[email protected]>
> Cc: [email protected] <[email protected]>
> Subject: Re: Jailhouse: enter_hypervisor returns -ENOMEM
> 
> On 12/11/2022 18:15, Karim Manaouil wrote:
> > Hi Jan,
> >
> > I am trying to deploy Jailhouse on an AMD EPYC with 128 CPUs (8 NUMA
> > nodes), running Linux kernel v5.10 (same version used by jailhouse
> > CI with same patches applied).
> >
> > `jailhouse hardware check` reports that everything is OK and prints
> > "Check passed!".
> >
> > Memory was reserved via `memmap=0x5200000$0x3a000000`
> >
> > However, enter_hypervisor() [1] fails: entry() is called on every
> > CPU and returns -ENOMEM as error_code.
> 
> Try to reserve more memory. Maybe the default size of 6MiB for HV
> memory is insufficient for 128 CPUs.
> 
> Did you use jailhouse-config-create? You can use the --mem-hv option
> to increase the memory. Try, for example, 32MiB and see if it works.
> 
>    Ralf
> 
> >
> > Do you know where the issue could come from?
> >
> > Best
> > Karim
> >
> > [1]
> > https://github.com/siemens/jailhouse/blob/c7a1b6971ac15e4be8a0918b9bef6e2cbd99f9fc/driver/main.c#L251
> >
> > --
> > You received this message because you are subscribed to the Google
> > Groups "Jailhouse" group.
> > To unsubscribe from this group and stop receiving emails from it,
> > send an email to [email protected].
> > To view this discussion on the web visit
> > https://groups.google.com/d/msgid/jailhouse-dev/AM0PR05MB6018F1663ABE61DA3C697CA4A9039%40AM0PR05MB6018.eurprd05.prod.outlook.com.
> >  
> 

