Hi,

I tested this workaround: I can confirm that it works on the Xen host, but not on a Xen guest.

If you try to start a VM with the latest kernel, i.e. with these parameters in the cfg file:

#
# Kernel + memory size
#
kernel = '/boot/vmlinuz-4.9.0-7-amd64'
extra = 'elevator=noop'
ramdisk = '/boot/initrd.img-4.9.0-7-amd64'

the VM crashes in a loop with this kernel error:

[ 0.000000] Linux version 4.9.0-7-amd64 ([email protected]) (gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1) ) #1 SMP Debian 4.9.110-1 (2018-07-05)
[ 0.000000] Command line: root=/dev/xvda2 ro elevator=noop
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[ 0.000000] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
[ 0.000000] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
[ 0.000000] ACPI in unprivileged domain disabled
[ 0.000000] Released 0 page(s)
[ 0.000000] e820: BIOS-provided physical RAM map:
[ 0.000000] Xen: [mem 0x0000000000000000-0x000000000009ffff] usable
[ 0.000000] Xen: [mem 0x00000000000a0000-0x00000000000fffff] reserved
[ 0.000000] Xen: [mem 0x0000000000100000-0x000000007fffffff] usable
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] DMI not present or invalid.
[ 0.000000] Hypervisor detected: Xen
[ 0.000000] e820: last_pfn = 0x80000 max_arch_pfn = 0x400000000
[ 0.000000] MTRR: Disabled
[ 0.000000] x86/PAT: MTRRs disabled, skipping PAT initialization too.
[ 0.000000] x86/PAT: Configuration [0-7]: WB WT UC- UC WC WP UC UC
[ 0.000000] RAMDISK: [mem 0x02000000-0x05996fff]
[ 0.000000] NUMA turned off
[ 0.000000] Faking a node at [mem 0x0000000000000000-0x000000007fffffff]
[ 0.000000] NODE_DATA(0) allocated [mem 0x7fc16000-0x7fc1afff]
[ 0.000000] Zone ranges:
[ 0.000000] DMA [mem 0x0000000000001000-0x0000000000ffffff]
[ 0.000000] DMA32 [mem 0x0000000001000000-0x000000007fffffff]
[ 0.000000] Normal empty
[ 0.000000] Device empty
[ 0.000000] Movable zone start for each node
[ 0.000000] Early memory node ranges
[ 0.000000] node 0: [mem 0x0000000000001000-0x000000000009ffff]
[ 0.000000] node 0: [mem 0x0000000000100000-0x000000007fffffff]
[ 0.000000] Initmem setup node 0 [mem 0x0000000000001000-0x000000007fffffff]
[ 0.000000] p2m virtual area at ffffc90000000000, size is 40000000
[ 0.000000] Remapped 0 page(s)
[ 0.000000] SFI: Simple Firmware Interface v0.81 http://simplefirmware.org
[ 0.000000] smpboot: Allowing 1 CPUs, 0 hotplug CPUs
[ 0.000000] PM: Registered nosave memory: [mem 0x00000000-0x00000fff]
[ 0.000000] PM: Registered nosave memory: [mem 0x000a0000-0x000fffff]
[ 0.000000] e820: [mem 0x80000000-0xffffffff] available for PCI devices
[ 0.000000] Booting paravirtualized kernel on Xen
[ 0.000000] Xen version: 4.8.4-pre (preserve-AD)
[ 0.000000] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645519600211568 ns
[ 0.000000] setup_percpu: NR_CPUS:512 nr_cpumask_bits:512 nr_cpu_ids:1 nr_node_ids:1
[ 0.000000] percpu: Embedded 35 pages/cpu @ffff88007f600000 s105304 r8192 d29864 u2097152
[ 0.000000] PV qspinlock hash table entries: 256 (order: 0, 4096 bytes)
[ 0.000000] Built 1 zonelists in Node order, mobility grouping on. Total pages: 515978
[ 0.000000] Policy zone: DMA32
[ 0.000000] Kernel command line: root=/dev/xvda2 ro elevator=noop
[ 0.000000] PID hash table entries: 4096 (order: 3, 32768 bytes)
[ 0.000000] Memory: 1980804K/2096764K available (6250K kernel code, 1159K rwdata, 2868K rodata, 1420K init, 688K bss, 115960K reserved, 0K cma-reserved)
[ 0.000000] Kernel/User page tables isolation: enabled
[ 0.000000] Hierarchical RCU implementation.
[ 0.000000] Build-time adjustment of leaf fanout to 64.
[ 0.000000] RCU restricting CPUs from NR_CPUS=512 to nr_cpu_ids=1.
[ 0.000000] RCU: Adjusting geometry for rcu_fanout_leaf=64, nr_cpu_ids=1
[ 0.000000] Using NULL legacy PIC
[ 0.000000] NR_IRQS:33024 nr_irqs:32 0
[ 0.000000] xen:events: Using FIFO-based ABI
[ 0.000000] Console: colour dummy device 80x25
[ 0.000000] console [tty0] enabled
[ 0.000000] console [hvc0] enabled
[ 0.000000] clocksource: xen: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[ 0.000000] installing Xen timer for CPU 0
[ 0.000000] tsc: Unable to calibrate against PIT
[ 0.000000] tsc: No reference (HPET/PMTIMER) available
[ 0.000000] tsc: Detected 2597.018 MHz processor
[ 0.004000] Calibrating delay loop (skipped), value calculated using timer frequency.. 5194.03 BogoMIPS (lpj=10388072)
[ 0.004000] pid_max: default: 32768 minimum: 301
[ 0.004000] Security Framework initialized
[ 0.004000] Yama: disabled by default; enable with sysctl kernel.yama.*
[ 0.004000] AppArmor: AppArmor disabled by boot time parameter
[ 0.004000] Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes)
[ 0.004000] Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes)
[ 0.004000] Mount-cache hash table entries: 4096 (order: 3, 32768 bytes)
[ 0.004000] Mountpoint-cache hash table entries: 4096 (order: 3, 32768 bytes)
[ 0.004000] ENERGY_PERF_BIAS: Set to 'normal', was 'performance'
[ 0.004000] ENERGY_PERF_BIAS: View and update with x86_energy_perf_policy(8)
[ 0.004000] CPU: Physical Processor ID: 0
[ 0.004000] CPU: Processor Core ID: 0
[ 0.004000] mce: CPU supports 2 MCE banks
[ 0.004000] Last level iTLB entries: 4KB 1024, 2MB 1024, 4MB 1024
[ 0.004000] Last level dTLB entries: 4KB 1024, 2MB 1024, 4MB 1024, 1GB 4
[ 0.004000] Spectre V2 : Mitigation: Full generic retpoline
[ 0.004000] Spectre V2 : Spectre v2 mitigation: Enabling Indirect Branch Prediction Barrier
[ 0.004000] Spectre V2 : Enabling Restricted Speculation for firmware calls
[ 0.004000] Speculative Store Bypass: Vulnerable
[ 0.051616] Freeing SMP alternatives memory: 24K
[ 0.057710] ftrace: allocating 25269 entries in 99 pages
[ 0.072061] cpu 0 spinlock event irq 1
[ 0.072071] smpboot: Max logical packages: 1
[ 0.072078] VPMU disabled by hypervisor.
[ 0.072093] Performance Events: unsupported p6 CPU model 63 no PMU driver, software events only.
[ 0.072602] NMI watchdog: disabled (cpu0): hardware events not enabled
[ 0.072610] NMI watchdog: Shutting down hard lockup detector on all cpus
[ 0.072624] x86: Booted up 1 node, 1 CPUs
[ 0.072761] devtmpfs: initialized
[ 0.072813] x86/mm: Memory block size: 128MB
[ 0.074028] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[ 0.074045] futex hash table entries: 256 (order: 2, 16384 bytes)
[ 0.074075] pinctrl core: initialized pinctrl subsystem
[ 0.074165] NET: Registered protocol family 16
[ 0.074176] xen:grant_table: Grant tables using version 1 layout
[ 0.074195] Grant table initialized
[ 0.074377] PCI: setting up Xen PCI frontend stub
[ 0.074377] ACPI: Interpreter disabled.
[ 0.074377] xen:balloon: Initialising balloon driver
[ 0.076045] xen_balloon: Initialising balloon driver
[ 0.076053] vgaarb: loaded
[ 0.076068] dmi: Firmware registration failed.
[ 0.076106] PCI: System does not support PCI
[ 0.076111] PCI: System does not support PCI
[ 0.076237] clocksource: Switched to clocksource xen
[ 0.081278] VFS: Disk quotas dquot_6.6.0
[ 0.081294] VFS: Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[ 0.081315] hugetlbfs: disabling because there are no supported hugepage sizes
[ 0.081343] pnp: PnP ACPI: disabled
[ 0.082398] NET: Registered protocol family 2
[ 0.082534] TCP established hash table entries: 16384 (order: 5, 131072 bytes)
[ 0.082606] TCP bind hash table entries: 16384 (order: 6, 262144 bytes)
[ 0.082654] TCP: Hash tables configured (established 16384 bind 16384)
[ 0.082689] UDP hash table entries: 1024 (order: 3, 32768 bytes)
[ 0.082708] UDP-Lite hash table entries: 1024 (order: 3, 32768 bytes)
[ 0.082750] NET: Registered protocol family 1
[ 0.082788] Unpacking initramfs...
[ 0.123386] Freeing initrd memory: 58972K
[ 0.123786] general protection fault: 0000 [#1] SMP
[ 0.123792] Modules linked in:
[ 0.123799] CPU: 0 PID: 30 Comm: modprobe Not tainted 4.9.0-7-amd64 #1 Debian 4.9.110-1
[ 0.123807] task: ffff880078ad7000 task.stack: ffffc90040498000
[ 0.123812] RIP: e030:[<ffffffff81614d4d>] [<ffffffff81614d4d>] ret_from_fork+0x2d/0x70
[ 0.123824] RSP: e02b:ffffc9004049bf50 EFLAGS: 00010006
[ 0.123829] RAX: 0000000493ef5000 RBX: ffffffff8108e9d0 RCX: ffffea0001ec61df
[ 0.123835] RDX: 0000000000000002 RSI: 0000000000000002 RDI: ffffc9004049bf58
[ 0.123841] RBP: 0000000000000000 R08: 0000000000000000 R09: ffff880078adc000
[ 0.124009] R10: 8080808080808080 R11: fefefefefefefeff R12: ffff88007ceb7a00
[ 0.124009] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 0.124009] FS: 0000000000000000(0000) GS:ffff88007f600000(0000) knlGS:0000000000000000
[ 0.124009] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.124009] CR2: 00007ffd13e9e9b9 CR3: 0000000078af4000 CR4: 0000000000042660
[ 0.124009] Stack:
[ 0.124009] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 0.124009] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 0.124009] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 0.124009] Call Trace:
[ 0.124009] Code: c7 e8 b8 fe a8 ff 48 85 db 75 2f 48 89 e7 e8 5b ed 9e ff 50 90 0f 20 d8 65 48 0b 04 25 e0 02 01 00 78 08 65 88 04 25 e7 02 01 00 <0f> 22 d8 58 66 0f 1f 44 00 00 e9 c1 07 00 00 4c 89 e7 eb 11 e8
[ 0.124009] RIP [<ffffffff81614d4d>] ret_from_fork+0x2d/0x70
[ 0.124009] RSP <ffffc9004049bf50>
[ 0.124009] ---[ end trace e2ff95a7e079b5b5 ]---

Did I miss something?

Thanks for your help.

Best regards,
Benoît

On Mon, 16 Jul 2018 at 19:28, Hans van Kranenburg <[email protected]> wrote:

> Reportedly, adding pti=off to the kernel boot parameters will work
> around the issue for now.
>
> Turning off pti in the guest kernel is done in any case for PV. The
> issue between 4.9.107 and 4.9.111 affects the detection and turning off
> of pti, that's why forcing it off helps.
>
> In 4.9.112 it's fixed in commit 1adc34adc3447c34926994b87db5d929f5ab45b5
> "x86/cpu: Re-apply forced caps every time CPU caps are re-read"
>
> Hans
>
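
A minimal sketch, in Python, of how a guest boot log could be checked against the advice quoted above: it reads the upstream kernel version from the "Linux version" banner and looks for pti=off in the "Kernel command line:" entry. The function names (upstream_version, kernel_parameters, check_guest) are hypothetical, and reading the range "between 4.9.107 and 4.9.111" as inclusive on both ends is an assumption rather than something the quoted message states.

```python
import re

# Assumption: the affected range is read inclusively from the quoted reply
# (broken in 4.9.107 through 4.9.111, fixed in 4.9.112).
FIRST_AFFECTED = (4, 9, 107)
FIRST_FIXED = (4, 9, 112)


def upstream_version(boot_log):
    """Return the upstream kernel version from the 'Linux version' banner.

    Debian prints the ABI name (4.9.0-7-amd64) first and the upstream
    version (for example 4.9.110-1) after '#1 SMP Debian'.
    """
    match = re.search(r"SMP Debian (\d+)\.(\d+)\.(\d+)", boot_log)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def kernel_parameters(boot_log):
    """Return the parameters listed in the 'Kernel command line:' entry."""
    # Stop at a newline or at the next '[' so that logs pasted as a single
    # line (with the following timestamp on the same line) still parse.
    match = re.search(r"Kernel command line: ([^\[\n]*)", boot_log)
    if match is None:
        return []
    return match.group(1).split()


def check_guest(boot_log):
    """Summarise whether the kernel is in the assumed range and has pti=off."""
    version = upstream_version(boot_log)
    in_range = version is not None and FIRST_AFFECTED <= version < FIRST_FIXED
    return {
        "version": version,
        "in_affected_range": in_range,
        "pti_off_present": "pti=off" in kernel_parameters(boot_log),
    }


if __name__ == "__main__":
    # Two entries abbreviated from the boot log in this message.
    sample = (
        "[ 0.000000] Linux version 4.9.0-7-amd64 (gcc version 6.3.0 20170516 "
        "(Debian 6.3.0-18+deb9u1) ) #1 SMP Debian 4.9.110-1 (2018-07-05)\n"
        "[ 0.000000] Kernel command line: root=/dev/xvda2 ro elevator=noop\n"
    )
    print(check_guest(sample))
```

For the two sample entries, the check reports version 4.9.110, inside the assumed range, with no pti=off among the logged parameters (root=/dev/xvda2 ro elevator=noop). In an xl guest configuration the kernel parameters come from the extra line shown in the cfg fragment.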

