Public bug reported:

System:  Intel SDP, Xeon(R) Gold 6252 CPU, 96 Core, 1708.5 GiB Memory (DCPMM Optane Memory)
Series: Bionic
Kernel: 4.15.0-51-generic #55
Arch: AMD64
Libvirt: 4.0.0-1ubuntu8.10

Problem:

While testing this machine, we discovered that virtual machines
created via libvirt are unable to start when assigned 50 cores and 1
terabyte of memory.

The following tests were done using the disco cloud image, attempting to
boot a disco VM with 50 cores and 1 TB of memory.  The full console log
is attached to this bug.


[   15.229175] NET: Registered protocol family 10
[   15.231941] Segment Routing with IPv6
[   15.232523] NET: Registered protocol family 17
[   15.233141] BUG: unable to handle kernel paging request at ffff9d35c5a16880
[   15.233392] Key type dns_resolver registered
[   15.235863] #PF error: [PROT] [WRITE] [RSVD]
[   15.235863] PGD fcf1e05067 P4D fcf1e05067 PUD 10788a6c063 PMD 
8000010785a000e3
[   15.236967] Oops: 000b [#1] SMP PTI
[   15.242373] CPU: 26 PID: 456 Comm: kworker/26:1 Not tainted 5.0.0-16-generic 
#17-Ubuntu
[   15.242373] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.10.2-1ubuntu1 04/01/2014
[   15.242373] Workqueue: ata_sff ata_sff_pio_task
[   15.242373] RIP: 0010:ioread32_rep+0x41/0x70
[   15.242373] Code: 48 8d 54 8e 04 8b 07 89 06 48 83 c6 04 48 39 d6 75 f3 5d 
c3 48 81 ff 00 00 01 00 76 27 0f b7 c7 89 c2 0f 1f 44 00 00 48 89 f7 <f3> 6d 5d 
c3 31 ff 48 85 c9 74 dd ed 89 04 be 48 83 c7 01 48 39 f9
[   15.242373] RSP: 0000:ffffa9bbda423d48 EFLAGS: 00010006
[   15.255328] RAX: 00000000000001f0 RBX: 0000000000000200 RCX: 0000000000000080
[   15.255328] RDX: 00000000000001f0 RSI: ffff9d35c5a16880 RDI: ffff9d35c5a16880
[   15.255328] RBP: ffffa9bbda423d48 R08: 0000000000000000 R09: 006666735f617461
[   15.255328] R10: 8080808080808080 R11: 0000000000000001 R12: ffff9d35c5a16880
[   15.255328] R13: 00000000000101f0 R14: 0000000000000000 R15: ffff9d35c5a16368
[   15.255328] FS:  0000000000000000(0000) GS:ffff9d35e4e80000(0000) 
knlGS:0000000000000000
[   15.255328] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   15.255328] CR2: ffff9d35c5a16880 CR3: 000000fcf140e001 CR4: 00000000003606e0
[   15.255328] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   15.255328] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   15.255328] Call Trace:
[   15.255328]  ata_sff_data_xfer32+0x8e/0x160
[   15.255328]  ? __switch_to_asm+0x34/0x70
[   15.255328]  ata_pio_sector+0xb4/0x120
[   15.255328]  ata_pio_sectors+0x7e/0x90
[   15.255328]  ata_sff_hsm_move+0x228/0x690
[   15.255328]  ? __switch_to+0x96/0x4e0
[   15.255328]  ? __switch_to_asm+0x40/0x70
[   15.255328]  ? __switch_to_asm+0x34/0x70
[   15.255328]  ? __switch_to_asm+0x40/0x70
[   15.255328]  ata_sff_pio_task+0xcc/0x1b0
[   15.255328]  process_one_work+0x20f/0x410
[   15.255328]  worker_thread+0x34/0x400
[   15.255328]  kthread+0x120/0x140
[   15.255328]  ? process_one_work+0x410/0x410
[   15.255328]  ? __kthread_parkme+0x70/0x70
[   15.255328]  ret_from_fork+0x35/0x40
[   15.255328] Modules linked in:
[   15.255328] CR2: ffff9d35c5a16880
[   15.255328] ---[ end trace 047af05ecf201244 ]---
[   15.255328] RIP: 0010:ioread32_rep+0x41/0x70
[   15.255328] Code: 48 8d 54 8e 04 8b 07 89 06 48 83 c6 04 48 39 d6 75 f3 5d 
c3 48 81 ff 00 00 01 00 76 27 0f b7 c7 89 c2 0f 1f 44 00 00 48 89 f7 <f3> 6d 5d 
c3 31 ff 48 85 c9 74 dd ed 89 04 be 48 83 c7 01 48 39 f9
[   15.255328] RSP: 0000:ffffa9bbda423d48 EFLAGS: 00010006
[   15.283402] RAX: 00000000000001f0 RBX: 0000000000000200 RCX: 0000000000000080
[   15.283402] RDX: 00000000000001f0 RSI: ffff9d35c5a16880 RDI: ffff9d35c5a16880
[   15.283402] RBP: ffffa9bbda423d48 R08: 0000000000000000 R09: 006666735f617461
[   15.283402] R10: 8080808080808080 R11: 0000000000000001 R12: ffff9d35c5a16880
[   15.283402] R13: 00000000000101f0 R14: 0000000000000000 R15: ffff9d35c5a16368
[   15.283402] FS:  0000000000000000(0000) GS:ffff9d35e4e80000(0000) 
knlGS:0000000000000000
[   15.283402] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   15.283402] CR2: ffff9d35c5a16880 CR3: 000000fcf140e001 CR4: 00000000003606e0
[   15.283402] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   15.283402] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
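
For reference, the hex value after "Oops:" in the trace above is the x86
page-fault error code, and its low bits correspond to the flags printed on the
"#PF error:" line. The following is a minimal sketch of that decoding, not part
of the original test run; the function name decode_pf_error is made up for
illustration, and only the six architecturally defined low bits are handled.

```shell
#!/bin/sh
# Decode an x86 page-fault error code (the hex value printed after "Oops:").
# decode_pf_error is a hypothetical helper name, not a kernel or distro tool.
decode_pf_error() {
    code=$((0x$1))
    out=""
    # bit 0: fault on a present page (protection violation)
    if [ $((code & 0x01)) -ne 0 ]; then out="$out [PROT]"; fi
    # bit 1: the access was a write
    if [ $((code & 0x02)) -ne 0 ]; then out="$out [WRITE]"; fi
    # bit 2: the access came from user mode
    if [ $((code & 0x04)) -ne 0 ]; then out="$out [USER]"; fi
    # bit 3: a reserved bit was set in a page-table entry
    if [ $((code & 0x08)) -ne 0 ]; then out="$out [RSVD]"; fi
    # bit 4: the fault was an instruction fetch
    if [ $((code & 0x10)) -ne 0 ]; then out="$out [INSTR]"; fi
    # bit 5: protection-key violation
    if [ $((code & 0x20)) -ne 0 ]; then out="$out [PK]"; fi
    # Strip the leading space before printing.
    echo "${out# }"
}

decode_pf_error 000b
```

With the code 000b from this trace, the sketch prints the same three flags the
kernel reported: a write to a present page whose page-table entry has a
reserved bit set.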


Steps to Reproduce:

1.) Download latest cloud image.
2.) qemu-img convert -O qcow2 <Cloud-IMG> <IMG.QCOW2>
3.) qemu-img resize <IMG.QCOW2> +5GB 
4.) Generate Cloud Config to allow login

cat > config <<EOF
#cloud-config
password: ubuntu
chpasswd: { expire: False }
ssh_pwauth: True
EOF

5.) cloud-localds config.img config
6.) sudo virt-install --connect=qemu:///system --name virt-test-created \
      --ram 1096000 --vcpus=50 --os-type=linux \
      --disk ubuntu.img,device=disk,bus=virtio \
      --disk config.img,device=cdrom --graphics none --import
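
As a sanity check on the size requested in step 6, the arithmetic can be
sketched as follows. This assumes the --ram value is interpreted in MiB, which
is how the virt-install manual page describes it; the helper name mib_to_bytes
is made up for illustration.

```shell
#!/bin/sh
# Convert a MiB count to bytes (assumes --ram is given in MiB).
# mib_to_bytes is a hypothetical helper name.
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

ram_mib=1096000
bytes=$(mib_to_bytes "$ram_mib")
one_tib=$(( 1024 * 1024 * 1024 * 1024 ))

echo "$ram_mib MiB = $bytes bytes"
echo "whole GiB: $(( ram_mib / 1024 ))"

# Compare inside shell arithmetic to stay within 64-bit integer handling.
if [ $(( bytes > one_tib )) -eq 1 ]; then
    echo "request is above 1 TiB ($one_tib bytes)"
fi
```

Under that assumption the guest is being asked for about 1070 GiB, which is
slightly more than 1 TiB rather than exactly 1 TiB.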


Alternatively, using the uvtool binaries, uvt-kvm yields a similar but
not identical outcome.

Steps to Reproduce with uvtool:

1.) $ uvt-simplestreams-libvirt sync release=disco arch=amd64
2.) $ uvt-kvm create test release=disco arch=amd64 --memory 1096000 --cpu 50

The console shows that the VM boots slightly further, but then appears
to hang at this point:

[   18.085051]   Magic number: 7:675:581
[   18.086091] memory memory6696: hash matches
[   18.086973] memory memory6015: hash matches
[   18.087924] memory memory4574: hash matches
[   18.088821] memory memory3738: hash matches
[   18.089746] memory memory2647: hash matches
[   18.090690] memory memory1361: hash matches
[   18.091573] memory memory574: hash matches
[   18.092660] rtc_cmos 00:00: setting system clock to 2019-06-05T12:34:17 UTC 
(1559738057)
[   18.097307] Freeing unused decrypted memory: 2040K
[   18.099181] Freeing unused kernel image memory: 2576K
[   18.109790] Write protecting the kernel read-only data: 22528k
[   18.112213] Freeing unused kernel image memory: 2016K
[   18.114252] Freeing unused kernel image memory: 1852K
[   18.135085] x86/mm: Checked W+X mappings: passed, no W+X pages found.
[   18.136415] x86/mm: Checking user space page tables
[   18.148461] x86/mm: Checked W+X mappings: passed, no W+X pages found.
[   18.149546] Run /init as init process
Loading, please wait...
Starting version 240

No stack trace appears when using uvtool, unlike when the VM is created 
manually by downloading the cloud image and using virt-install.
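
To tell the two outcomes described above apart when comparing saved console
output, something like the following sketch could be used. It assumes the
guest console has been captured to plain-text files; the function name and the
idea of passing log paths as arguments are illustrative only.

```shell
#!/bin/sh
# Classify a saved guest console log by whether it contains the paging BUG line.
# classify_console_log is a hypothetical helper name.
classify_console_log() {
    if grep -q 'BUG: unable to handle kernel paging request' "$1"; then
        echo "oops"
    else
        echo "no-oops"
    fi
}

# Classify every log file given on the command line (does nothing without arguments).
for log in "$@"; do
    printf '%s: %s\n' "$log" "$(classify_console_log "$log")"
done
```

A log from the virt-install run would be expected to report "oops", while one
from the uvt-kvm run that stops after "Starting version 240" would report
"no-oops".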
--- 
ProblemType: Bug
AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116,  1 Jun  5 14:43 seq
 crw-rw---- 1 root audio 116, 33 Jun  5 14:43 timer
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
ApportVersion: 2.20.9-0ubuntu7.6
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', 
'/dev/snd/timer'] failed with exit code 1:
DistroRelease: Ubuntu 18.04
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
Lsusb:
 Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 001 Device 003: ID 0b1f:03e9 Insyde Software Corp. 
 Bus 001 Device 002: ID 0000:0001  
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: Intel Corporation S2600WFD
Package: linux (not installed)
PciMultimedia:
 
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=C.UTF-8
 SHELL=/bin/bash
ProcFB: 0 astdrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-51-generic 
root=UUID=6d07bb91-3c5d-4851-86fa-2e5843fd3cae ro
ProcVersionSignature: Ubuntu 4.15.0-51.55-generic 4.15.18
RelatedPackageVersions:
 linux-restricted-modules-4.15.0-51-generic N/A
 linux-backports-modules-4.15.0-51-generic  N/A
 linux-firmware                             1.173.6
RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
Tags:  bionic uec-images
Uname: Linux 4.15.0-51-generic x86_64
UnreportableReason: This report is about a package that is not installed.
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm audio cdrom dialout dip floppy libvirt lxd netdev plugdev sudo 
video
_MarkForUpload: False
dmi.bios.date: 02/27/2019
dmi.bios.vendor: Intel Corporation
dmi.bios.version: SE5C620.86B.0D.01.0395.022720191340
dmi.board.asset.tag: Base Board Asset Tag
dmi.board.name: S2600WFD
dmi.board.vendor: Intel Corporation
dmi.board.version: J46732-610
dmi.chassis.asset.tag: ....................
dmi.chassis.type: 23
dmi.chassis.vendor: ...............................
dmi.chassis.version: ..................
dmi.modalias: 
dmi:bvnIntelCorporation:bvrSE5C620.86B.0D.01.0395.022720191340:bd02/27/2019:svnIntelCorporation:pnS2600WFD:pvr....................:rvnIntelCorporation:rnS2600WFD:rvrJ46732-610:cvn...............................:ct23:cvr..................:
dmi.product.family: Family
dmi.product.name: S2600WFD
dmi.product.version: ....................
dmi.sys.vendor: Intel Corporation

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: apport-collected bionic uec-images

** Attachment added: "kvm-debug.txt"
   
https://bugs.launchpad.net/bugs/1831763/+attachment/5268946/+files/kvm-debug.txt

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1831763

Title:
  Unable to create virtual machine with large amounts memory / cpu

Status in linux package in Ubuntu:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1831763/+subscriptions
