On 8/8/16 2:23, Kilian Ries wrote:
> Hello,
> 
> 
> Running different versions of SmartOS (up to the latest release,
> 20160803T101331Z), I have noticed some strange behaviour:
> 
> 
> When SmartOS runs on a host with one CPU and 4 cores (4 real cores, 8 HT
> cores), I can set as many vCPUs for my KVM as I want. If I set it to, for
> example, 12 vCPUs and run prime inside my VM, it takes about one minute
> before I start seeing many messages like this:
> 
> 
> 
> ###
> 
> 
> [Do Aug  4 17:22:37 2016] BUG: soft lockup - CPU#6 stuck for 27s! 
> [mprime:2796]
> 
> [Do Aug  4 17:22:37 2016] Modules linked in: ppdev parport_pc sg parport
> pcspkr i2c_piix4 ip_tables xfs libcrc32c sr_mod cdrom ata_generic pata_acpi
> bochs_drm syscopyarea sysfillrect sysimgblt drm_kms_helper ttm drm ata_piix
> virtio_net virtio_blk virtio_pci i2c_core virtio_ring libata floppy
> serio_raw virtio dm_mirror dm_region_hash dm_log dm_mod
> 
> 
> [Do Aug  4 17:22:37 2016] CPU: 6 PID: 2796 Comm: mprime Tainted: G            
>  L ------------   3.10.0-327.22.2.el7.x86_64 #1
> 
> [Do Aug  4 17:22:37 2016] Hardware name: Joyent SmartDC HVM, BIOS Bochs 
> 01/01/2007
> 
> [Do Aug  4 17:22:37 2016] task: ffff8803efd26780 ti: ffff8803eed04000 
> task.ti: ffff8803eed04000
> 
> [Do Aug  4 17:22:37 2016] RIP: 0033:[<000000000164dd78>]
> 
> [Do Aug  4 17:22:37 2016]  [<000000000164dd78>] 0x164dd77
> 
> [Do Aug  4 17:22:37 2016] RSP: 002b:00007f6f917f8710  EFLAGS: 00000202
> 
> [Do Aug  4 17:22:37 2016] RAX: 000000004000000d RBX: 000000000000fe2e RCX: 
> 00007f6f4afb4b00
> 
> [Do Aug  4 17:22:37 2016] RDX: 0000000000000000 RSI: 00007f6f4afaa480 RDI: 
> 00007f6f9823c400
> 
> [Do Aug  4 17:22:37 2016] RBP: 00007f6f9823c400 R08: 00007f6f4afa9780 R09: 
> 0000000000000000
> 
> [Do Aug  4 17:22:37 2016] R10: 00007f6f917f8840 R11: 00007f6f80001000 R12: 
> 00007f6f9823c800
> 
> [Do Aug  4 17:22:37 2016] R13: 00007f6f9823c500 R14: ffffffff8163c831 R15: 
> ffff8803eed07f70
> 
> [Do Aug  4 17:22:37 2016] FS:  00007f6f917fa700(0000) 
> GS:ffff880407cc0000(0000) knlGS:0000000000000000
> 
> [Do Aug  4 17:22:37 2016] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> 
> [Do Aug  4 17:22:37 2016] CR2: 00007f9ebbb5c000 CR3: 00000003efd1d000 CR4: 
> 00000000000006e0
> 
> [Do Aug  4 17:22:37 2016] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
> 0000000000000000
> 
> [Do Aug  4 17:22:37 2016] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 
> 0000000000000400
> 
> 
> [Do Aug  4 17:22:40 2016] BUG: soft lockup - CPU#2 stuck for 30s! 
> [mprime:2792]
> 
> [Do Aug  4 17:22:40 2016] Modules linked in: ppdev parport_pc sg parport
> pcspkr i2c_piix4 ip_tables xfs libcrc32c sr_mod cdrom ata_generic pata_acpi
> bochs_drm syscopyarea sysfillrect sysimgblt drm_kms_helper ttm drm ata_piix
> virtio_net virtio_blk virtio_pci i2c_core virtio_ring libata floppy
> serio_raw virtio dm_mirror dm_region_hash dm_log dm_mod
> 
> 
> [Do Aug  4 17:22:40 2016] CPU: 2 PID: 2792 Comm: mprime Tainted: G            
>  L ------------   3.10.0-327.22.2.el7.x86_64 #1
> 
> [Do Aug  4 17:22:40 2016] Hardware name: Joyent SmartDC HVM, BIOS Bochs 
> 01/01/2007
> 
> [Do Aug  4 17:22:40 2016] task: ffff8803efd23980 ti: ffff8803ef6d8000 
> task.ti: ffff8803ef6d8000
> 
> [Do Aug  4 17:22:40 2016] RIP: 0033:[<00007f6f997d7995>]
> 
> [Do Aug  4 17:22:40 2016]  [<00007f6f997d7995>] 0x7f6f997d7994
> 
> [Do Aug  4 17:22:40 2016] RSP: 002b:00007f6f937fc708  EFLAGS: 00000202
> 
> [Do Aug  4 17:22:40 2016] RAX: 414593d9a0beeb7d RBX: 000000000000fe2e RCX: 
> 40cdbb5e7a93de49
> 
> [Do Aug  4 17:22:40 2016] RDX: 000000000008c1d8 RSI: 00007f6edd0e1440 RDI: 
> 00007f6edd5f04b8
> 
> [Do Aug  4 17:22:40 2016] RBP: 00007f6edd0332a0 R08: 0000000000000000 R09: 
> 0000000000000099
> 
> [Do Aug  4 17:22:40 2016] R10: 0000000000000001 R11: 00007f6f50001000 R12: 
> 00007f6f7f2f8f80
> 
> [Do Aug  4 17:22:40 2016] R13: 00007f6f50002640 R14: ffffffff8163c831 R15: 
> ffff8803ef6dbf70
> 
> [Do Aug  4 17:22:40 2016] FS:  00007f6f937fe700(0000) 
> GS:ffff880407c40000(0000) knlGS:0000000000000000
> 
> [Do Aug  4 17:22:40 2016] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> 
> [Do Aug  4 17:22:40 2016] CR2: 0000000002541eb0 CR3: 00000003efd1d000 CR4: 
> 00000000000006e0
> 
> [Do Aug  4 17:22:40 2016] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
> 0000000000000000
> 
> [Do Aug  4 17:22:40 2016] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 
> 0000000000000400
> 
> 
> [Do Aug  4 17:22:40 2016]  libata
> 
> [Do Aug  4 17:22:40 2016]  floppy serio_raw virtio dm_mirror dm_region_hash 
> dm_log dm_mod
> 
> [Do Aug  4 17:22:40 2016] CPU: 9 PID: 2799 Comm: mprime Tainted: G            
>  L ------------   3.10.0-327.22.2.el7.x86_64 #1
> 
> [Do Aug  4 17:22:40 2016] Hardware name: Joyent SmartDC HVM, BIOS Bochs 
> 01/01/2007
> 
> [Do Aug  4 17:22:40 2016] task: ffff8803efd25c00 ti: ffff8803efcc0000 
> task.ti: ffff8803efcc0000
> 
> [Do Aug  4 17:22:40 2016] RIP: 0033:[<000000000164ebb0>]  
> [<000000000164ebb0>] 0x164ebaf
> 
> [Do Aug  4 17:22:40 2016] RSP: 002b:00007f6f8b7fc710  EFLAGS: 00000202
> 
> [Do Aug  4 17:22:40 2016] RAX: 0000000000000010 RBX: 000000000000fe2e RCX: 
> 00007f6ee35d6280
> 
> [Do Aug  4 17:22:40 2016] RDX: 0000000000000000 RSI: 00007f6ee35c8e40 RDI: 
> 00007f6f7f0aa440
> 
> [Do Aug  4 17:22:40 2016] RBP: 00007f6f7f0aa800 R08: 00007f6ee35c8e00 R09: 
> 0000000000000000
> 
> [Do Aug  4 17:22:40 2016] R10: 00007f6f8b7fc840 R11: 00007f6f5c001000 R12: 
> 00007f6f7f0aa800
> 
> [Do Aug  4 17:22:40 2016] R13: 00007f6f7f0aa500 R14: ffffffff8163c831 R15: 
> ffff8803efcc3f70
> 
> [Do Aug  4 17:22:40 2016] FS:  00007f6f8b7fe700(0000) 
> GS:ffff880407d20000(0000) knlGS:0000000000000000
> 
> [Do Aug  4 17:22:40 2016] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> 
> [Do Aug  4 17:22:40 2016] CR2: 00007fe925f5f292 CR3: 00000003efd1d000 CR4: 
> 00000000000006e0
> 
> [Do Aug  4 17:22:40 2016] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
> 0000000000000000
> 
> [Do Aug  4 17:22:40 2016] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 
> 0000000000000400
> 
> 
> ###
> 
> 
> 
> After some time the KVM is completely unusable: SSH login doesn't work and
> I have no other option than to force a poweroff (vmadm stop -f UUID). If I
> set the KVM to only 4 vCPUs, everything runs fine and prime runs for > 30
> minutes without problems.
> 
> 
> As far as I know from other virtualization software (VMware, ...), you are
> not allowed to set up more vCPUs per VM than the host itself has. But in
> Proxmox, for example, if you are using QEMU virtual CPUs (as SmartOS does),
> you are allowed to set as many vCPUs as you want.
> 
> 
> So is this a bug in SmartOS, and you shouldn't be able to set more vCPUs
> per KVM than the host itself has? Or, if not, why isn't SmartOS capable of
> handling the high load correctly (the KVM shouldn't freeze)?

Hi Kilian,

At the end of the day, SmartOS is not going to stop you from
over-provisioning on pretty much any axis, vCPUs included. So in theory
we'll be trying to time-share the host's CPUs among all of the guest's
vCPUs and everything else running on the box.
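
As a rough worked example of what that time-sharing means for the numbers
in the original report (12 vCPUs, 8 hardware threads), here is a small
Python sketch. The function name is made up for illustration, and it
assumes every vCPU is busy and the host scheduler splits CPU time evenly,
which a real system will only approximate.

```python
def fair_share(vcpus, host_threads):
    """Upper bound on the fraction of one hardware thread each busy vCPU
    can get, assuming an even split (an idealisation, not a measurement)."""
    if vcpus <= 0 or host_threads <= 0:
        raise ValueError("vcpus and host_threads must be positive")
    return min(1.0, host_threads / vcpus)


# Numbers from the report: 12 vCPUs on a host with 8 hardware threads.
print(f"{fair_share(12, 8):.0%}")  # prints 67%
```

In other words, under these assumptions each vCPU is off-CPU for about a
third of the time, before accounting for anything else on the host.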

A couple of questions to better understand the situation:

Is this the only zone running on the box? Is there a CPU cap set?

If you look at the QEMU process with prstat -mL at, say, a one-second
interval, what do you end up seeing, if anything? Do you see some
threads (which represent CPUs) spending all of their time in a certain
state, or is something else going on? Is there a lot of LAT (time spent
waiting for a CPU) there for some reason?
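
If it helps to sift through the output, below is a minimal Python sketch
that picks out the threads with a high LAT value from captured prstat -mL
text. The function name and the 20% threshold are arbitrary choices for
illustration. It assumes the output has a header row containing PID and
LAT columns, with the process/LWP ID as the last field of each row.

```python
def high_latency_lwps(prstat_text, threshold=20.0):
    """Return (process/lwpid, LAT%) pairs with LAT at or above threshold."""
    header = None
    hits = []
    for line in prstat_text.splitlines():
        fields = line.split()
        # The header row names the columns; find LAT by name, not position.
        if "PID" in fields and "LAT" in fields:
            header = fields
            continue
        # Skip blank lines, the "Total:" summary, and rows before a header.
        if header is None or len(fields) != len(header):
            continue
        try:
            lat = float(fields[header.index("LAT")])
        except ValueError:
            continue
        if lat >= threshold:
            hits.append((fields[-1], lat))
    return hits
```

One way to use it would be to save a few samples to a file (for example
with prstat -mL -p PID 1 5, where PID is the QEMU process) and pass the
file's contents to the function.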

Robert

