On an idle Xenial cloud image I'm seeing:

[ 1485.236760] [<ffff800000086ad0>] __switch_to+0x90/0xa8
[ 1485.236772] [<ffff800000143e80>] __tick_nohz_idle_enter+0x50/0x3f0
[ 1485.236776] [<ffff800000144478>] tick_nohz_idle_enter+0x40/0x70
[ 1485.236785] [<ffff80000010baf0>] cpu_startup_entry+0x288/0x2d8
[ 1485.236791] [<ffff80000008fca8>] secondary_start_kernel+0x120/0x130
[ 1485.236795] [<000000004008290c>] 0x4008290c

After a while I get:

[ 2462.806971] rcu_sched kthread starved for 15002 jiffies! g2579 c2578 f0x0 s3 ->state=0x1
[ 2667.835351] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 2667.836918] 0-...: (66 GPs behind) idle=cf0/0/0 softirq=5177/5177 fqs=0
[ 2667.838801] 2-...: (0 ticks this GP) idle=73a/0/0 softirq=4570/4570 fqs=0
[ 2667.840696] 3-...: (64 GPs behind) idle=eba/0/0 softirq=4654/4654 fqs=0
[ 2667.842533] (detected by 1, t=15002 jiffies, g=2638, c=2637, q=4389)

At this point sleeping blocks. For example, strace on sleep(1) on the VM shows nanosleep({1, 0}) sleeping forever; one has to SIGINT it because it never times out.

Also, the secondary_start_kernel() frame is indicative that the VM puts CPUs to sleep and wakes them on a timer. I can trigger this more often with more CPUs on the VM and also by loading the host: for example, producing a lot of cache or memory activity on the host triggers the initial hangs more frequently than an idle host does.

So I suspect a cpuhotplug and nohz combination is causing the issues here.

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1531768

Title:
  [arm64] lockups some time after booting

Status in Auto Package Testing:
  Triaged
Status in linux package in Ubuntu:
  Confirmed

Bug description:
  I created an 8 CPU arm64 instance on Canonical's Scalingstack (which I
  want to use for armhf autopkgtesting in LXD). I started with wily, as
  that has lxd available (it's not yet available in trusty nor in the PPA
  for arm64).

  However, pretty much any LXD task that I do on this machine (I haven't
  tried much else) takes unbearably long. A simple "lxc profile set
  default raw.lxc lxc.seccomp=" or "lxc list" takes several minutes.

  I see tons of

  [ 1020.971955] rcu_sched kthread starved for 6000 jiffies! g1095 c1094 f0x0
  [ 1121.166926] INFO: task fsnotify_mark:69 blocked for more than 120 seconds.

  in dmesg (the attached apport info has the complete dmesg).
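
In case it helps with comparing runs (more CPUs, loaded vs. idle host), here is a minimal sketch for pulling the "rcu_sched kthread starved" messages out of a saved dmesg dump. The function name find_starvation_events and the SAMPLE text are made up for illustration; the sample lines are the ones quoted in this thread, and the regular expression only assumes the message format shown there.

```python
import re

# Matches lines of the form quoted in this bug, e.g.
# [ 1020.971955] rcu_sched kthread starved for 6000 jiffies! g1095 c1094 f0x0
STARVED_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+rcu_sched kthread starved for "
    r"(?P<jiffies>\d+) jiffies!"
)


def find_starvation_events(dmesg_text):
    """Return a list of (timestamp_seconds, starved_jiffies) tuples."""
    return [
        (float(m.group("ts")), int(m.group("jiffies")))
        for m in STARVED_RE.finditer(dmesg_text)
    ]


# Sample input assembled from the messages quoted in this thread.
SAMPLE = """\
[ 1020.971955] rcu_sched kthread starved for 6000 jiffies! g1095 c1094 f0x0
[ 1121.166926] INFO: task fsnotify_mark:69 blocked for more than 120 seconds.
[ 2462.806971] rcu_sched kthread starved for 15002 jiffies! g2579 c2578 f0x0 s3 ->state=0x1
"""

if __name__ == "__main__":
    for ts, jiffies in find_starvation_events(SAMPLE):
        print(f"{ts:12.6f}s  starved for {jiffies} jiffies")
```

To use it on a real system, the output of dmesg would be passed in instead of SAMPLE. Counting events per run gives a rough way to check whether a given configuration makes the hangs more frequent.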

  ProblemType: Bug
  DistroRelease: Ubuntu 15.10
  Package: linux-image-4.2.0-22-generic 4.2.0-22.27
  ProcVersionSignature: User Name 4.2.0-22.27-generic 4.2.6
  Uname: Linux 4.2.0-22-generic aarch64
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116, 1 Jan 7 09:18 seq
   crw-rw---- 1 root audio 116, 33 Jan 7 09:18 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
  ApportVersion: 2.19.1-0ubuntu5
  Architecture: arm64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  CRDA: N/A
  Date: Thu Jan 7 09:24:01 2016
  IwConfig:
   eth0 no wireless extensions.
   lo no wireless extensions.
   lxcbr0 no wireless extensions.
  Lspci:
   00:00.0 Host bridge [0600]: Red Hat, Inc. Device [1b36:0008]
    Subsystem: Red Hat, Inc Device [1af4:1100]
    Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
  Lsusb: Error: command ['lsusb'] failed with exit code 1: unable to initialize libusb: -99
  PciMultimedia:
  ProcEnviron:
   TERM=screen
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=<set>
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB:
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.2.0-22-generic root=LABEL=cloudimg-rootfs earlyprintk
  RelatedPackageVersions:
   linux-restricted-modules-4.2.0-22-generic N/A
   linux-backports-modules-4.2.0-22-generic N/A
   linux-firmware 1.149.3
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
  SourcePackage: linux
  UdevLog: Error: [Errno 2] No such file or directory: '/var/log/udev'
  UpgradeStatus: No upgrade log present (probably fresh install)