This bug was nominated against a series that is no longer supported, i.e.
yakkety. The bug task representing the yakkety nomination is being
closed as Won't Fix.
This change has been made by an automated script, maintained by the
Ubuntu Kernel Team.
** Changed in: linux (Ubuntu Yakkety)
Status: Triaged => Won't Fix
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1672521
Title:
ThunderX: soft lockup on 4.8+ kernels
Status in linux package in Ubuntu:
Triaged
Status in linux source package in Yakkety:
Won't Fix
Status in linux source package in Zesty:
Triaged
Bug description:
I have spent days looking for an easy way to reproduce this.
We initially observed it in OPNFV Armband, when we tried to upgrade our
Ubuntu Xenial installation kernel to linux-image-generic-hwe-16.04 (4.8).
In our environment, this was easily triggered on compute nodes, when
launching multiple VMs (we suspected OVS, QEMU etc.).
However, in order to rule out our specifics, we looked for a simple way to
reproduce it on all ThunderX nodes we have access to, and we finally found it:
$ apt-get install stress-ng
$ stress-ng --hdd 1024
We tested different firmware versions, provided by both chip/board manufacturers,
and with all of them the result is 100% reproducible, leading to a kernel Oops
[1]:
[ 726.070531] INFO: task kworker/0:1:312 blocked for more than 120 seconds.
[ 726.077908] Tainted: G W I 4.8.0-41-generic #44~16.04.1-Ubuntu
[ 726.085850] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 726.094383] kworker/0:1 D ffff0000080861bc 0 312 2 0x00000000
[ 726.094401] Workqueue: events vmstat_shepherd
[ 726.094404] Call trace:
[ 726.094411] [<ffff0000080861bc>] __switch_to+0x94/0xa8
[ 726.094418] [<ffff0000089854f4>] __schedule+0x224/0x718
[ 726.094421] [<ffff000008985a20>] schedule+0x38/0x98
[ 726.094425] [<ffff000008985d84>] schedule_preempt_disabled+0x14/0x20
[ 726.094428] [<ffff000008987644>] __mutex_lock_slowpath+0xd4/0x168
[ 726.094431] [<ffff000008987730>] mutex_lock+0x58/0x70
[ 726.094437] [<ffff0000080c552c>] get_online_cpus+0x44/0x70
[ 726.094440] [<ffff00000820ca24>] vmstat_shepherd+0x3c/0xe8
[ 726.094446] [<ffff0000080e1c60>] process_one_work+0x150/0x478
[ 726.094449] [<ffff0000080e1fd8>] worker_thread+0x50/0x4b8
[ 726.094453] [<ffff0000080e8eac>] kthread+0xec/0x100
[ 726.094456] [<ffff000008083690>] ret_from_fork+0x10/0x40
Over the last few days, I tested all 4.8-* and 4.10 (zesty backport) kernels;
the soft lockup happens with each and every one of them.
On the other hand, 4.4.0-45-generic seems to work perfectly fine under normal
conditions (newer 4.4.0-* kernels probably do too, but a regression in the
ethernet drivers after 4.4.0-45 prevents us from testing those easily), yet
running stress-ng leads to the same oops.
[1] http://paste.ubuntu.com/24172516/
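When checking several nodes, the hung-task reports can be pulled out of a saved
kernel log mechanically. The following is only a minimal sketch, assuming POSIX
sh and sed; hung_tasks is a hypothetical helper name (not part of any package),
and the pattern is based solely on the "blocked for more than" message format
shown in the trace above:

```shell
# Sketch: extract hung-task reports from kernel log text read on stdin.
# hung_tasks is a hypothetical helper, not an existing tool.
# Prints one line per report: <task name> <pid> <timeout in seconds>.
hung_tasks() {
    # The task name may itself contain ':' (e.g. kworker/0:1), so the PID is
    # taken as the last ':<digits>' group before " blocked for more than".
    sed -n 's/.*INFO: task \(.*\):\([0-9][0-9]*\) blocked for more than \([0-9][0-9]*\) seconds\..*/\1 \2 \3/p'
}
```

It could be used as `dmesg | hung_tasks` while stress-ng is running, or on a
saved copy of the log.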
---
AlsaDevices:
total 0
crw-rw---- 1 root audio 116, 1 Mar 13 19:27 seq
crw-rw---- 1 root audio 116, 33 Mar 13 19:27 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.20.1-0ubuntu2.5
Architecture: arm64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq',
'/dev/snd/timer'] failed with exit code 1:
DistroRelease: Ubuntu 16.04
IwConfig: Error: [Errno 2] No such file or directory
MachineType: GIGABYTE R120-T30
Package: linux (not installed)
PciMultimedia:
ProcEnviron:
TERM=vt220
PATH=(custom, no user)
XDG_RUNTIME_DIR=<set>
LANG=en_US.UTF-8
SHELL=/bin/bash
ProcFB: 0 astdrmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.8.0-41-generic
root=/dev/mapper/os-root ro console=tty0 console=ttyS0,115200
console=ttyAMA0,115200 net.ifnames=1 biosdevname=0 rootdelay=90 nomodeset quiet
splash vt.handoff=7
ProcVersionSignature: Ubuntu 4.8.0-41.44~16.04.1-generic 4.8.17
RelatedPackageVersions:
linux-restricted-modules-4.8.0-41-generic N/A
linux-backports-modules-4.8.0-41-generic N/A
linux-firmware 1.157.8
RfKill: Error: [Errno 2] No such file or directory
Tags: xenial
Uname: Linux 4.8.0-41-generic aarch64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:
_MarkForUpload: True
dmi.bios.date: 11/22/2016
dmi.bios.vendor: GIGABYTE
dmi.bios.version: T22
dmi.board.asset.tag: 01234567890123456789AB
dmi.board.name: MT30-GS0
dmi.board.vendor: GIGABYTE
dmi.board.version: 01234567
dmi.chassis.asset.tag: 01234567890123456789AB
dmi.chassis.type: 17
dmi.chassis.vendor: GIGABYTE
dmi.chassis.version: 01234567
dmi.modalias:
dmi:bvnGIGABYTE:bvrT22:bd11/22/2016:svnGIGABYTE:pnR120-T30:pvr0100:rvnGIGABYTE:rnMT30-GS0:rvr01234567:cvnGIGABYTE:ct17:cvr01234567:
dmi.product.name: R120-T30
dmi.product.version: 0100
dmi.sys.vendor: GIGABYTE
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1672521/+subscriptions
--
Mailing list: https://launchpad.net/~kernel-packages
Post to : [email protected]
Unsubscribe : https://launchpad.net/~kernel-packages
More help : https://help.launchpad.net/ListHelp