Hi Salvatore, thank you for taking care of this!
I first ran the tracing for a few seconds and have attached the compressed output "out.txt.gz", truncated after line 5000, to this e-mail. Since some "nvidia"-related processes also appear in it, I should mention, just in case, that I have an Optimus laptop where the Nvidia GPU renders images and the integrated Intel GPU sends them to the monitor.
I also tried the stack trace, but was not sure whether I did it right - so, this is what I did:
# top
top - 16:29:42 up 7 min,  3 users,  load average: 1,82, 1,52, 0,80
Tasks: 200 total,   2 running, 198 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0,5 us, 12,4 sy,  0,0 ni, 86,5 id,  0,0 wa,  0,0 hi,  0,5 si,  0,0 st
MiB Mem :  15889,4 total,  13390,9 free,   1263,3 used,   1235,2 buff/cache
MiB Swap:      0,0 total,      0,0 free,      0,0 used.  14265,6 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
   70 root      20   0       0      0      0 R  84,0   0,0   6:23.39 kworker/4:1+pm
   32 root      20   0       0      0      0 S  16,0   0,0   1:12.32 ksoftirqd/4
  761 root      20   0  349132 104820  67088 S   3,7   0,6   0:21.44 Xorg
...

I saw the kworker process with PID 70 and thus looked at the stack of this process:
# cat /proc/70/stack
[<0>] usb_start_wait_urb+0x65/0x160 [usbcore]
[<0>] usb_control_msg+0xdd/0x140 [usbcore]
[<0>] set_port_feature+0x30/0x40 [usbcore]
[<0>] hub_suspend+0x1e3/0x250 [usbcore]
[<0>] usb_suspend_both+0x9d/0x230 [usbcore]
[<0>] usb_runtime_suspend+0x2a/0x70 [usbcore]
[<0>] __rpm_callback+0xc7/0x200
[<0>] rpm_callback+0x1f/0x70
[<0>] rpm_suspend+0x138/0x670
[<0>] __pm_runtime_suspend+0x41/0x80
[<0>] usb_runtime_idle+0x2d/0x40 [usbcore]
[<0>] __rpm_callback+0xc7/0x200
[<0>] rpm_idle+0xa5/0x310
[<0>] pm_runtime_work+0x73/0x90
[<0>] process_one_work+0x1a7/0x3a0
[<0>] worker_thread+0x30/0x390
[<0>] kthread+0x112/0x130
[<0>] ret_from_fork+0x35/0x40
[<0>] 0xffffffffffffffff

I hope this was right. If I can give you any more information, please let me know.
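For anyone wanting to repeat these two steps in one go, the busiest kworker can be picked out of the process list and its stack sampled a few times with a small script. This is only a sketch; `find_busy_kworker` is a hypothetical helper name, and reading /proc/<pid>/stack normally requires root:

```shell
#!/bin/sh
# Sketch: find the kworker thread with the highest CPU usage and
# sample its kernel stack a few times. "find_busy_kworker" is a
# hypothetical name, not part of any tool mentioned in this report.
find_busy_kworker() {
    # Sort processes by CPU usage and print the PID of the first
    # kworker thread (column 3 of "ps -eo pid,pcpu,comm" is the name).
    ps -eo pid,pcpu,comm --sort=-pcpu | awk '$3 ~ /^kworker/ {print $1; exit}'
}

pid=$(find_busy_kworker)
if [ -n "$pid" ]; then
    # Take three stack samples one second apart; the stack of a busy
    # kthread usually looks similar across samples, as above.
    for i in 1 2 3; do
        echo "--- sample $i (PID $pid) ---"
        cat "/proc/$pid/stack"
        sleep 1
    done
fi
```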
Regards,
Dirk.

On 02.08.20 at 15:44, Salvatore Bonaccorso wrote:
Control: tags -1 + moreinfo

Hi Dirk,

On Sun, Aug 02, 2020 at 10:00:27AM +0200, Dirk Kostrewa wrote:
> Package: src:linux
> Version: 4.19.132-1
> Severity: normal
>
> Dear Maintainer,
>
> after booting the kernel 4.19.0-10-amd64, there is a kworker process
> running with a permanent high CPU load of almost 90% as reported by
> the "top" command:
>
> $ top
> top - 09:48:19 up 0 min,  4 users,  load average: 1.91, 0.58, 0.20
> Tasks: 218 total,   2 running, 216 sleeping,   0 stopped,   0 zombie
> %Cpu(s):  0.8 us, 12.4 sy,  0.0 ni, 84.5 id,  0.0 wa,  0.0 hi,  2.3 si,  0.0 st
> MiB Mem :  15889.4 total,  14173.1 free,    889.3 used,    827.0 buff/cache
> MiB Swap:      0.0 total,      0.0 free,      0.0 used.  14677.7 avail Mem
>
>   PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
>    64 root      20   0       0      0      0 R  86.7   0.0   0:47.41 kworker/0:2+pm
>     9 root      20   0       0      0      0 S  20.0   0.0   0:08.84 ksoftirqd/0
>   364 root     -51   0       0      0      0 S   6.7   0.0   0:00.50 irq/126-nvidia
>  1177 dirk      20   0 2921696 122848  94268 S   6.7   0.8   0:02.23 kwin_x11
>     1 root      20   0  169652  10280   7740 S   0.0   0.1   0:01.56 systemd
>     2 root      20   0       0      0      0 S   0.0   0.0   0:00.00 kthreadd
> ...
>
> The expected result after booting the kernel 4.19.0-10-amd64 is a
> kworker process with a CPU load close to 0%.
>
> As a control, booting the previous kernel 4.19.0-9-amd64 does not show
> a high CPU load for the kworker process. Instead, the kworker CPU load
> reported by the "top" command is 0.0%. Therefore, I suspect a bug in
> the kernel 4.19.0-10-amd64.
>
> Neither "dmesg" nor "journalctl -b" show any messages containing
> "kworker".
>
> I am using Debian GNU/Linux 10.5 with kernel 4.19.0-10-amd64 and
> libc6:amd64 2.28-10. If you need more information, I would be happy to
> provide it.

To find out what could be the cause, could you have a look at
https://www.kernel.org/doc/html/latest/core-api/workqueue.html#debugging
This could help in isolating why the kworker goes crazy.

Regards,
Salvatore
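The debugging section linked above essentially amounts to enabling the workqueue tracepoint and capturing the trace for a few seconds. A sketch of those steps, assuming debugfs is mounted at /sys/kernel/debug and a root shell:

```shell
# Sketch of the tracing steps from the workqueue debugging document
# (assumes debugfs is mounted at /sys/kernel/debug and root privileges).

# Enable the tracepoint that fires whenever a work item is queued:
echo workqueue:workqueue_queue_work > /sys/kernel/debug/tracing/set_event

# Capture the trace for a few seconds, then interrupt with Ctrl-C;
# out.txt then shows which work items are being queued most often:
cat /sys/kernel/debug/tracing/trace_pipe > out.txt
```

Counting the function names in out.txt (e.g. with `sort | uniq -c`) points at the offending work item.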