Bug#838491: linux-image-4.7.0-0.bpo.1-amd64-unsigned: extreme load averages and over 2000 kworker threads
On Monday 10 October 2016 16:16:30 Ben Hutchings wrote:
>
> I think this might be fixed by "mm: memcontrol: use special workqueue
> for creating per-memcg caches" included in version 4.7.6-1. Let us
> know whether that does it.

It did not happen again with linux-image-4.7.0-1-amd64-unsigned (4.7.6-1)
within the last day.
I guess you can close the BUG.

Thanks!

regards
Markus Köberl

-- 
Markus Koeberl
Graz University of Technology
Signal Processing and Speech Communication Laboratory
E-mail: markus.koeb...@tugraz.at
Bug#838491: linux-image-4.7.0-0.bpo.1-amd64-unsigned: extreme load averages and over 2000 kworker threads
On Mon, 2016-10-10 at 15:41 +0200, Markus Koeberl wrote:
> Package: src:linux
> Followup-For: Bug #838491
>
> Dear Maintainer,
>
>    * What led up to the situation?
>
> upgrade kernel and systemd to the version provided in
> jessie-backports
>
>    * What exactly did you do (or not do) that was effective (or
>      ineffective)?
>
> during normal usage (slurm cluster node):
>
> load average: 1290.54, 513.19, 466.29
>
> the load 5 peaks reach 2000
>
> ps aux | grep kworker | wc -l
> 4188
[...]

I think this might be fixed by "mm: memcontrol: use special workqueue
for creating per-memcg caches" included in version 4.7.6-1. Let us
know whether that does it.

Ben.

-- 
Ben Hutchings
Unix is many things to many people, but it's never been everything to anybody.
Bug#838491: linux-image-4.7.0-0.bpo.1-amd64-unsigned: extreme load averages and over 2000 kworker threads
Package: src:linux
Followup-For: Bug #838491

Dear Maintainer,

   * What led up to the situation?

upgrade kernel and systemd to the version provided in jessie-backports

   * What exactly did you do (or not do) that was effective (or
     ineffective)?

during normal usage (slurm cluster node):

load average: 1290.54, 513.19, 466.29

the load 5 peaks reach 2000

ps aux | grep kworker | wc -l
4188

I followed the Debugging instructions of
https://raw.githubusercontent.com/torvalds/linux/master/Documentation/workqueue.txt

echo workqueue:workqueue_queue_work > /sys/kernel/debug/tracing/set_event
cat /sys/kernel/debug/tracing/trace_pipe > out.txt

after a few seconds:

cat out.txt | awk '{print $8}' | sort | uniq -c | sort -n
      1 function=do_cache_clean
      1 function=pcpu_balance_workfn
      1 function=xfs_eofblocks_worker
      2 function=neigh_periodic_work
      6 function=xfs_reclaim_worker
      6 function=xlog_cil_push_work
      8 function=disk_events_workfn
      8 function=igb_watchdog_task
     12 function=push_to_pool
     13 function=blk_timeout_work
     15 function=vmstat_shepherd
     22 function=xfs_end_io
     27 function=key_garbage_collector
     27 function=lru_add_drain_per_cpu
     34 function=delayed_fput
     38 function=scsi_requeue_run_queue
     39 function=blk_delay_work
     40 function=cgroup_pidlist_destroy_work_fn
     56 function=flush_to_ldisc
     64 function=cache_reap
     77 function=wb_workfn
    101 function=os_execute_work_item
    131 function=css_killed_work_fn
    142 function=xfs_buf_ioend_work
    156 function=vmstat_update
    162 function=call_usermodehelper_exec_work
    162 function=cgroup_release_agent
    409 function=vmpressure_work_fn
    497 function=css_release_work_fn
    500 function=css_free_work_fn
  47931 function=memcg_kmem_cache_create_func

I found https://bugzilla.kernel.org/show_bug.cgi?id=172981, which seems to be
the same problem.
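For anyone repeating the measurement above: the `awk '{print $8}'` step relies on the `function=` field being the eighth whitespace-separated column, which may not hold on every kernel or when a task name contains a space. A minimal sketch that selects the field by name instead is shown below. The helper name `count_work_functions` is made up for illustration, and the input is assumed to be a file captured from `trace_pipe` as in the report.

```shell
#!/bin/sh
# Sketch only: count queued work items per work function in a saved trace.
# count_work_functions is a hypothetical helper name, not part of any tool.
count_work_functions() {
    # $1: file captured from trace_pipe (for example out.txt)
    awk '{
        # Pick the first field that starts with "function=",
        # wherever it sits on the line.
        for (i = 1; i <= NF; i++)
            if ($i ~ /^function=/) { print $i; break }
    }' "$1" | sort | uniq -c | sort -n
}
```

Running `count_work_functions out.txt | tail -n 5` would then list the five most frequently queued work functions, with the busiest one last.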
-- Package-specific info:
** Version:
Linux version 4.7.0-0.bpo.1-amd64 (debian-kernel@lists.debian.org) (gcc version 4.9.2 (Debian 4.9.2-10) ) #1 SMP Debian 4.7.5-1~bpo8+2 (2016-10-01)

** Command line:
BOOT_IMAGE=/vmlinuz-4.7.0-0.bpo.1-amd64 root=UUID=d3b74f44-0f5e-4ba1-9606-ad42b76e5918 ro cgroup_enable=memory swapaccount=1 elevator=deadline quiet nomodeset nouveau.modeset=0

** Tainted: POE (12289)
 * Proprietary module has been loaded.
 * Out-of-tree module has been loaded.
 * Unsigned module has been loaded.

** Model information
sys_vendor: Supermicro
product_name: X10SRA
product_version: 0123456789
chassis_vendor: Supermicro
chassis_version: 0123456789
bios_vendor: American Megatrends Inc.
bios_version: 2.0
board_vendor: Supermicro
board_name: X10SRA
board_version: 1.01

** Loaded modules:
8021q(E) garp(E) mrp(E) stp(E) llc(E) nvidia_drm(POE) nvidia_modeset(POE)
nvidia(POE) drm_kms_helper(E) drm(E) openafs(POE) nfsd(E) auth_rpcgss(E)
nfs_acl(E) nfs(E) lockd(E) grace(E) fscache(E) sunrpc(E) intel_rapl(E)
sb_edac(E) edac_core(E) x86_pkg_temp_thermal(E) intel_powerclamp(E)
coretemp(E) xfs(E) libcrc32c(E) snd_hda_codec_hdmi(E) iTCO_wdt(E)
iTCO_vendor_support(E) mxm_wmi(E) evdev(E) kvm_intel(E) kvm(E) irqbypass(E)
crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) hmac(E) drbg(E)
ansi_cprng(E) aesni_intel(E) aes_x86_64(E) lrw(E) gf128mul(E) glue_helper(E)
ablk_helper(E) cryptd(E) pcspkr(E) serio_raw(E) snd_hda_codec_realtek(E)
snd_hda_codec_generic(E) snd_hda_intel(E) snd_hda_codec(E) snd_hda_core(E)
snd_hwdep(E) snd_pcm(E) snd_timer(E) snd(E) soundcore(E) lpc_ich(E)
mfd_core(E) sg(E) i2c_i801(E) shpchp(E) ipmi_msghandler(E) wmi(E)
acpi_power_meter(E) tpm_tis(E) tpm(E) button(E) usbhid(E) hid(E) fuse(E)
autofs4(E) ext4(E) crc16(E) jbd2(E) crc32c_generic(E) mbcache(E) dm_mod(E)
sr_mod(E) cdrom(E) sd_mod(E) crc32c_intel(E) psmouse(E) ahci(E) igb(E)
libahci(E) ehci_pci(E) i2c_algo_bit(E) ehci_hcd(E) dca(E) ptp(E) pps_core(E)
xhci_pci(E) libata(E) xhci_hcd(E) usbcore(E) scsi_mod(E) usb_common(E)
fjes(E)

** PCI devices:
00:00.0 Host bridge [0600]: Intel Corporation Haswell-E DMI2 [8086:2f00] (rev 02)
	Subsystem: Super Micro Computer Inc Device [15d9:0857]
	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR-

00:01.0 PCI bridge [0604]: Intel Corporation Haswell-E PCI Express Root Port 1 [8086:2f02] (rev 02) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities:
	Kernel driver in use: pcieport

00:03.0 PCI bridge [0604]: Intel Corporation Haswell-E PCI Express Root Port 3 [8086:2f08] (rev 02) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle-