Re: disabling secondary CPU hangs / system fails to suspend with kernel 4.19+
Hi,

good news: starting with 5.0.6 suspend is working again.

Best regards
Thomas

On 29.03.19 at 10:22, Thomas Müller wrote:
> Hi,
>
> On 18.03.19 at 12:57, Peter Zijlstra wrote:
>> On Fri, Mar 15, 2019 at 09:21:02PM +0100, Thomas Müller wrote:
>>> I've just re-tested with runlevel 3.
>>> Not a real VGA console, but at least no Wayland or Gnome to interfere...
>>>
>>> `echo 0 > /sys/...` just blocks and no message whatsoever is visible in
>>> dmesg.
>>>
>>> I've executed `echo 0 > ...` in the background to keep my console
>>> functional and I can e.g. echo something to /dev/kmsg and it shows up,
>>> so reading/updating the log buffer appears to be working just fine.
>>
>> Damn.. Thanks for trying. I'll see if I can come up with something, but
>> I'm out of ideas for now :/
>>
> Any new ideas so far?
>
> For reference:
> I've just tested a vanilla 5.0.5 with localmodconfig (attached)... same
> behavior :(
>
> Best regards
> Thomas
Re: disabling secondary CPU hangs / system fails to suspend with kernel 4.19+
Hi,

On 15.03.19 at 13:15, Peter Zijlstra wrote:
> On Fri, Mar 15, 2019 at 12:41:00PM +0100, Thomas Müller wrote:
>
>>> What .config do you have?
>> The one packaged by Fedora. I've attached the one for 4.20.15 as reference.
>
> Thanks, I'll have a poke, see what, if anything, is different from the
> kernels I ran.
>
>>> And what, if anything do you see on the
>>> console when it goes funny?
>> Nothing unfortunately.
>> When trying to suspend the display immediately goes blank, the system
>> becomes unresponsive and the status LED within the power button starts
>> flashing rapidly (just like it does when the power cord is attached).
>>
>>> I think you wrote that hot-un-plug never completes? Is there anything in
>>> dmesg when it's stuck in:
>>>
>>> echo 0 > /sys/devices/system/cpu/cpu1/online
>>>
>>> ?
>> I've just tried that again and the system immediately froze.
>
> Hmm, I thought you said the system remained semi usable, just that reboot
> stopped working thereafter and it needed a power cycle.
>
>> `journalctl -f` was running in a second window but it had no chance to
>> output anything... :/
>
> Ah, you're using a GUI!
>
> Stop doing that ;-)

Easier said than done ;)

> See if you can use the VGA console; not a FB console or a DRM console,
> but the real ancient, proper text mode, VGA console.

I've just re-tested with runlevel 3.
Not a real VGA console, but at least no Wayland or Gnome to interfere...

`echo 0 > /sys/...` just blocks and no message whatsoever is visible in
dmesg.

I've executed `echo 0 > ...` in the background to keep my console
functional and I can e.g. echo something to /dev/kmsg and it shows up,
so reading/updating the log buffer appears to be working just fine.

A power cycle is still necessary to recover the system.

> Now, don't ask me how to do that, because I don't know, I've been
> running on pure serial console output for the past 10 years or so, heck
> I don't even have systemd.
>
> And you might have to do something like: dmesg -n8, to get the console
> to print the kernel messages or something.
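
For anyone who wants to repeat the test described above, here is a minimal sketch of it as a shell snippet. It is not the script that was actually run; the names CPU_SYSFS, KMSG, log_marker, offline_cpu_bg and OFFLINE_PID are made up for illustration, and the two path variables exist only so the functions can be tried against a scratch directory before touching real sysfs.

```shell
#!/bin/sh
# Sketch only: all names below are hypothetical helpers, not from the thread.
# The defaults point at the real kernel interfaces; override them for dry runs.
CPU_SYSFS="${CPU_SYSFS:-/sys/devices/system/cpu}"
KMSG="${KMSG:-/dev/kmsg}"

# Write a marker line into the kernel log buffer so later output is easy to find.
log_marker() {
    printf '%s\n' "$1" > "$KMSG"
}

# Request that the given CPU go offline, but do it in a background subshell
# so the calling console stays usable even if the write never returns.
# The PID of the background writer is left in OFFLINE_PID.
offline_cpu_bg() {
    cpu="$1"
    log_marker "hotplug-test: offlining cpu$cpu"
    ( echo 0 > "$CPU_SYSFS/cpu$cpu/online" ) &
    OFFLINE_PID=$!
}

# Possible use on the affected machine (as root, from a text console):
#   dmesg -n8                  # as suggested above, raise the console log level
#   offline_cpu_bg 1
#   sleep 5
#   kill -0 "$OFFLINE_PID" 2>/dev/null && echo "write is still blocked"
#   dmesg | tail
```

Keeping the write in the background is the point of the exercise: on the affected kernels the write blocks, so a foreground `echo` would take the console with it.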
disabling secondary CPU hangs / system fails to suspend with kernel 4.19+
Hi,

starting with kernel 4.19 my Lenovo ThinkPad X1 Carbon 5th no longer
properly suspends.

This is 100% reproducible and git bisect points to the following commit:

> [be45bf5395e0886a93fc816bbe41a008ec2e42e2] watchdog/softlockup: Fix
> cpu_stop_queue_work() double-queue bug
> be45bf5395e0886a93fc816bbe41a008ec2e42e2 is the first bad commit
> commit be45bf5395e0886a93fc816bbe41a008ec2e42e2
> Author: Peter Zijlstra
> Date: Fri Jul 13 12:42:08 2018 +0200
>
> watchdog/softlockup: Fix cpu_stop_queue_work() double-queue bug
>
> When scheduling is delayed for longer than the softlockup interrupt
> period it is possible to double-queue the cpu_stop_work, causing list
> corruption.
>
> Cure this by adding a completion to track the cpu_stop_work's
> progress.
>
> Reported-by: kernel test robot
> Tested-by: Rong Chen
> Signed-off-by: Peter Zijlstra (Intel)
> Cc: Linus Torvalds
> Cc: Peter Zijlstra
> Cc: Thomas Gleixner
> Fixes: 9cf57731b63e ("watchdog/softlockup: Replace "watchdog/%u" threads
> with cpu_stop_work")
> Link:
> http://lkml.kernel.org/r/20180713104208.gw2...@hirez.programming.kicks-ass.net
> Signed-off-by: Ingo Molnar
>
> :04 04 6aca2dbb84bc33fe442b18b3d0a135c27adff7b9
> 2710af12d32e4b98df07768716689b213bce45fc M kernel

The bugzilla reports have some additional details:
* https://bugzilla.redhat.com/show_bug.cgi?id=1671504
* https://bugzilla.kernel.org/show_bug.cgi?id=202679
* https://bugzilla.kernel.org/show_bug.cgi?id=202137

I'm happy to provide additional information or test a patch or two (as
long as it doesn't eat up my notebook ;))

Best regards
Thomas
[BUG] Lockup on boot when trying to bring up r8169 NIC
Hi,

I already sent this two days ago, but I have the feeling it was overlooked
or filtered because of a large attachment.

If I try to boot 2.6.21.6, 2.6.22.1 or 2.6.22-git8 the system completely
hangs when init tries to bring up my r8169-based NIC. Not even the keyboard
lights are working anymore.
If I unplug the network cable, boot continues just fine and everything
works as it should.
If I boot with the cable unplugged, the system also hangs and continues
after I plug in the cable.

Everything works fine with 2.6.20.15.

Configuration:
http://www.mathtm.de/config_2.6.20.15_fc6based
http://www.mathtm.de/config_2.6.21.6_f7based

Using a Fedora kernel (based on 2.6.21.5) I get the following kernel
message:

r8169: eth0: link down
BUG: soft lockup detected on CPU#0!
[c0451ea2] softlockup_tick+0xa5/0xb4
[c042e930] update_process_times+0x3b/0x5e
[c043d298] tick_sched_timer+0x57/0x9a
[c0439df5] hrtimer_interrupt+0x12b/0x1b6
[c043d241] tick_sched_timer+0x0/0x9a
[c0408534] timer_interrupt+0x2c/0x32
[c045210e] handle_IRQ_event+0x1a/0x3f
[c045354e] handle_level_irq+0x81/0xc7
[c04072c7] do_IRQ+0xb8/0xd1
[c04058ff] common_interrupt+0x23/0x28
[c0452105] handle_IRQ_event+0x11/0x3f
[c045354e] handle_level_irq+0x81/0xc7
[c04534cd] handle_level_irq+0x0/0xc7
[c04072bb] do_IRQ+0xac/0xd1
[c04058ff] common_interrupt+0x23/0x28
[c042b2dc] __do_softirq+0x54/0xba
[c04071b7] do_softirq+0x59/0xb1
[c04534cd] handle_level_irq+0x0/0xc7
[c042b194] irq_exit+0x38/0x6b
[c04072cc] do_IRQ+0xbd/0xd1
[c04058ff] common_interrupt+0x23/0x28
[c04200d8] find_busiest_group+0x264/0x4c5
[c0601895] _spin_unlock_irqrestore+0x8/0x9
[c042e863] __mod_timer+0xa1/0xab
[f8a4e1ec] rtl8169_open+0x12e/0x194 [r8169]
[c05a3054] dev_open+0x2b/0x62
[c05a1aa1] dev_change_flags+0x47/0xe4
[c05de45c] devinet_ioctl+0x250/0x56a
[c04e72c0] copy_to_user+0x3c/0x50
[c0598b47] sock_ioctl+0x19f/0x1be
[c05989a8] sock_ioctl+0x0/0x1be
[c047f713] do_ioctl+0x1f/0x62
[c047f99a] vfs_ioctl+0x244/0x256
[c047f9f8] sys_ioctl+0x4c/0x64
[c0404f70] syscall_call+0x7/0xb
===
r8169: eth0: link up

There already is a bugzilla entry at
http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=242572
I know, not everyone is a fan of bugzilla, but maybe someone wants to take
a look at what was discussed there.

Please CC me as I'm not subscribed to the list and don't hesitate to tell
me that I forgot to include some crucial information ;)

Regards,
Thomas
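
When reading a trace like the one in this report, it can help to pick out the frames that belong to a loadable module, since those carry a trailing "[module]" tag. The snippet below is only a sketch of how that could be done, assuming the trace has been saved one frame per line in the "[addr] symbol+0xOFF/0xSIZE" form; module_frames is a made-up helper name, not an existing tool.

```shell
#!/bin/sh
# Hypothetical helper: read a trace on stdin, one frame per line, and print
# "symbol module" for every frame that ends in a "[module]" tag.
# Core-kernel frames (no module tag) and non-frame lines are skipped.
module_frames() {
    sed -n 's|^ *\[[0-9a-f]*\] \([A-Za-z0-9_.]*\)+0x[0-9a-f]*/0x[0-9a-f]* \[\([^]]*\)\] *$|\1 \2|p'
}

# Possible use, with the trace stored in a file of your choosing:
#   module_frames < trace.txt
```

Run over the trace in this report, the only frame with a module tag is rtl8169_open in r8169, so that is the only line the helper would print.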