Bug#700333: Stack trace
> I merged a slightly better fix, you all were on cc. It's going into 3.10 and it's tagged stable, so it will show up in stable kernels soon.

Thanks for the fix! But where did you post it - on LKML? (I didn't see it because I'm not subscribed to LKML?)

-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/9eab1d6a826bbd58d0b074f045a1c...@yourcmc.ru
Bug#700333: Stack trace
> When you do a suspend/resume cycle.

OK, yes, I've found it there.

> The bug says
>
>   The photo shows a BUG in hrtimer_interrupt() after making the hibernation image and while resuming the non-boot CPUs.
>
> so I'm guessing with Thomas' patch it suspends fine now?

Yeah, now I'm using a patched kernel and it's OK. So, does it mean the problem is fixed by this patch or it's just confirmed and should be fixed by another one?
Bug#700333: Stack trace
> Looks like we can't do anything about that in the HPET code itself. Vitaliy, could you try that patch?

Thanks, I tried it several days ago (and I'm still using a patched kernel :)) - the box survives. But at which moment should I check for "Spurious interrupt" in dmesg?
Bug#700333: Stack trace
Stack trace picture is here: http://vmx.yourcmc.ru/var/pics/IMG_20130306_141045.jpg

Vitaliy reported that his system crashes when suspending to disk. This was a regression from 3.2 to 3.7, and remains in 3.8. Some details of this system are in the bug log at http://bugs.debian.org/700333.

The photo shows a BUG in hrtimer_interrupt() after making the hibernation image and while resuming the non-boot CPUs. The HPET interrupt handler was called immediately after it was registered for CPU 2 (?), before the corresponding clock_event_device was registered.

Seems like an obvious race condition, but then shouldn't the HPET have been stopped while the CPU was previously offlined? And it's strange that this system apparently hits the race quite reliably.

Anyone?
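The ordering problem described above can be pictured with a small toy model. This is only a sketch under stated assumptions: the class and function names below are hypothetical and do not correspond to kernel code. It shows just the sequence from the report, where the handler runs after it is installed but before the clock event device has been registered.

```python
class ToyClockEventDevice:
    """Hypothetical stand-in for a per-CPU clock event device."""

    def __init__(self):
        # Stays None until the device has been registered.
        self.event_handler = None

    def register(self, handler):
        self.event_handler = handler


def toy_hpet_interrupt(dev):
    """Toy interrupt handler: fails if the device has no event handler yet."""
    if dev.event_handler is None:
        raise RuntimeError("interrupt arrived before device registration")
    return dev.event_handler()


def bring_up_cpu(irq_fires_early):
    """Model CPU bring-up: the handler is installed first, the device second."""
    dev = ToyClockEventDevice()
    if irq_fires_early:
        # The interrupt fires in the window between the two steps.
        return toy_hpet_interrupt(dev)
    dev.register(lambda: "tick handled")
    return toy_hpet_interrupt(dev)
```

In this model `bring_up_cpu(False)` completes normally, while `bring_up_cpu(True)` raises, which mirrors the reported BUG only loosely.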
Bug#700333: Stack trace
Hi Ben! Did the stack help you to identify something? Enabling non-boot CPUs seems suspicious to me - does that mean instead of writing an image to disk and hibernating it's trying to resume?
Bug#700333: Stack trace
> No, but I think this kernel parameter will help:
>
>   pause_on_oops=
>       Halt all CPUs after the first oops has been printed for the specified number of seconds. This is to be used if your oopses keep scrolling off the screen.
>
> (How have I not noticed this in all the years I've been crashing kernels?!)

Thanks, it helped :) By the way, this crash happens with init=/bin/bash

Stack trace picture is here: http://vmx.yourcmc.ru/var/pics/IMG_20130306_141045.jpg
Bug#700333: Stack trace
Hello

I've booted with no_console_suspend and got the stack trace, however it's from 3.8-aptosid kernel. The problem with 3.8 is the same as with 3.7.

Can someone please help me - what does this stack mean?

Kernel panic - not syncing: Fatal exception in interrupt
[ cut here ]
WARNING: at /tmp/buildd/linux-aptosid-3.8/debian/build/source_amd64_none/arch/kernel/smp.c:123 update_process_times+0x55/0x61()
Hardware name: Studio XPS 1645
Modules linked in: dm_mirror dm_region_hash dm_log dm_mod ext4 crc16 jbd2 mbcache sd_mod crc_t10dif thermal ahci libahci libata scsi_mod fan
Pid: 17, comm: kworker/1:0 Tainted: G D 3.8-1.slh.2-aptosid-amd64 #1
Call Trace:
 IRQ
 warn_slowpath_common+0x76/0x8a
 update_process_times+0x55/0x61
 tick_periodic+0x60/0x6b
 tick_handle_periodic+0x18/0x52
 smp_apic_timer_interrupt+0x6e/0x81
 apic_timer_interrupt+0x6d/0x80
 up+0xc/0x35
 panic+0x18b/0x1c7
 panic+0xfd/0x1c7
 oops_end+0x9c/0xa9
 do_invalid_op+0x87/0x91
 hrtimer_interrupt+0x24/0x1a4
 load_balance+0xc3/0x62a
 run_posix_cpu_timers+0x25/0x57a
 invalid_op+0x1e/0x30
 request_threaded_irq+0x84/0xf5
 hrtimer_get_next_event+0x92/0x92
 hrtimer_interrupt+0x24/0x1a4
 tick_notify+0x216/0x378
 hpet_interrupt_handler+0x23/0x2b
 request_threaded_irq+0x84/0xf5
 handle_irq_event_percpu+0x24/0x124
 handle_irq_event+0x37/0x57
 handle_edge_irq+0x98/0xbb
 handle_irq+0x15/0x1d
 do_IRQ+0x41/0x97
 common_interrupt+0x6d/0x6d
 request_threaded_irq+0x84/0xf5
 vsnprintf+0x187/0x439
 vsnprintf+0x70/0x439
 snprintf+0x39/0x3e
 register_handler_proc+0xd8/0x114
 __setup_irq+0x334/0x3d4
 hpet_set_periodic_freq+0x5f/0x5f
 request_threaded_irq+0xba/0xf5
 hpet_work+0xe7/0x1a6
 process_one_work+0x15d/0x252
 worker_thread+0x117/0x1b2
 rescuer_thread+0x187/0x187
 kthread+0x81/0x89
 __kthread_parkme+0x5b/0x5b
 ret_from_fork+0x7c/0xb0
 __kthread_parkme+0x5b/0x5b
---[ end trace e6f760295bda327e ]---
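Each call trace entry above has the form symbol+0xOFFSET/0xSIZE, that is, the offset into the function and the function's total size, both in hex. As a rough aid for reading such traces, here is a minimal sketch that splits entries into those three parts. The regular expression and the function name `parse_frames` are assumptions made for illustration, not part of any kernel tooling.

```python
import re

# Matches entries such as "hrtimer_interrupt+0x24/0x1a4".
FRAME_RE = re.compile(
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_frames(text):
    """Return (symbol, offset, size) tuples for every trace entry in text."""
    return [
        (m["symbol"], int(m["offset"], 16), int(m["size"], 16))
        for m in FRAME_RE.finditer(text)
    ]


# Two entries copied from the trace above.
sample = "hpet_interrupt_handler+0x23/0x2b hrtimer_interrupt+0x24/0x1a4"
frames = parse_frames(sample)
```

Here `frames` holds one tuple per entry, with the offset and size converted to integers.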
Bug#700333: Stack trace
> It means nothing very much. How about the stack trace *before* this line:

> The problem is that the maximum available VESA mode is 1400x1050 on my laptop and the stack is very long, and obviously I can't scroll it after a kernel panic :-) How can I get to previous lines of it? :-)

> There is netconsole: https://www.kernel.org/doc/Documentation/networking/netconsole.txt
> Although that might not work while suspending. Serial console would probably work if the computer has a serial port. If neither of those works then you might be able to use a video recording and freeze-frame.

Yeah, the netconsole doesn't work during suspend - I've just checked, the last line it prints is

  Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.

However, the first time I tried to use netconsole the suspend surprisingly worked with 3.8 :-) The second time the crash came back. So it seems the bug also isn't 100% reproducible.

The computer has no serial port. And the video is also not an option - I've tried to film it with a 60fps ContourHD, but it seems the stack trace is printed very fast.

It would be good to have some delay after printing each line of the stack trace in the kernel - is there such an option?
Bug#700333: Anyone?
Anyone? The bug still persists in 3.7.8-1~experimental.1.