Bug#607709: XEN kernel crash on DNS process
On Thu, 09/02/2012 at 16:34 -0600, Jonathan Nieder wrote:
[...]
> Thanks and sorry for the slow response. No ideas come to mind, so some
> basic questions instead:
>
> - do you still use this system? If so, how are you coping?
> - what kernel and hypervisor do you use these days?
> - is it still reproducible with current squeeze or sid kernels and
>   hypervisors?
> - any other weird observations?

That was more than a year ago. I do not remember how I changed the system in order to avoid the problem. Right now I cannot access the system, so I cannot even check what packages (and versions) are there. If you are interested in the current configuration, I can check it.

Bye,
Giuseppe

--
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/1329132219.4755.39.camel@scarafaggio
Bug#607709: XEN kernel crash on DNS process
tags 607709 + unreproducible
quit

On Mon, Feb 13, 2012 at 12:23:39PM +0100, Giuseppe Sacco wrote:
> On Thu, 09/02/2012 at 16:34 -0600, Jonathan Nieder wrote:
> > - do you still use this system? If so, how are you coping?
> > - what kernel and hypervisor do you use these days?
> > - is it still reproducible with current squeeze or sid kernels and
> >   hypervisors?
> > - any other weird observations?
>
> That was more than a year ago. I do not remember how I changed the
> system in order to avoid the problem. Right now I cannot access the
> system, so I cannot even check what packages (and versions) are there.
> If you are interested in the current configuration, I can check it.

Yep, sorry we didn't get to this sooner. If you remember around when it was fixed and what kind of changes you tried in order to fix it, that would be the most helpful thing.

Marking as unreproducible to make it more obvious to people looking through the bug list that trying to reproduce it and reporting the (positive or negative) result would be very useful.

Jonathan

Archive: http://lists.debian.org/20120213183202.GA6375@burratino
Bug#607709: XEN kernel crash on DNS process
reassign 607709 src:linux-2.6 2.6.32-27
found 607709 linux-2.6/2.6.32-29
quit

Hi Giuseppe,

Giuseppe Sacco wrote:
> I confirm the problem is really XEN related. We ran more tests with a
> non-XEN kernel and never got the error reported here. Now we are going
> to run a XEN dom0 kernel again, updating to the latest kernel (from
> 2.6.32-27 to 2.6.32-29). I will report more information here as I
> collect it.
[...]
> kernel BUG at [...]/arch/x86/xen/enlighten.c:309!
> invalid opcode: [#1] SMP
> last sysfs file: /sys/devices/pci:00/:00:1d.1/usb3/3-1/3-1:1.0/bInterfaceClass
> Modules linked in: bridge stp xen_evtchn xenfs vboxnetadp vboxnetflt vboxdrv lp parport i5100_edac shpchp psmouse edac_core processor pci_hotplug dcdbas serio_raw acpi_processor button evdev ext3 jbd mbcache sd_mod crc_t10dif sg osst sr_mod dm_mod st cdrom usbhid hid mptsas ata_generic ehci_hcd mptspi mptscsih scsi_transport_sas uhci_hcd ata_piix usbcore libata nls_base tg3 libphy mptbase scsi_transport_spi scsi_mod thermal thermal_sys radeonfb fb_ddc i2c_algo_bit i2c_core
> Pid: 2153, comm: server Not tainted (2.6.32-5-xen-686 #1) PowerEdge T300
> EIP: 0061:[c1003b51] EFLAGS: 00010282 CPU: 3
> EIP is at set_aliased_prot+0x9e/0x10d
[...]
> Process server (pid: 2153, ti=f3ec8000 task=f13ddd80 task.ti=f3ec8000)
[...]
> Call Trace:
>  [c1003be2] ? xen_free_ldt+0x22/0x30
>  [c100b2ab] ? destroy_context+0x38/0x79
>  [c103564a] ? __mmdrop+0x1d/0x3a
>  [c1032456] ? finish_task_switch+0x76/0x95
>  [c128d759] ? schedule+0x762/0x7dc
>  [c1006048] ? xen_force_evtchn_callback+0xc/0x10
>  [c128e080] ? schedule_hrtimeout_range+0xa1/0xda
>  [c104d7e2] ? hrtimer_wakeup+0x0/0x18
>  [c104e13d] ? hrtimer_start_range_ns+0xf/0x13
>  [c10c6dda] ? poll_schedule_timeout+0x26/0x3e
>  [c10c772e] ? do_select+0x3e7/0x42d
>  [c10d60a8] ? __getblk+0x3c/0x2f3
>  [c10c7b1a] ? __pollwait+0x0/0xa5
>  [c10d8581] ? __bread+0xa/0x5d
>  [f865ff47] ? ext3_get_branch+0x5d/0xc4 [ext3]
[...]

Thanks and sorry for the slow response. No ideas come to mind, so some basic questions instead:

- do you still use this system? If so, how are you coping?
- what kernel and hypervisor do you use these days?
- is it still reproducible with current squeeze or sid kernels and hypervisors?
- any other weird observations?

Hope that helps,
Jonathan

Archive: http://lists.debian.org/20120209223444.GA3377@burratino
Bug#607709: XEN kernel crash on DNS process
Hi all,

I confirm the problem is really XEN related. We ran more tests with a non-XEN kernel and never got the error reported here. Now we are going to run a XEN dom0 kernel again, updating to the latest kernel (from 2.6.32-27 to 2.6.32-29). I will report more information here as I collect it.

The problem is: when the system crashes, the machine is still powered on but nobody can connect and log in. So, we are going to log in on the console and leave a shell open. When the kernel hangs, we'll try to issue the «xm dmesg» command.

Would setting up syslog on a remote machine help at all?

Thanks,
Giuseppe

Archive: http://lists.debian.org/1293537131.15990.25.ca...@scarafaggio
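One way to avoid depending on an open console shell is to save the hypervisor log at regular intervals, so that the last copy written before a hang is still on disk afterwards. The following is only a minimal sketch under stated assumptions: it relies on the `xm dmesg` command mentioned above being runnable by the calling user, and the function names, the output path and the interval are hypothetical choices made for illustration, not something used in this bug report.

```python
import os
import subprocess
import time


def run_xm_dmesg():
    """Run `xm dmesg` and return its output as text (assumes a Xen dom0 and root)."""
    result = subprocess.run(
        ["xm", "dmesg"], capture_output=True, text=True, check=True
    )
    return result.stdout


def save_snapshot(path, text):
    """Write text to path and force it to disk before returning."""
    with open(path, "w") as f:
        f.write(text)
        f.flush()
        os.fsync(f.fileno())  # make sure the data survives a later hang


def snapshot_loop(path, interval_s, rounds, reader=run_xm_dmesg, sleep=time.sleep):
    """Overwrite path with a fresh log snapshot `rounds` times.

    `reader` and `sleep` are injectable so the loop can be tried without Xen.
    """
    for _ in range(rounds):
        save_snapshot(path, reader())
        sleep(interval_s)


if __name__ == "__main__":
    # Hypothetical path and interval: one snapshot per minute for an hour.
    snapshot_loop("/var/log/xm-dmesg.last", 60, 60)
```

Note that this writes to the local disk only; it does not replace remote syslog, which would also keep the dom0 kernel messages if the local disk becomes unreachable.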
Bug#607709: XEN kernel crash on DNS process
Giuseppe Sacco giuse...@eppesuigoccas.homedns.org writes:
> Package: linux-image-2.6.32-5-xen-686
> Version: 2.6.32-27
> Severity: important

Are you running this as a dom0 or domU? You don't need to use a -xen- kernel as a domU.

Archive: http://lists.debian.org/84tyi0won3@sauna.l.org
Bug#607709: XEN kernel crash on DNS process
On Sun, 26/12/2010 at 17:16 +0200, Timo Juhani Lindfors wrote:
> Giuseppe Sacco giuse...@eppesuigoccas.homedns.org writes:
> > Package: linux-image-2.6.32-5-xen-686
> > Version: 2.6.32-27
> > Severity: important
>
> Are you running this as a dom0 or domU? You don't need to use a -xen-
> kernel as a domU.

This is a dom0 hosting a Windows machine.

Bye,
Giuseppe

Archive: http://lists.debian.org/1293400305.11821.1.ca...@scarafaggio
Bug#607709: XEN kernel crash on DNS process
Giuseppe Sacco giuse...@eppesuigoccas.homedns.org writes:
> This is dom0 hosting a windows machine.

Ok, then I'm afraid I can't help much. I only know about debugging domUs, not dom0s.

Archive: http://lists.debian.org/8462ugw5b6@sauna.l.org
Bug#607709: XEN kernel crash on DNS process
Hi all,

this is a second trace from the same machine. The blocked process isn't the DNS server anymore, so the problem does not seem to be related to any particular process. I collected three traces with three different processes.

Please note the problem is not XEN specific, since I can reproduce the problem with a non-XEN kernel (same kernel version).

Thank you very much for working on this subject.

Dec 21 08:34:10 atf-124 kernel: [ 64.093097] [ cut here ]
Dec 21 08:34:10 atf-124 kernel: [ 64.093209] kernel BUG at /build/buildd-linux-2.6_2.6.32-27-i386-c5N4Hf/linux-2.6-2.6.32/debian/build/source_i386_xen/arch/x86/xen/enlighten.c:309!
Dec 21 08:34:10 atf-124 kernel: [ 64.093456] invalid opcode: [#1] SMP
Dec 21 08:34:10 atf-124 kernel: [ 64.093618] last sysfs file: /sys/devices/pci:00/:00:1d.7/usb5/5-7/5-7.1/5-7.1:1.0/bInterfaceClass
Dec 21 08:34:10 atf-124 kernel: [ 64.093803] Modules linked in: bridge stp xen_evtchn xenfs vboxnetadp vboxnetflt vboxdrv shpchp lp processor i5100_edac pci_hotplug psmouse parport acpi_processor button dcdbas edac_core evdev serio_raw ext3 jbd mbcache sd_mod crc_t10dif sg dm_mod osst sr_mod cdrom st usbhid hid ata_generic mptspi mptsas mptscsih scsi_transport_spi thermal ehci_hcd thermal_sys tg3 mptbase libphy scsi_transport_sas ata_piix libata uhci_hcd scsi_mod usbcore nls_base radeonfb fb_ddc i2c_algo_bit i2c_core
Dec 21 08:34:10 atf-124 kernel: [ 64.097010]
Dec 21 08:34:10 atf-124 kernel: [ 64.097010] Pid: 2049, comm: event Not tainted (2.6.32-5-xen-686 #1) PowerEdge T300
Dec 21 08:34:10 atf-124 kernel: [ 64.097314] EIP: 0061:[c1003abc] EFLAGS: 00010282 CPU: 2
Dec 21 08:34:10 atf-124 kernel: [ 64.097314] EIP is at set_aliased_prot+0x9e/0x10d
Dec 21 08:34:10 atf-124 kernel: [ 64.097314] EAX: ffea EBX: f8c2c000 ECX: dac2c063 EDX: 8001
Dec 21 08:34:10 atf-124 kernel: [ 64.097314] ESI: EDI: 8000 EBP: 000cf400 ESP: f5e65a3c
Dec 21 08:34:10 atf-124 kernel: [ 64.097314] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0069
Dec 21 08:34:10 atf-124 kernel: [ 64.097314] Process event (pid: 2049, ti=f5e64000 task=f67ef2c0 task.ti=f5e64000)
Dec 21 08:34:10 atf-124 kernel: [ 64.097314] Stack:
Dec 21 08:34:10 atf-124 kernel: [ 64.097314] 8001 dac2c063 0163 f8c2c000 c2e52000 0001 0200 f8c2c000
Dec 21 08:34:10 atf-124 kernel: [ 64.098291] 0 2000 f6438cc0 c1003b4d f5fe2fc0 f67ef2c0 f5fe2fc0 c100b20b f5fe2fc0
Dec 21 08:34:10 atf-124 kernel: [ 64.098817] 0 f5fe2fc0 c103556e c103237c c145ee20 f5fe2fc0 c1db2a80
Dec 21 08:34:10 atf-124 kernel: [ 64.098817] Call Trace:
Dec 21 08:34:10 atf-124 kernel: [ 64.101438] [c1003b4d] ? xen_free_ldt+0x22/0x30
Dec 21 08:34:10 atf-124 kernel: [ 64.102616] [c100b20b] ? destroy_context+0x38/0x79
Dec 21 08:34:10 atf-124 kernel: [ 64.105163] [c103556e] ? __mmdrop+0x1d/0x3a
Dec 21 08:34:10 atf-124 kernel: [ 64.106067] [c103237c] ? finish_task_switch+0x76/0x95
Dec 21 08:34:10 atf-124 kernel: [ 64.108722] [c128d169] ? schedule+0x762/0x7dc
Dec 21 08:34:10 atf-124 kernel: [ 64.110296] [c1005fb4] ? xen_force_evtchn_callback+0xc/0x10
Dec 21 08:34:10 atf-124 kernel: [ 64.111807] [c128da90] ? schedule_hrtimeout_range+0xa1/0xda
Dec 21 08:34:10 atf-124 kernel: [ 64.114401] [c104d652] ? hrtimer_wakeup+0x0/0x18
Dec 21 08:34:10 atf-124 kernel: [ 64.115627] [c104dfad] ? hrtimer_start_range_ns+0xf/0x13
Dec 21 08:34:10 atf-124 kernel: [ 64.117920] [c10c6c7e] ? poll_schedule_timeout+0x26/0x3e
Dec 21 08:34:10 atf-124 kernel: [ 64.117920] [c10c75d2] ? do_select+0x3e7/0x42d
Dec 21 08:34:10 atf-124 kernel: [ 64.120657] [c10d5f4c] ? __getblk+0x3c/0x2f3
Dec 21 08:34:10 atf-124 kernel: [ 64.120657] [c10c79be] ? __pollwait+0x0/0xa5
Dec 21 08:34:10 atf-124 kernel: [ 64.120657] [c10d8425] ? __bread+0xa/0x5d
Dec 21 08:34:10 atf-124 kernel: [ 64.126411] [f8679f47] ? ext3_get_branch+0x5d/0xc4 [ext3]
Dec 21 08:34:10 atf-124 kernel: [ 64.127216] [f867a730] ? ext3_get_blocks_handle+0x8f/0x78a [ext3]
Dec 21 08:34:10 atf-124 kernel: [ 64.130117] [f867adf6] ? ext3_get_blocks_handle+0x755/0x78a [ext3]
Dec 21 08:34:10 atf-124 kernel: [ 64.131997] [c1005fb4] ? xen_force_evtchn_callback+0xc/0x10
Dec 21 08:34:10 atf-124 kernel: [ 64.132327] [c10066dc] ? check_events+0x8/0xc
Dec 21 08:34:10 atf-124 kernel: [ 64.135358] [c10d] ? ipc_lock+0x28/0x3b
Dec 21 08:34:10 atf-124 kernel: [ 64.136474] [c1102256] ? semctl_main+0x36a/0x391
Dec 21 08:34:10 atf-124 kernel: [ 64.137804] [c10066d3] ? xen_restore_fl_direct_end+0x0/0x1
Dec 21 08:34:10 atf-124 kernel: [ 64.140186] [c10b6ad8] ? kmem_cache_alloc+0x79/0xe5
Dec 21 08:34:10 atf-124 kernel: [ 64.140186] [f867aec8] ? ext3_get_block+0x9d/0xd1 [ext3]
Dec 21 08:34:10 atf-124 kernel: [ 64.140186] [c1005fb4] ? xen_force_evtchn_callback+0xc/0x10
Dec 21 08:34:10 atf-124 kernel: [ 64.145420] [c10066dc] ? check_events+0x8/0xc
Dec 21 08:34:10
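Because the crashing process differs between reports, it can help to check which call-trace frames the traces have in common. The sketch below is only an illustration: the helper names are hypothetical, the sample frames are copied from the two traces in this bug report (the `server` and `event` crashes), and the regular expression assumes the frame layout as it appears in this archive, with or without the angle brackets that kernel logs normally put around addresses.

```python
import re

# One call-trace frame, e.g. "[c1003b4d] ? xen_free_ldt+0x22/0x30".
# Syslog timestamps such as "[47450.222575]" and module tags such as
# "[ext3]" do not match because they are not "[hex] ? symbol+0x../0x..".
FRAME_RE = re.compile(r"\[<?([0-9a-f]+)>?\] \? (\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def call_trace_symbols(log_text):
    """Return the function names listed after 'Call Trace:', in order."""
    _, marker, trace = log_text.partition("Call Trace:")
    if not marker:
        return []
    return [m.group(2) for m in FRAME_RE.finditer(trace)]


def common_prefix(a, b):
    """Return the leading items that two lists share."""
    shared = []
    for x, y in zip(a, b):
        if x != y:
            break
        shared.append(x)
    return shared


# First frames of the trace from the "server" crash (Dec 20).
FIRST = """Call Trace:
[c1003b4d] ? xen_free_ldt+0x22/0x30
[c100b20b] ? destroy_context+0x38/0x79
[c103556e] ? __mmdrop+0x1d/0x3a
[c103237c] ? finish_task_switch+0x76/0x95
[c128d169] ? schedule+0x762/0x7dc
[c104dfc8] ? hrtimer_start_expires+0x17/0x1d
"""

# First frames of the trace from the "event" crash (Dec 21).
SECOND = """Call Trace:
[c1003b4d] ? xen_free_ldt+0x22/0x30
[c100b20b] ? destroy_context+0x38/0x79
[c103556e] ? __mmdrop+0x1d/0x3a
[c103237c] ? finish_task_switch+0x76/0x95
[c128d169] ? schedule+0x762/0x7dc
[c1005fb4] ? xen_force_evtchn_callback+0xc/0x10
"""

if __name__ == "__main__":
    print(common_prefix(call_trace_symbols(FIRST), call_trace_symbols(SECOND)))
    # prints ['xen_free_ldt', 'destroy_context', '__mmdrop', 'finish_task_switch', 'schedule']
```

For these samples the shared frames run from `xen_free_ldt` up to `schedule`, which fits the observation that the crash does not depend on which process happens to be running.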
Bug#607709: XEN kernel crash on DNS process
On Thu, 2010-12-23 at 10:54 +0100, Giuseppe Sacco wrote:
> Hi all,
> this is a second trace from the same machine. The process blocked
> isn't the DNS server anymore, so the problem seems not related to any
> process. I collected three traces with three different processes.
>
> Please note the problem is not XEN specific, since I may reproduce the
> problem with non XEN kernel (same kernel version).

Do you have an example of the trace from a non-Xen kernel? The traces in this bug report are very Xen specific indeed (a BUG_ON when a hypercall fails).

I expect that the failure on Xen would be accompanied by a log message from the hypervisor. Does xm dmesg show anything?

Ian.

--
Ian Campbell

Bizoos, n.: The millions of tiny individual bumps that make up a basketball. -- Rich Hall, Sniglets

Archive: http://lists.debian.org/1293099932.22419.1.ca...@zakaz.uk.xensource.com
Bug#607709: XEN kernel crash on DNS process
Package: linux-image-2.6.32-5-xen-686
Version: 2.6.32-27
Severity: important

Hi all,

for the past few days, my computer has not been accessible. The kernel crashes and kills my DNS server, so none of the applications can be reached. This is what I find in syslog:

Dec 20 22:14:17 atf-124 kernel: [47450.218613] [ cut here ]
Dec 20 22:14:17 atf-124 kernel: [47450.221366] kernel BUG at /build/buildd-linux-2.6_2.6.32-27-i386-c5N4Hf/linux-2.6-2.6.32/debian/build/source_i386_xen/arch/x86/xen/enlighten.c:309!
Dec 20 22:14:17 atf-124 kernel: [47450.222575] invalid opcode: [#1] SMP
Dec 20 22:14:17 atf-124 kernel: [47450.222575] last sysfs file: /sys/hypervisor/version/minor
Dec 20 22:14:17 atf-124 kernel: [47450.222575] Modules linked in: bridge stp xen_evtchn xenfs vboxnetadp vboxnetflt vboxdrv shpchp pci_hotplug evdev lp psmouse i5100_edac edac_core processor button parport dcdbas acpi_processor serio_raw ext3 jbd mbcache sd_mod crc_t10dif sg osst sr_mod dm_mod cdrom st usbhid hid ehci_hcd mptspi scsi_transport_spi mptsas ata_generic mptscsih mptbase ata_piix scsi_transport_sas uhci_hcd libata usbcore tg3 libphy nls_base scsi_mod thermal thermal_sys radeonfb fb_ddc i2c_algo_bit i2c_core
Dec 20 22:14:17 atf-124 kernel: [47450.222575]
Dec 20 22:14:17 atf-124 kernel: [47450.222575] Pid: 2135, comm: server Not tainted (2.6.32-5-xen-686 #1) PowerEdge T300
Dec 20 22:14:17 atf-124 kernel: [47450.222575] EIP: 0061:[c1003abc] EFLAGS: 00010282 CPU: 3
Dec 20 22:14:17 atf-124 kernel: [47450.222575] EIP is at set_aliased_prot+0x9e/0x10d
Dec 20 22:14:17 atf-124 kernel: [47450.222575] EAX: ffea EBX: f8c44000 ECX: bf802063 EDX: 8001
Dec 20 22:14:17 atf-124 kernel: [47450.222575] ESI: EDI: 8000 EBP: 0006a7d6 ESP: f5705a3c
Dec 20 22:14:17 atf-124 kernel: [47450.222575] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0069
Dec 20 22:14:17 atf-124 kernel: [47450.222575] Process server (pid: 2135, ti=f5704000 task=f5655500 task.ti=f5704000)
Dec 20 22:14:17 atf-124 kernel: [47450.222575] Stack:
Dec 20 22:14:17 atf-124 kernel: [47450.222575] 8001 bf802063 0163 f8c44000 c2e52000 0001 0200 f8c44000
Dec 20 22:14:17 atf-124 kernel: [47450.222575] 0 2000 f6439100 c1003b4d f34581c0 f5655500 f34581c0 c100b20b f34581c0
Dec 20 22:14:17 atf-124 kernel: [47450.222575] 0 f34581c0 c103556e c103237c c145ee20 f34581c0 c2341180
Dec 20 22:14:17 atf-124 kernel: [47450.222575] Call Trace:
Dec 20 22:14:17 atf-124 kernel: [47450.222575] [c1003b4d] ? xen_free_ldt+0x22/0x30
Dec 20 22:14:17 atf-124 kernel: [47450.222575] [c100b20b] ? destroy_context+0x38/0x79
Dec 20 22:14:17 atf-124 kernel: [47450.222575] [c103556e] ? __mmdrop+0x1d/0x3a
Dec 20 22:14:17 atf-124 kernel: [47450.222575] [c103237c] ? finish_task_switch+0x76/0x95
Dec 20 22:14:17 atf-124 kernel: [47450.222575] [c128d169] ? schedule+0x762/0x7dc
Dec 20 22:14:17 atf-124 kernel: [47450.222575] [c104dfc8] ? hrtimer_start_expires+0x17/0x1d
Dec 20 22:14:17 atf-124 kernel: [47450.222575] [c128da90] ? schedule_hrtimeout_range+0xa1/0xda
Dec 20 22:14:17 atf-124 kernel: [47450.222575] [c104d652] ? hrtimer_wakeup+0x0/0x18
Dec 20 22:14:17 atf-124 kernel: [47450.222575] [c104dfad] ? hrtimer_start_range_ns+0xf/0x13
Dec 20 22:14:17 atf-124 kernel: [47450.222575] [c10c6c7e] ? poll_schedule_timeout+0x26/0x3e
Dec 20 22:14:17 atf-124 kernel: [47450.222575] [c10c75d2] ? do_select+0x3e7/0x42d
Dec 20 22:14:17 atf-124 kernel: [47450.222575] [c10d5f4c] ? __getblk+0x3c/0x2f3
Dec 20 22:14:17 atf-124 kernel: [47450.222575] [c10c79be] ? __pollwait+0x0/0xa5
Dec 20 22:14:17 atf-124 kernel: [47450.222575] [c102ccea] ? enqueue_entity+0x82/0x129
Dec 20 22:14:17 atf-124 kernel: [47450.222575] [c1005fb4] ? xen_force_evtchn_callback+0xc/0x10
Dec 20 22:14:17 atf-124 kernel: [47450.222575] [c10066dc] ? check_events+0x8/0xc
Dec 20 22:14:17 atf-124 kernel: [47450.222575] [c10066d3] ? xen_restore_fl_direct_end+0x0/0x1
Dec 20 22:14:17 atf-124 kernel: [47450.222575] [c128e2f9] ? _spin_unlock_irqrestore+0xd/0xf
Dec 20 22:14:17 atf-124 kernel: [47450.222575] [c10341bf] ? try_to_wake_up+0x2ae/0x2ba
Dec 20 22:14:17 atf-124 kernel: [47450.222575] [c1005fb4] ? xen_force_evtchn_callback+0xc/0x10
Dec 20 22:14:17 atf-124 kernel: [47450.222575] [c10d] ? ipc_lock+0x28/0x3b
Dec 20 22:14:17 atf-124 kernel: [47450.222575] [c108e6b2] ? find_get_page+0x1f/0x85
Dec 20 22:14:17 atf-124 kernel: [47450.222575] [c10d5b0e] ? __find_get_block_slow+0xe1/0xea
Dec 20 22:14:17 atf-124 kernel: [47450.222575] [c1005fb4] ? xen_force_evtchn_callback+0xc/0x10
Dec 20 22:14:17 atf-124 kernel: [47450.222575] [c10066dc] ? check_events+0x8/0xc
Dec 20 22:14:17 atf-124 kernel: [47450.222575] [c10b6ad8] ? kmem_cache_alloc+0x79/0xe5
Dec 20 22:14:17 atf-124 kernel: [47450.222575] [c1005fb4] ? xen_force_evtchn_callback+0xc/0x10
Dec 20 22:14:17 atf-124 kernel: [47450.222575] [f85dba97] ? journal_stop+0x254/0x260 [jbd]
Dec 20