Bug#789770: linux-image-3.16.0-4-amd64: Dell R310 server (Xen 4.4 Dom0) periodically crashing after upgrade to Jessie from Wheezy
Package: src:linux
Version: 3.16.7-ckt11-1
Severity: critical
Justification: breaks the whole system

Dear Maintainer,

After upgrading this server, which is a Xen 4.4 Dom0, from wheezy to jessie, it crashes every 2-3 days, bringing down all its hosted VMs, with the following syslog dump:

Jun 24 13:24:09 servername kernel: [438520.690952] WARNING: CPU: 0 PID: 0 at /build/linux-QZaPpC/linux-3.16.7-ckt11/net/sched/sch_generic.c:264 dev_watchdog+0x236/0x240()
Jun 24 13:24:09 servername kernel: [438520.690959] NETDEV WATCHDOG: eth0 (bnx2): transmit queue 7 timed out
Jun 24 13:24:09 servername kernel: [438520.690963] Modules linked in: dm_snapshot dm_bufio binfmt_misc xt_physdev xen_netback xen_blkback xen_gntdev xen_evtchn xenfs xen_privcmd nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi bridge stp llc xt_tcpudp xt_recent xt_conntrack iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables x_tables joydev acpi_power_meter coretemp ipmi_devintf evdev ttm drm_kms_helper drm i2c_algo_bit dcdbas iTCO_wdt iTCO_vendor_support i2c_core pcspkr button ipmi_si tpm_tis ipmi_msghandler tpm lpc_ich i7core_edac mfd_core edac_core processor shpchp thermal_sys loop autofs4 ext4 crc16 mbcache jbd2 dm_mod sd_mod crc_t10dif crct10dif_generic sg ses crct10dif_common enclosure sr_mod cdrom usb_storage hid_generic usbhid hid ata_generic crc32c_intel ehci_pci ata_piix ehci_hcd libata megaraid_sas usbcore usb_common scsi_mod bnx2
Jun 24 13:24:09 servername kernel: [438520.691049] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.16.0-4-amd64 #1 Debian 3.16.7-ckt11-1
Jun 24 13:24:09 servername kernel: [438520.691052] Hardware name: Dell Inc. PowerEdge R310/05XKKK, BIOS 1.6.4 03/03/2011
Jun 24 13:24:09 servername kernel: [438520.691055] 0009 8150b405 88007c803e28 81067797
Jun 24 13:24:09 servername kernel: [438520.691059] 0007 88007c803e78 0008
Jun 24 13:24:09 servername kernel: [438520.691062] 880002444000 810677fc 81777fb8 8830
Jun 24 13:24:09 servername kernel: [438520.691066] Call Trace:
Jun 24 13:24:09 servername kernel: [438520.691069] [] ? dump_stack+0x41/0x51
Jun 24 13:24:09 servername kernel: [438520.691081] [] ? warn_slowpath_common+0x77/0x90
Jun 24 13:24:09 servername kernel: [438520.691085] [] ? warn_slowpath_fmt+0x4c/0x50
Jun 24 13:24:09 servername kernel: [438520.691092] [] ? dev_watchdog+0x236/0x240
Jun 24 13:24:09 servername kernel: [438520.691096] [] ? dev_graft_qdisc+0x70/0x70
Jun 24 13:24:09 servername kernel: [438520.691102] [] ? call_timer_fn+0x31/0x100
Jun 24 13:24:09 servername kernel: [438520.691108] [] ? dev_graft_qdisc+0x70/0x70
Jun 24 13:24:09 servername kernel: [438520.691113] [] ? run_timer_softirq+0x209/0x2f0
Jun 24 13:24:09 servername kernel: [438520.691117] [] ? __do_softirq+0xf1/0x290
Jun 24 13:24:09 servername kernel: [438520.691122] [] ? irq_exit+0x95/0xa0
Jun 24 13:24:09 servername kernel: [438520.691128] [] ? xen_evtchn_do_upcall+0x35/0x50
Jun 24 13:24:09 servername kernel: [438520.691135] [] ? xen_do_hypervisor_callback+0x1e/0x30
Jun 24 13:24:09 servername kernel: [438520.691137] [] ? xen_hypercall_sched_op+0xa/0x20
Jun 24 13:24:09 servername kernel: [438520.691145] [] ? xen_hypercall_sched_op+0xa/0x20
Jun 24 13:24:09 servername kernel: [438520.691150] [] ? xen_safe_halt+0xc/0x20
Jun 24 13:24:09 servername kernel: [438520.691154] [] ? default_idle+0x19/0xb0
Jun 24 13:24:09 servername kernel: [438520.691158] [] ? cpu_startup_entry+0x340/0x400
Jun 24 13:24:09 servername kernel: [438520.691161] [] ? start_kernel+0x492/0x49d
Jun 24 13:24:09 servername kernel: [438520.691163] [] ? set_init_arg+0x4e/0x4e
Jun 24 13:24:09 servername kernel: [438520.691166] [] ? xen_start_kernel+0x569/0x573
Jun 24 13:24:09 servername kernel: [438520.691169] ---[ end trace 05255fd39e925fd5 ]---
Jun 24 13:24:09 servername kernel: [438520.691173] bnx2 :04:00.0 eth0: <--- start FTQ dump --->
Jun 24 13:24:09 servername kernel: [438520.691206] bnx2 :04:00.0 eth0: RV2P_PFTQ_CTL 0001
Jun 24 13:24:09 servername kernel: [438520.691235] bnx2 :04:00.0 eth0: RV2P_TFTQ_CTL 0002
Jun 24 13:24:09 servername kernel: [438520.691265] bnx2 :04:00.0 eth0: RV2P_MFTQ_CTL 4000
Jun 24 13:24:09 servername kernel: [438520.691294] bnx2 :04:00.0 eth0: TBDR_FTQ_CTL 4002
Jun 24 13:24:09 servername kernel: [438520.691324] bnx2 :04:00.0 eth0: TDMA_FTQ_CTL 00010002
Jun 24 13:24:09 servername kernel: [438520.691353] bnx2 :04:00.0 eth0: TXP_FTQ_CTL 0001
Jun 24 13:24:09 servername kernel: [438520.691382] bnx2 :04:00.0 eth0: TXP_FTQ_CTL 0001
Jun 24 13:24:09 servername kernel: [438520.691412] bnx2 :04:00.0 eth0: TPAT_FTQ_CT
Bug#786936: xen-hypervisor-4.4-amd64: Upgrade dom0 from wheezy to jessie on Dell R610 results in dom0 unaccessible with xen_netback issue
Package: xen-hypervisor-4.4-amd64
Version: 4.4.1-9
Severity: critical
Justification: breaks the whole system

Dear Maintainer,

After upgrading the R610 server from Debian 7 to Debian 8, the dom0 becomes unresponsive via ssh after an hour or so, although the domUs remain accessible. Initially we thought it might be a disk space issue on / or /boot, so we increased those partition sizes, but that had no effect. We get the following trace in /var/log/syslog:

May 26 09:18:59 servername kernel: [31526.937788] BUG: unable to handle kernel paging request at c90013a4b158
May 26 09:18:59 servername kernel: [31526.937798] IP: [] xenvif_get_ethtool_stats+0x50/0x80 [xen_netback]
May 26 09:18:59 servername kernel: [31526.937807] PGD b243c067 PUD b243d067 PMD 8a56c067 PTE 0
May 26 09:18:59 servername kernel: [31526.937813] Oops: [#1] SMP
May 26 09:18:59 servername kernel: [31526.937817] Modules linked in: dm_snapshot dm_bufio binfmt_misc xt_tcpudp xt_physdev iptable_filter ip_tables x_tables xen_netback xen_blkback xen_gntdev xen_evtchn xenfs xen_privcmd nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi bridge stp llc nls_utf8 nls_cp437 vfat fat joydev intel_powerclamp coretemp crc32_pclmul ghash_clmulni_intel ttm evdev aesni_intel ipmi_devintf iTCO_wdt iTCO_vendor_support aes_x86_64 drm_kms_helper acpi_power_meter dcdbas lrw gf128mul glue_helper tpm_tis tpm drm i2c_algo_bit ablk_helper processor i2c_core lpc_ich ipmi_si ipmi_msghandler i7core_edac thermal_sys cryptd mfd_core button psmouse pcspkr serio_raw shpchp wmi edac_core loop autofs4 ext4 crc16 mbcache jbd2 dm_mod hid_generic usbhid hid sg sr_mod cdrom ses sd_mod enclosure ata_generic crc32c_intel lpfc crc_t10dif crct10dif_generic ehci_pci uhci_hcd crct10dif_pclmul ata_piix ehci_hcd scsi_transport_fc libata megaraid_sas scsi_tgt usbcore scsi_mod usb_common crct10dif_common bnx2
May 26 09:18:59 servername kernel: [31526.937917] CPU: 0 PID: 1311 Comm: snmpd Not tainted 3.16.0-4-amd64 #1 Debian 3.16.7-ckt9-3~deb8u1
May 26 09:18:59 servername kernel: [31526.937922] Hardware name: Dell Inc. PowerEdge R610/0F0XJ6, BIOS 6.4.0 07/23/2013
May 26 09:18:59 servername kernel: [31526.937927] task: 88008a86a250 ti: 880002b4c000 task.ti: 880002b4c000
May 26 09:18:59 servername kernel: [31526.937931] RIP: e030:[] [] xenvif_get_ethtool_stats+0x50/0x80 [xen_netback]
May 26 09:18:59 servername kernel: [31526.937939] RSP: e02b:880002b4fd70 EFLAGS: 00010283
May 26 09:18:59 servername kernel: [31526.937942] RAX: c90013a14f38 RBX: 0230f940 RCX: 92008ea28c88
May 26 09:18:59 servername kernel: [31526.937946] RDX: 88008ecadc00 RSI: c90013a4b190 RDI: 88008da7c000
May 26 09:18:59 servername kernel: [31526.937949] RBP: 880002b4fe10 R08: a06827e0 R09: 0006
May 26 09:18:59 servername kernel: [31526.937953] R10: 0010ebb8 R11: 0246 R12: 0005
May 26 09:18:59 servername kernel: [31526.937957] R13: 88008da7c000 R14: a0682640 R15: 88008ecadc00
May 26 09:18:59 servername kernel: [31526.937965] FS: 7f93bcc9e700() GS:8800b2a0() knlGS:
May 26 09:18:59 servername kernel: [31526.937969] CS: e033 DS: ES: CR0: 8005003b
May 26 09:18:59 servername kernel: [31526.937973] CR2: c90013a4b158 CR3: 899ff000 CR4: 2660
May 26 09:18:59 servername kernel: [31526.937977] Stack:
May 26 09:18:59 servername kernel: [31526.937979] 814225f1 000400114813 7fff3fff32a8
May 26 09:18:59 servername kernel: [31526.937985] 880002b4ff18 001d3fff32a0 880002b4fde0 814039a6
May 26 09:18:59 servername kernel: [31526.937990] 0005001d 8805 81420455 7fff3fff3280
May 26 09:18:59 servername kernel: [31526.937995] Call Trace:
May 26 09:18:59 servername kernel: [31526.938003] [] ? dev_ethtool+0x921/0x1ac0
May 26 09:18:59 servername kernel: [31526.938009] [] ? ___sys_recvmsg+0x136/0x2a0
May 26 09:18:59 servername kernel: [31526.938014] [] ? netdev_run_todo+0x55/0x2f0
May 26 09:18:59 servername kernel: [31526.938020] [] ? dev_ioctl+0x19f/0x590
May 26 09:18:59 servername kernel: [31526.938026] [] ? kfree+0x118/0x220
May 26 09:18:59 servername kernel: [31526.938033] [] ? fsnotify_clear_marks_by_inode+0x2a/0x110
May 26 09:18:59 servername kernel: [31526.938038] [] ? sock_do_ioctl+0x3d/0x50
May 26 09:18:59 servername kernel: [31526.938043] [] ? sock_ioctl+0x1e8/0x2c0
May 26 09:18:59 servername kernel: [31526.938048] [] ? do_vfs_ioctl+0x2cf/0x4b0
May 26 09:18:59 servername kernel: [31526.938054] [] ? task_work_run+0x9c/0xd0
May 26 09:18:59 servername kernel: [31526.938059] [] ? SyS_ioctl+0x81/0xa0
May 26 09:18:59 servername kernel: [31526.938065] [] ? int_sign
Bug#697585: We've also experienced this issue.
Just rebooting for the second time now!