[Kernel-packages] [Bug 1315736] Re: [Dell PowerEdge R720] Machine Check Exception
[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

** Changed in: linux (Ubuntu)
       Status: Incomplete => Expired

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1315736

Title:
  [Dell PowerEdge R720] Machine Check Exception

Status in “linux” package in Ubuntu:
  Expired

Bug description:
  Dell PowerEdge 720 on ubuntu 14.04 shows MCE errors on dmesg. Dell support instructed to run DSET and BIOS hardware diagnostics. Neither of the tools showed any errors. Dell support said that if there was a hardware error it would have been shown on Dell logs and the probable reason for the dmesg log is a bug in ubuntu kernel MCE reporting.

  So, is it that following dmesg is because of a kernel bug in ubuntu 14.04 server?

  [11562.171040] Please check user daemon is running.
  [94953.306404] sbridge: HANDLING MCE MEMORY ERROR
  [94953.306415] CPU 1: Machine Check Exception: 0 Bank 9: 8c4b000800c0
  [94953.306416] TSC 0 ADDR 2dfa0e1000 MISC 9800080168c PROCESSOR 0:306e4 TIME 1399142359 SOCKET 1 APIC 20
  [94953.306422] sbridge: HANDLING MCE MEMORY ERROR
  [94953.306423] CPU 1: Machine Check Exception: 0 Bank 10: 8c5800c1
  [94953.306424] TSC 0 ADDR 2dfa0e1000 MISC 900208c PROCESSOR 0:306e4 TIME 1399142359 SOCKET 1 APIC 20
  [94953.532217] EDAC MC1: 1 CE memory scrubbing error on CPU_SrcID#1_Channel#0_DIMM#0 (channel:0 slot:0 page:0x2dfa0e1 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c0 socket:1 channel_mask:3 rank:0)
  [94953.532226] EDAC MC1: 1 CE memory scrubbing error on CPU_SrcID#1_Channel#1_DIMM#0 (channel:1 slot:0 page:0x2dfa0e1 offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c1 socket:1 channel_mask:3 rank:0)
  ---
  AlsaDevices:
   total 0
   crw-rw 1 root audio 116, 1 touko 2 19:15 seq
   crw-rw 1 root audio 116, 33 touko 2 19:15 timer
  AplayDevices: Error: [Errno 2] No such file or directory
  ApportVersion: 2.14.1-0ubuntu3
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  CRDA: Error: [Errno 2] No such file or directory
  CurrentDmesg:
   Error: command ['sh', '-c', 'dmesg | comm -13 --nocheck-order /var/log/dmesg -'] failed with exit code 1: comm: /var/log/dmesg: Permission denied
   dmesg: write failed: Broken pipe
  DistroRelease: Ubuntu 14.04
  InstallationDate: Installed on 2014-02-26 (66 days ago)
  InstallationMedia: Ubuntu-Server 14.04 LTS "Trusty Tahr" - Alpha amd64 (20140219)
  MachineType: Dell Inc. PowerEdge R720
  Package: linux (not installed)
  PciMultimedia:
  ProcFB: 0 VESA VGA
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.13.0-24-generic root=UUID=c03eb237-955a-4dee-bba1-deded53df372 ro
  ProcVersionSignature: Ubuntu 3.13.0-24.46-generic 3.13.9
  RfKill: Error: [Errno 2] No such file or directory
  Tags: trusty
  Uname: Linux 3.13.0-24-generic x86_64
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups:
  WifiSyslog:
  _MarkForUpload: True
  dmi.bios.date: 01/16/2014
  dmi.bios.vendor: Dell Inc.
  dmi.bios.version: 2.2.2
  dmi.board.name: 0DCWD1
  dmi.board.vendor: Dell Inc.
  dmi.board.version: A01
  dmi.chassis.type: 23
  dmi.chassis.vendor: Dell Inc.
  dmi.modalias: dmi:bvnDellInc.:bvr2.2.2:bd01/16/2014:svnDellInc.:pnPowerEdgeR720:pvr:rvnDellInc.:rn0DCWD1:rvrA01:cvnDellInc.:ct23:cvr:
  dmi.product.name: PowerEdge R720
  dmi.sys.vendor: Dell Inc.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1315736/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
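For anyone trying to read the dmesg excerpt in the bug description, here is a minimal Python sketch of how the MCE and EDAC lines relate to each other. It is not part of the original report. It assumes 4 KiB pages (the x86_64 default) and that the TIME field is seconds since the Unix epoch; the helper names parse_mce and parse_edac are made up for illustration.

```python
import re
from datetime import datetime, timezone

# Sample lines copied from the dmesg excerpt in the bug description.
MCE_LINE = ("TSC 0 ADDR 2dfa0e1000 MISC 9800080168c PROCESSOR 0:306e4 "
            "TIME 1399142359 SOCKET 1 APIC 20")
EDAC_LINE = ("EDAC MC1: 1 CE memory scrubbing error on "
             "CPU_SrcID#1_Channel#0_DIMM#0 (channel:0 slot:0 page:0x2dfa0e1 "
             "offset:0x0 grain:32 syndrome:0x0 - area:DRAM "
             "err_code:0008:00c0 socket:1 channel_mask:3 rank:0)")

PAGE_SHIFT = 12  # assumption: 4 KiB pages


def parse_mce(line):
    """Return (physical address, UTC timestamp) from an MCE detail line."""
    match = re.search(r"ADDR ([0-9a-f]+) .*TIME (\d+)", line)
    if match is None:
        raise ValueError("not an MCE detail line")
    addr = int(match.group(1), 16)
    # Assumption: TIME is seconds since the Unix epoch.
    when = datetime.fromtimestamp(int(match.group(2)), tz=timezone.utc)
    return addr, when


def parse_edac(line):
    """Return the physical address implied by an EDAC page/offset pair."""
    match = re.search(r"page:0x([0-9a-f]+) offset:0x([0-9a-f]+)", line)
    if match is None:
        raise ValueError("not an EDAC error line")
    page = int(match.group(1), 16)
    offset = int(match.group(2), 16)
    return (page << PAGE_SHIFT) | offset


if __name__ == "__main__":
    mce_addr, mce_time = parse_mce(MCE_LINE)
    edac_addr = parse_edac(EDAC_LINE)
    print(hex(mce_addr), hex(edac_addr), mce_time.isoformat())
```

Under those assumptions, the EDAC page and offset point at the same physical address as the MCE ADDR field, and the "CE" in the EDAC lines marks the event as a corrected error. That alone does not settle whether the cause is hardware or kernel reporting.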
[Kernel-packages] [Bug 1315736] Re: [Dell PowerEdge R720] Machine Check Exception
km, please file a new report against that crash report, as manually attaching it does not allow Apport to process it, delaying your issue. For more on this, please see https://wiki.ubuntu.com/ReportingBugs .

** Attachment removed: "ubuntu bug linux report"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1315736/+attachment/4150051/+files/apport.linux-image-3.13.0-29-generic.v61kdeds.apport
[Kernel-packages] [Bug 1315736] Re: [Dell PowerEdge R720] Machine Check Exception
Attached is the apport report.

Thanks!

** Attachment added: "ubuntu bug linux report"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1315736/+attachment/4150051/+files/apport.linux-image-3.13.0-29-generic.v61kdeds.apport
[Kernel-packages] [Bug 1315736] Re: [Dell PowerEdge R720] Machine Check Exception
km, thank you for your comment.

So that your hardware and problem may be tracked, could you please file a new report with Ubuntu by executing the following in a terminal while booted into the default Ubuntu kernel (not a mainline one):

ubuntu-bug linux

For more on this, please read the official Ubuntu documentation:
Ubuntu Bug Control and Ubuntu Bug Squad: https://wiki.ubuntu.com/Bugs/BestPractices#X.2BAC8-Reporting.Focus_on_One_Issue
Ubuntu Kernel Team: https://wiki.ubuntu.com/KernelTeam/KernelTeamBugPolicies#Filing_Kernel_Bug_reports
Ubuntu Community: https://help.ubuntu.com/community/ReportingBugs#Bug_reporting_etiquette

When opening up the new report, please feel free to subscribe me to it.

Thank you for your understanding.

Helpful bug reporting tips: https://wiki.ubuntu.com/ReportingBugs

** Changed in: linux (Ubuntu)
       Status: Confirmed => Incomplete
[Kernel-packages] [Bug 1315736] Re: [Dell PowerEdge R720] Machine Check Exception
Status changed to 'Confirmed' because the bug affects multiple users.

** Changed in: linux (Ubuntu)
       Status: New => Confirmed
[Kernel-packages] [Bug 1315736] Re: [Dell PowerEdge R720] Machine Check Exception
This is a duplicate of #1323165. There is also a possible workaround.
[Kernel-packages] [Bug 1315736] Re: [Dell PowerEdge R720] Machine Check Exception
Nikhil Chhaochharia, thank you for your comment.

So that your hardware and problem may be tracked, could you please file a new report with Ubuntu by executing the following in a terminal while booted into an Ubuntu repository kernel (not a mainline one):

ubuntu-bug linux

For more on this, please read the official Ubuntu documentation:
Ubuntu Bug Control and Ubuntu Bug Squad: https://wiki.ubuntu.com/Bugs/BestPractices#X.2BAC8-Reporting.Focus_on_One_Issue
Ubuntu Kernel Team: https://wiki.ubuntu.com/KernelTeam/KernelTeamBugPolicies#Filing_Kernel_Bug_reports
Ubuntu Community: https://help.ubuntu.com/community/ReportingBugs#Bug_reporting_etiquette

When opening up the new report, please feel free to subscribe me to it.

Thank you for your understanding.

Helpful bug reporting tips: https://wiki.ubuntu.com/ReportingBugs
[Kernel-packages] [Bug 1315736] Re: [Dell PowerEdge R720] Machine Check Exception
I am wondering if this is also a security issue, as it seems that a normal user can render the system unresponsive.

Do other reporters have an easy procedure to reproduce this bug? I have a somewhat complex procedure which involves running a DNA sequence analysis pipeline (Qiime).
[Kernel-packages] [Bug 1315736] Re: [Dell PowerEdge R720] Machine Check Exception
We are also seeing this bug. The machine becomes unresponsive: we cannot ssh in, the load average is high, and attempts to access the running java process do not work. I will file a bug as described in Comment #30.

Our hardware is an HP ProLiant DL380p, and we see the following in the syslog:

May 26 06:19:38 server06 kernel: [75831.929529] [ cut here ]
May 26 06:19:38 server06 kernel: [75831.930191] kernel BUG at /build/buildd/linux-3.13.0/mm/memory.c:3756!
May 26 06:19:38 server06 kernel: [75831.931129] invalid opcode: [#1] SMP
May 26 06:19:38 server06 kernel: [75831.931729] Modules linked in: xt_multiport ip6t_REJECT xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT xt_LOG xt_limit xt_tcpudp xt_addrtype nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack ip6table_filter ip6_tables gpio_ich nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack iptable_filter ip_tables x_tables intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd serio_raw sb_edac edac_core lpc_ich hpwdt hpilo ioatdma lp dca ipmi_si parport acpi_power_meter mac_hid tg3 ptp psmouse hpsa pps_core
May 26 06:19:38 server06 kernel: [75831.941585] CPU: 4 PID: 2930 Comm: java Not tainted 3.13.0-24-generic #47-Ubuntu
May 26 06:19:38 server06 kernel: [75831.942633] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 02/10/2014
May 26 06:19:38 server06 kernel: [75831.943583] task: 881fe8372fe0 ti: 881fe632a000 task.ti: 881fe632a000
May 26 06:19:38 server06 kernel: [75831.944654] RIP: 0010:[] [] handle_mm_fault+0xe61/0xf10
May 26 06:19:38 server06 kernel: [75831.946137] RSP: :881fe632bd98 EFLAGS: 00010246
May 26 06:19:38 server06 kernel: [75831.946885] RAX: 0100 RBX: 7fc37320a370 RCX: 881fe632bb18
May 26 06:19:38 server06 kernel: [75831.947902] RDX: 881fe8372fe0 RSI: RDI: 800100c009e6
May 26 06:19:38 server06 kernel: [75831.948932] RBP: 881fe632be20 R08: R09: 00a9
May 26 06:19:38 server06 kernel: [75831.949952] R10: 0001 R11: R12: 881fd83a7cc8
May 26 06:19:38 server06 kernel: [75831.950961] R13: 880fe6787d40 R14: 880fe5d95780 R15: 0080
May 26 06:19:38 server06 kernel: [75831.951985] FS: 7fc938145700() GS:880fffa8() knlGS:
May 26 06:19:38 server06 kernel: [75831.976736] CS: 0010 DS: ES: CR0: 80050033
May 26 06:19:38 server06 kernel: [75832.005183] CR2: 7fc373620930 CR3: 000fe63fe000 CR4: 000407e0
May 26 06:19:38 server06 kernel: [75832.033473] Stack:
May 26 06:19:38 server06 kernel: [75832.060551] 0001 881fe632bdb0 8109a780 881fe632bdd0
May 26 06:19:38 server06 kernel: [75832.117385] 810d7ad6 0001 81f1ea20 881fe632be78
May 26 06:19:38 server06 kernel: [75832.173599] 810d983d 881fe632be48 88a9 0001
May 26 06:19:38 server06 kernel: [75832.231813] Call Trace:
May 26 06:19:38 server06 kernel: [75832.258781] [] ? wake_up_state+0x10/0x20
May 26 06:19:38 server06 kernel: [75832.286702] [] ? wake_futex+0x66/0x90
May 26 06:19:38 server06 kernel: [75832.311849] [] ? futex_wake_op+0x4ed/0x620
May 26 06:19:38 server06 kernel: [75832.337329] [] __do_page_fault+0x184/0x560
May 26 06:19:38 server06 kernel: [75832.363061] [] ? acct_account_cputime+0x1c/0x20
May 26 06:19:38 server06 kernel: [75832.387739] [] ? account_user_time+0x8b/0xa0
May 26 06:19:38 server06 kernel: [75832.411608] [] ? vtime_account_user+0x54/0x60
May 26 06:19:38 server06 kernel: [75832.436126] [] do_page_fault+0x1a/0x70
May 26 06:19:38 server06 kernel: [75832.458239] [] page_fault+0x28/0x30
May 26 06:19:38 server06 kernel: [75832.481780] Code: ff 48 89 d9 4c 89 e2 4c 89 ee 4c 89 f7 44 89 4d c8 e8 34 c1 ff ff 85 c0 0f 85 94 f5 ff ff 49 8b 3c 24 44 8b 4d c8 e9 68 f3 ff ff <0f> 0b be 8e 00 00 00 48 c7 c7 18 25 a6 81 44 89 4d c8 e8 18 e7
May 26 06:19:38 server06 kernel: [75832.551672] RIP [] handle_mm_fault+0xe61/0xf10
May 26 06:19:38 server06 kernel: [75832.574254] RSP
May 26 06:19:38 server06 kernel: [75832.630392] ---[ end trace e41b58adf8e0d72b ]---
[Kernel-packages] [Bug 1315736] Re: [Dell PowerEdge R720] Machine Check Exception
Tiago Antao, again please see https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1315736/comments/30 .
[Kernel-packages] [Bug 1315736] Re: [Dell PowerEdge R720] Machine Check Exception
Sami,

Good observation: I do not have a machine check exception. The similarities are: a BUG reported on the same kernel source line; similar behaviour; and java being involved. For reference, I copy my kernel bug below (I get several instances of this, except that the subsequent ones are tainted). As soon as I have a problem with the new upstream kernel, I will report back.

May 9 09:55:29 wintermute kernel: [604868.582044] [ cut here ]
May 9 09:55:29 wintermute kernel: [604868.582059] kernel BUG at /build/buildd/linux-3.13.0/mm/memory.c:3756!
May 9 09:55:29 wintermute kernel: [604868.582064] invalid opcode: [#1] SMP
May 9 09:55:29 wintermute kernel: [604868.582069] Modules linked in: veth xt_addrtype xt_conntrack iptable_filter ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables bridge stp llc bnep rfcomm bluetooth aufs binfmt_misc kvm_amd kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel joydev aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd parport_pc ppdev psmouse amd64_edac_mod sp5100_tco serio_raw edac_core lp fam15h_power k10temp i2c_piix4 edac_mce_amd mac_hid parport hid_generic usbhid hid usb_storage ixgbe igb mdio i2c_algo_bit dca ahci ptp libahci pps_core
May 9 09:55:29 wintermute kernel: [604868.582148] CPU: 21 PID: 25260 Comm: java Not tainted 3.13.0-24-generic #46-Ubuntu
May 9 09:55:29 wintermute kernel: [604868.582152] Hardware name: Supermicro H8QG6/H8QG6, BIOS 3.512/16/2013
May 9 09:55:29 wintermute kernel: [604868.582156] task: 8876d3985fc0 ti: 8871f58c8000 task.ti: 8871f58c8000
May 9 09:55:29 wintermute kernel: [604868.582159] RIP: 0010:[] [] handle_mm_fault+0xe61/0xf10
May 9 09:55:29 wintermute kernel: [604868.582171] RSP: :8871f58c9d98 EFLAGS: 00010246
May 9 09:55:29 wintermute kernel: [604868.582174] RAX: 0100 RBX: 7fa583801ea0 RCX: 8871f58c9b18
May 9 09:55:29 wintermute kernel: [604868.582177] RDX: 8876d3985fc0 RSI: RDI: 8020286009e6
May 9 09:55:29 wintermute kernel: [604868.582180] RBP: 8871f58c9e20 R08: R09: 00a9
May 9 09:55:29 wintermute kernel: [604868.582182] R10: 0001 R11: R12: 883fb68b30e0
May 9 09:55:29 wintermute kernel: [604868.582185] R13: 882e351b2600 R14: 88702aceec80 R15: 0080
May 9 09:55:29 wintermute kernel: [604868.582188] FS: 7fa5603f2700() GS:882fe7d4() knlGS:
May 9 09:55:29 wintermute kernel: [604868.582192] CS: 0010 DS: ES: CR0: 8005003b
May 9 09:55:29 wintermute kernel: [604868.582194] CR2: 7fa583a05620 CR3: 007861d59000 CR4: 000407e0
May 9 09:55:29 wintermute kernel: [604868.582198] Stack:
May 9 09:55:29 wintermute kernel: [604868.582200] 8871f58c9e20 88702aceec80 7fad7d38fd70 7fa583804020
May 9 09:55:29 wintermute kernel: [604868.582241] 2190 7fad7401bb68 0002
May 9 09:55:29 wintermute kernel: [604868.582266] 887101ef5e20 7fad781a900f 88a9 ff03
May 9 09:55:29 wintermute kernel: [604868.582283] Call Trace:
May 9 09:55:29 wintermute kernel: [604868.582297] [] __do_page_fault+0x184/0x560
May 9 09:55:29 wintermute kernel: [604868.582311] [] ? acct_account_cputime+0x1c/0x20
May 9 09:55:29 wintermute kernel: [604868.582321] [] ? account_user_time+0x8b/0xa0
May 9 09:55:29 wintermute kernel: [604868.582329] [] ? vtime_account_user+0x54/0x60
May 9 09:55:29 wintermute kernel: [604868.582338] [] do_page_fault+0x1a/0x70
May 9 09:55:29 wintermute kernel: [604868.582349] [] page_fault+0x28/0x30
May 9 09:55:29 wintermute kernel: [604868.582353] Code: ff 48 89 d9 4c 89 e2 4c 89 ee 4c 89 f7 44 89 4d c8 e8 34 c1 ff ff 85 c0 0f 85 94 f5 ff ff 49 8b 3c 24 44 8b 4d c8 e9 68 f3 ff ff <0f> 0b be 8e 00 00 00 48 c7 c7 18 25 a6 81 44 89 4d c8 e8 18 e7
May 9 09:55:29 wintermute kernel: [604868.582415] RIP [] handle_mm_fault+0xe61/0xf10
May 9 09:55:29 wintermute kernel: [604868.582421] RSP
May 9 09:55:29 wintermute kernel: [604868.582426] ---[ end trace 77f5d1b963750a41 ]---
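The claim that the HP ProLiant and Supermicro reports hit a BUG "on the same line" can be checked mechanically. The following is a rough sketch, not part of any Ubuntu tooling: `bug_site` is a hypothetical helper, and the two sample strings are the "kernel BUG at" lines copied from the syslog excerpts in this thread.

```python
import re

# Matches the "kernel BUG at <file>:<line>!" message printed by the kernel.
BUG_RE = re.compile(r"kernel BUG at (\S+):(\d+)!")


def bug_site(log_text):
    """Return (source_file, line_number) of the first kernel BUG, or None."""
    match = BUG_RE.search(log_text)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


# Lines copied from the two reports in this thread.
hp_log = (
    "May 26 06:19:38 server06 kernel: [75831.930191] "
    "kernel BUG at /build/buildd/linux-3.13.0/mm/memory.c:3756!"
)
supermicro_log = (
    "May 9 09:55:29 wintermute kernel: [604868.582059] "
    "kernel BUG at /build/buildd/linux-3.13.0/mm/memory.c:3756!"
)

print(bug_site(hp_log))
print(bug_site(hp_log) == bug_site(supermicro_log))
```

A matching BUG site only shows that both machines tripped the same assertion in mm/memory.c; it says nothing about whether the Machine Check Exception in the original Dell report is related.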
[Kernel-packages] [Bug 1315736] Re: [Dell PowerEdge R720] Machine Check Exception
We have now installed the new kernel, but as the bug is non-deterministic, we will have to wait until it manifests itself.
[Kernel-packages] [Bug 1315736] Re: [Dell PowerEdge R720] Machine Check Exception
Hi Tiago,

I agree, the bug seems to be the same (the process listing freezes, and java is involved in reproducing the bug). However, I am wondering whether the Machine Check Exception is caused by the same bug or is a totally different bug. Have you observed a Machine Check Exception?
[Kernel-packages] [Bug 1315736] Re: [Dell PowerEdge R720] Machine Check Exception
I will do this, but one important comment: I am on a Supermicro, not a Dell. But the bug seems to be the same (same kernel BUG line, and also java-related taints).
[Kernel-packages] [Bug 1315736] Re: [Dell PowerEdge R720] Machine Check Exception
Tiago Antao, thank you for your comment. So that your hardware and problem may be tracked, could you please file a new report with Ubuntu by executing the following in a terminal while booted into an Ubuntu repository kernel (not a mainline one):

ubuntu-bug linux

For more on this, please read the official Ubuntu documentation:
Ubuntu Bug Control and Ubuntu Bug Squad: https://wiki.ubuntu.com/Bugs/BestPractices#X.2BAC8-Reporting.Focus_on_One_Issue
Ubuntu Kernel Team: https://wiki.ubuntu.com/KernelTeam/KernelTeamBugPolicies#Filing_Kernel_Bug_reports
Ubuntu Community: https://help.ubuntu.com/community/ReportingBugs#Bug_reporting_etiquette

When opening the new report, please feel free to subscribe me to it.

Thank you for your understanding.

Helpful bug reporting tips: https://wiki.ubuntu.com/ReportingBugs
[Kernel-packages] [Bug 1315736] Re: [Dell PowerEdge R720] Machine Check Exception
I seem to have this bug as well. Although this is a production server, I have some flexibility in rebooting it. I can note a few things:

1. The kernel bug only happens with Java (tested with both open-jdk7 and oracle8).
2. The java processes block and cannot be killed.
3. Any process that tries to inspect a blocked java process becomes blocked itself (e.g. top, ps, ...). An strace of a ps:

    open("/proc/41126/status", O_RDONLY) = 6
    read(6, "Name:\tjava\nState:\tD (disk sleep)"..., 1024) = 870
    close(6) = 0
    open("/proc/41126/cmdline", O_RDONLY) = 6
    read(6, [BLOCKS there]

4. As long as nothing queries the blocked java processes, everything works (though the machine's load appears to be high).

Tell me what you need done to test this, and I will do it.
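For anyone who wants to list the stuck tasks without running ps or top, here is a minimal Python sketch. It relies on an assumption drawn only from the strace above: the read of /proc/<pid>/status returned normally, and only the read of /proc/<pid>/cmdline blocked. The function names (parse_field, state_code, find_uninterruptible) are hypothetical helpers written for this illustration, not part of any existing tool, and there is no guarantee that reading status stays safe on every affected machine.

```python
import os


def parse_field(status_text, key):
    """Return the value of one 'Key:<tab>value' line from status text, or None."""
    prefix = key + ":"
    for line in status_text.splitlines():
        if line.startswith(prefix):
            return line[len(prefix):].strip()
    return None


def state_code(status_text):
    """Return the one-letter task state (e.g. 'D', 'S', 'R'), or None."""
    value = parse_field(status_text, "State")
    if not value:
        return None
    return value.split()[0]


def find_uninterruptible(proc_root="/proc"):
    """List (pid, name) for tasks in state D.

    Only <proc_root>/<pid>/status is opened. The cmdline file is never
    touched, because that is the read that blocked in the strace.
    (Assumption: status remains readable for a stuck task.)
    """
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as 'self' or 'meminfo'
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                text = fh.read()
        except OSError:
            continue  # the process exited or is not readable
        if state_code(text) == "D":
            blocked.append((int(entry), parse_field(text, "Name")))
    return blocked
```

Calling find_uninterruptible() on the affected machine should return pairs such as (41126, "java") for tasks in uninterruptible sleep. The proc_root parameter exists only so the function can be tried against a copy of the files instead of the live /proc.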
[Kernel-packages] [Bug 1315736] Re: [Dell PowerEdge R720] Machine Check Exception
** Tags added: needs-upstream-testing
[Kernel-packages] [Bug 1315736] Re: [Dell PowerEdge R720] Machine Check Exception
Hi Cristopher,

The server is a fully installed production server, so unfortunately downgrading to 12.04 is not possible. Reproducing the problem takes about 2 days, during which the server is offline, and having it offline is somewhat problematic.

It is possible to test 3.15-rc5, but I would rather do so only if it contains a fix that addresses symptoms similar to those reported here. Does rc5 have a fix that could remedy the problem?

Do the provided error messages give any information about what might be wrong, or are they not helpful in this case? I can see that they include some source code line numbers etc.
[Kernel-packages] [Bug 1315736] Re: [Dell PowerEdge R720] Machine Check Exception
Sami Pietila, could you please test the latest mainline kernel via http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.15-rc5-utopic/ and advise on the results?

If reproducible, for regression testing purposes, could you please test for this in the server release of http://old-releases.ubuntu.com/releases/12.04.0/ and advise on the results?

** Changed in: linux (Ubuntu)
   Status: Confirmed => Incomplete