This is a new Dell PowerEdge server that was initially installed
with Ubuntu 14.04 server. The reported problem(s) seem to occur with all
kernels that I have tried (the Ubuntu default server kernel and the v3.15
upstream kernel).
--
You received this bug notification because you are a member of
** Description changed:
Dell PowerEdge 720 on Ubuntu 14.04 shows MCE errors in dmesg. Dell
support instructed to run DSET and BIOS hardware diagnostics. Neither of
the tools showed any errors. Dell support said that if there was a
hardware error it would have been shown on Dell logs and
Did this issue start happening after an update/upgrade? Was there a
kernel version where you were not having this particular problem? This
will help determine if the problem you are seeing is the result of the
introduction of a regression, and when this regression was introduced.
If this is a
Now the logs also show the MCE error, so there seems to be no difference
in behavior between v3.15 and the standard Ubuntu kernel.
[78132.360975] EDAC sbridge MC1: HANDLING MCE MEMORY ERROR
[78132.403996] EDAC sbridge MC1: CPU 1: Machine Check Event: 0 Bank 10: 8c46000800c1
[78132.448800] EDAC
Here are error messages from dmesg:
There are lines like:
[30187.335401] kernel BUG at /home/apw/COD/linux/mm/memory.c:3924!
[30187.337183] invalid opcode: [#1] SMP
...
[30223.621247] WARNING: CPU: 12 PID: 29190 at /home/apw/COD/linux/kernel/watchdog.c:249
top is also showing load averages of about 60-70, but the process list
looks as if the system is pretty much idle.
** Attachment added: top.png
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1315736/+attachment/4107851/+files/top.png
Hi,
I did run the memory test and no errors were detected.
I also changed to the mainline kernel. With the mainline kernel
(3.15.0-031500rc4-generic #201405042135 SMP) I have not yet seen an MCE
error or had an unresponsive system; however, I can still see some errors
in dmesg:
[ 840.160260]
Can you also give the latest 3.13 upstream kernel a test? It can be
downloaded from:
http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.13.11-trusty/
It may be, after all, that the mainline v3.15 kernel behaves like the
default Ubuntu kernel, as the server just became unresponsive. However, I
noticed that I am able to log in, but only as root and only from the
console. The server seems to be in a weird state, as the ps aux command
shows about half of the
I am not sure if I have understood these tools correctly, but does
vmstat show that the CPUs are idle while the uptime command shows that
the system load is about 40?
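On the idle-CPU versus high-load question: on Linux the load average counts tasks that are runnable (state R) and also tasks in uninterruptible sleep (state D), so a high load with idle CPUs can mean many tasks are blocked rather than running. Whether that is what is happening on this server is only an assumption. A minimal sketch for counting task states from `ps -eo state=` output (the function names below are made up for illustration):

```python
from collections import Counter


def count_task_states(ps_output):
    """Count tasks per state letter from `ps -eo state=` style output.

    Each non-empty line is expected to hold one state code such as
    R (runnable), S (sleeping) or D (uninterruptible sleep).
    """
    counts = Counter()
    for line in ps_output.splitlines():
        state = line.strip()
        if state:
            # Only the first letter is the state itself.
            counts[state[0]] += 1
    return counts


def load_contributors(counts):
    """Number of tasks in the states that feed the Linux load average."""
    return counts["R"] + counts["D"]
```

To try it on a live system, the input could come from `subprocess.run(["ps", "-eo", "state="], capture_output=True, text=True).stdout`. A large D count next to a small R count would fit the picture of a high load average on mostly idle CPUs.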
** Attachment added: load.png
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1315736/+attachment/4107526/+files/load.png
--
I am also seeing messages like: BUG: soft lockup - CPU#25 stuck for 23s!
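To compare kernels it may help to count how often each kind of message shows up in a saved dmesg log. This is only a rough sketch: the regular expressions are modelled on the lines quoted in this thread, and the category names are invented here, not anything the kernel or EDAC defines.

```python
import re
from collections import Counter

# Patterns modelled on messages quoted in this bug report.
# The keys are hypothetical labels chosen for this example.
PATTERNS = {
    "mce_memory_error": re.compile(r"EDAC \w+ MC\d+: HANDLING MCE MEMORY ERROR"),
    "kernel_bug": re.compile(r"kernel BUG at "),
    "soft_lockup": re.compile(r"BUG: soft lockup - CPU#\d+ stuck for \d+s"),
}


def classify_dmesg(text):
    """Return a Counter of how many lines match each known pattern."""
    counts = Counter()
    for line in text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

Running it over the output of `dmesg` saved from each kernel under test would give a quick per-kernel tally to paste into the bug.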
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1315736
Title:
Machine Check Exception
Status in “linux” package
It seems that putting the server under load results in an unresponsive
server, with the console constantly flooded with error messages. A
screenshot is attached.
** Attachment added: dmesg.png
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1315736/+attachment/4105408/+files/dmesg.png
Also, would it be possible for you to test the latest upstream kernel?
Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the
latest v3.15 kernel[0].
If this bug is fixed in the mainline kernel, please add the following
tag 'kernel-fixed-upstream'.
If the mainline kernel does
Can you also perform a memory test, which can be accessed from the GRUB
menu? If you haven't gone to the GRUB menu before, it can be accessed
by holding the SHIFT key after powering on the system and seeing the
BIOS messages.
** Changed in: linux (Ubuntu)
Importance: Undecided => Medium
** Changed
dmesg also has the following message:
[ 9441.626809] kernel BUG at /build/buildd/linux-3.13.0/mm/memory.c:3756!
[ 9441.628777] invalid opcode: [#1] SMP
[ 9441.630053] Modules linked in: ip6t_REJECT xt_hl ip6t_rt nf_conntrack_ipv6
nf_defrag_ipv6 ipt_REJECT xt_limit xt_tcpudp xt_addrtype
apport information
** Tags added: apport-collected
** Description changed:
Dell PowerEdge 720 on Ubuntu 14.04 shows MCE errors in dmesg. Dell
support instructed to run DSET and BIOS hardware diagnostics. Neither of
the tools showed any errors. Dell support said that if there was a