On 2015-06-30 21:20, Rustad, Mark D wrote:
Christian,
On Jun 30, 2015, at 1:58 AM, Christian Ruppert <id...@qasl.de> wrote:
bad news. It didn't work either. :(
That is too bad.
The system just did a reset tonight and there's nothing useful.
What I did was:
I removed the console= parameter and added the earlyprintk= parameter
you mentioned instead.
I verified it's working by redirecting a "h" to the sysrq-trigger and
that's all I got:
[ 308.812492] SysRq : HELP : loglevel(0-9) reboot(b) crash(c)
terminate-all-tasks(e) memory-full-oom-kill(f) kill-all-tasks(i)
thaw-filesystems(j) sak(k) show-backtrace-all-active-cpus(l)
show-memory-usage(m) nice-all-RT-tasks(n) poweroff(o)
show-registers(p) show-all-timers(q) unraw(r) sync(s)
show-task-states(t) unmount(u) show-blocked-tasks(w)
dump-ftrace-buffer(z)
early console in decompress_kernel
Decompressing Linux... Parsing ELF... done.
Booting the kernel.
...
So basically still nothing :/
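For anyone reproducing this setup, here is a minimal sketch of how one might confirm which console= and earlyprintk= parameters the running kernel actually booted with. The function names are hypothetical, and the whitespace-based parsing is a simplifying assumption (it does not handle quoted values that contain spaces).

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a {name: [values]} mapping.

    Assumption: parameters are separated by whitespace and no value
    contains quoted spaces. Flags without '=' map to [None].
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params.setdefault(name, []).append(value if sep else None)
    return params


def console_summary(cmdline):
    """Return only the console-related parameters discussed in this thread."""
    params = parse_cmdline(cmdline)
    return {
        "console": params.get("console", []),
        "earlyprintk": params.get("earlyprintk", []),
    }
```

On a Linux system the input would typically come from /proc/cmdline, e.g. console_summary(open("/proc/cmdline").read()). An empty "console" list together with a non-empty "earlyprintk" list would match the change described above.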
Could you send the full log that was captured via the earlyprintk,
just in case I can notice something that is reported there.
See the attached log but it's basically just from the newly booted
kernel
Someone mentioned netconsole, but I doubt it will be any better if even
console= or earlyprintk= didn't catch anything.
I agree. It is incredibly unlikely that netconsole can catch anything
that earlyprintk can't.
Do you have any more ideas by chance?
One thing that comes to mind is that some systems will automatically
reset when any unrecoverable hardware error occurs. I have had systems
set up that way in the past and when such an error occurs, an
immediate reset is the result. Have you noticed any BIOS settings
related to that? If so, could you change them to SMIs or something? Or
is there a different instance of that hardware that you can run this
on?
See below
In my last mail I summarised our setup and I'm willing to provide as
much information as I can to get this solved but right now I have no
more ideas.
I think detailed information on your hardware and BIOS settings, along
with whatever log you do get via earlyprintk might help. It may be
possible that a software error could trigger an uncorrectable error,
but it isn't real common. It sure doesn't behave like a typical kernel
panic kind of issue. Oh, and do check any error log that your BIOS
might be holding for you.
We tried Supermicro 5018D-MTF (E3-1281v3), 5017C-MTF (E3-1220L IIRC) and
a Workstation PC (i5-4460) with an Asus mainboard (H97M-E) and it's the
same everywhere. All systems have 32GB RAM, and the two Supermicro systems
even have ECC. We only have issues in combination with the mentioned X520 NIC
AND the SYNPROXY iptables extension.
mcelog is empty. The 5018D-MTF Event log has nothing either. I checked
for watchdog related settings in the BIOS but that looked good so far.
Also causing a test kernel panic resulted in a proper dump as well as a
valid kernel dump file. I can check the BIOS tomorrow and/or even take
some pictures of each page/tab in case it might help.
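As a rough sketch of the SysRq checks mentioned in this thread: the helper below writes a single command character to the sysrq trigger file. The helper name and its trigger_path parameter are hypothetical; the path /proc/sysrq-trigger and the keys "h" (help) and "c" (crash) are the ones shown in the HELP output above. It assumes root privileges, and "c" will deliberately crash the machine, so it only belongs on a test system. Whether the test panic here was triggered this way is an assumption.

```python
SYSRQ_TRIGGER = "/proc/sysrq-trigger"


def send_sysrq(key, trigger_path=SYSRQ_TRIGGER):
    """Write one SysRq command character to the trigger file.

    trigger_path is a parameter only so the helper can be tried
    against an ordinary file without touching the real system.
    """
    if len(key) != 1:
        raise ValueError("SysRq expects a single command character")
    with open(trigger_path, "w") as trigger:
        trigger.write(key)
```

Calling send_sysrq("h") as root should produce the HELP line on the console, which is a harmless way to confirm that console output is being captured before trying send_sysrq("c").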
--
Mark Rustad, Networking Division, Intel Corporation
--
Regards,
Christian Ruppert
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit
http://communities.intel.com/community/wired