On 19/4/17 9:02 am, Adi Pircalabu wrote:
On 18-04-2017 23:32, Lars Ellenberg wrote:
On Tue, Apr 18, 2017 at 10:52:57AM +1000, Adi Pircalabu wrote:
Hi, initially submitted here:
https://bugzilla.redhat.com/show_bug.cgi?id=1442593
The node that crashed was at the time the active member of an active/passive Pacemaker cluster, using DRBD-backed replicated storage for iSCSI and NFS resources.
The Red Hat developer closed the bug because drbd is loaded out-of-tree. The module is built using the source from http://git.drbd.org/drbd-8.4.git/. Even though it may or may not be related to DRBD, I thought it was worth having your opinion on this.

I don't see how the presence of DRBD would make
the apic_timer_interrupt deref some bad pointer,
while the cpu is "idle",
with such "boring" backtrace,
and for you only.

But yes, I know, just having DRBD around
makes it responsible for everything.

You said that, I didn't :)

I mean, sure, in theory, it was possible, somehow...
but I see no indication of that in the data provided.

Thanks for looking into it. I wasn't convinced it's drbd causing that panic, since I've been running 8.4.6, 8.4.7 and 8.4.8 for years with no issues. I'm now running the latest RHEL 7 kernel and I got rid of all Dell OpenManage software, including the dell_rbu module. We'll see what happens.

Just FYI, it crashed again yesterday morning at 7:06am, with a similar backtrace. The crash output for bt, ps, task & vm is attached. I've since downgraded the drbd module version from 8.4.9-2 to 8.4.9-1 and am waiting to see whether the crash happens again. And, as expected, after the bug was reopened the folks @RedHat closed it again as notabug, blaming drbd.

Cheers,

--
Adi Pircalabu

Attachment: _crash-2017-04-23-070659.tar.gz
Description: GNU Zip compressed data

_______________________________________________
drbd-user mailing list
[email protected]
http://lists.linbit.com/mailman/listinfo/drbd-user
