On 19/4/17 9:02 am, Adi Pircalabu wrote:
> On 18-04-2017 23:32, Lars Ellenberg wrote:
>> On Tue, Apr 18, 2017 at 10:52:57AM +1000, Adi Pircalabu wrote:
>>> Hi, initially submitted here:
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1442593
>>> The node that crashed was at the time the active member of an
>>> active/passive Pacemaker cluster, using DRBD backed replicated
>>> storage for iSCSI and NFS resources.
>>> The RedHat developer closed the bug due to loading drbd out-of-tree.
>>> The module is built using the source from
>>> http://git.drbd.org/drbd-8.4.git/
>>> Even though it may or may not be related to DRBD, I thought it's
>>> worth having your opinion on this.
>>
>> I don't see how the presence of DRBD would make the
>> apic_timer_interrupt deref some bad pointer, while the cpu is "idle",
>> with such "boring" backtrace, and for you only. But yes, I know, just
>> having DRBD around makes it responsible for everything.
>
> You said that, I didn't :)
>
>> I mean, sure, in theory, it was possible, somehow... but I see no
>> indication of that in the data provided.
>
> Thanks for looking into it. I wasn't convinced it's drbd causing that
> panic, since I've been running 8.4.6, 8.4.7 and 8.4.8 for years with
> no issues. I'm now running the latest RHEL 7 kernel and I got rid of
> all Dell Openmanage software, including the dell_rbu module. See what
> happens.
Just FYI, it crashed again yesterday morning at 7:06am, with a similar backtrace. The crash output for bt, ps, task & vm is attached. I've since downgraded the drbd module from 8.4.9-2 to 8.4.9-1 and am waiting to see whether the crash happens again. And, as expected, the folks at RedHat closed the bug as notabug after it was reopened, blaming drbd.
Cheers,
-- 
Adi Pircalabu
_crash-2017-04-23-070659.tar.gz
Description: GNU Zip compressed data
