On Tue, Apr 17, 2018 at 12:34 PM, Greg Rose <[email protected]> wrote: > On RHEL 7.x kernels we observe a panic induced by a paging error > when the timer kicks off a job that subsequently accesses memory > that belonged to the openvswitch kernel module but was since > unloaded - thus the paging error. > > The panic can be induced on any RHEL 7.x kernel with the following test: > > while `true` > do > make check-kmod TESTSUITEFLAGS="-k \!gre" > done > > On the systems I've been testing on it generally takes anywhere from a > minute to 15 minutes or so to repro but never longer than that. Similar > results have been seen by other testers. > > This patch does not fix the underlying bug, which does need to be > investigated and fixed, but it does prevent it from occurring. We > would like to prevent customer systems from panicking while we do > futher investigation to find the root cause. > > Here is the trace: > [252257.801809] BUG: unable to handle kernel paging request at > ffffffffc07c6298 > [252257.802451] IP: [<ffffffff810996e0>] run_timer_softirq+0xe0/0x310 > [252257.803055] PGD 19f5067 PUD 19f7067 PMD 2fb5fc2067 PTE 0 > [252257.803559] Oops: 0002 [#1] SMP > [252257.804138] Modules linked in: geneve ip6_udp_tunnel xt_statistic > xt_physdev xt_nat xt_recent xt_comment xt_mark ipt_MASQUERADE > nf_nat_masquerade_ipv4 iptable_nat xt_addrtype ipt_REJECT nf_reject_ipv4 > xt_conntrack iptable_filter ip_tables nf_conntrack_netlink br_netfilter > overlay(T) sch_htb veth udp_tunnel 8021q garp mrp tun ip_set nfnetlink bridge > stp llc nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 nf_conntrack_ipv4 > nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iTCO_wdt iTCO_vendor_support > dcdbas mxm_wmi sb_edac edac_core intel_powerclamp coretemp intel_rapl > iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel > lrw gf128mul glue_helper ablk_helper cryptd joydev mei_me sg mei ipmi_ssif > pcspkr shpchp lpc_ich ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter > wmi nfsd auth_rpcgss > [252257.808079] nfs_acl lockd grace sunrpc xfs libcrc32c sr_mod sd_mod cdrom > crc_t10dif crct10dif_generic uas usb_storage mgag200 drm_kms_helper > syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crct10dif_pclmul > crct10dif_common crc32c_intel ahci libahci ixgbe libata igb megaraid_sas mdio > ptp i2c_algo_bit pps_core i2c_core dca dm_mirror dm_region_hash dm_log dm_mod > [last unloaded: openvswitch] > [252257.811056] CPU: 33 PID: 0 Comm: swapper/33 Tainted: G OE ------------ T > 3.10.0-693.el7.x86_64 #1 > [252257.811826] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.1.5 > 04/11/2016 > [252257.812605] task: ffff8830b7708fd0 ti: ffff8830b7718000 task.ti: > ffff8830b7718000 > [252257.813447] RIP: 0010:[<ffffffff810996e0>] [<ffffffff810996e0>] > run_timer_softirq+0xe0/0x310 > [252257.814298] RSP: 0018:ffff885fbe203e68 EFLAGS: 00010082 > [252257.815122] RAX: ffff8830b66bc838 RBX: ffff8830b66bc000 RCX: > ffffffffc07c6290 > [252257.815933] RDX: ffff8830b66bc810 RSI: ffff885fbe203e90 RDI: > ffff8830b66bc000 > [252257.816733] RBP: ffff885fbe203ed0 R08: 0000e56b5701d800 R09: > ffff885fbe203da0 > [252257.817568] R10: 0000000000000002 R11: ffff885fbe203da8 R12: > 0000000000000081 > [252257.818429] R13: 0000e56b56fb2eca R14: ffffffff819eb0c8 R15: > 0000000000000001 > [252257.819297] FS: 0000000000000000(0000) GS:ffff885fbe200000(0000) > knlGS:0000000000000000 > [252257.820174] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [252257.821156] CR2: ffffffffc07c6298 CR3: 00000000019f2000 CR4: > 00000000003407e0 > [252257.822012] DR0: 0000000000000000 DR1: 0000000000000000 DR2: > 0000000000000000 > [252257.822869] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: > 0000000000000400 > [252257.823720] Stack: > [252257.824678] ffff8830b66bdc28 ffff8830b66bd828 ffff8830b66bd428 > ffff8830b66bd028 > [252257.825699] ffff885fbe20fe80 ffff885fbe203eb0 ffffffff8132bfe0 > 28a56c277c4fe974 > [252257.826623] 0000000000000001 0000000000000001 0000e56b56fb2eca > ffffffff819eb0c8 > [252257.827524] Call Trace: > [252257.828410] <IRQ> > [252257.828417] > [252257.829292] [<ffffffff8132bfe0>] ? timerqueue_add+0x60/0xb0 > [252257.830164] [<ffffffff81090b3f>] __do_softirq+0xef/0x280 > [252257.831010] [<ffffffff816b6a5c>] call_softirq+0x1c/0x30 > [252257.831849] [<ffffffff8102d3c5>] do_softirq+0x65/0xa0 > [252257.832669] [<ffffffff81090ec5>] irq_exit+0x105/0x110 > [252257.833501] [<ffffffff816b76c2>] smp_apic_timer_interrupt+0x42/0x50 > [252257.834330] [<ffffffff816b5c1d>] apic_timer_interrupt+0x6d/0x80 > [252257.835152] <EOI> > [252257.835159] > [252257.835944] [<ffffffff81527a02>] ? cpuidle_enter_state+0x52/0xc0 > [252257.837231] [<ffffffff81527b48>] cpuidle_idle_call+0xd8/0x210 > [252257.838224] [<ffffffff81034fee>] arch_cpu_idle+0xe/0x30 > [252257.839133] [<ffffffff810e7bca>] cpu_startup_entry+0x14a/0x1c0 > [252257.839933] [<ffffffff81051af6>] start_secondary+0x1b6/0x230 > [252257.840684] Code: 00 00 00 44 0f b6 e0 45 85 e4 0f 84 a7 01 00 00 49 63 > d4 48 83 43 10 01 48 8d 75 c0 48 c1 e2 04 48 01 da 48 8b 4a 28 48 8d 42 28 > <48> 89 71 08 48 89 4d c0 48 8b 4a 30 48 89 4d c8 48 89 31 48 89 > [252257.842366] RIP [<ffffffff810996e0>] run_timer_softirq+0xe0/0x310 > [252257.843183] RSP <ffff885fbe203e68> > [252257.843955] CR2: ffffffffc07c6298 > > Signed-off-by: Greg Rose <[email protected]> > --- > > V2 - Add splat to commit message per Pravin's request > --- I pushed the master master and backported to 2.[6,7,8,9] branch.
Thanks. _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
