Unfortunately I think the only suggestion I have is to stick to stable kernels. 
We don't have any ARM hardware available so we can't test this locally.

I'm also deleting the laundry list of cc's because I think I'm the only one 
working on 82574L support on Linux at this time.

Todd Fujinaka
Software Application Engineer
Networking Division (ND)
Intel Corporation
todd.fujin...@intel.com
(503) 712-4565

-----Original Message-----
From: linux-nics-boun...@isotope.jf.intel.com 
[mailto:linux-nics-boun...@isotope.jf.intel.com] On Behalf Of Duc Dang
Sent: Monday, April 06, 2015 11:42 AM
To: Kirsher, Jeffrey T; Brandeburg, Jesse; Allan, Bruce W; Wyborny, Carolyn; 
Skidmore, Donald C; Rose, Gregory V; Vick, Matthew; Ronciak, John; Williams, 
Mitch A; Linux NICS; e1000-devel@lists.sourceforge.net
Subject: [linux-nics] Arm64 kernel crashes with e1000e driver

Hi E1000E driver maintainers,

I am using an E1000E card (82574L chipset) on my APM ARM64 X-Gene platform and 
usually see an oops when I run "ifconfig eth0 up" or "ifconfig eth0 down", as 
shown below:
=====================================================================
[root@dhcp-10-66-13-41 ~]# ifconfig eth0 down
e1000e: eth0 NIC Link is Down
[root@dhcp-10-66-13-41 ~]# ifconfig eth0 up
Unhandled fault: synchronous external abort (0x96000010) at 0xffffff8002180000
Internal error: : 96000010 [#1] PREEMPT SMP
Modules linked in:
CPU: 5 PID: 946 Comm: ifconfig Not tainted 4.0.0-rc6+ #6
Hardware name: APM X-Gene Merlin board (DT)
task: ffffffc7da3ec200 ti: ffffffc0f9afc000 task.ti: ffffffc0f9afc000
PC is at e1000e_set_rx_mode+0x13c/0x328
LR is at e1000e_set_rx_mode+0xdc/0x328
pc : [<ffffffc00041e2b8>] lr : [<ffffffc00041e258>] pstate: 60000145
sp : ffffffc0f9affbc0
x29: ffffffc0f9affbc0 x28: ffffffc7d9cfa800
x27: ffffffc7d9cfa800 x26: ffffffc7dc200000
x25: ffffffc7dc0bc000 x24: 0000000000000000
x23: 0000000004008002 x22: ffffffc7dc200000
x21: ffffffc7dc200d38 x20: 0000000000005404
x19: 0000000000000000 x18: 0000000000000000
x17: 00000000004ddd88 x16: ffffffc00019d978
x15: ffffffffffffffff x14: 0ffffffffffffffe
x13: 0000000000000020 x12: 0000000000000000
x11: 000000000000002c x10: 0000000000000008
x9 : 0000000000000000 x8 : ffffffc7dc201040
x7 : 0000000000000000 x6 : 000000000000003f
x5 : 0000000000000040 x4 : 0000000000000000
x3 : 00000000000016c9 x2 : ffffff8002180000
x1 : ffffff8002180100 x0 : 0000000600114ba9

Process ifconfig (pid: 946, stack limit = 0xffffffc0f9afc028)
Stack: (0xffffffc0f9affbc0 to 0xffffffc0f9b00000)
fbc0: f9affc20 ffffffc0 0050a3b0 ffffffc0 dc200000 ffffffc7 006194a8 ffffffc0
fbe0: 006194a8 ffffffc0 dc200048 ffffffc7 00001002 00000000 0000007f 00000000
fc00: dc0bc000 ffffffc7 dc200000 ffffffc7 f9affc60 ffffffc0 0050a4dc ffffffc0
fc20: f9affc40 ffffffc0 0050a414 ffffffc0 dc20025c ffffffc7 dc200000 ffffffc7
fc40: f9affc60 ffffffc0 0050a4f8 ffffffc0 dc200000 ffffffc7 00000000 00000000
fc60: f9affca0 ffffffc0 0050a7b0 ffffffc0 dc200000 ffffffc7 00001043 00000000
fc80: 00000041 00000000 00000000 00000000 dc200000 ffffffc7 dc200000 ffffffc7
fca0: f9affce0 ffffffc0 0050a894 ffffffc0 dc200000 ffffffc7 00000000 00000000
fcc0: 00000000 00000000 00001002 00000000 00000000 00000000 00000000 00000000
fce0: f9affd10 ffffffc0 00564be0 ffffffc0 dc0bc010 ffffffc7 00000000 00000000
fd00: d08cc3e8 0000007f 00008914 00000000 f9affdb0 ffffffc0 0056635c ffffffc0
fd20: d08cc3e8 0000007f 00008914 00000000 dc5e0930 ffffffc7 00000003 00000000
fd40: d08cc3e8 0000007f 00008914 00000000 0000011a 00000000 0000001d 00000000
fd60: 005c7000 ffffffc0 f9afc000 ffffffc0 0000011a 00000000 dc0bc010 ffffffc7
fd80: 005c7000 ffffffc0 30687465 00000000 00000000 00000000 7ecb1043 0000007f
fda0: d08ccc7c 0000007f d08cc618 0000007f f9affdc0 ffffffc0 004ee040 ffffffc0
fdc0: f9affde0 ffffffc0 004ee848 ffffffc0 00008914 00000000 d08cc3e8 0000007f
fde0: f9affe10 ffffffc0 001a7e3c ffffffc0 d9d76e00 ffffffc7 d08cc3e8 0000007f
fe00: dc5e0930 ffffffc7 00000003 00000000 f9affe90 ffffffc0 001a8118 ffffffc0
fe20: 00000000 00000000 d9d76e00 ffffffc7 d9d76e00 ffffffc7 00000003 00000000
fe40: f9affe70 ffffffc0 001b2d1c ffffffc0 00000003 00000000 00000002 00000000
fe60: 00000003 00000000 d9d76e00 ffffffc7 f9affe80 ffffffc0 001b255c ffffffc0
fe80: f9affe90 ffffffc0 001a80d4 ffffffc0 d08cc4c0 0000007f 00085c30 ffffffc0
fea0: 00000000 00000000 d08ccc8a 0000007f ffffffff ffffffff 7ebc9a0c 0000007f
fec0: 00000000 00000000 00000015 00000000 00000003 00000000 00008914 00000000
fee0: d08cc3e8 0000007f 7ed432d3 0000007f 00000004 00000000 d08ccc95 0000007f
ff00: 01010100 81010101 ae58f47b 00000000 0000001d 00000000 98969991 978b9aff
ff20: 01010101 01010101 00000008 00000000 00000006 00000000 000003f3 00000000
ff40: 7eb07a68 0000007f 0000578b 00000000 7ed51670 0000007f 7ebc9a00 0000007f
ff60: 00000000 00000000 d08cc3e8 0000007f d08ccc8a 0000007f 00000004 00000000
ff80: d08cc628 0000007f 7ed4f1c8 0000007f 00000003 00000000 d08cc620 0000007f
ffa0: 00000000 00000000 d08cc3c0 0000007f d08cc628 0000007f d08cc4c0 0000007f
ffc0: 7ecbdac0 0000007f d08cc300 0000007f 7ebc9a0c 0000007f 00000000 00000000
ffe0: 00000003 00000000 0000001d 00000000 00000000 00000000 00000000 00000000 
Call trace:
[<ffffffc00041e2b8>] e1000e_set_rx_mode+0x13c/0x328
[<ffffffc00050a3ac>] __dev_set_rx_mode+0x48/0x8c
[<ffffffc00050a410>] dev_set_rx_mode+0x20/0x38
[<ffffffc00050a4f4>] __dev_open+0xcc/0x120
[<ffffffc00050a7ac>] __dev_change_flags+0x88/0x150
[<ffffffc00050a890>] dev_change_flags+0x1c/0x5c
[<ffffffc000564bdc>] devinet_ioctl+0x644/0x6f0
[<ffffffc000566358>] inet_ioctl+0x88/0xb0
[<ffffffc0004ee03c>] sock_do_ioctl.constprop.42+0x1c/0x5c
[<ffffffc0004ee844>] sock_ioctl+0x204/0x2c8
[<ffffffc0001a7e38>] do_vfs_ioctl+0x360/0x5bc
[<ffffffc0001a8114>] SyS_ioctl+0x80/0x98
Code: a94363f7 a9446bf9 a8c67bfd d65f03c0 (b9400042)
---[ end trace 94a85a23e747c00c ]---
Kernel panic - not syncing: Fatal exception in interrupt
CPU4: stopping
CPU: 4 PID: 0 Comm: swapper/4 Tainted: G      D         4.0.0-rc6+ #6
Hardware name: APM X-Gene Merlin board (DT)
Call trace:
[<ffffffc0000899f4>] dump_backtrace+0x0/0x124
[<ffffffc000089b28>] show_stack+0x10/0x1c
[<ffffffc0005b8580>] dump_stack+0x80/0xc4
[<ffffffc0000902e0>] handle_IPI+0x184/0x194
[<ffffffc000082460>] gic_handle_irq+0x78/0x80
Exception stack(0xffffffc7dc9dbe00 to 0xffffffc7dc9dbf20)
be00: 0080a000 ffffffc0 0080a5d8 ffffffc0 dc9dbf40 ffffffc7 000869c4 ffffffc0
be20: 00000000 00000000 00000000 00000000 00000001 00000000 00000000 00000000
be40: 00000000 00000000 ff795000 00000007 00000000 000aba95 50e55d3c 0000000d
be60: dc9ca670 ffffffc7 dc9dbea0 ffffffc7 000003ff 00000000 0002d9eb 00000000
be80: 00000018 00000000 e8000000 00000003 00000000 00000000 18000000 000b2341
bea0: 00196c84 ffffffc0 8969a640 0000007f 00000000 00000000 0080a000 ffffffc0
bec0: 0080a5d8 ffffffc0 00000000 00000000 00000000 00000000 007fae20 ffffffc0
bee0: 007fbb00 ffffffc0 005cf638 ffffffc0 dc9dbf50 ffffffc7 dc9d8000 ffffffc7
bf00: 0080a000 ffffffc0 dc9dbf40 ffffffc7 000869c0 ffffffc0 dc9dbf40 ffffffc7 
[<ffffffc0000855a4>] el1_irq+0x64/0xd8
[<ffffffc0000e3b1c>] cpu_startup_entry+0x204/0x288
[<ffffffc00008fdb4>] secondary_start_kernel+0x114/0x124
=====================================================================
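
A note on reading the oops above: the word in parentheses on the "Code:" line 
is the instruction at the faulting PC. The following is only a minimal sketch 
for decoding it by hand, assuming the word belongs to the A64 "load/store 
register (unsigned immediate)" class. The struct and function names are made 
up for illustration and are not part of the kernel or the e1000e driver.

```c
#include <stdint.h>

/* Decoded fields of an A64 load/store (unsigned immediate) instruction.
 * Hypothetical helper type, for illustration only. */
struct ldst_uimm {
    int matches;       /* 1 if the word is a plain LDR/STR of this class */
    unsigned size;     /* log2 of the access size in bytes (2 => 32-bit) */
    unsigned is_load;  /* 1 for LDR, 0 for STR */
    unsigned offset;   /* byte offset: imm12 scaled by the access size */
    unsigned rn;       /* base register number (x<rn>) */
    unsigned rt;       /* transfer register number */
};

static struct ldst_uimm decode_ldst_uimm(uint32_t insn)
{
    struct ldst_uimm d = {0};
    unsigned opc = (insn >> 22) & 0x3;

    /* Bits 29:24 must be 111001 (integer register, unsigned offset). */
    if (((insn >> 24) & 0x3f) != 0x39)
        return d;
    /* opc 00 = STR, 01 = LDR; other values (sign-extending loads,
     * prefetch) are left undecoded in this sketch. */
    if (opc > 1)
        return d;

    d.matches = 1;
    d.size = insn >> 30;
    d.is_load = opc;
    d.offset = ((insn >> 10) & 0xfff) << d.size;
    d.rn = (insn >> 5) & 0x1f;
    d.rt = insn & 0x1f;
    return d;
}
```

Applied to 0xb9400042, this gives a 32-bit load through x2 with offset 0. In 
the register dump x2 holds 0xffffff8002180000, which matches the reported 
fault address, so the abort appears to be raised by a register read rather 
than by a write.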

The issue also happens with an E1000E card with the 82572EI chipset. The 
kernel I am using is 4.0.0-rc6+ (pulled from kernel.org git) with E1000E 
driver version 2.3.2-k:
e1000e: Intel(R) PRO/1000 Network Driver - 2.3.2-k
e1000e: Copyright(c) 1999 - 2014 Intel Corporation.
e1000e 0000:01:00.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
e1000e 0000:01:00.0 0000:01:00.0 (uninitialized): Failed to initialize MSI-X interrupts.  Falling back to MSI interrupts.
e1000e 0000:01:00.0 0000:01:00.0 (uninitialized): Failed to initialize MSI interrupts.  Falling back to legacy interrupts.
e1000e 0000:01:00.0 eth0: registered PHC clock
e1000e 0000:01:00.0 eth0: (PCI Express:2.5GT/s:Width x1) 00:1b:21:4e:95:76
e1000e 0000:01:00.0 eth0: Intel(R) PRO/1000 Network Connection
e1000e 0000:01:00.0 eth0: MAC: 3, PHY: 8, PBA No: E46981-003

I see this code snippet in the E1000E driver (file 
drivers/net/ethernet/intel/e1000e/netdev.c):
void __ew32(struct e1000_hw *hw, unsigned long reg, u32 val)
{
        if (hw->adapter->flags2 & FLAG2_PCIM2PCI_ARBITER_WA)
                __ew32_prepare(hw);

        writel(val, hw->hw_addr + reg);
}

If I enable the FLAG2_PCIM2PCI_ARBITER_WA flag (which avoids collisions in MAC 
CSR accesses between the CPU PCIe controller and the Manageability Engine) for 
the 82572EI and 82574L chipsets, then the issue goes away.
I don't see this flag enabled for the 82572EI and 82574L chipsets in the 
current 4.0 kernel driver, so I am not sure whether this is a proper fix.
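
To make the idea behind the workaround concrete, here is a rough userspace 
sketch of the pattern in the snippet above: when a flag is set, poll a status 
bit a bounded number of times before issuing the register write. Everything 
here (the fake device, the HYP_* constants, the function names) is 
hypothetical and only simulates the behaviour; it is not the driver's 
__ew32_prepare() and does not touch real hardware.

```c
#include <stdbool.h>
#include <stdint.h>

#define HYP_BUSY_BIT   (1u << 0)  /* hypothetical "arbiter busy" status bit */
#define HYP_POLL_LIMIT 2000       /* hypothetical upper bound on polls */

/* Simulated device, standing in for memory-mapped registers. */
struct fake_dev {
    unsigned busy_reads_left;    /* status reads that will still report busy */
    uint32_t reg;                /* the register being written */
    unsigned writes_while_busy;  /* writes issued while the device was busy */
};

static uint32_t fake_read_status(struct fake_dev *d)
{
    if (d->busy_reads_left > 0) {
        d->busy_reads_left--;
        return HYP_BUSY_BIT;
    }
    return 0;
}

/* Poll until the busy bit clears or the budget runs out.
 * Returns the remaining budget; 0 means the wait timed out. */
static int wait_not_busy(struct fake_dev *d)
{
    int i = HYP_POLL_LIMIT;

    while ((fake_read_status(d) & HYP_BUSY_BIT) && --i)
        ;  /* a real driver would insert a short delay here */
    return i;
}

/* Write a register, optionally waiting for the device to go idle first. */
static void guarded_write(struct fake_dev *d, bool workaround, uint32_t val)
{
    if (workaround)
        wait_not_busy(d);
    if (d->busy_reads_left > 0)
        d->writes_while_busy++;  /* this write would have collided */
    d->reg = val;
}
```

With the workaround enabled the simulated write only lands once the busy bit 
has cleared (or the poll budget is exhausted); without it the write can land 
while the device is still busy.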

Please let me know if you have some other ideas about how I should debug this 
issue.

Thanks and regards,
Duc Dang.