Thanks, Fujinaka. The comment in the E1000E driver (below) says that "on some parts there is a bug that acknowledges Host accesses later than it should". Do you know whether the 82572EI and 82574L have this problem?
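To make clear what I understand the workaround to do, here is a minimal user-space sketch of the polling described in that driver comment. It is not the driver code: `fake_hw`, `read_fwsm`, `prepare_write`, `guarded_write` and the `MAX_POLLS` budget are names and values I made up for illustration, and the hardware is simulated. Only the "bit 24 of FWSM is set while ME is accessing MAC CSR registers" part comes from the comment.

```c
#include <stdint.h>

/* Bit 24 of FWSM: set while ME is accessing MAC CSR registers (from the driver comment). */
#define FWSM_ME_BUSY (UINT32_C(1) << 24)
/* Hypothetical retry budget, chosen only for this sketch. */
#define MAX_POLLS 2000

/* Simulated device state; a stand-in for real MMIO registers. */
struct fake_hw {
	uint32_t fwsm;        /* simulated FWSM contents (other bits) */
	uint32_t csr;         /* simulated MAC CSR register */
	int busy_polls_left;  /* number of upcoming FWSM reads that report ME busy */
};

/* Simulated FWSM read: reports ME busy for the next busy_polls_left reads. */
static uint32_t read_fwsm(struct fake_hw *hw)
{
	if (hw->busy_polls_left > 0) {
		hw->busy_polls_left--;
		return hw->fwsm | FWSM_ME_BUSY;
	}
	return hw->fwsm & ~FWSM_ME_BUSY;
}

/*
 * Poll FWSM until bit 24 clears or the budget runs out.
 * Returns the remaining budget; 0 means we gave up waiting.
 */
static int prepare_write(struct fake_hw *hw)
{
	int i = MAX_POLLS;

	while ((read_fwsm(hw) & FWSM_ME_BUSY) && --i)
		; /* a real driver would delay briefly here */

	return i;
}

/*
 * Write a CSR, optionally waiting for ME first. As in the snippet I quoted
 * from netdev.c, the write is issued whether or not the wait succeeded.
 */
static int guarded_write(struct fake_hw *hw, uint32_t val, int use_workaround)
{
	int polls_left = MAX_POLLS;

	if (use_workaround)
		polls_left = prepare_write(hw);

	hw->csr = val;
	return polls_left;
}
```

With the simulated ME busy for three reads, `guarded_write(&hw, val, 1)` spins three times and then writes; with the workaround disabled it writes immediately, which is the case I suspect is hitting the abort on my platform.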
/**
 * __ew32_prepare - prepare to write to MAC CSR register on certain parts
 * @hw: pointer to the HW structure
 *
 * When updating the MAC CSR registers, the Manageability Engine (ME) could
 * be accessing the registers at the same time. Normally, this is handled in
 * h/w by an arbiter but on some parts there is a bug that acknowledges Host
 * accesses later than it should which could result in the register to have
 * an incorrect value. Workaround this by checking the FWSM register which
 * has bit 24 set while ME is accessing MAC CSR registers, wait if it is set
 * and try again a number of times.
 **/

Regards,
Duc Dang.

On Mon, Apr 6, 2015 at 12:42 PM, Fujinaka, Todd <todd.fujin...@intel.com> wrote:
> Unfortunately I think the only suggestion I have is to stick to stable
> kernels. We don't have any ARM hardware available so we can't test this
> locally.
>
> I'm also deleting the laundry-list of cc's because I think I'm the only one
> on 82574L support on Linux at this time.
>
> Todd Fujinaka
> Software Application Engineer
> Networking Division (ND)
> Intel Corporation
> todd.fujin...@intel.com
> (503) 712-4565
>
> -----Original Message-----
> From: linux-nics-boun...@isotope.jf.intel.com
> [mailto:linux-nics-boun...@isotope.jf.intel.com] On Behalf Of Duc Dang
> Sent: Monday, April 06, 2015 11:42 AM
> To: Kirsher, Jeffrey T; Brandeburg, Jesse; Allan, Bruce W; Wyborny, Carolyn;
> Skidmore, Donald C; Rose, Gregory V; Vick, Matthew; Ronciak, John; Williams,
> Mitch A; Linux NICS; e1000-devel@lists.sourceforge.net
> Subject: [linux-nics] Arm64 kernel crashes with e1000e driver
>
> Hi E1000E driver maintainers,
>
> I am using E1000E card (82574L chipset) on my APM ARM64 X-Gene platform and
> usually see an oops when I do "ifconfig eth0 up" or "ifconfig eth0 down" like
> below:
> =====================================================================
> [root@dhcp-10-66-13-41 ~]# ifconfig eth0 down
> e1000e: eth0 NIC Link is Down
> [root@dhcp-10-66-13-41 ~]# ifconfig eth0 up
> Unhandled fault: synchronous external abort (0x96000010) at 0xffffff8002180000
> Internal error: : 96000010 [#1] PREEMPT SMP
> Modules linked in:
> CPU: 5 PID: 946 Comm: ifconfig Not tainted 4.0.0-rc6+ #6
> Hardware name: APM X-Gene Merlin board (DT)
> task: ffffffc7da3ec200 ti: ffffffc0f9afc000 task.ti: ffffffc0f9afc000
> PC is at e1000e_set_rx_mode+0x13c/0x328
> LR is at e1000e_set_rx_mode+0xdc/0x328
> pc : [<ffffffc00041e2b8>] lr : [<ffffffc00041e258>] pstate: 60000145
> sp : ffffffc0f9affbc0
> x29: ffffffc0f9affbc0 x28: ffffffc7d9cfa800
> x27: ffffffc7d9cfa800 x26: ffffffc7dc200000
> x25: ffffffc7dc0bc000 x24: 0000000000000000
> x23: 0000000004008002 x22: ffffffc7dc200000
> x21: ffffffc7dc200d38 x20: 0000000000005404
> x19: 0000000000000000 x18: 0000000000000000
> x17: 00000000004ddd88 x16: ffffffc00019d978
> x15: ffffffffffffffff x14: 0ffffffffffffffe
> x13: 0000000000000020 x12: 0000000000000000
> x11: 000000000000002c x10: 0000000000000008
> x9 : 0000000000000000 x8 : ffffffc7dc201040
> x7 : 0000000000000000 x6 : 000000000000003f
> x5 : 0000000000000040 x4 : 0000000000000000
> x3 : 00000000000016c9 x2 : ffffff8002180000
> x1 : ffffff8002180100 x0 : 0000000600114ba9
>
> Process ifconfig (pid: 946, stack limit = 0xffffffc0f9afc028)
> Stack: (0xffffffc0f9affbc0 to 0xffffffc0f9b00000)
> fbc0: f9affc20 ffffffc0 0050a3b0 ffffffc0 dc200000 ffffffc7 006194a8 ffffffc0
> fbe0: 006194a8 ffffffc0 dc200048 ffffffc7 00001002 00000000 0000007f 00000000
> fc00: dc0bc000 ffffffc7 dc200000 ffffffc7 f9affc60 ffffffc0 0050a4dc ffffffc0
> fc20: f9affc40 ffffffc0 0050a414 ffffffc0 dc20025c ffffffc7 dc200000 ffffffc7
> fc40: f9affc60 ffffffc0 0050a4f8 ffffffc0 dc200000 ffffffc7 00000000 00000000
> fc60: f9affca0 ffffffc0 0050a7b0 ffffffc0 dc200000 ffffffc7 00001043 00000000
> fc80: 00000041 00000000 00000000 00000000 dc200000 ffffffc7 dc200000 ffffffc7
> fca0: f9affce0 ffffffc0 0050a894 ffffffc0 dc200000 ffffffc7 00000000 00000000
> fcc0: 00000000 00000000 00001002 00000000 00000000 00000000 00000000 00000000
> fce0: f9affd10 ffffffc0 00564be0 ffffffc0 dc0bc010 ffffffc7 00000000 00000000
> fd00: d08cc3e8 0000007f 00008914 00000000 f9affdb0 ffffffc0 0056635c ffffffc0
> fd20: d08cc3e8 0000007f 00008914 00000000 dc5e0930 ffffffc7 00000003 00000000
> fd40: d08cc3e8 0000007f 00008914 00000000 0000011a 00000000 0000001d 00000000
> fd60: 005c7000 ffffffc0 f9afc000 ffffffc0 0000011a 00000000 dc0bc010 ffffffc7
> fd80: 005c7000 ffffffc0 30687465 00000000 00000000 00000000 7ecb1043 0000007f
> fda0: d08ccc7c 0000007f d08cc618 0000007f f9affdc0 ffffffc0 004ee040 ffffffc0
> fdc0: f9affde0 ffffffc0 004ee848 ffffffc0 00008914 00000000 d08cc3e8 0000007f
> fde0: f9affe10 ffffffc0 001a7e3c ffffffc0 d9d76e00 ffffffc7 d08cc3e8 0000007f
> fe00: dc5e0930 ffffffc7 00000003 00000000 f9affe90 ffffffc0 001a8118 ffffffc0
> fe20: 00000000 00000000 d9d76e00 ffffffc7 d9d76e00 ffffffc7 00000003 00000000
> fe40: f9affe70 ffffffc0 001b2d1c ffffffc0 00000003 00000000 00000002 00000000
> fe60: 00000003 00000000 d9d76e00 ffffffc7 f9affe80 ffffffc0 001b255c ffffffc0
> fe80: f9affe90 ffffffc0 001a80d4 ffffffc0 d08cc4c0 0000007f 00085c30 ffffffc0
> fea0: 00000000 00000000 d08ccc8a 0000007f ffffffff ffffffff 7ebc9a0c 0000007f
> fec0: 00000000 00000000 00000015 00000000 00000003 00000000 00008914 00000000
> fee0: d08cc3e8 0000007f 7ed432d3 0000007f 00000004 00000000 d08ccc95 0000007f
> ff00: 01010100 81010101 ae58f47b 00000000 0000001d 00000000 98969991 978b9aff
> ff20: 01010101 01010101 00000008 00000000 00000006 00000000 000003f3 00000000
> ff40: 7eb07a68 0000007f 0000578b 00000000 7ed51670 0000007f 7ebc9a00 0000007f
> ff60: 00000000 00000000 d08cc3e8 0000007f d08ccc8a 0000007f 00000004 00000000
> ff80: d08cc628 0000007f 7ed4f1c8 0000007f 00000003 00000000 d08cc620 0000007f
> ffa0: 00000000 00000000 d08cc3c0 0000007f d08cc628 0000007f d08cc4c0 0000007f
> ffc0: 7ecbdac0 0000007f d08cc300 0000007f 7ebc9a0c 0000007f 00000000 00000000
> ffe0: 00000003 00000000 0000001d 00000000 00000000 00000000 00000000 00000000
> Call trace:
> [<ffffffc00041e2b8>] e1000e_set_rx_mode+0x13c/0x328
> [<ffffffc00050a3ac>] __dev_set_rx_mode+0x48/0x8c
> [<ffffffc00050a410>] dev_set_rx_mode+0x20/0x38
> [<ffffffc00050a4f4>] __dev_open+0xcc/0x120
> [<ffffffc00050a7ac>] __dev_change_flags+0x88/0x150
> [<ffffffc00050a890>] dev_change_flags+0x1c/0x5c
> [<ffffffc000564bdc>] devinet_ioctl+0x644/0x6f0
> [<ffffffc000566358>] inet_ioctl+0x88/0xb0
> [<ffffffc0004ee03c>] sock_do_ioctl.constprop.42+0x1c/0x5c
> [<ffffffc0004ee844>] sock_ioctl+0x204/0x2c8
> [<ffffffc0001a7e38>] do_vfs_ioctl+0x360/0x5bc
> [<ffffffc0001a8114>] SyS_ioctl+0x80/0x98
> Code: a94363f7 a9446bf9 a8c67bfd d65f03c0 (b9400042)
> ---[ end trace 94a85a23e747c00c ]---
> Kernel panic - not syncing: Fatal exception in interrupt
> CPU4: stopping
> CPU: 4 PID: 0 Comm: swapper/4 Tainted: G D 4.0.0-rc6+ #6
> Hardware name: APM X-Gene Merlin board (DT)
> Call trace:
> [<ffffffc0000899f4>] dump_backtrace+0x0/0x124
> [<ffffffc000089b28>] show_stack+0x10/0x1c
> [<ffffffc0005b8580>] dump_stack+0x80/0xc4
> [<ffffffc0000902e0>] handle_IPI+0x184/0x194
> [<ffffffc000082460>] gic_handle_irq+0x78/0x80
> Exception stack(0xffffffc7dc9dbe00 to 0xffffffc7dc9dbf20)
> be00: 0080a000 ffffffc0 0080a5d8 ffffffc0 dc9dbf40 ffffffc7 000869c4 ffffffc0
> be20: 00000000 00000000 00000000 00000000 00000001 00000000 00000000 00000000
> be40: 00000000 00000000 ff795000 00000007 00000000 000aba95 50e55d3c 0000000d
> be60: dc9ca670 ffffffc7 dc9dbea0 ffffffc7 000003ff 00000000 0002d9eb 00000000
> be80: 00000018 00000000 e8000000 00000003 00000000 00000000 18000000 000b2341
> bea0: 00196c84 ffffffc0 8969a640 0000007f 00000000 00000000 0080a000 ffffffc0
> bec0: 0080a5d8 ffffffc0 00000000 00000000 00000000 00000000 007fae20 ffffffc0
> bee0: 007fbb00 ffffffc0 005cf638 ffffffc0 dc9dbf50 ffffffc7 dc9d8000 ffffffc7
> bf00: 0080a000 ffffffc0 dc9dbf40 ffffffc7 000869c0 ffffffc0 dc9dbf40 ffffffc7
> [<ffffffc0000855a4>] el1_irq+0x64/0xd8
> [<ffffffc0000e3b1c>] cpu_startup_entry+0x204/0x288
> [<ffffffc00008fdb4>] secondary_start_kernel+0x114/0x124
> =====================================================================
>
> The issue also happens with E1000E card with chipset 82572EI. The kernel I am
> using is 4.0.0-rc6+ (from kernel.org git) with E1000E driver version
> 2.3.2-k:
> e1000e: Intel(R) PRO/1000 Network Driver - 2.3.2-k
> e1000e: Copyright(c) 1999 - 2014 Intel Corporation.
> e1000e 0000:01:00.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
> e1000e 0000:01:00.0 0000:01:00.0 (uninitialized): Failed to initialize MSI-X interrupts. Falling back to MSI interrupts.
> e1000e 0000:01:00.0 0000:01:00.0 (uninitialized): Failed to initialize MSI interrupts. Falling back to legacy interrupts.
> e1000e 0000:01:00.0 eth0: registered PHC clock
> e1000e 0000:01:00.0 eth0: (PCI Express:2.5GT/s:Width x1) 00:1b:21:4e:95:76
> e1000e 0000:01:00.0 eth0: Intel(R) PRO/1000 Network Connection
> e1000e 0000:01:00.0 eth0: MAC: 3, PHY: 8, PBA No: E46981-003
>
> I see in the E1000E driver (file drivers/net/ethernet/intel/e1000e/netdev.c)
> this code snippet:
> void __ew32(struct e1000_hw *hw, unsigned long reg, u32 val)
> {
>         if (hw->adapter->flags2 & FLAG2_PCIM2PCI_ARBITER_WA)
>                 __ew32_prepare(hw);
>
>         writel(val, hw->hw_addr + reg);
> }
>
> If I enable the flag FLAG2_PCIM2PCI_ARBITER_WA to avoid the collision in MAC
> CSR accesses between the CPU PCIe controller and the Manageability Engine for
> 82572EI and 82574L chipsets then the issue goes away.
> I don't see this flag enabled for 82572EI and 82574L chipsets in the current
> 4.0 kernel driver, so I am not sure if this is a proper fix or not.
>
> Please let me know if you have some other ideas about how I should debug this
> issue.
>
> Thanks and regards,
> Duc Dang.