When unloading and reloading the driver in a loop with
    modprobe -r i40e ; modprobe i40e
the driver fails to probe after a few cycles and logs the following:
[  160.171944] i40e 0000:07:00.1 eth7: adding 68:05:ca:2a:3a:41 vid=0
[  161.271487] i40e 0000:07:00.1: set phy mask fail, aq_err -54
[  161.685505] i40e 0000:07:00.0 eth6: NIC Link is Down
[  161.873172] i40e 0000:07:00.1: link restart failed, aq_err=0
[  162.401255] i40e 0000:07:00.1: PCI-Express: Speed 8.0GT/s Width x8
[  162.710082] i40e 0000:07:00.0: add filter failed, err -54, aq_err 0
[  162.930801] i40e 0000:07:00.1: get phy abilities failed, aq_err -54, advertised speed settings may not be correct
[  162.977599] i40e 0000:07:00.1: Features: PF-id[1] VFs: 32 VSIs: 34 QP: 32 RX: PS RSS FD_ATR FD_SB NTUPLE PTP
[  163.238624] i40e 0000:07:00.0 eth6: NIC Link is Down
[  163.244566] i40e 0000:07:00.2: Initial pf_reset failed: -15
[  163.244607] i40e: probe of 0000:07:00.2 failed with error -15
[  163.464911] i40e 0000:07:00.3: Initial pf_reset failed: -15
[  163.490747] i40e: probe of 0000:07:00.3 failed with error -15
[  163.518932] i40e 0000:07:00.1: i40e_ptp_stop: removed PHC on eth7
[  163.746713] i40e 0000:07:00.1 eth7: NIC Link is Down
[  164.270164] i40e 0000:07:00.1: add filter failed, err -54, aq_err 0
[...]
[  184.462907] i40e: Copyright (c) 2013 - 2014 Intel Corporation.
[  184.711290] i40e 0000:07:00.0: Initial pf_reset failed: -15
[  184.736457] i40e: probe of 0000:07:00.0 failed with error -15
[  184.983109] i40e 0000:07:00.1: Initial pf_reset failed: -15
[  185.009354] i40e: probe of 0000:07:00.1 failed with error -15
[  185.256612] i40e 0000:07:00.2: Initial pf_reset failed: -15
[  185.281990] i40e: probe of 0000:07:00.2 failed with error -15
[  185.529085] i40e 0000:07:00.3: Initial pf_reset failed: -15
[  185.555094] i40e: probe of 0000:07:00.3 failed with error -15

This is followed by an NMI:

[  188.178408] NMI: IOCK error (debug interrupt?) for reason 71 on CPU 0.
[  188.214709] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.19.0+ #81
[  188.245187] Hardware name: HP ProLiant DL360p Gen8, BIOS P71 08/02/2014
[  188.276847] task: ffffffff81e13480 ti: ffffffff81e00000 task.ti: ffffffff81e00000
[  188.313671] RIP: 0010:[<ffffffff8100d45b>]  [<ffffffff8100d45b>] default_idle+0x1b/0xb0
[  188.351779] RSP: 0018:ffffffff81e03ea8  EFLAGS: 00000246
[  188.377118] RAX: 0000000000000000 RBX: ffffffff81e00010 RCX: 0000000000000000
[  188.412311] RDX: ffffffff81e00000 RSI: 0000000000000000 RDI: 0000000000000000
[  188.448563] RBP: ffffffff81e03eb8 R08: 0000000000000000 R09: 00000000fffe4047
[  188.482137] R10: ffffffff81a0e045 R11: 0000000000000000 R12: 0000000000000000
[  188.518089] R13: ffffffff81efd970 R14: ffffffff81e00010 R15: 0000000000000000
[  188.553382] FS:  0000000000000000(0000) GS:ffff880237a00000(0000) knlGS:0000000000000000
[  188.594583] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  188.621056] CR2: 00007fbcb561bc88 CR3: 0000000235966000 CR4: 00000000001406f0
[  188.656549] Stack:
[  188.665693]  ffffffff81e00010 ffffffff81e00010 ffffffff81e03ec8 ffffffff8100cc3a
[  188.700062]  ffffffff81e03f48 ffffffff810884b7 ffffffff81e13480 ffff880236538910
[  188.734638]  ffffffff81e00000 ffffffff81e00010 ffffffff81e00010 ffffffff81e00000
[  188.773067] Call Trace:
[  188.784412]  [<ffffffff8100cc3a>] arch_cpu_idle+0xa/0x10
[  188.808717]  [<ffffffff810884b7>] cpu_startup_entry+0x227/0x3b0
[  188.837221]  [<ffffffff819d0a52>] rest_init+0x72/0x80
[  188.860698]  [<ffffffff81f201bd>] start_kernel+0x41b/0x428
[  188.887669]  [<ffffffff81f1fbc0>] ? set_init_arg+0x5d/0x5d
[  188.914359]  [<ffffffff81f1f5ad>] x86_64_start_reservations+0x2a/0x2c
[  188.945125]  [<ffffffff81f1f700>] x86_64_start_kernel+0x151/0x158
[  188.972480] Code: c0 48 83 c8 08 0f 22 c0 eb ce 66 0f 1f 44 00 00 55 8b 05 a1 a8 ec 00 48 89 e5 41 54 65 44 8b 25 cc cc ff 7e 85 c0 53 7f 19 fb f4 <8b> 05 87 a8 ec 00 65 44 8b 25 b7 cc ff 7e 85 c0 7f 44 5b 41 5c


I've tracked this down to the following hunk from this commit:
commit cafa2ee6fbb1bbc2fecdeef990858d56646fc1bd
Author: Anjali Singhai Jain <anjali.sing...@intel.com>
Date:   Sat Sep 13 07:40:45 2014 +0000

    i40e: Fix a bug where Rx would stop after some time
[...]
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index f7464e8..ff6d94d 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
[...]
@@ -9169,6 +9178,13 @@ static int i40e_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
        if (err)
                dev_info(&pf->pdev->dev, "set phy mask fail, aq_err %d\n", err);

+       msleep(75);
+       err = i40e_aq_set_link_restart_an(&pf->hw, true, NULL);
+       if (err) {
+               dev_info(&pf->pdev->dev, "link restart failed, aq_err=%d\n",
+                        pf->hw.aq.asq_last_status);
+       }
+
        /* The main driver is (mostly) up and happy. We need to set this state
         * before setting up the misc vector or we get a race and the vector
         * ends up disabled forever.

With this hunk removed, the driver unloaded and reloaded successfully a
couple of hundred times. Would it be safe to simply remove this hunk?
I haven't seen any negative effects from removing it so far.
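
For anyone who wants to reproduce this, the reload loop can be scripted
roughly as below. This is only a minimal sketch, not the exact script used
here: the function names, the 5 second settle time and the default of 100
cycles are assumptions, and it relies on "dmesg -C" from util-linux to clear
the kernel ring buffer before each cycle. It has to run as root.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and print how many
# i40e probe failures ("i40e: probe of ... failed with error ...") it has.
count_probe_failures() {
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c 'i40e: probe of .* failed with error' || true
}

# Hypothetical helper: reload i40e up to $1 times (default 100, an
# assumption) and stop at the first cycle that logs a probe failure.
reload_loop() {
    max=${1:-100}
    i=1
    while [ "$i" -le "$max" ]; do
        dmesg -C                # start each cycle with an empty ring buffer
        modprobe -r i40e
        modprobe i40e
        sleep 5                 # assumption: time for all ports to probe
        failures=$(dmesg | count_probe_failures)
        if [ "$failures" -gt 0 ]; then
            echo "probe failed on cycle $i ($failures failures logged)"
            return 1
        fi
        i=$((i + 1))
    done
    echo "completed $max cycles without a probe failure"
}
```

Sourcing the file and calling "reload_loop 200" would run up to 200 cycles;
the loop returns non-zero as soon as a failed probe shows up in dmesg.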

  Stefan

_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel