Hi,

I tried turning off EEE, but sadly it didn't change anything.  I was going
to upload these files to a bug on SourceForge, but I didn't see one open for
this issue and it would not let me create one.

The logs are here, for both good and bad boots: http://repo.ezzi.net/igblog/

The logs I posted are from this board:
http://www.supermicro.com/products/motherboard/Atom/X10/A1SRi-2758F.cfm
I previously saw the same issue on this board:
http://www.supermicro.com/products/motherboard/Xeon/C220/X10SL7-F.cfm
I no longer run Linux on that one, and it works fine under FreeBSD, but when
I noticed the issue on the Xeon board it was running Debian 7.5.

To reproduce it, just put the latest OpenWRT on the machine with a static IP
set on the first card, then reboot the machine a few times; 40-60% of the
time the NIC will fail to come up.
FreeBSD worked fine on this machine also.  The BIOS is updated to the latest
version.  The two machines are also on completely different switches, so the
switch is definitely not the cause.

Thanks


-----Original Message-----
From: Fujinaka, Todd [mailto:todd.fujin...@intel.com] 
Sent: Monday, January 11, 2016 11:22 AM
To: Fujinaka, Todd <todd.fujin...@intel.com>; Len White <lwh...@nrw.ca>
Cc: e1000-devel@lists.sourceforge.net
Subject: RE: C2758 i354 link issue

Oh, and someone just reminded me that this might be an EEE interoperability
issue. Have you tried turning off EEE?
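
For anyone following the thread, here is a minimal sketch of what "turning
off EEE" could look like, assuming the driver exposes EEE through ethtool's
standard --show-eee / --set-eee options. The helper name eee_commands and the
interface name eth0 are placeholders for illustration, not something taken
from the driver or from this thread.

```python
def eee_commands(iface):
    """Build (but do not run) ethtool command lines for EEE on one interface.

    Assumption: the NIC driver supports ethtool's EEE get/set operations.
    """
    return {
        # Query the current Energy-Efficient Ethernet state.
        "show": ["ethtool", "--show-eee", iface],
        # Disable EEE on the interface.
        "disable": ["ethtool", "--set-eee", iface, "eee", "off"],
    }


if __name__ == "__main__":
    for name, argv in eee_commands("eth0").items():
        print(name, "->", " ".join(argv))
```

Each list can be passed to subprocess.run(argv, check=True) as root to
actually apply it; the sketch only builds the commands so it can be read and
checked without touching the hardware.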

Todd Fujinaka
Software Application Engineer
Networking Division (ND)
Intel Corporation
todd.fujin...@intel.com
(503) 712-4565


-----Original Message-----
From: Fujinaka, Todd [mailto:todd.fujin...@intel.com] 
Sent: Monday, January 11, 2016 6:30 AM
To: Len White
Cc: e1000-devel@lists.sourceforge.net
Subject: Re: [E1000-devel] C2758 i354 link issue

While that's useful, we would also like reproduction steps. We'll need exact
hardware and software setups and the steps you took to get there. Send the
full dmesg, "lspci -vvv", eeprom dump, and registers dumps of both the
failed and passing cases.

It would be easiest if you send those as separate files and attach them to a
bug opened on sourceforge.

Thanks.

Todd Fujinaka
Software Application Engineer
Networking Division (ND)
Intel Corporation
todd.fujin...@intel.com
(503) 712-4565

From: Len White [mailto:lwh...@nrw.ca]
Sent: Sunday, January 10, 2016 10:04 PM
To: Fujinaka, Todd
Cc: e1000-devel@lists.sourceforge.net
Subject: RE: C2758 i354 link issue

So I've compiled the driver with debug mode turned on, and I have two boot
logs and a log of the workaround to share with you.  The first one is from a
boot where the link comes up normally without any workarounds; the second is
from a boot where it fails.

Good boot:  http://repo.ezzi.net/igb1.txt
Fail boot:  http://repo.ezzi.net/igb2.txt

For the most part they are very similar, until it hits
e1000_check_downshift_generic, and then it's totally different.  The time it
worked correctly it hit:

"e1000_check_downshift_generic"

e1000_put_hw_semaphore_generic
e1000_read_phy_reg_mdic
e1000_release_phy_82575
e1000_release_swfw_sync_82575
e1000_get_hw_semaphore_generic
e1000_put_hw_semaphore_generic
e1000_check_downshift_generic
e1000_read_phy_reg_82580
e1000_acquire_phy_82575
e1000_acquire_swfw_sync_82575

VS:

e1000_put_hw_semaphore_generic
e1000_read_phy_reg_mdic
e1000_release_phy_82575
e1000_release_swfw_sync_82575
e1000_get_hw_semaphore_generic
e1000_put_hw_semaphore_generic
<this is where it changes on the failure boot>
e1000_check_for_link_82575
e1000_check_for_copper_link
e1000_phy_has_link_generic
e1000_read_phy_reg_82580
e1000_acquire_phy_82575
e1000_acquire_swfw_sync_82575
e1000_get_hw_semaphore_generic

and it will sit here looping forever:

[ 1513.308318] e1000_put_hw_semaphore_generic
[ 1513.313244] e1000_read_phy_reg_82580
[ 1513.317666] e1000_acquire_phy_82575
[ 1513.321996] e1000_acquire_swfw_sync_82575
[ 1513.326851] e1000_get_hw_semaphore_generic
[ 1513.331798] e1000_put_hw_semaphore_generic
[ 1513.336732] e1000_read_phy_reg_mdic
[ 1513.341118] e1000_release_phy_82575
[ 1513.345449] e1000_release_swfw_sync_82575
[ 1513.350303] e1000_get_hw_semaphore_generic
[ 1513.355250] e1000_put_hw_semaphore_generic
[ 1514.233267] e1000_check_for_link_82575
[ 1514.237881] e1000_check_for_copper_link
[ 1514.242558] e1000_phy_has_link_generic
[ 1514.247153] e1000_read_phy_reg_82580
[ 1514.251571] e1000_acquire_phy_82575
[ 1514.255895] e1000_acquire_swfw_sync_82575
[ 1514.260753] e1000_get_hw_semaphore_generic
[ 1514.265699] e1000_put_hw_semaphore_generic

Over and over and over.  If I apply my workaround with ethtool, here is the
output of how it changes:
http://repo.ezzi.net/igb_workaround.txt

and then it works normally.  You can also see e1000_check_downshift_generic
gets called this time.
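
The comparison described above (finding where the good and failed boot logs
stop matching) can be sketched in a few lines. This is only an illustration:
it assumes each debug line is a dmesg timestamp followed by a bare e1000_*
function name, as in the excerpts quoted in this message, and the helper
names trace_from_dmesg and first_divergence are made up for the example.

```python
import re
from itertools import zip_longest

# Matches "[ 1513.308318] e1000_put_hw_semaphore_generic", including the
# case where mail wrapping put the timestamp and the name on separate lines.
_ENTRY = re.compile(r"\[\s*\d+\.\d+\]\s+(e1000_\w+)")


def trace_from_dmesg(text):
    """Return the ordered list of e1000_* function names found in a log."""
    return _ENTRY.findall(text)


def first_divergence(good, bad):
    """Return (index, good_entry, bad_entry) at the first mismatch, or None."""
    for index, (g, b) in enumerate(zip_longest(good, bad)):
        if g != b:
            return index, g, b
    return None


# Function names copied from the two excerpts quoted earlier in this message.
common = [
    "e1000_put_hw_semaphore_generic",
    "e1000_read_phy_reg_mdic",
    "e1000_release_phy_82575",
    "e1000_release_swfw_sync_82575",
    "e1000_get_hw_semaphore_generic",
    "e1000_put_hw_semaphore_generic",
]
good_trace = common + ["e1000_check_downshift_generic",
                       "e1000_read_phy_reg_82580"]
bad_trace = common + ["e1000_check_for_link_82575",
                      "e1000_check_for_copper_link"]

if __name__ == "__main__":
    print(first_divergence(good_trace, bad_trace))
```

With the full logs saved locally, trace_from_dmesg(open(path).read()) would
produce the two lists to compare; the file paths are up to the reader.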

Just as a test, in e1000_mac.c I commented this out:
//      if (!link)
//              return E1000_SUCCESS; /* No link detected */

And it did allow the network card to come up on boot on what otherwise would
have been a failed boot, but there was still an interruption in the link (it
went down, then it came back up), and every 60-90 seconds it would go down
again for 10-15 seconds in a loop.

Hopefully this helps narrow it down.

From: Fujinaka, Todd [mailto:todd.fujin...@intel.com]
Sent: Sunday, January 10, 2016 8:17 PM
To: Len White <lwh...@nrw.ca>
Cc: e1000-devel@lists.sourceforge.net
Subject: RE: C2758 i354 link issue

Kristian didn't reply to the mailing list but if you check e1000-bugs you'll
see he resolved his issues. I'm still not clear on what the problem was, but
you can check there for more details.

In any case, I'm guessing your problem isn't the same as his problem. I
think I asked him to try a different distro and I don't think he did that.
I'm still not convinced this isn't a problem with OpenWRT but then again
you've never said what OS you're running. If it is OpenWRT, have you tried a
different distro?

Todd Fujinaka
Software Application Engineer
Networking Division (ND)
Intel Corporation
todd.fujin...@intel.com
(503) 712-4565

From: Len White [mailto:lwh...@nrw.ca]
Sent: Sunday, January 10, 2016 3:01 PM
To: Fujinaka, Todd
Cc: e1000-devel@lists.sourceforge.net
Subject: C2758 i354 link issue

Hi,

I'm having the same problem as Kristian is with his igb link issue.  I also
have a different machine with an i210 card in it that has the same issue on
Linux ONLY.  FreeBSD works fine.
There is about a 40% chance on boot that the issue will happen (but I did
find a workaround).  I believe it has something to do with the
auto-negotiation code.  When the issue is triggered, I can either
unplug/replug the cable, or I can force the link speed/duplex, and the link
will come back up.
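
As a rough sketch of how the second workaround could be scripted: the code
below parses the text printed by "ethtool <iface>" and, if no link is
detected, builds the same "ethtool -s ... speed 1000 duplex full" command
used in this message. The function names parse_ethtool and
workaround_command are illustrative only, and the sample text is a trimmed
copy of the failing output quoted in this message.

```python
def parse_ethtool(text):
    """Parse the 'Key: value' lines of `ethtool <iface>` output into a dict.

    Continuation lines without a colon (extra link modes) are ignored.
    """
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and value.strip():
            settings[key.strip()] = value.strip()
    return settings


def workaround_command(settings, iface="eth0", speed=1000, duplex="full"):
    """Return the ethtool argv that forces speed/duplex, or None if link is up."""
    if settings.get("Link detected") == "no":
        return ["ethtool", "-s", iface, "speed", str(speed), "duplex", duplex]
    return None


# Trimmed copy of the output from a boot where the issue occurred.
failed_boot = """Settings for eth0:
        Speed: Unknown!
        Duplex: Unknown! (255)
        Auto-negotiation: on
        Link detected: no
"""

if __name__ == "__main__":
    print(workaround_command(parse_ethtool(failed_boot)))
```

To apply it for real, the returned list could be handed to
subprocess.run(cmd, check=True) as root after capturing the live output of
"ethtool eth0"; the sketch stops short of that so it can be run anywhere.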

Ethtool on a boot with issue:

root@moffat-gw / # ethtool eth0
Settings for eth0:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Supported pause frame use: Symmetric
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Advertised pause frame use: Symmetric
        Advertised auto-negotiation: Yes
        Speed: Unknown!
        Duplex: Unknown! (255)
        Port: Twisted Pair
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: on
        MDI-X: off (auto)
        Supports Wake-on: pumbg
        Wake-on: g
        Current message level: 0x00000007 (7)
                               drv probe link
        Link detected: no
root@moffat-gw / # ethtool -s eth0 speed 1000 duplex full
root@moffat-gw / # ethtool eth0
[   59.285152] igb 0000:00:14.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
[   59.295788] br-lan: port 1(eth0) entered forwarding state
[   59.302438] br-lan: port 1(eth0) entered forwarding state
[   59.309274] IPv6: ADDRCONF(NETDEV_CHANGE): br-lan: link becomes ready

Settings for eth0:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Supported pause frame use: Symmetric
        Supports auto-negotiation: Yes
        Advertised link modes:  1000baseT/Full
        Advertised pause frame use: Symmetric
        Advertised auto-negotiation: Yes
        Speed: 1000Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 0
        Transceiver: internal
        Auto-negotiation: on
        MDI-X: off (auto)
        Supports Wake-on: pumbg
        Wake-on: g
        Current message level: 0x00000007 (7)
                               drv probe link
        Link detected: yes
root@moffat-gw / #  ethtool -i eth0
driver: igb
version: 5.3.3.5
firmware-version: 0.0.0
expansion-rom-version:
bus-info: 0000:00:14.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no
root@moffat-gw / # uname -a
Linux moffat-gw 4.1.15 #12 SMP Sun Jan 10 22:09:02 EST 2016 x86_64 GNU/Linux
root@moffat-gw / # ethtool -S eth0
NIC statistics:
     rx_packets: 12463
     tx_packets: 9238
     rx_bytes: 2548091
     tx_bytes: 1699471
     rx_broadcast: 191
     tx_broadcast: 6
     rx_multicast: 3073
     tx_multicast: 91
     multicast: 3073
     collisions: 0
     rx_crc_errors: 0
     rx_no_buffer_count: 0
     rx_missed_errors: 0
     tx_aborted_errors: 0
     tx_carrier_errors: 0
     tx_window_errors: 0
     tx_abort_late_coll: 0
     tx_deferred_ok: 0
     tx_single_coll_ok: 0
     tx_multi_coll_ok: 0
     tx_timeout_count: 0
     rx_long_length_errors: 0
     rx_short_length_errors: 0
     rx_align_errors: 0
     tx_tcp_seg_good: 0
     tx_tcp_seg_failed: 0
     rx_flow_control_xon: 0
     rx_flow_control_xoff: 0
     tx_flow_control_xon: 0
     tx_flow_control_xoff: 0
     rx_long_byte_count: 2548091
     tx_dma_out_of_sync: 0
     lro_aggregated: 0
     lro_flushed: 0
     tx_smbus: 3068
     rx_smbus: 3106
     dropped_smbus: 0
     os2bmc_rx_by_bmc: 2
     os2bmc_tx_by_bmc: 3068
     os2bmc_tx_by_host: 2
     os2bmc_rx_by_host: 3068
     tx_hwtstamp_timeouts: 0
     rx_hwtstamp_cleared: 0
     rx_errors: 0
     tx_errors: 0
     tx_dropped: 0
     rx_length_errors: 0
     rx_over_errors: 0
     rx_frame_errors: 0
     rx_fifo_errors: 0
     tx_fifo_errors: 0
     tx_heartbeat_errors: 0
     tx_queue_0_packets: 6170
     tx_queue_0_bytes: 400752
     tx_queue_0_restart: 0
     rx_queue_0_packets: 12434
     rx_queue_0_bytes: 2916614
     rx_queue_0_drops: 0
     rx_queue_0_csum_err: 0
     rx_queue_0_alloc_failed: 0


_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit
http://communities.intel.com/community/wired
