Thanks Jeff,
I don't think I have been in contact with Don before.
Shall I send the requested information to you directly? I doubt that a large
attachment will make it through to the list.
So that this mail is not just noise on the list, here is some information
about the system:
Kernel: 2.6.32.36
ethtool -i eth4
driver: ixgbe
version: 2.0.44-k2
firmware-version: 1.8-0
bus-info: 0000:0a:00.0
The full lspci -vvv output is very large, so only the entry for the affected
port follows. Note that in the error state no traffic is accepted at all, so
the PCIe speed mentioned in the datasheet is not the limiting factor here.
0a:00.0 Ethernet controller: Intel Corporation 82599EB 10-Gigabit Network Connection (rev 01)
        Subsystem: Unknown device 1b6d:00a0
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 40
        Region 0: Memory at df5c0000 (64-bit, non-prefetchable) [size=128K]
        Region 2: I/O ports at ecc0 [size=32]
        Region 4: Memory at df5b8000 (64-bit, non-prefetchable) [size=16K]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold-)
                Status: D0 PME-Enable- DSel=0 DScale=1 PME-
        Capabilities: [50] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
                Address: 0000000000000000 Data: 0000
        Capabilities: [70] MSI-X: Enable+ Mask- TabSize=64
                Vector table: BAR=4 offset=00000000
                PBA: BAR=4 offset=00002000
        Capabilities: [a0] Express Endpoint IRQ 0
                Device: Supported: MaxPayload 512 bytes, PhantFunc 0, ExtTag-
                Device: Latency L0s <512ns, L1 <64us
                Device: AtnBtn- AtnInd- PwrInd-
                Device: Errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
                Device: RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                Device: MaxPayload 256 bytes, MaxReadReq 512 bytes
                Link: Supported Speed unknown, Width x8, ASPM L0s, Port 4
                Link: Latency L0s unlimited, L1 <32us
                Link: ASPM Disabled RCB 64 bytes CommClk- ExtSynch-
                Link: Speed unknown, Width x8
        Capabilities: [e0] Vital Product Data
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [140] Device Serial Number 00-00-00-ff-ff-00-00-00
        Capabilities: [150] Unknown (14)
        Capabilities: [160] Unknown (16)
ethtool -d output, taken twice a few seconds apart (only the registers that
changed are shown, and TX registers are omitted):
Offset   Register (description)                1st read    2nd read
0x00048: FRTIMER (Free Running Timer)          0x55AE1845  0x5613A01A
0x03FA0: mpc0 (Missed Packets Count 0)         0x0000511C  0x00005124
0x0405C: prc64 (Packets Received (64B) Count)  0x000172DF  0x000172E7
0x04078: bprc (Broadcast Packets Rx Count)     0x0000239E  0x000023A3
0x0407C: mprc (Multicast Packets Rx Count)     0x00015BAE  0x00015BB1
0x04088: gorcl (Good Octets Rx Count Low)      0x6882B90F  0x6882BB0F
0x040C0: torl (Total Octets Rx Count Low)      0x688B88FF  0x688B8AFF
0x040D0: tpr (Total Packets Received)          0x0D6326BB  0x0D6326C3
Neither the Receive Descriptor Head nor Tail register changes.
dmesg: Nothing
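
As a rough cross-check of the two ethtool -d snapshots, the following Python
sketch computes the per-register deltas. The register lines are copied from
the dump in this mail (FRTIMER left out, since it is only a timer). The
regular expression and the register_deltas helper are illustrative only and
not part of ethtool, and treating each value as an unsigned 32-bit number is
an assumption.

```python
import re

# Register lines copied from the two `ethtool -d` snapshots in this mail:
# offset, name, description, first read, second read.
DUMP = """\
0x03FA0: mpc0 (Missed Packets Count 0) 0x0000511C 0x00005124
0x0405C: prc64 (Packets Received (64B) Count) 0x000172DF 0x000172E7
0x04078: bprc (Broadcast Packets Rx Count) 0x0000239E 0x000023A3
0x0407C: mprc (Multicast Packets Rx Count) 0x00015BAE 0x00015BB1
0x04088: gorcl (Good Octets Rx Count Low) 0x6882B90F 0x6882BB0F
0x040C0: torl (Total Octets Rx Count Low) 0x688B88FF 0x688B8AFF
0x040D0: tpr (Total Packets Received) 0x0D6326BB 0x0D6326C3
"""

# Hypothetical pattern for one line: "<offset>: <name> (<description>) <hex> <hex>".
LINE = re.compile(
    r"^0x[0-9A-Fa-f]+:\s+(\w+)\s+\(.*\)\s+(0x[0-9A-Fa-f]+)\s+(0x[0-9A-Fa-f]+)$"
)


def register_deltas(text):
    """Return {register name: second read - first read} for each parsed line."""
    deltas = {}
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match is None:
            continue  # skip anything that is not a register line
        name, first, second = match.groups()
        # Assumption: values are unsigned 32-bit, so mask to survive a wrap.
        deltas[name] = (int(second, 16) - int(first, 16)) & 0xFFFFFFFF
    return deltas


deltas = register_deltas(DUMP)
for name, delta in deltas.items():
    print(f"{name:6s} +{delta}")
```

For these snapshots, mpc0 advances by 8, the same amount as tpr and prc64.
The 8 packets are 5 broadcast plus 3 multicast frames of 64 bytes each
(8 x 64 = 512 octets, matching gorcl and torl). That is consistent with every
packet arriving in the interval being counted as missed, although it does not
by itself identify the cause.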
Cheers,
Martin
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Jeff Kirsher
Sent: Thursday, 28 July 2011 13:17
To: Zielinski, Martin; Don Skidmore
Cc: [email protected]
Subject: Re: [E1000-devel] ixgbe: not accepting any packets - increasing
rx_missed_errors
On Thu, Jul 28, 2011 at 01:41, <[email protected]> wrote:
> Hello,
>
> With an 82599EB card, a customer frequently ends up in a state where no
> packets can be received until the network is restarted.
> The symptom is that the rx_missed_errors counter increments for every
> packet, but the kernel no longer sees any packets.
>
> We are using a 2.6.32 kernel with ixgbe driver version 2.0.44-k2.
>
> I am aware that this is an old driver version, but please give me a chance to
> explain why I'm asking for information anyway:
>
> - The driver is part of the 2.6.32 stable branch.
> - It takes 2 - 10 days to reproduce it in the lab. So if we use a newer
> version, we cannot be sure that the problem is fixed just because we don't
> see it anymore.
> - According to the customer, the issue started with an update that adds the
> memory boundary and disables packet split (errata #45). The PSRTYPE register
> is not initialized in this version. Everything worked with the previous
> (even older) driver version.
> - It is a critical customer. If we provide a new version and it fails again
> this will become a problem.
> - All reports about this issue end without a resolution, or with the advice
> to update the driver. I really tried to find an explanation or the exact
> changeset that fixes the issue, but I failed. So for documentation purposes
> it would be a good thing to make the solution googleable.
>
> Don Skidmore wrote in:
>
> http://sourceforge.net/mailarchive/forum.php?thread_name=29F4ED941D916B48B88B4D2A4F3D1B9C01D2E285AF%40orsmsx509.amr.corp.intel.com&forum_name=e1000-devel
>
> "Have you tried using the latest Source Forge driver (3.2.9). Including in
> it was a fix that corrected an erratum that sounds very similar to your
> issue."
>
> I'd greatly appreciate it if someone could point me in the right direction.
> What I'd like to understand is:
>
Don seems to have been working with you, so I will let him continue
assisting you (since he is the ixgbe maintainer).
There have been 15 more recent out-of-tree driver releases since the
one you are using, so it is very possible that the issue you are seeing
was fixed in one of those releases and that the fix was not back-ported
to the older 2.6.32 kernel. If Don does not have the information already,
please provide whatever you can (e.g. kernel config, lspci -vvv output,
dmesg log with the errors you are seeing). This information can help us
greatly in determining which fixes implemented in later versions of the
driver would have an effect on the issue you are seeing. Once we narrow
down the fix(es) that resolve the issue, we can provide additional
information on what the exact change is and why.
With some (not all) fixes, we should have testing scenarios that
consistently reproduce the issue, so that we can accurately
determine whether the fix(es) resolved it. I know that I am
speaking in generalities and nothing specific; this is mainly because
I do not know the exact issue you are having or the possible fixes that
Don is aware of.
I have added Don to this email thread, and will let him work with you
to get the specifics on the issue(s) you are seeing, so that we can
work on getting a resolution to you, whether it be an updated driver
or a patchset against your kernel.
Cheers,
Jeff
> - What change exactly is the fix for this issue?
> - How can I verify that I am seeing the same issue (some special
> register/memory dump/...)?
> - How can I verify that the issue is fixed?
>
> I know - I'm asking for support for a driver that is part of the stable
> kernel but very old in your development line.
> So I would be even happier if someone takes the time to answer my questions.
>
> Cheers,
> Martin
>
> Martin Zielinski
> Dipl. Inform
> Senior Engineer
>
> McAfee GmbH
>
> Firmensitz: Muenchen
> Amtsgericht: AG Muenchen
> Handelsregister: HRB 144340
> Geschaeftsfuehrer: Emmet Russell, Keith Krzeminski, Douglas Rice
> Bankverbindung: ABN-Amro Bank N.V. Konto 671 211 9006
> UST-ID: DE168122444
>
> ------------------------------------------------------------------------------
> Got Input? Slashdot Needs You.
> Take our quick survey online. Come on, we don't ask for help often.
> Plus, you'll get a chance to win $100 to spend on ThinkGeek.
> http://p.sf.net/sfu/slashdot-survey
> _______________________________________________
> E1000-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/e1000-devel
> To learn more about Intel® Ethernet, visit
> http://communities.intel.com/community/wired
>
--
Cheers,
Jeff