I have determined that Debian was complaining about my Ethernet port
because I had flow control enabled on the switch. The switch was easily
overwhelmed and would hang, so the resets Debian performed were valid.
Thank you for the research on this. I think you can close this case.
On Wed, Mar 22, 2017 at 02:42:30AM +0000, Ben Hutchings wrote:
> Control: retitle -1 TX watchdog fires on e1000e interface with flow control
> On Tue, 2017-03-21 at 18:36 -0400, Bruce Momjian wrote:
> > On Tue, Mar 21, 2017 at 04:04:11PM -0400, Bruce Momjian wrote:
> > > I think this proves my problems are related to flow control. How would
> > > you like to proceed? Is there a patch or change you would like me to
> > > test? Just close the ticket?
> > >
> > > I have a fix, but it is likely others would not know they had this
> > > problem unless they were monitoring their kernel logs or their network
> > > traffic for lag.
> > Oh, I should also mention the port that is having problems is connected
> > to a NetGear GS108Ev3 switch, with current firmware, version 2.00.09.
> > The port connected to my Actiontec FIOS router is not having problems.
> I don't know about any specific bug, but if the switch sends flow
> control XOFF frames continually for long enough (usually 5 seconds)
> this will trigger the TX watchdog.
> It sounds like your switch implements flow control properly (some
> broken switches auto-negotiate it but actually flood flow control
> frames). However, if a device on some other port (that also has flow
> control enabled) sends XOFF frames continually *and* your server sends
> frames that should go to that other port, the switch will do the same
> to the server once the switch's internal queue has filled up.
> If the switch has port statistics including numbers of pause frames
> then you can see where they are coming from, but I think it doesn't.
> Without that information it's going to be hard to tell exactly where
> the fault lies.
> The e1000e driver *does* have statistics for pause frames transmitted
> and received (run: "ethtool -S eth0 | grep flow_control"). If you log
> these every second then it should be possible to see what happens
> around the time the TX watchdog fires. That could provide some clues
> as to whether the NIC is behaving correctly.
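
For anyone who wants to try the per-second logging suggested above, here
is a minimal sketch. It assumes "ethtool -S" prints one "name: value"
pair per line and that the pause-frame counters have "flow_control" in
their names, as the grep above implies. The function names
parse_flow_control and log_flow_control are made up for this example,
and the interface name is whatever you pass on the command line.

```python
import subprocess
import sys
import time


def parse_flow_control(stats_text):
    """Return {counter_name: value} for lines whose name contains 'flow_control'."""
    counters = {}
    for line in stats_text.splitlines():
        name, sep, value = line.strip().partition(":")
        if sep and "flow_control" in name:
            try:
                counters[name.strip()] = int(value)
            except ValueError:
                pass  # skip anything that is not a plain integer counter
    return counters


def log_flow_control(iface, interval=1.0, samples=None):
    """Print timestamped counters and their change since the previous sample."""
    previous = {}
    taken = 0
    while samples is None or taken < samples:
        # Same command as in the message above, minus the grep.
        result = subprocess.run(
            ["ethtool", "-S", iface],
            capture_output=True, text=True, check=True,
        )
        current = parse_flow_control(result.stdout)
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        fields = " ".join(
            f"{name}={value}(+{value - previous.get(name, value)})"
            for name, value in sorted(current.items())
        )
        print(stamp, fields, flush=True)
        previous = current
        taken += 1
        time.sleep(interval)


# Only starts logging when run directly with an interface name,
# e.g. "python3 fclog.py eth0" (the script name is arbitrary).
if __name__ == "__main__" and len(sys.argv) == 2:
    log_flow_control(sys.argv[1])
```

Redirecting the output to a file and comparing its timestamps with the
watchdog messages in the kernel log should show whether the XOFF
counters were climbing in the seconds before each reset.
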
> Ben Hutchings
> Power corrupts. Absolute power is kind of neat.
> - John Lehman, Secretary of the US Navy
Bruce Momjian <br...@momjian.us> http://momjian.us
+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +