On Mon, Jul 30, 2018 at 4:43 PM, Àbéjídé Àyodélé <abejideayod...@gmail.com> wrote:
>> Is it always the OOM errors followed by the Tx timeout?
>
> Yes, I believe I have the dmesg from one of the earlier incidents. I can
> clean that up and make it public if you want.

There shouldn't be any need. What you want to check is whether those logs show the same pattern: OOM errors followed by the rcu_sched warning about a detected CPU stall. If they do, that is the most likely root cause of the Tx hangs being reported.

>> Is it an actual serial connection or is it something like serial over
>> LAN?
>
> Serial over LAN
>
>> Do you know if you have any sort of flow control enabled or
>> anything that might delay displaying the message?
>
> None that I know of
>
>> Also what volume of logs are you sending over the serial interface?
>
> Just kernel logs (dmesg).
>
> Thanks
>
> Abejide Ayodele
> It always seems impossible until it's done. --Nelson Mandela

Well, at this point I am not sure there is much left we can do on our end. What we need is for logging to the serial port to stop triggering the RCU/CPU stall. You might try testing with the serial-over-LAN logging disabled, and look at something like netconsole to see if that resolves the issue. Otherwise, see if you can resolve the OOM condition so that you aren't sending enough logs to the serial console to trigger the stall.

Thanks.

- Alex

_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit http://communities.intel.com/community/wired
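
As a rough aid for the log check described in this message (OOM errors, then the rcu_sched CPU stall warning, then the Tx hang), a small script along these lines could scan a saved dmesg dump. This is only a sketch: the function names are hypothetical, and the matched substrings are assumptions about how these messages typically read. The exact wording varies by kernel version and driver, so adjust the patterns to what your logs actually contain.

```python
import re

# Assumed message fragments; wording differs between kernel versions and drivers.
PATTERNS = [
    ("oom", re.compile(r"oom-killer|Out of memory", re.IGNORECASE)),
    ("rcu_stall", re.compile(r"rcu_sched.*stall", re.IGNORECASE)),
    ("tx_hang", re.compile(r"Tx Unit Hang|transmit queue \d+ timed out", re.IGNORECASE)),
]

EXPECTED_ORDER = ["oom", "rcu_stall", "tx_hang"]


def classify(line):
    """Return the event kind for a log line, or None if it is not of interest."""
    for name, pattern in PATTERNS:
        if pattern.search(line):
            return name
    return None


def event_sequence(lines):
    """Return the event kinds in log order, collapsing consecutive repeats."""
    seq = []
    for line in lines:
        kind = classify(line)
        if kind and (not seq or seq[-1] != kind):
            seq.append(kind)
    return seq


def matches_pattern(lines):
    """True if an OOM event is later followed by an RCU stall and then a Tx hang."""
    remaining = list(EXPECTED_ORDER)
    for kind in event_sequence(lines):
        if kind == remaining[0]:
            remaining.pop(0)
            if not remaining:
                return True
    return False
```

To use it, pass any iterable of log lines, for example `matches_pattern(open("dmesg.txt", errors="replace"))`, where `dmesg.txt` is a placeholder for wherever the dump was saved. A `True` result only says the three kinds of message appear in that order; it does not prove one caused the other.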