Yes my BIOS is quite old:

3999-IPCBuild-CM1 kd # dmesg | grep DMI
[    0.000000] DMI: PC Engines APU, BIOS SageBios_PCEngines_APU-45 04/05/2014

Funny as it's a pretty new box.
Do you think I should remove PHYETH_DISABLE_OFFLOAD and try a BIOS upgrade to see if it fixes the problem? I don't think I will for existing sites as a BIOS upgrade looks pretty hard to do and I assume cannot be done remotely.

Regards
Michael Knill

On 20/9/20, 12:18 pm, "Lonnie Abelbeck" <li...@lonnie.abelbeck.com> wrote:

Hi Michael,

Ahhh, very good ... looks like we are on to something. Add it to one/some of your APU2s and let us know how it goes.

As far as my Qotom Q190G4N, I initially had to set PHYETH_DISABLE_OFFLOAD to keep it from locking-up with sustained high network traffic but then switched the RAM SO-DIMM with another brand and did not need PHYETH_DISABLE_OFFLOAD anymore. My comments probably got you started adding PHYETH_DISABLE_OFFLOAD.

This is a very obscure kernel bug, as such it never got back-ported to Linux 3.16.x.

For the APU2, the BIOS could play a role in how it initializes the NICs and whether this kernel bug is triggered.

Lonnie

> On Sep 19, 2020, at 7:49 PM, Michael Knill <michael.kn...@ipcsolutions.com.au> wrote:
>
> Awesome thanks Lonnie.
> Yes it's all making sense now. I already have this directive in my template against the Qotom Q190G4U for some reason (should I have?)
> I have two Qotoms connected to the problem provider, one had this directive set already and has not failed and one did not (I forgot to change when I changed hardware) which fails.
> All my APUs don't have this set so all have problems with this provider.
>
> I'm thinking we have finally solved this issue.
> Thanks so much for your help
>
> Regards
> Michael Knill
>
> On 20/9/20, 9:29 am, "Lonnie Abelbeck" <li...@lonnie.abelbeck.com> wrote:
>
> I would try this first
> --
> PHYETH_DISABLE_OFFLOAD="tso gso gro"
> --
> and see if it fixes the problem.
>
> If by chance it does fix it, then it would not be needed in AstLinux 1.4.x.
>
> The PHYETH_DISABLE_OFFLOAD settings disable some of the "offload" features of the NICs in an effort to work around this (somewhat obscure) kernel bug.
>
> It all kind of makes sense that a particular provider is fragmenting packets in ways others do not, and hits this kernel bug.
>
> The PHYETH_DISABLE_OFFLOAD setting above is very "safe", only drawback is it slightly reduces network performance near the 1 Gbps level.
>
> BTW, if traffic shaping is enabled this PHYETH_DISABLE_OFFLOAD setting is already applied to external ethernet NIC(s).
>
> Lonnie
>
>
>
>> On Sep 19, 2020, at 6:09 PM, Michael Knill <michael.kn...@ipcsolutions.com.au> wrote:
>>
>> Awesome thanks Lonnie.
>>
>> I will give it a try although I have no idea what it does!
>> I assume I can remove this when I go to Astlinux 1.4?
>>
>> Regards
>> Michael Knill
>>
>> Sent from my iPhone so please excuse my brevity.
>>
>>> On 19 Sep 2020, at 11:56 pm, Lonnie Abelbeck <li...@lonnie.abelbeck.com> wrote:
>>>
>>>
>>> Hi Michael,
>>>
>>> Great info!
>>>
>>> Try this in your user.conf, and reboot.
>>> --
>>> PHYETH_DISABLE_OFFLOAD="tso gso gro"
>>> --
>>>
>>> If my hunch is correct, this kernel fix added in 4.1.17 may be related ...
>>>
>>> net: preserve IP control block during GSO segmentation
>>> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/include/linux/skbuff.h?h=v4.1.17&id=abefd1b4087b9b5e83e7b4e7689f8b8e3cb2899c
>>>
>>> Lonnie
>>>
>>>
>>>
>>>> On Sep 18, 2020, at 10:56 PM, Michael Knill <michael.kn...@ipcsolutions.com.au> wrote:
>>>>
>>>> Yay some progress on this problem.
>>>>
>>>> I had my 4th site lock up yesterday. It was a site I moved from one location to another. There were no changes to the Astlinux box at all other than PPPoE credentials but after a couple of hours it locked up. So realistically the only change is the internet provider which is a new one that I am trialling and is the same provider as two of the other sites that are failing.
>>>>
>>>> As we are also using this provider in our home office, I set up another box this morning and connected the serial port not expecting anything to happen but it locked up and we captured it. Yay! It is attached.
>>>>
>>>> I'm hoping it will help the resolution of this problem.
>>>>
>>>> Regards
>>>> Michael Knill
>>>>
>>>
>>> _______________________________________________
>>> Astlinux-users mailing list
>>> Astlinux-users@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/astlinux-users
>>>
>>> Donations to support AstLinux are graciously accepted via PayPal to pay...@krisk.org.
>>> <APU Crash.log>
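
For anyone following along who, like me, wondered what PHYETH_DISABLE_OFFLOAD="tso gso gro" actually does: here is a minimal sketch, assuming the setting ends up as ethtool-style feature toggles on the NIC. That mapping is my assumption, not something confirmed in this thread. The helper name offload_args and the interface eth0 are hypothetical, and the script only prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of AstLinux): turn a space-separated
# feature list such as "tso gso gro" into "tso off gso off gro off".
offload_args() {
  args=""
  # $1 is deliberately unquoted so the list splits on whitespace.
  for feat in $1; do
    args="$args $feat off"
  done
  # Drop the single leading space left by the loop.
  echo "${args# }"
}

# Same value as suggested for user.conf in this thread.
PHYETH_DISABLE_OFFLOAD="tso gso gro"

# Dry run: print the ethtool invocation instead of executing it.
# eth0 is an assumed interface name.
echo "ethtool -K eth0 $(offload_args "$PHYETH_DISABLE_OFFLOAD")"
```

On a live box, `ethtool -k eth0` (lowercase k) lists the current offload state, which is one way to check whether the features are really off after a reboot.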