You have been subscribed to a public bug:

A user reported that several nodes lost connectivity in several situations, for instance during netboot, when a flood of ARP traffic occurs because many nodes across the cluster boot simultaneously.
No stack trace or error message is seen; the device just stops receiving packets.

In our attempts to reproduce the issue, the BCM5719 lost connectivity only under a heavy ARP storm, in the following situations:

- changing the MTU
- interface configuration (with ifconfig or the ip tool)
- netboot

In order to fix the issue we need to include the upstream patches:

1 - https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=748a240c5
    which reads: "tg3: Fix rx hang on MTU change with 5717/5719"

2 - https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=506b0a395
    which reads: "tg3: APE heartbeat changes"

Considering that 18.04 is planned to use Linux 4.15, we will need to backport only the second patch. I'll submit it to the mailing list and post a reference here.

** Affects: linux (Ubuntu)
     Importance: Undecided
     Assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
         Status: New

** Tags: architecture-ppc64le bugnameltc-165090 severity-high targetmilestone-inin1804

--
BCM5719/tg3 loses connectivity due to missing heartbeats between fw and driver
https://bugs.launchpad.net/bugs/1751337
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.