Public bug reported: HP DL580G7 MAAS version: 2.3.0 (6434-gd354690-0ubuntu1~16.04.1 cloud-init.log and syslog attached. System normally deploys fine as part of a MAAS cluster. Issue is only occurring when trying to juju deploy a Kubernetes worker to this node. Hang occurs trying to load docker.io according to syslog once system hangs, can't ssh into system console shows the error:
netenp4s0f0 Firmware Hang Detected . . . netxen_nic enp... Device Initialization error uname -a Linux gpu-server 4.13.0-32-generic #35~16.04.1-Ubuntu SMP Thu Jan 25 10:13:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux ethtool -i enp4s0f0 driver: netxen_nic version: 4.0.82 firmware-version: 4.0.596 expansion-rom-version: bus-info: 0000:04:00.0 supports-statistics: yes supports-test: yes supports-eeprom-access: yes supports-register-dump: yes supports-priv-flags: no dmesg |grep netxen [ 3.284815] netxen_nic 0000:04:00.0: 2MB memory map [ 3.537593] netxen_nic 0000:04:00.0: Gen2 strapping detected [ 3.537682] netxen_nic 0000:04:00.0: using 64-bit dma mask [ 3.828914] netxen_nic: NX3031 Gigabit Ethernet Board S/N \xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff Chip rev 0x42 [ 3.828918] netxen_nic 0000:04:00.0: Driver v4.0.82, firmware v4.0.596 [legacy] [ 3.829102] netxen_nic 0000:04:00.0: using msi-x interrupts [ 3.829105] netxen_nic 0000:04:00.0: non ULA adapter [ 3.829372] netxen_nic 0000:04:00.0: eth0: GbE port initialized [ 3.850279] netxen_nic 0000:04:00.1: 2MB memory map [ 3.850489] netxen_nic 0000:04:00.1: using 64-bit dma mask [ 4.080327] netxen_nic 0000:04:00.1: Driver v4.0.82, firmware v4.0.596 [legacy] [ 4.080588] netxen_nic 0000:04:00.1: using msi-x interrupts [ 4.080874] netxen_nic 0000:04:00.1: eth1: GbE port initialized [ 4.132499] netxen_nic 0000:04:00.2: 2MB memory map [ 4.132707] netxen_nic 0000:04:00.2: using 64-bit dma mask [ 4.188596] netxen_nic 0000:04:00.2: Driver v4.0.82, firmware v4.0.596 [legacy] [ 4.192717] netxen_nic 0000:04:00.2: using msi-x interrupts [ 4.196837] netxen_nic 0000:04:00.2: eth2: GbE port initialized [ 4.317290] netxen_nic 0000:04:00.3: 2MB memory map [ 4.321319] netxen_nic 0000:04:00.3: using 64-bit dma mask [ 4.388142] netxen_nic 0000:04:00.3: Driver v4.0.82, firmware v4.0.596 [legacy] [ 4.392254] netxen_nic 0000:04:00.3: using msi-x interrupts [ 4.396445] netxen_nic 0000:04:00.3: eth3: GbE port initialized [ 4.403938] netxen_nic 0000:04:00.2 enp4s0f2: renamed from eth2 [ 4.524474] netxen_nic 0000:04:00.0 enp4s0f0: renamed from eth0 [ 4.564541] netxen_nic 0000:04:00.1 enp4s0f1: renamed from eth1 [ 4.600619] netxen_nic 0000:04:00.3 enp4s0f3: renamed from eth3 [ 19.290490] netxen_nic: enp4s0f0 NIC Link is up [ 21.606695] netxen_nic: enp4s0f1 NIC Link is up Last thing in syslog before nics crash: Feb 17 19:08:57 gpu-server systemd[1]: Started ACPI event daemon. Feb 17 19:08:57 gpu-server systemd[1]: Starting Docker Socket for the API. Feb 17 19:08:57 gpu-server systemd[1]: Listening on Docker Socket for the API. Feb 17 19:08:57 gpu-server systemd[1]: Starting Docker Application Container Engine... Feb 17 19:08:58 gpu-server dockerd[18889]: time="2018-02-17T19:08:58.032297862Z" level=info msg="libcontainerd: new containerd process, pid: 18913" Feb 17 19:08:59 gpu-server kernel: [ 327.615084] audit: type=1400 audit(1518894539.111:43): apparmor="STATUS" operation="profile_load" profile="unconfined" name="docker-default" pid=18928 comm="apparmor_parser" Feb 17 19:08:59 gpu-server kernel: [ 327.662287] aufs 4.13-20170911 Feb 17 19:08:59 gpu-server dockerd[18889]: time="2018-02-17T19:08:59.358149475Z" level=info msg="Graph migration to content-addressability took 0.00 seconds" Feb 17 19:08:59 gpu-server dockerd[18889]: time="2018-02-17T19:08:59.358893722Z" level=warning msg="Your kernel does not support swap memory limit" Feb 17 19:08:59 gpu-server dockerd[18889]: time="2018-02-17T19:08:59.359058327Z" level=warning msg="Your kernel does not support cgroup rt period" Feb 17 19:08:59 gpu-server dockerd[18889]: time="2018-02-17T19:08:59.359106384Z" level=warning msg="Your kernel does not support cgroup rt runtime" Feb 17 19:08:59 gpu-server dockerd[18889]: time="2018-02-17T19:08:59.360798923Z" level=info msg="Loading containers: start." Feb 17 19:08:59 gpu-server kernel: [ 327.889472] Bridge firewalling registered Feb 17 19:08:59 gpu-server kernel: [ 327.924904] nf_conntrack version 0.5.0 (65536 buckets, 262144 max) Feb 17 19:08:59 gpu-server dockerd[18889]: time="2018-02-17T19:08:59.456778799Z" level=info msg="Firewalld running: false" ** Affects: linux-firmware (Ubuntu) Importance: Undecided Status: New ** Attachment added: "log.tar" https://bugs.launchpad.net/bugs/1750176/+attachment/5057440/+files/log.tar -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-firmware in Ubuntu. https://bugs.launchpad.net/bugs/1750176 Title: Firmware Hang detected on HP netxen NIC Status in linux-firmware package in Ubuntu: New Bug description: HP DL580G7 MAAS version: 2.3.0 (6434-gd354690-0ubuntu1~16.04.1 cloud-init.log and syslog attached. System normally deploys fine as part of a MAAS cluster. Issue is only occurring when trying to juju deploy a Kubernetes worker to this node. Hang occurs trying to load docker.io according to syslog once system hangs, can't ssh into system console shows the error: netenp4s0f0 Firmware Hang Detected . . . netxen_nic enp... Device Initialization error uname -a Linux gpu-server 4.13.0-32-generic #35~16.04.1-Ubuntu SMP Thu Jan 25 10:13:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux ethtool -i enp4s0f0 driver: netxen_nic version: 4.0.82 firmware-version: 4.0.596 expansion-rom-version: bus-info: 0000:04:00.0 supports-statistics: yes supports-test: yes supports-eeprom-access: yes supports-register-dump: yes supports-priv-flags: no dmesg |grep netxen [ 3.284815] netxen_nic 0000:04:00.0: 2MB memory map [ 3.537593] netxen_nic 0000:04:00.0: Gen2 strapping detected [ 3.537682] netxen_nic 0000:04:00.0: using 64-bit dma mask [ 3.828914] netxen_nic: NX3031 Gigabit Ethernet Board S/N \xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff\xffffffff Chip rev 0x42 [ 3.828918] netxen_nic 0000:04:00.0: Driver v4.0.82, firmware v4.0.596 [legacy] [ 3.829102] netxen_nic 0000:04:00.0: using msi-x interrupts [ 3.829105] netxen_nic 0000:04:00.0: non ULA adapter [ 3.829372] netxen_nic 0000:04:00.0: eth0: GbE port initialized [ 3.850279] netxen_nic 0000:04:00.1: 2MB memory map [ 3.850489] netxen_nic 0000:04:00.1: using 64-bit dma mask [ 4.080327] netxen_nic 0000:04:00.1: Driver v4.0.82, firmware v4.0.596 [legacy] [ 4.080588] netxen_nic 0000:04:00.1: using msi-x interrupts [ 4.080874] netxen_nic 0000:04:00.1: eth1: GbE port initialized [ 4.132499] netxen_nic 0000:04:00.2: 2MB memory map [ 4.132707] netxen_nic 0000:04:00.2: using 64-bit dma mask [ 4.188596] netxen_nic 0000:04:00.2: Driver v4.0.82, firmware v4.0.596 [legacy] [ 4.192717] netxen_nic 0000:04:00.2: using msi-x interrupts [ 4.196837] netxen_nic 0000:04:00.2: eth2: GbE port initialized [ 4.317290] netxen_nic 0000:04:00.3: 2MB memory map [ 4.321319] netxen_nic 0000:04:00.3: using 64-bit dma mask [ 4.388142] netxen_nic 0000:04:00.3: Driver v4.0.82, firmware v4.0.596 [legacy] [ 4.392254] netxen_nic 0000:04:00.3: using msi-x interrupts [ 4.396445] netxen_nic 0000:04:00.3: eth3: GbE port initialized [ 4.403938] netxen_nic 0000:04:00.2 enp4s0f2: renamed from eth2 [ 4.524474] netxen_nic 0000:04:00.0 enp4s0f0: renamed from eth0 [ 4.564541] netxen_nic 0000:04:00.1 enp4s0f1: renamed from eth1 [ 4.600619] netxen_nic 0000:04:00.3 enp4s0f3: renamed from eth3 [ 19.290490] netxen_nic: enp4s0f0 NIC Link is up [ 21.606695] netxen_nic: enp4s0f1 NIC Link is up Last thing in syslog before nics crash: Feb 17 19:08:57 gpu-server systemd[1]: Started ACPI event daemon. Feb 17 19:08:57 gpu-server systemd[1]: Starting Docker Socket for the API. Feb 17 19:08:57 gpu-server systemd[1]: Listening on Docker Socket for the API. Feb 17 19:08:57 gpu-server systemd[1]: Starting Docker Application Container Engine... Feb 17 19:08:58 gpu-server dockerd[18889]: time="2018-02-17T19:08:58.032297862Z" level=info msg="libcontainerd: new containerd process, pid: 18913" Feb 17 19:08:59 gpu-server kernel: [ 327.615084] audit: type=1400 audit(1518894539.111:43): apparmor="STATUS" operation="profile_load" profile="unconfined" name="docker-default" pid=18928 comm="apparmor_parser" Feb 17 19:08:59 gpu-server kernel: [ 327.662287] aufs 4.13-20170911 Feb 17 19:08:59 gpu-server dockerd[18889]: time="2018-02-17T19:08:59.358149475Z" level=info msg="Graph migration to content-addressability took 0.00 seconds" Feb 17 19:08:59 gpu-server dockerd[18889]: time="2018-02-17T19:08:59.358893722Z" level=warning msg="Your kernel does not support swap memory limit" Feb 17 19:08:59 gpu-server dockerd[18889]: time="2018-02-17T19:08:59.359058327Z" level=warning msg="Your kernel does not support cgroup rt period" Feb 17 19:08:59 gpu-server dockerd[18889]: time="2018-02-17T19:08:59.359106384Z" level=warning msg="Your kernel does not support cgroup rt runtime" Feb 17 19:08:59 gpu-server dockerd[18889]: time="2018-02-17T19:08:59.360798923Z" level=info msg="Loading containers: start." Feb 17 19:08:59 gpu-server kernel: [ 327.889472] Bridge firewalling registered Feb 17 19:08:59 gpu-server kernel: [ 327.924904] nf_conntrack version 0.5.0 (65536 buckets, 262144 max) Feb 17 19:08:59 gpu-server dockerd[18889]: time="2018-02-17T19:08:59.456778799Z" level=info msg="Firewalld running: false" To manage notifications about this bug go to: https://bugs.launchpad.net/ubuntu/+source/linux-firmware/+bug/1750176/+subscriptions -- Mailing list: https://launchpad.net/~kernel-packages Post to : kernel-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~kernel-packages More help : https://help.launchpad.net/ListHelp