Thanks Przemyslaw, good explanation in the bug description! I'm working
on this one and will update the status here when there is news.
Cheers,
Guilherme
** Also affects: linux (Ubuntu Cosmic)
Importance: Undecided
Status: New
** Also affects: linux (Ubuntu Disco)
Importance: Undecided
Status: New
** Also affects: linux (Ubuntu Eoan)
Importance: Undecided
Status: Incomplete
** Also affects: linux (Ubuntu Bionic)
Importance: Undecided
Status: New
** Also affects: linux (Ubuntu Xenial)
Importance: Undecided
Status: New
** Also affects: linux (Ubuntu Ff-series)
Importance: Undecided
Status: New
** Changed in: linux (Ubuntu Eoan)
Status: Incomplete => Confirmed
** Changed in: linux (Ubuntu Ff-series)
Status: New => Confirmed
** Changed in: linux (Ubuntu Disco)
Status: New => Confirmed
** Changed in: linux (Ubuntu Cosmic)
Status: New => Confirmed
** Changed in: linux (Ubuntu Bionic)
Status: New => Confirmed
** Changed in: linux (Ubuntu Xenial)
Status: New => Confirmed
** Changed in: linux (Ubuntu Xenial)
Importance: Undecided => High
** Changed in: linux (Ubuntu Bionic)
Importance: Undecided => High
** Changed in: linux (Ubuntu Cosmic)
Importance: Undecided => High
** Changed in: linux (Ubuntu Disco)
Importance: Undecided => High
** Changed in: linux (Ubuntu Eoan)
Importance: Undecided => High
** Changed in: linux (Ubuntu Ff-series)
Importance: Undecided => High
** Changed in: linux (Ubuntu Xenial)
Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)
** Changed in: linux (Ubuntu Bionic)
Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)
** Changed in: linux (Ubuntu Cosmic)
Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)
** Changed in: linux (Ubuntu Ff-series)
Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)
** Changed in: linux (Ubuntu Eoan)
Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)
** Changed in: linux (Ubuntu Disco)
Assignee: (unassigned) => Guilherme G. Piccoli (gpiccoli)
** Tags removed: bionic
** Tags added: bnx2x sts
** Description changed:
For the customer OpenStack deployment we deploy infra nodes on Dell R630
servers. The servers have onboard Broadcom's NetXtreme II BCM57800 NIC
(quad port: 2x1G ports, 2x10G ports). For each port in UP state, we
observe 100% CPU load. So in total, we observe 4 CPUs with 100% load.
perf report shows the function bnx2x_ptp_task taking up much of the CPU
time: https://pastebin.canonical.com/p/kfrpd6Pwh5/
Also, /var/log/syslog contains the following outputs every few seconds:
- [1738143.581721] bnx2x: [bnx2x_start_xmit:3855(eno4)]The device supports only
a single outstanding packet to timestamp, this packet will not be timestamped
- [1738176.727642] bnx2x: [bnx2x_start_xmit:3855(eno1)]The device supports only
a single outstanding packet to timestamp, this packet will not be timestamped
- [1738207.988310] bnx2x: [bnx2x_start_xmit:3855(eno3)]The device supports only
a single outstanding packet to timestamp, this packet will not be timestamped
- [1738240.227333] bnx2x: [bnx2x_start_xmit:3855(eno2)]The device supports only
a single outstanding packet to timestamp, this packet will not be timestamped
+ [1738143.581721] bnx2x: [bnx2x_start_xmit:3855(eno4)]The device supports only
a single outstanding packet to timestamp, this packet will not be timestamped
+ [1738176.727642] bnx2x: [bnx2x_start_xmit:3855(eno1)]The device supports only
a single outstanding packet to timestamp, this packet will not be timestamped
+ [1738207.988310] bnx2x: [bnx2x_start_xmit:3855(eno3)]The device supports only
a single outstanding packet to timestamp, this packet will not be timestamped
+ [1738240.227333] bnx2x: [bnx2x_start_xmit:3855(eno2)]The device supports only
a single outstanding packet to timestamp, this packet will not be timestamped
So, the problem seems to be in a "timestamped" TX packet; the driver,
for some reason (yet to be understood), gets an unexpected value from a
register and then, in that same function, reschedules itself to retry
the register read; the read returns a bad value again, and so on
indefinitely.
This is showing in the system as the 100% CPU usage kthreads; the
message "The device supports only a single outstanding packet to
timestamp, this packet will not be timestamped" happens because the
driver can only timestamp a single TX packet at a time, and given it's
stuck trying, it cannot accept another packet in this "queue".
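To illustrate why a stuck timestamp request produces that message for every
later packet, here is a minimal user-space sketch of a single-slot
"outstanding timestamp" guard. It is only a model under stated assumptions:
struct tx_ts_slot, request_tx_timestamp() and complete_tx_timestamp() are
hypothetical names made up for this example and are not taken from the bnx2x
source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-device state: one slot only. */
struct tx_ts_slot {
	const void *pending;    /* packet waiting for a HW timestamp, or NULL */
	unsigned long skipped;  /* packets sent without a timestamp */
};

/*
 * Try to claim the single slot for a packet. Returns true if the packet
 * will be timestamped, false if the slot is still busy (the situation in
 * which the driver logs "this packet will not be timestamped").
 */
static bool request_tx_timestamp(struct tx_ts_slot *slot, const void *pkt)
{
	assert(slot != NULL);

	if (slot->pending != NULL) {
		slot->skipped++;
		return false;
	}
	slot->pending = pkt;
	return true;
}

/* Release the slot once the timestamp was read (or the request abandoned). */
static void complete_tx_timestamp(struct tx_ts_slot *slot)
{
	assert(slot != NULL);
	slot->pending = NULL;
}
```

In this model, if complete_tx_timestamp() is never reached because the
polling task keeps rescheduling itself, every subsequent request is refused,
which matches the repeated syslog message.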
The infinite loop appears to be:
- static void bnx2x_ptp_task(struct work_struct *work)
- {
- struct bnx2x *bp = container_of(work, struct bnx2x, ptp_task);
- int port = BP_PORT(bp);
- u32 val_seq;
- u64 timestamp, ns;
- struct skb_shared_hwtstamps shhwtstamps;
+ static void bnx2x_ptp_task(struct work_struct *work)
+ {
+ struct bnx2x *bp = container_of(work, struct bnx2x, ptp_task);
+ int port = BP_PORT(bp);
+ u32 val_seq;
+ u64 timestamp, ns;
+ struct skb_shared_hwtstamps shhwtstamps;
- /* Read Tx timestamp registers */
- val_seq = REG_RD(bp, port ? NIG_REG_P1_TLLH_PTP_BUF_SEQID :
- NIG_REG_P0_TLLH_PTP_BUF_SEQID);
- if (val_seq & 0x10000) {
- [...]
- } else {
- DP(BNX2X_MSG_PTP, "There is no valid Tx timestamp yet\n");
- /* Reschedule to keep checking for a valid timestamp value */
- schedule_work(&bp->ptp_task);
- }
+ /* Read Tx timestamp registers */
+ val_seq = REG_RD(bp, port ? NIG_REG_P1_TLLH_PTP_BUF_SEQID :
+ NIG_REG_P0_TLLH_PTP_BUF_SEQID);
+ if (val_seq & 0x10000) {
+ [...]
+ } else {
+ DP(BNX2X_MSG_PTP, "There is no valid Tx timestamp yet\n");
+ /* Reschedule to keep checking for a valid timestamp value */
+ schedule_work(&bp->ptp_task);
+ }
It appears that val_seq & 0x10000 is never true, so the task constantly
reschedules itself immediately. Instrumenting the function shows that it
is being called in excess of 100,000 times per second. The REG_RD call
does appear to be expensive (as it's a register read from the device)
and shows high in the perf report, but that by itself doesn't appear to
be the root cause (i.e., it's not hanging forever in the REG_RD).
The cause appears to be that the driver is not prepared to deal with the
PTP request never being completed by the hardware. It's unclear why it
isn't completing, but regardless, the driver should not loop forever
here.
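One possible way to avoid looping forever is to cap the number of polls and
give up on the timestamp once the cap is reached. The user-space sketch below
shows only that idea; it is not the driver's code nor a proposed patch.
poll_tx_timestamp(), read_seq_fn, the fake register readers and the retry
limit used in the usage note are assumptions made for illustration. Only the
0x10000 mask comes from the snippet quoted above.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

#define TS_VALID_BIT 0x10000u  /* same mask as the val_seq check above */

/* Hypothetical register accessor, so the sketch runs without hardware. */
typedef uint32_t (*read_seq_fn)(void *ctx);

enum poll_result { POLL_OK, POLL_TIMEOUT };

/*
 * Poll the sequence-ID register at most max_polls times. Returns POLL_OK
 * as soon as the valid bit is set, POLL_TIMEOUT otherwise. The number of
 * reads performed is stored in *polls_used.
 */
static enum poll_result poll_tx_timestamp(read_seq_fn read_seq, void *ctx,
					  unsigned int max_polls,
					  unsigned int *polls_used)
{
	unsigned int i;

	assert(read_seq != NULL && polls_used != NULL);

	for (i = 0; i < max_polls; i++) {
		if (read_seq(ctx) & TS_VALID_BIT) {
			*polls_used = i + 1;
			return POLL_OK;
		}
		/* A real driver would wait here before retrying,
		 * rather than spinning back-to-back. */
	}
	*polls_used = max_polls;
	return POLL_TIMEOUT;
}

/* Fake register that never reports a valid timestamp (the bug case). */
static uint32_t never_valid(void *ctx)
{
	(void)ctx;
	return 0;
}

/* Fake register that becomes valid on the third read. */
static uint32_t valid_on_third(void *ctx)
{
	unsigned int *reads = ctx;

	return ++(*reads) >= 3 ? TS_VALID_BIT : 0;
}
```

With a limit of 8 polls (an arbitrary value), never_valid ends in
POLL_TIMEOUT after 8 reads instead of spinning, while valid_on_third returns
POLL_OK after 3 reads. On timeout, the caller would drop the pending packet's
timestamp request so that later packets can be timestamped again.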
-
-
- Additional info:
-
-
- ubuntu@infra-1:~$ uname -a
- Linux infra-1 4.15.0-50-generic #54-Ubuntu SMP Mon May 6 18:46:08 UTC 2019
x86_64 x86_64 x86_64 GNU/Lin
-
-
- ubuntu@infra-1:~$ lspci | grep Broadcom
- 01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme II
BCM57800 1/10 Gigabit Ethernet (rev 10)
- 01:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme II
BCM57800 1/10 Gigabit Ethernet (rev 10)
- 01:00.2 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme II
BCM57800 1/10 Gigabit Ethernet (rev 10)
- 01:00.3 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme II
BCM57800 1/10 Gigabit Ethernet (rev 10)
-
-
- ubuntu@infra-1:~$ lspci -n | grep 01:00
- 01:00.0 0200: 14e4:168a (rev 10)
- 01:00.1 0200: 14e4:168a (rev 10)
- 01:00.2 0200: 14e4:168a (rev 10)
- 01:00.3 0200: 14e4:168a (rev 10)
-
-
- ubuntu@infra-1:~/deploy$ sudo lshw -c network
- *-network:0
- description: Ethernet interface
- product: NetXtreme II BCM57800 1/10 Gigabit Ethernet
- vendor: Broadcom Inc. and subsidiaries
- physical id: 0
- bus info: pci@0000:01:00.0
- logical name: eno1
- version: 10
- serial: 42:39:92:e0:66:b6
- size: 10Gbit/s
- capacity: 10Gbit/s
- width: 64 bits
- clock: 33MHz
- capabilities: pm vpd msi msix pciexpress bus_master cap_list rom ethernet
physical tp 100bt 100bt-fd 1000bt-fd 10000bt-fd autonegotiation
- configuration: autonegotiation=on broadcast=yes driver=bnx2x
driverversion=1.712.30-0 duplex=full firmware=FFV14.10.07 bc 7.14.11 phy 1.45
latency=0 link=yes multicast=yes port=twisted pair slave=yes speed=10Gbit/s
- resources: irq:79 memory:95000000-957fffff memory:95800000-95ffffff
memory:96030000-9603ffff memory:91a00000-91a7ffff
- *-network:1
- description: Ethernet interface
- product: NetXtreme II BCM57800 1/10 Gigabit Ethernet
- vendor: Broadcom Inc. and subsidiaries
- physical id: 0.1
- bus info: pci@0000:01:00.1
- logical name: eno2
- version: 10
- serial: 42:39:92:e0:66:b6
- size: 10Gbit/s
- capacity: 10Gbit/s
- width: 64 bits
- clock: 33MHz
- capabilities: pm vpd msi msix pciexpress bus_master cap_list rom ethernet
physical tp 100bt 100bt-fd 1000bt-fd 10000bt-fd autonegotiation
- configuration: autonegotiation=on broadcast=yes driver=bnx2x
driverversion=1.712.30-0 duplex=full firmware=FFV14.10.07 bc 7.14.11 phy 1.45
latency=0 link=yes multicast=yes port=twisted pair slave=yes speed=10Gbit/s
- resources: irq:90 memory:94000000-947fffff memory:94800000-94ffffff
memory:96020000-9602ffff memory:91a80000-91afffff
- *-network:2
- description: Ethernet interface
- product: NetXtreme II BCM57800 1/10 Gigabit Ethernet
- vendor: Broadcom Inc. and subsidiaries
- physical id: 0.2
- bus info: pci@0000:01:00.2
- logical name: eno3
- version: 10
- serial: 52:f2:aa:63:a5:3c
- size: 1Gbit/s
- capacity: 1Gbit/s
- width: 64 bits
- clock: 33MHz
- capabilities: pm vpd msi msix pciexpress bus_master cap_list rom ethernet
physical tp 10bt 10bt-fd 100bt 100bt-fd 1000bt-fd autonegotiation
- configuration: autonegotiation=on broadcast=yes driver=bnx2x
driverversion=1.712.30-0 duplex=full firmware=FFV14.10.07 bc 7.14.11 latency=0
link=yes multicast=yes port=twisted pair slave=yes speed=1Gbit/s
- resources: irq:90 memory:93000000-937fffff memory:93800000-93ffffff
memory:96010000-9601ffff memory:91b00000-91b7ffff
- *-network:3
- description: Ethernet interface
- product: NetXtreme II BCM57800 1/10 Gigabit Ethernet
- vendor: Broadcom Inc. and subsidiaries
- physical id: 0.3
- bus info: pci@0000:01:00.3
- logical name: eno4
- version: 10
- serial: 52:f2:aa:63:a5:3c
- size: 1Gbit/s
- capacity: 1Gbit/s
- width: 64 bits
- clock: 33MHz
- capabilities: pm vpd msi msix pciexpress bus_master cap_list rom ethernet
physical tp 10bt 10bt-fd 100bt 100bt-fd 1000bt-fd autonegotiation
- configuration: autonegotiation=on broadcast=yes driver=bnx2x
driverversion=1.712.30-0 duplex=full firmware=FFV14.10.07 bc 7.14.11 latency=0
link=yes multicast=yes port=twisted pair slave=yes speed=1Gbit/s
- resources: irq:111 memory:92000000-927fffff memory:92800000-92ffffff
memory:96000000-9600ffff memory:91b80000-91bfffff
- *-network:0
- description: Ethernet interface
- physical id: 3
- logical name: bond1.1166
- serial: 42:39:92:e0:66:b6
- capabilities: ethernet physical
- configuration: autonegotiation=off broadcast=yes driver=802.1Q VLAN Support
driverversion=1.8 duplex=full firmware=N/A link=yes multicast=yes
- *-network:1
- description: Ethernet interface
- physical id: 4
- logical name: bond1
- serial: 42:39:92:e0:66:b6
- capabilities: ethernet physical
- configuration: autonegotiation=off broadcast=yes driver=bonding
driverversion=3.7.1 duplex=full firmware=2 link=yes master=yes multicast=yes
- *-network:2
- description: Ethernet interface
- physical id: 5
- logical name: broam
- serial: 36:76:ae:d3:1d:3b
- capabilities: ethernet physical
- configuration: broadcast=yes driver=bridge driverversion=2.3 firmware=N/A
ip=10.246.65.10 link=yes multicast=yes
- *-network:3
- description: Ethernet interface
- physical id: 6
- logical name: brinternal
- serial: ce:27:22:0d:8b:d1
- capabilities: ethernet physical
- configuration: broadcast=yes driver=bridge driverversion=2.3 firmware=N/A
ip=10.246.66.10 link=yes multicast=yes
- *-network:4
- description: Ethernet interface
- physical id: 7
- logical name: bond1.1171
- serial: 42:39:92:e0:66:b6
- capabilities: ethernet physical
- configuration: autonegotiation=off broadcast=yes driver=802.1Q VLAN Support
driverversion=1.8 duplex=full firmware=N/A link=yes multicast=yes
- *-network:5
- description: Ethernet interface
- physical id: 8
- logical name: bond0
- serial: 52:f2:aa:63:a5:3c
- capabilities: ethernet physical
- configuration: autonegotiation=off broadcast=yes driver=bonding
driverversion=3.7.1 duplex=full firmware=2 link=yes master=yes multicast=yes
- *-network:6
- description: Ethernet interface
- physical id: 9
- logical name: brexternal
- serial: 5e:e0:5c:1f:da:01
- capabilities: ethernet physical
- configuration: broadcast=yes driver=bridge driverversion=2.3 firmware=N/A
ip=10.246.71.10 link=yes multicast=yes
-
-
- ubuntu@infra-1:~$ modinfo bnx2x
- filename:
/lib/modules/4.15.0-50-generic/kernel/drivers/net/ethernet/broadcom/bnx2x/bnx2x.ko
- firmware: bnx2x/bnx2x-e2-7.13.1.0.fw
- firmware: bnx2x/bnx2x-e1h-7.13.1.0.fw
- firmware: bnx2x/bnx2x-e1-7.13.1.0.fw
- version: 1.712.30-0
- license: GPL
- description: QLogic
BCM57710/57711/57711E/57712/57712_MF/57800/57800_MF/57810/57810_MF/57840/57840_MF
Driver
- author: Eliezer Tamir
- srcversion: 5338D57FE057310DCD66774
- alias: pci:v000014E4d0000163Fsv*sd*bc*sc*i*
- alias: pci:v000014E4d0000163Esv*sd*bc*sc*i*
- alias: pci:v000014E4d0000163Dsv*sd*bc*sc*i*
- alias: pci:v00001077d000016ADsv*sd*bc*sc*i*
- alias: pci:v000014E4d000016ADsv*sd*bc*sc*i*
- alias: pci:v00001077d000016A4sv*sd*bc*sc*i*
- alias: pci:v000014E4d000016A4sv*sd*bc*sc*i*
- alias: pci:v000014E4d000016ABsv*sd*bc*sc*i*
- alias: pci:v000014E4d000016AFsv*sd*bc*sc*i*
- alias: pci:v000014E4d000016A2sv*sd*bc*sc*i*
- alias: pci:v00001077d000016A1sv*sd*bc*sc*i*
- alias: pci:v000014E4d000016A1sv*sd*bc*sc*i*
- alias: pci:v000014E4d0000168Dsv*sd*bc*sc*i*
- alias: pci:v000014E4d000016AEsv*sd*bc*sc*i*
- alias: pci:v000014E4d0000168Esv*sd*bc*sc*i*
- alias: pci:v000014E4d000016A9sv*sd*bc*sc*i*
- alias: pci:v000014E4d000016A5sv*sd*bc*sc*i*
- alias: pci:v000014E4d0000168Asv*sd*bc*sc*i*
- alias: pci:v000014E4d0000166Fsv*sd*bc*sc*i*
- alias: pci:v000014E4d00001663sv*sd*bc*sc*i*
- alias: pci:v000014E4d00001662sv*sd*bc*sc*i*
- alias: pci:v000014E4d00001650sv*sd*bc*sc*i*
- alias: pci:v000014E4d0000164Fsv*sd*bc*sc*i*
- alias: pci:v000014E4d0000164Esv*sd*bc*sc*i*
- depends: mdio,libcrc32c,ptp
- retpoline: Y
- intree: Y
- name: bnx2x
- vermagic: 4.15.0-50-generic SMP mod_unload
- signat: PKCS#7
- signer:
- sig_key:
- sig_hashalgo: md4
- parm: num_queues: Set number of queues (default is as a number of CPUs) (int)
- parm: disable_tpa: Disable the TPA (LRO) feature (int)
- parm: int_mode: Force interrupt mode other than MSI-X (1 INT#x; 2 MSI) (int)
- parm: dropless_fc: Pause on exhausted host ring (int)
- parm: mrrs: Force Max Read Req Size (0..3) (for debug) (int)
- parm: debug: Default debug msglevel (int)
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1832082
Title:
bnx2x driver causes 100% CPU load
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1832082/+subscriptions