I've got two new HPE servers running PTP. PTP starts normally, but after 10-12
hours the bnx2x driver crashes. I've tried both the HPE driver for the NIC and
the Red Hat native driver, and both fail in the same way.
Has anyone run into this issue? Any suggestions or workarounds? I have tickets
open with both Red Hat and HPE, but so far, no help.
Environment
-----------
HPE ProLiant DL560 Gen10
HPE FlexFabric 10Gb 2-port 534FLR-SFP+ Adapter, Firmware 7.17.19
Red Hat Enterprise Linux Server release 7.5 (Maipo)
Kernel 3.10.0-862.6.3.el7.x86_64
linuxptp-1.8-5.el7.x86_64
HPE Driver
----------
kmod-netxtreme2-7.14.46-1.rhel7u5.x86_64
bnx2x: QLogic 5771x/578xx 10/20-Gigabit Ethernet Driver bnx2x 1.714.18
($DateTime: 2018/04/19 20:00:08 $)
Red Hat Native Driver
---------------------
bnx2x: QLogic 5771x/578xx 10/20-Gigabit Ethernet Driver bnx2x 1.712.30-0
(2014/02/10)
The NIC reports support for hardware timestamping:
[root@antares firmware-nic-qlogic-nx2-2.22.15-1.1]# ethtool -T eno1
Time stamping parameters for eno1:
Capabilities:
hardware-transmit (SOF_TIMESTAMPING_TX_HARDWARE)
software-transmit (SOF_TIMESTAMPING_TX_SOFTWARE)
hardware-receive (SOF_TIMESTAMPING_RX_HARDWARE)
software-receive (SOF_TIMESTAMPING_RX_SOFTWARE)
software-system-clock (SOF_TIMESTAMPING_SOFTWARE)
hardware-raw-clock (SOF_TIMESTAMPING_RAW_HARDWARE)
PTP Hardware Clock: 0
Hardware Transmit Timestamp Modes:
off (HWTSTAMP_TX_OFF)
on (HWTSTAMP_TX_ON)
Hardware Receive Filter Modes:
none (HWTSTAMP_FILTER_NONE)
ptpv1-l4-event (HWTSTAMP_FILTER_PTP_V1_L4_EVENT)
ptpv1-l4-sync (HWTSTAMP_FILTER_PTP_V1_L4_SYNC)
ptpv1-l4-delay-req (HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ)
ptpv2-l4-event (HWTSTAMP_FILTER_PTP_V2_L4_EVENT)
ptpv2-l4-sync (HWTSTAMP_FILTER_PTP_V2_L4_SYNC)
ptpv2-l4-delay-req (HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ)
ptpv2-l2-event (HWTSTAMP_FILTER_PTP_V2_L2_EVENT)
ptpv2-l2-sync (HWTSTAMP_FILTER_PTP_V2_L2_SYNC)
ptpv2-l2-delay-req (HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ)
ptpv2-event (HWTSTAMP_FILTER_PTP_V2_EVENT)
ptpv2-sync (HWTSTAMP_FILTER_PTP_V2_SYNC)
ptpv2-delay-req (HWTSTAMP_FILTER_PTP_V2_DELAY_REQ)
Command lines of the PTP processes:
/usr/sbin/ptp4l -l 5 -f /var/run/timemaster/ptp4l.0.conf -H -i eno1
/usr/sbin/phc2sys -l 5 -a -r -R 1.00 -z /var/run/timemaster/ptp4l.0.socket -t [0:eno1] -n 0 -E ntpshm -M 0
Configuration:
[root@capricorn ~]# more /var/run/timemaster/ptp4l.0.conf
[global]
slaveOnly 1
domainNumber 0
uds_address /var/run/timemaster/ptp4l.0.socket
message_tag [0:eno1]
dmesg output:
At first, PTP comes up correctly and syncs the clock with no issues:
[ 9.237476] PTP clock support registered
[ 9.381258] bnx2x: QLogic 5771x/578xx 10/20-Gigabit Ethernet Driver bnx2x
1.712.30-0 (2014/02/10)
[ 9.381681] bnx2x 0000:29:00.0: msix capability found
[ 9.381809] bnx2x 0000:29:00.0: part number
394D0030-31383735-31543030-47303030
[ 9.575335] bnx2x 0000:29:00.0: irq 119 for MSI/MSI-X
[ 9.575344] bnx2x 0000:29:00.0: irq 120 for MSI/MSI-X
[ 9.575350] bnx2x 0000:29:00.0: irq 121 for MSI/MSI-X
[ 9.575356] bnx2x 0000:29:00.0: irq 122 for MSI/MSI-X
[ 9.575361] bnx2x 0000:29:00.0: irq 123 for MSI/MSI-X
[ 9.575369] bnx2x 0000:29:00.0: irq 124 for MSI/MSI-X
[ 9.575375] bnx2x 0000:29:00.0: irq 125 for MSI/MSI-X
[ 9.575380] bnx2x 0000:29:00.0: irq 126 for MSI/MSI-X
[ 9.575386] bnx2x 0000:29:00.0: irq 127 for MSI/MSI-X
[ 9.575391] bnx2x 0000:29:00.0: irq 128 for MSI/MSI-X
[ 9.684072] bnx2x 0000:29:00.1: msix capability found
[ 9.684218] bnx2x 0000:29:00.1: part number
394D0030-31383735-31543030-47303030
[ 9.872072] bnx2x 0000:29:00.1: irq 130 for MSI/MSI-X
[ 9.872079] bnx2x 0000:29:00.1: irq 131 for MSI/MSI-X
[ 9.872085] bnx2x 0000:29:00.1: irq 132 for MSI/MSI-X
[ 9.872091] bnx2x 0000:29:00.1: irq 133 for MSI/MSI-X
[ 9.872097] bnx2x 0000:29:00.1: irq 134 for MSI/MSI-X
[ 9.872103] bnx2x 0000:29:00.1: irq 135 for MSI/MSI-X
[ 9.872108] bnx2x 0000:29:00.1: irq 136 for MSI/MSI-X
[ 9.872115] bnx2x 0000:29:00.1: irq 137 for MSI/MSI-X
[ 9.872121] bnx2x 0000:29:00.1: irq 138 for MSI/MSI-X
[ 9.872126] bnx2x 0000:29:00.1: irq 139 for MSI/MSI-X
[ 51.210732] bnx2x 0000:29:00.0 eno1: using MSI-X IRQs: sp 119 fp[0] 121
... fp[7] 128
[ 51.463452] bnx2x 0000:29:00.0 eno1: NIC Link is Up, 10000 Mbps full duplex,
Flow control: ON - receive & transmit
Then, after a few hours, DMAE timeouts start to appear:
[10679.116279] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!
[10679.194744] bnx2x: [bnx2x_read_dmae:636(eno1)]DMAE returned failure -1
[17970.956219] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!
[17971.034689] bnx2x: [bnx2x_read_dmae:636(eno1)]DMAE returned failure -1
[20708.112105] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!
[20708.190573] bnx2x: [bnx2x_read_dmae:636(eno1)]DMAE returned failure -1
[26475.402157] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!
[26475.480625] bnx2x: [bnx2x_read_dmae:636(eno1)]DMAE returned failure -1
[36452.146498] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!
[36452.224959] bnx2x: [bnx2x_read_dmae:636(eno1)]DMAE returned failure -1
[49120.790072] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!
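To see whether the DMAE timeouts follow a regular period, the timestamps in
the excerpt above can be extracted with a short script. This is only a minimal
sketch: the function names (dmae_timeout_times, intervals) are made up for
illustration, and it assumes each message sits on a single line in the
"[seconds] bnx2x: [function:line(iface)]message" layout shown here.

```python
import re

# Matches kernel log lines such as:
# [10679.116279] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!
DMAE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+bnx2x: \[[^\]]*\]DMAE timeout!")


def dmae_timeout_times(lines):
    """Return kernel timestamps (seconds since boot) of DMAE timeout lines."""
    times = []
    for line in lines:
        match = DMAE_RE.match(line)
        if match:
            times.append(float(match.group(1)))
    return times


def intervals(times):
    """Return the gaps, in seconds, between consecutive timestamps."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


# Lines copied from the dmesg excerpt above.
SAMPLE = """\
[10679.116279] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!
[10679.194744] bnx2x: [bnx2x_read_dmae:636(eno1)]DMAE returned failure -1
[17970.956219] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!
[17971.034689] bnx2x: [bnx2x_read_dmae:636(eno1)]DMAE returned failure -1
[20708.112105] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!
[20708.190573] bnx2x: [bnx2x_read_dmae:636(eno1)]DMAE returned failure -1
[26475.402157] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!
[26475.480625] bnx2x: [bnx2x_read_dmae:636(eno1)]DMAE returned failure -1
[36452.146498] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!
[36452.224959] bnx2x: [bnx2x_read_dmae:636(eno1)]DMAE returned failure -1
[49120.790072] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!
""".splitlines()

if __name__ == "__main__":
    times = dmae_timeout_times(SAMPLE)
    # Print each timeout after the first, with the gap to the previous one.
    for stamp, gap in zip(times[1:], intervals(times)):
        print(f"{stamp:12.3f} s  (+{gap / 3600:.2f} h since previous)")
```

On a live system the same functions could be fed the lines of the dmesg
output instead of SAMPLE.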
Then after 10-12 hours, the driver crashes and no longer passes any traffic:
[49123.612980] bnx2x: [bnx2x_stats_update:1232(eno1)]storm stats were not
updated for 3 times
[49123.712369] bnx2x: [bnx2x_stats_update:1233(eno1)]driver assert
[49123.712438] bnx2x: [bnx2x_ptp_adjfreq:13795(eno1)]Failed to set drift
[49123.860922] bnx2x: [bnx2x_panic_dump:923(eno1)]begin crash dump
-----------------
[49123.967604] bnx2x: [bnx2x_panic_dump:933(eno1)]def_idx(0xa8ca)
def_att_idx(0x166) attn_state(0x0) spq_prod_idx(0xec) next_stats_cnt(0xbfc0)
[49124.138085] bnx2x: [bnx2x_panic_dump:938(eno1)]DSB: attn bits(0x0)
ack(0x10) id(0x0) idx(0x166)
[49124.262527] bnx2x: [bnx2x_panic_dump:939(eno1)] def (0x0 0x0 0x0 0x0 0x0
0x0 0x0 0xac76 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0) igu_sb_id(0x0)
igu_seg_id(0x1) pf_id(0x0) vnic_id(0x0) vf_id(0xff) vf_valid (0x0) state(0x1)
[49124.667394] bnx2x: [bnx2x_panic_dump:990(eno1)]fp0: rx_bd_prod(0x68a2)
rx_bd_cons(0x6db) rx_comp_prod(0x69e5) rx_comp_cons(0x6819)
*rx_cons_sb(0x6819)
[49124.850425] bnx2x: [bnx2x_panic_dump:993(eno1)] rx_sge_prod(0x400)
last_max_sge(0x0) fp_hc_idx(0x6683)
[49124.985325] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp0: tx_pkt_prod(0xa)
tx_pkt_cons(0xa) tx_bd_prod(0x14) tx_bd_cons(0x13) *tx_cons_sb(0xa)
[49125.154748] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp0: tx_pkt_prod(0x0)
tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49125.322075] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp0: tx_pkt_prod(0x0)
tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49125.489398] bnx2x: [bnx2x_panic_dump:1021(eno1)] run indexes (0x6683 0x0)
[49125.623245] bnx2x: [bnx2x_panic_dump:1027(eno1)] indexes (
[49126.768509] bnx2x: [bnx2x_panic_dump:990(eno1)]fp1: rx_bd_prod(0x1c5)
rx_bd_cons(0x0) rx_comp_prod(0x1d0) rx_comp_cons(0x4) *rx_cons_sb(0x4)
[49126.924883] bnx2x: [bnx2x_enable_ptp_packets:15334(eno1)]Failed to enable
PTP packets
[49126.924890] bnx2x: [bnx2x_queue_chk_transition:5516(eno1)]Blocking
transition since pending was 20
[49126.924891] bnx2x: [bnx2x_queue_state_change:4665(eno1)]check transition
returned an error. rc -16
[49126.924891] bnx2x: [bnx2x_enable_ptp_packets:15334(eno1)]Failed to enable
PTP packets
[49127.412351] bnx2x: [bnx2x_panic_dump:993(eno1)] rx_sge_prod(0x400)
last_max_sge(0x0) fp_hc_idx(0x4)
[49127.544133] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp1: tx_pkt_prod(0x0)
tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49127.712188] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp1: tx_pkt_prod(0x0)
tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49127.879559] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp1: tx_pkt_prod(0x0)
tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49128.046927] bnx2x: [bnx2x_panic_dump:1021(eno1)] run indexes (0x4 0x0)
[49128.159417] bnx2x: [bnx2x_panic_dump:1027(eno1)] indexes (
[49129.321046] bnx2x: [bnx2x_panic_dump:990(eno1)]fp2: rx_bd_prod(0x1c5)
rx_bd_cons(0x0) rx_comp_prod(0x1d0) rx_comp_cons(0x4) *rx_cons_sb(0x4)
[49129.494284] bnx2x: [bnx2x_panic_dump:993(eno1)] rx_sge_prod(0x400)
last_max_sge(0x0) fp_hc_idx(0xbabc)
[49129.628519] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp2: tx_pkt_prod(0xbab8)
tx_pkt_cons(0xbab8) tx_bd_prod(0x76e6) tx_bd_cons(0x76e5) *tx_cons_sb(0xbab8)
[49129.812198] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp2: tx_pkt_prod(0x0)
tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49129.979523] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp2: tx_pkt_prod(0x0)
tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49130.146854] bnx2x: [bnx2x_panic_dump:1021(eno1)] run indexes (0xbabc 0x0)
[49130.279365] bnx2x: [bnx2x_panic_dump:1027(eno1)] indexes (
[49131.427380] bnx2x: [bnx2x_panic_dump:990(eno1)]fp3: rx_bd_prod(0x1c5)
rx_bd_cons(0x0) rx_comp_prod(0x1d0) rx_comp_cons(0x4) *rx_cons_sb(0x4)
[49131.600621] bnx2x: [bnx2x_panic_dump:993(eno1)] rx_sge_prod(0x400)
last_max_sge(0x0) fp_hc_idx(0x9)
[49131.731727] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp3: tx_pkt_prod(0x5)
tx_pkt_cons(0x5) tx_bd_prod(0xa) tx_bd_cons(0x9) *tx_cons_sb(0x5)
[49131.899758] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp3: tx_pkt_prod(0x0)
tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49132.067110] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp3: tx_pkt_prod(0x0)
tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49132.234438] bnx2x: [bnx2x_panic_dump:1021(eno1)] run indexes (0x9 0x0)
[49132.347020] bnx2x: [bnx2x_panic_dump:1027(eno1)] indexes (
[49133.508630] bnx2x: [bnx2x_panic_dump:990(eno1)]fp4: rx_bd_prod(0x1c5)
rx_bd_cons(0x0) rx_comp_prod(0x1d0) rx_comp_cons(0x4) *rx_cons_sb(0x4)
[49133.681908] bnx2x: [bnx2x_panic_dump:993(eno1)] rx_sge_prod(0x400)
last_max_sge(0x0) fp_hc_idx(0x4)
[49133.813026] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp4: tx_pkt_prod(0x0)
tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49133.981051] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp4: tx_pkt_prod(0x0)
tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49134.148386] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp4: tx_pkt_prod(0x0)
tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49134.315714] bnx2x: [bnx2x_panic_dump:1021(eno1)] run indexes (0x4 0x0)
[49134.428295] bnx2x: [bnx2x_panic_dump:1027(eno1)] indexes (
[49135.589916] bnx2x: [bnx2x_panic_dump:990(eno1)]fp5: rx_bd_prod(0xf3af)
rx_bd_cons(0x1e8) rx_comp_prod(0x8e4) rx_comp_cons(0x718) *rx_cons_sb(0x718)
[49135.770509] bnx2x: [bnx2x_panic_dump:993(eno1)] rx_sge_prod(0x400)
last_max_sge(0x0) fp_hc_idx(0xeafc)
[49135.904772] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp5: tx_pkt_prod(0x0)
tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49136.072803] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp5: tx_pkt_prod(0x0)
tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49136.240136] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp5: tx_pkt_prod(0x0)
tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49136.407465] bnx2x: [bnx2x_panic_dump:1021(eno1)] run indexes (0xeafc 0x0)
[49136.523181] bnx2x: [bnx2x_panic_dump:1027(eno1)] indexes (
[49137.686868] bnx2x: [bnx2x_panic_dump:990(eno1)]fp6: rx_bd_prod(0x1c5)
rx_bd_cons(0x0) rx_comp_prod(0x1d0) rx_comp_cons(0x4) *rx_cons_sb(0x4)
[49137.860145] bnx2x: [bnx2x_panic_dump:993(eno1)] rx_sge_prod(0x400)
last_max_sge(0x0) fp_hc_idx(0x5)
[49137.991263] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp6: tx_pkt_prod(0x1)
tx_pkt_cons(0x1) tx_bd_prod(0x2) tx_bd_cons(0x1) *tx_cons_sb(0x1)
[49138.159282] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp6: tx_pkt_prod(0x0)
tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49138.326618] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp6: tx_pkt_prod(0x0)
tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49138.493946] bnx2x: [bnx2x_panic_dump:1021(eno1)] run indexes (0x5 0x0)
[49138.606521] bnx2x: [bnx2x_panic_dump:1027(eno1)] indexes (
[49139.766030] bnx2x: [bnx2x_panic_dump:990(eno1)]fp7: rx_bd_prod(0x1c5)
rx_bd_cons(0x0) rx_comp_prod(0x1d0) rx_comp_cons(0x4) *rx_cons_sb(0x4)
[49139.939313] bnx2x: [bnx2x_panic_dump:993(eno1)] rx_sge_prod(0x400)
last_max_sge(0x0) fp_hc_idx(0x7)
[49140.070430] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp7: tx_pkt_prod(0x3)
tx_pkt_cons(0x3) tx_bd_prod(0x6) tx_bd_cons(0x5) *tx_cons_sb(0x3)
[49140.238458] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp7: tx_pkt_prod(0x0)
tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49140.405797] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp7: tx_pkt_prod(0x0)
tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49140.573137] bnx2x: [bnx2x_panic_dump:1021(eno1)] run indexes (0x7 0x0)
[49140.685693] bnx2x: [bnx2x_panic_dump:1027(eno1)] indexes (
[49141.845218] bnx2x 0000:29:00.0 eno1: bc 7.15.24
[49142.887641] bnx2x: [bnx2x_queue_chk_transition:5516(eno1)]Blocking
transition since pending was 20
[49142.887642] bnx2x: [bnx2x_queue_state_change:4665(eno1)]check transition
returned an error. rc -16
[49142.887643] bnx2x: [bnx2x_enable_ptp_packets:15334(eno1)]Failed to enable
PTP packets
[49142.887646] bnx2x: [bnx2x_queue_chk_transition:5516(eno1)]Blocking
transition since pending was 20
[49142.887646] bnx2x: [bnx2x_queue_state_change:4665(eno1)]check transition
returned an error. rc -16
[49142.887647] bnx2x: [bnx2x_enable_ptp_packets:15334(eno1)]Failed to enable
PTP packets
[49146.107667] bnx2x: [bnx2x_mc_assert:750(eno1)]Chip Revision: everest3, FW
Version: 7_13_1
[49146.222713] bnx2x: [bnx2x_panic_dump:1186(eno1)]end crash dump
-----------------
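Most of the dump above is per-queue state from bnx2x_panic_dump, which makes
the handful of other messages easy to miss. The sketch below tallies log
lines by the driver function that emitted them and leaves the panic dump
lines out. It is illustrative only: count_sources and the skip parameter are
hypothetical names, and the sample has the wrapped lines from the mail joined
back onto one line.

```python
import re
from collections import Counter

# Captures the function name from tags like "[bnx2x_ptp_adjfreq:13795(eno1)]".
TAG_RE = re.compile(r"bnx2x: \[(\w+):\d+\(\w+\)\]")


def count_sources(log_text, skip=("bnx2x_panic_dump",)):
    """Count bnx2x messages per emitting function, ignoring names in skip."""
    counts = Counter()
    for match in TAG_RE.finditer(log_text):
        name = match.group(1)
        if name not in skip:
            counts[name] += 1
    return counts


# A few lines taken from the crash output above (line wraps removed).
SAMPLE = """\
[49123.712369] bnx2x: [bnx2x_stats_update:1233(eno1)]driver assert
[49123.712438] bnx2x: [bnx2x_ptp_adjfreq:13795(eno1)]Failed to set drift
[49123.860922] bnx2x: [bnx2x_panic_dump:923(eno1)]begin crash dump
[49126.924883] bnx2x: [bnx2x_enable_ptp_packets:15334(eno1)]Failed to enable PTP packets
[49126.924890] bnx2x: [bnx2x_queue_chk_transition:5516(eno1)]Blocking transition since pending was 20
[49126.924891] bnx2x: [bnx2x_queue_state_change:4665(eno1)]check transition returned an error. rc -16
[49126.924891] bnx2x: [bnx2x_enable_ptp_packets:15334(eno1)]Failed to enable PTP packets
"""

if __name__ == "__main__":
    for name, count in count_sources(SAMPLE).most_common():
        print(f"{count:3d}  {name}")
```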
----
Matthew Huff | 1 Manhattanville Rd
Director of Operations | Purchase, NY 10577
OTA Management LLC | Phone: 914-460-4039