I've got two new HPE servers running PTP. PTP starts normally, but after 10-12 
hours, the bnx2x driver crashes. I've tried both the HPE driver for the NIC and 
the Red Hat native driver. Both show the same problem.

Has anyone run into this issue? Any suggestions or workarounds? I have tickets 
open with both Red Hat and HPE, but so far, no help.

Environment
-----------
HPE ProLiant DL560 Gen10
HPE FlexFabric 10Gb 2-port 534FLR-SFP+ Adapter, Firmware  7.17.19
Red Hat Enterprise Linux Server release 7.5 (Maipo)
Kernel 3.10.0-862.6.3.el7.x86_64
linuxptp-1.8-5.el7.x86_64

HPE Driver
----------
kmod-netxtreme2-7.14.46-1.rhel7u5.x86_64
bnx2x: QLogic 5771x/578xx 10/20-Gigabit Ethernet Driver bnx2x 1.714.18 ($DateTime: 2018/04/19 20:00:08 $)

Red Hat Native Driver
---------------------
bnx2x: QLogic 5771x/578xx 10/20-Gigabit Ethernet Driver bnx2x 1.712.30-0 (2014/02/10)


ethtool reports that the NIC supports hardware timestamping:
[root@antares firmware-nic-qlogic-nx2-2.22.15-1.1]# ethtool -T eno1
Time stamping parameters for eno1:
Capabilities:
        hardware-transmit     (SOF_TIMESTAMPING_TX_HARDWARE)
        software-transmit     (SOF_TIMESTAMPING_TX_SOFTWARE)
        hardware-receive      (SOF_TIMESTAMPING_RX_HARDWARE)
        software-receive      (SOF_TIMESTAMPING_RX_SOFTWARE)
        software-system-clock (SOF_TIMESTAMPING_SOFTWARE)
        hardware-raw-clock    (SOF_TIMESTAMPING_RAW_HARDWARE)
PTP Hardware Clock: 0
Hardware Transmit Timestamp Modes:
        off                   (HWTSTAMP_TX_OFF)
        on                    (HWTSTAMP_TX_ON)
Hardware Receive Filter Modes:
        none                  (HWTSTAMP_FILTER_NONE)
        ptpv1-l4-event        (HWTSTAMP_FILTER_PTP_V1_L4_EVENT)
        ptpv1-l4-sync         (HWTSTAMP_FILTER_PTP_V1_L4_SYNC)
        ptpv1-l4-delay-req    (HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ)
        ptpv2-l4-event        (HWTSTAMP_FILTER_PTP_V2_L4_EVENT)
        ptpv2-l4-sync         (HWTSTAMP_FILTER_PTP_V2_L4_SYNC)
        ptpv2-l4-delay-req    (HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ)
        ptpv2-l2-event        (HWTSTAMP_FILTER_PTP_V2_L2_EVENT)
        ptpv2-l2-sync         (HWTSTAMP_FILTER_PTP_V2_L2_SYNC)
        ptpv2-l2-delay-req    (HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ)
        ptpv2-event           (HWTSTAMP_FILTER_PTP_V2_EVENT)
        ptpv2-sync            (HWTSTAMP_FILTER_PTP_V2_SYNC)
        ptpv2-delay-req       (HWTSTAMP_FILTER_PTP_V2_DELAY_REQ)
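
As a quick cross-check of the output above, the capability flags can be tested mechanically. This is only a sketch: the set of required flags is my understanding of what ptp4l needs for hardware timestamping (-H) and should be treated as an assumption, and the function name is made up for illustration.

```python
import re

# Flags assumed to be required for hardware timestamping with ptp4l -H.
# This list is an assumption, not taken from the original report.
REQUIRED_HW_FLAGS = {
    "SOF_TIMESTAMPING_TX_HARDWARE",
    "SOF_TIMESTAMPING_RX_HARDWARE",
    "SOF_TIMESTAMPING_RAW_HARDWARE",
}


def missing_hw_flags(ethtool_output):
    """Return the required flags that are absent from `ethtool -T` output."""
    found = set(re.findall(r"\((SOF_TIMESTAMPING_\w+)\)", ethtool_output))
    return REQUIRED_HW_FLAGS - found


# Capabilities section copied from the `ethtool -T eno1` output above.
SAMPLE = """\
Capabilities:
        hardware-transmit     (SOF_TIMESTAMPING_TX_HARDWARE)
        software-transmit     (SOF_TIMESTAMPING_TX_SOFTWARE)
        hardware-receive      (SOF_TIMESTAMPING_RX_HARDWARE)
        software-receive      (SOF_TIMESTAMPING_RX_SOFTWARE)
        software-system-clock (SOF_TIMESTAMPING_SOFTWARE)
        hardware-raw-clock    (SOF_TIMESTAMPING_RAW_HARDWARE)
"""

if __name__ == "__main__":
    missing = missing_hw_flags(SAMPLE)
    print("missing flags:", sorted(missing) if missing else "none")
```

On a live system the input could come from running `ethtool -T eno1` and passing its text to the function.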

Command lines of the PTP processes:

/usr/sbin/ptp4l -l 5 -f /var/run/timemaster/ptp4l.0.conf -H -i eno1
/usr/sbin/phc2sys -l 5 -a -r -R 1.00 -z /var/run/timemaster/ptp4l.0.socket -t [0:eno1] -n 0 -E ntpshm -M 0

Configuration:
[root@capricorn ~]# more /var/run/timemaster/ptp4l.0.conf 
[global]
slaveOnly 1
domainNumber 0
uds_address /var/run/timemaster/ptp4l.0.socket
message_tag [0:eno1]

Dmesg output:

At first, PTP comes up correctly and syncs the clock with no issues:
[    9.237476] PTP clock support registered
[    9.381258] bnx2x: QLogic 5771x/578xx 10/20-Gigabit Ethernet Driver bnx2x 1.712.30-0 (2014/02/10)
[    9.381681] bnx2x 0000:29:00.0: msix capability found
[    9.381809] bnx2x 0000:29:00.0: part number 394D0030-31383735-31543030-47303030
[    9.575335] bnx2x 0000:29:00.0: irq 119 for MSI/MSI-X
[    9.575344] bnx2x 0000:29:00.0: irq 120 for MSI/MSI-X
[    9.575350] bnx2x 0000:29:00.0: irq 121 for MSI/MSI-X
[    9.575356] bnx2x 0000:29:00.0: irq 122 for MSI/MSI-X
[    9.575361] bnx2x 0000:29:00.0: irq 123 for MSI/MSI-X
[    9.575369] bnx2x 0000:29:00.0: irq 124 for MSI/MSI-X
[    9.575375] bnx2x 0000:29:00.0: irq 125 for MSI/MSI-X
[    9.575380] bnx2x 0000:29:00.0: irq 126 for MSI/MSI-X
[    9.575386] bnx2x 0000:29:00.0: irq 127 for MSI/MSI-X
[    9.575391] bnx2x 0000:29:00.0: irq 128 for MSI/MSI-X
[    9.684072] bnx2x 0000:29:00.1: msix capability found
[    9.684218] bnx2x 0000:29:00.1: part number 394D0030-31383735-31543030-47303030
[    9.872072] bnx2x 0000:29:00.1: irq 130 for MSI/MSI-X
[    9.872079] bnx2x 0000:29:00.1: irq 131 for MSI/MSI-X
[    9.872085] bnx2x 0000:29:00.1: irq 132 for MSI/MSI-X
[    9.872091] bnx2x 0000:29:00.1: irq 133 for MSI/MSI-X
[    9.872097] bnx2x 0000:29:00.1: irq 134 for MSI/MSI-X
[    9.872103] bnx2x 0000:29:00.1: irq 135 for MSI/MSI-X
[    9.872108] bnx2x 0000:29:00.1: irq 136 for MSI/MSI-X
[    9.872115] bnx2x 0000:29:00.1: irq 137 for MSI/MSI-X
[    9.872121] bnx2x 0000:29:00.1: irq 138 for MSI/MSI-X
[    9.872126] bnx2x 0000:29:00.1: irq 139 for MSI/MSI-X
[   51.210732] bnx2x 0000:29:00.0 eno1: using MSI-X  IRQs: sp 119  fp[0] 121 ... fp[7] 128
[   51.463452] bnx2x 0000:29:00.0 eno1: NIC Link is Up, 10000 Mbps full duplex, Flow control: ON - receive & transmit

Then, after a few hours, DMAE timeouts start to appear:

[10679.116279] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!
[10679.194744] bnx2x: [bnx2x_read_dmae:636(eno1)]DMAE returned failure -1
[17970.956219] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!
[17971.034689] bnx2x: [bnx2x_read_dmae:636(eno1)]DMAE returned failure -1
[20708.112105] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!
[20708.190573] bnx2x: [bnx2x_read_dmae:636(eno1)]DMAE returned failure -1
[26475.402157] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!
[26475.480625] bnx2x: [bnx2x_read_dmae:636(eno1)]DMAE returned failure -1
[36452.146498] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!
[36452.224959] bnx2x: [bnx2x_read_dmae:636(eno1)]DMAE returned failure -1
[49120.790072] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!
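
To see whether the DMAE timeouts recur at a regular interval, the gaps between them can be extracted from the log. A minimal sketch, assuming one log record per line as `dmesg` prints them; the function names are mine and the sample lines are copied from the output above.

```python
import re

# Matches a dmesg record such as
# "[10679.116279] bnx2x: [...]DMAE timeout!" and captures the timestamp.
DMAE_TIMEOUT_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s.*DMAE timeout!")


def dmae_timeout_times(lines):
    """Return the seconds-since-boot timestamp of every DMAE timeout record."""
    times = []
    for line in lines:
        match = DMAE_TIMEOUT_RE.match(line)
        if match:
            times.append(float(match.group(1)))
    return times


def gaps(times):
    """Return the differences between consecutive timestamps, in seconds."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


SAMPLE = [
    "[10679.116279] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!",
    "[10679.194744] bnx2x: [bnx2x_read_dmae:636(eno1)]DMAE returned failure -1",
    "[17970.956219] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!",
    "[20708.112105] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!",
    "[26475.402157] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!",
    "[36452.146498] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!",
    "[49120.790072] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!",
]

if __name__ == "__main__":
    for gap in gaps(dmae_timeout_times(SAMPLE)):
        print(f"{gap / 60:.1f} minutes since previous DMAE timeout")
```

For the six timeouts logged here the gaps range from under an hour to several hours, so a fixed timer does not look like the trigger, though six samples are not much to go on.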

Then after 10-12 hours, the driver crashes and no longer passes any traffic:
[49123.612980] bnx2x: [bnx2x_stats_update:1232(eno1)]storm stats were not updated for 3 times
[49123.712369] bnx2x: [bnx2x_stats_update:1233(eno1)]driver assert
[49123.712438] bnx2x: [bnx2x_ptp_adjfreq:13795(eno1)]Failed to set drift
[49123.860922] bnx2x: [bnx2x_panic_dump:923(eno1)]begin crash dump -----------------
[49123.967604] bnx2x: [bnx2x_panic_dump:933(eno1)]def_idx(0xa8ca)  def_att_idx(0x166)  attn_state(0x0)  spq_prod_idx(0xec) next_stats_cnt(0xbfc0)
[49124.138085] bnx2x: [bnx2x_panic_dump:938(eno1)]DSB: attn bits(0x0)  ack(0x10)  id(0x0)  idx(0x166)
[49124.262527] bnx2x: [bnx2x_panic_dump:939(eno1)]     def (0x0 0x0 0x0 0x0 0x0 0x0 0x0 0xac76 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0)  igu_sb_id(0x0)  igu_seg_id(0x1) pf_id(0x0)  vnic_id(0x0)  vf_id(0xff)  vf_valid (0x0) state(0x1)
[49124.667394] bnx2x: [bnx2x_panic_dump:990(eno1)]fp0: rx_bd_prod(0x68a2)  rx_bd_cons(0x6db)  rx_comp_prod(0x69e5)  rx_comp_cons(0x6819)  *rx_cons_sb(0x6819)
[49124.850425] bnx2x: [bnx2x_panic_dump:993(eno1)]     rx_sge_prod(0x400)  last_max_sge(0x0)  fp_hc_idx(0x6683)
[49124.985325] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp0: tx_pkt_prod(0xa)  tx_pkt_cons(0xa)  tx_bd_prod(0x14)  tx_bd_cons(0x13)  *tx_cons_sb(0xa)
[49125.154748] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp0: tx_pkt_prod(0x0)  tx_pkt_cons(0x0)  tx_bd_prod(0x0)  tx_bd_cons(0x0)  *tx_cons_sb(0x0)
[49125.322075] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp0: tx_pkt_prod(0x0)  tx_pkt_cons(0x0)  tx_bd_prod(0x0)  tx_bd_cons(0x0)  *tx_cons_sb(0x0)
[49125.489398] bnx2x: [bnx2x_panic_dump:1021(eno1)]     run indexes (0x6683 0x0)
[49125.623245] bnx2x: [bnx2x_panic_dump:1027(eno1)]     indexes (
[49126.768509] bnx2x: [bnx2x_panic_dump:990(eno1)]fp1: rx_bd_prod(0x1c5)  rx_bd_cons(0x0)  rx_comp_prod(0x1d0)  rx_comp_cons(0x4)  *rx_cons_sb(0x4)
[49126.924883] bnx2x: [bnx2x_enable_ptp_packets:15334(eno1)]Failed to enable PTP packets
[49126.924890] bnx2x: [bnx2x_queue_chk_transition:5516(eno1)]Blocking transition since pending was 20
[49126.924891] bnx2x: [bnx2x_queue_state_change:4665(eno1)]check transition returned an error. rc -16
[49126.924891] bnx2x: [bnx2x_enable_ptp_packets:15334(eno1)]Failed to enable PTP packets
[49127.412351] bnx2x: [bnx2x_panic_dump:993(eno1)]     rx_sge_prod(0x400)  last_max_sge(0x0)  fp_hc_idx(0x4)
[49127.544133] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp1: tx_pkt_prod(0x0)  tx_pkt_cons(0x0)  tx_bd_prod(0x0)  tx_bd_cons(0x0)  *tx_cons_sb(0x0)
[49127.712188] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp1: tx_pkt_prod(0x0)  tx_pkt_cons(0x0)  tx_bd_prod(0x0)  tx_bd_cons(0x0)  *tx_cons_sb(0x0)
[49127.879559] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp1: tx_pkt_prod(0x0)  tx_pkt_cons(0x0)  tx_bd_prod(0x0)  tx_bd_cons(0x0)  *tx_cons_sb(0x0)
[49128.046927] bnx2x: [bnx2x_panic_dump:1021(eno1)]     run indexes (0x4 0x0)
[49128.159417] bnx2x: [bnx2x_panic_dump:1027(eno1)]     indexes (
[49129.321046] bnx2x: [bnx2x_panic_dump:990(eno1)]fp2: rx_bd_prod(0x1c5)  rx_bd_cons(0x0)  rx_comp_prod(0x1d0)  rx_comp_cons(0x4)  *rx_cons_sb(0x4)
[49129.494284] bnx2x: [bnx2x_panic_dump:993(eno1)]     rx_sge_prod(0x400)  last_max_sge(0x0)  fp_hc_idx(0xbabc)
[49129.628519] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp2: tx_pkt_prod(0xbab8)  tx_pkt_cons(0xbab8)  tx_bd_prod(0x76e6)  tx_bd_cons(0x76e5)  *tx_cons_sb(0xbab8)
[49129.812198] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp2: tx_pkt_prod(0x0)  tx_pkt_cons(0x0)  tx_bd_prod(0x0)  tx_bd_cons(0x0)  *tx_cons_sb(0x0)
[49129.979523] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp2: tx_pkt_prod(0x0)  tx_pkt_cons(0x0)  tx_bd_prod(0x0)  tx_bd_cons(0x0)  *tx_cons_sb(0x0)
[49130.146854] bnx2x: [bnx2x_panic_dump:1021(eno1)]     run indexes (0xbabc 0x0)
[49130.279365] bnx2x: [bnx2x_panic_dump:1027(eno1)]     indexes (
[49131.427380] bnx2x: [bnx2x_panic_dump:990(eno1)]fp3: rx_bd_prod(0x1c5)  rx_bd_cons(0x0)  rx_comp_prod(0x1d0)  rx_comp_cons(0x4)  *rx_cons_sb(0x4)
[49131.600621] bnx2x: [bnx2x_panic_dump:993(eno1)]     rx_sge_prod(0x400)  last_max_sge(0x0)  fp_hc_idx(0x9)
[49131.731727] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp3: tx_pkt_prod(0x5)  tx_pkt_cons(0x5)  tx_bd_prod(0xa)  tx_bd_cons(0x9)  *tx_cons_sb(0x5)
[49131.899758] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp3: tx_pkt_prod(0x0)  tx_pkt_cons(0x0)  tx_bd_prod(0x0)  tx_bd_cons(0x0)  *tx_cons_sb(0x0)
[49132.067110] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp3: tx_pkt_prod(0x0)  tx_pkt_cons(0x0)  tx_bd_prod(0x0)  tx_bd_cons(0x0)  *tx_cons_sb(0x0)
[49132.234438] bnx2x: [bnx2x_panic_dump:1021(eno1)]     run indexes (0x9 0x0)
[49132.347020] bnx2x: [bnx2x_panic_dump:1027(eno1)]     indexes (
[49133.508630] bnx2x: [bnx2x_panic_dump:990(eno1)]fp4: rx_bd_prod(0x1c5)  rx_bd_cons(0x0)  rx_comp_prod(0x1d0)  rx_comp_cons(0x4)  *rx_cons_sb(0x4)
[49133.681908] bnx2x: [bnx2x_panic_dump:993(eno1)]     rx_sge_prod(0x400)  last_max_sge(0x0)  fp_hc_idx(0x4)
[49133.813026] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp4: tx_pkt_prod(0x0)  tx_pkt_cons(0x0)  tx_bd_prod(0x0)  tx_bd_cons(0x0)  *tx_cons_sb(0x0)
[49133.981051] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp4: tx_pkt_prod(0x0)  tx_pkt_cons(0x0)  tx_bd_prod(0x0)  tx_bd_cons(0x0)  *tx_cons_sb(0x0)
[49134.148386] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp4: tx_pkt_prod(0x0)  tx_pkt_cons(0x0)  tx_bd_prod(0x0)  tx_bd_cons(0x0)  *tx_cons_sb(0x0)
[49134.315714] bnx2x: [bnx2x_panic_dump:1021(eno1)]     run indexes (0x4 0x0)
[49134.428295] bnx2x: [bnx2x_panic_dump:1027(eno1)]     indexes (
[49135.589916] bnx2x: [bnx2x_panic_dump:990(eno1)]fp5: rx_bd_prod(0xf3af)  rx_bd_cons(0x1e8)  rx_comp_prod(0x8e4)  rx_comp_cons(0x718)  *rx_cons_sb(0x718)
[49135.770509] bnx2x: [bnx2x_panic_dump:993(eno1)]     rx_sge_prod(0x400)  last_max_sge(0x0)  fp_hc_idx(0xeafc)
[49135.904772] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp5: tx_pkt_prod(0x0)  tx_pkt_cons(0x0)  tx_bd_prod(0x0)  tx_bd_cons(0x0)  *tx_cons_sb(0x0)
[49136.072803] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp5: tx_pkt_prod(0x0)  tx_pkt_cons(0x0)  tx_bd_prod(0x0)  tx_bd_cons(0x0)  *tx_cons_sb(0x0)
[49136.240136] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp5: tx_pkt_prod(0x0)  tx_pkt_cons(0x0)  tx_bd_prod(0x0)  tx_bd_cons(0x0)  *tx_cons_sb(0x0)
[49136.407465] bnx2x: [bnx2x_panic_dump:1021(eno1)]     run indexes (0xeafc 0x0)
[49136.523181] bnx2x: [bnx2x_panic_dump:1027(eno1)]     indexes (
[49137.686868] bnx2x: [bnx2x_panic_dump:990(eno1)]fp6: rx_bd_prod(0x1c5)  rx_bd_cons(0x0)  rx_comp_prod(0x1d0)  rx_comp_cons(0x4)  *rx_cons_sb(0x4)
[49137.860145] bnx2x: [bnx2x_panic_dump:993(eno1)]     rx_sge_prod(0x400)  last_max_sge(0x0)  fp_hc_idx(0x5)
[49137.991263] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp6: tx_pkt_prod(0x1)  tx_pkt_cons(0x1)  tx_bd_prod(0x2)  tx_bd_cons(0x1)  *tx_cons_sb(0x1)
[49138.159282] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp6: tx_pkt_prod(0x0)  tx_pkt_cons(0x0)  tx_bd_prod(0x0)  tx_bd_cons(0x0)  *tx_cons_sb(0x0)
[49138.326618] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp6: tx_pkt_prod(0x0)  tx_pkt_cons(0x0)  tx_bd_prod(0x0)  tx_bd_cons(0x0)  *tx_cons_sb(0x0)
[49138.493946] bnx2x: [bnx2x_panic_dump:1021(eno1)]     run indexes (0x5 0x0)
[49138.606521] bnx2x: [bnx2x_panic_dump:1027(eno1)]     indexes (
[49139.766030] bnx2x: [bnx2x_panic_dump:990(eno1)]fp7: rx_bd_prod(0x1c5)  rx_bd_cons(0x0)  rx_comp_prod(0x1d0)  rx_comp_cons(0x4)  *rx_cons_sb(0x4)
[49139.939313] bnx2x: [bnx2x_panic_dump:993(eno1)]     rx_sge_prod(0x400)  last_max_sge(0x0)  fp_hc_idx(0x7)
[49140.070430] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp7: tx_pkt_prod(0x3)  tx_pkt_cons(0x3)  tx_bd_prod(0x6)  tx_bd_cons(0x5)  *tx_cons_sb(0x3)
[49140.238458] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp7: tx_pkt_prod(0x0)  tx_pkt_cons(0x0)  tx_bd_prod(0x0)  tx_bd_cons(0x0)  *tx_cons_sb(0x0)
[49140.405797] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp7: tx_pkt_prod(0x0)  tx_pkt_cons(0x0)  tx_bd_prod(0x0)  tx_bd_cons(0x0)  *tx_cons_sb(0x0)
[49140.573137] bnx2x: [bnx2x_panic_dump:1021(eno1)]     run indexes (0x7 0x0)
[49140.685693] bnx2x: [bnx2x_panic_dump:1027(eno1)]     indexes (
[49141.845218] bnx2x 0000:29:00.0 eno1: bc 7.15.24
[49142.887641] bnx2x: [bnx2x_queue_chk_transition:5516(eno1)]Blocking transition since pending was 20
[49142.887642] bnx2x: [bnx2x_queue_state_change:4665(eno1)]check transition returned an error. rc -16
[49142.887643] bnx2x: [bnx2x_enable_ptp_packets:15334(eno1)]Failed to enable PTP packets
[49142.887646] bnx2x: [bnx2x_queue_chk_transition:5516(eno1)]Blocking transition since pending was 20
[49142.887646] bnx2x: [bnx2x_queue_state_change:4665(eno1)]check transition returned an error. rc -16
[49142.887647] bnx2x: [bnx2x_enable_ptp_packets:15334(eno1)]Failed to enable PTP packets
[49146.107667] bnx2x: [bnx2x_mc_assert:750(eno1)]Chip Revision: everest3, FW Version: 7_13_1
[49146.222713] bnx2x: [bnx2x_panic_dump:1186(eno1)]end crash dump -----------------
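
For anyone comparing against their own logs, the failure sequence can be condensed into one line per message type with its first occurrence and count. This is a rough sketch, assuming each record is on a single line as `dmesg` prints it; the marker list and function name are my own choices, and the sample records are taken from the log above.

```python
import re

# Message fragments that mark the failure stages seen in this report.
MARKERS = (
    "DMAE timeout!",
    "driver assert",
    "Failed to set drift",
    "Failed to enable PTP packets",
)

# Splits a dmesg record into its seconds-since-boot timestamp and message.
RECORD_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s(.*)$")


def summarize(lines):
    """Map each marker to (first timestamp in seconds, number of records)."""
    summary = {}
    for line in lines:
        match = RECORD_RE.match(line)
        if not match:
            continue  # not a timestamped record
        timestamp, message = float(match.group(1)), match.group(2)
        for marker in MARKERS:
            if marker in message:
                first_seen, count = summary.get(marker, (timestamp, 0))
                summary[marker] = (first_seen, count + 1)
    return summary


SAMPLE = [
    "[10679.116279] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!",
    "[49120.790072] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!",
    "[49123.712369] bnx2x: [bnx2x_stats_update:1233(eno1)]driver assert",
    "[49123.712438] bnx2x: [bnx2x_ptp_adjfreq:13795(eno1)]Failed to set drift",
    "[49126.924883] bnx2x: [bnx2x_enable_ptp_packets:15334(eno1)]Failed to enable PTP packets",
]

if __name__ == "__main__":
    for marker, (first_seen, count) in summarize(SAMPLE).items():
        print(f"{marker}: first at {first_seen / 3600:.2f} h after boot, {count} record(s)")
```

Timestamps are seconds since boot, so the hours printed are uptime rather than time since ptp4l was started.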
----
Matthew Huff             | 1 Manhattanville Rd 
Director of Operations   | Purchase, NY 10577
OTA Management LLC       | Phone: 914-460-4039



_______________________________________________
Linuxptp-users mailing list
Linuxptp-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linuxptp-users
