I've got two new HPE servers running PTP. PTP starts normally, but after 10-12 hours the bnx2x driver crashes. I've tried both the HPE driver for the NIC and the Red Hat native driver. Both show identical issues.
Has anyone run into this issue? Any suggestions or workarounds? I have tickets open with both Red Hat and HPE, but so far, no help.

Environment
-----------
HPE ProLiant DL560 Gen10
HPE FlexFabric 10Gb 2-port 534FLR-SFP+ Adapter, Firmware 7.17.19
Red Hat Enterprise Linux Server release 7.5 (Maipo)
Kernel 3.10.0-862.6.3.el7.x86_64
linuxptp-1.8-5.el7.x86_64

HPE Driver
----------
kmod-netxtreme2-7.14.46-1.rhel7u5.x86_64
bnx2x: QLogic 5771x/578xx 10/20-Gigabit Ethernet Driver bnx2x 1.714.18 ($DateTime: 2018/04/19 20:00:08 $)

Red Hat Native Driver
---------------------
bnx2x: QLogic 5771x/578xx 10/20-Gigabit Ethernet Driver bnx2x 1.712.30-0 (2014/02/10)

The hardware reports support for hardware timestamping:

[root@antares firmware-nic-qlogic-nx2-2.22.15-1.1]# ethtool -T eno1
Time stamping parameters for eno1:
Capabilities:
    hardware-transmit (SOF_TIMESTAMPING_TX_HARDWARE)
    software-transmit (SOF_TIMESTAMPING_TX_SOFTWARE)
    hardware-receive (SOF_TIMESTAMPING_RX_HARDWARE)
    software-receive (SOF_TIMESTAMPING_RX_SOFTWARE)
    software-system-clock (SOF_TIMESTAMPING_SOFTWARE)
    hardware-raw-clock (SOF_TIMESTAMPING_RAW_HARDWARE)
PTP Hardware Clock: 0
Hardware Transmit Timestamp Modes:
    off (HWTSTAMP_TX_OFF)
    on (HWTSTAMP_TX_ON)
Hardware Receive Filter Modes:
    none (HWTSTAMP_FILTER_NONE)
    ptpv1-l4-event (HWTSTAMP_FILTER_PTP_V1_L4_EVENT)
    ptpv1-l4-sync (HWTSTAMP_FILTER_PTP_V1_L4_SYNC)
    ptpv1-l4-delay-req (HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ)
    ptpv2-l4-event (HWTSTAMP_FILTER_PTP_V2_L4_EVENT)
    ptpv2-l4-sync (HWTSTAMP_FILTER_PTP_V2_L4_SYNC)
    ptpv2-l4-delay-req (HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ)
    ptpv2-l2-event (HWTSTAMP_FILTER_PTP_V2_L2_EVENT)
    ptpv2-l2-sync (HWTSTAMP_FILTER_PTP_V2_L2_SYNC)
    ptpv2-l2-delay-req (HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ)
    ptpv2-event (HWTSTAMP_FILTER_PTP_V2_EVENT)
    ptpv2-sync (HWTSTAMP_FILTER_PTP_V2_SYNC)
    ptpv2-delay-req (HWTSTAMP_FILTER_PTP_V2_DELAY_REQ)

Command lines of the PTP processes:

/usr/sbin/ptp4l -l 5 -f /var/run/timemaster/ptp4l.0.conf -H -i eno1
/usr/sbin/phc2sys -l 5 -a -r -R 1.00 -z /var/run/timemaster/ptp4l.0.socket -t [0:eno1] -n 0 -E ntpshm -M 0

Configuration:

[root@capricorn ~]# more /var/run/timemaster/ptp4l.0.conf
[global]
slaveOnly 1
domainNumber 0
uds_address /var/run/timemaster/ptp4l.0.socket
message_tag [0:eno1]

Dmesg output:

At first, PTP comes up correctly and syncs the clock with no issues:

[ 9.237476] PTP clock support registered
[ 9.381258] bnx2x: QLogic 5771x/578xx 10/20-Gigabit Ethernet Driver bnx2x 1.712.30-0 (2014/02/10)
[ 9.381681] bnx2x 0000:29:00.0: msix capability found
[ 9.381809] bnx2x 0000:29:00.0: part number 394D0030-31383735-31543030-47303030
[ 9.575335] bnx2x 0000:29:00.0: irq 119 for MSI/MSI-X
[ 9.575344] bnx2x 0000:29:00.0: irq 120 for MSI/MSI-X
[ 9.575350] bnx2x 0000:29:00.0: irq 121 for MSI/MSI-X
[ 9.575356] bnx2x 0000:29:00.0: irq 122 for MSI/MSI-X
[ 9.575361] bnx2x 0000:29:00.0: irq 123 for MSI/MSI-X
[ 9.575369] bnx2x 0000:29:00.0: irq 124 for MSI/MSI-X
[ 9.575375] bnx2x 0000:29:00.0: irq 125 for MSI/MSI-X
[ 9.575380] bnx2x 0000:29:00.0: irq 126 for MSI/MSI-X
[ 9.575386] bnx2x 0000:29:00.0: irq 127 for MSI/MSI-X
[ 9.575391] bnx2x 0000:29:00.0: irq 128 for MSI/MSI-X
[ 9.684072] bnx2x 0000:29:00.1: msix capability found
[ 9.684218] bnx2x 0000:29:00.1: part number 394D0030-31383735-31543030-47303030
[ 9.872072] bnx2x 0000:29:00.1: irq 130 for MSI/MSI-X
[ 9.872079] bnx2x 0000:29:00.1: irq 131 for MSI/MSI-X
[ 9.872085] bnx2x 0000:29:00.1: irq 132 for MSI/MSI-X
[ 9.872091] bnx2x 0000:29:00.1: irq 133 for MSI/MSI-X
[ 9.872097] bnx2x 0000:29:00.1: irq 134 for MSI/MSI-X
[ 9.872103] bnx2x 0000:29:00.1: irq 135 for MSI/MSI-X
[ 9.872108] bnx2x 0000:29:00.1: irq 136 for MSI/MSI-X
[ 9.872115] bnx2x 0000:29:00.1: irq 137 for MSI/MSI-X
[ 9.872121] bnx2x 0000:29:00.1: irq 138 for MSI/MSI-X
[ 9.872126] bnx2x 0000:29:00.1: irq 139 for MSI/MSI-X
[ 51.210732] bnx2x 0000:29:00.0 eno1: using MSI-X IRQs: sp 119 fp[0] 121 ... fp[7] 128
[ 51.463452] bnx2x 0000:29:00.0 eno1: NIC Link is Up, 10000 Mbps full duplex, Flow control: ON - receive & transmit

Then, after a few hours:

[10679.116279] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!
[10679.194744] bnx2x: [bnx2x_read_dmae:636(eno1)]DMAE returned failure -1
[17970.956219] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!
[17971.034689] bnx2x: [bnx2x_read_dmae:636(eno1)]DMAE returned failure -1
[20708.112105] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!
[20708.190573] bnx2x: [bnx2x_read_dmae:636(eno1)]DMAE returned failure -1
[26475.402157] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!
[26475.480625] bnx2x: [bnx2x_read_dmae:636(eno1)]DMAE returned failure -1
[36452.146498] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!
[36452.224959] bnx2x: [bnx2x_read_dmae:636(eno1)]DMAE returned failure -1
[49120.790072] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!

Then after 10-12 hours, the driver crashes and no longer passes any traffic:

[49123.612980] bnx2x: [bnx2x_stats_update:1232(eno1)]storm stats were not updated for 3 times
[49123.712369] bnx2x: [bnx2x_stats_update:1233(eno1)]driver assert
[49123.712438] bnx2x: [bnx2x_ptp_adjfreq:13795(eno1)]Failed to set drift
[49123.860922] bnx2x: [bnx2x_panic_dump:923(eno1)]begin crash dump -----------------
[49123.967604] bnx2x: [bnx2x_panic_dump:933(eno1)]def_idx(0xa8ca) def_att_idx(0x166) attn_state(0x0) spq_prod_idx(0xec) next_stats_cnt(0xbfc0)
[49124.138085] bnx2x: [bnx2x_panic_dump:938(eno1)]DSB: attn bits(0x0) ack(0x10) id(0x0) idx(0x166)
[49124.262527] bnx2x: [bnx2x_panic_dump:939(eno1)] def (0x0 0x0 0x0 0x0 0x0 0x0 0x0 0xac76 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0) igu_sb_id(0x0) igu_seg_id(0x1) pf_id(0x0) vnic_id(0x0) vf_id(0xff) vf_valid (0x0) state(0x1)
[49124.667394] bnx2x: [bnx2x_panic_dump:990(eno1)]fp0: rx_bd_prod(0x68a2) rx_bd_cons(0x6db) rx_comp_prod(0x69e5) rx_comp_cons(0x6819) *rx_cons_sb(0x6819)
[49124.850425] bnx2x: [bnx2x_panic_dump:993(eno1)] rx_sge_prod(0x400) last_max_sge(0x0) fp_hc_idx(0x6683)
[49124.985325] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp0: tx_pkt_prod(0xa) tx_pkt_cons(0xa) tx_bd_prod(0x14) tx_bd_cons(0x13) *tx_cons_sb(0xa)
[49125.154748] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp0: tx_pkt_prod(0x0) tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49125.322075] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp0: tx_pkt_prod(0x0) tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49125.489398] bnx2x: [bnx2x_panic_dump:1021(eno1)] run indexes (0x6683 0x0)
[49125.623245] bnx2x: [bnx2x_panic_dump:1027(eno1)] indexes (
[49126.768509] bnx2x: [bnx2x_panic_dump:990(eno1)]fp1: rx_bd_prod(0x1c5) rx_bd_cons(0x0) rx_comp_prod(0x1d0) rx_comp_cons(0x4) *rx_cons_sb(0x4)
[49126.924883] bnx2x: [bnx2x_enable_ptp_packets:15334(eno1)]Failed to enable PTP packets
[49126.924890] bnx2x: [bnx2x_queue_chk_transition:5516(eno1)]Blocking transition since pending was 20
[49126.924891] bnx2x: [bnx2x_queue_state_change:4665(eno1)]check transition returned an error. rc -16
[49126.924891] bnx2x: [bnx2x_enable_ptp_packets:15334(eno1)]Failed to enable PTP packets
[49127.412351] bnx2x: [bnx2x_panic_dump:993(eno1)] rx_sge_prod(0x400) last_max_sge(0x0) fp_hc_idx(0x4)
[49127.544133] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp1: tx_pkt_prod(0x0) tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49127.712188] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp1: tx_pkt_prod(0x0) tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49127.879559] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp1: tx_pkt_prod(0x0) tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49128.046927] bnx2x: [bnx2x_panic_dump:1021(eno1)] run indexes (0x4 0x0)
[49128.159417] bnx2x: [bnx2x_panic_dump:1027(eno1)] indexes (
[49129.321046] bnx2x: [bnx2x_panic_dump:990(eno1)]fp2: rx_bd_prod(0x1c5) rx_bd_cons(0x0) rx_comp_prod(0x1d0) rx_comp_cons(0x4) *rx_cons_sb(0x4)
[49129.494284] bnx2x: [bnx2x_panic_dump:993(eno1)] rx_sge_prod(0x400) last_max_sge(0x0) fp_hc_idx(0xbabc)
[49129.628519] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp2: tx_pkt_prod(0xbab8) tx_pkt_cons(0xbab8) tx_bd_prod(0x76e6) tx_bd_cons(0x76e5) *tx_cons_sb(0xbab8)
[49129.812198] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp2: tx_pkt_prod(0x0) tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49129.979523] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp2: tx_pkt_prod(0x0) tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49130.146854] bnx2x: [bnx2x_panic_dump:1021(eno1)] run indexes (0xbabc 0x0)
[49130.279365] bnx2x: [bnx2x_panic_dump:1027(eno1)] indexes (
[49131.427380] bnx2x: [bnx2x_panic_dump:990(eno1)]fp3: rx_bd_prod(0x1c5) rx_bd_cons(0x0) rx_comp_prod(0x1d0) rx_comp_cons(0x4) *rx_cons_sb(0x4)
[49131.600621] bnx2x: [bnx2x_panic_dump:993(eno1)] rx_sge_prod(0x400) last_max_sge(0x0) fp_hc_idx(0x9)
[49131.731727] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp3: tx_pkt_prod(0x5) tx_pkt_cons(0x5) tx_bd_prod(0xa) tx_bd_cons(0x9) *tx_cons_sb(0x5)
[49131.899758] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp3: tx_pkt_prod(0x0) tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49132.067110] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp3: tx_pkt_prod(0x0) tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49132.234438] bnx2x: [bnx2x_panic_dump:1021(eno1)] run indexes (0x9 0x0)
[49132.347020] bnx2x: [bnx2x_panic_dump:1027(eno1)] indexes (
[49133.508630] bnx2x: [bnx2x_panic_dump:990(eno1)]fp4: rx_bd_prod(0x1c5) rx_bd_cons(0x0) rx_comp_prod(0x1d0) rx_comp_cons(0x4) *rx_cons_sb(0x4)
[49133.681908] bnx2x: [bnx2x_panic_dump:993(eno1)] rx_sge_prod(0x400) last_max_sge(0x0) fp_hc_idx(0x4)
[49133.813026] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp4: tx_pkt_prod(0x0) tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49133.981051] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp4: tx_pkt_prod(0x0) tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49134.148386] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp4: tx_pkt_prod(0x0) tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49134.315714] bnx2x: [bnx2x_panic_dump:1021(eno1)] run indexes (0x4 0x0)
[49134.428295] bnx2x: [bnx2x_panic_dump:1027(eno1)] indexes (
[49135.589916] bnx2x: [bnx2x_panic_dump:990(eno1)]fp5: rx_bd_prod(0xf3af) rx_bd_cons(0x1e8) rx_comp_prod(0x8e4) rx_comp_cons(0x718) *rx_cons_sb(0x718)
[49135.770509] bnx2x: [bnx2x_panic_dump:993(eno1)] rx_sge_prod(0x400) last_max_sge(0x0) fp_hc_idx(0xeafc)
[49135.904772] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp5: tx_pkt_prod(0x0) tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49136.072803] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp5: tx_pkt_prod(0x0) tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49136.240136] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp5: tx_pkt_prod(0x0) tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49136.407465] bnx2x: [bnx2x_panic_dump:1021(eno1)] run indexes (0xeafc 0x0)
[49136.523181] bnx2x: [bnx2x_panic_dump:1027(eno1)] indexes (
[49137.686868] bnx2x: [bnx2x_panic_dump:990(eno1)]fp6: rx_bd_prod(0x1c5) rx_bd_cons(0x0) rx_comp_prod(0x1d0) rx_comp_cons(0x4) *rx_cons_sb(0x4)
[49137.860145] bnx2x: [bnx2x_panic_dump:993(eno1)] rx_sge_prod(0x400) last_max_sge(0x0) fp_hc_idx(0x5)
[49137.991263] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp6: tx_pkt_prod(0x1) tx_pkt_cons(0x1) tx_bd_prod(0x2) tx_bd_cons(0x1) *tx_cons_sb(0x1)
[49138.159282] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp6: tx_pkt_prod(0x0) tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49138.326618] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp6: tx_pkt_prod(0x0) tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49138.493946] bnx2x: [bnx2x_panic_dump:1021(eno1)] run indexes (0x5 0x0)
[49138.606521] bnx2x: [bnx2x_panic_dump:1027(eno1)] indexes (
[49139.766030] bnx2x: [bnx2x_panic_dump:990(eno1)]fp7: rx_bd_prod(0x1c5) rx_bd_cons(0x0) rx_comp_prod(0x1d0) rx_comp_cons(0x4) *rx_cons_sb(0x4)
[49139.939313] bnx2x: [bnx2x_panic_dump:993(eno1)] rx_sge_prod(0x400) last_max_sge(0x0) fp_hc_idx(0x7)
[49140.070430] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp7: tx_pkt_prod(0x3) tx_pkt_cons(0x3) tx_bd_prod(0x6) tx_bd_cons(0x5) *tx_cons_sb(0x3)
[49140.238458] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp7: tx_pkt_prod(0x0) tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49140.405797] bnx2x: [bnx2x_panic_dump:1010(eno1)]fp7: tx_pkt_prod(0x0) tx_pkt_cons(0x0) tx_bd_prod(0x0) tx_bd_cons(0x0) *tx_cons_sb(0x0)
[49140.573137] bnx2x: [bnx2x_panic_dump:1021(eno1)] run indexes (0x7 0x0)
[49140.685693] bnx2x: [bnx2x_panic_dump:1027(eno1)] indexes (
[49141.845218] bnx2x 0000:29:00.0 eno1: bc 7.15.24
[49142.887641] bnx2x: [bnx2x_queue_chk_transition:5516(eno1)]Blocking transition since pending was 20
[49142.887642] bnx2x: [bnx2x_queue_state_change:4665(eno1)]check transition returned an error. rc -16
[49142.887643] bnx2x: [bnx2x_enable_ptp_packets:15334(eno1)]Failed to enable PTP packets
[49142.887646] bnx2x: [bnx2x_queue_chk_transition:5516(eno1)]Blocking transition since pending was 20
[49142.887646] bnx2x: [bnx2x_queue_state_change:4665(eno1)]check transition returned an error. rc -16
[49142.887647] bnx2x: [bnx2x_enable_ptp_packets:15334(eno1)]Failed to enable PTP packets
[49146.107667] bnx2x: [bnx2x_mc_assert:750(eno1)]Chip Revision: everest3, FW Version: 7_13_1
[49146.222713] bnx2x: [bnx2x_panic_dump:1186(eno1)]end crash dump -----------------

----
Matthew Huff           | 1 Manhattanville Rd
Director of Operations | Purchase, NY 10577
OTA Management LLC     | Phone: 914-460-4039
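
In case anyone wants to compare the failure timeline against their own machines, here is a rough sketch that pulls the "DMAE timeout!" timestamps out of dmesg text and reports the spacing between them. It is not part of the setup described above; the function names `dmae_timeouts` and `gaps` are made up for illustration, and the regular expression assumes the exact message format shown in the dmesg excerpt in this post.

```python
import re

# Assumed message format, taken from the dmesg excerpt in this post:
# [10679.116279] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!
DMAE_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+bnx2x:\s+"
    r"\[bnx2x_issue_dmae_with_comp:\d+\((\w+)\)\]DMAE timeout!"
)


def dmae_timeouts(dmesg_text, iface="eno1"):
    """Return the kernel timestamps (seconds since boot) of DMAE timeouts on iface."""
    return [
        float(m.group(1))
        for m in DMAE_RE.finditer(dmesg_text)
        if m.group(2) == iface
    ]


def gaps(times):
    """Return the seconds elapsed between consecutive timestamps."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


# Three entries copied from the log above, used here as sample input.
SAMPLE = """\
[10679.116279] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!
[10679.194744] bnx2x: [bnx2x_read_dmae:636(eno1)]DMAE returned failure -1
[17970.956219] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!
[17971.034689] bnx2x: [bnx2x_read_dmae:636(eno1)]DMAE returned failure -1
[20708.112105] bnx2x: [bnx2x_issue_dmae_with_comp:549(eno1)]DMAE timeout!
"""

times = dmae_timeouts(SAMPLE)
for t in times:
    print(f"DMAE timeout at {t / 3600:.2f} h after boot")
for g in gaps(times):
    print(f"gap: {g / 60:.1f} min")
```

To run it against a live system, replace `SAMPLE` with the saved output of `dmesg` (for example, read from a file or from standard input).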