Hi,

Updated info for MTU.

Br, Xiaoping

From: Xiaoping Yan (NSB) <[email protected]>
Sent: July 5, 2023 11:34
To: Alexander Kozyrev <[email protected]>; Matan Azrad <[email protected]>; 
[email protected]; Dekel Peled <[email protected]>
Subject: [External] RE: dpdk mlx5 driver crash in rxq_cq_decompress_v



CAUTION: This is an external email. Please be very careful when clicking links 
or opening attachments. See http://nok.it/nsb for additional information.




Hi Alex,

I forwarded the dump to you in separate mail.
Test topology:
Spirent TestCenter <=> IPsec GW <=> DUT (IPsec, GTPu forward) <=> IPsec GW <=> Spirent TestCenter
Plain GTPu packets between TestCenter and IPsec GW; IPsec between IPsec GW and DUT.
Traffic pattern:
downlink: gtpu packet length 1236, throughput: 3.65Gbps
uplink: gtpu packet length 302, throughput 0.47Gbps
MTU:
A Mellanox SR-IOV VF is used in both the IPsec GW and the DUT.
PF MTU 9000, VF MTU 2000 (the same crash is also seen with VF MTU 1500)
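
For a rough sense of the load, the packet rates implied by the traffic pattern above can be estimated as follows. This is only a back-of-the-envelope sketch: it assumes the stated packet lengths are in bytes and that the throughput figures count exactly those bytes (no Ethernet preamble, inter-frame gap, or IPsec overhead). The function name is made up for illustration.

```python
def packets_per_second(throughput_gbps: float, packet_len_bytes: int) -> float:
    """Estimate the packet rate from a throughput and a fixed packet length.

    Assumes 1 Gbps = 1e9 bits/s and that the throughput counts only
    packet_len_bytes per packet (no L1 or IPsec overhead).
    """
    if packet_len_bytes <= 0:
        raise ValueError("packet length must be positive")
    return throughput_gbps * 1e9 / (packet_len_bytes * 8)


# Figures taken from the traffic pattern above.
downlink_pps = packets_per_second(3.65, 1236)  # roughly 369 thousand packets/s
uplink_pps = packets_per_second(0.47, 302)     # roughly 195 thousand packets/s
```

Under these assumptions the downlink direction carries roughly twice as many packets per second as the uplink.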

The crash is seen in one of the IPsec GWs.

This commit (547b239a21) is not included in my DPDK version.

Thank you.

Br, Xiaoping

From: Alexander Kozyrev <[email protected]>
Sent: July 5, 2023 9:45
To: Matan Azrad <[email protected]>; Xiaoping Yan (NSB) <[email protected]>; 
[email protected]; Dekel Peled <[email protected]>
Subject: [External] RE: dpdk mlx5 driver crash in rxq_cq_decompress_v







Hi Xiaoping, could you please forward the error CQE dump to me?
Would you mind elaborating more on your traffic pattern and test case scenario?
The following commit is supposed to ignore the MTU mismatch error between VF and PF:
547b239a21 net/mlx5: ignore non-critical syndromes for Rx queue
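
As a rough illustration of what "ignoring non-critical syndromes" means, here is a minimal Python sketch. It is not the driver code. The syndrome names follow the mlx5 naming style, but the numeric values, the choice of which syndromes count as critical, and the function name are assumptions made for illustration; the commit itself is the reference for the actual logic.

```python
# Illustrative CQE error syndromes (names and values are assumptions).
LOCAL_LENGTH_ERR = 0x01  # e.g. a packet larger than the posted Rx buffer
LOCAL_QP_OP_ERR = 0x02
LOCAL_PROT_ERR = 0x04
WR_FLUSH_ERR = 0x05

# Assumption: only these syndromes leave the Rx queue unusable.
CRITICAL_SYNDROMES = frozenset({LOCAL_QP_OP_ERR, LOCAL_PROT_ERR, WR_FLUSH_ERR})


def handle_error_cqe(syndrome: int) -> str:
    """Return the action a receive path would take for an error completion.

    Critical syndromes trigger a queue reset; anything else only drops
    the affected packet and lets the queue continue.
    """
    if syndrome in CRITICAL_SYNDROMES:
        return "reset_queue"
    return "drop_packet"
```

With this split, a length error such as the one an MTU mismatch could produce maps to `drop_packet`, so the queue keeps receiving instead of entering recovery.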

Regards,
Alex

From: Matan Azrad <[email protected]>
Sent: Sunday, July 2, 2023 11:35 PM
To: Xiaoping Yan (NSB) <[email protected]>; [email protected]; 
Dekel Peled <[email protected]>; Alexander Kozyrev <[email protected]>
Subject: Re: dpdk mlx5 driver crash in rxq_cq_decompress_v

+ @Alexander Kozyrev <[email protected]> to suggest.

________________________________
From: Xiaoping Yan (NSB) <[email protected]>
Sent: Monday, July 3, 2023 4:18:22 AM
To: [email protected]; Matan Azrad <[email protected]>; [email protected]
Subject: RE: dpdk mlx5 driver crash in rxq_cq_decompress_v




Hi,



@'[email protected]' @'Matan Azrad' Can you kindly suggest?

Thank you.



Br, Xiaoping



From: Xiaoping Yan (NSB)
Sent: June 27, 2023 12:11
To: [email protected]; 'Matan Azrad' <[email protected]>; '[email protected]' 
<[email protected]>
Subject: dpdk mlx5 driver crash in rxq_cq_decompress_v



Hi,



DPDK version in use: 21.11.2

The mlx5 driver crashes in rxq_cq_decompress_v during a traffic test after 
several minutes.

Stack trace:

(gdb) bt
#0  0x00007ffff58612bc in _mm_storeu_si128 (__B=..., __P=<optimized out>)
    at /usr/lib/gcc/x86_64-redhat-linux/12/include/emmintrin.h:739
#1  rxq_cq_decompress_v (rxq=rxq@entry=0x2abe5592f40, cq=cq@entry=0x2abe54fdb00, elts=elts@entry=0x2abe5594638)
    at ../dpdk-21.11/drivers/net/mlx5/mlx5_rxtx_vec_sse.h:142
#2  0x00007ffff5862c84 in rxq_burst_v (no_cq=<synthetic pointer>, err=0x7fffffffb848, pkts_n=4, pkts=<optimized out>,
    rxq=0x2abe5592f40) at ../dpdk-21.11/drivers/net/mlx5/mlx5_rxtx_vec.c:349
#3  mlx5_rx_burst_vec (dpdk_rxq=0x2abe5592f40, pkts=0x7fffffffbf80, pkts_n=32) at ../dpdk-21.11/drivers/net/mlx5/mlx5_rxtx_vec.c:393
#4  0x00005555556a0f41 in rte_eth_rx_burst (nb_pkts=32, rx_pkts=0x7fffffffbf80, queue_id=0, port_id=1)
    at /usr/include/rte_ethdev.h:5721

…

Attached are the error log (“Unexpected CQE error syndrome…”) and the dump file.



I found there was a similar bug here: https://bugs.dpdk.org/show_bug.cgi?id=334

But the fix (88c0733535d6 extend Rx completion with error handling) should 
already be included, as I’m using 21.11.2.

Also, the commit below (a fix to 88c0733535d6) is already included in my DPDK 
version.

commit 60b254e3923d007bcadbb8d410f95ad89a2f13fa
Author: Matan Azrad <[email protected]>
Date:   Thu Aug 11 19:51:55 2022 +0300

    net/mlx5: fix Rx queue recovery mechanism



Any suggestion?

Thank you.



Br, Xiaoping

