On Thu, 1 Sep 2022 09:33:54 +0200
Anna Tauzzi <[email protected]> wrote:

> I'm using the Mellanox ConnectX-5:
> 
> pci@0000:3b:00.0  enp59s0f0np0   network        MT27800 Family [ConnectX-5]
> pci@0000:3b:00.1  enp59s0f1np1   network        MT27800 Family [ConnectX-5]
> pci@0000:3b:00.2  enp59s0f0v0    network        MT27800 Family [ConnectX-5 Virtual Function]
> pci@0000:3b:00.3  enp59s0f0v1    network        MT27800 Family [ConnectX-5 Virtual Function]
> pci@0000:3b:00.4  enp59s0f0v2    network        MT27800 Family [ConnectX-5 Virtual Function]
> pci@0000:3b:00.5  enp59s0f0v3    network        MT27800 Family [ConnectX-5 Virtual Function]
> pci@0000:3b:04.2  enp59s0f1v0    network        MT27800 Family [ConnectX-5 Virtual Function]
> pci@0000:3b:04.3  enp59s0f1v1    network        MT27800 Family [ConnectX-5 Virtual Function]
> pci@0000:3b:04.4  enp59s0f1v2    network        MT27800 Family [ConnectX-5 Virtual Function]
> pci@0000:3b:04.5  enp59s0f1v3    network        MT27800 Family [ConnectX-5 Virtual Function]
> 
> This is the message:
> lcore 6 called tx_pkt_burst for not ready port 0
> 8: [/lib/x86_64-linux-gnu/libc.so.6(+0x126a00) [0x7ffff7c77a00]]
> 7: [/lib/x86_64-linux-gnu/libc.so.6(+0x94b43) [0x7ffff7be5b43]]
> 6: [/usr/local/lib/librte_eal.so.22(+0x1559a) [0x7ffff7d8e59a]]
> 5: [build/simple_eth_tx_mp(+0x1a0c7) [0x55555556e0c7]]
> 4: [build/simple_eth_tx_mp(+0x19f89) [0x55555556df89]]
> 3: [build/simple_eth_tx_mp(+0x423c) [0x55555555823c]]
> 2: [/usr/local/lib/librte_ethdev.so.22(+0x7cbc) [0x7ffff7eb3cbc]]
> 1: [/usr/local/lib/librte_eal.so.22(rte_dump_stack+0x32) [0x7ffff7daf152]]
> 
> I'm having all sorts of problems with this Mellanox stuff, Intel cards are
> much more user friendly.
> 
> Just to recap:
> * configure on primary and transmit on primary           ---> GOOD
> 
> * configure on secondary and transmit on secondary  ---> SIGSEGV
> Thread 4 "lcore-worker-6" received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0x7ffff4346640 (LWP 7208)]
> rte_eth_tx_burst (port_id=0, queue_id=0, tx_pkts=0x7ffff4344ac0, nb_pkts=1) at /usr/local/include/rte_ethdev.h:5650
> 5650            qd = p->txq.data[queue_id];
> (gdb) print p->txq
> $2 = {data = 0x0, clbk = 0x7ffff7f21528 <rte_eth_devices+8296>} (data is NULL)
> 
> 
> * configure on primary and transmit on secondary       ---> PORT NOT READY
> 
> Do you know who should be notified of this problem? Should I open a bug on
> DPDK bugzilla or file it to NVIDIA?
> 
> Thx.
> 
> 
> 
> On Thu, 1 Sep 2022 at 03:25, Stephen Hemminger <[email protected]> wrote:
> 
> > On Wed, 31 Aug 2022 22:59:56 +0200
> > Anna Tauzzi <[email protected]> wrote:
> >  
> > > I initialize a port with the following methods on a primary process:
> > >
> > > rte_dev_probe(vf)
> > >
> > > rte_eth_dev_configure(port_id, ... );
> > >
> > > rte_eth_dev_adjust_nb_rx_tx_desc(port_id, ... );
> > >
> > > rte_eth_rx_queue_setup(port_id, .... );
> > >
> > > rte_eth_tx_queue_setup(port_id, ... );
> > >
> > > rte_eth_dev_start(port_id ... );
> > >
> > >
> > >
> > > > Then I use the rte_eth_tx_burst(port_id) in the secondary process but I get
> > > > this message:
> > >
> > > called tx_pkt_burst for not ready port 0
> > >
> > > Is this expected?  
> >
> > No, this looks like a device driver bug. Which PMD?

Which versions of rdma-core and the kernel are you using?
Earlier versions had some bugs in secondary process support.
Those have since been fixed; some users are running failsafe and mlx5 on Azure
with secondary processes.
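
For anyone collecting that information, here is a minimal sketch. It assumes
rdma-core was installed from distribution packages (dpkg or rpm) rather than
built from source or shipped with a vendor OFED bundle, and it takes the
interface name from the listing quoted above. The function name
collect_versions is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the versions relevant to mlx5 secondary-process issues.
collect_versions() {
    # Interface name is an assumption taken from the listing quoted above.
    iface="${1:-enp59s0f0np0}"

    # Running kernel release.
    echo "kernel: $(uname -r)"

    # rdma-core package version, assuming it came from distribution packages.
    if command -v dpkg-query >/dev/null 2>&1; then
        dpkg-query -W -f='rdma-core: ${Version}\n' rdma-core 2>/dev/null \
            || echo "rdma-core: not found via dpkg"
    elif command -v rpm >/dev/null 2>&1; then
        rpm -q rdma-core 2>/dev/null || echo "rdma-core: not found via rpm"
    else
        echo "rdma-core: unknown package manager"
    fi

    # Kernel driver and firmware version for the port, if ethtool is available.
    if command -v ethtool >/dev/null 2>&1; then
        ethtool -i "$iface" 2>/dev/null || echo "ethtool: no info for $iface"
    fi
    return 0
}

collect_versions "$@"
```

Pasting the output into a reply gives the kernel release, the packaged
rdma-core version, and the driver and firmware strings in one place.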
