Hi Eelco,

> > >>> #0  0x00007fbc6a78193f in raise () from /lib64/libc.so.6
> > >>> #1  0x00007fbc6a76bc95 in abort () from /lib64/libc.so.6
> > >>> #2  0x00000000004ed1a1 in __umem_elem_push_n (addrs=0x7fbc40f2ec50,
> > >>> n=32, umemp=0x24cc790) at lib/xdpsock.c:32
> > >>> #3  umem_elem_push_n (umemp=0x24cc790, n=32,

I've found that this is caused by freeing the afxdp buffer twice.
free_afxdp_buf() should be called once per dp_packet, but somehow
it gets called twice.
Applying this on top of v11 fixes the issue:
--- a/lib/dp-packet.c
+++ b/lib/dp-packet.c
@@ -145,7 +145,7 @@ dp_packet_uninit(struct dp_packet *b)
             free_dpdk_buf((struct dp_packet*) b);
 #endif
         } else if (b->source == DPBUF_AFXDP) {
-            free_afxdp_buf(b);
+            ;
         }
     }
 }
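
To illustrate why a second free can end in the abort() seen in frame #2,
here is a minimal standalone sketch. It is not the OVS code: toy_pool,
toy_pool_push(), toy_pool_pop() and POOL_SIZE are hypothetical names, and
I am assuming the abort comes from a bounds check of this kind when more
buffers are pushed back than the pool can hold. The real check in
lib/xdpsock.c may differ.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 4             /* Hypothetical size, not the real umem size. */

/* Toy LIFO pool of free buffer addresses (illustration only). */
struct toy_pool {
    void *slots[POOL_SIZE];
    size_t n_free;              /* Number of addresses currently in the pool. */
};

/* Returns a buffer to the pool.  Returns 0 on success, or -1 if the pool
 * is already full, meaning more buffers came back than were handed out. */
static int
toy_pool_push(struct toy_pool *p, void *buf)
{
    if (p->n_free >= POOL_SIZE) {
        return -1;              /* Assumed analogue of the abort() path. */
    }
    p->slots[p->n_free++] = buf;
    return 0;
}

/* Takes a buffer from the pool, or returns NULL if the pool is empty. */
static void *
toy_pool_pop(struct toy_pool *p)
{
    return p->n_free ? p->slots[--p->n_free] : NULL;
}

/* Fills the pool, takes one buffer, then releases it twice.
 * Returns the result of the second (erroneous) release. */
static int
toy_double_release(void)
{
    static char bufs[POOL_SIZE][64];
    struct toy_pool pool = { .n_free = 0 };
    size_t i;

    for (i = 0; i < POOL_SIZE; i++) {
        toy_pool_push(&pool, bufs[i]);
    }

    void *b = toy_pool_pop(&pool);          /* n_free: 4 -> 3 */
    int first = toy_pool_push(&pool, b);    /* n_free: 3 -> 4, legitimate. */
    assert(first == 0);

    return toy_pool_push(&pool, b);         /* Pool is full: overflow. */
}
```

In this sketch toy_double_release() returns -1: the first release refills
the pool and the second one has nowhere to go. With one release per
buffer, the push count never exceeds the pop count and the check never
fires.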

I will work on the next version.
Thank you
William

<snip>
> > >>> addrs=addrs@entry=0x7fbc40f2eea0) at lib/xdpsock.c:43
> > >>> #4  0x00000000009b4f51 in afxdp_complete_tx (xsk=0x24c86f0) at
> > >>> lib/netdev-afxdp.c:736
> > >>> #5  netdev_afxdp_batch_send (netdev=<optimized out>, qid=0,
> > >>> batch=0x7fbc24004e80, concurrent_txq=<optimized out>) at
> > >>> lib/netdev-afxdp.c:763
> > >>> #6  0x0000000000908041 in netdev_send (netdev=<optimized out>,
> > >>> qid=qid@entry=0, batch=batch@entry=0x7fbc24004e80,
> > >>> concurrent_txq=concurrent_txq@entry=true)
> > >>>      at lib/netdev.c:800
> > >>> #7  0x00000000008d4c34 in dp_netdev_pmd_flush_output_on_port
> > >>> (pmd=pmd@entry=0x7fbc40f32010, p=p@entry=0x7fbc24004e50) at
> > >>> lib/dpif-netdev.c:4187
> > >>> #8  0x00000000008d4f4f in dp_netdev_pmd_flush_output_packets
> > >>> (pmd=pmd@entry=0x7fbc40f32010, force=force@entry=false) at
> > >>> lib/dpif-netdev.c:4227
> > >>> #9  0x00000000008dd2e7 in dp_netdev_pmd_flush_output_packets
> > >>> (force=false, pmd=0x7fbc40f32010) at lib/dpif-netdev.c:4282
> > >>> #10 dp_netdev_process_rxq_port (pmd=pmd@entry=0x7fbc40f32010,
> > >>> rxq=0x24ce650, port_no=1) at lib/dpif-netdev.c:4282
> > >>> #11 0x00000000008dd64d in pmd_thread_main (f_=<optimized out>) at
> > >>> lib/dpif-netdev.c:5449
> > >>> #12 0x000000000095e95d in ovsthread_wrapper (aux_=<optimized out>)
> > >>> at
> > >>> lib/ovs-thread.c:352
> > >>> #13 0x00007fbc6b0a12de in start_thread () from
> > >>> /lib64/libpthread.so.0
> > >>> #14 0x00007fbc6a846a63 in clone () from /lib64/libc.so.6
> > >>>
> > >>> After this crash, systemd restart OVS, and it crashed again (guess
> > >>> traffic was still flowing for a bit with the NORMAL rule installed):
> > >>>
> > >>> Program terminated with signal SIGSEGV, Segmentation fault.
> > >>> #0  netdev_afxdp_rxq_recv (rxq_=0x2d5a860, batch=0x7f46f8ff70d0,
> > >>> qfill=0x0) at lib/netdev-afxdp.c:583
> > >>> 583         rx->fd = xsk_socket__fd(xsk->xsk);
> > >>> [Current thread is 1 (Thread 0x7f46f8ff9700 (LWP 28171))]
> > >>>
> > >>> (gdb) bt
> > >>> #0  netdev_afxdp_rxq_recv (rxq_=0x2d5a860, batch=0x7f46f8ff70d0,
> > >>> qfill=0x0) at lib/netdev-afxdp.c:583
> > >>> #1  0x0000000000907f31 in netdev_rxq_recv (rx=<optimized out>,
> > >>> batch=batch@entry=0x7f46f8ff70d0, qfill=<optimized out>) at
> > >>> lib/netdev.c:710
> > >>> #2  0x00000000008dd1d3 in dp_netdev_process_rxq_port
> > >>> (pmd=pmd@entry=0x2d8f0c0, rxq=0x2d8c090, port_no=2) at
> > >>> lib/dpif-netdev.c:4257
> > >>> #3  0x00000000008dd64d in pmd_thread_main (f_=<optimized out>) at
> > >>> lib/dpif-netdev.c:5449
> > >>> #4  0x000000000095e95d in ovsthread_wrapper (aux_=<optimized out>)
> > >>> at
> > >>> lib/ovs-thread.c:352
> > >>> #5  0x00007f47229732de in start_thread () from
> > >>> /lib64/libpthread.so.0
> > >>> #6  0x00007f4722118a63 in clone () from /lib64/libc.so.6
> > >>>
> > >>> I did not further investigate, but it should be easy to replicate.
> > >>> This
> > >>> is the same setup that worked fine with the v8 patchset for port to
> > >>> port.
> > >>> Next step was to verify PVP was fixed, but could not get there…
> > >>> Cheers,
> > >>
> > >> I'm not able to reproduce it on my testbed using i40e, I will try
> > >> using ixgbe today.
> > >>
> > >> btw, if you try skb-mode, does the crash still show up?
> > >> Although skb-mode is much slower, so it might not trigger the issue.
> > >>
> > >> Regards,
> > >> William
> > >>
> > >>>
> > <SNIP>
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
