On Wed, Mar 25, 2020 at 12:53:18AM +0100, Ilya Maximets wrote:
> On 3/6/20 5:11 PM, Ilya Maximets wrote:
> > On 3/4/20 7:09 PM, William Tu wrote:
> >> On Fri, Feb 28, 2020 at 12:12 AM Ilya Maximets <i.maxim...@ovn.org> wrote:
> >>>
> >>> On 2/27/20 8:52 PM, William Tu wrote:
> >>>> On Thu, Feb 27, 2020 at 11:10 AM William Tu <u9012...@gmail.com> wrote:
> >>>>>
> >>>>> On Thu, Feb 27, 2020 at 9:54 AM Ilya Maximets <i.maxim...@ovn.org>
> >>>>> wrote:
> >>>>>>
> >>>>>> On 2/27/20 6:51 PM, William Tu wrote:
> >>>>>>> On Thu, Feb 27, 2020 at 9:42 AM Ilya Maximets <i.maxim...@ovn.org>
> >>>>>>> wrote:
> >>>>>>>>
> >>>>>>>> On 2/27/20 6:30 PM, William Tu wrote:
> >>>>>>>>> The patch adds a new netdev class 'afxdp-nonpmd' to enable afxdp
> >>>>>>>>> interrupt mode. This is similar to 'type=afxdp', except that the
> >>>>>>>>> is_pmd field is set to false. As a result, the packet processing
> >>>>>>>>> is handled by main thread, not pmd thread. This avoids burning
> >>>>>>>>> the CPU to always 100% when there is no traffic.
> >>>>>>>>>
> >>>>>>>>> Tested-at:
> >>>>>>>>> https://travis-ci.org/williamtu/ovs-travis/builds/655885506
> >>>>>>>>> Signed-off-by: William Tu <u9012...@gmail.com>
> >>>>>>>>> ---
> >>>>>>>>> NEWS | 3 +++
> >>>>>>>>> lib/netdev-afxdp.c | 20 +++++++++++++++++++-
> >>>>>>>>> lib/netdev-linux.c | 20 ++++++++++++++++++++
> >>>>>>>>> lib/netdev-provider.h | 1 +
> >>>>>>>>> lib/netdev.c | 1 +
> >>>>>>>>> tests/system-afxdp.at | 23 +++++++++++++++++++++++
> >>>>>>>>> 6 files changed, 67 insertions(+), 1 deletion(-)
> >>>>>>>>>
> >>>>>>>>> diff --git a/NEWS b/NEWS
> >>>>>>>>> index f62ef1f47ea8..594c55dc11d6 100644
> >>>>>>>>> --- a/NEWS
> >>>>>>>>> +++ b/NEWS
> >>>>>>>>> @@ -4,6 +4,9 @@ Post-v2.13.0
> >>>>>>>>> - OpenFlow:
> >>>>>>>>> * The OpenFlow ofp_desc/serial_num may now be configured by
> >>>>>>>>> setting the
> >>>>>>>>> value of other-config:dp-sn in the Bridge table.
> >>>>>>>>> + - AF_XDP:
> >>>>>>>>> + * New netdev class 'afxdp-nonpmd' for netdev-afxdp to save
> >>>>>>>>> CPU cycles
> >>>>>>>>> + by enabling interrupt mode.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> v2.13.0 - 14 Feb 2020
> >>>>>>>>> diff --git a/lib/netdev-afxdp.c b/lib/netdev-afxdp.c
> >>>>>>>>> index 482400d8d135..cd2c7c381139 100644
> >>>>>>>>> --- a/lib/netdev-afxdp.c
> >>>>>>>>> +++ b/lib/netdev-afxdp.c
> >>>>>>>>> @@ -169,6 +169,12 @@ struct netdev_afxdp_tx_lock {
> >>>>>>>>> );
> >>>>>>>>> };
> >>>>>>>>>
> >>>>>>>>> +static int nonpmd_cnt; /* Number of afxdp netdevs in non-pmd
> >>>>>>>>> mode. */
> >>>>>>>>> +static bool
> >>>>>>>>> +netdev_is_afxdp_nonpmd(struct netdev *netdev) {
> >>>>>>>>> + return netdev_get_class(netdev) == &netdev_afxdp_nonpmd_class;
> >>>>>>>>> +}
> >>>>>>>>> +
> >>>>>>>>> #ifdef HAVE_XDP_NEED_WAKEUP
> >>>>>>>>> static inline void
> >>>>>>>>> xsk_rx_wakeup_if_needed(struct xsk_umem_info *umem,
> >>>>>>>>> @@ -1115,7 +1121,10 @@ netdev_afxdp_batch_send(struct netdev
> >>>>>>>>> *netdev, int qid,
> >>>>>>>>> struct netdev_linux *dev;
> >>>>>>>>> int ret;
> >>>>>>>>>
> >>>>>>>>> - if (concurrent_txq) {
> >>>>>>>>> + /* Lock is required when mixing afxdp pmd and nonpmd mode.
> >>>>>>>>> + * ex: one device is created 'afxdp', the other is
> >>>>>>>>> 'afxdp-nonpmd'.
> >>>>>>>>> + */
> >>>>>>>>> + if (concurrent_txq || (nonpmd_cnt != 0)) {
> >>>>>>>>
> >>>>>>>> I'm confused about this change. Are you expecting same device being
> >>>>>>>> opened twice with different netdev type? Database should not allow
> >>>>>>>> this.
> >>>>>>>
> >>>>>>> Not the same device.
> >>>>>>>
> >>>>>>> For example, dev1 opened as 'afxdp', and dev2 opened as
> >>>>>>> 'afxdp-nonpmd'.
> >>>>>>> When dev2 receives a packet and send to dev1, the main thread used by
> >>>>>>> dev2
> >>>>>>> is calling the send(), while it is possible that dev2's pmd thread is
> >>>>>>> also calling
> >>>>>>> send(). So need a lock here.
> >>>>>>
> >>>>>> But they will send to different tx queues.
> >>>>>
> >>>>> OK, I think you are talking about the XPS feature (dynamic txqs).
> >>>>> I thought XPS can only work when all between pmd threads.
> >>>>> So adding a lock here.
> >>>>> The patch currently doesn't work without the lock. Let me investigate
> >>>>> more.
> >>>>>
> >>>> More details.
> >>>> On my test using veth (only have 1 rxq 1 txq)
> >>>> Somehow the main thread is assigned to use queue id = 2, which causes
> >>>> segfault.
> >>>
> >>> That is very strange...
> >>> Could you, please, share your test setup? I'll try to reproduce.
> >>>
> >>> Best regards, Ilya Maximets.
> >>
> >> Sorry, somehow I miss your reply in my mailbox.
> >> I simply tested it using two namespaces, one device as afxdp, the
> >> other as afxdp-nonpmd.
> >> ---
> >> # start ovs-vswitchd and ovsdb-server
> >>
> >> ovs-vsctl -- add-br br0 -- set Bridge br0 datapath_type=netdev
> >> ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0xf
> >>
> >> ip netns add at_ns0
> >> ip link add p0 type veth peer name afxdp-p0
> >> ip link set p0 netns at_ns0
> >> ip link set dev afxdp-p0 up
> >> ovs-vsctl add-port br0 afxdp-p0
> >> ovs-vsctl -- set interface afxdp-p0 options:n_rxq=1
> >> type="afxdp-nonpmd" options:xdp-mode=generic
> >>
> >> ip netns exec at_ns0 sh << NS_EXEC_HEREDOC
> >> ip addr add "10.1.1.1/24" dev p0
> >> ip link set dev p0 up
> >> NS_EXEC_HEREDOC
> >>
> >> ip netns add at_ns1
> >> ip link add p1 type veth peer name afxdp-p1
> >> ip link set p1 netns at_ns1
> >> ip link set dev afxdp-p1 up
> >> ovs-vsctl add-port br0 afxdp-p1 -- \
> >> set interface afxdp-p1 type="afxdp"
> >> options:xdp-mode=generic options:n_rxq=1
> >>
> >> ip netns exec at_ns1 sh << NS_EXEC_HEREDOC
> >> ip addr add "10.1.1.2/24" dev p1
> >> ip link set dev p1 up
> >> NS_EXEC_HEREDOC
> >>
> >> ip netns exec at_ns0 ping -c 10 -i .2 10.1.1.2
> >> ---
> >>
> >> It's pretty easy to reproduce the issue, ping will crash the ovs-vswitchd.
> >
> > Thanks. I'll try to reproduce on next week.
>
> It took a bit longer, but I reproduced the issue and sent a fix here:
> https://patchwork.ozlabs.org/patch/1261072/
>
> Best regards, Ilya Maximets.
Thanks a lot!
William
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev