On Wed, Jan 29, 2020 at 02:42:27PM -0800, Yifeng Sun wrote:
> Hi Flavio,
Hi Yifeng, thanks for looking into this.

> I found this piece of code in the kernel's drivers/net/virtio_net.c, in
> its function receive_buf():
>
>     if (hdr->hdr.flags & VIRTIO_NET_HDR_F_DATA_VALID)
>         skb->ip_summed = CHECKSUM_UNNECESSARY;
>
> My understanding is that vhost_user needs to set the flag
> VIRTIO_NET_HDR_F_DATA_VALID so that the guest's kernel will skip the
> packet's checksum validation.
>
> Then I looked through DPDK's source code but didn't find any place that
> sets this flag. So I made the changes below, and TCP starts working
> between 2 VMs without any kernel change.
>
> diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
> index 73bf98bd9..5e45db655 100644
> --- a/lib/librte_vhost/virtio_net.c
> +++ b/lib/librte_vhost/virtio_net.c
> @@ -437,6 +437,7 @@ virtio_enqueue_offload(struct rte_mbuf *m_buf,
>  struct virtio_net_hdr *net_hdr)
>  		ASSIGN_UNLESS_EQUAL(net_hdr->csum_start, 0);
>  		ASSIGN_UNLESS_EQUAL(net_hdr->csum_offset, 0);
> -		ASSIGN_UNLESS_EQUAL(net_hdr->flags, 0);
> +		net_hdr->flags = VIRTIO_NET_HDR_F_DATA_VALID;
>  	}
>
>  	/* IP cksum verification cannot be bypassed, then calculate here */

No, it actually uses ->flags to pass VIRTIO_NET_HDR_F_NEEDS_CSUM and
then we pass the start and offset.

HTH,
fbl

> Any comments will be appreciated!
>
> Thanks a lot,
> Yifeng
>
> On Wed, Jan 29, 2020 at 1:21 PM Flavio Leitner <f...@sysclose.org> wrote:
> >
> > On Wed, Jan 29, 2020 at 11:19:47AM -0800, William Tu wrote:
> > > On Wed, Jan 29, 2020 at 3:25 AM Flavio Leitner <f...@sysclose.org> wrote:
> > > >
> > > > On Tue, Jan 28, 2020 at 03:23:02PM -0800, Yifeng Sun wrote:
> > > > > Sure.
> > > > >
> > > > > Firstly, make sure userspace-tso-enable is true:
> > > > > # ovs-vsctl get Open_vSwitch .
> > > > > other_config
> > > > > {dpdk-init="true", enable-statistics="true", pmd-cpu-mask="0xf",
> > > > > userspace-tso-enable="true"}
> > > > >
> > > > > Next, create 2 VMs with a vhostuser-type interface on the same KVM host:
> > > > >     <interface type='vhostuser'>
> > > > >       <mac address='88:69:00:00:00:11'/>
> > > > >       <source type='unix'
> > > > >        path='/tmp/041afca0-6e11-4eab-a62f-1ccf5cd318fd' mode='server'/>
> > > > >       <model type='virtio'/>
> > > > >       <driver queues='2' rx_queue_size='512'>
> > > > >         <host csum='on' tso4='on' tso6='on'/>
> > > > >         <guest csum='on' tso4='on' tso6='on'/>
> > > >
> > > > I have other options set, but I don't think they are related:
> > > >         <host csum='on' gso='off' tso4='on' tso6='on' ecn='off'
> > > >               ufo='off' mrg_rxbuf='on'/>
> > > >         <guest csum='on' tso4='on' tso6='on' ecn='off' ufo='off'/>
> > >
> > > Is mrg_rxbuf required to be on?
> >
> > No.
> >
> > > I saw that when enabling userspace TSO, we are setting the external
> > > buffer flag RTE_VHOST_USER_EXTBUF_SUPPORT.
> >
> > Yes.
> >
> > > Is this the same thing?
> >
> > No.
> >
> > mrg_rxbuf says that we want the virtio ring to support chained ring
> > entries. If that is disabled, the virtio ring will be populated with
> > entries of maximum buffer length. If that is enabled, a packet will
> > use one entry or chain multiple entries in the virtio ring, so each
> > entry can be of a smaller length. That is not visible to OvS.
> >
> > RTE_VHOST_USER_EXTBUF_SUPPORT tells how a packet is provided to OvS
> > after it has been pulled out of the virtio rings. We have three
> > options currently:
> >
> > 1) LINEARBUF
> > It supports data lengths up to the packet provided (~MTU size).
> >
> > 2) EXTBUF
> > If the packet is too big for #1, allocate a buffer large enough
> > to fit the data. We get a big packet, but instead of the data being
> > along with the packet's metadata, it's in an external buffer.
> >
> >     <packet [packet metadata] [ unused buffer ]>
> >                    +---> [ big buffer ]
> >
> > Well, actually we make partial use of the unused buffer to store
> > struct rte_mbuf_ext_shared_info.
> >
> > 3) If neither LINEARBUF nor EXTBUF is provided (the default), the
> > vhost lib can provide large packets as a chain of mbufs, which
> > OvS doesn't support today.
> >
> > HTH,
> > --
> > fbl

-- 
fbl

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev