Re: [ovs-dev] [PATCH v3 5/6] netdev-dpdk: Use guest TSO segmentation size hint.
On Wed, May 15, 2024 at 2:09 PM Kevin Traynor wrote:
>
> On 19/04/2024 15:06, David Marchand wrote:
> > In a typical setup like:
> > guest A <-virtio-> OVS A <-vxlan-> OVS B <-virtio-> guest B
> >
> > TSO packets from guest A are segmented against the OVS A physical port
> > mtu adjusted by the vxlan tunnel header size, regardless of guest A
> > interface mtu.
> >
> > As an example, let's say guest A and guest B mtu are set to 1500 bytes.
> > OVS A and OVS B physical ports mtu are set to 1600 bytes.
> > Guest A will request TCP segmentation for 1448 bytes segments.
> > On the other hand, OVS A will request 1498 bytes segments to the HW.
> > This results in OVS B dropping packets because decapsulated packets
> > are larger than the vhost-user port (serving guest B) mtu.
> >
> > 2024-04-17T14:13:01.239Z|2|netdev_dpdk(pmd-c03/id:7)|WARN|vhost0:
> > Too big size 1564 max_packet_len 1518
> >
> > vhost-user ports expose a guest mtu by filling mbuf->tso_segsz.
> > Use it as a hint.
> >
> > This may result in segments (on the wire) slightly shorter than the
> > optimal size.
> >
> > Reported-at: https://github.com/openvswitch/ovs-issues/issues/321
> > Signed-off-by: David Marchand
> > ---
> > Note:
> > As we trust the guest with this change, should we put a lower limit on
> > mbuf->tso_segsz?
> >
>
> There are some checks I looked at (e.g. [0]), but it could be checked
> here for an earlier drop I suppose... additional comment below.
>
> [0]
> https://git.dpdk.org/dpdk/tree/drivers/net/ice/ice_rxtx.c#n3754

I guess you meant:
https://git.dpdk.org/dpdk/tree/drivers/net/ice/ice_rxtx.c#n3818

And same in v23.11, there are checks at the prepare stage:
https://git.dpdk.org/dpdk-stable/tree/drivers/net/ice/ice_rxtx.c?h=23.11#n3681

I had forgotten about those checks.
There is no limit exposed per driver from DPDK, so the simplest approach
for OVS is to trust them.

> > ---
> >  lib/netdev-dpdk.c | 11 ++++++++---
> >  1 file changed, 8 insertions(+), 3 deletions(-)
> >
> > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> > index 661269e4b6..1dad2ef833 100644
> > --- a/lib/netdev-dpdk.c
> > +++ b/lib/netdev-dpdk.c
> > @@ -2671,14 +2671,19 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
> >
> >      if (mbuf->ol_flags & RTE_MBUF_F_TX_TCP_SEG) {
> >          struct tcp_header *th = dp_packet_l4(pkt);
> > +        uint16_t link_tso_segsz;
> >          int hdr_len;
> >
> >          if (tunnel_type) {
> > -            mbuf->tso_segsz = dev->mtu - mbuf->l2_len - mbuf->l3_len -
> > -                              mbuf->l4_len - mbuf->outer_l3_len;
> > +            link_tso_segsz = dev->mtu - mbuf->l2_len - mbuf->l3_len -
> > +                             mbuf->l4_len - mbuf->outer_l3_len;
> >          } else {
> >              mbuf->l4_len = TCP_OFFSET(th->tcp_ctl) * 4;
> > -            mbuf->tso_segsz = dev->mtu - mbuf->l3_len - mbuf->l4_len;
> > +            link_tso_segsz = dev->mtu - mbuf->l3_len - mbuf->l4_len;
> > +        }
> > +
> > +        if (!mbuf->tso_segsz || mbuf->tso_segsz > link_tso_segsz) {
>
> It seems like something is not right if the flag is set but tso_segsz is
> 0. It is set by the vhost lib from gso_size, but I don't see a validation
> there either.

At the time I added a check on the 0 value, I thought there was a case
where RTE_MBUF_F_TX_TCP_SEG could be set with no segsz value.
But as you mention, all setters of this flag (either in vhost or in OVS)
set a segsz too.

So with segsz always set, combined with the drivers' checks, OVS probably
does not need any check on tso_segsz.
I intend to remove this check in the next revision.


-- 
David Marchand
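The prepare-stage checks referenced in the reply above run in a PMD's tx_prepare path and reject TSO mbufs whose segment size falls outside what the hardware supports. The following is only a rough, self-contained sketch of that kind of check; the function name, constants, and bounds are illustrative assumptions, not the actual ice driver code.

#include <stdbool.h>
#include <stdint.h>

/* Illustrative bounds only; real PMDs define their own limits. */
#define DRV_MIN_TSO_MSS 64
#define DRV_MAX_TSO_MSS 9668

/* Sketch of a prepare-stage check: a tx_prepare callback can refuse a
 * TSO mbuf whose segment size is outside the supported range, so the
 * sender gets an error instead of bad segmentation on the wire. */
static bool
tso_segsz_is_valid(uint16_t tso_segsz, bool is_tso)
{
    if (!is_tso) {
        return true;
    }
    return tso_segsz >= DRV_MIN_TSO_MSS && tso_segsz <= DRV_MAX_TSO_MSS;
}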
Re: [ovs-dev] [PATCH v3 5/6] netdev-dpdk: Use guest TSO segmentation size hint.
On 19/04/2024 15:06, David Marchand wrote:
> In a typical setup like:
> guest A <-virtio-> OVS A <-vxlan-> OVS B <-virtio-> guest B
>
> TSO packets from guest A are segmented against the OVS A physical port
> mtu adjusted by the vxlan tunnel header size, regardless of guest A
> interface mtu.
>
> As an example, let's say guest A and guest B mtu are set to 1500 bytes.
> OVS A and OVS B physical ports mtu are set to 1600 bytes.
> Guest A will request TCP segmentation for 1448 bytes segments.
> On the other hand, OVS A will request 1498 bytes segments to the HW.
> This results in OVS B dropping packets because decapsulated packets
> are larger than the vhost-user port (serving guest B) mtu.
>
> 2024-04-17T14:13:01.239Z|2|netdev_dpdk(pmd-c03/id:7)|WARN|vhost0:
> Too big size 1564 max_packet_len 1518
>
> vhost-user ports expose a guest mtu by filling mbuf->tso_segsz.
> Use it as a hint.
>
> This may result in segments (on the wire) slightly shorter than the
> optimal size.
>
> Reported-at: https://github.com/openvswitch/ovs-issues/issues/321
> Signed-off-by: David Marchand
> ---
> Note:
> As we trust the guest with this change, should we put a lower limit on
> mbuf->tso_segsz?
>

There are some checks I looked at (e.g. [0]), but it could be checked
here for an earlier drop I suppose... additional comment below.

[0] https://git.dpdk.org/dpdk/tree/drivers/net/ice/ice_rxtx.c#n3754

> ---
>  lib/netdev-dpdk.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index 661269e4b6..1dad2ef833 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -2671,14 +2671,19 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
>
>      if (mbuf->ol_flags & RTE_MBUF_F_TX_TCP_SEG) {
>          struct tcp_header *th = dp_packet_l4(pkt);
> +        uint16_t link_tso_segsz;
>          int hdr_len;
>
>          if (tunnel_type) {
> -            mbuf->tso_segsz = dev->mtu - mbuf->l2_len - mbuf->l3_len -
> -                              mbuf->l4_len - mbuf->outer_l3_len;
> +            link_tso_segsz = dev->mtu - mbuf->l2_len - mbuf->l3_len -
> +                             mbuf->l4_len - mbuf->outer_l3_len;
>          } else {
>              mbuf->l4_len = TCP_OFFSET(th->tcp_ctl) * 4;
> -            mbuf->tso_segsz = dev->mtu - mbuf->l3_len - mbuf->l4_len;
> +            link_tso_segsz = dev->mtu - mbuf->l3_len - mbuf->l4_len;
> +        }
> +
> +        if (!mbuf->tso_segsz || mbuf->tso_segsz > link_tso_segsz) {

It seems like something is not right if the flag is set but tso_segsz is 0.
It is set by the vhost lib from gso_size, but I don't see a validation there
either. So it could be checked and/or adjusted here for obviously incorrect
values. The only question is whether this is the right layer to check these,
or whether it is more appropriate for it to be done in the drivers.

> +            mbuf->tso_segsz = link_tso_segsz;
>          }
>
>          hdr_len = mbuf->l2_len + mbuf->l3_len + mbuf->l4_len;
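The "check and/or adjust here" option raised in the review above could look roughly like the sketch below, which clamps the guest hint to a floor as well as to the link-derived ceiling. MIN_TSO_SEGSZ is a hypothetical constant (OVS defines no such limit), and David's reply at the top of this thread concludes the driver-side checks make such a floor unnecessary.

#include <stdint.h>

/* Hypothetical lower bound; OVS does not define such a constant. */
#define MIN_TSO_SEGSZ 64

/* Clamp a guest-supplied segment size into [MIN_TSO_SEGSZ, link_tso_segsz]
 * instead of trusting it blindly. */
static uint16_t
clamp_guest_segsz(uint16_t guest_segsz, uint16_t link_tso_segsz)
{
    if (!guest_segsz || guest_segsz > link_tso_segsz) {
        return link_tso_segsz;
    }
    if (guest_segsz < MIN_TSO_SEGSZ) {
        return MIN_TSO_SEGSZ;
    }
    return guest_segsz;
}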
[ovs-dev] [PATCH v3 5/6] netdev-dpdk: Use guest TSO segmentation size hint.
In a typical setup like:
guest A <-virtio-> OVS A <-vxlan-> OVS B <-virtio-> guest B

TSO packets from guest A are segmented against the OVS A physical port
mtu adjusted by the vxlan tunnel header size, regardless of guest A
interface mtu.

As an example, let's say guest A and guest B mtu are set to 1500 bytes.
OVS A and OVS B physical ports mtu are set to 1600 bytes.
Guest A will request TCP segmentation for 1448 bytes segments.
On the other hand, OVS A will request 1498 bytes segments to the HW.
This results in OVS B dropping packets because decapsulated packets
are larger than the vhost-user port (serving guest B) mtu.

2024-04-17T14:13:01.239Z|2|netdev_dpdk(pmd-c03/id:7)|WARN|vhost0:
Too big size 1564 max_packet_len 1518

vhost-user ports expose a guest mtu by filling mbuf->tso_segsz.
Use it as a hint.

This may result in segments (on the wire) slightly shorter than the
optimal size.

Reported-at: https://github.com/openvswitch/ovs-issues/issues/321
Signed-off-by: David Marchand
---
Note:
As we trust the guest with this change, should we put a lower limit on
mbuf->tso_segsz?
---
 lib/netdev-dpdk.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 661269e4b6..1dad2ef833 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -2671,14 +2671,19 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)

     if (mbuf->ol_flags & RTE_MBUF_F_TX_TCP_SEG) {
         struct tcp_header *th = dp_packet_l4(pkt);
+        uint16_t link_tso_segsz;
         int hdr_len;

         if (tunnel_type) {
-            mbuf->tso_segsz = dev->mtu - mbuf->l2_len - mbuf->l3_len -
-                              mbuf->l4_len - mbuf->outer_l3_len;
+            link_tso_segsz = dev->mtu - mbuf->l2_len - mbuf->l3_len -
+                             mbuf->l4_len - mbuf->outer_l3_len;
         } else {
             mbuf->l4_len = TCP_OFFSET(th->tcp_ctl) * 4;
-            mbuf->tso_segsz = dev->mtu - mbuf->l3_len - mbuf->l4_len;
+            link_tso_segsz = dev->mtu - mbuf->l3_len - mbuf->l4_len;
+        }
+
+        if (!mbuf->tso_segsz || mbuf->tso_segsz > link_tso_segsz) {
+            mbuf->tso_segsz = link_tso_segsz;
         }

         hdr_len = mbuf->l2_len + mbuf->l3_len + mbuf->l4_len;
-- 
2.44.0
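To see how the numbers from the commit message fall out of the new logic, here is a minimal standalone C sketch of the clamping the patch adds. The struct, helper name, and hard-coded header lengths are illustrative assumptions (not the exact values OVS computes for a VXLAN flow); the point is that the guest hint wins whenever it is smaller than what the local link MTU would allow.

#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the relevant rte_mbuf fields. */
struct mbuf_hint {
    uint16_t tso_segsz;      /* Guest hint, filled by the vhost library. */
    uint16_t l2_len, l3_len, l4_len, outer_l3_len;
};

/* Mirror of the patch logic: derive the largest segment the local link
 * can carry, then only override the guest hint when the hint is missing
 * or larger than the link allows. */
static uint16_t
pick_tso_segsz(struct mbuf_hint *m, int dev_mtu, int tunnel)
{
    uint16_t link_tso_segsz;

    if (tunnel) {
        link_tso_segsz = dev_mtu - m->l2_len - m->l3_len
                         - m->l4_len - m->outer_l3_len;
    } else {
        link_tso_segsz = dev_mtu - m->l3_len - m->l4_len;
    }
    if (!m->tso_segsz || m->tso_segsz > link_tso_segsz) {
        m->tso_segsz = link_tso_segsz;
    }
    return m->tso_segsz;
}

int
main(void)
{
    /* Illustrative header sizes for a VXLAN-encapsulated TCP flow;
     * the exact split between the fields is an assumption. */
    struct mbuf_hint m = {
        .tso_segsz = 1448,   /* Guest with a 1500 byte MTU. */
        .l2_len = 30, .l3_len = 20, .l4_len = 32, .outer_l3_len = 20,
    };

    /* With a 1600 byte physical port MTU the link alone would allow
     * 1498 byte segments, but the guest hint caps this at 1448. */
    printf("segsz = %u\n", pick_tso_segsz(&m, 1600, 1));
    return 0;
}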