On 7/25/24 00:45, Mike Pattrick wrote:
> On Wed, Jul 24, 2024 at 3:56 PM Ilya Maximets <[email protected]> wrote:
>>
>> On 7/12/24 23:30, Mike Pattrick wrote:
>>> When prepending a tunnel header to a packet marked for segmentation, we
>>> need to adjust the segment size. Failure to do so can result in packets
>>> that are larger than the intended MTU post segmentation.
>>
>> Hi, Mike. Not a full review yet, but could you, please, explain in
>> more details why we need to adjust TSO segment size here?
>>
>> In order to ensure packet delivery, MTU inside VM / container must be
>> smaller than the physical network MTU by the size of the tunnel header.
>> So, tso_segsz of the packet received from such a VM or container should
>> be already small enough. Or am I missing something?
>
> My rationale here is the VM can't know if OVS will be encapsulating a
> packet in a udp tunnel, or how big that encapsulation would be. The
> UDP encapsulation is pretty large and potentially variable, it may be
> overly burdensome if the VM MTU must be adjusted to anticipate any UDP
> tunnel.
It is a burden, sure. But that's how it works for kernel datapath as well
and for networking in general. VM has to know the MTU of the underlying
network the same way as a physical host has to know the MTU of the physical
network it is connected to, there is no way around this. All the common
cluster management systems like OpenStack or ovn-kubernetes CNI will set
MTUs for VMs and container interfaces appropriately when tunneling is
involved. OVS on its own can't make this decision, especially because we
don't actually know the MTU of the destination or any hops in-between anyway.
VM has to know the MTU or has to run some form of path MTU discovery to not
send oversized packets. Userspace tunnels are not good for PMTUD since
they will not forward or generate ICMP errors, but that's a separate issue
and there are ways to not rely on ICMP for PMTUD.
Best regards, Ilya Maximets.
>
>
> Thanks,
> Mike
>
>>
>> Best regards, Ilya Maximets.
>>
>>>
>>> Fixes: 084c8087292c ("userspace: Support VXLAN and GENEVE TSO.")
>>> Signed-off-by: Mike Pattrick <[email protected]>
>>> ---
>>> lib/netdev-native-tnl.c | 11 +++++++++++
>>> 1 file changed, 11 insertions(+)
>>>
>>> diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c
>>> index 16c56608d..78c91b5fa 100644
>>> --- a/lib/netdev-native-tnl.c
>>> +++ b/lib/netdev-native-tnl.c
>>> @@ -161,6 +161,17 @@ netdev_tnl_push_ip_header(struct dp_packet *packet,
>>> const void *header,
>>> *ip_tot_size = dp_packet_size(packet) - sizeof (struct eth_header);
>>>
>>> memcpy(eth, header, size);
>>> +
>>> + /* The prepended header may cause TSO marked packets to exceed the
>>> intended
>>> + * MTU on segmentation. */
>>> + if (dp_packet_hwol_is_tso(packet)) {
>>> + uint16_t tso_segsz = dp_packet_get_tso_segsz(packet);
>>> + if (tso_segsz > size) {
>>> + tso_segsz -= size;
>>> + dp_packet_set_tso_segsz(packet, tso_segsz);
>>> + }
>>> + }
>>> +
>>> /* The encapsulated packet has type Ethernet. Adjust dp_packet. */
>>> packet->packet_type = htonl(PT_ETH);
>>> dp_packet_reset_offsets(packet);
>>
>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev