Re: [ovs-discuss] Enabling TSO functionality affects virtual machine TCP traffic

2024-03-26 Thread Ilya Maximets via discuss
On 3/25/24 08:06, Felix Huettner wrote:
>>>
>>> Hi everyone,
>>>
>>> i have built a first version of this here and it does indeed solve our
>>> issues. Not sure though if there is more. I would like to send it to
>>> ovs-dev sometime next week once i added some tests.
>>> In the mean time you can take a look here:
>>>
>>> https://github.com/openvswitch/ovs/compare/master...felixhuettner:ovs:fix_it
>>
>> Thanks, Felix!  This is a possible route to take, however, there is
>> still an issue with miniflow_extract() requesting UDP checksum offload
>> not realizing that it requests it for an inner header.
>>
>> 
>> IMHO, the fact that flag can mean different things at different times
>> is the root of many of the problems with offloading and we'll keep
>> getting issues like this until API is fixed.
>> 
>>
>> I sent my version of the fix here:
>>   
>> https://patchwork.ozlabs.org/project/openvswitch/patch/20240322144213.1073946-1-i.maximets%40ovn.org/
>>
>> This should resolve the issue, but at a cost of dropping tunneled
>> traffic when userspace-tso is enabled.
>>
>> Would be great if you can test it out.
> 
> Hi Ilya,
> 
> thanks for the fix.
> It works great for me.

Thanks for testing!

Best regards, Ilya Maximets.

> 
> Thanks
> Felix
> 
>>
>> The other big issue is that these crashes are happening in the default
>> configuration, i.e. do not actually require users to opt-in for 
>> userspace-tso.
>>
>> Best regards, Ilya Maximets.
>>

___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] Enabling TSO functionality affects virtual machine TCP traffic

2024-03-25 Thread Felix Huettner via discuss
> > 
> > Hi everyone,
> > 
> > i have built a first version of this here and it does indeed solve our
> > issues. Not sure though if there is more. I would like to send it to
> > ovs-dev sometime next week once i added some tests.
> > In the mean time you can take a look here:
> > 
> > https://github.com/openvswitch/ovs/compare/master...felixhuettner:ovs:fix_it
> 
> Thanks, Felix!  This is a possible route to take, however, there is
> still an issue with miniflow_extract() requesting UDP checksum offload
> not realizing that it requests it for an inner header.
> 
> 
> IMHO, the fact that flag can mean different things at different times
> is the root of many of the problems with offloading and we'll keep
> getting issues like this until API is fixed.
> 
> 
> I sent my version of the fix here:
>   
> https://patchwork.ozlabs.org/project/openvswitch/patch/20240322144213.1073946-1-i.maximets%40ovn.org/
> 
> This should resolve the issue, but at a cost of dropping tunneled
> traffic when userspace-tso is enabled.
> 
> Would be great if you can test it out.

Hi Ilya,

thanks for the fix.
It works great for me.

Thanks
Felix

> 
> The other big issue is that these crashes are happening in the default
> configuration, i.e. do not actually require users to opt-in for userspace-tso.
> 
> Best regards, Ilya Maximets.
> 
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] Enabling TSO functionality affects virtual machine TCP traffic

2024-03-22 Thread Ilya Maximets via discuss
On 3/22/24 15:30, Felix Huettner wrote:
>>>
>>> Thanks for the report and analysis!
>>>
>>> I agree, this is indeed a bug in recirculation + tunnel offload logic.
>>>
>>> To recap:
>>>
>>> 1. TCP packet enters OVS.
>>> 2. TCP packet gets encapsulated into UDP tunnel.
>>> 3. Recirculation happens.
>>> 4. Packet is re-parsed after recirculation with miniflow_extract()
>>>or similar function.
>>> 5. Packet is marked for UDP checksumming because we parse the outermost set
>>>of headers.  But since it is tunneled, it means inner UDP checksumming.
>>>And that makes no sense, because the inner packet is TCP.
>>
>> Hi everyone,
>>
>> we are actually seeing a similar issue, allthough in our case we are
>> crashing ovs with the following assertion when sending pings:
>>
>> util(pmd-c07/id:9)|EMER|lib/dp-packet.h:1425: assertion ip failed in 
>> dp_packet_ip_set_header_csum()
>>
>> or a segmentation fault when sending tcp traffic (because
>> packet_tcp_complete_csum has not asserts)
>>
>>>
>>> I'm assuming that this re-parsing is also messing up all the offsets in the
>>> dp-packet.  We likely need fix all the checksums before recirculation and
>>> clear all the flags.  For TSO however that would mean that we will need to
>>> drop the packet as we don't have GSO yet and we can't actually segment
>>> traffic in the middle of pipeline anyway...
>>
>> So miniflow_extract resets inner_l3_ofs and inner_l4_ofs quite at the
>> start of the function and has no logic to add these values back (at
>> least as far as i saw that).
>>
>>>
>>> Mike, any thoughts on this?
>>>
>>>
>>> A workaround for this particular situation hwoever would be enabling support
>>> for lb-output action on the bond:
>>>
>>>   ovs-vsctl set Port bond0 other_config:lb-output-action=true
>>
>> This helps at least for our issue.
>>
>>>
>>> This should avoid the recirculation in this particular scenario.
>>> But we still need to find a solution for the recirculation issue.
>>
>> I currently have no real idea how we could recover inner_l3_ofs for
>> geneve/vxlan packets. Since for UDP traffic we do not know the upper
>> protocol it is basically impossible to find out for sure if a given
>> communication is geneve/vxlan. Therefor recovering inner_l3_ofs for a
>> given packet sounds to me either like guessing or not really possible.
>>
>> An alternative i have though about is not resetting the offsets in
>> miniflow_extract for packets that are recirculating. In this case we
>> should have started with the original packet at some point and only
>> modified it by actions we defined. If all of these actions modify the
>> offsets as appropriate is there any reason at all to recalculate them?
> 
> Hi everyone,
> 
> i have built a first version of this here and it does indeed solve our
> issues. Not sure though if there is more. I would like to send it to
> ovs-dev sometime next week once i added some tests.
> In the mean time you can take a look here:
> 
> https://github.com/openvswitch/ovs/compare/master...felixhuettner:ovs:fix_it

Thanks, Felix!  This is a possible route to take, however, there is
still an issue with miniflow_extract() requesting UDP checksum offload
not realizing that it requests it for an inner header.


IMHO, the fact that flag can mean different things at different times
is the root of many of the problems with offloading and we'll keep
getting issues like this until API is fixed.


I sent my version of the fix here:
  
https://patchwork.ozlabs.org/project/openvswitch/patch/20240322144213.1073946-1-i.maxim...@ovn.org/

This should resolve the issue, but at a cost of dropping tunneled
traffic when userspace-tso is enabled.

Would be great if you can test it out.

The other big issue is that these crashes are happening in the default
configuration, i.e. do not actually require users to opt-in for userspace-tso.

Best regards, Ilya Maximets.

> 
> Thanks
> Felix
> 
>>
>> The really ugly alternative that is probably also rather guesswork would
>> be:
>> <<< start of miniflow_extract
>> tun_l3_ofs = inner_l3_ofs - l4_ofs;
>> tun_l4_ofs = inner_l4_ofs - inner_l3_ofs;
>> <<< main miniflow_extract code
>> packet->inner_l3_ofs = packet->l4_ofs + tun_l3_ofs;
>> packet->inner_l4_ofs = packet->inner_l3_ofs + tun_l4_ofs;
>> <<< end of miniflow_extract
>>
>> But i do not think that is a good idea.
>>
>> Thanks
>> Felix
>>
>> ___
>> discuss mailing list
>> disc...@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss

___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] Enabling TSO functionality affects virtual machine TCP traffic

2024-03-22 Thread Felix Huettner via discuss
> > 
> > Thanks for the report and analysis!
> > 
> > I agree, this is indeed a bug in recirculation + tunnel offload logic.
> > 
> > To recap:
> > 
> > 1. TCP packet enters OVS.
> > 2. TCP packet gets encapsulated into UDP tunnel.
> > 3. Recirculation happens.
> > 4. Packet is re-parsed after recirculation with miniflow_extract()
> >or similar function.
> > 5. Packet is marked for UDP checksumming because we parse the outermost set
> >of headers.  But since it is tunneled, it means inner UDP checksumming.
> >And that makes no sense, because the inner packet is TCP.
> 
> Hi everyone,
> 
> we are actually seeing a similar issue, allthough in our case we are
> crashing ovs with the following assertion when sending pings:
> 
> util(pmd-c07/id:9)|EMER|lib/dp-packet.h:1425: assertion ip failed in 
> dp_packet_ip_set_header_csum()
> 
> or a segmentation fault when sending tcp traffic (because
> packet_tcp_complete_csum has not asserts)
> 
> > 
> > I'm assuming that this re-parsing is also messing up all the offsets in the
> > dp-packet.  We likely need fix all the checksums before recirculation and
> > clear all the flags.  For TSO however that would mean that we will need to
> > drop the packet as we don't have GSO yet and we can't actually segment
> > traffic in the middle of pipeline anyway...
> 
> So miniflow_extract resets inner_l3_ofs and inner_l4_ofs quite at the
> start of the function and has no logic to add these values back (at
> least as far as i saw that).
> 
> > 
> > Mike, any thoughts on this?
> > 
> > 
> > A workaround for this particular situation hwoever would be enabling support
> > for lb-output action on the bond:
> > 
> >   ovs-vsctl set Port bond0 other_config:lb-output-action=true
> 
> This helps at least for our issue.
> 
> > 
> > This should avoid the recirculation in this particular scenario.
> > But we still need to find a solution for the recirculation issue.
> 
> I currently have no real idea how we could recover inner_l3_ofs for
> geneve/vxlan packets. Since for UDP traffic we do not know the upper
> protocol it is basically impossible to find out for sure if a given
> communication is geneve/vxlan. Therefor recovering inner_l3_ofs for a
> given packet sounds to me either like guessing or not really possible.
> 
> An alternative i have though about is not resetting the offsets in
> miniflow_extract for packets that are recirculating. In this case we
> should have started with the original packet at some point and only
> modified it by actions we defined. If all of these actions modify the
> offsets as appropriate is there any reason at all to recalculate them?

Hi everyone,

i have built a first version of this here and it does indeed solve our
issues. Not sure though if there is more. I would like to send it to
ovs-dev sometime next week once i added some tests.
In the mean time you can take a look here:

https://github.com/openvswitch/ovs/compare/master...felixhuettner:ovs:fix_it

Thanks
Felix

> 
> The really ugly alternative that is probably also rather guesswork would
> be:
> <<< start of miniflow_extract
> tun_l3_ofs = inner_l3_ofs - l4_ofs;
> tun_l4_ofs = inner_l4_ofs - inner_l3_ofs;
> <<< main miniflow_extract code
> packet->inner_l3_ofs = packet->l4_ofs + tun_l3_ofs;
> packet->inner_l4_ofs = packet->inner_l3_ofs + tun_l4_ofs;
> <<< end of miniflow_extract
> 
> But i do not think that is a good idea.
> 
> Thanks
> Felix
> 
> ___
> discuss mailing list
> disc...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] Enabling TSO functionality affects virtual machine TCP traffic

2024-03-22 Thread Felix Huettner via discuss
On Thu, Mar 21, 2024 at 07:45:21PM +0100, Ilya Maximets via discuss wrote:
> On 3/21/24 08:48, Qiang Qiang45 Zhang via discuss wrote:
> > Problem:
> > 
> > When the Geneve tunnel is the Bond port, TCP traffic between virtual 
> > machines
> > is not connected, and PING is allowed; When the Geneve tunnel is a single
> > physical port, both TCP and PING are normal.
> > 
> > Version:
> > 
> > ovs-vswitchd (Open vSwitch) 3.3.0
> > DPDK 23.11.0
> > 
> > Main configuration:
> > 
> > ovs-vsctl set op . other_config:userspace-tso-enable=true
> > ovs-vsctl add-br int-br -- set bridge int-br datapath_type=netdev
> > ovs-vsctl add-port int-br geneve0 -- set interface geneve0 type=geneve -- 
> > set interface geneve0 options:csum=false -- set interface geneve0 
> > options:key=flow -- set interface geneve0 options:remote_ip="192.168.54.183"
> > ovs-vsctl add-port int-br vm0 -- set interface vm0 type=dpdkvhostuserclient
> > ovs-ofctl show int-br
> > ovs-ofctl del-flows int-br
> > ovs-ofctl add-flow int-br 'table=0,in_port=2,actions=output:1'
> > ovs-ofctl add-flow int-br 'table=0,in_port=1,actions=output:2'
> > 
> >  
> > 
> > Topology:
> 
> > 
> > Key points for investigation: Comparison between ordinary physical and BOND 
> > tunnel entrances
> 
> > 
> > The tunnel entrance is a single physical entrance:
> > 
> > The tunnel entrance is a bond:
> > 
> > The key issue is that there is an additional redirect (0x1) for the bond, 
> > which
> > will cause the corresponding flag to be filled with outer Geneve 
> > information when
> > setting mbuf flags. The actual GDB stack for virtual machine debugging is 
> > as follows
> > (the inner layer is SSH TCP traffic, but it has set the csum_udp 
> > identifier. This
> > analysis code reads the outer layer header, which is the UDP header of 
> > Geneve)
> 
> Thanks for the report and analysis!
> 
> I agree, this is indeed a bug in recirculation + tunnel offload logic.
> 
> To recap:
> 
> 1. TCP packet enters OVS.
> 2. TCP packet gets encapsulated into UDP tunnel.
> 3. Recirculation happens.
> 4. Packet is re-parsed after recirculation with miniflow_extract()
>or similar function.
> 5. Packet is marked for UDP checksumming because we parse the outermost set
>of headers.  But since it is tunneled, it means inner UDP checksumming.
>And that makes no sense, because the inner packet is TCP.

Hi everyone,

we are actually seeing a similar issue, allthough in our case we are
crashing ovs with the following assertion when sending pings:

util(pmd-c07/id:9)|EMER|lib/dp-packet.h:1425: assertion ip failed in 
dp_packet_ip_set_header_csum()

or a segmentation fault when sending tcp traffic (because
packet_tcp_complete_csum has not asserts)

> 
> I'm assuming that this re-parsing is also messing up all the offsets in the
> dp-packet.  We likely need fix all the checksums before recirculation and
> clear all the flags.  For TSO however that would mean that we will need to
> drop the packet as we don't have GSO yet and we can't actually segment
> traffic in the middle of pipeline anyway...

So miniflow_extract resets inner_l3_ofs and inner_l4_ofs quite at the
start of the function and has no logic to add these values back (at
least as far as i saw that).

> 
> Mike, any thoughts on this?
> 
> 
> A workaround for this particular situation hwoever would be enabling support
> for lb-output action on the bond:
> 
>   ovs-vsctl set Port bond0 other_config:lb-output-action=true

This helps at least for our issue.

> 
> This should avoid the recirculation in this particular scenario.
> But we still need to find a solution for the recirculation issue.

I currently have no real idea how we could recover inner_l3_ofs for
geneve/vxlan packets. Since for UDP traffic we do not know the upper
protocol it is basically impossible to find out for sure if a given
communication is geneve/vxlan. Therefor recovering inner_l3_ofs for a
given packet sounds to me either like guessing or not really possible.

An alternative i have though about is not resetting the offsets in
miniflow_extract for packets that are recirculating. In this case we
should have started with the original packet at some point and only
modified it by actions we defined. If all of these actions modify the
offsets as appropriate is there any reason at all to recalculate them?

The really ugly alternative that is probably also rather guesswork would
be:
<<< start of miniflow_extract
tun_l3_ofs = inner_l3_ofs - l4_ofs;
tun_l4_ofs = inner_l4_ofs - inner_l3_ofs;
<<< main miniflow_extract code
packet->inner_l3_ofs = packet->l4_ofs + tun_l3_ofs;
packet->inner_l4_ofs = packet->inner_l3_ofs + tun_l4_ofs;
<<< end of miniflow_extract

But i do not think that is a good idea.

Thanks
Felix

___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] Enabling TSO functionality affects virtual machine TCP traffic

2024-03-21 Thread Ilya Maximets via discuss
On 3/21/24 08:48, Qiang Qiang45 Zhang via discuss wrote:
> Problem:
> 
> When the Geneve tunnel is the Bond port, TCP traffic between virtual machines
> is not connected, and PING is allowed; When the Geneve tunnel is a single
> physical port, both TCP and PING are normal.
> 
> Version:
> 
> ovs-vswitchd (Open vSwitch) 3.3.0
> DPDK 23.11.0
> 
> Main configuration:
> 
> ovs-vsctl set op . other_config:userspace-tso-enable=true
> ovs-vsctl add-br int-br -- set bridge int-br datapath_type=netdev
> ovs-vsctl add-port int-br geneve0 -- set interface geneve0 type=geneve -- set 
> interface geneve0 options:csum=false -- set interface geneve0 
> options:key=flow -- set interface geneve0 options:remote_ip="192.168.54.183"
> ovs-vsctl add-port int-br vm0 -- set interface vm0 type=dpdkvhostuserclient
> ovs-ofctl show int-br
> ovs-ofctl del-flows int-br
> ovs-ofctl add-flow int-br 'table=0,in_port=2,actions=output:1'
> ovs-ofctl add-flow int-br 'table=0,in_port=1,actions=output:2'
> 
>  
> 
> Topology:

> 
> Key points for investigation: Comparison between ordinary physical and BOND 
> tunnel entrances

> 
> The tunnel entrance is a single physical entrance:
> 
> The tunnel entrance is a bond:
> 
> The key issue is that there is an additional redirect (0x1) for the bond, 
> which
> will cause the corresponding flag to be filled with outer Geneve information 
> when
> setting mbuf flags. The actual GDB stack for virtual machine debugging is as 
> follows
> (the inner layer is SSH TCP traffic, but it has set the csum_udp identifier. 
> This
> analysis code reads the outer layer header, which is the UDP header of Geneve)

Thanks for the report and analysis!

I agree, this is indeed a bug in recirculation + tunnel offload logic.

To recap:

1. TCP packet enters OVS.
2. TCP packet gets encapsulated into UDP tunnel.
3. Recirculation happens.
4. Packet is re-parsed after recirculation with miniflow_extract()
   or similar function.
5. Packet is marked for UDP checksumming because we parse the outermost set
   of headers.  But since it is tunneled, it means inner UDP checksumming.
   And that makes no sense, because the inner packet is TCP.

I'm assuming that this re-parsing is also messing up all the offsets in the
dp-packet.  We likely need fix all the checksums before recirculation and
clear all the flags.  For TSO however that would mean that we will need to
drop the packet as we don't have GSO yet and we can't actually segment
traffic in the middle of pipeline anyway...

Mike, any thoughts on this?


A workaround for this particular situation hwoever would be enabling support
for lb-output action on the bond:

  ovs-vsctl set Port bond0 other_config:lb-output-action=true

This should avoid the recirculation in this particular scenario.
But we still need to find a solution for the recirculation issue.

Best regards, Ilya Maximets.
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss