On 3/22/24 17:36, Mike Pattrick wrote:
> On Fri, Mar 22, 2024 at 10:41 AM Ilya Maximets <[email protected]> wrote:
>>
>> Recirculation involves re-parsing the packet from scratch and that
>> process is not aware of multiple header levels nor the inner/outer
>> offsets.  So, it overwrites offsets with new ones from the outermost
>> headers and sets offloading flags that change their meaning when
>> the packet is marked for tunnel offloading.
>>
>> For example:
>>
>>  1. TCP packet enters OVS.
>>  2. TCP packet gets encapsulated into UDP tunnel.
>>  3. Recirculation happens.
>>  4. Packet is re-parsed after recirculation with miniflow_extract()
>>     or similar function.
>>  5. Packet is marked for UDP checksumming because we parse the
>>     outermost set of headers.  But since it is tunneled, it means
>>     inner UDP checksumming.  And that makes no sense, because the
>>     inner packet is TCP.
>>
>> This is causing packet drops due to malformed packets or even
>> assertions and crashes in the code that is trying to fixup checksums
>> for packets using incorrect metadata:
>>
>>  SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior
>>
>>  lib/packets.c:2061:15: runtime error:
>>         member access within null pointer of type 'struct udp_header'
>>
>>   0 0xbe5221 in packet_udp_complete_csum lib/packets.c:2061:15
>>   1 0x7e5662 in dp_packet_ol_send_prepare lib/dp-packet.c:638:9
>>   2 0x96ef89 in netdev_send lib/netdev.c:940:9
>>   3 0x818e94 in dp_netdev_pmd_flush_output_on_port lib/dpif-netdev.c:5577:9
>>   4 0x817606 in dp_netdev_pmd_flush_output_packets lib/dpif-netdev.c:5618:27
>>   5 0x81cfa5 in dp_netdev_process_rxq_port lib/dpif-netdev.c:5677:9
>>   6 0x7eefe4 in dpif_netdev_run lib/dpif-netdev.c:7001:25
>>   7 0x610e87 in type_run ofproto/ofproto-dpif.c:367:9
>>   8 0x5b9e80 in ofproto_type_run ofproto/ofproto.c:1879:31
>>   9 0x55bbb4 in bridge_run__ vswitchd/bridge.c:3281:9
>>  10 0x558b6b in bridge_run vswitchd/bridge.c:3346:5
>>  11 0x591dc5 in main vswitchd/ovs-vswitchd.c:130:9
>>  12 0x172b89 in __libc_start_call_main (/lib64/libc.so.6+0x27b89)
>>  13 0x172c4a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x27c4a)
>>  14 0x47eff4 in _start (vswitchd/ovs-vswitchd+0x47eff4)
>>
>> Tests added for both IPv4 and IPv6 cases.  Though IPv6 test doesn't
>> trigger the issue it's better to have a symmetric test.
>>
>> Fixes: 084c8087292c ("userspace: Support VXLAN and GENEVE TSO.")
>> Reported-at: 
>> https://mail.openvswitch.org/pipermail/ovs-discuss/2024-March/053014.html
>> Signed-off-by: Ilya Maximets <[email protected]>
>> ---
> 
> I have tested this, and it does fix the segfault here.
> 
> Acked-by: Mike Pattrick <[email protected]>
> 

Thanks!  Applied and backported to 3.3.

We can think of alternative solutions on top of this fix, but I don't
see any easy ones that would cover all the cases for now.

Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to