On 6/10/26 9:51 PM, Kees Cook wrote:
> On Mon, Jun 08, 2026 at 11:41:37AM +0200, Ilya Maximets wrote:
>> On 6/8/26 10:25 AM, Johan Thomsen wrote:
>>> Hello,
>>>
>>> I am seeing what looks like a kernel bug in the Geneve/OVS/vhost
>>> transmit path on a Talos Linux node running Kube-ovn with Geneve
>>> overlay and KubeVirt VM traffic.
>>>
>>> Environment:
>>>
>>> Kernel: 6.18.33-talos
>>> Distro: Talos v1.13.3
>>>
>>> Compiler/config:
>>>
>>> CONFIG_CC_VERSION_TEXT="clang version 22.1.2"
>>> CONFIG_CC_IS_CLANG=y
>>> CONFIG_LTO=y
>>> CONFIG_LTO_CLANG=y
>>> CONFIG_LTO_CLANG_THIN=y
>>> CONFIG_FORTIFY_SOURCE=y
>>>
>>> Hardware: HPE ProLiant DL325 Gen11, AMD EPYC
>>>
>>> NIC driver: bnxt_en
>>>
>>> Workload/network:
>>>
>>> Kube-OVN, Geneve overlay
>>> Open vSwitch datapath
>>> KubeVirt/QEMU VM traffic via vhost/tap
>>>
>>> Relevant console output:
>>>
>>> [  648.742603] memcpy: detected buffer overflow: 104 byte write of
>>> buffer size 96
>>> [  648.749907] WARNING: CPU: 61 PID: 27020 at
>>> lib/string_helpers.c:1036 __fortify_report+0x45/0x60
>>> [  648.758689] Modules linked in: dm_round_robin dm_multipath lpfc
>>> nvmet_fc nvmet intel_rapl_msr intel_rapl_common ahci nvme_auth bnxt_en
>>> nvme hpilo hkdf libahci sp5100_tco watchdog k10temp
>>> [  648.775429] CPU: 61 UID: 107 PID: 27020 Comm: vhost-27002 Not
>>> tainted 6.18.29-talos #1 PREEMPT(none)
>>> [  648.784735] Hardware name: HPE ProLiant DL325 Gen11/ProLiant DL325
>>> Gen11, BIOS 2.84 11/05/2025
>>> [  648.890478]  skb_tunnel_info_unclone+0x179/0x190
>>> [  648.895152]  geneve_xmit+0x7fe/0xe00
>>> [  648.907240]  dev_hard_start_xmit+0xa7/0x1f0
>>> [  648.911479]  __dev_queue_xmit+0x864/0xf40
>>> [  648.919688]  do_execute_actions+0x9b9/0x1be0
>>> [  648.927727]  ovs_execute_actions+0x58/0x170
>>> [  648.931960]  ovs_dp_process_packet+0xb1/0x1c0
>>> [  648.936370]  ovs_vport_receive+0x90/0x100
>>> [  648.940428]  netdev_frame_hook+0x146/0x1a0
>>> [  648.954093]  __netif_receive_skb+0x3f/0x160
>>> [  648.958324]  process_backlog+0x10c/0x210
>>> [  648.962295]  __napi_poll+0x2f/0x190
>>> [  648.965832]  net_rx_action+0x2e3/0x500
>>> [  648.969632]  handle_softirqs+0xe7/0x310
>>> [  648.985387]  tun_get_user+0x137e/0x1510
>>> [  649.005878]  handle_tx+0x41f/0xd30
>>> [  649.029014]  vhost_run_work_list+0x52/0x90
>>> [  649.033162]  vhost_task_fn+0xc2/0x140
>>> [  649.064145] ---[ end trace 0000000000000000 ]---
>>> [  649.068820] ------------[ cut here ]------------
>>> [  649.073489] kernel BUG at lib/string_helpers.c:1043!
>>>
>>> I don't know whether this is a real overflow or a FORTIFY false-positive.
>>
>> Looks like a false-positive from the __counted_by fortification.
>>
>> I'd guess something like this would fit it:
>>
>> diff --git a/include/net/dst_metadata.h b/include/net/dst_metadata.h
>> index 1fc2fb03ce3f9..e51c3795da474 100644
>> --- a/include/net/dst_metadata.h
>> +++ b/include/net/dst_metadata.h
>> @@ -164,6 +164,7 @@ static inline struct metadata_dst 
>> *tun_dst_unclone(struct sk_buff *skb)
>>      if (!new_md)
>>              return ERR_PTR(-ENOMEM);
>>  
>> +    new_md->u.tun_info.options_len = md_size;
>>      memcpy(&new_md->u.tun_info, &md_dst->u.tun_info,
>>             sizeof(struct ip_tunnel_info) + md_size);
> 
> Speaking to this solution, it also makes sense, but does look redundant
> to the memcpy that follows it. I wonder something more in between would
> be better (the memcpy isn't needed to copy a struct, either):
> 
>       new_md->u.tun_info = md_dst->u.tun_info;

I like this.  memcpy is not needed for the base struct indeed.

>       memcpy(new_md->u.tun_info.options, md_dst->u.tun_info.options,
>               md_dst->u.tun_info.options_len);

But I'd keep the accessor function for the options pointer here, since
we have it, i.e. ip_tunnel_info_opts(&new_md->u.tun_info).

I'll run some tests with that and post a patch.

> 
> Is this the only place in the kernel where a struct ip_tunnel_info is
> being copied? The above really looks like an open-coded helper. :)

On a quick look through the code, it seems like this is the only place
where the data is copied directly from one tun_info into another.
Other places either construct the structure to/from other places or apply
modifications.  And so the ip_tunnel_info_opts_set() is suitable in those
cases, but not in here.

Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to