On 10/30/25 12:34 AM, Frode Nordahl wrote:
> On 10/29/25 14:11, Ilya Maximets wrote:
>> On 10/27/25 11:16 PM, Frode Nordahl wrote:
>>> On 10/27/25 19:40, Ilya Maximets wrote:
>>>> On 10/24/25 4:28 PM, Frode Nordahl wrote:
>>>>> The commit in the fixes tag extended struct dpif_execute with an
>>>>> upcall_pid member.
>>>>>
>>>>> This member was left uninitialized in dpif_execute_helper_cb(),
>>>>> leading to Open vSwitch providing the kernel with a PID for which
>>>>> it does not have a listener.
>>>>>
>>>>> This condition is caught by one of the existing system-traffic
>>>>> tests which is extended to explicitly check for presence of lost
>>>>> upcalls in the datapath.
>>>>>
>>>>> Fixes: 0d9dc8e9ca4a ("dpif-netlink: Provide original upcall pid in 
>>>>> 'execute' commands.")
>>>>> Reported-at: https://launchpad.net/bugs/2127061
>>>>> Signed-off-by: Frode Nordahl <[email protected]>
>>>>> ---
>>>>>    lib/dpif.c              | 1 +
>>>>>    tests/system-traffic.at | 2 ++
>>>>>    2 files changed, 3 insertions(+)
>>>>
>>>> Good catch, Frode!  Thanks for the patch.
>>>
>>> Thank you for taking the time to review, Ilya, much appreciated!
>>>
>>>>>
>>>>> Wanted to note that retis was instrumental in tracking down the
>>>>> root cause if this issue, very useful tool!
>>>>>
>>>>> I also wanted to note that I spent considerable amonut of time
>>>>> trying to figure out a low touch way of creating a specifc test
>>>>> for this, but did not come up with any good alternatives.
>>>>>
>>>>> I appear to somehow have missed that an existing system test
>>>>> catches the condition, so I went ahead and expanded on that
>>>>> in the end.
>>>>
>>>> Hmm.  I can't reproduce the problem with this particular test.
>>>> There should be no upcalls after the packet is re-injected via
>>>> the 'execute' request.  And also the pcap file check should be
>>>> failing if the packet was dropped.  Could you explain a bit more
>>>> what you see in your test run?
>>>
>>> I identified this particular test as interesting by running the entire
>>> testsuite with retis and looking for evidence of a -111 return for
>>> upcalls in the kernel code.
>>>
>>> Rechecking this locally I see a difference in behavior depending on
>>> which compiler is used to build Open vSwitch, specifically I see the
>>> condition when compiling with gcc 14.2.0.
>>>
>>> Recompiling with gcc 15.2.0 on otherwise identical system makes the test
>>> pass.  I can only guess there is a difference in behavior for how the
>>> uninitialized value is treated.
>>>
>>> In any case, we cannot rely on compiler behavior for our test case, so I
>>> guess we need to find some other way of testing this.  Any suggestions
>>> welcome :-)
>>
>> One option, I believe, could be to have packet go through conntrack twice
>> (different zones) and add debug_slow into OpenFlow rules to force packets
>> to userspace after each recirculation.  That should end up with the packets
>> dropped.  At the same time it's still a gamble as we're relying on values
>> of the uninitialized memory either way.  So, not particularly reliable.
>>
>> We may omit the test for this issue entirely, since valgrind is actually
>> reporting the problem consistently while running this mpls test in
>> check-kernel-valgrind:
> 
> That would certainly give more consistent results, and I think i favor 
> this option.
> 
>>   Thread 27 handler71:
>>   Conditional jump or move depends on uninitialised value(s)
>>      at 0x55986A: dpif_netlink_encode_execute (dpif-netlink.c:2028)
>>      by 0x55986A: dpif_netlink_operate__ (dpif-netlink.c:2112)
>>      by 0x559B75: dpif_netlink_operate_chunks (dpif-netlink.c:2428)
>>      by 0x559B75: dpif_netlink_operate (dpif-netlink.c:2487)
>>      by 0x48BDD9: dpif_operate (dpif.c:1370)
>>      by 0x48C386: dpif_execute (dpif.c:1324)
>>      by 0x48C386: dpif_execute (dpif.c:1314)
>>      by 0x48C386: dpif_execute_helper_cb (dpif.c:1243)
>>      by 0x4C277A: odp_execute_actions (odp-execute.c:1074)
>>      by 0x489AE0: dpif_execute_with_help (dpif.c:1299)
>>      by 0x48BD8A: dpif_operate (dpif.c:1435)
>>      by 0x446A8F: handle_upcalls (ofproto-dpif-upcall.c:1746)
>>      by 0x446A8F: recv_upcalls (ofproto-dpif-upcall.c:962)
>>      by 0x446C1E: udpif_upcall_handler (ofproto-dpif-upcall.c:857)
>>      by 0x5163C4: ovsthread_wrapper (ovs-thread.c:426)
>>      by 0x51CB199: start_thread (pthread_create.c:443)
>>      by 0x524F533: clone (clone.S:100)
>>    Uninitialised value was created by a stack allocation
>>      at 0x48C130: dpif_execute_helper_cb (dpif.c:1176)
>>
>>   Syscall param sendmsg(msg.msg_iov[0]) points to uninitialised byte(s)
>>      at 0x5250F4D: __libc_sendmsg (sendmsg.c:28)
>>      by 0x5250F4D: sendmsg (sendmsg.c:25)
>>      by 0x57355D: nl_sock_transact_multiple__ (netlink-socket.c:874)
>>      by 0x573955: nl_sock_transact_multiple.part.0 (netlink-socket.c:1097)
>>      by 0x574ADF: nl_sock_transact_multiple (netlink-socket.c:1062)
>>      by 0x574ADF: nl_transact_multiple (netlink-socket.c:1862)
>>      by 0x55993A: dpif_netlink_operate__ (dpif-netlink.c:2141)
>>      by 0x559B75: dpif_netlink_operate_chunks (dpif-netlink.c:2428)
>>      by 0x559B75: dpif_netlink_operate (dpif-netlink.c:2487)
>>      by 0x48BDD9: dpif_operate (dpif.c:1370)
>>      by 0x48C386: dpif_execute (dpif.c:1324)
>>      by 0x48C386: dpif_execute (dpif.c:1314)
>>      by 0x48C386: dpif_execute_helper_cb (dpif.c:1243)
>>      by 0x4C277A: odp_execute_actions (odp-execute.c:1074)
>>      by 0x489AE0: dpif_execute_with_help (dpif.c:1299)
>>      by 0x48BD8A: dpif_operate (dpif.c:1435)
>>      by 0x446A8F: handle_upcalls (ofproto-dpif-upcall.c:1746)
>>      by 0x446A8F: recv_upcalls (ofproto-dpif-upcall.c:962)
>>      by 0x446C1E: udpif_upcall_handler (ofproto-dpif-upcall.c:857)
>>      by 0x5163C4: ovsthread_wrapper (ovs-thread.c:426)
>>      by 0x51CB199: start_thread (pthread_create.c:443)
>>      by 0x524F533: clone (clone.S:100)
>>    Address 0xfeefdec is on thread 27's stack
>>    in frame #4, created by dpif_netlink_operate__ (dpif-netlink.c:2039)
>>    Uninitialised value was created by a stack allocation
>>      at 0x48C130: dpif_execute_helper_cb (dpif.c:1176)
>>
>>
>> The tests are painfully slow, so we're not running them in CI.  But we
>> really should be running them at least once per release or something.
> 
> +1
> 
>> I'll make a note for myself to do that.
> 
> We have some quality assurance jobs running for upstream OVS/OVN on a 
> cadence too, I'll have a look and see if we can do something in this 
> space too as an extension of those jobs.
> 
>>>
>>>>>
>>>>> diff --git a/lib/dpif.c b/lib/dpif.c
>>>>> index 070fc0131..a062a4179 100644
>>>>> --- a/lib/dpif.c
>>>>> +++ b/lib/dpif.c
>>>>> @@ -1240,6 +1240,7 @@ dpif_execute_helper_cb(void *aux_, struct 
>>>>> dp_packet_batch *packets_,
>>>>>            execute.probe = false;
>>>>>            execute.mtu = 0;
>>>>>            execute.hash = 0;
>>>>> +        execute.upcall_pid = 0;
>>>>
>>>> Thinking a bit more about the problem, I think, we should be
>>>> getting the original upcall pid for this request.  We have it
>>>> in the 'execute' structure before we're trying to execute with
>>>> help.  We can store the value in the aux data and set it here.
>>>> The same as we do for the 'flow'.  This will ensure that we're
>>>> keeping the upcall pid regardless if the execution is going
>>>> direct to kernel or some actions are executed in the userspace
>>>> first.
>>>
>>> Ack, I'll address this in a v2.
>>>
>>
>> I have a vague memory of already suggesting this somewhere, but I don't
>> remember where...  We seem to have a similar problem with the mtu and
>> the hash that are cleared, while they probably shouldn't be.  So, what
>> we could do here is to store the entire original 'execute' in the aux,
>> then copy it and only update fields that need to be updated, i.e.
>> actions, packet and the needs_help.  The rest should stay the same as
>> it was in the original request.
>>
>> WDYT?
> 
> Makes sense to me, I'll have a go at incorporating this in the next 
> iteration.  Thanks for the input and guidance!

No problem.

Do you have an approximate timeline for when you'll be able to send a v2?
It would be great to have it soon, as we also started seeing this issue
in fedora, and users may start deploying OVS 3.6 on new kernels soon.

Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to