On 19 Oct 2022, at 11:31, Eelco Chaudron wrote:
> On 19 Oct 2022, at 1:08, Ilya Maximets wrote:
>
>> On 10/19/22 00:07, Jan Scheurich wrote:
>>> Hi guys,
>>>
>>> I am afraid that commit is too long ago for me to remember any details
>>> of what caused us to change the code in beb75a40fdc2 ("userspace: Switching
>>> of L3 packets in L2 pipeline"). What I vaguely remember is that I couldn't
>>> comprehend the original code and it was not working correctly in some of
>>> the cases we needed/tested. But perhaps the changes we introduced also had
>>> corner cases we didn't consider.
>
> First of all, thanks Ilya, for following this up while I was asleep and
> digging into who made the change :) I was trying to get the patch out quickly
> for some feedback and forgot this step :(
>
> My change does not seem to cause any failures in the test cases you changed.
> But it looks like you did not have any specific test cases for the problem
> you were trying to solve.
I took another look at the original code before Jan's patch and replaced
netdev_flow_key_init() with the original function, netdev_flow_key_from_flow(),
from before commit beb75a40fdc2, and I see two tunnel test cases fail:

  794: tunnel_push_pop - action FAILED (tunnel-push-pop.at:660)
  803: tunnel_push_pop_ipv6 - action FAILED (tunnel-push-pop-ipv6.at:528)

This is due to the odd way of transforming the key to a packet and back to a
flow, whereas my implementation uses the key as supplied to create the
miniflow, which seems like the right approach to me.
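
To make the difference concrete, here is a toy, self-contained sketch of the
two approaches (stand-in types and names, not the actual OVS code; the tunnel
failures above suggest something along these lines, though the real corner
case may differ): round-tripping the key through a packet can drop metadata
that has no packet encoding, while building the key from the supplied flow
keeps it.

    /* Toy illustration only; the real code deals with struct flow,
     * struct dp_packet and the miniflow helpers. */
    #include <stdint.h>
    #include <stdio.h>

    struct toy_flow {
        uint32_t nw_src;    /* has a packet representation */
        uint32_t tun_id;    /* metadata with no direct packet encoding */
    };

    struct toy_packet {
        uint32_t nw_src;    /* only what fits on the wire */
    };

    /* Old style: "compose" a packet from the flow and "extract" it again;
     * anything without a wire representation is lost on the way. */
    static struct toy_flow key_via_packet_roundtrip(const struct toy_flow *in)
    {
        struct toy_packet pkt = { .nw_src = in->nw_src };
        struct toy_flow out = { .nw_src = pkt.nw_src };  /* tun_id gone */
        return out;
    }

    /* New style: use the key exactly as it was supplied. */
    static struct toy_flow key_from_supplied(const struct toy_flow *in)
    {
        return *in;
    }

    int main(void)
    {
        struct toy_flow key = { .nw_src = 0x01010101, .tun_id = 42 };

        printf("roundtrip tun_id=%u, direct tun_id=%u\n",
               key_via_packet_roundtrip(&key).tun_id,
               key_from_supplied(&key).tun_id);
        return 0;
    }
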
>>> Question though:
>>>
>>>>> The datapath supports installing wider flows, and OVS relies on this
>>>>> behavior. For example if ipv4(src=1.1.1.1/192.0.0.0,
>>>>> dst=1.1.1.2/192.0.0.0) exists, a wider flow (smaller mask) of
>>>>> ipv4(src=192.1.1.1/128.0.0.0,dst=192.1.1.2/128.0.0.0) is allowed to be
>>>>> added.
>>>
>>> That sounds strange to me. I always believed the datapath only supports
>>> non-overlapping flows, i.e. a packet can match at most one datapath flow.
>>> Only with that prerequisite can the dpcls classifier work without
>>> priorities. Have I been wrong about this? What would be the semantics of
>>> adding a wider flow to the datapath? To my knowledge there is no guarantee
>>> that the dpcls subtables are visited in any specific order that would honor
>>> the mask width. And the first match will win.
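
To illustrate that point, here is a stripped-down, first-match-wins lookup
(toy code, not the actual dpcls in lib/dpif-netdev.c): with two overlapping
entries in different subtables, the result depends purely on the order in
which the subtables happen to be visited.

    /* Toy first-match-wins classifier: one mask plus one stored value per
     * "subtable".  Only meant to show why overlapping entries need either
     * priorities or a defined visiting order. */
    #include <stdint.h>
    #include <stdio.h>

    struct toy_subtable {
        uint32_t mask;      /* mask applied to the packet's src address */
        uint32_t value;     /* stored, already-masked match value */
        const char *name;
    };

    static const char *toy_lookup(const struct toy_subtable *t, int n,
                                  uint32_t pkt_src)
    {
        for (int i = 0; i < n; i++) {                    /* arbitrary order */
            if ((pkt_src & t[i].mask) == t[i].value) {
                return t[i].name;                        /* first match wins */
            }
        }
        return "miss";
    }

    int main(void)
    {
        /* Overlapping entries: both match a packet with src 10.0.0.1. */
        struct toy_subtable tables[] = {
            { 0xff000000u, 0x0a000000u, "wide flow (10.0.0.0/8)" },
            { 0xffffffffu, 0x0a000001u, "exact flow (10.0.0.1/32)" },
        };

        /* Reports the /8 entry only because it is visited first. */
        printf("%s\n", toy_lookup(tables, 2, 0x0a000001u));
        return 0;
    }
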
>>
>> Yes, you're right that the classifier always chooses the first match,
>> so it cannot work correctly in general if there are multiple flows
>> that match the same packet.
>>
>> What this change is trying to achieve, IIUC, is a bit of an odd case
>> where we have multiple flows that match on the same packet, but one
>> of them is a superset of another. And these flows should have identical
>> actions, of course, if we want the system to still function correctly.
>> (Should we actually compare actions to be identical and matches to
>> be a superset? If the actions are not correct, the flow must soon be
>> revalidated, so that might not be a big concern. If the mask is not
>> a superset, the flow_wildcards_has_extra() check should catch that
>> and revalidate one of them, so should not be a problem either, I guess.
>> But we need to check that flow mods are always using ufid generated
>> from a full key and not updating actions on an incorrect flow. Same
>> for flow deletions, we need to be sure that while removing flows we're
>> actually finding and removing the one that we need.)
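
For the "mask is a superset" part, this is the bit math I have in mind,
reduced to a single 32-bit field (toy code; the real check around
flow_wildcards_has_extra() walks all flow fields): the wider flow may coexist
only if every bit it matches on is also matched by the narrow flow.

    /* Single-field sketch of the mask-superset relation discussed above. */
    #include <stdint.h>
    #include <stdbool.h>
    #include <assert.h>

    /* True if every bit matched by the wide (more generic) flow is also
     * matched by the narrow (more specific) flow. */
    static bool narrow_covers_wide(uint32_t narrow_mask, uint32_t wide_mask)
    {
        return (wide_mask & ~narrow_mask) == 0;
    }

    int main(void)
    {
        uint32_t narrow = 0xc0000000u;   /* 192.0.0.0 mask from the example */
        uint32_t wide   = 0x80000000u;   /* 128.0.0.0 mask, wider flow      */

        assert(narrow_covers_wide(narrow, wide));    /* OK to keep both      */
        assert(!narrow_covers_wide(wide, narrow));   /* extra bits -> should
                                                        trigger revalidation */
        return 0;
    }
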
>
> Let me take another look at this delete code, as the mod code uses the same
> path as add, so that should be taken care of.
I did some experiments and reviewed the code, and we should be good with the
delete part in both userspace and the kernel (with both the ufid and key delete
methods).
>> We should actually re-check the classifier and the caches to make sure
>> that there will not be any surprises and they are capable of dealing
>> with possible duplicates...
>
> I did take a brief look at it and could not find any red flags, but if you
> have something specific in mind, please let me know and I'll recheck.
>
> Also, as the current PMD datapath code already uses it in this way (see
> below), we would likely have caught issues earlier if they exist.
>
>>> Please clarify this. And in which sense OVS relies on this behavior?
>>
>> The problem is the revalidation process. During revalidation, OVS removes
>> too-generic flows from the datapath to avoid incorrect matches, but allows
>> too-narrow flows to stay in the datapath to avoid dataplane disruption and
>> also to avoid constant flow deletions if the datapath ignores wildcards on
>> certain fields/bits. See the flow_wildcards_has_extra() check in the
>> revalidate_ukey__() function.
>>
>> The problem here is that we have a too-narrow flow installed and the
>> OpenFlow rules have changed, so the actual flow should now be more generic.
>> Revalidators will not remove the narrow flow; we will eventually get an
>> upcall on a packet that doesn't match the narrow flow, but we will not be
>> able to install a more generic flow because, after masking with the new
>> wider mask, the key matches the narrow flow, so we get EEXIST.
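
To make sure I follow the EEXIST case, here is the bit-level effect as I
understand it, reduced to one field (toy code, my reading of the lookup, not
the actual implementation): because the wider mask's bits are a subset of the
narrow mask's bits, re-masking the wide masked key in the narrow flow's
subtable is a no-op, so if the stored values line up the put finds the narrow
flow and fails with EEXIST.

    /* Toy, single-field view of the masking behind the EEXIST scenario. */
    #include <stdint.h>
    #include <assert.h>

    int main(void)
    {
        uint32_t key         = 0x0a000001u;   /* flow key src 10.0.0.1      */
        uint32_t narrow_mask = 0xffffff00u;   /* installed narrow flow: /24 */
        uint32_t wide_mask   = 0xffff0000u;   /* new, wider flow:       /16 */

        uint32_t wide_masked_key = key & wide_mask;        /* 10.0.0.0 */

        /* Re-masking with the narrow mask changes nothing... */
        assert((wide_masked_key & narrow_mask) == wide_masked_key);

        /* ...so if the narrow flow stored this same value, the lookup done
         * by the flow put reports the narrow flow and we get EEXIST instead
         * of installing the wider flow. */
        uint32_t narrow_flow_value = 0x0a000000u;          /* 10.0.0.0/24 */
        assert((wide_masked_key & narrow_mask) == narrow_flow_value);
        return 0;
    }
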
>>
>> This particular scenario is probably not an actual problem for the
>> userspace datapath, because upcalls are handled in a different code
>> path, so the wider flow should be installed just fine alongside the
>> too-narrow flow (we still need to check that all parts of the code can
>> handle this). So it is only the "slow" path here where upper layers may
>> decide to add a wider flow via dpif_flow_put() but will fail.
>
> That's why I had the remark "the same way this is handled in the userspace
> datapath" in the commit message.
> So dpif_netdev_flow_put() seems to be used only by the dummy datapath used
> in all the test cases, while the actual dpif-netdev path seems to go through
> handle_packet_upcall().
>
> In handle_packet_upcall() it does a dp_netdev_pmd_lookup_flow() using the
> unmasked packet key (like my change below) and then calls
> dp_netdev_flow_add() using the masked ukey.
>
> Jan, I'm wondering why you only changed dpif_netdev_flow_put() and not the
> PMD path, as the latter is probably the path used in real life. Or is there
> some odd kernel tunnel combination that causes the dpif_netdev_flow_put()
> path to be used? If so, do you have a test case you can run on this change?
> The default datapath make checks all pass.
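
For reference, the fast-path pattern I am describing boils down to something
like this (toy code with stand-in names, not the real
dp_netdev_pmd_lookup_flow()/dp_netdev_flow_add()): look up with the full,
unmasked packet key, and on a miss install an entry stored under the masked
key.

    /* Toy version of the upcall-path pattern: lookup with the unmasked key,
     * insertion with the masked key. */
    #include <stdint.h>
    #include <stdbool.h>
    #include <stdio.h>

    #define TABLE_SIZE 8

    struct toy_entry {
        bool used;
        uint32_t mask;
        uint32_t value;     /* key & mask at insertion time */
    };

    static struct toy_entry table[TABLE_SIZE];

    static struct toy_entry *toy_lookup(uint32_t unmasked_key)
    {
        for (int i = 0; i < TABLE_SIZE; i++) {
            if (table[i].used
                && (unmasked_key & table[i].mask) == table[i].value) {
                return &table[i];
            }
        }
        return NULL;
    }

    static void toy_add(uint32_t unmasked_key, uint32_t mask)
    {
        for (int i = 0; i < TABLE_SIZE; i++) {
            if (!table[i].used) {
                table[i].used = true;
                table[i].mask = mask;
                table[i].value = unmasked_key & mask;
                return;
            }
        }
    }

    int main(void)
    {
        uint32_t pkt_key = 0x0a000001u;           /* src 10.0.0.1 */

        if (!toy_lookup(pkt_key)) {               /* miss -> "upcall" */
            toy_add(pkt_key, 0xffff0000u);        /* install a wider /16 flow */
        }
        printf("hit after add: %s\n", toy_lookup(pkt_key) ? "yes" : "no");
        return 0;
    }
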
>
>> It's more of an issue for the kernel datapath, where we seem to have
>> similar code inside the kernel.
>
> Yes, I also have a kernel patch ready for this, but I waited because I
> wanted to get the userspace patch in/discussed first (before someone on the
> kernel side does a quick ack and commit ;)
>
>> In any case, if the "fast" upcall path is able to add such flows
>> (I didn't verify that), the "slow" dpif_flow_put() case should work
>> the same way to avoid confusion and potentially more serious issues in
>> the future.
>
> See above; I can confirm the fast path works similarly to this change, i.e.
> using the packet ukey.
>
Jan and Ilya, please let me know if you see any other problems or if you are ok
with the change so I can also submit the kernel patch.
<SNIP>