On 20 Oct 2022, at 9:43, Eelco Chaudron wrote:
> On 19 Oct 2022, at 11:31, Eelco Chaudron wrote:
>
>> On 19 Oct 2022, at 1:08, Ilya Maximets wrote:
>>
>>> On 10/19/22 00:07, Jan Scheurich wrote:
>>>> Hi guys,
>>>>
>>>> I am afraid that commit was too long ago for me to remember any details
>>>> of what caused us to change the code in beb75a40fdc2 ("userspace: Switching
>>>> of L3 packets in L2 pipeline"). What I vaguely remember is that I
>>>> couldn’t comprehend the original code and it was not working correctly in
>>>> some of the cases we needed/tested. But perhaps the changes we introduced
>>>> also had corner cases we didn't consider.
>>
>> First of all, thanks, Ilya, for following this up while I was asleep and
>> for digging into who made the change :) I was trying to get the patch out
>> quickly for some feedback and forgot this step :(
>>
>> My change does not seem to cause any failures in the test cases you changed.
>> But it looks like you did not have any specific test cases for the problem
>> you were trying to solve.
>
> I took another look at the original code from before Jan’s patch. When I
> replace netdev_flow_key_init() with the original function,
> netdev_flow_key_from_flow(), from before commit beb75a40fdc2, I see two
> tunnel test cases fail:
>
> 794: tunnel_push_pop - action FAILED
> (tunnel-push-pop.at:660)
> 803: tunnel_push_pop_ipv6 - action FAILED
> (tunnel-push-pop-ipv6.at:528)
>
> This is due to the odd way the original code transforms the key into a
> packet and back into a flow. My implementation instead uses the key as
> supplied to create the miniflow, which seems like the right approach to me.
>
>>>> Question though:
>>>>
>>>>>> The datapath supports installing wider flows, and OVS relies on this
>>>>>> behavior. For example if ipv4(src=1.1.1.1/192.0.0.0,
>>>>>> dst=1.1.1.2/192.0.0.0) exists, a wider flow (smaller mask) of
>>>>>> ipv4(src=192.1.1.1/128.0.0.0,dst=192.1.1.2/128.0.0.0) is allowed to be
>>>>>> added.
>>>>
>>>> That sounds strange to me. I always believed the datapath only supports
>>>> non-overlapping flows, i.e. a packet can match at most one datapath flow.
>>>> Only with that prerequisite can the dpcls classifier work without
>>>> priorities. Have I been wrong about this? What would be the semantics of
>>>> adding a wider flow to the datapath? To my knowledge there is no guarantee
>>>> that the dpcls subtables are visited in any specific order that would
>>>> honor the mask width, and the first match will win.
>>>
>>> Yes, you're right that the classifier always chooses the first match,
>>> so it cannot work correctly in general if there are multiple flows
>>> that match the same packet.
>>>
>>> What this change is trying to achieve, IIUC, is a bit of an odd case
>>> where we have multiple flows that match the same packet, but one
>>> of them is a superset of another. And these flows should, of course,
>>> have identical actions if we want the system to still function correctly.
>>> (Should we actually compare the actions to be identical and the matches
>>> to be a superset? If the actions are not correct, the flow will soon be
>>> revalidated, so that might not be a big concern. If the mask is not
>>> a superset, the flow_wildcards_has_extra() check should catch that
>>> and revalidate one of them, so that should not be a problem either, I guess.
>>> But we need to check that flow mods always use a ufid generated
>>> from the full key and do not update actions on an incorrect flow. The same
>>> goes for flow deletions: we need to be sure that while removing flows we
>>> actually find and remove the one that we need.)
>>
>> Let me take another look at the delete code; the mod code uses the same
>> path as add, so that should be taken care of.
>
> I did some experiments and reviewed the code, and we should be good with the
> delete part in both userspace and the kernel (with both the ufid and key
> delete methods).
>
>>> We should actually re-check the classier and the caches to make sure
>>> that there will not be any surprises and they are capable of dealing
>>> with possible duplicates...
>>
>> I did take a brief look at it and could not find any red flags, but if you
>> have something specific in mind, please let me know and I'll recheck.
>>
>> Also, as the current PMD datapath code already uses it this way (see
>> below), we would probably have caught issues earlier if they existed.
>>
>>>> Please clarify this. And in which sense OVS relies on this behavior?
>>>
>>> The problem is the revalidation process. During revalidation, OVS removes
>>> too-generic flows from the datapath to avoid incorrect matches, but
>>> allows too-narrow flows to stay in the datapath to avoid dataplane
>>> disruption and also to avoid constant flow deletions when the datapath
>>> ignores wildcards on certain fields/bits. See the flow_wildcards_has_extra()
>>> check in the revalidate_ukey__() function.
>>>
>>> The problem here is that we have a too-narrow flow installed and the
>>> OpenFlow rules have now changed, so the actual flow should be more generic.
>>> Revalidators will not remove the narrow flow; we will eventually get
>>> an upcall on a packet that doesn't match the narrow flow, but we will
>>> not be able to install a more generic flow, because after masking with
>>> the new wider mask the key matches the narrow flow, so we get EEXIST.
>>>
>>> This particular scenario is probably not an actual problem for the
>>> userspace datapath, because upcalls are handled in a different code
>>> path, so the wider flow should be installed just fine alongside
>>> the too-narrow flow (we still need to check that all parts of the
>>> code can handle this). So it is only an issue for the "slow" path here,
>>> where upper layers may decide to add a wider flow via dpif_flow_put()
>>> but will fail.
>>
>> That’s why I had the remark “the same way this is handled in the
>> userspace datapath” in the commit message.
>> So dpif_netdev_flow_put() seems to be used only by the dummy datapath
>> used by all the test cases, while the actual dpif-netdev path uses
>> handle_packet_upcall().
>>
>> In handle_packet_upcall() it does a dp_netdev_pmd_lookup_flow() using
>> the unmasked packet key (like my change below) and then calls
>> dp_netdev_flow_add() using the masked ukey.
>>
>> Jan, I’m wondering why you only changed dpif_netdev_flow_put() and not
>> the PMD path, as the latter is probably the path used in real life. Or is
>> there some odd kernel tunnel combination that causes this path to be taken?
>> If so, do you have a test case you can run against this change? The default
>> dp make checks all pass.
>>
>>> It's more of an issue for the kernel datapath, where we seem to have
>>> similar code inside the kernel.
>>
>> Yes, I also have a kernel patch ready for this. But I waited because I
>> wanted to get the userspace patch in/discussed first (before someone
>> does a quick ack and commit on the kernel side ;)
>>
>>> In any case, if the "fast" upcall path is able to add such flows
>>> (I didn't verify that), the "slow" dpif_flow_put() case should work
>>> the same way, to avoid confusion and potentially more serious issues in
>>> the future.
>>
>> See above; I can confirm the fast path works similarly to this change, i.e.
>> it uses the packet ukey.
>>
>
> Jan and Ilya, please let me know if you see any other problems or if you are
> ok with the change so I can also submit the kernel patch.
See above. Any more comments? I think the patch should be OK as is.
Jan/Ilya?
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev