On 22.08.2018 15:30, Stokes, Ian wrote:
>> On 7/27/2018 7:26 PM, Vishal Deep Ajmera wrote:
>>> OVS reads packets in batches from a given port and packets in the
>>> batch are subjected to potentially 3 levels of lookups to identify the
>>> datapath megaflow entry (or flow) associated with the packet.
>>> Each megaflow entry has a dedicated buffer in which packets that match
>>> the flow classification criteria are collected. This buffer helps OVS
>>> perform batch processing for all packets associated with a given flow.
>>>
>>> Each packet in the received batch is first subjected to lookup in the
>>> Exact Match Cache (EMC). Each EMC entry will point to a flow. If the
>>> EMC lookup is successful, the packet is moved from the rx batch to the
>>> per-flow buffer.
>>>
>>> Packets that did not match any EMC entry are rearranged to the
>>> beginning of the rx batch and are then subjected to a lookup in the
>>> megaflow cache.
>>> Packets that match a megaflow cache entry are *appended* to the
>>> per-flow buffer.
>>>
>>> Packets that do not match any megaflow entry are subjected to
>>> slow-path processing through the upcall mechanism. This cannot change
>>> the order of packets since, by definition, upcall processing is only
>>> done for packets without a matching megaflow entry.
>>>
>>> The EMC entry match fields encompass all potentially significant
>>> header fields, typically more than specified in the associated flow's
>>> match criteria. Hence, multiple EMC entries can point to the same
>>> flow. Given that per-flow batching happens at each lookup stage,
>>> packets belonging to the same megaflow can get re-ordered because some
>>> packets match EMC entries while others do not.
>>>
>>> The following example illustrates the issue. Consider the following
>>> batch of packets (labelled P1 to P8) associated with a single TCP
>>> connection and hence with a single flow. Assume that packets with
>>> just the ACK bit set in TCP flags were also received in a prior batch
>>> and that a corresponding EMC entry exists.
>>>
>>> 1. P1 (TCP Flag: ACK)
>>> 2. P2 (TCP Flag: ACK)
>>> 3. P3 (TCP Flag: ACK)
>>> 4. P4 (TCP Flag: ACK, PSH)
>>> 5. P5 (TCP Flag: ACK)
>>> 6. P6 (TCP Flag: ACK)
>>> 7. P7 (TCP Flag: ACK)
>>> 8. P8 (TCP Flag: ACK)
>>>
>>> The megaflow classification criteria do not include TCP flags, while
>>> the EMC match criteria do. Thus, all packets other than P4 match the
>>> existing EMC entry and are moved to the per-flow packet batch.
>>> Subsequently, packet P4 is moved to the same per-flow packet batch as
>>> a result of the megaflow lookup. Though the packets have all been
>>> correctly classified as being associated with the same flow, the
>>> packet order has not been preserved because of the per-flow batching
>>> performed during the EMC lookup stage. This packet re-ordering has
>>> performance implications for TCP applications.
>>>
>>> This patch preserves the packet ordering by performing the per-flow
>>> batching only after both the EMC and megaflow lookups are complete.
>>> As an optimization, packets are flow-batched during EMC processing
>>> until the first EMC miss in the batch.
>>>
>>> A new flow map is maintained to record the original packet order
>>> along with the flow information. After fastpath processing, packets
>>> from the flow map are *appended* to the per-flow buffers.
>>>
>>
>> Thanks for the V6 Vishal, looking at this today myself.
>>
>> Ilya, has the v6 addressed your concerns from the v5?
> 
> There haven't been any more comments on this patch. From what I can see, the 
> issues raised by Ilya on v5 have been addressed in the v6. V6 seems to test 
> OK, so unless there are objections I'm going to merge it as part of this 
> week's dpdk_merge pull request for master.

Hi, Ian, everyone.
It's OK to merge. I still don't like the change, but I'll get along with it.
P.S. The current version cannot be applied cleanly; a minor rebase is required.

Below text is not directly related to this patch.
---
Meanwhile, in general, I think that the main processing part
of dpif-netdev (dp_netdev_input__) is overcomplicated. It looks
like we have 4 different levels of packet batching (some of which
exist at the same time):
1. rx batches
2. per-flow batches
3. new flow maps
4. output batches
We are constantly re-batching packets into different data
structures. That's frustrating.

I'd like to refactor it toward more iterative, sequential
processing, e.g. using the "keys_map" approach from "dpcls_lookup"
and calling different processing functions until "keys_map" is empty.
Each processing function could use "ULLONG_FOR_EACH_1" over
"keys_map", set up rules for matched packets and clear them from
"keys_map" using "ULLONG_SET0".
Right now we have 5 options to classify packets:
1. has_flow_mark()
2. emc_lookup()
3. smc_lookup_batch()
4. dpcls_lookup()
5. handle_packet_upcall()
Some of them work with packet batches, some with individual packets.
IMHO, if we make a plain pipeline from them, calling the next stage
while there are still unhandled packets in "keys_map", it could be
much simpler compared to the current call tree with its mess of
processing functions, each implemented in a different way. In the end
we would push all the packets to per-flow batches and execute actions.

In fact, this would effectively resolve the current reordering issue too.

I'm not sure if the above text is parsable. =) Sorry if not.

Any thoughts?
Anyway, I'm going to try.

Best regards, Ilya Maximets.

> Thanks
> Ian
> 
>>
>>> Signed-off-by: Vishal Deep Ajmera <[email protected]>
>>> Co-authored-by: Venkatesan Pradeep <[email protected]>
>>> Signed-off-by: Venkatesan Pradeep <[email protected]>
>>> ---
>>>   lib/dpif-netdev.c | 125 +++++++++++++++++++++++++++++++++++++++++++++---------
>>>   1 file changed, 106 insertions(+), 19 deletions(-)
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev