On 4 Jan 2023, at 17:42, Ilya Maximets wrote:

> On 1/4/23 11:33, Eelco Chaudron wrote:
>>
>>
>> On 3 Jan 2023, at 19:24, Ilya Maximets wrote:
>>
>>> On 1/3/23 18:13, Eelco Chaudron wrote:
>>>> When the ukey's action set changes, it could caus the flow to use a
>>>> different datapath, for example, when it moves from tc to kernel.
>>>> This will cause the the cached previous datapath statistics to be used.
>>>>
>>>> This change will reset the cached statistics when a change in
>>>> datapath is discovered.
>>>>
>>>> Signed-off-by: Eelco Chaudron <[email protected]>
>>>> ---
>>>>  ofproto/ofproto-dpif-upcall.c |   18 ++++++++++++++++--
>>>>  1 file changed, 16 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
>>>> index ad9635496..feaf58575 100644
>>>> --- a/ofproto/ofproto-dpif-upcall.c
>>>> +++ b/ofproto/ofproto-dpif-upcall.c
>>>> @@ -53,11 +53,13 @@ VLOG_DEFINE_THIS_MODULE(ofproto_dpif_upcall);
>>>>  COVERAGE_DEFINE(dumped_duplicate_flow);
>>>>  COVERAGE_DEFINE(dumped_new_flow);
>>>>  COVERAGE_DEFINE(handler_duplicate_upcall);
>>>> -COVERAGE_DEFINE(upcall_ukey_contention);
>>>> -COVERAGE_DEFINE(upcall_ukey_replace);
>>>>  COVERAGE_DEFINE(revalidate_missed_dp_flow);
>>>> +COVERAGE_DEFINE(ukey_dp_change);
>>>>  COVERAGE_DEFINE(upcall_flow_limit_hit);
>>>>  COVERAGE_DEFINE(upcall_flow_limit_kill);
>>>> +COVERAGE_DEFINE(upcall_ukey_contention);
>>>> +COVERAGE_DEFINE(upcall_ukey_replace);
>>>> +
>>>>
>>>>  /* A thread that reads upcalls from dpif, forwards each upcall's packet,
>>>>   * and possibly sets up a kernel flow as a cache. */
>>>> @@ -287,6 +289,7 @@ struct udpif_key {
>>>>
>>>>      struct ovs_mutex mutex;                   /* Guards the following. */
>>>>      struct dpif_flow_stats stats OVS_GUARDED; /* Last known stats.*/
>>>> +    const char *dp_layer OVS_GUARDED;         /* Last known dp_layer. */
>>>>      long long int created OVS_GUARDED;        /* Estimate of creation 
>>>> time. */
>>>>      uint64_t dump_seq OVS_GUARDED;            /* Tracks udpif->dump_seq. 
>>>> */
>>>>      uint64_t reval_seq OVS_GUARDED;           /* Tracks udpif->reval_seq. 
>>>> */
>>>> @@ -1766,6 +1769,7 @@ ukey_create__(const struct nlattr *key, size_t 
>>>> key_len,
>>>>      ukey->created = ukey->flow_time = time_msec();
>>>>      memset(&ukey->stats, 0, sizeof ukey->stats);
>>>>      ukey->stats.used = used;
>>>> +    ukey->dp_layer = "ovs";
>>>>      ukey->xcache = NULL;
>>>>
>>>>      ukey->offloaded = false;
>>>> @@ -2758,6 +2762,16 @@ revalidate(struct revalidator *revalidator)
>>>>                  continue;
>>>>              }
>>>>
>>>> +            if (strcmp(ukey->dp_layer, f->attrs.dp_layer)) {
>>>> +                /* The dp_layer has changed this is probably due to an 
>>>> earlier
>>>> +                 * revalidate cycle moving it to/from hw offload. In this 
>>>> case
>>>> +                 * we should reset the ukey stored statistics, as they are
>>>> +                 * from the deleted DP flow. */
>>>> +                COVERAGE_INC(ukey_dp_change);
>>>> +                ukey->dp_layer = f->attrs.dp_layer;
>>>> +                memset(&ukey->stats, 0, sizeof ukey->stats);
>>>
>>> Wouldn't this mess up OpenFlow stats down the line?  Especially if we
>>> accept a patch from Balazs that reverts an overflow workaround.
>>
>> This was actually uncovered by Balazs’s patch, as it happens more frequently.
>> It will also not affect the OpenFlow stats as they get pushed the delta.
>>
>>>
>>> Also, I would guess that this will mess up stats for netdev-offload-dpdk,
>>> because dpif-netdev hides all the stats shenanigans from the upper layers
>>> while correctly exposing the dp_layer.  It's doing that by summing the
>>> userspace datapath stats for that flow with offloaded ones.  See the
>>> function get_dpif_flow_status() for details.
>>
>> Looking at it again we are both partly right :) dpif-netdev has two entries 
>> in
>> parallel, so it needs to count both. However, if it was offloaded, and now
>> offload is no longer happening it will report only the none offloaded 
>> counters,
>> so reporting fewer packets/bytes causes a problem with Balazs’s patch 
>> included
>> (and also in the delete case). This is what I’m trying to solve, but it’s not
>> solved in this case as the none-offload stats are not reset.
>
> It's hard to tell without looking very close at the code, so I don't know for
> sure, but it might not be a real problem for the netdev-offload-dpdk in 
> practice,
> because the scenario where the flow was offloaded and suddenly not offloaded
> anymore is not possible.
>
> The reason is the partial offloading support, i.e. classifications offloading.
> Since match criteria of the flow never changes, the flow can only transition
> between partially and fully offloaded, it can not become non-offloaded.  And
> netdev-offlaod-dpdk will preserve stats.  The only problem is 
> netdev_flow_flush()
> that can be triggered on port deletion.  But there are other issues with the
> flush at the moment.

Guess I missed that part of the code :( I’ll try to do some rte_flow setup and 
experiment. I tried not to have to do this, but looks like I finally have to :(

>>
>> Do you know of any testbed/scripts to test rte_flow offload? I’ve not done 
>> any
>> rte_flow and it looks like it’s not tested by default, do you know if there 
>> is
>> some dummy offload PMD?
>
> I have some dummy rte_flow implementation here:
> https://github.com/igsilya/ovs/commit/12823ef43e12f4c2a6c8def2fdfc8f329e0ce622
>
> It is obviously not a real implementation and it does no offloading, so I 
> don't
> think you can use it for stats testing right away.  But maybe you can add some
> code along these lines to inject stats.
>
> I did post that patch at some point to the list for review, but had no chance
> to work on fixes for it.  The patch also might crash in some cases, IIRC. :)

Ok let me take a look at this, and see if it could help my testing.

>>
>>> I'd say, dpif-netlink should take care of correct stats management
>>> instead.  This will probably mean tracking stats delta in both
>>> dpif-netlink and netdev-offload-tc.  Note that netdev-offload-tc should
>>> still be able to report correct stats to dpif-netdev as well.
>>
>> I guess on dpif-netdev we can fix this as we keep state on each dp flow, 
>> which
>> is something we do not do for dpif-netlink, and I think you mentioned before
>> you do not want this either.
>
> Yeah, it's not a great solution.  Though I'm not sure how a good solution
> for this should look like.

I’ll spend some time thinking about this over the next week…

>>
>> We could differentiate the moves from tc<->ovs but it makes the code no 
>> longer
>> dp agnostics, guess we could fix this with a dpif callback.
>>
>> I need to give this some more thought, but any suggestions are welcome!
>>
>>
>> Cheers,
>>
>>
>> Eelco
>>
>>> Best regards, Ilya Maximets.
>>

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to