On 4 Jan 2023, at 17:42, Ilya Maximets wrote:
> On 1/4/23 11:33, Eelco Chaudron wrote: >> >> >> On 3 Jan 2023, at 19:24, Ilya Maximets wrote: >> >>> On 1/3/23 18:13, Eelco Chaudron wrote: >>>> When the ukey's action set changes, it could caus the flow to use a >>>> different datapath, for example, when it moves from tc to kernel. >>>> This will cause the the cached previous datapath statistics to be used. >>>> >>>> This change will reset the cached statistics when a change in >>>> datapath is discovered. >>>> >>>> Signed-off-by: Eelco Chaudron <[email protected]> >>>> --- >>>> ofproto/ofproto-dpif-upcall.c | 18 ++++++++++++++++-- >>>> 1 file changed, 16 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c >>>> index ad9635496..feaf58575 100644 >>>> --- a/ofproto/ofproto-dpif-upcall.c >>>> +++ b/ofproto/ofproto-dpif-upcall.c >>>> @@ -53,11 +53,13 @@ VLOG_DEFINE_THIS_MODULE(ofproto_dpif_upcall); >>>> COVERAGE_DEFINE(dumped_duplicate_flow); >>>> COVERAGE_DEFINE(dumped_new_flow); >>>> COVERAGE_DEFINE(handler_duplicate_upcall); >>>> -COVERAGE_DEFINE(upcall_ukey_contention); >>>> -COVERAGE_DEFINE(upcall_ukey_replace); >>>> COVERAGE_DEFINE(revalidate_missed_dp_flow); >>>> +COVERAGE_DEFINE(ukey_dp_change); >>>> COVERAGE_DEFINE(upcall_flow_limit_hit); >>>> COVERAGE_DEFINE(upcall_flow_limit_kill); >>>> +COVERAGE_DEFINE(upcall_ukey_contention); >>>> +COVERAGE_DEFINE(upcall_ukey_replace); >>>> + >>>> >>>> /* A thread that reads upcalls from dpif, forwards each upcall's packet, >>>> * and possibly sets up a kernel flow as a cache. */ >>>> @@ -287,6 +289,7 @@ struct udpif_key { >>>> >>>> struct ovs_mutex mutex; /* Guards the following. */ >>>> struct dpif_flow_stats stats OVS_GUARDED; /* Last known stats.*/ >>>> + const char *dp_layer OVS_GUARDED; /* Last known dp_layer. */ >>>> long long int created OVS_GUARDED; /* Estimate of creation >>>> time. */ >>>> uint64_t dump_seq OVS_GUARDED; /* Tracks udpif->dump_seq. >>>> */ >>>> uint64_t reval_seq OVS_GUARDED; /* Tracks udpif->reval_seq. >>>> */ >>>> @@ -1766,6 +1769,7 @@ ukey_create__(const struct nlattr *key, size_t >>>> key_len, >>>> ukey->created = ukey->flow_time = time_msec(); >>>> memset(&ukey->stats, 0, sizeof ukey->stats); >>>> ukey->stats.used = used; >>>> + ukey->dp_layer = "ovs"; >>>> ukey->xcache = NULL; >>>> >>>> ukey->offloaded = false; >>>> @@ -2758,6 +2762,16 @@ revalidate(struct revalidator *revalidator) >>>> continue; >>>> } >>>> >>>> + if (strcmp(ukey->dp_layer, f->attrs.dp_layer)) { >>>> + /* The dp_layer has changed this is probably due to an >>>> earlier >>>> + * revalidate cycle moving it to/from hw offload. In this >>>> case >>>> + * we should reset the ukey stored statistics, as they are >>>> + * from the deleted DP flow. */ >>>> + COVERAGE_INC(ukey_dp_change); >>>> + ukey->dp_layer = f->attrs.dp_layer; >>>> + memset(&ukey->stats, 0, sizeof ukey->stats); >>> >>> Wouldn't this mess up OpenFlow stats down the line? Especially if we >>> accept a patch from Balazs that reverts an overflow workaround. >> >> This was actually uncovered by Balazs’s patch, as it happens more frequently. >> It will also not affect the OpenFlow stats as they get pushed the delta. >> >>> >>> Also, I would guess that this will mess up stats for netdev-offload-dpdk, >>> because dpif-netdev hides all the stats shenanigans from the upper layers >>> while correctly exposing the dp_layer. It's doing that by summing the >>> userspace datapath stats for that flow with offloaded ones. See the >>> function get_dpif_flow_status() for details. >> >> Looking at it again we are both partly right :) dpif-netdev has two entries >> in >> parallel, so it needs to count both. However, if it was offloaded, and now >> offload is no longer happening it will report only the none offloaded >> counters, >> so reporting fewer packets/bytes causes a problem with Balazs’s patch >> included >> (and also in the delete case). This is what I’m trying to solve, but it’s not >> solved in this case as the none-offload stats are not reset. > > It's hard to tell without looking very close at the code, so I don't know for > sure, but it might not be a real problem for the netdev-offload-dpdk in > practice, > because the scenario where the flow was offloaded and suddenly not offloaded > anymore is not possible. > > The reason is the partial offloading support, i.e. classifications offloading. > Since match criteria of the flow never changes, the flow can only transition > between partially and fully offloaded, it can not become non-offloaded. And > netdev-offlaod-dpdk will preserve stats. The only problem is > netdev_flow_flush() > that can be triggered on port deletion. But there are other issues with the > flush at the moment. Guess I missed that part of the code :( I’ll try to do some rte_flow setup and experiment. I tried not to have to do this, but looks like I finally have to :( >> >> Do you know of any testbed/scripts to test rte_flow offload? I’ve not done >> any >> rte_flow and it looks like it’s not tested by default, do you know if there >> is >> some dummy offload PMD? > > I have some dummy rte_flow implementation here: > https://github.com/igsilya/ovs/commit/12823ef43e12f4c2a6c8def2fdfc8f329e0ce622 > > It is obviously not a real implementation and it does no offloading, so I > don't > think you can use it for stats testing right away. But maybe you can add some > code along these lines to inject stats. > > I did post that patch at some point to the list for review, but had no chance > to work on fixes for it. The patch also might crash in some cases, IIRC. :) Ok let me take a look at this, and see if it could help my testing. >> >>> I'd say, dpif-netlink should take care of correct stats management >>> instead. This will probably mean tracking stats delta in both >>> dpif-netlink and netdev-offload-tc. Note that netdev-offload-tc should >>> still be able to report correct stats to dpif-netdev as well. >> >> I guess on dpif-netdev we can fix this as we keep state on each dp flow, >> which >> is something we do not do for dpif-netlink, and I think you mentioned before >> you do not want this either. > > Yeah, it's not a great solution. Though I'm not sure how a good solution > for this should look like. I’ll spend some time thinking about this over the next week… >> >> We could differentiate the moves from tc<->ovs but it makes the code no >> longer >> dp agnostics, guess we could fix this with a dpif callback. >> >> I need to give this some more thought, but any suggestions are welcome! >> >> >> Cheers, >> >> >> Eelco >> >>> Best regards, Ilya Maximets. >> _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
