Sorry about the delay. I took a few minutes to properly understand the situation this morning. I applied the following form of your patch to master and branch-2.[123]:
--8<--------------------------cut here-------------------------->8-- From: Anoob Soman <anoob.so...@citrix.com> Date: Tue, 20 May 2014 12:40:35 +0100 Subject: [PATCH] netflow: Fold netflow_expire() into netflow_flow_clear(). netflow_flow_clear() asserted that no packets or bytes were included in the statistics for the flow being cleared. Before threading Open vSwitch, this assertion was always true because netflow_expire() was always called before calling netflow_flow_clear(). Since Open vSwitch was threaded, however, it was possible that a packet arrived after netflow_expire() but before netflow_flow_clear(), since each of these function separately took the netflow mutex. This commit fixes the problem by merging netflow_expire() into netflow_flow_clear(), under a single acquisition of the netflow mutex. Signed-off-by: Anoob Soman <anoob.so...@citrix.com> [b...@nicira.com modified the patch to remove netflow_expire() and rewrote the commit message] Signed-off-by: Ben Pfaff <b...@nicira.com> --- ofproto/netflow.c | 16 +--------------- ofproto/netflow.h | 3 +-- ofproto/ofproto-dpif-upcall.c | 2 -- ofproto/ofproto-dpif-xlate.c | 1 - 4 files changed, 2 insertions(+), 20 deletions(-) diff --git a/ofproto/netflow.c b/ofproto/netflow.c index e9382af..c7af010 100644 --- a/ofproto/netflow.c +++ b/ofproto/netflow.c @@ -276,19 +276,6 @@ netflow_expire__(struct netflow *nf, struct netflow_flow *nf_flow) } void -netflow_expire(struct netflow *nf, struct flow *flow) OVS_EXCLUDED(mutex) -{ - struct netflow_flow *nf_flow; - - ovs_mutex_lock(&mutex); - nf_flow = netflow_flow_lookup(nf, flow); - if (nf_flow) { - netflow_expire__(nf, nf_flow); - } - ovs_mutex_unlock(&mutex); -} - -void netflow_flow_clear(struct netflow *nf, struct flow *flow) OVS_EXCLUDED(mutex) { struct netflow_flow *nf_flow; @@ -296,8 +283,7 @@ netflow_flow_clear(struct netflow *nf, struct flow *flow) OVS_EXCLUDED(mutex) ovs_mutex_lock(&mutex); nf_flow = netflow_flow_lookup(nf, flow); if (nf_flow) { - ovs_assert(!nf_flow->packet_count); - ovs_assert(!nf_flow->byte_count); + netflow_expire__(nf, nf_flow); hmap_remove(&nf->flows, &nf_flow->hmap_node); free(nf_flow); } diff --git a/ofproto/netflow.h b/ofproto/netflow.h index e89b75e..94dd3ff 100644 --- a/ofproto/netflow.h +++ b/ofproto/netflow.h @@ -1,5 +1,5 @@ /* - * Copyright (c) 2008, 2009, 2010, 2011, 2013 Nicira, Inc. + * Copyright (c) 2008, 2009, 2010, 2011, 2013, 2014 Nicira, Inc. * * Licensed under the Apache License, Version 2.0 (the "License"); * you may not use this file except in compliance with the License. @@ -46,7 +46,6 @@ void netflow_unref(struct netflow *); bool netflow_exists(void); int netflow_set_options(struct netflow *, const struct netflow_options *); -void netflow_expire(struct netflow *, struct flow *); void netflow_run(struct netflow *); void netflow_wait(struct netflow *); diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c index 67a0ed8..d38b843 100644 --- a/ofproto/ofproto-dpif-upcall.c +++ b/ofproto/ofproto-dpif-upcall.c @@ -1219,7 +1219,6 @@ revalidate_ukey(struct udpif *udpif, struct udpif_key *ukey, exit: if (netflow) { if (!ok) { - netflow_expire(netflow, &flow); netflow_flow_clear(netflow, &flow); } netflow_unref(netflow); @@ -1304,7 +1303,6 @@ push_dump_ops__(struct udpif *udpif, struct dump_op *ops, size_t n_ops) xlate_actions_for_side_effects(&xin); if (netflow) { - netflow_expire(netflow, &flow); netflow_flow_clear(netflow, &flow); netflow_unref(netflow); } diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c index eded9d8..71eaad1 100644 --- a/ofproto/ofproto-dpif-xlate.c +++ b/ofproto/ofproto-dpif-xlate.c @@ -3956,7 +3956,6 @@ xlate_dev_unref(struct xc_entry *entry) static void xlate_cache_clear_netflow(struct netflow *netflow, struct flow *flow) { - netflow_expire(netflow, flow); netflow_flow_clear(netflow, flow); netflow_unref(netflow); free(flow); -- 1.7.10.4 On Tue, May 27, 2014 at 04:27:44PM +0000, Anoob Soman wrote: > Hi Ben, > > Do you have comments on the scenario in which the assert might happen ? > > Thanks, > Anoob. > ________________________________________ > From: Anoob Soman > Sent: 20 May 2014 20:06 > To: Ben Pfaff > Cc: dev@openvswitch.org > Subject: RE: [ovs-dev] [PATCH 1/1] netflow: Modify netflow_flow_clear() to do > netflow_expire__ > > Sorry, I should have made it clear in the commit message. In our testing, > occasionally ovs_assert(!nf_flow->packet_count) would fail, with the > following backtrace > > #0 0x00007f7d263ad265 in raise () from /lib64/libc.so.6 > #1 0x00007f7d263aed10 in abort () from /lib64/libc.so.6 > #2 0x00000000004a3b1e in ovs_abort_valist (err_no=<value optimized out>, > format=<value optimized out>, args=<value optimized out>) at lib/util.c:245 > #3 0x00000000004a8e78 in vlog_abort_valist (module_=<value optimized out>, > message=0x4f9410 "%s: assertion %s failed in %s()", args=0x42e3cfe0) at > lib/vlog.c:992 > #4 0x00000000004a8f06 in vlog_abort (module=0x10a7, message=0x1192 <Address > 0x1192 out of bounds>) at lib/vlog.c:1006 > #5 0x00000000004a3dab in ovs_assert_failure (where=0x6 <Address 0x6 out of > bounds>, function=0x80 <Address 0x80 out of bounds>, > condition=0xffffffffffffffff <Address 0xffffffffffffffff out of bounds>) > at lib/util.c:68 > #6 0x0000000000437db6 in netflow_flow_clear (nf=0x8722a0, flow=0x42e3ee20) > at ofproto/netflow.c:299 > #7 0x000000000042c6fb in revalidate_udumps (arg=0x7a4fe0) at > ofproto/ofproto-dpif-upcall.c:1391 > #8 udpif_revalidator (arg=0x7a4fe0) at ofproto/ofproto-dpif-upcall.c:724 > #9 0x00007f7d2616873d in __free_tcb () from /lib64/libpthread.so.0 > > Though netflow_flow_clear() and netflow_expire() are always called together, > there is a slight chance that the netflow stats are updated between the two > calls. > > With my patch, netflow_expire() is dead code (no one calls it). But I have > retained it for future use, if someone wants to expire netflow, without > explicitly taking locks. > > Thanks, > Anoob. > ________________________________________ > From: Ben Pfaff [b...@nicira.com] > Sent: 20 May 2014 19:25 > To: Anoob Soman > Cc: dev@openvswitch.org > Subject: Re: [ovs-dev] [PATCH 1/1] netflow: Modify netflow_flow_clear() to do > netflow_expire__ > > On Tue, May 20, 2014 at 12:40:35PM +0100, Anoob Soman wrote: > > This avoids causing failed assert(), when netflow_flow_clear() and > > netflow_expire() are called together. > > > > Signed-off-by: Anoob Soman <anoob.so...@citrix.com> > > I broke out the fix to the spelling of your name in AUTHORS (my fault, > sorry!) and pushed that. > > Can you explain a little more about the remainder of the patch? Does > the assertion trigger in practice (we have not received reports of > problems)? Can you explain the code flow that triggers it? > > Thanks, > > Ben. _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev