Sorry about the delay.  I took a few minutes to properly understand
the situation this morning.  I applied the following form of your
patch to master and branch-2.[123]:

--8<--------------------------cut here-------------------------->8--

From: Anoob Soman <anoob.so...@citrix.com>
Date: Tue, 20 May 2014 12:40:35 +0100
Subject: [PATCH] netflow: Fold netflow_expire() into netflow_flow_clear().

netflow_flow_clear() asserted that no packets or bytes were included
in the statistics for the flow being cleared.  Before threading Open
vSwitch, this assertion was always true because netflow_expire() was
always called before calling netflow_flow_clear().  Since Open
vSwitch was threaded, however, it was possible that a packet arrived
after netflow_expire() but before netflow_flow_clear(), since each of
these function separately took the netflow mutex.

This commit fixes the problem by merging netflow_expire() into
netflow_flow_clear(), under a single acquisition of the netflow
mutex.

Signed-off-by: Anoob Soman <anoob.so...@citrix.com>
[b...@nicira.com modified the patch to remove netflow_expire() and
 rewrote the commit message]
Signed-off-by: Ben Pfaff <b...@nicira.com>
---
 ofproto/netflow.c             |   16 +---------------
 ofproto/netflow.h             |    3 +--
 ofproto/ofproto-dpif-upcall.c |    2 --
 ofproto/ofproto-dpif-xlate.c  |    1 -
 4 files changed, 2 insertions(+), 20 deletions(-)

diff --git a/ofproto/netflow.c b/ofproto/netflow.c
index e9382af..c7af010 100644
--- a/ofproto/netflow.c
+++ b/ofproto/netflow.c
@@ -276,19 +276,6 @@ netflow_expire__(struct netflow *nf, struct netflow_flow 
*nf_flow)
 }
 
 void
-netflow_expire(struct netflow *nf, struct flow *flow) OVS_EXCLUDED(mutex)
-{
-    struct netflow_flow *nf_flow;
-
-    ovs_mutex_lock(&mutex);
-    nf_flow = netflow_flow_lookup(nf, flow);
-    if (nf_flow) {
-        netflow_expire__(nf, nf_flow);
-    }
-    ovs_mutex_unlock(&mutex);
-}
-
-void
 netflow_flow_clear(struct netflow *nf, struct flow *flow) OVS_EXCLUDED(mutex)
 {
     struct netflow_flow *nf_flow;
@@ -296,8 +283,7 @@ netflow_flow_clear(struct netflow *nf, struct flow *flow) 
OVS_EXCLUDED(mutex)
     ovs_mutex_lock(&mutex);
     nf_flow = netflow_flow_lookup(nf, flow);
     if (nf_flow) {
-        ovs_assert(!nf_flow->packet_count);
-        ovs_assert(!nf_flow->byte_count);
+        netflow_expire__(nf, nf_flow);
         hmap_remove(&nf->flows, &nf_flow->hmap_node);
         free(nf_flow);
     }
diff --git a/ofproto/netflow.h b/ofproto/netflow.h
index e89b75e..94dd3ff 100644
--- a/ofproto/netflow.h
+++ b/ofproto/netflow.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2008, 2009, 2010, 2011, 2013 Nicira, Inc.
+ * Copyright (c) 2008, 2009, 2010, 2011, 2013, 2014 Nicira, Inc.
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -46,7 +46,6 @@ void netflow_unref(struct netflow *);
 bool netflow_exists(void);
 
 int netflow_set_options(struct netflow *, const struct netflow_options *);
-void netflow_expire(struct netflow *, struct flow *);
 
 void netflow_run(struct netflow *);
 void netflow_wait(struct netflow *);
diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
index 67a0ed8..d38b843 100644
--- a/ofproto/ofproto-dpif-upcall.c
+++ b/ofproto/ofproto-dpif-upcall.c
@@ -1219,7 +1219,6 @@ revalidate_ukey(struct udpif *udpif, struct udpif_key 
*ukey,
 exit:
     if (netflow) {
         if (!ok) {
-            netflow_expire(netflow, &flow);
             netflow_flow_clear(netflow, &flow);
         }
         netflow_unref(netflow);
@@ -1304,7 +1303,6 @@ push_dump_ops__(struct udpif *udpif, struct dump_op *ops, 
size_t n_ops)
                 xlate_actions_for_side_effects(&xin);
 
                 if (netflow) {
-                    netflow_expire(netflow, &flow);
                     netflow_flow_clear(netflow, &flow);
                     netflow_unref(netflow);
                 }
diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index eded9d8..71eaad1 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -3956,7 +3956,6 @@ xlate_dev_unref(struct xc_entry *entry)
 static void
 xlate_cache_clear_netflow(struct netflow *netflow, struct flow *flow)
 {
-    netflow_expire(netflow, flow);
     netflow_flow_clear(netflow, flow);
     netflow_unref(netflow);
     free(flow);
-- 
1.7.10.4



On Tue, May 27, 2014 at 04:27:44PM +0000, Anoob Soman wrote:
> Hi Ben,
> 
> Do you have comments on the scenario in which the assert might happen ?
> 
> Thanks,
> Anoob.
> ________________________________________
> From: Anoob Soman
> Sent: 20 May 2014 20:06
> To: Ben Pfaff
> Cc: dev@openvswitch.org
> Subject: RE: [ovs-dev] [PATCH 1/1] netflow: Modify netflow_flow_clear() to do 
> netflow_expire__
> 
> Sorry, I should have made it clear in the commit message. In our testing, 
> occasionally ovs_assert(!nf_flow->packet_count) would fail, with the 
> following backtrace
> 
> #0  0x00007f7d263ad265 in raise () from /lib64/libc.so.6
> #1  0x00007f7d263aed10 in abort () from /lib64/libc.so.6
> #2  0x00000000004a3b1e in ovs_abort_valist (err_no=<value optimized out>, 
> format=<value optimized out>, args=<value optimized out>) at lib/util.c:245
> #3  0x00000000004a8e78 in vlog_abort_valist (module_=<value optimized out>, 
> message=0x4f9410 "%s: assertion %s failed in %s()", args=0x42e3cfe0) at 
> lib/vlog.c:992
> #4  0x00000000004a8f06 in vlog_abort (module=0x10a7, message=0x1192 <Address 
> 0x1192 out of bounds>) at lib/vlog.c:1006
> #5  0x00000000004a3dab in ovs_assert_failure (where=0x6 <Address 0x6 out of 
> bounds>, function=0x80 <Address 0x80 out of bounds>,
>     condition=0xffffffffffffffff <Address 0xffffffffffffffff out of bounds>) 
> at lib/util.c:68
> #6  0x0000000000437db6 in netflow_flow_clear (nf=0x8722a0, flow=0x42e3ee20) 
> at ofproto/netflow.c:299
> #7  0x000000000042c6fb in revalidate_udumps (arg=0x7a4fe0) at 
> ofproto/ofproto-dpif-upcall.c:1391
> #8  udpif_revalidator (arg=0x7a4fe0) at ofproto/ofproto-dpif-upcall.c:724
> #9  0x00007f7d2616873d in __free_tcb () from /lib64/libpthread.so.0
> 
> Though netflow_flow_clear() and netflow_expire() are always called together, 
> there is a slight chance that the netflow stats are updated between the two 
> calls.
> 
> With my patch, netflow_expire() is dead code (no one calls it). But I have 
> retained it for future use, if someone wants to expire netflow, without 
> explicitly taking locks.
> 
> Thanks,
> Anoob.
> ________________________________________
> From: Ben Pfaff [b...@nicira.com]
> Sent: 20 May 2014 19:25
> To: Anoob Soman
> Cc: dev@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH 1/1] netflow: Modify netflow_flow_clear() to do 
> netflow_expire__
> 
> On Tue, May 20, 2014 at 12:40:35PM +0100, Anoob Soman wrote:
> > This avoids causing failed assert(), when netflow_flow_clear() and
> > netflow_expire() are called together.
> >
> > Signed-off-by: Anoob Soman <anoob.so...@citrix.com>
> 
> I broke out the fix to the spelling of your name in AUTHORS (my fault,
> sorry!) and pushed that.
> 
> Can you explain a little more about the remainder of the patch?  Does
> the assertion trigger in practice (we have not received reports of
> problems)?  Can you explain the code flow that triggers it?
> 
> Thanks,
> 
> Ben.
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Reply via email to