Re: [ovs-dev] [patch v6 1/3] conntrack: Fix race for NAT cleanup.

2019-03-15 Thread Darrell Ball
On Fri, Mar 15, 2019 at 4:31 PM Ben Pfaff  wrote:

> On Fri, Mar 15, 2019 at 04:17:34PM -0700, Darrell Ball wrote:
> > On Fri, Mar 15, 2019 at 3:56 PM Ben Pfaff  wrote:
> >
> > > On Fri, Mar 15, 2019 at 03:01:18PM -0700, Darrell Ball wrote:
> > > > Reference lists are not fully protected during cleanup of
> > > > NAT connections where the bucket lock is transiently not held during
> > > > list traversal.  This can lead to referencing freed memory during
> > > > cleaning from multiple contexts.  Fix this by protecting with
> > > > the existing 'cleanup' mutex in the missed cases where 'conn_clean()'
> > > > is called.  'conntrack_flush()' is converted to expiry list traversal
> > > > to support the proper bucket level protection with the 'cleanup'
> mutex.
> > > >
> > > > The NAT exhaustion case cleanup in 'conn_not_found()' is also
> modified
> > > > to avoid the same issue.
> > > >
> > > > Fixes: 286de2729955 ("dpdk: Userspace Datapath: Introduce NAT
> Support.")
> > > > Reported-by: solomon 
> > > > Reported-at:
> > > https://mail.openvswitch.org/pipermail/ovs-dev/2019-March/357056.html
> > > > Tested-by: solomon 
> > > > Signed-off-by: Darrell Ball 
> > > > ---
> > > >
> > > > This patch is targeted for earlier releases as new RCU patches
> > > > inherently don't have this race.
> > > >
> > > > Backport to 2.8.
> > >
> > > Thanks.  I applied this to master, branch-2.11, and branch-2.10.  2.9
> > > and 2.8 had conflicts.
> > >
> >
> > I will create the backport patches for 2.9 and 2.8.
> >
> > Regarding branch 2.8 - it has diverged quite a bit in general from branch
> > >=2.9.
> > This is because of some small features/cosmetic changes that went into
> 2.9.
> > One option would be to bring 2.8 into sync with 2.9 in one patch.
> > Alternatively,
> > backport all dependencies  and fixes separately. Thoughts ?
>
> Usually it's better to backport them separately, because it makes it
> clear at a glance what happened in a list of patches.


yep


> But that can
> sometimes be a lot of trouble, and in that case a single patch can make
> sense.
>

It is the "lot of trouble" part I am trying to avoid. Let me see.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [patch v6 1/3] conntrack: Fix race for NAT cleanup.

2019-03-15 Thread Ben Pfaff
On Fri, Mar 15, 2019 at 04:17:34PM -0700, Darrell Ball wrote:
> On Fri, Mar 15, 2019 at 3:56 PM Ben Pfaff  wrote:
> 
> > On Fri, Mar 15, 2019 at 03:01:18PM -0700, Darrell Ball wrote:
> > > Reference lists are not fully protected during cleanup of
> > > NAT connections where the bucket lock is transiently not held during
> > > list traversal.  This can lead to referencing freed memory during
> > > cleaning from multiple contexts.  Fix this by protecting with
> > > the existing 'cleanup' mutex in the missed cases where 'conn_clean()'
> > > is called.  'conntrack_flush()' is converted to expiry list traversal
> > > to support the proper bucket level protection with the 'cleanup' mutex.
> > >
> > > The NAT exhaustion case cleanup in 'conn_not_found()' is also modified
> > > to avoid the same issue.
> > >
> > > Fixes: 286de2729955 ("dpdk: Userspace Datapath: Introduce NAT Support.")
> > > Reported-by: solomon 
> > > Reported-at:
> > https://mail.openvswitch.org/pipermail/ovs-dev/2019-March/357056.html
> > > Tested-by: solomon 
> > > Signed-off-by: Darrell Ball 
> > > ---
> > >
> > > This patch is targeted for earlier releases as new RCU patches
> > > inherently don't have this race.
> > >
> > > Backport to 2.8.
> >
> > Thanks.  I applied this to master, branch-2.11, and branch-2.10.  2.9
> > and 2.8 had conflicts.
> >
> 
> I will create the backport patches for 2.9 and 2.8.
> 
> Regarding branch 2.8 - it has diverged quite a bit in general from branch
> >=2.9.
> This is because of some small features/cosmetic changes that went into 2.9.
> One option would be to bring 2.8 into sync with 2.9 in one patch.
> Alternatively,
> backport all dependencies  and fixes separately. Thoughts ?

Usually it's better to backport them separately, because it makes it
clear at a glance what happened in a list of patches.  But that can
sometimes be a lot of trouble, and in that case a single patch can make
sense.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [patch v6 1/3] conntrack: Fix race for NAT cleanup.

2019-03-15 Thread Darrell Ball
On Fri, Mar 15, 2019 at 3:56 PM Ben Pfaff  wrote:

> On Fri, Mar 15, 2019 at 03:01:18PM -0700, Darrell Ball wrote:
> > Reference lists are not fully protected during cleanup of
> > NAT connections where the bucket lock is transiently not held during
> > list traversal.  This can lead to referencing freed memory during
> > cleaning from multiple contexts.  Fix this by protecting with
> > the existing 'cleanup' mutex in the missed cases where 'conn_clean()'
> > is called.  'conntrack_flush()' is converted to expiry list traversal
> > to support the proper bucket level protection with the 'cleanup' mutex.
> >
> > The NAT exhaustion case cleanup in 'conn_not_found()' is also modified
> > to avoid the same issue.
> >
> > Fixes: 286de2729955 ("dpdk: Userspace Datapath: Introduce NAT Support.")
> > Reported-by: solomon 
> > Reported-at:
> https://mail.openvswitch.org/pipermail/ovs-dev/2019-March/357056.html
> > Tested-by: solomon 
> > Signed-off-by: Darrell Ball 
> > ---
> >
> > This patch is targeted for earlier releases as new RCU patches
> > inherently don't have this race.
> >
> > Backport to 2.8.
>
> Thanks.  I applied this to master, branch-2.11, and branch-2.10.  2.9
> and 2.8 had conflicts.
>

I will create the backport patches for 2.9 and 2.8.

Regarding branch 2.8 - it has diverged quite a bit in general from branch
>=2.9.
This is because of some small features/cosmetic changes that went into 2.9.
One option would be to bring 2.8 into sync with 2.9 in one patch.
Alternatively,
backport all dependencies  and fixes separately. Thoughts ?
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [patch v6 1/3] conntrack: Fix race for NAT cleanup.

2019-03-15 Thread Ben Pfaff
On Fri, Mar 15, 2019 at 03:01:18PM -0700, Darrell Ball wrote:
> Reference lists are not fully protected during cleanup of
> NAT connections where the bucket lock is transiently not held during
> list traversal.  This can lead to referencing freed memory during
> cleaning from multiple contexts.  Fix this by protecting with
> the existing 'cleanup' mutex in the missed cases where 'conn_clean()'
> is called.  'conntrack_flush()' is converted to expiry list traversal
> to support the proper bucket level protection with the 'cleanup' mutex.
> 
> The NAT exhaustion case cleanup in 'conn_not_found()' is also modified
> to avoid the same issue.
> 
> Fixes: 286de2729955 ("dpdk: Userspace Datapath: Introduce NAT Support.")
> Reported-by: solomon 
> Reported-at: 
> https://mail.openvswitch.org/pipermail/ovs-dev/2019-March/357056.html
> Tested-by: solomon 
> Signed-off-by: Darrell Ball 
> ---
> 
> This patch is targeted for earlier releases as new RCU patches
> inherently don't have this race.
> 
> Backport to 2.8.

Thanks.  I applied this to master, branch-2.11, and branch-2.10.  2.9
and 2.8 had conflicts.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [patch v6 1/3] conntrack: Fix race for NAT cleanup.

2019-03-15 Thread Darrell Ball
Reference lists are not fully protected during cleanup of
NAT connections where the bucket lock is transiently not held during
list traversal.  This can lead to referencing freed memory during
cleaning from multiple contexts.  Fix this by protecting with
the existing 'cleanup' mutex in the missed cases where 'conn_clean()'
is called.  'conntrack_flush()' is converted to expiry list traversal
to support the proper bucket level protection with the 'cleanup' mutex.

The NAT exhaustion case cleanup in 'conn_not_found()' is also modified
to avoid the same issue.

Fixes: 286de2729955 ("dpdk: Userspace Datapath: Introduce NAT Support.")
Reported-by: solomon 
Reported-at: 
https://mail.openvswitch.org/pipermail/ovs-dev/2019-March/357056.html
Tested-by: solomon 
Signed-off-by: Darrell Ball 
---

This patch is targeted for earlier releases as new RCU patches
inherently don't have this race.

Backport to 2.8.

v6: Change some comments to lock annotations and added some other comments
to a few functions.
Changed the name of conn_lookup_any() to conn_lookup_def() to
reflect the usage, which is now enforced.

v5: Fix recent version compiler warning (Ilya) about local variable going
out of scope.  I don't think stack memory is reclaimed when block
scope ends, but theoretically some compiler could do it.

Moved "structure copy -> memcpy" changes into a new patch 3.

Add function comment to conn_clean_safe().

v4: Fix exhaustion case cleanup race in conn_not_found() as well.
At the same time, simplify the related code.

v3: Use cleanup_mutex in conntrack_destroy().

v2: Fix typo.



 lib/conntrack.c | 142 ++--
 1 file changed, 98 insertions(+), 44 deletions(-)

diff --git a/lib/conntrack.c b/lib/conntrack.c
index 691782c..dd6e19b 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -355,7 +355,7 @@ conntrack_destroy(struct conntrack *ct)
 struct conntrack_bucket *ctb = >buckets[i];
 struct conn *conn;
 
-ovs_mutex_destroy(>cleanup_mutex);
+ovs_mutex_lock(>cleanup_mutex);
 ct_lock_lock(>lock);
 HMAP_FOR_EACH_POP (conn, node, >connections) {
 if (conn->conn_type == CT_CONN_TYPE_DEFAULT) {
@@ -365,7 +365,9 @@ conntrack_destroy(struct conntrack *ct)
 }
 hmap_destroy(>connections);
 ct_lock_unlock(>lock);
+ovs_mutex_unlock(>cleanup_mutex);
 ct_lock_destroy(>lock);
+ovs_mutex_destroy(>cleanup_mutex);
 }
 ct_rwlock_wrlock(>resources_lock);
 struct nat_conn_key_node *nat_conn_key_node;
@@ -753,6 +755,27 @@ conn_lookup(struct conntrack *ct, const struct conn_key 
*key, long long now)
 return ctx.conn;
 }
 
+/* Only used when looking up 'CT_CONN_TYPE_DEFAULT' conns. */
+static struct conn *
+conn_lookup_def(const struct conn_key *key,
+const struct conntrack_bucket *ctb, uint32_t hash)
+OVS_REQUIRES(ctb->lock)
+{
+struct conn *conn = NULL;
+
+HMAP_FOR_EACH_WITH_HASH (conn, node, hash, >connections) {
+if (!conn_key_cmp(>key, key)
+&& conn->conn_type == CT_CONN_TYPE_DEFAULT) {
+break;
+}
+if (!conn_key_cmp(>rev_key, key)
+&& conn->conn_type == CT_CONN_TYPE_DEFAULT) {
+break;
+}
+}
+return conn;
+}
+
 static void
 conn_seq_skew_set(struct conntrack *ct, const struct conn_key *key,
   long long now, int seq_skew, bool seq_skew_dir)
@@ -823,6 +846,22 @@ conn_clean(struct conntrack *ct, struct conn *conn,
 }
 }
 
+/* Only called for 'CT_CONN_TYPE_DEFAULT' conns; must be called with no
+ * locks held and upon return no locks are held. */
+static void
+conn_clean_safe(struct conntrack *ct, struct conn *conn,
+struct conntrack_bucket *ctb, uint32_t hash)
+{
+ovs_mutex_lock(>cleanup_mutex);
+ct_lock_lock(>lock);
+conn = conn_lookup_def(>key, ctb, hash);
+if (conn) {
+conn_clean(ct, conn, ctb);
+}
+ct_lock_unlock(>lock);
+ovs_mutex_unlock(>cleanup_mutex);
+}
+
 static bool
 ct_verify_helper(const char *helper, enum ct_alg_ctl_type ct_alg_ctl)
 {
@@ -854,6 +893,7 @@ conn_not_found(struct conntrack *ct, struct dp_packet *pkt,
enum ct_alg_ctl_type ct_alg_ctl)
 {
 struct conn *nc = NULL;
+struct conn connl;
 
 if (!valid_new(pkt, >key)) {
 pkt->md.ct_state = CS_INVALID;
@@ -876,8 +916,9 @@ conn_not_found(struct conntrack *ct, struct dp_packet *pkt,
 }
 
 unsigned bucket = hash_to_bucket(ctx->hash);
-nc = new_conn(>buckets[bucket], pkt, >key, now);
-ctx->conn = nc;
+nc = 
+memset(nc, 0, sizeof *nc);
+memcpy(>key, >key, sizeof nc->key);
 nc->rev_key = nc->key;
 conn_key_reverse(>rev_key);
 
@@ -921,6 +962,7 @@ conn_not_found(struct conntrack *ct, struct dp_packet *pkt,
 ct_rwlock_wrlock(>resources_lock);
 bool nat_res =