Re: [ovs-dev] [PATCH v2] datapath: compat: Backports bugfixes for nf_conncount
Thanks YiHung for reviewing. I've sent out a new version. Best, Yifeng On Wed, Aug 7, 2019 at 10:24 AM Yi-Hung Wei wrote: > > On Tue, Aug 6, 2019 at 5:06 PM Yifeng Sun wrote: > > > > This patch backports several critical bug fixes related to > > locking and data consistency in nf_conncount code. > > > > This backport is based on the following upstream net-next upstream commits. > > 4cd273b ("netfilter: nf_conncount: don't skip eviction when age is > > negative") > > d4e7df1 ("netfilter: nf_conncount: use rb_link_node_rcu() instead of > > rb_link_node()") > > 53ca0f2 ("netfilter: nf_conncount: remove wrong condition check routine") > > 3c5cdb1 ("netfilter: nf_conncount: fix unexpected permanent node of list.") > > 31568ec ("netfilter: nf_conncount: fix list_del corruption in conn_free") > > fd3e71a ("netfilter: nf_conncount: use spin_lock_bh instead of spin_lock") > > > > This patch also added additional compat code so that it can build on > > all supported kernel versions. > > > > Travis tests are at > > https://travis-ci.org/yifsun/ovs-travis/builds/568603796 > > > > VMware-BZ: #2396471 > > > > CC: Taehee Yoo > > Signed-off-by: Yifeng Sun > > --- > > Hi Yifeng, > > Thanks for the patch. This backport looks good in general. > > I found that there are couple of fixes on nf_conncount in upstream > kernel. Since we would like to use df4a90250976 ("netfilter: > nf_conncount: merge lookup and add functions") to determine if the > kernel has all the fixes. It would be good to bring other commits on > the same date to our datapath compat code base as well. That is to > squash all the following commits. > > One more suggestion is to squash the other patch that you sent > ("datapath: Apply bug fixes of nf_conncount for different kernel > versions") into this one, since updating "HAVE_UPSTREAM_NF_CONNCOUNT" > is only useful with this patch. > > > a007232066f6 ("netfilter: nf_conncount: fix argument order to find_next_bit") > c80f10bc973a ("netfilter: nf_conncount: speculative garbage collection > on empty lists") > 2f971a8f4255 ("netfilter: nf_conncount: move all list iterations under > spinlock") > df4a90250976 ("netfilter: nf_conncount: merge lookup and add functions") > e8cfb372b38a ("netfilter: nf_conncount: restart search when nodes have > been erased") > f7fcc98dfc2d ("netfilter: nf_conncount: split gc in two phases") > 4cd273bb91b3 ("netfilter: nf_conncount: don't skip eviction when age > is negative") > c78e7818f16f ("netfilter: nf_conncount: replace CONNCOUNT_LOCK_SLOTS > with CONNCOUNT_SLOTS") > d4e7df16567b ("netfilter: nf_conncount: use rb_link_node_rcu() instead > of rb_link_node()") > 53ca0f2fec39 ("netfilter: nf_conncount: remove wrong condition check routine") > 3c5cdb17c3be ("netfilter: nf_conncount: fix unexpected permanent node of > list.") > 31568ec09ea0 ("netfilter: nf_conncount: fix list_del corruption in conn_free") > fd3e71a9f71e ("netfilter: nf_conncount: use spin_lock_bh instead of > spin_lock") > > Thanks, > > -Yi-Hung ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2] datapath: compat: Backports bugfixes for nf_conncount
On Tue, Aug 6, 2019 at 5:06 PM Yifeng Sun wrote: > > This patch backports several critical bug fixes related to > locking and data consistency in nf_conncount code. > > This backport is based on the following upstream net-next upstream commits. > 4cd273b ("netfilter: nf_conncount: don't skip eviction when age is negative") > d4e7df1 ("netfilter: nf_conncount: use rb_link_node_rcu() instead of > rb_link_node()") > 53ca0f2 ("netfilter: nf_conncount: remove wrong condition check routine") > 3c5cdb1 ("netfilter: nf_conncount: fix unexpected permanent node of list.") > 31568ec ("netfilter: nf_conncount: fix list_del corruption in conn_free") > fd3e71a ("netfilter: nf_conncount: use spin_lock_bh instead of spin_lock") > > This patch also added additional compat code so that it can build on > all supported kernel versions. > > Travis tests are at > https://travis-ci.org/yifsun/ovs-travis/builds/568603796 > > VMware-BZ: #2396471 > > CC: Taehee Yoo > Signed-off-by: Yifeng Sun > --- Hi Yifeng, Thanks for the patch. This backport looks good in general. I found that there are couple of fixes on nf_conncount in upstream kernel. Since we would like to use df4a90250976 ("netfilter: nf_conncount: merge lookup and add functions") to determine if the kernel has all the fixes. It would be good to bring other commits on the same date to our datapath compat code base as well. That is to squash all the following commits. One more suggestion is to squash the other patch that you sent ("datapath: Apply bug fixes of nf_conncount for different kernel versions") into this one, since updating "HAVE_UPSTREAM_NF_CONNCOUNT" is only useful with this patch. a007232066f6 ("netfilter: nf_conncount: fix argument order to find_next_bit") c80f10bc973a ("netfilter: nf_conncount: speculative garbage collection on empty lists") 2f971a8f4255 ("netfilter: nf_conncount: move all list iterations under spinlock") df4a90250976 ("netfilter: nf_conncount: merge lookup and add functions") e8cfb372b38a ("netfilter: nf_conncount: restart search when nodes have been erased") f7fcc98dfc2d ("netfilter: nf_conncount: split gc in two phases") 4cd273bb91b3 ("netfilter: nf_conncount: don't skip eviction when age is negative") c78e7818f16f ("netfilter: nf_conncount: replace CONNCOUNT_LOCK_SLOTS with CONNCOUNT_SLOTS") d4e7df16567b ("netfilter: nf_conncount: use rb_link_node_rcu() instead of rb_link_node()") 53ca0f2fec39 ("netfilter: nf_conncount: remove wrong condition check routine") 3c5cdb17c3be ("netfilter: nf_conncount: fix unexpected permanent node of list.") 31568ec09ea0 ("netfilter: nf_conncount: fix list_del corruption in conn_free") fd3e71a9f71e ("netfilter: nf_conncount: use spin_lock_bh instead of spin_lock") Thanks, -Yi-Hung ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v2] datapath: compat: Backports bugfixes for nf_conncount
This patch backports several critical bug fixes related to locking and data consistency in nf_conncount code. This backport is based on the following upstream net-next upstream commits. 4cd273b ("netfilter: nf_conncount: don't skip eviction when age is negative") d4e7df1 ("netfilter: nf_conncount: use rb_link_node_rcu() instead of rb_link_node()") 53ca0f2 ("netfilter: nf_conncount: remove wrong condition check routine") 3c5cdb1 ("netfilter: nf_conncount: fix unexpected permanent node of list.") 31568ec ("netfilter: nf_conncount: fix list_del corruption in conn_free") fd3e71a ("netfilter: nf_conncount: use spin_lock_bh instead of spin_lock") This patch also added additional compat code so that it can build on all supported kernel versions. Travis tests are at https://travis-ci.org/yifsun/ovs-travis/builds/568603796 VMware-BZ: #2396471 CC: Taehee Yoo Signed-off-by: Yifeng Sun --- v1->v2: Add fixes to support old kernel versions. Thanks YiHung for reviewing. acinclude.m4 | 2 ++ datapath/linux/Modules.mk| 3 +- datapath/linux/compat/include/linux/rbtree.h | 19 datapath/linux/compat/nf_conncount.c | 46 ++-- 4 files changed, 52 insertions(+), 18 deletions(-) create mode 100644 datapath/linux/compat/include/linux/rbtree.h diff --git a/acinclude.m4 b/acinclude.m4 index 116ffcf9096d..f8e856d3303f 100644 --- a/acinclude.m4 +++ b/acinclude.m4 @@ -1012,6 +1012,8 @@ AC_DEFUN([OVS_CHECK_LINUX_COMPAT], [ [OVS_DEFINE([HAVE_GRE_CALC_HLEN])]) OVS_GREP_IFELSE([$KSRC/include/net/gre.h], [ip_gre_calc_hlen], [OVS_DEFINE([HAVE_IP_GRE_CALC_HLEN])]) + OVS_GREP_IFELSE([$KSRC/include/linux/rbtree.h], [rb_link_node_rcu], + [OVS_DEFINE([HAVE_RBTREE_RB_LINK_NODE_RCU])]) if cmp -s datapath/linux/kcompat.h.new \ datapath/linux/kcompat.h >/dev/null 2>&1; then diff --git a/datapath/linux/Modules.mk b/datapath/linux/Modules.mk index cbb29f1c69d0..69d7faeac414 100644 --- a/datapath/linux/Modules.mk +++ b/datapath/linux/Modules.mk @@ -116,5 +116,6 @@ openvswitch_headers += \ linux/compat/include/uapi/linux/netfilter.h \ linux/compat/include/linux/mm.h \ linux/compat/include/linux/netfilter.h \ - linux/compat/include/linux/overflow.h + linux/compat/include/linux/overflow.h \ + linux/compat/include/linux/rbtree.h EXTRA_DIST += linux/compat/build-aux/export-check-whitelist diff --git a/datapath/linux/compat/include/linux/rbtree.h b/datapath/linux/compat/include/linux/rbtree.h new file mode 100644 index ..dbf20ff0e0b8 --- /dev/null +++ b/datapath/linux/compat/include/linux/rbtree.h @@ -0,0 +1,19 @@ +#ifndef __LINUX_RBTREE_WRAPPER_H +#define __LINUX_RBTREE_WRAPPER_H 1 + +#include_next + +#ifndef HAVE_RBTREE_RB_LINK_NODE_RCU +#include + +static inline void rb_link_node_rcu(struct rb_node *node, struct rb_node *parent, + struct rb_node **rb_link) +{ + node->__rb_parent_color = (unsigned long)parent; + node->rb_left = node->rb_right = NULL; + + rcu_assign_pointer(*rb_link, node); +} +#endif + +#endif /* __LINUX_RBTREE_WRAPPER_H */ diff --git a/datapath/linux/compat/nf_conncount.c b/datapath/linux/compat/nf_conncount.c index eeae440f872d..6a4d058e7fac 100644 --- a/datapath/linux/compat/nf_conncount.c +++ b/datapath/linux/compat/nf_conncount.c @@ -54,6 +54,7 @@ struct nf_conncount_tuple { struct nf_conntrack_zonezone; int cpu; u32 jiffies32; + booldead; struct rcu_head rcu_head; }; @@ -111,15 +112,16 @@ nf_conncount_add(struct nf_conncount_list *list, conn->zone = *zone; conn->cpu = raw_smp_processor_id(); conn->jiffies32 = (u32)jiffies; - spin_lock(>list_lock); + conn->dead = false; + spin_lock_bh(>list_lock); if (list->dead == true) { kmem_cache_free(conncount_conn_cachep, conn); - spin_unlock(>list_lock); + spin_unlock_bh(>list_lock); return NF_CONNCOUNT_SKIP; } list_add_tail(>node, >head); list->count++; - spin_unlock(>list_lock); + spin_unlock_bh(>list_lock); return NF_CONNCOUNT_ADDED; } @@ -136,19 +138,22 @@ static bool conn_free(struct nf_conncount_list *list, { bool free_entry = false; - spin_lock(>list_lock); + spin_lock_bh(>list_lock); - if (list->count == 0) { - spin_unlock(>list_lock); + if (conn->dead) { + spin_unlock_bh(>list_lock); return free_entry; } list->count--; + conn->dead = true; list_del_rcu(>node); - if (list->count == 0) + if (list->count == 0) { + list->dead = true;