Re: 4.19.x kernels oops in nf_conncount_destroy

2018-11-28 Thread Todd Eigenschink
This morning I found this thread, which I didn't see last night. I'm
not sure how I missed it, since I knew what I was searching for. It
includes a link to the same patches as I mentioned, but with a status
filter in the URL such that I can see the patches.

I applied the three patches and tested and it does NOT fix the problem
for me. It changes the behavior somewhat -- I saw several oopses (or
other noise) scroll past before it locked up. It also ended with
something like "eth0: pcnet32 transmit timed out", which I hadn't seen
before.

https://www.spinics.net/lists/netfilter-devel/msg57045.html

https://patchwork.ozlabs.org/project/netfilter-devel/list/?series=73972=*



Todd Eigenschink writes:
>EPILOGUE-AS-PREAMBLE:
>
>I had already typed most of this when I thought to search the
>netfilter-devel archive. I found this, which sounds an awful lot like
>my issue:
>
>https://www.spinics.net/lists/netfilter-devel/msg56882.html
>
>However, the patch link in the first followup seems empty, so I can't
>verify that it's the same thing or that the proposed fix works for me.
>
>
>--
>
>[1.] One line summary of the problem:
>
>4.19.x kernels oops in nf_conncount_destroy.
>
>
>[2.] Full description of the problem/report:
>
>We have been running 4.18.x kernels, up through 4.18.20, in production
>for a small web/email hosting operation with no issues. Everything
>relevant here is 32-bit Linux on VMware ESXi. Upon the release of
>4.18.20 and knowing that it was EOL, I stepped to then-current 4.19.4.
>
>One of our machines (a mail gateway) hung with an oops within a minute
>or two of boot. I rolled it back to deal with later.
>
>The next morning, another machine (coincidentally another mail
>gateway) crashed as well, and the tail end of the oops--that I could
>see on the 80x25 console--looked similar to what I remembered from the
>first. I rolled it back. If a third one happened, I was going to roll
>them all back. No other machines had issues.
>
>When 4.19.5 was released, I tried that, with the same effect, so I
>decided that since the fastest-crashing machine was, while production,
>not going to cause user-visible issues, I'd bisect to try to hunt down
>the cause. Every other machine, about 30 total, has been fine on
>4.19.4 / 4.19.5.
>
>Bisecting led me to this. 
>
>
>5c789e131cbb997a528451564ea4613e812fc718 is the first bad commit
>commit 5c789e131cbb997a528451564ea4613e812fc718
>Author: Yi-Hung Wei 
>Date:   Mon Jul 2 17:33:44 2018 -0700
>
>netfilter: nf_conncount: Add list lock and gc worker, and RCU for init 
> tree search
>
>This patch is originally from Florian Westphal.
>
>This patch does the following 3 main tasks.
>
>1) Add list lock to 'struct nf_conncount_list' so that we can
>alter the lists containing the individual connections without holding the
>main tree lock.  It would be useful when we only need to add/remove to/from
>a list without allocate/remove a node in the tree.  With this change, we
>update nft_connlimit accordingly since we longer need to maintain
>a list lock in nft_connlimit now.
>
>2) Use RCU for the initial tree search to improve tree look up performance.
>
>3) Add a garbage collection worker. This worker is schedule when there
>are excessive tree node that needed to be recycled.
>
>Moreover,the rbnode reclaim logic is moved from search tree to insert tree
>to avoid race condition.
>
>Signed-off-by: Yi-Hung Wei 
>Signed-off-by: Florian Westphal 
>Signed-off-by: Pablo Neira Ayuso 
>
>:04 04 3117a9e5f5d91c55bfcb495ed0cf20aac47beb4c 
>eb16c3c84edfa70268c651490dd5031a6474ca2d M include
>:04 04 f69622ea9603500bc837f6348bc7ffe6e4edefda 
>8983dc24192abb1ae1925f023a495c39d171021c M net
>
>
>And it makes perfect sense: Our only two machines that use
>nf_connlimit in their firewall configs are those two mail gateways. I
>imagine that the speed at which they oops has to do with their
>specific connlimit settings and how quickly they accumulate enough
>traffic to hit one of them.
>
>Oops details are below.
>
>
>[3.] Keywords (i.e., modules, networking, kernel):
>
>netfilter, nf_conncount, nf_connlimit
>
>
>[4.] Kernel information
>
>[4.1.] Kernel version (from /proc/version):
>
>[4.2.] Kernel .config file:
>
>grep = .config, net-related stuff only:
>
>
>CONFIG_NET=y
>CONFIG_NET_INGRESS=y
>CONFIG_PACKET=y
>CONFIG_UNIX=y
>CONFIG_XFRM=y
>CONFIG_XFRM_ALGO=y
>CONFIG_XFRM_USER=y
>CONFIG_XFRM_SUB_POLICY=y
>CONFIG_XFRM_IPCOMP=m
>CONFIG_NET_KEY=m
>CONFIG_INET=y
>CONFIG_IP_MULTICAST=y
>CONFIG_IP_ADVANCED_ROUTER=y
>CONFIG_IP_MULTIPLE_TABLES=y
>CONFIG_INET_AH=m
>CONFIG_INET_ESP=m
>CONFIG_INET_IPCOMP=m
>CONFIG_INET_XFRM_TUNNEL=m
>CONFIG_INET_TUNNEL=m
>CONFIG_INET_XFRM_MODE_TRANSPORT=m
>CONFIG_INET_XFRM_MODE_TUNNEL=m
>CONFIG_INET_XFRM_MODE_BEET=m
>CONFIG_TCP_CONG_CUBIC=y
>CONFIG_DEFAULT_TCP_CONG="cubic"

Re: RFC: Designing per chain rule cache support in libnftnl

2018-11-28 Thread Phil Sutter
Hi,

On Wed, Nov 28, 2018 at 02:51:54PM +0100, Pablo Neira Ayuso wrote:
> On Wed, Nov 28, 2018 at 02:21:01PM +0100, Phil Sutter wrote:
> > Hi Pablo,
> > 
> > On Fri, Nov 23, 2018 at 01:35:17PM +0100, Pablo Neira Ayuso wrote:
> > > On Fri, Nov 23, 2018 at 12:25:45PM +0100, Florian Westphal wrote:
> > > > Phil Sutter  wrote:
> > > > > > If user doesn't want it cleared at nftnl_chain_free() time they can
> > > > > > always allocate a new nftnl_rule_list and splice to that list.
> > > > > 
> > > > > Good point. What do you think about the simple approach of 
> > > > > introducing:
> > > > > 
> > > > > | struct nftnl_rule_list *nftnl_chain_get_rule_list(const struct 
> > > > > nftnl_chain *);
> > > > 
> > > > Looks fine to me.
> > > > 
> > > > > This would allow to reuse nftnl_rule_list routines from 
> > > > > libnftnl/rule.h.
> > > > > One potential problem I see is that users may try to call
> > > > > nftnl_rule_list_free(). Can we prevent that somehow?
> > > > 
> > > > Document that nftnl_rule_list_free() pairs with nftnl_rule_list_alloc() 
> > > > :-)
> > > > 
> > > > I don't think its an issue.
> > > > We could add a 'bool make_free_no_op' to nftnl_rule_list and set that to
> > > > true for nftnl_rule_list structures that are allocated indirectly on
> > > > behalf of nftnl_chain struct, but I think thats taking things too far.
> > > 
> > > Can we have an interface similar to nftnl_rule_add_expr() to add rules
> > > to chains?
> > > 
> > > So we add list field to nftnl_chain, and this new interface to
> > > add/delete rules.
> > 
> > I didn't look at struct nftnl_rule yet. OK, that seems rather different
> > from what I had in mind. So I guess your idea would be to add a field of
> > type struct list_head instead of struct nftnl_rule_list and implement
> > struct nftnl_rule_iter and relevant functions?
> 
> Yes. We would make explicit the relation between the objects, which
> makes sense to me. So far only nftnl_rule and nftnl_expr are basically
> "linked" in some way.
> 
> Would this approach for you?

Yes, that's fine with me. My idea was to reuse the nftnl_rule_list API,
but creating chains' rule lists in a consistent manner with respect to
rules' expression lists is probably more important long-term.

Thanks, Phil


Re: RFC: Designing per chain rule cache support in libnftnl

2018-11-28 Thread Pablo Neira Ayuso
On Wed, Nov 28, 2018 at 02:21:01PM +0100, Phil Sutter wrote:
> Hi Pablo,
> 
> On Fri, Nov 23, 2018 at 01:35:17PM +0100, Pablo Neira Ayuso wrote:
> > On Fri, Nov 23, 2018 at 12:25:45PM +0100, Florian Westphal wrote:
> > > Phil Sutter  wrote:
> > > > > If user doesn't want it cleared at nftnl_chain_free() time they can
> > > > > always allocate a new nftnl_rule_list and splice to that list.
> > > > 
> > > > Good point. What do you think about the simple approach of introducing:
> > > > 
> > > > | struct nftnl_rule_list *nftnl_chain_get_rule_list(const struct 
> > > > nftnl_chain *);
> > > 
> > > Looks fine to me.
> > > 
> > > > This would allow to reuse nftnl_rule_list routines from libnftnl/rule.h.
> > > > One potential problem I see is that users may try to call
> > > > nftnl_rule_list_free(). Can we prevent that somehow?
> > > 
> > > Document that nftnl_rule_list_free() pairs with nftnl_rule_list_alloc() 
> > > :-)
> > > 
> > > I don't think its an issue.
> > > We could add a 'bool make_free_no_op' to nftnl_rule_list and set that to
> > > true for nftnl_rule_list structures that are allocated indirectly on
> > > behalf of nftnl_chain struct, but I think thats taking things too far.
> > 
> > Can we have an interface similar to nftnl_rule_add_expr() to add rules
> > to chains?
> > 
> > So we add list field to nftnl_chain, and this new interface to
> > add/delete rules.
> 
> I didn't look at struct nftnl_rule yet. OK, that seems rather different
> from what I had in mind. So I guess your idea would be to add a field of
> type struct list_head instead of struct nftnl_rule_list and implement
> struct nftnl_rule_iter and relevant functions?

Yes. We would make explicit the relation between the objects, which
makes sense to me. So far only nftnl_rule and nftnl_expr are basically
"linked" in some way.

Would this approach for you?

Thanks!


[PATCH nft] tests: fix return codes

2018-11-28 Thread Arturo Borrero Gonzalez
Please,

consider merging the attached patch.

thanks.
commit 3497067ca187047c61d89ccad6eab4ebf5df9219
Author: Arturo Borrero Gonzalez 
Date:   Wed Nov 28 14:31:57 2018 +0100

tests: fix return codes

Try to return != 0 if a testsuite fails.

Signed-off-by: Arturo Borrero Gonzalez 

diff --git a/tests/build/run-tests.sh b/tests/build/run-tests.sh
index 626f6fd..b0560da 100755
--- a/tests/build/run-tests.sh
+++ b/tests/build/run-tests.sh
@@ -52,4 +52,4 @@ done
 rm -rf $tmpdir
 
 echo "results: [OK] $ok [FAILED] $failed [TOTAL] $((ok+failed))"
-exit 0
+exit $failed
diff --git a/tests/monitor/run-tests.sh b/tests/monitor/run-tests.sh
index f408988..0478cf6 100755
--- a/tests/monitor/run-tests.sh
+++ b/tests/monitor/run-tests.sh
@@ -17,7 +17,7 @@ fi
 testdir=$(mktemp -d)
 if [ ! -d $testdir ]; then
 	echo "Failed to create test directory" >&2
-	exit 0
+	exit 1
 fi
 trap "rm -rf $testdir; $nft flush ruleset" EXIT
 
diff --git a/tests/shell/run-tests.sh b/tests/shell/run-tests.sh
index 5b0ec41..fdca5fb 100755
--- a/tests/shell/run-tests.sh
+++ b/tests/shell/run-tests.sh
@@ -152,4 +152,4 @@ echo ""
 msg_info "results: [OK] $ok [FAILED] $failed [TOTAL] $((ok+failed))"
 
 kernel_cleanup
-exit 0
+exit $failed


Re: RFC: Designing per chain rule cache support in libnftnl

2018-11-28 Thread Phil Sutter
Hi Pablo,

On Fri, Nov 23, 2018 at 01:35:17PM +0100, Pablo Neira Ayuso wrote:
> On Fri, Nov 23, 2018 at 12:25:45PM +0100, Florian Westphal wrote:
> > Phil Sutter  wrote:
> > > > If user doesn't want it cleared at nftnl_chain_free() time they can
> > > > always allocate a new nftnl_rule_list and splice to that list.
> > > 
> > > Good point. What do you think about the simple approach of introducing:
> > > 
> > > | struct nftnl_rule_list *nftnl_chain_get_rule_list(const struct 
> > > nftnl_chain *);
> > 
> > Looks fine to me.
> > 
> > > This would allow to reuse nftnl_rule_list routines from libnftnl/rule.h.
> > > One potential problem I see is that users may try to call
> > > nftnl_rule_list_free(). Can we prevent that somehow?
> > 
> > Document that nftnl_rule_list_free() pairs with nftnl_rule_list_alloc() :-)
> > 
> > I don't think its an issue.
> > We could add a 'bool make_free_no_op' to nftnl_rule_list and set that to
> > true for nftnl_rule_list structures that are allocated indirectly on
> > behalf of nftnl_chain struct, but I think thats taking things too far.
> 
> Can we have an interface similar to nftnl_rule_add_expr() to add rules
> to chains?
> 
> So we add list field to nftnl_chain, and this new interface to
> add/delete rules.

I didn't look at struct nftnl_rule yet. OK, that seems rather different
from what I had in mind. So I guess your idea would be to add a field of
type struct list_head instead of struct nftnl_rule_list and implement
struct nftnl_rule_iter and relevant functions?

> We can probably deprecate the existing list interface if we follow
> that procedure after a bit of time in favour of this one.

OK, cool.

Thanks, Phil


Re: Proposal: rename of arptables.git and ebtables.git

2018-11-28 Thread Arturo Borrero Gonzalez
On 11/28/18 1:44 PM, Arturo Borrero Gonzalez wrote:
> Hi,
> 
> Now that the iptables.git repo offers arptables-nft and ebtables-nft,
> arptables.git holds arptables-legacy, etc, why we don't just rename the
> repos?
> 
> * from arptables.git to arptables-legacy.git
> * from ebtables.git to ebtables-legacy.git
> 
> This rename should help distros understand the differences between them
> and better accommodate the packaging of all the related tooling.
> 
> Mind that the rename may have side effects in tarball
> generation/publishing etc. I would expect the new arptables tarball to
> include the '-legacy' keyword, and same for ebtables.
> 
> If we go ahead with the rename, a new release is worth having,
> announcing these changes as well.
> 

Also,

please consider applying the attached patch.

thanks.
commit ee8a588338e7c75e90fcc49a69e3d3b018063828
Author: Arturo Borrero Gonzalez 
Date:   Wed Nov 28 13:47:28 2018 +0100

ebtables: legacy renaming

The original ebtables tool is now the legacy version, let's rename it.

A more uptodate client of the ebtables tool is provided in the iptables
tarball (ebtables-nft). The new tool was formerly known as ebtables-compat.

The new -legacy binary has no problem if called via a symlink with the
'ebtables' name, so users can still name this binary with whatever name.

Signed-off-by: Arturo Borrero Gonzalez 

diff --git a/Makefile.am b/Makefile.am
index 14938fe..b16a4d6 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -26,11 +26,11 @@ AM_CPPFLAGS = ${regular_CPPFLAGS} -I${top_srcdir}/include \
 	-DEBTD_PIPE=\"${PIPE}\" -DEBTD_PIPE_DIR=\"${PIPE_DIR}\"
 AM_CFLAGS = ${regular_CFLAGS}
 
-sbin_PROGRAMS = ebtables ebtablesd ebtablesu ebtables-restore
+sbin_PROGRAMS = ebtables-legacy ebtablesd ebtablesu ebtables-legacy-restore
 EXTRA_PROGRAMS = static examples/ulog/test_ulog
 sysconf_DATA = ethertypes
-sbin_SCRIPTS = ebtables-save
-man8_MANS = ebtables.8
+sbin_SCRIPTS = ebtables-legacy-save
+man8_MANS = ebtables-legacy.8
 lib_LTLIBRARIES = libebtc.la
 
 libebtc_la_SOURCES = \
@@ -47,21 +47,22 @@ libebtc_la_SOURCES = \
 	extensions/ebtable_nat.c
 # Make sure ebtables.c can be built twice
 libebtc_la_CPPFLAGS = ${AM_CPPFLAGS}
-ebtables_SOURCES = ebtables-standalone.c
-ebtables_LDADD = libebtc.la
+ebtables_legacy_SOURCES = ebtables-standalone.c
+ebtables_legacy_LDADD = libebtc.la
 ebtablesd_LDADD = libebtc.la
-ebtables_restore_LDADD = libebtc.la
+ebtables_legacy_restore_SOURCES = ebtables-restore.c
+ebtables_legacy_restore_LDADD = libebtc.la
 static_SOURCES = ebtables.c
 static_LDFLAGS = -static
 static_LDADD = libebtc.la
 examples_ulog_test_ulog_SOURCES = examples/ulog/test_ulog.c getethertype.c
 
 daemon: ebtablesd ebtablesu
-exec: ebtables ebtables-restore
+exec: ebtables-legacy ebtables-legacy-restore
 
-CLEANFILES = ebtables-save ebtables.sysv ebtables-config ebtables.8
+CLEANFILES = ebtables-legacy-save ebtables.sysv ebtables-config ebtables-legacy.8
 
-ebtables-save: ebtables-save.in ${top_builddir}/config.status
+ebtables-legacy-save: ebtables-save.in ${top_builddir}/config.status
 	${AM_V_GEN}sed -e 's![@]sbindir@!${sbindir}!g' <$< >$@
 
 ebtables.sysv: ebtables.sysv.in ${top_builddir}/config.status
@@ -70,7 +71,7 @@ ebtables.sysv: ebtables.sysv.in ${top_builddir}/config.status
 ebtables-config: ebtables-config.in ${top_builddir}/config.status
 	${AM_V_GEN}sed -e 's![@]sysconfigdir@!${sysconfigdir}!g' <$< >$@
 
-ebtables.8: ebtables.8.in ${top_builddir}/config.status
+ebtables-legacy.8: ebtables-legacy.8.in ${top_builddir}/config.status
 	${AM_V_GEN}sed -e 's![@]PACKAGE_VERSION!${PACKAGE_VERSION}!g' \
 		-e 's![@]PACKAGE_DATE@!${PROGDATE}!g' \
 		-e 's![@]LOCKFILE@!${LOCKFILE}!g' <$< >$@
diff --git a/ebtables.8.in b/ebtables-legacy.8.in
similarity index 98%
rename from ebtables.8.in
rename to ebtables-legacy.8.in
index 3e97c84..3417045 100644
--- a/ebtables.8.in
+++ b/ebtables-legacy.8.in
@@ -24,7 +24,7 @@
 .\" 
 .\"
 .SH NAME
-ebtables (@PACKAGE_VERSION@) \- Ethernet bridge frame table administration
+ebtables-legacy (@PACKAGE_VERSION@) \- Ethernet bridge frame table administration (legacy)
 .SH SYNOPSIS
 .BR "ebtables " [ -t " table ] " - [ ACDI "] chain rule specification [match extensions] [watcher extensions] target"
 .br
@@ -50,6 +50,18 @@ ebtables (@PACKAGE_VERSION@) \- Ethernet bridge frame table administration
 .br
 .BR "ebtables " [ -t " table ] [" --atomic-file " file] " --atomic-save
 .br
+
+.SH LEGACY
+This tool uses the old xtables/setsockopt framework, and is a legacy version
+of ebtables. That means that a new, more modern tool exists with the same
+functionality using the nf_tables framework and you are encouraged to migrate now.
+The new binaries (known as ebtables-nft and formerly known as ebtables-compat)
+uses the same syntax and semantics than this legacy one.
+
+You can still use this legacy tool. You should probably get some specific
+information from your Linux distribution or vendor.
+More docs are 

Proposal: rename of arptables.git and ebtables.git

2018-11-28 Thread Arturo Borrero Gonzalez
Hi,

Now that the iptables.git repo offers arptables-nft and ebtables-nft,
arptables.git holds arptables-legacy, etc, why we don't just rename the
repos?

* from arptables.git to arptables-legacy.git
* from ebtables.git to ebtables-legacy.git

This rename should help distros understand the differences between them
and better accommodate the packaging of all the related tooling.

Mind that the rename may have side effects in tarball
generation/publishing etc. I would expect the new arptables tarball to
include the '-legacy' keyword, and same for ebtables.

If we go ahead with the rename, a new release is worth having,
announcing these changes as well.


Re: [PATCH nf] netfilter: nf_tables: deactivate expressions in rule replecement routine

2018-11-28 Thread Pablo Neira Ayuso
Applied, thanks.