Re: [Patch net] xt_RATEEST: acquire xt_rateest_mutex for hash insert

2018-01-31 Thread Eric Dumazet
On Wed, 2018-01-31 at 16:26 -0800, Cong Wang wrote:
> rateest_hash is supposed to be protected by xt_rateest_mutex.
> 
> Reported-by: 
> Fixes: 5859034d7eb8 ("[NETFILTER]: x_tables: add RATEEST target")
> Cc: Pablo Neira Ayuso 
> Signed-off-by: Cong Wang 
> ---
>  net/netfilter/xt_RATEEST.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/net/netfilter/xt_RATEEST.c b/net/netfilter/xt_RATEEST.c
> index 498b54fd04d7..83ec3a282755 100644
> --- a/net/netfilter/xt_RATEEST.c
> +++ b/net/netfilter/xt_RATEEST.c
> @@ -36,7 +36,9 @@ static void xt_rateest_hash_insert(struct xt_rateest *est)
>   unsigned int h;
>  
>   h = xt_rateest_hash(est->name);
> + mutex_lock(&xt_rateest_mutex);
>   hlist_add_head(&est->list, &rateest_hash[h]);
> + mutex_unlock(&xt_rateest_mutex);
>  }

We probably should make this module netns aware, otherwise bad things
will happen.

(It seems multiple threads could run, so getting the mutex twice 
(xt_rateest_lookup then xt_rateest_hash_insert() is racy)


--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Patch net] xt_RATEEST: acquire xt_rateest_mutex for hash insert

2018-01-31 Thread Cong Wang
rateest_hash is supposed to be protected by xt_rateest_mutex.

Reported-by: 
Fixes: 5859034d7eb8 ("[NETFILTER]: x_tables: add RATEEST target")
Cc: Pablo Neira Ayuso 
Signed-off-by: Cong Wang 
---
 net/netfilter/xt_RATEEST.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/netfilter/xt_RATEEST.c b/net/netfilter/xt_RATEEST.c
index 498b54fd04d7..83ec3a282755 100644
--- a/net/netfilter/xt_RATEEST.c
+++ b/net/netfilter/xt_RATEEST.c
@@ -36,7 +36,9 @@ static void xt_rateest_hash_insert(struct xt_rateest *est)
unsigned int h;
 
h = xt_rateest_hash(est->name);
+   mutex_lock(&xt_rateest_mutex);
hlist_add_head(&est->list, &rateest_hash[h]);
+   mutex_unlock(&xt_rateest_mutex);
 }
 
 struct xt_rateest *xt_rateest_lookup(const char *name)
-- 
2.13.0

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Patch net] xt_cgroup: initialize info->priv in cgroup_mt_check_v1()

2018-01-31 Thread Cong Wang
xt_cgroup_info_v1->priv is an internal pointer only used for kernel,
we should not trust what user-space provides.

Reported-by: 
Fixes: c38c4597e4bf ("netfilter: implement xt_cgroup cgroup2 path match")
Cc: Pablo Neira Ayuso 
Signed-off-by: Cong Wang 
---
 net/netfilter/xt_cgroup.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/netfilter/xt_cgroup.c b/net/netfilter/xt_cgroup.c
index 1db1ce59079f..891f4e7e8ea7 100644
--- a/net/netfilter/xt_cgroup.c
+++ b/net/netfilter/xt_cgroup.c
@@ -52,6 +52,7 @@ static int cgroup_mt_check_v1(const struct xt_mtchk_param 
*par)
return -EINVAL;
}
 
+   info->priv = NULL;
if (info->has_path) {
cgrp = cgroup_get_from_path(info->path);
if (IS_ERR(cgrp)) {
-- 
2.13.0

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH nf] netfilter: flowtable infrastructure depends on NETFILTER_INGRESS

2018-01-31 Thread Pablo Neira Ayuso
config NF_FLOW_TABLE depends on NETFILTER_INGRESS. If users forget to
enable this toggle, flowtable registration fails with EOPNOTSUPP.

Moreover, turn 'select NF_FLOW_TABLE' in every flowtable family flavour
into dependency instead, otherwise this new dependency on
NETFILTER_INGRESS causes a warning. This also allows us to remove the
explicit dependency between family flowtables <-> NF_TABLES and
NF_CONNTRACK, given they depend on the NF_FLOW_TABLE core that already
expresses the general dependencies for this new infrastructure.

Moreover, NF_FLOW_TABLE_INET depends on NF_FLOW_TABLE_IPV4 and
NF_FLOWTABLE_IPV6, which already depends on NF_FLOW_TABLE. So we can get
rid of direct dependency with NF_FLOW_TABLE.

In general, let's avoid 'select', it just makes things more complicated.

Signed-off-by: Pablo Neira Ayuso 
---
 net/ipv4/netfilter/Kconfig | 3 +--
 net/ipv6/netfilter/Kconfig | 3 +--
 net/netfilter/Kconfig  | 8 +---
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/net/ipv4/netfilter/Kconfig b/net/ipv4/netfilter/Kconfig
index 5f52236780b4..dfe6fa4ea554 100644
--- a/net/ipv4/netfilter/Kconfig
+++ b/net/ipv4/netfilter/Kconfig
@@ -80,8 +80,7 @@ endif # NF_TABLES
 
 config NF_FLOW_TABLE_IPV4
tristate "Netfilter flow table IPv4 module"
-   depends on NF_CONNTRACK && NF_TABLES
-   select NF_FLOW_TABLE
+   depends on NF_FLOW_TABLE
help
  This option adds the flow table IPv4 support.
 
diff --git a/net/ipv6/netfilter/Kconfig b/net/ipv6/netfilter/Kconfig
index 4a634b7a2c80..d395d1590699 100644
--- a/net/ipv6/netfilter/Kconfig
+++ b/net/ipv6/netfilter/Kconfig
@@ -73,8 +73,7 @@ endif # NF_TABLES
 
 config NF_FLOW_TABLE_IPV6
tristate "Netfilter flow table IPv6 module"
-   depends on NF_CONNTRACK && NF_TABLES
-   select NF_FLOW_TABLE
+   depends on NF_FLOW_TABLE
help
  This option adds the flow table IPv6 support.
 
diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index 9019fa98003d..d3220b43c832 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -666,8 +666,8 @@ endif # NF_TABLES
 
 config NF_FLOW_TABLE_INET
tristate "Netfilter flow table mixed IPv4/IPv6 module"
-   depends on NF_FLOW_TABLE_IPV4 && NF_FLOW_TABLE_IPV6
-   select NF_FLOW_TABLE
+   depends on NF_FLOW_TABLE_IPV4
+   depends on NF_FLOW_TABLE_IPV6
help
   This option adds the flow table mixed IPv4/IPv6 support.
 
@@ -675,7 +675,9 @@ config NF_FLOW_TABLE_INET
 
 config NF_FLOW_TABLE
tristate "Netfilter flow table module"
-   depends on NF_CONNTRACK && NF_TABLES
+   depends on NETFILTER_INGRESS
+   depends on NF_CONNTRACK
+   depends on NF_TABLES
help
  This option adds the flow table core infrastructure.
 
-- 
2.11.0

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH net] netfilter: on sockopt() acquire sock lock only in the required scope

2018-01-31 Thread Pablo Neira Ayuso
On Tue, Jan 30, 2018 at 07:01:40PM +0100, Paolo Abeni wrote:
> Syzbot reported several deadlocks in the netfilter area caused by
> rtnl lock and socket lock being acquired with a different order on
> different code paths, leading to backtraces like the following one:
[...]
> The problem, as Florian noted, is that nf_setsockopt() is always
> called with the socket held, even if the lock itself is required only
> for very tight scopes and only for some operation.

Applied, thanks Paolo.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] netfilter: fix out-of-bounds accesses in clusterip_tg_check()

2018-01-31 Thread Pablo Neira Ayuso
On Tue, Jan 30, 2018 at 03:21:34PM +0100, Dmitry Vyukov wrote:
> Commit 136e92bbec0a switched local_nodes from an array to a bitmask
> but did not add proper bounds checks. As the result
> clusterip_config_init_nodelist() can both over-read
> ipt_clusterip_tgt_info.local_nodes and over-write
> clusterip_config.local_nodes.
> 
> Add bounds checks for both.

Applied, thanks Dmitry.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] netfilter: fix pointer leaks to userspace

2018-01-31 Thread Pablo Neira Ayuso
On Mon, Jan 29, 2018 at 01:21:20PM +0100, Dmitry Vyukov wrote:
> Several netfilter matches and targets put kernel pointers into
> info objects, but don't set usersize in descriptors.
> This leads to kernel pointer leaks if a match/target is set
> and then read back to userspace.
> 
> Properly set usersize for these matches/targets.
> 
> Found with manual code inspection.

Applied, thanks!

I think this fixes:

ec2318904965 xtables: extend matches and targets with .usersize

So I'm going to add the Fixes: tag here, no problem.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/1] Fix wraparound bug introduced in commit 48596a8ddc46

2018-01-31 Thread Pablo Neira Ayuso
On Fri, Jan 12, 2018 at 11:16:50AM +0100, Jozsef Kadlecsik wrote:
> The patch "netfilter: ipset: Fix adding an IPv4 range containing
> more than 2^31 addresses" introduced a wraparound bug, which could
> lead to memory exhaustion when adding an x.x.x.x-255.255.255.255
> range to any hash:*net* types.
> 
> Fixes Netfilter's bugzilla id #1212, reported by Thomas Schwark.

For the record, I have renamed this to:

"netfilter: ipset: Fix wraparound in hash:*net* types"

And I have added the Fixes: tag.

Applied, thanks Jozsef!
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH nf-next] netfilter: ipv6: nf_defrag: Kill frag queue on RFC2460 failure

2018-01-31 Thread Subash Abhinov Kasiviswanathan
Failures were seen in ICMPv6 fragmentation timeout tests if they were
run after the RFC2460 failure tests. Kernel was not sending out the
ICMPv6 fragment reassembly time exceeded packet after the fragmentation
reassembly timeout of 1 minute had elapsed.

This happened because the frag queue was not released if an error in
IPv6 fragmentation header was detected by RFC2460.

Fixes: 83f1999caeb1 ("netfilter: ipv6: nf_defrag: Pass on packets to stack per 
RFC2460")
Signed-off-by: Subash Abhinov Kasiviswanathan 
---
Hi Pablo
This needs to be picked up for 4.16
---
 net/ipv6/netfilter/nf_conntrack_reasm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/ipv6/netfilter/nf_conntrack_reasm.c 
b/net/ipv6/netfilter/nf_conntrack_reasm.c
index ce53dcf..b84ce3e 100644
--- a/net/ipv6/netfilter/nf_conntrack_reasm.c
+++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
@@ -264,6 +264,7 @@ static int nf_ct_frag6_queue(struct frag_queue *fq, struct 
sk_buff *skb,
 * this case. -DaveM
 */
pr_debug("end of fragment not rounded to 8 bytes.\n");
+   inet_frag_kill(&fq->q, &nf_frags);
return -EPROTO;
}
if (end > fq->q.len) {
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [netfilter-core] kernel panic: Out of memory and no killable processes... (2)

2018-01-31 Thread Michal Hocko
On Tue 30-01-18 11:27:45, Andrew Morton wrote:
> On Tue, 30 Jan 2018 15:01:04 +0100 Michal Hocko  wrote:
> 
> > > Well, this is not about syzkaller, it merely pointed out a potential
> > > DoS... And that has to be addressed somehow.
> > 
> > So how about this?
> > ---
> 
> argh ;)

doh, those hardwired moves...

> > >From d48e950f1b04f234b57b9e34c363bdcfec10aeee Mon Sep 17 00:00:00 2001
> > From: Michal Hocko 
> > Date: Tue, 30 Jan 2018 14:51:07 +0100
> > Subject: [PATCH] net/netfilter/x_tables.c: make allocation less aggressive
> > 
> > syzbot has noticed that xt_alloc_table_info can allocate a lot of
> > memory. This is an admin only interface but an admin in a namespace
> > is sufficient as well. eacd86ca3b03 ("net/netfilter/x_tables.c: use
> > kvmalloc() in xt_alloc_table_info()") has changed the opencoded
> > kmalloc->vmalloc fallback into kvmalloc. It has dropped __GFP_NORETRY on
> > the way because vmalloc has simply never fully supported __GFP_NORETRY
> > semantic. This is still the case because e.g. page tables backing the
> > vmalloc area are hardcoded GFP_KERNEL.
> > 
> > Revert back to __GFP_NORETRY as a poors man defence against excessively
> > large allocation request here. We will not rule out the OOM killer
> > completely but __GFP_NORETRY should at least stop the large request
> > in most cases.
> > 
> > Fixes: eacd86ca3b03 ("net/netfilter/x_tables.c: use kvmalloc() in 
> > xt_alloc_table_info()")
> > Signed-off-by: Michal Hocko 
> > ---
> >  net/netfilter/x_tables.c | 8 +++-
> >  1 file changed, 7 insertions(+), 1 deletion(-)
> > 
> > diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
> > index d8571f414208..a5f5c29bcbdc 100644
> > --- a/net/netfilter/x_tables.c
> > +++ b/net/netfilter/x_tables.c
> > @@ -1003,7 +1003,13 @@ struct xt_table_info *xt_alloc_table_info(unsigned 
> > int size)
> > if ((SMP_ALIGN(size) >> PAGE_SHIFT) + 2 > totalram_pages)
> > return NULL;
> 
> offtopic: preceding comment here is "prevent them from hitting BUG() in
> vmalloc.c".  I suspect this is ancient code and vmalloc sure as heck
> shouldn't go BUG with this input.  And it should be using `sz' ;)

Yeah, we do not BUG but rather fail instead. See __vmalloc_node_range.
My excavation tools pointed me to "VM: Rework vmalloc code to support mapping 
of arbitray pages"
by Christoph back in 2002. So yes, we can safely remove it finally. Se
below.


>From 8d52e1d939d101b0dafed6ae5c3c1376183e65bb Mon Sep 17 00:00:00 2001
From: Michal Hocko 
Date: Wed, 31 Jan 2018 09:16:56 +0100
Subject: [PATCH] net/netfilter/x_tables.c: remove size check

Back in 2002 vmalloc used to BUG on too large sizes. We are much better
behaved these days and vmalloc simply returns NULL for those. Remove
the check as it simply not needed and the comment even misleading.

Suggested-by: Andrew Morton 
Signed-off-by: Michal Hocko 
---
 net/netfilter/x_tables.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
index b55ec5aa51a6..48a6ff620493 100644
--- a/net/netfilter/x_tables.c
+++ b/net/netfilter/x_tables.c
@@ -999,10 +999,6 @@ struct xt_table_info *xt_alloc_table_info(unsigned int 
size)
if (sz < sizeof(*info))
return NULL;
 
-   /* Pedantry: prevent them from hitting BUG() in vmalloc.c --RR */
-   if ((SMP_ALIGN(size) >> PAGE_SHIFT) + 2 > totalram_pages)
-   return NULL;
-
/* __GFP_NORETRY is not fully supported by kvmalloc but it should
 * work reasonably well if sz is too large and bail out rather
 * than shoot all processes down before realizing there is nothing
-- 
2.15.1

-- 
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [patch 1/1] net/netfilter/x_tables.c: make allocation less aggressive

2018-01-31 Thread Michal Hocko
On Tue 30-01-18 11:53:58, Eric Dumazet wrote:
[...]
> How is __GFP_NORETRY working exactly ?

this is what the documentation says.
 * __GFP_NORETRY: The VM implementation will try only very lightweight
 *   memory direct reclaim to get some memory under memory pressure (thus
 *   it can sleep). It will avoid disruptive actions like OOM killer. The
 *   caller must handle the failure which is quite likely to happen under
 *   heavy memory pressure. The flag is suitable when failure can easily be
 *   handled at small cost, such as reduced throughput

> Surely, if some firewall tools attempt to load a new iptables rules, we
> do not want to abort them if the request can be satisfied after few
> pages moved on swap or written back to disk.

I am not sure this really goes along with "namespace admin can request
arbitrary amount of memory" very well.
 
> We want to avoid huge allocations, but leave reasonable ones succeed.

Yes, that would be the best way forward. From the previous discussion
with Florian [1] it seems that "reasonable" is not that easy to figure
out. Anyway, this patch merely gets us back to pre eacd86ca3b03 times
where __GFP_NORETRY has been used for both kmalloc and vmalloc paths.
So it is more a quick band aid than a longterm solution.

[1] http://lkml.kernel.org/r/20180129165722.gf5...@breakpoint.cc
-- 
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html