Proposal: Add config option to set xtable_lock wait = true.

2018-04-04 Thread Jack Ma

Hi Florian & Pablo,

I noticed that lots iptables users are likely to miss the '-w' option while 
implementing multi-process program.

Due to the fact that the iptables and ip6tables do not wait for the 
xtable_lock, people can easily mis-configure

their iptables command because of concurrency issues.

I'd like to propose a global config option to set the default wait interval and 
allow iptables to always wait for the lock.


ie. 

" iptables --always-wait (ms) " if no value is specified, then use the default 
1 second.

I found it hard to see any users who may wish to run iptables command without 
lock.

Does this proposal sound sane-ish ?

Regards,
Jack

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH nf] netfilter: ebtables: don't attempt to allocate 0-sized compat array

2018-04-04 Thread Florian Westphal
Dmitry reports 32bit ebtables on 64bit kernel got broken by
a recent change that returns -EINVAL when ruleset has no entries.

ebtables however only counts user-defined chains, so for the
initial table nentries will be 0.

Don't try to allocate the compat array in this case, as no user
defined rules exist no rule will need 64bit translation.

Reported-by: Dmitry Vyukov 
Fixes: 7d7d7e02111e9 ("netfilter: compat: reject huge allocation requests")
Signed-off-by: Florian Westphal 
---
This change fixes ebtables for me;
adding a --limit 1/s rule via ebtables32 is also ok after this.

I double-checked, iptables32 is fine as-is.

diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c
index 032e0fe45940..28a4c3490359 100644
--- a/net/bridge/netfilter/ebtables.c
+++ b/net/bridge/netfilter/ebtables.c
@@ -1825,13 +1825,14 @@ static int compat_table_info(const struct 
ebt_table_info *info,
 {
unsigned int size = info->entries_size;
const void *entries = info->entries;
-   int ret;
 
newinfo->entries_size = size;
-
-   ret = xt_compat_init_offsets(NFPROTO_BRIDGE, info->nentries);
-   if (ret)
-   return ret;
+   if (info->nentries) {
+   int ret = xt_compat_init_offsets(NFPROTO_BRIDGE,
+info->nentries);
+   if (ret)
+   return ret;
+   }
 
return EBT_ENTRY_ITERATE(entries, size, compat_calc_entry, info,
entries, newinfo);
-- 
2.16.1

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: compat ebtables broke in syzkaller

2018-04-04 Thread Dmitry Vyukov
On Wed, Apr 4, 2018 at 9:04 PM, Florian Westphal  wrote:
> Dmitry Vyukov  wrote:
>> One question:
>>
>> > We will need to special-case compat_table_info() in ebtables.c to
>> > either not allocate the compat array for nentries == 0, or pretend
>> > it was 1.
>>
>> nentries == 0 is returned to us by EBT_SO_GET_INIT_INFO, and I think
>> there are actually 3 or 4 entries, and we do want to get all of them
>> (to restore them later). If EBT_SO_GET_INIT_ENTRIES will give us no
>> entries (or 1 entry), then we won't be able to restore them later. Am
>> I missing something?
>
> nentries are the user-defined entries, and there are none in the initial
> (empty) table, so this should be fine.

ah, ok, then we are good
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: compat ebtables broke in syzkaller

2018-04-04 Thread Florian Westphal
Dmitry Vyukov  wrote:
> One question:
> 
> > We will need to special-case compat_table_info() in ebtables.c to
> > either not allocate the compat array for nentries == 0, or pretend
> > it was 1.
> 
> nentries == 0 is returned to us by EBT_SO_GET_INIT_INFO, and I think
> there are actually 3 or 4 entries, and we do want to get all of them
> (to restore them later). If EBT_SO_GET_INIT_ENTRIES will give us no
> entries (or 1 entry), then we won't be able to restore them later. Am
> I missing something?

nentries are the user-defined entries, and there are none in the initial
(empty) table, so this should be fine.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: compat ebtables broke in syzkaller

2018-04-04 Thread Dmitry Vyukov
On Wed, Apr 4, 2018 at 8:51 PM, Florian Westphal  wrote:
> Dmitry Vyukov  wrote:
>> Hi Florian, Pablo,
>>
>> After the recent netfilter pull syzkaller in compat mode started failing 
>> with:
>>
>> getsockopt(EBT_SO_GET_INIT_ENTRIES) (errno 22)
>>
>> I think this is caused by:
>>
>> commit 7d7d7e02111e9a4dc9d0658597f528f815d820fd
>> Author: Florian Westphal 
>> Date:   Tue Feb 27 19:42:35 2018 +0100
>> netfilter: compat: reject huge allocation requests
>>
>> But I don't know if it's a problem with kernel or with our code (it
>
> kernel.
>
>> The idea behind checkpoint_ebtables() and reset_ebtables() is to reset
>> ebtables to initial state after each test (unfortunately, creating new
>> net namespaces is too damn expensive).
>>
>> It does EBT_SO_GET_INIT_INFO and then EBT_SO_GET_INIT_ENTRIES to
>> memorize initial state. And EBT_SO_GET_INIT_ENTRIES started failing.
>> It fails on the first filter table, here is debug output:
>>
>> checkpoint ebtable filter: entries=0 hooks=e size=144
>> getsockopt(EBT_SO_GET_INIT_ENTRIES) (errno 22)
>
> Yes, looks like ebtables 'entries' only counts user rules, not base
> policies.
>
>> Any insight will be appreciated. Is it something to fix in our code?
>
> No.
>
>> Also, what's the difference between EBT_SO_GET_INFO/ENTRIES and
>> EBT_SO_GET_INIT_INFO/ENTRIES? I guessed from the name that INIT is
>> what we want to use if we want to memorize initial state. Is it
>> correct?
>
> Looks like it.
>
> We will need to special-case compat_table_info() in ebtables.c to
> either not allocate the compat array for nentries == 0, or pretend
> it was 1.
>
> Dmitry, I can send a patch tomorrow, if you can/want to submit a patch
> yourself go ahead.

Thanks!

I will wait for you. If it gets into 4.17, then it's good enough for
us. For now compat syzbot just tests the last good image before the
pull.

One question:

> We will need to special-case compat_table_info() in ebtables.c to
> either not allocate the compat array for nentries == 0, or pretend
> it was 1.

nentries == 0 is returned to us by EBT_SO_GET_INIT_INFO, and I think
there are actually 3 or 4 entries, and we do want to get all of them
(to restore them later). If EBT_SO_GET_INIT_ENTRIES will give us no
entries (or 1 entry), then we won't be able to restore them later. Am
I missing something?
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: compat ebtables broke in syzkaller

2018-04-04 Thread Florian Westphal
Dmitry Vyukov  wrote:
> Hi Florian, Pablo,
> 
> After the recent netfilter pull syzkaller in compat mode started failing with:
> 
> getsockopt(EBT_SO_GET_INIT_ENTRIES) (errno 22)
> 
> I think this is caused by:
> 
> commit 7d7d7e02111e9a4dc9d0658597f528f815d820fd
> Author: Florian Westphal 
> Date:   Tue Feb 27 19:42:35 2018 +0100
> netfilter: compat: reject huge allocation requests
> 
> But I don't know if it's a problem with kernel or with our code (it

kernel.

> The idea behind checkpoint_ebtables() and reset_ebtables() is to reset
> ebtables to initial state after each test (unfortunately, creating new
> net namespaces is too damn expensive).
> 
> It does EBT_SO_GET_INIT_INFO and then EBT_SO_GET_INIT_ENTRIES to
> memorize initial state. And EBT_SO_GET_INIT_ENTRIES started failing.
> It fails on the first filter table, here is debug output:
> 
> checkpoint ebtable filter: entries=0 hooks=e size=144
> getsockopt(EBT_SO_GET_INIT_ENTRIES) (errno 22)

Yes, looks like ebtables 'entries' only counts user rules, not base
policies.

> Any insight will be appreciated. Is it something to fix in our code?

No.

> Also, what's the difference between EBT_SO_GET_INFO/ENTRIES and
> EBT_SO_GET_INIT_INFO/ENTRIES? I guessed from the name that INIT is
> what we want to use if we want to memorize initial state. Is it
> correct?

Looks like it.

We will need to special-case compat_table_info() in ebtables.c to
either not allocate the compat array for nentries == 0, or pretend
it was 1.

Dmitry, I can send a patch tomorrow, if you can/want to submit a patch
yourself go ahead.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


compat ebtables broke in syzkaller

2018-04-04 Thread Dmitry Vyukov
Hi Florian, Pablo,

After the recent netfilter pull syzkaller in compat mode started failing with:

getsockopt(EBT_SO_GET_INIT_ENTRIES) (errno 22)

I think this is caused by:

commit 7d7d7e02111e9a4dc9d0658597f528f815d820fd
Author: Florian Westphal 
Date:   Tue Feb 27 19:42:35 2018 +0100
netfilter: compat: reject huge allocation requests

But I don't know if it's a problem with kernel or with our code (it
used to work before that commit, but this was the first netfilter code
that I ever wrote). Out code is here:

https://github.com/google/syzkaller/blob/676bd07e7e80f8a270af7f0276443c68f4a99e25/executor/common_linux.h#L1666

The idea behind checkpoint_ebtables() and reset_ebtables() is to reset
ebtables to initial state after each test (unfortunately, creating new
net namespaces is too damn expensive).

It does EBT_SO_GET_INIT_INFO and then EBT_SO_GET_INIT_ENTRIES to
memorize initial state. And EBT_SO_GET_INIT_ENTRIES started failing.
It fails on the first filter table, here is debug output:

checkpoint ebtable filter: entries=0 hooks=e size=144
getsockopt(EBT_SO_GET_INIT_ENTRIES) (errno 22)

I've added some debug output to kernel and I see that it's
xt_compat_init_offsets() that fails with EINVAL. What looks strange to
me is that EBT_SO_GET_INIT_INFO returns nentries==0 (while size==144).

Any insight will be appreciated. Is it something to fix in our code?

Also, what's the difference between EBT_SO_GET_INFO/ENTRIES and
EBT_SO_GET_INIT_INFO/ENTRIES? I guessed from the name that INIT is
what we want to use if we want to memorize initial state. Is it
correct?

Thanks
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 40/47] netfilter: nf_tables: build-in filter chain type

2018-04-04 Thread Pablo Neira Ayuso
On Wed, Apr 04, 2018 at 05:38:31PM +0200, Arnd Bergmann wrote:
> On Fri, Mar 30, 2018 at 1:46 PM, Pablo Neira Ayuso  
> wrote:
> > One module per supported filter chain family type takes too much memory
> > for very little code - too much modularization - place all chain filter
> > definitions in one single file.
> >
> > Signed-off-by: Pablo Neira Ayuso 
> 
> Hi Pablo,
> 
> I've bisected a link error to this patch:
> 
> net/netfilter/nft_reject_inet.o: In function `nft_reject_inet_eval':
> nft_reject_inet.c:(.text+0xa7): undefined reference to `nf_send_unreach6'
> nft_reject_inet.c:(.text+0x10c): undefined reference to `nf_send_unreach6'
> nft_reject_inet.c:(.text+0x138): undefined reference to `nf_send_reset6'
> 
> Unfortunately I don't immediately see what went wrong, maybe you
> can spot it.

Can you pass me your .config file? I will have a look at it. Thanks.
Looks like some missing Kconfig stuff.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 40/47] netfilter: nf_tables: build-in filter chain type

2018-04-04 Thread Arnd Bergmann
On Fri, Mar 30, 2018 at 1:46 PM, Pablo Neira Ayuso  wrote:
> One module per supported filter chain family type takes too much memory
> for very little code - too much modularization - place all chain filter
> definitions in one single file.
>
> Signed-off-by: Pablo Neira Ayuso 

Hi Pablo,

I've bisected a link error to this patch:

net/netfilter/nft_reject_inet.o: In function `nft_reject_inet_eval':
nft_reject_inet.c:(.text+0xa7): undefined reference to `nf_send_unreach6'
nft_reject_inet.c:(.text+0x10c): undefined reference to `nf_send_unreach6'
nft_reject_inet.c:(.text+0x138): undefined reference to `nf_send_reset6'

Unfortunately I don't immediately see what went wrong, maybe you
can spot it.

   Arnd
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v6] netfilter : add NAT support for shifted portmap ranges

2018-04-04 Thread Thierry Du Tre
This is a patch proposal to support shifted ranges in portmaps.
(i.e. tcp/udp incoming port 5000-5100 on WAN redirected to LAN 
192.168.1.5:2000-2100)

Currently DNAT only works for single port or identical port ranges.
(i.e. ports 5000-5100 on WAN interface redirected to a LAN host while original 
destination port is not altered)
When different port ranges are configured, either 'random' mode should be used, 
or else all incoming connections are mapped onto the first port in the redirect 
range. (in described example WAN:5000-5100 will all be mapped to 
192.168.1.5:2000)

This patch introduces a new mode indicated by flag NF_NAT_RANGE_PROTO_OFFSET 
which uses a base port value to calculate an offset with the destination port 
present in the incoming stream. That offset is then applied as index within the 
redirect port range (index modulo rangewidth to handle range overflow).

In described example the base port would be 5000. An incoming stream with 
destination port 5004 would result in an offset value 4 which means that the 
NAT'ed stream will be using destination port 2004.

Other possibilities include deterministic mapping of larger or multiple ranges 
to a smaller range : WAN:5000-5999 -> LAN:5000-5099 (maps WAN port 5*xx to port 
51xx)

This patch does not change any current behavior. It just adds new NAT proto 
range functionality which must be selected via the specific flag when intended 
to use.

A patch for iptables (libipt_DNAT.c + libip6t_DNAT.c) will also be proposed 
which makes this functionality immediately available.

Signed-off-by: Thierry Du Tre 

---
Changes in v6:
- fix compile issue for openvswitch module

Changes in v5:
- reverted to v2 for struct nf_nat_range names
- rebased to nf-next

Changes in v4:
- renamed nf_nat_range1 to nf_nat_range_v1

Changes in v3:
- use nf_nat_range as name for updated struct, renamed existing 
nf_nat_range to nf_nat_range1
- reverted all nf_nat_range2 occurences

Changes in v2:
- added new revision for SNAT and DNAT targets to support the new base port 
variable in struct nf_nat_range2
- replaced all occurences of struct nf_nat_range with struct nf_nat_range2

 include/net/netfilter/ipv4/nf_nat_masquerade.h |  2 +-
 include/net/netfilter/ipv6/nf_nat_masquerade.h |  2 +-
 include/net/netfilter/nf_nat.h |  2 +-
 include/net/netfilter/nf_nat_l3proto.h |  4 +-
 include/net/netfilter/nf_nat_l4proto.h |  8 +--
 include/net/netfilter/nf_nat_redirect.h|  2 +-
 include/uapi/linux/netfilter/nf_nat.h  | 12 -
 net/ipv4/netfilter/ipt_MASQUERADE.c|  2 +-
 net/ipv4/netfilter/nf_nat_h323.c   |  4 +-
 net/ipv4/netfilter/nf_nat_l3proto_ipv4.c   |  4 +-
 net/ipv4/netfilter/nf_nat_masquerade_ipv4.c|  4 +-
 net/ipv4/netfilter/nf_nat_pptp.c   |  2 +-
 net/ipv4/netfilter/nf_nat_proto_gre.c  |  2 +-
 net/ipv4/netfilter/nf_nat_proto_icmp.c |  2 +-
 net/ipv4/netfilter/nft_masq_ipv4.c |  2 +-
 net/ipv6/netfilter/ip6t_MASQUERADE.c   |  2 +-
 net/ipv6/netfilter/nf_nat_l3proto_ipv6.c   |  4 +-
 net/ipv6/netfilter/nf_nat_masquerade_ipv6.c|  4 +-
 net/ipv6/netfilter/nf_nat_proto_icmpv6.c   |  2 +-
 net/ipv6/netfilter/nft_masq_ipv6.c |  2 +-
 net/ipv6/netfilter/nft_redir_ipv6.c|  2 +-
 net/netfilter/nf_nat_core.c| 27 +-
 net/netfilter/nf_nat_helper.c  |  2 +-
 net/netfilter/nf_nat_proto_common.c|  9 ++--
 net/netfilter/nf_nat_proto_dccp.c  |  2 +-
 net/netfilter/nf_nat_proto_sctp.c  |  2 +-
 net/netfilter/nf_nat_proto_tcp.c   |  2 +-
 net/netfilter/nf_nat_proto_udp.c   |  4 +-
 net/netfilter/nf_nat_proto_unknown.c   |  2 +-
 net/netfilter/nf_nat_redirect.c|  6 +--
 net/netfilter/nf_nat_sip.c |  2 +-
 net/netfilter/nft_nat.c|  2 +-
 net/netfilter/xt_NETMAP.c  |  8 +--
 net/netfilter/xt_REDIRECT.c|  2 +-
 net/netfilter/xt_nat.c | 72 +++---
 net/openvswitch/conntrack.c|  4 +-
 36 files changed, 145 insertions(+), 71 deletions(-)

diff --git a/include/net/netfilter/ipv4/nf_nat_masquerade.h 
b/include/net/netfilter/ipv4/nf_nat_masquerade.h
index ebd8694..cd24be4 100644
--- a/include/net/netfilter/ipv4/nf_nat_masquerade.h
+++ b/include/net/netfilter/ipv4/nf_nat_masquerade.h
@@ -6,7 +6,7 @@
 
 unsigned int
 nf_nat_masquerade_ipv4(struct sk_buff *skb, unsigned int hooknum,
-  const struct nf_nat_range *range,
+  const struct nf_nat_range2 *range,
   const struct net_device *out);
 
 void nf_nat_masquerade_ipv4_register_notifier(void);
diff --git a/include/net/netfilter/ipv6/nf_nat_masquerade.h 
b/include/net/netfilter/ipv6/nf_nat_masquerade.h

Re: [PATCH nft] configure: don't enable xtables when --without-xtables is passed

2018-04-04 Thread Alexander Dahl
Hello Florian,

thanks for your quick fix. :-)

> diff --git a/configure.ac b/configure.ac
> index 284bcc502346..eb673d52c6f2 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -99,7 +99,7 @@ AM_CONDITIONAL([BUILD_CLI], [test "x$with_cli" != xno])
> 
>  AC_ARG_WITH([xtables], [AS_HELP_STRING([--with-xtables],
>  [Use libxtables for iptables interaction)])],
> - [with_libxtables=yes], [with_libxtables=no])
> + [with_libxtables=$withval], [with_libxtables=no])
>  AS_IF([test "x$with_libxtables" != xno], [
>  PKG_CHECK_MODULES([XTABLES], [xtables >= 1.6.1])
>  AC_DEFINE([HAVE_LIBXTABLES], [1], [0])

Tested-by: Alexander Dahl 

Greets
Alex

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH nft] configure: don't enable xtables when --without-xtables is passed

2018-04-04 Thread Florian Westphal
AC_ARG_WITH runs this when EITHER --with-foo or --without-foo is given,
so use 'withval'.

After this patch:
./configure -> xtables off
./configure --with-xtables -> xtables on
./configure --without-xtables -> xtables off (was on).

Reported-by: Alexander Dahl 
Signed-off-by: Florian Westphal 
---
 configure.ac | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/configure.ac b/configure.ac
index 284bcc502346..eb673d52c6f2 100644
--- a/configure.ac
+++ b/configure.ac
@@ -99,7 +99,7 @@ AM_CONDITIONAL([BUILD_CLI], [test "x$with_cli" != xno])
 
 AC_ARG_WITH([xtables], [AS_HELP_STRING([--with-xtables],
 [Use libxtables for iptables interaction)])],
-   [with_libxtables=yes], [with_libxtables=no])
+   [with_libxtables=$withval], [with_libxtables=no])
 AS_IF([test "x$with_libxtables" != xno], [
 PKG_CHECK_MODULES([XTABLES], [xtables >= 1.6.1])
 AC_DEFINE([HAVE_LIBXTABLES], [1], [0])
-- 
2.16.1

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html