[PATCH] iptables: move XT_LOCK_NAME from CFLAGS to config.h.

2017-03-15 Thread Lorenzo Colitti
This slightly simplifies configure.ac and results in more
correct dependencies.

Tested by running ./configure with --with-xt-lock-name and
without, and using strace to verify that the right lock is used.

$ make distclean-recursive && ./autogen.sh &&
  ./configure --disable-nftables --prefix /tmp/iptables &&
  make -j64 &&
  make install &&
  sudo strace -e open,flock /tmp/iptables/sbin/iptables -L foo
...
open("/run/xtables.lock", O_RDONLY|O_CREAT, 0600) = 3
flock(3, LOCK_EX|LOCK_NB)   = 0

$ make distclean-recursive && ./autogen.sh && \
  ./configure --disable-nftables --prefix /tmp/iptables \
  --with-xt-lock-name=/tmp/iptables/run/xtables.lock &&
  make -j64 &&
  make install &&
  sudo strace -e open,flock /tmp/iptables/sbin/iptables -L foo
...
open("/tmp/iptables/run/xtables.lock", O_RDONLY|O_CREAT, 0600) = 3
flock(3, LOCK_EX|LOCK_NB)   = 0

Signed-off-by: Lorenzo Colitti 
---
 configure.ac       | 6 ++++--
 iptables/xshared.c | 1 +
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/configure.ac b/configure.ac
index b27502667c..221812a8f3 100644
--- a/configure.ac
+++ b/configure.ac
@@ -197,7 +197,7 @@ AC_SUBST([blacklist_6_modules])
 regular_CFLAGS="-Wall -Waggregate-return -Wmissing-declarations \
-Wmissing-prototypes -Wredundant-decls -Wshadow -Wstrict-prototypes \
-Winline -pipe";
-regular_CPPFLAGS="${largefile_cppflags} 
-DXT_LOCK_NAME=\\\"\${xt_lock_name}\\\" -D_REENTRANT \
+regular_CPPFLAGS="${largefile_cppflags} -D_REENTRANT \
-DXTABLES_LIBDIR=\\\"\${xtlibdir}\\\" -DXTABLES_INTERNAL";
 kinclude_CPPFLAGS="";
 if [[ -n "$kbuilddir" ]]; then
@@ -235,7 +235,9 @@ AC_SUBST([libxtables_vcurrent])
 AC_SUBST([libxtables_vage])
 libxtables_vmajor=$(($libxtables_vcurrent - $libxtables_vage));
 AC_SUBST([libxtables_vmajor])
-AC_SUBST([xt_lock_name])
+
+AC_DEFINE_UNQUOTED([XT_LOCK_NAME], "${xt_lock_name}",
+   [Location of the iptables lock file])
 
 AC_CONFIG_FILES([Makefile extensions/GNUmakefile include/Makefile
iptables/Makefile iptables/xtables.pc
diff --git a/iptables/xshared.c b/iptables/xshared.c
index 383ecf2cf2..9b8e856e25 100644
--- a/iptables/xshared.c
+++ b/iptables/xshared.c
@@ -1,3 +1,4 @@
+#include <config.h>
 #include 
 #include 
 #include 
-- 
2.12.0.367.g23dc2f6d3c-goog

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
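
Note: the strace output above shows the lock being taken with open() followed by flock(). A minimal sketch of that sequence, assuming XT_LOCK_NAME arrives from config.h as the patch intends; the fallback define and the take_lock() helper are hypothetical and exist only so the fragment compiles on its own:

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Normally generated into config.h by AC_DEFINE_UNQUOTED. The fallback
 * value here is an assumption taken from the strace output above. */
#ifndef XT_LOCK_NAME
#define XT_LOCK_NAME "/run/xtables.lock"
#endif

/* Hypothetical helper: open `path` and take an exclusive, non-blocking
 * lock on it. Returns the locked fd, or -1 on failure. */
static int take_lock(const char *path)
{
	int fd = open(path, O_RDONLY | O_CREAT, 0600);

	if (fd < 0)
		return -1;
	if (flock(fd, LOCK_EX | LOCK_NB) != 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

A caller would pass XT_LOCK_NAME as the path. Because the macro now lives in config.h, any source file that uses it has to include config.h first, which is what the xshared.c hunk does.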


[PATCH nf 1/1] netfilter: ctlink: Fix one possible memleak in nfnl_cthelper_create

2017-03-15 Thread fgao
From: Gao Feng 

When nf_conntrack_helper_register failed, the error handler just frees
the helper, but it does not free the helper->expect_policy which is
allocated in nfnl_cthelper_parse_expect_policy.

Signed-off-by: Gao Feng 
---
 net/netfilter/nfnetlink_cthelper.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/net/netfilter/nfnetlink_cthelper.c 
b/net/netfilter/nfnetlink_cthelper.c
index de87823..f0241a1 100644
--- a/net/netfilter/nfnetlink_cthelper.c
+++ b/net/netfilter/nfnetlink_cthelper.c
@@ -214,7 +214,7 @@
 
ret = nfnl_cthelper_parse_expect_policy(helper, tb[NFCTH_POLICY]);
if (ret < 0)
-   goto err;
+   goto err1;
 
strncpy(helper->name, nla_data(tb[NFCTH_NAME]), NF_CT_HELPER_NAME_LEN);
helper->data_len = ntohl(nla_get_be32(tb[NFCTH_PRIV_DATA_LEN]));
@@ -245,10 +245,12 @@
 
ret = nf_conntrack_helper_register(helper);
if (ret < 0)
-   goto err;
+   goto err2;
 
return 0;
-err:
+err2:
+   kfree(helper->expect_policy);
+err1:
kfree(helper);
return ret;
 }
-- 
1.9.1
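
Note: the fix relies on staged error labels, where each label frees only what had been allocated when the failure occurred. A userspace sketch of the same pattern, with hypothetical stand-in types and an allocation counter added purely for illustration:

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures. */
struct policy_sketch { int max_expected; };
struct helper_sketch { struct policy_sketch *expect_policy; };

static int live_allocs; /* outstanding allocations, for the demo only */

static void *demo_alloc(size_t n)
{
	void *p = calloc(1, n);

	if (p)
		live_allocs++;
	return p;
}

static void demo_free(void *p)
{
	if (p)
		live_allocs--;
	free(p);
}

/* fail_register simulates nf_conntrack_helper_register() failing. */
static int helper_create(int fail_register, struct helper_sketch **out)
{
	struct helper_sketch *h;
	int ret;

	h = demo_alloc(sizeof(*h));
	if (!h)
		return -ENOMEM;

	h->expect_policy = demo_alloc(sizeof(*h->expect_policy));
	if (!h->expect_policy) {
		ret = -ENOMEM;
		goto err1;	/* nothing but the helper to free */
	}

	if (fail_register) {
		ret = -EBUSY;
		goto err2;	/* the policy must be freed as well */
	}

	*out = h;
	return 0;
err2:
	demo_free(h->expect_policy);
err1:
	demo_free(h);
	return ret;
}
```

With a single error label that frees only the helper, the registration failure path would leave live_allocs at 1, which is the leak the patch describes.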




Re: [PATCH iptables 1/2] iptables: remove duplicated argument parsing code

2017-03-15 Thread Subash Abhinov Kasiviswanathan

On 2017-03-15 07:45, Lorenzo Colitti wrote:

1. Factor out repeated code to a new xs_has_arg function.
2. Add a new parse_wait_time option to parse the value of -w.
3. Make parse_wait_interval take argc and argv so its callers
   can be simpler.

Signed-off-by: Lorenzo Colitti 


Hi Lorenzo

I am seeing a compilation failure with this patch.
It might require a fix like below.

diff --git a/iptables/xtables.c b/iptables/xtables.c
index 45a7644..bde8ba6 100644
--- a/iptables/xtables.c
+++ b/iptables/xtables.c
@@ -1012,9 +1012,10 @@ void do_parse(struct nft_handle *h, int argc, 
char *argv[],

  "iptables-restore");
}
if (optarg)
-   parse_wait_interval(optarg, &wait_interval);
+   parse_wait_interval(argc, argv,
+   &wait_interval);
else if (xs_has_arg(argc, argv))
-   parse_wait_interval(argv[optind++],
+   parse_wait_interval(argc, argv,
&wait_interval);

wait_interval_set = true;

--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project
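
Note: the thread does not show the body of xs_has_arg. The sketch below is only a guess at the kind of check such a helper performs: it reports whether the next argv entry exists and looks like a value rather than another option. The name has_optional_arg and the explicit index parameter are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical check: true when argv[idx] exists and does not start
 * with '-' (another option) or '!' (an inversion marker). */
static bool has_optional_arg(int argc, char *const argv[], int idx)
{
	return idx < argc && argv[idx][0] != '-' && argv[idx][0] != '!';
}
```

Taking argc and argv in the parsing helper itself, as the quoted patch description proposes for parse_wait_interval, lets that helper perform this check once instead of repeating it in every caller.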



Re: [PATCH nf 1/1] netfilter: helper: Fix possible panic caused by invoking expectfn unloaded

2017-03-15 Thread Gao Feng
Hi Pablo,

On Wed, Mar 15, 2017 at 9:07 PM, Pablo Neira Ayuso  wrote:
> On Tue, Mar 14, 2017 at 02:26:06PM +0800, f...@ikuai8.com wrote:
>> From: Gao Feng 
>>
>> The helper module permits the helper modules register expectfn, and
>> it could be hold by external caller. But when the module is unloaded,
>> there may be some pending expect nodes which still hold the function
>> reference. It may cause unexpected behavior, even panic.
>>
>> Now it would delete the expect nodes which uses the expectfn when
>> unregister expectfn. And it must use the rcu_read_lock to protect
>> the expectfn until insert it or doesn't access it ever.
>
> Expectations should be removed by when the helper module is gone, so
> what is the problem here?

Let me explain:
1. The expectations are removed when the helper module is gone, but the
expectfn is not. Take nf_nat_sip.c as an example: it registers the
expectfn at init with nf_ct_helper_expectfn_register and unregisters it
at exit with nf_ct_helper_expectfn_unregister, but it doesn't remove the
expect nodes on unload.

2. ctnetlink can create an expectation with CTA_EXPECT_FN and without
CTA_EXPECT_HELP_NAME. It invokes nf_ct_helper_expectfn_find_by_name to
get the expectfn, and the helper is NULL.
This leaves one race condition:

  cpu1                                  cpu2
  ctnetlink creates the expect node
  with expectfn
                                        the expectfn is unregistered
  insert the expect node

Now the bug is triggered: the inserted expect node still references the
expectfn, but the module that contains it has been unloaded.

Best Regards
Feng




Re: [PATCH net] bridge: ebtables: fix reception of frames DNAT-ed to bridge device

2017-03-15 Thread Linus Lüssing
On Wed, Mar 15, 2017 at 07:15:39PM +0100, Pablo Neira Ayuso wrote:
> Could you update ebtables dnat to check if the ethernet address
> matches the one of the input bridge interface, so we mangle the
> ->pkt_type accordingly from there, instead of doing this from the
> core?

Actually, that was the approach I thought about and went for first
(and it would probably work for me). Just checking against the
bridge device's net_device::dev_addr.

I scratched it though, as I was afraid that the issue might still
exist for people using some other upper device on top of the bridge
device. For instance, macvlan? And iterating over the
net_device::dev_addrs list seemed too costly for fast path to me.


Re: [nft PATCH] proto: Add some exotic ICMPv6 types

2017-03-15 Thread Phil Sutter
On Wed, Mar 15, 2017 at 05:15:14PM +0100, Pablo Neira Ayuso wrote:
> On Wed, Mar 15, 2017 at 04:55:01PM +0100, Phil Sutter wrote:
> > This adds support for matching on inverse ND messages as defined by
> > RFC3122 (not implemented in Linux) and MLDv2 as defined by RFC3810.
> > 
> > Note that ICMPV6_MLD2_REPORT macro is defined in linux/icmpv6.h but
> > including that header leads to conflicts with symbols defined in
> > netinet/icmp6.h.
> > 
> > In addition to the above, "mld-listener-done" is introduced as an alias
> > for "mld-listener-reduction".
> > 
> > Signed-off-by: Phil Sutter 
> > ---
> > This should resolve netfilter BZ#926.
> > ---
> >  src/proto.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/src/proto.c b/src/proto.c
> > index fb965304e59d9..6a8eed936d858 100644
> > --- a/src/proto.c
> > +++ b/src/proto.c
> > @@ -632,6 +632,10 @@ const struct proto_desc proto_ip = {
> >  
> >  #include 
> >  
> > +#define IND_NEIGHBOR_SOLICIT   141
> > +#define IND_NEIGHBOR_ADVERT    142
> > +#define ICMPV6_MLD2_REPORT     143
> > +
> >  static const struct symbol_table icmp6_type_tbl = {
> > .base   = BASE_DECIMAL,
> > .symbols= {
> > @@ -644,12 +648,16 @@ static const struct symbol_table icmp6_type_tbl = {
> > SYMBOL("mld-listener-query",     MLD_LISTENER_QUERY),
> > SYMBOL("mld-listener-report",    MLD_LISTENER_REPORT),
> > SYMBOL("mld-listener-reduction", MLD_LISTENER_REDUCTION),
> > +   SYMBOL("mld-listener-done",      MLD_LISTENER_REDUCTION),
> 
> This one is duplicated, right?

Yes, it is the alias which was suggested in the ticket. Is this OK, or
should we rather respond with WONTFIX?

I realize this patch lacks an update to man page and a few test cases.
Should I reroll or send a follow-up?

Thanks, Phil
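
Note: an alias in a symbol table means two names resolve to the same value, while the reverse lookup can return only one of them. A standalone sketch of that behaviour; the struct and function names are hypothetical, and the values are the standard ICMPv6 type numbers for the three MLD messages:

```c
#include <stddef.h>
#include <string.h>

struct symbol_sketch {
	const char *name;
	unsigned int value;
};

static const struct symbol_sketch icmp6_types[] = {
	{ "mld-listener-query",     130 },
	{ "mld-listener-report",    131 },
	{ "mld-listener-reduction", 132 },
	{ "mld-listener-done",      132 },	/* alias for the entry above */
};

#define N_TYPES (sizeof(icmp6_types) / sizeof(icmp6_types[0]))

/* Name to value: either spelling is accepted. Returns -1 if unknown. */
static int type_by_name(const char *name)
{
	size_t i;

	for (i = 0; i < N_TYPES; i++)
		if (strcmp(icmp6_types[i].name, name) == 0)
			return (int)icmp6_types[i].value;
	return -1;
}

/* Value to name: in this sketch the first matching entry wins, so the
 * alias is never printed back. Returns NULL if unknown. */
static const char *name_by_type(unsigned int value)
{
	size_t i;

	for (i = 0; i < N_TYPES; i++)
		if (icmp6_types[i].value == value)
			return icmp6_types[i].name;
	return NULL;
}
```

Whether nft's own table lookup resolves duplicates the same way is not stated in this thread, so the first-match rule is an assumption of the sketch.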


Re: [PATCH net] bridge: ebtables: fix reception of frames DNAT-ed to bridge device

2017-03-15 Thread Pablo Neira Ayuso
On Wed, Mar 15, 2017 at 03:27:20PM +0100, Linus Lüssing wrote:
> On Wed, Mar 15, 2017 at 11:42:11AM +0100, Pablo Neira Ayuso wrote:
> > I'm missing then why redirect is not then just enough for Linus usecase.
> 
> For my usecase, the MAC address is configured by the user from a
> Web-UI. It may or may not be the one from the bridge device.
> 
> Besides, found it counter intuitive that DNAT did not work here
> and took me some time to find out why. At least I didn't read about
> any such known limitations of the dnat target in the ebtables
> manpage.

Could you update ebtables dnat to check if the ethernet address
matches the one of the input bridge interface, so we mangle the
->pkt_type accordingly from there, instead of doing this from the
core?
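
Note: the check suggested above boils down to comparing the rewritten destination MAC with the bridge device address and choosing the packet type from the result. A userspace sketch under that reading; the enum, the constant and the function name are hypothetical and do not correspond to kernel symbols:

```c
#include <stdint.h>
#include <string.h>

#define ETH_ALEN_SKETCH 6

enum pkt_type_sketch { PKT_HOST_SKETCH, PKT_OTHERHOST_SKETCH };

/* Classify a frame after DNAT: it is for the local host only when the
 * new destination equals the bridge device address. */
static enum pkt_type_sketch
classify_after_dnat(const uint8_t *new_dst, const uint8_t *bridge_addr)
{
	if (memcmp(new_dst, bridge_addr, ETH_ALEN_SKETCH) == 0)
		return PKT_HOST_SKETCH;
	return PKT_OTHERHOST_SKETCH;
}
```

A comparison against a single address like this does not cover upper devices with their own addresses stacked on the bridge, which is the concern raised elsewhere in this thread.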


[PATCH 03/10] netfilter: nf_tables: set pktinfo->thoff at AH header if found

2017-03-15 Thread Pablo Neira Ayuso
Phil Sutter reports that IPv6 AH header matching is broken. From
userspace, nft generates bytecode that expects to find the AH header at
NFT_PAYLOAD_TRANSPORT_HEADER both for IPv4 and IPv6. However,
pktinfo->thoff is set to the inner header after the AH header in IPv6,
while in IPv4 pktinfo->thoff points to the AH header indeed. This
behaviour is inconsistent. This patch fixes this problem by updating
ipv6_find_hdr() to get the IP6_FH_F_AUTH flag so this function stops at
the AH header, so both IPv4 and IPv6 pktinfo->thoff point to the AH
header.

This is also inconsistent when trying to match encapsulated headers:

1) A packet that looks like IPv4 + AH + TCP dport 22 will *not* match.
2) A packet that looks like IPv6 + AH + TCP dport 22 will match.

Reported-by: Phil Sutter 
Signed-off-by: Pablo Neira Ayuso 
---
 include/net/netfilter/nf_tables_ipv6.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/net/netfilter/nf_tables_ipv6.h 
b/include/net/netfilter/nf_tables_ipv6.h
index d150b5066201..97983d1c05e4 100644
--- a/include/net/netfilter/nf_tables_ipv6.h
+++ b/include/net/netfilter/nf_tables_ipv6.h
@@ -9,12 +9,13 @@ nft_set_pktinfo_ipv6(struct nft_pktinfo *pkt,
 struct sk_buff *skb,
 const struct nf_hook_state *state)
 {
+   unsigned int flags = IP6_FH_F_AUTH;
int protohdr, thoff = 0;
unsigned short frag_off;
 
nft_set_pktinfo(pkt, skb, state);
 
-   protohdr = ipv6_find_hdr(pkt->skb, &thoff, -1, &frag_off, NULL);
+   protohdr = ipv6_find_hdr(pkt->skb, &thoff, -1, &frag_off, &flags);
if (protohdr < 0) {
nft_set_pktinfo_proto_unspec(pkt, skb);
return;
@@ -32,6 +33,7 @@ __nft_set_pktinfo_ipv6_validate(struct nft_pktinfo *pkt,
const struct nf_hook_state *state)
 {
 #if IS_ENABLED(CONFIG_IPV6)
+   unsigned int flags = IP6_FH_F_AUTH;
struct ipv6hdr *ip6h, _ip6h;
unsigned int thoff = 0;
unsigned short frag_off;
@@ -50,7 +52,7 @@ __nft_set_pktinfo_ipv6_validate(struct nft_pktinfo *pkt,
if (pkt_len + sizeof(*ip6h) > skb->len)
return -1;
 
-   protohdr = ipv6_find_hdr(pkt->skb, &thoff, -1, &frag_off, NULL);
+   protohdr = ipv6_find_hdr(pkt->skb, &thoff, -1, &frag_off, &flags);
if (protohdr < 0)
return -1;
 
-- 
2.1.4
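
Note: the effect of the flag can be illustrated with a simplified extension header walk. This is a standalone sketch, not the kernel's ipv6_find_hdr(): it ignores non-first fragments and other corner cases, and FIND_F_AUTH_SKETCH and find_transport() are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define NEXTHDR_HOP      0
#define NEXTHDR_TCP      6
#define NEXTHDR_ROUTING  43
#define NEXTHDR_FRAGMENT 44
#define NEXTHDR_AUTH     51
#define NEXTHDR_DEST     60

#define FIND_F_AUTH_SKETCH 0x1	/* hypothetical: stop at the AH header */

/* Walk the headers in buf, which starts right after the fixed IPv6
 * header; first_nexthdr is that header's Next Header value. Returns the
 * protocol number where the walk stops and stores its offset in *off,
 * or returns -1 if the buffer is truncated. */
static int find_transport(const uint8_t *buf, size_t len,
			  uint8_t first_nexthdr, unsigned int flags,
			  size_t *off)
{
	uint8_t nexthdr = first_nexthdr;
	size_t pos = 0;

	for (;;) {
		size_t hdrlen;

		switch (nexthdr) {
		case NEXTHDR_AUTH:
			if (flags & FIND_F_AUTH_SKETCH)
				goto found;
			if (pos + 2 > len)
				return -1;
			/* AH length is counted in 32-bit words, minus 2 */
			hdrlen = ((size_t)buf[pos + 1] + 2) * 4;
			break;
		case NEXTHDR_FRAGMENT:
			hdrlen = 8;	/* fixed size */
			break;
		case NEXTHDR_HOP:
		case NEXTHDR_ROUTING:
		case NEXTHDR_DEST:
			if (pos + 2 > len)
				return -1;
			/* length is counted in 8-byte units, minus 1 */
			hdrlen = ((size_t)buf[pos + 1] + 1) * 8;
			break;
		default:
			goto found;	/* not an extension header */
		}
		if (pos + hdrlen > len)
			return -1;
		nexthdr = buf[pos];
		pos += hdrlen;
	}
found:
	*off = pos;
	return nexthdr;
}
```

For a buffer holding a 12-byte AH header followed by TCP, the walk without the flag stops at TCP with offset 12; with the flag it stops at AH with offset 0, which matches the IPv4 behaviour described above.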



[PATCH 00/10] Netfilter fixes for net

2017-03-15 Thread Pablo Neira Ayuso
Hi David,

The following patchset contains Netfilter fixes for your net tree, a
rather large batch of fixes targeted to nf_tables, conntrack and bridge
netfilter. More specifically, they are:

1) Don't track fragmented packets if the socket option IP_NODEFRAG is set.
   From Florian Westphal.

2) SCTP protocol tracker assumes that ICMP error messages contain the
   checksum field, which results in packet drops. From Ying Xue.

3) Fix inconsistent handling of AH traffic from nf_tables.

4) Fix new bitmap set representation with big endian. Fix mismatches in
   nf_tables due to incorrect big endian handling too. Both patches
   from Liping Zhang.

5) Bridge netfilter doesn't honor maximum fragment size field, cap to
   largest fragment seen. From Florian Westphal.

6) Fake conntrack entry needs to be aligned to 8 bytes since the 3 LSB
   bits are now used to store the ctinfo. From Steven Rostedt.

7) Fix element comments with the bitmap set type. Revert the flush
   field in the nft_set_iter structure, not required anymore after
   fixing up element comments.

8) Missing error on invalid conntrack direction from nft_ct, also from
   Liping Zhang.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Thanks!



The following changes since commit 8d70eeb84ab277377c017af6a21d0a337025dede:

  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net (2017-03-04 
17:31:39 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git HEAD

for you to fetch changes up to 4494dbc6dec37817f2cc2aa7604039a9e87ada18:

  netfilter: nft_ct: do cleanup work when NFTA_CT_DIRECTION is invalid 
(2017-03-15 17:15:54 +0100)


Florian Westphal (2):
  netfilter: don't track fragmented packets
  netfilter: bridge: honor frag_max_size when refragmenting

Liping Zhang (3):
  netfilter: nft_set_bitmap: fetch the element key based on the set->klen
  netfilter: nf_tables: fix mismatch in big-endian system
  netfilter: nft_ct: do cleanup work when NFTA_CT_DIRECTION is invalid

Pablo Neira Ayuso (3):
  netfilter: nf_tables: set pktinfo->thoff at AH header if found
  netfilter: nft_set_bitmap: keep a list of dummy elements
  Revert "netfilter: nf_tables: add flush field to struct nft_set_iter"

Steven Rostedt (VMware) (1):
  netfilter: Force fake conntrack entry to be at least 8 bytes aligned

Ying Xue (1):
  netfilter: nf_nat_sctp: fix ICMP packet to be dropped accidently

 include/net/netfilter/nf_conntrack.h   |   2 +-
 include/net/netfilter/nf_tables.h  |  30 -
 include/net/netfilter/nf_tables_ipv6.h |   6 +-
 net/bridge/br_netfilter_hooks.c|  12 +-
 net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c |   4 +
 net/ipv4/netfilter/nf_nat_l3proto_ipv4.c   |   5 -
 net/ipv4/netfilter/nft_masq_ipv4.c |   8 +-
 net/ipv4/netfilter/nft_redir_ipv4.c|   8 +-
 net/ipv6/netfilter/nft_masq_ipv6.c |   8 +-
 net/ipv6/netfilter/nft_redir_ipv6.c|   8 +-
 net/netfilter/nf_conntrack_core.c  |   6 +-
 net/netfilter/nf_nat_proto_sctp.c  |  13 +-
 net/netfilter/nf_tables_api.c  |   4 -
 net/netfilter/nft_ct.c |  21 ++--
 net/netfilter/nft_meta.c   |  40 +++---
 net/netfilter/nft_nat.c|   8 +-
 net/netfilter/nft_set_bitmap.c | 165 -
 17 files changed, 194 insertions(+), 154 deletions(-)
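
Note on item 6: storing ctinfo in the three least significant bits of a pointer only works when those bits are always zero in the pointer itself, which requires 8-byte alignment. A userspace sketch of the packing; every name below is hypothetical:

```c
#include <stdint.h>

#define INFO_MASK_SKETCH ((uintptr_t)0x7)	/* three low bits */

struct conn_sketch {
	int dummy;
};

/* Forced to 8-byte alignment so the low three address bits are zero. */
static _Alignas(8) struct conn_sketch fake_conn;

/* Combine an 8-byte-aligned pointer with a 3-bit info value. */
static uintptr_t pack_ct(const void *ct, unsigned int info)
{
	return (uintptr_t)ct | ((uintptr_t)info & INFO_MASK_SKETCH);
}

static void *unpack_ct(uintptr_t v)
{
	return (void *)(v & ~INFO_MASK_SKETCH);
}

static unsigned int unpack_info(uintptr_t v)
{
	return (unsigned int)(v & INFO_MASK_SKETCH);
}
```

If the object were only 4-byte aligned, bit 2 of its address could be set, and unpack_ct() would then return a different address than the one that was packed.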


[PATCH 04/10] netfilter: nft_set_bitmap: fetch the element key based on the set->klen

2017-03-15 Thread Pablo Neira Ayuso
From: Liping Zhang 

Currently we just assume the element key as a u32 integer, regardless of
the set key length.

This is incorrect, for example, the tcp port number is only 16 bits.
So when we use the nft_payload expr to get the tcp dport and store
it to dreg, the dport will be stored at 0~15 bits, and 16~31 bits
will be padded with zero.

So reg->data[dreg] will look like this:
  0          15         31
  +-+-+-+-+-+-+-+-+-+-+-+-+
  | tcp dport |     0     |
  +-+-+-+-+-+-+-+-+-+-+-+-+
But on big-endian systems, if we treat this register as a u32
integer, the element key will be larger than 65535, so the following
lookup in the bitmap set will cause an out-of-bounds access.

Another issue is that if we add an element with a comment to a bitmap
set (although the comment will eventually be ignored), the element
vanishes. Because we treat the element key as a u32 integer, the
comment becomes part of the element key, so the element key will also
be larger than 65535 and an out-of-bounds access will happen:
  # nft add element t s { 1 comment test }

Since set->klen is 1 or 2, it's fine to treat the element key as a u8 or
u16 integer.

Fixes: 665153ff5752 ("netfilter: nf_tables: add bitmap set type")
Signed-off-by: Liping Zhang 
Signed-off-by: Pablo Neira Ayuso 
---
 net/netfilter/nft_set_bitmap.c | 27 +++++++++++++++++----------
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/net/netfilter/nft_set_bitmap.c b/net/netfilter/nft_set_bitmap.c
index 152d226552c1..9b024e22717b 100644
--- a/net/netfilter/nft_set_bitmap.c
+++ b/net/netfilter/nft_set_bitmap.c
@@ -45,9 +45,17 @@ struct nft_bitmap {
u8  bitmap[];
 };
 
-static inline void nft_bitmap_location(u32 key, u32 *idx, u32 *off)
+static inline void nft_bitmap_location(const struct nft_set *set,
+  const void *key,
+  u32 *idx, u32 *off)
 {
-   u32 k = (key << 1);
+   u32 k;
+
+   if (set->klen == 2)
+   k = *(u16 *)key;
+   else
+   k = *(u8 *)key;
+   k <<= 1;
 
*idx = k / BITS_PER_BYTE;
*off = k % BITS_PER_BYTE;
@@ -69,7 +77,7 @@ static bool nft_bitmap_lookup(const struct net *net, const 
struct nft_set *set,
u8 genmask = nft_genmask_cur(net);
u32 idx, off;
 
-   nft_bitmap_location(*key, &idx, &off);
+   nft_bitmap_location(set, key, &idx, &off);
 
return nft_bitmap_active(priv->bitmap, idx, off, genmask);
 }
@@ -83,7 +91,7 @@ static int nft_bitmap_insert(const struct net *net, const 
struct nft_set *set,
u8 genmask = nft_genmask_next(net);
u32 idx, off;
 
-   nft_bitmap_location(nft_set_ext_key(ext)->data[0], &idx, &off);
+   nft_bitmap_location(set, nft_set_ext_key(ext), &idx, &off);
if (nft_bitmap_active(priv->bitmap, idx, off, genmask))
return -EEXIST;
 
@@ -102,7 +110,7 @@ static void nft_bitmap_remove(const struct net *net,
u8 genmask = nft_genmask_next(net);
u32 idx, off;
 
-   nft_bitmap_location(nft_set_ext_key(ext)->data[0], &idx, &off);
+   nft_bitmap_location(set, nft_set_ext_key(ext), &idx, &off);
/* Enter 00 state. */
priv->bitmap[idx] &= ~(genmask << off);
 }
@@ -116,7 +124,7 @@ static void nft_bitmap_activate(const struct net *net,
u8 genmask = nft_genmask_next(net);
u32 idx, off;
 
-   nft_bitmap_location(nft_set_ext_key(ext)->data[0], &idx, &off);
+   nft_bitmap_location(set, nft_set_ext_key(ext), &idx, &off);
/* Enter 11 state. */
priv->bitmap[idx] |= (genmask << off);
 }
@@ -128,7 +136,7 @@ static bool nft_bitmap_flush(const struct net *net,
u8 genmask = nft_genmask_next(net);
u32 idx, off;
 
-   nft_bitmap_location(nft_set_ext_key(ext)->data[0], &idx, &off);
+   nft_bitmap_location(set, nft_set_ext_key(ext), &idx, &off);
/* Enter 10 state, similar to deactivation. */
priv->bitmap[idx] &= ~(genmask << off);
 
@@ -161,10 +169,9 @@ static void *nft_bitmap_deactivate(const struct net *net,
struct nft_bitmap *priv = nft_set_priv(set);
u8 genmask = nft_genmask_next(net);
struct nft_set_ext *ext;
-   u32 idx, off, key = 0;
+   u32 idx, off;
 
-   memcpy(&key, elem->key.val.data, set->klen);
-   nft_bitmap_location(key, &idx, &off);
+   nft_bitmap_location(set, elem->key.val.data, &idx, &off);
 
if (!nft_bitmap_active(priv->bitmap, idx, off, genmask))
return NULL;
-- 
2.1.4
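
Note: a userspace sketch of the key handling this patch introduces, reading only klen bytes of the key before computing the bitmap position. The function name and the use of memcpy() instead of a pointer cast are choices made for this sketch, not taken from the patch:

```c
#include <stdint.h>
#include <string.h>

#define BITS_PER_BYTE_SKETCH 8

/* Compute the byte index and bit offset of a key in a bitmap that uses
 * two bits per element. Only klen (1 or 2) bytes of the key are read. */
static void bitmap_location(unsigned int klen, const void *key,
			    uint32_t *idx, uint32_t *off)
{
	uint32_t k;

	if (klen == 2) {
		uint16_t k16;

		memcpy(&k16, key, sizeof(k16));
		k = k16;
	} else {
		uint8_t k8;

		memcpy(&k8, key, sizeof(k8));
		k = k8;
	}
	k <<= 1;	/* two bits per element */

	*idx = k / BITS_PER_BYTE_SKETCH;
	*off = k % BITS_PER_BYTE_SKETCH;
}
```

For a 16-bit key of 22 this gives k = 44, so idx = 5 and off = 4 regardless of host byte order. The largest 16-bit key, 65535, maps to idx = 16383, so a bitmap of 16384 bytes covers every possible 16-bit key.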



[PATCH 01/10] netfilter: don't track fragmented packets

2017-03-15 Thread Pablo Neira Ayuso
From: Florian Westphal 

Andrey reports syzkaller splat caused by

NF_CT_ASSERT(!ip_is_fragment(ip_hdr(skb)));

in ipv4 nat.  But this assertion (and the comment) are wrong: this function
does see fragments when the IP_NODEFRAG setsockopt is used.

As conntrack doesn't track packets without a complete l4 header, only the
first fragment is tracked.

Because applying nat to the first fragment but not the rest makes no sense,
this also turns off tracking of all fragments.

Reported-by: Andrey Konovalov 
Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
---
 net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c | 4 ++++
 net/ipv4/netfilter/nf_nat_l3proto_ipv4.c       | 5 -----
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c 
b/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
index bc1486f2c064..2e14ed11a35c 100644
--- a/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
+++ b/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
@@ -165,6 +165,10 @@ static unsigned int ipv4_conntrack_local(void *priv,
if (skb->len < sizeof(struct iphdr) ||
ip_hdrlen(skb) < sizeof(struct iphdr))
return NF_ACCEPT;
+
+   if (ip_is_fragment(ip_hdr(skb))) /* IP_NODEFRAG setsockopt set */
+   return NF_ACCEPT;
+
return nf_conntrack_in(state->net, PF_INET, state->hook, skb);
 }
 
diff --git a/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c 
b/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c
index f8aad03d674b..6f5e8d01b876 100644
--- a/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c
+++ b/net/ipv4/netfilter/nf_nat_l3proto_ipv4.c
@@ -255,11 +255,6 @@ nf_nat_ipv4_fn(void *priv, struct sk_buff *skb,
/* maniptype == SRC for postrouting. */
enum nf_nat_manip_type maniptype = HOOK2MANIP(state->hook);
 
-   /* We never see fragments: conntrack defrags on pre-routing
-* and local-out, and nf_nat_out protects post-routing.
-*/
-   NF_CT_ASSERT(!ip_is_fragment(ip_hdr(skb)));
-
ct = nf_ct_get(skb, &ctinfo);
/* Can't track?  It's not due to stress, or conntrack would
 * have dropped it.  Hence it's the user's responsibilty to
-- 
2.1.4
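
Note: the new check depends on recognising a fragment from the IPv4 header alone. A packet is a fragment when the More Fragments bit is set or the fragment offset is non-zero. A userspace sketch of that test; the SKETCH_ prefixed names are hypothetical, while the bit values are those of the IPv4 flags and offset field:

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_IP_MF     0x2000	/* More Fragments flag */
#define SKETCH_IP_OFFSET 0x1FFF	/* 13-bit fragment offset */

/* frag_off_be is the frag_off field exactly as found in the header,
 * that is, in network byte order. */
static bool sketch_ip_is_fragment(uint16_t frag_off_be)
{
	return (frag_off_be & htons(SKETCH_IP_MF | SKETCH_IP_OFFSET)) != 0;
}
```

A packet carrying only the Don't Fragment bit (0x4000) is not a fragment under this test, so ordinary unfragmented traffic still reaches conntrack.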



[PATCH 06/10] netfilter: bridge: honor frag_max_size when refragmenting

2017-03-15 Thread Pablo Neira Ayuso
From: Florian Westphal 

Consider a bridge with mtu 9000, but an end host sending smaller
packets to another host with mtu < 9000.

In this case, after reassembly, bridge+defrag would only refragment
when the reassembled packet exceeds the bridge mtu, so it would attempt
to send the reassembled packet as-is as long as it was below 9k.

Instead we have to cap by the largest fragment size seen.

Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
---
 net/bridge/br_netfilter_hooks.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c
index 95087e6e8258..3c5185021c1c 100644
--- a/net/bridge/br_netfilter_hooks.c
+++ b/net/bridge/br_netfilter_hooks.c
@@ -721,18 +721,20 @@ static unsigned int nf_bridge_mtu_reduction(const struct 
sk_buff *skb)
 
 static int br_nf_dev_queue_xmit(struct net *net, struct sock *sk, struct 
sk_buff *skb)
 {
-   struct nf_bridge_info *nf_bridge;
-   unsigned int mtu_reserved;
+   struct nf_bridge_info *nf_bridge = nf_bridge_info_get(skb);
+   unsigned int mtu, mtu_reserved;
 
mtu_reserved = nf_bridge_mtu_reduction(skb);
+   mtu = skb->dev->mtu;
+
+   if (nf_bridge->frag_max_size && nf_bridge->frag_max_size < mtu)
+   mtu = nf_bridge->frag_max_size;
 
-   if (skb_is_gso(skb) || skb->len + mtu_reserved <= skb->dev->mtu) {
+   if (skb_is_gso(skb) || skb->len + mtu_reserved <= mtu) {
nf_bridge_info_free(skb);
return br_dev_queue_push_xmit(net, sk, skb);
}
 
-   nf_bridge = nf_bridge_info_get(skb);
-
/* This is wrong! We should preserve the original fragment
 * boundaries by preserving frag_list rather than refragmenting.
 */
-- 
2.1.4
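
Note: the capping rule in this patch can be expressed as a small pure function. A sketch with hypothetical names, where frag_max_size == 0 means no fragment size was recorded:

```c
/* Effective limit: the device MTU, lowered to the largest fragment
 * size seen during reassembly when one was recorded. */
static unsigned int effective_mtu(unsigned int dev_mtu,
				  unsigned int frag_max_size)
{
	if (frag_max_size && frag_max_size < dev_mtu)
		return frag_max_size;
	return dev_mtu;
}

/* Non-zero when the packet can be sent without refragmenting. */
static int fits_without_refrag(unsigned int len, unsigned int reserved,
			       unsigned int dev_mtu,
			       unsigned int frag_max_size)
{
	return len + reserved <= effective_mtu(dev_mtu, frag_max_size);
}
```

With a 9000-byte bridge MTU and fragments that arrived at 1500 bytes, a 4000-byte reassembled packet no longer passes the size check and is refragmented.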



[PATCH 08/10] netfilter: nft_set_bitmap: keep a list of dummy elements

2017-03-15 Thread Pablo Neira Ayuso
Element comments may come without any prior set flag, so we have to keep
a list of dummy struct nft_set_ext to keep this information around. This
is only useful for set dumps to userspace. From the packet path, this
set type relies on the bitmap representation. This patch simplifies the
logic since we don't need to allocate the dummy nft_set_ext structure
anymore on the fly at the cost of increasing memory consumption because
of the list of dummy struct nft_set_ext.

Fixes: 665153ff5752 ("netfilter: nf_tables: add bitmap set type")
Signed-off-by: Pablo Neira Ayuso 
---
 net/netfilter/nft_set_bitmap.c | 146 +++--
 1 file changed, 66 insertions(+), 80 deletions(-)

diff --git a/net/netfilter/nft_set_bitmap.c b/net/netfilter/nft_set_bitmap.c
index 9b024e22717b..8ebbc2940f4c 100644
--- a/net/netfilter/nft_set_bitmap.c
+++ b/net/netfilter/nft_set_bitmap.c
@@ -15,6 +15,11 @@
 #include 
 #include 
 
+struct nft_bitmap_elem {
+   struct list_head    head;
+   struct nft_set_ext  ext;
+};
+
 /* This bitmap uses two bits to represent one element. These two bits determine
  * the element state in the current and the future generation.
  *
@@ -41,8 +46,9 @@
  *  restore its previous state.
  */
 struct nft_bitmap {
-   u16 bitmap_size;
-   u8  bitmap[];
+   struct  list_head   list;
+   u16 bitmap_size;
+   u8  bitmap[];
 };
 
 static inline void nft_bitmap_location(const struct nft_set *set,
@@ -82,21 +88,43 @@ static bool nft_bitmap_lookup(const struct net *net, const 
struct nft_set *set,
return nft_bitmap_active(priv->bitmap, idx, off, genmask);
 }
 
+static struct nft_bitmap_elem *
+nft_bitmap_elem_find(const struct nft_set *set, struct nft_bitmap_elem *this,
+u8 genmask)
+{
+   const struct nft_bitmap *priv = nft_set_priv(set);
+   struct nft_bitmap_elem *be;
+
+   list_for_each_entry_rcu(be, &priv->list, head) {
+   if (memcmp(nft_set_ext_key(&be->ext),
+  nft_set_ext_key(&this->ext), set->klen) ||
+   !nft_set_elem_active(&be->ext, genmask))
+   continue;
+
+   return be;
+   }
+   return NULL;
+}
+
 static int nft_bitmap_insert(const struct net *net, const struct nft_set *set,
 const struct nft_set_elem *elem,
-struct nft_set_ext **_ext)
+struct nft_set_ext **ext)
 {
struct nft_bitmap *priv = nft_set_priv(set);
-   struct nft_set_ext *ext = elem->priv;
+   struct nft_bitmap_elem *new = elem->priv, *be;
u8 genmask = nft_genmask_next(net);
u32 idx, off;
 
-   nft_bitmap_location(set, nft_set_ext_key(ext), &idx, &off);
-   if (nft_bitmap_active(priv->bitmap, idx, off, genmask))
+   be = nft_bitmap_elem_find(set, new, genmask);
+   if (be) {
+   *ext = &be->ext;
return -EEXIST;
+   }
 
+   nft_bitmap_location(set, nft_set_ext_key(&new->ext), &idx, &off);
/* Enter 01 state. */
priv->bitmap[idx] |= (genmask << off);
+   list_add_tail_rcu(&new->head, &priv->list);
 
return 0;
 }
@@ -106,13 +134,14 @@ static void nft_bitmap_remove(const struct net *net,
  const struct nft_set_elem *elem)
 {
struct nft_bitmap *priv = nft_set_priv(set);
-   struct nft_set_ext *ext = elem->priv;
+   struct nft_bitmap_elem *be = elem->priv;
u8 genmask = nft_genmask_next(net);
u32 idx, off;
 
-   nft_bitmap_location(set, nft_set_ext_key(ext), &idx, &off);
+   nft_bitmap_location(set, nft_set_ext_key(&be->ext), &idx, &off);
/* Enter 00 state. */
priv->bitmap[idx] &= ~(genmask << off);
+   list_del_rcu(&be->head);
 }
 
 static void nft_bitmap_activate(const struct net *net,
@@ -120,73 +149,52 @@ static void nft_bitmap_activate(const struct net *net,
const struct nft_set_elem *elem)
 {
struct nft_bitmap *priv = nft_set_priv(set);
-   struct nft_set_ext *ext = elem->priv;
+   struct nft_bitmap_elem *be = elem->priv;
u8 genmask = nft_genmask_next(net);
u32 idx, off;
 
-   nft_bitmap_location(set, nft_set_ext_key(ext), &idx, &off);
+   nft_bitmap_location(set, nft_set_ext_key(&be->ext), &idx, &off);
/* Enter 11 state. */
priv->bitmap[idx] |= (genmask << off);
+   nft_set_elem_change_active(net, set, &be->ext);
 }
 
 static bool nft_bitmap_flush(const struct net *net,
-const struct nft_set *set, void *ext)
+const struct nft_set *set, void *_be)
 {
struct nft_bitmap *priv = nft_set_priv(set);
u8 genmask = nft_genmask_next(net);
+   struct nft_bitmap_elem *be = _be;
u32 idx, off;
 
-   nft_bitmap_location(set, nft_set_ext_key(ext), &idx, &off);
+   nft_bitmap_location(set, nft_set_ext_key(&be->ext), &idx, &off);
/* Enter 10 

[PATCH 05/10] netfilter: nf_tables: fix mismatch in big-endian system

2017-03-15 Thread Pablo Neira Ayuso
From: Liping Zhang 

Currently, there are two different methods to store a u16 integer to
the u32 data register. For example:
  u32 *dest = &regs->data[priv->dreg];
  1. *dest = 0; *(u16 *) dest = val_u16;
  2. *dest = val_u16;

For method 1, the u16 value will be stored like this, on both
big-endian and little-endian systems:
  0          15         31
  +-+-+-+-+-+-+-+-+-+-+-+-+
  |   Value   |     0     |
  +-+-+-+-+-+-+-+-+-+-+-+-+

For method 2, on little-endian systems, the u16 value will be stored the
same way as listed above. But on big-endian systems, the u16 value will
be stored like this:
  0          15         31
  +-+-+-+-+-+-+-+-+-+-+-+-+
  |     0     |   Value   |
  +-+-+-+-+-+-+-+-+-+-+-+-+

So when we later use "memcmp(&regs->data[priv->sreg], data, 2);" to do the
comparison in nft_cmp, nft_lookup expr ..., method 2 will get the wrong
result on big-endian systems, as bits 0~15 will always be zero.

For a similar reason, when loading a u16 value from the u32 data
register, we should use "*(u16 *) sreg;" instead of "(u16)*sreg;":
the second form will get the wrong value on big-endian systems.

So introduce some wrapper functions to store/load a u8 or u16
integer to/from the u32 data register, and use them in the right
places.

Signed-off-by: Liping Zhang 
Signed-off-by: Pablo Neira Ayuso 
---
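
Note: the store/load rule described above can be tried in userspace. The sketch below mirrors the idea of the new helpers but uses memcpy() rather than pointer casts, a choice made here to stay within standard C aliasing rules; the function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Store a 16-bit value in the first two bytes of a 32-bit register and
 * zero the remaining two, on any byte order. */
static void reg_store16(uint32_t *dreg, uint16_t val)
{
	*dreg = 0;
	memcpy(dreg, &val, sizeof(val));
}

/* Load the 16-bit value back from the first two bytes. */
static uint16_t reg_load16(const uint32_t *sreg)
{
	uint16_t val;

	memcpy(&val, sreg, sizeof(val));
	return val;
}

/* The comparison style used by byte-wise matching: compare the first
 * two bytes of the register with the 16-bit value. */
static int reg_matches16(const uint32_t *sreg, uint16_t val)
{
	return memcmp(sreg, &val, sizeof(val)) == 0;
}
```

A plain `*dreg = val;` assignment would pass reg_matches16() only on little-endian hosts, since on big-endian hosts the value lands in the last two bytes of the register.
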
 include/net/netfilter/nf_tables.h   | 29 +++
 net/ipv4/netfilter/nft_masq_ipv4.c  |  8 
 net/ipv4/netfilter/nft_redir_ipv4.c |  8 
 net/ipv6/netfilter/nft_masq_ipv6.c  |  8 
 net/ipv6/netfilter/nft_redir_ipv6.c |  8 
 net/netfilter/nft_ct.c  | 18 +
 net/netfilter/nft_meta.c| 40 +++--
 net/netfilter/nft_nat.c |  8 
 8 files changed, 80 insertions(+), 47 deletions(-)

diff --git a/include/net/netfilter/nf_tables.h 
b/include/net/netfilter/nf_tables.h
index 2aa8a9d80fbe..70c5ca0c60b1 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -103,6 +103,35 @@ struct nft_regs {
};
 };
 
+/* Store/load an u16 or u8 integer to/from the u32 data register.
+ *
+ * Note, when using concatenations, register allocation happens at 32-bit
+ * level. So for store instruction, pad the rest part with zero to avoid
+ * garbage values.
+ */
+
+static inline void nft_reg_store16(u32 *dreg, u16 val)
+{
+   *dreg = 0;
+   *(u16 *)dreg = val;
+}
+
+static inline void nft_reg_store8(u32 *dreg, u8 val)
+{
+   *dreg = 0;
+   *(u8 *)dreg = val;
+}
+
+static inline u16 nft_reg_load16(u32 *sreg)
+{
+   return *(u16 *)sreg;
+}
+
+static inline u8 nft_reg_load8(u32 *sreg)
+{
+   return *(u8 *)sreg;
+}
+
 static inline void nft_data_copy(u32 *dst, const struct nft_data *src,
 unsigned int len)
 {
diff --git a/net/ipv4/netfilter/nft_masq_ipv4.c b/net/ipv4/netfilter/nft_masq_ipv4.c
index a0ea8aad1bf1..f18677277119 100644
--- a/net/ipv4/netfilter/nft_masq_ipv4.c
+++ b/net/ipv4/netfilter/nft_masq_ipv4.c
@@ -26,10 +26,10 @@ static void nft_masq_ipv4_eval(const struct nft_expr *expr,
memset(&range, 0, sizeof(range));
range.flags = priv->flags;
if (priv->sreg_proto_min) {
-   range.min_proto.all =
-   *(__be16 *)&regs->data[priv->sreg_proto_min];
-   range.max_proto.all =
-   *(__be16 *)&regs->data[priv->sreg_proto_max];
+   range.min_proto.all = (__force __be16)nft_reg_load16(
+   &regs->data[priv->sreg_proto_min]);
+   range.max_proto.all = (__force __be16)nft_reg_load16(
+   &regs->data[priv->sreg_proto_max]);
}
regs->verdict.code = nf_nat_masquerade_ipv4(pkt->skb, nft_hook(pkt),
&range, nft_out(pkt));
diff --git a/net/ipv4/netfilter/nft_redir_ipv4.c b/net/ipv4/netfilter/nft_redir_ipv4.c
index 1650ed23c15d..5120be1d3118 100644
--- a/net/ipv4/netfilter/nft_redir_ipv4.c
+++ b/net/ipv4/netfilter/nft_redir_ipv4.c
@@ -26,10 +26,10 @@ static void nft_redir_ipv4_eval(const struct nft_expr *expr,
 
memset(&mr, 0, sizeof(mr));
if (priv->sreg_proto_min) {
-   mr.range[0].min.all =
-   *(__be16 *)&regs->data[priv->sreg_proto_min];
-   mr.range[0].max.all =
-   *(__be16 *)&regs->data[priv->sreg_proto_max];
+   mr.range[0].min.all = (__force __be16)nft_reg_load16(
+   &regs->data[priv->sreg_proto_min]);
+   mr.range[0].max.all = (__force __be16)nft_reg_load16(
+   &regs->data[priv->sreg_proto_max]);
mr.range[0].flags |= NF_NAT_RANGE_PROTO_SPECIFIED;
}
 
diff --git a/net/ipv6/netfilter/nft_masq_ipv6.c b/net/ipv6/netfilter/nft_masq_ipv6.c
index 6c5b5b1830a7..4146536e9c15 100644
--- a/net/ipv6/netfilter/nft_masq_ipv6.c
+++ 

[PATCH 09/10] Revert "netfilter: nf_tables: add flush field to struct nft_set_iter"

2017-03-15 Thread Pablo Neira Ayuso
This reverts commit 1f48ff6c5393aa7fe290faf5d633164f105b0aa7.

This patch is not required anymore now that we keep a dummy list of
set elements in the bitmap set implementation, so revert this before
we forget this code has no clients.

Signed-off-by: Pablo Neira Ayuso 
---
 include/net/netfilter/nf_tables.h | 1 -
 net/netfilter/nf_tables_api.c | 4 
 2 files changed, 5 deletions(-)

diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index 70c5ca0c60b1..0136028652bd 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -232,7 +232,6 @@ struct nft_set_elem {
 struct nft_set;
 struct nft_set_iter {
u8  genmask;
-   boolflush;
unsigned intcount;
unsigned intskip;
int err;
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 5e0ccfd5bb37..434c739dfeca 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -3145,7 +3145,6 @@ int nf_tables_bind_set(const struct nft_ctx *ctx, struct nft_set *set,
iter.count  = 0;
iter.err    = 0;
iter.fn     = nf_tables_bind_check_setelem;
-   iter.flush  = false;
 
set->ops->walk(ctx, set, &iter);
if (iter.err < 0)
@@ -3399,7 +3398,6 @@ static int nf_tables_dump_set(struct sk_buff *skb, struct netlink_callback *cb)
args.iter.count = 0;
args.iter.err   = 0;
args.iter.fn    = nf_tables_dump_setelem;
-   args.iter.flush = false;
set->ops->walk(&ctx, set, &args.iter);
 
nla_nest_end(skb, nest);
@@ -3963,7 +3961,6 @@ static int nf_tables_delsetelem(struct net *net, struct sock *nlsk,
struct nft_set_iter iter = {
.genmask    = genmask,
.fn         = nft_flush_set,
-   .flush  = true,
};
set->ops->walk(&ctx, set, &iter);
 
@@ -5114,7 +5111,6 @@ static int nf_tables_check_loops(const struct nft_ctx *ctx,
iter.count  = 0;
iter.err    = 0;
iter.fn     = nf_tables_loop_check_setelem;
-   iter.flush  = false;
 
set->ops->walk(ctx, set, &iter);
if (iter.err < 0)
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH libnftnl 0/2] add backend support to define ct helpers

2017-03-15 Thread Pablo Neira Ayuso
On Tue, Mar 14, 2017 at 08:53:59PM +0100, Florian Westphal wrote:
> This adds libnftnl support to define connection tracking helpers.
> Frontend (nft) support will follow soon.

Acked-by: Pablo Neira Ayuso 


Re: [PATCH v2 nftables 0/7] ct helper set support

2017-03-15 Thread Pablo Neira Ayuso
On Wed, Mar 15, 2017 at 04:01:04PM +0100, Florian Westphal wrote:
> v2, with updated syntax to force type and protocol keywords
> into same statement, i.e.
> 
> ct helper ftp-standard {
>   type "ftp" protocol tcp
> }
> 
> I also cleaned up the changes to bison (reuse family_spec_explicit)
> and added a test for an invalid helper (l3proto ip6 in ip table).

Acked-by: Pablo Neira Ayuso 


Re: [nft PATCH] proto: Add some exotic ICMPv6 types

2017-03-15 Thread Pablo Neira Ayuso
On Wed, Mar 15, 2017 at 04:55:01PM +0100, Phil Sutter wrote:
> This adds support for matching on inverse ND messages as defined by
> RFC3122 (not implemented in Linux) and MLDv2 as defined by RFC3810.
> 
> Note that ICMPV6_MLD2_REPORT macro is defined in linux/icmpv6.h but
> including that header leads to conflicts with symbols defined in
> netinet/icmp6.h.
> 
> In addition to the above, "mld-listener-done" is introduced as an alias
> for "mld-listener-reduction".
> 
> Signed-off-by: Phil Sutter 
> ---
> This should resolve netfilter BZ#926.
> ---
>  src/proto.c | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/src/proto.c b/src/proto.c
> index fb965304e59d9..6a8eed936d858 100644
> --- a/src/proto.c
> +++ b/src/proto.c
> @@ -632,6 +632,10 @@ const struct proto_desc proto_ip = {
>  
>  #include 
>  
> +#define IND_NEIGHBOR_SOLICIT 141
> +#define IND_NEIGHBOR_ADVERT  142
> +#define ICMPV6_MLD2_REPORT   143
> +
>  static const struct symbol_table icmp6_type_tbl = {
>   .base   = BASE_DECIMAL,
>   .symbols= {
> @@ -644,12 +648,16 @@ static const struct symbol_table icmp6_type_tbl = {
>   SYMBOL("mld-listener-query",MLD_LISTENER_QUERY),
>   SYMBOL("mld-listener-report",   MLD_LISTENER_REPORT),
>   SYMBOL("mld-listener-reduction",MLD_LISTENER_REDUCTION),
> + SYMBOL("mld-listener-done", MLD_LISTENER_REDUCTION),

This one is duplicated, right?

>   SYMBOL("nd-router-solicit", ND_ROUTER_SOLICIT),
>   SYMBOL("nd-router-advert",  ND_ROUTER_ADVERT),
>   SYMBOL("nd-neighbor-solicit",   ND_NEIGHBOR_SOLICIT),
>   SYMBOL("nd-neighbor-advert",ND_NEIGHBOR_ADVERT),
>   SYMBOL("nd-redirect",   ND_REDIRECT),
>   SYMBOL("router-renumbering",
> ICMP6_ROUTER_RENUMBERING),
> + SYMBOL("mld2-listener-report",  ICMPV6_MLD2_REPORT),
> + SYMBOL("ind-neighbor-solicit",  IND_NEIGHBOR_SOLICIT),
> + SYMBOL("ind-neighbor-advert",   IND_NEIGHBOR_ADVERT),
>   SYMBOL_LIST_END
>   },
>  };
> -- 
> 2.11.0
> 


Re: [PATCH nf] netfilter: nft_ct: do cleanup work when NFTA_CT_DIRECTION is invalid

2017-03-15 Thread Pablo Neira Ayuso
On Wed, Mar 15, 2017 at 03:52:31PM +0100, Florian Westphal wrote:
> Liping Zhang  wrote:
> > From: Liping Zhang 
> > 
> > We should jump to invoke __nft_ct_set_destroy() instead of just
> > return error.
> > 
> > Fixes: edee4f1e9245 ("netfilter: nft_ct: add zone id set support")
> > Signed-off-by: Liping Zhang 
> 
> Indeed, good catch, thanks!
> 
> Acked-by: Florian Westphal 

Applied, thanks.


[nft PATCH] proto: Add some exotic ICMPv6 types

2017-03-15 Thread Phil Sutter
This adds support for matching on inverse ND messages as defined by
RFC3122 (not implemented in Linux) and MLDv2 as defined by RFC3810.

Note that ICMPV6_MLD2_REPORT macro is defined in linux/icmpv6.h but
including that header leads to conflicts with symbols defined in
netinet/icmp6.h.

In addition to the above, "mld-listener-done" is introduced as an alias
for "mld-listener-reduction".
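
An alias of this kind means two names share one numeric value. The
sketch below uses a simplified, hypothetical table (struct sym,
sym_value() and sym_name() are illustration only and are not nft's
symbol_table API) to show the consequence: both names resolve to 132,
while a reverse lookup that takes the first match reports only one of
them.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified symbol table entry; not nft's struct. */
struct sym {
	const char *name;
	unsigned int value;
};

/* ICMPv6 type 132 is the MLD listener done/reduction message. */
static const struct sym icmp6_types[] = {
	{ "mld-listener-report",    131 },
	{ "mld-listener-reduction", 132 },
	{ "mld-listener-done",      132 },	/* alias, same value */
	{ "mld2-listener-report",   143 },
	{ NULL, 0 },
};

/* Name to value: every alias resolves. Returns -1 if the name is unknown. */
static inline int sym_value(const struct sym *tbl, const char *name)
{
	for (; tbl->name != NULL; tbl++)
		if (strcmp(tbl->name, name) == 0)
			return (int)tbl->value;
	return -1;
}

/* Value to name: in this sketch the first matching entry wins. */
static inline const char *sym_name(const struct sym *tbl, unsigned int value)
{
	for (; tbl->name != NULL; tbl++)
		if (tbl->value == value)
			return tbl->name;
	return NULL;
}
```

In this sketch, sym_name(icmp6_types, 132) returns
"mld-listener-reduction" because it comes first in the table, so the
alias is accepted on input but never printed.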

Signed-off-by: Phil Sutter 
---
This should resolve netfilter BZ#926.
---
 src/proto.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/proto.c b/src/proto.c
index fb965304e59d9..6a8eed936d858 100644
--- a/src/proto.c
+++ b/src/proto.c
@@ -632,6 +632,10 @@ const struct proto_desc proto_ip = {
 
 #include <netinet/icmp6.h>
 
+#define IND_NEIGHBOR_SOLICIT   141
+#define IND_NEIGHBOR_ADVERT142
+#define ICMPV6_MLD2_REPORT 143
+
 static const struct symbol_table icmp6_type_tbl = {
.base   = BASE_DECIMAL,
.symbols= {
@@ -644,12 +648,16 @@ static const struct symbol_table icmp6_type_tbl = {
SYMBOL("mld-listener-query",MLD_LISTENER_QUERY),
SYMBOL("mld-listener-report",   MLD_LISTENER_REPORT),
SYMBOL("mld-listener-reduction",MLD_LISTENER_REDUCTION),
+   SYMBOL("mld-listener-done", MLD_LISTENER_REDUCTION),
SYMBOL("nd-router-solicit", ND_ROUTER_SOLICIT),
SYMBOL("nd-router-advert",  ND_ROUTER_ADVERT),
SYMBOL("nd-neighbor-solicit",   ND_NEIGHBOR_SOLICIT),
SYMBOL("nd-neighbor-advert",ND_NEIGHBOR_ADVERT),
SYMBOL("nd-redirect",   ND_REDIRECT),
SYMBOL("router-renumbering",    ICMP6_ROUTER_RENUMBERING),
+   SYMBOL("mld2-listener-report",  ICMPV6_MLD2_REPORT),
+   SYMBOL("ind-neighbor-solicit",  IND_NEIGHBOR_SOLICIT),
+   SYMBOL("ind-neighbor-advert",   IND_NEIGHBOR_ADVERT),
SYMBOL_LIST_END
},
 };
-- 
2.11.0



[PATCH v2 nftables 4/7] src: implement add/create/delete for ct helper objects

2017-03-15 Thread Florian Westphal
Signed-off-by: Florian Westphal 
---
 no changes since v1.
 include/rule.h |  4 
 src/evaluate.c |  4 
 src/parser_bison.y | 63 --
 src/rule.c | 22 +++
 4 files changed, 91 insertions(+), 2 deletions(-)

diff --git a/include/rule.h b/include/rule.h
index b791cc0a497c..fb4606406a94 100644
--- a/include/rule.h
+++ b/include/rule.h
@@ -370,6 +370,7 @@ enum cmd_obj {
CMD_OBJ_COUNTERS,
CMD_OBJ_QUOTA,
CMD_OBJ_QUOTAS,
+   CMD_OBJ_CT_HELPER,
CMD_OBJ_CT_HELPERS,
 };
 
@@ -438,6 +439,9 @@ struct cmd {
 extern struct cmd *cmd_alloc(enum cmd_ops op, enum cmd_obj obj,
 const struct handle *h, const struct location *loc,
 void *data);
+extern struct cmd *cmd_alloc_obj_ct(enum cmd_ops op, int type,
+   const struct handle *h,
+   const struct location *loc, void *data);
 extern void cmd_free(struct cmd *cmd);
 
 #include 
diff --git a/src/evaluate.c b/src/evaluate.c
index 20f67ee784dd..8fb716c06244 100644
--- a/src/evaluate.c
+++ b/src/evaluate.c
@@ -2911,6 +2911,7 @@ static int cmd_evaluate_add(struct eval_ctx *ctx, struct cmd *cmd)
return table_evaluate(ctx, cmd->table);
case CMD_OBJ_COUNTER:
case CMD_OBJ_QUOTA:
+   case CMD_OBJ_CT_HELPER:
return 0;
default:
BUG("invalid command object type %u\n", cmd->obj);
@@ -2934,6 +2935,7 @@ static int cmd_evaluate_delete(struct eval_ctx *ctx, struct cmd *cmd)
case CMD_OBJ_TABLE:
case CMD_OBJ_COUNTER:
case CMD_OBJ_QUOTA:
+   case CMD_OBJ_CT_HELPER:
return 0;
default:
BUG("invalid command object type %u\n", cmd->obj);
@@ -3021,6 +3023,8 @@ static int cmd_evaluate_list(struct eval_ctx *ctx, struct cmd *cmd)
return cmd_evaluate_list_obj(ctx, cmd, NFT_OBJECT_QUOTA);
case CMD_OBJ_COUNTER:
return cmd_evaluate_list_obj(ctx, cmd, NFT_OBJECT_COUNTER);
+   case CMD_OBJ_CT_HELPER:
+   return cmd_evaluate_list_obj(ctx, cmd, NFT_OBJECT_CT_HELPER);
case CMD_OBJ_COUNTERS:
case CMD_OBJ_QUOTAS:
case CMD_OBJ_CT_HELPERS:
diff --git a/src/parser_bison.y b/src/parser_bison.y
index 1bcbff598ad7..5d3d10694823 100644
--- a/src/parser_bison.y
+++ b/src/parser_bison.y
@@ -583,8 +583,8 @@ static void location_update(struct location *loc, struct location *rhs, int n)
 %type <expr>   and_rhs_expr exclusive_or_rhs_expr inclusive_or_rhs_expr
 %destructor { expr_free($$); } and_rhs_expr exclusive_or_rhs_expr inclusive_or_rhs_expr
 
-%type <obj>    counter_obj quota_obj
-%destructor { obj_free($$); }  counter_obj quota_obj
+%type <obj>    counter_obj quota_obj ct_obj_alloc
+%destructor { obj_free($$); }  counter_obj quota_obj ct_obj_alloc
 
 %type <expr>   relational_expr
 %destructor { expr_free($$); } relational_expr
@@ -840,6 +840,19 @@ add_cmd:   TABLE   table_spec
{
$$ = cmd_alloc(CMD_ADD, CMD_OBJ_QUOTA, &$2, &@$, $3);
}
+   |   CT  STRING  obj_spec    ct_obj_alloc    '{' ct_block '}'    stmt_seperator
+   {
+   struct error_record *erec;
+   int type;
+
+   erec = ct_objtype_parse(&@$, $2, &type);
+   if (erec != NULL) {
+   erec_queue(erec, state->msgs);
+   YYERROR;
+   }
+
+   $$ = cmd_alloc_obj_ct(CMD_ADD, type, &$3, &@$, $4);
+   }
;
 
 replace_cmd:   RULEruleid_spec rule
@@ -906,6 +919,19 @@ create_cmd :   TABLE   table_spec
{
$$ = cmd_alloc(CMD_CREATE, CMD_OBJ_QUOTA, &$2, &@$, $3);
}
+   |   CT  STRING  obj_spec    ct_obj_alloc    '{' ct_block '}'    stmt_seperator
+   {
+   struct error_record *erec;
+   int type;
+
+   erec = ct_objtype_parse(&@$, $2, &type);
+   if (erec != NULL) {
+   erec_queue(erec, state->msgs);
+   YYERROR;
+   }
+
+   $$ = cmd_alloc_obj_ct(CMD_CREATE, type, &$3, &@$, $4);
+   }
;
 
 insert_cmd :   RULErule_position   

[PATCH v2 nftables 7/7] doc: ct helper objects and helper set support

2017-03-15 Thread Florian Westphal
Signed-off-by: Florian Westphal 
---
 changes since v1:
 use  in cmdsynopsis, update example

 doc/nft.xml | 76 +
 1 file changed, 76 insertions(+)

diff --git a/doc/nft.xml b/doc/nft.xml
index 8ea280417742..80f201e89d37 100644
--- a/doc/nft.xml
+++ b/doc/nft.xml
@@ -950,6 +950,77 @@ filter input iif $int_ifs accept

 

+   Ct
+   
+   
+   ct
+   helper
+   type
+   type
+   protocol
+   protocol
+   l3proto
+   family
+   
+   
+   
+   Ct helper is used to define connection tracking helpers that can then be used in combination with the "ct helper set" statement.
+   type and protocol are mandatory, l3proto is derived from the table family by default, i.e. in the inet table the kernel will
+   try to load both the ipv4 and ipv6 helper backends, if they are supported by the kernel.
+   
+   
+   conntrack helper specifications
+   
+   
+   
+   
+   
+   
+   Keyword
+   
Description
+   Type
+   
+   
+   
+   
+   type
+   name of helper 
type
+   quoted string 
(e.g. "ftp")
+   
+   
+   protocol
+   layer 4 protocol 
of the helper
+   string (e.g. 
tcp)
+   
+   
+   l3proto
+   layer 3 protocol 
of the helper
+   address family 
(e.g. ip)
+   
+   
+   
+   
+   
+   defining and assigning ftp helper
+   
+   Unlike iptables, helper assignment needs to be performed after the conntrack lookup has completed, for example
+   with the default 0 hook priority.
+   
+   
+table inet myhelpers {
+  ct helper ftp-standard {
+ type "ftp" protocol tcp
+  }
+  chain prerouting {
+  type filter hook prerouting priority 0;
+  tcp dport 21 ct helper set "ftp-standard"
+  }
+}
+   
+   
+   
+
+   
Counter


@@ -3376,6 +3447,11 @@ ip6 filter output log flags all



+   
helper
+   name of 
ct helper object to assign to the connection
+   quoted 
string
+   
+   

mark

Connection tracking mark

mark
-- 
2.10.2


[PATCH v2 nftables 1/7] src: add initial ct helper support

2017-03-15 Thread Florian Westphal
This adds initial support for defining conntrack helper objects
which can then be assigned to connections using the objref infrastructure:

table ip filter {
  ct helper ftp-standard {
type "ftp" protocol tcp
  }
  chain y {
 tcp dport 21 ct helper set "ftp-standard"
  }
}

Signed-off-by: Florian Westphal 
---
 Changes since v1:
   - tweak user syntax
   - minor cleanup
   - fix parsing of l3proto: v1 used to set l4proto instead of l3.

 include/ct.h|  1 +
 include/linux/netfilter/nf_tables.h | 12 +-
 include/rule.h  |  7 
 src/ct.c| 10 +
 src/netlink.c   | 16 
 src/parser_bison.y  | 74 -
 src/rule.c  | 21 ++-
 src/statement.c | 10 -
 8 files changed, 146 insertions(+), 5 deletions(-)

diff --git a/include/ct.h b/include/ct.h
index 03e76e619e23..ae900ee4fb61 100644
--- a/include/ct.h
+++ b/include/ct.h
@@ -31,6 +31,7 @@ extern struct error_record *ct_dir_parse(const struct location *loc,
 const char *str, int8_t *dir);
 extern struct error_record *ct_key_parse(const struct location *loc, const char *str,
 unsigned int *key);
+extern struct error_record *ct_objtype_parse(const struct location *loc, const char *str, int *type);
 
 extern struct stmt *notrack_stmt_alloc(const struct location *loc);
 
diff --git a/include/linux/netfilter/nf_tables.h b/include/linux/netfilter/nf_tables.h
index a9280a6541ac..8f3842690d17 100644
--- a/include/linux/netfilter/nf_tables.h
+++ b/include/linux/netfilter/nf_tables.h
@@ -1260,10 +1260,20 @@ enum nft_fib_flags {
NFTA_FIB_F_PRESENT  = 1 << 5,   /* check existence only */
 };
 
+enum nft_ct_helper_attributes {
+   NFTA_CT_HELPER_UNSPEC,
+   NFTA_CT_HELPER_NAME,
+   NFTA_CT_HELPER_L3PROTO,
+   NFTA_CT_HELPER_L4PROTO,
+   __NFTA_CT_HELPER_MAX,
+};
+#define NFTA_CT_HELPER_MAX (__NFTA_CT_HELPER_MAX - 1)
+
 #define NFT_OBJECT_UNSPEC  0
 #define NFT_OBJECT_COUNTER 1
 #define NFT_OBJECT_QUOTA   2
-#define __NFT_OBJECT_MAX   3
+#define NFT_OBJECT_CT_HELPER   3
+#define __NFT_OBJECT_MAX   4
 #define NFT_OBJECT_MAX (__NFT_OBJECT_MAX - 1)
 
 /**
diff --git a/include/rule.h b/include/rule.h
index ed12774d0ba7..d89a963dfd05 100644
--- a/include/rule.h
+++ b/include/rule.h
@@ -260,6 +260,12 @@ struct quota {
uint32_tflags;
 };
 
+struct ct {
+   char helper_name[16];
+   uint16_t l3proto;
+   uint8_t l4proto;
+};
+
 /**
  * struct obj - nftables stateful object statement
  *
@@ -277,6 +283,7 @@ struct obj {
union {
struct counter  counter;
struct quotaquota;
+   struct ct   ct;
};
 };
 
diff --git a/src/ct.c b/src/ct.c
index 83fceff67139..fd8ca87a21fb 100644
--- a/src/ct.c
+++ b/src/ct.c
@@ -353,6 +353,16 @@ struct error_record *ct_key_parse(const struct location *loc, const char *str,
return error(loc, "syntax error, unexpected %s, known keys are %s", str, buf);
 }
 
+struct error_record *ct_objtype_parse(const struct location *loc, const char *str, int *type)
+{
+   if (strcmp(str, "helper") == 0) {
+   *type = NFT_OBJECT_CT_HELPER;
+   return NULL;
+   }
+
+   return error(loc, "unknown ct class '%s', want 'helper'", str);
+}
+
 struct expr *ct_expr_alloc(const struct location *loc, enum nft_ct_keys key,
   int8_t direction)
 {
diff --git a/src/netlink.c b/src/netlink.c
index fb6d2876a6f1..6fbb67da7f76 100644
--- a/src/netlink.c
+++ b/src/netlink.c
@@ -317,6 +317,15 @@ alloc_nftnl_obj(const struct handle *h, struct obj *obj)
nftnl_obj_set_u32(nlo, NFTNL_OBJ_QUOTA_FLAGS,
  obj->quota.flags);
break;
+   case NFT_OBJECT_CT_HELPER:
+   nftnl_obj_set_str(nlo, NFTNL_OBJ_CT_HELPER_NAME,
+ obj->ct.helper_name);
+   nftnl_obj_set_u8(nlo, NFTNL_OBJ_CT_HELPER_L4PROTO,
+ obj->ct.l4proto);
+   if (obj->ct.l3proto)
+   nftnl_obj_set_u16(nlo, NFTNL_OBJ_CT_HELPER_L3PROTO,
+ obj->ct.l3proto);
+   break;
default:
BUG("Unknown type %d\n", obj->type);
break;
@@ -1814,6 +1823,13 @@ static struct obj *netlink_delinearize_obj(struct netlink_ctx *ctx,
nftnl_obj_get_u64(nlo, NFTNL_OBJ_QUOTA_CONSUMED);
obj->quota.flags =
nftnl_obj_get_u32(nlo, NFTNL_OBJ_QUOTA_FLAGS);
+   break;
+   case NFT_OBJECT_CT_HELPER:
+   snprintf(obj->ct.helper_name, 

[PATCH v2 nftables 0/7] ct helper set support

2017-03-15 Thread Florian Westphal
v2, with updated syntax to force type and protocol keywords
into same statement, i.e.

ct helper ftp-standard {
  type "ftp" protocol tcp
}

I also cleaned up the changes to bison (reuse family_spec_explicit)
and added a test for an invalid helper (l3proto ip6 in ip table).

 doc/nft.xml |   76 +
 include/ct.h|1 
 include/linux/netfilter/nf_tables.h |   12 ++
 include/rule.h  |   12 ++
 src/ct.c|   10 ++
 src/evaluate.c  |   37 +---
 src/netlink.c   |   16 +++
 src/parser_bison.y  |  156 +++-
 src/rule.c  |   45 ++
 src/statement.c |   10 ++
 tests/py/ip/objects.t   |5 +
 tests/py/ip/objects.t.payload   |   14 +++
 tests/py/nft-test.py|   28 +-
 13 files changed, 399 insertions(+), 23 deletions(-)


[PATCH v2 nftables 3/7] src: allow listing all ct helpers

2017-03-15 Thread Florian Westphal
this implements
nft list ct helpers table filter
table ip filter {
ct helper ftp-standard {
..

Signed-off-by: Florian Westphal 
---
 no changes since v1.
 include/rule.h |  1 +
 src/evaluate.c |  1 +
 src/parser_bison.y | 19 +++
 src/rule.c |  2 ++
 4 files changed, 23 insertions(+)

diff --git a/include/rule.h b/include/rule.h
index d89a963dfd05..b791cc0a497c 100644
--- a/include/rule.h
+++ b/include/rule.h
@@ -370,6 +370,7 @@ enum cmd_obj {
CMD_OBJ_COUNTERS,
CMD_OBJ_QUOTA,
CMD_OBJ_QUOTAS,
+   CMD_OBJ_CT_HELPERS,
 };
 
 struct export {
diff --git a/src/evaluate.c b/src/evaluate.c
index ae30bc9bb3b9..20f67ee784dd 100644
--- a/src/evaluate.c
+++ b/src/evaluate.c
@@ -3023,6 +3023,7 @@ static int cmd_evaluate_list(struct eval_ctx *ctx, struct cmd *cmd)
return cmd_evaluate_list_obj(ctx, cmd, NFT_OBJECT_COUNTER);
case CMD_OBJ_COUNTERS:
case CMD_OBJ_QUOTAS:
+   case CMD_OBJ_CT_HELPERS:
if (cmd->handle.table == NULL)
return 0;
if (table_lookup(&cmd->handle) == NULL)
diff --git a/src/parser_bison.y b/src/parser_bison.y
index 2cf732ce818f..1bcbff598ad7 100644
--- a/src/parser_bison.y
+++ b/src/parser_bison.y
@@ -1016,6 +1016,25 @@ list_cmd :   TABLE   table_spec
{
$$ = cmd_alloc(CMD_LIST, CMD_OBJ_MAP, &$2, &@$, NULL);
}
+   |   CT  STRING  TABLE   table_spec
+   {
+   int cmd;
+
+   if (strcmp($2, "helpers") == 0) {
+   cmd = CMD_OBJ_CT_HELPERS;
+   } else {
+   struct error_record *erec;
+
+   erec = error(&@$, "unknown ct class '%s', want 'helpers'", $2);
+
+   if (erec != NULL) {
+   erec_queue(erec, state->msgs);
+   YYERROR;
+   }
+   }
+
+   $$ = cmd_alloc(CMD_LIST, cmd, &$4, &@$, NULL);
+   }
;
 
 reset_cmd  :   COUNTERSruleset_spec
diff --git a/src/rule.c b/src/rule.c
index 17c20f35398a..453aa2f2cc9c 100644
--- a/src/rule.c
+++ b/src/rule.c
@@ -1455,6 +1455,8 @@ static int do_command_list(struct netlink_ctx *ctx, struct cmd *cmd)
case CMD_OBJ_QUOTA:
case CMD_OBJ_QUOTAS:
return do_list_obj(ctx, cmd, NFT_OBJECT_QUOTA);
+   case CMD_OBJ_CT_HELPERS:
+   return do_list_obj(ctx, cmd, NFT_OBJECT_CT_HELPER);
default:
BUG("invalid command object type %u\n", cmd->obj);
}
-- 
2.10.2



[PATCH v2 nftables 6/7] tests: add insert-failure test

2017-03-15 Thread Florian Westphal
It should not be possible to add an ip6-restricted helper to the ip family.

Signed-off-by: Florian Westphal 
---
 not part of v1 series.

 tests/py/ip/objects.t |  1 +
 tests/py/nft-test.py  | 17 ++---
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/tests/py/ip/objects.t b/tests/py/ip/objects.t
index ec8e8fd916d4..742ec6af2572 100644
--- a/tests/py/ip/objects.t
+++ b/tests/py/ip/objects.t
@@ -6,6 +6,7 @@
 %cnt2 type counter;ok
 %qt1 type quota 25 mbytes;ok
 %qt2 type quota over 1 kbytes;ok
+%cthelp2 type ct helper { type \"ftp\" protocol tcp\; l3proto ip6\; };fail
 
 ip saddr 192.168.1.3 counter name "cnt2";ok
 ip saddr 192.168.1.3 counter name "cnt3";fail
diff --git a/tests/py/nft-test.py b/tests/py/nft-test.py
index b22404076edd..8d1df3bc517a 100755
--- a/tests/py/nft-test.py
+++ b/tests/py/nft-test.py
@@ -517,12 +517,23 @@ def obj_add(o, test_result, filename, lineno):
 print_error(reason, filename, lineno)
 return -1
 
-if not _obj_exist(o, filename, lineno):
-reason = "I have just added the " + obj_handle + \
- " to the table " + table.name + " but it does not exist"
+exist = _obj_exist(o, filename, lineno)
+
+if exist:
+if test_result == "ok":
+ return 0
+reason = "I added the " + obj_handle + \
+ " to the table " + table.name + " but it should have failed"
 print_error(reason, filename, lineno)
 return -1
 
+if test_result == "fail":
+return 0
+
+reason = "I have just added the " + obj_handle + \
+ " to the table " + table.name + " but it does not exist"
+print_error(reason, filename, lineno)
+return -1
 
 def obj_delete(table, filename=None, lineno=None):
 '''
-- 
2.10.2



[PATCH v2 nftables 5/7] tests: py: add ct helper tests

2017-03-15 Thread Florian Westphal
This needs a minor tweak to nft-test.py so we don't zap the ';' within the {}.

Signed-off-by: Florian Westphal 
---
 no changes since v1.
 tests/py/ip/objects.t |  4 
 tests/py/ip/objects.t.payload | 14 ++
 tests/py/nft-test.py  | 11 ++-
 3 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/tests/py/ip/objects.t b/tests/py/ip/objects.t
index 8109402da8ba..ec8e8fd916d4 100644
--- a/tests/py/ip/objects.t
+++ b/tests/py/ip/objects.t
@@ -13,3 +13,7 @@ counter name tcp dport map {443 : "cnt1", 80 : "cnt2", 22 : "cnt1"};ok
 ip saddr 192.168.1.3 quota name "qt1";ok
 ip saddr 192.168.1.3 quota name "qt3";fail
 quota name tcp dport map {443 : "qt1", 80 : "qt2", 22 : "qt1"};ok
+
+%cthelp1 type ct helper { type \"ftp\" protocol tcp\; };ok
+ct helper set "cthelp1";ok
+ct helper set tcp dport map {21 : "cthelp1", 2121 : "cthelp1" };ok
diff --git a/tests/py/ip/objects.t.payload b/tests/py/ip/objects.t.payload
index b5cad4d1e3fc..6499d36348fe 100644
--- a/tests/py/ip/objects.t.payload
+++ b/tests/py/ip/objects.t.payload
@@ -29,3 +29,17 @@ ip test-ip4 output
   [ cmp eq reg 1 0x0006 ]
   [ payload load 2b @ transport header + 2 => reg 1 ]
   [ objref sreg 1 set __objmap%d id 1 ]
+
+# ct helper set "cthelp1"
+ip test-ip4 output
+  [ objref type 3 name cthelp1 ]
+
+# ct helper set tcp dport map {21 : "cthelp1", 2121 : "cthelp1" }
+__objmap%d test-ip4 43
+__objmap%d test-ip4 0
+element 1500  : 0 [end] element 4908  : 0 [end]
+ip test-ip4 output
+  [ payload load 1b @ network header + 9 => reg 1 ]
+  [ cmp eq reg 1 0x0006 ]
+  [ payload load 2b @ transport header + 2 => reg 1 ]
+  [ objref sreg 1 set __objmap%d id 1 ]
diff --git a/tests/py/nft-test.py b/tests/py/nft-test.py
index 25009217e51d..b22404076edd 100755
--- a/tests/py/nft-test.py
+++ b/tests/py/nft-test.py
@@ -885,6 +885,10 @@ def obj_process(obj_line, filename, lineno):
 obj_type = tokens[2]
 obj_spcf = ""
 
+if obj_type == "ct" and tokens[3] == "helper":
+   obj_type = "ct helper"
+   tokens[3] = ""
+
 if len(tokens) > 3:
 obj_spcf = " ".join(tokens[3:])
 
@@ -985,7 +989,12 @@ def run_test_file(filename, force_all_family_option, specific_file):
 continue
 
 if line[0] == "%":  # Adds this object
-obj_line = line.rstrip()[1:].split(";")
+brace = line.rfind("}")
+if brace < 0:
+obj_line = line.rstrip()[1:].split(";")
+else:
+obj_line = (line[1:brace+1], line[brace+2:].rstrip())
+
 ret = obj_process(obj_line, filename, lineno)
 tests += 1
 if ret == -1:
-- 
2.10.2



Re: [PATCH nf] netfilter: nft_ct: do cleanup work when NFTA_CT_DIRECTION is invalid

2017-03-15 Thread Florian Westphal
Liping Zhang  wrote:
> From: Liping Zhang 
> 
> We should jump to invoke __nft_ct_set_destroy() instead of just
> return error.
> 
> Fixes: edee4f1e9245 ("netfilter: nft_ct: add zone id set support")
> Signed-off-by: Liping Zhang 

Indeed, good catch, thanks!

Acked-by: Florian Westphal 


Re: [PATCH net] bridge: ebtables: fix reception of frames DNAT-ed to bridge device

2017-03-15 Thread Linus Lüssing
On Wed, Mar 15, 2017 at 11:42:11AM +0100, Pablo Neira Ayuso wrote:
> I'm missing then why redirect is not then just enough for Linus usecase.

For my usecase, the MAC address is configured by the user from a
Web-UI. It may or may not be the one from the bridge device.

Besides, I found it counterintuitive that DNAT did not work here,
and it took me some time to find out why. At least I didn't read about
any such known limitations of the dnat target in the ebtables
manpage.


[PATCH nf] netfilter: nft_ct: do cleanup work when NFTA_CT_DIRECTION is invalid

2017-03-15 Thread Liping Zhang
From: Liping Zhang 

We should jump to invoke __nft_ct_set_destroy() instead of just
return error.
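
The pattern in question is the usual "goto cleanup" error path. The
following is a standalone sketch under stated assumptions, not the
nft_ct code: struct demo_res, demo_init(), demo_res_destroy() and the
live_allocations counter are all hypothetical names used only to make
the leak observable.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical resource standing in for whatever init acquired earlier. */
struct demo_res {
	int *buf;
};

/* Counts buffers that were allocated and not yet released. */
static int live_allocations;

static inline void demo_res_destroy(struct demo_res *res)
{
	if (res->buf != NULL) {
		free(res->buf);
		res->buf = NULL;
		live_allocations--;
	}
}

/* direction must be 0 or 1; anything else is rejected after cleanup. */
static inline int demo_init(struct demo_res *res, int direction)
{
	int err;

	res->buf = malloc(sizeof(*res->buf));
	if (res->buf == NULL)
		return -ENOMEM;
	live_allocations++;

	switch (direction) {
	case 0:
	case 1:
		break;
	default:
		err = -EINVAL;
		goto err1;	/* a bare "return -EINVAL" would leak res->buf */
	}

	*res->buf = direction;
	return 0;

err1:
	demo_res_destroy(res);
	return err;
}
```

Calling demo_init() with an invalid direction returns -EINVAL and leaves
live_allocations at zero, which is the behaviour the fix restores for
the real init function.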

Fixes: edee4f1e9245 ("netfilter: nft_ct: add zone id set support")
Signed-off-by: Liping Zhang 
---
 net/netfilter/nft_ct.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/netfilter/nft_ct.c b/net/netfilter/nft_ct.c
index 91585b5..0264258 100644
--- a/net/netfilter/nft_ct.c
+++ b/net/netfilter/nft_ct.c
@@ -544,7 +544,8 @@ static int nft_ct_set_init(const struct nft_ctx *ctx,
case IP_CT_DIR_REPLY:
break;
default:
-   return -EINVAL;
+   err = -EINVAL;
+   goto err1;
}
}
 
-- 
2.5.5




Re: [PATCH] netfilter: logging copyrights is useless

2017-03-15 Thread Harald Welte
Hi Corentin,

On Wed, Mar 15, 2017 at 02:17:39PM +0100, Corentin Labbe wrote:
> Logging copyrights does not add any useful information in logs.
> This patch remove such logging

Historically, there were plenty more copyright notices for certain
drivers or sections of the code being printed while booting.  I still
fondly remember the many ethernet driver notices of a Donald Becker, for
example.

I understand that it is questionable whether or not such statements are
"useful".  You might argue, their use is in
* stating a legal formality to the user
* making it simpler to determine if a given part of code is used in a
  given device (e.g. as part of GPL enforcement) while just logging the
  serial console and no requirement to (find a way to) dump the internal
  flash

Besides such practical arguments (what use they are, and to whom), there are
legal concerns regarding the removal of copyright statements.  This
holds true whether or not it is Free Software, and whether or not it
is GPL licensed.  If an author puts a copyright statement somewhere, he
exercises his right to be regarded as the author of the work.  It is
typically not permitted to remove such notices, as that would be a
copyright infringement in itself.

Also, beyond general legal concerns, the GPLv2 states explicitly:

> c) If the modified program normally reads commands interactively
> when run, you must cause it, when started running for such
> interactive use in the most ordinary way, to print or display an
> announcement including an appropriate copyright notice and a notice
> that there is no warranty (or else, saying that you provide a
> warranty) and that users may redistribute the program under these
> conditions, and telling the user how to view a copy of this License.
> (Exception: if the Program itself is interactive but does not normally
> print such an announcement, your work based on the Program is not
> required to print an announcement.)

Now you can argue whether the kernel is an interactive program, but at
least you can see some intent to not remove any notices/messages that
were originally present in the program.

So I think your patch could only be applied if the respective copyright
holders agree to remove their respective notices.

I personally would argue to keep them.  Nobody has complained about them
so far, and they have probably saved many weeks of my work time in GPL
compliance / enforcement work.  I understand this is a "niche use case",
though ;)

-- 
- Harald Welte  http://netfilter.org/

  "Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed."-- Paul Vixie


[PATCH iptables 2/2] iptables-restore: support acquiring the lock.

2017-03-15 Thread Lorenzo Colitti
Currently, ip[6]tables-restore does not perform any locking, so it
is not safe to use concurrently with ip[6]tables.

This patch makes ip[6]tables-restore wait for the lock if -w
was specified. Arguments to -w and -W are supported in the same
way as they are in ip[6]tables.

The lock is not acquired on startup. Instead, it is acquired when
a new table handle is created (on encountering '*') and released
when the table is committed (COMMIT). This makes it possible to
keep long-running iptables-restore processes in the background
(for example, reading commands from a pipe opened by a system
management daemon) and simultaneously run iptables commands.

If -w is not specified, then the command proceeds without taking
the lock.
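
The per-table locking described above can be sketched roughly as follows. This is an illustration, not the patch's code: try_lock(), unlock() and handle_line() are hypothetical names, the lock path is supplied by the caller, and the -w/-W retry loop is left out.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper: try once to take an exclusive lock on `path`.
 * Returns the lock fd on success, -1 if open fails or the lock is busy. */
static int try_lock(const char *path)
{
	int fd = open(path, O_RDONLY | O_CREAT, 0600);

	if (fd < 0)
		return -1;
	if (flock(fd, LOCK_EX | LOCK_NB) != 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Closing the descriptor drops the flock() lock. */
static void unlock(int *fd)
{
	if (*fd >= 0) {
		close(*fd);
		*fd = -1;
	}
}

/* Feed one input line: take the lock when a table starts ("*name"),
 * release it on "COMMIT". Returns 0 on success, -1 if the lock is busy. */
static int handle_line(const char *path, const char *line, int *lock_fd)
{
	if (line[0] == '*' && *lock_fd < 0) {
		*lock_fd = try_lock(path);
		return *lock_fd >= 0 ? 0 : -1;
	}
	if (strncmp(line, "COMMIT", 6) == 0)
		unlock(lock_fd);
	return 0;
}
```

Because the lock is only held between '*' and COMMIT, a long-running restore process that is idle between tables does not block other lock users.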

Tested as follows:

1. Run iptables-restore -w, and check that iptables commands work
   with or without -w.
2. Type "*filter" into the iptables-restore input. Verify that
   a) ip[6]tables commands without -w fail with "another app is
  currently holding the xtables lock...".
   b) ip[6]tables commands with "-w 2" fail after 2 seconds.
   c) ip[6]tables commands with "-w" hang until "COMMIT" is
  typed into the iptables-restore window.
3. With the lock held by an ip6tables-restore process:
 strace -e flock /tmp/iptables/sbin/iptables-restore -w 1 -W 10
   shows 11 calls to flock and fails.

Signed-off-by: Narayan Kamath 
Signed-off-by: Lorenzo Colitti 
---
 iptables/ip6tables-restore.c | 55 ++--
 iptables/ip6tables.c |  2 +-
 iptables/iptables-restore.c  | 55 ++--
 iptables/iptables.c  |  2 +-
 iptables/xshared.c   | 18 ++-
 iptables/xshared.h   | 23 +-
 6 files changed, 122 insertions(+), 33 deletions(-)

diff --git a/iptables/ip6tables-restore.c b/iptables/ip6tables-restore.c
index dc0acb05a4..8a47f09c95 100644
--- a/iptables/ip6tables-restore.c
+++ b/iptables/ip6tables-restore.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include "ip6tables.h"
+#include "xshared.h"
 #include "xtables.h"
 #include "libiptc/libip6tc.h"
 #include "ip6tables-multi.h"
@@ -25,17 +26,23 @@
 #define DEBUGP(x, args...)
 #endif
 
-static int counters = 0, verbose = 0, noflush = 0;
+static int counters = 0, verbose = 0, noflush = 0, wait = 0;
+
+static struct timeval wait_interval = {
+   .tv_sec = 1,
+};
 
 /* Keeping track of external matches and targets.  */
 static const struct option options[] = {
-   {.name = "counters", .has_arg = false, .val = 'c'},
-   {.name = "verbose",  .has_arg = false, .val = 'v'},
-   {.name = "test", .has_arg = false, .val = 't'},
-   {.name = "help", .has_arg = false, .val = 'h'},
-   {.name = "noflush",  .has_arg = false, .val = 'n'},
-   {.name = "modprobe", .has_arg = true,  .val = 'M'},
-   {.name = "table",.has_arg = true,  .val = 'T'},
+   {.name = "counters",  .has_arg = 0, .val = 'c'},
+   {.name = "verbose",   .has_arg = 0, .val = 'v'},
+   {.name = "test",  .has_arg = 0, .val = 't'},
+   {.name = "help",  .has_arg = 0, .val = 'h'},
+   {.name = "noflush",   .has_arg = 0, .val = 'n'},
+   {.name = "modprobe",  .has_arg = 1, .val = 'M'},
+   {.name = "table", .has_arg = 1, .val = 'T'},
+   {.name = "wait",  .has_arg = 2, .val = 'w'},
+   {.name = "wait-interval", .has_arg = 2, .val = 'W'},
{NULL},
 };
 
@@ -43,14 +50,16 @@ static void print_usage(const char *name, const char *version) __attribute__((no
 
 static void print_usage(const char *name, const char *version)
 {
-   fprintf(stderr, "Usage: %s [-c] [-v] [-t] [-h] [-n] [-T table] [-M command]\n"
+   fprintf(stderr, "Usage: %s [-c] [-v] [-t] [-h] [-n] [-w secs] [-W usecs] [-T table] [-M command]\n"
"  [ --counters ]\n"
"  [ --verbose ]\n"
"  [ --test ]\n"
"  [ --help ]\n"
"  [ --noflush ]\n"
+   "  [ --wait=\n"
+   "  [ --wait-interval=\n"
"  [ --table= ]\n"
-   "  [ --modprobe= ]\n", name);
+   "  [ --modprobe= ]\n", name);
 
exit(1);
 }
@@ -181,7 +190,7 @@ int ip6tables_restore_main(int argc, char *argv[])
 {
struct xtc_handle *handle = NULL;
char buffer[10240];
-   int c;
+   int c, lock;
char curtable[XT_TABLE_MAXNAMELEN + 1];
FILE *in;
int in_table = 0, testing = 0;
@@ -189,6 +198,7 @@ int ip6tables_restore_main(int argc, char *argv[])
const struct xtc_ops *ops = _ops;
 
line = 0;
+   lock = XT_LOCK_NOT_ACQUIRED;
 
ip6tables_globals.program_name = "ip6tables-restore";
c = 

[PATCH iptables 1/2] iptables: remove duplicated argument parsing code

2017-03-15 Thread Lorenzo Colitti
1. Factor out repeated code to a new xs_has_arg function.
2. Add a new parse_wait_time function to parse the value of -w.
3. Make parse_wait_interval take argc and argv so its callers
   can be simpler.
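
The check being factored out is the three-part condition repeated throughout the removed lines of the diff. A minimal sketch of such a helper, assuming it only needs to inspect the next argument: has_plain_arg() is a hypothetical name, and it takes the index explicitly instead of reading getopt's global optind so that it can be exercised in isolation.

```c
#include <stdbool.h>

/* Returns true if argv[idx] exists and does not start an option ('-')
 * or a negation ('!'), i.e. it can be consumed as an optional argument. */
static bool has_plain_arg(int argc, char *argv[], int idx)
{
	return idx < argc &&
	       argv[idx][0] != '-' &&
	       argv[idx][0] != '!';
}
```

A caller would then write "if (has_plain_arg(argc, argv, optind))" where the original code spelled out the condition each time.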

Signed-off-by: Lorenzo Colitti 
---
 iptables/ip6tables.c   | 62 +-
 iptables/iptables.c| 62 +-
 iptables/xshared.c | 35 ++--
 iptables/xshared.h |  4 +++-
 iptables/xtables-arp.c | 30 
 iptables/xtables.c | 58 ++
 6 files changed, 95 insertions(+), 156 deletions(-)

diff --git a/iptables/ip6tables.c b/iptables/ip6tables.c
index 0bd415dec5..4d77721b04 100644
--- a/iptables/ip6tables.c
+++ b/iptables/ip6tables.c
@@ -1400,8 +1400,7 @@ int do_command6(int argc, char *argv[], char **table,
add_command(, CMD_DELETE, CMD_NONE,
cs.invert);
chain = optarg;
-   if (optind < argc && argv[optind][0] != '-'
-   && argv[optind][0] != '!') {
+   if (xs_has_arg(argc, argv)) {
rulenum = parse_rulenumber(argv[optind++]);
command = CMD_DELETE_NUM;
}
@@ -1411,8 +1410,7 @@ int do_command6(int argc, char *argv[], char **table,
add_command(, CMD_REPLACE, CMD_NONE,
cs.invert);
chain = optarg;
-   if (optind < argc && argv[optind][0] != '-'
-   && argv[optind][0] != '!')
+   if (xs_has_arg(argc, argv))
rulenum = parse_rulenumber(argv[optind++]);
else
xtables_error(PARAMETER_PROBLEM,
@@ -1424,8 +1422,7 @@ int do_command6(int argc, char *argv[], char **table,
add_command(, CMD_INSERT, CMD_NONE,
cs.invert);
chain = optarg;
-   if (optind < argc && argv[optind][0] != '-'
-   && argv[optind][0] != '!')
+   if (xs_has_arg(argc, argv))
rulenum = parse_rulenumber(argv[optind++]);
else rulenum = 1;
break;
@@ -1434,11 +1431,9 @@ int do_command6(int argc, char *argv[], char **table,
add_command(, CMD_LIST,
CMD_ZERO | CMD_ZERO_NUM, cs.invert);
if (optarg) chain = optarg;
-   else if (optind < argc && argv[optind][0] != '-'
-&& argv[optind][0] != '!')
+   else if (xs_has_arg(argc, argv))
chain = argv[optind++];
-   if (optind < argc && argv[optind][0] != '-'
-   && argv[optind][0] != '!')
+   if (xs_has_arg(argc, argv))
rulenum = parse_rulenumber(argv[optind++]);
break;
 
@@ -1446,11 +1441,9 @@ int do_command6(int argc, char *argv[], char **table,
add_command(, CMD_LIST_RULES,
CMD_ZERO | CMD_ZERO_NUM, cs.invert);
if (optarg) chain = optarg;
-   else if (optind < argc && argv[optind][0] != '-'
-&& argv[optind][0] != '!')
+   else if (xs_has_arg(argc, argv))
chain = argv[optind++];
-   if (optind < argc && argv[optind][0] != '-'
-   && argv[optind][0] != '!')
+   if (xs_has_arg(argc, argv))
rulenum = parse_rulenumber(argv[optind++]);
break;
 
@@ -1458,8 +1451,7 @@ int do_command6(int argc, char *argv[], char **table,
add_command(, CMD_FLUSH, CMD_NONE,
cs.invert);
if (optarg) chain = optarg;
-   else if (optind < argc && argv[optind][0] != '-'
-&& argv[optind][0] != '!')
+   else if (xs_has_arg(argc, argv))
chain = argv[optind++];
break;
 
@@ -1467,11 +1459,9 @@ int do_command6(int argc, char *argv[], char **table,
add_command(, CMD_ZERO, CMD_LIST|CMD_LIST_RULES,
cs.invert);
if (optarg) chain = optarg;
-   else if (optind < argc && argv[optind][0] != '-'
-   && argv[optind][0] != '!')
+  

[PATCH] netfilter: logging copyrights is useless

2017-03-15 Thread Corentin Labbe
Logging copyrights does not add any useful information in logs.
This patch removes such logging.

Signed-off-by: Corentin Labbe 
---
 net/ipv4/netfilter/arp_tables.c | 1 -
 net/ipv4/netfilter/ip_tables.c  | 1 -
 net/ipv6/netfilter/ip6_tables.c | 1 -
 net/netfilter/nf_tables_api.c   | 1 -
 net/netfilter/nft_compat.c  | 2 --
 5 files changed, 6 deletions(-)

diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
index f17dab1..a89211f 100644
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -1647,7 +1647,6 @@ static int __init arp_tables_init(void)
if (ret < 0)
goto err4;
 
-   pr_info("arp_tables: (C) 2002 David S. Miller\n");
return 0;
 
 err4:
diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index 384b857..27a4fea 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -1933,7 +1933,6 @@ static int __init ip_tables_init(void)
if (ret < 0)
goto err5;
 
-   pr_info("(C) 2000-2006 Netfilter Core Team\n");
return 0;
 
 err5:
diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c
index 1e15c54..979acef 100644
--- a/net/ipv6/netfilter/ip6_tables.c
+++ b/net/ipv6/netfilter/ip6_tables.c
@@ -1955,7 +1955,6 @@ static int __init ip6_tables_init(void)
if (ret < 0)
goto err5;
 
-   pr_info("(C) 2000-2006 Netfilter Core Team\n");
return 0;
 
 err5:
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 4559f5d..f53d46e 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -5608,7 +5608,6 @@ static int __init nf_tables_module_init(void)
if (err < 0)
goto err3;
 
-   pr_info("nf_tables: (c) 2007-2009 Patrick McHardy \n");
return register_pernet_subsys(_tables_net_ops);
 err3:
nf_tables_core_module_exit();
diff --git a/net/netfilter/nft_compat.c b/net/netfilter/nft_compat.c
index fab6bf3..841c5ae 100644
--- a/net/netfilter/nft_compat.c
+++ b/net/netfilter/nft_compat.c
@@ -810,8 +810,6 @@ static int __init nft_compat_module_init(void)
goto err_target;
}
 
-   pr_info("nf_tables_compat: (c) 2012 Pablo Neira Ayuso \n");
-
return ret;
 
 err_target:
-- 
2.10.2



Re: [PATCH nf 1/1] netfilter: helper: Fix possible panic caused by invoking expectfn unloaded

2017-03-15 Thread Pablo Neira Ayuso
On Tue, Mar 14, 2017 at 02:26:06PM +0800, f...@ikuai8.com wrote:
> From: Gao Feng 
> 
> The helper module permits the helper modules register expectfn, and
> it could be hold by external caller. But when the module is unloaded,
> there may be some pending expect nodes which still hold the function
> reference. It may cause unexpected behavior, even panic.
> 
> Now it would delete the expect nodes which uses the expectfn when
> unregister expectfn. And it must use the rcu_read_lock to protect
> the expectfn until insert it or doesn't access it ever.

Expectations should be removed by the time the helper module is gone,
so what is the problem here?


Re: [PATCH 0/7] net, netfilter refcounter conversions

2017-03-15 Thread Pablo Neira Ayuso
On Wed, Mar 15, 2017 at 01:10:38PM +0200, Elena Reshetova wrote:
> This series, for the netfilter subsystem, replaces atomic_t reference
> counters with the new refcount_t type and API (see include/linux/refcount.h).
> By doing this we prevent intentional or accidental
> underflows or overflows that can lead to use-after-free vulnerabilities.
> 
> Please take the series to your tree if there are no run-time issues.

Could you collapse all of your patches into a single one? They are all
part of the same logical change to me.

>  21 files changed, 85 insertions(+), 75 deletions(-)

The diffstat is small enough to do what I'm asking.

Thanks!


ANNOUNCE: New Talk: Story of a Network Virtualization and its future in Software and in Hardware

2017-03-15 Thread Jamal Hadi Salim

The tech committee would like to announce a new accepted talk from
Anjali Singhai Jain along with
Alexander H Duyck, Parthasarathy Sarangam and Nrupal Jani

The details are as follows:
---
The paper and the presentation will quickly walk through the evolution
of network virtualization, its successes and failures, and the reasons
behind them. Revisiting that history and understanding present and
future use cases will inform future hardware and software designs that
give end users a better experience and push the envelope on network
virtualization. Intel and the industry have seen many generations of
network virtualization.


The talk will focus on two important areas:

* Briefly analyze the past and present
* Make a case for the future technologies

The talk will dive deep into HW challenges in two areas of network
virtualization:

* Why hardware offloads, and getting them just right, are important
  (the Goldilocks effect); maybe SR-IOV is a little too much.
* The host interface exposed by a network virtualization device, and
  how we get that just right.

The talk will also cover what the future SW model for network
virtualization is shaping up to be:

* The best control and data plane split that works for network- or
  compute-intensive VMs/containers.
* Why less is more in some cases for the virtual function device.
* Whether true SR-IOV is a good answer in all cases, or whether
  mediated devices are a better compromise.


To conclude, we will go over the upcoming virtualization technology
supported by VFIO mediated devices and the PCIe specification for PASID,
and what problems it will solve. We will also discuss how to get the
software model right in this case and learn from our mistakes with
SR-IOV.



cheers,
jamal


Re: [PATCH nft 6/9] tests: py: add ct helper tests

2017-03-15 Thread Florian Westphal
Pablo Neira Ayuso  wrote:
> On Tue, Mar 14, 2017 at 08:58:13PM +0100, Florian Westphal wrote:
> > +%cthelp1 type ct helper { type \"ftp\"\; protocol tcp\; };ok
> 
> Just a minor syntax nitpick here.
> 
> Protocol should be part of the same statement, right? ie.
> 
> { type "ftp" protocol tcp ; }
> 
> It's fundamental to achieve a working configuration. You can send a
> follow up patch to amend this, no problem.

No, I'll change this in patch #1 and will respin the series, thanks
Pablo.


[ANNOUNCE] 13th Netfilter Workshop nearby Faro, Portugal

2017-03-15 Thread Pablo Neira Ayuso
Hi!

We are glad to announce a new round in the Netfilter Workshop series.
This year the event will take place near Faro, Portugal, from 3rd
July to 7th July, 2017 [1]. Exact location TBA.

The Netfilter Workshop (NFWS) is the main event organized for and by
the Netfilter community. During the workshop days, Linux kernel
networking and Netfilter developers meet and discuss the status of the
on-going Netfilter-related developments and the plans for the near
future.

Attendance requires an invitation. Linux networking developers with
contributions to any of the Netfilter subsystems and users with
interesting use cases and open problems are also welcome. We have
traditionally left room for other projects that rely on Netfilter
infrastructure such as the Linux Virtual Server project. You can send
us a proposal in a very lightweight format: title and quick abstract
(no more than 500 words!) as well as estimated time to present.

We are looking for sponsors. If you think you can help us get funds
to run the workshop, please contact us at coreteam@netfilter and
we'll be glad to send you our sponsorship policy.

Join us!

[1] http://workshop.netfilter.org/2017/


[PATCH 1/7] net, netfilter: convert ip_vs_conn.refcnt from atomic_t to refcount_t

2017-03-15 Thread Elena Reshetova
The refcount_t type and its corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This helps avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova 
Signed-off-by: Hans Liljestrand 
Signed-off-by: Kees Cook 
Signed-off-by: David Windsor 
---
 include/net/ip_vs.h   |  8 +---
 net/netfilter/ipvs/ip_vs_conn.c   | 24 
 net/netfilter/ipvs/ip_vs_core.c   |  4 ++--
 net/netfilter/ipvs/ip_vs_proto_sctp.c |  2 +-
 net/netfilter/ipvs/ip_vs_proto_tcp.c  |  2 +-
 5 files changed, 21 insertions(+), 19 deletions(-)

diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 7bdfa7d..f1429c3 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -12,6 +12,8 @@
 #include  /* for struct list_head */
 #include  /* for struct rwlock_t */
 #include/* for struct atomic_t */
+#include  /* for struct refcount_t */
+
 #include 
 #include 
 #include 
@@ -525,7 +527,7 @@ struct ip_vs_conn {
struct netns_ipvs   *ipvs;
 
/* counter and timer */
-   atomic_t    refcnt; /* reference count */
+   refcount_t  refcnt; /* reference count */
struct timer_list   timer;  /* Expiration timer */
volatile unsigned long  timeout;/* timeout */
 
@@ -1211,14 +1213,14 @@ struct ip_vs_conn * ip_vs_conn_out_get_proto(struct netns_ipvs *ipvs, int af,
  */
 static inline bool __ip_vs_conn_get(struct ip_vs_conn *cp)
 {
-   return atomic_inc_not_zero(>refcnt);
+   return refcount_inc_not_zero(>refcnt);
 }
 
 /* put back the conn without restarting its timer */
 static inline void __ip_vs_conn_put(struct ip_vs_conn *cp)
 {
smp_mb__before_atomic();
-   atomic_dec(>refcnt);
+   refcount_dec(>refcnt);
 }
 void ip_vs_conn_put(struct ip_vs_conn *cp);
 void ip_vs_conn_fill_cport(struct ip_vs_conn *cp, __be16 cport);
diff --git a/net/netfilter/ipvs/ip_vs_conn.c b/net/netfilter/ipvs/ip_vs_conn.c
index e6a2753..3d2ac71a 100644
--- a/net/netfilter/ipvs/ip_vs_conn.c
+++ b/net/netfilter/ipvs/ip_vs_conn.c
@@ -181,7 +181,7 @@ static inline int ip_vs_conn_hash(struct ip_vs_conn *cp)
 
if (!(cp->flags & IP_VS_CONN_F_HASHED)) {
cp->flags |= IP_VS_CONN_F_HASHED;
-   atomic_inc(>refcnt);
+   refcount_inc(>refcnt);
hlist_add_head_rcu(>c_list, _vs_conn_tab[hash]);
ret = 1;
} else {
@@ -215,7 +215,7 @@ static inline int ip_vs_conn_unhash(struct ip_vs_conn *cp)
if (cp->flags & IP_VS_CONN_F_HASHED) {
hlist_del_rcu(>c_list);
cp->flags &= ~IP_VS_CONN_F_HASHED;
-   atomic_dec(>refcnt);
+   refcount_dec(>refcnt);
ret = 1;
} else
ret = 0;
@@ -242,13 +242,13 @@ static inline bool ip_vs_conn_unlink(struct ip_vs_conn *cp)
if (cp->flags & IP_VS_CONN_F_HASHED) {
ret = false;
/* Decrease refcnt and unlink conn only if we are last user */
-   if (atomic_cmpxchg(>refcnt, 1, 0) == 1) {
+   if (refcount_dec_if_one(>refcnt)) {
hlist_del_rcu(>c_list);
cp->flags &= ~IP_VS_CONN_F_HASHED;
ret = true;
}
} else
-   ret = atomic_read(>refcnt) ? false : true;
+   ret = refcount_read(>refcnt) ? false : true;
 
spin_unlock(>lock);
ct_write_unlock_bh(hash);
@@ -475,7 +475,7 @@ static void __ip_vs_conn_put_timer(struct ip_vs_conn *cp)
 void ip_vs_conn_put(struct ip_vs_conn *cp)
 {
if ((cp->flags & IP_VS_CONN_F_ONE_PACKET) &&
-   (atomic_read(>refcnt) == 1) &&
+   (refcount_read(>refcnt) == 1) &&
!timer_pending(>timer))
/* expire connection immediately */
__ip_vs_conn_put_notimer(cp);
@@ -617,8 +617,8 @@ ip_vs_bind_dest(struct ip_vs_conn *cp, struct ip_vs_dest *dest)
  IP_VS_DBG_ADDR(cp->af, >vaddr), ntohs(cp->vport),
  IP_VS_DBG_ADDR(cp->daf, >daddr), ntohs(cp->dport),
  ip_vs_fwd_tag(cp), cp->state,
- cp->flags, atomic_read(>refcnt),
- atomic_read(>refcnt));
+ cp->flags, refcount_read(>refcnt),
+ refcount_read(>refcnt));
 
/* Update the connection counters */
if (!(flags & IP_VS_CONN_F_TEMPLATE)) {
@@ -714,8 +714,8 @@ static inline void ip_vs_unbind_dest(struct ip_vs_conn *cp)
  IP_VS_DBG_ADDR(cp->af, >vaddr), ntohs(cp->vport),
  IP_VS_DBG_ADDR(cp->daf, >daddr), ntohs(cp->dport),
  ip_vs_fwd_tag(cp), 

[PATCH 2/7] net, netfilter: convert ip_vs_dest.refcnt from atomic_t to refcount_t

2017-03-15 Thread Elena Reshetova
The refcount_t type and its corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This helps avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova 
Signed-off-by: Hans Liljestrand 
Signed-off-by: Kees Cook 
Signed-off-by: David Windsor 
---
 include/net/ip_vs.h  |  8 
 net/netfilter/ipvs/ip_vs_ctl.c   | 12 ++--
 net/netfilter/ipvs/ip_vs_lblc.c  |  2 +-
 net/netfilter/ipvs/ip_vs_lblcr.c |  6 +++---
 net/netfilter/ipvs/ip_vs_nq.c|  2 +-
 net/netfilter/ipvs/ip_vs_rr.c|  2 +-
 net/netfilter/ipvs/ip_vs_sed.c   |  2 +-
 net/netfilter/ipvs/ip_vs_wlc.c   |  2 +-
 net/netfilter/ipvs/ip_vs_wrr.c   |  2 +-
 9 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index f1429c3..8a4a57b8 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -669,7 +669,7 @@ struct ip_vs_dest {
atomic_t    conn_flags; /* flags to copy to conn */
atomic_t    weight; /* server weight */
 
-   atomic_t    refcnt; /* reference counter */
+   refcount_t  refcnt; /* reference counter */
struct ip_vs_stats  stats;  /* statistics */
unsigned long   idle_start; /* start time, jiffies */
 
@@ -1412,18 +1412,18 @@ void ip_vs_try_bind_dest(struct ip_vs_conn *cp);
 
 static inline void ip_vs_dest_hold(struct ip_vs_dest *dest)
 {
-   atomic_inc(>refcnt);
+   refcount_inc(>refcnt);
 }
 
 static inline void ip_vs_dest_put(struct ip_vs_dest *dest)
 {
smp_mb__before_atomic();
-   atomic_dec(>refcnt);
+   refcount_dec(>refcnt);
 }
 
 static inline void ip_vs_dest_put_and_free(struct ip_vs_dest *dest)
 {
-   if (atomic_dec_and_test(>refcnt))
+   if (refcount_dec_and_test(>refcnt))
kfree(dest);
 }
 
diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index 5aeb0dd..541aa76 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -699,7 +699,7 @@ ip_vs_trash_get_dest(struct ip_vs_service *svc, int dest_af,
  dest->vfwmark,
  IP_VS_DBG_ADDR(dest->af, >addr),
  ntohs(dest->port),
- atomic_read(>refcnt));
+ refcount_read(>refcnt));
if (dest->af == dest_af &&
ip_vs_addr_equal(dest_af, >addr, daddr) &&
dest->port == dport &&
@@ -934,7 +934,7 @@ ip_vs_new_dest(struct ip_vs_service *svc, struct ip_vs_dest_user_kern *udest,
atomic_set(>activeconns, 0);
atomic_set(>inactconns, 0);
atomic_set(>persistconns, 0);
-   atomic_set(>refcnt, 1);
+   refcount_set(>refcnt, 1);
 
INIT_HLIST_NODE(>d_list);
spin_lock_init(>dst_lock);
@@ -998,7 +998,7 @@ ip_vs_add_dest(struct ip_vs_service *svc, struct ip_vs_dest_user_kern *udest)
IP_VS_DBG_BUF(3, "Get destination %s:%u from trash, "
  "dest->refcnt=%d, service %u/%s:%u\n",
  IP_VS_DBG_ADDR(udest->af, ), ntohs(dport),
- atomic_read(>refcnt),
+ refcount_read(>refcnt),
  dest->vfwmark,
  IP_VS_DBG_ADDR(svc->af, >vaddr),
  ntohs(dest->vport));
@@ -1074,7 +1074,7 @@ static void __ip_vs_del_dest(struct netns_ipvs *ipvs, struct ip_vs_dest *dest,
spin_lock_bh(>dest_trash_lock);
IP_VS_DBG_BUF(3, "Moving dest %s:%u into trash, dest->refcnt=%d\n",
  IP_VS_DBG_ADDR(dest->af, >addr), ntohs(dest->port),
- atomic_read(>refcnt));
+ refcount_read(>refcnt));
if (list_empty(>dest_trash) && !cleanup)
mod_timer(>dest_trash_timer,
  jiffies + (IP_VS_DEST_TRASH_PERIOD >> 1));
@@ -1157,7 +1157,7 @@ static void ip_vs_dest_trash_expire(unsigned long data)
 
spin_lock(>dest_trash_lock);
list_for_each_entry_safe(dest, next, >dest_trash, t_list) {
-   if (atomic_read(>refcnt) > 1)
+   if (refcount_read(>refcnt) > 1)
continue;
if (dest->idle_start) {
if (time_before(now, dest->idle_start +
@@ -1545,7 +1545,7 @@ ip_vs_forget_dev(struct ip_vs_dest *dest, struct net_device *dev)
  dev->name,
  IP_VS_DBG_ADDR(dest->af, >addr),
  ntohs(dest->port),
- atomic_read(>refcnt));
+ refcount_read(>refcnt));

[PATCH 0/7] net, netfilter refcounter conversions

2017-03-15 Thread Elena Reshetova
This series, for the netfilter subsystem, replaces atomic_t reference
counters with the new refcount_t type and API (see include/linux/refcount.h).
By doing this we prevent intentional or accidental
underflows or overflows that can lead to use-after-free vulnerabilities.
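
To illustrate why a dedicated counter type helps, here is a userspace sketch built on C11 atomics. It is only loosely modelled on the saturating behaviour documented in include/linux/refcount.h and is not the kernel implementation; struct ref_sketch and both functions are hypothetical names.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a reference counter. */
struct ref_sketch {
	atomic_uint count;
};

/* Take a reference unless the object is already dead (count == 0). */
static bool ref_inc_not_zero(struct ref_sketch *r)
{
	unsigned int old = atomic_load(&r->count);

	do {
		if (old == 0)
			return false;	/* too late: object is being freed */
		if (old == UINT_MAX)
			return true;	/* saturated: stay pinned, never wrap */
	} while (!atomic_compare_exchange_weak(&r->count, &old, old + 1));

	return true;
}

/* Drop a reference; returns true only for the final put. */
static bool ref_dec_and_test(struct ref_sketch *r)
{
	unsigned int old = atomic_load(&r->count);

	do {
		if (old == 0 || old == UINT_MAX)
			return false;	/* refuse to underflow or leave saturation */
	} while (!atomic_compare_exchange_weak(&r->count, &old, old - 1));

	return old == 1;
}
```

A plain atomic counter would wrap from its maximum back to zero, or go below zero on an extra put, and the object could then be freed while still in use. In this sketch a saturated counter leaks the object instead, which is the safer failure mode.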

Please take the series to your tree if there are no run-time issues.

Elena Reshetova (7):
  net, netfilter: convert ip_vs_conn.refcnt from atomic_t to refcount_t
  net, netfilter: convert ip_vs_dest.refcnt from atomic_t to refcount_t
  net, netfilter: convert ctnl_timeout.refcnt from atomic_t to
refcount_t
  net, netfilter: convert nf_acct.refcnt from atomic_t to refcount_t
  net, netfilter: convert nf_conntrack_expect.use from atomic_t to
refcount_t
  net, netfilter: convert nfulnl_instance.use from atomic_t to
refcount_t
  net, netfilter: convert clusterip_config.refcount and
clusterip_config.entries from atomic_t to refcount_t

 include/net/ip_vs.h  | 16 +---
 include/net/netfilter/nf_conntrack_expect.h  |  4 +++-
 include/net/netfilter/nf_conntrack_timeout.h |  3 ++-
 net/ipv4/netfilter/ipt_CLUSTERIP.c   | 19 ++-
 net/netfilter/ipvs/ip_vs_conn.c  | 24 
 net/netfilter/ipvs/ip_vs_core.c  |  4 ++--
 net/netfilter/ipvs/ip_vs_ctl.c   | 12 ++--
 net/netfilter/ipvs/ip_vs_lblc.c  |  2 +-
 net/netfilter/ipvs/ip_vs_lblcr.c |  6 +++---
 net/netfilter/ipvs/ip_vs_nq.c|  2 +-
 net/netfilter/ipvs/ip_vs_proto_sctp.c|  2 +-
 net/netfilter/ipvs/ip_vs_proto_tcp.c |  2 +-
 net/netfilter/ipvs/ip_vs_rr.c|  2 +-
 net/netfilter/ipvs/ip_vs_sed.c   |  2 +-
 net/netfilter/ipvs/ip_vs_wlc.c   |  2 +-
 net/netfilter/ipvs/ip_vs_wrr.c   |  2 +-
 net/netfilter/nf_conntrack_expect.c  | 10 +-
 net/netfilter/nf_conntrack_netlink.c |  4 ++--
 net/netfilter/nfnetlink_acct.c   | 16 +---
 net/netfilter/nfnetlink_cttimeout.c  | 12 ++--
 net/netfilter/nfnetlink_log.c| 14 --
 21 files changed, 85 insertions(+), 75 deletions(-)

-- 
2.7.4



[PATCH 3/7] net, netfilter: convert ctnl_timeout.refcnt from atomic_t to refcount_t

2017-03-15 Thread Elena Reshetova
The refcount_t type and its corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This helps avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova 
Signed-off-by: Hans Liljestrand 
Signed-off-by: Kees Cook 
Signed-off-by: David Windsor 
---
 include/net/netfilter/nf_conntrack_timeout.h |  3 ++-
 net/netfilter/nfnetlink_cttimeout.c  | 12 ++--
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/include/net/netfilter/nf_conntrack_timeout.h b/include/net/netfilter/nf_conntrack_timeout.h
index 5cc5e9e..d40b893 100644
--- a/include/net/netfilter/nf_conntrack_timeout.h
+++ b/include/net/netfilter/nf_conntrack_timeout.h
@@ -4,6 +4,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -12,7 +13,7 @@
 struct ctnl_timeout {
struct list_head    head;
struct rcu_head rcu_head;
-   atomic_t    refcnt;
+   refcount_t  refcnt;
char    name[CTNL_TIMEOUT_NAME_MAX];
__u16   l3num;
struct nf_conntrack_l4proto *l4proto;
diff --git a/net/netfilter/nfnetlink_cttimeout.c b/net/netfilter/nfnetlink_cttimeout.c
index 139e086..baa75f3 100644
--- a/net/netfilter/nfnetlink_cttimeout.c
+++ b/net/netfilter/nfnetlink_cttimeout.c
@@ -138,7 +138,7 @@ static int cttimeout_new_timeout(struct net *net, struct sock *ctnl,
strcpy(timeout->name, nla_data(cda[CTA_TIMEOUT_NAME]));
timeout->l3num = l3num;
timeout->l4proto = l4proto;
-   atomic_set(>refcnt, 1);
+   refcount_set(>refcnt, 1);
list_add_tail_rcu(>head, >nfct_timeout_list);
 
return 0;
@@ -172,7 +172,7 @@ ctnl_timeout_fill_info(struct sk_buff *skb, u32 portid, u32 seq, u32 type,
nla_put_be16(skb, CTA_TIMEOUT_L3PROTO, htons(timeout->l3num)) ||
nla_put_u8(skb, CTA_TIMEOUT_L4PROTO, timeout->l4proto->l4proto) ||
nla_put_be32(skb, CTA_TIMEOUT_USE,
-htonl(atomic_read(>refcnt
+htonl(refcount_read(>refcnt
goto nla_put_failure;
 
if (likely(l4proto->ctnl_timeout.obj_to_nlattr)) {
@@ -339,7 +339,7 @@ static int ctnl_timeout_try_del(struct net *net, struct ctnl_timeout *timeout)
/* We want to avoid races with ctnl_timeout_put. So only when the
 * current refcnt is 1, we decrease it to 0.
 */
-   if (atomic_cmpxchg(>refcnt, 1, 0) == 1) {
+   if (refcount_dec_if_one(>refcnt)) {
/* We are protected by nfnl mutex. */
list_del_rcu(>head);
nf_ct_l4proto_put(timeout->l4proto);
@@ -536,7 +536,7 @@ ctnl_timeout_find_get(struct net *net, const char *name)
if (!try_module_get(THIS_MODULE))
goto err;
 
-   if (!atomic_inc_not_zero(>refcnt)) {
+   if (!refcount_inc_not_zero(>refcnt)) {
module_put(THIS_MODULE);
goto err;
}
@@ -550,7 +550,7 @@ ctnl_timeout_find_get(struct net *net, const char *name)
 
 static void ctnl_timeout_put(struct ctnl_timeout *timeout)
 {
-   if (atomic_dec_and_test(>refcnt))
+   if (refcount_dec_and_test(>refcnt))
kfree_rcu(timeout, rcu_head);
 
module_put(THIS_MODULE);
@@ -601,7 +601,7 @@ static void __net_exit cttimeout_net_exit(struct net *net)
list_del_rcu(>head);
nf_ct_l4proto_put(cur->l4proto);
 
-   if (atomic_dec_and_test(>refcnt))
+   if (refcount_dec_and_test(>refcnt))
kfree_rcu(cur, rcu_head);
}
 }
-- 
2.7.4
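
The ctnl_timeout_try_del() hunk above replaces an open-coded compare-and-swap from 1 to 0 with a named helper. The idea can be sketched in userspace with C11 atomics; dec_if_one_sketch() is a hypothetical name and this is not the kernel's implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Drop the count to 0 only if the caller holds the last reference.
 * Returns true if the 1 -> 0 transition happened. */
static bool dec_if_one_sketch(atomic_uint *count)
{
	unsigned int expected = 1;

	return atomic_compare_exchange_strong(count, &expected, 0);
}
```

Because the test and the update are a single atomic step, a concurrent get that raises the count to 2 makes the call fail instead of racing with the teardown.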



[PATCH 5/7] net, netfilter: convert nf_conntrack_expect.use from atomic_t to refcount_t

2017-03-15 Thread Elena Reshetova
The refcount_t type and its corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This helps avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova 
Signed-off-by: Hans Liljestrand 
Signed-off-by: Kees Cook 
Signed-off-by: David Windsor 
---
 include/net/netfilter/nf_conntrack_expect.h |  4 +++-
 net/netfilter/nf_conntrack_expect.c | 10 +-
 net/netfilter/nf_conntrack_netlink.c|  4 ++--
 3 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/include/net/netfilter/nf_conntrack_expect.h b/include/net/netfilter/nf_conntrack_expect.h
index 5ed33ea..65cc2cb 100644
--- a/include/net/netfilter/nf_conntrack_expect.h
+++ b/include/net/netfilter/nf_conntrack_expect.h
@@ -5,6 +5,8 @@
 #ifndef _NF_CONNTRACK_EXPECT_H
 #define _NF_CONNTRACK_EXPECT_H
 
+#include 
+
 #include 
 #include 
 
@@ -37,7 +39,7 @@ struct nf_conntrack_expect {
struct timer_list timeout;
 
/* Usage count. */
-   atomic_t use;
+   refcount_t use;
 
/* Flags */
unsigned int flags;
diff --git a/net/netfilter/nf_conntrack_expect.c b/net/netfilter/nf_conntrack_expect.c
index 4b2e1fb..cb29e59 100644
--- a/net/netfilter/nf_conntrack_expect.c
+++ b/net/netfilter/nf_conntrack_expect.c
@@ -133,7 +133,7 @@ nf_ct_expect_find_get(struct net *net,
 
rcu_read_lock();
i = __nf_ct_expect_find(net, zone, tuple);
-   if (i && !atomic_inc_not_zero(>use))
+   if (i && !refcount_inc_not_zero(>use))
i = NULL;
rcu_read_unlock();
 
@@ -186,7 +186,7 @@ nf_ct_find_expectation(struct net *net,
return NULL;
 
if (exp->flags & NF_CT_EXPECT_PERMANENT) {
-   atomic_inc(>use);
+   refcount_inc(>use);
return exp;
} else if (del_timer(>timeout)) {
nf_ct_unlink_expect(exp);
@@ -275,7 +275,7 @@ struct nf_conntrack_expect *nf_ct_expect_alloc(struct nf_conn *me)
return NULL;
 
new->master = me;
-   atomic_set(>use, 1);
+   refcount_set(>use, 1);
return new;
 }
 EXPORT_SYMBOL_GPL(nf_ct_expect_alloc);
@@ -348,7 +348,7 @@ static void nf_ct_expect_free_rcu(struct rcu_head *head)
 
 void nf_ct_expect_put(struct nf_conntrack_expect *exp)
 {
-   if (atomic_dec_and_test(>use))
+   if (refcount_dec_and_test(>use))
call_rcu(>rcu, nf_ct_expect_free_rcu);
 }
 EXPORT_SYMBOL_GPL(nf_ct_expect_put);
@@ -361,7 +361,7 @@ static void nf_ct_expect_insert(struct nf_conntrack_expect *exp)
unsigned int h = nf_ct_expect_dst_hash(net, >tuple);
 
/* two references : one for hash insert, one for the timer */
-   atomic_add(2, >use);
+   refcount_add(2, >use);
 
hlist_add_head(>lnode, _help->expectations);
master_help->expecting[exp->class]++;
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index 6806b5e..d49cc1e 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -2693,7 +2693,7 @@ ctnetlink_exp_dump_table(struct sk_buff *skb, struct netlink_callback *cb)
cb->nlh->nlmsg_seq,
IPCTNL_MSG_EXP_NEW,
exp) < 0) {
-   if (!atomic_inc_not_zero(>use))
+   if (!refcount_inc_not_zero(>use))
continue;
cb->args[1] = (unsigned long)exp;
goto out;
@@ -2739,7 +2739,7 @@ ctnetlink_exp_ct_dump_table(struct sk_buff *skb, struct netlink_callback *cb)
cb->nlh->nlmsg_seq,
IPCTNL_MSG_EXP_NEW,
exp) < 0) {
-   if (!atomic_inc_not_zero(>use))
+   if (!refcount_inc_not_zero(>use))
continue;
cb->args[1] = (unsigned long)exp;
goto out;
-- 
2.7.4



[PATCH 4/7] net, netfilter: convert nf_acct.refcnt from atomic_t to refcount_t

2017-03-15 Thread Elena Reshetova
The refcount_t type and its corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This helps avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova 
Signed-off-by: Hans Liljestrand 
Signed-off-by: Kees Cook 
Signed-off-by: David Windsor 
---
 net/netfilter/nfnetlink_acct.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/net/netfilter/nfnetlink_acct.c b/net/netfilter/nfnetlink_acct.c
index d44d89b..f44cbd3 100644
--- a/net/netfilter/nfnetlink_acct.c
+++ b/net/netfilter/nfnetlink_acct.c
@@ -11,6 +11,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -32,7 +33,7 @@ struct nf_acct {
atomic64_t  bytes;
unsigned long   flags;
struct list_head head;
-   atomic_t refcnt;
+   refcount_t  refcnt;
char name[NFACCT_NAME_MAX];
struct rcu_head rcu_head;
char data[0];
@@ -123,7 +124,7 @@ static int nfnl_acct_new(struct net *net, struct sock *nfnl,
atomic64_set(>pkts,
 be64_to_cpu(nla_get_be64(tb[NFACCT_PKTS])));
}
-   atomic_set(>refcnt, 1);
+   refcount_set(>refcnt, 1);
list_add_tail_rcu(>head, >nfnl_acct_list);
return 0;
 }
@@ -166,7 +167,7 @@ nfnl_acct_fill_info(struct sk_buff *skb, u32 portid, u32 seq, u32 type,
 NFACCT_PAD) ||
nla_put_be64(skb, NFACCT_BYTES, cpu_to_be64(bytes),
 NFACCT_PAD) ||
-   nla_put_be32(skb, NFACCT_USE, htonl(atomic_read(>refcnt
+   nla_put_be32(skb, NFACCT_USE, htonl(refcount_read(>refcnt
goto nla_put_failure;
if (acct->flags & NFACCT_F_QUOTA) {
u64 *quota = (u64 *)acct->data;
@@ -325,11 +326,12 @@ static int nfnl_acct_get(struct net *net, struct sock *nfnl,
 static int nfnl_acct_try_del(struct nf_acct *cur)
 {
int ret = 0;
+   unsigned int refcount;
 
/* We want to avoid races with nfnl_acct_put. So only when the current
 * refcnt is 1, we decrease it to 0.
 */
-   if (atomic_cmpxchg(>refcnt, 1, 0) == 1) {
+   if (refcount_dec_if_one(>refcnt)) {
/* We are protected by nfnl mutex. */
list_del_rcu(>head);
kfree_rcu(cur, rcu_head);
@@ -413,7 +415,7 @@ struct nf_acct *nfnl_acct_find_get(struct net *net, const char *acct_name)
if (!try_module_get(THIS_MODULE))
goto err;
 
-   if (!atomic_inc_not_zero(>refcnt)) {
+   if (!refcount_inc_not_zero(>refcnt)) {
module_put(THIS_MODULE);
goto err;
}
@@ -429,7 +431,7 @@ EXPORT_SYMBOL_GPL(nfnl_acct_find_get);
 
 void nfnl_acct_put(struct nf_acct *acct)
 {
-   if (atomic_dec_and_test(>refcnt))
+   if (refcount_dec_and_test(>refcnt))
kfree_rcu(acct, rcu_head);
 
module_put(THIS_MODULE);
@@ -502,7 +504,7 @@ static void __net_exit nfnl_acct_net_exit(struct net *net)
list_for_each_entry_safe(cur, tmp, >nfnl_acct_list, head) {
list_del_rcu(>head);
 
-   if (atomic_dec_and_test(>refcnt))
+   if (refcount_dec_and_test(>refcnt))
kfree_rcu(cur, rcu_head);
}
 }
-- 
2.7.4



[PATCH 6/7] net, netfilter: convert nfulnl_instance.use from atomic_t to refcount_t

2017-03-15 Thread Elena Reshetova
The refcount_t type and its corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This helps avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova 
Signed-off-by: Hans Liljestrand 
Signed-off-by: Kees Cook 
Signed-off-by: David Windsor 
---
 net/netfilter/nfnetlink_log.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/net/netfilter/nfnetlink_log.c b/net/netfilter/nfnetlink_log.c
index 08247bf..ecd857b 100644
--- a/net/netfilter/nfnetlink_log.c
+++ b/net/netfilter/nfnetlink_log.c
@@ -40,6 +40,8 @@
 #include 
 
 #include 
+#include 
+
 
 #if IS_ENABLED(CONFIG_BRIDGE_NETFILTER)
 #include "../bridge/br_private.h"
@@ -57,7 +59,7 @@
 struct nfulnl_instance {
struct hlist_node hlist;/* global list of instances */
spinlock_t lock;
-   atomic_t use;   /* use count */
+   refcount_t use; /* use count */
 
unsigned int qlen;  /* number of nlmsgs in skb */
struct sk_buff *skb;/* pre-allocatd skb */
@@ -115,7 +117,7 @@ __instance_lookup(struct nfnl_log_net *log, u_int16_t group_num)
 static inline void
 instance_get(struct nfulnl_instance *inst)
 {
-   atomic_inc(>use);
+   refcount_inc(>use);
 }
 
 static struct nfulnl_instance *
@@ -125,7 +127,7 @@ instance_lookup_get(struct nfnl_log_net *log, u_int16_t group_num)
 
rcu_read_lock_bh();
inst = __instance_lookup(log, group_num);
-   if (inst && !atomic_inc_not_zero(>use))
+   if (inst && !refcount_inc_not_zero(>use))
inst = NULL;
rcu_read_unlock_bh();
 
@@ -145,7 +147,7 @@ static void nfulnl_instance_free_rcu(struct rcu_head *head)
 static void
 instance_put(struct nfulnl_instance *inst)
 {
-   if (inst && atomic_dec_and_test(>use))
+   if (inst && refcount_dec_and_test(>use))
call_rcu_bh(>rcu, nfulnl_instance_free_rcu);
 }
 
@@ -180,7 +182,7 @@ instance_create(struct net *net, u_int16_t group_num,
INIT_HLIST_NODE(>hlist);
spin_lock_init(>lock);
/* needs to be two, since we _put() after creation */
-   atomic_set(>use, 2);
+   refcount_set(>use, 2);
 
setup_timer(>timer, nfulnl_timer, (unsigned long)inst);
 
@@ -1031,7 +1033,7 @@ static int seq_show(struct seq_file *s, void *v)
   inst->group_num,
   inst->peer_portid, inst->qlen,
   inst->copy_mode, inst->copy_range,
-  inst->flushtimeout, atomic_read(>use));
+  inst->flushtimeout, refcount_read(>use));
 
return 0;
 }
-- 
2.7.4
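
The lookup-and-get hunk in this patch relies on the "increment unless zero"
operation: a lookup must not hand out an instance whose use count has
already dropped to zero and which is waiting to be freed. A minimal
userspace sketch of that pattern, assuming hypothetical names
(struct model_inst, model_inc_not_zero(), model_lookup_get()) and C11
atomics in place of the kernel primitives:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted instance. */
struct model_inst {
	atomic_uint use;
};

/* Take a reference unless the count has already reached zero. */
static bool model_inc_not_zero(atomic_uint *use)
{
	unsigned int old;

	assert(use != NULL);
	old = atomic_load(use);
	do {
		if (old == 0)
			return false;	/* object is being torn down */
	} while (!atomic_compare_exchange_weak(use, &old, old + 1));

	return true;
}

/* Lookup-and-get: hide instances that can no longer be referenced. */
static struct model_inst *model_lookup_get(struct model_inst *inst)
{
	if (inst && !model_inc_not_zero(&inst->use))
		inst = NULL;

	return inst;
}
```

In the kernel code the lookup runs under RCU, which keeps the memory valid
while the counter is inspected. The sketch leaves that part out.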



[PATCH 7/7] net, netfilter: convert clusterip_config.refcount and clusterip_config.entries from atomic_t to refcount_t

2017-03-15 Thread Elena Reshetova
The refcount_t type and its corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This helps avoid accidental
refcounter overflows that might lead to use-after-free
situations.

Signed-off-by: Elena Reshetova 
Signed-off-by: Hans Liljestrand 
Signed-off-by: Kees Cook 
Signed-off-by: David Windsor 
---
 net/ipv4/netfilter/ipt_CLUSTERIP.c | 19 ++-
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/net/ipv4/netfilter/ipt_CLUSTERIP.c b/net/ipv4/netfilter/ipt_CLUSTERIP.c
index 52f2645..fcbdc0c 100644
--- a/net/ipv4/netfilter/ipt_CLUSTERIP.c
+++ b/net/ipv4/netfilter/ipt_CLUSTERIP.c
@@ -22,6 +22,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -40,8 +41,8 @@ MODULE_DESCRIPTION("Xtables: CLUSTERIP target");
 
 struct clusterip_config {
struct list_head list;  /* list of all configs */
-   atomic_t refcount;  /* reference count */
-   atomic_t entries;   /* number of entries/rules
+   refcount_t refcount;/* reference count */
+   refcount_t entries; /* number of entries/rules
 * referencing us */
 
__be32 clusterip;   /* the IP address */
@@ -77,7 +78,7 @@ struct clusterip_net {
 static inline void
 clusterip_config_get(struct clusterip_config *c)
 {
-   atomic_inc(>refcount);
+   refcount_inc(>refcount);
 }
 
 
@@ -89,7 +90,7 @@ static void clusterip_config_rcu_free(struct rcu_head *head)
 static inline void
 clusterip_config_put(struct clusterip_config *c)
 {
-   if (atomic_dec_and_test(>refcount))
+   if (refcount_dec_and_test(>refcount))
call_rcu_bh(>rcu, clusterip_config_rcu_free);
 }
 
@@ -103,7 +104,7 @@ clusterip_config_entry_put(struct clusterip_config *c)
struct clusterip_net *cn = net_generic(net, clusterip_net_id);
 
local_bh_disable();
-   if (atomic_dec_and_lock(>entries, >lock)) {
+   if (refcount_dec_and_lock(>entries, >lock)) {
list_del_rcu(>list);
spin_unlock(>lock);
local_bh_enable();
@@ -149,10 +150,10 @@ clusterip_config_find_get(struct net *net, __be32 clusterip, int entry)
c = NULL;
else
 #endif
-   if (unlikely(!atomic_inc_not_zero(>refcount)))
+   if (unlikely(!refcount_inc_not_zero(>refcount)))
c = NULL;
else if (entry)
-   atomic_inc(>entries);
+   refcount_inc(>entries);
}
rcu_read_unlock_bh();
 
@@ -188,8 +189,8 @@ clusterip_config_init(const struct ipt_clusterip_tgt_info *i, __be32 ip,
clusterip_config_init_nodelist(c, i);
c->hash_mode = i->hash_mode;
c->hash_initval = i->hash_initval;
-   atomic_set(>refcount, 1);
-   atomic_set(>entries, 1);
+   refcount_set(>refcount, 1);
+   refcount_set(>entries, 1);
 
spin_lock_bh(>lock);
if (__clusterip_config_find(net, ip)) {
-- 
2.7.4
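
The entry_put hunk in this patch uses the "decrement and lock" variant: the
lock is only taken when the decrement would drop the last entry, and it is
returned held so the caller can unlink the object safely. A rough userspace
model of that behaviour is sketched here. It assumes a hypothetical
spinlock built on atomic_flag (struct model_lock) and a hypothetical
model_dec_and_lock(); it is not the kernel implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical minimal spinlock, standing in for the kernel spinlock. */
struct model_lock {
	atomic_flag held;
};

static void model_lock_acquire(struct model_lock *l)
{
	while (atomic_flag_test_and_set(&l->held))
		;	/* spin until the flag was previously clear */
}

static void model_lock_release(struct model_lock *l)
{
	atomic_flag_clear(&l->held);
}

/* Returns true with the lock held if the count dropped to zero,
 * false (lock not held) otherwise. The count must be at least 1.
 */
static bool model_dec_and_lock(atomic_uint *cnt, struct model_lock *l)
{
	unsigned int old = atomic_load(cnt);

	assert(old > 0);

	/* Fast path: not the last reference, no lock needed. */
	while (old > 1) {
		if (atomic_compare_exchange_weak(cnt, &old, old - 1))
			return false;
	}

	/* Slow path: possibly the last reference, serialize with the lock. */
	model_lock_acquire(l);
	if (atomic_fetch_sub(cnt, 1) == 1)
		return true;	/* reached zero; lock stays held */

	model_lock_release(l);
	return false;
}
```

The caller that gets true is expected to remove the object from its list
and then release the lock, as the hunk does with list_del_rcu() followed by
spin_unlock().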



Re: [iptables PATCH] extensions: libxt_statistic: Complete nft translator

2017-03-15 Thread Pablo Neira Ayuso
On Tue, Mar 14, 2017 at 03:11:12PM +0100, Phil Sutter wrote:
> On Mon, Mar 13, 2017 at 05:53:53PM +0100, Pablo Neira Ayuso wrote:
> > On Mon, Mar 13, 2017 at 05:01:53PM +0100, Phil Sutter wrote:
> > [...]
> > > The nftables numgen expression works differently:
> > 
> > Phil, if you think we need a 1:1 mapping so iptables users moving to
> > nftables don't get confused, I'll be fine to take an update to
> > nft_numgen so we accommodate a new NFT_NG_PROBABILISTIC mode or so.
> 
> Well, implementing the translator wasn't exactly trivial, but in general
> I don't think numgen is particularly hard to use. Of course an explicit
> probability mode might make things easier, but then I guess it wouldn't
> fit into the LHS/RHS scheme anymore.

Right, we would need a specific statement for this.

The question is how useful this can be as a statement. The use cases I
found for this are:

1) Load balancing, which is already covered by numgen via maps.
2) Simulate packet loss.

With a statement we could combine this probability thing with flow
tables, but I still wonder how useful it can be to match packets using
probability at a per-flow level, a.k.a. hashprobability.

Florian already sent a patch to add an alias for this [1]. The problem is
that this breaks the symmetry between what we add to the kernel and what
we may get back, and that is going to break rule deletion by description.

Just a brain dump on this in case anyone wants to spend jiffies on
this.

[1] https://patchwork.ozlabs.org/patch/591534/
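
To make the two use cases above concrete, here is a small userspace sketch
of how a number generator reduced modulo N covers both of them: a
round-robin generator for load balancing and a random generator compared
against a threshold for probabilistic matching such as simulated packet
loss. All names (model_prng(), model_numgen_inc(), model_numgen_random(),
model_match_percent()) are hypothetical, and the xorshift PRNG is only a
stand-in. This is a model of the idea, not the nft_numgen code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in PRNG (xorshift32); state must be non-zero. */
static uint32_t model_prng(uint32_t *state)
{
	uint32_t x = *state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	*state = x;
	return x;
}

/* Round-robin generator: yields 0, 1, ..., mod - 1, 0, ... */
static uint32_t model_numgen_inc(uint32_t *counter, uint32_t mod)
{
	uint32_t val;

	assert(mod > 0);
	val = *counter;
	*counter = (val + 1) % mod;
	return val;
}

/* Random generator: yields a value in [0, mod). */
static uint32_t model_numgen_random(uint32_t *state, uint32_t mod)
{
	assert(mod > 0);
	return model_prng(state) % mod;
}

/* Probabilistic match: true for roughly `percent` out of 100 packets. */
static bool model_match_percent(uint32_t *state, uint32_t percent)
{
	return model_numgen_random(state, 100) < percent;
}
```

For load balancing, the value from model_numgen_inc() would index a map of
backends. For packet loss, a packet would be dropped whenever
model_match_percent() returns true.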


Re: [PATCH net] bridge: ebtables: fix reception of frames DNAT-ed to bridge device

2017-03-15 Thread Pablo Neira Ayuso
On Wed, Mar 15, 2017 at 11:26:08AM +0100, Florian Westphal wrote:
> Linus Lüssing  wrote:
> > When trying to redirect bridged frames to the bridge device itself
> > via the ebtables nat-prerouting chain and the dnat target then this
> > currently fails:
> > 
> > The ethernet destination of the frame is dnat'ed to the MAC address of
> > the bridge itself just fine and the correctly altered frame can even
> > be captured via a tcpdump on br0 (with or without promisc mode).
> >
> > However, the IP code drops it in the beginning of ip_input.c/ip_rcv()
> > as the dnat target did not update the skb->pkt_type.
> 
> Right, that's the reason why ebtables also has ebt_redirect target
> which does this pkt_type fixup.

Then I'm missing why redirect alone is not enough for Linus' use case.


Re: [PATCH nft 9/9] doc: helper assignement

2017-03-15 Thread Pablo Neira Ayuso
Would you mind documenting this on the wiki page too, please?


Re: [PATCH nft 0/9] ct helper set support

2017-03-15 Thread Pablo Neira Ayuso
On Tue, Mar 14, 2017 at 08:58:07PM +0100, Florian Westphal wrote:
> This series adds the frontend/nft support to define and
> assign connection tracking helpers.
> 
> Example:
> 
> table inet myhelpers {
>   ct helper ftp-standard {
>  type "ftp"
>  protocol tcp
>   }
>   chain prerouting {
>   type filter hook prerouting priority 0;
>   tcp dport 21 ct helper set "ftp-standard"
>   }
> }
> 
> A future extension could also allow defining/setting knobs
> that can only be set via module parameters at this time,
> for instance the ftp 'loose mode' or the number of allowed expectations.

LGTM.

Acked-by: Pablo Neira Ayuso 


Re: [PATCH net] bridge: ebtables: fix reception of frames DNAT-ed to bridge device

2017-03-15 Thread Pablo Neira Ayuso
On Wed, Mar 15, 2017 at 04:18:11AM +0100, Linus Lüssing wrote:
> When trying to redirect bridged frames to the bridge device itself
> via the ebtables nat-prerouting chain and the dnat target then this
> currently fails:
> 
> The ethernet destination of the frame is dnat'ed to the MAC address of
> the bridge itself just fine and the correctly altered frame can even
> be captured via a tcpdump on br0 (with or without promisc mode).
> 
> However, the IP code drops it in the beginning of ip_input.c/ip_rcv()
> as the dnat target did not update the skb->pkt_type. If after
> dnat'ing the packet is now destined to us then the skb->pkt_type
> needs to be updated from PACKET_OTHERHOST to PACKET_HOST, too.
> 
> Signed-off-by: Linus Lüssing 
> ---
>  net/bridge/br_input.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
> index 013f2290b..ec83175 100644
> --- a/net/bridge/br_input.c
> +++ b/net/bridge/br_input.c
> @@ -198,8 +198,12 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
>   if (dst) {
>   unsigned long now = jiffies;
>  
> - if (dst->is_local)
> + if (dst->is_local) {
> + /* fix up potential DNAT mess */
> + skb->pkt_type = PACKET_HOST;

I would like to find a way to fix this from ebtables itself, so we
don't need to add this code to the bridge core path. AFAICS, from
prerouting we don't know the dst yet, so we cannot know if this packet
is local from there.


Re: [PATCH net] bridge: ebtables: fix reception of frames DNAT-ed to bridge device

2017-03-15 Thread Florian Westphal
Linus Lüssing  wrote:
> When trying to redirect bridged frames to the bridge device itself
> via the ebtables nat-prerouting chain and the dnat target then this
> currently fails:
> 
> The ethernet destination of the frame is dnat'ed to the MAC address of
> the bridge itself just fine and the correctly altered frame can even
> be captured via a tcpdump on br0 (with or without promisc mode).
>
> However, the IP code drops it in the beginning of ip_input.c/ip_rcv()
> as the dnat target did not update the skb->pkt_type.

Right, that's the reason why ebtables also has ebt_redirect target
which does this pkt_type fixup.

> - if (dst->is_local)
> + if (dst->is_local) {
> + /* fix up potential DNAT mess */
> + skb->pkt_type = PACKET_HOST;
> +
>   return br_pass_frame_up(skb);
> + }

I don't mind this change though (i.e. I don't see how this would
bite us later).
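
For readers following the thread, the failure and the proposed fixup can be
modelled in a few lines of userspace C. This is only a sketch under stated
assumptions: struct model_frame, the MODEL_PACKET_* values and all helper
names are hypothetical simplifications of skb->pkt_type handling, not
bridge or ebtables code.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for PACKET_HOST / PACKET_OTHERHOST. */
enum model_pkt_type { MODEL_PACKET_HOST, MODEL_PACKET_OTHERHOST };

struct model_frame {
	unsigned char dst_mac[6];
	enum model_pkt_type pkt_type;
};

/* Classification on receive, before any rewriting of the frame. */
static void model_classify(struct model_frame *f, const unsigned char *own_mac)
{
	assert(f != NULL && own_mac != NULL);
	f->pkt_type = memcmp(f->dst_mac, own_mac, 6) == 0 ?
		      MODEL_PACKET_HOST : MODEL_PACKET_OTHERHOST;
}

/* dnat: rewrites the destination MAC only; pkt_type is left stale. */
static void model_dnat(struct model_frame *f, const unsigned char *new_dst)
{
	memcpy(f->dst_mac, new_dst, 6);
}

/* The proposed fixup: a frame whose destination is local becomes HOST. */
static void model_deliver_fixup(struct model_frame *f, bool dst_is_local)
{
	if (dst_is_local)
		f->pkt_type = MODEL_PACKET_HOST;
}

/* Models the IP receive check: frames for other hosts are dropped. */
static bool model_ip_accepts(const struct model_frame *f)
{
	return f->pkt_type != MODEL_PACKET_OTHERHOST;
}
```

A frame addressed elsewhere is classified as OTHERHOST, stays that way after
model_dnat(), and is only accepted once model_deliver_fixup() has run, which
is the behaviour the patch adds to the local delivery path.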