[PATCH nf v2] netfilter: nat: limit port clash resolution attempts

2018-12-08 Thread Florian Westphal
In case almost or all available ports are taken, clash resolution can
take a very long time, resulting in soft lockup.

This can happen when many to-be-natted hosts connect to same
destination:port (e.g. a proxy) and all connections pass the same SNAT.

Pick a random offset in the acceptable range, then try ever smaller
number of adjacent port numbers, until either the limit is reached or a
useable port was found.  This results in at most 248 attempts
(128 + 64 + 32 + 16 + 8, i.e. 4 restarts with new search offset)
instead of 64000+.

v2: increment 'i' too in for loop (Xiaozhou Liu)

Signed-off-by: Florian Westphal 
---
 Pablo,

 this will unfortunately result in a nf-next merge conflict
 due to *rover removal in nf-next.
 I can send a patch vs. nf-next instead if you prefer.

 net/netfilter/nf_nat_proto_common.c | 26 ++
 1 file changed, 22 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/nf_nat_proto_common.c b/net/netfilter/nf_nat_proto_common.c
index 5d849d835561..0e3321660624 100644
--- a/net/netfilter/nf_nat_proto_common.c
+++ b/net/netfilter/nf_nat_proto_common.c
@@ -41,9 +41,10 @@ void nf_nat_l4proto_unique_tuple(const struct nf_nat_l3proto *l3proto,
 const struct nf_conn *ct,
 u16 *rover)
 {
-   unsigned int range_size, min, max, i;
+   unsigned int range_size, min, max, i, attempts;
__be16 *portptr;
-   u_int16_t off;
+   u16 off;
+   static const unsigned int max_attempts = 128;
 
if (maniptype == NF_NAT_MANIP_SRC)
portptr = &tuple->src.u.all;
@@ -89,15 +90,32 @@ void nf_nat_l4proto_unique_tuple(const struct nf_nat_l3proto *l3proto,
off = *rover;
}
 
-   for (i = 0; ; ++off) {
+   attempts = range_size;
+   if (attempts > max_attempts)
+   attempts = max_attempts;
+
+   /* We are in softirq; doing a search of the entire range risks
+* soft lockup when all tuples are already used.
+*
+* If we can't find any free port from first offset, pick a new
+* one and try again, with ever smaller search window.
+*/
+another_round:
+   for (i = 0; i < attempts; i++, off++) {
*portptr = htons(min + off % range_size);
-   if (++i != range_size && nf_nat_used_tuple(tuple, ct))
+   if (nf_nat_used_tuple(tuple, ct))
continue;
if (!(range->flags & (NF_NAT_RANGE_PROTO_RANDOM_ALL|
NF_NAT_RANGE_PROTO_OFFSET)))
*rover = off;
return;
}
+
+   if (attempts >= range_size || attempts < 16)
+   return;
+   attempts /= 2;
+   off = prandom_u32();
+   goto another_round;
 }
 EXPORT_SYMBOL_GPL(nf_nat_l4proto_unique_tuple);
 
-- 
2.19.2
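
The attempt budget described in the commit message can be checked with a small
userspace sketch. This is not the kernel code: find_port(), port_used_fn and
the use of rand() in place of prandom_u32() are assumptions made for
illustration, and unlike the patch the sketch reports success through an
explicit flag. It only mirrors the loop structure (cap at 128 probes, halve the
window on each restart, stop once the window has dropped below 16).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ATTEMPTS 128u

/* Hypothetical stand-in for nf_nat_used_tuple(): true if the port is taken. */
typedef bool (*port_used_fn)(uint16_t port, void *ctx);

/*
 * Probe ports in [min, min + range_size), starting at offset 'off'.
 * Returns the number of probes made; *found tells whether *out is valid.
 */
static unsigned int find_port(unsigned int min, unsigned int range_size,
                              uint16_t off, port_used_fn used, void *ctx,
                              bool *found, uint16_t *out)
{
    unsigned int attempts, probes = 0, i;

    assert(range_size > 0);
    attempts = range_size < MAX_ATTEMPTS ? range_size : MAX_ATTEMPTS;
    *found = false;

    for (;;) {
        for (i = 0; i < attempts; i++, off++) {
            uint16_t port = (uint16_t)(min + off % range_size);

            probes++;
            if (used(port, ctx))
                continue;
            *found = true;
            *out = port;
            return probes;
        }
        /* Whole range already covered, or window too small: give up. */
        if (attempts >= range_size || attempts < 16)
            return probes;
        attempts /= 2;
        off = (uint16_t)rand();   /* stand-in for prandom_u32() */
    }
}

/* Test helpers: every port busy, or every port busy except one. */
static bool all_used(uint16_t port, void *ctx)
{
    (void)port;
    (void)ctx;
    return true;
}

static bool only_free_is(uint16_t port, void *ctx)
{
    return port != *(uint16_t *)ctx;
}
```

With every port busy and a range of 64512 ports, the sketch stops after
128 + 64 + 32 + 16 + 8 = 248 probes, which matches the figure in the commit
message. For a range smaller than 128 ports it scans the range once and stops.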



Re: Another compilation error

2018-12-08 Thread Ansuel Smith
Sorry already patched. Ignore this.
On Sat, 8 Dec 2018 at 20:29, Ansuel Smith wrote:
>
> I think this is triggered with nftables support enabled.
>
> In file included from
> /home/daniel/Build/openwrt-ath79/staging_dir/toolchain-mips_24kc_gcc-7.3.0_musl/include/net/ethernet.h:10:0,
>  from ../iptables/nft-bridge.h:8,
>  from libebt_vlan.c:18:
> /home/daniel/Build/openwrt-ath79/staging_dir/toolchain-mips_24kc_gcc-7.3.0_musl/include/netinet/if_ether.h:111:8:
> error: redefinition of 'struct ethhdr'
>  struct ethhdr {
> ^~
> In file included from libebt_vlan.c:16:0:
> /home/daniel/Build/openwrt-ath79/build_dir/target-mips_24kc_musl/linux-ath79_generic/linux-4.14.82/user_headers/include/linux/if_ether.h:155:8:
> note: originally defined here
>  struct ethhdr {
> ^~
> make[6]: *** [GNUmakefile:127: libebt_vlan.oo] Error 1
> make[6]: Leaving directory
> '/home/daniel/Build/openwrt-ath79/build_dir/target-mips_24kc_musl/linux-ath79_generic/iptables-1.8.2/extensions'
> make[5]: *** [Makefile:506: all-recursive] Error 1


Another compilation error

2018-12-08 Thread Ansuel Smith
I think this is triggered with nftables support enabled.

In file included from
/home/daniel/Build/openwrt-ath79/staging_dir/toolchain-mips_24kc_gcc-7.3.0_musl/include/net/ethernet.h:10:0,
 from ../iptables/nft-bridge.h:8,
 from libebt_vlan.c:18:
/home/daniel/Build/openwrt-ath79/staging_dir/toolchain-mips_24kc_gcc-7.3.0_musl/include/netinet/if_ether.h:111:8:
error: redefinition of 'struct ethhdr'
 struct ethhdr {
^~
In file included from libebt_vlan.c:16:0:
/home/daniel/Build/openwrt-ath79/build_dir/target-mips_24kc_musl/linux-ath79_generic/linux-4.14.82/user_headers/include/linux/if_ether.h:155:8:
note: originally defined here
 struct ethhdr {
^~
make[6]: *** [GNUmakefile:127: libebt_vlan.oo] Error 1
make[6]: Leaving directory
'/home/daniel/Build/openwrt-ath79/build_dir/target-mips_24kc_musl/linux-ath79_generic/iptables-1.8.2/extensions'
make[5]: *** [Makefile:506: all-recursive] Error 1


Re: [PATCH nf] netfilter: nat: limit port clash resolution attempts

2018-12-08 Thread Florian Westphal
Xiaozhou Liu  wrote:
> > +   for (i = 0; i < attempts; ++off) {
> > *portptr = htons(min + off % range_size);
> > -   if (++i != range_size && nf_nat_used_tuple(tuple, ct))
> > +   if (nf_nat_used_tuple(tuple, ct))
> > continue;
> > if (!(range->flags & (NF_NAT_RANGE_PROTO_RANDOM_ALL|
> > NF_NAT_RANGE_PROTO_OFFSET)))
> > *rover = off;
> > return;
> > }
> 
> i never gets increased here so will it loop forever in the worst?

good catch, i should be incremented in the loop. I will send a v2.
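
The problem Xiaozhou points out comes from how 'continue' behaves in a for
loop: it jumps to the increment expression, so if that expression only bumps
'off', the bound on 'i' never takes effect. A minimal userspace sketch of both
loop shapes follows; probes_v1(), probes_v2(), port_in_use() and the 'cap'
safety limit are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Pretend every port is taken, which is the worst case discussed above. */
static bool port_in_use(unsigned int off)
{
    (void)off;
    return true;
}

/* Loop shape from v1: only 'off' advances. 'cap' keeps the demo finite. */
static unsigned int probes_v1(unsigned int attempts, unsigned int cap)
{
    unsigned int i, off = 0, probes = 0;

    assert(cap > 0);
    for (i = 0; i < attempts; ++off) {
        if (++probes == cap)
            break;              /* safety net for the demo only */
        if (port_in_use(off))
            continue;           /* jumps to '++off'; 'i' stays 0 */
        break;                  /* free port found */
    }
    return probes;
}

/* Loop shape from v2: 'i' and 'off' advance together. */
static unsigned int probes_v2(unsigned int attempts, unsigned int cap)
{
    unsigned int i, off = 0, probes = 0;

    assert(cap > 0);
    for (i = 0; i < attempts; i++, off++) {
        if (++probes == cap)
            break;
        if (port_in_use(off))
            continue;
        break;
    }
    return probes;
}
```

With attempts = 128, the v1 shape only stops when the artificial cap is hit,
while the v2 shape stops after 128 probes.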


Re: [PATCH nf] netfilter: nat: limit port clash resolution attempts

2018-12-08 Thread Xiaozhou Liu
On Sat, Dec 08, 2018 at 11:07:44AM +0100, Florian Westphal wrote:
>  Pablo,
> 
>  this will unfortunately result in a nf-next merge conflict
>  due to *rover removal in nf-next.
>  I can send a patch vs. nf-next instead if you prefer.
> 
>  net/netfilter/nf_nat_proto_common.c | 26 ++
>  1 file changed, 22 insertions(+), 4 deletions(-)
> 
> diff --git a/net/netfilter/nf_nat_proto_common.c b/net/netfilter/nf_nat_proto_common.c
> index 5d849d835561..0e3321660624 100644
> --- a/net/netfilter/nf_nat_proto_common.c
> +++ b/net/netfilter/nf_nat_proto_common.c
> @@ -41,9 +41,10 @@ void nf_nat_l4proto_unique_tuple(const struct nf_nat_l3proto *l3proto,
>const struct nf_conn *ct,
>u16 *rover)
>  {
> - unsigned int range_size, min, max, i;
> + unsigned int range_size, min, max, i, attempts;
>   __be16 *portptr;
> - u_int16_t off;
> + u16 off;
> + static const unsigned int max_attempts = 128;
>  
>   if (maniptype == NF_NAT_MANIP_SRC)
>   portptr = &tuple->src.u.all;
> @@ -89,15 +90,32 @@ void nf_nat_l4proto_unique_tuple(const struct nf_nat_l3proto *l3proto,
>   off = *rover;
>   }
>  
> - for (i = 0; ; ++off) {
> + attempts = range_size;
> + if (attempts > max_attempts)
> + attempts = max_attempts;
> +
> + /* We are in softirq; doing a search of the entire range risks
> +  * soft lockup when all tuples are already used.
> +  *
> +  * If we can't find any free port from first offset, pick a new
> +  * one and try again, with ever smaller search window.
> +  */
> +another_round:
> + for (i = 0; i < attempts; ++off) {
>   *portptr = htons(min + off % range_size);
> - if (++i != range_size && nf_nat_used_tuple(tuple, ct))
> + if (nf_nat_used_tuple(tuple, ct))
>   continue;
>   if (!(range->flags & (NF_NAT_RANGE_PROTO_RANDOM_ALL|
>   NF_NAT_RANGE_PROTO_OFFSET)))
>   *rover = off;
>   return;
>   }

i never gets increased here so will it loop forever in the worst?


Thanks,
Xiaozhou


[PATCH nf] netfilter: nat: limit port clash resolution attempts

2018-12-08 Thread Florian Westphal
In case almost or all available ports are taken, clash resolution can
take a very long time, resulting in soft lockup.

This can happen when many to-be-natted hosts connect to same
destination:port (e.g. a proxy) and all connections pass the same SNAT.

Pick a random offset in the acceptable range, then try ever smaller
number of adjacent port numbers, until either the limit is reached or a
useable port was found.  This results in at most 248 attempts
(128 + 64 + 32 + 16 + 8, i.e. 4 restarts with new search offset)
instead of 64000+.

Signed-off-by: Florian Westphal 
---
 Pablo,

 this will unfortunately result in a nf-next merge conflict
 due to *rover removal in nf-next.
 I can send a patch vs. nf-next instead if you prefer.

 net/netfilter/nf_nat_proto_common.c | 26 ++
 1 file changed, 22 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/nf_nat_proto_common.c b/net/netfilter/nf_nat_proto_common.c
index 5d849d835561..0e3321660624 100644
--- a/net/netfilter/nf_nat_proto_common.c
+++ b/net/netfilter/nf_nat_proto_common.c
@@ -41,9 +41,10 @@ void nf_nat_l4proto_unique_tuple(const struct nf_nat_l3proto *l3proto,
 const struct nf_conn *ct,
 u16 *rover)
 {
-   unsigned int range_size, min, max, i;
+   unsigned int range_size, min, max, i, attempts;
__be16 *portptr;
-   u_int16_t off;
+   u16 off;
+   static const unsigned int max_attempts = 128;
 
if (maniptype == NF_NAT_MANIP_SRC)
portptr = &tuple->src.u.all;
@@ -89,15 +90,32 @@ void nf_nat_l4proto_unique_tuple(const struct nf_nat_l3proto *l3proto,
off = *rover;
}
 
-   for (i = 0; ; ++off) {
+   attempts = range_size;
+   if (attempts > max_attempts)
+   attempts = max_attempts;
+
+   /* We are in softirq; doing a search of the entire range risks
+* soft lockup when all tuples are already used.
+*
+* If we can't find any free port from first offset, pick a new
+* one and try again, with ever smaller search window.
+*/
+another_round:
+   for (i = 0; i < attempts; ++off) {
*portptr = htons(min + off % range_size);
-   if (++i != range_size && nf_nat_used_tuple(tuple, ct))
+   if (nf_nat_used_tuple(tuple, ct))
continue;
if (!(range->flags & (NF_NAT_RANGE_PROTO_RANDOM_ALL|
NF_NAT_RANGE_PROTO_OFFSET)))
*rover = off;
return;
}
+
+   if (attempts >= range_size || attempts < 16)
+   return;
+   attempts /= 2;
+   off = prandom_u32();
+   goto another_round;
 }
 EXPORT_SYMBOL_GPL(nf_nat_l4proto_unique_tuple);
 
-- 
2.19.2



[PATCH nf] netfilter: nf_conncount: use rb_link_node_rcu() instead of rb_link_node()

2018-12-07 Thread Taehee Yoo
rbnode in insert_tree() is an RCU-protected pointer, so the _rcu variant
must be used when linking the new node through it.
rb_link_node_rcu() is the RCU version of rb_link_node().

Fixes: 34848d5c896e ("netfilter: nf_conncount: Split insert and traversal")
Signed-off-by: Taehee Yoo 
---
 net/netfilter/nf_conncount.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/nf_conncount.c b/net/netfilter/nf_conncount.c
index b6d0f6deea86..9cd180bda092 100644
--- a/net/netfilter/nf_conncount.c
+++ b/net/netfilter/nf_conncount.c
@@ -427,7 +427,7 @@ insert_tree(struct net *net,
count = 1;
rbconn->list.count = count;
 
-   rb_link_node(&rbconn->node, parent, rbnode);
+   rb_link_node_rcu(&rbconn->node, parent, rbnode);
rb_insert_color(&rbconn->node, root);
 out_unlock:
spin_unlock_bh(&nf_conncount_locks[hash % CONNCOUNT_LOCK_SLOTS]);
-- 
2.17.1
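
Why the _rcu variant matters can be sketched in userspace with C11 atomics.
This is an analogy, not the kernel implementation: struct node,
link_node_published() and lookup() are hypothetical, a release store stands in
for rcu_assign_pointer(), an acquire load stands in for rcu_dereference(), and
tree rebalancing is left out entirely. The point is only the ordering: the new
node is fully initialised first, and becomes reachable by lockless readers
last.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct node {
    int key;
    struct node *parent;
    _Atomic(struct node *) left, right;
};

/*
 * Initialise the node completely, then publish it with a release store so
 * that a concurrent reader can never observe a half-initialised node.
 */
static void link_node_published(struct node *n, struct node *parent,
                                _Atomic(struct node *) *link)
{
    assert(n != NULL && link != NULL);
    n->parent = parent;
    atomic_store_explicit(&n->left, NULL, memory_order_relaxed);
    atomic_store_explicit(&n->right, NULL, memory_order_relaxed);
    atomic_store_explicit(link, n, memory_order_release);
}

/* Lockless reader: every child pointer is loaded with acquire semantics. */
static struct node *lookup(_Atomic(struct node *) *root, int key)
{
    struct node *n = atomic_load_explicit(root, memory_order_acquire);

    while (n != NULL && n->key != key)
        n = atomic_load_explicit(key < n->key ? &n->left : &n->right,
                                 memory_order_acquire);
    return n;
}
```

A plain store in place of the release store would allow the link to become
visible before the node's fields, which is the hazard the patch closes for
readers that walk the tree without taking the lock.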



Re: [PATCH RFC] src: support for arp ether and IP source and destination fields

2018-12-07 Thread Pablo Neira Ayuso
On Fri, Dec 07, 2018 at 02:05:15PM +0100, Florian Westphal wrote:
> Pablo Neira Ayuso  wrote:
> > Add ip-saddr, ip-daddr, ether-saddr, ether-daddr for arp, eg.
> > 
> >  # nft add table arp x
> >  # nft add chain arp x y { type filter hook input priority 0\; }
> >  # nft add rule arp x y arp ip-saddr 192.168.2.1 counter
> 
> 'arp {ip,ether} {s,d}addr' would create ambiguities?

That's my concern moving forward with the grammar, but I can double
check more carefully. I just quickly jumped on this when looking at
one of the bugzilla issues.

> If so, this '-' notation seems ok to me.

Thanks.


Re: [PATCH RFC] src: support for arp ether and IP source and destination fields

2018-12-07 Thread Florian Westphal
Pablo Neira Ayuso  wrote:
> Add ip-saddr, ip-daddr, ether-saddr, ether-daddr for arp, eg.
> 
>  # nft add table arp x
>  # nft add chain arp x y { type filter hook input priority 0\; }
>  # nft add rule arp x y arp ip-saddr 192.168.2.1 counter

'arp {ip,ether} {s,d}addr' would create ambiguities?

If so, this '-' notation seems ok to me.



[PATCH RFC] src: support for arp ether and IP source and destination fields

2018-12-07 Thread Pablo Neira Ayuso
Add ip-saddr, ip-daddr, ether-saddr, ether-daddr for arp, eg.

 # nft add table arp x
 # nft add chain arp x y { type filter hook input priority 0\; }
 # nft add rule arp x y arp ip-saddr 192.168.2.1 counter

Testing this:

 # ip neigh flush dev eth0
 # ping 8.8.8.8
 # nft list ruleset
 table arp x {
chain y {
type filter hook input priority filter; policy accept;
arp ip-saddr 192.168.2.1 counter packets 1 bytes 46
}
 }

Signed-off-by: Pablo Neira Ayuso 
---
Documentation is still missing, and we should generate dependencies to
restrict htype and ptype.
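
As a sanity check on the new struct arp_hdr from this patch, the sketch below
verifies the layout for the Ethernet/IPv4 case (hlen 6, plen 4) and reads the
sender IP from a raw frame. The struct is copied from the patch; the helper
arp_sender_ip() is hypothetical and not part of nftables. The packed attribute
matters here: spa sits at byte offset 14, which is not 4-byte aligned, so
without it a typical compiler would insert padding and the offsets would no
longer match the wire format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct arp_hdr {
    uint16_t htype;
    uint16_t ptype;
    uint8_t  hlen;
    uint8_t  plen;
    uint16_t oper;
    uint8_t  sha[6];
    uint32_t spa;
    uint8_t  tha[6];
    uint32_t tpa;
} __attribute__((__packed__));

/* Copy the sender IPv4 address (network byte order) out of a raw ARP frame. */
static void arp_sender_ip(const uint8_t *frame, size_t len, uint8_t out[4])
{
    assert(len >= sizeof(struct arp_hdr));
    memcpy(out, frame + offsetof(struct arp_hdr, spa), 4);
}
```

The expected offsets are sha at 8, spa at 14, tha at 18 and tpa at 24, for a
total of 28 bytes.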

 include/headers.h  | 12 
 include/proto.h    |  4 
 src/parser_bison.y |  8 
 src/proto.c        | 22 ++
 src/scanner.l      |  4 
 5 files changed, 42 insertions(+), 8 deletions(-)

diff --git a/include/headers.h b/include/headers.h
index 3d564debf8b0..759f93bf8c7a 100644
--- a/include/headers.h
+++ b/include/headers.h
@@ -78,6 +78,18 @@ struct sctphdr {
uint32_t checksum;
 };
 
+struct arp_hdr {
+   uint16_t htype;
+   uint16_t ptype;
+   uint8_t hlen;
+   uint8_t plen;
+   uint16_t oper;
+   uint8_t sha[6];
+   uint32_t spa;
+   uint8_t tha[6];
+   uint32_t tpa;
+} __attribute__((__packed__));
+
 struct ipv6hdr {
uint8_t version:4,
priority:4;
diff --git a/include/proto.h b/include/proto.h
index 9a9f9255f047..b097e8fcbc2b 100644
--- a/include/proto.h
+++ b/include/proto.h
@@ -182,6 +182,10 @@ enum arp_hdr_fields {
ARPHDR_HLN,
ARPHDR_PLN,
ARPHDR_OP,
+   ARPHDR_IP_SADDR,
+   ARPHDR_IP_DADDR,
+   ARPHDR_ETHER_SADDR,
+   ARPHDR_ETHER_DADDR,
 };
 
 enum ip_hdr_fields {
diff --git a/src/parser_bison.y b/src/parser_bison.y
index 34202b0415ec..e94282a43615 100644
--- a/src/parser_bison.y
+++ b/src/parser_bison.y
@@ -296,6 +296,10 @@ int nft_lex(void *, void *, void *);
 %token HLEN "hlen"
 %token PLEN "plen"
 %token OPERATION "operation"
+%token ETHER_SADDR "ether-saddr"
+%token ETHER_DADDR "ether-daddr"
+%token IP_SADDR "ip-saddr"
+%token IP_DADDR "ip-daddr"
 
 %token IP "ip"
 %token HDRVERSION "version"
@@ -4204,6 +4208,10 @@ arp_hdr_field:   HTYPE   { $$ = ARPHDR_HRD; }
|   HLEN    { $$ = ARPHDR_HLN; }
|   PLEN    { $$ = ARPHDR_PLN; }
|   OPERATION   { $$ = ARPHDR_OP; }
+   |   ETHER_SADDR { $$ = ARPHDR_ETHER_SADDR; }
+   |   ETHER_DADDR { $$ = ARPHDR_ETHER_DADDR; }
+   |   IP_SADDR    { $$ = ARPHDR_IP_SADDR; }
+   |   IP_DADDR    { $$ = ARPHDR_IP_DADDR; }
;
 
 ip_hdr_expr:   IP  ip_hdr_field
diff --git a/src/proto.c b/src/proto.c
index d178bf39ea90..d52f11ce6f30 100644
--- a/src/proto.c
+++ b/src/proto.c
@@ -822,23 +822,29 @@ const struct datatype arpop_type = {
 };
 
 #define ARPHDR_TYPE(__name, __type, __member) \
-   HDR_TYPE(__name, __type, struct arphdr, __member)
+   HDR_TYPE(__name, __type, struct arp_hdr, __member)
 #define ARPHDR_FIELD(__name, __member) \
-   HDR_FIELD(__name, struct arphdr, __member)
+   HDR_FIELD(__name, struct arp_hdr, __member)
 
 const struct proto_desc proto_arp = {
.name   = "arp",
.base   = PROTO_BASE_NETWORK_HDR,
.templates  = {
-   [ARPHDR_HRD] = ARPHDR_FIELD("htype", ar_hrd),
-   [ARPHDR_PRO] = ARPHDR_TYPE("ptype", &ethertype_type, ar_pro),
-   [ARPHDR_HLN] = ARPHDR_FIELD("hlen", ar_hln),
-   [ARPHDR_PLN] = ARPHDR_FIELD("plen", ar_pln),
-   [ARPHDR_OP] = ARPHDR_TYPE("operation", &arpop_type, ar_op),
+   [ARPHDR_HRD] = ARPHDR_FIELD("htype", htype),
+   [ARPHDR_PRO] = ARPHDR_TYPE("ptype", &ethertype_type, ptype),
+   [ARPHDR_HLN] = ARPHDR_FIELD("hlen", hlen),
+   [ARPHDR_PLN] = ARPHDR_FIELD("plen", plen),
+   [ARPHDR_OP] = ARPHDR_TYPE("operation", &arpop_type, oper),
+   [ARPHDR_ETHER_SADDR] = ARPHDR_TYPE("ether-saddr", &etheraddr_type, sha),
+   [ARPHDR_ETHER_DADDR] = ARPHDR_TYPE("ether-daddr", &etheraddr_type, tha),
+   [ARPHDR_IP_SADDR] = ARPHDR_TYPE("ip-saddr", &ipaddr_type, spa),
+   [ARPHDR_IP_DADDR] = ARPHDR_TYPE("ip-daddr", &ipaddr_type, tpa),
},
.format = {
.filter = (1 << ARPHDR_HRD) | (1 << ARPHDR_PRO) |
- (1 << ARPHDR_HLN) | (1 << 

Re: [PATCH nf] netfilter: seqadj: re-load tcp header pointer after possible head reallocation

2018-12-07 Thread Pablo Neira Ayuso
On Wed, Dec 05, 2018 at 02:12:19PM +0100, Florian Westphal wrote:
> When adjusting sack block sequence numbers, skb_make_writable() gets
> called to make sure tcp options are all in the linear area, and buffer
> is not shared.
> 
> This can cause tcp header pointer to get reallocated, so we must
> reload it to avoid memory corruption.
> 
> This bug pre-dates git history.

Applied, thanks Florian.


Re: [libnftnl PATCH 0/2] chain: Support per chain rules list

2018-12-07 Thread Pablo Neira Ayuso
On Thu, Dec 06, 2018 at 05:17:50PM +0100, Phil Sutter wrote:
> This series implements a rule list in chains to allow for per chain rule
> caches in iptables-nft as well as nftables.
> 
> A second patch then adds utility functions for chain and rule lookups,
> preparing for further optimizing these tasks in a transparent way since
> users won't open-code the chain/rule list traversal anymore.

Series applied, thanks Phil.


[libnftnl PATCH 0/2] chain: Support per chain rules list

2018-12-06 Thread Phil Sutter
This series implements a rule list in chains to allow for per chain rule
caches in iptables-nft as well as nftables.

A second patch then adds utility functions for chain and rule lookups,
preparing for further optimizing these tasks in a transparent way since
users won't open-code the chain/rule list traversal anymore.

Phil Sutter (2):
  chain: Support per chain rules list
  chain: Add lookup functions for chain list and rules in chain

 include/internal.h   |   1 +
 include/libnftnl/chain.h |  17 +
 include/rule.h   |  26 
 src/chain.c  | 132 ++-
 src/libnftnl.map |  13 
 src/rule.c   |  22 ---
 6 files changed, 188 insertions(+), 23 deletions(-)
 create mode 100644 include/rule.h

-- 
2.19.0



[libnftnl PATCH 2/2] chain: Add lookup functions for chain list and rules in chain

2018-12-06 Thread Phil Sutter
For now, these lookup functions simply iterate over the linked list
until they find the right entry. In future, they may make use of more
optimized data structures behind the curtains.

Signed-off-by: Phil Sutter 
---
 include/libnftnl/chain.h |  2 ++
 src/chain.c  | 28 
 src/libnftnl.map |  3 +++
 3 files changed, 33 insertions(+)

diff --git a/include/libnftnl/chain.h b/include/libnftnl/chain.h
index f04f61056cc7c..64e10e91aaefe 100644
--- a/include/libnftnl/chain.h
+++ b/include/libnftnl/chain.h
@@ -76,6 +76,7 @@ int nftnl_chain_nlmsg_parse(const struct nlmsghdr *nlh, struct nftnl_chain *t);
 int nftnl_rule_foreach(struct nftnl_chain *c,
  int (*cb)(struct nftnl_rule *r, void *data),
  void *data);
+struct nftnl_rule *nftnl_rule_lookup_byindex(struct nftnl_chain *c, uint32_t index);
 
 struct nftnl_rule_iter;
 
@@ -89,6 +90,7 @@ struct nftnl_chain_list *nftnl_chain_list_alloc(void);
 void nftnl_chain_list_free(struct nftnl_chain_list *list);
 int nftnl_chain_list_is_empty(const struct nftnl_chain_list *list);
 int nftnl_chain_list_foreach(struct nftnl_chain_list *chain_list, int (*cb)(struct nftnl_chain *t, void *data), void *data);
+struct nftnl_chain *nftnl_chain_list_lookup_byname(struct nftnl_chain_list *chain_list, const char *chain);
 
 void nftnl_chain_list_add(struct nftnl_chain *r, struct nftnl_chain_list *list);
 void nftnl_chain_list_add_tail(struct nftnl_chain *r, struct nftnl_chain_list *list);
diff --git a/src/chain.c b/src/chain.c
index c8b7f9ba12618..8668fb7d1494d 100644
--- a/src/chain.c
+++ b/src/chain.c
@@ -734,6 +734,20 @@ int nftnl_rule_foreach(struct nftnl_chain *c,
return 0;
 }
 
+EXPORT_SYMBOL(nftnl_rule_lookup_byindex);
+struct nftnl_rule *
+nftnl_rule_lookup_byindex(struct nftnl_chain *c, uint32_t index)
+{
+   struct nftnl_rule *r;
+
+   list_for_each_entry(r, &c->rule_list, head) {
+   if (!index)
+   return r;
+   index--;
+   }
+   return NULL;
+}
+
 struct nftnl_rule_iter {
const struct nftnl_chain *c;
struct nftnl_rule   *cur;
@@ -856,6 +870,20 @@ int nftnl_chain_list_foreach(struct nftnl_chain_list *chain_list,
return 0;
 }
 
+EXPORT_SYMBOL(nftnl_chain_list_lookup_byname);
+struct nftnl_chain *
+nftnl_chain_list_lookup_byname(struct nftnl_chain_list *chain_list,
+  const char *chain)
+{
+   struct nftnl_chain *c;
+
+   list_for_each_entry(c, &chain_list->list, head) {
+   if (!strcmp(chain, c->name))
+   return c;
+   }
+   return NULL;
+}
+
 struct nftnl_chain_list_iter {
const struct nftnl_chain_list   *list;
struct nftnl_chain  *cur;
diff --git a/src/libnftnl.map b/src/libnftnl.map
index 96d5b5f1cec49..0d3be32263eee 100644
--- a/src/libnftnl.map
+++ b/src/libnftnl.map
@@ -345,4 +345,7 @@ LIBNFTNL_12 {
   nftnl_rule_iter_create;
   nftnl_rule_iter_next;
   nftnl_rule_iter_destroy;
+
+  nftnl_chain_list_lookup_byname;
+  nftnl_rule_lookup_byindex;
 } LIBNFTNL_11;
-- 
2.19.0
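
The two lookups added by this patch are plain linear scans. The sketch below
shows the same idea on a self-contained singly linked list; struct rule,
struct chain, rule_lookup_byindex() and chain_lookup_byname() are hypothetical
stand-ins and not the libnftnl types or API, which use the library's own
list_head based lists.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct rule {
    uint32_t handle;
    struct rule *next;
};

struct chain {
    const char *name;
    struct rule *rules;
    struct chain *next;
};

/* Return the rule at position 'index' (0-based), or NULL if out of range. */
static struct rule *rule_lookup_byindex(const struct chain *c, uint32_t index)
{
    struct rule *r;

    assert(c != NULL);
    for (r = c->rules; r != NULL; r = r->next) {
        if (!index)
            return r;
        index--;
    }
    return NULL;
}

/* Return the first chain whose name matches, or NULL if there is none. */
static struct chain *chain_lookup_byname(struct chain *head, const char *name)
{
    struct chain *c;

    assert(name != NULL);
    for (c = head; c != NULL; c = c->next) {
        if (!strcmp(name, c->name))
            return c;
    }
    return NULL;
}
```

Both lookups are O(n). Keeping them behind a function is what lets the
underlying data structure change later without touching callers, which is the
motivation given in the commit message.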



[PATCH v2 nf-next] netfilter: conntrack: udp: only extend timeout to stream mode after 2s

2018-12-06 Thread Florian Westphal
Currently DNS resolvers that send both A and AAAA queries from the same source
port can trigger stream mode prematurely, which results in a non-early-evictable
conntrack entry for three minutes, even though DNS requests are done in a few
milliseconds.

Add a two second grace period where we continue to use the ordinary
30-second default timeout.  It's enough for DNS request/response traffic,
even if two request/reply packets are involved.

ASSURED is still set, else conntrack (and thus a possible
NAT mapping ...) gets zapped too in case conntrack table runs full.

v2: fix comment

Signed-off-by: Florian Westphal 
---
 include/net/netfilter/nf_conntrack.h   |  5 +
 net/netfilter/nf_conntrack_proto_udp.c | 16 +---
 2 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/include/net/netfilter/nf_conntrack.h b/include/net/netfilter/nf_conntrack.h
index 7e012312cd61..249d0a5b12b8 100644
--- a/include/net/netfilter/nf_conntrack.h
+++ b/include/net/netfilter/nf_conntrack.h
@@ -27,12 +27,17 @@
 
 #include 
 
+struct nf_ct_udp {
+   unsigned long   stream_ts;
+};
+
 /* per conntrack: protocol private data */
 union nf_conntrack_proto {
/* insert conntrack proto private data here */
struct nf_ct_dccp dccp;
struct ip_ct_sctp sctp;
struct ip_ct_tcp tcp;
+   struct nf_ct_udp udp;
struct nf_ct_gre gre;
unsigned int tmpl_padto;
 };
diff --git a/net/netfilter/nf_conntrack_proto_udp.c b/net/netfilter/nf_conntrack_proto_udp.c
index ebf151054ad6..82da6b2625b1 100644
--- a/net/netfilter/nf_conntrack_proto_udp.c
+++ b/net/netfilter/nf_conntrack_proto_udp.c
@@ -100,11 +100,21 @@ static int udp_packet(struct nf_conn *ct,
if (!timeouts)
timeouts = udp_get_timeouts(nf_ct_net(ct));
 
+   if (!nf_ct_is_confirmed(ct))
+   ct->proto.udp.stream_ts = 2 * HZ + jiffies;
+
/* If we've seen traffic both ways, this is some kind of UDP
-  stream.  Extend timeout. */
+* stream. Set Assured.
+*/
if (test_bit(IPS_SEEN_REPLY_BIT, &ct->status)) {
-   nf_ct_refresh_acct(ct, ctinfo, skb,
-  timeouts[UDP_CT_REPLIED]);
+   unsigned long extra = timeouts[UDP_CT_UNREPLIED];
+
+   /* Still active after two seconds? Extend timeout. */
+   if (time_after(jiffies, ct->proto.udp.stream_ts))
+   extra = timeouts[UDP_CT_REPLIED];
+
+   nf_ct_refresh_acct(ct, ctinfo, skb, extra);
+
/* Also, more likely to be important, and not a probe */
if (!test_and_set_bit(IPS_ASSURED_BIT, &ct->status))
nf_conntrack_event_cache(IPCT_ASSURED, ct);
-- 
2.19.2



[PATCH nf-next] netfilter: conntrack: udp: reduce default timeouts

2018-12-05 Thread Florian Westphal
We have no explicit signal when a UDP stream has terminated, peers just
stop sending.

For unreplied UDP case, 10 seconds should be enough to cover
delayed replies, and for suspected stream connections a timeout
of two minutes is sane to keep NAT mapping alive a while longer.
It matches tcp conntracks 'timewait' default timeout value.

Signed-off-by: Florian Westphal 
---
 Documentation/networking/nf_conntrack-sysctl.txt | 4 ++--
 net/netfilter/nf_conntrack_proto_udp.c   | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/Documentation/networking/nf_conntrack-sysctl.txt b/Documentation/networking/nf_conntrack-sysctl.txt
index 1669dc2419fd..371b6260dcd5 100644
--- a/Documentation/networking/nf_conntrack-sysctl.txt
+++ b/Documentation/networking/nf_conntrack-sysctl.txt
@@ -154,10 +154,10 @@ nf_conntrack_timestamp - BOOLEAN
Enable connection tracking flow timestamping.
 
 nf_conntrack_udp_timeout - INTEGER (seconds)
-   default 30
+   default 10
 
 nf_conntrack_udp_timeout_stream - INTEGER (seconds)
-   default 180
+   default 120
 
This extended timeout will be used in case there is an UDP stream
detected.
diff --git a/net/netfilter/nf_conntrack_proto_udp.c b/net/netfilter/nf_conntrack_proto_udp.c
index 76cee2fe3b1b..807389da42f4 100644
--- a/net/netfilter/nf_conntrack_proto_udp.c
+++ b/net/netfilter/nf_conntrack_proto_udp.c
@@ -28,8 +28,8 @@
 #include 
 
 static const unsigned int udp_timeouts[UDP_CT_MAX] = {
-   [UDP_CT_UNREPLIED]  = 30*HZ,
-   [UDP_CT_REPLIED]= 180*HZ,
+   [UDP_CT_UNREPLIED]  = 10*HZ,
+   [UDP_CT_REPLIED]= 120*HZ,
 };
 
 static unsigned int *udp_get_timeouts(struct net *net)
-- 
2.19.2



[PATCH nf-next] netfilter: conntrack: udp: only extend timeout after 2s

2018-12-05 Thread Florian Westphal
DNS resolvers that send both A and AAAA queries from the same source port can
trigger stream mode prematurely, which results in a non-early-evictable ct
for three minutes, even though the request is done after a few milliseconds.

Add a two second grace period where we continue to use the ordinary
(unreplied) timeout.  It's enough for DNS request/response traffic, even
if two request/reply packets are involved.

ASSURED is still set, else conntrack (and thus a possible
NAT mapping ...) might get zapped in case conntrack table runs full.

Signed-off-by: Florian Westphal 
---
 include/net/netfilter/nf_conntrack.h   |  5 +
 net/netfilter/nf_conntrack_proto_udp.c | 18 ++
 2 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/include/net/netfilter/nf_conntrack.h b/include/net/netfilter/nf_conntrack.h
index 7e012312cd61..249d0a5b12b8 100644
--- a/include/net/netfilter/nf_conntrack.h
+++ b/include/net/netfilter/nf_conntrack.h
@@ -27,12 +27,17 @@
 
 #include 
 
+struct nf_ct_udp {
+   unsigned long   stream_ts;
+};
+
 /* per conntrack: protocol private data */
 union nf_conntrack_proto {
/* insert conntrack proto private data here */
struct nf_ct_dccp dccp;
struct ip_ct_sctp sctp;
struct ip_ct_tcp tcp;
+   struct nf_ct_udp udp;
struct nf_ct_gre gre;
unsigned int tmpl_padto;
 };
diff --git a/net/netfilter/nf_conntrack_proto_udp.c b/net/netfilter/nf_conntrack_proto_udp.c
index c879d8d78cfd..76cee2fe3b1b 100644
--- a/net/netfilter/nf_conntrack_proto_udp.c
+++ b/net/netfilter/nf_conntrack_proto_udp.c
@@ -100,11 +100,21 @@ static int udp_packet(struct nf_conn *ct,
if (!timeouts)
timeouts = udp_get_timeouts(nf_ct_net(ct));
 
-   /* If we've seen traffic both ways, this is some kind of UDP
-  stream.  Extend timeout. */
+   if (!nf_ct_is_confirmed(ct))
+   ct->proto.udp.stream_ts = 2 * HZ + jiffies;
+
+   /* If we've seen traffic both ways for more than one second, this
+* is some kind of UDP stream.  Set Assured.
+*/
if (test_bit(IPS_SEEN_REPLY_BIT, &ct->status)) {
-   nf_ct_refresh_acct(ct, ctinfo, skb,
-  timeouts[UDP_CT_REPLIED]);
+   unsigned long extra = timeouts[UDP_CT_UNREPLIED];
+
+   /* Still active after two seconds? Extend timeout. */
+   if (time_after(jiffies, ct->proto.udp.stream_ts))
+   extra = timeouts[UDP_CT_REPLIED];
+
+   nf_ct_refresh_acct(ct, ctinfo, skb, extra);
+
/* Also, more likely to be important, and not a probe */
if (!test_and_set_bit(IPS_ASSURED_BIT, &ct->status))
nf_conntrack_event_cache(IPCT_ASSURED, ct);
-- 
2.19.2



[PATCH nf-next] netfilter: nat: remove unnecessary 'else if' branch

2018-12-05 Thread Xiaozhou Liu
Since a pseudo-random starting point is used in finding a port in
the default case, that 'else if' branch above is no longer a necessity.
So remove it to simplify code.

Signed-off-by: Xiaozhou Liu 
---
 net/netfilter/nf_nat_proto_common.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/net/netfilter/nf_nat_proto_common.c b/net/netfilter/nf_nat_proto_common.c
index a7de939fa5a9..136ab65c4082 100644
--- a/net/netfilter/nf_nat_proto_common.c
+++ b/net/netfilter/nf_nat_proto_common.c
@@ -80,8 +80,6 @@ void nf_nat_l4proto_unique_tuple(const struct nf_nat_l3proto *l3proto,
off = l3proto->secure_port(tuple, maniptype == NF_NAT_MANIP_SRC
  ? tuple->dst.u.all
  : tuple->src.u.all);
-   } else if (range->flags & NF_NAT_RANGE_PROTO_RANDOM_FULLY) {
-   off = prandom_u32();
} else if (range->flags & NF_NAT_RANGE_PROTO_OFFSET) {
off = (ntohs(*portptr) - ntohs(range->base_proto.all));
} else {
-- 
2.11.0



[PATCH nf] netfilter: seqadj: re-load tcp header pointer after possible head reallocation

2018-12-05 Thread Florian Westphal
When adjusting sack block sequence numbers, skb_make_writable() gets
called to make sure tcp options are all in the linear area, and buffer
is not shared.

This can cause tcp header pointer to get reallocated, so we must
reload it to avoid memory corruption.

This bug pre-dates git history.

Reported-by: Neel Mehta 
Reported-by: Shane Huntley 
Reported-by: Heather Adkins 
Signed-off-by: Florian Westphal 
---
diff --git a/net/netfilter/nf_conntrack_seqadj.c b/net/netfilter/nf_conntrack_seqadj.c
index a975efd6b8c3..9da303461069 100644
--- a/net/netfilter/nf_conntrack_seqadj.c
+++ b/net/netfilter/nf_conntrack_seqadj.c
@@ -115,12 +115,12 @@ static void nf_ct_sack_block_adjust(struct sk_buff *skb,
 /* TCP SACK sequence number adjustment */
 static unsigned int nf_ct_sack_adjust(struct sk_buff *skb,
  unsigned int protoff,
- struct tcphdr *tcph,
  struct nf_conn *ct,
  enum ip_conntrack_info ctinfo)
 {
-   unsigned int dir, optoff, optend;
+   struct tcphdr *tcph = (void *)skb->data + protoff;
struct nf_conn_seqadj *seqadj = nfct_seqadj(ct);
+   unsigned int dir, optoff, optend;
 
optoff = protoff + sizeof(struct tcphdr);
optend = protoff + tcph->doff * 4;
@@ -128,6 +128,7 @@ static unsigned int nf_ct_sack_adjust(struct sk_buff *skb,
if (!skb_make_writable(skb, optend))
return 0;
 
+   tcph = (void *)skb->data + protoff;
dir = CTINFO2DIR(ctinfo);
 
while (optoff < optend) {
@@ -207,7 +208,7 @@ int nf_ct_seq_adjust(struct sk_buff *skb,
 ntohl(newack));
tcph->ack_seq = newack;
 
-   res = nf_ct_sack_adjust(skb, protoff, tcph, ct, ctinfo);
+   res = nf_ct_sack_adjust(skb, protoff, ct, ctinfo);
 out:
spin_unlock_bh(&ct->lock);
 
-- 
2.19.2
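
The class of bug fixed here is not specific to sk_buffs: any pointer derived
from a buffer's data pointer goes stale once the buffer may have been
reallocated. A userspace sketch of the safe pattern follows. struct buf,
buf_make_private() and bump_field() are hypothetical; buf_make_private() is
only a rough analogue of skb_make_writable() that always moves the data so the
effect is visible.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct buf {
    uint8_t *data;
    size_t len;
};

/* Replace b->data with a private copy; the old storage is freed. */
static int buf_make_private(struct buf *b)
{
    uint8_t *copy;

    assert(b != NULL && b->data != NULL);
    copy = malloc(b->len);
    if (!copy)
        return 0;
    memcpy(copy, b->data, b->len);
    free(b->data);
    b->data = copy;
    return 1;
}

/* Add 'delta' to one header byte, re-deriving the pointer after the move. */
static int bump_field(struct buf *b, size_t hdr_off, size_t field_off,
                      uint8_t delta)
{
    uint8_t *hdr;
    uint8_t old;

    assert(hdr_off + field_off < b->len);
    hdr = b->data + hdr_off;        /* valid only until the buffer moves */
    old = hdr[field_off];

    if (!buf_make_private(b))
        return 0;

    hdr = b->data + hdr_off;        /* reload: the old pointer may dangle */
    hdr[field_off] = (uint8_t)(old + delta);
    return 1;
}
```

Keeping the offset rather than the pointer across the call is the same idea as
the patch, which drops the tcph argument and recomputes it from skb->data and
protoff after skb_make_writable().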



Re: Proposal: rename of arptables.git and ebtables.git

2018-12-05 Thread Pablo Neira Ayuso
On Wed, Dec 05, 2018 at 12:18:30PM +0100, Arturo Borrero Gonzalez wrote:
[...]
> I would apply the -legacy renaming patch regardless. We already did this
> with arptables after the agreement @ NFWS. In fact, me sending the patch
> now (instead of last summer) is just my lack of time to write it earlier :-)

I'm going to apply your patch

Author: Arturo Borrero Gonzalez 
Date:   Wed Nov 28 13:47:28 2018 +0100

ebtables: legacy renaming

OK?

> Also, once the patch is applied, we should consider a release of both
> arptables and ebtables now that iptables contains the -nft variant and
> is being used in the wild.

That's fine with me.


Re: Proposal: rename of arptables.git and ebtables.git

2018-12-05 Thread Arturo Borrero Gonzalez
On 12/4/18 11:57 AM, Pablo Neira Ayuso wrote:
> On Tue, Dec 04, 2018 at 11:50:46AM +0100, Arturo Borrero Gonzalez wrote:
>> On 11/28/18 2:10 PM, Arturo Borrero Gonzalez wrote:
>>> On 11/28/18 1:44 PM, Arturo Borrero Gonzalez wrote:
>>>> Hi,
>>>>
>>>> Now that the iptables.git repo offers arptables-nft and ebtables-nft,
>>>> arptables.git holds arptables-legacy, etc, why we don't just rename the
>>>> repos?
>>>>
>>>> * from arptables.git to arptables-legacy.git
>>>> * from ebtables.git to ebtables-legacy.git
>>>>
>>>> This rename should help distros understand the differences between them
>>>> and better accommodate the packaging of all the related tooling.
>>>>
>>>> Mind that the rename may have side effects in tarball
>>>> generation/publishing etc. I would expect the new arptables tarball to
>>>> include the '-legacy' keyword, and same for ebtables.
>>>>
>>>> If we go ahead with the rename, a new release is worth having,
>>>> announcing these changes as well.

>>>
>>> Also,
>>>
>>> please consider applying the attached patch.
>>>
>>
>> ping :-)
> 
> Phil suggested no rename of the trees, I can update the description in
> git.netfilter.org to place LEGACY there. Concern as you mentioned is
> that it may break existing links/scripts. Not sure git support
> redirections from old repo URI to new one...
> 

Most people use these tools from distributions and if using directly
from git.netfilter.org they won't have problems finding a new URL. If
manually downloading tarball from netfilter.org, even less problem.
Distro packagers would have to refresh the upstream URL, sure, but
that's really a minor thing compared to the big -legacy -nft movement,
which requires a lot of other renaming and adjustments anyway.

My suggestion of the rename of the .git repo is because I already
detected several confused people who don't understand the relationship
between arptables-legacy, arptables-nft and the .git repos they are
served from (and same for ebtables).

Also, worth considering that having the repo clearly stating -legacy in
the name will help raise awareness of the -nft version, which could
serve as another motivation to encourage migration.

I don't even have a strong opinion on this :-) it was just a proposal bc
I see several benefits.

> I think it's fine to apply a patch to add the "-legacy" postfix as we
> do in iptables.
> 
> Are you OK with this approach?
> 

I would apply the -legacy renaming patch regardless. We already did this
with arptables after the agreement @ NFWS. In fact, me sending the patch
now (instead of last summer) is just my lack of time to write it earlier :-)

Also, once the patch is applied, we should consider a release of both
arptables and ebtables now that iptables contains the -nft variant and
is being used in the wild.


Re: stable nftables kernel changes for port to 3.12 kernel

2018-12-05 Thread Pablo Neira Ayuso
On Wed, Dec 05, 2018 at 12:59:43AM +0200, Pavel Melnik wrote:
> Hi
> 
> > I'd just change NF_IP6_PRI_RAW to -450 and use ip6tables rules in raw
> > table.
> 
> We will try, thanks

Have a look at:

commit 902d6a4c2a4f411582689e53fb101895ffe99028
Author: Subash Abhinov Kasiviswanathan 
Date:   Wed Jan 10 20:51:57 2018 -0700

netfilter: nf_defrag: Skip defrag if NOTRACK is set

It's providing a way to do this in the way Florian has mentioned.


Re: stable nftables kernel changes for port to 3.12 kernel

2018-12-04 Thread Pavel Melnik

Hi


> I'd just change NF_IP6_PRI_RAW to -450 and use ip6tables rules in raw
> table.

We will try, thanks

> nft add table ip6 filter
> nft add chain ...
>
> and so on.

I have tried this, but no effect ..

Regards,
 Pavel


Re: stable nftables kernel changes for port to 3.12 kernel

2018-12-04 Thread Florian Westphal
Pavel Melnik  wrote:
> We were asked to implement functionality to drop fragmented IPv6 packets,
> addressed to local interface, on device based 3.12 kernel

Urgh.

I'd just change NF_IP6_PRI_RAW to -450 and use ip6tables rules in raw
table.

> But we observed the 'same' issue if try to use nftables on
> 3.13.0-163-generic PC kernel. No tables and chains are created by nft cmd,
> or at least displayed by 'nft list tables'

That's normal, nftables has no builtin tables.

nft add table ip6 filter
nft add chain ...

and so on.


stable nftables kernel changes for port to 3.12 kernel

2018-12-04 Thread Pavel Melnik

Hi

We were asked to implement dropping of fragmented IPv6 packets
addressed to a local interface, on a device based on a 3.12 kernel.

As I understand it, this is not possible with an ip6tables rule when
nf_conntrack is enabled, but it is possible with nftables.

Could you please advise from which kernel version it makes sense to
backport the nftables functionality? (I found a post that referenced
3.18, but that seems too big a step.)

Our initial attempts to backport the initial integration commits from
the v3.13 kernel did not work.

But we observed the 'same' issue when trying to use nftables on a
3.13.0-163-generic PC kernel: no tables or chains are created by the nft
command, or at least none are displayed by 'nft list tables'.



Regards
   Pavel


Re: Proposal: rename of arptables.git and ebtables.git

2018-12-04 Thread Jan Engelhardt


On Tuesday 2018-12-04 11:57, Pablo Neira Ayuso wrote:
>On Tue, Dec 04, 2018 at 11:50:46AM +0100, Arturo Borrero Gonzalez wrote:
>> On 11/28/18 2:10 PM, Arturo Borrero Gonzalez wrote:
>> > On 11/28/18 1:44 PM, Arturo Borrero Gonzalez wrote:
>> >> Hi,
>> >>
>> >> Now that the iptables.git repo offers arptables-nft and ebtables-nft,
>> >> arptables.git holds arptables-legacy, etc, why we don't just rename the
>> >> repos?
>> >>
>> >> * from arptables.git to arptables-legacy.git
>> >> * from ebtables.git to ebtables-legacy.git
>> > 
>> > please consider applying the attached patch.
>> 
>> ping :-)
>
>Phil suggested no rename of the trees, I can update the description in
>git.netfilter.org to place LEGACY there. Concern as you mentioned is
>that it may break existing links/scripts. Not sure git support
>redirections from old repo URI to new one...
>
>I think it's fine to apply a patch to add the "-legacy" postfix as we
>do in iptables.

I think it is sufficient to do one action. Whoever builds the source will run
into the name difference at some point (and that is all that is needed to raise
awareness). Given git downloads usually do not count as build, the program name
change seems more preferable to have than renaming the git repo.
(But doing both is of course not too bad either.)


Re: Proposal: rename of arptables.git and ebtables.git

2018-12-04 Thread Pablo Neira Ayuso
On Tue, Dec 04, 2018 at 11:50:46AM +0100, Arturo Borrero Gonzalez wrote:
> On 11/28/18 2:10 PM, Arturo Borrero Gonzalez wrote:
> > On 11/28/18 1:44 PM, Arturo Borrero Gonzalez wrote:
> >> Hi,
> >>
> >> Now that the iptables.git repo offers arptables-nft and ebtables-nft,
> >> arptables.git holds arptables-legacy, etc, why we don't just rename the
> >> repos?
> >>
> >> * from arptables.git to arptables-legacy.git
> >> * from ebtables.git to ebtables-legacy.git
> >>
> >> This rename should help distros understand the differences between them
> >> and better accommodate the packaging of all the related tooling.
> >>
> >> Mind that the rename may have side effects in tarball
> >> generation/publishing etc. I would expect the new arptables tarball to
> >> include the '-legacy' keyword, and same for ebtables.
> >>
> >> If we go ahead with the rename, a new release is worth having,
> >> announcing these changes as well.
> >>
> > 
> > Also,
> > 
> > please consider applying the attached patch.
> > 
> 
> ping :-)

Phil suggested not renaming the trees; I can update the description in
git.netfilter.org to place LEGACY there. The concern, as you mentioned, is
that a rename may break existing links/scripts. I'm not sure whether git
supports redirection from the old repo URI to the new one...

I think it's fine to apply a patch to add the "-legacy" postfix as we
do in iptables.

Are you OK with this approach?

Thanks.


Re: Proposal: rename of arptables.git and ebtables.git

2018-12-04 Thread Arturo Borrero Gonzalez
On 11/28/18 2:10 PM, Arturo Borrero Gonzalez wrote:
> On 11/28/18 1:44 PM, Arturo Borrero Gonzalez wrote:
>> Hi,
>>
>> Now that the iptables.git repo offers arptables-nft and ebtables-nft,
>> arptables.git holds arptables-legacy, etc, why we don't just rename the
>> repos?
>>
>> * from arptables.git to arptables-legacy.git
>> * from ebtables.git to ebtables-legacy.git
>>
>> This rename should help distros understand the differences between them
>> and better accommodate the packaging of all the related tooling.
>>
>> Mind that the rename may have side effects in tarball
>> generation/publishing etc. I would expect the new arptables tarball to
>> include the '-legacy' keyword, and same for ebtables.
>>
>> If we go ahead with the rename, a new release is worth having,
>> announcing these changes as well.
>>
> 
> Also,
> 
> please consider applying the attached patch.
> 

ping :-)


Re: [PATCH v3] netfilter/ipset: replace a strncpy() with strscpy()

2018-12-04 Thread Jozsef Kadlecsik
Hi,

On Sat, 1 Dec 2018, Qian Cai wrote:

> To make overflows as obvious as possible and to prevent code from blithely
> proceeding with a truncated string. This also has a side-effect to fix a
> compilation warning when using GCC 8.2.1.
> 
> net/netfilter/ipset/ip_set_core.c: In function 'ip_set_sockfn_get':
> net/netfilter/ipset/ip_set_core.c:2027:3: warning: 'strncpy' writing 32
> bytes into a region of size 2 overflows the destination
> [-Wstringop-overflow=]
> 
> Signed-off-by: Qian Cai 

Patch is applied with a slight modification, see below:
 
> Changelog:
> * v2:
> - Released the lock for the error-path as well.
> * v1:
> - Checked the return value.
> 
>  net/netfilter/ipset/ip_set_core.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/net/netfilter/ipset/ip_set_core.c 
> b/net/netfilter/ipset/ip_set_core.c
> index 1577f2f76060..33929fb645b6 100644
> --- a/net/netfilter/ipset/ip_set_core.c
> +++ b/net/netfilter/ipset/ip_set_core.c
> @@ -2024,9 +2024,11 @@ ip_set_sockfn_get(struct sock *sk, int optval, void 
> __user *user, int *len)
>   }
>   nfnl_lock(NFNL_SUBSYS_IPSET);
>   set = ip_set(inst, req_get->set.index);
> - strncpy(req_get->set.name, set ? set->name : "",
> - IPSET_MAXNAMELEN);
> + ret = strscpy(req_get->set.name, set ? set->name : "",
> +   IPSET_MAXNAMELEN);
>   nfnl_unlock(NFNL_SUBSYS_IPSET);
> + if (ret == -E2BIG)

I replaced the condition with

if (ret < 0)

so that it can handle future error codes from strscpy() as well.

> + goto done;
>   goto copy;
>   }
>   default:

Best regards,
Jozsef
-
E-mail  : kad...@blackhole.kfki.hu, kadlecsik.joz...@wigner.mta.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : Wigner Research Centre for Physics, Hungarian Academy of Sciences
  H-1525 Budapest 114, POB. 49, Hungary


Re: [PATCH nf] netfilter: nf_tables: fix suspicious RCU usage in nft_chain_stats_replace()

2018-12-03 Thread Pablo Neira Ayuso
On Mon, Nov 26, 2018 at 08:03:30PM +0900, Taehee Yoo wrote:
> basechain->stats is rcu protected data.
> And write critical section of basechain->stats data is
> nft_chain_stats_replace().
> The function is executed in commit phase. so that actually commit_mutex
> lock protects that.
> Hence commit_mutex lockdep should be used for rcu_dereference_protected()
> in the nft_chain_stats_replace() instead of NFNL_SUBSYS_NFTABLES.

Applied, thanks.


[PATCH nft] parser: bail out on incorrect burst unit

2018-12-03 Thread Pablo Neira Ayuso
Burst can be either bytes or packets, depending on the rate limit unit.

 # nft add rule x y iif eth0 limit rate 512 kbytes/second burst 5 packets
 Error: syntax error, unexpected packets, expecting string or bytes
 add rule x y iif eth0 limit rate 512 kbytes/second burst 5 packets
^^^

Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1306
Signed-off-by: Pablo Neira Ayuso 
---
 src/parser_bison.y   | 15 +--
 tests/py/any/limit.t |  2 ++
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/src/parser_bison.y b/src/parser_bison.y
index e73e1ecd0805..34202b0415ec 100644
--- a/src/parser_bison.y
+++ b/src/parser_bison.y
@@ -590,7 +590,7 @@ int nft_lex(void *, void *, void *);
 %type level_type log_flags log_flags_tcp log_flag_tcp
 %typelimit_stmt quota_stmt connlimit_stmt
 %destructor { stmt_free($$); } limit_stmt quota_stmt connlimit_stmt
-%type limit_burst limit_mode time_unit quota_mode
+%type limit_burst_pkts limit_burst_bytes limit_mode 
time_unit quota_mode
 %typereject_stmt reject_stmt_alloc
 %destructor { stmt_free($$); } reject_stmt reject_stmt_alloc
 %typenat_stmt nat_stmt_alloc masq_stmt 
masq_stmt_alloc redir_stmt redir_stmt_alloc
@@ -2475,7 +2475,7 @@ log_flag_tcp  :   SEQUENCE
}
;
 
-limit_stmt :   LIMIT   RATElimit_mode  NUM SLASH   
time_unit   limit_burst
+limit_stmt :   LIMIT   RATElimit_mode  NUM SLASH   
time_unit   limit_burst_pkts
{
$$ = limit_stmt_alloc(&@$);
$$->limit.rate  = $4;
@@ -2484,7 +2484,7 @@ limit_stmt:   LIMIT   RATE
limit_mode  NUM SLASH   time_unit   limit_burst
$$->limit.type  = NFT_LIMIT_PKTS;
$$->limit.flags = $3;
}
-   |   LIMIT   RATElimit_mode  NUM STRING  
limit_burst
+   |   LIMIT   RATElimit_mode  NUM STRING  
limit_burst_bytes
{
struct error_record *erec;
uint64_t rate, unit;
@@ -2565,8 +2565,11 @@ limit_mode   :   OVER
{ $$ = NFT_LIMIT_F_INV; }
|   /* empty */ { $$ = 0; }
;
 
-limit_burst:   /* empty */ { $$ = 0; }
+limit_burst_pkts   :   /* empty */ { $$ = 0; }
|   BURST   NUM PACKETS { $$ = $2; }
+   ;
+
+limit_burst_bytes  :   /* empty */ { $$ = 0; }
|   BURST   NUM BYTES   { $$ = $2; }
|   BURST   NUM STRING
{
@@ -3532,7 +3535,7 @@ ct_obj_alloc  :
}
;
 
-limit_config   :   RATElimit_mode  NUM SLASH   
time_unit   limit_burst
+limit_config   :   RATElimit_mode  NUM SLASH   
time_unit   limit_burst_pkts
{
struct limit *limit;
limit = xzalloc(sizeof(*limit));
@@ -3543,7 +3546,7 @@ limit_config  :   RATElimit_mode  
NUM SLASH   time_unit   limit_burst
limit->flags= $2;
$$ = limit;
}
-   |   RATElimit_mode  NUM STRING  
limit_burst
+   |   RATElimit_mode  NUM STRING  
limit_burst_bytes
{
struct limit *limit;
struct error_record *erec;
diff --git a/tests/py/any/limit.t b/tests/py/any/limit.t
index 8180bea3ddae..ef7f93133297 100644
--- a/tests/py/any/limit.t
+++ b/tests/py/any/limit.t
@@ -14,6 +14,7 @@ limit rate 400/hour;ok
 limit rate 40/day;ok
 limit rate 400/week;ok
 limit rate 1023/second burst 10 packets;ok
+limit rate 1023/second burst 10 bytes;fail
 
 limit rate 1 kbytes/second;ok
 limit rate 2 kbytes/second;ok
@@ -21,6 +22,7 @@ limit rate 1025 kbytes/second;ok
 limit rate 1023 mbytes/second;ok
 limit rate 10230 mbytes/second;ok
 limit rate 1023000 mbytes/second;ok
+limit rate 512 kbytes/second burst 5 packets;fail
 
 limit rate 1025 bytes/second burst 512 bytes;ok
 limit rate 1025 kbytes/second burst 1023 kbytes;ok
-- 
2.11.0



Re: [PATCH RESEND iptables] include: extend the headers conflict workaround to in6.h

2018-12-03 Thread Pablo Neira Ayuso
On Sun, Dec 02, 2018 at 06:56:34PM +0200, Baruch Siach wrote:
> Commit 8d9d7e4b9ef ("include: fix build with kernel headers before 4.2")
> introduced a kernel/user headers conflict workaround that allows build
> of iptables with kernel headers older than 4.2. This minor extension
> allows build with kernel headers older than 3.12, which is the version
> that introduced explicit IP headers synchronization.

Applied.


Re: [iptables PATCH] extensions: libipt_realm: Document allowed realm values

2018-12-03 Thread Pablo Neira Ayuso
On Mon, Dec 03, 2018 at 02:52:28PM +0100, Phil Sutter wrote:
> Older versions of iptables allowed for negative realm values by accident
> (they would be cast to unsigned). While this was clearly a bug, document
> the fixed behaviour.

Applied, thanks Phil.


[iptables PATCH] extensions: libipt_realm: Document allowed realm values

2018-12-03 Thread Phil Sutter
Older versions of iptables allowed for negative realm values by accident
(they would be cast to unsigned). While this was clearly a bug, document
the fixed behaviour.

Signed-off-by: Phil Sutter 
---
 extensions/libipt_realm.man | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/extensions/libipt_realm.man b/extensions/libipt_realm.man
index a40b1adc72ba2..72dff9b2e4212 100644
--- a/extensions/libipt_realm.man
+++ b/extensions/libipt_realm.man
@@ -5,3 +5,5 @@ setups involving dynamic routing protocols like BGP.
 Matches a given realm number (and optionally mask). If not a number, value
 can be a named realm from /etc/iproute2/rt_realms (mask can not be used in
 that case).
+Both value and mask are four byte unsigned integers and may be specified in
+decimal, hex (by prefixing with "0x") or octal (if a leading zero is given).
-- 
2.19.0
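
The three notations named in the man page text (decimal, "0x" hex, leading-zero octal) are exactly what the C library accepts when strtoul() is called with base 0. The sketch below is an illustration under that assumption and is not taken from iptables; `parse_realm_u32()` is a hypothetical helper name.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not iptables code: parse a realm value or mask as
 * a four byte unsigned integer. Base 0 makes strtoul() accept decimal,
 * "0x"-prefixed hex and octal with a leading zero.
 * Returns 0 on success, -1 on malformed or out-of-range input.
 */
static int parse_realm_u32(const char *s, uint32_t *out)
{
	char *end;
	unsigned long v;

	/* strtoul() skips whitespace and silently negates "-1", so insist
	 * on a leading digit. This also rejects the empty string.
	 */
	if (!isdigit((unsigned char)s[0]))
		return -1;

	errno = 0;
	v = strtoul(s, &end, 0);
	if (errno != 0 || *end != '\0' || v > UINT32_MAX)
		return -1;

	*out = (uint32_t)v;
	return 0;
}
```

With this sketch, "10", "0x10" and "010" parse to 10, 16 and 8 respectively, while "-1" is rejected instead of wrapping around to a large unsigned value.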



[PATCH v3] netfilter: nf_conntrack_sip: add sip_external_media logic

2018-12-03 Thread Alin Nastac
From: Alin Nastac 

Allow media streams that are not passing through this router.

When enabled, the sip_external_media logic leaves the SDP payload
untouched when it detects that the interface towards the INVITEd
party is the same as the one towards the media endpoint.

Signed-off-by: Alin Nastac 
---
 net/netfilter/nf_conntrack_sip.c | 42 
 1 file changed, 42 insertions(+)

diff --git a/net/netfilter/nf_conntrack_sip.c b/net/netfilter/nf_conntrack_sip.c
index c8d2b66..f067c6b 100644
--- a/net/netfilter/nf_conntrack_sip.c
+++ b/net/netfilter/nf_conntrack_sip.c
@@ -21,6 +21,8 @@
 #include 
 #include 
 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -54,6 +56,11 @@ module_param(sip_direct_media, int, 0600);
 MODULE_PARM_DESC(sip_direct_media, "Expect Media streams between signalling "
   "endpoints only (default 1)");
 
+static int sip_external_media __read_mostly = 0;
+module_param(sip_external_media, int, 0600);
+MODULE_PARM_DESC(sip_external_media, "Expect Media streams between external "
+"endpoints (default 0)");
+
 const struct nf_nat_sip_hooks *nf_nat_sip_hooks;
 EXPORT_SYMBOL_GPL(nf_nat_sip_hooks);
 
@@ -861,6 +868,41 @@ static int set_expected_rtp_rtcp(struct sk_buff *skb, 
unsigned int protoff,
if (!nf_inet_addr_cmp(daddr, &ct->tuplehash[dir].tuple.src.u3))
return NF_ACCEPT;
saddr = &ct->tuplehash[!dir].tuple.src.u3;
+   } else if (sip_external_media) {
+   struct net_device *dev = skb_dst(skb)->dev;
+   struct net *net = dev_net(dev);
+   struct rtable *rt;
+   struct flowi4 fl4 = {};
+#if IS_ENABLED(CONFIG_IPV6)
+   struct flowi6 fl6 = {};
+#endif
+   struct dst_entry *dst = NULL;
+
+   switch (nf_ct_l3num(ct)) {
+   case NFPROTO_IPV4:
+   fl4.daddr = daddr->ip;
+   rt = ip_route_output_key(net, &fl4);
+   if (!IS_ERR(rt))
+   dst = &rt->dst;
+   break;
+
+#if IS_ENABLED(CONFIG_IPV6)
+   case NFPROTO_IPV6:
+   fl6.daddr = daddr->in6;
+   dst = ip6_route_output(net, NULL, &fl6);
+   if (dst->error) {
+   dst_release(dst);
+   dst = NULL;
+   }
+   break;
+#endif
+   }
+
+   /* Don't predict any conntracks when media endpoint is reachable
+* through the same interface as the signalling peer.
+*/
+   if (dst && dst->dev == dev)
+   return NF_ACCEPT;
}
 
/* We need to check whether the registration exists before attempting
-- 
2.7.4



[PATCH v3] netfilter/ipset: replace a strncpy() with strscpy()

2018-12-01 Thread Qian Cai
Replace strncpy() with strscpy() to make overflows as obvious as possible
and to prevent code from blithely proceeding with a truncated string. As a
side effect, this also fixes a compilation warning when using GCC 8.2.1.

net/netfilter/ipset/ip_set_core.c: In function 'ip_set_sockfn_get':
net/netfilter/ipset/ip_set_core.c:2027:3: warning: 'strncpy' writing 32
bytes into a region of size 2 overflows the destination
[-Wstringop-overflow=]

Signed-off-by: Qian Cai 
---

Changelog:
* v2:
- Released the lock for the error-path as well.
* v1:
- Checked the return value.

 net/netfilter/ipset/ip_set_core.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/ipset/ip_set_core.c 
b/net/netfilter/ipset/ip_set_core.c
index 1577f2f76060..33929fb645b6 100644
--- a/net/netfilter/ipset/ip_set_core.c
+++ b/net/netfilter/ipset/ip_set_core.c
@@ -2024,9 +2024,11 @@ ip_set_sockfn_get(struct sock *sk, int optval, void 
__user *user, int *len)
}
nfnl_lock(NFNL_SUBSYS_IPSET);
set = ip_set(inst, req_get->set.index);
-   strncpy(req_get->set.name, set ? set->name : "",
-   IPSET_MAXNAMELEN);
+   ret = strscpy(req_get->set.name, set ? set->name : "",
+ IPSET_MAXNAMELEN);
nfnl_unlock(NFNL_SUBSYS_IPSET);
+   if (ret == -E2BIG)
+   goto done;
goto copy;
}
default:
-- 
2.17.2 (Apple Git-113)



[PATCH nft] doc: nft: document ct count

2018-12-01 Thread Pablo Neira Ayuso
Signed-off-by: Pablo Neira Ayuso 
---
 doc/payload-expression.txt | 8 
 1 file changed, 8 insertions(+)

diff --git a/doc/payload-expression.txt b/doc/payload-expression.txt
index a2284ce8c3d9..eb98e5d7898c 100644
--- a/doc/payload-expression.txt
+++ b/doc/payload-expression.txt
@@ -619,5 +619,13 @@ integer (64 bit)
 |zone|
 conntrack zone |
 integer (16 bit)
+|count|
+count number of connections
+integer (32 bit)
 |==
 A description of conntrack-specific types listed above can be found 
sub-section CONNTRACK TYPES above.
+
+.restrict the number of parallel connections to a server
+
+filter input tcp dport 22 meter test { ip saddr ct count over 2 } reject
+
-- 
2.11.0




Re: [PATCH v2] netfilter: ipset: replace a strncpy() with strscpy()

2018-12-01 Thread Jozsef Kadlecsik
Hi,

On Mon, 26 Nov 2018, Qian Cai wrote:

> To make overflows as obvious as possible and to prevent code from blithely
> proceeding with a truncated string. This also has a side-effect to fix a
> compilation warning when using GCC 8.2.1.
> 
> net/netfilter/ipset/ip_set_core.c: In function 'ip_set_sockfn_get':
> net/netfilter/ipset/ip_set_core.c:2027:3: warning: 'strncpy' writing 32
> bytes into a region of size 2 overflows the destination
> [-Wstringop-overflow=]
> 
> Signed-off-by: Qian Cai 
> ---
> 
> Changes since v1:
> * Checked the return value.
> 
>  net/netfilter/ipset/ip_set_core.c | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/net/netfilter/ipset/ip_set_core.c 
> b/net/netfilter/ipset/ip_set_core.c
> index 1577f2f76060..c6f82556f7f2 100644
> --- a/net/netfilter/ipset/ip_set_core.c
> +++ b/net/netfilter/ipset/ip_set_core.c
> @@ -2024,8 +2024,11 @@ ip_set_sockfn_get(struct sock *sk, int optval, void 
> __user *user, int *len)
>   }
>   nfnl_lock(NFNL_SUBSYS_IPSET);
>   set = ip_set(inst, req_get->set.index);
> - strncpy(req_get->set.name, set ? set->name : "",
> - IPSET_MAXNAMELEN);
> + if (strscpy(req_get->set.name, set ? set->name : "",
> + IPSET_MAXNAMELEN) == -E2BIG) {
> + ret = -E2BIG;
> + goto done;
> + }
>   nfnl_unlock(NFNL_SUBSYS_IPSET);
>   goto copy;
>   }

This second version is not OK: the netlink lock is not released in
the error path. Please use an explicit ret = strscpy() assignment first,
then check the error condition, and in the error path call
nfnl_unlock(NFNL_SUBSYS_IPSET) before jumping to the final error handling.
Thanks!

Best regards,
Jozsef
-
E-mail  : kad...@blackhole.kfki.hu, kadlecsik.joz...@wigner.mta.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : Wigner Research Centre for Physics, Hungarian Academy of Sciences
  H-1525 Budapest 114, POB. 49, Hungary


Re: [PATCH nf-next] netfilter: nat: remove l4 protocol port rovers

2018-12-01 Thread Pablo Neira Ayuso
On Thu, Nov 15, 2018 at 10:22:59AM +0100, Florian Westphal wrote:
> This is a leftover from days where single-cpu systems were common:
> Store last port used to resolve a clash to use it as a starting point when
> the next conflict needs to be resolved.
> 
> When we have parallel attempts to connect to the same address:port pair,
> it's likely that both cores end up computing the same "available" port,
> as both use the same starting port, and newly used ports won't become
> visible to other cores until the conntrack gets confirmed later.
> 
> One of the cores then has to drop the packet at insertion time because
> the chosen new tuple turns out to be in use after all.
> 
> Let's simplify this: remove the port rover and use a pseudo-random starting
> point.
> 
> Note that this doesn't make netfilter default to 'fully random' mode;
> the 'rover' was only used if NAT could not reuse source port as-is.

Applied, thanks Florian.


[PATCH nft 2/2] src: introduce simple hints on incorrect identifier

2018-12-01 Thread Pablo Neira Ayuso
 # cat test.nft
 define test = "1.2.3.4"

 table ip x {
chain y {
ip saddr $text
}
 }
 # nft -f test.nft
 test.nft:5:13-16: Error: unknown identifier 'text'; did you mean identifier ‘test’?
 ip saddr $text
   

Signed-off-by: Pablo Neira Ayuso 
---
 include/rule.h |  2 ++
 src/parser_bison.y | 12 ++--
 src/rule.c | 18 ++
 3 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/include/rule.h b/include/rule.h
index 88fed62e7cba..dc5e5b87f933 100644
--- a/include/rule.h
+++ b/include/rule.h
@@ -112,6 +112,8 @@ extern void symbol_bind(struct scope *scope, const char 
*identifier,
 extern int symbol_unbind(const struct scope *scope, const char *identifier);
 extern struct symbol *symbol_lookup(const struct scope *scope,
const char *identifier);
+struct symbol *symbol_lookup_fuzzy(const struct scope *scope,
+  const char *identifier);
 struct symbol *symbol_get(const struct scope *scope, const char *identifier);
 
 enum table_flags {
diff --git a/src/parser_bison.y b/src/parser_bison.y
index dfe306837624..e73e1ecd0805 100644
--- a/src/parser_bison.y
+++ b/src/parser_bison.y
@@ -3078,8 +3078,16 @@ variable_expr:   '$' identifier
 
sym = symbol_get(scope, $2);
if (!sym) {
-   erec_queue(error(&@2, "unknown 
identifier '%s'", $2),
-  state->msgs);
+   sym = symbol_lookup_fuzzy(scope, $2);
+   if (sym) {
+   erec_queue(error(&@2, "unknown 
identifier '%s'; "
+ "did you 
mean identifier ‘%s’?",
+ $2, 
sym->identifier),
+  state->msgs);
+   } else {
+   erec_queue(error(&@2, "unknown 
identifier '%s'", $2),
+  state->msgs);
+   }
xfree($2);
YYERROR;
}
diff --git a/src/rule.c b/src/rule.c
index 0a3c1970c83a..ad3001294c65 100644
--- a/src/rule.c
+++ b/src/rule.c
@@ -692,6 +692,24 @@ struct symbol *symbol_lookup(const struct scope *scope, 
const char *identifier)
return NULL;
 }
 
+struct symbol *symbol_lookup_fuzzy(const struct scope *scope,
+  const char *identifier)
+{
+   struct string_misspell_state st;
+   struct symbol *sym;
+
+   string_misspell_init(&st);
+
+   while (scope != NULL) {
+   list_for_each_entry(sym, &scope->symbols, list)
+   string_misspell_update(sym->identifier, identifier,
+  sym, &st);
+
+   scope = scope->parent;
+   }
+   return st.obj;
+}
+
 static const char * const chain_type_str_array[] = {
"filter",
"nat",
-- 
2.11.0




[PATCH nft 2/3] src: allow for misspellings in object names

2018-11-30 Thread Pablo Neira Ayuso
Use this from the lookup path, to check for misspellings:

 # nft add table filter
 # nft add chain filtre test
 Error: No such file or directory; did you mean table ‘filter’ in family ip?
 add chain filtre test
   ^^

Signed-off-by: Pablo Neira Ayuso 
---
 include/misspell.h | 13 
 src/Makefile.am|  1 +
 src/misspell.c | 91 ++
 src/rule.c | 25 +--
 4 files changed, 127 insertions(+), 3 deletions(-)
 create mode 100644 include/misspell.h
 create mode 100644 src/misspell.c

diff --git a/include/misspell.h b/include/misspell.h
new file mode 100644
index ..ba01e7417220
--- /dev/null
+++ b/include/misspell.h
@@ -0,0 +1,13 @@
+#ifndef _MISSPELL_H_
+#define _MISSPELL_H_
+
+struct string_misspell_state {
+   unsigned intmin_distance;
+   void*obj;
+};
+
+void string_misspell_init(struct string_misspell_state *st);
+int string_misspell_update(const char *a, const char *b,
+  void *obj, struct string_misspell_state *st);
+
+#endif
diff --git a/src/Makefile.am b/src/Makefile.am
index 31d076cda82c..8e1a4d8795dc 100644
--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -47,6 +47,7 @@ libnftables_la_SOURCES =  \
netlink.c   \
netlink_linearize.c \
netlink_delinearize.c   \
+   misspell.c  \
monitor.c   \
segtree.c   \
rbtree.c\
diff --git a/src/misspell.c b/src/misspell.c
new file mode 100644
index ..922d305d5e01
--- /dev/null
+++ b/src/misspell.c
@@ -0,0 +1,91 @@
+#include 
+#include 
+#include 
+#include 
+#include 
+
+enum string_distance_function {
+   DELETION= 0,/* m1 */
+   INSERTION,  /* m2 */
+   TRANSFORMATION, /* m3 */
+};
+#define DISTANCE_MAX   (TRANSFORMATION + 1)
+
+static unsigned int min_distance(unsigned int *cost)
+{
+   unsigned int min = UINT_MAX;
+   int k;
+
+   for (k = 0; k < DISTANCE_MAX; k++) {
+   if (cost[k] < min)
+   min = cost[k];
+   }
+
+   return min;
+}
+
+/* A simple implementation of "The string-to-string correction problem (1974)"
+ * by Robert Wagner.
+ */
+static unsigned int string_distance(const char *a, const char *b)
+{
+   unsigned int len_a = strlen(a);
+   unsigned int len_b = strlen(b);
+   unsigned int *distance;
+   unsigned int i, j, ret;
+
+   distance = xzalloc((len_a + 1) * (len_b + 1) * sizeof(unsigned int));
+
+#define DISTANCE(__i, __j) distance[(__i) * len_b + (__j)]
+
+   for (i = 0; i <= len_a; i++)
+   DISTANCE(i, 0) = i;
+   for (j = 0; j <= len_b; j++)
+   DISTANCE(0, j) = j;
+
+   for (i = 1; i <= len_a; i++) {
+   for (j = 1; j <= len_b; j++) {
+   unsigned int subcost = (a[i] == b[j]) ? 0 : 1;
+   unsigned int cost[3];
+
+   cost[DELETION] = DISTANCE(i - 1, j) + 1;
+   cost[INSERTION] = DISTANCE(i, j - 1) + 1;
+   cost[TRANSFORMATION] = DISTANCE(i - 1, j - 1) + subcost;
+   DISTANCE(i, j) = min_distance(cost);
+
+   if (i > 1 && j > 1 &&
+   a[i] == b[j - 1] &&
+   a[i - 1] == b[j])
+   DISTANCE(i, j) =
+   min(DISTANCE(i, j),
+   DISTANCE(i - 2, j - 2) + subcost);
+   }
+   }
+
+   ret = DISTANCE(len_a, len_b);
+
+   xfree(distance);
+
+   return ret;
+}
+
+void string_misspell_init(struct string_misspell_state *st)
+{
+   st->obj = NULL;
+   st->min_distance = UINT_MAX;
+}
+
+int string_misspell_update(const char *a, const char *b,
+  void *obj, struct string_misspell_state *st)
+{
+   unsigned int distance;
+
+   distance = string_distance(a, b);
+
+   if (distance < st->min_distance) {
+   st->min_distance = distance;
+   st->obj = obj;
+   return 1;
+   }
+   return 0;
+}
diff --git a/src/rule.c b/src/rule.c
index 1fffa39ab243..c244d0ba6b02 100644
--- a/src/rule.c
+++ b/src/rule.c
@@ -22,6 +22,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -354,18 +355,24 @@ struct set *set_lookup_fuzzy(const char *set_name,
 const struct nft_cache *cache,
 const struct table **t)
 {
+   struct string_misspell_state st;
struct table *table;
struct set *set;
 
+   string_misspell_init(&st);
+
list_for_each_entry(table, &cache->list, list) {

[PATCH nft 1/3] utils: remove type checks in min() and max()

2018-11-30 Thread Pablo Neira Ayuso
So we can pass functions as parameters, needed by follow up patch.

Signed-off-by: Pablo Neira Ayuso 
---
 include/utils.h | 16 +---
 1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/include/utils.h b/include/utils.h
index 01560eae8d7f..e791523c0471 100644
--- a/include/utils.h
+++ b/include/utils.h
@@ -61,17 +61,11 @@
 #define div_round_up(n, d) (((n) + (d) - 1) / (d))
 #define round_up(n, b) (div_round_up(n, b) * b)
 
-#define min(x, y) ({   \
-   typeof(x) _min1 = (x);  \
-   typeof(y) _min2 = (y);  \
-   (void) (&_min1 == &_min2);  \
-   _min1 < _min2 ? _min1 : _min2; })
-
-#define max(x, y) ({   \
-   typeof(x) _max1 = (x);  \
-   typeof(y) _max2 = (y);  \
-   (void) (&_max1 == &_max2);  \
-   _max1 > _max2 ? _max1 : _max2; })
+#define min(_x, _y) ({ \
+   _x < _y ? _x : _y; })
+
+#define max(_x, _y) ({ \
+   _x > _y ? _x : _y; })
 
 #define SNPRINTF_BUFFER_SIZE(ret, size, len, offset)   \
if (ret < 0)\
-- 
2.11.0
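
One property of macros written without temporaries is that each argument is expanded textually, so an argument with side effects (such as a function call) may be evaluated more than once. The sketch below illustrates this with made-up names: `min_simple` mirrors the shape of the simplified macro (with parentheses added around the arguments, and without the GNU statement expression), and `next_value()` is a hypothetical function that counts its own calls.

```c
/* Same shape as the simplified macro from the patch, with parentheses
 * added around the arguments. The name min_simple is made up for this
 * sketch.
 */
#define min_simple(_x, _y) ((_x) < (_y) ? (_x) : (_y))

static unsigned int calls;

/* Hypothetical argument with a visible side effect. */
static unsigned int next_value(void)
{
	calls++;
	return 3;
}

/* Returns how many times next_value() ran for one min_simple() use. */
static unsigned int count_evaluations(void)
{
	unsigned int m;

	calls = 0;
	/* Expands to: next_value() < 10u ? next_value() : 10u */
	m = min_simple(next_value(), 10u);
	(void)m;

	return calls;
}
```

Here `count_evaluations()` reports two calls: one for the comparison and one for the selected result. This is harmless for pure expressions, but callers passing functions need to be aware of it.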




[PATCH nft 3/3] misspell: add distance threshold for suggestions

2018-11-30 Thread Pablo Neira Ayuso
Restrict suggestions to threshold, like gcc does.

Signed-off-by: Pablo Neira Ayuso 
---
 src/misspell.c | 21 ++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/src/misspell.c b/src/misspell.c
index 922d305d5e01..059d2e20de7a 100644
--- a/src/misspell.c
+++ b/src/misspell.c
@@ -78,11 +78,26 @@ void string_misspell_init(struct string_misspell_state *st)
 int string_misspell_update(const char *a, const char *b,
   void *obj, struct string_misspell_state *st)
 {
-   unsigned int distance;
+   unsigned int len_a, len_b, max_len, min_len, distance, threshold;
 
-   distance = string_distance(a, b);
+   len_a = strlen(a);
+   len_b = strlen(b);
+
+   max_len = max(len_a, len_b);
+   min_len = min(len_a, len_b);
+
+   if (max_len <= 1)
+   return 0;
 
-   if (distance < st->min_distance) {
+   if (max_len - min_len <= 1)
+   threshold = max(div_round_up(max_len, 3), 1);
+   else
+   threshold = div_round_up(max_len + 2, 3);
+
+   distance = string_distance(a, b);
+   if (distance > threshold)
+   return 0;
+   else if (distance < st->min_distance) {
st->min_distance = distance;
st->obj = obj;
return 1;
-- 
2.11.0
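
The threshold rule from the hunk above can be read in isolation. The
following is a sketch only: suggestion_threshold() and its self-check are
hypothetical names, and the function returns 0 for strings of at most one
character, where the patch instead returns early without making a suggestion.

```c
#include <assert.h>

#define div_round_up(n, d)	(((n) + (d) - 1) / (d))

/* Largest edit distance for which a suggestion would still be offered. */
static unsigned int suggestion_threshold(unsigned int len_a,
					 unsigned int len_b)
{
	unsigned int max_len = len_a > len_b ? len_a : len_b;
	unsigned int min_len = len_a < len_b ? len_a : len_b;
	unsigned int t;

	if (max_len <= 1)
		return 0;	/* the patch bails out here instead */

	if (max_len - min_len <= 1) {
		/* Similar lengths: a third of the longer one, at least 1. */
		t = div_round_up(max_len, 3);
		return t > 1 ? t : 1;
	}

	/* Lengths differ by more than one: slightly more tolerant. */
	return div_round_up(max_len + 2, 3);
}

/* A few worked values. */
static void suggestion_threshold_selfcheck(void)
{
	assert(suggestion_threshold(6, 6) == 2);	/* ceil(6 / 3) */
	assert(suggestion_threshold(2, 2) == 1);	/* ceil(2 / 3) */
	assert(suggestion_threshold(10, 4) == 4);	/* ceil(12 / 3) */
}
```

A candidate whose distance to the misspelled name exceeds this value is
skipped, so a six-character name only gets suggestions within two edits.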




[PATCH v2] netfilter: nf_conntrack_sip: add sip_external_media logic

2018-11-30 Thread Alin Nastac
From: Alin Nastac 

Allow media streams that are not passing through this router.

When enabled, the sip_external_media logic leaves the SDP
payload untouched when it detects that the interface towards the INVITEd
party is the same as the one towards the media endpoint.

Signed-off-by: Alin Nastac 
---
 net/netfilter/nf_conntrack_sip.c | 40 
 1 file changed, 40 insertions(+)

diff --git a/net/netfilter/nf_conntrack_sip.c b/net/netfilter/nf_conntrack_sip.c
index c8d2b66..f09a0e1 100644
--- a/net/netfilter/nf_conntrack_sip.c
+++ b/net/netfilter/nf_conntrack_sip.c
@@ -21,6 +21,8 @@
 #include 
 #include 
 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -54,6 +56,11 @@ module_param(sip_direct_media, int, 0600);
 MODULE_PARM_DESC(sip_direct_media, "Expect Media streams between signalling "
   "endpoints only (default 1)");
 
+static int sip_external_media __read_mostly = 0;
+module_param(sip_external_media, int, 0600);
+MODULE_PARM_DESC(sip_external_media, "Expect Media streams between external "
+"endpoints (default 0)");
+
 const struct nf_nat_sip_hooks *nf_nat_sip_hooks;
 EXPORT_SYMBOL_GPL(nf_nat_sip_hooks);
 
@@ -861,6 +868,39 @@ static int set_expected_rtp_rtcp(struct sk_buff *skb, unsigned int protoff,
if (!nf_inet_addr_cmp(daddr, &ct->tuplehash[dir].tuple.src.u3))
return NF_ACCEPT;
saddr = &ct->tuplehash[!dir].tuple.src.u3;
+   } else if (sip_external_media) {
+   struct net_device *dev = skb_dst(skb)->dev;
+   struct net *net = dev_net(dev);
+   struct rtable *rt;
+   struct flowi4 fl4 = {};
+   struct flowi6 fl6 = {};
+   struct dst_entry *dst = NULL;
+
+   switch (nf_ct_l3num(ct)) {
+   case NFPROTO_IPV4:
+   fl4.daddr = daddr->ip;
+   rt = ip_route_output_key(net, &fl4);
+   if (!IS_ERR(rt))
+   dst = &rt->dst;
+   break;
+
+#if IS_ENABLED(CONFIG_IPV6)
+   case NFPROTO_IPV6:
+   fl6.daddr = daddr->in6;
+   dst = ip6_route_output(net, NULL, &fl6);
+   if (dst->error) {
+   dst_release(dst);
+   dst = NULL;
+   }
+   break;
+#endif
+   }
+
+   /* Don't predict any conntracks when media endpoint is reachable
+* through the same interface as the signalling peer.
+*/
+   if (dst && dst->dev == dev)
+   return NF_ACCEPT;
}
 
/* We need to check whether the registration exists before attempting
-- 
2.7.4
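
The decision the new branch makes can be modelled in userspace. This is a
sketch under stated assumptions, not kernel code: interfaces are reduced to
integer indexes, a negative route_ifindex stands for a failed route lookup
(dst == NULL in the patch), the branch is assumed to be reached only when
sip_direct_media is off, and all names below are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

enum media_action {
	MEDIA_EXPECT,	/* go on and set up RTP/RTCP expectations */
	MEDIA_SKIP,	/* leave SDP untouched (NF_ACCEPT in the patch) */
};

/*
 * signalling_ifindex: interface the SIP packet leaves through.
 * route_ifindex:      interface the route to the media endpoint uses,
 *                     or a negative value if no route was found.
 */
static enum media_action external_media_action(bool external_media,
					       int signalling_ifindex,
					       int route_ifindex)
{
	if (!external_media)
		return MEDIA_EXPECT;

	/* Media endpoint sits behind the same interface as the peer. */
	if (route_ifindex >= 0 && route_ifindex == signalling_ifindex)
		return MEDIA_SKIP;

	return MEDIA_EXPECT;
}

static void external_media_selfcheck(void)
{
	assert(external_media_action(true, 2, 2) == MEDIA_SKIP);
	assert(external_media_action(true, 2, 3) == MEDIA_EXPECT);
	assert(external_media_action(true, 2, -1) == MEDIA_EXPECT);
	assert(external_media_action(false, 2, 2) == MEDIA_EXPECT);
}
```

Only the case where the lookup succeeds and both interfaces match skips the
expectation; a failed lookup falls through to the existing behaviour.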



Re: [PATCH] netfilter: nf_conntrack_sip: add sip_external_media logic

2018-11-29 Thread kbuild test robot
Hi Alin,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on nf/master]
[also build test ERROR on v4.20-rc4 next-20181129]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Alin-Nastac/netfilter-nf_conntrack_sip-add-sip_external_media-logic/20181130-032136
base:   https://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git master
config: x86_64-randconfig-a0-11300811 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

All errors (new ones prefixed by >>):

   net/netfilter/nf_conntrack_sip.o: In function `ip6_route_output':
>> include/net/ip6_route.h:88: undefined reference to `ip6_route_output_flags'

vim +88 include/net/ip6_route.h

33bd5ac54 David Ahern 2018-07-03  74  
5c3a0fd7d Joe Perches 2013-09-21  75  void ip6_route_input(struct sk_buff *skb);
d409b8476 Mahesh Bandewar 2016-09-16  76  struct dst_entry *ip6_route_input_lookup(struct net *net,
d409b8476 Mahesh Bandewar 2016-09-16  77                                           struct net_device *dev,
b75cc8f90 David Ahern 2018-03-02  78                                           struct flowi6 *fl6,
b75cc8f90 David Ahern 2018-03-02  79                                           const struct sk_buff *skb, int flags);
^1da177e4 Linus Torvalds  2005-04-16  80  
6f21c96a7 Paolo Abeni 2016-01-29  81  struct dst_entry *ip6_route_output_flags(struct net *net, const struct sock *sk,
6f21c96a7 Paolo Abeni 2016-01-29  82                                           struct flowi6 *fl6, int flags);
6f21c96a7 Paolo Abeni 2016-01-29  83  
6f21c96a7 Paolo Abeni 2016-01-29  84  static inline struct dst_entry *ip6_route_output(struct net *net,
6f21c96a7 Paolo Abeni 2016-01-29  85                                                   const struct sock *sk,
6f21c96a7 Paolo Abeni 2016-01-29  86                                                   struct flowi6 *fl6)
6f21c96a7 Paolo Abeni 2016-01-29  87  {
6f21c96a7 Paolo Abeni 2016-01-29 @88          return ip6_route_output_flags(net, sk, fl6, 0);
6f21c96a7 Paolo Abeni 2016-01-29  90  

:: The code at line 88 was first introduced by commit
:: 6f21c96a78b835259546d8f3fb4edff0f651d478 ipv6: enforce flowi6_oif usage in ip6_dst_lookup_tail()

:: TO: Paolo Abeni 
:: CC: David S. Miller 

---
0-DAY kernel test infrastructure            Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


Re: [PATCH] netfilter: nf_conntrack_sip: add sip_external_media logic

2018-11-29 Thread kbuild test robot
Hi Alin,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on nf/master]
[also build test ERROR on v4.20-rc4 next-20181129]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Alin-Nastac/netfilter-nf_conntrack_sip-add-sip_external_media-logic/20181130-032136
base:   https://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git master
config: powerpc-defconfig (attached as .config)
compiler: powerpc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
GCC_VERSION=7.2.0 make.cross ARCH=powerpc 

All errors (new ones prefixed by >>):

>> ERROR: ".ip6_route_output_flags" [net/netfilter/nf_conntrack_sip.ko] undefined!

---
0-DAY kernel test infrastructure            Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


[PATCH nftables] src: xt: fix build when libxtables is not installed

2018-11-29 Thread Florian Westphal
If libxtables is not installed at all, the build fails due to a missing
include file.

An ifdef HAVE_LIBXTABLES guard fixes the first error, but results in two
follow-up failures:
1. missing IFNAMSIZ definition
2. dereference of unknown struct.

Signed-off-by: Florian Westphal 
---
 src/xt.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/xt.c b/src/xt.c
index ab7f50173149..08560976aa0f 100644
--- a/src/xt.c
+++ b/src/xt.c
@@ -10,7 +10,10 @@
 #include 
 #include 
 #include 
+#include 
+#ifdef HAVE_LIBXTABLES
 #include 
+#endif
 #include 
 #include  /* for isspace */
 #include 
@@ -76,6 +79,7 @@ void xt_stmt_xlate(const struct stmt *stmt, struct output_ctx *octx)
 
 void xt_stmt_release(const struct stmt *stmt)
 {
+#ifdef HAVE_LIBXTABLES
switch (stmt->xt.type) {
case NFT_XT_MATCH:
if (!stmt->xt.match)
@@ -95,6 +99,7 @@ void xt_stmt_release(const struct stmt *stmt)
default:
break;
}
+#endif
xfree(stmt->xt.entry);
 }
 
-- 
2.19.2



[PATCH] netfilter: nf_conntrack_sip: add sip_external_media logic

2018-11-29 Thread Alin Nastac
Allow media streams that are not passing through this router.

When enabled, the sip_external_media logic leaves the SDP
payload untouched when it detects that the interface towards the INVITEd
party is the same as the one towards the media endpoint.

Signed-off-by: Alin Nastac 
---
 net/netfilter/nf_conntrack_sip.c | 38 ++
 1 file changed, 38 insertions(+)

diff --git a/net/netfilter/nf_conntrack_sip.c b/net/netfilter/nf_conntrack_sip.c
index c8d2b66..5416c08 100644
--- a/net/netfilter/nf_conntrack_sip.c
+++ b/net/netfilter/nf_conntrack_sip.c
@@ -21,6 +21,8 @@
 #include 
 #include 
 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -54,6 +56,11 @@ module_param(sip_direct_media, int, 0600);
 MODULE_PARM_DESC(sip_direct_media, "Expect Media streams between signalling "
   "endpoints only (default 1)");
 
+static int sip_external_media __read_mostly = 0;
+module_param(sip_external_media, int, 0600);
+MODULE_PARM_DESC(sip_external_media, "Expect Media streams between external "
+"endpoints (default 0)");
+
 const struct nf_nat_sip_hooks *nf_nat_sip_hooks;
 EXPORT_SYMBOL_GPL(nf_nat_sip_hooks);
 
@@ -861,6 +868,37 @@ static int set_expected_rtp_rtcp(struct sk_buff *skb, unsigned int protoff,
if (!nf_inet_addr_cmp(daddr, &ct->tuplehash[dir].tuple.src.u3))
return NF_ACCEPT;
saddr = &ct->tuplehash[!dir].tuple.src.u3;
+   } else if (sip_external_media) {
+   struct net_device *dev = skb_dst(skb)->dev;
+   struct net *net = dev_net(dev);
+   struct rtable *rt;
+   struct flowi4 fl4 = {};
+   struct flowi6 fl6 = {};
+   struct dst_entry *dst = NULL;
+
+   switch (nf_ct_l3num(ct)) {
+   case NFPROTO_IPV4:
+   fl4.daddr = daddr->ip;
+   rt = ip_route_output_key(net, &fl4);
+   if (!IS_ERR(rt))
+   dst = &rt->dst;
+   break;
+
+   case NFPROTO_IPV6:
+   fl6.daddr = daddr->in6;
+   dst = ip6_route_output(net, NULL, &fl6);
+   if (dst->error) {
+   dst_release(dst);
+   dst = NULL;
+   }
+   break;
+   }
+
+   /* Don't predict any conntracks when media endpoint is reachable
+* through the same interface as the signalling peer.
+*/
+   if (dst && dst->dev == dev)
+   return NF_ACCEPT;
}
 
/* We need to check whether the registration exists before attempting
-- 
2.7.4



Re: 4.19.x kernels oops in nf_conncount_destroy

2018-11-28 Thread Todd Eigenschink
This morning I found this thread, which I didn't see last night. I'm
not sure how I missed it, since I knew what I was searching for. It
includes a link to the same patches as I mentioned, but with a status
filter in the URL such that I can see the patches.

I applied the three patches and tested and it does NOT fix the problem
for me. It changes the behavior somewhat -- I saw several oopses (or
other noise) scroll past before it locked up. It also ended with
something like "eth0: pcnet32 transmit timed out", which I hadn't seen
before.

https://www.spinics.net/lists/netfilter-devel/msg57045.html

https://patchwork.ozlabs.org/project/netfilter-devel/list/?series=73972&state=*



Todd Eigenschink writes:
>EPILOGUE-AS-PREAMBLE:
>
>I had already typed most of this when I thought to search the
>netfilter-devel archive. I found this, which sounds an awful lot like
>my issue:
>
>https://www.spinics.net/lists/netfilter-devel/msg56882.html
>
>However, the patch link in the first followup seems empty, so I can't
>verify that it's the same thing or that the proposed fix works for me.
>
>
>--
>
>[1.] One line summary of the problem:
>
>4.19.x kernels oops in nf_conncount_destroy.
>
>
>[2.] Full description of the problem/report:
>
>We have been running 4.18.x kernels, up through 4.18.20, in production
>for a small web/email hosting operation with no issues. Everything
>relevant here is 32-bit Linux on VMware ESXi. Upon the release of
>4.18.20 and knowing that it was EOL, I stepped to then-current 4.19.4.
>
>One of our machines (a mail gateway) hung with an oops within a minute
>or two of boot. I rolled it back to deal with later.
>
>The next morning, another machine (coincidentally another mail
>gateway) crashed as well, and the tail end of the oops--that I could
>see on the 80x25 console--looked similar to what I remembered from the
>first. I rolled it back. If a third one happened, I was going to roll
>them all back. No other machines had issues.
>
>When 4.19.5 was released, I tried that, with the same effect, so I
>decided that since the fastest-crashing machine was, while production,
>not going to cause user-visible issues, I'd bisect to try to hunt down
>the cause. Every other machine, about 30 total, has been fine on
>4.19.4 / 4.19.5.
>
>Bisecting led me to this. 
>
>
>5c789e131cbb997a528451564ea4613e812fc718 is the first bad commit
>commit 5c789e131cbb997a528451564ea4613e812fc718
>Author: Yi-Hung Wei 
>Date:   Mon Jul 2 17:33:44 2018 -0700
>
>netfilter: nf_conncount: Add list lock and gc worker, and RCU for init 
> tree search
>
>This patch is originally from Florian Westphal.
>
>This patch does the following 3 main tasks.
>
>1) Add list lock to 'struct nf_conncount_list' so that we can
>alter the lists containing the individual connections without holding the
>main tree lock.  It would be useful when we only need to add/remove to/from
>a list without allocate/remove a node in the tree.  With this change, we
>update nft_connlimit accordingly since we longer need to maintain
>a list lock in nft_connlimit now.
>
>2) Use RCU for the initial tree search to improve tree look up performance.
>
>3) Add a garbage collection worker. This worker is schedule when there
>are excessive tree node that needed to be recycled.
>
>Moreover,the rbnode reclaim logic is moved from search tree to insert tree
>to avoid race condition.
>
>Signed-off-by: Yi-Hung Wei 
>Signed-off-by: Florian Westphal 
>Signed-off-by: Pablo Neira Ayuso 
>
>:04 04 3117a9e5f5d91c55bfcb495ed0cf20aac47beb4c 
>eb16c3c84edfa70268c651490dd5031a6474ca2d M include
>:04 04 f69622ea9603500bc837f6348bc7ffe6e4edefda 
>8983dc24192abb1ae1925f023a495c39d171021c M net
>
>
>And it makes perfect sense: Our only two machines that use
>nf_connlimit in their firewall configs are those two mail gateways. I
>imagine that the speed at which they oops has to do with their
>specific connlimit settings and how quickly they accumulate enough
>traffic to hit one of them.
>
>Oops details are below.
>
>
>[3.] Keywords (i.e., modules, networking, kernel):
>
>netfilter, nf_conncount, nf_connlimit
>
>
>[4.] Kernel information
>
>[4.1.] Kernel version (from /proc/version):
>
>[4.2.] Kernel .config file:
>
>grep = .config, net-related stuff only:
>
>
>CONFIG_NET=y
>CONFIG_NET_INGRESS=y
>CONFIG_PACKET=y
>CONFIG_UNIX=y
>CONFIG_XFRM=y
>CONFIG_XFRM_ALGO=y
>CONFIG_XFRM_USER=y
>CONFIG_XFRM_SUB_POLICY=y
>CONFIG_XFRM_IPCOMP=m
>CONFIG_NET_KEY=m
>CONFIG_INET=y
>CONFIG_IP_MULTICAST=y
>CONFIG_IP_ADVANCED_ROUTER=y
>CONFIG_IP_MULTIPLE_TABLES=y
>CONFIG_INET_AH=m
>CONFIG_INET_ESP=m
>CONFIG_INET_IPCOMP=m
>CONFIG_INET_XFRM_TUNNEL=m
>CONFIG_INET_TUNNEL=m
>CONFIG_INET_XFRM_MODE_TRANSPORT=m
>CONFIG_INET_XFRM_MODE_TUNNEL=m
>CONFIG_INET_XFRM_MODE_BEET=m
>CONFIG_TCP_CONG_CUBIC=y
>CONFIG_DEFAULT_TCP_CONG="cubic"

Re: RFC: Designing per chain rule cache support in libnftnl

2018-11-28 Thread Phil Sutter
Hi,

On Wed, Nov 28, 2018 at 02:51:54PM +0100, Pablo Neira Ayuso wrote:
> On Wed, Nov 28, 2018 at 02:21:01PM +0100, Phil Sutter wrote:
> > Hi Pablo,
> > 
> > On Fri, Nov 23, 2018 at 01:35:17PM +0100, Pablo Neira Ayuso wrote:
> > > On Fri, Nov 23, 2018 at 12:25:45PM +0100, Florian Westphal wrote:
> > > > Phil Sutter  wrote:
> > > > > > If user doesn't want it cleared at nftnl_chain_free() time they can
> > > > > > always allocate a new nftnl_rule_list and splice to that list.
> > > > > 
> > > > > Good point. What do you think about the simple approach of 
> > > > > introducing:
> > > > > 
> > > > > | struct nftnl_rule_list *nftnl_chain_get_rule_list(const struct 
> > > > > nftnl_chain *);
> > > > 
> > > > Looks fine to me.
> > > > 
> > > > > This would allow to reuse nftnl_rule_list routines from 
> > > > > libnftnl/rule.h.
> > > > > One potential problem I see is that users may try to call
> > > > > nftnl_rule_list_free(). Can we prevent that somehow?
> > > > 
> > > > Document that nftnl_rule_list_free() pairs with nftnl_rule_list_alloc() 
> > > > :-)
> > > > 
> > > > I don't think its an issue.
> > > > We could add a 'bool make_free_no_op' to nftnl_rule_list and set that to
> > > > true for nftnl_rule_list structures that are allocated indirectly on
> > > > behalf of nftnl_chain struct, but I think thats taking things too far.
> > > 
> > > Can we have an interface similar to nftnl_rule_add_expr() to add rules
> > > to chains?
> > > 
> > > So we add list field to nftnl_chain, and this new interface to
> > > add/delete rules.
> > 
> > I didn't look at struct nftnl_rule yet. OK, that seems rather different
> > from what I had in mind. So I guess your idea would be to add a field of
> > type struct list_head instead of struct nftnl_rule_list and implement
> > struct nftnl_rule_iter and relevant functions?
> 
> Yes. We would make explicit the relation between the objects, which
> makes sense to me. So far only nftnl_rule and nftnl_expr are basically
> "linked" in some way.
> 
> Would this approach work for you?

Yes, that's fine with me. My idea was to reuse the nftnl_rule_list API,
but creating chains' rule lists in a consistent manner with respect to
rules' expression lists is probably more important long-term.

Thanks, Phil
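
The layout agreed on in this thread, a list head embedded in the chain plus
an add routine and an iterator, can be sketched as follows. This is a toy
model under stated assumptions: every toy_* name is hypothetical, the list
helpers are a minimal stand-in written for this example, and nothing here is
the libnftnl API.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal intrusive list, standing in for a kernel-style list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Hypothetical stand-ins; these are not the libnftnl structures. */
struct toy_rule {
	struct list_head head;
	unsigned int handle;
};

struct toy_chain {
	struct list_head rule_list;	/* rules hang directly off the chain */
};

struct toy_rule_iter {
	const struct toy_chain *chain;
	struct list_head *cur;
};

static void toy_chain_init(struct toy_chain *c)
{
	list_init(&c->rule_list);
}

/* Append a rule to the chain's own list. */
static void toy_chain_add_rule(struct toy_chain *c, struct toy_rule *r)
{
	list_add_tail(&r->head, &c->rule_list);
}

static void toy_rule_iter_init(struct toy_rule_iter *it,
			       const struct toy_chain *c)
{
	it->chain = c;
	it->cur = c->rule_list.next;
}

/* Return the next rule, or NULL once the list head is reached again. */
static struct toy_rule *toy_rule_iter_next(struct toy_rule_iter *it)
{
	struct toy_rule *r;

	if (it->cur == &it->chain->rule_list)
		return NULL;

	r = (struct toy_rule *)((char *)it->cur -
				offsetof(struct toy_rule, head));
	it->cur = it->cur->next;
	return r;
}

static void toy_chain_selfcheck(void)
{
	struct toy_chain c;
	struct toy_rule a = { .handle = 1 }, b = { .handle = 2 };
	struct toy_rule_iter it;

	toy_chain_init(&c);
	toy_chain_add_rule(&c, &a);
	toy_chain_add_rule(&c, &b);

	toy_rule_iter_init(&it, &c);
	assert(toy_rule_iter_next(&it) == &a);
	assert(toy_rule_iter_next(&it) == &b);
	assert(toy_rule_iter_next(&it) == NULL);
}
```

Because the chain owns the list head, there is no separate list object that
a caller could free by mistake, which was the concern raised earlier in the
thread.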


Re: RFC: Designing per chain rule cache support in libnftnl

2018-11-28 Thread Pablo Neira Ayuso
On Wed, Nov 28, 2018 at 02:21:01PM +0100, Phil Sutter wrote:
> Hi Pablo,
> 
> On Fri, Nov 23, 2018 at 01:35:17PM +0100, Pablo Neira Ayuso wrote:
> > On Fri, Nov 23, 2018 at 12:25:45PM +0100, Florian Westphal wrote:
> > > Phil Sutter  wrote:
> > > > > If user doesn't want it cleared at nftnl_chain_free() time they can
> > > > > always allocate a new nftnl_rule_list and splice to that list.
> > > > 
> > > > Good point. What do you think about the simple approach of introducing:
> > > > 
> > > > | struct nftnl_rule_list *nftnl_chain_get_rule_list(const struct 
> > > > nftnl_chain *);
> > > 
> > > Looks fine to me.
> > > 
> > > > This would allow to reuse nftnl_rule_list routines from libnftnl/rule.h.
> > > > One potential problem I see is that users may try to call
> > > > nftnl_rule_list_free(). Can we prevent that somehow?
> > > 
> > > Document that nftnl_rule_list_free() pairs with nftnl_rule_list_alloc() 
> > > :-)
> > > 
> > > I don't think its an issue.
> > > We could add a 'bool make_free_no_op' to nftnl_rule_list and set that to
> > > true for nftnl_rule_list structures that are allocated indirectly on
> > > behalf of nftnl_chain struct, but I think thats taking things too far.
> > 
> > Can we have an interface similar to nftnl_rule_add_expr() to add rules
> > to chains?
> > 
> > So we add list field to nftnl_chain, and this new interface to
> > add/delete rules.
> 
> I didn't look at struct nftnl_rule yet. OK, that seems rather different
> from what I had in mind. So I guess your idea would be to add a field of
> type struct list_head instead of struct nftnl_rule_list and implement
> struct nftnl_rule_iter and relevant functions?

Yes. We would make explicit the relation between the objects, which
makes sense to me. So far only nftnl_rule and nftnl_expr are basically
"linked" in some way.

Would this approach work for you?

Thanks!


[PATCH nft] tests: fix return codes

2018-11-28 Thread Arturo Borrero Gonzalez
Please,

consider merging the attached patch.

thanks.
commit 3497067ca187047c61d89ccad6eab4ebf5df9219
Author: Arturo Borrero Gonzalez 
Date:   Wed Nov 28 14:31:57 2018 +0100

tests: fix return codes

Try to return != 0 if a testsuite fails.

Signed-off-by: Arturo Borrero Gonzalez 

diff --git a/tests/build/run-tests.sh b/tests/build/run-tests.sh
index 626f6fd..b0560da 100755
--- a/tests/build/run-tests.sh
+++ b/tests/build/run-tests.sh
@@ -52,4 +52,4 @@ done
 rm -rf $tmpdir
 
 echo "results: [OK] $ok [FAILED] $failed [TOTAL] $((ok+failed))"
-exit 0
+exit $failed
diff --git a/tests/monitor/run-tests.sh b/tests/monitor/run-tests.sh
index f408988..0478cf6 100755
--- a/tests/monitor/run-tests.sh
+++ b/tests/monitor/run-tests.sh
@@ -17,7 +17,7 @@ fi
 testdir=$(mktemp -d)
 if [ ! -d $testdir ]; then
 	echo "Failed to create test directory" >&2
-	exit 0
+	exit 1
 fi
 trap "rm -rf $testdir; $nft flush ruleset" EXIT
 
diff --git a/tests/shell/run-tests.sh b/tests/shell/run-tests.sh
index 5b0ec41..fdca5fb 100755
--- a/tests/shell/run-tests.sh
+++ b/tests/shell/run-tests.sh
@@ -152,4 +152,4 @@ echo ""
 msg_info "results: [OK] $ok [FAILED] $failed [TOTAL] $((ok+failed))"
 
 kernel_cleanup
-exit 0
+exit $failed
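
One caveat with "exit $failed": a waiting parent only sees the low 8 bits of
an exit status, so in common shells such as bash a failure count that is a
multiple of 256 would again be reported as 0. The following C sketch shows
the arithmetic and one possible clamp. Both function names are hypothetical,
and nothing here is part of the nftables test scripts.

```c
#include <assert.h>

/* Status the parent observes: only the low 8 bits survive. */
static int observed_status(unsigned int failed)
{
	return (int)(failed & 0xff);
}

/* One possible mapping that keeps every non-zero failure count non-zero. */
static int clamped_status(unsigned int failed)
{
	return failed > 255 ? 255 : (int)failed;
}

static void exit_status_selfcheck(void)
{
	assert(observed_status(3) == 3);
	assert(observed_status(256) == 0);	/* failures seen as success */
	assert(clamped_status(256) == 255);
	assert(clamped_status(0) == 0);
}
```

For the test counts these scripts produce today this is unlikely to matter,
but it explains why the commit message only promises to "try" to return a
non-zero status.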


Re: RFC: Designing per chain rule cache support in libnftnl

2018-11-28 Thread Phil Sutter
Hi Pablo,

On Fri, Nov 23, 2018 at 01:35:17PM +0100, Pablo Neira Ayuso wrote:
> On Fri, Nov 23, 2018 at 12:25:45PM +0100, Florian Westphal wrote:
> > Phil Sutter  wrote:
> > > > If user doesn't want it cleared at nftnl_chain_free() time they can
> > > > always allocate a new nftnl_rule_list and splice to that list.
> > > 
> > > Good point. What do you think about the simple approach of introducing:
> > > 
> > > | struct nftnl_rule_list *nftnl_chain_get_rule_list(const struct 
> > > nftnl_chain *);
> > 
> > Looks fine to me.
> > 
> > > This would allow to reuse nftnl_rule_list routines from libnftnl/rule.h.
> > > One potential problem I see is that users may try to call
> > > nftnl_rule_list_free(). Can we prevent that somehow?
> > 
> > Document that nftnl_rule_list_free() pairs with nftnl_rule_list_alloc() :-)
> > 
> > I don't think its an issue.
> > We could add a 'bool make_free_no_op' to nftnl_rule_list and set that to
> > true for nftnl_rule_list structures that are allocated indirectly on
> > behalf of nftnl_chain struct, but I think thats taking things too far.
> 
> Can we have an interface similar to nftnl_rule_add_expr() to add rules
> to chains?
> 
> So we add list field to nftnl_chain, and this new interface to
> add/delete rules.

I didn't look at struct nftnl_rule yet. OK, that seems rather different
from what I had in mind. So I guess your idea would be to add a field of
type struct list_head instead of struct nftnl_rule_list and implement
struct nftnl_rule_iter and relevant functions?

> We can probably deprecate the existing list interface if we follow
> that procedure after a bit of time in favour of this one.

OK, cool.

Thanks, Phil


Re: Proposal: rename of arptables.git and ebtables.git

2018-11-28 Thread Arturo Borrero Gonzalez
On 11/28/18 1:44 PM, Arturo Borrero Gonzalez wrote:
> Hi,
> 
> Now that the iptables.git repo offers arptables-nft and ebtables-nft,
> arptables.git holds arptables-legacy, etc, why don't we just rename the
> repos?
> 
> * from arptables.git to arptables-legacy.git
> * from ebtables.git to ebtables-legacy.git
> 
> This rename should help distros understand the differences between them
> and better accommodate the packaging of all the related tooling.
> 
> Mind that the rename may have side effects in tarball
> generation/publishing etc. I would expect the new arptables tarball to
> include the '-legacy' keyword, and same for ebtables.
> 
> If we go ahead with the rename, a new release is worth having,
> announcing these changes as well.
> 

Also,

please consider applying the attached patch.

thanks.
commit ee8a588338e7c75e90fcc49a69e3d3b018063828
Author: Arturo Borrero Gonzalez 
Date:   Wed Nov 28 13:47:28 2018 +0100

ebtables: legacy renaming

The original ebtables tool is now the legacy version, let's rename it.

A more up-to-date client of the ebtables tool is provided in the iptables
tarball (ebtables-nft). The new tool was formerly known as ebtables-compat.

The new -legacy binary has no problem if called via a symlink with the
'ebtables' name, so users can still name this binary with whatever name.

Signed-off-by: Arturo Borrero Gonzalez 

diff --git a/Makefile.am b/Makefile.am
index 14938fe..b16a4d6 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -26,11 +26,11 @@ AM_CPPFLAGS = ${regular_CPPFLAGS} -I${top_srcdir}/include \
 	-DEBTD_PIPE=\"${PIPE}\" -DEBTD_PIPE_DIR=\"${PIPE_DIR}\"
 AM_CFLAGS = ${regular_CFLAGS}
 
-sbin_PROGRAMS = ebtables ebtablesd ebtablesu ebtables-restore
+sbin_PROGRAMS = ebtables-legacy ebtablesd ebtablesu ebtables-legacy-restore
 EXTRA_PROGRAMS = static examples/ulog/test_ulog
 sysconf_DATA = ethertypes
-sbin_SCRIPTS = ebtables-save
-man8_MANS = ebtables.8
+sbin_SCRIPTS = ebtables-legacy-save
+man8_MANS = ebtables-legacy.8
 lib_LTLIBRARIES = libebtc.la
 
 libebtc_la_SOURCES = \
@@ -47,21 +47,22 @@ libebtc_la_SOURCES = \
 	extensions/ebtable_nat.c
 # Make sure ebtables.c can be built twice
 libebtc_la_CPPFLAGS = ${AM_CPPFLAGS}
-ebtables_SOURCES = ebtables-standalone.c
-ebtables_LDADD = libebtc.la
+ebtables_legacy_SOURCES = ebtables-standalone.c
+ebtables_legacy_LDADD = libebtc.la
 ebtablesd_LDADD = libebtc.la
-ebtables_restore_LDADD = libebtc.la
+ebtables_legacy_restore_SOURCES = ebtables-restore.c
+ebtables_legacy_restore_LDADD = libebtc.la
 static_SOURCES = ebtables.c
 static_LDFLAGS = -static
 static_LDADD = libebtc.la
 examples_ulog_test_ulog_SOURCES = examples/ulog/test_ulog.c getethertype.c
 
 daemon: ebtablesd ebtablesu
-exec: ebtables ebtables-restore
+exec: ebtables-legacy ebtables-legacy-restore
 
-CLEANFILES = ebtables-save ebtables.sysv ebtables-config ebtables.8
+CLEANFILES = ebtables-legacy-save ebtables.sysv ebtables-config ebtables-legacy.8
 
-ebtables-save: ebtables-save.in ${top_builddir}/config.status
+ebtables-legacy-save: ebtables-save.in ${top_builddir}/config.status
 	${AM_V_GEN}sed -e 's![@]sbindir@!${sbindir}!g' <$< >$@
 
 ebtables.sysv: ebtables.sysv.in ${top_builddir}/config.status
@@ -70,7 +71,7 @@ ebtables.sysv: ebtables.sysv.in ${top_builddir}/config.status
 ebtables-config: ebtables-config.in ${top_builddir}/config.status
 	${AM_V_GEN}sed -e 's![@]sysconfigdir@!${sysconfigdir}!g' <$< >$@
 
-ebtables.8: ebtables.8.in ${top_builddir}/config.status
+ebtables-legacy.8: ebtables-legacy.8.in ${top_builddir}/config.status
 	${AM_V_GEN}sed -e 's![@]PACKAGE_VERSION!${PACKAGE_VERSION}!g' \
 		-e 's![@]PACKAGE_DATE@!${PROGDATE}!g' \
 		-e 's![@]LOCKFILE@!${LOCKFILE}!g' <$< >$@
diff --git a/ebtables.8.in b/ebtables-legacy.8.in
similarity index 98%
rename from ebtables.8.in
rename to ebtables-legacy.8.in
index 3e97c84..3417045 100644
--- a/ebtables.8.in
+++ b/ebtables-legacy.8.in
@@ -24,7 +24,7 @@
 .\" 
 .\"
 .SH NAME
-ebtables (@PACKAGE_VERSION@) \- Ethernet bridge frame table administration
+ebtables-legacy (@PACKAGE_VERSION@) \- Ethernet bridge frame table administration (legacy)
 .SH SYNOPSIS
 .BR "ebtables " [ -t " table ] " - [ ACDI "] chain rule specification [match extensions] [watcher extensions] target"
 .br
@@ -50,6 +50,18 @@ ebtables (@PACKAGE_VERSION@) \- Ethernet bridge frame table administration
 .br
 .BR "ebtables " [ -t " table ] [" --atomic-file " file] " --atomic-save
 .br
+
+.SH LEGACY
+This tool uses the old xtables/setsockopt framework, and is a legacy version
+of ebtables. That means that a new, more modern tool exists with the same
+functionality using the nf_tables framework and you are encouraged to migrate now.
+The new binaries (known as ebtables-nft and formerly known as ebtables-compat)
+use the same syntax and semantics as this legacy one.
+
+You can still use this legacy tool. You should probably get some specific
+information from your Linux distribution or vendor.
+More docs are 

Proposal: rename of arptables.git and ebtables.git

2018-11-28 Thread Arturo Borrero Gonzalez
Hi,

Now that the iptables.git repo offers arptables-nft and ebtables-nft,
arptables.git holds arptables-legacy, etc, why don't we just rename the
repos?

* from arptables.git to arptables-legacy.git
* from ebtables.git to ebtables-legacy.git

This rename should help distros understand the differences between them
and better accommodate the packaging of all the related tooling.

Mind that the rename may have side effects in tarball
generation/publishing etc. I would expect the new arptables tarball to
include the '-legacy' keyword, and same for ebtables.

If we go ahead with the rename, a new release is worth having,
announcing these changes as well.


Re: [PATCH nf] netfilter: nf_tables: deactivate expressions in rule replecement routine

2018-11-28 Thread Pablo Neira Ayuso
Applied, thanks.


4.19.x kernels oops in nf_conncount_destroy

2018-11-27 Thread Todd Eigenschink
EPILOGUE-AS-PREAMBLE:

I had already typed most of this when I thought to search the
netfilter-devel archive. I found this, which sounds an awful lot like
my issue:

https://www.spinics.net/lists/netfilter-devel/msg56882.html

However, the patch link in the first followup seems empty, so I can't
verify that it's the same thing or that the proposed fix works for me.


--

[1.] One line summary of the problem:

4.19.x kernels oops in nf_conncount_destroy.


[2.] Full description of the problem/report:

We have been running 4.18.x kernels, up through 4.18.20, in production
for a small web/email hosting operation with no issues. Everything
relevant here is 32-bit Linux on VMware ESXi. Upon the release of
4.18.20 and knowing that it was EOL, I stepped to then-current 4.19.4.

One of our machines (a mail gateway) hung with an oops within a minute
or two of boot. I rolled it back to deal with later.

The next morning, another machine (coincidentally another mail
gateway) crashed as well, and the tail end of the oops--that I could
see on the 80x25 console--looked similar to what I remembered from the
first. I rolled it back. If a third one happened, I was going to roll
them all back. No other machines had issues.

When 4.19.5 was released, I tried that, with the same effect, so I
decided that since the fastest-crashing machine was, while production,
not going to cause user-visible issues, I'd bisect to try to hunt down
the cause. Every other machine, about 30 total, has been fine on
4.19.4 / 4.19.5.

Bisecting led me to this. 


5c789e131cbb997a528451564ea4613e812fc718 is the first bad commit
commit 5c789e131cbb997a528451564ea4613e812fc718
Author: Yi-Hung Wei 
Date:   Mon Jul 2 17:33:44 2018 -0700

netfilter: nf_conncount: Add list lock and gc worker, and RCU for init tree 
search

This patch is originally from Florian Westphal.

This patch does the following 3 main tasks.

1) Add list lock to 'struct nf_conncount_list' so that we can
alter the lists containing the individual connections without holding the
main tree lock.  It would be useful when we only need to add/remove to/from
a list without allocate/remove a node in the tree.  With this change, we
update nft_connlimit accordingly since we longer need to maintain
a list lock in nft_connlimit now.

2) Use RCU for the initial tree search to improve tree look up performance.

3) Add a garbage collection worker. This worker is schedule when there
are excessive tree node that needed to be recycled.

Moreover,the rbnode reclaim logic is moved from search tree to insert tree
to avoid race condition.

Signed-off-by: Yi-Hung Wei 
Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 

:04 04 3117a9e5f5d91c55bfcb495ed0cf20aac47beb4c 
eb16c3c84edfa70268c651490dd5031a6474ca2d M  include
:04 04 f69622ea9603500bc837f6348bc7ffe6e4edefda 
8983dc24192abb1ae1925f023a495c39d171021c M  net


And it makes perfect sense: Our only two machines that use
nf_connlimit in their firewall configs are those two mail gateways. I
imagine that the speed at which they oops has to do with their
specific connlimit settings and how quickly they accumulate enough
traffic to hit one of them.

Oops details are below.


[3.] Keywords (i.e., modules, networking, kernel):

netfilter, nf_conncount, nf_connlimit


[4.] Kernel information

[4.1.] Kernel version (from /proc/version):

[4.2.] Kernel .config file:

grep = .config, net-related stuff only:


CONFIG_NET=y
CONFIG_NET_INGRESS=y
CONFIG_PACKET=y
CONFIG_UNIX=y
CONFIG_XFRM=y
CONFIG_XFRM_ALGO=y
CONFIG_XFRM_USER=y
CONFIG_XFRM_SUB_POLICY=y
CONFIG_XFRM_IPCOMP=m
CONFIG_NET_KEY=m
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_IP_ADVANCED_ROUTER=y
CONFIG_IP_MULTIPLE_TABLES=y
CONFIG_INET_AH=m
CONFIG_INET_ESP=m
CONFIG_INET_IPCOMP=m
CONFIG_INET_XFRM_TUNNEL=m
CONFIG_INET_TUNNEL=m
CONFIG_INET_XFRM_MODE_TRANSPORT=m
CONFIG_INET_XFRM_MODE_TUNNEL=m
CONFIG_INET_XFRM_MODE_BEET=m
CONFIG_TCP_CONG_CUBIC=y
CONFIG_DEFAULT_TCP_CONG="cubic"
CONFIG_NET_PTP_CLASSIFY=y
CONFIG_NETFILTER=y
CONFIG_NETFILTER_ADVANCED=y
CONFIG_NETFILTER_INGRESS=y
CONFIG_NETFILTER_NETLINK=y
CONFIG_NETFILTER_FAMILY_ARP=y
CONFIG_NF_CONNTRACK=y
CONFIG_NF_LOG_COMMON=y
CONFIG_NETFILTER_CONNCOUNT=y
CONFIG_NF_CONNTRACK_MARK=y
CONFIG_NF_CONNTRACK_PROCFS=y
CONFIG_NF_CONNTRACK_TIMEOUT=y
CONFIG_NF_CONNTRACK_FTP=y
CONFIG_NF_CT_NETLINK=y
CONFIG_NF_CT_NETLINK_TIMEOUT=y
CONFIG_NF_NAT=y
CONFIG_NF_NAT_NEEDED=y
CONFIG_NF_NAT_FTP=y
CONFIG_NF_NAT_REDIRECT=y
CONFIG_NF_TABLES=y
CONFIG_NFT_CT=y
CONFIG_NFT_CONNLIMIT=y
CONFIG_NFT_LOG=y
CONFIG_NFT_LIMIT=y
CONFIG_NFT_MASQ=y
CONFIG_NFT_NAT=y
CONFIG_NFT_REJECT=y
CONFIG_NF_FLOW_TABLE=m
CONFIG_NETFILTER_XTABLES=y
CONFIG_NETFILTER_XT_MARK=y
CONFIG_NETFILTER_XT_CONNMARK=y
CONFIG_NETFILTER_XT_TARGET_CONNMARK=y
CONFIG_NETFILTER_XT_TARGET_LOG=y
CONFIG_NETFILTER_XT_TARGET_MARK=y

[PATCH nf] netfilter: nf_tables: deactivate expressions in rule replacement routine

2018-11-27 Thread Taehee Yoo
Rule replacement routine removes an old rule then adds a new rule.
Removing the old rule requires the following steps:
allocate a trans, deactivate the rule, and deactivate the rule's expressions.
But the rule replacement routine has no expression deactivation step.

test commands:
   %nft add table ip filter
   %nft add chain ip filter c1
   %nft add chain ip filter c2
   %nft add rule ip filter c1 jump c2
   %nft replace rule ip filter c1 handle 3 accept
   %nft flush ruleset

The "jump c2" expression is an immediate NFT_JUMP to chain c2.
The reference count of chain c2 is increased when the rule is added.

When the rule is deleted or replaced, the reference count of c2 should be
decreased. The reference count decrement is done in
nft_immediate_deactivate(), which is called by nft_rule_expr_deactivate().
But the rule replacement routine does not call nft_rule_expr_deactivate(),
so the reference count is not decreased.
That eventually triggers the warning below.

splat looks like:
[  214.396453] WARNING: CPU: 1 PID: 21 at net/netfilter/nf_tables_api.c:1432 nf_tables_chain_destroy.isra.38+0x2f9/0x3a0 [nf_tables]
[  214.398983] Modules linked in: nf_tables nfnetlink
[  214.398983] CPU: 1 PID: 21 Comm: kworker/1:1 Not tainted 4.20.0-rc2+ #44
[  214.398983] Workqueue: events nf_tables_trans_destroy_work [nf_tables]
[  214.398983] RIP: 0010:nf_tables_chain_destroy.isra.38+0x2f9/0x3a0 [nf_tables]
[  214.398983] Code: 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 
8e 00 00 00 48 8b 7b 58 e8 e1 2c 4e c6 48 89 df e8 d9 2c 4e c6 eb 9a <0f> 0b eb 
96 0f 0b e9 7e fe ff ff e8 a7 7e 4e c6 e9 a4 fe ff ff e8
[  214.398983] RSP: 0018:8881152874e8 EFLAGS: 00010202
[  214.398983] RAX: 0001 RBX: 88810ef9fc28 RCX: 8881152876f0
[  214.398983] RDX: dc00 RSI: 111022a50ede RDI: 88810ef9fc78
[  214.398983] RBP: 111022a50e9d R08: 8000 R09: 
[  214.398983] R10:  R11:  R12: 111022a50eba
[  214.398983] R13: 888114446e08 R14: 8881152876f0 R15: ed1022a50ed6
[  214.398983] FS:  () GS:88811640() 
knlGS:
[  214.398983] CS:  0010 DS:  ES:  CR0: 80050033
[  214.398983] CR2: 7fab9bb5f868 CR3: 00012aa16000 CR4: 001006e0
[  214.398983] Call Trace:
[  214.398983]  ? nf_tables_table_destroy.isra.37+0x100/0x100 [nf_tables]
[  214.398983]  ? __kasan_slab_free+0x145/0x180
[  214.398983]  ? nf_tables_trans_destroy_work+0x439/0x830 [nf_tables]
[  214.398983]  ? kfree+0xdb/0x280
[  214.398983]  nf_tables_trans_destroy_work+0x5f5/0x830 [nf_tables]
[ ... ]

Fixes: bb7b40aecbf7 ("netfilter: nf_tables: bogus EBUSY in chain deletions")
Reported-by: Christoph Anton Mitterer 
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=914505
Link: https://bugzilla.kernel.org/show_bug.cgi?id=201791
Signed-off-by: Taehee Yoo 
---
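The reference-count imbalance described in the commit message can be modelled in userspace. This is only a sketch: `struct chain`, `struct rule`, and every function below are hypothetical stand-ins, not nf_tables code. The "buggy" path drops the old rule without releasing the reference its jump holds; the "fixed" path goes through a deactivation step first, which is what routing the replace through the regular delete path is meant to achieve.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for nf_tables objects. */
struct chain {
    const char *name;
    int use;                    /* number of rules that jump to this chain */
};

struct rule {
    struct chain *jump_target;  /* NULL if the rule has no jump */
    bool active;
};

/* Adding a rule with a jump takes a reference on the target chain. */
static void rule_activate(struct rule *r, struct chain *target)
{
    r->jump_target = target;
    r->active = true;
    if (target)
        target->use++;
}

/* Models expression deactivation: release the reference on the target. */
static void rule_deactivate(struct rule *r)
{
    if (r->active && r->jump_target)
        r->jump_target->use--;
    r->active = false;
}

/* Buggy replace: the old rule goes away, its jump reference does not. */
static void replace_buggy(struct rule *old_rule, struct rule *new_rule,
                          struct chain *new_target)
{
    old_rule->active = false;
    rule_activate(new_rule, new_target);
}

/* Fixed replace: deactivate the old rule's expressions first. */
static void replace_fixed(struct rule *old_rule, struct rule *new_rule,
                          struct chain *new_target)
{
    rule_deactivate(old_rule);
    rule_activate(new_rule, new_target);
}

/* Replays "add rule c1 jump c2; replace rule ... accept" and returns the
 * use count left on c2, which must be 0 before c2 can be destroyed. */
static int run_scenario(bool fixed)
{
    struct chain c2 = { "c2", 0 };
    struct rule jump = { NULL, false };
    struct rule accept = { NULL, false };

    rule_activate(&jump, &c2);
    if (fixed)
        replace_fixed(&jump, &accept, NULL);
    else
        replace_buggy(&jump, &accept, NULL);
    return c2.use;
}
```

Calling `run_scenario(false)` leaves a stale reference on c2, which in this model corresponds to the condition the WARNING in nf_tables_chain_destroy() reports; `run_scenario(true)` leaves none.
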
 net/netfilter/nf_tables_api.c | 15 ---
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index ddeaa1990e1e..2e61aab6ed73 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -2667,21 +2667,14 @@ static int nf_tables_newrule(struct net *net, struct 
sock *nlsk,
}
 
if (nlh->nlmsg_flags & NLM_F_REPLACE) {
-   if (!nft_is_active_next(net, old_rule)) {
-   err = -ENOENT;
-   goto err2;
-   }
-   trans = nft_trans_rule_add(&ctx, NFT_MSG_DELRULE,
-  old_rule);
+   trans = nft_trans_rule_add(&ctx, NFT_MSG_NEWRULE, rule);
if (trans == NULL) {
err = -ENOMEM;
goto err2;
}
-   nft_deactivate_next(net, old_rule);
-   chain->use--;
-
-   if (nft_trans_rule_add(&ctx, NFT_MSG_NEWRULE, rule) == NULL) {
-   err = -ENOMEM;
+   err = nft_delrule(&ctx, old_rule);
+   if (err < 0) {
+   nft_trans_destroy(trans);
goto err2;
}
 
-- 
2.17.1



Re: [iptables PATCH] xtables: Don't use native nftables comments

2018-11-27 Thread Pablo Neira Ayuso
On Tue, Nov 27, 2018 at 08:07:11PM +0100, Phil Sutter wrote:
> The problem with converting libxt_comment into nftables comment is that
> rules change when parsing from kernel due to comment match being moved
> to the end of the match list. And since match ordering matters, the rule
> may not be found anymore when checking or deleting. Apart from that,
> iptables-nft didn't support multiple comments per rule anymore. This is
> a compatibility issue without technical reason.
> 
> Leave conversion from nftables comment to libxt_comment in place so we
> don't break running systems during an update.

Applied, thanks Phil.


[iptables PATCH] xtables: Don't use native nftables comments

2018-11-27 Thread Phil Sutter
The problem with converting libxt_comment into nftables comment is that
rules change when parsing from kernel due to comment match being moved
to the end of the match list. And since match ordering matters, the rule
may not be found anymore when checking or deleting. Apart from that,
iptables-nft didn't support multiple comments per rule anymore. This is
a compatibility issue without technical reason.

Leave conversion from nftables comment to libxt_comment in place so we
don't break running systems during an update.

Signed-off-by: Phil Sutter 
---
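The ordering problem described above can be illustrated with a toy model. Everything here is an assumption made for illustration: a rule is reduced to an ordered list of match names, and `roundtrip_native_comment()` mimics the old behaviour in which the first comment match was stored out of band and reappeared as the last match when the rule was parsed back from the kernel.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_MATCHES 8

/* Hypothetical, simplified rule: just an ordered list of match names. */
struct rule {
    const char *matches[MAX_MATCHES];
    size_t n;
};

/* Order-sensitive comparison, as needed for check and delete lookups. */
static bool rule_equal(const struct rule *a, const struct rule *b)
{
    if (a->n != b->n)
        return false;
    for (size_t i = 0; i < a->n; i++)
        if (strcmp(a->matches[i], b->matches[i]) != 0)
            return false;
    return true;
}

/* Models the old round trip: the first "comment" match is pulled out of
 * the match list and re-appended at the end; later comments stay put. */
static struct rule roundtrip_native_comment(const struct rule *in)
{
    struct rule out = { .n = 0 };
    const char *comment = NULL;

    for (size_t i = 0; i < in->n; i++) {
        if (!comment && strcmp(in->matches[i], "comment") == 0)
            comment = in->matches[i];
        else
            out.matches[out.n++] = in->matches[i];
    }
    if (comment)
        out.matches[out.n++] = comment;
    return out;
}
```

In this model a rule written as "tcp, comment" survives the round trip unchanged, while "comment, tcp" comes back as "tcp, comment" and no longer compares equal to what the user typed. These are the two orderings the new lines in libxt_comment.t exercise.
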
 extensions/libxt_comment.t |  2 ++
 iptables/nft-ipv4.c| 14 +++---
 iptables/nft-ipv6.c| 14 +++---
 iptables/nft.c | 27 ---
 iptables/nft.h |  1 -
 5 files changed, 8 insertions(+), 50 deletions(-)

diff --git a/extensions/libxt_comment.t b/extensions/libxt_comment.t
index f12cd66841e7f..f0c8fb999401b 100644
--- a/extensions/libxt_comment.t
+++ b/extensions/libxt_comment.t
@@ -1,6 +1,8 @@
 :INPUT,FORWARD,OUTPUT
 -m comment;;FAIL
 -m comment --comment;;FAIL
+-p tcp -m tcp --dport 22 -m comment --comment foo;=;OK
+-p tcp -m comment --comment foo -m tcp --dport 22;=;OK
 #
 # it fails with 256 characters
 #
diff --git a/iptables/nft-ipv4.c b/iptables/nft-ipv4.c
index ffb439b4a1128..4497eb9b9347c 100644
--- a/iptables/nft-ipv4.c
+++ b/iptables/nft-ipv4.c
@@ -77,17 +77,9 @@ static int nft_ipv4_add(struct nftnl_rule *r, void *data)
add_compat(r, cs->fw.ip.proto, cs->fw.ip.invflags & XT_INV_PROTO);
 
for (matchp = cs->matches; matchp; matchp = matchp->next) {
-   /* Use nft built-in comments support instead of comment match */
-   if (strcmp(matchp->match->name, "comment") == 0) {
-   ret = add_comment(r, (char *)matchp->match->m->data);
-   if (ret < 0)
-   goto try_match;
-   } else {
-try_match:
-   ret = add_match(r, matchp->match->m);
-   if (ret < 0)
-   return ret;
-   }
+   ret = add_match(r, matchp->match->m);
+   if (ret < 0)
+   return ret;
}
 
/* Counters need to me added before the target, otherwise they are
diff --git a/iptables/nft-ipv6.c b/iptables/nft-ipv6.c
index 7bacee4ab3a21..cacb1c9e141f2 100644
--- a/iptables/nft-ipv6.c
+++ b/iptables/nft-ipv6.c
@@ -66,17 +66,9 @@ static int nft_ipv6_add(struct nftnl_rule *r, void *data)
add_compat(r, cs->fw6.ipv6.proto, cs->fw6.ipv6.invflags & XT_INV_PROTO);
 
for (matchp = cs->matches; matchp; matchp = matchp->next) {
-   /* Use nft built-in comments support instead of comment match */
-   if (strcmp(matchp->match->name, "comment") == 0) {
-   ret = add_comment(r, (char *)matchp->match->m->data);
-   if (ret < 0)
-   goto try_match;
-   } else {
-try_match:
-   ret = add_match(r, matchp->match->m);
-   if (ret < 0)
-   return ret;
-   }
+   ret = add_match(r, matchp->match->m);
+   if (ret < 0)
+   return ret;
}
 
/* Counters need to me added before the target, otherwise they are
diff --git a/iptables/nft.c b/iptables/nft.c
index 0223c0ed10001..7b6fb2b10686d 100644
--- a/iptables/nft.c
+++ b/iptables/nft.c
@@ -1129,33 +1129,6 @@ enum udata_type {
 };
 #define UDATA_TYPE_MAX (__UDATA_TYPE_MAX - 1)
 
-int add_comment(struct nftnl_rule *r, const char *comment)
-{
-   struct nftnl_udata_buf *udata;
-   uint32_t len;
-
-   if (nftnl_rule_get_data(r, NFTNL_RULE_USERDATA, &len))
-   return -EALREADY;
-
-   udata = nftnl_udata_buf_alloc(NFT_USERDATA_MAXLEN);
-   if (!udata)
-   return -ENOMEM;
-
-   if (strnlen(comment, 255) == 255)
-   return -ENOSPC;
-
-   if (!nftnl_udata_put_strz(udata, UDATA_TYPE_COMMENT, comment))
-   return -ENOMEM;
-
-   nftnl_rule_set_data(r, NFTNL_RULE_USERDATA,
-   nftnl_udata_buf_data(udata),
-   nftnl_udata_buf_len(udata));
-
-   nftnl_udata_buf_free(udata);
-
-   return 0;
-}
-
 static int parse_udata_cb(const struct nftnl_udata *attr, void *data)
 {
unsigned char *value = nftnl_udata_get(attr);
diff --git a/iptables/nft.h b/iptables/nft.h
index 711199948a89f..bf60ab3943659 100644
--- a/iptables/nft.h
+++ b/iptables/nft.h
@@ -121,7 +121,6 @@ int add_match(struct nftnl_rule *r, struct xt_entry_match 
*m);
 int add_target(struct nftnl_rule *r, struct xt_entry_target *t);
 int add_jumpto(struct nftnl_rule *r, const char *name, int verdict);
 int add_action(struct nftnl_rule *r, struct iptables_command_state *cs, bool 
goto_set);
-int add_comment(struct nftnl_rule 

[PATCH] netfilter: ipset: fix ip_set_byindex function

2018-11-27 Thread Florent Fourcot
The new function added by "Introduction of new commands and protocol
version 7" is not working: the reply is built in skb2 and returned to the
user, but the set name attribute was added to skb instead.

Signed-off-by: Victorien Molle 
Signed-off-by: Florent Fourcot 
---
 net/netfilter/ipset/ip_set_core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/ipset/ip_set_core.c 
b/net/netfilter/ipset/ip_set_core.c
index 1c3614aca34e..e3113aa1a975 100644
--- a/net/netfilter/ipset/ip_set_core.c
+++ b/net/netfilter/ipset/ip_set_core.c
@@ -1949,7 +1949,7 @@ static int ip_set_byindex(struct net *net, struct sock 
*ctnl,
if (!nlh2)
goto nlmsg_failure;
if (nla_put_u8(skb2, IPSET_ATTR_PROTOCOL, protocol(attr)) ||
-   nla_put_string(skb, IPSET_ATTR_SETNAME, set->name))
+   nla_put_string(skb2, IPSET_ATTR_SETNAME, set->name))
goto nla_put_failure;
nlmsg_end(skb2, nlh2);
 
-- 
2.11.0



Re: iptables configure ignore "--disable-silent-rules"

2018-11-27 Thread Jan Engelhardt
On Tuesday 2018-11-27 12:56, Rolf Eike Beer wrote:

>Hi,
>
>it seems to me that "--disable-silent-rules" has no effect on iptables 
>configure, i.e. I still have to pass V=1 to make to see what it is actually 
>doing.

This is expected because automake is not used in every
directory. But V=1 is the one way supported "everywhere", i.e.
linux kernel, iptables, automake.



iptables configure ignore "--disable-silent-rules"

2018-11-27 Thread Rolf Eike Beer
Hi,

it seems to me that "--disable-silent-rules" has no effect on iptables 
configure, i.e. I still have to pass V=1 to make to see what it is actually 
doing.

It also seems that the netfilter-announce archive is missing some mails (or 
they never got sent), at least I don't see any iptables 1.8.x announcement 
there, and no libnftnl 1.1.x one either.

Greetings,

Eike
-- 
Rolf Eike Beer, emlix GmbH, http://www.emlix.com
Fon +49 551 30664-0, Fax +49 551 30664-11
Gothaer Platz 3, 37083 Göttingen, Germany
Sitz der Gesellschaft: Göttingen, Amtsgericht Göttingen HR B 3160
Geschäftsführung: Heike Jordan, Dr. Uwe Kracke – Ust-IdNr.: DE 205 198 055

emlix - smart embedded open source



[PATCH] netfilter: nf_nat_sip: fix RTP/RTCP source port translations

2018-11-27 Thread Alin Nastac
Perform the same SNAT translation on RTP/RTCP conntracks regardless of
who sends the first datagram.

Prior to this change, RTP packets sent by the peer that required source
port translation were forwarded with an unmodified source port when this
peer started its voice/video stream first.

Signed-off-by: Alin Nastac 
---
 net/netfilter/nf_nat_sip.c | 35 +++
 1 file changed, 31 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/nf_nat_sip.c b/net/netfilter/nf_nat_sip.c
index 1f30860..a1e23cc 100644
--- a/net/netfilter/nf_nat_sip.c
+++ b/net/netfilter/nf_nat_sip.c
@@ -18,6 +18,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -316,6 +317,9 @@ static void nf_nat_sip_seq_adjust(struct sk_buff *skb, 
unsigned int protoff,
 static void nf_nat_sip_expected(struct nf_conn *ct,
struct nf_conntrack_expect *exp)
 {
+   struct nf_conn_help *help = nfct_help(ct->master);
+   struct nf_conntrack_expect *pair_exp;
+   int range_set_for_snat = 0;
struct nf_nat_range2 range;
 
/* This must be a fresh one. */
@@ -327,15 +331,38 @@ static void nf_nat_sip_expected(struct nf_conn *ct,
range.min_addr = range.max_addr = exp->saved_addr;
nf_nat_setup_info(ct, &range, NF_NAT_MANIP_DST);
 
-   /* Change src to where master sends to, but only if the connection
-* actually came from the same source. */
-   if (nf_inet_addr_cmp(&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.u3,
+   /* Do SRC manip according with the parameters found in the
+* paired expected conntrack. */
+   spin_lock_bh(&nf_conntrack_expect_lock);
+   hlist_for_each_entry(pair_exp, &help->expectations, lnode) {
+   if (pair_exp->tuple.src.l3num == nf_ct_l3num(ct) &&
+   pair_exp->tuple.dst.protonum == ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.protonum &&
+   nf_inet_addr_cmp(&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.u3, &pair_exp->saved_addr) &&
+   ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.u.all == pair_exp->saved_proto.all) {
+   range.flags = (NF_NAT_RANGE_MAP_IPS | NF_NAT_RANGE_PROTO_SPECIFIED);
+   range.min_proto.all = range.max_proto.all = pair_exp->tuple.dst.u.all;
+   range.min_addr = range.max_addr = pair_exp->tuple.dst.u3;
+   range_set_for_snat = 1;
+   break;
+   }
+   }
+   spin_unlock_bh(&nf_conntrack_expect_lock);
+
+   /* When no paired expected conntrack has been found, change src to
+* where master sends to, but only if the connection actually came
+* from the same source. */
+   if (!range_set_for_snat &&
+   nf_inet_addr_cmp(&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.u3,
 &ct->master->tuplehash[exp->dir].tuple.src.u3)) {
range.flags = NF_NAT_RANGE_MAP_IPS;
range.min_addr = range.max_addr
= ct->master->tuplehash[!exp->dir].tuple.dst.u3;
-   nf_nat_setup_info(ct, &range, NF_NAT_MANIP_SRC);
+   range_set_for_snat = 1;
}
+
+   /* Perform SRC manip. */
+   if (range_set_for_snat)
+   nf_nat_setup_info(ct, &range, NF_NAT_MANIP_SRC);
 }
 
 static unsigned int nf_nat_sip_expect(struct sk_buff *skb, unsigned int 
protoff,
-- 
2.7.4



Re: [PATCH] netfilter: nf_nat_sip: fix RTP/RTCP source port translations

2018-11-26 Thread Alin Năstac
Hi Pablo,

On Tue, Nov 27, 2018 at 12:57 AM Pablo Neira Ayuso  wrote:
>
> Hi Alin,
>
> On Mon, Nov 05, 2018 at 02:54:53PM +0100, Alin Nastac wrote:
> > Perform the same SNAT translation on RTP/RTCP conntracks regardless of
> > who sends the first datagram.
> >
> > Prior to this change, RTP packets sent by the peer who required source
> > port translation were forwarded with unmodified source port when this
> > peer started its voice/video stream first.
>
> Do you have more detailed description, eg. scenario triggering this
> problem to understand better what this is fixing.

The scenario fixed by this patch involves a regular SIP call, but one
that requires port translation for the RTP conntrack. For instance,
suppose you have 2 SIP agents in the LAN, both connected to the same
SIP proxy:
  - agent S1 starts first and its REGISTER phase will create a permanent
    expected conntrack on dport 5060, allowing SIP packets to be
    forwarded to S1 regardless of their source IP address or port
  - on agent S2 registration, its permanent expected conntrack will
    conflict with the S1 signalling expected conntrack, so it will be
    translated to port 1024

When S1 initiates a call using RTP/RTCP port range 1024-1025, the SIP
helper will find that port 1024 is taken by S2's signalling expected
conntrack, so it translates it to port range 1026-1027. All goes well
if the RTP conntrack is initiated by a packet originating from the SIP
proxy, but when the first RTP packet is sent by S1 (usually the peer
that initiates the call is the one that sends the first RTP packet), it
is sent towards the SIP proxy with an unmodified source port (1024
instead of 1026).

> Not telling that this is not fixing anything, but this fix looks
> slightly hairy.

I tried to find a less hairy solution, but the information necessary
to fix RTP SNAT is not stored in the expected conntrack created by the
S1 INVITE; it is available in the paired expected conntrack created by
the SIP proxy reply.

> BTW, I need a Signed-off-by: tag here.

Ah, I keep forgetting this, sorry! I will send it signed in a couple of hours.
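
For readers following the scenario above, the lookup can be sketched in userspace as follows. This is only a rough model under stated assumptions: `struct endpoint`, `struct expectation`, and `pick_snat_source()` are hypothetical names, addresses are plain integers, and the l3num/protocol comparisons done by the real patch are left out. `saved` stands for the original LAN endpoint recorded in the paired expectation, `translated` for the endpoint the proxy was told to send to.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simplified endpoint: IPv4 address as an integer plus port. */
struct endpoint {
    uint32_t addr;
    uint16_t port;
};

/* Hypothetical simplified expectation created from the SIP proxy reply. */
struct expectation {
    struct endpoint translated;  /* where the proxy sends media to */
    struct endpoint saved;       /* original (untranslated) LAN endpoint */
};

/* If the first packet of a media stream comes from a LAN endpoint that a
 * paired expectation remembers as its saved endpoint, SNAT it to that
 * expectation's translated endpoint. Returns false when no pair exists,
 * in which case the caller falls back to the old address-only logic. */
static bool pick_snat_source(const struct expectation *exps, size_t n,
                             struct endpoint src, struct endpoint *out)
{
    for (size_t i = 0; i < n; i++) {
        if (exps[i].saved.addr == src.addr &&
            exps[i].saved.port == src.port) {
            *out = exps[i].translated;
            return true;
        }
    }
    return false;
}
```

With one expectation whose saved endpoint is S1's LAN address and port 1024 and whose translated port is 1026, a first RTP packet sent by S1 from port 1024 is mapped to source port 1026 in this model, the same mapping that would apply if the proxy had sent first.
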


Re: [PATCH nf] netfilter: nf_conncount: remove wrong condition check routine

2018-11-26 Thread Pablo Neira Ayuso
On Sun, Nov 25, 2018 at 06:47:13PM +0900, Taehee Yoo wrote:
> All lists in tree_nodes_free() have both a zero count and a true dead flag,
> because the lists are selected by nf_conncount_gc_list(), which sets the
> zero count and the true dead flag.
> So the if statement in tree_nodes_free() is unnecessary and wrong.

Applied, thanks.


Re: [PATCH nf v2 0/2] netfilter: fix notifier registration bugs

2018-11-26 Thread Pablo Neira Ayuso
On Thu, Nov 22, 2018 at 07:59:25PM +0900, Taehee Yoo wrote:
> This patch series fixes notifier registration bugs.
> 
> First patch adds error handling code for failure of notifier registration.
> Notifier registration can fail, so error handling code is needed.
> 
> Second patch fixes a double-register bug in masquerade modules.
> To protect against double registration, masquerade modules manage a
> reference count, but that is not enough.
> So this patch uses a mutex instead of an atomic value.

Series applied, thanks.


Re: [PATCH v2] ipv6: Preserve link scope traffic original oif

2018-11-26 Thread Pablo Neira Ayuso
On Wed, Nov 21, 2018 at 02:00:30PM +0100, Alin Nastac wrote:
> When ip6_route_me_harder is invoked, it resets outgoing interface of:
>   - link-local scoped packets sent by neighbor discovery
>   - multicast packets sent by MLD host
>   - multicast packets sent by MLD proxy daemon that sets outgoing
> interface through IPV6_PKTINFO ipi6_ifindex
> 
> Link-local and multicast packets must keep their original oif after
> ip6_route_me_harder is called.

Applied, thanks Alin.


[PATCH v2] netfilter: ipset: replace a strncpy() with strscpy()

2018-11-26 Thread Qian Cai
Replace strncpy() with strscpy() to make overflows as obvious as possible
and to prevent code from blithely proceeding with a truncated string. As a
side effect, this also fixes a compilation warning when using GCC 8.2.1.

net/netfilter/ipset/ip_set_core.c: In function 'ip_set_sockfn_get':
net/netfilter/ipset/ip_set_core.c:2027:3: warning: 'strncpy' writing 32
bytes into a region of size 2 overflows the destination
[-Wstringop-overflow=]

Signed-off-by: Qian Cai 
---
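For reference, the return-value contract this change relies on can be sketched in userspace. `bounded_copy()` and `copy_set_name()` are hypothetical names and this is not the kernel implementation of strscpy(); it only approximates the documented behaviour: the destination is always NUL-terminated when the size is non-zero, and truncation is reported as -E2BIG instead of passing silently.

```c
#include <errno.h>
#include <stddef.h>

/* Userspace approximation of strscpy() semantics (assumption: this is a
 * sketch, not the kernel code). Returns the number of characters copied,
 * not counting the NUL, or -E2BIG if size is 0 or src did not fit. */
static long bounded_copy(char *dst, const char *src, size_t size)
{
    size_t i;

    if (size == 0)
        return -E2BIG;

    /* Copy at most size - 1 characters, leaving room for the NUL. */
    for (i = 0; i < size - 1 && src[i] != '\0'; i++)
        dst[i] = src[i];
    dst[i] = '\0';

    /* Characters left over in src mean the copy was truncated. */
    return src[i] != '\0' ? -E2BIG : (long)i;
}

/* Caller pattern similar to the one in the patch: treat truncation as
 * an error and propagate it instead of continuing with a cut-off name. */
static int copy_set_name(char *out, size_t size, const char *name)
{
    if (bounded_copy(out, name ? name : "", size) == -E2BIG)
        return -E2BIG;
    return 0;
}
```

The point of checking the result, as requested in review of v1, is that a truncated name is surfaced to the caller rather than used as if it were complete.
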

Changes since v1:
* Checked the return value.

 net/netfilter/ipset/ip_set_core.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/ipset/ip_set_core.c 
b/net/netfilter/ipset/ip_set_core.c
index 1577f2f76060..c6f82556f7f2 100644
--- a/net/netfilter/ipset/ip_set_core.c
+++ b/net/netfilter/ipset/ip_set_core.c
@@ -2024,8 +2024,11 @@ ip_set_sockfn_get(struct sock *sk, int optval, void 
__user *user, int *len)
}
nfnl_lock(NFNL_SUBSYS_IPSET);
set = ip_set(inst, req_get->set.index);
-   strncpy(req_get->set.name, set ? set->name : "",
-   IPSET_MAXNAMELEN);
+   if (strscpy(req_get->set.name, set ? set->name : "",
+   IPSET_MAXNAMELEN) == -E2BIG) {
+   ret = -E2BIG;
+   goto done;
+   }
nfnl_unlock(NFNL_SUBSYS_IPSET);
goto copy;
}
-- 
2.17.2 (Apple Git-113)



[PATCH] netfilter: update comment about get_unique_tuple()

2018-11-26 Thread Xiaozhou Liu
`__ip_conntrack_confirm' in the comments is confusing to newcomers
since it has long been replaced with __nf_conntrack_confirm.

Signed-off-by: Xiaozhou Liu 
---
 net/netfilter/nf_nat_core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c
index e2b196054dfc..527d125964d1 100644
--- a/net/netfilter/nf_nat_core.c
+++ b/net/netfilter/nf_nat_core.c
@@ -315,7 +315,8 @@ find_best_ips_proto(const struct nf_conntrack_zone *zone,
  * and NF_INET_LOCAL_OUT, we change the destination to map into the
  * range. It might not be possible to get a unique tuple, but we try.
  * At worst (or if we race), we will end up with a final duplicate in
- * __ip_conntrack_confirm and drop the packet. */
+ * __nf_conntrack_confirm and drop the packet.
+ */
 static void
 get_unique_tuple(struct nf_conntrack_tuple *tuple,
 const struct nf_conntrack_tuple *orig_tuple,
-- 
2.11.0



Re: [PATCH nf] netfilter: xt_TEE: fix build failure

2018-11-26 Thread Taehee Yoo
On Mon, 26 Nov 2018 at 20:28, Pablo Neira Ayuso  wrote:
>
> On Mon, Nov 26, 2018 at 06:39:28PM +0900, Taehee Yoo wrote:
> > Hi Pablo,
> >
> > According to Masahiro Yamada, this is Kconfig bug and he is fixing Kconfig.
> > https://lkml.org/lkml/2018/11/26/291
> >
> > So that I think this patch will be useless.
> > Could you check it up?
>
> OK, will keep back your patch by now, if this fix for Kbuild is still
> not fixing up the problem, then robots will spot this again.
>

Okay, Thank you for checking!

> Thanks!
>
> > On Sun, 18 Nov 2018 at 23:39, Taehee Yoo  wrote:
> > >
> > > xt_TEE.c needs nf_dup_ipv6.c to support ipv6 packet duplication.
> > > So that if xt_TEE is enabled, nf_dup_ipv6 will be automatically selected.
> > > But there is build failure scenario.
> > >
> > > test config:
> > > CONFIG_NETFILTER_XT_TARGET_TEE=y
> > > CONFIG_NF_DUP_IPV6=m
> > >
> > > compile result:
> > > net/netfilter/xt_TEE.o: In function `tee_tg6':
> > > net/netfilter/xt_TEE.c:57: undefined reference to `nf_dup_ipv6'
> > >
> > > This patch forces to avoid above config.
> > >
> > > Fixes: 5d400a4933e8 ("netfilter: Kconfig: Change select IPv6 
> > > dependencies")
> > > Reported-by: Randy Dunlap 
> > > Reported-by: Stephen Rothwell 
> > > Signed-off-by: Taehee Yoo 
> > > ---
> > >  net/netfilter/Kconfig | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
> > > index 2ab870ef233a..a0c2712290ea 100644
> > > --- a/net/netfilter/Kconfig
> > > +++ b/net/netfilter/Kconfig
> > > @@ -1011,7 +1011,7 @@ config NETFILTER_XT_TARGET_TEE
> > > depends on IPV6 || IPV6=n
> > > depends on !NF_CONNTRACK || NF_CONNTRACK
> > > select NF_DUP_IPV4
> > > -   select NF_DUP_IPV6 if IP6_NF_IPTABLES
> > > +   select NF_DUP_IPV6 if IP6_NF_IPTABLES != n
> > > ---help---
> > > This option adds a "TEE" target with which a packet can be cloned 
> > > and
> > > this clone be rerouted to another nexthop.
> > > --
> > > 2.17.1
> > >


Re: [PATCH nf] netfilter: xt_TEE: fix build failure

2018-11-26 Thread Pablo Neira Ayuso
On Mon, Nov 26, 2018 at 06:39:28PM +0900, Taehee Yoo wrote:
> Hi Pablo,
> 
> According to Masahiro Yamada, this is Kconfig bug and he is fixing Kconfig.
> https://lkml.org/lkml/2018/11/26/291
> 
> So that I think this patch will be useless.
> Could you check it up?

OK, I will hold back your patch for now. If the Kbuild fix does not
resolve the problem, the robots will spot this again.

Thanks!

> On Sun, 18 Nov 2018 at 23:39, Taehee Yoo  wrote:
> >
> > xt_TEE.c needs nf_dup_ipv6.c to support ipv6 packet duplication.
> > So that if xt_TEE is enabled, nf_dup_ipv6 will be automatically selected.
> > But there is build failure scenario.
> >
> > test config:
> > CONFIG_NETFILTER_XT_TARGET_TEE=y
> > CONFIG_NF_DUP_IPV6=m
> >
> > compile result:
> > net/netfilter/xt_TEE.o: In function `tee_tg6':
> > net/netfilter/xt_TEE.c:57: undefined reference to `nf_dup_ipv6'
> >
> > This patch forces to avoid above config.
> >
> > Fixes: 5d400a4933e8 ("netfilter: Kconfig: Change select IPv6 dependencies")
> > Reported-by: Randy Dunlap 
> > Reported-by: Stephen Rothwell 
> > Signed-off-by: Taehee Yoo 
> > ---
> >  net/netfilter/Kconfig | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
> > index 2ab870ef233a..a0c2712290ea 100644
> > --- a/net/netfilter/Kconfig
> > +++ b/net/netfilter/Kconfig
> > @@ -1011,7 +1011,7 @@ config NETFILTER_XT_TARGET_TEE
> > depends on IPV6 || IPV6=n
> > depends on !NF_CONNTRACK || NF_CONNTRACK
> > select NF_DUP_IPV4
> > -   select NF_DUP_IPV6 if IP6_NF_IPTABLES
> > +   select NF_DUP_IPV6 if IP6_NF_IPTABLES != n
> > ---help---
> > This option adds a "TEE" target with which a packet can be cloned 
> > and
> > this clone be rerouted to another nexthop.
> > --
> > 2.17.1
> >


[PATCH nf] netfilter: nf_tables: fix suspicious RCU usage in nft_chain_stats_replace()

2018-11-26 Thread Taehee Yoo
basechain->stats is RCU-protected data.
The write-side critical section for basechain->stats is
nft_chain_stats_replace().
That function is executed in the commit phase, so it is actually protected
by the commit_mutex lock.
Hence the commit_mutex lockdep check should be used for
rcu_dereference_protected() in nft_chain_stats_replace() instead of
NFNL_SUBSYS_NFTABLES.

With this patch, RCU APIs are used to handle basechain->stats data.

test commands:
   %iptables-nft -I INPUT
   %iptables-nft -Z
   %iptables-nft -Z

splat looks like:
[89279.358755] =
[89279.363656] WARNING: suspicious RCU usage
[89279.368458] 4.20.0-rc2+ #44 Tainted: GWL
[89279.374661] -
[89279.379542] net/netfilter/nf_tables_api.c:1404 suspicious 
rcu_dereference_protected() usage!
[89279.389520]
other info that might help us debug this:

[89279.398893]
rcu_scheduler_active = 2, debug_locks = 1
[89279.406556] 1 lock held by iptables-nft/5225:
[89279.411728]  #0: bf45a000 (&net->nft.commit_mutex){+.+.}, at: nf_tables_valid_genid+0x1f/0x70 [nf_tables]
[89279.424022]
stack backtrace:
[89279.429236] CPU: 0 PID: 5225 Comm: iptables-nft Tainted: GWL
4.20.0-rc2+ #44
[89279.430135] Call Trace:
[89279.430135]  dump_stack+0xc9/0x16b
[89279.430135]  ? show_regs_print_info+0x5/0x5
[89279.430135]  ? lockdep_rcu_suspicious+0x117/0x160
[89279.430135]  nft_chain_commit_update+0x4ea/0x640 [nf_tables]
[89279.430135]  ? sched_clock_local+0xd4/0x140
[89279.430135]  ? check_flags.part.35+0x440/0x440
[89279.430135]  ? __rhashtable_remove_fast.constprop.67+0xec0/0xec0 [nf_tables]
[89279.430135]  ? sched_clock_cpu+0x126/0x170
[89279.430135]  ? find_held_lock+0x39/0x1c0
[89279.430135]  ? hlock_class+0x140/0x140
[89279.430135]  ? is_bpf_text_address+0x5/0xf0
[89279.430135]  ? check_flags.part.35+0x440/0x440
[89279.430135]  ? __lock_is_held+0xb4/0x140
[89279.430135]  nf_tables_commit+0x2555/0x39c0 [nf_tables]

Fixes: f102d66b335a4 ("netfilter: nf_tables: use dedicated mutex to guard transactions")
Signed-off-by: Taehee Yoo 
---
 include/linux/netfilter/nfnetlink.h | 12 
 net/netfilter/nf_tables_api.c   | 21 +
 net/netfilter/nf_tables_core.c  |  2 +-
 3 files changed, 14 insertions(+), 21 deletions(-)

diff --git a/include/linux/netfilter/nfnetlink.h 
b/include/linux/netfilter/nfnetlink.h
index 4a520d3304a2..cf09ab37b45b 100644
--- a/include/linux/netfilter/nfnetlink.h
+++ b/include/linux/netfilter/nfnetlink.h
@@ -62,18 +62,6 @@ static inline bool lockdep_nfnl_is_held(__u8 subsys_id)
 }
 #endif /* CONFIG_PROVE_LOCKING */
 
-/*
- * nfnl_dereference - fetch RCU pointer when updates are prevented by subsys 
mutex
- *
- * @p: The pointer to read, prior to dereferencing
- * @ss: The nfnetlink subsystem ID
- *
- * Return the value of the specified RCU-protected pointer, but omit
- * the READ_ONCE(), because caller holds the NFNL subsystem mutex.
- */
-#define nfnl_dereference(p, ss)\
-   rcu_dereference_protected(p, lockdep_nfnl_is_held(ss))
-
 #define MODULE_ALIAS_NFNL_SUBSYS(subsys) \
MODULE_ALIAS("nfnetlink-subsys-" __stringify(subsys))
 
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index ddeaa1990e1e..e82ad1795194 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -1216,7 +1216,8 @@ static int nf_tables_fill_chain_info(struct sk_buff *skb, 
struct net *net,
if (nla_put_string(skb, NFTA_CHAIN_TYPE, basechain->type->name))
goto nla_put_failure;
 
-   if (basechain->stats && nft_dump_stats(skb, basechain->stats))
+   if (rcu_access_pointer(basechain->stats) &&
+   nft_dump_stats(skb, rcu_dereference(basechain->stats)))
goto nla_put_failure;
}
 
@@ -1392,7 +1393,8 @@ static struct nft_stats __percpu *nft_stats_alloc(const 
struct nlattr *attr)
return newstats;
 }
 
-static void nft_chain_stats_replace(struct nft_base_chain *chain,
+static void nft_chain_stats_replace(struct net *net,
+   struct nft_base_chain *chain,
struct nft_stats __percpu *newstats)
 {
struct nft_stats __percpu *oldstats;
@@ -1400,8 +1402,9 @@ static void nft_chain_stats_replace(struct nft_base_chain 
*chain,
if (newstats == NULL)
return;
 
-   if (chain->stats) {
-   oldstats = nfnl_dereference(chain->stats, NFNL_SUBSYS_NFTABLES);
+   if (rcu_access_pointer(chain->stats)) {
+   oldstats = rcu_dereference_protected(chain->stats,
+   lockdep_commit_lock_is_held(net));
rcu_assign_pointer(chain->stats, newstats);
synchronize_rcu();
free_percpu(oldstats);
@@ -1439,9 +1442,10 @@ static void nf_tables_chain_destroy(struct 

Re: [PATCH] netfilter: ipset: replace a strncpy() with strscpy()

2018-11-26 Thread Jozsef Kadlecsik
Hi,

On Wed, 21 Nov 2018, Qian Cai wrote:

> To make overflows as obvious as possible and to prevent code from blithely
> proceeding with a truncated string. This also has a side-effect to fix a
> compilation warning using GCC 8.2.1.
> 
> net/netfilter/ipset/ip_set_core.c: In function 'ip_set_sockfn_get':
> net/netfilter/ipset/ip_set_core.c:2027:3: warning: 'strncpy' writing 32
> bytes into a region of size 2 overflows the destination
> [-Wstringop-overflow=]

But without checking the return value of strscpy(), I don't see why it's 
better to call strscpy(). Also, with your patch I get the warning:

  CC [M]  /usr/src/git/ipset/ipset/kernel/net/netfilter/xt_set.o
  CC [M]  
/usr/src/git/ipset/ipset/kernel/net/netfilter/ipset/ip_set_core.o
/usr/src/git/ipset/ipset/kernel/net/netfilter/ipset/ip_set_core.c: In 
function 'ip_set_sockfn_get':
/usr/src/git/ipset/ipset/kernel/net/netfilter/ipset/ip_set_core.c:2184:3: 
warning: ignoring return value of 'strscpy', declared with attribute 
warn_unused_result [-Wunused-result]
   strscpy(req_get->set.name, set ? set->name : "",
   ^

So please add the proper checking of the return value.
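
A minimal userspace sketch of what checking the return value could look like. demo_strscpy() is a stand-in written for this example that follows the strscpy() contract (copied length on success, -E2BIG on truncation); demo_copy_set_name() and DEMO_MAXNAMELEN are hypothetical names, not ipset code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAXNAMELEN 32	/* stand-in for IPSET_MAXNAMELEN */

/* Userspace stand-in following the strscpy() contract: always
 * NUL-terminates when size > 0, returns the copied length, or
 * -E2BIG if size is 0 or src had to be truncated.
 */
long demo_strscpy(char *dst, const char *src, size_t size)
{
	size_t len;

	if (size == 0)
		return -E2BIG;

	/* Count source bytes, but never look past what dst could hold. */
	for (len = 0; len < size && src[len] != '\0'; len++)
		;

	if (len == size) {
		/* Source does not fit: copy what fits and terminate. */
		memcpy(dst, src, size - 1);
		dst[size - 1] = '\0';
		return -E2BIG;
	}

	memcpy(dst, src, len + 1);	/* includes the terminating NUL */
	return (long)len;
}

/* Hypothetical caller: report truncation instead of ignoring it. */
int demo_copy_set_name(char *dst, const char *name)
{
	long ret;

	assert(dst != NULL);
	ret = demo_strscpy(dst, name ? name : "", DEMO_MAXNAMELEN);
	return ret < 0 ? (int)ret : 0;
}
```

With this shape the caller can bail out with an error when the name did not fit, which is the behaviour the switch away from strncpy() is meant to enable.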

Best regards,
Jozsef
> Signed-off-by: Qian Cai 
> ---
>  net/netfilter/ipset/ip_set_core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/netfilter/ipset/ip_set_core.c 
> b/net/netfilter/ipset/ip_set_core.c
> index 1577f2f..915aa0d 100644
> --- a/net/netfilter/ipset/ip_set_core.c
> +++ b/net/netfilter/ipset/ip_set_core.c
> @@ -2024,7 +2024,7 @@ static int ip_set_protocol(struct net *net, struct sock 
> *ctnl,
>   }
>   nfnl_lock(NFNL_SUBSYS_IPSET);
>   set = ip_set(inst, req_get->set.index);
> - strncpy(req_get->set.name, set ? set->name : "",
> + strscpy(req_get->set.name, set ? set->name : "",
>   IPSET_MAXNAMELEN);
>   nfnl_unlock(NFNL_SUBSYS_IPSET);
>   goto copy;
> -- 
> 1.8.3.1
> 
> 

-
E-mail  : kad...@blackhole.kfki.hu, kadlecsik.joz...@wigner.mta.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : Wigner Research Centre for Physics, Hungarian Academy of Sciences
  H-1525 Budapest 114, POB. 49, Hungary


Re: [PATCH nf] netfilter: xt_TEE: fix build failure

2018-11-26 Thread Taehee Yoo
Hi Pablo,

According to Masahiro Yamada, this is a Kconfig bug and he is fixing it in Kconfig.
https://lkml.org/lkml/2018/11/26/291

So I think this patch will be unnecessary.
Could you check?

Thanks!

On Sun, 18 Nov 2018 at 23:39, Taehee Yoo  wrote:
>
> xt_TEE.c needs nf_dup_ipv6.c to support ipv6 packet duplication.
> So that if xt_TEE is enabled, nf_dup_ipv6 will be automatically selected.
> But there is build failure scenario.
>
> test config:
> CONFIG_NETFILTER_XT_TARGET_TEE=y
> CONFIG_NF_DUP_IPV6=m
>
> compile result:
> net/netfilter/xt_TEE.o: In function `tee_tg6':
> net/netfilter/xt_TEE.c:57: undefined reference to `nf_dup_ipv6'
>
> This patch forces to avoid above config.
>
> Fixes: 5d400a4933e8 ("netfilter: Kconfig: Change select IPv6 dependencies")
> Reported-by: Randy Dunlap 
> Reported-by: Stephen Rothwell 
> Signed-off-by: Taehee Yoo 
> ---
>  net/netfilter/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
> index 2ab870ef233a..a0c2712290ea 100644
> --- a/net/netfilter/Kconfig
> +++ b/net/netfilter/Kconfig
> @@ -1011,7 +1011,7 @@ config NETFILTER_XT_TARGET_TEE
> depends on IPV6 || IPV6=n
> depends on !NF_CONNTRACK || NF_CONNTRACK
> select NF_DUP_IPV4
> -   select NF_DUP_IPV6 if IP6_NF_IPTABLES
> +   select NF_DUP_IPV6 if IP6_NF_IPTABLES != n
> ---help---
> This option adds a "TEE" target with which a packet can be cloned and
> this clone be rerouted to another nexthop.
> --
> 2.17.1
>


Re: [PATCH nf] netfilter: nfnetlink_cttimeout: nf_proto_net must be first member of netns_proto_gre

2018-11-26 Thread Pablo Neira Ayuso
On Wed, Nov 21, 2018 at 01:38:59PM +0100, Florian Westphal wrote:
> Can't move timeouts around, it appears conntrack sysctl unregister
> assumes net_generic() returns nf_proto_net, so we get crash.
> 
> Expose layout of netns_proto_gre instead.
> 
> Reported-by: kernel test robot 
> Fixes: 991acf532b  netfilter: nfnetlink_cttimeout: fetch timeouts for udplite 
> and gre, too

I have squashed this patch into this previous fix.

Thanks Florian.


[PATCH nf] netfilter: nf_conncount: remove wrong condition check routine

2018-11-25 Thread Taehee Yoo
All lists passed to tree_nodes_free() already have a zero count and the
dead flag set, because they are selected by nf_conncount_gc_list(), which
sets both.
The if statement in tree_nodes_free() is therefore unnecessary and wrong.

Fixes: 31568ec09ea0 ("netfilter: nf_conncount: fix list_del corruption in 
conn_free")
Signed-off-by: Taehee Yoo 
---
 net/netfilter/nf_conncount.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/net/netfilter/nf_conncount.c b/net/netfilter/nf_conncount.c
index 8acae4a3e4c0..b6d0f6deea86 100644
--- a/net/netfilter/nf_conncount.c
+++ b/net/netfilter/nf_conncount.c
@@ -323,11 +323,8 @@ static void tree_nodes_free(struct rb_root *root,
while (gc_count) {
rbconn = gc_nodes[--gc_count];
spin_lock(&rbconn->list.list_lock);
-   if (rbconn->list.count == 0 && rbconn->list.dead == false) {
-   rbconn->list.dead = true;
-   rb_erase(&rbconn->node, root);
-   call_rcu(&rbconn->rcu_head, __tree_nodes_free);
-   }
+   rb_erase(&rbconn->node, root);
+   call_rcu(&rbconn->rcu_head, __tree_nodes_free);
spin_unlock(&rbconn->list.list_lock);
}
 }
-- 
2.17.1



[PATCH] include: extend the headers conflict workaround to in6.h

2018-11-24 Thread Baruch Siach
Commit 8d9d7e4b9ef ("include: fix build with kernel headers before 4.2")
introduced a kernel/user headers conflict workaround that allows build
of iptables with kernel headers older than 4.2. This minor extension
allows build with kernel headers older than 3.12, which is the version
that introduced explicit IP headers synchronization.

Cc: Florian Westphal 
Signed-off-by: Baruch Siach 
---
 include/linux/netfilter.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
index bacf8cd92116..042d8b1478e0 100644
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -5,8 +5,8 @@
 
 #ifndef _NETINET_IN_H
 #include 
-#endif
 #include 
+#endif
 #include 
 
 /* Responses from hook functions. */
-- 
2.19.1



Re: [iptables PATCH] ebtables: Use xtables_exit_err()

2018-11-23 Thread Florian Westphal
Phil Sutter  wrote:
> When e.g. ebtables-nft detects an incompatible table, a stray '.' was
> printed as last line of output:
> 
> | # ebtables-nft -L
> | table `filter' is incompatible, use 'nft' tool.
> | .
> 
> This comes from ebtables' own exit_err callback. Instead use the common
> one which also provides useful version information.

Applied, thanks.


[iptables PATCH] ebtables: Use xtables_exit_err()

2018-11-23 Thread Phil Sutter
When e.g. ebtables-nft detects an incompatible table, a stray '.' was
printed as last line of output:

| # ebtables-nft -L
| table `filter' is incompatible, use 'nft' tool.
| .

This comes from ebtables' own exit_err callback. Instead use the common
one which also provides useful version information.

While at it, align the final error message in xtables_eb_main()
with how the others print it.

Signed-off-by: Phil Sutter 
---
 iptables/xtables-eb-standalone.c |  2 +-
 iptables/xtables-eb.c| 15 ++-
 2 files changed, 3 insertions(+), 14 deletions(-)

diff --git a/iptables/xtables-eb-standalone.c b/iptables/xtables-eb-standalone.c
index 84ce0b60a7076..fb3daba0bd604 100644
--- a/iptables/xtables-eb-standalone.c
+++ b/iptables/xtables-eb-standalone.c
@@ -54,7 +54,7 @@ int xtables_eb_main(int argc, char *argv[])
ret = nft_commit();
 
if (!ret)
-   fprintf(stderr, "%s\n", nft_strerror(errno));
+   fprintf(stderr, "ebtables: %s\n", nft_strerror(errno));
 
exit(!ret);
 }
diff --git a/iptables/xtables-eb.c b/iptables/xtables-eb.c
index f1aba555186eb..efc1f16ac6364 100644
--- a/iptables/xtables-eb.c
+++ b/iptables/xtables-eb.c
@@ -291,23 +291,12 @@ struct option ebt_original_options[] =
{ 0 }
 };
 
-static void __attribute__((__noreturn__,format(printf,2,3)))
-ebt_print_error(enum xtables_exittype status, const char *format, ...)
-{
-   va_list l;
-
-   va_start(l, format);
-   vfprintf(stderr, format, l);
-   fprintf(stderr, ".\n");
-   va_end(l);
-   exit(-1);
-}
-
+extern void xtables_exit_error(enum xtables_exittype status, const char *msg, 
...) __attribute__((noreturn, format(printf,2,3)));
 struct xtables_globals ebtables_globals = {
.option_offset  = 0,
.program_version= IPTABLES_VERSION,
.orig_opts  = ebt_original_options,
-   .exit_err   = ebt_print_error,
+   .exit_err   = xtables_exit_error,
.compat_rev = nft_compatible_revision,
 };
 
-- 
2.19.0



compilation error glibc

2018-11-23 Thread Ansuel Smith
arm-openwrt-linux-gnueabi-gcc -D_LARGEFILE_SOURCE=1 -D_LARGE_FILES
-D_FILE_OFFSET_BITS=64 -D_REENTRANT
-DXTABLES_LIBDIR=\"/usr/lib/iptables\" -DXTABLES_INTERNAL -I../include
-I.. -I../include -I..
-I/media/MyBook/openwrt/build_dir/target-arm_cortex-a15+neon-vfpv4_glibc_eabi/linux-ipq806x/linux-4.14.82/user_headers/include/uapi
-I/media/MyBook/openwrt/build_dir/target-arm_cortex-a15+neon-vfpv4_glibc_eabi/linux-ipq806x/linux-4.14.82/user_headers/include
-I/media/MyBook/openwrt/build_dir/target-arm_cortex-a15+neon-vfpv4_glibc_eabi/linux-ipq806x/linux-4.14.82/user_headers/include/uapi
-I/media/MyBook/openwrt/build_dir/target-arm_cortex-a15+neon-vfpv4_glibc_eabi/linux-ipq806x/linux-4.14.82/user_headers/include
-I/media/MyBook/openwrt/build_dir/target-arm_cortex-a15+neon-vfpv4_glibc_eabi/linux-ipq806x/iptables-1.8.2/include
-I/media/MyBook/openwrt/build_dir/target-arm_cortex-a15+neon-vfpv4_glibc_eabi/linux-ipq806x/linux-4.14.82/user_headers/include
-I/media/MyBook/openwrt/staging_dir/target-arm_cortex-a15+neon-vfpv4_glibc_eabi/usr/include
-I/media/MyBook/openwrt/staging_dir/target-arm_cortex-a15+neon-vfpv4_glibc_eabi/include
-I/media/MyBook/openwrt/staging_dir/toolchain-arm_cortex-a15+neon-vfpv4_gcc-8.2.0_glibc_eabi/usr/include
-I/media/MyBook/openwrt/staging_dir/toolchain-arm_cortex-a15+neon-vfpv4_gcc-8.2.0_glibc_eabi/include
   -Wp,-MMD,./.libxt_mac.o.d,-MT,libxt_mac.o -Wall -Waggregate-return
-Wmissing-declarations -Wmissing-prototypes -Wredundant-decls -Wshadow
-Wstrict-prototypes -Wlogical-op -Winline -pipe -DNO_SHARED_LIBS=1
-D_INIT=libxt_mac_init -DPIC -fPIC -O2 -pipe -march=armv7-a+neon-vfpv4
-mtune=cortex-a15 -fno-caller-saves -fno-plt -fhonour-copts
-Wno-error=unused-but-set-variable -Wno-error=unused-result
-mfloat-abi=hard
-fmacro-prefix-map=/media/MyBook/openwrt/build_dir/target-arm_cortex-a15+neon-vfpv4_glibc_eabi/linux-ipq806x/iptables-1.8.2=iptables-1.8.2
-Wformat -Werror=format-security -D_FORTIFY_SOURCE=1 -Wl,-z,now
-Wl,-z,relro 
-I/media/MyBook/openwrt/build_dir/target-arm_cortex-a15+neon-vfpv4_glibc_eabi/linux-ipq806x/iptables-1.8.2/include
-I/media/MyBook/openwrt/build_dir/target-arm_cortex-a15+neon-vfpv4_glibc_eabi/linux-ipq806x/linux-4.14.82/user_headers/include
-ffunction-sections -fdata-sections -DNO_LEGACY  -o libxt_mac.o -c
libxt_mac.c;
In file included from ../include/linux/netfilter.h:9,
 from ../include/xtables.h:18,
 from libxt_mac.c:7:
/media/MyBook/openwrt/build_dir/target-arm_cortex-a15+neon-vfpv4_glibc_eabi/linux-ipq806x/linux-4.14.82/user_headers/include/linux/in6.h:33:8:
error: redefinition of 'struct in6_addr'
 struct in6_addr {
^~~~
In file included from ../include/xtables.h:15,
 from libxt_mac.c:7:
/media/MyBook/openwrt/staging_dir/toolchain-arm_cortex-a15+neon-vfpv4_gcc-8.2.0_glibc_eabi/include/netinet/in.h:211:8:
note: originally defined here
 struct in6_addr
^~~~
In file included from ../include/linux/netfilter.h:9,
 from ../include/xtables.h:18,
 from libxt_mac.c:7:
/media/MyBook/openwrt/build_dir/target-arm_cortex-a15+neon-vfpv4_glibc_eabi/linux-ipq806x/linux-4.14.82/user_headers/include/linux/in6.h:41:
warning: "s6_addr" redefined
 #define s6_addr   in6_u.u6_addr8

In file included from ../include/xtables.h:15,
 from libxt_mac.c:7:


Re: RFC: Designing per chain rule cache support in libnftnl

2018-11-23 Thread Pablo Neira Ayuso
On Fri, Nov 23, 2018 at 01:35:17PM +0100, Pablo Neira Ayuso wrote:
> On Fri, Nov 23, 2018 at 12:25:45PM +0100, Florian Westphal wrote:
> > Phil Sutter  wrote:
> > > > If user doesn't want it cleared at nftnl_chain_free() time they can
> > > > always allocate a new nftnl_rule_list and splice to that list.
> > > 
> > > Good point. What do you think about the simple approach of introducing:
> > > 
> > > | struct nftnl_rule_list *nftnl_chain_get_rule_list(const struct 
> > > nftnl_chain *);
> > 
> > Looks fine to me.
> > 
> > > This would allow to reuse nftnl_rule_list routines from libnftnl/rule.h.
> > > One potential problem I see is that users may try to call
> > > nftnl_rule_list_free(). Can we prevent that somehow?
> > 
> > Document that nftnl_rule_list_free() pairs with nftnl_rule_list_alloc() :-)
> > 
> > I don't think its an issue.
> > We could add a 'bool make_free_no_op' to nftnl_rule_list and set that to
> > true for nftnl_rule_list structures that are allocated indirectly on
> > behalf of nftnl_chain struct, but I think thats taking things too far.
> 
> Can we have an interface similar to nftnl_rule_add_expr() to add rules
> to chains?
> 
> So we add list field to nftnl_chain, and this new interface to
> add/delete rules.

We can add an internal hashtable, that allows lookup by handle. Also
add iterators à la nftnl_expr_foreach() too.
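
A rough sketch of how a per-chain handle lookup plus a callback iterator could fit together. All names (demo_hchain, demo_hrule, DEMO_BUCKETS and the functions) are hypothetical and only illustrate the idea; the callback convention (stop on a negative return) is an assumption modelled on the foreach style mentioned above.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_BUCKETS 16

struct demo_hrule {
	unsigned long long handle;
	struct demo_hrule *list_next;	/* insertion order, used by foreach */
	struct demo_hrule *hash_next;	/* bucket chain, used by lookup */
};

struct demo_hchain {
	struct demo_hrule *head;
	struct demo_hrule **tail;	/* points at the last list_next slot */
	struct demo_hrule *buckets[DEMO_BUCKETS];
};

void demo_hchain_init(struct demo_hchain *c)
{
	size_t i;

	c->head = NULL;
	c->tail = &c->head;
	for (i = 0; i < DEMO_BUCKETS; i++)
		c->buckets[i] = NULL;
}

/* Append to the ordered list and link into the handle hash. */
void demo_hchain_rule_add(struct demo_hchain *c, struct demo_hrule *r)
{
	size_t b = r->handle % DEMO_BUCKETS;

	r->list_next = NULL;
	*c->tail = r;
	c->tail = &r->list_next;

	r->hash_next = c->buckets[b];
	c->buckets[b] = r;
}

/* Walks only one bucket instead of the whole rule list. */
struct demo_hrule *demo_hchain_rule_lookup(const struct demo_hchain *c,
					   unsigned long long handle)
{
	struct demo_hrule *r = c->buckets[handle % DEMO_BUCKETS];

	while (r && r->handle != handle)
		r = r->hash_next;
	return r;
}

/* Calls cb for each rule in insertion order; a negative return
 * from cb stops the walk and is passed back to the caller.
 */
int demo_hchain_rule_foreach(struct demo_hchain *c,
			     int (*cb)(struct demo_hrule *r, void *data),
			     void *data)
{
	struct demo_hrule *r, *next;
	int ret;

	assert(cb != NULL);
	for (r = c->head; r; r = next) {
		next = r->list_next;
		ret = cb(r, data);
		if (ret < 0)
			return ret;
	}
	return 0;
}

/* Example callback: counts rules into an unsigned int. */
int demo_count_cb(struct demo_hrule *r, void *data)
{
	(void)r;
	(*(unsigned int *)data)++;
	return 0;
}
```

Keeping both links in the rule lets the list preserve ruleset order while the hash answers lookups by handle, so the lookup structure can later be swapped without touching the iteration API.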


Re: RFC: Designing per chain rule cache support in libnftnl

2018-11-23 Thread Pablo Neira Ayuso
On Fri, Nov 23, 2018 at 12:25:45PM +0100, Florian Westphal wrote:
> Phil Sutter  wrote:
> > > If user doesn't want it cleared at nftnl_chain_free() time they can
> > > always allocate a new nftnl_rule_list and splice to that list.
> > 
> > Good point. What do you think about the simple approach of introducing:
> > 
> > | struct nftnl_rule_list *nftnl_chain_get_rule_list(const struct 
> > nftnl_chain *);
> 
> Looks fine to me.
> 
> > This would allow to reuse nftnl_rule_list routines from libnftnl/rule.h.
> > One potential problem I see is that users may try to call
> > nftnl_rule_list_free(). Can we prevent that somehow?
> 
> Document that nftnl_rule_list_free() pairs with nftnl_rule_list_alloc() :-)
> 
> I don't think its an issue.
> We could add a 'bool make_free_no_op' to nftnl_rule_list and set that to
> true for nftnl_rule_list structures that are allocated indirectly on
> behalf of nftnl_chain struct, but I think thats taking things too far.

Can we have an interface similar to nftnl_rule_add_expr() to add rules
to chains?

So we add list field to nftnl_chain, and this new interface to
add/delete rules.

We can probably deprecate the existing list interface if we follow
that procedure after a bit of time in favour of this one.


Re: RFC: Designing per chain rule cache support in libnftnl

2018-11-23 Thread Florian Westphal
Phil Sutter  wrote:
> > If user doesn't want it cleared at nftnl_chain_free() time they can
> > always allocate a new nftnl_rule_list and splice to that list.
> 
> Good point. What do you think about the simple approach of introducing:
> 
> | struct nftnl_rule_list *nftnl_chain_get_rule_list(const struct nftnl_chain 
> *);

Looks fine to me.

> This would allow to reuse nftnl_rule_list routines from libnftnl/rule.h.
> One potential problem I see is that users may try to call
> nftnl_rule_list_free(). Can we prevent that somehow?

Document that nftnl_rule_list_free() pairs with nftnl_rule_list_alloc() :-)

I don't think it's an issue.
We could add a 'bool make_free_no_op' to nftnl_rule_list and set that to
true for nftnl_rule_list structures that are allocated indirectly on
behalf of nftnl_chain struct, but I think that's taking things too far.

> A more fool-proof (but somewhat tedious) solution would be to duplicate
> nftnl_rule_list API for use on an nftnl_chain. But I don't quite like
> that.

I don't like it either, API bloat is a problem.


Re: RFC: Designing per chain rule cache support in libnftnl

2018-11-23 Thread Phil Sutter
On Fri, Nov 23, 2018 at 07:49:49AM +0100, Florian Westphal wrote:
> Phil Sutter  wrote:
> > In order to improve performance in 'nft -f' as well as xtables-restore
> > with very large rulesets, we need to store rules by chain they belong
> > to. In order to avoid pointless code duplication, this should be
> > supported by libnftnl.
> 
> Unfortunately we still need to change lookup algorithm as well
> (hash, tree?), linear list scan is too expensive.
> 
> We might even need multiple internal ways to keep track of the chains,
> e.g. to accelerate insert/delete-by-index :-/

That's right. I would "hide" these details within struct nftnl_rule_list
though and provide appropriate lookup routines.

For now, I'm focussing on the API, if we get it right the data structure
behind it is replaceable/extensible at will.

> > Looking into the topic, it seems like extending struct nftnl_chain is
> > the most straightforward way to go. My idea is to embed an
> > nftnl_rule_list in there, though I'm unsure how to best do that in
> > practice:
> > 
> > We could either add a field of type struct nftnl_rule_list which would
> > have to be initialized/cleared in nftnl_chain_alloc() and
> > nftnl_chain_free(). This would be accompanied by a function to retrieve
> > the pointer to that field so the existing rule_list routines may be used
> > with it.
> > 
> > Another option would be to add a pointer to a struct nftnl_rule_list.
> > Having a function to retrieve a pointer to that pointer, the rule_list
> > could be initialized/cleared by users on demand.
> > 
> > What do you consider more practical? Is there a third option I didn't
> > think of yet?
> 
> I'd vote for the former (embed nftnl_rule_list).

OK, thanks.

> If user doesn't want it cleared at nftnl_chain_free() time they can
> always allocate a new nftnl_rule_list and splice to that list.

Good point. What do you think about the simple approach of introducing:

| struct nftnl_rule_list *nftnl_chain_get_rule_list(const struct nftnl_chain *);

This would allow to reuse nftnl_rule_list routines from libnftnl/rule.h.
One potential problem I see is that users may try to call
nftnl_rule_list_free(). Can we prevent that somehow?

A more fool-proof (but somewhat tedious) solution would be to duplicate
nftnl_rule_list API for use on an nftnl_chain. But I don't quite like
that.

Cheers, Phil
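
For illustration, a small self-contained sketch of the "embed the list in the chain" variant discussed above. Everything prefixed demo_ is hypothetical and not libnftnl API; the accessor takes a non-const chain here to avoid a cast, which differs from the proposed prototype.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_rule {
	struct demo_rule *next;
	unsigned long long handle;
};

struct demo_rule_list {
	struct demo_rule *head;
	struct demo_rule **tail;	/* points at the last 'next' slot */
};

struct demo_chain {
	const char *name;
	struct demo_rule_list rules;	/* embedded: lives and dies with the chain */
};

void demo_rule_list_init(struct demo_rule_list *l)
{
	l->head = NULL;
	l->tail = &l->head;
}

struct demo_rule *demo_rule_alloc(unsigned long long handle)
{
	struct demo_rule *r = calloc(1, sizeof(*r));

	if (r)
		r->handle = handle;
	return r;
}

void demo_rule_list_add_tail(struct demo_rule_list *l, struct demo_rule *r)
{
	assert(r != NULL);
	r->next = NULL;
	*l->tail = r;
	l->tail = &r->next;
}

/* Frees the rules but not the list head, which the owner provides. */
void demo_rule_list_clear(struct demo_rule_list *l)
{
	struct demo_rule *r, *next;

	for (r = l->head; r; r = next) {
		next = r->next;
		free(r);
	}
	demo_rule_list_init(l);
}

/* Moves all rules from src to the end of dst and leaves src empty. */
void demo_rule_list_splice(struct demo_rule_list *dst,
			   struct demo_rule_list *src)
{
	if (!src->head)
		return;
	*dst->tail = src->head;
	dst->tail = src->tail;
	demo_rule_list_init(src);
}

struct demo_chain *demo_chain_alloc(const char *name)
{
	struct demo_chain *c = calloc(1, sizeof(*c));

	if (!c)
		return NULL;
	c->name = name;
	demo_rule_list_init(&c->rules);
	return c;
}

/* Accessor in the spirit of the proposed getter: hands out the
 * embedded list, so callers must not free the list itself.
 */
struct demo_rule_list *demo_chain_get_rule_list(struct demo_chain *c)
{
	assert(c != NULL);
	return &c->rules;
}

void demo_chain_free(struct demo_chain *c)
{
	if (!c)
		return;
	demo_rule_list_clear(&c->rules);
	free(c);
}
```

A caller that wants the rules to outlive the chain would splice them into its own list before calling demo_chain_free(), which matches the "allocate a new list and splice" suggestion.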


Re: RFC: Designing per chain rule cache support in libnftnl

2018-11-22 Thread Florian Westphal
Phil Sutter  wrote:
> In order to improve performance in 'nft -f' as well as xtables-restore
> with very large rulesets, we need to store rules by chain they belong
> to. In order to avoid pointless code duplication, this should be
> supported by libnftnl.

Unfortunately we still need to change lookup algorithm as well
(hash, tree?), linear list scan is too expensive.

We might even need multiple internal ways to keep track of the chains,
e.g. to accelerate insert/delete-by-index :-/

> Looking into the topic, it seems like extending struct nftnl_chain is
> the most straightforward way to go. My idea is to embed an
> nftnl_rule_list in there, though I'm unsure how to best do that in
> practice:
> 
> We could either add a field of type struct nftnl_rule_list which would
> have to be initialized/cleared in nftnl_chain_alloc() and
> nftnl_chain_free(). This would be accompanied by a function to retrieve
> the pointer to that field so the existing rule_list routines may be used
> with it.
> 
> Another option would be to add a pointer to a struct nftnl_rule_list.
> Having a function to retrieve a pointer to that pointer, the rule_list
> could be initialized/cleared by users on demand.
> 
> What do you consider more practical? Is there a third option I didn't
> think of yet?

I'd vote for the former (embed nftnl_rule_list).

If user doesn't want it cleared at nftnl_chain_free() time they can
always allocate a new nftnl_rule_list and splice to that list.


Re: [iptables PATCH] arptables: Support --set-counters option

2018-11-22 Thread Florian Westphal
Phil Sutter  wrote:
> Relevant code for this was already present (short option '-c'), just the
> long option definition was missing.

Applied, thanks.


[iptables PATCH] arptables: Support --set-counters option

2018-11-22 Thread Phil Sutter
Relevant code for this was already present (short option '-c'), just the
long option definition was missing.

While at it, add '-c' to help text.
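
A hedged sketch of how a long option can share the handler of an existing short option with getopt_long(). The struct and function names are made up for this example; consuming BYTES from the next argv word reflects the "PKTS BYTES" usage shown in the help text and is an assumption about the parsing style, not a copy of the arptables code.

```c
#include <assert.h>
#include <getopt.h>
#include <stdlib.h>

struct demo_counters {
	unsigned long long pkts;
	unsigned long long bytes;
	int set;		/* non-zero once -c/--set-counters was seen */
};

/* The long spelling maps to the value 'c', so it lands in the same
 * case as the short option.
 */
static const struct option demo_opts[] = {
	{ "set-counters", required_argument, NULL, 'c' },
	{ NULL, 0, NULL, 0 }
};

/* Returns 0 on success, -1 on an unknown option or a missing BYTES. */
int demo_parse_counters(int argc, char *argv[], struct demo_counters *out)
{
	int c;

	assert(out != NULL);
	out->pkts = 0;
	out->bytes = 0;
	out->set = 0;
	optind = 1;	/* restart scanning so the function can be reused */
	opterr = 0;	/* report problems through the return value only */

	while ((c = getopt_long(argc, argv, "c:", demo_opts, NULL)) != -1) {
		switch (c) {
		case 'c':
			/* PKTS is the option argument, BYTES the next word */
			if (optind >= argc)
				return -1;
			out->pkts = strtoull(optarg, NULL, 10);
			out->bytes = strtoull(argv[optind++], NULL, 10);
			out->set = 1;
			break;
		default:
			return -1;
		}
	}
	return 0;
}
```

Because both spellings return 'c', no new case is needed in the switch; only the option table grows by one entry.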

Signed-off-by: Phil Sutter 
---
 iptables/xtables-arp.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/iptables/xtables-arp.c b/iptables/xtables-arp.c
index 5a9924ca56442..2f369d9aadb01 100644
--- a/iptables/xtables-arp.c
+++ b/iptables/xtables-arp.c
@@ -144,6 +144,7 @@ static struct option original_opts[] = {
{ "help", 2, 0, 'h' },
{ "line-numbers", 0, 0, '0' },
{ "modprobe", 1, 0, 'M' },
+   { "set-counters", 1, 0, 'c' },
{ 0 }
 };
 
@@ -481,7 +482,7 @@ exit_printhelp(void)
 "  --line-numbers  print line numbers when listing\n"
 "  --exact -x  expand numbers (display exact values)\n"
 "  --modprobe=try to insert modules using this 
command\n"
-"  --set-counters PKTS BYTES   set the counter during insert/append\n"
+"  --set-counters -c PKTS BYTESset the counter during insert/append\n"
 "[!] --version -V  print package version.\n");
printf(" opcode strings: \n");
 for (i = 0; i < NUMOPCODES; i++)
-- 
2.19.0



[PATCH nf v2 2/2] netfilter: nat: fix double register in masquerade modules

2018-11-22 Thread Taehee Yoo
The masquerade modules register notifiers, which must not be registered
twice, so these modules maintain a reference counter.
If the notifiers are already registered, registration just returns success.
But there is an unsafe scenario.

test commands:

   while :
   do
   modprobe ip6t_MASQUERADE &
   modprobe nft_masq_ipv6 &
   modprobe -rv ip6t_MASQUERADE &
   modprobe -rv nft_masq_ipv6 &
   done

The numbers below are reference counts.

CPU0        CPU1        CPU2        CPU3        CPU4
[insmod]    [insmod]    [rmmod]     [rmmod]     [insmod]

0->1
register    1->2
            returns     2->1
                        returns     1->0
                                                0->1
                                                register <--
                                    unregister


The unregistration on CPU3 should be processed before the
registration on CPU4.

To fix this, a mutex can be used,
so this patch uses one.
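
A userspace sketch of the pattern, assuming a pthread mutex in place of the kernel mutex. The demo_ names and the fake register/unregister helpers are invented for this example; the point is only that the count change and the (un)registration happen as one step under the lock.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static int demo_refcnt;		/* protected by demo_lock */

int demo_registered;		/* state of the pretend notifier chain */
int demo_register_calls;	/* how often real registration happened */

/* Stand-ins for the real notifier (un)registration calls. */
static int demo_do_register(void)
{
	demo_register_calls++;
	demo_registered = 1;
	return 0;
}

static void demo_do_unregister(void)
{
	demo_registered = 0;
}

int demo_register_notifier(void)
{
	int ret = 0;

	pthread_mutex_lock(&demo_lock);
	/* Only the first user registers; later users take a reference. */
	if (++demo_refcnt > 1)
		goto out_unlock;

	ret = demo_do_register();
	if (ret)
		demo_refcnt--;	/* registration failed, drop our reference */
out_unlock:
	pthread_mutex_unlock(&demo_lock);
	return ret;
}

void demo_unregister_notifier(void)
{
	pthread_mutex_lock(&demo_lock);
	assert(demo_refcnt > 0);
	/* Only the last user unregisters, still under the lock, so a
	 * concurrent register cannot slip in between the decrement and
	 * the actual unregistration.
	 */
	if (--demo_refcnt == 0)
		demo_do_unregister();
	pthread_mutex_unlock(&demo_lock);
}
```

With a bare atomic counter the decrement to zero and the unregister call are two separate steps, which is the window the CPU3/CPU4 interleaving above falls into.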

splat looks like:
[  323.869557] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [modprobe:1381]
[  323.869574] Modules linked in: nf_tables(+) nf_nat_ipv6(-) nf_nat 
nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 n]
[  323.869574] irq event stamp: 194074
[  323.898930] hardirqs last  enabled at (194073): [] 
trace_hardirqs_on_thunk+0x1a/0x1c
[  323.898930] hardirqs last disabled at (194074): [] 
trace_hardirqs_off_thunk+0x1a/0x1c
[  323.898930] softirqs last  enabled at (182132): [] 
__do_softirq+0x6ec/0xa3b
[  323.898930] softirqs last disabled at (182109): [] 
irq_exit+0x1a6/0x1e0
[  323.898930] CPU: 0 PID: 1381 Comm: modprobe Not tainted 4.20.0-rc2+ #27
[  323.898930] RIP: 0010:raw_notifier_chain_register+0xea/0x240
[  323.898930] Code: 3c 03 0f 8e f2 00 00 00 44 3b 6b 10 7f 4d 49 bc 00 00 00 
00 00 fc ff df eb 22 48 8d 7b 10 488
[  323.898930] RSP: 0018:888101597218 EFLAGS: 0206 ORIG_RAX: 
ff13
[  323.898930] RAX:  RBX: c04361c0 RCX: 
[  323.898930] RDX: 126132ae RSI: c04aa3c0 RDI: c04361d0
[  323.898930] RBP: c04361c8 R08:  R09: 0001
[  323.898930] R10: 8881015972b0 R11: fbfff26132c4 R12: dc00
[  323.898930] R13:  R14: 1110202b2e44 R15: c04aa3c0
[  323.898930] FS:  7f813ed41540() GS:88811ae0() 
knlGS:
[  323.898930] CS:  0010 DS:  ES:  CR0: 80050033
[  323.898930] CR2: 559bf2c9f120 CR3: 00010bc8 CR4: 001006f0
[  323.898930] Call Trace:
[  323.898930]  ? atomic_notifier_chain_register+0x2d0/0x2d0
[  323.898930]  ? down_read+0x150/0x150
[  323.898930]  ? sched_clock_cpu+0x126/0x170
[  323.898930]  ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables]
[  323.898930]  ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables]
[  323.898930]  register_netdevice_notifier+0xbb/0x790
[  323.898930]  ? __dev_close_many+0x2d0/0x2d0
[  323.898930]  ? __mutex_unlock_slowpath+0x17f/0x740
[  323.898930]  ? wait_for_completion+0x710/0x710
[  323.898930]  ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables]
[  323.898930]  ? up_write+0x6c/0x210
[  323.898930]  ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables]
[  324.127073]  ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables]
[  324.127073]  nft_chain_filter_init+0x1e/0xe8a [nf_tables]
[  324.127073]  nf_tables_module_init+0x37/0x92 [nf_tables]
[ ... ]

Fixes: 8dd33cc93ec9 ("netfilter: nf_nat: generalize IPv4 masquerading support 
for nf_tables")
Fixes: be6b635cd674 ("netfilter: nf_nat: generalize IPv6 masquerading support 
for nf_tables")
Signed-off-by: Taehee Yoo 
---

v2:
 - Add second patch
 - return success when notifier is already registered. (Florian Westphal)
v1: Initial patch

 net/ipv4/netfilter/nf_nat_masquerade_ipv4.c | 23 ++---
 net/ipv6/netfilter/nf_nat_masquerade_ipv6.c | 23 ++---
 2 files changed, 32 insertions(+), 14 deletions(-)

diff --git a/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c 
b/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
index c7d7fa4fc369..41327bb99093 100644
--- a/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
+++ b/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
@@ -147,15 +147,17 @@ static struct notifier_block masq_inet_notifier = {
.notifier_call  = masq_inet_event,
 };
 
-static atomic_t masquerade_notifier_refcount = ATOMIC_INIT(0);
+static int masq_refcnt;
+static DEFINE_MUTEX(masq_mutex);
 
 int nf_nat_masquerade_ipv4_register_notifier(void)
 {
-   int ret;
+   int ret = 0;
 
+   mutex_lock(&masq_mutex);
/* check if the notifier was already set */
-   if (atomic_inc_return(&masquerade_notifier_refcount) > 1)
-   return 0;
+   if (++masq_refcnt > 1)
+   goto out_unlock;
 
/* Register for device down reports */
ret 

[PATCH nf v2 0/2] netfilter: fix notifier registration bugs

2018-11-22 Thread Taehee Yoo
This patch series fixes notifier registration bugs.

The first patch adds error handling for notifier registration failures.
Notifier registration can fail, so error handling code is needed.

The second patch fixes a double-register bug in the masquerade modules.
To guard against double registration, the masquerade modules keep a
reference count, but that alone is not enough.
This patch therefore uses a mutex instead of an atomic value.

v2:
 - Add second patch
 - return success when notifier is already registered. (Florian Westphal)
v1: Initial patch

Taehee Yoo (2):
  netfilter: add missing error handling code for register functions
  netfilter: nat: fix double register in masquerade modules

 .../net/netfilter/ipv4/nf_nat_masquerade.h|  2 +-
 .../net/netfilter/ipv6/nf_nat_masquerade.h|  2 +-
 net/ipv4/netfilter/ipt_MASQUERADE.c   |  7 ++-
 net/ipv4/netfilter/nf_nat_masquerade_ipv4.c   | 38 +++---
 net/ipv4/netfilter/nft_masq_ipv4.c|  4 +-
 net/ipv6/netfilter/ip6t_MASQUERADE.c  |  8 ++-
 net/ipv6/netfilter/nf_nat_masquerade_ipv6.c   | 49 ++-
 net/ipv6/netfilter/nft_masq_ipv6.c|  4 +-
 net/netfilter/nft_flow_offload.c  |  5 +-
 9 files changed, 89 insertions(+), 30 deletions(-)

-- 
2.17.1



[PATCH nf v2 1/2] netfilter: add missing error handling code for register functions

2018-11-22 Thread Taehee Yoo
register_{netdevice/inetaddr/inet6addr}_notifier return a value that
may be an error, so error handling code is needed.
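
A minimal sketch of the unwind-on-failure shape this implies, using made-up userspace stand-ins (demo_register_dev() and friends, plus failure knobs) rather than the real notifier calls.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical knobs so the failure paths can be exercised. */
int demo_fail_dev;
int demo_fail_inet;

int demo_dev_registered;
int demo_inet_registered;

static int demo_register_dev(void)
{
	if (demo_fail_dev)
		return -ENOMEM;
	demo_dev_registered = 1;
	return 0;
}

static void demo_unregister_dev(void)
{
	assert(demo_dev_registered);
	demo_dev_registered = 0;
}

static int demo_register_inet(void)
{
	if (demo_fail_inet)
		return -ENOMEM;
	demo_inet_registered = 1;
	return 0;
}

/* Registers both notifiers or neither: a failure in the second
 * step undoes the first before the error is returned.
 */
int demo_register_notifiers(void)
{
	int ret;

	ret = demo_register_dev();
	if (ret)
		return ret;

	ret = demo_register_inet();
	if (ret)
		goto err_unregister_dev;

	return 0;

err_unregister_dev:
	demo_unregister_dev();
	return ret;
}
```

Callers such as a module init function can then propagate the return value and undo their own earlier registration when it is non-zero.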

Signed-off-by: Taehee Yoo 
---

v2:
 - Add second patch
 - return success when notifier is already registered. (Florian Westphal)
v1: Initial patch

 .../net/netfilter/ipv4/nf_nat_masquerade.h|  2 +-
 .../net/netfilter/ipv6/nf_nat_masquerade.h|  2 +-
 net/ipv4/netfilter/ipt_MASQUERADE.c   |  7 ++--
 net/ipv4/netfilter/nf_nat_masquerade_ipv4.c   | 21 +---
 net/ipv4/netfilter/nft_masq_ipv4.c|  4 ++-
 net/ipv6/netfilter/ip6t_MASQUERADE.c  |  8 +++--
 net/ipv6/netfilter/nf_nat_masquerade_ipv6.c   | 32 +--
 net/ipv6/netfilter/nft_masq_ipv6.c|  4 ++-
 net/netfilter/nft_flow_offload.c  |  5 ++-
 9 files changed, 63 insertions(+), 22 deletions(-)

diff --git a/include/net/netfilter/ipv4/nf_nat_masquerade.h 
b/include/net/netfilter/ipv4/nf_nat_masquerade.h
index cd24be4c4a99..13d55206bb9f 100644
--- a/include/net/netfilter/ipv4/nf_nat_masquerade.h
+++ b/include/net/netfilter/ipv4/nf_nat_masquerade.h
@@ -9,7 +9,7 @@ nf_nat_masquerade_ipv4(struct sk_buff *skb, unsigned int 
hooknum,
   const struct nf_nat_range2 *range,
   const struct net_device *out);
 
-void nf_nat_masquerade_ipv4_register_notifier(void);
+int nf_nat_masquerade_ipv4_register_notifier(void);
 void nf_nat_masquerade_ipv4_unregister_notifier(void);
 
 #endif /*_NF_NAT_MASQUERADE_IPV4_H_ */
diff --git a/include/net/netfilter/ipv6/nf_nat_masquerade.h 
b/include/net/netfilter/ipv6/nf_nat_masquerade.h
index 0c3b5ebf0bb8..2917bf95c437 100644
--- a/include/net/netfilter/ipv6/nf_nat_masquerade.h
+++ b/include/net/netfilter/ipv6/nf_nat_masquerade.h
@@ -5,7 +5,7 @@
 unsigned int
 nf_nat_masquerade_ipv6(struct sk_buff *skb, const struct nf_nat_range2 *range,
   const struct net_device *out);
-void nf_nat_masquerade_ipv6_register_notifier(void);
+int nf_nat_masquerade_ipv6_register_notifier(void);
 void nf_nat_masquerade_ipv6_unregister_notifier(void);
 
 #endif /* _NF_NAT_MASQUERADE_IPV6_H_ */
diff --git a/net/ipv4/netfilter/ipt_MASQUERADE.c 
b/net/ipv4/netfilter/ipt_MASQUERADE.c
index ce1512b02cb2..fd3f9e8a74da 100644
--- a/net/ipv4/netfilter/ipt_MASQUERADE.c
+++ b/net/ipv4/netfilter/ipt_MASQUERADE.c
@@ -81,9 +81,12 @@ static int __init masquerade_tg_init(void)
int ret;
 
ret = xt_register_target(&masquerade_tg_reg);
+   if (ret)
+   return ret;
 
-   if (ret == 0)
-   nf_nat_masquerade_ipv4_register_notifier();
+   ret = nf_nat_masquerade_ipv4_register_notifier();
+   if (ret)
+   xt_unregister_target(&masquerade_tg_reg);
 
return ret;
 }
diff --git a/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c 
b/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
index a9d5e013e555..c7d7fa4fc369 100644
--- a/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
+++ b/net/ipv4/netfilter/nf_nat_masquerade_ipv4.c
@@ -149,16 +149,29 @@ static struct notifier_block masq_inet_notifier = {
 
 static atomic_t masquerade_notifier_refcount = ATOMIC_INIT(0);
 
-void nf_nat_masquerade_ipv4_register_notifier(void)
+int nf_nat_masquerade_ipv4_register_notifier(void)
 {
+   int ret;
+
/* check if the notifier was already set */
if (atomic_inc_return(&masquerade_notifier_refcount) > 1)
-   return;
+   return 0;
 
/* Register for device down reports */
-   register_netdevice_notifier(&masq_dev_notifier);
+   ret = register_netdevice_notifier(&masq_dev_notifier);
+   if (ret)
+   goto err_dec;
/* Register IP address change reports */
-   register_inetaddr_notifier(&masq_inet_notifier);
+   ret = register_inetaddr_notifier(&masq_inet_notifier);
+   if (ret)
+   goto err_unregister;
+
+   return ret;
+err_unregister:
+   unregister_netdevice_notifier(&masq_dev_notifier);
+err_dec:
+   atomic_dec(&masquerade_notifier_refcount);
+   return ret;
 }
 EXPORT_SYMBOL_GPL(nf_nat_masquerade_ipv4_register_notifier);
 
diff --git a/net/ipv4/netfilter/nft_masq_ipv4.c 
b/net/ipv4/netfilter/nft_masq_ipv4.c
index f1193e1e928a..6847de1d1db8 100644
--- a/net/ipv4/netfilter/nft_masq_ipv4.c
+++ b/net/ipv4/netfilter/nft_masq_ipv4.c
@@ -69,7 +69,9 @@ static int __init nft_masq_ipv4_module_init(void)
if (ret < 0)
return ret;
 
-   nf_nat_masquerade_ipv4_register_notifier();
+   ret = nf_nat_masquerade_ipv4_register_notifier();
+   if (ret)
+   nft_unregister_expr(&nft_masq_ipv4_type);
 
return ret;
 }
diff --git a/net/ipv6/netfilter/ip6t_MASQUERADE.c 
b/net/ipv6/netfilter/ip6t_MASQUERADE.c
index 491f808e356a..29c7f1915a96 100644
--- a/net/ipv6/netfilter/ip6t_MASQUERADE.c
+++ b/net/ipv6/netfilter/ip6t_MASQUERADE.c
@@ -58,8 +58,12 @@ static int __init masquerade_tg6_init(void)
int err;
 
err = xt_register_target(&masquerade_tg6_reg);
-   if (err == 0)
-   

[PATCH] netfilter: ipset: replace a strncpy() with strscpy()

2018-11-21 Thread Qian Cai
To make overflows as obvious as possible and to prevent code from blithely
proceeding with a truncated string. This also has a side-effect to fix a
compilation warning using GCC 8.2.1.

net/netfilter/ipset/ip_set_core.c: In function 'ip_set_sockfn_get':
net/netfilter/ipset/ip_set_core.c:2027:3: warning: 'strncpy' writing 32
bytes into a region of size 2 overflows the destination
[-Wstringop-overflow=]

Signed-off-by: Qian Cai 
---
 net/netfilter/ipset/ip_set_core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/ipset/ip_set_core.c 
b/net/netfilter/ipset/ip_set_core.c
index 1577f2f..915aa0d 100644
--- a/net/netfilter/ipset/ip_set_core.c
+++ b/net/netfilter/ipset/ip_set_core.c
@@ -2024,7 +2024,7 @@ static int ip_set_protocol(struct net *net, struct sock 
*ctnl,
}
nfnl_lock(NFNL_SUBSYS_IPSET);
set = ip_set(inst, req_get->set.index);
-   strncpy(req_get->set.name, set ? set->name : "",
+   strscpy(req_get->set.name, set ? set->name : "",
IPSET_MAXNAMELEN);
nfnl_unlock(NFNL_SUBSYS_IPSET);
goto copy;
-- 
1.8.3.1



[PATCH v2] ipv6: Preserve link scope traffic original oif

2018-11-21 Thread Alin Nastac
When ip6_route_me_harder is invoked, it resets outgoing interface of:
  - link-local scoped packets sent by neighbor discovery
  - multicast packets sent by MLD host
  - multicast packets sent by MLD proxy daemon that sets outgoing
    interface through IPV6_PKTINFO ipi6_ifindex

Link-local and multicast packets must keep their original oif after
ip6_route_me_harder is called.

Signed-off-by: Alin Nastac 
---
 net/ipv6/netfilter.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv6/netfilter.c b/net/ipv6/netfilter.c
index 5ae8e1c..8b075f0 100644
--- a/net/ipv6/netfilter.c
+++ b/net/ipv6/netfilter.c
@@ -24,7 +24,8 @@ int ip6_route_me_harder(struct net *net, struct sk_buff *skb)
unsigned int hh_len;
struct dst_entry *dst;
struct flowi6 fl6 = {
-   .flowi6_oif = sk ? sk->sk_bound_dev_if : 0,
+   .flowi6_oif = sk && sk->sk_bound_dev_if ? sk->sk_bound_dev_if :
+   rt6_need_strict(&iph->daddr) ? skb_dst(skb)->dev->ifindex : 0,
.flowi6_mark = skb->mark,
.flowi6_uid = sock_net_uid(net, sk),
.daddr = iph->daddr,
-- 
2.7.4
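
The nested conditional in the hunk above can be hard to read at a glance.
The sketch below spells out the old and new selection of flowi6_oif as
plain functions. struct oif_inputs and both function names are
hypothetical simplifications for illustration; the kernel takes these
values from struct sock, the skb's dst entry and rt6_need_strict().

```c
#include <stdbool.h>

/* Hypothetical, simplified inputs to the oif decision. */
struct oif_inputs {
	bool have_sk;		/* skb has an owning socket */
	int sk_bound_dev_if;	/* socket's bound interface, 0 if unbound */
	bool need_strict;	/* stand-in for rt6_need_strict(&iph->daddr) */
	int dst_dev_ifindex;	/* stand-in for skb_dst(skb)->dev->ifindex */
};

/* Before the patch: only the socket binding is considered. */
static int old_oif(const struct oif_inputs *in)
{
	return in->have_sk ? in->sk_bound_dev_if : 0;
}

/*
 * After the patch: a bound socket still wins; otherwise link-local and
 * multicast destinations keep the interface of the current dst entry.
 */
static int new_oif(const struct oif_inputs *in)
{
	if (in->have_sk && in->sk_bound_dev_if)
		return in->sk_bound_dev_if;

	return in->need_strict ? in->dst_dev_ifindex : 0;
}
```

For a neighbor discovery packet with no bound socket, old_oif() yields 0
and the route lookup is free to pick any interface, whereas new_oif()
returns the original device index.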



[PATCH nf] netfilter: nfnetlink_cttimeout: nf_proto_net must be first member of netns_proto_gre

2018-11-21 Thread Florian Westphal
Can't move timeouts around, it appears conntrack sysctl unregister
assumes net_generic() returns nf_proto_net, so we get crash.

Expose layout of netns_proto_gre instead.

Reported-by: kernel test robot 
Fixes: 991acf532b  netfilter: nfnetlink_cttimeout: fetch timeouts for udplite and gre, too
Signed-off-by: Florian Westphal 
---
 include/linux/netfilter/nf_conntrack_proto_gre.h | 13 +
 net/netfilter/nf_conntrack_proto_gre.c   | 14 +-
 net/netfilter/nfnetlink_cttimeout.c  |  8 ++--
 3 files changed, 20 insertions(+), 15 deletions(-)

diff --git a/include/linux/netfilter/nf_conntrack_proto_gre.h b/include/linux/netfilter/nf_conntrack_proto_gre.h
index b8d95564bd53..14edb795ab43 100644
--- a/include/linux/netfilter/nf_conntrack_proto_gre.h
+++ b/include/linux/netfilter/nf_conntrack_proto_gre.h
@@ -21,6 +21,19 @@ struct nf_ct_gre_keymap {
struct nf_conntrack_tuple tuple;
 };
 
+enum grep_conntrack {
+   GRE_CT_UNREPLIED,
+   GRE_CT_REPLIED,
+   GRE_CT_MAX
+};
+
+struct netns_proto_gre {
+   struct nf_proto_net nf;
+   rwlock_t keymap_lock;
+   struct list_head keymap_list;
+   unsigned int gre_timeouts[GRE_CT_MAX];
+};
+
 /* add new tuple->key_reply pair to keymap */
 int nf_ct_gre_keymap_add(struct nf_conn *ct, enum ip_conntrack_dir dir,
 struct nf_conntrack_tuple *t);
diff --git a/net/netfilter/nf_conntrack_proto_gre.c b/net/netfilter/nf_conntrack_proto_gre.c
index dd8db7fbc437..2a5e56c6d8d9 100644
--- a/net/netfilter/nf_conntrack_proto_gre.c
+++ b/net/netfilter/nf_conntrack_proto_gre.c
@@ -43,24 +43,12 @@
 #include 
 #include 
 
-enum grep_conntrack {
-   GRE_CT_UNREPLIED,
-   GRE_CT_REPLIED,
-   GRE_CT_MAX
-};
-
 static const unsigned int gre_timeouts[GRE_CT_MAX] = {
[GRE_CT_UNREPLIED]  = 30*HZ,
[GRE_CT_REPLIED]= 180*HZ,
 };
 
 static unsigned int proto_gre_net_id __read_mostly;
-struct netns_proto_gre {
-   unsigned int gre_timeouts[GRE_CT_MAX];
-   struct nf_proto_net nf;
-   rwlock_t keymap_lock;
-   struct list_head keymap_list;
-};
 
 static inline struct netns_proto_gre *gre_pernet(struct net *net)
 {
@@ -402,7 +390,7 @@ static int __init nf_ct_proto_gre_init(void)
 {
int ret;
 
-   BUILD_BUG_ON(offsetof(struct netns_proto_gre, gre_timeouts));
+   BUILD_BUG_ON(offsetof(struct netns_proto_gre, nf) != 0);
 
ret = register_pernet_subsys(&proto_gre_net_ops);
if (ret < 0)
diff --git a/net/netfilter/nfnetlink_cttimeout.c b/net/netfilter/nfnetlink_cttimeout.c
index 1643faa35f56..109b0d27345a 100644
--- a/net/netfilter/nfnetlink_cttimeout.c
+++ b/net/netfilter/nfnetlink_cttimeout.c
@@ -474,8 +474,12 @@ static int cttimeout_default_get(struct net *net, struct sock *ctnl,
break;
case IPPROTO_GRE:
 #ifdef CONFIG_NF_CT_PROTO_GRE
-   if (l4proto->net_id)
-   timeouts = net_generic(net, *l4proto->net_id);
+   if (l4proto->net_id) {
+   struct netns_proto_gre *net_gre;
+
+   net_gre = net_generic(net, *l4proto->net_id);
+   timeouts = net_gre->gre_timeouts;
+   }
 #endif
break;
case 255:
-- 
2.18.1
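
The reason member order matters here is that some generic code treats the
per-netns pointer as if it pointed directly at the embedded nf_proto_net.
That only works when the embedded struct sits at offset 0. The sketch
below shows the effect with hypothetical stand-in types: base_net,
gre_net_bad, gre_net_good and generic_users() are invented for this
illustration and are not kernel definitions.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct nf_proto_net. */
struct base_net {
	int users;
};

/* Layout that breaks the assumption: timeouts placed first. */
struct gre_net_bad {
	unsigned int timeouts[2];
	struct base_net nf;
};

/* Layout restored by the patch: the embedded struct comes first. */
struct gre_net_good {
	struct base_net nf;
	unsigned int timeouts[2];
};

/* Compile-time check in the spirit of the BUILD_BUG_ON() in the patch. */
_Static_assert(offsetof(struct gre_net_good, nf) == 0,
	       "nf must be the first member");

/*
 * Generic code that only knows it was handed a per-netns pointer and
 * assumes a struct base_net lives at its start.
 */
static int generic_users(void *pernet)
{
	return ((struct base_net *)pernet)->users;
}
```

With gre_net_good the generic helper reads the intended field. With
gre_net_bad it would read the first timeout value instead, which is the
kind of misinterpretation the commit message describes as leading to a
crash.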



Re: [PATCH v2] ipv6: Preserve link scope traffic original oif

2018-11-21 Thread Pablo Neira Ayuso
On Wed, Nov 21, 2018 at 01:24:25PM +0100, Pablo Neira Ayuso wrote:
> On Wed, Nov 21, 2018 at 12:17:50PM +0100, Alin Nastac wrote:
> > When ip6_route_me_harder is invoked, it resets outgoing interface of:
> >   - link-local scoped packets sent by neighbor discovery
> >   - multicast packets sent by MLD host
> >   - multicast packets sent by MLD proxy daemon that sets outgoing
> >     interface through IPV6_PKTINFO ipi6_ifindex
> > 
> > Link-local and multicast packets must keep their original oif after
> > ip6_route_me_harder is called.
> 
> Please, resubmit including your Signed-off-by tag.

Or I can append it here, but I need your consent, thanks.

> > ---
> >  net/ipv6/netfilter.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/net/ipv6/netfilter.c b/net/ipv6/netfilter.c
> > index 5ae8e1c..8b075f0 100644
> > --- a/net/ipv6/netfilter.c
> > +++ b/net/ipv6/netfilter.c
> > @@ -24,7 +24,8 @@ int ip6_route_me_harder(struct net *net, struct sk_buff *skb)
> > unsigned int hh_len;
> > struct dst_entry *dst;
> > struct flowi6 fl6 = {
> > -   .flowi6_oif = sk ? sk->sk_bound_dev_if : 0,
> > +   .flowi6_oif = sk && sk->sk_bound_dev_if ? sk->sk_bound_dev_if :
> > +   rt6_need_strict(&iph->daddr) ? skb_dst(skb)->dev->ifindex : 0,
> > .flowi6_mark = skb->mark,
> > .flowi6_uid = sock_net_uid(net, sk),
> > .daddr = iph->daddr,
> > -- 
> > 2.7.4
> > 


Re: [PATCH v2] ipv6: Preserve link scope traffic original oif

2018-11-21 Thread Pablo Neira Ayuso
On Wed, Nov 21, 2018 at 12:17:50PM +0100, Alin Nastac wrote:
> When ip6_route_me_harder is invoked, it resets outgoing interface of:
>   - link-local scoped packets sent by neighbor discovery
>   - multicast packets sent by MLD host
>   - multicast packets sent by MLD proxy daemon that sets outgoing
>     interface through IPV6_PKTINFO ipi6_ifindex
> 
> Link-local and multicast packets must keep their original oif after
> ip6_route_me_harder is called.

Please, resubmit including your Signed-off-by tag.

Thanks!

> ---
>  net/ipv6/netfilter.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/net/ipv6/netfilter.c b/net/ipv6/netfilter.c
> index 5ae8e1c..8b075f0 100644
> --- a/net/ipv6/netfilter.c
> +++ b/net/ipv6/netfilter.c
> @@ -24,7 +24,8 @@ int ip6_route_me_harder(struct net *net, struct sk_buff *skb)
>   unsigned int hh_len;
>   struct dst_entry *dst;
>   struct flowi6 fl6 = {
> - .flowi6_oif = sk ? sk->sk_bound_dev_if : 0,
> + .flowi6_oif = sk && sk->sk_bound_dev_if ? sk->sk_bound_dev_if :
> + rt6_need_strict(&iph->daddr) ? skb_dst(skb)->dev->ifindex : 0,
>   .flowi6_mark = skb->mark,
>   .flowi6_uid = sock_net_uid(net, sk),
>   .daddr = iph->daddr,
> -- 
> 2.7.4
> 


[PATCH v2] ipv6: Preserve link scope traffic original oif

2018-11-21 Thread Alin Nastac
When ip6_route_me_harder is invoked, it resets outgoing interface of:
  - link-local scoped packets sent by neighbor discovery
  - multicast packets sent by MLD host
  - multicast packets sent by MLD proxy daemon that sets outgoing
    interface through IPV6_PKTINFO ipi6_ifindex

Link-local and multicast packets must keep their original oif after
ip6_route_me_harder is called.
---
 net/ipv6/netfilter.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv6/netfilter.c b/net/ipv6/netfilter.c
index 5ae8e1c..8b075f0 100644
--- a/net/ipv6/netfilter.c
+++ b/net/ipv6/netfilter.c
@@ -24,7 +24,8 @@ int ip6_route_me_harder(struct net *net, struct sk_buff *skb)
unsigned int hh_len;
struct dst_entry *dst;
struct flowi6 fl6 = {
-   .flowi6_oif = sk ? sk->sk_bound_dev_if : 0,
+   .flowi6_oif = sk && sk->sk_bound_dev_if ? sk->sk_bound_dev_if :
+   rt6_need_strict(&iph->daddr) ? skb_dst(skb)->dev->ifindex : 0,
.flowi6_mark = skb->mark,
.flowi6_uid = sock_net_uid(net, sk),
.daddr = iph->daddr,
-- 
2.7.4


