Re: netfilter: nft_ct: add zone id set support

2017-02-23 Thread Florian Westphal
Pablo Neira Ayuso  wrote:
> On Thu, Feb 23, 2017 at 12:34:35PM +0100, Florian Westphal wrote:
> > Yes, Dan reported this and a patch is queued at
> > http://patchwork.ozlabs.org/patch/727573/
> > 
> > Pablo, any reason why this is still waiting?
> 
> I just flushing out my nf.git tree via pull request.
> 
> Once these changes are pulled, I'll fetch recent net-next changes that
> were just merged via net. Then, I'll pick this so we can calm down
> these compilation warnings.
> 
> Are you OK with this procedure? Thanks!

Sure.


Re: netfilter: nft_ct: add zone id set support

2017-02-23 Thread Pablo Neira Ayuso
On Thu, Feb 23, 2017 at 12:34:35PM +0100, Florian Westphal wrote:
> Geert Uytterhoeven  wrote:
> > On Wed, Feb 22, 2017 at 8:02 PM, Linux Kernel Mailing List
> >  wrote:
> > > Web:
> > > https://git.kernel.org/torvalds/c/edee4f1e92458299505ff007733f676b00c516a1
> > > Commit: edee4f1e92458299505ff007733f676b00c516a1
> > > Parent: 5c178d81b69f08ca3195427a6ea9a46d9af23127
> > > Refname:refs/heads/master
> > > Author: Florian Westphal 
> > > AuthorDate: Fri Feb 3 13:35:50 2017 +0100
> > > Committer:  Pablo Neira Ayuso 
> > > CommitDate: Wed Feb 8 14:16:23 2017 +0100
> > >
> > Unlike for the other cases of the switch statement, "len" is not initialized
> > here...
> > 
> > > +   break;
> > > priv->sreg = nft_parse_register(tb[NFTA_CT_SREG]);
> > > err = nft_validate_register_load(priv->sreg, len);
> > 
> > ... and used here, which may lead to spurious failures of
> > nft_validate_register_load().
> 
> Yes, Dan reported this and a patch is queued at
> http://patchwork.ozlabs.org/patch/727573/
> 
> Pablo, any reason why this is still waiting?

I just flushing out my nf.git tree via pull request.

Once these changes are pulled, I'll fetch recent net-next changes that
were just merged via net. Then, I'll pick this so we can calm down
these compilation warnings.

Are you OK with this procedure? Thanks!


Re: netfilter: nft_ct: add zone id set support

2017-02-23 Thread Florian Westphal
Geert Uytterhoeven  wrote:
> On Wed, Feb 22, 2017 at 8:02 PM, Linux Kernel Mailing List
>  wrote:
> > Web:
> > https://git.kernel.org/torvalds/c/edee4f1e92458299505ff007733f676b00c516a1
> > Commit: edee4f1e92458299505ff007733f676b00c516a1
> > Parent: 5c178d81b69f08ca3195427a6ea9a46d9af23127
> > Refname:refs/heads/master
> > Author: Florian Westphal 
> > AuthorDate: Fri Feb 3 13:35:50 2017 +0100
> > Committer:  Pablo Neira Ayuso 
> > CommitDate: Wed Feb 8 14:16:23 2017 +0100
> >
> Unlike for the other cases of the switch statement, "len" is not initialized
> here...
> 
> > +   break;
> > priv->sreg = nft_parse_register(tb[NFTA_CT_SREG]);
> > err = nft_validate_register_load(priv->sreg, len);
> 
> ... and used here, which may lead to spurious failures of
> nft_validate_register_load().

Yes, Dan reported this and a patch is queued at
http://patchwork.ozlabs.org/patch/727573/

Pablo, any reason why this is still waiting?
Do you want me to run more tests?



Re: netfilter: nft_ct: add zone id set support

2017-02-23 Thread Geert Uytterhoeven
Hi Florian,

On Wed, Feb 22, 2017 at 8:02 PM, Linux Kernel Mailing List
<linux-ker...@vger.kernel.org> wrote:
> Web:
> https://git.kernel.org/torvalds/c/edee4f1e92458299505ff007733f676b00c516a1
> Commit: edee4f1e92458299505ff007733f676b00c516a1
> Parent: 5c178d81b69f08ca3195427a6ea9a46d9af23127
> Refname:refs/heads/master
> Author: Florian Westphal <f...@strlen.de>
> AuthorDate: Fri Feb 3 13:35:50 2017 +0100
> Committer:  Pablo Neira Ayuso <pa...@netfilter.org>
> CommitDate: Wed Feb 8 14:16:23 2017 +0100
>
> netfilter: nft_ct: add zone id set support
>
> zones allow tracking multiple connections sharing identical tuples,
> this is needed e.g. when tracking distinct vlans with overlapping ip
> addresses (conntrack is l2 agnostic).
>
> Thus the zone has to be set before the packet is picked up by the
> connection tracker.  This is done by means of 'conntrack templates' which
> are conntrack structures used solely to pass this info from one netfilter
> hook to the next.
>
> The iptables CT target instantiates these connection tracking templates
> once per rule, i.e. the template is fixed/tied to particular zone, can
> be read-only and therefore be re-used by as many skbs simultaneously as
> needed.
>
> We can't follow this model because we want to take the zone id from
> an sreg at rule eval time so we could e.g. fill in the zone id from
> the packets vlan id or a e.g. nftables key : value maps.
>
> To avoid cost of per packet alloc/free of the template, use a percpu
> template 'scratch' object and use the refcount to detect the (unlikely)
> case where the template is still attached to another skb (i.e., previous
> skb was nfqueued ...).
>
> Signed-off-by: Florian Westphal <f...@strlen.de>
> Signed-off-by: Pablo Neira Ayuso <pa...@netfilter.org>

> --- a/net/netfilter/nft_ct.c
> +++ b/net/netfilter/nft_ct.c

> @@ -407,6 +503,7 @@ static int nft_ct_set_init(const struct nft_ctx *ctx,
> unsigned int len;

With gcc-4.1.2 and -Os:

net/netfilter/nft_ct.c: In function ‘nft_ct_set_init’:
net/netfilter/nft_ct.c:503: warning: ‘len’ may be used
uninitialized in this function

> int err;
>
> +   priv->dir = IP_CT_DIR_MAX;
> priv->key = ntohl(nla_get_be32(tb[NFTA_CT_KEY]));
> switch (priv->key) {
>  #ifdef CONFIG_NF_CONNTRACK_MARK
> @@ -426,10 +523,28 @@ static int nft_ct_set_init(const struct nft_ctx *ctx,
> return err;
> break;
>  #endif
> +#ifdef CONFIG_NF_CONNTRACK_ZONES
> +   case NFT_CT_ZONE:
> +   if (!nft_ct_tmpl_alloc_pcpu())
> +   return -ENOMEM;
> +   nft_ct_pcpu_template_refcnt++;

Unlike for the other cases of the switch statement, "len" is not initialized
here...

> +   break;
> +#endif
> default:
> return -EOPNOTSUPP;
> }
>
> +   if (tb[NFTA_CT_DIRECTION]) {
> +   priv->dir = nla_get_u8(tb[NFTA_CT_DIRECTION]);
> +   switch (priv->dir) {
> +   case IP_CT_DIR_ORIGINAL:
> +   case IP_CT_DIR_REPLY:
> +   break;
> +   default:
> +   return -EINVAL;
> +   }
> +   }
> +
> priv->sreg = nft_parse_register(tb[NFTA_CT_SREG]);
> err = nft_validate_register_load(priv->sreg, len);

... and used here, which may lead to spurious failures of
nft_validate_register_load().

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


[PATCH 11/21] netfilter: nft_ct: add zone id set support

2017-02-12 Thread Pablo Neira Ayuso
From: Florian Westphal 

zones allow tracking multiple connections sharing identical tuples,
this is needed e.g. when tracking distinct vlans with overlapping ip
addresses (conntrack is l2 agnostic).

Thus the zone has to be set before the packet is picked up by the
connection tracker.  This is done by means of 'conntrack templates' which
are conntrack structures used solely to pass this info from one netfilter
hook to the next.

The iptables CT target instantiates these connection tracking templates
once per rule, i.e. the template is fixed/tied to particular zone, can
be read-only and therefore be re-used by as many skbs simultaneously as
needed.

We can't follow this model because we want to take the zone id from
an sreg at rule eval time so we could e.g. fill in the zone id from
the packets vlan id or a e.g. nftables key : value maps.

To avoid cost of per packet alloc/free of the template, use a percpu
template 'scratch' object and use the refcount to detect the (unlikely)
case where the template is still attached to another skb (i.e., previous
skb was nfqueued ...).

Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
---
 net/netfilter/nft_ct.c | 144 -
 1 file changed, 143 insertions(+), 1 deletion(-)

diff --git a/net/netfilter/nft_ct.c b/net/netfilter/nft_ct.c
index 2d82df2737da..c6b8022c0e47 100644
--- a/net/netfilter/nft_ct.c
+++ b/net/netfilter/nft_ct.c
@@ -32,6 +32,11 @@ struct nft_ct {
};
 };
 
+#ifdef CONFIG_NF_CONNTRACK_ZONES
+static DEFINE_PER_CPU(struct nf_conn *, nft_ct_pcpu_template);
+static unsigned int nft_ct_pcpu_template_refcnt __read_mostly;
+#endif
+
 static u64 nft_ct_get_eval_counter(const struct nf_conn_counter *c,
   enum nft_ct_keys k,
   enum ip_conntrack_dir d)
@@ -191,6 +196,53 @@ static void nft_ct_get_eval(const struct nft_expr *expr,
regs->verdict.code = NFT_BREAK;
 }
 
+#ifdef CONFIG_NF_CONNTRACK_ZONES
+static void nft_ct_set_zone_eval(const struct nft_expr *expr,
+struct nft_regs *regs,
+const struct nft_pktinfo *pkt)
+{
+   struct nf_conntrack_zone zone = { .dir = NF_CT_DEFAULT_ZONE_DIR };
+   const struct nft_ct *priv = nft_expr_priv(expr);
+   struct sk_buff *skb = pkt->skb;
+   enum ip_conntrack_info ctinfo;
+   u16 value = regs->data[priv->sreg];
+   struct nf_conn *ct;
+
+   ct = nf_ct_get(skb, );
+   if (ct) /* already tracked */
+   return;
+
+   zone.id = value;
+
+   switch (priv->dir) {
+   case IP_CT_DIR_ORIGINAL:
+   zone.dir = NF_CT_ZONE_DIR_ORIG;
+   break;
+   case IP_CT_DIR_REPLY:
+   zone.dir = NF_CT_ZONE_DIR_REPL;
+   break;
+   default:
+   break;
+   }
+
+   ct = this_cpu_read(nft_ct_pcpu_template);
+
+   if (likely(atomic_read(>ct_general.use) == 1)) {
+   nf_ct_zone_add(ct, );
+   } else {
+   /* previous skb got queued to userspace */
+   ct = nf_ct_tmpl_alloc(nft_net(pkt), , GFP_ATOMIC);
+   if (!ct) {
+   regs->verdict.code = NF_DROP;
+   return;
+   }
+   }
+
+   atomic_inc(>ct_general.use);
+   nf_ct_set(skb, ct, IP_CT_NEW);
+}
+#endif
+
 static void nft_ct_set_eval(const struct nft_expr *expr,
struct nft_regs *regs,
const struct nft_pktinfo *pkt)
@@ -269,6 +321,45 @@ static void nft_ct_netns_put(struct net *net, uint8_t 
family)
nf_ct_netns_put(net, family);
 }
 
+#ifdef CONFIG_NF_CONNTRACK_ZONES
+static void nft_ct_tmpl_put_pcpu(void)
+{
+   struct nf_conn *ct;
+   int cpu;
+
+   for_each_possible_cpu(cpu) {
+   ct = per_cpu(nft_ct_pcpu_template, cpu);
+   if (!ct)
+   break;
+   nf_ct_put(ct);
+   per_cpu(nft_ct_pcpu_template, cpu) = NULL;
+   }
+}
+
+static bool nft_ct_tmpl_alloc_pcpu(void)
+{
+   struct nf_conntrack_zone zone = { .id = 0 };
+   struct nf_conn *tmp;
+   int cpu;
+
+   if (nft_ct_pcpu_template_refcnt)
+   return true;
+
+   for_each_possible_cpu(cpu) {
+   tmp = nf_ct_tmpl_alloc(_net, , GFP_KERNEL);
+   if (!tmp) {
+   nft_ct_tmpl_put_pcpu();
+   return false;
+   }
+
+   atomic_set(>ct_general.use, 1);
+   per_cpu(nft_ct_pcpu_template, cpu) = tmp;
+   }
+
+   return true;
+}
+#endif
+
 static int nft_ct_get_init(const struct nft_ctx *ctx,
   const struct nft_expr *expr,
   const struct nlattr * const tb[])
@@ -394,6 +485,11 @@ static void __nft_ct_set_destroy(const struct