Re: [PATCH 8/7] net/netfilter/nf_conntrack_core: Remove another memory barrier

2016-09-02 Thread Manfred Spraul

On 09/02/2016 09:22 PM, Peter Zijlstra wrote:

On Fri, Sep 02, 2016 at 08:35:55AM +0200, Manfred Spraul wrote:

On 09/01/2016 06:41 PM, Peter Zijlstra wrote:

On Thu, Sep 01, 2016 at 04:30:39PM +0100, Will Deacon wrote:

On Thu, Sep 01, 2016 at 05:27:52PM +0200, Manfred Spraul wrote:

Since spin_unlock_wait() is defined as equivalent to spin_lock();
spin_unlock(), the memory barrier before spin_unlock_wait() is
also not required.

Note that ACQUIRE+RELEASE isn't a barrier.

Both are semi-permeable and things can cross in the middle, like:


x = 1;
LOCK
UNLOCK
r = y;

can (validly) get re-ordered like:

LOCK
r = y;
x = 1;
UNLOCK

So if you want things ordered, as I think you do, I think the smp_mb()
is still needed.

CPU1:
x=1; /* without WRITE_ONCE */
LOCK(l);
UNLOCK(l);

smp_store_release(x,0)


CPU2;
LOCK(l)
if (smp_load_acquire(x)==1) goto slow_path

UNLOCK(l)

Ordering is enforced because both CPUs access the same lock.

x=1 can't be reordered past the UNLOCK(l), I don't see that further
guarantees are necessary.

Correct?

Correct, sadly implementations do not comply :/ In fact, even x86 is
broken here.

I spoke to Will earlier today and he suggests either making
spin_unlock_wait() stronger to avoids any and all such surprises or just
getting rid of the thing.

I'm not sure which way we should go, but please hold off on these two
patches until I've had a chance to audit all of those implementations
again.

For me, it doesn't really matter.
spin_unlock_wait() as "R", as "RAcq" or as "spin_lock(); spin_lock();" - 
I just want a usable definition for ipc/sem.c


So (just to keep Andrew updated):
Ready for merging is: (Bugfixes, safe just with spin_unlock_wait() as "R"):

- 45a449340cd1 ("ipc/sem.c: fix complex_count vs. simple op race")
  Cc stable, back to 3.10 ...
- 7fd5653d9986 ("net/netfilter/nf_conntrack_core: Fix memory barriers.")
  Cc stable, back to ~4.5

--
Manfred
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 8/7] net/netfilter/nf_conntrack_core: Remove another memory barrier

2016-09-02 Thread Peter Zijlstra
On Fri, Sep 02, 2016 at 08:35:55AM +0200, Manfred Spraul wrote:
> On 09/01/2016 06:41 PM, Peter Zijlstra wrote:
> >On Thu, Sep 01, 2016 at 04:30:39PM +0100, Will Deacon wrote:
> >>On Thu, Sep 01, 2016 at 05:27:52PM +0200, Manfred Spraul wrote:
> >>>Since spin_unlock_wait() is defined as equivalent to spin_lock();
> >>>spin_unlock(), the memory barrier before spin_unlock_wait() is
> >>>also not required.
> >Note that ACQUIRE+RELEASE isn't a barrier.
> >
> >Both are semi-permeable and things can cross in the middle, like:
> >
> >
> > x = 1;
> > LOCK
> > UNLOCK
> > r = y;
> >
> >can (validly) get re-ordered like:
> >
> > LOCK
> > r = y;
> > x = 1;
> > UNLOCK
> >
> >So if you want things ordered, as I think you do, I think the smp_mb()
> >is still needed.
> CPU1:
> x=1; /* without WRITE_ONCE */
> LOCK(l);
> UNLOCK(l);
> 
> smp_store_release(x,0)
> 
> 
> CPU2;
> LOCK(l)
> if (smp_load_acquire(x)==1) goto slow_path
> 
> UNLOCK(l)
> 
> Ordering is enforced because both CPUs access the same lock.
> 
> x=1 can't be reordered past the UNLOCK(l), I don't see that further
> guarantees are necessary.
> 
> Correct?

Correct, sadly implementations do not comply :/ In fact, even x86 is
broken here.

I spoke to Will earlier today and he suggests either making
spin_unlock_wait() stronger to avoids any and all such surprises or just
getting rid of the thing.

I'm not sure which way we should go, but please hold off on these two
patches until I've had a chance to audit all of those implementations
again.

I'll try and have a look at your other patches before that.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH nf-next 1/2] netfilter: nft_quota: fix overquota logic

2016-09-02 Thread Pablo Neira Ayuso
Use xor to decide to break further rule evaluation or not, since the
existing logic doesn't achieve the expected inversion.

Signed-off-by: Pablo Neira Ayuso 
---
 net/netfilter/nft_quota.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/nft_quota.c b/net/netfilter/nft_quota.c
index 6eafbf9..92b6ff1 100644
--- a/net/netfilter/nft_quota.c
+++ b/net/netfilter/nft_quota.c
@@ -33,7 +33,7 @@ static void nft_quota_eval(const struct nft_expr *expr,
 {
struct nft_quota *priv = nft_expr_priv(expr);
 
-   if (nft_quota(priv, pkt) < 0 && !priv->invert)
+   if ((nft_quota(priv, pkt) < 0) ^ priv->invert)
regs->verdict.code = NFT_BREAK;
 }
 
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH nf-next 2/2] netfilter: nft_quota: introduce nft_overquota()

2016-09-02 Thread Pablo Neira Ayuso
This is patch renames the existing function to nft_overquota() and make
it return a boolean that tells us if we have exceeded our byte quota.
Just a cleanup.

Signed-off-by: Pablo Neira Ayuso 
---
 net/netfilter/nft_quota.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/nft_quota.c b/net/netfilter/nft_quota.c
index 92b6ff1..c00104c 100644
--- a/net/netfilter/nft_quota.c
+++ b/net/netfilter/nft_quota.c
@@ -21,10 +21,10 @@ struct nft_quota {
atomic64_t  remain;
 };
 
-static inline long nft_quota(struct nft_quota *priv,
-const struct nft_pktinfo *pkt)
+static inline bool nft_overquota(struct nft_quota *priv,
+const struct nft_pktinfo *pkt)
 {
-   return atomic64_sub_return(pkt->skb->len, &priv->remain);
+   return atomic64_sub_return(pkt->skb->len, &priv->remain) < 0;
 }
 
 static void nft_quota_eval(const struct nft_expr *expr,
@@ -33,7 +33,7 @@ static void nft_quota_eval(const struct nft_expr *expr,
 {
struct nft_quota *priv = nft_expr_priv(expr);
 
-   if ((nft_quota(priv, pkt) < 0) ^ priv->invert)
+   if (nft_overquota(priv, pkt) ^ priv->invert)
regs->verdict.code = NFT_BREAK;
 }
 
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH nf] netfilter: nf_tables_trace: fix endiness when dump chain policy

2016-09-02 Thread Florian Westphal
Liping Zhang  wrote:
> From: Liping Zhang 
> 
> NFTA_TRACE_POLICY attribute is big endian, but we forget to call
> htonl to convert it. Fortunately, this attribute is parsed as big
> endian in libnftnl.

It is however handled as u16, not u32.  Care to send a patch for this as
well?  Thanks!
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2 libnftnl] expr: numgen: Rename until attribute by modulus

2016-09-02 Thread Laura Garcia Liebana
The _modulus_ attribute will be reused as _until_, as it's similar to
other expressions with value limits (ex. hash).

Renaming is possible according to the kernel module ntf_numgen that has
not been released yet.

Signed-off-by: Laura Garcia Liebana 
---
Changes in V2:
- Separate changes with incremental counter offset value.

 include/buffer.h|  1 -
 include/libnftnl/expr.h |  2 +-
 include/linux/netfilter/nf_tables.h |  4 +--
 src/expr/numgen.c   | 54 ++---
 tests/nft-expr_numgen-test.c|  8 +++---
 5 files changed, 34 insertions(+), 35 deletions(-)

diff --git a/include/buffer.h b/include/buffer.h
index 8cfe377..a753c78 100644
--- a/include/buffer.h
+++ b/include/buffer.h
@@ -82,7 +82,6 @@ int nftnl_buf_reg(struct nftnl_buf *b, int type, union 
nftnl_data_reg *reg,
 #define TOTAL  "total"
 #define TYPE   "type"
 #define UNIT   "unit"
-#define UNTIL  "until"
 #define USE"use"
 #define XOR"xor"
 #define ADD"add"
diff --git a/include/libnftnl/expr.h b/include/libnftnl/expr.h
index 8815154..94ce529 100644
--- a/include/libnftnl/expr.h
+++ b/include/libnftnl/expr.h
@@ -52,7 +52,7 @@ enum {
 
 enum {
NFTNL_EXPR_NG_DREG  = NFTNL_EXPR_BASE,
-   NFTNL_EXPR_NG_UNTIL,
+   NFTNL_EXPR_NG_MODULUS,
NFTNL_EXPR_NG_TYPE,
 };
 
diff --git a/include/linux/netfilter/nf_tables.h 
b/include/linux/netfilter/nf_tables.h
index 0b11abf..a3dbac2 100644
--- a/include/linux/netfilter/nf_tables.h
+++ b/include/linux/netfilter/nf_tables.h
@@ -670,13 +670,13 @@ enum nft_exthdr_attributes {
  * enum nft_ng_attributes - nf_tables number generator expression attributes
  *
  * @NFTA_NG_DREG: destination register (NLA_U32)
- * @NFTA_NG_UNTIL: limit value (NLA_U32)
+ * @NFTA_NG_MODULUS: maximum value to be returned (NLA_U32)
  * @NFTA_NG_TYPE: type of operation (NLA_U32)
  */
 enum nft_ng_attributes {
NFTA_NG_UNSPEC,
NFTA_NG_DREG,
-   NFTA_NG_UNTIL,
+   NFTA_NG_MODULUS,
NFTA_NG_TYPE,
__NFTA_NG_MAX
 };
diff --git a/src/expr/numgen.c b/src/expr/numgen.c
index 0669eda..a008c75 100644
--- a/src/expr/numgen.c
+++ b/src/expr/numgen.c
@@ -22,7 +22,7 @@
 
 struct nftnl_expr_ng {
enum nft_registers  dreg;
-   unsigned intuntil;
+   unsigned intmodulus;
enum nft_ng_typetype;
 };
 
@@ -36,8 +36,8 @@ nftnl_expr_ng_set(struct nftnl_expr *e, uint16_t type,
case NFTNL_EXPR_NG_DREG:
ng->dreg = *((uint32_t *)data);
break;
-   case NFTNL_EXPR_NG_UNTIL:
-   ng->until = *((uint32_t *)data);
+   case NFTNL_EXPR_NG_MODULUS:
+   ng->modulus = *((uint32_t *)data);
break;
case NFTNL_EXPR_NG_TYPE:
ng->type = *((uint32_t *)data);
@@ -58,9 +58,9 @@ nftnl_expr_ng_get(const struct nftnl_expr *e, uint16_t type,
case NFTNL_EXPR_NG_DREG:
*data_len = sizeof(ng->dreg);
return &ng->dreg;
-   case NFTNL_EXPR_NG_UNTIL:
-   *data_len = sizeof(ng->until);
-   return &ng->until;
+   case NFTNL_EXPR_NG_MODULUS:
+   *data_len = sizeof(ng->modulus);
+   return &ng->modulus;
case NFTNL_EXPR_NG_TYPE:
*data_len = sizeof(ng->type);
return &ng->type;
@@ -78,7 +78,7 @@ static int nftnl_expr_ng_cb(const struct nlattr *attr, void 
*data)
 
switch (type) {
case NFTA_NG_DREG:
-   case NFTA_NG_UNTIL:
+   case NFTA_NG_MODULUS:
case NFTA_NG_TYPE:
if (mnl_attr_validate(attr, MNL_TYPE_U32) < 0)
abi_breakage();
@@ -96,8 +96,8 @@ nftnl_expr_ng_build(struct nlmsghdr *nlh, const struct 
nftnl_expr *e)
 
if (e->flags & (1 << NFTNL_EXPR_NG_DREG))
mnl_attr_put_u32(nlh, NFTA_NG_DREG, htonl(ng->dreg));
-   if (e->flags & (1 << NFTNL_EXPR_NG_UNTIL))
-   mnl_attr_put_u32(nlh, NFTA_NG_UNTIL, htonl(ng->until));
+   if (e->flags & (1 << NFTNL_EXPR_NG_MODULUS))
+   mnl_attr_put_u32(nlh, NFTA_NG_MODULUS, htonl(ng->modulus));
if (e->flags & (1 << NFTNL_EXPR_NG_TYPE))
mnl_attr_put_u32(nlh, NFTA_NG_TYPE, htonl(ng->type));
 }
@@ -116,9 +116,9 @@ nftnl_expr_ng_parse(struct nftnl_expr *e, struct nlattr 
*attr)
ng->dreg = ntohl(mnl_attr_get_u32(tb[NFTA_NG_DREG]));
e->flags |= (1 << NFTNL_EXPR_NG_DREG);
}
-   if (tb[NFTA_NG_UNTIL]) {
-   ng->until = ntohl(mnl_attr_get_u32(tb[NFTA_NG_UNTIL]));
-   e->flags |= (1 << NFTNL_EXPR_NG_UNTIL);
+   if (tb[NFTA_NG_MODULUS]) {
+   ng->modulus = ntohl(mnl_attr_get_u32(tb[NFTA_NG_MODULUS]));
+   e->flags |= (1 << NFTNL_EXPR_NG_MODULUS);
}
if

[PATCH v2] netfilter: nft_numgen: rename until attribute by modulus

2016-09-02 Thread Laura Garcia Liebana
The _until_ attribute is renamed to _modulus_ as the behaviour is similar to
other expresions with number limits (ex. nft_hash).

Renaming is possible because there isn't a kernel release yet with these
changes.

Signed-off-by: Laura Garcia Liebana 
---
Changes in V2:
- Separate changes with offset support

 include/uapi/linux/netfilter/nf_tables.h |  4 ++--
 net/netfilter/nft_numgen.c   | 30 +++---
 2 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/include/uapi/linux/netfilter/nf_tables.h 
b/include/uapi/linux/netfilter/nf_tables.h
index e941139..366ef7d 100644
--- a/include/uapi/linux/netfilter/nf_tables.h
+++ b/include/uapi/linux/netfilter/nf_tables.h
@@ -1151,13 +1151,13 @@ enum nft_trace_types {
  * attributes
  *
  * @NFTA_NG_DREG: destination register (NLA_U32)
- * @NFTA_NG_UNTIL: source value to increment the counter until reset (NLA_U32)
+ * @NFTA_NG_MODULUS: maximum counter value (NLA_U32)
  * @NFTA_NG_TYPE: operation type (NLA_U32)
  */
 enum nft_ng_attributes {
NFTA_NG_UNSPEC,
NFTA_NG_DREG,
-   NFTA_NG_UNTIL,
+   NFTA_NG_MODULUS,
NFTA_NG_TYPE,
__NFTA_NG_MAX
 };
diff --git a/net/netfilter/nft_numgen.c b/net/netfilter/nft_numgen.c
index e1d5a11..04d14a5 100644
--- a/net/netfilter/nft_numgen.c
+++ b/net/netfilter/nft_numgen.c
@@ -21,7 +21,7 @@ static DEFINE_PER_CPU(struct rnd_state, 
nft_numgen_prandom_state);
 
 struct nft_ng_inc {
enum nft_registers  dreg:8;
-   u32 until;
+   u32 modulus;
atomic_tcounter;
 };
 
@@ -34,7 +34,7 @@ static void nft_ng_inc_eval(const struct nft_expr *expr,
 
do {
oval = atomic_read(&priv->counter);
-   nval = (oval + 1 < priv->until) ? oval + 1 : 0;
+   nval = (oval + 1 < priv->modulus) ? oval + 1 : 0;
} while (atomic_cmpxchg(&priv->counter, oval, nval) != oval);
 
memcpy(®s->data[priv->dreg], &priv->counter, sizeof(u32));
@@ -42,7 +42,7 @@ static void nft_ng_inc_eval(const struct nft_expr *expr,
 
 const struct nla_policy nft_ng_policy[NFTA_NG_MAX + 1] = {
[NFTA_NG_DREG]  = { .type = NLA_U32 },
-   [NFTA_NG_UNTIL] = { .type = NLA_U32 },
+   [NFTA_NG_MODULUS]   = { .type = NLA_U32 },
[NFTA_NG_TYPE]  = { .type = NLA_U32 },
 };
 
@@ -52,8 +52,8 @@ static int nft_ng_inc_init(const struct nft_ctx *ctx,
 {
struct nft_ng_inc *priv = nft_expr_priv(expr);
 
-   priv->until = ntohl(nla_get_be32(tb[NFTA_NG_UNTIL]));
-   if (priv->until == 0)
+   priv->modulus = ntohl(nla_get_be32(tb[NFTA_NG_MODULUS]));
+   if (priv->modulus == 0)
return -ERANGE;
 
priv->dreg = nft_parse_register(tb[NFTA_NG_DREG]);
@@ -67,11 +67,11 @@ static int nft_ng_inc_init(const struct nft_ctx *ctx,
 }
 
 static int nft_ng_dump(struct sk_buff *skb, enum nft_registers dreg,
-  u32 until, enum nft_ng_types type)
+  u32 modulus, enum nft_ng_types type)
 {
if (nft_dump_register(skb, NFTA_NG_DREG, dreg))
goto nla_put_failure;
-   if (nft_dump_register(skb, NFTA_NG_UNTIL, until))
+   if (nft_dump_register(skb, NFTA_NG_MODULUS, modulus))
goto nla_put_failure;
if (nft_dump_register(skb, NFTA_NG_TYPE, type))
goto nla_put_failure;
@@ -86,12 +86,12 @@ static int nft_ng_inc_dump(struct sk_buff *skb, const 
struct nft_expr *expr)
 {
const struct nft_ng_inc *priv = nft_expr_priv(expr);
 
-   return nft_ng_dump(skb, priv->dreg, priv->until, NFT_NG_INCREMENTAL);
+   return nft_ng_dump(skb, priv->dreg, priv->modulus, NFT_NG_INCREMENTAL);
 }
 
 struct nft_ng_random {
enum nft_registers  dreg:8;
-   u32 until;
+   u32 modulus;
 };
 
 static void nft_ng_random_eval(const struct nft_expr *expr,
@@ -102,7 +102,7 @@ static void nft_ng_random_eval(const struct nft_expr *expr,
struct rnd_state *state = this_cpu_ptr(&nft_numgen_prandom_state);
 
regs->data[priv->dreg] = reciprocal_scale(prandom_u32_state(state),
- priv->until);
+ priv->modulus);
 }
 
 static int nft_ng_random_init(const struct nft_ctx *ctx,
@@ -111,8 +111,8 @@ static int nft_ng_random_init(const struct nft_ctx *ctx,
 {
struct nft_ng_random *priv = nft_expr_priv(expr);
 
-   priv->until = ntohl(nla_get_be32(tb[NFTA_NG_UNTIL]));
-   if (priv->until == 0)
+   priv->modulus = ntohl(nla_get_be32(tb[NFTA_NG_MODULUS]));
+   if (priv->modulus == 0)
return -ERANGE;
 
prandom_init_once(&nft_numgen_prandom_state);
@@ -129,7 +129,7 @@ static int nft_ng_random_dump(struct sk_buff *skb, const 
struct nft_expr *expr)
 {
const struct nft_ng_random *priv = nft_expr_pr

Re: [PATCH libnftnl] trace: use get_u32 to parse NFPROTO and POLICY attribute

2016-09-02 Thread Florian Westphal
Liping Zhang  wrote:
> From: Liping Zhang 
> 
> NFTA_TRACE_NFPROTO and NFTA_TRACE_POLICY attribute is 32-bit
> value, so we should use mnl_attr_get_u32 and htonl here.
 
Applied, thanks.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH libnftnl] trace: use get_u32 to parse NFPROTO and POLICY attribute

2016-09-02 Thread Liping Zhang
From: Liping Zhang 

NFTA_TRACE_NFPROTO and NFTA_TRACE_POLICY attribute is 32-bit
value, so we should use mnl_attr_get_u32 and htonl here.

Signed-off-by: Liping Zhang 
---
 src/trace.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/trace.c b/src/trace.c
index 2b3388d..bd05d3c 100644
--- a/src/trace.c
+++ b/src/trace.c
@@ -408,12 +408,12 @@ int nftnl_trace_nlmsg_parse(const struct nlmsghdr *nlh, 
struct nftnl_trace *t)
t->flags |= (1 << NFTNL_TRACE_TRANSPORT_HEADER);
 
if (tb[NFTA_TRACE_NFPROTO]) {
-   t->nfproto = ntohs(mnl_attr_get_u16(tb[NFTA_TRACE_NFPROTO]));
+   t->nfproto = ntohl(mnl_attr_get_u32(tb[NFTA_TRACE_NFPROTO]));
t->flags |= (1 << NFTNL_TRACE_NFPROTO);
}
 
if (tb[NFTA_TRACE_POLICY]) {
-   t->policy = ntohs(mnl_attr_get_u16(tb[NFTA_TRACE_POLICY]));
+   t->policy = ntohl(mnl_attr_get_u32(tb[NFTA_TRACE_POLICY]));
t->flags |= (1 << NFTNL_TRACE_POLICY);
}
 
-- 
2.5.5


--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH nf] netfilter: nf_tables_trace: fix endiness when dump chain policy

2016-09-02 Thread Liping Zhang
From: Liping Zhang 

NFTA_TRACE_POLICY attribute is big endian, but we forget to call
htonl to convert it. Fortunately, this attribute is parsed as big
endian in libnftnl.

Signed-off-by: Liping Zhang 
---
 net/netfilter/nf_tables_trace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/nf_tables_trace.c b/net/netfilter/nf_tables_trace.c
index 39eb1cc..fa24a5b 100644
--- a/net/netfilter/nf_tables_trace.c
+++ b/net/netfilter/nf_tables_trace.c
@@ -237,7 +237,7 @@ void nft_trace_notify(struct nft_traceinfo *info)
break;
case NFT_TRACETYPE_POLICY:
if (nla_put_be32(skb, NFTA_TRACE_POLICY,
-info->basechain->policy))
+htonl(info->basechain->policy)))
goto nla_put_failure;
break;
}
-- 
2.5.5


--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH iptables] extensions: libip6t_SNAT/DNAT: add square bracket in xlat output when port is specified

2016-09-02 Thread Liping Zhang
From: Liping Zhang 

It is better to add square brackets to ip6 address in nft translation
output when the port is specified. This is keep consistent with the
nft syntax.

Before this patch:
  # ip6tables-translate -t nat -A OUTPUT -p tcp -j DNAT --to-destination \
  [123::4]:1
  nft add rule ip6 nat OUTPUT meta l4proto tcp counter dnat to 123::4 :1
  # ip6tables-translate -t nat -A POSTROUTING -p tcp -j SNAT --to-source \
  [123::4-123::8]:1
  nft add rule ip6 nat POSTROUTING meta l4proto tcp counter snat to 
123::4-123::8 :1

Apply this patch:
  # ip6tables-translate -t nat -A OUTPUT -p tcp -j DNAT --to-destination \
  [123::4]:1
  nft add rule ip6 nat OUTPUT meta l4proto tcp counter dnat to [123::4]:1
  # ip6tables-translate -t nat -A POSTROUTING -p tcp -j SNAT --to-source \
  [123::4-123::8]:1
  nft add rule ip6 nat POSTROUTING meta l4proto tcp counter snat to 
[123::4]-[123::8]:1

Signed-off-by: Liping Zhang 
---
 extensions/libip6t_DNAT.c | 21 ++---
 extensions/libip6t_SNAT.c | 21 ++---
 2 files changed, 28 insertions(+), 14 deletions(-)

diff --git a/extensions/libip6t_DNAT.c b/extensions/libip6t_DNAT.c
index 97a8b1c..08d920d 100644
--- a/extensions/libip6t_DNAT.c
+++ b/extensions/libip6t_DNAT.c
@@ -234,17 +234,24 @@ static void DNAT_save(const void *ip, const struct 
xt_entry_target *target)
 static void print_range_xlate(const struct nf_nat_range *range,
  struct xt_xlate *xl)
 {
+   bool proto_specified = range->flags & NF_NAT_RANGE_PROTO_SPECIFIED;
+
if (range->flags & NF_NAT_RANGE_MAP_IPS) {
-   xt_xlate_add(xl, "%s",
-  xtables_ip6addr_to_numeric(&range->min_addr.in6));
+   xt_xlate_add(xl, "%s%s%s",
+proto_specified ? "[" : "",
+xtables_ip6addr_to_numeric(&range->min_addr.in6),
+proto_specified ? "]" : "");
 
if (memcmp(&range->min_addr, &range->max_addr,
-  sizeof(range->min_addr)))
-   xt_xlate_add(xl, "-%s",
-xtables_ip6addr_to_numeric(&range->max_addr.in6));
+  sizeof(range->min_addr))) {
+   xt_xlate_add(xl, "-%s%s%s",
+proto_specified ? "[" : "",
+
xtables_ip6addr_to_numeric(&range->max_addr.in6),
+proto_specified ? "]" : "");
+   }
}
-   if (range->flags & NF_NAT_RANGE_PROTO_SPECIFIED) {
-   xt_xlate_add(xl, " :%hu", ntohs(range->min_proto.tcp.port));
+   if (proto_specified) {
+   xt_xlate_add(xl, ":%hu", ntohs(range->min_proto.tcp.port));
 
if (range->max_proto.tcp.port != range->min_proto.tcp.port)
xt_xlate_add(xl, "-%hu",
diff --git a/extensions/libip6t_SNAT.c b/extensions/libip6t_SNAT.c
index c3d8190..671ac61 100644
--- a/extensions/libip6t_SNAT.c
+++ b/extensions/libip6t_SNAT.c
@@ -244,17 +244,24 @@ static void SNAT_save(const void *ip, const struct 
xt_entry_target *target)
 static void print_range_xlate(const struct nf_nat_range *range,
  struct xt_xlate *xl)
 {
+   bool proto_specified = range->flags & NF_NAT_RANGE_PROTO_SPECIFIED;
+
if (range->flags & NF_NAT_RANGE_MAP_IPS) {
-   xt_xlate_add(xl, "%s",
-  xtables_ip6addr_to_numeric(&range->min_addr.in6));
+   xt_xlate_add(xl, "%s%s%s",
+proto_specified ? "[" : "",
+xtables_ip6addr_to_numeric(&range->min_addr.in6),
+proto_specified ? "]" : "");
 
if (memcmp(&range->min_addr, &range->max_addr,
-  sizeof(range->min_addr)))
-   xt_xlate_add(xl, "-%s",
-xtables_ip6addr_to_numeric(&range->max_addr.in6));
+  sizeof(range->min_addr))) {
+   xt_xlate_add(xl, "-%s%s%s",
+proto_specified ? "[" : "",
+
xtables_ip6addr_to_numeric(&range->max_addr.in6),
+proto_specified ? "]" : "");
+   }
}
-   if (range->flags & NF_NAT_RANGE_PROTO_SPECIFIED) {
-   xt_xlate_add(xl, " :%hu", ntohs(range->min_proto.tcp.port));
+   if (proto_specified) {
+   xt_xlate_add(xl, ":%hu", ntohs(range->min_proto.tcp.port));
 
if (range->max_proto.tcp.port != range->min_proto.tcp.port)
xt_xlate_add(xl, "-%hu",
-- 
2.5.5


--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: nfqueue & bridge netfilter considered broken

2016-09-02 Thread Florian Westphal
Pablo Neira Ayuso  wrote:
> On Fri, Sep 02, 2016 at 11:58:53AM +0200, Pablo Neira Ayuso wrote:
> > On Fri, Sep 02, 2016 at 11:08:48AM +0200, Florian Westphal wrote:
> > > I - discard extra nfct entry when cloning.  Works, but obviously not
> > >  compatible in any way (the clones are INVALID).
> > 
> > This approach is simple and it would only break when packets are
> > flooded to all ports, actually this is not working anyway because of
> > clashes at confirm, right?
> 
> Hm, what about attaching the notrack conntrack for this case?

This is what Patrick said last time this came up (source:
http://marc.info/?l=netfilter-devel&m=131471329004889&w=2 ):

"I don't think the clones should have invalid state, even untracked is
very questionable since all packets should have NAT applied to them in
the same way, connmarks might be used etc.

We probably need to restore the above mentioned assumption somehow. One
way would be to serialize reinjection of packets belonging to
unconfirmed conntracks in nf_reinject or the queueing modules. Conntrack
related stuff doesn't really belong there, but it seems like the easiest
and safest fix to me."

As for bridge conntrack, this is indeed a good question.

Seems we will need to register a dedicated conntrack bridge hook that
takes care of uncloning in FORWARD hook, i.e. add a hook in FORWARD
that makes a deep copy of all unconfirmed conntracks if skb is cloned,
and (once skb reaches nf_confirm) do a non-destructive clash resolution
(accept instead of drop of the clashing entries should be enough).

We have to sacrifice another status bit for this, or perhaps add a
bridge conntrack extension to store such a clash hint though.

Any other idea?
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH libnftnl] include: resync nf_tables.h cache copy

2016-09-02 Thread Pablo Neira Ayuso
Sync this with the kernel header file we currently have in tree.

This patch addresses the compilation warning and breakage as result of
this header update, specifically the "attibute" typo in trace and
missing default case in expr/numgen.c.

Signed-off-by: Pablo Neira Ayuso 
---
 include/linux/netfilter/nf_tables.h | 84 ++---
 src/expr/numgen.c   |  4 +-
 src/trace.c |  2 +-
 3 files changed, 44 insertions(+), 46 deletions(-)

diff --git a/include/linux/netfilter/nf_tables.h 
b/include/linux/netfilter/nf_tables.h
index 0b11abf..8a63f22 100644
--- a/include/linux/netfilter/nf_tables.h
+++ b/include/linux/netfilter/nf_tables.h
@@ -667,33 +667,6 @@ enum nft_exthdr_attributes {
 #define NFTA_EXTHDR_MAX(__NFTA_EXTHDR_MAX - 1)
 
 /**
- * enum nft_ng_attributes - nf_tables number generator expression attributes
- *
- * @NFTA_NG_DREG: destination register (NLA_U32)
- * @NFTA_NG_UNTIL: limit value (NLA_U32)
- * @NFTA_NG_TYPE: type of operation (NLA_U32)
- */
-enum nft_ng_attributes {
-   NFTA_NG_UNSPEC,
-   NFTA_NG_DREG,
-   NFTA_NG_UNTIL,
-   NFTA_NG_TYPE,
-   __NFTA_NG_MAX
-};
-#define NFTA_NG_MAX(__NFTA_NG_MAX - 1)
-
-/**
- * enum nft_ng_type - nf_tables number generator expression reject types
- *
- * @NFT_NG_INCREMENTAL: Incremental number generator
- * @NFT_NG_RANDOM: Random number generator
- */
-enum nft_ng_type {
-   NFT_NG_INCREMENTAL,
-   NFT_NG_RANDOM
-};
-
-/**
  * enum nft_meta_keys - nf_tables meta expression keys
  *
  * @NFT_META_LEN: packet length (skb->len)
@@ -751,6 +724,26 @@ enum nft_meta_keys {
 };
 
 /**
+ * enum nft_hash_attributes - nf_tables hash expression netlink attributes
+ *
+ * @NFTA_HASH_SREG: source register (NLA_U32)
+ * @NFTA_HASH_DREG: destination register (NLA_U32)
+ * @NFTA_HASH_LEN: source data length (NLA_U32)
+ * @NFTA_HASH_MODULUS: modulus value (NLA_U32)
+ * @NFTA_HASH_SEED: seed value (NLA_U32)
+ */
+enum nft_hash_attributes {
+   NFTA_HASH_UNSPEC,
+   NFTA_HASH_SREG,
+   NFTA_HASH_DREG,
+   NFTA_HASH_LEN,
+   NFTA_HASH_MODULUS,
+   NFTA_HASH_SEED,
+   __NFTA_HASH_MAX,
+};
+#define NFTA_HASH_MAX  (__NFTA_HASH_MAX - 1)
+
+/**
  * enum nft_meta_attributes - nf_tables meta expression netlink attributes
  *
  * @NFTA_META_DREG: destination register (NLA_U32)
@@ -908,7 +901,7 @@ enum nft_queue_attributes {
 #define NFT_QUEUE_FLAG_MASK0x03
 
 enum nft_quota_flags {
-   NFT_QUOTA_F_INV = (1 << 0),
+   NFT_QUOTA_F_INV = (1 << 0),
 };
 
 /**
@@ -1097,7 +1090,7 @@ enum nft_gen_attributes {
  * @NFTA_TRACE_NFPROTO: nf protocol processed (NLA_U32)
  * @NFTA_TRACE_POLICY: policy that decided fate of packet (NLA_U32)
  */
-enum nft_trace_attibutes {
+enum nft_trace_attributes {
NFTA_TRACE_UNSPEC,
NFTA_TRACE_TABLE,
NFTA_TRACE_CHAIN,
@@ -1130,23 +1123,26 @@ enum nft_trace_types {
 #define NFT_TRACETYPE_MAX (__NFT_TRACETYPE_MAX - 1)
 
 /**
- * enum nft_hash_attributes - nf_tables hash expression attributes
+ * enum nft_ng_attributes - nf_tables number generator expression netlink 
attributes
  *
- * @NFTA_HASH_SREG: source register (NLA_U32)
- * @NFTA_HASH_DREG: destination register (NLA_U32)
- * @NFTA_HASH_LEN: data length (NLA_U32)
- * @NFTA_HASH_MODULUS: Modulus value (NLA_U32)
- * @NFTA_HASH_SEED: hash initial value (NLA_U32)
+ * @NFTA_NG_DREG: destination register (NLA_U32)
+ * @NFTA_NG_UNTIL: source value to increment the counter until reset (NLA_U32)
+ * @NFTA_NG_TYPE: operation type (NLA_U32)
  */
-enum nft_hash_attributes {
-   NFTA_HASH_UNSPEC,
-   NFTA_HASH_SREG,
-   NFTA_HASH_DREG,
-   NFTA_HASH_LEN,
-   NFTA_HASH_MODULUS,
-   NFTA_HASH_SEED,
-   __NFTA_HASH_MAX
+enum nft_ng_attributes {
+   NFTA_NG_UNSPEC,
+   NFTA_NG_DREG,
+   NFTA_NG_UNTIL,
+   NFTA_NG_TYPE,
+   __NFTA_NG_MAX
+};
+#define NFTA_NG_MAX(__NFTA_NG_MAX - 1)
+
+enum nft_ng_types {
+   NFT_NG_INCREMENTAL,
+   NFT_NG_RANDOM,
+   __NFT_NG_MAX
 };
-#define NFTA_HASH_MAX  (__NFTA_HASH_MAX - 1)
+#define NFT_NG_MAX (__NFT_NG_MAX - 1)
 
 #endif /* _LINUX_NF_TABLES_H */
diff --git a/src/expr/numgen.c b/src/expr/numgen.c
index 0669eda..7f2b425 100644
--- a/src/expr/numgen.c
+++ b/src/expr/numgen.c
@@ -23,7 +23,7 @@
 struct nftnl_expr_ng {
enum nft_registers  dreg;
unsigned intuntil;
-   enum nft_ng_typetype;
+   enum nft_ng_types   type;
 };
 
 static int
@@ -200,6 +200,8 @@ nftnl_expr_ng_snprintf_default(char *buf, size_t size,
   ng->until);
SNPRINTF_BUFFER_SIZE(ret, size, len, offset);
break;
+   default:
+   break;
}
 
return offset;
diff --git a/src/trace.c b/src/trace.c
index 1a50390..2b3388d 100644
--- a/src/trace.c
+++ b/src/trace.c
@@ -79,7 +79,7 @@ bool nftnl_trace_is_set(const struct nftnl

Re: nfqueue & bridge netfilter considered broken

2016-09-02 Thread Pablo Neira Ayuso
On Fri, Sep 02, 2016 at 11:58:53AM +0200, Pablo Neira Ayuso wrote:
> On Fri, Sep 02, 2016 at 11:08:48AM +0200, Florian Westphal wrote:
> > I - discard extra nfct entry when cloning.  Works, but obviously not
> >  compatible in any way (the clones are INVALID).
> 
> This approach is simple and it would only break when packets are
> flooded to all ports, actually this is not working anyway because of
> clashes at confirm, right?

Hm, what about attaching the notrack conntrack for this case?
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: nfqueue & bridge netfilter considered broken

2016-09-02 Thread Pablo Neira Ayuso
On Fri, Sep 02, 2016 at 11:08:48AM +0200, Florian Westphal wrote:
> I - discard extra nfct entry when cloning.  Works, but obviously not
>  compatible in any way (the clones are INVALID).

This approach is simple and it would only break when packets are
flooded to all ports, actually this is not working anyway because of
clashes at confirm, right?
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] netfilter: nft_numgen: add counter offset value and rename until by modulus

2016-09-02 Thread Pablo Neira Ayuso
On Fri, Sep 02, 2016 at 10:39:37AM +0200, Laura Garcia Liebana wrote:
> Add support for an initialization counter value. With this option the
> sysadmin is able to start the counter when used with the increment
> type.
> 
> Example:
> 
>   meta mark set numgen inc mod 2 sum 100
> 
> This will generate marks with the serie 100, 101, 100, 101, ...
> 
> The _until_ attribute is renamed to _modulus_ as the behaviour is similar to
> other expresions with number limits(ex. nft_hash).
> 
> Signed-off-by: Laura Garcia Liebana 
> ---
>  include/uapi/linux/netfilter/nf_tables.h |  6 --
>  net/netfilter/nft_numgen.c   | 37 
> ++--
>  2 files changed, 25 insertions(+), 18 deletions(-)
> 
> diff --git a/include/uapi/linux/netfilter/nf_tables.h 
> b/include/uapi/linux/netfilter/nf_tables.h
> index e941139..4dbeeed 100644
> --- a/include/uapi/linux/netfilter/nf_tables.h
> +++ b/include/uapi/linux/netfilter/nf_tables.h
> @@ -1151,14 +1151,16 @@ enum nft_trace_types {
>   *   attributes
>   *
>   * @NFTA_NG_DREG: destination register (NLA_U32)
> - * @NFTA_NG_UNTIL: source value to increment the counter until reset 
> (NLA_U32)
> + * @NFTA_NG_MODULUS: maximum counter value (NLA_U32)
>   * @NFTA_NG_TYPE: operation type (NLA_U32)
> + * @NFTA_NG_SUM: Offset to initiate the counter (NLA_U32)

This needs two different patches.

In general, if you describe two things that don't relate each other in
your patch, then this is asking for two patches.

On top of that, if you split patches in logical pieces, you usually
make it easier for the reviewers to spot problems.

BTW, please include in your description that this rename is possible
because we didn't release any kernel version with these changes yet,
that's why this is still possible.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


nfqueue & bridge netfilter considered broken

2016-09-02 Thread Florian Westphal
Hi.

This is a note to summarize state of bridge + br_netfilter + nfqueue.

TL;DR: I am giving up.  I see no way to fix this in a sane fashion.

What I tried:
I - discard extra nfct entry when cloning.  Works, but obviously not
 compatible in any way (the clones are INVALID).

II - add locking where needed.  It gets out of hand rather quickly,
 it is also pushed on everyone else rather than just those places
 that cause clones with unconfirmed conntrack entries (only places
 that i know of is mcast routing and the bridge).

III - serialize on reinject.
 Idea was to change nfqnl_recv_verdict() to (instead of nf_reinject() )
 call a wrapper, do_nf_reinject, which essentially does:

static void do_nf_reinject(struct nf_queue_entry *entry, unsigned int verdict)
{
#if IS_ENABLED(NF_CONNTRACK)
   struct nf_conn *ct = entry->skb->nfct;

  if (!ct || nf_ct_is_confirmed(ct)) {
   nf_reinject(entry, verdict);
   return 0;
  }

  if (skb_cloned(entry->skb)) {
  module_get(THIS_MODULE);
  push_to_work_queue_for_reinject(entry, verdict);
  return;
  }
#endif
  nf_reinject(entry, verdict);
}

The work queue then makes sure that we feed such skb one-by-one.

Aside from this being ugly as f there are two problems:
 1. skb_cloned() being true does NOT mean that we have to serialize, skb->nfct 
might
be unique after all, i.e. we again end up pushing work on users/cases
where this might not be needed at all.

 2. We can't know if all clones get queued via nfqueue, so we could have 
something like:
a. clone is created
b. this clone is sent to userspace for queueing
c. another clone is created
d. this clone is NOT queued to userspace and about to continue traversal
e. at same time, userspace reinjects the older clone on other cpu
-> we race again despite such 'serializer'.

This leaves option number IV:
 - when cloning in bridge, also make a full copy of ->nfct.
 This has 2 issues:
   1. dependency from bridge to conntrack (yuck)
   2. confirmation will find clash and toss the clones ...
... so we will need more clash resolution hacks (double-yuck).

At this point I am considering submitting a patch to
remove nfqueue support for bridge netfilter altogether.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH libnftnl] expr: numgen: add counter offset value and rename until by modulus

2016-09-02 Thread Laura Garcia Liebana
Add support to pass through an offset value to the counter
initialization. With this feature, the sysadmin is able to send an
started value for the counter.

Example:

meta mark set numgen inc mod 2 sum 100

This will generate marks with series 100, 101, 100, 101, ...

The _modulus_ attribute will be reused as _until_, as it's similar to
other expressions with value limits (ex. hash).

Signed-off-by: Laura Garcia Liebana 
---
 include/buffer.h|  2 +-
 include/libnftnl/expr.h |  3 +-
 include/linux/netfilter/nf_tables.h |  6 ++-
 src/expr/numgen.c   | 83 -
 tests/nft-expr_numgen-test.c| 12 --
 5 files changed, 70 insertions(+), 36 deletions(-)

diff --git a/include/buffer.h b/include/buffer.h
index 8cfe377..448cccf 100644
--- a/include/buffer.h
+++ b/include/buffer.h
@@ -78,11 +78,11 @@ int nftnl_buf_reg(struct nftnl_buf *b, int type, union 
nftnl_data_reg *reg,
 #define SREG_KEY   "sreg_key"
 #define SREG_DATA  "sreg_data"
 #define SREG   "sreg"
+#define SUM"sum"
 #define TABLE  "table"
 #define TOTAL  "total"
 #define TYPE   "type"
 #define UNIT   "unit"
-#define UNTIL  "until"
 #define USE"use"
 #define XOR"xor"
 #define ADD"add"
diff --git a/include/libnftnl/expr.h b/include/libnftnl/expr.h
index 8815154..3cf0db1 100644
--- a/include/libnftnl/expr.h
+++ b/include/libnftnl/expr.h
@@ -52,8 +52,9 @@ enum {
 
 enum {
NFTNL_EXPR_NG_DREG  = NFTNL_EXPR_BASE,
-   NFTNL_EXPR_NG_UNTIL,
+   NFTNL_EXPR_NG_MODULUS,
NFTNL_EXPR_NG_TYPE,
+   NFTNL_EXPR_NG_SUM,
 };
 
 enum {
diff --git a/include/linux/netfilter/nf_tables.h 
b/include/linux/netfilter/nf_tables.h
index 0b11abf..2718832 100644
--- a/include/linux/netfilter/nf_tables.h
+++ b/include/linux/netfilter/nf_tables.h
@@ -670,14 +670,16 @@ enum nft_exthdr_attributes {
  * enum nft_ng_attributes - nf_tables number generator expression attributes
  *
  * @NFTA_NG_DREG: destination register (NLA_U32)
- * @NFTA_NG_UNTIL: limit value (NLA_U32)
+ * @NFTA_NG_MODULUS: maximum value to be returned (NLA_U32)
  * @NFTA_NG_TYPE: type of operation (NLA_U32)
+ * @NFTA_NG_SUM: offset to be added to the counter (NLA_U32)
  */
 enum nft_ng_attributes {
NFTA_NG_UNSPEC,
NFTA_NG_DREG,
-   NFTA_NG_UNTIL,
+   NFTA_NG_MODULUS,
NFTA_NG_TYPE,
+   NFTA_NG_SUM,
__NFTA_NG_MAX
 };
 #define NFTA_NG_MAX(__NFTA_NG_MAX - 1)
diff --git a/src/expr/numgen.c b/src/expr/numgen.c
index 0669eda..c119d5f 100644
--- a/src/expr/numgen.c
+++ b/src/expr/numgen.c
@@ -22,8 +22,9 @@
 
 struct nftnl_expr_ng {
enum nft_registers  dreg;
-   unsigned intuntil;
+   unsigned intmodulus;
enum nft_ng_typetype;
+   unsigned intsum;
 };
 
 static int
@@ -36,12 +37,15 @@ nftnl_expr_ng_set(struct nftnl_expr *e, uint16_t type,
case NFTNL_EXPR_NG_DREG:
ng->dreg = *((uint32_t *)data);
break;
-   case NFTNL_EXPR_NG_UNTIL:
-   ng->until = *((uint32_t *)data);
+   case NFTNL_EXPR_NG_MODULUS:
+   ng->modulus = *((uint32_t *)data);
break;
case NFTNL_EXPR_NG_TYPE:
ng->type = *((uint32_t *)data);
break;
+   case NFTNL_EXPR_NG_SUM:
+   ng->sum = *((uint32_t *)data);
+   break;
default:
return -1;
}
@@ -58,12 +62,15 @@ nftnl_expr_ng_get(const struct nftnl_expr *e, uint16_t type,
case NFTNL_EXPR_NG_DREG:
*data_len = sizeof(ng->dreg);
return &ng->dreg;
-   case NFTNL_EXPR_NG_UNTIL:
-   *data_len = sizeof(ng->until);
-   return &ng->until;
+   case NFTNL_EXPR_NG_MODULUS:
+   *data_len = sizeof(ng->modulus);
+   return &ng->modulus;
case NFTNL_EXPR_NG_TYPE:
*data_len = sizeof(ng->type);
return &ng->type;
+   case NFTNL_EXPR_NG_SUM:
+   *data_len = sizeof(ng->sum);
+   return &ng->sum;
}
return NULL;
 }
@@ -78,8 +85,9 @@ static int nftnl_expr_ng_cb(const struct nlattr *attr, void 
*data)
 
switch (type) {
case NFTA_NG_DREG:
-   case NFTA_NG_UNTIL:
+   case NFTA_NG_MODULUS:
case NFTA_NG_TYPE:
+   case NFTA_NG_SUM:
if (mnl_attr_validate(attr, MNL_TYPE_U32) < 0)
abi_breakage();
break;
@@ -96,10 +104,12 @@ nftnl_expr_ng_build(struct nlmsghdr *nlh, const struct 
nftnl_expr *e)
 
if (e->flags & (1 << NFTNL_EXPR_NG_DREG))
mnl_attr_put_u32(nlh, NFTA_NG_DREG, htonl(ng->dreg));
-   if (e->flags & (1 << NFTNL_EXPR

[PATCH] netfilter: nft_numgen: add counter offset value and rename until by modulus

2016-09-02 Thread Laura Garcia Liebana
Add support for an initialization counter value. With this option the
sysadmin is able to start the counter when used with the increment
type.

Example:

meta mark set numgen inc mod 2 sum 100

This will generate marks with the serie 100, 101, 100, 101, ...

The _until_ attribute is renamed to _modulus_ as the behaviour is similar to
other expresions with number limits(ex. nft_hash).

Signed-off-by: Laura Garcia Liebana 
---
 include/uapi/linux/netfilter/nf_tables.h |  6 --
 net/netfilter/nft_numgen.c   | 37 ++--
 2 files changed, 25 insertions(+), 18 deletions(-)

diff --git a/include/uapi/linux/netfilter/nf_tables.h 
b/include/uapi/linux/netfilter/nf_tables.h
index e941139..4dbeeed 100644
--- a/include/uapi/linux/netfilter/nf_tables.h
+++ b/include/uapi/linux/netfilter/nf_tables.h
@@ -1151,14 +1151,16 @@ enum nft_trace_types {
  * attributes
  *
  * @NFTA_NG_DREG: destination register (NLA_U32)
- * @NFTA_NG_UNTIL: source value to increment the counter until reset (NLA_U32)
+ * @NFTA_NG_MODULUS: maximum counter value (NLA_U32)
  * @NFTA_NG_TYPE: operation type (NLA_U32)
+ * @NFTA_NG_SUM: Offset to initiate the counter (NLA_U32)
  */
 enum nft_ng_attributes {
NFTA_NG_UNSPEC,
NFTA_NG_DREG,
-   NFTA_NG_UNTIL,
+   NFTA_NG_MODULUS,
NFTA_NG_TYPE,
+   NFTA_NG_SUM,
__NFTA_NG_MAX
 };
 #define NFTA_NG_MAX(__NFTA_NG_MAX - 1)
diff --git a/net/netfilter/nft_numgen.c b/net/netfilter/nft_numgen.c
index e1d5a11..bfde6d0 100644
--- a/net/netfilter/nft_numgen.c
+++ b/net/netfilter/nft_numgen.c
@@ -21,7 +21,7 @@ static DEFINE_PER_CPU(struct rnd_state, 
nft_numgen_prandom_state);
 
 struct nft_ng_inc {
enum nft_registers  dreg:8;
-   u32 until;
+   u32 modulus;
atomic_tcounter;
 };
 
@@ -34,7 +34,7 @@ static void nft_ng_inc_eval(const struct nft_expr *expr,
 
do {
oval = atomic_read(&priv->counter);
-   nval = (oval + 1 < priv->until) ? oval + 1 : 0;
+   nval = (oval + 1 < priv->modulus) ? oval + 1 : 0;
} while (atomic_cmpxchg(&priv->counter, oval, nval) != oval);
 
memcpy(®s->data[priv->dreg], &priv->counter, sizeof(u32));
@@ -42,8 +42,9 @@ static void nft_ng_inc_eval(const struct nft_expr *expr,
 
 const struct nla_policy nft_ng_policy[NFTA_NG_MAX + 1] = {
[NFTA_NG_DREG]  = { .type = NLA_U32 },
-   [NFTA_NG_UNTIL] = { .type = NLA_U32 },
+   [NFTA_NG_MODULUS]   = { .type = NLA_U32 },
[NFTA_NG_TYPE]  = { .type = NLA_U32 },
+   [NFTA_NG_SUM]   = { .type = NLA_U32 },
 };
 
 static int nft_ng_inc_init(const struct nft_ctx *ctx,
@@ -51,27 +52,31 @@ static int nft_ng_inc_init(const struct nft_ctx *ctx,
   const struct nlattr * const tb[])
 {
struct nft_ng_inc *priv = nft_expr_priv(expr);
+   u32 sum = 0;
 
-   priv->until = ntohl(nla_get_be32(tb[NFTA_NG_UNTIL]));
-   if (priv->until == 0)
+   if (tb[NFTA_NG_SUM])
+   sum = ntohl(nla_get_be32(tb[NFTA_NG_SUM]));
+
+   priv->modulus = ntohl(nla_get_be32(tb[NFTA_NG_MODULUS]));
+   if (priv->modulus == 0 || sum >= priv->modulus)
return -ERANGE;
 
priv->dreg = nft_parse_register(tb[NFTA_NG_DREG]);
if (priv->dreg < 0)
return -ERANGE;
 
-   atomic_set(&priv->counter, 0);
+   atomic_set(&priv->counter, sum);
 
return nft_validate_register_store(ctx, priv->dreg, NULL,
   NFT_DATA_VALUE, sizeof(u32));
 }
 
 static int nft_ng_dump(struct sk_buff *skb, enum nft_registers dreg,
-  u32 until, enum nft_ng_types type)
+  u32 modulus, enum nft_ng_types type)
 {
if (nft_dump_register(skb, NFTA_NG_DREG, dreg))
goto nla_put_failure;
-   if (nft_dump_register(skb, NFTA_NG_UNTIL, until))
+   if (nft_dump_register(skb, NFTA_NG_MODULUS, modulus))
goto nla_put_failure;
if (nft_dump_register(skb, NFTA_NG_TYPE, type))
goto nla_put_failure;
@@ -86,12 +91,12 @@ static int nft_ng_inc_dump(struct sk_buff *skb, const 
struct nft_expr *expr)
 {
const struct nft_ng_inc *priv = nft_expr_priv(expr);
 
-   return nft_ng_dump(skb, priv->dreg, priv->until, NFT_NG_INCREMENTAL);
+   return nft_ng_dump(skb, priv->dreg, priv->modulus, NFT_NG_INCREMENTAL);
 }
 
 struct nft_ng_random {
enum nft_registers  dreg:8;
-   u32 until;
+   u32 modulus;
 };
 
 static void nft_ng_random_eval(const struct nft_expr *expr,
@@ -102,7 +107,7 @@ static void nft_ng_random_eval(const struct nft_expr *expr,
struct rnd_state *state = this_cpu_ptr(&nft_numgen_prandom_state);
 
regs->data[priv->dreg] = reciprocal_scale(prandom_u32_state(state),
-  

Re: [PATCH v2 1/2 nf] netfilter: seqadj: Fix one possible panic in seqadj when mem is exhausted

2016-09-02 Thread Gao Feng
Hi Florian,

On Fri, Sep 2, 2016 at 2:59 PM, Florian Westphal  wrote:
> f...@ikuai8.com  wrote:
>> From: Gao Feng 
>>
>> When memory is exhausted, nfct_seqadj_ext_add may fail to add the seqadj
>> extension. But the function nf_ct_seqadj_init doesn't check if get valid
>> seqadj pointer by the nfct_seqadj, while other functions perform the
>> sanity check.
>>
>> So the system would be panic when nfct_seqadj_ext_add failed.
>>
>> Signed-off-by: Gao Feng 
>
>> diff --git a/net/netfilter/nf_conntrack_seqadj.c 
>> b/net/netfilter/nf_conntrack_seqadj.c
>> index dff0f0c..2c8e201 100644
>> --- a/net/netfilter/nf_conntrack_seqadj.c
>> +++ b/net/netfilter/nf_conntrack_seqadj.c
>> @@ -16,9 +16,14 @@ int nf_ct_seqadj_init(struct nf_conn *ct, enum 
>> ip_conntrack_info ctinfo,
>>   if (off == 0)
>>   return 0;
>>
>> + seqadj = nfct_seqadj(ct);
>> + if (unlikely(!seqadj)) {
>> + WARN_ONCE(1, "Missing nfct_seqadj_ext_add() setup call\n");
>> + return 0;
>> + }
>> +
>
> Not sure this WARN() is really needed, I would remove it (since its most
> likely only missing due to memory shortage).
>
> Other than that, this looks good.
>

I prefer to keep the warning, although it only happens caused by mem
shortage now.
But it would give the accurate description, if the nfct_seqadj_ext_add
was lost when new features were depending on synadj.


Best Regards
Feng


--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html