Re: [PATCH nf] netfilter: core: remove erroneous warn_on

2017-09-06 Thread Aaron Conole
Florian Westphal <f...@strlen.de> writes:

> kernel test robot reported:
>
> WARNING: CPU: 0 PID: 1244 at net/netfilter/core.c:218 
> __nf_hook_entries_try_shrink+0x49/0xcd
> [..]
>
> After allowing batching in nf_unregister_net_hooks its possible that an 
> earlier
> call to __nf_hook_entries_try_shrink already compacted the list.
> If this happens we don't need to do anything.
>
> Fixes: d3ad2c17b4047 ("netfilter: core: batch nf_unregister_net_hooks 
> synchronize_net calls")
> Reported-by: kernel test robot <xiaolong...@intel.com>
> Signed-off-by: Florian Westphal <f...@strlen.de>
> ---


Acked-by: Aaron Conole <acon...@bytheb.org>
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH nf-next 1/3] netfilter: convert hook list to an array

2017-08-23 Thread Aaron Conole
Eric Dumazet <eric.duma...@gmail.com> writes:

> On Wed, 2017-08-23 at 17:26 +0200, Florian Westphal wrote:
>> From: Aaron Conole <acon...@bytheb.org>
>
> ...
>
>> -static struct nf_hook_entry __rcu **nf_hook_entry_head(struct net
>> *net, const struct nf_hook_ops *reg)
>> +static struct nf_hook_entries *allocate_hook_entries_size(u16 num)
>> +{
>> +struct nf_hook_entries *e;
>> +size_t alloc = sizeof(*e) +
>> +   sizeof(struct nf_hook_entry) * num +
>> +   sizeof(struct nf_hook_ops *) * num;
>> +
>> +if (num == 0)
>> +return NULL;
>> +
>> +e = kvmalloc(alloc, GFP_KERNEL);
>> +if (e) {
>> +memset(e, 0, alloc);
>> +e->num_hook_entries = num;
>> +}
>
>
> nit:
>
>   e = kvzalloc(alloc, GFP_KERNEL);
>   if (e)
>   e->num_hook_entries = num;

d'oh!  Thanks for spotting.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH iptables] iptables: support insisting that the lock is held

2017-05-03 Thread Aaron Conole
Aaron Conole <acon...@bytheb.org> writes:

> Pablo Neira Ayuso <pa...@netfilter.org> writes:
>
>> On Thu, Apr 20, 2017 at 06:23:33PM +0900, Lorenzo Colitti wrote:
>>> Currently, iptables programs will exit with an error if the
>>> iptables lock cannot be acquired, but will silently continue if
>>> the lock cannot be opened at all.
>>
>> This sounds to me like a wrong design decision was made when
>> introducing this userspace lock.
>
> I wouldn't say it that way.  I looked at this a while ago, and one thing
> to keep in mind is the if the particular prefix path in the filesystem
> (for instance /run) isn't available, then this will cause iptables to
> fail.  I'm not sure how many systems do provide /run - at the time it
> might have been more common.

Another issue is container systems.  Until recently, Kubernetes didn't
provide /run at all, and not all container orchestration tools will
provide this filesystem.

>>> This can cause unexpected failures (with unhelpful error messages)
>>> in the presence of concurrent updates.
>>> 
>>> This patch adds a compile-time option that causes iptables to
>>> refuse to do anything if the lock cannot be acquired. It is a
>>> compile-time option instead of a command-line flag because:
>>> 
>>> 1. In order to reliably avoid concurrent modification, all
>>>invocations of iptables commands must follow this behaviour.
>>> 2. Whether or not the lock can be opened is typically not
>>>a run-time condition but is likely to be a configuration
>>>error.
>>>
>>> Tested by deleting xtables.lock and observing that all commands
>>> failed if iptables was compiled with --enable-strict-locking, but
>>> succeeded otherwise.
>>> 
>>> By default, --enable-strict-locking is disabled for backwards
>>> compatibility reasons. It can be enabled by default in a future
>>> change if desired.
>>
>> I would like to skip this compile time switch, if the existing
>> behaviour is broken, we should just fix it. What is the scenario that
>> can indeed have an impact in terms of backward compatibility breakage?
>> Does it really make sense to keep a buggy behaviour around?
>
> I'm not sure about a change to the behavior, but I agree that a compile
> time switch is probably not the way to go.

I've thought about it.  I think a better change that makes sense in the
presence of concurrent updates would be to use the wait argument as a
total time, and apply it to both the userspace lock, AND an EAGAIN from
the kernel space side.  So if the kernel space says 'locked try again',
and the user passed a -w option, we can simply keep retrying until the
wait time expires.  Does that make sense and solve your issue, Lorenzo?
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH iptables] iptables: support insisting that the lock is held

2017-05-03 Thread Aaron Conole
Pablo Neira Ayuso  writes:

> On Thu, Apr 20, 2017 at 06:23:33PM +0900, Lorenzo Colitti wrote:
>> Currently, iptables programs will exit with an error if the
>> iptables lock cannot be acquired, but will silently continue if
>> the lock cannot be opened at all.
>
> This sounds to me like a wrong design decision was made when
> introducing this userspace lock.

I wouldn't say it that way.  I looked at this a while ago, and one thing
to keep in mind is the if the particular prefix path in the filesystem
(for instance /run) isn't available, then this will cause iptables to
fail.  I'm not sure how many systems do provide /run - at the time it
might have been more common.

>> This can cause unexpected failures (with unhelpful error messages)
>> in the presence of concurrent updates.
>> 
>> This patch adds a compile-time option that causes iptables to
>> refuse to do anything if the lock cannot be acquired. It is a
>> compile-time option instead of a command-line flag because:
>> 
>> 1. In order to reliably avoid concurrent modification, all
>>invocations of iptables commands must follow this behaviour.
>> 2. Whether or not the lock can be opened is typically not
>>a run-time condition but is likely to be a configuration
>>error.
>>
>> Tested by deleting xtables.lock and observing that all commands
>> failed if iptables was compiled with --enable-strict-locking, but
>> succeeded otherwise.
>> 
>> By default, --enable-strict-locking is disabled for backwards
>> compatibility reasons. It can be enabled by default in a future
>> change if desired.
>
> I would like to skip this compile time switch, if the existing
> behaviour is broken, we should just fix it. What is the scenario that
> can indeed have an impact in terms of backward compatibility breakage?
> Does it really make sense to keep a buggy behaviour around?

I'm not sure about a change to the behavior, but I agree that a compile
time switch is probably not the way to go.

-Aaron
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH nf-next] ip_vs_sync: change comparison on sync_refresh_period

2017-04-12 Thread Aaron Conole
The sync_refresh_period variable is unsigned, so it can never be < 0.

Signed-off-by: Aaron Conole <acon...@bytheb.org>
---
 net/netfilter/ipvs/ip_vs_sync.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/ipvs/ip_vs_sync.c b/net/netfilter/ipvs/ip_vs_sync.c
index b03c280..123dc0f 100644
--- a/net/netfilter/ipvs/ip_vs_sync.c
+++ b/net/netfilter/ipvs/ip_vs_sync.c
@@ -520,7 +520,7 @@ static int ip_vs_sync_conn_needed(struct netns_ipvs *ipvs,
if (!(cp->flags & IP_VS_CONN_F_TEMPLATE) &&
pkts % sync_period != sysctl_sync_threshold(ipvs))
return 0;
-   } else if (sync_refresh_period <= 0 &&
+   } else if (!sync_refresh_period &&
   pkts != sysctl_sync_threshold(ipvs))
return 0;
 
-- 
2.9.3

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH nf-next] nf_conntrack: remove double assignment

2017-04-12 Thread Aaron Conole
The protonet pointer will unconditionally be rewritten, so just do the
needed assignment first.

Signed-off-by: Aaron Conole <acon...@bytheb.org>
---
 net/netfilter/nf_conntrack_proto.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/netfilter/nf_conntrack_proto.c 
b/net/netfilter/nf_conntrack_proto.c
index 2d6ee18..e6d2945 100644
--- a/net/netfilter/nf_conntrack_proto.c
+++ b/net/netfilter/nf_conntrack_proto.c
@@ -441,9 +441,8 @@ EXPORT_SYMBOL_GPL(nf_ct_l4proto_unregister_one);
 void nf_ct_l4proto_pernet_unregister_one(struct net *net,
 struct nf_conntrack_l4proto *l4proto)
 {
-   struct nf_proto_net *pn = NULL;
+   struct nf_proto_net *pn = nf_ct_l4proto_net(net, l4proto);
 
-   pn = nf_ct_l4proto_net(net, l4proto);
if (pn == NULL)
return;
 
-- 
2.9.3

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH nf-next] nf_tables: remove double return statement

2017-04-12 Thread Aaron Conole
Signed-off-by: Aaron Conole <acon...@bytheb.org>
---
 net/netfilter/nf_tables_api.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 2d822d2..1452fb7 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -4435,8 +4435,6 @@ static int nf_tables_getobj(struct net *net, struct sock 
*nlsk,
 err:
kfree_skb(skb2);
return err;
-
-   return 0;
 }
 
 static void nft_obj_destroy(struct nft_object *obj)
-- 
2.9.3

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH nf-next] ipset: remove unused function __ip_set_get_netlink

2017-04-10 Thread Aaron Conole
There are no in-tree callers.

Signed-off-by: Aaron Conole <acon...@bytheb.org>
---
 net/netfilter/ipset/ip_set_core.c | 8 
 1 file changed, 8 deletions(-)

diff --git a/net/netfilter/ipset/ip_set_core.c 
b/net/netfilter/ipset/ip_set_core.c
index c296f9b..68ba531 100644
--- a/net/netfilter/ipset/ip_set_core.c
+++ b/net/netfilter/ipset/ip_set_core.c
@@ -501,14 +501,6 @@ __ip_set_put(struct ip_set *set)
  * a separate reference counter
  */
 static inline void
-__ip_set_get_netlink(struct ip_set *set)
-{
-   write_lock_bh(_set_ref_lock);
-   set->ref_netlink++;
-   write_unlock_bh(_set_ref_lock);
-}
-
-static inline void
 __ip_set_put_netlink(struct ip_set *set)
 {
write_lock_bh(_set_ref_lock);
-- 
2.9.3

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH nf-next] ipvs: remove unused function ip_vs_set_state_timeout

2017-04-10 Thread Aaron Conole
There are no in-tree callers of this function and it isn't exported.

Signed-off-by: Aaron Conole <acon...@bytheb.org>
---
 include/net/ip_vs.h  |  2 --
 net/netfilter/ipvs/ip_vs_proto.c | 22 --
 2 files changed, 24 deletions(-)

diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h
index 8a4a57b8..c76fedb 100644
--- a/include/net/ip_vs.h
+++ b/include/net/ip_vs.h
@@ -1349,8 +1349,6 @@ int ip_vs_protocol_init(void);
 void ip_vs_protocol_cleanup(void);
 void ip_vs_protocol_timeout_change(struct netns_ipvs *ipvs, int flags);
 int *ip_vs_create_timeout_table(int *table, int size);
-int ip_vs_set_state_timeout(int *table, int num, const char *const *names,
-   const char *name, int to);
 void ip_vs_tcpudp_debug_packet(int af, struct ip_vs_protocol *pp,
   const struct sk_buff *skb, int offset,
   const char *msg);
diff --git a/net/netfilter/ipvs/ip_vs_proto.c b/net/netfilter/ipvs/ip_vs_proto.c
index 8ae4807..ca880a3 100644
--- a/net/netfilter/ipvs/ip_vs_proto.c
+++ b/net/netfilter/ipvs/ip_vs_proto.c
@@ -193,28 +193,6 @@ ip_vs_create_timeout_table(int *table, int size)
 }
 
 
-/*
- * Set timeout value for state specified by name
- */
-int
-ip_vs_set_state_timeout(int *table, int num, const char *const *names,
-   const char *name, int to)
-{
-   int i;
-
-   if (!table || !name || !to)
-   return -EINVAL;
-
-   for (i = 0; i < num; i++) {
-   if (strcmp(names[i], name))
-   continue;
-   table[i] = to * HZ;
-   return 0;
-   }
-   return -ENOENT;
-}
-
-
 const char * ip_vs_state_name(__u16 proto, int state)
 {
struct ip_vs_protocol *pp = ip_vs_proto_get(proto);
-- 
2.9.3

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH nf-next] netfilter: add list element test to br_netfilter_hooks

2016-12-06 Thread Aaron Conole
The for-loop in the bridge hook entries assumes that the elements are
always present.  However, this assumption may not always be true.

Fixes: 66cfc1dd07c7 ("netfilter: convert while loops to for loops")
Signed-off-by: Aaron Conole <acon...@bytheb.org>
--
Pablo, if possible could this be squashed into the commit instead?  I
only did a build test of this, but it should be correct.

diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c
index fbe35b4..b12501a 100644
--- a/net/bridge/br_netfilter_hooks.c
+++ b/net/bridge/br_netfilter_hooks.c
@@ -1009,7 +1009,7 @@ int br_nf_hook_thresh(unsigned int hook, struct net *net,
int ret;
 
for (elem = rcu_dereference(net->nf.hooks[NFPROTO_BRIDGE][hook]);
-nf_hook_entry_priority(elem) <= NF_BR_PRI_BRNF;
+elem && nf_hook_entry_priority(elem) <= NF_BR_PRI_BRNF;
 elem = rcu_dereference(elem->next))
;

--
2.5.0
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH nf-next 1/3] netfilter: introduce accessor functions for hook entries

2016-11-15 Thread Aaron Conole
This allows easier future refactoring.

Signed-off-by: Aaron Conole <acon...@bytheb.org>
---
 include/linux/netfilter.h   | 27 +++
 net/bridge/br_netfilter_hooks.c |  2 +-
 net/netfilter/core.c| 10 --
 net/netfilter/nf_queue.c|  5 ++---
 4 files changed, 34 insertions(+), 10 deletions(-)

diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
index 6923014..575aa19 100644
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -79,6 +79,33 @@ struct nf_hook_entry {
const struct nf_hook_ops*orig_ops;
 };
 
+static inline void
+nf_hook_entry_init(struct nf_hook_entry *entry,const struct 
nf_hook_ops *ops)
+{
+   entry->next = NULL;
+   entry->ops = *ops;
+   entry->orig_ops = ops;
+}
+
+static inline int
+nf_hook_entry_priority(const struct nf_hook_entry *entry)
+{
+   return entry->ops.priority;
+}
+
+static inline int
+nf_hook_entry_hookfn(const struct nf_hook_entry *entry, struct sk_buff *skb,
+struct nf_hook_state *state)
+{
+   return entry->ops.hook(entry->ops.priv, skb, state);
+}
+
+static inline const struct nf_hook_ops *
+nf_hook_entry_ops(const struct nf_hook_entry *entry)
+{
+   return entry->orig_ops;
+}
+
 static inline void nf_hook_state_init(struct nf_hook_state *p,
  unsigned int hook,
  u_int8_t pf,
diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c
index 8155bd2..ef8cfa7 100644
--- a/net/bridge/br_netfilter_hooks.c
+++ b/net/bridge/br_netfilter_hooks.c
@@ -1010,7 +1010,7 @@ int br_nf_hook_thresh(unsigned int hook, struct net *net,
 
elem = rcu_dereference(net->nf.hooks[NFPROTO_BRIDGE][hook]);
 
-   while (elem && (elem->ops.priority <= NF_BR_PRI_BRNF))
+   while (elem && (nf_hook_entry_priority(elem) <= NF_BR_PRI_BRNF))
elem = rcu_dereference(elem->next);
 
if (!elem)
diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index de30e08..2bb46e2 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -102,15 +102,13 @@ int nf_register_net_hook(struct net *net, const struct 
nf_hook_ops *reg)
if (!entry)
return -ENOMEM;
 
-   entry->orig_ops = reg;
-   entry->ops  = *reg;
-   entry->next = NULL;
+   nf_hook_entry_init(entry, reg);
 
mutex_lock(_hook_mutex);
 
/* Find the spot in the list */
while ((p = nf_entry_dereference(*pp)) != NULL) {
-   if (reg->priority < p->orig_ops->priority)
+   if (reg->priority < nf_hook_entry_priority(p))
break;
pp = >next;
}
@@ -140,7 +138,7 @@ void nf_unregister_net_hook(struct net *net, const struct 
nf_hook_ops *reg)
 
mutex_lock(_hook_mutex);
while ((p = nf_entry_dereference(*pp)) != NULL) {
-   if (p->orig_ops == reg) {
+   if (nf_hook_entry_ops(p) == reg) {
rcu_assign_pointer(*pp, p->next);
break;
}
@@ -311,7 +309,7 @@ int nf_hook_slow(struct sk_buff *skb, struct nf_hook_state 
*state,
int ret;
 
do {
-   verdict = entry->ops.hook(entry->ops.priv, skb, state);
+   verdict = nf_hook_entry_hookfn(entry, skb, state);
switch (verdict & NF_VERDICT_MASK) {
case NF_ACCEPT:
entry = rcu_dereference(entry->next);
diff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c
index 77cba9f..4a76624 100644
--- a/net/netfilter/nf_queue.c
+++ b/net/netfilter/nf_queue.c
@@ -185,7 +185,7 @@ static unsigned int nf_iterate(struct sk_buff *skb,
 
do {
 repeat:
-   verdict = (*entryp)->ops.hook((*entryp)->ops.priv, skb, state);
+   verdict = nf_hook_entry_hookfn((*entryp), skb, state);
if (verdict != NF_ACCEPT) {
if (verdict != NF_REPEAT)
return verdict;
@@ -200,7 +200,6 @@ static unsigned int nf_iterate(struct sk_buff *skb,
 void nf_reinject(struct nf_queue_entry *entry, unsigned int verdict)
 {
struct nf_hook_entry *hook_entry = entry->hook;
-   struct nf_hook_ops *elem = _entry->ops;
struct sk_buff *skb = entry->skb;
const struct nf_afinfo *afinfo;
int err;
@@ -209,7 +208,7 @@ void nf_reinject(struct nf_queue_entry *entry, unsigned int 
verdict)
 
/* Continue traversal iff userspace said ok... */
if (verdict == NF_REPEAT)
-   verdict = elem->hook(elem->priv, skb, >state);
+   verdict = nf_hook_entry_hookfn(hook_entry, skb, >state);
 
if (verdict == NF_ACCEPT) {
afinfo = nf_get_afinfo(entry-&

[PATCH nf-next 2/3] netfilter: decouple nf_hook_entry and nf_hook_ops

2016-11-15 Thread Aaron Conole
From: Aaron Conole <acon...@redhat.com>

During nfhook traversal we only need a very small subset of
nf_hook_ops members.

We need:
- next element
- hook function to call
- hook function priv argument

Bridge netfilter also needs 'thresh'; can be obtained via ->orig_ops.

nf_hook_entry struct is now 32 bytes on x86_64.

A followup patch will turn the run-time list into an array that only
stores hook functions plus their priv arguments, eliminating the ->next
element.

Suggested-by: Florian Westphal <f...@strlen.de>
Signed-off-by: Aaron Conole <acon...@bytheb.org>
---
 include/linux/netfilter.h | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
index 575aa19..a4b97be 100644
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -75,7 +75,8 @@ struct nf_hook_ops {
 
 struct nf_hook_entry {
struct nf_hook_entry __rcu  *next;
-   struct nf_hook_ops  ops;
+   nf_hookfn   *hook;
+   void*priv;
const struct nf_hook_ops*orig_ops;
 };
 
@@ -83,21 +84,22 @@ static inline void
 nf_hook_entry_init(struct nf_hook_entry *entry,const struct 
nf_hook_ops *ops)
 {
entry->next = NULL;
-   entry->ops = *ops;
+   entry->hook = ops->hook;
+   entry->priv = ops->priv;
entry->orig_ops = ops;
 }
 
 static inline int
 nf_hook_entry_priority(const struct nf_hook_entry *entry)
 {
-   return entry->ops.priority;
+   return entry->orig_ops->priority;
 }
 
 static inline int
 nf_hook_entry_hookfn(const struct nf_hook_entry *entry, struct sk_buff *skb,
 struct nf_hook_state *state)
 {
-   return entry->ops.hook(entry->ops.priv, skb, state);
+   return entry->hook(entry->priv, skb, state);
 }
 
 static inline const struct nf_hook_ops *
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC netfilter-next 1/3] netfilter: introduce accessor functions for hook entries

2016-11-03 Thread Aaron Conole
Pablo Neira Ayuso <pa...@netfilter.org> writes:

> On Thu, Oct 27, 2016 at 02:27:51PM -0400, Aaron Conole wrote:
>> This allows easier future refactoring.
>> 
>> Signed-off-by: Aaron Conole <acon...@bytheb.org>
>> ---
>>  include/linux/netfilter.h   | 35 ++-
>>  net/bridge/br_netfilter_hooks.c |  2 +-
>>  net/netfilter/core.c|  8 +++-
>>  net/netfilter/nf_queue.c|  8 
>>  4 files changed, 42 insertions(+), 11 deletions(-)

...

> I'd suggest something like:
>
> static inline int
> nf_entry_hookfn(const struct nf_hook_entry *entry,
> struct sk_buff *skb, struct nf_hook_state *state)
> {
> return entry->ops.hook(entry, nf_hook_entry_priv(entry), skb, state);
> }
>
> So you can avoid this:
>
>   verdict = nf_hook_entry_hookfn(*entryp)
>   (nf_hook_entry_priv(*entryp), skb, state);

Makes sense, I'll do that.  Thanks for the review!
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH RFC netfilter-next 2/3] netfilter: decouple nf_hook_entry and nf_hook_ops

2016-10-27 Thread Aaron Conole
During nfhook traversal we only need a very small subset of
nf_hook_ops members.

We need:
- next element
- hook function to call
- hook function priv argument

Bridge netfilter also needs 'thresh'; can be obtained via ->orig_ops.

nf_hook_entry struct is now 32 bytes on x86_64.

Followup patch will turn the run-time list into an array that only
stores hook functions plus their priv arguments.

Suggested-by: Florian Westphal <f...@strlen.de>
Signed-off-by: Aaron Conole <acon...@bytheb.org>
---
 include/linux/netfilter.h | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
index b25386b..cc2d977 100644
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -76,7 +76,8 @@ struct nf_hook_ops {
 
 struct nf_hook_entry {
struct nf_hook_entry __rcu  *next;
-   struct nf_hook_ops  ops;
+   nf_hookfn   *hook;
+   void*priv;
const struct nf_hook_ops*orig_ops;
 };
 
@@ -84,26 +85,27 @@ static inline void
 nf_hook_entry_init(struct nf_hook_entry *entry,const struct 
nf_hook_ops *ops)
 {
entry->next = NULL;
-   entry->ops = *ops;
+   entry->hook = ops->hook;
+   entry->priv = ops->priv;
entry->orig_ops = ops;
 }
 
 static inline int
 nf_hook_entry_priority(const struct nf_hook_entry *entry)
 {
-   return entry->ops.priority;
+   return entry->orig_ops->priority;
 }
 
 static inline nf_hookfn *
 nf_hook_entry_hookfn(const struct nf_hook_entry *entry)
 {
-   return entry->ops.hook;
+   return entry->hook;
 }
 
 static inline void *
 nf_hook_entry_priv(const struct nf_hook_entry *entry)
 {
-   return entry->ops.priv;
+   return entry->priv;
 }
 
 static inline const struct nf_hook_ops *
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH RFC netfilter-next 0/3] Additional refactoring enhancements for nf_hook_entry

2016-10-27 Thread Aaron Conole
This series reduces the size of the nf_hook_entry objects, and converts the
loops which traverse into for-loops.  This will facilitate future work to
switch from a singly-linked list into an array.

This series was built and tested on top of Pablo's series posted at
https://www.spinics.net/lists/netfilter-devel/msg44408.html

Aaron Conole (3):
  netfilter: introduce accessor functions for hook entries
  netfilter: decouple nf_hook_entry and nf_hook_ops
  netfilter: Convert while loops to for-loops

 include/linux/netfilter.h   | 42 +
 net/bridge/br_netfilter_hooks.c |  8 
 net/netfilter/core.c| 14 +-
 net/netfilter/nf_queue.c| 11 +--
 4 files changed, 52 insertions(+), 23 deletions(-)

-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH RFC netfilter-next 3/3] netfilter: convert while loops to for loops

2016-10-27 Thread Aaron Conole
This is to facilitate converting from a singly-linked list to an array
of elements.

Signed-off-by: Aaron Conole <acon...@bytheb.org>
---
 include/linux/netfilter.h   | 3 +--
 net/bridge/br_netfilter_hooks.c | 8 
 net/netfilter/core.c| 6 ++
 net/netfilter/nf_queue.c| 3 +--
 4 files changed, 8 insertions(+), 12 deletions(-)

diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
index cc2d977..89bb370 100644
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -195,14 +195,13 @@ static inline int nf_hook_iterate(struct sk_buff *skb,
int ret;
 
entry = rcu_dereference(state->hook_entries);
-   while (entry) {
+   for (; entry; entry = rcu_dereference(entry->next)) {
RCU_INIT_POINTER(state->hook_entries, entry);
 repeat:
verdict = nf_hook_entry_hookfn(entry)(nf_hook_entry_priv(entry),
  skb, state);
switch (verdict) {
case NF_ACCEPT:
-   entry = rcu_dereference(entry->next);
break;
case NF_DROP:
kfree_skb(skb);
diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c
index 37c4d59..ec4c409 100644
--- a/net/bridge/br_netfilter_hooks.c
+++ b/net/bridge/br_netfilter_hooks.c
@@ -1008,10 +1008,10 @@ int br_nf_hook_thresh(unsigned int hook, struct net 
*net,
struct nf_hook_state state;
int ret;
 
-   elem = rcu_dereference(net->nf.hooks[NFPROTO_BRIDGE][hook]);
-
-   while (elem && (nf_hook_entry_priority(elem) <= NF_BR_PRI_BRNF))
-   elem = rcu_dereference(elem->next);
+   for (elem = rcu_dereference(net->nf.hooks[NFPROTO_BRIDGE][hook]);
+elem && (nf_hook_entry_priority(elem) <= NF_BR_PRI_BRNF);
+elem = rcu_dereference(elem->next))
+   ;
 
if (!elem)
return okfn(net, sk, skb);
diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index 48782d4..52ded39 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -107,10 +107,9 @@ int nf_register_net_hook(struct net *net, const struct 
nf_hook_ops *reg)
mutex_lock(_hook_mutex);
 
/* Find the spot in the list */
-   while ((p = nf_entry_dereference(*pp)) != NULL) {
+   for (; (p = nf_entry_dereference(*pp)) != NULL; pp = >next) {
if (reg->priority < nf_hook_entry_priority(p))
break;
-   pp = >next;
}
rcu_assign_pointer(entry->next, p);
rcu_assign_pointer(*pp, entry);
@@ -137,12 +136,11 @@ void nf_unregister_net_hook(struct net *net, const struct 
nf_hook_ops *reg)
return;
 
mutex_lock(_hook_mutex);
-   while ((p = nf_entry_dereference(*pp)) != NULL) {
+   for (; (p = nf_entry_dereference(*pp)) != NULL; pp = >next) {
if (nf_hook_entry_ops(p) == reg) {
rcu_assign_pointer(*pp, p->next);
break;
}
-   pp = >next;
}
mutex_unlock(_hook_mutex);
if (!p) {
diff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c
index 2d0dc56..2c53357 100644
--- a/net/netfilter/nf_queue.c
+++ b/net/netfilter/nf_queue.c
@@ -183,7 +183,7 @@ static unsigned int nf_iterate(struct sk_buff *skb,
 {
unsigned int verdict;
 
-   while (*entryp) {
+   for (; *entryp; *entryp = rcu_dereference((*entryp)->next)) {
RCU_INIT_POINTER(state->hook_entries, *entryp);
 repeat:
verdict = nf_hook_entry_hookfn(*entryp)
@@ -193,7 +193,6 @@ static unsigned int nf_iterate(struct sk_buff *skb,
return verdict;
goto repeat;
}
-   *entryp = rcu_dereference((*entryp)->next);
}
return NF_ACCEPT;
 }
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH RFC netfilter-next 1/3] netfilter: introduce accessor functions for hook entries

2016-10-27 Thread Aaron Conole
This allows easier future refactoring.

Signed-off-by: Aaron Conole <acon...@bytheb.org>
---
 include/linux/netfilter.h   | 35 ++-
 net/bridge/br_netfilter_hooks.c |  2 +-
 net/netfilter/core.c|  8 +++-
 net/netfilter/nf_queue.c|  8 
 4 files changed, 42 insertions(+), 11 deletions(-)

diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
index d0beb607..b25386b 100644
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -80,6 +80,38 @@ struct nf_hook_entry {
const struct nf_hook_ops*orig_ops;
 };
 
+static inline void
+nf_hook_entry_init(struct nf_hook_entry *entry,const struct 
nf_hook_ops *ops)
+{
+   entry->next = NULL;
+   entry->ops = *ops;
+   entry->orig_ops = ops;
+}
+
+static inline int
+nf_hook_entry_priority(const struct nf_hook_entry *entry)
+{
+   return entry->ops.priority;
+}
+
+static inline nf_hookfn *
+nf_hook_entry_hookfn(const struct nf_hook_entry *entry)
+{
+   return entry->ops.hook;
+}
+
+static inline void *
+nf_hook_entry_priv(const struct nf_hook_entry *entry)
+{
+   return entry->ops.priv;
+}
+
+static inline const struct nf_hook_ops *
+nf_hook_entry_ops(const struct nf_hook_entry *entry)
+{
+   return entry->orig_ops;
+}
+
 static inline void nf_hook_state_init(struct nf_hook_state *p,
  struct nf_hook_entry *hook_entry,
  unsigned int hook,
@@ -164,7 +196,8 @@ static inline int nf_hook_iterate(struct sk_buff *skb,
while (entry) {
RCU_INIT_POINTER(state->hook_entries, entry);
 repeat:
-   verdict = entry->ops.hook(entry->ops.priv, skb, state);
+   verdict = nf_hook_entry_hookfn(entry)(nf_hook_entry_priv(entry),
+ skb, state);
switch (verdict) {
case NF_ACCEPT:
entry = rcu_dereference(entry->next);
diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c
index d153925..37c4d59 100644
--- a/net/bridge/br_netfilter_hooks.c
+++ b/net/bridge/br_netfilter_hooks.c
@@ -1010,7 +1010,7 @@ int br_nf_hook_thresh(unsigned int hook, struct net *net,
 
elem = rcu_dereference(net->nf.hooks[NFPROTO_BRIDGE][hook]);
 
-   while (elem && (elem->ops.priority <= NF_BR_PRI_BRNF))
+   while (elem && (nf_hook_entry_priority(elem) <= NF_BR_PRI_BRNF))
elem = rcu_dereference(elem->next);
 
if (!elem)
diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index 5cf9415..48782d4 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -102,15 +102,13 @@ int nf_register_net_hook(struct net *net, const struct 
nf_hook_ops *reg)
if (!entry)
return -ENOMEM;
 
-   entry->orig_ops = reg;
-   entry->ops  = *reg;
-   entry->next = NULL;
+   nf_hook_entry_init(entry, reg);
 
mutex_lock(_hook_mutex);
 
/* Find the spot in the list */
while ((p = nf_entry_dereference(*pp)) != NULL) {
-   if (reg->priority < p->orig_ops->priority)
+   if (reg->priority < nf_hook_entry_priority(p))
break;
pp = >next;
}
@@ -140,7 +138,7 @@ void nf_unregister_net_hook(struct net *net, const struct 
nf_hook_ops *reg)
 
mutex_lock(_hook_mutex);
while ((p = nf_entry_dereference(*pp)) != NULL) {
-   if (p->orig_ops == reg) {
+   if (nf_hook_entry_ops(p) == reg) {
rcu_assign_pointer(*pp, p->next);
break;
}
diff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c
index 2d5dca0..2d0dc56 100644
--- a/net/netfilter/nf_queue.c
+++ b/net/netfilter/nf_queue.c
@@ -186,7 +186,8 @@ static unsigned int nf_iterate(struct sk_buff *skb,
while (*entryp) {
RCU_INIT_POINTER(state->hook_entries, *entryp);
 repeat:
-   verdict = (*entryp)->ops.hook((*entryp)->ops.priv, skb, state);
+   verdict = nf_hook_entry_hookfn(*entryp)
+   (nf_hook_entry_priv(*entryp), skb, state);
if (verdict != NF_ACCEPT) {
if (verdict != NF_REPEAT)
return verdict;
@@ -202,7 +203,6 @@ void nf_reinject(struct nf_queue_entry *entry, unsigned int 
verdict)
struct nf_hook_entry *hook_entry;
struct sk_buff *skb = entry->skb;
const struct nf_afinfo *afinfo;
-   struct nf_hook_ops *elem;
int err;
 
/* Userspace may request to enqueue this packet again. */
@@ -220,13 +220,13 @@ void nf_reinject(struct nf_queue_entry *entry, unsigned 
int verdict)
}
 
hook_entry = rcu_dereference(entry->state

Re: [PATCH nf,v2] netfilter: nf_queue: don't re-enter same hook on packet reinjection

2016-10-18 Thread Aaron Conole
Florian Westphal <f...@strlen.de> writes:

> Pablo Neira Ayuso <pa...@netfilter.org> wrote:
>> On Mon, Oct 17, 2016 at 03:29:27PM -0400, Aaron Conole wrote:
>> > Pablo Neira Ayuso <pa...@netfilter.org> writes:
>> [...]
>> > > From c1a731c68791bcd504a7fe5d28f5f0fd59d66118 Mon Sep 17 00:00:00 2001
>> > > From: Pablo Neira Ayuso <pa...@netfilter.org>
>> > > Date: Thu, 13 Oct 2016 08:14:03 +0200
>> > > Subject: [PATCH nf,v3] netfilter: nf_queue: don't re-enter same hook on 
>> > > packet
>> > >  reinjection
>> > >
>> > > If the packet is accepted, we have to skip the current hook from where
>> > > the packet was enqueued. Thus, we can emulate the previous
>> > > list_for_each_entry_continue() behaviour happening from nf_reinject(),
>> > > otherwise the packets gets enqueued over and over again.
>> > >
>> > > Fixes: e3b37f11e6e4 ("netfilter: replace list_head with single linked 
>> > > list")
>> > > Signed-off-by: Pablo Neira Ayuso <pa...@netfilter.org>
>> > > ---
>> > >  net/netfilter/nf_queue.c | 6 --
>> > >  1 file changed, 4 insertions(+), 2 deletions(-)
>> > >
>> > > diff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c
>> > > index 96964a0070e1..0b5ac3c9c2bc 100644
>> > > --- a/net/netfilter/nf_queue.c
>> > > +++ b/net/netfilter/nf_queue.c
>> > > @@ -187,8 +187,10 @@ void nf_reinject(struct nf_queue_entry *entry, 
>> > > unsigned int verdict)
>> > >  entry->state.thresh = INT_MIN;
>> > >  
>> > >  if (verdict == NF_ACCEPT) {
>> > > -next_hook:
>> > > -verdict = nf_iterate(skb, >state, _entry);
>> > > +hook_entry = rcu_dereference(hook_entry->next);
>> > > +if (hook_entry)
>> > > +next_hook:
>> > 
>> > Should the above two lines be transposed to this?
>> > 
>> >  next_hook:
>> >if (hook_entry)
>> > 
>> > Sorry if I'm misunderstanding it.  Too many special cases for my tiny
>> > brain...
>> 
>> Right, my patch is still not correct.
>> 
>> I think this should be it:
>> 
>> if (verdict == NF_ACCEPT) {
>> next_hook:
>> hook_entry = rcu_dereference(hook_entry->next);
>> if (hook_entry)
>> verdict = nf_iterate(skb, >state, 
>> _entry);
>>

Yes.

>> So we jump to "next_hook" in case of NF_QUEUE verdict with bypass flag
>> set on.  In that case, we need to continue just after the current hook
>> entry to emulate the behaviour that we previously have via
>> list_for_each_entry_continue().
>> 
>> This NF_QUEUE handling is also broken from nf_hook_slow() path, right?
>
> Yes.  As you already indicate, list_for_each_entry_continue() resumes
> after the current elem, this isn't true anymore.
>
> So for nf_queue we need to move to hook_entry->next in ACCEPT case,
> and, for nf_hook_slow, we need to do the same when hitting
>
> (verdict & NF_VERDICT_FLAG_QUEUE_BYPASS))
>  goto next_hook;
>
> branch.

Right.  That looks correct.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH nf,v2] netfilter: nf_queue: don't re-enter same hook on packet reinjection

2016-10-17 Thread Aaron Conole
Pablo Neira Ayuso <pa...@netfilter.org> writes:

> On Mon, Oct 17, 2016 at 11:23:01AM -0400, Aaron Conole wrote:
>> Pablo Neira Ayuso <pa...@netfilter.org> writes:
>>
>> > Make sure we skip the current hook from where the packet was enqueued,
>> > otherwise the packets gets enqueued over and over again.
>> >
>> > Fixes: e3b37f11e6e4 ("netfilter: replace list_head with single linked 
>> > list")
>> > Signed-off-by: Pablo Neira Ayuso <pa...@netfilter.org>
>> > ---
>> > v2: Make sure next hook is non-null, otherwise we are at the end of the
>> >hook list and we can skip nf_iterate().
>> >
>> >  net/netfilter/nf_queue.c | 3 ++-
>> >  1 file changed, 2 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c
>> > index 96964a0070e1..691e713d70f5 100644
>> > --- a/net/netfilter/nf_queue.c
>> > +++ b/net/netfilter/nf_queue.c
>> > @@ -185,8 +185,9 @@ void nf_reinject(struct nf_queue_entry *entry, 
>> > unsigned int verdict)
>> >}
>> >
>> >entry->state.thresh = INT_MIN;
>> > +  hook_entry = rcu_dereference(hook_entry->next);
>> >
>> > -  if (verdict == NF_ACCEPT) {
>> > +  if (hook_entry && verdict == NF_ACCEPT) {
>> >next_hook:
>> >verdict = nf_iterate(skb, >state, _entry);
>> >}
>>
>> ACK. I thought switch case below could have a problem, but re-checked
>> the first nf_queue leg, and it seems okay.
>
> Argh, still not right. If we get a NF_QUEUE verdict to re-enqueue
> again, then hook_entry may become NULL.
>
>   switch (verdict & NF_VERDICT_MASK) {
>   case NF_ACCEPT:
>   case NF_STOP:
>   local_bh_disable();
>   entry->state.okfn(entry->state.net, entry->state.sk, skb);
>   local_bh_enable();
>   break;
>   case NF_QUEUE:
>   RCU_INIT_POINTER(entry->state.hook_entries, hook_entry); <--
>
> Attaching new patch.
>
> From c1a731c68791bcd504a7fe5d28f5f0fd59d66118 Mon Sep 17 00:00:00 2001
> From: Pablo Neira Ayuso <pa...@netfilter.org>
> Date: Thu, 13 Oct 2016 08:14:03 +0200
> Subject: [PATCH nf,v3] netfilter: nf_queue: don't re-enter same hook on packet
>  reinjection
>
> If the packet is accepted, we have to skip the current hook from where
> the packet was enqueued. Thus, we can emulate the previous
> list_for_each_entry_continue() behaviour happening from nf_reinject(),
> otherwise the packets gets enqueued over and over again.
>
> Fixes: e3b37f11e6e4 ("netfilter: replace list_head with single linked list")
> Signed-off-by: Pablo Neira Ayuso <pa...@netfilter.org>
> ---
>  net/netfilter/nf_queue.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c
> index 96964a0070e1..0b5ac3c9c2bc 100644
> --- a/net/netfilter/nf_queue.c
> +++ b/net/netfilter/nf_queue.c
> @@ -187,8 +187,10 @@ void nf_reinject(struct nf_queue_entry *entry, unsigned 
> int verdict)
>   entry->state.thresh = INT_MIN;
>  
>   if (verdict == NF_ACCEPT) {
> - next_hook:
> - verdict = nf_iterate(skb, >state, _entry);
> + hook_entry = rcu_dereference(hook_entry->next);
> + if (hook_entry)
> +next_hook:

Should the above two lines be transposed to this?

 next_hook:
if (hook_entry)

Sorry if I'm misunderstanding it.  Too many special cases for my tiny
brain...

-Aaron
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH nf,v2] netfilter: nf_queue: don't re-enter same hook on packet reinjection

2016-10-17 Thread Aaron Conole
Pablo Neira Ayuso  writes:

> Make sure we skip the current hook from where the packet was enqueued,
> otherwise the packets gets enqueued over and over again.
>
> Fixes: e3b37f11e6e4 ("netfilter: replace list_head with single linked list")
> Signed-off-by: Pablo Neira Ayuso 
> ---
> v2: Make sure next hook is non-null, otherwise we are at the end of the
> hook list and we can skip nf_iterate().
>
>  net/netfilter/nf_queue.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c
> index 96964a0070e1..691e713d70f5 100644
> --- a/net/netfilter/nf_queue.c
> +++ b/net/netfilter/nf_queue.c
> @@ -185,8 +185,9 @@ void nf_reinject(struct nf_queue_entry *entry, unsigned 
> int verdict)
>   }
>  
>   entry->state.thresh = INT_MIN;
> + hook_entry = rcu_dereference(hook_entry->next);
>  
> - if (verdict == NF_ACCEPT) {
> + if (hook_entry && verdict == NF_ACCEPT) {
>   next_hook:
>   verdict = nf_iterate(skb, >state, _entry);
>   }

ACK.  I thought switch case below could have a problem, but re-checked
the first nf_queue leg, and it seems okay.

-Aaron
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/10, nf-next] Netfilter core updates

2016-10-17 Thread Aaron Conole
Florian Westphal  writes:

> Pablo Neira Ayuso  wrote:
>> Let me know if you have any comment, otherwise I'll place this in the
>> nf-next tree so we can follow up working on top of these.
>
> Please do, thanks!

+1.  Some of this work was in my back burner, so thanks Pablo :)
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html



Re: [PATCH 1/2 nf] netfilter: nf_queue: don't re-enter same hook on packet reinjection

2016-10-13 Thread Aaron Conole
Pablo Neira Ayuso <pa...@netfilter.org> writes:

> Make sure we skip the current hook from where the packet was enqueued,
> otherwise the packets gets enqueued over and over again.
>
> Fixes: e3b37f11e6e4 ("netfilter: replace list_head with single linked list")
> Signed-off-by: Pablo Neira Ayuso <pa...@netfilter.org>
> ---
> I managed to reproduce this with a simple test.
>
>  # iptables -I OUTPUT -j QUEUE
>  # cd libnetfilter_queue/utils/
>  # ./nfqnl_test
>
> Without my patch, netfilter munches packets that are reinjected.
>
> @Aaron: Please, I'd appreciate if you can have a look to confirm this bug
> and the fix. Thanks.

Looks like I missed this in my testing.

Reviewed-by: Aaron Conole <acon...@bytheb.org>

>  net/netfilter/nf_queue.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c
> index 96964a0070e1..221d7a5c2fec 100644
> --- a/net/netfilter/nf_queue.c
> +++ b/net/netfilter/nf_queue.c
> @@ -184,6 +184,7 @@ void nf_reinject(struct nf_queue_entry *entry, unsigned 
> int verdict)
>   verdict = NF_DROP;
>   }
>  
> + hook_entry = rcu_dereference(hook_entry->next);
>   entry->state.thresh = INT_MIN;
>  
>   if (verdict == NF_ACCEPT) {

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: slab corruption with current -git

2016-10-11 Thread Aaron Conole
Michal Kubecek  writes:

> On Mon, Oct 10, 2016 at 04:24:01AM -0400, David Miller wrote:
>> From: David Miller 
>> Date: Sun, 09 Oct 2016 23:57:45 -0400 (EDT)
>> 
>> This means that the netns is possibly getting freed up before we
>> unregister the netfilter hooks.
>
> Sounds a bit like the issue discussed here:
>
>   https://marc.info/?l=netfilter-devel=146980917627262=2
>
> Could it be (partly) the same race condition?

It looks like it's possible.  It appears that there could be a
long-standing race between these.  I'll look into it more carefully, and
discuss with Pablo and Florian when they're situated from netdev
conference.

-Aaron
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: slab corruption with current -git (was Re: [git pull] vfs pile 1 (splice))

2016-10-10 Thread Aaron Conole
Linus Torvalds <torva...@linux-foundation.org> writes:

> On Mon, Oct 10, 2016 at 9:28 AM, Linus Torvalds
> <torva...@linux-foundation.org> wrote:
>>
>> So as I already answered to Dave, I'm not actually sure that this was
>> the buggy code, or that my patch would make any difference at all.
>
> My patch does seem to fix things, and in fact the warning about "hook
> not found" now triggers.
>
> So I think the bug really was that the singly-linked list handling
> code did not correctly handle the case of not finding the entry, and
> then freed (incorrectly) the last one that wasn't actually unlinked.
>
> In fact, I get quite a few warnings (56 total) about 30 seconds after
> logging in:
>
> [   54.213170] WARNING: CPU: 1 PID: 111 at net/netfilter/core.c:151
> nf_unregister_net_hook+0x8e/0x170
> ... repeat 54 times ...
> [   54.445520] WARNING: CPU: 7 PID: 111 at net/netfilter/core.c:151
> nf_unregister_net_hook+0x8e/0x170
>
> and looking in the journal, the first one is (again) immediately
> preceded by that systemd-hostnamed service stopping:
>
>   Oct 10 11:45:47 i7 audit[1546]: USER_LOGIN
>   ...
>   Oct 10 11:46:11 i7 audit[1]: SERVICE_STOP pid=1 uid=0
> auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0
> msg='unit=fprintd comm="systemd" exe="/usr/lib/systemd/systemd"
> hostname=? addr=? terminal=? res=success'
>   Oct 10 11:46:13 i7 pulseaudio[1697]: [pulseaudio] bluez5-util.c:
> GetManagedObjects() failed: org.freedesktop.DBus.Error.NoReply: Did
> not receive a reply. Possible causes include: the remote application
> did not send a reply, the message bus security policy blocked the
> reply, the reply timeout expir
>   Oct 10 11:46:13 i7 dbus-daemon[1003]: [system] Failed to activate
> service 'org.bluez': timed out
>   Oct 10 11:46:20 i7 audit[1]: SERVICE_STOP pid=1 uid=0
> auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0
> msg='unit=systemd-hostnamed comm="systemd"
> exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=?
> res=success'
>   Oct 10 11:46:20 i7 kernel: [ cut here ]
>   Oct 10 11:46:20 i7 kernel: WARNING: CPU: 1 PID: 111 at
> net/netfilter/core.c:151 nf_unregister_net_hook+0x8e/0x170
>
> so I do think it's something to do with some network startup service
> thing (perhaps dhcp, perhaps chrome, who knows) as I do my initial
> login.
>
> David - I think that also explains what was wrong with the old code.
> In the old code, this loop:
>
> while (hooks_entry && nf_entry_dereference(hooks_entry->next)) {
>
> would exit with "hooks_entry" pointing to the last list entry (because
> ->next was NULL). Nothing was ever unlinked in the loop itself,
> because it never actually found a matching entry, but then after the
> loop it would free that last entry because it *thought* that was the
> match.
>
> My list rewrite fixes that.
>
> Anyway, I'm assuming it will come to me from the networking tree after
> more testing by the maintainers. You can add my
>
>   Signed-off-by: Linus Torvalds <torva...@linux-foundation.org>
>
> to the patch, though.
>
> David, if you want me to just commit that thing directly, I can
> obviously do so, but I do think somebody should look at
>
>  (a) that I actually got the priority list ordering right on the
>  insertion side

It looks correct.

Reviewed-by: Aaron Conole <acon...@bytheb.org>

>  (b) what it is that makes it try to unregister that hook that isn't
> on the list in the first place

This is a still problem, I think.  I wasn't able to reproduce the issue
on a fedora-23 VM.  My fedora 24 bare-metal system does trigger this,
though.  Not sure what changed in userspace/kernel interaction side (not
an excuse, but just an observation).

> but on the whole I consider this issue explained and solved. I'll
> continue to run with my patch on my machine (just not committed).

Okay.  Very sorry for this, again.

>   Linus
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: slab corruption with current -git (was Re: [git pull] vfs pile 1 (splice))

2016-10-10 Thread Aaron Conole
Linus Torvalds  writes:

> On Sun, Oct 9, 2016 at 7:49 PM, Linus Torvalds
>  wrote:
>>
>> There is one *correct* way to remove an entry from a singly linked
>> list, and it looks like this:
>>
>> struct entry **pp, *p;
>>
>> pp = 
>> while ((p = *pp) != NULL) {
>> if (right_entry(p)) {
>> *pp = p->next;
>> break;
>> }
>> pp = >next;
>> }
>>
>> and that's it. Nothing else.

Sorry, I should have done that.

> This COMPLETELY UNTESTED patch tries to fix the nf_hook_entry code to do this.
>
> I repeat: it's ENTIRELY UNTESTED. I just converted the insertion and
> deletion to the proper pattern, but I could easily have gotten the
> insertion priority test the wrong way around entirely, for example. Or
> it could simply have some other completely broken bug in it. It
> compiles for me, but that's all I actually checked.

Okay, I'm looking it over.  Sorry for the mess.

> Note that the "correct way" of doing list operations also almost
> inevitably is the shortest way by far, since it gets rid of all the
> special cases. So the patch looks nice. It gets rid of the magic
> "nf_set_hooks_head()" thing too, because once you do list following
> right, the head is no different from any other pointer in the list.
>
> So the patch stats look good:
>
>  net/netfilter/core.c | 108 
> ---
>  1 file changed, 33 insertions(+), 75 deletions(-)
>
> but again, it's entirely *entirely* untested. Please consider this
> just a "this is generally how list insert/delete operations should be
> done, avoiding special cases for the first entry".

I'll review it, and test it.  Can you tell me what steps you took to
reproduce the oops?  I'll enable slab debugging and try to reproduce
without and with this patch (and I'll also look into David's recent
email as well).  Are you simply creating and removing network
namespaces (I did test that, but I should have done a better job)?

> ALSO NOTE! The code assumes that the "nf_hook_mutex" locking only
> protects the actual *lists*, and that the address to the list can be
> looked up without holding the lock. That's generally how things are
> done, and it simplifies error handling (because you can do the "there
> is no such list at all" test before you do anything else. But again, I
> don't actually know the code, and if there is something that actually
> expands the number of lists etc that depends on that mutex, then the
> list head lookup may need to be inside the lock too.

That should be correct, the nf_hook_mutex is only for protecting the
lists.

>Linus
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: slab corruption with current -git (was Re: [git pull] vfs pile 1 (splice))

2016-10-09 Thread Aaron Conole
Florian Westphal  writes:

> Linus Torvalds  wrote:
>> On Sun, Oct 9, 2016 at 12:11 PM, Linus Torvalds
>>  wrote:
>> >
>> > Anyway, I don't think I can bisect it, but I'll try to narrow it down
>> > a *bit* at least.
>> >
>> > Not doing any more pulls on this unstable base, I've been puttering
>> > around in trying to clean up some stupid printk logging issues
>> > instead.
>> 
>> So I finally got a oops with slub debugging enabled. It doesn't really
>> narrow things down, though, it kind of extends on the possible
>> suspects. Now adding David Miller and Pablo, because it looks like it
>> may be netfilter that does something bad and corrupts memory.
>
> Quite possible, the netns interactions are not nice :-/
>
>> Without further ado, here's the new oops:
>> 
>>general protection fault:  [#1] SMP
>>CPU: 7 PID: 169 Comm: kworker/u16:7 Not tainted
>> 4.8.0-11288-gb66484cd7470 #1
>>Hardware name: System manufacturer System Product Name/Z170-K, BIOS
> ..
>>Call Trace:
>>  netfilter_net_exit+0x2f/0x60
>>  ops_exit_list.isra.4+0x38/0x60
>>  cleanup_net+0x1ba/0x2a0
>>  process_one_work+0x1f1/0x480
>>  worker_thread+0x48/0x4d0
>>  ? process_one_work+0x480/0x480
>
> ..
>
>> like it's a pointer loaded from a free'd allocation.
>> 
>> The code disassembles to
>> 
>>0: 0f b6 ca movzbl %dl,%ecx
>>3: 48 8d 84 c8 00 01 00 lea0x100(%rax,%rcx,8),%rax
>>a: 00
>>b: 49 8b 5c c5 00   mov0x0(%r13,%rax,8),%rbx
>>   10: 48 85 db test   %rbx,%rbx
>>   13: 0f 84 cb 00 00 00 je 0xe4
>>   19: 4c 3b 63 40   cmp0x40(%rbx),%r12
>>   1d: 48 8b 03 mov(%rbx),%rax
>>   20: 0f 84 e9 00 00 00 je 0x10f
>>   26: 48 85 c0 test   %rax,%rax
>>   29: 74 26 je 0x51
>>   2b:* 4c 3b 60 40   cmp0x40(%rax),%r12 <-- trapping instruction
>>   2f: 75 08 jne0x39
>>   31: e9 ef 00 00 00   jmpq   0x125
>>   36: 48 89 d8 mov%rbx,%rax
>>   39: 48 8b 18 mov(%rax),%rbx
>>   3c: 48 85 db test   %rbx,%rbx
>> 
>> and that oopsing instruction seems to be the compare of
>> "hooks_entry->orig_ops" from hooks_entry in this expression:
>> 
>> if (hooks_entry && hooks_entry->orig_ops == reg) {
>> 
>> so hooks_entry() is bogus. It was gotten from
>> 
>> hooks_entry = nf_hook_entry_head(net, reg);
>> 
>> but that's as far as I dug. And yes, I do have
>> CONFIG_NETFILTER_INGRESS=y in case that matters.
>> 
>> And all this code has changed pretty radically in commit e3b37f11e6e4
>> ("netfilter: replace list_head with single linked list"), and there
>> was clearly already something wrong with that code, with commit
>> 5119e4381a90 ("netfilter: Fix potential null pointer dereference")
>> adding the test against NULL. But I suspect that only hid the "oops,
>> it's actually not NULL, it loaded some uninitialized value" problem.
>> 
>> Over to the networking guys.. Ideas?
>
> Sorry, not off the top of my head.
> Pablo is currently travelling back home from netdev 1.2 in Tokyo,
> I can help starting Wednesday when I am back.
>
> One shot in the dark (not even compile tested; wonder if we can end up
> zapping bogus hook ...)
>

I was just about to build and test something similar:

diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index c9d90eb..e84103f 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -189,7 +189,7 @@ void nf_unregister_net_hook(struct net *net, const struct 
nf_hook_ops *reg)
 
 unlock:
mutex_unlock(_hook_mutex);
-   if (!hooks_entry) {
+   if (!hooks_entry || hooks_entry->orig_ops != reg) {
WARN(1, "nf_unregister_net_hook: hook not found!\n");
return;
}
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/3] netfilter: hide reference to nf_hooks_ingress

2016-09-30 Thread Aaron Conole
Arnd Bergmann  writes:

> A recent cleanup added an unconditional reference to the nf_hooks_ingress 
> pointer,
> but that fails when CONFIG_NETFILTER_INGRESS is disabled and that member is
> not present in net_device:
>
> net/netfilter/core.c: In function 'nf_set_hooks_head':
> net/netfilter/core.c:96:30: error: 'struct net_device' has no member named 
> 'nf_hooks_ingress'
>
> This avoids the build error by simply enclosing the assignment in an #ifdef,
> which may or may not be the correct fix.

NAK, it's not the right fix.  The entry being set may be leaked with only this
hunk.  I've posted a complete fix for this.

Sorry that it was broken.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH nf-next v4 0/2] fixes for recent nf_compact hooks

2016-09-28 Thread Aaron Conole
Two possible error conditions were caught during an extended testing
session, and by a build robot.  These patches fix the two issues (a
missing handler when config is changed, and a potential NULL
dereference).

Aaron Conole (2):
  netfilter: Fix potential null pointer dereference
  nf_set_hooks_head: acommodate different kconfig

 net/netfilter/core.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

-- 
2.5.5

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH nf-next v3 1/2] netfilter: Fix potential null pointer dereference

2016-09-28 Thread Aaron Conole
Eric Dumazet <eric.duma...@gmail.com> writes:

> On Wed, 2016-09-28 at 10:56 -0400, Aaron Conole wrote:
>> Eric Dumazet <eric.duma...@gmail.com> writes:
>> 
>> > On Wed, 2016-09-28 at 09:12 -0400, Aaron Conole wrote:
>> >> It's possible for nf_hook_entry_head to return NULL.  If two
>> >> nf_unregister_net_hook calls happen simultaneously with a single hook
>> >> entry in the list, both will enter the nf_hook_mutex critical section.
>> >> The first will successfully delete the head, but the second will see
>> >> this NULL pointer and attempt to dereference.
>> >> 
>> >> This fix ensures that no null pointer dereference could occur when such
>> >> a condition happens.
>> >> 
>> >> Signed-off-by: Aaron Conole <acon...@bytheb.org>
>> >> ---
>> >>  net/netfilter/core.c | 2 +-
>> >>  1 file changed, 1 insertion(+), 1 deletion(-)
>> >> 
>> >> diff --git a/net/netfilter/core.c b/net/netfilter/core.c
>> >> index 360c63d..e58e420 100644
>> >> --- a/net/netfilter/core.c
>> >> +++ b/net/netfilter/core.c
>> >> @@ -160,7 +160,7 @@ void nf_unregister_net_hook(struct net *net, const 
>> >> struct nf_hook_ops *reg)
>> >>  
>> >>   mutex_lock(_hook_mutex);
>> >>   hooks_entry = nf_hook_entry_head(net, reg);
>> >> - if (hooks_entry->orig_ops == reg) {
>> >> + if (hooks_entry && hooks_entry->orig_ops == reg) {
>> >>   nf_set_hooks_head(net, reg,
>> >> nf_entry_dereference(hooks_entry->next));
>> >>   goto unlock;
>> >
>> > When was the bug added exactly ?
>> 
>> Sunday, on the nf-next tree.
>> 
>> > For all bug fixes, you need to add a Fixes: tag.
>> >
>> > Like :
>> >
>> > Fixes: e3b37f11e6e4 ("netfilter: replace list_head with single linked 
>> > list")
>> 
>> I would but it's in nf-next tree, and I'm not sure how pulls go.  If
>> they are done via patch imports, then the sha sums will be wrong and the
>> commit message will be misleading.  If the sums are preserved, then I
>> can resubmit with this information.
>> 
>
> I gave the (12 digits) sha-1 as present in David Miller net-next tree.
>
> https://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/?id=e3b37f11e6e4e6b6f02cc762f182ce233d2c1c9d
>
>
> This wont change, because David never rebases his tree under normal
> operations.
>
> Thanks.

Thank you very much, Eric.  I've reposted.

-Aaron
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH nf-next v4 2/2] nf_set_hooks_head: accommodate different kconfig

2016-09-28 Thread Aaron Conole
When CONFIG_NETFILTER_INGRESS is unset (or no), we need to handle
the request for registration properly by dropping the hook.  This
releases the entry during the set.

Fixes: e3b37f11e6e4 ("netfilter: replace list_head with single linked list")
Signed-off-by: Aaron Conole <acon...@bytheb.org>
---
 net/netfilter/core.c | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index e58e420..61e8a9d 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -90,10 +90,12 @@ static void nf_set_hooks_head(struct net *net, const struct 
nf_hook_ops *reg,
 {
switch (reg->pf) {
case NFPROTO_NETDEV:
+#ifdef CONFIG_NETFILTER_INGRESS
/* We already checked in nf_register_net_hook() that this is
 * used from ingress.
 */
rcu_assign_pointer(reg->dev->nf_hooks_ingress, entry);
+#endif
break;
default:
rcu_assign_pointer(net->nf.hooks[reg->pf][reg->hooknum],
@@ -107,10 +109,15 @@ int nf_register_net_hook(struct net *net, const struct 
nf_hook_ops *reg)
struct nf_hook_entry *hooks_entry;
struct nf_hook_entry *entry;
 
-   if (reg->pf == NFPROTO_NETDEV &&
-   (reg->hooknum != NF_NETDEV_INGRESS ||
-!reg->dev || dev_net(reg->dev) != net))
-   return -EINVAL;
+   if (reg->pf == NFPROTO_NETDEV) {
+#ifndef CONFIG_NETFILTER_INGRESS
+   if (reg->hooknum == NF_NETDEV_INGRESS)
+   return -EOPNOTSUPP;
+#endif
+   if (reg->hooknum != NF_NETDEV_INGRESS ||
+   !reg->dev || dev_net(reg->dev) != net)
+   return -EINVAL;
+   }
 
entry = kmalloc(sizeof(*entry), GFP_KERNEL);
if (!entry)
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH nf-next v3 1/2] netfilter: Fix potential null pointer dereference

2016-09-28 Thread Aaron Conole
Eric Dumazet <eric.duma...@gmail.com> writes:

> On Wed, 2016-09-28 at 09:12 -0400, Aaron Conole wrote:
>> It's possible for nf_hook_entry_head to return NULL.  If two
>> nf_unregister_net_hook calls happen simultaneously with a single hook
>> entry in the list, both will enter the nf_hook_mutex critical section.
>> The first will successfully delete the head, but the second will see
>> this NULL pointer and attempt to dereference.
>> 
>> This fix ensures that no null pointer dereference could occur when such
>> a condition happens.
>> 
>> Signed-off-by: Aaron Conole <acon...@bytheb.org>
>> ---
>>  net/netfilter/core.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/net/netfilter/core.c b/net/netfilter/core.c
>> index 360c63d..e58e420 100644
>> --- a/net/netfilter/core.c
>> +++ b/net/netfilter/core.c
>> @@ -160,7 +160,7 @@ void nf_unregister_net_hook(struct net *net, const 
>> struct nf_hook_ops *reg)
>>  
>>  mutex_lock(_hook_mutex);
>>  hooks_entry = nf_hook_entry_head(net, reg);
>> -if (hooks_entry->orig_ops == reg) {
>> +if (hooks_entry && hooks_entry->orig_ops == reg) {
>>  nf_set_hooks_head(net, reg,
>>nf_entry_dereference(hooks_entry->next));
>>  goto unlock;
>
> When was the bug added exactly ?

Sunday, on the nf-next tree.

> For all bug fixes, you need to add a Fixes: tag.
>
> Like :
>
> Fixes: e3b37f11e6e4 ("netfilter: replace list_head with single linked list")

I would but it's in nf-next tree, and I'm not sure how pulls go.  If
they are done via patch imports, then the sha sums will be wrong and the
commit message will be misleading.  If the sums are preserved, then I
can resubmit with this information.

> So that 100 different people in stable teams do not have to do the
> archeology themselves ...
>
> Thanks.

Thanks for the review, Eric.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH nf-next v2 1/2] netfilter: Fix potential null pointer dereference

2016-09-28 Thread Aaron Conole
Liping Zhang <zlpnob...@gmail.com> writes:

> 2016-09-28 11:08 GMT+08:00 Liping Zhang <zlpnob...@gmail.com>:
>> Hi Feng,
>>
>> 2016-09-28 9:23 GMT+08:00 Feng Gao <gfree.w...@gmail.com>:
>>> Hi Aaraon,
>>>
>>> On Tue, Sep 27, 2016 at 9:38 PM, Aaron Conole <acon...@bytheb.org> wrote:
>>>> It's possible for nf_hook_entry_head to return NULL if two
>>>> nf_unregister_net_hook calls happen simultaneously with a single hook
>>>
>>> The critical region of nf_unregister_net_hook is protected by 
>>> _hook_mutex.
>>> When it would be called simultaneously?
>>
>> This is unrelated to race condition.
>>
>> Suppose that only the last nf_hook_entry exist, and two callers want to do
>> un-register work.
>>
>> The first one will remove it successfully, after the end of the work, the
>> second one will enter the critical section, but it will see the NULL pointer.
>> Because the last nf_hook_entry was already removed by the first one.
>>
>>>
>>> Regards
>>> Feng
>>>
>>>> entry in the list.  This fix ensures that no null pointer dereference
>>>> could occur when such a race happens.
>>>>
>>>> Signed-off-by: Aaron Conole <acon...@bytheb.org>
>
> I read the commit log again, I think the description here is a
> little confusing indeed.

Sorry about that.  I've posted a new description which describes the
situation better.  I hope it helps.

-Aaron
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH nf-next v3 1/2] netfilter: Fix potential null pointer dereference

2016-09-28 Thread Aaron Conole
It's possible for nf_hook_entry_head to return NULL.  If two
nf_unregister_net_hook calls happen simultaneously with a single hook
entry in the list, both will enter the nf_hook_mutex critical section.
The first will successfully delete the head, but the second will see
this NULL pointer and attempt to dereference.

This fix ensures that no null pointer dereference could occur when such
a condition happens.

Signed-off-by: Aaron Conole <acon...@bytheb.org>
---
 net/netfilter/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index 360c63d..e58e420 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -160,7 +160,7 @@ void nf_unregister_net_hook(struct net *net, const struct 
nf_hook_ops *reg)
 
mutex_lock(_hook_mutex);
hooks_entry = nf_hook_entry_head(net, reg);
-   if (hooks_entry->orig_ops == reg) {
+   if (hooks_entry && hooks_entry->orig_ops == reg) {
nf_set_hooks_head(net, reg,
  nf_entry_dereference(hooks_entry->next));
goto unlock;
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH nf-next v3 2/2] nf_set_hooks_head: accommodate different kconfig

2016-09-28 Thread Aaron Conole
When CONFIG_NETFILTER_INGRESS is unset (or no), we need to handle
the request for registration properly by dropping the hook.  This
releases the entry during the set.

Signed-off-by: Aaron Conole <acon...@bytheb.org>
---
 net/netfilter/core.c | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index e58e420..61e8a9d 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -90,10 +90,12 @@ static void nf_set_hooks_head(struct net *net, const struct 
nf_hook_ops *reg,
 {
switch (reg->pf) {
case NFPROTO_NETDEV:
+#ifdef CONFIG_NETFILTER_INGRESS
/* We already checked in nf_register_net_hook() that this is
 * used from ingress.
 */
rcu_assign_pointer(reg->dev->nf_hooks_ingress, entry);
+#endif
break;
default:
rcu_assign_pointer(net->nf.hooks[reg->pf][reg->hooknum],
@@ -107,10 +109,15 @@ int nf_register_net_hook(struct net *net, const struct 
nf_hook_ops *reg)
struct nf_hook_entry *hooks_entry;
struct nf_hook_entry *entry;
 
-   if (reg->pf == NFPROTO_NETDEV &&
-   (reg->hooknum != NF_NETDEV_INGRESS ||
-!reg->dev || dev_net(reg->dev) != net))
-   return -EINVAL;
+   if (reg->pf == NFPROTO_NETDEV) {
+#ifndef CONFIG_NETFILTER_INGRESS
+   if (reg->hooknum == NF_NETDEV_INGRESS)
+   return -EOPNOTSUPP;
+#endif
+   if (reg->hooknum != NF_NETDEV_INGRESS ||
+   !reg->dev || dev_net(reg->dev) != net)
+   return -EINVAL;
+   }
 
entry = kmalloc(sizeof(*entry), GFP_KERNEL);
if (!entry)
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH nf-next v3 0/2] fixes for recent nf_compact hooks

2016-09-28 Thread Aaron Conole
Two possible error conditions were caught during an extended testing
session, and by a build robot.  These patches fix the two issues (a
missing handler when config is changed, and a potential NULL
dereference).

Aaron Conole (2):
  netfilter: Fix potential null pointer dereference
  nf_set_hooks_head: acommodate different kconfig

 net/netfilter/core.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

-- 
2.5.5

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH nf-next v2 0/2] fixes for recent nf_compact hooks

2016-09-27 Thread Aaron Conole
Two possible error conditions were caught during an extended testing
session, and by a build robot.  These patches fix the two issues (a
missing handler when config is changed, and a potential NULL
dereference).

Aaron Conole (2):
  netfilter: Fix potential null pointer dereference
  nf_set_hooks_head: acommodate different kconfig

 net/netfilter/core.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

-- 
2.5.5

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH nf-next v2 1/2] netfilter: Fix potential null pointer dereference

2016-09-27 Thread Aaron Conole
It's possible for nf_hook_entry_head to return NULL if two
nf_unregister_net_hook calls happen simultaneously with a single hook
entry in the list.  This fix ensures that no null pointer dereference
could occur when such a race happens.

Signed-off-by: Aaron Conole <acon...@bytheb.org>
---
 net/netfilter/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index 360c63d..e58e420 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -160,7 +160,7 @@ void nf_unregister_net_hook(struct net *net, const struct 
nf_hook_ops *reg)
 
mutex_lock(_hook_mutex);
hooks_entry = nf_hook_entry_head(net, reg);
-   if (hooks_entry->orig_ops == reg) {
+   if (hooks_entry && hooks_entry->orig_ops == reg) {
nf_set_hooks_head(net, reg,
  nf_entry_dereference(hooks_entry->next));
goto unlock;
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH nf-next 2/2] nf_set_hooks_head: acommodate different kconfig

2016-09-26 Thread Aaron Conole
Florian Westphal <f...@strlen.de> writes:

> Aaron Conole <acon...@bytheb.org> wrote:
>> When CONFIG_NETFILTER_INGRESS is unset (or no), we need to handle
>> the request for registration properly by dropping the hook.  This
>> releases the entry during the set.
>> 
>> Signed-off-by: Aaron Conole <acon...@bytheb.org>
>> ---
>>  net/netfilter/core.c | 4 
>>  1 file changed, 4 insertions(+)
>> 
>> diff --git a/net/netfilter/core.c b/net/netfilter/core.c
>> index e58e420..1d0a4c9 100644
>> --- a/net/netfilter/core.c
>> +++ b/net/netfilter/core.c
>> @@ -90,10 +90,14 @@ static void nf_set_hooks_head(struct net *net, const 
>> struct nf_hook_ops *reg,
>>  {
>>  switch (reg->pf) {
>>  case NFPROTO_NETDEV:
>> +#ifdef CONFIG_NETFILTER_INGRESS
>>  /* We already checked in nf_register_net_hook() that this is
>>   * used from ingress.
>>   */
>>  rcu_assign_pointer(reg->dev->nf_hooks_ingress, entry);
>> +#else
>> +kfree(entry);
>> +#endif
>>  break;
>
> This looks dodgy (its correct though).
>
> I'd propose to add a test to nf_register_net_hook()
> to bail with -EOPNOSTUPP instead of this "#else kfree()" if we get
> NFPROTO_NETDEV pf with CONFIG_NETFILTER_INGRESS=n build instead.

Okay, I'll spin a new version.

Thanks for the review, Florian!

-Aaron
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH nf-next 1/2] netfilter: Fix potential null pointer dereference

2016-09-26 Thread Aaron Conole
It's possible for nf_hook_entry_head to return NULL if two
nf_unregister_net_hook calls happen simultaneously with a single hook
entry in the list.  This fix ensures that no null pointer dereference
could occur when such a race happens.

Signed-off-by: Aaron Conole <acon...@bytheb.org>
---
 net/netfilter/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index 360c63d..e58e420 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -160,7 +160,7 @@ void nf_unregister_net_hook(struct net *net, const struct 
nf_hook_ops *reg)
 
mutex_lock(_hook_mutex);
hooks_entry = nf_hook_entry_head(net, reg);
-   if (hooks_entry->orig_ops == reg) {
+   if (hooks_entry && hooks_entry->orig_ops == reg) {
nf_set_hooks_head(net, reg,
  nf_entry_dereference(hooks_entry->next));
goto unlock;
-- 
2.5.5

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH nf-next 0/2] fixes for recent nf_compact hooks

2016-09-26 Thread Aaron Conole
Two possible error conditions were caught during an extended testing
session, and by a build robot.  These patches fix the two issues (a
missing handler when config is changed, and a potential NULL
dereference).

Aaron Conole (2):
  netfilter: Fix potential null pointer dereference
  nf_set_hooks_head: acommodate different kconfig

 net/netfilter/core.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

-- 
2.5.5

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH nf-next 2/2] nf_set_hooks_head: acommodate different kconfig

2016-09-26 Thread Aaron Conole
When CONFIG_NETFILTER_INGRESS is unset (or no), we need to handle
the request for registration properly by dropping the hook.  This
releases the entry during the set.

Signed-off-by: Aaron Conole <acon...@bytheb.org>
---
 net/netfilter/core.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index e58e420..1d0a4c9 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -90,10 +90,14 @@ static void nf_set_hooks_head(struct net *net, const struct 
nf_hook_ops *reg,
 {
switch (reg->pf) {
case NFPROTO_NETDEV:
+#ifdef CONFIG_NETFILTER_INGRESS
/* We already checked in nf_register_net_hook() that this is
 * used from ingress.
 */
rcu_assign_pointer(reg->dev->nf_hooks_ingress, entry);
+#else
+   kfree(entry);
+#endif
break;
default:
rcu_assign_pointer(net->nf.hooks[reg->pf][reg->hooknum],
-- 
2.5.5

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] netfilter: replace list_head with single linked list

2016-09-21 Thread Aaron Conole
Aaron Conole <acon...@bytheb.org> writes:

> The netfilter hook list never uses the prev pointer, and so can be trimmed to
> be a simple singly-linked list.
>
> In addition to having a more light weight structure for hook traversal,
> struct net becomes 5568 bytes (down from 6400) and struct net_device becomes
> 2176 bytes (down from 2240).
>
> Signed-off-by: Aaron Conole <acon...@bytheb.org>
> Signed-off-by: Florian Westphal <f...@strlen.de>

Apologies, all.  This subject prefix is incorrect.  It should be:
[PATCH nf-next v3 7/7]

Should I repost?

> ---
>  include/linux/netdevice.h |   2 +-
>  include/linux/netfilter.h |  61 +
>  include/linux/netfilter_ingress.h |  16 +++--
>  include/net/netfilter/nf_queue.h  |   3 +-
>  include/net/netns/netfilter.h |   2 +-
>  net/bridge/br_netfilter_hooks.c   |  19 ++---
>  net/netfilter/core.c  | 141 
> +-
>  net/netfilter/nf_internals.h  |  10 +--
>  net/netfilter/nf_queue.c  |  18 ++---
>  net/netfilter/nfnetlink_queue.c   |   8 ++-
>  10 files changed, 165 insertions(+), 115 deletions(-)
>
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 67bb978..41f49f5 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -1783,7 +1783,7 @@ struct net_device {
>  #endif
>   struct netdev_queue __rcu *ingress_queue;
>  #ifdef CONFIG_NETFILTER_INGRESS
> - struct list_headnf_hooks_ingress;
> + struct nf_hook_entry __rcu *nf_hooks_ingress;
>  #endif
>  
>   unsigned char   broadcast[MAX_ADDR_LEN];
> diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
> index ad444f0..17c90b0 100644
> --- a/include/linux/netfilter.h
> +++ b/include/linux/netfilter.h
> @@ -55,12 +55,34 @@ struct nf_hook_state {
>   struct net_device *out;
>   struct sock *sk;
>   struct net *net;
> - struct list_head *hook_list;
> + struct nf_hook_entry __rcu *hook_entries;
>   int (*okfn)(struct net *, struct sock *, struct sk_buff *);
>  };
>  
> +typedef unsigned int nf_hookfn(void *priv,
> +struct sk_buff *skb,
> +const struct nf_hook_state *state);
> +struct nf_hook_ops {
> + struct list_headlist;
> +
> + /* User fills in from here down. */
> + nf_hookfn   *hook;
> + struct net_device   *dev;
> + void*priv;
> + u_int8_tpf;
> + unsigned inthooknum;
> + /* Hooks are ordered in ascending priority. */
> + int priority;
> +};
> +
> +struct nf_hook_entry {
> + struct nf_hook_entry __rcu  *next;
> + struct nf_hook_ops  ops;
> + const struct nf_hook_ops*orig_ops;
> +};
> +
>  static inline void nf_hook_state_init(struct nf_hook_state *p,
> -   struct list_head *hook_list,
> +   struct nf_hook_entry *hook_entry,
> unsigned int hook,
> int thresh, u_int8_t pf,
> struct net_device *indev,
> @@ -76,26 +98,11 @@ static inline void nf_hook_state_init(struct 
> nf_hook_state *p,
>   p->out = outdev;
>   p->sk = sk;
>   p->net = net;
> - p->hook_list = hook_list;
> + RCU_INIT_POINTER(p->hook_entries, hook_entry);
>   p->okfn = okfn;
>  }
>  
> -typedef unsigned int nf_hookfn(void *priv,
> -struct sk_buff *skb,
> -const struct nf_hook_state *state);
>  
> -struct nf_hook_ops {
> - struct list_headlist;
> -
> - /* User fills in from here down. */
> - nf_hookfn   *hook;
> - struct net_device   *dev;
> - void*priv;
> - u_int8_tpf;
> - unsigned inthooknum;
> - /* Hooks are ordered in ascending priority. */
> - int priority;
> -};
>  
>  struct nf_sockopt_ops {
>   struct list_head list;
> @@ -161,7 +168,8 @@ static inline int nf_hook_thresh(u_int8_t pf, unsigned 
> int hook,
>int (*okfn)(struct net *, struct sock *, 
> struct sk_buff *),
>int thresh)
>  {
> - struct list_head *hook_list;
> + struct nf_hook_entry *hook_head;
> + int ret = 1;
>  
>  #ifdef HAVE_JUMP_LABEL
>   if (__builtin_constant_p(pf) &&
> @@ -170,22 +178,19 @@ static inline

[PATCH nf-next v3 1/7] netfilter: bridge: add and use br_nf_hook_thresh

2016-09-21 Thread Aaron Conole
From: Florian Westphal <f...@strlen.de>

This replaces the last uses of NF_HOOK_THRESH().
Followup patch will remove it and rename nf_hook_thresh.

The reason is that inet (non-bridge) netfilter no longer invokes the
hooks from hooks, so we do no longer need the thresh value to skip hooks
with a lower priority.

The bridge netfilter however may need to do this. br_nf_hook_thresh is a
wrapper that is supposed to do this, i.e. only call hooks with a
priority that exceeds NF_BR_PRI_BRNF.

It's used only in the recursion cases of br_netfilter.  It invokes
nf_hook_slow while holding an rcu read-side critical section to make a
future cleanup simpler.

Signed-off-by: Florian Westphal <f...@strlen.de>
Signed-off-by: Aaron Conole <acon...@bytheb.org>
---
 include/net/netfilter/br_netfilter.h |  6 
 net/bridge/br_netfilter_hooks.c  | 60 ++--
 net/bridge/br_netfilter_ipv6.c   | 12 +++-
 3 files changed, 62 insertions(+), 16 deletions(-)

diff --git a/include/net/netfilter/br_netfilter.h 
b/include/net/netfilter/br_netfilter.h
index e8d1448..0b0c35c 100644
--- a/include/net/netfilter/br_netfilter.h
+++ b/include/net/netfilter/br_netfilter.h
@@ -15,6 +15,12 @@ static inline struct nf_bridge_info *nf_bridge_alloc(struct 
sk_buff *skb)
 
 void nf_bridge_update_protocol(struct sk_buff *skb);
 
+int br_nf_hook_thresh(unsigned int hook, struct net *net, struct sock *sk,
+ struct sk_buff *skb, struct net_device *indev,
+ struct net_device *outdev,
+ int (*okfn)(struct net *, struct sock *,
+ struct sk_buff *));
+
 static inline struct nf_bridge_info *
 nf_bridge_info_get(const struct sk_buff *skb)
 {
diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c
index 77e7f69..6029af4 100644
--- a/net/bridge/br_netfilter_hooks.c
+++ b/net/bridge/br_netfilter_hooks.c
@@ -30,6 +30,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -395,11 +396,10 @@ bridged_dnat:
skb->dev = nf_bridge->physindev;
nf_bridge_update_protocol(skb);
nf_bridge_push_encap_header(skb);
-   NF_HOOK_THRESH(NFPROTO_BRIDGE,
-  NF_BR_PRE_ROUTING,
-  net, sk, skb, skb->dev, NULL,
-  br_nf_pre_routing_finish_bridge,
-  1);
+   br_nf_hook_thresh(NF_BR_PRE_ROUTING,
+ net, sk, skb, skb->dev,
+ NULL,
+ br_nf_pre_routing_finish);
return 0;
}
ether_addr_copy(eth_hdr(skb)->h_dest, dev->dev_addr);
@@ -417,10 +417,8 @@ bridged_dnat:
skb->dev = nf_bridge->physindev;
nf_bridge_update_protocol(skb);
nf_bridge_push_encap_header(skb);
-   NF_HOOK_THRESH(NFPROTO_BRIDGE, NF_BR_PRE_ROUTING, net, sk, skb,
-  skb->dev, NULL,
-  br_handle_frame_finish, 1);
-
+   br_nf_hook_thresh(NF_BR_PRE_ROUTING, net, sk, skb, skb->dev, NULL,
+ br_handle_frame_finish);
return 0;
 }
 
@@ -992,6 +990,50 @@ static struct notifier_block brnf_notifier __read_mostly = 
{
.notifier_call = brnf_device_event,
 };
 
+/* recursively invokes nf_hook_slow (again), skipping already-called
+ * hooks (< NF_BR_PRI_BRNF).
+ *
+ * Called with rcu read lock held.
+ */
+int br_nf_hook_thresh(unsigned int hook, struct net *net,
+ struct sock *sk, struct sk_buff *skb,
+ struct net_device *indev,
+ struct net_device *outdev,
+ int (*okfn)(struct net *, struct sock *,
+ struct sk_buff *))
+{
+   struct nf_hook_ops *elem;
+   struct nf_hook_state state;
+   struct list_head *head;
+   int ret;
+
+   head = >nf.hooks[NFPROTO_BRIDGE][hook];
+
+   list_for_each_entry_rcu(elem, head, list) {
+   struct nf_hook_ops *next;
+
+   next = list_entry_rcu(list_next_rcu(>list),
+ struct nf_hook_ops, list);
+   if (next->priority <= NF_BR_PRI_BRNF)
+   continue;
+   }
+
+   if (>list == head)
+   return okfn(net, sk, skb);
+
+   /* We may already have this, but read-locks nest anyway */
+   rcu_read_lock();
+   nf_hook_state_init(, head, hook, NF_BR_PRI_BRNF + 1,
+  NFPROTO_BRIDGE, indev, outdev, sk, net, okfn);
+
+   ret = nf_hook_slow(skb, );
+   r

[PATCH nf-next v3 5/7] nf_register_net_hook: Only allow sane values

2016-09-21 Thread Aaron Conole
This commit adds an upfront check for sane values to be passed when
registering a netfilter hook.  This will be used in a future patch for a
simplified hook list traversal.

Signed-off-by: Aaron Conole <acon...@bytheb.org>
---
 net/netfilter/core.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index c8faf81..67b7428 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -89,6 +89,11 @@ int nf_register_net_hook(struct net *net, const struct 
nf_hook_ops *reg)
struct nf_hook_entry *entry;
struct nf_hook_ops *elem;
 
+   if (reg->pf == NFPROTO_NETDEV &&
+   (reg->hooknum != NF_NETDEV_INGRESS ||
+!reg->dev || dev_net(reg->dev) != net))
+   return -EINVAL;
+
entry = kmalloc(sizeof(*entry), GFP_KERNEL);
if (!entry)
return -ENOMEM;
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH nf-next v3 2/7] netfilter: call nf_hook_state_init with rcu_read_lock held

2016-09-21 Thread Aaron Conole
From: Florian Westphal <f...@strlen.de>

This makes things simpler because we can store the head of the list
in the nf_state structure without worrying about concurrent add/delete
of hook elements from the list.

A future commit will make use of this to implement a simpler
linked-list.

Signed-off-by: Florian Westphal <f...@strlen.de>
Signed-off-by: Aaron Conole <acon...@bytheb.org>
---
 include/linux/netfilter.h | 8 +++-
 include/linux/netfilter_ingress.h | 1 +
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
index 9230f9a..ad444f0 100644
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -174,10 +174,16 @@ static inline int nf_hook_thresh(u_int8_t pf, unsigned 
int hook,
 
if (!list_empty(hook_list)) {
struct nf_hook_state state;
+   int ret;
 
+   /* We may already have this, but read-locks nest anyway */
+   rcu_read_lock();
nf_hook_state_init(, hook_list, hook, thresh,
   pf, indev, outdev, sk, net, okfn);
-   return nf_hook_slow(skb, );
+
+   ret = nf_hook_slow(skb, );
+   rcu_read_unlock();
+   return ret;
}
return 1;
 }
diff --git a/include/linux/netfilter_ingress.h 
b/include/linux/netfilter_ingress.h
index 5fcd375..6965ba0 100644
--- a/include/linux/netfilter_ingress.h
+++ b/include/linux/netfilter_ingress.h
@@ -14,6 +14,7 @@ static inline bool nf_hook_ingress_active(const struct 
sk_buff *skb)
return !list_empty(>dev->nf_hooks_ingress);
 }
 
+/* caller must hold rcu_read_lock */
 static inline int nf_hook_ingress(struct sk_buff *skb)
 {
struct nf_hook_state state;
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] netfilter: replace list_head with single linked list

2016-09-21 Thread Aaron Conole
The netfilter hook list never uses the prev pointer, and so can be trimmed to
be a simple singly-linked list.

In addition to having a more light weight structure for hook traversal,
struct net becomes 5568 bytes (down from 6400) and struct net_device becomes
2176 bytes (down from 2240).

Signed-off-by: Aaron Conole <acon...@bytheb.org>
Signed-off-by: Florian Westphal <f...@strlen.de>
---
 include/linux/netdevice.h |   2 +-
 include/linux/netfilter.h |  61 +
 include/linux/netfilter_ingress.h |  16 +++--
 include/net/netfilter/nf_queue.h  |   3 +-
 include/net/netns/netfilter.h |   2 +-
 net/bridge/br_netfilter_hooks.c   |  19 ++---
 net/netfilter/core.c  | 141 +-
 net/netfilter/nf_internals.h  |  10 +--
 net/netfilter/nf_queue.c  |  18 ++---
 net/netfilter/nfnetlink_queue.c   |   8 ++-
 10 files changed, 165 insertions(+), 115 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 67bb978..41f49f5 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1783,7 +1783,7 @@ struct net_device {
 #endif
struct netdev_queue __rcu *ingress_queue;
 #ifdef CONFIG_NETFILTER_INGRESS
-   struct list_headnf_hooks_ingress;
+   struct nf_hook_entry __rcu *nf_hooks_ingress;
 #endif
 
unsigned char   broadcast[MAX_ADDR_LEN];
diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
index ad444f0..17c90b0 100644
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -55,12 +55,34 @@ struct nf_hook_state {
struct net_device *out;
struct sock *sk;
struct net *net;
-   struct list_head *hook_list;
+   struct nf_hook_entry __rcu *hook_entries;
int (*okfn)(struct net *, struct sock *, struct sk_buff *);
 };
 
+typedef unsigned int nf_hookfn(void *priv,
+  struct sk_buff *skb,
+  const struct nf_hook_state *state);
+struct nf_hook_ops {
+   struct list_headlist;
+
+   /* User fills in from here down. */
+   nf_hookfn   *hook;
+   struct net_device   *dev;
+   void*priv;
+   u_int8_tpf;
+   unsigned inthooknum;
+   /* Hooks are ordered in ascending priority. */
+   int priority;
+};
+
+struct nf_hook_entry {
+   struct nf_hook_entry __rcu  *next;
+   struct nf_hook_ops  ops;
+   const struct nf_hook_ops*orig_ops;
+};
+
 static inline void nf_hook_state_init(struct nf_hook_state *p,
- struct list_head *hook_list,
+ struct nf_hook_entry *hook_entry,
  unsigned int hook,
  int thresh, u_int8_t pf,
  struct net_device *indev,
@@ -76,26 +98,11 @@ static inline void nf_hook_state_init(struct nf_hook_state 
*p,
p->out = outdev;
p->sk = sk;
p->net = net;
-   p->hook_list = hook_list;
+   RCU_INIT_POINTER(p->hook_entries, hook_entry);
p->okfn = okfn;
 }
 
-typedef unsigned int nf_hookfn(void *priv,
-  struct sk_buff *skb,
-  const struct nf_hook_state *state);
 
-struct nf_hook_ops {
-   struct list_headlist;
-
-   /* User fills in from here down. */
-   nf_hookfn   *hook;
-   struct net_device   *dev;
-   void*priv;
-   u_int8_tpf;
-   unsigned inthooknum;
-   /* Hooks are ordered in ascending priority. */
-   int priority;
-};
 
 struct nf_sockopt_ops {
struct list_head list;
@@ -161,7 +168,8 @@ static inline int nf_hook_thresh(u_int8_t pf, unsigned int 
hook,
 int (*okfn)(struct net *, struct sock *, 
struct sk_buff *),
 int thresh)
 {
-   struct list_head *hook_list;
+   struct nf_hook_entry *hook_head;
+   int ret = 1;
 
 #ifdef HAVE_JUMP_LABEL
if (__builtin_constant_p(pf) &&
@@ -170,22 +178,19 @@ static inline int nf_hook_thresh(u_int8_t pf, unsigned 
int hook,
return 1;
 #endif
 
-   hook_list = >nf.hooks[pf][hook];
+   rcu_read_lock();
 
-   if (!list_empty(hook_list)) {
+   hook_head = rcu_dereference(net->nf.hooks[pf][hook]);
+   if (hook_head) {
struct nf_hook_state state;
-   int ret;
 
-   /* We may already have this, but read-locks nest anyway */
-   rcu_read_lock();
-   nf_hook_state_init(, hook_list, hook, thresh,
+   nf_hook_state_init(, hook_head, hook, thresh,
   pf, indev, outdev, sk, net,

[PATCH nf-next v3 4/7] nf_hook_slow: Remove explicit rcu_read_lock

2016-09-21 Thread Aaron Conole
All of the callers of nf_hook_slow already hold the rcu_read_lock, so this
cleanup removes the recursive call.  This is just a cleanup, as the locking
code gracefully handles this situation.

Signed-off-by: Aaron Conole <acon...@bytheb.org>
---
 net/bridge/netfilter/ebt_redirect.c| 2 +-
 net/bridge/netfilter/ebtables.c| 2 +-
 net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c | 2 +-
 net/ipv4/netfilter/nf_conntrack_proto_icmp.c   | 2 +-
 net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c | 2 +-
 net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c | 2 +-
 net/netfilter/core.c   | 6 +-
 net/netfilter/nf_conntrack_core.c  | 2 +-
 net/netfilter/nf_conntrack_h323_main.c | 2 +-
 net/netfilter/nf_conntrack_helper.c| 2 +-
 net/netfilter/nfnetlink_cthelper.c | 2 +-
 net/netfilter/nfnetlink_log.c  | 6 --
 net/netfilter/nfnetlink_queue.c| 2 +-
 net/netfilter/xt_helper.c  | 2 +-
 14 files changed, 17 insertions(+), 19 deletions(-)

diff --git a/net/bridge/netfilter/ebt_redirect.c 
b/net/bridge/netfilter/ebt_redirect.c
index 20396499..2e7c4f9 100644
--- a/net/bridge/netfilter/ebt_redirect.c
+++ b/net/bridge/netfilter/ebt_redirect.c
@@ -24,7 +24,7 @@ ebt_redirect_tg(struct sk_buff *skb, const struct 
xt_action_param *par)
return EBT_DROP;
 
if (par->hooknum != NF_BR_BROUTING)
-   /* rcu_read_lock()ed by nf_hook_slow */
+   /* rcu_read_lock()ed by nf_hook_thresh */
ether_addr_copy(eth_hdr(skb)->h_dest,
br_port_get_rcu(par->in)->br->dev->dev_addr);
else
diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c
index cceac5b..dd71332 100644
--- a/net/bridge/netfilter/ebtables.c
+++ b/net/bridge/netfilter/ebtables.c
@@ -146,7 +146,7 @@ ebt_basic_match(const struct ebt_entry *e, const struct 
sk_buff *skb,
return 1;
if (NF_INVF(e, EBT_IOUT, ebt_dev_check(e->out, out)))
return 1;
-   /* rcu_read_lock()ed by nf_hook_slow */
+   /* rcu_read_lock()ed by nf_hook_thresh */
if (in && (p = br_port_get_rcu(in)) != NULL &&
NF_INVF(e, EBT_ILOGICALIN,
ebt_dev_check(e->logical_in, p->br->dev)))
diff --git a/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c 
b/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
index 870aebd..713c09a 100644
--- a/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
+++ b/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
@@ -110,7 +110,7 @@ static unsigned int ipv4_helper(void *priv,
if (!help)
return NF_ACCEPT;
 
-   /* rcu_read_lock()ed by nf_hook_slow */
+   /* rcu_read_lock()ed by nf_hook_thresh */
helper = rcu_dereference(help->helper);
if (!helper)
return NF_ACCEPT;
diff --git a/net/ipv4/netfilter/nf_conntrack_proto_icmp.c 
b/net/ipv4/netfilter/nf_conntrack_proto_icmp.c
index 4b5904b..d075b3c 100644
--- a/net/ipv4/netfilter/nf_conntrack_proto_icmp.c
+++ b/net/ipv4/netfilter/nf_conntrack_proto_icmp.c
@@ -149,7 +149,7 @@ icmp_error_message(struct net *net, struct nf_conn *tmpl, 
struct sk_buff *skb,
return -NF_ACCEPT;
}
 
-   /* rcu_read_lock()ed by nf_hook_slow */
+   /* rcu_read_lock()ed by nf_hook_thresh */
innerproto = __nf_ct_l4proto_find(PF_INET, origtuple.dst.protonum);
 
/* Ordinarily, we'd expect the inverted tupleproto, but it's
diff --git a/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c 
b/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c
index 1aa5848..963ee38 100644
--- a/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c
+++ b/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c
@@ -115,7 +115,7 @@ static unsigned int ipv6_helper(void *priv,
help = nfct_help(ct);
if (!help)
return NF_ACCEPT;
-   /* rcu_read_lock()ed by nf_hook_slow */
+   /* rcu_read_lock()ed by nf_hook_thresh */
helper = rcu_dereference(help->helper);
if (!helper)
return NF_ACCEPT;
diff --git a/net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c 
b/net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c
index 660bc10..f5a61bc 100644
--- a/net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c
+++ b/net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c
@@ -165,7 +165,7 @@ icmpv6_error_message(struct net *net, struct nf_conn *tmpl,
return -NF_ACCEPT;
}
 
-   /* rcu_read_lock()ed by nf_hook_slow */
+   /* rcu_read_lock()ed by nf_hook_thresh */
inproto = __nf_ct_l4proto_find(PF_INET6, origtuple.dst.protonum);
 
/* Ordinarily, we'd expect the inverted tupleproto, but it's
diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index f39276d..c8faf81 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -291,16 +291,13 @@ repeat:
 
 

Re: [PATCH nf-next v2 1/3] netfilter: bridge: add and use br_nf_hook_thresh

2016-07-14 Thread Aaron Conole
Pablo Neira Ayuso <pa...@netfilter.org> writes:

> On Tue, Jul 12, 2016 at 11:32:19AM -0400, Aaron Conole wrote:
>> +/* recursively invokes nf_hook_slow (again), skipping already-called
>> + * hooks (< NF_BR_PRI_BRNF).
>> + *
>> + * Called with rcu read lock held.
>> + */
>> +int br_nf_hook_thresh(unsigned int hook, struct net *net,
>> +  struct sock *sk, struct sk_buff *skb,
>> +  struct net_device *indev,
>> +  struct net_device *outdev,
>> +  int (*okfn)(struct net *, struct sock *,
>> +  struct sk_buf *))
>  ^^
>
> This doesn't compile. Please, make sure patches compile independently
> so we make sure git bisectability doesn't break. Thanks.

Okay, I will make sure to do this for all the patches.  Apologies.

Thanks for the review.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH nf-next v2 0/3] Compact netfilter hooks list

2016-07-12 Thread Aaron Conole
This series makes a simple change to shrink the netfilter hook list
from a double linked list, to a singly linked list.  Since the hooks
are always traversed in-order, there is no need to maintain a previous
pointer.

This was jointly developed by Florian Westphal.

It has been tested with RCU and lockdep debugging enabled.

Aaron Conole (2):
  netfilter: bridge: add and use br_nf_hook_thresh
  netfilter: replace list_head with single linked list

Florian Westphal (1):
  netfilter: call nf_hook_state_init with rcu_read_lock held

 include/linux/netdevice.h  |   2 +-
 include/linux/netfilter.h  |  26 +++--
 include/linux/netfilter_ingress.h  |  15 ++-
 include/net/netfilter/br_netfilter.h   |   6 ++
 include/net/netfilter/nf_queue.h   |   9 +-
 include/net/netns/netfilter.h  |   2 +-
 net/bridge/br_netfilter_hooks.c|  50 --
 net/bridge/br_netfilter_ipv6.c |  12 +--
 net/bridge/netfilter/ebt_redirect.c|   2 +-
 net/bridge/netfilter/ebtables.c|   2 +-
 net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c |   2 +-
 net/ipv4/netfilter/nf_conntrack_proto_icmp.c   |   2 +-
 net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c |   2 +-
 net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c |   2 +-
 net/netfilter/core.c   | 132 -
 net/netfilter/nf_conntrack_core.c  |   2 +-
 net/netfilter/nf_conntrack_h323_main.c |   2 +-
 net/netfilter/nf_conntrack_helper.c|   2 +-
 net/netfilter/nf_internals.h   |  10 +-
 net/netfilter/nf_queue.c   |  15 ++-
 net/netfilter/nfnetlink_cthelper.c |   2 +-
 net/netfilter/nfnetlink_log.c  |   8 +-
 net/netfilter/nfnetlink_queue.c|   7 +-
 net/netfilter/xt_helper.c  |   2 +-
 24 files changed, 202 insertions(+), 114 deletions(-)

-- 
2.5.5

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH nf-next v2 1/3] netfilter: bridge: add and use br_nf_hook_thresh

2016-07-12 Thread Aaron Conole
From: Florian Westphal <f...@strlen.de>

This replaces the last uses of NF_HOOK_THRESH().
Followup patch will remove it and rename nf_hook_thresh.

The reason is that inet (non-bridge) netfilter no longer invokes the
hooks from hooks, so we do no longer need the thresh value to skip hooks
with a lower priority.

The bridge netfilter however may need to do this. br_nf_hook_thresh is a
wrapper that is supposed to do this, i.e. only call hooks with a
priority that exceeds NF_BR_PRI_BRNF.

It's used only in the recursion cases of br_netfilter.

Signed-off-by: Florian Westphal <f...@strlen.de>
Signed-off-by: Aaron Conole <acon...@bytheb.org>
---
 include/net/netfilter/br_netfilter.h |  6 
 net/bridge/br_netfilter_hooks.c  | 57 ++--
 net/bridge/br_netfilter_ipv6.c   | 12 
 3 files changed, 59 insertions(+), 16 deletions(-)

diff --git a/include/net/netfilter/br_netfilter.h 
b/include/net/netfilter/br_netfilter.h
index e8d1448..0b0c35c 100644
--- a/include/net/netfilter/br_netfilter.h
+++ b/include/net/netfilter/br_netfilter.h
@@ -15,6 +15,12 @@ static inline struct nf_bridge_info *nf_bridge_alloc(struct 
sk_buff *skb)
 
 void nf_bridge_update_protocol(struct sk_buff *skb);
 
+int br_nf_hook_thresh(unsigned int hook, struct net *net, struct sock *sk,
+ struct sk_buff *skb, struct net_device *indev,
+ struct net_device *outdev,
+ int (*okfn)(struct net *, struct sock *,
+ struct sk_buff *));
+
 static inline struct nf_bridge_info *
 nf_bridge_info_get(const struct sk_buff *skb)
 {
diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c
index 77e7f69..7856129 100644
--- a/net/bridge/br_netfilter_hooks.c
+++ b/net/bridge/br_netfilter_hooks.c
@@ -30,6 +30,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -395,11 +396,10 @@ bridged_dnat:
skb->dev = nf_bridge->physindev;
nf_bridge_update_protocol(skb);
nf_bridge_push_encap_header(skb);
-   NF_HOOK_THRESH(NFPROTO_BRIDGE,
-  NF_BR_PRE_ROUTING,
-  net, sk, skb, skb->dev, NULL,
-  br_nf_pre_routing_finish_bridge,
-  1);
+   br_nf_hook_thresh(NF_BR_PRE_ROUTING,
+ net, sk, skb, skb->dev,
+ NULL,
+ br_nf_pre_routing_finish);
return 0;
}
ether_addr_copy(eth_hdr(skb)->h_dest, dev->dev_addr);
@@ -417,10 +417,8 @@ bridged_dnat:
skb->dev = nf_bridge->physindev;
nf_bridge_update_protocol(skb);
nf_bridge_push_encap_header(skb);
-   NF_HOOK_THRESH(NFPROTO_BRIDGE, NF_BR_PRE_ROUTING, net, sk, skb,
-  skb->dev, NULL,
-  br_handle_frame_finish, 1);
-
+   br_nf_hook_thresh(NF_BR_PRE_ROUTING, net, sk, skb, skb->dev, NULL,
+ br_handle_frame_finish);
return 0;
 }
 
@@ -992,6 +990,47 @@ static struct notifier_block brnf_notifier __read_mostly = 
{
.notifier_call = brnf_device_event,
 };
 
+/* recursively invokes nf_hook_slow (again), skipping already-called
+ * hooks (< NF_BR_PRI_BRNF).
+ *
+ * Called with rcu read lock held.
+ */
+int br_nf_hook_thresh(unsigned int hook, struct net *net,
+ struct sock *sk, struct sk_buff *skb,
+ struct net_device *indev,
+ struct net_device *outdev,
+ int (*okfn)(struct net *, struct sock *,
+ struct sk_buf *))
+{
+   struct nf_hook_ops *elem;
+   struct nf_hook_state state;
+   struct list_head *head;
+   int ret;
+
+   head = >nf.hooks[NFPROTO_BRIDGE][hook];
+
+   list_for_each_entry_rcu(elem, head, list) {
+   struct nf_hook_ops *next;
+
+   next = list_entry_rcu(list_next_rcu(>list),
+ struct nf_hook_ops, list);
+   if (next->priority <= NF_BR_PRI_BRNF)
+   continue;
+   }
+
+   if (>list == head)
+   return okfn(net, sk, skb);
+
+   nf_hook_state_init(, head, hook, NF_BR_PRI_BRNF + 1,
+  NFPROTO_BRIDGE, indev, outdev, sk, net, okfn);
+
+   ret = nf_hook_slow(skb, );
+   if (ret == 1)
+   ret = okfn(net, sk, skb);
+
+   return ret;
+}
+
 #ifdef CONFIG_SYSCTL
 static
 int brnf_sysctl_call_tables(struct ctl_table *ctl, int write,
diff --git a/net/bridge/

Re: [PATCH nf-next 3/3] netfilter: replace list_head with single linked list

2016-07-11 Thread Aaron Conole
Thanks for this;  I will send a v2 in the next two days.

-Aaron

Florian Westphal <f...@strlen.de> writes:

> Aaron Conole <acon...@bytheb.org> wrote:
>> --- a/net/netfilter/core.c
>> +++ b/net/netfilter/core
> [..]
>> +#define nf_entry_dereference(e) \
>> +rcu_dereference_protected(e, lockdep_is_held(_hook_mutex))
>>  
>> -static struct list_head *nf_find_hook_list(struct net *net,
>> -   const struct nf_hook_ops *reg)
>> +static struct nf_hook_entry *nf_find_hook_list(struct net *net,
>> +   const struct nf_hook_ops *reg)
>>  {
>> -struct list_head *hook_list = NULL;
>> +struct nf_hook_entry *hook_list = NULL;
>>  
>>  if (reg->pf != NFPROTO_NETDEV)
>> -hook_list = >nf.hooks[reg->pf][reg->hooknum];
>> +hook_list = rcu_dereference(net->nf.hooks[reg->pf]
>> +[reg->hooknum]);
>>  else if (reg->hooknum == NF_NETDEV_INGRESS) {
>>  #ifdef CONFIG_NETFILTER_INGRESS
>>  if (reg->dev && dev_net(reg->dev) == net)
>> -hook_list = >dev->nf_hooks_ingress;
>> +hook_list =
>> +rcu_dereference(reg->dev->nf_hooks_ingress);
>
> Both of these should use nf_entry_dereference() to avoid the lockdep
> splat reported by kbuild robot:
>
> net/netfilter/core.c:75 suspicious rcu_dereference_check() usage!
> 2 locks held by swapper/1:
> #0:  (rtnl_mutex){+.+.+.}, at: [] rtnl_lock+0x17/0x20
> #1: (nf_hook_mutex){+.+...}, at: []
> nf_register_net_hook+0xcb/0x240
>

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH nf-next 3/3] netfilter: replace list_head with single linked list

2016-06-30 Thread Aaron Conole
The netfilter hook list never uses the prev pointer, and so can be
trimmed to be a smaller singly-linked list.

In addition to having a more light weight structure for hook traversal,
struct net becomes 5568 bytes (down from 6400) and struct net_device
becomes 2176 bytes (down from 2240).

Signed-off-by: Aaron Conole <acon...@bytheb.org>
Signed-off-by: Florian Westphal <f...@strlen.de>
---
 include/linux/netdevice.h |   2 +-
 include/linux/netfilter.h |  18 +++---
 include/linux/netfilter_ingress.h |  14 +++--
 include/net/netfilter/nf_queue.h  |   9 ++-
 include/net/netns/netfilter.h |   2 +-
 net/bridge/br_netfilter_hooks.c   |  21 +++
 net/netfilter/core.c  | 126 --
 net/netfilter/nf_internals.h  |  10 +--
 net/netfilter/nf_queue.c  |  15 +++--
 net/netfilter/nfnetlink_queue.c   |   5 +-
 10 files changed, 129 insertions(+), 93 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index e84d9d2..8235f67 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1747,7 +1747,7 @@ struct net_device {
 #endif
struct netdev_queue __rcu *ingress_queue;
 #ifdef CONFIG_NETFILTER_INGRESS
-   struct list_headnf_hooks_ingress;
+   struct nf_hook_entry __rcu *nf_hooks_ingress;
 #endif
 
unsigned char   broadcast[MAX_ADDR_LEN];
diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
index ad444f0..3390a84 100644
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -55,12 +55,12 @@ struct nf_hook_state {
struct net_device *out;
struct sock *sk;
struct net *net;
-   struct list_head *hook_list;
+   struct nf_hook_entry *hook_list;
int (*okfn)(struct net *, struct sock *, struct sk_buff *);
 };
 
 static inline void nf_hook_state_init(struct nf_hook_state *p,
- struct list_head *hook_list,
+ struct nf_hook_entry *hook_list,
  unsigned int hook,
  int thresh, u_int8_t pf,
  struct net_device *indev,
@@ -97,6 +97,12 @@ struct nf_hook_ops {
int priority;
 };
 
+struct nf_hook_entry {
+   struct nf_hook_entry __rcu  *next;
+   struct nf_hook_ops  ops;
+   const struct nf_hook_ops*orig_ops;
+};
+
 struct nf_sockopt_ops {
struct list_head list;
 
@@ -161,8 +167,6 @@ static inline int nf_hook_thresh(u_int8_t pf, unsigned int 
hook,
 int (*okfn)(struct net *, struct sock *, 
struct sk_buff *),
 int thresh)
 {
-   struct list_head *hook_list;
-
 #ifdef HAVE_JUMP_LABEL
if (__builtin_constant_p(pf) &&
__builtin_constant_p(hook) &&
@@ -170,14 +174,14 @@ static inline int nf_hook_thresh(u_int8_t pf, unsigned 
int hook,
return 1;
 #endif
 
-   hook_list = >nf.hooks[pf][hook];
-
-   if (!list_empty(hook_list)) {
+   if (rcu_access_pointer(net->nf.hooks[pf][hook])) {
+   struct nf_hook_entry *hook_list;
struct nf_hook_state state;
int ret;
 
/* We may already have this, but read-locks nest anyway */
rcu_read_lock();
+   hook_list = rcu_dereference(net->nf.hooks[pf][hook]);
nf_hook_state_init(, hook_list, hook, thresh,
   pf, indev, outdev, sk, net, okfn);
 
diff --git a/include/linux/netfilter_ingress.h 
b/include/linux/netfilter_ingress.h
index 6965ba0..e3e3f6d 100644
--- a/include/linux/netfilter_ingress.h
+++ b/include/linux/netfilter_ingress.h
@@ -11,23 +11,27 @@ static inline bool nf_hook_ingress_active(const struct 
sk_buff *skb)
if 
(!static_key_false(_hooks_needed[NFPROTO_NETDEV][NF_NETDEV_INGRESS]))
return false;
 #endif
-   return !list_empty(>dev->nf_hooks_ingress);
+   return rcu_access_pointer(skb->dev->nf_hooks_ingress) != NULL;
 }
 
 /* caller must hold rcu_read_lock */
 static inline int nf_hook_ingress(struct sk_buff *skb)
 {
+   struct nf_hook_entry *e = rcu_dereference(skb->dev->nf_hooks_ingress);
struct nf_hook_state state;
 
-   nf_hook_state_init(, >dev->nf_hooks_ingress,
-  NF_NETDEV_INGRESS, INT_MIN, NFPROTO_NETDEV,
-  skb->dev, NULL, NULL, dev_net(skb->dev), NULL);
+   if (unlikely(!e))
+   return 0;
+
+   nf_hook_state_init(, e, NF_NETDEV_INGRESS, INT_MIN,
+  NFPROTO_NETDEV, skb->dev, NULL, NULL,
+  dev_net(skb->dev), NULL);
return nf_hook_slow(skb, );
 }
 
 static inline void nf_hook_ingress_init(struct net_device *dev)
 {
-   INIT_LIS

[PATCH nf-next 1/3] netfilter: bridge: add and use br_nf_hook_thresh

2016-06-30 Thread Aaron Conole
From: Florian Westphal <f...@strlen.de>

This replaces the last uses of NF_HOOK_THRESH().
Followup patch will remove it and rename nf_hook_thresh.

The reason is that inet (non-bridge) netfilter no longer invokes the
hooks from hooks, so we do no longer need the thresh value to skip hooks
with a lower priority.

The bridge netfilter however may need to do this. br_nf_hook_thresh is a
wrapper that is supposed to do this, i.e. only call hooks with a
priority that exceeds NF_BR_PRI_BRNF.

It's used only in the recursion cases of br_netfilter.

Signed-off-by: Florian Westphal <f...@strlen.de>
Signed-off-by: Aaron Conole <acon...@bytheb.org>
---
 include/net/netfilter/br_netfilter.h |  6 
 net/bridge/br_netfilter_hooks.c  | 57 ++--
 net/bridge/br_netfilter_ipv6.c   | 12 
 3 files changed, 59 insertions(+), 16 deletions(-)

diff --git a/include/net/netfilter/br_netfilter.h 
b/include/net/netfilter/br_netfilter.h
index e8d1448..0b0c35c 100644
--- a/include/net/netfilter/br_netfilter.h
+++ b/include/net/netfilter/br_netfilter.h
@@ -15,6 +15,12 @@ static inline struct nf_bridge_info *nf_bridge_alloc(struct 
sk_buff *skb)
 
 void nf_bridge_update_protocol(struct sk_buff *skb);
 
+int br_nf_hook_thresh(unsigned int hook, struct net *net, struct sock *sk,
+ struct sk_buff *skb, struct net_device *indev,
+ struct net_device *outdev,
+ int (*okfn)(struct net *, struct sock *,
+ struct sk_buff *));
+
 static inline struct nf_bridge_info *
 nf_bridge_info_get(const struct sk_buff *skb)
 {
diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c
index 2d25979..19f230c 100644
--- a/net/bridge/br_netfilter_hooks.c
+++ b/net/bridge/br_netfilter_hooks.c
@@ -30,6 +30,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -395,11 +396,10 @@ bridged_dnat:
skb->dev = nf_bridge->physindev;
nf_bridge_update_protocol(skb);
nf_bridge_push_encap_header(skb);
-   NF_HOOK_THRESH(NFPROTO_BRIDGE,
-  NF_BR_PRE_ROUTING,
-  net, sk, skb, skb->dev, NULL,
-  br_nf_pre_routing_finish_bridge,
-  1);
+   br_nf_hook_thresh(NF_BR_PRE_ROUTING,
+ net, sk, skb, skb->dev,
+ NULL,
+ br_nf_pre_routing_finish);
return 0;
}
ether_addr_copy(eth_hdr(skb)->h_dest, dev->dev_addr);
@@ -417,10 +417,8 @@ bridged_dnat:
skb->dev = nf_bridge->physindev;
nf_bridge_update_protocol(skb);
nf_bridge_push_encap_header(skb);
-   NF_HOOK_THRESH(NFPROTO_BRIDGE, NF_BR_PRE_ROUTING, net, sk, skb,
-  skb->dev, NULL,
-  br_handle_frame_finish, 1);
-
+   br_nf_hook_thresh(NF_BR_PRE_ROUTING, net, sk, skb, skb->dev, NULL,
+ br_handle_frame_finish);
return 0;
 }
 
@@ -992,6 +990,47 @@ static struct notifier_block brnf_notifier __read_mostly = 
{
.notifier_call = brnf_device_event,
 };
 
+/* recursively invokes nf_hook_slow (again), skipping already-called
+ * hooks (< NF_BR_PRI_BRNF).
+ *
+ * Called with rcu read lock held.
+ */
+int br_nf_hook_thresh(unsigned int hook, struct net *net,
+ struct sock *sk, struct sk_buff *skb,
+ struct net_device *indev,
+ struct net_device *outdev,
+ int (*okfn)(struct net *, struct sock *,
+ struct sk_buf *))
+{
+   struct nf_hook_ops *elem;
+   struct nf_hook_state state;
+   struct list_head *head;
+   int ret;
+
+   head = >nf.hooks[NFPROTO_BRIDGE][hook];
+
+   list_for_each_entry_rcu(elem, head, list) {
+   struct nf_hook_ops *next;
+
+   next = list_entry_rcu(list_next_rcu(>list),
+ struct nf_hook_ops, list);
+   if (next->priority <= NF_BR_PRI_BRNF)
+   continue;
+   }
+
+   if (>list == head)
+   return okfn(net, sk, skb);
+
+   nf_hook_state_init(, head, hook, NF_BR_PRI_BRNF + 1,
+  NFPROTO_BRIDGE, indev, outdev, sk, net, okfn);
+
+   ret = nf_hook_slow(skb, );
+   if (ret == 1)
+   ret = okfn(net, sk, skb);
+
+   return ret;
+}
+
 #ifdef CONFIG_SYSCTL
 static
 int brnf_sysctl_call_tables(struct ctl_table *ctl, int write,
diff --git a/net/bridge/

[PATCH nf-next 2/3] netfilter: call nf_hook_state_init with rcu_read_lock held

2016-06-30 Thread Aaron Conole
From: Florian Westphal <f...@strlen.de>

This makes things simpler because we can store the head of the list
in the nf_state structure without worrying about concurrent add/delete
of hook elements from the list.

Signed-off-by: Florian Westphal <f...@strlen.de>
Signed-off-by: Aaron Conole <acon...@bytheb.org>
---
 include/linux/netfilter.h  | 8 +++-
 include/linux/netfilter_ingress.h  | 1 +
 net/bridge/netfilter/ebt_redirect.c| 2 +-
 net/bridge/netfilter/ebtables.c| 2 +-
 net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c | 2 +-
 net/ipv4/netfilter/nf_conntrack_proto_icmp.c   | 2 +-
 net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c | 2 +-
 net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c | 2 +-
 net/netfilter/core.c   | 5 +
 net/netfilter/nf_conntrack_core.c  | 2 +-
 net/netfilter/nf_conntrack_h323_main.c | 2 +-
 net/netfilter/nf_conntrack_helper.c| 2 +-
 net/netfilter/nfnetlink_cthelper.c | 2 +-
 net/netfilter/nfnetlink_log.c  | 8 ++--
 net/netfilter/nfnetlink_queue.c| 2 +-
 net/netfilter/xt_helper.c  | 2 +-
 16 files changed, 27 insertions(+), 19 deletions(-)

diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
index 9230f9a..ad444f0 100644
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -174,10 +174,16 @@ static inline int nf_hook_thresh(u_int8_t pf, unsigned 
int hook,
 
if (!list_empty(hook_list)) {
struct nf_hook_state state;
+   int ret;
 
+   /* We may already have this, but read-locks nest anyway */
+   rcu_read_lock();
nf_hook_state_init(, hook_list, hook, thresh,
   pf, indev, outdev, sk, net, okfn);
-   return nf_hook_slow(skb, );
+
+   ret = nf_hook_slow(skb, );
+   rcu_read_unlock();
+   return ret;
}
return 1;
 }
diff --git a/include/linux/netfilter_ingress.h 
b/include/linux/netfilter_ingress.h
index 5fcd375..6965ba0 100644
--- a/include/linux/netfilter_ingress.h
+++ b/include/linux/netfilter_ingress.h
@@ -14,6 +14,7 @@ static inline bool nf_hook_ingress_active(const struct 
sk_buff *skb)
return !list_empty(>dev->nf_hooks_ingress);
 }
 
+/* caller must hold rcu_read_lock */
 static inline int nf_hook_ingress(struct sk_buff *skb)
 {
struct nf_hook_state state;
diff --git a/net/bridge/netfilter/ebt_redirect.c 
b/net/bridge/netfilter/ebt_redirect.c
index 20396499..2e7c4f9 100644
--- a/net/bridge/netfilter/ebt_redirect.c
+++ b/net/bridge/netfilter/ebt_redirect.c
@@ -24,7 +24,7 @@ ebt_redirect_tg(struct sk_buff *skb, const struct 
xt_action_param *par)
return EBT_DROP;
 
if (par->hooknum != NF_BR_BROUTING)
-   /* rcu_read_lock()ed by nf_hook_slow */
+   /* rcu_read_lock()ed by nf_hook_thresh */
ether_addr_copy(eth_hdr(skb)->h_dest,
br_port_get_rcu(par->in)->br->dev->dev_addr);
else
diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c
index 5a61f35..6faa2c3 100644
--- a/net/bridge/netfilter/ebtables.c
+++ b/net/bridge/netfilter/ebtables.c
@@ -148,7 +148,7 @@ ebt_basic_match(const struct ebt_entry *e, const struct 
sk_buff *skb,
return 1;
if (FWINV2(ebt_dev_check(e->out, out), EBT_IOUT))
return 1;
-   /* rcu_read_lock()ed by nf_hook_slow */
+   /* rcu_read_lock()ed by nf_hook_thresh */
if (in && (p = br_port_get_rcu(in)) != NULL &&
FWINV2(ebt_dev_check(e->logical_in, p->br->dev), EBT_ILOGICALIN))
return 1;
diff --git a/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c 
b/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
index ae1a71a..eab0239 100644
--- a/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
+++ b/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
@@ -110,7 +110,7 @@ static unsigned int ipv4_helper(void *priv,
if (!help)
return NF_ACCEPT;
 
-   /* rcu_read_lock()ed by nf_hook_slow */
+   /* rcu_read_lock()ed by nf_hook_thresh */
helper = rcu_dereference(help->helper);
if (!helper)
return NF_ACCEPT;
diff --git a/net/ipv4/netfilter/nf_conntrack_proto_icmp.c 
b/net/ipv4/netfilter/nf_conntrack_proto_icmp.c
index c567e1b..2c08d6a 100644
--- a/net/ipv4/netfilter/nf_conntrack_proto_icmp.c
+++ b/net/ipv4/netfilter/nf_conntrack_proto_icmp.c
@@ -149,7 +149,7 @@ icmp_error_message(struct net *net, struct nf_conn *tmpl, 
struct sk_buff *skb,
return -NF_ACCEPT;
}
 
-   /* rcu_read_lock()ed by nf_hook_slow */
+   /* rcu_read_lock()ed by nf_hook_thresh */
innerproto = __nf_ct_l4proto_find(PF_INET, origtuple.dst.protonum);
 

[RFC nf-next 1/3] netfilter: bridge: add and use br_nf_hook_thresh

2016-06-22 Thread Aaron Conole
From: Florian Westphal <f...@strlen.de>

This replaces the last uses of NF_HOOK_THRESH().
Followup patch will remove it and rename nf_hook_thresh.

The reason is that inet (non-bridge) netfilter no longer invokes the
hooks from hooks, so we do no longer need the thresh value to skip hooks
with a lower priority.

The bridge netfilter however may need to do this. br_nf_hook_thresh is a
wrapper that is supposed to do this, i.e. only call hooks with a
priority that exceeds NF_BR_PRI_BRNF.

It's used only in the recursion cases of br_netfilter.

Signed-off-by: Florian Westphal <f...@strlen.de>
Signed-off-by: Aaron Conole <acon...@redhat.com>
---
 include/net/netfilter/br_netfilter.h |  6 
 net/bridge/br_netfilter_hooks.c  | 57 ++--
 net/bridge/br_netfilter_ipv6.c   | 12 
 3 files changed, 59 insertions(+), 16 deletions(-)

diff --git a/include/net/netfilter/br_netfilter.h 
b/include/net/netfilter/br_netfilter.h
index e8d1448..0b0c35c 100644
--- a/include/net/netfilter/br_netfilter.h
+++ b/include/net/netfilter/br_netfilter.h
@@ -15,6 +15,12 @@ static inline struct nf_bridge_info *nf_bridge_alloc(struct 
sk_buff *skb)
 
 void nf_bridge_update_protocol(struct sk_buff *skb);
 
+int br_nf_hook_thresh(unsigned int hook, struct net *net, struct sock *sk,
+ struct sk_buff *skb, struct net_device *indev,
+ struct net_device *outdev,
+ int (*okfn)(struct net *, struct sock *,
+ struct sk_buff *));
+
 static inline struct nf_bridge_info *
 nf_bridge_info_get(const struct sk_buff *skb)
 {
diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c
index 2d25979..19f230c 100644
--- a/net/bridge/br_netfilter_hooks.c
+++ b/net/bridge/br_netfilter_hooks.c
@@ -30,6 +30,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -395,11 +396,10 @@ bridged_dnat:
skb->dev = nf_bridge->physindev;
nf_bridge_update_protocol(skb);
nf_bridge_push_encap_header(skb);
-   NF_HOOK_THRESH(NFPROTO_BRIDGE,
-  NF_BR_PRE_ROUTING,
-  net, sk, skb, skb->dev, NULL,
-  br_nf_pre_routing_finish_bridge,
-  1);
+   br_nf_hook_thresh(NF_BR_PRE_ROUTING,
+ net, sk, skb, skb->dev,
+ NULL,
+ br_nf_pre_routing_finish);
return 0;
}
ether_addr_copy(eth_hdr(skb)->h_dest, dev->dev_addr);
@@ -417,10 +417,8 @@ bridged_dnat:
skb->dev = nf_bridge->physindev;
nf_bridge_update_protocol(skb);
nf_bridge_push_encap_header(skb);
-   NF_HOOK_THRESH(NFPROTO_BRIDGE, NF_BR_PRE_ROUTING, net, sk, skb,
-  skb->dev, NULL,
-  br_handle_frame_finish, 1);
-
+   br_nf_hook_thresh(NF_BR_PRE_ROUTING, net, sk, skb, skb->dev, NULL,
+ br_handle_frame_finish);
return 0;
 }
 
@@ -992,6 +990,47 @@ static struct notifier_block brnf_notifier __read_mostly = 
{
.notifier_call = brnf_device_event,
 };
 
+/* recursively invokes nf_hook_slow (again), skipping already-called
+ * hooks (< NF_BR_PRI_BRNF).
+ *
+ * Called with rcu read lock held.
+ */
+int br_nf_hook_thresh(unsigned int hook, struct net *net,
+ struct sock *sk, struct sk_buff *skb,
+ struct net_device *indev,
+ struct net_device *outdev,
+ int (*okfn)(struct net *, struct sock *,
+ struct sk_buf *))
+{
+   struct nf_hook_ops *elem;
+   struct nf_hook_state state;
+   struct list_head *head;
+   int ret;
+
+   head = >nf.hooks[NFPROTO_BRIDGE][hook];
+
+   list_for_each_entry_rcu(elem, head, list) {
+   struct nf_hook_ops *next;
+
+   next = list_entry_rcu(list_next_rcu(>list),
+ struct nf_hook_ops, list);
+   if (next->priority <= NF_BR_PRI_BRNF)
+   continue;
+   }
+
+   if (>list == head)
+   return okfn(net, sk, skb);
+
+   nf_hook_state_init(, head, hook, NF_BR_PRI_BRNF + 1,
+  NFPROTO_BRIDGE, indev, outdev, sk, net, okfn);
+
+   ret = nf_hook_slow(skb, );
+   if (ret == 1)
+   ret = okfn(net, sk, skb);
+
+   return ret;
+}
+
 #ifdef CONFIG_SYSCTL
 static
 int brnf_sysctl_call_tables(struct ctl_table *ctl, int write,
diff --git a/net/bridge/

[RFC nf-next 2/3] netfilter: call nf_hook_state_init with rcu_read_lock held

2016-06-22 Thread Aaron Conole
From: Florian Westphal <f...@strlen.de>

This makes things simpler because we can store the head of the list
in the nf_state structure without worrying about concurrent add/delete
of hook elements from the list.

Signed-off-by: Florian Westphal <f...@strlen.de>
Signed-off-by: Aaron Conole <acon...@bytheb.org>
---
 include/linux/netfilter.h  | 8 +++-
 include/linux/netfilter_ingress.h  | 1 +
 net/bridge/netfilter/ebt_redirect.c| 2 +-
 net/bridge/netfilter/ebtables.c| 2 +-
 net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c | 2 +-
 net/ipv4/netfilter/nf_conntrack_proto_icmp.c   | 2 +-
 net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c | 2 +-
 net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c | 2 +-
 net/netfilter/core.c   | 5 +
 net/netfilter/nf_conntrack_core.c  | 2 +-
 net/netfilter/nf_conntrack_h323_main.c | 2 +-
 net/netfilter/nf_conntrack_helper.c| 2 +-
 net/netfilter/nfnetlink_cthelper.c | 2 +-
 net/netfilter/nfnetlink_log.c  | 8 ++--
 net/netfilter/nfnetlink_queue.c| 2 +-
 net/netfilter/xt_helper.c  | 2 +-
 16 files changed, 27 insertions(+), 19 deletions(-)

diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
index 9230f9a..ad444f0 100644
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -174,10 +174,16 @@ static inline int nf_hook_thresh(u_int8_t pf, unsigned 
int hook,
 
if (!list_empty(hook_list)) {
struct nf_hook_state state;
+   int ret;
 
+   /* We may already have this, but read-locks nest anyway */
+   rcu_read_lock();
nf_hook_state_init(, hook_list, hook, thresh,
   pf, indev, outdev, sk, net, okfn);
-   return nf_hook_slow(skb, );
+
+   ret = nf_hook_slow(skb, );
+   rcu_read_unlock();
+   return ret;
}
return 1;
 }
diff --git a/include/linux/netfilter_ingress.h 
b/include/linux/netfilter_ingress.h
index 5fcd375..6965ba0 100644
--- a/include/linux/netfilter_ingress.h
+++ b/include/linux/netfilter_ingress.h
@@ -14,6 +14,7 @@ static inline bool nf_hook_ingress_active(const struct 
sk_buff *skb)
return !list_empty(>dev->nf_hooks_ingress);
 }
 
+/* caller must hold rcu_read_lock */
 static inline int nf_hook_ingress(struct sk_buff *skb)
 {
struct nf_hook_state state;
diff --git a/net/bridge/netfilter/ebt_redirect.c 
b/net/bridge/netfilter/ebt_redirect.c
index 20396499..2e7c4f9 100644
--- a/net/bridge/netfilter/ebt_redirect.c
+++ b/net/bridge/netfilter/ebt_redirect.c
@@ -24,7 +24,7 @@ ebt_redirect_tg(struct sk_buff *skb, const struct 
xt_action_param *par)
return EBT_DROP;
 
if (par->hooknum != NF_BR_BROUTING)
-   /* rcu_read_lock()ed by nf_hook_slow */
+   /* rcu_read_lock()ed by nf_hook_thresh */
ether_addr_copy(eth_hdr(skb)->h_dest,
br_port_get_rcu(par->in)->br->dev->dev_addr);
else
diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c
index 5a61f35..6faa2c3 100644
--- a/net/bridge/netfilter/ebtables.c
+++ b/net/bridge/netfilter/ebtables.c
@@ -148,7 +148,7 @@ ebt_basic_match(const struct ebt_entry *e, const struct 
sk_buff *skb,
return 1;
if (FWINV2(ebt_dev_check(e->out, out), EBT_IOUT))
return 1;
-   /* rcu_read_lock()ed by nf_hook_slow */
+   /* rcu_read_lock()ed by nf_hook_thresh */
if (in && (p = br_port_get_rcu(in)) != NULL &&
FWINV2(ebt_dev_check(e->logical_in, p->br->dev), EBT_ILOGICALIN))
return 1;
diff --git a/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c 
b/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
index ae1a71a..eab0239 100644
--- a/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
+++ b/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c
@@ -110,7 +110,7 @@ static unsigned int ipv4_helper(void *priv,
if (!help)
return NF_ACCEPT;
 
-   /* rcu_read_lock()ed by nf_hook_slow */
+   /* rcu_read_lock()ed by nf_hook_thresh */
helper = rcu_dereference(help->helper);
if (!helper)
return NF_ACCEPT;
diff --git a/net/ipv4/netfilter/nf_conntrack_proto_icmp.c 
b/net/ipv4/netfilter/nf_conntrack_proto_icmp.c
index c567e1b..2c08d6a 100644
--- a/net/ipv4/netfilter/nf_conntrack_proto_icmp.c
+++ b/net/ipv4/netfilter/nf_conntrack_proto_icmp.c
@@ -149,7 +149,7 @@ icmp_error_message(struct net *net, struct nf_conn *tmpl, 
struct sk_buff *skb,
return -NF_ACCEPT;
}
 
-   /* rcu_read_lock()ed by nf_hook_slow */
+   /* rcu_read_lock()ed by nf_hook_thresh */
innerproto = __nf_ct_l4proto_find(PF_INET, origtuple.dst.protonum);