Re: [PATCH nf-next,RFC 08/10] netfilter: move NF_QUEUE handling away from core

2016-10-14 Thread Pablo Neira Ayuso
On Fri, Oct 14, 2016 at 06:47:20PM +0200, Pablo Neira Ayuso wrote:
> On Fri, Oct 14, 2016 at 05:38:12PM +0200, Florian Westphal wrote:
> > Pablo Neira Ayuso  wrote:
> > > On Fri, Oct 14, 2016 at 04:06:15PM +0800, Liping Zhang wrote:
> > > > Hi Pablo,
> > > >
> > > > 2016-10-13 20:02 GMT+08:00 Pablo Neira Ayuso :
> > > > > +int nf_queue(struct sk_buff *skb, const struct nf_hook_state *state,
> > > > > +unsigned int queuenum, bool bypass)
> > > > > +{
> > > > > +   int ret;
> > > > > +
> > > > > +   ret = __nf_queue(skb, state, queuenum);
> > > > > +   if (ret < 0) {
> > > > > +   if (ret == -ESRCH && bypass)
> > > > > +   return NF_ACCEPT;
> > > > > +   kfree_skb(skb);
> > > > > +   return NF_DROP;
> > > > > +   }
> > > > > +
> > > > > +   return NF_STOLEN;
> > > >
> > > > I think this will break something ... Imagine such situation:
> > > > # ip route add default dev eth0
> > > > # ip rule add fwmark 0x1/0xf lookup eth1
> > > > # ip rule add fwmark 0x2/0xf lookup eth2
> > > > # iptables -t mangle -A OUTPUT -d 1.1.1.1 -j MARK --set-mark 0x1
> > > > # iptables -t mangle -A OUTPUT -d 2.2.2.2 -j MARK --set-mark 0x2
> > > > # iptables -t mangle -A OUTPUT -j NFQUEUE
> > > >
> > > > So ip packets with dst 1.1.1.1 will be sent via eth1, ip packets with
> > > > dst 2.2.2.2 will be sent via eth2 ...
> > > >
> > > > But apply this patch, after queue the packet with dst 1.1.1.1 to the
> > > > userspace and reinject it to the kernel, the packet will be sent via
> > > > the wrong interface, i.e. eth0 not eth1.
> > > >
> > > > Because ret is *NF_STOLEN* so we will not call ip_route_me_harder
> > > > to do re-route in ipt_mangle_out().
> > > 
> > > Good point. Then, we can just return NF_QUEUE here instead, which
> > > would become sort of an alias of NF_STOLEN, but this now just signals
> > > the core that the packet was enqueued to userspace. I mean:
> > > 
> > > int nf_queue(struct sk_buff *skb, const struct nf_hook_state *state,
> > > unsigned int queuenum, bool bypass)
> > > {
> > >int ret;
> > > 
> > >ret = __nf_queue(skb, state, queuenum);
> > >if (ret < 0) {
> > >if (ret == -ESRCH && bypass)
> > >return NF_ACCEPT;
> > >kfree_skb(skb);
> > >return NF_DROP;
> > >}
> > > 
> > >return NF_QUEUE; <--- this.
> > > }
> > 
> > I'm afraid that won't fly.  When NF_QUEUE is returned here, we're
> > in a race as skb is already on its way to userspace (or perhaps already
> > being reinjected/dropped on other cpu).
> > 
> > I think the simplest way out is to always re-route from nf_reinject
> > in case we were queued from mangle output.
> > 
> > For nft, we might be able to make a note of 'route' chain type in the
> > nf_hook_state and then have nf_reinject check for that.
> 
> Hm, we already have afinfo->saveroute() and afinfo->reroute() handling
> from nf_queue() and nf_reinject() respectively, so returning NF_STOLEN
> (as originally proposed) should be fine.

Oh I see, but this doesn't cover Liping's usecase above, since the
mark would not be updated from nfqueue userspace application.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH nf-next,RFC 08/10] netfilter: move NF_QUEUE handling away from core

2016-10-14 Thread Pablo Neira Ayuso
On Fri, Oct 14, 2016 at 05:38:12PM +0200, Florian Westphal wrote:
> Pablo Neira Ayuso  wrote:
> > On Fri, Oct 14, 2016 at 04:06:15PM +0800, Liping Zhang wrote:
> > > Hi Pablo,
> > >
> > > 2016-10-13 20:02 GMT+08:00 Pablo Neira Ayuso :
> > > > +int nf_queue(struct sk_buff *skb, const struct nf_hook_state *state,
> > > > +unsigned int queuenum, bool bypass)
> > > > +{
> > > > +   int ret;
> > > > +
> > > > +   ret = __nf_queue(skb, state, queuenum);
> > > > +   if (ret < 0) {
> > > > +   if (ret == -ESRCH && bypass)
> > > > +   return NF_ACCEPT;
> > > > +   kfree_skb(skb);
> > > > +   return NF_DROP;
> > > > +   }
> > > > +
> > > > +   return NF_STOLEN;
> > >
> > > I think this will break something ... Imagine such situation:
> > > # ip route add default dev eth0
> > > # ip rule add fwmark 0x1/0xf lookup eth1
> > > # ip rule add fwmark 0x2/0xf lookup eth2
> > > # iptables -t mangle -A OUTPUT -d 1.1.1.1 -j MARK --set-mark 0x1
> > > # iptables -t mangle -A OUTPUT -d 2.2.2.2 -j MARK --set-mark 0x2
> > > # iptables -t mangle -A OUTPUT -j NFQUEUE
> > >
> > > So ip packets with dst 1.1.1.1 will be sent via eth1, ip packets with
> > > dst 2.2.2.2 will be sent via eth2 ...
> > >
> > > But apply this patch, after queue the packet with dst 1.1.1.1 to the
> > > userspace and reinject it to the kernel, the packet will be sent via
> > > the wrong interface, i.e. eth0 not eth1.
> > >
> > > Because ret is *NF_STOLEN* so we will not call ip_route_me_harder
> > > to do re-route in ipt_mangle_out().
> > 
> > Good point. Then, we can just return NF_QUEUE here instead, which
> > would become sort of an alias of NF_STOLEN, but this now just signals
> > the core that the packet was enqueued to userspace. I mean:
> > 
> > int nf_queue(struct sk_buff *skb, const struct nf_hook_state *state,
> > unsigned int queuenum, bool bypass)
> > {
> >int ret;
> > 
> >ret = __nf_queue(skb, state, queuenum);
> >if (ret < 0) {
> >if (ret == -ESRCH && bypass)
> >return NF_ACCEPT;
> >kfree_skb(skb);
> >return NF_DROP;
> >}
> > 
> >return NF_QUEUE; <--- this.
> > }
> 
> I'm afraid that won't fly.  When NF_QUEUE is returned here, we're
> in a race as skb is already on its way to userspace (or perhaps already
> being reinjected/dropped on other cpu).
> 
> I think the simplest way out is to always re-route from nf_reinject
> in case we were queued from mangle output.
> 
> For nft, we might be able to make a note of 'route' chain type in the
> nf_hook_state and then have nf_reinject check for that.

Hm, we already have afinfo->saveroute() and afinfo->reroute() handling
from nf_queue() and nf_reinject() respectively, so returning NF_STOLEN
(as originally proposed) should be fine.


Re: [PATCH nf-next,RFC 08/10] netfilter: move NF_QUEUE handling away from core

2016-10-14 Thread Florian Westphal
Pablo Neira Ayuso  wrote:
> On Fri, Oct 14, 2016 at 04:06:15PM +0800, Liping Zhang wrote:
> > Hi Pablo,
> >
> > 2016-10-13 20:02 GMT+08:00 Pablo Neira Ayuso :
> > > +int nf_queue(struct sk_buff *skb, const struct nf_hook_state *state,
> > > +unsigned int queuenum, bool bypass)
> > > +{
> > > +   int ret;
> > > +
> > > +   ret = __nf_queue(skb, state, queuenum);
> > > +   if (ret < 0) {
> > > +   if (ret == -ESRCH && bypass)
> > > +   return NF_ACCEPT;
> > > +   kfree_skb(skb);
> > > +   return NF_DROP;
> > > +   }
> > > +
> > > +   return NF_STOLEN;
> >
> > I think this will break something ... Imagine such situation:
> > # ip route add default dev eth0
> > # ip rule add fwmark 0x1/0xf lookup eth1
> > # ip rule add fwmark 0x2/0xf lookup eth2
> > # iptables -t mangle -A OUTPUT -d 1.1.1.1 -j MARK --set-mark 0x1
> > # iptables -t mangle -A OUTPUT -d 2.2.2.2 -j MARK --set-mark 0x2
> > # iptables -t mangle -A OUTPUT -j NFQUEUE
> >
> > So ip packets with dst 1.1.1.1 will be sent via eth1, ip packets with
> > dst 2.2.2.2 will be sent via eth2 ...
> >
> > But apply this patch, after queue the packet with dst 1.1.1.1 to the
> > userspace and reinject it to the kernel, the packet will be sent via
> > the wrong interface, i.e. eth0 not eth1.
> >
> > Because ret is *NF_STOLEN* so we will not call ip_route_me_harder
> > to do re-route in ipt_mangle_out().
> 
> Good point. Then, we can just return NF_QUEUE here instead, which
> would become sort of an alias of NF_STOLEN, but this now just signals
> the core that the packet was enqueued to userspace. I mean:
> 
> int nf_queue(struct sk_buff *skb, const struct nf_hook_state *state,
> unsigned int queuenum, bool bypass)
> {
>int ret;
> 
>ret = __nf_queue(skb, state, queuenum);
>if (ret < 0) {
>if (ret == -ESRCH && bypass)
>return NF_ACCEPT;
>kfree_skb(skb);
>return NF_DROP;
>}
> 
>return NF_QUEUE; <--- this.
> }

I'm afraid that won't fly.  When NF_QUEUE is returned here, we're
in a race as skb is already on its way to userspace (or perhaps already
being reinjected/dropped on other cpu).

I think the simplest way out is to always re-route from nf_reinject
in case we were queued from mangle output.

For nft, we might be able to make a note of 'route' chain type in the
nf_hook_state and then have nf_reinject check for that.



Re: [PATCH nf-next,RFC 08/10] netfilter: move NF_QUEUE handling away from core

2016-10-14 Thread Pablo Neira Ayuso
On Fri, Oct 14, 2016 at 11:53:30AM +0200, Pablo Neira Ayuso wrote:
[...] 
> BTW, looking at ipt_mangle_out():
> 
> ret = ipt_do_table(skb, state, state->net->ipv4.iptable_mangle);
> /* Reroute for ANY change. */
> if (ret != NF_DROP && ret != NF_STOLEN) {
> iph = ip_hdr(skb);
> 
> if (iph->saddr != saddr ||
> iph->daddr != daddr ||
> skb->mark != mark ||
> iph->tos != tos) {
> err = ip_route_me_harder(state->net, skb, RTN_UNSPEC);
> if (err < 0)
> ret = NF_DROP_ERR(err);
> }
> }
> 
> It seems that we're triggering an expensive re-route for dropped
> packets from the mangle table, since ret != NF_DROP evaluates true
> given the errno number is encoded in the most significant 16 bits.

Forget this, we never see errno at this stage, so this is fine.


Re: [PATCH nf-next,RFC 08/10] netfilter: move NF_QUEUE handling away from core

2016-10-14 Thread Pablo Neira Ayuso
On Fri, Oct 14, 2016 at 04:06:15PM +0800, Liping Zhang wrote:
> Hi Pablo,
>
> 2016-10-13 20:02 GMT+08:00 Pablo Neira Ayuso :
> > +int nf_queue(struct sk_buff *skb, const struct nf_hook_state *state,
> > +unsigned int queuenum, bool bypass)
> > +{
> > +   int ret;
> > +
> > +   ret = __nf_queue(skb, state, queuenum);
> > +   if (ret < 0) {
> > +   if (ret == -ESRCH && bypass)
> > +   return NF_ACCEPT;
> > +   kfree_skb(skb);
> > +   return NF_DROP;
> > +   }
> > +
> > +   return NF_STOLEN;
>
> I think this will break something ... Imagine such situation:
> # ip route add default dev eth0
> # ip rule add fwmark 0x1/0xf lookup eth1
> # ip rule add fwmark 0x2/0xf lookup eth2
> # iptables -t mangle -A OUTPUT -d 1.1.1.1 -j MARK --set-mark 0x1
> # iptables -t mangle -A OUTPUT -d 2.2.2.2 -j MARK --set-mark 0x2
> # iptables -t mangle -A OUTPUT -j NFQUEUE
>
> So ip packets with dst 1.1.1.1 will be sent via eth1, ip packets with
> dst 2.2.2.2 will be sent via eth2 ...
>
> But apply this patch, after queue the packet with dst 1.1.1.1 to the
> userspace and reinject it to the kernel, the packet will be sent via
> the wrong interface, i.e. eth0 not eth1.
>
> Because ret is *NF_STOLEN* so we will not call ip_route_me_harder
> to do re-route in ipt_mangle_out().

Good point. Then, we can just return NF_QUEUE here instead, which
would become sort of an alias of NF_STOLEN, but this now just signals
the core that the packet was enqueued to userspace. I mean:

int nf_queue(struct sk_buff *skb, const struct nf_hook_state *state,
unsigned int queuenum, bool bypass)
{
   int ret;

   ret = __nf_queue(skb, state, queuenum);
   if (ret < 0) {
   if (ret == -ESRCH && bypass)
   return NF_ACCEPT;
   kfree_skb(skb);
   return NF_DROP;
   }

   return NF_QUEUE; <--- this.
}

BTW, looking at ipt_mangle_out():

ret = ipt_do_table(skb, state, state->net->ipv4.iptable_mangle);
/* Reroute for ANY change. */
if (ret != NF_DROP && ret != NF_STOLEN) {
iph = ip_hdr(skb);

if (iph->saddr != saddr ||
iph->daddr != daddr ||
skb->mark != mark ||
iph->tos != tos) {
err = ip_route_me_harder(state->net, skb, RTN_UNSPEC);
if (err < 0)
ret = NF_DROP_ERR(err);
}
}

It seems that we're triggering an expensive re-route for dropped
packets from the mangle table, since ret != NF_DROP evaluates true
given the errno number is encoded in the most significant 16 bits.

> > diff --git a/net/netfilter/nft_queue.c b/net/netfilter/nft_queue.c
> > index f596a1614daa..015053a2643d 100644
> > --- a/net/netfilter/nft_queue.c
> > +++ b/net/netfilter/nft_queue.c
> > @@ -48,10 +48,8 @@ static void nft_queue_eval(const struct nft_expr *expr,
> > }
> > }
> >
> > -   ret = NF_QUEUE_NR(queue);
> > -   if (priv->flags & NFT_QUEUE_FLAG_BYPASS)
> > -   ret |= NF_VERDICT_FLAG_QUEUE_BYPASS;
> > -
> > +   ret = nf_queue(pkt->skb, pkt->xt.state, NF_QUEUE_NR(queue),
> > +  priv->flags & NFT_QUEUE_FLAG_BYPASS);
> > regs->verdict.code = ret;
> >  }
>
> I think here we forget to use nf_queue() in nft_queue_sreg_eval().
>
> And in nfnl_userspace_cthelper(), such conversion was missed also.

Right, thanks, will fix up this spot too.


Re: [PATCH nf-next,RFC 08/10] netfilter: move NF_QUEUE handling away from core

2016-10-14 Thread Liping Zhang
Hi Pablo,

2016-10-13 20:02 GMT+08:00 Pablo Neira Ayuso :
> +int nf_queue(struct sk_buff *skb, const struct nf_hook_state *state,
> +unsigned int queuenum, bool bypass)
> +{
> +   int ret;
> +
> +   ret = __nf_queue(skb, state, queuenum);
> +   if (ret < 0) {
> +   if (ret == -ESRCH && bypass)
> +   return NF_ACCEPT;
> +   kfree_skb(skb);
> +   return NF_DROP;
> +   }
> +
> +   return NF_STOLEN;

I think this will break something ... Imagine this situation:
# ip route add default dev eth0
# ip rule add fwmark 0x1/0xf lookup eth1
# ip rule add fwmark 0x2/0xf lookup eth2
# iptables -t mangle -A OUTPUT -d 1.1.1.1 -j MARK --set-mark 0x1
# iptables -t mangle -A OUTPUT -d 2.2.2.2 -j MARK --set-mark 0x2
# iptables -t mangle -A OUTPUT -j NFQUEUE

So IP packets with dst 1.1.1.1 will be sent via eth1, and IP packets with
dst 2.2.2.2 will be sent via eth2 ...

But with this patch applied, after the packet with dst 1.1.1.1 is queued
to userspace and reinjected into the kernel, it will be sent via the
wrong interface, i.e. eth0 instead of eth1.

Because ret is *NF_STOLEN*, we will not call ip_route_me_harder()
to re-route in ipt_mangle_out().
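
To make the concern concrete, here is a small userspace sketch of the
re-route condition as it appears in ipt_mangle_out(). It is not kernel
code: struct pkt_key and needs_reroute() are hypothetical names used
only for illustration, and the verdict values are copied from the
netfilter uapi header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Verdict values as defined in the netfilter uapi header. */
enum { NF_DROP = 0, NF_ACCEPT = 1, NF_STOLEN = 2 };

/* Hypothetical reduced packet: only the fields ipt_mangle_out() compares. */
struct pkt_key {
	uint32_t saddr;
	uint32_t daddr;
	uint32_t mark;
	uint8_t tos;
};

/*
 * Returns true when the ipt_mangle_out() condition would end up calling
 * ip_route_me_harder(): the verdict is neither NF_DROP nor NF_STOLEN and
 * one of the routing-relevant fields changed while the table ran.
 */
static bool needs_reroute(unsigned int verdict,
			  const struct pkt_key *before,
			  const struct pkt_key *after)
{
	if (verdict == NF_DROP || verdict == NF_STOLEN)
		return false;

	return before->saddr != after->saddr ||
	       before->daddr != after->daddr ||
	       before->mark != after->mark ||
	       before->tos != after->tos;
}
```

With a packet whose mark changed from 0 to 0x1, needs_reroute() is true
for NF_ACCEPT but false for NF_STOLEN, which is the case described
above: the mark is set, but the route is never looked up again.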

> diff --git a/net/netfilter/nft_queue.c b/net/netfilter/nft_queue.c
> index f596a1614daa..015053a2643d 100644
> --- a/net/netfilter/nft_queue.c
> +++ b/net/netfilter/nft_queue.c
> @@ -48,10 +48,8 @@ static void nft_queue_eval(const struct nft_expr *expr,
> }
> }
>
> -   ret = NF_QUEUE_NR(queue);
> -   if (priv->flags & NFT_QUEUE_FLAG_BYPASS)
> -   ret |= NF_VERDICT_FLAG_QUEUE_BYPASS;
> -
> +   ret = nf_queue(pkt->skb, pkt->xt.state, NF_QUEUE_NR(queue),
> +  priv->flags & NFT_QUEUE_FLAG_BYPASS);
> regs->verdict.code = ret;
>  }

I think we forgot to convert nft_queue_sreg_eval() to nf_queue() here.

The same conversion is also missing in nfnl_userspace_cthelper().


Re: [PATCH nf-next,RFC 08/10] netfilter: move NF_QUEUE handling away from core

2016-10-13 Thread Pablo Neira Ayuso
On Thu, Oct 13, 2016 at 02:38:21PM +0200, Florian Westphal wrote:
> Pablo Neira Ayuso  wrote:
[...]
> > diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
> > index de4fa03f46f3..7040842c34f4 100644
> > --- a/net/ipv4/netfilter/ip_tables.c
> > +++ b/net/ipv4/netfilter/ip_tables.c
> > @@ -29,6 +29,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  #include "../../netfilter/xt_repldata.h"
> >  
> >  MODULE_LICENSE("GPL");
> > @@ -329,6 +330,9 @@ ipt_do_table(struct sk_buff *skb,
> > /* Pop from stack? */
> > if (v != XT_RETURN) {
> > verdict = (unsigned int)(-v) - 1;
> > +   if (verdict == NF_QUEUE)
> > +   verdict = nf_queue(skb, state,
> > +  0, false);
> 
> Any reason why this is needed?
> AFAICS xt_NFQUEUE will never return NF_QUEUE after this patch.

-j QUEUE uses the standard target to return NF_QUEUE. This is a very
primitive way to queue packets to userspace queue 0 via nf_queue, but
removing it may still break users. I can place this under unlikely(),
as these days people should be using NFQUEUE instead.
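
For readers unfamiliar with the standard target, the following
userspace sketch shows how an absolute verdict is stored as a negative
number and decoded again by the expression in the hunk above. The
helper names std_target_encode() and std_target_decode() are
hypothetical; only the arithmetic is taken from the quoted code.

```c
/* Verdict values as defined in the netfilter uapi header. */
enum { NF_DROP = 0, NF_ACCEPT = 1, NF_STOLEN = 2, NF_QUEUE = 3 };

/* Encode an absolute verdict the way a standard target stores it. */
static int std_target_encode(unsigned int verdict)
{
	return -(int)verdict - 1;
}

/* Decode it again, using the expression from the ipt_do_table() hunk. */
static unsigned int std_target_decode(int v)
{
	return (unsigned int)(-v) - 1;
}
```

So -j QUEUE ends up stored as -4, and decoding -4 gives NF_QUEUE back,
which is the value the new check in ipt_do_table() tests for.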

> > diff --git a/net/netfilter/core.c b/net/netfilter/core.c
> > index 2b3b2f8e39c4..9ae2febd86e3 100644
> > --- a/net/netfilter/core.c
> > +++ b/net/netfilter/core.c
> > @@ -309,6 +309,7 @@ unsigned int nf_iterate(struct sk_buff *skb,
> > unsigned int verdict;
> >  
> > while (*entryp) {
> > +   RCU_INIT_POINTER(state->hook_entries, *entryp);
> >  repeat:
> > verdict = (*entryp)->ops.hook((*entryp)->ops.priv, skb, state);
> > if (verdict != NF_ACCEPT) {
> > @@ -331,9 +332,8 @@ int nf_hook_slow(struct sk_buff *skb, struct 
> > nf_hook_state *state)
> > int ret;
> >  
> > entry = rcu_dereference(state->hook_entries);
> > -next_hook:
> > > verdict = nf_iterate(skb, state, &entry);
> > -   switch (verdict & NF_VERDICT_MASK) {
> > +   switch (verdict) {
> 
> This looks buggy, verdict might encode errno for NF_DROP case.
> 
> What you could do is:
> 
> switch (verdict) {
> case NF_ACCEPT:
>   /* something */
>   break;
> case NF_STOLEN:
>   break;
> case NF_DROP: /* fallthrough */
> default: /* drop with error? */
>   kfree_skb(skb);
>   errno = ...
> }

Right, will fix this, thanks. 
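
As an illustration of the point about the encoded errno, here is a
userspace sketch. The macro values mirror the netfilter headers as far
as I recall them (NF_DROP_ERR is written with extra parentheses here);
verdict_code() and verdict_errno() are hypothetical helpers, not kernel
functions.

```c
#include <errno.h>

#define NF_DROP		0
#define NF_ACCEPT	1
#define NF_VERDICT_MASK	0x000000ff

/* Pack a negative errno into the upper 16 bits of an NF_DROP verdict. */
#define NF_DROP_ERR(x)	(((-(x)) << 16) | NF_DROP)

/* Hypothetical helper: the verdict code without the upper bits. */
static unsigned int verdict_code(unsigned int verdict)
{
	return verdict & NF_VERDICT_MASK;
}

/* Hypothetical helper: recover the negative errno from the upper bits. */
static int verdict_errno(unsigned int verdict)
{
	return -(int)(verdict >> 16);
}
```

For NF_DROP_ERR(-EPERM) the raw value is not equal to NF_DROP, so a
plain "case NF_DROP:" in an unmasked switch would not match it, while
verdict_code() still yields NF_DROP and verdict_errno() returns -EPERM.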


Re: [PATCH nf-next,RFC 08/10] netfilter: move NF_QUEUE handling away from core

2016-10-13 Thread Florian Westphal
Pablo Neira Ayuso  wrote:
> > Any reason why this is needed?
> > AFAICS xt_NFQUEUE will never return NF_QUEUE after this patch.
> 
> -j QUEUE uses the standard target to return NF_QUEUE. This is a very
> primitive way to queue packets to userspace queue 0 via nf_queue, but
> removing it may still break users. I can place this under unlikely(),
> as these days people should be using NFQUEUE instead.

No need, just add a comment that this handles legacy standard target
QUEUE (i forgot we still have this).


[PATCH nf-next,RFC 08/10] netfilter: move NF_QUEUE handling away from core

2016-10-13 Thread Pablo Neira Ayuso
Export a new nf_queue() function that translates the NF_QUEUE verdict
depending on the scenario:

1) Drop packet if queue is full.
2) Accept packet if bypass is enabled.
3) Return stolen if packet is enqueued.

We can call this function from xt_NFQUEUE and nft_queue. Thus, we
move packet queuing to userspace away from the core path.
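
The three cases can be sketched in userspace as follows. This is only
an illustration of the control flow, not the kernel function:
enqueue_fn stands in for __nf_queue(), queue_verdict() and the
enqueue_* callbacks are hypothetical names, and using -ENOSPC for the
"queue full" case is an assumption made for the example.

```c
#include <errno.h>
#include <stdbool.h>

/* Verdict values as defined in the netfilter uapi header. */
enum { NF_DROP = 0, NF_ACCEPT = 1, NF_STOLEN = 2 };

/*
 * Hypothetical stand-in for __nf_queue(): returns 0 on success or a
 * negative errno. The real function also takes the skb and hook state.
 */
typedef int (*enqueue_fn)(unsigned int queuenum);

/*
 * Mirrors the control flow of the proposed nf_queue(). Freeing the skb
 * on the drop path is left out because there is no skb in this sketch.
 */
static int queue_verdict(enqueue_fn enqueue, unsigned int queuenum,
			 bool bypass)
{
	int ret = enqueue(queuenum);

	if (ret < 0) {
		if (ret == -ESRCH && bypass)
			return NF_ACCEPT;	/* nobody listening, bypass set */
		return NF_DROP;			/* any other failure */
	}

	return NF_STOLEN;			/* packet now belongs to the queue */
}

/* Hypothetical enqueue outcomes used for illustration. */
static int enqueue_ok(unsigned int q)      { (void)q; return 0; }
static int enqueue_no_peer(unsigned int q) { (void)q; return -ESRCH; }
static int enqueue_full(unsigned int q)    { (void)q; return -ENOSPC; }
```

Note that bypass only turns the -ESRCH case into NF_ACCEPT; with
enqueue_full the result is NF_DROP regardless of the bypass flag.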

We still have to handle the old QUEUE standard target for
{ip,ip6}_tables, which points to queue number zero. Just in case we
still have any user relying on this behaviour. No need to handle this
from arp and ebtables, they never got a native queue target.

After this patch, we have to unconditionally set state->hook_entries
before calling the hook from nf_iterate(), since we need this to know
from which hook the packet is escaping to userspace in nf_queue().

From nft_verdict_init(), disallow NF_QUEUE as verdict since we always
use the nft_queue expression for this and we don't have any userspace
code using this since the beginning.

Signed-off-by: Pablo Neira Ayuso 
---
 include/net/netfilter/nf_queue.h |  3 +++
 net/ipv4/netfilter/arp_tables.c  |  1 +
 net/ipv4/netfilter/ip_tables.c   |  4 
 net/ipv6/netfilter/ip6_tables.c  |  4 
 net/netfilter/core.c | 14 ++-
 net/netfilter/nf_internals.h |  2 --
 net/netfilter/nf_queue.c | 51 ++--
 net/netfilter/nf_tables_api.c|  3 +--
 net/netfilter/nf_tables_core.c   |  3 +--
 net/netfilter/nft_queue.c|  6 ++---
 net/netfilter/xt_NFQUEUE.c   | 29 ---
 11 files changed, 67 insertions(+), 53 deletions(-)

diff --git a/include/net/netfilter/nf_queue.h b/include/net/netfilter/nf_queue.h
index 2280cfe86c56..807b9de72b43 100644
--- a/include/net/netfilter/nf_queue.h
+++ b/include/net/netfilter/nf_queue.h
@@ -29,6 +29,9 @@ struct nf_queue_handler {
 
 void nf_register_queue_handler(struct net *net, const struct nf_queue_handler *qh);
 void nf_unregister_queue_handler(struct net *net);
+
+int nf_queue(struct sk_buff *skb, const struct nf_hook_state *state,
+unsigned int queuenum, bool bypass);
 void nf_reinject(struct nf_queue_entry *entry, unsigned int verdict);
 
 void nf_queue_entry_get_refs(struct nf_queue_entry *entry);
diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
index e76ab23a2deb..83d82f6be8dd 100644
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -28,6 +28,7 @@
 
 #include 
 #include 
+#include 
 #include "../../netfilter/xt_repldata.h"
 
 MODULE_LICENSE("GPL");
diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index de4fa03f46f3..7040842c34f4 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "../../netfilter/xt_repldata.h"
 
 MODULE_LICENSE("GPL");
@@ -329,6 +330,9 @@ ipt_do_table(struct sk_buff *skb,
/* Pop from stack? */
if (v != XT_RETURN) {
verdict = (unsigned int)(-v) - 1;
+   if (verdict == NF_QUEUE)
+   verdict = nf_queue(skb, state,
+  0, false);
break;
}
if (stackidx == 0) {
diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c
index 7eac01d5d621..7119daa19ba6 100644
--- a/net/ipv6/netfilter/ip6_tables.c
+++ b/net/ipv6/netfilter/ip6_tables.c
@@ -33,6 +33,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "../../netfilter/xt_repldata.h"
 
 MODULE_LICENSE("GPL");
@@ -361,6 +362,9 @@ ip6t_do_table(struct sk_buff *skb,
/* Pop from stack? */
if (v != XT_RETURN) {
verdict = (unsigned int)(-v) - 1;
+   if (verdict == NF_QUEUE)
+   verdict = nf_queue(skb, state,
+  0, false);
break;
}
if (stackidx == 0)
diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index 2b3b2f8e39c4..9ae2febd86e3 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -309,6 +309,7 @@ unsigned int nf_iterate(struct sk_buff *skb,
unsigned int verdict;
 
while (*entryp) {
+   RCU_INIT_POINTER(state->hook_entries, *entryp);
 repeat:
verdict = (*entryp)->ops.hook((*entryp)->ops.priv, skb, state);
if (verdict != NF_ACCEPT) {
@@ -331,9 +332,8 @@ int nf_hook_slow(struct sk_buff *skb, struct nf_hook_state *state)
int ret;