Re: [PATCH 02/14] tcp: fix mark propagation with fwmark_reflect enabled

2017-01-26 Thread Eric Dumazet
On Thu, 2017-01-26 at 20:19 +0100, Pablo Neira Ayuso wrote:

> Right. This is not percpu as in IPv4.
> 
> I can send a follow up patch to get this in sync with the way we do it
> in IPv4, ie. add percpu socket.
> 
> Fine with this approach? Thanks!

Not really.

percpu sockets are going to slow down network namespace creation /
deletion and increase memory foot print.

IPv6 is cleaner because it does not really have to use different
sockets.

Ultimately would would like to have the same for IPv4.

I would rather carry the mark either in an additional parameter,
or in the flow that is already passed as a parameter.



--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 02/14] tcp: fix mark propagation with fwmark_reflect enabled

2017-01-26 Thread Pablo Neira Ayuso
On Thu, Jan 26, 2017 at 10:02:40AM -0800, Eric Dumazet wrote:
> On Thu, 2017-01-26 at 17:37 +0100, Pablo Neira Ayuso wrote:
> > From: Pau Espin Pedrol 
> > 
> > Otherwise, RST packets generated by the TCP stack for non-existing
> > sockets always have mark 0.
> > The mark from the original packet is assigned to the netns_ipv4/6
> > socket used to send the response so that it can get copied into the
> > response skb when the socket sends it.
> > 
> > Fixes: e110861f8609 ("net: add a sysctl to reflect the fwmark on replies")
> > Cc: Lorenzo Colitti 
> > Signed-off-by: Pau Espin Pedrol 
> > Signed-off-by: Pablo Neira Ayuso 
> > ---
> >  net/ipv4/ip_output.c | 1 +
> >  net/ipv6/tcp_ipv6.c  | 1 +
> >  2 files changed, 2 insertions(+)
> > 
> > diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
> > index fac275c48108..b67719f45953 100644
> > --- a/net/ipv4/ip_output.c
> > +++ b/net/ipv4/ip_output.c
> > @@ -1629,6 +1629,7 @@ void ip_send_unicast_reply(struct sock *sk, struct 
> > sk_buff *skb,
> > sk->sk_protocol = ip_hdr(skb)->protocol;
> > sk->sk_bound_dev_if = arg->bound_dev_if;
> > sk->sk_sndbuf = sysctl_wmem_default;
> > +   sk->sk_mark = fl4.flowi4_mark;
> > err = ip_append_data(sk, , ip_reply_glue_bits, arg->iov->iov_base,
> >  len, 0, , , MSG_DONTWAIT);
> > if (unlikely(err)) {
> > diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
> > index 73bc8fc68acd..2b20622a5824 100644
> > --- a/net/ipv6/tcp_ipv6.c
> > +++ b/net/ipv6/tcp_ipv6.c
> > @@ -840,6 +840,7 @@ static void tcp_v6_send_response(const struct sock *sk, 
> > struct sk_buff *skb, u32
> > dst = ip6_dst_lookup_flow(ctl_sk, , NULL);
> > if (!IS_ERR(dst)) {
> > skb_dst_set(buff, dst);
> > +   ctl_sk->sk_mark = fl6.flowi6_mark;
> > ip6_xmit(ctl_sk, buff, , NULL, tclass);
> > TCP_INC_STATS(net, TCP_MIB_OUTSEGS);
> > if (rst)
> 
> 
> This patch is wrong.
> 
> ctl_sk is a shared socket, and tcp_v6_send_response() can be called from
> many different cpus at the same time.

Right. This is not percpu as in IPv4.

I can send a follow up patch to get this in sync with the way we do it
in IPv4, ie. add percpu socket.

Fine with this approach? Thanks!
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 02/14] tcp: fix mark propagation with fwmark_reflect enabled

2017-01-26 Thread Eric Dumazet
On Thu, 2017-01-26 at 17:37 +0100, Pablo Neira Ayuso wrote:
> From: Pau Espin Pedrol 
> 
> Otherwise, RST packets generated by the TCP stack for non-existing
> sockets always have mark 0.
> The mark from the original packet is assigned to the netns_ipv4/6
> socket used to send the response so that it can get copied into the
> response skb when the socket sends it.
> 
> Fixes: e110861f8609 ("net: add a sysctl to reflect the fwmark on replies")
> Cc: Lorenzo Colitti 
> Signed-off-by: Pau Espin Pedrol 
> Signed-off-by: Pablo Neira Ayuso 
> ---
>  net/ipv4/ip_output.c | 1 +
>  net/ipv6/tcp_ipv6.c  | 1 +
>  2 files changed, 2 insertions(+)
> 
> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
> index fac275c48108..b67719f45953 100644
> --- a/net/ipv4/ip_output.c
> +++ b/net/ipv4/ip_output.c
> @@ -1629,6 +1629,7 @@ void ip_send_unicast_reply(struct sock *sk, struct 
> sk_buff *skb,
>   sk->sk_protocol = ip_hdr(skb)->protocol;
>   sk->sk_bound_dev_if = arg->bound_dev_if;
>   sk->sk_sndbuf = sysctl_wmem_default;
> + sk->sk_mark = fl4.flowi4_mark;
>   err = ip_append_data(sk, , ip_reply_glue_bits, arg->iov->iov_base,
>len, 0, , , MSG_DONTWAIT);
>   if (unlikely(err)) {
> diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
> index 73bc8fc68acd..2b20622a5824 100644
> --- a/net/ipv6/tcp_ipv6.c
> +++ b/net/ipv6/tcp_ipv6.c
> @@ -840,6 +840,7 @@ static void tcp_v6_send_response(const struct sock *sk, 
> struct sk_buff *skb, u32
>   dst = ip6_dst_lookup_flow(ctl_sk, , NULL);
>   if (!IS_ERR(dst)) {
>   skb_dst_set(buff, dst);
> + ctl_sk->sk_mark = fl6.flowi6_mark;
>   ip6_xmit(ctl_sk, buff, , NULL, tclass);
>   TCP_INC_STATS(net, TCP_MIB_OUTSEGS);
>   if (rst)


This patch is wrong.

ctl_sk is a shared socket, and tcp_v6_send_response() can be called from
many different cpus at the same time.



--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 02/14] tcp: fix mark propagation with fwmark_reflect enabled

2017-01-26 Thread Pablo Neira Ayuso
From: Pau Espin Pedrol 

Otherwise, RST packets generated by the TCP stack for non-existing
sockets always have mark 0.
The mark from the original packet is assigned to the netns_ipv4/6
socket used to send the response so that it can get copied into the
response skb when the socket sends it.

Fixes: e110861f8609 ("net: add a sysctl to reflect the fwmark on replies")
Cc: Lorenzo Colitti 
Signed-off-by: Pau Espin Pedrol 
Signed-off-by: Pablo Neira Ayuso 
---
 net/ipv4/ip_output.c | 1 +
 net/ipv6/tcp_ipv6.c  | 1 +
 2 files changed, 2 insertions(+)

diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index fac275c48108..b67719f45953 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1629,6 +1629,7 @@ void ip_send_unicast_reply(struct sock *sk, struct 
sk_buff *skb,
sk->sk_protocol = ip_hdr(skb)->protocol;
sk->sk_bound_dev_if = arg->bound_dev_if;
sk->sk_sndbuf = sysctl_wmem_default;
+   sk->sk_mark = fl4.flowi4_mark;
err = ip_append_data(sk, , ip_reply_glue_bits, arg->iov->iov_base,
 len, 0, , , MSG_DONTWAIT);
if (unlikely(err)) {
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 73bc8fc68acd..2b20622a5824 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -840,6 +840,7 @@ static void tcp_v6_send_response(const struct sock *sk, 
struct sk_buff *skb, u32
dst = ip6_dst_lookup_flow(ctl_sk, , NULL);
if (!IS_ERR(dst)) {
skb_dst_set(buff, dst);
+   ctl_sk->sk_mark = fl6.flowi6_mark;
ip6_xmit(ctl_sk, buff, , NULL, tclass);
TCP_INC_STATS(net, TCP_MIB_OUTSEGS);
if (rst)
-- 
2.1.4

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html