Re: [PATCH] net: implement IP_RECVHDRS option to get full headers through recvmsg cmsg.

2018-04-02 Thread David Miller
From: Maciej Żenczykowski 
Date: Sat, 31 Mar 2018 22:43:14 -0700

> From: Luigi Rizzo 
> 
> We have all sorts of different ways to fetch pre-UDP payload metadata:
>   IP_RECVTOS
>   IP_RECVTTL
>   IP_RECVOPTS
>   IP_RETOPTS
> 
> But nothing generic which simply allows you to receive the entire packet 
> header.
> 
> This is in similar vein to TCP_SAVE_SYN but for UDP and other datagram 
> sockets.
> 
> This is envisioned as a way to get GUE extension metadata for encapsulated
> packets, but implemented in a way to be much more future proof.
> 
> (Implemented by Luigi, who asked me to send it upstream)
> 
> Cc: Eric Dumazet 
> Signed-off-by: Luigi Rizzo 
> Signed-off-by: Maciej Żenczykowski 

This is an ipv4 level socket option, so why are you copying in the MAC
header(s)?

That part I don't like at all.

First of all, you have no idea what the link level protocol is for that
MAC header, therefore how could you even begin to interpret its contents
correctly?

Second of all, MAC level details do not belong in AF_INET socket interfaces.

Thank you.
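
To illustrate the objection above: a buffer that starts at the IPv4 header is self-describing, because the first byte carries the version and the header length (IHL), whereas a MAC header cannot be interpreted without knowing the link type. Below is a minimal userspace sketch, assuming the caller is handed bytes beginning at the network header. It is not what the posted patch does, and `struct ipv4_info`, `parse_ipv4()` and `example_payload_offset()` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of the fields a receiver might want. */
struct ipv4_info {
	size_t hdr_len;   /* IPv4 header length in bytes, options included */
	uint8_t tos;
	uint8_t ttl;
	uint8_t protocol; /* e.g. 17 for UDP */
};

/* Parse a buffer that begins at the IPv4 header.
 * Returns 0 on success, -1 if the bytes are not a plausible IPv4 header.
 */
static int parse_ipv4(const uint8_t *buf, size_t len, struct ipv4_info *out)
{
	size_t ihl;

	if (len < 20)
		return -1;
	if ((buf[0] >> 4) != 4)          /* version nibble */
		return -1;
	ihl = (size_t)(buf[0] & 0x0f) * 4; /* IHL counts 32-bit words */
	if (ihl < 20 || ihl > len)
		return -1;

	out->hdr_len = ihl;
	out->tos = buf[1];
	out->ttl = buf[8];
	out->protocol = buf[9];
	return 0;
}

/* Usage example on a hand-built header: IHL = 6 (24 bytes), TTL 64, UDP.
 * Returns the offset at which the transport header would start.
 */
static size_t example_payload_offset(void)
{
	const uint8_t hdr[24] = { 0x46, 0x00, 0x00, 0x20, 0, 0, 0, 0, 64, 17 };
	struct ipv4_info info;
	int rc = parse_ipv4(hdr, sizeof(hdr), &info);

	assert(rc == 0 && info.protocol == 17);
	return info.hdr_len;
}
```

No equivalent check is possible on a MAC header alone, since nothing in its bytes says which link-level protocol produced it.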


Re: [PATCH] net: implement IP_RECVHDRS option to get full headers through recvmsg cmsg.

2018-04-01 Thread Eric Dumazet


On 03/31/2018 10:43 PM, Maciej Żenczykowski wrote:
> From: Luigi Rizzo 
> 
> We have all sorts of different ways to fetch pre-UDP payload metadata:
>   IP_RECVTOS
>   IP_RECVTTL
>   IP_RECVOPTS
>   IP_RETOPTS
> 
> But nothing generic which simply allows you to receive the entire packet 
> header.
> 
> This is in similar vein to TCP_SAVE_SYN but for UDP and other datagram 
> sockets.
> 
> This is envisioned as a way to get GUE extension metadata for encapsulated
> packets, but implemented in a way to be much more future proof.
> 
> (Implemented by Luigi, who asked me to send it upstream)
> 

Hmm... what happened to IPv6 ? ;)



[PATCH] net: implement IP_RECVHDRS option to get full headers through recvmsg cmsg.

2018-03-31 Thread Maciej Żenczykowski
From: Luigi Rizzo 

We have all sorts of different ways to fetch pre-UDP payload metadata:
  IP_RECVTOS
  IP_RECVTTL
  IP_RECVOPTS
  IP_RETOPTS

But nothing generic which simply allows you to receive the entire packet header.

This is in similar vein to TCP_SAVE_SYN but for UDP and other datagram sockets.

This is envisioned as a way to get GUE extension metadata for encapsulated
packets, but implemented in a way to be much more future proof.

(Implemented by Luigi, who asked me to send it upstream)

Cc: Eric Dumazet 
Signed-off-by: Luigi Rizzo 
Signed-off-by: Maciej Żenczykowski 
---
 include/net/inet_sock.h |  1 +
 include/uapi/linux/in.h |  1 +
 net/ipv4/ip_sockglue.c  | 26 ++++++++++++++++++++++++++
 3 files changed, 28 insertions(+)

diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index 0a671c32d6b9..4299750c3bea 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -237,6 +237,7 @@ struct inet_sock {
 #define IP_CMSG_ORIGDSTADDR   BIT(6)
 #define IP_CMSG_CHECKSUM   BIT(7)
 #define IP_CMSG_RECVFRAGSIZE   BIT(8)
+#define IP_CMSG_RECVHDRS   BIT(9)
 
 /**
  * sk_to_full_sk - Access to a full socket
diff --git a/include/uapi/linux/in.h b/include/uapi/linux/in.h
index 48e8a225b985..6dae3e1023cc 100644
--- a/include/uapi/linux/in.h
+++ b/include/uapi/linux/in.h
@@ -119,6 +119,7 @@ struct in_addr {
 #define IP_CHECKSUM   23
 #define IP_BIND_ADDRESS_NO_PORT   24
 #define IP_RECVFRAGSIZE   25
+#define IP_RECVHDRS   26
 
 /* IP_MTU_DISCOVER values */
 #define IP_PMTUDISC_DONT   0   /* Never send DF frames */
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index 5ad2d8ed3a3f..35c5f70daea9 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -71,6 +71,14 @@ static void ip_cmsg_recv_tos(struct msghdr *msg, struct sk_buff *skb)
 	put_cmsg(msg, SOL_IP, IP_TOS, 1, &ip_hdr(skb)->tos);
 }
 
+/* Return all headers */
+static void ip_cmsg_recv_headers(struct msghdr *msg, struct sk_buff *skb)
+{
+	int len = skb->data - skb_mac_header(skb);
+
+	put_cmsg(msg, SOL_IP, IP_RECVHDRS, len, eth_hdr(skb));
+}
+
 static void ip_cmsg_recv_opts(struct msghdr *msg, struct sk_buff *skb)
 {
 	if (IPCB(skb)->opt.optlen == 0)
@@ -205,6 +213,14 @@ void ip_cmsg_recv_offset(struct msghdr *msg, struct sock *sk,
 			return;
 	}
 
+	if (flags & IP_CMSG_RECVHDRS) {
+		ip_cmsg_recv_headers(msg, skb);
+
+		flags &= ~IP_CMSG_RECVHDRS;
+		if (!flags)
+			return;
+	}
+
 	if (flags & IP_CMSG_RETOPTS) {
 		ip_cmsg_recv_retopts(sock_net(sk), msg, skb);
 
@@ -597,6 +613,7 @@ static int do_ip_setsockopt(struct sock *sk, int level,
 	case IP_PKTINFO:
 	case IP_RECVTTL:
 	case IP_RECVOPTS:
+	case IP_RECVHDRS:
 	case IP_RECVTOS:
 	case IP_RETOPTS:
 	case IP_TOS:
@@ -701,6 +718,12 @@ static int do_ip_setsockopt(struct sock *sk, int level,
 		else
 			inet->cmsg_flags &= ~IP_CMSG_RECVOPTS;
 		break;
+	case IP_RECVHDRS:
+		if (val)
+			inet->cmsg_flags |=  IP_CMSG_RECVHDRS;
+		else
+			inet->cmsg_flags &= ~IP_CMSG_RECVHDRS;
+		break;
 	case IP_RETOPTS:
 		if (val)
 			inet->cmsg_flags |= IP_CMSG_RETOPTS;
@@ -1362,6 +1385,9 @@ static int do_ip_getsockopt(struct sock *sk, int level, int optname,
 	case IP_RECVOPTS:
 		val = (inet->cmsg_flags & IP_CMSG_RECVOPTS) != 0;
 		break;
+	case IP_RECVHDRS:
+		val = (inet->cmsg_flags & IP_CMSG_RECVHDRS) != 0;
+		break;
 	case IP_RETOPTS:
 		val = (inet->cmsg_flags & IP_CMSG_RETOPTS) != 0;
 		break;
-- 
2.17.0.rc1.321.gba9d0f2565-goog
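
For readers unfamiliar with the receive side, the sketch below shows how userspace typically walks the ancillary data returned by recvmsg() to find one control message by level and type. It uses the existing IP_TTL message (delivered when IP_RECVTTL is enabled) rather than the proposed option, since IP_RECVHDRS exists only in this patch and is not assumed to be present in any installed header. `find_cmsg()` and `example_find_ttl()` are hypothetical helper names, and the control buffer is built by hand so the example runs without a socket.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: return the payload of the first control message
 * matching (level, type), or NULL if none; *len receives the payload size.
 */
static const unsigned char *find_cmsg(struct msghdr *msg, int level, int type,
				      size_t *len)
{
	struct cmsghdr *cm;

	for (cm = CMSG_FIRSTHDR(msg); cm; cm = CMSG_NXTHDR(msg, cm)) {
		if (cm->cmsg_len < CMSG_LEN(0))
			break; /* malformed header, stop walking */
		if (cm->cmsg_level == level && cm->cmsg_type == type) {
			*len = cm->cmsg_len - CMSG_LEN(0);
			return CMSG_DATA(cm);
		}
	}
	return NULL;
}

/* Usage example on a hand-built control buffer holding an IP_TOS message
 * followed by an IP_TTL message. Returns the TTL that find_cmsg() located.
 */
static int example_find_ttl(void)
{
	union {
		char buf[2 * CMSG_SPACE(sizeof(int))];
		struct cmsghdr align; /* forces cmsghdr alignment */
	} u;
	struct msghdr msg;
	struct cmsghdr *cm;
	const unsigned char *data;
	size_t len = 0;
	int ttl = 64;

	/* Zero the buffer first so CMSG_NXTHDR sees sane lengths. */
	memset(&u, 0, sizeof(u));
	memset(&msg, 0, sizeof(msg));
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	cm = CMSG_FIRSTHDR(&msg);
	cm->cmsg_level = IPPROTO_IP;
	cm->cmsg_type = IP_TOS;
	cm->cmsg_len = CMSG_LEN(1);
	*CMSG_DATA(cm) = 0x10;

	cm = CMSG_NXTHDR(&msg, cm);
	assert(cm != NULL);
	cm->cmsg_level = IPPROTO_IP;
	cm->cmsg_type = IP_TTL;
	cm->cmsg_len = CMSG_LEN(sizeof(ttl));
	memcpy(CMSG_DATA(cm), &ttl, sizeof(ttl));

	ttl = 0;
	data = find_cmsg(&msg, IPPROTO_IP, IP_TTL, &len);
	assert(data != NULL && len == sizeof(ttl));
	memcpy(&ttl, data, sizeof(ttl));
	return ttl;
}
```

With a real socket the control buffer would be filled by recvmsg() after the option is enabled with setsockopt(). Going by the put_cmsg() call in the patch, a consumer of the proposed option would presumably look for level SOL_IP and type IP_RECVHDRS, with the payload length equal to the number of header bytes copied.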