Re: [PATCH v3 net-next] Introduce a sysctl that modifies the value of PROT_SOCK.

2017-01-24 Thread David Miller
From: Krister Johansen 
Date: Fri, 20 Jan 2017 17:49:11 -0800

> Add net.ipv4.ip_unprivileged_port_start, which is a per namespace sysctl
> that denotes the first unprivileged inet port in the namespace.  To
> disable all privileged ports set this to zero.  It also checks for
> overlap with the local port range.  The privileged and local range may
> not overlap.
> 
> The use case for this change is to allow containerized processes to bind
> to privileged ports, but prevent them from ever being allowed to modify
> their container's network configuration.  The latter is accomplished by
> ensuring that the network namespace is not a child of the user
> namespace.  This modification was needed to allow the container manager
> to disable a namespace's privileged port restrictions without exposing
> control of the network namespace to processes in the user namespace.
> 
> Signed-off-by: Krister Johansen 

Applied, thanks.


Re: [PATCH v3 net-next] Introduce a sysctl that modifies the value of PROT_SOCK.

2017-01-23 Thread David Miller
From: Krister Johansen 
Date: Fri, 20 Jan 2017 17:49:11 -0800

> Add net.ipv4.ip_unprivileged_port_start, which is a per namespace sysctl
> that denotes the first unprivileged inet port in the namespace.  To
> disable all privileged ports set this to zero.  It also checks for
> overlap with the local port range.  The privileged and local range may
> not overlap.
> 
> The use case for this change is to allow containerized processes to bind
> to privileged ports, but prevent them from ever being allowed to modify
> their container's network configuration.  The latter is accomplished by
> ensuring that the network namespace is not a child of the user
> namespace.  This modification was needed to allow the container manager
> to disable a namespace's privileged port restrictions without exposing
> control of the network namespace to processes in the user namespace.
> 
> Signed-off-by: Krister Johansen 

I'm not ignoring this change, I just want to think about it some more.

Just FYI...


[PATCH v3 net-next] Introduce a sysctl that modifies the value of PROT_SOCK.

2017-01-20 Thread Krister Johansen
Add net.ipv4.ip_unprivileged_port_start, which is a per namespace sysctl
that denotes the first unprivileged inet port in the namespace.  To
disable all privileged ports set this to zero.  It also checks for
overlap with the local port range.  The privileged and local range may
not overlap.

The use case for this change is to allow containerized processes to bind
to privileged ports, but prevent them from ever being allowed to modify
their container's network configuration.  The latter is accomplished by
ensuring that the network namespace is not a child of the user
namespace.  This modification was needed to allow the container manager
to disable a namespace's privileged port restrictions without exposing
control of the network namespace to processes in the user namespace.

Signed-off-by: Krister Johansen 
---
 Documentation/networking/ip-sysctl.txt |  9 ++
 include/net/ip.h                       | 10 +++
 include/net/netns/ipv4.h               |  1 +
 net/ipv4/af_inet.c                     |  5 +++-
 net/ipv4/sysctl_net_ipv4.c             | 50 +-
 net/ipv6/af_inet6.c                    |  3 +-
 net/netfilter/ipvs/ip_vs_ctl.c         |  7 ++---
 net/sctp/socket.c                      | 10 ---
 security/selinux/hooks.c               |  3 +-
 9 files changed, 86 insertions(+), 12 deletions(-)

Changes v1 -> v2:

Remove LOWPORT_SYSCTL config option.  This is now always enabled as long
as CONFIG_SYSCTL is.

Changes v2 -> v3:

Add documentation to ip-sysctl.txt.
Rename "protected" variables and functions to "privileged."

diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index aa1bb49..17f2e77 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -822,6 +822,15 @@ ip_local_reserved_ports - list of comma separated ranges
 
    Default: Empty
 
+ip_unprivileged_port_start - INTEGER
+   This is a per-namespace sysctl.  It defines the first
+   unprivileged port in the network namespace.  Privileged ports
+   require root or CAP_NET_BIND_SERVICE in order to bind to them.
+   To disable all privileged ports, set this to 0.  It may not
+   overlap with the ip_local_port_range.
+
+   Default: 1024
+
 ip_nonlocal_bind - BOOLEAN
    If set, allows processes to bind() to non-local IP addresses,
    which can be quite useful - but may break some applications.
diff --git a/include/net/ip.h b/include/net/ip.h
index ab6761a..bf264a8 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -263,11 +263,21 @@ static inline bool sysctl_dev_name_is_allowed(const char *name)
    return strcmp(name, "default") != 0  && strcmp(name, "all") != 0;
 }
 
+static inline int inet_prot_sock(struct net *net)
+{
+   return net->ipv4.sysctl_ip_prot_sock;
+}
+
 #else
 static inline int inet_is_local_reserved_port(struct net *net, int port)
 {
    return 0;
 }
+
+static inline int inet_prot_sock(struct net *net)
+{
+   return PROT_SOCK;
+}
 #endif
 
 __be32 inet_current_timestamp(void);
diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index 8e3f5b6..e365732 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -135,6 +135,7 @@ struct netns_ipv4 {
 
 #ifdef CONFIG_SYSCTL
    unsigned long *sysctl_local_reserved_ports;
+   int sysctl_ip_prot_sock;
 #endif
 
 #ifdef CONFIG_IP_MROUTE
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index aae410b..28fe8da 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -479,7 +479,7 @@ int inet_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
 
    snum = ntohs(addr->sin_port);
    err = -EACCES;
-   if (snum && snum < PROT_SOCK &&
+   if (snum && snum < inet_prot_sock(net) &&
        !ns_capable(net->user_ns, CAP_NET_BIND_SERVICE))
       goto out;
 
@@ -1700,6 +1700,9 @@ static __net_init int inet_init_net(struct net *net)
    net->ipv4.sysctl_ip_default_ttl = IPDEFTTL;
    net->ipv4.sysctl_ip_dynaddr = 0;
    net->ipv4.sysctl_ip_early_demux = 1;
+#ifdef CONFIG_SYSCTL
+   net->ipv4.sysctl_ip_prot_sock = PROT_SOCK;
+#endif
 
    return 0;
 }
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index c8d2836..1b86199 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -35,6 +35,8 @@ static int ip_local_port_range_min[] = { 1, 1 };
 static int ip_local_port_range_max[] = { 65535, 65535 };
 static int tcp_adv_win_scale_min = -31;
 static int tcp_adv_win_scale_max = 31;
+static int ip_privileged_port_min;
+static int ip_privileged_port_max = 65535;
 static int ip_ttl_min = 1;
 static int ip_ttl_max = 255;
 static int tcp_syn_retries_min = 1;
@@ -79,7 +81,12 @@ static int ipv4_local_port_range(struct ctl_table *table, int write,
    ret = proc_dointvec_minmax(&tmp, write, buffer, lenp, ppos);
 
    if (write && ret == 0) {
-