Re: [PATCH 0/5] Networking cgroup controller
On Thu, Aug 25, 2016 at 11:56:27AM -0700, Mahesh Bandewar (महेश बंडेवार) wrote:
> On Thu, Aug 25, 2016 at 11:04 AM, Alexei Starovoitov wrote:
> > On Thu, Aug 25, 2016 at 08:54:19AM -0700, Mahesh Bandewar (महेश बंडेवार) wrote:
> >> On Wed, Aug 24, 2016 at 2:03 PM, Tejun Heo wrote:
> >> > Hello, Anoop.
> >> >
> >> > On Wed, Aug 10, 2016 at 05:53:13PM -0700, Anoop Naravaram wrote:
> >> >> This patchset introduces a cgroup controller for the networking
> >> >> subsystem as a whole. As of now, this controller will be used for:
> >> >>
> >> >> * Limiting the specific ports that a process in a cgroup is allowed
> >> >>   to bind to or listen on. For example, you can say that all the
> >> >>   processes in a cgroup can only bind to ports 1000-2000, and listen
> >> >>   on ports 1000-1100, which guarantees that the remaining ports will
> >> >>   be available for other processes.
> >> >>
> >> >> * Restricting which DSCP values processes can use with their
> >> >>   sockets. For example, you can say that all the processes in a
> >> >>   cgroup can only send packets with a DSCP tag between 48 and 63
> >> >>   (corresponding to TOS values of 192 to 255).
> >> >>
> >> >> * Limiting the total number of udp ports that can be used by a
> >> >>   process in a cgroup. For example, you can say that all the
> >> >>   processes in one cgroup are allowed to use a total of up to 100
> >> >>   udp ports. Since the total number of udp ports that can be used by
> >> >>   all processes is limited, this is useful for rationing out the
> >> >>   ports to different process groups.
> >> >>
> >> >> In the future, more networking-related properties may be added to
> >> >> this controller.
> >> >
> >> > Thanks for working on this; however, I share the sentiment expressed
> >> > by others that this looks like too piecemeal an approach. If there
> >> > are no alternatives, we surely should consider this but it at least
> >> > *looks* like bpf should be able to cover the same functionalities
> >> > without having to revise and extend in-kernel capabilities constantly.
> >> >
> >> My primary concern is the cost that need to be paid to get this
> >> functionality.
> >> (a) The suggested alternatives eBPF either can't solve the problem in
> >> the current form or need substantial work to get it done. e.g.
> >> udp-port-limit since there is no notion of "maintaining
> >> counters-per-group-of-processes". This is solved by the cgroup infra.
> >
> > what is specifically missing?
> > there are several ways to do counters in bpf and as soon as bpf program
> > is attachable to a cgroup, all of these counter features come for free.
> > Counting bytes or packets or port bind failures or anything else per cgroup
> > with bpf is trivial. No extra code is needed.
> >
> Alexei, I was referring to the association of eBPF to the cgroup. Lack
> of it makes anyone wants to use it invest into additional
> administrative infra that you are currently getting with cgroup-infra.

Please look at Daniel's patches. They have been circulating in different
forms for quite some time now. Your bind port filter use case can be
easily added on top. Then the end result is additional ten lines of code
instead of hundreds.

Another alternative is to go cgroup+lsm+bpf route that Sargun and Mickael
are proposing. I think it will also work for your use case.

The goal we all should have is to have common infra that solves the
largest number of use cases.

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/5] Networking cgroup controller
On Thu, Aug 25, 2016 at 9:09 AM, Tejun Heo wrote:
> Hello, Mahesh.
>
> On Thu, Aug 25, 2016 at 08:54:19AM -0700, Mahesh Bandewar (महेश बंडेवार) wrote:
>> In short most of the associated problems are handled by the
>> cgroup-infra / APIs while all that need separate solution in
>> alternatives. Tejun, feels like I'm advocating cgroup approach to you
>> ;)
>
> My concern here is that the proposed fixed mechanism isn't gonna be
> enough. Port range matching wouldn't scale, so we'd need some hashmap
> style thing which may be too expensive for simple matches so either we
> do something adaptive or have different interfaces for the two and so
> on. IOW, I think this approach is likely to replicate what iptables
> have been doing with its extensions. I don't doubt that it is one of
> the workable approaches but hardly an attractive one especially at
> this point.
>
> ebpf approach does have its shortcomings for sure but mending them
> seems a lot more manageable and future-proof than going with fixed but
> constantly expanding set of operations. e.g. We can add per-cgroup
> bpf programs which are called only on socket creation or other major
> events, or just let bpf programs which get called on bind(2), and add
> some per-cgroup state variables which are maintained by cgroup code
> which can be used from these bpf programs.
>
Well, I haven't seen any of these yet (please point me to the right place
if I missed them), especially the hooks that allow users to add per-cgroup
bpf programs that can be used in the control path (I think Daniel's recent
patches allow this in the data path).

> Thanks.
>
> --
> tejun
Re: [PATCH 0/5] Networking cgroup controller
On Thu, Aug 25, 2016 at 08:54:19AM -0700, Mahesh Bandewar (महेश बंडेवार) wrote:
> On Wed, Aug 24, 2016 at 2:03 PM, Tejun Heo wrote:
> > Hello, Anoop.
> >
> > On Wed, Aug 10, 2016 at 05:53:13PM -0700, Anoop Naravaram wrote:
> >> This patchset introduces a cgroup controller for the networking
> >> subsystem as a whole. As of now, this controller will be used for:
> >>
> >> * Limiting the specific ports that a process in a cgroup is allowed to
> >>   bind to or listen on. For example, you can say that all the processes
> >>   in a cgroup can only bind to ports 1000-2000, and listen on ports
> >>   1000-1100, which guarantees that the remaining ports will be
> >>   available for other processes.
> >>
> >> * Restricting which DSCP values processes can use with their sockets.
> >>   For example, you can say that all the processes in a cgroup can only
> >>   send packets with a DSCP tag between 48 and 63 (corresponding to TOS
> >>   values of 192 to 255).
> >>
> >> * Limiting the total number of udp ports that can be used by a process
> >>   in a cgroup. For example, you can say that all the processes in one
> >>   cgroup are allowed to use a total of up to 100 udp ports. Since the
> >>   total number of udp ports that can be used by all processes is
> >>   limited, this is useful for rationing out the ports to different
> >>   process groups.
> >>
> >> In the future, more networking-related properties may be added to this
> >> controller.
> >
> > Thanks for working on this; however, I share the sentiment expressed
> > by others that this looks like too piecemeal an approach. If there
> > are no alternatives, we surely should consider this but it at least
> > *looks* like bpf should be able to cover the same functionalities
> > without having to revise and extend in-kernel capabilities constantly.
> >
> My primary concern is the cost that need to be paid to get this
> functionality.
> (a) The suggested alternatives eBPF either can't solve the problem in
> the current form or need substantial work to get it done. e.g.
> udp-port-limit since there is no notion of "maintaining
> counters-per-group-of-processes". This is solved by the cgroup infra.

what is specifically missing?
there are several ways to do counters in bpf and as soon as bpf program
is attachable to a cgroup, all of these counter features come for free.
Counting bytes or packets or port bind failures or anything else per cgroup
with bpf is trivial. No extra code is needed.

> (b) Also the hooks implemented are mostly with a per packet cost vs.
> once when you are establishing the channel. Also not sure if the LSM
> approach will allow some privileged user to over-ride the filters
> attached and thus override the limits imposed. This is on top of the
> administrative costs that currently don't have solution for and you
> get it for free with cgroup infra.
>
> In short most of the associated problems are handled by the
> cgroup-infra / APIs while all that need separate solution in
> alternatives. Tejun, feels like I'm advocating cgroup approach to you
> ;)
>
> Thanks,
> --mahesh..
>
> > Thanks.
> >
> > --
> > tejun
Re: [PATCH 0/5] Networking cgroup controller
On Thu, Aug 25, 2016 at 12:09:20PM -0400, Tejun Heo wrote:
> ebpf approach does have its shortcomings for sure but mending them
> seems a lot more manageable and future-proof than going with fixed but
> constantly expanding set of operations. e.g. We can add per-cgroup
> bpf programs which are called only on socket creation or other major
> events, or just let bpf programs which get called on bind(2), and add
^^ please ignore this part. half-assed edit.

--
tejun
Re: [PATCH 0/5] Networking cgroup controller
Hello, Mahesh.

On Thu, Aug 25, 2016 at 08:54:19AM -0700, Mahesh Bandewar (महेश बंडेवार) wrote:
> In short most of the associated problems are handled by the
> cgroup-infra / APIs while all that need separate solution in
> alternatives. Tejun, feels like I'm advocating cgroup approach to you
> ;)

My concern here is that the proposed fixed mechanism isn't gonna be
enough. Port range matching wouldn't scale, so we'd need some hashmap
style thing which may be too expensive for simple matches so either we
do something adaptive or have different interfaces for the two and so
on. IOW, I think this approach is likely to replicate what iptables
have been doing with its extensions. I don't doubt that it is one of
the workable approaches but hardly an attractive one especially at
this point.

ebpf approach does have its shortcomings for sure but mending them
seems a lot more manageable and future-proof than going with fixed but
constantly expanding set of operations. e.g. We can add per-cgroup
bpf programs which are called only on socket creation or other major
events, or just let bpf programs which get called on bind(2), and add
some per-cgroup state variables which are maintained by cgroup code
which can be used from these bpf programs.

Thanks.

--
tejun
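
The scaling trade-off raised in the message above can be made concrete with a small user-space model. This is only a sketch under stated assumptions: it is not code from the patchset, from the kernel, or from any BPF proposal, and every name in it (allowed_linear, merge_ranges, SortedRanges, as_port_set, RANGES) is hypothetical. It compares a linear scan over ranges, a binary search over merged ranges, and a hash-style set with one entry per port.

```python
from bisect import bisect_right


def allowed_linear(port, ranges):
    """O(n) scan: cheap for a handful of ranges, slow for many."""
    return any(low <= port <= high for low, high in ranges)


def merge_ranges(ranges):
    """Sort inclusive ranges and merge any that overlap or touch."""
    merged = []
    for low, high in sorted(ranges):
        if merged and low <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], high))
        else:
            merged.append((low, high))
    return merged


class SortedRanges:
    """O(log n) lookup by binary search over the merged range starts."""

    def __init__(self, ranges):
        merged = merge_ranges(ranges)
        self._starts = [low for low, _ in merged]
        self._ends = [high for _, high in merged]

    def __contains__(self, port):
        # Find the last range whose start is <= port, then check its end.
        i = bisect_right(self._starts, port) - 1
        return i >= 0 and port <= self._ends[i]


def as_port_set(ranges):
    """O(1) average lookup, at the cost of one entry per allowed port."""
    return frozenset(p for low, high in ranges for p in range(low, high + 1))


# Made-up example policy; the numbers are not taken from the thread.
RANGES = [(1000, 2000), (1500, 2500), (3000, 3010)]
sorted_ranges = SortedRanges(RANGES)
port_set = as_port_set(RANGES)
```

All three structures answer the same membership question; they differ in lookup cost and memory, which is the "adaptive or different interfaces" choice a fixed in-kernel mechanism would have to make.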
Re: [PATCH 0/5] Networking cgroup controller
On Wed, Aug 24, 2016 at 2:03 PM, Tejun Heo wrote:
> Hello, Anoop.
>
> On Wed, Aug 10, 2016 at 05:53:13PM -0700, Anoop Naravaram wrote:
>> This patchset introduces a cgroup controller for the networking
>> subsystem as a whole. As of now, this controller will be used for:
>>
>> * Limiting the specific ports that a process in a cgroup is allowed to
>>   bind to or listen on. For example, you can say that all the processes
>>   in a cgroup can only bind to ports 1000-2000, and listen on ports
>>   1000-1100, which guarantees that the remaining ports will be available
>>   for other processes.
>>
>> * Restricting which DSCP values processes can use with their sockets.
>>   For example, you can say that all the processes in a cgroup can only
>>   send packets with a DSCP tag between 48 and 63 (corresponding to TOS
>>   values of 192 to 255).
>>
>> * Limiting the total number of udp ports that can be used by a process
>>   in a cgroup. For example, you can say that all the processes in one
>>   cgroup are allowed to use a total of up to 100 udp ports. Since the
>>   total number of udp ports that can be used by all processes is
>>   limited, this is useful for rationing out the ports to different
>>   process groups.
>>
>> In the future, more networking-related properties may be added to this
>> controller.
>
> Thanks for working on this; however, I share the sentiment expressed
> by others that this looks like too piecemeal an approach. If there
> are no alternatives, we surely should consider this but it at least
> *looks* like bpf should be able to cover the same functionalities
> without having to revise and extend in-kernel capabilities constantly.
>
My primary concern is the cost that need to be paid to get this
functionality.
(a) The suggested alternatives eBPF either can't solve the problem in
the current form or need substantial work to get it done. e.g.
udp-port-limit since there is no notion of "maintaining
counters-per-group-of-processes". This is solved by the cgroup infra.
(b) Also the hooks implemented are mostly with a per packet cost vs.
once when you are establishing the channel. Also not sure if the LSM
approach will allow some privileged user to over-ride the filters
attached and thus override the limits imposed. This is on top of the
administrative costs that currently don't have solution for and you
get it for free with cgroup infra.

In short most of the associated problems are handled by the
cgroup-infra / APIs while all that need separate solution in
alternatives. Tejun, feels like I'm advocating cgroup approach to you
;)

Thanks,
--mahesh..

> Thanks.
>
> --
> tejun
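
To make "maintaining counters-per-group-of-processes" in point (a) concrete, here is a minimal user-space sketch of a per-group UDP port budget. It only models the accounting semantics described in the cover letter (a shared limit charged once per port, when the port is taken, rather than per packet). The class and method names (UdpPortBudget, try_charge, uncharge) are hypothetical and do not come from the patchset or from any kernel interface.

```python
class UdpPortBudget:
    """Counts UDP ports in use by one group of processes against a limit."""

    def __init__(self, limit):
        if limit < 0:
            raise ValueError("limit must be non-negative")
        self.limit = limit
        self.in_use = 0

    def try_charge(self):
        """Charge one port to the group; refuse if the limit is reached."""
        if self.in_use >= self.limit:
            return False
        self.in_use += 1
        return True

    def uncharge(self):
        """Return one port to the group's budget when a socket goes away."""
        if self.in_use > 0:
            self.in_use -= 1


# Hypothetical group limited to two UDP ports.
budget = UdpPortBudget(limit=2)
results = [budget.try_charge() for _ in range(3)]  # third charge is refused
budget.uncharge()                                  # one port is released
retry = budget.try_charge()                        # so a new charge succeeds
```

The charge happens once per port acquisition, which is the "once when you are establishing the channel" cost that point (b) contrasts with per-packet hooks.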
Re: [PATCH 0/5] Networking cgroup controller
On Tue, Aug 23, 2016 at 1:49 AM, Parav Pandit wrote:
> Hi Anoop,
>
> Regardless of usecase, I think this functionality is best handled as
> LSM functionality instead of cgroup.
>
I'm not so sure about that. Cgroup APIs are useful and this is just an
extension to it.

> Tasks which are proposed in this patch are related to access control checks.
> LSM already has required hooks for socket operations such as bind(),
> listen() as few small examples.
>
> Refer to security_socket_listen() which invokes LSM specific hooks.
> This is invoked in source/net/socket.c as part of listen() system call.
> LSM hook callback can check whether a given process can listen to
> requested UDP port or not.
>
This has administrative overhead that is not addressed. The underlying
cgroup infrastructure takes care of it in this (current) implementation.

> Parav
>
> [...]
Re: [PATCH 0/5] Networking cgroup controller
Hi Anoop,

Regardless of usecase, I think this functionality is best handled as
LSM functionality instead of cgroup.

Tasks which are proposed in this patch are related to access control checks.
LSM already has required hooks for socket operations such as bind(),
listen() as few small examples.

Refer to security_socket_listen() which invokes LSM specific hooks.
This is invoked in source/net/socket.c as part of listen() system call.
LSM hook callback can check whether a given process can listen to
requested UDP port or not.

Parav

On Thu, Aug 11, 2016 at 6:23 AM, Anoop Naravaram wrote:
> This patchset introduces a cgroup controller for the networking
> subsystem as a whole. As of now, this controller will be used for:
>
> * Limiting the specific ports that a process in a cgroup is allowed to
>   bind to or listen on. For example, you can say that all the processes
>   in a cgroup can only bind to ports 1000-2000, and listen on ports
>   1000-1100, which guarantees that the remaining ports will be available
>   for other processes.
>
> * Restricting which DSCP values processes can use with their sockets.
>   For example, you can say that all the processes in a cgroup can only
>   send packets with a DSCP tag between 48 and 63 (corresponding to TOS
>   values of 192 to 255).
>
> * Limiting the total number of udp ports that can be used by a process
>   in a cgroup. For example, you can say that all the processes in one
>   cgroup are allowed to use a total of up to 100 udp ports. Since the
>   total number of udp ports that can be used by all processes is limited,
>   this is useful for rationing out the ports to different process groups.
>
> In the future, more networking-related properties may be added to this
> controller.
>
> Anoop Naravaram (5):
>   net: create the networking cgroup controller
>   net: add bind/listen ranges to net cgroup
>   net: add udp limit to net cgroup
>   net: add dscp ranges to net cgroup
>   net: add test for net cgroup
>
>  Documentation/cgroup-v1/net.txt   |  95 +
>  include/linux/cgroup_subsys.h     |   4 +
>  include/net/net_cgroup.h          | 103 ++
>  net/Kconfig                       |  10 +
>  net/core/Makefile                 |   1 +
>  net/core/net_cgroup.c             | 706 ++
>  net/ipv4/af_inet.c                |   8 +
>  net/ipv4/inet_connection_sock.c   |   7 +
>  net/ipv4/ip_sockglue.c            |  13 +
>  net/ipv4/udp.c                    |   8 +
>  net/ipv6/af_inet6.c               |   7 +
>  net/ipv6/datagram.c               |   9 +
>  net/ipv6/ipv6_sockglue.c          |   8 +
>  scripts/cgroup/net_cgroup_test.py | 359 +++
>  14 files changed, 1338 insertions(+)
>  create mode 100644 Documentation/cgroup-v1/net.txt
>  create mode 100644 include/net/net_cgroup.h
>  create mode 100644 net/core/net_cgroup.c
>  create mode 100755 scripts/cgroup/net_cgroup_test.py
>
> --
> 2.8.0.rc3.226.g39d4020
Re: [PATCH 0/5] Networking cgroup controller
Hi Anoop,

On Thu, Aug 11, 2016 at 6:23 AM, Anoop Naravaram wrote:
> This patchset introduces a cgroup controller for the networking
> subsystem as a whole. As of now, this controller will be used for:
>
> * Limiting the specific ports that a process in a cgroup is allowed to
>   bind to or listen on. For example, you can say that all the processes
>   in a cgroup can only bind to ports 1000-2000, and listen on ports
>   1000-1100, which guarantees that the remaining ports will be available
>   for other processes.
>
> * Restricting which DSCP values processes can use with their sockets.
>   For example, you can say that all the processes in a cgroup can only
>   send packets with a DSCP tag between 48 and 63 (corresponding to TOS
>   values of 192 to 255).
>
> * Limiting the total number of udp ports that can be used by a process
>   in a cgroup. For example, you can say that all the processes in one
>   cgroup are allowed to use a total of up to 100 udp ports. Since the
>   total number of udp ports that can be used by all processes is limited,
>   this is useful for rationing out the ports to different process groups.
>
> In the future, more networking-related properties may be added to this
> controller.
>
Network namespaces already allow the processes in each namespace to listen
on the same port range within their own namespace. What is the rationale
or use case for limiting certain processes to a certain port range?
[PATCH 0/5] Networking cgroup controller
This patchset introduces a cgroup controller for the networking subsystem
as a whole. As of now, this controller will be used for:

* Limiting the specific ports that a process in a cgroup is allowed to bind
  to or listen on. For example, you can say that all the processes in a
  cgroup can only bind to ports 1000-2000, and listen on ports 1000-1100,
  which guarantees that the remaining ports will be available for other
  processes.

* Restricting which DSCP values processes can use with their sockets. For
  example, you can say that all the processes in a cgroup can only send
  packets with a DSCP tag between 48 and 63 (corresponding to TOS values of
  192 to 255).

* Limiting the total number of udp ports that can be used by a process in a
  cgroup. For example, you can say that all the processes in one cgroup are
  allowed to use a total of up to 100 udp ports. Since the total number of
  udp ports that can be used by all processes is limited, this is useful for
  rationing out the ports to different process groups.

In the future, more networking-related properties may be added to this
controller.

Anoop Naravaram (5):
  net: create the networking cgroup controller
  net: add bind/listen ranges to net cgroup
  net: add udp limit to net cgroup
  net: add dscp ranges to net cgroup
  net: add test for net cgroup

 Documentation/cgroup-v1/net.txt   |  95 +
 include/linux/cgroup_subsys.h     |   4 +
 include/net/net_cgroup.h          | 103 ++
 net/Kconfig                       |  10 +
 net/core/Makefile                 |   1 +
 net/core/net_cgroup.c             | 706 ++
 net/ipv4/af_inet.c                |   8 +
 net/ipv4/inet_connection_sock.c   |   7 +
 net/ipv4/ip_sockglue.c            |  13 +
 net/ipv4/udp.c                    |   8 +
 net/ipv6/af_inet6.c               |   7 +
 net/ipv6/datagram.c               |   9 +
 net/ipv6/ipv6_sockglue.c          |   8 +
 scripts/cgroup/net_cgroup_test.py | 359 +++
 14 files changed, 1338 insertions(+)
 create mode 100644 Documentation/cgroup-v1/net.txt
 create mode 100644 include/net/net_cgroup.h
 create mode 100644 net/core/net_cgroup.c
 create mode 100755 scripts/cgroup/net_cgroup_test.py

--
2.8.0.rc3.226.g39d4020
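
The range examples in the cover letter can be checked with a short user-space sketch. It assumes the standard layout of the IP TOS byte, in which DSCP occupies the upper six bits and the lower two bits carry ECN, so DSCP 48-63 spans TOS 192-255. The helper names and the policy constants (in_ranges, dscp_from_tos, tos_span_for_dscp, BIND_RANGES, LISTEN_RANGES, DSCP_RANGES) are hypothetical and are not taken from net_cgroup.c or its interface files.

```python
ECN_BITS = 2  # the low two bits of the TOS byte carry ECN, not DSCP


def in_ranges(value, ranges):
    """Return True if value falls inside any inclusive (low, high) range."""
    return any(low <= value <= high for low, high in ranges)


def dscp_from_tos(tos):
    """Extract the 6-bit DSCP code point from an 8-bit TOS value."""
    if not 0 <= tos <= 255:
        raise ValueError("TOS must fit in one byte")
    return tos >> ECN_BITS


def tos_span_for_dscp(low, high):
    """Return the inclusive TOS span covered by DSCP values low..high."""
    return (low << ECN_BITS, (high << ECN_BITS) | 0b11)


# Hypothetical policy for one cgroup, using the numbers from the cover letter.
BIND_RANGES = [(1000, 2000)]
LISTEN_RANGES = [(1000, 1100)]
DSCP_RANGES = [(48, 63)]

print(tos_span_for_dscp(48, 63))                      # DSCP 48-63 as a TOS span
print(in_ranges(1500, BIND_RANGES),                   # bind to 1500 allowed?
      in_ranges(1500, LISTEN_RANGES))                 # listen on 1500 allowed?
print(in_ranges(dscp_from_tos(191), DSCP_RANGES))     # TOS 191 is DSCP 47
```

With this policy, port 1500 may be bound but not listened on, and TOS 191 falls just below the permitted DSCP range.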