Re: Notifying on empty cgroup
15.01.2014, 15:59, "Michal Hocko" <mho...@suse.cz>:
> [CCing cgroups mailing list]
>
> On Wed 15-01-14 06:12:45, Victor Porton wrote:
>> I want to write software which needs to receive a signal when the cgroup
>> created by it becomes empty. (After this the empty cgroup should be deleted
>> just not to clutter the memory.)
>>
>> If the kernel does not support such notifications, it should be improved.
>> This functionality is crucial for some kinds of software.
>>
>> There is /sys/fs/cgroup/systemd/release_agent but I don't understand how to
>> use it. I don't understand why we would need it at all.
>
> "1.4 What does notify_on_release do ?" in
> Documentation/cgroups/cgroups.txt the kernel source doesn't help?

I've read it. I understand what it does. I don't understand how to use it in practice, nor why it is done this way.

>> Starting a binary on emptying a cgroup with the purpose to notify an other
>> binary looks like a big overkill.
>
> the binary can do rmdir which is what you want, no?

I suppose a base package should do that, not my specific software. Do I understand correctly?

>> Also my program should work in userspace without the need to use
>> release_agent which can be accessed only by root.
>
> The release_agent is global for all groups so the program doesn't have
> to care.

Again: what should MY program do?

>> Note that my work is related with sandboxing software (running a program in
>> closed environment, so that it would be unable for example to remove user's
>> files).
>>
>> See also
>> http://portonsoft.wordpress.com/2014/01/11/toward-robust-linux-sandbox/
>
> --
> Michal Hocko
> SUSE Labs

--
Victor Porton - http://portonvictor.org
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
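For readers with the same "how do I use it in practice" question, here is a minimal sketch of the two pieces that section 1.4 of cgroups.txt describes for a cgroup v1 hierarchy: a cgroup opts in by writing 1 to its `notify_on_release` file, and the kernel then runs the program named in the hierarchy's `release_agent` file, passing the path of the abandoned cgroup relative to the mount point as its argument. The function names and the `hierarchy_root` parameter are made up for illustration, and this is not presented as the approach anyone in the thread actually used.

```python
import os


def enable_notify_on_release(cgroup_dir):
    """Opt this cgroup in: the kernel runs the release agent once it empties."""
    with open(os.path.join(cgroup_dir, "notify_on_release"), "w") as f:
        f.write("1\n")


def release_agent(argv, hierarchy_root):
    """Body of a hypothetical release agent program.

    argv[1] is the cgroup path relative to the hierarchy mount point,
    as supplied by the kernel. hierarchy_root is an assumption: the
    agent has to know where the hierarchy is mounted.
    """
    relative = argv[1].lstrip("/")
    target = os.path.join(hierarchy_root, relative)
    os.rmdir(target)  # fails if the cgroup has been repopulated meanwhile
    return target
```

A real agent would be a small executable calling something like `release_agent(sys.argv, "/sys/fs/cgroup/systemd")`. Setting `release_agent` itself requires write access to the hierarchy root, which is the root-only step the message objects to.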
Notifying on empty cgroup
I want to write software which needs to receive a signal when the cgroup it created becomes empty. (The empty cgroup should then be deleted so that it does not clutter memory.)

If the kernel does not support such notifications, it should be improved. This functionality is crucial for some kinds of software.

There is /sys/fs/cgroup/systemd/release_agent, but I don't understand how to use it. I don't understand why we would need it at all.

Starting a binary when a cgroup empties, just to notify another binary, looks like big overkill.

Also, my program should work in userspace without needing release_agent, which can be accessed only by root.

Note that my work is related to sandboxing software (running a program in a closed environment, so that it would be unable, for example, to remove the user's files).

See also http://portonsoft.wordpress.com/2014/01/11/toward-robust-linux-sandbox/

--
Victor Porton - http://portonvictor.org
Implementing sandbox in Linux
http://portonsoft.wordpress.com/2014/01/11/toward-robust-linux-sandbox/ considers some issues of implementing sandboxing in Linux.

I am unsure whether Linux supports waiting until a cgroup becomes empty (which is needed for sandboxing software). If it does not, please make a patch.

Please post comments to the above blog post.

If you answer this message, please CC: me, as I am not subscribed to this mailing list.

--
Victor Porton - http://portonvictor.org
Fwd: Waiting for programs to stop
As a reminder, we are discussing sandboxing of untrusted programs.

My application needs to receive a signal when ALL direct and indirect children of a process started in a sandbox (including the process itself) have exited. It should work even when they call setsid().

You can assume that the sandboxing binary creates a new cgroup.

Can this be done with the current kernel?

--
Victor Porton - http://portonvictor.org
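One fallback that needs no kernel notification at all is to poll the cgroup's `cgroup.procs` file, which lists the PIDs currently in the group regardless of sessions or process groups. The sketch below assumes the sandboxing binary knows the directory of the cgroup it created; the function names, the poll interval and the timeout are illustrative choices, and polling is a substitute for the signal asked for above, not an implementation of it.

```python
import os
import time


def cgroup_pids(cgroup_dir):
    """Return the PIDs currently listed in the cgroup's cgroup.procs file."""
    with open(os.path.join(cgroup_dir, "cgroup.procs")) as f:
        return [int(line) for line in f if line.strip()]


def wait_until_empty(cgroup_dir, poll_interval=0.1, timeout=None):
    """Poll until the cgroup has no processes.

    Returns True once it is empty, or False if the timeout expires first.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while cgroup_pids(cgroup_dir):
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
    return True
```

Once `wait_until_empty` returns True, the caller can `os.rmdir` the cgroup directory itself, provided it has write permission on the parent cgroup.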
Re: Create new NetFilter table
10.01.2014, 21:39, "Joshua Brindle" <brin...@quarksecurity.com>:
> Victor Porton wrote:
>> I propose to create a new NetFilter table dedicated to rules created
>> programmatically (not by explicit admin's iptables command).
>>
>> Otherwise an admin could be tempted to say `iptables -F security` which
>> would probably break rules created for example by sandboxing software (which
>> may follow same-origin policy to restrict one particular program to certain
>> domain and port only). Note that in this case `iptables -F security` is a
>> security risk (sandbox breaking)?
>>
>> New table could be possibly be called:
>>
>> - temp
>> - temporary
>> - auto
>> - automatic
>> - volatile
>> - daemon
>> - system
>> - sys
>>
>> In iptables docs it should be said that this table should not be
>> manipulated manually.
>
> Is it possible that the solution to your sandboxing problem is seccomp
> filter?
>
> http://outflux.net/teach-seccomp/
>
> You'd filter out any syscall that can make outbound connections and then
> only pass already opened sockets to the sandboxed threads?
>
> seccomp filter was actually created for sandboxing, so that user
> applications could voluntarily shed the ability to call certain syscalls
> before handling untrusted data.

seccomp would not work for me, because I need network-enabled sandboxes. Moreover, we should be able to filter out certain subnets such as 127.0.0.0/255.0.0.0 (and others). This can't be done cleanly with seccomp.

--
Victor Porton - http://portonvictor.org
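To make the subnet requirement concrete: the decision depends on the destination address itself, whereas a seccomp filter only sees syscall argument values, and for connect() the address is behind a pointer the filter cannot follow. The sketch below expresses the policy in userspace with Python's standard `ipaddress` module. The `BLOCKED` list and the function name are illustrative, and in a real sandbox the enforcement would sit in the packet filter rather than in code like this.

```python
import ipaddress

# Subnets the sandboxed program must not reach. 127.0.0.0/255.0.0.0 is the
# example from the message; the list is meant to be extended.
BLOCKED = [ipaddress.ip_network("127.0.0.0/255.0.0.0")]


def destination_allowed(address, blocked=BLOCKED):
    """Return True if the destination address lies outside every blocked subnet."""
    ip = ipaddress.ip_address(address)
    return not any(ip in net for net in blocked)
```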
Create new NetFilter table
I propose to create a new NetFilter table dedicated to rules created programmatically (not by an admin's explicit iptables command).

Otherwise an admin could be tempted to say `iptables -F security`, which would probably break rules created, for example, by sandboxing software (which may follow a same-origin policy to restrict one particular program to a certain domain and port only). Note that in this case `iptables -F security` is a security risk (it breaks the sandbox).

The new table could possibly be called:

- temp
- temporary
- auto
- automatic
- volatile
- daemon
- system
- sys

The iptables docs should say that this table should not be manipulated manually.

--
Victor Porton - http://portonvictor.org
Re: A feature suggestion for sandboxing processes
I was told that it can be done using cgroups. So there is no urgent need to add my new syscall.

10.01.2014, 01:55, "Victor Porton" <por...@narod.ru>:
> In Fedora there is bin/sandbox command which runs a specified command in so
> called 'sandbox'. Program running in sandbox cannot open new files (it is
> commonly used with preopen stdin and stdout) and possibly its access to
> network is limited. It is intended to run potentially malicious software
> safely.
>
> This Fedora sandbox is not perfect however.
>
> One problem is:
>
> Suppose the sandboxed program spawned some child processes and exited itself.
>
> Suppose we want to kill the sandboxed program after 30 second, if it has not
> exited voluntarily.
>
> The trouble is that the software cannot figure out which processes have
> appeared from the sandboxed binary. So we are unable to kill these processes
> automatically. This means that a hacker can in this way create thousands (or
> more) processes which would overload the system.
>
> Also note that the sandboxed program may run setsid() and thus its identity
> may be lost completely.
>
> I propose to add parameter sandbox_id to each process in the kernel. It would
> be 0 for normal processes and allocated like PID or GID for processes we
> create in sandbox. Children inherit sandbox_id. There should be an API call
> using which a process makes it sandboxed_id non-zero (which returns EPERM if
> it is already non-zero).
>
> Then there should be API to enumerate all processes with given sandbox_id, so
> that we would be able to kill them (-TERM or -KILL). Or maybe we should also
> have the function which sends the given signal to all processes with given
> sandbox_id (otherwise we would war with a hacker which could possibly create
> new children faster than we kill them).
>
> Please add me in CC: (I am not subscribed for this mailing list.)

--
Victor Porton - http://portonvictor.org
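As a rough illustration of how a cgroup can stand in for the proposed sandbox_id: children stay in their parent's cgroup across fork and setsid(), so the group's `cgroup.procs` file enumerates exactly the processes to kill. The sketch below re-reads the file and signals until the group is empty. The names, the `send` hook (there so the loop can be exercised without killing anything) and the round limit are assumptions for illustration. The loop remains racy against a program that forks faster than it is killed, which is the concern raised in the quoted proposal; on cgroup v1 the freezer controller's `freezer.state` file is one way to stop the group first.

```python
import os
import signal
import time


def read_pids(cgroup_dir):
    """Return the PIDs currently listed in the cgroup's cgroup.procs file."""
    with open(os.path.join(cgroup_dir, "cgroup.procs")) as f:
        return [int(line) for line in f if line.strip()]


def kill_cgroup(cgroup_dir, sig=signal.SIGKILL, send=os.kill,
                max_rounds=100, pause=0.01):
    """Signal every process in the cgroup, repeating until none are left."""
    for _ in range(max_rounds):
        pids = read_pids(cgroup_dir)
        if not pids:
            return True
        for pid in pids:
            try:
                send(pid, sig)
            except ProcessLookupError:
                pass  # the process exited between the read and the kill
        time.sleep(pause)  # give the kernel time to remove dead processes
    raise RuntimeError("cgroup not empty after %d rounds" % max_rounds)
```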
Re: [RFC] subreaper mode 2 (Re: A feature suggestion for sandboxing processes)
I don't quite understand your subreaper mode 2, but to me it looks like it would break compatibility. Sandboxed applications ideally should not have to be written in any special way: any application which does not open new files (or do similar things) should work in the sandbox just as if there were no sandbox.

10.01.2014, 04:55, "Andy Lutomirski" <l...@amacapital.net>:
> On 01/09/2014 03:55 PM, Victor Porton wrote:
>> In Fedora there is bin/sandbox command which runs a specified command in so
>> called 'sandbox'. Program running in sandbox cannot open new files (it is
>> commonly used with preopen stdin and stdout) and possibly its access to
>> network is limited. It is intended to run potentially malicious software
>> safely.
>>
>> This Fedora sandbox is not perfect however.
>>
>> One problem is:
>>
>> Suppose the sandboxed program spawned some child processes and exited
>> itself.
>>
>> Suppose we want to kill the sandboxed program after 30 second, if it has
>> not exited voluntarily.
>>
>> The trouble is that the software cannot figure out which processes have
>> appeared from the sandboxed binary. So we are unable to kill these processes
>> automatically. This means that a hacker can in this way create thousands (or
>> more) processes which would overload the system.
>>
>> Also note that the sandboxed program may run setsid() and thus its identity
>> may be lost completely.
>>
>> I propose to add parameter sandbox_id to each process in the kernel. It
>> would be 0 for normal processes and allocated like PID or GID for processes
>> we create in sandbox. Children inherit sandbox_id. There should be an API
>> call using which a process makes it sandboxed_id non-zero (which returns
>> EPERM if it is already non-zero).
>>
>> Then there should be API to enumerate all processes with given sandbox_id,
>> so that we would be able to kill them (-TERM or -KILL). Or maybe we should
>> also have the function which sends the given signal to all processes with
>> given sandbox_id (otherwise we would war with a hacker which could possibly
>> create new children faster than we kill them).
>
> I think you need to think bigger :)
>
> I've occasionally pondered how to do real tracking of process trees
> (sandbox could use it, but I was thinking of systemd and other service
> managers). cgroups* suck for this purpose.
>
> One approach would be to have another subreaper mode (subreaper mode 2)
> that does three things:
> - Subreaper mode 2 zombies do not send SIGCHLD and cannot be reaped
>   until they have no descendents left.
> - Direct zombie children of subreaper mode 2 zombies are automatically
>   reaped.
> - Descendents that need to be reparented are reparented to the
>   subreaper, just like in subreaper mode 1.
>
> Then you'd add an API that takes the PID of a mode 2 subreaper and kills
> its entire process subtree. (Optionally, tgkill could do that
> automatically.)
>
> To use this for sandbox, sandbox would set subreaper mode 2 and then
> fork. The initial sandbox process would exit and the child would exec
> into the sandbox. The parent would stick around as a zombie until the
> whole tree went away.
>
> To use this for an init-like program, the service manager would
> fork/clone a dummy PID, set subreaper mode 2, fork again, and exec the
> service. That dummy PID would serve as a persistent reference to the
> subtree.
>
> For added fun, there should be a way to efficiently find the mode 2
> subreaper that owns a given pid/tid. That way systemd / journald could
> map PIDs to service names without mucking with cgroups.
>
> An alternative formulation of more or less the same thing would be a
> syscall manage_pid_subtree(pid_t pid) that does, roughly:
>
>     if (pid->real_parent != current) return -EINVAL;
>     set subreaper mode;
>     exit current mm, signal set, etc to conserve resources;
>     /* at this point, current is essentially a kernel thread. */
>     wait for pid to exit;
>     exit, copying pid's return code and other exit siginfo state;
>
> To manage a subreaper, you double-fork, and then the middle process
> would call manage_pid_subtree on its child.
>
> Thoughts?
>
> * Goddamnit, systemd, I want a way to turn *off* your control of the One
> True Cgroup Hierarchy (TM). I consider the lack of such a mechanism to
> be a serious upcoming regression. Maybe if the kernel gives systemd a
> way to do this, systemd will use it.
>
> --Andy

--
Victor Porton - http://portonvictor.org
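For context, the existing mechanism that "subreaper mode 1" presumably refers to is `prctl(PR_SET_CHILD_SUBREAPER)`, available since Linux 3.4. The sketch below shows what it already provides: orphaned descendants are reparented to the supervising process, which can therefore wait until the whole subtree is gone, and the supervised program does not need to be modified. It does not implement the proposed mode 2 and does not kill the subtree. The function names are illustrative, and `prctl` is called through `ctypes` because Python's standard library has no wrapper for it.

```python
import ctypes
import os

PR_SET_CHILD_SUBREAPER = 36  # value from <linux/prctl.h>


def set_subreaper(enable=True):
    """Mark (or unmark) the calling process as a child subreaper."""
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.prctl(PR_SET_CHILD_SUBREAPER, int(enable), 0, 0, 0) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def run_and_reap_all(argv):
    """Run argv; return once it and all of its orphaned descendants have exited.

    Returns the list of (pid, status) pairs collected by os.wait().
    """
    set_subreaper(True)
    try:
        pid = os.fork()
        if pid == 0:
            try:
                os.execvp(argv[0], argv)
            finally:
                os._exit(127)  # reached only if exec failed
        reaped = []
        while True:
            try:
                reaped.append(os.wait())
            except ChildProcessError:
                return reaped  # no children or adopted descendants remain
    finally:
        set_subreaper(False)
```

Because reparenting does not depend on sessions or process groups, a descendant that calls setsid() is still adopted and waited for. The loop reaps every child of the calling process, so it is best run in a dedicated supervisor process.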
A feature suggestion for sandboxing processes
In Fedora there is a bin/sandbox command which runs a specified command in a so-called 'sandbox'. A program running in the sandbox cannot open new files (it is commonly used with preopened stdin and stdout) and possibly its access to the network is limited. It is intended to run potentially malicious software safely.

This Fedora sandbox is not perfect, however.

One problem is:

Suppose the sandboxed program spawned some child processes and then exited itself.

Suppose we want to kill the sandboxed program after 30 seconds if it has not exited voluntarily.

The trouble is that the software cannot figure out which processes originated from the sandboxed binary. So we are unable to kill these processes automatically. This means that a hacker can in this way create thousands (or more) of processes which would overload the system.

Also note that the sandboxed program may run setsid() and thus its identity may be lost completely.

I propose to add a parameter sandbox_id to each process in the kernel. It would be 0 for normal processes and allocated like a PID or GID for processes we create in a sandbox. Children inherit sandbox_id. There should be an API call with which a process makes its sandbox_id non-zero (which returns EPERM if it is already non-zero).

Then there should be an API to enumerate all processes with a given sandbox_id, so that we would be able to kill them (-TERM or -KILL). Or maybe we should also have a function which sends the given signal to all processes with a given sandbox_id (otherwise we would be at war with a hacker who could possibly create new children faster than we kill them).

Please add me in CC: (I am not subscribed to this mailing list.)

--
Victor Porton - http://portonvictor.org
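To illustrate the gap described above, here is a sketch of the conventional userspace approach: start the command as the leader of its own session and process group, and on timeout signal the whole group. This is not claimed to be how Fedora's sandbox is implemented, and `run_with_timeout` is a made-up name. The limitation is the one the message points out: any descendant that calls setsid() leaves the group and survives the `killpg`.

```python
import os
import signal
import subprocess


def run_with_timeout(argv, timeout):
    """Run argv in a new session; SIGKILL its process group on timeout.

    Returns the exit status of the top-level process (negative signal
    number if it was killed).
    """
    proc = subprocess.Popen(argv, start_new_session=True)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # The child is a group leader, so its PID is also the group ID.
        os.killpg(proc.pid, signal.SIGKILL)
        return proc.wait()
```

For example, `run_with_timeout(["sleep", "60"], 30)` kills the command after 30 seconds, but only the processes that are still in its process group.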