Re: [PATCH v3] signal: add procfd_send_signal() syscall
On Wed, Dec 5, 2018 at 7:08 PM Christian Brauner wrote: > As a sidenote I'm switching the name from procfd_send_signal() to > taskfd_send_signal(). It seems to me the best way to handle Eric's > request to reflect that we can eventually both signal tgids and tids. Cool; sounds fine to me. -- Kees Cook
Re: [PATCH v3] signal: add procfd_send_signal() syscall
On Wed, Dec 5, 2018 at 7:08 PM Christian Brauner wrote: > As a sidenote I'm switching the name from procfd_send_signal() to > taskfd_send_signal(). It seems to me the best way to handle Eric's > request to reflect that we can eventually both signal tgids and tids. Cool; sounds fine to me. -- Kees Cook
Re: [PATCH v3] signal: add procfd_send_signal() syscall
On Wed, Dec 05, 2018 at 03:24:08PM -0800, Kees Cook wrote: > On Wed, Dec 5, 2018 at 12:53 PM Christian Brauner > wrote: > > On Wed, Dec 05, 2018 at 12:20:43PM -0600, Eric W. Biederman wrote: > > > Christian Brauner writes: > > > > [1]: https://lkml.org/lkml/2018/11/18/130 > > > > [2]: > > > > https://lore.kernel.org/lkml/874lbtjvtd@oldenburg2.str.redhat.com/ > > > > [3]: > > > > https://lore.kernel.org/lkml/20181204132604.aspfupwjgjx6f...@brauner.io/ > > > > [4]: > > > > https://lore.kernel.org/lkml/20181203180224.fkvw4kajtbvru...@brauner.io/ > > > > [5]: > > > > https://lore.kernel.org/lkml/20181121213946.ga10...@mail.hallyn.com/ > > > > [6]: > > > > https://lore.kernel.org/lkml/20181120103111.etlqp7zop34v6...@brauner.io/ > > > > [7]: > > > > https://lore.kernel.org/lkml/36323361-90bd-41af-ab5b-ee0d7ba02...@amacapital.net/ > > > > [8]: https://lore.kernel.org/lkml/87tvjxp8pc@xmission.com/ > > > > [9]: https://asciinema.org/a/X1J8eGhe3vCfBE2b9TXtTaSJ7 > > > > [10]: > > > > https://lore.kernel.org/lkml/20181203180224.fkvw4kajtbvru...@brauner.io/ > > > > [11]: > > > > https://lore.kernel.org/lkml/f53d6d38-3521-4c20-9034-5af447df6...@amacapital.net/ > > I nominate this for 2018's most-well-documented syscall commit log award. ;) Hahaha. If I win can I get my price in beer(s)? :) > > > > > + /* > > > > +* Give userspace a way to detect whether /proc//task/ fds > > > > +* are supported. > > > > +*/ > > > > + ret = -EOPNOTSUPP; > > > > + if (proc_is_tid_procfd(f.file)) > > > > + goto err; > > > > > > -EBADF is the proper error code. > > > > This is done so that userspace has a way of figuring out that tid fds > > are not yet supported. This has been discussed with Florian (see commit > > message). > > Right, we should keep this -EOPNOTSUPP. > > > > > + /* Is this a procfd? */ > > > > + ret = -EINVAL; > > > > + if (!proc_is_tgid_procfd(f.file)) > > > > + goto err; > > > > > > -EBADF is the proper error code. > > Yeah, EINVAL tends to be used for bad flags... this is more about an > improper fd. > > > > > > > > + /* Without CONFIG_PROC_FS proc_pid() returns NULL. */ > > > > + pid = proc_pid(file_inode(f.file)); > > > > + if (!pid) > > > > + goto err; > > > > > > Perhaps you want to fold the proc_pid into the proc_is_tgid_procfd > > > call. That way proc_pid can stay private to proc. > > > > Hm, I guess we can do that for now. My intention was to have reuseable > > helpers but I guess it would be fine for now. > > > > > > > > > + if (!may_signal_procfd(pid)) > > > > + goto err; > > > > + > > Does the ns parent checking in may_signal_procfd need any locking or > RCU? I know pid and current namespaces are "pinned", but I don't know > how parent ns works here. I'm assuming the parents are stuck until all > children go away? Yeah, since they are hierarchical killing an ancestor means killing the children. Also, in case you're interested, there's precedent for that: kernel/pid_namespace.c:static struct ns_common *pidns_get_parent(struct ns_common *ns) I'm not using this function because a) I would have to special case the initial test-case and b) it takes a get() on the pid ns which would force us to use another put which is unnecessary. > > > > > + ret = kill_pid_info(sig, , pid); > > Just double-checking for myself: this does not bypass > security_task_kill(), so no problem there AFAIK. > > Reviewed-by: Kees Cook Thanks! :) As a sidenote I'm switching the name from procfd_send_signal() to taskfd_send_signal(). It seems to me the best way to handle Eric's request to reflect that we can eventually both signal tgids and tids. > > -- > Kees Cook
Re: [PATCH v3] signal: add procfd_send_signal() syscall
On Wed, Dec 05, 2018 at 03:24:08PM -0800, Kees Cook wrote: > On Wed, Dec 5, 2018 at 12:53 PM Christian Brauner > wrote: > > On Wed, Dec 05, 2018 at 12:20:43PM -0600, Eric W. Biederman wrote: > > > Christian Brauner writes: > > > > [1]: https://lkml.org/lkml/2018/11/18/130 > > > > [2]: > > > > https://lore.kernel.org/lkml/874lbtjvtd@oldenburg2.str.redhat.com/ > > > > [3]: > > > > https://lore.kernel.org/lkml/20181204132604.aspfupwjgjx6f...@brauner.io/ > > > > [4]: > > > > https://lore.kernel.org/lkml/20181203180224.fkvw4kajtbvru...@brauner.io/ > > > > [5]: > > > > https://lore.kernel.org/lkml/20181121213946.ga10...@mail.hallyn.com/ > > > > [6]: > > > > https://lore.kernel.org/lkml/20181120103111.etlqp7zop34v6...@brauner.io/ > > > > [7]: > > > > https://lore.kernel.org/lkml/36323361-90bd-41af-ab5b-ee0d7ba02...@amacapital.net/ > > > > [8]: https://lore.kernel.org/lkml/87tvjxp8pc@xmission.com/ > > > > [9]: https://asciinema.org/a/X1J8eGhe3vCfBE2b9TXtTaSJ7 > > > > [10]: > > > > https://lore.kernel.org/lkml/20181203180224.fkvw4kajtbvru...@brauner.io/ > > > > [11]: > > > > https://lore.kernel.org/lkml/f53d6d38-3521-4c20-9034-5af447df6...@amacapital.net/ > > I nominate this for 2018's most-well-documented syscall commit log award. ;) Hahaha. If I win can I get my price in beer(s)? :) > > > > > + /* > > > > +* Give userspace a way to detect whether /proc//task/ fds > > > > +* are supported. > > > > +*/ > > > > + ret = -EOPNOTSUPP; > > > > + if (proc_is_tid_procfd(f.file)) > > > > + goto err; > > > > > > -EBADF is the proper error code. > > > > This is done so that userspace has a way of figuring out that tid fds > > are not yet supported. This has been discussed with Florian (see commit > > message). > > Right, we should keep this -EOPNOTSUPP. > > > > > + /* Is this a procfd? */ > > > > + ret = -EINVAL; > > > > + if (!proc_is_tgid_procfd(f.file)) > > > > + goto err; > > > > > > -EBADF is the proper error code. > > Yeah, EINVAL tends to be used for bad flags... this is more about an > improper fd. > > > > > > > > + /* Without CONFIG_PROC_FS proc_pid() returns NULL. */ > > > > + pid = proc_pid(file_inode(f.file)); > > > > + if (!pid) > > > > + goto err; > > > > > > Perhaps you want to fold the proc_pid into the proc_is_tgid_procfd > > > call. That way proc_pid can stay private to proc. > > > > Hm, I guess we can do that for now. My intention was to have reuseable > > helpers but I guess it would be fine for now. > > > > > > > > > + if (!may_signal_procfd(pid)) > > > > + goto err; > > > > + > > Does the ns parent checking in may_signal_procfd need any locking or > RCU? I know pid and current namespaces are "pinned", but I don't know > how parent ns works here. I'm assuming the parents are stuck until all > children go away? Yeah, since they are hierarchical killing an ancestor means killing the children. Also, in case you're interested, there's precedent for that: kernel/pid_namespace.c:static struct ns_common *pidns_get_parent(struct ns_common *ns) I'm not using this function because a) I would have to special case the initial test-case and b) it takes a get() on the pid ns which would force us to use another put which is unnecessary. > > > > > + ret = kill_pid_info(sig, , pid); > > Just double-checking for myself: this does not bypass > security_task_kill(), so no problem there AFAIK. > > Reviewed-by: Kees Cook Thanks! :) As a sidenote I'm switching the name from procfd_send_signal() to taskfd_send_signal(). It seems to me the best way to handle Eric's request to reflect that we can eventually both signal tgids and tids. > > -- > Kees Cook
Re: [PATCH v3] signal: add procfd_send_signal() syscall
On Wed, Dec 5, 2018 at 12:53 PM Christian Brauner wrote: > On Wed, Dec 05, 2018 at 12:20:43PM -0600, Eric W. Biederman wrote: > > Christian Brauner writes: > > > [1]: https://lkml.org/lkml/2018/11/18/130 > > > [2]: > > > https://lore.kernel.org/lkml/874lbtjvtd@oldenburg2.str.redhat.com/ > > > [3]: > > > https://lore.kernel.org/lkml/20181204132604.aspfupwjgjx6f...@brauner.io/ > > > [4]: > > > https://lore.kernel.org/lkml/20181203180224.fkvw4kajtbvru...@brauner.io/ > > > [5]: https://lore.kernel.org/lkml/20181121213946.ga10...@mail.hallyn.com/ > > > [6]: > > > https://lore.kernel.org/lkml/20181120103111.etlqp7zop34v6...@brauner.io/ > > > [7]: > > > https://lore.kernel.org/lkml/36323361-90bd-41af-ab5b-ee0d7ba02...@amacapital.net/ > > > [8]: https://lore.kernel.org/lkml/87tvjxp8pc@xmission.com/ > > > [9]: https://asciinema.org/a/X1J8eGhe3vCfBE2b9TXtTaSJ7 > > > [10]: > > > https://lore.kernel.org/lkml/20181203180224.fkvw4kajtbvru...@brauner.io/ > > > [11]: > > > https://lore.kernel.org/lkml/f53d6d38-3521-4c20-9034-5af447df6...@amacapital.net/ I nominate this for 2018's most-well-documented syscall commit log award. ;) > > > + /* > > > +* Give userspace a way to detect whether /proc//task/ fds > > > +* are supported. > > > +*/ > > > + ret = -EOPNOTSUPP; > > > + if (proc_is_tid_procfd(f.file)) > > > + goto err; > > > > -EBADF is the proper error code. > > This is done so that userspace has a way of figuring out that tid fds > are not yet supported. This has been discussed with Florian (see commit > message). Right, we should keep this -EOPNOTSUPP. > > > + /* Is this a procfd? */ > > > + ret = -EINVAL; > > > + if (!proc_is_tgid_procfd(f.file)) > > > + goto err; > > > > -EBADF is the proper error code. Yeah, EINVAL tends to be used for bad flags... this is more about an improper fd. > > > > > + /* Without CONFIG_PROC_FS proc_pid() returns NULL. */ > > > + pid = proc_pid(file_inode(f.file)); > > > + if (!pid) > > > + goto err; > > > > Perhaps you want to fold the proc_pid into the proc_is_tgid_procfd > > call. That way proc_pid can stay private to proc. > > Hm, I guess we can do that for now. My intention was to have reuseable > helpers but I guess it would be fine for now. > > > > > > + if (!may_signal_procfd(pid)) > > > + goto err; > > > + Does the ns parent checking in may_signal_procfd need any locking or RCU? I know pid and current namespaces are "pinned", but I don't know how parent ns works here. I'm assuming the parents are stuck until all children go away? > > > + ret = kill_pid_info(sig, , pid); Just double-checking for myself: this does not bypass security_task_kill(), so no problem there AFAIK. Reviewed-by: Kees Cook -- Kees Cook
Re: [PATCH v3] signal: add procfd_send_signal() syscall
On Wed, Dec 5, 2018 at 12:53 PM Christian Brauner wrote: > On Wed, Dec 05, 2018 at 12:20:43PM -0600, Eric W. Biederman wrote: > > Christian Brauner writes: > > > [1]: https://lkml.org/lkml/2018/11/18/130 > > > [2]: > > > https://lore.kernel.org/lkml/874lbtjvtd@oldenburg2.str.redhat.com/ > > > [3]: > > > https://lore.kernel.org/lkml/20181204132604.aspfupwjgjx6f...@brauner.io/ > > > [4]: > > > https://lore.kernel.org/lkml/20181203180224.fkvw4kajtbvru...@brauner.io/ > > > [5]: https://lore.kernel.org/lkml/20181121213946.ga10...@mail.hallyn.com/ > > > [6]: > > > https://lore.kernel.org/lkml/20181120103111.etlqp7zop34v6...@brauner.io/ > > > [7]: > > > https://lore.kernel.org/lkml/36323361-90bd-41af-ab5b-ee0d7ba02...@amacapital.net/ > > > [8]: https://lore.kernel.org/lkml/87tvjxp8pc@xmission.com/ > > > [9]: https://asciinema.org/a/X1J8eGhe3vCfBE2b9TXtTaSJ7 > > > [10]: > > > https://lore.kernel.org/lkml/20181203180224.fkvw4kajtbvru...@brauner.io/ > > > [11]: > > > https://lore.kernel.org/lkml/f53d6d38-3521-4c20-9034-5af447df6...@amacapital.net/ I nominate this for 2018's most-well-documented syscall commit log award. ;) > > > + /* > > > +* Give userspace a way to detect whether /proc//task/ fds > > > +* are supported. > > > +*/ > > > + ret = -EOPNOTSUPP; > > > + if (proc_is_tid_procfd(f.file)) > > > + goto err; > > > > -EBADF is the proper error code. > > This is done so that userspace has a way of figuring out that tid fds > are not yet supported. This has been discussed with Florian (see commit > message). Right, we should keep this -EOPNOTSUPP. > > > + /* Is this a procfd? */ > > > + ret = -EINVAL; > > > + if (!proc_is_tgid_procfd(f.file)) > > > + goto err; > > > > -EBADF is the proper error code. Yeah, EINVAL tends to be used for bad flags... this is more about an improper fd. > > > > > + /* Without CONFIG_PROC_FS proc_pid() returns NULL. */ > > > + pid = proc_pid(file_inode(f.file)); > > > + if (!pid) > > > + goto err; > > > > Perhaps you want to fold the proc_pid into the proc_is_tgid_procfd > > call. That way proc_pid can stay private to proc. > > Hm, I guess we can do that for now. My intention was to have reuseable > helpers but I guess it would be fine for now. > > > > > > + if (!may_signal_procfd(pid)) > > > + goto err; > > > + Does the ns parent checking in may_signal_procfd need any locking or RCU? I know pid and current namespaces are "pinned", but I don't know how parent ns works here. I'm assuming the parents are stuck until all children go away? > > > + ret = kill_pid_info(sig, , pid); Just double-checking for myself: this does not bypass security_task_kill(), so no problem there AFAIK. Reviewed-by: Kees Cook -- Kees Cook
Re: [PATCH v3] signal: add procfd_send_signal() syscall
On Wed, Dec 05, 2018 at 12:20:43PM -0600, Eric W. Biederman wrote: > Christian Brauner writes: > > > The kill() syscall operates on process identifiers (pid). After a process > > has exited its pid can be reused by another process. If a caller sends a > > signal to a reused pid it will end up signaling the wrong process. This > > issue has often surfaced and there has been a push [1] to address this > > problem. > > > > This patch uses file descriptors (fd) from proc/ as stable handles on > > struct pid. Even if a pid is recycled the handle will not change. The fd > > can be used to send signals to the process it refers to. > > Thus, the new syscall procfd_send_signal() is introduced to solve this > > problem. Instead of pids it operates on process fds (procfd). > > > > /* prototype and argument /* > > long procfd_send_signal(int fd, int sig, siginfo_t *info, unsigned int > > flags); > > > > In addition to the procfd and signal argument it takes an additional > > siginfo_t and flags argument. If the siginfo_t argument is NULL then > > procfd_send_signal() behaves like kill(). If it is not NULL > > procfd_send_signal() behaves like rt_sigqueueinfo(). > > The flags argument is added to allow for future extensions of this syscall. > > It currently needs to be passed as 0. Failing to do so will cause EINVAL. > > > > /* procfd_send_signal() replaces multiple pid-based syscalls */ > > The procfd_send_signal() syscall currently takes on the job of the > > following syscalls that operate on pids: > > - kill(2) > > - rt_sigqueueinfo(2) > > The syscall is defined in such a way that it can also operate on thread fds > > instead of process fds. In a future patchset I will extend it to operate on > > procfds from /proc//task/ at which point it will additionally > > take on the job of: > > - tgkill(2) > > - rt_tgsigqueueinfo(2) > > Right now this is intentionally left out to keep this patchset as simple as > > possible (cf. [4]). If a procfd of /proc//task/ is passed > > EOPNOTSUPP will be returned to give userspace a way to detect when I add > > support for such procfds {cf. [10]). > > > > /* naming */ > > The name procfd_send_signal() was chosen to reflect the fact that it takes > > on the job of multiple syscalls. It is intentional that the name is not > > reminiscent of neither kill(2) nor rt_sigqueueinfo(2). Not the fomer > > because it might imply that procfd_send_signal() is only a replacement for > > kill(2) and not the latter because it is a hazzle to remember the correct > > spelling (especially for non-native speakers) and because it is not > > descriptive enough of what the syscall actually does. The name > > "procfd_send_signal" makes it very clear that its job is to send signals. > > > > /* O_PATH file descriptors */ > > procfds opened as O_PATH fds cannot be used to send signals to a process > > (cf. [2]). Signaling processes through procfds is the equivalent of writing > > to a file. Thus, this is not an operation that operates "purely at the > > file descriptor level" as required by the open(2) manpage. > > > > /* zombies */ > > Zombies can be signaled just as any other process. No special error will be > > reported since a zombie state is an unreliable state (cf. [3]). > > > > /* cross-namespace signals */ > > The patch currently enforces that the signaler and signalee either are in > > the same pid namespace or that the signaler's pid namespace is an ancestor > > of the signalee's pid namespace. This is done for the sake of simplicity > > and because it is unclear to what values certain members of struct > > siginfo_t would need to be set to (cf. [5], [6]). > > > > /* compat syscalls */ > > It became clear that we would like to avoid adding compat syscalls (cf. > > [7]). The compat syscall handling is now done in kernel/signal.c itself by > > adding __copy_siginfo_from_user_generic() which lets us avoid compat > > syscalls (cf. [8]). > > With upcoming rework for syscall handling things might improve even further > > (cf. [11]). > > This patch was tested on x64 and x86. > > > > /* userspace usage */ > > An asciinema recording for the basic functionality can be found under [9]. > > With this patch a process can be killed via: > > > > #define _GNU_SOURCE > > #include > > #include > > #include > > #include > > #include > > #include > > #include > > #include > > #include > > #include > > > > static inline int do_procfd_send_signal(int procfd, int sig, siginfo_t > > *info, > > unsigned int flags) > > { > > #ifdef __NR_procfd_send_signal > > return syscall(__NR_procfd_send_signal, procfd, sig, info, flags); > > #else > > return -ENOSYS; > > #endif > > } > > > > int main(int argc, char *argv[]) > > { > > int fd, ret, saved_errno, sig; > > > > if (argc < 3) > > exit(EXIT_FAILURE); > > > > fd = open(argv[1], O_DIRECTORY | O_CLOEXEC); > > if (fd < 0) { > >
Re: [PATCH v3] signal: add procfd_send_signal() syscall
On Wed, Dec 05, 2018 at 12:20:43PM -0600, Eric W. Biederman wrote: > Christian Brauner writes: > > > The kill() syscall operates on process identifiers (pid). After a process > > has exited its pid can be reused by another process. If a caller sends a > > signal to a reused pid it will end up signaling the wrong process. This > > issue has often surfaced and there has been a push [1] to address this > > problem. > > > > This patch uses file descriptors (fd) from proc/ as stable handles on > > struct pid. Even if a pid is recycled the handle will not change. The fd > > can be used to send signals to the process it refers to. > > Thus, the new syscall procfd_send_signal() is introduced to solve this > > problem. Instead of pids it operates on process fds (procfd). > > > > /* prototype and argument /* > > long procfd_send_signal(int fd, int sig, siginfo_t *info, unsigned int > > flags); > > > > In addition to the procfd and signal argument it takes an additional > > siginfo_t and flags argument. If the siginfo_t argument is NULL then > > procfd_send_signal() behaves like kill(). If it is not NULL > > procfd_send_signal() behaves like rt_sigqueueinfo(). > > The flags argument is added to allow for future extensions of this syscall. > > It currently needs to be passed as 0. Failing to do so will cause EINVAL. > > > > /* procfd_send_signal() replaces multiple pid-based syscalls */ > > The procfd_send_signal() syscall currently takes on the job of the > > following syscalls that operate on pids: > > - kill(2) > > - rt_sigqueueinfo(2) > > The syscall is defined in such a way that it can also operate on thread fds > > instead of process fds. In a future patchset I will extend it to operate on > > procfds from /proc//task/ at which point it will additionally > > take on the job of: > > - tgkill(2) > > - rt_tgsigqueueinfo(2) > > Right now this is intentionally left out to keep this patchset as simple as > > possible (cf. [4]). If a procfd of /proc//task/ is passed > > EOPNOTSUPP will be returned to give userspace a way to detect when I add > > support for such procfds {cf. [10]). > > > > /* naming */ > > The name procfd_send_signal() was chosen to reflect the fact that it takes > > on the job of multiple syscalls. It is intentional that the name is not > > reminiscent of neither kill(2) nor rt_sigqueueinfo(2). Not the fomer > > because it might imply that procfd_send_signal() is only a replacement for > > kill(2) and not the latter because it is a hazzle to remember the correct > > spelling (especially for non-native speakers) and because it is not > > descriptive enough of what the syscall actually does. The name > > "procfd_send_signal" makes it very clear that its job is to send signals. > > > > /* O_PATH file descriptors */ > > procfds opened as O_PATH fds cannot be used to send signals to a process > > (cf. [2]). Signaling processes through procfds is the equivalent of writing > > to a file. Thus, this is not an operation that operates "purely at the > > file descriptor level" as required by the open(2) manpage. > > > > /* zombies */ > > Zombies can be signaled just as any other process. No special error will be > > reported since a zombie state is an unreliable state (cf. [3]). > > > > /* cross-namespace signals */ > > The patch currently enforces that the signaler and signalee either are in > > the same pid namespace or that the signaler's pid namespace is an ancestor > > of the signalee's pid namespace. This is done for the sake of simplicity > > and because it is unclear to what values certain members of struct > > siginfo_t would need to be set to (cf. [5], [6]). > > > > /* compat syscalls */ > > It became clear that we would like to avoid adding compat syscalls (cf. > > [7]). The compat syscall handling is now done in kernel/signal.c itself by > > adding __copy_siginfo_from_user_generic() which lets us avoid compat > > syscalls (cf. [8]). > > With upcoming rework for syscall handling things might improve even further > > (cf. [11]). > > This patch was tested on x64 and x86. > > > > /* userspace usage */ > > An asciinema recording for the basic functionality can be found under [9]. > > With this patch a process can be killed via: > > > > #define _GNU_SOURCE > > #include > > #include > > #include > > #include > > #include > > #include > > #include > > #include > > #include > > #include > > > > static inline int do_procfd_send_signal(int procfd, int sig, siginfo_t > > *info, > > unsigned int flags) > > { > > #ifdef __NR_procfd_send_signal > > return syscall(__NR_procfd_send_signal, procfd, sig, info, flags); > > #else > > return -ENOSYS; > > #endif > > } > > > > int main(int argc, char *argv[]) > > { > > int fd, ret, saved_errno, sig; > > > > if (argc < 3) > > exit(EXIT_FAILURE); > > > > fd = open(argv[1], O_DIRECTORY | O_CLOEXEC); > > if (fd < 0) { > >
Re: [PATCH v3] signal: add procfd_send_signal() syscall
Christian Brauner writes: > The kill() syscall operates on process identifiers (pid). After a process > has exited its pid can be reused by another process. If a caller sends a > signal to a reused pid it will end up signaling the wrong process. This > issue has often surfaced and there has been a push [1] to address this > problem. > > This patch uses file descriptors (fd) from proc/ as stable handles on > struct pid. Even if a pid is recycled the handle will not change. The fd > can be used to send signals to the process it refers to. > Thus, the new syscall procfd_send_signal() is introduced to solve this > problem. Instead of pids it operates on process fds (procfd). > > /* prototype and argument /* > long procfd_send_signal(int fd, int sig, siginfo_t *info, unsigned int flags); > > In addition to the procfd and signal argument it takes an additional > siginfo_t and flags argument. If the siginfo_t argument is NULL then > procfd_send_signal() behaves like kill(). If it is not NULL > procfd_send_signal() behaves like rt_sigqueueinfo(). > The flags argument is added to allow for future extensions of this syscall. > It currently needs to be passed as 0. Failing to do so will cause EINVAL. > > /* procfd_send_signal() replaces multiple pid-based syscalls */ > The procfd_send_signal() syscall currently takes on the job of the > following syscalls that operate on pids: > - kill(2) > - rt_sigqueueinfo(2) > The syscall is defined in such a way that it can also operate on thread fds > instead of process fds. In a future patchset I will extend it to operate on > procfds from /proc//task/ at which point it will additionally > take on the job of: > - tgkill(2) > - rt_tgsigqueueinfo(2) > Right now this is intentionally left out to keep this patchset as simple as > possible (cf. [4]). If a procfd of /proc//task/ is passed > EOPNOTSUPP will be returned to give userspace a way to detect when I add > support for such procfds {cf. [10]). > > /* naming */ > The name procfd_send_signal() was chosen to reflect the fact that it takes > on the job of multiple syscalls. It is intentional that the name is not > reminiscent of neither kill(2) nor rt_sigqueueinfo(2). Not the fomer > because it might imply that procfd_send_signal() is only a replacement for > kill(2) and not the latter because it is a hazzle to remember the correct > spelling (especially for non-native speakers) and because it is not > descriptive enough of what the syscall actually does. The name > "procfd_send_signal" makes it very clear that its job is to send signals. > > /* O_PATH file descriptors */ > procfds opened as O_PATH fds cannot be used to send signals to a process > (cf. [2]). Signaling processes through procfds is the equivalent of writing > to a file. Thus, this is not an operation that operates "purely at the > file descriptor level" as required by the open(2) manpage. > > /* zombies */ > Zombies can be signaled just as any other process. No special error will be > reported since a zombie state is an unreliable state (cf. [3]). > > /* cross-namespace signals */ > The patch currently enforces that the signaler and signalee either are in > the same pid namespace or that the signaler's pid namespace is an ancestor > of the signalee's pid namespace. This is done for the sake of simplicity > and because it is unclear to what values certain members of struct > siginfo_t would need to be set to (cf. [5], [6]). > > /* compat syscalls */ > It became clear that we would like to avoid adding compat syscalls (cf. > [7]). The compat syscall handling is now done in kernel/signal.c itself by > adding __copy_siginfo_from_user_generic() which lets us avoid compat > syscalls (cf. [8]). > With upcoming rework for syscall handling things might improve even further > (cf. [11]). > This patch was tested on x64 and x86. > > /* userspace usage */ > An asciinema recording for the basic functionality can be found under [9]. > With this patch a process can be killed via: > > #define _GNU_SOURCE > #include > #include > #include > #include > #include > #include > #include > #include > #include > #include > > static inline int do_procfd_send_signal(int procfd, int sig, siginfo_t *info, > unsigned int flags) > { > #ifdef __NR_procfd_send_signal > return syscall(__NR_procfd_send_signal, procfd, sig, info, flags); > #else > return -ENOSYS; > #endif > } > > int main(int argc, char *argv[]) > { > int fd, ret, saved_errno, sig; > > if (argc < 3) > exit(EXIT_FAILURE); > > fd = open(argv[1], O_DIRECTORY | O_CLOEXEC); > if (fd < 0) { > printf("%s - Failed to open \"%s\"\n", strerror(errno), > argv[1]); > exit(EXIT_FAILURE); > } > > sig = atoi(argv[2]); > > printf("Sending signal %d to process %s\n", sig, argv[1]); > ret = do_procfd_send_signal(fd, sig, NULL, 0); > >
Re: [PATCH v3] signal: add procfd_send_signal() syscall
Christian Brauner writes: > The kill() syscall operates on process identifiers (pid). After a process > has exited its pid can be reused by another process. If a caller sends a > signal to a reused pid it will end up signaling the wrong process. This > issue has often surfaced and there has been a push [1] to address this > problem. > > This patch uses file descriptors (fd) from proc/ as stable handles on > struct pid. Even if a pid is recycled the handle will not change. The fd > can be used to send signals to the process it refers to. > Thus, the new syscall procfd_send_signal() is introduced to solve this > problem. Instead of pids it operates on process fds (procfd). > > /* prototype and argument /* > long procfd_send_signal(int fd, int sig, siginfo_t *info, unsigned int flags); > > In addition to the procfd and signal argument it takes an additional > siginfo_t and flags argument. If the siginfo_t argument is NULL then > procfd_send_signal() behaves like kill(). If it is not NULL > procfd_send_signal() behaves like rt_sigqueueinfo(). > The flags argument is added to allow for future extensions of this syscall. > It currently needs to be passed as 0. Failing to do so will cause EINVAL. > > /* procfd_send_signal() replaces multiple pid-based syscalls */ > The procfd_send_signal() syscall currently takes on the job of the > following syscalls that operate on pids: > - kill(2) > - rt_sigqueueinfo(2) > The syscall is defined in such a way that it can also operate on thread fds > instead of process fds. In a future patchset I will extend it to operate on > procfds from /proc//task/ at which point it will additionally > take on the job of: > - tgkill(2) > - rt_tgsigqueueinfo(2) > Right now this is intentionally left out to keep this patchset as simple as > possible (cf. [4]). If a procfd of /proc//task/ is passed > EOPNOTSUPP will be returned to give userspace a way to detect when I add > support for such procfds {cf. [10]). > > /* naming */ > The name procfd_send_signal() was chosen to reflect the fact that it takes > on the job of multiple syscalls. It is intentional that the name is not > reminiscent of neither kill(2) nor rt_sigqueueinfo(2). Not the fomer > because it might imply that procfd_send_signal() is only a replacement for > kill(2) and not the latter because it is a hazzle to remember the correct > spelling (especially for non-native speakers) and because it is not > descriptive enough of what the syscall actually does. The name > "procfd_send_signal" makes it very clear that its job is to send signals. > > /* O_PATH file descriptors */ > procfds opened as O_PATH fds cannot be used to send signals to a process > (cf. [2]). Signaling processes through procfds is the equivalent of writing > to a file. Thus, this is not an operation that operates "purely at the > file descriptor level" as required by the open(2) manpage. > > /* zombies */ > Zombies can be signaled just as any other process. No special error will be > reported since a zombie state is an unreliable state (cf. [3]). > > /* cross-namespace signals */ > The patch currently enforces that the signaler and signalee either are in > the same pid namespace or that the signaler's pid namespace is an ancestor > of the signalee's pid namespace. This is done for the sake of simplicity > and because it is unclear to what values certain members of struct > siginfo_t would need to be set to (cf. [5], [6]). > > /* compat syscalls */ > It became clear that we would like to avoid adding compat syscalls (cf. > [7]). The compat syscall handling is now done in kernel/signal.c itself by > adding __copy_siginfo_from_user_generic() which lets us avoid compat > syscalls (cf. [8]). > With upcoming rework for syscall handling things might improve even further > (cf. [11]). > This patch was tested on x64 and x86. > > /* userspace usage */ > An asciinema recording for the basic functionality can be found under [9]. > With this patch a process can be killed via: > > #define _GNU_SOURCE > #include > #include > #include > #include > #include > #include > #include > #include > #include > #include > > static inline int do_procfd_send_signal(int procfd, int sig, siginfo_t *info, > unsigned int flags) > { > #ifdef __NR_procfd_send_signal > return syscall(__NR_procfd_send_signal, procfd, sig, info, flags); > #else > return -ENOSYS; > #endif > } > > int main(int argc, char *argv[]) > { > int fd, ret, saved_errno, sig; > > if (argc < 3) > exit(EXIT_FAILURE); > > fd = open(argv[1], O_DIRECTORY | O_CLOEXEC); > if (fd < 0) { > printf("%s - Failed to open \"%s\"\n", strerror(errno), > argv[1]); > exit(EXIT_FAILURE); > } > > sig = atoi(argv[2]); > > printf("Sending signal %d to process %s\n", sig, argv[1]); > ret = do_procfd_send_signal(fd, sig, NULL, 0); > >
Re: [PATCH v3] signal: add procfd_send_signal() syscall
Christian Brauner writes: > The kill() syscall operates on process identifiers (pid). After a process > has exited its pid can be reused by another process. If a caller sends a > signal to a reused pid it will end up signaling the wrong process. This > issue has often surfaced and there has been a push [1] to address this > problem. > > This patch uses file descriptors (fd) from proc/ as stable handles on > struct pid. Even if a pid is recycled the handle will not change. The fd > can be used to send signals to the process it refers to. > Thus, the new syscall procfd_send_signal() is introduced to solve this > problem. Instead of pids it operates on process fds (procfd). > > /* prototype and argument /* > long procfd_send_signal(int fd, int sig, siginfo_t *info, unsigned int flags); > > In addition to the procfd and signal argument it takes an additional > siginfo_t and flags argument. If the siginfo_t argument is NULL then > procfd_send_signal() behaves like kill(). If it is not NULL > procfd_send_signal() behaves like rt_sigqueueinfo(). > The flags argument is added to allow for future extensions of this syscall. > It currently needs to be passed as 0. Failing to do so will cause EINVAL. > > /* procfd_send_signal() replaces multiple pid-based syscalls */ > The procfd_send_signal() syscall currently takes on the job of the > following syscalls that operate on pids: > - kill(2) > - rt_sigqueueinfo(2) > The syscall is defined in such a way that it can also operate on thread fds > instead of process fds. In a future patchset I will extend it to operate on > procfds from /proc//task/ at which point it will additionally > take on the job of: > - tgkill(2) > - rt_tgsigqueueinfo(2) > Right now this is intentionally left out to keep this patchset as simple as > possible (cf. [4]). If a procfd of /proc//task/ is passed > EOPNOTSUPP will be returned to give userspace a way to detect when I add > support for such procfds {cf. [10]). > > /* naming */ > The name procfd_send_signal() was chosen to reflect the fact that it takes > on the job of multiple syscalls. It is intentional that the name is not > reminiscent of neither kill(2) nor rt_sigqueueinfo(2). Not the fomer > because it might imply that procfd_send_signal() is only a replacement for > kill(2) and not the latter because it is a hazzle to remember the correct > spelling (especially for non-native speakers) and because it is not > descriptive enough of what the syscall actually does. The name > "procfd_send_signal" makes it very clear that its job is to send signals. > > /* O_PATH file descriptors */ > procfds opened as O_PATH fds cannot be used to send signals to a process > (cf. [2]). Signaling processes through procfds is the equivalent of writing > to a file. Thus, this is not an operation that operates "purely at the > file descriptor level" as required by the open(2) manpage. > > /* zombies */ > Zombies can be signaled just as any other process. No special error will be > reported since a zombie state is an unreliable state (cf. [3]). > > /* cross-namespace signals */ > The patch currently enforces that the signaler and signalee either are in > the same pid namespace or that the signaler's pid namespace is an ancestor > of the signalee's pid namespace. This is done for the sake of simplicity > and because it is unclear to what values certain members of struct > siginfo_t would need to be set to (cf. [5], [6]). > > /* compat syscalls */ > It became clear that we would like to avoid adding compat syscalls (cf. > [7]). The compat syscall handling is now done in kernel/signal.c itself by > adding __copy_siginfo_from_user_generic() which lets us avoid compat > syscalls (cf. [8]). > With upcoming rework for syscall handling things might improve even further > (cf. [11]). > This patch was tested on x64 and x86. > > /* userspace usage */ > An asciinema recording for the basic functionality can be found under [9]. > With this patch a process can be killed via: > > #define _GNU_SOURCE > #include > #include > #include > #include > #include > #include > #include > #include > #include > #include > > static inline int do_procfd_send_signal(int procfd, int sig, siginfo_t *info, > unsigned int flags) > { > #ifdef __NR_procfd_send_signal > return syscall(__NR_procfd_send_signal, procfd, sig, info, flags); > #else > return -ENOSYS; > #endif > } > > int main(int argc, char *argv[]) > { > int fd, ret, saved_errno, sig; > > if (argc < 3) > exit(EXIT_FAILURE); > > fd = open(argv[1], O_DIRECTORY | O_CLOEXEC); > if (fd < 0) { > printf("%s - Failed to open \"%s\"\n", strerror(errno), > argv[1]); > exit(EXIT_FAILURE); > } > > sig = atoi(argv[2]); > > printf("Sending signal %d to process %s\n", sig, argv[1]); > ret = do_procfd_send_signal(fd, sig, NULL, 0); > >
Re: [PATCH v3] signal: add procfd_send_signal() syscall
Christian Brauner writes: > The kill() syscall operates on process identifiers (pid). After a process > has exited its pid can be reused by another process. If a caller sends a > signal to a reused pid it will end up signaling the wrong process. This > issue has often surfaced and there has been a push [1] to address this > problem. > > This patch uses file descriptors (fd) from proc/ as stable handles on > struct pid. Even if a pid is recycled the handle will not change. The fd > can be used to send signals to the process it refers to. > Thus, the new syscall procfd_send_signal() is introduced to solve this > problem. Instead of pids it operates on process fds (procfd). > > /* prototype and argument /* > long procfd_send_signal(int fd, int sig, siginfo_t *info, unsigned int flags); > > In addition to the procfd and signal argument it takes an additional > siginfo_t and flags argument. If the siginfo_t argument is NULL then > procfd_send_signal() behaves like kill(). If it is not NULL > procfd_send_signal() behaves like rt_sigqueueinfo(). > The flags argument is added to allow for future extensions of this syscall. > It currently needs to be passed as 0. Failing to do so will cause EINVAL. > > /* procfd_send_signal() replaces multiple pid-based syscalls */ > The procfd_send_signal() syscall currently takes on the job of the > following syscalls that operate on pids: > - kill(2) > - rt_sigqueueinfo(2) > The syscall is defined in such a way that it can also operate on thread fds > instead of process fds. In a future patchset I will extend it to operate on > procfds from /proc//task/ at which point it will additionally > take on the job of: > - tgkill(2) > - rt_tgsigqueueinfo(2) > Right now this is intentionally left out to keep this patchset as simple as > possible (cf. [4]). If a procfd of /proc//task/ is passed > EOPNOTSUPP will be returned to give userspace a way to detect when I add > support for such procfds {cf. [10]). > > /* naming */ > The name procfd_send_signal() was chosen to reflect the fact that it takes > on the job of multiple syscalls. It is intentional that the name is not > reminiscent of neither kill(2) nor rt_sigqueueinfo(2). Not the fomer > because it might imply that procfd_send_signal() is only a replacement for > kill(2) and not the latter because it is a hazzle to remember the correct > spelling (especially for non-native speakers) and because it is not > descriptive enough of what the syscall actually does. The name > "procfd_send_signal" makes it very clear that its job is to send signals. > > /* O_PATH file descriptors */ > procfds opened as O_PATH fds cannot be used to send signals to a process > (cf. [2]). Signaling processes through procfds is the equivalent of writing > to a file. Thus, this is not an operation that operates "purely at the > file descriptor level" as required by the open(2) manpage. > > /* zombies */ > Zombies can be signaled just as any other process. No special error will be > reported since a zombie state is an unreliable state (cf. [3]). > > /* cross-namespace signals */ > The patch currently enforces that the signaler and signalee either are in > the same pid namespace or that the signaler's pid namespace is an ancestor > of the signalee's pid namespace. This is done for the sake of simplicity > and because it is unclear to what values certain members of struct > siginfo_t would need to be set to (cf. [5], [6]). > > /* compat syscalls */ > It became clear that we would like to avoid adding compat syscalls (cf. > [7]). The compat syscall handling is now done in kernel/signal.c itself by > adding __copy_siginfo_from_user_generic() which lets us avoid compat > syscalls (cf. [8]). > With upcoming rework for syscall handling things might improve even further > (cf. [11]). > This patch was tested on x64 and x86. > > /* userspace usage */ > An asciinema recording for the basic functionality can be found under [9]. > With this patch a process can be killed via: > > #define _GNU_SOURCE > #include > #include > #include > #include > #include > #include > #include > #include > #include > #include > > static inline int do_procfd_send_signal(int procfd, int sig, siginfo_t *info, > unsigned int flags) > { > #ifdef __NR_procfd_send_signal > return syscall(__NR_procfd_send_signal, procfd, sig, info, flags); > #else > return -ENOSYS; > #endif > } > > int main(int argc, char *argv[]) > { > int fd, ret, saved_errno, sig; > > if (argc < 3) > exit(EXIT_FAILURE); > > fd = open(argv[1], O_DIRECTORY | O_CLOEXEC); > if (fd < 0) { > printf("%s - Failed to open \"%s\"\n", strerror(errno), > argv[1]); > exit(EXIT_FAILURE); > } > > sig = atoi(argv[2]); > > printf("Sending signal %d to process %s\n", sig, argv[1]); > ret = do_procfd_send_signal(fd, sig, NULL, 0); > >
Re: [PATCH v3] signal: add procfd_send_signal() syscall
Hi Christian, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on linus/master] [also build test WARNING on v4.20-rc5] [cannot apply to next-20181205] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Christian-Brauner/signal-add-procfd_send_signal-syscall/20181205-174512 config: sh-titan_defconfig (attached as .config) compiler: sh4-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree GCC_VERSION=7.2.0 make.cross ARCH=sh All warnings (new ones prefixed by >>): :1317:2: warning: #warning syscall pkey_mprotect not implemented [-Wcpp] :1320:2: warning: #warning syscall pkey_alloc not implemented [-Wcpp] :1323:2: warning: #warning syscall pkey_free not implemented [-Wcpp] :1326:2: warning: #warning syscall statx not implemented [-Wcpp] :1332:2: warning: #warning syscall io_pgetevents not implemented [-Wcpp] :1335:2: warning: #warning syscall rseq not implemented [-Wcpp] >> :1338:2: warning: #warning syscall procfd_send_signal not implemented >> [-Wcpp] --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: application/gzip
Re: [PATCH v3] signal: add procfd_send_signal() syscall
Hi Christian, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on linus/master] [also build test WARNING on v4.20-rc5] [cannot apply to next-20181205] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Christian-Brauner/signal-add-procfd_send_signal-syscall/20181205-174512 config: sh-titan_defconfig (attached as .config) compiler: sh4-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree GCC_VERSION=7.2.0 make.cross ARCH=sh All warnings (new ones prefixed by >>): :1317:2: warning: #warning syscall pkey_mprotect not implemented [-Wcpp] :1320:2: warning: #warning syscall pkey_alloc not implemented [-Wcpp] :1323:2: warning: #warning syscall pkey_free not implemented [-Wcpp] :1326:2: warning: #warning syscall statx not implemented [-Wcpp] :1332:2: warning: #warning syscall io_pgetevents not implemented [-Wcpp] :1335:2: warning: #warning syscall rseq not implemented [-Wcpp] >> :1338:2: warning: #warning syscall procfd_send_signal not implemented >> [-Wcpp] --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: application/gzip