Re: [PATCH v3] signal: add procfd_send_signal() syscall

2018-12-05 Thread Kees Cook
On Wed, Dec 5, 2018 at 7:08 PM Christian Brauner  wrote:
> As a sidenote I'm switching the name from procfd_send_signal() to
> taskfd_send_signal(). It seems to me the best way to handle Eric's
> request to reflect that we can eventually both signal tgids and tids.

Cool; sounds fine to me.

-- 
Kees Cook


Re: [PATCH v3] signal: add procfd_send_signal() syscall

2018-12-05 Thread Kees Cook
On Wed, Dec 5, 2018 at 7:08 PM Christian Brauner  wrote:
> As a sidenote I'm switching the name from procfd_send_signal() to
> taskfd_send_signal(). It seems to me the best way to handle Eric's
> request to reflect that we can eventually both signal tgids and tids.

Cool; sounds fine to me.

-- 
Kees Cook


Re: [PATCH v3] signal: add procfd_send_signal() syscall

2018-12-05 Thread Christian Brauner
On Wed, Dec 05, 2018 at 03:24:08PM -0800, Kees Cook wrote:
> On Wed, Dec 5, 2018 at 12:53 PM Christian Brauner  
> wrote:
> > On Wed, Dec 05, 2018 at 12:20:43PM -0600, Eric W. Biederman wrote:
> > > Christian Brauner  writes:
> > > > [1]:  https://lkml.org/lkml/2018/11/18/130
> > > > [2]:  
> > > > https://lore.kernel.org/lkml/874lbtjvtd@oldenburg2.str.redhat.com/
> > > > [3]:  
> > > > https://lore.kernel.org/lkml/20181204132604.aspfupwjgjx6f...@brauner.io/
> > > > [4]:  
> > > > https://lore.kernel.org/lkml/20181203180224.fkvw4kajtbvru...@brauner.io/
> > > > [5]:  
> > > > https://lore.kernel.org/lkml/20181121213946.ga10...@mail.hallyn.com/
> > > > [6]:  
> > > > https://lore.kernel.org/lkml/20181120103111.etlqp7zop34v6...@brauner.io/
> > > > [7]:  
> > > > https://lore.kernel.org/lkml/36323361-90bd-41af-ab5b-ee0d7ba02...@amacapital.net/
> > > > [8]:  https://lore.kernel.org/lkml/87tvjxp8pc@xmission.com/
> > > > [9]:  https://asciinema.org/a/X1J8eGhe3vCfBE2b9TXtTaSJ7
> > > > [10]: 
> > > > https://lore.kernel.org/lkml/20181203180224.fkvw4kajtbvru...@brauner.io/
> > > > [11]: 
> > > > https://lore.kernel.org/lkml/f53d6d38-3521-4c20-9034-5af447df6...@amacapital.net/
> 
> I nominate this for 2018's most-well-documented syscall commit log award. ;)

Hahaha. If I win can I get my price in beer(s)? :)

> 
> > > > +   /*
> > > > +* Give userspace a way to detect whether /proc//task/ fds
> > > > +* are supported.
> > > > +*/
> > > > +   ret = -EOPNOTSUPP;
> > > > +   if (proc_is_tid_procfd(f.file))
> > > > +   goto err;
> > >
> > >   -EBADF is the proper error code.
> >
> > This is done so that userspace has a way of figuring out that tid fds
> > are not yet supported. This has been discussed with Florian (see commit
> > message).
> 
> Right, we should keep this -EOPNOTSUPP.
> 
> > > > +   /* Is this a procfd? */
> > > > +   ret = -EINVAL;
> > > > +   if (!proc_is_tgid_procfd(f.file))
> > > > +   goto err;
> > >
> > >   -EBADF is the proper error code.
> 
> Yeah, EINVAL tends to be used for bad flags... this is more about an
> improper fd.
> 
> > >
> > > > +   /* Without CONFIG_PROC_FS proc_pid() returns NULL. */
> > > > +   pid = proc_pid(file_inode(f.file));
> > > > +   if (!pid)
> > > > +   goto err;
> > >
> > > Perhaps you want to fold the proc_pid into the proc_is_tgid_procfd
> > > call.  That way proc_pid can stay private to proc.
> >
> > Hm, I guess we can do that for now. My intention was to have reuseable
> > helpers but I guess it would be fine for now.
> >
> > >
> > > > +   if (!may_signal_procfd(pid))
> > > > +   goto err;
> > > > +
> 
> Does the ns parent checking in may_signal_procfd need any locking or
> RCU? I know pid and current namespaces are "pinned", but I don't know
> how parent ns works here. I'm assuming the parents are stuck until all
> children go away?

Yeah, since they are hierarchical killing an ancestor means killing the
children. Also, in case you're interested, there's precedent for that:
kernel/pid_namespace.c:static struct ns_common *pidns_get_parent(struct 
ns_common *ns)
I'm not using this function because a) I would have to special case the
initial test-case and b) it takes a get() on the pid ns which would
force us to use another put which is unnecessary.

> 
> > > > +   ret = kill_pid_info(sig, , pid);
> 
> Just double-checking for myself: this does not bypass
> security_task_kill(), so no problem there AFAIK.
> 
> Reviewed-by: Kees Cook 

Thanks! :)
As a sidenote I'm switching the name from procfd_send_signal() to
taskfd_send_signal(). It seems to me the best way to handle Eric's
request to reflect that we can eventually both signal tgids and tids.

> 
> -- 
> Kees Cook


Re: [PATCH v3] signal: add procfd_send_signal() syscall

2018-12-05 Thread Christian Brauner
On Wed, Dec 05, 2018 at 03:24:08PM -0800, Kees Cook wrote:
> On Wed, Dec 5, 2018 at 12:53 PM Christian Brauner  
> wrote:
> > On Wed, Dec 05, 2018 at 12:20:43PM -0600, Eric W. Biederman wrote:
> > > Christian Brauner  writes:
> > > > [1]:  https://lkml.org/lkml/2018/11/18/130
> > > > [2]:  
> > > > https://lore.kernel.org/lkml/874lbtjvtd@oldenburg2.str.redhat.com/
> > > > [3]:  
> > > > https://lore.kernel.org/lkml/20181204132604.aspfupwjgjx6f...@brauner.io/
> > > > [4]:  
> > > > https://lore.kernel.org/lkml/20181203180224.fkvw4kajtbvru...@brauner.io/
> > > > [5]:  
> > > > https://lore.kernel.org/lkml/20181121213946.ga10...@mail.hallyn.com/
> > > > [6]:  
> > > > https://lore.kernel.org/lkml/20181120103111.etlqp7zop34v6...@brauner.io/
> > > > [7]:  
> > > > https://lore.kernel.org/lkml/36323361-90bd-41af-ab5b-ee0d7ba02...@amacapital.net/
> > > > [8]:  https://lore.kernel.org/lkml/87tvjxp8pc@xmission.com/
> > > > [9]:  https://asciinema.org/a/X1J8eGhe3vCfBE2b9TXtTaSJ7
> > > > [10]: 
> > > > https://lore.kernel.org/lkml/20181203180224.fkvw4kajtbvru...@brauner.io/
> > > > [11]: 
> > > > https://lore.kernel.org/lkml/f53d6d38-3521-4c20-9034-5af447df6...@amacapital.net/
> 
> I nominate this for 2018's most-well-documented syscall commit log award. ;)

Hahaha. If I win can I get my price in beer(s)? :)

> 
> > > > +   /*
> > > > +* Give userspace a way to detect whether /proc//task/ fds
> > > > +* are supported.
> > > > +*/
> > > > +   ret = -EOPNOTSUPP;
> > > > +   if (proc_is_tid_procfd(f.file))
> > > > +   goto err;
> > >
> > >   -EBADF is the proper error code.
> >
> > This is done so that userspace has a way of figuring out that tid fds
> > are not yet supported. This has been discussed with Florian (see commit
> > message).
> 
> Right, we should keep this -EOPNOTSUPP.
> 
> > > > +   /* Is this a procfd? */
> > > > +   ret = -EINVAL;
> > > > +   if (!proc_is_tgid_procfd(f.file))
> > > > +   goto err;
> > >
> > >   -EBADF is the proper error code.
> 
> Yeah, EINVAL tends to be used for bad flags... this is more about an
> improper fd.
> 
> > >
> > > > +   /* Without CONFIG_PROC_FS proc_pid() returns NULL. */
> > > > +   pid = proc_pid(file_inode(f.file));
> > > > +   if (!pid)
> > > > +   goto err;
> > >
> > > Perhaps you want to fold the proc_pid into the proc_is_tgid_procfd
> > > call.  That way proc_pid can stay private to proc.
> >
> > Hm, I guess we can do that for now. My intention was to have reuseable
> > helpers but I guess it would be fine for now.
> >
> > >
> > > > +   if (!may_signal_procfd(pid))
> > > > +   goto err;
> > > > +
> 
> Does the ns parent checking in may_signal_procfd need any locking or
> RCU? I know pid and current namespaces are "pinned", but I don't know
> how parent ns works here. I'm assuming the parents are stuck until all
> children go away?

Yeah, since they are hierarchical killing an ancestor means killing the
children. Also, in case you're interested, there's precedent for that:
kernel/pid_namespace.c:static struct ns_common *pidns_get_parent(struct 
ns_common *ns)
I'm not using this function because a) I would have to special case the
initial test-case and b) it takes a get() on the pid ns which would
force us to use another put which is unnecessary.

> 
> > > > +   ret = kill_pid_info(sig, , pid);
> 
> Just double-checking for myself: this does not bypass
> security_task_kill(), so no problem there AFAIK.
> 
> Reviewed-by: Kees Cook 

Thanks! :)
As a sidenote I'm switching the name from procfd_send_signal() to
taskfd_send_signal(). It seems to me the best way to handle Eric's
request to reflect that we can eventually both signal tgids and tids.

> 
> -- 
> Kees Cook


Re: [PATCH v3] signal: add procfd_send_signal() syscall

2018-12-05 Thread Kees Cook
On Wed, Dec 5, 2018 at 12:53 PM Christian Brauner  wrote:
> On Wed, Dec 05, 2018 at 12:20:43PM -0600, Eric W. Biederman wrote:
> > Christian Brauner  writes:
> > > [1]:  https://lkml.org/lkml/2018/11/18/130
> > > [2]:  
> > > https://lore.kernel.org/lkml/874lbtjvtd@oldenburg2.str.redhat.com/
> > > [3]:  
> > > https://lore.kernel.org/lkml/20181204132604.aspfupwjgjx6f...@brauner.io/
> > > [4]:  
> > > https://lore.kernel.org/lkml/20181203180224.fkvw4kajtbvru...@brauner.io/
> > > [5]:  https://lore.kernel.org/lkml/20181121213946.ga10...@mail.hallyn.com/
> > > [6]:  
> > > https://lore.kernel.org/lkml/20181120103111.etlqp7zop34v6...@brauner.io/
> > > [7]:  
> > > https://lore.kernel.org/lkml/36323361-90bd-41af-ab5b-ee0d7ba02...@amacapital.net/
> > > [8]:  https://lore.kernel.org/lkml/87tvjxp8pc@xmission.com/
> > > [9]:  https://asciinema.org/a/X1J8eGhe3vCfBE2b9TXtTaSJ7
> > > [10]: 
> > > https://lore.kernel.org/lkml/20181203180224.fkvw4kajtbvru...@brauner.io/
> > > [11]: 
> > > https://lore.kernel.org/lkml/f53d6d38-3521-4c20-9034-5af447df6...@amacapital.net/

I nominate this for 2018's most-well-documented syscall commit log award. ;)

> > > +   /*
> > > +* Give userspace a way to detect whether /proc//task/ fds
> > > +* are supported.
> > > +*/
> > > +   ret = -EOPNOTSUPP;
> > > +   if (proc_is_tid_procfd(f.file))
> > > +   goto err;
> >
> >   -EBADF is the proper error code.
>
> This is done so that userspace has a way of figuring out that tid fds
> are not yet supported. This has been discussed with Florian (see commit
> message).

Right, we should keep this -EOPNOTSUPP.

> > > +   /* Is this a procfd? */
> > > +   ret = -EINVAL;
> > > +   if (!proc_is_tgid_procfd(f.file))
> > > +   goto err;
> >
> >   -EBADF is the proper error code.

Yeah, EINVAL tends to be used for bad flags... this is more about an
improper fd.

> >
> > > +   /* Without CONFIG_PROC_FS proc_pid() returns NULL. */
> > > +   pid = proc_pid(file_inode(f.file));
> > > +   if (!pid)
> > > +   goto err;
> >
> > Perhaps you want to fold the proc_pid into the proc_is_tgid_procfd
> > call.  That way proc_pid can stay private to proc.
>
> Hm, I guess we can do that for now. My intention was to have reuseable
> helpers but I guess it would be fine for now.
>
> >
> > > +   if (!may_signal_procfd(pid))
> > > +   goto err;
> > > +

Does the ns parent checking in may_signal_procfd need any locking or
RCU? I know pid and current namespaces are "pinned", but I don't know
how parent ns works here. I'm assuming the parents are stuck until all
children go away?

> > > +   ret = kill_pid_info(sig, , pid);

Just double-checking for myself: this does not bypass
security_task_kill(), so no problem there AFAIK.

Reviewed-by: Kees Cook 

-- 
Kees Cook


Re: [PATCH v3] signal: add procfd_send_signal() syscall

2018-12-05 Thread Kees Cook
On Wed, Dec 5, 2018 at 12:53 PM Christian Brauner  wrote:
> On Wed, Dec 05, 2018 at 12:20:43PM -0600, Eric W. Biederman wrote:
> > Christian Brauner  writes:
> > > [1]:  https://lkml.org/lkml/2018/11/18/130
> > > [2]:  
> > > https://lore.kernel.org/lkml/874lbtjvtd@oldenburg2.str.redhat.com/
> > > [3]:  
> > > https://lore.kernel.org/lkml/20181204132604.aspfupwjgjx6f...@brauner.io/
> > > [4]:  
> > > https://lore.kernel.org/lkml/20181203180224.fkvw4kajtbvru...@brauner.io/
> > > [5]:  https://lore.kernel.org/lkml/20181121213946.ga10...@mail.hallyn.com/
> > > [6]:  
> > > https://lore.kernel.org/lkml/20181120103111.etlqp7zop34v6...@brauner.io/
> > > [7]:  
> > > https://lore.kernel.org/lkml/36323361-90bd-41af-ab5b-ee0d7ba02...@amacapital.net/
> > > [8]:  https://lore.kernel.org/lkml/87tvjxp8pc@xmission.com/
> > > [9]:  https://asciinema.org/a/X1J8eGhe3vCfBE2b9TXtTaSJ7
> > > [10]: 
> > > https://lore.kernel.org/lkml/20181203180224.fkvw4kajtbvru...@brauner.io/
> > > [11]: 
> > > https://lore.kernel.org/lkml/f53d6d38-3521-4c20-9034-5af447df6...@amacapital.net/

I nominate this for 2018's most-well-documented syscall commit log award. ;)

> > > +   /*
> > > +* Give userspace a way to detect whether /proc//task/ fds
> > > +* are supported.
> > > +*/
> > > +   ret = -EOPNOTSUPP;
> > > +   if (proc_is_tid_procfd(f.file))
> > > +   goto err;
> >
> >   -EBADF is the proper error code.
>
> This is done so that userspace has a way of figuring out that tid fds
> are not yet supported. This has been discussed with Florian (see commit
> message).

Right, we should keep this -EOPNOTSUPP.

> > > +   /* Is this a procfd? */
> > > +   ret = -EINVAL;
> > > +   if (!proc_is_tgid_procfd(f.file))
> > > +   goto err;
> >
> >   -EBADF is the proper error code.

Yeah, EINVAL tends to be used for bad flags... this is more about an
improper fd.

> >
> > > +   /* Without CONFIG_PROC_FS proc_pid() returns NULL. */
> > > +   pid = proc_pid(file_inode(f.file));
> > > +   if (!pid)
> > > +   goto err;
> >
> > Perhaps you want to fold the proc_pid into the proc_is_tgid_procfd
> > call.  That way proc_pid can stay private to proc.
>
> Hm, I guess we can do that for now. My intention was to have reuseable
> helpers but I guess it would be fine for now.
>
> >
> > > +   if (!may_signal_procfd(pid))
> > > +   goto err;
> > > +

Does the ns parent checking in may_signal_procfd need any locking or
RCU? I know pid and current namespaces are "pinned", but I don't know
how parent ns works here. I'm assuming the parents are stuck until all
children go away?

> > > +   ret = kill_pid_info(sig, , pid);

Just double-checking for myself: this does not bypass
security_task_kill(), so no problem there AFAIK.

Reviewed-by: Kees Cook 

-- 
Kees Cook


Re: [PATCH v3] signal: add procfd_send_signal() syscall

2018-12-05 Thread Christian Brauner
On Wed, Dec 05, 2018 at 12:20:43PM -0600, Eric W. Biederman wrote:
> Christian Brauner  writes:
> 
> > The kill() syscall operates on process identifiers (pid). After a process
> > has exited its pid can be reused by another process. If a caller sends a
> > signal to a reused pid it will end up signaling the wrong process. This
> > issue has often surfaced and there has been a push [1] to address this
> > problem.
> >
> > This patch uses file descriptors (fd) from proc/ as stable handles on
> > struct pid. Even if a pid is recycled the handle will not change. The fd
> > can be used to send signals to the process it refers to.
> > Thus, the new syscall procfd_send_signal() is introduced to solve this
> > problem. Instead of pids it operates on process fds (procfd).
> >
> > /* prototype and argument /*
> > long procfd_send_signal(int fd, int sig, siginfo_t *info, unsigned int 
> > flags);
> >
> > In addition to the procfd and signal argument it takes an additional
> > siginfo_t and flags argument. If the siginfo_t argument is NULL then
> > procfd_send_signal() behaves like kill(). If it is not NULL
> > procfd_send_signal() behaves like rt_sigqueueinfo().
> > The flags argument is added to allow for future extensions of this syscall.
> > It currently needs to be passed as 0. Failing to do so will cause EINVAL.
> >
> > /* procfd_send_signal() replaces multiple pid-based syscalls */
> > The procfd_send_signal() syscall currently takes on the job of the
> > following syscalls that operate on pids:
> > - kill(2)
> > - rt_sigqueueinfo(2)
> > The syscall is defined in such a way that it can also operate on thread fds
> > instead of process fds. In a future patchset I will extend it to operate on
> > procfds from /proc//task/ at which point it will additionally
> > take on the job of:
> > - tgkill(2)
> > - rt_tgsigqueueinfo(2)
> > Right now this is intentionally left out to keep this patchset as simple as
> > possible (cf. [4]). If a procfd of /proc//task/ is passed
> > EOPNOTSUPP will be returned to give userspace a way to detect when I add
> > support for such procfds {cf. [10]).
> >
> > /* naming */
> > The name procfd_send_signal() was chosen to reflect the fact that it takes
> > on the job of multiple syscalls. It is intentional that the name is not
> > reminiscent of neither kill(2) nor rt_sigqueueinfo(2). Not the fomer
> > because it might imply that procfd_send_signal() is only a replacement for
> > kill(2) and not the latter because it is a hazzle to remember the correct
> > spelling (especially for non-native speakers) and because it is not
> > descriptive enough of what the syscall actually does. The name
> > "procfd_send_signal" makes it very clear that its job is to send signals.
> >
> > /* O_PATH file descriptors */
> > procfds opened as O_PATH fds cannot be used to send signals to a process
> > (cf. [2]). Signaling processes through procfds is the equivalent of writing
> > to a file. Thus, this is not an operation that operates "purely at the
> > file descriptor level" as required by the open(2) manpage.
> >
> > /* zombies */
> > Zombies can be signaled just as any other process. No special error will be
> > reported since a zombie state is an unreliable state (cf. [3]).
> >
> > /* cross-namespace signals */
> > The patch currently enforces that the signaler and signalee either are in
> > the same pid namespace or that the signaler's pid namespace is an ancestor
> > of the signalee's pid namespace. This is done for the sake of simplicity
> > and because it is unclear to what values certain members of struct
> > siginfo_t would need to be set to (cf. [5], [6]).
> >
> > /* compat syscalls */
> > It became clear that we would like to avoid adding compat syscalls (cf.
> > [7]). The compat syscall handling is now done in kernel/signal.c itself by
> > adding __copy_siginfo_from_user_generic() which lets us avoid compat
> > syscalls (cf. [8]).
> > With upcoming rework for syscall handling things might improve even further
> > (cf. [11]).
> > This patch was tested on x64 and x86.
> >
> > /* userspace usage */
> > An asciinema recording for the basic functionality can be found under [9].
> > With this patch a process can be killed via:
> >
> >  #define _GNU_SOURCE
> >  #include 
> >  #include 
> >  #include 
> >  #include 
> >  #include 
> >  #include 
> >  #include 
> >  #include 
> >  #include 
> >  #include 
> >
> >  static inline int do_procfd_send_signal(int procfd, int sig, siginfo_t 
> > *info,
> >  unsigned int flags)
> >  {
> >  #ifdef __NR_procfd_send_signal
> >  return syscall(__NR_procfd_send_signal, procfd, sig, info, flags);
> >  #else
> >  return -ENOSYS;
> >  #endif
> >  }
> >
> >  int main(int argc, char *argv[])
> >  {
> >  int fd, ret, saved_errno, sig;
> >
> >  if (argc < 3)
> >  exit(EXIT_FAILURE);
> >
> >  fd = open(argv[1], O_DIRECTORY | O_CLOEXEC);
> >  if (fd < 0) {
> >  

Re: [PATCH v3] signal: add procfd_send_signal() syscall

2018-12-05 Thread Christian Brauner
On Wed, Dec 05, 2018 at 12:20:43PM -0600, Eric W. Biederman wrote:
> Christian Brauner  writes:
> 
> > The kill() syscall operates on process identifiers (pid). After a process
> > has exited its pid can be reused by another process. If a caller sends a
> > signal to a reused pid it will end up signaling the wrong process. This
> > issue has often surfaced and there has been a push [1] to address this
> > problem.
> >
> > This patch uses file descriptors (fd) from proc/ as stable handles on
> > struct pid. Even if a pid is recycled the handle will not change. The fd
> > can be used to send signals to the process it refers to.
> > Thus, the new syscall procfd_send_signal() is introduced to solve this
> > problem. Instead of pids it operates on process fds (procfd).
> >
> > /* prototype and argument /*
> > long procfd_send_signal(int fd, int sig, siginfo_t *info, unsigned int 
> > flags);
> >
> > In addition to the procfd and signal argument it takes an additional
> > siginfo_t and flags argument. If the siginfo_t argument is NULL then
> > procfd_send_signal() behaves like kill(). If it is not NULL
> > procfd_send_signal() behaves like rt_sigqueueinfo().
> > The flags argument is added to allow for future extensions of this syscall.
> > It currently needs to be passed as 0. Failing to do so will cause EINVAL.
> >
> > /* procfd_send_signal() replaces multiple pid-based syscalls */
> > The procfd_send_signal() syscall currently takes on the job of the
> > following syscalls that operate on pids:
> > - kill(2)
> > - rt_sigqueueinfo(2)
> > The syscall is defined in such a way that it can also operate on thread fds
> > instead of process fds. In a future patchset I will extend it to operate on
> > procfds from /proc//task/ at which point it will additionally
> > take on the job of:
> > - tgkill(2)
> > - rt_tgsigqueueinfo(2)
> > Right now this is intentionally left out to keep this patchset as simple as
> > possible (cf. [4]). If a procfd of /proc//task/ is passed
> > EOPNOTSUPP will be returned to give userspace a way to detect when I add
> > support for such procfds {cf. [10]).
> >
> > /* naming */
> > The name procfd_send_signal() was chosen to reflect the fact that it takes
> > on the job of multiple syscalls. It is intentional that the name is not
> > reminiscent of neither kill(2) nor rt_sigqueueinfo(2). Not the fomer
> > because it might imply that procfd_send_signal() is only a replacement for
> > kill(2) and not the latter because it is a hazzle to remember the correct
> > spelling (especially for non-native speakers) and because it is not
> > descriptive enough of what the syscall actually does. The name
> > "procfd_send_signal" makes it very clear that its job is to send signals.
> >
> > /* O_PATH file descriptors */
> > procfds opened as O_PATH fds cannot be used to send signals to a process
> > (cf. [2]). Signaling processes through procfds is the equivalent of writing
> > to a file. Thus, this is not an operation that operates "purely at the
> > file descriptor level" as required by the open(2) manpage.
> >
> > /* zombies */
> > Zombies can be signaled just as any other process. No special error will be
> > reported since a zombie state is an unreliable state (cf. [3]).
> >
> > /* cross-namespace signals */
> > The patch currently enforces that the signaler and signalee either are in
> > the same pid namespace or that the signaler's pid namespace is an ancestor
> > of the signalee's pid namespace. This is done for the sake of simplicity
> > and because it is unclear to what values certain members of struct
> > siginfo_t would need to be set to (cf. [5], [6]).
> >
> > /* compat syscalls */
> > It became clear that we would like to avoid adding compat syscalls (cf.
> > [7]). The compat syscall handling is now done in kernel/signal.c itself by
> > adding __copy_siginfo_from_user_generic() which lets us avoid compat
> > syscalls (cf. [8]).
> > With upcoming rework for syscall handling things might improve even further
> > (cf. [11]).
> > This patch was tested on x64 and x86.
> >
> > /* userspace usage */
> > An asciinema recording for the basic functionality can be found under [9].
> > With this patch a process can be killed via:
> >
> >  #define _GNU_SOURCE
> >  #include 
> >  #include 
> >  #include 
> >  #include 
> >  #include 
> >  #include 
> >  #include 
> >  #include 
> >  #include 
> >  #include 
> >
> >  static inline int do_procfd_send_signal(int procfd, int sig, siginfo_t 
> > *info,
> >  unsigned int flags)
> >  {
> >  #ifdef __NR_procfd_send_signal
> >  return syscall(__NR_procfd_send_signal, procfd, sig, info, flags);
> >  #else
> >  return -ENOSYS;
> >  #endif
> >  }
> >
> >  int main(int argc, char *argv[])
> >  {
> >  int fd, ret, saved_errno, sig;
> >
> >  if (argc < 3)
> >  exit(EXIT_FAILURE);
> >
> >  fd = open(argv[1], O_DIRECTORY | O_CLOEXEC);
> >  if (fd < 0) {
> >  

Re: [PATCH v3] signal: add procfd_send_signal() syscall

2018-12-05 Thread Eric W. Biederman
Christian Brauner  writes:

> The kill() syscall operates on process identifiers (pid). After a process
> has exited its pid can be reused by another process. If a caller sends a
> signal to a reused pid it will end up signaling the wrong process. This
> issue has often surfaced and there has been a push [1] to address this
> problem.
>
> This patch uses file descriptors (fd) from proc/ as stable handles on
> struct pid. Even if a pid is recycled the handle will not change. The fd
> can be used to send signals to the process it refers to.
> Thus, the new syscall procfd_send_signal() is introduced to solve this
> problem. Instead of pids it operates on process fds (procfd).
>
> /* prototype and argument /*
> long procfd_send_signal(int fd, int sig, siginfo_t *info, unsigned int flags);
>
> In addition to the procfd and signal argument it takes an additional
> siginfo_t and flags argument. If the siginfo_t argument is NULL then
> procfd_send_signal() behaves like kill(). If it is not NULL
> procfd_send_signal() behaves like rt_sigqueueinfo().
> The flags argument is added to allow for future extensions of this syscall.
> It currently needs to be passed as 0. Failing to do so will cause EINVAL.
>
> /* procfd_send_signal() replaces multiple pid-based syscalls */
> The procfd_send_signal() syscall currently takes on the job of the
> following syscalls that operate on pids:
> - kill(2)
> - rt_sigqueueinfo(2)
> The syscall is defined in such a way that it can also operate on thread fds
> instead of process fds. In a future patchset I will extend it to operate on
> procfds from /proc//task/ at which point it will additionally
> take on the job of:
> - tgkill(2)
> - rt_tgsigqueueinfo(2)
> Right now this is intentionally left out to keep this patchset as simple as
> possible (cf. [4]). If a procfd of /proc//task/ is passed
> EOPNOTSUPP will be returned to give userspace a way to detect when I add
> support for such procfds {cf. [10]).
>
> /* naming */
> The name procfd_send_signal() was chosen to reflect the fact that it takes
> on the job of multiple syscalls. It is intentional that the name is not
> reminiscent of neither kill(2) nor rt_sigqueueinfo(2). Not the fomer
> because it might imply that procfd_send_signal() is only a replacement for
> kill(2) and not the latter because it is a hazzle to remember the correct
> spelling (especially for non-native speakers) and because it is not
> descriptive enough of what the syscall actually does. The name
> "procfd_send_signal" makes it very clear that its job is to send signals.
>
> /* O_PATH file descriptors */
> procfds opened as O_PATH fds cannot be used to send signals to a process
> (cf. [2]). Signaling processes through procfds is the equivalent of writing
> to a file. Thus, this is not an operation that operates "purely at the
> file descriptor level" as required by the open(2) manpage.
>
> /* zombies */
> Zombies can be signaled just as any other process. No special error will be
> reported since a zombie state is an unreliable state (cf. [3]).
>
> /* cross-namespace signals */
> The patch currently enforces that the signaler and signalee either are in
> the same pid namespace or that the signaler's pid namespace is an ancestor
> of the signalee's pid namespace. This is done for the sake of simplicity
> and because it is unclear to what values certain members of struct
> siginfo_t would need to be set to (cf. [5], [6]).
>
> /* compat syscalls */
> It became clear that we would like to avoid adding compat syscalls (cf.
> [7]). The compat syscall handling is now done in kernel/signal.c itself by
> adding __copy_siginfo_from_user_generic() which lets us avoid compat
> syscalls (cf. [8]).
> With upcoming rework for syscall handling things might improve even further
> (cf. [11]).
> This patch was tested on x64 and x86.
>
> /* userspace usage */
> An asciinema recording for the basic functionality can be found under [9].
> With this patch a process can be killed via:
>
>  #define _GNU_SOURCE
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>
>  static inline int do_procfd_send_signal(int procfd, int sig, siginfo_t *info,
>  unsigned int flags)
>  {
>  #ifdef __NR_procfd_send_signal
>  return syscall(__NR_procfd_send_signal, procfd, sig, info, flags);
>  #else
>  return -ENOSYS;
>  #endif
>  }
>
>  int main(int argc, char *argv[])
>  {
>  int fd, ret, saved_errno, sig;
>
>  if (argc < 3)
>  exit(EXIT_FAILURE);
>
>  fd = open(argv[1], O_DIRECTORY | O_CLOEXEC);
>  if (fd < 0) {
>  printf("%s - Failed to open \"%s\"\n", strerror(errno), 
> argv[1]);
>  exit(EXIT_FAILURE);
>  }
>
>  sig = atoi(argv[2]);
>
>  printf("Sending signal %d to process %s\n", sig, argv[1]);
>  ret = do_procfd_send_signal(fd, sig, NULL, 0);
>
> 

Re: [PATCH v3] signal: add procfd_send_signal() syscall

2018-12-05 Thread Eric W. Biederman
Christian Brauner  writes:

> The kill() syscall operates on process identifiers (pid). After a process
> has exited its pid can be reused by another process. If a caller sends a
> signal to a reused pid it will end up signaling the wrong process. This
> issue has often surfaced and there has been a push [1] to address this
> problem.
>
> This patch uses file descriptors (fd) from proc/ as stable handles on
> struct pid. Even if a pid is recycled the handle will not change. The fd
> can be used to send signals to the process it refers to.
> Thus, the new syscall procfd_send_signal() is introduced to solve this
> problem. Instead of pids it operates on process fds (procfd).
>
> /* prototype and argument /*
> long procfd_send_signal(int fd, int sig, siginfo_t *info, unsigned int flags);
>
> In addition to the procfd and signal argument it takes an additional
> siginfo_t and flags argument. If the siginfo_t argument is NULL then
> procfd_send_signal() behaves like kill(). If it is not NULL
> procfd_send_signal() behaves like rt_sigqueueinfo().
> The flags argument is added to allow for future extensions of this syscall.
> It currently needs to be passed as 0. Failing to do so will cause EINVAL.
>
> /* procfd_send_signal() replaces multiple pid-based syscalls */
> The procfd_send_signal() syscall currently takes on the job of the
> following syscalls that operate on pids:
> - kill(2)
> - rt_sigqueueinfo(2)
> The syscall is defined in such a way that it can also operate on thread fds
> instead of process fds. In a future patchset I will extend it to operate on
> procfds from /proc//task/ at which point it will additionally
> take on the job of:
> - tgkill(2)
> - rt_tgsigqueueinfo(2)
> Right now this is intentionally left out to keep this patchset as simple as
> possible (cf. [4]). If a procfd of /proc//task/ is passed
> EOPNOTSUPP will be returned to give userspace a way to detect when I add
> support for such procfds {cf. [10]).
>
> /* naming */
> The name procfd_send_signal() was chosen to reflect the fact that it takes
> on the job of multiple syscalls. It is intentional that the name is not
> reminiscent of neither kill(2) nor rt_sigqueueinfo(2). Not the fomer
> because it might imply that procfd_send_signal() is only a replacement for
> kill(2) and not the latter because it is a hazzle to remember the correct
> spelling (especially for non-native speakers) and because it is not
> descriptive enough of what the syscall actually does. The name
> "procfd_send_signal" makes it very clear that its job is to send signals.
>
> /* O_PATH file descriptors */
> procfds opened as O_PATH fds cannot be used to send signals to a process
> (cf. [2]). Signaling processes through procfds is the equivalent of writing
> to a file. Thus, this is not an operation that operates "purely at the
> file descriptor level" as required by the open(2) manpage.
>
> /* zombies */
> Zombies can be signaled just as any other process. No special error will be
> reported since a zombie state is an unreliable state (cf. [3]).
>
> /* cross-namespace signals */
> The patch currently enforces that the signaler and signalee either are in
> the same pid namespace or that the signaler's pid namespace is an ancestor
> of the signalee's pid namespace. This is done for the sake of simplicity
> and because it is unclear to what values certain members of struct
> siginfo_t would need to be set to (cf. [5], [6]).
>
> /* compat syscalls */
> It became clear that we would like to avoid adding compat syscalls (cf.
> [7]). The compat syscall handling is now done in kernel/signal.c itself by
> adding __copy_siginfo_from_user_generic() which lets us avoid compat
> syscalls (cf. [8]).
> With upcoming rework for syscall handling things might improve even further
> (cf. [11]).
> This patch was tested on x64 and x86.
>
> /* userspace usage */
> An asciinema recording for the basic functionality can be found under [9].
> With this patch a process can be killed via:
>
>  #define _GNU_SOURCE
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>
>  static inline int do_procfd_send_signal(int procfd, int sig, siginfo_t *info,
>  unsigned int flags)
>  {
>  #ifdef __NR_procfd_send_signal
>  return syscall(__NR_procfd_send_signal, procfd, sig, info, flags);
>  #else
>  return -ENOSYS;
>  #endif
>  }
>
>  int main(int argc, char *argv[])
>  {
>  int fd, ret, saved_errno, sig;
>
>  if (argc < 3)
>  exit(EXIT_FAILURE);
>
>  fd = open(argv[1], O_DIRECTORY | O_CLOEXEC);
>  if (fd < 0) {
>  printf("%s - Failed to open \"%s\"\n", strerror(errno), 
> argv[1]);
>  exit(EXIT_FAILURE);
>  }
>
>  sig = atoi(argv[2]);
>
>  printf("Sending signal %d to process %s\n", sig, argv[1]);
>  ret = do_procfd_send_signal(fd, sig, NULL, 0);
>
> 

Re: [PATCH v3] signal: add procfd_send_signal() syscall

2018-12-05 Thread Eric W. Biederman
Christian Brauner  writes:

> The kill() syscall operates on process identifiers (pid). After a process
> has exited its pid can be reused by another process. If a caller sends a
> signal to a reused pid it will end up signaling the wrong process. This
> issue has often surfaced and there has been a push [1] to address this
> problem.
>
> This patch uses file descriptors (fd) from proc/ as stable handles on
> struct pid. Even if a pid is recycled the handle will not change. The fd
> can be used to send signals to the process it refers to.
> Thus, the new syscall procfd_send_signal() is introduced to solve this
> problem. Instead of pids it operates on process fds (procfd).
>
> /* prototype and argument /*
> long procfd_send_signal(int fd, int sig, siginfo_t *info, unsigned int flags);
>
> In addition to the procfd and signal argument it takes an additional
> siginfo_t and flags argument. If the siginfo_t argument is NULL then
> procfd_send_signal() behaves like kill(). If it is not NULL
> procfd_send_signal() behaves like rt_sigqueueinfo().
> The flags argument is added to allow for future extensions of this syscall.
> It currently needs to be passed as 0. Failing to do so will cause EINVAL.
>
> /* procfd_send_signal() replaces multiple pid-based syscalls */
> The procfd_send_signal() syscall currently takes on the job of the
> following syscalls that operate on pids:
> - kill(2)
> - rt_sigqueueinfo(2)
> The syscall is defined in such a way that it can also operate on thread fds
> instead of process fds. In a future patchset I will extend it to operate on
> procfds from /proc//task/ at which point it will additionally
> take on the job of:
> - tgkill(2)
> - rt_tgsigqueueinfo(2)
> Right now this is intentionally left out to keep this patchset as simple as
> possible (cf. [4]). If a procfd of /proc//task/ is passed
> EOPNOTSUPP will be returned to give userspace a way to detect when I add
> support for such procfds {cf. [10]).
>
> /* naming */
> The name procfd_send_signal() was chosen to reflect the fact that it takes
> on the job of multiple syscalls. It is intentional that the name is not
> reminiscent of neither kill(2) nor rt_sigqueueinfo(2). Not the fomer
> because it might imply that procfd_send_signal() is only a replacement for
> kill(2) and not the latter because it is a hazzle to remember the correct
> spelling (especially for non-native speakers) and because it is not
> descriptive enough of what the syscall actually does. The name
> "procfd_send_signal" makes it very clear that its job is to send signals.
>
> /* O_PATH file descriptors */
> procfds opened as O_PATH fds cannot be used to send signals to a process
> (cf. [2]). Signaling processes through procfds is the equivalent of writing
> to a file. Thus, this is not an operation that operates "purely at the
> file descriptor level" as required by the open(2) manpage.
>
> /* zombies */
> Zombies can be signaled just as any other process. No special error will be
> reported since a zombie state is an unreliable state (cf. [3]).
>
> /* cross-namespace signals */
> The patch currently enforces that the signaler and signalee either are in
> the same pid namespace or that the signaler's pid namespace is an ancestor
> of the signalee's pid namespace. This is done for the sake of simplicity
> and because it is unclear to what values certain members of struct
> siginfo_t would need to be set to (cf. [5], [6]).
>
> /* compat syscalls */
> It became clear that we would like to avoid adding compat syscalls (cf.
> [7]). The compat syscall handling is now done in kernel/signal.c itself by
> adding __copy_siginfo_from_user_generic() which lets us avoid compat
> syscalls (cf. [8]).
> With upcoming rework for syscall handling things might improve even further
> (cf. [11]).
> This patch was tested on x64 and x86.
>
> /* userspace usage */
> An asciinema recording for the basic functionality can be found under [9].
> With this patch a process can be killed via:
>
>  #define _GNU_SOURCE
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>
>  static inline int do_procfd_send_signal(int procfd, int sig, siginfo_t *info,
>  unsigned int flags)
>  {
>  #ifdef __NR_procfd_send_signal
>  return syscall(__NR_procfd_send_signal, procfd, sig, info, flags);
>  #else
>  return -ENOSYS;
>  #endif
>  }
>
>  int main(int argc, char *argv[])
>  {
>  int fd, ret, saved_errno, sig;
>
>  if (argc < 3)
>  exit(EXIT_FAILURE);
>
>  fd = open(argv[1], O_DIRECTORY | O_CLOEXEC);
>  if (fd < 0) {
>  printf("%s - Failed to open \"%s\"\n", strerror(errno), 
> argv[1]);
>  exit(EXIT_FAILURE);
>  }
>
>  sig = atoi(argv[2]);
>
>  printf("Sending signal %d to process %s\n", sig, argv[1]);
>  ret = do_procfd_send_signal(fd, sig, NULL, 0);
>
> 

Re: [PATCH v3] signal: add procfd_send_signal() syscall

2018-12-05 Thread Eric W. Biederman
Christian Brauner  writes:

> The kill() syscall operates on process identifiers (pid). After a process
> has exited its pid can be reused by another process. If a caller sends a
> signal to a reused pid it will end up signaling the wrong process. This
> issue has often surfaced and there has been a push [1] to address this
> problem.
>
> This patch uses file descriptors (fd) from proc/ as stable handles on
> struct pid. Even if a pid is recycled the handle will not change. The fd
> can be used to send signals to the process it refers to.
> Thus, the new syscall procfd_send_signal() is introduced to solve this
> problem. Instead of pids it operates on process fds (procfd).
>
> /* prototype and argument /*
> long procfd_send_signal(int fd, int sig, siginfo_t *info, unsigned int flags);
>
> In addition to the procfd and signal argument it takes an additional
> siginfo_t and flags argument. If the siginfo_t argument is NULL then
> procfd_send_signal() behaves like kill(). If it is not NULL
> procfd_send_signal() behaves like rt_sigqueueinfo().
> The flags argument is added to allow for future extensions of this syscall.
> It currently needs to be passed as 0. Failing to do so will cause EINVAL.
>
> /* procfd_send_signal() replaces multiple pid-based syscalls */
> The procfd_send_signal() syscall currently takes on the job of the
> following syscalls that operate on pids:
> - kill(2)
> - rt_sigqueueinfo(2)
> The syscall is defined in such a way that it can also operate on thread fds
> instead of process fds. In a future patchset I will extend it to operate on
> procfds from /proc//task/ at which point it will additionally
> take on the job of:
> - tgkill(2)
> - rt_tgsigqueueinfo(2)
> Right now this is intentionally left out to keep this patchset as simple as
> possible (cf. [4]). If a procfd of /proc//task/ is passed
> EOPNOTSUPP will be returned to give userspace a way to detect when I add
> support for such procfds {cf. [10]).
>
> /* naming */
> The name procfd_send_signal() was chosen to reflect the fact that it takes
> on the job of multiple syscalls. It is intentional that the name is not
> reminiscent of neither kill(2) nor rt_sigqueueinfo(2). Not the fomer
> because it might imply that procfd_send_signal() is only a replacement for
> kill(2) and not the latter because it is a hazzle to remember the correct
> spelling (especially for non-native speakers) and because it is not
> descriptive enough of what the syscall actually does. The name
> "procfd_send_signal" makes it very clear that its job is to send signals.
>
> /* O_PATH file descriptors */
> procfds opened as O_PATH fds cannot be used to send signals to a process
> (cf. [2]). Signaling processes through procfds is the equivalent of writing
> to a file. Thus, this is not an operation that operates "purely at the
> file descriptor level" as required by the open(2) manpage.
>
> /* zombies */
> Zombies can be signaled just as any other process. No special error will be
> reported since a zombie state is an unreliable state (cf. [3]).
>
> /* cross-namespace signals */
> The patch currently enforces that the signaler and signalee either are in
> the same pid namespace or that the signaler's pid namespace is an ancestor
> of the signalee's pid namespace. This is done for the sake of simplicity
> and because it is unclear to what values certain members of struct
> siginfo_t would need to be set to (cf. [5], [6]).
>
> /* compat syscalls */
> It became clear that we would like to avoid adding compat syscalls (cf.
> [7]). The compat syscall handling is now done in kernel/signal.c itself by
> adding __copy_siginfo_from_user_generic() which lets us avoid compat
> syscalls (cf. [8]).
> With upcoming rework for syscall handling things might improve even further
> (cf. [11]).
> This patch was tested on x64 and x86.
>
> /* userspace usage */
> An asciinema recording for the basic functionality can be found under [9].
> With this patch a process can be killed via:
>
>  #define _GNU_SOURCE
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>
>  static inline int do_procfd_send_signal(int procfd, int sig, siginfo_t *info,
>  unsigned int flags)
>  {
>  #ifdef __NR_procfd_send_signal
>  return syscall(__NR_procfd_send_signal, procfd, sig, info, flags);
>  #else
>  return -ENOSYS;
>  #endif
>  }
>
>  int main(int argc, char *argv[])
>  {
>  int fd, ret, saved_errno, sig;
>
>  if (argc < 3)
>  exit(EXIT_FAILURE);
>
>  fd = open(argv[1], O_DIRECTORY | O_CLOEXEC);
>  if (fd < 0) {
>  printf("%s - Failed to open \"%s\"\n", strerror(errno), 
> argv[1]);
>  exit(EXIT_FAILURE);
>  }
>
>  sig = atoi(argv[2]);
>
>  printf("Sending signal %d to process %s\n", sig, argv[1]);
>  ret = do_procfd_send_signal(fd, sig, NULL, 0);
>
> 

Re: [PATCH v3] signal: add procfd_send_signal() syscall

2018-12-05 Thread kbuild test robot
Hi Christian,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v4.20-rc5]
[cannot apply to next-20181205]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Christian-Brauner/signal-add-procfd_send_signal-syscall/20181205-174512
config: sh-titan_defconfig (attached as .config)
compiler: sh4-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
GCC_VERSION=7.2.0 make.cross ARCH=sh 

All warnings (new ones prefixed by >>):

   :1317:2: warning: #warning syscall pkey_mprotect not implemented 
[-Wcpp]
   :1320:2: warning: #warning syscall pkey_alloc not implemented [-Wcpp]
   :1323:2: warning: #warning syscall pkey_free not implemented [-Wcpp]
   :1326:2: warning: #warning syscall statx not implemented [-Wcpp]
   :1332:2: warning: #warning syscall io_pgetevents not implemented 
[-Wcpp]
   :1335:2: warning: #warning syscall rseq not implemented [-Wcpp]
>> :1338:2: warning: #warning syscall procfd_send_signal not implemented 
>> [-Wcpp]

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


Re: [PATCH v3] signal: add procfd_send_signal() syscall

2018-12-05 Thread kbuild test robot
Hi Christian,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v4.20-rc5]
[cannot apply to next-20181205]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Christian-Brauner/signal-add-procfd_send_signal-syscall/20181205-174512
config: sh-titan_defconfig (attached as .config)
compiler: sh4-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
GCC_VERSION=7.2.0 make.cross ARCH=sh 

All warnings (new ones prefixed by >>):

   :1317:2: warning: #warning syscall pkey_mprotect not implemented 
[-Wcpp]
   :1320:2: warning: #warning syscall pkey_alloc not implemented [-Wcpp]
   :1323:2: warning: #warning syscall pkey_free not implemented [-Wcpp]
   :1326:2: warning: #warning syscall statx not implemented [-Wcpp]
   :1332:2: warning: #warning syscall io_pgetevents not implemented 
[-Wcpp]
   :1335:2: warning: #warning syscall rseq not implemented [-Wcpp]
>> :1338:2: warning: #warning syscall procfd_send_signal not implemented 
>> [-Wcpp]

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip