Re: [PATCH v7 0/6] seccomp trap to userspace
Hi Tycho,

On 09/27/2018 05:11 PM, Tycho Andersen wrote:

Hi all,

Here's v7 of the seccomp trap to userspace set. There are various minor
changes and bug fixes, but two major changes:

* We now pass fds to the tracee via an ioctl, and do it immediately when
  the ioctl is called. For this we needed some help from the vfs, so I've
  put the one patch in this series and cc'd fsdevel. This does have the
  advantage that the feature is now totally decoupled from the rest of the
  set, which is itself useful (thanks Andy!)

* Instead of putting all of the notification related stuff into the
  struct seccomp_filter, it now lives in its own struct notification,
  which is pointed to by struct seccomp_filter. This will save a lot of
  memory (thanks Tyler!)

Is there a documentation (man page) patch for this API change?

Thanks,

Michael

v6 discussion: https://lkml.org/lkml/2018/9/6/769

Thoughts welcome,

Tycho

Tycho Andersen (6):
  seccomp: add a return code to trap to userspace
  seccomp: make get_nth_filter available outside of CHECKPOINT_RESTORE
  seccomp: add a way to get a listener fd from ptrace
  files: add a replace_fd_files() function
  seccomp: add a way to pass FDs via a notification fd
  samples: add an example of seccomp user trap

 Documentation/ioctl/ioctl-number.txt | 1 +
 .../userspace-api/seccomp_filter.rst | 89 +++
 fs/file.c | 22 +-
 include/linux/file.h | 8 +
 include/linux/seccomp.h | 14 +-
 include/uapi/linux/ptrace.h | 2 +
 include/uapi/linux/seccomp.h | 42 +-
 kernel/ptrace.c | 4 +
 kernel/seccomp.c | 527 ++-
 samples/seccomp/.gitignore | 1 +
 samples/seccomp/Makefile | 7 +-
 samples/seccomp/user-trap.c | 312 +
 tools/testing/selftests/seccomp/seccomp_bpf.c | 607 +-
 13 files changed, 1617 insertions(+), 19 deletions(-)
 create mode 100644 samples/seccomp/user-trap.c
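
The memory argument in the second bullet of the cover letter can be illustrated with a small sketch. This is not the kernel's actual layout: the field names and the two filter structs below are hypothetical, chosen only to show why a filter that holds a pointer to separately allocated notification state is smaller than one that embeds that state, since most filters never use notifications.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical notification state; the fields are illustrative only
 * and are not taken from the patch series. */
struct notification {
    uint64_t next_id;
    int32_t  pending;
    void    *queue_head;
    void    *queue_tail;
    void    *waiters;
};

/* Layout A (assumed "before"): every filter pays for the full
 * notification state, whether or not it ever has a listener. */
struct filter_embedded {
    unsigned int        usage;
    struct notification notif;
};

/* Layout B (assumed "after"): a filter carries one pointer, which can
 * stay NULL until a listener is actually requested. */
struct filter_pointer {
    unsigned int         usage;
    struct notification *notif;
};

/* Bytes saved per filter that never allocates notification state. */
static size_t bytes_saved_per_filter(void)
{
    return sizeof(struct filter_embedded) - sizeof(struct filter_pointer);
}
```

The saving applies only to filters without a listener; a filter that does allocate the state pays for the pointer on top of the state itself.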
Re: [PATCH] io_submit.2: Add IOCB_FLAG_IOPRIO
Hello Adam,

On 07/13/2018 10:58 PM, adam.manzana...@wdc.com wrote:

From: Adam Manzanares

The newly added IOCB_FLAG_IOPRIO aio_flag introduces new behaviors and
return values. The details of this new feature are posted here:
https://lkml.org/lkml/2018/5/22/809

Thanks for this patch. I've applied it, but I have a question below about
a detail that probably needs fixing.

Signed-off-by: Adam Manzanares
---
 man2/io_submit.2 | 34 +++---
 1 file changed, 27 insertions(+), 7 deletions(-)

diff --git a/man2/io_submit.2 b/man2/io_submit.2
index d17e3122a..15e1ecdea 100644
--- a/man2/io_submit.2
+++ b/man2/io_submit.2
@@ -164,14 +164,26 @@ This is the size of the buffer pointed to by
 This is the file offset at which the I/O operation is to be performed.
 .TP
 .I aio_flags
-This is the flag to be passed iocb structure.
-The only valid value is
-.BR IOCB_FLAG_RESFD ,
-which indicates that the asynchronous I/O control must signal the file
+This is the set of flags associated with the iocb structure.
+The valid values are:
+.RS
+.TP
+.BR IOCB_FLAG_RESFD
+Asynchronous I/O control must signal the file
 descriptor mentioned in
 .I aio_resfd
 upon completion.
 .TP
+.BR IOCB_FLAG_IOPRIO " (since Linux 4.18)"
+.\" commit d9a08a9e616beeccdbd0e7262b7225ffdfa49e92
+Interpret the
+.I aio_reqprio
+field as an
+.B IOPRIO_VALUE
+as defined by
+.IR linux/ioprio.h.
+.RE
+.TP
 .I aio_resfd
 The file descriptor to signal in the event of asynchronous I/O completion.
 .SH RETURN VALUE
@@ -196,13 +208,21 @@ The AIO context specified by \fIctx_id\fP is invalid.
 \fInr\fP is less than 0.
 The \fIiocb\fP at
 .I *iocbpp[0]
-is not properly initialized,
-or the operation specified is invalid for the file descriptor
-in the \fIiocb\fP.
+is not properly initialized, the operation specified is invalid for the file
+descriptor in the \fIiocb\fP, or the value in the
+.I aio_reqprio
+field is invalid.
 .TP
 .B ENOSYS
 .BR io_submit ()
 is not implemented on this architecture.
+.TP
+.B EPERM
+The aio_reqprio field is set with the class
+.B IOPRIO_CLASS_RT
+, but the submitting context does not have the

What does "submitting context" mean? Threads/tasks/processes have
capabilities. Can you rephrase in terms of processes/threads?

Thanks,

Michael

+.B CAP_SYS_ADMIN
+privilege.
 .SH VERSIONS
 .PP
 The asynchronous I/O system calls first appeared in Linux 2.5.
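
For readers following the patch, here is a minimal sketch of how a caller might fill in an iocb so that aio_reqprio is interpreted as an I/O priority. It assumes the kernel's usual priority encoding (class in the bits above a 13-bit shift, level in the low bits); the `SKETCH_`/`sketch_` names are hypothetical local helpers, defined here because linux/ioprio.h may not be installed as a userspace header on every system. Only `struct iocb`, `IOCB_CMD_PREAD` and `IOCB_FLAG_IOPRIO` come from linux/aio_abi.h.

```c
#include <linux/aio_abi.h>
#include <stdint.h>
#include <string.h>

/* Older installed headers may predate Linux 4.18; this matches the
 * kernel's definition of the flag. */
#ifndef IOCB_FLAG_IOPRIO
#define IOCB_FLAG_IOPRIO (1 << 1)
#endif

/* Local copies of the kernel's ioprio encoding (hypothetical names). */
#define SKETCH_IOPRIO_CLASS_SHIFT 13
enum {
    SKETCH_IOPRIO_CLASS_RT   = 1,  /* per the patch: EPERM without CAP_SYS_ADMIN */
    SKETCH_IOPRIO_CLASS_BE   = 2,
    SKETCH_IOPRIO_CLASS_IDLE = 3
};

/* Combine a priority class and a level into one priority value. */
static uint16_t sketch_ioprio_value(unsigned cls, unsigned level)
{
    return (uint16_t)((cls << SKETCH_IOPRIO_CLASS_SHIFT) | level);
}

/* Prepare a positioned read whose aio_reqprio carries an I/O priority.
 * The caller would then pass the iocb to io_submit(2). */
static void sketch_prep_pread_prio(struct iocb *cb, int fd, void *buf,
                                   size_t len, long long off,
                                   unsigned cls, unsigned level)
{
    memset(cb, 0, sizeof(*cb));
    cb->aio_fildes     = fd;
    cb->aio_lio_opcode = IOCB_CMD_PREAD;
    cb->aio_buf        = (uint64_t)(uintptr_t)buf;
    cb->aio_nbytes     = len;
    cb->aio_offset     = off;
    cb->aio_flags      = IOCB_FLAG_IOPRIO;  /* tells the kernel to read aio_reqprio */
    cb->aio_reqprio    = (int16_t)sketch_ioprio_value(cls, level);
}
```

Without IOCB_FLAG_IOPRIO in aio_flags, the value in aio_reqprio is not treated as a priority, so the flag and the field have to be set together. An unprivileged caller would normally stay with the best-effort or idle class, given the EPERM case documented in the patch.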
Re: [PATCH v3][manpages 1/2] perf_event_open.2: Document PERF_EVENT_IOC_PAUSE_OUTPUT
Hello Wangnan,

On 10/24/2016 08:52 AM, Wang Nan wrote:

Linux 4.7 (86e7972f690c1017fd086cdfe53d8524e68c661c) introduces
PERF_EVENT_IOC_PAUSE_OUTPUT feature. Document it.

Just to confirm, I presume this patch has been superseded by the one from
Vince that I just applied.

Cheers,

Michael

Signed-off-by: Wang Nan
Reviewed-by: Vince Weaver
Cc: Michael Kerrisk
---
 man2/perf_event_open.2 | 24
 1 file changed, 24 insertions(+)

diff --git a/man2/perf_event_open.2 b/man2/perf_event_open.2
index fade28c..561331c 100644
--- a/man2/perf_event_open.2
+++ b/man2/perf_event_open.2
@@ -1687,6 +1687,15 @@ the
 .I data_tail
 value should be written by user space to reflect the last read data.
 In this case, the kernel will not overwrite unread data.
+
+When the mapping is read only (without
+.BR PROT_WRITE ),
+setting .I data_tail is not allowed.
+In this case, the kernel will overwrite data when sample coming, unless
+the ring buffer is paused by a
+.BR PERF_EVENT_IOC_PAUSE_OUTPUT
+.BR ioctl (2)
+system call before reading.
 .TP
 .IR data_offset " (since Linux 4.1)"
 .\" commit e8c6deac69629c0cb97c3d3272f8631ef17f8f0f
@@ -2865,6 +2874,21 @@ The argument is a BPF program file descriptor that was created by
 a previous
 .BR bpf (2)
 system call.
+.TP
+.BR PERF_EVENT_IOC_PAUSE_OUTPUT " (since Linux 4.7)"
+.\" commit 86e7972f690c1017fd086cdfe53d8524e68c661c
+This allows pausing and resuming the event's ring-buffer. A
+paused ring-buffer does not prevent generation of samples, but simply
+discards the samples. The discarded samples are considered lost,
+causing
+.BR PERF_RECORD_LOST
+to be generated when possible.
+
+The argument is an integer. A nonzero value pauses the ring-buffer,
+zero resumes the ring-buffer.
+
+Pausing a read only ring buffer before reading from it without having
+to worry about data being overwritten.
 .SS Using prctl(2)
 A process can enable or disable all the event groups that are
 attached to it using the