Re: [PATCH v7 0/6] seccomp trap to userspace

2018-09-28 Thread Michael Kerrisk (man-pages)

Hi Tycho,

On 09/27/2018 05:11 PM, Tycho Andersen wrote:

Hi all,

Here's v7 of the seccomp trap to userspace set. There are various minor
changes and bug fixes, but two major changes:

* We now pass fds to the tracee via an ioctl, and do it immediately when
   the ioctl is called. For this we needed some help from the vfs, so
   I've put the one patch in this series and cc'd fsdevel. This does have
   the advantage that the feature is now totally decoupled from the rest
   of the set, which is itself useful (thanks Andy!)

* Instead of putting all of the notification related stuff into the
   struct seccomp_filter, it now lives in its own struct notification,
   which is pointed to by struct seccomp_filter. This will save a lot of
   memory (thanks Tyler!)


Is there a documentation (man page) patch for this API change?

Thanks,

Michael


v6 discussion: https://lkml.org/lkml/2018/9/6/769

Thoughts welcome,

Tycho

Tycho Andersen (6):
   seccomp: add a return code to trap to userspace
   seccomp: make get_nth_filter available outside of CHECKPOINT_RESTORE
   seccomp: add a way to get a listener fd from ptrace
   files: add a replace_fd_files() function
   seccomp: add a way to pass FDs via a notification fd
   samples: add an example of seccomp user trap

  Documentation/ioctl/ioctl-number.txt  |   1 +
  .../userspace-api/seccomp_filter.rst  |  89 +++
  fs/file.c |  22 +-
  include/linux/file.h  |   8 +
  include/linux/seccomp.h   |  14 +-
  include/uapi/linux/ptrace.h   |   2 +
  include/uapi/linux/seccomp.h  |  42 +-
  kernel/ptrace.c   |   4 +
  kernel/seccomp.c  | 527 ++-
  samples/seccomp/.gitignore|   1 +
  samples/seccomp/Makefile  |   7 +-
  samples/seccomp/user-trap.c   | 312 +
  tools/testing/selftests/seccomp/seccomp_bpf.c | 607 +-
  13 files changed, 1617 insertions(+), 19 deletions(-)
  create mode 100644 samples/seccomp/user-trap.c



Re: [PATCH] io_submit.2: Add IOCB_FLAG_IOPRIO

2018-08-13 Thread Michael Kerrisk (man-pages)

Hello Adam,

On 07/13/2018 10:58 PM, adam.manzana...@wdc.com wrote:

From: Adam Manzanares 

The newly added IOCB_FLAG_IOPRIO aio_flag introduces
new behaviors and return values.

The details of this new feature are posted here:
https://lkml.org/lkml/2018/5/22/809


Thanks for this patch. I've applied it, but I have a question below 
about a detail that probably needs fixing.



Signed-off-by: Adam Manzanares 
---
  man2/io_submit.2 | 34 +++---
  1 file changed, 27 insertions(+), 7 deletions(-)

diff --git a/man2/io_submit.2 b/man2/io_submit.2
index d17e3122a..15e1ecdea 100644
--- a/man2/io_submit.2
+++ b/man2/io_submit.2
@@ -164,14 +164,26 @@ This is the size of the buffer pointed to by
  This is the file offset at which the I/O operation is to be performed.
  .TP
  .I aio_flags
-This is the flag to be passed iocb structure.
-The only valid value is
-.BR IOCB_FLAG_RESFD ,
-which indicates that the asynchronous I/O control must signal the file
+This is the set of flags associated with the iocb structure.
+The valid values are:
+.RS
+.TP
+.BR IOCB_FLAG_RESFD
+Asynchronous I/O control must signal the file
  descriptor mentioned in
  .I aio_resfd
  upon completion.
  .TP
+.BR IOCB_FLAG_IOPRIO " (since Linux 4.18)"
+.\" commit d9a08a9e616beeccdbd0e7262b7225ffdfa49e92
+Interpret the
+.I aio_reqprio
+field as an
+.B IOPRIO_VALUE
+as defined by
+.IR linux/ioprio.h.
+.RE
+.TP
  .I aio_resfd
  The file descriptor to signal in the event of asynchronous I/O completion.
  .SH RETURN VALUE
@@ -196,13 +208,21 @@ The AIO context specified by \fIctx_id\fP is invalid.
  \fInr\fP is less than 0.
  The \fIiocb\fP at
  .I *iocbpp[0]
-is not properly initialized,
-or the operation specified is invalid for the file descriptor
-in the \fIiocb\fP.
+is not properly initialized, the operation specified is invalid for the file
+descriptor in the \fIiocb\fP, or the value in the
+.I aio_reqprio
+field is invalid.
  .TP
  .B ENOSYS
  .BR io_submit ()
  is not implemented on this architecture.
+.TP
+.B EPERM
+The aio_reqprio field is set with the class
+.B IOPRIO_CLASS_RT
+, but the submitting context does not have the


What does "submitting context" mean? Threads/tasks/processes have
capabilities. Can you rephrase in terms of processes/threads?

Thanks,

Michael


+.B CAP_SYS_ADMIN
+privilege.
  .SH VERSIONS
  .PP
  The asynchronous I/O system calls first appeared in Linux 2.5.
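
For readers following the patch, here is a minimal sketch of how a caller might fill in an iocb so that aio_reqprio is interpreted as an I/O priority value. The helper name prep_pread_with_prio and the EX_-prefixed macros are hypothetical and not part of the patch. The macros copy the kernel's ioprio encoding (class in the bits above a shift of 13) locally, on the assumption that linux/ioprio.h may not be installed as a userspace header. The fallback definition of IOCB_FLAG_IOPRIO is only used when the installed linux/aio_abi.h predates Linux 4.18.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <linux/aio_abi.h>

/* Fallback for headers older than Linux 4.18. */
#ifndef IOCB_FLAG_IOPRIO
#define IOCB_FLAG_IOPRIO (1 << 1)
#endif

/* Local, hypothetical copies of the kernel's ioprio encoding. */
#define EX_IOPRIO_CLASS_SHIFT 13
#define EX_IOPRIO_CLASS_RT    1
#define EX_IOPRIO_CLASS_BE    2
#define EX_IOPRIO_PRIO_VALUE(cls, lvl) \
	(((cls) << EX_IOPRIO_CLASS_SHIFT) | (lvl))

/*
 * Prepare a read request whose aio_reqprio field carries an I/O
 * priority. The request is only filled in here; submitting it with
 * io_submit() is left to the caller.
 */
void prep_pread_with_prio(struct iocb *cb, int fd, void *buf,
			  size_t len, long long off, int cls, int lvl)
{
	assert(lvl >= 0 && lvl < 8);	/* eight levels per class */

	memset(cb, 0, sizeof(*cb));
	cb->aio_lio_opcode = IOCB_CMD_PREAD;
	cb->aio_fildes = (uint32_t)fd;
	cb->aio_buf = (uint64_t)(uintptr_t)buf;
	cb->aio_nbytes = len;
	cb->aio_offset = off;

	/* Without this flag the kernel ignores aio_reqprio. */
	cb->aio_flags = IOCB_FLAG_IOPRIO;
	cb->aio_reqprio = (int16_t)EX_IOPRIO_PRIO_VALUE(cls, lvl);
}
```

Per the patch text above, passing EX_IOPRIO_CLASS_RT without CAP_SYS_ADMIN would be expected to make io_submit() fail with EPERM, and an invalid priority value with EINVAL.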



Re: [PATCH v3][manpages 1/2] perf_event_open.2: Document PERF_EVENT_IOC_PAUSE_OUTPUT

2018-08-13 Thread Michael Kerrisk (man-pages)

Hello Wangnan,

On 10/24/2016 08:52 AM, Wang Nan wrote:

Linux 4.7 (86e7972f690c1017fd086cdfe53d8524e68c661c) introduces
PERF_EVENT_IOC_PAUSE_OUTPUT feature. Document it.


Just to confirm, I presume this patch has been superseded by the one
from Vince that I just applied.

Cheers,

Michael


Signed-off-by: Wang Nan 
Reviewed-by: Vince Weaver 
Cc: Michael Kerrisk 
---
  man2/perf_event_open.2 | 24 
  1 file changed, 24 insertions(+)

diff --git a/man2/perf_event_open.2 b/man2/perf_event_open.2
index fade28c..561331c 100644
--- a/man2/perf_event_open.2
+++ b/man2/perf_event_open.2
@@ -1687,6 +1687,15 @@ the
  .I data_tail
  value should be written by user space to reflect the last read data.
  In this case, the kernel will not overwrite unread data.
+
+When the mapping is read only (without
+.BR PROT_WRITE ),
+setting .I data_tail is not allowed.
+In this case, the kernel will overwrite data when sample coming, unless
+the ring buffer is paused by a
+.BR PERF_EVENT_IOC_PAUSE_OUTPUT
+.BR ioctl (2)
+system call before reading.
  .TP
  .IR data_offset " (since Linux 4.1)"
  .\" commit e8c6deac69629c0cb97c3d3272f8631ef17f8f0f
@@ -2865,6 +2874,21 @@ The argument is a BPF program file descriptor that was created by
  a previous
  .BR bpf (2)
  system call.
+.TP
+.BR PERF_EVENT_IOC_PAUSE_OUTPUT " (since Linux 4.7)"
+.\" commit 86e7972f690c1017fd086cdfe53d8524e68c661c
+This allows pausing and resuming the event's ring-buffer. A
+paused ring-buffer does not prevent generation of samples, but simply
+discards the samples. The discarded samples are considered lost,
+causing
+.BR PERF_RECORD_LOST
+to be generated when possible.
+
+The argument is an integer. A nonzero value pauses the ring-buffer,
+zero resumes the ring-buffer.
+
+Pausing a read only ring buffer before reading from it without having
+to worry about data being overwritten.
  .SS Using prctl(2)
  A process can enable or disable all the event groups that are
  attached to it using the
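
As an illustration of the ioctl described in the patch, the following is a minimal sketch of a wrapper that pauses or resumes an event's ring buffer. The function name set_ring_buffer_paused is hypothetical. It assumes the installed linux/perf_event.h is from Linux 4.7 or later, so that PERF_EVENT_IOC_PAUSE_OUTPUT is defined, and that perf_fd was obtained earlier from perf_event_open(2).

```c
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/perf_event.h>

/*
 * Pause (pause != 0) or resume (pause == 0) output to the ring buffer
 * of the event referred to by perf_fd.
 * Returns 0 on success or a negated errno value on failure.
 */
int set_ring_buffer_paused(int perf_fd, int pause)
{
	if (ioctl(perf_fd, PERF_EVENT_IOC_PAUSE_OUTPUT, pause ? 1 : 0) == -1)
		return -errno;
	return 0;
}
```

A reader of a read-only mapping would presumably call set_ring_buffer_paused(fd, 1), consume the buffer, and then call set_ring_buffer_paused(fd, 0). Samples that arrive while the buffer is paused are discarded and counted as lost, as the patch text notes.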


