As of commit 654ced4a1377 ("tracing: Introduce tracepoint_is_faultable()"), system call trace events are allowed to fault in user space memory. Have some of the system call trace events take advantage of this.

Introduce a way to read user space addresses from the syscall trace events. This is accomplished by creating a per-CPU temporary buffer into which the unsafe user memory is read. When a syscall trace event needs to read user memory, it first reads a per-CPU counter that gets updated every time a user space task is scheduled on the CPU. It then enables preemption, copies the user space memory into this buffer, and disables preemption again. If the counter has advanced by less than two from its original value, the buffer is valid. Otherwise the copy needs to be tried again.

The reason to check for "less than two" rather than "equal to the previous value" is that scheduling kernel tasks is fine: only user space tasks write to this buffer. If the task schedules out, only kernel tasks run, and the task then schedules back in, the counter will have been incremented by one (for the task itself being scheduled back in).

A new file is created in the tracefs directory (and also per instance) that allows the user to shorten the amount copied from user space. Copying can be completely disabled by setting it to zero (the event will only display "" or (, ...) but no copying from user space will be performed). The maximum size to copy is hard coded to 128, which should be enough for this purpose.
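
The counter check described above can be illustrated with a small user space simulation. This is only a sketch of the retry idea, not the code from the series: struct cpu_state, copy_with_retry(), the preempt_window_fn hook and the three window helpers are hypothetical names invented for this example, memcpy() stands in for the real (faultable) copy from user space, and the retry limit is an assumption that the text above does not mention.

```c
#include <stddef.h>
#include <string.h>

#define BUF_MAX 128	/* the hard-coded maximum mentioned in the text */

/* Hypothetical stand-in for the per-CPU state described above. */
struct cpu_state {
	unsigned int sched_cnt;	/* bumped when a user task is scheduled in */
	char buf[BUF_MAX];	/* temporary buffer shared by user tasks */
};

/* Simulates whatever happens on the CPU while preemption is enabled. */
typedef void (*preempt_window_fn)(struct cpu_state *cpu, int attempt);

/*
 * Copy @len bytes of @src into the per-CPU buffer, retrying while the
 * counter shows that another user task may have used the buffer.
 * Returns the number of attempts used, or -1 if @max_tries ran out.
 */
int copy_with_retry(struct cpu_state *cpu, const char *src, size_t len,
		    preempt_window_fn window, int max_tries)
{
	if (len > BUF_MAX)
		len = BUF_MAX;

	for (int attempt = 0; attempt < max_tries; attempt++) {
		unsigned int start = cpu->sched_cnt;

		/* The real code would enable preemption here. */
		memcpy(cpu->buf, src, len);
		if (window)
			window(cpu, attempt);
		/* ...and disable preemption again here. */

		/* A delta of 0 or 1 means no other user task ran. */
		if (cpu->sched_cnt - start < 2)
			return attempt + 1;
	}
	return -1;
}

/* Only kernel tasks ran; this task was scheduled back in: +1. */
void kernel_tasks_only(struct cpu_state *cpu, int attempt)
{
	(void)attempt;
	cpu->sched_cnt += 1;
}

/* Another user task ran and overwrote the buffer, then we came back: +2. */
void user_task_always(struct cpu_state *cpu, int attempt)
{
	(void)attempt;
	memset(cpu->buf, 'X', BUF_MAX);
	cpu->sched_cnt += 2;
}

/* Same interference, but only during the first attempt. */
void user_task_first_attempt(struct cpu_state *cpu, int attempt)
{
	if (attempt == 0)
		user_task_always(cpu, attempt);
}
```

With kernel_tasks_only the first copy is accepted, since the counter moved by only one. With user_task_first_attempt the first copy is rejected and the second succeeds with the buffer intact. With user_task_always the simulated retry limit is hit and -1 is returned.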

This allows the output to look like this:

  sys_access(filename: 0x7f8c55368470 "/etc/ld.so.preload", mode: 4)
  sys_execve(filename: 0x564ebcf5a6b8 "/usr/bin/emacs", argv: 0x7fff357c0300, envp: 0x564ebc4a4820)
  sys_write(fd: 0xa, buf: 0x5646978d13c0 (01:00:05:00:00:00:00:00:01:87:55:89:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00), count: 0x20)
  sys_sethostname(name: 0x5584310eb2a0 "debian", len: 6)
  sys_renameat2(olddfd: 0xffffff9c, oldname: 0x7ffe02facdff "/tmp/x", newdfd: 0xffffff9c, newname: 0x7ffe02face06 "/tmp/y", flags: 1)

Steven Rostedt (7):
      tracing: Replace syscall RCU pointer assignment with READ/WRITE_ONCE()
      tracing: Have syscall trace events show "0x" for values greater than 10
      tracing: Have syscall trace events read user space string
      tracing: Have system call events record user array data
      tracing: Display some syscall arrays as strings
      tracing: Allow syscall trace events to read more than one user parameter
      tracing: Add syscall_user_buf_size to limit amount written

----
 Documentation/trace/ftrace.rst |   7 +
 include/trace/syscall.h        |   8 +-
 kernel/trace/Kconfig           |  13 +
 kernel/trace/trace.c           |  52 +++
 kernel/trace/trace.h           |   3 +
 kernel/trace/trace_syscalls.c  | 697 +++++++++++++++++++++++++++++++++++++++--
 6 files changed, 750 insertions(+), 30 deletions(-)