Hello,

I wrote a small systemtap script that can generate part of the
configuration for a systemd service. When run, it first produces
strace-like output annotated with information gathered from various
kernel probes. When the process exits, a summary is generated in
systemd unit format. The purpose of the script is to help system
administrators, distro maintainers and program developers prepare
better unit files.

The systemtap probes check for:
 * capability use for CapabilityBoundingSet
 * device access for DeviceAllow
 * address family use for RestrictAddressFamilies
 * RLIMIT related information
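
To illustrate the idea behind the capability probe: each capability
check sets one bit in a mask, and the summary prints the names of the
bits that are set. This is a minimal Python sketch, not the script's
SystemTap code. The helper names and the three-entry name table are
assumptions made for the example.

```python
# Illustrative subset of the capability numbers in linux/capability.h.
CAP_NAMES = {12: "CAP_NET_ADMIN", 13: "CAP_NET_RAW", 21: "CAP_SYS_ADMIN"}


def record_cap(mask: int, cap: int) -> int:
    """Set the bit for capability number `cap` in `mask`."""
    return mask | (1 << cap)


def caps_to_str(mask: int) -> str:
    """Render every named bit set in `mask`, lowest number first."""
    return " ".join(
        name for bit, name in sorted(CAP_NAMES.items()) if mask & (1 << bit)
    )


used = 0
for cap in (13, 12, 13):  # a repeated check only sets the same bit again
    used = record_cap(used, cap)

line = "CapabilityBoundingSet=" + caps_to_str(used)
print(line)  # CapabilityBoundingSet=CAP_NET_ADMIN CAP_NET_RAW
```

The address family probe can be pictured the same way, with AF_ numbers
instead of capability numbers.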

File system accesses are also checked against the ProtectSystem/
ProtectHome requirements, and mmap()/mprotect() flags against what
MemoryDenyWriteExecute=true requires. For example, if the process never
writes to /boot, /etc or /usr, we can set ProtectSystem=full. If the
script detects a write access, this is degraded to ProtectSystem=true
(a write to /etc) or ProtectSystem=false (a write to one of the other
protected directories). For InaccessibleDirectories, the user should
specify a list of candidate paths that could be made inaccessible. Any
candidate with a file system access is dropped from the list, and the
remaining paths are proposed as inaccessible.
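
The two decisions described above can be sketched in Python as follows.
This is only an illustration under stated assumptions, not the script's
code. The function names are made up for the example, paths are
compared exactly as they were reported, and a candidate is treated as
accessed when an accessed path equals it or lies below it.

```python
def protect_system(written_paths):
    """Return the strongest ProtectSystem= value that allows the writes seen."""
    level = "full"
    for path in written_paths:
        if path in ("/usr", "/boot"):
            return "false"  # ProtectSystem=true would already block this write
        if path == "/etc":
            level = "true"  # only ProtectSystem=full makes /etc read-only
    return level


def is_under(path, directory):
    """True if `path` is `directory` itself or is located below it."""
    directory = directory.rstrip("/") or "/"
    return path == directory or path.startswith(directory.rstrip("/") + "/")


def prune_candidates(candidates, accessed_paths):
    """Keep only the candidates that no accessed path touches."""
    return [
        c for c in candidates
        if not any(is_under(p, c) for p in accessed_paths)
    ]


print(protect_system(["/run/wpa_supplicant", "/etc"]))
print(prune_candidates(["/boot", "/run", "/home"],
                       ["/run/dbus/system_bus_socket"]))
```

With these inputs the first call yields "true" and the second keeps
/boot and /home, because only /run contains an accessed path.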

A list of system calls is produced for SystemCallFilter.

When in doubt, the strace part can be used to verify how the summary was
produced.

Short sample output with wpa_supplicant (looks better on a wide terminal
without line wrapping):
socket(PF_LOCAL, SOCK_STREAM|SOCK_CLOEXEC, 0) = 4 [RestrictAddressFamilies=AF_UNIX] [NOFILE 3 -> 4]
socket(PF_NETLINK, SOCK_RAW, 0) = 5 [RestrictAddressFamilies=AF_NETLINK]
connect(4, {AF_UNIX, "/var/run/dbus/system_bus_socket"}, 33) = 0 [ReadWriteDirectories=/run/dbus/system_bus_socket ] [InaccessibleDirectories=~/ /run /run/dbus /run/dbus/system_bus_socket /var ]
open("/dev/urandom", O_RDONLY|O_NOCTTY|O_NONBLOCK) = 14 [DeviceAllow=/dev/char/1:9 r ] [NOFILE 13 -> 14] [InaccessibleDirectories=~/ /dev /dev/urandom ]

Summary:
CapabilityBoundingSet=CAP_NET_ADMIN CAP_NET_RAW
# Consider also possibly missing CapabilityBoundingSet=CAP_SYS_ADMIN
ProtectHome=true
ProtectSystem=full
DevicePolicy=strict
DeviceAllow=/dev/char/1:3 rw
DeviceAllow=/dev/char/1:8 r
DeviceAllow=/dev/char/1:9 r
DeviceAllow=/dev/char/10:58 r
# LimitFSIZE=0
# LimitDATA=577536
# LimitSTACK=139264
# LimitCORE=0
# LimitNOFILE=15
# LimitAS=45146112
# LimitNPROC=159
# LimitMEMLOCK=0
# LimitSIGPENDING=0
# LimitMSGQUEUE=0
# LimitNICE=0
# LimitRTPRIO=0
RestrictAddressFamilies=AF_UNIX AF_INET AF_NETLINK AF_PACKET
MemoryDenyWriteExecute=true
SystemCallFilter=access alarm arch_prctl bind brk chmod clock_getres clock_gettime close connect execve exit_group fcntl fstat geteuid getresgid getresuid getrlimit getsockname getuid ioctl mkdir mmap mprotect munmap open poll read recvfrom recvmsg rmdir rt_sigaction rt_sigprocmask rt_sigreturn rt_sigsuspend select sendmsg sendto set_robust_list set_tid_address setsockopt socket statfs unlink write
InaccessibleDirectories=-/bin
InaccessibleDirectories=-/boot
InaccessibleDirectories=-/dev/hugepages
InaccessibleDirectories=-/dev/mqueue
InaccessibleDirectories=-/dev/pts
InaccessibleDirectories=-/dev/shm
InaccessibleDirectories=-/home
InaccessibleDirectories=-/lost+found
InaccessibleDirectories=-/media
InaccessibleDirectories=-/mnt
InaccessibleDirectories=-/opt
InaccessibleDirectories=-/proc/bus
InaccessibleDirectories=-/proc/sys
InaccessibleDirectories=-/root
InaccessibleDirectories=-/srv
InaccessibleDirectories=-/tmp
InaccessibleDirectories=-/usr/bin
InaccessibleDirectories=-/usr/sbin
InaccessibleDirectories=-/var/tmp
ReadOnlyDirectories=/
ReadWriteDirectories=/dev/null /run /run/dbus/system_bus_socket /run/wpa_supplicant /run/wpa_supplicant/wlan0 socket:[38833]

This is pretty much valid for my system (with some editing; for
example, the last line really only needs /run/wpa_supplicant), even
though the script is not yet perfect. I do not trust the RLIMIT values
yet, and systemtap itself causes problems: it needs to be run as root,
and its system call names don't match those used by seccomp. Perhaps
there should be a way for systemd to start the process directly,
without staprun in between, and then tell staprun about the process.
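
On the system call name mismatch: the script keeps a small rename table
(syscalls_for_seccomp) for this. A minimal Python sketch of how such a
table can be applied when building the SystemCallFilter line is shown
here. The four renames are taken from the script, while the function
name and the sample input are assumptions made for the example.

```python
# systemtap name -> seccomp name, copied from the script's rename table.
SECCOMP_NAMES = {
    "fstatat": "fstatat64",
    "mmap2": "mmap",
    "pread": "pread64",
    "pwrite": "pwrite64",
}


def syscall_filter(used_syscalls):
    """Rename, de-duplicate and sort the observed system call names."""
    names = sorted({SECCOMP_NAMES.get(name, name) for name in used_syscalls})
    return "SystemCallFilter=" + " ".join(names)


result = syscall_filter(["read", "mmap2", "mmap", "pread"])
print(result)  # SystemCallFilter=mmap pread64 read
```

Names that are missing from the table pass through unchanged, so the
result still needs to be checked against the names seccomp accepts.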

I suppose the script could find a home in the systemd repository (as
it's fairly specific to systemd), in systemtap (it's just another
script) or just somewhere on GitHub if nobody cares. Would it be
interesting for systemd?

As for future features, it may be possible to probe what kind of
settings could be used for NoNewPrivileges or SecureBits. This might
need small changes to the kernel. It may be possible to generate
PrivateTmp and PrivateNetwork in some cases, MountFlags probably not.

-Topi
#! /bin/sh

# suppress some run-time errors here for cleaner output
//bin/true && exec stap --suppress-handler-errors --skip-badvars $0 ${1+"$@"}

/*
 * Compile:
 * stap -p4 -DSTP_NO_OVERLOAD -m strace
 * Run:
 * /usr/bin/staprun -R -c "/sbin/wpa_supplicant -u -O /run/wpa_supplicant -c /etc/wpa_supplicant.conf -i wlan0" -w /root/strace.ko only_capability_use=1 timestamp=0
 */

/* configuration options; set these with stap -G */
global follow_fork = 0   /* -Gfollow_fork=1 means trace descendant processes too */
global timestamp = 1     /* -Gtimestamp=0 means don't print a syscall timestamp */
global elapsed_time = 0  /* -Gelapsed_time=1 means print a syscall duration too */
global only_capability_use = 0 /* -Gonly_capability_use=1 means print only when capabilities are used */
global inaccessible_candidates = "/bin /boot /dev /dev/hugepages /dev/mqueue /dev/pts /dev/shm /home /lost+found /media /mnt /opt /proc /proc/bus /proc/sys /root /sbin /srv /sys /sys/fs /usr/bin /usr/sbin /tmp /var /var/tmp"

global thread_argstr%
global thread_time%

global syscalls_nonreturn[2]
global capnames[64]
global used_caps
global missing_caps
global all_used_caps
global all_missing_caps
global accessed_devices[1000]
global all_accessed_devices[1000]
global highwatermark_fsize
global highwatermark_data
global highwatermark_stack
global highwatermark_core
global highwatermark_nproc
global highwatermark_nofile
global highwatermark_memlock
global highwatermark_as
global highwatermark_sigpending
global highwatermark_msgqueue
global highwatermark_nice
global highwatermark_rtprio
global old_highwatermark_fsize
global old_highwatermark_data
global old_highwatermark_stack
global old_highwatermark_core
global old_highwatermark_nproc
global old_highwatermark_nofile
global old_highwatermark_memlock
global old_highwatermark_as
global old_highwatermark_sigpending
global old_highwatermark_msgqueue
global old_highwatermark_nice
global old_highwatermark_rtprio
global afnames%
global used_afs
global missing_afs
global all_used_afs
global all_missing_afs
global no_memory_deny_write_execute
global all_memory_deny_write_execute = "true"
global used_syscalls%
global syscalls_for_seccomp%
global accessed_paths%
global all_accessed_paths%
global written_paths%
global all_written_paths%
global inaccessibles%
global protect_system_paths%
global protect_system = "full"
global protect_home_paths%
global protect_home = "true"
global print_syscall


probe begin 
  {
    /* list those syscalls that never .return */
    syscalls_nonreturn["exit"]=1
    syscalls_nonreturn["exit_group"]=1

    // egrep '#define CAP_.*[0-9]+$' /usr/src/linux-headers*/include/uapi/linux/capability.h |
    //   awk '{ print "capnames[" $3 "] = \"" $2 "\";" }'
    capnames[0] = "CAP_CHOWN";
    capnames[1] = "CAP_DAC_OVERRIDE";
    capnames[2] = "CAP_DAC_READ_SEARCH";
    capnames[3] = "CAP_FOWNER";
    capnames[4] = "CAP_FSETID";
    capnames[5] = "CAP_KILL";
    capnames[6] = "CAP_SETGID";
    capnames[7] = "CAP_SETUID";
    capnames[8] = "CAP_SETPCAP";
    capnames[9] = "CAP_LINUX_IMMUTABLE";
    capnames[10] = "CAP_NET_BIND_SERVICE";
    capnames[11] = "CAP_NET_BROADCAST";
    capnames[12] = "CAP_NET_ADMIN";
    capnames[13] = "CAP_NET_RAW";
    capnames[14] = "CAP_IPC_LOCK";
    capnames[15] = "CAP_IPC_OWNER";
    capnames[16] = "CAP_SYS_MODULE";
    capnames[17] = "CAP_SYS_RAWIO";
    capnames[18] = "CAP_SYS_CHROOT";
    capnames[19] = "CAP_SYS_PTRACE";
    capnames[20] = "CAP_SYS_PACCT";
    capnames[21] = "CAP_SYS_ADMIN";
    capnames[22] = "CAP_SYS_BOOT";
    capnames[23] = "CAP_SYS_NICE";
    capnames[24] = "CAP_SYS_RESOURCE";
    capnames[25] = "CAP_SYS_TIME";
    capnames[26] = "CAP_SYS_TTY_CONFIG";
    capnames[27] = "CAP_MKNOD";
    capnames[28] = "CAP_LEASE";
    capnames[29] = "CAP_AUDIT_WRITE";
    capnames[30] = "CAP_AUDIT_CONTROL";
    capnames[31] = "CAP_SETFCAP";
    capnames[32] = "CAP_MAC_OVERRIDE";
    capnames[33] = "CAP_MAC_ADMIN";
    capnames[34] = "CAP_SYSLOG";
    capnames[35] = "CAP_WAKE_ALARM";
    capnames[36] = "CAP_BLOCK_SUSPEND";
    capnames[37] = "CAP_AUDIT_READ";

    // egrep '#define AF_.*' /usr/src/linux-headers-*/include/linux/socket.h |
    //   awk '{ print "afnames[" $3 "] = \"" $2 "\"" }'
    afnames[0] = "AF_UNSPEC"
    afnames[1] = "AF_UNIX"
    afnames[2] = "AF_INET"
    afnames[3] = "AF_AX25"
    afnames[4] = "AF_IPX"
    afnames[5] = "AF_APPLETALK"
    afnames[6] = "AF_NETROM"
    afnames[7] = "AF_BRIDGE"
    afnames[8] = "AF_ATMPVC"
    afnames[9] = "AF_X25"
    afnames[10] = "AF_INET6"
    afnames[11] = "AF_ROSE"
    afnames[12] = "AF_DECnet"
    afnames[13] = "AF_NETBEUI"
    afnames[14] = "AF_SECURITY"
    afnames[15] = "AF_KEY"
    afnames[16] = "AF_NETLINK"
    afnames[17] = "AF_PACKET"
    afnames[18] = "AF_ASH"
    afnames[19] = "AF_ECONET"
    afnames[20] = "AF_ATMSVC"
    afnames[21] = "AF_RDS"
    afnames[22] = "AF_SNA"
    afnames[23] = "AF_IRDA"
    afnames[24] = "AF_PPPOX"
    afnames[25] = "AF_WANPIPE"
    afnames[26] = "AF_LLC"
    afnames[27] = "AF_IB"
    afnames[28] = "AF_MPLS"
    afnames[29] = "AF_CAN"
    afnames[30] = "AF_TIPC"
    afnames[31] = "AF_BLUETOOTH"
    afnames[32] = "AF_IUCV"
    afnames[33] = "AF_RXRPC"
    afnames[34] = "AF_ISDN"
    afnames[35] = "AF_PHONET"
    afnames[36] = "AF_IEEE802154"
    afnames[37] = "AF_CAIF"
    afnames[38] = "AF_ALG"
    afnames[39] = "AF_NFC"
    afnames[40] = "AF_VSOCK"
    afnames[41] = "AF_KCM"

    syscalls_for_seccomp["fstatat"] = "fstatat64"
    syscalls_for_seccomp["mmap2"] = "mmap"
    syscalls_for_seccomp["pread"] = "pread64"
    syscalls_for_seccomp["pwrite"] = "pwrite64"

    str = tokenize(inaccessible_candidates, " ")
    while (str != "") {
      inaccessibles[str] = 0
      str = tokenize("", " ")
    }

    protect_system_paths["/boot"] = 1
    protect_system_paths["/etc"] = 1
    protect_system_paths["/usr"] = 1
    # Additional ProtectSystem directories in Debian
    protect_system_paths["/bin"] = 1
    protect_system_paths["/lib"] = 1
    protect_system_paths["/lib64"] = 1
    protect_system_paths["/sbin"] = 1

    protect_home_paths["/home"] = 1
    protect_home_paths["/root"] = 1
    protect_home_paths["/run/user"] = 1
 }



function filter_p()
  {
    if (target() == 0) return 0; /* system-wide */
    if (!follow_fork && pid() != target()) return 1; /* single-process */
    if (follow_fork && !target_set_pid(pid())) return 1; /* multi-process */
    return 0;
  }

function caps_to_str(caps)
  {
    str = ""
    for (i = 0; i <= 37; i++) # up to and including CAP_LAST_CAP (37, CAP_AUDIT_READ)
      if (caps & (1 << i)) {
        str .= capnames[i]
        if ((caps & ~((1 << (i + 1)) - 1)) != 0)
          str .= " "
      }
    return str
  }

function dev_to_str(type, dev, access)
  {
    devs = "/dev/"
    if (type == 1) # DEV_BLOCK
      devs .= "block"
    else
      devs .= "char"
    devs .= sprintf("/%d:%d ", dev >> 32, dev & 0xffffffff)
    if (access & 2) # ACC_READ
      devs .= "r"
    if (access & 4) # ACC_WRITE
      devs .= "w"
    if (access & 1) # ACC_MKNOD
      devs .= "m"
    return devs
  }

function afs_to_str(afs)
  {
    str = ""
    for (i = 0; i < 42; i++) # AF_MAX
      if (afs & (1 << i)) {
        str .= afnames[i]
        if ((afs & ~((1 << (i + 1)) - 1)) != 0)
          str .= " "
      }
    return str
  }

/* Capabilities */
probe kernel.function("cap_capable@security/commoncap.c").return
  {
    if (filter_p()) next;

    if ($return == 0 && $audit)
      used_caps |= 1 << $cap;
    else
      missing_caps |= 1 << $cap;
  }

/* Devices */
probe kernel.function("__devcgroup_check_permission@security/device_cgroup.c").return
  {
    if (filter_p()) next;

    if ($return == 0)
      accessed_devices[$type, $major << 32 | $minor] |= $access
  }

/* RLIMIT_FSIZE */
probe kernel.function("inode_newsize_ok@fs/attr.c").return
  {
    if (filter_p()) next;

    if ($return == 0 && highwatermark_fsize < $offset)
      highwatermark_fsize = $offset
  }

/* RLIMIT_DATA */
probe kernel.function("prctl_set_mm@kernel/sys.c").return
  {
    if (filter_p()) next;

    if ($return == 0 && highwatermark_data < $prctl_map->end_data - $prctl_map->start_data) {
      highwatermark_data = $prctl_map->end_data - $prctl_map->start_data
      print_syscall = 1
    }
  }

probe kernel.function("do_brk@mm/mmap.c").return
  {
    if (filter_p()) next;

    task = task_current()
    if ($return > 0 && highwatermark_data < task->mm->data_vm << 12) { # PAGE_SHIFT
      highwatermark_data = task->mm->data_vm << 12
      print_syscall = 1
    }
    if ($return > 0 && highwatermark_as < task->mm->total_vm << 12) {
      highwatermark_as = task->mm->total_vm << 12
      print_syscall = 1
    }
  }

/* also RLIMIT_STACK and RLIMIT_MEMLOCK */
probe kernel.function("vm_stat_account@mm/mmap.c").return
  {
    if (filter_p()) next;

    if (highwatermark_data < $mm->data_vm << 12) { # PAGE_SHIFT
      highwatermark_data = $mm->data_vm << 12
      print_syscall = 1
    }
    if (highwatermark_stack < $mm->stack_vm << 12) {
      highwatermark_stack = $mm->stack_vm << 12
      print_syscall = 1
    }
    if (highwatermark_memlock < atomic_long_read(&$mm->locked_vm) << 12) {
      highwatermark_memlock = atomic_long_read(&$mm->locked_vm) << 12
      print_syscall = 1
    }
    if (highwatermark_as < $mm->total_vm << 12) {
      highwatermark_as = $mm->total_vm << 12
      print_syscall = 1
    }
  }

/* RLIMIT_CORE */
probe kernel.function("dump_emit@fs/coredump.c").return
  {
    if (filter_p()) next;

    if (highwatermark_core < $cprm->written) {
      highwatermark_core = $cprm->written
      print_syscall = 1
    }
  }

/* RLIMIT_NPROC */
probe kernel.function("commit_creds@kernel/cred.c").return
  {
    if (filter_p()) next;

    if (highwatermark_nproc < atomic_read(&$new->user->processes)) {
      highwatermark_nproc = atomic_read(&$new->user->processes)
      print_syscall = 1
    }
  }

probe kernel.function("copy_process@kernel/fork.c").return
  {
    if (filter_p()) next;
    printf("return %d\n", $return); /* debug output */
    try {
      /* $return is a task_struct pointer; skip NULL and small negative (error) values */
      if (($return > 0 || $return < -1000) && $return->real_cred && $return->real_cred->user) {
        printf("good return %d\n", $return); /* debug output */
        if (highwatermark_nproc < atomic_read(&$return->real_cred->user->processes)) {
          highwatermark_nproc = atomic_read(&$return->real_cred->user->processes)
          print_syscall = 1
        }
      }
    } catch {}
  }

/* RLIMIT_NOFILE */
probe kernel.function("__alloc_fd@fs/file.c").return
  {
    if (filter_p()) next;

    if (($return >= 0 || $return < -1000) && highwatermark_nofile < $return) {
      highwatermark_nofile = $return
      print_syscall = 1
    }
  }

probe kernel.function("do_dup2@fs/file.c").return
  {
    if (filter_p()) next;

    if (($return >= 0 || $return < -1000) && highwatermark_nofile < $return) {
      highwatermark_nofile = $return
      print_syscall = 1
    }
  }

/* RLIMIT_MEMLOCK */
probe kernel.function("sys_bpf@kernel/bpf/syscall.c").return
  {
    if (filter_p()) next;

    task = task_current()
    user = task->real_cred->user
    if ($return == 0 && highwatermark_memlock < atomic_long_read(&user->locked_vm) << 12) { # PAGE_SHIFT
      highwatermark_memlock = atomic_long_read(&user->locked_vm) << 12
      print_syscall = 1
    }
  }

probe kernel.function("perf_mmap@kernel/events/core.c").return
  {
    if (filter_p()) next;

    task = task_current()
    if ($return == 0 && highwatermark_memlock < task->mm->pinned_vm << 12) { # PAGE_SHIFT
      highwatermark_memlock = task->mm->pinned_vm << 12
      print_syscall = 1
    }
  }

probe kernel.function("do_mlock@mm/mlock.c").return
  {
    if (filter_p()) next;

    task = task_current()
    if ($return == 0 && highwatermark_memlock < task->mm->locked_vm << 12) { # PAGE_SHIFT
      highwatermark_memlock = task->mm->locked_vm << 12
      print_syscall = 1
    }
  }

probe kernel.function("sys_mlockall@mm/mlock.c").return
  {
    if (filter_p()) next;

    task = task_current()
    if ($return == 0 && highwatermark_memlock < task->mm->total_vm << 12) { # PAGE_SHIFT
      highwatermark_memlock = task->mm->total_vm << 12
      print_syscall = 1
    }
  }

/* RLIMIT_SIGPENDING */
probe kernel.function("__sigqueue_alloc@kernel/signal.c").return
  {
    if (filter_p()) next;

    task = task_current()
    user = task->real_cred->user
    /* __sigqueue_alloc() returns a pointer; NULL means the allocation failed */
    if ($return != 0 && highwatermark_sigpending < atomic_read(&user->sigpending)) {
      highwatermark_sigpending = atomic_read(&user->sigpending)
      print_syscall = 1
    }
  }

/* RLIMIT_MSGQUEUE */
probe kernel.function("mqueue_get_inode@ipc/mqueue.c").return
  {
    if (filter_p()) next;

    task = task_current()
    user = task->real_cred->user
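    /* mqueue_get_inode() returns an inode pointer; skip NULL and small negative (error) values */
    if (($return > 0 || $return < -1000) && highwatermark_msgqueue < user->mq_bytes) {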
      highwatermark_msgqueue = user->mq_bytes
      print_syscall = 1
    }
  }

/* RLIMIT_NICE */
probe kernel.function("set_user_nice@kernel/sched/core.c").return
  {
    if (filter_p()) next;

    if (highwatermark_nice < $nice) {
      highwatermark_nice = $nice
      print_syscall = 1
    }
  }

/* RLIMIT_RTPRIO */
probe kernel.function("__sched_setscheduler@kernel/sched/core.c").return
  {
    if (filter_p()) next;

    if (highwatermark_rtprio < $attr->sched_priority) {
      highwatermark_rtprio = $attr->sched_priority
      print_syscall = 1
    }
  }

/* socket address families */
probe kernel.function("__sock_create@net/socket.c").return
  {
    if (filter_p()) next;

    if ($return == 0) {
      used_afs |= 1 << $family
      print_syscall = 1
    } else if ($return == -93) { # -EPROTONOSUPPORT; kernel functions return negative errno values
      missing_afs |= 1 << $family
      print_syscall = 1
    }
  }

/* mmap flags */
probe kernel.function("do_mmap@mm/mmap.c").return
  {
    if (filter_p()) next;

    /* the PROT_* bits are passed in do_mmap()'s prot argument, not in flags */
    if (($return >= 0 || $return < -1000) && ($prot & (2 | 4)) == (2 | 4)) { # PROT_WRITE | PROT_EXEC
      no_memory_deny_write_execute = 1
      print_syscall = 1
    }
  }

/* path checks */
probe kernel.function("security_path_mknod@security/security.c").return
  {
    if (filter_p()) next;

    if ($return == 0) {
      accessed_paths[fullpath_struct_path($dir)]++
      written_paths[fullpath_struct_path($dir)]++
      print_syscall = 1
    }
  }

probe kernel.function("security_path_mkdir@security/security.c").return
  {
    if (filter_p()) next;

    if ($return == 0) {
      accessed_paths[fullpath_struct_path($dir)]++
      written_paths[fullpath_struct_path($dir)]++
      print_syscall = 1
    }
  }

probe kernel.function("security_path_rmdir@security/security.c").return
  {
    if (filter_p()) next;

    if ($return == 0) {
      accessed_paths[fullpath_struct_path($dir)]++
      written_paths[fullpath_struct_path($dir)]++
      print_syscall = 1
    }
  }

probe kernel.function("security_path_unlink@security/security.c").return
  {
    if (filter_p()) next;

    if ($return == 0) {
      accessed_paths[fullpath_struct_path($dir)]++
      written_paths[fullpath_struct_path($dir)]++
      print_syscall = 1
    }
  }

probe kernel.function("security_path_symlink@security/security.c").return
  {
    if (filter_p()) next;

    if ($return == 0) {
      accessed_paths[fullpath_struct_path($dir)]++
      written_paths[fullpath_struct_path($dir)]++
      print_syscall = 1
    }
  }

probe kernel.function("security_path_link@security/security.c").return
  {
    if (filter_p()) next;

    if ($return == 0) {
      accessed_paths[fullpath_struct_path($new_dir)]++
      written_paths[fullpath_struct_path($new_dir)]++
      print_syscall = 1
    }
  }

probe kernel.function("security_path_rename@security/security.c").return
  {
    if (filter_p()) next;

    if ($return == 0) {
      accessed_paths[fullpath_struct_path($old_dir)]++
      written_paths[fullpath_struct_path($old_dir)]++
      accessed_paths[fullpath_struct_path($new_dir)]++
      written_paths[fullpath_struct_path($new_dir)]++
      print_syscall = 1
    }
  }

probe kernel.function("security_path_truncate@security/security.c").return
  {
    if (filter_p()) next;

    if ($return == 0) {
      accessed_paths[fullpath_struct_path($path)]++
      written_paths[fullpath_struct_path($path)]++
      print_syscall = 1
    }
  }

probe kernel.function("security_path_chmod@security/security.c").return
  {
    if (filter_p()) next;

    if ($return == 0) {
      accessed_paths[fullpath_struct_path($path)]++
      written_paths[fullpath_struct_path($path)]++
      print_syscall = 1
    }
  }

probe kernel.function("security_path_chown@security/security.c").return
  {
    if (filter_p()) next;

    if ($return == 0) {
      accessed_paths[fullpath_struct_path($path)]++
      written_paths[fullpath_struct_path($path)]++
      print_syscall = 1
    }
  }

probe kernel.function("security_path_chroot@security/security.c").return
  {
    if (filter_p()) next;

    if ($return == 0) {
      accessed_paths[fullpath_struct_path($path)]++
      print_syscall = 1
    }
  }

probe kernel.function("security_inode_create@security/security.c").return
  {
    if (filter_p()) next;

    if ($return == 0) {
      accessed_paths[inode_path($dir)]++
      written_paths[inode_path($dir)]++
      print_syscall = 1
    }
  }

probe kernel.function("security_inode_link@security/security.c").return
  {
    if (filter_p()) next;

    if ($return == 0) {
      accessed_paths[inode_path($dir)]++
      written_paths[inode_path($dir)]++
      print_syscall = 1
    }
  }

probe kernel.function("security_inode_unlink@security/security.c").return
  {
    if (filter_p()) next;

    if ($return == 0) {
      accessed_paths[inode_path($dir)]++
      written_paths[inode_path($dir)]++
      print_syscall = 1
    }
  }

probe kernel.function("security_inode_symlink@security/security.c").return
  {
    if (filter_p()) next;

    if ($return == 0) {
      accessed_paths[inode_path($dir)]++
      written_paths[inode_path($dir)]++
      print_syscall = 1
    }
  }

probe kernel.function("security_inode_mkdir@security/security.c").return
  {
    if (filter_p()) next;

    if ($return == 0) {
      accessed_paths[inode_path($dir)]++
      written_paths[inode_path($dir)]++
      print_syscall = 1
    }
  }

probe kernel.function("security_inode_rmdir@security/security.c").return
  {
    if (filter_p()) next;

    if ($return == 0) {
      accessed_paths[inode_path($dir)]++
      written_paths[inode_path($dir)]++
      print_syscall = 1
    }
  }

probe kernel.function("security_inode_mknod@security/security.c").return
  {
    if (filter_p()) next;

    if ($return == 0) {
      accessed_paths[inode_path($dir)]++
      written_paths[inode_path($dir)]++
      print_syscall = 1
    }
  }

probe kernel.function("security_inode_rename@security/security.c").return
  {
    if (filter_p()) next;

    if ($return == 0) {
      accessed_paths[inode_path($old_dir)]++
      written_paths[inode_path($old_dir)]++
      accessed_paths[inode_path($new_dir)]++
      written_paths[inode_path($new_dir)]++
      print_syscall = 1
    }
  }

probe kernel.function("security_inode_readlink@security/security.c").return
  {
    if (filter_p()) next;

    if ($return == 0 && $dentry > 1000) {
      printf("func %s dentry 0x%x\n", pp(), $dentry);
      accessed_paths[d_path($dentry)]++
      print_syscall = 1
    }
  }

probe kernel.function("security_inode_follow_link@security/security.c").return
  {
    if (filter_p()) next;

    if ($return == 0 && $dentry > 1000) {
      printf("func %s dentry 0x%x inode 0x%x\n", pp(), $dentry, $inode);
      accessed_paths[inode_path($inode)]++
      print_syscall = 1
    }
  }

probe kernel.function("security_inode_permission@security/security.c").return
  {
    if (filter_p()) next;

    if ($return == 0) {
      accessed_paths[inode_path($inode)]++
      if ($mask & (0x00000002 | 0x00000008)) # MAY_WRITE | MAY_APPEND
        written_paths[inode_path($inode)]++
      print_syscall = 1
    }
  }

probe kernel.function("security_inode_setattr@security/security.c").return
  {
    if (filter_p()) next;

    if ($return == 0 && $dentry > 1000) {
      printf("func %s dentry 0x%x\n", pp(), $dentry);
      accessed_paths[d_path($dentry)]++
      written_paths[d_path($dentry)]++
      print_syscall = 1
    }
  }

probe kernel.function("security_inode_getattr@security/security.c").return
  {
    if (filter_p()) next;

    if ($return == 0) {
      accessed_paths[fullpath_struct_path($path)]++
      print_syscall = 1
    }
  }

probe kernel.function("security_inode_setxattr@security/security.c").return
  {
    if (filter_p()) next;

    if ($return == 0 && $dentry > 1000) {
      printf("func %s dentry 0x%x\n", pp(), $dentry);
      accessed_paths[d_path($dentry)]++
      written_paths[d_path($dentry)]++
      print_syscall = 1
    }
  }

probe kernel.function("security_inode_getxattr@security/security.c").return
  {
    if (filter_p()) next;

    if ($return == 0 && $dentry > 1000) {
      printf("func %s dentry 0x%x\n", pp(), $dentry);
      accessed_paths[d_path($dentry)]++
      print_syscall = 1
    }
  }

probe kernel.function("security_inode_removexattr@security/security.c").return
  {
    if (filter_p()) next;

    if ($return == 0 && $dentry > 1000) {
      printf("func %s dentry 0x%x\n", pp(), $dentry);
      accessed_paths[d_path($dentry)]++
      written_paths[d_path($dentry)]++
      print_syscall = 1
    }
  }

probe kernel.function("security_inode_getsecurity@security/security.c").return
  {
    if (filter_p()) next;

    if ($return == 0) {
      accessed_paths[inode_path($inode)]++
      print_syscall = 1
    }
  }

probe kernel.function("security_inode_setsecurity@security/security.c").return
  {
    if (filter_p()) next;

    if ($return == 0) {
      accessed_paths[inode_path($inode)]++
      written_paths[inode_path($inode)]++
      print_syscall = 1
    }
  }

probe kernel.function("security_inode_listsecurity@security/security.c").return
  {
    if (filter_p()) next;

    if ($return == 0) {
      accessed_paths[inode_path($inode)]++
      print_syscall = 1
    }
  }

probe kernel.function("security_inode_getsecid@security/security.c").return
  {
    if (filter_p()) next;

    accessed_paths[inode_path($inode)]++
    print_syscall = 1
  }

probe kernel.function("security_file_permission@security/security.c").return
  {
    if (filter_p()) next;

    if ($return == 0) {
      accessed_paths[fullpath_struct_file(task_current(), $file)]++
      if ($mask & (0x00000002 | 0x00000008)) # MAY_WRITE | MAY_APPEND
        written_paths[fullpath_struct_file(task_current(), $file)]++
      print_syscall = 1
    }
  }

probe kernel.function("security_file_set_fowner@security/security.c").return
  {
    if (filter_p()) next;

    accessed_paths[fullpath_struct_file(task_current(), $file)]++
    written_paths[fullpath_struct_file(task_current(), $file)]++
    print_syscall = 1
  }

probe kernel.function("security_file_open@security/security.c").return
  {
    if (filter_p()) next;

    if ($return == 0) {
      accessed_paths[fullpath_struct_file(task_current(), $file)]++
      print_syscall = 1
    }
  }

probe kernel.function("security_inode_setsecctx@security/security.c").return
  {
    if (filter_p()) next;

    if ($return == 0 && $dentry > 1000) {
      printf("func %s dentry 0x%x\n", pp(), $dentry);
      accessed_paths[d_path($dentry)]++
      written_paths[d_path($dentry)]++
      print_syscall = 1
    }
  }

probe kernel.function("security_inode_getsecctx@security/security.c").return
  {
    if (filter_p()) next;

    if ($return == 0) {
      accessed_paths[inode_path($inode)]++
      print_syscall = 1
    }
  }

/* system call printing */
probe nd_syscall.* 
  {
    # TODO: filter out apparently-nested syscalls (that are implemented
    # in terms of each other within the kernel); PR6762

    if (filter_p()) next;

    used_syscalls[name]++

    thread_argstr[tid()]=argstr
    if (timestamp || elapsed_time)
      thread_time[tid()]=gettimeofday_us()

    if (name in syscalls_nonreturn)
      report(name,argstr,"")
  }

probe nd_syscall.*.return
  {
    if (filter_p()) next;

    report(name,thread_argstr[tid()],retstr)
  }

function report(syscall_name, syscall_argstr, syscall_retstr)
  {
    if (timestamp || elapsed_time)
      {
        now = gettimeofday_us()
        then = thread_time[tid()]

        if (timestamp)
          prefix=sprintf("%s.%06d ", ctime(then/1000000), then%1000000)

        if (elapsed_time && (now>then)) {
          diff = now-then
          suffix=sprintf(" <%d.%06d>", diff/1000000, diff%1000000)
        }

        delete thread_time[tid()]
      }

    /* add a thread-id string in lots of cases, except if
       stap strace.stp -c SINGLE_THREADED_CMD */
    if (tid() != target()) {
      prefix .= sprintf("%s[%d] ", execname(), tid())
    }

    if (used_caps) {
       suffix .= " [Capabilities=" . caps_to_str(used_caps) . "]"
       all_used_caps |= used_caps
       print_syscall = 1
    }                  
    if (missing_caps) {
       suffix .= " missing [Capabilities=" . caps_to_str(missing_caps) . "]"
       all_missing_caps |= missing_caps
       print_syscall = 1
    }                  

    foreach ([type, dev] in accessed_devices) {
      devs .= dev_to_str(type, dev, accessed_devices[type, dev]) . " "
      if (has_devs == 0) {
        has_devs = 1
        print_syscall = 1
        devs = " [DeviceAllow=" . devs
      }
      all_accessed_devices[type, dev] = accessed_devices[type, dev];
    }
    if (has_devs) {
      devs .= "]"
      suffix .= devs
    }

    if (used_afs) {
      suffix .= " [RestrictAddressFamilies=" . afs_to_str(used_afs) . "]"
      all_used_afs |= used_afs
      print_syscall = 1
    }                  
    if (missing_afs) {
      suffix .= " missing [RestrictAddressFamilies=" . afs_to_str(missing_afs) . "]"
      all_missing_afs |= missing_afs
      print_syscall = 1
    }                  

    if (no_memory_deny_write_execute) {
      suffix .= " [MemoryDenyWriteExecute=false]"
      all_memory_deny_write_execute = "false"
    }                  

    if (highwatermark_fsize > old_highwatermark_fsize) {
      suffix .= sprintf(" [FSIZE %d -> %d]", old_highwatermark_fsize, highwatermark_fsize)
      old_highwatermark_fsize = highwatermark_fsize
    }
    if (highwatermark_data > old_highwatermark_data) {
      suffix .= sprintf(" [DATA %d -> %d]", old_highwatermark_data, highwatermark_data)
      old_highwatermark_data = highwatermark_data
    }
    if (highwatermark_stack > old_highwatermark_stack) {
      suffix .= sprintf(" [STACK %d -> %d]", old_highwatermark_stack, highwatermark_stack)
      old_highwatermark_stack = highwatermark_stack
    }
    if (highwatermark_core > old_highwatermark_core) {
      suffix .= sprintf(" [CORE %d -> %d]", old_highwatermark_core, highwatermark_core)
      old_highwatermark_core = highwatermark_core
    }
    if (highwatermark_nofile > old_highwatermark_nofile) {
      suffix .= sprintf(" [NOFILE %d -> %d]", old_highwatermark_nofile, highwatermark_nofile)
      old_highwatermark_nofile = highwatermark_nofile
    }
    if (highwatermark_as > old_highwatermark_as) {
      suffix .= sprintf(" [AS %d -> %d]", old_highwatermark_as, highwatermark_as)
      old_highwatermark_as = highwatermark_as
    }
    if (highwatermark_nproc > old_highwatermark_nproc) {
      suffix .= sprintf(" [NPROC %d -> %d]", old_highwatermark_nproc, highwatermark_nproc)
      old_highwatermark_nproc = highwatermark_nproc
    }
    if (highwatermark_memlock > old_highwatermark_memlock) {
      suffix .= sprintf(" [MEMLOCK %d -> %d]", old_highwatermark_memlock, highwatermark_memlock)
      old_highwatermark_memlock = highwatermark_memlock
    }
    if (highwatermark_sigpending > old_highwatermark_sigpending) {
      suffix .= sprintf(" [SIGPENDING %d -> %d]", old_highwatermark_sigpending, highwatermark_sigpending)
      old_highwatermark_sigpending = highwatermark_sigpending
    }
    if (highwatermark_msgqueue > old_highwatermark_msgqueue) {
      suffix .= sprintf(" [MSGQUEUE %d -> %d]", old_highwatermark_msgqueue, highwatermark_msgqueue)
      old_highwatermark_msgqueue = highwatermark_msgqueue
    }
    if (highwatermark_nice > old_highwatermark_nice) {
      suffix .= sprintf(" [NICE %d -> %d]", old_highwatermark_nice, highwatermark_nice)
      old_highwatermark_nice = highwatermark_nice
    }
    if (highwatermark_rtprio > old_highwatermark_rtprio) {
      suffix .= sprintf(" [RTPRIO %d -> %d]", old_highwatermark_rtprio, highwatermark_rtprio)
      old_highwatermark_rtprio = highwatermark_rtprio
    }
    
    foreach ([path+] in written_paths) {
      if (has_dirs == 0) {
        has_dirs = 1
        print_syscall = 1
        dirs = " [ReadWriteDirectories="
      }
      dirs .= path . " "
      all_written_paths[path]++
      if (protect_system == "full" && path == "/etc") {
        protect_system = "true"
        suffix .= " [ProtectSystem=true]"
      } else if (protect_system != "false" && path in protect_system_paths) {
        protect_system = "false"
        suffix .= " [ProtectSystem=false]"
      }
      if (protect_home != "false" && path in protect_home_paths) {
        protect_home = "false"
        suffix .= " [ProtectHome=false]"
      }
    }
    if (has_dirs) {
      dirs .= "]"
      suffix .= dirs
    }

    has_dirs = 0
    foreach ([path+] in accessed_paths) {
      if (has_dirs == 0) {
        has_dirs = 1
        print_syscall = 1
        dirs = " [InaccessibleDirectories=~"
      }
      dirs .= path . " "
      all_accessed_paths[path]++
      if (protect_home == "true" && path in protect_home_paths) {
        protect_home = "read-only"
        suffix .= " [ProtectHome=read-only]"
      }
    }
    if (has_dirs) {
      dirs .= "]"
      suffix .= dirs
    }

    if (!only_capability_use || print_syscall)
      printf("%s%s(%s) = %s%s\n",
             prefix,
             syscall_name, syscall_argstr, syscall_retstr,
             suffix)

    used_caps = 0
    missing_caps = 0
    used_afs = 0
    missing_afs = 0
    print_syscall = 0
    no_memory_deny_write_execute = 0
    delete accessed_devices
    delete accessed_paths
    delete written_paths

    delete thread_argstr[tid()]
  }

probe end
  {
    printf("\nSummary:\n")
    printf("CapabilityBoundingSet=%s\n", caps_to_str(all_used_caps))
    if (all_missing_caps)
      printf("# Consider also possibly missing CapabilityBoundingSet=%s\n",
             caps_to_str(all_missing_caps))
    printf("ProtectHome=%s\n", protect_home)
    printf("ProtectSystem=%s\n", protect_system)
    # No way to analyze if PrivateTmp could be used
    printf("DevicePolicy=strict\n")
    foreach ([type, dev+] in all_accessed_devices)
      printf("DeviceAllow=%s\n",
             dev_to_str(type, dev, all_accessed_devices[type, dev]))
    printf("# LimitFSIZE=%d\n", highwatermark_fsize)
    printf("# LimitDATA=%d\n", highwatermark_data)
    printf("# LimitSTACK=%d\n", highwatermark_stack)
    printf("# LimitCORE=%d\n", highwatermark_core)
    printf("# LimitNOFILE=%d\n", highwatermark_nofile)
    printf("# LimitAS=%d\n", highwatermark_as)
    printf("# LimitNPROC=%d\n", highwatermark_nproc)
    printf("# LimitMEMLOCK=%d\n", highwatermark_memlock)
    printf("# LimitSIGPENDING=%d\n", highwatermark_sigpending)
    printf("# LimitMSGQUEUE=%d\n", highwatermark_msgqueue)
    printf("# LimitNICE=%d\n", highwatermark_nice)
    printf("# LimitRTPRIO=%d\n", highwatermark_rtprio)
    printf("RestrictAddressFamilies=%s\n", afs_to_str(all_used_afs))
    if (all_missing_afs)
      printf("# Consider also possibly missing RestrictAddressFamilies=%s\n",
             afs_to_str(all_missing_afs))
    printf("MemoryDenyWriteExecute=%s\n", all_memory_deny_write_execute)
    printf("SystemCallFilter=")
    foreach ([syscall+] in used_syscalls)
      if (syscall in syscalls_for_seccomp)
        printf("%s ", syscalls_for_seccomp[syscall])
      else
        printf("%s ", syscall)

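    # Mark every candidate path that was actually accessed during the run.
    foreach ([path] in all_accessed_paths)
      if (path in inaccessibles)
        inaccessibles[path] = 1

    # Propose only the candidates that were never touched.
    foreach ([path+] in inaccessibles)
      if (inaccessibles[path] == 0)
        printf("\nInaccessibleDirectories=-%s", path)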

    printf("\nReadOnlyDirectories=/\nReadWriteDirectories=")
    foreach ([path+] in all_written_paths)
      printf("%s ", path)
    printf("\n")
  }
_______________________________________________
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel
