On 01/27/15 01:54, Lennart Poettering wrote: > On Sun, 25.01.15 12:23, Topi Miettinen (toiwo...@gmail.com) wrote: > >> There's no need for CAP_CHOWN, CAP_DAC_OVERRIDE or CAP_FOWNER. > > Hmm, that's not true, is it? load_clock_timestamp() is invoked before > we drop privs in the daemon. And it certainly calls fchmod() and > fchown(), so that it can later on still access the clock touch file. > > What am I missing?
It works for me because the file exists already with correct owner and permissions. I'd put CAP_CHOWN and CAP_FOWNER back for now. Maybe the clock file could reside in a separate subdirectory, then the directory owner and permissions could be set up already at installation? > >> No new privileges are needed, especially no setuid fixups are >> expected. > > Yeah, we should probably set this for most of our daemons, not just > timesyncd. > > I am not entirely sure how NoNewPrivileges= and "no-setuid-fixup" > actually relate to each other. I.e. what does one do that the other > doesn't? And if they do different things, isn't NoNewPrivileges= kinda > a superset of no-setuid-fixup, and thus makes setting the latter > redundant? I think NoNewPrivileges (prctl(PR_SET_NO_NEW_PRIVS)) prevents any kind of uid/gid changes, while no-setuid-fixup does not restrict uid change but does not elevate the capability set. > > Reading through Documentation/prctl/no_new_privs.txt in the > kernel sources I found this paragraph: > > "Be careful, though: LSMs might also not tighten constraints on > exec in no_new_privs mode. (This means that setting up a > general-purpose service launcher to set no_new_privs before > execing daemons may interfere with LSM-based sandboxing.)" > > This kinda suggests that SELinux and friends might not like > NoNewPrivileges=, hence I am not entirely sure it is really the right > option for us. > > Hence I am wondering if "no-setuid-fixup" (and -locked) might be the > better option for us to enable everywhere? I don't think timesyncd is executing any helper programs, especially set-uid or with file system capabilities, so both should work fine. > >> Device policy can be closed, timesyncd does not access any devices. > > Note that we set PrivateDevices=yes already, which implies > DevicePolicy=closed. > >> Timesyncd only needs write access to /var/lib/systemd/clock. There's no >> need to access /boot nor most API filesystems. > > Well, /boot is already covered by ProtectSystem=full actually (but > this wasn't documented, admittedly. Updated the man page now in gate.) > > I am a bit concerned about listing by default all these directories in > the unit file. I mean, the reason why ProtectSystem=, ProtectHome= and > suchlike exists is precisely to break this down to a few booleans (or > boolean-like enums), instead of listing all directories in all unit > files all the time. I mean, it's great if people do this on their > local systems, but to make this palatable generically upstream I > created ProtectSystem= and ProtectHome= to hide the directory lists > away. > > The problem with these lists is that we need to be really conservative > with them, since many of the libs we use (including glibc) use > different interface "behind our back", and thus writing generally > valid rules, that work on all archs, on all distros, on all > glibc/libary versions is hard. This is what SELinux policy does, but I > *really* didn't want to create another implementation of such a > finegrained policy, if you follow what I mean. I'm not sure. Shouldn't we then ship a SELinux policy file then? Would you be interested in AppArmor profile for timesyncd, I have one? Also, if a distro uses weird SELinux policies or setuid helpers at every possible opportunity, shouldn't they have some responsibility of fixing their setup? > >> Install a system call filter, generated with 'strace -c'. > > Hmm, I am a bit concerned about this part, I am not sure we can really > do this upstream. Different glibc and other library versions we use > invoke different syscalls. Moreover, different archs use different > syscalls, too. This makes it really difficult putting together syscall > filters upsteram that work generically, on all downstream versions. > > I think SystemCallFilter= is something that end-users can use on their > systems, but I am not sure we can use this upstream. :-( Right, this was generated on x86_64. > >> Only one additional process is needed. > > Hmm, LimitNPROC=2 unfortunately doesn't do what we want it to do: > since the daemon drops the to the systemd-timesync user on its own, > PID 1 invokes it as root, not as the the systemd-timesync user, which > makes the excercise pointless... We should probably handle this better > though, and warn, instead of silently accepting it. Added that to the > TODO list now. This probably applies to other limits too? > > I have now changed the timesyncd code to do the right thing on its own > after dropping privs. During runtime RLIMIT_NPROC=2 should now be set. > >> Mounts should not propagate back, so set the MountFlags to slave >> (actually default since we use e.g. PrivateTmp). > > This is implied by PrivateTmp=, indeed, hence unnecessary. > > Lennart > _______________________________________________ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel