Hi William

William Kane:
> Hi Peter,
>
>> Would be great if you could get details about the failing call.
>
> I already thought of gathering said details by tracing the process,
> but did not want to risk my uptime statistics, which would inevitably
> happen if I had to restart the server and service over and over (I
> disabled tracing globally through the Yama LSM as a security measure,
> i.e. kernel.yama.ptrace_scope == 3) - recently I lost the guard flag
> multiple times, caused by some sort of attack that I already reported
> on this list (tor-relays) - someone kept creating a fuckton of
> circuits through my relay (averaging 90k per minute), thus causing tor
> to run out of memory / get oom-killed by the kernel before it could
> even step in and close the circuits - if it was even trying to, it
> would make sense for the DoS mitigation code to be active only for the
> first link in the circuit aka the guard, and my node simply being a
> middle-only relay, it got completely stomped by said attack.
>
> After somewhat mitigating this attack by tweaking MaxMemInQueues,
> creating a bigger swap file and tuning vm.swappiness, I regained the
> guard flag, but then the hypervisor my KVM box is running under
> experienced some issues and had to be rebooted - once again, I
> received no notice of that until the relay was already offline for a
> few days, causing me to lose the guard flag again.
>
> Seems like luck is just not on my side these days, or well, it's been weeks
> now.

You could try running a second instance of Tor by copying the systemd
config and the Tor settings. You probably don't need to enable ORPort and
ControlPort to reproduce the issue.

>
>> You should simply see a Permission Denied if the capability is the problem.
>
> Here's a copy from stdout, only happening if Sandbox is set to 1.:
>
> Mar 15 20:15:20.000 [notice] Configured to measure statistics. Look
> for the *-stats files that will first be written to the data directory
> in 24 hours from now.
> Mar 15 20:15:21.000 [warn] fstat() on directory /var/lib/tor_debug failed.
> Mar 15 20:15:21.000 [err] Can't create/check datadirectory /var/lib/tor_debug
> Mar 15 20:15:21.000 [err] Error initializing keys; exiting
>
> Running it as a privileged user does not change thing, so no permissions
> issue:
>
> Mar 15 20:17:24.000 [notice] Configured to measure statistics. Look
> for the *-stats files that will first be written to the data directory
> in 24 hours from now.
> Mar 15 20:17:24.000 [warn] You are running Tor as root. You don't need
> to, and you probably shouldn't.
> Mar 15 20:17:25.000 [warn] fstat() on directory /var/lib/tor_debug failed.
> Mar 15 20:17:25.000 [err] Can't create/check datadirectory /var/lib/tor_debug
> Mar 15 20:17:25.000 [err] Error initializing keys; exiting
>
> I've traced down the origin of the fstat() call to this piece of code:
>
> https://github.com/torproject/tor/blob/master/src/lib/fs/dir.c#L158
>
> However, looking at the code that establishes and populates seccomp
> rules, it seems like fstat and it's 64 bit counterpart are not subject
> to (parameter) filtering, i.e. seccomp_rule_add_0 is invoked with the
> parameter SCMP_ACT_ALLOW, reading the manpage for seccomp_rule_add(3)
> reveals: "The seccomp filter will have no effect on the thread calling
> the syscall if it matches the filter rule."
>
> References:
>
> https://github.com/torproject/tor/blob/master/src/lib/sandbox/sandbox.c#L148
> https://github.com/torproject/tor/blob/master/src/lib/sandbox/sandbox.c#L1595
> https://man7.org/linux/man-pages/man3/seccomp_rule_add.3.html
>
> So, even though technically, seccomp should allow these syscalls to be
> invoked, no matter which parameters are passed, somehow enabling the
> whole sandbox subsystem still breaks fstat.

fstat() in the log above refers to the fstat() function in libc, but libc
can use any of several syscalls in the background to implement it. I could
find fstat, fstat64 and fstatat64, and newer kernels may offer even more
syscalls that could be used. Usually, when seccomp starts failing, it is
because a library (like libc) was updated and started using another
syscall to implement a function (like fstat()), or because the kernel was
updated, the library detected that, and it started using a new,
"improved" syscall.

To be sure which syscall is used, the auditd logs would be invaluable.
The performance impact should be negligible if you don't manually add any
auditing rules.

_______________________________________________
tor-relays mailing list
[email protected]
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-relays
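
As an illustration of the auditd suggestion above, here is a minimal Python sketch that scans audit log lines for records of type SECCOMP and reports which syscall numbers they name. Several things are assumptions and not taken from this thread: that the kernel actually emitted SECCOMP records for the process (this depends on the seccomp action and the audit setup), that the box is x86_64 (audit arch c000003e), and that the records carry the usual comm=, arch= and syscall= fields. The function name seccomp_syscalls and the small syscall table are hypothetical helpers for this example; the table lists only the stat family, including newfstatat and statx, which are further candidates beyond the three named above.

```python
import re

# Hand-picked subset of the x86_64 syscall table (audit arch c000003e).
# Only the stat family is listed; other architectures use other numbers.
X86_64_STAT_SYSCALLS = {
    4: "stat",
    5: "fstat",
    6: "lstat",
    262: "newfstatat",
    332: "statx",
}

# Pull the arch= and syscall= fields out of a single audit record.
SECCOMP_FIELDS = re.compile(
    r"\barch=(?P<arch>[0-9a-fA-F]+)\b.*?\bsyscall=(?P<nr>\d+)\b"
)


def seccomp_syscalls(lines, comm="tor"):
    """Return sorted (number, name) pairs from SECCOMP records for `comm`."""
    seen = set()
    for line in lines:
        # Skip everything that is not a SECCOMP record for the given command.
        if "type=SECCOMP" not in line or f'comm="{comm}"' not in line:
            continue
        match = SECCOMP_FIELDS.search(line)
        if match is None or match.group("arch").lower() != "c000003e":
            continue  # not x86_64, so the table above does not apply
        nr = int(match.group("nr"))
        seen.add((nr, X86_64_STAT_SYSCALLS.get(nr, "unknown (not in table)")))
    return sorted(seen)
```

It could be used as seccomp_syscalls(open("/var/log/audit/audit.log")), run as a user that may read the log. Any number reported as unknown has to be looked up in the syscall table for the running kernel and architecture.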
