P.S: I know it's not an error but a warning, bad wording from my side there.
Right now the relay appears to be semi-stable, still consuming much
more memory than I remember from pre-2021 times, but that's fine,
nothing dangerous yet. At one point, traffic is at its peak with
80 Mbit/s, another time, it dips down to 16 Mbit/s for many minutes -
not sure if this is the attacker or simply tor compressing consensus
documents. The log is still spamming the warning mentioned above.

Best Regards,
William

2021-02-04 17:51 GMT, William Kane <[email protected]>:
> Hi community,
>
> Unfortunately my otherwise stable tor guard relay has recently lost
> its guard flag, once again, due to what I think is a new type of
> (D)DoS attack, either directly targeted towards my tor relay, or
> against some other relays inside the network, facilitated through my
> relay.
>
> It all started to go downhill one month ago, on January 10th of this
> year - the Linux kernel OOM killer decided to reap the tor process,
> multiple times in a row - it was violating the MaxMemInQueues setting
> of 732MB, the optimal value according to tor - on a virtual machine
> with a dedicated CPU core (and sadly, a lack of hardware AES
> acceleration, but that's off topic; by completely sandboxing and
> isolating the tor process, and then disabling all mitigations offered
> by the Linux kernel, I still managed to achieve a peak throughput of
> 10mb/s while keeping other users and processes safe and sound - this
> made the relay the fastest tor relay belonging to my AS... sorry, just
> bragging ;-)) which has 1024 megabytes of physical RAM, and a swap
> partition with a size of 512 megabytes (vm.swappiness initially was
> 90, I've changed it to 70).
>
> The first time this happened, it came out of nowhere, so I wasn't
> closely monitoring the metrics page of my relay - this led to a
> downtime of 3 days, which in turn led to the loss of the guard flag.
>
> Since then, traffic on my relay has been limited to traffic coming
> from other relays, it is now exclusively a middle-only relay - it did
> not recover from the attack, even though I managed to achieve longer
> consecutive uptimes by tweaking MaxMemInQueues, first down to 704,
> then 672, and now 640MB.
>
> The only log entry I ever saw before the tor process got reaped is the
> following one:
>
> Feb 04 12:47:08 *hostname_redacted* tor[224]: Feb 04 12:47:08.000
> [warn] Your computer is too slow to handle this many circuit creation
> requests! Please consider using the MaxAdvertisedBandwidth config
> option or choosing a more restricted exit policy. [93409 similar
> message(s) suppressed in last 60 seconds]
> Feb 04 12:48:08 *hostname_redacted* tor[224]: Feb 04 12:48:08.000
> [warn] Your computer is too slow to handle this many circuit creation
> requests! Please consider using the MaxAdvertisedBandwidth config
> option or choosing a more restricted exit policy. [42527 similar
> message(s) suppressed in last 60 seconds]
>
> As you can tell by the date, this was today - after 3 weeks and 1 day,
> this noon, the process got OOM-killed again - I instantly noticed it,
> logged into the machine, installed updates, updated my pacman
> mirrorlist, the usual stuff - then I rebooted the machine, only to log
> in 5 minutes later to see tor using 100% of CPU time, with the log
> file getting spammed by this error - it started only 3 seconds after
> the relay published its descriptor.
>
> Clearly, this is some sort of targeted attack: either it is aimed at
> my relay, or someone is abusing it to attack someone or something else
> inside or outside the tor network.
>
> I did read the recent information on attacks regarding DirAuths, but
> apparently a fix has been deployed on all of them, and checking the
> bandwidth stats of some of them, it seems to be working, somewhat.
>
> I wonder if this is the culprit here, this machine has been running
> tor relays since 2014, and I never had these problems with it before -
> even without lowering MaxMemInQueues.
>
> To me, that's just further proof that this is a targeted attack using
> my relay, and before you ask, I know some KVM hypervisors are
> oversold, but my hosting company stopped selling KVM machines with
> mechanical host HDDs a few years ago, so new customers can't be the
> reason for all of this.
>
> Is there anything I can do to make my relay as stable as possible
> until the attacker(s) stop(s) hammering my relay? This doesn't seem to
> get caught by the built-in DoS prevention, so I didn't try tweaking
> the associated config options.
>
> Maybe someone from the tor team has a patch I can try applying? I'm
> not completely up to date on the newest tor shenanigans and pitfalls,
> so maybe it has already been posted on this mailing list, but I don't
> feel like reading a bazillion messages.
>
> For the time being, MaxMemInQueues will stay at 640 megabytes, to
> (hopefully) at least keep the relay up and running, even though
> performance, for anyone unlucky enough to build a circuit through it,
> will be affected, severely (despite all this, I'm still pushing around
> 11TB of traffic a month, I just don't know how much of it is
> legitimate..) - it's my duty as a relay operator to guarantee the
> safety and usability of my tor relay, so I'm very eager to find a
> solution for this.
>
> Unfortunately, this is a live relay, otherwise I would probably try
> developing my own solution for the problem (including publishing it /
> making a pull request), but in this case repeated downtimes and
> restarts, which would be necessary when working on the source code,
> are absolutely not an option as they would possibly disrupt hundreds,
> if not thousands of clients.
>
> If anyone could point me in a direction, I'd really appreciate it.
>
> Thank you,
>
> William
>
_______________________________________________
tor-relays mailing list
[email protected]
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-relays
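
For anyone who wants to track how intense the warning flood is over time, here is a minimal sketch in Python that pulls the suppression counter out of log lines like the two quoted above and turns it into a per-second figure. The regular expression is only fitted to those two sample lines; the names `SUPPRESSED`, `SAMPLE` and `suppressed_rates` are made up for this example and are not part of tor. It also assumes each log entry sits on a single line, as it would in the journal rather than in the wrapped quote.

```python
import re

# Fitted to the two sample lines from this thread (an assumption, not a
# documented tor log format): timestamp with milliseconds, "[warn]", the
# message text, then the "[N similar message(s) suppressed ...]" suffix.
SUPPRESSED = re.compile(
    r"(?P<ts>[A-Z][a-z]{2} \d{2} \d{2}:\d{2}:\d{2})\.\d{3} \[warn\] "
    r"(?P<msg>.*?) \[(?P<count>\d+) similar message\(s\) suppressed "
    r"in last (?P<window>\d+) seconds\]"
)

# The two log entries from the original post, each joined onto one line.
SAMPLE = [
    "Feb 04 12:47:08 *hostname_redacted* tor[224]: Feb 04 12:47:08.000 "
    "[warn] Your computer is too slow to handle this many circuit creation "
    "requests! Please consider using the MaxAdvertisedBandwidth config "
    "option or choosing a more restricted exit policy. [93409 similar "
    "message(s) suppressed in last 60 seconds]",
    "Feb 04 12:48:08 *hostname_redacted* tor[224]: Feb 04 12:48:08.000 "
    "[warn] Your computer is too slow to handle this many circuit creation "
    "requests! Please consider using the MaxAdvertisedBandwidth config "
    "option or choosing a more restricted exit policy. [42527 similar "
    "message(s) suppressed in last 60 seconds]",
]


def suppressed_rates(lines):
    """Yield (timestamp, suppressed_messages_per_second) for matching lines."""
    for line in lines:
        match = SUPPRESSED.search(line)
        if match is None:
            continue  # not a rate-limited warning, skip it
        count = int(match.group("count"))
        window = int(match.group("window"))
        yield match.group("ts"), count / window


if __name__ == "__main__":
    for ts, rate in suppressed_rates(SAMPLE):
        print(f"{ts}: {rate:.1f} suppressed warnings per second")
```

For the two sample lines this works out to roughly 1,557 and 709 suppressed warnings per second. That is only a count of suppressed log messages, so treat it as a rough indicator of load and not as an exact number of circuit creation requests.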
