Hello,

Thank you for this information. After 24 hours of testing, these configurations brought Tor to a halt.
At first I started with the sysctl modifications. After a few hours with just that, there was no improvement in the ~75% inet_csk_bind_conflict utilization. I then installed Torutils for both IPv4 and IPv6. After only a couple of hours, Tor dropped to below 15 Mbps across both servers (40 relays). 16 hours later, Tor dropped below 2 Mbps.

I've removed all of these new settings and restarted.

--
Christopher Sheats (yawnbox)
Executive Director
Emerald Onion
Signal: +1 206.739.3390
Website: https://emeraldonion.org/
Mastodon: https://digitalcourage.social/@EmeraldOnion/

> On Dec 2, 2022, at 7:30 AM, Chris <[email protected]> wrote:
>
> Hi,
>
> As I'm sure you've already gathered, your system is maxing out trying to
> deal with all the connection requests. When inet_csk_get_port is called
> and the port is found to be occupied, then inet_csk_bind_conflict is
> called to resolve the conflict. So in normal circumstances you shouldn't
> see it in perf top, much less at 79%. There are two ways to deal with it,
> and each method should be complemented by the other. One way is to try
> to increase the number of ports and reduce the wait time, which you have
> somewhat tried. I would add the following:
>
> net.ipv4.tcp_fin_timeout = 20
> net.ipv4.tcp_max_tw_buckets = 1200
> net.ipv4.tcp_keepalive_time = 1200
> net.ipv4.tcp_syncookies = 1
> net.ipv4.tcp_max_syn_backlog = 8192
>
> The complementary method to the above is to lower the number of
> connection requests by removing the frivolous connection requests out of
> the equation using a few iptables rules.
>
> I'm assuming the increased load you're experiencing is due to the
> current DDoS attacks, and I'm not sure if you're using anything to
> mitigate that, but you should consider it.
>
> You may find something useful at the following links:
>
> [1](https://github.com/Enkidu-6/tor-ddos)
>
> [2](https://github.com/toralf/torutils)
>
> [background](https://gitlab.torproject.org/tpo/community/support/-/issues/40093)
>
> Cheers.
>
> On 12/1/2022 3:35 PM, Christopher Sheats wrote:
>> Hello tor-relays,
>>
>> We are using Ubuntu server currently for our exit relays.
>> Occasionally, exit throughput will drop from ~4Gbps down to ~200Mbps
>> and the only observable data point that we have is a significant
>> increase in inet_csk_bind_conflict, as seen via 'perf top', where it
>> will hit 85% [kernel] utilization.
>>
>> A while back we thought we solved this with two /etc/sysctl.conf settings:
>> net.ipv4.ip_local_port_range = 1024 65535
>> net.ipv4.tcp_tw_reuse = 1
>>
>> However we are still experiencing this problem.
>>
>> Both of our (currently, two) relay servers suffer from the same
>> problem, at the same time. They are AMD Epyc 7402P bare-metal servers
>> each with 96GB RAM, each has 20 exit relays on them. This issue
>> persists after upgrading to 0.4.7.11.
>>
>> Screenshots of perf top are shared
>> here: https://digitalcourage.social/@EmeraldOnion/109440197076214023
>>
>> Does anyone have experience troubleshooting and/or fixing this problem?
>>
>> Cheers,
>>
>> --
>> Christopher Sheats (yawnbox)
>> Executive Director
>> Emerald Onion
>> Signal: +1 206.739.3390
>> Website: https://emeraldonion.org/
>> Mastodon: https://digitalcourage.social/@EmeraldOnion/
>>
>> _______________________________________________
>> tor-relays mailing list
>> [email protected]
>> https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-relays
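
For anyone who wants to compare socket state before and after a sysctl change like the ones discussed above, here is a minimal sketch in Python. It is not something anyone in this thread ran. It assumes a Linux host, reads the standard /proc/net/tcp and /proc/net/tcp6 tables and the sysctl values under /proc/sys, and the function names (count_states, local_ports_in_range, read_sysctl) are made up for illustration. It does not measure inet_csk_bind_conflict itself; it only shows how many sockets sit in each TCP state and how many distinct local ports inside ip_local_port_range are occupied.

```python
from collections import Counter
from pathlib import Path

# State codes used in the "st" column of /proc/net/tcp and /proc/net/tcp6.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}

# The sysctl keys that came up in this thread.
SYSCTL_KEYS = [
    "net.ipv4.ip_local_port_range",
    "net.ipv4.tcp_tw_reuse",
    "net.ipv4.tcp_fin_timeout",
    "net.ipv4.tcp_max_tw_buckets",
    "net.ipv4.tcp_keepalive_time",
    "net.ipv4.tcp_syncookies",
    "net.ipv4.tcp_max_syn_backlog",
]


def _rows(proc_net_tcp_text):
    """Yield the whitespace-split fields of each socket row, skipping the header."""
    for line in proc_net_tcp_text.splitlines()[1:]:
        fields = line.split()
        if len(fields) >= 4:
            yield fields


def count_states(proc_net_tcp_text):
    """Count sockets per TCP state; unknown codes are kept as raw hex."""
    counts = Counter()
    for fields in _rows(proc_net_tcp_text):
        code = fields[3].upper()
        counts[TCP_STATES.get(code, code)] += 1
    return counts


def local_ports_in_range(proc_net_tcp_text, low, high):
    """Return the distinct local ports within [low, high] that have a socket."""
    ports = set()
    for fields in _rows(proc_net_tcp_text):
        # local_address looks like HEXADDR:HEXPORT for both IPv4 and IPv6.
        port = int(fields[1].rsplit(":", 1)[1], 16)
        if low <= port <= high:
            ports.add(port)
    return ports


def read_sysctl(key, root="/proc/sys"):
    """Read one sysctl value via /proc/sys, e.g. net.ipv4.tcp_tw_reuse."""
    return (Path(root) / key.replace(".", "/")).read_text().strip()


if __name__ == "__main__":
    for key in SYSCTL_KEYS:
        print(f"{key} = {read_sysctl(key)}")
    low, high = map(int, read_sysctl("net.ipv4.ip_local_port_range").split())
    for table in ("/proc/net/tcp", "/proc/net/tcp6"):
        if not Path(table).exists():
            continue  # e.g. IPv6 disabled on this host
        text = Path(table).read_text()
        print(table, dict(count_states(text)))
        print(f"  distinct local ports in {low}-{high}:",
              len(local_ports_in_range(text, low, high)))
```

Running it once before a change and again a few hours after gives a rough picture of whether TIME_WAIT counts or port usage moved at all, which may help separate the effect of the sysctl settings from the effect of the iptables rules.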
