[tor-relays] 1 circuit using 1.5Gig or ram? [0.3.3.2-alpha]
On 12 Feb (19:44:02 UTC), David Goulet wrote:
> Wow... 1599323088 bytes is insane. This should _not_ happen for only 1
> circuit. We actually have checks in place to avoid this but it seems they
> either totally failed or we have an edge case.
>
> Can you tell me what scheduler you were using (look for "Scheduler" in the
> notice log).
>
> Any warnings in the logs that you could share, or was everything normal?
>
> Finally, if you can, share the OS you are running this relay on and, if
> Linux, the kernel version.

Don't know if it's relevant, but my relay was hit in similar fashion in December. It was running 0.2.9.14 (no KIST) on Linux at the time (no other related log messages; MaxMemInQueues=1GB, reduced from 2GB after an OOM termination):

Dec 15 15:28:52 Tor[]: assign_to_cpuworker failed. Ignoring.
Dec 15 15:48:16 Tor[]: assign_to_cpuworker failed. Ignoring.
Dec 15 16:39:44 Tor[]: We're low on memory. Killing circuits with over-long queues. (This behavior is controlled by MaxMemInQueues.)
Dec 15 17:39:45 Tor[]: Removed 442695264 bytes by killing 1 circuits; 18766 circuits remain alive. Also killed 0 non-linked directory connections.
Dec 15 19:03:22 Tor[]: We're low on memory. Killing circuits with over-long queues. (This behavior is controlled by MaxMemInQueues.)
Dec 15 19:03:23 Tor[]: Removed 1060505952 bytes by killing 1 circuits; 19865 circuits remain alive. Also killed 0 non-linked directory connections.

More recently (and more reasonably, with MaxMemInQueues=512MB), running 0.3.2.9:

Feb 4 20:12:39 Tor[]: Scheduler type KIST has been enabled.
Feb 6 08:12:41 Tor[]: Heartbeat: Tor's uptime is 1 day 11:59 hours. I've sent 29.00 MB and received 364.99 MB.
Feb 6 14:04:43 Tor[]: We're low on memory. Killing circuits with over-long queues. (This behavior is controlled by MaxMemInQueues.)
Feb 6 14:04:43 Tor[]: Removed 166298880 bytes by killing 2 circuits; 20213 circuits remain alive. Also killed 0 non-linked directory connections.
Feb 6 14:11:17 Tor[]: Heartbeat: Tor's uptime is 1 day 17:59 hours, with 20573 circuits open. I've sent 910.29 GB and received 902.58 GB.
Feb 6 14:11:17 Tor[]: Circuit handshake stats since last time: 1876499/3018306 TAP, 4322015/4322131 NTor.
Feb 6 14:11:17 Tor[]: Since startup, we have initiated 0 v1 connections, 0 v2 connections, 1 v3 connections, and 23846 v4 connections; and received 6 v1 connections, 7844 v2 connections, 11906 v3 connections, and 214565 v4 connections.
Feb 6 14:12:41 Tor[]: Heartbeat: Tor's uptime is 1 day 17:59 hours. I've sent 31.62 MB and received 420.63 MB.
Feb 6 14:22:50 Tor[]: We're low on memory. Killing circuits with over-long queues. (This behavior is controlled by MaxMemInQueues.)
Feb 6 14:22:50 Tor[]: Removed 181501584 bytes by killing 2 circuits; 19078 circuits remain alive. Also killed 0 non-linked directory connections.
Feb 6 15:01:50 Tor[]: We're low on memory. Killing circuits with over-long queues. (This behavior is controlled by MaxMemInQueues.)
Feb 6 15:01:50 Tor[]: Removed 105918912 bytes by killing 1 circuits; 19679 circuits remain alive. Also killed 0 non-linked directory connections.
Feb 6 15:46:24 Tor[]: Channel padding timeout scheduled 157451ms in the past.
Feb 6 19:30:36 Tor[]: new bridge descriptor 'Binnacle' (fresh): $4F0DB7E687FC7C0AE55C8F243DA8B0EB27FBF1F2~Binnacle at 108.53.208.157
Feb 6 20:11:17 Tor[]: Heartbeat: Tor's uptime is 1 day 23:59 hours, with 18045 circuits open. I've sent 1043.74 GB and received 1034.65 GB.
Feb 6 20:11:17 Tor[]: Circuit handshake stats since last time: 260970/368918 TAP, 3957087/3957791 NTor.

Perhaps this indicates some newer KIST mitigation logic is effective.

___
tor-relays mailing list
tor-relays@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-relays
Re: [tor-relays] 1 circuit using 1.5Gig or ram? [0.3.3.2-alpha]
On 12 Feb (21:14:14), Stijn Jonker wrote:
> Hi David,
>
> On 12 Feb 2018, at 20:44, David Goulet wrote:
>
> > On 12 Feb (20:09:35), Stijn Jonker wrote:
> >> Hi all,
> >>
> >> So in general 0.3.3.1-alpha-dev and 0.3.3.2-alpha running on two nodes
> >> without any connection limits on the iptables firewall seem to be a lot
> >> more robust against the recent increase in clients (or a possible
> >> [D]DoS). But tonight, for a short period of time, one of the relays was
> >> running a bit "hot", so to say.
> >>
> >> Only to be greeted by this log entry:
> >> Feb 12 18:54:55 tornode2 Tor[6362]: We're low on memory (cell queues total alloc: 1602579792 buffer total alloc: 1388544, tor compress total alloc: 1586784 rendezvous cache total alloc: 489909). Killing circuits with over-long queues. (This behavior is controlled by MaxMemInQueues.)
> >> Feb 12 18:54:56 tornode2 Tor[6362]: Removed 1599323088 bytes by killing 1 circuits; 39546 circuits remain alive. Also killed 0 non-linked directory connections.
> >
> > Wow... 1599323088 bytes is insane. This should _not_ happen for only 1
> > circuit. We actually have checks in place to avoid this but it seems they
> > either totally failed or we have an edge case.
>
> Yeah, it felt a "bit" much. A couple of megs I wouldn't have shared :-)
>
> > Can you tell me what scheduler you were using (look for "Scheduler" in
> > the notice log).
>
> The scheduler always seems to be KIST (I never played with it/tried to change it):
> Feb 11 19:58:24 tornode2 Tor[6362]: Scheduler type KIST has been enabled.
>
> > Any warnings in the logs that you could share, or was everything normal?
>
> Besides the ESXi host giving an alarm about CPU usage, nothing odd in the
> logs around that time that I could find.
> The general syslog logging worked both locally on the host and remotely, as
> the hourly cron jobs surround this entry.
>
> > Finally, if you can, share the OS you are running this relay on and, if
> > Linux, the kernel version.
>
> Debian Stretch, Linux tornode2 4.9.0-5-amd64 #1 SMP Debian 4.9.65-3+deb9u2
> (2018-01-04) x86_64 GNU/Linux
> Not sure it matters, but it's an ESXi-based VM, running with 2 vCPUs based
> on an i5-5300U and 4 GB of memory.
>
> No problem, happy to squash bugs. I guess that's one of the "musts" when
> running alpha code, although this might not be alpha-related (I can't judge).

Thanks for all the information! I've opened https://bugs.torproject.org/25226

Cheers!
David
Re: [tor-relays] 1 circuit using 1.5Gig or ram? [0.3.3.2-alpha]
Hi David,

On 12 Feb 2018, at 20:44, David Goulet wrote:

> On 12 Feb (20:09:35), Stijn Jonker wrote:
>> Hi all,
>>
>> So in general 0.3.3.1-alpha-dev and 0.3.3.2-alpha running on two nodes
>> without any connection limits on the iptables firewall seem to be a lot
>> more robust against the recent increase in clients (or a possible [D]DoS).
>> But tonight, for a short period of time, one of the relays was running a
>> bit "hot", so to say.
>>
>> Only to be greeted by this log entry:
>> Feb 12 18:54:55 tornode2 Tor[6362]: We're low on memory (cell queues total alloc: 1602579792 buffer total alloc: 1388544, tor compress total alloc: 1586784 rendezvous cache total alloc: 489909). Killing circuits with over-long queues. (This behavior is controlled by MaxMemInQueues.)
>> Feb 12 18:54:56 tornode2 Tor[6362]: Removed 1599323088 bytes by killing 1 circuits; 39546 circuits remain alive. Also killed 0 non-linked directory connections.
>
> Wow... 1599323088 bytes is insane. This should _not_ happen for only 1
> circuit. We actually have checks in place to avoid this but it seems they
> either totally failed or we have an edge case.

Yeah, it felt a "bit" much. A couple of megs I wouldn't have shared :-)

> Can you tell me what scheduler you were using (look for "Scheduler" in the
> notice log).

The scheduler always seems to be KIST (I never played with it/tried to change it):
Feb 11 19:58:24 tornode2 Tor[6362]: Scheduler type KIST has been enabled.

> Any warnings in the logs that you could share, or was everything normal?

Besides the ESXi host giving an alarm about CPU usage, nothing odd in the logs around that time that I could find. The general syslog logging worked both locally on the host and remotely, as the hourly cron jobs surround this entry.

> Finally, if you can, share the OS you are running this relay on and, if
> Linux, the kernel version.
Debian Stretch, Linux tornode2 4.9.0-5-amd64 #1 SMP Debian 4.9.65-3+deb9u2 (2018-01-04) x86_64 GNU/Linux
Not sure it matters, but it's an ESXi-based VM, running with 2 vCPUs based on an i5-5300U and 4 GB of memory.

No problem, happy to squash bugs. I guess that's one of the "musts" when running alpha code, although this might not be alpha-related (I can't judge).

Thx,
Stijn
Re: [tor-relays] 1 circuit using 1.5Gig or ram? [0.3.3.2-alpha]
Hi Tor & Others,

On 12 Feb 2018, at 20:29, tor wrote:
> I see this occasionally. It's not specific to 0.3.3.x. I reported it back
> in October 2017:

Thx; I more or less added the version in the subject to clearly indicate it was on an alpha release.

> https://lists.torproject.org/pipermail/tor-relays/2017-October/013328.html
>
> Roger replied here:
> https://lists.torproject.org/pipermail/tor-relays/2017-October/013334.html

Ah, thanks; not sure why my Google kung-fu missed this one.

> MaxMemInQueues is set to 1.5 GB by default, which is why the problematic
> circuit uses that much RAM before it's killed. You can lower MaxMemInQueues
> in torrc, however that will obviously have other impacts on your relay. If
> you have plenty of RAM, I'd maybe just leave things alone for now since Tor
> is already killing the circuit.

My tor nodes have 4 GB of RAM, so I had also put MaxMemInQueues at 1.5 GB whilst the (D)DoS attacks were more troublesome (I wasn't aware it was the default).

> I agree in theory some mitigation against this would be nice, but I'm not
> smart enough to offer anything specific. It seems Roger and other devs are
> already thinking about the issue.

Not a coder myself (except some scripting).

For those looking for the paper as well: the original URL gives a 403. I believe this is a copy (I can't check for alterations or omitted slides, of course):
http://www.robgjansen.com/talks/sniper-dcaps-20131011.pdf

Thx,
Stijn
Re: [tor-relays] 1 circuit using 1.5Gig or ram? [0.3.3.2-alpha]
On 12 Feb (20:09:35), Stijn Jonker wrote:
> Hi all,
>
> So in general 0.3.3.1-alpha-dev and 0.3.3.2-alpha running on two nodes
> without any connection limits on the iptables firewall seem to be a lot
> more robust against the recent increase in clients (or a possible [D]DoS).
> But tonight, for a short period of time, one of the relays was running a
> bit "hot", so to say.
>
> Only to be greeted by this log entry:
> Feb 12 18:54:55 tornode2 Tor[6362]: We're low on memory (cell queues total alloc: 1602579792 buffer total alloc: 1388544, tor compress total alloc: 1586784 rendezvous cache total alloc: 489909). Killing circuits with over-long queues. (This behavior is controlled by MaxMemInQueues.)
> Feb 12 18:54:56 tornode2 Tor[6362]: Removed 1599323088 bytes by killing 1 circuits; 39546 circuits remain alive. Also killed 0 non-linked directory connections.

Wow... 1599323088 bytes is insane. This should _not_ happen for only 1 circuit. We actually have checks in place to avoid this but it seems they either totally failed or we have an edge case.

Can you tell me what scheduler you were using (look for "Scheduler" in the notice log).

Any warnings in the logs that you could share, or was everything normal?

Finally, if you can, share the OS you are running this relay on and, if Linux, the kernel version.

Big thanks!
David
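The scheduler check asked for above is a one-line grep against the notice-level log. The real path depends on the "Log notice file" line in your torrc (the path below is an assumption); a sample log stands in for it in this self-contained sketch:

```shell
# Demonstration: what the line looks like and how to find it. On a real
# relay you would grep your actual notice log, e.g.
#   grep "Scheduler" /var/log/tor/notices.log   # path is an assumption
printf '%s\n' \
  'Feb 11 19:58:24 tornode2 Tor[6362]: Scheduler type KIST has been enabled.' \
  > sample-notices.log
grep "Scheduler" sample-notices.log
```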
Re: [tor-relays] 1 circuit using 1.5Gig or ram? [0.3.3.2-alpha]
I see this occasionally. It's not specific to 0.3.3.x. I reported it back in October 2017:
https://lists.torproject.org/pipermail/tor-relays/2017-October/013328.html

Roger replied here:
https://lists.torproject.org/pipermail/tor-relays/2017-October/013334.html

MaxMemInQueues is set to 1.5 GB by default, which is why the problematic circuit uses that much RAM before it's killed. You can lower MaxMemInQueues in torrc, however that will obviously have other impacts on your relay. If you have plenty of RAM, I'd maybe just leave things alone for now since Tor is already killing the circuit.

I agree in theory some mitigation against this would be nice, but I'm not smart enough to offer anything specific. It seems Roger and other devs are already thinking about the issue.
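For reference, the torrc change discussed above is a single line; the 512 MB value here is purely illustrative (several posters in this thread used 512 MB or 1 GB), not a recommendation:

```
## Illustrative torrc fragment: cap Tor's queued-cell memory below the
## 1.5 GB default mentioned in this thread. Pick a value suited to the
## relay's total RAM; too low a cap will kill legitimate busy circuits.
MaxMemInQueues 512 MB
```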
[tor-relays] 1 circuit using 1.5Gig or ram? [0.3.3.2-alpha]
Hi all,

So in general 0.3.3.1-alpha-dev and 0.3.3.2-alpha running on two nodes without any connection limits on the iptables firewall seem to be a lot more robust against the recent increase in clients (or a possible [D]DoS). But tonight, for a short period of time, one of the relays was running a bit "hot", so to say.

Only to be greeted by this log entry:
Feb 12 18:54:55 tornode2 Tor[6362]: We're low on memory (cell queues total alloc: 1602579792 buffer total alloc: 1388544, tor compress total alloc: 1586784 rendezvous cache total alloc: 489909). Killing circuits with over-long queues. (This behavior is controlled by MaxMemInQueues.)
Feb 12 18:54:56 tornode2 Tor[6362]: Removed 1599323088 bytes by killing 1 circuits; 39546 circuits remain alive. Also killed 0 non-linked directory connections.
Feb 12 19:04:10 tornode2 Tor[6362]: Your network connection speed appears to have changed. Resetting timeout to 60s after 18 timeouts and 1000 buildtimes.

So one circuit being able to claim 1.5 GB of RAM seems a bit much, while the DoS protection does seem to do something (see below). Now this could be a new attack or just an error, etc. However, wouldn't some sort of fair memory balance between circuits be another mitigation factor to consider? Not saying it should be as strict as "circuit memory"/"# of circuits", but 99.x% of memory for one circuit feels wrong for a relay.

Feb 12 13:58:34 tornode2 Tor[6362]: DoS mitigation since startup: 910770 circuits rejected, 10 marked addresses. 25972 connections closed. 324 single hop clients refused.
Feb 12 19:58:34 tornode2 Tor[6362]: DoS mitigation since startup: 1222320 circuits rejected, 12 marked addresses. 33359 connections closed. 402 single hop clients refused.

Thx,
Stijn
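The "Killing circuits with over-long queues" log lines quoted in this thread come from Tor's out-of-memory handler, which frees memory by tearing down the circuits with the largest cell queues until usage drops under MaxMemInQueues. A minimal Python sketch of that strategy (illustrative only, not Tor's actual C implementation; all names are made up):

```python
# Illustrative sketch of a MaxMemInQueues-style OOM handler: when total
# queued bytes exceed the cap, kill circuits largest-queue-first until
# usage falls below the cap. Not Tor's actual code.

def kill_overlong_queues(circuits, max_mem_in_queues):
    """circuits: dict mapping circuit id -> queued bytes.
    Returns (bytes_freed, killed_ids, surviving_circuits)."""
    total = sum(circuits.values())
    killed, freed = [], 0
    # Visit circuits from largest queue to smallest.
    for circ_id, queued in sorted(circuits.items(), key=lambda kv: -kv[1]):
        if total - freed <= max_mem_in_queues:
            break
        freed += queued
        killed.append(circ_id)
    survivors = {c: q for c, q in circuits.items() if c not in set(killed)}
    return freed, killed, survivors

# One pathological circuit dominates memory, as in the reported logs:
circuits = {"c1": 1_599_323_088, "c2": 5_000, "c3": 12_000}
freed, killed, survivors = kill_overlong_queues(circuits, 512 * 1024 * 1024)
# Killing the single huge circuit is enough to get back under the cap,
# matching the "Removed 1599323088 bytes by killing 1 circuits" pattern.
```

This largest-first ordering is why the log reports freeing enormous byte counts by killing only one or two circuits: a single pathological queue accounts for nearly all queued memory.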