[tor-relays] 1 circuit using 1.5Gig or ram? [0.3.3.2-alpha]

2018-02-12 Thread starlight . 2017q4
On 12 Feb (19:44:02 UTC), David Goulet wrote:
>Wow... 1599323088 bytes is insane. This should _not_ happen for only 1
>circuit. We actually have checks in place to avoid this but it seems they
>either totally failed or we have an edge case.
>
>Can you tell me which scheduler you were using (look for "Scheduler" in the
>notice log)?
>
>Were there any warnings in the logs that you could share, or was everything
>normal?
>
>Finally, could you share the OS this relay runs on and, if Linux, the
>kernel version?


Don't know if it's relevant but my relay was hit in similar fashion in December.
Running 0.2.9.14 (no KIST) on Linux at the time (no other related log messages,
MaxMemInQueues=1GB reduced from 2GB after OOM termination):

Dec 15 15:28:52 Tor[]: assign_to_cpuworker failed. Ignoring.
Dec 15 15:48:16 Tor[]: assign_to_cpuworker failed. Ignoring.
Dec 15 16:39:44 Tor[]: We're low on memory.  Killing circuits with over-long 
queues. (This behavior is controlled by MaxMemInQueues.)
Dec 15 17:39:45 Tor[]: Removed 442695264 bytes by killing 1 circuits; 18766 
circuits remain alive. Also killed 0 non-linked directory connections.
Dec 15 19:03:22 Tor[]: We're low on memory.  Killing circuits with over-long 
queues. (This behavior is controlled by MaxMemInQueues.)
Dec 15 19:03:23 Tor[]: Removed 1060505952 bytes by killing 1 circuits; 19865 
circuits remain alive. Also killed 0 non-linked directory connections.

More recently (and more reasonably, MaxMemInQueues=512MB), running 0.3.2.9:

Feb  4 20:12:39 Tor[]: Scheduler type KIST has been enabled.
Feb  6 08:12:41 Tor[]: Heartbeat: Tor's uptime is 1 day 11:59 hours. I've sent 
29.00 MB and received 364.99 MB.
Feb  6 14:04:43 Tor[]: We're low on memory.  Killing circuits with over-long 
queues. (This behavior is controlled by MaxMemInQueues.)
Feb  6 14:04:43 Tor[]: Removed 166298880 bytes by killing 2 circuits; 20213 
circuits remain alive. Also killed 0 non-linked directory connections.
Feb  6 14:11:17 Tor[]: Heartbeat: Tor's uptime is 1 day 17:59 hours, with 20573 
circuits open. I've sent 910.29 GB and received 902.58 GB.
Feb  6 14:11:17 Tor[]: Circuit handshake stats since last time: 1876499/3018306 
TAP, 4322015/4322131 NTor.
Feb  6 14:11:17 Tor[]: Since startup, we have initiated 0 v1 connections, 0 v2 
connections, 1 v3 connections, and 23846 v4 connections; and received 6 v1 
connections, 7844 v2 connections, 11906 v3 connections, and 214565 v4 
connections.
Feb  6 14:12:41 Tor[]: Heartbeat: Tor's uptime is 1 day 17:59 hours. I've sent 
31.62 MB and received 420.63 MB.
Feb  6 14:22:50 Tor[]: We're low on memory.  Killing circuits with over-long 
queues. (This behavior is controlled by MaxMemInQueues.)
Feb  6 14:22:50 Tor[]: Removed 181501584 bytes by killing 2 circuits; 19078 
circuits remain alive. Also killed 0 non-linked directory connections.
Feb  6 15:01:50 Tor[]: We're low on memory.  Killing circuits with over-long 
queues. (This behavior is controlled by MaxMemInQueues.)
Feb  6 15:01:50 Tor[]: Removed 105918912 bytes by killing 1 circuits; 19679 
circuits remain alive. Also killed 0 non-linked directory connections.
Feb  6 15:46:24 Tor[]: Channel padding timeout scheduled 157451ms in the past. 
Feb  6 19:30:36 Tor[]: new bridge descriptor 'Binnacle' (fresh): 
$4F0DB7E687FC7C0AE55C8F243DA8B0EB27FBF1F2~Binnacle at 108.53.208.157
Feb  6 20:11:17 Tor[]: Heartbeat: Tor's uptime is 1 day 23:59 hours, with 18045 
circuits open. I've sent 1043.74 GB and received 1034.65 GB.
Feb  6 20:11:17 Tor[]: Circuit handshake stats since last time: 260970/368918 
TAP, 3957087/3957791 NTor.

Perhaps this indicates some newer KIST mitigation logic is effective.
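
To compare the December and February excerpts, the per-circuit figures can be
pulled out of the "Removed ... bytes" lines mechanically. A minimal Python
sketch, assuming the notice-log wording shown above; the function name
summarize_oom_kills and the regular expression are illustrative and not part
of Tor:

```python
import re

# Matches the OOM-handler summary line as it appears in the logs above.
REMOVED_RE = re.compile(
    r"Removed (\d+) bytes by killing (\d+) circuits; \d+ circuits remain alive"
)

def summarize_oom_kills(log_text: str) -> list[tuple[int, int, float]]:
    """Return (bytes_removed, circuits_killed, MiB_per_killed_circuit) per event."""
    # The mailed logs are hard-wrapped, so collapse all whitespace before matching.
    flat = " ".join(log_text.split())
    events = []
    for m in REMOVED_RE.finditer(flat):
        removed, killed = int(m.group(1)), int(m.group(2))
        per_circuit_mib = removed / killed / 2**20 if killed else 0.0
        events.append((removed, killed, per_circuit_mib))
    return events

# Two events copied from the excerpts above, still wrapped as in the mail.
sample = """\
Dec 15 19:03:23 Tor[]: Removed 1060505952 bytes by killing 1 circuits; 19865
circuits remain alive. Also killed 0 non-linked directory connections.
Feb  6 14:04:43 Tor[]: Removed 166298880 bytes by killing 2 circuits; 20213
circuits remain alive. Also killed 0 non-linked directory connections.
"""

events = summarize_oom_kills(sample)
for removed, killed, mib in events:
    print(f"{removed} bytes / {killed} circuit(s) = {mib:.1f} MiB each")
```

On these two events the killed circuits average roughly 1011 MiB (December,
0.2.9.14) versus roughly 79 MiB each (February, 0.3.2.9 with KIST), although
MaxMemInQueues also differed between the two periods.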

___
tor-relays mailing list
tor-relays@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-relays


Re: [tor-relays] 1 circuit using 1.5Gig or ram? [0.3.3.2-alpha]

2018-02-12 Thread David Goulet
On 12 Feb (21:14:14), Stijn Jonker wrote:
> Hi David,
> 
> On 12 Feb 2018, at 20:44, David Goulet wrote:
> 
> > On 12 Feb (20:09:35), Stijn Jonker wrote:
> >> Hi all,
> >>
> >> So in general 0.3.3.1-alpha-dev and 0.3.3.2-alpha running on two nodes
> >> without any connection limits on the iptables firewall seem to be a lot
> >> more robust against the recent increase in clients (or possible [D]DoS).
> >> But tonight for a short period of time one of the relays was running a bit
> >> "hot" so to speak.
> >>
> >> Only to be greeted by this log entry:
> >> Feb 12 18:54:55 tornode2 Tor[6362]: We're low on memory (cell queues total
> >> alloc: 1602579792 buffer total alloc: 1388544, tor compress total alloc:
> >> 1586784 rendezvous cache total alloc: 489909). Killing circuits
> >> with over-long queues. (This behavior is controlled by MaxMemInQueues.)
> >> Feb 12 18:54:56 tornode2 Tor[6362]: Removed 1599323088 bytes by killing 1
> >> circuits; 39546 circuits remain alive. Also killed 0 non-linked directory
> >> connections.
> >
> > Wow... 1599323088 bytes is insane. This should _not_ happen for only 1
> > circuit. We actually have checks in place to avoid this but it seems they
> > either totally failed or we have an edge case.
> Yeah it felt a "bit" much. A couple megs I wouldn't have shared :-)
> 
> > Can you tell me which scheduler you were using (look for "Scheduler" in the
> > notice log)?
> 
> The scheduler always seems to be KIST (never played with it/tried to change
> it)
> Feb 11 19:58:24 tornode2 Tor[6362]: Scheduler type KIST has been enabled.
> 
> > Were there any warnings in the logs that you could share, or was everything
> > normal?
> Besides the ESXi host raising an alarm about CPU usage, I could find nothing
> odd in the logs around that time.
> General syslog logging was working both locally on the host and remotely, as
> the hourly cron job entries surround this one.
> 
> 
> > Finally, could you share the OS this relay runs on and, if Linux, the
> > kernel version?
> 
> Debian Stretch, Linux tornode2 4.9.0-5-amd64 #1 SMP Debian 4.9.65-3+deb9u2 
> (2018-01-04) x86_64 GNU/Linux
> not sure it matters, but ESXi based VM, running with 2 vCPU's based on 
> i5-5300U, 4 Gig of memory
> 
> No problems, happy to squash bugs. I guess one of the "musts" when running 
> Alpha code, although this might not be alpha related (I can't judge).

Thanks for all the information!

I've opened https://bugs.torproject.org/25226

Cheers!
David

-- 
1xYrq8XhE25CKCQqvcX/cqKg04v1HthMMM3PwaRqqdU=




Re: [tor-relays] 1 circuit using 1.5Gig or ram? [0.3.3.2-alpha]

2018-02-12 Thread Stijn Jonker
Hi David,

On 12 Feb 2018, at 20:44, David Goulet wrote:

> On 12 Feb (20:09:35), Stijn Jonker wrote:
>> Hi all,
>>
>> So in general 0.3.3.1-alpha-dev and 0.3.3.2-alpha running on two nodes
>> without any connection limits on the iptables firewall seem to be a lot
>> more robust against the recent increase in clients (or possible [D]DoS). But
>> tonight for a short period of time one of the relays was running a bit "hot"
>> so to speak.
>>
>> Only to be greeted by this log entry:
>> Feb 12 18:54:55 tornode2 Tor[6362]: We're low on memory (cell queues total
>> alloc: 1602579792 buffer total alloc: 1388544, tor compress total alloc:
>> 1586784 rendezvous cache total alloc: 489909). Killing circuits
>> with over-long queues. (This behavior is controlled by MaxMemInQueues.)
>> Feb 12 18:54:56 tornode2 Tor[6362]: Removed 1599323088 bytes by killing 1
>> circuits; 39546 circuits remain alive. Also killed 0 non-linked directory
>> connections.
>
> Wow... 1599323088 bytes is insane. This should _not_ happen for only 1
> circuit. We actually have checks in place to avoid this but it seems they
> either totally failed or we have an edge case.
Yeah it felt a "bit" much. A couple megs I wouldn't have shared :-)

> Can you tell me which scheduler you were using (look for "Scheduler" in the
> notice log)?

The scheduler always seems to be KIST (never played with it/tried to change it)
Feb 11 19:58:24 tornode2 Tor[6362]: Scheduler type KIST has been enabled.

> Were there any warnings in the logs that you could share, or was everything
> normal?
Besides the ESXi host raising an alarm about CPU usage, I could find nothing
odd in the logs around that time.
General syslog logging was working both locally on the host and remotely, as
the hourly cron job entries surround this one.


> Finally, could you share the OS this relay runs on and, if Linux, the
> kernel version?

Debian Stretch, Linux tornode2 4.9.0-5-amd64 #1 SMP Debian 4.9.65-3+deb9u2 
(2018-01-04) x86_64 GNU/Linux
not sure it matters, but ESXi based VM, running with 2 vCPU's based on 
i5-5300U, 4 Gig of memory

No problems, happy to squash bugs. I guess one of the "musts" when running 
Alpha code, although this might not be alpha related (I can't judge).

Thx,
Stijn



Re: [tor-relays] 1 circuit using 1.5Gig or ram? [0.3.3.2-alpha]

2018-02-12 Thread Stijn Jonker

Hi Tor & Others,

On 12 Feb 2018, at 20:29, tor wrote:

> I see this occasionally. It's not specific to 0.3.3.x. I reported it
> back in October 2017:

Thx, I more or less added the version in the subject to clearly indicate
it was on an alpha release.

> https://lists.torproject.org/pipermail/tor-relays/2017-October/013328.html
>
> Roger replied here:
>
> https://lists.torproject.org/pipermail/tor-relays/2017-October/013334.html

Ah thanks, not sure why my google kung-fu missed this one.

> MaxMemInQueues is set to 1.5 GB by default, which is why the
> problematic circuit uses that much RAM before it's killed. You can
> lower MaxMemInQueues in torrc, however that will obviously have other
> impacts on your relay. If you have plenty of RAM, I'd maybe just leave
> things alone for now since Tor is already killing the circuit.

My tornodes have 4 Gig of RAM, so I also put MaxMemInQueues at 1.5G
whilst the (D)DoS attacks were more troublesome (wasn't aware it was the
default).

> I agree in theory some mitigation against this would be nice, but I'm
> not smart enough to offer anything specific. It seems Roger and other
> devs are already thinking about the issue.

Not a coder myself (except some scripting).

For those looking for the paper as well: the original URL gives a 403. I
believe this is a copy (I can't check for alterations or omitted slides,
of course): http://www.robgjansen.com/talks/sniper-dcaps-20131011.pdf


Thx,
Stijn


Re: [tor-relays] 1 circuit using 1.5Gig or ram? [0.3.3.2-alpha]

2018-02-12 Thread David Goulet
On 12 Feb (20:09:35), Stijn Jonker wrote:
> Hi all,
> 
> So in general 0.3.3.1-alpha-dev and 0.3.3.2-alpha running on two nodes
> without any connection limits on the iptables firewall seem to be a lot
> more robust against the recent increase in clients (or possible [D]DoS). But
> tonight for a short period of time one of the relays was running a bit "hot"
> so to speak.
> 
> Only to be greeted by this log entry:
> Feb 12 18:54:55 tornode2 Tor[6362]: We're low on memory (cell queues total
> alloc: 1602579792 buffer total alloc: 1388544, tor compress total alloc:
> 1586784 rendezvous cache total alloc: 489909). Killing circuits
> with over-long queues. (This behavior is controlled by MaxMemInQueues.)
> Feb 12 18:54:56 tornode2 Tor[6362]: Removed 1599323088 bytes by killing 1
> circuits; 39546 circuits remain alive. Also killed 0 non-linked directory
> connections.

Wow... 1599323088 bytes is insane. This should _not_ happen for only 1
circuit. We actually have checks in place to avoid this but it seems they
either totally failed or we have an edge case.

Can you tell me which scheduler you were using (look for "Scheduler" in the
notice log)?

Were there any warnings in the logs that you could share, or was everything
normal?

Finally, could you share the OS this relay runs on and, if Linux, the
kernel version?

Big thanks!
David

-- 
1xYrq8XhE25CKCQqvcX/cqKg04v1HthMMM3PwaRqqdU=




Re: [tor-relays] 1 circuit using 1.5Gig or ram? [0.3.3.2-alpha]

2018-02-12 Thread tor
I see this occasionally. It's not specific to 0.3.3.x. I reported it back in 
October 2017:

https://lists.torproject.org/pipermail/tor-relays/2017-October/013328.html

Roger replied here:

https://lists.torproject.org/pipermail/tor-relays/2017-October/013334.html

MaxMemInQueues is set to 1.5 GB by default, which is why the problematic 
circuit uses that much RAM before it's killed. You can lower MaxMemInQueues in 
torrc, however that will obviously have other impacts on your relay. If you 
have plenty of RAM, I'd maybe just leave things alone for now since Tor is 
already killing the circuit.

I agree in theory some mitigation against this would be nice, but I'm not smart 
enough to offer anything specific. It seems Roger and other devs are already 
thinking about the issue.



[tor-relays] 1 circuit using 1.5Gig or ram? [0.3.3.2-alpha]

2018-02-12 Thread Stijn Jonker

Hi all,

So in general 0.3.3.1-alpha-dev and 0.3.3.2-alpha running on two nodes 
without any connection limits on the iptables firewall seem to be a lot 
more robust against the recent increase in clients (or possible [D]DoS). 
But tonight for a short period of time one of the relays was running a 
bit "hot" so to speak.


Only to be greeted by this log entry:
Feb 12 18:54:55 tornode2 Tor[6362]: We're low on memory (cell queues 
total alloc: 1602579792 buffer total alloc: 1388544, tor compress total 
alloc: 1586784 rendezvous cache total alloc: 489909). Killing circuits 
with over-long queues. (This behavior is controlled by MaxMemInQueues.)
Feb 12 18:54:56 tornode2 Tor[6362]: Removed 1599323088 bytes by killing 
1 circuits; 39546 circuits remain alive. Also killed 0 non-linked 
directory connections.
Feb 12 19:04:10 tornode2 Tor[6362]: Your network connection speed 
appears to have changed. Resetting timeout to 60s after 18 timeouts and 
1000 buildtimes.


So one circuit was able to claim 1.5 GB of RAM, which seems a bit much, 
even though the DoS protection does seem to be doing something (see 
below). This could be a new attack or just an error. However, wouldn't 
some sort of fair memory balance between circuits be another mitigation 
to consider? I'm not saying it should be as strict as "circuit 
memory"/"# of circuits", but 99.x% of memory for one circuit feels wrong 
for a relay.
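
The 99.x% figure can be checked against the numbers in the log entry above. 
A minimal Python sketch; the variable and function names are illustrative 
and not part of Tor, and the denominator is only the "cell queues total 
alloc" value, not the whole process:

```python
# Figures copied from the "We're low on memory" log entry above.
cell_queues_alloc = 1_602_579_792   # "cell queues total alloc"
removed_bytes = 1_599_323_088       # "Removed ... bytes by killing 1 circuits"

GIB = 2**30

def share_of_queues(removed: int, total: int) -> float:
    """Return removed/total as a percentage."""
    return 100.0 * removed / total

print(f"killed circuit held {removed_bytes / GIB:.2f} GiB")
print(f"share of cell-queue memory: "
      f"{share_of_queues(removed_bytes, cell_queues_alloc):.2f}%")
```

With these inputs the single killed circuit accounts for about 1.49 GiB, 
or roughly 99.8% of the cell-queue allocation reported at that moment.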


Feb 12 13:58:34 tornode2 Tor[6362]: DoS mitigation since startup: 910770 
circuits rejected, 10 marked addresses. 25972 connections closed. 324 
single hop clients refused.
Feb 12 19:58:34 tornode2 Tor[6362]: DoS mitigation since startup: 
1222320 circuits rejected, 12 marked addresses. 33359 connections 
closed. 402 single hop clients refused.


Thx,
Stijn