#24782: Set a lower default MaxMemInQueues value
---------------------------------+------------------------------------
 Reporter:  teor                 |          Owner:  ahf
     Type:  defect               |         Status:  assigned
 Priority:  Medium               |      Milestone:  Tor: 0.3.2.x-final
Component:  Core Tor/Tor         |        Version:
 Severity:  Normal               |     Resolution:
 Keywords:  tor-relay, tor-ddos  |  Actual Points:
Parent ID:                       |         Points:  0.5
 Reviewer:                       |        Sponsor:
---------------------------------+------------------------------------

Comment (by teor):

 Replying to [comment:5 dgoulet]:
 > We could also explore the possibility for that value to be a moving
 > target at runtime. It is a bit more dicey and complicated but because
 > Tor at startup looks at the "Total memory" instead of the "Available
 > memory" to estimate that value, things can go badly quickly if 4/16 GB
 > of RAM are available which will make Tor use 12GB as a limit... and
 > even with a fairly good amount of swap, this is likely to be killed by
 > the OOM of the OS at some point.
 >
 > On the flip side, a fast relay stuck with an estimation of 1GB or 2GB
 > of RAM that Tor can use at startup won't be "fast" for much longer
 > before the OOM kicks in and starts killing old circuits.

 This is not what I have observed. I have some fast Guards. Under normal
 load they never use much more than 1-2 GB of RAM in total.

 > It is difficult to tell what a normal fast relay will endure in terms
 > of RAM for Tor over time but so far of what I can tell with my relays,
 > between 1 and 2 GB is usually what I see (in non-DoS condition and
 > non-Exit).

 I usually see 1-2 GB for non-exits, and closer to 2 GB for exits.

 > I do believe right now that the network is still fairly usable because
 > we have big Guards able to use 5, 10, 12GB of RAM right now... Unclear
 > to me if firing up the OOM more frequently would improve the situation
 > but we should be very careful at not making every relay use a "too low
 > amount of ram" :S.

 If the fastest relay can do 1 Gbps, then that's 125 MB per second. 12 GB
 of RAM is about 100 seconds of traffic. Is it really useful to buffer 100
 seconds of traffic? (Or, under the current load, tens of thousands of
 useless circuits?)

 So I'm not sure that using more RAM for queues actually helps. In my
 experience, it just increases the number of active connections and the
 CPU usage. I don't know how to measure whether this benefits or hurts
 clients. (I guess I could tweak my guard and test running a client
 through it?)
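
 The buffering estimate above can be checked with a few lines of Python.
 This is only a rough sketch: the function name `buffer_seconds` is made
 up for illustration, the link is assumed to run at its full 1 Gbps line
 rate, and units are decimal (1 GB = 10^9 bytes).

```python
def buffer_seconds(queue_bytes: float, link_bits_per_second: float) -> float:
    """Seconds of line-rate traffic that fit in a queue of the given size.

    Hypothetical helper for illustration; it is not part of Tor.
    """
    bytes_per_second = link_bits_per_second / 8  # 1 Gbps -> 125 MB/s
    return queue_bytes / bytes_per_second


GB = 10**9  # decimal gigabyte (assumption)

for limit_gb in (1, 2, 12):
    secs = buffer_seconds(limit_gb * GB, 1e9)
    print(f"{limit_gb:>2} GB at 1 Gbps = {secs:.0f} s of traffic")
```

 With these assumptions a 12 GB limit comes out at 96 seconds, which is
 the "about 100 seconds" figure, while 1 GB and 2 GB limits hold 8 and 16
 seconds of traffic respectively.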

 Here's what happened when I followed my own advice in this thread:
 https://lists.torproject.org/pipermail/tor-relays/2018-January/014021.html

 I have a few big guards that are very close to a lot of the new clients.
 They were using 150% CPU, 4-8 GB RAM, and 15000 connections each. But
 they were not actually carrying much useful traffic.

 I tried reducing MaxMemInQueues to 2 GB and 1 GB, and they started using
 3-7 GB RAM. This is on 0.3.0 with the destroy cell fix. (But on my slower
 Guards and my Exit, MaxMemInQueues worked really well, reducing the RAM
 usage to 0.5-1.5 GB, without reducing the consensus weight.)

 I tried reducing the number of file descriptors. That reduced the CPU to
 around 110%, because the new connections were closed earlier. It pushed
 a lot of the sockets into the kernel TIME_WAIT state, about 10,000 on top
 of the regular 10,000. (Maybe these new Tor clients didn't do exponential
 backoff?)

 I tried DisableOOSCheck 0, and it didn't seem to make much difference to
 RAM or CPU, but it made a small difference to sockets (and it makes sure
 that I don't lose important sockets, like new control port sockets, so I
 left it on).

 I already set RelayBandwidthRate, but now I also set
 MaxAdvertisedBandwidth to about half the RelayBandwidthRate. Hopefully
 this will make the clients go elsewhere. But this isn't really a solution
 for the network.

 So I'm out of options to try to regulate traffic on these guards. And I
 need to have them working in about a week or so, because I need to run
 safe stats collections on them.

 I think my only remaining option is to drop connections when the number
 of connections per IP goes above some limit. From the tor-relays posts,
 it seems like up to 10 connections per IP is normal, but these clients
 will make hundreds of connections at once. I think I should DROP rather
 than RST, because that forces the client to time out, rather than
 immediately making another connection.
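
 To make the per-IP idea concrete, here is a small Python model of the
 policy. It is a sketch only, not something Tor or a firewall provides:
 the function name `filter_connections` is hypothetical, the limit of 10
 is taken from the "up to 10 connections per IP is normal" observation,
 and accepted connections are assumed to stay open for the whole run. In
 practice a limit like this would more likely be enforced in the host
 firewall than in a script.

```python
from collections import Counter


def filter_connections(source_ips, max_per_ip=10):
    """Split connection attempts into (accepted, dropped) lists.

    Hypothetical model: each accepted connection is assumed to stay open,
    so an address is cut off once it holds max_per_ip connections.
    """
    open_count = Counter()
    accepted, dropped = [], []
    for ip in source_ips:
        if open_count[ip] >= max_per_ip:
            # DROP: no reply is sent, so the client has to wait for its
            # own timeout before it can retry.
            dropped.append(ip)
        else:
            open_count[ip] += 1
            accepted.append(ip)
    return accepted, dropped


# One client opening 300 connections at once, and one ordinary client.
attempts = ["192.0.2.1"] * 300 + ["198.51.100.7"] * 3
accepted, dropped = filter_connections(attempts)
print(len(accepted), "accepted,", len(dropped), "dropped")
```

 In this example the ordinary client keeps all 3 of its connections, and
 the client opening hundreds at once is held to 10, with the other 290
 attempts dropped.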

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/24782#comment:6>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online