#24767: All relays are constantly connecting to down relays and failing over and over -------------------------------------------------+------------------------- Reporter: arma | Owner: dgoulet Type: enhancement | Status: | needs_review Priority: Very High | Milestone: Tor: | 0.3.3.x-final Component: Core Tor/Tor | Version: Severity: Normal | Resolution: Keywords: tor-relay, tor-dos, performance, | Actual Points: review-group-32, 033-must, review-group-34, | 033-triage-20180320, 033-included-20180320 | Parent ID: | Points: Reviewer: asn | Sponsor: -------------------------------------------------+-------------------------
Comment (by dgoulet): > Is it okay with you if I change the name anyway? The problem is that the "since" implies a continuous condition. If I say "I have been unable to eat broccoli since 1982", it does not mean only that in 1982 I failed to eat broccoli: it also means that at every time since 1982, I have also been unable to eat broccoli. Renamed `unable_connect_since_ts` in commit `0410a5e49c`. > But the preferred address can change while Tor is running, I think, based on configuration options. That would make the value cached in the node become incorrect, and that's why I think we should maybe just not have this in the node_t at all. If a `node_t` preferred address changes, indeed we won't match the connection to a `node_t` anymore and thus we will fallback to using the cache for unknown nodes. And that object in the cache will be used until the `node_t` gets modified by I guess the new descriptor at some point. So all in all, we'll still be tracking connection failure correctly. The only reason why I'm still a bit hesitating on using the "unknown cache" for everything is maybe because of performance. If we would use that, at *every* connection, we would need to do a lookup where the hash function is a bit heavier with this addr + port + digest triplet than simply looking up the node list. Furthermore, every 60 seconds (`OR_CONNECT_FAILURE_CLEANUP_INTERVAL`), we cleanup that cache meaning we can iterate over thousands of objects in there (potentially more than the number of relays). I don't think it would be a significant cost but still important to keep in mind. Keeping it tiny can worth it. However, just using the cache (and not the node_t), would make things a bit more simpler in the code, we would just need to accept the additional memory and lookup/iteration price. My testing on a real relays shows that the majority of the times, we'll get our information out of the `node_t`. At this point, I can lean towards both sides, I might just need a second opinion? -- Ticket URL: <https://trac.torproject.org/projects/tor/ticket/24767#comment:38> Tor Bug Tracker & Wiki <https://trac.torproject.org/> The Tor Project: anonymity online
_______________________________________________ tor-bugs mailing list tor-bugs@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs