#19522: HS intro circuit retry logic fails when network interface is down --------------------------+-------------------------------- Reporter: asn | Owner: Type: defect | Status: needs_revision Priority: Medium | Milestone: Tor: 0.2.??? Component: Core Tor/Tor | Version: Severity: Normal | Resolution: Keywords: tor-hs | Actual Points: Parent ID: #16387 | Points: 1.5 Reviewer: | Sponsor: SponsorR-can --------------------------+--------------------------------
Comment (by asn): Replying to [comment:5 dgoulet]: > Replying to [comment:4 asn]: > > Thanks for the testing timonh. > > > > I pushed a branch `bug19522` in my repo, so that people can test further: > > https://gitweb.torproject.org/user/asn/tor.git > > > > FWIW, I read some code to verify the assumption of comment:1 and it seems to be accurate. But more digging and testing is required to get better confidence, as those functions were quite hairy. > > The fix seems logical but adds some _powerful_ assumptions that `rend_service_launch_establish_intro()` failure is always due to some "local issues". As far as I can tell, it seems to be the case that basically if we can't launch a circuit it's because we just can't get packet out of the wire or we simply don't have enough information to be able to do so (consensus for instance). > > If you could, adding a comment to `rend_service_launch_establish_intro()` documenting the returned value and in this case the `-1` being that it doesn't mean the intro point is _bad_ per-se but rather the failing of launching a circuit is due to "local reachability" or not enough information to continue issues. This is _very_ important that we don't mess it up I believe because retrying over and over a bad intro point is also a bad thing. Hmm, as discussed on IRC, I believe that `rend_service_launch_establish_intro()` will return `-1` for local issues in almost all cases, but there are cases where it could in theory return `-1` for remote issues. Specifically, you can reach `connection_or_connect()` from that function which will eventually call `connect(2)`, which can in theory fail with remote errors such as `ECONNREFUSED` or `ETIMEDOUT`. I'm pretty sure this can't happen in actual remote networks (since the socket API is asynch), but there is no way to be sure. The good thing here, is that IIUC '''the only time''' we would want to remove an intro point in `rend_consider_services_intro_points()` is only when `connect()` fails with the above remote errors. In all the other cases, we would want to keep the intro point since it's just local errors. Unfortunately, the retval of `connect()` is not exposed in `rend_consider_services_intro_points()`. A more correct patch here would involve exposing the `connect()` retval in that function and only removing the intro point if it's a remote error code. This does not seem like a trivial patch here, but also not too hard to do. -- Ticket URL: <https://trac.torproject.org/projects/tor/ticket/19522#comment:6> Tor Bug Tracker & Wiki <https://trac.torproject.org/> The Tor Project: anonymity online _______________________________________________ tor-bugs mailing list tor-bugs@lists.torproject.org https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-bugs