On Fri, Dec 09, 2022 at 09:47:07AM +0000, Alexander Færøy wrote:
> On 2022/12/01 20:35, Christopher Sheats wrote:
> > Does anyone have experience troubleshooting and/or fixing this problem?
>
> Like I wrote in [1], I think it would be interesting to hear if the
> patch from pseudonymisaTor in ticket #26646[2] would be of any help in
> the given situation. The patch allows an exit operator to specify a
> range of IP addresses for binding purposes for outbound connections. I
> would think this could split the load wasted on trying to resolve port
> conflicts in the kernel amongst the set of IP's you have available for
> outbound connections.

This sounds similar to a problem we faced with the main Snowflake
bridge. After usage passed a certain threshold, we started getting
constant EADDRNOTAVAIL, not on the outgoing connections to middle nodes,
but on the many localhost TCP connections used by the pluggable
transports model.

https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40198
https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40201

Long story short, the only mitigation that worked for us was to bind
sockets to an address (with port number unspecified, and with
IP_BIND_ADDRESS_NO_PORT *unset*) before connecting them, and use
different 127.0.0.0/8 addresses or ranges of addresses in different
segments of the communication chain.

https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/merge_requests/120
https://gitlab.torproject.org/dcf/extor-static-cookie/-/commit/a5c7a038a71aec1ff78d1b15888f1c75b66639cd

IP_BIND_ADDRESS_NO_PORT was mentioned in another part of the thread
(https://lists.torproject.org/pipermail/tor-relays/2022-December/020895.html).
For us, this bind option *did not help*, and in fact we had to apply a
workaround for HAProxy, which has IP_BIND_ADDRESS_NO_PORT hardcoded.

*Why* that should be the case is a mystery to me. It is equally a
mystery why bind-before-connect avoids EADDRNOTAVAIL even when the
address manually bound to is the very same address the kernel would
have automatically assigned. I even spent some time reading the Linux
5.10 source code trying to make sense of it. In the source code I found,
or at least think I found, code paths for the behavior I observed; but
the behavior seems to go against how bind and IP_BIND_ADDRESS_NO_PORT
are documented to work.

https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40201#note_2839472
> Although my understanding of what Linux is doing is very imperfect, my
> understanding is that both of these questions have the same answer:
> port number assignment in `connect` when called on a socket not yet
> bound to a port works differently than in `bind` when called with a
> port number of 0. In case (1), the socket is not bound to a port
> because you haven't even called `bind`. In case (2), the socket is not
> bound to a port because haproxy sets the `IP_BIND_ADDRESS_NO_PORT`
> sockopt before calling `bind`. When you call `bind` *without*
> `IP_BIND_ADDRESS_NO_PORT`, it causes the port number to be bound
> before calling `connect`, which avoids the code path in `connect` that
> results in `EADDRNOTAVAIL`.
>
> I am confused by these results, which are contrary to my understanding
> of what `IP_BIND_ADDRESS_NO_PORT` is supposed to do, which is
> precisely to avoid the problem of source address port exhaustion by
> deferring the port number assignment until the time of `connect`, when
> additional information about the destination address is available. But
> it's demonstrable that binding to a source port before calling
> `connect` avoids `EADDRNOTAVAIL` errors in our use cases, whatever the
> cause may be.
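
For anyone who wants to experiment with the bind-before-connect
mitigation, here is a minimal Python sketch of the idea. It is only an
illustration under my own assumptions, not the code from the Snowflake
merge request or the extor-static-cookie commit linked above. The names
`connect_bound` and `loopback_demo` are made up for this example, and
the choice of which 127.0.0.0/8 address to use for which segment is left
to the caller.

```python
import socket


def connect_bound(source_ip, dest, timeout=5.0):
    """Bind to source_ip with port 0, then connect to dest.

    Port 0 asks the kernel to assign an ephemeral port at bind() time.
    IP_BIND_ADDRESS_NO_PORT is deliberately left unset, so the port is
    already fixed before connect() is called.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.settimeout(timeout)
        sock.bind((source_ip, 0))   # address chosen by us, port by the kernel
        sock.connect(dest)
    except OSError:
        sock.close()
        raise
    return sock


def loopback_demo(source_ip):
    """Connect to a throwaway local listener from source_ip.

    Returns (client_local_addr, peer_addr_seen_by_listener).
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        client = connect_bound(source_ip, listener.getsockname())
        with client:
            server_side, peer = listener.accept()
            server_side.close()
            return client.getsockname(), peer
```

On Linux, where the whole 127.0.0.0/8 range is normally routed to the
loopback interface, calling `loopback_demo("127.0.0.2")` should show the
listener seeing 127.0.0.2 as the peer address. Giving each segment of
the chain its own source address (or range) is what spreads the
connections over separate pools of source ports.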
