On Wed, Sep 30, 2015 at 11:27 AM, Tom van der Woerdt <[email protected]> wrote:
> Hey all,
>
> I'd like your thoughts and comments on this proposal.
>
> Tom
>
>
> PS: If you want to deliver them in person, I'm in Berlin.
>
>
> Filename: xxx-intro-rendezvous-controlsocket.txt
> Title: Load-balancing hidden services by splitting introduction from
>        rendezvous

IMO great idea. I ignored it until the Berlin meeting because the title
didn't reflect what it actually does in a way I understood. Instead I
would suggest a title more like: "Controller features to allow
hidden-service introduce2 handling to happen on a separate host from
rendezvous2 sending"

> Author: Tom van der Woerdt
> Created: 2015-09-30
> Status: draft
>
> 1. Overview and motivation
>
> To address scaling concerns with the onion web, we want to be able to
> spread the load of hidden services across multiple machines.
> OnionBalance is a great stab at this, and it can currently give us 60x
> the capacity by publishing 6 separate descriptors, each with 10
> introduction points, but more is better. This proposal aims to address
> hidden service scaling up to a point where we can handle millions of
> concurrent connections.
>
> The basic idea involves splitting the 'introduce' from the
> 'rendezvous', in the tor implementation, and adding new events and
> commands to the control specification to allow intercepting
> introductions and transmitting them to different nodes, which will then
> take care of the actual rendezvous. External controller code could
> relay the data to another node or a pool of nodes, all which are run by
> the hidden service operator, effectively distributing the load of
> hidden services over multiple processes.
>
> By cleverly utilizing the current descriptor methods, we could publish
> up to sixty unique introduction points, which could translate to many
> thousands of parallel tor workers. This should allow hidden services to
> go multi-threaded, with a few small changes.
>
>
> 2. Specification
>
> We propose two additions to the control specification, of which one is
> an event and the other is a new command. We also introduce a new
> configuration option.
>
>
> 2.1. DisableAutomaticRendezvous configuration option
>
> The syntax is:
>   "DisableAutomaticRendezvous" SP [1|0] CRLF
>
> This configuration option is defined to be a boolean toggle which, if
> set, stops the tor implementation from automatically doing a rendezvous
> when an INTRODUCE2 cell is received. Instead, an event will be sent to
> the controllers. If no controllers are present, the introduction cell
> should be dropped, as acting on it instead of dropping it could open a
> window for a DoS.
>
> For security reasons, the configuration should be made available only
> in the configuration files, and not as an option settable by the
> controller.
>
>
> 2.2. The "INTRODUCE" event
>
> The syntax is:
>   "650" SP "INTRODUCE" SP RendezvousData CRLF
>
>   RendezvousData = implementation-specific, but must not contain
>                    whitespace, must only contain human-readable
>                    characters, and should be no longer than 512 bytes
>
> The INTRODUCE event should contain sufficient data to allow continuing
> the rendezvous from another Tor instance. The exact format is left
> unspecified and left up to the implementation. From this follows that
> only matching versions can be used safely to coordinate the rendezvous
> of hidden service connections.

Recommendation: Allow it to be longer than 512 bytes (futureproofing),
and rename it to something like "INTRODUCE_REQUEST_RECEIVED".

Recommendation: Specify what it would look like as implemented for
today's Tor.

> 2.3. "PERFORM-RENDEZVOUS" command
>
> The syntax is:
>   "PERFORM-RENDEZVOUS" SP RendezvousData CRLF
>
> This command allows a controller to perform a rendezvous using data
> received through an INTRODUCE event. The format of RendezvousData is
> not specified other than that it must not contain whitespace, and
> should be no longer than 512 bytes.

Recommendation: Allow it to be longer than 512 bytes (futureproofing),
and rename it to something like "ANSWER_RENDEZVOUS".
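
To make the two control-protocol additions concrete, here is a rough
sketch of what the controller-side relay might look like. It only uses
the syntax quoted above ("650 INTRODUCE <data>" in, "PERFORM-RENDEZVOUS
<data>" out) and treats RendezvousData as an opaque token. Neither the
event nor the command exists in Tor today; the function and class names
are made up for illustration, the backends are plain labels, and actually
delivering the command to another Tor's control port (authentication,
sockets) is left out. Round-robin is just one possible policy.

```python
import itertools
import string

# Limit taken from the draft text; hypothetical, and the review suggests raising it.
MAX_RENDEZVOUS_DATA_LEN = 512

# "Human-readable, no whitespace" is read here as printable ASCII minus whitespace.
ALLOWED_CHARS = set(string.printable) - set(string.whitespace)


def parse_introduce_event(line: str) -> str:
    """Return RendezvousData from a proposed '650 INTRODUCE <data>' line."""
    parts = line.rstrip("\r\n").split(" ")
    if len(parts) != 3 or parts[0] != "650" or parts[1] != "INTRODUCE":
        raise ValueError("not an INTRODUCE event")
    data = parts[2]
    if not data or len(data) > MAX_RENDEZVOUS_DATA_LEN:
        raise ValueError("RendezvousData is empty or too long")
    if not set(data) <= ALLOWED_CHARS:
        raise ValueError("RendezvousData contains disallowed characters")
    return data


def build_perform_rendezvous(data: str) -> bytes:
    """Format the proposed PERFORM-RENDEZVOUS command for a worker."""
    return f"PERFORM-RENDEZVOUS {data}\r\n".encode("ascii")


class RoundRobinRelay:
    """Pick a worker for each introduction in strict rotation."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        self._cycle = itertools.cycle(backends)

    def route(self, event_line: str):
        """Return (backend, command bytes) for one INTRODUCE event line."""
        data = parse_introduce_event(event_line)
        return next(self._cycle), build_perform_rendezvous(data)
```

Note that the relay never looks inside RendezvousData, which is exactly
why the format question below matters: the sketch works for any opaque
token, but two Tor instances can only cooperate if they agree on what is
in it.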

Recommendation: Specify what it would look like as implemented for
today's Tor.

> 3. Compatibility and security
>
> The implementation of these methods should, ideally, not change
> anything in the network, and all control changes are opt-in, so this
> proposal is fully backwards compatible.
>
> Controllers handling this data must be careful to not leak rendezvous
> data to untrusted parties, as it could be used to intercept and
> manipulate hidden services traffic.
>
>
> 4. Example
>
> Let's take an example where a client (Alice) tries to contact Bob's
> hidden service. To do this, Bob follows the normal hidden service
> specification, except he sets up ten servers to do this. One of these
> publishes the descriptor, the others have this disabled. When the
> INTRODUCE2 cell arrives at the node which published the descriptor, it
> does not immediately try to perform the rendezvous, but instead outputs
> this to the controller. Through an out-of-band process this message is
> relayed to a controller of another node of Bob's, and this transmits
> the "PERFORM-RENDEZVOUS" command to that node. This node finally
> performs the rendezvous, and will continue to serve data to Alice,
> whose client will now not have to talk to the introduction point
> anymore.
>
>
> 5. Other considerations
>
> We have left the actual format of the rendezvous data in the control
> protocol unspecified, so that controllers do not need to worry about
> the various types of hidden service connections, most notably proposal
> 224.

IMO we need to specify what this looks like for current hidden services,
and for hidden services under proposal 224, or else this proposal is not
complete.

> The decision to not implement the actual cell relaying in the tor
> implementation itself was taken to allow more advanced configurations,
> and to leave the actual load-balancing algorithm to the implementor of
> the controller. The developer of the tor implementation should not
> have to choose between a round-robin algorithm and something that could
> pull CPU load averages from a centralized monitoring system.
>
> _______________________________________________
> tor-dev mailing list
> [email protected]
> https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
