On Tue, May 14, 2013 at 09:42:47AM +0200, Karsten Loesing wrote:
> On 5/14/13 8:08 AM, Matthew Finkel wrote:
> > Hi all,
> >
> > Over the last few weeks I've been working with George and Aaron on
> > updating BridgeDB's code with respect to how it handles pluggable
> > transports.
>
> Hi Matthew,
>
> I didn't read your proposed BridgeDB changes in detail (sorry!), but I'd
> like to ask for something: can you make sure that the
> bridge-pool-assignments file stays useful when your changes are deployed?
>
> https://metrics.torproject.org/formats.html#bridgepool
>
> We're not processing bridge pool assignment files automatically, but
> we'll include them in Atlas once it supports searching for bridges.
> Right now the strings make some sense to bridge operators by giving them
> an idea whether and how BridgeDB distributes their bridge. If possible,
> this should still be the case in the future.
>
> Thanks!
> Karsten
>

Hi Karsten,

Absolutely. To be honest, I don't expect these modifications to impact
that file much, and I see no reason to alter the format of it, but I'll
verify everything remains sane throughout the updates.

Thanks for raising your concern!

Matt

>
> > I've made some decent progress, but there are some
> > questions that I'd like to ask (because I'm not sure I should be the
> > one making the decision). I've also started updating the spec and
> > there are some parts on which I'd like some clarification. I'll try to
> > summarize the thoughts on the matter we/I have thus far. See [A] if
> > you're unfamiliar with the BridgeDB code/spec/idea.
> >
> > 1) How should BridgeDB decide the number of transports, and types, it
> >    should hand out?
> >
> >    - My current patch returns transports based on the ratio of how many
> >      there are compared to the other bridges, so that if we hand out
> >      four bridges and obfs2 bridges account for 3/10 of all running
> >      bridges, then BridgeDB will hand out (4*(3/10)) = 1.2 bridges with
> >      each request, on average.
> >    - I've also added an option into bridgedb.conf to set the (expected)
> >      minimum and maximum number of bridges which support a specific PT
> >      that BridgeDB should hand out per request.
> >    - I have a verification check that tries to force us to meet these
> >      values, however, with its current implementation it's not
> >      guaranteed, only probabilistic. I think this is okay for now.
> >    - So, is this enough? Do we want/need a deterministic method of
> >      supplying bridges with a supported set of transports?
> >    - Another option is to place each transport into its own subring and
> >      select from each of the subrings to ensure we meet the requirement.
> >      The more I've thought about this, the more I think this defeats
> >      the purpose of constructing the rings, though.
> >    - Last (for now), if a bridge supports multiple PTs, should we return
> >      all of them to the user or randomly select one or select one with a
> >      bias?
> >      We agreed that we really shouldn't do the first because that
> >      would just accelerate the ability of a censor to block more bridges.
> >      The middle option works, but given that many bridges now support
> >      obfs2 and obfs3, is it a good idea to, again, probabilistically
> >      return each type (roughly) half the time?
> >
> > 2) Should we prefer to distribute PT bridges over regular bridges which
> >    have their ORPort on 443?
> >    - Right now returning ORPorts on 443 is the highest priority and
> >      transports are a secondary best-effort operation.
> >
> > 3) Unless I incorrectly understand the code, the bridges never rotate.
> >    The bridge interval is set to NoSchedule(), which means it returns
> >    a static time. Is there a reason for this? This is counter to the
> >    spec. Just wondering. :)
> >
> >
> > (I had some other points I wanted to raise, but I'm blanking on them
> > now. I think this is a good start, though.)
> >
> > Please also let me know and correct anything I may have gotten wrong.
> >
> > Thanks everyone, and thanks to George and Aaron for their help, as well.
> >
> > - Matt
> >
> >
> >
> >
> > A. For those who don't know the details of the code, the simplified
> >    version is as follows:
> >
> >    1) All bridges send their bridge descriptors and misc information
> >       to the Bridge Authority.
> >    2) Bridge Authority provides a network status file containing all
> >       known bridges described by their name, fingerprint, digest,
> >       time of publication, IP addr, ORPort, DirPort. Bridge Auth also
> >       provides a bridge descriptor file specifying each bridge's
> >       IP addr, ORPort, and fingerprint. Last, it supplies an extra-info
> >       file that contains all the extra info that the bridges
> >       provide - mainly their transports, in our case.
> >    3) BridgeDB parses all of these files and associates the information
> >       to a single instance of a bridge.
> >    4) BridgeDB assigns each running bridge to a distributor (website,
> >       email, etc) based on an hmac of the bridge's ID. Once assigned,
> >       the bridge is inserted into the distributor's list of bridges.
> >    5) BridgeDB then further organizes the bridges assigned to each
> >       distributor by moving them into rings and subrings.
> >       - A ring is simply a sorted list of hmacs of the bridges' IDs
> >         which, when traversed, wraps around to the beginning if it ever
> >         reaches the end.
> >       - The hmac of the bridge's ID is used to retrieve the actual
> >         bridge instance from a hash, which is stored alongside the ring.
> >    6) Some distributors, such as https, are 'initialized' with a few
> >       rings based on filters.
> >       - https starts out with a ring containing all bridges assigned to
> >         it, a ring only containing bridges which support IPv4
> >         connections, and a ring only containing bridges which support
> >         IPv6 connections.
> >       - Every ring also contains two subrings (currently). One subring
> >         is the subset of bridges from the parent ring which have their
> >         ORPort listening on port 443. The other subring is the subset
> >         of bridges from the parent ring which have the stable flag set.
> >       - For example,
> >         - Cluster 1 Ring
> >           - subring (stable)
> >           - subring (https)
> >         - Cluster 2 Ring
> >           - subring (stable)
> >           - subring (https)
> >         - IPv4 Cluster 1 Ring
> >           - subring (stable)
> >           - subring (https)
> >         - IPv4 Cluster 2 Ring
> >           - subring (stable)
> >           - subring (https)
> >         - IPv6 Cluster 1 Ring
> >           - subring (stable)
> >           - subring (https)
> >         - IPv6 Cluster 2 Ring
> >           - subring (stable)
> >           - subring (https)
> >    7) When BridgeDB receives a request for bridges from its website, it
> >       forwards the query on to the IP distributor. The details will
> >       include whether a specific PT was requested, the IP version the
> >       bridge supports, the country within which the bridge should not
> >       be blocked, the requesting IP address, and the interval.
> >    8) The distributor then decides on the "area" of the IP address,
> >       currently the /24 mask, and then finds the "cluster" within that
> >       area (by taking the first eight bytes of an hmac of the area and
> >       using the result (modulus "the number of clusters")). A filter is
> >       then constructed based on the requested information. If a ring
> >       already exists that satisfies exactly these filters then that is
> >       used. Else a new ring (with subrings) is constructed to satisfy
> >       this request. The distributor also computes the position in the
> >       ring as the hmac of the interval and the area.
> >    9) Once the correct ring exists, it determines how many bridges it
> >       can find in the ring's subrings to satisfy the request. This is
> >       done by taking the previously computed position and finding it
> >       in the list of bridge ID hmacs and then selecting the next
> >       consecutive "requested number of bridges" from the list (wrapping
> >       around to the beginning, if necessary). The same is then done for
> >       the main ring. The results from these searches are then joined and
> >       the first "requested number of bridges" unique keys are selected
> >       from the list. This list is then sorted and returned, propagating
> >       back up to the user.
> >   10) Similar actions are taken by the other distributors. For example,
> >       the email distributor doesn't use an "area" to decide which
> >       bridges to distribute, it uses the normalized requesting/source
> >       mail address.
> >   11) Misc:
> >       - Because the rings are sorted by an hmac of the bridge's ID, I
> >         expect that they will be uniformly distributed around the ring.
> >         As such, I don't expect there to be a bias for one type of
> >         bridge/transport/ORPort over any other. (Is this incorrect?)
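
For anyone who finds steps 8 and 9 easier to follow as code, here is a
minimal sketch of the area, cluster, and ring-position logic described
above. It is not BridgeDB's actual code: the function names, the example
key, the use of HMAC-SHA1, and the way the interval and area are joined
into one string are all assumptions made for illustration. It also walks
only a single ring; the subring searches, the join of their results, and
the final sort from step 9 are left out.

```python
import bisect
import hashlib
import hmac
import ipaddress


def keyed_hmac(key: bytes, value: str) -> bytes:
    """HMAC of `value` under `key`. SHA-1 is an assumption here."""
    return hmac.new(key, value.encode(), hashlib.sha1).digest()


def area_of(ip: str) -> str:
    """Map an IPv4 address to its /24 "area"."""
    return str(ipaddress.ip_network(f"{ip}/24", strict=False))


def cluster_of(key: bytes, area: str, n_clusters: int) -> int:
    """First eight bytes of the area's hmac, modulo the cluster count."""
    first8 = keyed_hmac(key, area)[:8]
    return int.from_bytes(first8, "big") % n_clusters


def build_ring(key: bytes, bridge_ids):
    """Return (sorted hmac positions, dict mapping position -> bridge ID)."""
    by_pos = {keyed_hmac(key, bridge_id): bridge_id for bridge_id in bridge_ids}
    return sorted(by_pos), by_pos


def walk_ring(positions, by_pos, start: bytes, n: int):
    """Pick up to n bridges at or after `start`, wrapping around the end."""
    if not positions:
        return []
    first = bisect.bisect_left(positions, start)
    count = min(n, len(positions))
    return [by_pos[positions[(first + k) % len(positions)]]
            for k in range(count)]


# Hypothetical usage: ten made-up bridge IDs, one client, one interval.
key = b"example-key"
positions, by_pos = build_ring(key, [f"bridge{i}" for i in range(10)])
area = area_of("203.0.113.57")
start = keyed_hmac(key, "2013-05-14" + area)
print(cluster_of(key, area, 4), walk_ring(positions, by_pos, start, 3))
```

Because the start position depends only on the interval and the area, two
clients in the same /24 asking during the same interval land on the same
spot in the ring and so see the same bridges.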
_______________________________________________
tor-dev mailing list
[email protected]
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev
