Re: [tor-dev] Load Balancing in 2.7 series - incompatible with OnionBalance ?

2015-10-23 Thread Alec Muffett
> Let's use your idea of "if one IP fails and TTL expired then re-fetch".
> This could also make it "easier" to identify people connecting to
> Facebook. As your client's guard, I see you do the fetch + IP/RP dance (3
> circuits in a short period of time, two of which are killed). I wait 2
> hours and then kill all circuits passing through me from you. If I can
> then see that distinctive HS pattern again (3 circuits), I'm closer to
> knowing that you are accessing FB.


Would that not happen if and only if (in the meantime) the service had 
suffered an outage impacting the first IP that the client tries reconnecting to?

Odds on, the client's entry guard will see no measurable change?

-a

___
tor-dev mailing list
tor-dev@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev


[tor-dev] [PATCH] Document our current guard selection algorithm in path-spec.txt.

2015-10-23 Thread isis
Hey hey,

I've been working on documenting our current guard selection algorithm
(#17261), [0] which as most of you already know, has some room for
improvement.  The patch is in my bug17261 branch. [1]

However, it's also attached here for reference and discussion.

[0]: https://trac.torproject.org/projects/tor/ticket/17261
[1]: https://gitweb.torproject.org/user/isis/torspec.git/log/?h=bug17261

Best,
-- 
 ♥Ⓐ isis agora lovecruft
OpenPGP: 4096R/0A6A58A14B5946ABDE18E207A3ADB67A2CDB8B35
Current Keys: https://blog.patternsinthevoid.net/isis.txt

From 38d9df22ace881f0907c6cdd3ccd38dc95538aad Mon Sep 17 00:00:00 2001
From: Isis Lovecruft 
Date: Fri, 23 Oct 2015 16:29:17 +
Subject: [PATCH] Document our current guard selection algorithm in
 path-spec.txt.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

 * ADDS new section, "§5.1. Guard selection algorithm", to path-spec.txt.
 * FIXES #17261: https://bugs.torproject.org/17261
---
 path-spec.txt | 99 +++
 1 file changed, 99 insertions(+)

diff --git a/path-spec.txt b/path-spec.txt
index 896195a..47dae3b 100644
--- a/path-spec.txt
+++ b/path-spec.txt
@@ -602,6 +602,105 @@ of their choices.
   Tor does not add a guard persistently to the list until the first time we
   have connected to it successfully.
 
+5.1. Guard selection algorithm
+
+  If configured to use entry guards, and the circuit's purpose is not marked
+  for testing, then a random entry guard from the persisted state (as
+  mentioned earlier in §5) will be chosen (provided there is already some
+  persisted state storing previously chosen guard nodes).
+
+  Otherwise, if any of the above conditions is not satisfied, then a new
+  entry guard node is chosen for that circuit.  The algorithm is as follows:
+
+- EXCLUDED_NODES is a list of nodes which, for some reason, are not
+  acceptable for use as an entry guard.
+
+1. If an exit node has been chosen for the circuit:
+
+   1.a. Then that exit is added to EXCLUDED_NODES (and thus will not be
+used as the entry guard).
+
+2. If running behind a fascist firewall (e.g. outgoing connections are
+   only permitted to ports 80 and/or 443):
+
+   2.a. For all known routers in the network (as given in the
+networkstatus document), a router is added to the list of
+EXCLUDED_NODES iff it does not advertise the ability to be reached
+via the ports allowed through the fascist firewall.
+
+3. Add any entry guards currently in volatile storage, as well as all
+   nodes within their families, to EXCLUDED_NODES.
+
+4. Determine which of the following flags should apply to the selection of
+   an entry guard:
+
+ * CRN_NEED_UPTIME: the router can only be chosen as an entry guard
+   iff it has been available for at least some minimum uptime.
+ * CRN_NEED_CAPACITY: potentially suitable routers are weighted by
+   their advertised bandwidth capacity.
+ * CRN_ALLOW_INVALID: also consider using routers which have been
+   marked as invalid.
+ * CRN_NEED_GUARD: only consider routers which have the Guard flag.
+ * CRN_NEED_DESC: only consider routers for which we have enough
+   information to use them in building a circuit.
+
+   Additionally, if configured to allow nodes marked as invalid AND to
+   specifically allow entry guards which have been marked as invalid, then
+   the CRN_ALLOW_INVALID flag will be set.  Lastly, the CRN_NEED_GUARD and
+   CRN_NEED_DESC flags are always applied, regardless of configuration.
+
+5. If configured to exclude routers which allow single-hop circuits, then
+   the list of known routers is traversed, and all routers which permit
+   single-hop circuits are added to EXCLUDED_NODES.
+
+6. If we are an OR, add ourselves (and our family) to EXCLUDED_NODES.
+
+7. The list of potential routers is weighted according to the bandwidth
+   weights from the consensus (cf. §5.1.1), and then a random selection is
+   chosen with respect to those weights.
+
+   7.a. If we've made a choice now, the algorithm finishes.
+   7.b. Otherwise, continue to step #8.
+
+8. We couldn't find a suitable guard, so now we try much harder by
+   discarding the CRN_NEED_UPTIME, CRN_NEED_CAPACITY, and CRN_NEED_GUARD
+   selection flags.  This effectively means we'll use nearly any router,
+   except for ones already in EXCLUDED_NODES.
+
+   [XXX Does this mean we even include BadExits and other misbehaving
+   nodes?  This sounds bad.  —isis]
+
+5.1.1. How consensus bandwidth weights factor into entry guard selection
+
+  When weighting a list of routers for choosing an entry guard, the following
+  consensus parameters (from the "bandwidth-weights" line) 
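The flow documented in §5.1 above might be sketched roughly as follows. This is purely illustrative: the router records, state fields, and function name below are invented for the example, and Tor's real implementation (in C) differs.

```python
import random

# Rough sketch of the path-spec §5.1 guard selection flow.
# All data structures here are invented for illustration.

def choose_entry_guard(routers, state, rng=None):
    rng = rng or random.Random()
    excluded = set()

    # Step 1: never reuse the chosen exit as the entry guard.
    if state.get("chosen_exit"):
        excluded.add(state["chosen_exit"])

    # Step 2: behind a fascist firewall, exclude unreachable routers.
    allowed = state.get("firewall_ports")       # e.g. {80, 443}, or None
    if allowed:
        excluded |= {r["nickname"] for r in routers
                     if r["or_port"] not in allowed}

    # Steps 3 and 6: exclude current guards, ourselves, and all families.
    for name in state.get("current_guards", []) + state.get("self_and_family", []):
        excluded.add(name)

    def usable(r, need_uptime, need_guard):
        if r["nickname"] in excluded:
            return False
        if need_guard and not r.get("is_guard"):
            return False
        if need_uptime and r.get("uptime", 0) < state.get("min_uptime", 0):
            return False
        return True

    # Step 7: strict pass -- NEED_UPTIME, NEED_CAPACITY, NEED_GUARD.
    strict = [r for r in routers if usable(r, True, True)]
    if strict:
        # NEED_CAPACITY: weight the random choice by advertised bandwidth.
        weights = [r.get("bandwidth", 1) for r in strict]
        return rng.choices(strict, weights=weights, k=1)[0]

    # Step 8: relax the flags and accept nearly any non-excluded router.
    relaxed = [r for r in routers if usable(r, False, False)]
    return rng.choice(relaxed) if relaxed else None
```

Note how the step-8 fallback only respects EXCLUDED_NODES, which is the behaviour my XXX comment above is worried about.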

Re: [tor-dev] Load Balancing in 2.7 series - incompatible with OnionBalance ?

2015-10-23 Thread teor

On 23 Oct 2015, at 03:30, Alec Muffett  wrote:

>> However, you mention that one DC going down could cause a bad experience for 
>> users. In most HA/DR setups I've seen there should be enough capacity if 
>> something fails, is that not the case for you? Can a single data center not 
>> serve all Tor traffic?
> 
> It's not the datacentre which worries me - we already know how to deal with 
> those - it's the failure-based resource contention for the limited 
> introduction-point space that is afforded by a maximum (?) of six descriptors 
> each of which cites 10 introduction points. 
> 
> A cap of 60 IPs is a clear protocol bottleneck which - even with your 
> excellent idea - could break a service deployment.

Let's try a crazier and quite possibly terrible idea:
(Consider it a thought experiment rather than a serious technical proposal. 
Based on my limited understanding, I know I will make mistakes with the 
details.)

What if a high-volume onion service tries to post descriptors to all of the 
HSDirs that a client might try, not just the typical 6?

Here's how it might work:

At any point in time, a client may be using any one of the three valid 
consensuses. (Technically, clients can be using any one of the last 24 to 
bootstrap, but they update to the latest during bootstrap.)

(Clients which are running constantly will download a new consensus near the 
end of their current consensus validity period. This might mean, for example, 
that fewer clients are using the latest consensus.)

Therefore, depending on HSDir hashring churn, clients might be trying HSDirs 
outside the typical 6 (that is, 3 hashring positions, with 2 HSDirs selected 
side-by-side in each position, specifically to mitigate this very issue).

Also, when the hashring is close to rotating (every 24 hours), Tor will post to 
both the old and new HSDirs.
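The hashring selection I'm describing could be sketched like this (purely illustrative: the position derivation below is invented, not Tor's real descriptor-ID computation, and the replica/spread layout follows my description above rather than the spec):

```python
import hashlib

# Illustrative consistent-hashring lookup: for each of `replicas` ring
# positions, take `spread` side-by-side HSDirs at or after that position.
# The SHA-1 position derivation here is a made-up stand-in.

def hsdirs_for_service(onion_id, hsdir_fingerprints, replicas=3, spread=2):
    ring = sorted(hsdir_fingerprints)
    chosen = []
    for replica in range(replicas):
        # Hypothetical per-replica position on the ring.
        point = hashlib.sha1(f"{onion_id}:{replica}".encode()).hexdigest()
        # First fingerprint at or after the point, wrapping around the ring.
        idx = next((i for i, fp in enumerate(ring) if fp >= point), 0)
        for k in range(spread):
            fp = ring[(idx + k) % len(ring)]
            if fp not in chosen:
                chosen.append(fp)
    return chosen
```

With 3 positions and 2 side-by-side relays each, a client and service that agree on the consensus land on the same (at most 6) HSDirs; churn in the ring shifts the indices, which is exactly the mismatch discussed below.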

What if:
* an onion service posts a different descriptor to each HSDir a client might be 
querying, based on any valid consensus and any nearby hashring rotation; and
* different introduction points are included in each descriptor.

I can see this generating up to 3 (consensuses) x 2 (old and new hashrings 
near rotation) x 3 (hashring positions) x 2 (side-by-side replicas per 
position) x 10 (introduction points per descriptor) = 360 introduction points 
per service.
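Spelling out that upper bound as arithmetic (the factor names follow my message above):

```python
# Upper bound on introduction points under the scheme sketched above.
consensuses = 3      # valid consensuses a client may be using
rotation_sets = 2    # old and new HSDir sets near hashring rotation
positions = 3        # hashring positions per descriptor ID
replicas = 2         # side-by-side HSDirs per position
intro_points = 10    # introduction points per descriptor

total = consensuses * rotation_sets * positions * replicas * intro_points
print(total)  # 360
```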

Unfortunately, the potential increase in introduction points varies based on 
the consensus HSDir list churn, and the time of day. These are a poor basis for 
load-balancing.

Also, if HSDir churn and client clock skew are so bad that clients could be 
accessing any one of 36 HSDirs, we should have noticed clients which couldn't 
find any of their HSDirs, and already increased the side-by-side replica count.

So I think it's a terrible idea, but I wonder if we could squeeze another 60 
introduction points out of this scheme, or a scheme like it.

Tim





Re: [tor-dev] Load Balancing in 2.7 series - incompatible with OnionBalance ?

2015-10-23 Thread teor

On 23 Oct 2015, at 03:30, Alec Muffett  wrote:

>> However, if you were to use proposal #255 to split the introduction and 
>> rendezvous to separate tor instances, you would then be limited to:
>> - 6*10*N tor introduction points, where there are 6 HSDirs, each receiving 
>> 10 different introduction points from different tor instances, and N 
>> failover instances of this infrastructure competing to post descriptors. 
>> (Where N = 1, 2, 3.)
>> - a virtually unlimited number of tor servers doing the rendezvous and 
>> exchanging data (say 1 server per M clients, where M is perhaps 100 or so, 
>> but ideally dynamically determined based on load/response time).
>> In this scenario, you could potentially overload the introduction points.
> 
> Exactly my concern, especially when combined with overlong lifetimes of 
> mostly-zombie descriptors.

Hopefully, at this point the onion service operator would inform the directory 
authority operators. They would then decide on higher values for the HSDir 
hashring consensus parameters, thus increasing the number of HSDir replicas per 
onion service.

Of course, this assumes a lot - including that the directory authorities will 
agree to change the parameters, and that no-one has hard-coded the 6 replicas 
as a constant anywhere in their code. We might want to check this for 
OnionBalance.
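The ceiling quoted at the top of this message works out as follows (a trivial sketch; the factors are the ones from the quoted scenario):

```python
# Introduction-point ceiling under the proposal-255 split scenario:
# 6 HSDirs x 10 introduction points per descriptor x N failover instances.
def max_intro_points(hsdirs=6, intros_per_descriptor=10, failover_instances=1):
    return hsdirs * intros_per_descriptor * failover_instances

for n in (1, 2, 3):
    print(n, max_intro_points(failover_instances=n))  # 60, 120, 180
```

So even at N = 3 the scheme tops out at 180 introduction points, which is why raising the hashring parameters (rather than stacking instances) looks like the cleaner fix.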

Better to fix the issues at the source, if we can.

Tim

