Re: [tor-dev] Guard nodes and network down events

2014-08-15 Thread Matthew Finkel
On Wed, Aug 13, 2014 at 03:47:43PM +0300, George Kadianakis wrote:
 Hello friends :)
 
 This is a post to discuss how Tor should treat its entry guards when
 its network goes down. This is part of ticket #12595 [0] which aims to
 design better interfaces and data structures for entry guards.
 

hilight
 This thread investigates what should happen when the network goes down
 and Tor's connection to a guard fails. How should Tor recognize that
 and connect back to that guard when the network goes up again?

snip /
 
 The fundamental issue here is that Tor does not have a primitive that
 detects whether the network is up or down, since any such primitive
 stands out to a network attacker [3]. This means that when Tor fails
 to connect to an entry guard, Tor can never be sure whether the guard
 was actually down or whether the network is down, and that complicates
 its algorithm significantly.
 
/hilight

These are both important problems, but I think we'll get ourselves into
trouble if we try to solve them both in the same discussion, one color
per bikeshed is enough :)

Both David and Tom have good ideas, but I think this will take the
discussion offtopic from how do we choose which guard we should use
when we realize we have a connection to the internet. I think further
discussion should go into a How should Tor detect that it has an
internet connection? thread.

 In #12595, I laid down a few different ways that Tor could improve its
 current network down entry guard algorithm [4]. After thinking some
 more, I've been leaning towards algorithm (a), which is:
 
   Everytime we manage to connect to a guard, if it's not the top
   guard in our list, mark all previous guards as retriable and try
   again from the top.
 
 For me, this seems like the most complete way of ensuring that the
 guards in the top of the list are always going to get a fair hearing
 even if the network was down when they were first probed.
 
 However, that algorithm is not without its problems. Here are some
 notes:
 
 - This algorithm suffers from an infinite loop.
 
   The naive version suffers from an obvious infinite loop if the first
   guards in our list are actually down. To protect against that, we
   will probably need to rewrite the algorithm to:
 
 Everytime we manage to connect to a guard, if it's not the top
 guard in our list, mark all previous guards as retriable and try
 again from the top: this time pick whichever guard we can connect
 to even if it's not the top one.
 
 - This algorithm is not actually robust.
 
   If we wanted to design a truly robust algorithm, it would have to be
   robust even against adversarial network down events. That is, the
   algorithm would need to work well even if you imagine the network as
   an on/off switch that the adversary can toggle at will.
 
   Here is how that algorithm can fail against such an adversary:
 
   1) Tor starts up. Adversary switches network off.
   2) Tor starts going through guards and fails to connect to them.
   3) Adversary switches network on.
   4) Tor detects network up event, marks all guards as up and goes from the 
 top.
   5) Adversary switches network off.
   6) Tor starts going through guards and fails to connect to them.
   7) Adversary switches network on, and Tor establishes a circuit to
  the guard that it was currently walking over.
 
   If the adversary is a LAN adversary, she learns the order of the
   guards in Tor's list, so she can basically choose whichever guard
   she wants.
 
   FWIW, whether such an adversary is realistic is up for debate.
 
 - This algorithm is not very elegant.
 
 - This algorithm might make guard fingerprinting worse.
 
   Imagine that the first 2 guards in your list are actually
   down. Everytime Tor detects a network up event, it will attempt to
   connect to those 2 dead guards before connecting to the third guard
   which is up.

This is true, but in this case Tor will only try connecting to the
first two guards in this case for a few hours, until they're dropped
from the consensus. But, in terms of fingerprinting, hours is too
long, which I think is part of your point.

Can we do something simple like:
   1) Tor starts up. Adversary switches network off.
   2) Tor starts going through guards and fails to connect to them.
   3) Adversary switches network on.
   4) Tor detects network up event, marks all guards as up and goes from the 
top.
   5) Adversary switches network off.
   6) Tor starts going through guards and fails to connect to them.
   7) Adversary switches network on, and Tor establishes a circuit to
  the guard that it was currently walking over.
   8) Tor walks the list of guards from the top, again.
   9) Tor establishes connection with first available Guard
 9a) If guard in 9) is different and earlier in the list than guard
 in 7), drop 7) and use 9).
 9b) or, if guard in 9) is the same or later in the list than 7),
 then use 7)

This has some 

Re: [tor-dev] Guard nodes and network down events

2014-08-14 Thread Tom Ritter
On 13 August 2014 07:47, George Kadianakis desnac...@riseup.net wrote:
 The fundamental issue here is that Tor does not have a primitive that
 detects whether the network is up or down, since any such primitive
 stands out to a network attacker [3].

I'm not certain this is true.  Windows and Mac OS detect whether or
not there is a Captive Portal/Internet connection.  While one can
argue this is bad practice, piggybacking on a detection mechanism used
by default in widely deployed OS's seems like it would not stand out.

Windows has IsInternetConnected [0] which uses NCSI[1].

I know less about Mac, but there is SCNetowrkReachability [2].
Apparently the (undocumented) way that Apple uses to detect captive
portals is [3].

It's not very clean to emulate a request instead of using an API, if
it came down to it.  But while it may seem dangerous to emulate a
request that can change in an OS patch... the reality of it is that as
long as you pay attention to the patches, you'd be able to deploy a
fix long before a non-negligible portion of people patched anyway.

-tom

[0] 
http://msdn.microsoft.com/en-us/library/windows/desktop/aa366143(v=vs.85).aspx
[1] http://blog.superuser.com/2011/05/16/windows-7-network-awareness/
[2] 
https://developer.apple.com/library/mac/documentation/SystemConfiguration/Reference/SCNetworkReachabilityRef/Reference/reference.html
[3] 
http://blog.erratasec.com/2010/09/apples-secret-wispr-request.html#.U-y2KYBdWaQ
___
tor-dev mailing list
tor-dev@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev


[tor-dev] Guard nodes and network down events

2014-08-13 Thread George Kadianakis
Hello friends :)

This is a post to discuss how Tor should treat its entry guards when
its network goes down. This is part of ticket #12595 [0] which aims to
design better interfaces and data structures for entry guards.

This thread investigates what should happen when the network goes down
and Tor's connection to a guard fails. How should Tor recognize that
and connect back to that guard when the network goes up again?

I recently sent an email to tor-dev [1] which explains Tor's current
behavior and its problems. tl;dr there are edge cases [2] where Tor
will not detect the network back up event and will connect to low
priority guards instead of connecting to its primary guards.

The fundamental issue here is that Tor does not have a primitive that
detects whether the network is up or down, since any such primitive
stands out to a network attacker [3]. This means that when Tor fails
to connect to an entry guard, Tor can never be sure whether the guard
was actually down or whether the network is down, and that complicates
its algorithm significantly.

In #12595, I laid down a few different ways that Tor could improve its
current network down entry guard algorithm [4]. After thinking some
more, I've been leaning towards algorithm (a), which is:

  Everytime we manage to connect to a guard, if it's not the top
  guard in our list, mark all previous guards as retriable and try
  again from the top.

For me, this seems like the most complete way of ensuring that the
guards in the top of the list are always going to get a fair hearing
even if the network was down when they were first probed.

However, that algorithm is not without its problems. Here are some
notes:

- This algorithm suffers from an infinite loop.

  The naive version suffers from an obvious infinite loop if the first
  guards in our list are actually down. To protect against that, we
  will probably need to rewrite the algorithm to:

Everytime we manage to connect to a guard, if it's not the top
guard in our list, mark all previous guards as retriable and try
again from the top: this time pick whichever guard we can connect
to even if it's not the top one.

- This algorithm is not actually robust.

  If we wanted to design a truly robust algorithm, it would have to be
  robust even against adversarial network down events. That is, the
  algorithm would need to work well even if you imagine the network as
  an on/off switch that the adversary can toggle at will.

  Here is how that algorithm can fail against such an adversary:

  1) Tor starts up. Adversary switches network off.
  2) Tor starts going through guards and fails to connect to them.
  3) Adversary switches network on.
  4) Tor detects network up event, marks all guards as up and goes from the 
top.
  5) Adversary switches network off.
  6) Tor starts going through guards and fails to connect to them.
  7) Adversary switches network on, and Tor establishes a circuit to
 the guard that it was currently walking over.

  If the adversary is a LAN adversary, she learns the order of the
  guards in Tor's list, so she can basically choose whichever guard
  she wants.

  FWIW, whether such an adversary is realistic is up for debate.

- This algorithm is not very elegant.

- This algorithm might make guard fingerprinting worse.

  Imagine that the first 2 guards in your list are actually
  down. Everytime Tor detects a network up event, it will attempt to
  connect to those 2 dead guards before connecting to the third guard
  which is up.

. A LAN/WAN adversary will be able to see those failed connections and
  the third successful connection, which form a nice tight guard
  fingerprinting vector (similar to #10969).

And that's all folks.

I'm looking forward to some feedback on the proposed algorithms as
well as improvements and suggestions.

PS: I think that Tor was doing a trick with UDP to learn its public IP
address or something. I need to read up on that trick, and see if
it can be used to build a network up primitive.

[0]: https://trac.torproject.org/projects/tor/ticket/12595

[1]: https://lists.torproject.org/pipermail/tor-dev/2014-June/007042.html

[2]: https://trac.torproject.org/projects/tor/ticket/12450

[3]: http://stackoverflow.com/questions/3764291/checking-network-connection
 https://trac.torproject.org/projects/tor/ticket/12595#comment:6

[4]: https://trac.torproject.org/projects/tor/ticket/12595#comment:5
___
tor-dev mailing list
tor-dev@lists.torproject.org
https://lists.torproject.org/cgi-bin/mailman/listinfo/tor-dev


Re: [tor-dev] Guard nodes and network down events

2014-08-13 Thread David Goulet
On 13 Aug (15:47:43), George Kadianakis wrote:
 Hello friends :)
 
 This is a post to discuss how Tor should treat its entry guards when
 its network goes down. This is part of ticket #12595 [0] which aims to
 design better interfaces and data structures for entry guards.
 
 This thread investigates what should happen when the network goes down
 and Tor's connection to a guard fails. How should Tor recognize that
 and connect back to that guard when the network goes up again?
 
 I recently sent an email to tor-dev [1] which explains Tor's current
 behavior and its problems. tl;dr there are edge cases [2] where Tor
 will not detect the network back up event and will connect to low
 priority guards instead of connecting to its primary guards.
 
 The fundamental issue here is that Tor does not have a primitive that
 detects whether the network is up or down, since any such primitive
 stands out to a network attacker [3]. This means that when Tor fails
 to connect to an entry guard, Tor can never be sure whether the guard
 was actually down or whether the network is down, and that complicates
 its algorithm significantly.

That's a difficult thing to detect but there are ways to have a fairly good
estimate that the network is down versus the end point.

I'm going to describe different scenario that can happen with the network and
considered network down event.

1) No power on the interface so basically an unplugged cable or dead switch.

This will trigger an event on the net device which will set the interface in a
DOWN state thus not having the IFF_UP flag in the kernel (netdevice(7)). That
can be probed with an ioctl() using SIOCGIFFLAGS.  That does NOT indicate
connectivity or not to the network, simply that the interface can not
physically transmit data.

2) The default route or the network route to the address you are trying to
reach do not exist.

You'll end up with a connect error set to ENETUNREACH, ENETDOWN,
EADDRNOTAVAIL (this one is mostly for multicast and bind()) or
EHOSTUNREACH. The above can happen for instance if your interface loses its
address or/and the route table changes. But not that those error code are event
detected by the kernel thus NOTHING will come out of your computer onto the
network.

3) Everything is quiescent but the network drops everything.

In that case, your connect() will most likely fail with a timeout in blocking
mode or loop with EINPROGRESS for non blocking.

(Please add some scenario here if I'm missing some!).

That being said, we could do a combination of those to try to assert as best as
possible if it's a network issue or not. Note that this is pretty Linux
specific for the ioctl part(). Point 2) and 3) are POSIX standard code.

 
 In #12595, I laid down a few different ways that Tor could improve its
 current network down entry guard algorithm [4]. After thinking some
 more, I've been leaning towards algorithm (a), which is:
 
   Everytime we manage to connect to a guard, if it's not the top
   guard in our list, mark all previous guards as retriable and try
   again from the top.
 
 For me, this seems like the most complete way of ensuring that the
 guards in the top of the list are always going to get a fair hearing
 even if the network was down when they were first probed.
 
 However, that algorithm is not without its problems. Here are some
 notes:
 
 - This algorithm suffers from an infinite loop.
 
   The naive version suffers from an obvious infinite loop if the first
   guards in our list are actually down. To protect against that, we
   will probably need to rewrite the algorithm to:
 
 Everytime we manage to connect to a guard, if it's not the top
 guard in our list, mark all previous guards as retriable and try
 again from the top: this time pick whichever guard we can connect
 to even if it's not the top one.
 
 - This algorithm is not actually robust.
 
   If we wanted to design a truly robust algorithm, it would have to be
   robust even against adversarial network down events. That is, the
   algorithm would need to work well even if you imagine the network as
   an on/off switch that the adversary can toggle at will.
 
   Here is how that algorithm can fail against such an adversary:
 
   1) Tor starts up. Adversary switches network off.
   2) Tor starts going through guards and fails to connect to them.
   3) Adversary switches network on.
   4) Tor detects network up event, marks all guards as up and goes from the 
 top.
   5) Adversary switches network off.
   6) Tor starts going through guards and fails to connect to them.
   7) Adversary switches network on, and Tor establishes a circuit to
  the guard that it was currently walking over.

I think the crucial point here is 5), this has to be detected and if so, you
stop, reflag everyone and try again when you get a network up event.

 
   If the adversary is a LAN adversary, she learns the order of the
   guards in Tor's list, so she can basically choose whichever guard