'    The calculation of the routing table has changed several times, two of 
them were major changes in the protocol. After the initial Paper the first 
big change was called the "Next Generation Routing" [7] and the last big step 
was the introduction of a small world topology. [1] This newest attempt was 
the creation of a p2p network which relies on personal contacts of people.'

Not true. We have always relied on a small world topology: the network is only 
navigable if it is a small world network. However, the original Freenet 
concept was purely opennet, and therefore relied to a very large degree 
on "destination sampling" (connect to the data source when you get a 
successful request). We understand this rather better now than we did, thanks 
to Oskar's recent work.

With regard to random routing for a few steps ... unfortunately it doesn't 
solve the problem (it's been proposed lots of times, of course):
- You can only do it for a few hops before getting into real routing. 
Therefore your anonymity set is only those nodes which could have randomly 
routed to you. And it can be a fairly uneven set.
- Much worse: if two requests can be correlated, e.g. by both being part of 
the same splitfile, or both being posts inserted under the same Frost ID, 
then you can do *nasty* statistical attacks based on the proportion of that 
known quantity that is routed through the attacker's node.

Re section 4.2:
- Are the "1" occurrences leaf nodes?
- What do the "2"s and "3"s look like, topologically speaking?

What exactly sucks about the tree? It would be helpful to understand this so 
we can advise people not to abuse darknet by setting up such structures (e.g. 
mirroring external authority structures). Clearly if you get a request from 
below you in the tree, you can identify that the node is within that 
subtree - is that the totality of the dangers of a tree topology?

Theorem 7: Cool! On the circle, if the source is just below the left attacker, 
the left attacker will get requests which get very close to the exact 
opposite of the request sender's own location. And you have the opposite with 
the right hand attacker. The more blocks, the closer the requests will get to 
the exact opposite point of the sender on the ring, and the more accurately 
you can identify the source...

Basically the operation is this (assuming pure greedy routing, no overload, 
etc.):
We have a request for location T. Our location is L. We want to find S, the 
location of the source node. We can reasonably conclude that S lies on the 
shortest arc between T' (the exact opposite of T) and L (our location). If 
there are enough blocks, and therefore enough different T's, the point at 
which we stop getting requests tells us where the source is on the circle.
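
To make that concrete, here's a minimal sketch of the arc intersection, 
assuming locations on [0,1) with wraparound distance (as Freenet uses) and 
pure greedy routing; the function names are mine, not Freenet code:

def circ_dist(a, b):
    # Wraparound distance between two locations on the unit circle [0,1).
    d = abs(a - b)
    return min(d, 1.0 - d)

def on_shortest_arc(x, a, b):
    # x lies on the shortest arc between a and b iff the distances add up.
    return abs(circ_dist(a, x) + circ_dist(x, b) - circ_dist(a, b)) < 1e-9

def candidate_sources(our_location, observed_targets, resolution=10000):
    # Each observed target T implies S is on the shortest arc between
    # T' = (T + 0.5) % 1 and us; intersect those arcs over all the T's.
    candidates = [i / resolution for i in range(resolution)]
    for t in observed_targets:
        t_prime = (t + 0.5) % 1.0
        candidates = [s for s in candidates
                      if on_shortest_arc(s, t_prime, our_location)]
    return candidates

The surviving candidates shrink towards the source's location as more blocks 
are observed, which is the "more blocks, more accuracy" effect above.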

However:
- Routing may not be purely greedy, because of e.g. backoff.
- The topology may not be close enough to a ring for this to work anyway. Of 
course there's a question as to whether the network will work at all in that 
case.
- Routing may be deliberately obscured by e.g. a few random hops from time to 
time.
- Routing will not always stop when it reaches the closest node to the target. 
Probe requests do, but real requests and inserts have an HTL limit, currently 
continuing for 10 hops after the last improvement in the closeness to the 
target.
- If the source is a long way away, we will only receive a small fraction of 
the requests it sends. How much does this limit accuracy? If you're too far 
away you may not get any requests, or not enough to connect them together.
- Finding the target's rough location doesn't instantly find the target. An 
attacker would probably try to narrow down the search by connecting to nodes 
closer to the location in question.
- It doesn't work on popular files which multiple requestors may be fetching 
simultaneously.

It would be fascinating to do a more accurate attack simulation! How much 
information can be gained about a requestor, and to what degree is this 
constrained by the above factors? (Introducing them one at a time, of 
course...)
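
A rough starting point for such a simulation, under the strongest assumptions 
(an ideal ring plus a few random shortcuts, pure greedy routing, no backoff, 
no HTL); everything here is my own sketch, not Freenet code:

import random

def circ_dist(a, b):
    d = abs(a - b)
    return min(d, 1.0 - d)

def build_ring(n, shortcuts=2):
    # Nodes sit at locations i/n; each peers with its ring neighbours
    # plus a few random long-range shortcuts.
    peers = {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}
    for i in range(n):
        for _ in range(shortcuts):
            j = random.randrange(n)
            if j != i:
                peers[i].add(j)
                peers[j].add(i)
    return peers

def greedy_route(peers, n, src, target, max_hops=50):
    # Pure greedy: forward to the peer closest to the target, and stop
    # when no peer improves on the current node (probe request behaviour).
    path, cur = [src], src
    for _ in range(max_hops):
        best = min(peers[cur], key=lambda p: circ_dist(p / n, target))
        if circ_dist(best / n, target) >= circ_dist(cur / n, target):
            break
        cur = best
        path.append(cur)
    return path

# The attacker simply logs the targets of requests routed through it,
# then feeds them into the arc estimate sketched earlier:
n = 1000
peers = build_ring(n)
source, attacker = 123, 700
seen = []
for _ in range(500):
    t = random.random()
    if attacker in greedy_route(peers, n, source, t):
        seen.append(t)

The constraining factors listed above (random hops, an HTL that continues 
past the closest node, backoff) could then be added one at a time to see how 
much each one degrades the estimate.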

Another attack you can do, if you are close to the target, is to take into 
account the exact local topology and use the proportion of requests coming to 
your node to identify which node it is.
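
For instance (reusing circ_dist, build_ring and greedy_route from the sketch 
above; again purely illustrative): predict, for each candidate node, what 
fraction of its requests greedy routing would send through us, and keep the 
candidate whose prediction best matches the proportion we actually observed:

def predicted_fraction(peers, n, candidate, attacker, trials=200):
    # Fraction of the candidate's requests that would pass through us,
    # estimated by routing random targets from the candidate.
    hits = sum(attacker in greedy_route(peers, n, candidate, random.random())
               for _ in range(trials))
    return hits / trials

def best_candidate(peers, n, candidates, attacker, observed_fraction):
    # Pick the candidate whose predicted fraction is closest to what we saw.
    return min(candidates,
               key=lambda c: abs(predicted_fraction(peers, n, c, attacker)
                                 - observed_fraction))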

"Beware of high degree" is an interesting message. I'm not quite sure how you 
support it. Generally we've assumed that reasonably high degree is a good 
thing - certainly it will *significantly* reduce the number of hops needed to 
find data, and it's quite helpful in getting small-world-like networks... 
(Too few connections and you probably won't have the occasional long range 
one). Are you saying high average degree is a problem, or that individual 
nodes with degree far above the average (aka ubernodes) are a problem?

As far as dependent vs. independent messages go ... we have a few options:
- Bundle all dependent messages into huge files and transfer them all at once 
in a single request. This isn't very practical: transferring a 1GB file with 
no multi-sourcing is going to take roughly forever. Having variable block 
sizes up to some large limit might help a bit, but would suck for various 
network/practical/code reasons, as it did on 0.5, and it wouldn't solve the 
problem, only make correlation attacks a bit harder.
- Bundle dependent requests together, and route them as a blob, keeping them 
bundled for a number of random hops. This has performance issues, relies on 
tunnels which may break if any node goes offline, doesn't work well with long 
lived requests ...
- Premix routing. Establish an explicit anonymity set of most of the nodes 
within some number of hops, and use the darknet to work out which nodes 
are "real" versus which are bogus Sybil's, or try to plug in a network-wide 
mixnet layer like I2P for opennet.
- Just live with it: you can identify your neighbours' traffic. This is 
plainly unacceptable.

On Tuesday 18 September 2007 08:31, you wrote:
> Hello everyone
> 
> I announced half a year ago that I was writing a master's thesis on
> possible Sybil attacks on Freenet. I don't want to spam the devl list,
> because the information in the paper is already known, though not in such
> clear detail. So I am posting it here on the technical list, since the
> information in the paper won't help much with development.
> 
> The paper gives some theoretical background on how safe you might or might
> not be in Freenet; however, the network topology is very, very simplified
> throughout the paper to make it possible to analyze it theoretically.
> 
> For those who won't believe that darknet and trusted peers are a necessity,
> the paper will open your eyes. Freenet 0.5 and Opennet are wide open to the
> attacks described in the paper. A small step for Freenet, a big step for
> me; I finally have some spare time :)
> 
> Anyone interested can read the paper here; it's quite mathematical, but I
> hope easy to read for any undergraduate CS student.
> 
> http://download.apophis.ch/paper/MA.pdf
> 
> cheers 
> Thomas Bruderer aka. Apophis