Re: [freenet-dev] Pitch Black Attack - Analysis, Code, Etc.

2013-01-31 Thread Michael Grube
On Mon, Jan 28, 2013 at 6:22 PM, Matthew Toseland t...@amphibian.dyndns.org
 wrote:

 On Monday 28 Jan 2013 21:39:54 Michael Grube wrote:
  On Mon, Jan 28, 2013 at 4:14 PM, Matthew Toseland
  t...@amphibian.dyndns.org wrote:
 
   On Monday 28 Jan 2013 18:09:07 Michael Grube wrote:
    On Sun, Jan 27, 2013 at 12:30 PM, Matthew Toseland
    t...@amphibian.dyndns.org wrote:
   
     On Sunday 27 Jan 2013 05:02:17 Michael Grube wrote:
      Hi everyone,
     
      Around this time last year I started on work to simulate the pitch
      black attack working against Oskar Sandberg's swapping algorithm
      that is implemented for use with darknet. The work is essentially
      incomplete, but I did enough to get an idea of how well Oskar's
      proposed solution to the Pitch Black attack works. Hopefully the
      information that follows can provide some insight for anybody
      working on this problem.
    
     It looks like we have a good chance of using Oskar's original plan.
     Maybe even getting it published, with some help (carl might help even
     if you don't have time?).
    
      The code is messy, so I'm going to do a walkthrough of how exactly I
      ran the simulation.
     
      To start off, my code is available at http://github.com/mgrube/pbsim .
     
      Let's start. The first thing I did was create the small world
      network that is assumed in the darknet. The graph can obviously be
      of any size, but in our experiment we'll make the network 1000
      nodes. This is pretty simple in python and can be accomplished with
      one line in the networkx library:
    
     You did check that there isn't a scalability issue? :)
   
    I tested with 10,000 nodes as well and the results did not vary by
    much. The most important difference I noticed was that 2 attackers
    became a less significant number. Not that this really means anything
    to a would-be attacker.
   
    If you are convinced that scalability is a problem, I can add support
    for threads to what I have and make it easy to simulate 100,000 or 1M
    or whatever number we want to try.
  
   I don't know. In my experience threads complicate matters quite a bit.
  
  Certainly they can. In this case it would help, however.
 
     I wonder if it would be worth writing up the natural pitch black via
     churn evolution we saw in ~ 2008. Basically, when you have churn,
     newbies end up with the worst locations i.e. those furthest away from
     the main clusters. So even without an attack, the network locations
     become more and more clustered. We fixed it by periodic
     randomisation, which seemed to have relatively little cost - the
     nodes quickly got their old locations back, more or less.
    
     Another thing we want to look into is what the cost is of swapping
     (especially on a growing network, or even two networks merging) in
     terms of losing datastores due to locations changing. That might need
     more detailed simulations...
   
    I will see what I can do about looking into these sometime later this
    week.
  
   Good.
  
      We can see the aftermath by looking at a histogram of node
      locations. The randomize function uses a random function to assign
      each node a location, so first let's look at a histogram of
      locations before the attack:
     
      http://127.0.0.1:/CHK@ODZ1s5SDYrVvyNo0ONh4O9rtI~pcVmTSShh47UFPY5U,SKJfkX2eswHMrqidDWTUoZKGMaZ9yt0l6uLUZMmxOqk,AAMC--8/preattacklocations.PNG
    
     Surprisingly wide range of concentrations.
    
      The biasing locations for these attack nodes were:
      .6935
      .1935
      .9435
      .4435
      .4665
      .9665
      .7165
      .2165
     
      Our histogram of node locations now shows a disproportionate number
      of nodes with those locations:
     
      http://127.0.0.1:/CHK@aI0BN0NXEjU--8dFtCYZwPwUWcM0rpamIf3lnv7FfHc,SCr2NPJYZVpFJKSf-qDYerQTQyDfdoV3-DeX-W1e91I,AAMC--8/postattack.PNG
    
     Scary!
   
    Quite.
   
      So, the attack with only two nodes is obviously very effective. It's
      important to note that the attack simulation method assumed that
      nodes were attacking before the swapping algorithm had a chance to
      organize the network. This is something of a worst case scenario.
    
     So this is attacking after we've done some swapping but not enough to
     reach the point of diminishing returns?
    
      Now, let's measure the effectiveness of Oskar Sandberg's proposed
      solution, which is described on the bug tracker:
      https://bugs.freenetproject.org/view.php?id=3919
     
      We can test sandberg's solution by using:
     
      sandbergsolution(sandberg_solution_network, attackers, .037)
     
      The last parameter, .037, is the distance threshold for
      re-randomizing a node's location. To be 

[freenet-dev] Pitch Black Attack - Analysis, Code, Etc.

2013-01-31 Thread Michael Grube
Old response that was never forwarded.

On Sun, Jan 27, 2013 at 9:06 AM, Arne Babenhauserheide arne_...@web.de wrote:

 Hi Snark,

 Thank you for posting! Your analysis looks pretty good.

 On Sunday, 27 January 2013, 00:02:17, Michael Grube wrote:

  Not bad! There is obviously still some influence but the location
  distribution has evened out noticeably.
 
  There is one down side to this solution, however, and that is that it
  appears to affect search performance. By how much, I am not sure, but our
  link length distribution is now looking less ideal:
 
 
 http://127.0.0.1:/CHK@TdODwHOdC9peiHYGtTxDa9yy9v0lXSHKWW4G7wM5-~A,OIy08YxNZdg4M3vpgm7wETOhUvU3RYFzrkJQ7No9poE,AAMC--8/deterioratinglinkdist.PNG

 What happens if you now apply normal swapping to this distribution? Does
 it get better or do we see a general problem of swapping?


Do you mean without attackers? Changing back to the original swapping
method with attackers makes the location distribution fall apart again.
Without attackers from this point on, the link length distribution moves
back to the ideal distribution again. I just ran this again to be sure, but
the resulting graph is nothing new, so I didn't insert it.



 (in some tests while discussing probes, a swapping example I wrote worked
 well for some stuff, but broke down with certain configurations)

 The link length distribution could be a pretty big problem…

 Compare it with the real distribution:


 http://127.0.0.1:/USK@pxtehd-TmfJwyNUAW2Clk4pwv7Nshyg21NNfXcqzFv4,LTjcTWqvsq3ju6pMGe9Cqb3scvQgECG81hRdgj5WO4s,AQACAAE/statistics/148/plot_link_length.png


My first thought is that I'd like to see this graph with link length from 0
to 1. Also, it's important to note that this graph is percent of nodes with
that link length or smaller, whereas the graphs I inserted are counts of
the number of links with the distance marked on the x axis. This difference
might have been obvious to some people, but I just want to be sure
everybody sees that.

I can't promise anything immediate, but I was already implementing the
search algorithm in my simulation. I can try to get some actual numbers by
this time next week.



 Best wishes,
 Arne

 PS: I also like it that you used freenet itself for hosting!


Of course =)


 --
 To be apolitical
 is to be political,
 without noticing it.
 - Arne (http://draketo.de)




Re: [freenet-dev] Pitch Black Attack - Analysis, Code, Etc.

2013-01-31 Thread Matthew Toseland
On Thursday 31 Jan 2013 16:16:38 Michael Grube wrote:
  
  So how exactly do you use the 0.037 constant?
 
  If you don't have a peer with distance greater than 0.037 * (distance from
  random location to nearest node to the random location), then you reset?
  (This will break for nodes with very small degree...)
 
 No. It's not based on your immediate peer - a probe doing a DFS search
 returns the closest result to some key that is selected at random. If the
 closest node identifier to the randomly selected location is further than
 some distance, the node originating the probe needs to randomize its
 location.

I don't understand. The node originating the search is not the victim. It 
doesn't have a wrong location. So why does this help at all?

Oskar's proposal, as you quoted:

   From your notes in the bug tracker:
  
    Pick a key randomly, route for it with a special query that returns the
    nearest node identifier to the key found. If the closest you can get is
    much further than your distance to your neighbors, give up your current
    position for the random one. The definition of much further needs to be
    determined experimentally, but it shouldn't be an issue (since the
    attack in question works by putting a neighbor thousands of times
    closer to you than it should be).

In other words, we use the probe to find out how far we *should* be from our 
neighbours. Then if we are much too close, we are probably the victim of an 
attack, so we randomise.



Re: [freenet-dev] Pitch Black Attack - Analysis, Code, Etc.

2013-01-31 Thread Matthew Toseland
On Thursday 31 Jan 2013 19:24:27 Michael Grube wrote:
 On Thu, Jan 31, 2013 at 2:06 PM, Matthew Toseland t...@amphibian.dyndns.org
  wrote:
 
  On Thursday 31 Jan 2013 16:16:38 Michael Grube wrote:

    So how exactly do you use the 0.037 constant?
   
    If you don't have a peer with distance greater than 0.037 * (distance
    from random location to nearest node to the random location), then you
    reset?
    (This will break for nodes with very small degree...)
  
   No. It's not based on your immediate peer - a probe doing a DFS search
   returns the closest result to some key that is selected at random. If
   the closest node identifier to the randomly selected location is further
   than some distance, the node originating the probe needs to randomize
   its location.
 
  I don't understand. The node originating the search is not the victim. It
  doesn't have a wrong location. So why does this help at all?
 
 The node initiating the search very well could be a victim. Yes, if all of
 their peers are malicious then they are out of options, but assuming most
 of their trusted peers are not malicious Sandberg's algo works just fine.
 All nodes with malicious biases should be considered victims, IMO.

So when a proportion of the network have bad locations, and therefore there is 
a gap in the ring, you want an equivalent number of nodes to reset their 
locations and hopefully fill the gap? Isn't this going to be far less efficient 
than having *the nodes that are actually affected* reset their locations? I.e. 
it'll either do far too much resetting or not nearly enough?
 
  Oskar's proposal, as you quoted:
 
    From your notes in the bug tracker:
   
     Pick a key randomly, route for it with a special query that returns
     the nearest node identifier to the key found. If the closest you can
     get is much further than your distance to your neighbors, give up your
     current position for the random one.
 
  In other words, we use the probe to find out how far we *should* be from
  our neighbours. Then if we are much too close, we are probably the victim
  of an attack, so we randomise.
 
 So you're saying we need to calculate the ideal distance with the
 probes? That is not what I'm reading:

     The definition of much further needs to be determined experimentally,
     but it shouldn't be an issue (since the attack in question works by
     putting a neighbor thousands of times closer to you than it should
     be).
 
 Are you proposing that we estimate network size by using the probes to
 find the average distance between a randomly selected value and the
 closest actual result? How does this tell us what the ideal distance
 should be?
 
AFAICS Oskar's proposal is very clear, at least if it was reported correctly:

If (the distance to your neighbours) < (arbitrary constant) * (distance from
probe), then reset.

'The distance to your neighbours'
- You don't use this at all. Hence your interpretation MUST be wrong.

'The closest you can get'
- The distance between the target location and the closest node from the probe.

'The definition of much further needs to be determined experimentally'
- This is the arbitrary constant above.
- This is the tunable parameter. Which we will need to experiment with. It's 
obviously going to be a tradeoff between security and performance.

A corrupt network will likely have big gaps covering most of the keyspace. 
Routing to a random location will likely find us in one of these gaps. On the 
other hand, the average node should be within the area with normal-ish 
routing, even in this case - i.e. it should have mostly neighbours that are 
very close to it.

So usually the probe would return a large gap, while our neighbours are close 
by. And therefore we would reset.
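
To make that check concrete, here is a minimal sketch in python (names are
hypothetical; circ_dist is circular keyspace distance; probe_target and
probe_closest_loc are the random key and the closest location the probe
found):

def circ_dist(a, b):
    # circular distance on the [0, 1) keyspace
    d = abs(a - b)
    return min(d, 1.0 - d)

def should_reset(my_loc, neighbour_locs, probe_target, probe_closest_loc,
                 c=0.037):
    # How far away our neighbours are. Here: the farthest one, i.e. we
    # reset only if *no* peer is beyond the threshold; nearest or average
    # would be alternative readings of "distance to your neighbours".
    neighbour_dist = max(circ_dist(my_loc, n) for n in neighbour_locs)
    # How close the probe could get to a random key: a rough measure of
    # how far apart locations "should" be on this network.
    probe_gap = circ_dist(probe_target, probe_closest_loc)
    # If even our farthest neighbour is much closer than the probe says it
    # should be, we are probably a pitch black victim, so reset.
    return neighbour_dist < c * probe_gap

Which distance to compare (farthest, nearest or average neighbour) and what
value of the constant is safe (the .037 above is just the value the
simulation happened to use) is exactly what the experiments would need to
settle.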

But maybe I'm missing something important here...

I'm CC'ing devl since this is an important design discussion and there isn't 
anything confidential here.



Re: [freenet-dev] Pitch Black Attack - Analysis, Code, Etc.

2013-01-28 Thread Matthew Toseland
On Monday 28 Jan 2013 21:39:54 Michael Grube wrote:
 On Mon, Jan 28, 2013 at 4:14 PM, Matthew Toseland
 t...@amphibian.dyndns.org wrote:

  On Monday 28 Jan 2013 18:09:07 Michael Grube wrote:
   On Sun, Jan 27, 2013 at 12:30 PM, Matthew Toseland
   t...@amphibian.dyndns.org wrote:
  
    On Sunday 27 Jan 2013 05:02:17 Michael Grube wrote:
     Hi everyone,
    
     Around this time last year I started on work to simulate the pitch
     black attack working against Oskar Sandberg's swapping algorithm that
     is implemented for use with darknet. The work is essentially
     incomplete, but I did enough to get an idea of how well Oskar's
     proposed solution to the Pitch Black attack works. Hopefully the
     information that follows can provide some insight for anybody working
     on this problem.
   
    It looks like we have a good chance of using Oskar's original plan.
    Maybe even getting it published, with some help (carl might help even
    if you don't have time?).
   
     The code is messy, so I'm going to do a walkthrough of how exactly I
     ran the simulation.
    
     To start off, my code is available at http://github.com/mgrube/pbsim .
    
     Let's start. The first thing I did was create the small world network
     that is assumed in the darknet. The graph can obviously be of any
     size, but in our experiment we'll make the network 1000 nodes. This
     is pretty simple in python and can be accomplished with one line in
     the networkx library:
   
    You did check that there isn't a scalability issue? :)
  
   I tested with 10,000 nodes as well and the results did not vary by
   much. The most important difference I noticed was that 2 attackers
   became a less significant number. Not that this really means anything
   to a would-be attacker.
  
   If you are convinced that scalability is a problem, I can add support
   for threads to what I have and make it easy to simulate 100,000 or 1M
   or whatever number we want to try.
 
  I don't know. In my experience threads complicate matters quite a bit.
 
 Certainly they can. In this case it would help, however.
 
    I wonder if it would be worth writing up the natural pitch black via
    churn evolution we saw in ~ 2008. Basically, when you have churn,
    newbies end up with the worst locations i.e. those furthest away from
    the main clusters. So even without an attack, the network locations
    become more and more clustered. We fixed it by periodic randomisation,
    which seemed to have relatively little cost - the nodes quickly got
    their old locations back, more or less.
   
    Another thing we want to look into is what the cost is of swapping
    (especially on a growing network, or even two networks merging) in
    terms of losing datastores due to locations changing. That might need
    more detailed simulations...
  
   I will see what I can do about looking into these sometime later this
   week.
 
  Good.
 
     We can see the aftermath by looking at a histogram of node locations.
     The randomize function uses a random function to assign each node a
     location, so first let's look at a histogram of locations before the
     attack:
    
     http://127.0.0.1:/CHK@ODZ1s5SDYrVvyNo0ONh4O9rtI~pcVmTSShh47UFPY5U,SKJfkX2eswHMrqidDWTUoZKGMaZ9yt0l6uLUZMmxOqk,AAMC--8/preattacklocations.PNG
   
    Surprisingly wide range of concentrations.
   
     The biasing locations for these attack nodes were:
     .6935
     .1935
     .9435
     .4435
     .4665
     .9665
     .7165
     .2165
    
     Our histogram of node locations now shows a disproportionate number
     of nodes with those locations:
    
     http://127.0.0.1:/CHK@aI0BN0NXEjU--8dFtCYZwPwUWcM0rpamIf3lnv7FfHc,SCr2NPJYZVpFJKSf-qDYerQTQyDfdoV3-DeX-W1e91I,AAMC--8/postattack.PNG
   
    Scary!
  
   Quite.
  
     So, the attack with only two nodes is obviously very effective. It's
     important to note that the attack simulation method assumed that
     nodes were attacking before the swapping algorithm had a chance to
     organize the network. This is something of a worst case scenario.
   
    So this is attacking after we've done some swapping but not enough to
    reach the point of diminishing returns?
   
     Now, let's measure the effectiveness of Oskar Sandberg's proposed
     solution, which is described on the bug tracker:
     https://bugs.freenetproject.org/view.php?id=3919
    
     We can test sandberg's solution by using:
    
     sandbergsolution(sandberg_solution_network, attackers, .037)
    
     The last parameter, .037, is the distance threshold for
     re-randomizing a node's location. To be completely honest I am not
     sure why, but .037 seemed to work out as a decent experimental
     distance. This could quite easily change depending on the size of the
     network and should by no means be used as a default value.

Re: [freenet-dev] Pitch Black Attack - Analysis, Code, Etc.

2013-01-27 Thread Arne Babenhauserheide
Hi Snark,

Thank you for posting! Your analysis looks pretty good.

On Sunday, 27 January 2013, 00:02:17, Michael Grube wrote:

 Not bad! There is obviously still some influence but the location
 distribution has evened out noticeably.

 There is one down side to this solution, however, and that is that it
 appears to affect search performance. By how much, I am not sure, but our
 link length distribution is now looking less ideal:

 http://127.0.0.1:/CHK@TdODwHOdC9peiHYGtTxDa9yy9v0lXSHKWW4G7wM5-~A,OIy08YxNZdg4M3vpgm7wETOhUvU3RYFzrkJQ7No9poE,AAMC--8/deterioratinglinkdist.PNG

What happens if you now apply normal swapping to this distribution? Does it get 
better or do we see a general problem of swapping?

(in some tests while discussing probes, a swapping example I wrote worked well 
for some stuff, but broke down with certain configurations)

The link length distribution could be a pretty big problem…

Compare it with the real distribution:

http://127.0.0.1:/USK@pxtehd-TmfJwyNUAW2Clk4pwv7Nshyg21NNfXcqzFv4,LTjcTWqvsq3ju6pMGe9Cqb3scvQgECG81hRdgj5WO4s,AQACAAE/statistics/148/plot_link_length.png

Best wishes,
Arne

PS: I also like it that you used freenet itself for hosting!
--
To be apolitical
is to be political,
without noticing it.
- Arne (http://draketo.de)





Re: [freenet-dev] Pitch Black Attack - Analysis, Code, Etc.

2013-01-27 Thread Matthew Toseland
On Sunday 27 Jan 2013 05:02:17 Michael Grube wrote:
 Hi everyone,
 
 Around this time last year I started on work to simulate the pitch black
 attack working against Oskar Sandberg's swapping algorithm that is
 implemented for use with darknet. The work is essentially incomplete, but I
 did enough to get an idea of how well Oskar's proposed solution to the
 Pitch Black attack works. Hopefully the information that follows can provide
 some
 insight for anybody working on this problem.

It looks like we have a good chance of using Oskar's original plan. Maybe even 
getting it published, with some help (carl might help even if you don't have 
time?).
 
 The code is messy, so I'm going to do a walkthrough of how exactly I ran
 the simulation.
 
 To start off, my code is available at http://github.com/mgrube/pbsim.
 
 Let's start. The first thing I did was create the small world network that
 is assumed in the darknet. The graph can obviously be of any size, but
 in our experiment we'll make the network 1000 nodes. This is pretty
 simple in python and can be accomplished with one line in the networkx
 library:

You did check that there isn't a scalability issue? :)
 
 from pynetsim import *
 from networkx import *
 from pylab import *
 random_network = navigable_small_world_graph(1000, 4, 2, 1, 1).to_undirected()
 
 We've just made a Kleinberg small world graph of 1000 nodes with 4 nearest
 neighbors, 2 randomly chosen long distance connections, an exponent
 describing the probability of a connection (explicitly defined as 1/d for a
 Kleinberg graph), and specified one dimension.
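
 As a quick sanity check (plain networkx calls, nothing pbsim-specific),
 you can confirm the graph came out the expected size before going further:

 print(random_network.number_of_nodes())   # should be 1000
 print(random_network.number_of_edges())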

Library API looks nice. Is it too slow to simulate swapping on large networks? 
(100k nodes). There are various things it'd be interesting to try...
 
 Now we'll randomize the locations assigned to all of the nodes with the
 randomize function in pynetsim:
 
 randomize(random_network)
 
 To compare graphs with the same initial conditions, we're going to copy the
 graph...
 
 clean_swap_network = random_network.copy()
 attacked_network = random_network.copy()
 sandberg_solution_network = random_network.copy()
 
 Before we go any further, we should verify that the darknet swapping
 algorithm is implemented properly.
 
 In order for the swapping algorithm to be the most effective, it should
 occur about 2000N times, where N is the number of nodes in the graph. Let's
 do that:
 
 i = 0
 while i < 2000:
     i += 1
     swapiteration(clean_swap_network)

swapIteration involves every node trying to swap once?
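
For reference, a single swap attempt under the Metropolis-Hastings rule in
Sandberg's work would look roughly like this (my own sketch, not necessarily
what pynetsim's swapiteration does; it assumes each node's location is
stored in a 'loc' node attribute):

import random

def circ_dist(a, b):
    # circular distance on the [0, 1) keyspace
    d = abs(a - b)
    return min(d, 1.0 - d)

def try_swap(g, a, b):
    # One Metropolis-Hastings swap attempt between nodes a and b.
    la, lb = g.node[a]['loc'], g.node[b]['loc']
    before = after = 1.0
    for n in g.neighbors(a):
        if n == b:
            continue  # a mutual edge contributes equally before and after
        before *= circ_dist(la, g.node[n]['loc'])
        after *= circ_dist(lb, g.node[n]['loc'])
    for n in g.neighbors(b):
        if n == a:
            continue
        before *= circ_dist(lb, g.node[n]['loc'])
        after *= circ_dist(la, g.node[n]['loc'])
    # Always swap if the product of distances to neighbours does not get
    # worse; otherwise swap with probability before/after.
    if after <= before or random.random() < before / after:
        g.node[a]['loc'], g.node[b]['loc'] = lb, la

If swapiteration gives each of the N nodes one such attempt, 2000 iterations
would give the roughly 2000N swap attempts mentioned above.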
 
 Now it's time to stare at your computer while every node runs the swapping
 algorithm 2000 times.

IIRC swapping doesn't scale (at least not linearly)... some of the papers 
talked about log^2(n) iterations?
 
 After you've done that, you can examine the link length distribution by
 using the hist function and the edges of the clean_swap_network:
 
 linklengths = list()
 
 for e in clean_swap_network.edges():
     linklengths.append(abs(e[0][0] - e[1][0]))
 
 hist(linklengths, 100)
 
 I've taken a snapshot of the histogram that is generated from this:
 
 http://127.0.0.1:/CHK@YbRCKaxMhIxRgBpYIT2Ux4t-Pi01xxMu2XqWWvG2YY0,SWOn~fUZygCOzHNruIiw7eZEwqFDme9ZGTc4vdtgQFQ,AAMC--8/post_swapping.PNG
 
 Ok, so it looks like our swapping algorithm is properly implemented. Let's
 attack the network!
 
 attackers = list()
 # We're picking 2 malicious nodes because that is the number chosen by the
 # writers of the Pitch Black paper.
 pickmalnodes(attacked_network, attackers, 2)
 # We're using 2 nodes, each with 4 malicious locations.
 attacksimulation(attacked_network, attackers)
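
As a rough sketch of what the attack itself does (my paraphrase of the Pitch
Black paper; the author's pickmalnodes/attacksimulation may differ in
detail): a malicious node keeps initiating swaps, each time claiming one of
its biasing locations and inventing neighbour locations that sit almost
exactly on the victim's current location, so the honest distance-product
test always accepts and the victim adopts the poisoned location.

import random

BIAS_LOCATIONS = [.6935, .1935, .9435, .4435, .4665, .9665, .7165, .2165]

def malicious_swap_offer(victim_loc, targets=BIAS_LOCATIONS):
    # Claim to sit at one of the biasing locations, and claim our
    # neighbours are practically on top of the victim's current location,
    # which makes the swap look like a huge improvement to the victim.
    claimed_loc = random.choice(targets)
    fake_neighbour_locs = [(victim_loc + 1e-6) % 1.0] * 4
    return claimed_loc, fake_neighbour_locs

The attacker then discards whatever location it nominally received and
repeats the offer elsewhere, which is why the post-attack histogram below
piles up on those eight values.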

Maybe your code is easier to read than the paper ... Or maybe I was asleep when 
I read the paper...

I wonder if it would be worth writing up the natural pitch black via churn 
evolution we saw in ~ 2008. Basically, when you have churn, newbies end up 
with the worst locations i.e. those furthest away from the main clusters. So 
even without an attack, the network locations become more and more clustered. 
We fixed it by periodic randomisation, which seemed to have relatively little 
cost - the nodes quickly got their old locations back, more or less.

Another thing we want to look into is what the cost is of swapping (especially 
on a growing network, or even two networks merging) in terms of losing 
datastores due to locations changing. That might need more detailed 
simulations...
 
 We can see the aftermath by looking at a histogram of node locations. The
 randomize function uses a random function to assign each node a location,
 so first let's look at a histogram of locations before the attack:
 http://127.0.0.1:/CHK@ODZ1s5SDYrVvyNo0ONh4O9rtI~pcVmTSShh47UFPY5U,SKJfkX2eswHMrqidDWTUoZKGMaZ9yt0l6uLUZMmxOqk,AAMC--8/preattacklocations.PNG

Surprisingly wide range of concentrations.
 
 The biasing locations for these attack nodes were:
 .6935
 .1935
 .9435
 .4435
 .4665
 .9665
 .7165
 .2165
 
 Our histogram of node locations now shows a disproportionate number of
 nodes 

Re: [freenet-dev] Pitch Black Attack - Analysis, Code, Etc.

2013-01-27 Thread Matthew Toseland
On Sunday 27 Jan 2013 14:06:08 Arne Babenhauserheide wrote:
 Hi Snark,
 
 Thank you for posting! Your analysis looks pretty good.
 
 On Sunday, 27 January 2013, 00:02:17, Michael Grube wrote:
 
  Not bad! There is obviously still some influence but the location
  distribution has evened out noticeably.
  
  There is one down side to this solution, however, and that is that it
  appears to affect search performance. By how much, I am not sure, but our
  link length distribution is now looking less ideal:
  
  http://127.0.0.1:/CHK@TdODwHOdC9peiHYGtTxDa9yy9v0lXSHKWW4G7wM5-~A,OIy08YxNZdg4M3vpgm7wETOhUvU3RYFzrkJQ7No9poE,AAMC--8/deterioratinglinkdist.PNG
 
 What happens if you now apply normal swapping to this distribution? Does it 
 get better or do we see a general problem of swapping?
 
 (in some tests while discussing probes, a swapping example I wrote worked 
 well for some stuff, but broke down with certain configurations)
 
 The link length distribution could be a pretty big problem…
 
 Compare it with the real distribution: 
 
 http://127.0.0.1:/USK@pxtehd-TmfJwyNUAW2Clk4pwv7Nshyg21NNfXcqzFv4,LTjcTWqvsq3ju6pMGe9Cqb3scvQgECG81hRdgj5WO4s,AQACAAE/statistics/148/plot_link_length.png

I think this is backwards, can you show the ideal?
 
 Best wishes,
 Arne
 
 PS: I also like it that you used freenet itself for hosting!

