On Wednesday 27 Feb 2013 19:40:49 Matthew Toseland wrote:
> On Wednesday 27 Feb 2013 18:54:34 Matthew Toseland wrote:
> > operhiem1's graphs of probed total datastore size have been attacked 
> > recently by nodes returning bogus store sizes (in the multi-petabyte 
> > range). This caused a sudden spike on the total store size graph. He 
> > excluded outliers and the spike went away, but now it's come back.
> > 
> > The simplest explanation is that the person responsible has hacked their 
> > node to return bogus datastore stats even when it is merely relaying a 
> > probe request, not just when it answers one itself. Given we use fairly 
> > high HTLs (30?) for probes, every probe passes through many nodes, so a 
> > handful of hacked nodes can touch enough traffic to have a big impact on 
> > the stats.
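
For concreteness, the suspected tampering would look something like the 
following - class and field names invented for illustration, not our actual 
probe code. The point is that a hacked node lies on every reply it relays, 
not only on replies about its own store:

    // Hypothetical sketch of the suspected tampering. The names are
    // invented; this is not Freenet's actual probe implementation.
    class ProbeReply {
        long datastoreSizeBytes;
    }

    class MaliciousRelay {
        // An honest relay forwards the reply unchanged; a hacked one can
        // substitute an absurd value into every probe it happens to carry.
        ProbeReply relay(ProbeReply reply) {
            reply.datastoreSizeBytes = 5_000_000L * (1L << 30); // ~5 PB lie
            return reply;
        }
    }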
> > 
> > Total store size stats don't matter that much, but we need to use probe 
> > stats for a couple of things that do:
> > 1. Pitch Black prevention will require probing for the typical distance 
> > between a node and its peers. Granted, on darknet it's harder for an 
> > attacker to have a significant number of edges / nodes distributed across 
> > the keyspace.
> > 2. I would like to be able to test empirically whether a given change 
> > works. Overall performance fluctuates too wildly based on too many factors, 
> > so probing random nodes for a single statistic (e.g. the proportion of 
> > requests rejected) seems the best way to sanity check a network-level 
> > change. If the stats can be perverted this easily then we can't rely on 
> > them, so empiricism doesn't work.
> > 
> > So how can we deal with this problem?
> > 
> > We can safely get stats from a randomly chosen target location, by routing 
> > the first several hops of a probe request randomly and then routing towards 
> > that location (rough sketch after the list below). The main problems with 
> > this are:
> > - It gives too much control. Probes are supposed to be random.
> > - A random location may not be a random node, e.g. for Pitch Black 
> > countermeasures when we are being attacked.
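
Roughly what I mean, as a hypothetical sketch - the Peer interface and all 
names here are invented for illustration, not the actual Probe code:

    import java.util.List;
    import java.util.Random;

    // Hypothetical two-phase probe walk: the first few hops are uniformly
    // random (mixing), the rest close in greedily on the target location;
    // the stat is read from whichever node the walk ends at.
    interface Peer {
        double location();   // position on the [0,1) keyspace circle
        List<Peer> peers();
    }

    class RandomThenGreedyProbe {
        private static final Random RNG = new Random();

        // Circular distance on the keyspace.
        static double distance(double a, double b) {
            double d = Math.abs(a - b);
            return Math.min(d, 1.0 - d);
        }

        static Peer route(Peer start, double target,
                          int randomHops, int greedyHops) {
            Peer current = start;
            for (int i = 0; i < randomHops; i++) {   // mixing phase
                List<Peer> ps = current.peers();
                current = ps.get(RNG.nextInt(ps.size()));
            }
            for (int i = 0; i < greedyHops; i++) {   // homing phase
                Peer best = current;
                for (Peer p : current.peers()) {
                    if (distance(p.location(), target)
                            < distance(best.location(), target)) {
                        best = p;
                    }
                }
                if (best == current) break;          // local optimum reached
                current = best;
            }
            return current;                          // query this node's stats
        }
    }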
> > 
> > For empiricism I guess we probably want to just have a relatively small 
> > number of trusted nodes which insert their stats regularly - "canary" nodes?
> > 
> Preliminary conclusions, talking to digger3:
> 
> There are 3 use cases.
> 
> 1) Empirical confirmation when we do a build that changes something. Measure 
> something specific to see if it worked: *NOT* overall performance, but 
> low-level stuff that should show a big change.
> => We can use "canary" nodes for this, run by people we trust. Some will need 
> to run artificial configs, and they're probably not representative of the 
> network as a whole.
> => TODO: We should try to organise this explicitly, preferably before trying 
> the planned AIMD changes...
> 2) Pitch Black location distance detection.
> => Probably OK, because on darknet it's hard for an attacker to get a lot of 
> nodes at random places in the keyspace.
> 3) General stats: Datastore, bandwidth, link length distributions, etc. This 
> stuff can and should affect development.
> => This is much harder. *Maybe* fetch from a random location, but even that 
> has the problems discussed above.
> => We can, however, improve this significantly by discarding a larger number 
> of outliers.
> Given that probes have HTL 30, and assuming opennet (so attacker nodes are 
> randomly distributed):
> 10 nodes could corrupt 5% of probes,
> 21 nodes could corrupt 10% of probes,
> 44 nodes could corrupt 20% of probes.
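
For the record, those figures are consistent with a probe being corruptible 
whenever any of its ~30 hops lands on an attacker node, i.e. a fraction 
1 - (1 - k/N)^30, with an assumed network size of roughly N = 6000 (my 
assumption to make the numbers work, not a measured value). Quick check:

    // Sketch: fraction of HTL-30 probes passing through at least one of
    // k attacker nodes among N randomly distributed opennet nodes.
    // N = 6000 is an assumed network size chosen to match the figures above.
    public class ProbeCorruption {
        static double corrupted(int k, int n, int htl) {
            return 1.0 - Math.pow(1.0 - (double) k / n, htl);
        }

        public static void main(String[] args) {
            for (int k : new int[] {10, 21, 44}) {
                System.out.printf("%d nodes -> %.1f%% of probes%n",
                        k, 100 * corrupted(k, 6000, 30));
            }
            // Prints roughly 4.9%, 10.0%, 19.8% - matching the list above.
        }
    }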
> 
> Also note that it depends on what the stat is - the probe request stats are 
> percentages from 0 to 100, so they are much less vulnerable than datastore 
> size, which can be *big*.
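
One way to see why discarding more outliers helps with the unbounded stats: 
with a trimmed mean, each bogus sample can at worst displace one honest 
sample rather than dragging the average arbitrarily far. A sketch with 
made-up numbers, not real probe data:

    import java.util.Arrays;

    // Sketch: a trimmed mean drops the top and bottom fraction of samples
    // before averaging, so a few multi-petabyte lies merely push honest
    // samples out of the window instead of dominating the total.
    public class TrimmedMean {
        static double trimmedMean(double[] samples, double trimFraction) {
            double[] sorted = samples.clone();
            Arrays.sort(sorted);
            int cut = (int) (sorted.length * trimFraction);
            double sum = 0;
            int count = 0;
            for (int i = cut; i < sorted.length - cut; i++) {
                sum += sorted[i];
                count++;
            }
            return sum / count;
        }

        public static void main(String[] args) {
            // 18 plausible ~50 GiB stores plus two multi-petabyte liars.
            double[] gib = {40, 42, 45, 48, 50, 50, 51, 52, 53, 55,
                            55, 56, 58, 60, 61, 63, 65, 70,
                            2_000_000, 3_000_000};
            System.out.printf("plain mean:  %.0f GiB%n", trimmedMean(gib, 0.0));
            System.out.printf("10%% trimmed: %.0f GiB%n", trimmedMean(gib, 0.10));
            // Roughly 250049 GiB versus 56 GiB.
        }
    }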
> 
One proposal: use low-HTL probes from each node (possibly combined with 
central reporting, possibly not):

https://bugs.freenetproject.org/view.php?id=5643
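
Under the same rough model as above (assumed N = 6000, 10 attacker nodes), 
cutting HTL from 30 to e.g. 5 shrinks the corrupted fraction from about 5% 
to 1 - (1 - 10/6000)^5, roughly 0.8%. The trade-off is that each probe only 
sees a small, local sample, hence wanting probes from each node.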
