On Wednesday 27 Feb 2013 19:40:49 Matthew Toseland wrote:
> On Wednesday 27 Feb 2013 18:54:34 Matthew Toseland wrote:
> > operhiem1's graphs of probed total datastore size have been attacked
> > recently by nodes returning bogus store sizes (in the multi-petabyte
> > range). This caused a sudden jump in store sizes on the total store size
> > graph. He excluded outliers, and the spike went away, but now it's come
> > back.
> >
> > The simplest explanation is that the person whose nodes are returning the
> > bogus stats has hacked their node to return bogus datastore stats even when
> > it is relaying a probe request. Given we use fairly high HTLs (30?) for
> > probes, this can affect enough traffic to have a big impact on stats.
> >
> > Total store size stats don't matter that much, but we need to use probe
> > stats for a couple of things that do:
> > 1. Pitch Black prevention will require probing for the typical distance
> > between a node and its peers. Granted on darknet it's harder for an
> > attacker to have a significant number of edges / nodes distributed across
> > the keyspace.
> > 2. I would like to be able to test empirically whether a given change
> > works. Overall performance fluctuates too wildly based on too many factors,
> > so probing random nodes for a single statistic (e.g. the proportion of
> > requests rejected) seems the best way to sanity check a network-level
> > change. If the stats can be perverted this easily then we can't rely on
> > them, so empiricism doesn't work.
> >
> > So how can we deal with this problem?
> >
> > We can safely get stats from a randomly chosen target location, by routing
> > several parts of a probe request randomly and then towards that location.
> > The main problems with this are:
> > - It gives too much control. Probes are supposed to be random.
> > - A random location may not be a random node, e.g. for Pitch Black
> > countermeasures when we are being attacked.
> >
> > For empiricism I guess we probably want to just have a relatively small
> > number of trusted nodes which insert their stats regularly - "canary" nodes?
> >
> Preliminary conclusions, talking to digger3:
>
> There are 3 use cases.
>
> 1) Empirical confirmation when we do a build that changes something. Measure
> something to see if it worked. *NOT* overall performance, low level stuff
> that should show a big change.
> => We can use "canary" nodes for this, run by people we trust. Some will need
> to run artificial configs, and they're probably not representative of the
> network as a whole.
> => TODO: We should try to organise this explicitly, preferably before trying
> the planned AIMD changes...
> 2) Pitch Black location distance detection.
> => Probably OK, because it's hard to get a lot of nodes in random places on
> the keyspace on darknet.
> 3) General stats: Datastore, bandwidth, link length distributions, etc. This
> stuff can and should affect development.
> => This is much harder. *Maybe* fetch from a random location, but even there
> it's problematic?
> => We can however improve this significantly by discarding a larger number of
> outliers.
> Given that probes have HTL 30, and assuming opennet so nodes are randomly
> distributed:
> 10 nodes could corrupt 5% of probes
> 21 nodes could corrupt 10% of probes
> 44 nodes could corrupt 20% of probes.
>
> Also note that it depends on what the stat is - the probe request stats are a
> percentage from 0 to 100, so much less vulnerable than datastore size, which
> can be *big*.
>
One proposal: use low HTL probes from each node (possibly combined with
central reporting, possibly not):
https://bugs.freenetproject.org/view.php?id=5643
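
For reference, the node counts quoted above can be reproduced with a simple model, sketched here in Python. It assumes each of the HTL hops lands on an independently, uniformly chosen node, and that a probe is corrupted if any hop is malicious. The network size is not given in the thread: `ASSUMED_NETWORK_SIZE = 5850` is a hypothetical value, picked only because it gives 10 / 21 / 44 under this model. The function names are illustrative too.

```python
import math

# Assumption: the thread does not state the network size. 5850 is chosen
# only because it reproduces the quoted 10 / 21 / 44 node counts.
ASSUMED_NETWORK_SIZE = 5850


def corruption_probability(bad_nodes: int, htl: int, network_size: int) -> float:
    """Chance that at least one of `htl` uniformly random hops is a bad node."""
    honest_fraction = 1.0 - bad_nodes / network_size
    return 1.0 - honest_fraction ** htl


def nodes_needed(target: float, htl: int, network_size: int) -> int:
    """Smallest number of bad nodes that corrupts at least `target` of probes."""
    # Solve 1 - (1 - k/N)^htl >= target for k.
    per_hop_fraction = 1.0 - (1.0 - target) ** (1.0 / htl)
    return math.ceil(per_hop_fraction * network_size)


if __name__ == "__main__":
    # Bad nodes needed at HTL 30 for each corruption target.
    for target in (0.05, 0.10, 0.20):
        print(target, nodes_needed(target, 30, ASSUMED_NETWORK_SIZE))
    # Same 44 bad nodes, but with shorter probes.
    for htl in (30, 10, 3):
        print(htl, round(corruption_probability(44, htl, ASSUMED_NETWORK_SIZE), 3))
```

Under the same assumptions, the 44 nodes that corrupt about 20% of HTL 30 probes would touch only about 7% of probes at HTL 10 and about 2% at HTL 3, which is the attraction of low HTL probes.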
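
To illustrate the point about discarding more outliers, and why an unbounded stat like datastore size is more exposed than a 0 to 100 percentage, here is a small Python sketch. The sample values, the 20% trim fraction and the `trimmed_mean` helper are all made up for illustration; this is not what operhiem1's graphing code does.

```python
from statistics import mean


def trimmed_mean(samples, trim_fraction):
    """Mean after dropping `trim_fraction` of the samples from each end."""
    if not 0 <= trim_fraction < 0.5:
        raise ValueError("trim_fraction must be in [0, 0.5)")
    ordered = sorted(samples)
    cut = int(len(ordered) * trim_fraction)
    return mean(ordered[cut:len(ordered) - cut])


# Hypothetical data: 18 honest store sizes in GiB plus 2 bogus ~2 PiB reports,
# i.e. 10% of the probe results are corrupted.
honest_store_gib = [10, 20, 20, 30, 40, 50, 50, 60, 80, 100,
                    20, 30, 40, 50, 60, 80, 100, 200]
store_samples = honest_store_gib + [2 * 1024 * 1024] * 2

# Hypothetical data: rejection percentages. A liar can report at most 100.
honest_reject_pct = [5, 8, 10, 12, 15, 9, 11, 7, 13, 10,
                     6, 14, 10, 9, 11, 8, 12, 10]
reject_samples = honest_reject_pct + [100, 100]

if __name__ == "__main__":
    print("store size, plain mean:  ", mean(store_samples))
    print("store size, trimmed mean:", trimmed_mean(store_samples, 0.2))
    print("reject %, plain mean:    ", mean(reject_samples))
    print("reject %, trimmed mean:  ", trimmed_mean(reject_samples, 0.2))
```

With these made-up numbers the two bogus store sizes drag the plain mean from under 60 GiB to over 200,000 GiB, while the same share of liars can only move the bounded percentage from 10 to 19. Trimming brings both back near the honest values, but only as long as the corrupted share of probes stays below the trim fraction.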
_______________________________________________
Devl mailing list
[email protected]
https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl
