On Friday 19 June 2009 17:37:00 Robert Hailey wrote: > > On Jun 18, 2009, at 9:07 PM, Matthew Toseland wrote: > > > CHUNK SIZE PROBLEM: > > =================== > > > > Current plans call to split the keyspace up for each datastore, and > > assign keys to a manageable sized bloom filter for a section of the > > (hashed) keyspace. These can then be transferred separately, are > > useful immediately after being transferred, can be juggled in > > memory, etc. However, we cannot guarantee that the populaion of such > > a chunk will be small enough for the filter to be effective. Solving > > this requires either moving the boundaries dynamically (possibly > > continually), or making the chunks significantly larger than they > > would need to be in an ideal world. > > Right... but I believe we prescribed that the split would be based on > a hashed value of the key and not by logical keyspace location to > avoid disproportionate chunks. > > That is to say... ideally a node is going to get a disporportionate > amount of cache/store data about it's network location. > > STORE_SIZE/MAX_BLOOM_CHUNK -> N_CHUNKS > H(key, N_CHUNKS) => n & (0 < n < N) > CHUNK[n].add(key) > > Wouldn't the problem be reduced to finding a well-scattering hash > function then?
Yes and no. How much variation will we have even if we divide by hashed keyspace? Hence how much bigger than the ideal splitting size do we need each chunk to be to maintain approximately the right false positives ratio? > > Surely even hashing the byte-wise double value of the location would > be sufficient, I suppose that would look more like this... > > H(key) => location > H(location)/N_CHUNKS => n > > > > > LOCAL SECURITY ISSUES: AFTER-THE-FACT TRACING BY POWERFUL ATTACKERS > > =================================================================== > > > > We have IMHO adequately addressed the network level security issues, > > by not caching until HTL has reduced a bit. > > What about an evil peer sending a "full" filter to try and collect all > traffic? Have we discussed that? Two answers: 1) FOAF attacks can do exactly this right now. You can go for a specific area of keyspace or you can maximise the proportion covered. The only defence is the no-more-than-30%-of-requests-to-one-peer limit (which is waived if you have too few peers). 2) Bloom matches are not normal requests, they are closer to the special short-term requests we send in response to a peer offering us a key we have previously requested. We can keep success ratios for the nodes based on the number of keys they should have had and the number they actually had. > > Perhaps a backoff on a bloom filter false positive would kill such an > attack, but may interfere with initial testing. Initial testing yes but long term IMHO there is no problem. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 835 bytes Desc: This is a digitally signed message part. URL: <https://emu.freenetproject.org/pipermail/devl/attachments/20090619/2f48edcd/attachment.pgp>
