On 7/27/06, Ludovic Courtes <[EMAIL PROTECTED]> wrote:
> Hi,
>
> If I understand correctly, meta-data in flud are stored in a DHT while
> data are stored using point-to-point relations among peers, as in
> PeerStore (by Landers et al). AFAIK, DHTs usually assume that all the
> participants (or a very large fraction thereof) honor the DHT protocol.
> Since flud is apparently designed to be used among mutually suspicious
> peers, aren't you afraid that using a DHT can make the service
> vulnerable to misbehaving peers?

The DHT layer is one of the most susceptible components, but there
are several reasons why this isn't worrisome at this point:
1) The DHT layer is mostly there for performance. As long as a node
has not lost its data, it retains a local copy of the metadata for
files that it has stored. It can thus access that metadata directly
(during verify operations, for example) without relying on the DHT,
or use its local copy to detect malicious or misbehaving nodes at the
DHT layer and blacklist them.
Details aren't fully described on the wiki yet, but current thinking
is that all file metadata will also be replicated outside of the DHT.
In the case where a node loses all its data, it will be more painful
for that node to retrieve metadata from non-DHT resources than from
the DHT, but absolutely possible. This has two effects: (a) correct
operation in the face of DHT failure and (b) decreased incentive to
mount an attack on the DHT (because of (a)).
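
To make the local-copy check in point 1 concrete, here is a minimal
sketch in Python. It is not flud's code: the class name, the method
names, and the one-strike blacklist policy are all assumptions made up
for illustration. It only shows the idea of comparing what a DHT node
returns against a digest of the metadata we stored ourselves.

```python
import hashlib


class MetadataChecker:
    """Hypothetical helper: compares DHT answers with locally kept metadata."""

    def __init__(self, max_strikes=1):
        self.local = {}          # key -> SHA-256 digest of metadata we stored
        self.strikes = {}        # node_id -> number of bad answers seen
        self.blacklist = set()   # node_ids we no longer trust
        self.max_strikes = max_strikes  # assumed policy, not from flud

    def remember(self, key, metadata: bytes):
        """Record the digest of metadata at the time we store it."""
        self.local[key] = hashlib.sha256(metadata).hexdigest()

    def check(self, node_id, key, returned: bytes):
        """Return True/False if we can judge the answer, None if we cannot."""
        expected = self.local.get(key)
        if expected is None:
            return None  # no local copy, so nothing to compare against
        if hashlib.sha256(returned).hexdigest() == expected:
            return True
        # The answer disagrees with our own copy: count it against the node.
        self.strikes[node_id] = self.strikes.get(node_id, 0) + 1
        if self.strikes[node_id] >= self.max_strikes:
            self.blacklist.add(node_id)
        return False
```

A node that has lost its local data gets None from check() for every
key, which is exactly the case where the non-DHT replicas matter.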
2) Using a Kademlia-style DHT gives a lot of advantages over a
Chord/Pastry/etc-style DHT in this setting. There is a good deal of
sybil-resistance already built into the way Kademlia works, for
instance, and since the querying node is in charge of all iterations
of the protocol, it can leverage quorum decisions and do things like
inject a small bit of randomness into the routing to subvert attempts
at corrupting the DHT.
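
As a rough illustration of the quorum and randomness ideas in point 2,
the Python sketch below asks a randomly chosen subset of candidate
nodes and accepts a value only if enough of them agree. It is a
simplification, not Kademlia's real lookup: there is no XOR distance
or iteration here, and the function name, the k and quorum parameters,
and the query callback are all hypothetical.

```python
import random
from collections import Counter


def quorum_lookup(candidates, query, k=5, quorum=3, rng=random):
    """Ask up to k randomly chosen candidates; accept a value only if
    at least `quorum` of them return the same (hashable) answer."""
    pool = list(candidates)
    # Random choice among candidates makes it harder for an attacker to
    # predict which nodes will be asked (assumed stand-in for the
    # "small bit of randomness" in routing).
    chosen = rng.sample(pool, min(k, len(pool)))
    answers = Counter()
    for node in chosen:
        value = query(node)      # query is supplied by the caller
        if value is not None:    # None means the node did not answer
            answers[value] += 1
    if not answers:
        return None
    value, votes = answers.most_common(1)[0]
    return value if votes >= quorum else None
```

Because the querying node drives every step, it alone decides how many
peers to ask and how much agreement to require; no intermediate node
can lower those thresholds.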
3) Because measures like those described above have been taken, a
very large portion of the flud network would need to be compromised
before having significant impact on correctness, and such an attack
would be equally effective against components other than the DHT. For
now, we are relying on requiring nodes to find hash collisions with
their own IDs and a timestamp (as in hashcash and Herbivore's join op)
to discourage sybil attacks. We're also banking on the value/effort
proposition of mounting such attacks; when the flud network is small,
it is a low-value target (relatively). When it is large, it is
increasingly difficult to mount such an attack (due to resources
required, Kademlia's preference for old nodes, etc.).
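
The ID-generation cost mentioned in point 3 can be sketched in a
hashcash-like way. The Python below is an assumption-laden
illustration, not flud's actual join procedure: it treats "finding a
collision" as finding a nonce whose SHA-256 digest over (public key,
timestamp, nonce) starts with a given number of zero bits, and uses
that digest as the node ID. The function names, field layout, hash
choice, and difficulty value are all hypothetical.

```python
import hashlib
import time


def _digest(public_key: bytes, ts: int, nonce: int) -> bytes:
    # Fields are joined with "|" so different inputs cannot run together.
    data = public_key + b"|" + str(ts).encode() + b"|" + str(nonce).encode()
    return hashlib.sha256(data).digest()


def _has_leading_zero_bits(digest: bytes, bits: int) -> bool:
    return int.from_bytes(digest, "big") >> (256 - bits) == 0


def mint_node_id(public_key: bytes, difficulty_bits=12, timestamp=None):
    """Search for a nonce; expected work is about 2**difficulty_bits hashes."""
    ts = int(time.time()) if timestamp is None else timestamp
    nonce = 0
    while True:
        digest = _digest(public_key, ts, nonce)
        if _has_leading_zero_bits(digest, difficulty_bits):
            return digest.hex(), ts, nonce
        nonce += 1


def verify_node_id(node_id, public_key, ts, nonce, difficulty_bits=12):
    """Verification costs one hash, however expensive minting was."""
    digest = _digest(public_key, ts, nonce)
    return digest.hex() == node_id and _has_leading_zero_bits(
        digest, difficulty_bits
    )
```

The asymmetry is the point: each extra difficulty bit doubles the cost
of creating an identity, while checking one stays at a single hash. A
real scheme would also reject timestamps that are too old, which this
sketch leaves out.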
Still, you are right to be leery of the DHT. Without metadata,
files are essentially lost even if all file data is present, so it is
very important to make sure that the metadata is safe. DHTs still
have many open problems, but I'm satisfied that the scheme used in
flud will keep data in a completely recoverable state.
I'm sure we'll learn a lot more as we test with deliberately crafted
malicious nodes (if writing such beasts appeals to anyone, please
speak up -- I'd love the help).
Alen
_______________________________________________
p2p-hackers mailing list
[email protected]
http://lists.zooko.com/mailman/listinfo/p2p-hackers