Re: [p2p-hackers] announcing Allmydata-Tahoe v0.3

Jim McCoy Thu, 14 Jun 2007 21:56:41 -0700

On 6/14/07 8:48 PM, "Alen Peacock" <[EMAIL PROTECTED]> wrote:


>   The main problem I see with both the "make replicas of my file for
> me, plzkthx" strategy and the "make some parity blocks and push them
> out to other nodes for me, plzkthx" strategy are that they seem
> inherently unfair when examined individually;

It is unfair if you consider each file publication event on its own, but in
a network where cooperation is the modus operandi (like a network where I am
depending on you to hold my bits for later retrieval :) the short term
downside has a net positive effect.  If a node does not have the bandwidth
to spare then it should not swarm out blocks it has already received from
the file publisher, but in most cases the upstream bandwidth is idle and it
is in my node's best interest to help you publish your data quickly if I have
a reasonable expectation that you will do the same for me when I need to
push a file into the mesh.

>   I think I follow you when you say that with replication in this
> example, you could theoretically finish the complete op in 3x time
> instead of 4x, but I'm curious if such a [semi-]store-and-forward
> system can really do that in practice,

The best way to accomplish this in code is to use a simple "ask, then push"
strategy for sending out data blocks.  Before a node pushes a block it asks
the remote peer if they already have the block (always a good idea in a
backup system, since there are a _lot_ of common files and you can get lucky
if someone else has already pushed the file you are backing up.)  Instead of
needing to coerce peers to forward blocks you just build in this behavior as
a default and as long as everyone is following the rules you will discover
that by the time you are getting around to sending out the second copy of a
particular block it may already have been pushed to the remote peer by
another member of the network.  After you think all of the blocks have been
pushed you can do a verification step in case a peer was sharing a block and
interrupted the transfer (leading the receiving peer to initially answer the
publisher by saying it was getting the block but this transaction did not
complete) so that you can be certain that the requisite number of copies is
available.
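
As a rough illustration of the "ask, then push" loop and the verification
pass described above, here is a minimal in-memory sketch.  The Peer class,
its has_block/store/verify methods, and the placement map are all
hypothetical names made up for this example; a real system would do the same
thing over the network.

```python
class Peer:
    """Hypothetical in-memory stand-in for a remote storage peer."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}       # block_id -> data for completed transfers
        self.incoming = set()  # block_ids another peer is still sending

    def has_block(self, block_id):
        # The "ask": true if the block is stored OR a transfer is under way.
        return block_id in self.blocks or block_id in self.incoming

    def store(self, block_id, data):
        # The "push": a completed transfer replaces any pending one.
        self.incoming.discard(block_id)
        self.blocks[block_id] = data

    def verify(self, block_id):
        # Stricter than has_block: only completed transfers count.
        return block_id in self.blocks


def publish(blocks, placement):
    """Ask-then-push every block, then verify.  Returns the push count.

    blocks:    dict of block_id -> data
    placement: dict of block_id -> list of Peers that should hold a copy
    """
    pushes = 0
    # First pass: skip any peer that says it has (or is getting) the block.
    for block_id, data in blocks.items():
        for peer in placement[block_id]:
            if not peer.has_block(block_id):
                peer.store(block_id, data)
                pushes += 1
    # Verification pass: re-push blocks whose earlier transfer never finished.
    for block_id, data in blocks.items():
        for peer in placement[block_id]:
            if not peer.verify(block_id):
                peer.store(block_id, data)
                pushes += 1
    return pushes
```

In this sketch a peer that already holds a block costs the publisher
nothing, and a peer that answered "yes" because of a transfer that was later
interrupted is only caught by the verification pass.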

Another variation of the store-and-forward idea, worth considering if your
target environment consists of systems with asymmetric bandwidth, is to enable
the publisher to cache the blocks on any available system and trust every
system to push the blocks to their eventual destination.  This would take
advantage of all of the latent upstream bandwidth available in the mesh, but
requires more trust among the peers (trust that they will stay available to
re-push the blocks, not just that they will not lie and refuse to push the
block...)  Depending on the size of the mesh and the bandwidth asymmetry
involved you might be able to get close to a 1x time to push a file and have
it confirmed at full replication in the mesh.
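
To see why the mesh size matters here, a back-of-the-envelope model may
help.  Everything in it is an assumption for illustration: every node has
the same upstream rate, downstream bandwidth is never the bottleneck, the
file is spread evenly over the relays, and the relays only start fanning out
after the publisher has finished its hand-off (which is pessimistic, since
the two would overlap in practice).  The function and parameter names are
made up for this sketch.

```python
def direct_push_time(file_size, upstream, copies):
    """Publisher uploads every copy of the file itself."""
    return copies * file_size / upstream


def relayed_push_time(file_size, upstream, copies, relays):
    """Publisher uploads the file once, spread across `relays` nodes,
    and each relay then uploads `copies` copies of its own share."""
    handoff = file_size / upstream                      # one full upload
    fanout = copies * (file_size / relays) / upstream   # relays run in parallel
    return handoff + fanout


# Example: 100 MB file, 1 MB/s upstream everywhere, 4 copies wanted.
direct = direct_push_time(100, 1, 4)
relayed = relayed_push_time(100, 1, 4, relays=40)
```

Under these assumptions the direct push takes 4x the single-upload time,
while the relayed push with 40 relays takes about 1.1x, and the second term
shrinks further as more relays become available.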

Jim


_______________________________________________
p2p-hackers mailing list
[email protected]
http://lists.zooko.com/mailman/listinfo/p2p-hackers