On Fri, 2003-05-09 at 23:01, Toad wrote: 
> 
> Nice in theory, but in practice, how do you distribute the new 
> formulas?

I've been thinking this week about a simpler problem:  a network
of nodes which provide the service of distributed storage and
distribution of fixed-length chunks of, say, 1 MiB each.  All chunks
are supposed to contain apparently random data, i.e. data which does
not compress much.  It looks to me like this protocol has a lot
of advantages.  I think something like it could end up running
in the backbone nodes of a future content delivery network.

Chunks are addressed by their hash code.  Nodes specialize in
a region of hash space, and keep detailed information about those
peers which specialize in neighboring regions, e.g., a complete if
slightly out-of-date table of contents for neighbors.  All nodes
try to keep a complete list of all known nodes, along with the median
hash code carried by each node.
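
For concreteness, here is a minimal sketch (in Python) of the
addressing and the lookup; the hash function and data structures
are chosen only for illustration, not part of the proposal:

    import hashlib

    def chunk_address(chunk: bytes) -> int:
        # Chunks are addressed by their hash code; SHA-1 is just
        # an assumption here, the proposal doesn't fix a function.
        return int.from_bytes(hashlib.sha1(chunk).digest(), "big")

    def nearest_peers(medians: dict, target: int, k: int = 3) -> list:
        # Peers specializing nearest the target, ranked by distance
        # between the target and each peer's advertised median hash.
        return sorted(medians, key=lambda p: abs(medians[p] - target))[:k]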

Each node tries to make, say, hourly contact with every other node, and
exchanges information such as the median hash code, the available
space, and the speed of the connection.  During this contact, nodes 
also exchange reports on the discovery of new nodes and changes 
to the status of known nodes.  The total number of entries a node
must track will be kept bounded by treating nodes which specialize in
nearby hashes as a single node when necessary.
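
A rough sketch of the record such a contact might exchange; the
exact fields are my guess from the description above:

    from dataclasses import dataclass

    @dataclass
    class NodeStatus:
        node_id: str
        median_hash: int    # median hash code the node carries
        free_space: int     # available space, bytes
        bandwidth: int      # connection speed, bytes/sec
        capacity: float     # positive if idle, zero if overloaded
                            # or unreachable

    def merge_reports(table: dict, reports: list) -> None:
        # Fold a peer's reports of new and changed nodes into our
        # own table of all known nodes.
        for r in reports:
            table[r.node_id] = r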

Nodes accept requests from outside sources to supply a chunk
addressed by its hash code, or to store a chunk.  Anonymity of
the source is provided separately.  Nodes also generate random
chunks themselves, and generate chunks calculated as the XOR of 
two other chunks which happen to be in their possession.  Frequently
requested chunks are more likely to be chosen for this XOR operation.
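
The XOR itself is the whole storage primitive, so for clarity:

    CHUNK_SIZE = 1 << 20   # 1 MiB

    def xor_chunks(a: bytes, b: bytes) -> bytes:
        # The result looks as random as its inputs, and any one of
        # the three chunks is the XOR of the other two.
        assert len(a) == len(b) == CHUNK_SIZE
        n = int.from_bytes(a, "big") ^ int.from_bytes(b, "big")
        return n.to_bytes(CHUNK_SIZE, "big")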

Distributed storage is accomplished by means of these XOR results.
Whenever a node calculates the XOR of two chunks and inserts the
result, it also retains the fact that the three chunks are
related:  any one can be calculated from the XOR of the other two.
This fact is stored by inserting the triple of hash codes into
the distributed hash index.  The problem of false triples requires
that the triples be signed.  Each node which discovers that a
triple turned out to be correct will also sign it, and if a node
finds that the triple failed, it will sign and insert a denial of 
the triple.  Note that calculation of a new XOR in general requires
insertion of many triples into the table, since the new chunk may
be calculable from many pairs of known chunks.  These inferred
triples are initially marked as inferred but become certified when
they have been successfully used.
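
A sketch of the check a node would run before countersigning a
triple (signature handling itself is omitted):

    def triple_holds(a: bytes, b: bytes, c: bytes) -> bool:
        # (A,B,C) is a valid triple iff A XOR B XOR C == 0,
        # i.e. any one chunk is the XOR of the other two.
        return all(x ^ y ^ z == 0 for x, y, z in zip(a, b, c))

A node that fetches all three chunks and finds the check passes
adds its signature to the triple; if the check fails, it signs and
inserts a denial instead.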

The full table of all known triples will be rather large, and
each node cannot be expected to keep a full copy.  However,
nodes can keep copies of those triples which contain chunks
in their area of specialization.  Note that these triples will
generally also contain chunks outside of the area of specialization,
so that loss of a node will not preclude regeneration of its content.
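
The retention rule is then just (with [lo, hi) standing for the
node's region of hash space):

    def should_store(triple: tuple, lo: int, hi: int) -> bool:
        # Keep any triple which mentions a chunk in our region of
        # specialization; its other members usually lie outside it.
        return any(lo <= h < hi for h in triple)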

Insertion of a new triple into the table is a distributed operation,
since inferred triples will also need insertion, and it may be
difficult to discover all of them.  In general, the table is a
table of n-tuples, such that the n chunks XOR to zero, and new
triples may be discovered by way of longer n-tuples.  If there
is a triple (A,B,C) and some other tuple (A,D,E,...), then
the tuple (B,C,D,E,...) will also XOR to zero, since the shared
chunk A cancels (A XOR A = 0).  But this tuple might contain
another instance of B or C or both.  If so, those duplicates
cancel as well, and the n-tuple becomes an (n-2)-tuple or
(n-4)-tuple.
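
In other words, combining two zero-XOR tuples is a symmetric
difference.  A sketch, treating tuples as sets of distinct hashes:

    def combine(t1: frozenset, t2: frozenset) -> frozenset:
        # If the chunks of t1 XOR to zero and likewise for t2, the
        # chunks in the symmetric difference also XOR to zero:
        # anything appearing in both cancels in pairs.
        return t1 ^ t2

    # combine({A,B,C}, {A,D,E}) == {B,C,D,E}; if D equalled B, the
    # result would collapse further to {C,E}.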

Routing of requests for data is fairly simple.  First, some load
balancing occurs.  All nodes keep a table of all other nodes
they know about, including the capacity number.  An idle node has 
positive capacity, while an overloaded or unreachable node has 
zero capacity.  A node (the "receiver") which receives a request
for a particular hash code draws a random number and accepts the
request with probability related to its current capacity.  
Otherwise, the receiver passes the buck by drawing a second
random number and looking in the cumulative, normalized table of
node capacities to find which node gets the request.  The selected
node replies, acknowledging that it is now the receiver for the
request and reporting its updated capacity number.  If no ack is
received, the receiver restarts the request using updated capacity
numbers and new random numbers.
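
A sketch of this buck-passing, with the acceptance probability
(capacity over maximum capacity) being just one plausible choice:

    import bisect, random

    def weighted_pick(capacities: dict):
        # Draw from the cumulative normalized table of capacities;
        # overloaded or unreachable nodes (capacity 0) never win.
        nodes, cum, total = list(capacities), [], 0.0
        for n in nodes:
            total += capacities[n]
            cum.append(total)
        if total == 0:
            return None            # everyone overloaded; retry later
        return nodes[bisect.bisect_right(cum, random.random() * total)]

    def receive(my_capacity, max_capacity, capacities):
        # Accept with probability related to current capacity,
        # otherwise pass the buck to a capacity-weighted peer.
        if random.random() < my_capacity / max_capacity:
            return "accept"
        return weighted_pick(capacities)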

When the request arrives at a node (the "handler") which 
decides not to pass the buck, the handler examines the requested 
hash and looks in its table of triples to see if it knows of
a formula for the chunk.  If it does not, then it reconsiders
its decision to be the handler, using a lower probability of
acceptance.  Otherwise, it searches its table of median hashes
and sends availability queries to those nodes with medians
closest to the requested hash.  For each known formula for the
chunk, which would let it XOR against a (preferably) local
chunk to produce the desired one, it also sends availability
queries for the formula's other chunks to nodes with nearby
medians.  If there are many formulas, a subset is chosen at
random.
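
Looking up formulas is a scan of the local slice of the tuple
table; treating stored tuples as sets, something like:

    def known_formulas(wanted: int, tuples: set) -> list:
        # Any stored tuple containing the wanted hash is a formula:
        # the wanted chunk is the XOR of the tuple's other members.
        return [t - {wanted} for t in tuples if wanted in t]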

After the queries are answered or time out, the handler
decides how to handle the request.  The answers will contain
up-to-date node capacity and network bandwidth information, and
when there is more than one way to satisfy the request, the
handler will decide at random using probabilities calculated
from this information.  Thus, the request might be satisfied
by sending it on to a node which has the actual chunk, if that
node is lightly loaded, or it might be satisfied by requesting
a chunk which will permit the handler to calculate the requested
chunk and send the result back to the original requester.  If the
request can be satisfied from two non-local chunks, then 
the handler will either request that both chunks be delivered
so that it can calculate the chunk, or it will pass responsibility 
for the request on to one of the two nodes which will then
request the other non-local chunk and satisfy the request.
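
So the final step is a weighted random choice among candidate
plans, for example:

    import random

    def choose_plan(plans: list, weights: list):
        # plans: forward to a node holding the chunk; fetch one
        # chunk and XOR against a local one; fetch both chunks of
        # a formula; or delegate to one of the two holders.
        # weights: derived from the capacity and bandwidth numbers
        # in the availability answers.
        return random.choices(plans, weights=weights, k=1)[0]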

A similar scheme is used for insert requests.  It is not necessary
that a new chunk be inserted as given, and if the node which
specializes in the new chunk's hash is heavily loaded, then the
chunk will probably be stored only as one or more XOR results.

The detailed method for inserting new triples or tuples is
not fully worked out.

-- Ed Huff
