Tahoe Devs,

Is there currently any mechanism, or any plan to implement one, that allows 
storage nodes to be 'arranged' with nearby gateway/WAPI nodes, giving a more 
logical scale-out in a formal datacenter environment?  To 
borrow from Hadoop terminology, this would be called 'Rack Awareness' 
(http://hadoop.apache.org/common/docs/r0.17.2/hdfs_user_guide.html#Rack+Awareness).

For example, say that I have three facilities, and I wish to set up 3-4 nodes in 
each of them in a 3-of-10 scheme:

Facility 1 (ISDN):
Gateway 1
Node 1A
Node 1B
Node 1C

Facility 2 (ISDN):
Gateway 2
Node 2A
Node 2B
Node 2C
Node 2D

Facility 3 (56K):
Gateway 3
Node 3A
Node 3B
Node 3C

For argument's sake, let us say that we have a very expensive, limited 
connection between these facilities (to make this extreme, let's call it a 
dialup-ish connection; obviously this is an exaggeration, but the argument 
scales up).

If gateway 1 attempts to retrieve a file, it is most efficient for it to do so 
using nodes 1A, 1B, and 1C, since three local shares are enough in a 3-of-10 
scheme.  In the event that 1C is down, a share from any node at another 
facility can step in, at a greater cost to the inter-facility links.  
Ideally, you would also be able to weight the next-best share: if facilities 1 
and 2 have ISDN lines, and 3 has a dialup line, it is preferable for a gateway 
at facility 1 to query a node at facility 2.  If no weighting is configured, or 
in a distributed friendnet, other possible methods could be distance-vector 
routing (how many hops to the other nodes with shares), latency, or Geo-IP 
lookups.
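
To make the weighting idea concrete, here is a minimal Python sketch of the 
selection I have in mind.  It is not Tahoe code: the names (LINK_COST, Node, 
pick_nodes) and the cost numbers are made up for illustration, and it assumes 
each of the ten nodes holds exactly one share.

```python
# Hypothetical sketch only; none of these names exist in Tahoe.
from dataclasses import dataclass

K = 3  # shares needed to rebuild a file in a 3-of-10 scheme

# Assumed relative cost of fetching a share between two facilities
# (lower is better). Local fetches are free, ISDN-to-ISDN is cheap,
# and anything involving the 56K facility is expensive.
LINK_COST = {
    frozenset({1}): 0,
    frozenset({2}): 0,
    frozenset({3}): 0,
    frozenset({1, 2}): 10,
    frozenset({1, 3}): 50,
    frozenset({2, 3}): 50,
}


@dataclass(frozen=True)
class Node:
    name: str
    facility: int


NODES = [
    Node("1A", 1), Node("1B", 1), Node("1C", 1),
    Node("2A", 2), Node("2B", 2), Node("2C", 2), Node("2D", 2),
    Node("3A", 3), Node("3B", 3), Node("3C", 3),
]


def pick_nodes(gateway_facility, available, k=K):
    """Return the k cheapest reachable nodes for a gateway at the given facility."""
    def cost(node):
        # frozenset({1, 1}) collapses to frozenset({1}), the local case.
        return LINK_COST[frozenset({gateway_facility, node.facility})]

    # Sort by link cost, then by name so that ties are broken predictably.
    ranked = sorted(available, key=lambda n: (cost(n), n.name))
    if len(ranked) < k:
        raise ValueError("not enough nodes available to retrieve the file")
    return ranked[:k]
```

With every node up, a gateway at facility 1 picks 1A, 1B, and 1C.  With 1C 
down, it picks 1A, 1B, and then a facility 2 node in preference to any node 
behind the 56K line.  Latency or hop count could be swapped in for the static 
cost table without changing the selection logic.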

Best Regards,
Nathan Eisenberg
_______________________________________________
tahoe-dev mailing list
[email protected]
http://allmydata.org/cgi-bin/mailman/listinfo/tahoe-dev
