Hi Zooko,

I haven't had much time to comment on this thread about allocating shares to servers, etc.

On Tue, Jan 17, 2012 at 6:21 PM, Zooko Wilcox-O'Hearn <[email protected]> wrote:

> Folks:
>
> There is a *lot* of interest in this topic. A lot of people seem to
> think Tahoe-LAFS does everything they want except that it doesn't give
> them a way to do this.

It would be nice to be able to spread files across the nodes that are furthest away from each other on the network, and also to be able to group a bunch of nodes into one failure group. Put another way, a single failure group could simply be a bunch of nodes in the same 'network'.

> I hope this means somebody is about to volunteer to implement it! :-)
>
> Olaf:
>
> Yes, I think if we had a file describing which storage servers should
> be used and/or what description (location, tags, whatever) is
> associated with each storage server, then it would be a useful
> improvement to start distributing the contents of that file from some
> service. Then you can configure multiple gateways one time to get
> their configuration from that service from then on, instead of
> reconfiguring each of them each time you change your ideas about the
> servers. That service is called a "blesser" in the tickets mentioned
> below, because the gateway refuses to use a server unless that
> server's public key comes with a "blessing" signed by the blesser's
> private key.
>
> However, I suggested the simpler "hosts file"-style approach for
> starters because I feel like there's a lot I don't understand about
> this, and that experimentation with real situations will make it
> clearer.

I like the idea of a simple hosts-file-style approach, something like this:

192.1.1.0/24 /1/1
192.1.2.0/24 /1/2
.
.
172.1.0.0/16 /2/1
172.2.0.0/16 /2/2

The first column is the address (with network mask), and the second field, /X/Y, represents the location and rack position: X is the data centre location and Y is the rack that it is in.
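
To make the format a bit more concrete, here is a minimal Python sketch of how such a file could be read and how a node's address could be mapped to its /X/Y position. This is only an illustration: the names parse_topology and locate are made up, "#" comments and the most-specific-network rule are my assumptions, and none of this is how Sector/Sphere or Tahoe-LAFS actually does it.

```python
import ipaddress

# Example topology in the hosts-file style shown above.
EXAMPLE = """\
192.1.1.0/24 /1/1
192.1.2.0/24 /1/2
172.1.0.0/16 /2/1
172.2.0.0/16 /2/2
"""


def parse_topology(text):
    """Return a list of (network, (x, y)) pairs from hosts-file-style text."""
    entries = []
    for line in text.splitlines():
        # Assumption: '#' starts a comment; blank and '.' filler lines are skipped.
        line = line.split("#", 1)[0].strip()
        if not line or line == ".":
            continue
        cidr, path = line.split()
        x, y = path.strip("/").split("/")
        entries.append((ipaddress.ip_network(cidr), (x, y)))
    return entries


def locate(address, entries):
    """Return the (x, y) of the most specific matching network, or None."""
    addr = ipaddress.ip_address(address)
    matches = [(net.prefixlen, loc) for net, loc in entries if addr in net]
    if not matches:
        return None  # no topology known for this address
    # Assumption: when networks overlap, the longest prefix wins.
    return max(matches, key=lambda m: m[0])[1]


entries = parse_topology(EXAMPLE)
print(locate("192.1.2.17", entries))
print(locate("10.0.0.1", entries))
```

An address that matches no entry gives None, which a client could treat as "no topology configured" for that node.
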
(Replace "data centre" with "machine", and "rack" with "hdd", if it is a collection of machines in an office.) This is actually the format that Sector/Sphere uses (see http://sector.sourceforge.net/), and it seems like a sensible format for configuring the topology of the network. I've played with Sector to see how it allocates files to nodes across a WAN. It seems to work (at least for me), and the format is reasonable and not too difficult to grasp.

I would propose that something like this be implemented in the client when selecting nodes: basically, pick nodes based on the distance between them, using the /X/Y values. It would be a simple case of allocating shares first by looking at the X value in the /X/Y column and cycling through that top-level list to find groups of nodes at different locations, then moving to the Y value in the /X/Y column to place shares. You could probably select nodes randomly within the Y field to try to get an even distribution.

You might not be any better or worse off than with the current allocation policies, but at least users can configure it if they know and trust the network. It might also be easier for people whose grid is in a single location but has heterogeneous node configurations (where some physical nodes run more than one Tahoe storage node). If no topology is configured, the behaviour could fall back to the current one.

The above is just my 2c to the discussion. I need to read through all the posts again to understand it properly; I did not expect this much interest to be generated by some of the comments I made in the opening posts.

Jimmy.

-- 
http://www.sgenomics.org/~jtang/
_______________________________________________
tahoe-dev mailing list
[email protected]
http://tahoe-lafs.org/cgi-bin/mailman/listinfo/tahoe-dev
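
P.S. A rough Python sketch of the selection idea described above (cycle through the X values first, then the Y values within each X, and pick randomly within a Y). The function names and the dict of server id to (x, y) are hypothetical and only for illustration. This is not the existing Tahoe-LAFS server selection code, just one way the ordering could be done.

```python
import random
from collections import defaultdict
from itertools import zip_longest


def _interleave(lists):
    """Round-robin over lists: the first item of each, then the second, ..."""
    missing = object()
    return [item
            for group in zip_longest(*lists, fillvalue=missing)
            for item in group
            if item is not missing]


def order_servers(servers, rng=random):
    """Order server ids so that consecutive picks are spread over X, then Y.

    servers: dict mapping a (hypothetical) server id to its (x, y) position.
    """
    tree = defaultdict(lambda: defaultdict(list))
    for server_id, (x, y) in servers.items():
        tree[x][y].append(server_id)

    per_location = []
    for x in sorted(tree):
        racks = []
        for y in sorted(tree[x]):
            ids = sorted(tree[x][y])
            rng.shuffle(ids)  # random order within one rack (same Y)
            racks.append(ids)
        # Within a location, alternate between its racks.
        per_location.append(_interleave(racks))
    # Across the grid, alternate between locations.
    return _interleave(per_location)


def pick_servers(servers, n, rng=random):
    """Return the first n server ids from the spread-out ordering."""
    return order_servers(servers, rng)[:n]


servers = {
    "a": ("1", "1"), "b": ("1", "1"), "c": ("1", "2"),
    "d": ("2", "1"), "e": ("2", "2"),
}
print(pick_servers(servers, 4))
```

With the example data, the first two picks always come from different locations, and the two nodes that share /1/1 are only both used once every other rack has been tried. If servers carries no /X/Y information at all, a client would simply skip this step and use the current behaviour.
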
