As a user/administrator of an ad-hoc Tahoe-LAFS grid, there are several assumptions that are not valid for our use case:

- Nodes do not all start out with the same amount of storage space.
- Almost all nodes will be used to store data other than Tahoe shares.
- Bandwidth balancing will be more important than storage-space balancing.
- Nodes will be expected to join, leave, and fail at random.
- The rate at which a node reaches capacity is of far less concern than the distribution of shares across nodes.
Nodes will range from 4 GB non-dedicated to 2 TB dedicated.

I realize that this is a use case that falls outside the initial design parameters of Tahoe-LAFS, but you have managed to build something useful and (somewhat) elegant. When a tool like that becomes available, people will use it in the most unexpected ways.

----
- Think carefully.
- Contra mundum - "Against the world" (St. Athanasius)
- Credo ut intelligam - "I believe so that I may understand" (St. Augustine of Hippo)

On Sat, Dec 26, 2009 at 9:12 AM, tahoe-lafs <[email protected]> wrote:
> #302: stop permuting peerlist, use SI as offset into ring instead?
> ------------------------------------+---------------------------------------
>  Reporter:  warner                  |          Owner:
>      Type:  task                    |         Status:  new
>  Priority:  major                   |      Milestone:  undecided
> Component:  code-peerselection      |        Version:  0.7.0
>  Keywords:  repair newcaps newurls  |  Launchpad_bug:
> ------------------------------------+---------------------------------------
>
> Comment (by zooko):
>
> Thanks for doing this work to simulate it and write up such a detailed and
> useful report! I think you are right that the unpermuted share placement
> can often (depending on node id placement and {{{N}}}) result in
> significantly higher inlet rates to some storage servers than others. But
> as you say it isn't clear how much this matters: "Now, do we actually need
> uniform upload rates? What we really want, to attain maximum reliability,
> is to never double-up shares. That means we want all servers to become
> full at the same time, so instead of equal bytes-per-second for all
> servers, we actually want equal percentage-of-space-per-second for all
> servers."
>
> Note that in actual deployment, storage servers end up being of multiple
> generations, so for example on the allmydata.com prodgrid the oldest
> servers are running 1 TB hard drives, then once those started filling up
> we deployed the thumper which comprises about 48 storage servers each with
> a 0.5 TB hard drive, then once the thumper started getting full we
> deployed a few more servers, including ten which each had a 2 TB hard
> drive. The point is that there was never a time (after the initial
> deployment started to fill up) where we had similar amounts of free space
> on lots of servers so that equal inlet rates would lead to equal time-to-
> full.
>
> My simulator (mentioned earlier in this thread) reported time-to-full
> instead of reporting inlet rate, and it indicated that regardless of
> whether you have permuted or non-permuted share placement, if you start
> with a large set of empty, same-sized servers and start filling them, then
> once the first one gets full then very quickly they all get full.
>
> Note that there are two separate arguments: 1. A more uniform inlet rate
> might not be so important. 2. The time between the first one filling and
> the last one filling is a small fraction of the time between the start of
> the grid and the last one filling (regardless of share placement
> strategy).
>
> I guess I'm not sure how you got from "do we actually need uniform upload
> rates?" to "easier to deal with and gives better system-wide properties"
> in your comment:12.
>
> Oh! Also note that "What we really want, to attain maximum reliability,
> is to never double-up shares" is at least partially if not fully addressed
> by #778.
>
> --
> Ticket URL: <http://allmydata.org/trac/tahoe/ticket/302#comment:13>
> tahoe-lafs <http://allmydata.org>
> secure decentralized file storage grid
> _______________________________________________
> tahoe-dev mailing list
> [email protected]
> http://allmydata.org/cgi-bin/mailman/listinfo/tahoe-dev
>
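For readers following the thread, the two placement strategies being compared can be sketched in a few lines. This is a toy Python illustration, not Tahoe-LAFS code: the function names (`permuted_placement`, `ring_placement`) and the use of SHA-256 are my own stand-ins, and the real Tahoe-2 selection differs in detail. It just shows why ordering servers by hash(peerid + SI) spreads shares more evenly than dropping the SI at a fixed offset on a ring of server ids, where uneven arc lengths translate into uneven inlet rates.

```python
# Toy comparison of the two placement strategies from ticket #302.
# NOT Tahoe-LAFS code: names and hash choice are illustrative only.
import hashlib
import random

def h(*parts):
    """Map bytes to a big integer (stand-in for the real hash)."""
    return int.from_bytes(hashlib.sha256(b"".join(parts)).digest(), "big")

def permuted_placement(si, server_ids, n):
    # Permuted peerlist: rank every server by hash(peerid + SI),
    # so each storage index sees its own shuffled server order.
    ranked = sorted(server_ids, key=lambda sid: h(sid, si))
    return ranked[:n]

def ring_placement(si, server_ids, n):
    # Unpermuted alternative: servers sit at fixed ring positions
    # (hashes of their ids); the SI picks an offset, and the next
    # n servers clockwise receive the shares.
    ring = sorted(server_ids, key=lambda sid: h(sid))
    offset = h(si)
    start = next((i for i, sid in enumerate(ring) if h(sid) >= offset), 0)
    return [ring[(start + k) % len(ring)] for k in range(n)]

random.seed(0)
servers = [random.randbytes(20) for _ in range(10)]
counts_perm = {s: 0 for s in servers}
counts_ring = {s: 0 for s in servers}
for _ in range(2000):
    si = random.randbytes(16)
    for s in permuted_placement(si, servers, 3):
        counts_perm[s] += 1
    for s in ring_placement(si, servers, 3):
        counts_ring[s] += 1

def spread(counts):
    """Busiest server's share count divided by the idlest's."""
    return max(counts.values()) / max(1, min(counts.values()))

print("permuted max/min share ratio:", round(spread(counts_perm), 2))
print("ring     max/min share ratio:", round(spread(counts_ring), 2))
```

With the permuted list every server is equally likely to rank first for a random SI, so counts cluster around the mean; on the fixed ring a server's count tracks the (random, uneven) arc of hash space in front of it, which is the non-uniform inlet rate the ticket discussion is about. Note this toy measures inlet rate only; as zooko's comment points out, time-to-full on heterogeneous servers is a separate question.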
