On Wed, Jan 11, 2012 at 2:49 AM, Brian Warner <[email protected]> wrote:
> Yeah, our recommendation is one storage node per spindle, since it's
> individual disks that usually live or die. Computers fail too (which
> might take out multiple storage nodes), but you can usually move their
> disks to a new computer, so I think of those as transient failures.
> Whereas, when a disk fails, it's usually down for good.
>
> If you think your disks are indestructible but your partitions are not,
> then maybe multiple storage nodes per disk (one per partition) could
> make sense. Seems dubious to me, though.

It's just, again, the large amount of disk that I have in one machine
(it was cheap: a storage pod fully loaded with 45x3TB disks) and the
filesystem limits that are causing me the headache. I don't really
trust any standard Linux filesystem to safely store my data over a long
period of time without corruption (amongst other reasons).

>>> What about setting up multiple instances of tahoe storage nodes per
>>> partition on one machine, in a possible scenario where I have 150TB
>>> of space on a machine but I can only make a bunch of 16TB partitions?
>>> I ask this because we have a few machines at work right now with
>>> this kind of setup and I'm kinda pushing for using tahoe-lafs as a
>>> possible backend storage system, possibly with irods sitting on top
>>> to manage the data (yet to be decided).
>
> If you need to aggregate multiple partitions into one big one, and you
> don't have an OS-level way to do it (lvm, etc), then one cheap-and-dirty
> approach is to make symlinks from the individual prefix directories.
> Tahoe stores shares in:
>
>   $NODEDIR/storage/shares/$PREFIX/$STORAGEINDEX/$SHARENUMBER
>
> where $PREFIX is like "aa" or "7q": there are 1024 of them (first two
> characters of the base32-encoded storage-index). Since files get mapped
> to storage-index values effectively randomly, if you had two partitions,
> you could build 512 symlinks (22 to gz) that point into one of them, and
> have the other 512 symlinks (ha to zz) point to the second one. Or
> 256/256/256/256, etc. Nasty, but it'd work, as long as the Law Of Large
> Numbers holds up and the partitions fill at about the same rate.

I was thinking about doing this; it seems nasty, but I might end up
having to do it (I've put a rough sketch of what I mean below).

>>> As a side question, as we expand the number of nodes, I would
>>> probably want to change the k-of-n settings. Would the migration
>>> method to newer k-of-n parameters be copy and delete within the grid
>>> to rebalance data?
>
> Yeah, unfortunately, there's no good way to re-encode a file short of
> just downloading it and re-uploading it. The k-of-N settings for new
> uploads are controlled by the client node's tahoe.cfg file (see
> shares.required and shares.total), but they're embedded in the filecap.
> So you could set your tahoe.cfg to the new settings, use 'tahoe cp -r'
> to copy a bunch of files out of tahoe into your local directory, then
> 'tahoe cp -r' again to re-upload them (with the new settings).

Is this in the FAQ or the documentation? I think this type of
information should probably reside somewhere on the website.

> BTW, we use "re-encode" to talk about changing a file's encoding
> parameters, like 'k' and 'N': that generally means making entirely new
> shares. When we say "rebalance", we're talking about moving shares
> around without changing them, like when new servers are added, and we
> want to move shares around to spread out the load more evenly. We don't
> have automatic tools for either yet.
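Just to make sure I follow the 'tahoe cp -r' round trip, I'd expect it
to look something like this (untested sketch; "mydir" and /tmp/scratch
are placeholders, and I'm going from memory on the exact tahoe.cfg
option names, so docs/configuration.rst is the authority, not me):

    # 1. pull everything down with the old encoding
    tahoe cp -r tahoe:mydir /tmp/scratch

    # 2. set the new k-of-N in the client node's tahoe.cfg, e.g.
    #      [client]
    #      shares.needed = 5
    #      shares.total = 15
    #    then restart the node so it picks up the new settings
    tahoe restart

    # 3. re-upload; the new shares get the new encoding parameters
    tahoe cp -r /tmp/scratch tahoe:mydir

If I understand right, the re-uploaded immutable files get brand-new
filecaps (and storage indexes), so the old shares would still sit on
the servers until they get garbage collected.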
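And for the prefix-directory symlinks, here's roughly what I had in
mind for the two-partition case (untested; /mnt/part1 and /mnt/part2
are placeholder mount points, it assumes the node hasn't created any
prefix dirs yet, and since the storage indexes are effectively random
the exact split point shouldn't matter much):

    # create the 1024 two-character prefix dirs, half on each
    # partition, with symlinks from $NODEDIR/storage/shares
    # pointing at them
    cd $NODEDIR/storage/shares
    chars="2 3 4 5 6 7 a b c d e f g h i j k l m n o p q r s t u v w x y z"
    i=0
    for c1 in $chars; do
      for c2 in $chars; do
        # first 512 prefixes go to part1, the rest to part2
        if [ $i -lt 512 ]; then dest=/mnt/part1; else dest=/mnt/part2; fi
        mkdir -p "$dest/shares/$c1$c2"
        ln -s "$dest/shares/$c1$c2" "$c1$c2"
        i=$((i+1))
      done
    done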
I think I knew the re-encode/rebalance distinction already, but I
needed confirmation on it; perhaps if this isn't in the FAQ or docs
already, it might be nice to have it somewhere as well.

Thanks,
Jimmy.

--
http://www.sgenomics.org/~jtang/
