On Thu, 15 Jan 2009 00:33:09 -0700 Shawn Willden <[email protected]> wrote:
> However, it occurs to me that there may be situations in which a quick and
> dirty repair job may be adequate, and much cheaper. Rather than
> regenerating the shares and delivering the actual lost copies, the repairer
> can simply make additional copies of the shares still remaining.

Yeah, the system we had before Tahoe (now referred to as "Mountain View",
and closely related to the Mnet/HiveCache codebase) used both expansion and
replication. I think we were using an expansion factor of 4.0x and a
replication factor of 3.0x, for a total share size that was 12x the
original data. Your analysis is completely correct.

We didn't put any energy into replication in Tahoe. One reason was that it
makes failure analysis harder (*which* share was lost now matters, so one
of the independent-failures assumptions must be dropped). Another reason
was that we figured that, since allmydata.com's servers are all in the same
colo, bandwidth was effectively free. A third is that we simply haven't
gotten around to it. We'd need a storage-server API to introduce server A
to server B, and then tell A to send a given share to B (this is pretty
easy if one of them has a public IP, and gets considerably harder when both
are behind NAT). The repair process would also need to decide when it was
ok to replicate and when it was necessary to encode new shares... perhaps
the first three lost shares could be addressed by replication, after which
new shares must be generated.

> For example, the probability of losing a file with N=10, k=3, p=.95 and
> four lost shares which have been replaced by duplicates of still-extant
> shares is 9.9e-8, as compared to 1.6e-9 for a proper repair job. Not that
> much worse.

Neat! How did you compute that number?

> If there's storage to spare, the repairer could even direct all six peers
> to duplicate their shares, achieving a file loss probability of 5.8e-10,
> which is *better* than the nominal case, albeit at the expense of consuming
> 12 shares of distributed storage rather than 10.

Yeah, it's a big multivariable optimization/tradeoff problem: storage space
consumed, CPU used, bandwidth used (on links of varying capacities, owned
by different parties), reliability (against multiple sorts of failures).
Very messy :).

cheers,
 -Brian

_______________________________________________
tahoe-dev mailing list
[email protected]
http://allmydata.org/cgi-bin/mailman/listinfo/tahoe-dev
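To make the replicate-versus-regenerate decision above concrete, here is a
minimal policy sketch. The threshold of three and every name in it are
hypothetical illustrations, not an actual Tahoe repairer API:

def choose_repair_action(num_lost_shares, replication_threshold=3):
    # Hypothetical policy sketch: small losses are patched cheaply by
    # asking surviving servers to copy their shares to new servers;
    # beyond the threshold, the repairer downloads k shares, re-encodes,
    # and uploads genuinely new shares.
    if num_lost_shares == 0:
        return "nothing-to-do"
    if num_lost_shares <= replication_threshold:
        return "replicate-existing-shares"
    return "regenerate-missing-shares"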
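One model that reproduces all three of the quoted figures (1.6e-9, 9.9e-8,
and 5.8e-10): assume every share copy sits on its own server, each server
survives independently with probability p = 0.95, and the file survives as
long as at least k = 3 distinct shares retain at least one surviving copy.
This is a sketch under those assumptions, not necessarily the exact
computation Shawn used; the function below is illustrative only.

from itertools import product
from math import prod

def p_file_loss(copies_per_share, k, p=0.95):
    # copies_per_share[i] = number of replicas of distinct share i,
    # each replica on its own server.  A distinct share is lost only if
    # every one of its replicas is lost: probability (1 - p) ** copies.
    loss_prob = [(1 - p) ** c for c in copies_per_share]
    total = 0.0
    # Enumerate every pattern of lost/surviving distinct shares.
    for pattern in product([False, True], repeat=len(loss_prob)):
        survivors = sum(1 for lost in pattern if not lost)
        if survivors >= k:
            continue  # at least k distinct shares remain: file recoverable
        total += prod(q if lost else 1 - q
                      for lost, q in zip(pattern, loss_prob))
    return total

k = 3
print(p_file_loss([1] * 10, k))           # proper repair, 10 distinct shares -> ~1.6e-9
print(p_file_loss([2] * 4 + [1] * 2, k))  # 4 losses patched by duplication   -> ~9.9e-8
print(p_file_loss([2] * 6, k))            # all six survivors duplicated      -> ~5.8e-10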
