[Folks: I'm replying to old mailing list posts that I didn't have time to reply to when they were new because I was preparing the tahoe-1.3.0 release. Beware of time travel culture shock.]
On Feb 16, 2009, at 1:51 AM, Shawn Willden wrote:

> I don't think this is a problem. Or at least, it's not a problem
> that doesn't exist even without the weak hash. If the attacker
> knows the storage ID of your file, he can replace it in the grid --
> he doesn't need to be able to generate another file that hashes to
> the same value.

Currently we address this problem by having storage servers never overwrite immutable files with different contents. Only the first client to begin uploading an immutable file gets to choose its storage index. If another client tries to use the same storage index while that upload is in progress, the server tells it that an upload is already in progress (or maybe it says "the file is already there", which wouldn't be quite right...). Once the uploader closes the upload, the mapping between that storage index and that share is, in the mind of that storage server, set in stone.

Now, we're about to introduce garbage collection in Tahoe-1.4 or so, which raises the question: what if the share gets garbage collected, then someone uploads a different file with the same storage index, and then someone who didn't know about either of those events tries to re-upload the original one?

In the long run I think a better solution is to make the storage index equal to the verify cap. This requires different semantics for uploads-in-progress, because the verify cap isn't known to the uploader when it starts the upload, only when it finishes, so the uploader will have to tell the storage server that it is about to start uploading something and bind the in-progress upload to the current connection, or else to a temporary "upload in progress" token, instead of to the ultimate storage index. Then, once the upload is finished, the storage server moves it from the temporary "incoming" directory to the final location, indexed by its storage index, which is its verify cap. (There's a rough sketch of what I mean in the P.S. below.) The storage server can therefore also *check* that the share matches the verify cap (because anyone can check that a share fits a given verify cap), which makes all of the aforementioned issues simpler and more obviously right. As an added benefit, this might facilitate better restart of interrupted uploads and such.

I think Brian might know of other problems or complications with that proposal, so hopefully he'll follow up to this post.

> Another use case that I plan to try in the near future is to attach
> a big USB drive to a Linksys router running custom firmware, and
> use that as a Tahoe node.

:-) David Reid and Zandr Milewski are both interested in experimenting with Tahoe on those sorts of embedded NAS/router/whatsit boxes. Exciting!

>> In the year 2012 (hey, we're living in the future!), the new SHA-3
>> hash function will be chosen. That function will also, I hope,
>> require about 1/3 as many CPU cycles as SHA-256 does while being a
>> safer long-term bet.
>
> If the result parallels the success of the AES selection process,
> it may be even faster than that.

I wish! The very fastest not-yet-broken candidates right now take about 1/3 as many CPU cycles as SHA-256 (according to [1]), and the thrust of NIST's management of the contest seems to be to get a hash function which isn't slower than SHA-256, but which is safer. So even after SHA-3 is final, we'll need either as many CPU cycles as SHA-2, or perhaps 1/2, 1/3, or 1/4 as many. By comparison, MD5 takes about 1/4 as many cycles as SHA-256.
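If you want to eyeball those ratios on your own machine, here's a quick-and-dirty timing sketch (nothing from the Tahoe codebase -- the function name, repetition count, and 1 MiB message size are all just made up for illustration, and wall-clock time is only a rough proxy for CPU cycles). Only the ratio matters, not the absolute numbers:

    # Illustrative only: compare SHA-256 against MD5 from the stdlib.
    import hashlib
    import time

    def time_hash(name, data, reps=200):
        """Seconds spent hashing `data` `reps` times with the named algorithm."""
        start = time.time()
        for _ in range(reps):
            hashlib.new(name, data).digest()
        return time.time() - start

    msg = b"\x00" * (1024 * 1024)   # a 1 MiB message; try other sizes too
    t_sha256 = time_hash("sha256", msg)
    t_md5 = time_hash("md5", msg)
    print("sha256: %.3fs  md5: %.3fs  sha256/md5: %.2f"
          % (t_sha256, t_md5, t_sha256 / t_md5))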
(And by the way, it matters a lot what CPU architecture you're using and how long the messages you want to hash are.)

Regards,

Zooko

[1] http://bench.cr.yp.to/results-hash.html
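P.S. To make the storage-index-equals-verify-cap idea a little more concrete, here is a rough, untested sketch of the server side in Python. All of the names are made up, and a plain SHA-256 of the share bytes stands in for the real verify cap computation, so this is not what the actual Tahoe storage server looks like -- the point is just the shape of the protocol: bind the upload-in-progress to a temporary token, derive the storage index from the bytes at close time, and only then move the share from "incoming" to its final location.

    # Rough sketch -- made-up names; a plain SHA-256 of the share bytes
    # stands in for the real verify cap.
    import hashlib
    import os
    import tempfile

    class WriteOnceShareStore(object):

        def __init__(self, basedir):
            self.incoming = os.path.join(basedir, "incoming")
            self.shares = os.path.join(basedir, "shares")
            for d in (self.incoming, self.shares):
                if not os.path.isdir(d):
                    os.makedirs(d)

        def start_upload(self):
            # Bind the in-progress upload to a temporary token, not to a
            # storage index -- the uploader doesn't know the verify cap yet.
            fd, token = tempfile.mkstemp(dir=self.incoming)
            os.close(fd)
            return token

        def write(self, token, data):
            with open(token, "ab") as f:
                f.write(data)

        def finish_upload(self, token):
            # Derive the storage index from the uploaded bytes themselves, so
            # the server (or anyone else) can later check that the share it
            # holds matches its index.
            with open(token, "rb") as f:
                data = f.read()
            storage_index = hashlib.sha256(data).hexdigest()
            final = os.path.join(self.shares, storage_index)
            if os.path.exists(final):
                os.remove(token)  # duplicate upload; keep the existing share
            else:
                os.rename(token, final)
            return storage_index

Because the index is a function of the bytes, the server can re-check any share it holds at its leisure, which is what makes the garbage-collection and re-upload races above easier to reason about.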
