Brian Warner wrote:
> Some observations:
>
> * obviously the "v = H(ciphertext)" could+should be expanded to include
>   our usual UEB scheme, with all integrity information (merkle trees,
>   share hash trees, ideally even an encrypted form of the plaintext
>   hash data) going into the UEB, and "v" being the hash of the UEB.
>   David-Sarah's point about making verifycap=H(v,K1enc) is spot-on.
>
> * verifycap cannot be offline-derived from readcap: you have to run
>   through part of the download process, fetch at least "v" and the
>   K1enc value, derive K1, hash K1+v together to confirm that you really
>   do get the readcap, then emit H(v+K1enc) as the verifycap. This makes
>   manifest/repaircap generation really expensive (a network trip per
>   file). One mitigation strategy would be to store both readcap and
>   verifycap in dirnodes, effectively caching the verifycap computation.

Given that the combined (readcap, H(v, k1_enc)) is as short as just the
readcap in any alternative scheme, this seems quite acceptable to me.

> * what should the storage-index be? It clearly must be the hash of the
>   readcap, otherwise readers cannot find the shares (or must carry
>   around some extra value, negating the shortness of the readcap).
>
> * but since storage-index != verifycap (i.e. H(UEBhash+k1enc)), servers
>   will be unable to completely validate their shares. They can confirm
>   that everything (including K1enc, thanks to David-Sarah's suggestion)
>   matches the verifycap, but they can't tell that the verifycap matches
>   the storage-index under which the share is stored (i.e. they'd be
>   unable to detect two swapped sharefiles). This permits the
>   "roadblock" attack and generally misses our goals of allowing full
>   server-side validation.

That could be fixed by including the storage index in the verifycap, i.e.
(storage_index, H(v, k1_enc)). dirnodes still only need to store
(readcap, H(v, k1_enc)), since the readcap can be hashed to get the
storage index.

> * we can't determine the storage-index until after we've encoded the
>   entire file (which generally means after we've uploaded it). So we
>   need a new uploader protocol that lets us upload to an as-yet-unnamed
>   slot, and then provide the slot's storage-index at the very end of
>   the process. This is more work, but it isn't a huge deal.
>
> * we wouldn't be able to directly use our permuted-list Tahoe2
>   peer-selection protocol, since we won't know the storage-index (and
>   thus the permuted list) until after we've uploaded all the shares.

Zooko's protocol still works if r = H(k1, H(plaintext)), rather than
r = H(k1, H(ciphertext)). In that case you would only need to know the
hash of the plaintext, not the encoded ciphertext, to calculate the
storage-index. Does that help?
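
As a rough sketch of how the values suggested above could fit together:
this assumes SHA-256 with a domain-separation tag in place of the
unspecified H, treats v and k1_enc as opaque byte strings, and uses
function names and tag strings that are purely illustrative (none of
them are part of any Tahoe API).

```python
import hashlib


def H(tag: bytes, *parts: bytes) -> bytes:
    """Tagged SHA-256 over length-prefixed parts (assumed stand-in for H)."""
    h = hashlib.sha256()
    for p in (tag,) + parts:
        # Length-prefix each part so that concatenation is unambiguous.
        h.update(len(p).to_bytes(8, "big"))
        h.update(p)
    return h.digest()


def make_readcap(k1: bytes, plaintext_hash: bytes) -> bytes:
    # r = H(k1, H(plaintext)): computable before any encoding or upload.
    return H(b"readcap", k1, plaintext_hash)


def storage_index(readcap: bytes) -> bytes:
    # The storage index is a hash of the readcap, so readers can find shares.
    return H(b"storage-index", readcap)


def verify_hash(v: bytes, k1_enc: bytes) -> bytes:
    # H(v, k1_enc), the integrity half of the verifycap.
    return H(b"verify", v, k1_enc)


def verifycap(readcap: bytes, v: bytes, k1_enc: bytes):
    # Suggested form: (storage_index, H(v, k1_enc)).
    return (storage_index(readcap), verify_hash(v, k1_enc))


def dirnode_entry(readcap: bytes, v: bytes, k1_enc: bytes):
    # What a dirnode would cache: (readcap, H(v, k1_enc)).
    return (readcap, verify_hash(v, k1_enc))


def verifycap_from_dirnode_entry(entry):
    # Offline attenuation: no network trip, just hash the cached readcap.
    readcap, vh = entry
    return (storage_index(readcap), vh)


def check_share(slot_si: bytes, cap, v: bytes, k1_enc: bytes) -> bool:
    # A holder of the verifycap can check both the slot and the contents,
    # which is what would expose two swapped sharefiles.
    si, vh = cap
    return si == slot_si and vh == verify_hash(v, k1_enc)
```

With this shape, the storage index (and hence the permuted peer list)
depends only on k1 and the plaintext hash, and a dirnode entry is enough
to produce the verifycap without contacting any server.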

In the mutable-file variant I suggested there is no corresponding problem,
because v is a public verification key that is fixed for a given file, and
can be generated before any particular ciphertext.

> So, while I like the one-cryptovalue trick, I'm unsatisfied with both
> the lack of server-side validation and offline readcap-to-verifycap
> attenuation, and the separate SSI value makes me slightly nervous.

Are the above suggestions enough to address your dissatisfaction?

> Incidentally, I kind of suspect that we could get away with longer
> immutable readcaps if we had short directory readcaps, since I imagine
> that people are more likely to share with dircaps (which get you
> filenames) than with the raw filecaps. On the other hand, I fear that we
> have even fewer tricks available for mutable encoding schemes, unless
> semiprivate keys work out.

On the contrary, dircaps can be shorter than immutable filecaps due to
not needing collision resistance.

-- 
David-Sarah Hopwood ⚥ http://davidsarah.livejournal.com

_______________________________________________
tahoe-dev mailing list
[email protected]
http://allmydata.org/cgi-bin/mailman/listinfo/tahoe-dev
