Zooko Wilcox-O'Hearn wrote:
> Now, convergent encryption could do both jobs with one value! If you
> let the symmetric key be the secure hash of the plaintext, then the
> reader could use the symmetric key to decrypt, then verify that the
> key was the hash of the plaintext.

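For reference, the quoted idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names are made up, SHA-256 stands in for whatever secure hash would really be used, and the XOR keystream is a toy stand-in for a real cipher such as AES-CTR.

```python
import hashlib


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher (assumption: stand-in for AES-CTR, not for real use).

    XORs data with SHA-256(key || block_counter) blocks. Applying it twice
    with the same key returns the original data.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hashlib.sha256(key + counter).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


def convergent_encrypt(plaintext: bytes):
    """Derive the key as H(plaintext) and encrypt under it."""
    key = hashlib.sha256(plaintext).digest()
    return key, keystream_xor(key, plaintext)


def convergent_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """Decrypt, then check that the key really is H(plaintext)."""
    plaintext = keystream_xor(key, ciphertext)
    if hashlib.sha256(plaintext).digest() != key:
        raise ValueError("integrity check failed: key is not H(plaintext)")
    return plaintext
```

Note that the check in convergent_decrypt() can only run once the whole plaintext has been recovered.
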
In addition to the other reasons you listed, you might not be able to
use this because of alacrity: a CHK hash can't be validated until the
entire plaintext has been downloaded.

OTOH, it's conceivable that you could build up a plaintext merkle tree
with about the same effort as the normal CHK flat hash, and use the
root of that as your encryption key, and safely encrypt the plaintext
hash tree in a way that lets you grab it quickly (one node at a time).
It'd be kinda complex, but that might let you use CHK-like encryption
keys that also gave you low-alacrity integrity properties.

> Here's my idea about ensuring both confidentiality and integrity with
> a single crypto value.

Ah, good, thanks for writing this up. I certainly like your scheme
better than the fragments of your scheme that I was able to
reconstruct from a memory of a vague conversation :-). I'll try to
update NewImmutableEncodingDesign in the next few days with your
algorithm.

Some observations:

* obviously the "v = H(ciphertext)" could+should be expanded to
  include our usual UEB scheme, with all integrity information (merkle
  trees, share hash trees, ideally even an encrypted form of the
  plaintext hash data) going into the UEB, and "v" being the hash of
  the UEB. David-Sarah's point about making verifycap=H(v,K1enc) is
  spot-on.

* verifycap cannot be offline-derived from readcap: you have to run
  through part of the download process, fetch at least "v" and the
  K1enc value, derive K1, hash K1+v together to confirm that you
  really do get the readcap, then emit H(v+K1enc) as the verifycap.
  This makes manifest/repaircap generation really expensive (a network
  trip per file). One mitigation strategy would be to store both
  readcap and verifycap in dirnodes, effectively caching the verifycap
  computation.

* what should the storage-index be? It clearly must be the hash of the
  readcap, otherwise readers cannot find the shares (or must carry
  around some extra value, negating the shortness of the readcap).

* but since storage-index != verifycap (i.e. H(UEBhash+k1enc)),
  servers will be unable to completely validate their shares. They can
  confirm that everything (including K1enc, thanks to David-Sarah's
  suggestion) matches the verifycap, but they can't tell that the
  verifycap matches the storage-index under which the share is stored
  (i.e. they'd be unable to detect two swapped sharefiles). This
  permits the "roadblock" attack and generally misses our goals of
  allowing full server-side validation.

* we can't determine the storage-index until after we've encoded the
  entire file (which generally means after we've uploaded it). So we
  need a new uploader protocol that lets us upload to an
  as-yet-unnamed slot, and then provide the slot's storage-index at
  the very end of the process. This is more work, but it isn't a huge
  deal.

* we wouldn't be able to directly use our permuted-list Tahoe2
  peer-selection protocol, since we won't know the storage-index (and
  thus the permuted list) until after we've uploaded all the shares. I
  think we'd have to go with the "server-selection-index" idea: a much
  shorter string (since it only needs to provide load-balancing, not
  collision resistance), either randomly generated or derived from a
  salted CHK hash (and thus computable before encoding/upload), used
  to permute the peerlist. This string must be included in the
  readcap, increasing its length, but we could probably get away with
  maybe 20 bits or so.

So, while I like the one-cryptovalue trick, I'm unsatisfied with the
lack of both server-side validation and offline readcap-to-verifycap
attenuation, and the separate SSI value makes me slightly nervous.

Incidentally, I kind of suspect that we could get away with longer
immutable readcaps if we had short directory readcaps, since I imagine
that people are more likely to share with dircaps (which get you
filenames) than with the raw filecaps.

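To make the derivations in the observations above concrete, here is a rough sketch. Everything in it is an assumption for illustration: H() is plain untagged SHA-256, xor_wrap() is a toy stand-in for a real cipher, K1enc is assumed to be K1 wrapped under the readcap, fetch_share_header is a hypothetical callback standing in for the network fetch, and permuted_peers() assumes that peers are ordered by hashing the selection index together with each peer id.

```python
import hashlib


def H(*parts: bytes) -> bytes:
    """Plain SHA-256 over the concatenated parts (assumption: untagged)."""
    return hashlib.sha256(b"".join(parts)).digest()


def xor_wrap(key: bytes, value: bytes) -> bytes:
    """Toy key wrap for values up to 32 bytes; it is its own inverse."""
    return bytes(a ^ b for a, b in zip(value, H(b"wrap", key)))


def upload(ciphertext: bytes, k1: bytes):
    """Caps and share header are only known after the ciphertext exists."""
    v = H(ciphertext)              # in practice: hash of the UEB
    readcap = H(k1, v)
    k1enc = xor_wrap(readcap, k1)  # assumption: K1 is wrapped under the readcap
    storage_index = H(readcap)
    return readcap, storage_index, {"v": v, "k1enc": k1enc}


def readcap_to_verifycap(readcap: bytes, fetch_share_header) -> bytes:
    """Not an offline operation: needs v and K1enc from a server."""
    header = fetch_share_header(H(readcap))  # one network trip per file
    v, k1enc = header["v"], header["k1enc"]
    k1 = xor_wrap(readcap, k1enc)
    if H(k1, v) != readcap:
        raise ValueError("share header does not match readcap")
    return H(v, k1enc)


def permuted_peers(ssi: bytes, peer_ids):
    """Order peers by a short server-selection-index known before upload."""
    return sorted(peer_ids, key=lambda peer_id: H(ssi, peer_id))
```

A server holding the header can recompute H(v, k1enc), but nothing it holds lets it recompute H(readcap), which is why it cannot tie the share to the storage-index it is filed under.
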
On the other hand, I fear that we have even fewer tricks available for
mutable encoding schemes, unless semiprivate keys work out.

cheers,
 -Brian
_______________________________________________
tahoe-dev mailing list
http://allmydata.org/cgi-bin/mailman/listinfo/tahoe-dev
