Shawn Willden wrote:
> Specifically, it contains:
>
> 1. The root of a Merkle tree on the file plaintext
> 2. A flat hash of the file plaintext
> 3. The root of a Merkle tree on the file ciphertext
> 4. A flat hash of the file ciphertext
> 5. Roots of Merkle trees on each share of the FEC-encoded ciphertext
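
(For reference: ignoring tahoe's tagged-SHA-256d and padding details, those
five fields boil down to two primitives applied to the plaintext, the
ciphertext, and each share. A rough sketch, not the actual tahoe code:

    import hashlib

    def flat_hash(data):
        # one hash over the whole byte string (fields 2 and 4)
        return hashlib.sha256(data).digest()

    def merkle_root(segments):
        # root of a binary Merkle tree over per-segment hashes
        # (fields 1, 3, and 5); real tahoe tags every hash and
        # handles padding differently
        nodes = [hashlib.sha256(seg).digest() for seg in segments]
        if not nodes:
            return hashlib.sha256(b"").digest()
        while len(nodes) > 1:
            if len(nodes) % 2:
                nodes.append(nodes[-1])  # duplicate the odd node out
            nodes = [hashlib.sha256(nodes[i] + nodes[i + 1]).digest()
                     for i in range(0, len(nodes), 2)]
        return nodes[0]

)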
Incidentally, we removed 1 and 2 forever ago, to squash the
partial-information-guessing attack. We'd like to bring them back, safely
encrypted with the readcap, to detect integrity problems relating to having
the wrong key or having a buggy AES implementation.

> To address these issues, I propose splitting the UEB into two parts

Interesting. As you point out, I'm not sure I like the introduction of an
extra layer of caps (and an asymmetric key) into the immutable-file scheme.
It raises the question: who should hold onto these caps? Where should they
put them? I suppose the original uploader of the file is the special party
who then has the ability to re-encode it, but they'll have to store it
somewhere, and it feels wasteful to put an extra layer of caps in the
dirnodes (along with the writecap, readcap, and traversalcap) just to track
an object that so few people will actually be able to use.

Adding an asymmetric key might also introduce some new attack vectors. If I
give you a readcap and claim that it points to a certain contract, and you
sign that readcap to sign the contract, can I pull any tricks by also
holding on to this newly-introduced signing key? I guess if the readcap
covers UEB1, then I can't forge a document or cause you to sign something
else, but I can produce shares that will look completely valid during fetch
and decode but then fail the ciphertext check. That means I can make it
awfully hard to actually download the document (since without an effective
share hash, you can't know which were the bad shares, so you can't just try
using other ones).

(The structure for this would probably put H(UEB1|VerifyKey) in the
readcap, and then store a signed UEB2 in each share.)

I guess we should figure out the use case here. Re-encoding the file is
something that you'd want to do when the grid has changed in size, such
that it is now appropriate to use different parameters than before, right?
And if you're changing 'k', then you'll certainly need to replace all the
existing shares. So the goal appears to be to do all the work of uploading
a new copy of the file, but to allow the old caps to start referencing the
new version.

Deriving the filecap without performing FEC doesn't feel like a huge win to
me.. it's just a performance difference in testing for convergence, right?
And if you (or someone you trust) uploaded the file originally, you (or
they) could just retain a table mapping file hash to readcap (like tahoe's
backupdb), letting you do this file-to-filecap computation even faster.

I certainly see more value in being able to change the encoding parameters
after the fact. But I'm kinda hopeful that there might be a way to allow
re-encoding without such a big change (perhaps by allocating more space in
the share-hash-tree, to allow same-k-bigger-N changes).

I *am* intrigued by the idea of immutable files being just locked-down
variants of mutable files. A mutable-file readcap plus a hash of the
expected contents (i.e. H(UEB1)) would achieve this pretty well.. it might
not be too much longer than our current immutable readcaps, and we could
keep the encoding-parameter-sensitive parts (UEB2) in the signed (and
therefore mutable) portion, so they could be changed later.

cheers,
 -Brian
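
P.S.: to make the H(UEB1|VerifyKey) idea above concrete, the cap derivation
I had in mind would look something like this (a rough sketch: every name
here is invented, 'verify' stands for whatever signature primitive we
settle on, and the real thing would use tagged hashes):

    import hashlib

    def make_readcap(readkey, ueb1, verifykey):
        # bind the cap to both the immutable part (UEB1) and the key
        # that signs the re-encodable part (UEB2)
        fingerprint = hashlib.sha256(ueb1 + verifykey).digest()
        return (readkey, fingerprint)

    def check_share(readcap, ueb1, verifykey, ueb2, sig, verify):
        _readkey, fingerprint = readcap
        if hashlib.sha256(ueb1 + verifykey).digest() != fingerprint:
            return False  # wrong document, or a substituted verify key
        # UEB2 (encoding parameters, share hashes) may legitimately
        # change on re-encode, so it is covered only by the signature
        return verify(verifykey, ueb2, sig)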

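P.P.S.: and the locked-down-mutable idea might need nothing fancier than
this (again just a sketch with made-up names):

    import hashlib

    def make_immutable_cap(mutable_readcap, ueb1):
        # a mutable readcap plus the expected-contents hash: fetch via
        # the mutable machinery, then insist the retrieved UEB1 (and
        # hence the plaintext it commits to) is the one expected
        return (mutable_readcap, hashlib.sha256(ueb1).digest())

    def check_retrieved(cap, retrieved_ueb1):
        _mutable_readcap, expected = cap
        return hashlib.sha256(retrieved_ueb1).digest() == expected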