And in answer to your questions:

On Jul 12, 2009, at 18:45 PM, Shawn Willden wrote:
> What's the rationale for including the full 256-bit UEB hash in the
> CHK URI? Those URIs could be shortened considerably by truncating
> it to, say, 128 bits.

The rationale is that the integrity guarantee of an immutable file cap is "exactly one file matches this cap". Ensuring this requires 2K bits in the immutable cap to get K bits of security, because of a birthday-surprise attack: an attacker generates two (or more) files with the same immutable file cap, distributes the cap to other people, and can then undetectably substitute an alternate file for the original. Finding multiple matching files for a 2K-bit immutable file cap requires only about K bits of work (roughly 2^K hash computations).

> How difficult would it be to allow Tahoe to operate with either
> full UEB hashes or abbreviated hashes?

It is a neat idea. We've discussed it before, but I can't find the reference. I seem to recall that Brian had a good summary of the risk of publishing a shortened immutable cap. Perhaps he just pointed out that in the future people may come to doubt whether the file that they get by retrieving with that cap is really the only file that could have matched.

If your shortened cap is sufficiently long, let's say 192 bits, this risk doesn't sound like a big issue as far as brute computing power goes -- even if people in the future have vastly improved computation technology, 2^96 computations will probably still be very expensive, perhaps even prohibitively so. However, the possibility of people uncovering algorithmic weaknesses in the hash algorithm that we are using (currently SHA-256d, hopefully in the future SHA-3) could reduce the effective strength.

By the way, I'm sitting on a good idea that I haven't finished writing up yet for how to combine the encryption key and the integrity-checking hash together so that you have only one value (perhaps of size 256 bits) instead of two values -- one for the key and one for the hash.
Perhaps that would solve most of your performance issues? As I mentioned in my previous mail, I'd like to understand more about what the performance implications are in GridBackup.

> What is the bare minimum data needed to retrieve, reassemble and
> decrypt an immutable file? Just the AES read key?

That, and some way to find the shares, which we currently call the "storage index". Using only those would omit not only the integrity check on the ciphertext (which guarantees that the immutable cap you started with could match only one file) but also the integrity check on the shares (which identifies which servers are responsible for serving up corrupted shares, in the case that the resulting file is corrupted).

Regards,

Zooko
_______________________________________________
tahoe-dev mailing list
[email protected]
http://allmydata.org/cgi-bin/mailman/listinfo/tahoe-dev
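
To illustrate the birthday-surprise estimate above (about K bits of work to collide a 2K-bit value), here is a minimal sketch that searches for two inputs sharing the same truncated hash. It is only a toy: the helper names (`sha256d`, `truncated_tag`, `find_collision`) and the "toy file" inputs are made up for illustration, plain double SHA-256 stands in for the hash, and this is not how Tahoe actually derives the UEB hash or the cap. The tag sizes are kept tiny so that it finishes quickly.

```python
import hashlib
import itertools


def sha256d(data: bytes) -> bytes:
    """Double SHA-256: SHA-256 applied to the SHA-256 digest of data."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def truncated_tag(data: bytes, bits: int) -> int:
    """Top `bits` bits of sha256d(data), standing in for a shortened cap."""
    return int.from_bytes(sha256d(data), "big") >> (256 - bits)


def find_collision(bits: int):
    """Birthday search: hash distinct inputs until two share a truncated tag.

    Returns (first_input, second_input, number_of_hashes_computed).
    """
    seen = {}  # tag -> the input that produced it
    for attempts in itertools.count(1):
        candidate = b"toy file #%d" % attempts  # hypothetical file contents
        tag = truncated_tag(candidate, bits)
        if tag in seen:
            return seen[tag], candidate, attempts
        seen[tag] = candidate


if __name__ == "__main__":
    for bits in (16, 24, 32):
        a, b, attempts = find_collision(bits)
        print(f"{bits}-bit tag: collision after {attempts} hashes "
              f"(2^{bits // 2} = {2 ** (bits // 2)})")
```

The number of hashes needed should typically land within a small factor of 2^(bits/2), not 2^bits. The same scaling is why a 192-bit cap gives roughly 96 bits of collision resistance and a 128-bit one only about 64.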
