On Fri, Sep 4, 2009 at 3:44 PM, Hal Finney <[email protected]> wrote:
>> Are there any other types of digital signature scheme which can have public
>> keys shorter than 2*security level bits?
>
> Well I suggested the small DSA variant. David Hopwood showed a shortcoming
> but it sounded like it didn't apply to your usage.
Oh, of course. I'm sorry I didn't refer to your letter [1]. I guess the problem is that I don't understand it yet. And I don't understand why it isn't vulnerable to the generic attacks on discrete-log-in-groups that Tanja Lange mentioned [2]. I'll study your letter more.

> More generally, what would happen if you replaced an arbitrary public key
> with its hash, and used that for the URL; and then included the full public
> key (or at least information sufficient to regenerate it) in the signature?
> The sigs would get much bigger but the key could be much smaller. Does that
> tradeoff work for you? Are its security implications worth exploring?

Yes! In fact that is exactly what we do in the current version of Tahoe-LAFS. See figure 2 on page 3 of lafs.pdf [3].

Maybe we should continue to use this "hashes-in-caps" structure in the next version of our caps. It allows us to use whichever signature scheme we like best [footnote 1]. The drawbacks to this approach are:

 (a) it is complicated compared to the semi-private-key-based design,
 (b) the simple implementation of it (with a separate verification hash and symmetric key in each read-cap) makes for read-caps just as large as those of the semi-private-key-based design,
 (c) it requires access to the storage servers in order to perform a "diminish" operation to get a read-cap from a write-cap, and
 (d) if all of the storage servers die then not only do you lose whatever was in the file but you also permanently lose the ability to write new things into that file.

lafs.pdf [3] shows how the current mutable file cap design could be simplified by using semi-private keys instead (see figure 3 on page 5). The "complication gap" between the two structures grows larger when we add the desire for deep-verify caps (what Brian has called "traversal caps"). See Brian's letter [7], which is the best current summary of the two cryptographic structures used to build a 4-level hierarchy of caps (deep-write, deep-read, deep-verify, shallow-verify). (See also [8] for an argument against creating the fourth level of the hierarchy -- shallow-verify caps -- and instead letting every storage server have a deep-verify cap for each share that it stores.)

In my letter [9], to which Brian's [7] is a reply, I alluded to a "compression trick" by which we could perhaps have smaller read-caps in the "hashes-in-caps" structure. In [7], Brian said that even though I hadn't told him what "compression trick" I meant, he strongly suspected I couldn't do that and also enable servers to verify the correctness of the storage index. He was right. However, my "compression trick" might save enough bits that we could then spend some bits to add a separate storage index into the read-cap and still have a shorter read-cap overall. I've now finally posted the "compression trick" idea: [10].

Here is how I was thinking of using it in mutable files. Look at figure 2 on page 3 of lafs.pdf [3] again. See how the read-cap is the concatenation of the read-key and the verify-cap? What if it were instead a single value that protects the confidentiality of the read-key and the integrity of both of those values?
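For concreteness, here is a minimal sketch of the current concatenated read-cap next to the size of the compressed form. The function name and the choice of SHA-256 truncated to 128 bits are my own illustration (the figure just calls for a secure hash); this is not the real Tahoe-LAFS code.

    import hashlib

    def make_concat_read_cap(read_key, public_key):
        # figure-2-style read-cap: a 128-bit symmetric read-key concatenated
        # with a 128-bit secure hash of the public key (the verify-cap).
        # SHA-256 truncated to 128 bits is my stand-in for "secure hash".
        assert len(read_key) == 16
        verify_cap = hashlib.sha256(public_key).digest()[:16]
        return read_key + verify_cap        # 256 bits total

    # The compression trick would replace those 256 bits with a single
    # 128-bit value (confidentiality of the read-key, integrity of both the
    # read-key and the verify-cap) plus, as discussed below, a 20-bit prefix
    # of the verify-cap to serve as a storage index: 148 bits in all.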
In his letter [7], Brian wrote: "there's no way to get a single bit to perform both jobs (and still be able to derive a storage-index from any cap)". He's exactly right -- if we used this compression trick then what would we use as a storage index? (A storage index is the means by which a client can indicate to a server *which* of the shares that the server is holding is the one that the client wants.)

If the storage index is derivable using nothing but the read-key, then the storage server (which isn't allowed to see the read-key) will not be able to verify that it is storing a given share under the correct storage index. Imagine an attacker who asks a storage server to hold onto a share of file A but tells the storage server that it belongs under the storage index of file B (to which the attacker doesn't have a write-cap), in order to mess up the storage server's handling of the latter. On the other hand, if the storage index is *not* derivable using nothing but the read-key, then the read-cap will need to include other information in addition to the read-key, so that the holder of the read-key will be able to explain to a storage server which share he wants.

However, this compression trick could save enough bits in the read-cap that we could then add in some bits for a storage index and still have a shorter read-cap. Suppose that the read-cap were composed of a 128-bit crypto value giving confidentiality and integrity over the read-key and the verify-cap, plus 20 bits which are the first 20 bits of the verify-cap.

The Tahoe-LAFS download protocol currently in v1.5 consists of two phases: in the first phase the downloader asks the servers "What shares of the following storage index do you have?", and in the second phase it asks a subset of the servers "Please send me share such-and-such". With this read-cap of 128 bits of crypto plus 20 bits of storage index, the reader would first calculate an identifier of the read-cap (i.e. the secure hash of the read-cap) and then, in the first phase of download, ask the storage server whether it is holding any shares with the following 20-bit storage index and the following identifier-of-the-read-cap. (The server can check that the verify-cap is correct -- i.e. that the verify-cap is linked to the public key that was used to sign the share -- but it can't verify anything about the identifier-hash-of-the-read-cap, so whenever a writer writes a share, the server checks the digital signature on the verify-cap but just accepts the hash-of-the-read-cap blindly.) The server's reply in phase one would include all of the verify-caps that it holds which match that storage index and hash-of-read-cap. In the second phase, the reader would ask for specific shares by their full verify-caps, not just the 20-bit prefix.
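Here is a rough sketch in Python of that lookup. Everything in it -- the names, the in-memory "server", and the fits() check that stands in for whatever test the compression trick makes possible -- is my own illustration rather than the real Tahoe-LAFS protocol, and I'm assuming SHA-256 wherever the text above just says "secure hash".

    import hashlib

    def prefix_20(verify_cap):
        # First 20 bits of a verify-cap, as a small integer; this is the
        # storage index in the scheme sketched above.
        return int.from_bytes(verify_cap[:3], "big") >> 4

    def read_cap_identifier(combined, si_prefix):
        # The "identifier of the read-cap": a secure hash over the whole
        # read-cap (modeled here as the 128-bit combined value plus the
        # 20-bit prefix). The server stores this blindly; it cannot verify it.
        return hashlib.sha256(combined + si_prefix.to_bytes(3, "big")).digest()

    class StorageServerSketch:
        def __init__(self):
            # (20-bit prefix, hash-of-read-cap) -> verify-caps of held shares
            self._index = {}

        def put_share(self, verify_cap, read_cap_id):
            # On a write, the server checks the digital signature linking the
            # share to the verify-cap's public key (not shown here), but just
            # accepts the hash-of-the-read-cap blindly.
            key = (prefix_20(verify_cap), read_cap_id)
            self._index.setdefault(key, []).append(verify_cap)

        def phase_one(self, si_prefix, read_cap_id):
            # "Are you holding any shares with this 20-bit storage index and
            # this identifier-of-the-read-cap?" -> matching verify-caps.
            return self._index.get((si_prefix, read_cap_id), [])

    def choose_share(server, read_cap, fits):
        # Downloader side: read_cap is (combined_128_bits, si_prefix_20), and
        # fits(read_cap, verify_cap) stands in for the check that the
        # compression trick is supposed to make possible. The downloader tries
        # each candidate verify-cap in turn, then (phase two, not shown)
        # requests the share by its full verify-cap.
        combined, si_prefix = read_cap
        for vc in server.phase_one(si_prefix, read_cap_identifier(combined, si_prefix)):
            if fits(read_cap, vc):
                return vc
        return None

The point is only that the server indexes shares by the pair (20-bit prefix, hash-of-read-cap) without being able to check the second component, and that the downloader disambiguates any collisions by trying each returned verify-cap in turn.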
We can get away with using a small number of bits such as 20 because brute-forcing this value doesn't let an attacker violate anyone's confidentiality or integrity -- all it does is let him impose a very small added cost on the storage server and on the downloader. It costs an attacker approximately 2^20 computations to generate a verify-cap whose first 20 bits match the first 20 bits of someone else's verify-cap. Having done so, the attacker can upload his new share to the storage server. The only effect is that when the downloader asks the storage server if it has shares, it replies with *two* verify-caps rather than one, and the downloader tries each of them in turn to find the first one that fits his read-cap. If the attacker wants to add another colliding storage index, he'll have to perform another 2^20 computations (approximately) to find one.

This attacker would have spent about 2^20 times the cost of calculating a verify-cap (which costs about 8000 CPU cycles) in order to cause the downloader to try at most one extra verify-cap (which also costs about 8000 CPU cycles). He also forces the server and the downloader to transfer an extra verify-cap, which is about 16 bytes. (There is another attack someone could mount, which is to spend massive CPU to generate a whole bunch of collisions for their own file and then upload them, just to stress the server's handling of many colliding verify-cap prefixes, or to then ask someone to read their file in order to stress the reader's handling of many colliding verify-cap prefixes. You gotta be a pretty desperate DoS attacker to resort to that kind of junk, though.)

What's the point of the added hash-of-read-cap that the server can't verify? To minimize accidental collisions among legitimate users. A typical storage server in 2009 probably holds a million shares, but you could imagine that maybe in a few years someone will want to run a single storage server with many millions or even billions of shares. (Note: you can always run multiple storage servers using the same filesystem, so you don't *have* to put billions of shares into a single storage server, but for practical reasons you might want to.) If legitimate writers include the real hash-of-the-read-cap when writing shares, then there will never be accidental collisions among legitimate writers.

Here is what read-caps look like in the three schemes -- semi-private-key ECDSA, hashes-in-caps, and hashes-in-caps-with-compression-trick. In all cases we get a 128-bit security level for confidentiality and integrity.

For semi-private-key ECDSA, that means we need an elliptic curve of 256 bits, so the read-cap would be 257 bits (256 bits for the x-coordinate of the curve point and 1 bit for the sign). Example: lafs:R_0PpJXAJEAGu1hNLzf1DpdoIrViDHJ5EHsm7LRJ1genKF

For hashes-in-caps, we need 128 bits for the symmetric read-key and 128 bits for the secure hash of the public key. Example: lafs:R_Nnlb5qlTV2LhccUWEIy7qZrXiHZzBtHKJiasLtGOkxq

For hashes-in-caps-with-compression-trick, we could have 128 bits for the object which secures both the symmetric read-key and the public key, and another 20 bits for the prefix of the verify-cap. Example: lafs:R_3AeIEJs52QXFMGYqalmuxSYx0

I strongly value small caps. I want Tahoe-LAFS to be widely used. See ticket #217 for a litany of complaints from users about the current caps (which take 130 characters, like this: http://testgrid.allmydata.org:3567/uri/URI:DIR2-RO:j74uhg25nwdpjpacl6rkat2yhm:kav7ijeft5h7r7rxdp5bgtlt3viv32yabqajkrdykozia5544jqa ). As another example, see twitter and identi.ca and associated "tiny url" services. As another example, observe that allmydata.com, after funding the development of Tahoe-LAFS and its 130-character caps, then went ahead and funded the development of a tiny-url service on top of Tahoe-LAFS, which produced 68-character caps, like this: //www.allmydata.com/webdrive/tiny/2cxq4ucnwqjkloykf9b7bbr1xrvc . (From my recollection of the discussions about the development of that tiny-url service, the fact that the resulting tiny-urls are revocable was considered a plus but not a requirement. The fact that they were smaller and more acceptable to users was the hard requirement.)
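To put numbers on the comparison above: the example caps appear to use an alphanumeric (62-character) alphabet, which is an assumption on my part -- the real encoding may differ -- but on that assumption the lengths work out like this:

    import math

    PREFIX = "lafs:R_"                      # 7 characters

    def encoded_length(bits, alphabet=62):
        # characters needed to hold `bits` bits in a base-`alphabet` encoding
        return len(PREFIX) + math.ceil(bits / math.log2(alphabet))

    print(encoded_length(257))   # semi-private-key ECDSA        -> 51 characters
    print(encoded_length(256))   # hashes-in-caps                -> 50 characters
    print(encoded_length(148))   # hashes-in-caps + compression  -> 32 characters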
It is hard to measure something like "How much does end-user aversion go up as random-character URL length goes up?", but I'm sure that the effect is significant, and that 130 characters is way past the tolerability threshold. I also strongly suspect that a scheme that fits in 32 characters, like lafs:R_3AeIEJs52QXFMGYqalmuxSYx0, will enjoy more widespread use than a scheme that requires 50 characters, like lafs:R_Nnlb5qlTV2LhccUWEIy7qZrXiHZzBtHKJiasLtGOkxq.

However, I also strongly value simplicity of design and implementation. Too bad we can't have both!

Okay, I've laboriously typed up this letter over the course of about a week -- while riding the bus to work, while on lunch-break, and during the hour that elapses between the time my children fall asleep and the time I fall asleep. During this period, I've noticed that Brian Warner and David-Sarah Hopwood have been busily writing back and forth to each other, and I wouldn't be surprised if by now they've already invented this idea or something even better. :-) But I'm going to go ahead and send this note now, and I'll probably read Brian's and David-Sarah's posts on the bus to work. :-)

Regards,

Zooko

footnote 1: What signature scheme do I like best? Some variant of Rabin-Williams apparently has a good security proof [5], and Paul Crowley's argument in favor of a security proof appeals to me. AGL's implementation thereof, rwb0fuz, has excellent verification speed [6] -- almost 1/100 the cost of ecdsa-192 or hector! However, it costs 2.5 times as much as ecdsa-192, or 11 times as much as hector, to sign, and it costs 100 times as much as ecdsa-192, or 500 times as much as hector, to generate a new keypair (using the benchmarks on the Intel Atom chip in 64-bit mode). Overall, it looks like ecdsa is the fastest mature algorithm even ignoring its advantages in public key size, and hector is a very promising candidate for the *next* next version of Tahoe-LAFS caps. ;-)

Thanks to Samuel Neves for help with brute-force probabilities on IRC.

[1] http://allmydata.org/pipermail/tahoe-dev/2009-August/002709.html
[2] http://allmydata.org/pipermail/tahoe-dev/2009-August/002689.html
[3] http://allmydata.org/~zooko/lafs.pdf
[4] http://allmydata.org/trac/tahoe/browser/docs/specifications/mut.svg
[5] http://cr.yp.to/sigs/rwtight-20080201.pdf
[6] http://bench.cr.yp.to
[7] http://allmydata.org/pipermail/tahoe-dev/2009-July/002345.html
[8] http://allmydata.org/pipermail/tahoe-dev/2009-July/002302.html
[9] http://allmydata.org/pipermail/tahoe-dev/2009-July/002314.html
[10] http://allmydata.org/pipermail/tahoe-dev/2009-September/002796.html

tickets mentioned in this letter:
http://allmydata.org/trac/tahoe/ticket/217 # DSA-based mutable files -- small URLs, fast file creation
