Zooko Wilcox-O'Hearn wrote:
>
> On Monday,2009-08-10, at 11:56 , Jason Resch wrote:
>
> > You have stated how Cleversafe manages the key but not provided any
> > details regarding how Tahoe-LAFS manages the decryption key?
>
> I think this is potentially Tahoe-LAFS's best contribution to the
> state of the art, so I hope many of the readers of these lists will
> think carefully about the following.
>
> The design of Tahoe-LAFS is to separate key management (== access
> control) from data storage, and to make key management simple and
> flexible.
>
> First, we boil down the key management problem for a given file or
> directory to a single key, which is short (less than 100 bytes) so
> that it is easier to manage. This key suffices for both decryption
> and integrity-checking.
>

Zooko,
I hope not to come off as overly critical. I believe Tahoe-LAFS has developed many interesting features and ideas, and its approach to encryption is better than that of many, if not most, other systems I am familiar with. The problem I am about to point out is almost universal among cryptosystems, so forgive me if I sound like I am picking on Tahoe-LAFS; that is not my intention.

On the topic of key management, it seems that rather than addressing the problem, Tahoe-LAFS offloads it to the end user. In a sense, Tahoe-LAFS is like a super-compression algorithm: send a file of any size through Tahoe-LAFS and get back a much smaller string of data. As with a compression algorithm, the end user is still responsible for reliably and securely storing the result. The effect is that the security and reliability of the stored data can never exceed those of the system on which the user stores their identifiers. Tahoe-LAFS thereby sacrifices perhaps the greatest benefit of dispersed storage: its ultra-high reliability.

Being small, identifiers are easy to replicate and are therefore easy to store reliably. However, keeping many copies of these highly confidential identifiers in different locations or on different media makes it much more likely that one of them will be compromised, reducing the confidentiality of the data below that of keeping only a single copy. Users of Tahoe-LAFS are faced with a difficult choice:

1. Keep data highly confidential, by not making copies of the identifiers
2. Keep data highly reliable, by replicating the identifiers

Luckily there is a third option, which can achieve the best of both worlds:

3. Use a secret sharing scheme to attain reliable and confidential key storage

While secret sharing schemes are the ideal method for key storage, Tahoe-LAFS doesn't provide this feature and, given its current design, cannot support it. To have a secret sharing system, there must be some way to authenticate those who request shares.
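To make option 3 concrete, here is a minimal sketch of a k-of-n threshold scheme (Shamir's secret sharing) in Python. It is only an illustration of the general technique, not Cleversafe's or Tahoe-LAFS's implementation. The field prime, the function names split_secret and recover_secret, and the 32-byte example key are assumptions chosen for the example, and the authentication of share requests is deliberately left out.

```python
import secrets

# Assumed field: a Mersenne prime comfortably larger than a 32-byte key.
PRIME = 2**521 - 1


def split_secret(secret, k, n):
    """Split `secret` (bytes) into n shares; any k of them recover it."""
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    value = int.from_bytes(secret, "big")
    if value >= PRIME:
        raise ValueError("secret too large for the field")
    # Random polynomial of degree k-1 whose constant term is the secret.
    coeffs = [value] + [secrets.randbelow(PRIME) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation mod PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares, length):
    """Lagrange-interpolate the polynomial at x = 0 from k or more shares."""
    total = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        total = (total + yi * num * pow(den, -1, PRIME)) % PRIME
    return total.to_bytes(length, "big")


# Example: a hypothetical 32-byte key split 3-of-5.
key = secrets.token_bytes(32)
shares = split_secret(key, k=3, n=5)
restored = recover_secret(shares[:3], length=len(key))
```

With a 3-of-5 split, losing two shares does not lose the key, and an attacker holding only two shares learns nothing about it, which is the reliability and confidentiality trade-off that plain replication cannot offer.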
In the case of Tahoe-LAFS, as I understand it, the authentication (or access control) key is the very secret one would try to secure using the secret sharing scheme.

The only way to achieve the ultra-high reliability that information dispersal allows is to make sure keys are stored as reliably as the data. A corollary of this is that the authentication credentials used to access the keys must be revocable and replaceable when lost; otherwise there is an infinite regress over how to protect the credentials themselves from loss. Cleversafe has adopted this approach, using replaceable authentication credentials and a secret sharing scheme to store both the key and the data. I should note that nothing in our approach precludes someone from encrypting their data and taking responsibility for the management of that key, but doing so sets an upper bound on reliability equal to that of the key management system.

>
> Second, we make a separate, independent key for every single file or
> directory. This means that access control decisions such as "Should
> I share this file with my friend?" don't have to be linked to access
> control of other files or directories. (Although they *can* be
> bundled together if desired.)
>

There is flexibility in having a separate key for each file, but it also means one needs to take the time to back up the newly generated keys after each upload session. I suppose this step can be avoided if one restricts uploads to children of directories whose key is already backed up. Is that correct?

>
> Third, we *embed the key directly into the identifier of the file*.
> This part is important. You know how in a filesystem, whether local
> or distributed, files have a unique "file handle" or identifier? In
> a traditional Unix filesystem it is the inode number. Like a Unix
> directory, a Tahoe-LAFS directory consists of a map from the name of
> each child to the file handle to that child.
> The critical decision here is to embed the crypto key directly into
> that handle. The result is that when some human or some program
> wants to give another human or program access to a Tahoe-LAFS file or
> directory, it does so by giving the file handle. This single value
> serves for access control (you can't decrypt the file if you don't
> have it), identification (the unique identifier of the file is its
> file handle), and actual usage -- the file handle is sufficient to
> locate and acquire the file contents.
>

That is interesting. I think this article on Cryptree ( http://www.dcg.ethz.ch/publications/srds06.pdf ) would be of particular interest to you, if you haven't seen it before. It is used by another dispersed storage service, Wuala ( http://www.wuala.com/ ).

>
> The resulting short string which serves as identifier, access control
> token, and file handle is called a "capability" or a "cap" for
> short. There are several kinds of capability in Tahoe-LAFS. The one
> that I've described above is a "read-cap to an immutable file".
>

How is the immutability enforced? In particular, it isn't clear to me how a write-cap allows one to update a file. Is this something the servers check via a digital signature or HMAC on the updated data?

>
> Okay, my bus has arrived at work so I don't have time right now to
> describe the other ones, but please observe that this design so far
> already makes you start thinking about how you could build something
> cool on top of it. You can do so without having to think too much
> about how the ciphertext is stored (it is erasure-coded and spread
> across a distributed, fault-tolerant key-value storage grid), and
> without having to know too much about how other programs or other
> humans on the same system are managing their caps.
>

The freedom for the user to use anything is a double-edged sword.
Those who have the right resources, infrastructure and know-how may be able to store their keys in an extremely reliable and secure manner, but the average user with less expertise or equipment may be left with a less than ideal system for securing keys. The advantage of Tahoe-LAFS's approach is that its upper bound for key management can be as high as one is willing to make it, because it is not defined. The downside is that the lower bound can also be as low as one allows.

>
> We owe thanks to many others including the authors of Self-certifying
> filesystem, Freenet, Mojo Nation and especially the obj-cap ideas as
> expressed by Mark Miller.
>
> Regards,
>
> Zooko
> <http://allmydata.org/cgi-bin/mailman/listinfo/tahoe-dev>
>

Thanks for this follow-up post.

Regards,

Jason
_______________________________________________
tahoe-dev mailing list
[email protected]
http://allmydata.org/cgi-bin/mailman/listinfo/tahoe-dev
