On Jul 26, 2009, at 12:11 AM, james hughes wrote:
On Jul 24, 2009, at 9:33 PM, Zooko Wilcox-O'Hearn wrote:
[cross-posted to [email protected] and [email protected] ]
Disclosure: Cleversafe is to some degree a competitor of my Tahoe-LAFS
project.
...
I am tempted to ignore this idea that they are pushing about encryption
being overrated, because they are wrong and it is embarrassing.
The trick is cute, but I argue largely irrelevant. What follows is a response to
this web page that can probably be broadened to be a criticism of any system
that claims security and also claims that key management of some sort is not a
necessary evil....
> It seems to me there's a much simpler critique. The Cleversafe approach -
> which is not without its nice points - solves the "key management problem" in
> exactly the same way that some version of Windows solved the "frequent
> General Protection Fault crashes" problem (by eliminating the error message).
Eliminating the error message amounts to ignoring the problem and sweeping it
under the rug, which I don't think is an accurate representation of how
this technique handles key management. This technique provides a genuine
method for achieving a high degree of security without the need for a key
management system. James Hughes, who posted earlier in this thread, referenced
a paper which explores this topic in greater detail:
http://www.ssrc.ucsc.edu/Papers/storer-usenix07.pdf. Cleversafe's method has
the same confidentiality advantages as POTSHARDS yet we achieve much greater
storage efficiency than is possible using an information theoretically secure
secret sharing scheme, as POTSHARDS does. While one may object that
Cleversafe's technique requires multiple secure locations in which to store
slices, this requirement always existed for geo-dispersal. When Cleversafe
first began, the design did not include the AONT (all-or-nothing transform);
the benefits of dispersal alone (increased reliability, availability and
efficiency in storage) were a strong enough motivation to pursue this model for
data storage. From this frame of reference, you can see that adding the AONT as
an additional preprocessing step was a very minor modification which yielded
very significant results.
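
For readers unfamiliar with the idea, here is a rough sketch of an
all-or-nothing transform used as a preprocessing step. It is an illustration
only, not Cleversafe's actual implementation: the function names are
hypothetical, the keystream is a toy built from SHA-256 in counter mode rather
than a vetted cipher, and there is no integrity check. The data is encrypted
under a fresh random key, and that key is then masked with a hash of the
ciphertext, so the key can only be recovered by someone holding the entire
package.

```python
import hashlib
import secrets

KEY_LEN = 32  # bytes; chosen to match the SHA-256 digest size


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode (assumed stand-in for a real cipher)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def aont_package(data: bytes) -> bytes:
    """Encrypt data under a random key, then append the key masked by a hash of the ciphertext."""
    key = secrets.token_bytes(KEY_LEN)  # fresh random key, never stored on its own
    body = xor(data, keystream(key, len(data)))
    masked_key = xor(key, hashlib.sha256(body).digest())
    return body + masked_key


def aont_unpackage(package: bytes) -> bytes:
    """Recover the key from the whole package, then decrypt the body."""
    body, masked_key = package[:-KEY_LEN], package[-KEY_LEN:]
    key = xor(masked_key, hashlib.sha256(body).digest())
    return xor(body, keystream(key, len(body)))


message = b"slices are only useful once a threshold of them is gathered"
package = aont_package(message)
assert aont_unpackage(package) == message

# Simulate a holder who is missing the first 16 bytes of the package:
# the hash of the body changes, so the unmasked key is wrong.
damaged = bytes(16) + package[16:]
print(aont_unpackage(damaged) == message)
```

In a dispersed system the package would then be erasure-coded into slices, so
that fewer than a threshold of slices cannot reconstruct the whole package and
therefore cannot unmask the key. That step is omitted here.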
> The "key management problem" comes down to: I have encrypted data stored
> somewhere (where we assume attackers can access it, but not make use of it
> without the key). To make that data meaningful, I need to be able to locate
> the key appropriate to that data. What's a key? It's some private
> information. In Cleversafe's approach, I have data stored in pieces all over
> the place. To get at it, I need to know where the pieces of some data are.
> That information has to be secret, since anyone who has access to it can do
> the same computation and recover the data just as I can.
I think you may be missing one piece of the puzzle. The location of where the
data is stored is not secret. In fact anyone with a packet sniffer could see
the IP addresses of where the slices are being dispersed to. However, the
slices stored on slice servers are not available for anyone to download; an
authentication system ensures that only the proper party can access them, much
like any key management system must authenticate the user in some manner before
releasing the key. Which authentication system is used can vary from deployment
to deployment, but the advantage of authentication keys over data encryption
keys is that authentication keys can be lost and replaced without any impact on
the reliability of the data. Therefore one need not replicate authentication
keys to many locations to prevent their loss.
> Alternatively, I can rely not on the secrecy of that information, but on the
> discretion of those who hold the pieces. OK, but I could have done that with
> a simpler technique: Encrypt the data conventionally, then split the key
> among the trusted holders. That's a tiny, and more to the point, *fixed*
> overhead beyond the size of the data, which will always beat the cleverest
> Reed-Solomon or erasure coding. (It also has - if I use an appropriate mode -
> such nice features as random access to small parts of the data without the
> need to decrypt the whole thing first.)
As noted above, the main point of dispersal is the extremely high level of
reliability that can be achieved. The security features of the AONT when
combined with dispersal are only the icing on the cake. Consider that a
10-of-16 dispersed storage network can suffer 4 simultaneous failures and still
have the reliability of a RAID 6 system, since the 12 surviving slices can
tolerate 2 further losses. Security encompasses not just confidentiality but
also availability: if the failure of a single piece of hardware results in
irrecoverable data loss, then such data is not very secure. Regarding the
problem of random access, our software does have a segmentation layer for
exactly this purpose; it splits large files into smaller segments which can be
accessed individually.
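
The 10-of-16 arithmetic can be spelled out in a few lines. This is a sketch
with hypothetical helper names; it only counts tolerated slice losses and raw
storage expansion, and it assumes that any `threshold` of the `width` slices
suffice to rebuild the data.

```python
def failures_tolerated(threshold: int, width: int) -> int:
    """Slices that can be lost while `threshold` of `width` still remain."""
    return width - threshold


def remaining_tolerance(threshold: int, width: int, failed: int) -> int:
    """Further losses that can be absorbed after `failed` slices are already gone."""
    return failures_tolerated(threshold, width) - failed


def storage_expansion(threshold: int, width: int) -> float:
    """Raw bytes stored per byte of data for a threshold-of-width dispersal."""
    return width / threshold


RAID6_TOLERATED = 2  # RAID 6 survives the loss of any two drives

print(failures_tolerated(10, 16))      # total losses a 10-of-16 system survives
print(remaining_tolerance(10, 16, 4))  # margin left after 4 simultaneous failures
print(remaining_tolerance(10, 16, 4) == RAID6_TOLERATED)
print(storage_expansion(10, 16))       # storage overhead factor
```

By comparison, a perfect secret sharing scheme needs every share to be at least
as large as the secret, so 16 shares would cost at least 16 times the original
size.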
> Granted, Cleversafe has other nice features. But other than changing "the key
> management problem" to "the secret information needed to get at the data,
> which won't be used as a crypto key" problem, I don't see how they've
> actually *solved* anything.
If you consider any hypothetical conventional system for data encryption and
key management, it will almost surely suffer from one of the following common
problems, problems which are mitigated or eliminated by dispersal plus the AONT:
- Low reliability or availability of key storage system
- Ease of physical compromise
- Vulnerability to malicious or incompetent insiders
- Reliance on asymmetric cryptography
- Reliance on passwords which are either easy to forget or easy to
  brute-force
- Complex procedures for cycling and expiring keys
- Difficulty with accessing off-line key shares
> Further: If I'm only encrypting stuff for myself, there's little reason to
> use multiple keys. The key management problem becomes interesting when there
> is different encrypted data with different access rights for different groups
> of users. It's beyond me how Cleversafe's approach makes this easier - or
> harder.
It actually becomes entirely about access rights and authentication, as it
should be. The reason for data encryption is to serve as a last line of
defense against attacks that can circumvent authentication mechanisms. If it were
impossible to open up your laptop and take its hard drive out, full disk
encryption wouldn't be needed. Likewise if it were physically impossible to
tap a connection, TLS would be unnecessary. Using the AONT with dispersal
means that bypassing the authentication system (by physically stealing
a machine holding slices, or remotely compromising it) is entirely fruitless.
Only by getting access to a threshold number of devices (which we maintain is
harder than accessing a single location where one key is kept) can one get at
the data.
I hope this helps to clarify why we have adopted this approach. If you have
any further questions, don't hesitate to ask.
Jason
P.S.
Zooko cross-posted his original post to several threads. You may wish to check
out what has been said at the [email protected] mailing list on this
topic.
_______________________________________________
tahoe-dev mailing list
[email protected]
http://allmydata.org/cgi-bin/mailman/listinfo/tahoe-dev