So how do *you* manage your keys, then? Re: [tahoe-dev] cleversafe says: 3 Reasons Why Encryption is Overrated

2009-08-18 Thread Zooko Wilcox-O'Hearn

On Monday, 2009-08-10, at 11:56, Jason Resch wrote:

> You have stated how Cleversafe manages the key but not provided any
> details regarding how Tahoe-LAFS manages the decryption key?


I think this is potentially Tahoe-LAFS's best contribution to the  
state of the art, so I hope many of the readers of these lists will  
think carefully about the following.


The design of Tahoe-LAFS is to separate key management (== access  
control) from data storage, and to make key management simple and  
flexible.


First, we boil down the key management problem for a given file or  
directory to a single key, which is short (less than 100 bytes) so  
that it is easier to manage.  This key suffices for both decryption  
and integrity-checking.


Second, we make a separate, independent key for every single file or  
directory.  This means that access control decisions such as "Should  
I share this file with my friend?" don't have to be linked to access  
control of other files or directories.  (Although they *can* be  
bundled together if desired.)


Third, we *embed the key directly into the identifier of the file*.   
This part is important.  You know how in a filesystem, whether local  
or distributed, files have a unique "file handle" or identifier?  In  
a traditional Unix filesystem it is the inode number.  Like a Unix  
directory, a Tahoe-LAFS directory consists of a map from the name of  
each child to the file handle to that child.  The critical decision  
here is to embed the crypto key directly into that handle.  The  
result is that when some human or some program wants to give another  
human or program access to a Tahoe-LAFS file or directory, it does so  
by giving the file handle.  This single value serves for access  
control (you can't decrypt the file if you don't have it),  
identification (the unique identifier of the file is its file handle),  
and actual usage -- the file handle is sufficient to locate and  
acquire the file contents.


The resulting short string which serves as identifier, access control  
token, and file handle is called a "capability" or a "cap" for  
short.  There are several kinds of capability in Tahoe-LAFS.  The one  
that I've described above is a "read-cap to an immutable file".
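
As a rough illustration of the read-cap idea described above (and not the actual Tahoe-LAFS cap format), the following Python sketch packs a per-file key and a hash of the ciphertext into one short string. The names `make_read_cap` and `read_with_cap`, the `cap:` prefix, the dict-backed `store`, and the SHAKE-256 keystream standing in for a real cipher are all assumptions made for this example.

```python
import base64
import hashlib
import secrets


def keystream(key: bytes, length: int) -> bytes:
    # Toy keystream from SHAKE-256; a stand-in for a real cipher (assumption).
    return hashlib.shake_256(key).digest(length)


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def make_read_cap(plaintext: bytes, store: dict[str, bytes]) -> str:
    """Encrypt under a fresh per-file key, store the ciphertext, return the cap."""
    key = secrets.token_bytes(16)          # independent key for this file only
    ciphertext = xor(plaintext, keystream(key, len(plaintext)))
    digest = hashlib.sha256(ciphertext).digest()
    store[digest.hex()] = ciphertext       # the store never sees the key
    body = base64.b32encode(key + digest).decode().rstrip("=").lower()
    return "cap:" + body


def read_with_cap(cap: str, store: dict[str, bytes]) -> bytes:
    """Locate, integrity-check, and decrypt a file using nothing but its cap."""
    body = cap[len("cap:"):].upper()
    body += "=" * (-len(body) % 8)         # restore base32 padding
    raw = base64.b32decode(body)
    key, digest = raw[:16], raw[16:]
    ciphertext = store[digest.hex()]       # the cap tells us where to look
    if hashlib.sha256(ciphertext).digest() != digest:
        raise ValueError("integrity check failed")
    return xor(ciphertext, keystream(key, len(ciphertext)))
```

In this sketch, handing someone the cap string is the whole access-control decision: the string locates the ciphertext, verifies it, and decrypts it, and it stays under the 100-byte budget mentioned above.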


Okay, my bus has arrived at work so I don't have time right now to  
describe the other ones, but please observe that this design so far  
already makes you start thinking about how you could build something  
cool on top of it.  You can do so without having to think too much  
about how the ciphertext is stored (it is erasure-coded and spread  
across a distributed, fault-tolerant key-value storage grid), and  
without having to know too much about how other programs or other  
humans on the same system are managing their caps.


We owe thanks to many others including the authors of Self-certifying  
filesystem, Freenet, Mojo Nation and especially the obj-cap ideas as  
expressed by Mark Miller.


Regards,

Zooko

-
The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to majord...@metzdowd.com


RE: strong claims about encryption safety Re: [tahoe-dev] cleversafe says: 3 Reasons Why Encryption is Overrated

2009-08-13 Thread Jason Resch
Zooko Wilcox-O'Hearn wrote:
>
> [removing Cc: tahoe-dev as this subthread is not about Tahoe-LAFS.  
> Of course, the subscribers to tahoe-dev would probably be interested 
> in this subthread, but that just goes to show that they ought to 
> subscribe to cryptogra...@metzdowd.com.]
>
> On Monday, 2009-08-10, at 11:56, Jason Resch wrote:
>
> >> I don't think there is any basis to the claims that Cleversafe 
> >> makes that their erasure-coding ("Information Dispersal")-based 
> >> system is fundamentally safer, e.g. these claims from [3]: "a 
> >> malicious party cannot recreate data from a slice, or two, or 
> >> three, no matter what the advances in processing power." ... 
> >> "Maybe encryption alone is 'good enough' in some cases now  - but 
> >> Dispersal is 'good always' and represents the future."
> >
> > It is fundamentally safer in that even if the transformation key 
> > were brute forced, the attacker only gains data from the slice, 
> > which in general will have 1/threshold the data.
>

You failed to quote the other reason I offered:

It is not dependent on asymmetric cryptography, which depends on:
1. No one ever figuring out a fast way to factor products of large primes, an 
area in which there has been substantial progress.
2. No one ever building a quantum computer with more than twice as many qubits 
as your key length.

In other posts on the subject I have offered even more reasons, including:

1. It is not dependent on Password Based Encryption, which struggles to walk 
the tightrope of reliability vs. confidentiality: use a long password and you 
are likely to forget it; use a short one and it is easy to brute force.  Write 
down a long password and you've made it easier for someone to find.

2. In our system there are no master keys.  Therefore a compromise of the 
client computer which reads data only leads to loss of confidentiality for the 
data that was read during the compromise, not the loss of confidentiality for 
all data encrypted by a master key as is the case with most systems.

3. It offers an elegant solution for reliably and confidentially storing keys.  
Making copies of keys is a trade-off between confidentiality and reliability; 
secret sharing schemes such as this can achieve both simultaneously.

> Okay, so the Cleversafe method of erasure-coding ciphertext and 
> storing the slices on different servers is "safer" in exactly the 
> same way that encrypting and then giving an attacker only a part of 
> the ciphertext is safer.  That is: having less ciphertext might 
> hinder cryptanalysis a little, and also even if the attacker totally 
> wins and is able to decrypt the ciphertext, at least he'll only get 
> part of the plaintext that way.  On the other hand I might consider 
> it scant comfort if I were told that "the good news is that the 
> attacker was able to read only the first 1/3 of each of your 
> files".  :-)
See above, this is just one and perhaps the most trivial and "meaningless" of 
the security advantages.  The biggest advantage in my mind is the way it 
addresses the problem of key management, which in my opinion is the elephant in 
the room for most cryptosystems.  I look forward to your response on this 
subject, as practically speaking it is the most relevant issue for such 
systems, not which ciphers or key lengths are used.
>
>
> But the Cleversafe method of appending the masked key to the last 
> slice makes it less safe, because having the masked key might help a 
> cryptanalyst quite a lot.
Assuming the hash function is a random oracle, the way the key is masked is 
equivalent to One-Time-Pad encryption.  There would need to be an extremely 
serious flaw in the hash function (such as having output bits highly skewed 
towards 1 or 0, or for earlier input to have very little impact on the hash 
result) for it to compromise the security of the masked key.  Recall that the 
hash is calculated over random-seeming encrypted data, so even if the input were 
specially crafted to cause trouble for a hash function, the fact that 
it is encrypted before being hashed eliminates this type of chosen-plaintext attack.
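
The masking step discussed here can be sketched in a few lines of Python. This is only an illustration of the general shape (encrypt under a random key, hash the ciphertext, XOR the hash into the key, append the result), not Cleversafe's implementation. The function names and the SHAKE-256 keystream used in place of a real cipher are assumptions for the example.

```python
import hashlib
import secrets

KEY_LEN = 32  # matches the SHA-256 output length so key and mask XOR cleanly


def keystream(key: bytes, length: int) -> bytes:
    # Toy keystream from SHAKE-256; a stand-in for a real cipher (assumption).
    return hashlib.shake_256(key).digest(length)


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def aont_package(data: bytes) -> bytes:
    """Encrypt with a random key, then append the key masked by H(ciphertext)."""
    key = secrets.token_bytes(KEY_LEN)
    ciphertext = xor(data, keystream(key, len(data)))
    mask = hashlib.sha256(ciphertext).digest()
    return ciphertext + xor(key, mask)


def aont_unpackage(package: bytes) -> bytes:
    """Recompute the mask from the full ciphertext, unmask the key, decrypt."""
    ciphertext, masked_key = package[:-KEY_LEN], package[-KEY_LEN:]
    key = xor(masked_key, hashlib.sha256(ciphertext).digest())
    return xor(ciphertext, keystream(key, len(ciphertext)))
```

The mask depends on every byte of the ciphertext, so a party missing any part of it cannot unmask the key without attacking the hash or the cipher. That is the computational assumption under debate in this thread.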

You can point out cryptographic primitives used in AONT and say if they aren't 
secure then your system isn't secure, but one could do the same for any system. 
 When designing systems, one should work with the assumption that the hash 
algorithms and ciphers do what they are meant to, but always plan for forward 
support of new algorithms if/when a critical flaw is discovered.  We have done 
this, implementing support for different ciphers, key lengths and hash 
functions, and I believe this is about the best anyone can do when designing a 
system.

>
> In any case, the claims that are made on the Cleversafe web site are 
> wrong and misleading: "a malicious party cannot recreate data from a 
> slice, or two, or three, no matter what the advances in processing 
> power" [1].  It is easy for customers to believe this claim, because 
> an honest party who is 

strong claims about encryption safety Re: [tahoe-dev] cleversafe says: 3 Reasons Why Encryption is Overrated

2009-08-12 Thread Zooko Wilcox-O'Hearn
[removing Cc: tahoe-dev as this subthread is not about Tahoe-LAFS.   
Of course, the subscribers to tahoe-dev would probably be interested  
in this subthread, but that just goes to show that they ought to  
subscribe to cryptogra...@metzdowd.com.]


On Monday, 2009-08-10, at 11:56, Jason Resch wrote:

>> I don't think there is any basis to the claims that Cleversafe
>> makes that their erasure-coding ("Information Dispersal")-based
>> system is fundamentally safer, e.g. these claims from [3]: "a
>> malicious party cannot recreate data from a slice, or two, or
>> three, no matter what the advances in processing power." ...
>> "Maybe encryption alone is 'good enough' in some cases now - but
>> Dispersal is 'good always' and represents the future."
>
> It is fundamentally safer in that even if the transformation key
> were brute forced, the attacker only gains data from the slice,
> which in general will have 1/threshold the data.


Okay, so the Cleversafe method of erasure-coding ciphertext and  
storing the slices on different servers is "safer" in exactly the  
same way that encrypting and then giving an attacker only a part of  
the ciphertext is safer.  That is: having less ciphertext might  
hinder cryptanalysis a little, and also even if the attacker totally  
wins and is able to decrypt the ciphertext, at least he'll only get  
part of the plaintext that way.  On the other hand I might consider  
it scant comfort if I were told that "the good news is that the  
attacker was able to read only the first 1/3 of each of your  
files".  :-)


But the Cleversafe method of appending the masked key to the last  
slice makes it less safe, because having the masked key might help a  
cryptanalyst quite a lot.


In any case, the claims that are made on the Cleversafe web site are  
wrong and misleading: "a malicious party cannot recreate data from a  
slice, or two, or three, no matter what the advances in processing  
power" [1].  It is easy for customers to believe this claim, because  
an honest party who is following the normal protocol is limited in  
this way and because information-theoretically-secure secret-sharing  
schemes have this property.  I kind of suspect that the Cleversafe  
folks got confused at some point by the similarities between their  
AONT+erasure-coding scheme and a secret-sharing scheme.


In any case, the statement quoted above is not true, and not only  
that isolated statement, but also the entire thrust of the  
"encryption isn't safe but Cleversafe's algorithm is safer" argument  
[2].  Just to pick out another of the numerous examples of misleading  
and unjustified claims along these lines, here is another: "Given  
that the level of security provided by the AONT can be set  
arbitrarily high (there is no limit to the length of key it uses for  
the transformation), information theoretic security is not necessary  
as one can simply use a key so long that it could not be cracked  
before the stars burn out." [3].


On the other hand Cleversafe's arguments about key management being  
hard and about there being a trade-off between confidentiality and  
availability are spot on: [3].  Although I don't think that their  
strategy for addressing the key management issues is the best  
strategy, at least their description of the problem is correct.   
Also, if you ignore the ill-justified claims about security on that  
page, their explanation of the benefits of their approach is  
correct.  (Sorry if this comes off as smug -- I'm trying to be fair.)


(I'm not even going to address their third point [4] -- at least not  
until we take this conversation to the law mailing list! :-))


Okay, I think I've made my opinion about these issues fairly clear  
now, so I'll try to refrain from following-up to this subthread --  
the "strong claims about encryption safety" subthread -- unless there  
are some interesting new technical details that I haven't thought  
of.  By the way, when googling in the attempt to learn more  
information about the Cleversafe algorithm, I happened to see that  
Cleversafe is mentioned in this paper by Bellare and Rogaway: "Robust  
Computational Secret Sharing and a Unified Account of Classical  
Secret-Sharing Goals" [5].  I haven't read that paper yet, but given  
the authors I would assume it is an excellent starting point for a  
modern study of the cryptographic issues.  :-)


I still do intend to follow-up on the subthread which I call "So how  
do *you* do key management, then?", which I consider to be the most  
important issue for practical security of systems like these.


Regards,

Zooko, writing e-mail on his lunch break

[1] http://dev.cleversafe.org/weblog/?p=63
[2] http://dev.cleversafe.org/weblog/?p=95
[3] http://dev.cleversafe.org/weblog/?p=111
[4] http://dev.cleversafe.org/weblog/?p=178
[5] http://www.cs.ucdavis.edu/~rogaway/papers/rcss.html


Re: [tahoe-dev] cleversafe says: 3 Reasons Why Encryption is Overrated

2009-08-11 Thread Zooko Wilcox-O'Hearn

On Monday, 2009-08-10, at 13:47, Zooko Wilcox-O'Hearn wrote:

> This conversation has bifurcated,


Oh, and while I don't mind if people want to talk about this on the  
tahoe-dev list, it doesn't have that much to do with tahoe-lafs  
anymore, now that we're done comparing Tahoe-LAFS to Cleversafe and  
are just arguing about the cryptographic design of Cleversafe.  ;-)   
So, it seems quite topical for the cryptography list and only  
tangentially topical for the tahoe-dev list.  I've also been enjoying  
the subthread about the physical limits of computation that has  
spawned off on the cryptography mailing list.  Ooh, were you guys  
considering only classical computers and not quantum computers when  
you estimated that either 2^128, 2^200 or 2^400 was the physical  
limit of possible computation?  :-)


Regards,

Zooko



Re: [tahoe-dev] cleversafe says: 3 Reasons Why Encryption is Overrated

2009-08-11 Thread Zooko Wilcox-O'Hearn
This conversation has bifurcated, since I replied and removed  
tahoe-dev from the Cc: line, sending just to the cryptography list, and  
David-Sarah Hopwood has replied and removed cryptography, leaving  
just the tahoe-dev list.


Here is the root of the thread on the cryptography mailing list archive:

http://www.mail-archive.com/cryptography@metzdowd.com/msg10680.html

Here it is on the tahoe-dev mailing list archive.  Note that  
threading is screwed up in our mailing list archive.  :-(


http://allmydata.org/pipermail/tahoe-dev/2009-August/subject.html#start

Regards,

Zooko



RE: [tahoe-dev] cleversafe says: 3 Reasons Why Encryption is Overrated

2009-08-11 Thread Jason Resch
Zooko Wilcox-O'Hearn wrote:
>
> [cross-posted to tahoe-...@allmydata.org and cryptogra...@metzdowd.com]
>
> Folks:
>
> It doesn't look like I'm going to get time to write a long post about 
> this bundle of issues, comparing Cleversafe with Tahoe-LAFS (both use 
> erasure coding and encryption, and the encryption and key-management 
> part differs), and arguing against the ill-advised Fear, Uncertainty, 
> and Doubt that the Cleversafe folks have posted.  So, I'm going to 
> try to throw out a few short pieces which hopefully each make sense.
>
> First, the most important issue in all of this is the one that my 
> programming partner Brian Warner already thoroughly addressed in [1] 
> (see also the reply by Jason Resch [2]).  That is the issue of access 
> control, which is intertwined with the issues of key management.  The 
> other issues are cryptographic details which are important to get 
> right, but the access control and key management issues are the ones 
> that directly impact every user and that make or break the security 
> and usefulness of the system.
>
> Second, the Cleversafe documents seem to indicate that the security 
> of their system does not rely on encryption, but it does.  The data 
> in Cleversafe is encrypted with AES-256 before being erasure-coded 
> and each share stored on a different server (exactly the same as in 
> Tahoe-LAFS).  If AES-256 is crackable, then a storage server can 
> learn information about the file (exactly as in Tahoe-LAFS).  The 
> difference is that Cleversafe also stores the decryption key on the 
> storage servers, encoded in such a way that  any K of the storage 
> servers must cooperate to recover it.  In contrast, Tahoe-LAFS 
> manages the decryption key separately. 

You have stated how Cleversafe manages the key but not provided any details 
regarding how Tahoe-LAFS manages the decryption key?  In your documentation it 
was stated that many of your users choose to store the capability (containing 
the key) for their root file on your data storage servers.  I would think that 
this results in less security than Cleversafe's approach because our servers 
enforce authentication and access controls.

> This added step of including 
> a secret-shared copy of the decryption key on the storage servers 
> does not make the data less vulnerable to weaknesses in AES-256, as 
> their documents claim.  (If anything, it makes it more vulnerable, 
> but probably it has no effect and it is just as vulnerable to 
> weaknesses in AES-256 as Tahoe-LAFS is.)

I agree.  I should also note that the use of AES-256 or any cipher is a 
configuration parameter for our generalized transformation algorithm, which 
also can support stream ciphers.

>
>
> Third, I don't understand why Cleversafe documents claim that public 
> key cryptosystems whose security is based on "math" are more likely 
> to fall to future advances in cryptanalysis.  I think most 
> cryptographers have the opposite belief -- that encryption based on 
> bit-twiddling such as block ciphers or stream ciphers is much more 
> likely to fall to future cryptanalysis.  Certainly the history of 
> modern cryptography seems to fit with this -- of the original crop of 
> public key cryptosystems founded on a math problem, some are still 
> regarded as secure today (RSA, DH, McEliece), but there has been a 
> long succession of symmetric crypto primitives based on bit twiddling 
> which have then turned out to be insecure.  (Including, ominously 
> enough, AES-256, which was regarded as a gold standard until a few 
> months ago.)

Symmetric ciphers frequently break in small pieces at a time, reducing the 
number of bits of protection below what would be expected for the given key 
length.  If an asymmetric algorithm were to break (due to finding solutions to 
factoring or discrete logarithms) those algorithms would fail utterly, no 
length of a key could be considered secure.  This of course has not happened 
yet, but it remains a possibility unless it is someday proven that there is no 
efficient solution.  Even if math does not provide a path to breaking 
asymmetric ciphers, physics does by way of quantum computing.

Hundreds of symmetric ciphers have been devised and as weaknesses are found in 
currently used symmetric ciphers it is easy to migrate to other well-vetted 
algorithms.  Asymmetric ciphers are in short supply, and depend on discovering 
trapdoor functions in math, so a break in them would offer fewer exit 
strategies.

>
> Fourth, it seems like the same access control/key management model 
> that Cleversafe currently offers could be achieved by encrypting the 
> data with a random AES key and then using secret sharing to split the 
> key and store one share of the key with each server.  I *think* that 
> this would have the same cryptographic properties as the current 
> Cleversafe approach of using an All-Or-Nothing-Transform followed by 
> erasure coding.  Both would qualify as "computation secret shar

RE: [tahoe-dev] cleversafe says: 3 Reasons Why Encryption is Overrated

2009-08-11 Thread Jason Resch
james hughes wrote:
>
> On Aug 6, 2009, at 1:52 AM, Ben Laurie wrote:
>
> > Zooko Wilcox-O'Hearn wrote:
> >> I don't think there is any basis to the claims that Cleversafe makes
> >> that their erasure-coding ("Information Dispersal")-based system is
> >> fundamentally safer, e.g. these claims from [3]: "a malicious party
> >> cannot recreate data from a slice, or two, or three, no matter what
> >> the advances in processing power." ... "Maybe encryption alone is
> >> 'good enough' in some cases now - but Dispersal is 'good always' and
> >> represents the future."
> >
> > Surely this is fundamental to threshold secret sharing - until you
> > reach the threshold, you have not reduced the cost of an attack?
>
> Until you reach the threshold, you do not have the information to 
> attack. It becomes information-theoretically secure.

With a secret sharing scheme such as Shamir's you have information-theoretic 
security.  With the All-or-Nothing Transform and dispersal, the distinction is 
that there is only computational security.  The practical difference is that, 
though 2^-256 is very close to 0, it is not 0, so the possibility remains that 
with sufficient computational power useful data could be obtained from fewer 
than a threshold number of slices.  Doing so is as hard as breaking the 
symmetric cipher used in the transformation.
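
For comparison, here is a minimal Python sketch of Shamir's scheme, where fewer than a threshold number of shares reveal nothing about the secret regardless of computing power. The field prime `2**127 - 1` and the function names `split` and `recover` are choices made for this illustration only.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime; the secret must be smaller than this


def split(secret: int, threshold: int, shares: int) -> list[tuple[int, int]]:
    """Evaluate a random degree-(threshold-1) polynomial with f(0) = secret."""
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def f(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):  # Horner's rule, highest degree first
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, f(x)) for x in range(1, shares + 1)]


def recover(points: list[tuple[int, int]]) -> int:
    """Lagrange-interpolate the polynomial at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+)
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

Any `threshold` shares passed to `recover` yield the secret. With fewer shares, every candidate secret is consistent with what the attacker holds, which is what separates this from the computational guarantee of AONT plus dispersal. The price is that each share is as large as the secret itself.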

>
>
> They are correct, if you lose a "slice, or two, or three" that's fine, 
> but once you have the threshold number, then you have it all. This 
> means that you must still defend the site from attackers, protect your 
> media from loss, ensure your admins are trusted. As such, you have 
> accomplished nothing to make the management of the data easier.

Is there any data storage system which does not require some protection against 
attackers, resiliency to media failure, and trusted administrators?  Even in a 
system where one encrypts the data and focuses all energy on keeping the key 
safe, the encrypted copies must still be protected for availability and 
reliability reasons.

The security provided by this approach is only the icing on the cake to the 
other benefits of dispersal.  Dispersal provides extremely high fault tolerance 
and reliability without the large storage requirements of making copies.  See 
this paper "Erasure Coding vs. Replication: A Quantitative Comparison" by the 
creators of OceanStore for a primer on some of the advantages: 
http://www.cs.rice.edu/Conferences/IPTPS02/170.pdf
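
The storage argument can be made concrete with a small back-of-the-envelope calculation. The sketch below assumes each server fails independently with the same probability, which real deployments only approximate. The parameters (2 full copies versus 8-of-16 dispersal, 10% failure probability) are illustrative numbers chosen for this example and are not taken from the paper or from Cleversafe.

```python
from math import comb


def expansion(n: int, k: int) -> float:
    """Storage used per byte of data when any k of n shares suffice."""
    return n / k


def loss_probability(n: int, k: int, p: float) -> float:
    """Probability that more than n - k of n servers fail, so data is lost.

    Assumes independent failures with probability p per server.
    Plain replication with r copies is the special case n = r, k = 1.
    """
    return sum(
        comb(n, f) * p**f * (1 - p) ** (n - f)
        for f in range(n - k + 1, n + 1)
    )


if __name__ == "__main__":
    p = 0.1
    for label, n, k in [("2 copies", 2, 1), ("8-of-16 dispersal", 16, 8)]:
        print(label, expansion(n, k), loss_probability(n, k, p))
```

Both configurations store two bytes per byte of data, but under these assumptions the dispersed layout loses data only when nine or more of sixteen servers fail, which is far less likely than losing both copies.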

>
> Assume your threshold is 5. You lost 5 disks... Whose information was 
> lost? Anyone? Do you know?

If a particular "vault" (our term for a logical grouping of data on which 
access controls may be applied) had data stored on a threshold number of 
compromised drives, then data in that vault would be considered compromised.  
Our system tracks which vaults have data on which machines through a global 
set of configuration information we call the Registry.

> What if the 5 drives were lost over 5 
> years, what then?

When drives or machines are known to be lost or compromised one may perform a 
read and overwrite of the peer-slices.  This makes obsolete any slices 
attackers may have accumulated up until that point.  This is due to the fact 
that the AONT is a random transformation, and newly generated slices cannot be 
used with old ones to re-create data.  Therefore this protocol protects against 
slow accumulation of a threshold number of slices over time.
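
A toy model of this refresh step, under stated assumptions: the package transform below uses a SHAKE-256 keystream as a stand-in cipher, and `slices` simply cuts the package into contiguous pieces with no erasure-coding redundancy. All names are hypothetical and none of this is Cleversafe's code. It only shows why slices from two different packagings of the same data do not combine.

```python
import hashlib
import secrets


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def package(data: bytes) -> bytes:
    """Encrypt under a fresh random key and append the key masked by the hash."""
    key = secrets.token_bytes(32)
    ciphertext = xor(data, hashlib.shake_256(key).digest(len(data)))
    return ciphertext + xor(key, hashlib.sha256(ciphertext).digest())


def unpackage(pkg: bytes) -> bytes:
    ciphertext, masked_key = pkg[:-32], pkg[-32:]
    key = xor(masked_key, hashlib.sha256(ciphertext).digest())
    return xor(ciphertext, hashlib.shake_256(key).digest(len(ciphertext)))


def slices(pkg: bytes, k: int) -> list[bytes]:
    """Cut the package into k contiguous slices (no redundancy in this toy)."""
    size = -(-len(pkg) // k)  # ceiling division
    return [pkg[i * size:(i + 1) * size] for i in range(k)]


if __name__ == "__main__":
    data = bytes(range(96))
    old = slices(package(data), 4)   # slices an attacker may have collected
    new = slices(package(data), 4)   # slices after a read and overwrite
    mixed = [old[0]] + new[1:]       # one stale slice plus fresh ones
    print(unpackage(b"".join(new)) == data)
    print(unpackage(b"".join(mixed)) == data)
```

Because each packaging draws a new random key, a stale slice changes the hash of the reassembled ciphertext, the unmasked key comes out wrong, and the mixed set decrypts to garbage.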

> CleverSafe can not provide any security guarantees 
> unless these questions can be answered. Without answers, CleverSafe is 
> neither Clever nor Safe.
>
> Jim
>
>

Please let me know if you have any additional questions regarding our 
technology.

Best Regards,

Jason Resch
