Re: [cryptography] Master Password
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On May 30, 2012, at 4:28 AM, Maarten Billemont wrote:

> If I understand your point correctly, you're telling me that while scrypt
> might delay brute-force attacks on a user's master password, it's not
> terribly useful a defense against someone building a rainbow table.
> Furthermore, you're of the opinion that the delay that scrypt introduces
> isn't very valuable and I should just simplify the solution with a hash
> function that's better trusted and more reliable.
>
> Tests on my local machine (a MacBook Pro) indicate that scrypt can
> generate 10 hashes per second with its current configuration, while SHA-1
> can generate about 1570733. This doesn't quite seem like a trivial delay,
> assuming rainbow tables are off the... table. Though I certainly wish to
> see and understand your point of view.

My real advice, as in what I would do (and have done), is to run PBKDF2 with something like an HMAC-SHA1 or HMAC-SHA-256 and an absurd number of iterations: enough to take one to two seconds on your MBP, which would be longer on ARM. There is a good reason to pick SHA1 here over SHA256, and that is that the time differential will be more predictable.

Let me digress for a moment. It's a truism of security that complexity is the enemy of security and simple things are more secure. We just had such a discussion on this list in the last week. However, saying that is kinda like saying that you should love your mother because she loves you. Each of those is something it's impossible to disagree with -- how can you be against simplicity, or loving your mother? But neither of them is actionable. The truism about my mother doesn't say how much I should spend on her birthday present. It doesn't tell me if I should come home early from work so I can spend more time with her, or spend more time at work so I can be successful and she'll be proud of me. They are each true, but each meaningless, since I can use the principle to defend either A or ~A.
Having said that, the sort of simplicity that I strive for and that I like is something that I call understandability. It isn't code size, or anything like that; it's how well I can not only understand it, but intuit it. It is similar to the mathematical concept of elegance in that it is an aesthetic principle. Like loving my mother, I might do something today and its opposite tomorrow because it is aesthetic within its context.

I've read the scrypt paper maybe a half-dozen times. As a matter of fact, I just went and read it again while writing this note. Each time I read it, I nod as I follow along, and when I get to the end of the paper, I'm not sure I understand it any more. I remain unconvinced. I think it complex. I think it inelegant. It fails my understandability test. This is not rational, and I know that; this is why I said that this is something that gentlepersons can disagree on. I don't think that because *I* don't like it that *you* shouldn't like it. I also mean no disrespect to Colin Percival, who is jaw-droppingly brilliant. I read his paper and say "Wow" even as I remain unconvinced.

I also start to poke at it in some odd ways, mentally. I have a friend who builds custom supercomputers. These things have thousands of CPUs and tens of terabytes of memory. Would scrypt hold up to its claims in such an environment? I don't know, and my eyebrow is raised. Let us suppose that someone were to spend billions of dollars making a supercomputing site out in the desert somewhere. Would scrypt stand up to the analytic creativity that they show? I don't know. Moreover, I am irrationally skeptical; I believe that it would not, and I have no rational reason for it.

Lastly, I fixate on Table 1 of the scrypt paper, on page 14: estimated cost of hardware to crack a password in 1 year. In the middle row-section we see an estimate for a 10-character password. It would take (according to the paper) $160M to make a computer that would break 100ms of PBKDF2-HMAC-SHA256.
The comparison is against 64ms of scrypt and a cost estimate of $43B. In the next row-section down, it gives a comparison of 5s of PBKDF2 for $8.3B versus 3.8s of scrypt for $175T.

PBKDF2 is understandable. It's simple. In my head, I can reach into my mental box of cryptographic Lego and pull out a couple SHA blocks, snap them to an HMAC crossbar, and then wrap the thing in a PBKDF2 loop and see the whole thing in my head. It's understandable.

I *believe* Colin Percival's number that 100ms of iteration will cost $160M (assuming 2009 hardware costs, at standard temperature and pressure) to break, and I think: Wow. That's good enough. And if it isn't -- we can up it to 200ms, and handwave out to $300M hardware cost. I can also mentally adjust those against using GPUs and other accelerators, because it's understandable.

In contrast, I can't get a mental model of scrypt. It is mentally complex and because of that complexity I don't
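Jon's recommendation earlier in the thread -- PBKDF2-HMAC with an iteration count tuned so one derivation takes one to two seconds on the defender's own machine -- can be sketched in a few lines. This is an illustrative sketch only; the `calibrate_pbkdf2` helper, its starting count, and the 32-byte output length are assumptions, not anything specified in the thread.

```python
import hashlib
import os
import time

def calibrate_pbkdf2(target_seconds=1.0, password=b"test", start=10_000):
    """Find an iteration count so that PBKDF2-HMAC-SHA1 takes roughly
    target_seconds on this machine (Jon's 'one to two seconds' rule)."""
    salt = os.urandom(16)
    iterations = start
    while True:
        t0 = time.perf_counter()
        hashlib.pbkdf2_hmac("sha1", password, salt, iterations, dklen=32)
        elapsed = time.perf_counter() - t0
        if elapsed >= target_seconds:
            return iterations
        # PBKDF2 cost is linear in the iteration count, so scale up
        # proportionally (at least doubling each round to guarantee progress).
        iterations = int(iterations * max(2.0, target_seconds / elapsed))

def derive_key(password: bytes, salt: bytes, iterations: int) -> bytes:
    """Derive a 32-byte key; store (salt, iterations) alongside the hash."""
    return hashlib.pbkdf2_hmac("sha1", password, salt, iterations, dklen=32)
```

Note that the calibrated count must be stored with each derived key, since it will differ across hardware generations (and, as Jon notes, an MBP-tuned count will be slower on ARM).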
Re: [cryptography] Master Password
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On May 30, 2012, at 12:59 PM, Nico Williams wrote:

> Are you saying that PBKDFs are just so much cargo cult now?

No. PBKDF2 is what I suggest, actually. Cf. my entirely too long missive to Maarten that I just sent.

	Jon

-----BEGIN PGP SIGNATURE-----
Version: PGP Universal 3.2.0 (Build 1672)
Charset: us-ascii

wj8DBQFPxxfxsTedWZOD3gYRAiUHAJ4wLHhlM4220R3nOryUVitaC83ShACg5yjk
MjpQdcrhZywKmrWdPgjHoG0=
=wYQ9
-----END PGP SIGNATURE-----
___
cryptography mailing list
cryptography@randombit.net
http://lists.randombit.net/mailman/listinfo/cryptography
Re: [cryptography] Master Password
On Thu, May 31, 2012 at 2:03 AM, Jon Callas j...@callas.org wrote:

> On May 30, 2012, at 4:28 AM, Maarten Billemont wrote:
>
>> If I understand your point correctly, you're telling me that while scrypt
>> might delay brute-force attacks on a user's master password, it's not
>> terribly useful a defense against someone building a rainbow table.
>> Furthermore, you're of the opinion that the delay that scrypt introduces
>> isn't very valuable and I should just simplify the solution with a hash
>> function that's better trusted and more reliable.
>>
>> Tests on my local machine (a MacBook Pro) indicate that scrypt can
>> generate 10 hashes per second with its current configuration while SHA-1
>> can generate about 1570733. This doesn't quite seem like a trivial delay,
>> assuming rainbow tables are off the... table. Though I certainly wish to
>> see and understand your point of view.
>
> My real advice, as in what I would do (and have done) is to run PBKDF2
> with something like SHA1 or SHA-256 HMAC and an absurd number of
> iterations, enough to take one to two seconds on your MBP, which would be
> longer on ARM. There is a good reason to pick SHA1 here over SHA256 and
> that is that the time differential will be more predictable.

If you'll advise the use of compute-hard PBKDFs, why not also memory-hard PBKDFs?

Forget scrypt if you don't like it. But the underlying idea in scrypt is simple: run one PBKDF instance to generate a key for a PRF, then generate so many megabytes of pseudo-random data from that PRF, then use another PBKDF+PRF instance to generate indices into the output of the first, then finally apply a KDF (possibly a simple hash function) to the output of the second pass to generate the final output. The use of a PRF to index the output of another PRF is a simple and wonderful way to construct a memory-hard PBKDF.

Memory and compute hardness in a PBKDF is surely to be desired. It makes it harder to optimize hardware for fast computation of rainbow tables.
Nico
--
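The two-pass construction Nico describes -- one PRF filling a large table of pseudo-random blocks, a second PRF chain walking that table by data-dependent indices -- can be illustrated with a toy sketch. This is an illustration of the idea only, not scrypt itself; every name and parameter below is an arbitrary assumption.

```python
import hashlib
import hmac

def prf(key: bytes, counter: int) -> bytes:
    """A PRF instance: HMAC-SHA256 keyed with `key`, applied to a counter."""
    return hmac.new(key, counter.to_bytes(8, "big"), hashlib.sha256).digest()

def memory_hard_kdf(password: bytes, salt: bytes,
                    n_blocks: int = 2**14, n_probes: int = 2**14) -> bytes:
    """Toy memory-hard KDF in the shape Nico sketches (NOT scrypt)."""
    # Pass 1: one PBKDF instance keys a PRF, which expands into a large
    # table of pseudo-random blocks (the memory the attacker must hold).
    k1 = hashlib.pbkdf2_hmac("sha256", password, salt + b"\x01", 1000)
    table = [prf(k1, i) for i in range(n_blocks)]
    # Pass 2: a second PBKDF+PRF instance generates indices into the table.
    # Because each index depends on data read so far, the whole table must
    # stay resident; it cannot be streamed or recomputed cheaply.
    k2 = hashlib.pbkdf2_hmac("sha256", password, salt + b"\x02", 1000)
    acc = k2
    for i in range(n_probes):
        idx = int.from_bytes(prf(acc, i)[:4], "big") % n_blocks
        acc = hashlib.sha256(acc + table[idx]).digest()
    # Final KDF (here a simple hash, as Nico allows) over the second pass.
    return hashlib.sha256(acc).digest()
```

With the default parameters the table is 16K x 32 bytes = 512 KiB; real designs use far larger tables and make the fill pass itself sequential.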
Re: [cryptography] Master Password
Reminds me of Feb 2003 -- "Moderately Hard, Memory-bound Functions", NDSS 03, Martin Abadi, Mike Burrows, Mark Manasse, and Ted Wobber. (cached at) http://hashcash.org/papers/memory-bound-ndss.pdf

By Microsoft Research -- but then when Exchange and Outlook added a computational cost function, for hashcash anti-spam postage-stamp-like purposes, they actually used hashcash and not Microsoft Research's memory-bound puzzle. In fact they cooked up their own format, and made a slight tweak to SHA1 just to make existing SHA1 hardware inapplicable. (As part of their Coordinated Spam Reduction Initiative http://www.microsoft.com/en-us/download/details.aspx?id=17854 there is an open spec.)

Actually I worked for Microsoft for a year or so around that time, and had talked with the anti-spam team to give them a brain dump on anti-spam and hashcash, but I don't know why they rejected memory-bound puzzles. That particular one has a disadvantage that it requires a few megs of pre-baked numbers to be shipped with the library. Seems like Percival's one does not. My guess was there's just something comparatively elegant and simple about hashcash. (If I recall, they also used 8 sub-puzzles to reduce the variance of the compute cost.)

One quite generic argument I could suggest for being wary of scrypt would be: if someone said "hey, here's my new hash function, use it instead of SHA1, it's better" -- you would and should be very wary. A lot of public review goes into finding a good hash algorithm. (Yeah, I know SHA1 has a chink in its armor now, but you get the point.) It is entirely possible to have flaws crop up in the non-parallelizable aspect of the design, or non-reuse of RAM across parallel invocations.

I had a go at making non-parallelizable variants of hashcash also, and it's not so hard to do, but for most applications it doesn't really help practically, because the attacker would get lots of parallelism from trying different passwords or different users or what have you in parallel.
The main point of scrypt is to increase the hardware cost by needing lots of RAM to evaluate the function, which seems like a reasonable objective. Abadi et al's mbound puzzle objective was to make the bottleneck memory latency (plus a minimum required amount of RAM) instead of CPU. They made the argument that their mbound would be fairer because there is smaller variance in RAM latency between, e.g., smartphone RAM and desktop/server RAM, compared to the imbalance between desktop/server CPUs (Core i7 3930K? vs ARM9). The stats seem to add up. Though that was a few years ago; these days some of the Xeon or i7 chips have a pretty impressive on-die L3 cache!

Adam

On Thu, May 31, 2012 at 10:02:17AM -0500, Nico Williams wrote:

> On Thu, May 31, 2012 at 2:03 AM, Jon Callas j...@callas.org wrote:
>
>> On May 30, 2012, at 4:28 AM, Maarten Billemont wrote:
>> [...]
>>
>> My real advice, as in what I would do (and have done) is to run PBKDF2
>> with something like SHA1 or SHA-256 HMAC and an absurd number of
>> iterations, enough to take one to two seconds on your MBP, which would
>> be longer on ARM. There is a good reason to pick SHA1 here over SHA256
>> and that is that the time differential will be more predictable.
>
> If you'll advise the use of compute-hard PBKDFs why not also memory hard
> PBKDFs?
> Forget scrypt if you don't like it. But the underlying idea in scrypt is
> simple: run one PBKDF instance to generate a key for a PRF, then generate
> so many megabytes of pseudo-random data from that PRF, then use another
> PBKDF+PRF instance to generate indices into the output of the first, then
> finally apply a KDF (possibly a simple hash function) to the output of
> the second pass to generate the final output. The use of a PRF to index
> the output of another PRF is a simple and wonderful way to construct a
> memory-hard PBKDF.
>
> Memory and compute hardness in a PBKDF is surely to be desired. It makes
> it harder to optimize hardware for fast computation of rainbow tables.
Re: [cryptography] Master Password
On Thu, May 31, 2012 at 10:43 AM, Adam Back a...@cypherspace.org wrote:

> One quite generic argument I could suggest for being wary of scrypt would
> be: if someone said "hey, here's my new hash function, use it instead of
> SHA1, it's better" -- you would and should be very wary. A lot of public
> review goes into finding a good hash algorithm. (Yeah, I know SHA1 has a
> chink in its armor now, but you get the point.)

Yes, but note that one could address that with some assumptions, and with some techniques that one would reject when making a better hash -- the point is to be slow, so things that make a PBKDF slower are OK:

    PBKDF2' = PBKDF2(P' = to_password(memory_hard(P, S, p)) +
                          to_password(PBKDF2(P, S, p)),
                     S, p)

where P, S, and p are the password, salt and PBKDF parameters, to_password() generates a password from a key, and + concatenates strings.

No one would build an H' from H that way. But for a PBKDF it seems sensible (but see below).

Can PBKDF2' be weaker than PBKDF2? As long as PBKDF2 does not throw away any entropy, and as long as knowing one portion of the password (say, if the memory_hard function turns out to be weak) is not enough to guess the remainder from PBKDF2's output, then I think the answer has to be no.

Now, I'm making assumptions here. Clearly PBKDF2 can toss some entropy out, for example, so at least one of my two assumptions is incorrect -- but is it enough to wreck the security of my PBKDF2' construction?

Nico
--
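Nico's PBKDF2' composition can be transcribed almost directly into Python. A sketch, assuming `hashlib.scrypt` as a stand-in for the unspecified memory_hard function and base64 as one arbitrary choice for the unspecified to_password() encoding -- both are assumptions, not part of his construction:

```python
import base64
import hashlib

def to_password(key: bytes) -> bytes:
    """Turn derived key material back into 'password' bytes.
    (Assumption: Nico leaves to_password unspecified; base64 is one choice.)"""
    return base64.b64encode(key)

def pbkdf2(password: bytes, salt: bytes, iterations: int) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations)

def memory_hard(password: bytes, salt: bytes, iterations: int) -> bytes:
    """Stand-in memory-hard function; scrypt used here for illustration.
    (The iteration parameter is ignored by this stand-in; a real
    memory_hard would take its own cost parameters.)"""
    return hashlib.scrypt(password, salt=salt, n=2**10, r=8, p=1)

def pbkdf2_prime(P: bytes, S: bytes, p: int) -> bytes:
    # P' = to_password(memory_hard(P, S, p)) + to_password(PBKDF2(P, S, p))
    P_prime = to_password(memory_hard(P, S, p)) + to_password(pbkdf2(P, S, p))
    return pbkdf2(P_prime, S, p)
```

The key property of the composition is visible in the code: the outer PBKDF2 runs over a P' that already embeds a plain PBKDF2 output, so even a totally broken memory_hard leaves an attacker facing ordinary PBKDF2.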
Re: [cryptography] Master Password
On 05/31/2012 11:28 AM, Nico Williams wrote:

> Yes, but note that one could address that with some assumptions, and with
> some techniques that one would reject when making a better hash -- the
> point is to be slow,

More precisely, the point is to take a tunable amount of time, with strong assurance that an attacker will be unable to perform the computation with significantly less computational resources. The deliberate consumption of computational resources is a price that the defender has to pay in order to impose costs on the attacker. This ought to be an advantageous strategy for the defender as long as the attacker is expected to need to invoke the function many more times.

But the defender's and attacker's cost structures are usually very different. The defender (say, a website with a farm of PHP servers) doesn't get to choose when to begin the computation (legitimate users can log in at any time), and he pays a cost for noticeable latency and server resources. The attacker's costs are proportional to the number of guesses he needs to make to reverse the password -- hopefully dominated by wrong guesses. But the attacker is free to parallelize the computation across whatever specialized hardware he can assemble in the time that the credentials are valid (sometimes years). Some attackers could be using stolen resources (e.g. botnets for which they do not pay the power bill).

Starting with:

  Ep    = password entropy in bits (chosen by the user)
  N     = iteration count (chosen by the defender)
  Dd(N) = defender's per-iteration cost for N iterations. A nonlinear
          function determined primarily by factors out of anyone's control.
  Aa    = attacker's algorithmic advantage. This is never < 1.0, because the
          attacker is free to use the same algorithm as the defender.
  Ah    = attacker's hardware and parallelism advantage. This is never
          < 1.0, because the attacker is free to use the same commodity
          hardware as the defender.
  Aw    = attacker's server and electric power cost advantage.
Probably close to 1.0 -- or, in the case of a botnet, easily 1000 or even infinite.

The defender's advantage to having a work factor looks something like:

            N * 2**Ep
  Ad = ---------------------
       Dd(N) * Aa * Ah * Aw

Observations:

* The defender's goal is to maximize this function, and for a given PBKDF
  his only knob to turn is N.

* The attacker's goal is to reduce it.

* If the defender can't find an N with an Ad significantly > 1.0, he may be
  better off not imposing a work factor at all.

* Ep is the only exponential in the whole function. The individual user has
  all the leverage in this situation.

* If individual users can be shown to present a different Ep to the
  attacker, it could be beneficial to adjust N independently per user. For
  example, a website might say to a user: "Want to log in 2 seconds faster
  next time? Pick a longer password!"

* N appears in both the numerator and denominator. In the denominator it
  goes nonlinear and very fuzzy. Yuck!

* The defender would like to increase Ep, but nobody has a really great way
  to do that yet.

* Cryptographers mainly try to defend Aa and Ah.

* It seems like the defender's best strategy is to increase N until people
  begin to notice and complain.

* Clearly the attacker's best move is to beg, borrow, or steal a botnet.
  Only a noob would spend billions of dollars on the world's largest
  datacenter in the middle of the desert somewhere, when they could instead
  spend a fraction of that money developing a popular video game and have
  millions of users happily bearing the cost of power and hardware
  depreciation to provide them with GPU-hours.

> so things that make a PBKDF slower are OK:
>
>     PBKDF2' = PBKDF2(P' = to_password(memory_hard(P, S, p)) +
>                          to_password(PBKDF2(P, S, p)),
>                      S, p)
>
> where P, S, and p are the password, salt and PBKDF parameters,
> to_password() generates a password from a key, and + concatenates
> strings.
>
> No one would build an H' from H that way. But for a PBKDF it seems
> sensible (but see below).
>
> Can PBKDF2' be weaker than PBKDF2?

Yes.
It could turn out to result in an Aa or Ah significantly greater than 1.0. It could end up being usefully parallelizable, so that it can be evaluated more efficiently when testing many passwords at one time rather than one at a time. Even if it just reduced the performance of other things sharing the defender's hardware (say, webserver software), it could be a step backwards. It could also be that memory_hard(P, S, p) introduces impractical-to-mitigate timing or other side channel attacks that leaked P.

> As long as PBKDF2 does not throw away any entropy, and as long as knowing
> one portion of the password (say, if the memory_hard function turns out
> to be weak) is not enough to guess the remainder from PBKDF2's output,
> then I think the answer has to be no.
>
> Now, I'm making assumptions here. Clearly PBKDF2 can toss some entropy
> out,

Last I checked, PBKDF2 re-introduced all the starting entropy from P and S at every iteration. So it shouldn't lose any significant entropy.
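Marsh's advantage ratio lends itself to a toy numeric sketch. The function below collapses Dd(N) to a single number and plugs in purely illustrative parameter values -- none of these figures come from the thread:

```python
def defender_advantage(Ep, N, Dd, Aa=1.0, Ah=1.0, Aw=1.0):
    """Marsh's ratio  Ad = (N * 2**Ep) / (Dd(N) * Aa * Ah * Aw),
    with the nonlinear Dd(N) collapsed to one assumed number Dd."""
    return (N * 2 ** Ep) / (Dd * Aa * Ah * Aw)

# Ep is the only exponential: 28 extra bits of password entropy multiply
# the defender's advantage by 2**28, which no feasible increase in the
# (linear) iteration count N can match.
weak = defender_advantage(Ep=20, N=100_000, Dd=100_000, Ah=10.0)
strong = defender_advantage(Ep=48, N=100_000, Dd=100_000, Ah=10.0)
```

Playing with the Ah and Aw knobs makes Marsh's botnet observation concrete: a stolen-power attacker with Aw in the thousands wipes out anything N can buy.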
Re: [cryptography] Master Password
On Thu, May 31, 2012 at 2:03 PM, Marsh Ray ma...@extendedsubset.com wrote:

> On 05/31/2012 11:28 AM, Nico Williams wrote:
>> Yes, but note that one could address that with some assumptions, and
>> with some techniques that one would reject when making a better hash --
>> the point is to be slow,
> [...]
> Starting with:
>
>   Ep = password entropy in bits (chosen by the user)
>   N  = iteration count (chosen by the defender)

For memory-hard PBKDFs you also need a memory size parameter, though you might derive that from N.

> [...]
> The defender's advantage to having a work factor looks something like:
>
>             N * 2**Ep
>   Ad = ---------------------
>        Dd(N) * Aa * Ah * Aw

Nicely summarized.

> * If individual users can be shown to present a different Ep to the
>   attacker, it could be beneficial to adjust N independently per user.
>   For example, a website might say to a user: "Want to log in 2 seconds
>   faster next time? Pick a longer password!"

Nice. But again, in practice this won't work. You might think that for a single master password users could get into the 48 bits of entropy range, but probably not.

>> Can PBKDF2' be weaker than PBKDF2?
>
> Yes.

Mine, or any PBKDF2' in principle?

> It could turn out to result in an Aa or Ah significantly greater than
> 1.0. It could end up being usefully parallelizable so it can be evaluated
> more efficiently when testing many passwords at one time rather than one
> at a time. Even if it just reduced the performance of other things
> sharing the defender's hardware (say, webserver software), it could be a
> step backwards.

I don't see how my PBKDF2' construction ends up being more parallelizable than PBKDF2 such that PBKDF2' can run faster than PBKDF2 -- you need to compute PBKDF2 before you can compute PBKDF2', and I see no way around that. If memory_hard() is not nearly as memory-hard (and/or slow) as initially thought, then PBKDF2' will not help much, but it should still not hinder either. Unless...
...unless N is tuned down on the theory that memory_hard() adds strength -- if it turns out not to, then having turned N down will greatly help the attacker.

> It could also be that memory_hard(P, S, p) introduces
> impractical-to-mitigate timing or other side channel attacks that
> leaked P.

Sure. But let's assume that memory_hard() isn't awful. It just isn't yet accepted, and it might have some weaknesses. I'd hope that by now everyone pays attention to timing side channels (but power side channels? not so much). We can test for timing leaks. We can't test for as-yet undiscovered optimizations.

> Last I checked, PBKDF2 re-introduced all the starting entropy from P and
> S at every iteration. So it shouldn't lose any significant entropy.

Ah, yes, I wasn't sure, but now I see that it does. It's possible that the chosen PRF will toss a particular bit every time (e.g., the high bit in every byte of the password), but let's assume it doesn't.

> If memory_hard did not take P and S and instead took
> memory_hard(PBKDF2(usage 2 + P, S, N)) it might be easier to think about.
> We would have less to worry about if P were passed through a salted
> one-way function before it were handed to memory_hard.

Sure, but I think you're making my point: there are things we could do to make a PBKDF2' that is at least as strong as PBKDF2. As long as we don't respond to PBKDF2' by tuning down N, I think that's possible -- trivially possible. But the risk is probably quite high that N will get turned down ("it used to take 1s to log in, now it takes 2s! -- tune down N so it takes 1s with the new PBKDF2'").

Nico
--
[cryptography] future KDFs (and some history)
Hi,

I've just posted the slides from my PHDays talk online:

http://www.openwall.com/presentations/PHDays2012-Password-Security/

The title is: "Password security: past, present, future" (with strong bias towards password hashing).

A few of you have already seen the historical background slides from this presentation. Since that draft, I've added the following 9 slides (to the very end):

- Desirable properties of a future KDF
- KDFs unfriendly to hardware we do not have
- CPU + RAM friendliness
- GPU friendliness
- FPGA/ASIC friendliness
- Local parameter
- Unreadable local parameter
- KDFs in scripting languages (future phpass)
- Need to resist the temptation

I'd appreciate any comments.

Alexander

P.S. As it was suggested to me, I need to specifically mention PBKDF2 in the next revision of this presentation. I left that specific example out as I already had trouble fitting this broad topic into 50 minutes (I left many other things out as well, yet I got 52 slides), but perhaps a mention on a slide wouldn't hurt.