Re: [cryptography] Master Password

2012-06-28 Thread Maarten Billemont
On 30 May 2012, at 01:01, Maarten Billemont wrote:
 Dear readers,
 
 I've written an iOS / Mac application whose goal it is to produce passwords 
 for any purpose.  I was really hoping for the opportunity to receive some 
 critical feedback or review of the algorithm used[1].


I'd like to thank everyone involved for their tremendously useful feedback.

After considering the discussion both on this list and with external parties, 
an attempt to weigh the careful balance between security, usability, 
convenience and safety, I've completed a new release of Master Password which 
embodies the following changes since it was last evaluated:

- Still using scrypt[1] to derive a key from the user's master password, 
however, I've introduced a name concept for its salt.

The name has few constraints and is chosen by the user.  This choice assumes 
that the user understands that their name is a key part of access to their 
passwords. I believe that in modern times, where users sign up for websites 
everywhere using a username - password combination, that we can safely expect 
this of most users.

The name introduces a salt to the scrypt key derivation.  The salt is not 
secure-random, so its entropy is understandably fairly low.  I'm convinced, 
however, that this moderate entropy is sufficient to protect the user against a 
reasonable attempt at pre-calculating rainbow tables.

- I'm prefixing user data with (1) a fixed identifier 
com.lyndir.masterpassword and (2) the length of the user data.

The identifier (1) should help avoid cases where the algorithm might in the 
future be inadvertently re-implemented by another party, unexpectedly yielding 
the same results.
The length (2) should help protect the algorithm if any of the functions in the 
future are found vulnerable to certain length-based side-effects.

- I've normalized the algorithm by using HMAC-SHA256 instead of SHA1 hashing 
to determine the seed.

There's been some discussion about this, and in the end I believe the original 
implementation was without known flaws.

However, following certain recommendations I've decided the job is very 
befitting of a HMAC hash (given that we're hashing data using a key) and this 
should help avoid awkward questions from future peers.

The algorithm now calculates the seed using hmac-sha256(key, 
'com.lyndir.masterpassword' | length(site-name) | site-name | site-counter)

The reason I've opted for SHA256 over SHA1 here (against certain 
recommendations) is that the function yields me a larger hash.  This is 
important because:

- I've increased the entropy of the Long Password type (1) and added a 
Maximum Security Password type (2).

Assume that you've used Master Password to create a password for a certain 
website, the site stores a hash of your password but gets compromised and your 
hash becomes public.

The idea of Master Password is to generate passwords for sites that are strong 
enough to give you at least a good chance at safety in this event.  Naturally, 
it depends on the exact conditions under which the hosting site was compromised 
and whether or not they stored your site password in clear-text, as a result of 
a salted hash, or a simple hash function.

Master Password can only do its best in this case to protect you, so assuming 
the site host had taken mild precautions and only a SHA1 or MD5 hash of your 
password got leaked and became public, it would be nice if Master Password's 
generated passwords were strong enough to make finding your site password from 
a leaked hash infeasible.

Long Password (1) type passwords were before a 14-character sequence of 
numbers, upper-and-lowercase letters, and symbols.  Assuming the attacker knows 
nothing more about the layout of your password, that would take him 
1210537694726365245693116416 (86**14) attempts to try all combinations.

Assume furthermore that our attacker is in the posesssion of a machine that can 
calculate 2 billion MD5/SHA1's per second (eg. 10 8800GT's).

This blind attempt at brute-forcing a Long Master Password generated password 
would take them at most 19678164253 years.

However, because of the template that Master Password uses to make Long 
Passwords readable to the user (usability), the true entropy of such passwords 
was in fact much less than 86**14.  In fact, doing the math, I determined that 
it would take a mere 96486886125 permutations.

With the same hardware at his disposal, an attacker that knows your password 
was generated by Master Password and is a Long Password type, he now needs 
less than 5 days to discover the password for your leaked hash.

This has made me revise the template used to create the Long Password.  It 
was important to me that passwords generated by Master Password were easy for a 
user to transfer manually using a keyboard.  After all, the standard use case 
is generating a password on your iPhone (with a moderately secure keypad - at 
least in comparison to that of your desktop) and typing it 

Re: [cryptography] Master Password

2012-06-07 Thread Steven Bellovin

On May 31, 2012, at 3:03 PM, Marsh Ray wrote:

 On 05/31/2012 11:28 AM, Nico Williams wrote:
 
 Yes, but note that one could address that with some assumptions, and
 with some techniques that one would reject when making a better hash
 -- the point is to be slow,
 
 More precisely, the point is to take a tunable amount of time with strong 
 assurance that an attacker will be unable to perform the computation with 
 significantly less computational resources.
 
 The deliberate consumption of computational resources is a price that the 
 defender has to pay in order to impose costs on the attacker. This ought to 
 be an advantageous strategy for the defender as long as the attacker is 
 expected to need to invoke the function many times more.
 
 But the defender's and attacker's cost structure is usually very different. 
 The defender (say a website with a farm of PHP servers) doesn't get to choose 
 when to begin the computation (legitimate users can log in at any time) and 
 he pays a cost for noticeable latency and server resources.
 
 The attacker costs are proportional to the number of guesses he needs to make 
 to reverse the password. Hopefully this is dominated by wrong guesses. But 
 the attacker is free to parallelize the computation across whatever 
 specialized hardware he can assemble in the time that the credentials are 
 valid (sometimes years). Some attackers could be using stolen resources (e.g. 
 botnets for which they do not pay the power bill).


There's another, completely different issue: does the attacker want a 
particular password, or will any passwords from a large set suffice?  

Given the availability of cheap cloud computing, botnets, GPUs, and botnets 
with GPUs, Aa * Ah * Ap can be very, very high, i.e., the attacker has a strong 
advantage when attacking a particular password.  Some say that it's so high 
that increasing Ad is essentially meaningless.  On the other hand, if there are 
many passwords in the set being attacked, a large Ad translates into a 
reduction in the fraction that can be attack in any given time frame.

--Steve Bellovin, https://www.cs.columbia.edu/~smb





___
cryptography mailing list
cryptography@randombit.net
http://lists.randombit.net/mailman/listinfo/cryptography


Re: [cryptography] Master Password

2012-06-07 Thread Nico Williams
On Thu, Jun 7, 2012 at 4:14 PM, Steven Bellovin s...@cs.columbia.edu wrote:
 There's another, completely different issue: does the attacker want a 
 particular password, or will any passwords from a large set suffice?

 Given the availability of cheap cloud computing, botnets, GPUs, and botnets 
 with GPUs, Aa * Ah * Ap can be very, very high, i.e., the attacker has a 
 strong advantage when attacking a particular password.  Some say that it's so 
 high that increasing Ad is essentially meaningless.  On the other hand, if 
 there are many passwords in the set being attacked, a large Ad translates 
 into a reduction in the fraction that can be attack in any given time frame.

If the attacker can't easily identify the user IDs...  If usernames
are put through a PBKDF as well to generate the lookup key with which
to find the password verifier, how much does the defender gain?  For
any one password, not much, because there's less entropy in usernames
than passwords, so the Ad barely improves -- but if the attacker can't
identify that one password then the slight increase in Ad helps slow
the attacker's progress through all of the verifiers they have.
Moreover, the verifier DB could be peppered with chaff with which to
further slow down the attacker.  Does this make sense?

Nico
--
___
cryptography mailing list
cryptography@randombit.net
http://lists.randombit.net/mailman/listinfo/cryptography


Re: [cryptography] Master Password

2012-06-01 Thread Marsh Ray

On 05/31/2012 04:08 PM, Nico Williams wrote:

On Thu, May 31, 2012 at 2:03 PM, Marsh Rayma...@extendedsubset.com  wrote:

On 05/31/2012 11:28 AM, Nico Williams wrote:

Yes, but note that one could address that with some assumptions, and
with some techniques that one would reject when making a better hash
-- the point is to be slow,



[...]
Starting with:

Ep = password entropy in bits (chosen by the user)

N =  iteration count (chosen by the defender)


For memory-hard PBKDFs you also need a memory size parameter, though
you might derive that from N.


That could be useful.

We could also model the memory access latency and bandwidth of the 
defender's servers and the number of CPU cores trying to do other useful 
stuff over the same main memory bus and so on. I figured I'd quit while 
I was ahead. :-) The defender can think about his own systems in as much 
detail as he wishes, but biggest unknowns in the equations are still the 
capabilities of his future attackers.


Another big thing I left out is the cost of time to the attacker. Does 
he have a deadline? What is the value over time of a user's password. We 
used to think of password change policies as putting a hard expiration 
limit on the value of this data, but now we know that users typically 
share passwords across systems and just put a counter digit at the end 
to share them across time. Also, a password often represents the only 
entropy securing certain Wifi and VPN protocols so the attacker may have 
obtained a packet capture that will remain useful to decrypt long into 
the future.



* If individual users can be shown to present a different Ep to the
attacker, it could be beneficial to adjust N independently per user. For
example a website might say to a user: Want to log in 2 seconds faster next
time? Pick a longer password!


Nice.  But again, in practice this won't work.  You might think that
for a single master password users could get into the 48 bits of
entropy range, but probably not.


Well what about this (just thinking out loud):

Current password change policies can only occasionally recover the 
former level of security after a silent breach, but they make it harder 
for users to remember entropic passwords all the time.


What if instead of asking users to throw out their old password and 
choose a brand new completely independent strong one every 90 days we 
instead asked users to append 12 more characters to their old password? 
This way they could get better at remembering longer sequences 
incrementally with more practice over time.


Of course once it becomes too long (seems like a good problem to have) 
we could start dropping the same number of characters from the front. 
Could we in just three password change periods have users on rolling 36 
character passwords?


OK so maybe we don't get the same catastrophic fully reseeding property 
of a traditional password replacement policy, but if the string the user 
appends is subjected to the same length and complexity requirements in 
place now then it seems like a rolling long password could be no worse.



Can PBKDF2' be weaker than PBKDF2?


Yes.


Mine or any PBKDF2' in principle?


In principle.


I don't see how my PBKDF2' construction ends up being more
parallelizable than PBKDF2 such that PBKDF2' can run faster than
PBKDF2 -- you need to compute PBKDF2 before you can compute PBKDF2'
and I see no way around that.  If memory_hard() is not nearly as
memory-hard (and/or slow) as initially thought then PBKDF2' will not
help much, but it should still not hinder either.


Say memory_hard() was designed with DES as the fundamental primitive 
(for slowness of course) back when servers could only spare a 128 MiB of 
memory at a time.


Today's attacker would be able to parallelize the DES with a bitslice 
implementation on a video card with 4 GB RAM and GPU that can hide 
random memory latencies with single-clock context switching between 64 
threads.


The defender cannot benefit from that parallelism, he has to hash the 
passwords one at a time as users authenticate.


So it doesn't matter how high you turn up parameter N, or even to a 
degree M; it's not (just) a question of scaling parameters to track 
Moore's law. It's that a hash function that looked as-hard-as-practical 
a few years ago turns out to be more a bit more efficient in parallel on 
the attacker's hardware than in serial on the defenders'.


Another example:

On 05/31/2012 10:43 AM, Adam Back wrote:

That
particular one has a disadvantage that it requires a few megs of pre-baked
numbers to be shipped with the library.


A few MB of static constants sounds to me like a resource that may be 
shareable in the attacker's parallel synchronous implementation, but 
maybe even demand-paged in the defender's system.



Unless... unless N
is tuned down on the theory that memory_hard() adds strength -- if it
turns out not to then having turned N down will greatly help the
attacker.




It could also be that 

Re: [cryptography] Master Password

2012-05-31 Thread Jon Callas
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On May 30, 2012, at 4:28 AM, Maarten Billemont wrote:

 If I understand your point correctly, you're telling me that while scrypt 
 might delay brute-force attacks on a user's master password, it's not 
 terribly useful a defense against someone building a rainbow table.  
 Furthermore, you're of the opinion that the delay that scrypt introduces 
 isn't very valuable and I should just simplify the solution with a hash 
 function that's better trusted and more reliable.
 
 Tests on my local machine (a MacBook Pro) indicate that scrypt can generate 
 10 hashes per second with its current configuration while SHA-1 can generate 
 about 1570733.  This doesn't quite seem like a trivial delay, assuming 
 rainbow tables are off the... table.  Though I certainly wish to see and 
 understand your point of view.

My real advice, as in what I would do (and have done) is to run PBKDF2 with 
something like SHA1 or SHA-256 HMAC and an absurd number of iterations, enough 
to take one to two seconds on your MBP, which would be longer on ARM. There is 
a good reason to pick SHA1 here over SHA256 and that is that the time 
differential will be more predictable.

Let me digress for a moment. It's a truism of security that complexity is the 
enemy of security and simple things are more secure. We just had such a 
discussion on this list in the last week. However, saying that is kinda like 
saying that you should love your mother because she loves you. Each of those is 
something it's impossible to disagree with -- how can you be against simplicity 
or loving your mother? But neither of them is actionable. The truism about my 
mother doesn't say how much I should spend on her birthday present. It doesn't 
tell me if I should come home early from work so I can spend more time with her 
or spend more time at work so I can be successful and she'll be proud of me. 
They are each true, but each meaningless, since I can use the principle to 
defend either A or ~A.

Having said that, the sort of simplicity that I strive for and that I like is 
something that I call understandability. It isn't code size, or anything like 
that, it's how well I can not only understand it, but intuit it. It is similar 
to the mathematical concept of elegance in that it is an aesthetic principle. 
Like loving my mother, I might do something today and its opposite tomorrow 
because it is aesthetic within its context.

I've read the scrypt paper maybe a half-dozen times. As a matter of fact, I 
just went and read it again while writing this note. Each time I read it, I nod 
as I follow along and when I get to the end of the paper, I'm not sure I 
understand it any more. I remain unconvinced. I think it complex. I think it 
inelegant. It fails my understandability test. This is not rational, and I know 
that; this is why I said that this is something that gentlepersons can disagree 
on. I don't think that because *I* don't like it that *you* shouldn't like it. 
I also mean no disrespect to Colin Percival, who is jaw-droppingly brilliant. I 
read his paper and say Wow even as I remain unconvinced.

I also start to poke at it in some odd ways, mentally. I have a friend who 
builds custom supercomputers. These things have thousands of CPUs and tens of 
terabytes of memory. Would scrypt hold up to its claims in such an environment? 
I don't know, and my eyebrow is raised. 

Let us suppose that someone were to spend billions of dollars making a 
supercomputing site out in the desert somewhere. Would scrypt stand up to the 
analytic creativity that they show? I don't know. Moreover, I am irrationally 
skeptical; I believe that it would not, and I have no rational reason for it.

Lastly, I fixate on Table 1 of the scrypt paper, on page 14. Estimated cost of 
hardware to crack a password in 1 year. In the middle row-section we see an 
estimate for a 10 character password. It would take (according to the paper) 
$160M to make a computer that would break 100ms of PBKDF2-HMAC-SHA256. The 
comparison is against 64ms of scrypt and a cost estimate of $43B. In the next 
row-section down, it gives a comparison of 5s of PBKDF2 for $8.3B versus 3.8s 
of scrypt for $175T.

PBKDF2 is understandable. It's simple. In my head, I can reach into my mental 
box of cryptographic Lego and pull out a couple SHA blocks, snap them to an 
HMAC crossbar, and then wrap the thing in a PBKDF2 loop and see the whole thing 
in my head. It's understandable. I *believe* Colin Percival's number that 100ms 
of iteration will cost $160M (assuming 2009 hardware costs, at standard 
temperature and pressure) to break, and I think Wow. That's good enough. And 
if it isn't -- we can up it to 200ms, and handwave out to $300M hardware cost. 
I can also mentally adjust those against using GPUs and other accelerators 
because it's understandable.

In contrast, I can't get a mental model of scrypt. It is mentally complex and 
because of that complexity I don't 

Re: [cryptography] Master Password

2012-05-31 Thread Jon Callas
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On May 30, 2012, at 12:59 PM, Nico Williams wrote:

 
 Are you saying that PBKDFs are just so much cargo cult now?

No. PBKDF2 is what I suggest, actually. C.F. my entirely too long missive to 
Maarten that I just sent.

Jon


-BEGIN PGP SIGNATURE-
Version: PGP Universal 3.2.0 (Build 1672)
Charset: us-ascii

wj8DBQFPxxfxsTedWZOD3gYRAiUHAJ4wLHhlM4220R3nOryUVitaC83ShACg5yjk
MjpQdcrhZywKmrWdPgjHoG0=
=wYQ9
-END PGP SIGNATURE-
___
cryptography mailing list
cryptography@randombit.net
http://lists.randombit.net/mailman/listinfo/cryptography


Re: [cryptography] Master Password

2012-05-31 Thread Nico Williams
On Thu, May 31, 2012 at 2:03 AM, Jon Callas j...@callas.org wrote:
 On May 30, 2012, at 4:28 AM, Maarten Billemont wrote:

 If I understand your point correctly, you're telling me that while scrypt 
 might delay brute-force attacks on a user's master password, it's not 
 terribly useful a defense against someone building a rainbow table.  
 Furthermore, you're of the opinion that the delay that scrypt introduces 
 isn't very valuable and I should just simplify the solution with a hash 
 function that's better trusted and more reliable.

 Tests on my local machine (a MacBook Pro) indicate that scrypt can generate 
 10 hashes per second with its current configuration while SHA-1 can generate 
 about 1570733.  This doesn't quite seem like a trivial delay, assuming 
 rainbow tables are off the... table.  Though I certainly wish to see and 
 understand your point of view.

 My real advice, as in what I would do (and have done) is to run PBKDF2 with 
 something like SHA1 or SHA-256 HMAC and an absurd number of iterations, 
 enough to take one to two seconds on your MBP, which would be longer on ARM. 
 There is a good reason to pick SHA1 here over SHA256 and that is that the 
 time differential will be more predictable.

If you'll advise the use of compute-hard PBKDFs why not also memory
hard PBKDFs?  Forget scrypt if you don't like it.  But the underlying
idea in scrypt is simple: run one PBKDF instance to generate a key for
a PRF, then generate so many megabytes of pseudo-random data from that
PRF, then use another PBKDF+PRF instance to generate indices into the
output of the first, then finally apply a KDF (possibly a simple hash
function) to the output of the second pass to generate the final
output.  The use of a PRF to index the output of another PRF is
simple, and a simple and wonderful way to construct a memory-hard
PBKDF.

Meory and comput hardness in a PBKDF is surely to be desired.  It
makes it harder to optimize hardware for fast computation of rainbow
tables.

Nico
--
___
cryptography mailing list
cryptography@randombit.net
http://lists.randombit.net/mailman/listinfo/cryptography


Re: [cryptography] Master Password

2012-05-31 Thread Adam Back

Reminds me of Feb 2003 - Moderately Hard, Memory-bound Functions NDSS 03,
Martin Abadi, Mike Burrows, Mark Manasse, and Ted Wobber. 
(cached at) http://hashcash.org/papers/memory-bound-ndss.pdf


By microsoft research, but then when exchange and oulook added a
computational cost function, for hashcash anti-spam postage stamp like
purposes they actually used hashcash and not microsoft research's memory
bound puzzle.  In fact they cooked up their own format, and made a slight
tweak to SHA1 just to make existing SHA1 hardware inapplicable.
(As part of their Coordinated Spam Reduction Initiative
http://www.microsoft.com/en-us/download/details.aspx?id=17854
there is an open spec).

Actually I worked for microsoft for a year or so around that the time, and
had talked with the anti-spam team to give them a brain dump on anti-spam
and hashcash, but I dont know why they rejected memory bound puzzles.  That
particular one has a disadvantage that it requires a few megs of pre-baked
numbers to be shipped with the library.  Seems like Percival's one does not.

My guess was there's just something comparatively elegant and simple about
hashcash.  (If I recall they also used 8 sub-puzzles to reduce the variance
of the compute cost).


One quite generic argument I could suggest for being wary of scrypt would be
if someone said, hey here's my new hash function, use it instead of SHA1,
its better - you would and should very wary.  A lot of public review goes
into finding a good hash algorithm.  (Yeah I know SHA1 has a chink in its
armor now, but you get the point).

It is entirely possible to have flaws crop up in the non-parallelizable
aspect of the design, or non-reuse of RAM across parallel invocations.

I had a go at making non-parallelizable variants of hashcash also, and its
not so hard to do, but for most applications it doesnt really help
practically because the attacker would get lots of parallelism from trying
different passwords or different users or what have you in parallel.

The main point of scrypt is to increase the hardware cost by needing lots of
ram to evaluate the function which seems like a reasonable objective.

Abadi et al's mbound puzzle objective was to make the bottleneck memory
latency (plus a minimum required amount of RAM) instead of CPU.  They made
the argument that there mbound would be fairer because there is smaller
variance in RAM latency between eg smartphone ram and desktop server ram
compared to the imbalance between desktop/server cpu (core i7 3930k?  vs
arm9).  The stats seem to add up.  Though that was a few years ago, these
days some of the xeon or i7 have a pretty impressive on die L3 cache!

Adam

On Thu, May 31, 2012 at 10:02:17AM -0500, Nico Williams wrote:

On Thu, May 31, 2012 at 2:03 AM, Jon Callas j...@callas.org wrote:

On May 30, 2012, at 4:28 AM, Maarten Billemont wrote:


If I understand your point correctly, you're telling me that while scrypt might 
delay brute-force attacks on a user's master password, it's not terribly useful 
a defense against someone building a rainbow table.  Furthermore, you're of the 
opinion that the delay that scrypt introduces isn't very valuable and I should 
just simplify the solution with a hash function that's better trusted and more 
reliable.

Tests on my local machine (a MacBook Pro) indicate that scrypt can generate 10 hashes per 
second with its current configuration while SHA-1 can generate about 1570733.  This 
doesn't quite seem like a trivial delay, assuming rainbow tables are off 
the... table.  Though I certainly wish to see and understand your point of view.


My real advice, as in what I would do (and have done) is to run PBKDF2 with 
something like SHA1 or SHA-256 HMAC and an absurd number of iterations, enough 
to take one to two seconds on your MBP, which would be longer on ARM. There is 
a good reason to pick SHA1 here over SHA256 and that is that the time 
differential will be more predictable.


If you'll advise the use of compute-hard PBKDFs why not also memory
hard PBKDFs?  Forget scrypt if you don't like it.  But the underlying
idea in scrypt is simple: run one PBKDF instance to generate a key for
a PRF, then generate so many megabytes of pseudo-random data from that
PRF, then use another PBKDF+PRF instance to generate indices into the
output of the first, then finally apply a KDF (possibly a simple hash
function) to the output of the second pass to generate the final
output.  The use of a PRF to index the output of another PRF is
simple, and a simple and wonderful way to construct a memory-hard
PBKDF.

Meory and comput hardness in a PBKDF is surely to be desired.  It
makes it harder to optimize hardware for fast computation of rainbow
tables.

___
cryptography mailing list
cryptography@randombit.net
http://lists.randombit.net/mailman/listinfo/cryptography


Re: [cryptography] Master Password

2012-05-31 Thread Nico Williams
On Thu, May 31, 2012 at 10:43 AM, Adam Back a...@cypherspace.org wrote:
 One quite generic argument I could suggest for being wary of scrypt would be
 if someone said, hey here's my new hash function, use it instead of SHA1,
 its better - you would and should very wary.  A lot of public review goes
 into finding a good hash algorithm.  (Yeah I know SHA1 has a chink in its
 armor now, but you get the point).

Yes, but note that one could address that with some assumptions, and
with some techniques that one would reject when making a better hash
-- the point is to be slow, so things that make a PBKDF slower are OK:

PBKDF2' = PBKDF2(P' = to_password(memory_hard(P, S, p)) +
to_password(PBKDF2(P, S, p)), S, p)

where P, S, and p are the password, salt and PBKDF parameters,
to_password() generates a password from a key, and + concatenates
strings.

No one would build an H' from H that way.  But for a PBKDF it seems
sensible (but see below).

Can PBKDF2' be weaker than PBKDF2?  As long as PBKDF2 does not throw
away any entropy, and as long as knowing one portion of the password
(say, if the memory_hard function turns out to be weak) is not enough
to guess the remainder from PBKDF2's output, then I think the answer
has to be no.  Now, I'm making assumptions here.  Clearly PBKDF2 can
toss some entropy out, for example, so at least one of my two
assumptions is incorrect, but is it enough to wreck the security of my
PBKDF2' construction?

Nico
--
___
cryptography mailing list
cryptography@randombit.net
http://lists.randombit.net/mailman/listinfo/cryptography


Re: [cryptography] Master Password

2012-05-31 Thread Marsh Ray

On 05/31/2012 11:28 AM, Nico Williams wrote:


Yes, but note that one could address that with some assumptions, and
with some techniques that one would reject when making a better hash
-- the point is to be slow,


More precisely, the point is to take a tunable amount of time with 
strong assurance that an attacker will be unable to perform the 
computation with significantly less computational resources.


The deliberate consumption of computational resources is a price that 
the defender has to pay in order to impose costs on the attacker. This 
ought to be an advantageous strategy for the defender as long as the 
attacker is expected to need to invoke the function many times more.


But the defender's and attacker's cost structure is usually very 
different. The defender (say a website with a farm of PHP servers) 
doesn't get to choose when to begin the computation (legitimate users 
can log in at any time) and he pays a cost for noticeable latency and 
server resources.


The attacker costs are proportional to the number of guesses he needs to 
make to reverse the password. Hopefully this is dominated by wrong 
guesses. But the attacker is free to parallelize the computation across 
whatever specialized hardware he can assemble in the time that the 
credentials are valid (sometimes years). Some attackers could be using 
stolen resources (e.g. botnets for which they do not pay the power bill).


Starting with:

Ep = password entropy in bits (chosen by the user)

N =  iteration count (chosen by the defender)

Dd(N) = defender's per-iteration cost for N iterations. A nonlinear
function determined primarily by factors out of anyone's
control.

Aa = attacker's algorithmic advantage. This is never  1.0 because
 the attacker is free to use the same algorithm as the defender.

Ah = attacker's hardware and parallelism advantage. This is
 never  1.0 because the attacker is free to use the same
 commodity hardware as the defender.

Aw = attacker's server and electric power cost advantage. Probably
 close to 1.0 or, in the case of a botnet, easily 1000 or even
 infinite.

The defender's advantage to having a work factor looks something like:

  N * 2**Ep
Ad =  --
Dd(N) * Aa * Ah * Ap

Observations:

* The defender's goal is to maximize this function and for a given PBKDF 
his only knob to turn is N.


* The attackers goal is to reduce it. If the defender can't find an N 
with an Ad significantly than 1.0, he may be better of not imposing a 
work factor at all.


* Ep is the only exponential in the whole function. The individual user 
has all the leverage in this situation.


* If individual users can be shown to present a different Ep to the 
attacker, it could be beneficial to adjust N independently per user. For 
example a website might say to a user: Want to log in 2 seconds faster 
next time? Pick a longer password!


* N appears in both the numerator and denominator. In the denominator it 
goes nonlinear and very fuzzy. Yuck!


* The defender would like to increase Ep, but nobody has a really great 
way do that yet.


* Cryptographers mainly try to defend Aa and Ah.

* It seems like the defender's best strategy is to increase N until 
people begin to notice and complain.


* Clearly the attacker's best move is to beg, borrow, or steal a botnet. 
Only a noob would spend billions of dollars on the world's largest 
datacenter in the middle of the desert somewhere when they could instead 
spend a fraction of that money developing a popular video game and have 
millions of users happily bearing the cost of power and hardware 
depreciation to provide you with GPU-hours.



so things that make a PBKDF slower are OK:

PBKDF2' = PBKDF2(P' = to_password(memory_hard(P, S, p)) +
to_password(PBKDF2(P, S, p)), S, p)

where P, S, and p are the password, salt and PBKDF parameters,
to_password() generates a password from a key, and + concatenates
strings.

No one would build an H' from H that way.  But for a PBKDF it seems
sensible (but see below).

Can PBKDF2' be weaker than PBKDF2?


Yes.

It could turn out to result in an Aa or Ah significantly greater than 
1.0. It could end up being usefully parallelizable so it can be 
evaluated more efficiently when testing many passwords at one time 
rather than one at a time. Even if it just reduced the performance of 
other things, say webserver software, sharing the defender's hardware it 
could be a step backwards.


It could also be that memory_hard(P, S, p) introduces 
impractical-to-mitigate timing or other side channel attacks that leaked p.



As long as PBKDF2 does not throw
away any entropy, and as long as knowing one portion of the password
(say, if the memory_hard function turns out to be weak) is not enough
to guess the remainder from PBKDF2's output, then I think the answer
has to be no.  Now, I'm making assumptions here.  Clearly PBKDF2 can
toss some entropy out,


Last I checked, PBKDF2 

Re: [cryptography] Master Password

2012-05-31 Thread Nico Williams
On Thu, May 31, 2012 at 2:03 PM, Marsh Ray ma...@extendedsubset.com wrote:
 On 05/31/2012 11:28 AM, Nico Williams wrote:
 Yes, but note that one could address that with some assumptions, and
 with some techniques that one would reject when making a better hash
 -- the point is to be slow,

 [...]
 Starting with:

 Ep = password entropy in bits (chosen by the user)

 N =  iteration count (chosen by the defender)

For memory-hard PBKDFs you also need a memory size parameter, though
you might derive that from N.

 [...]
 The defender's advantage to having a work factor looks something like:

          N * 2**Ep
 Ad =  --
        Dd(N) * Aa * Ah * Ap

Nicely summarized.

 * If individual users can be shown to present a different Ep to the
 attacker, it could be beneficial to adjust N independently per user. For
 example a website might say to a user: Want to log in 2 seconds faster next
 time? Pick a longer password!

Nice.  But again, in practice this won't work.  You might think that
for a single master password users could get into the 48 bits of
entropy range, but probably not.

 Can PBKDF2' be weaker than PBKDF2?

 Yes.

Mine or any PBKDF2' in principle?

 It could turn out to result in an Aa or Ah significantly greater than 1.0.
 It could end up being usefully parallelizable so it can be evaluated more
 efficiently when testing many passwords at one time rather than one at a
 time. Even if it just reduced the performance of other things, say webserver
 software, sharing the defender's hardware it could be a step backwards.

I don't see how my PBKDF2' construction ends up being more
parallelizable than PBKDF2 such that PBKDF2' can run faster than
PBKDF2 -- you need to compute PBKDF2 before you can compute PBKDF2'
and I see no way around that.  If memory_hard() is not nearly as
memory-hard (and/or slow) as initially thought then PBKDF2' will not
help much, but it should still not hinder either.  Unless... unless N
is tuned down on the theory that memory_hard() adds strength -- if it
turns out not to then having turned N down will greatly help the
attacker.

 It could also be that memory_hard(P, S, p) introduces
 impractical-to-mitigate timing or other side channel attacks that leaked p.

Sure.  But let's assume that memory_hard() isn't awful.  It just isn't
yet accepted, and it might have some weaknesses.  I'd hope that by now
everyone pays attention to timing side channels (but power side
channels? not so much).  We can test for timing leaks.  We can't test
for as-yet undiscovered optimizations.

 Last I checked, PBKDF2 re-introduced all the starting entropy from P and S
 at every iteration. So it shouldn't lose any significant entropy.

Ah, yes, I wasn't sure, but now I see that it does.  It's possible
that the chosen PRF will toss a particular bit every time (e.g., the
high bit in every byte of the password), but let's assume it doesn't.

 If memory_hard did not take P and S and instead took
       memory_hard(PBKDF2(usage 2 + P, S, N))
 it might be easier to think about. We would have less to worry about if P
 were passed through a salted one way function before it were handed to
 memory_hard.

Sure, but I think you're making my point: there are things we could do
to make a PBKDF2' that is at least as strong as PBKDF2.  As long as we
don't respond to PBKDF2' by tuning down N I think that's possible,
trivially possible.  But the risk is probably quite high that N will
get turned down (it used to take 1s to login, now it takes 2s!-tune
down N so it takes 1s with the new PBKDF2').

Nico
--
___
cryptography mailing list
cryptography@randombit.net
http://lists.randombit.net/mailman/listinfo/cryptography


Re: [cryptography] Master Password

2012-05-30 Thread Maarten Billemont
On 30 May 2012, at 02:49, Jonathan Thornburg wrote:
 On Wed, 30 May 2012, Maarten Billemont wrote:
 Master Password is different in that it generates passwords based
 purely off of a user's master password and the name of the site.
 
 Is there a provision to rollover the master password
   [Useful to, e.g., regain security after ending a
   relationship in which the master password had been
   shared with a spouse/SO/lover/etc.]
 without requiring re-visiting every site where one has used a derived
 password based on that master password?
 
 ciao,


No, the Master Password application makes it very explicit that the password is 
not to be shared with anyone.  Passwords output by the algorithm for a specific 
site can optionally be shared, at the risk of needing to increment the password 
counter for the site when the user wishes to stop sharing their account.

smime.p7s
Description: S/MIME cryptographic signature
___
cryptography mailing list
cryptography@randombit.net
http://lists.randombit.net/mailman/listinfo/cryptography


Re: [cryptography] Master Password

2012-05-30 Thread Jon Callas
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Your algorithm is basically okay, but there are a couple of errors you've made, 
things you and I will disagree over, and one flaw that I consider to wreck the 
whole thing. But all of the problems are correctable, easily. If I have not 
understood something, or read it too quickly and gotten confused, I apologize 
in advance.

Let me walk through a reduction of your system:

(1) You take the master password and run it through a 512-bit hash function, 
producing master binary secret.

You pick scrypt for your hash function, because you think burning time and 
space adds to security. I do not. This is a place where gentlepersons can 
disagree, and I really don't expect to convince you that SHA-512 or Skein would 
be better options. I'm convinced that I know why you're doing it, and it would 
be a waste of both our times to go further. We just disagree.

At the end of it, it hardly matters because if an attacker wishes to construct 
a rainbow table, the correct way to do it is to assemble a list of likely 
passwords and just go from there. It will take longer if they use scrypt than 
with a real hash function, but once it's done it is done. They have the rainbow 
table.

This isn't a flaw, it just is. The goal of your system requires that you have a 
master secret. But security-wise, there's no win here other than burning some 
cycles in a way that the attacker can trivially replicate.

Security-wise, I'm quite certain that scrypt isn't nearly as secure as any real 
hash function you'd pick, but I'm just whining. We know that the security is 
password. If they pick puppies as their password, it really doesn't matter 
what hash function you run it through. Almost certainly, there is not enough 
security in their password to make it a difference what function you picked.

Let's call the parameters P for password and M for the master key.

(2) You take M and construct site-specific keys. We'll call the site name N, 
your counter C, and the site keys K_s.

You compute a given site-specific key, K_s, with:

K_s = SHA1(S + M + C), where + is the function that concatenates a null byte 
and then the second string.

Strictly speaking, you really ought to do it in the order K + C + S, because 
that's more collision-resistant. It's good practice when computing a keyed hash 
to hash the key first. In reality, it probably doesn't matter, but it *does* 
save you lots of debates with people like me.

You also want to hash in the length of S, too, because that's also more 
collision-resistent.

So you really want it to be K_s = SHA1(M + C + S.length + S), but that's the 
only real security problems I can see. The ordering is a nit, and omitting the 
length is only a problem if the hash function is broken. Speaking of which, why 
not use a non-broken hash function, like SHA256, or SHA512 or SHA512/160, if 
the output size matters to you? Given that you're using scrypt, why not use a 
better hash function, even if it is slower? But that's also a nit.

The real problem you have, however, is in the counter. First of all a counter 
is not a salt. A salt is an arbitrary non-security parameter. But arbitrary 
means random, just not secret. A counter is a counter.

The counter has two problems to it. One is that it doesn't add to the security 
of the system. The other it makes system utterly useless. 

An attacker can easily brute force through the site keys by just running a 
counter. That's why I say it doesn't add to the security. Even if you used 
scrypt here, too, it wouldn't matter much. It's still easy to brute force.

However, this completely ruins the system for the end user. The end user can't 
just remember their master password and the site name. They have to remember 
what *order* they did it in too, which ruins everything. You can't sync this 
across devices, you have to keep track of orders, and so on. You need to remove 
the counter.

I understand why you did it. You did it because that way two people with the 
same master password on the same site aren't going to have the same password -- 
which would give away the master password to the other one, as well as leaking 
to an attacker that you picked a lousy password, even if they don't immediately 
know what it is (cue the rainbow tables). Ironically, this is kinda like the 
recent RSA/GCD thing in that your security oops can be made into a disaster if 
someone else makes the same oops.

You can fix this one by substituting the person's site username for the 
counter. In this revision, you have K_s = HASH(M + U.length + U + S.length + 
S). That's a much better construction.

(3) You run the site key, K, through some pretty printer. I really didn't read 
this and really don't care. It doesn't matter to me. TL;DR.

All in all, it's cute. I like it enough to write this note. I advise that you:

* Pick a real hash function or two. We can debate scrypt, but you can do better 
than SHA1. Even a second scrypt is better than 

Re: [cryptography] Master Password

2012-05-30 Thread Maarten Billemont
First of all, thanks for your time and very valuable feedback.

On 30 May 2012, at 07:20, Marsh Ray wrote:
 On 05/29/2012 06:01 PM, Maarten Billemont wrote:
 Dear readers,
 
 I've written an iOS / Mac application whose goal it is to produce
 passwords for any purpose.  I was really hoping for the opportunity
 to receive some critical feedback or review of the algorithm
 used[1].
 
 
 [1] http://masterpassword.lyndir.com/algorithm.html
 
 Master Password Master Password So how does it work?
 
 The theory behind Master Password is simple. The user remembers a
 single, secure password. The user only ever uses that password to log
 into the Master Password application. This master password is then
 used as a seed to generate a different password based on the name of
 
 How about just used to generate? Usually the term seed is used when it 
 completely determines a pseudorandom sequence rather than being combined with 
 other information.
 
 [edit: OIC, the master password generates a seed based on the name of the 
 site. This was unclear.]
 
 the site to generate a password for.
 
 The result is that each master password generates its own unique
 sequence of passwords for any site name. Since the only input data is
 the master password and the site name (along with a password counter,
 see below), there is no need for any kind of storage to recreate a
 site's password. All that's needed is the correct master password and
 the correct algorithm implementation.
 
 How do you know how far along in the password sequence you are?

The summary descriptions of the algorithm are simplified (perhaps, too much so) 
by assuming default values for those inputs that have ones.  Technically, there 
are these inputs:
 - The master password,
 - The site name,
 - The password counter (default 0),
 - The password type (default Long Password).

 
 What that does for you is make
 it almost impossible to lose your passwords.
 
 To lose a password has multiple meanings.
 How about much easier to create and remember strong passwords?

The intended meaning was, get in a situation where you can no longer obtain to 
the data necessary for accessing your passwords.  Additionally, you can no 
longer get in a situation where this data can be lost to a third party, such 
as when someone steals your notebook full of passwords.

 
 It also makes it nearly
 impossible for hackers to steal your online identity.
 
 I don't think this claim is well substantiated. :-)

The basis of this claim is the theory that attacking your password for a single 
site remains as vulnerable as password authentication has ever been, but this 
does not risk your global identity as reusing or rehashing the same password on 
multiple sites would.  It's phrased awkwardly for an audience of professionals, 
but was targeted at users.  This page may not be the right location for such 
language.

 
 The Algorithm
 
 In short, the algorithm is comprised of the following steps:
 
 Determining the master key Determining the cipher seed Encoding a
 user-friendly password
 
 The Master Password
 
 The user chooses a single master password, preferably sufficiently
 long to harden against brute-force attacks. Master Password
 recommends absurd two or three-word sentences as they're easily
 remembered and generally sufficiently high in entropy.
 
 I believe this is not good advice because two and three word sentences will 
 not provide enough entropy.
 
 Wikipedia cites a Harvard/Google study indicating English has 1,022,000 
 words. 
 https://en.wikipedia.org/wiki/Number_of_words_in_English#Number_of_words_in_English
 
 Users have great difficulty picking original passwords, even security pros 
 tend to be overconfident in the uniqueness of password schemes.
 For example:
 http://www.imperva.com/docs/WP_Consumer_Password_Worst_Practices.pdf
 the 5000 most popular passwords [] are used by a share of 20% of the users.
 
 So even if users chose two completely independent passwords from the 
 RockYou statistical distribution to form their sentence it would mean that 
 more than 4% of users will use one of the top 25 million two password 
 combinations.
 
 But that's when users are asked to pick unguessable passwords, asking for 
 very short proper sentences is certain to make this worse. Many proper two 
 word sentences in English are of the form noun-verb. But English has far more 
 nouns than verbs. For example, https://en.wiktionary.org/ has 174871 entries 
 for nouns and only 24931 under verbs. Adverb-verb sentences will be popular 
 too, but there are only 12982 adverbs listed.
 
 In fact, a recent study tested two word minimum passphrases against an Amazon 
 system which returned a different error depending on whether or not the 
 passphrase was in use. They estimate that a dictionary of 20,656 phrases 
 covers the choices of about 1.13% of users and concluded that by our 
 metrics, even 5-word phrases would be highly insecure against offline 
 attacks, with fewer than 30 

Re: [cryptography] Master Password

2012-05-30 Thread Kyle Creyts
I would hazard a guess that this system would stand up well against
mass attacks, at the very least making them much less economically
desirable or feasible for attackers who benefit most from password
dumps. Most architectures fail in single cases, anyway, due to poor
user awareness, poor user decisions, or bad implementations.

On Wed, May 30, 2012 at 9:06 AM, Maarten Billemont lhun...@lyndir.com wrote:
 First of all, thanks for your time and very valuable feedback.

 On 30 May 2012, at 07:20, Marsh Ray wrote:
 On 05/29/2012 06:01 PM, Maarten Billemont wrote:
 Dear readers,

 I've written an iOS / Mac application whose goal it is to produce
 passwords for any purpose.  I was really hoping for the opportunity
 to receive some critical feedback or review of the algorithm
 used[1].


 [1] http://masterpassword.lyndir.com/algorithm.html

 Master Password Master Password So how does it work?

 The theory behind Master Password is simple. The user remembers a
 single, secure password. The user only ever uses that password to log
 into the Master Password application. This master password is then
 used as a seed to generate a different password based on the name of

 How about just used to generate? Usually the term seed is used when it 
 completely determines a pseudorandom sequence rather than being combined 
 with other information.

 [edit: OIC, the master password generates a seed based on the name of the 
 site. This was unclear.]

 the site to generate a password for.

 The result is that each master password generates its own unique
 sequence of passwords for any site name. Since the only input data is
 the master password and the site name (along with a password counter,
 see below), there is no need for any kind of storage to recreate a
 site's password. All that's needed is the correct master password and
 the correct algorithm implementation.

 How do you know how far along in the password sequence you are?

 The summary descriptions of the algorithm are simplified (perhaps, too much 
 so) by assuming default values for those inputs that have ones.  Technically, 
 there are these inputs:
  - The master password,
  - The site name,
  - The password counter (default 0),
  - The password type (default Long Password).


 What that does for you is make
 it almost impossible to lose your passwords.

 To lose a password has multiple meanings.
 How about much easier to create and remember strong passwords?

 The intended meaning was, get in a situation where you can no longer obtain 
 to the data necessary for accessing your passwords.  Additionally, you can no 
 longer get in a situation where this data can be lost to a third party, 
 such as when someone steals your notebook full of passwords.


 It also makes it nearly
 impossible for hackers to steal your online identity.

 I don't think this claim is well substantiated. :-)

 The basis of this claim is the theory that attacking your password for a 
 single site remains as vulnerable as password authentication has ever been, 
 but this does not risk your global identity as reusing or rehashing the same 
 password on multiple sites would.  It's phrased awkwardly for an audience of 
 professionals, but was targeted at users.  This page may not be the right 
 location for such language.


 The Algorithm

 In short, the algorithm is comprised of the following steps:

 Determining the master key Determining the cipher seed Encoding a
 user-friendly password

 The Master Password

 The user chooses a single master password, preferably sufficiently
 long to harden against brute-force attacks. Master Password
 recommends absurd two or three-word sentences as they're easily
 remembered and generally sufficiently high in entropy.

 I believe this is not good advice because two and three word sentences will 
 not provide enough entropy.

 Wikipedia cites a Harvard/Google study indicating English has 1,022,000 
 words. 
 https://en.wikipedia.org/wiki/Number_of_words_in_English#Number_of_words_in_English

 Users have great difficulty picking original passwords, even security pros 
 tend to be overconfident in the uniqueness of password schemes.
 For example:
 http://www.imperva.com/docs/WP_Consumer_Password_Worst_Practices.pdf
 the 5000 most popular passwords [] are used by a share of 20% of the users.

 So even if users chose two completely independent passwords from the 
 RockYou statistical distribution to form their sentence it would mean that 
 more than 4% of users will use one of the top 25 million two password 
 combinations.

 But that's when users are asked to pick unguessable passwords, asking for 
 very short proper sentences is certain to make this worse. Many proper two 
 word sentences in English are of the form noun-verb. But English has far 
 more nouns than verbs. For example, https://en.wiktionary.org/ has 174871 
 entries for nouns and only 24931 under verbs. Adverb-verb sentences will be 
 popular too, but there are only 12982 adverbs listed.

 

Re: [cryptography] Master Password

2012-05-30 Thread Kyle Creyts
Which is not to say that I find the single case, or cryptographic
strength to be superior to other systems. But it certainly
complicates the job of an attacker seeking to exploit large numbers of
passwords, or cross-service password reuse. Imperfect, but not a
terrible step.

On Wed, May 30, 2012 at 11:23 AM, Kyle Creyts kyle.cre...@gmail.com wrote:
 I would hazard a guess that this system would stand up well against
 mass attacks, at the very least making them much less economically
 desirable or feasible for attackers who benefit most from password
 dumps. Most architectures fail in single cases, anyway, due to poor
 user awareness, poor user decisions, or bad implementations.

 On Wed, May 30, 2012 at 9:06 AM, Maarten Billemont lhun...@lyndir.com wrote:
 First of all, thanks for your time and very valuable feedback.

 On 30 May 2012, at 07:20, Marsh Ray wrote:
 On 05/29/2012 06:01 PM, Maarten Billemont wrote:
 Dear readers,

 I've written an iOS / Mac application whose goal it is to produce
 passwords for any purpose.  I was really hoping for the opportunity
 to receive some critical feedback or review of the algorithm
 used[1].


 [1] http://masterpassword.lyndir.com/algorithm.html

 Master Password Master Password So how does it work?

 The theory behind Master Password is simple. The user remembers a
 single, secure password. The user only ever uses that password to log
 into the Master Password application. This master password is then
 used as a seed to generate a different password based on the name of

 How about just used to generate? Usually the term seed is used when it 
 completely determines a pseudorandom sequence rather than being combined 
 with other information.

 [edit: OIC, the master password generates a seed based on the name of the 
 site. This was unclear.]

 the site to generate a password for.

 The result is that each master password generates its own unique
 sequence of passwords for any site name. Since the only input data is
 the master password and the site name (along with a password counter,
 see below), there is no need for any kind of storage to recreate a
 site's password. All that's needed is the correct master password and
 the correct algorithm implementation.

 How do you know how far along in the password sequence you are?

 The summary descriptions of the algorithm are simplified (perhaps, too much 
 so) by assuming default values for those inputs that have ones.  
 Technically, there are these inputs:
  - The master password,
  - The site name,
  - The password counter (default 0),
  - The password type (default Long Password).


 What that does for you is make
 it almost impossible to lose your passwords.

 To lose a password has multiple meanings.
 How about much easier to create and remember strong passwords?

 The intended meaning was, get in a situation where you can no longer obtain 
 to the data necessary for accessing your passwords.  Additionally, you can 
 no longer get in a situation where this data can be lost to a third party, 
 such as when someone steals your notebook full of passwords.


 It also makes it nearly
 impossible for hackers to steal your online identity.

 I don't think this claim is well substantiated. :-)

 The basis of this claim is the theory that attacking your password for a 
 single site remains as vulnerable as password authentication has ever been, 
 but this does not risk your global identity as reusing or rehashing the same 
 password on multiple sites would.  It's phrased awkwardly for an audience of 
 professionals, but was targeted at users.  This page may not be the right 
 location for such language.


 The Algorithm

 In short, the algorithm is comprised of the following steps:

 Determining the master key Determining the cipher seed Encoding a
 user-friendly password

 The Master Password

 The user chooses a single master password, preferably sufficiently
 long to harden against brute-force attacks. Master Password
 recommends absurd two or three-word sentences as they're easily
 remembered and generally sufficiently high in entropy.

 I believe this is not good advice because two and three word sentences will 
 not provide enough entropy.

 Wikipedia cites a Harvard/Google study indicating English has 1,022,000 
 words. 
 https://en.wikipedia.org/wiki/Number_of_words_in_English#Number_of_words_in_English

 Users have great difficulty picking original passwords, even security pros 
 tend to be overconfident in the uniqueness of password schemes.
 For example:
 http://www.imperva.com/docs/WP_Consumer_Password_Worst_Practices.pdf
 the 5000 most popular passwords [] are used by a share of 20% of the 
 users.

 So even if users chose two completely independent passwords from the 
 RockYou statistical distribution to form their sentence it would mean that 
 more than 4% of users will use one of the top 25 million two password 
 combinations.

 But that's when users are asked to pick unguessable passwords, asking for 
 

Re: [cryptography] Master Password

2012-05-30 Thread Maarten Billemont
Thanks a lot, Jon, for taking the time and sharing your thoughts.

On 30 May 2012, at 09:32, Jon Callas wrote:
 Your algorithm is basically okay, but there are a couple of errors you've 
 made, things you and I will disagree over, and one flaw that I consider to 
 wreck the whole thing. But all of the problems are correctable, easily. If I 
 have not understood something, or read it too quickly and gotten confused, I 
 apologize in advance.
 
 Let me walk through a reduction of your system:
 
 (1) You take the master password and run it through a 512-bit hash function, 
 producing master binary secret.
 
 You pick scrypt for your hash function, because you think burning time and 
 space adds to security. I do not. This is a place where gentlepersons can 
 disagree, and I really don't expect to convince you that SHA-512 or Skein 
 would be better options. I'm convinced that I know why you're doing it, and 
 it would be a waste of both our times to go further. We just disagree.
 
 At the end of it, it hardly matters because if an attacker wishes to 
 construct a rainbow table, the correct way to do it is to assemble a list of 
 likely passwords and just go from there. It will take longer if they use 
 scrypt than with a real hash function, but once it's done it is done. They 
 have the rainbow table.
 
 This isn't a flaw, it just is. The goal of your system requires that you have 
 a master secret. But security-wise, there's no win here other than burning 
 some cycles in a way that the attacker can trivially replicate.
 
 Security-wise, I'm quite certain that scrypt isn't nearly as secure as any 
 real hash function you'd pick, but I'm just whining. We know that the 
 security is password. If they pick puppies as their password, it really 
 doesn't matter what hash function you run it through. Almost certainly, there 
 is not enough security in their password to make it a difference what 
 function you picked.

If I understand your point correctly, you're telling me that while scrypt might 
delay brute-force attacks on a user's master password, it's not terribly useful 
a defense against someone building a rainbow table.  Furthermore, you're of the 
opinion that the delay that scrypt introduces isn't very valuable and I should 
just simplify the solution with a hash function that's better trusted and more 
reliable.

Tests on my local machine (a MacBook Pro) indicate that scrypt can generate 10 
hashes per second with its current configuration while SHA-1 can generate about 
1570733.  This doesn't quite seem like a trivial delay, assuming rainbow 
tables are off the... table.  Though I certainly wish to see and understand 
your point of view.

 
 Let's call the parameters P for password and M for the master key.
 
 (2) You take M and construct site-specific keys. We'll call the site name N, 
 your counter C, and the site keys K_s.
 
 You compute a given site-specific key, K_s, with:
 
 K_s = SHA1(S + M + C), where + is the function that concatenates a null 
 byte and then the second string.
 
 Strictly speaking, you really ought to do it in the order K + C + S, because 
 that's more collision-resistant. It's good practice when computing a keyed 
 hash to hash the key first. In reality, it probably doesn't matter, but it 
 *does* save you lots of debates with people like me.

Slight mix-up of parameters, but I think I can follow :-)

 
 You also want to hash in the length of S, too, because that's also more 
 collision-resistent.
 
 So you really want it to be K_s = SHA1(M + C + S.length + S), but that's the 
 only real security problems I can see. The ordering is a nit, and omitting 
 the length is only a problem if the hash function is broken. Speaking of 
 which, why not use a non-broken hash function, like SHA256, or SHA512 or 
 SHA512/160, if the output size matters to you? Given that you're using 
 scrypt, why not use a better hash function, even if it is slower? But that's 
 also a nit.

I changed the order so that the name of the site wouldn't be last, in order to 
avoid length extension issues.  Adding in the length of the site name would 
curtail those issues as well, I expect, without needing to change the order.

When you say SHA-1 is broken, are you referring to this property or something 
else?  I was unaware of any other issues that SHA-1 suffers but the others do 
not.

 
 The real problem you have, however, is in the counter. First of all a counter 
 is not a salt. A salt is an arbitrary non-security parameter. But arbitrary 
 means random, just not secret. A counter is a counter.
 
 The counter has two problems to it. One is that it doesn't add to the 
 security of the system. The other it makes system utterly useless. 
 
 An attacker can easily brute force through the site keys by just running a 
 counter. That's why I say it doesn't add to the security. Even if you used 
 scrypt here, too, it wouldn't matter much. It's still easy to brute force.
 
 However, this completely ruins the 

Re: [cryptography] Master Password

2012-05-30 Thread Jonathan Thornburg
You're right, sharing of master passwords is a bad idea.  But given
human nature, it happens, and a security system needs to take that
into account.  There are also a lot of other ways a master password
can be compromised and thus need rolling over, e.g. shoulder-surfing,
virus keyloggers, theft of PC where web browser remembered it, etc.

So... it would be a *big* plus to have a way to rollover the master
password without having to manually re-visit and re-password each
website.

-- 
-- Jonathan Thornburg [remove -animal to reply] 
jth...@astro.indiana-zebra.edu
   Dept of Astronomy  IUCSS, Indiana University, Bloomington, Indiana, USA
   Washing one's hands of the conflict between the powerful and the
powerless means to side with the powerful, not to be neutral.
  -- quote by Freire / poster by Oxfam
___
cryptography mailing list
cryptography@randombit.net
http://lists.randombit.net/mailman/listinfo/cryptography


Re: [cryptography] Master Password

2012-05-30 Thread Maarten Billemont
On 30 May 2012, at 15:09, Jonathan Thornburg wrote:
 You're right, sharing of master passwords is a bad idea.  But given
 human nature, it happens, and a security system needs to take that
 into account.  There are also a lot of other ways a master password
 can be compromised and thus need rolling over, e.g. shoulder-surfing,
 virus keyloggers, theft of PC where web browser remembered it, etc.
 
 So... it would be a *big* plus to have a way to rollover the master
 password without having to manually re-visit and re-password each
 website.

If somebody gets a hold of your master password, they now have the ability to 
generate any password on demand.  There is only one fix: Change all the 
passwords that the attacker can now generate.  Whatever trickery you aim to do 
with your master password, all site passwords must be changed, either way.

Since Master Password is a solution that hooks into the password authentication 
mechanism of arbitrary sites and it works completely offline, there is no way 
for it to communicate with sites on your behalf or invalidate your passwords 
somehow.  What you're pointing out is a flaw of the decentralized password 
authentication mechanism, and the only fix is to stop using decentralized 
password authentication.

It's not a terribly big deal, IMO, but if you are indeed worried about loosing 
your master password, you need to either start using something like OpenID 
exclusively or use a password solution that doesn't work statelessly and then 
hope you won't also loose that state to the attacker along with your master 
password (seems unlikely that you can protect yourself from this).

A very valid point, however, is key logging.  To thwart that, Master Password 
users could use only the iOS application, which displays their password, and 
copy it manually to their desktop.  That would void the need to enter the 
master password on the desktop which may potentially be compromised.

Don't get me wrong; your iPhone can also get compromised, but it's 
significantly less likely to happen.  It's probably the most secure keypad you 
have around, not to mention it's much easier to keep nosey sholder-surfers away 
from a tiny screen you hold in your hand than a large keyboard in a fixed 
position on your desk.

smime.p7s
Description: S/MIME cryptographic signature
___
cryptography mailing list
cryptography@randombit.net
http://lists.randombit.net/mailman/listinfo/cryptography


Re: [cryptography] Master Password

2012-05-30 Thread Steven Bellovin

On May 29, 2012, at 7:01 22PM, Maarten Billemont wrote:

 Dear readers,
 
 I've written an iOS / Mac application whose goal it is to produce passwords 
 for any purpose.  I was really hoping for the opportunity to receive some 
 critical feedback or review of the algorithm used[1].
 
 --
 ABOUT
 
 With an increasing trend of web applications requiring users to register 
 accounts, we find ourselves with countless accounts.  Ideally, each should 
 have a different password, so that authenticating yourself for one account 
 doesn't reveal your credentials of other accounts.  That becomes really hard 
 when you've got tens or hundreds of passwords to remember.
 
 Solutions exist, mostly in the form of password vaults that list your 
 passwords and get stored in an encrypted form.  Other solutions send your 
 passwords off to be stored on some company's cloud service.
 
 Master Password is different in that it generates passwords based purely off 
 of a user's master password and the name of the site.  That means you need no 
 storage and have a fully offline algorithm that needs nothing more than what 
 you can remember easily.
 --
 
 I'm rather a notice in the field of security (certainly in comparison to some 
 of you), and I was hoping that some of you might find the time to have a look 
 at the algorithm and see if there are any obvious flaws or risks to the 
 security and integrity of the solution.
 
 As a side-note, the iOS application, Master Password, is fully open-source[2] 
 under the GPLv3.  If any of you speak fluent Objective-C, it would be awesome 
 if they could have a peek at the source code as well.


From a very quick glance, it looks to be about the same as

@inproceedings{web-pw-gen,
Author = {J. Alex Halderman and Brent Waters and Edward W. Felten},
Booktitle = {Proc. 14th Intl. World Wide Web Conference},
Month = {May},
Title = {A Convenient Method for Securely Managing Passwords},
Url = {http://userweb.cs.utexas.edu/~bwaters/publications/papers/www2005
.pdf},
Year = 2005,
}

As someone else has noted, a crucial issue is that every site receives a
function of your master password, the site name, and a counter that
defaults to zero.  If they launch a password-guessing attack -- and I
know you've made it expensive, but you can't go too far in that
direction without making user password retrieval too time-consuming, and
the attackers have GPUs, botnets, and things like EC2 to parallelize
their work -- they can retrieve the master password and hence all of
your others.  You can strengthen you scheme significantly by making the
counter 8 bytes and starting it with some random value.


--Steve Bellovin, https://www.cs.columbia.edu/~smb





___
cryptography mailing list
cryptography@randombit.net
http://lists.randombit.net/mailman/listinfo/cryptography


Re: [cryptography] Master Password

2012-05-30 Thread Maarten Billemont
On 30 May 2012, at 16:26, Wyss, Felix wrote:
 What about including a random salt when generating the key from the master 
 password?  The application could either generate the salt for you on first 
 use (and recommend writing it down and keeping in a safe place) or allow 
 entering an existing salt (e.g. when transferring to a new device).
 
 That would make password guessing practically infeasible and two users 
 sharing the same master password won't inadvertently break the system.


You're proposing adding a second secret to the mix.  While interesting in the 
sense that it guards against both loss of a master password and rainbow tables, 
I'd rather avoid introducing any factors that can easily become lost to the 
user.

I imagine introducing a username to the scheme as a salt, as proposed by Jon 
Callas, would create sufficient protection against rainbow tables.

smime.p7s
Description: S/MIME cryptographic signature
___
cryptography mailing list
cryptography@randombit.net
http://lists.randombit.net/mailman/listinfo/cryptography


Re: [cryptography] Master Password

2012-05-30 Thread Charles Morris
On Wed, May 30, 2012 at 9:57 AM, Steven Bellovin s...@cs.columbia.edu wrote:

 On May 29, 2012, at 7:01 22PM, Maarten Billemont wrote:

 Dear readers,

 I've written an iOS / Mac application whose goal it is to produce passwords 
 for any purpose.  I was really hoping for the opportunity to receive some 
 critical feedback or review of the algorithm used[1].

 --
 ABOUT

 With an increasing trend of web applications requiring users to register 
 accounts, we find ourselves with countless accounts.  Ideally, each should 
 have a different password, so that authenticating yourself for one account 
 doesn't reveal your credentials of other accounts.  That becomes really hard 
 when you've got tens or hundreds of passwords to remember.

 Solutions exist, mostly in the form of password vaults that list your 
 passwords and get stored in an encrypted form.  Other solutions send your 
 passwords off to be stored on some company's cloud service.

 Master Password is different in that it generates passwords based purely off 
 of a user's master password and the name of the site.  That means you need 
 no storage and have a fully offline algorithm that needs nothing more than 
 what you can remember easily.
 --

 I'm rather a notice in the field of security (certainly in comparison to 
 some of you), and I was hoping that some of you might find the time to have 
 a look at the algorithm and see if there are any obvious flaws or risks to 
 the security and integrity of the solution.

 As a side-note, the iOS application, Master Password, is fully 
 open-source[2] under the GPLv3.  If any of you speak fluent Objective-C, it 
 would be awesome if they could have a peek at the source code as well.


 From a very quick glance, it looks to be about the same as

 @inproceedings{web-pw-gen,
        Author = {J. Alex Halderman and Brent Waters and Edward W. Felten},
        Booktitle = {Proc. 14th Intl. World Wide Web Conference},
        Month = {May},
        Title = {A Convenient Method for Securely Managing Passwords},
        Url = 
 {http://userweb.cs.utexas.edu/~bwaters/publications/papers/www2005
 .pdf},
        Year = 2005,
 }

 As someone else has noted, a crucial issue is that every site receives a
 function of your master password, the site name, and a counter that
 defaults to zero.  If they launch a password-guessing attack -- and I
 know you've made it expensive, but you can't go too far in that
 direction without making user password retrieval too time-consuming, and
 the attackers have GPUs, botnets, and things like EC2 to parallelize
 their work -- they can retrieve the master password and hence all of
 your others.  You can strengthen you scheme significantly by making the
 counter 8 bytes and starting it with some random value.


                --Steve Bellovin, https://www.cs.columbia.edu/~smb


Indeed, this has been done many times before, and although that is true,
Mr. Billemont should still be congratulated for his insight and I'm sure
he will make money off the irobot community; even though he is probably
(almost surely) stepping on some patents and prior work of others.

However Steve's modification does not fully repair this idea, in fact
it reduces the security given by the master password in comparison
to the random counter values to basically nothing.

As your new found secrets (the random counters) are stored plaintext
in the device (even if they aren't, so what? maybe we could encrypt
them with more random counters).

This leads to not only the scheme being completely device dependent
(try to ask a user to remember and IP, don't even consider 512 random bits)

And a device compromise being a terrible scenario. Obvious solutions
to this issue that I have considered are obviously flawed.

scrypt is also a terrible choice. Enjoy your free money.
___
cryptography mailing list
cryptography@randombit.net
http://lists.randombit.net/mailman/listinfo/cryptography


Re: [cryptography] Master Password

2012-05-30 Thread Marsh Ray

On 05/30/2012 04:06 AM, Maarten Billemont wrote:

First of all, thanks for your time and very valuable feedback.

On 30 May 2012, at 07:20, Marsh Ray wrote:

On 05/29/2012 06:01 PM, Maarten Billemont wrote:



Initially, my recommendation for a master password was to use a
sufficiently-random 12-character string.



I added the proposal for
paranoid users to incorporate non-latin glyphs into the password.
The big issue here is usability: Ideally, you want to lock the
application down and require the master password for every usage of
it.  That means entering the master password needs to be sufficiently
painless.  At the same time, it's vital that users do not forget
their master password, or the whole scheme falls apart; there is no
master password recovery scenario.  These two key elements combined
reading about things like passfault's attempts at determining
password strengths, lead me to think absurd word-based sentences are
a best fit for sufficiently-high entropy memorable and easy-to-enter
pass phrases.  I may need to re-evaluate this opinion based on your
references.  Perhaps you have proposals that might better address
these requirements?


No one knows how to generate human-friendly passphrases that 
consistently meet the standard of 128 bit key strength.


The best scheme I've been able to come up for my own use is to generate 
a unique password for every site with:


   dd if=/dev/urandom bs=18 count=1 | base64

And then to write them down on blank business card paper stock. This way 
I can conveniently carry them or store them securely.


Sadly, converting the data security problem to a physical security 
problem is for me an improvement. If I frequently traveled in an area 
where my physical security was reduced I might make a different decision.



https://passfault.appspot.com/password_strength.html


Let's tell users to give all their passwords to an unrelated 3rd party 
website. Hmm, what could possibly go wrong?



The purpose of this salt is indeed not to thwart dictionary
attacks, it exists solely to shake up the input data to the hash and
produce an entirely new hash on the user's demand.  I was unfamiliar
with your definition of salt, and I'm certainly willing to change my
terminology.  Have you a proposition of what this password counter
might better be called?


I would try to throw it out of the design entirely. If users need to rev 
the password for a site, use example.com 2 for the site name. You 
could provide UI to assist with this but it's needless complexity for 
the algorithm.



The site name is a UTF-8 decoded byte string, as mentioned in the
text above, but I will repeat it inline as well, to avoid possible
ambiguity.  The key and salt can indeed contain NULs, but then again,
they can contain anything.  Are you hinting that they might well be
simply concatenated without delimiter?


The rule is simple: for every possible unique combination of k, s, and 
sitename you need to be able to repeatably generate a unique binary 
string. The easiest way to do this is to show that given the binary, you 
can unambiguously get back to the original inputs.


Things that tend to trip programmers up when they go to implement this are:
* delimiting variable length inputs
* character data encoding issues, e.g., ASCII vs UNICODE
* other character data issues such as case folding and unicode 
normalization forms

* cross-platform user input of special characters


If you're referring to the possibility of length extension to produce
a password for a different site from a given password, ignoring that
the resulting seed from this hash operation is never actually wholly
known by a site, I believe I've dodged that by putting the site name
in front of the other elements and additionally delimiting it with a
NUL, though I suppose that with some luck, the salt may yet be
extended?  Note, however, that it is a fixed-length numeric value.
I'm unsure that this can be meaningfully exploited by length
extension.


When we're the least bit unsure about whether something could be 
vulnerable, the bet plan is usually to fall back on the tried and true 
conservative methods.


My current standard favorite in the category of key derivation functions is:

PBKDF2 HMAC SHA-2-512/256  

It's very conservative, widely used and reviewed, has a tunable work 
factor, designed to accept a password input and generates an arbitrary 
length key.



I certainly can't deny that HMAC does seem like better a fit.  It's
just that it also seems rather unnecessary.


From one view point it does resemble cargo-cult superstition. From 
another view point it's easier to prove that something is not weak when 
it has more ideal properties. The length extension property of SHA-1 is 
not ideal, so SHA-1 is harder to think about. But if you have a secret 
key available that you were already planning to input you can use HMAC 
which does not have this undesirable property and is thus easier to 
think about.



Often, they 

Re: [cryptography] Master Password

2012-05-30 Thread Nico Williams
On Wed, May 30, 2012 at 2:32 AM, Jon Callas j...@callas.org wrote:
 (1) You take the master password and run it through a 512-bit hash function, 
 producing master binary secret.

 You pick scrypt for your hash function, because you think burning time and 
 space adds to security. I do not. This is a place where gentlepersons can 
 disagree, and I really don't expect to convince you that SHA-512 or Skein 
 would be better options. I'm convinced that I know why you're doing it, and 
 it would be a waste of both our times to go further. We just disagree.

 At the end of it, it hardly matters because if an attacker wishes to 
 construct a rainbow table, the correct way to do it is to assemble a list of 
 likely passwords and just go from there. It will take longer if they use 
 scrypt than with a real hash function, but once it's done it is done. They 
 have the rainbow table.

This is why salting is important.  They should not be able to build a
single rainbow table that works for all cases.  They should have to
build a rainbow table per-user, but since that's wasteful (of storage)
they won't unless they are prepping to attack a single account at some
point when material suitable for attack becomes available.

Once you're salting the next step is to slow down the password search.
 We can't slow it down too much, else the user will suffer and
complain, so in the end the user had better pick password.  And if the
attacker is attacking a small number of users then they can build
rainbow tables, which means that... in general you're right.

Are you saying that PBKDFs are just so much cargo cult now?

Nico
--
___
cryptography mailing list
cryptography@randombit.net
http://lists.randombit.net/mailman/listinfo/cryptography


Re: [cryptography] Master Password

2012-05-30 Thread Nico Williams
On Wed, May 30, 2012 at 3:25 PM, Maarten Billemont lhun...@lyndir.com wrote:
 I'm currently considering asking the user for their full name and using that 
 as a salt in the scrypt operation.  Full names are often lengthy and there's 
 a good deal of them.  Do you recon this might introduce enough entropy or 
 should I also be asking for the user's birth date?  I'm just thinking that 
 this is good information that will make for a wide enough range of different 
 salts that it will hopefully make rainbow tables too expensive while still 
 avoiding the problem that a user cannot remember any random salt of such 
 entropy.

My problem with your design is that the statelessness of it forces you
to depend on a really, really good master password, because otherwise
any site [to which the user gives a password generated this way] can
then mount an off-line dictionary attack on the user's master
password.  This means that the user needs to have such a strong
password that it's likely not practical.

PBKDFs with large work/memory factors are useful when the attacker has
to compromise some other part of the system in order to be able to
mount an off-line dictionary attack on the password.  A scheme that
exposes material suitable for attacking without any other protections
is weak.

Nico
--
___
cryptography mailing list
cryptography@randombit.net
http://lists.randombit.net/mailman/listinfo/cryptography


Re: [cryptography] Master Password

2012-05-30 Thread Eitan Adler
On 30 May 2012 13:25, Maarten Billemont lhun...@lyndir.com wrote:
 On 30 May 2012, at 22:17, Marsh Ray wrote:
 On 05/30/2012 02:59 PM, Nico Williams wrote:

 This is why salting is important.  They should not be able to build
 a single rainbow table that works for all cases.

 In order to be useful, the salt has to be large enough to not have large 
 numbers of collisions across large user populations. Ideally it should be 
 out of brute force range, or else the attacker can just fix a password and 
 construct his rainbow table over the salt possibilities.

 This implies that a maximally-effective salt will be larger than a user is 
 able to remember, but the explicit goal of this scheme was to avoid 
 persistent state on the device. If we're willing to give up this design 
 goal, we'd probably be better off building a proper encrypted password 
 manager app instead.

 There may still be value in hashing in the username, but only in the 
 aggregate. I don't see that it helps the targeted user case much.


 I'm currently considering asking the user for their full name and using that 
 as a salt in the scrypt operation.  Full names are often lengthy and there's 
 a good deal of them.  Do you recon this might introduce enough entropy or 
 should I also be asking for the user's birth date?  I'm just thinking that 
 this is good information that will make for a wide enough range of different 
 salts that it will hopefully make rainbow tables too expensive while still 
 avoiding the problem that a user cannot remember any random salt of such 
 entropy.

People can be identified uniquely with about 33 bits of entropy.
Birthdays give about 9 bits of entropy.


-- 
Eitan Adler
___
cryptography mailing list
cryptography@randombit.net
http://lists.randombit.net/mailman/listinfo/cryptography


Re: [cryptography] Master Password

2012-05-30 Thread Jonathan Thornburg
On Wed, 30 May 2012, Maarten Billemont wrote:
 I'm currently considering asking the user for their full name and
 using that as a salt in the scrypt operation.  [[...]]

Digressing slightly from crypto, note that full name is not as tidy
or troublefree a concept as one might think.  It's instructive to look
through
  
http://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/
(read all the comments!) and think about how your user interface will
try to work around issues #2,3,4,7,11,12,13,24,25,26,27, and 35.  A
particularly tricky point is how to handle characters which aren't on
the standard virtual/physical keyboard (actually that's issue #11 in
that list).

Given that you want invariance under discard/loose hardware, buy new
hardware, you don't just need to canonicalize the name (which is already
tricky in an i18n context), you need to do so in a hardware-independent
way.  What happens when the user upgrades from hardware/software which
doesn't have native support for some of the letters in her name (e.g.,
she's German and her family name is L o-umlaut f f l e r, but her
smartphone is a US model which only groks [a-zA-Z] as letters) to new
hardware/software which *does* grok the letters (e.g., she buys a new
smartphone in Germany, which *does* have o-umlaut on its virtual|physical
keyboard)?

ciao,

-- 
-- Jonathan Thornburg [remove -animal to reply] 
jth...@astro.indiana-zebra.edu
   Dept of Astronomy  IUCSS, Indiana University, Bloomington, Indiana, USA
   Washing one's hands of the conflict between the powerful and the
powerless means to side with the powerful, not to be neutral.
  -- quote by Freire / poster by Oxfam
___
cryptography mailing list
cryptography@randombit.net
http://lists.randombit.net/mailman/listinfo/cryptography


Re: [cryptography] Master Password

2012-05-30 Thread Maarten Billemont
I'm going to attempt to summarize/rehash the comments I've found have a 
significant relevance to the quality of the algorithm.  I've had a lot of great 
feedback, which I'm tremendously thankful for.  My apologies in advance for any 
important aspects that any of you have highlighted if I forget to include them 
here.

[1] There's no way to recover or change the master password without forcing the 
user to set new passwords for each of their sites.
[2] It's difficult for a user to come up with a master password that's high in 
entropy.
[3] Master passwords with weak entropy put the user at a significant risk of 
having all their passwords stolen.
[4] The algorithm can do with some tweaking and cleaning up.  (eg. no overly 
high scrypt output key size, employ HMAC over SHA1, remove scrypt in favor of 
alternative hashing functions, ...)
[5] The terminology in the algorithm's documentation can be improved.
[6] There are quite a few input values that the user is expected to remember in 
order to reproduce his passwords on another clean device.
[7] The master password can be determined relatively easily using a rainbow 
table built for the given scrypt parameter values and either a hash of the 
master key or all of the site name + site password counter + site password type 
+ site output password.
[8] The modulo operator introduces a bias when used to choose an element from a 
range.
[9] The cipher currently employed to construct an output password yields a 
password whose entropy is significantly below the ideal.

Proposals to address these issues:

[1] Instead of creating a hash of the master password, create a salt value on 
the device using a pseudo-number generator.  Use that salt to computer the 
master key.  This allows changing the Master Password, and protects against 
rainbow tables, but turns Master Password into a stateful solution.
[2] I currently propose sentences.  This may yet be insufficient.
[3] Known issue.  The master password must be decent.
[4] Parameter values can be optimized such as scrypt parameters, ciphers, etc.  
We can also change things like use proven HMAC over SHA-1 where relevant.
[5] The vocabulary must be fixed or amended.
[6] Most of the current input values have good defaults which will not 
generally be changed by users.  For those that users to change, the user can 
fairly easily recover from forgetting the parameter value by means of some 
trail-and-error.
[7] A salt must be added when scrypt is used or afterwards, when the master key 
is used.  This salt should be high enough in entropy that it makes it 
infeasible to construct a rainbow table for each salt value.
[8] Consider replacing the modulo operator or fixing its bias.
[9] Perhaps the cipher can be improved upon.  Additional cipher candidates 
might help.  Ideally, the solution does not sacrifice usability.

If I've missed any issues, highlight them for me.  I believe the algorithm can 
be fixed by introducing a salt, ideally one that's memorable to the user, and 
its initial goal preserved.  Presumably, it will never be quite as secure as 
using the output of dd (| base64) as a password.

smime.p7s
Description: S/MIME cryptographic signature
___
cryptography mailing list
cryptography@randombit.net
http://lists.randombit.net/mailman/listinfo/cryptography


Re: [cryptography] Master Password

2012-05-29 Thread Marsh Ray

On 05/29/2012 06:01 PM, Maarten Billemont wrote:

Dear readers,

I've written an iOS / Mac application whose goal it is to produce
passwords for any purpose.  I was really hoping for the opportunity
to receive some critical feedback or review of the algorithm
used[1].




[1] http://masterpassword.lyndir.com/algorithm.html

Master Password Master Password So how does it work?

The theory behind Master Password is simple. The user remembers a
single, secure password. The user only ever uses that password to log
into the Master Password application. This master password is then
used as a seed to generate a different password based on the name of


How about just used to generate? Usually the term seed is used when 
it completely determines a pseudorandom sequence rather than being 
combined with other information.


[edit: OIC, the master password generates a seed based on the name of 
the site. This was unclear.]



the site to generate a password for.

The result is that each master password generates its own unique
sequence of passwords for any site name. Since the only input data is
the master password and the site name (along with a password counter,
see below), there is no need for any kind of storage to recreate a
site's password. All that's needed is the correct master password and
the correct algorithm implementation.


How do you know how far along in the password sequence you are?


What that does for you is make
it almost impossible to lose your passwords.


To lose a password has multiple meanings.
How about much easier to create and remember strong passwords?


It also makes it nearly
impossible for hackers to steal your online identity.


I don't think this claim is well substantiated. :-)


The Algorithm

In short, the algorithm is comprised of the following steps:

Determining the master key Determining the cipher seed Encoding a
user-friendly password

The Master Password

The user chooses a single master password, preferably sufficiently
long to harden against brute-force attacks. Master Password
recommends absurd two or three-word sentences as they're easily
remembered and generally sufficiently high in entropy.


I believe this is not good advice because two and three word sentences 
will not provide enough entropy.


Wikipedia cites a Harvard/Google study indicating English has 1,022,000 
words. 
https://en.wikipedia.org/wiki/Number_of_words_in_English#Number_of_words_in_English


Users have great difficulty picking original passwords, even security 
pros tend to be overconfident in the uniqueness of password schemes.

For example:
http://www.imperva.com/docs/WP_Consumer_Password_Worst_Practices.pdf
the 5000 most popular passwords [] are used by a share of 20% of the 
users.


So even if users chose two completely independent passwords from the 
RockYou statistical distribution to form their sentence it would mean 
that more than 4% of users will use one of the top 25 million two 
password combinations.


But that's when users are asked to pick unguessable passwords, asking 
for very short proper sentences is certain to make this worse. Many 
proper two word sentences in English are of the form noun-verb. But 
English has far more nouns than verbs. For example, 
https://en.wiktionary.org/ has 174871 entries for nouns and only 24931 
under verbs. Adverb-verb sentences will be popular too, but there are 
only 12982 adverbs listed.


In fact, a recent study tested two word minimum passphrases against an 
Amazon system which returned a different error depending on whether or 
not the passphrase was in use. They estimate that a dictionary of 
20,656 phrases covers the choices of about 1.13% of users and concluded 
that by our metrics, even 5-word phrases would be highly insecure 
against offline attacks, with fewer than 30 bits of work compromising 
over half of users.

http://www.lightbluetouchpaper.org/2012/03/07/some-evidence-on-multi-word-passphrases/
http://www.cl.cam.ac.uk/~jcb82/doc/BS12-USEC-passphrase_linguistics.pdf


The application then creates a scrypt key derivative from the user's
password. This process takes quite a bit of processing time and
memory.


Specifics would be very interesting here. How much time on a modern CPU?


This step exists to make brute-force attempts at guessing the
master password from a given output password far more difficult, to
practically infeasible, even for otherwise vulnerable password
strings.

key   = scrypt( P, S, N, r, p, dkLen ) where P = master password
(UTF-8) S = empty N = 16384 r = 8 p = 1 dkLen = 64

The result is a 64-byte key derived from the user's master password.


64 bytes is overkill. Later you use SHA-1 in such a way that there's no 
point in having more than 160 bits here. For human chosen passphrases, 
even the standard 128 bits is very generous.



This key will be fed into the rest of the algorithm to produce output
passwords that are as private to the user as his master password is.



Combining The