On 05/30/2012 04:06 AM, Maarten Billemont wrote:
First of all, thanks for your time and very valuable feedback.

On 30 May 2012, at 07:20, Marsh Ray wrote:
On 05/29/2012 06:01 PM, Maarten Billemont wrote:

Initially, my recommendation for a master password was to use a
sufficiently-random 12-character string.

I added the proposal for
"paranoid" users to incorporate non-latin glyphs into the password.
The big issue here is usability: Ideally, you want to lock the
application down and require the master password for every usage of
it.  That means entering the master password needs to be sufficiently
painless.  At the same time, it's vital that users do not forget
their master password, or the whole scheme falls apart; there is no
master password recovery scenario.  These two key elements, combined with
reading about things like passfault's attempts at determining
password strengths, lead me to think absurd word-based sentences are
a best fit for sufficiently-high entropy memorable and easy-to-enter
pass phrases.  I may need to re-evaluate this opinion based on your
references.  Perhaps you have proposals that might better address
these requirements?

No one knows how to generate human-friendly passphrases that consistently meet the standard of 128 bit key strength.

The best scheme I've been able to come up with for my own use is to generate a unique password for every site with:

   dd if=/dev/urandom bs=18 count=1 | base64

And then to write them down on blank business card paper stock. This way I can conveniently carry them or store them securely.
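
For readers without dd and base64 at hand, a rough Python equivalent of the command above might look like the sketch below. The function name is made up for illustration. 18 bytes is 144 bits from the OS random source, which base64 encodes as exactly 24 characters with no padding.

```python
import base64
import secrets


def random_site_password(num_bytes: int = 18) -> str:
    """Return num_bytes of OS randomness, base64-encoded (hypothetical helper)."""
    raw = secrets.token_bytes(num_bytes)  # OS CSPRNG, comparable to /dev/urandom
    return base64.b64encode(raw).decode("ascii")


if __name__ == "__main__":
    print(random_site_password())
```

Each call produces a fresh, unrelated password, so nothing here is derived from a master secret.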

Sadly, converting the data security problem to a physical security problem is for me an improvement. If I frequently traveled in an area where my physical security was reduced I might make a different decision.

https://passfault.appspot.com/password_strength.html

Let's tell users to give all their passwords to an unrelated 3rd party website. Hmm, what could possibly go wrong?

The purpose of this "salt" is indeed not to thwart dictionary
attacks, it exists solely to shake up the input data to the hash and
produce an entirely new hash on the user's demand.  I was unfamiliar
with your definition of salt, and I'm certainly willing to change my
terminology.  Have you a proposition of what this "password counter"
might better be called?

I would try to throw it out of the design entirely. If users need to rev the password for a site, use "example.com 2" for the site name. You could provide UI to assist with this but it's needless complexity for the algorithm.

The site name is a UTF-8 decoded byte string, as mentioned in the
text above, but I will repeat it inline as well, to avoid possible
ambiguity.  The key and salt can indeed contain NULs, but then again,
they can contain anything.  Are you hinting that they might well be
simply concatenated without delimiter?

The rule is simple: for every possible unique combination of k, s, and sitename you need to be able to repeatably generate a unique binary string. The easiest way to do this is to show that given the binary, you can unambiguously get back to the original inputs.

Things that tend to trip programmers up when they go to implement this are:
* delimiting variable length inputs
* character data encoding issues, e.g., "ASCII" vs "UNICODE"
* other character data issues such as case folding and unicode normalization forms
* cross-platform user input of special characters
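
One common way to satisfy that rule is to length-prefix every field before concatenating. The sketch below is only an illustration: the helper names, the 4-byte big-endian prefix, and the choice of NFC normalization are assumptions, not anything taken from the Master Password design. It does not address case folding.

```python
import struct
import unicodedata


def encode_fields(*fields: bytes) -> bytes:
    """Prefix each field with its 4-byte big-endian length, then concatenate."""
    return b"".join(struct.pack(">I", len(f)) + f for f in fields)


def decode_fields(blob: bytes) -> list:
    """Invert encode_fields; being able to do this shows the encoding is unambiguous."""
    fields, pos = [], 0
    while pos < len(blob):
        (n,) = struct.unpack_from(">I", blob, pos)
        pos += 4
        if pos + n > len(blob):
            raise ValueError("truncated field")
        fields.append(blob[pos:pos + n])
        pos += n
    return fields


def site_bytes(name: str) -> bytes:
    """Assumed policy: NFC-normalize the site name, then encode it as UTF-8."""
    return unicodedata.normalize("NFC", name).encode("utf-8")


if __name__ == "__main__":
    # Plain concatenation would make these two inputs collide; length prefixes do not.
    print(encode_fields(b"ab", b"c") != encode_fields(b"a", b"bc"))
```

Because every field carries its own length, embedded NULs or empty fields cannot shift bytes from one input into another.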

If you're referring to the possibility of length extension to produce
a password for a different site from a given password, ignoring that
the resulting seed from this hash operation is never actually wholly
known by a site, I believe I've dodged that by putting the site name
in front of the other elements and additionally delimiting it with a
NUL, though I suppose that with some luck, the "salt" may yet be
extended?  Note, however, that it is a fixed-length numeric value.
I'm unsure that this can be meaningfully exploited by length
extension.

When we're the least bit unsure about whether something could be vulnerable, the best plan is usually to fall back on the tried and true conservative methods.

My current standard favorite in the category of key derivation functions is:

        PBKDF2< HMAC< SHA-2-512/256 > >

It's very conservative, widely used and reviewed, has a tunable work factor, is designed to accept a password input, and generates an arbitrary-length key.
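
As a minimal sketch, Python's standard library exposes this construction as hashlib.pbkdf2_hmac. The function name, salt value, iteration count, and output length below are illustrative assumptions only; none of them come from this thread, and the iteration count in particular would need tuning for real use.

```python
import hashlib


def derive_key(passphrase: str, salt: bytes,
               iterations: int = 100_000, length: int = 32) -> bytes:
    """PBKDF2 with HMAC-SHA-512; all parameters here are assumed, not prescribed."""
    return hashlib.pbkdf2_hmac(
        "sha512",                    # underlying hash used inside HMAC
        passphrase.encode("utf-8"),  # password input
        salt,                        # per-user or per-site salt (hypothetical)
        iterations,                  # tunable work factor
        dklen=length,                # arbitrary-length output key
    )


if __name__ == "__main__":
    print(derive_key("correct horse", b"example.com").hex())
```

Raising the iteration count slows the defender and the attacker by the same factor, which is the "tunable work factor" referred to above.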

I certainly can't deny that HMAC does seem like a better fit.  It's
just that it also seems rather unnecessary.

From one view point it does resemble cargo-cult superstition. From another view point it's easier to prove that something is not weak when it has more ideal properties. The length extension property of SHA-1 is not ideal, so SHA-1 is harder to think about. But if you have a secret key available that you were already planning to input you can use HMAC which does not have this undesirable property and is thus easier to think about.
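
To make the comparison concrete, here is a small sketch of keying a hash through HMAC rather than hashing key and message concatenated together. The function name and the choice of SHA-256 are assumptions for illustration; the thread does not specify them.

```python
import hashlib
import hmac


def keyed_digest(key: bytes, message: bytes) -> bytes:
    """HMAC-SHA-256 of message under key (hypothetical helper)."""
    return hmac.new(key, message, hashlib.sha256).digest()


def naive_digest(key: bytes, message: bytes) -> bytes:
    """Plain SHA-256 over key || message: the construction exposed to length extension."""
    return hashlib.sha256(key + message).digest()


if __name__ == "__main__":
    key, msg = b"secret key", b"example.com"
    print(keyed_digest(key, msg).hex())
    print(naive_digest(key, msg).hex())
```

The two functions take the same inputs, but only the HMAC form comes with the property that knowing one output does not let someone compute the output for an extended message.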

Often, they exist purely as a side-effect of bad password
handling such as storing the clear-text passwords in a database.

I don't think you've substantiated that last sentence.

I'm unsure what you're looking for here.  I could add that databases
often impose length or data type limitations on their content.  I
could add that code which improperly merges strings of different
contexts can cause passwords that contain certain characters to
result in runtime syntax errors.  But I believe it's perhaps a little
off-scope.  Is this the kind of substantiation you would like to see
added or were you referring to something else?

There are people who will disagree with that statement. If you're not wanting to get into a debate about it and back it up with facts, you should probably just leave it out.

When looking at Wiktionary, I found the definition of cipher to be:
(cryptography) A cryptographic system using an algorithm that
converts letters or sequences of bits into ciphertext.

While ciphertext is described as: Encoded text, text that is
unreadable.

This led me to believe that the conversion of hash bytes into a
different form that makes the hash unreadable was a suitable match
for this term.  Your "template" proposal isn't a bad one either, and
I certainly want to avoid using terms that can potentially be
misunderstood.

There's a state to the southeast of me where many people refer to reading and writing as "ciphering". For example, someone said "Yep, that little Junior's gettin reel good at his cipherin." I kind of thought the state was populated by cryptographers until someone explained this to me.

The point being, we should probably try to avoid using terms in ways that conflict with usage in a specific domain, even if there are other definitions where it does make sense.

This was pointed out to me before.  As I understand it, the result is
that my entropy is slightly reduced.  I will evaluate the issue and
consider a better way of choosing a number from a range, but I'm not
yet convinced the loss in entropy is in fact, significant.  Either
way, if the solution is trivial, there's no harm in doing the right
thing.

It's usually easier to do it the right way than to prove that the badness introduced by the wrong way is acceptable.
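
For illustration, the usual "right way" to pick a number from a range using random bytes is rejection sampling. The sketch below assumes the bytes arrive one at a time from some iterator, for example successive bytes of a hash output; that interface and both function names are hypothetical.

```python
from typing import Iterator


def biased_index(byte: int, n: int) -> int:
    """The simple approach: byte % n favours small results whenever 256 % n != 0."""
    return byte % n


def unbiased_index(stream: Iterator[int], n: int) -> int:
    """Rejection sampling: discard bytes at or above the largest multiple of n."""
    if not 1 <= n <= 256:
        raise ValueError("n must be between 1 and 256")
    limit = 256 - (256 % n)  # largest multiple of n that fits in one byte
    for byte in stream:
        if byte < limit:
            return byte % n
    raise ValueError("ran out of bytes before finding an acceptable one")


if __name__ == "__main__":
    # With n = 10, bytes 250..255 are rejected, so every result is equally likely.
    print(unbiased_index(iter([255, 251, 7]), 10))
```

With n = 10 the simple modulo maps 26 byte values to each of 0 through 5 but only 25 to each of 6 through 9; rejecting the top six byte values removes that skew at the cost of occasionally consuming an extra byte.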

It just happens to be terribly difficult for a user to manually copy
a password that is comprised of an evenly distributed set of
characters, but much easier when those characters can form "groups"
in the user's mind.  The cipher part is therefore purely a usability
feature.

You may have by now noticed a common theme here. Entropy content is inversely proportional to human usability.

The question is will Fred D. User ever be able to select and use password(s) that tomorrow's botnet herder will be unable to crack? A large work factor can help the defender to some extent but the attacker's capabilities seem to be growing much faster.

It is also very efficient to iterate all these possibilities.
[...]
hash of a "medium" type password it would take him, on average, 15
milliseconds to brute force it.
[...]
The same attacker would need 17.1 hours on average (34.2 hours max)
to crack any "long" format password from a salted SHA-1 hash.

This is an inherent flaw of the password authentication method.

But you've made it dramatically worse, particularly for security conscious users who *do* know how to pick strong passwords.

a) Your "medium" passwords can never provide more than 29 bits of security no matter how strong the passphrase. Your "long" passwords can never provide more than 49 bits.

b) The compromise of the passphrase compromises all user passwords. Sure, there are lots of users sharing passwords across sites. But most users who would use your system are currently using a different password for their online banking than they are for everything else.

Master Password isn't here to replace or fix passwords.  Better
alternatives exist such as public key authentication, hardware OTP
tokens or smartcards.  The problem is the unfortunate fact that
password authentication is the universal standard.  Master Password
merely aspires to produce reasonably decent passwords for a user's
sites in a way that doesn't compromise their global identity when one
or more of the user's passwords is compromised.

This reasoning is all too common in the data security industry.

"X exists in common usage. It's too difficult to fix it or replace it, so let's make it more convenient instead."

I'm not quite convinced that things would be worse.  Master
Password's algorithm was always designed with the knowledge that
individual passwords can easily be compromised, and in fact was
conceived as a way of addressing this problem specifically.

The master password is no different than an individual password except that it formalizes putting all the eggs in one basket.

If we compare the Master Password solution being used by a large
user-base against that same user-base using their own individual
password management solutions, I believe we'll find that since users
easily give up on security and simply reuse or rehash a small subset
of passwords for all accounts, an attacker that obtains the user's
credentials for one site will have a much easier time at figuring out
the user's credentials for their other sites than they would if they
first had to brute-force that user's master password.

Note that an attacker gains very little from discovering a user's
password for a site, and very little more by discovering that
multiple users are using the same password.  As far as I understand,
there's no way to reverse the site's password and determine the
master password from it, and knowing that multiple users have
used the same master password does not aid in this attempt either.

If two users independently chose the same passphrase, it's a very good indication that it's one that's bruteforceable by the attacker. The more users who choose the passphrase, the easier it will be to bruteforce it.

If your system had been in use by a significant percentage of the users of rockyou.com before the breach we would probably have learned more security conscious users' online banking passwords than we did without it.

Knowing that 2, 5 or 10 users are using the same master password
might make it more profitable a target for brute-forcing, but I'm
currently still convinced that an attacker will opt for a user-chosen
password before he'll have a go at brute-forcing Master Password.

If your system were popular and I was an attacker who'd hacked a hashed password database, I would probably spend the 15 ms per password to see if it were a lyndir "medium" password.

If I were an attacker targeting a specific user of your system, I'd find out what websites you use and hack the least secure one, or entice them to register with my fake website. When I've talked to professional pentesters they tell me that a phishing email with a link to a site that looks just like the employee's benefits website works most of the time.

I don't even need to ask for their super valuable password, I just need them to generate a new one long enough to represent all the entropy in their master passphrase. So my phishing site would require one of the lyndir "long" format passwords. If the targeted user had selected a five word passphrase, according to Bonneau's estimate I have a 50% chance of being able to recover it with a precomputed rainbow table of 2^30 entries.

You said it takes your Mac about 0.1 s to compute this scrypt, so my work to precompute this table was about 29826 machine hours. This is very reasonable, http://www.freerainbowtables.com/en/tableprogress/ has 3847 active machines and has generated tables requiring far more computation than this. At retail prices, this computation will cost about $5548 @ $0.186/hr for a dedicated Amazon EC2 high CPU unit. Botnet rental prices are reportedly far, far smaller.
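
The arithmetic behind those figures can be checked with a few lines of Python. The constants are the ones quoted above (2^30 table entries, 0.1 s per scrypt computation, $0.186 per machine hour); the function and variable names are just for illustration.

```python
TABLE_ENTRIES = 2 ** 30    # precomputed table size from the estimate above
SECONDS_PER_ENTRY = 0.1    # reported scrypt time per derivation
PRICE_PER_HOUR = 0.186     # quoted EC2 price in dollars per machine hour


def precompute_cost(entries: int, seconds_each: float, price_per_hour: float):
    """Return (machine hours, dollars) needed to compute every table entry once."""
    hours = entries * seconds_each / 3600
    return hours, hours * price_per_hour


if __name__ == "__main__":
    hours, dollars = precompute_cost(TABLE_ENTRIES, SECONDS_PER_ENTRY, PRICE_PER_HOUR)
    print(f"{hours:.0f} machine hours, about ${dollars:.0f}")
```

This counts one derivation per table entry and ignores any overhead of building rainbow chains, so it is a lower-bound style estimate in the same spirit as the figures in the text.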

With this I can recover most users' master passphrase and consequently learn all of their account passwords. But that's when the system requests five word passphrases. For users with master passphrases weaker than that, it's game over much much sooner.

- Marsh
_______________________________________________
cryptography mailing list
[email protected]
http://lists.randombit.net/mailman/listinfo/cryptography