On 05/30/2012 04:06 AM, Maarten Billemont wrote:
First of all, thanks for your time and very valuable feedback.
On 30 May 2012, at 07:20, Marsh Ray wrote:
On 05/29/2012 06:01 PM, Maarten Billemont wrote:
Initially, my recommendation for a master password was to use a
sufficiently-random 12-character string.
I added the proposal for
"paranoid" users to incorporate non-latin glyphs into the password.
The big issue here is usability: Ideally, you want to lock the
application down and require the master password for every usage of
it. That means entering the master password needs to be sufficiently
painless. At the same time, it's vital that users do not forget
their master password, or the whole scheme falls apart; there is no
master password recovery scenario. These two key elements, combined
with reading about things like passfault's attempts at determining
password strengths, led me to think absurd word-based sentences are
a best fit for sufficiently-high entropy memorable and easy-to-enter
pass phrases. I may need to re-evaluate this opinion based on your
references. Perhaps you have proposals that might better address
these requirements?
No one knows how to generate human-friendly passphrases that
consistently meet the standard of 128 bit key strength.
The best scheme I've been able to come up with for my own use is to generate
a unique password for every site with:
dd if=/dev/urandom bs=18 count=1 | base64
And then to write them down on blank business card paper stock. This way
I can conveniently carry them or store them securely.
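For readers without dd at hand, a rough Python equivalent of the command
above might look like the sketch below. The function name
generate_site_password is made up for illustration. 18 random bytes are
144 bits, and because 18 is a multiple of 3 the base64 output is exactly
24 characters with no "=" padding.

```python
import base64
import secrets


def generate_site_password(num_bytes: int = 18) -> str:
    """Return num_bytes of OS randomness, base64-encoded (hypothetical helper)."""
    # secrets.token_bytes draws from the operating system's CSPRNG,
    # which plays the role of /dev/urandom in the dd command.
    raw = secrets.token_bytes(num_bytes)
    return base64.b64encode(raw).decode("ascii")


if __name__ == "__main__":
    print(generate_site_password())
```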
Sadly, converting the data security problem to a physical security
problem is for me an improvement. If I frequently traveled in an area
where my physical security was reduced I might make a different decision.
https://passfault.appspot.com/password_strength.html
Let's tell users to give all their passwords to an unrelated 3rd party
website. Hmm, what could possibly go wrong?
The purpose of this "salt" is indeed not to thwart dictionary
attacks, it exists solely to shake up the input data to the hash and
produce an entirely new hash on the user's demand. I was unfamiliar
with your definition of salt, and I'm certainly willing to change my
terminology. Have you a proposition of what this "password counter"
might better be called?
I would try to throw it out of the design entirely. If users need to rev
the password for a site, use "example.com 2" for the site name. You
could provide UI to assist with this but it's needless complexity for
the algorithm.
The site name is a UTF-8 decoded byte string, as mentioned in the
text above, but I will repeat it inline as well, to avoid possible
ambiguity. The key and salt can indeed contain NULs, but then again,
they can contain anything. Are you hinting that they might well be
simply concatenated without delimiter?
The rule is simple: for every possible unique combination of k, s, and
sitename you need to be able to repeatably generate a unique binary
string. The easiest way to do this is to show that given the binary, you
can unambiguously get back to the original inputs.
Things that tend to trip programmers up when they go to implement this are:
* delimiting variable length inputs
* character data encoding issues, e.g., "ASCII" vs "UNICODE"
* other character data issues such as case folding and unicode
normalization forms
* cross-platform user input of special characters
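One way to satisfy the rule above is sketched here, under assumptions
that are not from the original design: each variable-length field gets a
4-byte length prefix, the counter is a fixed-width 32-bit integer, and
the site name is NFC-normalized before UTF-8 encoding. The names
encode_inputs and decode_inputs are hypothetical. The decoder exists only
to show that the binary string maps back to one set of inputs (up to
Unicode normalization of the site name). Case folding is deliberately
left out.

```python
import struct
import unicodedata


def encode_inputs(key: bytes, counter: int, site_name: str) -> bytes:
    """Serialize (key, counter, site_name) so distinct inputs never collide."""
    # Normalize, then encode, so the same visible name gives the same bytes.
    site = unicodedata.normalize("NFC", site_name).encode("utf-8")
    out = b""
    for field in (key, site):
        # A 4-byte big-endian length prefix delimits each variable-length field,
        # so embedded NUL bytes are harmless.
        out += struct.pack(">I", len(field)) + field
    # The counter is fixed width, so it needs no length prefix.
    return out + struct.pack(">I", counter)


def decode_inputs(blob: bytes):
    """Recover (key, counter, site_name) from the output of encode_inputs."""
    fields = []
    pos = 0
    for _ in range(2):
        (n,) = struct.unpack_from(">I", blob, pos)
        pos += 4
        fields.append(blob[pos:pos + n])
        pos += n
    (counter,) = struct.unpack_from(">I", blob, pos)
    return fields[0], counter, fields[1].decode("utf-8")
```

With plain concatenation, key "ab" plus site "c" and key "a" plus site
"bc" would produce the same bytes; with the length prefixes they do not.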
If you're referring to the possibility of length extension to produce
a password for a different site from a given password, ignoring that
the resulting seed from this hash operation is never actually wholly
known by a site, I believe I've dodged that by putting the site name
in front of the other elements and additionally delimiting it with a
NUL, though I suppose that with some luck, the "salt" may yet be
extended? Note, however, that it is a fixed-length numeric value.
I'm unsure that this can be meaningfully exploited by length
extension.
When we're the least bit unsure about whether something could be
vulnerable, the best plan is usually to fall back on the tried and true
conservative methods.
My current standard favorite in the category of key derivation functions is:
PBKDF2< HMAC< SHA-2-512/256 > >
It's very conservative, widely used and reviewed, has a tunable work
factor, is designed to accept a password input, and generates an
arbitrary-length key.
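As a minimal sketch of that construction, Python's standard library
exposes PBKDF2 over HMAC as hashlib.pbkdf2_hmac. The function name
derive_key, the iteration count and the 32-byte output length are
placeholders chosen for illustration, not recommendations from this
thread; the iteration count is the tunable work factor.

```python
import hashlib


def derive_key(passphrase: str, salt: bytes,
               iterations: int = 200_000, length: int = 32) -> bytes:
    """PBKDF2 with HMAC-SHA-512 (hypothetical wrapper, illustrative defaults)."""
    return hashlib.pbkdf2_hmac(
        "sha512",                    # underlying hash used inside HMAC
        passphrase.encode("utf-8"),  # password input
        salt,                        # per-user or per-site salt
        iterations,                  # work factor: raise it as hardware gets faster
        dklen=length,                # arbitrary-length output key
    )


if __name__ == "__main__":
    print(derive_key("correct horse", b"example.com").hex())
```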
I certainly can't deny that HMAC does seem like a better fit. It's
just that it also seems rather unnecessary.
From one view point it does resemble cargo-cult superstition. From
another view point it's easier to prove that something is not weak when
it has more ideal properties. The length extension property of SHA-1 is
not ideal, so SHA-1 is harder to think about. But if you have a secret
key available that you were already planning to input you can use HMAC
which does not have this undesirable property and is thus easier to
think about.
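To illustrate, keying the hash through HMAC instead of hashing a
concatenation takes only a couple of lines with Python's hmac module.
This is a sketch, not the scheme under review: the name site_seed is
hypothetical and SHA-256 is an arbitrary choice for the example.

```python
import hashlib
import hmac


def site_seed(master_key: bytes, site_name: str) -> bytes:
    """HMAC the site name under the secret key (hypothetical helper)."""
    # The secret key goes in as the HMAC key, not as part of the message,
    # so the hash's length extension property does not carry over to the output.
    return hmac.new(master_key, site_name.encode("utf-8"), hashlib.sha256).digest()


if __name__ == "__main__":
    print(site_seed(b"secret key bytes", "example.com").hex())
```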
Often, they exist purely as a side-effect of bad password
handling such as storing the clear-text passwords in a database.
I don't think you've substantiated that last sentence.
I'm unsure what you're looking for here. I could add that databases
often impose length or data type limitations on their content. I
could add that code which improperly merges strings of different
contexts can cause passwords that contain certain characters to
result in runtime syntax errors. But I believe it's perhaps a little
off-scope. Is this the kind of substantiation you would like to see
added or were you referring to something else?
There are people who will disagree with that statement. If you're not
wanting to get into a debate about it and back it up with facts, you
should probably just leave it out.
When looking at Wiktionary, I found the definition of cipher to be:
(cryptography) A cryptographic system using an algorithm that
converts letters or sequences of bits into ciphertext.
While ciphertext is described as: Encoded text, text that is
unreadable.
This led me to believe that the conversion of hash bytes into a
different form that makes the hash unreadable was a suitable match
for this term. Your "template" proposal isn't a bad one either, and
I certainly want to avoid using terms that can potentially be
misunderstood.
There's a state to the southeast of me where many people refer to
reading and writing as "ciphering". For example, someone said "Yep, that
little Junior's gettin reel good at his cipherin." I kind of thought the
state was populated by cryptographers until someone explained this to me.
The point being, we should probably try to avoid using terms in ways
that conflict with usage in a specific domain, even if there are other
definitions where it does make sense.
This was pointed out to me before. As I understand it, the result is
that my entropy is slightly reduced. I will evaluate the issue and
consider a better way of choosing a number from a range, but I'm not
yet convinced the loss in entropy is in fact, significant. Either
way, if the solution is trivial, there's no harm in doing the right
thing.
It's usually easier to do it the right way than to prove that the
badness introduced by the wrong way is acceptable.
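A small sketch of what "the right way" can look like when a byte is
mapped onto a range of n choices. Reducing a byte modulo n favours the
low values whenever 256 is not a multiple of n; rejection sampling skips
the bytes that cause the skew. Both function names are hypothetical, the
code assumes 1 <= n <= 256, and it assumes spare input bytes are
available to replace rejected ones.

```python
def biased_pick(byte: int, n: int) -> int:
    """Naive mapping: low results are slightly more likely if 256 % n != 0."""
    return byte % n


def unbiased_pick(random_bytes, n: int) -> int:
    """Rejection sampling: skip bytes that fall in the uneven tail."""
    limit = 256 - (256 % n)  # largest multiple of n that is <= 256
    for b in random_bytes:
        if b < limit:
            return b % n
    raise ValueError("ran out of input bytes")


if __name__ == "__main__":
    counts = [0] * 10
    for b in range(256):
        counts[biased_pick(b, 10)] += 1
    print(counts)  # results 0..5 occur 26 times each, results 6..9 only 25 times
```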
It just happens to be terribly difficult for a user to manually copy
a password that is comprised of an evenly distributed set of
characters, but much easier when those characters can form "groups"
in the user's mind. The cipher part is therefore purely a usability
feature.
You may have by now noticed a common theme here. Entropy content is
inversely proportional to human usability.
The question is will Fred D. User ever be able to select and use
password(s) that tomorrow's botnet herder will be unable to crack? A
large work factor can help the defender to some extent but the
attacker's capabilities seem to be growing much faster.
It is also very efficient to iterate all these possibilities.
[...]
hash of a "medium" type password it would take him, on average, 15
milliseconds to brute force it.
[...]
The same attacker would need 17.1 hours on average (34.2 hours max)
to crack any "long" format password from a salted SHA-1 hash.
This is an inherent flaw of the password authentication method.
But you've made it dramatically worse, particularly for security
conscious users who *do* know how to pick strong passwords.
a) Your "medium" passwords can never provide more than 29 bits of
security no matter how strong the passphrase. Your "long" passwords can
never provide more than 49 bits.
b) The compromise of the passphrase compromises all user passwords.
Sure, there are lots of users sharing passwords across sites. But most
users who would use your systems are currently using a different
password for their online banking than they are for everything else.
Master Password isn't here to replace or fix passwords. Better
alternatives exist such as public key authentication, hardware OTP
tokens or smartcards. The problem is the unfortunate fact that
password authentication is the universal standard. Master Password
merely aspires to produce reasonably decent passwords for a user's
sites in a way that doesn't compromise their global identity when one
or more of the user's passwords is compromised.
This reasoning is all too common in the data security industry.
"X exists in common usage. It's too difficult to fix it or replace it,
so let's make it more convenient instead."
I'm not quite convinced that things would be worse. Master
Password's algorithm was always designed with the knowledge that
individual passwords can easily be compromised, and in fact was
conceived as a way of addressing this problem specifically.
The master password is no different than an individual password except
that it formalizes putting all the eggs in one basket.
If we compare the Master Password solution being used by a large
user-base against that same user-base using their own individual
password management solutions, I believe we'll find that since users
easily give up on security and simply reuse or rehash a small subset
of passwords for all accounts, an attacker that obtains the user's
credentials for one site will have a much easier time at figuring out
the user's credentials for their other sites than they would if they
first had to brute-force that user's master password.
Note that an attacker gains very little from discovering a user's
password for a site, and very little more by discovering that
multiple users are using the same password. As far as I understand,
there's no way to reverse the site's password and determine the
master password from it, and knowing that multiple users have
used the same master password does not aid in this attempt either.
If two users independently chose the same passphrase, it's a very good
indication that it's one that's bruteforceable by the attacker. The more
users who choose the passphrase, the easier it will be to bruteforce it.
If your system had been in use by a significant percentage of the users
of rockyou.com before the breach we would probably have learned more
security conscious users' online banking passwords than we did without it.
Knowing that 2, 5 or 10 users are using the same master password
might make it more profitable a target for brute-forcing, but I'm
currently still convinced that an attacker will opt for a user-chosen
password before he'll have a go at brute-forcing Master Password.
If your system were popular and I was an attacker who'd hacked a hashed
password database, I would probably spend the 15 ms per password to see
if it were a lyndir "medium" password.
If I were an attacker targeting a specific user of your system, I'd find
out what websites they use and hack the least secure one, or entice them
to register with my fake website. When I've talked to professional
pentesters they tell me that a phishing email with a link to a site that
looks just like the employee's benefits website works most of the time.
I don't even need to ask for their super valuable password, I just need
them to generate a new one long enough to represent all the entropy in
their master passphrase. So my phishing site would require one of the
lyndir "long" format passwords. If the targeted user had selected a five
word passphrase, according to Bonneau's estimate I have a 50% chance of
being able to recover it with a precomputed rainbow table of 2^30 entries.
You said it takes your Mac about 0.1 s to compute this scrypt, so my
work to precompute this table was about 29826 machine hours. This is
very reasonable, http://www.freerainbowtables.com/en/tableprogress/ has
3847 active machines and has generated tables requiring far more
computation than this. At retail prices, this computation will cost
about $5548 @ $0.186/hr for a dedicated Amazon EC2 high CPU unit. Botnet
rental prices are reportedly far, far smaller.
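The arithmetic behind those figures can be reproduced in a few lines.
The inputs are the numbers quoted above (2^30 entries, 0.1 s per scrypt
computation, $0.186 per machine hour, 3847 machines). The wall-clock
figure is an extra back-of-the-envelope step that assumes every one of
those machines runs at the speed of the Mac in question, which is only
an assumption.

```python
ENTRIES = 2 ** 30          # rainbow table size
SECONDS_PER_HASH = 0.1     # quoted time for one scrypt computation
DOLLARS_PER_HOUR = 0.186   # quoted EC2 high CPU price
MACHINES = 3847            # quoted number of active machines

machine_hours = ENTRIES * SECONDS_PER_HASH / 3600
cost = machine_hours * DOLLARS_PER_HOUR
wall_clock_hours = machine_hours / MACHINES  # assumes equally fast machines

print(f"{machine_hours:.0f} machine hours, about ${cost:.0f}")
print(f"roughly {wall_clock_hours:.1f} hours across {MACHINES} machines")
```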
With this I can recover most users' master passphrase and consequently
learn all of their account passwords. But that's when the system
requests five word passphrases. For users with master passphrases weaker
than that, it's game over much much sooner.
- Marsh
_______________________________________________
cryptography mailing list
http://lists.randombit.net/mailman/listinfo/cryptography