Re: Security - contrib.auth hashing

2010-12-14 Thread Bret W
On Tuesday, July 20, 2010 2:23:52 PM UTC-4, Craig Younkins wrote:
>
> Maybe. The issue in my mind with bcrypt and scrypt is that they are not 
> validated by NIST or NSA, unlike SHA-2. Blowfish was examined by NIST for 
> the AES competition but to my knowledge its use for hashing has not been. 
> SHA-2 was developed by NSA and is recommended by NIST (
> http://csrc.nist.gov/groups/ST/toolkit/secure_hashing.html).
>
> That being said, I'm asking the opinion of a few other folks at OWASP and 
> trying to get a consensus of 1 sentence to summarize how passwords should be 
> stored. In my mind, this sentence should be "Use a SHA-2 algorithm with a 
> 64-bit random salt and 1000 iterations," but this statement is my own and 
> does not necessarily reflect the views of OWASP. I'll post here 
> with developments.
>

I wanted to follow up on this discussion to see if any further thought had 
been given to using bcrypt.

With the recent Gawker hacking incident, there has been another round of 
discussion happening regarding best practices for securely storing 
credentials, and from the discussions I've seen at Hacker News, those in the 
know are still recommending bcrypt.

I am not a security researcher or a cryptographer, so I don't have much to 
add to this conversation, other than that I want to be sure that Django's 
following the best practices put forth by security professionals.

Backward compatibility is definitely an issue to be addressed, and it's not 
in the scope of this message to do so, but I would like to say that some 
changes are worth ugly fixes. It's obvious that it's not possible to rehash 
passwords that application developers don't have, so it seems likely that 
there's going to need to be a hack to support an old and a new hashing 
scheme for a couple of versions of Django. I believe most developers would 
be accepting of a little interim cruft if it meant a more secure product in 
the long term.
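
A minimal sketch of what such interim dual-scheme support might look like: 
verify against whichever scheme the stored string names, then rehash with 
the preferred scheme at login, when the plaintext is briefly available. It 
assumes the algorithm$salt$hash storage layout mentioned elsewhere in this 
thread. HASHERS, PREFERRED, make_password and check_password are 
hypothetical names for illustration, SHA-256 merely stands in for whatever 
new scheme is chosen, and none of this is Django's actual code.

```python
import binascii
import hashlib
import hmac
import os

# Hypothetical registry of supported schemes; "sha256" is a placeholder
# for whichever new scheme is eventually adopted.
HASHERS = {
    "sha1": hashlib.sha1,
    "sha256": hashlib.sha256,
}
PREFERRED = "sha256"


def make_password(raw_password, algorithm=PREFERRED):
    """Return 'algorithm$salt$hexdigest' for a new password."""
    salt = binascii.hexlify(os.urandom(8)).decode("ascii")  # 64-bit salt
    data = (salt + raw_password).encode("utf-8")
    return "%s$%s$%s" % (algorithm, salt, HASHERS[algorithm](data).hexdigest())


def check_password(raw_password, stored):
    """Return (is_valid, replacement).

    replacement is a re-encoded string when the stored hash used an old
    scheme and the password matched; otherwise it is None.
    """
    algorithm, salt, digest = stored.split("$", 2)
    data = (salt + raw_password).encode("utf-8")
    candidate = HASHERS[algorithm](data).hexdigest()
    if not hmac.compare_digest(candidate, digest):
        return False, None
    if algorithm != PREFERRED:
        # Old scheme, correct password: upgrade while we have the plaintext.
        return True, make_password(raw_password)
    return True, None
```

The caller would save the replacement string whenever one is returned, so 
accounts migrate gradually as users log in and the old scheme can be 
dropped a few releases later.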

While we're on the subject of security, have the security-related pieces of 
Django ever undergone a security audit? I remember Simon W. asking for a 
code review of his signed-cookie implementation 
(https://groups.google.com/forum/#topic/django-developers/KX6LIgBvfzo), and 
I now see that Jacob didn't feel that a security audit was worthwhile, given 
what the DSF can afford and the implications for peer review. If 
contrib.auth hasn't been reviewed by a security expert, I'd like to suggest 
that someone investigate the possibility of having it reviewed. Security, 
and specifically cryptography, is one area of computing that requires tons 
of expertise. Even with many eyeballs, it's hard to be certain that a 
salient detail wasn't overlooked. That being said, I'm not close to the 
framework development process, and I don't know what's been done in the past 
or who's been consulted.

Bret

-- 
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-develop...@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.



Re: Security - contrib.auth hashing

2010-07-20 Thread Craig Younkins
On Tue, Jul 20, 2010 at 12:09 PM, Jacob Kaplan-Moss wrote:

> On Tue, Jul 20, 2010 at 8:41 AM, Craig Younkins wrote:
> > I'm very glad you don't have MD5 as the default. SHA-1 (currently employed)
> > is acceptable for now, but at this point there are theoretical attacks that
> > can find collisions in time that is "within the realm of computational
> > possibility". It is recommended that SHA-2 be used for new applications.
> > See http://www.pythonsecurity.org/wiki/hashing/
>
> Actually, if we're being picky, it'd probably be best to use a
> high-cost hashing algorithm like bcrypt or scrypt. SHA (all flavors)
> is designed to be fairly fast, so by using MD5 or SHA we're
> essentially helping brute-force attacks take less time.
>
> http://code.djangoproject.com/ticket/5600 and
> http://code.djangoproject.com/ticket/5787 tracked this request; we
> eventually determined the trade-off in supporting multiple versions
> wasn't worth the extra feature. I might be convinced to change my mind
> now, though, but only if there's a good answer to the
> backwards-compatibility issues.
>

Maybe. The issue in my mind with bcrypt and scrypt is that they are not
validated by NIST or NSA, unlike SHA-2. Blowfish was examined by NIST for
the AES competition but to my knowledge its use for hashing has not been.
SHA-2 was developed by NSA and is recommended by NIST (
http://csrc.nist.gov/groups/ST/toolkit/secure_hashing.html).

That being said, I'm asking the opinion of a few other folks at OWASP and
trying to get a consensus of 1 sentence to summarize how passwords should be
stored. In my mind, this sentence should be "Use a SHA-2 algorithm with a
64-bit random salt and 1000 iterations," but this statement is my own and
does not necessarily reflect the views of OWASP. I'll post here
with developments.
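
For concreteness, a minimal sketch of that sentence in Python. It uses 
PBKDF2-HMAC-SHA256 as one standard way to iterate a SHA-2 hash; that choice 
is an assumption, since the sentence above does not name a construction. 
hashlib.pbkdf2_hmac needs Python 2.7.8+ or 3.4+, and hash_password is a 
hypothetical name.

```python
import hashlib
import os


def hash_password(password, salt=None, iterations=1000):
    """Derive a key from password with a 64-bit salt and 1000 iterations."""
    if salt is None:
        salt = os.urandom(8)  # 8 bytes = 64 bits, from the OS entropy source
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return salt, derived


# Store both values; to verify, recompute with the stored salt and compare.
salt, derived = hash_password("correct horse")
```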


> > The hashing scheme uses random.random(). The random module uses the
> > deterministic Mersenne Twister algorithm to generate random numbers. This is
> > fine for most purposes, but it is not suitable for cryptographic purposes.
> > It is much better to create a random.SystemRandom instance to get random
> > data from the OS that is suitable for cryptography.
>
> The problem with SystemRandom is buried in the docs: it's "not
> available on all systems."
> (http://docs.python.org/library/random.html#random.SystemRandom). I'm
> open to a solution here, but we'd need to be very careful to determine
> if SystemRandom is available.
>

SystemRandom should be available on Linux, Solaris, Mac OS
X, NetBSD, OpenBSD, Tru64 UNIX 5.1B, AIX 5.2, and HP-UX 11i v2, and at least
Windows 2000 on up. It's unclear to me if CryptGenRandom was in the API for
95 or 98.

In any case, this is to generate the salt. There is no reason I can think of
why we can't default to SystemRandom and fall back to regular random module
methods if it raises NotImplementedError.
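
A rough sketch of that fallback, assuming the salt is emitted as hex 
characters. get_salt_rng and make_salt are hypothetical names, not existing 
Django functions.

```python
import random


def get_salt_rng():
    """Prefer the OS entropy source; fall back to the default generator."""
    try:
        rng = random.SystemRandom()
        rng.random()  # raises NotImplementedError if no OS source exists
        return rng
    except NotImplementedError:
        # Module-level functions backed by the Mersenne Twister.
        return random


def make_salt(bits=64):
    """Return a salt of the given size as zero-padded hex characters."""
    rng = get_salt_rng()
    return "%0*x" % (bits // 4, rng.getrandbits(bits))
```

Probing with one call to random() at startup means the availability check 
happens once, and the rest of the code does not care which generator it 
was handed.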


> > The most concerning thing in the hashing algorithm is that a salt of only 5
> > hexadecimal characters is used. This is just over a million possible salts
> > (20 bits). We'd really like to see something closer to our recommendation of
> > 64 bits.
>
> I'm not sure why we're only using 5 characters. Anyone remember?
>
> Could you open a ticket to track this issue?
>

http://code.djangoproject.com/ticket/13969


> > Is there a measure to prevent users from having dollar signs in their
> > passwords? This would mess up the concatenated string that is stored in the
> > database.
>
> Unless I'm really dense, I think it doesn't matter -- we hash
> passwords before they get stored, so we can't "mess up" the stored
> string: it's always just [0-9A-F]. Right?
>

You're right! I'm the one that's dense and looking too quickly. :-)




Re: Security - contrib.auth hashing

2010-07-20 Thread Jacob Kaplan-Moss
Hey Craig --

Thanks for the notes - this is good stuff!

On Tue, Jul 20, 2010 at 8:41 AM, Craig Younkins wrote:
> I'm very glad you don't have MD5 as the default. SHA-1 (currently employed)
> is acceptable for now, but at this point there are theoretical attacks that
> can find collisions in time that is "within the realm of computational
> possibility". It is recommended that SHA-2 be used for new applications.
> See http://www.pythonsecurity.org/wiki/hashing/

Actually, if we're being picky, it'd probably be best to use a
high-cost hashing algorithm like bcrypt or scrypt. SHA (all flavors)
is designed to be fairly fast, so by using MD5 or SHA we're
essentially helping brute-force attacks take less time.

http://code.djangoproject.com/ticket/5600 and
http://code.djangoproject.com/ticket/5787 tracked this request; we
eventually determined the trade-off in supporting multiple versions
wasn't worth the extra feature. I might be convinced to change my mind
now, though, but only if there's a good answer to the
backwards-compatibility issues.

> The hashing scheme uses random.random(). The random module uses the
> deterministic Mersenne Twister algorithm to generate random numbers. This is
> fine for most purposes, but it is not suitable for cryptographic purposes.
> It is much better to create a random.SystemRandom instance to get random
> data from the OS that is suitable for cryptography.

The problem with SystemRandom is buried in the docs: it's "not
available on all systems."
(http://docs.python.org/library/random.html#random.SystemRandom). I'm
open to a solution here, but we'd need to be very careful to determine
if SystemRandom is available.

> The most concerning thing in the hashing algorithm is that a salt of only 5
> hexadecimal characters is used. This is just over a million possible salts
> (20 bits). We'd really like to see something closer to our recommendation of
> 64 bits.

I'm not sure why we're only using 5 characters. Anyone remember?

Could you open a ticket to track this issue?

> Is there a measure to prevent users from having dollar signs in their
> passwords? This would mess up the concatenated string that is stored in the
> database.

Unless I'm really dense, I think it doesn't matter -- we hash
passwords before they get stored, so we can't "mess up" the stored
string: it's always just [0-9A-F]. Right?
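
A quick illustration of the point, assuming the stored value has the form 
algorithm$salt$hexdigest. The encode helper and the sample salt are 
hypothetical; note that Python's hexdigest() emits lowercase hex digits.

```python
import hashlib


def encode(algorithm, salt, raw_password):
    """Build 'algorithm$salt$hexdigest'; only the digest depends on the password."""
    data = (salt + raw_password).encode("utf-8")
    digest = hashlib.new(algorithm, data).hexdigest()
    return "$".join([algorithm, salt, digest])


# A password full of dollar signs still yields exactly three fields,
# because the raw password never appears in the stored string.
stored = encode("sha1", "a1b2c", "pa$$word$")
algorithm, salt, digest = stored.split("$")
```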

> You might consider hashing with multiple rounds. By applying the hash
> function many times, you essentially lengthen the hashing/password
> verification stage. Since users spend very little time in this stage, it
> will have minimal impact on them. Crackers spend nearly 100% of their time
> doing this, so it significantly slows them down.
> See http://www.pythonsecurity.org/wiki/hashing/#multiple-rounds

Yup -- or, as said above, use s/bcrypt. I *would* like to revisit slow
hashing algorithms -- maybe if we can't make s/bcrypt be a good option
we could switch to multiple rounds.

Jacob




Security - contrib.auth hashing

2010-07-20 Thread Craig Younkins
Please note this email does not include or indicate a specific, immediately
viable flaw.

I'm doing a brief analysis of the contrib.auth system:
http://www.pythonsecurity.org/wiki/django/#authentication . I have a couple
of notes that I'd like to share with you.

   - I'm very glad you don't have MD5 as the default. SHA-1 (currently
   employed) is acceptable for now, but at this point there are theoretical
   attacks that can find collisions in time that is "within the realm of
   computational possibility". It is recommended that SHA-2 be used for new
   applications. See http://www.pythonsecurity.org/wiki/hashing/
   - The hashing scheme uses random.random(). The random module uses the
   deterministic Mersenne Twister algorithm to generate random numbers. This is
   fine for most purposes, but it is not suitable for cryptographic purposes.
   It is much better to create a random.SystemRandom instance to get random
   data from the OS that is suitable for cryptography.
   - The most concerning thing in the hashing algorithm is that a salt of
   only 5 hexadecimal characters is used. This is just over a million possible
   salts (20 bits). We'd really like to see something closer to our
   recommendation of 64 bits.
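
The salt arithmetic above can be checked in a few lines. This is plain 
arithmetic; the variable names are only for illustration.

```python
# Each hexadecimal character carries 4 bits.
salt_chars = 5
salt_bits = salt_chars * 4           # 20 bits
possible_salts = 16 ** salt_chars    # 1,048,576: "just over a million"

# Size of the recommended 64-bit salt space, for comparison.
recommended_salts = 2 ** 64
ratio = recommended_salts // possible_salts  # 2**44 times as many salts
```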

Other tidbits:

   - Is there a measure to prevent users from having dollar signs in their
   passwords? This would mess up the concatenated string that is stored in the
   database.
   - You might consider hashing with multiple rounds. By applying the hash
   function many times, you essentially lengthen the hashing/password
   verification stage. Since users spend very little time in this stage, it
   will have minimal impact on them. Crackers spend nearly 100% of their time
   doing this, so it significantly slows them down. See
   http://www.pythonsecurity.org/wiki/hashing/#multiple-rounds


*Craig Younkins*
