On May 18, 2013, at 5:15 AM, Aymeric Augustin <[email protected]> wrote:
> Apologies for answering so late. I see the change discussed here was already
> committed. The change itself is fine — essentially because it's limited to
> the bcrypt password hasher — but I'd like to bring some perspective to parts
> of this discussion.
>
> Overall, I strongly advocate consistency in the Python ecosystem, and the
> standard library sets the, err, standard. Here's how it deals with this
> situation in Python 3.
>
> >>> import hashlib
>
> 1) Hash functions must reject str objects because the encoding isn't
> guaranteed:
>
> >>> hashlib.md5('foo')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: Unicode-objects must be encoded before hashing
>
> 2) Digests must be returned as bytes (quite obviously):
>
> >>> hashlib.md5(b'foo').digest()
> b'\xac\xbd\x18\xdbL\xc2\xf8\\\xed\xefeO\xcc\xc4\xa4\xd8'
>
> 3) Hex digests must be returned as str:
>
> >>> hashlib.md5(b'foo').hexdigest()
> 'acbd18db4cc2f85cedef654fccc4a4d8'
>
> Adapting this example to Python 2 is left as an exercise :)
>
> As a consequence, I agree with Claude's recommendation to use unicode strings
> whenever possible (e.g. for hex digests). However, I believe that a simple
> hash function mustn't accept unicode strings. Wrappers (say, a
> make_password_hash function) must encode unicode strings to bytes before
> passing them to hash functions.
>
> Regarding Donald's pull request, `data = force_bytes(data)` makes sense,
> because the hasher must be fed bytes. There's already a `password =
> force_bytes(password)` just above.
>
> I'm less enthusiastic about the change adding `force_text(data)`. It actually
> works around bcrypt.hashpw returning an unexpected type in these
> circumstances. But, if that's how bcrypt.hashpw works, that's fine.
Well, the Python library returns bytes (and accepts bytes for the salt) because
bcrypt fundamentally operates on bytes, and the C library reflects that. The
force_text call has to happen either in Django or in the Python library, and I
believe it's more appropriate for it to happen in Django.
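
To make the division of labour concrete, here is a rough sketch of the
"bytes in, text out" wrapper pattern being discussed. Everything in it is an
assumption for illustration: `to_bytes` and `to_text` are hypothetical
stand-ins for Django's force_bytes and force_text, `bytes_only_hash` is a
stand-in for a bytes-only hasher like bcrypt.hashpw (it uses PBKDF2 from the
standard library so it runs without extra packages; it is not bcrypt), and
`encode_password` is a made-up name, not the actual Django hasher method.

```python
import base64
import hashlib


def to_bytes(value, encoding="utf-8"):
    """Hypothetical stand-in for Django's force_bytes."""
    return value if isinstance(value, bytes) else str(value).encode(encoding)


def to_text(value, encoding="utf-8"):
    """Hypothetical stand-in for Django's force_text."""
    return value.decode(encoding) if isinstance(value, bytes) else str(value)


def bytes_only_hash(password, salt):
    """Stand-in for a bytes-in, bytes-out hasher such as bcrypt.hashpw.

    Assumption: PBKDF2-SHA256 is used here only so the sketch is runnable
    with the standard library; it is NOT bcrypt.
    """
    if not isinstance(password, bytes) or not isinstance(salt, bytes):
        raise TypeError("password and salt must be bytes")
    digest = hashlib.pbkdf2_hmac("sha256", password, salt, 1000)
    # Return ASCII-safe bytes, similar in spirit to what bcrypt hands back.
    return base64.b64encode(digest)


def encode_password(password, salt):
    """Wrapper: accept text or bytes, return text suitable for storage."""
    hashed = bytes_only_hash(to_bytes(password), to_bytes(salt))
    return to_text(hashed)
```

The low-level function stays strict about bytes, and the conversions in both
directions live in the wrapper, which is where I'd argue they belong.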
>
> Donald, we've discussed this before and I know you have strong feelings
> against the design of the standard library in this regard. Still, Python is
> the environment we're living in, and we shouldn't fight it.
>
> --
> Aymeric.
-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
