Re: BCrypt + Python3
On May 18, 2013, at 5:15 AM, Aymeric Augustin wrote:

> I'm less enthusiastic about the change adding `force_text(data)`. It actually
> works around bcrypt.hashpw returning an unexpected type in these
> circumstances. But, if that's how bcrypt.hashpw works, that's fine.

Well, the Python library returns bytes (and accepts bytes for the salt) because fundamentally bcrypt operates on bytes, and the C library reflects that. The force_text would need to happen either in Django or in the Python library, and I believe it's more appropriate for it to happen in Django.

- Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
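A minimal sketch of the boundary described above, not Django's actual code: text is encoded before it reaches a bytes-only hashing routine, and the ASCII-safe result is decoded before it goes back to the caller. `to_bytes`, `to_text`, `bytes_only_hashpw`, and `encode_password` are hypothetical stand-ins (loosely for `force_bytes`, `force_text`, `bcrypt.hashpw`, and the hasher's encode step), and the stand-in uses SHA-256 from the standard library purely so the example runs without extra packages.

```python
import base64
import hashlib


def to_bytes(value, encoding="utf-8"):
    # Hypothetical stand-in for force_bytes: leave bytes alone, encode text.
    return value if isinstance(value, bytes) else value.encode(encoding)


def to_text(value, encoding="utf-8"):
    # Hypothetical stand-in for force_text: leave text alone, decode bytes.
    return value if isinstance(value, str) else value.decode(encoding)


def bytes_only_hashpw(password, salt):
    # Stand-in for a bytes-in, bytes-out routine such as bcrypt.hashpw.
    # This is NOT bcrypt and is not suitable for storing real passwords.
    if not isinstance(password, bytes) or not isinstance(salt, bytes):
        raise TypeError("password and salt must be bytes")
    digest = hashlib.sha256(salt + password).digest()
    return salt + b"$" + base64.b64encode(digest)


def encode_password(password, salt):
    # Bytes go to the hashing routine; text comes back to the application.
    hashed = bytes_only_hashpw(to_bytes(password), to_bytes(salt))
    return to_text(hashed)
```

With this split, the hashing routine never has to guess an encoding, and the rest of the application only ever sees text.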
Re: BCrypt + Python3
Apologies for answering so late. I see the change discussed here was already committed. The change itself is fine, essentially because it's limited to the bcrypt password hasher, but I'd like to bring some perspective to parts of this discussion.

Overall, I strongly advocate consistency in the Python ecosystem, and the standard library sets the, err, standard. Here's how it deals with this situation in Python 3.

    >>> import hashlib

1) Hash functions must reject str objects because the encoding isn't guaranteed:

    >>> hashlib.md5('foo')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: Unicode-objects must be encoded before hashing

2) Digests must be returned as bytes (quite obviously):

    >>> hashlib.md5(b'foo').digest()
    b'\xac\xbd\x18\xdbL\xc2\xf8\\\xed\xefeO\xcc\xc4\xa4\xd8'

3) Hex digests must be returned as str:

    >>> hashlib.md5(b'foo').hexdigest()
    'acbd18db4cc2f85cedef654fccc4a4d8'

Adapting this example to Python 2 is left as an exercise :)

As a consequence, I agree with Claude's recommendation to use unicode strings whenever possible (e.g. for hex digests). However, I believe that a simple hash function mustn't accept unicode strings. Wrappers (say, a make_password_hash function) must encode unicode strings to bytes before passing them to hash functions.

Regarding Donald's pull request, `data = force_bytes(data)` makes sense, because the hasher must be fed bytes. There's already a `password = force_bytes(password)` just above.

I'm less enthusiastic about the change adding `force_text(data)`. It actually works around bcrypt.hashpw returning an unexpected type in these circumstances. But, if that's how bcrypt.hashpw works, that's fine.

Donald, we've discussed this before and I know you have strong feelings against the design of the standard library in this regard. Still, Python is the environment we're living in, and we shouldn't fight it.

--
Aymeric.
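The division of labour argued for above could be sketched roughly as follows. `strict_md5_hex` and the body of `make_password_hash` are illustrative assumptions (the message only names the wrapper), and MD5 appears solely because the examples above use it; it is not a suitable password hash.

```python
import hashlib


def strict_md5_hex(data):
    # Mirrors hashlib's rule: only bytes are accepted, text is rejected.
    if isinstance(data, str):
        raise TypeError("text must be encoded before hashing")
    # The hex digest is ASCII-safe, so it is returned as str.
    return hashlib.md5(data).hexdigest()


def make_password_hash(password, encoding="utf-8"):
    # Hypothetical wrapper: the encoding is chosen explicitly here,
    # then the bytes-only hash function does the rest.
    return strict_md5_hex(password.encode(encoding))
```

Calling `make_password_hash('foo')` yields the same hex digest as `hashlib.md5(b'foo').hexdigest()` above, while `strict_md5_hex('foo')` raises TypeError.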
-- You received this message because you are subscribed to the Google Groups "Django developers" group. To unsubscribe from this group and stop receiving emails from it, send an email to django-developers+unsubscr...@googlegroups.com. To post to this group, send email to django-developers@googlegroups.com. Visit this group at http://groups.google.com/group/django-developers?hl=en. For more options, visit https://groups.google.com/groups/opt_out.
Re: BCrypt + Python3
On May 11, 2013, at 4:10 AM, Claude Paroz wrote:

> Similarly I would have returned a string (unicode) from hashpw() as long as it
> is guaranteed to be ASCII-safe. As for inputs, I think it is easy enough to
> accept both bytestrings and strings, test them and encode('utf-8') when
> needed.

As far as the return value from hashpw goes, it's bytes primarily because the inputs to hashpw are expected to be bytes.

As for the input values, hashpw accepts only bytes because bcrypt as an algorithm operates only on streams of bytes, not on unicode characters. A library cannot know which encoding the caller intends (UTF-8 is only one possible choice), and I believe it's better for a library to require bytes (you can see that hashlib on Python 3 does this) than to make a possibly erroneous guess.

- Donald Stufft
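To illustrate why guessing an encoding can go wrong, here is a small sketch using only the standard library. The sample password and the two encodings are arbitrary choices for illustration, and SHA-256 stands in for any byte-oriented hash.

```python
import hashlib

password = "caf\u00e9"  # the text "café"

as_utf8 = password.encode("utf-8")      # b'caf\xc3\xa9'
as_latin1 = password.encode("latin-1")  # b'caf\xe9'

# Same text, different bytes, therefore different digests.
same_bytes = as_utf8 == as_latin1
same_digest = hashlib.sha256(as_utf8).digest() == hashlib.sha256(as_latin1).digest()
print(same_bytes, same_digest)  # False False
```

If a library silently picked one encoding and the application had stored hashes using the other, verification would fail for any password containing non-ASCII characters.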
Re: BCrypt + Python3
On Saturday, May 11, 2013 at 07:59:18 UTC+2, Donald Stufft wrote:

> Someone else has taken py-bcrypt and created py3k-bcrypt; however, they made
> the decision to enforce having str instances sent in for password/salt
> instead of bytes, which makes it incompatible with Django's encode function
> on the BCrypt password hashers[1].
>
> Thoughts?
>
> [1] I believe this is inherently wrong as bcrypt operates on bytes not on
> unicode characters, and in order for this to work py3k-bcrypt must be
> assuming a character set it can encode().

Hi Donald,

There are several approaches to string handling in Python 3, whether for input or output. Personally, I generally favor unicode strings whenever possible. See for example the Python hashlib behaviour for digest() and hexdigest(): digest() returns a bytestring as it can return the full range of bytes (0-255), while hexdigest() returns a string as the result is guaranteed to be ASCII-safe.

Similarly I would have returned a string (unicode) from hashpw() as long as it is guaranteed to be ASCII-safe. As for inputs, I think it is easy enough to accept both bytestrings and strings, test them and encode('utf-8') when needed.

I recognize that it looks a bit odd on Python 2 to receive unicode when you fed bytes to a method.

I'm not sure there is a "right" way, it's all about design and choice. Feel free to ignore me :-)

Claude
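The lenient approach suggested above might look like the following sketch. `lenient_hexdigest` is a hypothetical name, and SHA-256 stands in for the real hashing routine so the example needs only the standard library.

```python
import hashlib


def lenient_hexdigest(data):
    # Accept both text and bytes; encode text as UTF-8 when needed.
    if isinstance(data, str):
        data = data.encode("utf-8")
    # hexdigest() is ASCII-safe, so returning str is always possible.
    return hashlib.sha256(data).hexdigest()
```

The trade-off is that the UTF-8 choice is baked into the function, which is convenient for callers but is exactly the kind of guess a bytes-only API avoids.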
BCrypt + Python3
I went looking for BCrypt + Django + Python3 today and this is what I found:

The current recommended solution to bcrypt + Django is using py-bcrypt, which is not compatible with Python3.

Someone else has taken py-bcrypt and created py3k-bcrypt; however, they made the decision to enforce having str instances sent in for password/salt instead of bytes, which makes it incompatible with Django's encode function on the BCrypt password hashers[1].

So I created a library simply called `bcrypt`[2] which has the same API as py-bcrypt but functions on Python 2.6+ and 3.x (as well as PyPy 2.0, since it's implemented via CFFI). When testing this against Python3 + Django I discovered that Django isn't properly encoding/decoding when talking to the external library, so I made a patch that causes it to always send bytes to the external library, and str to other parts of Django. That can be found here: https://github.com/django/django/pull/1052

My bcrypt library is obviously new code, but it's a small wrapper over crypt_blowfish from OpenWall[3], and I'm wondering (assuming no one objects to me merging my patch) if it would make sense to switch the documentation away from suggesting py-bcrypt and have it suggest bcrypt instead, since it will allow BCrypt to function on Python3 as well.

Thoughts?

[1] I believe this is inherently wrong as bcrypt operates on bytes not on unicode characters, and in order for this to work py3k-bcrypt must be assuming a character set it can encode().
[2] Found at https://crate.io/packages/bcrypt/ or https://github.com/dstufft/bcrypt
[3] Found at http://www.openwall.com/crypt/

- Donald Stufft