On May 11, 2013, at 4:10 AM, Claude Paroz <[email protected]> wrote:

> On Saturday, May 11, 2013 07:59:18 UTC+2, Donald Stufft wrote:
> I went looking for BCrypt + Django + Python3 today and this is what I found: 
> 
> The current recommended solution to bcrypt + Django is using py-bcrypt which 
> is not compatible with Python3. 
> 
> Someone else has taken py-bcrypt and created py3k-bcrypt however they made 
> the decision to enforce having str instances sent in for password/salt 
> instead of bytes which makes it incompatible with Django's encode function on 
> the BCrypt password hashers[1]. 
> 
> So I created a library simply called `bcrypt`[2] which has the same API as 
> py-bcrypt but it functions on Python 2.6+ and 3.x (as well as PyPy 2.0 since 
> it's implemented via CFFI). When testing this against Python3 + Django I 
> discovered that Django isn't properly encoding/decoding when talking to the 
> external library so I made a patch that causes it to always send bytes to the 
> external library, and strs to other parts of Django. That can be found here: 
> https://github.com/django/django/pull/1052 
> 
> My bcrypt library is obviously new code but it's a small wrapper over 
> crypt_blowfish from OpenWall[3] and I'm wondering (assuming no one objects to 
> me merging my Patch) if it would make sense to switch the documentation away 
> from suggesting py-bcrypt and have it suggest bcrypt instead since it will 
> allow BCrypt to function on Python3 as well. 
> 
> Thoughts? 
> 
> [1] I believe this is inherently wrong as bcrypt operates on bytes not on 
> unicode characters, and in order for this to work py3k-bcrypt must be 
> assuming a character set it can encode(). 
> [2] Found at https://crate.io/packages/bcrypt/ or 
> https://github.com/dstufft/bcrypt 
> [3] Found at http://www.openwall.com/crypt/ 
> 
> 
> Hi Donald,
> 
> There are several approaches to string handling in Python 3, for both input 
> and output. Personally, I generally favor unicode strings whenever 
> possible. See for example the Python hashlib behaviour for digest() and 
> hexdigest(): digest() returns a bytestring as it can return the full range 
> of bytes (0-255), while hexdigest() returns a string as the result is 
> guaranteed to be ASCII-safe.
> 
> Similarly, I would have returned a string (unicode) from hashpw() as long as 
> it is guaranteed to be ASCII-safe. As for inputs, I think it is easy enough 
> to accept both bytestrings and strings, test them and encode('utf-8') when 
> needed.
> I recognize that it looks a bit odd on Python 2 to receive unicode when you 
> fed bytes to a method.
> 
> I'm not sure there is a "right" way, it's all about design and choice. Feel 
> free to ignore me :-)

As far as the return value from hashpw goes it's bytes primarily because the 
inputs to hashpw are expected to be bytes.

As far as the input values for hashpw go, it accepts only bytes because 
bcrypt as an algorithm functions only on streams of bytes, not on unicode 
characters. Not every character in the world can be represented in utf-8 and I 
believe it's better for a library to require bytes (you can see that hashlib on 
Python 3 does this) than to make a possibly erroneous guess.
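
To make the hashlib comparison concrete, here is a minimal sketch using only 
the Python 3 standard library. The helper names (digest_types, 
hash_str_directly, try_utf8) are made up for this example and are not part of 
hashlib, bcrypt, or Django. It shows that hashlib refuses str input, that 
digest() and hexdigest() return different types, and two ways an implicit 
encode() could go wrong.

```python
import hashlib


def digest_types():
    """Return the types produced by digest() and hexdigest()."""
    h = hashlib.sha256(b"secret")
    return type(h.digest()), type(h.hexdigest())


def hash_str_directly(text):
    """Try to hash a str without encoding it first (hypothetical helper)."""
    try:
        hashlib.sha256(text)
    except TypeError:
        # On Python 3, hashlib only accepts bytes-like objects.
        return "TypeError"
    return "ok"


def try_utf8(text):
    """Encode text as UTF-8, or return None if it cannot be encoded."""
    try:
        return text.encode("utf-8")
    except UnicodeEncodeError:
        return None


print(digest_types())
print(hash_str_directly("secret"))

# A str holding a lone surrogate cannot be encoded as UTF-8 at all.
print(try_utf8("\udc80"))

# The same text yields different bytes (and so different hashes) depending
# on which encoding the library guesses.
print(try_utf8("café") == "café".encode("latin-1"))
```

With a bytes-only API the caller makes the encoding choice explicitly, for 
example by passing password.encode("utf-8") themselves.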

> 
> Claude
> 
> 


-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
