On 06/27/2011 06:30 PM, Sampo Syreeni wrote:
> On 2011-06-20, Marsh Ray wrote:
>> I once looked up the Unicode algorithm for some basic "case
>> insensitive" string comparison... 40 pages!
> Isn't that precisely why e.g. Peter Gutmann once wrote against the
> canonicalization (in the Unicode context, "normalization") that ISO
> derived crypto protocols do, in favour of the "bytes are bytes" approach
> that PGP/GPG takes?
Yes, but in most actual systems the strings are going to get handled.
It's more a question of whether or not your protocol specification
defines the format it's expecting.
Humans tend not to define text very precisely, and computers don't work
with it directly anyway; they only work with encoded representations of
text as character data. Even a simple accented character in a word or
name can be represented in several different ways.
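For example (a quick Python sketch, not tied to any particular protocol):
"é" can arrive precomposed as U+00E9, or decomposed as "e" followed by
U+0301 COMBINING ACUTE ACCENT. The two render identically but are
different code point sequences and different UTF-8 octets until you
normalize:

    import unicodedata

    precomposed = "\u00e9"    # é as one code point (NFC form)
    decomposed = "e\u0301"    # e followed by a combining acute accent (NFD form)

    print(precomposed == decomposed)    # False: different code point sequences
    print(precomposed.encode("utf-8"))  # b'\xc3\xa9'
    print(decomposed.encode("utf-8"))   # b'e\xcc\x81'

    # After normalizing both to the same form, they compare equal.
    print(unicodedata.normalize("NFC", decomposed) == precomposed)  # True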
Many devs (particularly Unixers :-) in the US, AU, and NZ have gotten
away with the "7-bit ASCII" assumption for a long time, but most of the
rest of the world has to deal with locales, code pages, and multi-byte
encodings. That seems to be how older IETF protocol specs often got
away without a rigorous treatment of character data encoding issues.
(I suspect one factor in the English-speaking world's lead in developing
20th century computers and protocols is that we could get by with one of
the smallest character sets.)
Let's say you're writing a piece of code like:
if (username == "root")
{
    // avoid doing something insecure with root privs
}
The logic of this example is probably broken in important ways but the
point remains: sometimes we need to compare usernames for equality in
contexts that have security implications. You can only claim "bytes are
bytes" up until the point that the customer says they have a directory
server which compares usernames "case insensitively".
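Just to give a flavor of what "case insensitively" can mean once you
leave ASCII, here's a Python sketch. The dir_equal() rule below is my
guess at one plausible matching rule; a real directory server may do
something subtly different:

    import unicodedata

    def dir_equal(a, b):
        # One plausible reading of "case insensitive": compatibility-normalize,
        # then apply Unicode case folding.
        fold = lambda s: unicodedata.normalize("NFKC", s).casefold()
        return fold(a) == fold(b)

    print("root" == "Root")           # False: plain codepoint/byte comparison
    print(dir_equal("root", "Root"))  # True
    # Fullwidth "ｒｏｏｔ" (U+FF52 U+FF4F U+FF4F U+FF54) also folds to "root".
    print(dir_equal("root", "\uff52\uff4f\uff4f\uff54"))  # True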
For most things "verbatim binary" is the right choice. However, a
password or pass phrase is specifically character data which is the
result of a user input method.
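Concretely (a sketch; the choice of SHA-256 and UTF-8 here is purely for
illustration): hash the "same" pass phrase as delivered by two different
input methods and the digests differ, so doing crypto directly on the
raw bytes quietly ties authentication to the user's platform.

    import hashlib, unicodedata

    typed_on_host_a = "caf\u00e9"    # one input method delivers precomposed é
    typed_on_host_b = "cafe\u0301"   # another delivers e + combining accent

    digest = lambda s: hashlib.sha256(s.encode("utf-8")).hexdigest()
    print(digest(typed_on_host_a) == digest(typed_on_host_b))  # False

    # Normalizing (here to NFC) before hashing restores agreement.
    nfc = lambda s: unicodedata.normalize("NFC", s)
    print(digest(nfc(typed_on_host_a)) == digest(nfc(typed_on_host_b)))  # True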
> If you want to do crypto, just do crypto on the bits/bytes. If you
> really have to, you can tag the intended format for forensic purposes
> and sign your intent. But don't meddle with your given bits.
> Canonicalization/normalization is simply too hard to do right or even to
> analyse to have much place in protocol design.
Consider RADIUS.
The first RFC (http://tools.ietf.org/html/rfc2058#section-5.2)
says nothing about the encoding of the character data in the password
field; it just treats it as a series of octets. So what do you do when
implementing RADIUS on an OS that hands user input to your application
encoded as UTF-16LE? If you "don't meddle with your given bits" and
just pass them on to the protocol layer, the result is almost guaranteed
to be non-interoperable.
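In practice an implementation ends up doing something like the following
sketch; the choice of UTF-8 (and of NFC) is my assumption here, nothing
in RFC 2058 requires it:

    import unicodedata

    def password_octets_for_radius(raw_utf16le: bytes) -> bytes:
        # The OS hands us UTF-16LE; the wire wants *some* octet sequence.
        # Decoding, normalizing and re-encoding is already "meddling with
        # the bits", but passing UTF-16LE octets straight through would not
        # interoperate with a peer that chose UTF-8 or a local code page.
        text = raw_utf16le.decode("utf-16-le")
        return unicodedata.normalize("NFC", text).encode("utf-8")

    print(password_octets_for_radius("pässword".encode("utf-16-le")))
    # b'p\xc3\xa4ssword'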
Later RFCs (http://tools.ietf.org/html/rfc2865)
added, in most places, "It is recommended that the message contain
UTF-8 encoded 10646 characters." I think this is a really practical
middle ground. Interestingly, it doesn't say this for the password
field, likely because the authors figured it would break some existing
underspecified behavior.
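The password hiding mechanism itself (RFC 2865 section 5.2, essentially
unchanged from RFC 2058) underlines the point: it operates on opaque
octets and never says what encoding produced them. Roughly, as a sketch
of that construction:

    import hashlib

    def hide_user_password(password_octets, secret, request_authenticator):
        # RFC 2865 section 5.2: pad with NULs to a multiple of 16 octets, then
        #   c(1) = p(1) XOR MD5(secret + Request Authenticator)
        #   c(i) = p(i) XOR MD5(secret + c(i-1))
        # Nothing here knows or cares what character encoding the client used.
        padded = password_octets + b"\x00" * (-len(password_octets) % 16)
        out, prev = b"", request_authenticator
        for i in range(0, len(padded), 16):
            pad = hashlib.md5(secret + prev).digest()
            block = bytes(x ^ y for x, y in zip(padded[i:i + 16], pad))
            out += block
            prev = block
        return out

Whatever octets the client put in are exactly what the server recovers,
so the two ends have to agree out of band on what those octets meant.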
So exactly which characters are allowed in passwords and how are they to
be represented for interoperable RADIUS implementations? I have no idea,
and I help maintain one!
Consequently, we can hardly blame users for not using special characters
in their passwords.
- Marsh