Re: [OAUTH-WG] Understanding the reasoning for Base64

Ben Laurie Sat, 03 Jul 2010 09:27:21 -0700

Let's not lose sight of the underlying reason to choose base64:
avoiding the issue of canonicalisation. If you use an encoding that
various software layers can choose to decode and operate on, then you
open the canonicalisation can of worms. The point of using base64 is
so the blob you hand around is just that: a blob, which no-one can
process except the guy who actually wants to process it.
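
A minimal sketch of the canonicalisation problem described above, assuming a hypothetical intermediate layer that parses and re-serialises JSON. The key, the payload and the helper name `sign` are illustrative and not taken from any proposal in this thread.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"secret"  # hypothetical shared key, for illustration only


def sign(data: bytes) -> str:
    """Return the hex HMAC-SHA256 of the exact bytes given."""
    return hmac.new(SECRET, data, hashlib.sha256).hexdigest()


# The sender serialises a payload and signs those exact bytes.
original = b'{"b": 1, "a": 2}'
signature = sign(original)

# A layer that understands JSON parses it and writes it back out.
# The meaning is the same, but key order and whitespace have changed.
reserialised = json.dumps(
    json.loads(original), sort_keys=True, separators=(",", ":")
).encode("ascii")
json_survives = hmac.compare_digest(sign(reserialised), signature)

# Wrapped in base64url, the same payload is an opaque blob: a layer
# can pass it along, but has no structure to "helpfully" rewrite.
blob = base64.urlsafe_b64encode(original)
recovered = base64.urlsafe_b64decode(blob)
blob_survives = hmac.compare_digest(sign(recovered), signature)

print(json_survives, blob_survives)
```

The re-serialised JSON no longer matches the signature, while the bytes recovered from the blob still do.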


And a more general plea: can we please stop trimming everything down
to the bare minimum we can get away with? This is equivalent to
maximising fragility, which is a bad idea and has led to several
notable failures of late.

On 3 July 2010 17:13, Naitik Shah <[email protected]> wrote:
>
>
> On Sat, Jul 3, 2010 at 9:02 AM, Dick Hardt <[email protected]> wrote:
>>
>> On 2010-07-02, at 5:04 PM, Paul Tarjan wrote:
>>
>> >>> We don't think base64url will work, because the most common error
>> >>> we'll see is that developers forget the "url" part and just do plain
>> >>> base64, and that's not sufficient because the stock set includes +.
>> >>
>> >> I think forgetting to url-decode is more likely than doing the wrong
>> >> base64 encoding. At least with the wrong base64 encoding, what was done
>> >> wrong is more obvious right away. The + will not be in the string.
>> >
>> > Most web frameworks that I know of urldecode the inputs before they even
>> > hit application code.
>> >
>> >
>> >
>> >>>
>> >>> So it will maybe work, maybe not. Maybe they'll do urlencoding after
>> >>> anyways, since if they are passing this as a query param, or post data,
>> >>> client libraries will take a dict and try to "do the right thing". And we
>> >>> end up with pluses, and we're not quite sure if they should be
>> >>> urldecoded or not.
>> >>
>> >> we won't have pluses
>> >
>> > I think Naitik is saying that accidentally doing base64 and not
>> > base64url will send some '+'s along.
>>
>> if there are '+'s in the token, then it is easy for someone helping to
>> spot the problem. also easy for servers to send back an error message
>> saying, "hey, looks like you are using base64 instead of base64url encoding"
>>
>> ie, it is easy to detect the error -- urlencoding / decoding is hard to
>> detect as an error
>
> The pluses are not guaranteed. They may or may not be there depending on the
> data stream you're encoding. If you don't urlencode the JSON, you'll get a
> "{", if you do it once, you'll get a "%7B", if you do it twice, you'll get a
> "%257B" -- seems easier to detect.
>
>
>>
>> >
>> >
>> >
>> >
>> >> why hex? ... why not base64url?
>> >
>> > It seems to be the encoding format in languages:
>> >
>> > python:
>> >>>> hmac.new(b'secret', b'payload', hashlib.sha256).hexdigest()
>> > 'b82fcb791acec57859b989b430a826488ce2e479fdf92326bd0a2e8375a42ba4'
>> >
>> > php:
>> > print hash_hmac('sha256', 'payload', 'secret');
>> > b82fcb791acec57859b989b430a826488ce2e479fdf92326bd0a2e8375a42ba4
>> >
>> > ruby:
>> >>> HMAC::SHA256.hexdigest('secret', 'payload')
>> > => "b82fcb791acec57859b989b430a826488ce2e479fdf92326bd0a2e8375a42ba4"
>>
>> When I wrote a sample in Perl, it was pretty easy to make it base64url
>> which then provides a consistent encoding.
>
> Did it involve a string replace call? Or a third party library?
>
>
>>
>> >
>> >> I am unclear on what your point is.
>> >>
>> >> The token would be included as one of the headers. This is often
>> >> preferable as it separates the authorization layer (in header) from
>> >> application layer parameters (query string or message body)
>> >
>> > With our proposal, we were focussed on url parameters (hence the choice
>> > of urlencode after it was all put together). I think it makes total sense
>> > to not do the encoding as part of the sig spec, and let the transport choice
>> > dictate which encoding to use.
>>
>> I understand what you are saying. having multiple encodings makes
>> libraries harder, and leads to the issues that motivated base64url over
>> url-encoding
>>
>>
>>
>> >
>> > Therefore, I think we should make the signature:
>> >
>> >    hash + '.' + json string
>> >
>> > And then if you are putting it in a url parameter, you should urlencode
>> > the whole thing. If you are putting it in an HTTP header you should remove
>> > all the "\r" and "\n" in the json output (which are only whitespace as they
>> > aren't allowed inside strings, and most language encoders won't even output
>> > them by default).
>> >
>> > This way, this is a general signature spec, regardless of how it is
>> > being sent. You could send it as a DNS record and do the proper encoding
>> > for that scenario, or carrier pigeon encoded in Navajo, etc.
>> >
>> >
>> >
>> > So to sum up:
>> >
>> > * We'd like the signature first (so you can left split instead of right
>> > split)
>>
>> What are the advantages of left split vs right split?
>
> Built in split function with a limit is more common, which makes the left
> split easier.
>
>
> -Naitik
>
> _______________________________________________
> OAuth mailing list
> [email protected]
> https://www.ietf.org/mailman/listinfo/oauth
>
>
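
For reference, a rough sketch of the two mechanics debated in the quoted thread: base64url versus plain base64, and a signature-first token that can be parsed with a limited left split. This assumes a `signature.payload` layout with both parts base64url-encoded, which is one reading of the options discussed and not an agreed format. The names `b64url`, `b64url_decode`, `make_token` and `parse_token`, the key and the payload are all hypothetical.

```python
import base64
import hashlib
import hmac

SECRET = b"secret"            # hypothetical shared key
PAYLOAD = b'{"user_id": 42}'  # hypothetical JSON payload


def b64url(data: bytes) -> str:
    """base64url without padding: uses '-' and '_' instead of '+' and '/'."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Restore the stripped '=' padding, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def make_token(payload: bytes, secret: bytes) -> str:
    """Build 'signature.payload', signing the encoded payload text."""
    body = b64url(payload)
    sig = b64url(hmac.new(secret, body.encode("ascii"), hashlib.sha256).digest())
    return sig + "." + body


def parse_token(token: str, secret: bytes) -> bytes:
    """Left split with a limit of 1, verify the signature, decode the payload."""
    sig, body = token.split(".", 1)
    expected = b64url(hmac.new(secret, body.encode("ascii"), hashlib.sha256).digest())
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        raise ValueError("signature mismatch")
    return b64url_decode(body)


# Plain base64 can emit '+' and '/', but only for some inputs, which is
# why their absence does not prove that base64url was used.
print(base64.b64encode(b"\xfb\xff"))          # b'+/8='
print(base64.urlsafe_b64encode(b"\xfb\xff"))  # b'-_8='
print(base64.b64encode(b"hello"))             # b'aGVsbG8='

token = make_token(PAYLOAD, SECRET)
print(parse_token(token, SECRET) == PAYLOAD)  # True
```

With the signature first, `token.split(".", 1)` keeps any later "." inside the payload part, so the split still works if the payload is sent as raw JSON rather than base64url.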
