Re: [websec] [decade] Digest: Adventures in encoding

Stephen Farrell Fri, 07 Oct 2011 01:54:54 -0700


Hi Phill,


OAuth [1] uses the "application/x-www-form-urlencoded" format, as defined
by [W3C.REC-html401-19991224], all over the place to solve basically
this problem, but in the context of HTTP URLs, which has to be worse
than for a new URI scheme.

Why not do the same here?

S.

[1] http://tools.ietf.org/html/draft-ietf-oauth-v2-22#section-4.1.1

On 10/07/2011 03:49 AM, Phillip Hallam-Baker wrote:

Following on the base16/base64 discussion, I have written some code (see
end) and have some ni digests in various flavors of encoding.

My conclusion is that we should split the difference and do base32 instead.
I think the arguments are actually quite compelling.

This is only an encoding issue, so my priority is choosing an encoding that
requires the fewest systems to be touched. Choosing base32 allows the
resolution scheme to be supported by unmodified Apache and IIS.

The additional code burden for ni/digest implementers to write base32 is
trivial.


There is also the option of supporting more than one encoding, but Base64
URIs are only slightly shorter.


*Base16*
ni:sha-256;B77B635B2832BF95E8F2935963F134A41F4F11C0BEDD6CED2C5E551F288D9980

Problem - very long even without separators.


*Base32:*
ni:sha-256;W55WGWZIGK7ZL2HSSNMWH4JUUQPU6EOAX3OWZ3JMLZKR6KENTGAA

Advantage: somewhat more compact than Base16, and can be read out over a
telephone or across a terminal room (try that with Base64). Can be printed
as a static reference in a journal or equivalent.


*Base32s:*
ni:sha-256;W55WGW-ZIGK7Z-L2HSSNM-WH4JUU-QPU6EOA-X3OWZ3-JMLZKR-6KENTGAA

This is my own invention, basically Base32 with separators added for
readability.


*Base64:*
ni:sha-256;t3tjWygyv5Xo8pNZY/E0pB9PEcC+3WztLF5VHyiNmYA=

This is the traditional base64 encoding.

Process disadvantage: You know that someone is bound to challenge the
forward slash just as the document gets to last call. I don't see any
advantage in risking a DISCUSS.

Practical disadvantage: Gets messed up when converted to a well-known URL.
Consider the following:

ni:sha-256;t3tjWygyv5Xo8pNZY/E0pB9PEcC+3WztLF5VHyiNmYA=?http=example.com

This would map to:

http://example.com/.well-known/ni/sha-256/t3tjWygyv5Xo8pNZY/E0pB9PEcC+3WztLF5VHyiNmYA=

The forward slash here is not a hierarchy indicator, which is bad. Worse
still, this would mean that support for ni digest objects requires a code
plug-in for Apache, IIS, etc., rather than just mapping
.well-known/ni/sha-256 to the directory with the digest values.
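
To make the hierarchy problem concrete, here is a rough C# sketch, written
as top-level statements and assuming .NET's System.Uri. The helper name
SegmentCount is made up for illustration. It counts the path segments that
result from each form of the URL:

```csharp
using System;

// Number of path segments System.Uri reports for a URL (the root "/" included).
int SegmentCount(string url) {
    return new Uri(url).Segments.Length;
}

string prefix = "http://example.com/.well-known/ni/sha-256/";

// Plain base64: the '/' inside the digest splits it into two segments.
Console.WriteLine(SegmentCount(prefix + "t3tjWygyv5Xo8pNZY/E0pB9PEcC+3WztLF5VHyiNmYA="));  // prints 6

// base64url: the digest stays a single path segment.
Console.WriteLine(SegmentCount(prefix + "t3tjWygyv5Xo8pNZY_E0pB9PEcC-3WztLF5VHyiNmYA"));   // prints 5
```

A static directory mapping only works when the digest occupies exactly one
segment, which is the case for the second URL only.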


*Base64url:*

ni:sha-256;t3tjWygyv5Xo8pNZY_E0pB9PEcC-3WztLF5VHyiNmYA

This is essentially the same as base64 for size etc. The only disadvantage
is that the encoder has to be written from scratch (mine took 20 minutes).

The advantage over plain base64 is that there is no code required to support
the .well-known version of the locator scheme on the server at all. Just
some admin stuff. Also the URL is completely compatible with URI process
lore.
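
Where the platform already has a plain base64 encoder, one possible
shortcut is to post-process its output instead of writing the encoder from
scratch. A minimal sketch, assuming .NET's Convert.ToBase64String and
written as top-level statements; the helper name ToBase64Url is hypothetical:

```csharp
using System;

// Hypothetical helper: unpadded base64url built on the standard base64 encoder.
string ToBase64Url(byte[] data) {
    return Convert.ToBase64String(data)
        .TrimEnd('=')          // drop the padding
        .Replace('+', '-')     // '+' becomes '-'
        .Replace('/', '_');    // '/' becomes '_'
}

// First three bytes of the example digest (B7 7B 63).
Console.WriteLine(ToBase64Url(new byte[] { 0xB7, 0x7B, 0x63 }));   // prints t3tj
```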


*Summary:*

Arguments can be made for each one of these schemes. I think the argument
for Base16 is the weakest since Base32 can do everything that Base16 can do.
Neither is implemented as a standard library function on my platform.

If we went for Base32, I would argue for allowing some form of readability
separator. Base32s is much easier to transcribe than plain Base32.

Base64/Base64url offer the best compression in a practical form. For most
applications a truncated digest will be acceptable. The main disadvantage of
the Base64 schemes is that they are case sensitive. This will play merry
heck with case-insensitive but case-preserving file systems such as those
on Windows.
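
A small illustration of the case problem, as a sketch assuming .NET's
Convert.ToBase64String. The two byte arrays are arbitrary values chosen so
that their encodings differ only in case:

```csharp
using System;

byte[] first  = { 0x00, 0x00, 0x00 };
byte[] second = { 0x69, 0xA6, 0x9A };

string a = Convert.ToBase64String(first);    // "AAAA"
string b = Convert.ToBase64String(second);   // "aaaa"

// Two different values whose names are equal once case is ignored,
// which is how a case-insensitive file system would compare them.
Console.WriteLine(a == b);                                                    // prints False
Console.WriteLine(string.Equals(a, b, StringComparison.OrdinalIgnoreCase));   // prints True
```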

I think we should knock out base16 from consideration as base32 does it
better.

Since it is fairly easy to write a filter that strips out the separators, I
would argue for allowing separators at any point in the identifier but that
the canonical form is with the separators stripped out and this is what is
used to create the URL form. So

ni:sha-256;W55WGW-ZIGK7Z-L2HSSNM-WH4JUU-QPU6EOA-X3OWZ3-JMLZKR-6KENTGAA

would map to

http://example.com/.well-known/ni/sha-256/W55WGWZIGK7ZL2HSSNMWH4JUUQPU6EOAX3OWZ3JMLZKR6KENTGAA

This preserves the criterion that Apache, IIS etc. can be configured to
resolve these identifiers without new code.
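
The separator-stripping filter and the mapping to the URL form could look
roughly like the following C# sketch, written as top-level statements. The
helper name ToWellKnownUrl, the host parameter and the short value in the
usage line are all assumptions made for illustration:

```csharp
using System;

// Hypothetical helper: map "ni:<alg>;<value>" to a .well-known URL on host.
string ToWellKnownUrl(string ni, string host) {
    if (!ni.StartsWith("ni:", StringComparison.Ordinal))
        throw new ArgumentException("not an ni identifier");
    int semi = ni.IndexOf(';');
    if (semi < 0)
        throw new ArgumentException("missing ';' after the algorithm name");
    string alg = ni.Substring(3, semi - 3);
    // Strip separators from the value only: "sha-256" keeps its hyphen.
    string value = ni.Substring(semi + 1).Replace("-", "");
    return "http://" + host + "/.well-known/ni/" + alg + "/" + value;
}

// Placeholder value, not a real digest.
Console.WriteLine(ToWellKnownUrl("ni:sha-256;MFRGGZ-DFMZTW", "example.com"));
// prints http://example.com/.well-known/ni/sha-256/MFRGGZDFMZTW
```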


Taking out Base16 and Base32 without separators, I see the following
options:

Base32s only
Base64 only
Base64url only
Base32s + Base64url

I can see pros and cons for the base32 encoding. To make it really readable
it is necessary to put in the separators, which is something of an overhead.
So I can see an argument for allowing both Base32s and Base64url.

But if we only pick one I would say take base32 with separators. It is not
the most compact but it is good enough. It is fully URL compatible and has
the readability benefit.


Here is Base32 in C#:

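         // Encode the first Length bytes of data as unpadded Base32.
         // BASE32 is the 32-character alphabet table (defined elsewhere).
         string ToBase32String(byte[] data, int Length) {
             string result = "";
             int offset = 0;   // number of unread bits buffered in a
             int a = 0;        // bit buffer

             for (int i = 0; i < Length; i++) {
                 a = (a << 8) | data[i];
                 offset += 8;

                 while (offset >= 5) {
                     offset -= 5;

                     int n = a >> offset;
                     result = result + BASE32[n];
                     // Keep only the offset low bits that are still unread.
                     a = a & ((1 << offset) - 1);
                     }
                 }

             if (offset > 0) {
                 // Left-align the leftover bits in a final 5-bit group.
                 result = result + BASE32[a << (5 - offset)];
                 }
             return result;
             }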

Here is base64url in C#:

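        // Encode the first Length bytes of data as unpadded base64url.
        // BASE64URL is the 64-character alphabet table (defined elsewhere).
        string ToBase64urlString(byte[] data, int Length) {
            string result = "";
            int offset = 0;   // number of unread bits buffered in a
            int a = 0;        // bit buffer

            for (int i = 0; i < Length; i++) {
                a = (a << 8) | data[i];
                offset += 8;

                //Console.WriteLine ("{0:x4}/{2:3} : {1}", a, result, offset);

                while (offset >= 6) {
                    offset -= 6;

                    int n = a >> offset;
                    result = result + BASE64URL[n];
                    // Keep only the offset low bits that are still unread.
                    a = a & ((1 << offset) - 1);
                    }
                }
            if (offset > 0) {
                // Left-align the leftover bits in a final 6-bit group.
                result = result + BASE64URL[a << (6 - offset)];
                }
            return result;
            }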



