Re: [websec] [decade] Digest: Adventures in encoding

Phillip Hallam-Baker Fri, 07 Oct 2011 08:44:44 -0700

On Fri, Oct 7, 2011 at 7:22 AM, "Martin J. Dürst" <[email protected]> wrote:


>
> As for Phil's original ideas, if there's a security issue with
> distinguishing upper- and lowercase filenames on Windows, then abandoning
> base64 may be a good idea. I wouldn't know of and couldn't imagine a
> structural security issue, but then I'm no security expert, but there's
> certainly some erosion, up to one bit per 6 bits if all of the characters
> are alphabetic.
>

It may not be too much of an issue. Windows probably couldn't cope with 2^64
files in one directory, let alone 2^128, so there would have to be some
post-processing on the back end for those cases.



> As for Base32, I don't like it too much because it seems to be totally new,
> and I prefer using existing stuff if it's good enough. This may show up in
> code. In scripting languages (think Perl, Ruby,...), Base64 and Base16 are
> simple pack operations, and Base64url is a pack and a tr, but Base32 needs
> handcoding. Also, Base32 aligns 8 characters to 5 bytes, Base64 4 characters
> to 3 bytes, and Base16 2 characters to 1 byte, and I somehow prefer a more
> regular alignment (although the reason for this may be that I never
> completely figured out how to handle the non-aligned stuff at the end).
>

It just drops out with zero bits padding the last chunk.

The code is not complex, about 10 lines for the inner loop.
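
On the consuming side the tail is equally simple. The following is a minimal sketch of an unpadded Base32 decoder, assuming the RFC 4648 alphabet (A-Z, 2-7); the class and method names are made up for illustration. Any bits left over at the end are the encoder's zero padding and are simply dropped.

```csharp
using System;
using System.Collections.Generic;

static class Base32Sketch {
    // Assumption: RFC 4648 Base32 alphabet.
    const string Alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";

    // Decodes unpadded Base32. Fewer than 8 leftover bits at the end are
    // the zero bits the encoder added to fill the last chunk; they are
    // discarded rather than emitted.
    public static byte[] Decode(string text) {
        var bytes = new List<byte>();
        int buffer = 0;   // bit accumulator
        int bits = 0;     // number of unread bits in buffer

        foreach (char c in text.ToUpperInvariant()) {
            int value = Alphabet.IndexOf(c);
            if (value < 0) throw new FormatException("Not a Base32 character: " + c);

            buffer = (buffer << 5) | value;
            bits += 5;

            if (bits >= 8) {
                bits -= 8;
                bytes.Add((byte)(buffer >> bits));
                buffer &= (1 << bits) - 1;   // keep only the unread low bits
            }
        }
        return bytes.ToArray();
    }
}
```

For a 32-byte SHA-256 digest, 256 bits make 51 full 5-bit groups plus 1 bit, so the encoded form is 52 characters and the last character carries 4 zero padding bits.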



> As for the Base32 with slashes, why not just leave the choice of having
> slashes or not to the creator, and require the consumer to take them out (as
> Phil proposed anyway). But with the "slashes being taken out" I don't
> understand Phil's argument about direct mapping to file names.
>

It is probably a separate use case to DECADE. I was thinking we might need
it in WebSec - possibly.

For example, imagine that we have an attack going on with Bank of Ethel and
Ethel is asking me to push out a security policy in our AV system by
telephone. I have tried reading Base64 over the telephone and it is not a
pleasant thing to do.
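
The "creator may add separators, consumer takes them out" rule can be sketched as below. This is only an illustration: the class name, the method names and the fixed group size of 6 are assumptions, since the proposal allows separators at any point.

```csharp
using System;
using System.Text;

static class Base32sSketch {
    // Creator side: insert '-' every groupSize characters for readability.
    // The group size is an arbitrary choice for this sketch.
    public static string Group(string digest, int groupSize = 6) {
        var sb = new StringBuilder();
        for (int i = 0; i < digest.Length; i++) {
            if (i > 0 && i % groupSize == 0) sb.Append('-');
            sb.Append(digest[i]);
        }
        return sb.ToString();
    }

    // Consumer side: strip separators to recover the canonical form.
    public static string Canonical(string text) {
        return text.Replace("-", "");
    }
}
```

Because the consumer strips every separator regardless of position, a value grouped for reading aloud and the plain Base32 value canonicalise to the same string.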



> On the other hand, I think allowing both Base32s and Base64url is a
> non-starter. Why have two if one is good enough? And then you can't map to
> URI paths because somebody could transform the digest from one version to
> another to squeeze it.
>

Well, the pragmatic issue is that I am holding up someone who wants to
write code. So if I take a guess and it is wrong, we end up with a legacy
issue :-(

We do have an escape hatch though: the separator between the algorithm and
the data. That can be used as an encoding switch.

So we can use ; for one encoding, | for another and so on.
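
A rough sketch of how a parser might use the separator as the encoding switch. Which separator selects which encoding is not settled here, so the assignment below (';' for base64url, '|' for base32) is purely hypothetical, as are the class and method names.

```csharp
using System;

static class NiSeparatorSketch {
    // Splits "ni:<algorithm><separator><data>" and reports the encoding
    // implied by the separator. Hypothetical assignment: ';' selects
    // base64url and '|' selects base32.
    public static (string Algorithm, string Encoding, string Data) Split(string ni) {
        const string scheme = "ni:";
        if (!ni.StartsWith(scheme, StringComparison.Ordinal))
            throw new FormatException("Not an ni: identifier");

        int sep = ni.IndexOfAny(new[] { ';', '|' }, scheme.Length);
        if (sep < 0) throw new FormatException("No algorithm/data separator");

        string algorithm = ni.Substring(scheme.Length, sep - scheme.Length);
        string encoding = ni[sep] == ';' ? "base64url" : "base32";
        return (algorithm, encoding, ni.Substring(sep + 1));
    }
}
```

Further encodings would only need another separator character added to the lookup.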


For forwards compatibility we can require interim implementations to accept
both encodings, to decode form encoding, and to generate base64 by default
while being capable of emitting base32 and formencode(base64).
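
A minimal sketch of the "accept liberally" half for the base64 family, assuming that "form encoding" here means percent-encoding of '/' and '+' (as in %2f and %2b). The class and method names are made up for illustration.

```csharp
using System;

static class LenientBase64Sketch {
    // Accepts base64, base64url, or percent-encoded base64, with or
    // without '=' padding, and returns the raw digest bytes.
    public static byte[] Decode(string text) {
        string s = Uri.UnescapeDataString(text)   // "%2f" -> "/", "%2b" -> "+"
            .Replace('-', '+')                    // base64url -> base64
            .Replace('_', '/')
            .TrimEnd('=');
        // Restore the padding that Convert.FromBase64String expects.
        s = s.PadRight(s.Length + (4 - s.Length % 4) % 4, '=');
        return Convert.FromBase64String(s);
    }
}
```

All three spellings of the same digest then decode to identical bytes, so the comparison can be done on the binary value rather than on the text.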



> What I still don't understand is where the pressure for having to have the
> digest and part of the URI path needing to be the same comes from. If I want
> to have a digest protection for a page directly at http://www.example.org/,
> shouldn't I be able to do so? I tried to ask about this previously, but I
> still didn't get the point, sorry.
>

Ah the issue there is that if the browser has the necessary smarts it can
look at:

<a href="http://example.com/.well-known/ni/sha-256/t3tjWygyv5Xo8pNZY%2fE0pB9PEcC%2b3WztLF5VHyiNmYA=">This
is a strong link to static content</a>

And determine that the URL is a strong link and apply appropriate processing
when in strict mode (or whatever).


This link is 100% backwards compatible with existing browsers. All it does
is add some extra semantics.

So for the 'strong link' application, browsers would not need to support the
ni: scheme at all and it would not be necessary to provide different content
according to whether the browser did or did not understand the content.
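
A rough sketch of the check a client with the necessary smarts might make. Assumptions: only sha-256 is handled, the digest is the last path segment in percent-encoded base64 as in the link above, and the URL carries no query or fragment. The class and method names are hypothetical, and what a browser does with the result in strict mode is left open.

```csharp
using System;
using System.Security.Cryptography;

static class StrongLinkSketch {
    const string Marker = "/.well-known/ni/sha-256/";

    // Returns null for an ordinary link; otherwise true if the retrieved
    // content matches the digest embedded in the URL, false if it does not.
    public static bool? Check(string url, byte[] content) {
        int at = url.IndexOf(Marker, StringComparison.Ordinal);
        if (at < 0) return null;   // not a strong link, nothing to verify

        // Digest as written in the URL, with %2f and %2b decoded.
        string expected = Uri.UnescapeDataString(url.Substring(at + Marker.Length));

        using (var sha = SHA256.Create()) {
            string actual = Convert.ToBase64String(sha.ComputeHash(content));
            return actual == expected;
        }
    }
}
```

A client that does not know about strong links never runs this check and just follows the URL, which is why the link stays backwards compatible.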



> Regards,   Martin.
>
>  S.
>>
>> [1] http://tools.ietf.org/html/draft-ietf-oauth-v2-22#section-4.1.1
>>
>> On 10/07/2011 03:49 AM, Phillip Hallam-Baker wrote:
>>
>>> Following on the base16/base64 discussion, I have written some code (see
>>> end) and have some ni digests in various flavors of encoding.
>>>
>>> My conclusion is that we should split the difference and do base32
>>> instead.
>>> I think the arguments are actually quite compelling.
>>>
>>> This is only an encoding issue. So choosing an encoding that requires the
>>> least number of systems to be touched is my priority. Choosing base32
>>> allows the resolution scheme to be supported by unmodified Apache and IIS.
>>>
>>> The additional code burden for ni/digest implementers to write base32 is
>>> trivial.
>>>
>>>
>>> There is also the option of doing more than one encoding. But Base64 uris
>>> are only slightly shorter.
>>>
>>>
>>> *Base16*
>>> ni:sha-256;B77B635B2832BF95E8F2935963F134A41F4F11C0BEDD6CED2C5E551F288D9980
>>>
>>>
>>> Problem - very long even without separators.
>>>
>>>
>>> *Base32:*
>>> ni:sha-256;W5AGGABIGKAJLAHSSNAGHABUUQAE6AGAX3AGZABMLZAB6AENTGAA
>>>
>>> Advantage: more compact than Base16 (somewhat), can be read out over a
>>> telephone or terminal room (try that with Base64). Can be printed as a
>>> static reference in a journal or equivalent.
>>>
>>>
>>> *Base32s:*
>>> ni:sha-256;W5AGGA-BIGKAJ-LAHSSNA-GHABUU-QAE6AGA-X3AGZA-BMLZAB-6AENTGAA
>>>
>>> This is my own invention, basically Base32 with separators added for
>>> readability.
>>>
>>>
>>> *Base64:*
>>> ni:sha-256;t3tjWygyv5Xo8pNZY/E0pB9PEcC+3WztLF5VHyiNmYA=
>>>
>>> This is the traditional base64 encoding.
>>>
>>> Process disadvantage: You know that someone is bound to challenge the
>>> forward slash just as the document gets to last call. I don't see an
>>> advantage to risk a discuss.
>>>
>>> Practical disadvantage: Gets messed up when converted to a well-known
>>> URL. Consider the following:
>>>
>>> ni:sha-256;t3tjWygyv5Xo8pNZY/E0pB9PEcC+3WztLF5VHyiNmYA=?http=example.com
>>>
>>> This would map to:
>>>
>>> http://example.com/.well-known/ni/sha-256/t3tjWygyv5Xo8pNZY/E0pB9PEcC+3WztLF5VHyiNmYA=
>>>
>>>
>>> The forward slash is not a hierarchy indicator, which is bad. Worse still,
>>> this would mean that support for ni digest objects requires a code plug-in
>>> for Apache, IIS etc. rather than just mapping .well-known/ni/sha-256 to
>>> the directory with the digest values.
>>>
>>>
>>> *Base64url:*
>>>
>>> ni:sha-256;t3tjWygyv5Xo8pNZY_E0pB9PEcC-3WztLF5VHyiNmYA
>>>
>>> This is essentially the same as base64 for size etc. The only
>>> disadvantage is that the encoder has to be written from scratch. (Mine
>>> took 20 mins.)
>>>
>>> The advantage over plain base64 is that there is no code required to
>>> support the .well-known version of the locator scheme on the server at
>>> all. Just some admin stuff. Also the URL is completely compatible with
>>> URI process lore.
>>>
>>>
>>> *Summary:*
>>>
>>> Arguments can be made for each one of these schemes. I think the argument
>>> for Base16 is the weakest since Base32 can do everything that Base16 can
>>> do. Neither is implemented as a standard library function on my platform.
>>>
>>> If we went for Base32, I would argue for allowing some form of
>>> readability separator. Base32s is much easier to transcribe than plain
>>> Base32.
>>>
>>> Base64/Base64url offer the best compression in a practical form. For most
>>> applications a truncated digest will be acceptable. The main disadvantage
>>> of the Base64 schemes is that they are case sensitive. This will play
>>> merry heck with case-insensitive but case-preserving file systems such as
>>> Windows.
>>>
>>> I think we should knock out base16 from consideration as base32 does it
>>> better.
>>>
>>> Since it is fairly easy to write a filter that strips out the separators,
>>> I would argue for allowing separators at any point in the identifier, but
>>> that the canonical form is with the separators stripped out and this is
>>> what is used to create the URL form. So
>>>
>>> ni:sha-256;W5AGGA-BIGKAJ-LAHSSNA-GHABUU-QAE6AGA-X3AGZA-BMLZAB-6AENTGAA
>>>
>>> would map to
>>>
>>> http://example.com/.well-known/ni/sha-256/W5AGGABIGKAJLAHSSNAGHABUUQAE6AGAX3AGZABMLZAB6AENTGAA
>>>
>>>
>>> This preserves the criterion that Apache, IIS etc. can be configured to
>>> resolve these identifiers without new code.
>>>
>>>
>>> Taking out Base16 and Base32 without separators, I see the following
>>> options:
>>>
>>> Base32s only
>>> Base64 only
>>> Base64url only
>>> Base32s + Base64url
>>>
>>> I can see pros and cons for the base32 encoding. To make it really
>>> readable it is necessary to put in the separators, which is something of
>>> an overhead. So I can see an argument for both.
>>>
>>> But if we only pick one I would say take base32 with separators. It is
>>> not the most compact but it is good enough. It is fully URL compatible and
>>> has the readability benefit.
>>>
>>>
>>> Here is Base32 in C#:
>>>
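>>> // BASE32 is assumed to be the 32-character alphabet string
>>> // "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567".
>>> string ToBase32String(byte[] data, int Length) {
>>>     string result = "";
>>>     int offset = 0;   // number of unread bits held in a
>>>     int a = 0;        // bit buffer
>>>
>>>     for (int i = 0; i < Length; i++) {
>>>         a = (a << 8) | data[i];
>>>         offset += 8;
>>>
>>>         while (offset >= 5) {
>>>             offset -= 5;
>>>
>>>             int n = a >> offset;
>>>             result = result + BASE32[n];
>>>             a = a & ((1 << offset) - 1);   // keep only the unread low bits
>>>         }
>>>     }
>>>
>>>     if (offset > 0) {
>>>         // pad the last chunk with zero bits on the right
>>>         result = result + BASE32[a << (5 - offset)];
>>>     }
>>>     return result;
>>> }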
>>>
>>> Here is base64url in C#:
>>>
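>>> // BASE64URL is assumed to be the 64-character URL-safe alphabet string
>>> // "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_".
>>> string ToBase64urlString(byte[] data, int Length) {
>>>     string result = "";
>>>     int offset = 0;   // number of unread bits held in a
>>>     int a = 0;        // bit buffer
>>>
>>>     for (int i = 0; i < Length; i++) {
>>>         a = (a << 8) | data[i];
>>>         offset += 8;
>>>
>>>         //Console.WriteLine ("{0:x4}/{2:3} : {1}", a, result, offset);
>>>
>>>         while (offset >= 6) {
>>>             offset -= 6;
>>>
>>>             int n = a >> offset;
>>>             result = result + BASE64URL[n];
>>>             a = a & ((1 << offset) - 1);   // keep only the unread low bits
>>>         }
>>>     }
>>>     if (offset > 0) {
>>>         // pad the last chunk with zero bits on the right
>>>         result = result + BASE64URL[a << (6 - offset)];
>>>     }
>>>     return result;
>>> }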
>>>
>>>
>>>
>>> _______________________________________________
>>> decade mailing list
>>> [email protected]
>>> https://www.ietf.org/mailman/listinfo/decade
>>>
>> _______________________________________________
>> websec mailing list
>> [email protected]
>> https://www.ietf.org/mailman/listinfo/websec
>>
>>


-- 
Website: http://hallambaker.com/

