http://tools.ietf.org/html/draft-ietf-jose-json-web-signature-41#section-1.1 
uses this notation:

   UTF8(STRING) denotes the octets of the UTF-8 [RFC3629] representation
   of STRING, where STRING is a sequence of zero or more Unicode
   [UNICODE] characters.

   ASCII(STRING) denotes the octets of the ASCII [RFC20] representation
   of STRING, where STRING is a sequence of zero or more ASCII
   characters.

This is unambiguous and has already been vetted by the IESG and SecDir, so I 
would use exactly this wording.

OCTETS(STRING) is ambiguous, since for the same string there are many possible 
representations as octets, including ASCII, UTF-8, UTF-16, UTF-32, and EBCDIC.
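
To illustrate the ambiguity, here is a minimal Python sketch that encodes one string under several of the encodings listed above and hashes each result. The sample string and the choice of Python codecs (cp500 standing in for EBCDIC, little-endian variants for UTF-16 and UTF-32) are assumptions made for illustration, not anything taken from the draft.

```python
import hashlib

# Hypothetical sample string drawn from the url and filename safe alphabet.
sample = "aZ09-_"

# Codec names are Python's; cp500 is one of several EBCDIC code pages.
codecs = ["ascii", "utf-8", "utf-16-le", "utf-32-le", "cp500"]

# Same characters, different octets depending on the encoding chosen.
octets_by_codec = {name: sample.encode(name) for name in codecs}

# Hash each octet sequence; only identical octets give identical hashes.
digest_by_codec = {
    name: hashlib.sha256(octets).hexdigest()
    for name, octets in octets_by_codec.items()
}

for name in codecs:
    print(f"{name:10} {octets_by_codec[name].hex():50} {digest_by_codec[name][:16]}")
```

For ASCII-only input the ASCII and UTF-8 octets coincide, so their hashes match; the other encodings yield different octets and therefore different hashes.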

-- Mike

From: OAuth [mailto:oauth-boun...@ietf.org] On Behalf Of John Bradley
Sent: Friday, January 30, 2015 11:33 AM
To: Brian Campbell
Cc: oauth; Naveen Agarwal
Subject: Re: [OAUTH-WG] PKCE: SHA256(WAT?)

Have a look at the latest version; I added OCTETS(STRING) to show the 
conversion. ASCII(STRING) seemed more confusing because it draws character 
encoding back in.

I was tempted to call it an octet array without the terminating NULL of STRING 
but didn't want to introduce the term "array".

Let me know what you think.

On Jan 30, 2015, at 1:56 PM, Brian Campbell 
<bcampb...@pingidentity.com> wrote:

But, while it may be clear to you, what I'm saying here is that it's not clear 
to a reader/implementer.

Somehow the conversion from a character string to an octet string needs to be 
clearly and unambiguously stated. It doesn't have to be the text I suggested 
but it's not sufficient as it is now.

Something like this might work, if you don't want to touch the parts in 4.2 and 
4.6: "SHA256(STRING) denotes a SHA2 256bit hash [RFC6234] of the octets of the 
ASCII [RFC0020] representation of STRING."

An "octet sequence using the url and filename safe Alphabet [...], with length 
less than 128 characters." is ambiguous. Octets and characters are intermixed 
with no mention of encoding. But they're not interchangeable.


On Fri, Jan 30, 2015 at 7:15 AM, Nat Sakimura 
<sakim...@gmail.com> wrote:
I do not think we need ASCII(). It is quite clear without it, I suppose.

In 4.1, I would rather write something like:

 code_verifier = high entropy cryptographic random
   octet sequence using the url and filename safe Alphabet [A-Z] / [a-z]
   / [0-9] / "-" / "_" from Sec 5 of RFC 4648 [RFC4648], with length
   less than 128 characters.

Nat

2015-01-30 22:51 GMT+09:00 Brian Campbell 
<bcampb...@pingidentity.com>:
That's definitely an improvement (to me anyway).
Checking that the rest of the document uses those notations appropriately, I 
think, yields a few other changes. And probably begs for the "ASCII(STRING) 
denotes the octets of the ASCII representation of STRING" notation/function, or 
something like it, to be put back in. Those changes might look like the 
following:

In 4.1.:

OLD:
   code_verifier = high entropy cryptographic random ASCII [RFC0020]
   octet sequence using the url and filename safe Alphabet [A-Z] / [a-z]
   / [0-9] / "-" / "_" from Sec 5 of RFC 4648 [RFC4648], with length
   less than 128 characters.

NEW (maybe):
   code_verifier = high entropy cryptographically strong random STRING
   using the url and filename safe Alphabet [A-Z] / [a-z]
   / [0-9] / "-" / "_" from Sec 5 of RFC 4648 [RFC4648], with length
   less than 128 characters.

In 4.2.:

OLD:
   S256  "code_challenge" = BASE64URL(SHA256("code_verifier"))

NEW (maybe):
   S256  "code_challenge" = BASE64URL(SHA256(ASCII("code_verifier")))

In 4.6.:

OLD:
   SHA256("code_verifier" ) == BASE64URL-DECODE("code_challenge").

NEW (maybe):
   SHA256(ASCII("code_verifier")) == BASE64URL-DECODE("code_challenge").
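
The proposed notation can be traced end to end in a short Python sketch: generate a code_verifier, derive the S256 code_challenge on the client, and check it on the server. This is an illustration under stated assumptions rather than a reference implementation: the helper names, the default length of 43 characters, and the omission of trailing "=" padding in base64url are all assumptions not settled in this thread.

```python
import base64
import hashlib
import hmac
import secrets
import string

# url and filename safe alphabet: [A-Z] / [a-z] / [0-9] / "-" / "_"
ALPHABET = string.ascii_letters + string.digits + "-_"


def make_code_verifier(length: int = 43) -> str:
    """Return a random STRING over ALPHABET (the default length is an assumption)."""
    if not 0 < length < 128:
        raise ValueError("length must be between 1 and 127")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def ascii_octets(s: str) -> bytes:
    """ASCII(STRING): the octets of the ASCII representation of STRING."""
    return s.encode("ascii")


def base64url(octets: bytes) -> str:
    """BASE64URL(OCTETS), assuming trailing '=' padding is omitted."""
    return base64.urlsafe_b64encode(octets).rstrip(b"=").decode("ascii")


def base64url_decode(s: str) -> bytes:
    """BASE64URL-DECODE(STRING): restore padding, then decode to octets."""
    return base64.urlsafe_b64decode(s + "=" * (-len(s) % 4))


def s256_challenge(code_verifier: str) -> str:
    """Client side: BASE64URL(SHA256(ASCII(code_verifier)))."""
    return base64url(hashlib.sha256(ascii_octets(code_verifier)).digest())


def verify_s256(code_verifier: str, code_challenge: str) -> bool:
    """Server side: SHA256(ASCII(code_verifier)) == BASE64URL-DECODE(code_challenge)."""
    expected = hashlib.sha256(ascii_octets(code_verifier)).digest()
    return hmac.compare_digest(expected, base64url_decode(code_challenge))


verifier = make_code_verifier()
challenge = s256_challenge(verifier)
print(verifier, challenge, verify_s256(verifier, challenge))
```

Every function takes either a character string or an octet sequence, never both, which is the distinction the ASCII() notation is meant to make explicit.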



On Thu, Jan 29, 2015 at 8:37 PM, Nat Sakimura (=nat) 
<n...@sakimura.org> wrote:
I take your point, Brian.

In our most recent manuscript, STRING is defined inside ASCII(STRING) as

STRING is a sequence of zero or more ASCII characters

but it is kind of circular, and we do not seem to use ASCII().

What about re-writing the section like below?

STRING denotes a sequence of zero or more ASCII [RFC0020] characters.
OCTETS denotes a sequence of zero or more octets.
BASE64URL(OCTETS) denotes the base64url encoding of OCTETS, per Section 3, 
producing an ASCII [RFC0020] STRING.
BASE64URL-DECODE(STRING) denotes the base64url decoding of STRING, per 
Section 3, producing a sequence of octets.
SHA256(OCTETS) denotes a SHA2 256bit hash [RFC6234] of OCTETS.





On Jan 30, 2015, at 08:15, Brian Campbell 
<bcampb...@pingidentity.com> wrote:

In §2 [1] we've got "SHA256(STRING) denotes a SHA2 256bit hash [RFC6234] of 
STRING."
But, in the little cow town where I come from anyway, you hash bits/octets not 
character strings (BTW, "STRING" isn't defined anywhere but it's kind of 
implied that it's a string of characters).
Should it say something more like "SHA256(STRING) denotes a SHA2 256bit hash 
[RFC6234] of the octets of the ASCII [RFC0020] representation of STRING."?
I know it's kind of pedantic, but I find it confusing because the 
code_verifier uses the url and filename safe alphabet, which has me 
second-guessing whether SHA256(STRING) actually means a hash of the octets 
produced by base64url decoding the string.
Maybe it's just me but, when reading the text, I find the transform process to 
be much more confusing than I think it needs to be. Removing and clarifying 
some things will help. I hate to suggest this but maybe an example showing the 
computation steps on both ends would be helpful?
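
The two readings that cause the second-guessing can be compared directly. The Python sketch below hashes a hypothetical code_verifier once as the octets of its ASCII representation and once as the octets obtained by base64url decoding it; the verifier value and variable names are invented for illustration.

```python
import base64
import hashlib

# Hypothetical verifier; 64 characters so base64url decoding needs no padding.
code_verifier = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-_"

# Reading 1: hash the octets of the ASCII representation of the string.
ascii_octets = code_verifier.encode("ascii")
hash_of_ascii = hashlib.sha256(ascii_octets).hexdigest()

# Reading 2: base64url decode the string first, then hash the result.
decoded_octets = base64.urlsafe_b64decode(code_verifier)
hash_of_decoded = hashlib.sha256(decoded_octets).hexdigest()

print(len(ascii_octets), hash_of_ascii)
print(len(decoded_octets), hash_of_decoded)
```

The inputs differ in length (64 octets versus 48), so the two hashes differ; both ends interoperate only if the specification pins down one reading.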

Also "UTF8(STRING)" and "ASCII(STRING)" notations are defined in §2 but not 
used anywhere.

And §2 also says, "BASE64URL-DECODE(STRING) denotes the base64url decoding of 
STRING, per Section 3, producing a UTF-8 sequence of octets." But what is a 
UTF-8 sequence of octets? Isn't it just a sequence of octets? The [RFC3629] 
reference, I think, could be removed.

[1] https://tools.ietf.org/html/draft-ietf-oauth-spop-06#section-2

Nat Sakimura
n...@sakimura.org

--
Nat Sakimura (=nat)
Chairman, OpenID Foundation
http://nat.sakimura.org/
@_nat_en


_______________________________________________
OAuth mailing list
OAuth@ietf.org
https://www.ietf.org/mailman/listinfo/oauth
