Re: attribute exchange value encoding

2007-05-30 Thread Johnny Bufu

On 29-May-07, at 2:33 AM, Claus Färber wrote:

> Johnny Bufu wrote:
>> The attribute metadata can be used to define attribute-specific
>> encodings, which should deal with issues like this.
>
> Ah, so the _usual_ way is that the metadata (Can this be renamed to
> "datatype definition"? "metadata" is very misleading.) defines the
> encoding.

That would be the preferred way, yes. "Metadata" seems to make sense  
to most people involved in the schema effort, because it can contain  
other stuff beside data type description. But this is a work in  
progress, so if you're interested and want to make your case I'm sure  
you'll be most welcome on the idschema list:

[EMAIL PROTECTED]
http://mail.idcommons.net/cgi-bin/mailman/listinfo/idschemas

> For binary data, it will be base64Binary or hexBinary as
> defined in XML schema. Correct?

Yes, whatever encoding is defined in the metadata document.

>> The AX protocol has to stay simple (that was overwhelming feedback
>> I've received at IIW). The base64 encoding is there as a convenience:
>> if a number of OPs and RPs agree on an attribute type (the classical
>> example being an avatar image) but don't want to go to the trouble of
>> publishing metadata information,
>
> In other words: The metadata is implicitly agreed upon by the parties
> involved. If they can agree on the meaning and the base format
> (integer, string, *binary,...) they can also agree on an encoding
> (e.g. agree on base64Binary instead of *binary).
>
> So I don't think AX needs means to flag base64 data. The parties
> involved should know when base64Binary or hexBinary is used by out of
> band information (metadata/datatype definition or mutual agreement).

I see your point - if the parties involved agree on the attribute  
type and (implicitly) on its data format, they can / should go one  
step further and agree on the encoding as well. And eventually the  
prevailing method would be to programmatically determine the encoding  
from the metadata documents.

> In other words, AX should just restrict values to UTF-8 strings and
> recommend base64Binary (or hexBinary) for datatypes (datatypes, not
> data!) that can't be represented as UTF-8 strings.

Yes, that would work too; it would be basically AX draft-5 plus  
disallowing newlines in attribute values in order to comply with the  
underlying OpenID data formats.
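
As a rough illustration of this approach (values restricted to newline-free UTF-8 strings, with binary datatypes carried as base64Binary or hexBinary), here is a minimal Python sketch. The function name `encode_ax_value` and the datatype labels are assumptions made for the example; in practice the datatype would come from the attribute's metadata document or from mutual agreement between the parties.

```python
import base64


def encode_ax_value(value, datatype="string"):
    """Return the string form of an attribute value for a given datatype.

    The datatype labels mirror the XML Schema names mentioned in the
    thread; how a party learns the datatype is outside this sketch.
    """
    if datatype == "base64Binary":
        text = base64.b64encode(value).decode("ascii")
    elif datatype == "hexBinary":
        text = value.hex().upper()
    elif datatype == "string":
        text = value
    else:
        raise ValueError(f"unknown datatype: {datatype}")
    # OpenID's key-value form cannot carry a newline inside a value.
    if "\n" in text:
        raise ValueError("attribute values must not contain newlines")
    return text


print(encode_ax_value("Johnny"))
print(encode_ax_value(b"\x00\xff", "base64Binary"))
print(encode_ax_value(b"\x00\xff", "hexBinary"))
```

Note that nothing in the encoded value itself says which encoding was used; the receiver has to know the datatype out of band, which is the point being made above.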

If there's agreement that encoding is more tightly coupled with the  
attribute types and their metadata definition, we can just reference  
that in the AX spec.


Johnny

___
specs mailing list
specs@openid.net
http://openid.net/mailman/listinfo/specs


RE: attribute exchange value encoding

2007-05-29 Thread Guoping Liu
I agree with Claus. We may not need a base64 type. 

Guoping


-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Claus Färber
Sent: Tuesday, May 29, 2007 3:33 AM
To: specs@openid.net
Subject: Re: attribute exchange value encoding

Johnny Bufu wrote:
> The attribute metadata can be used to define attribute-specific  
> encodings, which should deal with issues like this.

Ah, so the _usual_ way is that the metadata (Can this be renamed to 
"datatype definition"? "metadata" is very misleading.) defines the 
encoding. For binary data, it will be base64Binary or hexBinary as 
defined in XML schema. Correct?

> The AX protocol has to stay simple (that was overwhelming feedback  
> I've received at IIW). The base64 encoding is there as a convenience:  
> if a number of OPs and RPs agree on an attribute type (the classical  
> example being an avatar image) but don't want to go to the trouble of  
> publishing metadata information,

In other words: The metadata is implicitly agreed upon by the parties 
involved. If they can agree on the meaning and the base format (integer, 
string, *binary,...) they can also agree on an encoding (e.g. agree on 
base64Binary instead of *binary).

So I don't think AX needs means to flag base64 data. The parties 
involved should know when base64Binary or hexBinary is used by out of 
band information (metadata/datatype definition or mutual agreement).

In other words, AX should just restrict values to UTF-8 strings and 
recommend base64Binary (or hexBinary) for datatypes (datatypes, not 
data!) that can't be represented as UTF-8 strings.

Claus



Re: attribute exchange value encoding

2007-05-29 Thread Claus Färber
Johnny Bufu wrote:
> The attribute metadata can be used to define attribute-specific  
> encodings, which should deal with issues like this.

Ah, so the _usual_ way is that the metadata (Can this be renamed to 
"datatype definition"? "metadata" is very misleading.) defines the 
encoding. For binary data, it will be base64Binary or hexBinary as 
defined in XML schema. Correct?

> The AX protocol has to stay simple (that was overwhelming feedback  
> I've received at IIW). The base64 encoding is there as a convenience:  
> if a number of OPs and RPs agree on an attribute type (the classical  
> example being an avatar image) but don't want to go to the trouble of  
> publishing metadata information,

In other words: The metadata is implicitly agreed upon by the parties 
involved. If they can agree on the meaning and the base format (integer, 
string, *binary,...) they can also agree on an encoding (e.g. agree on 
base64Binary instead of *binary).

So I don't think AX needs means to flag base64 data. The parties 
involved should know when base64Binary or hexBinary is used by out of 
band information (metadata/datatype definition or mutual agreement).

In other words, AX should just restrict values to UTF-8 strings and 
recommend base64Binary (or hexBinary) for datatypes (datatypes, not 
data!) that can't be represented as UTF-8 strings.

Claus



Re: attribute exchange value encoding

2007-05-29 Thread Claus Färber
Johnny Bufu wrote:
> I believe the HTTP encoding [1] in the OpenID spec will take care of  
> this part, i.e. before putting the OpenID + AX message on the wire,  
> the OpenID layer has to HTTP-encode it.

Maybe "Base 64 Encoding with URL and Filename Safe Alphabet" (RFC 3548, 
section 4) should be used for efficiency.

If 2 out of 64 characters need to be %-encoded, this increases the size 
by an average of 6.25%. (I'm ignoring the '=' as it only appears once.) 
The total overhead of Base64 changes from 33.3% to 41.7%.
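
The arithmetic above can be checked with a short Python sketch. It reproduces the expected-value calculation and then measures one concrete input; the helper name `encoded_sizes` is made up for this example, and `quote(..., safe="")` stands in for whatever percent-encoding the HTTP layer applies.

```python
import base64
from urllib.parse import quote

# Expected expansion: 62 of the 64 alphabet characters stay 1 byte,
# the other 2 ('+' and '/') become 3 bytes each when %-encoded.
percent_expansion = (62 * 1 + 2 * 3) / 64      # 1.0625
base64_expansion = 4 / 3                       # 1.3333...
total_expansion = base64_expansion * percent_expansion

print(f"{(percent_expansion - 1) * 100:.2f}%")  # size increase from %-encoding
print(f"{(total_expansion - 1) * 100:.1f}%")    # total overhead over raw bytes


def encoded_sizes(data):
    """Compare the wire size of standard and URL-safe Base64."""
    standard = base64.b64encode(data).decode("ascii")
    url_safe = base64.urlsafe_b64encode(data).decode("ascii")
    return {
        "raw": len(data),
        "base64": len(standard),
        "base64_percent_encoded": len(quote(standard, safe="")),
        "base64url_percent_encoded": len(quote(url_safe, safe="")),
    }


print(encoded_sizes(b"\xfb\xff"))
```

With the URL-safe alphabet only the '=' padding still needs escaping, so the extra 6.25% disappears for all but the last few characters.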

Claus



Re: attribute exchange value encoding

2007-05-28 Thread Johnny Bufu
Hi Guoping,

On 28-May-07, at 9:22 PM, Guoping Liu wrote:
> I have a couple comments on Section 3.3.2 Default Encoding of a Binary
> Value.
>
> First, the character set of standard Base64 encoding is not URL-safe.
> Specifically, '+', '/' and '=' need to be URL-encoded. So, we need to
> URL-encode the value after base64 encoding.

I believe the HTTP encoding [1] in the OpenID spec will take care of  
this part, i.e. before putting the OpenID + AX message on the wire,  
the OpenID layer has to HTTP-encode it.
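
To illustrate, here is a small Python sketch of that layering, assuming the OpenID layer form-encodes the whole message with something like `urllib.parse.urlencode`. The field name `openid.ax.value.avatar` is only a placeholder for the example, not a name taken from the draft.

```python
from urllib.parse import parse_qs, urlencode

# Placeholder AX field whose base64 value contains '+', '/' and '='.
fields = {"openid.ax.value.avatar": "+/8="}

# The HTTP layer escapes the three problematic characters...
wire = urlencode(fields)
print(wire)

# ...and the receiving side gets the original base64 text back.
received = parse_qs(wire)
print(received["openid.ax.value.avatar"][0])
```

The AX layer never sees the escaped form, so it does not need its own URL-encoding step for base64 values.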

> Secondly, different platforms may have different binary formats for a
> given type of objects. There may be interoperability issues with binary
> values across different platforms. We may want to use a string
> representation of an object instead of its binary representation, like
> in any XML document. For example, for an integer value 1234 of attribute
> x we have openid.ax.x=1234. With this we will not need base64 encoding.
> But, we will still need URL-encoding.

The attribute metadata can be used to define attribute-specific  
encodings, which should deal with issues like this.

The AX protocol has to stay simple (that was overwhelming feedback  
I've received at IIW). The base64 encoding is there as a convenience:  
if a number of OPs and RPs agree on an attribute type (the classical  
example being an avatar image) but don't want to go to the trouble of  
publishing metadata information, they can use AX's base64 support for  
transferring it. And yes, I'd expect numeric values to be transferred  
as strings in >90% of the cases.


Johnny

[1] http://openid.net/specs/openid-authentication-2_0-11.html#anchor4



RE: attribute exchange value encoding

2007-05-28 Thread Guoping Liu
Johnny:

I have a couple comments on Section 3.3.2 Default Encoding of a Binary
Value. 

First, the character set of standard Base64 encoding is not URL-safe.
Specifically, '+', '/' and '=' need to be URL-encoded. So, we need to
URL-encode the value after base64 encoding. 

Secondly, different platforms may have different binary formats for a
given type of objects. There may be interoperability issues with binary
values across different platforms. We may want to use a string
representation of an object instead of its binary representation, like
in any XML document. For example, for an integer value 1234 of attribute
x we have openid.ax.x=1234. With this we will not need base64 encoding.
But, we will still need URL-encoding.
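
A short Python sketch of the interoperability concern: the same integer has different binary layouts depending on byte order, while its decimal string form is the same everywhere. The use of `struct` here is just one way to show platform-style binary layouts.

```python
import struct

value = 1234

# Two possible binary layouts for the same 32-bit integer.
little_endian = struct.pack("<i", value)
big_endian = struct.pack(">i", value)
print(little_endian.hex(), big_endian.hex())

# The string representation is unambiguous and needs no base64.
text = str(value)
print(f"openid.ax.x={text}")
```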

Regards,
Guoping

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
Behalf Of Johnny Bufu
Sent: Thursday, May 24, 2007 6:16 PM
To: OpenID specs list
Subject: attribute exchange value encoding

Hello list,

While at IIW, I asked around what people thought about the encoding  
mechanisms we've added recently, in order to allow for transferring  
any data types. The consensus was that everyone would prefer  
something simpler and lighter.

So I've rewritten the encoding section, such that:

- for strings, only the newline (and percent) characters are required  
to be escaped,
   (to comply with OpenID's data formats), using percent-encoding;

- base64 must be used for encoding binary data, and defined
   an additional field for this:
openid.ax.encoding.=base64


Please review section 3.3 Attribute Values to see if there are any  
issues.


One remaining question is about the choice of encoding for strings.  
Percent-encoding (RFC3986) seems the simplest from a spec  
perspective, however some libraries provide (better) support for the  
older URL-encoding (RFC1738), which throws '+' characters into the  
mix. Which do you think would work best for implementers, users, and  
would cause fewer interop problems?


Johnny



Re: Re-defining the Key-Value format (was: attribute exchange value encoding)

2007-05-28 Thread Johnny Bufu
Hi Claus,

On 28-May-07, at 5:55 AM, Claus Färber wrote:

> Johnny Bufu wrote:
>> So I've rewritten the encoding section, such that:
>>
>> - for strings, only the newline (and percent) characters are required
>> to be escaped,
>>(to comply with OpenID's data formats), using percent-encoding;
>
> This means that '%' characters need to be encoded up to three times:

I'm not sure I follow your reasoning all the way; please see my  
comments below and point where I'm wrong.

> For example:
>
> User name: 100%pure
>
> Embedded in an URI that is the value of the attribute:
>http://example.com/foo/100%25pure

This encoding happens outside of the OpenID / AX protocols. There's  
nothing we can do in the specs about it, if the value of an attribute  
is a URI like http://example.com/foo/100%25pure.

From the OpenID / AX point of view, I view the above as an unencoded  
% character (AX doesn't know in this case that the payload is a  
URI); it's up to whoever consumes the attribute value to handle it  
properly.


> Encoded for AX using Key-Value Form Encoding  (OID 2, 4.1.1.)
>openid.ax.foo.uri:http://example.com/foo/100%2525pure

AX has nothing to do directly with key-value encoding. I see no  
reference to percent-encoding from OpenID2's section 4.1.1.

But yes, using the AX 3.3.1 Default Encoding of a String Value [1],  
if user_name=100%pure the field in a key-value representation would be:

openid.ax.foo.value=100%25pure


> Encoded for AX using HTTP Encoding (OID 2, 4.1.2.)
>openid.ax.foo.uri=http%3A//example.com/foo/100%2525pure

Yes, there would be a double-encoding of the % char, one done by AX  
3.3.1, and another x-www-form encoding as required by OpenID 4.1.2  
for indirect messages.
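
The two layers can be sketched in Python as follows. The helper `ax_escape` is a hypothetical stand-in for the AX 3.3.1 string encoding as described in this thread (only '%' and newline are escaped), and `urlencode` stands in for the form encoding applied to indirect messages.

```python
from urllib.parse import urlencode


def ax_escape(value):
    """Escape only '%' and newline, as the AX draft section is described here."""
    return value.replace("%", "%25").replace("\n", "%0A")


user_name = "100%pure"

# Layer 1: AX string encoding.
ax_value = ax_escape(user_name)
print(ax_value)

# Layer 2: form encoding of the indirect message.
wire = urlencode({"openid.ax.foo.value": ax_value})
print(wire)
```

If the attribute value were itself a URI containing an escaped '%', a third level of escaping would already be present before AX sees it, which is the case raised in the quoted example.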


> I don't think it's a good idea to introduce a solution to the "\n"
> problem in AX only. It should be part of the base spec (OpenId 2
> Authentication).

What do you see as pros / cons for each proposed solution?


> What about changing section 4.1.1. from:
>
>  A message in Key-Value form is a sequence of lines.  Each
>  line begins with a key, followed by a colon, and the value
>  associated with the key.  The line is terminated by a
>  single newline (UCS codepoint 10, "\n"). A key or value
>  MUST NOT contain a newline and a key also MUST NOT contain
>  a colon.
>
> to (wording adapted from RFC 2822):
>
>   A message in Key-Value form consists of fields composed of
>  a key, followed by a colon (":"), followed by a value, and
>  terminated by a single LF (UCS codepoint 10, "\n").
>
>  The key MUST be composed of printable US-ASCII characters except
>  ":" (i.e. characters that have values between 33 and 57, or
>  between 59 and 126, inclusive). The key MUST NOT start with
>  a '*' (codepoint 42).
>
>  The value MUST be composed of a sequence of characters encoded
>  as UTF-8. If an extension to this specification allows values
>  that contain LF (UCS codepoint 10, "\n") characters, these LF
>  (UCS codepoint 10, "\n") characters MUST be encoded as a
>  sequence of LF, '*', ':' (UCS codepoints 10, 42, 58, "\n*:").
>
> [Unlike the suggested %-encoding, this encoding is compatible with
> the current spec as long as LF characters are not actually allowed
> within the value.

What makes the proposed percent-encoding incompatible with the  
current OpenID spec?


> It's similar to the RFC 2822 folding mechanism but folding is only
> allowed (and mandated) where a LF is to be encoded. Further, the
> continuation line is compatible with the key-value format, using '*'
> as a pseudo key value.]
>
>  If an extension to this specification needs to allow binary
>  data in values, i.e. if it allows arbitrary bytes not to be
>  interpreted as UTF-8 characters, it MAY use Base64 []
>  encoding for the specification of the format of that value.

I would be (mildly) ok with dealing with newline escaping in the core  
if others agree, but:
- it does add some extra stuff, which some may not like / approve
- it would add another item on the 'compatibility list', and another  
thing that OpenID 1/2 implementations would need to deal with twice
- not sure what would be the net advantage of having it there (aside  
from having consistency across all extensions).

> [Note: Base64 is quite efficient when it comes to encoding the
> message in HTTP Encoding (OID 2, 4.1.2.). Unencoded bytes would have
> to use the %-encoding, roughly doubling the size. Unencoded bytes also
> create problems if implementations think they should be UTF-8, e.g.
> if Perl strings are used.]
>
>> - base64 must be used for encoding binary data, and defined
>>an additional field for this:
>>  openid.ax.encoding.=base64
>
> I think it's much simpler if the specification of the fiel

Re-defining the Key-Value format (was: attribute exchange value encoding)

2007-05-28 Thread Claus Färber
Johnny Bufu wrote:
> So I've rewritten the encoding section, such that:
> 
> - for strings, only the newline (and percent) characters are required  
> to be escaped,
>(to comply with OpenID's data formats), using percent-encoding;

This means that '%' characters need to be encoded up to three times:

For example:

User name: 100%pure

Embedded in an URI that is the value of the attribute:
   http://example.com/foo/100%25pure

Encoded for AX using Key-Value Form Encoding  (OID 2, 4.1.1.)
   openid.ax.foo.uri:http://example.com/foo/100%2525pure

Encoded for AX using HTTP Encoding (OID 2, 4.1.2.)
   openid.ax.foo.uri=http%3A//example.com/foo/100%2525pure

I don't think it's a good idea to introduce a solution to the "\n" 
problem in AX only. It should be part of the base spec (OpenId 2 
Authentication).

What about changing section 4.1.1. from:

 A message in Key-Value form is a sequence of lines.  Each
 line begins with a key, followed by a colon, and the value
 associated with the key.  The line is terminated by a
 single newline (UCS codepoint 10, "\n"). A key or value
 MUST NOT contain a newline and a key also MUST NOT contain
 a colon.

to (wording adapted from RFC 2822):

A message in Key-Value form consists of fields composed of
 a key, followed by a colon (":"), followed by a value, and
 terminated by a single LF (UCS codepoint 10, "\n").

 The key MUST be composed of printable US-ASCII characters except
 ":" (i.e. characters that have values between 33 and 57, or
 between 59 and 126, inclusive). The key MUST NOT start with
 a '*' (codepoint 42).

 The value MUST be composed of a sequence of characters encoded
 as UTF-8. If an extension to this specification allows values
 that contain LF (UCS codepoint 10, "\n") characters, these LF
 (UCS codepoint 10, "\n") characters MUST be encoded as a
 sequence of LF, '*', ':' (UCS codepoints 10, 42, 58, "\n*:").

[Unlike the suggested %-encoding, this encoding is compatible with
the current spec as long as LF characters are not actually allowed
within the value.
It's similar to the RFC 2822 folding mechanism but folding is only
allowed (and mandated) where a LF is to be encoded. Further, the
continuation line is compatible with the key-value format, using '*'
as a pseudo key value.]

 If an extension to this specification needs to allow binary
 data in values, i.e. if it allows arbitrary bytes not to be
 interpreted as UTF-8 characters, it MAY use Base64 []
 encoding for the specification of the format of that value.

[Note: Base64 is quite efficient when it comes to encoding the
message in HTTP Encoding (OID 2, 4.1.2.). Unencoded bytes would have
to use the %-encoding, roughly doubling the size. Unencoded bytes also
create problems if implementations think they should be UTF-8, e.g.
if Perl strings are used.]
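
The LF continuation scheme proposed above can be sketched in Python as follows. This is only an illustration of the proposal as worded in this message (an embedded LF becomes LF, '*', ':'), not something taken from the OpenID specification; the function names `encode_kv` and `decode_kv` are made up for the example.

```python
def encode_kv(fields):
    """Serialize fields to Key-Value form, folding LF as LF + '*:'."""
    lines = []
    for key, value in fields.items():
        if ":" in key or "\n" in key or key.startswith("*"):
            raise ValueError(f"invalid key: {key!r}")
        lines.append(key + ":" + value.replace("\n", "\n*:") + "\n")
    return "".join(lines)


def decode_kv(message):
    """Parse Key-Value form, joining '*:' continuation lines with LF."""
    fields = {}
    last_key = None
    body = message[:-1] if message.endswith("\n") else message
    if not body:
        return fields
    for line in body.split("\n"):
        key, _, value = line.partition(":")
        if key == "*":
            if last_key is None:
                raise ValueError("continuation line without a preceding field")
            fields[last_key] += "\n" + value
        else:
            fields[key] = value
            last_key = key
    return fields


message = encode_kv({"bio": "line one\nline two", "nick": "joe"})
print(message)
print(decode_kv(message))
```

A parser that knows nothing about the scheme would still read every line as `key:value`, seeing the continuation as a field with the key `*`.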

> - base64 must be used for encoding binary data, and defined
>an additional field for this:
>   openid.ax.encoding.=base64

I think it's much simpler if the specification of the field value format 
just says UTF-8 or Base64 and if the same encoding is used for all 
actual values, even those that would not need any encoding.

Claus



Re: attribute exchange value encoding

2007-05-25 Thread Johnny Bufu
Hi Drummond,

On 25-May-07, at 8:55 PM, Drummond Reed wrote:
>> One remaining question is about the choice of encoding for strings.
>> Percent-encoding (RFC3986) seems the simplest from a spec
>> perspective, however some libraries provide (better) support for the
>> older URL-encoding (RFC1738), which throws '+' characters into the
>> mix. Which do you think would work best for implementers, users, and
>> would cause less interop problems?
>
> FWIW, encoding "+" can be a hassle as it's a legal character in media type
> values and is also sometimes used for spaces. I'd vote for pure RFC3986
> percent-encoding.

Simplicity was the reason for requiring percent-encoding for only two  
characters.

Then it was brought to my attention that implementers may be tempted  
to use the more readily-available functions for URL-encoding, which  
do share the percent-encoding part with what's specified currently in  
AX, but will break other characters.
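
A small Python sketch of the mismatch, assuming the minimal AX escaping covers only '%' and newline as described in this thread. The helpers `ax_escape` and `ax_unescape` are hypothetical; `quote_plus` is the kind of readily available URL-encoding function an implementer might reach for instead.

```python
from urllib.parse import quote_plus


def ax_escape(value):
    """Minimal escaping: only '%' and newline."""
    return value.replace("%", "%25").replace("\n", "%0A")


def ax_unescape(value):
    """Inverse of ax_escape; newline first, then '%'."""
    return value.replace("%0A", "\n").replace("%25", "%")


sample = "a b+c%d"

print(ax_escape(sample))     # what the draft's minimal escaping produces
print(quote_plus(sample))    # what a generic URL-encoder produces

# A receiver that only undoes the minimal escaping does not recover
# the original when the sender used the generic URL-encoder.
print(ax_unescape(quote_plus(sample)) == sample)
```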

This is why I wanted to know what others thought about this being a  
potential problem.


Another option would be to define an equally simple but not-so-popular
encoding, so that implementers are forced to write the required 5-line
encoding routines themselves; but the unfamiliarity of it would add to
the (perceived?) complexity of the spec.

I'm trying to find a good balance between simplicity, providing  
enough features, and avoiding deployment problems; so all feedback is  
highly appreciated!


Thanks,
Johnny



RE: attribute exchange value encoding

2007-05-25 Thread Drummond Reed
>Johnny Bufu wrote:
>
>While at IIW, I asked around what people thought about the encoding  
>mechanisms we've added recently, in order to allow for transferring  
>any data types. The consensus was that everyone would prefer  
>something simpler and lighter.
>
>So I've rewritten the encoding section, such that:
>
>- for strings, only the newline (and percent) characters are required  
>to be escaped,
>   (to comply with OpenID's data formats), using percent-encoding;
>
>- base64 must be used for encoding binary data, and defined
>   an additional field for this:
>   openid.ax.encoding.=base64
>
>Please review section 3.3 Attribute Values to see if there are any  
>issues.
>
>One remaining question is about the choice of encoding for strings.  
>Percent-encoding (RFC3986) seems the simplest from a spec  
>perspective, however some libraries provide (better) support for the  
>older URL-encoding (RFC1738), which throws '+' characters into the  
>mix. Which do you think would work best for implementers, users, and  
>would cause less interop problems?

Johnny,

FWIW, encoding "+" can be a hassle as it's a legal character in media type
values and is also sometimes used for spaces. I'd vote for pure RFC3986
percent-encoding.

=Drummond 



Re: attribute exchange value encoding

2007-05-24 Thread Johnny Bufu

On 24-May-07, at 5:15 PM, Johnny Bufu wrote:

> Please review section 3.3 Attribute Values to see if there are any
> issues.

Of course it helps if there's a link to click on... I missed it in  
the previous message:




Johnny



attribute exchange value encoding

2007-05-24 Thread Johnny Bufu
Hello list,

While at IIW, I asked around what people thought about the encoding  
mechanisms we've added recently, in order to allow for transferring  
any data types. The consensus was that everyone would prefer  
something simpler and lighter.

So I've rewritten the encoding section, such that:

- for strings, only the newline (and percent) characters are required  
to be escaped,
   (to comply with OpenID's data formats), using percent-encoding;

- base64 must be used for encoding binary data, and defined
   an additional field for this:
openid.ax.encoding.=base64
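
A minimal Python sketch of the string escaping described in the first item above, under the assumption that exactly two characters are escaped ('%' as %25 and newline as %0A). The function names are made up for the example and are not taken from the draft.

```python
def encode_ax_string(value):
    """Percent-encode only '%' and newline; '%' must be handled first."""
    return value.replace("%", "%25").replace("\n", "%0A")


def decode_ax_string(encoded):
    """Reverse the escaping; newline first so '%250A' decodes to '%0A'."""
    return encoded.replace("%0A", "\n").replace("%25", "%")


for original in ["100%pure", "line one\nline two", "literal %0A text"]:
    encoded = encode_ax_string(original)
    assert "\n" not in encoded
    assert decode_ax_string(encoded) == original
    print(repr(original), "->", repr(encoded))
```

The order of the two replacements matters in both directions: escaping '%' last, or unescaping it first, would corrupt values that contain a literal "%0A".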


Please review section 3.3 Attribute Values to see if there are any  
issues.


One remaining question is about the choice of encoding for strings.  
Percent-encoding (RFC3986) seems the simplest from a spec  
perspective, however some libraries provide (better) support for the  
older URL-encoding (RFC1738), which throws '+' characters into the  
mix. Which do you think would work best for implementers, users, and  
would cause fewer interop problems?


Johnny
