André,
(Marking OT because, well... just because).

On 1/22/2010 2:59 PM, Warnier wrote:
> Christopher Schultz wrote:
>> That "authorization.getBytes()" is just asking for trouble, because it
>> uses the platform default encoding to convert characters to bytes. It
>> should be using US-ASCII, ISO-8859-1, or something like that.
>
> -1
> I don't think you have a problem there, because what you are decoding
> into bytes there IS bytes (it is base64-encoded).

Maybe all character sets have bytes 0-127 the same as US-ASCII, but I
don't know about some of those I never see myself: Shift-JIS and all
those Asian encodings, etc. It would be better to be explicit.

>> It also calls the String constructor with a byte array without
>> specifying the encoding, therefore using the platform default.
>
> +1
> That is indeed where you have a problem. There you SHOULD always decode
> it as US-ASCII (or maybe iso-8859-1, I'm not quite sure what the spec
> says exactly).

From my reading, the spec is silent but one can draw the conclusion that
US-ASCII is basically all that is supported. I should add the capability
of configuring this encoding to override the (soon to be) default of
US-ASCII: if the user knows the client will use UTF-8, they should be
allowed to force that encoding to be used.

> Let's say that the spec is clear and says that the header value is
> *TEXT, and that *TEXT is always US-ASCII (or ISO-8859-1) by default.
>
> Let's take it from the browser side first.
> If the "userid:password" is indeed composed only of us-ascii characters,
> then the browser base64-encodes this directly and it is trivial.(*)
>
> But let's say that "userid:password" is something else than us-ascii.
> Another part of the spec says that then, you have to encode it according
> to RFC2047.

No, I don't think this is correct: the spec says that the HTTP header
values must be in US-ASCII, and may be encoded using RFC2047 in order to
achieve that.
Since Base64 encoding always results in a US-ASCII-compatible value,
there is no reason to involve RFC2047.

> My contention is then that the browser should first RFC2047-encode
> "userid:password", and then base64-encode the result.

While that sounds like a good idea, it's almost certainly never done
that way.

> Back on the server side.
> The server base64-decodes the authorization token, into an ascii string.
> It can do that always, because either the string was ascii to start
> with, or else it was not, but then it has been RFC2047-encoded, yielding
> a result that is ascii.
> (like : =?iso-8859-2?B?....base64-encoded stuff...?= )

This would be a decent configurable setting for a BASIC authenticator...
something like "allow-rfc2047" or whatever. What about those people who
really want to have a username like "=?whatever" and a password like
"whatever?="? They can't log in? :)

> The above, I believe, would be totally consistent with the current RFCs.

Yes, but for whatever reason, nobody ever fully implements the RFCs :)
There are standards and there are practices. In this case, I think
practices outweigh the standards :)

> But there is a major catch : I don't believe that there is a browser on
> the market today, which "properly" encodes the "userid:password" string
> via rfc2047 when it isn't ascii.

Nor would it be appropriate to do so, because base64 encoding is /always/
used and will therefore /always/ result in a valid HTTP Authorization
header value.

-chris
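
P.S. To make the "be explicit" point concrete, here is a minimal sketch of
decoding a BASIC credential without ever touching the platform default
encoding. This is not the authenticator's actual code: the class name
BasicCredentials, the method decode, and the credentialCharset parameter
are all made up for illustration, and java.util.Base64 assumes Java 8 or
later. The Base64 text is converted to bytes as US-ASCII, and the decoded
bytes are turned into a String using whatever encoding has been
configured.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, for illustration only.
final class BasicCredentials {

    private static final String PREFIX = "Basic ";

    /**
     * Splits the value of an "Authorization: Basic ..." header into
     * { userid, password }.
     *
     * @param authorization     the raw header value
     * @param credentialCharset encoding assumed for the decoded bytes
     *                          (US-ASCII by default, UTF-8 if configured)
     */
    static String[] decode(String authorization, Charset credentialCharset) {
        if (authorization == null
                || !authorization.regionMatches(true, 0, PREFIX, 0, PREFIX.length())) {
            throw new IllegalArgumentException("not a Basic authorization header");
        }
        String token = authorization.substring(PREFIX.length()).trim();

        // Base64 text is pure US-ASCII, so name that charset explicitly
        // instead of calling the no-argument getBytes().
        byte[] encoded = token.getBytes(StandardCharsets.US_ASCII);
        byte[] decoded = Base64.getDecoder().decode(encoded);

        // Interpret the decoded bytes with the configured charset rather
        // than the platform default.
        String pair = new String(decoded, credentialCharset);

        // The userid cannot contain ':', so split on the first one.
        int colon = pair.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("missing ':' separator");
        }
        return new String[] { pair.substring(0, colon), pair.substring(colon + 1) };
    }
}
```

A caller that knows its clients send UTF-8 would pass
StandardCharsets.UTF_8 as the second argument; everyone else would pass
StandardCharsets.US_ASCII. Any RFC2047 handling would be a separate,
optional step on the decoded string and is deliberately left out here.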