Geir H. Pettersen wrote:
Hi,

I have been using Commons HttpClient successfully since rc1. Great work,
guys! This is the best client that I have ever used in Java.

Thanks for the flowers, Geir!


The httpclient.wire log is fantastic, but there is something there that
bothers me a bit. Before the characters are written to the log, they are
decoded with "US-ASCII" (or the default character set if that fails).

The problem with this is that if you try to debug HTTP traces containing
special characters, you lose information about what is actually sent.

My example is: I am having some encoding problems with my client, and I want
to check exactly what bytes I am sending. I am sending the three Norwegian
characters (1) æ, (2) ø and (3) å (ae together, o with a slash, and a with a
ring over it).

(1)     is encoded in UTF-8 as 0xc3 0xa6
(2)     is encoded in UTF-8 as 0xc3 0xb8
(3)     is encoded in UTF-8 as 0xc3 0xa5

The problem is when I try to POST these three characters. It is logged as:
"[0xfffd][0xfffd][0xfffd][0xfffd][0xfffd][0xfffd]" (but actually that is not
what is sent)

I want the log to be: "[0xc3][0xa6][0xc3][0xb8][0xc3][0xa5]"
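
For illustration, here is a minimal Java sketch of the effect described above. It is not the HttpClient wire logger itself: the class and method names (WireDecodeDemo, asciiLogView, rawByteView) are hypothetical, and the "[0x..]" formatting only approximates the log excerpt. It shows that decoding UTF-8 bytes as US-ASCII replaces every byte above 0x7f with U+FFFD, while formatting the raw bytes keeps the information.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Commons HttpClient.
class WireDecodeDemo {

    // Decode as US-ASCII first, then print non-printable chars as [0x..].
    // Bytes above 0x7f are not valid US-ASCII, so the String constructor
    // substitutes the replacement character U+FFFD for each of them.
    static String asciiLogView(byte[] data) {
        String decoded = new String(data, StandardCharsets.US_ASCII);
        StringBuilder sb = new StringBuilder();
        for (char c : decoded.toCharArray()) {
            if (c >= 32 && c < 127) {
                sb.append(c);
            } else {
                sb.append("[0x").append(Integer.toHexString(c)).append("]");
            }
        }
        return sb.toString();
    }

    // Format the bytes directly, without any character decoding.
    static String rawByteView(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            sb.append(String.format("[0x%02x]", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The three characters ae, o-slash and a-ring, encoded as UTF-8.
        byte[] body = "\u00e6\u00f8\u00e5".getBytes(StandardCharsets.UTF_8);
        System.out.println(asciiLogView(body));
        System.out.println(rawByteView(body));
    }
}
```

The first line printed corresponds to what the log currently shows, the second to the output requested above.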

I think just using the default platform encoding is as bad as using US-ASCII, because it does not reflect what is *really* sent over the wire. Ideally, the wire log should provide all bytes in hexadecimal representation along with their supposed interpretation as characters in some encoding (such as the default platform encoding). A presentation similar to hexdumps would be nice IMHO. However, this might imply a small amount of buffering (16 bytes per line) to make a nice layout.


Sample output (hex values do not match text):

DE AD BE EF  DE AD BE EF  DE AD BE EF  DE AD BE EF  GET /index.htm H
DE AD BE EF  DE AD BE EF  DE AD BE EF  DE AD BE EF  TTP/1.1..Host:ww
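
A layout like the sample above could be produced roughly as follows. This is only a sketch under stated assumptions, not a proposed patch: the class name HexDumpSketch and the method dump are hypothetical, the whole buffer is assumed to be available at once (a real wire log would need the small per-line buffering mentioned above), and bytes outside printable ASCII are shown as '.' in the text column.

```java
// Hypothetical helper; not part of Commons HttpClient.
class HexDumpSketch {

    private static final int BYTES_PER_LINE = 16;

    // Renders data as lines of 16 hex bytes (in groups of four),
    // followed by the same bytes as printable ASCII or '.'.
    static String dump(byte[] data) {
        StringBuilder out = new StringBuilder();
        for (int offset = 0; offset < data.length; offset += BYTES_PER_LINE) {
            StringBuilder hex = new StringBuilder();
            StringBuilder text = new StringBuilder();
            for (int i = 0; i < BYTES_PER_LINE; i++) {
                if (i > 0) {
                    // Two spaces between groups of four, one space otherwise.
                    hex.append(i % 4 == 0 ? "  " : " ");
                }
                int pos = offset + i;
                if (pos < data.length) {
                    int b = data[pos] & 0xff;
                    hex.append(String.format("%02X", b));
                    text.append(b >= 0x20 && b < 0x7f ? (char) b : '.');
                } else {
                    // Pad a short last line so the text column stays aligned.
                    hex.append("  ");
                }
            }
            out.append(hex).append("  ").append(text).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        byte[] request = "GET /index.htm HTTP/1.1\r\nHost:ww"
                .getBytes(java.nio.charset.StandardCharsets.US_ASCII);
        System.out.print(dump(request));
    }
}
```

With this layout the hex column always shows the bytes exactly as sent, so the choice of encoding only affects the text column on the right.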

