[Hessian-interest] String encoding

Ben Hood Sat, 24 Nov 2007 06:41:16 -0800

Hi,

I have a question about the length encoding for UTF-8.


The string "\u00c3" is represented as xc3 x83 in UTF-8.

According to the spec, this should be encoded as x01 xc3 x83.

So it would seem that the length refers to the number of characters in
the native encoding, not the number of UTF-8 bytes.

But wouldn't it be more practical for a parser to know the length of  
the UTF-8 payload, i.e. x02 xc3 x83?

Wouldn't that be more consistent with strings whose characters are all
1 byte in UTF-8, e.g. x05 hello?
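
To make the two options concrete, here is a minimal Python sketch of both
length conventions. It assumes the single-byte length prefix used in the
examples above and characters in the Basic Multilingual Plane; the function
names encode_char_count and encode_byte_count are made up for illustration
and are not part of any Hessian library.

```python
def _prefix(count: int) -> bytes:
    # Assumption: the length fits in the one-byte prefix shown in the examples.
    if count > 0xFF:
        raise ValueError("length does not fit in a single-byte prefix")
    return bytes([count])


def encode_char_count(s: str) -> bytes:
    """Prefix is the number of characters (my reading of the spec)."""
    # len(s) counts code points, which equals the character count
    # only for BMP characters (an assumption of this sketch).
    return _prefix(len(s)) + s.encode("utf-8")


def encode_byte_count(s: str) -> bytes:
    """Prefix is the number of bytes in the UTF-8 payload (the alternative)."""
    payload = s.encode("utf-8")
    return _prefix(len(payload)) + payload


if __name__ == "__main__":
    for text in ("\u00c3", "hello"):
        print(encode_char_count(text).hex(" "), "|", encode_byte_count(text).hex(" "))
```

For "\u00c3" the two functions differ only in the prefix (x01 versus x02),
while for an all-ASCII string such as "hello" they produce identical bytes.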

Thx,

Ben 


