Title: RE: Roundtripping Solved

Arcane Jill wrote:
> >> #    for all possible octet sequences s:
> >> #        length of (UTF-8(f(s))) <= length of s,
>
> >No, that is not the requirement. It is:
> >bytelength(f(s)) <= 2*bytelength(s)
>
> You haven't understood. By definition, s is an octet stream, and f(s) is a
> Unicode character stream - and therefore "bytelength(f(s))" is completely
> meaningless.

Sorry. My fault. How about:
bytelength(UTF-16(f(s))) <= 2*bytelength(s)
and
bytelength(UTF-32(f(s))) <= 4*bytelength(s)
?

And it is:
bytelength(UTF-8(f(s))) <= 3*bytelength(s)
right?
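
A minimal sketch to check these three bounds, assuming f decodes valid UTF-8 normally and maps each byte that is not part of a valid sequence to exactly one BMP code point. The base U+E000 and the handler name "bmp_escape" are placeholders for illustration only, not the code points actually proposed, and the sketch ignores the question of inputs that already contain those code points.

```python
import codecs

# Hypothetical: a BMP private-use block standing in for the 128 escape code points.
ESCAPE_BASE = 0xE000


def _escape_invalid(exc):
    """Map every byte of an undecodable run to one BMP code point."""
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    # One code point per offending byte, then resume decoding after the run.
    return "".join(chr(ESCAPE_BASE + b) for b in bad), exc.end


codecs.register_error("bmp_escape", _escape_invalid)


def f(s: bytes) -> str:
    """Octet stream -> Unicode character stream (sketch of the mapping f)."""
    return s.decode("utf-8", errors="bmp_escape")


def bounds_hold(s: bytes) -> bool:
    """Check the UTF-16, UTF-32 and UTF-8 length bounds for one input."""
    u = f(s)
    n = len(s)
    return (
        len(u.encode("utf-16-le")) <= 2 * n      # "-le" avoids counting a BOM
        and len(u.encode("utf-32-le")) <= 4 * n
        and len(u.encode("utf-8")) <= 3 * n
    )
```

Under that assumption the worst case for UTF-8 is a lone invalid byte: one octet in, one BMP code point out, three octets when re-encoded. For UTF-16 and UTF-32 the bound is already reached by plain ASCII.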

A factor of three is not very good, but mostly I can get away without that conversion: I simply keep s as-is. Which is, BTW, what Unicoders fear most, but often do themselves.


Lars
