> I am almost sure that it should be made all-the-way around: the client can
> request a specific encoding to the server: See RFC 2616 section 14.2 page
> 102: the Accept-Charset header.

Or an _ordered list_ of those as input. See also Accept-Language while you
are at it, and the Accept: type as well; they are all dimensions of the
same problem. And they are not orthogonal: there is an easy semantic
coupling between language and charset, and the Accept list may prompt
you to send a gif or pdf in some cases.
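
To make the "ordered list" concrete, here is a rough sketch of turning one of
these headers into a preference-ordered list. The function name
parse_quality_list is made up for illustration, and it is deliberately
simplified: it ignores media-type parameters other than q, and it does not
apply the RFC 2616 special case that treats ISO-8859-1 as acceptable when
Accept-Charset does not mention it.

```python
def parse_quality_list(header_value):
    """Return the header's items ordered by descending q-value.

    Hypothetical helper, not taken from any server code base.
    Items with equal q keep the order in which the client sent them;
    items with q=0 (explicitly refused) are dropped.
    """
    items = []
    for position, part in enumerate(header_value.split(",")):
        part = part.strip()
        if not part:
            continue
        name, *params = [p.strip() for p in part.split(";")]
        q = 1.0  # default quality when no q parameter is given
        for param in params:
            key, _, val = param.partition("=")
            if key.strip().lower() == "q":
                try:
                    q = float(val)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        if q > 0:
            # Negate q so that a plain ascending sort puts the best first.
            items.append((-q, position, name.lower()))
    return [name for _, _, name in sorted(items)]


charsets = parse_quality_list("utf-8;q=0.7, iso-8859-1, *;q=0.1")
languages = parse_quality_list("da, en-gb;q=0.8, en;q=0.7")
```

The same function works for Accept-Charset and Accept-Language alike, which
is rather the point: the negotiation logic is one problem with several
dimensions.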

> On another thought... The cache should store unicode characters "as is", not
> bytes, as those might change for the same request URL depending on the
> different headers in the request...

You'd have to track which Accept, Accept-Language and Accept-Charset you
negotiated on, since applications may (also) do i18n and localization
optimizations, such as swapping ',' into '.', abusing charsets, or doing
locale-specific normalizations of the Unicode text.
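
As a minimal sketch of what that tracking could look like, assuming a plain
in-memory dict as the cache: the key carries the negotiated media type,
language and charset next to the URL, so two variants of the same URL never
overwrite each other. The names cache_key, store and lookup are invented for
this example.

```python
def cache_key(url, media_type, language, charset):
    """Build a cache key from the URL plus the *negotiated* values.

    Hypothetical helper. The three negotiated values are lower-cased
    because media types, language tags and charset names are all
    case-insensitive; the URL is left untouched.
    """
    return (url, media_type.lower(), language.lower(), charset.lower())


cache = {}  # assumption: a simple dict stands in for the real cache


def store(url, media_type, language, charset, body):
    cache[cache_key(url, media_type, language, charset)] = body


def lookup(url, media_type, language, charset):
    # Returns None when this particular variant has not been cached yet.
    return cache.get(cache_key(url, media_type, language, charset))


store("/report", "text/html", "en", "UTF-8", "Total: 1,234.50")
store("/report", "text/html", "da", "ISO-8859-1", "Total: 1.234,50")
```

Note that the two entries differ in more than their byte encoding: the
number formatting changed with the language, which is why storing "the
Unicode text" for a URL alone is not enough.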

Dw.
