Vadim Gritsenko wrote:
Stefano Mazzocchi wrote:

Cocoon is heavily internationalized, but we fail to do one thing: signal the proper encoding to the user agent through HTTP headers, which is the most reliable way of doing it.

the current *hack* is to use <meta> tags in the HTML stream,

Ew!

I know.


these are interpreted by the HTTP server stack and transferred as HTTP headers. but this creates many problems and mixes concerns.

Vadim suggested to set the headers from the serializers, but I think there is a better alternative.

So I propose to add the method

getEncoding()

to the interface

org.apache.cocoon.sitemap



Why sitemap would ever know anything about encoding?

Ok, good question.


There are two parts to the encoding problem: decoding incoming request and encoding outgoing response.

Right.


Request encoding can be set by SetCharacterEncodingAction or by anything else via request.setCharacterEncoding() method. Or, every request parameter can be decoded independently. Response encoding directly depends on the encoding parameter set to the serializer from the sitemap.

yes.


And all of these are totally independent of internationalization. Internationalization affects the language used to produce the output, but not how the text in this language is encoded (UTF-8, UTF-16, ISO-8859-1, what-have-you).

true. but you can't have Chinese text in US-ASCII, right? my point is that having globally-balanced hooks for encoding will allow Cocoon to be even more friendly for non-latin-charset needs. I used i18n out of context here, sorry.
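
To make the US-ASCII point concrete, here is a minimal Java sketch. The class and method names (CharsetCheck, canEncode, encodedLength) are made up for illustration and are not part of Cocoon; only the java.nio.charset calls are standard. It checks whether a string can be represented in a given charset, and shows that the same character takes a different number of bytes depending on the encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Cocoon.
class CharsetCheck {

    // True if every character of the text can be represented in the named charset.
    static boolean canEncode(String charsetName, String text) {
        return Charset.forName(charsetName).newEncoder().canEncode(text);
    }

    // Number of bytes the text occupies once encoded in the given charset.
    static int encodedLength(Charset charset, String text) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // Two CJK characters, written as escapes so the source file stays ASCII.
        String chinese = "\u4e2d\u6587";
        System.out.println("US-ASCII can encode: " + canEncode("US-ASCII", chinese));
        System.out.println("UTF-8 can encode:    " + canEncode("UTF-8", chinese));

        // The same character (e with acute accent) in two encodings.
        String eAcute = "\u00e9";
        System.out.println("ISO-8859-1 bytes: " + encodedLength(StandardCharsets.ISO_8859_1, eAcute));
        System.out.println("UTF-8 bytes:      " + encodedLength(StandardCharsets.UTF_8, eAcute));
    }
}
```

The language of the text is unchanged in every case; only the byte representation differs, which is why the user agent has to be told which encoding was used.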


So, if you put encoding into the sitemap... you will have to disable serializer configuration and request configuration and force the sitemap encoding onto request / response. Is this what you are proposing?

nonononononooo


please read my proposal again; I think it's pretty clear.

If yes... IMHO, it makes more sense to make this a parameter of the pipeline, not of the whole sitemap.

I didn't propose that, where did you get that impression?


But I am not convinced that it's sitemap's responsibility to worry about encoding (from SoC POV).

I restate:


1) I want a way for serializers to indicate to the pipeline what is the encoding they will be using, so that the pipeline can set the right HTTP header for it.

2) also, I want a way to override the sitemap-wide behavior of each serializer locally, such as

<map:serialize encoding="UTF-8"/>

when the global serializer configurations state they will be using something else.
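
The two points above could be sketched roughly as follows. This is only an illustration under assumptions: EncodingAware, ContentTypeHeader, resolveEncoding and build are hypothetical names, not Cocoon's actual interfaces. The serializer reports its encoding through getEncoding(), a local encoding attribute (if present) wins over the serializer's configured value, and the pipeline turns the result into a Content-Type header value.

```java
// Hypothetical names throughout; this is a sketch, not Cocoon's API.
interface EncodingAware {
    // MIME type the serializer produces, e.g. "text/html".
    String getMimeType();

    // Encoding from the serializer's sitemap-wide configuration, or null if unknown.
    String getEncoding();
}

class ContentTypeHeader {

    // A local override (e.g. from an encoding attribute on map:serialize)
    // takes precedence over the serializer's configured encoding.
    static String resolveEncoding(String localOverride, EncodingAware serializer) {
        if (localOverride != null && !localOverride.isEmpty()) {
            return localOverride;
        }
        return serializer.getEncoding();
    }

    // Builds the Content-Type header value the pipeline would send.
    // If no encoding is known, the charset parameter is left out.
    static String build(String localOverride, EncodingAware serializer) {
        String encoding = resolveEncoding(localOverride, serializer);
        String mimeType = serializer.getMimeType();
        return encoding == null ? mimeType : mimeType + "; charset=" + encoding;
    }

    public static void main(String[] args) {
        // Stand-in for a serializer configured sitemap-wide with ISO-8859-1.
        EncodingAware html = new EncodingAware() {
            public String getMimeType() { return "text/html"; }
            public String getEncoding() { return "ISO-8859-1"; }
        };
        System.out.println(build(null, html));    // no local override
        System.out.println(build("UTF-8", html)); // local override present
    }
}
```

With this shape the serializer never touches HTTP headers itself; it only answers the question "which encoding will you use?", and the pipeline decides what to do with the answer.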

Is the proposal clear enough?

Stefano.
