On 16/3/03 20:04, "Stefano Mazzocchi" <[EMAIL PROTECTED]> wrote:
So, if you want to put encoding into the sitemap... you will have to disable serializer configuration and request configuration, and force the sitemap encoding onto request / response. Is this what you are proposing?
nonononononooo
Please read my proposal again; I think it's pretty clear.
Stefano, I believe your proposal got to the list chopped up big time, because what Vadim quoted is _ALL_ I've got as well, and really I don't understand what you want to do.
Uh, then sorry.
[big snip on well detailed encoding things]
To rewrite what he said with the above mentioned three-layer encoding in mind:
- the servlet container/mail engine/whatever will take care of the "Transfer Encoding" (Cocoon as an application should not care nor interfere with it).
Right.
- ALL serializers should have the ability to deal with "Content Encoding", unless (and that would be my preferred option, as 90% of the time we think about deploying things over servlets) we want to "recommend" the use of "servlet filters" to do things such as GZIP encoding of the content.
In the past, I've been suggesting that people go down the servlet filter path, but I'm more and more inclined to think that servlet filters are totally useless crap that can possibly work only for a few things and are overdesigned for what they can do.
So, I'm all in favor to provide internal alternatives.
You suggest adding a property to the serializer, but I think this is *NOT* a serializer's concern, but a higher-level concern.
What about adding a 'content encoding' attribute to the 'pipeline' instead?
A pipeline provides a context of processing behavior. I think it fits perfectly with what we need and we don't even have to modify the serializers because all the stuff will be done by the pipeline engine that assembles the pipelines and creates the final response.
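A minimal sketch of that idea, assuming the pipeline engine reads a hypothetical 'content encoding' value from the pipeline and wraps the response stream before handing it to the serializer. The class and method names below are invented for illustration and are not Cocoon API; only "gzip" and "identity" are handled here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of Cocoon: the pipeline engine picks the
// content encoding, so the serializer only ever sees a plain OutputStream.
final class PipelineContentEncoding {
    private PipelineContentEncoding() {}

    // contentEncoding is the value of the assumed pipeline attribute (may be null).
    static OutputStream wrap(OutputStream response, String contentEncoding) throws IOException {
        if (contentEncoding == null || contentEncoding.equalsIgnoreCase("identity")) {
            return response; // no content encoding requested
        }
        if (contentEncoding.equalsIgnoreCase("gzip")) {
            return new GZIPOutputStream(response);
        }
        throw new IllegalArgumentException("Unsupported content encoding: " + contentEncoding);
    }

    // Demo convenience: pushes already-serialized bytes through wrap().
    static byte[] encode(byte[] serialized, String contentEncoding) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = wrap(sink, contentEncoding)) {
            out.write(serialized);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray(); // read after close so the GZIP trailer is included
    }

    // Demo convenience: reverses the gzip encoding to check the round trip.
    static byte[] gunzip(byte[] data) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] body = "<html><body>hello</body></html>".getBytes(StandardCharsets.UTF_8);
        byte[] zipped = encode(body, "gzip");
        System.out.println(new String(gunzip(zipped), StandardCharsets.UTF_8));
    }
}
```

The point of the sketch is the separation: the serializer code never mentions gzip, so no serializer would need to change if the engine did the wrapping.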
- TEXT-based serializers should think about "charset encoding" and are the only ones which should do that.
Right.
So, in my opinion, the "best" way to tackle the charset-encoding problem is to have org.apache.cocoon.serialization.AbstractTextSerializer receive an OutputStream through its implementation of the SitemapOutputComponent interface, but expose to its concrete implementations another couple of methods, instead of "getOutputStream":
- String getCharsetEncoding() [or getCharacterEncoding]:
Returns the default character encoding configured for the specified
AbstractTextSerializer (or the default one for the sitemap if none
was specified).
This can be useful (for example) in the HtmlSerializer so that a new
<meta http-equiv="Content-Type" content="text/html; charset=???"/>
tag can be added automagically to the output, or to the "XMLSerializer"
so that the "<?xml version="1.0" encoding="???"?>" initial processing
instruction can be constructed appropriately.
- Writer getWriter():
Returns a java.io.Writer encoding character data to the response output stream according to whatever is returned by getCharsetEncoding().
Sounds good to me.
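The two accessors could look roughly like the sketch below. TextSerializerSketch and XmlSerializerSketch are hypothetical stand-ins written for this example, not the real AbstractTextSerializer or XMLSerializer, and the UTF-8 default is an assumption.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the abstract text serializer: it keeps the raw
// OutputStream private and exposes only the charset name and a Writer.
abstract class TextSerializerSketch {
    private OutputStream output;
    private Charset charset = StandardCharsets.UTF_8; // assumed default
    private Writer writer;

    public void setOutputStream(OutputStream out) {
        this.output = out;
        this.writer = null; // force a fresh Writer for the new stream
    }

    public void setCharsetEncoding(String name) {
        this.charset = Charset.forName(name); // fails fast on unknown names
        this.writer = null;
    }

    protected String getCharsetEncoding() {
        return charset.name();
    }

    protected Writer getWriter() {
        if (output == null) {
            throw new IllegalStateException("no output stream set");
        }
        if (writer == null) {
            writer = new OutputStreamWriter(output, charset);
        }
        return writer;
    }
}

// Hypothetical concrete serializer: builds its XML declaration from
// getCharsetEncoding() and writes everything through getWriter().
final class XmlSerializerSketch extends TextSerializerSketch {
    public void write(String markup) {
        try {
            Writer w = getWriter();
            w.write("<?xml version=\"1.0\" encoding=\"" + getCharsetEncoding() + "\"?>\n");
            w.write(markup);
            w.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        XmlSerializerSketch serializer = new XmlSerializerSketch();
        serializer.setOutputStream(sink);
        serializer.setCharsetEncoding("ISO-8859-1");
        serializer.write("<p>caf\u00e9</p>");
        System.out.println(new String(sink.toByteArray(), StandardCharsets.ISO_8859_1));
    }
}
```

Because the declared encoding and the Writer both come from the same field, the declaration and the actual bytes cannot disagree, which is the main benefit of hiding the raw stream from concrete serializers.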
Those two should be controlled from the sitemap by (as you, Stefano, said):
2) also, I want a way to override the sitemap-wide behavior of every single serializer, locally, such as
<map:serialize encoding="UTF-8"/>
The only "nitpick" I have is that since "encoding" means a lot of things, this should be called "charset" (which is way more specific)...
very good point, I agree.
This can be easily picked up by the AbstractTextSerializer.configure() method and returned by the two methods added above...
Right.
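As a rough sketch of that lookup, assuming the attribute ends up being named "charset" as agreed above: the local value on the serialize element wins, otherwise the sitemap-wide default applies. A plain Map stands in for the real configuration object here, and CharsetResolution is a hypothetical name, not existing Cocoon code.

```java
import java.nio.charset.Charset;
import java.util.Collections;
import java.util.Map;

// Hypothetical helper showing the override rule only; the real configure()
// method would read the attribute from its own configuration object.
final class CharsetResolution {
    private CharsetResolution() {}

    // serializeAttributes: attributes of the local serialize element (assumed shape).
    // sitemapDefault: the sitemap-wide charset for this serializer.
    static String resolve(Map<String, String> serializeAttributes, String sitemapDefault) {
        String local = serializeAttributes.get("charset");
        String chosen = (local != null && !local.trim().isEmpty()) ? local.trim() : sitemapDefault;
        // Charset.forName validates the name and lets us return the canonical form.
        return Charset.forName(chosen).name();
    }

    public static void main(String[] args) {
        Map<String, String> local = Collections.singletonMap("charset", "utf-8");
        Map<String, String> none = Collections.emptyMap();
        System.out.println(resolve(local, "ISO-8859-1"));
        System.out.println(resolve(none, "ISO-8859-1"));
    }
}
```

Validating the name at configuration time means a typo in the sitemap surfaces when the serializer is set up rather than in the middle of a response.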
I can work on a patch if you guys want... It's pretty trivial indeed...
Cool.
Stefano.