On Wed, 2008-03-19 at 21:50 -0700, Mark Diggory wrote:
> On Mar 19, 2008, at 8:44 PM, Conal Tuohy wrote:
> 
> >
> > On Wed, 2008-03-19 at 18:52 -0700, Mark Diggory wrote:
> >> On Mar 19, 2008, at 5:30 PM, Conal Tuohy wrote:
> >> If it's a question of it being well-formed XHTML coming out of the
> >> template... A simple Java class can be created to assist in this area
> >> using JTidy to process the string containing the HTML prior to
> >> processing it into the output stream.
> >
> > You mean using JTidy as an XSLT extension function, passing a string and
> > returning a DOM back into the XSLT? That could work. Or JTidy could be
> > packaged into a Cocoon transformer of its own.
> 
> The problem is that anything in the pipeline (generator or transformer)
> has to produce well-formed XML. So a transformer that takes poorly formed
> XML would never be reached, because the SAX errors happen between it and
> the previous component.

I was thinking of a transformer that would consume a SAX stream
containing some nodes which contained quoted HTML, and would unquote
them and parse the result as HTML. So it would be well-formed (text)
before, and well-formed (XHTML) after.
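
A minimal sketch of that idea, using only the JDK's SAX classes rather than Cocoon's transformer API. The class name UnquoteFilter, the choice of <description> as the element holding quoted markup, and the <div> wrapper are all assumptions made for illustration. The sketch also assumes the quoted markup is already well-formed once unquoted; real-world HTML would first need a clean-up pass (JTidy, for instance) before the re-parse step.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter: re-parses the text content of <description> as markup.
// Assumes <description> contains only text (the quoted markup), no child elements.
class UnquoteFilter extends XMLFilterImpl {
    private StringBuilder quoted; // non-null only while inside <description>

    UnquoteFilter(XMLReader parent) {
        super(parent);
    }

    static XMLReader newReader() throws SAXException {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newSAXParser().getXMLReader();
        } catch (ParserConfigurationException e) {
            throw new SAXException(e);
        }
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts)
            throws SAXException {
        super.startElement(uri, local, qName, atts);
        if ("description".equals(qName)) {
            quoted = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (quoted != null) {
            quoted.append(ch, start, length); // hold the text back instead of passing it on
        } else {
            super.characters(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        if (quoted != null && "description".equals(qName)) {
            // The parser has already unescaped the text; wrap it so it has one root.
            String fragment = "<div>" + quoted + "</div>";
            quoted = null;
            // Forward the fragment's events downstream, minus the document-level ones.
            XMLFilterImpl forwarder = new XMLFilterImpl() {
                @Override public void startDocument() { }
                @Override public void endDocument() { }
                @Override public void setDocumentLocator(Locator locator) { }
            };
            forwarder.setContentHandler(getContentHandler());
            XMLReader inner = newReader();
            inner.setContentHandler(forwarder);
            try {
                inner.parse(new InputSource(new StringReader(fragment)));
            } catch (IOException e) {
                throw new SAXException(e);
            }
        }
        super.endElement(uri, local, qName);
    }

    // Runs the filter and records the element names seen downstream of it.
    static List<String> elementNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            UnquoteFilter filter = new UnquoteFilter(newReader());
            filter.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String local, String qName, Attributes atts) {
                    names.add(qName);
                }
            });
            filter.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return names;
    }
}
```

Feeding it <item><description>&lt;p&gt;Hi &lt;b&gt;there&lt;/b&gt;&lt;/p&gt;</description></item> should produce the downstream element sequence item, description, div, p, b: the stream is well-formed on both sides of the filter, and the quoted markup only becomes elements after it.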

> However, a Generator that cleaned up RSS on the way through would be a
> nice feature... And if it were triggered/configured from the URI of the
> resource being called to get the RSS, even better... then just an XPath
> "document('http://foo.com/feed.rss')" or
> "document('${dspace.dir}/config/news.rss')" could activate it... see
> 
> http://cocoon.apache.org/2.0/userdocs/generators/html-generator.html

Better not to invoke it from within XSLT though - it could be included
using the XInclude transformer referring to a Cocoon pipeline with the
"cocoon:" protocol.

There are known issues with Cocoon's caching when using the XPath
"document()" function in XSLT (I don't think these have been fixed?).
Basically the cache validity of the result is based on the resources
mentioned in the pipeline definition in the sitemap, and doesn't take
account of resources invoked from within the XSLT itself. So the RSS could
change without invalidating the cache entry of the resulting page, and
you could end up serving stale content until some other part of the
document changed, and the cache was invalidated.
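
The stale-content effect can be illustrated with a toy model. This is only a sketch of the behaviour described above, not Cocoon's cache implementation: the class PipelineCacheSketch, the resource names, and the integer "last modified" stamps are all invented for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of the caching behaviour; none of these names are Cocoon classes.
class PipelineCacheSketch {
    // Stand-in for the outside world: resource name -> last-modified stamp.
    final Map<String, Long> lastModified = new HashMap<>();

    private String cachedPage;
    private List<Long> cachedValidity;

    // Validity is computed only from the sources named in the pipeline definition.
    private List<Long> validity(List<String> declared) {
        List<Long> stamps = new ArrayList<>();
        for (String name : declared) {
            stamps.add(lastModified.get(name));
        }
        return stamps;
    }

    // Rendering reads everything, including resources pulled in via document().
    private String render(List<String> declared, List<String> viaDocument) {
        List<String> parts = new ArrayList<>();
        for (String name : declared) {
            parts.add(name + "@" + lastModified.get(name));
        }
        for (String name : viaDocument) {
            parts.add(name + "@" + lastModified.get(name));
        }
        return String.join(" ", parts);
    }

    String serve(List<String> declared, List<String> viaDocument) {
        List<Long> current = validity(declared);
        if (cachedPage == null || !current.equals(cachedValidity)) {
            cachedPage = render(declared, viaDocument);
            cachedValidity = current;
        }
        return cachedPage;
    }

    static List<String> demo() {
        PipelineCacheSketch cache = new PipelineCacheSketch();
        List<String> declared = List.of("page.xml", "page.xsl");
        List<String> viaDocument = List.of("feed.rss"); // invisible to the validity check
        cache.lastModified.put("page.xml", 1L);
        cache.lastModified.put("page.xsl", 1L);
        cache.lastModified.put("feed.rss", 1L);

        List<String> served = new ArrayList<>();
        served.add(cache.serve(declared, viaDocument)); // first render
        cache.lastModified.put("feed.rss", 2L);         // the feed changes...
        served.add(cache.serve(declared, viaDocument)); // ...but the old page is served
        cache.lastModified.put("page.xml", 2L);         // a declared source changes
        served.add(cache.serve(declared, viaDocument)); // now the new feed shows up
        return served;
    }
}
```

In this model the second response is identical to the first even though feed.rss has changed; the newer feed only appears in the third response, once page.xml changes and invalidates the entry. Pulling the feed in through a pipeline step named in the sitemap would correspond to moving "feed.rss" into the declared list.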

-- 
Conal Tuohy
New Zealand Electronic Text Centre
www.nzetc.org


_______________________________________________
DSpace-tech mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-tech