Hi José,

I ran some tests. You were right to try adding
<omit-xml-declaration>true</omit-xml-declaration>
but you probably added it to the wrong serializer. The correct one would be "xml"; see
<map:serializers default="xml">

This feature is documented here:
http://cocoon.apache.org/2.1/faq/faq-serializers.html

I tried it with every serializer and also cleared the Cocoon cache,
but I couldn't get it to work either. Curious.

How can we work around that?

First of all, let me ask if you really need to scrape HTML. That
sounds ugly in any case. Doesn't Scirus support harvesting content via
OAI-PMH? DSpace does have an OAI-PMH interface, which was even optimized
for speed for the upcoming DSpace 3.0. According to this whitepaper, it
seems they do support OAI:
http://www.scirus.com/press/pdf/WhitePaper_Scirus.pdf
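
To give an idea of what harvesting looks like from the client side, here
is a minimal Python sketch that builds an OAI-PMH ListRecords request URL.
The verb and the oai_dc metadata prefix come from the OAI-PMH protocol
itself; the base URL and the function name are assumptions made up for
illustration, so substitute the OAI endpoint of your own DSpace instance.

```python
from urllib.parse import urlencode

# Hypothetical endpoint (an assumption for illustration only);
# use the OAI-PMH base URL of your own repository.
BASE_URL = "http://dspace.example.org/oai/request"


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL.

    from_date (YYYY-MM-DD) and set_spec are optional selective-harvesting
    arguments defined by the protocol.
    """
    params = [("verb", "ListRecords"), ("metadataPrefix", metadata_prefix)]
    if from_date:
        params.append(("from", from_date))
    if set_spec:
        params.append(("set", set_spec))
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Full harvest, then an incremental one limited by date.
    print(list_records_url(BASE_URL))
    print(list_records_url(BASE_URL, from_date="2012-10-01"))
```

A harvester would fetch that URL and keep following the resumptionToken
in each response until the repository stops returning one.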

If you really do want to scrape HTML, you could put an Apache
frontend in front of Tomcat and remove the XML declaration using
mod_substitute. But that is just another hack.
http://httpd.apache.org/docs/2.2/mod/mod_substitute.html
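
mod_substitute itself is configured with Apache directives, not Python,
but the substitution it would have to perform is easy to sketch. The
regular expression and the function name below are my own assumptions,
not something taken from DSpace or Apache; treat this only as an
illustration of the pattern you would need to match.

```python
import re

# Matches an XML declaration at the very start of the document, e.g.
# <?xml version="1.0" encoding="UTF-8"?>, plus surrounding whitespace.
# The \s after "xml" avoids matching <?xml-stylesheet ...?> instructions.
XML_DECL = re.compile(r'\A\s*<\?xml\s[^>]*\?>\s*')


def strip_xml_declaration(page):
    """Return the page without a leading XML declaration, if it has one."""
    return XML_DECL.sub("", page, count=1)


if __name__ == "__main__":
    page = '<?xml version="1.0" encoding="UTF-8"?>\n<!DOCTYPE html>\n<html></html>'
    print(strip_xml_declaration(page))
```

Pages that have no declaration pass through unchanged, so the same rule
could be applied to every HTML response.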


Regards,
~~helix84

Compulsory reading: DSpace Mailing List Etiquette
https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette
