After failing to add Saxon, I restored the Xalan settings by copying Emacs' ~ backup copies of cocoon.xconf and sitemap.xmap, but now Unicode replacement characters (U+FFFD) are suddenly appearing in place of accented characters in pages which were working before.

The data comes from a feed on an Oracle Application Server that returns an HTML <table> fragment, e.g. http://rss.ucc.ie/live/w_rms_profile_list.show?p_school_id=A005, which dog and wget report in the headers as Content-Type: text/html; charset=WINDOWS-1252 (yes, I know, yuck...not my server). [That URI may not be accessible off-campus.]

This is processed by a pipeline to ensure it is XML:

    <map:match pattern="people-in-schools/*">
      <map:generate type="html" src="http://rss.ucc.ie/dev/w_rms_profile_list.show?p_school_id={1}"/>
      <map:serialize type="xml"/>
    </map:match>

so that http://publish.ucc.ie/researchprofiles/people-in-schools/A005 produces XML I can consume in my XSLT. However, the output begins:

    <?xml version="1.0" encoding="ISO-8859-1"?><html...etc

despite the fact that sitemap.xmap says very clearly:

    <map:serializer logger="sitemap.serializer.xml" mime-type="application/xml" name="xml" src="org.apache.cocoon.serialization.XMLSerializer">
      <encoding>UTF-8</encoding>
    </map:serializer>

The result is that the output at http://publish.ucc.ie/researchprofiles/A005 has Unicode replacement characters instead of accents. I thought that should enforce translation to UTF-8, but obviously I have missed something. What?
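
For reference, here is a minimal sketch (in Python, outside Cocoon) of one common way U+FFFD ends up in output: bytes produced in windows-1252 get decoded as if they were UTF-8. The sample name is made up and not taken from the feed, and I am only assuming this resembles what happens somewhere in my pipeline; I have not confirmed it.

```python
# Hypothetical sample text with accents; NOT taken from the real feed.
text = "Seán Ó Súilleabháin"

# The bytes as a server declaring charset=WINDOWS-1252 would send them.
raw = text.encode("cp1252")

# Decoding with the declared charset recovers the original text.
good = raw.decode("cp1252")

# Decoding the same bytes as UTF-8 is invalid at each accented byte;
# errors="replace" substitutes U+FFFD for every one of them.
bad = raw.decode("utf-8", errors="replace")

print(good)
print(bad)

# What a correct transcoding to UTF-8 would emit for the same text.
print(good.encode("utf-8"))
```

If something like this is going on, the damage is done when the bytes are read, so changing the serializer's output encoding would not bring the accents back.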