After failing to add Saxon, I restored the Xalan settings by copying
back Emacs' ~ backup copies of cocoon.xconf and sitemap.xmap, but now
Unicode replacement characters (U+FFFD) are suddenly appearing in place
of accented characters in pages which were working before.

The data is taken from a feed from an Oracle Application Server, which
returns an HTML <table> fragment, e.g.
http://rss.ucc.ie/live/w_rms_profile_list.show?p_school_id=A005
which dog and wget identify in the headers as
Content-Type: text/html; charset=WINDOWS-1252
(yes, I know, yuck...not my server)

[That URI may not be accessible off-campus]

This is processed by a pipeline to ensure it is XML:

<map:match pattern="people-in-schools/*">
  <map:generate type="html"
  src="http://rss.ucc.ie/dev/w_rms_profile_list.show?p_school_id={1}"/>
  <map:serialize type="xml"/>
</map:match>

so that
http://publish.ucc.ie/researchprofiles/people-in-schools/A005
produces XML I can consume in my XSLT. However, this is appearing as:

<?xml version="1.0" encoding="ISO-8859-1"?><html...etc

despite the fact that the sitemap.xmap says very clearly:

<map:serializer logger="sitemap.serializer.xml"
        mime-type="application/xml" name="xml"
        src="org.apache.cocoon.serialization.XMLSerializer">
    <encoding>UTF-8</encoding>
</map:serializer>

The result is that the output at
http://publish.ucc.ie/researchprofiles/A005
has Unicode replacement characters instead of accents.

I thought it should enforce translation to UTF-8 but obviously I have
missed something....but what?
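
For reference, here is a minimal Python sketch of one way U+FFFD can
appear. It assumes (a guess, not something confirmed for this pipeline)
that the WINDOWS-1252 bytes from the feed get decoded as UTF-8 somewhere
along the way. The sample name is made up for illustration.

```python
# Hypothetical sample text: in WINDOWS-1252, "é" is the single byte 0xE9.
raw = "Séamus".encode("cp1252")

# Decoding with the charset the server declares preserves the accent.
correct = raw.decode("cp1252")

# Decoding the same bytes as UTF-8 cannot interpret the lone 0xE9 byte;
# with errors="replace" it becomes U+FFFD, the replacement character.
wrong = raw.decode("utf-8", errors="replace")

print(repr(correct))
print(repr(wrong))
```

If that is what is happening, the serializer's encoding setting would
come too late to help: once a character has been replaced by U+FFFD, the
original byte is gone, and writing the result out as UTF-8 (or anything
else) cannot bring the accent back.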
