Hi Jeremias,
Thanks for your suggestions. The problem was as expected: we use
Maverick as our MVC framework, and it has a bug when passing content
through multiple transformations, so the encoding was corrupted by the
time it reached FOP. The simple fix was to set up FOP to do both the
XML->XSL-FO and the XSL-FO->PDF transforms rather than letting Maverick
control the former.
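A minimal sketch of that arrangement, using only JDK classes: the XML
goes in as raw bytes, and the XSLT output is handed on as SAX events, so
the text is never re-serialized between the two steps. The class name,
the stylesheet and CollectingHandler are hypothetical stand-ins, not our
actual code. With FOP, the SAXResult would wrap fop.getDefaultHandler(),
as in the embedding examples.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.helpers.DefaultHandler;

public class SingleStagePipeline {

    // Hypothetical stylesheet: wraps the text of <doc> in an fo:block.
    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
        + "<xsl:template match='/doc'>"
        + "<fo:block><xsl:value-of select='.'/></fo:block>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Stand-in for FOP's content handler: records the character data it receives. */
    static class CollectingHandler extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    /** Runs the XSLT over raw XML bytes and returns the text the handler saw. */
    static String transform(byte[] xmlBytes) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            CollectingHandler handler = new CollectingHandler();
            // Bytes in: the XML parser decodes them itself (UTF-8 by default).
            // SAX events out: no intermediate String or byte[] to mis-encode.
            transformer.transform(
                new StreamSource(new ByteArrayInputStream(xmlBytes)),
                new SAXResult(handler));
            return handler.text.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        byte[] xml = "<doc>\u0101\u014D</doc>".getBytes(StandardCharsets.UTF_8);
        String seen = transform(xml);
        // Check that both macron characters arrived as single characters.
        System.out.println(seen.equals("\u0101\u014D"));
    }
}
```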
Thanks,
Nick
Jeremias Maerki wrote:
You have the right suspicion. It must certainly be something inside the
servlet that affects the encoding of the characters. The input data is
probably in UTF-8 (your Maori character as two bytes) and the data
passed to FOP is probably incorrectly forced to ISO-8859-1 (or another
single-byte encoding) somehow. Without seeing the Java source code that handles
the transformation it's impossible to just point you to the right place.
Look out for things like the use of String.getBytes(), InputStreamReader
or byte[]. They can (but don't have to) indicate problem spots. Please
refer to our demo servlet [1][2] and the embedding examples [3] for
recommended patterns. If you don't manage to identify the problem, post
your code here.
[1] http://xmlgraphics.apache.org/fop/stable/servlets.html
[2] http://svn.apache.org/viewvc/xmlgraphics/fop/trunk/src/java/org/apache/fop/servlet/FopServlet.java?view=markup
[3] http://xmlgraphics.apache.org/fop/stable/embedding.html
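To illustrate the kind of mismatch described above, here is a small
sketch using only the JDK. It encodes the lowercase "a" with macron as
UTF-8 and then decodes those bytes as ISO-8859-1, which yields two
characters instead of one. The class and method names are hypothetical,
and this only demonstrates the general effect, not the specific bug in
the servlet.

```java
import java.nio.charset.StandardCharsets;

public class EncodingMismatchDemo {

    /** Encodes text as UTF-8, then (wrongly) decodes the bytes as ISO-8859-1. */
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    /** Encodes and decodes with the same explicit charset, which round-trips. */
    static String roundTrip(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String aMacron = "\u0101"; // UTF-8 bytes: 0xC4 0x81

        String broken = misdecode(aMacron);
        System.out.println(broken.length()); // prints 2
        for (char c : broken.toCharArray()) {
            System.out.printf("U+%04X%n", (int) c); // prints U+00C4, then U+0081
        }

        System.out.println(roundTrip(aMacron).equals(aMacron)); // prints true
    }
}
```

A plain ISO-8859-1 mix-up like this would split every macron character,
not just two of them. One possible explanation for the two that failed,
not confirmed anywhere in this thread: the UTF-8 forms of "\u0101" and
"\u014D" end in the bytes 0x81 and 0x8D, which are unassigned in
windows-1252, so a round trip through that code page could damage only
those two.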
On 31.01.2008 07:58:51 Nicholas Hogg wrote:
Hi,
I am having difficulty getting two characters to display in my pdfs when
running fop from a servlet that needs to support the Maori language (New
Zealand). The Maori language has 10 characters outside the Latin-1
character set: AEIOUaeiou, all with a macron. I have embedded a font,
freeSerif, from
http://download.savannah.gnu.org/releases/freefont/freefont-ttf-20060126.tar.gz
and it definitely has glyphs for these characters. I use XSLT to
generate xsl:fo and use the fop API to generate the pdf. Strangely, if
I capture the generated xsl:fo and run fop by hand (standalone), the
generated pdf is correct. Even more strangely, when run from the
servlet, all macron characters display correctly except the lowercase
"a" and "o" macron characters ("\u0101" and "\u014D"). Each appears in
the pdf as two characters - the second is always a "?" and the first is
either blank or a different "?" symbol, depending on the embedded font I
am trying to use. Trying different fonts produces the same problem with
these two characters.
I have looked at the servlet generated pdf in Adobe Acrobat Pro, and in
the document properties verified that the font is being embedded (I use
a peculiar name for it). It is encoded as "Identity-H".
When running fop by hand I use the same fop.xconf config file as the
servlet. I am running fop 0.94, and the servlet in Jetty6, on Windows,
with Java 1.6.03.
Any thoughts on how the same xsl:fo would produce different results
standalone vs servlet? I can only think that the xsl:fo going into fop
via the servlet is getting corrupted so that fop thinks the "\u0101" and
"\u014D" characters are two separate characters each.
Many thanks,
Nick
Jeremias Maerki
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
--
Nicholas Hogg
KE Software (Australia)
www.kesoftware.com