You have the right suspicion. Something inside the servlet is almost certainly changing the character encoding. The input data is probably UTF-8 (each Maori macron character encoded as two bytes), and the data passed to FOP is probably being forced to ISO-8859-1 (or another single-byte encoding) somewhere along the way.

Without seeing the Java source code that handles the transformation, I can't point you to the exact place. Look out for uses of String.getBytes(), InputStreamReader or byte[]. They can (but don't have to) indicate problem spots, especially where no encoding is given explicitly and the platform default is used. Please refer to our demo servlet [1][2] and the embedding examples [3] for recommended patterns. If you don't manage to identify the problem, post your code here.
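To illustrate the kind of problem spot meant above, here is a minimal sketch using only the JDK. It is not taken from your servlet: the class name EncodingPitfall and the helper decode() are made up for the illustration, and it assumes the XSL-FO reaches the servlet as UTF-8 bytes. It shows what happens to the two macron characters when a byte/character conversion uses a single-byte encoding instead of UTF-8.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

// Hypothetical demo class, not part of FOP or of the servlet in question.
public class EncodingPitfall {

    static final Charset UTF8 = Charset.forName("UTF-8");
    static final Charset LATIN1 = Charset.forName("ISO-8859-1");

    /** Decodes bytes through an InputStreamReader using an explicit charset. */
    static String decode(byte[] data, Charset cs) throws IOException {
        Reader reader = new InputStreamReader(new ByteArrayInputStream(data), cs);
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1) {
            sb.append((char) c);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String macrons = "\u0101\u014D"; // lowercase a and o with macron

        // No-argument getBytes() and one-argument InputStreamReader use this charset.
        System.out.println("Platform default: " + Charset.defaultCharset());

        // Correct: encode and decode with the same explicit charset.
        byte[] utf8 = macrons.getBytes(UTF8);
        System.out.println("UTF-8 byte count: " + utf8.length);
        System.out.println("Round trip intact: " + decode(utf8, UTF8).equals(macrons));

        // Pitfall 1: UTF-8 bytes read with a single-byte charset.
        // Every byte becomes its own character, so each macron turns into two.
        String misread = decode(utf8, LATIN1);
        System.out.println("Chars after misreading: " + misread.length());

        // Pitfall 2: encoding to a charset that cannot represent the characters.
        // getBytes() silently substitutes '?' for each unmappable character.
        byte[] lossy = macrons.getBytes(LATIN1);
        System.out.println("Lossy result: " + decode(lossy, LATIN1));
    }
}
```

If the servlet contains a conversion like either of the two pitfalls, passing the encoding explicitly, or handing the parser a byte stream rather than a String or Reader so that it can read the encoding from the XML declaration, is usually the fix.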
[1] http://xmlgraphics.apache.org/fop/stable/servlets.html
[2] http://svn.apache.org/viewvc/xmlgraphics/fop/trunk/src/java/org/apache/fop/servlet/FopServlet.java?view=markup
[3] http://xmlgraphics.apache.org/fop/stable/embedding.html

On 31.01.2008 07:58:51 Nicholas Hogg wrote:
> Hi,
>
> I am having difficulty getting two characters to display in my pdfs when
> running fop from a servlet that needs to support the Maori language (New
> Zealand). The Maori language has 10 characters outside Latin 1
> character set: AEIOUaeiou all with a macron. I have embedded a font,
> freeSerif, from
> http://download.savannah.gnu.org/releases/freefont/freefont-ttf-20060126.tar.gz
>
> and it definitely has glyphs for these characters. I use XSLT to
> generate xsl:fo and use the fop API to generate the pdf. Strangely, if
> I capture the generated xsl:fo and run fop by hand (standalone), the
> generated pdf is correct. Even more strangely, when run from the
> servlet, all macron characters display correctly except the lowercase
> "a" and "o" macron characters ("\u0101" and "\u014D"). Each appears in
> the pdf as two characters - the second is always a "?" and the first is
> either blank or a different "?" symbol, depending on the embedded font I
> am trying to use. Trying different fonts produces the same problem with
> these two characters.
>
> I have looked at the servlet generated pdf in Adobe Acrobat Pro, and in
> the document properties verified that the font is being embedded (I use
> a peculiar name for it). It is encoded as "Identity-H".
>
> When running fop by hand I use the same fop.xconf config file as the
> servlet. I am running fop 0.94, and the servlet in Jetty6, on Windows,
> with Java 1.6.03.
>
> Any thoughts on how the same xsl:fo would produce different results
> standalone vs servlet? I can only think that the xsl:fo going into fop
> via the servlet is getting corrupted so that fop thinks the "\u0101" and
> "\u014D" characters are two separate characters each.
>
> Many thanks,
> Nick

Jeremias Maerki
