I've been able to debug this a little, and it seems that even though I am setting the output encoding to UTF-8, the file is being written as ASCII. Since we can't get much further without posting code, here goes:

    Serializer serializer = SerializerFactory.getSerializer(props);
    // Log the encoding the serializer has been configured with
    log.debug("Output Encoding: " + serializer.getOutputFormat().getProperty("encoding"));
    // The transform result is written to the "results" file
    serializer.setOutputStream(new FileOutputStream(results));
    filters[lastFilter].setContentHandler(serializer.asContentHandler());
    filters[lastFilter].parse(new InputSource(new FileReader(xmlFile)));
    log.debug("Finished the transformation");

The first log message does print "Output Encoding: UTF-8". However, when I create a FileReader for this same File ("results" in the code above) and call getEncoding() on it, it prints "ASCII". Also, when I look at the file with less, I see "General<C2><A0>Electric", and in emacs I see "General??Electric".

This is just an XSL transform up to this point, nothing FOP-specific (though the file is an FO document), so perhaps the Xalan list is the proper place for this question?

Here is the code for the Reader:

    FileReader fileReader = new FileReader(foFile);
    BufferedReader reader = new BufferedReader(fileReader);
    log.debug("Encoding for " + foFile + ": " + fileReader.getEncoding());

Again, this prints "Encoding for /tmp/quarterly40215.xml: ASCII". At this point, the reader is used to read the file into a byte array. The byte array is then wrapped in a ByteArrayInputStream and fed to the FOP Driver.

Are we any closer?

On 11/25/05, Craig McDaniel <[EMAIL PROTECTED]> wrote:
> On 11/25/05, Andreas L Delmelle <[EMAIL PROTECTED]> wrote:
> > On Nov 25, 2005, at 22:14, Craig McDaniel wrote:
> >
> > > I am trying to debug a PDF rendering for a client where non-breaking
> > > spaces are coming out as double question marks "??". FOP is being
> > > called from a servlet. I have tried using the fop command line tool
> > > and cannot reproduce the problem. I have written a simple servlet on
> > > another system that functionally does the same thing, and cannot
> > > reproduce the problem there either.
> > >
> > > Any ideas what could cause this? Is it some kind of character encoding
> > > issue?
> >
> > Indeed. The question marks are most likely related to:
> > http://java.sun.com/j2se/1.4.2/docs/api/java/nio/charset/CharsetEncoder.html
> >
> > > The entity   is being used. What should my next step be
> > > in debugging this?
> >
> > Firstly: are you still using FOP 0.20.5? If so, can you try out the
> > recent alpha release, and report if the problem still occurs?
>
> I am using 0.20.5. Unfortunately, I do not have access to deploy
> changes to the server at this time, so I am unable to test changes in
> the only environment where the problem is happening ;-(
>
> > If you can't (or are already using FOP 0.90alpha), I think the best
> > bet is to go looking for places --in the servlet code, I presume--
> > where an XML declaration is hard-coded as a String literal or where a
> > Charset is used that's different from the default (= UTF-8).
>
> The original data file has no XML declaration. The stylesheet has one,
> but does not have an encoding attribute. The   entities are in
> the XSL, by the way.
>
> I almost feel like I am debugging this thing blind. I do have the
> source code, but it is too spread out to post here. It might be worth
> pointing out that the XSL is applied to the XML data and sent to a
> ByteArrayOutputStream. The byte array is then stored and later passed
> into the FOP driver as a ByteArrayInputStream. Likewise, the output of
> the driver is written to a byte array and finally, it gets sent to the
> browser with response.getOutputStream().write(bytes). Not the way I
> would have done it. Anyway, like I said, I coded up a servlet just
> like this one and could not reproduce the problem in my own
> environment. Perhaps this is a default encoding problem.
>
> > HTH!
>
> Absolutely, thanks for your help!
>
> > Cheers,
> >
> > Andreas
>
> --
> Craig McDaniel

--
Craig McDaniel
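
P.S. One thing that may be worth checking, offered as a sketch rather than a diagnosis: the bytes <C2><A0> that less shows are the UTF-8 encoding of the non-breaking space (U+00A0), and FileReader.getEncoding() reports the charset the reader decodes with (the platform default for a plain FileReader), not the encoding of the file on disk. The class below is hypothetical (the name EncodingSketch and the in-memory byte array are assumptions standing in for the real file), and it only shows what happens when UTF-8 bytes are decoded as ASCII versus with an explicit UTF-8 reader.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of FOP or Xalan.
class EncodingSketch {

    // Decode the bytes through a Reader that uses an explicitly chosen charset.
    static String decode(byte[] bytes, Charset charset) {
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            StringBuilder text = new StringBuilder();
            int ch;
            while ((ch = reader.read()) != -1) {
                text.append((char) ch);
            }
            return text.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the file contents: U+00A0 becomes the two bytes C2 A0 in UTF-8.
        byte[] utf8 = "General\u00A0Electric".getBytes(StandardCharsets.UTF_8);

        // Decoding with the matching charset restores the single non-breaking space.
        String right = decode(utf8, StandardCharsets.UTF_8);
        System.out.println(right.equals("General\u00A0Electric"));

        // Decoding as ASCII turns each of the two bytes into U+FFFD,
        // and writing that back out as ASCII yields two question marks.
        String wrong = decode(utf8, StandardCharsets.US_ASCII);
        byte[] reEncoded = wrong.getBytes(StandardCharsets.US_ASCII);
        System.out.println(new String(reEncoded, StandardCharsets.US_ASCII)); // General??Electric
    }
}
```

If this matches what the servlet does, replacing the FileReader with an InputStreamReader over a FileInputStream and an explicit UTF-8 charset would be the thing to try; handing the parser a byte stream instead of a Reader is another option, since the parser can then detect the encoding itself.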