Hi, I had the same problem with the ä and ö characters when I had to create PDFs from Finnish texts. I used the TextToPDF.createPDFFromText method, so I'm not sure whether the same solution works for you. I needed to change the encoding of the font to get it working. Here's an example:

    import java.io.StringReader;
    import org.apache.pdfbox.cos.COSName;
    import org.apache.pdfbox.encoding.EncodingManager;
    import org.apache.pdfbox.pdmodel.PDDocument;

    String text = "opqrstuvwxyzåäö";
    org.apache.pdfbox.TextToPDF textToPdf = new org.apache.pdfbox.TextToPDF();
    EncodingManager encodingManager = new EncodingManager();
    // Switch the font to WinAnsiEncoding so that å, ä and ö are mapped correctly.
    textToPdf.getFont().setEncoding(encodingManager.getEncoding(COSName.WIN_ANSI_ENCODING));
    PDDocument doc = textToPdf.createPDFFromText(new StringReader(text));

Hope this helps.

-Mikael

On Thu, May 20, 2010 at 4:04 AM, Michael Fischer v. Mollard <[email protected]> wrote:

> On Saturday, 08.05.2010, at 16:51 +0200, Andreas Lehmkuehler wrote:
> > Hi,
> >
> > dysign.ch wrote:
> > > Hello everybody
> > >
> > > I recently started a Java project where one of the tasks is to generate
> > > PDF files from some stored texts. As I'm from Switzerland, many texts
> > > contain umlauts like ä, ö or ü. When pdfbox parses my strings with e.g.
> > > contentStream.drawString(str); the result looks quite crappy if the
> > > string contains umlauts: all umlaut characters are replaced with squares
> > > and between every normal character there is an additional space (the
> > > string comes from a document with 'normal' ASCII coding). If I convert
> > > the source document/string to use UTF-8, the result looks even worse.
> > >
> > > How can I tell pdfbox to print such umlauts correctly (it doesn't matter
> > > if I'd have to use embedded fonts or external ttfs, I'm just happy if it
> > > works somehow).
> >
> > You should use an external ttf which supports umlauts, see [1] how to do that.
> > Convert your text to UTF-8 before calling drawString().
> >
> > HTH
> > Andreas Lehmkühler
> >
> > [1] http://svn.apache.org/repos/asf/pdfbox/trunk/pdfbox/src/main/java/org/apache/pdfbox/examples/pdmodel/HelloWorldTTF.java
>
> Sorry,
>
> but that's too fast for me. drawString expects a String, and a Java String
> is AFAIK always UTF-16.
>
> Playing a little bit with HelloWorldTTF.java I got poor results with
> /usr/share/fonts/truetype/ttf-bitstream-vera/VeraMoBI.ttf
>
> and
>
> java.io.IOException: Unknown cmap format:12
>     at org.apache.fontbox.ttf.CMAPEncodingEntry.initSubtable(CMAPEncodingEntry.java:157)
>     at org.apache.fontbox.ttf.CMAPTable.initData(CMAPTable.java:90)
>     at org.apache.fontbox.ttf.TTFParser.parseTTF(TTFParser.java:144)
>     at org.apache.fontbox.ttf.TTFParser.parseTTF(TTFParser.java:87)
>     at org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.loadDescriptorDictionary(PDTrueTypeFont.java:193)
>     at org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.loadTTF(PDTrueTypeFont.java:153)
>     at org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.loadTTF(PDTrueTypeFont.java:130)
>
> with the DejaVu font /usr/share/fonts/truetype/ttf-dejavu/DejaVuSansMono.ttf.
>
> Can anybody point me to some examples of how to do it right?
>
> Regards
> Michael
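
A side note on the remark in the quoted message that a Java String is always UTF-16: the encoding only becomes visible once the string is turned into bytes. The sketch below uses nothing but the JDK (no PDFBox calls) to show how many bytes "åäö" occupies under three charsets. The class name UmlautBytes and the helper encodedLength are made up for illustration. ISO-8859-1 stands in for WinAnsiEncoding here on the assumption that only å, ä and ö matter, since both assign these three characters the same single-byte codes. Whether the extra zero bytes of UTF-16 explain the "additional space" described in the thread is only a guess.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class UmlautBytes {
    // Hypothetical helper: number of bytes the text occupies in the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // "åäö" written as Unicode escapes so the source file encoding does not matter.
        String text = "\u00e5\u00e4\u00f6";

        // One byte per character, as in a single-byte encoding such as WinAnsi.
        System.out.println("ISO-8859-1: " + encodedLength(text, StandardCharsets.ISO_8859_1));
        // Two bytes per character for these letters.
        System.out.println("UTF-8:      " + encodedLength(text, StandardCharsets.UTF_8));
        // Two bytes per character, and plain ASCII letters get a leading zero byte.
        System.out.println("UTF-16BE:   " + encodedLength(text, StandardCharsets.UTF_16BE));
    }
}
```

If a font only understands a single-byte encoding, handing it two bytes per character would be expected to produce wrong or extra glyphs, which is consistent with changing the font encoding being what fixed it for me.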

