Ehm, OK... so the last post gave me an idea. I wanted to open the PDF in Reader, fill in æøåÆØÅ in one of the fields, save the PDF, read it back in using PDFBox, and print the field value as byte values to see what is stored in it. It gave me some unexpected results: exactly the same values as if I had hardcoded a string in Java with æøåÆØÅ, converted it to a byte array, and printed the values. Then I tried to insert new values into that one exact field, values containing ÆØÅ.
This works. In that one field. And the font is different. I'm thinking that the real problem is in the initial creation of the PDF. It is made in OpenOffice, OpenOffice is then used to export it to PDF, and that PDF is then used in my code. I'm guessing that we should look at how the PDF is created in the first place. My coworker is not Danish. Maybe his OpenOffice is setting some font that just does not work for Danish, so if he used my OpenOffice, or fixed his, I would not have a problem. So that's where we will look.

Or I can simply open the PDF in my Reader, fill in æ in all fields, save it, and overwrite everything from Java :D If it were a bigger job, maybe it would make sense to really understand what's going on...

-----Original Message-----
From: Andreas Lehmkuehler [mailto:[email protected]]
Sent: 9 February 2014 19:44
To: [email protected]
Subject: Re: encoding

Hi,

On 08.02.2014 17:31, Jan Agermose // Conviator ApS wrote:
> hi
>
> I'm trying to use this code to fill a document. It works - except for
> encoding because of Danish chars: æøå
>
> PDDocument pdfDocument = PDDocument.load(path);
> PDType1Font font = PDType1Font.HELVETICA;
> //contentStream.setFont(font, 12);
>
> PDDocumentCatalog docCatalog = pdfDocument.getDocumentCatalog();
> PDAcroForm acroForm = docCatalog.getAcroForm();
>
> List<PDField> fields = acroForm.getFields();
> for (PDField field : fields) {
>     if (field.getFullyQualifiedName().equals("Text1")) {
>         field.setValue(p.getFornavn() + " " + p.getEfternavn());
>     }
> }
> File f = File.createTempFile("ansoegningsyddanmark",".pdf");
> pdfDocument.save(f);
>
>
> I'm also trying to change this:
>     field.setValue(p.getFornavn() + " " + p.getEfternavn());
> into one of:
>     field.setValue(p.getFornavn() + " " + p.getEfternavn() + "\0153u");
>     field.setValue(new String(p.getBy().getBytes("UTF-16"), "ISO8859_1"));
> in order to try to fix it, but it's not working.
>
> any ideas how to fix this?
Encodings other than WinANSI aren't supported yet, see PDFBOX-922 [1] for further details.

BR
Andreas Lehmkühler

[1] https://issues.apache.org/jira/browse/PDFBOX-922
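
For anyone who wants to repeat the byte comparison described at the top of this mail, here is a minimal sketch of the "hardcode æøåÆØÅ in Java and print the bytes" half. It does not touch PDFBox at all; the class name `ByteDump` and the helpers `hex` and `dump` are made up for illustration. It also shows what the `getBytes("UTF-16")` / `"ISO8859_1"` workaround from the quoted mail does to the string, which may be part of why that attempt did not help.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ByteDump {
    // æøåÆØÅ written as Unicode escapes so the encoding of the source
    // file itself cannot influence the result.
    static final String DANISH = "\u00e6\u00f8\u00e5\u00c6\u00d8\u00c5";

    /** Formats each byte as two upper-case hex digits, separated by spaces. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Mask with 0xFF so negative bytes print as 80..FF.
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    /** Encodes the text in the given charset and returns the hex dump. */
    static String dump(String text, Charset charset) {
        return hex(text.getBytes(charset));
    }

    public static void main(String[] args) {
        // One byte per character: E6 F8 E5 C6 D8 C5
        System.out.println("ISO-8859-1: " + dump(DANISH, StandardCharsets.ISO_8859_1));
        // Two bytes per character in UTF-8.
        System.out.println("UTF-8:      " + dump(DANISH, StandardCharsets.UTF_8));

        // The workaround from the quoted mail: UTF-16 bytes reinterpreted
        // as ISO-8859-1. Every byte becomes its own character, including
        // the two byte-order-mark bytes Java writes at the start.
        String mangled = new String(DANISH.getBytes(StandardCharsets.UTF_16),
                                    StandardCharsets.ISO_8859_1);
        System.out.println("mangled length: " + mangled.length());
    }
}
```

The same `hex` helper could be applied to the bytes of whatever string PDFBox returns for the field, to compare the two dumps side by side. In ISO-8859-1 the six characters come out as six bytes, while the UTF-16 round trip turns them into a 14-character string (a two-byte BOM plus two bytes per letter), so the value written to the field is no longer the original text.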

