Hi Hesham,

> Hello,
> Thanks for your reply to my question:
> http://markmail.org/thread/r2bbln57hjng7bxu
> Sorry for replying to you through mail ... I've already sent a reply but it didn't appear in the
> thread!!

Did you get an error message or something like that? Either way, I'll cc my answer to the list.
> What I was trying to say is that I've already seen the ExtractText.java file
> and I implemented this solution, which worked fine.
> Now I am trying to write the data extracted from the PDF to a PDF file
> instead of a text file. If the extracted data is in English,
> it is written correctly, but what if the extracted data is in Arabic, for
> example? The data written to the PDF appears
> like this: þÿþÞ jþäþË
> I've changed the encoding for the written data in the PDF like this:
>     PDSimpleFont font = PDType1Font.TIMES_ROMAN;
>     font.setEncoding( new WinAnsiEncoding() );
> This encoding works fine for German, French, Turkish ... but not Arabic. Any
> ideas? Is there a way to use UTF-8 encoding? I've
> attached my code to
> this e-mail ...
> I hope you can help me.
> Thanks,
> Hesham

The "trick" is to use the correct font. For Arabic text you will of course need a font that supports Arabic characters. Have a look at org.apache.pdfbox.TextToPDF for a simple example of how to use external fonts for drawing.

If you want to use UTF-8 encoding, just do it: create a string with the desired encoding and add it to the PDF by calling the drawString method of PDPageContentStream (see org.apache.pdfbox.TextToPDF).

Andreas Lehmkühler
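To illustrate the encoding side of the question, here is a minimal sketch that uses only the Java standard library, not PDFBox. It assumes the JRE's windows-1252 charset is a fair stand-in for WinAnsiEncoding; the class and helper names (EncodingCheck, canEncode, misread) are made up for this example. It checks which strings a single-byte Western encoding can represent at all, and shows one plausible reading of the garbled output: UTF-16 bytes (including the byte-order mark) interpreted one byte at a time.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not part of PDFBox.
class EncodingCheck {

    // True if every character of the text has a representation in the charset.
    static boolean canEncode(String text, Charset charset) {
        return charset.newEncoder().canEncode(text);
    }

    // Encodes the text as UTF-16 (big-endian, with byte-order mark) and then
    // misreads each byte as one character of a single-byte charset.
    static String misread(String text, Charset singleByte) {
        return new String(text.getBytes(StandardCharsets.UTF_16), singleByte);
    }

    public static void main(String[] args) {
        // Assumption: windows-1252 approximates WinAnsiEncoding.
        Charset winAnsi = Charset.forName("windows-1252");

        String german = "Gr\u00fc\u00dfe";                 // "Gruesse" with u-umlaut and sharp s
        String arabic = "\u0645\u0631\u062d\u0628\u0627";  // an Arabic word

        System.out.println(canEncode(german, winAnsi));                 // true
        System.out.println(canEncode(arabic, winAnsi));                 // false
        System.out.println(canEncode(arabic, StandardCharsets.UTF_8));  // true

        // U+FEDE is an Arabic presentation form. Its UTF-16 bytes are
        // FE FF (byte-order mark) followed by FE DE, which a single-byte
        // reading turns into the four characters U+00FE U+00FF U+00FE U+00DE.
        String garbled = misread("\uFEDE", winAnsi);
        System.out.println(garbled.length());                           // 4
    }
}
```

Being able to encode the string is only half of it: the font still has to contain the Arabic glyphs, which is why an external font is needed rather than one of the standard Type 1 fonts.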