Hi,

The current official PDFBox release is 1.8.5. However...

If the characters overlap each other, i.e. all the characters of a word are drawn in the same place, then it is this known issue:
https://issues.apache.org/jira/browse/PDFBOX-62

Characters that are not rendered properly are likely a problem with embedded Type 1 fonts. This is unsolved in 1.8, although it works with 2.0, which is unreleased but available through svn.

https://pdfbox.apache.org/downloads.html

I can't tell for sure because most attachments are ignored on this list. Upload the files to a file-hosting site or open an issue in JIRA.

Tilman

On 19.05.2014 08:18, webrtcgo wrote:
Hi, please forgive my English first.
I tried to convert a pdf file to images, using pdfbox 1.8.4 within tika-app-1.5.jar.
The jpeg files I got were not ideal.
The content of the images was different from the pdf file.
Some characters were in different places, and some characters overlapped others.
There were many lines of console information which read:
'13:49:07,094 WARN [PDSimpleFont:107] Changing font on <l> from <Courier New Italic> to the default font
13:49:07,094 WARN [PDSimpleFont:107] Changing font on <l> from <Courier New Italic> to the default font
13:49:07,095 WARN [PDSimpleFont:107] Changing font on <y> from <Courier New Italic> to the default font
13:49:07,095 WARN [PDSimpleFont:107] Changing font on <l> from <Courier New Italic> to the default font
...'
Could you tell me how to solve this problem and how to get correct images?
Thanks a lot.

I attached the pdf file and one of the images.
And here is my code:
PDDocument doc = PDDocument.load(input + ".pdf");
List<PDPage> pages = doc.getDocumentCatalog().getAllPages();
for (int i = 0; i < pages.size(); i++) {
    PDPage page = pages.get(i);
    // render the page to an image with the default settings
    BufferedImage image = page.convertToImage();
    Iterator<ImageWriter> iter = ImageIO.getImageWritersBySuffix("JPG");
    ImageWriter writer = iter.next();
    File outFile = new File(input + i + ".jpg");
    FileOutputStream out = new FileOutputStream(outFile);
    ImageOutputStream outImage = ImageIO.createImageOutputStream(out);
    writer.setOutput(outImage);
    writer.write(new IIOImage(image, null, null));
    writer.dispose();
    // close the image stream first so that buffered data reaches the file
    outImage.close();
    out.close();
}
doc.close();
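
The image-writing step of the loop above can be isolated into a small helper. The following is only a sketch using the standard javax.imageio API. The class name JpegWriteSketch, the method writeJpeg and the quality parameter are assumptions made for illustration and are not part of PDFBox. It does not address the font substitution warnings; it only makes sure the output stream is flushed and closed, and that the writer is disposed even if writing fails.

```java
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.ImageOutputStream;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Iterator;

// Hypothetical helper class, not part of PDFBox.
class JpegWriteSketch {

    // Writes one image as a JPEG file with an explicit quality in the range 0.0 to 1.0.
    static void writeJpeg(BufferedImage image, File outFile, float quality) throws IOException {
        Iterator<ImageWriter> writers = ImageIO.getImageWritersByFormatName("jpeg");
        if (!writers.hasNext()) {
            throw new IOException("No JPEG ImageWriter available");
        }
        ImageWriter writer = writers.next();

        // A file-based ImageOutputStream does not truncate an existing file,
        // so remove any old output first.
        Files.deleteIfExists(outFile.toPath());

        // try-with-resources closes (and thereby flushes) the image stream.
        try (ImageOutputStream out = ImageIO.createImageOutputStream(outFile)) {
            if (out == null) {
                throw new IOException("Cannot create image output stream for " + outFile);
            }
            ImageWriteParam param = writer.getDefaultWriteParam();
            param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
            param.setCompressionQuality(quality);
            writer.setOutput(out);
            writer.write(null, new IIOImage(image, null, null), param);
        } finally {
            // Release the writer's resources whether or not writing succeeded.
            writer.dispose();
        }
    }
}
```

Inside the loop this would be called as JpegWriteSketch.writeJpeg(image, new File(input + i + ".jpg"), 0.9f), where 0.9f is an arbitrary example value.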

------------------------------------------------------------------------
webrtcgo
