Hi,

On 01.07.2013 17:06, Hai Nguyen FUB wrote:
Dear PDFBox developers,

My name is Hai and I am a Java developer at the Freie Universität Berlin.

I am currently working on a project that converts PDF files to images.
I have looked around and found the PDFBox library to be a good PDF
handling tool.

After working with this tool for a while, I got stuck on a problem: whenever
I try to convert a handwritten PDF file, that is, a handwritten document that
was scanned and exported to PDF (I do not have the original image files),
I receive the following error:

16:54:08,965 ERROR [FlateFilter] FlateFilter: stop reading corrupt stream due to a DataFormatException


Could you give me a hint on how to solve it?
Without a sample PDF at hand I'm just guessing. Try the non-sequential
parser by using loadNonSeq() instead of load() to load the PDF.
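
As background on the error message itself, here is a minimal, self-contained sketch of how a DataFormatException arises when Flate (zlib/deflate) data is damaged. It uses only java.util.zip and does not call PDFBox at all; it assumes that the FlateFilter in the log relies on the same JDK inflater, which the exception name suggests but which is not confirmed here. The class name FlateCorruptionDemo and its helper methods are hypothetical names made up for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical demo class, not part of PDFBox.
public class FlateCorruptionDemo {

    /** Compresses the input with zlib/deflate, the scheme behind PDF's FlateDecode. */
    public static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[256];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Returns true if the data inflates to the end, false if it is corrupt or incomplete. */
    public static boolean inflatesCleanly(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        byte[] buffer = new byte[256];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                // No progress and nothing left to read: the stream is incomplete.
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break;
                }
            }
            return inflater.finished();
        } catch (DataFormatException e) {
            // Same exception type as the one named in the log line.
            return false;
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        byte[] good = deflate("scanned page data".getBytes(StandardCharsets.UTF_8));
        byte[] bad = good.clone();
        bad[0] = 0; // damage the zlib header so the header check fails

        System.out.println("intact stream inflates:  " + inflatesCleanly(good));
        System.out.println("damaged stream inflates: " + inflatesCleanly(bad));
    }
}
```

The point of the sketch is only that the exception signals stream bytes that are not valid Flate data as read; whether the file itself is damaged or the stream boundaries were simply determined differently by the parser cannot be told from the log line alone.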

My code snippet is as follows:

PDDocument document = PDDocument.load(new File("src/test/resources/pdf/249scan.pdf"));



@SuppressWarnings("unchecked")
List<PDPage> pages = document.getDocumentCatalog().getAllPages();

PDPage page = pages.get(0);
BufferedImage bi = page.convertToImage();
ImageIO.write(bi, "png", new File("src/test/resources/pdf/test.png"));



Thank you in advance & Best regards

--
Hai Nguyen

Freie Universität Berlin
FB Mathematik u. Informatik
AG Intelligente Systeme und
Robotik<http://inf.fu-berlin.de/groups/ag-ki/index.html>
Arnimallee 7, Raum 111
D-14195 Berlin

Tel-1: +49 / 30 838 75114 ( Arnimallee - FUB)
Tel-2: +49 / 30 838 75148 (Takustr - FUB)
Tel-3: +49 / 30 2093 6381 (Office upstairs - HUB)
Tel-4: +49 / 30 2093 6393 (Lab downstairs - HUB)
Fax:    +49 / 30 838 75059
__________________________

BR
Andreas Lehmkühler
