Hello,

I am still upgrading from PDFBox 0.73 to PDFBox 1.4. The new version is very
nice: it gives better extraction results and more PDFs work with it. But I
have noticed that text extraction in the new version is much slower than in
version 0.73.

For example, I tested extracting the text from a 200-page PDF (page by page)
with both versions, applying some light logic to the extracted data, and the
timings were very different:
version 0.73: 4 seconds.
version 1.4: 1 minute, 22 seconds.

The results were a bit better in version 1.4, but the time it takes is very
long.
Is there any way I can speed up the text extraction in version 1.4?


Best regards,
Hesham
