Re: Parsing a PDF file takes 3 minutes

Clemens Wyss DEV Mon, 23 Dec 2013 07:14:59 -0800

I opened an issue for this: https://issues.apache.org/jira/browse/PDFBOX-1821


-----Original Message-----
From: Clemens Wyss - MySign AG [mailto:[email protected]]
Sent: Sunday, 22 December 2013 17:37
To: '[email protected]'
Subject: Parsing a PDF file takes 3 minutes

I initially posted this question on the Tika mailing list, and I also created
an issue for it:
https://issues.apache.org/jira/browse/TIKA-1213
Hopefully I am now on the right list, so let me rephrase the problem I am
facing:
I have several PDF documents that take up to 3 minutes to be parsed/extracted
(for later Lucene indexing).
For example, the PDF attached to the Jira issue takes 3 minutes.

How/why is this possible? How can I improve on this?

Any help appreciated
Clemens
