[ https://issues.apache.org/jira/browse/PDFBOX-1207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13916405#comment-13916405 ]
John Hewson commented on PDFBOX-1207:
-------------------------------------
This is a 579-page PDF file, which probably explains why processing is slow,
but 10 minutes is still far too long.
> PDFPageProcessor.processStream() take 10 minutes to return
> ----------------------------------------------------------
>
> Key: PDFBOX-1207
> URL: https://issues.apache.org/jira/browse/PDFBOX-1207
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing
> Affects Versions: 1.6.0
> Environment: Seen on multiple platforms
> Reporter: Dan Krause
> Labels: RepairMode
>
> Attempting to extract images and text from each page. Long processing time is
> specific to this file:
> http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/pdf/Installation_Guide/Red_Hat_Enterprise_Linux-6-Installation_Guide-en-US.pdf