[ https://issues.apache.org/jira/browse/PDFBOX-895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12994218#comment-12994218 ]
Panayiotis Vlissidis commented on PDFBOX-895:
---------------------------------------------
Excellent!!!
Thanks again for all the hard work and time you have invested in PDFBox.
> Infinite recursion when trying to extract text from specific types of PDFs
> --------------------------------------------------------------------------
>
> Key: PDFBOX-895
> URL: https://issues.apache.org/jira/browse/PDFBOX-895
> Project: PDFBox
> Issue Type: Bug
> Components: Text extraction
> Affects Versions: 1.3.1
> Reporter: Panayiotis Vlissidis
> Assignee: Andreas Lehmkühler
> Priority: Critical
> Fix For: 1.5.0
>
> Attachments: test.pdf
>
>
> Hello and thanks for PDFBox.
> We just started using PDFBox for text extraction (through Tika),
> and it never finishes: extraction falls into an infinite loop
> and never returns the text.
> Please note that this happens only for a specific type of PDF
> documents (used for handwriting recognition), such as the one attached.
> I am not sure whether this is a bug in PDFBox or due to the nature of the PDFs,
> but I think that PDFBox should at least break out if extraction is not
> possible.
> I wish I could give you more information, but I know nothing about the PDF format,
> parsing, etc.
> Please let me know if you need any information or my help in any way.
> Thanks a lot for your time.
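
For callers affected by a hang like the one described above, one possible stopgap is to run extraction on a worker thread and give up after a deadline. The following is only a minimal sketch under stated assumptions: the class name BoundedExtraction and the method runWithTimeout are hypothetical, and the Callable stands in for whatever extraction call the application makes (no PDFBox or Tika API is shown). It is not the fix applied in PDFBox for this issue.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of PDFBox or Tika.
public class BoundedExtraction {

    /**
     * Runs the given task on a daemon worker thread and returns its result,
     * or the fallback value if the task does not finish within timeoutMillis.
     * Exceptions thrown by the task surface as ExecutionException.
     */
    public static String runWithTimeout(Callable<String> task,
                                        long timeoutMillis,
                                        String fallback) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "extraction-worker");
            // Daemon thread: a worker that never returns will not keep the JVM alive.
            t.setDaemon(true);
            return t;
        });
        Future<String> future = executor.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Requests interruption; code that ignores interrupts keeps running
            // in the background until the JVM exits.
            future.cancel(true);
            return fallback;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

A caller would wrap its own extraction call in the Callable, for example runWithTimeout(() -> extractText(file), 30000, ""), where extractText is a placeholder for the application's code. The timeout only bounds how long the caller waits; it does not reclaim a thread stuck in a loop that never checks for interruption.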