[jira] [Comment Edited] (PDFBOX-1787) pdfbox hangs on a corrupt PDF file

Hong-Thai Nguyen (JIRA) Mon, 02 Dec 2013 06:04:14 -0800

    [ 
https://issues.apache.org/jira/browse/PDFBOX-1787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13836523#comment-13836523
 ]


Hong-Thai Nguyen edited comment on PDFBOX-1787 at 12/2/13 2:01 PM:
-------------------------------------------------------------------

I agree that we can't do anything to extract the text content, but what we 
expect is that pdfbox should stop and report the error properly when it hits 
this kind of problem.
Is NonSequentialPDFParser the newer parser, with more robust handling of PDF 
files? Is its text extraction result the same as the current PDFParser's? I'm 
reading the code of PDFBOX-1104, and it seems that this parser improves 
extraction performance by starting extraction from a random page.
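
Until the parser itself fails cleanly, a caller-side time limit might be one 
way to get the "stop and report" behaviour. The following is only a minimal 
sketch using plain java.util.concurrent; BoundedTask and runWithTimeout are 
hypothetical names, not part of PDFBox, and the actual extraction call would go 
inside the Callable. It assumes nothing about whether the parser reacts to 
thread interruption, so the worker may keep running after the timeout; it is 
marked as a daemon thread so it does not block JVM exit.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class BoundedTask {
    private BoundedTask() {
    }

    /**
     * Runs the task on a separate worker thread and waits at most the given time.
     * Throws TimeoutException if the task has not finished by then.
     */
    static <T> T runWithTimeout(Callable<T> task, long timeout, TimeUnit unit)
            throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-task");
            // A stuck worker must not keep the JVM alive.
            t.setDaemon(true);
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeout, unit);
        } catch (TimeoutException e) {
            // Interrupts the worker; this only stops it if the task checks for interruption.
            future.cancel(true);
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

A command-line wrapper could catch the TimeoutException, print an error 
message naming the file, and exit with a non-zero status instead of hanging.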

Thanks


was (Author: thaichat04):
I agree that we can't do anything to extract the text content, but what we 
expect is that pdfbox should stop and report the error properly when it hits 
this kind of problem.
Is NonSequentialPDFParser the newer parser, with more robust handling of PDF 
files? Is its text extraction result the same as the current PDFParser's?

Thanks

> pdfbox hangs on a corrupt PDF file
> ----------------------------------
>
>                 Key: PDFBOX-1787
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1787
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 1.8.3
>         Environment: windows
>            Reporter: Hong-Thai Nguyen
>         Attachments: corrupt_file.pdf
>
>
> pdfbox hangs on command line on attached file.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
