[ https://issues.apache.org/jira/browse/PDFBOX-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13820953#comment-13820953 ]

Sharmilee S commented on PDFBOX-1779:
-------------------------------------

I tried that, and now I am getting this:
ExtractText failed with the following exception:
java.io.IOException: Missing end of file marker '%%EOF'
        at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.getStartxrefOffset(NonSequentialPDFParser.java:576)
        at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.initialParse(NonSequentialPDFParser.java:325)
        at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parse(NonSequentialPDFParser.java:700)
        at org.apache.pdfbox.pdmodel.PDDocument.loadNonSeq(PDDocument.java:1245)
        at org.apache.pdfbox.ExtractText.startExtraction(ExtractText.java:208)
        at org.apache.pdfbox.ExtractText.main(ExtractText.java:85)
        at org.apache.pdfbox.PDFBox.main(PDFBox.java:58)
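
Both traces fail while reading the file itself (the header in the original report, the '%%EOF' trailer here), so it may help to check what the shell script actually hands to ExtractText before involving PDFBox at all. The following is a minimal sketch using only java.nio; the class PdfSanityCheck and its diagnose method are hypothetical names and are not part of PDFBox. It assumes the header sits within the first 1024 bytes and the '%%EOF' marker within the last 1024 bytes, and it reads the whole file into memory, which is fine for a quick diagnosis but not for very large files.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of PDFBox: a quick sanity check on the input file.
final class PdfSanityCheck {

    // Assumption: header and trailer markers are searched within this many bytes.
    private static final int WINDOW = 1024;

    /** Returns a short diagnosis for the file at the given path. */
    static String diagnose(Path path) throws IOException {
        if (!Files.isRegularFile(path)) {
            // A wrong path, bad quoting, or stray whitespace from the script ends up here.
            return "not a regular file";
        }
        byte[] data = Files.readAllBytes(path);
        if (data.length == 0) {
            return "file is empty";
        }
        // ISO-8859-1 maps every byte to exactly one char, so binary data is safe to search.
        String text = new String(data, StandardCharsets.ISO_8859_1);
        String head = text.substring(0, Math.min(WINDOW, text.length()));
        String tail = text.substring(Math.max(0, text.length() - WINDOW));
        if (!head.contains("%PDF-")) {
            return "no %PDF- header near the start";
        }
        if (!tail.contains("%%EOF")) {
            return "no %%EOF marker near the end";
        }
        return "header and %%EOF marker found";
    }

    public static void main(String[] args) throws IOException {
        for (String arg : args) {
            // Brackets make leading or trailing whitespace in the argument visible.
            System.out.println("[" + arg + "]: " + diagnose(Paths.get(arg)));
        }
    }
}
```

Running it with the same argument the script passes to ExtractText shows whether the path resolves to a file and whether that file looks complete. A result of "not a regular file" or "file is empty" would point at the script rather than at the PDF.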


>  Error: End-of-File, expected line
> ----------------------------------
>
>                 Key: PDFBOX-1779
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1779
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 1.8.2
>         Environment: Linux 
> pdfbox 1.8.2 
>            Reporter: Sharmilee S
>            Priority: Critical
>              Labels: linux, pdf, pdfbox, textExtraction
>
> I get this exception when the filename is passed from a shell script on Linux.
> ExtractText failed with the following exception:
> java.io.IOException: Error: End-of-File, expected line
>         at org.apache.pdfbox.pdfparser.BaseParser.readLine(BaseParser.java:1489)
>         at org.apache.pdfbox.pdfparser.PDFParser.parseHeader(PDFParser.java:298)
>         at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:173)
>         at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1211)
>         at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1176)
>         at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1101)
>         at org.apache.pdfbox.ExtractText.startExtraction(ExtractText.java:212)
>         at org.apache.pdfbox.ExtractText.main(ExtractText.java:85)
>         at org.apache.pdfbox.PDFBox.main(PDFBox.java:58)



--
This message was sent by Atlassian JIRA
(v6.1#6144)
