errors in %%EOF handling (fix included)
---------------------------------------

                 Key: PDFBOX-979
                 URL: https://issues.apache.org/jira/browse/PDFBOX-979
             Project: PDFBox
          Issue Type: Bug
          Components: Parsing
    Affects Versions: 1.6.0
            Reporter: Timo Boehme


The '%%EOF' handling in PDFParser has several errors. The current 
implementation (starting at line 467):

                String eof = "";
                if(!pdfSource.isEOF())
                    readLine(); // if there's more data to read, get the EOF flag
                
                // verify that EOF exists
                if("%%EOF".equals(eof)) {
                    // PDF does not conform to spec, we should warn someone
                    log.warn("expected='%%EOF' actual='" + eof + "'");
                    // if we're not at the end of a file, just put it back and move on
                    if(!pdfSource.isEOF())
                        pdfSource.unread(eof.getBytes("ISO-8859-1"));
                }

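The problems:
- the eof variable is never assigned: the return value of readLine() is 
discarded, so eof stays ""
- the comparison if("%%EOF".equals(eof)) must be negated, because the warning 
applies when the marker is missing
- unreading must first add a newline or space byte, because readLine() reads 
the whole line (like in bug PDFBOX-978)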

Corrected version:
                String eof = "";
                if(!pdfSource.isEOF())
                    eof = readLine(); // if there's more data to read, get the EOF flag
                
                // verify that EOF exists
                if(!"%%EOF".equals(eof)) {
                    // PDF does not conform to spec, we should warn someone
                    log.warn("expected='%%EOF' actual='" + eof + "'");
                    // if we're not at the end of a file, just put it back and move on
                    if(!pdfSource.isEOF()) {
                        // we read a whole line; add space as newline replacement
                        pdfSource.unread( SPACE_BYTE );
                        pdfSource.unread(eof.getBytes("ISO-8859-1"));
                    }
                }
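
Why the extra space byte matters can be shown with a small standalone sketch. 
It is not PDFBox code: it assumes pdfSource behaves like 
java.io.PushbackInputStream, and the class EofPushbackSketch, its simplified 
readLine() and the sample input are hypothetical names made up for this 
illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class EofPushbackSketch {
    // Assumption: mirrors the SPACE_BYTE constant used in the corrected version.
    static final int SPACE_BYTE = 32;

    // Simplified stand-in for the parser's readLine(): returns the text up to
    // the next CR or LF and consumes that terminator byte.
    static String readLine(PushbackInputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = in.read()) != -1 && c != '\n' && c != '\r') {
            sb.append((char) c);
        }
        return sb.toString();
    }

    // Reads one line, pushes it back (optionally preceded by a separator),
    // and returns everything the stream yields afterwards.
    static String pushBackAndDrain(String data, boolean addSeparator) {
        try {
            byte[] bytes = data.getBytes(StandardCharsets.ISO_8859_1);
            PushbackInputStream in =
                    new PushbackInputStream(new ByteArrayInputStream(bytes), 64);
            String line = readLine(in);
            if (addSeparator) {
                // Unread the separator first: bytes unread later are read earlier,
                // so the stream yields the line and then the space.
                in.unread(SPACE_BYTE);
            }
            in.unread(line.getBytes(StandardCharsets.ISO_8859_1));

            StringBuilder rest = new StringBuilder();
            int c;
            while ((c = in.read()) != -1) {
                rest.append((char) c);
            }
            return rest.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String data = "12 0 obj\n<< >>";
        // Without the separator the pushed-back line merges with the next token.
        System.out.println(pushBackAndDrain(data, false)); // prints "12 0 obj<< >>"
        // With the separator the token boundary survives.
        System.out.println(pushBackAndDrain(data, true));  // prints "12 0 obj << >>"
    }
}
```

The order of the two unread() calls is the point: the separator goes in 
first so that it comes out after the line.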


