[ https://issues.apache.org/jira/browse/PDFBOX-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939008#comment-13939008 ]
Maruan Sahyoun commented on PDFBOX-1987:
----------------------------------------

PDFBOX-276 describes such a file. PDF.js has some files with invalid hex strings. There are some files which have missing CR and/or LF at the end of a stream ...

> Provide a PDF Lexer as a base for PDF parsing
> ---------------------------------------------
>
>                 Key: PDFBOX-1987
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1987
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: Parsing
>            Reporter: Maruan Sahyoun
>            Priority: Minor
>             Fix For: 2.0.0
>
>         Attachments: src.zip
>
>
> In order to enhance the parsing process and as a foundation for a combination
> of the different parsers a PDF lexer should be provided.