[ 
https://issues.apache.org/jira/browse/PDFBOX-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13665080#comment-13665080
 ] 

Timo Boehme commented on PDFBOX-1606:
-------------------------------------

NonSequentialPDFParser only takes the password via its constructor, but the 
decryption is done in initialParse, which is called by parse() (not by the 
constructor). The difference comes from how the two parsers work. PDFParser 
first reads all objects top-down while they are still encrypted. Then, with 
openProtection, the objects are decrypted and reparsed (object streams). 
NonSequentialPDFParser parses the XREF table and accesses the needed objects 
directly, decrypting them as they are read. Thus it needs the decryption 
password for parsing. One could add an openProtection(password) method to 
NonSequentialPDFParser that has to be called before parse(), but this would 
only add code with no extra functionality.
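
To make the ordering difference concrete, here is a minimal toy sketch. It is 
not PDFBox code: the class names EagerParser and LazyParser, the XOR "cipher" 
and the sample strings are all made up for illustration. It only shows why a 
parser that decrypts on read needs the key up front, while a read-everything 
parser can accept it afterwards.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: none of these names are PDFBox API.
class ParsingOrderSketch {

    // Stand-in for real PDF encryption: XOR each byte with a one-byte key.
    static byte[] xor(byte[] data, int key) {
        byte[] out = new byte[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = (byte) (data[i] ^ key);
        }
        return out;
    }

    // Builds a fake "encrypted file": object name -> encrypted bytes.
    static Map<String, byte[]> encryptedFile(int key) {
        Map<String, byte[]> objects = new LinkedHashMap<String, byte[]>();
        objects.put("Title", xor("Data Protection".getBytes(StandardCharsets.US_ASCII), key));
        objects.put("Producer", xor("Some Producer".getBytes(StandardCharsets.US_ASCII), key));
        return objects;
    }

    // Sequential style: read every object while still encrypted,
    // then decrypt all of them in a separate, later step.
    static class EagerParser {
        private final Map<String, byte[]> objects = new LinkedHashMap<String, byte[]>();

        void parse(Map<String, byte[]> file) {
            objects.putAll(file);
        }

        void openProtection(int key) {
            for (Map.Entry<String, byte[]> e : objects.entrySet()) {
                e.setValue(xor(e.getValue(), key));
            }
        }

        String get(String name) {
            return new String(objects.get(name), StandardCharsets.US_ASCII);
        }
    }

    // Non-sequential style: each object is decrypted at the moment it is
    // read, so the key has to be known before anything is accessed.
    static class LazyParser {
        private final Map<String, byte[]> file;
        private final int key;

        LazyParser(Map<String, byte[]> file, int key) {
            this.file = file;
            this.key = key;
        }

        String get(String name) {
            return new String(xor(file.get(name), key), StandardCharsets.US_ASCII);
        }
    }

    public static void main(String[] args) {
        int key = 0x5A;
        Map<String, byte[]> file = encryptedFile(key);

        EagerParser eager = new EagerParser();
        eager.parse(file);          // objects are still encrypted here
        eager.openProtection(key);  // decrypted in a second pass
        System.out.println("eager parser:         " + eager.get("Title"));

        LazyParser lazy = new LazyParser(file, key);
        System.out.println("lazy parser with key: " + lazy.get("Title"));

        // Key 0 leaves the bytes unchanged, i.e. still encrypted.
        LazyParser noKey = new LazyParser(file, 0);
        System.out.println("lazy parser without key is readable: "
                + "Data Protection".equals(noKey.get("Title")));
    }
}
```

In this sketch the lazy variant without a key hands back the still-encrypted 
bytes as text, which is roughly the kind of garbage one would expect in the 
document info fields when objects are read without being decrypted.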

                
> NonSequentialPDFParser produces garbage text in document info
> -------------------------------------------------------------
>
>                 Key: PDFBOX-1606
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1606
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 1.8.1
>         Environment: Windows 7, JRE 1.7.0_15-b03
>            Reporter: Alex Alishevskikh
>         Attachments: 00-214 EU Data Protection Directive Update 12-1.pdf
>
>
> For some documents, NonSequentialPDFParser produces PDDocumentInformation 
> with binary garbage in its fields (title/author/producer/etc). Invocation of 
> PDDocumentInformation.getXXXDate() methods fails with "IOException:Error 
> converting date" for those documents.
> Classic PDFParser does not have problems with the same documents.  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira