Hi,

I have a bunch of PDF files created by Acrobat PDFMaker and encrypted to prohibit changes and annotations (this matters because the documents are forms). Tika (1.3/trunk) fails to parse these documents.
A trial using NonSequentialParser (see PDFBOX-1554 and PDFBOX-1387) looks promising: text is extracted properly, but metadata is garbled (cf. PDFBOX-1606).

Does anyone have an alternative solution at hand? Is this already on the radar, or should I open a Jira issue?

Sample document (same error) from PDFBOX-1554:
https://issues.apache.org/jira/secure/attachment/12575878/test_1e6a2e_001_test.pdf

Thanks,
Sebastian
