Hi,

I have a batch of PDF files that are
- encrypted to prohibit changes and annotations
  (this matters because the documents are forms)
- created by Acrobat PDFMaker

Tika (1.3/trunk) fails to parse these documents.
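
For context, restrictions of this kind are stored in the permission flags
(the /P entry) of the PDF encryption dictionary. The following is only a
minimal sketch, assuming the bit layout from ISO 32000-1, Table 22. The class
name PdfPermissions and the sample value -44 are hypothetical and were not
read from my files. It shows that a document can forbid changes and
annotations while still permitting text extraction:

```java
/** Decodes a few bits of the /P entry used by the PDF standard security handler. */
final class PdfPermissions {
    // The PDF specification numbers bits from 1, starting at the low-order bit.
    private static final int MODIFY     = 1 << 3; // bit 4: modify document contents
    private static final int EXTRACT    = 1 << 4; // bit 5: copy or extract text and graphics
    private static final int ANNOTATE   = 1 << 5; // bit 6: add or modify annotations
    private static final int FILL_FORMS = 1 << 8; // bit 9: fill in existing form fields

    private PdfPermissions() {}

    static boolean canModify(int p)    { return (p & MODIFY) != 0; }
    static boolean canExtract(int p)   { return (p & EXTRACT) != 0; }
    static boolean canAnnotate(int p)  { return (p & ANNOTATE) != 0; }
    static boolean canFillForms(int p) { return (p & FILL_FORMS) != 0; }

    public static void main(String[] args) {
        // Hypothetical /P value: everything allowed except bits 4 and 6.
        int p = -44;
        System.out.println("modify:     " + canModify(p));
        System.out.println("annotate:   " + canAnnotate(p));
        System.out.println("extract:    " + canExtract(p));
        System.out.println("fill forms: " + canFillForms(p));
    }
}
```

With flags like these a parser that honours the permissions should still be
allowed to extract the text, which is all I need from Tika.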

A trial using PDFBox's NonSequentialPDFParser (see PDFBOX-1554 and
PDFBOX-1387) looks promising: the text is extracted properly, although the
metadata comes out garbled (cf. PDFBOX-1606).

Does anyone have an alternative solution at hand?
Is this already on the radar, or should I open a Jira issue?

Sample document (same error) from PDFBOX-1554:
https://issues.apache.org/jira/secure/attachment/12575878/test_1e6a2e_001_test.pdf

Thanks,
Sebastian
