Metadata extraction broken on some PDF files
--------------------------------------------

                 Key: PDFBOX-858
                 URL: https://issues.apache.org/jira/browse/PDFBOX-858
             Project: PDFBox
          Issue Type: Bug
          Components: PDModel
    Affects Versions: 1.2.1, 1.3.0
            Reporter: Patrik Stenmark
         Attachments: 2001Derivatives and Public Debt Mngt.pdf, 
RethinkingTheFinancialNetwork.pdf

On certain PDF files (examples attached), metadata extraction seems to be 
broken. Preview (on Mac OS X) and Acrobat Reader are both able to read the 
metadata, but PDFBox returns complete gibberish: 

Author=è'ÿÆ??kÔ7??ÕªG?

I've tried both the version included in Tika 0.7 (1.0.0 I believe) and r1021264 
from SVN. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
