Hi,

On 27.02.2012 09:30, Denis Voloshin wrote:
Hi

We have a PDF from a customer who complains that the document is
processed successfully but the content is gibberish.
If you open the file in Acrobat, it displays correctly and there is nothing
strange in the document properties, but if you copy text
from it and paste it into another document (Word or plain text) you get
gibberish.
Therefore we think something in the file is corrupted.
No, I don't think so.

We suspect that the reason is the way the document was created.
Correct. The given PDF uses a custom-made font encoding, which doesn't provide
a mapping to readable text.

The question is whether there is any way, using PDFBox, to inform a user that
we have such a problematic PDF.
Hmm, it may be possible. One has to determine which kind of font and encoding
is used, which should lead to the result you're looking for. But it might not
be that easy. There are many PDFs that use both "readable" and "unreadable" fonts.
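
As a rough illustration only, here is a different and simpler check than inspecting the fonts and encodings: a heuristic run on the text after it has been extracted. It contains no PDFBox calls; the class name, the method names and the 0.3 threshold are all hypothetical. It only catches extraction results that contain control characters, private-use or unassigned code points, or U+FFFD. A font that maps its glyphs to wrong but ordinary letters would slip through.

```java
// Hypothetical helper, not part of PDFBox: flags extracted text that
// consists largely of code points a reader cannot make sense of.
final class GibberishCheck {

    // Fraction of non-whitespace code points that are control characters,
    // private-use, unassigned, lone surrogates, or U+FFFD.
    static double suspiciousRatio(String text) {
        int total = 0;
        int suspicious = 0;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (Character.isWhitespace(cp)) {
                continue; // whitespace says nothing about readability
            }
            total++;
            int type = Character.getType(cp);
            if (type == Character.CONTROL
                    || type == Character.PRIVATE_USE
                    || type == Character.UNASSIGNED
                    || type == Character.SURROGATE
                    || cp == 0xFFFD) {
                suspicious++;
            }
        }
        return total == 0 ? 0.0 : (double) suspicious / total;
    }

    // The threshold is an assumption; tune it on real documents.
    static boolean looksUnreadable(String text, double threshold) {
        return suspiciousRatio(text) > threshold;
    }

    public static void main(String[] args) {
        System.out.println(looksUnreadable("Plain readable text", 0.3));
        System.out.println(looksUnreadable("\u0001\u0002\uE000\uFFFDab", 0.3));
    }
}
```

Running it per page rather than per document might help with files that mix "readable" and "unreadable" fonts, since a single bad font can otherwise be diluted by the rest of the text.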

The version we use is pdfbox-app-1.2.1-ecmts-1.5.jar

Thanks


*Best Regards.*
Denis Voloshin
Software engineer
Phone: +972-2-649-1162
Mobile: +972-54-642-2269

BR
Andreas Lehmkühler
