Hi

Apologies if this is the wrong email to use. I am trying to understand
whether, and how well, PDFBox supports extracting text from a PDF document
that contains Type 3 fonts. It has taken me a while to understand the reason
behind the apparent parsing failure.

Before I go further, I thought it would be better to ask. I also found this
ticket in JIRA, but I wasn't sure whether it is still relevant:

https://issues.apache.org/jira/browse/PDFBOX-124

I can use pdftotext; it's not completely successful, but it does extract the
text to some degree. Any guidance is greatly appreciated.

Thanks
Jinder
