Hi,

Apologies if this is the wrong email address to use. I am trying to understand whether, and how well, PDFBox supports extracting text from a PDF document that contains Type 3 fonts. It has taken me a while to understand the reason behind the apparent parsing failure.
Before going further, I thought it would be better to ask. I did find this ticket in JIRA, but I wasn't sure whether it is still relevant:
https://issues.apache.org/jira/browse/PDFBOX-124

I can use pdftotext; it is not completely successful, but it does extract the text to some degree.

Any guidance is greatly appreciated.

Thanks,
Jinder