Hi Brad,

On 21 Feb 2013, at 11:28, Brad Stallion <[email protected]> wrote:

> I'm extracting text from PDF files using my own SAX handler. The problem is
> that I get both visible and invisible text, i.e. text contained in invisible 
> parts of the layout.
> How can I identify the invisible parts?

We use PDFBox under the hood in Tika.  Have you tried asking on their user list?

Cheers,
Dave
