Hi Dave,
No, not yet, but that's a good idea.
If PDFBox has a parameter that can be tuned for this, how can I access it 
directly?
Thanks



>________________________________
> From: Dave Meikle <[email protected]>
>To: [email protected]; Brad Stallion <[email protected]> 
>Sent: Sunday, 10 March 2013 0:53
>Subject: Re: Tika and invisible text from pdf
> 
>Hi Brad,
>
>On 21 Feb 2013, at 11:28, Brad Stallion <[email protected]> wrote:
>
>> I'm extracting text from PDF files using my own sax handler. The problem is 
>> that I get both visible and invisible text, i.e. text contained in invisible 
>> parts of the layout.
>> How can I identify the invisible parts?
>
>We use PDFBox under the hood in Tika.  Have you tried asking on their user 
>list?
>
>Cheers,
>Dave
>
>
