Yes. Use iText or PDFBox. Both are common PDF libraries.
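
To give a rough idea of what that involves: in PDFBox the tag tree is reachable from the document catalog (getStructureTreeRoot()), and each PDStructureElement exposes getActualText() and getKids(). The sketch below shows only the tree walk you would run before handing the text to an indexer. So that it runs without the library, it uses a hypothetical stand-in Node class rather than the real PDFBox types; the class and method names are mine, not part of PDFBox, so treat it as an outline rather than a finished implementation.

```java
import java.util.ArrayList;
import java.util.List;

public class ActualTextCollector {

    /** Hypothetical stand-in for a tagged-PDF structure element (not a PDFBox class). */
    public static final class Node {
        final String actualText; // null when the tag carries no ActualText
        final List<Node> kids;   // child structure elements, in document order

        public Node(String actualText, List<Node> kids) {
            this.actualText = actualText;
            this.kids = kids;
        }
    }

    /** Depth-first walk that collects ActualText values in document order. */
    public static List<String> collect(Node root) {
        List<String> out = new ArrayList<>();
        walk(root, out);
        return out;
    }

    private static void walk(Node node, List<String> out) {
        if (node == null) {
            return;
        }
        if (node.actualText != null && !node.actualText.isEmpty()) {
            // ActualText replaces the element and its children,
            // so record it and do not descend any further.
            out.add(node.actualText);
            return;
        }
        for (Node kid : node.kids) {
            walk(kid, out);
        }
    }

    public static void main(String[] args) {
        Node root = new Node(null, List.of(
                new Node("first", List.of()),
                new Node(null, List.of(new Node("second", List.of())))));
        // Each collected string would be passed to the search index.
        for (String text : collect(root)) {
            System.out.println(text);
        }
    }
}
```

With the real library you would replace Node with PDStructureElement and skip any kids that are not structure elements (getKids() returns a mixed list of objects).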

On 2/6/16, 2:24 PM, "Code for Libraries on behalf of Andrew Cunningham" <CODE4LIB@LISTSERV.ND.EDU on behalf of lang.supp...@gmail.com> wrote:

>Hi all,
>
>I am working with PDF files in some South Asian and South East Asian
>languages. Each PDF has ActualText added for each tag in the PDF. Each PDF
>has ActualText as an alternative for the visible text layer in the PDF.
>
>Is anyone aware of tools that will allow me to index and search PDFs based
>on the ActualText content rather than the visible text layers in the PDF?
>
>Andrew
>
>--
>Andrew Cunningham
>lang.supp...@gmail.com