tika-user  

parsing only specified content types in archive

Daniel Knapp
Fri, 04 Dec 2009 06:58:22 -0800

Hello,

Is there an option to define which content types should be parsed inside an 
archive file?
For example, I have a zip archive that contains jar and pdf files. Tika should 
parse only the pdf files and skip the rest.
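
To make the example concrete, here is a rough sketch of the behaviour I am 
after. It uses only java.util.zip and matches on the file extension, not any 
Tika API. The class and method names are made up for illustration, and a real 
solution would presumably select by detected content type rather than by name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical illustration only: plain java.util.zip, no Tika classes involved.
public class PdfOnlyZipFilter {

    // Returns the names of the archive entries that should be handed to a parser.
    // Assumption: "is a pdf" is approximated by the ".pdf" file extension.
    public static List<String> selectPdfEntries(InputStream archive) throws IOException {
        List<String> selected = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(archive)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName().toLowerCase(Locale.ROOT);
                if (!entry.isDirectory() && name.endsWith(".pdf")) {
                    // This is where the entry's content would be parsed.
                    selected.add(entry.getName());
                }
                // Every other entry (for example the jar files) is skipped.
            }
        }
        return selected;
    }

    public static void main(String[] args) throws IOException {
        // Build a small in-memory zip with one jar entry and one pdf entry.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(buffer)) {
            for (String name : new String[] {"lib/tool.jar", "docs/manual.pdf"}) {
                out.putNextEntry(new ZipEntry(name));
                out.write(new byte[] {1, 2, 3}); // placeholder bytes, not real file content
                out.closeEntry();
            }
        }
        System.out.println(selectPdfEntries(new ByteArrayInputStream(buffer.toByteArray())));
    }
}
```

In this sketch only docs/manual.pdf would be selected; the jar entry is never 
read.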

Or is there a general option to define which content types should be parsed 
when using the Tika.parse(...) facade?

Thanks in advance!

Regards,
Daniel
