I am looking to index MS Word and PDF files by uploading them with Solr Cell,
which uses Apache Tika.
I would like to use Tika to detect corrupt files before indexing and to get a
list of the corrupted files, if that is possible.
I tried running java -jar tika-app.jar <input_dir> <output_dir>. In
<output_dir> I get every file from <input_dir> converted to XML, and each
corrupt file comes out with a size of 0 KB (empty).
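
One possible way to turn that observation into a list is to scan <output_dir> for zero-byte files after the batch run. The sketch below is only an illustration and not part of Tika: the function names find_empty_outputs and guess_input_name are made up here, and it assumes that an empty output file really does mean Tika failed on the input, and that each output file is named after its input with ".xml" appended.

```python
from pathlib import Path


def find_empty_outputs(output_dir):
    """Return sorted relative paths of all zero-byte files under output_dir."""
    root = Path(output_dir)
    empty = []
    for path in root.rglob("*"):
        # A zero-byte output is treated as a sign that parsing failed
        # (assumption taken from the behaviour described above).
        if path.is_file() and path.stat().st_size == 0:
            empty.append(path.relative_to(root).as_posix())
    return sorted(empty)


def guess_input_name(output_name, suffix=".xml"):
    """Strip the assumed output suffix to recover the input file name."""
    # Assumption: outputs are named "<input name>.xml"; adjust the suffix
    # if the batch run names its outputs differently.
    if output_name.endswith(suffix):
        return output_name[: -len(suffix)]
    return output_name
```

Calling find_empty_outputs on the output directory and passing each result through guess_input_name would give the names of the suspect input files, which could then be left out of the upload to Solr Cell.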
