Can somebody point me to a good tutorial on how to index Word documents using 
Solr?

I have a few hundred Microsoft Word documents I want to search. Through the use 
of the Tika library it seems as if I ought to be able to index my Word 
documents directly into Solr, but none of the tutorials I have found on the Web 
are complete: they refer to missing directories and missing files, or they 
document versions that have not been released.

Put another way, Tika can create a (nice) XHTML file complete with some useful 
metadata that can all be fed to Solr for indexing, but I can barely get out of 
the starting gate. Have you indexed Word documents using Solr, and if so, then 
how? 
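
For reference, here is a minimal sketch of what I understand the approach to be. It is not something I have working. It assumes Solr's ExtractingRequestHandler (Solr Cell, which wraps Tika) is enabled at /update/extract, that Solr runs at localhost:8983, and that there is a core named "documents". The core name, the helper function names, and the choice of the file name as the document id are all hypothetical.

```python
import urllib.parse
import urllib.request
from pathlib import Path

# Assumption: a local Solr with a core named "documents" (hypothetical).
SOLR_URL = "http://localhost:8983/solr/documents"


def extract_url(base_url, doc_id, commit=False):
    """Build the /update/extract URL for a single document."""
    params = {
        "literal.id": doc_id,      # value stored in the index's id field
        "resource.name": doc_id,   # file name hint for Tika's type detection
    }
    if commit:
        params["commit"] = "true"  # make the documents searchable right away
    query = urllib.parse.urlencode(params)
    return base_url.rstrip("/") + "/update/extract?" + query


def index_word_file(path, base_url=SOLR_URL, commit=False):
    """POST the raw Word file; Tika on the Solr side extracts text and metadata."""
    path = Path(path)
    request = urllib.request.Request(
        extract_url(base_url, path.name, commit=commit),
        data=path.read_bytes(),
        headers={"Content-Type": "application/octet-stream"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def index_directory(directory, base_url=SOLR_URL):
    """Index every .doc and .docx file in a directory, committing on the last one."""
    files = sorted(
        p for p in Path(directory).iterdir()
        if p.suffix.lower() in (".doc", ".docx")
    )
    for position, path in enumerate(files, start=1):
        index_word_file(path, base_url, commit=(position == len(files)))
    return len(files)
```

Calling index_directory() on the folder holding the Word files would send each one to Solr in turn. Which fields the extracted text and metadata end up in depends on how the handler is configured in solrconfig.xml, which this sketch does not cover.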

—
Eric Morgan
