Hello everyone!

I am up and running with a Nutch 1.4 / Solr 3.3 setup and am looking
to add a few new features.

My users want to view their Solr results as XHTML with the hits
highlighted in the document. So a Word document or PDF would first be
converted to an XHTML version.
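
To make the goal concrete, here is a rough sketch of the highlighting step I
have in mind, assuming the XHTML version of the document is already available
as a string. The class name HitHighlighter and the "hit" CSS class are names I
made up for illustration; nothing here is Nutch, Solr, or Tika API, it only
uses java.util.regex and assumes well-formed markup.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Nutch, Solr, or Tika.
public class HitHighlighter {
    // Matches either a complete tag (group 1) or a run of text between tags (group 2).
    // Assumes well-formed XHTML; a stray '<' with no closing '>' would be dropped.
    private static final Pattern TAG_OR_TEXT = Pattern.compile("(<[^>]*>)|([^<]+)");

    /** Wraps each whole-word, case-insensitive occurrence of term in an em element. */
    public static String highlight(String xhtml, String term) {
        Pattern hit = Pattern.compile("\\b" + Pattern.quote(term) + "\\b",
                Pattern.CASE_INSENSITIVE);
        StringBuilder out = new StringBuilder();
        Matcher m = TAG_OR_TEXT.matcher(xhtml);
        while (m.find()) {
            if (m.group(1) != null) {
                // Copy tags through untouched so attribute values are never modified.
                out.append(m.group(1));
            } else {
                // Only text between tags is searched for the hit term.
                out.append(hit.matcher(m.group(2))
                        .replaceAll("<em class=\"hit\">$0</em>"));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String page = "<p class=\"nutch\">Nutch crawls, then nutch indexes.</p>";
        System.out.println(highlight(page, "nutch"));
    }
}
```

With the sample string in main, both "Nutch" and "nutch" in the paragraph text
get wrapped, while the class="nutch" attribute is left alone. What I am missing
is the step before this: getting the XHTML out of the crawl in the first place.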

I see that Tika can produce XHTML, but I don't see a way to integrate that
with the parsing that Nutch does in the parse-tika plugin. It seems that the
"content" field sent to Solr contains only the plain text of the
document.

Is there a way to do this?

Thanks!



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Cached-page-like-google-with-hits-highlighted-tp4001374.html
Sent from the Nutch - User mailing list archive at Nabble.com.
