Hello everyone! I am up and running with my Nutch 1.4 / Solr 3.3 architecture and am looking to add a few new features.
My users want to view their Solr results as XHTML with the hits highlighted in the document, so a Word document or PDF would first be converted to an XHTML version. I see that Tika can produce XHTML, but I don't see a way to integrate that with the parsing Nutch does in the parse-tika plugin. It seems that what gets sent to Solr for the "content" field is just the plain text of the document. Is there a way to do this? Thanks!
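P.S. To make the goal concrete, here is a minimal sketch of the highlighting step I have in mind, assuming I already had the XHTML as a string (which is exactly the part I can't get out of parse-tika). It uses only the JDK's DOM classes. The class name HitHighlighter, the em class="hit" markup, and the single-term, case-insensitive matching are my own placeholders, not anything Nutch, Solr, or Tika provides.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

// Hypothetical helper: wraps every occurrence of a search term found in the
// text nodes of a well-formed XHTML string in <em class="hit">...</em>.
final class HitHighlighter {

    static String highlight(String xhtml, String term) {
        if (term == null || term.isEmpty()) {
            return xhtml; // nothing to highlight
        }
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));

            // Collect text nodes first so the tree is not modified while walking it.
            List<Text> textNodes = new ArrayList<>();
            collectTextNodes(doc.getDocumentElement(), textNodes);

            Pattern pattern = Pattern.compile(Pattern.quote(term), Pattern.CASE_INSENSITIVE);
            for (Text text : textNodes) {
                wrapMatches(doc, text, pattern);
            }

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not process XHTML input", e);
        }
    }

    private static void collectTextNodes(Node node, List<Text> out) {
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE) {
                out.add((Text) child);
            } else {
                collectTextNodes(child, out);
            }
        }
    }

    // Replaces one text node with a run of plain text nodes and <em> elements.
    private static void wrapMatches(Document doc, Text text, Pattern pattern) {
        String value = text.getData();
        Matcher m = pattern.matcher(value);
        if (!m.find()) {
            return;
        }
        Node parent = text.getParentNode();
        int last = 0;
        do {
            if (m.start() > last) {
                parent.insertBefore(doc.createTextNode(value.substring(last, m.start())), text);
            }
            Element em = doc.createElement("em");
            em.setAttribute("class", "hit");
            em.setTextContent(m.group());
            parent.insertBefore(em, text);
            last = m.end();
        } while (m.find());
        if (last < value.length()) {
            parent.insertBefore(doc.createTextNode(value.substring(last)), text);
        }
        parent.removeChild(text);
    }

    public static void main(String[] args) {
        System.out.println(highlight("<p>Nutch and nutch</p>", "nutch"));
    }
}
```

This only matters once the XHTML is available somewhere, for example stored alongside the plain-text "content" field. Real XHTML with namespaces or a DOCTYPE would probably need a namespace-aware parser and some care about DTD loading, which this sketch ignores.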