I am working on a simple project for my Information Retrieval class.  I am
currently using Nutch to crawl a set of pages, and it is indexing and storing
the parsed page text in Solr.  What I really want is for it to also store the
raw page source, HTML tags included.  Is there an easy way to tell Nutch to do
that?

If not, once my pages are indexed, what command would retrieve their original
source from Nutch?

--
View this message in context: 
http://lucene.472066.n3.nabble.com/Out-of-the-box-Nutch-indexing-url-source-to-Solr-tp3855918p3855918.html
Sent from the Nutch - User mailing list archive at Nabble.com.
