You can also write a parse filter that copies the raw HTML structure to another field, and then have an indexing filter index that field later.
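The two-step idea can be sketched roughly as follows. This is only a plain-Java model of the data flow, not a working plugin: `FetchedPage`, `ParsedPage`, `IndexDoc`, both step classes and the `rawcontent` field name are made up for illustration and are not Nutch classes. In a real Nutch 1.x plugin the two steps would, as far as I know, be implementations of the `HtmlParseFilter` and `IndexingFilter` extension points, and the Solr schema would need a matching stored field.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Stand-in types for illustration only; these are NOT Nutch classes.
final class FetchedPage {
    final String url;
    final byte[] rawBytes; // the page exactly as fetched, HTML tags included

    FetchedPage(String url, byte[] rawBytes) {
        this.url = url;
        this.rawBytes = rawBytes;
    }
}

final class ParsedPage {
    final String text; // extracted text, tags already stripped
    final Map<String, String> metadata = new HashMap<>();

    ParsedPage(String text) {
        this.text = text;
    }
}

final class IndexDoc {
    final Map<String, String> fields = new HashMap<>();
}

// Step 1 (parse time): copy the raw source into the parse metadata.
final class RawSourceParseStep {
    static final String RAW_KEY = "rawcontent"; // hypothetical field name

    static void apply(FetchedPage page, ParsedPage parsed) {
        // Assumes UTF-8; a real filter would have to use the detected encoding.
        String raw = new String(page.rawBytes, StandardCharsets.UTF_8);
        parsed.metadata.put(RAW_KEY, raw);
    }
}

// Step 2 (index time): move the stored value into the outgoing document.
final class RawSourceIndexStep {
    static void apply(ParsedPage parsed, IndexDoc doc) {
        String raw = parsed.metadata.get(RawSourceParseStep.RAW_KEY);
        if (raw != null) {
            doc.fields.put(RawSourceParseStep.RAW_KEY, raw);
        }
    }
}

class RawSourceDemo {
    public static void main(String[] args) {
        byte[] html = "<html><body><p>Hello</p></body></html>"
                .getBytes(StandardCharsets.UTF_8);
        FetchedPage page = new FetchedPage("http://example.com/", html);
        ParsedPage parsed = new ParsedPage("Hello");

        RawSourceParseStep.apply(page, parsed); // parse filter role
        IndexDoc doc = new IndexDoc();
        doc.fields.put("content", parsed.text);
        RawSourceIndexStep.apply(parsed, doc);  // indexing filter role

        System.out.println(doc.fields.get(RawSourceParseStep.RAW_KEY));
    }
}
```

Keeping the copy and the indexing separate means the raw source travels with the parse data, so the index step never has to go back to the fetched content.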
On Sunday 25 March 2012 18:39:53 JohnRodey wrote:
> I am just doing a simple project for my Information Retrieval class. I am
> currently using nutch to get a bunch of pages and it is indexing and
> storing the parsed page to SOLR. What I really want to do is have it
> store the page source with HTML tags as well. Is there an easy way to
> tell nutch to do that?
>
> If not, after I have my pages indexed if I want to retrieve there original
> source from nutch what would be the command to do that?
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Out-of-the-box-Nutch-indexing-url-source-to-Solr-tp3855918p3855918.html
> Sent from the Nutch - User mailing list archive at Nabble.com.

--
Markus Jelsma - CTO - Openindex

