Thanks Sebastian, 

I think I will try looking into the HtmlParseFilter since we do have control
over the content we are crawling and indexing. 



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Nutch-how-to-crawl-but-not-index-the-site-navigation-w-Solr-tp4078702p4079169.html
Sent from the Nutch - User mailing list archive at Nabble.com.
