Admittedly this is a cross-post from Stack Overflow, but I don't know if there are a whole lot of Nutch folks over there.
My question is about crawling HTML navigation menus without indexing the text of those links in Solr. I have seen discussions from several years ago about making this an option in later development, but searching has not turned up anything that gives a good indication of how one might exclude site navigation menu content from the content that Nutch indexes to Solr during a crawl. That is, I am seeing the navigation menu text in every document that gets indexed, and this damages search because all content ends up containing the same text.

Obviously I want to keep using the site navigation for crawling, but I don't want it indexed. Is there a best practice for accomplishing this with Nutch? For example, is there a way to wrap the navigation in some kind of tag? I am new to Nutch (obviously), so I don't know the best place to accomplish this.

Thanks very much.
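
P.S. To make the request concrete, here is a rough sketch of the behaviour I am after. It is not Nutch code and uses no Nutch API; it is plain Python with the standard library, and it assumes (hypothetically) that the menu is wrapped in a <nav> element. The class and function names are made up for illustration. It only shows the text side of the problem: the links inside the menu would still need to be followed by the crawler.

```python
from html.parser import HTMLParser


class NavSkippingTextExtractor(HTMLParser):
    """Collect page text, skipping anything nested inside a <nav> element."""

    def __init__(self):
        super().__init__()
        self.nav_depth = 0   # > 0 while we are inside one or more <nav> tags
        self.chunks = []     # text fragments found outside the navigation

    def handle_starttag(self, tag, attrs):
        if tag == "nav":
            self.nav_depth += 1

    def handle_endtag(self, tag):
        if tag == "nav" and self.nav_depth > 0:
            self.nav_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        # Keep only non-empty text that is not part of the menu.
        if text and self.nav_depth == 0:
            self.chunks.append(text)


def indexable_text(html):
    """Return the text that should be indexed, with the menu text left out."""
    parser = NavSkippingTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Assumed sample page: a menu in <nav> followed by the real content.
page = (
    "<html><body>"
    "<nav><a href='/a'>Home</a> <a href='/b'>Products</a></nav>"
    "<p>Actual article text.</p>"
    "</body></html>"
)
print(indexable_text(page))
```

In other words, "Home" and "Products" should be used for discovering pages but should not end up in the indexed content field, while the paragraph text should.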

