Hi,

I would appreciate any help with this. I want certain pages to be kept out of the index during the crawl if they don't meet specific criteria.
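In case it helps to show what I mean: if the criteria could be expressed as URL patterns, I assume a rule in the URL filter configuration (I believe conf/regex-urlfilter.txt) along these lines would keep such pages out of the crawl altogether (the pattern below is only a placeholder, not my real criterion):

  # hypothetical rule -- skip any URL containing this path segment
  -.*unwanted-section.*
  # accept everything else
  +.

I am not sure a URL filter covers my case, though, which is why I am asking here.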
I can see the URLs of the pages that fail the criteria in the Hadoop log, but those URLs and their contents are still being indexed. So what I want is to delete those URLs and their contents from the index (I have been inspecting it with the Luke tool). Can anybody help me resolve this issue?
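For the cleanup part, here is a rough sketch of what I was thinking of running against the existing index, assuming the page URL is stored in a field named "url" (I would check the actual field name in Luke first; the field name, index path, and exact Lucene calls here are my assumptions, so please correct me if this is not the right way to remove documents):

  import org.apache.lucene.index.IndexReader;
  import org.apache.lucene.index.Term;

  public class DeleteBadUrls {
      public static void main(String[] args) throws Exception {
          // args[0] = path to the index directory, remaining args = URLs to remove
          IndexReader reader = IndexReader.open(args[0]);
          for (int i = 1; i < args.length; i++) {
              // delete every document whose "url" field exactly matches this URL
              int deleted = reader.deleteDocuments(new Term("url", args[i]));
              System.out.println("Deleted " + deleted + " document(s) for " + args[i]);
          }
          reader.close(); // closing the reader writes the deletions to the index
      }
  }

The idea would be to compile this with the Lucene jar on the classpath and pass it the index directory plus the offending URLs taken from the Hadoop log. Is that a reasonable approach, or is there a proper Nutch way to do this?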
Ratnesh
V2Solutions India