Hi, I would appreciate some help with this: I want to prevent certain pages from being indexed during the crawl if they don't meet specific criteria.
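
One way to do this is with a custom indexing-filter plugin: returning null from IndexingFilter.filter() tells Nutch to drop the page from the index entirely. Below is a minimal sketch against the Nutch 0.8/0.9-era IndexingFilter interface; CriteriaIndexingFilter and meetsCriteria() are placeholder names for whatever check already produces the "failed" URLs in the Hadoop log.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.Text;
    import org.apache.lucene.document.Document;
    import org.apache.nutch.crawl.CrawlDatum;
    import org.apache.nutch.crawl.Inlinks;
    import org.apache.nutch.indexer.IndexingException;
    import org.apache.nutch.indexer.IndexingFilter;
    import org.apache.nutch.parse.Parse;

    // Drops any page that fails the check: returning null from filter()
    // means the document is never added to the index.
    public class CriteriaIndexingFilter implements IndexingFilter {

      private Configuration conf;

      public Document filter(Document doc, Parse parse, Text url,
                             CrawlDatum datum, Inlinks inlinks)
          throws IndexingException {
        if (!meetsCriteria(parse, url)) {
          return null;  // page is skipped, nothing reaches the index
        }
        return doc;
      }

      // Placeholder criterion; here: require some minimum parsed text.
      private boolean meetsCriteria(Parse parse, Text url) {
        String text = parse.getText();
        return text != null && text.length() > 100;
      }

      public void setConf(Configuration conf) { this.conf = conf; }
      public Configuration getConf() { return conf; }
    }

The plugin also needs the usual plugin.xml descriptor and an entry in the plugin.includes property in conf/nutch-site.xml so Nutch loads it at index time.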
I can see the URLs of the failing pages in the Hadoop log, but those URLs and their full contents are still indexed. So what I want is to delete those URLs (documents) from the existing index, which I have been inspecting with the Luke tool. Can anybody help me resolve this issue?

Ratnesh
V2Solutions India
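
For documents already in the index, Lucene's IndexReader.deleteDocuments(Term) can remove them after the fact. A minimal sketch, assuming the Lucene version bundled with Nutch (which has IndexReader.open(String) and deleteDocuments(Term)) and that documents are keyed by the "url" field Nutch's indexer writes. Note that the url field is tokenized in some Nutch versions, so a whole-URL term may not match; check the actual terms in Luke first. DeleteByUrl is a hypothetical helper class, not part of Nutch.

    import java.io.IOException;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.Term;

    // Deletes documents from an existing index by their "url" field.
    // Usage: java DeleteByUrl <indexDir> <url> [<url> ...]
    public class DeleteByUrl {
      public static void main(String[] args) throws IOException {
        IndexReader reader = IndexReader.open(args[0]);
        try {
          for (int i = 1; i < args.length; i++) {
            // deleteDocuments returns how many documents matched
            int n = reader.deleteDocuments(new Term("url", args[i]));
            System.out.println("deleted " + n + " doc(s) for " + args[i]);
          }
        } finally {
          reader.close();  // closing the reader commits the deletions
        }
      }
    }

Nutch also ships a PruneIndexTool (org.apache.nutch.tools.PruneIndexTool) that deletes all documents matching Lucene queries, which may be easier for bulk cleanup when the offending pages share a pattern.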
