There is an index prune tool, though I'm not sure whether it's in svn:

  bin/nutch prune -queries delete.me segments/*

The prune tool uses standard Lucene queries and defaults to the url field, so to delete the pages from one host you can write the host name as AND-ed terms. For example, spammy.host.com becomes spammy AND host AND com, and that would remove it.

--- "Insurance Squared Inc." <[EMAIL PROTECTED]> wrote:

> I know this has been asked a number of times, but I don't think there's
> been a definitive answer posted yet. Is there any way (in v0.71) to
> immediately remove a site (or all the pages from a site) from the
> index? Right now with our setup I think we have to wait the 30 days for
> the segment to die off before a site can be removed.
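
As a rough sketch of the host-to-query step described above, something like the following could generate the delete.me file. The helper name host_to_query is made up for illustration and is not part of Nutch, and the exact format the prune tool expects for its queries file is an assumption here (a single plain-text query), so check it against your Nutch version before running anything against a live index.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nutch): turn a host name into the
# AND-joined term query described in the message, by replacing each
# dot with " AND ".
host_to_query() {
    printf '%s\n' "$1" | sed 's/\./ AND /g'
}

# Write the query for one host into the queries file.
# The file name delete.me is the one used in the command from the message.
host_to_query spammy.host.com > delete.me

# The prune tool would then be run as shown in the message
# (left commented out so this sketch has no side effects on an index):
# bin/nutch prune -queries delete.me segments/*
```

It is probably worth running the same query through the normal search first to see which pages it matches, since an AND of short terms on the url field may match more than the one host.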
