There is an index prune tool, but I'm not sure whether
it's in svn.

bin/nutch prune -queries delete.me segments/*

The prune tool uses standard Lucene queries and defaults
to the url field, so if you want to delete the urls from

spammy.host.com

you could write the query as

spammy AND host AND com

and that would remove them.
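
A rough sketch of how that could look from the shell, assuming the file passed to -queries holds one Lucene query per line (I haven't checked that against the tool's source). host_to_query is a made-up helper, not part of Nutch; it just swaps the dots in a host name for AND. The prune command is the one shown above and only runs if bin/nutch is present in the current directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nutch): turn a host name into an
# AND-joined term query by replacing each dot with " AND ".
host_to_query() {
  printf '%s\n' "$1" | sed 's/\./ AND /g'
}

# Write the query to the file given to -queries.
# Assumption: the file holds one Lucene query per line.
host_to_query spammy.host.com > delete.me
cat delete.me

# Run the prune command from the message only when the Nutch
# launcher script is available in the current directory.
if [ -x bin/nutch ]; then
  bin/nutch prune -queries delete.me segments/*
else
  echo "bin/nutch not found; run this from the Nutch install directory"
fi
```

Keep AND in upper case, since the Lucene query parser treats a lower-case "and" as an ordinary search term rather than an operator.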

--- "Insurance Squared Inc."
<[EMAIL PROTECTED]> wrote:

> I know this has been asked a number of times, but I don't think
> there's been a definitive answer posted yet.  Is there any way
> (in v0.71) to immediately remove a site (or all the pages from a
> site) from the index?  Right now with our setup I think we have to
> wait the 30 days for the segment to die off before a site can be
> removed.



_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
