Hello, 

I'm running the PruneIndexTool to remove some unwanted URLs from my index, but 
it doesn't work. 

I use:
bin/nutch org.apache.nutch.tools.PruneIndexTool /nutch/local/my_crawl/index 
-queries queries.txt -output pruned.txt 

where: 
queries.txt has the following entries: 
site:topsy.com 
site:osdir.com 
site:www.cez.cz 
site:biblecourses.com 
site:bbftv.tv 
site:autoavangarde.org 
site:www.volkswagen.com 
site:premiere21.com 

After executing the command, pruned.txt contains a lot of URLs from the pruned 
sites, but when I run a new query, all of the pruned sites are still in the results. 

What am I doing wrong? 

Thanks 
Patricio 
