Hello, I am using apache-nutch-1.8.
I have an application that should crawl about 50 to 100 URLs. The problem is that the customer wants to change the URLs from time to time (and also delete some). What is the correct way to handle this? I wanted to do it this way:

1. Change the list of URLs in seed.txt.
2. Change the list in regex-urlfilter.txt (add a line such as +^http://www.rlp.de/ for every URL).
3. Delete the crawl directory and its subdirectories.
4. Delete the Solr index.
5. Run a cron job every night: bin/crawl urls/seed.txt crawl http://localhost:8983/solr 5

Is this OK? Thanks for any help,
Martin
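For what it's worth, the procedure described above could be wrapped in a single script that cron runs nightly. This is only an untested sketch: NUTCH_HOME, the crawl directory, and the Solr URL are assumptions to be adjusted for the actual installation, and note that in Nutch 1.8 bin/crawl expects the seed *directory* (e.g. urls/), not seed.txt itself:

```shell
#!/bin/sh
# Nightly re-crawl sketch (assumptions: install path, core name, 5 rounds).
NUTCH_HOME="${NUTCH_HOME:-/opt/apache-nutch-1.8}"   # assumed install path
CRAWL_DIR="$NUTCH_HOME/crawl"
SOLR_URL="http://localhost:8983/solr"

# 1. Remove old crawl data so URLs deleted from seed.txt cannot linger
#    in the CrawlDB and get re-fetched.
rm -rf "$CRAWL_DIR"

# 2. Empty the Solr index with a delete-by-query on *:*, then commit.
curl -s "$SOLR_URL/update?commit=true" \
     -H 'Content-Type: text/xml' \
     --data-binary '<delete><query>*:*</query></delete>'

# 3. Re-crawl from the current seed directory and index into Solr (5 rounds).
"$NUTCH_HOME/bin/crawl" "$NUTCH_HOME/urls" "$CRAWL_DIR" "$SOLR_URL" 5
```

A matching crontab entry (assuming the script is saved as /opt/apache-nutch-1.8/recrawl.sh) might look like:

```shell
0 2 * * * /opt/apache-nutch-1.8/recrawl.sh >> /tmp/recrawl.log 2>&1
```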

