Hi,

Background: I have several article-list URLs in seed.txt. Currently, the nutch crawl command fetches both the list URLs and the article URLs on every run. I want to avoid re-crawling the article URLs that have already been fetched, while still re-crawling the list URLs from seed.txt each time. Do you have any ideas about how to do this?
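For reference, here is the kind of setup I have been considering (just a sketch, assuming Nutch 1.x: `db.fetch.interval.default` and the `nutch.fetchInterval` seed metadata are taken from the Nutch docs, and the example URL is hypothetical):

```xml
<!-- nutch-site.xml: raise the default re-fetch interval so article pages
     that were already crawled are not due again for a long time.
     Value is in seconds; 30 days shown here. -->
<property>
  <name>db.fetch.interval.default</name>
  <value>2592000</value>
</property>
```

and then override the interval per seed URL in seed.txt, so the list pages come due again every day:

```
# seed.txt — per-URL metadata read by the Injector (hypothetical URL)
http://example.com/articles/list	nutch.fetchInterval=86400
```

Would that work, or is there a better mechanism (e.g. an adaptive fetch schedule)?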
Regards, Rui

