Hi,

After applying the adaptive refetch patch to Nutch mapred, I ran the crawl
command once, since the crawldb has to be initialized first.
For the next run, I commented out the following lines in
org.apache.nutch.crawl.Crawl.java:

    if (fs.exists(dir)) {
      throw new RuntimeException(dir + " already exists.");
    }

and

    new Injector(job).inject(crawlDb, rootUrlDir);

But I find that the files are fetched again even though they were not
modified. How can I keep the same crawldb and reuse it for further crawls in
the mapred version?
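
To make the intent concrete, the behaviour I am after is roughly the following. This is only a self-contained sketch using plain java.nio instead of the Nutch classes; the class name CrawlReuseSketch and the helper shouldInject are made up for illustration, and I am assuming the crawldb lives in a "crawldb" subdirectory of the crawl directory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration only: plain java.nio, no Nutch classes involved.
class CrawlReuseSketch {

    // Returns true when the crawl directory has no "crawldb" subdirectory yet,
    // i.e. when the seed URLs would still have to be injected.
    // Assumption: the crawldb is stored under <crawlDir>/crawldb.
    static boolean shouldInject(Path crawlDir) {
        return !Files.isDirectory(crawlDir.resolve("crawldb"));
    }

    public static void main(String[] args) throws IOException {
        // First run: a fresh crawl directory without a crawldb.
        Path crawlDir = Files.createTempDirectory("crawl");
        System.out.println("first run, inject: " + shouldInject(crawlDir));

        // Later run: the crawldb already exists and should be reused.
        Files.createDirectory(crawlDir.resolve("crawldb"));
        System.out.println("later run, inject: " + shouldInject(crawlDir));
    }
}
```

So instead of commenting the lines out by hand, the existence check and the inject step would both be skipped whenever a crawldb is already present.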

Thanks,
D.Saravanaraj