Hi, I need some help with a Nutch re-crawl, especially because I am using Nutch 2.2.1 with HBase, and I could not find much information anywhere on how a re-crawl can be performed when the URLs are stored in HBase.
*Background:* I have crawled various domains, and each time I crawled an individual domain I assigned it a specific HBase table name in the crawl command, as shown below.

Crawl 1: URL: www.abc.com
Crawl command used when it was first crawled:

    bin/crawl urls abc_webpage http://localhost:8983/solr/ 6

Crawl 2: URL: www.def.com
Crawl command used when it was first crawled:

    bin/crawl urls def_webpage http://localhost:8983/solr/ 10

Crawl 3: URL: www.ghi.com
Crawl command used when it was first crawled:

    bin/crawl urls ghi_webpage http://localhost:8983/solr/ 3

*Question:* It is now time to refetch/re-crawl all of those URLs, since several updates have been made to them since they were last crawled. I will be using DefaultFetchSchedule, and I have updated the fetch interval in nutch-site.xml accordingly (see the snippet below). My question is: what Nutch command should I use to *re-crawl* the three URLs from the example above?
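In case it matters, here is roughly what I set in nutch-site.xml. I am assuming db.fetch.interval.default is the property DefaultFetchSchedule uses for the refetch interval, and the 7-day value is just what I happened to pick:

    <!-- use the default fixed-interval fetch schedule -->
    <property>
      <name>db.fetch.schedule.class</name>
      <value>org.apache.nutch.crawl.DefaultFetchSchedule</value>
    </property>

    <!-- refetch pages every 7 days (value is in seconds) -->
    <property>
      <name>db.fetch.interval.default</name>
      <value>604800</value>
    </property>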

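My current guess, just a sketch of what I was planning to try rather than anything I have verified, is to re-run the same commands with the same crawl IDs so that each re-crawl reuses its existing HBase table:

    # re-run each crawl with its original crawl ID / table name
    bin/crawl urls abc_webpage http://localhost:8983/solr/ 6
    bin/crawl urls def_webpage http://localhost:8983/solr/ 10
    bin/crawl urls ghi_webpage http://localhost:8983/solr/ 3

But I am not sure whether the generate step will actually pick up the previously fetched URLs once their fetch interval has elapsed, or whether I need to run the individual steps (inject, generate, fetch, parse, updatedb) by hand. Is re-running bin/crawl the right approach?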
Thanks for any help!