Hi,

You can put the separate crawling phases into separate scripts, something like

inject.sh
crawl.sh
indexing.sh

and configure these scripts to start at a certain time using any scheduling tool. For example, I find the Linux cron scheduler very easy to use.

But you cannot configure a crawl to run only between 12.00 and 13.00. A crawl keeps running as long as it has unfetched resources in the queue, or until the max fetch limit is reached, and it takes as much time as needed.

Best Regards
Alexander Aristov

On 9 February 2011 04:17, .: Abhishek :. <[email protected]> wrote:

> Hi all,
>
> I am just trying to figure out if there is some way I can set Nutch crawls
> between a time interval say like crawl from 12:00 AM to 12:00 PM and then
> start the further processing (start process of indexing and so on that
> follows the crawl) after that.
>
> I think Nutch job is tied to Hadoop's JobConf. I am not sure on how this
> could be done. Rather, if I am to use an external shell script for doing
> this, how do I chain the crawl process and trigger further processing after
> crawl?
>
> Thanks,
> Abi
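To illustrate the chaining asked about in the quoted message, here is a minimal sketch of a wrapper script. It assumes the three phase scripts named above exist and are executable. The function name run_phases, the file name run_crawl.sh and all paths are made up for the example. It only shows how one phase can trigger the next, not what each phase script should contain.

```shell
#!/bin/sh
# run_phases: run each given script in order and stop at the first failure,
# so that indexing never starts unless the crawl finished successfully.
# (run_phases is a hypothetical helper, not part of Nutch.)
run_phases() {
    for phase in "$@"; do
        echo "starting $phase"
        "$phase" || { echo "$phase failed" >&2; return 1; }
        echo "finished $phase"
    done
}

# Example call, using the script names from the message above
# (uncomment once the scripts exist; the ./ paths are assumptions):
# run_phases ./inject.sh ./crawl.sh ./indexing.sh

# Hypothetical crontab entry that starts the whole chain every day at 12:00:
# 0 12 * * * /path/to/run_crawl.sh >> /path/to/crawl.log 2>&1
```

With this layout cron only decides when the chain starts. Indexing begins whenever the crawl script returns, however long that takes, which matches the point above that a crawl cannot be pinned to a fixed time window.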

