I'm not sure what Hadoop can do here, but Nutch itself can't do this. What you can do is write a run script that checks the current time before starting. Be aware that Nutch jobs cannot always be aborted and resumed, so be careful with the fetch process in particular.
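A rough sketch of such a run script, written in Python purely for illustration. The names `within_window` and `crawl_until_window_closes` are made up for this example, and `run_cycle` and `post_process` are placeholders for whatever you use to launch one crawl cycle and the later indexing steps (for instance a `subprocess.run` call on your own Nutch commands). The time is checked only before a cycle starts, so a running fetch is never interrupted, though it may finish after the window has closed.

```python
from datetime import datetime, time


def within_window(now, start, end):
    """Return True if the time-of-day `now` falls in [start, end)."""
    if start <= end:
        return start <= now < end
    # Window wraps past midnight, e.g. 22:00 to 06:00.
    return now >= start or now < end


def crawl_until_window_closes(run_cycle, post_process, start, end,
                              clock=datetime.now):
    """Run crawl cycles while the window is open, then post-process.

    run_cycle and post_process are hypothetical callables supplied by
    the caller; nothing here is part of Nutch or Hadoop.
    """
    cycles = 0
    # Check the clock before each cycle only; never abort a running one.
    while within_window(clock().time(), start, end):
        run_cycle()
        cycles += 1
    # The window has closed: chain the follow-up processing.
    post_process()
    return cycles
```

For the 12:00 AM to 12:00 PM case you would pass `start=time(0, 0)` and `end=time(12, 0)`. If the script is started outside the window it runs no crawl cycles and goes straight to the post-processing step, so you may want to start it from cron at the beginning of the window.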

On Wednesday 09 February 2011 02:17:01 .: Abhishek :. wrote:
> Hi all,
>
> I am just trying to figure out if there is some way I can set Nutch crawls
> between a time interval say like crawl from 12:00 AM to 12:00 PM and then
> start the further processing (start process of indexing and so on that
> follows the crawl) after that.
>
> I think Nutch job is tied to Hadoop's JobConf. I am not sure on how this
> could be done. Rather, if I am to use an external shell script for doing
> this, how do I chain the crawl process and trigger further processing after
> crawl?
>
> Thanks,
> Abi

--
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350

