Abhishek,

You can probably take a look at Oozie or Azkaban. I am not sure they support running a process between times x and y, but they definitely support scheduling a job.

Thanks and Regards,
Sonal
<https://github.com/sonalgoyal/hiho> Connect Hadoop with databases, Salesforce, FTP servers and others
Nube Technologies <http://www.nubetech.co>
<http://in.linkedin.com/in/sonalgoyal>

On Thu, Feb 10, 2011 at 4:31 PM, Markus Jelsma <[email protected]> wrote:

> I'm unsure about what Hadoop can do here, but with Nutch you can't. What you
> can do is create a run script that checks the current time before starting.
> Nutch jobs cannot always be aborted and resumed; beware of the fetch
> process.
>
> On Wednesday 09 February 2011 02:17:01 .: Abhishek :. wrote:
> > Hi all,
> >
> > I am just trying to figure out if there is some way I can set Nutch crawls
> > between a time interval, say crawl from 12:00 AM to 12:00 PM, and then
> > start the further processing (indexing and so on that follows the crawl)
> > after that.
> >
> > I think a Nutch job is tied to Hadoop's JobConf. I am not sure how this
> > could be done. Rather, if I am to use an external shell script for doing
> > this, how do I chain the crawl process and trigger further processing
> > after the crawl?
> >
> > Thanks,
> > Abi
>
> --
> Markus Jelsma - CTO - Openindex
> http://www.linkedin.com/in/markus17
> 050-8536620 / 06-50258350
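For what it's worth, the run-script idea Markus describes could look roughly like the sketch below. It is only an illustration under stated assumptions, not something taken from Nutch or Hadoop: `in_window` and `crawl_then_process` are made-up helper names, and the commands passed in are placeholders for whatever wrapper scripts you use for one crawl round and for indexing. The clock is checked only before each crawl round starts, so a running fetch is never interrupted, and the follow-up steps start once the window has closed.

```python
import subprocess
from datetime import datetime, time


def in_window(now, start, end):
    """True if `now` is in [start, end); also handles windows that wrap past midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def crawl_then_process(crawl_cmd, post_cmds, start, end,
                       clock=lambda: datetime.now().time(),
                       run=subprocess.call, max_rounds=100):
    """Repeat `crawl_cmd` while the clock is inside the window, then run `post_cmds`.

    The time is checked only before a round starts; a round that is already
    running is left to finish. Returns (rounds_completed, ok).
    """
    rounds = 0
    while rounds < max_rounds and in_window(clock(), start, end):
        if run(crawl_cmd) != 0:
            return rounds, False      # a crawl round failed: stop, do not index
        rounds += 1
    if rounds == 0:
        return 0, False               # started outside the window: nothing to process
    for cmd in post_cmds:
        if run(cmd) != 0:
            return rounds, False      # a follow-up step failed: stop the chain
    return rounds, True


# Hypothetical usage (both script names are placeholders, not Nutch commands):
# crawl_then_process(["./crawl_round.sh"], [["./index.sh"]],
#                    start=time(0, 0), end=time(12, 0))
```

Because each step runs only after the previous one has returned with exit status 0, the same loop also answers the chaining part of the question. A cron entry (or an Oozie or Azkaban schedule) would then only need to start the script at the beginning of the window.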

