If I understand you correctly, you are asking about installing Oozie as a distributed and/or HA cluster? In that case I am not familiar with an out-of-the-box solution from Oozie. But I think you can put together a solution of your own, for example: install Oozie on two servers sharing the same partition, kept synchronized by DRBD. You can then trigger a failover using Linux Heartbeat and maintain a virtual IP that way.
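To make the idea concrete, here is a rough sketch of what the old Heartbeat v1 resource configuration for such a pair could look like. The node name, virtual IP, DRBD resource name, mount point, and init-script name below are all placeholders, not values from a real setup:

```
# /etc/ha.d/haresources -- node1 is the preferred primary (placeholder names)
# On failover, Heartbeat takes over the virtual IP, promotes the DRBD
# resource, mounts the shared filesystem, then starts the oozie init script.
node1 192.168.1.100 drbddisk::r0 Filesystem::/dev/drbd0::/opt/oozie::ext3 oozie
```

The resources are started left-to-right on takeover and stopped right-to-left on release, so the Oozie service only starts once its data partition is mounted.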
On Thu, Sep 1, 2011 at 1:59 PM, Per Steffensen <[email protected]> wrote:
> Hi
>
> Thanks a lot for pointing me to Oozie. I have looked a little bit into
> Oozie and it seems like the "component" triggering jobs is called
> "Coordinator Application". But I really see nowhere that this Coordinator
> Application doesn't just run on a single machine, and that it will
> therefore not trigger anything if this machine is down. Can you confirm
> that the "Coordinator Application" role is distributed in a distributed
> Oozie setup, so that jobs get triggered even if one or two machines are
> down?
>
> Regards, Per Steffensen
>
> Ronen Itkin skrev:
>> Hi
>>
>> Try to use Oozie for job coordination and work flows.
>>
>> On Thu, Sep 1, 2011 at 12:30 PM, Per Steffensen <[email protected]>
>> wrote:
>>
>>> Hi
>>>
>>> I use hadoop for a MapReduce job in my system. I would like to have
>>> the job run every 5th minute. Is there any "distributed" timer job
>>> stuff in hadoop? Of course I could set up a timer in an external timer
>>> framework (CRON or something like that) that invokes the MapReduce
>>> job. But CRON is only running on one particular machine, so if that
>>> machine goes down my job will not be triggered. Then I could set up
>>> the timer on all or many machines, but I would not like the job to be
>>> run in more than one instance every 5th minute, so the timer jobs
>>> would need to coordinate who is actually starting the job "this time"
>>> and all the rest would just have to do nothing. Guess I could come up
>>> with a solution to that - e.g. writing some "lock" stuff using HDFS
>>> files or by using ZooKeeper. But I would really like it if someone had
>>> already solved the problem and provided some kind of a "distributed
>>> timer framework" running in a "cluster", so that I could just register
>>> a timer job with the cluster and then be sure that it is invoked every
>>> 5th minute, no matter if one or two particular machines in the cluster
>>> are down.
>>>
>>> Any suggestions are very welcome.
>>>
>>> Regards, Per Steffensen

--
*Ronen Itkin*
Taykey | www.taykey.com
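For the "run every 5th minute" part of the question, an Oozie coordinator app covers the scheduling itself (the HA concern above is separate). A minimal sketch might look like the following; the app-path, dates, and NameNode address are placeholders:

```xml
<!-- coordinator.xml: trigger a workflow every 5 minutes.
     In the coordinator 0.1 schema, frequency is given in minutes. -->
<coordinator-app name="every-5-min" frequency="5"
                 start="2011-09-01T00:00Z" end="2012-09-01T00:00Z"
                 timezone="UTC" xmlns="uri:oozie:coordinator:0.1">
  <action>
    <workflow>
      <!-- placeholder path to the workflow application in HDFS -->
      <app-path>hdfs://namenode:8020/user/steff/my-workflow</app-path>
    </workflow>
  </action>
</coordinator-app>
```

Note that this only addresses the timer part: the coordinator still runs inside a single Oozie server, which is exactly the single point of failure discussed in this thread.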
