Hi, I wrote up a page describing how we might go about doing scheduled tasks. It largely leverages the existing task framework and adds a scheduling layer on top of it.
https://cwiki.apache.org/confluence/display/HELIX/Scheduled+Tasks

Any feedback is appreciated.

Thanks,
Kanak

----------------------------------------
> Date: Tue, 18 Mar 2014 20:29:21 -0700
> Subject: Re: Scheduling tasks in the cluster
> From: [email protected]
> To: [email protected]
>
> I like the idea. What we should do is have the concept of a Task
> Manager with APIs to execute tasks immediately, after a specific
> duration, or periodically. I think we can absolutely put together an
> API for this, with synchronous responses and fire-and-forget with
> callback semantics.
>
> The tricky part is persistence, since we need to make sure tasks can be
> pulled into memory right before they are to be scheduled, etc.
>
> But all in all it would be a good addition.
>
> Sandeep
>
> On Tue, Mar 18, 2014 at 5:14 PM, Kanak Biscuitwala <[email protected]>
> wrote:
>>
>> I'll send out a longer email once I've finished gathering requirements and
>> sketching through a design, but here are my initial thoughts:
>>
>> - This actually requires two things from Helix: being able to run tasks in
>> the cluster reliably and being able to schedule tasks in the cluster reliably
>> - For the task half of this work, we probably have most of the code
>> available already, as the task framework supports things like target
>> resources, DAG-based dependencies, task states, canceling, and correctness
>> in the face of controller failover.
>> - The scheduling half is the part that requires the most new additions. We
>> basically need to be able to (1) store the schedule, (2) know when to wake
>> up to process an item on the schedule, and (3) do this without needing
>> anything in controller memory
>>
>> Kanak
>>
>> ----------------------------------------
>>> Date: Tue, 18 Mar 2014 16:09:05 -0700
>>> Subject: Scheduling tasks in the cluster
>>> From: [email protected]
>>> To: [email protected]
>>>
>>> This requirement has come up often and I think it's worthwhile to spend
>>> some time to come up with an elegant solution. We have offered workarounds,
>>> but they still require users to write quite a bit of complex code.
>>>
>>> Problem statement:
>>> Schedule a Task(s) in the cluster. The task can be Adhoc (one time) or
>>> Recurring (every X minutes or once between 12 to 3 AM etc - basically a
>>> cron expression). There are additional criteria as to where the task should
>>> be run: it can be run on any node in the cluster, or on any node in that
>>> cluster that hosts a particular resource in a particular state. If the task
>>> fails we might have to retry it, for example retrying x times before trying
>>> on another node. There might be additional constraints, such as that no more
>>> than X tasks should be run on a particular node or across the entire cluster.
>>>
>>> Helix supports all these features in one way or another, but there is no
>>> first-class API that encapsulates all of the above features.
>>>
>>> Any thoughts on what such an API/DSL should look like?
>>>
>>> thanks,
>>> Kishore G
>>
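
To make the three scheduling requirements listed above a bit more concrete (store the schedule, know when to wake up, keep nothing in controller memory), here is a rough Python sketch. It is not Helix code and does not reflect the design on the wiki page. The ScheduleStore class, its method names, and the JSON file standing in for a durable store such as ZooKeeper are all assumptions made purely for illustration.

```python
import json
import os
import tempfile


class ScheduleStore:
    """Hypothetical durable schedule; a JSON file stands in for ZooKeeper."""

    def __init__(self, path):
        self.path = path  # the only in-memory state is the store location

    def _load(self):
        if not os.path.exists(self.path):
            return []
        with open(self.path) as f:
            return json.load(f)

    def _save(self, entries):
        # Write to a temp file, then rename, so readers never see a partial file.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(entries, f)
        os.replace(tmp, self.path)

    def add(self, task_id, run_at, period=None):
        """(1) Store the schedule: persist one entry, optionally recurring."""
        entries = self._load()
        entries.append({"task": task_id, "run_at": run_at, "period": period})
        self._save(entries)

    def next_wakeup(self):
        """(2) Know when to wake up: earliest pending run time, or None."""
        return min((e["run_at"] for e in self._load()), default=None)

    def pop_due(self, now):
        """Return tasks due at `now`; recurring entries are rescheduled."""
        due, remaining = [], []
        for e in self._load():
            if e["run_at"] <= now:
                due.append(e["task"])
                if e["period"] is not None:
                    # Advance to the first slot strictly after `now`.
                    run_at = e["run_at"]
                    while run_at <= now:
                        run_at += e["period"]
                    remaining.append({**e, "run_at": run_at})
            else:
                remaining.append(e)
        self._save(remaining)
        return due


# (3) Nothing in controller memory: every call below uses a fresh object,
# as a newly elected controller would after failover.
path = os.path.join(tempfile.mkdtemp(), "schedule.json")
ScheduleStore(path).add("backup", run_at=100, period=60)
ScheduleStore(path).add("cleanup", run_at=130)
print(ScheduleStore(path).pop_due(now=125))  # ['backup']
print(ScheduleStore(path).next_wakeup())     # 130
```

Because every operation reloads the schedule from the store, whichever controller is the leader can compute the next wake-up time without any handoff. A real implementation would also need to handle concurrent writers, cron-style expressions, and the placement and retry constraints from the problem statement, none of which this sketch attempts.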
