Hi,

I wrote up a page describing how we might go about doing scheduled tasks. It 
largely leverages the existing task framework and adds a scheduling layer on 
top of it.

https://cwiki.apache.org/confluence/display/HELIX/Scheduled+Tasks

Any feedback is appreciated.

Thanks,
Kanak

----------------------------------------
> Date: Tue, 18 Mar 2014 20:29:21 -0700
> Subject: Re: Scheduling tasks in the cluster
> From: [email protected]
> To: [email protected]
>
> I like the idea. What we should do is have the concept of a Task
> Manager with APIs to execute tasks immediately, after a specific
> duration, or periodically. I think we can absolutely put together an
> API for this, with both synchronous responses and fire-and-forget
> with callback semantics.
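
To make the shape of such an API concrete, here is a rough single-JVM sketch.
Everything in it is an assumption for illustration: TaskManager,
LocalTaskManager, and the method names are hypothetical and are not part of
Helix, and a plain JDK executor stands in for the cluster.

```java
import java.util.concurrent.*;
import java.util.function.Consumer;

// Hypothetical interface, not an existing Helix API.
interface TaskManager {
  // Run now; the caller may block on the Future for a synchronous response.
  <T> Future<T> executeNow(Callable<T> task);

  // Run once after the given delay.
  <T> ScheduledFuture<T> executeAfter(Callable<T> task, long delay, TimeUnit unit);

  // Run repeatedly at a fixed period, starting immediately.
  ScheduledFuture<?> executePeriodically(Runnable task, long period, TimeUnit unit);

  // Fire-and-forget: exactly one of the two callbacks is invoked later.
  <T> void fireAndForget(Callable<T> task, Consumer<T> onSuccess, Consumer<Throwable> onFailure);
}

// Local stand-in backed by a JDK scheduler; a real version would hand
// tasks to the cluster and persist them instead of holding them in memory.
class LocalTaskManager implements TaskManager {
  private final ScheduledExecutorService executor = Executors.newScheduledThreadPool(2);

  public <T> Future<T> executeNow(Callable<T> task) {
    return executor.submit(task);
  }

  public <T> ScheduledFuture<T> executeAfter(Callable<T> task, long delay, TimeUnit unit) {
    return executor.schedule(task, delay, unit);
  }

  public ScheduledFuture<?> executePeriodically(Runnable task, long period, TimeUnit unit) {
    return executor.scheduleAtFixedRate(task, 0, period, unit);
  }

  public <T> void fireAndForget(Callable<T> task, Consumer<T> onSuccess, Consumer<Throwable> onFailure) {
    executor.execute(() -> {
      T result;
      try {
        result = task.call();
      } catch (Exception e) {
        onFailure.accept(e);
        return;
      }
      onSuccess.accept(result);
    });
  }

  public void shutdown() {
    executor.shutdownNow();
  }
}
```

A caller wanting a synchronous answer would block on
executeNow(...).get(), while fireAndForget returns right away and reports
through the callbacks.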
>
> The tricky part is persistence, since we need to make sure tasks can
> be pulled into memory right before they are scheduled to run.
>
> But all in all, this would be a good addition.
>
> Sandeep
>
> On Tue, Mar 18, 2014 at 5:14 PM, Kanak Biscuitwala <[email protected]> 
> wrote:
>>
>> I'll send out a longer email once I've finished gathering requirements and 
>> sketching through a design, but here are my initial thoughts:
>>
>> - This actually requires two things from Helix: being able to run tasks in
>> the cluster reliably and being able to schedule tasks in the cluster reliably.
>> - For the task half of this work, we probably have most of the code 
>> available already as the task framework supports things like target 
>> resources, DAG-based dependencies, task states, canceling, and correctness 
>> in the face of controller failover.
>> - The scheduling half is the part that requires the most new additions. We
>> basically need to be able to (1) store the schedule, (2) know when to wake
>> up to process an item on the schedule, and (3) do this without needing
>> anything in controller memory.
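
The three scheduling requirements above can be sketched roughly as follows.
This is only an illustration under stated assumptions: ScheduleEntry,
ScheduleStore, InMemoryStore, and StatelessScheduler are hypothetical names,
and the in-memory map merely stands in for a durable store. The point is that
the scheduler itself keeps no state, so a new controller could resume from
the store alone.

```java
import java.util.*;

// One persisted schedule item (hypothetical). periodMillis == 0 means one-time.
final class ScheduleEntry {
  final String taskId;
  final long nextFireMillis;
  final long periodMillis;

  ScheduleEntry(String taskId, long nextFireMillis, long periodMillis) {
    this.taskId = taskId;
    this.nextFireMillis = nextFireMillis;
    this.periodMillis = periodMillis;
  }
}

// (1) Store the schedule. A real implementation would be durable.
interface ScheduleStore {
  void put(ScheduleEntry entry);
  void remove(String taskId);
  List<ScheduleEntry> readAll();
}

// Stand-in for a durable store, for illustration only.
class InMemoryStore implements ScheduleStore {
  private final Map<String, ScheduleEntry> entries = new TreeMap<>();

  public void put(ScheduleEntry entry) { entries.put(entry.taskId, entry); }
  public void remove(String taskId) { entries.remove(taskId); }
  // Return a copy so callers can modify the store while iterating.
  public List<ScheduleEntry> readAll() { return new ArrayList<>(entries.values()); }
}

// (3) No fields: every decision is derived from the store's contents.
final class StatelessScheduler {
  // (2) The earliest time anything is due, i.e. when to wake up next.
  static OptionalLong nextWakeUp(ScheduleStore store) {
    return store.readAll().stream().mapToLong(e -> e.nextFireMillis).min();
  }

  // Return the ids of due tasks, then reschedule recurring ones and drop one-time ones.
  static List<String> processDue(ScheduleStore store, long nowMillis) {
    List<String> fired = new ArrayList<>();
    for (ScheduleEntry e : store.readAll()) {
      if (e.nextFireMillis <= nowMillis) {
        fired.add(e.taskId);
        if (e.periodMillis > 0) {
          store.put(new ScheduleEntry(e.taskId, e.nextFireMillis + e.periodMillis, e.periodMillis));
        } else {
          store.remove(e.taskId);
        }
      }
    }
    return fired;
  }
}
```

A controller that takes over after a failover would call nextWakeUp, sleep
until then, call processDue, and repeat, without needing anything handed
over from the previous controller.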
>>
>> Kanak
>>
>> ----------------------------------------
>>> Date: Tue, 18 Mar 2014 16:09:05 -0700
>>> Subject: Scheduling tasks in the cluster
>>> From: [email protected]
>>> To: [email protected]
>>>
>>> This requirement has come up often, and I think it's worthwhile to spend
>>> some time coming up with an elegant solution. We have offered workarounds,
>>> but they still require users to write quite a bit of complex code.
>>>
>>> Problem statement:
>>> Schedule one or more tasks in the cluster. A task can be ad hoc (one time)
>>> or recurring (every X minutes, or once between 12 and 3 AM, etc.; basically
>>> a cron expression). There are additional criteria for where the task should
>>> run: on any node in the cluster, or on any node in that cluster that hosts
>>> a particular resource in a particular state. If the task fails, we might
>>> have to retry it, for example retrying X times before trying on another
>>> node. There might be additional constraints, such as running no more than
>>> X tasks on a particular node or across the entire cluster.
>>>
>>> Helix supports all these features in one way or another, but there is no
>>> first-class support or API that encapsulates all of the above.
>>>
>>> Any thoughts on what such an API/DSL should look like?
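
One possible shape for such a DSL, purely as a conversation starter, is a
small builder that captures the criteria from the problem statement. TaskSpec
and all of its methods are hypothetical names, not an existing Helix API, and
the cron string is stored without being parsed or validated.

```java
import java.util.Optional;

// Hypothetical description of a schedulable task; it only carries data.
final class TaskSpec {
  final String name;
  final Optional<String> cron;            // empty means ad hoc (run once)
  final Optional<String> targetResource;  // empty means any node in the cluster
  final Optional<String> targetState;
  final int retriesPerNode;               // retries before trying another node
  final int maxPerNode;
  final int maxPerCluster;

  private TaskSpec(Builder b) {
    this.name = b.name;
    this.cron = Optional.ofNullable(b.cron);
    this.targetResource = Optional.ofNullable(b.targetResource);
    this.targetState = Optional.ofNullable(b.targetState);
    this.retriesPerNode = b.retriesPerNode;
    this.maxPerNode = b.maxPerNode;
    this.maxPerCluster = b.maxPerCluster;
  }

  boolean isRecurring() { return cron.isPresent(); }

  static Builder named(String name) { return new Builder(name); }

  static final class Builder {
    private final String name;
    private String cron;
    private String targetResource;
    private String targetState;
    private int retriesPerNode = 0;
    private int maxPerNode = Integer.MAX_VALUE;     // unlimited by default
    private int maxPerCluster = Integer.MAX_VALUE;  // unlimited by default

    private Builder(String name) { this.name = name; }

    Builder cron(String expression) { this.cron = expression; return this; }

    Builder onNodesHosting(String resource, String state) {
      this.targetResource = resource;
      this.targetState = state;
      return this;
    }

    Builder retriesPerNode(int n) { this.retriesPerNode = n; return this; }
    Builder maxPerNode(int n) { this.maxPerNode = n; return this; }
    Builder maxPerCluster(int n) { this.maxPerCluster = n; return this; }

    TaskSpec build() {
      if (retriesPerNode < 0 || maxPerNode < 1 || maxPerCluster < 1) {
        throw new IllegalArgumentException("limits must be positive and retries non-negative");
      }
      return new TaskSpec(this);
    }
  }
}
```

For example, TaskSpec.named("nightly-compaction").cron("0 1 * * *")
.onNodesHosting("MyDB", "MASTER").retriesPerNode(3).maxPerNode(2).build()
would describe a recurring task pinned to nodes hosting a resource in a given
state; the resource and state names here are placeholders.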
>>>
>>> thanks,
>>> Kishore G
>>
                                          
