GitHub user dasahcc opened a pull request:

    https://github.com/apache/helix/pull/64

    Delay the job scheduling when check job ready to schedule

    Overview
    Workflows and Jobs run by Helix currently need more scheduling flexibility. 
For example, some jobs must start a certain amount of time after other jobs 
have finished. Likewise, a Workflow may need to run at a specific time, after 
certain operations have completed. To better support Workflow and Job 
scheduling, Helix should let users set either a start time or a delay time for 
specific Workflows and Jobs. The delay time applies from the moment the 
Workflow or Job is ready to start. Workflows and Jobs can then be scheduled at 
the correct time.
    Proposed Design
    The design is split into two parts: generic rebalancer scheduling and delay 
time calculation. Since Job scheduling can be done by rerunning the 
WorkflowRebalancer, Workflow and Job delay scheduling can rely on the same 
generic scheduling mechanism. Generic task scheduling takes responsibility for 
setting the run time of a specific Workflow object. Each object then has its 
own start time calculation algorithm.
    
    Generic Task Scheduling
    For generic task scheduling, it is better to have a centralized scheduler, 
RebalanceScheduler. It provides four public APIs:
    public class RebalanceScheduler {
        public void scheduleRebalance(HelixManager manager, String resource, long startTime);

        public long getRebalanceTime(String resource);

        public long removeScheduledRebalance(String resource);

        public static void invokeRebalance(HelixDataAccessor accessor, String resource);
    }
     
    It can schedule a rebalance, return the scheduled time of a rebalance, and 
remove a scheduled rebalance. It also has an API that invokes the rebalancer 
immediately. With RebalanceScheduler, each resource can be scheduled to run at 
a given start time.
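    The scheduler described above can be sketched roughly as follows. This is 
a minimal illustration, not the code in this pull request: the class name 
SimpleRebalanceScheduler, the Consumer callback standing in for the Helix 
rebalance trigger, the -1 return value for "nothing scheduled", and the choice 
to replace an existing timer on reschedule are all assumptions. The Helix 
types HelixManager and HelixDataAccessor are left out so the sketch is 
self-contained.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical stand-in for the proposed RebalanceScheduler; not Helix code.
class SimpleRebalanceScheduler {
    // Pairs the requested start time with the pending timer task.
    private static final class Entry {
        final long startTime;
        final ScheduledFuture<?> future;

        Entry(long startTime, ScheduledFuture<?> future) {
            this.startTime = startTime;
            this.future = future;
        }
    }

    private final Map<String, Entry> timers = new HashMap<>();
    // Daemon thread so a pending timer never keeps the JVM alive.
    private final ScheduledExecutorService executor =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "rebalance-timer");
            t.setDaemon(true);
            return t;
        });
    // Assumed callback that triggers a rebalance for the given resource.
    private final Consumer<String> rebalance;

    SimpleRebalanceScheduler(Consumer<String> rebalance) {
        this.rebalance = rebalance;
    }

    // Schedules a rebalance at startTime (epoch millis), replacing any existing timer.
    public synchronized void scheduleRebalance(String resource, long startTime) {
        removeScheduledRebalance(resource);
        long delayMs = Math.max(0L, startTime - System.currentTimeMillis());
        ScheduledFuture<?> future =
            executor.schedule(() -> fire(resource, startTime), delayMs, TimeUnit.MILLISECONDS);
        timers.put(resource, new Entry(startTime, future));
    }

    // Returns the scheduled start time, or -1 if nothing is scheduled (assumed convention).
    public synchronized long getRebalanceTime(String resource) {
        Entry e = timers.get(resource);
        return e == null ? -1L : e.startTime;
    }

    // Cancels the pending timer and returns its start time, or -1 if there was none.
    public synchronized long removeScheduledRebalance(String resource) {
        Entry e = timers.remove(resource);
        if (e == null) {
            return -1L;
        }
        e.future.cancel(false);
        return e.startTime;
    }

    // Triggers a rebalance immediately, bypassing the timer.
    public void invokeRebalance(String resource) {
        rebalance.accept(resource);
    }

    // Runs when a timer expires; skips timers that were removed or superseded.
    private void fire(String resource, long startTime) {
        synchronized (this) {
            Entry e = timers.get(resource);
            if (e == null || e.startTime != startTime) {
                return;
            }
            timers.remove(resource);
        }
        rebalance.accept(resource);
    }
}
```

    Keeping one timer per resource means a later call with a different start 
time simply supersedes the earlier one, which is one plausible way to handle 
repeated calls from the rebalancer.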
    Delay Time Calculation
    Workflows have a property, expiryTime, which serves as the delay time for 
the Workflow. Users set it by calling the setExpiry method in WorkflowConfig. 
For Jobs, JobConfig will provide two methods: setExecutionStart and 
setExecutionDelay. Through these APIs, users can set the delay time and start 
time for Workflows and Jobs. When both are set, Helix internally uses whichever 
resulting time is later.
    To determine when Workflows and Jobs start, Helix does the computation in 
real time. Users can set a delay time or a start time in JobConfig. When the 
job is ready to run, Helix calculates the delayed start time as the current 
time plus the delay time, then compares it with the start time if the user set 
one in JobConfig.
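    The comparison described above can be sketched as a small helper. The 
class name StartTimeCalculator, the method effectiveStartTime, and the 
convention that a non-positive value means "not set" are assumptions made for 
illustration; they are not taken from the Helix code.

```java
// Hypothetical helper illustrating the "later of the two" rule; not Helix code.
class StartTimeCalculator {
    /**
     * @param readyTimeMs      time (epoch millis) at which the job became ready to run
     * @param executionDelayMs configured delay in millis, or <= 0 if not set (assumed convention)
     * @param executionStartMs configured start time in epoch millis, or <= 0 if not set (assumed convention)
     * @return the time at which the job should actually be scheduled
     */
    static long effectiveStartTime(long readyTimeMs, long executionDelayMs, long executionStartMs) {
        // Delay is measured from the moment the job is ready.
        long delayedStart = executionDelayMs > 0 ? readyTimeMs + executionDelayMs : readyTimeMs;
        // If an explicit start time is set, the later of the two wins.
        return executionStartMs > 0 ? Math.max(delayedStart, executionStartMs) : delayedStart;
    }
}
```

    For example, with a job ready at 1000, a delay of 500, and a start time of 
2000, the sketch returns 2000; with a start time of 1200 it returns 1500, 
because the delayed time is later.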
    Impact
    From the user's perspective, users have to understand the difference 
between delay time and start time.
    The WorkflowRebalancer will be called multiple times, which may have a 
performance cost worth considering.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dasahcc/helix helix-0.6.x

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/helix/pull/64.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #64
    
----
commit 0df37a7b2f066290f5424b61d9b27d9aed13d340
Author: Junkai Xue <j...@linkedin.com>
Date:   2016-12-17T01:21:43Z

    Delay the job scheduling when check job ready to schedule

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---
