Thanks for the help!

On Wed, Jan 4, 2017 at 9:02 AM, kishore g <g.kish...@gmail.com> wrote:
> will review it today
>
> On Tue, Jan 3, 2017 at 12:24 PM, Xue Junkai <junkai....@gmail.com> wrote:
>
> > Hi All,
> >
> > Here's the pull request for this design:
> > https://github.com/apache/helix/pull/64
> > Could anyone help me review it?
> >
> > Best,
> >
> > Junkai
> >
> > On Thu, Dec 8, 2016 at 6:09 PM, Xue Junkai <junkai....@gmail.com> wrote:
> >
> >> Hi All,
> >>
> >> I have a short design for Delayed Workflow and Job Scheduling. Since
> >> I cannot access the wiki, I have attached it to this email. Any
> >> feedback and comments are highly appreciated!
> >>
> >> Best,
> >>
> >> Junkai
> >>
> >> Overview
> >>
> >> Currently, Workflows and Jobs run by Helix require more flexibility.
> >> For example, some jobs need to start a certain amount of time after
> >> other jobs have finished. The same applies to Workflows: one may need
> >> to run at a specific time, after some operations have completed. To
> >> better support Workflow and Job scheduling, Helix should provide a new
> >> feature that lets users set a delay time or a start time for specific
> >> Workflows and Jobs. Each Workflow or Job should have an option that
> >> lets the user set either its start time or a delay that applies once
> >> it is ready to start. Workflows and Jobs can then be scheduled at the
> >> correct time.
> >>
> >> Proposed Design
> >>
> >> The design is split into two parts: generic rebalancer scheduling and
> >> delay time calculation. Since Job scheduling can be done by rerunning
> >> the WorkflowRebalancer, Workflow and Job delay scheduling can rely on
> >> the same generic scheduling mechanism. Generic task scheduling takes
> >> the responsibility of setting the running time for a specific Workflow
> >> object. Each object then has its own start time calculation algorithm.
> >>
> >> Generic Task Scheduling
> >>
> >> For generic task scheduling, it is better to have a centralized
> >> scheduler, RebalanceScheduler. It provides four public APIs:
> >>
> >> public class RebalanceScheduler {
> >>   public void scheduleRebalance(HelixManager manager, String resource,
> >>       long startTime);
> >>
> >>   public long getRebalanceTime(String resource);
> >>
> >>   public long removeScheduledRebalance(String resource);
> >>
> >>   public static void invokeRebalance(HelixDataAccessor accessor,
> >>       String resource);
> >> }
> >>
> >> It can schedule a rebalance, get the scheduled time of a rebalance,
> >> and remove a scheduled rebalance. It also has an API that invokes the
> >> rebalancer immediately. With this RebalanceScheduler, each resource
> >> can be scheduled at a certain start time.
> >>
> >> Delay Time Calculation
> >>
> >> Workflows have a property expiryTime, which is the delay time for the
> >> Workflow. Users can set it by calling the setExpiry method in
> >> WorkflowConfig. For Jobs, two methods will be provided in JobConfig:
> >> setExecutionStart and setExecutionDelay. Through these APIs, users can
> >> set the delay time and start time for Workflows and Jobs. Internally,
> >> Helix will take whichever is later: the time derived from the delay or
> >> the start time.
> >>
> >> For the logic that computes Workflows and Jobs, Helix chooses to do
> >> the computation in real time. Users can set the delay time or start
> >> time in JobConfig. When the job is ready to run, Helix calculates the
> >> delay-based "start time" as the current time plus the delay time, then
> >> compares it with the start time if the user has set one in JobConfig.
> >>
> >> [image: Inline image 1]
> >>
> >> Impact
> >>
> >>    - From the user's perspective, users have to understand the
> >>    difference between delay time and start time.
> >>    - The WorkflowRebalancer will be called multiple times, which may
> >>    need to be considered for performance.
> >>
> >
> > --
> > Junkai Xue
> >
>
> --
> Junkai Xue
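
For anyone reading the Delay Time Calculation section quoted above, the "take whichever is later" rule can be sketched in Java roughly as follows. This is only an illustration under stated assumptions, not the code in the pull request: the class name DelayedStartSketch, the method computeStartTime, and the convention that a non-positive value means "not set" are all hypothetical.

```java
// Illustrative sketch only. The names and the "unset" convention are
// assumptions, not Helix's actual implementation.
final class DelayedStartSketch {
    /** Hypothetical marker: a non-positive value means the user did not set the field. */
    static final long UNSET = -1L;

    /**
     * Returns the time (epoch millis) at which a job may start.
     *
     * @param readyTime      time at which the job became ready to run
     * @param executionDelay delay in millis to apply after readyTime, or UNSET
     * @param executionStart absolute earliest start time, or UNSET
     */
    static long computeStartTime(long readyTime, long executionDelay, long executionStart) {
        // "Start time" implied by the delay: ready time plus the delay, if one was set.
        long delayedStart = executionDelay > 0 ? readyTime + executionDelay : readyTime;
        // Take whichever is later: the delayed start or the explicit start time.
        return executionStart > 0 ? Math.max(delayedStart, executionStart) : delayedStart;
    }

    public static void main(String[] args) {
        long ready = 1_000L;
        System.out.println(computeStartTime(ready, 500L, UNSET));   // delay only
        System.out.println(computeStartTime(ready, 500L, 2_000L));  // explicit start is later
        System.out.println(computeStartTime(ready, UNSET, UNSET));  // neither set
    }
}
```

The value returned here is the kind of startTime one would then hand to something like scheduleRebalance from the design above. Computing it only when the job becomes ready matches the "real time computation" described in the design.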