Delayed Workflow and Job Scheduling Design

2016-12-08 Thread Xue Junkai
Hi All,

I have a short design for the Delayed Workflow and Job Scheduling. Since I
cannot access the wiki, I have attached it to this email. Any feedback and
comments are highly appreciated!

Best,

Junkai
Overview

Currently, Workflows and Jobs run by Helix require more flexibility. For
example, some jobs need to start a certain amount of time after other jobs
finish. Similarly, a Workflow may need to run at a specific time, once some
operations have completed. To better support Workflow and Job scheduling,
Helix should provide a new feature that lets the user set a delay time or a
start time for specific Workflows and Jobs. Workflows and Jobs should have
an option that allows the user to set either the start time or the delay
time to apply once they are ready to start. Then Workflows and Jobs can be
scheduled at the correct time.
Proposed Design

The design is split into two parts: generic rebalancer scheduling and delay
time calculation. Since Job scheduling can be done by rerunning the
WorkflowRebalancer, Workflow and Job delay scheduling can rely on the same
generic scheduling mechanism. Generic task scheduling takes the
responsibility of setting the running time for a specific Workflow object;
each object then has its own start time calculation algorithm.

Generic Task Scheduling

For generic task scheduling, it is better to have a centralized scheduler,
RebalanceScheduler. It provides four public APIs:
public class RebalanceScheduler {
  // Schedule a rebalance for the given resource at startTime (epoch millis).
  public void scheduleRebalance(HelixManager manager, String resource,
      long startTime);

  // Get the currently scheduled rebalance time for the resource.
  public long getRebalanceTime(String resource);

  // Remove the scheduled rebalance for the resource.
  public long removeScheduledRebalance(String resource);

  // Trigger a rebalance for the resource immediately.
  public static void invokeRebalance(HelixDataAccessor accessor,
      String resource);
}



In short, it offers APIs to schedule a rebalance, get the scheduled time of
a rebalance, and remove a scheduled rebalance. It also has an API to invoke
a rebalance immediately. With this RebalanceScheduler, each resource can be
scheduled at a certain start time.
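
For illustration, here is a minimal sketch of how a rebalancer could use
this scheduler. The resource name, the delay, and the assumption that a
negative return value from getRebalanceTime means "not scheduled" are all
illustrative, not part of the API contract above.

import org.apache.helix.HelixManager;

// Sketch only: schedule a future rebalance for a resource unless an
// earlier one is already pending. The negative-value check is an assumption.
public class DelayedScheduleExample {
  public static void scheduleIn(RebalanceScheduler scheduler,
      HelixManager manager, String resource, long delayMs) {
    long startTime = System.currentTimeMillis() + delayMs;
    long existing = scheduler.getRebalanceTime(resource);
    if (existing < 0 || startTime < existing) {
      scheduler.scheduleRebalance(manager, resource, startTime);
    }
  }
}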
Delay Time Calculation

Workflows have a property, expiryTime, which serves as the delay time for
the Workflow. The user can set it by calling the setExpiry method in
WorkflowConfig. For Jobs, two methods will be provided in JobConfig:
setExecutionStart and setExecutionDelay. Through these APIs, the user can
set the delay time and start time for Workflows and Jobs. Internally, Helix
takes whichever of the delay time and the start time is later.

For the logic of computing Workflow and Job schedules, Helix chooses to do
real-time computation. The user can set a delay time or a start time in
JobConfig. When the job is ready to run, Helix calculates the delay-based
"start time" as the current time plus the delay time, then compares it with
the start time, if the user set one in JobConfig.
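
To make the comparison concrete, a small sketch of the calculation,
assuming (for illustration only) that an unset value is represented as a
negative number:

// Sketch of the start-time calculation described above. A value < 0 is
// assumed to mean "not set"; the real field encoding may differ.
public static long effectiveStartTime(long now, long startTime, long delayMs) {
  long delayedStart = delayMs < 0 ? -1 : now + delayMs;
  // Helix takes whichever of the two candidate times is later.
  return Math.max(delayedStart, startTime);
}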

Impact

   - From the user's perspective, users have to understand the difference
   between delay time and start time.
   - The WorkflowRebalancer will be called multiple times, which may have
   performance implications.


Helix State Transition Cancellation Design

2016-12-08 Thread Xue Junkai
Hi All,

Here's another design, for Helix state transition cancellation. Could you
please help me review it?

Best,

Junkai
Introduction

State transitions are a vital part of how Helix manages clusters. There are
various reasons a state transition may become unnecessary; for example, the
node hosting the partition running the state transition goes down. Thus
state transition cancellation would be a useful feature to have in Helix.
It not only helps cancel a state transition to avoid an invalid state, but
also reduces redundant state transitions. This document presents the design
of state transition cancellation.
Problem Statement

There are a couple of situations that may require cancelling a state transition:

   - Cancel any ongoing state transition on an instance if Helix decides
   not to place any replicas on that instance. One example is a user adding
   new nodes one by one. When node 1 is added, a partition A can be moved
   from the old nodes to node 1. After partition A has started its state
   transition on node 1, node 2 is added to the system, and partition A,
   still mid-transition on node 1, is now assigned to node 2. Thus it is
   better to cancel the state transition on node 1 and start the state
   transition on node 2. The following picture shows the process of adding
   nodes. In stage 3, if Helix does not cancel partition A's state
   transition, partition A will finish the state transition on node 1 and
   then perform another state transition to bring it back to its initial or
   previous state. That is the redundant work we are trying to avoid.
   - Cancel all ongoing state transitions once a resource is deleted or
   disabled.
   - Cancel all ongoing state transitions if the instance is disabled.
   - Cancel all running tasks if the job is stopped, aborted, or deleted.
   Since the job is then in a non-running state, all its running sub-tasks
   should be cancelled to stay consistent with the job status.

In any of these situations, it is better for Helix to perform a cancellation
to save resources or avoid invalid operations.
Existing Solutions

On the current Helix roadmap, cancellation is not supported, because Helix
will eventually bring the system to the ideal mapping state even if a
couple of redundant operations occur along the way. Thus, when one of the
situations mentioned above occurs in the real world, Helix will still wait
for the state transition to finish and then issue a counter state
transition to put the partition in the right state. Or, if the state
transition is stuck for a long time, the state transition timeout will
resolve the problem.
Proposed Design

Cancel Logic

From the Helix perspective, Helix should cancel state transitions that may
cause redundant work or invalid operations. In this case, when a state
transition needs to be cancelled, Helix issues a message to the
participant, which is thereby notified that the corresponding partition's
state transition is cancelled.

   - *Ignored*: The participant can ignore the cancellation if the user
   believes the state transition is too important or too hard to cancel.
   The participant will keep running the state transition and finish the
   callback accordingly. For example, if a partition is cancelled during
   bootstrapping, there might be data loss, or the cancel operation may
   simply not be allowed. In this case, the user can just ignore the cancel
   flag and continue with the state transition.

However, the final result of a cancellation depends on the user's
implementation. There are two options the user can choose for the
cancellation reaction and final result (a participant-side sketch follows
the list):

   - *Option 1:*
      - *Cancelled to Previous State*: The state transition can be
      cancelled and the replica force-set to its previous state.
      - *Throw Error Exception*: If anything fails during rollback, or the
      previous state is not a proper target state, the user can throw an
      exception and Helix will put the replica in the error state.
   - *Option 2:*
      - *Cancelled to Cancel State*: The state transition can be cancelled
      and the replica set to the cancel state. Helix will then run the
      user-defined transitions from the cancel state to other states in
      following steps.
      - *Throw Error Exception*: If anything fails during rollback, the
      user can throw an exception and Helix will put the replica in the
      error state.
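
For illustration, a sketch of a participant-side transition that cooperates
with a cancellation flag. The flag, method names, and rollback helper are
hypothetical, not the actual Helix callback API:

import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: a long-running transition that checks a cancel flag.
public class CancellableTransition {
  private final AtomicBoolean cancelled = new AtomicBoolean(false);

  // Invoked when the cancellation message from the controller arrives.
  public void onCancelMessage() {
    cancelled.set(true);
  }

  // Long-running transition work, e.g. bootstrapping a replica.
  public void run() throws Exception {
    while (hasMoreWork()) {
      if (cancelled.get()) {
        rollbackToPreviousState();  // Option 1: return to the previous state
        return;
      }
      copyNextChunk();
    }
  }

  private boolean hasMoreWork() { return false; }    // placeholder
  private void copyNextChunk() {}                    // placeholder
  private void rollbackToPreviousState() {}          // placeholder
}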

Cancel Failed

What if the cancel fails? If a cancellation fails, Helix will leave the
partition as it is, since when a cancel fails Helix encounters one of the
following situations:

   - Ignored by participant: Since the cancellation message was ignored,
   Helix will wait until the state transition finishes and then send a
   message for the counter state transition.
   - State Transition Stuck: If the state transition is stuck or the
   partition is busy, the state transition will be terminated by the state
   transition timeout.
   - Fail to Rollback or Clean Up: If the user-implemented clean-up or
   rollback fails, the normal exception will be thrown and the partition
   will be set to the error state.

In sum, Helix can handle the state 

Re: Delayed Workflow and Job Scheduling Design

2017-01-04 Thread Xue Junkai
Thanks for the help!

On Wed, Jan 4, 2017 at 9:02 AM, kishore g <g.kish...@gmail.com> wrote:

> will review it today
>
> On Tue, Jan 3, 2017 at 12:24 PM, Xue Junkai <junkai@gmail.com> wrote:
>
> > Hi All,
> >
> > Here's the pull request of this design:
> > https://github.com/apache/helix/pull/64
> > Could anyone help me review it?
> >
> > Best,
> >
> > Junkai
> > --
> > Junkai Xue
> >
>



-- 
Junkai Xue


Re: Delayed Workflow and Job Scheduling Design

2017-01-04 Thread Xue Junkai
For the questions:

>>- From user perspective, user have to understand the difference
>>between delay time and start time.
Yes. This will be documented in the Task Framework User Guide to explain
the difference between StartTime and DelayTime.

>>- The WorkflowRebalancer will be called multiple times, which might
>>be considered for performance.

The only time complexity we add is a pass over all the jobs (O(n)) to find
the next schedule time. Although the WorkflowRebalancer will be called
multiple times, it won't affect performance that much.
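
For illustration, a sketch of that scan; the Job interface here is a
hypothetical stand-in for the real job config objects:

import java.util.List;

public class NextScheduleExample {
  interface Job { long getStartTime(); }   // hypothetical accessor

  // One O(n) pass over the workflow's jobs to find the earliest start time.
  static long nextScheduleTime(List<Job> jobs) {
    long next = Long.MAX_VALUE;
    for (Job job : jobs) {
      next = Math.min(next, job.getStartTime());
    }
    return next;
  }
}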

Best,

Junkai


On Wed, Jan 4, 2017 at 9:02 AM, kishore g <g.kish...@gmail.com> wrote:

> will review it today
>
> On Tue, Jan 3, 2017 at 12:24 PM, Xue Junkai <junkai@gmail.com> wrote:
>
> > Hi All,
> >
> > Here's the pull request of this design:
> > https://github.com/apache/helix/pull/64
> > Could anyone help me review it?
> >
> > Best,
> >
> > Junkai
> > --
> > Junkai Xue
> >
>



-- 
Junkai Xue


Re: Delayed Workflow and Job Scheduling Design

2017-01-03 Thread Xue Junkai
Hi All,

Here's the pull request of this design:
https://github.com/apache/helix/pull/64
Could anyone help me review it?

Best,

Junkai



-- 
Junkai Xue


Re: dynamic zookeeper quorum

2017-03-18 Thread Xue Junkai
Hi Shawn,

Great suggestion! But my concern with 3.5.x is that there is no stable
release of it yet, right? Once a stable version of 3.5.x comes out, we
should consider it!

Best,

Junkai

On Sat, Mar 18, 2017 at 4:01 PM, Neutron Sharc 
wrote:

> Hi all,
>
> Recent zookeeper 3.5 allows to dynamically grow zookeeper quorum.
> Does helix zookeeper clients perform runtime reconfigure to use the
> new zookeeper servers?
>
>
> -Shawn
>



-- 
Junkai Xue


Re: [GitHub] helix issue #81: Creating a separate threadpool to handle batchMessages

2017-04-03 Thread Xue Junkai
Sure! Will add that.

On Mon, Apr 3, 2017 at 12:21 PM, kishore g  wrote:

> +1 on adding an api to enable/disable this at a cluster level.
>
> On Mon, Apr 3, 2017 at 12:18 PM, dasahcc  wrote:
>
> > Github user dasahcc commented on the issue:
> >
> > https://github.com/apache/helix/pull/81
> >
> > Looks good to me! I will do the following things for corresponding
> > change:
> > 1. Will add a test for this.
> > 2. Will provide an API in HelixManager for enabling batch message and
> > support cluster/resource level batch message enabling.
> >
> >
> > ---
> > If your project is set up for it, you can reply to this email and have
> your
> > reply appear on GitHub as well. If your project does not have this
> feature
> > enabled and wishes so, or if the feature is enabled but not working,
> please
> > contact infrastructure at infrastruct...@apache.org or file a JIRA
> ticket
> > with INFRA.
> > ---
> >
>



-- 
Junkai Xue


Re: [ANNOUNCE] New committer: Junkai Xue

2017-04-03 Thread Xue Junkai
Thanks! Very glad to join the team!

Best,

Junkai

On Mon, Apr 3, 2017 at 11:01 AM, kishore g  wrote:

> Yay!.  Welcome to the club.
>
> On Mon, Apr 3, 2017 at 10:45 AM, Lei Xia  wrote:
>
> > The Project Management Committee (PMC) for Apache Helix has asked Junkai
> > Xue
> > to become a committer and we are pleased to announce that he has
> accepted.
> >
> > Being a committer enables easier contribution to the project since there
> is
> > no need to go via the patch submission process. This should enable better
> > productivity.
> >
> >
> > Helix Team
> >
>



-- 
Junkai Xue


Re: How to run Helix tasks without Yarn

2017-08-02 Thread Xue Junkai
Hi Ramprasad,

We do have a doc for the Task Framework:
http://helix.apache.org/0.6.8-docs/tutorial_task_framework.html. But which
Helix version are you using? This doc applies to Helix 0.6.x but not to 0.7.
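
For reference, a minimal sketch of a task-capable participant started
directly in a plain JVM process (no YARN). The cluster name, instance name,
ZooKeeper address, and the "MyTask" command are placeholders; the inline
Task is a trivial stand-in for something like your DbMetaDataTask:

import java.util.HashMap;
import java.util.Map;

import org.apache.helix.HelixManager;
import org.apache.helix.HelixManagerFactory;
import org.apache.helix.InstanceType;
import org.apache.helix.task.Task;
import org.apache.helix.task.TaskCallbackContext;
import org.apache.helix.task.TaskFactory;
import org.apache.helix.task.TaskResult;
import org.apache.helix.task.TaskStateModelFactory;

public class TaskParticipant {
  public static void main(String[] args) throws Exception {
    HelixManager manager = HelixManagerFactory.getZKHelixManager(
        "myCluster", "localhost_12000", InstanceType.PARTICIPANT,
        "localhost:2181");

    // Map task command names to factories; "MyTask" is a placeholder.
    Map<String, TaskFactory> taskFactories = new HashMap<String, TaskFactory>();
    taskFactories.put("MyTask", new TaskFactory() {
      @Override
      public Task createNewTask(TaskCallbackContext context) {
        return new Task() {
          @Override
          public TaskResult run() {
            return new TaskResult(TaskResult.Status.COMPLETED, "done");
          }

          @Override
          public void cancel() {
          }
        };
      }
    });

    // Register the "Task" state model so this node can run framework tasks.
    manager.getStateMachineEngine().registerStateModelFactory(
        "Task", new TaskStateModelFactory(manager, taskFactories));
    manager.connect();
  }
}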

Please let us know if you have any further questions!

Best,

Junkai

On Wed, Aug 2, 2017 at 11:04 AM, Venkata Ramaprasad Sanapala <
rp...@geocloud.io> wrote:

> Hi,
>
> I am working on implementing helix task and I would like to run without
> Yarn. All examples are implemented using Yarn. Is there a way I can run
> helix tasks without Yarn?
>
> Here is code that I added:
>
> Task: DbMetaDataTask.java
> Task Service: DbMetaDataTaskService.java
>
> I would like to start this task service directly. Any document or
> reference to an example will help.
>
> Thanks & Regards,
> —Ramprasad.
>
>


Re: Documentation of setting up controller as service

2017-08-09 Thread Xue Junkai
Since we already have command-line tools for Helix, would a tutorial on
using the command line to set up the controller and participants be enough
for that? We can run the controller standalone or in distributed mode from
the command line.
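
For example, a minimal sketch of wrapping the controller in a long-running
service process; the cluster name, ZooKeeper address, and controller name
are placeholders:

import org.apache.helix.HelixManager;
import org.apache.helix.controller.HelixControllerMain;

// Sketch: the programmatic equivalent of the run-helix-controller tooling.
public class ControllerService {
  public static void main(String[] args) throws Exception {
    HelixManager manager = HelixControllerMain.startHelixController(
        "localhost:2181", "myCluster", "controller_0",
        HelixControllerMain.STANDALONE);
    try {
      Thread.currentThread().join();  // keep the service process alive
    } finally {
      manager.disconnect();
    }
  }
}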

On Wed, Aug 9, 2017 at 8:55 PM, kishore g  wrote:

> https://stackoverflow.com/questions/45406255/how-to-
> setup-apache-helix-controller-in-controller-as-a-service-mode
>
> Does anyone know the steps to set up the controller as a service? Let's
> update the documentation and also answer the question on stack over flow.
>


Re: Generate Helix release 0.6.8

2017-05-10 Thread Xue Junkai
Yes. I have the PR for this : https://github.com/apache/helix/pull/91

Best,

Junkai

On Wed, May 10, 2017 at 12:08 PM, kishore g <g.kish...@gmail.com> wrote:

> Yes. Do you have a PR for that?. I can review it
>
> On Wed, May 10, 2017 at 11:19 AM, Xue Junkai <junkai@gmail.com> wrote:
>
> > Sure! Please let me know if this change works or not. BTW will customized
> > batch message threadpool be involved in this release?
> >
> > Best,
> >
> > Junkai
> >
> > On Tue, May 9, 2017 at 7:28 PM, kishore g <g.kish...@gmail.com> wrote:
> >
> > > I would like to have that fix included for Pinot. I will test it the
> > patch.
> > >
> > > On Tue, May 9, 2017 at 5:59 PM, Xue Junkai <junkai@gmail.com>
> wrote:
> > >
> > > > It does contain the batchMessage thread pool fix. But for race
> > condition
> > > > fix I withdraw the pull request since I am not quite sure whether the
> > fix
> > > > works or not. In addition, this release will include the
> > > > AutoRebalanceStrategy not assign replicas fix.
> > > >
> > > >
> > > > Best,
> > > >
> > > > Junkai
> > > >
> > > > On Tue, May 9, 2017 at 5:49 PM, kishore g <g.kish...@gmail.com>
> wrote:
> > > >
> > > > > Does this include the batchMessage thread pool fix and fix to the
> > race
> > > > > condition
> > > > >
> > > > > On Tue, May 9, 2017 at 5:08 PM, Xue Junkai <junkai@gmail.com>
> > > wrote:
> > > > >
> > > > > > Hi Helix Devs,
> > > > > >
> > > > > > I am going to work on releasing Helix 0.6.8 this week. Please let
> > me
> > > > know
> > > > > > if you have any questions, comments and concerns.
> > > > > >
> > > > > > Best,
> > > > > >
> > > > > > Junkai Xue
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Junkai Xue
> > > >
> > >
> >
> >
> >
> > --
> > Junkai Xue
> >
>



-- 
Junkai Xue


Re: Generate Helix release 0.6.8

2017-05-10 Thread Xue Junkai
Sure! Please let me know if this change works or not. BTW will customized
batch message threadpool be involved in this release?

Best,

Junkai

On Tue, May 9, 2017 at 7:28 PM, kishore g <g.kish...@gmail.com> wrote:

> I would like to have that fix included for Pinot. I will test it the patch.
>
> On Tue, May 9, 2017 at 5:59 PM, Xue Junkai <junkai@gmail.com> wrote:
>
> > It does contain the batchMessage thread pool fix. But for race condition
> > fix I withdraw the pull request since I am not quite sure whether the fix
> > works or not. In addition, this release will include the
> > AutoRebalanceStrategy not assign replicas fix.
> >
> >
> > Best,
> >
> > Junkai
> >
> > On Tue, May 9, 2017 at 5:49 PM, kishore g <g.kish...@gmail.com> wrote:
> >
> > > Does this include the batchMessage thread pool fix and fix to the race
> > > condition
> > >
> > > On Tue, May 9, 2017 at 5:08 PM, Xue Junkai <junkai@gmail.com>
> wrote:
> > >
> > > > Hi Helix Devs,
> > > >
> > > > I am going to work on releasing Helix 0.6.8 this week. Please let me
> > know
> > > > if you have any questions, comments and concerns.
> > > >
> > > > Best,
> > > >
> > > > Junkai Xue
> > > >
> > >
> >
> >
> >
> > --
> > Junkai Xue
> >
>



-- 
Junkai Xue


Generate Helix release 0.6.8

2017-05-09 Thread Xue Junkai
Hi Helix Devs,

I am going to work on releasing Helix 0.6.8 this week. Please let me know
if you have any questions, comments and concerns.

Best,

Junkai Xue


Re: Generate Helix release 0.6.8

2017-05-09 Thread Xue Junkai
It does contain the batchMessage thread pool fix. But for race condition
fix I withdraw the pull request since I am not quite sure whether the fix
works or not. In addition, this release will include the
AutoRebalanceStrategy not assign replicas fix.


Best,

Junkai

On Tue, May 9, 2017 at 5:49 PM, kishore g <g.kish...@gmail.com> wrote:

> Does this include the batchMessage thread pool fix and fix to the race
> condition
>
> On Tue, May 9, 2017 at 5:08 PM, Xue Junkai <junkai@gmail.com> wrote:
>
> > Hi Helix Devs,
> >
> > I am going to work on releasing Helix 0.6.8 this week. Please let me know
> > if you have any questions, comments and concerns.
> >
> > Best,
> >
> > Junkai Xue
> >
>



-- 
Junkai Xue


[VOTE] Apache Helix 0.6.8 Release

2017-06-12 Thread Xue Junkai
Hi,

This is to call for a vote on releasing the following candidate as Apache
Helix 0.6.8. This is the eleventh release of Helix as an Apache project, as
well as the seventh release as a top-level Apache project.

Apache Helix is a generic cluster management framework that makes it easy
to build partitioned and replicated, fault-tolerant and scalable
distributed systems.

Release notes:
http://helix.apache.org/0.6.8-docs/releasenotes/release-0.6.8.html

Release artifacts:
https://repository.apache.org/content/repositories/orgapachehelix-1010

Distribution:
* binaries:
https://dist.apache.org/repos/dist/dev/helix/0.6.8/binaries/
* sources:
https://dist.apache.org/repos/dist/dev/helix/0.6.8/src/

The 0.6.8 release tag:
https://git-wip-us.apache.org/repos/asf?p=helix.git;a=tag;h=refs/tags/helix-0.6.8

KEYS file available here:
https://dist.apache.org/repos/dist/dev/helix/KEYS

Please vote on the release. The vote will be open for at least 72 hours.

[+1] -- "YES, release"
[0] -- "No opinion"
[-1] -- "NO, do not release"

Thanks,
The Apache Helix Team


Re: Branches rearrangement

2017-06-22 Thread Xue Junkai
Thanks for the suggestions! We will work on this and may have a discussion
on the details.

Best,

Junkai

On Thu, Jun 22, 2017 at 5:41 PM, kishore g <g.kish...@gmail.com> wrote:

> Thanks, Junkai. can we first separate the api's into a separate
> helix-api module before we make any release on master. We can also start
> with a new major version and start versioning from 2.0.0
>
> thanks
> Kishore G,
>
>
>
>
> On Thu, Jun 22, 2017 at 4:28 PM, Lei Xia <l...@apache.org> wrote:
>
> > Thanks Junkai for working on this!
> >
> >
> > Lei
> >
> > On Thu, Jun 22, 2017 at 4:23 PM, Xue Junkai <j...@apache.org> wrote:
> >
> > > Hi All,
> > >
> > > Branch switches are done now! Now the development work will be started
> in
> > > master branch, which is forked from helix-0.6.x. Please update your
> > forked
> > > repo or refork it if it is necessary.
> > >
> > > Feel free to let us know if you have any questions or concerns!
> > >
> > > Best,
> > >
> > > Junkai
> > >
> > > On Mon, Jun 19, 2017 at 4:33 PM, Xue Junkai <j...@apache.org> wrote:
> > >
> > > > Hi All,
> > > >
> > > > Here're some heads ups for branch rearrangements:
> > > >
> > > >1. There will not be any features checked into helix-0.6.x, except
> > > >some hot fixes.
> > > >2. Master branch will be moved to helix-0.7.x for 0.7 version bug
> > > >fixing.
> > > >3. Move helix-0.6.x branch to Master branch and try our best to
> make
> > > >it compatible with helix-0.7.x
> > > >
> > > > Please let us know if you have any suggestions or questions regarding
> > > this!
> > > >
> > > > Best,
> > > >
> > > > Junkai
> > > >
> > >
> >
>



-- 
Junkai Xue


Re: Branches rearrangement

2017-06-22 Thread Xue Junkai
Hi All,

Branch switches are done now! Now the development work will be started in
master branch, which is forked from helix-0.6.x. Please update your forked
repo or refork it if it is necessary.

Feel free to let us know if you have any questions or concerns!

Best,

Junkai

On Mon, Jun 19, 2017 at 4:33 PM, Xue Junkai <j...@apache.org> wrote:

> Hi All,
>
> Here're some heads ups for branch rearrangements:
>
>1. There will not be any features checked into helix-0.6.x, except
>some hot fixes.
>2. Master branch will be moved to helix-0.7.x for 0.7 version bug
>fixing.
>3. Move helix-0.6.x branch to Master branch and try our best to make
>it compatible with helix-0.7.x
>
> Please let us know if you have any suggestions or questions regarding this!
>
> Best,
>
> Junkai
>


[ANNOUNCE] Apache Helix 0.6.8 Release

2017-06-19 Thread Xue Junkai
The Apache Helix Team is pleased to announce the eleventh release, 0.6.8,
of the Apache Helix project.

Apache Helix is a generic cluster management framework that makes it easy
to build partitioned, fault tolerant, and scalable distributed systems.

The full release notes are available here:
http://helix.apache.org/0.6.8-docs/releasenotes/release-0.6.8.html

You can declare a maven dependency to use it:

<dependency>
  <groupId>org.apache.helix</groupId>
  <artifactId>helix-core</artifactId>
  <version>0.6.8</version>
</dependency>

Or download the release sources:
http://helix.apache.org/0.6.8-docs/download.cgi

Additional info

Website: http://helix.apache.org/
Helix mailing lists: http://helix.apache.org/mail-lists.html

We hope you will enjoy using the latest release of Apache Helix!

Cheers,
Apache Helix Team


Branches rearrangement

2017-06-19 Thread Xue Junkai
Hi All,

Here're some heads ups for branch rearrangements:

   1. There will not be any features checked into helix-0.6.x, except some
   hot fixes.
   2. Master branch will be moved to helix-0.7.x for 0.7 version bug fixing.
   3. Move helix-0.6.x branch to Master branch and try our best to make it
   compatible with helix-0.7.x

Please let us know if you have any suggestions or questions regarding this!

Best,

Junkai


[RESULT][VOTE] Apache Helix 0.6.8 Release

2017-06-15 Thread Xue Junkai
Thanks for voting on the 0.6.8 release. It has now exceeded 72 hours so I
am closing the vote.

Binding +1s:
Kishore Gopalakrishna, Lei Xia, Zhen Zhang

Nonbinding +1s:

Binding 0s:

Nonbinding 0s:

Binding -1s:

Nonbinding -1s:

The vote has passed, thanks a lot to everyone for voting!


[ANNOUNCE] Apache Helix 0.6.9 Release

2017-10-14 Thread Xue Junkai
The Apache Helix Team is pleased to announce the 12th release, 0.6.9, of
the Apache Helix project.

Apache Helix is a generic cluster management framework that makes it easy
to build partitioned, fault tolerant, and scalable distributed systems.

The full release notes are available here:
http://helix.apache.org/0.6.9-docs/releasenotes/release-0.6.9.html

You can declare a maven dependency to use it:

<dependency>
  <groupId>org.apache.helix</groupId>
  <artifactId>helix-core</artifactId>
  <version>0.6.9</version>
</dependency>

Or download the release sources:
http://helix.apache.org/0.6.9-docs/download.cgi

Additional info

Website: http://helix.apache.org/
Helix mailing lists: http://helix.apache.org/mail-lists.html

We hope you will enjoy using the latest release of Apache Helix!

Cheers,
Apache Helix Team


[VOTE] Apache Helix 0.6.9 Release

2017-10-09 Thread Xue Junkai
Hi,

This is to call for a vote on releasing the following candidate as Apache
Helix 0.6.9. This is the 12th release of Helix as an Apache project, as
well as the 8th release as a top-level Apache project.

Apache Helix is a generic cluster management framework that makes it easy
to build partitioned and replicated, fault-tolerant and scalable
distributed systems.

Release notes:
http://helix.apache.org/0.6.9-docs/releasenotes/release-0.6.9.html

Release artifacts:
https://repository.apache.org/content/repositories/orgapachehelix-1011

Distribution:
* binaries:
https://dist.apache.org/repos/dist/dev/helix/0.6.9/binaries/
* sources:
https://dist.apache.org/repos/dist/dev/helix/0.6.9/src/

The 0.6.9 release tag:
https://git-wip-us.apache.org/repos/asf?p=helix.git;a=tag;h=refs/tags/helix-0.6.9

KEYS file available here:
https://dist.apache.org/repos/dist/dev/helix/KEYS

Please vote on the release. The vote will be open for at least 72 hours.

[+1] -- "YES, release"
[0] -- "No opinion"
[-1] -- "NO, do not release"

Thanks,
The Apache Helix Team


Re: Clarification of configuration related to Delayed Partition Movement

2017-10-23 Thread Xue Junkai
Hi Mahesh,

There are two steps for enabling delayed rebalance (a combined sketch
follows):
1. Set DELAY_REBALANCE_TIME in ClusterConfig. The unit is milliseconds.
2. Change the REBALANCER_CLASS_NAME of the resource's IdealState to
DelayedAutoRebalancer. For example:
idealState.setRebalancerClassName(DelayedAutoRebalancer.class.getName());
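
Putting both steps together, a sketch. The cluster/resource names and the
ZooKeeper address are placeholders, and it assumes the ConfigAccessor
constructor taking a ZK address and ClusterConfig.setRebalanceDelayTime are
available in your Helix version:

import org.apache.helix.ConfigAccessor;
import org.apache.helix.HelixAdmin;
import org.apache.helix.controller.rebalancer.DelayedAutoRebalancer;
import org.apache.helix.manager.zk.ZKHelixAdmin;
import org.apache.helix.model.ClusterConfig;
import org.apache.helix.model.IdealState;

public class EnableDelayedRebalance {
  public static void main(String[] args) {
    String cluster = "myCluster";

    // Step 1: set the rebalance delay (milliseconds) in ClusterConfig.
    ConfigAccessor configAccessor = new ConfigAccessor("localhost:2181");
    ClusterConfig clusterConfig = configAccessor.getClusterConfig(cluster);
    clusterConfig.setRebalanceDelayTime(10 * 60 * 1000L);  // 10 minutes
    configAccessor.setClusterConfig(cluster, clusterConfig);

    // Step 2: point the resource's IdealState at the DelayedAutoRebalancer.
    HelixAdmin admin = new ZKHelixAdmin("localhost:2181");
    IdealState idealState = admin.getResourceIdealState(cluster, "myResource");
    idealState.setRebalancerClassName(DelayedAutoRebalancer.class.getName());
    admin.setResourceIdealState(cluster, "myResource", idealState);
  }
}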

Please let us know if you have any further questions.

Best,

Junkai

On Mon, Oct 23, 2017 at 10:08 PM, leela maheswararao <
leela_mahesh...@yahoo.com.invalid> wrote:

> + user mail list
>
>
> Sent from Yahoo Mail for iPhone
>
>
> On Monday, October 23, 2017, 11:35 PM, leela maheswararao <
> leela_mahesh...@yahoo.com.INVALID> wrote:
>
> Hi,From this link (https://cwiki.apache.org/confluence/pages/viewpage.
> action?pageId=73630206), I got to know that, helix will support
> specifying time period to delay partition movements from failed nodes.
> However I couldn't exact config settings to do this on site documentation (
> http://helix.apache.org/0.6.9-docs/Tutorial.html). Can somebody help me
> on this?
> Regards,Mahesh
>
>
>


-- 
Junkai Xue


Re: Using job specific Participants in Helix

2017-11-16 Thread Xue Junkai
Hi Dlmuthu and Kishore,

Yes, the task framework does support tagged instances at the job level. The
tag can be set via InstanceGroupTag in JobConfig. Also, do not forget to
tag the instances themselves.
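
A quick sketch of both pieces; the cluster, instance, and tag names are
placeholders, and setInstanceGroupTag is assumed available in your task
framework version:

import org.apache.helix.HelixAdmin;
import org.apache.helix.manager.zk.ZKHelixAdmin;
import org.apache.helix.task.JobConfig;

public class TaggedJobExample {
  public static void main(String[] args) {
    // Tag the instances that should run this kind of task.
    HelixAdmin admin = new ZKHelixAdmin("localhost:2181");
    admin.addInstanceTag("myCluster", "localhost_12000", "DataCollecting");
    admin.addInstanceTag("myCluster", "localhost_12001", "DataCollecting");

    // Restrict the job to instances carrying the same tag.
    JobConfig.Builder jobBuilder = new JobConfig.Builder()
        .setCommand("DataCollecting")
        .setInstanceGroupTag("DataCollecting");
  }
}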

Best,

Junkai

On Thu, Nov 16, 2017 at 8:50 PM, kishore g  wrote:

> Hi Dlmuthu,
>
> You can achieve this by using the tag feature in Helix. For example, to
> Participants P1, P2 should only handle DataCollecting tasks, you need to do
> the following
>
>- Tag Participant P1 and P2 as "DataCollecting"
>- When you create a job resource that represents DataCollecting task,
>you have to tag the resource also as DataCollecting.
>
> Helix will then assign all tasks in this job to nodes that are tagged as
> DataCollecting.
>
> Note this is available only in FULL_AUTO rebalance mode. I am not sure if
> the rebalancer in task framework supports this feature.
>
> Lei/Junkai, do you know if this is supported in Task Framework?
>
> thanks
>
>
> On Thu, Nov 16, 2017 at 8:31 PM, Ajinkya Dhamnaskar  >
> wrote:
>
> > Dlmuthu,
> >
> > This explains your need. Could you please point me to the code? I was
> > wondering, where and how are we registering callbacks for the
> participants?
> >
> > On Thu, Nov 16, 2017 at 7:16 PM, DImuthu Upeksha <
> > dimuthu.upeks...@gmail.com
> > > wrote:
> >
> > > Adding Helix Dev
> > >
> > > Hi Ajinkya
> > >
> > > Thank you for the explanation.
> > >
> > > Let me explain my requirement. [1] is the correct task registry
> > > configuration in a working participant code. All the transition call
> > backs
> > > are registered in this code. The problem here is, we have to bundle
> > > binaries of all the tasks into a single Participant. If we want to
> change
> > > one task, we need to rebuild the participant with other tasks as well.
> > What
> > > I thought is, why can't we build Participants that only do specific set
> > of
> > > tasks. For example, in the cluster there are 3 participants,
> Participant
> > 1
> > > [2], Participant 2 [3] and Participant 3 [4]. If we want to change the
> > > DataCollecting task, we only need to re build Participant 2. I got
> above
> > > error, once I run 3 participants in above configuration. I can
> understand
> > > this is due to the missing transition callback of CommandTask in
> > > Participant 1 as I have purposely commented out it. What I need to know
> > is
> > > that, is this configuration allowed in Helix architecture or do we have
> > to
> > > implement Participants that contain all the task implementations as in
> > [1].
> > > Workflow configuration for both scenarios is here [5].
> > >
> > > [1] https://gist.github.com/DImuthuUpe/41c0db579e7d86d101d112f07ed6ea
> 00
> > > [2] https://gist.github.com/DImuthuUpe/ec72df1ec3207ce2dce88ff7f1756d
> a4
> > > [3] https://gist.github.com/DImuthuUpe/4d193a3dff3008315efa2e31f6209c
> ac
> > > [4] https://gist.github.com/DImuthuUpe/872c432935b8d33944dd571b3ac420
> 7b
> > > [5] https://gist.github.com/DImuthuUpe/f61851b68b685b8d6744689dc130ba
> bd
> > >
> > > Thanks
> > > Dimuthu
> > >
> > > On Fri, Nov 17, 2017 at 3:21 AM, Ajinkya Dhamnaskar <
> > adham...@umail.iu.edu
> > > > wrote:
> > >
> > >> Hey Dlmuthu,
> > >>
> > >> Not an expert in Helix, but from exceptions it seems, system is
> entering
> > >> in a state not expected by reflection. I feel
> https://github.com/apache
> > >> /helix/blob/master/helix-core/src/main/java/org/apache/helix
> > >> /messaging/handling/HelixStateTransitionHandler.java#L295 is
> triggering
> > >> this exception.
> > >> As mentioned in the later part of the stack trace and from Helix
> Apache
> > >> Docs 
> *("Helix
> > >> is built on the following assumption: if your distributed resource is
> > >> modeled by a finite state machine, then Helix can tell participants
> when
> > >> they should transition between states. In the Java API, this means
> > >> implementing transition callbacks. In the Helix agent API, this means
> > >> providing commands than can run for each transition"),* did you
> > >> implement *transition callback* for these tasks?
> > >>
> > >> On Thu, Nov 16, 2017 at 10:01 AM, DImuthu Upeksha <
> > >> dimuthu.upeks...@gmail.com> wrote:
> > >>
> > >>> Hi Devs,
> > >>>
> > >>> I'm working on the technology evaluation to re architecture Apache
> > >>> Airavata task execution framework and Helix seems like a good
> > candidate for
> > >>> that as it has an in built distributed generic workflow execution
> > >>> capability. After going through several tutorials, I tried to
> > implement a
> > >>> simple workflow on Helix to demonstrate following transition
> > >>>
> > >>> Data Collecting Job -> Command Executing Job -> Data Uploading Job
> > >>>
> > >>> I managed to implement this using a Participant node that includes
> all
> > >>> the tasks required for above workflow. However my goal is to
> implement
> > >>> specialized Participants for each Job type. For example, Participant
> 

Re: Exceptions in logs

2017-11-03 Thread Xue Junkai
Hi Leela,

Which version of Helix are you using?

best,

Junkai

On Fri, Nov 3, 2017 at 5:31 PM, leela maheswararao <
leela_mahesh...@yahoo.com.invalid> wrote:

> Team,
> I see below exception in logs during startup time. any issue with my
> configuration?
>
> X`20171103182638.943``1unowned``unowned``SEVERE`
> org.apache.helix.manager.zk.ZKExceptionHandler`Exception while invoking
> init callback for listener: org.apache.helix.messaging.handling.helixtaskexecu...@76f10035
> java.lang.NullPointerException: null
>   at org.apache.helix.messaging.handling.HelixTaskExecutor.onMessage(HelixTaskExecutor.java:691)
>   at org.apache.helix.manager.zk.CallbackHandler.invoke(CallbackHandler.java:243)
>   at org.apache.helix.manager.zk.CallbackHandler.enqueueTask(CallbackHandler.java:177)
>   at org.apache.helix.manager.zk.CallbackHandler.init(CallbackHandler.java:373)
>   at org.apache.helix.manager.zk.CallbackHandler.<init>(CallbackHandler.java:124)
>   at org.apache.helix.manager.zk.ZKHelixManager.addListener(ZKHelixManager.java:340)
>   at org.apache.helix.manager.zk.ZKHelixManager.addMessageListener(ZKHelixManager.java:413)
>   at org.apache.helix.manager.zk.ParticipantManager.setupMsgHandler(ParticipantManager.java:336)
>   at org.apache.helix.manager.zk.ParticipantManager.handleNewSession(ParticipantManager.java:118)
>   at org.apache.helix.manager.zk.ZKHelixManager.handleNewSessionAsParticipant(ZKHelixManager.java:926)
>   at org.apache.helix.manager.zk.ZKHelixManager.handleNewSession(ZKHelixManager.java:892)
>   at org.apache.helix.manager.zk.ZKHelixManager.createClient(ZKHelixManager.java:513)
>   at org.apache.helix.manager.zk.ZKHelixManager.connect(ZKHelixManager.java:551)
>   at com.store.vg.helix.HelixParticipant.start(HelixParticipant.java:43)
>
> Regards,
> Mahesh
>


Re: specify master constraint per node

2017-11-03 Thread Xue Junkai
Hi Leela,

Are your resources all in SemiAuto mode? SemiAuto mode assignment is based
on the preference lists in your IdealState list fields. If you want to
spread the masters out, you have to change the node order of the preference
list for different partitions.

If you are using FullAuto mode, you will not have this problem.
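
For example, a sketch that rotates the node order per partition so each
node leads (and therefore hosts the MASTER) for a share of the partitions;
node and partition names follow your example, and the setPreferenceList
call is shown as a comment:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PreferenceListExample {
  public static void main(String[] args) {
    List<String> nodes =
        Arrays.asList("localhost_8085", "localhost_8086", "localhost_8087");
    for (int p = 0; p < 3; p++) {
      List<String> preference = new ArrayList<String>();
      for (int i = 0; i < nodes.size(); i++) {
        // Rotate the list so a different node is first for each partition.
        preference.add(nodes.get((p + i) % nodes.size()));
      }
      // e.g. idealState.setPreferenceList("R1_" + p, preference);
      System.out.println("R1_" + p + " -> " + preference);
    }
  }
}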

best,

Junkai

On Fri, Nov 3, 2017 at 5:35 PM, leela maheswararao <
leela_mahesh...@yahoo.com.invalid> wrote:

> Team,In my example(SEMI_AUTO), I have configured 3 partitions with 3
> replicas on 3 nodes. Once all nodes are started, external view shows as
> below. Here all MASTER's are assigned to same node. Is there a way to
> spread MASTER's evenly?  is this behavior due to server booting order?
>
> IdealState for R1:
>
> {
>
>   "id" : "R1",
>
>   "mapFields" : {
>
> "R1_0" : {
>
>   "localhost_8085" : "MASTER",
>
>   "localhost_8086" : "SLAVE",
>
>   "localhost_8087" : "SLAVE"
>
> },
>
> "R1_1" : {
>
>   "localhost_8085" : "SLAVE",
>
>   "localhost_8086" : "SLAVE",
>
>   "localhost_8087" : "MASTER"
>
> },
>
> "R1_2" : {
>
>   "localhost_8085" : "SLAVE",
>
>   "localhost_8086" : "MASTER",
>
>   "localhost_8087" : "SLAVE"
>
> }
>
>   },
>
>   "listFields" : {
>
> "R1_0" : [ "localhost_8085", "localhost_8086", "localhost_8087" ],
>
> "R1_1" : [ "localhost_8087", "localhost_8085", "localhost_8086" ],
>
> "R1_2" : [ "localhost_8086", "localhost_8085", "localhost_8087" ]
>
>   },
>
>   "simpleFields" : {
>
> "IDEAL_STATE_MODE" : "AUTO",
>
> "NUM_PARTITIONS" : "3",
>
> "REBALANCE_MODE" : "SEMI_AUTO",
>
> "REBALANCE_STRATEGY" : "DEFAULT",
>
> "REPLICAS" : "3",
>
> "STATE_MODEL_DEF_REF" : "MasterSlave",
>
> "STATE_MODEL_FACTORY_NAME" : "DEFAULT"
>
>   }
>
> }
>
>
>
>
> ExternalView for R1:
>
> {
>
>   "id" : "R1",
>
>   "mapFields" : {
>
> "R1_0" : {
>
>   "localhost_8085" : "SLAVE",
>
>   "localhost_8086" : "MASTER",
>
>   "localhost_8087" : "SLAVE"
>
> },
>
> "R1_1" : {
>
>   "localhost_8085" : "SLAVE",
>
>   "localhost_8086" : "MASTER",
>
>   "localhost_8087" : "SLAVE"
>
> },
>
> "R1_2" : {
>
>   "localhost_8085" : "SLAVE",
>
>   "localhost_8086" : "MASTER",
>
>   "localhost_8087" : "SLAVE"
>
> }
>
>   },
>
>   "listFields" : {
>
>   },
>
>   "simpleFields" : {
>
> "BUCKET_SIZE" : "0",
>
> "IDEAL_STATE_MODE" : "AUTO",
>
> "NUM_PARTITIONS" : "3",
>
> "REBALANCE_MODE" : "SEMI_AUTO",
>
> "REBALANCE_STRATEGY" : "DEFAULT",
>
> "REPLICAS" : "3",
>
> "STATE_MODEL_DEF_REF" : "MasterSlave",
>
> "STATE_MODEL_FACTORY_NAME" : "DEFAULT"
>
>   }
>
> }
> Regards,Mahesh
>


Re: specify master constraint per node

2017-11-03 Thread Xue Junkai
ondHandler stuff,
> RoutingTableProvider routingTableProvider) {
>   this.stuff = stuff;
>   this.routingTableProvider = routingTableProvider;
> }
>
> @Override
> public StateModel createNewStateModel(String resourceName, String stateUnitKey) {
>   logger.info("MasterSlaveStateModelFactory => " + "resourceName "
>       + resourceName + " stateUnitKey " + stateUnitKey);
>   return new MasterSlaveStateModel(stateUnitKey, stuff, routingTableProvider);
> }
>
> @StateModelInfo(initialState = "OFFLINE", states = "{'OFFLINE','SLAVE','MASTER', 'DROPPED'}")
> public static class MasterSlaveStateModel extends StateModel {
>
>   private final String _stateUnitKey;
>   SconeHandler stuff;
>   RoutingTableProvider routingTableProvider;
>
>   public MasterSlaveStateModel(String stateUnitKey, SconeHandler stuff,
>       RoutingTableProvider routingTableProvider) {
>     _stateUnitKey = stateUnitKey;
>     this.stuff = stuff;
>     this.routingTableProvider = routingTableProvider;
>   }
>
>   @Transition(from = "MASTER", to = "SLAVE")
>   public void masterToSlave(Message message, NotificationContext context) {
>     logger.info("BootstrapProcess.BootstrapStateModel.masterToSlave()");
>   }
>
>   @Transition(from = "OFFLINE", to = "SLAVE")
>   public void offlineToSlave(Message message, NotificationContext context) {
>     logger.info("BootstrapProcess.BootstrapStateModel.offlineToSlave()");
>   }
>
>   @Transition(from = "SLAVE", to = "OFFLINE")
>   public void slaveToOffline(Message message, NotificationContext context) {
>     logger.info("BootstrapProcess.BootstrapStateModel.slaveToOffline()");
>   }
>
>   @Transition(from = "SLAVE", to = "MASTER")
>   public void slaveToMaster(Message message, NotificationContext context) {
>     logger.info("BootstrapProcess.BootstrapStateModel.slaveToMaster()");
>   }
>
>   @Transition(from = "OFFLINE", to = "DROPPED")
>   public void onBecomeDroppedFromOffline(Message message, NotificationContext context) {
>     logger.info("BootstrapProcess.BootstrapStateModel.offlineToDropped()");
>   }
>  }
> }
>
> Controller class:
>
> public class HelixController {
>
>   String clusterName;
>   String instanceName;
>   String zkConnectString;
>
>   public HelixController(String clusterName, String instanceName, String zkConnectString) {
>     super();
>     this.clusterName = clusterName;
>     this.instanceName = instanceName;
>     this.zkConnectString = zkConnectString;
>   }
>
>   public void start() {
>     try {
>       HelixManager manager = HelixManagerFactory.getZKHelixManager(clusterName,
>           instanceName, InstanceType.CONTROLLER, zkConnectString);
>       manager.connect();
>       GenericHelixController controller = new GenericHelixController();
>       manager.addConfigChangeListener(controller);
>       manager.addLiveInstanceChangeListener(controller);
>       manager.addIdealStateChangeListener(controller);
>       //manager.addExternalViewChangeListener(controller);
>       manager.addControllerListener(controller);
>     } catch (Exception ex) {
>       logger.log(Level.SEVERE, "Error during Controller start", ex);
>     }
>   }
> }
>
> do you see any issues with my above classes?
> On Saturday, November 4, 2017, 6:13:46 AM GMT+5:30, Xue Junkai <
> j...@apache.org> wrote:
>
>  Hi Leela,
>
> Are your resources all in SemiAuto mode? SemiAuto mode assignment is based
> on your preference list in your IdealState list fields. If you want to
> spread it out, you have to change your preference list node order for
> different partitions.
>
> If you are using FullAuto mode, you will not have this problem.
>
> best,
>
> Junkai
>


Re: [VOTE] Apache Helix 0.8.1 Release

2018-04-26 Thread Xue Junkai
+1

On Thu, Apr 26, 2018 at 5:13 PM, Lei Xia  wrote:

> +1
>
>
>
> On Thu, Apr 26, 2018 at 11:57 AM Vivo Xu  wrote:
>
> > +1
> > On Thu, Apr 26, 2018 at 11:34 AM Lei Xia  wrote:
> >
> > > +1
> > >
> > >
> > >
> > > Lei Xia
> > >
> > > 
> > > From: Eric Kim 
> > > Sent: Thursday, April 26, 2018 10:47:58 AM
> > > To: dev@helix.apache.org
> > > Subject: [VOTE] Apache Helix 0.8.1 Release
> > >
> > > Hi,
> > >
> > >
> > >
> > > This is to call for a vote on releasing the following candidate as
> Apache
> > > Helix 0.8.1. This is the thirteenth release of Helix as an Apache
> > project,
> > > as well as the ninth release as a top-level Apache project.
> > >
> > >
> > >
> > > Apache Helix is a generic cluster management framework that makes it
> easy
> > > to build partitioned and replicated, fault-tolerant and scalable
> > > distributed systems.
> > >
> > >
> > >
> > > Release notes:
> > >
> > > https://helix.apache.org/0.8.1-docs/releasenotes/release-0.8.1.html
> > >
> > >
> > >
> > > Release artifacts:
> > >
> > > https://repository.apache.org/content/repositories/orgapachehelix-1016
> > >
> > >
> > >
> > > Distribution:
> > >
> > > * binaries:
> > >
> > > https://dist.apache.org/repos/dist/dev/helix/0.8.1/binaries/
> > >
> > > * sources:
> > >
> > > https://dist.apache.org/repos/dist/dev/helix/0.8.1/src/
> > >
> > >
> > >
> > > The 0.8.1 release tag:
> > >
> > >
> > >
> > https://git-wip-us.apache.org/repos/asf?p=helix.git;a=tag;h=refs/tags/helix-0.8.1
> > >
> > >
> > >
> > > KEYS file available here:
> > >
> > > https://dist.apache.org/repos/dist/dev/helix/KEYS
> > >
> > >
> > >
> > > Please vote on the release. The vote will be open for at least 72
> hours.
> > >
> > >
> > >
> > > [+1] -- "YES, release"
> > >
> > > [0] -- "No opinion"
> > >
> > > [-1] -- "NO, do not release"
> > >
> > >
> > >
> > > Thanks,
> > >
> > > The Apache Helix Team
> > >
> > >
> >
>



-- 
Junkai Xue


Re: [GitHub] helix issue #126: In CriteriaEvaluator.evaluateCriteria, if resourcename is ...

2018-01-09 Thread Xue Junkai
Hi Shreeshiv,

Sorry, we don't have access to the mailing list membership. Please visit
the website to unsubscribe yourself.

Best,

Junkai

On Tue, Jan 9, 2018 at 8:51 AM, shreeshiv patel 
wrote:

> please unsubscribee me
>
> On Tue, Jan 9, 2018 at 10:31 AM, dasahcc  wrote:
>
> > Github user dasahcc commented on the issue:
> >
> > https://github.com/apache/helix/pull/126
> >
> > @kishoreg Good suggestion! Will make that change.
> >
> >
> > ---
> >
>



-- 
Junkai Xue


[RESULT][VOTE] Apache Helix 0.8.0 Release

2018-02-01 Thread Xue Junkai
Thanks for voting on the 0.8.0 release. It has now exceeded 72 hours so I
am closing the vote.

Binding +1s:
Kishore Gopalakrishna, Lei Xia, Shi Lu, Zhen Zhang


Nonbinding +1s:
Bo Liu

Binding 0s:

Nonbinding 0s:

Binding -1s:

Nonbinding -1s:

The vote has passed, thanks a lot to everyone for voting!


Re: [VOTE] Apache Helix 0.8.0 Release

2018-01-31 Thread Xue Junkai
Hi All,

We fixed the issue by removing the helix-front module from the main pom
file. Here's the new release candidate. Thanks for helping verify it.

https://repository.apache.org/content/repositories/orgapachehelix-1014

Best,

Junkai

On Mon, Jan 29, 2018 at 6:32 PM, kishore g <g.kish...@gmail.com> wrote:

> +1.
>
> On Mon, Jan 29, 2018 at 6:13 PM, Bo Liu <newpoo@gmail.com> wrote:
>
> > +1 can't wait to try V2 restful API and UI.
> >
> > On Mon, Jan 29, 2018 at 5:53 PM, Lei Xia <l...@apache.org> wrote:
> >
> > > +1
> > >
> > > On Mon, Jan 29, 2018 at 5:13 PM, Xue Junkai <j...@apache.org> wrote:
> > >
> > > > Hi,
> > > >
> > > > This is to call for a vote on releasing the following candidate as
> > Apache
> > > > Helix 0.8.0. This is the 13th release of Helix as an Apache project,
> as
> > > > well as the 9th release as a top-level Apache project.
> > > >
> > > > Apache Helix is a generic cluster management framework that makes it
> > easy
> > > > to build partitioned and replicated, fault-tolerant and scalable
> > > > distributed systems.
> > > >
> > > > Release notes:
> > > > *http://helix.apache.org/0.8.0-docs/releasenotes/release-0.8.0.html
> > > > <http://helix.apache.org/0.8.0-docs/releasenotes/release-0.8.0.html
> >*
> > > >
> > > > Release artifacts:
> > > > https://repository.apache.org/content/repositories/
> orgapachehelix-1013
> > > >
> > > > Distribution:
> > > > * binaries:
> > > > https://dist.apache.org/repos/dist/dev/helix/0.8.0/binaries/
> > > > * sources:
> > > > https://dist.apache.org/repos/dist/dev/helix/0.8.0/src/
> > > >
> > > > The 0.8.0 release tag:
> > > > https://git-wip-us.apache.org/repos/asf?p=helix.git;a=tag;h=
> > > > refs/tags/helix-0.8.0
> > > >
> > > > KEYS file available here:
> > > > https://dist.apache.org/repos/dist/dev/helix/KEYS
> > > >
> > > > Please vote on the release. The vote will be open for at least 72
> > hours.
> > > >
> > > > [+1] -- "YES, release"
> > > > [0] -- "No opinion"
> > > > [-1] -- "NO, do not release"
> > > >
> > > > Thanks,
> > > > The Apache Helix Team
> > > >
> > >
> >
> >
> >
> > --
> > Best regards,
> > Bo
> >
>



-- 
Junkai Xue


[VOTE] Apache Helix 0.8.0 Release

2018-01-29 Thread Xue Junkai
Hi,

This is to call for a vote on releasing the following candidate as Apache
Helix 0.8.0. This is the 13th release of Helix as an Apache project, as
well as the 9th release as a top-level Apache project.

Apache Helix is a generic cluster management framework that makes it easy
to build partitioned and replicated, fault-tolerant and scalable
distributed systems.

Release notes:
http://helix.apache.org/0.8.0-docs/releasenotes/release-0.8.0.html

Release artifacts:
https://repository.apache.org/content/repositories/orgapachehelix-1013

Distribution:
* binaries:
https://dist.apache.org/repos/dist/dev/helix/0.8.0/binaries/
* sources:
https://dist.apache.org/repos/dist/dev/helix/0.8.0/src/

The 0.8.0 release tag:
https://git-wip-us.apache.org/repos/asf?p=helix.git;a=tag;h=refs/tags/helix-0.8.0

KEYS file available here:
https://dist.apache.org/repos/dist/dev/helix/KEYS

Please vote on the release. The vote will be open for at least 72 hours.

[+1] -- "YES, release"
[0] -- "No opinion"
[-1] -- "NO, do not release"

Thanks,
The Apache Helix Team


Re: doc link broken

2018-01-02 Thread Xue Junkai
It has been fixed now.

Best,

Junkai

On Tue, Jan 2, 2018 at 8:12 AM, Lei Xia  wrote:

> Hi, William
>
>Thanks for reporting the issue,  we will fix it asap.
>
>
> Lei
>
> On Tue, Jan 2, 2018 at 12:35 AM William Guo  wrote:
>
> > hi helix team,
> >
> > http://helix.apache.org/0.7.1-docs/tutorial_health.html
> >
> > http://helix.apache.org/0.6.9-docs/tutorial_health.html
> >
> > Above links are broken, could you fix this?
> >
> >
> > Thanks,
> > William
> >
>



-- 
Junkai Xue


Re: [VOTE] Apache Helix 0.8.2 Release

2018-07-25 Thread Xue Junkai
+1

On Wed, Jul 25, 2018 at 4:46 PM, Wang Jiajun  wrote:

> Hi,
>
> This is to call for a vote on releasing the following candidate as Apache
> Helix 0.8.2. This is the 14th release of Helix as an Apache project, as
> well as the 10th release as a top-level Apache project.
>
> Apache Helix is a generic cluster management framework that makes it easy
> to build partitioned and replicated, fault-tolerant and scalable
> distributed systems.
>
> Release notes:
> http://helix.apache.org/0.8.2-docs/releasenotes/release-0.8.2.html
>
> Release artifacts:
> https://repository.apache.org/content/repositories/orgapachehelix-1018
>
> Distribution:
> * binaries:
> https://dist.apache.org/repos/dist/dev/helix/0.8.2/binaries/
> * sources:
> https://dist.apache.org/repos/dist/dev/helix/0.8.2/src/
>
> The 0.8.2 release tag:
> https://git-wip-us.apache.org/repos/asf?p=helix.git;a=tag;h=
> refs/tags/helix-
> 0.8.2
>
> KEYS file available here:
> https://dist.apache.org/repos/dist/dev/helix/KEYS
>
> Please vote on the release. The vote will be open for at least 72 hours.
>
> [+1] -- "YES, release"
> [0] -- "No opinion"
> [-1] -- "NO, do not release"
>
> Thanks,
> The Apache Helix Team
>



-- 
Junkai Xue


Re: Sporadic issue when restarting a Participant

2018-04-11 Thread Xue Junkai
Hi DImuthu,

This could be caused by a race condition when restarting participants. When a
task is assigned to a participant, the participant initializes the task and
updates the TaskContext in the JobContext. At that moment, the participant
got restarted. When the participant comes back, the controller will reassign
the task to it. During this process, all the TaskContexts of the job are read
back and parsed. That may be where the NPE comes from.

I will create a ticket to add this null check. Thanks for reporting this.
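
For reference, here is a rough sketch of the kind of guard I have in mind.
It is illustrative only, not the actual patch: the "record" variable and the
"STATE" field name below are hypothetical stand-ins for wherever the state
string is read out of the TaskContext.

    // Guard the Enum.valueOf call that throws "Name is null" when a
    // partially written TaskContext has no state recorded yet.
    String stateName = record.getSimpleField("STATE"); // hypothetical lookup
    TaskPartitionState state = (stateName == null)
        ? TaskPartitionState.INIT // fall back to a safe initial state
        : TaskPartitionState.valueOf(stateName);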

Best,

Junkai



On Fri, Apr 6, 2018 at 12:12 PM, DImuthu Upeksha  wrote:

> Hi Folks,
>
> I used helix task framework to run several workflows and restarted the
> participant that held the implemented tasks. Most of the cases in restarts,
> Helix catches up and continue with last Task but In some cases it prints
> following error on Controller log and Workflow stops working upon that
> point. What could be the reason for that? I'm using Helix 0.6.7 version.
>
> 2018-04-06 15:10:57,766 [Thread-3] ERROR
> o.a.h.c.s.BestPossibleStateCalcStage  - Error computing assignment for
> resource
> Workflow_of_process_PROCESS_7f6c8a54-b50f-4bdb-aafd-
> 59ce87276527-POST-b5e39e07-2d8e-4309-be5a-f5b6067f9a24_
> TASK_cc8039e5-f054-4dea-8c7f-07c98077b117.
> Skipping.
> java.lang.NullPointerException: Name is null
> at java.lang.Enum.valueOf(Enum.java:236)
> at
> org.apache.helix.task.TaskPartitionState.valueOf(
> TaskPartitionState.java:25)
> at
> org.apache.helix.task.JobRebalancer.computeResourceMapping(
> JobRebalancer.java:272)
> at
> org.apache.helix.task.JobRebalancer.computeBestPossiblePartitionSt
> ate(JobRebalancer.java:140)
> at
> org.apache.helix.controller.stages.BestPossibleStateCalcStage.compute(
> BestPossibleStateCalcStage.java:171)
> at
> org.apache.helix.controller.stages.BestPossibleStateCalcStage.process(
> BestPossibleStateCalcStage.java:66)
> at
> org.apache.helix.controller.pipeline.Pipeline.handle(Pipeline.java:48)
> at
> org.apache.helix.controller.GenericHelixController.handleEvent(
> GenericHelixController.java:295)
> at
> org.apache.helix.controller.GenericHelixController$
> ClusterEventProcessor.run(GenericHelixController.java:595)
> 2018-04-06 15:11:00,385 [Thread-3] ERROR
> o.a.h.c.s.BestPossibleStateCalcStage  - Error computing assignment for
> resource
> Workflow_of_process_PROCESS_2b69b499-c527-4c9d-8b2b-
> db17366f5f81-POST-c67607ae-9177-4a02-af8a-8b3751eea4ff_
> TASK_1ea6876d-f2ec-4139-a15d-0e64a80a3025.
> Skipping.
>
> Thanks
> Dimuthu
>



-- 
Junkai Xue


Re: Stale jobs in Helix 0.8.2

2018-10-19 Thread Xue Junkai
Were these old jobs? Workflows and jobs should not have an external view.
From this context, the job seems not to have been scheduled by the
controller; there is not even an assignment.

Please check whether you have live instances and whether the instance has
been disabled. At the same time, could you try submitting different jobs to
see whether they run?
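
A quick sanity-check sketch for that, assuming you already have a connected
HelixManager (the "manager" parameter is a placeholder) and Helix 0.8.x APIs:

    import java.util.List;
    import org.apache.helix.HelixAdmin;
    import org.apache.helix.HelixDataAccessor;
    import org.apache.helix.HelixManager;
    import org.apache.helix.PropertyKey;
    import org.apache.helix.model.InstanceConfig;

    public static void checkInstances(HelixManager manager) {
        // Live instances are the participants currently connected to ZooKeeper.
        HelixDataAccessor accessor = manager.getHelixDataAccessor();
        PropertyKey.Builder keyBuilder = accessor.keyBuilder();
        List<String> liveInstances = accessor.getChildNames(keyBuilder.liveInstances());

        // Compare against all configured instances and flag disabled ones.
        HelixAdmin admin = manager.getClusterManagmentTool(); // Helix API spelling
        for (String instance : admin.getInstancesInCluster(manager.getClusterName())) {
            InstanceConfig config = admin.getInstanceConfig(manager.getClusterName(), instance);
            System.out.println(instance + ": enabled=" + config.getInstanceEnabled()
                + ", live=" + liveInstances.contains(instance));
        }
    }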

Best,

Junkai

On Fri, Oct 19, 2018 at 9:28 AM DImuthu Upeksha 
wrote:

> Hi,
>
> I'm noticing that in task framework, some jobs are not scheduled by the
> controller.
>
> For an example, in the zookeeper path
>
>
> /AiravataDemoCluster/CONFIGS/RESOURCE/Workflow_of_process_PROCESS_43171238-ec40-4cc9-a717-50569b13a839-POST-ac2cbfdb-1d1d-4767-8d51-3d3e8f3d84db_TASK_3fecf59e-fd81-438c-9c55-a4f3633a54dc
>
> I can see the content [1]
>
> And for the path
>
>
> /AiravataDemoCluster/EXTERNALVIEW/Workflow_of_process_PROCESS_43171238-ec40-4cc9-a717-50569b13a839-POST-ac2cbfdb-1d1d-4767-8d51-3d3e8f3d84db_TASK_3fecf59e-fd81-438c-9c55-a4f3633a54dc
>
> I can see the content [2]
>
> And for the path
>
>
> /AiravataDemoCluster/IDEALSTATES/Workflow_of_process_PROCESS_43171238-ec40-4cc9-a717-50569b13a839-POST-ac2cbfdb-1d1d-4767-8d51-3d3e8f3d84db_TASK_3fecf59e-fd81-438c-9c55-a4f3633a54dc
>
> Content is [3]
>
> But this job is not scheduled for some reason. Can you please let me know
> what could be the reason for this?
>
> [1] https://gist.github.com/DImuthuUpe/7f54642cd81bc50337e4262d0fdeed9f
> [2] https://gist.github.com/DImuthuUpe/eddc4f4c22cb4128a2ae623d14e87a04
> [3] https://gist.github.com/DImuthuUpe/d2ed567fea91ed6b8d80afb4421210b0
>
> Thanks
> Dimuthu
>


-- 
Junkai Xue


Re: Sporadic issue in putting a variable in workflow scope

2018-11-09 Thread Xue Junkai
It is possible. For example, if other jobs caused the workflow to fail, that
would trigger your monitoring to clean up the workflow. Then, if this job is
still running, you may see the problem. That is what I was trying to ask
about: an extra thread deleting/cleaning workflows.

I can understand cleaning up the failed workflows. But I am wondering why you
don't just set an expiry and let the Helix controller do the cleanup for
completed workflows.
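
For example, something like this when building the workflow (a sketch; the
workflow name and the two-hour expiry are placeholders):

    // Let the controller remove the workflow's znodes itself,
    // two hours after the workflow completes.
    Workflow.Builder builder = new Workflow.Builder("myWorkflow")
        .setExpiry(2 * 60 * 60 * 1000L); // expiry is in milliseconds
    // ... add jobs as usual ...
    taskDriver.start(builder.build());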

On Sat, Nov 10, 2018 at 1:30 PM DImuthu Upeksha 
wrote:

> Hi Junkai,
>
> There is a cleanup agent [1] who is monitoring currently available
> workflows and deleting completed and failed workflows to clear up zookeeper
> storage. Do you think that this will be causing this issue?
>
> [1]
>
> https://github.com/apache/airavata/blob/staging/modules/airavata-helix/helix-spectator/src/main/java/org/apache/airavata/helix/impl/controller/WorkflowCleanupAgent.java
>
> Thanks
> Dimuthu
>
> On Fri, Nov 9, 2018 at 11:14 PM DImuthu Upeksha <
> dimuthu.upeks...@gmail.com>
> wrote:
>
> > Hi Junkai,
> >
> > There is no manual workflow killing logic implemented but as you have
> > suggested, I need to verify that. Unfortunately all the helix log levels
> in
> > our servers were set to WARN as helix is printing a whole lot of logs in
> > INFO level so there is no much valuable information in logs. Can you
> > specify which class is printing logs associated for workflow termination
> > and I'll enable DEBUG level for that class and observe further.
> >
> > Thanks
> > Dimuthu
> >
> > On Fri, Nov 9, 2018 at 9:18 PM Xue Junkai  wrote:
> >
> >> Hmm, that's very strange. The user content store znode only has been
> >> deleted when the workflow is gone. From the log, it shows the znode is
> >> gone. Could you please try to dig the log to find whether the workflow
> has
> >> been manually killed? If that's the case, then it is possible you have
> the
> >> problem.
> >>
> >> On Fri, Nov 9, 2018 at 12:13 PM DImuthu Upeksha <
> >> dimuthu.upeks...@gmail.com>
> >> wrote:
> >>
> >> > Hi Junkai,
> >> >
> >> > Thanks for your suggestion. You have captured most of the parts
> >> correctly.
> >> > There are two jobs as job1 and job2. And there is a dependency that
> job2
> >> > depends on job1. Until job1 is completed job2 should not be scheduled.
> >> And
> >> > task 1 in job 1 is calling that method and it is not updating anyone's
> >> > content. It's just putting and value in workflow level. What do you
> >> mean my
> >> > keeping a key-value store in workflow level? I already use that key
> >> value
> >> > store given by helix by calling putUserContent method.
> >> >
> >> > public void sendNextJob(String jobId) {
> >> > putUserContent(WORKFLOW_STARTED, "TRUE", Scope.WORKFLOW);
> >> > if (jobId != null) {
> >> > putUserContent(NEXT_JOB, jobId, Scope.WORKFLOW);
> >> > }
> >> > }
> >> >
> >> > Dimuthu
> >> >
> >> >
> >> > On Fri, Nov 9, 2018 at 2:48 PM Xue Junkai 
> wrote:
> >> >
> >> > > In my understanding, it could be you have job1 and job2. The task
> >> running
> >> > > in job1 tries to update content for job2. Then, there could be a
> race
> >> > > condition happening here that job2 is not scheduled.
> >> > >
> >> > > If that's the case, I suggest you can put key-value store at
> workflow
> >> > level
> >> > > since this is cross-job operation.
> >> > >
> >> > > Best,
> >> > >
> >> > > Junkai
> >> > >
> >> > > On Fri, Nov 9, 2018 at 11:45 AM DImuthu Upeksha <
> >> > > dimuthu.upeks...@gmail.com>
> >> > > wrote:
> >> > >
> >> > > > Hi Junkai,
> >> > > >
> >> > > > This method is being called inside a running task. And it is
> working
> >> > for
> >> > > > most of the time. I only saw this in 2 occasions for last few
> months
> >> > and
> >> > > > both of them happened today and yesterday.
> >> > > >
> >> > > > Thanks
> >> > > > Dimuthu
> >> > > >
> >> > > > On Fri, Nov 9, 2018 at 2:40 PM Xue Junkai 
> >> > wrote:
> >> > > >

Re: Sporadic issue in putting a variable in workflow scope

2018-11-09 Thread Xue Junkai
Hmm, that's very strange. The user content store znode is only deleted when
the workflow is gone, and the log shows the znode is gone. Could you please
dig through the logs to find out whether the workflow was manually killed? If
that's the case, then it is possible you hit this problem.

On Fri, Nov 9, 2018 at 12:13 PM DImuthu Upeksha 
wrote:

> Hi Junkai,
>
> Thanks for your suggestion. You have captured most of the parts correctly.
> There are two jobs as job1 and job2. And there is a dependency that job2
> depends on job1. Until job1 is completed job2 should not be scheduled. And
> task 1 in job 1 is calling that method and it is not updating anyone's
> content. It's just putting and value in workflow level. What do you mean my
> keeping a key-value store in workflow level? I already use that key value
> store given by helix by calling putUserContent method.
>
> public void sendNextJob(String jobId) {
> putUserContent(WORKFLOW_STARTED, "TRUE", Scope.WORKFLOW);
> if (jobId != null) {
> putUserContent(NEXT_JOB, jobId, Scope.WORKFLOW);
> }
> }
>
> Dimuthu
>
>
> On Fri, Nov 9, 2018 at 2:48 PM Xue Junkai  wrote:
>
> > In my understanding, it could be you have job1 and job2. The task running
> > in job1 tries to update content for job2. Then, there could be a race
> > condition happening here that job2 is not scheduled.
> >
> > If that's the case, I suggest you can put key-value store at workflow
> level
> > since this is cross-job operation.
> >
> > Best,
> >
> > Junkai
> >
> > On Fri, Nov 9, 2018 at 11:45 AM DImuthu Upeksha <
> > dimuthu.upeks...@gmail.com>
> > wrote:
> >
> > > Hi Junkai,
> > >
> > > This method is being called inside a running task. And it is working
> for
> > > most of the time. I only saw this in 2 occasions for last few months
> and
> > > both of them happened today and yesterday.
> > >
> > > Thanks
> > > Dimuthu
> > >
> > > On Fri, Nov 9, 2018 at 2:40 PM Xue Junkai 
> wrote:
> > >
> > > > User content store node will be created one the job has been
> scheduled.
> > > In
> > > > your case, I think the job is not scheduled. This method usually has
> > been
> > > > utilized in running task.
> > > >
> > > > Best,
> > > >
> > > > Junkai
> > > >
> > > > On Fri, Nov 9, 2018 at 8:19 AM DImuthu Upeksha <
> > > dimuthu.upeks...@gmail.com
> > > > >
> > > > wrote:
> > > >
> > > > > Hi Helix Folks,
> > > > >
> > > > > I'm having this sporadic issue in some tasks of our workflows when
> we
> > > try
> > > > > to store a value in the workflow context and I have added both code
> > > > section
> > > > > and error message below. Do you have an idea what's causing this?
> > > Please
> > > > > let me know if you need further information. We are using Helix
> 0.8.2
> > > > >
> > > > > public void sendNextJob(String jobId) {
> > > > > putUserContent(WORKFLOW_STARTED, "TRUE", Scope.WORKFLOW);
> > > > > if (jobId != null) {
> > > > > putUserContent(NEXT_JOB, jobId, Scope.WORKFLOW);
> > > > > }
> > > > > }
> > > > >
> > > > > Failed to setup environment of task
> > > > > TASK_55096de4-2cb6-4b09-84fd-7fdddba93435
> > > > > java.lang.NullPointerException: null
> > > > > at
> org.apache.helix.task.TaskUtil$1.update(TaskUtil.java:358)
> > > > > at
> org.apache.helix.task.TaskUtil$1.update(TaskUtil.java:356)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> org.apache.helix.manager.zk.HelixGroupCommit.commit(HelixGroupCommit.java:126)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> org.apache.helix.manager.zk.ZkCacheBaseDataAccessor.update(ZkCacheBaseDataAccessor.java:306)
> > > > > at
> > > > >
> > > > >
> > > >
> > >
> >
> org.apache.helix.store.zk.AutoFallbackPropertyStore.update(AutoFallbackPropertyStore.java:61)
> > > > > at
> > > > >
> > > >
> > >
> >
> org.apache.helix.task.TaskUtil.addWorkflowJobUserContent(TaskUtil.java:356)
> > 

Re: Sporadic issue in putting a variable in workflow scope

2018-11-09 Thread Xue Junkai
In my understanding, you could have job1 and job2, where the task running in
job1 tries to update content for job2. There could then be a race condition
in which job2 is not yet scheduled.

If that's the case, I suggest you put the key-value store at the workflow
level, since this is a cross-job operation.
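
To illustrate (a sketch using the UserContentStore methods already shown in
this thread, called from inside a Task implementation; the key and job names
are just examples):

    // In a task of job1: publish a value at WORKFLOW scope.
    putUserContent("NEXT_JOB", "job2", Scope.WORKFLOW);

    // Later, in a task of job2: read it back from the same WORKFLOW scope,
    // so neither job touches the other job's own content znode.
    String nextJob = getUserContent("NEXT_JOB", Scope.WORKFLOW);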

Best,

Junkai

On Fri, Nov 9, 2018 at 11:45 AM DImuthu Upeksha 
wrote:

> Hi Junkai,
>
> This method is being called inside a running task. And it is working for
> most of the time. I only saw this in 2 occasions for last few months and
> both of them happened today and yesterday.
>
> Thanks
> Dimuthu
>
> On Fri, Nov 9, 2018 at 2:40 PM Xue Junkai  wrote:
>
> > User content store node will be created one the job has been scheduled.
> In
> > your case, I think the job is not scheduled. This method usually has been
> > utilized in running task.
> >
> > Best,
> >
> > Junkai
> >
> > On Fri, Nov 9, 2018 at 8:19 AM DImuthu Upeksha <
> dimuthu.upeks...@gmail.com
> > >
> > wrote:
> >
> > > Hi Helix Folks,
> > >
> > > I'm having this sporadic issue in some tasks of our workflows when we
> try
> > > to store a value in the workflow context and I have added both code
> > section
> > > and error message below. Do you have an idea what's causing this?
> Please
> > > let me know if you need further information. We are using Helix 0.8.2
> > >
> > > public void sendNextJob(String jobId) {
> > > putUserContent(WORKFLOW_STARTED, "TRUE", Scope.WORKFLOW);
> > > if (jobId != null) {
> > > putUserContent(NEXT_JOB, jobId, Scope.WORKFLOW);
> > > }
> > > }
> > >
> > > Failed to setup environment of task
> > > TASK_55096de4-2cb6-4b09-84fd-7fdddba93435
> > > java.lang.NullPointerException: null
> > > at org.apache.helix.task.TaskUtil$1.update(TaskUtil.java:358)
> > > at org.apache.helix.task.TaskUtil$1.update(TaskUtil.java:356)
> > > at
> > >
> > >
> >
> org.apache.helix.manager.zk.HelixGroupCommit.commit(HelixGroupCommit.java:126)
> > > at
> > >
> > >
> >
> org.apache.helix.manager.zk.ZkCacheBaseDataAccessor.update(ZkCacheBaseDataAccessor.java:306)
> > > at
> > >
> > >
> >
> org.apache.helix.store.zk.AutoFallbackPropertyStore.update(AutoFallbackPropertyStore.java:61)
> > > at
> > >
> >
> org.apache.helix.task.TaskUtil.addWorkflowJobUserContent(TaskUtil.java:356)
> > > at
> > >
> > >
> >
> org.apache.helix.task.UserContentStore.putUserContent(UserContentStore.java:78)
> > > at
> > >
> > >
> >
> org.apache.airavata.helix.core.AbstractTask.sendNextJob(AbstractTask.java:136)
> > > at
> org.apache.airavata.helix.core.OutPort.invoke(OutPort.java:42)
> > > at
> > >
> > >
> >
> org.apache.airavata.helix.core.AbstractTask.onSuccess(AbstractTask.java:123)
> > > at
> > >
> > >
> >
> org.apache.airavata.helix.impl.task.AiravataTask.onSuccess(AiravataTask.java:97)
> > > at
> > >
> > >
> >
> org.apache.airavata.helix.impl.task.env.EnvSetupTask.onRun(EnvSetupTask.java:52)
> > > at
> > >
> > >
> >
> org.apache.airavata.helix.impl.task.AiravataTask.onRun(AiravataTask.java:349)
> > > at
> > > org.apache.airavata.helix.core.AbstractTask.run(AbstractTask.java:92)
> > > at org.apache.helix.task.TaskRunner.run(TaskRunner.java:71)
> > > at
> > > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> > >     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> > > at
> > >
> > >
> >
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> > > at
> > >
> > >
> >
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> > > at
> > >
> > >
> >
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> > > at
> > >
> > >
> >
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> > > at java.lang.Thread.run(Thread.java:748)
> > >
> > > Thanks
> > > Dimuthu
> > >
> >
> >
> > --
> > Junkai Xue
> >
>


-- 
Junkai Xue


Re: Sporadic issue in putting a variable in workflow scope

2018-11-10 Thread Xue Junkai
Hi Dimuthu,

A couple of things here:
1. Only a JobQueue in Helix is a single-branch DAG, and it runs one job at a
time only if the parallel job count is set to 1. Otherwise, you may see many
jobs running at the same time if you set the parallel job count to a
different value. For a generic workflow, all jobs without dependencies can be
dispatched together (see the configuration sketch below).
2. Helix only cleans up completed generic workflows by deleting all the
related znodes; it does not do this for JobQueues. For a JobQueue you have to
set up a periodic purge time. By definition, a JobQueue never finishes, can
only be terminated by a manual kill, and keeps accepting dynamically added
jobs. Thus you have to know whether your workflow is a generic workflow or a
JobQueue. For a failed generic workflow, even if you set the expiry time,
Helix will not clean it up, since Helix keeps it for further user
investigation.
3. On the Helix controller side, if Helix fails to clean up a workflow, what
you will see is a workflow with a context but no resource config or ideal
state. This happens when the ZK write fails to clean the last piece, the
context node, and with no ideal state left there is nothing to trigger
cleanup again for this workflow.
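
A configuration sketch for points 1 and 2 (the queue name and intervals are
placeholders, and setJobPurgeInterval is my recollection of the purge knob,
so please double-check it against the tutorial below):

    // One job at a time, and purge finished jobs from the queue hourly.
    WorkflowConfig.Builder cfg = new WorkflowConfig.Builder()
        .setParallelJobs(1)
        .setJobPurgeInterval(60 * 60 * 1000L);
    JobQueue.Builder queueBuilder = new JobQueue.Builder("myQueue");
    queueBuilder.setWorkflowConfig(cfg.build());
    taskDriver.start(queueBuilder.build());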

Please take a look at the task framework tutorial for detailed
configurations:
https://helix.apache.org/0.8.2-docs/tutorial_task_framework.html

Best,

Junkai

On Sat, Nov 10, 2018 at 8:29 AM DImuthu Upeksha 
wrote:

> Hi Junkai,
>
> Thanks for the clarification. There are few special properties in our
> workflows. All the workflows are single branch DAGs so there will be only
> one job running at a time. By looking at the log, I could see that only the
> task with this error has been failed. Cleanup agent deleted this workflow
> after this task is failed so it is clear that no other task is triggering
> this issue (I checked the timestamp).
>
> However for the instance, I disabled the cleanup agent for a while. Reason
> for adding this agent is because Helix became slow to schedule pending jobs
> when the load is high and participant was waiting without running anything
> for few minutes. We discussed this on thread "Sporadic delays in task
> execution". Before implementing this agent, I noticed that, there were lots
> of uncleared znodes related to Completed and Failed workflows and I though
> that was the reason to slow down controller / participant. After
> implementing this agent, things went smoothly until this point.
>
> Now I understand that you have your own workflow cleanup logic implemented
> in Helix but we might need to tune it to our case. Can you point me into
> code / documentation where I can have an idea about that?
>
> And this for my understanding, let's say that for some reason Helix failed
> to clean up completed workflows and related resources in zk. Will that
> affect to the performance of controller / participant? My understanding was
> that Helix was registering zk watchers for all the paths irrespective of
> the status of the workflow/ job/ task. Please correct me if I'm wrong.
>
> Thanks
> Dimuthu
>
> On Sat, Nov 10, 2018 at 1:49 AM Xue Junkai  wrote:
>
> > It is possible. For example, if other jobs caused the workflow failed, it
> > will trigger the monitoring to clean up the workflow. Then if this job is
> > still running, you may see the problem. That's what I am trying to ask
> for,
> > extra thread deleting/cleaning workflows.
> >
> > I can understand it clean up the failed workflow. But I am wondering why
> > not just set expiry and let Helix controller does the clean up for
> > completed workflows.
> >
> > On Sat, Nov 10, 2018 at 1:30 PM DImuthu Upeksha <
> > dimuthu.upeks...@gmail.com> wrote:
> >
> >> Hi Junkai,
> >>
> >> There is a cleanup agent [1] who is monitoring currently available
> >> workflows and deleting completed and failed workflows to clear up
> >> zookeeper
> >> storage. Do you think that this will be causing this issue?
> >>
> >> [1]
> >>
> >>
> https://github.com/apache/airavata/blob/staging/modules/airavata-helix/helix-spectator/src/main/java/org/apache/airavata/helix/impl/controller/WorkflowCleanupAgent.java
> >>
> >> Thanks
> >> Dimuthu
> >>
> >> On Fri, Nov 9, 2018 at 11:14 PM DImuthu Upeksha <
> >> dimuthu.upeks...@gmail.com>
> >> wrote:
> >>
> >> > Hi Junkai,
> >> >
> >> > There is no manual workflow killing logic implemented but as you have
> >> > suggested, I need to verify that. Unfortunately all the helix log
> >> levels in
> >> > our servers were set to WARN as helix is printing a whole lot of logs
> in
> INFO level so there is no much valuable information in logs.

Re: Sporadic issue in putting a variable in workflow scope

2018-11-12 Thread Xue Junkai
1 and 2 are correct; 3 is wrong. The expiry time starts counting only when
the workflow is completed. If it is not scheduled (e.g., it doesn't have
enough resources) or is still running, Helix never deletes it.



On Sun, Nov 11, 2018 at 8:01 PM DImuthu Upeksha 
wrote:

> Hi Junkai,
>
> Thanks for the clarification. That helped a lot.
>
> I our case, each of the task of the workflow are depending on the previous
> task. So there is no parallel execution. And we are not using Job Queues.
>
> Regarding the expiry time, what are the rules that you are imposing on
> that? For example let's say I setup an expiry time to 2 hours, I assume
> following situations are covered in Helix,
>
> 1. Even though the workflow is completed before 2 hours, resources related
> to that workflow will not be cleared until 2 hours are elapsed and exactly
> after 2 hours, all the resources will be cleared by the framework.
> 2. If the workflow failed, resources will not be cleared even after 2 hours
> 3. If the workflow wasn't scheduled within 2 hours in a participant, it
> will be deleted
>
> Is my understanding correct?
>
> Thanks
> Dimuthu
>
>
> On Sat, Nov 10, 2018 at 4:26 PM Xue Junkai  wrote:
>
> > Hi Dimuthu,
> >
> > Couple things here:
> > 1. Only JobQueue in Helix is single branch DAG and 1 job running at a
> time
> > with defining parallel job number to be 1. Otherwise, you may see many
> jobs
> > running at same time as you set parallel job number to be a different
> > number. For generic workflow, all jobs without dependencies could be
> > dispatched together.
> > 2. Helix only cleans up the completed generic workflows by deleting all
> > the related znode, not for JobQueue. For JobQueue you have to set up
> > periodical purge time. As Helix defined, JobQueue never finishes and only
> > can be terminated by manual kill and it can keep accepting dynamic jobs.
> > Thus you have to understand your workflow is generic workflow or
> JobQueue.
> > For failed generic workflow, even if you setup the expiry time, Helix
> will
> > not clean it up as Helix would like to keep it for user further
> > investigation.
> > 3. For Helix controller, if Helix failed to clean up workflows, the only
> > thing you can see is the having workflows with context but no resource
> > config and idealstate there. This is because of ZK write fail to clean
> last
> > piece, context node. And there is no ideal state can trigger clean up
> again
> > for this workflow.
> >
> > Please take a look for this task framework tutorial for detailed
> > configurations:
> > https://helix.apache.org/0.8.2-docs/tutorial_task_framework.html
> >
> > Best,
> >
> > Junkai
> >
> > On Sat, Nov 10, 2018 at 8:29 AM DImuthu Upeksha <
> > dimuthu.upeks...@gmail.com> wrote:
> >
> >> Hi Junkai,
> >>
> >> Thanks for the clarification. There are few special properties in our
> >> workflows. All the workflows are single branch DAGs so there will be
> only
> >> one job running at a time. By looking at the log, I could see that only
> >> the
> >> task with this error has been failed. Cleanup agent deleted this
> workflow
> >> after this task is failed so it is clear that no other task is
> triggering
> >> this issue (I checked the timestamp).
> >>
> >> However for the instance, I disabled the cleanup agent for a while.
> Reason
> >> for adding this agent is because Helix became slow to schedule pending
> >> jobs
> >> when the load is high and participant was waiting without running
> anything
> >> for few minutes. We discussed this on thread "Sporadic delays in task
> >> execution". Before implementing this agent, I noticed that, there were
> >> lots
> >> of uncleared znodes related to Completed and Failed workflows and I
> though
> >> that was the reason to slow down controller / participant. After
> >> implementing this agent, things went smoothly until this point.
> >>
> >> Now I understand that you have your own workflow cleanup logic
> implemented
> >> in Helix but we might need to tune it to our case. Can you point me into
> >> code / documentation where I can have an idea about that?
> >>
> >> And this for my understanding, let's say that for some reason Helix
> failed
> >> to clean up completed workflows and related resources in zk. Will that
> >> affect to the performance of controller / participant? My understanding
> >> was
> >> that Helix was registering zk watchers for all the paths irrespective of
> >> the status of the workflow / job / task.

Re: Sporadic issue in putting a variable in workflow scope

2018-11-13 Thread Xue Junkai
Yes. You are right.

On Tue, Nov 13, 2018 at 7:32 AM DImuthu Upeksha 
wrote:

> Hi Junkai,
>
> Thanks a lot. I'll try with expiry time then. Is this[1] the place where
> Helix has implemented this logic? If that so, default expiry time should be
> 24 hours. Am I right?
>
> [1]
>
> https://github.com/apache/helix/blob/master/helix-core/src/main/java/org/apache/helix/task/TaskUtil.java#L711
>
> Thanks
> Dimuthu
>
> On Mon, Nov 12, 2018 at 10:17 PM Xue Junkai  wrote:
>
> > 1 and 2 are correct. 3 is wrong. The expiry time start counting only when
> > the workflow is completed. If it is not scheduled ( dont have enought
> > resource) or still running, Helix never deletes it.
> >
> >
> >
> > On Sun, Nov 11, 2018 at 8:01 PM DImuthu Upeksha <
> > dimuthu.upeks...@gmail.com> wrote:
> >
> >> Hi Junkai,
> >>
> >> Thanks for the clarification. That helped a lot.
> >>
> >> I our case, each of the task of the workflow are depending on the
> previous
> >> task. So there is no parallel execution. And we are not using Job
> Queues.
> >>
> >> Regarding the expiry time, what are the rules that you are imposing on
> >> that? For example let's say I setup an expiry time to 2 hours, I assume
> >> following situations are covered in Helix,
> >>
> >> 1. Even though the workflow is completed before 2 hours, resources
> related
> >> to that workflow will not be cleared until 2 hours are elapsed and
> exactly
> >> after 2 hours, all the resources will be cleared by the framework.
> >> 2. If the workflow failed, resources will not be cleared even after 2
> >> hours
> >> 3. If the workflow wasn't scheduled within 2 hours in a participant, it
> >> will be deleted
> >>
> >> Is my understanding correct?
> >>
> >> Thanks
> >> Dimuthu
> >>
> >>
> >> On Sat, Nov 10, 2018 at 4:26 PM Xue Junkai 
> wrote:
> >>
> >> > Hi Dimuthu,
> >> >
> >> > Couple things here:
> >> > 1. Only JobQueue in Helix is single branch DAG and 1 job running at a
> >> time
> >> > with defining parallel job number to be 1. Otherwise, you may see many
> >> jobs
> >> > running at same time as you set parallel job number to be a different
> >> > number. For generic workflow, all jobs without dependencies could be
> >> > dispatched together.
> >> > 2. Helix only cleans up the completed generic workflows by deleting
> all
> >> > the related znode, not for JobQueue. For JobQueue you have to set up
> >> > periodical purge time. As Helix defined, JobQueue never finishes and
> >> only
> >> > can be terminated by manual kill and it can keep accepting dynamic
> jobs.
> >> > Thus you have to understand your workflow is generic workflow or
> >> JobQueue.
> >> > For failed generic workflow, even if you setup the expiry time, Helix
> >> will
> >> > not clean it up as Helix would like to keep it for user further
> >> > investigation.
> >> > 3. For Helix controller, if Helix failed to clean up workflows, the
> only
> >> > thing you can see is the having workflows with context but no resource
> >> > config and idealstate there. This is because of ZK write fail to clean
> >> last
> >> > piece, context node. And there is no ideal state can trigger clean up
> >> again
> >> > for this workflow.
> >> >
> >> > Please take a look for this task framework tutorial for detailed
> >> > configurations:
> >> > https://helix.apache.org/0.8.2-docs/tutorial_task_framework.html
> >> >
> >> > Best,
> >> >
> >> > Junkai
> >> >
> >> > On Sat, Nov 10, 2018 at 8:29 AM DImuthu Upeksha <
> >> > dimuthu.upeks...@gmail.com> wrote:
> >> >
> >> >> Hi Junkai,
> >> >>
> >> >> Thanks for the clarification. There are few special properties in our
> >> >> workflows. All the workflows are single branch DAGs so there will be
> >> only
> >> >> one job running at a time. By looking at the log, I could see that
> only
> >> >> the
> >> >> task with this error has been failed. Cleanup agent deleted this
> >> workflow
> >> >> after this task is failed so it is clear that no other task is
> >> triggering
> >> >> this issue (I checked the timestamp).

Re: Sporadic issue in putting a variable in workflow scope

2018-11-09 Thread Xue Junkai
The user content store node will be created once the job has been scheduled.
In your case, I think the job is not scheduled. This method is usually used
inside a running task.

Best,

Junkai

On Fri, Nov 9, 2018 at 8:19 AM DImuthu Upeksha 
wrote:

> Hi Helix Folks,
>
> I'm having this sporadic issue in some tasks of our workflows when we try
> to store a value in the workflow context and I have added both code section
> and error message below. Do you have an idea what's causing this? Please
> let me know if you need further information. We are using Helix 0.8.2
>
> public void sendNextJob(String jobId) {
> putUserContent(WORKFLOW_STARTED, "TRUE", Scope.WORKFLOW);
> if (jobId != null) {
> putUserContent(NEXT_JOB, jobId, Scope.WORKFLOW);
> }
> }
>
> Failed to setup environment of task
> TASK_55096de4-2cb6-4b09-84fd-7fdddba93435
> java.lang.NullPointerException: null
> at org.apache.helix.task.TaskUtil$1.update(TaskUtil.java:358)
> at org.apache.helix.task.TaskUtil$1.update(TaskUtil.java:356)
> at
>
> org.apache.helix.manager.zk.HelixGroupCommit.commit(HelixGroupCommit.java:126)
> at
>
> org.apache.helix.manager.zk.ZkCacheBaseDataAccessor.update(ZkCacheBaseDataAccessor.java:306)
> at
>
> org.apache.helix.store.zk.AutoFallbackPropertyStore.update(AutoFallbackPropertyStore.java:61)
> at
> org.apache.helix.task.TaskUtil.addWorkflowJobUserContent(TaskUtil.java:356)
> at
>
> org.apache.helix.task.UserContentStore.putUserContent(UserContentStore.java:78)
> at
>
> org.apache.airavata.helix.core.AbstractTask.sendNextJob(AbstractTask.java:136)
> at org.apache.airavata.helix.core.OutPort.invoke(OutPort.java:42)
> at
>
> org.apache.airavata.helix.core.AbstractTask.onSuccess(AbstractTask.java:123)
> at
>
> org.apache.airavata.helix.impl.task.AiravataTask.onSuccess(AiravataTask.java:97)
> at
>
> org.apache.airavata.helix.impl.task.env.EnvSetupTask.onRun(EnvSetupTask.java:52)
> at
>
> org.apache.airavata.helix.impl.task.AiravataTask.onRun(AiravataTask.java:349)
> at
> org.apache.airavata.helix.core.AbstractTask.run(AbstractTask.java:92)
> at org.apache.helix.task.TaskRunner.run(TaskRunner.java:71)
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at
>
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> at
>
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> at
>
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
>
> Thanks
> Dimuthu
>


-- 
Junkai Xue


Re: Sporadic delays in task execution

2018-10-02 Thread Xue Junkai
Could you please check in the log how long each pipeline stage takes?

Also, did you set an expiry for workflows? Have they piled up for a long
time? How long does each workflow take to complete?

Best,

Junkai

On Wed, Sep 26, 2018 at 8:52 AM DImuthu Upeksha 
wrote:

> Hi Junkai,
>
> Average load is like 10 - 20 workflows per minutes. In some cases it's less
> than that However based on the observations, I feel like it does not depend
> on the load and it is sporadic. Is there a particular log lines that I can
> filter in controller and participant to capture the timeline of workflow so
> that I can figure out which which component is malfunctioning? We use helix
> v 0.8.1.
>
> Thanks
> Dimuthu
>
> On Tue, Sep 25, 2018 at 5:19 PM Xue Junkai  wrote:
>
> > Hi Dimuthu,
> >
> > At which rate, you are keep submitting workflows? Usually, Workflow
> > scheduling is very fast. And which version of Helix you are using?
> >
> > Best,
> >
> > Junkai
> >
> > On Tue, Sep 25, 2018 at 8:58 AM DImuthu Upeksha <
> > dimuthu.upeks...@gmail.com>
> > wrote:
> >
> > > Hi Folks,
> > >
> > > We have noticed some delays between workflow submission and actual
> > picking
> > > up by participants and seems like that delay is somewhat constant
> around
> > 2-
> > > 3 minutes. We used to continuously submit workflows and after 2 -3
> > minutes,
> > > a bulk of workflows are picked by participant and execute them. Then it
> > > remain silent for next 2 -3 minutes event we submit more workflows.
> It's
> > > like participant picking up workflows in discrete time intervals. I'm
> not
> > > sure whether this is an issue of controller or the participant. Do you
> > have
> > > any experience with this sort of behavior?
> > >
> > > Thanks
> > > Dimuthu
> > >
> >
> >
> > --
> > Junkai Xue
> >
>


-- 
Junkai Xue


Re: Sporadic delays in task execution

2018-09-25 Thread Xue Junkai
Hi Dimuthu,

At what rate do you keep submitting workflows? Usually, workflow scheduling
is very fast. And which version of Helix are you using?
Best,

Junkai

On Tue, Sep 25, 2018 at 8:58 AM DImuthu Upeksha 
wrote:

> Hi Folks,
>
> We have noticed some delays between workflow submission and actual picking
> up by participants and seems like that delay is somewhat constant around 2-
> 3 minutes. We used to continuously submit workflows and after 2 -3 minutes,
> a bulk of workflows are picked by participant and execute them. Then it
> remain silent for next 2 -3 minutes event we submit more workflows. It's
> like participant picking up workflows in discrete time intervals. I'm not
> sure whether this is an issue of controller or the participant. Do you have
> any experience with this sort of behavior?
>
> Thanks
> Dimuthu
>


-- 
Junkai Xue


[VOTE] Apache Helix 0.8.3 Release

2018-11-26 Thread Xue Junkai
Hi,


This is to call for a vote on releasing the following candidate as Apache
Helix 0.8.3. This is the 15th release of Helix as an Apache project, as
well as the 11th release as a top-level Apache project.


Apache Helix is a generic cluster management framework that makes it easy
to build partitioned and replicated, fault-tolerant and scalable
distributed systems.


Release notes:

https://helix.apache.org/0.8.3-docs/releasenotes/release-0.8.3.html


Release artifacts:

https://repository.apache.org/content/repositories/orgapachehelix-1021


Distribution:

* binaries:

https://dist.apache.org/repos/dist/dev/helix/0.8.3/binaries/

* sources:

https://dist.apache.org/repos/dist/dev/helix/0.8.3/src/


The 0.8.3 release tag:

https://git-wip-us.apache.org/repos/asf?p=helix.git;a=tag;h=refs/tags/helix-0.8.3


KEYS file available here:

https://dist.apache.org/repos/dist/dev/helix/KEYS


Please vote on the release. The vote will be open for at least 72 hours.


[+1] -- "YES, release"

[0] -- "No opinion"

[-1] -- "NO, do not release"


Thanks,

The Apache Helix Team


Re: Cloning a Helix Workflow

2018-09-13 Thread Xue Junkai
Thanks for sharing the information. I think it is a good feature for the task
framework. There are a few things I noticed from a rough scan of the code:
1. Helix is built with Java 1.7, so lambda expressions are not supported here
(see the Java 7 sketch below).
2. The context should not be related to any workflow cloning.
3. The JobDag is already cloned as part of the WorkflowConfig.

BTW, we also need to think about other things that may need to be cloned
which are not included in the WorkflowConfig.
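
On point 1, the stream/lambda loops would need a Java 7 rewrite, for example
(a sketch against the variable names in your snippet below):

    // Java 7 replacement for: allNodes.stream().forEach(job -> { ... });
    for (String job : allNodes) {
        jobConfigMap.put(job, taskDriver.getJobConfig(job));
        jobContextMap.put(job, taskDriver.getJobContext(job));
    }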

Best,

Junkai

On Thu, Sep 13, 2018 at 12:49 PM DImuthu Upeksha 
wrote:

> Hi Xue,
>
> To give you a context, we are using Helix as the backend task execution
> engine of Apache Airavata project.
> It is sort of a complex process to recreate the same workflow with same
> configurations as the workflow generation is done by passing through
> different parsers.  As a workaround I came up with following code to
> replicate the same workflow with different name. Do you think that this
> feature is a good have feature in original task API? If so we can try to
> add it and send a PR.
>
> public void cloneWorkflow(String workflowName) {
> WorkflowContext workflowContext =
> taskDriver.getWorkflowContext(workflowName);
> WorkflowConfig workflowConfig =
> taskDriver.getWorkflowConfig(workflowName);
> JobDag jobDag = workflowConfig.getJobDag();
>
> Set<String> allNodes = jobDag.getAllNodes();
> Map<String, JobContext> jobContextMap = new HashMap<>();
> Map<String, JobConfig> jobConfigMap = new HashMap<>();
>
> allNodes.stream().forEach(job -> {
> jobConfigMap.put(job, taskDriver.getJobConfig(job));
> jobContextMap.put(job, taskDriver.getJobContext(job));
> });
>
> Workflow.Builder workflowBuilder = new
> Workflow.Builder(workflowName + "_CLONE").setExpiry(0);
>
> allNodes.forEach(job -> {
>
> List<TaskConfig> taskBuilds = new ArrayList<>();
>
> Map<String, TaskConfig> taskConfigMap =
> jobConfigMap.get(job).getTaskConfigMap();
> taskConfigMap.forEach((id, config) -> {
> TaskConfig.Builder taskBuilder = new
> TaskConfig.Builder().setTaskId(id +
> "_CLONE").setCommand(config.getCommand());
> Map<String, String> originalMap = config.getConfigMap();
> originalMap.forEach((key, value) ->
> taskBuilder.addConfig(key, value));
> taskBuilds.add(taskBuilder.build());
> });
>
> JobConfig.Builder jobBuilder = new JobConfig.Builder()
> .addTaskConfigs(taskBuilds)
>
> .setFailureThreshold(jobConfigMap.get(job).getFailureThreshold())
>
> .setMaxAttemptsPerTask(jobConfigMap.get(job).getMaxAttemptsPerTask());
>
> workflowBuilder.addJob(job, jobBuilder);
> });
>
> Map<String, Set<String>> parentsToChildren =
> jobDag.getParentsToChildren();
>
> parentsToChildren.forEach((parent, children) -> {
> children.forEach(child ->
> workflowBuilder.addParentChildDependency(parent, child));
> });
>
> WorkflowConfig.Builder config = new
> WorkflowConfig.Builder().setFailureThreshold(0);
> workflowBuilder.setWorkflowConfig(config.build());
> Workflow workflow = workflowBuilder.build();
>
> taskDriver.start(workflow);
> }
>
> Thanks
>
> Dimuthu
>
>
> On Thu, Sep 13, 2018 at 2:57 PM Xue Junkai  wrote:
>
> > Hi Dlmuthu,
> >
> > Currently, Helix does not support rerun workflow feature. If you would
> like
> > to reexecute the workflow, please submit a new one.
> >
> > Or for your scenario, was the workflow caused by job failing? If yes, you
> > can increase the number of failed threshold for job level and task level,
> > which can keep the tasks retrying and not failing the workflow.
> >
> > Hope this answer your question.
> >
> > Best,
> >
> > Junkai
> >
> > On Wed, Sep 12, 2018 at 11:49 AM kishore g  wrote:
> >
> > > Lei, do you know if there is a way to restart the workflow?
> > >
> > > On Wed, Sep 12, 2018 at 10:07 AM DImuthu Upeksha <
> > > dimuthu.upeks...@gmail.com>
> > > wrote:
> > >
> > > > Any update on this ?
> > > >
> > > > On Wed, Apr 4, 2018 at 9:10 AM DImuthu Upeksha <
> > > dimuthu.upeks...@gmail.com
> > > > >
> > > > wrote:
> > > >
> > > > > Hi Folks,
> > > > >
> > > > > I'm running 50 -100 Helix Task Workflows at a time and due to some
> > > > > unexpected issues, some workflows go into the failed state. Is
> there
> > a
> > > > way
> > > > > I can retry those workflows from the beginning or clone new
> workflows
> > > > from
> > > > > them and run as fresh workflows?
> > > > >
> > > > > Thanks
> > > > > Dimuthu
> > > > >
> > > >
> > >
> >
> >
> > --
> > Junkai Xue
> >
>


-- 
Junkai Xue


Re: Cloning a Helix Workflow

2018-09-13 Thread Xue Junkai
Hi Dimuthu,

Currently, Helix does not support a workflow rerun feature. If you would like
to re-execute the workflow, please submit a new one.

Or, for your scenario, was the workflow failure caused by a failing job? If
yes, you can increase the failure threshold at the job level and the retry
count at the task level, which keeps the tasks retrying instead of failing
the workflow.
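
For example (a sketch; the thresholds are placeholders to tune for your
workload, and taskConfigs stands in for your existing task config list):

    JobConfig.Builder jobBuilder = new JobConfig.Builder()
        .addTaskConfigs(taskConfigs)
        .setMaxAttemptsPerTask(10) // retry each task up to 10 times
        .setFailureThreshold(5);   // tolerate up to 5 failed tasks before failing the job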

Hope this answers your question.

Best,

Junkai

On Wed, Sep 12, 2018 at 11:49 AM kishore g  wrote:

> Lei, do you know if there is a way to restart the workflow?
>
> On Wed, Sep 12, 2018 at 10:07 AM DImuthu Upeksha <
> dimuthu.upeks...@gmail.com>
> wrote:
>
> > Any update on this ?
> >
> > On Wed, Apr 4, 2018 at 9:10 AM DImuthu Upeksha <
> dimuthu.upeks...@gmail.com
> > >
> > wrote:
> >
> > > Hi Folks,
> > >
> > > I'm running 50 -100 Helix Task Workflows at a time and due to some
> > > unexpected issues, some workflows go into the failed state. Is there a
> > way
> > > I can retry those workflows from the beginning or clone new workflows
> > from
> > > them and run as fresh workflows?
> > >
> > > Thanks
> > > Dimuthu
> > >
> >
>


-- 
Junkai Xue


Re: Sporadic delays in task execution

2019-03-20 Thread Xue Junkai
Hi Dimuthu,

What's the version of Helix you are using?

Best,

Junkai

On Wed, Mar 20, 2019 at 8:54 PM DImuthu Upeksha 
wrote:

> Hi Helix Dev,
>
> We are again seeing this delay in task execution. Please have a look at the
> screencast [1] of logs printed in participant (top shell) and controller
> (bottom shell). When I record this, there were about 90 - 100 workflows
> pending to be executed. As you can see some tasks were suddenly executed
> and then participant freezed for about 30 seconds before executing next set
> of tasks. I can see some WARN logs on controller log. I feel like this 30
> second delay is some sort of a pattern. What do you think as the reason for
> this? I can provide you more information by turning on verbose logs on
> controller if you want.
>
> [1] https://youtu.be/3EUdSxnIxVw
>
> Thanks
> Dimuthu
>
> On Thu, Oct 4, 2018 at 4:46 PM DImuthu Upeksha  >
> wrote:
>
> > Hi Junkai,
> >
> > I'm CCing Airavata dev list as this is directly related to the project.
> >
> > I just went through the zookeeper paths like /<Cluster Name>/EXTERNALVIEW,
> > /<Cluster Name>/CONFIGS/RESOURCE as I have noticed that helix controller
> is
> > periodically monitoring for the children of those paths even though all
> the
> > Workflows have moved into a saturated state like COMPLETED and STOPPED.
> In
> > our case, we have a lot of completed workflows piled up in those paths. I
> > believe that helix is clearing up those resources after some TTL. What I
> > did was writing an external spectator [1] that continuously monitors for
> > saturated workflows and clearing up resources before controller does that
> > after a TTL. After that, we didn't see such delays in workflow execution
> > and everything seems to be running smoothly. However we are continuously
> > monitoring our deployments for any form of adverse effect introduced by
> > that improvement.
> >
> > Please let us know if we are doing something wrong in this improvement or
> > is there any better way to achieve this directly through helix task
> > framework.
> >
> > [1]
> >
> https://github.com/apache/airavata/blob/staging/modules/airavata-helix/helix-spectator/src/main/java/org/apache/airavata/helix/impl/controller/WorkflowCleanupAgent.java
> >
> > Thanks
> > Dimuthu
> >
> > On Tue, Oct 2, 2018 at 1:12 PM Xue Junkai  wrote:
> >
> >> Could you please check the log of how long for each pipeline stage
> takes?
> >>
> >> Also, did you set expiry for workflows? Are they piled up for long time?
> >> How long for each workflow completes?
> >>
> >> best,
> >>
> >> Junkai
> >>
> >> On Wed, Sep 26, 2018 at 8:52 AM DImuthu Upeksha <
> >> dimuthu.upeks...@gmail.com>
> >> wrote:
> >>
> >> > Hi Junkai,
> >> >
> >> > Average load is like 10 - 20 workflows per minutes. In some cases it's
> >> less
> >> > than that However based on the observations, I feel like it does not
> >> depend
> >> > on the load and it is sporadic. Is there a particular log lines that I
> >> can
> >> > filter in controller and participant to capture the timeline of
> >> workflow so
> >> > that I can figure out which which component is malfunctioning? We use
> >> helix
> >> > v 0.8.1.
> >> >
> >> > Thanks
> >> > Dimuthu
> >> >
> >> > On Tue, Sep 25, 2018 at 5:19 PM Xue Junkai 
> >> wrote:
> >> >
> >> > > Hi Dimuthu,
> >> > >
> >> > > At which rate, you are keep submitting workflows? Usually, Workflow
> >> > > scheduling is very fast. And which version of Helix you are using?
> >> > >
> >> > > Best,
> >> > >
> >> > > Junkai
> >> > >
> >> > > On Tue, Sep 25, 2018 at 8:58 AM DImuthu Upeksha <
> >> > > dimuthu.upeks...@gmail.com>
> >> > > wrote:
> >> > >
> >> > > > Hi Folks,
> >> > > >
> >> > > > We have noticed some delays between workflow submission and actual
> >> > > picking
> >> > > > up by participants and seems like that delay is somewhat constant
> >> > around
> >> > > 2-
> >> > > > 3 minutes. We used to continuously submit workflows and after 2 -3
> >> > > minutes,
> >> > > > a bulk of workflows are picked by participant and execute them.
> >> Then it
> >> > > > remain silent for next 2 -3 minutes event we submit more
> workflows.
> >> > It's
> >> > > > like participant picking up workflows in discrete time intervals.
> >> I'm
> >> > not
> >> > > > sure whether this is an issue of controller or the participant. Do
> >> you
> >> > > have
> >> > > > any experience with this sort of behavior?
> >> > > >
> >> > > > Thanks
> >> > > > Dimuthu
> >> > > >
> >> > >
> >> > >
> >> > > --
> >> > > Junkai Xue
> >> > >
> >> >
> >>
> >>
> >> --
> >> Junkai Xue
> >>
> >
>


-- 
Junkai Xue


[ANNOUNCE] New Committer: Hunter Lee

2019-03-28 Thread Xue Junkai
Hi, All


  The Project Management Committee (PMC) for Apache Helix has asked Hunter
Lee to become a committer and we are pleased to announce that he has
accepted.


  Being a committer enables easier contribution to the project since there
is no need to go via the patch submission process. This should enable
better productivity.


  Welcome Hunter!


Helix Team


[VOTE] Apache Helix 0.8.4 Release

2019-02-27 Thread Xue Junkai
Hi,


This is to call for a vote on releasing the following candidate as Apache
Helix 0.8.4. This is the 16th release of Helix as an Apache project, as
well as the 12th release as a top-level Apache project.


Apache Helix is a generic cluster management framework that makes it easy
to build partitioned and replicated, fault-tolerant and scalable
distributed systems.


Release notes:

https://helix.apache.org/0.8.4-docs/releasenotes/release-0.8.4.html


Release artifacts:

https://repository.apache.org/content/repositories/orgapachehelix-1026


Distribution:

* binaries:

https://dist.apache.org/repos/dist/dev/helix/0.8.4/binaries/

* sources:

https://dist.apache.org/repos/dist/dev/helix/0.8.4/src/


The 0.8.4 release tag:

https://git-wip-us.apache.org/repos/asf?p=helix.git;a=tag;h=refs/tags/helix-0.8.4


KEYS file available here:

https://dist.apache.org/repos/dist/dev/helix/KEYS


Please vote on the release. The vote will be open for at least 72 hours.


[+1] -- "YES, release"

[0] -- "No opinion"

[-1] -- "NO, do not release"


Thanks,

The Apache Helix Team


[RESULT][VOTE] Apache Helix 0.8.4 Release

2019-03-06 Thread Xue Junkai
Thanks for voting on the 0.8.4 release. It has now exceeded 72 hours so I
am closing the vote.


Binding +1s:

Kishore Gopalakrishna, Lei Xia, Junkai Xue


Nonbinding +1s:

Jiajun Wang, Hunter Lee


Binding 0s:


Nonbinding 0s:


Binding -1s:


Nonbinding -1s:


The vote has passed, thanks a lot to everyone for voting!


[ANNOUNCE] Apache Helix 0.8.4 Release

2019-03-06 Thread Xue Junkai
The Apache Helix Team is pleased to announce the 16th release, 0.8.4, of
the Apache Helix project.


Apache Helix is a generic cluster management framework that makes it easy
to build partitioned, fault tolerant, and scalable distributed systems.


The full release notes are available here:

https://helix.apache.org/0.8.4-docs/releasenotes/release-0.8.4.html


You can declare a maven dependency to use it:




<dependency>
  <groupId>org.apache.helix</groupId>
  <artifactId>helix-core</artifactId>
  <version>0.8.4</version>
</dependency>




Or download the release sources:

https://helix.apache.org/0.8.4-docs/download.cgi


Additional info


Website: http://helix.apache.org/

Helix mailing lists: http://helix.apache.org/mail-lists.html


We hope you will enjoy using the latest release of Apache Helix!


Cheers,

Apache Helix Team


Re: Sporadic delays in task execution

2019-03-21 Thread Xue Junkai
Can you try one thing? Touch the ideal state to trigger an event. If the
workflows are still not scheduled after that, then scheduling itself has a
problem.
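
Something like this should do it (a sketch assuming a connected HelixManager
and Helix 0.8.x APIs; the resource name is a placeholder):

    // Rewrite the workflow's IdealState unchanged; the resulting ZK write
    // fires a callback that pushes a rebalance event through the controller
    // pipeline.
    HelixDataAccessor accessor = manager.getHelixDataAccessor();
    PropertyKey key = accessor.keyBuilder().idealStates("myWorkflowResource");
    IdealState idealState = accessor.getProperty(key);
    if (idealState != null) {
        accessor.updateProperty(key, idealState);
    }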

Best,

Junkai

On Wed, Mar 20, 2019 at 10:31 PM DImuthu Upeksha 
wrote:

> Hi Junkai,
>
> We are using 0.8.1
>
> Dimuthu
>
> On Thu, Mar 21, 2019 at 12:14 AM Xue Junkai  wrote:
>
> > Hi Dimuthu,
> >
> > What's the version of Helix you are using?
> >
> > Best,
> >
> > Junkai
> >
> > On Wed, Mar 20, 2019 at 8:54 PM DImuthu Upeksha <
> > dimuthu.upeks...@gmail.com>
> > wrote:
> >
> > > Hi Helix Dev,
> > >
> > > We are again seeing this delay in task execution. Please have a look at
> > the
> > > screencast [1] of logs printed in participant (top shell) and
> controller
> > > (bottom shell). When I record this, there were about 90 - 100 workflows
> > > pending to be executed. As you can see some tasks were suddenly
> executed
> > > and then participant freezed for about 30 seconds before executing next
> > set
> > > of tasks. I can see some WARN logs on controller log. I feel like this
> 30
> > > second delay is some sort of a pattern. What do you think as the reason
> > for
> > > this? I can provide you more information by turning on verbose logs on
> > > controller if you want.
> > >
> > > [1] https://youtu.be/3EUdSxnIxVw
> > >
> > > Thanks
> > > Dimuthu
> > >
> > > On Thu, Oct 4, 2018 at 4:46 PM DImuthu Upeksha <
> > dimuthu.upeks...@gmail.com
> > > >
> > > wrote:
> > >
> > > > Hi Junkai,
> > > >
> > > > I'm CCing Airavata dev list as this is directly related to the
> project.
> > > >
> > > > I just went through the zookeeper path like /<Cluster Name>/EXTERNALVIEW,
> > > > /<Cluster Name>/CONFIGS/RESOURCE as I have noticed that helix
> > controller
> > > is
> > > > periodically monitoring for the children of those paths even though
> all
> > > the
> > > > Workflows have moved into a saturated state like COMPLETED and
> STOPPED.
> > > In
> > > > our case, we have a lot of completed workflows piled up in those
> > paths. I
> > > > believe that helix is clearing up those resources after some TTL.
> What
> > I
> > > > did was writing an external spectator [1] that continuously monitors
> > for
> > > > saturated workflows and clearing up resources before controller does
> > that
> > > > after a TTL. After that, we didn't see such delays in workflow
> > execution
> > > > and everything seems to be running smoothly. However we are
> > continuously
> > > > monitoring our deployments for any form of adverse effect introduced
> by
> > > > that improvement.
> > > >
> > > > Please let us know if we are doing something wrong in this
> improvement
> > or
> > > > is there any better way to achieve this directly through helix task
> > > > framework.
> > > >
> > > > [1]
> > > >
> > >
> >
> https://github.com/apache/airavata/blob/staging/modules/airavata-helix/helix-spectator/src/main/java/org/apache/airavata/helix/impl/controller/WorkflowCleanupAgent.java
> > > >
> > > > Thanks
> > > > Dimuthu
> > > >
> > > > On Tue, Oct 2, 2018 at 1:12 PM Xue Junkai 
> > wrote:
> > > >
> > > >> Could you please check the log of how long for each pipeline stage
> > > takes?
> > > >>
> > > >> Also, did you set expiry for workflows? Are they piled up for long
> > time?
> > > >> How long for each workflow completes?
> > > >>
> > > >> best,
> > > >>
> > > >> Junkai
> > > >>
> > > >> On Wed, Sep 26, 2018 at 8:52 AM DImuthu Upeksha <
> > > >> dimuthu.upeks...@gmail.com>
> > > >> wrote:
> > > >>
> > > >> > Hi Junkai,
> > > >> >
> > > >> > Average load is like 10 - 20 workflows per minutes. In some cases
> > it's
> > > >> less
> > > >> > than that However based on the observations, I feel like it does
> not
> > > >> depend
> > > >> > on the load and it is sporadic. Is there a particular log lines
> > that I
> > > >> can
> > > >> > filter in controller and participant

[ANNOUNCE] Apache Helix 0.8.3 Release

2019-02-12 Thread Xue Junkai
The Apache Helix Team is pleased to announce the 15th release, 0.8.3, of
the Apache Helix project.


Apache Helix is a generic cluster management framework that makes it easy
to build partitioned, fault tolerant, and scalable distributed systems.


The full release notes are available here:

https://helix.apache.org/0.8.3-docs/releasenotes/release-0.8.3.html


You can declare a maven dependency to use it:




<dependency>
  <groupId>org.apache.helix</groupId>
  <artifactId>helix-core</artifactId>
  <version>0.8.3</version>
</dependency>




Or download the release sources:

http://helix.apache.org/0.8.3-docs/download.cgi


Additional info


Website: http://helix.apache.org/

Helix mailing lists: http://helix.apache.org/mail-lists.html


We hope you will enjoy using the latest release of Apache Helix!


Cheers,

Apache Helix Team


[RESULT][VOTE] Apache Helix 0.8.3 Release

2019-02-11 Thread Xue Junkai
Thanks for voting on the 0.8.3 release. It has now exceeded 72 hours so I
am closing the vote.


Binding +1s:

Kishore Gopalakrishna, Olivier Lamy, Lei Xia, Junkai Xue


Nonbinding +1s:

Jiajun Wang, Harry Zhang


Binding 0s:


Nonbinding 0s:


Binding -1s:


Nonbinding -1s:


The vote has passed, thanks a lot to everyone for voting!


Re: [NOTICE] Mandatory migration of git repos to gitbox.apache.org - one week left!

2019-01-30 Thread Xue Junkai
Thanks for the reminder. We agree to move to gitbox.

Best,

Junkai

On Wed, Jan 30, 2019 at 12:10 AM Apache Infrastructure Team <
infrastruct...@apache.org> wrote:

> Hello again, helix folks.
> This is a reminder that you have *one week left* before the mandatory
> mass-migration from git-wip-us to gitbox.
>
> As stated earlier in 2018, and reiterated a few times, all git
> repositories must be migrated from the git-wip-us.apache.org URL to
> gitbox.apache.org, as the old service is being decommissioned. Your
> project is receiving this email because you still have repositories on
> git-wip-us that need to be migrated.
>
> The following repositories on git-wip-us belong to your project:
>  - helix.git
>
>
> We are now entering the remaining one week of the mandated
> (coordinated) move stage of the roadmap, and you are asked to please
> coordinate migration with the Apache Infrastructure Team before February
> 7th. All repositories not migrated on February 7th will be mass migrated
> without warning, and we'd appreciate it if we could work together to
> avoid a big mess that day :-).
>
> As stated earlier, moving to gitbox means you will get full write access
> on GitHub as well, and be able to close/merge pull requests and much
> more. The move is mandatory for all Apache projects using git.
>
> To have your repositories moved, please follow these steps:
>
> - Ensure consensus on the move (a link to a lists.apache.org thread will
>   suffice for us as evidence).
> - Create a JIRA ticket at https://issues.apache.org/jira/browse/INFRA
>
> Your migration should only take a few minutes. If you wish to migrate
> at a specific time of day or date, please do let us know in the ticket,
> otherwise we will migrate at the earliest convenient time.
>
> There will be redirects in place from git-wip to gitbox, so requests
> using the old remote origins should still work (however we encourage
> people to update their remotes once migration has completed).
>
> As always, we appreciate your understanding and patience as we move
> things around and work to provide better services and features for
> the Apache Family.
>
> Should you wish to contact us with feedback or questions, please do so
> at: us...@infra.apache.org.
>
>
> With regards,
> Apache Infrastructure
>
>


Re: [NOTICE] Mandatory migration of git repos to gitbox.apache.org - one week left!

2019-01-30 Thread Xue Junkai
Thanks Tommaso! Shall we have a separate vote and link it? Or is this
email thread enough?

Best,

Junkai

On Wed, Jan 30, 2019 at 12:30 AM Tommaso Teofili 
wrote:

> I think we should cast a vote, shouldn't we?
>
> Regards,
> Tommaso
>
> On Wed, Jan 30, 2019 at 9:27 AM Xue Junkai wrote:
> >
> > Thanks for the reminder. We agree to move to gitbox.
> >
> > Best,
> >
> > Junkai
> >
> > On Wed, Jan 30, 2019 at 12:10 AM Apache Infrastructure Team <
> infrastruct...@apache.org> wrote:
> >>
> >> Hello again, helix folks.
> >> This is a reminder that you have *one week left* before the mandatory
> >> mass-migration from git-wip-us to gitbox.
> >>
> >> [rest of the infra notice, quoted in full in the previous message,
> >> trimmed]
> >>
> >> With regards,
> >> Apache Infrastructure
> >>
>


[VOTE] Apache Helix 0.8.3 Release

2019-02-04 Thread Xue Junkai
Hi,


This is to call for a vote on releasing the following candidate as Apache
Helix 0.8.3. This is the 15th release of Helix as an Apache project, as
well as the 11th release as a top-level Apache project.


Apache Helix is a generic cluster management framework that makes it easy
to build partitioned and replicated, fault-tolerant and scalable
distributed systems.


Release notes:

https://helix.apache.org/0.8.3-docs/releasenotes/release-0.8.3.html


Release artifacts:

https://repository.apache.org/content/repositories/orgapachehelix-1022


Distribution:

* binaries:

https://dist.apache.org/repos/dist/dev/helix/0.8.3/binaries/

* sources:

https://dist.apache.org/repos/dist/dev/helix/0.8.3/src/


The 0.8.3 release tag:

https://git-wip-us.apache.org/repos/asf?p=helix.git;a=tag;h=refs/tags/helix-0.8.3


KEYS file available here:

https://dist.apache.org/repos/dist/dev/helix/KEYS


Please vote on the release. The vote will be open for at least 72 hours.


[+1] -- "YES, release"

[0] -- "No opinion"

[-1] -- "NO, do not release"


Thanks,

The Apache Helix Team


Re: [NOTICE] Mandatory migration of git repos to gitbox.apache.org - one week left!

2019-02-04 Thread Xue Junkai
+1

On Mon, Feb 4, 2019 at 2:20 PM Lei Xia  wrote:

> +1
>
> On Wed, Jan 30, 2019 at 11:28 AM Wang Jiajun 
> wrote:
>
> > +1
> >
> > Best Regards,
> > Jiajun
> >
> >
> > On Wed, Jan 30, 2019 at 12:34 AM Xue Junkai  wrote:
> >
> > > Thanks Tommaso! Shall we have a separate vote and link it? Or is this
> > > email thread enough?
> > >
> > > Best,
> > >
> > > Junkai
> > >
> > > On Wed, Jan 30, 2019 at 12:30 AM Tommaso Teofili <
> > > tommaso.teof...@gmail.com>
> > > wrote:
> > >
> > > > I think we should cast a vote, shouldn't we?
> > > >
> > > > Regards,
> > > > Tommaso
> > > >
> > > > On Wed, Jan 30, 2019 at 9:27 AM Xue Junkai wrote:
> > > > >
> > > > > Thanks for the reminder. We agree to move to gitbox.
> > > > >
> > > > > Best,
> > > > >
> > > > > Junkai
> > > > >
> > > > > On Wed, Jan 30, 2019 at 12:10 AM Apache Infrastructure Team <
> > > > infrastruct...@apache.org> wrote:
> > > > >>
> > > > >> Hello again, helix folks.
> > > > >> This is a reminder that you have *one week left* before the
> > > > >> mandatory mass-migration from git-wip-us to gitbox.
> > > > >>
> > > > >> [rest of the infra notice, quoted in full in the first message of
> > > > >> this thread, trimmed]
> > > > >>
> > > > >> With regards,
> > > > >> Apache Infrastructure
> > > > >>
> > > >
> > >
> >
> --
> Lei Xia
>


-- 
Junkai Xue


Re: For PMC - enabling GitHub issues and wiki

2019-05-26 Thread Xue Junkai
+1

On Sun, May 26, 2019 at 11:25 AM kishore g  wrote:

> I am in favor of enabling github issues and wiki
>
> On Sun, May 26, 2019 at 11:22 AM Hunter Lee  wrote:
>
> > Could a member of the PMC update the ticket for GitHub issues and wiki?
> > This was discussed informally offline, so please mention that we do not
> > have the record of it, but as long as the PMC could verify that we want
> > this for Helix, the infra team should be able to go ahead and do it for
> us.
> > https://issues.apache.org/jira/browse/INFRA-18471
> >
> > Thanks,
> > Hunter
> >
>


-- 
Junkai Xue


Re: [VOTE] Apache Helix 0.9.0 Release

2019-06-12 Thread Xue Junkai
+1

On Wed, Jun 12, 2019 at 11:15 AM kishore g  wrote:

> +1.
>
>
>
> On Wed, Jun 12, 2019 at 11:01 AM Lei Xia  wrote:
>
> > +1
> >
> > On Tue, Jun 11, 2019 at 3:07 PM Hunter Lee  wrote:
> >
> > > Hi,
> > >
> > > This is to call for a vote on releasing the following candidate as
> Apache
> > > Helix 0.9.0. This is the 17th release of Helix as an Apache project, as
> > > well as the 13th release as a top-level Apache project.
> > >
> > > Apache Helix is a generic cluster management framework that makes it
> easy
> > > to build partitioned and replicated, fault-tolerant and scalable
> > > distributed systems.
> > >
> > > Release notes:
> > > https://helix.apache.org/0.9.0-docs/releasenotes/release-0.9.0.html
> > >
> > > Release artifacts:
> > >
> https://repository.apache.org/content/repositories/orgapachehelix-1029/
> > >
> > > Distribution:
> > > * binaries:
> > > https://dist.apache.org/repos/dist/dev/helix/0.9.0/binaries/
> > > 
> > > * sources:
> > > https://dist.apache.org/repos/dist/dev/helix/0.9.0/src/
> > > 
> > >
> > > The 0.9.0 release tag:
> > >
> > > https://git-wip-us.apache.org/repos/asf?p=helix.git;a=tag;h=refs/tags/helix-0.9.0
> > >
> > > KEYS file available here:
> > > https://dist.apache.org/repos/dist/dev/helix/KEYS
> > >
> > > Please vote on the release. The vote will be open for at least 72
> hours.
> > >
> > > [+1] -- "YES, release"
> > > [0] -- "No opinion"
> > > [-1] -- "NO, do not release"
> > >
> > > Thanks,
> > > The Apache Helix Team
> > >
> >
>


-- 
Junkai Xue


Re: Multiple instance group tags for Job config

2019-06-24 Thread Xue Junkai
Hi Dimuthu,

That's a good feature to support in the future. We don't have a plan to
support it right now. Could you please create a Helix ticket for that?

Best,

Junkai

On Mon, Jun 24, 2019 at 7:59 AM DImuthu Upeksha 
wrote:

> Hi Folks,
>
> Currently we can set only one instance group tag for a job.
>
> jobCfg.setInstanceGroupTag("INSTANCEGROUPTAG");
>
> Do you have anything planned to support multiple instance group tags for
> one job so that the job can be run either in group A or group B? This is
> somewhat similar to Node Affinity [1] concept in Kubernetes.
>
> [1] https://kubernetes.io/docs/concepts/configuration/assign-pod-node/
>
> Thanks
> Dimuthu
>


-- 
Junkai Xue
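For reference, the single-tag targeting that exists today looks roughly
like this; a sketch assuming helix-core's JobConfig.Builder, with the
command and tag names as placeholders:

import org.apache.helix.task.JobConfig;

// Sketch: the job may only run on instances tagged GROUP_A. Running on
// "either GROUP_A or GROUP_B" would need a new, multi-tag config shape.
JobConfig.Builder jobCfg = new JobConfig.Builder()
    .setCommand("MyTaskCommand")
    .setInstanceGroupTag("GROUP_A");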


Re: Java 11 support?

2019-04-29 Thread Xue Junkai
Hi Kevin,

Unfortunately, we have not had this discussion before, and we just started
the process of migrating from Java 7 to Java 8. It may take some effort to
migrate from 8 to 11 as well. Feel free to create a Helix ticket for this.

Best,

Junkai

On Mon, Apr 29, 2019 at 3:54 PM Kevin Lafferty 
wrote:

> I did a quick search in Jira and on the user/dev mailing lists, and I
> couldn't find any discussion of adding support for Java 11.
>
> I'm looking at using Helix, but I require Java 11. Has there been any
> discussion of this yet?
>
> Thanks,
> Kevin
>


Missing 0.9.0.1 Helix release in maven repository

2019-08-02 Thread Xue Junkai
Hi Folks,

We generated an Apache Helix release, 0.9.0.1. Everything looked good
after we clicked the release in the Repository Manager.

But we still cannot find the 0.9.0.1 release in the Maven repository.
Could you please help us figure out why it is not showing up? It was
released yesterday at 6pm.

Could it be that Maven only supports three-level version numbers?

Best,

Junkai


Re: [VOTE] Apache Helix 0.9.1 Release

2019-08-14 Thread Xue Junkai
+1

On Wed, Aug 14, 2019 at 2:07 PM Wang Jiajun  wrote:

> Hi,
>
> This is to call for a vote on releasing the following candidate as Apache
> Helix 0.9.1. This is the 18th release of Helix as an Apache project, as
> well as the 14th release as a top-level Apache project.
>
> Apache Helix is a generic cluster management framework that makes it easy
> to build partitioned and replicated, fault-tolerant and scalable
> distributed systems.
>
> Release notes:
> https://helix.apache.org/0.9.1-docs/releasenotes/release-0.9.1.html
>
> Release artifacts:
> https://repository.apache.org/content/repositories/orgapachehelix-1032/
>
> Distribution:
> * binaries:
> https://dist.apache.org/repos/dist/dev/helix/0.9.1/binaries/
> * sources:
> https://dist.apache.org/repos/dist/dev/helix/0.9.1/src/
>
> The 0.9.1 release tag:
>
> https://git-wip-us.apache.org/repos/asf?p=helix.git;a=tag;h=refs/tags/helix-0.9.1
>
> KEYS file available here:
> https://dist.apache.org/repos/dist/dev/helix/KEYS
>
> Please vote on the release. The vote will be open for at least 72 hours.
>
> [+1] -- "YES, release"
> [0] -- "No opinion"
> [-1] -- "NO, do not release"
>
> Thanks,
> The Apache Helix Team
>


-- 
Junkai Xue


Re: Missing 0.9.0.1 Helix release in maven repository

2019-08-07 Thread Xue Junkai
Created an Apache JIRA to track it:
https://issues.apache.org/jira/browse/INFRA-18849


On Fri, Aug 2, 2019 at 12:56 PM Xue Junkai  wrote:

> Hi Folks,
>
> We generated an Apache Helix release, 0.9.0.1. Everything looked good
> after we clicked the release in the Repository Manager.
>
> But we still cannot find the 0.9.0.1 release in the Maven repository.
> Could you please help us figure out why it is not showing up? It was
> released yesterday at 6pm.
>
> Could it be that Maven only supports three-level version numbers?
>
> Best,
>
> Junkai
>


[ANNOUNCE] Move review related emails to revi...@helix.apache.org

2019-08-05 Thread Xue Junkai
Hi Folks,

Due to the large volume of emails from GitHub reviews flooding the dev
channel, we have decided to move all review-related notifications to
revi...@helix.apache.org.

Please resubscribe to reviews by sending a subscribe email directly to
revi...@helix.apache.org, or go through this page:
https://helix.apache.org/mail-lists.html

Best,

Junkai


Re: [VOTE] Apache Helix 0.9.4 Release

2020-01-23 Thread Xue Junkai
+1

On Thu, Jan 23, 2020 at 5:46 AM Lei Xia  wrote:

> +1
>
>
> Lei
>
> On Wed, Jan 22, 2020 at 6:30 PM Hunter Lee  wrote:
>
> > It is up now.
> >
> > Hunter
> >
> > On Wed, Jan 22, 2020 at 8:48 AM Lei Xia  wrote:
> >
> > > Thanks Hunter, the release notes link seems not work?
> > >
> > >
> > > Lei
> > >
> > > On Tue, Jan 21, 2020 at 11:40 PM Hunter Lee 
> wrote:
> > >
> > > > Hi,
> > > >
> > > > This is to call for a vote on releasing the following candidate as
> > Apache
> > > > Helix 0.9.4. This is the 19th release of Helix as an Apache project,
> as
> > > > well as the 15th release as a top-level Apache project.
> > > >
> > > > Apache Helix is a generic cluster management framework that makes it
> > easy
> > > > to build partitioned and replicated, fault-tolerant and scalable
> > > > distributed systems.
> > > >
> > > > Release notes:
> > > > https://helix.apache.org/0.9.4-docs/releasenotes/release-0.9.4.html
> > > >
> > > > Release artifacts:
> > > >
> > https://repository.apache.org/content/repositories/orgapachehelix-1036/
> > > >
> > > > Distribution:
> > > > * binaries:
> > > > https://dist.apache.org/repos/dist/dev/helix/0.9.4/binaries/
> > > > * sources:
> > > > https://dist.apache.org/repos/dist/dev/helix/0.9.4/src/
> > > >
> > > > The 0.9.4 release tag:
> > > >
> > > >
> > >
> >
> https://git-wip-us.apache.org/repos/asf?p=helix.git;a=tag;h=refs/tags/helix-0.9.4
> > > >
> > > > KEYS file available here:
> > > > https://dist.apache.org/repos/dist/dev/helix/KEYS
> > > >
> > > > Please vote on the release. The vote will be open for at least 72
> > hours.
> > > >
> > > > [+1] -- "YES, release"
> > > > [0] -- "No opinion"
> > > > [-1] -- "NO, do not release"
> > > >
> > > > Thanks,
> > > > The Apache Helix Team
> > > >
> > >
> >
>


[VOTE] Apache Helix 1.0.0 Release

2020-05-04 Thread Xue Junkai
Hi,


This is to call for a vote on releasing the following candidate as Apache
Helix 1.0.0. This is the 21st release of Helix as an Apache project, as
well as the 16th release as a top-level Apache project.


Apache Helix is a generic cluster management framework that makes it easy
to build partitioned and replicated, fault-tolerant and scalable
distributed systems.


Release notes:

http://helix.apache.org/1.0.0-docs/releasenotes/release-1.0.0.html


Release artifacts:

https://repository.apache.org/content/repositories/orgapachehelix-1037


Distribution:

* binaries:

https://dist.apache.org/repos/dist/dev/helix/1.0.0/binaries/

* sources:

https://dist.apache.org/repos/dist/dev/helix/1.0.0/src/



The 1.0.0 release tag:

https://git-wip-us.apache.org/repos/asf?p=helix.git;a=tag;h=refs/tags/helix-1.0.0


KEYS file available here:

https://dist.apache.org/repos/dist/dev/helix/KEYS


Please vote on the release. The vote will be open for at least 72 hours.


[+1] -- "YES, release"

[0] -- "No opinion"

[-1] -- "NO, do not release"


Thanks,

The Apache Helix Team


[VOTE] Apache Helix 0.9.7 Release

2020-05-11 Thread Xue Junkai
Hi,


This is to call for a vote on releasing the following candidate as Apache
Helix 0.9.7. This is the 22nd release of Helix as an Apache project, as
well as the 18th release as a top-level Apache project. This release
supports customers who are using the 0.9 series.


Apache Helix is a generic cluster management framework that makes it easy
to build partitioned and replicated, fault-tolerant and scalable
distributed systems.


Release notes:

http://helix.apache.org/0.9.7-docs/releasenotes/release-0.9.7.html


Release artifacts:

https://repository.apache.org/content/repositories/orgapachehelix-1039


Distribution:

* binaries:

https://dist.apache.org/repos/dist/dev/helix/0.9.7/binaries/

* sources:

https://dist.apache.org/repos/dist/dev/helix/0.9.7/src/


The 0.9.7 release tag:

https://git-wip-us.apache.org/repos/asf?p=helix.git;a=tag;h=refs/tags/helix-0.9.7


KEYS file available here:

https://dist.apache.org/repos/dist/dev/helix/KEYS


Please vote on the release. The vote will be open for at least 72 hours.


[+1] -- "YES, release"

[0] -- "No opinion"

[-1] -- "NO, do not release"


Thanks,

The Apache Helix Team


Re: [VOTE] Apache Helix 0.9.5 Release

2020-05-11 Thread Xue Junkai
Sorry, all. This version number has already been taken in Maven; we need
to investigate it.

I will send another vote for a different version. Please ignore this vote.

Best,

Junkai

On Mon, May 11, 2020 at 12:06 PM Hunter Lee  wrote:

> +1
>
> On Mon, May 11, 2020 at 12:00 PM Xue Junkai  wrote:
>
> > Hi,
> >
> >
> > This is to call for a vote on releasing the following candidate as Apache
> > Helix 0.9.5. This is the 22nd release of Helix as an Apache project, as
> > well as the 18th release as a top-level Apache project. This release
> > supports customers who are using the 0.9 series.
> >
> >
> > Apache Helix is a generic cluster management framework that makes it easy
> > to build partitioned and replicated, fault-tolerant and scalable
> > distributed systems.
> >
> >
> > Release notes:
> >
> > http://helix.apache.org/0.9.5-docs/releasenotes/release-0.9.5.html
> >
> >
> > Release artifacts:
> >
> > https://repository.apache.org/content/repositories/orgapachehelix-1038
> >
> >
> > Distribution:
> >
> > * binaries:
> >
> > https://dist.apache.org/repos/dist/dev/helix/0.9.5/binaries/
> >
> > * sources:
> >
> > https://dist.apache.org/repos/dist/dev/helix/0.9.5/src/
> >
> >
> > The 0.9.5 release tag:
> >
> >
> >
> https://git-wip-us.apache.org/repos/asf?p=helix.git;a=tag;h=refs/tags/helix-0.9.5
> >
> >
> > KEYS file available here:
> >
> > https://dist.apache.org/repos/dist/dev/helix/KEYS
> >
> >
> > Please vote on the release. The vote will be open for at least 72 hours.
> >
> >
> > [+1] -- "YES, release"
> >
> > [0] -- "No opinion"
> >
> > [-1] -- "NO, do not release"
> >
> >
> > Thanks,
> >
> > The Apache Helix Team
> >
>


[RESULT][VOTE] Apache Helix 1.0.0 Release

2020-05-11 Thread Xue Junkai
Thanks for voting on the 1.0.0 release. It has now exceeded 72 hours so
I am closing the vote.


Binding +1s:

Kishore Gopalakrishna, Lei Xia, Junkai Xue


Nonbinding +1s:

Jiajun Wang, Harry Zhang


Binding 0s:


Nonbinding 0s:


Binding -1s:


Nonbinding -1s:


The vote has passed, thanks a lot to everyone for voting!


[ANNOUNCE] Apache Helix 0.9.7 Release

2020-05-18 Thread Xue Junkai
The Apache Helix Team is pleased to announce the 22nd release, 0.9.7, of
the Apache Helix project.


This release supports customers who are using the 0.9 series. We will
support users who are not able to onboard the 1.0.0 major features or
backward-incompatible changes, so the 0.9 series releases will only
contain bug fixes and minor improvements, and will be cut based on users'
requests.


Apache Helix is a generic cluster management framework that makes it easy

to build partitioned, fault tolerant, and scalable distributed systems.


The full release notes are available here:

http://helix.apache.org/0.9.7-docs/releasenotes/release-0.9.7.html


You can declare a maven dependency to use it:




<dependency>
  <groupId>org.apache.helix</groupId>
  <artifactId>helix-core</artifactId>
  <version>0.9.7</version>
</dependency>




Or download the release sources:

http://helix.apache.org/0.9.7-docs/download.cgi


Additional info


Website: http://helix.apache.org/

Helix mailing lists: http://helix.apache.org/mail-lists.html


We hope you will enjoy using the latest release of Apache Helix!


Cheers,

Apache Helix Team


[RESULT][VOTE] Apache Helix 0.9.7 Release

2020-05-18 Thread Xue Junkai
Thanks for voting on the 0.9.7 release. It has now exceeded 72 hours so I
am closing the vote.


Binding +1s:

Lei Xia, Olivier Lamy, Kishore Gopalakrishna, Junkai Xue


Nonbinding +1s:

Hunter Lee, Hao Zhang


Binding 0s:


Nonbinding 0s:


Binding -1s:


Nonbinding -1s:


The vote has passed, thanks a lot to everyone for voting!


[ANNOUNCE] Apache Helix 1.0.0 Release

2020-05-11 Thread Xue Junkai
The Apache Helix Team is pleased to announce the 21st release,

1.0.0, of the Apache Helix project.


Apache Helix is a generic cluster management framework that makes it easy

to build partitioned, fault tolerant, and scalable distributed systems.


The full release notes are available here:

http://helix.apache.org/1.0.0-docs/releasenotes/release-1.0.0.html


You can declare a maven dependency to use it:




<dependency>
  <groupId>org.apache.helix</groupId>
  <artifactId>helix-core</artifactId>
  <version>1.0.0</version>
</dependency>




Or download the release sources:

http://helix.apache.org/1.0.0-docs/download.cgi


Additional info


Website: http://helix.apache.org/

Helix mailing lists: http://helix.apache.org/mail-lists.html


We hope you will enjoy using the latest release of Apache Helix!


Cheers,

Apache Helix Team


Re: [VOTE] Apache Helix 0.9.8 Release

2020-10-14 Thread Xue Junkai
+1

On Wed, Oct 14, 2020 at 6:25 PM Wang Jiajun  wrote:

> Hi,
>
> This is to call for a vote on releasing the following candidate as Apache
> Helix 0.9.8. This is the 23rd release of Helix as an Apache project, as
> well as the 19th release as a top-level Apache project. This release is
> supporting the customers who are using the 0.9 series.
>
> Apache Helix is a generic cluster management framework that makes it easy
> to build partitioned and replicated, fault-tolerant and scalable
> distributed systems.
>
> Release notes:
> https://helix.apache.org/0.9.8-docs/releasenotes/release-0.9.8.html
>
> Release artifacts:
> https://repository.apache.org/content/repositories/orgapachehelix-1042/
>
> Distribution:
> * binaries:
> https://dist.apache.org/repos/dist/dev/helix/0.9.8/binaries/
> * sources:
> https://dist.apache.org/repos/dist/dev/helix/0.9.8/src/
>
> The 0.9.8 release tag:
>
> https://git-wip-us.apache.org/repos/asf?p=helix.git;a=tag;h=refs/tags/helix-0.9.8
>
> KEYS file available here:
> https://dist.apache.org/repos/dist/dev/helix/KEYS
>
> Please vote on the release. The vote will be open for at least 72 hours.
>
> [+1] -- "YES, release"
> [0] -- "No opinion"
> [-1] -- "NO, do not release"
>
> Thanks,
> The Apache Helix Team
>


[RESULT][VOTE] Apache Helix 1.0.1 Release

2020-07-29 Thread Xue Junkai
Thanks for voting on the 1.0.1 release. It has now exceeded 72 hours so
I am closing the vote.


Binding +1s:

Lei Xia,

Zhen Zhang,

Junkai Xue


Nonbinding +1s:


Binding 0s:


Nonbinding 0s:


Binding -1s:


Nonbinding -1s:


The vote has passed, thanks a lot to everyone for voting!


[ANNOUNCE] Apache Helix 1.0.1 Release

2020-07-29 Thread Xue Junkai
The Apache Helix Team is pleased to announce the 21st release,

1.0.1, of the Apache Helix project.


Apache Helix is a generic cluster management framework that makes it easy

to build partitioned, fault tolerant, and scalable distributed systems.


The full release notes are available here:

https://helix.apache.org/1.0.1-docs/releasenotes/release-1.0.1.html


You can declare a maven dependency to use it:




<dependency>
  <groupId>org.apache.helix</groupId>
  <artifactId>helix-core</artifactId>
  <version>1.0.1</version>
</dependency>




Or download the release sources:

http://helix.apache.org/1.0.1-docs/download.cgi


Additional info


Website: http://helix.apache.org/

Helix mailing lists: http://helix.apache.org/mail-lists.html


We hope you will enjoy using the latest release of Apache Helix!


Cheers,

Apache Helix Team


Re: [VOTE] Apache Helix 1.0.1 Release

2020-07-29 Thread Xue Junkai
+1

On Thu, Jul 2, 2020 at 12:11 PM Lei Xia  wrote:

> +1
>
> On Wed, Jul 1, 2020 at 2:51 PM Xue Junkai  wrote:
>
> > Hi,
> >
> >
> > This is to call for a vote on releasing the following candidate as Apache
> > Helix 1.0.1. This is the 21st release of Helix as an Apache project, as
> > well as the 17th release as a top-level Apache project.
> >
> >
> > Apache Helix is a generic cluster management framework that makes it easy
> > to build partitioned and replicated, fault-tolerant and scalable
> > distributed systems.
> >
> >
> > Release notes:
> >
> > http://helix.apache.org/1.0.1-docs/releasenotes/release-1.0.1.html
> >
> >
> > Release artifacts:
> >
> > https://repository.apache.org/content/repositories/orgapachehelix-1040
> >
> >
> > Distribution:
> >
> > * binaries:
> >
> > https://dist.apache.org/repos/dist/dev/helix/1.0.1/binaries/
> >
> > * sources:
> >
> > https://dist.apache.org/repos/dist/dev/helix/1.0.1/src/
> >
> >
> > The 1.0.1 release tag:
> >
> >
> >
> https://git-wip-us.apache.org/repos/asf?p=helix.git;a=tag;h=refs/tags/helix-1.0.1
> >
> >
> > KEYS file available here:
> >
> > https://dist.apache.org/repos/dist/dev/helix/KEYS
> >
> >
> > Please vote on the release. The vote will be open for at least 72 hours.
> >
> >
> > [+1] -- "YES, release"
> >
> > [0] -- "No opinion"
> >
> > [-1] -- "NO, do not release"
> >
> >
> > Thanks,
> >
> > The Apache Helix Team
> >
>


-- 
Junkai Xue


[ANNOUNCE] New Committers: Ali Reza Zamani and Huizhi Lu

2020-08-05 Thread Xue Junkai
Hi, All


  The Project Management Committee (PMC) for Apache Helix has asked Ali
Reza Zamani and Huizhi Lu to become committers and we are pleased to
announce that they have accepted.


  Being a committer enables easier contribution to the project since there
is no need to go via the patch submission process. This should enable
better productivity.


  Welcome Ali Reza Zamani and Huizhi Lu!


Helix Team


[ANNOUNCE] Deprecate Apache Helix 1.0.0

2020-07-01 Thread Xue Junkai
Hi Helix devs and users:

We will deprecate the Helix 1.0.0 release due to unstable behavior
introduced by auto-merging branched feature development into master. The
git auto-merge silently resolved a conflict, causing unexpected runtime
instability. Unfortunately, our release validation tools failed to detect
this issue when we prepared the release.

We will deprecate the 1.0.0 release and generate a 1.0.1 release with the
fix, as well as several other improvements.

Cheers,

The Apache Helix Team


[VOTE] Apache Helix 1.0.1 Release

2020-07-01 Thread Xue Junkai
Hi,


This is to call for a vote on releasing the following candidate as Apache
Helix 1.0.1. This is the 21st release of Helix as an Apache project, as
well as the 17th release as a top-level Apache project.


Apache Helix is a generic cluster management framework that makes it easy
to build partitioned and replicated, fault-tolerant and scalable
distributed systems.


Release notes:

http://helix.apache.org/1.0.1-docs/releasenotes/release-1.0.1.html


Release artifacts:

https://repository.apache.org/content/repositories/orgapachehelix-1040


Distribution:

* binaries:

https://dist.apache.org/repos/dist/dev/helix/1.0.1/binaries/

* sources:

https://dist.apache.org/repos/dist/dev/helix/1.0.1/src/


The 1.0.1 release tag:

https://git-wip-us.apache.org/repos/asf?p=helix.git;a=tag;h=refs/tags/helix-1.0.1


KEYS file available here:

https://dist.apache.org/repos/dist/dev/helix/KEYS


Please vote on the release. The vote will be open for at least 72 hours.


[+1] -- "YES, release"

[0] -- "No opinion"

[-1] -- "NO, do not release"


Thanks,

The Apache Helix Team


[VOTE] Apache Helix 0.9.9 Release

2020-11-20 Thread Xue Junkai
Hi,

This is to call for a vote on releasing the following candidate as Apache
Helix 0.9.9. This is the 24th release of Helix as an Apache project, as
well as the 20th release as a top-level Apache project.

Apache Helix is a generic cluster management framework that makes it easy
to build partitioned and replicated, fault-tolerant and scalable
distributed systems.

Release notes:
http://helix.apache.org/0.9.9-docs/releasenotes/release-0.9.9.html
Release artifacts:
https://repository.apache.org/content/repositories/orgapachehelix-1043

Distribution:
* binaries: https://dist.apache.org/repos/dist/dev/helix/0.9.9/binaries/
* sources: https://dist.apache.org/repos/dist/dev/helix/0.9.9/src/

The 0.9.9 release tag:
https://git-wip-us.apache.org/repos/asf?p=helix.git;a=tag;h=refs/tags/helix-0.9.9

KEYS file available here:
https://dist.apache.org/repos/dist/dev/helix/KEYS

Please vote on the release.
The vote will be open for at least 72 hours.
[+1] -- "YES, release"
[0] -- "No opinion"
[-1] -- "NO, do not release"

Thanks,

The Apache Helix Team


Re: Helix configuration parameter documentation

2020-11-30 Thread Xue Junkai
Hi Brent,

Glad to hear that. Yes, please submit a PR for the doc; we will help you
review it.

Best,

Junkai

On Mon, Nov 30, 2020 at 4:11 PM Brent  wrote:

> Hey all,
>
> I wanted to gauge interest in constructing a page (possibly something that
> lives under https://helix.apache.org/*-docs/index.html) that explains the
> various Apache Helix configuration parameters including their scopes (e.g.
> CLUSTER/RESOURCE/INSTANCE), their default values and descriptions.  I ended
> up constructing a first pass for my own consumption and figured it might be
> useful to other Helix users as well.
>
> I don't mind generating a first draft and sharing it (though I do need to
> get approval for that first), but I'll very likely be missing information
> or be wrong on certain things so it will definitely require proofreading
> from more knowledgeable folks (i.e. you all).  It would be cool if we could
> auto-generate it from the code so it didn't need to be updated separately,
> but I'm not sure how feasible that is at the moment so this seemed like the
> next best bet. This is more of a Zookeeper-level description at the moment;
> it doesn't currently reference the corresponding Java objects/methods.
>
> A very minimal sample formatted in Markdown is below (pasted into your
> Markdown previewer of choice).  Thoughts?
>
> | Param Name  | Scope   | Deprecated | Default Value | Description |
> | --- | --- | -- | - | --- |
> | DELAY_REBALANCE_TIME | CLUSTER | No | -1 | Delayed time that Helix should
> hold off for until rebalancing (in milliseconds).  Only valid when
> DELAY_REBALANCE_ENABLED is true. |
> |...| ... | ... | ... | ... |
> | MAX_PARTITIONS_PER_INSTANCE | RESOURCE | No | Integer.MAX_VALUE | The
> maximum number of partitions, for the given resource, that can be placed on
> any single instance. |
> |...| ... | ... | ... | ... |
> | DOMAIN | INSTANCE | No | null | Domain represents a hierarchy identifier
> for an instance.  The value should mirror the TOPOLOGY setting for the
> cluster. See
>
> https://engineering.linkedin.com/blog/2017/07/powering-helix_s-auto-rebalancer-with-topology-aware-partition-p
> .
> |
>
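One way a CLUSTER-scoped parameter like the sample above can be written
today is through ConfigAccessor; a sketch, assuming an already-connected
HelixManager and using the raw ZooKeeper-level field name from the table
(the helper method and value are illustrative):

import org.apache.helix.ConfigAccessor;
import org.apache.helix.HelixManager;
import org.apache.helix.model.ClusterConfig;

public class ClusterConfigExample {
    // Sketch: sets DELAY_REBALANCE_TIME at CLUSTER scope via the raw
    // simple-field name, mirroring the table's ZooKeeper-level view.
    public static void setDelayRebalanceTime(HelixManager manager, long delayMs) {
        ConfigAccessor accessor = manager.getConfigAccessor();
        ClusterConfig config = accessor.getClusterConfig(manager.getClusterName());
        config.getRecord().setSimpleField("DELAY_REBALANCE_TIME",
            String.valueOf(delayMs));
        accessor.setClusterConfig(manager.getClusterName(), config);
    }
}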


Apache Helix 2021 Spring Meetup

2021-02-05 Thread Xue Junkai
We are back! After 7 years, we will hold the 2021 Apache Helix meetup. We
will talk about what's new in open-source Helix from 0.6 to 1.0, and we
have also invited heavy users from Pinterest and Uber to share their
experience.
Please join us!

https://www.meetup.com/Building-distributed-systems-using-Apache-Helix-Meetup-group/events/276116422/


Best,

Junkai