[jira] [Created] (HELIX-774) Helix process getting increased day by day

2018-10-31 Thread Mohanraj Tirougnaname (JIRA)
Mohanraj Tirougnaname created HELIX-774:
---

 Summary: Helix process getting increased day by day 
 Key: HELIX-774
 URL: https://issues.apache.org/jira/browse/HELIX-774
 Project: Apache Helix
  Issue Type: Bug
  Components: helix-webapp-admin
Affects Versions: 0.6.5
 Environment: Linux
Reporter: Mohanraj Tirougnaname
 Fix For: 0.6.5


Hi Team, 

We use Helix for cluster load balancing. When we start the jBPM server, 3 more 
Helix processes are added to the process list each day, and they take up a lot 
of memory. I have included one of the Helix processes below; please help me 
fix this.

jboss 31027 1 0 Oct19 ? 00:04:27 
/u01/app/mw/jdk1.8.0_121/bin/java -Xms512m -Xmx512m -classpath 
/u01/app/mw/prod_zk/helix/conf
/u01/app/mw/prod_zk/heli /repo/log4j/log4j/1.2.15/log4j-1.2.15.jar
/u01/app/mw/prod_zk/helix/repo/org/apache/zookeeper/zookeeper/3.3.4/zookeeper-3.3.4.jar
/u01/app/mw/prod_zk/helix/repo/jline/jline/0.9.94/jline-0.9.94.jar
/u01/app/mw/prod_zk/helix/repo/org/codehaus/jackson/jackson-core-asl/1.8.5/jackson-core-asl-1.8.5.jar
/u01/app/mw/prod_zk/helix/repo/org/codehaus/jackson/jackson-mapper-asl/1.8.5/jackson-mapper-asl-1.8.5.jar
/u01/app/mw/prod_zk/helix/repo/commons-io/commons-io/1.4/commons-io-1.4.jar
/u01/app/mw/prod_zk/helix/repo/commons-cli/commons-cli/1.2/commons-cli-1.2.jar
/u01/app/mw/prod_zk/helix/repo/com/github/sgroschupf/zkclient/0.1/zkclient-0.1.jar
/u01/app/mw/prod_zk/helix/repo/org/apache/commons/commons-math/2.1/commons-math-2.1.jar
/u01/app/mw/prod_zk/helix/repo/commons-codec/commons-codec/1.6/commons-codec-1.6.jar
/u01/app/mw/prod_zk/helix/repo/com/google/guava/guava/15.0/guava-15.0.jar
/u01/app/mw/prod_zk/helix/repo/org/yaml/snakeyaml/1.12/snakeyaml-1.12.jar
/u01/app/mw/prod_zk/helix/repo/org/apache/helix/helix-core/0.6.5/helix-core-0.6.5.jar
 -Dapp.name=run-helix-controller -Dapp.pid=31027 
-Dapp.repo=/u01/app/mw/prod_zk/helix/repo -Dbasedir=/u01/app/mw/prod_zk/helix 
org.apache.helix.controller.HelixControllerMain --zkSvr 
204.26.160.42:4181,204.26.160.43:4181 --cluster repoCluster3



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HELIX-772) Support TaskDriver.addUserContent() api

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HELIX-772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670698#comment-16670698
 ] 

ASF GitHub Bot commented on HELIX-772:
--

Github user asfgit closed the pull request at:

https://github.com/apache/helix/pull/280


> Support TaskDriver.addUserContent() api
> ---
>
> Key: HELIX-772
> URL: https://issues.apache.org/jira/browse/HELIX-772
> Project: Apache Helix
>  Issue Type: Bug
>Reporter: Harry Zhang
>Assignee: Harry Zhang
>Priority: Major
>
> Need to support adding user content in the task driver
>  
> AC:
>  * implement API
>  * add test
>  





[jira] [Commented] (HELIX-773) Support getLastScheduledTaskTimestamp information in workflow rest api

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HELIX-773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670702#comment-16670702
 ] 

ASF GitHub Bot commented on HELIX-773:
--

Github user asfgit closed the pull request at:

https://github.com/apache/helix/pull/281


> Support getLastScheduledTaskTimestamp information in workflow rest api
> --
>
> Key: HELIX-773
> URL: https://issues.apache.org/jira/browse/HELIX-773
> Project: Apache Helix
>  Issue Type: Bug
>Reporter: Harry Zhang
>Assignee: Harry Zhang
>Priority: Major
>
> Support getLastScheduledTaskTimestamp information in workflow rest api





[GitHub] helix pull request #281: [HELIX-773] add getLastScheduledTaskTimestamp infor...

2018-10-31 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/helix/pull/281


---


[GitHub] helix pull request #283: [HELIX-775] consolidate user content related apis f...

2018-10-31 Thread zhan849
GitHub user zhan849 opened a pull request:

https://github.com/apache/helix/pull/283

[HELIX-775] consolidate user content related apis for task driver

HELIX-1315: consolidate user content related apis for task driver


To consolidate the task driver's user content APIs and the corresponding 
REST APIs, I'm deprecating the general getUserContent() API. In its place, we 
now have the following APIs to get, add, and update user content.

```java
public void addOrUpdateWorkflowUserContentMap(String workflowName,
    final Map contentToAddOrUpdate);

public void addOrUpdateJobUserContentMap(String workflowName, String jobName,
    final Map contentToAddOrUpdate);

public void addOrUpdateTaskUserContentMap(String workflowName, String jobName,
    String taskPartitionId, final Map contentToAddOrUpdate);

public Map getWorkflowUserContentMap(String workflowName);

public Map getJobUserContentMap(String workflowName, String jobName);

public Map getTaskUserContentMap(String workflowName, String jobName,
    String taskPartitionId);
```

The API for deleting user content is TBD but can follow the same convention.
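
The add-or-update and get semantics of the APIs listed above can be illustrated with a small in-memory sketch. This is not the Helix implementation: the class name `UserContentSketch`, the scope-key format, the `Map<String, String>` value type, and the choice to return an empty map for a missing scope are all assumptions made for illustration. The real TaskDriver persists user content through its own storage layer, which this message does not describe.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for user content scoped to workflow, job, and task.
// It only mirrors the add-or-update / get shape of the APIs listed above.
final class UserContentSketch {
  // One content map per scope; the "workflow/job/task" key format is an assumption.
  private final Map<String, Map<String, String>> store = new HashMap<>();

  void addOrUpdateWorkflowUserContentMap(String workflowName, Map<String, String> content) {
    merge(workflowName, content);
  }

  void addOrUpdateJobUserContentMap(String workflowName, String jobName,
      Map<String, String> content) {
    merge(workflowName + "/" + jobName, content);
  }

  void addOrUpdateTaskUserContentMap(String workflowName, String jobName,
      String taskPartitionId, Map<String, String> content) {
    merge(workflowName + "/" + jobName + "/" + taskPartitionId, content);
  }

  Map<String, String> getWorkflowUserContentMap(String workflowName) {
    return read(workflowName);
  }

  Map<String, String> getJobUserContentMap(String workflowName, String jobName) {
    return read(workflowName + "/" + jobName);
  }

  Map<String, String> getTaskUserContentMap(String workflowName, String jobName,
      String taskPartitionId) {
    return read(workflowName + "/" + jobName + "/" + taskPartitionId);
  }

  // New keys are added and existing keys are overwritten; other keys are left alone.
  private void merge(String scopeKey, Map<String, String> content) {
    store.computeIfAbsent(scopeKey, k -> new HashMap<>()).putAll(content);
  }

  // Returns a read-only copy; a missing scope yields an empty map (sketch choice).
  private Map<String, String> read(String scopeKey) {
    Map<String, String> content = store.get(scopeKey);
    if (content == null) {
      return Collections.emptyMap();
    }
    return Collections.unmodifiableMap(new HashMap<>(content));
  }

  public static void main(String[] args) {
    UserContentSketch sketch = new UserContentSketch();
    Map<String, String> content = new HashMap<>();
    content.put("owner", "alice");
    sketch.addOrUpdateJobUserContentMap("workflow1", "job1", content);
    System.out.println(sketch.getJobUserContentMap("workflow1", "job1"));
  }
}
```

Keeping one map per scope means a job-level update never touches workflow-level or task-level content, which matches the three separate method families in the list.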

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zhan849/helix harry/task-user-content

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/helix/pull/283.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #283


commit b235c4ee5a82c5970d29e839317ea242813a58bc
Author: Harry Zhang 
Date:   2018-10-04T18:25:08Z

[HELIX-775] consolidate user content related apis for task driver




---


[jira] [Commented] (HELIX-775) Task driver should support add/get task framework user content

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HELIX-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670730#comment-16670730
 ] 

ASF GitHub Bot commented on HELIX-775:
--

GitHub user zhan849 opened a pull request:

https://github.com/apache/helix/pull/283

[HELIX-775] consolidate user content related apis for task driver

HELIX-1315: consolidate user content related apis for task driver


To consolidate the task driver's user content APIs and the corresponding 
REST APIs, I'm deprecating the general getUserContent() API. In its place, we 
now have the following APIs to get, add, and update user content.

```java
public void addOrUpdateWorkflowUserContentMap(String workflowName,
    final Map contentToAddOrUpdate);

public void addOrUpdateJobUserContentMap(String workflowName, String jobName,
    final Map contentToAddOrUpdate);

public void addOrUpdateTaskUserContentMap(String workflowName, String jobName,
    String taskPartitionId, final Map contentToAddOrUpdate);

public Map getWorkflowUserContentMap(String workflowName);

public Map getJobUserContentMap(String workflowName, String jobName);

public Map getTaskUserContentMap(String workflowName, String jobName,
    String taskPartitionId);
```

The API for deleting user content is TBD but can follow the same convention.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zhan849/helix harry/task-user-content

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/helix/pull/283.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #283


commit b235c4ee5a82c5970d29e839317ea242813a58bc
Author: Harry Zhang 
Date:   2018-10-04T18:25:08Z

[HELIX-775] consolidate user content related apis for task driver




> Task driver should support add/get task framework user content
> --
>
> Key: HELIX-775
> URL: https://issues.apache.org/jira/browse/HELIX-775
> Project: Apache Helix
>  Issue Type: Task
>Reporter: Harry Zhang
>Assignee: Harry Zhang
>Priority: Major
>
> Task driver should support add/get task framework user content at 
> workflow/job/task levels
>  
> AC:
>  * finish implementation
>  * add tests





[jira] [Commented] (HELIX-775) Task driver should support add/get task framework user content

2018-10-31 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HELIX-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670739#comment-16670739
 ] 

Hudson commented on HELIX-775:
--

FAILURE: Integrated in Jenkins build helix #1560 (See 
[https://builds.apache.org/job/helix/1560/])
[HELIX-775] consolidate user content related apis for task driver (hrzhang: rev 
b235c4ee5a82c5970d29e839317ea242813a58bc)
* (edit) helix-core/src/main/java/org/apache/helix/task/TaskUtil.java
* (edit) helix-core/src/main/java/org/apache/helix/task/TaskDriver.java
* (edit) 
helix-core/src/test/java/org/apache/helix/task/TestGetSetUserContentStore.java


> Task driver should support add/get task framework user content
> --
>
> Key: HELIX-775
> URL: https://issues.apache.org/jira/browse/HELIX-775
> Project: Apache Helix
>  Issue Type: Task
>Reporter: Harry Zhang
>Assignee: Harry Zhang
>Priority: Major
>
> Task driver should support add/get task framework user content at 
> workflow/job/task levels
>  
> AC:
>  * finish implementation
>  * add tests





helix - Build # 1560 - Still Failing

2018-10-31 Thread Apache Jenkins Server
The Apache Jenkins build system has built helix (build #1560)

Status: Still Failing

Check console output at https://builds.apache.org/job/helix/1560/ to view the 
results.

[jira] [Commented] (HELIX-772) Support TaskDriver.addUserContent() api

2018-10-31 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HELIX-772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670717#comment-16670717
 ] 

Hudson commented on HELIX-772:
--

FAILURE: Integrated in Jenkins build helix #1558 (See 
[https://builds.apache.org/job/helix/1558/])
[HELIX-772] add TaskDriver.addUserContent() api and related tests (hrzhang: rev 
0c251bbf640206729755301c3dda734eea78343f)
* (add) 
helix-core/src/test/java/org/apache/helix/task/TestGetSetUserContentStore.java
* (edit) helix-core/src/main/java/org/apache/helix/task/TaskUtil.java
* (edit) 
helix-core/src/test/java/org/apache/helix/integration/task/TestIndependentTaskRebalancer.java
* (delete) 
helix-core/src/test/java/org/apache/helix/task/TestGetUserContentStore.java
* (edit) helix-core/src/main/java/org/apache/helix/task/TaskDriver.java


> Support TaskDriver.addUserContent() api
> ---
>
> Key: HELIX-772
> URL: https://issues.apache.org/jira/browse/HELIX-772
> Project: Apache Helix
>  Issue Type: Bug
>Reporter: Harry Zhang
>Assignee: Harry Zhang
>Priority: Major
>
> Need to support adding user content in the task driver
>  
> AC:
>  * implement API
>  * add test
>  





[jira] [Commented] (HELIX-773) Support getLastScheduledTaskTimestamp information in workflow rest api

2018-10-31 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HELIX-773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670718#comment-16670718
 ] 

Hudson commented on HELIX-773:
--

FAILURE: Integrated in Jenkins build helix #1558 (See 
[https://builds.apache.org/job/helix/1558/])
[HELIX-773] add getLastScheduledTaskTimestamp information in workflow (hrzhang: 
rev 566d4f166473b477ea0db1cfba5d04c8f3d6bf30)
* (add) 
helix-core/src/test/java/org/apache/helix/task/TestGetLastScheduledTaskExecInfo.java
* (edit) 
helix-rest/src/main/java/org/apache/helix/rest/server/resources/helix/WorkflowAccessor.java
* (delete) 
helix-core/src/test/java/org/apache/helix/task/TestGetLastScheduledTaskTimestamp.java
* (edit) helix-core/src/main/java/org/apache/helix/task/TaskDriver.java
* (add) helix-core/src/main/java/org/apache/helix/task/TaskExecutionInfo.java
* (edit) 
helix-rest/src/test/java/org/apache/helix/rest/server/TestWorkflowAccessor.java


> Support getLastScheduledTaskTimestamp information in workflow rest api
> --
>
> Key: HELIX-773
> URL: https://issues.apache.org/jira/browse/HELIX-773
> Project: Apache Helix
>  Issue Type: Bug
>Reporter: Harry Zhang
>Assignee: Harry Zhang
>Priority: Major
>
> Support getLastScheduledTaskTimestamp information in workflow rest api





[GitHub] helix pull request #283: [HELIX-775] consolidate user content related apis f...

2018-10-31 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/helix/pull/283


---


[jira] [Created] (HELIX-776) REST2.0: Add delete command to updateInstanceConfig

2018-10-31 Thread Hunter L (JIRA)
Hunter L created HELIX-776:
--

 Summary: REST2.0: Add delete command to updateInstanceConfig
 Key: HELIX-776
 URL: https://issues.apache.org/jira/browse/HELIX-776
 Project: Apache Helix
  Issue Type: Improvement
Reporter: Hunter L
Assignee: Hunter L


For instance configs, REST2.0 did not expose a REST API for deleting fields. 
This RB adds update and delete commands to updateInstanceConfig, along with an 
integration test.
Changelist:
1. Add delete command to updateInstanceConfig in InstanceAccessor
2. Add integration tests
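
The difference between the update and delete commands can be sketched on a plain field map. This is an illustration only: `InstanceConfigCommandSketch`, the `Command` enum, and the `apply` method are hypothetical names, and the sketch does not reproduce how InstanceAccessor parses requests or writes to ZooKeeper.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of "update" versus "delete" applied to an instance config's fields.
final class InstanceConfigCommandSketch {
  enum Command { UPDATE, DELETE }

  // Returns a new field map; the current config passed in is left unmodified.
  static Map<String, String> apply(Command command, Map<String, String> current,
      Map<String, String> fields) {
    Map<String, String> result = new HashMap<>(current);
    switch (command) {
      case UPDATE:
        // Add new fields and overwrite existing ones.
        result.putAll(fields);
        break;
      case DELETE:
        // Only the keys of the request matter; matching fields are removed.
        result.keySet().removeAll(fields.keySet());
        break;
      default:
        throw new IllegalArgumentException("Unsupported command: " + command);
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, String> current = new HashMap<>();
    current.put("HELIX_ENABLED", "true");
    current.put("customField", "a");

    Map<String, String> request = new HashMap<>();
    request.put("customField", "");

    System.out.println(apply(Command.DELETE, current, request));
  }
}
```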





[jira] [Created] (HELIX-777) TASK: Handle null currentState for unscheduled tasks

2018-10-31 Thread Hunter L (JIRA)
Hunter L created HELIX-777:
--

 Summary: TASK: Handle null currentState for unscheduled tasks
 Key: HELIX-777
 URL: https://issues.apache.org/jira/browse/HELIX-777
 Project: Apache Helix
  Issue Type: Improvement
Reporter: Hunter L
Assignee: Hunter L


It was observed that when a workflow is submitted and the Controller attempts 
to schedule its tasks, ZK read fails to read the appropriate job's context, 
causing the job to be stuck in an unscheduled state. The job remained 
unscheduled because it had no currentStates, and its job context did not 
contain any assignment/state information. This RB fixes such stuck states by 
detecting null currentStates.
Changelist:
1. Check if currentState is null and if it is, manually assign an INIT state
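
The null check in the changelist can be sketched as follows. The `State` enum here is a hypothetical subset defined only for this example, and `effectiveState` is an illustrative helper, not a method in AbstractTaskDispatcher.

```java
// Hypothetical sketch of falling back to INIT when no current state was read.
final class NullCurrentStateSketch {
  // Illustrative subset of task states; not the full set used by Helix.
  enum State { INIT, RUNNING, COMPLETED }

  // A missing (null) current state is treated as INIT so the task can still be scheduled.
  static State effectiveState(State currentState) {
    return currentState == null ? State.INIT : currentState;
  }

  public static void main(String[] args) {
    System.out.println(effectiveState(null));
    System.out.println(effectiveState(State.RUNNING));
  }
}
```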





[GitHub] helix pull request #282: [HELIX-775] add task driver support for helix rest ...

2018-10-31 Thread zhan849
GitHub user zhan849 opened a pull request:

https://github.com/apache/helix/pull/282

[HELIX-775] add task driver support for helix rest to add/get task framework user content


consolidate user content related apis for task driver


To consolidate the task driver's user content APIs and the corresponding 
REST APIs, I'm deprecating the general getUserContent() API. In its place, we 
now have the following APIs to get, add, and update user content.

```java
public void addOrUpdateWorkflowUserContentMap(String workflowName,
    final Map contentToAddOrUpdate);

public void addOrUpdateJobUserContentMap(String workflowName, String jobName,
    final Map contentToAddOrUpdate);

public void addOrUpdateTaskUserContentMap(String workflowName, String jobName,
    String taskPartitionId, final Map contentToAddOrUpdate);

public Map getWorkflowUserContentMap(String workflowName);

public Map getJobUserContentMap(String workflowName, String jobName);

public Map getTaskUserContentMap(String workflowName, String jobName,
    String taskPartitionId);
```

The API for deleting user content is TBD but can follow the same convention.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zhan849/helix harry/task-user-content

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/helix/pull/282.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #282


commit 7ec5313bccb679014d6a0605ee5d7184063e555e
Author: Harry Zhang 
Date:   2018-10-31T20:55:44Z

[HELIX-775] add task driver support for helix rest to add/get task 
framework user content




---


[jira] [Commented] (HELIX-775) Task driver should support add/get task framework user content

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HELIX-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670711#comment-16670711
 ] 

ASF GitHub Bot commented on HELIX-775:
--

GitHub user zhan849 opened a pull request:

https://github.com/apache/helix/pull/282

[HELIX-775] add task driver support for helix rest to add/get task framework user content


consolidate user content related apis for task driver


To consolidate the task driver's user content APIs and the corresponding 
REST APIs, I'm deprecating the general getUserContent() API. In its place, we 
now have the following APIs to get, add, and update user content.

```java
public void addOrUpdateWorkflowUserContentMap(String workflowName,
    final Map contentToAddOrUpdate);

public void addOrUpdateJobUserContentMap(String workflowName, String jobName,
    final Map contentToAddOrUpdate);

public void addOrUpdateTaskUserContentMap(String workflowName, String jobName,
    String taskPartitionId, final Map contentToAddOrUpdate);

public Map getWorkflowUserContentMap(String workflowName);

public Map getJobUserContentMap(String workflowName, String jobName);

public Map getTaskUserContentMap(String workflowName, String jobName,
    String taskPartitionId);
```

The API for deleting user content is TBD but can follow the same convention.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zhan849/helix harry/task-user-content

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/helix/pull/282.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #282


commit 7ec5313bccb679014d6a0605ee5d7184063e555e
Author: Harry Zhang 
Date:   2018-10-31T20:55:44Z

[HELIX-775] add task driver support for helix rest to add/get task 
framework user content




> Task driver should support add/get task framework user content
> --
>
> Key: HELIX-775
> URL: https://issues.apache.org/jira/browse/HELIX-775
> Project: Apache Helix
>  Issue Type: Task
>Reporter: Harry Zhang
>Assignee: Harry Zhang
>Priority: Major
>
> Task driver should support add/get task framework user content at 
> workflow/job/task levels
>  
> AC:
>  * finish implementation
>  * add tests





helix - Build # 1559 - Still Failing

2018-10-31 Thread Apache Jenkins Server
The Apache Jenkins build system has built helix (build #1559)

Status: Still Failing

Check console output at https://builds.apache.org/job/helix/1559/ to view the 
results.

[jira] [Commented] (HELIX-775) Task driver should support add/get task framework user content

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HELIX-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670734#comment-16670734
 ] 

ASF GitHub Bot commented on HELIX-775:
--

Github user asfgit closed the pull request at:

https://github.com/apache/helix/pull/283


> Task driver should support add/get task framework user content
> --
>
> Key: HELIX-775
> URL: https://issues.apache.org/jira/browse/HELIX-775
> Project: Apache Helix
>  Issue Type: Task
>Reporter: Harry Zhang
>Assignee: Harry Zhang
>Priority: Major
>
> Task driver should support add/get task framework user content at 
> workflow/job/task levels
>  
> AC:
>  * finish implementation
>  * add tests





[jira] [Commented] (HELIX-775) Task driver should support add/get task framework user content

2018-10-31 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HELIX-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670714#comment-16670714
 ] 

ASF GitHub Bot commented on HELIX-775:
--

Github user asfgit closed the pull request at:

https://github.com/apache/helix/pull/282


> Task driver should support add/get task framework user content
> --
>
> Key: HELIX-775
> URL: https://issues.apache.org/jira/browse/HELIX-775
> Project: Apache Helix
>  Issue Type: Task
>Reporter: Harry Zhang
>Assignee: Harry Zhang
>Priority: Major
>
> Task driver should support add/get task framework user content at 
> workflow/job/task levels
>  
> AC:
>  * finish implementation
>  * add tests





helix - Build # 1558 - Still Failing

2018-10-31 Thread Apache Jenkins Server
The Apache Jenkins build system has built helix (build #1558)

Status: Still Failing

Check console output at https://builds.apache.org/job/helix/1558/ to view the 
results.

[jira] [Created] (HELIX-775) Task driver should support add/get task framework user content

2018-10-31 Thread Harry Zhang (JIRA)
Harry Zhang created HELIX-775:
-

 Summary: Task driver should support add/get task framework user 
content
 Key: HELIX-775
 URL: https://issues.apache.org/jira/browse/HELIX-775
 Project: Apache Helix
  Issue Type: Task
Reporter: Harry Zhang
Assignee: Harry Zhang


Task driver should support add/get task framework user content at 
workflow/job/task levels

 

AC:
 * finish implementation
 * add tests





helix - Build # 1561 - Still Failing

2018-10-31 Thread Apache Jenkins Server
The Apache Jenkins build system has built helix (build #1561)

Status: Still Failing

Check console output at https://builds.apache.org/job/helix/1561/ to view the 
results.

[GitHub] helix pull request #284: PR

2018-10-31 Thread narendly
GitHub user narendly opened a pull request:

https://github.com/apache/helix/pull/284

PR



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/narendly/helix master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/helix/pull/284.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #284


commit 6090732be6b88863017a93106fa692dc7350520b
Author: Hunter Lee 
Date:   2018-10-31T21:20:18Z

[HELIX-776] REST2.0: Add delete command to updateInstanceConfig

For instance configs, REST2.0 did not expose the REST API for deletion of 
fields. This RB adds update and delete commands to updateInstanceConfig and an 
integration test thereof.
Changelist:
1. Add delete command to updateInstanceConfig in InstanceAccessor
2. Add integration tests

commit 5d24ed544898ff69f289f54be71a04413735d118
Author: Hunter Lee 
Date:   2018-10-31T21:21:49Z

[HELIX-777] TASK: Handle null currentState for unscheduled tasks

It was observed that when a workflow is submitted and the Controller 
attempts to schedule its tasks, ZK read fails to read the appropriate job's 
context, causing the job to be stuck in an unscheduled state. The job remained 
unscheduled because it had no currentStates, and its job context did not 
contain any assignment/state information. This RB fixes such stuck states by 
detecting null currentStates.
Changelist:
1. Check if currentState is null and if it is, manually assign an INIT state

commit ceba1a55ae351090144c001324f908f2364212a4
Author: Hunter Lee 
Date:   2018-11-01T00:20:37Z

[HELIX-778] TASK: Fix a race condition in updatePreviousAssignedTasksStatus

It was observed that TestUnregisteredCommand is very unstable. The cause 
was identified as a race condition: when a task fails, a pending message for 
that task (from INIT to RUNNING) sometimes wasn't cleaned up in time, so 
AbstractTaskDispatcher's updatePreviousAssignedTasksStatus would try to process 
that message and skip the status update of that task (such as updating its 
status and the NUM_ATTEMPTS field in JobContext).

A short-term fix is to call markPartitionError() before checking for the 
pending message, but in the long run we need to revisit the design of the task 
status update to avoid this type of race condition.

Changelist:
1. Move markPartitionError() up before checking for a pending message on 
the task
2. Fix TestUnregisteredCommand's instability




---


[jira] [Created] (HELIX-778) TASK: Fix a race condition in updatePreviousAssignedTasksStatus

2018-10-31 Thread Hunter L (JIRA)
Hunter L created HELIX-778:
--

 Summary: TASK: Fix a race condition in 
updatePreviousAssignedTasksStatus
 Key: HELIX-778
 URL: https://issues.apache.org/jira/browse/HELIX-778
 Project: Apache Helix
  Issue Type: Improvement
Reporter: Hunter L
Assignee: Hunter L


It was observed that TestUnregisteredCommand is very unstable. The cause was 
identified as a race condition: when a task fails, a pending message for that 
task (from INIT to RUNNING) sometimes wasn't cleaned up in time, so 
AbstractTaskDispatcher's updatePreviousAssignedTasksStatus would try to process 
that message and skip the status update of that task (such as updating its 
status and the NUM_ATTEMPTS field in JobContext).

A short-term fix is to call markPartitionError() before checking for the 
pending message, but in the long run we need to revisit the design of the task 
status update to avoid this type of race condition.

Changelist:
1. Move markPartitionError() up before checking for a pending message on the 
task
2. Fix TestUnregisteredCommand's instability
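
The effect of moving markPartitionError() ahead of the pending-message check can be modeled deterministically. The sketch below covers only the ordering of the two steps, not the concurrency that produces the stale message. `TaskRecord` and both `update...` methods are hypothetical, and the assumption that markPartitionError() sets an error state and increments the attempt count is made for illustration rather than taken from AbstractTaskDispatcher.

```java
// Hypothetical model of the two orderings described in the changelist.
final class PendingMessageOrderSketch {
  // Illustrative per-task record; field names are not Helix's JobContext schema.
  static final class TaskRecord {
    final boolean failed;
    final boolean hasPendingMessage;
    String state = "INIT";
    int numAttempts = 0;

    TaskRecord(boolean failed, boolean hasPendingMessage) {
      this.failed = failed;
      this.hasPendingMessage = hasPendingMessage;
    }
  }

  // Assumed behavior: record the error state and count the attempt.
  static void markPartitionError(TaskRecord task) {
    task.state = "TASK_ERROR";
    task.numAttempts++;
  }

  // Old order: a stale pending message short-circuits the update, so the failure is lost.
  static void updateCheckingPendingMessageFirst(TaskRecord task) {
    if (task.hasPendingMessage) {
      return;
    }
    if (task.failed) {
      markPartitionError(task);
    }
  }

  // New order: the failure is recorded before the pending-message check.
  static void updateMarkingErrorFirst(TaskRecord task) {
    if (task.failed) {
      markPartitionError(task);
    }
    if (task.hasPendingMessage) {
      return; // any remaining per-task processing is still skipped
    }
  }

  public static void main(String[] args) {
    TaskRecord oldOrder = new TaskRecord(true, true);
    updateCheckingPendingMessageFirst(oldOrder);
    TaskRecord newOrder = new TaskRecord(true, true);
    updateMarkingErrorFirst(newOrder);
    System.out.println("pending check first: " + oldOrder.state + ", attempts=" + oldOrder.numAttempts);
    System.out.println("mark error first: " + newOrder.state + ", attempts=" + newOrder.numAttempts);
  }
}
```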





[GitHub] helix pull request #284: PR

2018-10-31 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/helix/pull/284


---


[jira] [Commented] (HELIX-778) TASK: Fix a race condition in updatePreviousAssignedTasksStatus

2018-10-31 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HELIX-778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670955#comment-16670955
 ] 

Hudson commented on HELIX-778:
--

FAILURE: Integrated in Jenkins build helix #1561 (See 
[https://builds.apache.org/job/helix/1561/])
[HELIX-778] TASK: Fix a race condition in (hulee: rev 
ceba1a55ae351090144c001324f908f2364212a4)
* (edit) 
helix-core/src/test/java/org/apache/helix/integration/task/TestUnregisteredCommand.java
* (edit) 
helix-core/src/main/java/org/apache/helix/task/AbstractTaskDispatcher.java


> TASK: Fix a race condition in updatePreviousAssignedTasksStatus
> ---
>
> Key: HELIX-778
> URL: https://issues.apache.org/jira/browse/HELIX-778
> Project: Apache Helix
>  Issue Type: Improvement
>Reporter: Hunter L
>Assignee: Hunter L
>Priority: Major
>
> It was observed that TestUnregisteredCommand is very unstable. The cause was 
> identified as a race condition: when a task fails, a pending message for 
> that task (from INIT to RUNNING) sometimes wasn't cleaned up in time, so 
> AbstractTaskDispatcher's updatePreviousAssignedTasksStatus would try to 
> process that message and skip the status update of that task (such as 
> updating its status and the NUM_ATTEMPTS field in JobContext).
> A short-term fix is to call markPartitionError() before checking for the 
> pending message, but in the long run we need to revisit the design of the 
> task status update to avoid this type of race condition.
> Changelist:
> 1. Move markPartitionError() up before checking for a pending message on the 
> task
> 2. Fix TestUnregisteredCommand's instability





[jira] [Commented] (HELIX-777) TASK: Handle null currentState for unscheduled tasks

2018-10-31 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HELIX-777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16670954#comment-16670954
 ] 

Hudson commented on HELIX-777:
--

FAILURE: Integrated in Jenkins build helix #1561 (See 
[https://builds.apache.org/job/helix/1561/])
[HELIX-777] TASK: Handle null currentState for unscheduled tasks (hulee: rev 
5d24ed544898ff69f289f54be71a04413735d118)
* (edit) 
helix-core/src/main/java/org/apache/helix/task/AbstractTaskDispatcher.java


> TASK: Handle null currentState for unscheduled tasks
> 
>
> Key: HELIX-777
> URL: https://issues.apache.org/jira/browse/HELIX-777
> Project: Apache Helix
>  Issue Type: Improvement
>Reporter: Hunter L
>Assignee: Hunter L
>Priority: Major
>
> It was observed that when a workflow is submitted and the Controller attempts 
> to schedule its tasks, ZK read fails to read the appropriate job's context, 
> causing the job to be stuck in an unscheduled state. The job remained 
> unscheduled because it had no currentStates, and its job context did not 
> contain any assignment/state information. This RB fixes such stuck states by 
> detecting null currentStates.
> Changelist:
> 1. Check if currentState is null and if it is, manually assign an INIT state


