helix git commit: Batch write operations of task framework context update

2018-10-29 Thread jxue
Repository: helix Updated Branches: refs/heads/master 5033785c2 -> 923002e8e Batch write operations of task framework context update Existing task framework pipeline performance is limited by updating Workflow/Job Contexts to ZK. Current controller will update context even after a simple

[2/2] helix git commit: Fix issue for keeping sending state transitions

2018-10-29 Thread jxue
Fix issue for keeping sending state transitions We encountered a problem where Helix keeps sending state transitions for a cluster that is already in a stable state. The root cause is that periodic rebalance sends the event without cloning the event object. Two pipelines then share the same cache, which may cause
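
The cloning idea described in this commit message can be illustrated with a minimal sketch. It is not the actual Helix change: SketchEvent, its attribute map, and copy() are hypothetical names invented here, and the sketch only assumes that an event carries named attributes (such as a cache) that each pipeline may set.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a controller event; not Helix's real event class.
final class SketchEvent {
    private final String name;
    private final Map<String, Object> attributes = new HashMap<>();

    SketchEvent(String name) {
        this.name = name;
    }

    void addAttribute(String key, Object value) {
        attributes.put(key, value);
    }

    Object getAttribute(String key) {
        return attributes.get(key);
    }

    // Shallow copy: the new event gets its own attribute map, so adding or
    // replacing an attribute on one event does not affect the other.
    SketchEvent copy() {
        SketchEvent c = new SketchEvent(name);
        c.attributes.putAll(attributes);
        return c;
    }
}
```

If each pipeline received the result of copy() rather than the same object, an attribute set by one pipeline would not be seen by the other.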

[1/2] helix git commit: Add warn log when Helix controller puts a cluster into maintenance mode.

2018-10-29 Thread jxue
Repository: helix Updated Branches: refs/heads/master 923002e8e -> a6863937c Add warn log when Helix controller puts a cluster into maintenance mode. Project: http://git-wip-us.apache.org/repos/asf/helix/repo Commit: http://git-wip-us.apache.org/repos/asf/helix/commit/ce167f54 Tree:

[2/2] helix git commit: HELIX-1269: improve semantics for BaseDataAccessor.remove()

2018-10-29 Thread jxue
HELIX-1269: improve semantics for BaseDataAccessor.remove() Project: http://git-wip-us.apache.org/repos/asf/helix/repo Commit: http://git-wip-us.apache.org/repos/asf/helix/commit/a6863937 Tree: http://git-wip-us.apache.org/repos/asf/helix/tree/a6863937 Diff:

[1/2] helix git commit: Log improvements

2018-10-29 Thread jxue
Repository: helix Updated Branches: refs/heads/master a6863937c -> 53a6791e7 Log improvements Project: http://git-wip-us.apache.org/repos/asf/helix/repo Commit: http://git-wip-us.apache.org/repos/asf/helix/commit/c3297ae4 Tree: http://git-wip-us.apache.org/repos/asf/helix/tree/c3297ae4 Diff:

[jira] [Created] (HELIX-768) TASK: Fix a bug in WorkflowAccessor

2018-10-29 Thread Hunter L (JIRA)
Hunter L created HELIX-768: -- Summary: TASK: Fix a bug in WorkflowAccessor Key: HELIX-768 URL: https://issues.apache.org/jira/browse/HELIX-768 Project: Apache Helix Issue Type: Improvement

[2/2] helix git commit: Introduce Helix ZkClient factory. And use the factory to generate new ZkClient in the critical Helix components.

2018-10-29 Thread jxue
Introduce Helix ZkClient factory. And use the factory to generate new ZkClient in the critical Helix components. The motivation of this change is sharing ZkConnection as much as possible. DedicatedZkClient: the client that uses its own connection. SharedZkClient: the client that uses a shared
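
The dedicated-versus-shared distinction above can be sketched as follows. This is an illustration only: SketchConnection, SketchClient, SketchClientFactory, and both build methods are hypothetical names, not the Helix API, and no real ZooKeeper connection is opened.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical placeholder for a connection; it only records the address.
final class SketchConnection {
    final String address;

    SketchConnection(String address) {
        this.address = address;
    }
}

// Hypothetical client that simply holds a reference to its connection.
final class SketchClient {
    final SketchConnection connection;

    SketchClient(SketchConnection connection) {
        this.connection = connection;
    }
}

final class SketchClientFactory {
    // One cached connection per address, reused by all shared clients.
    private final Map<String, SketchConnection> sharedConnections = new HashMap<>();

    // Dedicated client: always creates a connection of its own.
    SketchClient buildDedicatedClient(String address) {
        return new SketchClient(new SketchConnection(address));
    }

    // Shared client: reuses the cached connection for this address if one exists.
    synchronized SketchClient buildSharedClient(String address) {
        SketchConnection conn =
            sharedConnections.computeIfAbsent(address, SketchConnection::new);
        return new SketchClient(conn);
    }
}
```

In this sketch, two shared clients built for the same address hold the same connection object, while a dedicated client never does.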

[1/2] helix git commit: Introduce Helix ZkClient factory. And use the factory to generate new ZkClient in the critical Helix components.

2018-10-29 Thread jxue
Repository: helix Updated Branches: refs/heads/master 281f5d1ec -> 7bb55742e http://git-wip-us.apache.org/repos/asf/helix/blob/7bb55742/helix-core/src/main/java/org/apache/helix/manager/zk/zookeeper/ZkClient.java -- diff --git

[jira] [Commented] (HELIX-769) TASK2.0: Add PropertyKey APIs for new ZNode structure workflow/job paths

2018-10-29 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/HELIX-769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667878#comment-16667878 ] Hudson commented on HELIX-769: -- FAILURE: Integrated in Jenkins build helix #1552 (See

[jira] [Commented] (HELIX-768) TASK: Fix a bug in WorkflowAccessor

2018-10-29 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/HELIX-768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667877#comment-16667877 ] Hudson commented on HELIX-768: -- FAILURE: Integrated in Jenkins build helix #1552 (See

[jira] [Commented] (HELIX-767) TASK: Remove quotaType fields from Workflow and Job Beans

2018-10-29 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/HELIX-767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667876#comment-16667876 ] Hudson commented on HELIX-767: -- FAILURE: Integrated in Jenkins build helix #1552 (See

helix git commit: Enhance the stability of test TestClusterStatusMonitorLifecycle.

2018-10-29 Thread jxue
Repository: helix Updated Branches: refs/heads/master 739adb0d6 -> 67aade646 Enhance the stability of test TestClusterStatusMonitorLifecycle. Project: http://git-wip-us.apache.org/repos/asf/helix/repo Commit: http://git-wip-us.apache.org/repos/asf/helix/commit/67aade64 Tree:

[jira] [Commented] (HELIX-770) HELIX: Fix a possible NPE in loadBalance in IntermediateStateCalcStage

2018-10-29 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/HELIX-770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667902#comment-16667902 ] ASF GitHub Bot commented on HELIX-770: -- GitHub user narendly opened a pull request:

[jira] [Created] (HELIX-770) HELIX: Fix a possible NPE in loadBalance in IntermediateStateCalcStage

2018-10-29 Thread Hunter L (JIRA)
Hunter L created HELIX-770: -- Summary: HELIX: Fix a possible NPE in loadBalance in IntermediateStateCalcStage Key: HELIX-770 URL: https://issues.apache.org/jira/browse/HELIX-770 Project: Apache Helix

[jira] [Created] (HELIX-767) TASK: Remove quotaType fields from Workflow and Job Beans

2018-10-29 Thread Hunter L (JIRA)
Hunter L created HELIX-767: -- Summary: TASK: Remove quotaType fields from Workflow and Job Beans Key: HELIX-767 URL: https://issues.apache.org/jira/browse/HELIX-767 Project: Apache Helix Issue Type:

[jira] [Created] (HELIX-769) TASK2.0: Add PropertyKey APIs for new ZNode structure workflow/job paths

2018-10-29 Thread Hunter L (JIRA)
Hunter L created HELIX-769: -- Summary: TASK2.0: Add PropertyKey APIs for new ZNode structure workflow/job paths Key: HELIX-769 URL: https://issues.apache.org/jira/browse/HELIX-769 Project: Apache Helix

[1/3] helix git commit: [HELIX-767] TASK: Remove quotaType fields from Workflow and Job Beans

2018-10-29 Thread jxue
Repository: helix Updated Branches: refs/heads/master 53a6791e7 -> 739adb0d6 [HELIX-767] TASK: Remove quotaType fields from Workflow and Job Beans For a short while, we used quota type to denote the quota type of Task Framework resources. However, we changed the design so that we are using

[3/3] helix git commit: [HELIX-769] TASK2.0: Add PropertyKey APIs for new ZNode structure workflow/job paths

2018-10-29 Thread jxue
[HELIX-769] TASK2.0: Add PropertyKey APIs for new ZNode structure workflow/job paths As part of ZNode restructuring for Task Framework, we need a convenient way to generate paths to read from and write to ZooKeeper. PropertyKey was already being used for this purpose for the most part

helix git commit: Improve the logic for Task Context update

2018-10-29 Thread jxue
Repository: helix Updated Branches: refs/heads/master 67aade646 -> 281f5d1ec Improve the logic for Task Context update In one batch of context updates, Helix does not need to persist the data back to ZK if it is ready to be removed in the same round. Project:
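
A minimal sketch of the idea in this commit message, under stated assumptions: pending context updates are modeled as a map from a context name to its serialized content, and removals as a set of names. SketchContextBatch and contextsToPersist are hypothetical names, not part of Helix.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

final class SketchContextBatch {
    // Returns only the updates that still need to be written: anything
    // scheduled for removal in the same round is skipped.
    static Map<String, String> contextsToPersist(
            Map<String, String> pendingUpdates, Set<String> toRemove) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : pendingUpdates.entrySet()) {
            if (!toRemove.contains(e.getKey())) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }
}
```

Filtering before the write avoids one write per context that would be deleted immediately afterwards.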

[jira] [Commented] (HELIX-770) HELIX: Fix a possible NPE in loadBalance in IntermediateStateCalcStage

2018-10-29 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/HELIX-770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667951#comment-16667951 ] Hudson commented on HELIX-770: -- FAILURE: Integrated in Jenkins build helix #1555 (See

helix git commit: [HELIX-770] HELIX: Fix a possible NPE in loadBalance in IntermediateStateCalcStage

2018-10-29 Thread jxue
Repository: helix Updated Branches: refs/heads/master 7bb55742e -> cf010f904 [HELIX-770] HELIX: Fix a possible NPE in loadBalance in IntermediateStateCalcStage In isLoadBalanceDownwardForAllReplicas() in IntermediateStateCalcStage, statePriorityMap was throwing a NPE because the partition

[jira] [Commented] (HELIX-770) HELIX: Fix a possible NPE in loadBalance in IntermediateStateCalcStage

2018-10-29 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/HELIX-770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667948#comment-16667948 ] ASF GitHub Bot commented on HELIX-770: -- Github user asfgit closed the pull request at:

[4/4] helix git commit: Using HelixZkClient to replace ZkClient in helix-core and helix-rest.

2018-10-29 Thread jxue
Using HelixZkClient to replace ZkClient in helix-core and helix-rest. 1. Replace as much usage as possible. For the raw ZkClient tests, the usages are kept. 2. For backward compatibility, some public interfaces still return ZkClient. Mark them as Deprecated. Project:

[1/4] helix git commit: Refactor WorkflowRebalancer to WorkflowHandler

2018-10-29 Thread jxue
Repository: helix Updated Branches: refs/heads/master cf010f904 -> 9d7364d7a Refactor WorkflowRebalancer to WorkflowHandler The current WorkflowRebalancer is somewhat messy because it mixes workflow update and scheduling logic together. Refactor WorkflowRebalancer to WorkflowHandler, which will

[3/4] helix git commit: Using HelixZkClient to replace ZkClient in helix-core and helix-rest.

2018-10-29 Thread jxue
http://git-wip-us.apache.org/repos/asf/helix/blob/9d7364d7/helix-core/src/test/java/org/apache/helix/integration/TestDriver.java -- diff --git a/helix-core/src/test/java/org/apache/helix/integration/TestDriver.java

[4/5] helix git commit: [HELIX-765] TASK: Build quota profile from scratch every rebalance

2018-10-29 Thread jxue
[HELIX-765] TASK: Build quota profile from scratch every rebalance It has been reported that instances have a full quota despite no tasks existing in their CURRENTSTATES. The cause of this is not clear, so making ClusterDataCache trigger a refresh of all AssignableInstances will ensure that

[3/5] helix git commit: [HELIX-764] TASK: Fix LiveInstanceCurrentState change flag

2018-10-29 Thread jxue
[HELIX-764] TASK: Fix LiveInstanceCurrentState change flag Previously, existsLiveInstanceOrCurrentStateChange was getting reset in ClusterDataCache when its getter was called. This was problematic because if there were multiple jobs or multiple workflows, whoever calls this getter would get
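
The getter-with-side-effect problem described above can be sketched as follows. The snippet is truncated, so the actual fix is not shown here; SketchChangeFlag and its methods are hypothetical names used only to contrast a read that clears the flag with a read that leaves it alone.

```java
final class SketchChangeFlag {
    private boolean changed;

    void markChanged() {
        changed = true;
    }

    // Problematic pattern: reading clears the flag, so only the first
    // caller in a round observes the change.
    boolean getAndReset() {
        boolean value = changed;
        changed = false;
        return value;
    }

    // Side-effect-free read: every caller in the same round sees the same value.
    boolean isChanged() {
        return changed;
    }

    // Explicit reset, intended to be called once after all readers are done.
    void reset() {
        changed = false;
    }
}
```

With isChanged() and an explicit reset(), several jobs or workflows can consult the flag in one round without the first reader hiding the change from the rest.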

[2/5] helix git commit: [HELIX-763] Task:Ignore tasks whose workflow and job are inactive

2018-10-29 Thread jxue
[HELIX-763] Task:Ignore tasks whose workflow and job are inactive It was discovered by manual testing that there were task states in INIT and RUNNING, and they were occupying a thread count even though their parent job or workflow was in an inactive state (terminal or stopped). This was

[1/5] helix git commit: [HELIX-762] TASK: Change LOG mode from info to debug

2018-10-29 Thread jxue
Repository: helix Updated Branches: refs/heads/master d75d5fcdc -> 5033785c2 [HELIX-762] TASK: Change LOG mode from info to debug In production, it was observed that some users were running thousands of tasks, and since AssignableInstance leaves a line of log for each task assigned or

[5/5] helix git commit: [HELIX-766] TASK: Add logging functionality in AssignableInstanceManager

2018-10-29 Thread jxue
[HELIX-766] TASK: Add logging functionality in AssignableInstanceManager In order to debug task-related inquiries and issues, we realized that it would be very helpful if there was a log recording the current quota capacity of all AssignableInstances. This is for cases where we see

[jira] [Commented] (HELIX-765) [TASK] Build quota profile from scratch every rebalance

2018-10-29 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/HELIX-765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667536#comment-16667536 ] Hudson commented on HELIX-765: -- FAILURE: Integrated in Jenkins build helix #1547 (See

[jira] [Commented] (HELIX-762) [TASK] Change LOG mode from info to debug

2018-10-29 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/HELIX-762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667533#comment-16667533 ] Hudson commented on HELIX-762: -- FAILURE: Integrated in Jenkins build helix #1547 (See

[jira] [Commented] (HELIX-764) [TASK] Fix LiveInstanceCurrentState change flag

2018-10-29 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/HELIX-764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667535#comment-16667535 ] Hudson commented on HELIX-764: -- FAILURE: Integrated in Jenkins build helix #1547 (See

[jira] [Commented] (HELIX-766) [TASK] Add logging functionality in AssignableInstanceManager

2018-10-29 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/HELIX-766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667537#comment-16667537 ] Hudson commented on HELIX-766: -- FAILURE: Integrated in Jenkins build helix #1547 (See