[GitHub] helix pull request #140: [HELIX-679] consolidate semantics of recursively de...

2018-03-08 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/140 [HELIX-679] consolidate semantics of recursively delete path in ZkClient This change consolidates semantics of APIs in ZkClient that recursively delete a path * For backward compatibility

[GitHub] helix pull request #146: [HELIX-680] add system setting to unblock TestZkCal...

2018-03-09 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/146 [HELIX-680] add system setting to unblock TestZkCallbackHandlerLeak test with zookeeper 3.4.11 upgrade By adding system property in ZkUnitTestBase `beforeSuite()`, `TestZkCallbackHandlerLeak` can

[GitHub] helix issue #146: [HELIX-680] add system setting to unblock TestZkCallbackHa...

2018-03-14 Thread zhan849
Github user zhan849 commented on the issue: https://github.com/apache/helix/pull/146 @lei-xia just rebased

[GitHub] helix pull request #148: Move RoutingDataCache to BasicDataCache as a sharab...

2018-03-14 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/148 Move RoutingDataCache to BasicDataCache as a sharable component In this commit, I moved the main logic of RoutingDataCache to BasicClusterDataCache under helix.common, to make it a commonly share-able

[GitHub] helix pull request #152: [HELIX-681] don't fail state transition task if we ...

2018-03-19 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/152 [HELIX-681] don't fail state transition task if we fail to remove message or send out relay message This PR includes fix on participant side: 1. Consolidated message deletion logic to Heli

[GitHub] helix pull request #156: [HELIX-682] controller should delete obsolete messa...

2018-03-20 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/156 [HELIX-682] controller should delete obsolete messages with timeout to unblock state transition This RB contains implementations and tests for controller: during MessageGenerationPhase, it checks

[GitHub] helix pull request #152: [HELIX-681] don't fail state transition task if we ...

2018-03-21 Thread zhan849
Github user zhan849 commented on a diff in the pull request: https://github.com/apache/helix/pull/152#discussion_r176182830 --- Diff: helix-core/src/main/java/org/apache/helix/messaging/handling/HelixTask.java --- @@ -168,7 +169,14 @@ public HelixTaskResult call

[GitHub] helix pull request #152: [HELIX-681] don't fail state transition task if we ...

2018-03-21 Thread zhan849
Github user zhan849 commented on a diff in the pull request: https://github.com/apache/helix/pull/152#discussion_r176204863 --- Diff: helix-core/src/main/java/org/apache/helix/util/HelixUtil.java --- @@ -219,4 +220,22 @@ public static String serializeByComma(List objects

[GitHub] helix pull request #156: [HELIX-682] controller should delete obsolete messa...

2018-03-22 Thread zhan849
Github user zhan849 commented on a diff in the pull request: https://github.com/apache/helix/pull/156#discussion_r176498024 --- Diff: helix-core/src/main/java/org/apache/helix/controller/stages/MessageGenerationPhase.java --- @@ -121,6 +131,18 @@ public void process(ClusterEvent

[GitHub] helix pull request #162: [HELIX-683] clean monitoring cache upon helix contr...

2018-03-26 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/162 [HELIX-683] clean monitoring cache upon helix controller enable monitoring In this PR I added methods to clear monitoring records in cache when we enable cluster status monitoring. I also added

[GitHub] helix pull request #173: [HELIX-689] remove redundant logs from zkclient

2018-04-03 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/173 [HELIX-689] remove redundant logs from zkclient Currently, in controller message cleanup, we print out 2 lines of message when message does not exist, which is totally redundant. In this PR, I

[GitHub] helix pull request #175: [HELIX-692] use map instead of list to avoid deleti...

2018-04-06 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/175 [HELIX-692] use map instead of list to avoid deleting redundant message during cleanup Currently in MessageGenerationPhase, we are using list to store messages to GC. However, pending message is
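The motivation for a map over a list can be illustrated with a minimal sketch. It assumes the concern is that the same message can be collected for cleanup more than once; `Msg`, `collectForPurge`, and the paths are hypothetical stand-ins, not Helix's actual classes or the code in this PR.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MessageGcSketch {
    /** Hypothetical stand-in for a message: only an id and a ZooKeeper-style path. */
    static final class Msg {
        final String id;
        final String path;

        Msg(String id, String path) {
            this.id = id;
            this.path = path;
        }
    }

    /** Collects messages to purge keyed by id, so a message seen twice is kept once. */
    static Map<String, Msg> collectForPurge(List<Msg> candidates) {
        Map<String, Msg> toPurge = new LinkedHashMap<>();
        for (Msg m : candidates) {
            // putIfAbsent ignores a second entry with the same id.
            toPurge.putIfAbsent(m.id, m);
        }
        return toPurge;
    }

    public static void main(String[] args) {
        List<Msg> candidates = Arrays.asList(
            new Msg("m1", "/MESSAGES/m1"),
            new Msg("m1", "/MESSAGES/m1"), // duplicate of the first entry
            new Msg("m2", "/MESSAGES/m2"));
        System.out.println(collectForPurge(candidates).keySet());
    }
}
```

With a list, the duplicate entry would trigger a second delete for a node that is already gone; keying by id means each message is deleted at most once.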

[GitHub] helix pull request #174: [HELIX-691] Allow users to update InstanceConfig

2018-04-09 Thread zhan849
Github user zhan849 commented on a diff in the pull request: https://github.com/apache/helix/pull/174#discussion_r180178181 --- Diff: helix-rest/src/main/java/org/apache/helix/rest/server/resources/helix/InstanceAccessor.java --- @@ -315,26 +316,27 @@ public Response

[GitHub] helix pull request #174: [HELIX-691] Allow users to update InstanceConfig

2018-04-09 Thread zhan849
Github user zhan849 commented on a diff in the pull request: https://github.com/apache/helix/pull/174#discussion_r180175528 --- Diff: helix-rest/src/main/java/org/apache/helix/rest/server/resources/helix/InstanceAccessor.java --- @@ -223,60 +224,60 @@ public Response

[GitHub] helix pull request #174: [HELIX-691] Allow users to update InstanceConfig

2018-04-09 Thread zhan849
Github user zhan849 commented on a diff in the pull request: https://github.com/apache/helix/pull/174#discussion_r180174778 --- Diff: helix-rest/src/main/java/org/apache/helix/rest/server/resources/helix/ResourceAccessor.java --- @@ -84,10 +96,66 @@ public Response getResources

[GitHub] helix issue #179: Unique thread id for the threads that execute Tasks

2018-04-12 Thread zhan849
Github user zhan849 commented on the issue: https://github.com/apache/helix/pull/179 @DImuthuUpe do you know how the scheduled thread pool manages threads internally? Does it always keep a fixed number of 40 threads, or does it create / delete threads when needed, with 40 being just an upper
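For reference, the JDK's `ScheduledThreadPoolExecutor` is documented to act as a fixed-size pool of `corePoolSize` threads with an unbounded queue, and it starts those threads on demand rather than all at once. The sketch below observes this directly; it assumes the pool in question is a plain `ScheduledThreadPoolExecutor` built with 40 core threads, which this thread does not confirm. `PoolSizeSketch` and `observe` are hypothetical names.

```java
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PoolSizeSketch {
    /** Returns {corePoolSize, pool size before any task, pool size after one task}. */
    static int[] observe(int coreThreads) {
        ScheduledThreadPoolExecutor pool = new ScheduledThreadPoolExecutor(coreThreads);
        try {
            int before = pool.getPoolSize();
            Runnable noop = () -> { };
            // Scheduling a task starts one worker thread if fewer than corePoolSize exist.
            pool.schedule(noop, 1, TimeUnit.MILLISECONDS);
            int after = pool.getPoolSize();
            return new int[] {pool.getCorePoolSize(), before, after};
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        int[] sizes = observe(40);
        System.out.println("core=" + sizes[0] + " before=" + sizes[1] + " after=" + sizes[2]);
    }
}
```

Core threads do not time out by default, so once started they stay alive until the pool is shut down.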

[GitHub] helix pull request #180: Two minor fixes

2018-04-13 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/180 Two minor fixes 1. add handler time information in HelixTask log 2. fix broken TestExternalViewUpdates You can merge this pull request into a Git repository by running: $ git pull https

[GitHub] helix pull request #181: [HELIX-690] batch message execution should not shar...

2018-04-16 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/181 [HELIX-690] batch message execution should not share same context In this PR, I added deep copy methods to NotificationContext so when processing messages in batch, different threads would not share
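The idea of giving each batch task its own context can be sketched as follows. `Context` here is a hypothetical class with string attributes; it is not Helix's `NotificationContext`, and `copy()` is an assumed name rather than the method this PR adds.

```java
import java.util.HashMap;
import java.util.Map;

class ContextCopySketch {
    /** Hypothetical context holding string attributes; not Helix's NotificationContext. */
    static final class Context {
        private final Map<String, String> attributes = new HashMap<>();

        void put(String key, String value) {
            attributes.put(key, value);
        }

        String get(String key) {
            return attributes.get(key);
        }

        /** Returns an independent copy: a new map whose values are immutable Strings. */
        Context copy() {
            Context c = new Context();
            c.attributes.putAll(attributes);
            return c;
        }
    }

    public static void main(String[] args) {
        Context shared = new Context();
        shared.put("cluster", "demo");

        // Each task works on its own copy, so its writes stay private.
        Context perTask = shared.copy();
        perTask.put("messageId", "m1");

        System.out.println(perTask.get("cluster"));
        System.out.println(shared.get("messageId") == null);
    }
}
```

The copy is only fully independent here because the values are immutable; a context holding mutable objects would need those copied as well.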

[GitHub] helix pull request #182: [HELIX-695] add helix manager listener for new conn...

2018-04-16 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/182 [HELIX-695] add helix manager listener for new connection notification In this PR I added invocation and related tests of `stateListener.onConnected()` method in ZkHelixManager when it is connected

[GitHub] helix issue #180: Two minor fixes

2018-04-18 Thread zhan849
Github user zhan849 commented on the issue: https://github.com/apache/helix/pull/180 @lei-xia done

[GitHub] helix pull request #184: fix broken TestTaskCreateThrottling

2018-04-19 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/184 fix broken TestTaskCreateThrottling this PR fixes a broken test You can merge this pull request into a Git repository by running: $ git pull https://github.com/zhan849/helix harry/test-fix

[GitHub] helix pull request #191: [HELIX-696] fix workflow state flip-flop issue

2018-04-19 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/191 [HELIX-696] fix workflow state flip-flop issue Fixed issues: *After timeout, timer is not scheduled to clean it up when workflow expires *After timeout, state handling logic is messy that

[GitHub] helix pull request #194: Fix broken TestWorkflowTermination

2018-04-24 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/194 Fix broken TestWorkflowTermination The test was broken by an earlier temp fix that no longer sets JobState to NOT_STARTED when initializing workflow context; this PR fixes the test according to that temp fix You can

[GitHub] helix pull request #195: [HELIX-682] delete duplicated message and log error...

2018-04-24 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/195 [HELIX-682] delete duplicated message and log error in HelixTaskExecutor on participant This PR is the second part of message dedup on participant side You can merge this pull request into a Git

[GitHub] helix pull request #197: [HELIX-681] change controller msg purge timeout to ...

2018-04-24 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/197 [HELIX-681] change controller msg purge timeout to larger number Changed message purge delay to 1min, updated tests accordingly. You can merge this pull request into a Git repository by running

[GitHub] helix issue #201: add null check in DeplayedAutoRebalancerz#computeNewIdealS...

2018-04-30 Thread zhan849
Github user zhan849 commented on the issue: https://github.com/apache/helix/pull/201 1. IDEs are already doing NPE checking for us. 2. You are just detecting null and throwing another exception; how is that different from an NPE?

[GitHub] helix pull request #200: throw new Exception to avoid ugly NPE

2018-04-30 Thread zhan849
Github user zhan849 commented on a diff in the pull request: https://github.com/apache/helix/pull/200#discussion_r185050120 --- Diff: helix-core/src/main/java/org/apache/helix/tools/commandtools/ZkGrep.java --- @@ -463,9 +463,8 @@ static File gunzip(File zipFile

[GitHub] helix issue #201: add null check in DeplayedAutoRebalancerz#computeNewIdealS...

2018-05-03 Thread zhan849
Github user zhan849 commented on the issue: https://github.com/apache/helix/pull/201 @lujiefsi currently we have the assumption that resource will have state model def, which is registered by ParticipantManager. If you really want to fix the issue, then I'd suggest doin

[GitHub] helix pull request #200: throw new Exception to avoid ugly NPE

2018-05-03 Thread zhan849
Github user zhan849 commented on a diff in the pull request: https://github.com/apache/helix/pull/200#discussion_r185911985 --- Diff: helix-core/src/main/java/org/apache/helix/tools/commandtools/ZkGrep.java --- @@ -463,9 +463,8 @@ static File gunzip(File zipFile

[GitHub] helix pull request #204: [HELIX-705]: Participant duplicated state transitio...

2018-06-25 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/204 [HELIX-705]: Participant duplicated state transition handling rework Re-implemented helix task executor state transition message dedup logic, and added tests for verifying it: - Duplicated

[GitHub] helix pull request #206: [HELIX-706] process tev and persist assignment asyn...

2018-06-26 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/206 [HELIX-706] process tev and persist assignment asynchronously Added async worker in generic helix controller to process persist assignment stage and tev generation state asynchronously You can

[GitHub] helix pull request #208: [HELIX-709] Prepare controller stages for async exe...

2018-06-28 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/208 [HELIX-709] Prepare controller stages for async execution - Implemented AbstractAsyncBaseStage - Refactored TEVCalcState and PersistAssignmentStage to use AbstractAsyncBaseStage You can merge

[GitHub] helix pull request #209: [HELIX-710] Create abstract state model for distrib...

2018-06-28 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/209 [HELIX-710] Create abstract state model for distributed leader standby helix service This RB abstracts a leader standby state model that helix services such as controller or other services would

[GitHub] helix pull request #214: [HELIX-709] Move external view calculation to async...

2018-07-09 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/214 [HELIX-709] Move external view calculation to async stage and re-organize pipeline - Separated controller pipeline to execute external view compute async and as early as possible - renamed

[GitHub] helix pull request #218: minor logging improvements

2018-07-09 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/218 minor logging improvements minor log fixes You can merge this pull request into a Git repository by running: $ git pull https://github.com/zhan849/helix harry/minor-improvements Alternatively

[GitHub] helix pull request #219: [HELIX-717] Add api for get / set quota type, ratio...

2018-07-09 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/219 [HELIX-717] Add api for get / set quota type, ratio and participant capacity Add api for get / set quota type, ratio and participant capacity You can merge this pull request into a Git repository by

[GitHub] helix pull request #220: [HELIX-718] implement TaskAssignResult

2018-07-09 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/220 [HELIX-718] implement TaskAssignResult Implement TaskAssignResult as a part of task assigner You can merge this pull request into a Git repository by running: $ git pull https://github.com

[GitHub] helix pull request #222: [HELIX-718] implement AssignableInstance

2018-07-09 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/222 [HELIX-718] implement AssignableInstance Implement AssignableInstance and related tests as a part of task assigner You can merge this pull request into a Git repository by running: $ git pull

[GitHub] helix pull request #223: [HELIX-718] provide a method in AssignableInstance ...

2018-07-09 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/223 [HELIX-718] provide a method in AssignableInstance to set current assignment This is required because when an assignable instance is initialized, it needs to recover its current states You can merge this

[GitHub] helix pull request #224: [HELIX-718] implement ThreadCountBasedTaskAssigner

2018-07-09 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/224 [HELIX-718] implement ThreadCountBasedTaskAssigner In this RB, I implemented a thread count based task assigner that is optimized for short-term use cases. It assumes: - All tasks to assign have

[GitHub] helix pull request #248: helix manager should support getting metadata store...

2018-07-16 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/248 helix manager should support getting metadata store connection string Add an API to get metadata store connection string in Helix Manager You can merge this pull request into a Git repository by

[GitHub] helix issue #248: helix manager should support getting metadata store connec...

2018-07-16 Thread zhan849
Github user zhan849 commented on the issue: https://github.com/apache/helix/pull/248 @kishoreg We have a request here that it would be handy to retrieve zk address from ZkHelixManager for user components to perform some customized operations in ZooKeeper without sharing same ZkClient

[GitHub] helix pull request #257: [HELIX-740] check NPE in getInstancesInClusterWithT...

2018-07-17 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/257 [HELIX-740] check NPE in getInstancesInClusterWithTag and throw more meaningful exception Added cluster config check in `getInstancesInClusterWithTag()` and throw IllegalStateException when
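The pattern described, checking for a missing config up front and failing with a descriptive exception instead of a bare NPE, can be sketched like this. The class, method, and map-based lookup are hypothetical; the actual condition and message used in `getInstancesInClusterWithTag()` are not shown in this thread.

```java
import java.util.HashMap;
import java.util.Map;

class ConfigCheckSketch {
    /**
     * Looks up a cluster's config and fails with a descriptive message when it is missing.
     * The map is a hypothetical stand-in for whatever store holds cluster configs.
     */
    static String requireClusterConfig(String clusterName, Map<String, String> configs) {
        String config = configs.get(clusterName);
        if (config == null) {
            throw new IllegalStateException(
                "Cluster config not found for cluster: " + clusterName);
        }
        return config;
    }

    public static void main(String[] args) {
        Map<String, String> configs = new HashMap<>();
        configs.put("clusterA", "topology=zone");

        System.out.println(requireClusterConfig("clusterA", configs));
        try {
            requireClusterConfig("clusterB", configs);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Compared with letting a later dereference throw, the explicit check names the missing cluster at the point where the problem is first detectable.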

[GitHub] helix pull request #258: [HELIX-741] make swap instance more robust and idem...

2018-07-17 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/258 [HELIX-741] make swap instance more robust and idempotent Made swap instance more robust: 1. List ideal state names and read ideal state individually to avoid partial read 2. remove

[GitHub] helix pull request #264: bump ivy file versions and disable helix-front buil...

2018-07-31 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/264 bump ivy file versions and disable helix-front build After releasing open source 0.8.2 stable release, we need to modify all ivy files to match the version specified in pom file in root directory

[GitHub] helix pull request #266: Propose design for aggregated cluster view service

2018-08-20 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/266 Propose design for aggregated cluster view service This PR adds a design doc for aggregated cluster view service. You can merge this pull request into a Git repository by running: $ git pull

[GitHub] helix pull request #270: [HELIX-753] Record top state handoff finished in si...

2018-09-21 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/270 [HELIX-753] Record top state handoff finished in single cluster data cache refresh This PR adds top state handoff reporting when a single pipeline refresh catches the entire handoff process, which

[GitHub] helix pull request #266: Propose design for aggregated cluster view service

2018-10-23 Thread zhan849
Github user zhan849 commented on a diff in the pull request: https://github.com/apache/helix/pull/266#discussion_r227562948 --- Diff: designs/aggregated-cluster-view/design.md --- @@ -0,0 +1,353 @@ +Aggregated Cluster View Design

[GitHub] helix pull request #266: Propose design for aggregated cluster view service

2018-10-23 Thread zhan849
Github user zhan849 commented on a diff in the pull request: https://github.com/apache/helix/pull/266#discussion_r227597163 --- Diff: designs/aggregated-cluster-view/design.md --- @@ -0,0 +1,353 @@ +Aggregated Cluster View Design

[GitHub] helix pull request #278: [HELIX-771] More detailed top state handoff metrics

2018-10-30 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/278 [HELIX-771] More detailed top state handoff metrics Added more details about top state handoff to distinguish helix latency and user latency We define there are 2 types of handoff

[GitHub] helix pull request #280: [HELIX-772] add TaskDriver.addUserContent() api and...

2018-10-30 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/280 [HELIX-772] add TaskDriver.addUserContent() api and related tests Implemented TaskDriver.addUserContent() Added test (TestGetSetUserContentStore) for testing all getter/setter for user

[GitHub] helix pull request #281: [HELIX-773] add getLastScheduledTaskTimestamp infor...

2018-10-30 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/281 [HELIX-773] add getLastScheduledTaskTimestamp information in workflow rest API - Added TaskExecutionInfo object to wrap task execution information - added TaskExecutionInfo to last scheduled

[GitHub] helix pull request #282: [HELIX-775] add task driver support for helix rest ...

2018-10-31 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/282 [HELIX-775] add task driver support for helix rest to add/get task fr… …amework user content consolidate user content related apis for task driver To consolidate task

[GitHub] helix pull request #283: [HELIX-775] consolidate user content related apis f...

2018-10-31 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/283 [HELIX-775] consolidate user content related apis for task driver HELIX-1315: consolidate user content related apis for task driver To consolidate task driver user content related apis

[GitHub] helix pull request #285: [HELIX-779] do not clean list field in maintenance ...

2018-11-01 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/285 [HELIX-779] do not clean list field in maintenance rebalancer for new resources Setting list fields to empty map will prevent newly added and initially rebalanced resources during maintenance mode

[GitHub] helix pull request #287: [HELIX-780] add get/add job user content rest api

2018-11-01 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/287 [HELIX-780] add get/add job user content rest api added apis and tests You can merge this pull request into a Git repository by running: $ git pull https://github.com/zhan849/helix harry/tf

[GitHub] helix pull request #289: [HELIX-780] add task user content related api and a...

2018-11-01 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/289 [HELIX-780] add task user content related api and added more tests - added get/add task user content rest api - consolidated rest api behavior: when getting/adding user content, if job/workflow

[GitHub] helix pull request #290: fix potential NPE in TopStateHandoffReportStage

2018-11-01 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/290 fix potential NPE in TopStateHandoffReportStage You can merge this pull request into a Git repository by running: $ git pull https://github.com/zhan849/helix harry/minor-fixes Alternatively

[GitHub] helix pull request #291: Fix unstable TestControllerLeadershipChange

2018-11-01 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/291 Fix unstable TestControllerLeadershipChange - make setLeader more reliable - restart participant after manager 1 regain leadership - use cluster verifier to wait for cluster converge You can

[GitHub] helix pull request #292: [HELIX-785] Record helix latency instead of user la...

2018-11-02 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/292 [HELIX-785] Record helix latency instead of user latency in top state handoff metrics - top state handoff reports helix latency instead of user latency - modified test cases You can merge this

[GitHub] helix pull request #294: Implement view cluster aggregator

2018-11-02 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/294 Implement view cluster aggregator Based on #266 , design of helix view aggregator, here is the implementation. Helix view aggregator will be a different module under helix repo, and the impl will

[GitHub] helix pull request #296: Skip resources with state model def ref as Task dur...

2018-11-14 Thread zhan849
GitHub user zhan849 opened a pull request: https://github.com/apache/helix/pull/296 Skip resources with state model def ref as Task during top state handoff We should not report top state handoff for resources with state model def ref as "Task" as this is meaningless a