[GitHub] helix pull request #278: [HELIX-771] More detailed top state handoff metrics

2018-10-30 Thread zhan849
GitHub user zhan849 opened a pull request:

https://github.com/apache/helix/pull/278

[HELIX-771] More detailed top state handoff metrics


Added more details about top state handoff to distinguish helix latency and 
user latency


We define 2 types of handoff:
- Graceful handoff (controlled top state handoff, e.g. disable instance, 
load balance, etc.)
- Non-graceful handoff (uncontrolled top state handoff, e.g. node crash, etc.)


For graceful handoff, we record total handoff latency and user latency.
For non-graceful handoff, we record total handoff latency only.
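
As an illustration of the two cases above, here is a minimal sketch. All class, method, and field names are hypothetical and are not taken from the Helix code base, and it assumes that Helix latency is simply the total handoff latency minus the user latency, which this message does not state explicitly.

```java
/**
 * Hypothetical sketch only: class, method, and field names are invented for
 * illustration and are not taken from the Helix code base.
 */
final class HandoffLatencySketch {
  enum HandoffType { GRACEFUL, NON_GRACEFUL }

  /** Marker for "user latency was not recorded" (non-graceful handoff). */
  static final long NOT_RECORDED = -1L;

  /** Latencies reported for one completed top state handoff. */
  static final class HandoffReport {
    final HandoffType type;
    final long totalLatencyMs;
    final long userLatencyMs;

    HandoffReport(HandoffType type, long totalLatencyMs, long userLatencyMs) {
      this.type = type;
      this.totalLatencyMs = totalLatencyMs;
      this.userLatencyMs = userLatencyMs;
    }

    /** Assumption: Helix latency is the part of the total not spent in user code. */
    long helixLatencyMs() {
      return userLatencyMs == NOT_RECORDED ? NOT_RECORDED : totalLatencyMs - userLatencyMs;
    }
  }

  /** Graceful handoff: record both total latency and user latency. */
  static HandoffReport graceful(long startMs, long endMs, long userLatencyMs) {
    long total = endMs - startMs;
    if (total < 0 || userLatencyMs < 0 || userLatencyMs > total) {
      throw new IllegalArgumentException("inconsistent handoff timestamps");
    }
    return new HandoffReport(HandoffType.GRACEFUL, total, userLatencyMs);
  }

  /** Non-graceful handoff (e.g. node crash): record total latency only. */
  static HandoffReport nonGraceful(long startMs, long endMs) {
    if (endMs < startMs) {
      throw new IllegalArgumentException("inconsistent handoff timestamps");
    }
    return new HandoffReport(HandoffType.NON_GRACEFUL, endMs - startMs, NOT_RECORDED);
  }

  public static void main(String[] args) {
    HandoffReport g = graceful(1_000L, 1_500L, 300L);
    System.out.println(g.type + " total=" + g.totalLatencyMs
        + " user=" + g.userLatencyMs + " helix=" + g.helixLatencyMs());
    HandoffReport n = nonGraceful(1_000L, 4_000L);
    System.out.println(n.type + " total=" + n.totalLatencyMs);
  }
}
```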


Moved top state handoff metrics to an independent stage to make the logic 
cleaner.
Refactored TestTopStateHandoffMetrics to make it cleaner and to use JSON more 
natively.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zhan849/helix harry/topstate-metrics

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/helix/pull/278.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #278


commit 7e49f995e29ea200fcc42ce6af148ed521979f5c
Author: Harry Zhang 
Date:   2018-10-30T22:55:20Z

[HELIX-771] More detailed top state handoff metrics




---


[jira] [Created] (HELIX-773) Support getLastScheduledTaskTimestamp information in workflow rest api

2018-10-30 Thread Harry Zhang (JIRA)
Harry Zhang created HELIX-773:
-

 Summary: Support getLastScheduledTaskTimestamp information in 
workflow rest api
 Key: HELIX-773
 URL: https://issues.apache.org/jira/browse/HELIX-773
 Project: Apache Helix
  Issue Type: Bug
Reporter: Harry Zhang
Assignee: Harry Zhang


Support getLastScheduledTaskTimestamp information in workflow rest api



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] helix pull request #279: Check in the intermediate state calculate stage for...

2018-10-30 Thread jiajunwang
GitHub user jiajunwang opened a pull request:

https://github.com/apache/helix/pull/279

Check in the intermediate state calculate stage for best possible state.

The resource rebalance pipeline should continue processing resources even if 
some resources cannot be calculated.
This prevents controller management from being stopped by a few 
problematic resources.

Also added exception handling to the resource loops in several stages. 
The idea is that the detailed calculation may throw HelixException, but at the 
top stage layer, these exceptions should not prevent the whole pipeline from 
finishing.
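
The per-resource isolation described above can be sketched roughly as follows. This is not the code from the pull request: the class and method names are hypothetical, and `CalculationException` is a stand-in for HelixException so the example compiles without Helix on the classpath.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch: names are illustrative, not taken from Helix. */
final class ResourceLoopSketch {
  /** Stand-in for HelixException. */
  static final class CalculationException extends RuntimeException {
    CalculationException(String message) {
      super(message);
    }
  }

  /** Outcome of one pass over all resources. */
  static final class Result {
    final Map<String, String> computed = new LinkedHashMap<>();
    final List<String> failed = new ArrayList<>();
  }

  /**
   * Runs the calculator once per resource. A failure in one resource is
   * recorded and skipped so that the remaining resources are still processed.
   */
  static Result computeAll(List<String> resources, Function<String, String> calculator) {
    Result result = new Result();
    for (String resource : resources) {
      try {
        result.computed.put(resource, calculator.apply(resource));
      } catch (CalculationException e) {
        // Catch at the loop level: remember the failure, keep the pipeline going.
        result.failed.add(resource);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Result r = computeAll(Arrays.asList("db1", "broken", "db2"), name -> {
      if (name.equals("broken")) {
        throw new CalculationException("cannot calculate " + name);
      }
      return name + ":MASTER";
    });
    System.out.println("computed=" + r.computed.keySet() + " failed=" + r.failed);
  }
}
```

Catching only the calculation exception inside the loop keeps unrelated failures visible while letting healthy resources complete.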

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jiajunwang/helix master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/helix/pull/279.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #279


commit 0beeb8fa2babe2d8c7bc50a0a454f752b7c96295
Author: Jiajun Wang 
Date:   2018-09-27T23:18:59Z

Check in the intermediate state calculate stage for best possible state.

The resource rebalance pipeline should continue processing resources even if 
some resources cannot be calculated.
This prevents controller management from being stopped by a few 
problematic resources.

Also added exception handling to the resource loops in several stages. 
The idea is that the detailed calculation may throw HelixException, but at the 
top stage layer, these exceptions should not prevent the whole pipeline from 
finishing.




---


[jira] [Commented] (HELIX-773) Support getLastScheduledTaskTimestamp information in workflow rest api

2018-10-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HELIX-773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669438#comment-16669438
 ] 

ASF GitHub Bot commented on HELIX-773:
--

GitHub user zhan849 opened a pull request:

https://github.com/apache/helix/pull/281

[HELIX-773] add getLastScheduledTaskTimestamp information in workflow rest 
API

- Added TaskExecutionInfo object to wrap task execution information
- Added TaskExecutionInfo of the last scheduled task as a workflow property in 
the workflow REST API
- Modified related tests
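
A wrapper object of this kind might look roughly like the sketch below. The field names and the JSON layout are assumptions made for illustration; the actual TaskExecutionInfo class and the REST response format are defined in the pull request, not here.

```java
/**
 * Hypothetical sketch of a value object that wraps task execution
 * information. Field names and JSON layout are assumptions, not the
 * actual Helix TaskExecutionInfo.
 */
final class TaskExecutionInfoSketch {
  final String jobName;
  final int taskPartitionIndex;
  final long startTimestampMs;

  TaskExecutionInfoSketch(String jobName, int taskPartitionIndex, long startTimestampMs) {
    this.jobName = jobName;
    this.taskPartitionIndex = taskPartitionIndex;
    this.startTimestampMs = startTimestampMs;
  }

  /**
   * Renders the object as a small JSON document, as a REST endpoint might.
   * Assumes the job name contains no characters that need JSON escaping.
   */
  String toJson() {
    String job = jobName == null ? "null" : "\"" + jobName + "\"";
    return "{\"jobName\":" + job
        + ",\"taskPartitionIndex\":" + taskPartitionIndex
        + ",\"startTimestampMs\":" + startTimestampMs + "}";
  }

  public static void main(String[] args) {
    TaskExecutionInfoSketch info = new TaskExecutionInfoSketch("job1", 3, 1540942405000L);
    System.out.println(info.toJson());
  }
}
```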

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zhan849/helix harry/workflow-rest

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/helix/pull/281.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #281


commit 917f6b7ee1b2b44b10eea7e5de7f07aa7f184618
Author: Harry Zhang 
Date:   2018-10-30T23:43:25Z

[HELIX-773] add getLastScheduledTaskTimestamp information in workflow rest 
api




> Support getLastScheduledTaskTimestamp information in workflow rest api
> --
>
> Key: HELIX-773
> URL: https://issues.apache.org/jira/browse/HELIX-773
> Project: Apache Helix
>  Issue Type: Bug
>Reporter: Harry Zhang
>Assignee: Harry Zhang
>Priority: Major
>
> Support getLastScheduledTaskTimestamp information in workflow rest api





[jira] [Commented] (HELIX-771) More detailed top state handoff metrics

2018-10-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HELIX-771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669418#comment-16669418
 ] 

ASF GitHub Bot commented on HELIX-771:
--

Github user asfgit closed the pull request at:

https://github.com/apache/helix/pull/278


> More detailed top state handoff metrics
> ---
>
> Key: HELIX-771
> URL: https://issues.apache.org/jira/browse/HELIX-771
> Project: Apache Helix
>  Issue Type: Bug
>  Components: helix-core
>Reporter: Harry Zhang
>Assignee: Harry Zhang
>Priority: Major
>
> To define top state handoff SLA, we need some more detailed data:
>  * graceful top state handoff (e.g. disable instance / resource / etc., both 
> Helix and e2e latency)
>  * abrupt top state handoff (e.g. node crash)
> AC:
>  - prepare metrics, test, code complete





[jira] [Created] (HELIX-771) More detailed top state handoff metrics

2018-10-30 Thread Harry Zhang (JIRA)
Harry Zhang created HELIX-771:
-

 Summary: More detailed top state handoff metrics
 Key: HELIX-771
 URL: https://issues.apache.org/jira/browse/HELIX-771
 Project: Apache Helix
  Issue Type: Bug
  Components: helix-core
Reporter: Harry Zhang
Assignee: Harry Zhang


To define top state handoff SLA, we need some more detailed data:
 * graceful top state handoff (e.g. disable instance / resource / etc., both 
Helix and e2e latency)
 * abrupt top state handoff (e.g. node crash)

AC:
 - prepare metrics, test, code complete





[GitHub] helix pull request #280: [HELIX-772] add TaskDriver.addUserContent() api and...

2018-10-30 Thread zhan849
GitHub user zhan849 opened a pull request:

https://github.com/apache/helix/pull/280

[HELIX-772] add TaskDriver.addUserContent() api and related tests


Implemented TaskDriver.addUserContent()
Added test (TestGetSetUserContentStore) to cover all getters/setters for 
user content
Modified unstable TestIndependentTaskRebalancer

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zhan849/helix harry/add-user-content

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/helix/pull/280.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #280


commit df24f5975bd517626490f14e6e038f8370ddd815
Author: Harry Zhang 
Date:   2018-10-30T23:25:12Z

[HELIX-772] add TaskDriver.addUserContent() api and related tests




---


[GitHub] helix pull request #281: [HELIX-773] add getLastScheduledTaskTimestamp infor...

2018-10-30 Thread zhan849
GitHub user zhan849 opened a pull request:

https://github.com/apache/helix/pull/281

[HELIX-773] add getLastScheduledTaskTimestamp information in workflow rest 
API

- Added TaskExecutionInfo object to wrap task execution information
- Added TaskExecutionInfo of the last scheduled task as a workflow property in 
the workflow REST API
- Modified related tests

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zhan849/helix harry/workflow-rest

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/helix/pull/281.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #281


commit 917f6b7ee1b2b44b10eea7e5de7f07aa7f184618
Author: Harry Zhang 
Date:   2018-10-30T23:43:25Z

[HELIX-773] add getLastScheduledTaskTimestamp information in workflow rest 
api




---


[jira] [Commented] (HELIX-771) More detailed top state handoff metrics

2018-10-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HELIX-771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669413#comment-16669413
 ] 

ASF GitHub Bot commented on HELIX-771:
--

GitHub user zhan849 opened a pull request:

https://github.com/apache/helix/pull/278

[HELIX-771] More detailed top state handoff metrics


Added more details about top state handoff to distinguish helix latency and 
user latency


We define 2 types of handoff:
- Graceful handoff (controlled top state handoff, e.g. disable instance, 
load balance, etc.)
- Non-graceful handoff (uncontrolled top state handoff, e.g. node crash, etc.)


For graceful handoff, we record total handoff latency and user latency.
For non-graceful handoff, we record total handoff latency only.


Moved top state handoff metrics to an independent stage to make the logic 
cleaner.
Refactored TestTopStateHandoffMetrics to make it cleaner and to use JSON more 
natively.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zhan849/helix harry/topstate-metrics

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/helix/pull/278.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #278


commit 7e49f995e29ea200fcc42ce6af148ed521979f5c
Author: Harry Zhang 
Date:   2018-10-30T22:55:20Z

[HELIX-771] More detailed top state handoff metrics




> More detailed top state handoff metrics
> ---
>
> Key: HELIX-771
> URL: https://issues.apache.org/jira/browse/HELIX-771
> Project: Apache Helix
>  Issue Type: Bug
>  Components: helix-core
>Reporter: Harry Zhang
>Assignee: Harry Zhang
>Priority: Major
>
> To define top state handoff SLA, we need some more detailed data:
>  * graceful top state handoff (e.g. disable instance / resource / etc., both 
> Helix and e2e latency)
>  * abrupt top state handoff (e.g. node crash)
> AC:
>  - prepare metrics, test, code complete





[GitHub] helix pull request #279: Check in the intermediate state calculate stage for...

2018-10-30 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/helix/pull/279


---


[jira] [Commented] (HELIX-772) Support TaskDriver.addUserContent() api

2018-10-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/HELIX-772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669436#comment-16669436
 ] 

ASF GitHub Bot commented on HELIX-772:
--

GitHub user zhan849 opened a pull request:

https://github.com/apache/helix/pull/280

[HELIX-772] add TaskDriver.addUserContent() api and related tests


Implemented TaskDriver.addUserContent()
Added test (TestGetSetUserContentStore) to cover all getters/setters for 
user content
Modified unstable TestIndependentTaskRebalancer

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zhan849/helix harry/add-user-content

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/helix/pull/280.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #280


commit df24f5975bd517626490f14e6e038f8370ddd815
Author: Harry Zhang 
Date:   2018-10-30T23:25:12Z

[HELIX-772] add TaskDriver.addUserContent() api and related tests




> Support TaskDriver.addUserContent() api
> ---
>
> Key: HELIX-772
> URL: https://issues.apache.org/jira/browse/HELIX-772
> Project: Apache Helix
>  Issue Type: Bug
>Reporter: Harry Zhang
>Assignee: Harry Zhang
>Priority: Major
>
> Need to support adding user content in the task driver
>  
> AC:
>  * implement API
>  * add test
>  





helix - Build # 1557 - Still Failing

2018-10-30 Thread Apache Jenkins Server
The Apache Jenkins build system has built helix (build #1557)

Status: Still Failing

Check console output at https://builds.apache.org/job/helix/1557/ to view the 
results.

[jira] [Commented] (HELIX-771) More detailed top state handoff metrics

2018-10-30 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HELIX-771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669482#comment-16669482
 ] 

Hudson commented on HELIX-771:
--

FAILURE: Integrated in Jenkins build helix #1557 (See 
[https://builds.apache.org/job/helix/1557/])
[HELIX-771] More detailed top state handoff metrics (hrzhang: rev 
7e49f995e29ea200fcc42ce6af148ed521979f5c)
* (edit) 
helix-core/src/main/java/org/apache/helix/controller/stages/ClusterDataCache.java
* (edit) 
helix-core/src/test/java/org/apache/helix/integration/task/TestIndependentTaskRebalancer.java
* (edit) helix-core/src/test/resources/TestTopStateHandoffMetrics.json
* (add) 
helix-core/src/main/java/org/apache/helix/controller/stages/MissingTopStateRecord.java
* (edit) 
helix-core/src/main/java/org/apache/helix/monitoring/mbeans/ClusterStatusMonitor.java
* (edit) 
helix-core/src/main/java/org/apache/helix/controller/GenericHelixController.java
* (edit) 
helix-core/src/main/java/org/apache/helix/controller/stages/CurrentStateComputationStage.java
* (add) 
helix-core/src/main/java/org/apache/helix/controller/stages/TopStateHandoffReportStage.java
* (edit) 
helix-core/src/main/java/org/apache/helix/monitoring/mbeans/ResourceMonitor.java
* (edit) helix-core/src/main/java/org/apache/helix/task/TaskDriver.java
* (edit) 
helix-core/src/test/java/org/apache/helix/monitoring/mbeans/TestTopStateHandoffMetrics.java
* (edit) 
helix-core/src/main/java/org/apache/helix/controller/stages/TaskGarbageCollectionStage.java


> More detailed top state handoff metrics
> ---
>
> Key: HELIX-771
> URL: https://issues.apache.org/jira/browse/HELIX-771
> Project: Apache Helix
>  Issue Type: Bug
>  Components: helix-core
>Reporter: Harry Zhang
>Assignee: Harry Zhang
>Priority: Major
>
> To define top state handoff SLA, we need some more detailed data:
>  * graceful top state handoff (e.g. disable instance / resource / etc., both 
> Helix and e2e latency)
>  * abrupt top state handoff (e.g. node crash)
> AC:
>  - prepare metrics, test, code complete


