[ https://issues.apache.org/jira/browse/HELIX-753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664402#comment-16664402 ]
Hudson commented on HELIX-753:
------------------------------

FAILURE: Integrated in Jenkins build helix #1545 (See [https://builds.apache.org/job/helix/1545/])
[HELIX-753] Record top state handoff finished in single cluster data (hrzhang: rev 67ff66b4897309c785b8b42863e95734eba81aab)
* (edit) helix-core/src/test/java/org/apache/helix/monitoring/mbeans/TestTopStateHandoffMetrics.java
* (edit) helix-core/src/main/java/org/apache/helix/controller/GenericHelixController.java
* (edit) helix-core/src/main/java/org/apache/helix/controller/stages/ClusterEvent.java
* (edit) helix-core/src/main/java/org/apache/helix/controller/stages/CurrentStateComputationStage.java
* (edit) helix-core/src/test/resources/TestTopStateHandoffMetrics.json
* (edit) helix-core/src/main/java/org/apache/helix/controller/stages/AttributeName.java


> Record top state handoff finished in single cluster data cache refresh
> ----------------------------------------------------------------------
>
>                 Key: HELIX-753
>                 URL: https://issues.apache.org/jira/browse/HELIX-753
>             Project: Apache Helix
>          Issue Type: Bug
>            Reporter: Harry Zhang
>            Assignee: Harry Zhang
>            Priority: Major
>
> Currently we calculate top state handoff duration by doing the
> following:
> - record the missing top state when we see a top state go missing
> - record the top state coming back when we see it come back
> - report the top state handoff duration
> This is perfectly fine for non-P2P state transitions, as the entire top state
> handoff process will always take >= 2 pipeline runs. However, for P2P
> enabled clusters, top state handoffs are quick, and if a handoff is quicker
> than the cluster data refresh stage latency, we will lose a lot of short top
> state handoffs, which makes the numbers on ingraph look miserable.
> We need to revise the top state handoff metrics implementation so that we
> don't lose data points and skew the statistics (i.e. we are currently losing
> all short handoffs).
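
The gap described in the issue can be illustrated with a minimal sketch. This is not the Helix implementation: `HandoffSketch`, `Snapshot`, and every field and method name below are hypothetical, and the sketch assumes each refresh also exposes when the current holder reached the top state and when the previous holder left it. The naive tracker follows the three steps from the description and only sees a handoff that spans at least two refreshes; the second tracker also reports a holder change that started and finished between two refreshes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names come from the Helix code base.
final class HandoffSketch {

    /** What a controller might see for one partition at one cache refresh (assumed shape). */
    static final class Snapshot {
        final String holder;       // replica currently in the top state, or null if none
        final long observedAtMs;   // time of this cache refresh
        final long holderSinceMs;  // assumption: when `holder` reached the top state
        final long prevLeftMs;     // assumption: when the previous holder left the top state

        Snapshot(String holder, long observedAtMs, long holderSinceMs, long prevLeftMs) {
            this.holder = holder;
            this.observedAtMs = observedAtMs;
            this.holderSinceMs = holderSinceMs;
            this.prevLeftMs = prevLeftMs;
        }
    }

    /** Steps from the issue: record missing, record come back, report the duration. */
    static List<Long> naive(List<Snapshot> snapshots) {
        List<Long> durations = new ArrayList<>();
        long missingSince = -1;
        for (Snapshot s : snapshots) {
            if (s.holder == null) {
                if (missingSince < 0) {
                    missingSince = s.observedAtMs;
                }
            } else if (missingSince >= 0) {
                durations.add(s.observedAtMs - missingSince);
                missingSince = -1;
            }
        }
        return durations;
    }

    /** Same as naive, but also reports a handoff that finished within a single refresh. */
    static List<Long> singleRefreshAware(List<Snapshot> snapshots) {
        List<Long> durations = new ArrayList<>();
        long missingSince = -1;
        String lastHolder = null;
        for (Snapshot s : snapshots) {
            if (s.holder == null) {
                if (missingSince < 0) {
                    missingSince = s.observedAtMs;
                }
                continue;
            }
            if (missingSince >= 0) {
                // Handoff observed across refreshes: same rule as the naive tracker.
                durations.add(s.observedAtMs - missingSince);
                missingSince = -1;
            } else if (lastHolder != null && !s.holder.equals(lastHolder)) {
                // Holder changed with no observed gap: use the assumed timestamps.
                durations.add(s.holderSinceMs - s.prevLeftMs);
            }
            lastHolder = s.holder;
        }
        return durations;
    }

    /** Made-up data: a 20 ms handoff between refreshes, then one spanning two refreshes. */
    static List<Snapshot> demo() {
        return Arrays.asList(
            new Snapshot("A", 0L, 0L, 0L),
            new Snapshot("B", 1000L, 620L, 600L),
            new Snapshot(null, 2000L, 0L, 0L),
            new Snapshot("C", 3000L, 2900L, 1900L));
    }

    public static void main(String[] args) {
        System.out.println(naive(demo()));              // prints [1000]
        System.out.println(singleRefreshAware(demo())); // prints [20, 1000]
    }
}
```

With the made-up data in `demo()`, the naive tracker misses the 20 ms handoff from A to B entirely, which is the loss of short handoffs that the issue reports.
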
> AC:
> - revise impl so we catch those short top state hand-offs
> - write new tests to catch the fix if needed



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)