rahulrane50 opened a new issue, #2383: URL: https://github.com/apache/helix/issues/2383
### Describe the bug

In `TopStateHandoffReportStage`, when a top state is detected and Helix has a previous missing-top-state record ([code pointer](https://github.com/apache/helix/blob/e307aa91f9782e82edfa6259d053ca079cbda7c9/helix-core/src/main/java/org/apache/helix/controller/stages/TopStateHandoffReportStage.java#L212)), Helix determines the start time and end time of the handoff and reports the handoff duration. However, the duration is reported only if it does not exceed the configured threshold ([code pointer](https://github.com/apache/helix/blob/e307aa91f9782e82edfa6259d053ca079cbda7c9/helix-core/src/main/java/org/apache/helix/controller/stages/TopStateHandoffReportStage.java#L478)). Ideally, the handoff duration should be reported in either case. Whether such a handoff counts as successful or failed is debatable and we can discuss that, but either way it should be reported, IMHO.

### To Reproduce

Set `missing_top_state_threshold` in the cluster config to some value. When a top-state handoff from one host to another takes longer than that threshold, Helix does not update the handoff duration metrics; it only marks the handoff as "failed" and increments `failedTopStateHandsOffCounter`.

### Expected behavior

The handoff duration should be reported even when it exceeds the threshold; the handoff can still be marked as failed.
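To make the difference concrete, here is a minimal Java sketch of the reporting decision as described in this issue. It is a simplified model, not the actual `TopStateHandoffReportStage` code: the class `HandoffReportSketch`, the `Report` type, and the methods `currentBehavior` and `expectedBehavior` are all hypothetical names introduced only for illustration, and the comparison `duration > threshold` is an assumption based on the "takes more than set threshold" wording above.

```java
import java.util.OptionalLong;

// Hypothetical, simplified model of the reporting decision described in this issue.
// None of these names come from the Helix code base.
public class HandoffReportSketch {

    /** Outcome of one handoff: whether it counted as failed, and which duration (if any) reached the metrics. */
    static final class Report {
        final boolean failed;
        final OptionalLong reportedDurationMs;

        Report(boolean failed, OptionalLong reportedDurationMs) {
            this.failed = failed;
            this.reportedDurationMs = reportedDurationMs;
        }
    }

    /** Behavior described in the bug: over-threshold handoffs are marked failed and their duration is dropped. */
    static Report currentBehavior(long startMs, long endMs, long thresholdMs) {
        long durationMs = endMs - startMs;
        if (durationMs > thresholdMs) {
            return new Report(true, OptionalLong.empty());
        }
        return new Report(false, OptionalLong.of(durationMs));
    }

    /** Behavior requested in the issue: the duration is always reported; the threshold only decides the failed flag. */
    static Report expectedBehavior(long startMs, long endMs, long thresholdMs) {
        long durationMs = endMs - startMs;
        return new Report(durationMs > thresholdMs, OptionalLong.of(durationMs));
    }

    public static void main(String[] args) {
        long thresholdMs = 1_000L; // assumed threshold value, for illustration only
        Report current = currentBehavior(0L, 5_000L, thresholdMs);
        Report expected = expectedBehavior(0L, 5_000L, thresholdMs);
        System.out.println("current:  failed=" + current.failed + ", duration=" + current.reportedDurationMs);
        System.out.println("expected: failed=" + expected.failed + ", duration=" + expected.reportedDurationMs);
    }
}
```

In this model both variants agree for handoffs within the threshold; they differ only for slow handoffs, where the requested behavior keeps the failed flag but no longer loses the duration.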
