Zhen Zhang created HELIX-444:
--------------------------------
Summary: add per-participant partition count gauges to helix
Key: HELIX-444
URL: https://issues.apache.org/jira/browse/HELIX-444
Project: Apache Helix
Issue Type: Improvement
Reporter: Zhen Zhang
Assignee: Zhen Zhang
We need a way to subtract the expected down-partition counts from
DifferenceWithIdealState when an instance is offline, reducing the alert volume
to solely the down-instance notification. Without metrics from Helix indicating
the number of partitions hosted on a given participant, we can't determine
which "DifferenceWithIdealState" counts are expected to be down and which are
an actual difference caused by something other than a node outage.
These should be produced on a per-participant, per-resource basis (e.g.,
helix.i001.participantstatus.ESPRESSO_USCP.ela4-app1234.prod.linkedin.com_11932.ucpx.partitiongauge
= 64)
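
To illustrate how such gauges could be used on the alerting side, here is a
minimal sketch in Python. It is not Helix code: the function names, the
(participant, resource) dictionary layout, and the second host name are
hypothetical, and it assumes the gauge's last known value is still available
after the participant goes offline.

```python
from typing import Dict, Set, Tuple

# Hypothetical input shape: (participant, resource) -> partitions hosted there,
# as last reported by the proposed per-participant partition gauge.
PartitionGauges = Dict[Tuple[str, str], int]


def expected_down_partitions(
    gauges: PartitionGauges, offline: Set[str], resource: str
) -> int:
    """Sum the partitions of `resource` hosted on participants known to be offline."""
    return sum(
        count
        for (participant, res), count in gauges.items()
        if res == resource and participant in offline
    )


def unexplained_difference(
    difference_with_ideal_state: int,
    gauges: PartitionGauges,
    offline: Set[str],
    resource: str,
) -> int:
    """Part of DifferenceWithIdealState not accounted for by offline participants."""
    expected = expected_down_partitions(gauges, offline, resource)
    # Clamp at zero in case the gauge value is stale and overcounts.
    return max(difference_with_ideal_state - expected, 0)


# Example values; the second participant name is made up for illustration.
gauges = {
    ("ela4-app1234.prod.linkedin.com_11932", "ucpx"): 64,
    ("example-host-b_11932", "ucpx"): 64,
}
offline = {"ela4-app1234.prod.linkedin.com_11932"}
```

With these values, a DifferenceWithIdealState of 64 for ucpx is fully explained
by the offline participant, so only the down-instance alert would need to fire;
a value of 70 would leave 6 partitions that warrant a separate alert.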
--
This message was sent by Atlassian JIRA
(v6.2#6252)