Chun-Hung Hsiao created MESOS-9639:
--------------------------------------

             Summary: Make CSI plugin RPC metrics agnostic to CSI versions.
                 Key: MESOS-9639
                 URL: https://issues.apache.org/jira/browse/MESOS-9639
             Project: Mesos
          Issue Type: Task
          Components: storage
            Reporter: Chun-Hung Hsiao
            Assignee: Chun-Hung Hsiao


Currently SLRP provides per-CSI-call metrics, e.g.:
{noformat}
resource_providers/<rp_type>.<rp_name>/csi_plugin/rpcs/csi.v0.controller.CreateVolume/successes
resource_providers/<rp_type>.<rp_name>/csi_plugin/rpcs/csi.v0.node.NodeGetId/errors
{noformat}
If we continue to provide such fine-grained metrics, then once operators 
upgrade their CSI plugins to CSI v1, SLRP would report a second set of 
metrics for v1, which would be inconvenient for operators.

Also, the fine-grained metrics are not very useful for operators, as most of 
the information is highly correlated with the per-operation metrics. Most likely, 
operators would simply aggregate the per-CSI-call metrics to monitor CSI 
plugins, and use per-operation metrics to monitor volume creation, destruction, etc.
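
For illustration, the kind of operator-side aggregation described above might look like the following minimal sketch. It assumes the metrics are available as a flat key/value mapping and that v1 keys would follow the same pattern as the v0 keys shown earlier (csi.v1.<service>.<call>); the function name and sample values are hypothetical.

```python
import re
from collections import defaultdict

# Matches per-call keys of the form shown above. The CSI version segment
# (v0, v1, ...) is matched but deliberately ignored when aggregating.
# ASSUMPTION: v1 keys mirror the v0 layout; this is not confirmed by the issue.
PER_CALL = re.compile(
    r"^(?P<prefix>resource_providers/[^/]+/csi_plugin)/rpcs/"
    r"csi\.v\d+\.\w+\.\w+/(?P<outcome>\w+)$"
)


def aggregate_rpc_metrics(snapshot):
    """Sum per-CSI-call metrics per resource provider and outcome.

    Returns a dict keyed by (prefix, outcome), e.g. (prefix, "successes").
    Keys that are not per-call CSI metrics are skipped.
    """
    totals = defaultdict(int)
    for key, value in snapshot.items():
        match = PER_CALL.match(key)
        if match is None:
            continue
        totals[(match["prefix"], match["outcome"])] += value
    return dict(totals)


# Hypothetical sample data; the provider name and counts are made up.
RP = "resource_providers/org.example.rp.test/csi_plugin"
sample = {
    RP + "/rpcs/csi.v0.controller.CreateVolume/successes": 3,
    RP + "/rpcs/csi.v1.controller.CreateVolume/successes": 2,
    RP + "/rpcs/csi.v0.node.NodeGetId/errors": 1,
    "unrelated/metric": 100,
}
totals = aggregate_rpc_metrics(sample)
```

Every operator would have to maintain a pattern like this and revisit it on each CSI version bump, which is the inconvenience this ticket aims to remove.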

So instead of providing such fine-grained metrics, we could just provide a set of 
aggregated RPC metrics that are agnostic to CSI versions, such as:
{noformat}
resource_providers/<rp_type>.<rp_name>/csi_plugin/rpcs_pending
resource_providers/<rp_type>.<rp_name>/csi_plugin/rpcs_finished
resource_providers/<rp_type>.<rp_name>/csi_plugin/rpcs_failed
resource_providers/<rp_type>.<rp_name>/csi_plugin/rpcs_cancelled
{noformat}
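
One possible way to maintain these four metrics is sketched below. This is not the SLRP implementation. It assumes rpcs_pending is a gauge of in-flight calls and the other three are counters of terminal outcomes; the class and method names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    FINISHED = "finished"
    FAILED = "failed"
    CANCELLED = "cancelled"


@dataclass
class CsiRpcMetrics:
    # e.g. "resource_providers/<rp_type>.<rp_name>"
    prefix: str
    pending: int = 0    # ASSUMPTION: gauge of calls currently in flight
    finished: int = 0   # ASSUMPTION: counter of calls that completed
    failed: int = 0
    cancelled: int = 0

    def rpc_started(self):
        # Neither the CSI version nor the call name is recorded.
        self.pending += 1

    def rpc_ended(self, outcome):
        # Move one call from "pending" to its terminal counter.
        self.pending -= 1
        setattr(self, outcome.value, getattr(self, outcome.value) + 1)

    def snapshot(self):
        names = ("pending", "finished", "failed", "cancelled")
        return {
            f"{self.prefix}/csi_plugin/rpcs_{name}": getattr(self, name)
            for name in names
        }


# Usage: three calls start; one finishes and one fails.
metrics = CsiRpcMetrics("resource_providers/org.example.rp.test")
for _ in range(3):
    metrics.rpc_started()
metrics.rpc_ended(Outcome.FINISHED)
metrics.rpc_ended(Outcome.FAILED)
```

Because the recording methods take no version or call name, the same four keys would be reported whether the plugin speaks CSI v0 or v1.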



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
