[
https://issues.apache.org/jira/browse/RANGER-4047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Madhan Neethiraj updated RANGER-4047:
-------------------------------------
Fix Version/s: 3.0.0, 2.5.0
> Ranger KMS health metrics
> --------------------------
>
> Key: RANGER-4047
> URL: https://issues.apache.org/jira/browse/RANGER-4047
> Project: Ranger
> Issue Type: New Feature
> Components: kms
> Reporter: Vikas Kumar
> Assignee: Vikas Kumar
> Priority: Major
> Fix For: 3.0.0, 2.5.0
>
> Attachments: Apache KMS metrics.xlsx
>
>
> Ranger KMS should collect important system-level and application-level
> health metrics.
> System metrics: JVM, CPU, and memory metrics.
> Application metrics: KMS API execution metrics, for example the number of
> times the DECRYPT operation was invoked and the time taken to complete the
> requests.
> There should also be an API to consume these stats, preferably a REST API,
> so that any metrics tool can retrieve them.
> This will help make KMS highly observable. An alerting system can be
> configured to consume and process the metrics and generate alerts.
> There could be many other use cases.
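As a rough illustration of the alerting use case, here is a minimal Python sketch that evaluates a few rules against one metrics snapshot. The metric names (UNAUTHORIZED_CALLS_COUNT, MemoryCurrent, MemoryMax) come from the sample response in this issue; the rule names, the 90% threshold, and the evaluate_alerts helper are hypothetical and not part of Ranger.

```python
from typing import Callable, Dict, List

Snapshot = Dict[str, Dict[str, float]]


def _heap_ratio(snapshot: Snapshot) -> float:
    """Fraction of the maximum heap currently in use (0.0 if unknown)."""
    jvm = snapshot.get("RangerJvm", {})
    max_bytes = jvm.get("MemoryMax", 0)
    return jvm.get("MemoryCurrent", 0) / max_bytes if max_bytes else 0.0


# Hypothetical alert rules: each maps a rule name to a predicate over a snapshot.
RULES: Dict[str, Callable[[Snapshot], bool]] = {
    "unauthorized_calls": lambda s: s.get("KMS", {}).get("UNAUTHORIZED_CALLS_COUNT", 0) > 0,
    "heap_above_90_percent": lambda s: _heap_ratio(s) > 0.9,
}


def evaluate_alerts(snapshot: Snapshot) -> List[str]:
    """Return the sorted names of all rules that fire for this snapshot."""
    return sorted(name for name, rule in RULES.items() if rule(snapshot))
```

A real deployment would more likely scrape the Prometheus sink and express such rules in the alerting tool itself; the sketch only shows the shape of the check.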
> *Approach:*
> The solution depends on the common "ranger-metrics" module for both the JSON
> and Prometheus sinks.
> KMS adds only KMS-specific application metrics: generally, one COUNT metric
> and a corresponding elapsed-time gauge metric for each REST endpoint.
> By default, metric collection is not thread-safe, but it can be made
> thread-safe by adding the following property to kms-site.xml:
> hadoop.kms.metric.collection.threadsafe=true (possible values: true/false)
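To make the count/elapsed-time pairing and the thread-safety switch concrete, here is a small Python sketch. It is not the Ranger implementation (which is Java); the OpMetrics class and its methods are hypothetical. It also assumes the elapsed-time value accumulates across calls, which the description above does not state; the real gauge may instead hold the latest value.

```python
import threading
import time
from collections import defaultdict
from contextlib import contextmanager, nullcontext


class OpMetrics:
    """Hypothetical per-operation recorder: one COUNT and one ELAPSED_TIME value."""

    def __init__(self, threadsafe: bool = False):
        # With threadsafe=False no lock is taken, so concurrent updates may
        # be lost; this mirrors the opt-in nature of the property above.
        self._lock = threading.Lock() if threadsafe else nullcontext()
        self._values = defaultdict(int)

    def record(self, op: str, elapsed_ms: int) -> None:
        with self._lock:
            self._values[f"{op}_COUNT"] += 1
            # Assumption: elapsed time is accumulated rather than overwritten.
            self._values[f"{op}_ELAPSED_TIME"] += elapsed_ms

    @contextmanager
    def timed(self, op: str):
        """Time the wrapped block and record it, even if the block raises."""
        start = time.monotonic()
        try:
            yield
        finally:
            self.record(op, int((time.monotonic() - start) * 1000))

    def snapshot(self) -> dict:
        with self._lock:
            return dict(self._values)
```

Usage would look like `with metrics.timed("EEK_DECRYPT"): handle_request()`, where handle_request stands in for the endpoint body. The lock trades a little latency on every call for exact counts, which is presumably why thread safety is off by default.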
>
> =========== Sample request and response listing the collected metrics ===========
> curl -ivk -H "Content-Type: application/json" -X GET "http://localhost:9292/kms/metrics/json?user.name=vikas"
> Sample response:
> {
> "KMS": {
> "GET_CURRENT_KEY_COUNT": 0,
> "DELETE_KEY_ELAPSED_TIME": 0,
> "EEK_DECRYPT_ELAPSED_TIME": 0,
> "GET_KEYS_METADATA_ELAPSED_TIME": 0,
> "EEK_GENERATE_ELAPSED_TIME": 0,
> "GET_CURRENT_KEY_ELAPSED_TIME": 0,
> "EEK_REENCRYPT_ELAPSED_TIME": 0,
> "KEY_CREATE_COUNT": 1,
> "UNAUTHORIZED_CALLS_COUNT": 0,
> "KEY_CREATE_ELAPSED_TIME": 81,
> "GET_KEY_VERSION_COUNT": 0,
> "ROLL_NEW_VERSION_ELAPSED_TIME": 0,
> "REENCRYPT_EEK_BATCH_COUNT": 0,
> "REENCRYPT_EEK_BATCH_ELAPSED_TIME": 0,
> "GET_KEYS_METADATA_COUNT": 0,
> "GET_KEY_VERSIONS_COUNT": 0,
> "GET_KEY_VERSIONS_ELAPSED_TIME": 0,
> "GET_KEYS_COUNT": 2,
> "EEK_GENERATE_COUNT": 0,
> "INVALIDATE_CACHE_COUNT": 0,
> "GET_METADATA_COUNT": 3,
> "REENCRYPT_EEK_BATCH_KEYS_COUNT": 0,
> "EEK_REENCRYPT_COUNT": 0,
> "UNAUTHENTICATED_CALLS_COUNT": 0,
> "GET_KEY_VERSION_ELAPSED_TIME": 0,
> "INVALIDATE_CACHE_ELAPSED_TIME": 0,
> "ROLL_NEW_VERSION_COUNT": 0,
> "EEK_DECRYPT_COUNT": 0,
> "GET_KEYS_METADATA_KEYNAMES_COUNT": 0,
> "DELETE_KEY_COUNT": 0,
> "GET_KEYS_ELAPSED_TIME": 72,
> "GET_METADATA_ELAPSED_TIME": 14,
> "TOTAL_CALL_COUNT": 7
> },
> "RangerJvm": {
> "GcTimeTotal": 339,
> "SystemLoadAvg": 1.47,
> "ThreadsBusy": 5,
> "GcCountTotal": 9,
> "MemoryMax": 1005584384,
> "MemoryCurrent": 221646760,
> "ThreadsWaiting": 20,
> "ProcessorsAvailable": 2,
> "GcTimeMax": 339,
> "ThreadsBlocked": 0,
> "ThreadsRemaining": 9
> }
> }
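A client consuming this response might pair each operation's COUNT with its ELAPSED_TIME value. The Python sketch below shows one way to do that; fetch_metrics and pair_operations are hypothetical helper names, and the pairing relies only on the `_COUNT` / `_ELAPSED_TIME` suffixes visible in the sample above. Counters without a matching elapsed-time entry (for example TOTAL_CALL_COUNT) are skipped.

```python
import json
import urllib.request
from typing import Dict, Tuple


def fetch_metrics(url: str) -> dict:
    """GET the JSON metrics endpoint and return the decoded document."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def pair_operations(kms: Dict[str, int]) -> Dict[str, Tuple[int, int]]:
    """Map operation name -> (count, elapsed time) for metrics that have both."""
    pairs = {}
    for name, count in kms.items():
        if not name.endswith("_COUNT"):
            continue
        op = name[: -len("_COUNT")]
        elapsed_key = op + "_ELAPSED_TIME"
        if elapsed_key in kms:
            pairs[op] = (count, kms[elapsed_key])
    return pairs
```

Against a running instance this could be called as `pair_operations(fetch_metrics("http://localhost:9292/kms/metrics/json?user.name=vikas")["KMS"])`. The unit of the elapsed-time values is not stated in this issue, so the sketch passes them through unchanged.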
--
This message was sent by Atlassian Jira
(v8.20.10#820010)