List,
We are struggling to collect metrics from the tserver. We are attempting to
pull down, per tablet,
allowed_metrics = [
    "rows_inserted",
    "rows_upserted",
    "rows_deleted",
    "scanner_rows_scanned",
    "upserts_as_updates",
    "rows_updated",
    "insertions_failed_dup_key"
]
by querying the metrics JSON endpoint.
We see the metrics being sent, but they appear to have the same value.
I have the following questions:
1. Are metrics unique per `id` in the JSON payload?
2. How do others collect metrics for their clusters?
Any help would be appreciated, thanks!
Our while loop looks like this:
# Assumes `import requests`, `from time import time`, and a logging.Logger `log`.
while not self.shutdown_event.is_set():
    try:
        collection_time = time()
        http_response = requests.get(
            "%s://localhost:%s/metrics" % (self.protocol, self.port),
            verify=False)
        # The payload is a list of entities, each with its own metrics list.
        for metric_type in http_response.json():
            metric_prefix = metric_type['type']
            for metric in metric_type['metrics']:
                if metric["name"] not in allowed_metrics:
                    continue
                full_name = metric_prefix + "." + metric["name"]
                # Emit one gauge per numeric field of the metric.
                for key, value in metric.items():
                    if key == "name":
                        continue
                    log.info("%s_%s -> %s" % (full_name, key, value))
                    try:
                        point = float(value)
                        tags = metric_type['attributes'].copy()
                        tags['id'] = metric_type['id']
                        self.metrics_client.gauge(
                            "%s_%s" % (full_name, key),
                            point,
                            timestamp=collection_time,
                            tags=tags)
                    except (TypeError, ValueError):
                        # float() raises TypeError for None, lists, and dicts.
                        log.info("%s is not a number. Not sending", value)
        self.metrics_client.flush(timestamp=collection_time)
    except Exception as ex:
        # The message needs a placeholder for the exception argument.
        log.error("Failed to parse kudu metrics: %s", ex)
    log.info("Pausing for 10 seconds after processing metrics")
    self.shutdown_event.wait(10)
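
To make it easier to check whether values really differ per `id`, here is a rough sketch of the same parsing pulled out into a standalone function that can be run against a saved copy of the payload. It assumes only the payload shape the loop above already relies on (a list of entities with `type`, `id`, `attributes`, and `metrics`). The function name `flatten_metrics` and the sample data are made up for illustration and are not real tserver output. Unlike the loop above, it keeps only fields that are already int or float.

```python
ALLOWED_METRICS = frozenset([
    "rows_inserted",
    "rows_upserted",
    "rows_deleted",
    "scanner_rows_scanned",
    "upserts_as_updates",
    "rows_updated",
    "insertions_failed_dup_key",
])


def flatten_metrics(payload, allowed=ALLOWED_METRICS):
    """Yield (series_name, tags, value) for each numeric field of each allowed metric.

    Hypothetical helper: `payload` is the decoded JSON list from the metrics
    endpoint, in the shape assumed by the collection loop.
    """
    for entity in payload:
        # Tags are built once per entity so every point carries the entity id.
        tags = dict(entity.get("attributes", {}))
        tags["id"] = entity["id"]
        prefix = entity["type"]
        for metric in entity.get("metrics", []):
            name = metric.get("name")
            if name not in allowed:
                continue
            for key, value in metric.items():
                if key == "name":
                    continue
                # Keep plain numbers only; bool is excluded because it subclasses int.
                if isinstance(value, bool) or not isinstance(value, (int, float)):
                    continue
                yield "%s.%s_%s" % (prefix, name, key), tags, float(value)


# Made-up sample payload, for illustration only.
SAMPLE = [
    {"type": "tablet", "id": "tablet-a", "attributes": {"table_name": "t1"},
     "metrics": [{"name": "rows_inserted", "value": 10},
                 {"name": "not_in_allow_list", "value": 5}]},
    {"type": "tablet", "id": "tablet-b", "attributes": {"table_name": "t1"},
     "metrics": [{"name": "rows_inserted", "value": 3}]},
]

points = list(flatten_metrics(SAMPLE))
for series_name, tags, value in points:
    print(series_name, tags["id"], value)
```

Printing the `id` tag next to each value shows directly whether two tablets report the same number at the source or whether the values only look identical after they reach the metrics client.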