Gabor Zele created KUDU-3546:
--------------------------------
Summary: json formatted partition info in data returned by metrics
API
Key: KUDU-3546
URL: https://issues.apache.org/jira/browse/KUDU-3546
Project: Kudu
Issue Type: Improvement
Components: metrics
Reporter: Gabor Zele
The metrics API returns the actual partition of a tablet as one of its
attributes:
{noformat}
{
"type": "tablet",
"id": "527d053abe3b450fac2e23c1e58b29f7",
"attributes": {
"partition": "HASH (hash_key_1) PARTITION 1, HASH (hash_key_2, hash_key_3) PARTITION 7, RANGE (timestamp) PARTITION 1649980800000 <= VALUES < 1651363200000",
"table_name": "impala::dbname.tablename",
"table_id": "63655530f5e743b1a710ba2c857b52b7"
},
"metrics": [
(...)
]
}{noformat}
With this "partition" attribute, we can identify which partition the
metrics belong to, and we could use this for metrics analytics.
However, while the textual format of this attribute is good for human
interpretation, it is much harder to parse with code. I'd recommend adding a
new attribute where the same info could be retrieved in JSON format, something
like this:
{noformat}
"attributes": {
"partition": "HASH (hash_key_1) PARTITION 1, HASH (hash_key_2, hash_key_3) PARTITION 7, RANGE (timestamp) PARTITION 1649980800000 <= VALUES < 1651363200000",
"partition_json": [
{
"type": "HASH",
"keys": [
"hash_key_1"
],
"number": 1
},
{
"type": "HASH",
"keys": [
"hash_key_2",
"hash_key_3"
],
"number": 7
},
{
"type": "RANGE",
"keys": [
"timestamp"
],
"start_value": 1649980800000,
"start_op": "<=",
"end_op": "<",
"end_value": 1651363200000
}
]
} {noformat}
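To illustrate the parsing burden on the client side, here is a minimal Python sketch of what a consumer would have to do today to get from the textual attribute to the structure proposed above. The function name parse_partition and the regular expressions are my own assumptions, not part of Kudu. It only handles the HASH form and the bounded RANGE form shown in this example, and it would likely break on key names or values that contain commas or parentheses:

```python
import json
import re

# Hypothetical pattern for the start of one partition level, e.g.
# "HASH (a, b) PARTITION " or "RANGE (ts) PARTITION ".
_LEVEL_RE = re.compile(r"(HASH|RANGE) \(([^)]*)\) PARTITION ")

# Hypothetical pattern for a bounded range: "<start> <= VALUES < <end>".
_RANGE_RE = re.compile(r"(\S+) (<=?) VALUES (<=?) (\S+)")


def _to_number(token):
    """Return the token as an int when possible, otherwise unchanged."""
    try:
        return int(token)
    except ValueError:
        return token


def parse_partition(text):
    """Convert the textual "partition" attribute into a list of dicts."""
    matches = list(_LEVEL_RE.finditer(text))
    levels = []
    for i, m in enumerate(matches):
        # The body of a level runs up to the start of the next level.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        body = text[m.end():end].strip().rstrip(",").strip()
        level = {
            "type": m.group(1),
            "keys": [k.strip() for k in m.group(2).split(",")],
        }
        if m.group(1) == "HASH":
            level["number"] = int(body)
        else:
            bounds = _RANGE_RE.fullmatch(body)
            if bounds:
                level["start_value"] = _to_number(bounds.group(1))
                level["start_op"] = bounds.group(2)
                level["end_op"] = bounds.group(3)
                level["end_value"] = _to_number(bounds.group(4))
            else:
                # Range forms not covered by this sketch are kept as text.
                level["raw"] = body
        levels.append(level)
    return levels


if __name__ == "__main__":
    sample = (
        "HASH (hash_key_1) PARTITION 1, "
        "HASH (hash_key_2, hash_key_3) PARTITION 7, "
        "RANGE (timestamp) PARTITION 1649980800000 <= VALUES < 1651363200000"
    )
    print(json.dumps(parse_partition(sample), indent=2))
```

With a server-side JSON attribute, clients would not need this kind of fragile, format-dependent parsing, and could read the partition levels directly with any JSON library.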