bharos opened a new pull request, #8913:
URL: https://github.com/apache/gravitino/pull/8913
### What changes were proposed in this pull request?
This PR adds observability for Iceberg client operations by bridging
Iceberg's metrics reporting to Gravitino's MetricsSystem.
Key changes:
- IcebergClientMetricsSource: new metrics source with the iceberg-client namespace (separate from the iceberg-rest-server HTTP metrics)
- IcebergRestMetricsStore: implements MetricsStore to parse and record Iceberg commit/scan metrics using Iceberg's public APIs
- Configuration: enable with metricsStore = rest
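
To illustrate the idea of bridging a metrics report into namespaced counters, here is a minimal Python sketch. It is not the code in this PR: `SketchMetricsStore` is a hypothetical in-memory stand-in, and the report dictionary is a simplified assumption loosely modeled on the shape of an Iceberg metrics report (counter entries carrying a `value`, timer entries carrying a `total-duration`). Only the `iceberg-client.iceberg.` key prefix and the metric names are taken from the output shown in this description.

```python
from collections import defaultdict

NAMESPACE = "iceberg-client"  # namespace named in this PR


class SketchMetricsStore:
    """Hypothetical in-memory store; not Gravitino's MetricsStore API."""

    def __init__(self):
        self.counters = defaultdict(int)     # metric key -> running total
        self.durations = defaultdict(list)   # metric key -> recorded durations

    def record(self, report):
        # Count the report itself, by kind (assumed "commit-report" / "scan-report").
        kind = "commit" if report["report-type"] == "commit-report" else "scan"
        self.counters[f"{NAMESPACE}.iceberg.reports.{kind}"] += 1

        # Route each metric to a counter or a duration series.
        for name, metric in report.get("metrics", {}).items():
            key = f"{NAMESPACE}.iceberg.{name}"
            if "total-duration" in metric:
                self.durations[key].append(metric["total-duration"])
            else:
                self.counters[key] += metric["value"]


# Usage with a made-up commit report (values are illustrative only).
store = SketchMetricsStore()
store.record({
    "report-type": "commit-report",
    "metrics": {
        "added-data-files": {"unit": "count", "value": 1},
        "added-files-size-bytes": {"unit": "bytes", "value": 960},
        "total-duration": {"count": 1, "time-unit": "nanoseconds", "total-duration": 1200},
    },
})
print(dict(store.counters))
```

Keeping report counts, value counters, and durations in separate structures mirrors the split between the counters and histograms sections in the verification output, though the real implementation may organize this differently.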
### Why are the changes needed?
Metrics sent to /v1/{prefix}/namespaces/{namespace}/tables/{table}/metrics
are silently dropped when the dummy store is used. This PR enables monitoring of:
- Iceberg table operations (commits, scans)
- Data file operations (added/removed files, sizes)
- Query performance metrics sent through the metrics API
Fix: #(issue)
### Does this PR introduce _any_ user-facing change?
Yes, new configuration and metrics:
```
# Server configuration
gravitino.iceberg-rest.metricsStore = rest
```
```
# Client configuration (Spark)
spark.sql.catalog.<catalog-name>.rest-metrics-impl = org.apache.iceberg.rest.RESTMetricsReporter
```
Exposed metrics (under iceberg-client namespace): commit reports, scan
reports, data files added/removed, file sizes, scan/commit durations, and 27+
additional metrics.
### How was this patch tested?
- Unit tests:
```
./gradlew :iceberg:iceberg-rest-server:test --tests TestIcebergRestMetricsStore
```
- Production verification: Deployed to K8s with Spark SQL workload,
confirmed 32 metrics tracked correctly
```
curl -s http://localhost:9001/metrics | jq '.histograms |
with_entries(select(.key | startswith("iceberg-client")))'
{
"iceberg-client.iceberg.total-duration": {
"count": 3,
"max": 0,
"mean": 0,
"min": 0,
"p50": 0,
"p75": 0,
"p95": 0,
"p98": 0,
"p99": 0,
"p999": 0,
"stddev": 0
},
"iceberg-client.iceberg.total-planning-duration": {
"count": 9,
"max": 0,
"mean": 0,
"min": 0,
"p50": 0,
"p75": 0,
"p95": 0,
"p98": 0,
"p99": 0,
"p999": 0,
"stddev": 0
}
}
```
```
curl -s http://localhost:9001/metrics | jq '.counters |
with_entries(select(.key | startswith("iceberg-client")))'
{
"iceberg-client.iceberg.added-data-files": {
"count": 1
},
"iceberg-client.iceberg.added-files-size-bytes": {
"count": 960
},
"iceberg-client.iceberg.added-records": {
"count": 1
},
"iceberg-client.iceberg.attempts": {
"count": 3
},
"iceberg-client.iceberg.dvs": {
"count": 0
},
"iceberg-client.iceberg.equality-delete-files": {
"count": 0
},
"iceberg-client.iceberg.indexed-delete-files": {
"count": 0
},
"iceberg-client.iceberg.positional-delete-files": {
"count": 0
},
"iceberg-client.iceberg.removed-data-files": {
"count": 1
},
"iceberg-client.iceberg.removed-files-size-bytes": {
"count": 923
},
"iceberg-client.iceberg.removed-records": {
"count": 1
},
"iceberg-client.iceberg.reports.commit": {
"count": 3
},
"iceberg-client.iceberg.reports.scan": {
"count": 9
},
"iceberg-client.iceberg.result-data-files": {
"count": 5
},
"iceberg-client.iceberg.result-delete-files": {
"count": 0
},
"iceberg-client.iceberg.scanned-data-manifests": {
"count": 5
},
"iceberg-client.iceberg.scanned-delete-manifests": {
"count": 0
},
"iceberg-client.iceberg.skipped-data-files": {
"count": 0
},
"iceberg-client.iceberg.skipped-data-manifests": {
"count": 2
},
"iceberg-client.iceberg.skipped-delete-files": {
"count": 0
},
"iceberg-client.iceberg.skipped-delete-manifests": {
"count": 0
},
"iceberg-client.iceberg.total-data-files": {
"count": 1
},
"iceberg-client.iceberg.total-data-manifests": {
"count": 7
},
"iceberg-client.iceberg.total-delete-file-size-in-bytes": {
"count": 0
},
"iceberg-client.iceberg.total-delete-files": {
"count": 0
},
"iceberg-client.iceberg.total-delete-manifests": {
"count": 0
},
"iceberg-client.iceberg.total-equality-deletes": {
"count": 0
},
"iceberg-client.iceberg.total-file-size-in-bytes": {
"count": 4615
},
"iceberg-client.iceberg.total-files-size-bytes": {
"count": 960
},
"iceberg-client.iceberg.total-positional-deletes": {
"count": 0
},
"iceberg-client.iceberg.total-records": {
"count": 1
}
}
```
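
For scripted checks, the same prefix filter used in the jq commands above can be sketched in Python. The helper names `select_by_prefix` and `fetch_metrics` are hypothetical, and the sample payload is a small excerpt of the output shown above; the sketch assumes the endpoint returns JSON with top-level `counters` and `histograms` objects, as that output suggests.

```python
import json
from urllib.request import urlopen


def select_by_prefix(metrics, section, prefix="iceberg-client"):
    """Keep only entries of metrics[section] whose key starts with prefix."""
    return {k: v for k, v in metrics.get(section, {}).items() if k.startswith(prefix)}


def fetch_metrics(url="http://localhost:9001/metrics"):
    """Fetch and decode the metrics JSON (URL taken from the curl commands above)."""
    with urlopen(url) as resp:
        return json.load(resp)


# Offline example using an excerpt of the counters shown above,
# plus one made-up unrelated key to show the filter at work.
sample = {
    "counters": {
        "iceberg-client.iceberg.reports.commit": {"count": 3},
        "iceberg-client.iceberg.reports.scan": {"count": 9},
        "some-other-source.requests": {"count": 42},  # hypothetical key
    }
}
client_counters = select_by_prefix(sample, "counters")
print(json.dumps(client_counters, indent=2))
```

Against a running server, `select_by_prefix(fetch_metrics(), "histograms")` would play the role of the first jq command.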