DerGut commented on issue #474: URL: https://github.com/apache/iceberg-python/issues/474#issuecomment-3067304005
I'd like to pick this back up because we started a discussion about it in https://github.com/apache/iceberg-rust/issues/1466 too.

On the draft implementation, @sdd raised a good point that we now have other, often more idiomatic interfaces available: https://github.com/apache/iceberg-rust/pull/1496#issuecomment-3064213003. In Rust, for example, we've decided on using the [`metrics`](https://docs.rs/metrics/latest/metrics/) facade, which users can back with any exporter they like, offering simple integration with existing observability systems. In Python, [opentelemetry](https://github.com/open-telemetry/opentelemetry-python) offers similar functionality.

Using existing telemetry APIs, reporting code could look much simpler, and backing integrations would be easier (no custom code needed).

#### Metric Names

Emitting metrics straight from the library means we also need to standardize on metric names; otherwise implementations could diverge, defeating the idea of a unified way of monitoring Iceberg clients. I would like to propose a naming system similar to @sdd's [PoC](https://github.com/apache/iceberg-rust/pull/1502), composed of

```
iceberg.<operation>.<resource>.<count-type>
```

for example `iceberg.scan.data_files.scanned`, `iceberg.scan.delete_manifests.skipped` or `iceberg.commit.delete_files.added`. Existing metrics can be taken from [`ScanMetricsResult.java`](https://github.com/apache/iceberg/blob/ae672a270dceea92fc56fc2ca51a1a9d03715122/core/src/main/java/org/apache/iceberg/metrics/ScanMetricsResult.java#L27) and [`CommitMetricsResult.java`](https://github.com/apache/iceberg/blob/ae672a270dceea92fc56fc2ca51a1a9d03715122/core/src/main/java/org/apache/iceberg/metrics/CommitMetricsResult.java#L30).

#### Catalog Spec

The Metrics Reporting API is part of the catalog spec, which suggests that we should consider implementing it anyway.
If we can prove with an experiment that (for example) an opentelemetry exporter can consume a spec-compliant reporter interface, we should be good. If we can't, we need to take this into consideration. With the spec's API, multiple metrics are bundled together into a single report. This doesn't seem natural for other metrics APIs and could become an implementation burden.

---

I can help with an implementation, but I would first like to extend the discussion about following the Java implementation vs. using more idiomatic approaches.

Note, I've also created https://github.com/apache/iceberg-go/issues/485 in an attempt to get the different implementations moving in a similar direction.
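
To make the bundling concern a bit more concrete, here is a minimal sketch of how a single bundled report could be split into individually named metrics following the `iceberg.<operation>.<resource>.<count-type>` scheme. Everything in it is an assumption for illustration: the function names (`metric_name`, `unbundle`, `publish`), the report keys, and the rule that the first token of a key is the count type. It uses a plain callback instead of a real telemetry API, so it only shows the shape of the adapter, not a working exporter.

```python
from typing import Callable, Iterator, Mapping, Tuple


def metric_name(operation: str, resource: str, count_type: str) -> str:
    """Build a name of the form iceberg.<operation>.<resource>.<count-type>."""
    return f"iceberg.{operation}.{resource}.{count_type}"


def unbundle(operation: str, report: Mapping[str, int]) -> Iterator[Tuple[str, int]]:
    """Split one bundled report into (metric name, value) pairs.

    Assumes hypothetical report keys shaped like
    "<count-type>-<resource words>", e.g. "skipped-delete-manifests".
    """
    for key, value in report.items():
        count_type, _, resource = key.partition("-")
        if not resource:
            raise ValueError(f"cannot derive a resource from key {key!r}")
        yield metric_name(operation, resource.replace("-", "_"), count_type), value


def publish(
    operation: str,
    report: Mapping[str, int],
    emit: Callable[[str, int], None],
) -> None:
    """Forward each metric of a bundled report to an emit callback.

    In a real integration, emit would wrap whatever telemetry API is chosen;
    here it is just a callable taking (name, value).
    """
    for name, value in unbundle(operation, report):
        emit(name, value)


# Usage: collect the emitted metrics in a dict instead of a real backend.
recorded = {}
publish(
    "scan",
    {"skipped-delete-manifests": 3, "scanned-data-manifests": 5},
    recorded.__setitem__,
)
```

If an adapter of roughly this shape turns out to be enough, a spec-compliant reporter and an idiomatic telemetry backend could coexist; if the report carries information that does not map onto single metrics, that would show up here first.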
