DerGut commented on issue #474:
URL: https://github.com/apache/iceberg-python/issues/474#issuecomment-3067304005

   I'd like to pick this back up because we started a discussion about it in 
https://github.com/apache/iceberg-rust/issues/1466 too.
   
   On the draft implementation, @sdd raised a good point that we now have 
other, often more idiomatic interfaces available 
(https://github.com/apache/iceberg-rust/pull/1496#issuecomment-3064213003). In 
Rust for example, we've decided on using the facade 
[`metrics`](https://docs.rs/metrics/latest/metrics/) which users can back by 
any exporter they like, offering simple integrations with existing 
observability systems. In Python, 
[opentelemetry](https://github.com/open-telemetry/opentelemetry-python) offers 
similar functionality.
   
   Using existing telemetry APIs, reporting code could look much simpler, and 
backing integrations would be easier (no custom code needed).
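
A minimal sketch of the facade idea, using only the standard library. The `Counter` protocol, `InMemoryCounter`, and `plan_scan` are hypothetical names for illustration and are not part of PyIceberg. OpenTelemetry's counter instrument exposes an `add(amount, attributes)` method of a compatible shape, so a real backend could presumably be passed in instead of the in-memory stand-in.

```python
from collections import defaultdict
from typing import Mapping, Optional, Protocol


class Counter(Protocol):
    """Minimal counter interface the library code would depend on (assumed)."""

    def add(self, amount: int, attributes: Optional[Mapping[str, str]] = None) -> None: ...


class InMemoryCounter:
    """Hypothetical backend, present only to make the sketch runnable."""

    def __init__(self) -> None:
        self.totals = defaultdict(int)

    def add(self, amount: int, attributes: Optional[Mapping[str, str]] = None) -> None:
        # Key the running total by the sorted attribute pairs.
        key = tuple(sorted((attributes or {}).items()))
        self.totals[key] += amount


def plan_scan(table: str, manifests: list, skipped: Counter) -> list:
    """Toy scan planning: keep manifests that may match, count the rest."""
    kept = []
    for manifest in manifests:
        if manifest["may_match"]:
            kept.append(manifest)
        else:
            skipped.add(1, {"table": table})
    return kept
```

The library code only calls `add`, so no custom reporter plumbing is needed on the caller's side in this sketch.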
   
   #### Metric Names
   Emitting metrics straight from the library will mean we also need to 
standardize on metric names, or implementations could diverge, defeating the 
idea of a unified way of monitoring Iceberg clients.
   
   I would like to propose a naming system similar to @sdd's 
[PoC](https://github.com/apache/iceberg-rust/pull/1502) comprised of
   ```
   iceberg.<operation>.<resource>.<count-type>
   ```
   for example `iceberg.scan.data_files.scanned`, 
`iceberg.scan.delete_manifests.skipped` or `iceberg.commit.delete_files.added`. 
Existing metrics can be taken from 
[`ScanMetricsResult.java`](https://github.com/apache/iceberg/blob/ae672a270dceea92fc56fc2ca51a1a9d03715122/core/src/main/java/org/apache/iceberg/metrics/ScanMetricsResult.java#L27)
 and 
[`CommitMetricsResult.java`](https://github.com/apache/iceberg/blob/ae672a270dceea92fc56fc2ca51a1a9d03715122/core/src/main/java/org/apache/iceberg/metrics/CommitMetricsResult.java#L30).
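
One way the proposed pattern could be enforced is a small helper that builds names from their segments. This is only a sketch: `metric_name` and the lowercase/underscore segment rule are assumptions for illustration, not an agreed convention.

```python
import re

# Assumed rule: each segment is lowercase letters, digits, and underscores.
_SEGMENT = re.compile(r"^[a-z][a-z0-9_]*$")


def metric_name(operation: str, resource: str, count_type: str) -> str:
    """Build a name of the form iceberg.<operation>.<resource>.<count-type>."""
    parts = (operation, resource, count_type)
    for part in parts:
        if not _SEGMENT.match(part):
            raise ValueError(f"invalid metric name segment: {part!r}")
    return ".".join(("iceberg", *parts))
```

For instance, `metric_name("scan", "data_files", "scanned")` yields `iceberg.scan.data_files.scanned`, matching the first example above.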
   
   #### Catalog Spec
   The Metrics Reporting API is part of the catalog spec, which suggests that we 
should consider implementing it anyway. If we can prove with an experiment that 
(for example) an opentelemetry exporter can consume a spec-compliant reporter 
interface, we should be good. If we can't, we need to take this into 
consideration.
   With the spec's API, multiple metrics are bundled together into a single 
report. This doesn't seem natural for other metrics APIs and could become an 
implementation burden. 
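
To illustrate the mismatch, here is a rough sketch of unbundling a report into individual measurements. The report layout (a `table-name` plus a `metrics` map whose counter entries carry a `value`) loosely follows the spec's scan report, but the sample keys, `NAME_MAP`, and `unbundle` are assumptions for illustration only.

```python
from typing import Any, Dict, Iterator, Tuple

# Hypothetical mapping from report keys to the proposed metric names.
NAME_MAP = {
    "skipped-data-files": "iceberg.scan.data_files.skipped",
    "skipped-delete-manifests": "iceberg.scan.delete_manifests.skipped",
}


def unbundle(report: Dict[str, Any]) -> Iterator[Tuple[str, int, Dict[str, str]]]:
    """Yield (metric name, value, attributes) for each mapped counter entry."""
    attributes = {"table": report["table-name"]}
    for key, result in report["metrics"].items():
        name = NAME_MAP.get(key)
        if name is None or "value" not in result:
            # Timers and unmapped entries are left out of this sketch.
            continue
        yield name, result["value"], attributes
```

Going the other way (collecting individually emitted metrics back into one report) would need some notion of when a scan or commit is complete, which is where the implementation burden would likely show up.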
   
   ---
   
   I can help with an implementation, but I would first like to extend the 
discussion about following the Java implementation vs. using more idiomatic 
approaches. Note that I've also created 
https://github.com/apache/iceberg-go/issues/485 in an attempt to get the different 
implementations to move in a similar direction.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

