FYI, I went ahead and created https://github.com/apache/iceberg/pull/15304
as an SPI workaround until metrics are integrated more deeply with Spark.
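
For anyone skimming the thread, here is a minimal sketch of the kind of per-request aggregation (request count, status, latency) a metric publisher could feed before forwarding to a collector. It is not the code from the PR: `RequestSample` and `RequestStats` are hypothetical names, and the sketch uses plain Java rather than AWS SDK or Iceberg types.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical stand-in for what a publisher callback would see per S3 request.
// These are NOT AWS SDK or Iceberg types.
final class RequestSample {
    final String operation;   // e.g. "GetObject"
    final int statusCode;     // HTTP status of the response
    final long latencyMillis; // duration of the call in milliseconds

    RequestSample(String operation, int statusCode, long latencyMillis) {
        this.operation = operation;
        this.statusCode = statusCode;
        this.latencyMillis = latencyMillis;
    }
}

// Thread-safe aggregation keyed by "operation/status".
final class RequestStats {
    // One cell per key so count and latency are always created together.
    private static final class Cell {
        final LongAdder count = new LongAdder();
        final LongAdder latency = new LongAdder();
    }

    private final Map<String, Cell> cells = new ConcurrentHashMap<>();

    // Called once per completed request, possibly from many threads.
    void record(RequestSample sample) {
        String key = sample.operation + "/" + sample.statusCode;
        Cell cell = cells.computeIfAbsent(key, k -> new Cell());
        cell.count.increment();
        cell.latency.add(sample.latencyMillis);
    }

    // Number of requests seen for a key, 0 if the key is unknown.
    long count(String key) {
        Cell cell = cells.get(key);
        return cell == null ? 0L : cell.count.sum();
    }

    // Mean latency for a key, 0.0 if no request was recorded.
    double meanLatencyMillis(String key) {
        Cell cell = cells.get(key);
        if (cell == null) {
            return 0.0;
        }
        long n = cell.count.sum();
        return n == 0 ? 0.0 : (double) cell.latency.sum() / n;
    }

    // Sorted copy of the request counts, suitable for handing to a sink.
    Map<String, Long> snapshot() {
        Map<String, Long> out = new TreeMap<>();
        for (Map.Entry<String, Cell> e : cells.entrySet()) {
            out.put(e.getKey(), e.getValue().count.sum());
        }
        return out;
    }
}
```

In a real setup, `record` would be driven by whatever callback the client exposes, and `snapshot()` would be polled by the engine-side sink; both of those wiring points are assumptions here.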

Romain Manni-Bucau
@rmannibucau <https://x.com/rmannibucau> | .NET Blog
<https://dotnetbirdie.github.io/> | Blog <https://rmannibucau.github.io/> | Old
Blog <http://rmannibucau.wordpress.com> | Github
<https://github.com/rmannibucau> | LinkedIn
<https://www.linkedin.com/in/rmannibucau> | Book
<https://www.packtpub.com/en-us/product/java-ee-8-high-performance-9781788473064>
Javaccino founder (Java/.NET service - contact via linkedin)


On Thu, Feb 12, 2026 at 11:38, Romain Manni-Bucau <[email protected]>
wrote:

> Hi all,
>
> Is it intended that S3FileIO doesn't wire [aws
> sdk].ClientOverrideConfiguration.Builder#addMetricPublisher? As a result,
> unlike with hadoop-aws, you can't retrieve metrics from Spark (or any other
> engine) and send them to a central collector.
> Is there another intended way?
>
> For plain hadoop-aws, the workaround is to call (by reflection)
> S3AInstrumentation.getMetricsSystem().allSources() and wire the result to a
> Spark sink.
>
> To be clear, I do care about the bytes written/read, but more importantly
> about latency, number of requests, statuses, etc. The stats exposed
> through FileSystem in Iceberg number fewer than 10, whereas we should get
> well over 100 (taking Hadoop as a reference).
>
> Anything I missed?
