FYI, I went ahead and opened https://github.com/apache/iceberg/pull/15304 as an SPI-based workaround until metrics are integrated more deeply with Spark.

Romain Manni-Bucau
@rmannibucau <https://x.com/rmannibucau> | .NET Blog <https://dotnetbirdie.github.io/> | Blog <https://rmannibucau.github.io/> | Old Blog <http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> | LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book <https://www.packtpub.com/en-us/product/java-ee-8-high-performance-9781788473064>
Javaccino founder (Java/.NET service - contact via linkedin)

On Thu, Feb 12, 2026 at 11:38, Romain Manni-Bucau <[email protected]> wrote:

> Hi all,
>
> Is it intended that S3FileIO doesn't wire [aws sdk].ClientOverrideConfiguration.Builder#addMetricPublisher, so that, compared to hadoop-aws, you can't retrieve metrics from Spark (or any other engine) and send them to a collector in a centralized manner?
> Is there another intended way?
>
> For plain hadoop-aws, the workaround is to use (by reflection) S3AInstrumentation.getMetricsSystem().allSources() and wire it to a Spark sink.
>
> To be clear, I do care about the bytes written/read, but more importantly about the latency, number of requests, statuses, etc. The stats exposed through FileSystem in Iceberg number fewer than 10, whereas we should get well over 100 (taking Hadoop as a reference).
>
> Anything I missed?
>
> Romain Manni-Bucau
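
To make the request above more concrete, here is a minimal sketch of the kind of per-request data in question (request counts and latency, keyed by operation and status), aggregated so it can be handed to a central collector. It is deliberately independent of any SDK: the class S3RequestStats, its record and snapshot methods, and the metric key names are all hypothetical and are not part of Iceberg, the AWS SDK, or the PR. The assumption is that a publisher registered through ClientOverrideConfiguration.Builder#addMetricPublisher would be what calls record once per completed request.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical, SDK-independent aggregator; not an Iceberg or AWS SDK API. */
public class S3RequestStats {

    /** Counters kept per "operation.status" key. */
    static final class Counters {
        final LongAdder requests = new LongAdder();
        final LongAdder totalLatencyMillis = new LongAdder();
    }

    private final Map<String, Counters> byKey = new ConcurrentHashMap<>();

    /** Assumed to be called once per completed request, e.g. from a metric publisher callback. */
    public void record(String operation, int httpStatus, long latencyMillis) {
        Counters c = byKey.computeIfAbsent(operation + "." + httpStatus, k -> new Counters());
        c.requests.increment();
        c.totalLatencyMillis.add(latencyMillis);
    }

    /** Flat, sorted snapshot that could be forwarded to whatever sink the engine exposes. */
    public Map<String, Long> snapshot() {
        Map<String, Long> out = new TreeMap<>();
        byKey.forEach((key, c) -> {
            out.put(key + ".requests", c.requests.sum());
            out.put(key + ".latencyMillisTotal", c.totalLatencyMillis.sum());
        });
        return out;
    }

    public static void main(String[] args) {
        S3RequestStats stats = new S3RequestStats();
        stats.record("GetObject", 200, 12);
        stats.record("GetObject", 200, 30);
        stats.record("PutObject", 503, 250);
        // Prints one requests counter and one latency total per operation/status pair.
        System.out.println(stats.snapshot());
    }
}
```

Keying by operation and status keeps the snapshot flat, which tends to map easily onto gauge or counter style sinks; a real integration would likely want histograms for latency rather than a running total.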
