You get all the thread-local stats for a specific thread from
IOStatisticsContext.getCurrentIOStatisticsContext().getIOStatistics().

Take a snapshot of that and you have something JSON-marshallable or Java
serializable which aggregates nicely.

Call IOStatisticsContext.getCurrentIOStatisticsContext().reset() when your
worker thread starts a specific task, to ensure you only get the stats for
that task (s3a, and I think gcs).
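
A minimal sketch of that flow (assuming Hadoop 3.4.0+ and hadoop-common on
the classpath; the class name and the placeholder comment for the task's
own IO are mine, not part of the API):

  import org.apache.hadoop.fs.statistics.IOStatisticsContext;
  import org.apache.hadoop.fs.statistics.IOStatisticsSnapshot;
  import static org.apache.hadoop.fs.statistics.IOStatisticsSupport.snapshotIOStatistics;

  public class TaskStatsSketch {
    public static void main(String[] args) {
      IOStatisticsContext ctx = IOStatisticsContext.getCurrentIOStatisticsContext();
      ctx.reset();   // drop whatever earlier work on this thread already recorded
      // ... run the task's own IO here: open/read/write/close streams ...
      // the snapshot is JSON-marshallable, java.io.Serializable and aggregates
      IOStatisticsSnapshot snapshot = snapshotIOStatistics(ctx.getIOStatistics());
      System.out.println(snapshot);
    }
  }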

From the filesystem itself, getIOStatistics() gives you all the stats of
that filesystem instance and of its streams after close(). A quick look at
some S3 IO against a non-AWS store shows a couple of failures,
interestingly enough. We collect separate averages for success and failure
on every op, so you can see the difference.
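
Here's a sketch of pulling that from the filesystem instance
(s3a://bucket/ is a placeholder; retrieveIOStatistics() returns null for
stores which don't publish IOStatistics):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.fs.statistics.IOStatistics;
  import static org.apache.hadoop.fs.statistics.IOStatisticsLogging.ioStatisticsToPrettyString;
  import static org.apache.hadoop.fs.statistics.IOStatisticsSupport.retrieveIOStatistics;

  public class FsStatsSketch {
    public static void main(String[] args) throws Exception {
      Configuration conf = new Configuration();
      FileSystem fs = new Path("s3a://bucket/").getFileSystem(conf);  // placeholder URI
      // ... open/read/write/close some streams against the store ...
      IOStatistics stats = retrieveIOStatistics(fs);   // null if the store has none
      if (stats != null) {
        System.out.println(ioStatisticsToPrettyString(stats));
      }
      fs.close();
    }
  }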

The JMX stats we collect are a very small subset of the statistics. Stuff
like "bytes drained in close" and the time spent waiting for an executor in
the thread pool (action_executor_acquired) is important, as it's generally
a sign of misconfiguration.
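
And if you want individual values rather than a full dump, the keys are the
same strings which appear in the log below (sketch only; the class and
method names are mine):

  import java.util.Map;
  import org.apache.hadoop.fs.statistics.IOStatistics;
  import org.apache.hadoop.fs.statistics.MeanStatistic;

  public class StatLookupSketch {
    static void report(IOStatistics stats) {
      Map<String, Long> counters = stats.counters();
      Long headRequests = counters.get("action_http_head_request");
      MeanStatistic executorWait =
          stats.meanStatistics().get("action_executor_acquired.mean");
      System.out.println("HEAD requests: " + headRequests
          + "; executor wait: " + executorWait);
    }
  }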


2026-02-12 20:05:24,587 [main] INFO  statistics.IOStatisticsLogging
(IOStatisticsLogging.java:logIOStatisticsAtLevel(269)) - IOStatistics:
counters=((action_file_opened=1)
(action_http_get_request=1)
(action_http_head_request=26)
(audit_request_execution=70)
(audit_span_creation=22)
(directories_created=4)
(directories_deleted=2)
(files_copied=2)
(files_copied_bytes=14)
(files_created=1)
(files_deleted=4)
(filesystem_close=1)
(filesystem_initialization=1)
(object_bulk_delete_request=1)
(object_copy_requests=2)
(object_delete_objects=6)
(object_delete_request=4)
(object_list_request=31)
(object_metadata_request=26)
(object_put_bytes=7)
(object_put_request=5)
(object_put_request_completed=5)
(op_create=1)
(op_createfile=2)
(op_createfile.failures=1)
(op_delete=3)
(op_get_file_status=7)
(op_get_file_status.failures=4)
(op_hflush=1)
(op_hsync=1)
(op_list_files=2)
(op_list_files.failures=1)
(op_list_status=2)
(op_list_status.failures=1)
(op_mkdirs=2)
(op_open=1)
(op_rename=2)
(store_client_creation=1)
(store_io_request=70)
(stream_read_bytes=7)
(stream_read_close_operations=1)
(stream_read_closed=1)
(stream_read_opened=1)
(stream_read_operations=1)
(stream_read_remote_stream_drain=1)
(stream_read_seek_policy_changed=1)
(stream_read_total_bytes=7)
(stream_write_block_uploads=2)
(stream_write_bytes=7)
(stream_write_total_data=14)
(stream_write_total_time=290));

gauges=();

minimums=((action_executor_acquired.min=0)
(action_file_opened.min=136)
(action_http_get_request.min=140)
(action_http_head_request.min=107)
(filesystem_close.min=13)
(filesystem_initialization.min=808)
(object_bulk_delete_request.min=257)
(object_delete_request.min=117)
(object_list_request.min=113)
(object_put_request.min=121)
(op_create.min=148)
(op_createfile.failures.min=111)
(op_delete.min=117)
(op_get_file_status.failures.min=226)
(op_get_file_status.min=1)
(op_list_files.failures.min=391)
(op_list_files.min=138)
(op_list_status.failures.min=458)
(op_list_status.min=1056)
(op_mkdirs.min=709)
(op_rename.min=1205)
(store_client_creation.min=718)
(store_io_rate_limited_duration.min=0)
(stream_read_remote_stream_drain.min=1));

maximums=((action_executor_acquired.max=0)
(action_file_opened.max=136)
(action_http_get_request.max=140)
(action_http_head_request.max=270)
(filesystem_close.max=13)
(filesystem_initialization.max=808)
(object_bulk_delete_request.max=257)
(object_delete_request.max=149)
(object_list_request.max=1027)
(object_put_request.max=289)
(op_create.max=148)
(op_createfile.failures.max=111)
(op_delete.max=273)
(op_get_file_status.failures.max=262)
(op_get_file_status.max=254)
(op_list_files.failures.max=391)
(op_list_files.max=138)
(op_list_status.failures.max=458)
(op_list_status.max=1056)
(op_mkdirs.max=2094)
(op_rename.max=1523)
(store_client_creation.max=718)
(store_io_rate_limited_duration.max=0)
(stream_read_remote_stream_drain.max=1));

means=((action_executor_acquired.mean=(samples=1, sum=0, mean=0.0000))
(action_file_opened.mean=(samples=1, sum=136, mean=136.0000))
(action_http_get_request.mean=(samples=1, sum=140, mean=140.0000))
(action_http_head_request.mean=(samples=26, sum=3543, mean=136.2692))
(filesystem_close.mean=(samples=1, sum=13, mean=13.0000))
(filesystem_initialization.mean=(samples=1, sum=808, mean=808.0000))
(object_bulk_delete_request.mean=(samples=1, sum=257, mean=257.0000))
(object_delete_request.mean=(samples=4, sum=525, mean=131.2500))
(object_list_request.mean=(samples=31, sum=5651, mean=182.2903))
(object_put_request.mean=(samples=5, sum=1066, mean=213.2000))
(op_create.mean=(samples=1, sum=148, mean=148.0000))
(op_createfile.failures.mean=(samples=1, sum=111, mean=111.0000))
(op_delete.mean=(samples=3, sum=523, mean=174.3333))
(op_get_file_status.failures.mean=(samples=4, sum=992, mean=248.0000))
(op_get_file_status.mean=(samples=3, sum=365, mean=121.6667))
(op_list_files.failures.mean=(samples=1, sum=391, mean=391.0000))
(op_list_files.mean=(samples=1, sum=138, mean=138.0000))
(op_list_status.failures.mean=(samples=1, sum=458, mean=458.0000))
(op_list_status.mean=(samples=1, sum=1056, mean=1056.0000))
(op_mkdirs.mean=(samples=2, sum=2803, mean=1401.5000))
(op_rename.mean=(samples=2, sum=2728, mean=1364.0000))
(store_client_creation.mean=(samples=1, sum=718, mean=718.0000))
(store_io_rate_limited_duration.mean=(samples=5, sum=0, mean=0.0000))
(stream_read_remote_stream_drain.mean=(samples=1, sum=1, mean=1.0000)));

Anyway, no, S3FileIO doesn't have any of that. Keeps the code simple, which
is in its favour.


On Thu, 12 Feb 2026 at 18:40, Romain Manni-Bucau <[email protected]>
wrote:

> hmm, I'm not sure what you propose to link it to Spark sinks, but
> S3AInstrumentation.getMetricsSystem().allSources() for hadoop-aws and
> MetricsPublisher for Iceberg are the "least worst" solutions I came up with.
> Clearly dirty, but more efficient than reinstrumenting the whole stack
> everywhere (pull vs push mode).
>
> Do you mean I should wrap everything to read the thread local every time,
> and maintain the registry in the Spark MetricsSystem?
>
> Another way to see it is to open JMX when using hadoop-aws; these are the
> graphs I want to get into Grafana at some point.
>
> Romain Manni-Bucau
> @rmannibucau <https://x.com/rmannibucau> | .NET Blog
> <https://dotnetbirdie.github.io/> | Blog <https://rmannibucau.github.io/> |
> Old Blog <http://rmannibucau.wordpress.com> | Github
> <https://github.com/rmannibucau> | LinkedIn
> <https://www.linkedin.com/in/rmannibucau> | Book
> <https://www.packtpub.com/en-us/product/java-ee-8-high-performance-9781788473064>
> Javaccino founder (Java/.NET service - contact via linkedin)
>
>
> On Thu, 12 Feb 2026 at 19:19, Steve Loughran <[email protected]> wrote:
>
>>
>> ok, stream level.
>>
>> No, it's not the same.
>>
>> For those s3a input stream stats, you don't need to go into the s3a
>> internals:
>> 1. every source of IOStats implements InputStreamStatistics, which is
>> hadoop-common code
>> 2. in close(), s3a input streams update the thread-level IOStatisticsContext
>> (https://issues.apache.org/jira/browse/HADOOP-17461 ... it needed some
>> stabilisation, so use it with Hadoop 3.4.0/Spark 4.0+)
>>
>> The thread stuff is so that streams opened and closed in, say, the Parquet
>> reader update stats just for that worker thread, even though you never get
>> near the stream instance itself.
>>
>> Regarding Iceberg FileIO stats, well, maybe someone could add it to the
>> classes. Spark 4+ could think about collecting the stats for each task and
>> aggregating them, as that was the original goal. You get that aggregation
>> indirectly on s3a with the s3a committers, and similarly through abfs, but
>> really Spark should just collect and report it itself.
>>
>>
>> On Thu, 12 Feb 2026 at 17:03, Romain Manni-Bucau <[email protected]>
>> wrote:
>>
>>> Hi Steve,
>>>
>>> Are you referring to org.apache.iceberg.io.FileIOMetricsContext and
>>> org.apache.hadoop.fs.FileSystem.Statistics.StatisticsData? They miss most
>>> of what I'm looking for (429s, to cite a single one).
>>> software.amazon.awssdk.metrics helps a bit but is not sink friendly.
>>> Compared to hadoop-aws, combining the Iceberg-native and AWS S3 client
>>> metrics kind of compensates for the lack, but what I would love to see
>>> is org.apache.hadoop.fs.s3a.S3AInstrumentation and more particularly
>>> org.apache.hadoop.fs.s3a.S3AInstrumentation.InputStreamStatistics#InputStreamStatistics
>>> (I'm mainly reading for my use cases).
>>>
>>>
>>> Romain Manni-Bucau
>>> @rmannibucau <https://x.com/rmannibucau> | .NET Blog
>>> <https://dotnetbirdie.github.io/> | Blog
>>> <https://rmannibucau.github.io/> | Old Blog
>>> <http://rmannibucau.wordpress.com> | Github
>>> <https://github.com/rmannibucau> | LinkedIn
>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>> <https://www.packtpub.com/en-us/product/java-ee-8-high-performance-9781788473064>
>>> Javaccino founder (Java/.NET service - contact via linkedin)
>>>
>>>
>>> On Thu, 12 Feb 2026 at 15:50, Steve Loughran <[email protected]> wrote:
>>>
>>>>
>>>>
>>>> On Thu, 12 Feb 2026 at 10:39, Romain Manni-Bucau <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> Is it intended that S3FileIO doesn't wire up [aws
>>>>> sdk].ClientOverrideConfiguration.Builder#addMetricPublisher, so that,
>>>>> compared to hadoop-aws, you can't retrieve metrics from Spark (or any
>>>>> other engine) and send them to a collector in a centralized manner?
>>>>> Is there another intended way?
>>>>>
>>>>
>>>> There's already a PR up awaiting review by committers:
>>>> https://github.com/apache/iceberg/pull/15122
>>>>
>>>>
>>>>
>>>>>
>>>>> For plain hadoop-aws the workaround is to use (by reflection)
>>>>> S3AInstrumentation.getMetricsSystem().allSources() and wire it to a
>>>>> spark sink.
>>>>>
>>>>
>>>> The intended way to do it there is to use the IOStatistics API. It isn't
>>>> limited to the s3a stats: the Google Cloud connector collects stuff the
>>>> same way, and there's explicit per-thread collection of things for stream
>>>> read and write....
>>>>
>>>> try setting
>>>>
>>>> fs.iostatistics.logging.level info
>>>>
>>>> to see what gets measured
>>>>
>>>>
>>>>> To be clear, I do care about the bytes written/read, but more importantly
>>>>> about the latency, number of requests, statuses, etc. The stats exposed
>>>>> through FileSystem in Iceberg are < 10, whereas we should get >> 100 stats
>>>>> (taking Hadoop as a reference).
>>>>>
>>>>
>>>> AWS metrics are a very limited set:
>>>>
>>>> software.amazon.awssdk.core.metrics.CoreMetric
>>>>
>>>> The retry count is good here, as it measures stuff beneath any
>>>> application code. With the REST signer, it'd make sense to also collect
>>>> signing time, as the RPC call to the signing endpoint would be included.
>>>>
>>>> -steve
>>>>
>>>
