[
https://issues.apache.org/jira/browse/FLINK-30450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17649716#comment-17649716
]
Steve Loughran commented on FLINK-30450:
----------------------------------------
If you can grab the IOStatistics from an IOStatisticsSource implementation,
the s3a, gcs and abfs FS instances collect detailed stats which you can snapshot
and then marshal as JSON or serialized Java objects. This also works for other
classes (input streams, output streams, list iterators...). That stuff is there
to play with.
Otherwise, hadoop-3.3.2+ optionally annotates the HTTP referer header on all S3
requests so you can apportion blame from the S3 server logs, including looking
for 503 throttle events. This avoids code changes up the stack and gives a view
of the entire cluster.
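
A rough sketch of the log-side approach, in case it helps: count the 503
responses in S3 server access logs and group them by a query parameter taken
from the referer header. Everything specific here is an assumption rather than
something confirmed in this thread: the regex follows the field order
documented for S3 server access logs, the "ji" (job ID) parameter name and the
audit.example.org URL shape are what the S3A auditing documentation appears to
use, and the sample lines are invented for illustration.

```python
import re
from collections import Counter
from urllib.parse import parse_qs, urlsplit

# Assumed S3 server access log layout: owner, bucket, [time], remote IP,
# requester, request ID, operation, key, "request-URI", HTTP status,
# error code, bytes sent, object size, total time, turn-around time,
# "referer", ... Adjust the pattern if your logs differ.
LOG_PATTERN = re.compile(
    r'^\S+ \S+ \[[^\]]+\] \S+ \S+ \S+ (?P<op>\S+) \S+ '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}|-) \S+ \S+ \S+ \S+ \S+ '
    r'"(?P<referrer>[^"]*)"'
)

# Invented sample lines; the referer format is an assumption.
SAMPLE_LINES = [
    'owner bucket [06/Feb/2019:00:00:38 +0000] 192.0.2.3 user REQ1 '
    'REST.GET.OBJECT data/part-0000 "GET /data/part-0000 HTTP/1.1" 503 '
    'SlowDown 0 - 7 - '
    '"https://audit.example.org/hadoop/1/op_open/span-1/?op=op_open&ji=job-42" '
    '"agent" -',
    'owner bucket [06/Feb/2019:00:00:39 +0000] 192.0.2.3 user REQ2 '
    'REST.GET.OBJECT data/part-0001 "GET /data/part-0001 HTTP/1.1" 200 '
    '- 100 100 7 6 '
    '"https://audit.example.org/hadoop/1/op_open/span-2/?op=op_open&ji=job-42" '
    '"agent" -',
    'owner bucket [06/Feb/2019:00:00:40 +0000] 192.0.2.4 user REQ3 '
    'REST.PUT.OBJECT data/part-0002 "PUT /data/part-0002 HTTP/1.1" 503 '
    'SlowDown 0 - 9 - "-" "agent" -',
]


def count_throttles(lines, param="ji"):
    """Count 503 responses, grouped by one referer query parameter."""
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None or match.group("status") != "503":
            continue  # unparseable line, or not a throttle event
        query = parse_qs(urlsplit(match.group("referrer")).query)
        # Requests without the parameter are grouped under "unknown".
        counts[query.get(param, ["unknown"])[0]] += 1
    return counts


if __name__ == "__main__":
    for job, total in count_throttles(SAMPLE_LINES).most_common():
        print(job, total)
```

Grouping on a different parameter (for example the operation name in "op")
only needs a different `param` argument, so the same pass over the logs can
apportion throttling by job, by operation, or by whatever else the referer
carries.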
> FileSystem supports exporting client-side metrics
> -------------------------------------------------
>
> Key: FLINK-30450
> URL: https://issues.apache.org/jira/browse/FLINK-30450
> Project: Flink
> Issue Type: New Feature
> Components: FileSystems
> Reporter: Hangxiang Yu
> Priority: Major
>
> Client-side metrics, or job-level metrics for the filesystem, could help us
> monitor the filesystem more precisely.
> Some metrics (request rate, throughput, latency, retry count, etc.) are
> useful for monitoring network or client problems during checkpointing or
> other access cases for a job.
> Some filesystems, like s3, s3-presto and gs, already support enabling some
> metrics; these could be exported through the filesystem.