Luca Canali commented on HADOOP-16830:

We find that I/O time metrics can be quite useful for debugging, and I wanted 
to check whether they would make sense in the context of this JIRA.

As an example, for Apache Spark we have tested hooking up I/O timing metrics 
for S3A into Spark's monitoring system (and also for HDFS and other 
Hadoop-compatible filesystems).
From the end-user point of view, the result is I/O time instrumentation in a 
dashboard together with Spark's other metrics (such as CPU time and run time).
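
To make the idea of I/O time instrumentation concrete, here is a minimal 
sketch of a stream wrapper that accumulates the time spent in read calls. It 
is not the code that was tested with Spark or S3A: the class name 
TimedInputStream and the two counters are hypothetical, and only JDK classes 
are used. A metrics system could poll the counters periodically.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.atomic.LongAdder;

/**
 * Hypothetical wrapper (not part of Hadoop or Spark): adds the wall-clock
 * time spent inside read() calls, and the bytes returned, to shared counters.
 */
class TimedInputStream extends FilterInputStream {
    private final LongAdder readNanos;  // total time spent in read calls
    private final LongAdder bytesRead;  // total bytes returned to the caller

    TimedInputStream(InputStream in, LongAdder readNanos, LongAdder bytesRead) {
        super(in);
        this.readNanos = readNanos;
        this.bytesRead = bytesRead;
    }

    @Override
    public int read() throws IOException {
        long start = System.nanoTime();
        try {
            int b = super.read();
            if (b >= 0) {
                bytesRead.increment();
            }
            return b;
        } finally {
            // Record elapsed time even if the read throws.
            readNanos.add(System.nanoTime() - start);
        }
    }

    // read(byte[]) in FilterInputStream delegates here, so it is timed too.
    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        long start = System.nanoTime();
        try {
            int n = super.read(buf, off, len);
            if (n > 0) {
                bytesRead.add(n);
            }
            return n;
        } finally {
            readNanos.add(System.nanoTime() - start);
        }
    }
}
```

LongAdder is used here so that several streams, possibly on different 
threads, can share one pair of counters with low contention.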

The tested implementation relied on Spark 3.0's new plugin infrastructure 
[SPARK-29397|https://issues.apache.org/jira/browse/SPARK-29397], which allows 
external metrics to be integrated into Spark's instrumentation.
Example code of [Spark's plugins to capture Hadoop IO 
Proof of concept [implementation of some read time metrics for 

> Add public IOStatistics API; S3A to support
> -------------------------------------------
>                 Key: HADOOP-16830
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16830
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs, fs/s3
>    Affects Versions: 3.3.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
> Applications like to collect statistics on what specific operations cost, 
> by collecting exactly those operations done during the execution of FS API 
> calls by their individual worker threads, and returning these to their job 
> driver.
> * S3A has a statistics API for some streams, but it's a non-standard one; 
> Impala &c can't use it
> * FileSystem storage statistics are public, but as they aren't cross-thread, 
> they don't aggregate properly
> Proposed
> # A new IOStatistics interface to serve up statistics
> # S3A to implement
> # other stores to follow
> # Pass-through from the usual wrapper classes (FS data input/output streams)
> It's hard to think about how best to offer an API for operation context 
> stats, and how to actually implement.
> ThreadLocal isn't enough because the helper threads need to update the 
> thread-local value of the instigator.
> My initial PoC doesn't address that issue, but it shows what I'm thinking of.
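
One possible way to approach the ThreadLocal limitation described in the 
quoted text is to capture the instigating thread's statistics object when a 
task is created and bind it on the helper thread while the task runs. The 
sketch below only illustrates that idea and is not the PoC or the proposed 
IOStatistics API: OperationStats, StatsContext and propagating() are 
hypothetical names, and only JDK classes are used.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical per-operation statistics holder; safe to update from any thread. */
final class OperationStats {
    private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

    void increment(String key, long delta) {
        counters.computeIfAbsent(key, k -> new LongAdder()).add(delta);
    }

    /** Point-in-time copy of the counters, sorted by key. */
    Map<String, Long> snapshot() {
        Map<String, Long> out = new TreeMap<>();
        counters.forEach((k, v) -> out.put(k, v.sum()));
        return out;
    }
}

/** Hypothetical context holder: one OperationStats per thread by default. */
final class StatsContext {
    private static final ThreadLocal<OperationStats> CURRENT =
        ThreadLocal.withInitial(OperationStats::new);

    private StatsContext() {
    }

    static OperationStats current() {
        return CURRENT.get();
    }

    /**
     * Wraps a task so that, while it runs on a helper thread, current()
     * returns the stats object of the thread that called propagating().
     */
    static Runnable propagating(Runnable task) {
        // Evaluated on the instigating thread, before the task is handed off.
        final OperationStats captured = CURRENT.get();
        return () -> {
            OperationStats previous = CURRENT.get();
            CURRENT.set(captured);
            try {
                task.run();
            } finally {
                // Restore the helper thread's own stats for later tasks.
                CURRENT.set(previous);
            }
        };
    }
}
```

A caller would submit StatsContext.propagating(task) to its executor instead 
of the bare task, then read StatsContext.current().snapshot() on its own 
thread once the helper tasks have finished. Tasks submitted without the 
wrapper would still update only the helper thread's own statistics.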
