[jira] [Assigned] (HADOOP-17469) IOStatistics Phase II

Steve Loughran (Jira) Wed, 27 Jul 2022 06:32:08 -0700


     [ 
https://issues.apache.org/jira/browse/HADOOP-17469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Steve Loughran reassigned HADOOP-17469:
---------------------------------------

    Assignee: Mehakmeet Singh

> IOStatistics Phase II
> ---------------------
>
>                 Key: HADOOP-17469
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17469
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs, fs/azure, fs/s3
>    Affects Versions: 3.3.1
>            Reporter: Steve Loughran
>            Assignee: Mehakmeet Singh
>            Priority: Major
>
> Continue IOStatistics development with the goals of:
> * Easy adoption in applications
> * Better instrumentation in the Hadoop codebase (distcp?)
> * More stats in the abfs and s3a connectors
> A key feature has to be a thread-level context for statistics, so that app 
> code doesn't have to explicitly ask for the stats of each worker thread. 
> Instead, filesystem components update the context stats as well as the 
> thread stats (when?) and apps can then pick them up.
> * need to manage performance by minimising inefficient lookups, lock 
> acquisition, etc. on what should be memory-only ops (read(), write())
> * for duration tracking, cut down on calls to System.currentTimeMillis() so 
> that only one is made per operation
> * need to propagate the context into worker threads
> Target uses
> * Impala 
> * Spark via SPARK-29397 
> * S3A committers
> * Iceberg.
> I have a WiP Parquet branch too, to see what can be done there. This shows 
> why the thread context is needed, as it's unworkable to build up your own 
> stats snapshot. Even if you collect it for listX and stream reads, it 
> doesn't include FS operations (e.g. rename()), and you need to rework all 
> your methods to pass the stats collector around.
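
The thread-level context and its propagation into worker threads, as described above, could look roughly like the following. This is only a minimal sketch using plain JDK classes, not the Hadoop IOStatistics API: the class StatsContextSketch, its methods (current(), increment(), get(), propagate(), runDemo()) and the use of "stream_read_bytes" as a counter key are all hypothetical names chosen for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical sketch of a thread-level statistics context; not the Hadoop API. */
final class StatsContextSketch {

  /** Each thread lazily gets its own context unless one is propagated to it. */
  private static final ThreadLocal<StatsContextSketch> CURRENT =
      ThreadLocal.withInitial(StatsContextSketch::new);

  /** LongAdder keeps counter updates cheap when many threads share a context. */
  private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

  /** Context bound to the calling thread; no argument passing needed. */
  static StatsContextSketch current() {
    return CURRENT.get();
  }

  /** Called by (hypothetical) filesystem code on read(), write(), etc. */
  void increment(String key, long delta) {
    counters.computeIfAbsent(key, k -> new LongAdder()).add(delta);
  }

  /** Current value of a counter, or 0 if it was never updated. */
  long get(String key) {
    LongAdder adder = counters.get(key);
    return adder == null ? 0L : adder.sum();
  }

  /** Wrap a task so that it runs under the submitting thread's context. */
  static Runnable propagate(Runnable task) {
    final StatsContextSketch parent = current();
    return () -> {
      StatsContextSketch previous = CURRENT.get();
      CURRENT.set(parent);
      try {
        task.run();
      } finally {
        // Restore the pool thread's own context so nothing leaks between tasks.
        CURRENT.set(previous);
      }
    };
  }

  /** Eight worker tasks each record 1024 bytes; the caller reads one total. */
  static long runDemo() {
    CURRENT.set(new StatsContextSketch()); // start the demo from a clean context
    StatsContextSketch context = current();
    ExecutorService pool = Executors.newFixedThreadPool(4);
    for (int i = 0; i < 8; i++) {
      pool.submit(propagate(
          () -> current().increment("stream_read_bytes", 1024)));
    }
    pool.shutdown();
    try {
      if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
        throw new IllegalStateException("workers did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return context.get("stream_read_bytes");
  }

  public static void main(String[] args) {
    System.out.println(runDemo()); // prints 8192
  }
}
```

In this sketch the application never asks individual worker threads for their statistics; it reads a single context after the work completes. The hot path still costs a ThreadLocal lookup plus a map lookup per update, so a real implementation would probably cache the counter reference inside the stream to keep read() and write() close to memory-only operations.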



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

