[ 
https://issues.apache.org/jira/browse/HADOOP-17271?focusedWorklogId=524625&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-524625
 ]

ASF GitHub Bot logged work on HADOOP-17271:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 15/Dec/20 18:31
            Start Date: 15/Dec/20 18:31
    Worklog Time Spent: 10m 
      Work Description: steveloughran opened a new pull request #2553:
URL: https://github.com/apache/hadoop/pull/2553


   This is a rebase of #2324 and #2323 with the following changes.
   
   
   We're into the danger zone here as I keep adding more features; this
   needs to be merged so that I can add further work as new patches on top.
   Specifically I'm looking at: AWS SDK (major) and better tracking of upload
   progress (minor).
   
   ## Hadoop common
   
   HADOOP-16830 API/Implementation changes for statistics
   
   * tuning standard statistic names
   * review javadoc
   * can generate a o.a.h.fs.Statistics instance from an IOStatistics instance;
     this can be used to unify statistics collection/reporting.
   * Duration tracker to take long as argument
   * another duration tracker lambda expression for any InvocationRaisingIOE
     (void -> void).
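
As a rough illustration of the last item, here is a minimal standalone sketch of timing a void operation that may raise an IOException. It does not use the hadoop-common classes: `VoidOpRaisingIOE`, `DemoDurationStats`, the `file_upload` key and the `.failures` suffix are all hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a void -> void operation which may raise an IOE. */
@FunctionalInterface
interface VoidOpRaisingIOE {
    void apply() throws IOException;
}

/** Hypothetical duration statistics: invocation count and total time per key. */
final class DemoDurationStats {
    private final Map<String, AtomicLong> counts = new ConcurrentHashMap<>();
    private final Map<String, AtomicLong> totalNanos = new ConcurrentHashMap<>();

    /** Run the operation, recording its duration under the key or key + ".failures". */
    void trackDuration(String key, VoidOpRaisingIOE op) throws IOException {
        long start = System.nanoTime();
        boolean success = false;
        try {
            op.apply();
            success = true;
        } finally {
            // Assumption for this sketch: failures are recorded under a separate key.
            String recordKey = success ? key : key + ".failures";
            long elapsed = System.nanoTime() - start;
            counts.computeIfAbsent(recordKey, k -> new AtomicLong()).incrementAndGet();
            totalNanos.computeIfAbsent(recordKey, k -> new AtomicLong()).addAndGet(elapsed);
        }
    }

    /** Number of invocations recorded under the key; 0 if never recorded. */
    long count(String key) {
        AtomicLong c = counts.get(key);
        return c == null ? 0L : c.get();
    }

    /** Total nanoseconds recorded under the key; 0 if never recorded. */
    long totalNanos(String key) {
        AtomicLong t = totalNanos.get(key);
        return t == null ? 0L : t.get();
    }
}
```

A caller would wrap the operation in a lambda, for example `stats.trackDuration("file_upload", () -> upload())`, where `upload()` is whatever void method is being timed. The exception still propagates to the caller after the duration is recorded.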
   
       
   ## Hadoop-aws
   HADOOP-17271. S3A Statistics Enhancement/tuning
   
   * move S3A statistics from o.a.h.fs.s3a.impl.statistics into s3a.statistics
   (interfaces) and s3a.statistics.impl for the implementations
   * org.apache.hadoop.fs.s3a.Statistic enum adds a type for each entry.
     This allows test/instrumentation setup to immediately determine which
     are counters vs other types, and so set things up properly. This will
     make a difference as more statistics are added.
   * S3AFileSystem.getStatistics() now serves up a dynamic
     binding to S3AInstrumentation.instanceIOStatistics
   * ...so no need to have separate statistics updating
   * Committer to track duration of: file upload, file commit
   * Some review of block upload counting/gauges, but key changes
     left for a follow-up JIRA.
   * javadocs everywhere!
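
The typed-enum idea in the list above can be sketched roughly as follows. This is not the actual org.apache.hadoop.fs.s3a.Statistic enum or the S3A instrumentation code: `DemoStatType`, `DemoStatistic`, `DemoInstrumentation` and the three entries are hypothetical, and only illustrate how a per-entry type lets setup code register each statistic in the right store.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical statistic kinds, named for this sketch only. */
enum DemoStatType { COUNTER, GAUGE, DURATION }

/** Hypothetical typed statistic enum: each entry declares its own type. */
enum DemoStatistic {
    OBJECT_PUT_REQUESTS("object_put_requests", DemoStatType.COUNTER),
    BLOCKS_ACTIVE("blocks_active", DemoStatType.GAUGE),
    FILE_UPLOAD("file_upload", DemoStatType.DURATION);

    private final String symbol;
    private final DemoStatType type;

    DemoStatistic(String symbol, DemoStatType type) {
        this.symbol = symbol;
        this.type = type;
    }

    String getSymbol() { return symbol; }

    DemoStatType getType() { return type; }
}

/** Registers every statistic in the store matching its declared type. */
final class DemoInstrumentation {
    private final Map<DemoStatistic, AtomicLong> counters = new EnumMap<>(DemoStatistic.class);
    private final Map<DemoStatistic, AtomicLong> gauges = new EnumMap<>(DemoStatistic.class);
    private final Map<DemoStatistic, AtomicLong> durations = new EnumMap<>(DemoStatistic.class);

    DemoInstrumentation() {
        // No hand-maintained list of counters: the enum's type drives registration.
        for (DemoStatistic stat : DemoStatistic.values()) {
            storeFor(stat.getType()).put(stat, new AtomicLong());
        }
    }

    private Map<DemoStatistic, AtomicLong> storeFor(DemoStatType type) {
        switch (type) {
            case COUNTER: return counters;
            case GAUGE: return gauges;
            default: return durations;
        }
    }

    boolean isCounter(DemoStatistic stat) { return counters.containsKey(stat); }

    /** Add delta to a counter; rejects statistics not declared as counters. */
    long incrementCounter(DemoStatistic stat, long delta) {
        AtomicLong counter = counters.get(stat);
        if (counter == null) {
            throw new IllegalArgumentException("Not a counter: " + stat.getSymbol());
        }
        return counter.addAndGet(delta);
    }
}
```

With this shape, adding a new statistic is a one-line change to the enum, and tests can ask `isCounter()` rather than keeping their own list.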
   
   
   ## Testing: 
   
   S3 london with
   ```
   -Dparallel-tests -DtestsThreadCount=4 -Dmarkers=delete -Ds3guard -Ddynamo -Dfs.s3a.directory.marker.audit=true
   -Dparallel-tests -DtestsThreadCount=4 -Dmarkers=keep -Dfs.s3a.directory.marker.audit=true
   ```
   
   I'll do a scale run next.
   
   1. Unifying statistics by making the Statistic class include a type, and
      then automatically registering things in the right place, turns out to
      be a nice design pattern we should use in the other stores.
   1. Serving up the IOStatistics counters as FileSystem StorageStatistics
      eliminates duplicate work/inconsistency. The core support for this is
      in hadoop-common.
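
To illustrate the second point, here is a minimal sketch of a read-only view over a single counter store, so that nothing has to be updated twice. It is not the hadoop-common implementation: `DemoCounterStore` and `DemoStorageStatsView` are hypothetical classes, and the `getLong`/`isTracked` method names are only chosen to resemble a storage-statistics style lookup.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical single source of truth for counters. */
final class DemoCounterStore {
    private final Map<String, AtomicLong> counters = new ConcurrentHashMap<>();

    /** Increment the named counter, creating it on first use. */
    long increment(String key) {
        return counters.computeIfAbsent(key, k -> new AtomicLong()).incrementAndGet();
    }

    /** Current value, or null if the counter does not exist. */
    Long lookup(String key) {
        AtomicLong c = counters.get(key);
        return c == null ? null : c.get();
    }
}

/** Hypothetical read-only view: holds no values of its own. */
final class DemoStorageStatsView {
    private final DemoCounterStore source;

    DemoStorageStatsView(DemoCounterStore source) {
        this.source = source;
    }

    /** Always reads through to the store, so it can never drift out of sync. */
    Long getLong(String key) {
        return source.lookup(key);
    }

    boolean isTracked(String key) {
        return source.lookup(key) != null;
    }
}
```

Because the view is created once and reads through on every call, any increment made through the store is visible immediately, with no second set of counters to keep consistent.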



Issue Time Tracking
-------------------

    Worklog Id:     (was: 524625)
    Time Spent: 8h  (was: 7h 50m)

> S3A statistics to support IOStatistics
> --------------------------------------
>
>                 Key: HADOOP-17271
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17271
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.3.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 8h
>  Remaining Estimate: 0h
>
> S3A to rework statistics with
> * API + Implementation split of the interfaces used by subcomponents when 
> reporting stats
> * S3A Instrumentation to implement all the interfaces
> * streams, etc to all implement IOStatisticsSources and serve to callers
> * Add some tracking of durations of remote requests


