[
https://issues.apache.org/jira/browse/HADOOP-17271?focusedWorklogId=524625&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-524625
]
ASF GitHub Bot logged work on HADOOP-17271:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 15/Dec/20 18:31
Start Date: 15/Dec/20 18:31
Worklog Time Spent: 10m
Work Description: steveloughran opened a new pull request #2553:
URL: https://github.com/apache/hadoop/pull/2553
This is the rebase of #2324 #2323 with the following changes.
We're into the danger zone here as I keep adding more features; this needs
to be merged so that further work can go in as new patches on top.
Specifically I'm looking at: AWS SDK (major) and better tracking of upload
progress (minor).
## Hadoop common
HADOOP-16830 API/Implementation changes for statistics
* tuning standard statistic names
* review javadoc
* can generate an o.a.h.fs.Statistics instance from an IOStatistics instance;
this can be used to unify statistics collection/reporting.
* Duration tracker to take long as argument
* another duration tracker lambda expression for any InvocationRaisingIOE
void -> void.
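As a rough illustration of the void -> void duration tracking described above, here is a minimal standalone sketch. It does not use the Hadoop classes: `DurationSketch`, `VoidIOOperation`, `trackDuration` and the two maps are hypothetical names invented for this example, and the real hadoop-common API may differ.

```java
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a duration-tracking helper; not the Hadoop API. */
public class DurationSketch {

  /** A void -> void operation which may raise an IOException. */
  @FunctionalInterface
  public interface VoidIOOperation {
    void apply() throws IOException;
  }

  /** Invocation counts, keyed by statistic name. */
  public static final Map<String, AtomicLong> COUNTS = new ConcurrentHashMap<>();

  /** Total elapsed milliseconds, keyed by statistic name. */
  public static final Map<String, AtomicLong> TOTAL_MILLIS = new ConcurrentHashMap<>();

  /** Add one invocation and its elapsed time to the named statistic. */
  static void record(String name, long millis) {
    COUNTS.computeIfAbsent(name, k -> new AtomicLong()).incrementAndGet();
    TOTAL_MILLIS.computeIfAbsent(name, k -> new AtomicLong()).addAndGet(millis);
  }

  /** Run the operation, recording its duration whether it succeeds or throws. */
  public static void trackDuration(String name, VoidIOOperation op)
      throws IOException {
    long start = System.nanoTime();
    try {
      op.apply();
    } finally {
      record(name, (System.nanoTime() - start) / 1_000_000L);
    }
  }

  public static void main(String[] args) throws IOException {
    trackDuration("op_upload", () -> { /* the work being timed */ });
    System.out.println("invocations: " + COUNTS.get("op_upload").get());
  }
}
```

The `finally` block means a failing operation is still counted; this sketch does not distinguish failures from successes.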
## Hadoop-aws
HADOOP-17271. S3A Statistics Enhancement/tuning
* move S3A statistics from o.a.h.fs.s3a.impl.statistics into s3a.statistics
(interfaces) and s3a.statistics.impl for the implementations
* org.apache.hadoop.fs.s3a.Statistic enum adds a type for each entry.
this allows test/instrumentation setup to immediately determine which
are counters vs other types, and so set things up properly. This will make
a difference as more statistics are added
* S3AFileSystem.getStatistics() now serves up a dynamic
binding to S3AInstrumentation.instanceIOStatistics
* ...so no need to have separate statistics updating
* Committer to track duration of: file upload, file commit
* Some review of block upload counting/gauges, but key changes
left for a follow-up JIRA.
* javadocs everywhere!
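
To illustrate the idea of an enum whose entries declare their own type, so that setup code can register each one in the right place, here is a minimal standalone sketch. The entry names, `StatType`, and `Registry` are hypothetical and chosen for this example only; they are not the actual contents of the S3A `Statistic` enum or instrumentation.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of typed statistics with automatic registration. */
public class TypedStatisticSketch {

  /** The kinds of statistic an entry can be. */
  public enum StatType { COUNTER, GAUGE, DURATION }

  /** Hypothetical entries; each one declares its symbol and its type. */
  public enum Stat {
    FILES_CREATED("files_created", StatType.COUNTER),
    BLOCKS_ACTIVE("blocks_active", StatType.GAUGE),
    FILE_UPLOAD("file_upload", StatType.DURATION);

    public final String symbol;
    public final StatType type;

    Stat(String symbol, StatType type) {
      this.symbol = symbol;
      this.type = type;
    }
  }

  /** Registers every enum entry in the map matching its declared type. */
  public static final class Registry {
    public final Map<Stat, AtomicLong> counters = new EnumMap<>(Stat.class);
    public final Map<Stat, AtomicLong> gauges = new EnumMap<>(Stat.class);
    public final Map<Stat, AtomicLong> durations = new EnumMap<>(Stat.class);

    public Registry() {
      for (Stat s : Stat.values()) {
        switch (s.type) {
          case COUNTER:
            counters.put(s, new AtomicLong());
            break;
          case GAUGE:
            gauges.put(s, new AtomicLong());
            break;
          default:
            durations.put(s, new AtomicLong());
            break;
        }
      }
    }
  }

  public static void main(String[] args) {
    Registry registry = new Registry();
    registry.counters.get(Stat.FILES_CREATED).incrementAndGet();
    System.out.println(registry.counters.keySet());
  }
}
```

With this shape, adding a new statistic only requires a new enum entry; no registration code has to change.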
## Testing:
S3 London with:
```
-Dparallel-tests -DtestsThreadCount=4 -Dmarkers=delete -Ds3guard -Ddynamo
-Dfs.s3a.directory.marker.audit=true
-Dparallel-tests -DtestsThreadCount=4 -Dmarkers=keep
-Dfs.s3a.directory.marker.audit=true
```
I'll do a scale run next.
1. Unifying Statistics by making the Statistic class include a type, and
then automatically registering things in the right place, turns out to be a
nice design pattern we should use in the other stores.
1. Serving up the IOStatistics counters as FileSystem StorageStatistics
eliminates duplicate work/inconsistency. The core support for this is in
hadoop-common.
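
The second point, serving one set of counters through a second API instead of updating two sets, can be sketched as a read-through view. This is a standalone illustration under assumed names (`CounterStore`, `StatisticsView`); it is not the hadoop-common implementation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: one counter store, exposed through a live view. */
public class LiveStatisticsViewSketch {

  /** The single place where counters are updated. */
  public static final class CounterStore {
    private final Map<String, AtomicLong> counters = new ConcurrentHashMap<>();

    public void increment(String key, long delta) {
      counters.computeIfAbsent(key, k -> new AtomicLong()).addAndGet(delta);
    }

    long lookup(String key) {
      AtomicLong value = counters.get(key);
      return value == null ? 0L : value.get();
    }
  }

  /** Read-only view: stores nothing itself and reads through on every call. */
  public static final class StatisticsView {
    private final CounterStore source;

    public StatisticsView(CounterStore source) {
      this.source = source;
    }

    public long getLong(String key) {
      return source.lookup(key);
    }
  }

  public static void main(String[] args) {
    CounterStore store = new CounterStore();
    StatisticsView view = new StatisticsView(store);
    store.increment("op_create", 2);
    // The view was created before the update but still sees it.
    System.out.println(view.getLong("op_create"));
  }
}
```

Because the view copies nothing, the two reporting paths cannot drift apart.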
Issue Time Tracking
-------------------
Worklog Id: (was: 524625)
Time Spent: 8h (was: 7h 50m)
> S3A statistics to support IOStatistics
> --------------------------------------
>
> Key: HADOOP-17271
> URL: https://issues.apache.org/jira/browse/HADOOP-17271
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.3.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Major
> Labels: pull-request-available
> Time Spent: 8h
> Remaining Estimate: 0h
>
> S3A to rework statistics with
> * API + Implementation split of the interfaces used by subcomponents when
> reporting stats
> * S3A Instrumentation to implement all the interfaces
> * streams, etc to all implement IOStatisticsSources and serve to callers
> * Add some tracking of durations of remote requests