Steve Loughran created HADOOP-17434:
---------------------------------------

             Summary: Improve S3A upload statistics collection from 
ProgressEvent callbacks
                 Key: HADOOP-17434
                 URL: https://issues.apache.org/jira/browse/HADOOP-17434
             Project: Hadoop Common
          Issue Type: Sub-task
          Components: fs/s3
    Affects Versions: 3.4.0
            Reporter: Steve Loughran


Collection of S3A upload statistics from ProgressEvent callbacks can be improved.

There are two similar but different listener implementations:
* org.apache.hadoop.fs.s3a.S3ABlockOutputStream.BlockUploadProgress
* org.apache.hadoop.fs.s3a.ProgressableProgressListener, used on simple PUT 
calls.

Both call back into S3A FS to incrementWriteOperations; BlockUploadProgress 
also updates S3AInstrumentation/IOStatistics.

* I'm not 100% confident that BlockUploadProgress is updating things 
(especially gauges of pending bytes) at the right time,
* or that completion is being handled.
* The other listener doesn't update S3AInstrumentation; numbers are lost.
* There's no incremental updating during 
{{CommitOperations.uploadFileToPendingCommit()}}, which only calls 
Progressable.progress() once per block.
* Nor is there any in MultipartUploader.
Proposed: 
* A single progress listener which updates BlockOutputStreamStatistics, used by 
all interfaces.
* WriteOperations to help set this up for callers.
* Its uploadPart API to take a Progressable (or the progress listener to 
use for uploading that part).
* Multipart upload API to also take a Progressable; this would help distcp-like 
applications.
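
A minimal sketch of what such a single listener might look like. Everything 
here is hypothetical: UploadEventType, UploadStatistics and 
UploadProgressListener are self-contained stand-ins for the SDK's progress 
events and for BlockOutputStreamStatistics, not the real S3A or AWS classes, 
and a Runnable stands in for Progressable. Counting one write operation per 
completed part, and leaving already-transferred bytes counted after a failure, 
are assumptions of the sketch.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for the SDK's progress event types. */
enum UploadEventType { PART_STARTED, BYTES_TRANSFERRED, PART_COMPLETED, PART_FAILED }

/** Hypothetical stand-in for BlockOutputStreamStatistics. */
final class UploadStatistics {
  final AtomicLong bytesPending = new AtomicLong();
  final AtomicLong bytesUploaded = new AtomicLong();
  final AtomicLong writeOperations = new AtomicLong();
}

/** One instance per PUT or part upload; every upload path shares this class. */
final class UploadProgressListener {
  private final UploadStatistics stats;
  private final Runnable progressable; // stand-in for Progressable; may be null
  private final long size;             // bytes this upload is expected to send
  private long transferred;            // bytes already moved pending -> uploaded
  private boolean finished;            // guards against duplicate terminal events

  UploadProgressListener(UploadStatistics stats, Runnable progressable, long size) {
    this.stats = stats;
    this.progressable = progressable;
    this.size = size;
    stats.bytesPending.addAndGet(size); // the data is queued as soon as we exist
  }

  synchronized void progressChanged(UploadEventType type, long bytes) {
    if (finished) {
      return;
    }
    switch (type) {
      case BYTES_TRANSFERRED:
        // Incremental update; never move more than is still pending.
        move(Math.min(bytes, size - transferred));
        break;
      case PART_COMPLETED:
        // Reconcile whatever the byte events did not report.
        move(size - transferred);
        stats.writeOperations.incrementAndGet();
        finished = true;
        break;
      case PART_FAILED:
        // Nothing more will be sent: clear this upload's share of the gauge.
        stats.bytesPending.addAndGet(-(size - transferred));
        finished = true;
        break;
      default:
        break;
    }
    if (progressable != null) {
      progressable.run(); // report liveness on every event, not once per block
    }
  }

  private void move(long bytes) {
    transferred += bytes;
    stats.bytesPending.addAndGet(-bytes);
    stats.bytesUploaded.addAndGet(bytes);
  }
}
```

Because the listener owns the pending/uploaded bookkeeping for its own part, 
the simple PUT path, the block output stream and a multipart uploader could all 
create one per upload and get the same gauge behaviour.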

Plus integration tests to verify that the gauges come out right: at the end of 
each operation, the number of bytes pending upload == 0 and the number of bytes 
uploaded == the original size.
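
The end-of-operation invariant could be checked along these lines. This is only 
an illustrative sketch: GaugeInvariantCheck, simulateUpload and verify are 
hypothetical names, and the upload is simulated in memory rather than run 
against S3 or the real statistics classes.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical, in-memory illustration of the gauge invariant. */
final class GaugeInvariantCheck {

  /** Simulates uploading size bytes in blocks; returns {pending, uploaded}. */
  static long[] simulateUpload(long size, long blockSize) {
    if (blockSize <= 0) {
      throw new IllegalArgumentException("blockSize must be positive");
    }
    AtomicLong pending = new AtomicLong();
    AtomicLong uploaded = new AtomicLong();
    for (long offset = 0; offset < size; offset += blockSize) {
      long len = Math.min(blockSize, size - offset);
      pending.addAndGet(len);   // block queued for upload
      pending.addAndGet(-len);  // block upload finished
      uploaded.addAndGet(len);
    }
    return new long[] {pending.get(), uploaded.get()};
  }

  /** Fails unless pending == 0 and uploaded == the original size. */
  static void verify(long[] gauges, long originalSize) {
    if (gauges[0] != 0) {
      throw new AssertionError("bytes pending upload: " + gauges[0]);
    }
    if (gauges[1] != originalSize) {
      throw new AssertionError("bytes uploaded " + gauges[1]
          + " != original size " + originalSize);
    }
  }
}
```

A real integration test would read the two values from the stream's statistics 
after close() instead of from the simulation.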





--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org
