Steve Loughran created HADOOP-17434: ---------------------------------------
             Summary: Improve S3A upload statistics collection from ProgressEvent callbacks
                 Key: HADOOP-17434
                 URL: https://issues.apache.org/jira/browse/HADOOP-17434
             Project: Hadoop Common
          Issue Type: Sub-task
          Components: fs/s3
    Affects Versions: 3.4.0
            Reporter: Steve Loughran


Collection of S3A upload stats from ProgressEvent callbacks can be improved.

There are two similar but different implementations of listeners:
* org.apache.hadoop.fs.s3a.S3ABlockOutputStream.BlockUploadProgress
* org.apache.hadoop.fs.s3a.ProgressableProgressListener. Used on simple PUT calls.

Both call back into S3A FS to incrementWriteOperations; BlockUploadProgress also updates S3AInstrumentation/IOStatistics.

* I'm not 100% confident that BlockUploadProgress is updating things (especially gauges of pending bytes) at the right time
* or that completion is being handled.
* The other listener doesn't update S3AInstrumentation; numbers are lost.
* There's no incremental updating during {{CommitOperations.uploadFileToPendingCommit()}}, which only calls Progressable.progress() once per block
* or in MultipartUploader.

Proposed:
* A single progress listener which updates BlockOutputStreamStatistics, used by all interfaces.
* WriteOperations to help set this up for callers;
* and its uploadPart API to take a Progressable (or the progress listener to use for uploading that part).
* Multipart upload API to also add a Progressable; this would help for distcp-like applications.

Plus ITests to verify that the gauges come out right: at the end of each operation, the number of bytes pending upload == 0 and the number of bytes uploaded == the original size.
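
A rough sketch of what the proposed single listener and the gauge check could look like. Everything here is hypothetical: the event enum, {{UploadStats}} and {{PartProgressListener}} are simplified stand-ins, not the real AWS SDK ProgressEvent types, BlockOutputStreamStatistics or any existing S3A class, and a {{Runnable}} stands in for Progressable.progress(). It only illustrates one way the pending-bytes gauge could be kept consistent across start, incremental transfer, completion and failure.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: none of these types are real S3A or AWS SDK classes. */
class UploadProgressSketch {

  /** Simplified stand-in for the SDK's progress event types. */
  enum EventType { PART_STARTED, BYTES_TRANSFERRED, PART_COMPLETED, PART_FAILED }

  /** Stand-in for BlockOutputStreamStatistics. */
  static final class UploadStats {
    final AtomicLong bytesPending = new AtomicLong();   // gauge
    final AtomicLong bytesUploaded = new AtomicLong();  // counter
    final AtomicLong progressCalls = new AtomicLong();  // caller notifications
  }

  /** One instance per part or PUT; all instances share one UploadStats. */
  static final class PartProgressListener {
    private final UploadStats stats;
    private final long partSize;
    private final Runnable progressable; // stand-in for Progressable.progress()
    private long transferred;            // bytes of this part credited so far

    PartProgressListener(UploadStats stats, long partSize, Runnable progressable) {
      this.stats = stats;
      this.partSize = partSize;
      this.progressable = progressable;
    }

    synchronized void onEvent(EventType type, long bytes) {
      switch (type) {
        case PART_STARTED:
          // The whole part is now queued for upload.
          stats.bytesPending.addAndGet(partSize);
          break;
        case BYTES_TRANSFERRED:
          credit(bytes);
          break;
        case PART_COMPLETED:
          // Settle any bytes not reported incrementally.
          credit(partSize - transferred);
          break;
        case PART_FAILED:
          // Roll back: nothing from this part counts as pending or uploaded.
          stats.bytesPending.addAndGet(-(partSize - transferred));
          stats.bytesUploaded.addAndGet(-transferred);
          transferred = 0;
          break;
      }
      progressable.run(); // notify the caller on every event
    }

    /** Move bytes from the pending gauge to the uploaded counter. */
    private void credit(long bytes) {
      long n = Math.min(bytes, partSize - transferred); // never over-credit
      transferred += n;
      stats.bytesPending.addAndGet(-n);
      stats.bytesUploaded.addAndGet(n);
    }
  }

  /** Simulates uploading each part: start, one incremental update, completion. */
  static void simulateUpload(long[] partSizes, UploadStats stats) {
    for (long size : partSizes) {
      PartProgressListener listener =
          new PartProgressListener(stats, size, stats.progressCalls::incrementAndGet);
      listener.onEvent(EventType.PART_STARTED, 0);
      listener.onEvent(EventType.BYTES_TRANSFERRED, size / 2);
      listener.onEvent(EventType.PART_COMPLETED, 0);
    }
  }

  public static void main(String[] args) {
    UploadStats stats = new UploadStats();
    simulateUpload(new long[] {6, 4}, stats);
    System.out.println("pending=" + stats.bytesPending.get()
        + " uploaded=" + stats.bytesUploaded.get());
  }
}
```

An ITest along the lines proposed above would then assert, after each operation, that the pending gauge is back at zero and the uploaded counter equals the original size. Keeping per-part state in the listener and shared state in the statistics object is one possible design; the real patch may well split this differently.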