[
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15263292#comment-15263292
]
Chris Nauroth commented on HADOOP-13028:
----------------------------------------
Hello [[email protected]]. This looks very useful overall.
I'm a bit confused, because it seems different iterations of the patch have
folded in fixes from other JIRAs. Can you please clarify for reviewers if we
should be reviewing other patches first?
Since the patch is touching some {{LOG.debug}} statements, would it be helpful
to include {{src}} and {{dst}} in those log messages?
{{S3AFileSystem#removeKeys}} appears to have some subtle bugs, though they are
not entirely related to your patch. A multi-delete can fail partway, with some
objects successfully deleted and others remaining. However, the stats increment
only if the whole multi-delete succeeds.
http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/AmazonS3Client.html#deleteObjects(com.amazonaws.services.s3.model.DeleteObjectsRequest)
http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/model/MultiObjectDeleteException.html
Similarly, if multi-delete is disabled, then any individual delete in the loop
might throw an exception and skip the stats increments.
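
To make the concern concrete, here is a minimal, self-contained sketch of one
way the counters could stay accurate on both paths. Everything in it is a
hypothetical stand-in rather than S3A or AWS SDK code: {{FakeStore}},
{{PartialDeleteException}} (playing the role of {{MultiObjectDeleteException}},
which reports the objects that were deleted and the errors for the rest), and
the counter names {{deleteRequests}} and {{objectsDeleted}} are all made up for
illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.atomic.AtomicLong;

/** Sketch only: all types and counter names are hypothetical stand-ins. */
class RemoveKeysSketch {

  /** Stand-in for a partial multi-delete failure; records what did succeed. */
  static class PartialDeleteException extends RuntimeException {
    final List<String> deletedKeys;
    final List<String> failedKeys;

    PartialDeleteException(List<String> deletedKeys, List<String> failedKeys) {
      super(failedKeys.size() + " key(s) could not be deleted");
      this.deletedKeys = deletedKeys;
      this.failedKeys = failedKeys;
    }
  }

  /** Stand-in for the store client; keys listed in 'undeletable' always fail. */
  static class FakeStore {
    final Set<String> undeletable = new HashSet<>();

    void deleteObject(String key) {
      if (undeletable.contains(key)) {
        throw new IllegalStateException("cannot delete " + key);
      }
    }

    void deleteObjects(List<String> keys) {
      List<String> deleted = new ArrayList<>();
      List<String> failed = new ArrayList<>();
      for (String key : keys) {
        if (undeletable.contains(key)) {
          failed.add(key);
        } else {
          deleted.add(key);
        }
      }
      if (!failed.isEmpty()) {
        throw new PartialDeleteException(deleted, failed);
      }
    }
  }

  final AtomicLong deleteRequests = new AtomicLong();
  final AtomicLong objectsDeleted = new AtomicLong();

  void removeKeys(FakeStore store, List<String> keys, boolean multiDelete) {
    if (multiDelete) {
      deleteRequests.incrementAndGet();
      try {
        store.deleteObjects(keys);
        objectsDeleted.addAndGet(keys.size());
      } catch (PartialDeleteException e) {
        // Count the objects that were removed before the failure, then rethrow.
        objectsDeleted.addAndGet(e.deletedKeys.size());
        throw e;
      }
    } else {
      for (String key : keys) {
        deleteRequests.incrementAndGet(); // count the attempt before it can throw
        store.deleteObject(key);
        objectsDeleted.incrementAndGet(); // reached only if this delete succeeded
      }
    }
  }

  public static void main(String[] args) {
    FakeStore store = new FakeStore();
    store.undeletable.add("b");
    RemoveKeysSketch sketch = new RemoveKeysSketch();
    try {
      sketch.removeKeys(store, Arrays.asList("a", "b", "c"), true);
    } catch (PartialDeleteException e) {
      System.out.println("failed keys: " + e.failedKeys);
    }
    System.out.println("requests=" + sketch.deleteRequests.get()
        + " deleted=" + sketch.objectsDeleted.get());
  }
}
```

The point of the sketch is only that the increments happen per successful
delete (or from the partial-failure report) rather than after the whole
operation, so an exception cannot skip them. Whether the real counters should
count attempts, successes, or both is a separate design question.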
I'll wait for clarification on the prerequisite patches before I take this for
a test run myself.
> add low level counter metrics for S3A; use in read performance tests
> --------------------------------------------------------------------
>
> Key: HADOOP-13028
> URL: https://issues.apache.org/jira/browse/HADOOP-13028
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3, metrics
> Affects Versions: 2.8.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch,
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch,
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt,
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> Against S3 (and other object stores), opening connections can be expensive, and
> closing connections may be expensive (a sign of a regression).
> S3A FS and individual input streams should have counters of the # of
> open/close/failure+reconnect operations, and timers of how long things take. This
> can be used downstream to measure efficiency of the code (how often
> connections are being made), connection reliability, etc.