[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HADOOP-13028:
------------------------------------
    Attachment: 
org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt

The attached {{TestS3AInputStreamPerformance}} output shows the result of a 
test run. There's a fairly big variance in operation times, which could be 
caused by a combination of buffering and network delays; someone should run 
this test inside EC2 itself to see what shows up. For anyone trying to 
optimise things, the counters (times skipped, times incomplete, etc.) are the 
ones that give more deterministic results.

> add counter and timer metrics for S3A HTTP & low-level operations
> -----------------------------------------------------------------
>
>                 Key: HADOOP-13028
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13028
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3, metrics
>    Affects Versions: 2.8.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>         Attachments: HADOOP-13028-001.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> Against S3 (and other object stores), opening connections can be expensive, 
> and closing connections may be expensive too (a sign of a regression). 
> The S3A FS and individual input streams should have counters of the number 
> of open/close/failure+reconnect operations, and timers of how long these 
> operations take. These can be used downstream to measure the efficiency of 
> the code (how often connections are being made), connection reliability, etc.
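
As a rough illustration of the kind of per-stream statistics the description 
asks for, here is a minimal sketch. The class and method names 
(StreamStatistics, recordOpen, recordClose, recordFailureAndReconnect) are 
hypothetical and are not taken from the attached patch; it only assumes that 
callers measure durations themselves, e.g. with System.nanoTime().

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical statistics holder for one input stream.
 * Not the S3A implementation: names and layout are assumptions.
 */
class StreamStatistics {
    // Counters: deterministic for a given access pattern.
    private final AtomicLong openOperations = new AtomicLong();
    private final AtomicLong closeOperations = new AtomicLong();
    private final AtomicLong reconnects = new AtomicLong();

    // Timers: accumulated durations, subject to network/buffering variance.
    private final AtomicLong openNanos = new AtomicLong();
    private final AtomicLong closeNanos = new AtomicLong();

    /** Record one completed open and how long it took. */
    void recordOpen(long durationNanos) {
        openOperations.incrementAndGet();
        openNanos.addAndGet(durationNanos);
    }

    /** Record one completed close and how long it took. */
    void recordClose(long durationNanos) {
        closeOperations.incrementAndGet();
        closeNanos.addAndGet(durationNanos);
    }

    /** Record a failure that was followed by a reconnect. */
    void recordFailureAndReconnect() {
        reconnects.incrementAndGet();
    }

    long getOpenOperations() { return openOperations.get(); }
    long getCloseOperations() { return closeOperations.get(); }
    long getReconnects() { return reconnects.get(); }

    /** Mean open time in milliseconds, or 0 if nothing was opened. */
    double meanOpenMillis() {
        long n = openOperations.get();
        return n == 0 ? 0.0 : openNanos.get() / 1e6 / n;
    }

    /** Mean close time in milliseconds, or 0 if nothing was closed. */
    double meanCloseMillis() {
        long n = closeOperations.get();
        return n == 0 ? 0.0 : closeNanos.get() / 1e6 / n;
    }
}
```

A caller would wrap each open or close with two System.nanoTime() readings 
and pass the difference in. Keeping counts separate from accumulated times 
lets a test assert on the counts while treating the timings as indicative 
only.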



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
