[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15270986#comment-15270986
 ] 

Steve Loughran commented on HADOOP-13028:
-----------------------------------------

1. fixed

3. Counting backwards seeks: I'm just ignoring that value for now. I've left it 
in in case we ever want to start tracking these things. One interesting 
question about all the seek + read stats is whether histograms of requests 
would really be the best metric, not just the aggregates.
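
To make the histogram idea concrete, here is a minimal sketch of what 
per-stream seek histograms could look like next to the aggregate counters. 
The class SeekHistogram and all of its methods are hypothetical names for 
illustration only; none of this is in the patch or in S3A.

```java
import java.util.concurrent.atomic.AtomicLongArray;

/**
 * Hypothetical helper (not part of S3A): power-of-two histograms of
 * forward and backward seek distances.
 */
final class SeekHistogram {
  // Bucket 0 counts zero-length seeks; bucket i (i >= 1) counts
  // distances d with 2^(i-1) <= d < 2^i.
  private static final int BUCKETS = 64;
  private final AtomicLongArray forward = new AtomicLongArray(BUCKETS);
  private final AtomicLongArray backward = new AtomicLongArray(BUCKETS);

  /** Map a non-negative distance to its bucket index. */
  static int bucketFor(long distance) {
    return 64 - Long.numberOfLeadingZeros(distance);
  }

  /** Record a seek from one non-negative stream position to another. */
  void record(long from, long to) {
    long delta = to - from;
    if (delta < 0) {
      backward.incrementAndGet(bucketFor(-delta));
    } else {
      forward.incrementAndGet(bucketFor(delta));
    }
  }

  long forwardCount(int bucket) {
    return forward.get(bucket);
  }

  long backwardCount(int bucket) {
    return backward.get(bucket);
  }
}
```

With something like this, a test could tell many short forward seeks apart 
from a few very long ones, which a single bytes-skipped total cannot show.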

2. let me review that code. In fact, maybe I should factor it out for some 
independent checks. Really, we should be asking for the whole thing, shouldn't 
we? Because even irrespective of the amount you want in the current read() 
call, you don't want to have to re-open just because you didn't know the 
initial amount, do you? 

I think the HTTP content-range call does require you to specify a limit, so 
the file length is always required, but that can be enough.
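
As a rough sketch of "asking for the whole thing": given the current position 
and the known file length, the ranged GET would cover everything up to the 
last byte, whatever the size of the current read() call. RangeCalc and its 
methods are hypothetical names, not code from the patch; the only outside 
fact relied on is the standard "bytes=first-last" form of the HTTP Range 
header, where both offsets are inclusive.

```java
/** Hypothetical helper (not part of S3A): range from a position to end of file. */
final class RangeCalc {

  /** Returns {first, last}, the inclusive byte offsets to request. */
  static long[] rangeToEndOfFile(long pos, long contentLength) {
    if (pos < 0 || pos >= contentLength) {
      throw new IllegalArgumentException(
          "position " + pos + " outside file of length " + contentLength);
    }
    // The last byte of the object sits at offset contentLength - 1.
    return new long[] {pos, contentLength - 1};
  }

  /** Formats the range as an HTTP Range header value. */
  static String rangeHeader(long pos, long contentLength) {
    long[] range = rangeToEndOfFile(pos, contentLength);
    return "bytes=" + range[0] + "-" + range[1];
  }
}
```

The length of the current read does not appear in the calculation at all, 
which is the point: later reads on the same stream would not force a re-open.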



> add low level counter metrics for S3A; use in read performance tests
> --------------------------------------------------------------------
>
>                 Key: HADOOP-13028
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13028
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3, metrics
>    Affects Versions: 2.8.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>         Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> HADOOP-13028-007.patch, HADOOP-13028-008.patch, 
> HADOOP-13028-branch-2-008.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



