[ 
https://issues.apache.org/jira/browse/HADOOP-13028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15265383#comment-15265383
 ] 

Steve Loughran commented on HADOOP-13028:
-----------------------------------------

Regarding the contained patches: the IOE handling patch, HADOOP-12844, is a 
direct precursor. I just optimised the implementation by moving the existing 
handler for socket exceptions to after the EOF handler, and expanded the check 
to all IOEs. You can look at the patch there and think "would that work?" We 
don't have any test checking this failure path (who fancies writing some fault 
injection mocking?), so a review matters there.
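
For reviewers who want the shape of that handler ordering in front of them, here is a minimal sketch. It is not the code from either patch: the class {{RetryingReadSketch}}, the {{Reopener}} interface and the {{reconnects}} counter are all hypothetical names, and the single retry is an assumption. It only illustrates checking for EOF first and then treating every other IOE as a reason to reconnect.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical sketch of the handler ordering; not the patch itself. */
final class RetryingReadSketch {

  /** Hypothetical hook standing in for "reopen the connection". */
  interface Reopener {
    InputStream reopen() throws IOException;
  }

  private final Reopener reopener;
  private InputStream in;

  /** Number of times a failed read forced a reconnect. */
  int reconnects;

  RetryingReadSketch(Reopener reopener) throws IOException {
    this.reopener = reopener;
    this.in = reopener.reopen();
  }

  int read() throws IOException {
    try {
      return in.read();
    } catch (EOFException e) {
      // EOF is handled first: it means end of data, not a broken connection.
      return -1;
    } catch (IOException e) {
      // Any other IOE, not only socket exceptions: reconnect and retry once.
      // Assumption: a second failure propagates to the caller.
      reconnects++;
      try {
        in.close();
      } catch (IOException ignored) {
        // The old stream is already considered broken.
      }
      in = reopener.reopen();
      return in.read();
    }
  }
}
```

A fault injection test for this path could hand the sketch a first stream whose {{read()}} throws, then a healthy one, and check that the read succeeds with exactly one reconnect.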

The forward seek buffering code is very different; this is the code to 
consider. It does a lot of thinking about how far to seek:

# if the forward seek length is within the {{available()}} range, that is, data 
already received, *always read forwards*. That's irrespective of the requested range.
# otherwise, the forward read limit is min(bytes remaining, buffer size)
# there are counters of the number of forward/backward seeks, and of how many 
bytes were skipped during forward seeks
# there are tests
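
As a reading aid for the list above, here is a rough sketch of that decision. It is an interpretation, not the patch: the class {{ForwardSeekSketch}}, its method and parameter names, and the counter field names are hypothetical, and treating the second rule as an upper bound on the forward distance is an assumption.

```java
/** Hypothetical sketch of the forward seek decision; not the patch itself. */
final class ForwardSeekSketch {

  /** What the stream would do for a given seek. */
  enum Action { NONE, SKIP_FORWARD, REOPEN }

  // Counters mirroring the statistics named in the list (names are assumed).
  long forwardSeeks;
  long backwardSeeks;
  long bytesSkippedOnSeek;

  /**
   * @param diff       distance from the current position to the target
   * @param available  bytes already received and readable without blocking
   * @param remaining  bytes left in the current request
   * @param bufferSize configured buffer size
   */
  static boolean shouldReadForward(long diff, long available,
      long remaining, long bufferSize) {
    if (diff <= 0) {
      return false;
    }
    if (diff <= available) {
      // Rule 1: the data is already here, so always read forwards.
      return true;
    }
    // Rule 2 (assumed reading): cap the forward read at
    // min(bytes remaining, buffer size).
    return diff <= Math.min(remaining, bufferSize);
  }

  Action seek(long pos, long target, long available,
      long remaining, long bufferSize) {
    long diff = target - pos;
    if (diff == 0) {
      return Action.NONE;
    }
    if (diff < 0) {
      backwardSeeks++;
      return Action.REOPEN;
    }
    forwardSeeks++;
    if (shouldReadForward(diff, available, remaining, bufferSize)) {
      bytesSkippedOnSeek += diff;
      return Action.SKIP_FORWARD;
    }
    return Action.REOPEN;
  }
}
```

In this sketch a forward seek of 100 bytes with 200 bytes available is always served by skipping, while a 600 byte seek with only 10 bytes available and a 500 byte buffer falls back to a reopen.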

So: review this code directly.

I'll look at the logging and remove keys code. There's already an open JIRA on 
failures of deletes after a rename, which I was hoping to have addressed 
elsewhere.


> add low level counter metrics for S3A; use in read performance tests
> --------------------------------------------------------------------
>
>                 Key: HADOOP-13028
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13028
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3, metrics
>    Affects Versions: 2.8.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>         Attachments: HADOOP-13028-001.patch, HADOOP-13028-002.patch, 
> HADOOP-13028-004.patch, HADOOP-13028-005.patch, HADOOP-13028-006.patch, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt, 
> org.apache.hadoop.fs.s3a.scale.TestS3AInputStreamPerformance-output.txt
>
>
> against S3 (and other object stores), opening connections can be expensive, 
> closing connections may be expensive (a sign of a regression). 
> S3A FS and individual input streams should have counters of the # of 
> open/close/failure+reconnect operations, timers of how long things take. This 
> can be used downstream to measure efficiency of the code (how often 
> connections are being made), connection reliability, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
