[ 
https://issues.apache.org/jira/browse/HADOOP-13203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15342185#comment-15342185
 ] 

Chris Nauroth commented on HADOOP-13203:
----------------------------------------

Steve and Rajesh, this looks great to me.  We'll get the best of both worlds.  
Thank you very much.

All of the random vs. sequential logic looks correct to me.  All tests passed 
for me against a bucket in us-west-2, barring the known failure related to a 
secret with a '+' in it, which is tracked elsewhere.  I only have a few minor 
nitpicks on patch 008.

1. Please add audience and stability annotations to {{S3AInputPolicy}}.

{code}
   * Optimised purely for random seek+reed/positionedRead operations;
{code}

2. Typo in the comment above: s/reed/read/

{code}
    // Better to set it to the value requested by higher level layer.
    // In case this is set to contentLength, expect lots of connection
    // closes when backwards-seeks are executed.
    // Note that abort would force the internal connection to be
    // closed and makes it un-usable.
{code}

3. I think this comment can be removed.  It no longer seems relevant.

{code}
    LOG.info("Stream Statistics\n{}", streamStatistics);
{code}

4. I suggest changing this to the following for platform-agnostic line endings:

{code}
    LOG.info(String.format("Stream Statistics%n{}"), streamStatistics);
{code}
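
To illustrate why point 4 works, here is a minimal standalone sketch.  It 
assumes nothing beyond the JDK: the class name {{LineEndingDemo}} and the 
method {{template()}} are hypothetical and exist only for this example.  
{{String.format}} expands {{%n}} to the platform line separator and leaves 
{{{}}} untouched, so the logger can still substitute its argument afterwards.

```java
class LineEndingDemo {

    // Hypothetical helper: builds the log message template.
    // %n is expanded by String.format to System.lineSeparator();
    // "{}" is not a format specifier, so it passes through unchanged
    // and remains available as the logger's placeholder.
    static String template() {
        return String.format("Stream Statistics%n{}");
    }

    public static void main(String[] args) {
        String t = template();
        // The placeholder survives formatting.
        System.out.println(t.endsWith("{}"));
        // The separator is the platform one, not a hard-coded "\n".
        System.out.println(t.contains(System.lineSeparator()));
    }
}
```

In the patch itself the resulting string would be passed to {{LOG.info}} 
together with {{streamStatistics}}, as in the snippet above.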


> S3a: Consider reducing the number of connection aborts by setting correct 
> length in s3 request
> ----------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-13203
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13203
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.8.0
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>         Attachments: HADOOP-13203-branch-2-001.patch, 
> HADOOP-13203-branch-2-002.patch, HADOOP-13203-branch-2-003.patch, 
> HADOOP-13203-branch-2-004.patch, HADOOP-13203-branch-2-005.patch, 
> HADOOP-13203-branch-2-006.patch, HADOOP-13203-branch-2-007.patch, 
> HADOOP-13203-branch-2-008.patch, stream_stats.tar.gz
>
>
> Currently the file's "contentLength" is used as the "requestedStreamLen" when 
> invoking S3AInputStream::reopen().  As part of lazySeek(), the stream 
> sometimes has to be closed and reopened.  But often the stream is closed 
> with abort(), which leaves the internal HTTP connection unusable.  This incurs 
> a high connection establishment cost in some jobs.  It would be good to set 
> the correct value for the stream length to avoid connection aborts. 
> I will post the patch once the AWS tests pass on my machine.
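
The idea in the description above can be sketched roughly as follows.  This is 
not the S3AInputStream code from the patch: the class {{RangeEndSketch}}, the 
enum {{Policy}}, the method names and the drain threshold are all hypothetical, 
and the abort-vs-drain rule is an assumption made for illustration.  The sketch 
only shows why requesting a shorter range leaves fewer unread bytes at seek 
time, and so makes an abort less likely.

```java
class RangeEndSketch {

    // Hypothetical policy names, for illustration only.
    enum Policy { SEQUENTIAL, RANDOM }

    // Picks the exclusive end offset of a ranged GET starting at pos.
    static long rangeEnd(Policy policy, long pos, long requestedLen,
                         long contentLength) {
        if (policy == Policy.SEQUENTIAL) {
            // Sequential readers are assumed to want the rest of the file.
            return contentLength;
        }
        // Random readers: request only what was asked for, capped at EOF.
        return Math.min(contentLength, pos + Math.max(requestedLen, 0L));
    }

    // Bytes of the current request that are still unread at position pos.
    static long remaining(long rangeEnd, long pos) {
        return Math.max(0L, rangeEnd - pos);
    }

    // Assumed rule: abort only when too many bytes are left to drain.
    static boolean shouldAbort(long remainingBytes, long drainThreshold) {
        return remainingBytes > drainThreshold;
    }

    public static void main(String[] args) {
        long contentLength = 1000L;
        long threshold = 64L; // hypothetical drain threshold in bytes
        long seqEnd = rangeEnd(Policy.SEQUENTIAL, 100L, 50L, contentLength);
        long rndEnd = rangeEnd(Policy.RANDOM, 100L, 50L, contentLength);
        // After reading 50 bytes the reader sits at offset 150 and seeks away.
        System.out.println(shouldAbort(remaining(seqEnd, 150L), threshold));
        System.out.println(shouldAbort(remaining(rndEnd, 150L), threshold));
    }
}
```

With the whole-file range, 850 bytes are still unread when the seek happens; 
with the shorter range, none are, so the connection can be closed cleanly.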


