[ https://issues.apache.org/jira/browse/HADOOP-13203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15314402#comment-15314402 ]
Chris Nauroth commented on HADOOP-13203:
----------------------------------------

Rajesh, thank you for the further explanation. Sorry for my earlier confusion. I was misinterpreting the word "abort" to mean something happening at the TCP layer, e.g. an RST packet sent from the S3 back-end. Now I understand that we're really talking about our own abort logic in {{S3AInputStream#closeStream}}.

Now that I understand the goal of this change, I can code review it. I'll try to do that later today (PST).

> S3a: Consider reducing the number of connection aborts by setting correct
> length in s3 request
> ----------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-13203
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13203
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>            Priority: Minor
>         Attachments: HADOOP-13203-branch-2-001.patch,
>                      HADOOP-13203-branch-2-002.patch
>
>
> Currently the file's "contentLength" is set as the "requestedStreamLen" when
> invoking S3AInputStream::reopen(). As part of lazySeek(), the stream sometimes
> has to be closed and reopened. But in many cases the stream is closed with
> abort(), which leaves the internal HTTP connection unusable. This incurs a lot
> of connection establishment cost in some jobs. It would be good to set the
> correct value for the stream length to avoid connection aborts.
> I will post the patch once the AWS tests pass on my machine.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
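
To illustrate the issue description above, here is a minimal Java sketch of why the requested length matters when the stream is closed. It is not the code from the attached patches. The class name, both method names, and the 64 KiB threshold are hypothetical and chosen only for this example. It assumes the close path aborts the connection whenever too many bytes of the current request remain unread, and drains them otherwise so the connection can be reused.

```java
// Hypothetical sketch; names and threshold are illustrative assumptions,
// not taken from S3AInputStream or the HADOOP-13203 patches.
final class RangeCloseSketch {

    // Assumed cutoff: if more than this many bytes of the current request
    // are still unread, draining them is considered too expensive.
    static final long DRAIN_THRESHOLD = 64L * 1024;

    // End offset (exclusive) of the range to request when (re)opening at pos.
    // Requesting only what the caller asked for, capped at the file length,
    // keeps the amount left unread at close time small.
    static long requestEnd(long pos, long targetLen, long contentLength) {
        return Math.min(contentLength, pos + targetLen);
    }

    // Decide how to close: true means abort (connection is discarded),
    // false means drain the remaining bytes so the connection can be reused.
    static boolean shouldAbort(long pos, long requestEnd) {
        long remaining = requestEnd - pos;
        return remaining > DRAIN_THRESHOLD;
    }

    private RangeCloseSketch() {
    }
}
```

Under these assumptions, take a 1 GiB file that is opened at offset 0 and read up to 4 KiB short of the 1 MiB mark before a seek forces a close. If the request ran to the end of the file, `shouldAbort` sees roughly 1 GiB remaining and returns true. If the request was bounded with `requestEnd(0, 1 MiB, 1 GiB)`, only 4 KiB remains, so the stream can be drained and the connection returned for reuse.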