[ 
https://issues.apache.org/jira/browse/HADOOP-11570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14316709#comment-14316709
 ] 

Dan Hecht commented on HADOOP-11570:
------------------------------------

I considered that as well, but didn't know how to choose a good threshold.  And 
I agree that the seek case is more important.  So, my preference would be to 
get this patch committed and then the threshold optimization for seek and/or 
close could be explored as a separate issue.

> S3AInputStream.close() downloads the remaining bytes of the object from S3
> --------------------------------------------------------------------------
>
>                 Key: HADOOP-11570
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11570
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.6.0
>            Reporter: Dan Hecht
>         Attachments: HADOOP-11570-001.patch
>
>
> Currently, S3AInputStream.close() calls S3Object.close().  However,
> S3Object.close() reads the remaining bytes of the S3 object, potentially
> transferring a large number of bytes from S3 only to discard them.  Instead,
> the wrapped stream should be aborted to avoid transferring discarded bytes
> (unless the preceding read() finished at contentLength).  For example,
> reading only the first byte of a 1 GB object and then closing the stream
> results in the entire 1 GB being transferred from S3.
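
The close() policy described above can be sketched roughly as follows. This
is not the attached patch: FakeObjectStream and SketchInputStream are
hypothetical stand-ins written only for illustration, with FakeObjectStream
imitating a wrapped stream whose close() drains the remaining bytes and whose
abort() drops the connection (the AWS SDK's S3ObjectInputStream offers an
abort() method of this kind). The byte counter is likewise an assumption added
so the effect can be observed.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class AbortOnCloseSketch {

    /** Hypothetical stand-in for an HTTP-backed object stream. */
    static class FakeObjectStream extends InputStream {
        private final InputStream in;
        long transferred = 0;    // bytes pulled over the simulated connection
        boolean aborted = false;

        FakeObjectStream(byte[] data) {
            this.in = new ByteArrayInputStream(data);
        }

        @Override
        public int read() throws IOException {
            int b = in.read();
            if (b >= 0) {
                transferred++;
            }
            return b;
        }

        /** Imitates a draining close: reads everything that is left. */
        @Override
        public void close() throws IOException {
            while (read() >= 0) {
                // discard the remaining bytes
            }
            in.close();
        }

        /** Imitates an abort: gives up the connection without reading more. */
        void abort() {
            aborted = true;
        }
    }

    /** Hypothetical wrapper showing the proposed close() behaviour. */
    static class SketchInputStream extends InputStream {
        private final FakeObjectStream wrapped;
        private final long contentLength;
        private long pos = 0;
        private boolean closed = false;

        SketchInputStream(FakeObjectStream wrapped, long contentLength) {
            this.wrapped = wrapped;
            this.contentLength = contentLength;
        }

        @Override
        public int read() throws IOException {
            int b = wrapped.read();
            if (b >= 0) {
                pos++;
            }
            return b;
        }

        @Override
        public void close() throws IOException {
            if (closed) {
                return;
            }
            closed = true;
            if (pos >= contentLength) {
                wrapped.close();   // nothing left to drain, a normal close is cheap
            } else {
                wrapped.abort();   // avoid downloading bytes that would be discarded
            }
        }
    }

    /** Reads bytesToRead bytes, closes, and reports how many bytes were transferred. */
    static long bytesTransferredAfterReading(int objectSize, int bytesToRead) {
        FakeObjectStream raw = new FakeObjectStream(new byte[objectSize]);
        try (SketchInputStream s = new SketchInputStream(raw, objectSize)) {
            for (int i = 0; i < bytesToRead; i++) {
                s.read();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return raw.transferred;
    }

    public static void main(String[] args) {
        System.out.println(bytesTransferredAfterReading(1_000_000, 1));
    }
}
```

With this policy, reading one byte and closing leaves the transfer count at
one byte instead of the full object size. The trade-off is that an aborted
connection generally cannot be reused, which is where a threshold for
draining small remainders might come in.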



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
