[
https://issues.apache.org/jira/browse/HADOOP-11570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14316416#comment-14316416
]
Dan Hecht commented on HADOOP-11570:
------------------------------------
Correct, the seek case already uses abort(). Additionally, the
S3ObjectInputStream.abort() documentation makes it clear that this is the
expected tradeoff between abort() and close():
{code}
/**
* {@inheritDoc}
*
* Aborts the underlying http request without reading any more data and
* closes the stream.
* <p>
* By default Apache {@link HttpClient} tries to reuse http connections by
* reading to the end of an attached input stream on
* {@link InputStream#close()}. This is efficient from a socket pool
* management perspective, but for objects with large payloads can incur
* significant overhead while bytes are read from s3 and discarded. It's up
* to clients to decide when to take the performance hit implicit in not
* reusing an http connection in order to not read unnecessary information
* from S3.
*
* @see EofSensorInputStream
*/
{code}
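The tradeoff described above can be sketched as a small decision helper: abort the HTTP stream when the reader has not consumed the object to contentLength, and close it (allowing connection reuse) only when nothing remains to drain. This is a minimal illustration, not the actual Hadoop patch; the ObjectStream interface and names like shouldAbort, pos, and contentLength are assumptions for the example.

```java
// Hypothetical sketch of the close-vs-abort decision (not the real
// S3AInputStream code; ObjectStream stands in for S3ObjectInputStream).
interface ObjectStream {
    void close();  // drains remaining bytes so the http connection can be reused
    void abort();  // drops the connection without reading any more data
}

class CloseLogic {
    /** Abort unless the reader already consumed the object to its end. */
    static boolean shouldAbort(long pos, long contentLength) {
        return pos < contentLength;
    }

    static void closeStream(ObjectStream in, long pos, long contentLength) {
        if (shouldAbort(pos, contentLength)) {
            in.abort();  // e.g. read 1 byte of a 1 GB object: skip the 1 GB drain
        } else {
            in.close();  // at EOF: nothing left to discard, keep the connection
        }
    }
}
```

With this policy, closing after reading only the first byte of a 1 GB object aborts the request instead of transferring the remaining ~1 GB, at the cost of not returning the connection to the pool.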
> S3AInputStream.close() downloads the remaining bytes of the object from S3
> --------------------------------------------------------------------------
>
> Key: HADOOP-11570
> URL: https://issues.apache.org/jira/browse/HADOOP-11570
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 2.6.0
> Reporter: Dan Hecht
> Attachments: HADOOP-11570-001.patch
>
>
> Currently, S3AInputStream.close() calls S3Object.close(). But
> S3Object.close() reads the remaining bytes of the S3 object, potentially
> transferring a lot of bytes from S3 that are then discarded. Instead, the wrapped
> stream should be aborted to avoid transferring discarded bytes (unless the
> preceding read() finished at contentLength). For example, reading only the
> first byte of a 1 GB object and then closing the stream will result in all 1
> GB transferred from S3.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)