[ 
https://issues.apache.org/jira/browse/HADOOP-18182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17528664#comment-17528664
 ] 

Steve Loughran commented on HADOOP-18182:
-----------------------------------------

Before HADOOP-17338 the code was something like:

{code}
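// build the GET request for the object key
GetObjectRequest request = client.newGetRequest(key);
// the S3Object is only held in a local variable here
S3Object object = client.getObject(request);
// only the inner content stream is retained, in a field
wrappedStream = object.getObjectContent();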
{code}

where wrappedStream was a field of S3AInputStream. We would keep it and then 
close or abort it as needed.

This worked well until an AWS SDK update added a finalizer method to S3Object 
which closed the HTTP stream when the object was GC'd, even though wrappedStream 
was still actively using that stream. As a result, if a stream was kept open 
long enough for a GC to happen, reads could fail.

We need to keep the lifespan of the two objects in sync, retaining a reference 
to the outer S3Object for as long as the inner stream is in use.
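
A minimal sketch of that idea, not the actual S3AInputStream or S3File code. 
FakeS3Object and PairedReader are hypothetical stand-ins invented for this 
illustration, so that it runs without the AWS SDK. The only point is that the 
outer object and the inner stream are held in fields together and released 
together.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical stand-in for the SDK's S3Object: it owns the content stream. */
class FakeS3Object implements AutoCloseable {
    private final InputStream content;

    FakeS3Object(byte[] data) {
        this.content = new ByteArrayInputStream(data);
    }

    InputStream getObjectContent() {
        return content;
    }

    @Override
    public void close() {
        try {
            content.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

/** Hypothetical reader that keeps the outer object and inner stream in sync. */
class PairedReader implements AutoCloseable {
    // Strong reference: the outer object stays reachable (so it cannot be
    // finalized) for as long as this reader is reachable and still open.
    private FakeS3Object object;
    private InputStream wrappedStream;

    PairedReader(FakeS3Object object) {
        this.object = object;
        this.wrappedStream = object.getObjectContent();
    }

    /** Reads one byte, or returns -1 at end of stream. */
    int read() {
        if (wrappedStream == null) {
            throw new IllegalStateException("stream is closed");
        }
        try {
            return wrappedStream.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    boolean isOpen() {
        return object != null;
    }

    /** Closes the outer object and drops both references at the same time. */
    @Override
    public void close() {
        if (object != null) {
            try {
                object.close();
            } finally {
                object = null;
                wrappedStream = null;
            }
        }
    }
}
```

Keeping both references in the same owner means there is no window in which 
the stream is in use while the object that owns it is unreachable.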

> S3File to store reference to active S3Object in a field.
> --------------------------------------------------------
>
>                 Key: HADOOP-18182
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18182
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.4.0
>            Reporter: Steve Loughran
>            Assignee: Bhalchandra Pandit
>            Priority: Major
>
> HADOOP-17338 showed us how {{S3Object.finalize()}} in recent SDK versions can 
> call stream.close() and so close an active stream if a GC happens during a 
> read. Replicate the same fix here.


