[
https://issues.apache.org/jira/browse/HADOOP-18182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17528664#comment-17528664
]
Steve Loughran commented on HADOOP-18182:
-----------------------------------------
Before HADOOP-17338, the code was something like:
{code}
GetObjectRequest request = client.newGetRequest(key);
S3Object object = client.getObject(request);
wrappedStream = object.getObjectContent();
{code}
where wrappedStream was a field of S3AInputStream. We would keep it and then
close/abort as needed.
This worked well until an AWS SDK update added a finalizer method to S3Object
which closed the HTTP stream when the S3Object was GC'd, even though wrappedStream
was still actively using that stream. As a result, if a stream was kept open
long enough for a GC to happen, reads on it would break.
We need to keep the lifespans of the two objects in sync, retaining a reference to the
outer S3Object for as long as the inner stream is in use.
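A minimal sketch of that pairing, not the actual S3AInputStream code. RemoteObject is a hypothetical stand-in for the SDK's S3Object so the example runs without the AWS SDK, and PairedInputStream and holdsObject() are illustrative names only. The wrapper keeps both references in fields and releases them together in close().

```java
import java.io.ByteArrayInputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;

final class PairedStreamSketch {

  /** Hypothetical stand-in for S3Object: it owns the underlying content stream. */
  static final class RemoteObject implements Closeable {
    private final InputStream content;

    RemoteObject(byte[] data) {
      this.content = new ByteArrayInputStream(data);
    }

    InputStream getObjectContent() {
      return content;
    }

    @Override
    public void close() throws IOException {
      content.close();
    }
  }

  /** Retains the outer object in a field for as long as the inner stream is open. */
  static final class PairedInputStream extends InputStream {
    private RemoteObject object;        // strong reference: stays reachable while we read
    private InputStream wrappedStream;  // inner stream obtained from the object
    private boolean closed;

    PairedInputStream(RemoteObject object) {
      this.object = object;
      this.wrappedStream = object.getObjectContent();
    }

    @Override
    public int read() throws IOException {
      if (closed) {
        throw new IOException("stream is closed");
      }
      return wrappedStream.read();
    }

    @Override
    public void close() throws IOException {
      if (closed) {
        return;
      }
      closed = true;
      try {
        wrappedStream.close();
        object.close();
      } finally {
        // Drop both references at the same point so their lifespans end together.
        wrappedStream = null;
        object = null;
      }
    }

    /** Illustrative helper: true while the outer object is still retained. */
    boolean holdsObject() {
      return object != null;
    }
  }

  public static void main(String[] args) throws IOException {
    PairedInputStream in = new PairedInputStream(new RemoteObject(new byte[] {1, 2, 3}));
    System.out.println("first byte: " + in.read());
    in.close();
    System.out.println("object retained after close: " + in.holdsObject());
  }
}
```

Holding the outer object in a field is what prevents it from becoming unreachable, and so finalizable, while the inner stream is still being read.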
> S3File to store reference to active S3Object in a field.
> --------------------------------------------------------
>
> Key: HADOOP-18182
> URL: https://issues.apache.org/jira/browse/HADOOP-18182
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.4.0
> Reporter: Steve Loughran
> Assignee: Bhalchandra Pandit
> Priority: Major
>
> HADOOP-17338 showed us how {{S3Object.finalize()}} in recent AWS SDK releases can call
> stream.close() and so close an active stream if a GC happens during a read.
> Replicate the same fix here.