[ https://issues.apache.org/jira/browse/HADOOP-18182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17528695#comment-17528695 ]

Ahmar Suhail commented on HADOOP-18182:
---------------------------------------

I'm still a bit confused about this one, and may be missing something, but I 
think we're already doing this. `S3Reader` calls `S3File`, which opens the 
input stream to S3. `S3File` stores a reference to the `S3Object` in its map 
[here|https://github.com/apache/hadoop/blob/feature-HADOOP-18028-s3a-prefetch/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/read/S3File.java#L65],
 which should prevent it from being GC'd. `S3Reader` reads the entire block 
from the input stream in chunks of 64KB, and its `finally` block then closes 
the input stream, so the lifespans of the two objects should stay in sync. 
I've been looking at implementing `unbuffer()`. When you say we would want to 
clean all this up, is there anything we need to do, other than calling the 
`close()` methods, that may cause the lifespans to go out of sync?
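
To illustrate, here's roughly the lifecycle I mean. This is only a hedged 
sketch against the SDK v1 types the branch uses; the names (`S3FileSketch`, 
`openForRead`, `s3Objects`) are simplified paraphrases, not the exact code:

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.IdentityHashMap;
import java.util.Map;

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.GetObjectRequest;
import com.amazonaws.services.s3.model.S3Object;

// Simplified sketch; names are illustrative, not the branch's exact API.
class S3FileSketch {
  private final AmazonS3 client;
  private final String bucket;
  private final String key;

  // Strong references to open S3Objects, keyed by their content streams.
  // While an entry is in this map (and the S3FileSketch is in use), the
  // S3Object stays reachable, so its finalizer cannot run mid-read.
  private final Map<InputStream, S3Object> s3Objects = new IdentityHashMap<>();

  S3FileSketch(AmazonS3 client, String bucket, String key) {
    this.client = client;
    this.bucket = bucket;
    this.key = key;
  }

  // Open a ranged GET and remember the S3Object that backs the stream.
  InputStream openForRead(long offset, int size) {
    GetObjectRequest request = new GetObjectRequest(bucket, key)
        .withRange(offset, offset + size - 1);
    S3Object s3Object = client.getObject(request);
    InputStream stream = s3Object.getObjectContent();
    synchronized (s3Objects) {
      s3Objects.put(stream, s3Object);
    }
    return stream;
  }

  // Close the stream and drop the strong reference together, keeping the
  // lifespans of the stream and the S3Object in sync.
  void close(InputStream stream) throws IOException {
    S3Object s3Object;
    synchronized (s3Objects) {
      s3Object = s3Objects.remove(stream);
    }
    stream.close();
    if (s3Object != null) {
      s3Object.close();
    }
  }
}
```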

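And a sketch of the read side I described (64KB chunks, stream closed in a 
`finally` block); again, `S3ReaderSketch`, `readBlock` and `CHUNK_SIZE` are 
made-up names for illustration:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

// Illustrative reader: pulls the whole block in 64KB chunks, then the
// finally block closes the stream, ending its lifespan (and, via the map
// in the sketch above, the S3Object's) as soon as the read completes.
class S3ReaderSketch {
  private static final int CHUNK_SIZE = 64 * 1024;

  int readBlock(S3FileSketch file, InputStream stream, ByteBuffer buffer)
      throws IOException {
    try {
      byte[] chunk = new byte[CHUNK_SIZE];
      int total = 0;
      while (buffer.hasRemaining()) {
        int wanted = Math.min(CHUNK_SIZE, buffer.remaining());
        int read = stream.read(chunk, 0, wanted);
        if (read < 0) {
          break;   // end of the block
        }
        buffer.put(chunk, 0, read);
        total += read;
      }
      return total;
    } finally {
      file.close(stream);   // always release stream + S3Object together
    }
  }
}
```
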
> S3File to store reference to active S3Object in a field.
> --------------------------------------------------------
>
>                 Key: HADOOP-18182
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18182
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.4.0
>            Reporter: Steve Loughran
>            Assignee: Bhalchandra Pandit
>            Priority: Major
>
> HADOOP-17338 showed us how the recent {{S3Object.finalize()}} can call 
> {{stream.close()}} and so close an active stream if a GC happens during a 
> read. Replicate the same fix here.
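
For context, a hedged sketch of the fix the issue describes: store the 
`S3Object` in a field of the wrapping stream so it stays strongly reachable, 
and so cannot be finalized, while a read is in flight. `BlockInputStream` is 
an illustrative name, not a class on the branch:

```java
import java.io.IOException;
import java.io.InputStream;

import com.amazonaws.services.s3.model.S3Object;

// Illustrative wrapper: holding the S3Object in a field keeps it strongly
// reachable for the wrapper's whole lifetime, so a GC during read() cannot
// run S3Object.finalize() and close the stream underneath the reader.
class BlockInputStream extends InputStream {
  private final S3Object s3Object;    // strong reference: the fix
  private final InputStream wrapped;

  BlockInputStream(S3Object s3Object) {
    this.s3Object = s3Object;
    this.wrapped = s3Object.getObjectContent();
  }

  @Override
  public int read() throws IOException {
    return wrapped.read();
  }

  @Override
  public int read(byte[] b, int off, int len) throws IOException {
    return wrapped.read(b, off, len);
  }

  @Override
  public void close() throws IOException {
    wrapped.close();
    s3Object.close();   // release the underlying HTTP connection too
  }
}
```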


