[
https://issues.apache.org/jira/browse/HADOOP-15944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17578347#comment-17578347
]
Steve Loughran commented on HADOOP-15944:
-----------------------------------------
I think closing idle streams would be good, especially input streams. The
logging side is sufficient now because of the auditing work.
We have re-enabled auditing in the latest versions of CDP and love it! The
IOStatisticsContext code uses the same WeakReferenceMap; it is off by default
for now.
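A minimal sketch of the idle-stream idea, assuming nothing about the actual WeakReferenceMap in hadoop-common: the class and method names below are made up for illustration, and it just uses a plain java.util.WeakHashMap to track open streams so that idle ones can be found and closed without pinning them in memory.
{code:java}
import java.io.Closeable;
import java.io.IOException;
import java.util.Map;
import java.util.WeakHashMap;

/**
 * Illustration only: track open streams with weak keys so that streams which
 * have already been garbage collected drop out of the map, while streams that
 * are still reachable but idle can be closed to release their pooled HTTP
 * connection.
 */
public class IdleStreamTracker {

  /** Last-use timestamp for one tracked stream. */
  private static final class Activity {
    volatile long lastUsed = System.currentTimeMillis();
  }

  // Weak keys: an unreachable stream silently disappears from the map.
  private final Map<Closeable, Activity> streams = new WeakHashMap<>();

  public synchronized void register(Closeable stream) {
    streams.put(stream, new Activity());
  }

  public synchronized void touch(Closeable stream) {
    Activity activity = streams.get(stream);
    if (activity != null) {
      activity.lastUsed = System.currentTimeMillis();
    }
  }

  /** Close every tracked stream that has been idle longer than idleMillis. */
  public synchronized void closeIdle(long idleMillis) {
    long now = System.currentTimeMillis();
    for (Map.Entry<Closeable, Activity> entry : streams.entrySet()) {
      if (now - entry.getValue().lastUsed > idleMillis) {
        try {
          entry.getKey().close();   // frees the pooled HTTP connection
        } catch (IOException ignored) {
          // best-effort cleanup; nothing useful to do here
        }
      }
    }
  }
}
{code}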
> S3AInputStream logging to make it easier to debug file leakage
> --------------------------------------------------------------
>
> Key: HADOOP-15944
> URL: https://issues.apache.org/jira/browse/HADOOP-15944
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.1.1
> Reporter: Steve Loughran
> Priority: Minor
>
> Problem: if an application opens too many input streams, all the HTTP
> connections in the S3A pool can be used up; subsequent FS operations then
> fail, timing out while waiting for a connection from the pool.
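> For illustration only, a sketch of the leak pattern with made-up paths: the
> pool size comes from fs.s3a.connection.maximum, a stream that is opened and
> read but never closed keeps holding one of those pooled connections, while
> try-with-resources returns it promptly.
> {code:java}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FSDataInputStream;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
>
> public class LeakExample {
>   public static void main(String[] args) throws Exception {
>     Configuration conf = new Configuration();
>     FileSystem fs = FileSystem.get(new Path("s3a://bucket/").toUri(), conf);
>
>     // Anti-pattern: the stream, and the pooled HTTP connection behind it,
>     // is never released; enough of these and the pool is exhausted.
>     FSDataInputStream leaked = fs.open(new Path("s3a://bucket/data/part-0000"));
>     leaked.read();
>
>     // Preferred: try-with-resources closes the stream and frees the connection.
>     try (FSDataInputStream in = fs.open(new Path("s3a://bucket/data/part-0001"))) {
>       in.read();
>     }
>   }
> }
> {code}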
> Proposed simple solution: log better what's going on with the input stream
> lifecycle, specifically:
> # include URL of file in open, reopen & close events
> # maybe: a separate logger for these events, though the S3AInputStream logger
> should be enough, as the class doesn't do much else.
> # maybe: add a prefix such as "Lifecycle" to these events, so that you could
> enable the existing log at DEBUG, grep for that phrase and look at the printed
> URLs to identify what's going on (see the sketch after this list)
> # stream metrics: expose some of the state of the HTTP connection pool and/or
> active input and output streams
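> A sketch of how item 3 could be used in practice, assuming a standard
> log4j.properties setup and a (made-up) "Lifecycle" prefix on the messages;
> grepping the resulting DEBUG log for that prefix lists the URLs of streams
> that were opened or reopened but never closed.
> {code}
> # Only the input stream class needs to go to DEBUG for this.
> log4j.logger.org.apache.hadoop.fs.s3a.S3AInputStream=DEBUG
> {code}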
> Idle output streams don't use up HTTP connections, as they only connect
> during block upload.