[ 
https://issues.apache.org/jira/browse/HADOOP-19330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17896431#comment-17896431
 ] 

ASF GitHub Bot commented on HADOOP-19330:
-----------------------------------------

steveloughran opened a new pull request, #7151:
URL: https://github.com/apache/hadoop/pull/7151

   
   The S3AInputStream finalizer is to:
   - abort any active HTTP connection.
   - warn if there was one.
   - include in the warning: the path, plus the stack and thread ID of creation.
   
   The warning is logged at WARN to the log
   
      org.apache.hadoop.fs.s3a.connection.leaks
   
   Set this log to a less verbose level (for example ERROR) to hide these messages.
   
   This is a best-effort recovery from stream leaks.
   Cleanup will *only* take place during GC; it is easy to run out of 
connections without running out of memory.
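
The behaviour described above could be sketched roughly as follows. This is a minimal illustration, not the code in the PR: the class `LeakAwareStream`, the method `abortIfLeaked()` and the boolean standing in for the HTTP connection are all hypothetical, and `java.util.logging` is used only to keep the sketch free of dependencies. The logger name is the one given in the description.

```java
import java.io.InputStream;
import java.util.logging.Level;
import java.util.logging.Logger;

/**
 * Hypothetical stand-in for a stream that holds an HTTP connection.
 * None of the names here are taken from the actual patch.
 */
@SuppressWarnings({"deprecation", "removal"})
public class LeakAwareStream extends InputStream {

  // Dedicated logger so leak warnings can be filtered on their own.
  private static final Logger LEAK_LOG =
      Logger.getLogger("org.apache.hadoop.fs.s3a.connection.leaks");

  private final String path;
  private final long creatorThreadId;
  private final String creatorThreadName;
  // Records the stack at construction time; only logged if a leak is found.
  private final Throwable creationSite;
  // Assumption: a boolean models "an HTTP connection is still held".
  private boolean connectionOpen;

  public LeakAwareStream(String path) {
    Thread creator = Thread.currentThread();
    this.path = path;
    this.creatorThreadId = creator.getId();
    this.creatorThreadName = creator.getName();
    this.creationSite = new Throwable("stream created here");
    this.connectionOpen = true;
  }

  @Override
  public int read() {
    return -1; // no data: the sketch only models the stream lifecycle
  }

  @Override
  public synchronized void close() {
    connectionOpen = false; // normal path: the caller releases the connection
  }

  /**
   * Abort the connection if it is still open and log a warning.
   * @return true if a leak was detected and recovered from.
   */
  public synchronized boolean abortIfLeaked() {
    if (!connectionOpen) {
      return false;
    }
    connectionOpen = false; // stands in for aborting the HTTP connection
    LEAK_LOG.log(Level.WARNING,
        "Stream not closed: path=" + path
            + "; created by thread " + creatorThreadId
            + " (" + creatorThreadName + ")",
        creationSite);
    return true;
  }

  @Override
  protected void finalize() throws Throwable {
    try {
      abortIfLeaked(); // best effort: runs only if and when the GC finalizes
    } finally {
      super.finalize();
    }
  }

  public static void main(String[] args) {
    LeakAwareStream leaked = new LeakAwareStream("s3a://example-bucket/data.csv");
    // Call the check directly; finalization timing is up to the GC.
    System.out.println("leak detected: " + leaked.abortIfLeaked());
  }
}
```

The check is exposed as a method so it can be exercised without waiting for a GC. Calling it on an unclosed stream returns true and logs one warning with the creation stack; after `close()` it returns false.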
   
   
   ### How was this patch tested?
   
   New ITest.
   
   ### For code changes:
   
   - [X] Does the title of this PR start with the corresponding JIRA issue id 
(e.g. 'HADOOP-17799. Your PR title ...')?
   - [X] Object storage: have the integration tests been executed and the 
endpoint declared according to the connector-specific documentation?
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, 
`NOTICE-binary` files?
   
   




> S3AInputStream.finalizer to warn if closed with http connection -then release 
> it
> --------------------------------------------------------------------------------
>
>                 Key: HADOOP-19330
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19330
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.4.1
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>
> A recurring problem is that applications forget to close their input streams; 
> eventually the HTTP connections run out.
> Having the finalizer close streams during GC will ensure that after a GC the 
> HTTP connections are returned. While this is an improvement on today, it is 
> insufficient:
> * it only happens during GC, so may not fix the problem entirely
> * it doesn't let developers know things are going wrong
> * it doesn't let us differentiate well between a stream leak and an overloaded FS
>
> Proposed enhancements:
> * collect a stack trace in the constructor
> * log in finalize at WARN, including path, thread and stack
> * have special log for this, so it can be turned off in production (libraries 
> telling end users off for developer errors is simply an annoyance)
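
The failure mode in the issue description does not need any memory pressure to appear. The sketch below uses a hypothetical fixed-size pool (`PoolDemo.Pool`, not the S3A connection pool) to show how unclosed streams exhaust connections, and how try-with-resources on the application side avoids it. All names are assumptions made for illustration.

```java
import java.io.Closeable;
import java.util.concurrent.Semaphore;

/** Hypothetical fixed-size connection pool; not the S3A implementation. */
public class PoolDemo {

  static final class Pool {
    private final Semaphore permits;

    Pool(int size) {
      permits = new Semaphore(size);
    }

    /** Returns a connection, or null if the pool is exhausted. */
    Connection tryAcquire() {
      return permits.tryAcquire() ? new Connection(this) : null;
    }
  }

  static final class Connection implements Closeable {
    private final Pool pool;
    private boolean closed;

    Connection(Pool pool) {
      this.pool = pool;
    }

    @Override
    public void close() {
      if (!closed) {
        closed = true;
        pool.permits.release(); // hand the connection back to the pool
      }
    }
  }

  /** Opens connections and drops the references without closing them. */
  static int leakyReads(Pool pool, int attempts) {
    int opened = 0;
    for (int i = 0; i < attempts; i++) {
      Connection c = pool.tryAcquire();
      if (c != null) {
        opened++; // never closed: the permit is lost
      }
    }
    return opened;
  }

  /** Opens connections with try-with-resources so each one is returned. */
  static int carefulReads(Pool pool, int attempts) {
    int opened = 0;
    for (int i = 0; i < attempts; i++) {
      try (Connection c = pool.tryAcquire()) { // a null resource is skipped
        if (c != null) {
          opened++;
        }
      }
    }
    return opened;
  }

  public static void main(String[] args) {
    System.out.println("leaky opens: " + leakyReads(new Pool(4), 10));
    System.out.println("careful opens: " + carefulReads(new Pool(4), 10));
  }
}
```

With a pool of 4 and 10 attempts, the leaky loop succeeds only 4 times while the careful loop succeeds all 10 times. The leaked objects are tiny, so a GC (and with it any finalizer-based recovery) may not run before the pool is empty.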



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

