[ 
https://issues.apache.org/jira/browse/HADOOP-11446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14266025#comment-14266025
 ] 

Thomas Demoor commented on HADOOP-11446:
----------------------------------------

@ [~tedyu]: 
- I fear we have overlooked this: copyFile and copyLocalFile still use a 
method-local TransferManager object named transfers (instead of the class 
member, i.e. this.transfers). The addendum removes the shutDown() calls on the 
local TransferManager there, but we should abolish the local objects entirely 
and use only this.transfers.
- Concerning close(): I guess calling close() is left to the end user. However, 
I think we do not leak memory as long as fs.s3a.threads.keepalivetime > 0. 
Because you set tpe.allowCoreThreadTimeOut(true), the TransferManager will be 
garbage collected after it goes out of scope AND all (core) threads have timed 
out. Correct?
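
To illustrate the keep-alive point, here is a minimal sketch using only 
java.util.concurrent. The class and method names (CoreThreadTimeoutDemo, 
newPool, poolSizeAfterIdle) are made up for this example and are not taken from 
the patch. It only shows that idle core threads exit once 
allowCoreThreadTimeOut(true) is set with a positive keep-alive; it does not 
demonstrate garbage collection of the TransferManager itself.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CoreThreadTimeoutDemo {

    /** Builds a fixed-size pool whose core threads may time out when idle. */
    static ThreadPoolExecutor newPool(int threads, long keepAliveMillis) {
        ThreadPoolExecutor tpe = new ThreadPoolExecutor(
            threads, threads, keepAliveMillis, TimeUnit.MILLISECONDS,
            new LinkedBlockingQueue<Runnable>());
        // Throws IllegalArgumentException if the keep-alive time is not > 0.
        tpe.allowCoreThreadTimeOut(true);
        return tpe;
    }

    /** Runs a few tasks, then waits for the idle workers to exit. */
    static int poolSizeAfterIdle(int threads, long keepAliveMillis)
            throws Exception {
        ThreadPoolExecutor tpe = newPool(threads, keepAliveMillis);
        for (int i = 0; i < threads; i++) {
            tpe.submit(() -> { }).get(); // each submit starts a core thread
        }
        // Poll until every worker has timed out, or give up after 5 seconds.
        long deadline = System.currentTimeMillis() + 5000;
        while (tpe.getPoolSize() > 0 && System.currentTimeMillis() < deadline) {
            Thread.sleep(keepAliveMillis);
        }
        return tpe.getPoolSize();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("workers left: " + poolSizeAfterIdle(4, 50));
    }
}
```

Note that the pool is never shut down here; the workers go away purely because 
of the keep-alive timeout, which is the situation described above when the end 
user never calls close().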

@[[email protected]]: I fear we should. Without the addendum, if the purge 
code is hit, the next fs command will throw an error because the 
TransferManager has been shut down. Furthermore, my first remark above hints 
at an addendum-002.

> S3AOutputStream should use shared thread pool to avoid OutOfMemoryError
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-11446
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11446
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 2.6.0
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>             Fix For: 2.7.0
>
>         Attachments: hadoop-11446-001.patch, hadoop-11446-002.patch, 
> hadoop-11446-003.patch, hadoop-11446.addendum
>
>
> When working with Terry Padgett who used s3a for hbase snapshot, the 
> following issue was uncovered.
> Here is part of the output including the OOME when hbase snapshot is exported 
> to s3a (nofile ulimit was increased to 102400):
> {code}
> 2014-12-19 13:15:03,895 INFO  [main] s3a.S3AFileSystem: OutputStream for key 
> 'FastQueryPOC/2014-12-11/EVENT1-IDX-snapshot/.hbase-snapshot/.tmp/EVENT1_IDX_snapshot_2012_12_11/
>     650a5678810fbdaa91809668d11ccf09/.regioninfo' closed. Now beginning upload
> 2014-12-19 13:15:03,895 INFO  [main] s3a.S3AFileSystem: Minimum upload part 
> size: 16777216 threshold2147483647
> Exception in thread "main" java.lang.OutOfMemoryError: unable to create new 
> native thread
>         at java.lang.Thread.start0(Native Method)
>         at java.lang.Thread.start(Thread.java:713)
>         at 
> java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:949)
>         at 
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1360)
>         at 
> java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:132)
>         at 
> com.amazonaws.services.s3.transfer.internal.UploadMonitor.<init>(UploadMonitor.java:129)
>         at 
> com.amazonaws.services.s3.transfer.TransferManager.upload(TransferManager.java:449)
>         at 
> com.amazonaws.services.s3.transfer.TransferManager.upload(TransferManager.java:382)
>         at 
> org.apache.hadoop.fs.s3a.S3AOutputStream.close(S3AOutputStream.java:127)
>         at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
>         at 
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
>         at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:54)
>         at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:112)
>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:366)
>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:356)
>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:356)
>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:338)
>         at 
> org.apache.hadoop.hbase.snapshot.ExportSnapshot.run(ExportSnapshot.java:791)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>         at 
> org.apache.hadoop.hbase.snapshot.ExportSnapshot.innerMain(ExportSnapshot.java:882)
>         at 
> org.apache.hadoop.hbase.snapshot.ExportSnapshot.main(ExportSnapshot.java:886)
> {code}
> In S3AOutputStream#close():
> {code}
>       TransferManager transfers = new TransferManager(client);
> {code}
> This results in each TransferManager creating its own thread pool, leading to 
> the OOME.
> One solution is to pass shared thread pool to TransferManager.
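
The difference between a pool per close() and a shared pool can be sketched 
with the JDK alone. FakeTransferManager below is a hypothetical stand-in, not 
the AWS SDK class, and the pool size of 4 is an arbitrary assumption; the 
sketch only counts how many threads each approach creates.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

public class SharedPoolSketch {

    /** Thread factory that counts how many threads it has created. */
    static final class CountingFactory implements ThreadFactory {
        final AtomicInteger created = new AtomicInteger();
        @Override
        public Thread newThread(Runnable r) {
            created.incrementAndGet();
            Thread t = new Thread(r);
            t.setDaemon(true);
            return t;
        }
    }

    /** Hypothetical stand-in for a transfer manager bound to a given pool. */
    static final class FakeTransferManager {
        private final ExecutorService pool;
        FakeTransferManager(ExecutorService pool) { this.pool = pool; }
        Future<?> upload(String key) {
            return pool.submit(() -> { /* pretend to upload key */ });
        }
    }

    /** One new pool per close(): thread count grows with the number of files. */
    static int threadsWithPoolPerClose(int closes) throws Exception {
        CountingFactory factory = new CountingFactory();
        List<ExecutorService> pools = new ArrayList<>();
        for (int i = 0; i < closes; i++) {
            ExecutorService pool = Executors.newFixedThreadPool(4, factory);
            pools.add(pool); // kept alive, as when the pool is never shut down
            new FakeTransferManager(pool).upload("key-" + i).get();
        }
        for (ExecutorService pool : pools) {
            pool.shutdown();
        }
        return factory.created.get();
    }

    /** One shared pool: thread count is bounded by the pool size. */
    static int threadsWithSharedPool(int closes) throws Exception {
        CountingFactory factory = new CountingFactory();
        ExecutorService shared = Executors.newFixedThreadPool(4, factory);
        try {
            for (int i = 0; i < closes; i++) {
                new FakeTransferManager(shared).upload("key-" + i).get();
            }
        } finally {
            shared.shutdown();
        }
        return factory.created.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("per-close pools: " + threadsWithPoolPerClose(20));
        System.out.println("shared pool:     " + threadsWithSharedPool(20));
    }
}
```

With many small files, as in the snapshot export above, the per-close variant 
keeps adding threads until the process can no longer create native threads, 
while the shared variant stays at the configured pool size.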



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
