[ 
https://issues.apache.org/jira/browse/HADOOP-11446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14259682#comment-14259682
 ] 

Thomas Demoor commented on HADOOP-11446:
----------------------------------------

Hi Ted,
Looks good to me. Some minor remarks:

- The parameters should be defined (and documented) in Constants.java. Your 
default of fs.s3a.threads.core=256 allows up to 256 parallel (part) uploads, 
which should fill up your network pipe :p. However, the number of concurrent 
HTTP connections opened by the underlying AmazonS3Client 
(fs.s3a.max.connections) defaults to a much lower value (too low?). Could you 
elaborate on how you chose the default values? I think we should tweak these a 
bit to give a good "out of the box" experience and/or document some tuning 
guidelines for different network conditions (use cases).

- Please also use the shiny new single TransferManager for purging at the end 
of initialize() in S3AFileSystem, replacing the following code path:
{code}
if (purgeExistingMultipart) {
      TransferManager transferManager = new TransferManager(s3);
{code}

- I like that you went for a low-level implementation of the Executor instead 
of using Executors.newFixedThreadPool. The ability to block submitting threads 
by setting fs.s3a.max.total.tasks is a nice tool for limiting memory 
consumption. Out of curiosity: can you envision use cases where setting 
different values for core.threads and max.threads would be important?
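
To make the last remark concrete, here is a minimal sketch of an executor that 
blocks submitting threads once a cap on outstanding tasks is reached. It is not 
the code from the attached patch: the class name BlockingPoolSketch, its 
methods, and the parameter names coreThreads, maxThreads and maxTotalTasks are 
hypothetical and only loosely mirror the fs.s3a.* settings discussed above. The 
blocking is done with a Semaphore, which is one possible technique and an 
assumption on my part.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch, not the patch: a pool that blocks callers at a task cap. */
public class BlockingPoolSketch {
  private final ThreadPoolExecutor pool;
  private final Semaphore permits;

  public BlockingPoolSketch(int coreThreads, int maxThreads, int maxTotalTasks) {
    // One permit per task that may be queued or running at the same time.
    this.permits = new Semaphore(maxTotalTasks);
    // ThreadPoolExecutor only grows past coreThreads when the queue is full.
    // The semaphore keeps this queue from filling, so maxThreads is rarely
    // reached here; that is part of why the core/max question matters.
    this.pool = new ThreadPoolExecutor(coreThreads, maxThreads,
        60L, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>(maxTotalTasks));
  }

  /** Blocks the calling thread until a permit is free, then hands off the task. */
  public void submit(final Runnable task) throws InterruptedException {
    permits.acquire();
    try {
      pool.execute(new Runnable() {
        @Override
        public void run() {
          try {
            task.run();
          } finally {
            permits.release(); // free the slot even if the task throws
          }
        }
      });
    } catch (RejectedExecutionException e) {
      permits.release(); // the task never started, so return its permit
      throw e;
    }
  }

  /** Stops accepting tasks and waits for the accepted ones to finish. */
  public void shutdownAndWait() throws InterruptedException {
    pool.shutdown();
    pool.awaitTermination(1, TimeUnit.MINUTES);
  }

  public int largestPoolSize() {
    return pool.getLargestPoolSize();
  }

  public static void main(String[] args) throws Exception {
    BlockingPoolSketch sketch = new BlockingPoolSketch(2, 4, 8);
    final AtomicInteger done = new AtomicInteger();
    for (int i = 0; i < 100; i++) {
      sketch.submit(() -> done.incrementAndGet());
    }
    sketch.shutdownAndWait();
    System.out.println("completed=" + done.get()
        + " largestPool=" + sketch.largestPoolSize());
  }
}
```

Because the caller waits in submit() instead of piling work into an unbounded 
queue, the number of buffered tasks (and the memory they hold) stays bounded by 
maxTotalTasks.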


> S3AOutputStream should use shared thread pool to avoid OutOfMemoryError
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-11446
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11446
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>         Attachments: hadoop-11446-001.patch
>
>
> Here is part of the output including the OOME when hbase snapshot is exported 
> to s3a (nofile ulimit was increased to 102400):
> {code}
> 2014-12-19 13:15:03,895 INFO  [main] s3a.S3AFileSystem: OutputStream for key 
> 'FastQueryPOC/2014-12-11/EVENT1-IDX-snapshot/.hbase-snapshot/.tmp/EVENT1_IDX_snapshot_2012_12_11/
>     650a5678810fbdaa91809668d11ccf09/.regioninfo' closed. Now beginning upload
> 2014-12-19 13:15:03,895 INFO  [main] s3a.S3AFileSystem: Minimum upload part size: 16777216 threshold2147483647
> Exception in thread "main" java.lang.OutOfMemoryError: unable to create new native thread
>         at java.lang.Thread.start0(Native Method)
>         at java.lang.Thread.start(Thread.java:713)
>         at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:949)
>         at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1360)
>         at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:132)
>         at com.amazonaws.services.s3.transfer.internal.UploadMonitor.<init>(UploadMonitor.java:129)
>         at com.amazonaws.services.s3.transfer.TransferManager.upload(TransferManager.java:449)
>         at com.amazonaws.services.s3.transfer.TransferManager.upload(TransferManager.java:382)
>         at org.apache.hadoop.fs.s3a.S3AOutputStream.close(S3AOutputStream.java:127)
>         at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
>         at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
>         at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:54)
>         at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:112)
>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:366)
>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:356)
>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:356)
>         at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:338)
>         at org.apache.hadoop.hbase.snapshot.ExportSnapshot.run(ExportSnapshot.java:791)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>         at org.apache.hadoop.hbase.snapshot.ExportSnapshot.innerMain(ExportSnapshot.java:882)
>         at org.apache.hadoop.hbase.snapshot.ExportSnapshot.main(ExportSnapshot.java:886)
> {code}
> In S3AOutputStream#close():
> {code}
>       TransferManager transfers = new TransferManager(client);
> {code}
> Every close() therefore creates a new TransferManager with its own thread 
> pool, which leads to the OOME.
> One solution is to pass a shared thread pool to TransferManager.
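
The difference between one pool per stream and one shared pool can be sketched 
without the AWS SDK. Everything below is a hypothetical stand-in: Uploader plays 
the role of a transfer manager that is handed an ExecutorService, and 
CountingFactory only exists to count how many threads get started. The pool 
sizes are arbitrary illustration values, not the SDK's defaults.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: per-stream thread pools versus one shared pool. */
public class SharedPoolSketch {

  /** Stand-in for a transfer manager; NOT the AWS SDK class. */
  static final class Uploader {
    private final ExecutorService pool;

    Uploader(ExecutorService pool) {
      this.pool = pool;
    }

    void upload(Runnable work) {
      pool.execute(work);
    }
  }

  /** Thread factory that counts every thread it is asked to create. */
  static final class CountingFactory implements ThreadFactory {
    final AtomicInteger created = new AtomicInteger();

    @Override
    public Thread newThread(Runnable r) {
      created.incrementAndGet();
      return new Thread(r);
    }
  }

  /** A new pool per stream: pools stay alive, so their threads add up. */
  public static int threadsWithPoolPerStream(int streams) throws InterruptedException {
    CountingFactory factory = new CountingFactory();
    List<ExecutorService> pools = new ArrayList<ExecutorService>();
    for (int i = 0; i < streams; i++) {
      ExecutorService own = Executors.newFixedThreadPool(10, factory);
      pools.add(own);
      new Uploader(own).upload(() -> { });
    }
    int created = factory.created.get();
    for (ExecutorService p : pools) { // clean up so the demo can exit
      p.shutdown();
      p.awaitTermination(10, TimeUnit.SECONDS);
    }
    return created;
  }

  /** One pool handed to every uploader: thread count is capped by poolSize. */
  public static int threadsWithSharedPool(int streams, int poolSize)
      throws InterruptedException {
    CountingFactory factory = new CountingFactory();
    ExecutorService shared = Executors.newFixedThreadPool(poolSize, factory);
    for (int i = 0; i < streams; i++) {
      new Uploader(shared).upload(() -> { });
    }
    shared.shutdown();
    shared.awaitTermination(10, TimeUnit.SECONDS);
    return factory.created.get();
  }

  public static void main(String[] args) throws Exception {
    System.out.println("per-stream pools: " + threadsWithPoolPerStream(50) + " threads");
    System.out.println("shared pool:      " + threadsWithSharedPool(50, 4) + " threads");
  }
}
```

With per-stream pools the thread count grows with the number of streams being 
closed, whereas the shared pool never starts more than poolSize threads.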



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
