[ https://issues.apache.org/jira/browse/HADOOP-12319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14694923#comment-14694923 ]

Thomas Demoor commented on HADOOP-12319:
----------------------------------------

Hi Colin.

Completely correct observation: I opened HADOOP-11684 for this and have a 
patch submitted there waiting for review. 

It would be fantastic if you could review it and/or try it out and give some feedback.

Note that it relies on HADOOP-12269, which was merged into trunk last week, so 
you probably need to apply that patch as well and update your aws-sdk.

> S3AFastOutputStream has no ability to apply backpressure
> --------------------------------------------------------
>
>                 Key: HADOOP-12319
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12319
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs/s3
>    Affects Versions: 2.7.0
>            Reporter: Colin Marc
>            Priority: Critical
>
> Currently, users of S3AFastOutputStream can control memory usage with a few 
> settings: {{fs.s3a.threads.core}} and {{fs.s3a.threads.max}}, which control 
> the number of active uploads (specifically as arguments to a 
> {{ThreadPoolExecutor}}), and {{fs.s3a.max.total.tasks}}, which controls the 
> size of the feeding queue for the {{ThreadPoolExecutor}}.
> However, a user can get an almost *guaranteed* crash if the throughput of the 
> writing job is higher than the total S3 throughput, because there is never 
> any backpressure or blocking on calls to {{write}}.
> If {{fs.s3a.max.total.tasks}} is set high (the default is 1000), then 
> {{write}} calls will continue to add data to the queue, which can eventually 
> OOM. But if the user tries to set it lower, then writes will fail when the 
> queue is full; the {{ThreadPoolExecutor}} will reject the part with 
> {{java.util.concurrent.RejectedExecutionException}}.
> Ideally, calls to {{write}} should *block, not fail* when the queue is full, 
> so as to apply backpressure on whatever the writing process is.
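
For readers who want to see the "block, not fail" behaviour described in the
issue, here is a minimal Java sketch. It is not the HADOOP-11684 patch and not
the S3A code; the class name BlockingSubmitter and its parameters activeTasks
and waitingTasks are made up for illustration. The idea is to guard a
ThreadPoolExecutor with a Semaphore so that a caller waits for a free slot
instead of getting a RejectedExecutionException. The queue is sized to hold
every permitted task, on the assumption that this avoids a rejection in the
short window where a task has released its permit but its worker thread has
not yet picked up the next task.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.Semaphore;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper for illustration only; not the S3A implementation.
final class BlockingSubmitter {
    private final ThreadPoolExecutor pool;
    private final Semaphore permits;

    BlockingSubmitter(int activeTasks, int waitingTasks) {
        int total = activeTasks + waitingTasks;
        // The queue can hold every permitted task, so execute() does not
        // reject while the pool is running, even if a worker is between tasks.
        this.pool = new ThreadPoolExecutor(activeTasks, activeTasks,
                60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>(total));
        // One permit per task that may be running or queued at any time.
        this.permits = new Semaphore(total);
    }

    // Blocks the caller until a slot is free: this is the backpressure.
    void submit(Runnable task) {
        permits.acquireUninterruptibly();
        try {
            pool.execute(() -> {
                try {
                    task.run();
                } finally {
                    permits.release(); // free the slot even if the task throws
                }
            });
        } catch (RuntimeException e) {
            permits.release(); // task was never accepted (e.g. after shutdown)
            throw e;
        }
    }

    // Stops accepting work and waits for outstanding tasks to finish.
    boolean shutdownAndWait(long timeoutSeconds) {
        pool.shutdown();
        try {
            return pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // 2 concurrent "uploads" plus 2 waiting; the 5th submit blocks.
        BlockingSubmitter uploads = new BlockingSubmitter(2, 2);
        AtomicInteger finished = new AtomicInteger();
        for (int i = 0; i < 20; i++) {
            uploads.submit(() -> {
                try {
                    Thread.sleep(5); // stand-in for a slow part upload
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                finished.incrementAndGet();
            });
        }
        System.out.println("all done: " + uploads.shutdownAndWait(30)
                + ", finished = " + finished.get());
    }
}
```

With a wrapper like this, the number of buffered parts is bounded by
activeTasks + waitingTasks, so a writer that outpaces S3 slows down rather
than filling the heap or failing on a full queue.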



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
