[ https://issues.apache.org/jira/browse/HADOOP-12319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Wang resolved HADOOP-12319.
----------------------------------
       Resolution: Duplicate
    Fix Version/s:     (was: 2.8.0)

> S3AFastOutputStream has no ability to apply backpressure
> --------------------------------------------------------
>
>                 Key: HADOOP-12319
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12319
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs/s3
>    Affects Versions: 2.7.0
>            Reporter: Colin Marc
>            Priority: Critical
>
> Currently, users of {{S3AFastOutputStream}} can control memory usage with a
> few settings: {{fs.s3a.threads.core}} and {{fs.s3a.threads.max}}, which bound
> the number of active uploads (they are passed directly as arguments to a
> {{ThreadPoolExecutor}}), and {{fs.s3a.max.total.tasks}}, which controls the
> capacity of the queue feeding that {{ThreadPoolExecutor}}.
> However, a user can hit an almost *guaranteed* crash whenever the writing
> job's throughput exceeds the total S3 upload throughput, because calls to
> {{write}} never block and no backpressure is ever applied.
> If {{fs.s3a.max.total.tasks}} is set high (the default is 1000), {{write}}
> calls keep adding data to the queue, which eventually causes an OOM. But if
> the user sets it lower, writes fail outright once the queue is full: the
> {{ThreadPoolExecutor}} rejects the part upload with a
> {{java.util.concurrent.RejectedExecutionException}} (reproduced in the first
> sketch below).
> Ideally, calls to {{write}} should *block, not fail* when the queue is full,
> so as to apply backpressure to whatever process is writing (see the second
> sketch below).
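
The rejection described above is a generic property of a bounded
{{ThreadPoolExecutor}} running the default {{AbortPolicy}}, and can be
reproduced outside Hadoop entirely. Below is a minimal standalone sketch; the
class name, pool sizes, and sleep duration are illustrative only:

{code:java}
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class RejectionDemo {
  public static void main(String[] args) {
    // One worker thread plus a bounded queue of two: the fourth
    // submission has nowhere to go, and the default AbortPolicy throws.
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        1, 1, 0L, TimeUnit.MILLISECONDS,
        new ArrayBlockingQueue<>(2));
    try {
      for (int i = 0; i < 5; i++) {
        pool.execute(() -> {
          try {
            Thread.sleep(10_000); // stand-in for a slow S3 part upload
          } catch (InterruptedException ignored) {
          }
        });
      }
    } catch (RejectedExecutionException e) {
      System.err.println("Queue full, submission rejected: " + e);
    } finally {
      pool.shutdownNow();
    }
  }
}
{code}

One task runs, two sit in the queue, and the fourth {{execute}} throws; this is
the same failure a full {{fs.s3a.max.total.tasks}} queue inflicts on {{write}}.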
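
One common way to get the *block, not fail* behavior is to install a
{{RejectedExecutionHandler}} that parks the submitting thread until queue space
frees up. The sketch below is illustrative, not the actual S3A change; the
constructor parameters mirror {{fs.s3a.threads.core}}, {{fs.s3a.threads.max}},
and {{fs.s3a.max.total.tasks}}:

{code:java}
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public final class BlockingUploadExecutor {

  /** Build a pool whose submissions block, rather than fail, when full. */
  public static ThreadPoolExecutor create(
      int coreThreads, int maxThreads, int queueCapacity) {
    BlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(queueCapacity);
    RejectedExecutionHandler blockWhenFull = (task, executor) -> {
      try {
        // Block the caller (the thread inside write()) until a slot
        // opens, instead of throwing RejectedExecutionException.
        executor.getQueue().put(task);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new RejectedExecutionException(
            "Interrupted while waiting to enqueue upload", e);
      }
    };
    return new ThreadPoolExecutor(coreThreads, maxThreads,
        60L, TimeUnit.SECONDS, queue, blockWhenFull);
  }
}
{code}

Two caveats worth noting: putting directly into the queue sidesteps the
executor's normal creation of non-core threads, and the handler can block
indefinitely if the pool is shut down while a producer is waiting. The JDK's
built-in {{ThreadPoolExecutor.CallerRunsPolicy}} is a simpler alternative that
throttles producers by running the rejected task on the submitting thread
itself.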



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
