[jira] [Updated] (HADOOP-16798) job commit failure in S3A MR magic committer test

Steve Loughran (Jira) Tue, 30 Jun 2020 02:55:59 -0700


     [ 
https://issues.apache.org/jira/browse/HADOOP-16798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Steve Loughran updated HADOOP-16798:
------------------------------------
    Description: 
failure in 
{code}
ITestS3ACommitterMRJob.test_200_execute:304->Assert.fail:88 Job 
job_1578669113137_0003 failed in state FAILED with cause Job commit failed: 
java.util.concurrent.RejectedExecutionException: Task 
java.util.concurrent.FutureTask@6e894de2 rejected from 
org.apache.hadoop.util.concurrent.HadoopThreadPoolExecutor@225eed53[Terminated, 
pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]
{code}

Stack implies thread pool rejected it, but toString says "Terminated". Race 
condition?

*update 2020-04-22*: this happens when a task is aborted in the AM. The 
thread pool is disposed of and, while it is shutting down in one thread, task 
commit is initiated using the same thread pool. When the task committer's 
destroy operation times out, it kills all the active uploads.
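
The rejection in the trace above can be reproduced with a plain JDK executor. 
The sketch below is only an illustration, not the S3A committer code: the 
class and method names are hypothetical, and it uses 
java.util.concurrent.ThreadPoolExecutor directly on the assumption that 
HadoopThreadPoolExecutor inherits its rejection behaviour and toString. 
A pool that has finished shutting down rejects new tasks, and the exception 
message embeds the pool's toString, which is where "Terminated" comes from.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not part of Hadoop.
class TerminatedPoolDemo {

  /**
   * Shuts a pool down, waits for termination, then submits a task to it.
   * Returns the message of the resulting RejectedExecutionException,
   * or null if the task was unexpectedly accepted.
   */
  static String submitAfterShutdown() throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    pool.shutdown();                              // no new tasks accepted from here on
    pool.awaitTermination(5, TimeUnit.SECONDS);   // pool reaches the terminated state
    Runnable noop = () -> { };
    try {
      pool.submit(noop);                          // simulates the late task commit
      return null;
    } catch (RejectedExecutionException e) {
      return e.getMessage();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println(submitAfterShutdown());
  }
}
```

So a "Terminated" state in the message is consistent with a rejection: the 
task simply arrived after the pool had been shut down.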



  was:
failure in 
{code}
ITestS3ACommitterMRJob.test_200_execute:304->Assert.fail:88 Job 
job_1578669113137_0003 failed in state FAILED with cause Job commit failed: 
java.util.concurrent.RejectedExecutionException: Task 
java.util.concurrent.FutureTask@6e894de2 rejected from 
org.apache.hadoop.util.concurrent.HadoopThreadPoolExecutor@225eed53[Terminated, 
pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]
{code}

Stack implies thread pool rejected it, but toString says "Terminated". Race 
condition?

*update 2020-04-22*: this happens when a task is aborted in the AM. The 
thread pool is disposed of and, while it is shutting down in one thread, task 
commit is initiated using the same thread pool. When the task committer's 
destroy operation times out, it kills all the active uploads.

Proposed: destroyThreadPool immediately copies the reference to the current 
thread pool and nullifies it, so that any new operation needing a thread pool 
will create a new one.
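
A minimal sketch of that proposal, assuming a holder object that lazily 
creates its pool. The class PoolHolder and the methods getPool and destroyPool 
are hypothetical stand-ins, not the actual committer code. The point is that 
the shared reference is detached before the old pool is shut down, so a 
concurrent caller asking for a pool gets a fresh one rather than one that is 
terminating.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical holder class; names are illustrative only.
class PoolHolder {
  private ExecutorService pool;   // guarded by "this"

  /** Returns the current pool, creating a new one if none is attached. */
  synchronized ExecutorService getPool() {
    if (pool == null) {
      pool = Executors.newFixedThreadPool(2);
    }
    return pool;
  }

  /** Detaches the pool first, then shuts down the detached instance. */
  void destroyPool() throws InterruptedException {
    ExecutorService old;
    synchronized (this) {
      old = pool;     // copy the reference
      pool = null;    // later getPool() calls will build a new pool
    }
    if (old != null) {
      old.shutdown();
      if (!old.awaitTermination(5, TimeUnit.SECONDS)) {
        old.shutdownNow();   // give up waiting and cancel what is left
      }
    }
  }
}
```

A caller that fetched the reference before it was detached can still see a 
rejection, so this narrows the window rather than removing every race.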


> job commit failure in S3A MR magic committer test
> -------------------------------------------------
>
>                 Key: HADOOP-16798
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16798
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.3.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>         Attachments: stdout
>
>
> failure in 
> {code}
> ITestS3ACommitterMRJob.test_200_execute:304->Assert.fail:88 Job 
> job_1578669113137_0003 failed in state FAILED with cause Job commit failed: 
> java.util.concurrent.RejectedExecutionException: Task 
> java.util.concurrent.FutureTask@6e894de2 rejected from 
> org.apache.hadoop.util.concurrent.HadoopThreadPoolExecutor@225eed53[Terminated,
>  pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]
> {code}
> Stack implies thread pool rejected it, but toString says "Terminated". Race 
> condition?
> *update 2020-04-22*: this happens when a task is aborted in the AM. The 
> thread pool is disposed of and, while it is shutting down in one thread, 
> task commit is initiated using the same thread pool. When the task 
> committer's destroy operation times out, it kills all the active uploads.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
