[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tarun Parimi updated MAPREDUCE-7278:
------------------------------------
    Attachment: MAPREDUCE-7278.001.patch
        Status: Patch Available  (was: Open)

We can fix this issue in either of two ways.

1. MAPREDUCE-6485 seems to be applicable only when speculative execution is
enabled. So we can avoid launching another attempt when there is already an
active task attempt. I am attaching a fairly simple patch which does this.
It doesn't contain any unit tests for now, but it fixes the issue when tested
with the repro job.

2. Prevent TaskAttemptImpl#notifyTaskAttemptFailed() from being called twice
when an attempt goes through the transitions below. I haven't made this
change in the patch.
RUNNING -> FAIL_FINISHING_CONTAINER -> FAIL_CONTAINER_CLEANUP ->
FAIL_TASK_CLEANUP -> FAILED
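
To make the two options concrete, here is a minimal standalone sketch. It is
not the attached patch and does not use the real Hadoop classes: the class
RetryGuardSketch, its fields and methods, and the exact points at which the
failure notification fires are all assumptions made for illustration. It
only models a task that launches a retry whenever it is told an attempt
failed, and shows how either guard keeps a double notification from
producing two running attempts.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model; none of these names are real Hadoop classes.
class RetryGuardSketch {
    enum State {
        RUNNING, FAIL_FINISHING_CONTAINER, FAIL_CONTAINER_CLEANUP,
        FAIL_TASK_CLEANUP, FAILED
    }

    static final class Attempt {
        State state = State.RUNNING;
        boolean failureNotified = false; // only consulted by option 2
    }

    final List<Attempt> attempts = new ArrayList<>();
    final boolean skipIfActive; // option 1: no retry while another attempt runs
    final boolean notifyOnce;   // option 2: at most one failure notification

    RetryGuardSketch(boolean skipIfActive, boolean notifyOnce) {
        this.skipIfActive = skipIfActive;
        this.notifyOnce = notifyOnce;
        attempts.add(new Attempt()); // the original attempt
    }

    Attempt firstAttempt() {
        return attempts.get(0);
    }

    long runningCount() {
        return attempts.stream().filter(a -> a.state == State.RUNNING).count();
    }

    // Task-side handler: launches a retry unless option 1 sees an active attempt.
    void onAttemptFailed() {
        if (skipIfActive && runningCount() > 0) {
            return;
        }
        attempts.add(new Attempt());
    }

    // Attempt-side notification: option 2 suppresses the second call.
    void notifyTaskAttemptFailed(Attempt a) {
        if (notifyOnce && a.failureNotified) {
            return;
        }
        a.failureNotified = true;
        onAttemptFailed();
    }

    // Walks the transitions listed above. Assumption of this sketch: the
    // notification fires once on entering FAIL_FINISHING_CONTAINER and once
    // more on reaching FAILED.
    void failThroughCleanup(Attempt a) {
        a.state = State.FAIL_FINISHING_CONTAINER;
        notifyTaskAttemptFailed(a);
        a.state = State.FAIL_CONTAINER_CLEANUP;
        a.state = State.FAIL_TASK_CLEANUP;
        a.state = State.FAILED;
        notifyTaskAttemptFailed(a);
    }

    public static void main(String[] args) {
        boolean[][] configs = { {false, false}, {true, false}, {false, true} };
        for (boolean[] c : configs) {
            RetryGuardSketch task = new RetryGuardSketch(c[0], c[1]);
            task.failThroughCleanup(task.firstAttempt());
            System.out.println("skipIfActive=" + c[0] + " notifyOnce=" + c[1]
                + " running=" + task.runningCount());
        }
    }
}
```

With neither guard the model ends up with two running attempts; with either
guard enabled it ends up with one.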

[~prabhujoseph], can you take a look at this when you get some time?



> Speculative execution behavior is observed even when 
> mapreduce.map.speculative and mapreduce.reduce.speculative are false
> -------------------------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-7278
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7278
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: task
>    Affects Versions: 2.8.0
>            Reporter: Tarun Parimi
>            Priority: Major
>         Attachments: MAPREDUCE-7278.001.patch, Screen Shot 2020-04-30 at 
> 8.04.27 PM.png
>
>
> When a failed task attempt container is stuck in the
> FAIL_FINISHING_CONTAINER state for some time, we observe that two task
> attempts are launched simultaneously even when speculative execution is
> disabled.
> This results in the message below being shown in the killed attempts,
> indicating that speculation has occurred. This is an issue for jobs which
> require speculative execution to be strictly disabled.
>   !Screen Shot 2020-04-30 at 8.04.27 PM.png!
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
