[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17096678#comment-17096678
 ] 

Tarun Parimi commented on MAPREDUCE-7278:
-----------------------------------------

I am able to reproduce this consistently in trunk. To trigger the issue, a 
failing task attempt must transition to FAIL_CONTAINER_CLEANUP, which happens 
when it times out while finishing. To make it time out every time on a local 
cluster, set {{mapreduce.task.exit.timeout}} and 
{{mapreduce.task.exit.timeout.check-interval-ms}} to extremely low values.

The following Hadoop Streaming job always reproduces the issue.
{code:java}
yarn jar \
  $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-3.4.0-SNAPSHOT.jar \
  -Dmapreduce.task.exit.timeout=20 \
  -Dmapreduce.task.exit.timeout.check-interval-ms=10 \
  -mapper /bin/failonce.sh \
  -reducer /bin/wc -input /sample -output /tmp/output
{code}

The contents of /bin/failonce.sh are shown below. The script fails only for 
attempt 0 and succeeds for every later attempt.

{code:java}
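#!/bin/bash
# The attempt number is the last underscore-separated field of the attempt ID.
atmpt=$(echo "$mapred_task_id" | awk -F_ '{print $NF}')
# Fail only attempt 0; later attempts exit successfully.
if [[ $atmpt -eq 0 ]]; then exit 1; fi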
{code}
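
The attempt-number check can be tried locally, without a cluster, before running the job. The sketch below is only an illustration: the {{failonce}} function name and the attempt IDs are made up for this example, and it assumes attempt IDs end in an underscore followed by the attempt number, as the script above relies on.

```shell
#!/bin/bash
# Local dry run of the failonce.sh logic, without a cluster.
# The function name and the attempt IDs are hypothetical examples.

# Same check as failonce.sh, wrapped in a function so it can be called
# once per simulated attempt. Returns 1 for attempt 0, 0 otherwise.
failonce() {
  local attempt_id="$1"
  local atmpt
  atmpt=$(echo "$attempt_id" | awk -F_ '{print $NF}')
  if [[ $atmpt -eq 0 ]]; then return 1; fi
  return 0
}

# Simulate the first attempt and its retry for the same map task.
for id in attempt_1588259000000_0001_m_000000_0 \
          attempt_1588259000000_0001_m_000000_1; do
  if failonce "$id"; then
    echo "$id: exit 0"
  else
    echo "$id: exit 1"
  fi
done
```

Only the ID ending in {{_0}} should report a non-zero exit, which is what forces the job to schedule a second attempt.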




> Speculative execution behavior is observed even when 
> mapreduce.map.speculative and mapreduce.reduce.speculative are false
> -------------------------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-7278
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-7278
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: task
>    Affects Versions: 2.8.0
>            Reporter: Tarun Parimi
>            Priority: Major
>         Attachments: Screen Shot 2020-04-30 at 8.04.27 PM.png
>
>
> When a failed task attempt container is stuck in the FAIL_FINISHING_CONTAINER 
> state for some time, two task attempts are launched simultaneously even when 
> speculative execution is disabled.
> The killed attempts then show the message below, which indicates that 
> speculation has occurred. This is an issue for jobs which require speculative 
> execution to be strictly disabled.
>   !Screen Shot 2020-04-30 at 8.04.27 PM.png!
>  
>  



