[ https://issues.apache.org/jira/browse/HADOOP-3333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12605542#action_12605542 ]

Amareshwari Sriramadasu commented on HADOOP-3333:
-------------------------------------------------

Some minor comments:
1. Indentation and line length need to be fixed (to stay within 80 columns) in a 
couple of places in JobTracker and JIP.
2. Since the TaskTrackerStatus is passed as a parameter to JIP.failedTask(), the 
earlier call to jobtracker.getTaskTracker(status.getTaskTracker()) to obtain the 
taskTrackerStatus can be replaced with the passed parameter (see the sketch 
below).
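
A minimal, self-contained sketch of the refactoring suggested in comment 2. All 
class and method names below are hypothetical stand-ins for illustration only; 
the actual JobInProgress.failedTask() signature differs and is shown in the patch.

    // Hypothetical stand-in for the real TaskTrackerStatus class.
    class TaskTrackerStatus {
        private final String trackerName;
        TaskTrackerStatus(String trackerName) { this.trackerName = trackerName; }
        String getTrackerName() { return trackerName; }
    }

    class JobInProgressSketch {
        // Before: the method re-fetched the status via
        // jobtracker.getTaskTracker(status.getTaskTracker()).
        // After: the caller already supplies the TaskTrackerStatus, so the
        // extra lookup is dropped and the parameter is used directly.
        void failedTask(String reason, TaskTrackerStatus ttStatus) {
            System.out.println("Task failed on " + ttStatus.getTrackerName()
                    + ": " + reason);
        }
    }

    public class FailedTaskDemo {
        public static void main(String[] args) {
            new JobInProgressSketch().failedTask("lost task tracker",
                    new TaskTrackerStatus("tracker_host1:50060"));
        }
    }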


> job failing because of reassigning same tasktracker to failing tasks
> --------------------------------------------------------------------
>
>                 Key: HADOOP-3333
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3333
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.16.3
>            Reporter: Christian Kunz
>            Assignee: Jothi Padmanabhan
>            Priority: Blocker
>             Fix For: 0.18.0
>
>         Attachments: hadoop-3333-v1.patch, hadoop-3333.patch, 
> HADOOP-3333_0_20080503.patch, HADOOP-3333_1_20080505.patch, 
> HADOOP-3333_2_20080506.patch
>
>
> We have a long-running job in its 2nd attempt. The previous job failed, and the 
> current job risks failing as well, because reduce tasks that fail on marginal 
> TaskTrackers are repeatedly assigned to the same TaskTrackers (probably 
> because it is the only available slot), eventually running out of attempts.
> Reduce tasks should be assigned to the same TaskTracker at most twice, or 
> TaskTrackers need better smarts to detect failing hardware.
> BTW, mapred.reduce.max.attempts=12, which is high, but does not help in this 
> case.
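
For reference, the attempt limit mentioned in the description above is the 
standard mapred.reduce.max.attempts property; a minimal sketch of setting it, 
assuming the classic mapred JobConf API (the class name and the value 12 are 
illustrative only, echoing the description):

    import org.apache.hadoop.mapred.JobConf;

    public class MaxAttemptsSketch {
        public static void main(String[] args) {
            JobConf conf = new JobConf();
            // Sets mapred.reduce.max.attempts; 12 matches the value
            // mentioned in the description above.
            conf.setMaxReduceAttempts(12);
            System.out.println("mapred.reduce.max.attempts = "
                    + conf.get("mapred.reduce.max.attempts"));
        }
    }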

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
