[ https://issues.apache.org/jira/browse/MAPREDUCE-3921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261094#comment-13261094 ]

Siddharth Seth commented on MAPREDUCE-3921:
-------------------------------------------

bq. Yes, I think. Based on our experience, here we are pre-emptively taking 
action on a task that might actually be ok. And it should be an infrequent 
action.
bq. My understanding of existing behavior in mrv1 was that only maps are 
pre-emptively terminated for performance reasons.
I think 'fetch failure' / 'node unhealthy' should be treated in the same way - 
at least for the purpose of counting towards the allowed_task_failure limit 
(see the sketch below), and ideally for the task's state as well. There's 
currently no way to distinguish between a task causing a node to go unhealthy 
and other problems. My guess is that 'fetch failures' are more often than not 
caused by a bad tracker rather than a bad task.
WRT killing reduce tasks on an unhealthy node - I'm not sure what was done in 
20 (from a quick look, I couldn't find the code which kills map tasks either). 
It'd be best if Vinod or others with more knowledge and history of how and why 
20 deals with this pitch in.
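
To make the counting concrete, here is a minimal sketch (hypothetical names 
such as FailureCause and allowedTaskFailures - not the actual TaskImpl code) of 
treating both causes as regular attempt failures:

{code:java}
// Minimal sketch, not the actual TaskImpl/JobImpl code: FailureCause and
// allowedTaskFailures are hypothetical names. The point is that an attempt
// lost to fetch failures or an unhealthy node counts towards the same
// per-task failure limit as an ordinary task error.
enum FailureCause { TASK_ERROR, FETCH_FAILURE, NODE_UNHEALTHY }

class TaskFailureTracker {
  private final int allowedTaskFailures;
  private int failedAttempts;

  TaskFailureTracker(int allowedTaskFailures) {
    this.allowedTaskFailures = allowedTaskFailures;
  }

  /** Returns true once the task should be declared FAILED. */
  boolean recordFailure(FailureCause cause) {
    failedAttempts++; // every cause increments the same counter
    return failedAttempts > allowedTaskFailures;
  }
}
{code}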

bq. My understanding was that scheduling happens when the job moves from INIT 
to RUNNING state via the StartTransition(). Unless allocate is called on RM it 
will not return any unhealthy machines. So I thought that JOB_UPDATED_EVENT can 
never come until the job moves into the RUNNING state. Can you please point out 
the scenario you are thinking about?
Calls to allocate() start once the RMCommunicator service is started - which 
happens before the JOB_START event is sent. It's very unlikely, but there is an 
extremely remote possibility of an allocate call completing before the job 
moves into the START state. 

bq. Unless you really want this, I would prefer it the way it's currently 
written. I prefer not to depend on string name encodings.
It's safe to use TaskId.getTaskType() - there's no need to explicitly depend 
on the string name encoding, and it avoids the extra task lookups.
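
For illustration, a sketch with stand-in types (not the real MRv2 record 
interfaces) of the check I have in mind:

{code:java}
// Sketch with stand-in types, not the real MRv2 records: the task type is
// available directly from the id object, so parsing it out of the encoded
// string name ("..._m_..." vs "..._r_...") is unnecessary.
enum TaskType { MAP, REDUCE }

interface TaskId {
  TaskType getTaskType();
}

class NodeLossPolicy {
  // Only completed map output lives on the lost node, so only MAP attempts
  // need to be re-run there; successful REDUCE attempts can be left alone.
  static boolean shouldRerunOnNodeLoss(TaskId taskId) {
    return taskId.getTaskType() == TaskType.MAP;
  }
}
{code}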

bq. That was a question I had and put it in the comments. It seems that for a 
TaskAttemptCompletedEventTransition the code removes the previous successful 
entry from successAttemptCompletionEventNoMap. It then checks if the current 
attempt is successful, and in that case adds it to 
successAttemptCompletionEventNoMap. But what if the current attempt is not 
successful? We have now removed the previous successful attempt too. Is that 
the desired behavior? This question is independent of this jira.
It also marks the removed entry as OBSOLETE - so the 
taskAttemptCompletionEvents list doesn't have any SUCCESSFUL attempts for the 
specific taskId.
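
Roughly, the bookkeeping looks like this (a simplified sketch with stand-in 
types and names, not the actual JobImpl fields):

{code:java}
import java.util.List;
import java.util.Map;

// Simplified sketch (stand-in types and names, not the actual JobImpl fields):
// the previously successful attempt's event is dropped from the per-task
// success map and marked OBSOLETE, and the new event is recorded as the
// task's success only if the current attempt actually succeeded.
class CompletionEventBookkeeping {
  enum Status { SUCCEEDED, FAILED, KILLED, OBSOLETE }

  static class CompletionEvent {
    Status status;
    CompletionEvent(Status status) { this.status = status; }
  }

  void onAttemptCompletion(int taskIndex,
                           CompletionEvent newEvent,
                           List<CompletionEvent> completionEvents,
                           Map<Integer, CompletionEvent> successEventByTask) {
    CompletionEvent previousSuccess = successEventByTask.remove(taskIndex);
    if (previousSuccess != null) {
      previousSuccess.status = Status.OBSOLETE; // reducers stop trusting it
    }
    completionEvents.add(newEvent);
    if (newEvent.status == Status.SUCCEEDED) {
      successEventByTask.put(taskIndex, newEvent);
    }
  }
}
{code}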

bq. I have moved the finishedTask increment out of that function and made it 
explicit in every transition that requires it to be that way.
In the same context I have a question in comments in 
MapRetroactiveFailureTransition. Why is this not calling 
handleAttemptCompletion? My understanding is that handleAttemptCompletion is 
used to notify reducers about changes in map outputs. So if a map failed after 
success then reducers should know about it so that they can abandon its 
outputs before getting too many fetch failures. Is that not so?
It is calling it via AttemptFailedTransition.transition(). That's the bit which 
also counts the failure towards the allowed_failure_limit.
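
The shape of that delegation, as a sketch (hypothetical simplified classes, 
not the real transition code):

{code:java}
// Sketch of the delegation (hypothetical simplified classes, not the real
// transitions): the retroactive map-failure transition reuses the generic
// failed-attempt transition, which both publishes the completion event (so
// reducers learn the output is gone) and counts the failure against the
// allowed-failure limit.
interface Attempt {}

interface Task {
  void handleAttemptCompletion(Attempt attempt); // notifies reducers
  void countFailure(Attempt attempt);            // counts towards the limit
}

class AttemptFailedTransitionSketch {
  void transition(Task task, Attempt attempt) {
    task.handleAttemptCompletion(attempt);
    task.countFailure(attempt);
  }
}

class MapRetroactiveFailureTransitionSketch extends AttemptFailedTransitionSketch {
  @Override
  void transition(Task task, Attempt attempt) {
    // map-specific handling for a previously successful attempt would go here
    super.transition(task, attempt);
  }
}
{code}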

bq. Sorry, I did not find getContainerManagerAddress(). The map in 
AssignedRequests stores ContainerId and it's not possible to get the nodeId 
from it. What are you proposing?
Correction - it's called getAssignedContainerMgrAddress(). In any case, I was 
proposing storing the container's NodeId with the AssignedRequest - that 
completely removes the need to fetch the actual task.
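
A rough sketch of what I'm proposing (hypothetical field and class names, not 
the actual RMContainerAllocator code):

{code:java}
import java.util.HashMap;
import java.util.Map;

// Sketch with hypothetical names, not the actual RMContainerAllocator code:
// keeping the NodeId next to the ContainerId at assignment time lets the
// allocator answer "which node is this attempt on?" directly, without
// fetching the Task/TaskAttempt objects.
class AssignedRequestsSketch {
  static class Assignment {
    final String containerId;
    final String nodeId; // captured when the container is assigned

    Assignment(String containerId, String nodeId) {
      this.containerId = containerId;
      this.nodeId = nodeId;
    }
  }

  private final Map<String, Assignment> byAttemptId = new HashMap<>();

  void assign(String attemptId, String containerId, String nodeId) {
    byAttemptId.put(attemptId, new Assignment(containerId, nodeId));
  }

  String nodeOf(String attemptId) {
    Assignment a = byAttemptId.get(attemptId);
    return a == null ? null : a.nodeId;
  }
}
{code}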

bq. It does not look like it, but there may be race conditions I have not 
thought of. But looking further, it seems that the action on this event checks 
for job completion in TaskCompletedTransition. TaskCompletedTransition 
increments job.completedTaskCount irrespective of whether the task has 
succeeded, been killed or failed. Now, 
TaskCompletedTransition.checkJobCompleteSuccess() checks job.completedTaskCount 
== job.tasks.size() for completion. How is this working? Won't enough 
killed/failed tasks + completed tasks trigger job completion? Or is that the 
expected behavior?
It checks for failure before attempting the SUCCESS check, so that should 
work. Unless I'm missing something - tasks could complete after a job moves to 
the FAILED state, which would end up generating this event.
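
A simplified sketch of the ordering I'm describing (stand-in fields and state 
strings, not the actual JobImpl logic):

{code:java}
// Simplified sketch (stand-in fields and state strings, not the actual
// JobImpl code): the failure check runs before the completed-vs-total
// comparison, so a job with too many failed tasks moves to FAILED and never
// reaches the success check, even though killed and failed tasks also bump
// completedTaskCount.
class JobCompletionCheckSketch {
  int completedTaskCount; // incremented for succeeded, failed and killed tasks
  int failedTaskCount;
  int allowedTaskFailures;
  int totalTasks;

  String onTaskCompleted() {
    if (failedTaskCount > allowedTaskFailures) {
      return "FAILED"; // checked first, so extra completions are harmless
    }
    if (completedTaskCount == totalTasks) {
      return "SUCCEEDED";
    }
    return "RUNNING";
  }
}
{code}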

                
> MR AM should act on the nodes liveliness information when nodes go 
> up/down/unhealthy
> ------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3921
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3921
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mr-am, mrv2
>    Affects Versions: 0.23.0
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Bikas Saha
>             Fix For: 0.23.2
>
>         Attachments: MAPREDUCE-3921-1.patch, MAPREDUCE-3921-3.patch, 
> MAPREDUCE-3921-4.patch, MAPREDUCE-3921-5.patch, 
> MAPREDUCE-3921-branch-0.23.patch, MAPREDUCE-3921-branch-0.23.patch, 
> MAPREDUCE-3921-branch-0.23.patch, MAPREDUCE-3921.patch
>
>



        
