[jira] [Comment Edited] (MAPREDUCE-7169) Speculative attempts should not run on the same node

2020-05-11 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/MAPREDUCE-7169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17104401#comment-17104401
 ] 

Ahmed Hussein edited comment on MAPREDUCE-7169 at 5/11/20, 12:25 PM:
-

{quote}Actually, when a task attempt is killed, by default the Avataar is VIRGIN.
This is a defect which needs to be addressed. If a speculative task attempt is
killed, it is launched as a normal task attempt.
{quote}
That's interesting.
 If a speculative task attempt is killed, a new task attempt is launched as a
normal one. The new attempt will have a new ID, right? In that case, the old map
entry would no longer be relevant, and a new entry would be created for the new
attempt ID.
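
To make that concrete, here is a minimal, self-contained sketch (hypothetical
class and method names, with plain String attempt IDs standing in for
TaskAttemptId) of how the per-attempt bookkeeping would behave when a
speculative attempt is killed and relaunched under a new ID:
{code:java}
// Hypothetical illustration only; not the actual patch code.
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BlacklistLifecycleSketch {
  // Maps an attempt ID to the nodes that attempt must avoid.
  private final Map<String, List<String>> attemptToBlacklistedNodes =
      new HashMap<String, List<String>>();

  /** Record the nodes a (speculative) attempt should not be placed on. */
  public void add(String attemptId, List<String> blacklistedNodes) {
    if (blacklistedNodes != null && !blacklistedNodes.isEmpty()) {
      attemptToBlacklistedNodes.put(attemptId, blacklistedNodes);
    }
  }

  /** Drop the entry once the attempt is killed or finished, so it does not leak. */
  public void remove(String attemptId) {
    attemptToBlacklistedNodes.remove(attemptId);
  }

  public static void main(String[] args) {
    BlacklistLifecycleSketch sketch = new BlacklistLifecycleSketch();
    List<String> nodes = new ArrayList<String>();
    nodes.add("host-running-the-original-attempt");

    sketch.add("attempt_001_m_000001_1", nodes);    // speculative attempt gets an entry
    sketch.remove("attempt_001_m_000001_1");        // the attempt is killed, entry cleaned up
    sketch.add("attempt_001_m_000001_2", nodes);    // relaunched under a new ID, fresh entry
    System.out.println(sketch.attemptToBlacklistedNodes.keySet());
  }
}
{code}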
{quote}How do you get task attempt details in RMContainerAllocator?
{quote}
I see your point. It is preferred that a "request" object only stay alive while
it is still pending or not yet handled. To keep that concept, the simplest
workaround is to change the field in {{TaskAttemptBlacklistManager}}:
{code:java}
// The map is keyed by the attempt ID; the blacklisted nodes are assumed to be a List<String>.
private Map<TaskAttemptId, List<String>> taskAttemptToEventMapping =
    new HashMap<TaskAttemptId, List<String>>();

public void addToTaskAttemptBlacklist(ContainerRequestEvent event) {
  if (null != event.getBlacklistedNodes()
      && event.getBlacklistedNodes().size() > 0) {
    taskAttemptToEventMapping.put(event.getAttemptID(),
        event.getBlacklistedNodes());
  }
}
{code}
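
For context, here is a rough, self-contained sketch of how such a mapping could
then be consulted before assigning a container to a speculative attempt. The
class, method names, and String attempt IDs below are illustrative assumptions,
not the actual Hadoop APIs:
{code:java}
// Hypothetical sketch; the real allocator-side check would live in Hadoop's code.
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SpeculativePlacementSketch {
  // Same shape as the field above, with String attempt IDs for simplicity.
  private final Map<String, List<String>> taskAttemptToEventMapping =
      new HashMap<String, List<String>>();

  /**
   * True if the offered container's host was blacklisted for this attempt,
   * e.g. the node already running the original attempt.
   */
  public boolean isNodeBlacklistedForAttempt(String attemptId, String host) {
    List<String> blacklisted =
        taskAttemptToEventMapping.getOrDefault(attemptId, Collections.<String>emptyList());
    return blacklisted.contains(host);
  }

  public static void main(String[] args) {
    SpeculativePlacementSketch sketch = new SpeculativePlacementSketch();
    sketch.taskAttemptToEventMapping.put(
        "attempt_001_m_000001_1", Collections.singletonList("nodeA"));

    // A container offered on nodeA should be skipped for this speculative attempt.
    System.out.println(sketch.isNodeBlacklistedForAttempt("attempt_001_m_000001_1", "nodeA")); // true
    System.out.println(sketch.isNodeBlacklistedForAttempt("attempt_001_m_000001_1", "nodeB")); // false
  }
}
{code}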
One last thing:
 since we are going to keep {{TaskAttemptBlacklistManager}}, 
{{RMContainerAllocator.taskManager}} is not the best name for that field. Perhaps 
it should be renamed to something more descriptive of its functionality, e.g. 
{{attemptBlacklistMgr}}, {{speculativeLocalityMgr}}, etc.



> Speculative attempts should not run on the same node
> 
>
> Key: MAPREDUCE-7169
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7169
> Project: Hadoop Map/Reduce
>  Issue Type: New Feature
>  Components: yarn
>Affects Versions: 2.7.2
>Reporter: Lee chen
>Assignee: Bilwa S T
>Priority: Major
> Attachments: MAPREDUCE-7169-001.patch, MAPREDUCE-7169-002.patch, 
> MAPREDUCE-7169-003.patch, MAPREDUCE-7169.004.patch, MAPREDUCE-7169.005.patch, 
> image-2018-12-03-09-54-07-859.png
>
>
>   I found that in all versions of YARN, Speculative Execution may place the 
> speculative task on the same node as the original task. From what I have read, 
> it only tries to launch one more task attempt; I haven't seen any mention of 
> not running it on the same node. This is unreasonable: if the node has problems 
> that make task execution very slow, then placing the speculative task on the 
> same node cannot help the problematic task.
>  In our cluster (version 2.7.2, 2700 nodes), this phenomenon appears 
> almost every day.
>  !image-2018-12-03-09-54-07-859.png! 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org


