[jira] [Commented] (TEZ-4068) Prevent new speculative attempt after task has issued canCommit to an attempt

2019-05-08 Thread Ying Han (JIRA)


[ 
https://issues.apache.org/jira/browse/TEZ-4068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836066#comment-16836066
 ] 

Ying Han commented on TEZ-4068:
---

Indeed, in most cases a speculative attempt scheduled after canCommit has been 
issued would be cancelled before it completes. I would like to mention, though, 
that there is a slight chance that an attempt can still fail after canCommit: 
in the window between the invocation of TaskImpl#canCommit and the sending of 
TaskAttemptCompletedEvent. 

That being said, I do agree that a speculative attempt scheduled after the 
commit has been initiated would most likely be wasted, and it is a reasonable 
optimization to prevent that from happening. I would like to take on this JIRA 
and have assigned it to myself, [~jeagles].

> Prevent new speculative attempt after task has issued canCommit to an attempt
> -
>
> Key: TEZ-4068
> URL: https://issues.apache.org/jira/browse/TEZ-4068
> Project: Apache Tez
>  Issue Type: Improvement
>Reporter: Jonathan Eagles
>Priority: Major
>
> When a running attempt calls TaskImpl#canCommit through the taskUmbilical, 
> the TaskImpl will issue a "go" if it is the first attempt to do so. Otherwise 
> it will issue a "no-go". After commitAttempt is assigned in TaskImpl, no 
> other attempt is allowed to succeed at that point. So a speculative attempt 
> that is launched after commitAttempt is assigned can never finish before 
> the original, since it will always be given a "no-go" in the canCommit 
> response. In this jira, I propose to discuss disabling speculative attempts 
> after commitAttempt has been assigned.
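
To make the go/no-go behaviour and the proposed guard concrete, here is a minimal Java sketch. It is not the actual TaskImpl code: the class TaskCommitSketch, the methods shouldSpeculate and attemptFailed, and the use of plain strings as attempt IDs are all hypothetical names chosen for illustration. Only canCommit and commitAttempt come from the description above. Clearing commitAttempt when the committing attempt fails is an assumption about how the failure window mentioned in the comment could be handled.

```java
import java.util.Objects;

/** Hypothetical, simplified stand-in for the commit gate; not the real TaskImpl. */
class TaskCommitSketch {
    // Attempt that was given the "go"; null until some attempt asks to commit.
    private String commitAttempt;

    /** Returns true ("go") for the first attempt to ask, false ("no-go") for any other. */
    synchronized boolean canCommit(String attemptId) {
        Objects.requireNonNull(attemptId, "attemptId");
        if (commitAttempt == null) {
            commitAttempt = attemptId;
        }
        return commitAttempt.equals(attemptId);
    }

    /** Proposed guard: do not launch a speculative attempt once commitAttempt is assigned. */
    synchronized boolean shouldSpeculate() {
        return commitAttempt == null;
    }

    /**
     * Assumption: if the committing attempt fails before it completes, clear
     * commitAttempt so that another attempt can commit and speculation resumes.
     */
    synchronized void attemptFailed(String attemptId) {
        if (attemptId.equals(commitAttempt)) {
            commitAttempt = null;
        }
    }
}
```

In this sketch, once "attempt_0" has received a "go", shouldSpeculate() returns false and any later attempt gets a "no-go". If "attempt_0" then fails, the gate reopens and a new attempt can be scheduled and allowed to commit.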



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (TEZ-4068) Prevent new speculative attempt after task has issued canCommit to an attempt

2019-05-08 Thread Jonathan Eagles (JIRA)


[ 
https://issues.apache.org/jira/browse/TEZ-4068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16835897#comment-16835897
 ] 

Jonathan Eagles commented on TEZ-4068:
--

[~Chyler], this change in behavior is similar to the TaskImpl state machine 
change made in TEZ-4062. I would like to hear your thoughts on this jira and 
whether it is a good change. 



