[
https://issues.apache.org/jira/browse/TEZ-2581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14976365#comment-14976365
]
Bikas Saha commented on TEZ-2581:
---------------------------------
tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/TaskAttemptImpl.java
Comments embedded in the code snippet itself.
{code}
protected static class ScheduleTaskattemptTransition implements ....
if (ta.recoveryData != null) {
TaskAttemptStartedEvent taStartedEvent =
ta.recoveryData.getTaskAttemptStartedEvent(ta.getID());
// If TaskAttemptStartedEvent is seen, we don't need to ask for new
container to launch it.
// Just send TaskAttemptEventStartedRemotely to move it to RUNNING
if (taStartedEvent != null) { >>>> debug logging
LOG.info("TaskAttemptStartedEvent is seen, send
TaskAttemptEventStartedRemotely to itself, attemptId=" + ta.attemptId);
ta.sendEvent(new TaskAttemptEventStartedRemotely(ta.attemptId,
taStartedEvent.getContainerId(), null));
return TaskAttemptStateInternal.START_WAIT;
} ......
if (taFinishedEvent != null) {
if (taFinishedEvent.getState() == TaskAttemptState.FAILED) { >>>>
debug logging
LOG.info("TaskAttemptFinishedEvent is seen with state of FAILED,
send TaskAttemptEventAttemptFailed to itself"
+ ", attemptId=" + ta.attemptId);
ta.sendEvent(new TaskAttemptEventAttemptFailed(ta.getID(),
TaskAttemptEventType.TA_FAILED,
taFinishedEvent.getDiagnostics(),
taFinishedEvent.getTaskAttemptError()));
} else { >>>>>> Precondition that state should be KILLED?
LOG.info("TaskAttemptFinishedEvent is seen with state of KILLED,
send TaskAttemptEventKillRequest to itself"
+ ", attemptId=" + ta.attemptId);
ta.sendEvent(new TaskAttemptEventKillRequest(ta.getID(),
taFinishedEvent.getDiagnostics(),
taFinishedEvent.getTaskAttemptError()));
}
return TaskAttemptStateInternal.NEW;
}
>>>>> can the control come if recovery data is non null. If yes or no, are
>>>>> there some preconditions for that?
}
TaskAttemptEventSchedule scheduleEvent = (TaskAttemptEventSchedule) event;
ta.scheduledTime = ta.clock.getTime();
{code}
The if condition for container being null is not semantically accurate since
container is expected to be non-null. IMO, the code flow (everywhere) should be
like it is in ScheduleTaskattemptTransition, where first recovery data is
checked for alternatives. If there is no alternative then regular code flow
happens without any new checks.
{code}
// Container may be null in recovery since we do not recover container
for now.
if (ta.appContext.getAllContainers().get(event.getContainerId()) != null)
{
AMContainer amContainer =
ta.appContext.getAllContainers().get(event.getContainerId());
{code}
The pattern below, where possible, may be more maintainable
{code}// do common work
if (recover from recoveryData) {
// do recovery exclusive work e.g. send events to itself
} else {
// do normal exclusive work. e.g. logHistoryStarted() here and not have
recovery logic inside logHistoryStarted() itself
}
// do common work{code} e.g. send events to Task
Comments in the code block directly
{code}
if (taFinishedEvent != null) {
.....
} <<<<< should the next line be in the else block of the above if?
Seems to be so from the comments
if (ta.recoveryData.getTaskAttemptStartedEvent(ta.getID()) != null) {
// If TaskAttemptStartedEvent is seen but no TaskAttemptFinishedEvent
is seen, <<<< should be in else stmt?
// that means it is still in RUNNING in the last AM attempt,
// Send TaskAttemptEventContainerTerminated to move it to KILLED.
// In the future when we can support recover container, it is not
necessary to send this event.
LOG.info("TaskAttempt is still running in the last AM attempt,
attemptId=" + ta.attemptId); <<< debug?
<<<<<< send a kill event instead of faking container terminated else we may see
future bugs in the code for handling container termination
ta.sendEvent(new TaskAttemptEventContainerTerminated(ta.attemptId,
"Task Attempt killed in recovery due to can't recover the
running task attempt",
TaskAttemptTerminationCause.TERMINATED_AT_SHUTDOWN)); <<<<<
we should create a new termination cause
}
{code}
TezEventUpdaterTransition - we could accumulate events and then write them upon
finish once - any downsides to that?
For debugging/tracing it may be useful to add a isRecovered flag in each of the
events we send out because of alternative recovery code path. This was in debug
logging etc. we could see e.g. if the ta_done was sent by the listener or by
the TA itself.
Can we check that the post recovery value of various times like
startTime/launchTime/finishTime etc. remain at their pre recovery values?
Can we simply make the recoveryData availabe from the appContext instead of
changing the constructor of TA/Task/Vertex/DAG etc. ? Might remove a lot of
diffs.
> Umbrella for Tez Recovery Redesign
> ----------------------------------
>
> Key: TEZ-2581
> URL: https://issues.apache.org/jira/browse/TEZ-2581
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Jeff Zhang
> Assignee: Jeff Zhang
> Attachments: TEZ-2581-WIP-1.patch, TEZ-2581-WIP-2.patch,
> TEZ-2581-WIP-3.patch, TEZ-2581-WIP-4.patch, TEZ-2581-WIP-5.patch,
> TEZ-2581-WIP-6.patch, TezRecoveryRedesignProposal.pdf,
> TezRecoveryRedesignV1.1.pdf
>
>
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)