LinBinJ commented on PR #10541:
URL:
https://github.com/apache/dolphinscheduler/pull/10541#issuecomment-1840233590
Why was this piece of code deleted? Without it, a retry no longer generates
a new task instance, so the logs of the old instance become inaccessible.
More critically, `startTime` is never reset, so the retried instance times
out and fails immediately on startup.
```
if (taskInstance.getState().typeIsFailure()) {
    if (taskInstance.isSubProcess()) {
        taskInstance.setRetryTimes(taskInstance.getRetryTimes() + 1);
    } else {
        if (processInstanceState != ExecutionStatus.READY_STOP
                && processInstanceState != ExecutionStatus.READY_PAUSE) {
            // failure task set invalid
            taskInstance.setFlag(Flag.NO);
            updateTaskInstance(taskInstance);
            // create new task instance
            if (taskInstance.getState() != ExecutionStatus.NEED_FAULT_TOLERANCE) {
                taskInstance.setRetryTimes(taskInstance.getRetryTimes() + 1);
            }
            taskInstance.setSubmitTime(null);
            taskInstance.setLogPath(null);
            taskInstance.setExecutePath(null);
            taskInstance.setStartTime(null);
            taskInstance.setEndTime(null);
            taskInstance.setFlag(Flag.YES);
            taskInstance.setHost(null);
            taskInstance.setId(0);
        }
    }
}
if (processInstanceState.typeIsFinished()
        || processInstanceState == ExecutionStatus.READY_STOP) {
    logger.warn("processInstance {} was {}, skip submit task",
            processInstance.getProcessDefinitionCode(), processInstanceState);
    return null;
}
if (processInstanceState == ExecutionStatus.READY_PAUSE) {
    taskInstance.setState(ExecutionStatus.PAUSE);
}
```
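To illustrate the timeout symptom, here is a minimal standalone sketch. It is not DolphinScheduler code: the `Task` class, the `isTimedOut` rule (elapsed time since `startTime` compared against a limit), and both retry helpers are hypothetical stand-ins, assuming the timeout check works roughly this way.

```java
import java.time.Duration;
import java.time.Instant;

public class RetryTimeoutSketch {

    /** Hypothetical stand-in for a task instance record. */
    static final class Task {
        int id;
        Instant startTime;
        int retryTimes;

        Task(int id, Instant startTime) {
            this.id = id;
            this.startTime = startTime;
        }
    }

    /** Assumed timeout rule: time elapsed since startTime exceeds the limit. */
    static boolean isTimedOut(Task task, Instant now, Duration limit) {
        if (task.startTime == null) {
            return false; // not started yet, nothing to measure
        }
        return Duration.between(task.startTime, now).compareTo(limit) > 0;
    }

    /** Retry that reuses the failed record unchanged, keeping its old startTime. */
    static Task retryWithoutReset(Task failed) {
        failed.retryTimes++;
        return failed;
    }

    /** Retry that mirrors the quoted reset: id 0 and a cleared startTime. */
    static Task retryWithReset(Task failed) {
        Task fresh = new Task(0, null);
        fresh.retryTimes = failed.retryTimes + 1;
        return fresh;
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2023-12-05T10:00:00Z");
        Duration limit = Duration.ofMinutes(10);
        // The first attempt started 30 minutes ago and then failed.
        Task failed = new Task(42, now.minus(Duration.ofMinutes(30)));

        Task fresh = retryWithReset(failed);
        Task stale = retryWithoutReset(failed);

        System.out.println("without reset timed out: " + isTimedOut(stale, now, limit));
        System.out.println("with reset timed out:    " + isTimedOut(fresh, now, limit));
    }
}
```

Under these assumptions, the retry that keeps the old `startTime` is already past its limit the moment it is resubmitted, while the reset one is not.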
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.