keepmeup opened a new issue #4249:
URL: https://github.com/apache/incubator-dolphinscheduler/issues/4249
**Describe the question**
**Which version of DolphinScheduler:**
1.3.3
**Additional context**
When an exception occurs while TaskExecuteThread is executing a task, the task is killed and the state of the taskInstance becomes kill.
However, the kill method only sets the taskInstance to the kill state. The corresponding MasterTaskExecThread stays blocked in waitTaskQuit.
That loop may never exit, so the MasterTaskExecThread thread is occupied for nothing.
The main code is as follows:
org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread#run
```java
try {
    ...
} catch (Exception e) {
    logger.error("task scheduler failure", e);
    kill(); // some exceptions end up here and trigger kill()
    responseCommand.setStatus(ExecutionStatus.FAILURE.getCode());
    responseCommand.setEndTime(new Date());
    responseCommand.setProcessId(task.getProcessId());
    responseCommand.setAppIds(task.getAppIds());
} finally {
    taskExecutionContextCacheManager.removeByTaskInstanceId(taskExecutionContext.getTaskInstanceId());
    ResponceCache.get().cache(taskExecutionContext.getTaskInstanceId(), responseCommand.convert2Command(), Event.RESULT);
    taskCallbackService.sendResult(taskExecutionContext.getTaskInstanceId(),
            responseCommand.convert2Command());
}
```
org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread#kill
```java
public void kill() {
    if (task != null) {
        try {
            task.cancelApplication(true);
        } catch (Exception e) {
            logger.error(e.getMessage(), e);
        }
    }
}
```
org.apache.dolphinscheduler.server.master.runner.MasterTaskExecThread#waitTaskQuit
```java
public Boolean waitTaskQuit() {
    // query new state
    taskInstance = processService.findTaskInstanceById(taskInstance.getId());
    logger.info("wait task: process id: {}, task id:{}, task name:{} complete",
            this.taskInstance.getProcessInstanceId(),
            this.taskInstance.getId(), this.taskInstance.getName());
    while (Stopper.isRunning()) {
        try {
            if (this.processInstance == null) {
                logger.error("process instance not exists , master task exec thread exit");
                return true;
            }
            // task instance add queue , waiting worker to kill
            if (this.cancel || this.processInstance.getState() == ExecutionStatus.READY_STOP) {
                // taskInstance will change to kill state, but will still wait in the loop
                cancelTaskInstance();
            }
            if (processInstance.getState() == ExecutionStatus.READY_PAUSE) {
                pauseTask();
            }
            // task instance finished
            if (taskInstance.getState().typeIsFinished()) {
                // if task is final result , then remove taskInstance from cache
                taskInstanceCacheManager.removeByTaskInstanceId(taskInstance.getId());
                break;
            }
            if (checkTaskTimeout()) {
                this.checkTimeoutFlag = !alertTimeout();
            }
            // updateProcessInstance task instance
            taskInstance = processService.findTaskInstanceById(taskInstance.getId());
            processInstance = processService.findProcessInstanceById(processInstance.getId());
            Thread.sleep(Constants.SLEEP_TIME_MILLIS);
        } catch (Exception e) {
            logger.error("exception", e);
            if (processInstance != null) {
                logger.error("wait task quit failed, instance id:{}, task id:{}",
                        processInstance.getId(), taskInstance.getId());
            }
        }
    }
    return true;
}
```
**Requirement or improvement**
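Once the state of the taskInstance has been set to kill, the corresponding MasterTaskExecThread should not keep blocking and waiting. It should return its thread to the thread pool.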
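
To make the request concrete, here is a minimal sketch of a wait loop that gives up its thread once the task reaches a terminal state, including kill. This is not the DolphinScheduler implementation: `WaitTaskQuitSketch`, `State`, `isFinished`, the `stateLookup` supplier and the `maxPolls` cap are all hypothetical names made up for illustration, and treating KILL as terminal is an assumption of the sketch.

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in types; none of these are DolphinScheduler classes.
class WaitTaskQuitSketch {

    // Simplified task states; the real ExecutionStatus enum has more values.
    enum State {
        RUNNING, SUCCESS, FAILURE, KILL;

        // Assumption: KILL is terminal, just like SUCCESS and FAILURE.
        boolean isFinished() {
            return this != RUNNING;
        }
    }

    /**
     * Polls the task state until it is terminal or maxPolls is reached.
     * Returns the number of polls performed.
     */
    static int waitTaskQuit(Supplier<State> stateLookup, int maxPolls, long sleepMillis) {
        int polls = 0;
        while (polls < maxPolls) {
            State state = stateLookup.get();
            polls++;
            if (state.isFinished()) {
                break; // terminal state (including KILL): stop waiting
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                break;
            }
        }
        return polls;
    }

    public static void main(String[] args) {
        // Simulated state sequence: the task is killed on the third lookup.
        Iterator<State> states = List.of(State.RUNNING, State.RUNNING, State.KILL).iterator();
        int polls = waitTaskQuit(states::next, 10, 1L);
        System.out.println("loop exited after " + polls + " polls");
    }
}
```

The `maxPolls` cap is only there so the sketch always terminates. In the real loop the equivalent safeguard would more likely be the existing timeout check.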