wcmolin opened a new issue, #13244:
URL: https://github.com/apache/dolphinscheduler/issues/13244

   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/dolphinscheduler/issues?q=is%3Aissue) and 
found no similar issues.
   
   
   ### What happened
   
   Sometimes it happens that after stopping the workflow, the status of the 
killed task is still running
   
   <img width="1701" alt="image" 
src="https://user-images.githubusercontent.com/11606262/208908647-222ed3e5-5bff-420c-a141-a321d910bd6e.png";>
   
   <img width="1176" alt="image" 
src="https://user-images.githubusercontent.com/11606262/208908697-eecec7d5-808e-4a23-8efa-12ded2473205.png";>
   
   This is a screenshot of the task configuration:
   <img width="1100" alt="image" 
src="https://user-images.githubusercontent.com/11606262/208909286-7a0081e5-4c81-4da8-bd8d-4571f7b4f092.png";>
   
   
   ### What you expected to happen
   
   Killed task status shows as killed
   
   ### How to reproduce
   
   The code must be changed to reproduce as much as possible.
   
   Modify line 175 of 
`org.apache.dolphinscheduler.server.master.processor.queue.TaskResponseService` 
and add the following code
   ```java
   if (null != taskResponsePersistThread) {
       if (taskResponsePersistThread.addEvent(taskResponseEvent))...    
   } else {
       logger.error("task response persist thread is null");
   }
   ```
   
   Modify line 200 of 
`org.apache.dolphinscheduler.server.master.processor.queue.TaskResponseService` 
and add the following code
   ```java
   while (Stopper.isRunning()) {
       try {
           eventHandler();
           // TimeUnit.MILLISECONDS.sleep(Constants.SLEEP_TIME_MILLIS);
           TimeUnit.MILLISECONDS.sleep(Constants.SLEEP_TIME_MILLIS * 10);
   ```
   
   The reason for this is to simulate another site to remove the instance in 
processInstanceMap.
   ```java
   if (taskResponsePersistThread.eventSize() == 0) {
       if 
(!processInstanceMap.containsKey(taskResponsePersistThread.getProcessInstanceId()))
 {
           
processTaskResponseMap.remove(taskResponsePersistThread.getProcessInstanceId());
           logger.info("remove process instance: {}", 
taskResponsePersistThread.getProcessInstanceId());
       }
       continue;
   }
   ```
    This will cause the instance in `processTaskResponseMap` to be removed in 
the following code, resulting in `TaskResponsePersistThread` being empty and 
printing 
   ```
   task response persist thread is null
   ```
   
   **master log:**
   ```
   [INFO] 2022-12-21 20:44:27.372 
org.apache.dolphinscheduler.server.master.processor.CacheProcessor:[52] - 
received command : CacheExpireCommand{CacheType=PROCESS_DEFINITION, 
cacheKey=7960963905984}
   [INFO] 2022-12-21 20:44:27.390 
org.apache.dolphinscheduler.server.master.processor.CacheProcessor:[70] - cache 
evict, type:processDefinition, key:7960963905984
   [INFO] 2022-12-21 20:44:28.363 
org.apache.dolphinscheduler.server.master.runner.MasterSchedulerService:[255] - 
find command 8, slot:0 :
   [INFO] 2022-12-21 20:44:28.363 
org.apache.dolphinscheduler.server.master.runner.MasterSchedulerService:[208] - 
find one command: id: 8, type: START_PROCESS
   [INFO] 2022-12-21 20:44:28.482 
org.apache.dolphinscheduler.server.master.runner.MasterSchedulerService:[221] - 
handle command end, command 8 process 8 start...
   [INFO] 2022-12-21 20:44:28.650 
org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteThread:[1370] - 
add task to stand by list, task name:sleep-20s, task id:0, task 
code:7960945419712
   [INFO] 2022-12-21 20:44:28.685 
org.apache.dolphinscheduler.service.process.ProcessService:[1088] - start 
submit task : sleep-20s, instance id:8, state: RUNNING_EXECUTION
   [INFO] 2022-12-21 20:44:28.803 
org.apache.dolphinscheduler.service.process.ProcessService:[1101] - end submit 
task to db successfully:8 sleep-20s state:SUBMITTED_SUCCESS complete, instance 
id:8 state: RUNNING_EXECUTION  
   [INFO] 2022-12-21 20:44:28.823 
org.apache.dolphinscheduler.server.master.runner.task.CommonTaskProcessor:[164] 
- master submit success, task id:8, task name:sleep-20s, process id:8
   [INFO] 2022-12-21 20:44:28.824 
org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteThread:[1384] - 
remove task from stand by list, id: 8 name:sleep-20s
   [INFO] 2022-12-21 20:44:29.043 
org.apache.dolphinscheduler.server.master.processor.TaskAckProcessor:[71] - 
taskAckCommand : TaskExecuteAckCommand{taskInstanceId=8, startTime=Wed Dec 21 
20:44:29 CST 2022, host='192.168.20.62:1234', status=1, 
logPath='/Users/molin/dev/github/dolphinscheduler/logs/7960963905984_2/8/8.log',
 
executePath='/tmp/dolphinscheduler/exec/process/7960944102080/7960963905984_2/8/8',
 processInstanceId='0'}
   [INFO] 2022-12-21 20:44:29.043 
org.apache.dolphinscheduler.server.master.processor.TaskAckProcessor:[71] - 
taskAckCommand : TaskExecuteAckCommand{taskInstanceId=8, startTime=Wed Dec 21 
20:44:29 CST 2022, host='192.168.20.62:1234', status=1, 
logPath='/Users/molin/dev/github/dolphinscheduler/logs/7960963905984_2/8/8.log',
 
executePath='/tmp/dolphinscheduler/exec/process/7960944102080/7960963905984_2/8/8',
 processInstanceId='8'}
   [ERROR] 2022-12-21 20:44:29.044 
org.apache.dolphinscheduler.server.master.processor.queue.TaskResponseService:[176]
 - task response persist thread is null
   [INFO] 2022-12-21 20:44:33.368 
org.apache.dolphinscheduler.server.master.processor.queue.TaskResponseService:[226]
 - already exists handler process size:0
   [INFO] 2022-12-21 20:44:33.703 
org.apache.dolphinscheduler.server.master.processor.queue.TaskResponseService:[233]
 - persist events 8 succeeded.
   [INFO] 2022-12-21 20:44:42.226 
org.apache.dolphinscheduler.server.master.processor.StateEventProcessor:[75] - 
received command : State Event :key: 8-0-8-0 type: PROCESS_STATE_CHANGE 
executeStatus: READY_STOP task instance id: 0 process instance id: 8 context: 
null
   [INFO] 2022-12-21 20:44:42.975 
org.apache.dolphinscheduler.server.master.runner.EventExecuteService:[127] - 
handle process instance : 8 , events count:1
   [INFO] 2022-12-21 20:44:42.980 
org.apache.dolphinscheduler.server.master.runner.EventExecuteService:[130] - 
already exists handler process size:0
   [INFO] 2022-12-21 20:44:42.982 
org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteThread:[310] - 
process event: State Event :key: 8-0-8-0 type: PROCESS_STATE_CHANGE 
executeStatus: READY_STOP task instance id: 0 process instance id: 8 context: 
null
   [INFO] 2022-12-21 20:44:42.983 
org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteThread:[497] - 
process:8 state RUNNING_EXECUTION change to READY_STOP
   [INFO] 2022-12-21 20:44:43.013 
org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteThread:[1422] - 
kill called on process instance id: 8, num: 1
   [INFO] 2022-12-21 20:44:43.094 
org.apache.dolphinscheduler.server.master.runner.task.CommonTaskProcessor:[209] 
- master kill taskInstance name :sleep-20s taskInstance id:8
   [INFO] 2022-12-21 20:44:43.095 
org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteThread:[310] - 
process event: State Event :key: null type: PROCESS_STATE_CHANGE executeStatus: 
STOP task instance id: 0 process instance id: 8 context: null
   [INFO] 2022-12-21 20:44:43.095 
org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteThread:[497] - 
process:8 state READY_STOP change to STOP
   [INFO] 2022-12-21 20:44:43.107 
org.apache.dolphinscheduler.server.master.runner.WorkflowExecuteThread:[1320] - 
work flow process instance [id: 8, 
name:test_task_status_error-2-20221221204428415], state change from READY_STOP 
to STOP, cmd type: STOP
   [INFO] 2022-12-21 20:44:43.307 
org.apache.dolphinscheduler.server.master.runner.EventExecuteService:[139] - 
process instance 8 finished.
   [INFO] 2022-12-21 20:44:43.385 
org.apache.dolphinscheduler.server.master.processor.queue.TaskResponseService:[222]
 - remove process instance: 8
   [INFO] 2022-12-21 20:44:43.610 
org.apache.dolphinscheduler.server.master.processor.TaskResponseProcessor:[72] 
- received command : TaskExecuteResponseCommand{taskInstanceId=8, 
processInstanceId=8, status=9, startTime=null, endTime=Wed Dec 21 20:44:43 CST 
2022, host=null, logPath=null, executePath=null, processId=2407, appIds='', 
varPool=[]}
   [ERROR] 2022-12-21 20:44:43.613 
org.apache.dolphinscheduler.server.master.processor.queue.TaskResponseService:[176]
 - task response persist thread is null
   
   ```
   
   **worker log:**
   ```
   [INFO] 2022-12-21 20:44:28.882 
org.apache.dolphinscheduler.server.worker.processor.TaskExecuteProcessor:[110] 
- received command : 
TaskExecuteRequestCommand{taskExecutionContext='{"taskInstanceId":8,"taskName":"sleep-20s","firstSubmitTime":"2022-12-21
 
20:44:28","startTime":null,"endTime":null,"taskType":"SHELL","host":null,"executePath":null,"logPath":null,"taskJson":null,"processId":0,"processDefineCode":7960963905984,"processDefineVersion":2,"appIds":null,"processInstanceId":8,"scheduleTime":null,"globalParams":null,"executorId":1,"cmdTypeIfComplement":0,"tenantCode":"molin","queue":"default","projectCode":7960944102080,"taskParams":"{\"resourceList\":[],\"localParams\":[],\"rawScript\":\"echo
 \\\"start\\\"\\nsleep 20\\necho 
\\\"end\\\"\",\"dependence\":{},\"conditionResult\":{\"successNode\":[],\"failedNode\":[]},\"waitStartTimeout\":{},\"switchResult\":{}}","envFile":null,"environmentConfig":null,"definedParams":null,"taskAppId":null,"taskTimeoutStrategy":null,"taskTimeout":21
 
47483647,"workerGroup":"default","delayTime":0,"currentExecutionStatus":null,"resources":{},"sqlTaskExecutionContext":{"warningGroupId":0,"connectionParams":null,"udfFuncTenantCodeMap":null,"defaultFS":null},"dataxTaskExecutionContext":{"dataSourceId":0,"sourcetype":0,"sourceConnectionParams":null,"dataTargetId":0,"targetType":0,"targetConnectionParams":null},"dependenceTaskExecutionContext":null,"sqoopTaskExecutionContext":{"dataSourceId":0,"sourcetype":0,"sourceConnectionParams":null,"dataTargetId":0,"targetType":0,"targetConnectionParams":null},"varPool":null,"dryRun":0,"paramsMap":null,"procedureTaskExecutionContext":{"connectionParams":null}}'}
   [INFO] 2022-12-21 20:44:28.919 
org.apache.dolphinscheduler.common.utils.FileUtils:[121] - create dir success 
/tmp/dolphinscheduler/exec/process/7960944102080/7960963905984_2/8/8
   [INFO] 2022-12-21 20:44:28.921 
org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread:[128] - task 
instance local execute path : 
/tmp/dolphinscheduler/exec/process/7960944102080/7960963905984_2/8/8
   [INFO] 2022-12-21 20:44:29.038 
org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread:[146] - the 
task begins to execute. task instance id: 8
   [INFO] 2022-12-21 20:44:29.075 TaskLogLogger-class 
org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[83] - shell task 
params {"resourceList":[],"localParams":[],"rawScript":"echo \"start\"\nsleep 
20\necho 
\"end\"","dependence":{},"conditionResult":{"successNode":[],"failedNode":[]},"waitStartTimeout":{},"switchResult":{}}
   [INFO] 2022-12-21 20:44:29.098 TaskLogLogger-class 
org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[137] - raw script : 
echo "start"
   sleep 20
   echo "end"
   [INFO] 2022-12-21 20:44:29.099 TaskLogLogger-class 
org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[138] - task execute 
path : /tmp/dolphinscheduler/exec/process/7960944102080/7960963905984_2/8/8
   [INFO] 2022-12-21 20:44:29.100 TaskLogLogger-class 
org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[86] - tenantCode 
user:molin, task dir:8_8
   [INFO] 2022-12-21 20:44:29.100 TaskLogLogger-class 
org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[91] - create command 
file:/tmp/dolphinscheduler/exec/process/7960944102080/7960963905984_2/8/8/8_8.command
   [INFO] 2022-12-21 20:44:29.101 TaskLogLogger-class 
org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[117] - command : 
#!/bin/sh
   BASEDIR=$(cd `dirname $0`; pwd)
   cd $BASEDIR
   source /etc/profile
   
/tmp/dolphinscheduler/exec/process/7960944102080/7960963905984_2/8/8/8_8_node.sh
   [INFO] 2022-12-21 20:44:29.115 TaskLogLogger-class 
org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[287] - task run 
command: sudo -u molin sh 
/tmp/dolphinscheduler/exec/process/7960944102080/7960963905984_2/8/8/8_8.command
   [INFO] 2022-12-21 20:44:29.117 TaskLogLogger-class 
org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[178] - process start, 
process id is: 2407
   [INFO] 2022-12-21 20:44:30.121 TaskLogLogger-class 
org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[66] -  -> welcome to 
use bigdata scheduling system...
        start
   [INFO] 2022-12-21 20:44:43.101 
org.apache.dolphinscheduler.server.worker.processor.TaskKillProcessor:[96] - 
received kill command : TaskKillRequestCommand{taskInstanceId=8}
   [INFO] 2022-12-21 20:44:43.550 
org.apache.dolphinscheduler.server.worker.processor.TaskKillProcessor:[145] - 
process id:2407, cmd:sudo -u molin kill -9 02408 02414 02415
   [INFO] 2022-12-21 20:44:43.598 TaskLogLogger-class 
org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[202] - process has 
exited, execute 
path:/tmp/dolphinscheduler/exec/process/7960944102080/7960963905984_2/8/8, 
processId:2407 ,exitStatusCode:137 ,processWaitForStatus:true 
,processExitValue:137
   [INFO] 2022-12-21 20:44:43.599 
org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread:[206] - task 
instance id : 8,task final status : KILL
   [INFO] 2022-12-21 20:44:43.602 
org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread:[264] - 
develop mode is: false
   [INFO] 2022-12-21 20:44:43.608 
org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread:[282] - exec 
local path: 
/tmp/dolphinscheduler/exec/process/7960944102080/7960963905984_2/8/8 cleared.
   [INFO] 2022-12-21 20:44:43.617 
org.apache.dolphinscheduler.server.worker.processor.TaskKillProcessor:[194] - 
view log host : 192.168.20.62,logPath : 
/Users/molin/dev/github/dolphinscheduler/logs/7960963905984_2/8/8.log
   [INFO] 2022-12-21 20:44:43.617 
org.apache.dolphinscheduler.service.log.LogClientService:[117] - view log path 
/Users/molin/dev/github/dolphinscheduler/logs/7960963905984_2/8/8.log
   [INFO] 2022-12-21 20:44:43.623 
org.apache.dolphinscheduler.remote.NettyRemotingClient:[390] - netty client 
closed
   [INFO] 2022-12-21 20:44:43.630 
org.apache.dolphinscheduler.service.log.LogClientService:[74] - logger client 
closed
   [INFO] 2022-12-21 20:44:44.149 TaskLogLogger-class 
org.apache.dolphinscheduler.plugin.task.shell.ShellTask:[60] - FINALIZE_SESSION
   ```
   
   ### Anything else
   
   _No response_
   
   ### Version
   
   2.0.x
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: 
[email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to