caishunfeng commented on a change in pull request #8490:
URL: https://github.com/apache/dolphinscheduler/pull/8490#discussion_r811959026
##########
File path:
dolphinscheduler-worker/src/main/java/org/apache/dolphinscheduler/server/worker/WorkerServer.java
##########
@@ -211,4 +221,22 @@ public void close(String cause) {
public void stop(String cause) {
close(cause);
}
+
+ /**
+ * kill all yarn tasks which are running
+ */
+ public void killAllRunningYarnTasks() {
Review comment:
> I think it should kill the local process and kill the yarn task when
> the master is fault-tolerant

It's related to the order of events: the master adds the worker to the dead
server list and fails over the worker, and only then does the worker detect
that it is dead and close. During the worker heartbeat interval, the worker
may start running a new task but not yet have responded. Therefore, the worker
should also kill the local process so that no task is missed.
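
To illustrate the point above, here is a minimal sketch of a shutdown step that kills both the local process and the yarn task for every task still running on the worker. All names (`WorkerShutdownSketch`, `RunningTask`, `killAllRunningTasks`) are hypothetical and are not taken from the PR or from the DolphinScheduler code base; the order (local process first, then yarn) is also an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class WorkerShutdownSketch {

    /** Hypothetical view of a task that the worker is still running. */
    interface RunningTask {
        void killLocalProcess();
        void killYarnApplications();
    }

    /**
     * Kills every running task: the local process first, then the yarn task.
     * Returns the number of tasks for which both steps finished without throwing.
     */
    static int killAllRunningTasks(List<? extends RunningTask> tasks) {
        int killed = 0;
        for (RunningTask task : tasks) {
            try {
                // Covers a task started during the heartbeat interval that the
                // master does not know about yet (assumed scenario from the comment).
                task.killLocalProcess();
                task.killYarnApplications();
                killed++;
            } catch (RuntimeException e) {
                // One failure should not stop the remaining tasks from being killed.
                System.err.println("failed to kill task: " + e.getMessage());
            }
        }
        return killed;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        RunningTask task = new RunningTask() {
            @Override
            public void killLocalProcess() {
                log.add("local");
            }

            @Override
            public void killYarnApplications() {
                log.add("yarn");
            }
        };
        int killed = killAllRunningTasks(Collections.singletonList(task));
        System.out.println(killed + " task(s) killed, steps: " + log);
    }
}
```

Catching per-task failures keeps one stuck task from blocking the rest of the cleanup, which seems to matter here because the worker is already closing.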
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]