Copilot commented on code in PR #17320:
URL: 
https://github.com/apache/dolphinscheduler/pull/17320#discussion_r2212707962


##########
dolphinscheduler-task-plugin/dolphinscheduler-task-api/src/test/java/org/apache/dolphinscheduler/plugin/task/api/utils/ProcessUtilsTest.java:
##########
@@ -125,16 +155,76 @@ void testKillProcessSuccessWithSigInt() throws Exception {
                 .thenReturn("kill -s SIGINT 12345");
         mockedOSUtils.when(() -> OSUtils.exeCmd("kill -s SIGINT 
12345")).thenReturn("");
 
-        // Mock process check - process dies after SIGINT
-        mockedOSUtils.when(() -> OSUtils.exeCmd("ps -p 
12345")).thenReturn(null);
+        // Mock kill -0
+        mockedOSUtils.when(() -> OSUtils.getSudoCmd(Mockito.eq("testTenant"), 
Mockito.matches("kill -0.*")))
+                .thenReturn("kill -0 12345");
+        // Initialize a counter to track how many times the method is invoked
+        AtomicInteger callCount = new AtomicInteger(0);
+        // Mock the static method OSUtils.exeCmd that matches "kill -0" command
+        mockedOSUtils.when(() -> OSUtils.exeCmd(Mockito.matches(".*kill 
-0.*")))
+                .thenAnswer(invocation -> {
+                    int count = callCount.incrementAndGet();
+                    // these calls will succeed (simulate process is alive)
+                    if (count == 1 || count == 2) {
+                        return "";
+                    } else {
+                        throw new RuntimeException("Command failed");
+                    }
+                });

Review Comment:
   The AtomicInteger with hardcoded call count logic (count == 1 || count == 2) 
makes the test brittle and difficult to understand. Consider using a more 
explicit mock setup with predefined responses.
   ```suggestion
           // Mock the static method OSUtils.exeCmd that matches "kill -0" 
command
           mockedOSUtils.when(() -> OSUtils.exeCmd(Mockito.matches(".*kill 
-0.*")))
                   .thenReturn("") // First invocation succeeds (process is 
alive)
                   .thenReturn("") // Second invocation succeeds (process is 
alive)
                   .thenThrow(new RuntimeException("Command failed")); // 
Subsequent invocations fail (process is no longer alive)
   ```



##########
dolphinscheduler-task-plugin/dolphinscheduler-task-api/src/main/java/org/apache/dolphinscheduler/plugin/task/api/utils/ProcessUtils.java:
##########
@@ -139,23 +156,100 @@ public static boolean kill(@NonNull TaskExecutionContext 
request) {
     /**
      * Send a kill signal to a process group
      * @param signal Signal type (SIGINT, SIGTERM, SIGKILL)
-     * @param pids Process ID list
+     * @param pidList Process ID list
      * @param tenantCode Tenant code
      */
-    private static boolean sendKillSignal(String signal, String pids, String 
tenantCode) {
+    private static boolean sendKillSignal(String signal, List<Integer> 
pidList, String tenantCode) {
+        if (pidList == null || pidList.isEmpty()) {
+            log.info("No process needs to be killed.");
+            return true;
+        }
+
+        List<Integer> alivePidList = getAlivePidList(pidList, tenantCode);
+        if (alivePidList.isEmpty()) {
+            log.info("All processes already terminated.");
+            return true;
+        }
+
+        String pids = alivePidList.stream()
+                .map(String::valueOf)
+                .collect(Collectors.joining(" "));
+
         try {
+            // 1. Send the kill signal
             String killCmd = String.format("kill -s %s %s", signal, pids);
             killCmd = OSUtils.getSudoCmd(tenantCode, killCmd);
             log.info("Sending {} to process group: {}, command: {}", signal, 
pids, killCmd);
             OSUtils.exeCmd(killCmd);
 
-            return true;
+            // 2. Wait for the processes to terminate with a timeout-based 
polling mechanism
+            // Max wait time
+            long timeoutMillis = 
TimeUnit.SECONDS.toMillis(SHELL_KILL_WAIT_TIMEOUT);
+
+            long startTime = System.currentTimeMillis();
+            while (!alivePidList.isEmpty() && (System.currentTimeMillis() - 
startTime < timeoutMillis)) {
+                // Remove if process is no longer alive
+                alivePidList.removeIf(pid -> !isProcessAlive(pid, tenantCode));
+                if (!alivePidList.isEmpty()) {
+                    // Wait for a short interval before checking process 
statuses again, to avoid excessive CPU usage
+                    // from tight-loop polling.
+                    ThreadUtils.sleep(SLEEP_TIME_MILLIS);

Review Comment:
   [nitpick] The polling mechanism uses a fixed sleep interval which may not be 
optimal. Consider using exponential backoff or a configurable polling interval 
to balance responsiveness with CPU usage.
   ```suggestion
               long initialSleepMillis = 100; // Start with 100ms
               long maxSleepMillis = 2000; // Cap the sleep interval at 2000ms
               long currentSleepMillis = initialSleepMillis;
   
               while (!alivePidList.isEmpty() && (System.currentTimeMillis() - 
startTime < timeoutMillis)) {
                   // Remove if process is no longer alive
                   alivePidList.removeIf(pid -> !isProcessAlive(pid, 
tenantCode));
                   if (!alivePidList.isEmpty()) {
                       // Wait for a dynamically adjusted interval before 
checking process statuses again
                       ThreadUtils.sleep(currentSleepMillis);
                       // Increase the sleep interval exponentially, up to the 
maximum limit
                       currentSleepMillis = Math.min(currentSleepMillis * 2, 
maxSleepMillis);
                   } else {
                       // Reset the sleep interval if all processes are 
terminated
                       currentSleepMillis = initialSleepMillis;
   ```



##########
dolphinscheduler-task-plugin/dolphinscheduler-task-api/src/main/java/org/apache/dolphinscheduler/plugin/task/api/utils/ProcessUtils.java:
##########
@@ -139,23 +156,100 @@ public static boolean kill(@NonNull TaskExecutionContext 
request) {
     /**
      * Send a kill signal to a process group
      * @param signal Signal type (SIGINT, SIGTERM, SIGKILL)
-     * @param pids Process ID list
+     * @param pidList Process ID list
      * @param tenantCode Tenant code
      */
-    private static boolean sendKillSignal(String signal, String pids, String 
tenantCode) {
+    private static boolean sendKillSignal(String signal, List<Integer> 
pidList, String tenantCode) {
+        if (pidList == null || pidList.isEmpty()) {
+            log.info("No process needs to be killed.");
+            return true;
+        }
+
+        List<Integer> alivePidList = getAlivePidList(pidList, tenantCode);
+        if (alivePidList.isEmpty()) {
+            log.info("All processes already terminated.");
+            return true;
+        }
+
+        String pids = alivePidList.stream()
+                .map(String::valueOf)
+                .collect(Collectors.joining(" "));
+
         try {
+            // 1. Send the kill signal
             String killCmd = String.format("kill -s %s %s", signal, pids);
             killCmd = OSUtils.getSudoCmd(tenantCode, killCmd);
             log.info("Sending {} to process group: {}, command: {}", signal, 
pids, killCmd);
             OSUtils.exeCmd(killCmd);
 
-            return true;
+            // 2. Wait for the processes to terminate with a timeout-based 
polling mechanism
+            // Max wait time
+            long timeoutMillis = 
TimeUnit.SECONDS.toMillis(SHELL_KILL_WAIT_TIMEOUT);
+
+            long startTime = System.currentTimeMillis();
+            while (!alivePidList.isEmpty() && (System.currentTimeMillis() - 
startTime < timeoutMillis)) {
+                // Remove if process is no longer alive
+                alivePidList.removeIf(pid -> !isProcessAlive(pid, tenantCode));

Review Comment:
   [nitpick] The while loop modifies `alivePidList` during iteration with 
`removeIf()`. While this is safe for ArrayList, it could be made more explicit 
by collecting results separately to avoid potential confusion.
   ```suggestion
                   // Collect PIDs that are still alive
                   List<Integer> stillAlivePids = alivePidList.stream()
                           .filter(pid -> isProcessAlive(pid, tenantCode))
                           .collect(Collectors.toList());
   
                   alivePidList = stillAlivePids;
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to