Copilot commented on code in PR #17320:
URL:
https://github.com/apache/dolphinscheduler/pull/17320#discussion_r2212707962
##########
dolphinscheduler-task-plugin/dolphinscheduler-task-api/src/test/java/org/apache/dolphinscheduler/plugin/task/api/utils/ProcessUtilsTest.java:
##########
@@ -125,16 +155,76 @@ void testKillProcessSuccessWithSigInt() throws Exception {
.thenReturn("kill -s SIGINT 12345");
mockedOSUtils.when(() -> OSUtils.exeCmd("kill -s SIGINT
12345")).thenReturn("");
- // Mock process check - process dies after SIGINT
- mockedOSUtils.when(() -> OSUtils.exeCmd("ps -p
12345")).thenReturn(null);
+ // Mock kill -0
+ mockedOSUtils.when(() -> OSUtils.getSudoCmd(Mockito.eq("testTenant"),
Mockito.matches("kill -0.*")))
+ .thenReturn("kill -0 12345");
+ // Initialize a counter to track how many times the method is invoked
+ AtomicInteger callCount = new AtomicInteger(0);
+ // Mock the static method OSUtils.exeCmd that matches "kill -0" command
+ mockedOSUtils.when(() -> OSUtils.exeCmd(Mockito.matches(".*kill
-0.*")))
+ .thenAnswer(invocation -> {
+ int count = callCount.incrementAndGet();
+ // these calls will succeed (simulate process is alive)
+ if (count == 1 || count == 2) {
+ return "";
+ } else {
+ throw new RuntimeException("Command failed");
+ }
+ });
Review Comment:
The AtomicInteger with hardcoded call count logic (count == 1 || count == 2)
makes the test brittle and difficult to understand. Consider using a more
explicit mock setup with predefined responses.
```suggestion
// Mock the static method OSUtils.exeCmd that matches "kill -0"
command
mockedOSUtils.when(() -> OSUtils.exeCmd(Mockito.matches(".*kill
-0.*")))
.thenReturn("") // First invocation succeeds (process is
alive)
.thenReturn("") // Second invocation succeeds (process is
alive)
.thenThrow(new RuntimeException("Command failed")); //
Subsequent invocations fail (process is no longer alive)
```
##########
dolphinscheduler-task-plugin/dolphinscheduler-task-api/src/main/java/org/apache/dolphinscheduler/plugin/task/api/utils/ProcessUtils.java:
##########
@@ -139,23 +156,100 @@ public static boolean kill(@NonNull TaskExecutionContext
request) {
/**
* Send a kill signal to a process group
* @param signal Signal type (SIGINT, SIGTERM, SIGKILL)
- * @param pids Process ID list
+ * @param pidList Process ID list
* @param tenantCode Tenant code
*/
- private static boolean sendKillSignal(String signal, String pids, String
tenantCode) {
+ private static boolean sendKillSignal(String signal, List<Integer>
pidList, String tenantCode) {
+ if (pidList == null || pidList.isEmpty()) {
+ log.info("No process needs to be killed.");
+ return true;
+ }
+
+ List<Integer> alivePidList = getAlivePidList(pidList, tenantCode);
+ if (alivePidList.isEmpty()) {
+ log.info("All processes already terminated.");
+ return true;
+ }
+
+ String pids = alivePidList.stream()
+ .map(String::valueOf)
+ .collect(Collectors.joining(" "));
+
try {
+ // 1. Send the kill signal
String killCmd = String.format("kill -s %s %s", signal, pids);
killCmd = OSUtils.getSudoCmd(tenantCode, killCmd);
log.info("Sending {} to process group: {}, command: {}", signal,
pids, killCmd);
OSUtils.exeCmd(killCmd);
- return true;
+ // 2. Wait for the processes to terminate with a timeout-based
polling mechanism
+ // Max wait time
+ long timeoutMillis =
TimeUnit.SECONDS.toMillis(SHELL_KILL_WAIT_TIMEOUT);
+
+ long startTime = System.currentTimeMillis();
+ while (!alivePidList.isEmpty() && (System.currentTimeMillis() -
startTime < timeoutMillis)) {
+ // Remove if process is no longer alive
+ alivePidList.removeIf(pid -> !isProcessAlive(pid, tenantCode));
+ if (!alivePidList.isEmpty()) {
+ // Wait for a short interval before checking process
statuses again, to avoid excessive CPU usage
+ // from tight-loop polling.
+ ThreadUtils.sleep(SLEEP_TIME_MILLIS);
Review Comment:
[nitpick] The polling mechanism uses a fixed sleep interval which may not be
optimal. Consider using exponential backoff or a configurable polling interval
to balance responsiveness with CPU usage.
```suggestion
long initialSleepMillis = 100; // Start with 100ms
long maxSleepMillis = 2000; // Cap the sleep interval at 2000ms
long currentSleepMillis = initialSleepMillis;
while (!alivePidList.isEmpty() && (System.currentTimeMillis() -
startTime < timeoutMillis)) {
// Remove if process is no longer alive
alivePidList.removeIf(pid -> !isProcessAlive(pid,
tenantCode));
if (!alivePidList.isEmpty()) {
// Wait for a dynamically adjusted interval before
checking process statuses again
ThreadUtils.sleep(currentSleepMillis);
// Increase the sleep interval exponentially, up to the
maximum limit
currentSleepMillis = Math.min(currentSleepMillis * 2,
maxSleepMillis);
} else {
// Reset the sleep interval if all processes are
terminated
currentSleepMillis = initialSleepMillis;
```
##########
dolphinscheduler-task-plugin/dolphinscheduler-task-api/src/main/java/org/apache/dolphinscheduler/plugin/task/api/utils/ProcessUtils.java:
##########
@@ -139,23 +156,100 @@ public static boolean kill(@NonNull TaskExecutionContext
request) {
/**
* Send a kill signal to a process group
* @param signal Signal type (SIGINT, SIGTERM, SIGKILL)
- * @param pids Process ID list
+ * @param pidList Process ID list
* @param tenantCode Tenant code
*/
- private static boolean sendKillSignal(String signal, String pids, String
tenantCode) {
+ private static boolean sendKillSignal(String signal, List<Integer>
pidList, String tenantCode) {
+ if (pidList == null || pidList.isEmpty()) {
+ log.info("No process needs to be killed.");
+ return true;
+ }
+
+ List<Integer> alivePidList = getAlivePidList(pidList, tenantCode);
+ if (alivePidList.isEmpty()) {
+ log.info("All processes already terminated.");
+ return true;
+ }
+
+ String pids = alivePidList.stream()
+ .map(String::valueOf)
+ .collect(Collectors.joining(" "));
+
try {
+ // 1. Send the kill signal
String killCmd = String.format("kill -s %s %s", signal, pids);
killCmd = OSUtils.getSudoCmd(tenantCode, killCmd);
log.info("Sending {} to process group: {}, command: {}", signal,
pids, killCmd);
OSUtils.exeCmd(killCmd);
- return true;
+ // 2. Wait for the processes to terminate with a timeout-based
polling mechanism
+ // Max wait time
+ long timeoutMillis =
TimeUnit.SECONDS.toMillis(SHELL_KILL_WAIT_TIMEOUT);
+
+ long startTime = System.currentTimeMillis();
+ while (!alivePidList.isEmpty() && (System.currentTimeMillis() -
startTime < timeoutMillis)) {
+ // Remove if process is no longer alive
+ alivePidList.removeIf(pid -> !isProcessAlive(pid, tenantCode));
Review Comment:
[nitpick] The while loop modifies `alivePidList` during iteration with
`removeIf()`. While this is safe for ArrayList, it could be made more explicit
by collecting results separately to avoid potential confusion.
```suggestion
// Collect PIDs that are still alive
List<Integer> stillAlivePids = alivePidList.stream()
.filter(pid -> isProcessAlive(pid, tenantCode))
.collect(Collectors.toList());
alivePidList = stillAlivePids;
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]