Copilot commented on code in PR #7150:
URL: https://github.com/apache/kyuubi/pull/7150#discussion_r2590228845


##########
extensions/spark/kyuubi-spark-shutdown-watchdog/src/main/java/org/apache/spark/kyuubi/shutdown/watchdog/ShutdownWatchdog.java:
##########
@@ -0,0 +1,151 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.kyuubi.shutdown.watchdog;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.Objects;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicReference;
+import java.util.function.IntConsumer;
+import org.apache.spark.SparkConf;
+import org.apache.spark.util.SparkExitCode;
+import org.slf4j.Logger;
+
+/**
+ * Timer-backed watchdog that forces the Spark driver to exit if a graceful shutdown stalls.
+ *
+ * <p>Implementation is deliberately pure Java so the plugin has no Scala runtime dependency.
+ */
+public final class ShutdownWatchdog {
+
+  // Align with Spark's standard exit codes for consistency with Spark driver failures.
+  static final int EXIT_CODE_FORCED_TERMINATION = SparkExitCode.UNCAUGHT_EXCEPTION();
+  static final int EXIT_CODE_WATCHDOG_FAILURE = SparkExitCode.UNCAUGHT_EXCEPTION_TWICE();
+
+  private static final AtomicReference<Thread> WATCHDOG_THREAD_REF = new AtomicReference<>();
+  private static volatile IntConsumer exitFn = System::exit;
+
+  private ShutdownWatchdog() {}
+
+  static void setExitFn(IntConsumer fn) {
+    exitFn = fn != null ? fn : System::exit;
+  }
+
+  static void startIfNeeded(SparkConf sparkConf, Logger logger) {
+    Objects.requireNonNull(sparkConf, "sparkConf");
+    Objects.requireNonNull(logger, "logger");
+
+    if (!SparkShutdownWatchdogConf.isEnabled(sparkConf)) {
+      logger.info(
+          "Shutdown Watchdog is disabled via {}=false.",
+          SparkShutdownWatchdogConf.SHUTDOWN_WATCHDOG_ENABLED_KEY);
+      return;
+    }
+
+    final long timeoutMillis = SparkShutdownWatchdogConf.getTimeoutMillis(sparkConf);
+    if (timeoutMillis <= 0L) {
+      logger.info("Shutdown Watchdog is disabled because timeout <= 0.");
+      return;
+    }
+
+    Thread existing = WATCHDOG_THREAD_REF.get();
+    if (existing != null && existing.isAlive()) {
+      logger.warn("Shutdown Watchdog is already running, ignoring duplicate start request.");
+      return;
+    }
+
+    final Thread watchdogThread =
+        new Thread(() -> runWatchdogLoop(timeoutMillis, logger), "spark-shutdown-watchdog");

Review Comment:
   [nitpick] The thread name 'spark-shutdown-watchdog' is inconsistent with the 
actual watchdog thread name 'shutdown-watchdog' mentioned in the documentation 
and test output. Consider renaming to 'shutdown-watchdog' for consistency.
   ```suggestion
        new Thread(() -> runWatchdogLoop(timeoutMillis, logger), "shutdown-watchdog");
   ```
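
   The injectable `exitFn` shown in the diff above (`setExitFn`, defaulting to `System::exit`) is what makes the forced-exit path testable without killing the JVM. Below is a minimal, self-contained sketch of that pattern together with a simplified single-start guard; note the real class checks `isAlive()` on the stored thread rather than a bare compare-and-set, and the class and method names here are illustrative, not the actual Kyuubi code.

   ```java
   import java.util.concurrent.atomic.AtomicReference;
   import java.util.function.IntConsumer;

   // Illustrative sketch of two patterns the watchdog relies on: an injectable
   // exit function so tests can observe the forced-termination path without
   // terminating the JVM, and an AtomicReference guard so only one watchdog
   // thread is ever registered.
   public final class WatchdogSketch {
     private static final AtomicReference<Thread> THREAD_REF = new AtomicReference<>();
     private static volatile IntConsumer exitFn = System::exit;

     static void setExitFn(IntConsumer fn) {
       exitFn = fn != null ? fn : System::exit;
     }

     // Returns true only for the caller that actually installed the watchdog.
     static boolean startIfNeeded(Runnable loop) {
       Thread t = new Thread(loop, "shutdown-watchdog");
       t.setDaemon(true);
       if (!THREAD_REF.compareAndSet(null, t)) {
         return false; // a watchdog is already registered
       }
       t.start();
       return true;
     }

     public static void main(String[] args) throws Exception {
       // Capture the exit code instead of terminating the JVM.
       AtomicReference<Integer> observed = new AtomicReference<>();
       setExitFn(observed::set);

       boolean first = startIfNeeded(() -> exitFn.accept(57));
       boolean second = startIfNeeded(() -> exitFn.accept(99));
       THREAD_REF.get().join();

       System.out.println(first + " " + second + " " + observed.get());
     }
   }
   ```

   Swapping `System::exit` for a capturing lambda is what lets unit tests assert on the exit code the watchdog would have used.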



##########
extensions/spark/kyuubi-spark-shutdown-watchdog/src/main/java/org/apache/spark/kyuubi/shutdown/watchdog/SparkShutdownWatchdogConf.java:
##########
@@ -0,0 +1,44 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.kyuubi.shutdown.watchdog;
+
+import org.apache.spark.SparkConf;
+
+/**
+ * Spark configuration helpers for the shutdown watchdog plugin.
+ *
+ * <p>Implemented in Java to avoid pulling Scala classes into the plugin artifact.
+ */
+final class SparkShutdownWatchdogConf {
+
+  static final String SHUTDOWN_WATCHDOG_ENABLED_KEY = "spark.kyuubi.shutdown.watchdog.enabled";
+  static final boolean SHUTDOWN_WATCHDOG_ENABLED_DEFAULT = true;
+
+  static final String SHUTDOWN_WATCHDOG_TIMEOUT_KEY = "spark.kyuubi.shutdown.watchdog.timeout";
+  private static final String SHUTDOWN_WATCHDOG_TIMEOUT_FALLBACK = "0ms";

Review Comment:
   The constant name 'SHUTDOWN_WATCHDOG_TIMEOUT_FALLBACK' is misleading. This 
is actually the default value, not a fallback. Consider renaming to 
'SHUTDOWN_WATCHDOG_TIMEOUT_DEFAULT' to match the naming pattern of 
'SHUTDOWN_WATCHDOG_ENABLED_DEFAULT'.
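
   To illustrate the `*_KEY` / `*_DEFAULT` naming pattern the rename would give, here is a hypothetical stand-in for the conf helper that uses a plain `Map` instead of `SparkConf`. The real helper presumably delegates to Spark's time-string parsing (e.g. `SparkConf#getTimeAsMs`); this sketch only handles the `ms`/`s`/`m` suffixes used in the examples, and the class name is invented.

   ```java
   import java.util.Map;

   // Hypothetical sketch: a Map stands in for SparkConf, and the tiny parser
   // below stands in for Spark's time-string handling. The "0ms" default
   // disables the watchdog, matching the "timeout <= 0" check in the plugin.
   public final class ConfSketch {
     static final String TIMEOUT_KEY = "spark.kyuubi.shutdown.watchdog.timeout";
     static final String TIMEOUT_DEFAULT = "0ms"; // default, not a fallback

     static long timeoutMillis(Map<String, String> conf) {
       String v = conf.getOrDefault(TIMEOUT_KEY, TIMEOUT_DEFAULT).trim();
       if (v.endsWith("ms")) return Long.parseLong(v.substring(0, v.length() - 2));
       if (v.endsWith("s")) return Long.parseLong(v.substring(0, v.length() - 1)) * 1000L;
       if (v.endsWith("m")) return Long.parseLong(v.substring(0, v.length() - 1)) * 60_000L;
       return Long.parseLong(v); // bare number: treated as milliseconds
     }

     public static void main(String[] args) {
       System.out.println(timeoutMillis(Map.of()));                      // 0
       System.out.println(timeoutMillis(Map.of(TIMEOUT_KEY, "1m")));     // 60000
       System.out.println(timeoutMillis(Map.of(TIMEOUT_KEY, "60000"))); // 60000
     }
   }
   ```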



##########
docs/extensions/engines/spark/shutdown-watchdog.md:
##########
@@ -0,0 +1,89 @@
+<!--
+- Licensed to the Apache Software Foundation (ASF) under one or more
+- contributor license agreements.  See the NOTICE file distributed with
+- this work for additional information regarding copyright ownership.
+- The ASF licenses this file to You under the Apache License, Version 2.0
+- (the "License"); you may not use this file except in compliance with
+- the License.  You may obtain a copy of the License at
+-
+-   http://www.apache.org/licenses/LICENSE-2.0
+-
+- Unless required by applicable law or agreed to in writing, software
+- distributed under the License is distributed on an "AS IS" BASIS,
+- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+- See the License for the specific language governing permissions and
+- limitations under the License.
+-->
+
+# Shutdown Watchdog Plugin
+
+The shutdown watchdog prevents zombie Spark SQL engines by monitoring graceful shutdown.
+If the driver does not finish within the configured timeout, the plugin produces a JVM thread
+dump for diagnostics and forcefully terminates the process to unblock resource reclamation.
+
+## Build with Apache Maven
+
+The plugin lives in the module `extensions/spark/kyuubi-spark-shutdown-watchdog` and can be built via:
+
+```shell
+build/mvn clean package -DskipTests \
+  -pl extensions/spark/kyuubi-spark-shutdown-watchdog -am
+```
+
+The module still publishes artifacts with the standard Scala suffix
+(`kyuubi-spark-shutdown-watchdog_2.12`, `..._2.13`, etc.) so that it aligns with
+the rest of the project. Maven expands `${scala.binary.version}` automatically,
+so you can run the command above without worrying about Scala version specifics.
+Because the implementation is pure Java, there are no Scala runtime
+dependencies—building by module path is enough.
+
+After the build succeeds the jar is located at:
+`./extensions/spark/kyuubi-spark-shutdown-watchdog/target/kyuubi-spark-shutdown-watchdog_${scala.binary.version}-${project.version}.jar`
+
+## Installing
+
+Place the jar on the Spark driver classpath, for example by:
+
+- Copying it to `$SPARK_HOME/jars`, or
+- Pointing Spark to it through `spark.jars`
+
+## Enabling the plugin
+
+Add the plugin class to `spark.plugins` when launching the Spark SQL engine:
+
+```properties
+spark.plugins=org.apache.spark.kyuubi.shutdown.watchdog.SparkShutdownWatchdogPlugin
+```
+
+Configure the timeout directly through Spark (see also the general configuration
+table in `docs/configuration/settings.md`):
+
+```properties
+spark.kyuubi.shutdown.watchdog.timeout=60000

Review Comment:
   [nitpick] The example sets the timeout as a bare millisecond value (60000). Consider using a time unit string like '60s' or '1m' for better readability, matching the pattern used in the TESTING.md examples.
   ```suggestion
   spark.kyuubi.shutdown.watchdog.timeout=1m
   ```
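
   Putting the documented pieces together, a launch command might look like the following sketch. The jar path, version placeholder, and application jar are illustrative only; they are not taken from the PR.

   ```shell
   # Illustrative only: paths and the <version> placeholder must be adjusted.
   # Assumes the plugin jar was built as described in the docs above.
   spark-submit \
     --jars /path/to/kyuubi-spark-shutdown-watchdog_2.12-<version>.jar \
     --conf spark.plugins=org.apache.spark.kyuubi.shutdown.watchdog.SparkShutdownWatchdogPlugin \
     --conf spark.kyuubi.shutdown.watchdog.timeout=1m \
     your-app.jar
   ```

   The same `--conf` pairs can instead be placed in `spark-defaults.conf` if the watchdog should apply to every engine launch.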



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

