Chesnay Schepler created FLINK-29330:
----------------------------------------

             Summary: Provide better logs of MiniCluster shutdown procedure
                 Key: FLINK-29330
                 URL: https://issues.apache.org/jira/browse/FLINK-29330
             Project: Flink
          Issue Type: Technical Debt
          Components: Tests
            Reporter: Chesnay Schepler
            Assignee: Chesnay Schepler
             Fix For: 1.17.0


I recently ran into an issue where the shutdown of a MiniCluster timed out. The 
logs weren't helpful at all, and I had to go in and check every asynchronously 
closed component to see whether _that_ component was the cause.

The main issues were that various components didn't log anything at all, or that 
when they did, it wasn't clear who owned that component.

I'd like to add a util that makes it easier for us to log the start/stop of a 
shutdown procedure,
{code:java}
public class ShutdownLog {
    /**
     * Logs the beginning and end of the shutdown procedure for the given component.
     *
     * <p>This method accepts a {@link Supplier} instead of a {@link CompletableFuture} because the
     * latter usually implies that the shutdown has already begun.
     *
     * @param log Logger of owning component
     * @param component component that will be shut down
     * @param shutdownTrigger component shutdown trigger
     * @return termination future of the component
     */
    public static <C> CompletableFuture<C> logShutdown(
            Logger log, String component, Supplier<CompletableFuture<C>> shutdownTrigger) {
        log.debug("Starting shutdown of {}.", component);
        return FutureUtils.logCompletion(log, "shutdown of " + component, shutdownTrigger.get());
    }
}

public class FutureUtils {
    public static <T> CompletableFuture<T> logCompletion(
            Logger log, String action, CompletableFuture<T> future) {
        future.handle(
                (t, throwable) -> {
                    if (throwable == null) {
                        log.debug("Completed {}.", action);
                    } else {
                        log.debug("Failed {}.", action, throwable);
                    }
                    return null;
                });
        return future;
    }
...
{code}
and extend the AutoCloseableAsync interface for an easy opt-in and customized 
logging:
{code:java}
    default CompletableFuture<Void> closeAsync(Logger log) {
        return ShutdownLog.logShutdown(log, getClass().getSimpleName(), this::closeAsync);
    }
{code}
MiniCluster example usages:
{code:java}
-terminationFutures.add(dispatcherResourceManagerComponent.closeAsync())
+terminationFutures.add(dispatcherResourceManagerComponent.closeAsync(LOG))
{code}
{code:java}
-return ExecutorUtils.nonBlockingShutdown(
-        executorShutdownTimeoutMillis, TimeUnit.MILLISECONDS, ioExecutor);
+return ShutdownLog.logShutdown(
+        LOG,
+        "ioExecutor",
+        () ->
+                ExecutorUtils.nonBlockingShutdown(
+                        executorShutdownTimeoutMillis,
+                        TimeUnit.MILLISECONDS,
+                        ioExecutor));
{code}
[~mapohl] I'm interested in what you think about this.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
