Chesnay Schepler created FLINK-29330:
----------------------------------------
Summary: Provide better logs of MiniCluster shutdown procedure
Key: FLINK-29330
URL: https://issues.apache.org/jira/browse/FLINK-29330
Project: Flink
Issue Type: Technical Debt
Components: Tests
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
Fix For: 1.17.0
I recently ran into an issue where the shutdown of a MiniCluster timed out. The
logs weren't helpful at all, and I had to go in and check every asynchronously
closed component to see whether _that_ component was the cause.
The main issues were that various components didn't log anything at all, or that
when they did, it wasn't clear who owned the component.
I'd like to add a util that makes it easier for us to log the start/stop of a
shutdown procedure,
{code:java}
public class ShutdownLog {
    /**
     * Logs the beginning and end of the shutdown procedure for the given component.
     *
     * <p>This method accepts a {@link Supplier} instead of a {@link CompletableFuture} because the
     * latter usually implies that the shutdown has already begun.
     *
     * @param log Logger of owning component
     * @param component component that will be shut down
     * @param shutdownTrigger component shutdown trigger
     * @return termination future of the component
     */
    public static <C> CompletableFuture<C> logShutdown(
            Logger log, String component, Supplier<CompletableFuture<C>> shutdownTrigger) {
        log.debug("Starting shutdown of {}.", component);
        return FutureUtils.logCompletion(log, "shutdown of " + component, shutdownTrigger.get());
    }
}
public class FutureUtils {
    public static <T> CompletableFuture<T> logCompletion(
            Logger log, String action, CompletableFuture<T> future) {
        future.handle(
                (t, throwable) -> {
                    if (throwable == null) {
                        log.debug("Completed {}.", action);
                    } else {
                        log.debug("Failed {}.", action, throwable);
                    }
                    return null;
                });
        return future;
    }
...
{code}
and extend the AutoCloseableAsync interface for an easy opt-in and customized
logging:
{code:java}
default CompletableFuture<Void> closeAsync(Logger log) {
    return ShutdownLog.logShutdown(log, getClass().getSimpleName(), this::closeAsync);
}
{code}
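For illustration, here is a minimal standalone sketch of how such an opt-in default method could behave. It is not Flink code: ClosableSketch is a hypothetical stand-in for AutoCloseableAsync, Dispatcher is a toy component, and a List<String> replaces the SLF4J Logger so the recorded messages can be inspected.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Standalone sketch. ClosableSketch only mirrors the shape of AutoCloseableAsync
// and is hypothetical; a List<String> stands in for the Logger.
final class CloseAsyncLogSketch {

    interface ClosableSketch {
        CompletableFuture<Void> closeAsync();

        /** Opt-in overload: same shutdown, bracketed by messages in the caller's log. */
        default CompletableFuture<Void> closeAsync(List<String> log) {
            String component = getClass().getSimpleName();
            log.add("Starting shutdown of " + component + ".");
            CompletableFuture<Void> future = closeAsync();
            // Record the outcome once the termination future completes.
            future.handle(
                    (ignored, throwable) -> {
                        log.add((throwable == null ? "Completed" : "Failed")
                                + " shutdown of " + component + ".");
                        return null;
                    });
            return future;
        }
    }

    /** Toy component whose shutdown completes immediately. */
    static final class Dispatcher implements ClosableSketch {
        @Override
        public CompletableFuture<Void> closeAsync() {
            return CompletableFuture.completedFuture(null);
        }
    }

    /** Closes one component through the logging overload and returns the recorded messages. */
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        new Dispatcher().closeAsync(log);
        return log;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

One caveat with deriving the name from getClass().getSimpleName(): it returns an empty string for anonymous classes, so in those cases passing an explicit component name to the logging helper is likely the better option.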
MiniCluster example usages:
{code:java}
-terminationFutures.add(dispatcherResourceManagerComponent.closeAsync())
+terminationFutures.add(dispatcherResourceManagerComponent.closeAsync(LOG))
{code}
{code:java}
-return ExecutorUtils.nonBlockingShutdown(
- executorShutdownTimeoutMillis, TimeUnit.MILLISECONDS, ioExecutor);
+return ShutdownLog.logShutdown(
+ LOG,
+ "ioExecutor",
+ () ->
+ ExecutorUtils.nonBlockingShutdown(
+ executorShutdownTimeoutMillis,
+ TimeUnit.MILLISECONDS,
+ ioExecutor));
{code}
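The reason for wrapping the call in a Supplier can be seen in a small standalone sketch: the "Starting" message is recorded before the trigger runs, which would not be possible if an already-created future were passed in. The sketch below is an assumption-laden simplification, not the proposed Flink code: a List<String> replaces the Logger, the "(trigger invoked)" marker is added only for the demonstration, and the component names are just labels.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

// Standalone sketch: a List<String> stands in for the SLF4J Logger used in Flink.
final class ShutdownLogSketch {

    /** Records "Starting ..." before invoking the trigger, then the outcome on completion. */
    static <C> CompletableFuture<C> logShutdown(
            List<String> log, String component, Supplier<CompletableFuture<C>> shutdownTrigger) {
        log.add("Starting shutdown of " + component + ".");
        CompletableFuture<C> future = shutdownTrigger.get();
        future.handle(
                (result, throwable) -> {
                    log.add((throwable == null ? "Completed" : "Failed")
                            + " shutdown of " + component + ".");
                    return null;
                });
        return future;
    }

    /** Runs one successful and one failing shutdown and returns the recorded messages. */
    static List<String> demo() {
        List<String> log = new ArrayList<>();

        // Successful shutdown; the marker shows when the trigger actually runs.
        Supplier<CompletableFuture<Void>> trigger =
                () -> {
                    log.add("(trigger invoked)");
                    return CompletableFuture.completedFuture(null);
                };
        logShutdown(log, "ioExecutor", trigger);

        // Failing shutdown: the termination future completes exceptionally.
        CompletableFuture<Void> failed = new CompletableFuture<>();
        failed.completeExceptionally(new IllegalStateException("timed out"));
        logShutdown(log, "rpcService", () -> failed);

        return log;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Because both futures in the demo are already complete when handle is registered, the outcome messages are recorded synchronously and the order is deterministic. With a future that completes later, the outcome message would appear whenever that completion happens.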
[~mapohl] I'm interested in what you think about this.