pnowojski commented on a change in pull request #15322:
URL: https://github.com/apache/flink/pull/15322#discussion_r670486540
##########
File path: flink-dstl/flink-dstl-dfs/src/main/java/org/apache/flink/changelog/fs/RetryingExecutor.java
##########
@@ -0,0 +1,145 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.changelog.fs;
+
+import org.apache.flink.util.function.RunnableWithException;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.Optional;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.ScheduledFuture;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+import java.util.concurrent.atomic.AtomicBoolean;
+
+import static java.util.concurrent.TimeUnit.MILLISECONDS;
+
+/**
+ * A {@link RunnableWithException} executor that schedules a next attempt upon timeout based on
+ * {@link RetryPolicy}. Aimed to curb tail latencies
+ */
+class RetryingExecutor implements AutoCloseable {
+    private static final Logger LOG = LoggerFactory.getLogger(RetryingExecutor.class);
+
+    private final ScheduledExecutorService scheduler;
+
+    RetryingExecutor(int nThreads) {
+        this(SchedulerFactory.create(nThreads, "ChangelogRetryScheduler", LOG));
+    }
+
+    RetryingExecutor(ScheduledExecutorService scheduler) {
+        this.scheduler = scheduler;
+    }
+
+    void execute(RetryPolicy retryPolicy, RunnableWithException action) {
+        LOG.debug("execute with retryPolicy: {}", retryPolicy);
+        RetriableTask task = new RetriableTask(action, retryPolicy, scheduler);
+        scheduler.submit(task);
+    }
+
+    @Override
+    public void close() throws Exception {
+        LOG.debug("close");
+        scheduler.shutdownNow();
+        if (!scheduler.awaitTermination(1, TimeUnit.SECONDS)) {
+            LOG.warn("Unable to cleanly shutdown executorService in 1s");
+        }
+    }
+
+    private static final class RetriableTask implements Runnable {
+        private final RunnableWithException runnable;
+        private final ScheduledExecutorService executorService;
+        private final int current;
+        private final RetryPolicy retryPolicy;
+        private final AtomicBoolean actionCompleted;
+        private final AtomicBoolean attemptCompleted = new AtomicBoolean(false);

Review comment:
   Ok, I'm not entirely convinced, but we can try it this way.
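The hunk above is cut off at the commented line, so the retry mechanics themselves are not visible here. As a rough illustration of the timeout-driven retry the class javadoc describes, the sketch below shows one attempt racing against a timeout guard, coordinated by a single completion flag; names such as `uploadAction` and the exact flag handling are assumptions for the sketch, not the PR's implementation.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class RetryOnTimeoutSketch {
    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(2);
        long attemptTimeoutMs = 500; // in the real class this would come from the RetryPolicy

        // One flag per attempt: whichever path flips it first (normal completion or the
        // timeout guard) decides whether a new attempt gets scheduled.
        AtomicBoolean attemptCompleted = new AtomicBoolean(false);

        // Hypothetical IO action standing in for a changelog upload.
        Runnable uploadAction =
                () -> {
                    try {
                        Thread.sleep(1_000); // simulate a slow upload
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                };

        // First attempt.
        scheduler.submit(
                () -> {
                    uploadAction.run();
                    if (attemptCompleted.compareAndSet(false, true)) {
                        System.out.println("first attempt finished in time");
                    }
                });

        // Timeout guard: if the attempt is still running when the timeout fires,
        // start the next attempt instead of waiting for the slow one (curbs tail latency).
        scheduler.schedule(
                () -> {
                    if (attemptCompleted.compareAndSet(false, true)) {
                        System.out.println("timeout hit, scheduling the next attempt");
                        scheduler.submit(uploadAction);
                    }
                },
                attemptTimeoutMs,
                TimeUnit.MILLISECONDS);

        // Let the demo exit: stop the scheduler after both paths had a chance to run.
        scheduler.schedule(scheduler::shutdown, 2, TimeUnit.SECONDS);
    }
}
```

The two completion flags in the real class presumably serve the same kind of coordination: only one of the racing paths (normal completion vs. timeout) should decide what happens next.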
" + "If the cleanup doesn't keep up then task might be back-pressured."); + + public static final ConfigOption<Integer> NUM_UPLOAD_THREADS = + ConfigOptions.key("dstl.dfs.upload.num-threads") + .intType() + .defaultValue(5) + .withDescription("Number of threads to use for upload."); Review comment: > I think the assumption that async checkpoint part uploads are always using the same shared pool is incorrect. For example RocksDB uses separate threads to upload, and the IO executor is only used to wait for the result. This sounds like RocksDB issue that it maybe should be re-using the same common thread pool as well? I'm really afraid nobody will be fine tuning those pools. If there are problems that boils down to too few IO threads, a solution is to just bump the thread numbers, and I don't believe anyone will want to bother how much he should bump individual thread pools. Just bump everything that corresponds with the same IO system. I can see two reasons against using the same thread pool: 1. different actions are waiting for one another in a blocking fashion, that can not be dis-entangled like by for example splitting into two actions 2. having actions with different priority In this particular case: 1. I think doesn't apply here. It would apply to the RocksDB Incremental using the same thread pool as AsyncCheckpointRunnable, but not here I think 2. In that case it should be solved probably in a better way by still sharing thread pool, but with more sophisticated actions scheduler. Besides I don't see this being an issue in this case either? ########## File path: flink-dstl/flink-dstl-dfs/src/main/java/org/apache/flink/changelog/fs/RetryingExecutor.java ########## @@ -0,0 +1,145 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.changelog.fs; + +import org.apache.flink.util.function.RunnableWithException; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.Optional; +import java.util.concurrent.ScheduledExecutorService; +import java.util.concurrent.ScheduledFuture; +import java.util.concurrent.TimeUnit; +import java.util.concurrent.TimeoutException; +import java.util.concurrent.atomic.AtomicBoolean; + +import static java.util.concurrent.TimeUnit.MILLISECONDS; + +/** + * A {@link RunnableWithException} executor that schedules a next attempt upon timeout based on + * {@link RetryPolicy}. 
##########
File path: flink-dstl/flink-dstl-dfs/src/main/java/org/apache/flink/changelog/fs/RetryingExecutor.java
##########
@@ -0,0 +1,145 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.changelog.fs;
+
+import org.apache.flink.util.function.RunnableWithException;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.Optional;
+import java.util.concurrent.ScheduledExecutorService;
+import java.util.concurrent.ScheduledFuture;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
+import java.util.concurrent.atomic.AtomicBoolean;
+
+import static java.util.concurrent.TimeUnit.MILLISECONDS;
+
+/**
+ * A {@link RunnableWithException} executor that schedules a next attempt upon timeout based on
+ * {@link RetryPolicy}. Aimed to curb tail latencies
+ */
+class RetryingExecutor implements AutoCloseable {
+    private static final Logger LOG = LoggerFactory.getLogger(RetryingExecutor.class);
+
+    private final ScheduledExecutorService scheduler;
+
+    RetryingExecutor(int nThreads) {
+        this(SchedulerFactory.create(nThreads, "ChangelogRetryScheduler", LOG));
+    }
+
+    RetryingExecutor(ScheduledExecutorService scheduler) {
+        this.scheduler = scheduler;
+    }
+
+    void execute(RetryPolicy retryPolicy, RunnableWithException action) {
+        LOG.debug("execute with retryPolicy: {}", retryPolicy);
+        RetriableTask task = new RetriableTask(action, retryPolicy, scheduler);
+        scheduler.submit(task);
+    }
+
+    @Override
+    public void close() throws Exception {
+        LOG.debug("close");
+        scheduler.shutdownNow();
+        if (!scheduler.awaitTermination(1, TimeUnit.SECONDS)) {
+            LOG.warn("Unable to cleanly shutdown executorService in 1s");
+        }
+    }
+
+    private static final class RetriableTask implements Runnable {
+        private final RunnableWithException runnable;
+        private final ScheduledExecutorService executorService;
+        private final int current;
+        private final RetryPolicy retryPolicy;
+        private final AtomicBoolean actionCompleted;
+        private final AtomicBoolean attemptCompleted = new AtomicBoolean(false);

Review comment:
   Ok, I'm not entirely convinced, but I can see your point and let's try it this way.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
