pnowojski commented on a change in pull request #11687: [FLINK-16536][network][checkpointing] Implement InputChannel state recovery for unaligned checkpoint
URL: https://github.com/apache/flink/pull/11687#discussion_r408336918

##########
File path: flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/consumer/SingleInputGate.java
##########
@@ -243,7 +249,23 @@ void requestPartitions() throws IOException, InterruptedException {
 			}

 			for (InputChannel inputChannel : inputChannels.values()) {
-				inputChannel.requestSubpartition();
+				executor.submit(() -> {
+					try {
+						inputChannel.initializeState(reader);
+					} catch (Throwable t) {
+						inputChannel.setError(t);
+					}
+				});
+			}
+
+			for (InputChannel inputChannel : inputChannels.values()) {
+				executor.submit(() -> {
+					try {
+						inputChannel.requestSubpartition();

Review comment:
Could it be that we start reading from a channel before we actually request the subpartition? For example:
```
Optional<BufferAndAvailability> RemoteInputChannel#getNextBuffer() throws IOException {
	(...)
	checkState(partitionRequestClient != null, "Queried for a buffer before requesting a queue."); // ??
```
or what if `partitionRequestClient` is already non-null, but
```
partitionRequestClient.requestSubpartition(partitionId, subpartitionIndex, this, 0);
```
hasn't completed?
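
The ordering concern in the review comment can be reproduced in isolation. The following is a minimal sketch only, not Flink code: `SketchChannel`, `RequestBeforeReadSketch`, and `recoverThenRequest` are hypothetical names, and the channel is a simplified stand-in that keeps just the null check quoted above. It assumes that one way to order a read after the request is to have the asynchronous submission return a future that the reader waits on.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for an input channel; NOT Flink's RemoteInputChannel.
final class SketchChannel {
    // Stays null until requestSubpartition() has run.
    private volatile Object partitionRequestClient;

    void initializeState() {
        // Placeholder for restoring channel state from a checkpoint.
    }

    void requestSubpartition() {
        partitionRequestClient = new Object();
    }

    // Mirrors the checkState(...) quoted in the review comment.
    Optional<String> getNextBuffer() {
        if (partitionRequestClient == null) {
            throw new IllegalStateException("Queried for a buffer before requesting a queue.");
        }
        return Optional.of("buffer");
    }
}

final class RequestBeforeReadSketch {

    // Runs state recovery and then the subpartition request on the executor.
    // The returned future completes once the request step has run.
    static CompletableFuture<Void> recoverThenRequest(SketchChannel channel, ExecutorService executor) {
        return CompletableFuture
            .runAsync(channel::initializeState, executor)
            .thenRunAsync(channel::requestSubpartition, executor);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            SketchChannel channel = new SketchChannel();

            // Reading before the request has been issued trips the state check.
            boolean failedEarly = false;
            try {
                channel.getNextBuffer();
            } catch (IllegalStateException e) {
                failedEarly = true;
            }
            System.out.println("read before request failed: " + failedEarly);

            // Waiting on the future orders the read after the request.
            recoverThenRequest(channel, executor).get(5, TimeUnit.SECONDS);
            System.out.println("read after request: " + channel.getNextBuffer().isPresent());
        } finally {
            executor.shutdown();
        }
    }
}
```

This sketch covers only the first case raised in the comment, where the client is still null when a read arrives. The second case, where the client is set but the remote request has not yet completed, would need an explicit completion signal from the request itself, which this simplified channel does not model.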