azagrebin commented on a change in pull request #7351: [FLINK-11008][State
Backends, Checkpointing]SpeedUp upload state files using multithread
URL: https://github.com/apache/flink/pull/7351#discussion_r246340345
##########
File path:
flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDbStateDataTransfer.java
##########
@@ -61,6 +69,88 @@ static void transferAllStateDataToDirectory(
downloadDataForAllStateHandles(miscFiles, dest,
restoringThreadNum, closeableRegistry);
}
+ public static void uploadFilesToCheckpointFs(
+ @Nonnull Map<StateHandleID, Path> files,
+ int numberOfSnapshottingThreads,
+ CheckpointStreamFactory checkpointStreamFactory,
+ CloseableRegistry closeableRegistry,
+ Map<StateHandleID, StreamStateHandle> hanldes) throws Exception
{
Review comment:
Not sure here, in general I think it is not obvious that function implicitly
returns something (here `hanldes`, typo btw, should be `handles`) in its
arguments. At least I would mention this fact in the function doc comment.
`transferAllStateDataToDirectory` is now public and should also have a doc
comment.
On the other hand, if function returns explicitly `handles`, we have to
reiterate them later where the result is used to add it to the final result map
(e.g. `sstFiles.putAll(uploadFilesToCheckpointFs(..))`). Though, I would not
expect the size of map to be performance critical for reiteration.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services