yihua commented on a change in pull request #3734:
URL: https://github.com/apache/hudi/pull/3734#discussion_r719653450
##########
File path:
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/clean/CleanActionExecutor.java
##########
@@ -68,13 +78,85 @@ protected static Boolean deleteFileAndGetResult(FileSystem fs, String deletePath
}
}
+ static Stream<Pair<String, PartitionCleanStat>> deleteFilesFunc(Iterator<Pair<String, CleanFileInfo>> cleanFileInfo, HoodieTable table) {
+ Map<String, PartitionCleanStat> partitionCleanStatMap = new HashMap<>();
+ FileSystem fs = table.getMetaClient().getFs();
+
+ cleanFileInfo.forEachRemaining(partitionDelFileTuple -> {
+ String partitionPath = partitionDelFileTuple.getLeft();
+ Path deletePath = new Path(partitionDelFileTuple.getRight().getFilePath());
+ String deletePathStr = deletePath.toString();
+ Boolean deletedFileResult = null;
+ try {
+ deletedFileResult = deleteFileAndGetResult(fs, deletePathStr);
+
+ } catch (IOException e) {
+ LOG.error("Delete file failed");
Review comment:
Yes, this was copied from the engine-specific clean action executors.
I have added the file name to the log message.
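For illustration only, a minimal sketch of what logging the file name on a failed delete could look like. This is not the code in the PR: `Deleter`, `tryDelete`, and the in-memory `LOG_LINES` list are hypothetical stand-ins for `deleteFileAndGetResult`, the Hadoop `FileSystem`, and the real logger.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class DeleteLogSketch {
  /** Hypothetical stand-in for deleteFileAndGetResult(fs, deletePathStr). */
  public interface Deleter {
    boolean delete(String deletePathStr) throws IOException;
  }

  // Hypothetical in-memory log so the sketch needs no logging library.
  static final List<String> LOG_LINES = new ArrayList<>();

  // Returns the delete result, or null if the delete threw an IOException.
  static Boolean tryDelete(Deleter deleter, String deletePathStr) {
    try {
      return deleter.delete(deletePathStr);
    } catch (IOException e) {
      // Include the path so a failure can be traced to a specific file.
      LOG_LINES.add("Delete file failed: " + deletePathStr + " (" + e.getMessage() + ")");
      return null;
    }
  }
}
```

With the path in the message, a failed clean can be traced to a specific file instead of a generic "Delete file failed" line.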
##########
File path:
hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/client/common/HoodieFlinkEngineContext.java
##########
@@ -86,6 +90,17 @@ public RuntimeContext getRuntimeContext() {
.collect(Collectors.toList());
}
+ @Override
+ public <I, K, V> Stream<ImmutablePair<K, V>> mapPartitionsToPairAndReduceByKey(
+ Stream<I> data, SerializablePairFlatMapFunction<Iterator<I>, K, V> flatMapToPairFunc,
+ SerializableBiFunction<V, V, V> reduceFunc, int parallelism) {
+ return throwingFlatMapToPairWrapper(flatMapToPairFunc).apply(data.parallel().iterator())
Review comment:
We cannot. The `parallelism` argument was introduced to keep the signature compatible with Spark.
None of the other Flink transformations use the `parallelism` argument either.
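To make the point concrete, here is a rough sketch of a local, stream-based version of this method that accepts `parallelism` but never reads it. It assumes plain `java.util.function` types and `Map.Entry` in place of Hudi's `SerializablePairFlatMapFunction`, `SerializableBiFunction`, and `ImmutablePair`. `LocalEngineSketch` and `countByPartition` are hypothetical names used only for this example.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LocalEngineSketch {
  // parallelism is kept only so the signature lines up with a Spark-style API;
  // this local implementation never reads it.
  static <I, K, V> Stream<Map.Entry<K, V>> mapPartitionsToPairAndReduceByKey(
      Stream<I> data,
      Function<Iterator<I>, Stream<Map.Entry<K, V>>> flatMapToPairFunc,
      BinaryOperator<V> reduceFunc,
      int parallelism) {
    // Treat the whole input as one partition, then merge values that share a key.
    Map<K, V> reduced = flatMapToPairFunc.apply(data.iterator())
        .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue, reduceFunc));
    return reduced.entrySet().stream();
  }

  // Hypothetical usage: count how many entries fall under each partition path.
  static Map<String, Integer> countByPartition(List<String> partitionPaths, int parallelism) {
    return LocalEngineSketch.<String, String, Integer>mapPartitionsToPairAndReduceByKey(
            partitionPaths.stream(),
            it -> {
              List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
              it.forEachRemaining(p -> pairs.add(new SimpleImmutableEntry<>(p, 1)));
              return pairs.stream();
            },
            Integer::sum,
            parallelism)
        .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
  }
}
```

In this sketch, `countByPartition(paths, 1)` and `countByPartition(paths, 8)` return the same map, which is the behavior described above: the argument is part of the shared interface, not of the local execution.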