nsivabalan commented on code in PR #17550:
URL: https://github.com/apache/hudi/pull/17550#discussion_r2733976768
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/clean/CleanPlanner.java:
##########
@@ -254,11 +254,39 @@ private Stream<String> getPartitionsForInstants(HoodieInstant instant) {
*/
private List<String> getPartitionPathsForFullCleaning() {
// Go to brute force mode of scanning all partitions
+ List<String> allPartitionPaths;
try {
- return hoodieTable.getTableMetadata().getAllPartitionPaths();
+ allPartitionPaths = hoodieTable.getTableMetadata().getAllPartitionPaths();
} catch (IOException ioe) {
throw new HoodieIOException("Fetching all partitions failed ", ioe);
}
+
+ String partitionSelected = config.getCleanerPartitionFilterSelected();
+ String partitionRegex = config.getCleanerPartitionFilterRegex();
+
+ // Return early if no partition filter is configured
+ if (StringUtils.isNullOrEmpty(partitionSelected) && StringUtils.isNullOrEmpty(partitionRegex)) {
+ return allPartitionPaths;
+ }
+
+ // Partition filter cannot be used with incremental cleaning mode
+ if (config.incrementalCleanerModeEnabled()) {
+ throw new IllegalArgumentException("Incremental Cleaning mode is enabled. Partition filter for clean cannot be used.");
+ }
+
+ // Static list of partitions takes precedence over regex pattern
+ if (!StringUtils.isNullOrEmpty(partitionSelected)) {
+ List<String> selectedPartitions = Arrays.asList(partitionSelected.split(","));
+ LOG.info("Restricting partitions to clean using selected list: {}", selectedPartitions);
Review Comment:
Do you think we should actually log the partitions that remain after the
filter is applied (i.e. post-filtering)? For static filtering it may not matter
much, but for regex filtering it would help validate that the regex is working
as expected.
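
Something along these lines, as a rough sketch only. `PartitionFilterSketch` and `filterByRegex` are hypothetical names, not code from this PR. I am assuming the regex must match the whole partition path, and I use `System.out` in place of the class's `LOG` so the snippet stands alone:

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class PartitionFilterSketch {

  // Hypothetical helper (not part of CleanPlanner): keeps only the partition
  // paths that fully match the given regex. Full-match semantics are an assumption.
  static List<String> filterByRegex(List<String> allPartitionPaths, String partitionRegex) {
    Pattern pattern = Pattern.compile(partitionRegex);
    return allPartitionPaths.stream()
        .filter(path -> pattern.matcher(path).matches())
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // Illustrative partition paths, made up for this example.
    List<String> all = Arrays.asList("2024/01/01", "2024/01/02", "2023/12/31");
    String regex = "2024/01/.*";

    List<String> filtered = filterByRegex(all, regex);

    // Log the post-filter result so the effect of the regex is visible.
    // In CleanPlanner this would presumably go through LOG.info instead.
    System.out.println("Partitions to clean after applying regex '" + regex + "': "
        + filtered + " (" + filtered.size() + " of " + all.size() + " partitions)");
  }
}
```

For tables with a very large number of partitions, logging only the count at info level and the full list at debug level might be the safer choice.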