nsivabalan commented on a change in pull request #5213:
URL: https://github.com/apache/hudi/pull/5213#discussion_r841092654
##########
File path:
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/bloom/HoodieBloomIndex.java
##########
@@ -138,6 +134,28 @@ public HoodieBloomIndex(HoodieWriteConfig config,
BaseHoodieBloomIndexHelper blo
partitionRecordKeyPairs, fileComparisonPairs, partitionToFileInfo,
recordsPerPartition);
}
+ private List<Pair<String, BloomIndexFileInfo>> getBloomIndexFileInfoForPartitions(HoodieEngineContext context,
+     HoodieTable hoodieTable,
+     List<String> affectedPartitionPathList) {
+ List<Pair<String, BloomIndexFileInfo>> fileInfoList = new ArrayList<>();
+
+ if (config.getBloomIndexPruneByRanges()) {
+ // load column ranges from metadata index if column stats index is enabled and column_stats metadata partition is available
+ if (config.isMetadataColumnStatsIndexEnabled()
Review comment:
Just so we are on the same page: we will call out in our release notes the
right way to disable specific partitions in the MDT.
From what we discussed offline:
Disabling any single MDT partition has to be handled the same way as disabling
the metadata table completely. We have to delete the directory and update the
table config for sure.
That way, we keep the table config in a pristine state. If it says a certain
partition is good to use, that partition should be in a consumable state. If
it is not, the pipeline should fail (e.g., if someone manually deletes one of
the MDT partitions).
If the table config says a partition is not in a usable state, no writer or
reader should ever use it.
So, coming back to this patch: relying on completedMetadataPartitions alone
should be good enough, in my opinion. But we can let this proceed, since it is
just an extra guard. In general, though, let's try to keep the table config in
a good state at any point in time.
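To make the invariant above concrete, here is a minimal sketch of how I
picture it. Everything in it is hypothetical: the class `MdtPartitionGuard`,
its methods, and the two sets standing in for the table config and the storage
layout are illustrations only, not Hudi's actual API.

```java
import java.util.Set;

// Hypothetical sketch only; these names are not part of Hudi.
public class MdtPartitionGuard {

  /** Outcome of checking a single metadata table (MDT) partition. */
  public enum Decision { USE, SKIP }

  /**
   * Decides whether an MDT partition may be consumed.
   *
   * @param partition              MDT partition name, e.g. "column_stats"
   * @param completedInTableConfig partitions the table config declares usable
   * @param presentOnStorage       partitions whose directories actually exist
   */
  public static Decision check(String partition,
                               Set<String> completedInTableConfig,
                               Set<String> presentOnStorage) {
    if (!completedInTableConfig.contains(partition)) {
      // Table config says it is not usable: no writer or reader touches it.
      return Decision.SKIP;
    }
    if (!presentOnStorage.contains(partition)) {
      // Table config promises the partition but it is gone (for example,
      // someone deleted it manually): fail instead of silently degrading.
      throw new IllegalStateException(
          "MDT partition '" + partition + "' is marked completed in the table"
              + " config but is missing on storage");
    }
    return Decision.USE;
  }

  /**
   * Disables a partition the way the comment describes: delete the
   * directory and update the table config, so the two never disagree.
   */
  public static void disable(String partition,
                             Set<String> completedInTableConfig,
                             Set<String> presentOnStorage) {
    presentOnStorage.remove(partition);       // stands in for deleting the directory
    completedInTableConfig.remove(partition); // stands in for updating the table config
  }
}
```

With this shape, a check on the table config alone is sufficient for callers,
as long as every disable path goes through something like `disable` above.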
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.