manojpec commented on a change in pull request #4352:
URL: https://github.com/apache/hudi/pull/4352#discussion_r789071111
##########
File path:
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/index/bloom/HoodieBloomIndex.java
##########
@@ -111,13 +116,19 @@ public HoodieBloomIndex(HoodieWriteConfig config, BaseHoodieBloomIndexHelper blo
   private HoodiePairData<HoodieKey, HoodieRecordLocation> lookupIndex(
       HoodiePairData<String, String> partitionRecordKeyPairs, final HoodieEngineContext context,
       final HoodieTable hoodieTable) {
-    // Obtain records per partition, in the incoming records
+    // Step 1: Obtain records per partition, in the incoming records
     Map<String, Long> recordsPerPartition = partitionRecordKeyPairs.countByKey();
     List<String> affectedPartitionPathList = new ArrayList<>(recordsPerPartition.keySet());
     // Step 2: Load all involved files as <Partition, filename> pairs
-    List<Pair<String, BloomIndexFileInfo>> fileInfoList =
-        loadInvolvedFiles(affectedPartitionPathList, context, hoodieTable);
+    List<Pair<String, BloomIndexFileInfo>> fileInfoList;
+    if (config.getBloomIndexPruneByRanges()) {
+      fileInfoList = (config.getMetadataConfig().isMetaIndexColumnStatsEnabled()
+          ? loadColumnRangesFromMetaIndex(affectedPartitionPathList, context, hoodieTable)
+          : loadColumnRangesFromFiles(affectedPartitionPathList, context, hoodieTable));
+    } else {
+      fileInfoList = getLatestBaseFilesForPartitions(affectedPartitionPathList, context, hoodieTable);
+    }
Review comment:
This is not specific to loadColumnRangesFromFiles(). When pruning by
ranges is not enabled, we still need to do this. Also, subclasses don't
need to load column ranges from files (loadColumnRangesFromFiles()) when
range pruning is not enabled.
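
To make the branching easier to discuss, here is a minimal, self-contained sketch of the selection logic in the diff above. The class `FileInfoSourceSketch`, the enum `Source`, and the method `choose` are hypothetical names used only for illustration and are not part of Hudi. The two boolean parameters stand in for `config.getBloomIndexPruneByRanges()` and `config.getMetadataConfig().isMetaIndexColumnStatsEnabled()`.

```java
public class FileInfoSourceSketch {

  // Hypothetical labels for the three code paths in the diff; not Hudi types.
  enum Source {
    COLUMN_RANGES_FROM_META_INDEX, // stands in for loadColumnRangesFromMetaIndex(...)
    COLUMN_RANGES_FROM_FILES,      // stands in for loadColumnRangesFromFiles(...)
    LATEST_BASE_FILES_ONLY         // stands in for getLatestBaseFilesForPartitions(...)
  }

  // Mirrors the if/else in the diff: column ranges are loaded only when
  // range pruning is enabled; otherwise only the latest base files are listed.
  static Source choose(boolean pruneByRanges, boolean columnStatsEnabled) {
    if (!pruneByRanges) {
      return Source.LATEST_BASE_FILES_ONLY;
    }
    return columnStatsEnabled
        ? Source.COLUMN_RANGES_FROM_META_INDEX
        : Source.COLUMN_RANGES_FROM_FILES;
  }

  public static void main(String[] args) {
    // Print the chosen path for every combination of the two flags.
    for (boolean prune : new boolean[] {true, false}) {
      for (boolean stats : new boolean[] {true, false}) {
        System.out.println("pruneByRanges=" + prune
            + ", columnStatsEnabled=" + stats
            + " -> " + choose(prune, stats));
      }
    }
  }
}
```

In this sketch the column-stats flag has no effect once range pruning is off, which is the case the comment above is about.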