Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2204#discussion_r183230216
--- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonInputFormat.java ---
    @@ -359,23 +359,27 @@ protected Expression getFilterPredicates(Configuration configuration) {
                 .getProperty(CarbonCommonConstants.USE_DISTRIBUTED_DATAMAP,
                     CarbonCommonConstants.USE_DISTRIBUTED_DATAMAP_DEFAULT));
         DataMapExprWrapper dataMapExprWrapper =
    -        DataMapChooser.get().choose(getOrCreateCarbonTable(job.getConfiguration()), resolver);
    +        DataMapChooser.get().chooseCG(getOrCreateCarbonTable(job.getConfiguration()), resolver);
         DataMapJob dataMapJob = getDataMapJob(job.getConfiguration());
         List<PartitionSpec> partitionsToPrune = getPartitionsToPrune(job.getConfiguration());
         List<ExtendedBlocklet> prunedBlocklets;
    -    DataMapLevel dataMapLevel = dataMapExprWrapper.getDataMapType();
    -    if (dataMapJob != null &&
    -        (distributedCG ||
    -        (dataMapLevel == DataMapLevel.FG && isFgDataMapPruningEnable(job.getConfiguration())))) {
    -      DistributableDataMapFormat datamapDstr =
    -          new DistributableDataMapFormat(carbonTable, dataMapExprWrapper, segmentIds,
    -              partitionsToPrune, BlockletDataMapFactory.class.getName());
    -      prunedBlocklets = dataMapJob.execute(datamapDstr, resolver);
    -      // Apply expression on the blocklets.
    -      prunedBlocklets = dataMapExprWrapper.pruneBlocklets(prunedBlocklets);
    +    if (distributedCG) {
--- End diff --
I am a bit confused. For CG, there is always a BlockletDataMap (CG), and the
user may define additional CG datamaps, so there are 1 + N CG datamaps, where
N >= 0. In that case, we should prune with the BlockletDataMap first and then
with the user-defined CG datamaps. Am I right?
If so, should this `if` check change to `if (dataMapJob != null)`?
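
The ordering described above (default datamap first, then the N user-defined CG datamaps) can be sketched as below. This is only an illustration of the idea, not CarbonData code: `CgPruningSketch`, `Pruner`, `pruneInOrder`, and the string blocklet ids are hypothetical stand-ins, and no CarbonData API is used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class CgPruningSketch {

  /** Hypothetical stand-in for a CG datamap: returns true to keep a blocklet id. */
  interface Pruner extends Predicate<String> {}

  /**
   * Applies the default pruner first, then each user-defined pruner in turn,
   * so every later stage only sees blocklets that survived the earlier ones.
   */
  static List<String> pruneInOrder(
      List<String> blocklets, Pruner defaultPruner, List<Pruner> userPruners) {
    // Stage 1: the always-present default datamap (the "1" in 1 + N).
    List<String> survivors = new ArrayList<>();
    for (String blocklet : blocklets) {
      if (defaultPruner.test(blocklet)) {
        survivors.add(blocklet);
      }
    }
    // Stage 2: the N user-defined datamaps; N may be zero.
    for (Pruner pruner : userPruners) {
      List<String> next = new ArrayList<>();
      for (String blocklet : survivors) {
        if (pruner.test(blocklet)) {
          next.add(blocklet);
        }
      }
      survivors = next;
    }
    return survivors;
  }

  public static void main(String[] args) {
    List<String> blocklets = Arrays.asList("seg0/b0", "seg0/b1", "seg1/b0");
    Pruner defaultPruner = id -> id.startsWith("seg0");
    List<Pruner> userPruners = Arrays.<Pruner>asList(id -> id.endsWith("b1"));
    System.out.println(pruneInOrder(blocklets, defaultPruner, userPruners)); // prints [seg0/b1]
  }
}
```

With an empty `userPruners` list (N = 0), the result is simply whatever the default pruner keeps, which is why the default stage does not depend on any user-defined datamap being present.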
---