Github user manishgupta88 commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2869#discussion_r230386178
--- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonFileInputFormat.java ---
@@ -145,9 +154,30 @@ public CarbonTable getOrCreateCarbonTable(Configuration configuration) throws IO
           externalTableSegments.add(seg);
         }
       }
-      // do block filtering and get split
-      List<InputSplit> splits =
-          getSplits(job, filter, externalTableSegments, null, partitionInfo, null);
+      List<InputSplit> splits = new ArrayList<>();
+      boolean useBlockDataMap = job.getConfiguration().getBoolean("filter_blocks", true);
+      if (useBlockDataMap) {
+        // do block filtering and get split
+        splits = getSplits(job, filter, externalTableSegments, null, partitionInfo, null);
+      } else {
+        for (CarbonFile carbonFile : getAllCarbonDataFiles(carbonTable.getTablePath())) {
+          CarbonInputSplit split =
+              new CarbonInputSplit("null", new Path(carbonFile.getAbsolutePath()), 0,
+                  carbonFile.getLength(), carbonFile.getLocations(), FileFormat.COLUMNAR_V3);
--- End diff ---
Add a comment explaining why "null" is passed. Better to use a constant for it. Also
explain the scenarios in which the if and else blocks will be executed.
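
A minimal sketch of what this suggestion could look like. The constant name `INVALID_SEGMENT_ID` and the helper are hypothetical illustrations, not existing CarbonData API; the real change would live in `CarbonFileInputFormat` and pass the constant to the `CarbonInputSplit` constructor shown in the diff:

```java
// Hypothetical sketch: name the "null" segment-id literal and document
// when each branch of the split-creation logic runs.
public class SplitStrategySketch {

  // Hypothetical constant replacing the bare "null" literal; external
  // (non-transactional) files have no real segment id, so a placeholder
  // is passed to the split. A shared constants class may already be the
  // right home for this in the actual codebase.
  public static final String INVALID_SEGMENT_ID = "null";

  // Mirrors the if/else in the diff: with block-datamap filtering enabled
  // (the default), splits come from datamap pruning; otherwise one split
  // is created per carbondata file under the table path.
  public static String chooseStrategy(boolean useBlockDataMap) {
    if (useBlockDataMap) {
      // do block filtering and get splits via the block datamap
      return "datamap-pruned-splits";
    } else {
      // skip pruning: build a split per file, tagged with INVALID_SEGMENT_ID
      return "one-split-per-file";
    }
  }

  public static void main(String[] args) {
    System.out.println(chooseStrategy(true));
    System.out.println(chooseStrategy(false));
  }
}
```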
---