Github user xuchuanyin commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2864#discussion_r228703510
--- Diff:
integration/spark2/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala
---
@@ -1171,12 +1171,27 @@ object CarbonDataRDDFactory {
.ensureExecutorsAndGetNodeList(blockList, sqlContext.sparkContext)
val skewedDataOptimization = CarbonProperties.getInstance()
.isLoadSkewedDataOptimizationEnabled()
- val loadMinSizeOptimization = CarbonProperties.getInstance()
- .isLoadMinSizeOptimizationEnabled()
// get user ddl input the node loads the smallest amount of data
- val expectedMinSizePerNode = carbonLoadModel.getLoadMinSize()
+ val carbonTable =
carbonLoadModel.getCarbonDataLoadSchema.getCarbonTable
+ val loadMinSize =
carbonTable.getTableInfo.getFactTable.getTableProperties.asScala
--- End diff --
It seems that you get the load-min-size only from the table property, but
you claimed that carbon also supports specifying it through loadOption.
The expected procedure is:
1. get the loadMinSize from LoadOption; if it is zero, go to step 2,
otherwise go to step 4;
2. get it from TableProperty; if it is zero, go to step 3, otherwise go to
step 4;
3. use another strategy;
4. use NODE_MIN_SIZE_FIRST.
Have you handled this?
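The fallback order described above could be sketched roughly like this (a
hypothetical helper, not the actual CarbonData code; names such as
`resolveLoadMinSize` and the treatment of zero as "unset" are assumptions):

```scala
object LoadMinSizeResolver {

  // Resolves the effective load-min-size following the precedence above:
  // LoadOption first, then TableProperty, else fall back to another strategy.
  // Returns the chosen size and whether NODE_MIN_SIZE_FIRST should apply.
  def resolveLoadMinSize(fromLoadOption: Long, fromTableProperty: Long): (Long, Boolean) = {
    if (fromLoadOption > 0) {
      // step 1 -> step 4: user DDL/load option wins
      (fromLoadOption, true)
    } else if (fromTableProperty > 0) {
      // step 2 -> step 4: fall back to the table property
      (fromTableProperty, true)
    } else {
      // step 3: neither is set, use another strategy
      (0L, false)
    }
  }
}
```

For example, `resolveLoadMinSize(0L, 50L)` would pick the table-property
value only because the load option is unset.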
---