This is an automated email from the ASF dual-hosted git repository.

kunalkapoor pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git
commit 67f42a69fd042d184d32fd9c11304217bf21e75f
Author: QiangCai <[email protected]>
AuthorDate: Wed Apr 10 11:39:52 2019 +0800

    [DOC] Improve java doc for DataSkewRangePartitioner

    Improve java doc for DataSkewRangePartitioner

    This closes #3175
---
 .../src/main/scala/org/apache/spark/DataSkewRangePartitioner.scala | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/integration/spark-common/src/main/scala/org/apache/spark/DataSkewRangePartitioner.scala b/integration/spark-common/src/main/scala/org/apache/spark/DataSkewRangePartitioner.scala
index d434108..733ee87 100644
--- a/integration/spark-common/src/main/scala/org/apache/spark/DataSkewRangePartitioner.scala
+++ b/integration/spark-common/src/main/scala/org/apache/spark/DataSkewRangePartitioner.scala
@@ -40,7 +40,7 @@ import org.apache.spark.util.{CollectionsUtils, Utils}
  * the rangeBounds are also the distinct values, but it calculates the skew weight.
  * So some rangeBounds maybe have more than one partitions.
  *
- * for example, split following CSV file to 5 partitions:
+ * for example, split following CSV file to 5 partitions by col2:
  * ---------------
  * col1,col2
  * 1,
@@ -77,6 +77,7 @@ import org.apache.spark.util.{CollectionsUtils, Utils}
  * --------------------------------------------------------------
  * The skew weight of range bound "null" is 2.
  * So it will start two tasks for range bound "null" to create two partitions.
+ * For a range bound, the number of final partitions is the same as the skew weight.
  */
 class DataSkewRangePartitioner[K: Ordering : ClassTag, V](
     partitions: Int,
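
The doc comment above says that a range bound can carry a skew weight, and that the number of final partitions for a bound equals that weight. A minimal sketch of one way such weights could be derived is shown below. It is not the CarbonData implementation: `SkewWeightSketch`, `weightedBounds`, and the sample data are hypothetical, and the sketch assumes that bounds are picked at evenly spaced positions in a sorted sample, with repeated bounds merged and counted.

```scala
// Hypothetical sketch, not the code in DataSkewRangePartitioner.scala.
object SkewWeightSketch {

  // Picks (partitions - 1) evenly spaced candidate bounds from the sorted
  // sample, then merges equal neighbours and counts them as the skew weight.
  def weightedBounds[K: Ordering](sample: Seq[K], partitions: Int): Vector[(K, Int)] = {
    require(sample.nonEmpty && partitions > 0, "need a non-empty sample and partitions > 0")
    val sorted = sample.sorted
    val step = sorted.size.toDouble / partitions
    // Candidate i is the last element of the i-th equally sized slice.
    val candidates = (1 until partitions).map(i => sorted(math.ceil(i * step).toInt - 1))
    candidates.foldLeft(Vector.empty[(K, Int)]) { (acc, next) =>
      acc.lastOption match {
        // Same value as the previous bound: raise its weight instead of adding a bound.
        case Some((bound, weight)) if bound == next => acc.init :+ (bound -> (weight + 1))
        case _ => acc :+ (next -> 1)
      }
    }
  }

  def main(args: Array[String]): Unit = {
    // Made-up col2 values; None stands for an empty (null) cell and sorts first.
    val col2 = Seq(None, None, None, None, None, Some("a"), Some("b"), Some("c"), Some("d"), Some("e"))
    // With 5 target partitions, the bound None is selected twice, so its weight is 2.
    println(weightedBounds(col2, partitions = 5))
  }
}
```

In this made-up sample half of the rows have an empty col2, so the bound for the empty value is selected twice and ends up with weight 2, while the other bounds keep weight 1. Under the rule stated in the doc comment, the weight-2 bound would then be served by two partitions.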
