[carbondata] branch master updated: [CARBONDATA-3661] Fix target file size check fail when upload local file to carbon store

2020-01-10 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new e2ddc41  [CARBONDATA-3661] Fix target file size check fail when upload local file to carbon store
e2ddc41 is described below

commit e2ddc415e6530d5dae85ecea43e7bb96504df36b
Author: liuzhi <371684...@qq.com>
AuthorDate: Fri Jan 10 12:54:24 2020 +0800

[CARBONDATA-3661] Fix target file size check fail when upload local file to carbon store

Why is this PR needed?
Multiple Flink tasks writing carbon data may use the same carbon data file name,
which causes the target file size check to fail when uploading local files to
the carbon store.

What changes were proposed in this PR?
Make different Flink tasks use different carbon data file names, by using a
UUID as the write task ID.

Does this PR introduce any user interface change?
No
Is any new testcase added?
No
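
For illustration, a minimal SDK usage sketch (the output path and surrounding setup are hypothetical) of giving each concurrent writer its own task number, mirroring the one-line change below:

    import java.util.UUID;

    import org.apache.carbondata.sdk.file.CarbonWriter;
    import org.apache.carbondata.sdk.file.CarbonWriterBuilder;

    public class UniqueTaskNoExample {
      public static void main(String[] args) {
        // Each writer gets its own task ID, so data files written concurrently
        // by different Flink tasks no longer collide on the same file name.
        String taskId = UUID.randomUUID().toString().replace("-", "");
        CarbonWriterBuilder builder = CarbonWriter.builder()
            .taskNo(taskId)                   // API added by this commit
            .outputPath("/tmp/carbon/output") // hypothetical output path
            .writtenBy("flink");
        // ... set the schema and call build(), omitted here
      }
    }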

This closes #3573
---
 .../java/org/apache/carbon/flink/CarbonLocalWriter.java |  1 +
 .../main/java/org/apache/carbon/flink/CarbonS3Writer.java   |  1 +
 .../org/apache/carbondata/sdk/file/CarbonWriterBuilder.java | 13 +
 3 files changed, 15 insertions(+)

diff --git a/integration/flink/src/main/java/org/apache/carbon/flink/CarbonLocalWriter.java b/integration/flink/src/main/java/org/apache/carbon/flink/CarbonLocalWriter.java
index db88cd4..a8068a3 100644
--- a/integration/flink/src/main/java/org/apache/carbon/flink/CarbonLocalWriter.java
+++ b/integration/flink/src/main/java/org/apache/carbon/flink/CarbonLocalWriter.java
@@ -62,6 +62,7 @@ final class CarbonLocalWriter extends CarbonWriter {
 try {
   final CarbonWriterBuilder writerBuilder =
   org.apache.carbondata.sdk.file.CarbonWriter.builder()
+  .taskNo(UUID.randomUUID().toString().replace("-", ""))
   .outputPath(super.getWritePath(row))
  .writtenBy("flink")
  .withSchemaFile(CarbonTablePath.getSchemaFilePath(table.getTablePath()))
diff --git a/integration/flink/src/main/java/org/apache/carbon/flink/CarbonS3Writer.java b/integration/flink/src/main/java/org/apache/carbon/flink/CarbonS3Writer.java
index ecae32a..d23c668 100644
--- a/integration/flink/src/main/java/org/apache/carbon/flink/CarbonS3Writer.java
+++ b/integration/flink/src/main/java/org/apache/carbon/flink/CarbonS3Writer.java
@@ -65,6 +65,7 @@ final class CarbonS3Writer extends CarbonWriter {
 try {
   final CarbonWriterBuilder writerBuilder =
   org.apache.carbondata.sdk.file.CarbonWriter.builder()
+  .taskNo(UUID.randomUUID().toString().replace("-", ""))
   .outputPath(super.getWritePath(row))
  .writtenBy("flink")
  .withSchemaFile(CarbonTablePath.getSchemaFilePath(table.getTablePath()))
diff --git a/store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonWriterBuilder.java b/store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonWriterBuilder.java
index eb47a8d..cbf899f 100644
--- a/store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonWriterBuilder.java
+++ b/store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonWriterBuilder.java
@@ -152,6 +152,19 @@ public class CarbonWriterBuilder {
   }
 
   /**
+   * sets the taskNo for the writer. SDKs concurrently running
+   * will set taskNo in order to avoid conflicts in file's name during write.
+   *
+   * @param taskNo is the TaskNo user wants to specify.
+   *   by default it is system time in nano seconds.
+   * @return updated CarbonWriterBuilder
+   */
+  public CarbonWriterBuilder taskNo(String taskNo) {
+this.taskNo = taskNo;
+return this;
+  }
+
+  /**
* to set the timestamp in the carbondata and carbonindex index files
*
* @param timestamp is a timestamp to be used in the carbondata and carbonindex index files.



[carbondata] branch master updated: [CARBONDATA-3650] Remove file format V1 and V2 reader

2020-01-05 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new 5bd345b  [CARBONDATA-3650] Remove file format V1 and V2 reader
5bd345b is described below

commit 5bd345ba0aa1fe822831e7dabfcbfa88ec635614
Author: Jacky Li 
AuthorDate: Sun Dec 29 21:55:05 2019 +0800

[CARBONDATA-3650] Remove file format V1 and V2 reader

The V1 and V2 file formats are deprecated in CarbonData 2.0.

This closes #3543
---
 .../core/constants/CarbonVersionConstants.java |   5 -
 .../impl/VariableLengthDimensionColumnPage.java|  15 --
 .../chunk/reader/CarbonDataReaderFactory.java  |  30 +--
 .../reader/dimension/AbstractChunkReader.java  |  96 --
 ...rmat.java => AbstractDimensionChunkReader.java} |  48 -
 .../CompressedDimensionChunkFileBasedReaderV1.java | 181 --
 .../CompressedDimensionChunkFileBasedReaderV2.java | 203 -
 ...aderV3.java => DimensionChunkPageReaderV3.java} |   6 +-
 ...edReaderV3.java => DimensionChunkReaderV3.java} |   6 +-
 .../reader/measure/AbstractMeasureChunkReader.java |  86 +++--
 .../AbstractMeasureChunkReaderV2V3Format.java  | 111 ---
 .../CompressedMeasureChunkFileBasedReaderV1.java   | 112 
 .../CompressedMeasureChunkFileBasedReaderV2.java   | 152 ---
 ...ReaderV3.java => MeasureChunkPageReaderV3.java} |   6 +-
 ...asedReaderV3.java => MeasureChunkReaderV3.java} |   6 +-
 .../datastore/page/encoding/EncodingFactory.java   |   8 -
 .../statistics/PrimitivePageStatsCollector.java|   6 +-
 .../blockletindex/BlockletDataRefNode.java |  27 +--
 .../core/keygenerator/mdkey/NumberCompressor.java  | 181 --
 .../core/metadata/ColumnarFormatVersion.java   |   4 +-
 .../core/metadata/blocklet/BlockletInfo.java   |  69 ---
 .../core/metadata/datatype/DataType.java   |   2 -
 .../core/metadata/datatype/DataTypes.java  |   5 -
 .../core/metadata/datatype/LegacyLongType.java |  33 
 .../apache/carbondata/core/util/CarbonUtil.java|  15 --
 .../core/util/DataFileFooterConverterFactory.java  |   7 +-
 .../apache/carbondata/core/util/DataTypeUtil.java  |   4 -
 .../carbondata/core/util/path/CarbonTablePath.java |   8 +-
 .../mdkey/NumberCompressorUnitTest.java| 116 
 .../carbondata/core/util/CarbonTestUtil.java   |   3 -
 .../carbondata/core/util/CarbonUtilTest.java   |  10 +-
 .../CarbonV1toV3CompatabilityTestCase.scala|  98 --
 .../LoadTableWithLocalDictionaryTestCase.scala |   4 +-
 .../TestNonTransactionalCarbonTable.scala  |   4 +-
 .../LocalDictionarySupportLoadTableTest.scala  |   4 +-
 .../spark/rdd/CarbonDataRDDFactory.scala   |   2 +-
 .../processing/store/CarbonDataWriterFactory.java  |   5 +-
 37 files changed, 164 insertions(+), 1514 deletions(-)

diff --git a/core/src/main/java/org/apache/carbondata/core/constants/CarbonVersionConstants.java b/core/src/main/java/org/apache/carbondata/core/constants/CarbonVersionConstants.java
index 2382bd8..50c8ffd 100644
--- a/core/src/main/java/org/apache/carbondata/core/constants/CarbonVersionConstants.java
+++ b/core/src/main/java/org/apache/carbondata/core/constants/CarbonVersionConstants.java
@@ -50,11 +50,6 @@ public final class CarbonVersionConstants {
*/
   public static final String CARBONDATA_BUILD_DATE;
 
-  /**
-   * number of rows per blocklet column page default value for V2 version
-   */
-  public static final int NUMBER_OF_ROWS_PER_BLOCKLET_COLUMN_PAGE_DEFAULT_V2 = 12;
-
   static {
 // create input stream for CARBONDATA_VERSION_INFO_FILE
 InputStream resourceStream = Thread.currentThread().getContextClassLoader()
diff --git a/core/src/main/java/org/apache/carbondata/core/datastore/chunk/impl/VariableLengthDimensionColumnPage.java b/core/src/main/java/org/apache/carbondata/core/datastore/chunk/impl/VariableLengthDimensionColumnPage.java
index 2e941b2..2a71934 100644
--- a/core/src/main/java/org/apache/carbondata/core/datastore/chunk/impl/VariableLengthDimensionColumnPage.java
+++ b/core/src/main/java/org/apache/carbondata/core/datastore/chunk/impl/VariableLengthDimensionColumnPage.java
@@ -36,21 +36,6 @@ public class VariableLengthDimensionColumnPage extends AbstractDimensionColumnPa
* @param invertedIndexReverse reverse inverted index
* @param numberOfRows number of rows
* @param dictionary   carbon local dictionary for string column.
-   */
-  public VariableLengthDimensionColumnPage(byte[] dataChunks, int[] invertedIndex,
-  int[] invertedIndexReverse, int numberOfRows, DimensionStoreType dimStoreType,
-  CarbonDictionary dictionary, int dataLength) {
-this(dataChunks, invertedIndex, invertedIndexReverse, numberOfRows

[carbondata] 28/33: [CARBONDATA-3520] CTAS should fail if select query contains duplicate columns

2019-10-03 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 21bbc4a5306ff850fcf14b488e4cb0452415213c
Author: Indhumathi27 
AuthorDate: Mon Sep 16 16:25:03 2019 +0530

[CARBONDATA-3520] CTAS should fail if select query contains duplicate columns

Problem:
If the SELECT query contains duplicate columns, CTAS created
a table with only one column, which is wrong.

Solution:
Throw an error message if the SELECT query contains duplicate columns.
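
For reference, a minimal Java sketch of the duplicate check (the actual implementation is the Scala helper checkIfDuplicateColumnExists shown in the diff below; the names here are illustrative):

    import java.util.Arrays;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Locale;
    import java.util.Set;
    import java.util.TreeSet;

    public class DuplicateColumnCheck {
      // Returns the column names that occur more than once,
      // comparing case-insensitively as SQL identifiers usually are.
      static Set<String> findDuplicates(List<String> columnNames) {
        Set<String> seen = new HashSet<>();
        Set<String> duplicates = new TreeSet<>();
        for (String name : columnNames) {
          if (!seen.add(name.toLowerCase(Locale.ROOT))) {
            duplicates.add(name);
          }
        }
        return duplicates;
      }

      public static void main(String[] args) {
        // SELECT t1.city, t2.city ... yields two output columns named "city"
        Set<String> dups = findDuplicates(Arrays.asList("city", "city"));
        if (!dups.isEmpty()) {
          throw new IllegalArgumentException(
              "Duplicated column names found in table definition: " + dups);
        }
      }
    }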

This closes #3388
---
 .../createTable/TestCreateTableAsSelect.scala  | 37 ++
 .../sql/parser/CarbonSparkSqlParserUtil.scala  | 23 +++---
 2 files changed, 56 insertions(+), 4 deletions(-)

diff --git a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestCreateTableAsSelect.scala b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestCreateTableAsSelect.scala
index 3896061..8e4d8fa 100644
--- a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestCreateTableAsSelect.scala
+++ b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestCreateTableAsSelect.scala
@@ -407,6 +407,43 @@ class TestCreateTableAsSelect extends QueryTest with BeforeAndAfterAll {
 checkAnswer(sql("SELECT * FROM target_table"), Seq(Row("shenzhen", 24.5)))
   }
 
+  test("test duplicate columns with select query") {
+sql("DROP TABLE IF EXISTS target_table")
+sql("DROP TABLE IF EXISTS source_table")
+// create carbon table and insert data
+sql(
+  """
+| CREATE TABLE source_table(
+| id INT,
+| name STRING,
+| city STRING,
+| age INT)
+| STORED BY 'carbondata'
+| """.stripMargin)
+sql("INSERT INTO source_table SELECT 1,'bob','shenzhen',27")
+val e = intercept[AnalysisException] {
+  sql(
+"""
+  | CREATE TABLE target_table
+  | STORED BY 'carbondata'
+  | AS
+  |   SELECT t1.city, t2.city
+  |   FROM source_table t1, source_table t2 where t1.city=t2.city and t1.city = 'shenzhen'
+  """.stripMargin)
+}
+e.getMessage().toString.contains("Duplicated column names found in table definition of " +
+ "`target_table`: [\"city\"]")
+sql(
+  """
+| CREATE TABLE target_table
+| STORED BY 'carbondata'
+| AS
+|   SELECT t1.city as a, t2.city as b
+|   FROM source_table t1, source_table t2 where t1.city=t2.city and t1.city = 'shenzhen'
+  """.stripMargin)
+checkAnswer(sql("select * from target_table"), Seq(Row("shenzhen", "shenzhen")))
+  }
+
   override def afterAll {
 sql("DROP TABLE IF EXISTS carbon_ctas_test")
 sql("DROP TABLE IF EXISTS parquet_ctas_test")
diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/parser/CarbonSparkSqlParserUtil.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/parser/CarbonSparkSqlParserUtil.scala
index 5c008f2..4d85e88 100644
--- a/integration/spark2/src/main/scala/org/apache/spark/sql/parser/CarbonSparkSqlParserUtil.scala
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/parser/CarbonSparkSqlParserUtil.scala
@@ -119,6 +119,8 @@ object CarbonSparkSqlParserUtil {
   case _ =>
   // ignore this case
 }
+val columnNames = fields.map(_.name.get)
+checkIfDuplicateColumnExists(columns, tableIdentifier, columnNames)
 if (partitionFields.nonEmpty && options.isStreaming) {
  operationNotAllowed("Streaming is not allowed on partitioned table", partitionColumns)
 }
@@ -355,16 +357,29 @@ object CarbonSparkSqlParserUtil {
 
 // Ensuring whether no duplicate name is used in table definition
 val colNames: Seq[String] = cols.map(_.name)
+checkIfDuplicateColumnExists(columns, tableIdentifier, colNames)
+colNames
+  }
+
+  private def checkIfDuplicateColumnExists(columns: ColTypeListContext,
+  tableIdentifier: TableIdentifier,
+  colNames: Seq[String]): Unit = {
 if (colNames.length != colNames.distinct.length) {
   val duplicateColumns = colNames.groupBy(identity).collect {
 case (x, ys) if ys.length > 1 => "\"" + x + "\""
   }
-  operationNotAllowed(s"Duplicated column names found in table definition of " +
-  s"$tableIdentifier: ${ duplicateColumns.mkString("[", ",", "]

[carbondata] 18/33: [CARBONDATA-3506]Fix alter table failures on parition table with hive.metastore.disallow.incompatible.col.type.changes as true

2019-10-03 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit ef26a4a0a556d574cc30c74c674234a4564f34c1
Author: akashrn5 
AuthorDate: Wed Aug 28 12:05:13 2019 +0530

[CARBONDATA-3506]Fix alter table failures on parition table with hive.metastore.disallow.incompatible.col.type.changes as true

Problem:
In Spark 2.2 and above, when we call alterExternalCatalogForTableWithUpdatedSchema
to update the new schema to the external catalog in the add-column case, Spark
gets the catalog table and then itself adds the partition columns (if the table
is a partition table) to the new data schema sent by Carbon. This results in
duplicate partition columns, so validation fails in Hive.
When the table has only two columns and one of them is the partition column,
dropping the non-partition column is invalid because allowing it would leave a
table with all columns as partition columns. So, with the above property set to
true, drop column fails to update the Hive metastore.
In Spark 2.2 and above, if a datatype change is done on a partition column with
the above property set to true, it also fails, as we do not send the partition
column for the schema alter in Hive.

Solution:
When sending the new schema to Spark to update the catalog, do not send the
partition columns in Spark 2.2 and above, as Spark takes care of adding the
partition columns to the new schema sent by us.
For the drop scenario above, do not allow the drop column operation if, after
dropping the specific column, the table would have only partition columns (see
the sketch below).
Block the datatype change operation on partition columns in Spark 2.2 and above.
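
A minimal sketch (all names hypothetical) of the drop-column guard described in the solution:

    import java.util.HashSet;
    import java.util.Set;

    public class DropColumnGuard {
      // Disallow the drop if, afterwards, every remaining column would be a
      // partition column, i.e. a table with all columns as partition columns.
      static void validateDrop(Set<String> tableColumns,
                               Set<String> partitionColumns,
                               Set<String> columnsToDrop) {
        Set<String> remaining = new HashSet<>(tableColumns);
        remaining.removeAll(columnsToDrop);
        if (partitionColumns.containsAll(remaining)) {
          throw new UnsupportedOperationException(
              "cannot drop column: only partition columns would remain");
        }
      }
    }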

This closes #3367
---
 .../StandardPartitionTableQueryTestCase.scala  | 29 +
 .../schema/CarbonAlterTableAddColumnCommand.scala  | 20 +---
 ...nAlterTableColRenameDataTypeChangeCommand.scala | 36 +++---
 .../schema/CarbonAlterTableDropColumnCommand.scala | 35 +
 4 files changed, 99 insertions(+), 21 deletions(-)

diff --git a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/standardpartition/StandardPartitionTableQueryTestCase.scala b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/standardpartition/StandardPartitionTableQueryTestCase.scala
index c19c0b9..fb4b511 100644
--- a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/standardpartition/StandardPartitionTableQueryTestCase.scala
+++ b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/standardpartition/StandardPartitionTableQueryTestCase.scala
@@ -21,8 +21,10 @@ import org.apache.spark.sql.execution.strategy.CarbonDataSourceScan
 import org.apache.spark.sql.test.Spark2TestQueryExecutor
 import org.apache.spark.sql.test.util.QueryTest
 import org.apache.spark.sql.{DataFrame, Row}
+import org.apache.spark.util.SparkUtil
 import org.scalatest.BeforeAndAfterAll
 
+import org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException
 import org.apache.carbondata.core.constants.CarbonCommonConstants
 import org.apache.carbondata.core.datastore.impl.FileFactory
 import org.apache.carbondata.core.util.CarbonProperties
@@ -439,18 +441,32 @@ test("Creation of partition table should fail if the colname in table schema and
 
   test("validate data in partition table after dropping and adding a column") {
 sql("drop table if exists par")
-sql("create table par(name string) partitioned by (age double) stored by " 
+
+sql("create table par(name string, add string) partitioned by (age double) 
stored by " +
   "'carbondata' TBLPROPERTIES('cache_level'='blocklet')")
-sql(s"load data local inpath '$resourcesPath/uniqwithoutheader.csv' into 
table par options" +
-s"('header'='false')")
+sql("insert into par select 'joey','NY',32 union all select 
'chandler','NY',32")
 sql("alter table par drop columns(name)")
 sql("alter table par add columns(name string)")
-sql(s"load data local inpath '$resourcesPath/uniqwithoutheader.csv' into 
table par options" +
-s"('header'='false')")
-checkAnswer(sql("select name from par"), Seq(Row("a"),Row("b"), Row(null), 
Row(null)))
+sql("insert into par select 'joey','NY',32 union all select 
'joey','NY',32")
+checkAnswer(sql("select name from par"), Seq(Row("NY"),Row("NY"), 
Row(null), Row(null)))
 sql("drop table if exists par")
   }
 
+  test("test drop column when after dropping only partition column remains and 
datatype change on partition column") {
+

[carbondata] 31/33: [CARBONDATA-3527] Fix 'String length cannot exceed 32000 characters' issue when load data with 'GLOBAL_SORT' from csv files which include big complex type data

2019-10-03 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 93425458a93871fd18b1c3c41da396dbb06c02c8
Author: Zhang Zhichao <441586...@qq.com>
AuthorDate: Wed Sep 25 15:58:35 2019 +0800

[CARBONDATA-3527] Fix 'String length cannot exceed 32000 characters' issue when load data with 'GLOBAL_SORT' from csv files which include big complex type data

Problem:
When complex type data takes more than 32000 characters to represent in a CSV
file, and data is loaded with 'GLOBAL_SORT' from such CSV files, a 'String
length cannot exceed 32000 characters' exception is thrown.

Cause:
When 'GLOBAL_SORT' is used to load data from CSV files, the files are read and
the data is first stored in a StringArrayRow, where the type of every value is
string. When 'CarbonScalaUtil.getString' is called in 'NewRddIterator.next', it
checks the length of every value and throws the 'String length cannot exceed
32000 characters' exception even for complex type data that is stored as more
than 32000 characters in the CSV files.

Solution:
In 'FieldConverter.objectToString' (called in 'CarbonScalaUtil.getString'), if
the data type of the field is a complex type, don't check the length.
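
A minimal sketch of the idea behind the fix (method and constant names are illustrative, not the actual FieldConverter code):

    public class LengthCheckSketch {
      static final int MAX_STRING_LENGTH = 32000;

      // Complex type values are serialized text that can legitimately exceed
      // the limit, so only plain string fields are length-checked.
      static String objectToString(Object value, boolean isComplexType) {
        String s = String.valueOf(value);
        if (!isComplexType && s.length() > MAX_STRING_LENGTH) {
          throw new IllegalArgumentException(
              "String length cannot exceed " + MAX_STRING_LENGTH + " characters");
        }
        return s;
      }
    }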

This closes #3399
---
 .../src/test/resources/complexdata3.csv| 10 +
 .../complexType/TestComplexDataType.scala  | 52 ++
 .../spark/rdd/NewCarbonDataLoadRDD.scala   |  6 ++-
 .../carbondata/spark/util/CarbonScalaUtil.scala|  4 +-
 .../streaming/parser/FieldConverter.scala  | 14 +++---
 5 files changed, 79 insertions(+), 7 deletions(-)

diff --git a/integration/spark-common-test/src/test/resources/complexdata3.csv b/integration/spark-common-test/src/test/resources/complexdata3.csv
new file mode 100644
index 000..63cd44b
--- /dev/null
+++ b/integration/spark-common-test/src/test/resources/complexdata3.csv
@@ -0,0 +1,10 @@
+e01a1773-bd37-40be-a1de-d7e74837a281   (0551)96116063  886 00315   
(0551)46819921  853 4   0   1568220618904   50  asp 
fk  2745000 1   0   0   0   0   
-0.19569306\0020.10781755\002-0.06963766\002-0.06576662\002-0.17820272\002-0.01949397\0020.08014756\002-0.05287997\0020.02067086\002-0.11302640\0020.07383678\0020.07296083\0020.11693181\002-0.06988186\0020.05753217\002-0.02308202\002-0.03685183\0020.05840293\0020.03959572\002-0.01631518\0020.05918765\0020.07385136\002-0.05143059\002-0.19158234\0020.13839211\002
 [...]
+f72ce5cb-2ea6-423b-8c1f-6dadfd6f52e7   (0551)73382297  853 00314   
(0551)73382297  49  9   0   156827510   1559asp 
fk  5821000 1   0   0   0   0   
-0.19569308\0020.10781755\002-0.06963766\002-0.06576661\002-0.17820270\002-0.01949396\0020.08014755\002-0.05287996\0020.02067086\002-0.11302640\0020.07383677\0020.07296082\0020.11693182\002-0.06988187\0020.05753216\002-0.02308202\002-0.03685183\0020.05840293\0020.03959572\002-0.01631517\0020.05918765\0020.07385137\002-0.05143059\002-0.19158235\0020.13839212\00
 [...]
+e282ecb5-9be8-4a0e-8faf-d10e535ab877   13396633307 49  00319   
13918448986 1   7   0   1568260253193   1150asp 
fk  3884000 1   0   0   0   0   
-0.19569308\0020.10781755\002-0.06963766\002-0.06576661\002-0.17820270\002-0.01949396\0020.08014755\002-0.05287996\0020.02067086\002-0.11302640\0020.07383677\0020.07296082\0020.11693182\002-0.06988187\0020.05753216\002-0.02308202\002-0.03685183\0020.05840293\0020.03959572\002-0.01631517\0020.05918765\0020.07385137\002-0.05143059\002-0.19158235\0020.13839212\002-0.0826
 [...]
+01e36a06-b4fd-4638-862c-2785f9e4331b   13924865616 82  00310   
0086(021)60080162   82  6   0   1568293725356   2108
asp fk  3152000 1   0   0   0   0   
-0.19569308\0020.10781755\002-0.06963766\002-0.06576661\002-0.17820270\002-0.01949396\0020.08014755\002-0.05287996\0020.02067086\002-0.11302640\0020.07383677\0020.07296082\0020.11693182\002-0.06988187\0020.05753216\002-0.02308202\002-0.03685183\0020.05840293\0020.03959572\002-0.01631517\0020.05918765\0020.07385137\002-0.05143059\002-0.19158235\0020.13839212\002
 [...]
+a451790d-42f8-48e5-88f4-ba21118e63e6   13326037312 81  00318   
(0551)17198025  852 2   0   1568294179731   2116asp 
fk  1127000 1   0   0   0   0   
-0.19569308\0020.10781755\002-0.06963766\002-0.06576661\002-0.17820270\002-0.01949396\0020.08014755\002-0.05287996\0020.02067086\002-0.11302640\0020.07383677\0020.07296082\0020.11693182\002-0.06988187\0020.05753216\002-0.02308202\002-0.03685183\0020.05840293\0020.03959572\002-0.01631517\0020.05918765\0020.07385137\002-0.05143059\002-0.19158235\0020.13839212\002-0
 [...]
+9d26e280-4e87-4cb

[carbondata] 15/33: [CARBONDATA-3507] Fix Create Table As Select Failure in Spark-2.3

2019-10-03 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit d509cd19e3a9249f18c6b8b0ab2bbe19df017e65
Author: manishnalla1994 
AuthorDate: Thu Aug 29 12:00:11 2019 +0530

[CARBONDATA-3507] Fix Create Table As Select Failure in Spark-2.3

Problem: Create table as select fails with Spark-2.3.
Cause: When creating the table location path, the function
removes the "hdfs://" prefix from the path and then stores it,
due to which in later stages the file is treated as a local Carbon file.
Solution: Get the original table path without removing the prefix.
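
The root cause is easy to reproduce with java.net.URI: getPath() strips the scheme and authority, while toString() preserves them (the HDFS location below is hypothetical):

    import java.net.URI;

    public class UriPathDemo {
      public static void main(String[] args) throws Exception {
        URI loc = new URI("hdfs://namenode:8020/warehouse/target_table");
        System.out.println(loc.getPath());  // /warehouse/target_table -- scheme lost
        System.out.println(loc.toString()); // hdfs://namenode:8020/warehouse/target_table
      }
    }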

This closes #3368
---
 .../main/scala/org/apache/spark/sql/hive/CarbonFileMetastore.scala| 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonFileMetastore.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonFileMetastore.scala
index b19b11c..684bcbb 100644
--- a/integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonFileMetastore.scala
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonFileMetastore.scala
@@ -567,9 +567,7 @@ class CarbonFileMetastore extends CarbonMetaStore {
 }
 val tableLocation = catalogTable.storage.locationUri match {
   case tableLoc@Some(uri) =>
-if (tableLoc.get.isInstanceOf[URI]) {
-  FileFactory.getUpdatedFilePath(tableLoc.get.asInstanceOf[URI].getPath)
-}
+FileFactory.getUpdatedFilePath(tableLoc.get.toString)
   case None =>
  CarbonEnv.getTablePath(tableIdentifier.database, tableIdentifier.table)(sparkSession)
 }



[carbondata] 22/33: [HOTFIX] fix incorrect word in index-server doc

2019-10-03 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 8ffbc1d6afda2ad77efeb738168b972239c89731
Author: lamber-ken <2217232...@qq.com>
AuthorDate: Tue Sep 17 02:08:40 2019 +0800

[HOTFIX] fix incorrect word in index-server doc

fix incorrect word in index-server doc

This closes #3390
---
 docs/index-server.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/index-server.md b/docs/index-server.md
index 9253f2a..0b888c4 100644
--- a/docs/index-server.md
+++ b/docs/index-server.md
@@ -191,7 +191,7 @@ that will authenticate the user to access the index server and no other service.
   
 ## Starting the Server
 ``` 
-./bin/spark-submit --master [yarn/local] --[o ptional parameters] --class org.apache.carbondata.indexserver.IndexServer [path to carbondata-spark2-.jar]
+./bin/spark-submit --master [yarn/local] --[optional parameters] --class org.apache.carbondata.indexserver.IndexServer [path to carbondata-spark2-.jar]
 ```
 Or 
 ``` 



[carbondata] 11/33: [CARBONDATA-3497] Support to write long string for streaming table

2019-10-03 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 99e0c7cb59fdf9d89a797b1b13923b64639dfc30
Author: Zhang Zhichao <441586...@qq.com>
AuthorDate: Tue Aug 27 11:32:48 2019 +0800

[CARBONDATA-3497] Support to write long string for streaming table

This closes #3366
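
The reader change in the diff below switches the length prefix of no-dictionary column values from a 2-byte short to a 4-byte int when the column is VARCHAR (long string), since a short cannot represent lengths above 32767. A minimal sketch of the read side, using java.io.DataInputStream in place of the reader's own input abstraction:

    import java.io.DataInputStream;
    import java.io.IOException;

    public class LengthPrefixSketch {
      // Reads one value whose byte length is encoded as a short for regular
      // string columns and as an int for long string (VARCHAR) columns.
      static byte[] readValue(DataInputStream input, boolean isVarchar) throws IOException {
        int length = isVarchar ? input.readInt() : input.readShort();
        byte[] value = new byte[length];
        input.readFully(value);
        return value;
      }
    }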
---
 .../hadoop/stream/StreamRecordReader.java  |  19 +-
 .../resources/streamSample_with_long_string.csv|   6 +
 .../streaming/CarbonAppendableStreamSink.scala |  19 +-
 .../converter/SparkDataTypeConverterImpl.java  |   6 +-
 .../TestStreamingTableWithLongString.scala | 649 +
 .../streaming/CarbonStreamRecordWriter.java|  11 +-
 .../streaming/parser/CSVStreamParserImp.java   |   5 +-
 .../streaming/parser/CarbonStreamParser.java   |   3 +-
 .../streaming/parser/RowStreamParserImp.scala  |  11 +-
 9 files changed, 715 insertions(+), 14 deletions(-)

diff --git a/hadoop/src/main/java/org/apache/carbondata/hadoop/stream/StreamRecordReader.java b/hadoop/src/main/java/org/apache/carbondata/hadoop/stream/StreamRecordReader.java
index 75e36be..1e40baa 100644
--- a/hadoop/src/main/java/org/apache/carbondata/hadoop/stream/StreamRecordReader.java
+++ b/hadoop/src/main/java/org/apache/carbondata/hadoop/stream/StreamRecordReader.java
@@ -81,6 +81,7 @@ public class StreamRecordReader extends RecordReader {
   protected CarbonTable carbonTable;
   private CarbonColumn[] storageColumns;
   private boolean[] isRequired;
+  private boolean[] dimensionsIsVarcharTypeMap;
   private DataType[] measureDataTypes;
   private int dimensionCount;
   private int measureCount;
@@ -163,6 +164,10 @@ public class StreamRecordReader extends RecordReader {
 .getDirectDictionaryGenerator(storageColumns[i].getDataType());
   }
 }
+dimensionsIsVarcharTypeMap = new boolean[dimensionCount];
+for (int i = 0; i < dimensionCount; i++) {
+  dimensionsIsVarcharTypeMap[i] = storageColumns[i].getDataType() == DataTypes.VARCHAR;
+}
 measureDataTypes = new DataType[measureCount];
 for (int i = 0; i < measureCount; i++) {
   measureDataTypes[i] = storageColumns[dimensionCount + i].getDataType();
@@ -387,7 +392,12 @@ public class StreamRecordReader extends RecordReader {
 }
   } else {
 if (isNoDictColumn[colCount]) {
-  int v = input.readShort();
+  int v = 0;
+  if (dimensionsIsVarcharTypeMap[colCount]) {
+v = input.readInt();
+  } else {
+v = input.readShort();
+  }
   if (isRequired[colCount]) {
 byte[] b = input.readBytes(v);
 if (isFilterRequired[colCount]) {
@@ -561,7 +571,12 @@ public class StreamRecordReader extends RecordReader {
outputValues[colCount] = CarbonCommonConstants.MEMBER_DEFAULT_VAL_ARRAY;
   } else {
 if (isNoDictColumn[colCount]) {
-  int v = input.readShort();
+  int v = 0;
+  if (dimensionsIsVarcharTypeMap[colCount]) {
+v = input.readInt();
+  } else {
+v = input.readShort();
+  }
   outputValues[colCount] = input.readBytes(v);
 } else {
   outputValues[colCount] = input.readInt();
diff --git a/integration/spark-common-test/src/test/resources/streamSample_with_long_string.csv b/integration/spark-common-test/src/test/resources/streamSample_with_long_string.csv
new file mode 100644
index 000..b010c07
--- /dev/null
+++ b/integration/spark-common-test/src/test/resources/streamSample_with_long_string.csv
@@ -0,0 +1,6 @@
+id,name,city,salary,tax,percent,birthday,register,updated,longstr,file
+10001,batch_1,city_1,0.1,0.01,80.01,1990-01-01,2010-01-01 
10:01:01,2010-01-01 
10:01:01,1abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabca
 [...]
+10002,batch_2,city_2,0.2,0.02,80.02,1990-01-02,2010-01-02 
10:01:01,2010-01-02 
10:01:01,2abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabca
 [...]
+10003,batch_3,city_3,0.3,0.03,80.03,1990-01-03,2010-01-03 
10:01:01,2010-

[carbondata] 26/33: [HOTFIX] Fix wrong min/max index of measure

2019-10-03 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit de81b380b71b111061bf27d8837b50a995631268
Author: QiangCai 
AuthorDate: Wed Sep 18 21:02:15 2019 +0800

[HOTFIX] Fix wrong min/max index of measure

This closes #3394
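
The removed code compared min/max values by subtracting them and clamping the difference, which overflows for extreme long values and inverts the sign of the result. A small standalone demonstration of the overflow:

    public class SubtractCompareBug {
      public static void main(String[] args) {
        long first = Long.MAX_VALUE;
        long second = -1L;
        // first > second, so the comparison should be positive, but the
        // subtraction overflows and the clamped result has the wrong sign.
        long compare = first - second; // overflows to Long.MIN_VALUE
        if (compare > 0) {
          compare = 1;
        } else if (compare < 0) {
          compare = -1;
        }
        System.out.println((int) compare);               // -1 -- wrong
        System.out.println(Long.compare(first, second)); // 1  -- correct
      }
    }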
---
 .../carbondata/core/util/CarbonMetadataUtil.java   |  64 
 .../org/apache/carbondata/sdk/file/MinMaxTest.java | 161 +
 2 files changed, 188 insertions(+), 37 deletions(-)

diff --git a/core/src/main/java/org/apache/carbondata/core/util/CarbonMetadataUtil.java b/core/src/main/java/org/apache/carbondata/core/util/CarbonMetadataUtil.java
index f35afc0..7414ab7 100644
--- a/core/src/main/java/org/apache/carbondata/core/util/CarbonMetadataUtil.java
+++ b/core/src/main/java/org/apache/carbondata/core/util/CarbonMetadataUtil.java
@@ -477,50 +477,40 @@ public class CarbonMetadataUtil {
 ByteBuffer firstBuffer = null;
 ByteBuffer secondBuffer = null;
 if (dataType == DataTypes.BOOLEAN || dataType == DataTypes.BYTE) {
-  return first[0] - second[0];
+  if (first[0] > second[0]) {
+return 1;
+  } else if (first[0] < second[0]) {
+return -1;
+  }
+  return 0;
 } else if (dataType == DataTypes.DOUBLE) {
-  firstBuffer = ByteBuffer.allocate(8);
-  firstBuffer.put(first);
-  secondBuffer = ByteBuffer.allocate(8);
-  secondBuffer.put(second);
-  firstBuffer.flip();
-  secondBuffer.flip();
-  double compare = firstBuffer.getDouble() - secondBuffer.getDouble();
-  if (compare > 0) {
-compare = 1;
-  } else if (compare < 0) {
-compare = -1;
+  double firstValue = ((ByteBuffer) (ByteBuffer.allocate(8).put(first).flip())).getDouble();
+  double secondValue = ((ByteBuffer) (ByteBuffer.allocate(8).put(second).flip())).getDouble();
+  if (firstValue > secondValue) {
+return 1;
+  } else if (firstValue < secondValue) {
+return -1;
   }
-  return (int) compare;
+  return 0;
 } else if (dataType == DataTypes.FLOAT) {
-  firstBuffer = ByteBuffer.allocate(8);
-  firstBuffer.put(first);
-  secondBuffer = ByteBuffer.allocate(8);
-  secondBuffer.put(second);
-  firstBuffer.flip();
-  secondBuffer.flip();
-  double compare = firstBuffer.getFloat() - secondBuffer.getFloat();
-  if (compare > 0) {
-compare = 1;
-  } else if (compare < 0) {
-compare = -1;
+  float firstValue = ((ByteBuffer) (ByteBuffer.allocate(8).put(first).flip())).getFloat();
+  float secondValue = ((ByteBuffer) (ByteBuffer.allocate(8).put(second).flip())).getFloat();
+  if (firstValue > secondValue) {
+return 1;
+  } else if (firstValue < secondValue) {
+return -1;
   }
-  return (int) compare;
+  return 0;
 } else if (dataType == DataTypes.LONG || dataType == DataTypes.INT
 || dataType == DataTypes.SHORT) {
-  firstBuffer = ByteBuffer.allocate(8);
-  firstBuffer.put(first);
-  secondBuffer = ByteBuffer.allocate(8);
-  secondBuffer.put(second);
-  firstBuffer.flip();
-  secondBuffer.flip();
-  long compare = firstBuffer.getLong() - secondBuffer.getLong();
-  if (compare > 0) {
-compare = 1;
-  } else if (compare < 0) {
-compare = -1;
+  long firstValue = ((ByteBuffer) (ByteBuffer.allocate(8).put(first).flip())).getLong();
+  long secondValue = ((ByteBuffer) (ByteBuffer.allocate(8).put(second).flip())).getLong();
+  if (firstValue > secondValue) {
+return 1;
+  } else if (firstValue < secondValue) {
+return -1;
   }
-  return (int) compare;
+  return 0;
 } else if (DataTypes.isDecimal(dataType)) {
  return DataTypeUtil.byteToBigDecimal(first).compareTo(DataTypeUtil.byteToBigDecimal(second));
 } else {
diff --git a/store/sdk/src/test/java/org/apache/carbondata/sdk/file/MinMaxTest.java b/store/sdk/src/test/java/org/apache/carbondata/sdk/file/MinMaxTest.java
new file mode 100644
index 000..c26fdd5
--- /dev/null
+++ b/store/sdk/src/test/java/org/apache/carbondata/sdk/file/MinMaxTest.java
@@ -0,0 +1,161 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES

[carbondata] 10/33: [CARBONDATA-3505] Drop database cascade fix

2019-10-03 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 2328707b4477a11b7713aee8f123b780ad48cc25
Author: kunal642 
AuthorDate: Tue Aug 27 14:49:58 2019 +0530

[CARBONDATA-3505] Drop database cascade fix

Problem: When two databases are created at the same location and one of them is
dropped, the folder is also deleted from the backend. If we then try to drop the
second database, it will try to look up the other table, but the schema file no
longer exists in the backend and the drop will fail.

Solution: Add a check to call CarbonDropDatabaseCommand only if the database
location exists in the backend (see the sketch below).
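
A schematic sketch of the guard (the interface here is hypothetical; the actual change lives in the Scala DDLStrategy/CarbonEnv code shown below):

    public class DropDatabaseGuard {
      interface Catalog {
        boolean databaseLocationExists(String dbName); // cf. CarbonEnv.databaseLocationExists
        void carbonDropDatabase(String dbName);        // drops metadata and the shared folder
        void sparkDropDatabase(String dbName);         // drops metadata only
      }

      // Only the carbon command touches the shared folder; if another database
      // at the same location already removed it, fall back to the plain Spark
      // drop so the second DROP DATABASE does not fail.
      static void dropDatabase(Catalog catalog, String dbName) {
        if (catalog.databaseLocationExists(dbName)) {
          catalog.carbonDropDatabase(dbName);
        } else {
          catalog.sparkDropDatabase(dbName);
        }
      }
    }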

This closes #3365
---
 .../main/scala/org/apache/spark/sql/CarbonEnv.scala   | 19 ++-
 .../command/cache/CarbonShowCacheCommand.scala|  4 ++--
 .../spark/sql/execution/strategy/DDLStrategy.scala|  4 +++-
 .../apache/spark/sql/hive/CarbonFileMetastore.scala   |  4 ++--
 4 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/CarbonEnv.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/CarbonEnv.scala
index 1cbd156..f2a52d2 100644
--- a/integration/spark2/src/main/scala/org/apache/spark/sql/CarbonEnv.scala
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/CarbonEnv.scala
@@ -20,7 +20,7 @@ package org.apache.spark.sql
 import java.util.concurrent.ConcurrentHashMap
 
 import org.apache.spark.sql.catalyst.TableIdentifier
-import org.apache.spark.sql.catalyst.analysis.NoSuchTableException
+import org.apache.spark.sql.catalyst.analysis.{NoSuchDatabaseException, NoSuchTableException}
 import org.apache.spark.sql.catalyst.catalog.SessionCatalog
import org.apache.spark.sql.events.{MergeBloomIndexEventListener, MergeIndexEventListener}
 import org.apache.spark.sql.execution.command.cache._
@@ -267,6 +267,23 @@ object CarbonEnv {
   }
 
   /**
+   * Returns true when the database folder exists in the file system, false in all other scenarios.
+   */
+  def databaseLocationExists(dbName: String,
+  sparkSession: SparkSession, ifExists: Boolean): Boolean = {
+try {
+  FileFactory.getCarbonFile(getDatabaseLocation(dbName, sparkSession)).exists()
+} catch {
+  case e: NoSuchDatabaseException =>
+if (ifExists) {
+  false
+} else {
+  throw e
+}
+}
+  }
+
+  /**
* The method returns the database location
* if carbon.storeLocation does point to spark.sql.warehouse.dir then returns
* the database locationUri as database location else follows the old behaviour
diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonShowCacheCommand.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonShowCacheCommand.scala
index 45e811a..4b7f680 100644
--- a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonShowCacheCommand.scala
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonShowCacheCommand.scala
@@ -443,9 +443,9 @@ case class CarbonShowCacheCommand(tableIdentifier: Option[TableIdentifier],
   case (_, _, sum, provider) =>
 provider.toLowerCase match {
   case `bloomFilterIdentifier` =>
-allIndexSize += sum
-  case _ =>
 allDatamapSize += sum
+  case _ =>
+allIndexSize += sum
 }
 }
 (allIndexSize, allDatamapSize)
diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/strategy/DDLStrategy.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/strategy/DDLStrategy.scala
index 4791687..3ef8cfa 100644
--- a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/strategy/DDLStrategy.scala
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/strategy/DDLStrategy.scala
@@ -37,6 +37,7 @@ import org.apache.spark.util.{CarbonReflectionUtils, DataMapUtil, FileUtils, Spa
 
import org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException
 import org.apache.carbondata.common.logging.LogServiceFactory
+import org.apache.carbondata.core.datastore.impl.FileFactory
 import org.apache.carbondata.core.metadata.schema.table.CarbonTable
import org.apache.carbondata.core.util.{CarbonProperties, DataTypeUtil, ThreadLocalSessionInfo}
 import org.apache.carbondata.spark.util.Util
@@ -115,7 +116,8 @@ class DDLStrategy(sparkSession: SparkSession) extends SparkStrategy {
  .setConfigurationToCurrentThread(sparkSession.sessionState.newHadoopConf())
FileUtils.createDatabaseDirectory(dbName, dbLocation, sparkSession.sparkContext)
 ExecutedCommandExec(createDb) :: Nil
-  case drop@DropDatabaseCommand(dbName, ifEx

[carbondata] 30/33: [CARBONDATA-3523] Store data file size into index file

2019-10-03 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 49e9ea3c6a75532e4f51d924bce4334687597a9c
Author: QiangCai 
AuthorDate: Tue Aug 13 10:25:31 2019 +0800

[CARBONDATA-3523] Store data file size into index file

In BlockIndex, the file_size is always zero. We can set the actual value during
data loading and use it during the query to improve the query performance.

1. avoid invoking listFiles for each segment
2. avoid invoking getFileStatus for each data file
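
A minimal sketch of the read-side benefit (simplified; the real logic is in AbstractDataFileFooterConverter in the diff below): prefer the size stored in the index file, and stat the file only for legacy index files without file_size.

    public class FileSizeFromIndex {
      static long resolveFileSize(boolean hasStoredSize, long storedSize, String filePath) {
        if (hasStoredSize) {
          return storedSize; // written once at data-load time
        }
        // per-file filesystem stat -- the cost this patch avoids
        return new java.io.File(filePath).length();
      }
    }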

This closes #3356
---
 .../core/datastore/block/TableBlockInfo.java   | 13 +
 .../carbondata/core/metadata/index/BlockIndexInfo.java | 18 ++
 .../core/util/AbstractDataFileFooterConverter.java |  3 +++
 .../carbondata/core/util/BlockletDataMapUtil.java  | 17 ++---
 .../carbondata/core/util/CarbonMetadataUtil.java   |  1 +
 .../store/writer/AbstractFactDataWriter.java   |  5 +++--
 .../store/writer/v3/CarbonFactDataWriterImplV3.java| 18 ++
 7 files changed, 62 insertions(+), 13 deletions(-)

diff --git a/core/src/main/java/org/apache/carbondata/core/datastore/block/TableBlockInfo.java b/core/src/main/java/org/apache/carbondata/core/datastore/block/TableBlockInfo.java
index 25d82f8..4dd1403 100644
--- a/core/src/main/java/org/apache/carbondata/core/datastore/block/TableBlockInfo.java
+++ b/core/src/main/java/org/apache/carbondata/core/datastore/block/TableBlockInfo.java
@@ -54,6 +54,11 @@ public class TableBlockInfo implements Distributable, Serializable {
   private String filePath;
 
   /**
+   * file size of the block
+   */
+  private long fileSize;
+
+  /**
* block offset in the file
*/
   private long blockOffset;
@@ -439,6 +444,14 @@ public class TableBlockInfo implements Distributable, Serializable {
 this.filePath = filePath;
   }
 
+  public long getFileSize() {
+return fileSize;
+  }
+
+  public void setFileSize(long fileSize) {
+this.fileSize = fileSize;
+  }
+
   public BlockletDetailInfo getDetailInfo() {
 return detailInfo;
   }
diff --git a/core/src/main/java/org/apache/carbondata/core/metadata/index/BlockIndexInfo.java b/core/src/main/java/org/apache/carbondata/core/metadata/index/BlockIndexInfo.java
index ae99ed8..f7f2d3c 100644
--- a/core/src/main/java/org/apache/carbondata/core/metadata/index/BlockIndexInfo.java
+++ b/core/src/main/java/org/apache/carbondata/core/metadata/index/BlockIndexInfo.java
@@ -51,6 +51,11 @@ public class BlockIndexInfo {
   private BlockletInfo blockletInfo;
 
   /**
+   * file size
+   */
+  private long fileSize;
+
+  /**
* Constructor
*
* @param numberOfRows  number of rows
@@ -80,6 +85,12 @@ public class BlockIndexInfo {
 this.blockletInfo = blockletInfo;
   }
 
+  public BlockIndexInfo(long numberOfRows, String fileName, long offset,
+  BlockletIndex blockletIndex, BlockletInfo blockletInfo, long fileSize) {
+this(numberOfRows, fileName, offset, blockletIndex, blockletInfo);
+this.fileSize = fileSize;
+  }
+
   /**
* @return the numberOfRows
*/
@@ -114,4 +125,11 @@ public class BlockIndexInfo {
   public BlockletInfo getBlockletInfo() {
 return blockletInfo;
   }
+
+  /**
+   * @return file size
+   */
+  public long getFileSize() {
+return fileSize;
+  }
 }
diff --git a/core/src/main/java/org/apache/carbondata/core/util/AbstractDataFileFooterConverter.java b/core/src/main/java/org/apache/carbondata/core/util/AbstractDataFileFooterConverter.java
index 64d30c2..f16a3ae 100644
--- a/core/src/main/java/org/apache/carbondata/core/util/AbstractDataFileFooterConverter.java
+++ b/core/src/main/java/org/apache/carbondata/core/util/AbstractDataFileFooterConverter.java
@@ -244,6 +244,9 @@ public abstract class AbstractDataFileFooterConverter {
 }
fileName = (CarbonCommonConstants.FILE_SEPARATOR + fileName).replaceAll("//", "/");
 tableBlockInfo.setFilePath(parentPath + fileName);
+if (readBlockIndexInfo.isSetFile_size()) {
+  tableBlockInfo.setFileSize(readBlockIndexInfo.getFile_size());
+}
 return tableBlockInfo;
   }
 
diff --git a/core/src/main/java/org/apache/carbondata/core/util/BlockletDataMapUtil.java b/core/src/main/java/org/apache/carbondata/core/util/BlockletDataMapUtil.java
index 6cd60a2..5a988c4 100644
--- a/core/src/main/java/org/apache/carbondata/core/util/BlockletDataMapUtil.java
+++ b/core/src/main/java/org/apache/carbondata/core/util/BlockletDataMapUtil.java
@@ -38,9 +38,11 @@ import org.apache.carbondata.common.logging.LogServiceFactory;
 import org.apache.carbondata.core.constants.CarbonCommonConstants;
 import org.apache.carbondata.core.datamap.Segment;
 import org.apache.carbondata.core.datastore.block.SegmentProperties;
+import org.apache.carbondata.core.datastore.block.Tab

[carbondata] 03/33: [CARBONDATA-3493] Initialize Profiler in CarbonEnv

2019-10-03 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 90b6c648d3b20e563e685b9aa8c8bafcefbcf3ad
Author: akashrn5 
AuthorDate: Wed Jul 31 18:54:41 2019 +0530

[CARBONDATA-3493] Initialize Profiler in CarbonEnv

Problem: After enabling "enable.query.statistics", an exception is
thrown while querying because the profiler is not initialized before
setting up the RPC endpoint connection.

Solution: Initialized the Profiler in CarbonEnv before setting up the
RPC endpoint connection.

This closes #3342

Co-authored-by: shivamasn 
---
 integration/spark2/src/main/scala/org/apache/spark/sql/CarbonEnv.scala  | 2 ++
 .../spark2/src/main/scala/org/apache/spark/sql/CarbonSession.scala  | 2 --
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/CarbonEnv.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/CarbonEnv.scala
index c13e7b9..1cbd156 100644
--- a/integration/spark2/src/main/scala/org/apache/spark/sql/CarbonEnv.scala
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/CarbonEnv.scala
@@ -28,6 +28,7 @@ import org.apache.spark.sql.execution.command.mv._
 import org.apache.spark.sql.execution.command.preaaggregate._
 import org.apache.spark.sql.execution.command.timeseries.TimeSeriesFunction
 import org.apache.spark.sql.hive._
+import org.apache.spark.sql.profiler.Profiler
 
 import org.apache.carbondata.common.logging.LogServiceFactory
 import org.apache.carbondata.core.constants.CarbonCommonConstants
@@ -121,6 +122,7 @@ class CarbonEnv {
 initialized = true
   }
 }
+Profiler.initialize(sparkSession.sparkContext)
 LOGGER.info("Initialize CarbonEnv completed...")
   }
 }
diff --git a/integration/spark2/src/main/scala/org/apache/spark/sql/CarbonSession.scala b/integration/spark2/src/main/scala/org/apache/spark/sql/CarbonSession.scala
index 7b1bf4c..deefcd1 100644
--- a/integration/spark2/src/main/scala/org/apache/spark/sql/CarbonSession.scala
+++ b/integration/spark2/src/main/scala/org/apache/spark/sql/CarbonSession.scala
@@ -259,8 +259,6 @@ object CarbonSession {
 }
options.foreach { case (k, v) => session.sessionState.conf.setConfString(k, v) }
 SparkSession.setDefaultSession(session)
-// Setup monitor end point and register CarbonMonitorListener
-Profiler.initialize(sparkContext)
// Register a successfully instantiated context to the singleton. This should be at the
// end of the class definition so that the singleton is updated only if there is no
// exception in the construction of the instance.
 // exception in the construction of the instance.



[carbondata] 01/33: [CARBONDATA-3480] Fixed unnecessary refresh for table by removing modified mdt file

2019-10-03 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 56879f747d150f4efae0f2998a38f4297706bc5e
Author: kunal642 
AuthorDate: Fri Jul 26 14:52:36 2019 +0530

[CARBONDATA-3480] Fixed unnecessary refresh for table by removing modified mdt file

This closes #3339
---
 .../carbondata/core/datamap/DataMapFilter.java |  47 +++
 .../core/datamap/DataMapStoreManager.java  |  14 +-
 .../carbondata/core/metadata/CarbonMetadata.java   |   9 +
 .../core/metadata/schema/table/CarbonTable.java|   4 +-
 .../core/metadata/schema/table/TableSchema.java|   4 +
 .../statusmanager/SegmentUpdateStatusManager.java  |  26 --
 .../apache/carbondata/core/util/CarbonUtil.java|   1 -
 .../core/metadata/CarbonMetadataTest.java  |   7 +-
 .../ThriftWrapperSchemaConverterImplTest.java  |   4 +-
 .../metadata/schema/table/CarbonTableTest.java |   8 +-
 .../table/CarbonTableWithComplexTypesTest.java |   6 +-
 .../dblocation/DBLocationCarbonTableTestCase.scala |  25 --
 .../apache/spark/sql/hive/CarbonSessionUtil.scala  |   6 +-
 .../carbondata/indexserver/IndexServer.scala   |  10 +-
 .../scala/org/apache/spark/sql/CarbonEnv.scala |  51 ++-
 .../command/datamap/CarbonDropDataMapCommand.scala |   1 -
 .../management/RefreshCarbonTableCommand.scala |   2 -
 .../CarbonAlterTableDropPartitionCommand.scala |  12 +-
 .../CarbonAlterTableSplitPartitionCommand.scala|   3 -
 .../command/preaaggregate/PreAggregateUtil.scala   |  19 +-
 .../command/table/CarbonDropTableCommand.scala |  13 +
 .../spark/sql/hive/CarbonFileMetastore.scala   | 425 +
 .../spark/sql/hive/CarbonHiveMetaStore.scala   |  10 +-
 .../apache/spark/sql/hive/CarbonMetaStore.scala|  10 +-
 .../scala/org/apache/spark/util/CleanFiles.scala   |   3 -
 .../scala/org/apache/spark/util/Compaction.scala   |   2 -
 .../apache/spark/util/DeleteSegmentByDate.scala|   2 -
 .../org/apache/spark/util/DeleteSegmentById.scala  |   2 -
 .../scala/org/apache/spark/util/TableLoader.scala  |   2 -
 .../apache/spark/sql/hive/CarbonSessionState.scala |  31 +-
 .../AlterTableColumnRenameTestCase.scala   |   4 +-
 31 files changed, 322 insertions(+), 441 deletions(-)

diff --git a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapFilter.java b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapFilter.java
index c20d0d5..ac4886d 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapFilter.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapFilter.java
@@ -18,10 +18,15 @@
 package org.apache.carbondata.core.datamap;
 
 import java.io.Serializable;
+import java.util.HashSet;
+import java.util.Set;
 
 import org.apache.carbondata.core.datastore.block.SegmentProperties;
 import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
+import org.apache.carbondata.core.metadata.schema.table.column.CarbonDimension;
+import org.apache.carbondata.core.metadata.schema.table.column.CarbonMeasure;
 import org.apache.carbondata.core.scan.executor.util.RestructureUtil;
+import org.apache.carbondata.core.scan.expression.ColumnExpression;
 import org.apache.carbondata.core.scan.expression.Expression;
 import org.apache.carbondata.core.scan.filter.resolver.FilterResolverIntf;
 
@@ -39,9 +44,51 @@ public class DataMapFilter implements Serializable {
   public DataMapFilter(CarbonTable table, Expression expression) {
 this.table = table;
 this.expression = expression;
+if (expression != null) {
+  checkIfFilterColumnExistsInTable();
+}
 resolve();
   }
 
+  private Set<String> extractColumnExpressions(Expression expression) {
+Set<String> columnExpressionList = new HashSet<>();
+for (Expression expressions: expression.getChildren()) {
+  if (expressions != null && expressions.getChildren() != null
+  && expressions.getChildren().size() > 0) {
+columnExpressionList.addAll(extractColumnExpressions(expressions));
+  } else if (expressions instanceof ColumnExpression) {
+columnExpressionList.add(((ColumnExpression) expressions).getColumnName());
+  }
+}
+return columnExpressionList;
+  }
+
+  private void checkIfFilterColumnExistsInTable() {
+Set<String> columnExpressionList = extractColumnExpressions(expression);
+for (String colExpression : columnExpressionList) {
+  if (colExpression.equalsIgnoreCase("positionid")) {
+continue;
+  }
+  boolean exists = false;
+  for (CarbonMeasure carbonMeasure : table.getAllMeasures()) {
+if (!carbonMeasure.isInvisible() && carbonMeasure.getColName()
+.equalsIgnoreCase(colExpression)) {
+  exists = true;
+}
+  }
+  for (CarbonDimension carbonDimension : table.getAllDimensions()) 

[carbondata] 13/33: [CARBONDATA-3452] dictionary include udf handle all the scenarios

2019-10-03 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 6f90b28dd3d3f008f33668884807cc63cb6b5db5
Author: ajantha-bhat 
AuthorDate: Wed Aug 14 20:36:13 2019 +0530

[CARBONDATA-3452] dictionary include udf handle all the scenarios

Problem: A select query fails when a substring is applied to a dictionary
column in a join.
Cause: When dictionary include is present, the data type is updated from string
to int in the plan attribute, so the substring is unresolved on the int column.
The join operation then tries to reference this unresolved attribute.
Solution: Handle this for all the scenarios in CarbonLateDecodeRule.

This closes #3358
---
 .../hadoop/api/CarbonTableOutputFormat.java|   5 +-
 .../spark/sql/optimizer/CarbonLateDecodeRule.scala | 141 ++---
 .../carbondata/query/SubQueryJoinTestSuite.scala   |  19 +++
 .../processing/util/CarbonDataProcessorUtil.java   |   5 +-
 4 files changed, 120 insertions(+), 50 deletions(-)

diff --git a/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableOutputFormat.java b/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableOutputFormat.java
index 9ba5e97..16703bf 100644
--- a/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableOutputFormat.java
+++ b/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableOutputFormat.java
@@ -19,6 +19,7 @@ package org.apache.carbondata.hadoop.api;
 
 import java.io.IOException;
 import java.util.List;
+import java.util.UUID;
 import java.util.concurrent.ExecutionException;
 import java.util.concurrent.ExecutorService;
 import java.util.concurrent.Executors;
@@ -221,8 +222,8 @@ public class CarbonTableOutputFormat extends 
FileOutputFormat attr
+  case a@Alias(attr: AttributeReference, _) => a
+  case others =>
+// datatype need to change for dictionary columns if only alias
+// or attribute ref present.
+// If anything else present, no need to change data type.
+needChangeDatatype = false
+others
+}
+needChangeDatatype
+  }
+
   private def updateTempDecoder(plan: LogicalPlan,
   aliasMapOriginal: CarbonAliasDecoderRelation,
   attrMap: java.util.HashMap[AttributeReferenceWrapper, 
CarbonDecoderRelation]):
@@ -650,44 +665,71 @@ class CarbonLateDecodeRule extends Rule[LogicalPlan] with PredicateHelper {
 cd
   case sort: Sort =>
 val sortExprs = sort.order.map { s =>
-  s.transform {
-case attr: AttributeReference =>
-  updateDataType(attr, attrMap, allAttrsNotDecode, aliasMap)
-  }.asInstanceOf[SortOrder]
+  if (needDataTypeUpdate(s)) {
+s.transform {
+  case attr: AttributeReference =>
+updateDataType(attr, attrMap, allAttrsNotDecode, aliasMap)
+}.asInstanceOf[SortOrder]
+  } else {
+s
+  }
 }
 Sort(sortExprs, sort.global, sort.child)
   case agg: Aggregate if !agg.child.isInstanceOf[CarbonDictionaryCatalystDecoder] =>
 val aggExps = agg.aggregateExpressions.map { aggExp =>
-  aggExp.transform {
-case attr: AttributeReference =>
-  updateDataType(attr, attrMap, allAttrsNotDecode, aliasMap)
+  if (needDataTypeUpdate(aggExp)) {
+aggExp.transform {
+  case attr: AttributeReference =>
+updateDataType(attr, attrMap, allAttrsNotDecode, aliasMap)
+}
+  } else {
+aggExp
   }
 }.asInstanceOf[Seq[NamedExpression]]
-
 val grpExps = agg.groupingExpressions.map { gexp =>
-  gexp.transform {
-case attr: AttributeReference =>
-  updateDataType(attr, attrMap, allAttrsNotDecode, aliasMap)
+  if (needDataTypeUpdate(gexp)) {
+gexp.transform {
+  case attr: AttributeReference =>
+updateDataType(attr, attrMap, allAttrsNotDecode, aliasMap)
+}
+  } else {
+gexp
   }
 }
 Aggregate(grpExps, aggExps, agg.child)
   case expand: Expand =>
-val ex = expand.transformExpressions {
-  case attr: AttributeReference =>
-updateDataType(attr, attrMap, allAttrsNotDecode, aliasMap)
+// can't use needDataTypeUpdate here as argument is of type Expand
+var needChangeDatatype: Boolean = true
+expand.transformExpressions {
+  case attr: AttributeReference => attr
+  case a@Alias(attr: AttributeReference, _) => a
+  case others =>
+// datatype need to change for dictionary columns if only alias
+// or attribute ref present.
+// If anything else present, no need to change data type.
+

[carbondata] 23/33: [CARBONDATA-3489] Optimized the comparator instances in sort

2019-10-03 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 9e4664742f04b4995bd8a2ecd465e8797ce7c2d2
Author: Vikram Ahuja 
AuthorDate: Tue Aug 6 12:21:11 2019 +0530

[CARBONDATA-3489] Optimized the comparator instances in sort

Root cause: During sorting, the comparator classes (NewRowComparator,
RawRowComparator, IntermediateSortTempRowComparator and UnsafeRowComparator)
create a new SerializableComparator object in the compare method every time
two objects are passed for comparison.

Solution: We can reduce the number of SerializableComparator objects that
are created by storing the SerializableComparators of primitive datatypes
and reusing them, instead of creating a new SerializableComparator every
time.
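
As a minimal caller-side sketch of the reuse pattern (assuming the
Comparator/SerializableComparator API shown in the diff below; rowA, rowB
and ordinal are hypothetical):

    // Fetch the shared comparator once and reuse it for every comparison,
    // instead of allocating a new comparator inside each compare() call.
    SerializableComparator comparator = Comparator.getComparator(DataTypes.INT);
    int result = comparator.compare(rowA[ordinal], rowB[ordinal]);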

This closes #3354
---
 .../core/util/comparator/Comparator.java   |  49 +++---
 .../partition/impl/RawRowComparatorTest.java   | 142 
 .../IntermediateSortTempRowComparatorTest.java | 178 +
 .../sort/sortdata/NewRowComparatorTest.java| 109 +
 4 files changed, 453 insertions(+), 25 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/util/comparator/Comparator.java 
b/core/src/main/java/org/apache/carbondata/core/util/comparator/Comparator.java
index 6981405..d7e8f80 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/util/comparator/Comparator.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/util/comparator/Comparator.java
@@ -25,24 +25,23 @@ import org.apache.carbondata.core.util.ByteUtil;
 
 public final class Comparator {
 
+  //Comparators are made static so that only one instance is generated
+  private static final SerializableComparator BOOLEAN  = new 
BooleanSerializableComparator();
+  private static final SerializableComparator INT = new 
IntSerializableComparator();
+  private static final SerializableComparator SHORT = new 
ShortSerializableComparator();
+  private static final SerializableComparator DOUBLE = new 
DoubleSerializableComparator();
+  private static final SerializableComparator FLOAT = new 
FloatSerializableComparator();
+  private static final SerializableComparator LONG = new 
LongSerializableComparator();
+  private static final SerializableComparator DECIMAL  = new 
BigDecimalSerializableComparator();
+  private static final SerializableComparator BYTE = new 
ByteArraySerializableComparator();
+
   public static SerializableComparator getComparator(DataType dataType) {
-if (dataType == DataTypes.BOOLEAN) {
-  return new BooleanSerializableComparator();
-} else if (dataType == DataTypes.INT) {
-  return new IntSerializableComparator();
-} else if (dataType == DataTypes.SHORT) {
-  return new ShortSerializableComparator();
-} else if (dataType == DataTypes.DOUBLE) {
-  return new DoubleSerializableComparator();
-} else if (dataType == DataTypes.FLOAT) {
-  return new FloatSerializableComparator();
-} else if (dataType == DataTypes.LONG || dataType == DataTypes.DATE
-|| dataType == DataTypes.TIMESTAMP) {
-  return new LongSerializableComparator();
-} else if (DataTypes.isDecimal(dataType)) {
-  return new BigDecimalSerializableComparator();
+if (dataType == DataTypes.DATE || dataType == DataTypes.TIMESTAMP) {
+  return LONG;
+} else if (dataType == DataTypes.STRING) {
+  return BYTE;
 } else {
-  return new ByteArraySerializableComparator();
+  return getComparatorByDataTypeForMeasure(dataType);
 }
   }
 
@@ -54,21 +53,21 @@ public final class Comparator {
*/
   public static SerializableComparator 
getComparatorByDataTypeForMeasure(DataType dataType) {
 if (dataType == DataTypes.BOOLEAN) {
-  return new BooleanSerializableComparator();
+  return BOOLEAN;
 } else if (dataType == DataTypes.INT) {
-  return new IntSerializableComparator();
+  return INT;
 } else if (dataType == DataTypes.SHORT) {
-  return new ShortSerializableComparator();
+  return SHORT;
 } else if (dataType == DataTypes.LONG) {
-  return new LongSerializableComparator();
+  return LONG;
 } else if (dataType == DataTypes.DOUBLE) {
-  return new DoubleSerializableComparator();
+  return DOUBLE;
 } else if (dataType == DataTypes.FLOAT) {
-  return new FloatSerializableComparator();
+  return FLOAT;
 } else if (DataTypes.isDecimal(dataType)) {
-  return new BigDecimalSerializableComparator();
+  return DECIMAL;
 } else if (dataType == DataTypes.BYTE) {
-  return new ByteArraySerializableComparator();
+  return BYTE;
 } else {
   throw new IllegalArgumentException("Unsupported data type: " + 
dataType.getName());
 }
@@ -198,4 +197,4 @@ class BigDecimalSerializable

[carbondata] 25/33: [DOC] Update doc for alter sort_columns

2019-10-03 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit ab86705f64f460b639e05a110d6a4e13977cc773
Author: QiangCai 
AuthorDate: Fri Sep 20 09:54:52 2019 +0800

[DOC] Update doc for alter sort_columns

This closes #3395
---
 docs/ddl-of-carbondata.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/ddl-of-carbondata.md b/docs/ddl-of-carbondata.md
index 7ab0e5f..9c9a02f 100644
--- a/docs/ddl-of-carbondata.md
+++ b/docs/ddl-of-carbondata.md
@@ -817,7 +817,7 @@ Users can specify which columns to include and exclude for 
local dictionary gene
```
 
**NOTE:**
-* The future version will enhance "custom" compaction to sort the old 
segment one by one.
+* The "custom" compaction support re-sorting the old segment one by 
one in version 1.6 or later.
 * The streaming table is not supported for SORT_COLUMNS modification.
 * If the inverted index columns are removed from the new SORT_COLUMNS, 
they will not 
 create the inverted index. But the old configuration of INVERTED_INDEX 
will be kept.



[carbondata] 06/33: [CARBONDATA-3509] Support disable query prefetch by configuration

2019-10-03 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 129b1634aa8eab29e6ec75653a9d9bd451d965b1
Author: ajantha-bhat 
AuthorDate: Fri Aug 30 11:08:09 2019 +0530

[CARBONDATA-3509] Support disable query prefetch by configuration

Support disabling query prefetch by configuration:
Prefetch runs in an async thread during query and is always enabled in the
query flow. If a user wants to disable it, they can use this property to do
so and observe the effect in the logs.
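
A hedged usage sketch (the property key comes from CarbonCommonConstants in
the diff below; setting it programmatically via CarbonProperties is one way
to apply it):

    // Disable query prefetch before firing queries; the effect can be
    // verified through the "Query prefetch is: ..." log added below.
    CarbonProperties.getInstance()
        .addProperty("carbon.query.prefetch.enable", "false");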

This closes #3370
---
 .../carbondata/core/constants/CarbonCommonConstants.java |  9 +
 .../core/scan/executor/impl/AbstractQueryExecutor.java   |  2 ++
 .../apache/carbondata/core/scan/model/QueryModel.java|  4 +++-
 .../apache/carbondata/core/util/CarbonProperties.java| 16 
 docs/configuration-parameters.md |  1 +
 5 files changed, 31 insertions(+), 1 deletion(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
index 17b191d..67fa13f 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
@@ -1483,6 +1483,15 @@ public final class CarbonCommonConstants {
 
   public static final String 
CARBON_MAX_EXECUTOR_THREADS_FOR_BLOCK_PRUNING_DEFAULT = "4";
 
+  /*
+   * whether to enable prefetch for query
+   */
+  @CarbonProperty
+  public static final String CARBON_QUERY_PREFETCH_ENABLE =
+  "carbon.query.prefetch.enable";
+
+  public static final String CARBON_QUERY_PREFETCH_ENABLE_DEFAULT = "true";
+
   
//
   // Datamap parameter start here
   
//
diff --git 
a/core/src/main/java/org/apache/carbondata/core/scan/executor/impl/AbstractQueryExecutor.java
 
b/core/src/main/java/org/apache/carbondata/core/scan/executor/impl/AbstractQueryExecutor.java
index b3d4780..6760e77 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/scan/executor/impl/AbstractQueryExecutor.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/scan/executor/impl/AbstractQueryExecutor.java
@@ -493,7 +493,9 @@ public abstract class AbstractQueryExecutor implements 
QueryExecutor {
 segmentProperties.getDimensionOrdinalToChunkMapping().size());
 if (queryModel.isReadPageByPage()) {
   blockExecutionInfo.setPrefetchBlocklet(false);
+  LOGGER.info("Query prefetch is: false, read page by page");
 } else {
+  LOGGER.info("Query prefetch is: " + queryModel.isPreFetchData());
   blockExecutionInfo.setPrefetchBlocklet(queryModel.isPreFetchData());
 }
 // In case of fg datamap it should not go to direct fill.
diff --git 
a/core/src/main/java/org/apache/carbondata/core/scan/model/QueryModel.java 
b/core/src/main/java/org/apache/carbondata/core/scan/model/QueryModel.java
index 267527f..4d10492 100644
--- a/core/src/main/java/org/apache/carbondata/core/scan/model/QueryModel.java
+++ b/core/src/main/java/org/apache/carbondata/core/scan/model/QueryModel.java
@@ -34,6 +34,7 @@ import 
org.apache.carbondata.core.scan.expression.UnknownExpression;
 import 
org.apache.carbondata.core.scan.expression.conditional.ConditionalExpression;
 import org.apache.carbondata.core.scan.filter.resolver.FilterResolverIntf;
 import org.apache.carbondata.core.stats.QueryStatisticsRecorder;
+import org.apache.carbondata.core.util.CarbonProperties;
 import org.apache.carbondata.core.util.CarbonUtil;
 import org.apache.carbondata.core.util.DataTypeConverter;
 
@@ -110,7 +111,7 @@ public class QueryModel {
   // whether to clear/free unsafe memory or not
   private boolean freeUnsafeMemory = true;
 
-  private boolean preFetchData = true;
+  private boolean preFetchData;
 
   /**
* It fills the vector directly from decoded column page with out any 
staging and conversions.
@@ -125,6 +126,7 @@ public class QueryModel {
 tableBlockInfos = new ArrayList();
 this.table = carbonTable;
 this.queryId = String.valueOf(System.nanoTime());
+this.preFetchData = CarbonProperties.getQueryPrefetchEnable();
   }
 
   public static QueryModel newInstance(CarbonTable carbonTable) {
diff --git 
a/core/src/main/java/org/apache/carbondata/core/util/CarbonProperties.java 
b/core/src/main/java/org/apache/carbondata/core/util/CarbonProperties.java
index c60dad8..adf4905 100644
--- a/core/src/main/java/org/apache/carbondata/core/util/CarbonProperties.java
+++ b/core/src/main/java/org/apache/carbondata/core/util/CarbonProperties.java
@@ -1773

[carbondata] 09/33: [CARBONDATA-3502] Select query with UDF having Match expression inside IN expression Fails

2019-10-03 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 75e207cf7ac3fa262ba04cc3c8d7f2d902256882
Author: manishnalla1994 
AuthorDate: Mon Aug 26 17:25:34 2019 +0530

[CARBONDATA-3502] Select query with UDF having Match expression
inside IN expression Fails

Problem: A select query with a UDF having a Match expression inside
an IN expression fails with an ArrayIndexOutOfBounds exception.

Cause: The expression should not be treated as a Match expression;
instead it should be treated as a SparkUnknownExpression.

Solution: Removed the check for Match Expression as it was only added
for Lucene Search mode, which is no longer present.

This closes #3363
---
 .../src/main/scala/org/apache/spark/sql/optimizer/CarbonFilters.scala   | 2 --
 1 file changed, 2 deletions(-)

diff --git 
a/integration/spark2/src/main/scala/org/apache/spark/sql/optimizer/CarbonFilters.scala
 
b/integration/spark2/src/main/scala/org/apache/spark/sql/optimizer/CarbonFilters.scala
index c4415f8..0fd07bb 100644
--- 
a/integration/spark2/src/main/scala/org/apache/spark/sql/optimizer/CarbonFilters.scala
+++ 
b/integration/spark2/src/main/scala/org/apache/spark/sql/optimizer/CarbonFilters.scala
@@ -425,8 +425,6 @@ object CarbonFilters {
 new AndExpression(l, r)
   case strTrim: StringTrim if isStringTrimCompatibleWithCarbon(strTrim) =>
 transformExpression(strTrim)
-  case s: ScalaUDF =>
-new MatchExpression(s.children.head.toString())
   case _ =>
 new SparkUnknownExpression(expr.transform {
   case AttributeReference(name, dataType, _, _) =>



[carbondata] 27/33: [CARBONDATA-3473] Fix data size calculation of the last column in CarbonCli

2019-10-03 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 57309d70d08675c31975d2a60692835e7a6c22cf
Author: Manhua 
AuthorDate: Wed Jul 17 17:39:29 2019 +0800

[CARBONDATA-3473] Fix data size calculation of the last column in CarbonCli

When updating the last column chunk data size, the current code uses
columnDataSize.add(fileSizeInBytes - footerSizeInBytes - previousChunkOffset)
for every blocklet. This leads to a wrong result when calculating the data
size of the last column, especially when a carbon data file has multiple
blocklets.

In this PR, we fix this problem and modify the calculation by marking the
end offset of each blocklet.
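
A sketch of the corrected arithmetic with hypothetical names (the real
field names appear in the patch below):

    // Hypothetical helper mirroring the fix: compute the end offset of a
    // blocklet, then size the last column chunk against it.
    static long lastColumnDataSize(long fileSizeInBytes, long footerSizeInBytes,
        long nextBlockletFirstChunkOffset, long previousChunkOffset,
        boolean isLastBlocklet) {
      // End offset of blocklet j: the first chunk offset of blocklet j + 1,
      // or the footer start for the last blocklet in the file.
      long blockletEndOffset = isLastBlocklet
          ? fileSizeInBytes - footerSizeInBytes
          : nextBlockletFirstChunkOffset;
      // Bounded by the blocklet end, not by the end of the whole file.
      return blockletEndOffset - previousChunkOffset;
    }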

This closes #3330
---
 .../java/org/apache/carbondata/tool/DataFile.java  | 32 +++---
 .../org/apache/carbondata/tool/CarbonCliTest.java  |  6 ++--
 2 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/tools/cli/src/main/java/org/apache/carbondata/tool/DataFile.java 
b/tools/cli/src/main/java/org/apache/carbondata/tool/DataFile.java
index e553a78..4ed3945 100644
--- a/tools/cli/src/main/java/org/apache/carbondata/tool/DataFile.java
+++ b/tools/cli/src/main/java/org/apache/carbondata/tool/DataFile.java
@@ -121,16 +121,21 @@ class DataFile {
 this.partNo = CarbonTablePath.DataFileUtil.getPartNo(fileName);
 
 // calculate blocklet size and column size
-// first calculate the header size, it equals the offset of first
-// column chunk in first blocklet
-long headerSizeInBytes = 
footer.blocklet_info_list3.get(0).column_data_chunks_offsets.get(0);
-long previousOffset = headerSizeInBytes;
-for (BlockletInfo3 blockletInfo3 : footer.blocklet_info_list3) {
+for (int j = 0; j < footer.getBlocklet_info_list3().size(); j++) {
+  // remark start and end offset of current blocklet for computing 
blocklet size
+  // and chunk data size of the last column
+  BlockletInfo3 blockletInfo3 = footer.blocklet_info_list3.get(j);
+  long blockletEndOffset;
+  if (j != footer.getBlocklet_info_list3().size() - 1) {
+// use start offset of next blocklet as end offset of current blocklet
+blockletEndOffset = footer.blocklet_info_list3.get(j + 
1).column_data_chunks_offsets.get(j);
+  } else {
+// use start offset of footer as end offset of current blocklet if it 
is the last blocklet
+blockletEndOffset = fileSizeInBytes - footerSizeInBytes;
+  }
   // calculate blocklet size in bytes
-  long blockletOffset = blockletInfo3.column_data_chunks_offsets.get(0);
-  blockletSizeInBytes.add(blockletOffset - previousOffset);
-  previousOffset = blockletOffset;
-
+  this.blockletSizeInBytes.add(
+  blockletEndOffset - 
blockletInfo3.column_data_chunks_offsets.get(0));
   // calculate column size in bytes for each column
   LinkedList columnDataSize = new LinkedList<>();
   LinkedList columnMetaSize = new LinkedList<>();
@@ -140,17 +145,12 @@ class DataFile {
 
columnMetaSize.add(blockletInfo3.column_data_chunks_length.get(i).longValue());
 previousChunkOffset = blockletInfo3.column_data_chunks_offsets.get(i);
   }
-  // last column chunk data size
-  columnDataSize.add(fileSizeInBytes - footerSizeInBytes - 
previousChunkOffset);
+  // update chunk data size of the last column
+  columnDataSize.add(blockletEndOffset - previousChunkOffset);
   columnDataSize.removeFirst();
   this.columnDataSizeInBytes.add(columnDataSize);
   this.columnMetaSizeInBytes.add(columnMetaSize);
-
 }
-// last blocklet size
-blockletSizeInBytes.add(
-fileSizeInBytes - footerSizeInBytes - headerSizeInBytes - 
previousOffset);
-this.blockletSizeInBytes.removeFirst();
 
 assert (blockletSizeInBytes.size() == getNumBlocklets());
   }
diff --git 
a/tools/cli/src/test/java/org/apache/carbondata/tool/CarbonCliTest.java 
b/tools/cli/src/test/java/org/apache/carbondata/tool/CarbonCliTest.java
index af8d51d..4d89777 100644
--- a/tools/cli/src/test/java/org/apache/carbondata/tool/CarbonCliTest.java
+++ b/tools/cli/src/test/java/org/apache/carbondata/tool/CarbonCliTest.java
@@ -234,11 +234,11 @@ public class CarbonCliTest {
 
 expectedOutput = buildLines(
 "BLK  BLKLT  Meta Size  Data Size  LocalDict  DictEntries  DictSize  
AvgPageSize  Min%  Max%   Min  Max  " ,
-"00  3.36KB 5.14MB false  00.0B  
93.76KB  0.0   100.0  0290  " ,
+"00  3.36KB 2.57MB false  00.0B  
93.76KB  0.0   100.0  0290  " ,
 "01  3.36KB 2.57MB false  00.0B  
93.76KB  0.0   100.0  1292  " ,
-"10  3.36KB 5.14MB false  0 

[carbondata] 20/33: [HOTFIX] fix missing quotation marks in datamap doc

2019-10-03 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 204f29047b2685d03f27febcec24095f56053a31
Author: lamber-ken <2217232...@qq.com>
AuthorDate: Wed Sep 11 21:48:21 2019 +0800

[HOTFIX] fix missing quotation marks in datamap doc

This closes #3383
---
 docs/datamap/datamap-management.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/datamap/datamap-management.md 
b/docs/datamap/datamap-management.md
index 199cd14..f910559 100644
--- a/docs/datamap/datamap-management.md
+++ b/docs/datamap/datamap-management.md
@@ -74,7 +74,7 @@ If user perform following command on the main table, system 
will return failure.
`ALTER TABLE RENAME`. Note that adding a new column is supported, and for 
dropping columns and
change datatype command, CarbonData will check whether it will impact the 
pre-aggregate table, if
 not, the operation is allowed, otherwise operation will be rejected by 
throwing exception.
-3. Partition management command: `ALTER TABLE ADD/DROP PARTITION
+3. Partition management command: `ALTER TABLE ADD/DROP PARTITION`.
 
 If user do want to perform above operations on the main table, user can first 
drop the datamap, perform the operation, and re-create the datamap again.
 



[carbondata] 12/33: [CARBONDATA-3513] fix 'taskNo' exceeding Long.MAX_VALUE issue when execute major compaction

2019-10-03 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 41ae280c16905687c7ea08b3cd05acef9e359c26
Author: changchun wang 
AuthorDate: Thu Sep 5 16:28:41 2019 +0800

[CARBONDATA-3513] fix 'taskNo' exceeding Long.MAX_VALUE issue when execute 
major compaction

Problem:
The major compaction command fails with an error:
java.lang.NumberFormatException is thrown. java.lang.NumberFormatException: 
For input string: "328812001110"
Through code analysis it was found that taskNo is of "long" type. The taskNo
generation algorithm may generate a number bigger than Long.MAX_VALUE.
CARBONDATA-3325 changed the taskNo type to String, but in some places it was
still being used as long.

Solution:
Change the taskNo type to String in the remaining places.
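
For illustration only (the 20-digit value below is hypothetical): anything
beyond Long.MAX_VALUE (9223372036854775807) cannot pass through
Long.parseLong, so the taskNo is kept as a String end to end:

    String taskNo = "92233720368547758080";  // 20 digits, > Long.MAX_VALUE
    // Long.parseLong(taskNo);               // would throw NumberFormatException
    long factTimeStamp = System.currentTimeMillis();  // stand-in for the load timestamp
    CarbonDataFileAttributes attributes =
        new CarbonDataFileAttributes(taskNo, factTimeStamp);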

This closes #3376
---
 .../apache/carbondata/processing/merger/AbstractResultProcessor.java| 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git 
a/processing/src/main/java/org/apache/carbondata/processing/merger/AbstractResultProcessor.java
 
b/processing/src/main/java/org/apache/carbondata/processing/merger/AbstractResultProcessor.java
index f557e9b..951339a 100644
--- 
a/processing/src/main/java/org/apache/carbondata/processing/merger/AbstractResultProcessor.java
+++ 
b/processing/src/main/java/org/apache/carbondata/processing/merger/AbstractResultProcessor.java
@@ -61,7 +61,7 @@ public abstract class AbstractResultProcessor {
   carbonDataFileAttributes = new CarbonDataFileAttributes(index, 
loadModel.getFactTimeStamp());
 } else {
   carbonDataFileAttributes =
-  new CarbonDataFileAttributes(Long.parseLong(loadModel.getTaskNo()),
+  new CarbonDataFileAttributes(loadModel.getTaskNo(),
   loadModel.getFactTimeStamp());
 }
 
carbonFactDataHandlerModel.setCarbonDataFileAttributes(carbonDataFileAttributes);



[carbondata] 16/33: [CARBONDATA-3508] Support CG datamap pruning fallback while querying

2019-10-03 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 0e2d3e20cad0e7032d959c5f9107249eaa258685
Author: shivamasn 
AuthorDate: Thu Aug 29 11:49:41 2019 +0530

[CARBONDATA-3508] Support CG datamap pruning fallback while querying

Problem: A select query fails when the CG datamap is dropped concurrently
while the select query is running on the filter column on which the
datamap is created.

Solution: Handle the exception from datamap blocklet pruning if it fails,
and consider only the blocklets pruned by the default datamap.
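
A hedged sketch of the fallback pattern (not the exact patch code;
pruneWithCGDataMap is a hypothetical stand-in for the CG pruning call):

    List<ExtendedBlocklet> cgPrunedBlocklets = new ArrayList<>();
    try {
      cgPrunedBlocklets = pruneWithCGDataMap(segmentIds, prunedBlocklets);
    } catch (Exception e) {
      // The CG datamap may have been dropped concurrently; fall back to
      // the blocklets already pruned by the default datamap.
      LOGGER.warn("CG datamap pruning failed, using default pruning", e);
    }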

This closes #3369
---
 .../core/indexstore/BlockletDataMapIndexStore.java |  2 +-
 .../statusmanager/SegmentUpdateStatusManager.java  |  6 ++--
 .../datamap/bloom/BloomCoarseGrainDataMap.java |  2 +-
 .../carbondata/hadoop/api/CarbonInputFormat.java   | 32 ++
 4 files changed, 27 insertions(+), 15 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/indexstore/BlockletDataMapIndexStore.java
 
b/core/src/main/java/org/apache/carbondata/core/indexstore/BlockletDataMapIndexStore.java
index 32ee9cb..fd549e0 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/indexstore/BlockletDataMapIndexStore.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/indexstore/BlockletDataMapIndexStore.java
@@ -80,7 +80,7 @@ public class BlockletDataMapIndexStore
 return get(identifierWrapper, null);
   }
 
-  private BlockletDataMapIndexWrapper 
get(TableBlockIndexUniqueIdentifierWrapper identifierWrapper,
+  public BlockletDataMapIndexWrapper 
get(TableBlockIndexUniqueIdentifierWrapper identifierWrapper,
   Map> segInfoCache) throws IOException 
{
 TableBlockIndexUniqueIdentifier identifier =
 identifierWrapper.getTableBlockIndexUniqueIdentifier();
diff --git 
a/core/src/main/java/org/apache/carbondata/core/statusmanager/SegmentUpdateStatusManager.java
 
b/core/src/main/java/org/apache/carbondata/core/statusmanager/SegmentUpdateStatusManager.java
index f7083dc..bc794f4 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/statusmanager/SegmentUpdateStatusManager.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/statusmanager/SegmentUpdateStatusManager.java
@@ -27,8 +27,10 @@ import java.io.InputStreamReader;
 import java.io.OutputStreamWriter;
 import java.util.ArrayList;
 import java.util.HashMap;
+import java.util.HashSet;
 import java.util.List;
 import java.util.Map;
+import java.util.Set;
 
 import org.apache.carbondata.common.logging.LogServiceFactory;
 import org.apache.carbondata.core.constants.CarbonCommonConstants;
@@ -790,8 +792,8 @@ public class SegmentUpdateStatusManager {
 final long deltaEndTimestamp =
 getEndTimeOfDeltaFile(CarbonCommonConstants.DELETE_DELTA_FILE_EXT, 
block);
 
-List files =
-new ArrayList<>(CarbonCommonConstants.DEFAULT_COLLECTION_SIZE);
+Set files =
+new HashSet<>(CarbonCommonConstants.DEFAULT_COLLECTION_SIZE);
 
 for (CarbonFile eachFile : allSegmentFiles) {
   String fileName = eachFile.getName();
diff --git 
a/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMap.java
 
b/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMap.java
index fea48c3..f931353 100644
--- 
a/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMap.java
+++ 
b/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMap.java
@@ -232,7 +232,7 @@ public class BloomCoarseGrainDataMap extends 
CoarseGrainDataMap {
   LOGGER.warn(String.format("HitBlocklets is empty in bloom filter prune 
method. " +
   "bloomQueryModels size is %d, filterShards size if %d",
   bloomQueryModels.size(), filteredShard.size()));
-  return null;
+  return new ArrayList<>();
 }
 return new ArrayList<>(hitBlocklets);
   }
diff --git 
a/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonInputFormat.java 
b/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonInputFormat.java
index ac9e11e..45041e4 100644
--- 
a/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonInputFormat.java
+++ 
b/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonInputFormat.java
@@ -573,19 +573,29 @@ m filterExpression
   if (cgDataMapExprWrapper != null) {
 // Prune segments from already pruned blocklets
 DataMapUtil.pruneSegments(segmentIds, prunedBlocklets);
-List cgPrunedBlocklets;
+List cgPrunedBlocklets = new ArrayList<>();
+boolean isCGPruneFallback = false;
 // Again prune with CG datamap.
-if (distributedCG && dataMapJob != null) {
-  cgPrunedBlocklets = DataMapUtil
-  .executeDataMapJob(carb

[carbondata] 05/33: [CARBONDATA-3491] Return updated/deleted rows count when execute update/delete sql

2019-10-03 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 6c9bbfe8ec5606262b93b101986352578b2b
Author: Zhang Zhichao <441586...@qq.com>
AuthorDate: Tue Aug 13 11:00:23 2019 +0800

[CARBONDATA-3491] Return updated/deleted rows count when execute 
update/delete sql

Return the count of updated/deleted rows when executing an update/delete SQL statement.

This closes #3357
---
 .../testsuite/iud/DeleteCarbonTableTestCase.scala  | 19 +
 .../testsuite/iud/UpdateCarbonTableTestCase.scala  | 33 ++
 .../scala/org/apache/carbondata/spark/KeyVal.scala | 10 +++
 .../apache/spark/util/CarbonReflectionUtils.scala  | 16 +++
 .../apache/spark/sql/CarbonCatalystOperators.scala |  6 ++--
 .../mutation/CarbonProjectForDeleteCommand.scala   | 21 ++
 .../mutation/CarbonProjectForUpdateCommand.scala   | 19 -
 .../command/mutation/DeleteExecution.scala | 27 ++
 .../spark/sql/hive/CarbonAnalysisRules.scala   | 12 +++-
 9 files changed, 129 insertions(+), 34 deletions(-)

diff --git 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/iud/DeleteCarbonTableTestCase.scala
 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/iud/DeleteCarbonTableTestCase.scala
index f26283b..4565d7a 100644
--- 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/iud/DeleteCarbonTableTestCase.scala
+++ 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/iud/DeleteCarbonTableTestCase.scala
@@ -361,6 +361,25 @@ class DeleteCarbonTableTestCase extends QueryTest with 
BeforeAndAfterAll {
 sql("drop table if exists decimal_table")
   }
 
+  test("[CARBONDATA-3491] Return updated/deleted rows count when execute 
update/delete sql") {
+sql("drop table if exists test_return_row_count")
+
+sql("create table test_return_row_count (a string, b string, c string) 
stored by 'carbondata'").show()
+sql("insert into test_return_row_count select 'aaa','bbb','ccc'").show()
+sql("insert into test_return_row_count select 'bbb','bbb','ccc'").show()
+sql("insert into test_return_row_count select 'ccc','bbb','ccc'").show()
+sql("insert into test_return_row_count select 'ccc','bbb','ccc'").show()
+
+checkAnswer(sql("delete from test_return_row_count where a = 'aaa'"),
+Seq(Row(1))
+)
+checkAnswer(sql("select * from test_return_row_count"),
+Seq(Row("bbb", "bbb", "ccc"), Row("ccc", "bbb", "ccc"), Row("ccc", 
"bbb", "ccc"))
+)
+
+sql("drop table if exists test_return_row_count").show()
+  }
+
   override def afterAll {
 sql("use default")
 sql("drop database  if exists iud_db cascade")
diff --git 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/iud/UpdateCarbonTableTestCase.scala
 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/iud/UpdateCarbonTableTestCase.scala
index cf45600..ef18035 100644
--- 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/iud/UpdateCarbonTableTestCase.scala
+++ 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/iud/UpdateCarbonTableTestCase.scala
@@ -826,6 +826,39 @@ class UpdateCarbonTableTestCase extends QueryTest with 
BeforeAndAfterAll {
 sql("""drop table iud.dest11""").show
   }
 
+  test("[CARBONDATA-3491] Return updated/deleted rows count when execute 
update/delete sql") {
+sql("drop table if exists test_return_row_count")
+sql("drop table if exists test_return_row_count_source")
+
+sql("create table test_return_row_count (a string, b string, c string) 
stored by 'carbondata'").show()
+sql("insert into test_return_row_count select 'bbb','bbb','ccc'").show()
+sql("insert into test_return_row_count select 'ccc','bbb','ccc'").show()
+sql("insert into test_return_row_count select 'ccc','bbb','ccc'").show()
+
+sql("create table test_return_row_count_source (a string, b string, c 
string) stored by 'carbondata'").show()
+sql("insert into test_return_row_count_source select 
'aaa','eee','ccc'").show()
+sql("insert into test_return_row_count_source select 
'bbb','bbb','ccc'").show()
+sql("insert into test_return_row_count_source select 
'ccc','bbb','ccc'").show()
+sql("insert into test_return_row_count_source select 
'ccc','bbb','ccc'").show()
+
+checkAnswer

[carbondata] 08/33: [CARBONDATA-3499] Fix insert failure with customFileProvider

2019-10-03 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 1943ddae258c754ab11c466557877378fb9a748e
Author: ajantha-bhat 
AuthorDate: Thu Aug 22 17:42:12 2019 +0530

[CARBONDATA-3499] Fix insert failure with customFileProvider

Problem:
The below exception is thrown randomly when a custom file system is used
with a first-time insert: IllegalArgumentException("Path belongs to
unsupported file system") from FileFactory.getFileType()

Cause:
DefaultFileTypeProvider.initializeCustomFileProvider is called concurrently
during insert. One thread got the provider while the other did not, because
the flag was already set to true; the other thread then failed as it fell
back to the default provider.

Solution:
Synchronize the initialization of the custom file provider.
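
A hedged configuration sketch (com.example.MyFileTypeProvider is
hypothetical; the property constant and the FileTypeInterface contract are
from the patch below):

    // Register a custom file provider; the configured class must implement
    // FileTypeInterface and is instantiated reflectively on first use.
    CarbonProperties.getInstance().addProperty(
        CarbonCommonConstants.CUSTOM_FILE_PROVIDER,
        "com.example.MyFileTypeProvider");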

This closes #3362
---
 .../datastore/impl/DefaultFileTypeProvider.java| 31 +-
 1 file changed, 19 insertions(+), 12 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/datastore/impl/DefaultFileTypeProvider.java
 
b/core/src/main/java/org/apache/carbondata/core/datastore/impl/DefaultFileTypeProvider.java
index cdb1a20..4572cc4 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/datastore/impl/DefaultFileTypeProvider.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/datastore/impl/DefaultFileTypeProvider.java
@@ -43,7 +43,9 @@ public class DefaultFileTypeProvider implements 
FileTypeInterface {
*/
   protected FileTypeInterface customFileTypeProvider = null;
 
-  protected boolean customFileTypeProviderInitialized = false;
+  protected Boolean customFileTypeProviderInitialized = false;
+
+  private final Object lock = new Object();
 
   public DefaultFileTypeProvider() {
   }
@@ -52,17 +54,22 @@ public class DefaultFileTypeProvider implements 
FileTypeInterface {
* This method is required apart from Constructor to handle the below 
circular dependency.
* CarbonProperties-->FileFactory-->DefaultTypeProvider-->CarbonProperties
*/
-  private void initializeCustomFileprovider() {
+  private void initializeCustomFileProvider() {
 if (!customFileTypeProviderInitialized) {
-  customFileTypeProviderInitialized = true;
-  String customFileProvider =
-  
CarbonProperties.getInstance().getProperty(CarbonCommonConstants.CUSTOM_FILE_PROVIDER);
-  if (customFileProvider != null && !customFileProvider.trim().isEmpty()) {
-try {
-  customFileTypeProvider =
-  (FileTypeInterface) 
Class.forName(customFileProvider).newInstance();
-} catch (Exception e) {
-  LOGGER.error("Unable load configured FileTypeInterface class. 
Ignored.", e);
+  // This initialization can happen in concurrent threads.
+  synchronized (lock) {
+if (!customFileTypeProviderInitialized) {
+  String customFileProvider = CarbonProperties.getInstance()
+  .getProperty(CarbonCommonConstants.CUSTOM_FILE_PROVIDER);
+  if (customFileProvider != null && 
!customFileProvider.trim().isEmpty()) {
+try {
+  customFileTypeProvider =
+  (FileTypeInterface) 
Class.forName(customFileProvider).newInstance();
+} catch (Exception e) {
+  LOGGER.error("Unable load configured FileTypeInterface class. 
Ignored.", e);
+}
+customFileTypeProviderInitialized = true;
+  }
 }
   }
 }
@@ -77,7 +84,7 @@ public class DefaultFileTypeProvider implements 
FileTypeInterface {
* @return true if supported by the custom
*/
   @Override public boolean isPathSupported(String path) {
-initializeCustomFileprovider();
+initializeCustomFileProvider();
 if (customFileTypeProvider != null) {
   return customFileTypeProvider.isPathSupported(path);
 }



[carbondata] 33/33: [CARBONDATA-3526]Fix cache issue during update and query

2019-10-03 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit be9580b5768389d6adf0392fc820fb7e7186bd4c
Author: akashrn5 
AuthorDate: Thu Sep 12 14:30:48 2019 +0530

[CARBONDATA-3526]Fix cache issue during update and query

Problem:
When multiple updates happen on a table, the cache is loaded during the
update operation. Since on the second update horizontal compaction happens
inside the segment, the files already loaded into the cache become invalid.
If clean files is then run, the horizontally compacted files are physically
deleted, but the cache still contains the old files. So when a select query
is fired, it fails with a file-not-found exception.

Solution:
Once horizontal compaction is finished, new compacted files are generated,
so the segments inside the cache are now invalid; therefore clear the cache
of the invalid segments after horizontal compaction. During the drop cache
command, clear the cache of the segmentMap as well.

This closes #3385
---
 .../sql/execution/command/cache/CarbonDropCacheCommand.scala | 8 +++-
 .../sql/execution/command/mutation/HorizontalCompaction.scala| 9 -
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git 
a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonDropCacheCommand.scala
 
b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonDropCacheCommand.scala
index 1554f6a..7b8e10f 100644
--- 
a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonDropCacheCommand.scala
+++ 
b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/cache/CarbonDropCacheCommand.scala
@@ -25,7 +25,7 @@ import org.apache.spark.sql.execution.command.MetadataCommand
 
 import org.apache.carbondata.common.logging.LogServiceFactory
 import org.apache.carbondata.core.cache.CacheProvider
-import org.apache.carbondata.core.datamap.DataMapUtil
+import org.apache.carbondata.core.datamap.{DataMapStoreManager, DataMapUtil}
 import org.apache.carbondata.core.metadata.schema.table.CarbonTable
 import org.apache.carbondata.core.util.CarbonProperties
 import org.apache.carbondata.events.{DropTableCacheEvent, OperationContext, 
OperationListenerBus}
@@ -55,13 +55,11 @@ case class CarbonDropCacheCommand(tableIdentifier: 
TableIdentifier, internalCall
 carbonTable.getTableName)) {
 DataMapUtil.executeClearDataMapJob(carbonTable, 
DataMapUtil.DISTRIBUTED_JOB_NAME)
   } else {
-val allIndexFiles = 
CacheUtil.getAllIndexFiles(carbonTable)(sparkSession)
 // Extract dictionary keys for the table and create cache keys from 
those
 val dictKeys: List[String] = CacheUtil.getAllDictCacheKeys(carbonTable)
-
 // Remove elements from cache
-val keysToRemove = allIndexFiles ++ dictKeys
-cache.removeAll(keysToRemove.asJava)
+cache.removeAll(dictKeys.asJava)
+
DataMapStoreManager.getInstance().clearDataMaps(carbonTable.getAbsoluteTableIdentifier)
   }
 }
 LOGGER.info("Drop cache request served for table " + 
carbonTable.getTableUniqueName)
diff --git 
a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/mutation/HorizontalCompaction.scala
 
b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/mutation/HorizontalCompaction.scala
index fb20e4f..62a3486 100644
--- 
a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/mutation/HorizontalCompaction.scala
+++ 
b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/mutation/HorizontalCompaction.scala
@@ -28,7 +28,7 @@ import 
org.apache.spark.sql.execution.command.management.CarbonAlterTableCompact
 import org.apache.spark.sql.util.SparkSQLUtil
 
 import org.apache.carbondata.common.logging.LogServiceFactory
-import org.apache.carbondata.core.datamap.Segment
+import org.apache.carbondata.core.datamap.{DataMapStoreManager, Segment}
 import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier
 import org.apache.carbondata.core.metadata.schema.table.CarbonTable
 import org.apache.carbondata.core.statusmanager.SegmentUpdateStatusManager
@@ -106,6 +106,13 @@ object HorizontalCompaction {
   segmentUpdateStatusManager,
   deleteTimeStamp,
   segLists)
+
+// If there are already index and data files are present for old update 
operation, then the
+// cache will be loaded for those files during current update, but once 
after horizontal
+// compaction is finished, new compacted files are generated, so the 
segments inside cache are
+// now invalid, so clear the cache of invalid segment after horizontal 
compaction.
+DataMapStoreManager.getInstance()
+  .clearInvalidSegments(carbonTable, 
segLists.asScala.map(_.getSegmentNo).asJava)
   }
 
   /**



[carbondata] 17/33: [HOTFIX] Fix NPE on windows

2019-10-03 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 631941b566ba015921ef2cdc89434f35d88aef5f
Author: Manhua 
AuthorDate: Thu Sep 5 21:13:28 2019 +0800

[HOTFIX] Fix NPE on windows

Analysis:
Carbon index files are merged but the SegmentFile was not updated, so it
fails to get any default datamap for pruning.

The reason for the missing update is a path comparison like

/[your_path_here]/examples/spark2/target/store/default/source/\Fact\Part0\Segment_0
vs

D:/[your_path_here]/examples/spark2/target/store/default/source/Fact/Part0/Segment_0

Solution:
Use the AbsoluteTableIdentifier of the table from CarbonMetadata instead of
a newly created object, keeping the path style the same.

This closes #3377
---
 .../main/scala/org/apache/spark/sql/hive/CarbonFileMetastore.scala| 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git 
a/integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonFileMetastore.scala
 
b/integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonFileMetastore.scala
index 684bcbb..900b69c 100644
--- 
a/integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonFileMetastore.scala
+++ 
b/integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonFileMetastore.scala
@@ -151,8 +151,8 @@ class CarbonFileMetastore extends CarbonMetaStore {
 val tables = Option(CarbonMetadata.getInstance.getCarbonTable(database, 
tableName))
 tables match {
   case Some(t) =>
-if (isSchemaRefreshed(absIdentifier, sparkSession)) {
-  readCarbonSchema(absIdentifier, parameters)
+if (isSchemaRefreshed(t.getAbsoluteTableIdentifier, sparkSession)) {
+  readCarbonSchema(t.getAbsoluteTableIdentifier, parameters)
 } else {
   CarbonRelation(database, tableName, 
CarbonSparkUtil.createSparkMeta(t), t)
 }



[carbondata] 21/33: [CARBONDATA-3454] optimized index server output for count(*)

2019-10-03 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 43a8086e15ec34648c02f9022975e683fc139f25
Author: kunal642 
AuthorDate: Thu Jun 27 14:32:11 2019 +0530

[CARBONDATA-3454] optimized index server output for count(*)

Optimised the output for count(*) queries so that only a long is sent back
to the driver, reducing the network transfer cost of the index server.
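
A hedged sketch of the new code path (method names are from the patch
below; construction of the format object is elided):

    // For count(*), ask the index server for a single Long instead of the
    // full pruned-blocklet list, avoiding per-blocklet serialization.
    dataMapFormat.setCountStarJob();
    Long rowCount = dataMapJob.executeCountJob(dataMapFormat);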

This closes #3308
---
 .../apache/carbondata/core/datamap/DataMapJob.java |   2 +
 .../carbondata/core/datamap/DataMapUtil.java   |  13 ++-
 .../core/datamap/DistributableDataMapFormat.java   |  34 +--
 .../core/indexstore/ExtendedBlocklet.java  |  68 -
 .../core/indexstore/ExtendedBlockletWrapper.java   |  27 +++--
 .../ExtendedBlockletWrapperContainer.java  |  19 ++--
 .../carbondata/hadoop/api/CarbonInputFormat.java   |  52 --
 .../hadoop/api/CarbonTableInputFormat.java |  22 ++--
 .../carbondata/indexserver/DataMapJobs.scala   |  15 ++-
 .../indexserver/DistributedCountRDD.scala  | 111 +
 .../indexserver/DistributedPruneRDD.scala  |  29 ++
 .../indexserver/DistributedRDDUtils.scala  |  13 +++
 .../carbondata/indexserver/IndexServer.scala   |  19 
 13 files changed, 319 insertions(+), 105 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapJob.java 
b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapJob.java
index 9eafe7c..326282d 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapJob.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapJob.java
@@ -35,4 +35,6 @@ public interface DataMapJob extends Serializable {
 
   List execute(DistributableDataMapFormat dataMapFormat);
 
+  Long executeCountJob(DistributableDataMapFormat dataMapFormat);
+
 }
diff --git 
a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapUtil.java 
b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapUtil.java
index dd9debc..bca7409 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapUtil.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapUtil.java
@@ -230,7 +230,7 @@ public class DataMapUtil {
   List validSegments, List invalidSegments, DataMapLevel 
level,
   List segmentsToBeRefreshed) throws IOException {
 return executeDataMapJob(carbonTable, resolver, dataMapJob, 
partitionsToPrune, validSegments,
-invalidSegments, level, false, segmentsToBeRefreshed);
+invalidSegments, level, false, segmentsToBeRefreshed, false);
   }
 
   /**
@@ -241,7 +241,8 @@ public class DataMapUtil {
   public static List executeDataMapJob(CarbonTable 
carbonTable,
   FilterResolverIntf resolver, DataMapJob dataMapJob, List 
partitionsToPrune,
   List validSegments, List invalidSegments, DataMapLevel 
level,
-  Boolean isFallbackJob, List segmentsToBeRefreshed) throws 
IOException {
+  Boolean isFallbackJob, List segmentsToBeRefreshed, boolean 
isCountJob)
+  throws IOException {
 List invalidSegmentNo = new ArrayList<>();
 for (Segment segment : invalidSegments) {
   invalidSegmentNo.add(segment.getSegmentNo());
@@ -250,9 +251,11 @@ public class DataMapUtil {
 DistributableDataMapFormat dataMapFormat =
 new DistributableDataMapFormat(carbonTable, resolver, validSegments, 
invalidSegmentNo,
 partitionsToPrune, false, level, isFallbackJob);
-List prunedBlocklets = dataMapJob.execute(dataMapFormat);
-// Apply expression on the blocklets.
-return prunedBlocklets;
+if (isCountJob) {
+  dataMapFormat.setCountStarJob();
+  dataMapFormat.setIsWriteToFile(false);
+}
+return dataMapJob.execute(dataMapFormat);
   }
 
   public static SegmentStatusManager.ValidAndInvalidSegmentsInfo 
getValidAndInvalidSegments(
diff --git 
a/core/src/main/java/org/apache/carbondata/core/datamap/DistributableDataMapFormat.java
 
b/core/src/main/java/org/apache/carbondata/core/datamap/DistributableDataMapFormat.java
index 8426fcb..b430c5d 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/datamap/DistributableDataMapFormat.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/datamap/DistributableDataMapFormat.java
@@ -28,7 +28,6 @@ import java.util.UUID;
 
 import org.apache.carbondata.common.logging.LogServiceFactory;
 import org.apache.carbondata.core.constants.CarbonCommonConstants;
-import org.apache.carbondata.core.datamap.dev.DataMap;
 import org.apache.carbondata.core.datamap.dev.expr.DataMapDistributableWrapper;
 import org.apache.carbondata.core.datastore.impl.FileFactory;
 import org.apache.carbondata.core.indexstore.ExtendedBlocklet;
@@ -91,6 +90,8 @@ public class DistributableDataMapFormat extends 
FileInputFormat validSegments, List invalidSegments, 
List part

[carbondata] 14/33: [CARBONDATA-3495] Fix Insert into Complex data type of Binary failure with Carbon & SparkFileFormat

2019-10-03 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 3aea196715a9954e38dccdc50824e8f6a0a75de3
Author: Indhumathi27 
AuthorDate: Thu Aug 22 16:01:49 2019 +0530

[CARBONDATA-3495] Fix Insert into Complex data type of Binary
failure with Carbon & SparkFileFormat

Problem:
Insert into a complex data type (Array/Struct/Map) of binary data type
fails with "Invalid data type name", because binary inside complex data
types is not handled.

Solution:
Handle the binary data type so that it works inside complex data types.

This closes #3361
---
 .../core/datastore/page/ComplexColumnPage.java |   1 +
 .../apache/carbondata/core/util/DataTypeUtil.java  |   3 +
 .../src/test/resources/complexbinary.csv   |   3 +
 .../complexType/TestComplexDataType.scala  | 114 +
 .../spark/sql/catalyst/CarbonDDLSqlParser.scala|   2 +
 .../SparkCarbonDataSourceBinaryTest.scala  |  88 
 .../processing/datatypes/PrimitiveDataType.java|   3 +
 .../org/apache/carbondata/sdk/file/ImageTest.java  |  41 
 8 files changed, 255 insertions(+)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/datastore/page/ComplexColumnPage.java
 
b/core/src/main/java/org/apache/carbondata/core/datastore/page/ComplexColumnPage.java
index 921ae50..c4f8849 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/datastore/page/ComplexColumnPage.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/datastore/page/ComplexColumnPage.java
@@ -124,6 +124,7 @@ public class ComplexColumnPage {
 DataTypes.isMapType(dataType) ||
 (dataType == DataTypes.STRING) ||
 (dataType == DataTypes.VARCHAR) ||
+(dataType == DataTypes.BINARY) ||
 (dataType == DataTypes.DATE) ||
 DataTypes.isDecimal(dataType) {
   // For all these above condition the ColumnPage should be Taken as 
BYTE_ARRAY
diff --git 
a/core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java 
b/core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java
index 9aea579..adb63cd 100644
--- a/core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java
+++ b/core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java
@@ -530,6 +530,7 @@ public final class DataTypeUtil {
   public static boolean isFixedSizeDataType(DataType dataType) {
 if (dataType == DataTypes.STRING ||
 dataType == DataTypes.VARCHAR ||
+dataType == DataTypes.BINARY ||
 DataTypes.isDecimal(dataType)) {
   return false;
 } else {
@@ -1019,6 +1020,8 @@ public final class DataTypeUtil {
   return DataTypes.BYTE_ARRAY;
 } else if (DataTypes.BYTE_ARRAY.getName().equalsIgnoreCase(name)) {
   return DataTypes.BYTE_ARRAY;
+} else if (DataTypes.BINARY.getName().equalsIgnoreCase(name)) {
+  return DataTypes.BINARY;
 } else if (name.equalsIgnoreCase("decimal")) {
   return DataTypes.createDefaultDecimalType();
 } else if (name.equalsIgnoreCase("array")) {
diff --git a/integration/spark-common-test/src/test/resources/complexbinary.csv 
b/integration/spark-common-test/src/test/resources/complexbinary.csv
new file mode 100644
index 000..3870f5f
--- /dev/null
+++ b/integration/spark-common-test/src/test/resources/complexbinary.csv
@@ -0,0 +1,3 @@
+1,true,abc,binary1$binary2,binary1,1
+2,false,abcd,binary11$binary12,binary11,1
+3,true,abcde,binary13$binary13,binary13,1
\ No newline at end of file
diff --git 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/complexType/TestComplexDataType.scala
 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/complexType/TestComplexDataType.scala
index b5f77c2..9d6b4d1 100644
--- 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/complexType/TestComplexDataType.scala
+++ 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/complexType/TestComplexDataType.scala
@@ -1013,4 +1013,118 @@ class TestComplexDataType extends QueryTest with 
BeforeAndAfterAll {
 checkAnswer(sql("select 
id,name,structField.intval,name,structField.stringval from 
table1"),Seq(Row(null,"aaa",23,"aaa","bb")))
   }
 
+  test("test array of binary data type") {
+sql("drop table if exists carbon_table")
+sql("drop table if exists hive_table")
+sql("create table if not exists hive_table(id int, label boolean, name 
string," +
+"binaryField array, autoLabel boolean) row format delimited 
fields terminated by ','")
+sql("insert into hive_table values(1,true,'abc',array('binary'),false)")
+   

[carbondata] 24/33: [HOTFIX] Remove duplicate case for BYTE_ARRAY

2019-10-03 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 336848365c06cc10d6bb39691b35f657be565c10
Author: Manhua 
AuthorDate: Fri Sep 20 14:26:48 2019 +0800

[HOTFIX] Remove duplicate case for BYTE_ARRAY

This closes #3396
---
 core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java | 4 
 1 file changed, 4 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java 
b/core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java
index adb63cd..3e0edb1 100644
--- a/core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java
+++ b/core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java
@@ -1018,8 +1018,6 @@ public final class DataTypeUtil {
   return DataTypes.NULL;
 } else if (DataTypes.BYTE_ARRAY.getName().equalsIgnoreCase(name)) {
   return DataTypes.BYTE_ARRAY;
-} else if (DataTypes.BYTE_ARRAY.getName().equalsIgnoreCase(name)) {
-  return DataTypes.BYTE_ARRAY;
 } else if (DataTypes.BINARY.getName().equalsIgnoreCase(name)) {
   return DataTypes.BINARY;
 } else if (name.equalsIgnoreCase("decimal")) {
@@ -1070,8 +1068,6 @@ public final class DataTypeUtil {
   return DataTypes.NULL;
 } else if 
(DataTypes.BYTE_ARRAY.getName().equalsIgnoreCase(dataType.getName())) {
   return DataTypes.BYTE_ARRAY;
-} else if 
(DataTypes.BYTE_ARRAY.getName().equalsIgnoreCase(dataType.getName())) {
-  return DataTypes.BYTE_ARRAY;
 } else if 
(DataTypes.BINARY.getName().equalsIgnoreCase(dataType.getName())) {
   return DataTypes.BINARY;
 } else if (dataType.getName().equalsIgnoreCase("decimal")) {



[carbondata] 02/33: [CARBONDATA-3494]Fix NullPointerException in drop table and Correct the document formatting

2019-10-03 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 7571b99ecb257956a05528a0d2910f66f4fd7daf
Author: akashrn5 
AuthorDate: Thu Aug 15 17:56:57 2019 +0530

[CARBONDATA-3494]Fix NullPointerException in drop table and Correct the 
document formatting

Problem:

Fix the formatting of the document in the index server md file.
Drop table calls the drop datamap command with force drop as true. Due to
this, the table is removed from the metastore and physically. Then, when
processData is called for drop table, it tries to create the carbonTable
object by reading the schema, which causes a NullPointerException.

Solution:

Correct the formatting.
Skip processData if carbonTable is null.

This closes #3359
---
 .../org/apache/carbondata/core/datamap/DataMapStoreManager.java| 7 ++-
 docs/index-server.md   | 6 ++
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java
 
b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java
index ce0d6a6..f1f48fa 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java
@@ -598,6 +598,11 @@ public final class DataMapStoreManager {
*/
   public void deleteDataMap(AbsoluteTableIdentifier identifier, String 
dataMapName) {
 CarbonTable carbonTable = getCarbonTable(identifier);
+if (carbonTable == null) {
+  // If carbon table is null then it means table is already deleted, 
therefore return without
+  // doing any further changes.
+  return;
+}
 String tableUniqueName = 
identifier.getCarbonTableIdentifier().getTableUniqueName();
 if (CarbonProperties.getInstance()
 .isDistributedPruningEnabled(identifier.getDatabaseName(), 
identifier.getTableName())) {
@@ -613,7 +618,7 @@ public final class DataMapStoreManager {
   if (tableIndices != null) {
 int i = 0;
 for (TableDataMap tableDataMap : tableIndices) {
-  if (carbonTable != null && tableDataMap != null && dataMapName
+  if (tableDataMap != null && dataMapName
   
.equalsIgnoreCase(tableDataMap.getDataMapSchema().getDataMapName())) {
 try {
   DataMapUtil
diff --git a/docs/index-server.md b/docs/index-server.md
index 5dd15c5..9253f2a 100644
--- a/docs/index-server.md
+++ b/docs/index-server.md
@@ -136,11 +136,9 @@ The Index Server is a long running service therefore the 
'spark.yarn.keytab' and
 | Name |  Default Value|  Description |
 |:--:|:-:|:--:   |
 | carbon.enable.index.server   |  false | Enable the use of index server 
for pruning for the whole application.   |
-| carbon.index.server.ip |NA   |   Specify the IP/HOST on which the server 
is started. Better to
- specify the private IP. |
+| carbon.index.server.ip |NA   |   Specify the IP/HOST on which the server 
is started. Better to specify the private IP. |
 | carbon.index.server.port | NA | The port on which the index server is 
started. |
-| carbon.disable.index.server.fallback | false | Whether to enable/disable 
fallback for index server
-. Should be used for testing purposes only. Refer: [Fallback](#Fallback)|
+| carbon.disable.index.server.fallback | false | Whether to enable/disable 
fallback for index server. Should be used for testing purposes only. Refer: 
[Fallback](#Fallback)|
 |carbon.index.server.max.jobname.length|NA|The max length of the job to show 
in the index server service UI. For bigger queries this may impact performance 
as the whole string would be sent from JDBCServer to IndexServer.|
 
 



[carbondata] branch branch-1.6 updated (72169e5 -> be9580b)

2019-10-03 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a change to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git.


from 72169e5  [maven-release-plugin] prepare for next development iteration
 new 56879f7  [CARBONDATA-3480] Fixed unnecessary refresh for table by 
removing modified mdt file
 new 7571b99  [CARBONDATA-3494]Fix NullPointerException in drop table and 
Correct the document formatting
 new 90b6c64  [CARBONDATA-3493] Initialize Profiler in CarbonEnv
 new 019c777  [CARBONDATA-3466] Fix NPE for carboncli command
 new 6c9bbfe  [CARBONDATA-3491] Return updated/deleted rows count when 
execute update/delete sql
 new 129b163  [CARBONDATA-3509] Support disable query prefetch by 
configuration
 new dff8ab3  [HOTFIX] Remove hive-service from carbondata assembly jar
 new 1943dda  [CARBONDATA-3499] Fix insert failure with customFileProvider
 new 75e207c  [CARBONDATA-3502] Select query with UDF having Match 
expression inside IN expression Fails
 new 2328707  [CARBONDATA-3505] Drop database cascade fix
 new 99e0c7c  [CARBONDATA-3497] Support to write long string for streaming 
table
 new 41ae280  [CARBONDATA-3513] fix 'taskNo' exceeding Long.MAX_VALUE issue 
when execute major compaction
 new 6f90b28  [CARBONDATA-3452] dictionary include udf handle all the 
scenarios
 new 3aea196  [CARBONDATA-3495] Fix Insert into Complex data type of Binary 
failure with Carbon & SparkFileFormat
 new d509cd1  [CARBONDATA-3507] Fix Create Table As Select Failure in 
Spark-2.3
 new 0e2d3e2  [CARBONDATA-3508] Support CG datamap pruning fallback while 
querying
 new 631941b  [HOTFIX] Fix NPE on windows
 new ef26a4a  [CARBONDATA-3506] Fix alter table failures on partition table 
with hive.metastore.disallow.incompatible.col.type.changes as true
 new f750b6f  [CARBONDATA-3515] Limit local dictionary size to 16MB and 
allow configuration.
 new 204f290  [HOTFIX] fix missing quotation marks in datamap doc
 new 43a8086  [CARBONDATA-3454] optimized index server output for count(*)
 new 8ffbc1d  [HOTFIX] fix incorrect word in index-server doc
 new 9e46647  [CARBONDATA-3489] Optimized the comparator instances in sort
 new 3368483  [HOTFIX] Remove duplicate case for BYTE_ARRAY
 new ab86705  [DOC] Update doc for alter sort_columns
 new de81b38  [HOTFIX] Fix wrong min/max index of measure
 new 57309d7  [CARBONDATA-3473] Fix data size calcution of the last column 
in CarbonCli
 new 21bbc4a  [CARBONDATA-3520] CTAS should fail if select query contains 
duplicate columns
 new 85dc030  [HOTFIX]Update Documentation for MV datamap
 new 49e9ea3  [CARBONDATA-3523] Store data file size into index file
 new 9342545  [CARBONDATA-3527] Fix 'String length cannot exceed 32000 
characters' issue when load data with 'GLOBAL_SORT' from csv files which 
include big complex type data
 new 00d2fe9  [CARBONDATA-3501] Fix update table with varchar column
 new be9580b  [CARBONDATA-3526]Fix cache issue during update and query

The 33 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 README.md  |   1 +
 assembly/pom.xml   |   1 +
 .../core/constants/CarbonCommonConstants.java  |  20 +
 .../carbondata/core/datamap/DataMapFilter.java |  47 ++
 .../apache/carbondata/core/datamap/DataMapJob.java |   2 +
 .../core/datamap/DataMapStoreManager.java  |  21 +-
 .../carbondata/core/datamap/DataMapUtil.java   |  13 +-
 .../core/datamap/DistributableDataMapFormat.java   |  34 +-
 .../core/datastore/block/TableBlockInfo.java   |  13 +
 .../datastore/impl/DefaultFileTypeProvider.java|  31 +-
 .../core/datastore/page/ComplexColumnPage.java |   1 +
 .../core/indexstore/BlockletDataMapIndexStore.java |   2 +-
 .../core/indexstore/ExtendedBlocklet.java  |  68 ++-
 .../core/indexstore/ExtendedBlockletWrapper.java   |  27 +-
 .../ExtendedBlockletWrapperContainer.java  |  19 +-
 .../dictionaryholder/MapBasedDictionaryStore.java  |  16 +-
 .../carbondata/core/metadata/CarbonMetadata.java   |   9 +
 .../core/metadata/index/BlockIndexInfo.java|  18 +
 .../core/metadata/schema/table/CarbonTable.java|   4 +-
 .../core/metadata/schema/table/TableSchema.java|   4 +
 .../scan/executor/impl/AbstractQueryExecutor.java  |   2 +
 .../carbondata/core/scan/model/QueryModel.java |   4 +-
 .../statusmanager/SegmentUpdateStatusManager.java  |  32 +-
 .../core/util/AbstractDataFileFooterConverter.java |   3 +
 .../carbondata/core/util/BlockletDataMapUtil.java  |  17 +-
 .../carbondata/core/util/CarbonMetadataUtil.java   |  65 +--
 

[carbondata] 19/33: [CARBONDATA-3515] Limit local dictionary size to 16MB and allow configuration.

2019-10-03 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit f750b6f210ba87923793631f6b4a2cc4f7dbdd3d
Author: ajantha-bhat 
AuthorDate: Tue Sep 10 10:48:26 2019 +0530

[CARBONDATA-3515] Limit local dictionary size to 16MB and allow 
configuration.

problem: currently the local dictionary max size is 2GB, so for varchar columns or 
long string columns the local dictionary can grow to 2GB. Since the local dictionary 
is stored in the blocklet, the blocklet size can exceed 2GB even though the configured 
maximum blocklet size is 64MB, and in some places integer overflow happens during casting.

solution: limit the local dictionary size to 16MB and allow configuration. 
The default size is 4MB.

This closes #3380
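
As an editor's aside, a minimal sketch (not part of the commit) of how a user could 
set the new threshold through CarbonProperties, assuming carbondata-core is on the 
classpath. The property key and the 16MB cap come from the diff below, which converts 
the MB value to bytes with a "<< 20" shift; the fallback behavior for out-of-range 
values is an assumption based on the validation added to CarbonProperties.java.

import org.apache.carbondata.core.constants.CarbonCommonConstants;
import org.apache.carbondata.core.util.CarbonProperties;

public class LocalDictionaryThresholdExample {
  public static void main(String[] args) {
    // Set an 8MB threshold; per the commit message the default is 4MB
    // and the maximum is 16MB (out-of-range values presumably fall back).
    CarbonProperties.getInstance().addProperty(
        CarbonCommonConstants.CARBON_LOCAL_DICTIONARY_SIZE_THRESHOLD_IN_MB,
        "8");
  }
}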
---
 .../core/constants/CarbonCommonConstants.java  | 11 ++
 .../dictionaryholder/MapBasedDictionaryStore.java  | 16 ++--
 .../carbondata/core/util/CarbonProperties.java | 43 ++
 docs/configuration-parameters.md   |  1 +
 4 files changed, 68 insertions(+), 3 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
index 67fa13f..ac77582 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
@@ -1209,6 +1209,17 @@ public final class CarbonCommonConstants {
 
   public static final String CARBON_ENABLE_RANGE_COMPACTION_DEFAULT = "true";
 
+  @CarbonProperty
+  /**
+   * size based threshold for local dictionary in mb.
+   */
+  public static final String CARBON_LOCAL_DICTIONARY_SIZE_THRESHOLD_IN_MB =
+  "carbon.local.dictionary.size.threshold.inmb";
+
+  public static final int CARBON_LOCAL_DICTIONARY_SIZE_THRESHOLD_IN_MB_DEFAULT 
= 4;
+
+  public static final int CARBON_LOCAL_DICTIONARY_SIZE_THRESHOLD_IN_MB_MAX = 
16;
+
   
//
   // Query parameter start here
   
//
diff --git 
a/core/src/main/java/org/apache/carbondata/core/localdictionary/dictionaryholder/MapBasedDictionaryStore.java
 
b/core/src/main/java/org/apache/carbondata/core/localdictionary/dictionaryholder/MapBasedDictionaryStore.java
index 7b8617a..0a50451 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/localdictionary/dictionaryholder/MapBasedDictionaryStore.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/localdictionary/dictionaryholder/MapBasedDictionaryStore.java
@@ -20,7 +20,9 @@ import java.util.Map;
 import java.util.concurrent.ConcurrentHashMap;
 
 import org.apache.carbondata.core.cache.dictionary.DictionaryByteArrayWrapper;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
 import 
org.apache.carbondata.core.localdictionary.exception.DictionaryThresholdReachedException;
+import org.apache.carbondata.core.util.CarbonProperties;
 
 /**
  * Map based dictionary holder class, it will use map to hold
@@ -51,6 +53,11 @@ public class MapBasedDictionaryStore implements 
DictionaryStore {
   private int dictionaryThreshold;
 
   /**
+   * dictionary threshold size in bytes
+   */
+  private long dictionarySizeThresholdInBytes;
+
+  /**
* for checking threshold is reached or not
*/
   private boolean isThresholdReached;
@@ -62,6 +69,8 @@ public class MapBasedDictionaryStore implements 
DictionaryStore {
 
   public MapBasedDictionaryStore(int dictionaryThreshold) {
 this.dictionaryThreshold = dictionaryThreshold;
+this.dictionarySizeThresholdInBytes = 
Integer.parseInt(CarbonProperties.getInstance()
+
.getProperty(CarbonCommonConstants.CARBON_LOCAL_DICTIONARY_SIZE_THRESHOLD_IN_MB))
 << 20;
 this.dictionary = new ConcurrentHashMap<>();
 this.referenceDictionaryArray = new 
DictionaryByteArrayWrapper[dictionaryThreshold];
   }
@@ -93,7 +102,7 @@ public class MapBasedDictionaryStore implements 
DictionaryStore {
   value = ++lastAssignValue;
   currentSize += data.length;
   // if new value is greater than threshold
-  if (value > dictionaryThreshold || currentSize >= Integer.MAX_VALUE) 
{
+  if (value > dictionaryThreshold || currentSize > 
dictionarySizeThresholdInBytes) {
 // set the threshold boolean to true
 isThresholdReached = true;
 // throw exception
@@ -111,9 +120,10 @@ public class MapBasedDictionaryStore implements 
DictionaryStore {
 
   private void checkIfThresholdReached() throws 
DictionaryThresholdReachedException {
 if (isThresholdReached) {
-  if (currentSize &g

[carbondata] 29/33: [HOTFIX]Update Documentation for MV datamap

2019-10-03 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 85dc0304743e74c7abc00d59d4d4b5e5f619d03e
Author: Indhumathi27 
AuthorDate: Thu Jul 25 17:02:26 2019 +0530

[HOTFIX]Update Documentation for MV datamap

This closes #3335
---
 README.md| 1 +
 docs/datamap/mv-datamap-guide.md | 2 ++
 2 files changed, 3 insertions(+)

diff --git a/README.md b/README.md
index 3226a30..da5b547 100644
--- a/README.md
+++ b/README.md
@@ -60,6 +60,7 @@ CarbonData is built using Apache Maven, to [build 
CarbonData](https://github.com
  * [CarbonData Lucene 
DataMap](https://github.com/apache/carbondata/blob/master/docs/datamap/lucene-datamap-guide.md)
 
  * [CarbonData Pre-aggregate 
DataMap](https://github.com/apache/carbondata/blob/master/docs/datamap/preaggregate-datamap-guide.md)
 
  * [CarbonData Timeseries 
DataMap](https://github.com/apache/carbondata/blob/master/docs/datamap/timeseries-datamap-guide.md)
 
+ * [CarbonData MV 
DataMap](https://github.com/apache/carbondata/blob/master/docs/datamap/mv-datamap-guide.md)
 * [SDK 
Guide](https://github.com/apache/carbondata/blob/master/docs/sdk-guide.md) 
 * [C++ SDK 
Guide](https://github.com/apache/carbondata/blob/master/docs/csdk-guide.md)
 * [Performance 
Tuning](https://github.com/apache/carbondata/blob/master/docs/performance-tuning.md)
 
diff --git a/docs/datamap/mv-datamap-guide.md b/docs/datamap/mv-datamap-guide.md
index d22357c..fc1ffd5 100644
--- a/docs/datamap/mv-datamap-guide.md
+++ b/docs/datamap/mv-datamap-guide.md
@@ -65,6 +65,7 @@ EXPLAIN SELECT a, sum(b) from maintable group by a;
   CREATE DATAMAP agg_sales
   ON TABLE sales
   USING "MV"
+  DMPROPERTIES('TABLE_BLOCKSIZE'='256 MB','LOCAL_DICTIONARY_ENABLE'='false')
   AS
 SELECT country, sex, sum(quantity), avg(price)
 FROM sales
@@ -97,6 +98,7 @@ EXPLAIN SELECT a, sum(b) from maintable group by a;
property is inherited from parent table, which allows user to provide 
different tableproperties
for child table
  * MV creation with limit or union all ctas queries is unsupported
+ * MV datamap does not support Streaming
 
  How MV tables are selected
 



[carbondata] 32/33: [CARBONDATA-3501] Fix update table with varchar column

2019-10-03 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 00d2fe930e713958e052e3a738616a852f1dbfe7
Author: Manhua 
AuthorDate: Wed Sep 25 11:47:01 2019 +0800

[CARBONDATA-3501] Fix update table with varchar column

Problem
Update on table with varchar column will throw exception

Analyse
In the loading part of the update operation, it gets the isVarcharTypeMapping 
for each column in the order of table creation, and this drives the string length 
check: a column which is not of varchar type is not allowed to exceed 32000 characters.

However, when changing the plan for the update in CarbonIUDRule, it first 
deletes the old expression and appends the new one, which makes the order 
differ from table creation, so the string length check fails.

Solution
Keep the column order the same as at table creation when modifying the update plan

This closes #3398
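
A minimal runnable sketch (illustrative names only, not CarbonData internals) of why 
the reordering broke the check: a per-column varchar flag list built in table-creation 
order gets applied to the wrong columns once the projection is reordered.

import java.util.Arrays;
import java.util.List;

public class VarcharOrderSketch {
  public static void main(String[] args) {
    List<String> schemaOrder = Arrays.asList("id", "name", "description");
    boolean[] isVarchar = {false, false, true}; // aligned with schemaOrder

    // Buggy plan: the updated column "name" is deleted and re-appended,
    // so the projected order no longer matches the flag list.
    List<String> buggyOrder = Arrays.asList("id", "description", "name");
    for (int i = 0; i < buggyOrder.size(); i++) {
      // "description" is checked with varchar=false, so the 32000-char
      // limit is wrongly applied to the long string column.
      System.out.println(buggyOrder.get(i) + " checked as varchar=" + isVarchar[i]);
    }
  }
}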
---
 .../longstring/VarcharDataTypesBasicTestCase.scala  | 10 ++
 .../command/management/CarbonLoadDataCommand.scala  |  2 +-
 .../org/apache/spark/sql/hive/CarbonAnalysisRules.scala |  4 ++--
 .../org/apache/spark/sql/optimizer/CarbonIUDRule.scala  | 17 ++---
 4 files changed, 27 insertions(+), 6 deletions(-)

diff --git 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/longstring/VarcharDataTypesBasicTestCase.scala
 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/longstring/VarcharDataTypesBasicTestCase.scala
index 4fd2cc0..9719cfc 100644
--- 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/longstring/VarcharDataTypesBasicTestCase.scala
+++ 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/longstring/VarcharDataTypesBasicTestCase.scala
@@ -389,6 +389,16 @@ class VarcharDataTypesBasicTestCase extends QueryTest with 
BeforeAndAfterEach wi
 
 sql("DROP TABLE IF EXISTS varchar_complex_table")
   }
+  
+  test("update table with long string column") {
+prepareTable()
+// update non-varchar column
+sql(s"update $longStringTable set(id)=(0) where name is not null").show()
+// update varchar column
+sql(s"update $longStringTable set(description)=('empty') where name is not 
null").show()
+// update both varchar and non-varchar columns
+sql(s"update $longStringTable set(description, id)=('sth.', 1) where name 
is not null").show()
+  }
 
 // ignore this test in CI, because it will need at least 4GB memory to run 
successfully
   ignore("Exceed 2GB per column page for varchar datatype") {
diff --git 
a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala
 
b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala
index 6a03eab..b2f9a1e 100644
--- 
a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala
+++ 
b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonLoadDataCommand.scala
@@ -1060,7 +1060,7 @@ case class CarbonLoadDataCommand(
 val dropAttributes = df.logicalPlan.output.dropRight(1)
 val finalOutput = catalogTable.schema.map { attr =>
   dropAttributes.find { d =>
-val index = d.name.lastIndexOf("-updatedColumn")
+val index = 
d.name.lastIndexOf(CarbonCommonConstants.UPDATED_COL_EXTENSION)
 if (index > 0) {
   d.name.substring(0, index).equalsIgnoreCase(attr.name)
 } else {
diff --git 
a/integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonAnalysisRules.scala
 
b/integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonAnalysisRules.scala
index 9b923b0..d11bf1e 100644
--- 
a/integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonAnalysisRules.scala
+++ 
b/integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonAnalysisRules.scala
@@ -122,9 +122,9 @@ case class CarbonIUDAnalysisRule(sparkSession: 
SparkSession) extends Rule[Logica
 val renamedProjectList = projectList.zip(columns).map { case (attr, 
col) =>
   attr match {
 case UnresolvedAlias(child22, _) =>
-  UnresolvedAlias(Alias(child22, col + "-updatedColumn")())
+  UnresolvedAlias(Alias(child22, col + 
CarbonCommonConstants.UPDATED_COL_EXTENSION)())
 case UnresolvedAttribute(_) =>
-  UnresolvedAlias(Alias(attr, col + "-updatedColumn")())
+  UnresolvedAlias(Alias(attr, col + 
CarbonCommonConstants.UPDATED_COL_EXTENSION)())
 case _ => attr
   }
 }
diff --git 
a/integration/spark2/src/main/scala/org/apache/spark/sql/optimiz

[carbondata] 07/33: [HOTFIX] Remove hive-service from carbondata assembly jar

2019-10-03 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit dff8ab38a986b388182a12f60dd4272659e1cbb6
Author: Zhang Zhichao <441586...@qq.com>
AuthorDate: Wed Sep 4 14:36:06 2019 +0800

[HOTFIX] Remove hive-service from carbondata assembly jar

Problem: In some environments, a 'No Such Method: registerCurrentOperationLog' 
exception occurs while executing sql on the carbon thrift server.

Cause: the spark hive thrift module rewrites the class 
'org.apache.hive.service.cli.operation.ExecuteStatementOperation' and adds the 
method 'registerCurrentOperationLog' to it, but when the carbon thrift server 
starts, it may load 'ExecuteStatementOperation' first from the carbondata 
assembly jar (which includes 'org.apache.hive:hive-service'); that copy of the 
class from the hive-service jar doesn't have the method 
'registerCurrentOperationLog', so it throws NoSuchMethodException.

Solution: remove all artifacts of 'org.apache.hive' when assembling the 
carbondata jar.

This closes #3373
---
 assembly/pom.xml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/assembly/pom.xml b/assembly/pom.xml
index 12d7e6e..bf729c5 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -107,6 +107,7 @@
   org.apache.spark:*
   org.apache.zookeeper:*
   org.apache.avro:*
+  org.apache.hive:*
   com.google.guava:guava
   org.xerial.snappy:snappy-java
   



svn commit: r35419 - /release/carbondata/1.6.0/

2019-08-28 Thread ravipesala
Author: ravipesala
Date: Wed Aug 28 06:24:37 2019
New Revision: 35419

Log:
Upload 1.6.0

Added:
release/carbondata/1.6.0/

release/carbondata/1.6.0/apache-carbondata-1.6.0-bin-spark2.1.0-hadoop2.7.2.jar 
  (with props)

release/carbondata/1.6.0/apache-carbondata-1.6.0-bin-spark2.1.0-hadoop2.7.2.jar.asc
   (with props)

release/carbondata/1.6.0/apache-carbondata-1.6.0-bin-spark2.1.0-hadoop2.7.2.jar.sha512
   (with props)

release/carbondata/1.6.0/apache-carbondata-1.6.0-bin-spark2.2.1-hadoop2.7.2.jar 
  (with props)

release/carbondata/1.6.0/apache-carbondata-1.6.0-bin-spark2.2.1-hadoop2.7.2.jar.asc
   (with props)

release/carbondata/1.6.0/apache-carbondata-1.6.0-bin-spark2.2.1-hadoop2.7.2.jar.sha512
   (with props)

release/carbondata/1.6.0/apache-carbondata-1.6.0-bin-spark2.3.2-hadoop2.7.2.jar 
  (with props)

release/carbondata/1.6.0/apache-carbondata-1.6.0-bin-spark2.3.2-hadoop2.7.2.jar.asc
   (with props)

release/carbondata/1.6.0/apache-carbondata-1.6.0-bin-spark2.3.2-hadoop2.7.2.jar.sha512
   (with props)
release/carbondata/1.6.0/apache-carbondata-1.6.0-source-release.asc   (with 
props)
release/carbondata/1.6.0/apache-carbondata-1.6.0-source-release.zip   (with 
props)
release/carbondata/1.6.0/apache-carbondata-1.6.0-source-release.zip.sha512  
 (with props)

Added: 
release/carbondata/1.6.0/apache-carbondata-1.6.0-bin-spark2.1.0-hadoop2.7.2.jar
==
Binary file - no diff available.

Propchange: 
release/carbondata/1.6.0/apache-carbondata-1.6.0-bin-spark2.1.0-hadoop2.7.2.jar
--
svn:executable = *

Propchange: 
release/carbondata/1.6.0/apache-carbondata-1.6.0-bin-spark2.1.0-hadoop2.7.2.jar
--
svn:mime-type = application/octet-stream

Added: 
release/carbondata/1.6.0/apache-carbondata-1.6.0-bin-spark2.1.0-hadoop2.7.2.jar.asc
==
--- 
release/carbondata/1.6.0/apache-carbondata-1.6.0-bin-spark2.1.0-hadoop2.7.2.jar.asc
 (added)
+++ 
release/carbondata/1.6.0/apache-carbondata-1.6.0-bin-spark2.1.0-hadoop2.7.2.jar.asc
 Wed Aug 28 06:24:37 2019
@@ -0,0 +1,11 @@
+-BEGIN PGP SIGNATURE-
+
+iQEzBAEBCAAdFiEEsZE8naWI0MngB++fuw0pZv1r+vAFAl1eFMcACgkQuw0pZv1r
++vC2Ggf/XVkeWV+DUF4szeS1Aw4FDFAi/SncuxA4znoFvZjtbSf8aiaMyS0pe0K5
+OcSC6KsVrDKI/l1C298ezbn4WpMWhlQEunjIlX7etSzviS1zjAaP+rL3lL6CVMHt
+9vbXuIMUotRb+XdyEocHvsisMIxzabCqvw/Vouz4kV+IjT35pDpo7Nn3g+MBclBh
+1BiKcnQQZ1irBRN63LmaO/oV5IDpVcEouTXri+i0ZF0h/8zzGxFXJ8MHWay/3SjA
+EWaHgHLsxWAz9UyO54T3XdqvRjU09EN5TmgmPP3QBHjOFBynCD1Op+dYvEJLxVfq
+DDKe56YVPBH2o/3k0aW5PR/lw1RuNw==
+=OXRW
+-END PGP SIGNATURE-

Propchange: 
release/carbondata/1.6.0/apache-carbondata-1.6.0-bin-spark2.1.0-hadoop2.7.2.jar.asc
--
svn:executable = *

Added: 
release/carbondata/1.6.0/apache-carbondata-1.6.0-bin-spark2.1.0-hadoop2.7.2.jar.sha512
==
--- 
release/carbondata/1.6.0/apache-carbondata-1.6.0-bin-spark2.1.0-hadoop2.7.2.jar.sha512
 (added)
+++ 
release/carbondata/1.6.0/apache-carbondata-1.6.0-bin-spark2.1.0-hadoop2.7.2.jar.sha512
 Wed Aug 28 06:24:37 2019
@@ -0,0 +1 @@
+47d50e9c13d8fd3191788d8bf46d23ef2be40181655dd740ddfd3d53d2e23802645ac978a9a1c69ec1fc6359eee16ac894626c56f5146bd7064e1ffd99009cc0
  apache-carbondata-1.6.0-bin-spark2.1.0-hadoop2.7.2.jar

Propchange: 
release/carbondata/1.6.0/apache-carbondata-1.6.0-bin-spark2.1.0-hadoop2.7.2.jar.sha512
--
svn:executable = *

Added: 
release/carbondata/1.6.0/apache-carbondata-1.6.0-bin-spark2.2.1-hadoop2.7.2.jar
==
Binary file - no diff available.

Propchange: 
release/carbondata/1.6.0/apache-carbondata-1.6.0-bin-spark2.2.1-hadoop2.7.2.jar
--
svn:executable = *

Propchange: 
release/carbondata/1.6.0/apache-carbondata-1.6.0-bin-spark2.2.1-hadoop2.7.2.jar
--
svn:mime-type = application/octet-stream

Added: 
release/carbondata/1.6.0/apache-carbondata-1.6.0-bin-spark2.2.1-hadoop2.7.2.jar.asc
==
--- 
release/carbondata/1.6.0/apache-carbondata-1.6.0-bin-spark2.2.1-hadoop2.7.2.jar.asc
 (added)
+++ 
release/carbondata/1.6.0/apache-carbondata-1.6.0-bin-spark2.2.1-hadoop2.7.2.jar.asc
 Wed Aug 28 06:24:37 2019
@@ -0,0 +1,11 @@
+-BEGIN PGP SIGNATURE-
+
+iQEzBAEBCAAdFiEEsZE8naWI0MngB++fuw0pZv1r+vAFAl1eFTYACgkQuw0pZv1r
++vCuJwf

[carbondata] branch master updated: [CARBONDATA-3494]Fix NullPointerException in drop table and Correct the document formatting

2019-08-20 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new 499489c  [CARBONDATA-3494]Fix NullPointerException in drop table and 
Correct the document formatting
499489c is described below

commit 499489c64d0624308a734f8898a9ef23f4773224
Author: akashrn5 
AuthorDate: Thu Aug 15 17:56:57 2019 +0530

[CARBONDATA-3494]Fix NullPointerException in drop table and Correct the 
document formatting

Problem:

Fix the formatting of the document in the index server md file.
drop table calls the drop datamap command with force drop set to true. Due to 
this, the table is removed both from the metadata and physically. Then, when 
processData is called for drop table, it tries to create the carbonTable object 
by reading the schema, which causes a NullPointerException.

Solution:

Correct the formatting.
Skip processData if carbonTable is null.

This closes #3359
---
 .../org/apache/carbondata/core/datamap/DataMapStoreManager.java| 7 ++-
 docs/index-server.md   | 6 ++
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java
 
b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java
index ce0d6a6..f1f48fa 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java
@@ -598,6 +598,11 @@ public final class DataMapStoreManager {
*/
   public void deleteDataMap(AbsoluteTableIdentifier identifier, String 
dataMapName) {
 CarbonTable carbonTable = getCarbonTable(identifier);
+if (carbonTable == null) {
+  // If carbon table is null then it means table is already deleted, 
therefore return without
+  // doing any further changes.
+  return;
+}
 String tableUniqueName = 
identifier.getCarbonTableIdentifier().getTableUniqueName();
 if (CarbonProperties.getInstance()
 .isDistributedPruningEnabled(identifier.getDatabaseName(), 
identifier.getTableName())) {
@@ -613,7 +618,7 @@ public final class DataMapStoreManager {
   if (tableIndices != null) {
 int i = 0;
 for (TableDataMap tableDataMap : tableIndices) {
-  if (carbonTable != null && tableDataMap != null && dataMapName
+  if (tableDataMap != null && dataMapName
   
.equalsIgnoreCase(tableDataMap.getDataMapSchema().getDataMapName())) {
 try {
   DataMapUtil
diff --git a/docs/index-server.md b/docs/index-server.md
index 5dd15c5..9253f2a 100644
--- a/docs/index-server.md
+++ b/docs/index-server.md
@@ -136,11 +136,9 @@ The Index Server is a long running service therefore the 
'spark.yarn.keytab' and
 | Name |  Default Value|  Description |
 |:--:|:-:|:--:   |
 | carbon.enable.index.server   |  false | Enable the use of index server 
for pruning for the whole application.   |
-| carbon.index.server.ip |NA   |   Specify the IP/HOST on which the server 
is started. Better to
- specify the private IP. |
+| carbon.index.server.ip |NA   |   Specify the IP/HOST on which the server 
is started. Better to specify the private IP. |
 | carbon.index.server.port | NA | The port on which the index server is 
started. |
-| carbon.disable.index.server.fallback | false | Whether to enable/disable 
fallback for index server
-. Should be used for testing purposes only. Refer: [Fallback](#Fallback)|
+| carbon.disable.index.server.fallback | false | Whether to enable/disable 
fallback for index server. Should be used for testing purposes only. Refer: 
[Fallback](#Fallback)|
 |carbon.index.server.max.jobname.length|NA|The max length of the job to show 
in the index server service UI. For bigger queries this may impact performance 
as the whole string would be sent from JDBCServer to IndexServer.|
 
 



[carbondata] branch master updated: [CARBONDATA-3480] Fixed unnecessary refresh for table by removing modified mdt file

2019-08-20 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new a5344df  [CARBONDATA-3480] Fixed unnecessary refresh for table by 
removing modified mdt file
a5344df is described below

commit a5344df2bfe20560324f9a0b1ef92051540e70d8
Author: kunal642 
AuthorDate: Fri Jul 26 14:52:36 2019 +0530

[CARBONDATA-3480] Fixed unnecessary refresh for table by removing modified 
mdt file

This closes #3339
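
Among the additions below, DataMapFilter gains an extractColumnExpressions method 
that walks the filter expression tree and collects the referenced column names. 
A self-contained sketch of the same recursion, with simplified stand-in types 
rather than the CarbonData Expression classes:

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ExtractColumnsSketch {
  static class Expr {
    String column;                          // set only on leaf column references
    List<Expr> children = new ArrayList<>();
  }

  static Set<String> extractColumns(Expr e) {
    Set<String> out = new HashSet<>();
    for (Expr child : e.children) {
      if (!child.children.isEmpty()) {
        out.addAll(extractColumns(child));  // recurse into composite nodes
      } else if (child.column != null) {
        out.add(child.column);              // collect leaf column names
      }
    }
    return out;
  }

  public static void main(String[] args) {
    Expr name = new Expr(); name.column = "name";
    Expr id = new Expr(); id.column = "id";
    Expr and = new Expr(); and.children.add(name); and.children.add(id);
    Expr root = new Expr(); root.children.add(and);
    System.out.println(extractColumns(root)); // e.g. [name, id]
  }
}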
---
 .../carbondata/core/datamap/DataMapFilter.java |  47 +++
 .../core/datamap/DataMapStoreManager.java  |  14 +-
 .../carbondata/core/metadata/CarbonMetadata.java   |   9 +
 .../core/metadata/schema/table/CarbonTable.java|   4 +-
 .../core/metadata/schema/table/TableSchema.java|   4 +
 .../statusmanager/SegmentUpdateStatusManager.java  |  26 --
 .../apache/carbondata/core/util/CarbonUtil.java|   1 -
 .../core/metadata/CarbonMetadataTest.java  |   7 +-
 .../ThriftWrapperSchemaConverterImplTest.java  |   4 +-
 .../metadata/schema/table/CarbonTableTest.java |   8 +-
 .../table/CarbonTableWithComplexTypesTest.java |   6 +-
 .../dblocation/DBLocationCarbonTableTestCase.scala |  25 --
 .../apache/spark/sql/hive/CarbonSessionUtil.scala  |   6 +-
 .../carbondata/indexserver/IndexServer.scala   |  10 +-
 .../scala/org/apache/spark/sql/CarbonEnv.scala |  51 ++-
 .../command/datamap/CarbonDropDataMapCommand.scala |   1 -
 .../management/RefreshCarbonTableCommand.scala |   2 -
 .../CarbonAlterTableDropPartitionCommand.scala |  12 +-
 .../CarbonAlterTableSplitPartitionCommand.scala|   3 -
 .../command/preaaggregate/PreAggregateUtil.scala   |  19 +-
 .../command/table/CarbonDropTableCommand.scala |  13 +
 .../spark/sql/hive/CarbonFileMetastore.scala   | 425 +
 .../spark/sql/hive/CarbonHiveMetaStore.scala   |  10 +-
 .../apache/spark/sql/hive/CarbonMetaStore.scala|  10 +-
 .../scala/org/apache/spark/util/CleanFiles.scala   |   3 -
 .../scala/org/apache/spark/util/Compaction.scala   |   2 -
 .../apache/spark/util/DeleteSegmentByDate.scala|   2 -
 .../org/apache/spark/util/DeleteSegmentById.scala  |   2 -
 .../scala/org/apache/spark/util/TableLoader.scala  |   2 -
 .../apache/spark/sql/hive/CarbonSessionState.scala |  31 +-
 .../AlterTableColumnRenameTestCase.scala   |   4 +-
 31 files changed, 322 insertions(+), 441 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapFilter.java 
b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapFilter.java
index c20d0d5..ac4886d 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapFilter.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapFilter.java
@@ -18,10 +18,15 @@
 package org.apache.carbondata.core.datamap;
 
 import java.io.Serializable;
+import java.util.HashSet;
+import java.util.Set;
 
 import org.apache.carbondata.core.datastore.block.SegmentProperties;
 import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
+import org.apache.carbondata.core.metadata.schema.table.column.CarbonDimension;
+import org.apache.carbondata.core.metadata.schema.table.column.CarbonMeasure;
 import org.apache.carbondata.core.scan.executor.util.RestructureUtil;
+import org.apache.carbondata.core.scan.expression.ColumnExpression;
 import org.apache.carbondata.core.scan.expression.Expression;
 import org.apache.carbondata.core.scan.filter.resolver.FilterResolverIntf;
 
@@ -39,9 +44,51 @@ public class DataMapFilter implements Serializable {
   public DataMapFilter(CarbonTable table, Expression expression) {
 this.table = table;
 this.expression = expression;
+if (expression != null) {
+  checkIfFilterColumnExistsInTable();
+}
 resolve();
   }
 
+  private Set<String> extractColumnExpressions(Expression expression) {
+Set<String> columnExpressionList = new HashSet<>();
+for (Expression expressions: expression.getChildren()) {
+  if (expressions != null && expressions.getChildren() != null
+  && expressions.getChildren().size() > 0) {
+columnExpressionList.addAll(extractColumnExpressions(expressions));
+  } else if (expressions instanceof ColumnExpression) {
+columnExpressionList.add(((ColumnExpression) 
expressions).getColumnName());
+  }
+}
+return columnExpressionList;
+  }
+
+  private void checkIfFilterColumnExistsInTable() {
+Set<String> columnExpressionList = extractColumnExpressions(expression);
+for (String colExpression : columnExpressionList) {
+  if (colExpression.equalsIgnoreCase("positionid")) {
+continue;
+  }
+  boolean exists = false;
+  for (CarbonMeasure carbonMeasure : table.getAllMeasures()) {
+if (!carbonMeasure.isInvi

[carbondata] branch branch-1.6 updated (79b533f -> a73cadd)

2019-08-12 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a change to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git.


 discard 79b533f  [maven-release-plugin] prepare release 
apache-CarbonData-1.6.0-rc2

This update removed existing revisions from the reference, leaving the
reference pointing at a previous point in the repository history.

 * -- * -- N   refs/heads/branch-1.6 (a73cadd)
\
 O -- O -- O   (79b533f)

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

No new revisions were added by this update.

Summary of changes:
 assembly/pom.xml  |  2 +-
 common/pom.xml|  2 +-
 core/pom.xml  |  2 +-
 datamap/bloom/pom.xml |  6 --
 datamap/examples/pom.xml  |  6 --
 datamap/lucene/pom.xml|  6 --
 datamap/mv/core/pom.xml   |  2 +-
 datamap/mv/plan/pom.xml   |  2 +-
 examples/flink/pom.xml|  2 +-
 examples/spark2/pom.xml   |  2 +-
 format/pom.xml|  2 +-
 hadoop/pom.xml|  2 +-
 integration/hive/pom.xml  |  2 +-
 integration/presto/pom.xml|  2 +-
 integration/spark-common-test/pom.xml | 14 +++---
 integration/spark-common/pom.xml  |  2 +-
 integration/spark-datasource/pom.xml  |  2 +-
 integration/spark2/pom.xml|  2 +-
 pom.xml   |  4 ++--
 processing/pom.xml|  2 +-
 store/sdk/pom.xml |  6 --
 streaming/pom.xml |  6 --
 tools/cli/pom.xml |  6 --
 23 files changed, 48 insertions(+), 36 deletions(-)



[carbondata] branch branch-1.6 updated (61a5bd3 -> a73cadd)

2019-08-12 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a change to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git.


from 61a5bd3  [HOTFIX] Fix dictionary include issue with codegen failure
 add d5db3f4  [CARBONDATA-3488] Check the file size after move local file 
to carbon path
 add a73cadd  [CARBONDATA-3490] Fix concurrent data load failure with 
carbondata FileNotFound exception

No new revisions were added by this update.

Summary of changes:
 .../apache/carbondata/core/util/CarbonUtil.java| 22 +++---
 .../apache/carbondata/spark/util/CommonUtil.scala  |  7 +--
 2 files changed, 24 insertions(+), 5 deletions(-)



[carbondata] branch master updated: [CARBONDATA-3490] Fix concurrent data load failure with carbondata FileNotFound exception

2019-08-12 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new a73cadd  [CARBONDATA-3490] Fix concurrent data load failure with 
carbondata FileNotFound exception
a73cadd is described below

commit a73cadda438de57713ffc5fd85a86b4fdb5442c7
Author: ajantha-bhat 
AuthorDate: Fri Aug 9 10:19:32 2019 +0530

[CARBONDATA-3490] Fix concurrent data load failure with carbondata 
FileNotFound exception

problem: When two loads happen concurrently, one load cleans up the temp 
directory of the concurrent load.

cause: the temp directory that stores the carbon files is created using 
System.nanoTime(); due to this, two loads can end up with the same store 
location. When one load completes, it cleans the temp directory, causing a 
dataload failure for the other load.

solution:
use a UUID instead of nano time when creating the temp directory, so that 
each load has a unique directory.

This closes #3352
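
A minimal Java sketch (assumed names; the actual change is in the Scala CommonUtil 
below) of the suffix change: System.nanoTime() can yield the same value for loads 
launched at the same instant, while a random UUID is unique per load in practice.

import java.io.File;
import java.util.UUID;

public class TempDirSuffixSketch {
  static String tmpLocationSuffix(int index) {
    // UUID-based suffix, mirroring the fixed code path shown below.
    String id = UUID.randomUUID().toString().replace("-", "");
    return File.separator + "carbon" + id + "_" + index;
  }

  public static void main(String[] args) {
    System.out.println(tmpLocationSuffix(0)); // unique per call
  }
}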
---
 .../main/scala/org/apache/carbondata/spark/util/CommonUtil.scala   | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git 
a/integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala
 
b/integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala
index 7015279..8d6cdfb 100644
--- 
a/integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala
+++ 
b/integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala
@@ -21,6 +21,7 @@ import java.io.File
 import java.math.BigDecimal
 import java.text.SimpleDateFormat
 import java.util
+import java.util.UUID
 import java.util.regex.{Matcher, Pattern}
 
 import scala.collection.JavaConverters._
@@ -777,8 +778,10 @@ object CommonUtil {
 val isCarbonUseYarnLocalDir = CarbonProperties.getInstance().getProperty(
   CarbonCommonConstants.CARBON_LOADING_USE_YARN_LOCAL_DIR,
   
CarbonCommonConstants.CARBON_LOADING_USE_YARN_LOCAL_DIR_DEFAULT).equalsIgnoreCase("true")
-val tmpLocationSuffix =
-  
s"${File.separator}carbon${System.nanoTime()}${CarbonCommonConstants.UNDERSCORE}$index"
+val tmpLocationSuffix = s"${ File.separator }carbon${
+  UUID.randomUUID().toString
+.replace("-", "")
+}${ CarbonCommonConstants.UNDERSCORE }$index"
 if (isCarbonUseYarnLocalDir) {
   val yarnStoreLocations = Util.getConfiguredLocalDirs(SparkEnv.get.conf)
 



[carbondata] branch branch-1.6 updated (9724fd4 -> 61a5bd3)

2019-08-12 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a change to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git.


 discard 9724fd4  [maven-release-plugin] prepare for next development iteration
omit 9ca7891  [maven-release-plugin] prepare release 
apache-CarbonData-1.6.0-rc2
omit 80438f7  [HOTFIX] Removed the hive-exec and commons dependency from 
hive module
omit 575b711  [CARBONDATA-3481] Multi-thread pruning fails when datamaps 
count is just near numOfThreadsForPruning
omit 2ebc041  [CARBONDATA-3478]Fix ArrayIndexOutOfBound Exception on 
compaction after alter operation
omit 917e041  [HOTFIX] CLI test case failed during release because of space 
differences
 add c8cc92b  [CARBONDATA-3478]Fix ArrayIndexOutOfBound Exception on 
compaction after alter operation
 add 10f3747  [HOTFIX] CLI test case failed during release because of space 
differences
 add 765712a  [CARBONDATA-3481] Multi-thread pruning fails when datamaps 
count is just near numOfThreadsForPruning
 add d7d70a8  [HOTFIX] Removed the hive-exec and commons dependency from 
hive module
 add f005fd4  [CARBONDATA-3477] deal line break chars correctly after 
'select' in 'update ... select columns' sql
 add 35f1501  [CARBONDATA-3483] don't require update.lock and 
compaction.lock again when execute 'IUD_UPDDEL_DELTA' compaction
 add e14c817  [CARBONDATA-3485] Data loading is failed from S3 to hdfs 
table having ~2K carbonfiles
 add 88ec830  [CARBONDATA-3476] Fix Read time and scan time stats in 
executor log for filter query
 add ebe4057  [CARBONDATA-3452] Fix select query failure when substring on 
dictionary column with join
 add bbeb974  [CARBONDATA-3487] wrong Input metrics (size/record) displayed 
in spark UI during insert into
 add aa67a99  [CARBONDATA-3486] Fix Serialization/Deserialization issue 
with DataType
 add 8f0724e  [CARBONDATA-3482] Fixed NPE in Concurrent query
 add 61a5bd3  [HOTFIX] Fix dictionary include issue with codegen failure

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (9724fd4)
\
 N -- N -- N   refs/heads/branch-1.6 (61a5bd3)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

No new revisions were added by this update.

Summary of changes:
 assembly/pom.xml   |  2 +-
 common/pom.xml |  2 +-
 core/pom.xml   |  2 +-
 .../core/constants/CarbonCommonConstants.java  | 10 
 .../block/SegmentPropertiesAndSchemaHolder.java| 21 ---
 .../core/indexstore/BlockletDataMapIndexStore.java |  2 +-
 .../indexstore/blockletindex/BlockDataMap.java | 36 
 .../indexstore/blockletindex/BlockletDataMap.java  |  7 +--
 .../carbondata/core/scan/filter/FilterUtil.java|  5 +-
 .../MeasureColumnResolvedFilterInfo.java   |  3 +-
 .../AbstractDetailQueryResultIterator.java | 10 ++--
 .../scan/scanner/impl/BlockletFilterScanner.java   | 34 
 .../carbondata/core/util/CarbonProperties.java | 24 
 .../carbondata/core/util/TaskMetricsMap.java   | 21 ++-
 datamap/bloom/pom.xml  |  6 +-
 datamap/examples/pom.xml   |  6 +-
 datamap/lucene/pom.xml |  6 +-
 datamap/mv/core/pom.xml|  2 +-
 datamap/mv/plan/pom.xml|  2 +-
 examples/flink/pom.xml |  2 +-
 examples/spark2/pom.xml|  2 +-
 format/pom.xml |  2 +-
 hadoop/pom.xml |  2 +-
 integration/hive/pom.xml   |  2 +-
 integration/presto/pom.xml |  2 +-
 integration/spark-common-test/pom.xml  | 14 ++---
 ...ryWithColumnMetCacheAndCacheLevelProperty.scala | 27 +
 .../iud/HorizontalCompactionTestCase.scala | 64 +-
 .../testsuite/iud/UpdateCarbonTableTestCase.scala  | 42 +-
 integration/spark-common/pom.xml   |  2 +-
 .../apache/carbondata/spark/InitInputMetrics.java  |  2 +-
 .../spark/load/DataLoadProcessBuilderOnSpark.scala |  2 +-
 .../apache/carbondata/spark/rdd/CarbonRDD.scala|  4 +-
 .../carbondata/spark/rdd/CarbonScanRDD.scala   |  2 +-
 

[carbondata] branch master updated: [HOTFIX] Fix dictionary include issue with codegen failure

2019-08-09 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new 61a5bd3  [HOTFIX] Fix dictionary include issue with codegen failure
61a5bd3 is described below

commit 61a5bd3351cfeaa85528abeb70a8eae9c6521db6
Author: ajantha-bhat 
AuthorDate: Fri Aug 9 17:33:34 2019 +0530

[HOTFIX] Fix dictionary include issue with codegen failure

problem: when whole-stage codegen is disabled, a query on a dictionary include 
column fails.

cause: the data type is not updated for the dictionary include column.

solution: return the updated expression when the data type is changed for the 
dictionary include column

This closes #3353
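
The underlying pattern, as a minimal sketch with illustrative strings rather than 
Spark's expression API: a tree transform returns a new value instead of mutating 
in place, so the transformed result must itself be returned, otherwise the updated 
data type is silently dropped.

public class UseTransformedResultSketch {
  static String updateDataType(String expr) {
    return expr.replace("dictionaryKey", "decodedString"); // returns a new value
  }

  public static void main(String[] args) {
    String prExp = "substring(dictionaryKey, 1, 3)";
    updateDataType(prExp);         // bug: result ignored, prExp unchanged
    prExp = updateDataType(prExp); // fix: keep the returned expression
    System.out.println(prExp);     // substring(decodedString, 1, 3)
  }
}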
---
 .../scala/org/apache/spark/sql/optimizer/CarbonLateDecodeRule.scala  | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git 
a/integration/spark2/src/main/scala/org/apache/spark/sql/optimizer/CarbonLateDecodeRule.scala
 
b/integration/spark2/src/main/scala/org/apache/spark/sql/optimizer/CarbonLateDecodeRule.scala
index 93773fc..961bf11 100644
--- 
a/integration/spark2/src/main/scala/org/apache/spark/sql/optimizer/CarbonLateDecodeRule.scala
+++ 
b/integration/spark2/src/main/scala/org/apache/spark/sql/optimizer/CarbonLateDecodeRule.scala
@@ -713,9 +713,10 @@ class CarbonLateDecodeRule extends Rule[LogicalPlan] with 
PredicateHelper {
 prExp.transform {
   case attr: AttributeReference =>
 updateDataType(attr, attrMap, allAttrsNotDecode, aliasMap)
-}
+}.asInstanceOf[NamedExpression]
+  } else {
+prExp
   }
-  prExp
 }
 Project(prExps, p.child)
   case wd: Window if relations.nonEmpty =>



[carbondata] branch master updated: [CARBONDATA-3482] Fixed NPE in Concurrent query

2019-08-09 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new 8f0724e  [CARBONDATA-3482] Fixed NPE in Concurrent query
8f0724e is described below

commit 8f0724e4256e960608aa6a0d66593acd2ceaa84e
Author: kunal642 
AuthorDate: Mon Jul 29 14:31:31 2019 +0530

[CARBONDATA-3482] Fixed NPE in Concurrent query

Problem: With concurrent queries, if Q1 is loading the cache and Q2 is 
removing from the cache, then Q2 may remove the segmentPropertiesIndex which Q1 
has allocated and is about to access. This causes a NullPointerException.

Solution: Instead of storing the index in BlockDataMap, keep a reference to the 
segmentPropertiesWrapper to be used.

This closes #3351
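
A minimal sketch (assumed names, not the actual SegmentPropertiesAndSchemaHolder 
code) of why holding the wrapper reference is safer than holding an index into 
the shared map:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class HoldReferenceSketch {
  public static void main(String[] args) {
    Map<Integer, String> holder = new ConcurrentHashMap<>();
    holder.put(0, "segmentPropertiesWrapper");

    int index = 0;                 // Q1 keeps only the index
    String ref = holder.get(0);    // Q1 keeps the reference itself

    holder.remove(0);              // Q2 invalidates concurrently

    System.out.println(holder.get(index)); // null -> NPE risk downstream
    System.out.println(ref);               // the reference is still valid
  }
}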
---
 .../block/SegmentPropertiesAndSchemaHolder.java| 21 ++---
 .../core/indexstore/BlockletDataMapIndexStore.java |  2 +-
 .../indexstore/blockletindex/BlockDataMap.java | 36 --
 .../indexstore/blockletindex/BlockletDataMap.java  |  7 +
 ...ryWithColumnMetCacheAndCacheLevelProperty.scala | 27 ++--
 5 files changed, 26 insertions(+), 67 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/datastore/block/SegmentPropertiesAndSchemaHolder.java
 
b/core/src/main/java/org/apache/carbondata/core/datastore/block/SegmentPropertiesAndSchemaHolder.java
index f2f2d8c..056a0e7 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/datastore/block/SegmentPropertiesAndSchemaHolder.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/datastore/block/SegmentPropertiesAndSchemaHolder.java
@@ -98,7 +98,7 @@ public class SegmentPropertiesAndSchemaHolder {
* @param columnCardinality
* @param segmentId
*/
-  public int addSegmentProperties(CarbonTable carbonTable,
+  public SegmentPropertiesWrapper addSegmentProperties(CarbonTable carbonTable,
  List<ColumnSchema> columnsInTable, int[] columnCardinality, String 
segmentId) {
 SegmentPropertiesAndSchemaHolder.SegmentPropertiesWrapper 
segmentPropertiesWrapper =
 new 
SegmentPropertiesAndSchemaHolder.SegmentPropertiesWrapper(carbonTable,
@@ -137,7 +137,7 @@ public class SegmentPropertiesAndSchemaHolder {
 .addMinMaxColumns(carbonTable);
   }
 }
-return segmentIdSetAndIndexWrapper.getSegmentPropertiesIndex();
+return 
getSegmentPropertiesWrapper(segmentIdSetAndIndexWrapper.getSegmentPropertiesIndex());
   }
 
   /**
@@ -222,17 +222,14 @@ public class SegmentPropertiesAndSchemaHolder {
* Method to remove the given segment ID
*
* @param segmentId
-   * @param segmentPropertiesIndex
* @param clearSegmentWrapperFromMap flag to specify whether to clear 
segmentPropertiesWrapper
*   from Map if all the segment's using it 
have become stale
*/
-  public void invalidate(String segmentId, int segmentPropertiesIndex,
+  public void invalidate(String segmentId, SegmentPropertiesWrapper 
segmentPropertiesWrapper,
   boolean clearSegmentWrapperFromMap) {
-SegmentPropertiesWrapper segmentPropertiesWrapper =
-indexToSegmentPropertiesWrapperMapping.get(segmentPropertiesIndex);
-if (null != segmentPropertiesWrapper) {
-  SegmentIdAndSegmentPropertiesIndexWrapper 
segmentIdAndSegmentPropertiesIndexWrapper =
-  segmentPropWrapperToSegmentSetMap.get(segmentPropertiesWrapper);
+SegmentIdAndSegmentPropertiesIndexWrapper 
segmentIdAndSegmentPropertiesIndexWrapper =
+segmentPropWrapperToSegmentSetMap.get(segmentPropertiesWrapper);
+if (segmentIdAndSegmentPropertiesIndexWrapper != null) {
   synchronized 
(getOrCreateTableLock(segmentPropertiesWrapper.getTableIdentifier())) {
 segmentIdAndSegmentPropertiesIndexWrapper.removeSegmentId(segmentId);
 // if after removal of given SegmentId, the segmentIdSet becomes empty 
that means this
@@ -240,14 +237,16 @@ public class SegmentPropertiesAndSchemaHolder {
 // removed from all the holders
 if (clearSegmentWrapperFromMap && 
segmentIdAndSegmentPropertiesIndexWrapper.segmentIdSet
 .isEmpty()) {
-  
indexToSegmentPropertiesWrapperMapping.remove(segmentPropertiesIndex);
+  indexToSegmentPropertiesWrapperMapping
+  
.remove(segmentIdAndSegmentPropertiesIndexWrapper.getSegmentPropertiesIndex());
   segmentPropWrapperToSegmentSetMap.remove(segmentPropertiesWrapper);
 } else if (!clearSegmentWrapperFromMap
 && 
segmentIdAndSegmentPropertiesIndexWrapper.segmentIdSet.isEmpty()) {
   // min max columns can very when cache is modified. So even though 
entry is not required
   // to be deleted from map clear the column cache so that it can 
filled again
   segmentPropertiesWrapper.clear();
-  LOGGER.info(

[carbondata] branch branch-1.6 updated: [HOTFIX] CLI test case failed during release because of space differences

2019-08-01 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/branch-1.6 by this push:
 new 917e041  [HOTFIX] CLI test case failed during release because of space 
differences
917e041 is described below

commit 917e041439282985ec28ff89249db6088e6771df
Author: ravipesala 
AuthorDate: Thu Aug 1 18:12:50 2019 +0530

[HOTFIX] CLI test case failed during release because of space differences

The CLI test case fails if the release name is short and without a snapshot 
suffix, because the output then contains extra padding spaces.
That's why the test was changed to check individual contains() calls instead 
of a batch of lines.

This closes #3344
---
 .../src/test/java/org/apache/carbondata/tool/CarbonCliTest.java  | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git 
a/tools/cli/src/test/java/org/apache/carbondata/tool/CarbonCliTest.java 
b/tools/cli/src/test/java/org/apache/carbondata/tool/CarbonCliTest.java
index 26901f8..af8d51d 100644
--- a/tools/cli/src/test/java/org/apache/carbondata/tool/CarbonCliTest.java
+++ b/tools/cli/src/test/java/org/apache/carbondata/tool/CarbonCliTest.java
@@ -241,12 +241,9 @@ public class CarbonCliTest {
 "20  3.36KB 4.06MB false  00.0B  
93.76KB  0.0   100.0  7298  " ,
 "21  2.04KB 1.49MB false  00.0B  
89.62KB  0.0   100.0  9299  ");
 Assert.assertTrue(output.contains(expectedOutput));
-
-expectedOutput = buildLines(
-"## version Details",
-"written_by  Version ",
-"TestUtil"+ CarbonVersionConstants.CARBONDATA_VERSION+"  ");
-Assert.assertTrue(output.contains(expectedOutput));
+Assert.assertTrue(output.contains("## version Details"));
+Assert.assertTrue(output.contains("written_by  Version"));
+Assert.assertTrue(output.contains("TestUtil"+ 
CarbonVersionConstants.CARBONDATA_VERSION));
   }
 
   @Test



[carbondata] branch branch-1.6 updated (6366d9e -> ac2af7c)

2019-07-31 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a change to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git.


 discard 6366d9e  [maven-release-plugin] prepare for next development iteration
omit 9938633  [maven-release-plugin] prepare release 
apache-carbondata-1.6.0-rc1
 add ee78597  [CARBONDATA-3462][DOC]Added documentation for index server
 add a77e4fd  [HOTFIX] Reset the hive catalog table stats to none even 
after refresh lookup relation.
 add 1d0754e  [HOTFIX] Fix json to carbon writer
 add b6e5f69  [HOTFIX] Added taskid as UUID while writing files in 
fileformat to avoid corrupting.
 add ec2a731  [HOTFIX] Included MV module in assembly jar
 add c0d8d34  [HOTFIX] Fixed sk/ak not found for datasource table
 add c65cc12  [CARBONDATA-3474]Fix validate mvQuery having filter 
expression and correct error message
 add ed117f7  [HOTFIX] Fix failing CI test cases
 add ac2af7c  [HOTFIX] Fixed date filter issue for fileformat

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (6366d9e)
\
 N -- N -- N   refs/heads/branch-1.6 (ac2af7c)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

No new revisions were added by this update.

Summary of changes:
 README.md  |   3 +-
 assembly/pom.xml   |  27 +--
 common/pom.xml |   2 +-
 core/pom.xml   |   2 +-
 .../core/constants/CarbonCommonConstants.java  |   2 +-
 .../carbondata/core/util/path/CarbonTablePath.java |  12 +-
 datamap/bloom/pom.xml  |   6 +-
 datamap/examples/pom.xml   |   6 +-
 datamap/lucene/pom.xml |   6 +-
 datamap/mv/core/pom.xml|   2 +-
 .../apache/carbondata/mv/datamap/MVHelper.scala|  13 +-
 .../org/apache/carbondata/mv/datamap/MVUtil.scala  |   3 +-
 .../carbondata/mv/rewrite/MVCreateTestCase.scala   |   2 +-
 .../mv/rewrite/TestAllOperationsOnMV.scala |  36 +++-
 datamap/mv/plan/pom.xml|   2 +-
 docs/index-server.md   | 229 +
 examples/spark2/pom.xml|   2 +-
 format/pom.xml |   2 +-
 hadoop/pom.xml |   2 +-
 .../carbondata/hadoop/api/CarbonInputFormat.java   |   4 +-
 integration/hive/pom.xml   |   2 +-
 integration/presto/pom.xml |   2 +-
 integration/spark-common-test/pom.xml  |  14 +-
 integration/spark-common/pom.xml   |   2 +-
 integration/spark-datasource/pom.xml   |   2 +-
 .../execution/datasources/CarbonFileIndex.scala|   3 +-
 .../datasources/CarbonSparkDataSourceUtil.scala|   5 +
 .../datasources/SparkCarbonFileFormat.scala|   4 +-
 .../datasource/SparkCarbonDataSourceTest.scala |  13 ++
 integration/spark2/pom.xml |   2 +-
 .../sql/hive/CarbonInMemorySessionState.scala  |  28 ++-
 .../apache/spark/sql/hive/CarbonSessionState.scala |  25 ++-
 .../apache/spark/sql/hive/CarbonSessionUtil.scala  |  69 +--
 .../carbondata/indexserver/IndexServer.scala   |   9 +-
 .../scala/org/apache/spark/util/DataMapUtil.scala  |   2 +-
 pom.xml|  80 ++-
 processing/pom.xml |   2 +-
 .../partition/spliter/RowResultProcessor.java  |   2 +-
 .../processing/store/CarbonDataFileAttributes.java |  15 +-
 .../store/CarbonFactDataHandlerModel.java  |   2 +-
 store/sdk/pom.xml  |   6 +-
 .../carbondata/sdk/file/CarbonWriterBuilder.java   |   9 +-
 .../carbondata/sdk/file/CSVCarbonWriterTest.java   |   4 +-
 .../carbondata/sdk/file/CarbonReaderTest.java  |  53 +
 streaming/pom.xml  |   6 +-
 .../streaming/CarbonStreamRecordWriter.java|   2 +-
 tools/cli/pom.xml  |   6 +-
 47 files changed, 537 insertions(+), 195 deletions(-)
 create mode 100644 docs/index-server.md



[carbondata] branch master updated: [HOTFIX] Fix failing CI test cases

2019-07-30 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new ed117f7  [HOTFIX] Fix failing CI test cases
ed117f7 is described below

commit ed117f74caae606849d01f5df434804ecc97d8eb
Author: kunal642 
AuthorDate: Mon Jul 29 21:46:41 2019 +0530

[HOTFIX] Fix failing CI test cases

Problem: Bloom and lucene dependency was removed due to which mvn was 
downloaded the old jar.

Solution: Add bloom and lucene dependency to the main pom

This closes #3341
---
 pom.xml | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/pom.xml b/pom.xml
index 3995e43..35317ec 100644
--- a/pom.xml
+++ b/pom.xml
@@ -108,6 +108,8 @@
 store/sdk
 assembly
 tools/cli
+datamap/bloom
+datamap/lucene
 datamap/mv/plan
 datamap/mv/core
 examples/spark2



svn commit: r34885 - in /dev/carbondata/1.6.0-rc1: ./ apache-carbondata-1.6.0-source-release.zip apache-carbondata-1.6.0-source-release.zip.asc apache-carbondata-1.6.0-source-release.zip.sha512

2019-07-15 Thread ravipesala
Author: ravipesala
Date: Mon Jul 15 14:02:27 2019
New Revision: 34885

Log:
Upload 1.6.0-rc1

Added:
dev/carbondata/1.6.0-rc1/
dev/carbondata/1.6.0-rc1/apache-carbondata-1.6.0-source-release.zip   (with 
props)
dev/carbondata/1.6.0-rc1/apache-carbondata-1.6.0-source-release.zip.asc
dev/carbondata/1.6.0-rc1/apache-carbondata-1.6.0-source-release.zip.sha512

Added: dev/carbondata/1.6.0-rc1/apache-carbondata-1.6.0-source-release.zip
==
Binary file - no diff available.

Propchange: dev/carbondata/1.6.0-rc1/apache-carbondata-1.6.0-source-release.zip
--
svn:mime-type = application/octet-stream

Added: dev/carbondata/1.6.0-rc1/apache-carbondata-1.6.0-source-release.zip.asc
==
--- dev/carbondata/1.6.0-rc1/apache-carbondata-1.6.0-source-release.zip.asc 
(added)
+++ dev/carbondata/1.6.0-rc1/apache-carbondata-1.6.0-source-release.zip.asc Mon 
Jul 15 14:02:27 2019
@@ -0,0 +1,16 @@
+-BEGIN PGP SIGNATURE-
+
+iQIzBAABCgAdFiEER3EpqJTxH7zLwCVHutcqeKexsu4FAl0scX4ACgkQutcqeKex
+su7+eRAAh3pmvUpjys9SxZZAftzqog0a5TJG3z0sLzFHj17WD86VE+zcsUj8EskU
+zbbiJI76Q17gyt3p6m6+a0fwtN+YMJi8gx5WWmd5HEgkBaSE7zwYblXM/dpnVZke
+HYM+RLZVMpPJJ0cc3bzf3JjXEaZzj+pm1s+ewuEka/QyqXzq+CUFzOSD9whFv3Xh
+m+MaPGl2CdigQrGdDpLxyRKipmdg3yF/lSexASIB6Ol5VZxqGIwX4WCmHZo0HbkY
+GL+3YJFnoExKylxC3Y6pk6gtaWFkmR3lHazHtWlJN+K/tGgG+XqM1Nn2w/wDBMfW
+yt1Yla19OeW9GoazLehzojsMorQPRL6+3ZZYa61LUkrdSa5dTtaXaQ+RKGXsmEwk
+04Hxgvk+g6eRFCro8AseR45ss4GXvsOQyAEv5Y8szemz/kRcrDk8VYLMtQNyyKGj
+Bm26G7X68lMtVmyaju0XdKRraeDD1P5qgFyH0Tj8cYuLBEjCYGMLRHTSoyiOrwZY
+0ididPCBR5nsTTb00FhAJfJDwkZ1dTkwJiz74SMtw3Hb4eNKXUKMOHLPJu2tASEm
+5vZ+y844NadwvuYaEr5iXrPlYf1f2C9Rhca61ypFFPrhttgABE+W8wRsWmsXMLQO
+KK15e036XJYVEqMlA1fT25uLZvohg1cKQVKvpgP5ZUzzreu/k90=
+=/c1j
+-END PGP SIGNATURE-

Added: 
dev/carbondata/1.6.0-rc1/apache-carbondata-1.6.0-source-release.zip.sha512
==
--- dev/carbondata/1.6.0-rc1/apache-carbondata-1.6.0-source-release.zip.sha512 
(added)
+++ dev/carbondata/1.6.0-rc1/apache-carbondata-1.6.0-source-release.zip.sha512 
Mon Jul 15 14:02:27 2019
@@ -0,0 +1 @@
+e9d34a979f91466fc7be4d5807b2f935af395099e81a9fb6597bdfaac8e4cec2edb76ec029409f3ed2c513a032f34545507ea000e96c8b41bab300be4bc8e4de
  apache-carbondata-1.6.0-source-release.zip




[carbondata] branch branch-1.6 updated: [maven-release-plugin] prepare for next development iteration

2019-07-15 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/branch-1.6 by this push:
 new 6366d9e  [maven-release-plugin] prepare for next development iteration
6366d9e is described below

commit 6366d9e423fc37a78f10c8376be1a95c64d3bd61
Author: ravipesala 
AuthorDate: Mon Jul 15 18:28:52 2019 +0530

[maven-release-plugin] prepare for next development iteration
---
 assembly/pom.xml  | 2 +-
 common/pom.xml| 2 +-
 core/pom.xml  | 2 +-
 datamap/bloom/pom.xml | 2 +-
 datamap/examples/pom.xml  | 2 +-
 datamap/lucene/pom.xml| 2 +-
 datamap/mv/core/pom.xml   | 2 +-
 datamap/mv/plan/pom.xml   | 2 +-
 examples/spark2/pom.xml   | 2 +-
 format/pom.xml| 2 +-
 hadoop/pom.xml| 2 +-
 integration/hive/pom.xml  | 2 +-
 integration/presto/pom.xml| 2 +-
 integration/spark-common-test/pom.xml | 2 +-
 integration/spark-common/pom.xml  | 2 +-
 integration/spark-datasource/pom.xml  | 2 +-
 integration/spark2/pom.xml| 2 +-
 pom.xml   | 4 ++--
 processing/pom.xml| 2 +-
 store/sdk/pom.xml | 2 +-
 streaming/pom.xml | 2 +-
 tools/cli/pom.xml | 2 +-
 22 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/assembly/pom.xml b/assembly/pom.xml
index 7cc80dc..6d6c391 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -22,7 +22,7 @@
   
 org.apache.carbondata
 carbondata-parent
-1.6.0
+1.6.1-SNAPSHOT
 ../pom.xml
   
 
diff --git a/common/pom.xml b/common/pom.xml
index 8e5ddaa..728314c 100644
--- a/common/pom.xml
+++ b/common/pom.xml
@@ -22,7 +22,7 @@
   
 org.apache.carbondata
 carbondata-parent
-1.6.0
+1.6.1-SNAPSHOT
 ../pom.xml
   
 
diff --git a/core/pom.xml b/core/pom.xml
index b39a42e..22982f3 100644
--- a/core/pom.xml
+++ b/core/pom.xml
@@ -22,7 +22,7 @@
   
 org.apache.carbondata
 carbondata-parent
-1.6.0
+1.6.1-SNAPSHOT
 ../pom.xml
   
 
diff --git a/datamap/bloom/pom.xml b/datamap/bloom/pom.xml
index a29f77b..8ba7846 100644
--- a/datamap/bloom/pom.xml
+++ b/datamap/bloom/pom.xml
@@ -4,7 +4,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.6.0</version>
+    <version>1.6.1-SNAPSHOT</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/datamap/examples/pom.xml b/datamap/examples/pom.xml
index a9c179d..6e3b8ae 100644
--- a/datamap/examples/pom.xml
+++ b/datamap/examples/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.6.0</version>
+    <version>1.6.1-SNAPSHOT</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/datamap/lucene/pom.xml b/datamap/lucene/pom.xml
index 1a23a52..42a22b2 100644
--- a/datamap/lucene/pom.xml
+++ b/datamap/lucene/pom.xml
@@ -4,7 +4,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.6.0</version>
+    <version>1.6.1-SNAPSHOT</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/datamap/mv/core/pom.xml b/datamap/mv/core/pom.xml
index 5cb284d..6af274d 100644
--- a/datamap/mv/core/pom.xml
+++ b/datamap/mv/core/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.6.0</version>
+    <version>1.6.1-SNAPSHOT</version>
     <relativePath>../../../pom.xml</relativePath>
   </parent>
 
diff --git a/datamap/mv/plan/pom.xml b/datamap/mv/plan/pom.xml
index fe1afb7..4b8c9be 100644
--- a/datamap/mv/plan/pom.xml
+++ b/datamap/mv/plan/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.6.0</version>
+    <version>1.6.1-SNAPSHOT</version>
     <relativePath>../../../pom.xml</relativePath>
   </parent>
 
diff --git a/examples/spark2/pom.xml b/examples/spark2/pom.xml
index e303406..ad0d3ec 100644
--- a/examples/spark2/pom.xml
+++ b/examples/spark2/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.6.0</version>
+    <version>1.6.1-SNAPSHOT</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/format/pom.xml b/format/pom.xml
index 51135d8..81aa95b 100644
--- a/format/pom.xml
+++ b/format/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.6.0</version>
+    <version>1.6.1-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
diff --git a/hadoop/pom.xml b/hadoop/pom.xml
index 59f515e..bcb5696 100644
--- a/hadoop/pom.xml
+++ b/hadoop/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.6.0</version>
+    <version>1.6.1-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
diff --git a/integration/hive/pom.xml b/integration/hive/pom.xml
index 58b0796..dfa8810 100644
--- a/integration/hive/pom.xml
+++ b/integration/hive/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.6.0</version>
+    <version>1.6.1-SNAPSHOT</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/integration/presto/pom.xml b/integration/presto/pom.xml
index a2e9ef3..2631605 100644
--- a/integration/presto/pom.xml
+++ b/integration/presto/pom.xml
@@ -22,7 +22,7

[carbondata] annotated tag apache-carbondata-1.6.0-rc1 created (now 04c1e6b)

2019-07-15 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a change to annotated tag apache-carbondata-1.6.0-rc1
in repository https://gitbox.apache.org/repos/asf/carbondata.git.


  at 04c1e6b  (tag)
 tagging 9938633c6a80407876c7e7fa0ffd455164edff4b (commit)
  by ravipesala
  on Mon Jul 15 18:28:31 2019 +0530

- Log -----------------------------------------------------------------
[maven-release-plugin] copy for tag apache-carbondata-1.6.0-rc1
-----------------------------------------------------------------------

No new revisions were added by this update.



[carbondata] annotated tag apache-carbondata-1.6.0-rc1 deleted (was 68bae63)

2019-07-15 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a change to annotated tag apache-carbondata-1.6.0-rc1
in repository https://gitbox.apache.org/repos/asf/carbondata.git.


*** WARNING: tag apache-carbondata-1.6.0-rc1 was deleted! ***

   tag was  68bae63

The revisions that were on this annotated tag are still contained in
other references; therefore, this change does not discard any commits
from the repository.



[carbondata] annotated tag apache-carbondata-1.6.0-rc1 created (now 68bae63)

2019-07-15 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a change to annotated tag apache-carbondata-1.6.0-rc1
in repository https://gitbox.apache.org/repos/asf/carbondata.git.


  at 68bae63  (tag)
 tagging 9938633c6a80407876c7e7fa0ffd455164edff4b (commit)
  by ravipesala
  on Mon Jul 15 18:17:06 2019 +0530

- Log -----------------------------------------------------------------
[maven-release-plugin] copy for tag apache-carbondata-1.6.0-rc1
-----------------------------------------------------------------------

No new revisions were added by this update.



[carbondata] branch branch-1.6 created (now 9938633)

2019-07-15 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a change to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git.


  at 9938633  [maven-release-plugin] prepare release 
apache-carbondata-1.6.0-rc1

This branch includes the following new commits:

 new 9938633  [maven-release-plugin] prepare release 
apache-carbondata-1.6.0-rc1

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.




[carbondata] 01/01: [maven-release-plugin] prepare release apache-carbondata-1.6.0-rc1

2019-07-15 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.6
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit 9938633c6a80407876c7e7fa0ffd455164edff4b
Author: ravipesala 
AuthorDate: Mon Jul 15 18:14:51 2019 +0530

[maven-release-plugin] prepare release apache-carbondata-1.6.0-rc1
---
 assembly/pom.xml  |  2 +-
 common/pom.xml|  2 +-
 core/pom.xml  |  2 +-
 datamap/bloom/pom.xml |  6 ++
 datamap/examples/pom.xml  |  6 ++
 datamap/lucene/pom.xml|  6 ++
 datamap/mv/core/pom.xml   |  2 +-
 datamap/mv/plan/pom.xml   |  2 +-
 examples/spark2/pom.xml   |  2 +-
 format/pom.xml|  2 +-
 hadoop/pom.xml|  2 +-
 integration/hive/pom.xml  |  2 +-
 integration/presto/pom.xml|  2 +-
 integration/spark-common-test/pom.xml | 14 +++---
 integration/spark-common/pom.xml  |  2 +-
 integration/spark-datasource/pom.xml  |  2 +-
 integration/spark2/pom.xml|  2 +-
 pom.xml   |  4 ++--
 processing/pom.xml|  2 +-
 store/sdk/pom.xml |  6 ++
 streaming/pom.xml |  6 ++
 tools/cli/pom.xml |  6 ++
 22 files changed, 35 insertions(+), 47 deletions(-)

diff --git a/assembly/pom.xml b/assembly/pom.xml
index d88c91a..7cc80dc 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.6.0-SNAPSHOT</version>
+    <version>1.6.0</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
diff --git a/common/pom.xml b/common/pom.xml
index 14cd52f..8e5ddaa 100644
--- a/common/pom.xml
+++ b/common/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.6.0-SNAPSHOT</version>
+    <version>1.6.0</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
diff --git a/core/pom.xml b/core/pom.xml
index 41481af..b39a42e 100644
--- a/core/pom.xml
+++ b/core/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.6.0-SNAPSHOT</version>
+    <version>1.6.0</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
diff --git a/datamap/bloom/pom.xml b/datamap/bloom/pom.xml
index 1e8c382..a29f77b 100644
--- a/datamap/bloom/pom.xml
+++ b/datamap/bloom/pom.xml
@@ -1,12 +1,10 @@
-<project xmlns="http://maven.apache.org/POM/4.0.0"
-         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
-         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
   <modelVersion>4.0.0</modelVersion>
 
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.6.0-SNAPSHOT</version>
+    <version>1.6.0</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/datamap/examples/pom.xml b/datamap/examples/pom.xml
index 3720a1c..a9c179d 100644
--- a/datamap/examples/pom.xml
+++ b/datamap/examples/pom.xml
@@ -15,16 +15,14 @@
 See the License for the specific language governing permissions and
 limitations under the License.
 -->
-<project xmlns="http://maven.apache.org/POM/4.0.0"
-         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
-         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
 
   <modelVersion>4.0.0</modelVersion>
 
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.6.0-SNAPSHOT</version>
+    <version>1.6.0</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/datamap/lucene/pom.xml b/datamap/lucene/pom.xml
index 3e93a83..1a23a52 100644
--- a/datamap/lucene/pom.xml
+++ b/datamap/lucene/pom.xml
@@ -1,12 +1,10 @@
-<project xmlns="http://maven.apache.org/POM/4.0.0"
-         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
-         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
   <modelVersion>4.0.0</modelVersion>
 
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.6.0-SNAPSHOT</version>
+    <version>1.6.0</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/datamap/mv/core/pom.xml b/datamap/mv/core/pom.xml
index 0a1f0e2..5cb284d 100644
--- a/datamap/mv/core/pom.xml
+++ b/datamap/mv/core/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.6.0-SNAPSHOT</version>
+    <version>1.6.0</version>
     <relativePath>../../../pom.xml</relativePath>
   </parent>
 
diff --git a/datamap/mv/plan/pom.xml b/datamap/mv/plan/pom.xml
index 753d48b..fe1afb7 100644
--- a/datamap/mv/plan/pom.xml
+++ b/datamap/mv/plan/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbon

[carbondata] branch master updated: [CARBONDATA-3460] Fixed EOFException in CarbonScanRDD

2019-07-15 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new 0902d45  [CARBONDATA-3460] Fixed EOFException in CarbonScanRDD
0902d45 is described below

commit 0902d459a30e0fdd72868b2956eeb1c6b3b06346
Author: kunal642 
AuthorDate: Wed Jul 3 10:54:08 2019 +0530

[CARBONDATA-3460] Fixed EOFException in CarbonScanRDD

Problem: Delete delta information was not written properly in the
OutputStream due to the flag-based writing.

Solution: Always write the delete delta info; the size of the array will be
the deciding factor for whether to read further or not.

This closes #3316
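
For illustration, the length-prefixed pattern this fix moves to can be
sketched as follows (a minimal Scala sketch; the object and member names are
illustrative, not the actual CarbonInputSplit code):

import java.io.{DataInput, DataOutput}

// Sketch only: the reader consumes exactly the count the writer recorded,
// so no separate write/skip flag has to be kept in sync between the two
// serialization paths.
object DeleteDeltaSerDe {
  def write(out: DataOutput, files: Array[String]): Unit = {
    out.writeBoolean(files != null)      // presence marker
    if (files != null) {
      out.writeInt(files.length)         // the array size drives the read side
      files.foreach(f => out.writeUTF(f))
    }
  }

  def read(in: DataInput): Array[String] =
    if (in.readBoolean()) Array.fill(in.readInt())(in.readUTF()) else null
}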
---
 .../core/indexstore/ExtendedBlocklet.java  |  1 -
 .../apache/carbondata/hadoop/CarbonInputSplit.java | 52 +++---
 2 files changed, 26 insertions(+), 27 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/indexstore/ExtendedBlocklet.java
 
b/core/src/main/java/org/apache/carbondata/core/indexstore/ExtendedBlocklet.java
index d97148d..a85423b 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/indexstore/ExtendedBlocklet.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/indexstore/ExtendedBlocklet.java
@@ -177,7 +177,6 @@ public class ExtendedBlocklet extends Blocklet {
   DataOutputStream dos = new DataOutputStream(ebos);
   inputSplit.setFilePath(null);
   inputSplit.setBucketId(null);
-  inputSplit.setWriteDeleteDelta(false);
   if (inputSplit.isBlockCache()) {
 inputSplit.updateFooteroffset();
 inputSplit.updateBlockLength();
diff --git 
a/core/src/main/java/org/apache/carbondata/hadoop/CarbonInputSplit.java 
b/core/src/main/java/org/apache/carbondata/hadoop/CarbonInputSplit.java
index da1bc2c..edbfcfe 100644
--- a/core/src/main/java/org/apache/carbondata/hadoop/CarbonInputSplit.java
+++ b/core/src/main/java/org/apache/carbondata/hadoop/CarbonInputSplit.java
@@ -14,6 +14,7 @@
  * See the License for the specific language governing permissions and
  * limitations under the License.
  */
+
 package org.apache.carbondata.hadoop;
 
 import java.io.ByteArrayInputStream;
@@ -150,8 +151,6 @@ public class CarbonInputSplit extends FileSplit
*/
   private int rowCount;
 
-  private boolean writeDeleteDelta = true;
-
   public CarbonInputSplit() {
 segment = null;
 taskId = "0";
@@ -195,7 +194,13 @@ public class CarbonInputSplit extends FileSplit
 this.version = ColumnarFormatVersion.valueOf(in.readShort());
 // will be removed after count(*) optmization in case of index server
 this.rowCount = in.readInt();
-this.writeDeleteDelta = in.readBoolean();
+if (in.readBoolean()) {
+  int numberOfDeleteDeltaFiles = in.readInt();
+  deleteDeltaFiles = new String[numberOfDeleteDeltaFiles];
+  for (int i = 0; i < numberOfDeleteDeltaFiles; i++) {
+deleteDeltaFiles[i] = in.readUTF();
+  }
+}
 // after deseralizing required field get the start position of field which 
will be only used
 // in executor
 int leftoverPosition = underlineStream.getPosition();
@@ -359,7 +364,13 @@ public class CarbonInputSplit extends FileSplit
   this.length = in.readLong();
   this.version = ColumnarFormatVersion.valueOf(in.readShort());
   this.rowCount = in.readInt();
-  this.writeDeleteDelta = in.readBoolean();
+  if (in.readBoolean()) {
+int numberOfDeleteDeltaFiles = in.readInt();
+deleteDeltaFiles = new String[numberOfDeleteDeltaFiles];
+for (int i = 0; i < numberOfDeleteDeltaFiles; i++) {
+  deleteDeltaFiles[i] = in.readUTF();
+}
+  }
   this.bucketId = in.readUTF();
 }
 this.blockletId = in.readUTF();
@@ -379,13 +390,6 @@ public class CarbonInputSplit extends FileSplit
   validBlockletIds.add((int) in.readShort());
 }
 this.isLegacyStore = in.readBoolean();
-if (writeDeleteDelta) {
-  int numberOfDeleteDeltaFiles = in.readInt();
-  deleteDeltaFiles = new String[numberOfDeleteDeltaFiles];
-  for (int i = 0; i < numberOfDeleteDeltaFiles; i++) {
-deleteDeltaFiles[i] = in.readUTF();
-  }
-}
   }
 
   @Override public void write(DataOutput out) throws IOException {
@@ -397,11 +401,10 @@ public class CarbonInputSplit extends FileSplit
   out.writeLong(length);
   out.writeShort(version.number());
   out.writeInt(rowCount);
-  out.writeBoolean(writeDeleteDelta);
+  writeDeleteDeltaFile(out);
   out.writeUTF(bucketId);
   out.writeUTF(blockletId);
   out.write(serializeData, offset, actualLen);
-  writeDeleteDeltaFile(out);
   return;
 }
 // please refer writeDetailInfo doc
@@ -419,7 +422,7 @@ public class CarbonInputSplit extends FileSplit
 } else {
   out.writeInt(0);
 }
-out.writeBoole

[carbondata] branch master updated: [CARBONDATA-3459] Fixed id based distribution for showcache command

2019-07-15 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new a682f98  [CARBONDATA-3459] Fixed id based distribution for showcache 
command
a682f98 is described below

commit a682f98e885bedd3a3d980223937095861c27607
Author: kunal642 
AuthorDate: Wed Jul 3 00:53:43 2019 +0530

[CARBONDATA-3459] Fixed id based distribution for showcache command

Problem: Currently tasks are not being fired based on the executor ID,
because getPreferredLocations was not overridden.

Solution: Override getPreferredLocations in the DistributedShowCacheRDD and
InvalidateSegmentCacheRDD to fire tasks at the appropriate location.

This closes #3315
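
The general shape of the change can be sketched as a small RDD that pins its
tasks to executors (illustrative Scala only; the partition class and location
strings mirror the pattern in the patch, not the exact CarbonData code):

import org.apache.spark.{Partition, SparkContext, TaskContext}
import org.apache.spark.rdd.RDD

class ExecutorPinnedRDD(sc: SparkContext, executors: Array[String])
  extends RDD[String](sc, Nil) {

  private class PinnedPartition(val index: Int, val location: String)
    extends Partition

  override protected def getPartitions: Array[Partition] =
    executors.indices.map(i => new PinnedPartition(i, executors(i)): Partition).toArray

  // Without this override Spark may schedule the task on any executor; with
  // it, the task fires on the executor whose cache it must inspect.
  override protected def getPreferredLocations(split: Partition): Seq[String] = {
    val location = split.asInstanceOf[PinnedPartition].location
    if (location != null) Seq(location) else Seq()
  }

  override def compute(split: Partition, context: TaskContext): Iterator[String] =
    Iterator(split.asInstanceOf[PinnedPartition].location)
}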
---
 .../carbondata/indexserver/DistributedRDDUtils.scala  |  6 +++---
 .../carbondata/indexserver/DistributedShowCacheRDD.scala  |  8 
 .../indexserver/InvalidateSegmentCacheRDD.scala   | 15 ++-
 3 files changed, 25 insertions(+), 4 deletions(-)

diff --git 
a/integration/spark2/src/main/scala/org/apache/carbondata/indexserver/DistributedRDDUtils.scala
 
b/integration/spark2/src/main/scala/org/apache/carbondata/indexserver/DistributedRDDUtils.scala
index a568153..933ec15 100644
--- 
a/integration/spark2/src/main/scala/org/apache/carbondata/indexserver/DistributedRDDUtils.scala
+++ 
b/integration/spark2/src/main/scala/org/apache/carbondata/indexserver/DistributedRDDUtils.scala
@@ -316,10 +316,10 @@ object DistributedRDDUtils {
   if (existingSegmentMapping == null) {
 val newSegmentMapping = new ConcurrentHashMap[String, String]()
 newSegmentMapping.put(segment.getSegmentNo, s"${newHost}_$newExecutor")
-tableToExecutorMapping.put(tableUniqueName, newSegmentMapping)
+tableToExecutorMapping.putIfAbsent(tableUniqueName, newSegmentMapping)
   } else {
-existingSegmentMapping.put(segment.getSegmentNo, 
s"${newHost}_$newExecutor")
-tableToExecutorMapping.put(tableUniqueName, existingSegmentMapping)
+existingSegmentMapping.putIfAbsent(segment.getSegmentNo, 
s"${newHost}_$newExecutor")
+tableToExecutorMapping.putIfAbsent(tableUniqueName, 
existingSegmentMapping)
   }
   s"executor_${newHost}_$newExecutor"
 }
diff --git 
a/integration/spark2/src/main/scala/org/apache/carbondata/indexserver/DistributedShowCacheRDD.scala
 
b/integration/spark2/src/main/scala/org/apache/carbondata/indexserver/DistributedShowCacheRDD.scala
index 78b7e72..f1707c6 100644
--- 
a/integration/spark2/src/main/scala/org/apache/carbondata/indexserver/DistributedShowCacheRDD.scala
+++ 
b/integration/spark2/src/main/scala/org/apache/carbondata/indexserver/DistributedShowCacheRDD.scala
@@ -37,6 +37,14 @@ class DistributedShowCacheRDD(@transient private val ss: 
SparkSession, tableName
   }
   }.toArray
 
+  override protected def getPreferredLocations(split: Partition): Seq[String] 
= {
+if (split.asInstanceOf[DataMapRDDPartition].getLocations != null) {
+  split.asInstanceOf[DataMapRDDPartition].getLocations.toSeq
+} else {
+  Seq()
+}
+  }
+
   override protected def internalGetPartitions: Array[Partition] = {
 executorsList.zipWithIndex.map {
   case (executor, idx) =>
diff --git 
a/integration/spark2/src/main/scala/org/apache/carbondata/indexserver/InvalidateSegmentCacheRDD.scala
 
b/integration/spark2/src/main/scala/org/apache/carbondata/indexserver/InvalidateSegmentCacheRDD.scala
index c2bd589..750f9d9 100644
--- 
a/integration/spark2/src/main/scala/org/apache/carbondata/indexserver/InvalidateSegmentCacheRDD.scala
+++ 
b/integration/spark2/src/main/scala/org/apache/carbondata/indexserver/InvalidateSegmentCacheRDD.scala
@@ -30,7 +30,12 @@ import org.apache.carbondata.spark.rdd.CarbonRDD
 class InvalidateSegmentCacheRDD(@transient private val ss: SparkSession, 
carbonTable: CarbonTable,
 invalidSegmentIds: List[String]) extends CarbonRDD[String](ss, Nil) {
 
-  val executorsList: Array[String] = 
DistributionUtil.getNodeList(ss.sparkContext)
+  val executorsList: Array[String] = 
DistributionUtil.getExecutors(ss.sparkContext).flatMap {
+case (host, executors) =>
+  executors.map {
+executor => s"executor_${host}_$executor"
+  }
+  }.toArray
 
   override def internalCompute(split: Partition,
   context: TaskContext): Iterator[String] = {
@@ -38,6 +43,14 @@ class InvalidateSegmentCacheRDD(@transient private val ss: 
SparkSession, carbonT
 Iterator.empty
   }
 
+  override protected def getPreferredLocations(split: Partition): Seq[String] 
= {
+if (split.asInstanceOf[DataMapRDDPartition].getLocations != null) {
+  split.asInstanceOf[DataMapRDDPartition].getLocations.toSeq
+} else {
+  Seq()
+}
+  }
+
   override protected def internalGetPartitions: Array[Partition] = {
 if (invalidSegmentIds.isEmpty) {
   Array()



[carbondata] branch master updated: [HOTFIX] Fixed MinMax Based Pruning for Measure column in case of Legacy store

2019-07-15 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new b017253  [HOTFIX] Fixed MinMax Based Pruning for Measure column in 
case of Legacy store
b017253 is described below

commit b017253f4eb0fb78e8249e895a8a2a4d2ab929da
Author: Indhumathi27 
AuthorDate: Tue Jul 9 14:01:07 2019 +0530

[HOTFIX] Fixed MinMax Based Pruning for Measure column in case of Legacy 
store

This closes #3320
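
The guard this hotfix adds can be reduced to the following sketch (names are
illustrative): when a legacy store never wrote min/max for the measure
column, the comparison cannot be trusted and the block must be scanned.

// Sketch: isMinMaxSet(chunkIndex) is false for legacy blocks that carry no
// min/max for this measure; pruning on those values would be pruning on
// garbage, so the only safe answer is "scan required".
def scanRequired(isMinMaxSet: Array[Boolean], chunkIndex: Int)
    (minMaxCompare: => Boolean): Boolean =
  if (isMinMaxSet(chunkIndex)) minMaxCompare else true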
---
 .../core/scan/filter/executer/IncludeFilterExecuterImpl.java | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/scan/filter/executer/IncludeFilterExecuterImpl.java
 
b/core/src/main/java/org/apache/carbondata/core/scan/filter/executer/IncludeFilterExecuterImpl.java
index 1231aa0..bfa2460 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/scan/filter/executer/IncludeFilterExecuterImpl.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/scan/filter/executer/IncludeFilterExecuterImpl.java
@@ -509,9 +509,12 @@ public class IncludeFilterExecuterImpl implements 
FilterExecuter {
   }
 } else if (isMeasurePresentInCurrentBlock) {
   chunkIndex = msrColumnEvaluatorInfo.getColumnIndexInMinMaxByteArray();
-  isScanRequired = isScanRequired(blkMaxVal[chunkIndex], 
blkMinVal[chunkIndex],
-  msrColumnExecutorInfo.getFilterKeys(),
-  msrColumnEvaluatorInfo.getType());
+  if (isMinMaxSet[chunkIndex]) {
+isScanRequired = isScanRequired(blkMaxVal[chunkIndex], 
blkMinVal[chunkIndex],
+msrColumnExecutorInfo.getFilterKeys(), 
msrColumnEvaluatorInfo.getType());
+  } else {
+isScanRequired = true;
+  }
 }
 
 if (isScanRequired) {



[carbondata] branch master updated: [CARBONDATA-3467] Fix count(*) with filter on string column

2019-07-12 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new ebe78dc  [CARBONDATA-3467] Fix count(*) with filter on string column
ebe78dc is described below

commit ebe78dca170773a5f4a37e8146a923b2dc6604a4
Author: Indhumathi27 
AuthorDate: Tue Jul 9 09:10:24 2019 +0530

[CARBONDATA-3467] Fix count(*) with filter on string column

Problem:
count(*) with a filter on a string column throws an UnresolvedException.
Solution:
Added a check for UnresolvedAlias in MVAnalyzerRule.

This closes #3319
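
The essence of the check, sketched with Catalyst types (illustrative, not
the exact MVAnalyzerRule code): an UnresolvedAlias has no resolvable name
yet, so calling .name on it throws, and it must be matched away first.

import org.apache.spark.sql.catalyst.analysis.UnresolvedAlias
import org.apache.spark.sql.catalyst.expressions.NamedExpression

def isPreAggLoad(aExp: Seq[NamedExpression]): Boolean =
  aExp.exists {
    case _: UnresolvedAlias => false   // name is not resolvable yet; skip it
    case p => p.name.equals("preAggLoad") || p.name.equals("preAgg")
  }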
---
 .../org/apache/carbondata/mv/datamap/MVAnalyzerRule.scala   |  9 -
 .../carbondata/mv/rewrite/TestAllOperationsOnMV.scala   | 13 -
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git 
a/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVAnalyzerRule.scala
 
b/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVAnalyzerRule.scala
index 04bcfbb..edd9c81 100644
--- 
a/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVAnalyzerRule.scala
+++ 
b/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVAnalyzerRule.scala
@@ -70,7 +70,14 @@ class MVAnalyzerRule(sparkSession: SparkSession) extends 
Rule[LogicalPlan] {
 plan.transform {
   case aggregate@Aggregate(grp, aExp, child) =>
 // check for if plan is for dataload for preaggregate table, then skip 
applying mv
-if (aExp.exists(p => p.name.equals("preAggLoad") || 
p.name.equals("preAgg"))) {
+val isPreAggLoad = aExp.exists { p =>
+  if (p.isInstanceOf[UnresolvedAlias]) {
+false
+  } else {
+p.name.equals("preAggLoad") || p.name.equals("preAgg")
+  }
+}
+if (isPreAggLoad) {
   needAnalysis = false
 }
 Aggregate(grp, aExp, child)
diff --git 
a/datamap/mv/core/src/test/scala/org/apache/carbondata/mv/rewrite/TestAllOperationsOnMV.scala
 
b/datamap/mv/core/src/test/scala/org/apache/carbondata/mv/rewrite/TestAllOperationsOnMV.scala
index 839a2e6..81ddf38 100644
--- 
a/datamap/mv/core/src/test/scala/org/apache/carbondata/mv/rewrite/TestAllOperationsOnMV.scala
+++ 
b/datamap/mv/core/src/test/scala/org/apache/carbondata/mv/rewrite/TestAllOperationsOnMV.scala
@@ -540,6 +540,17 @@ class TestAllOperationsOnMV extends QueryTest with 
BeforeAndAfterEach {
 }.getMessage.contains("Operation not allowed on child table.")
   }
 
+  test("test count(*) with filter") {
+sql("drop table if exists maintable")
+sql("create table maintable(id int, name string, id1 string, id2 string, 
dob timestamp, doj " +
+"timestamp, v1 bigint, v2 bigint, v3 decimal(30,10), v4 
decimal(20,10), v5 double, v6 " +
+"double ) stored by 'carbondata'")
+sql("insert into maintable values(1, 'abc', 'id001', 'id002', '2017-01-01 
00:00:00','2017-01-01 " +
+"00:00:00', 234, 2242,12.4,23.4,2323,455 )")
+checkAnswer(sql("select count(*) from maintable where  id1 < id2"), 
Seq(Row(1)))
+sql("drop table if exists maintable")
+  }
+
   test("drop meta cache on mv datamap table") {
 sql("drop table IF EXISTS maintable")
 sql("create table maintable(name string, c_code int, price int) stored by 
'carbondata'")
@@ -580,6 +591,6 @@ class TestAllOperationsOnMV extends QueryTest with 
BeforeAndAfterEach {
 newSet.addAll(oldSet)
 newSet
   }
-
+  
 }
 



[carbondata] branch master updated: [CARBONDATA-3457][MV] Fix Column not found issue with Query having Cast Expression

2019-07-12 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new 771d436  [CARBONDATA-3457][MV] Fix Column not found issue with Query 
having Cast Expression
771d436 is described below

commit 771d436fe2ed2d34ccf0ee1d8f555af30c382345
Author: Indhumathi27 
AuthorDate: Thu Jun 27 17:09:20 2019 +0530

[CARBONDATA-3457][MV] Fix Column not found issue with Query having Cast 
Expression

Problem:
For Cast(exp), the alias reference is not included, hence a "column not
found" exception is thrown for the column given inside the cast expression.

Solution:
An alias map has to be created for CAST[EXP] as well, and the references
should be replaced with the subsumer's alias map references.

This closes #3312
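
The core of the fix is an extra alias map keyed by the alias child
expression, so a CAST in the user query can be swapped for the subsumer's
output attribute. A minimal Scala sketch (illustrative, not the full
DefaultMatchMaker code):

import org.apache.spark.sql.catalyst.expressions.{Alias, Attribute, Expression, NamedExpression}
import org.apache.spark.sql.catalyst.expressions.aggregate.AggregateExpression

def buildExpressionAliasMap(outputList: Seq[NamedExpression]): Map[Expression, Attribute] =
  outputList.collect {
    // e.g. Alias(Cast(floor(...) as int), "a") maps Cast(...) -> attribute a
    case a: Alias if !a.child.isInstanceOf[AggregateExpression] =>
      (a.child, a.toAttribute)
  }.toMap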
---
 .../carbondata/mv/rewrite/DefaultMatchMaker.scala  | 16 ++
 .../carbondata/mv/rewrite/MVCreateTestCase.scala   | 58 ++
 2 files changed, 74 insertions(+)

diff --git 
a/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/rewrite/DefaultMatchMaker.scala
 
b/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/rewrite/DefaultMatchMaker.scala
index 9a9a2a6..5329608 100644
--- 
a/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/rewrite/DefaultMatchMaker.scala
+++ 
b/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/rewrite/DefaultMatchMaker.scala
@@ -53,6 +53,14 @@ abstract class DefaultMatchPattern extends 
MatchPattern[ModularPlan] {
 (a.child.asInstanceOf[Attribute], a.toAttribute)
   })
 
+// Create aliasMap with Expression to alias reference attribute
+val aliasMapExp =
+  subsumer.outputList.collect {
+case a: Alias if a.child.isInstanceOf[Expression] &&
+ !a.child.isInstanceOf[AggregateExpression] =>
+  a.child -> a.toAttribute
+  }.toMap
+
 // Check and replace all alias references with subsumer alias map 
references.
 val compensation1 = compensation.transform {
   case plan if !plan.skip && plan != subsumer =>
@@ -66,6 +74,14 @@ abstract class DefaultMatchPattern extends 
MatchPattern[ModularPlan] {
   exprId = ref.exprId,
   qualifier = a.qualifier)
   }.getOrElse(a)
+  case a: Expression =>
+aliasMapExp
+  .get(a)
+  .map { ref =>
+AttributeReference(
+  ref.name, ref.dataType)(
+  exprId = ref.exprId)
+  }.getOrElse(a)
   }
 }
 
diff --git 
a/datamap/mv/core/src/test/scala/org/apache/carbondata/mv/rewrite/MVCreateTestCase.scala
 
b/datamap/mv/core/src/test/scala/org/apache/carbondata/mv/rewrite/MVCreateTestCase.scala
index 1d259c8..ca6c0c5 100644
--- 
a/datamap/mv/core/src/test/scala/org/apache/carbondata/mv/rewrite/MVCreateTestCase.scala
+++ 
b/datamap/mv/core/src/test/scala/org/apache/carbondata/mv/rewrite/MVCreateTestCase.scala
@@ -1169,6 +1169,64 @@ class MVCreateTestCase extends QueryTest with 
BeforeAndAfterAll {
 assert(TestUtil.verifyMVDataMap(analyzed1, "da_cast"))
   }
 
+  test("test cast of expression with mv") {
+sql("drop table IF EXISTS maintable")
+sql("create table maintable (m_month bigint, c_code string, " +
+"c_country smallint, d_dollar_value double, q_quantity double, u_unit 
smallint, b_country smallint, i_id int, y_year smallint) stored by 
'carbondata'")
+sql("insert into maintable select 10, 'xxx', 123, 456, 45, 5, 23, 1, 2000")
+sql("drop datamap if exists da_cast")
+sql(
+  "create datamap da_cast using 'mv' as select cast(floor((m_month +1000) 
/ 900) * 900 - 2000 AS INT) as a, c_code as abc from maintable")
+val df1 = sql(
+  " select cast(floor((m_month +1000) / 900) * 900 - 2000 AS INT) as a 
,c_code as abc  from maintable")
+val df2 = sql(
+  " select cast(floor((m_month +1000) / 900) * 900 - 2000 AS INT),c_code 
as abc  from maintable")
+val analyzed1 = df1.queryExecution.analyzed
+assert(TestUtil.verifyMVDataMap(analyzed1, "da_cast"))
+  }
+
+  test("test cast with & without alias") {
+sql("drop table IF EXISTS maintable")
+sql("create table maintable (m_month bigint, c_code string, " +
+"c_country smallint, d_dollar_value double, q_quantity double, u_unit 
smallint, b_country smallint, i_id int, y_year smallint) stored by 
'carbondata'")
+sql("insert into maintable select 10, 'xxx', 123, 456, 45, 5, 23, 1, 2000")
+sql("drop datamap if exists da_cast")
+sql(
+  "create datamap da_cast using 'mv' as select cast(m_month + 1000 AS INT) 
as a, c_code as abc from maintable")
+checkAnswer(sql(

[carbondata] branch master updated: [CARBONDATA-3456] Fix DataLoading to MV table when Yarn-Application is killed

2019-07-12 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new cdf0594  [CARBONDATA-3456] Fix DataLoading to MV table when 
Yarn-Application is killed
cdf0594 is described below

commit cdf0594cb4fefcec6a892692daca2d73f40ccd19
Author: Indhumathi27 
AuthorDate: Thu Jun 27 18:16:04 2019 +0530

[CARBONDATA-3456] Fix DataLoading to MV table when Yarn-Application is 
killed

Problem:
When a data load is triggered on the datamap table, a new LoadMetadataDetail
with SegmentStatus InsertInProgress and its segment mapping info is created,
and then the Yarn application is killed. On the next load, the stale
LoadMetadataDetail is still in the InsertInProgress state, and the main table
segments mapped to that LoadMetadataDetail are not considered for the next
load, resulting in a data mismatch between the main table and the datamap
table.

Solution:
Clean up the old invalid segment before creating a new entry for the new load.

This closes #3310
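
A simplified sketch of the cleanup idea, under stated assumptions (the
status strings and types below are illustrative stand-ins for
LoadMetadataDetails): any load detail stuck in an in-progress state that no
live application owns is invalidated before the new entry is written.

case class LoadDetail(loadName: String, var status: String)

def cleanStaleLoads(details: Seq[LoadDetail], runningLoads: Set[String]): Unit =
  details.foreach { detail =>
    val inProgress = detail.status == "INSERT_IN_PROGRESS" ||
      detail.status == "INSERT_OVERWRITE_IN_PROGRESS"
    // An in-progress load with no live owner was orphaned by a killed
    // application: mark it for delete so the main table segments it mapped
    // are considered again by the next load.
    if (inProgress && !runningLoads.contains(detail.loadName)) {
      detail.status = "MARKED_FOR_DELETE"
    }
  }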
---
 .../carbondata/core/datamap/DataMapProvider.java   | 25 
 .../carbondata/core/datamap/DataMapUtil.java   | 18 ++-
 .../core/datamap/dev/DataMapSyncStatus.java| 19 ---
 .../carbondata/core/metadata/SegmentFileStore.java |  2 +-
 .../core/statusmanager/SegmentStatusManager.java   | 27 ++
 .../apache/carbondata/core/util/CarbonUtil.java|  2 +-
 .../bloom/BloomCoarseGrainDataMapFactory.java  |  3 ++-
 .../datamap/lucene/LuceneDataMapFactoryBase.java   |  3 ++-
 .../carbondata/mv/datamap/MVDataMapProvider.scala  |  8 ++-
 .../mv/rewrite/MVIncrementalLoadingTestcase.scala  |  6 +++--
 .../hadoop/api/CarbonOutputCommitter.java  |  5 ++--
 .../hadoop/api/CarbonTableInputFormat.java |  6 +++--
 .../carbondata/datamap/IndexDataMapProvider.java   |  4 ++--
 .../datamap/PreAggregateDataMapProvider.java   |  4 ++--
 .../datamap/IndexDataMapRebuildRDD.scala   |  3 ++-
 .../spark/rdd/CarbonDataRDDFactory.scala   |  1 +
 .../spark/sql/events/MergeIndexEventListener.scala |  2 +-
 .../sql/execution/command/cache/CacheUtil.scala|  4 ++--
 .../command/cache/DropCacheEventListeners.scala|  3 ++-
 .../command/datamap/CarbonDataMapShowCommand.scala |  5 ++--
 .../command/mutation/HorizontalCompaction.scala|  6 +++--
 .../CarbonAlterTableDropHivePartitionCommand.scala |  2 +-
 .../CarbonAlterTableDropPartitionCommand.scala |  3 ++-
 .../CarbonAlterTableSplitPartitionCommand.scala|  3 ++-
 .../org/apache/spark/sql/hive/CarbonRelation.scala |  4 ++--
 .../org/apache/spark/util/MergeIndexUtil.scala |  2 +-
 .../processing/merger/CarbonDataMergerUtil.java|  7 +++---
 27 files changed, 120 insertions(+), 57 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapProvider.java 
b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapProvider.java
index d0b66f3..c320226 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapProvider.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapProvider.java
@@ -129,10 +129,15 @@ public abstract class DataMapProvider {
 }
 String newLoadName = "";
 String segmentMap = "";
-AbsoluteTableIdentifier dataMapTableAbsoluteTableIdentifier = 
AbsoluteTableIdentifier
-.from(dataMapSchema.getRelationIdentifier().getTablePath(),
+CarbonTable dataMapTable = CarbonTable
+
.buildFromTablePath(dataMapSchema.getRelationIdentifier().getTableName(),
 dataMapSchema.getRelationIdentifier().getDatabaseName(),
-dataMapSchema.getRelationIdentifier().getTableName());
+dataMapSchema.getRelationIdentifier().getTablePath(),
+dataMapSchema.getRelationIdentifier().getTableId());
+AbsoluteTableIdentifier dataMapTableAbsoluteTableIdentifier =
+dataMapTable.getAbsoluteTableIdentifier();
+// Clean up the old invalid segment data before creating a new entry for 
new load.
+SegmentStatusManager.deleteLoadsAndUpdateMetadata(dataMapTable, false, 
null);
 SegmentStatusManager segmentStatusManager =
 new SegmentStatusManager(dataMapTableAbsoluteTableIdentifier);
 Map> segmentMapping = new HashMap<>();
@@ -148,6 +153,15 @@ public abstract class DataMapProvider {
 
CarbonTablePath.getMetadataPath(dataMapSchema.getRelationIdentifier().getTablePath());
 LoadMetadataDetails[] loadMetaDataDetails =
 SegmentStatusManager.readLoadMetadata(dataMapTableMetadataPath);
+// Mark for delete all stale loadMetadetail
+for (LoadMetadataDetails loadMetadataDetail : loadMetaDataDetails) {
+  if ((loadMetadataDetail.getSegmentStatus() == 
SegmentStatus.INSERT_IN_PROGRESS
+  || loadMe

svn commit: r34819 - /release/carbondata/1.5.4/apache-carbondata-1.5.4-source-release.zip.asc

2019-07-09 Thread ravipesala
Author: ravipesala
Date: Tue Jul  9 16:08:24 2019
New Revision: 34819

Log:
Checkin 1.5.4

Added:
release/carbondata/1.5.4/apache-carbondata-1.5.4-source-release.zip.asc

Added: release/carbondata/1.5.4/apache-carbondata-1.5.4-source-release.zip.asc
==============================================================================
--- release/carbondata/1.5.4/apache-carbondata-1.5.4-source-release.zip.asc (added)
+++ release/carbondata/1.5.4/apache-carbondata-1.5.4-source-release.zip.asc Tue Jul  9 16:08:24 2019
@@ -0,0 +1,16 @@
+-----BEGIN PGP SIGNATURE-----
+
+iQIzBAEBCgAdFiEER3EpqJTxH7zLwCVHutcqeKexsu4FAl0ku8MACgkQutcqeKex
+su4Gmw//WcMUGJwO5RgIZWkyBgScoksV/tGfTVyckO8IS0cQcpeTFZ3mzrWkz5Me
+8PjGaFvfn687dXV+wOZ2XYLkJB8HmYWhm2uq4ET/7pv2yRkc6BfvJvKA8oSPPcfg
+Cbwlc174xQaLWb2a+3rLIT2Q2CCuHy+dc3vL1StZaDibCs7ecDZ+KAf/SMVizYWI
+2aialZ0m9xvfIb5d3ENadP+8VcCHzpkdfyDzsNfpLKkYV87C04MKNJHwMRI2wKKd
+FNg9PWLkGrPiR5/zWUSmIrcxB5V0SyKRa/7rZdsAgd5oIok3itp8NIUohQNDv7iM
+Cvqedq4+Woi1Lm2BRrpx1alwm4cP04iwvzQQXi9YglHzSXbnZd4JN6qbdoLrpV/w
+5k2V2x5dPZjWMtRJ/HraL0bCvam7D5ghIUAYvfN5F8c7YUDM28rkDV1aNhSXON8D
+YNrf8wJzno3U97q50RmyfU6zkTKC1aV5XwW34ZbSOw9SqTAmY397RjAGnHqcsfNw
+NjELgGMPcUmrTDPv+mpXKBNMfFBoKgg09EMy1jyDAmGAhQF5X5rtvzeIbAfprIbG
+V+omKApIBHzibq65tw0f5QmhRwrClGOsDnhkbkRxybzzYDFjuocTGiBpTkZ9CNUw
+DCXI7o6ZC/8q8zdOi6ACCNIiIzbdQRZoJyVeQmGzBLHa7SryY3Y=
+=VnUg
+-----END PGP SIGNATURE-----




svn commit: r34818 - /release/carbondata/1.5.4/apache-carbondata-1.5.4-source-release.zip.asc

2019-07-09 Thread ravipesala
Author: ravipesala
Date: Tue Jul  9 16:06:19 2019
New Revision: 34818

Log:
Checkin 1.5.4

Removed:
release/carbondata/1.5.4/apache-carbondata-1.5.4-source-release.zip.asc



svn commit: r34798 - /release/carbondata/1.5.4/apache-carbondata-1.5.4-source-release.zip.asc

2019-07-08 Thread ravipesala
Author: ravipesala
Date: Mon Jul  8 15:56:07 2019
New Revision: 34798

Log:
Checkin 1.5.4

Modified:
release/carbondata/1.5.4/apache-carbondata-1.5.4-source-release.zip.asc

Modified: release/carbondata/1.5.4/apache-carbondata-1.5.4-source-release.zip.asc
==============================================================================
Binary files - no diff available.




[carbondata] branch master updated: [CARBONDATA-3440] Updated alter table DDL to accept upgrade_segments as a compaction type

2019-07-02 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new 785cc6c  [CARBONDATA-3440] Updated alter table DDL to accept 
upgrade_segments as a compaction type
785cc6c is described below

commit 785cc6cbb9aecd7cc90c892a9479855b9b403be4
Author: kunal642 
AuthorDate: Tue Jun 11 19:57:23 2019 +0530

[CARBONDATA-3440] Updated alter table DDL to accept upgrade_segments as a 
compaction type

Updated alter table DDL to accept upgrade_segments as a compaction type.
Made legacy segment distribution round-robin based.

This closes #3277
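
The round-robin part can be sketched independently of the DDL change
(illustrative Scala; assumes a non-empty executor list): legacy segments are
dealt out to executors in order, wrapping around, instead of being hashed.

// Each executor receives at most ceil(segments.size / executors.size)
// legacy segments, which keeps the upgrade work evenly spread.
def assignRoundRobin(segments: Seq[String],
    executors: IndexedSeq[String]): Map[String, String] =
  segments.zipWithIndex.map { case (segmentNo, i) =>
    segmentNo -> executors(i % executors.size)
  }.toMap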
---
 .../core/datamap/DistributableDataMapFormat.java   |  38 +--
 .../apache/carbondata/core/datamap/Segment.java|  13 -
 .../core/indexstore/ExtendedBlocklet.java  |   1 +
 .../core/indexstore/ExtendedBlockletWrapper.java   |   2 +-
 .../blockletindex/BlockletDataMapFactory.java  |   2 +-
 .../core/metadata/schema/table/CarbonTable.java| 311 ++---
 .../apache/carbondata/core/util/SessionParams.java |   4 +-
 .../apache/carbondata/hadoop/CarbonInputSplit.java |  44 ++-
 .../carbondata/hadoop/api/CarbonInputFormat.java   |   4 +-
 .../carbondata/indexserver/DataMapJobs.scala   |  30 +-
 .../indexserver/DistributedPruneRDD.scala  |   2 +-
 .../indexserver/DistributedRDDUtils.scala  |  57 +++-
 .../carbondata/indexserver/IndexServer.scala   |  37 ++-
 .../CarbonAlterTableCompactionCommand.scala|  32 ++-
 .../restructure/AlterTableUpgradeSegmentTest.scala |  50 
 .../processing/merger/CompactionType.java  |   1 +
 16 files changed, 385 insertions(+), 243 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/datamap/DistributableDataMapFormat.java
 
b/core/src/main/java/org/apache/carbondata/core/datamap/DistributableDataMapFormat.java
index cdc9e5c..8426fcb 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/datamap/DistributableDataMapFormat.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/datamap/DistributableDataMapFormat.java
@@ -24,6 +24,7 @@ import java.nio.charset.Charset;
 import java.util.ArrayList;
 import java.util.Iterator;
 import java.util.List;
+import java.util.UUID;
 
 import org.apache.carbondata.common.logging.LogServiceFactory;
 import org.apache.carbondata.core.constants.CarbonCommonConstants;
@@ -84,9 +85,12 @@ public class DistributableDataMapFormat extends 
FileInputFormat segmentsToLoad = new ArrayList<>();
 segmentsToLoad.add(distributable.getDistributable().getSegment());
 List blocklets = new ArrayList<>();
-DataMapChooser dataMapChooser = null;
-if (null != filterResolverIntf) {
-  dataMapChooser = new DataMapChooser(table);
-}
 if (dataMapLevel == null) {
   TableDataMap defaultDataMap = DataMapStoreManager.getInstance()
   .getDataMap(table, 
distributable.getDistributable().getDataMapSchema());
   dataMaps = 
defaultDataMap.getTableDataMaps(distributable.getDistributable());
-  if (table.isTransactionalTable()) {
-blocklets = defaultDataMap.prune(dataMaps, 
distributable.getDistributable(),
-filterResolverIntf, partitions);
-  } else {
-blocklets = defaultDataMap.prune(segmentsToLoad, new 
DataMapFilter(filterResolverIntf),
-partitions);
-  }
+  blocklets = defaultDataMap
+  .prune(segmentsToLoad, new DataMapFilter(filterResolverIntf), 
partitions);
   blocklets = DataMapUtil
   .pruneDataMaps(table, filterResolverIntf, segmentsToLoad, 
partitions, blocklets,
   dataMapChooser);
@@ -380,10 +374,6 @@ public class DistributableDataMapFormat extends FileInputFormat<Void, ExtendedBlocklet>
+  public List<String> getValidSegmentIds() {
+    List<String> validSegments = new ArrayList<>();
+    for (Segment segment : this.validSegments) {
+      validSegments.add(segment.getSegmentNo());
+    }
+    return validSegments;
+  }
+
+  public void createDataMapChooser() throws IOException {
+    if (null != filterResolverIntf) {
+      this.dataMapChooser = new DataMapChooser(table);
+    }
+  }
 }
diff --git a/core/src/main/java/org/apache/carbondata/core/datamap/Segment.java 
b/core/src/main/java/org/apache/carbondata/core/datamap/Segment.java
index 9370be8..ad80182 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/Segment.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/Segment.java
@@ -69,11 +69,6 @@ public class Segment implements Serializable, Writable {
 
   private long indexSize = 0;
 
-  /**
-   * Whether to cache the segment data maps in executors or not.
-   */
-  private boolean isCacheable = true;
-
   public Segment() {
 
   }
@@ -287,14 +282,6 @@ public class Segment implements Seri

[carbondata] branch master updated: [CARBONDATA-3398] Handled show cache for index server and MV

2019-06-22 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new f708efb  [CARBONDATA-3398] Handled show cache for index server and MV
f708efb is described below

commit f708efb183d0247f9d6a46f7dff6bb4507998f3f
Author: kunal642 
AuthorDate: Tue May 28 15:30:49 2019 +0530

[CARBONDATA-3398] Handled show cache for index server and MV

Added support to show/drop metacache information from the index server.
Added a tableNotFoundException fix for when dbName and tableName contain '_'
in their names: when splitting on '_', the dbName was extracted wrongly.
Instead, dbName and tableName are now separated by '-' internally for show cache.

This closes #3245
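
Why the separator change matters, in a small illustrative sketch: '_' is
legal inside database and table names, so the first split point is
ambiguous, while '-' is not legal in plain identifiers and splits exactly.

def splitAtFirst(key: String, separator: Char): (String, String) = {
  val i = key.indexOf(separator)
  (key.substring(0, i), key.substring(i + 1))
}

// "my_db" and "my_table" joined with '_' cannot be recovered:
//   splitAtFirst("my_db_my_table", '_') == ("my", "db_my_table")
// Joined with '-' the split is exact:
//   splitAtFirst("my_db-my_table", '-') == ("my_db", "my_table")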
---
 .../core/datamap/dev/DataMapFactory.java   |   4 +
 .../core/indexstore/BlockletDetailsFetcher.java|   2 -
 .../blockletindex/BlockletDataMapFactory.java  |  13 +-
 .../bloom/BloomCoarseGrainDataMapFactory.java  |  16 +
 .../hadoop/api/CarbonTableInputFormat.java |   4 +-
 .../sql/commands/TestCarbonShowCacheCommand.scala  |  35 +-
 .../apache/carbondata/spark/util/CommonUtil.scala  |   9 +-
 .../carbondata/indexserver/DataMapJobs.scala   |   2 +-
 .../indexserver/DistributedShowCacheRDD.scala  |  32 +-
 .../carbondata/indexserver/IndexServer.scala   |   9 +-
 .../scala/org/apache/spark/sql/CarbonEnv.scala |   2 +-
 .../command/cache/CarbonShowCacheCommand.scala | 465 ++---
 .../command/cache/ShowCacheEventListeners.scala|  78 ++--
 .../scala/org/apache/spark/util/DataMapUtil.scala  |   2 +-
 14 files changed, 428 insertions(+), 245 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMapFactory.java 
b/core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMapFactory.java
index 3fa7be6..1116525 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMapFactory.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMapFactory.java
@@ -192,4 +192,8 @@ public abstract class DataMapFactory {
   public boolean supportRebuild() {
 return false;
   }
+
+  public String getCacheSize() {
+return null;
+  }
 }
diff --git 
a/core/src/main/java/org/apache/carbondata/core/indexstore/BlockletDetailsFetcher.java
 
b/core/src/main/java/org/apache/carbondata/core/indexstore/BlockletDetailsFetcher.java
index 5eace3c..ae01e9e 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/indexstore/BlockletDetailsFetcher.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/indexstore/BlockletDetailsFetcher.java
@@ -60,6 +60,4 @@ public interface BlockletDetailsFetcher {
* clears the datamap from cache and segmentMap from executor
*/
   void clear();
-
-  String getCacheSize() throws IOException ;
 }
diff --git 
a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
 
b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
index cab1b8b..f928976 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
@@ -302,14 +302,19 @@ public class BlockletDataMapFactory extends 
CoarseGrainDataMapFactory
 }
   }
 
-  @Override public String getCacheSize() throws IOException {
+  @Override
+  public String getCacheSize() {
 long sum = 0L;
 int numOfIndexFiles = 0;
     for (Map.Entry<String, Set<TableBlockIndexUniqueIdentifier>> entry : segmentMap.entrySet()) {
   for (TableBlockIndexUniqueIdentifier tableBlockIndexUniqueIdentifier : 
entry.getValue()) {
-sum += cache.get(new 
TableBlockIndexUniqueIdentifierWrapper(tableBlockIndexUniqueIdentifier,
-getCarbonTable())).getMemorySize();
-numOfIndexFiles++;
+BlockletDataMapIndexWrapper blockletDataMapIndexWrapper = 
cache.getIfPresent(
+new 
TableBlockIndexUniqueIdentifierWrapper(tableBlockIndexUniqueIdentifier,
+getCarbonTable()));
+if (blockletDataMapIndexWrapper != null) {
+  sum += blockletDataMapIndexWrapper.getMemorySize();
+  numOfIndexFiles++;
+}
   }
 }
 return numOfIndexFiles + ":" + sum;
diff --git 
a/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapFactory.java
 
b/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapFactory.java
index 03599a9..f261871 100644
--- 
a/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapFactory.java
+++ 
b/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapFactory.java
@@ -453,4 +453,20 @@ public class BloomCoarseGrainDataMapFactory extends 
D

[carbondata] branch master updated: [CARBONDATA-3409] Fix Concurrent dataloading Issue with mv

2019-06-05 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new 9d02092  [CARBONDATA-3409] Fix Concurrent dataloading Issue with mv
9d02092 is described below

commit 9d0209226eb3be7735da7cd66d88cece0141e7e5
Author: Indhumathi27 
AuthorDate: Fri May 31 15:53:01 2019 +0530

[CARBONDATA-3409] Fix Concurrent dataloading Issue with mv

Problem:
While performing concurrent data loading to an MV datamap, if any of the
loads was not able to get the TableStatusLock, then, because newLoadName and
segmentMap were empty, it was doing a full rebuild.

Solution:
If the load was not able to take the table status lock, disable the datamap
and return.

This closes #3252
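
The control flow of the fix, reduced to a sketch (the function parameters
stand in for the real lock, datamap-status and load calls, which are not
shown in full here):

def loadToDataMap(tryLock: () => Boolean, disableDataMap: () => Unit,
    doIncrementalLoad: () => Unit): Boolean =
  if (tryLock()) {
    doIncrementalLoad()
    true
  } else {
    // Queries stay correct: a disabled MV is simply not used for rewriting,
    // which is safer than an unintended full rebuild.
    disableDataMap()
    false
  }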
---
 .../main/java/org/apache/carbondata/core/datamap/DataMapProvider.java  | 3 +++
 1 file changed, 3 insertions(+)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapProvider.java 
b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapProvider.java
index c4ee49b..6a9d2c5 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapProvider.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapProvider.java
@@ -27,6 +27,7 @@ import org.apache.carbondata.common.logging.LogServiceFactory;
 import org.apache.carbondata.core.constants.CarbonCommonConstants;
 import org.apache.carbondata.core.datamap.dev.DataMapFactory;
 import org.apache.carbondata.core.datamap.status.DataMapSegmentStatusUtil;
+import org.apache.carbondata.core.datamap.status.DataMapStatusManager;
 import org.apache.carbondata.core.locks.ICarbonLock;
 import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
 import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
@@ -208,6 +209,8 @@ public abstract class DataMapProvider {
 "Not able to acquire the lock for Table status updation for table 
" + dataMapSchema
 .getRelationIdentifier().getDatabaseName() + "." + 
dataMapSchema
 .getRelationIdentifier().getTableName());
+DataMapStatusManager.disableDataMap(dataMapSchema.getDataMapName());
+return false;
   }
 } finally {
   if (carbonLock.unlock()) {



[carbondata] branch master updated: [CARBONDATA-3407]Fix distinct, count, Sum query failure when MV is created on single projection column

2019-06-05 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new b0d5a5c  [CARBONDATA-3407]Fix distinct, count, Sum query failure when 
MV is created on single projection column
b0d5a5c is described below

commit b0d5a5c792d3cde62da164c4c019beefe8cc2608
Author: akashrn5 
AuthorDate: Thu May 30 14:15:44 2019 +0530

[CARBONDATA-3407]Fix distinct, count, Sum query failure when MV is created 
on single projection column

Problem:
When an MV datamap is created on a single column as a simple projection,
sum, distinct and count queries fail during the SQL conversion of the
modular plan. Basically, there is no case to handle the modular plan when we
have a group-by node without alias info that has a select child node which
is rewritten.

Solution:
The SQL generation cases should handle this case as well; otherwise the
rewritten query will be wrong, as the alias will be present inside the count
or aggregate function.
So the rewritten query should actually be like:
SELECT count(limit_fail_dm1_table.limit_fail_designation) AS
count(designation) FROM default.limit_fail_dm1_table

This closes #3249
---
 .../carbondata/mv/datamap/MVAnalyzerRule.scala  |  2 +-
 .../carbondata/mv/rewrite/MVCreateTestCase.scala| 21 +
 .../carbondata/mv/plans/util/SQLBuildDSL.scala  |  2 +-
 .../carbondata/mv/plans/util/SQLBuilder.scala   |  6 +-
 4 files changed, 28 insertions(+), 3 deletions(-)

diff --git 
a/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVAnalyzerRule.scala
 
b/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVAnalyzerRule.scala
index 558a5bb..04bcfbb 100644
--- 
a/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVAnalyzerRule.scala
+++ 
b/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVAnalyzerRule.scala
@@ -79,7 +79,7 @@ class MVAnalyzerRule(sparkSession: SparkSession) extends 
Rule[LogicalPlan] {
   DataMapClassProvider.MV.getShortName).asInstanceOf[SummaryDatasetCatalog]
 if (needAnalysis && catalog != null && isValidPlan(plan, catalog)) {
   val modularPlan = 
catalog.mvSession.sessionState.rewritePlan(plan).withMVTable
-  if (modularPlan.find (_.rewritten).isDefined) {
+  if (modularPlan.find(_.rewritten).isDefined) {
 val compactSQL = modularPlan.asCompactSQL
 val analyzed = sparkSession.sql(compactSQL).queryExecution.analyzed
 analyzed
diff --git 
a/datamap/mv/core/src/test/scala/org/apache/carbondata/mv/rewrite/MVCreateTestCase.scala
 
b/datamap/mv/core/src/test/scala/org/apache/carbondata/mv/rewrite/MVCreateTestCase.scala
index 25d2542..e025623 100644
--- 
a/datamap/mv/core/src/test/scala/org/apache/carbondata/mv/rewrite/MVCreateTestCase.scala
+++ 
b/datamap/mv/core/src/test/scala/org/apache/carbondata/mv/rewrite/MVCreateTestCase.scala
@@ -1041,6 +1041,26 @@ class MVCreateTestCase extends QueryTest with 
BeforeAndAfterAll {
 assert(verifyMVDataMap(analyzed2, "mvlikedm2"))
   }
 
+  test("test distinct, count, sum on MV with single projection column") {
+sql("drop table if exists maintable")
+sql("create table maintable(name string, age int, add string) stored by 
'carbondata'")
+sql("create datamap single_mv using 'mv' as select age from maintable")
+sql("insert into maintable select 'pheobe',31,'NY'")
+sql("insert into maintable select 'rachel',32,'NY'")
+val df1 = sql("select distinct(age) from maintable")
+val df2 = sql("select sum(age) from maintable")
+val df3 = sql("select count(age) from maintable")
+val analyzed1 = df1.queryExecution.analyzed
+val analyzed2 = df2.queryExecution.analyzed
+val analyzed3 = df3.queryExecution.analyzed
+checkAnswer(df1, Seq(Row(31), Row(32)))
+checkAnswer(df2, Seq(Row(63)))
+checkAnswer(df3, Seq(Row(2)))
+assert(TestUtil.verifyMVDataMap(analyzed1, "single_mv"))
+assert(TestUtil.verifyMVDataMap(analyzed2, "single_mv"))
+assert(TestUtil.verifyMVDataMap(analyzed3, "single_mv"))
+  }
+
   def verifyMVDataMap(logicalPlan: LogicalPlan, dataMapName: String): Boolean 
= {
 val tables = logicalPlan collect {
   case l: LogicalRelation => l.catalogTable.get
@@ -1060,6 +1080,7 @@ class MVCreateTestCase extends QueryTest with 
BeforeAndAfterAll {
 sql("drop table IF EXISTS fact_table_parquet")
 sql("drop table if exists limit_fail")
 sql("drop table IF EXISTS mv_like")
+sql("drop table IF EXISTS maintable")
   }
 
   override def afterAll {
diff --git 
a/datamap/mv/plan/src/main/scala/org/apache/carbondata/mv/plans/util/SQLBuildDSL.scala
 
b

[carbondata] branch master updated: [CARBONDATA-3404] Support CarbonFile API through FileTypeInterface to use custom FileSystem

2019-06-05 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new 85f1b9f  [CARBONDATA-3404] Support CarbonFile API through 
FileTypeInterface to use custom FileSystem
85f1b9f is described below

commit 85f1b9ff4d459248af56002e1523bcb46bf366e4
Author: KanakaKumar 
AuthorDate: Wed May 29 12:39:06 2019 +0530

[CARBONDATA-3404] Support CarbonFile API through FileTypeInterface to use 
custom FileSystem

Currently CarbonData supports only a few file systems, such as the HDFS, S3
and VIEWFS schemes.
If a user configures a table path from a file system other than the
supported ones, FileFactory takes LocalCarbonFile as the default, which
causes errors.

This PR proposes to support an API for users to extend CarbonFile, which
overrides the required methods from AbstractDFSCarbonFile if specific
handling is required for operations like renameForce.

This closes #3246
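
A sketch of what a user-side extension could look like, under assumptions
about the interface shape (the trait below is a simplified stand-in for the
real FileTypeInterface/CarbonFile pair, and "myfs://" is a made-up scheme);
the provider class would be registered through the new
carbon.fs.custom.file.provider property added in this PR:

trait SimpleCarbonFile {
  def isFileExist(path: String): Boolean
  def renameForce(fromPath: String, toPath: String): Boolean
}

// Hypothetical provider for a custom "myfs://" scheme; a real one would
// delegate to the custom file system's client rather than java.io.
class MyFsFileProvider {
  def isSupported(path: String): Boolean = path.startsWith("myfs://")

  def getCarbonFile(path: String): SimpleCarbonFile = new SimpleCarbonFile {
    override def isFileExist(p: String): Boolean =
      new java.io.File(p.stripPrefix("myfs://")).exists()
    override def renameForce(fromPath: String, toPath: String): Boolean =
      new java.io.File(fromPath.stripPrefix("myfs://"))
        .renameTo(new java.io.File(toPath.stripPrefix("myfs://")))
  }
}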
---
 .../core/constants/CarbonCommonConstants.java  |  5 ++
 .../filesystem/AbstractDFSCarbonFile.java  |  6 +-
 .../core/datastore/filesystem/CarbonFile.java  |  4 +-
 .../core/datastore/filesystem/LocalCarbonFile.java | 10 +--
 .../datastore/impl/DefaultFileTypeProvider.java| 84 +++---
 .../core/datastore/impl/FileFactory.java   | 82 +
 .../core/datastore/impl/FileTypeInterface.java | 23 --
 .../carbondata/core/locks/CarbonLockFactory.java   | 11 ++-
 .../core/metadata/schema/SchemaReader.java |  5 +-
 .../apache/carbondata/core/util/CarbonUtil.java| 19 ++---
 .../store/impl/FileFactoryImplUnitTest.java| 55 --
 .../filesystem/store/impl/TestFileProvider.java| 59 +++
 .../dblocation/DBLocationCarbonTableTestCase.scala |  4 +-
 13 files changed, 282 insertions(+), 85 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
index 8b39343..1201e1a 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
@@ -1601,6 +1601,11 @@ public final class CarbonCommonConstants {
   public static final String S3_SECRET_KEY = "fs.s3.awsSecretAccessKey";
 
   /**
+   * Configuration Key for custom file provider
+   */
+  public static final String CUSTOM_FILE_PROVIDER = 
"carbon.fs.custom.file.provider";
+
+  /**
* FS_DEFAULT_FS
*/
   @CarbonProperty
diff --git 
a/core/src/main/java/org/apache/carbondata/core/datastore/filesystem/AbstractDFSCarbonFile.java
 
b/core/src/main/java/org/apache/carbondata/core/datastore/filesystem/AbstractDFSCarbonFile.java
index a90648e..1470c05 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/datastore/filesystem/AbstractDFSCarbonFile.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/datastore/filesystem/AbstractDFSCarbonFile.java
@@ -404,8 +404,8 @@ public abstract class AbstractDFSCarbonFile implements 
CarbonFile {
 return new DataOutputStream(new BufferedOutputStream(outputStream));
   }
 
-  @Override public boolean isFileExist(String filePath, FileFactory.FileType 
fileType,
-  boolean performFileCheck) throws IOException {
+  @Override public boolean isFileExist(String filePath, boolean 
performFileCheck)
+  throws IOException {
 filePath = filePath.replace("\\", "/");
 Path path = new Path(filePath);
 FileSystem fs = path.getFileSystem(FileFactory.getConfiguration());
@@ -416,7 +416,7 @@ public abstract class AbstractDFSCarbonFile implements 
CarbonFile {
 }
   }
 
-  @Override public boolean isFileExist(String filePath, FileFactory.FileType 
fileType)
+  @Override public boolean isFileExist(String filePath)
   throws IOException {
 filePath = filePath.replace("\\", "/");
 Path path = new Path(filePath);
diff --git 
a/core/src/main/java/org/apache/carbondata/core/datastore/filesystem/CarbonFile.java
 
b/core/src/main/java/org/apache/carbondata/core/datastore/filesystem/CarbonFile.java
index be08338..c3c5be5 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/datastore/filesystem/CarbonFile.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/datastore/filesystem/CarbonFile.java
@@ -139,10 +139,10 @@ public interface CarbonFile {
   DataOutputStream getDataOutputStream(String path, FileFactory.FileType 
fileType, int bufferSize,
   String compressor) throws IOException;
 
-  boolean isFileExist(String filePath, FileFactory.FileType fileType, boolean 
performFileCheck)
+  boolean isFileExist(String filePath, boolean performFileCheck)
   throws IOException;
 
-  boolean isFileExist(String filePath, Fi
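
For context on the new CUSTOM_FILE_PROVIDER constant above, a minimal sketch (e.g. in spark-shell) of how a deployment might register a provider. "com.example.MyFileProvider" is a hypothetical class assumed to implement the reworked FileTypeInterface from this change:

    import org.apache.carbondata.core.constants.CarbonCommonConstants
    import org.apache.carbondata.core.util.CarbonProperties

    // Register the provider class name under "carbon.fs.custom.file.provider";
    // the class itself (hypothetical here) would implement FileTypeInterface.
    CarbonProperties.getInstance().addProperty(
      CarbonCommonConstants.CUSTOM_FILE_PROVIDER,
      "com.example.MyFileProvider")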

[carbondata] branch master updated: [CARBONDATA-3350] Enhance custom compaction to resort old single segment by new sort_columns

2019-06-05 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new 6fa7fb4  [CARBONDATA-3350] Enhance custom compaction to resort old 
single segment by new sort_columns
6fa7fb4 is described below

commit 6fa7fb4f94ca3082113d0b47b109bdd16cf046a3
Author: QiangCai 
AuthorDate: Wed May 15 16:46:20 2019 +0800

[CARBONDATA-3350] Enhance custom compaction to resort old single segment by 
new sort_columns

This closes #3202
---
 .../blockletindex/BlockletDataMapFactory.java  |   2 +-
 .../TableStatusReadCommittedScope.java |   2 +-
 .../spark/rdd/CarbonTableCompactor.scala   |  21 +++-
 .../processing/merger/CarbonCompactionUtil.java| 132 +
 4 files changed, 128 insertions(+), 29 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
 
b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
index 446507f..cab1b8b 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
@@ -167,7 +167,7 @@ public class BlockletDataMapFactory extends 
CoarseGrainDataMapFactory
 return dataMaps;
   }
 
-  private Set 
getTableBlockIndexUniqueIdentifiers(Segment segment)
+  public Set 
getTableBlockIndexUniqueIdentifiers(Segment segment)
   throws IOException {
 Set tableBlockIndexUniqueIdentifiers =
 segmentMap.get(segment.getSegmentNo());
diff --git 
a/core/src/main/java/org/apache/carbondata/core/readcommitter/TableStatusReadCommittedScope.java
 
b/core/src/main/java/org/apache/carbondata/core/readcommitter/TableStatusReadCommittedScope.java
index 5622efe..e4fd6f4 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/readcommitter/TableStatusReadCommittedScope.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/readcommitter/TableStatusReadCommittedScope.java
@@ -55,7 +55,7 @@ public class TableStatusReadCommittedScope implements 
ReadCommittedScope {
   }
 
   public TableStatusReadCommittedScope(AbsoluteTableIdentifier identifier,
-  LoadMetadataDetails[] loadMetadataDetails, Configuration configuration) 
throws IOException {
+  LoadMetadataDetails[] loadMetadataDetails, Configuration configuration) {
 this.identifier = identifier;
 this.configuration = configuration;
 this.loadMetadataDetails = loadMetadataDetails;
diff --git 
a/integration/spark2/src/main/scala/org/apache/carbondata/spark/rdd/CarbonTableCompactor.scala
 
b/integration/spark2/src/main/scala/org/apache/carbondata/spark/rdd/CarbonTableCompactor.scala
index afe2927..4c7dd95 100644
--- 
a/integration/spark2/src/main/scala/org/apache/carbondata/spark/rdd/CarbonTableCompactor.scala
+++ 
b/integration/spark2/src/main/scala/org/apache/carbondata/spark/rdd/CarbonTableCompactor.scala
@@ -29,13 +29,15 @@ import 
org.apache.spark.sql.execution.command.{CarbonMergerMapping, CompactionCa
 import org.apache.spark.util.MergeIndexUtil
 
 import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.constants.SortScopeOptions.SortScope
 import org.apache.carbondata.core.datamap.{DataMapStoreManager, Segment}
+import org.apache.carbondata.core.datastore.impl.FileFactory
 import org.apache.carbondata.core.metadata.SegmentFileStore
 import org.apache.carbondata.core.statusmanager.{LoadMetadataDetails, 
SegmentStatusManager}
 import org.apache.carbondata.core.util.path.CarbonTablePath
 import org.apache.carbondata.events._
 import org.apache.carbondata.processing.loading.model.CarbonLoadModel
-import org.apache.carbondata.processing.merger.{CarbonDataMergerUtil, 
CompactionType}
+import org.apache.carbondata.processing.merger.{CarbonCompactionUtil, 
CarbonDataMergerUtil, CompactionType}
 import org.apache.carbondata.spark.MergeResultImpl
 
 /**
@@ -50,6 +52,21 @@ class CarbonTableCompactor(carbonLoadModel: CarbonLoadModel,
 operationContext: OperationContext)
   extends Compactor(carbonLoadModel, compactionModel, executor, sqlContext, 
storeLocation) {
 
+  private def needSortSingleSegment(
+  loadsToMerge: java.util.List[LoadMetadataDetails]): Boolean = {
+// support to resort old segment with old sort_columns
+if (CompactionType.CUSTOM == compactionModel.compactionType &&
+loadsToMerge.size() == 1 &&
+SortScope.NO_SORT != compactionModel.carbonTable.getSortScope) {
+  !CarbonCompactionUtil.isSortedByCurrentSortColumns(
+carbonLoadModel.getCarbonDataLoadSchema.getCarbonTable,
+loadsToMerge.get(0),
+FileFactory.getConfiguration)
+} else {
+  false
+}
+  }
+
   override def executeCo

[carbondata] branch master updated: [CARBONDATA-3403]Fix MV is not working for like and filter AND and OR queries

2019-05-31 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new 32f5b50  [CARBONDATA-3403]Fix MV is not working for like and filter 
AND and OR queries
32f5b50 is described below

commit 32f5b505509731ea1f7ff0fde2c7e25aea4925b4
Author: akashrn5 
AuthorDate: Tue May 28 11:55:13 2019 +0530

[CARBONDATA-3403]Fix MV is not working for like and filter AND and OR 
queries

Problem:
The MV table is not hit during query for LIKE and filter AND/OR queries.
When we have LIKE or filter queries, the queries contain literals, which are 
case sensitive when fetching the data.
But during MV modular plan generation, we register the schema for the datamap 
and convert the complete datamap query to lower case, which converts even the 
literals.
So after modular plan generation of the user query, during the matching phase 
of the modular plans of the datamap and the user query, the semantic equals 
fails for the literals, that is, for the attribute reference type.
Solution: Do not convert the query to lower case when registering the schema, 
that is, when adding the preagg function to the query. This handles the MV case.
For preaggregate, instead of converting the complete query to lowercase, convert 
to lower case during ColumnTableRelation generation and createField for 
preaggregate generation, so it is handled for preaggregate.

This closes #3242
---
 .../carbondata/mv/rewrite/MVCreateTestCase.scala | 20 
 .../timeseries/TestTimeSeriesCreateTable.scala   |  2 +-
 .../command/preaaggregate/PreAggregateUtil.scala |  5 +++--
 .../spark/sql/parser/CarbonSpark2SqlParser.scala |  4 ++--
 4 files changed, 26 insertions(+), 5 deletions(-)

diff --git 
a/datamap/mv/core/src/test/scala/org/apache/carbondata/mv/rewrite/MVCreateTestCase.scala
 
b/datamap/mv/core/src/test/scala/org/apache/carbondata/mv/rewrite/MVCreateTestCase.scala
index 5e12ad3..25d2542 100644
--- 
a/datamap/mv/core/src/test/scala/org/apache/carbondata/mv/rewrite/MVCreateTestCase.scala
+++ 
b/datamap/mv/core/src/test/scala/org/apache/carbondata/mv/rewrite/MVCreateTestCase.scala
@@ -1022,6 +1022,25 @@ class MVCreateTestCase extends QueryTest with 
BeforeAndAfterAll {
 sql("drop table if exists all_table")
   }
 
+  test(" test MV with like queries and filter queries") {
+sql("drop table if exists mv_like")
+sql(
+  "create table mv_like(name string, age int, address string, Country 
string, id int) stored by 'carbondata'")
+sql(
+  "create datamap mvlikedm1 using 'mv' as select name,address from mv_like 
where Country NOT LIKE 'US' group by name,address")
+sql(
+  "create datamap mvlikedm2 using 'mv' as select name,address,Country from 
mv_like where Country = 'US' or Country = 'China' group by 
name,address,Country")
+sql("insert into mv_like select 'chandler', 32, 'newYork', 'US', 5")
+val df1 = sql(
+  "select name,address from mv_like where Country NOT LIKE 'US' group by 
name,address")
+val analyzed1 = df1.queryExecution.analyzed
+assert(verifyMVDataMap(analyzed1, "mvlikedm1"))
+val df2 = sql(
+  "select name,address,Country from mv_like where Country = 'US' or 
Country = 'China' group by name,address,Country")
+val analyzed2 = df2.queryExecution.analyzed
+assert(verifyMVDataMap(analyzed2, "mvlikedm2"))
+  }
+
   def verifyMVDataMap(logicalPlan: LogicalPlan, dataMapName: String): Boolean 
= {
 val tables = logicalPlan collect {
   case l: LogicalRelation => l.catalogTable.get
@@ -1040,6 +1059,7 @@ class MVCreateTestCase extends QueryTest with 
BeforeAndAfterAll {
 sql("drop table IF EXISTS fact_streaming_table2")
 sql("drop table IF EXISTS fact_table_parquet")
 sql("drop table if exists limit_fail")
+sql("drop table IF EXISTS mv_like")
   }
 
   override def afterAll {
diff --git 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/timeseries/TestTimeSeriesCreateTable.scala
 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/timeseries/TestTimeSeriesCreateTable.scala
index d68195c..eabe0f5 100644
--- 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/timeseries/TestTimeSeriesCreateTable.scala
+++ 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/timeseries/TestTimeSeriesCreateTable.scala
@@ -517,7 +517,7 @@ class TestTimeSeriesCreateTable extends QueryTest with 
BeforeAndAfterAll with Be
|GROUP BY dataTime
 """.stripMargin)
 }
-assert(e.getMessage.contains("
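
To illustrate the root cause of this fix, a minimal sketch (assuming a spark session and the mv_like table from the test above): literals in filters are case sensitive, so a plan whose literals were lower-cased cannot be semantically equal to the user's plan.

    // 'US' and 'us' filter different rows, so semanticEquals fails between the
    // lower-cased datamap plan and the user plan before this fix.
    val original = spark.sql(
      "select name,address from mv_like where Country NOT LIKE 'US' group by name,address")
    val lowered = spark.sql(
      "select name,address from mv_like where country not like 'us' group by name,address")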

[carbondata] branch master updated: [CARBONDATA-3399] Implement executor id based distribution for indexserver

2019-05-31 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new fa3e392  [CARBONDATA-3399] Implement executor id based distribution 
for indexserver
fa3e392 is described below

commit fa3e392c17ff1867baa6ac1ae918346e76ac1add
Author: kunal642 
AuthorDate: Mon May 27 12:41:54 2019 +0530

[CARBONDATA-3399] Implement executor id based distribution for indexserver

This closes #3237
---
 .../apache/spark/sql/hive/DistributionUtil.scala   |   8 +
 .../indexserver/DistributedPruneRDD.scala  |   9 +-
 .../indexserver/DistributedRDDUtils.scala  | 218 -
 .../indexserver/DistributedRDDUtilsTest.scala  | 115 +++
 4 files changed, 300 insertions(+), 50 deletions(-)

diff --git 
a/integration/spark-common/src/main/scala/org/apache/spark/sql/hive/DistributionUtil.scala
 
b/integration/spark-common/src/main/scala/org/apache/spark/sql/hive/DistributionUtil.scala
index 0861d2b..4256777 100644
--- 
a/integration/spark-common/src/main/scala/org/apache/spark/sql/hive/DistributionUtil.scala
+++ 
b/integration/spark-common/src/main/scala/org/apache/spark/sql/hive/DistributionUtil.scala
@@ -89,6 +89,14 @@ object DistributionUtil {
 }
   }
 
+  def getExecutors(sparkContext: SparkContext): Map[String, Seq[String]] = {
+val bm = sparkContext.env.blockManager
+bm.master.getPeers(bm.blockManagerId)
+  .groupBy(blockManagerId => blockManagerId.host).map {
+  case (host, blockManagerIds) => (host, blockManagerIds.map(_.executorId))
+}
+  }
+
   private def getLocalhostIPs = {
 val iface = NetworkInterface.getNetworkInterfaces
 var addresses: List[InterfaceAddress] = List.empty
diff --git 
a/integration/spark2/src/main/scala/org/apache/carbondata/indexserver/DistributedPruneRDD.scala
 
b/integration/spark2/src/main/scala/org/apache/carbondata/indexserver/DistributedPruneRDD.scala
index d2dab2d..607f923 100644
--- 
a/integration/spark2/src/main/scala/org/apache/carbondata/indexserver/DistributedPruneRDD.scala
+++ 
b/integration/spark2/src/main/scala/org/apache/carbondata/indexserver/DistributedPruneRDD.scala
@@ -38,7 +38,7 @@ import org.apache.carbondata.core.util.CarbonProperties
 import org.apache.carbondata.spark.rdd.CarbonRDD
 import org.apache.carbondata.spark.util.CarbonScalaUtil
 
-private[indexserver] class DataMapRDDPartition(rddId: Int, idx: Int, val 
inputSplit: InputSplit)
+class DataMapRDDPartition(rddId: Int, idx: Int, val inputSplit: InputSplit)
   extends Partition {
 
   override def index: Int = idx
@@ -50,8 +50,6 @@ private[indexserver] class DistributedPruneRDD(@transient 
private val ss: SparkS
 dataMapFormat: DistributableDataMapFormat)
   extends CarbonRDD[(String, ExtendedBlocklet)](ss, Nil) {
 
-  val executorsList: Set[String] = 
DistributionUtil.getNodeList(ss.sparkContext).toSet
-
   @transient private val LOGGER = 
LogServiceFactory.getLogService(classOf[DistributedPruneRDD]
 .getName)
 
@@ -106,7 +104,8 @@ private[indexserver] class DistributedPruneRDD(@transient 
private val ss: SparkS
   throw new java.util.NoSuchElementException("End of stream")
 }
 havePair = false
-val executorIP = SparkEnv.get.blockManager.blockManagerId.host
+val executorIP = s"${ SparkEnv.get.blockManager.blockManagerId.host 
}_${
+  SparkEnv.get.blockManager.blockManagerId.executorId}"
 val value = (executorIP + "_" + cacheSize.toString, 
reader.getCurrentValue)
 value
   }
@@ -125,6 +124,8 @@ private[indexserver] class DistributedPruneRDD(@transient 
private val ss: SparkS
 f => new DataMapRDDPartition(id, f._2, f._1)
   }.toArray
 } else {
+  val executorsList: Map[String, Seq[String]] = DistributionUtil
+.getExecutors(ss.sparkContext)
   val (response, time) = CarbonScalaUtil.logTime {
 DistributedRDDUtils.getExecutors(splits.toArray, executorsList, 
dataMapFormat
   .getCarbonTable.getTableUniqueName, id)
diff --git 
a/integration/spark2/src/main/scala/org/apache/carbondata/indexserver/DistributedRDDUtils.scala
 
b/integration/spark2/src/main/scala/org/apache/carbondata/indexserver/DistributedRDDUtils.scala
index c381f80..c7632be 100644
--- 
a/integration/spark2/src/main/scala/org/apache/carbondata/indexserver/DistributedRDDUtils.scala
+++ 
b/integration/spark2/src/main/scala/org/apache/carbondata/indexserver/DistributedRDDUtils.scala
@@ -20,7 +20,6 @@ import java.util.concurrent.ConcurrentHashMap
 
 import scala.collection.JavaConverters._
 
-import org.apache.commons.lang.StringUtils
 import org.apache.hadoop.mapreduce.InputSplit
 import org.apache.spark.Partition
 
@@ -29,14 +28,14 @@ import 
org.apache.carbondata.core.datamap.dev.expr.DataMapDistributableWrapper
 
 object Dis

[carbondata] annotated tag apache-carbondata-1.5.4 created (now 5a55d9b)

2019-05-29 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a change to annotated tag apache-carbondata-1.5.4
in repository https://gitbox.apache.org/repos/asf/carbondata.git.


  at 5a55d9b  (tag)
 tagging 1f2e184b81bef4e861b4dd32be94dc50bada6b68 (commit)
 replaces apache-carbondata-1.5.3-rc1
  by ravipesala
  on Fri May 17 14:27:20 2019 +0530

- Log -
[maven-release-plugin] copy for tag apache-carbondata-1.5.4-rc1
---

No new revisions were added by this update.



[carbondata] annotated tag apache-carbondata-1.5.4-rc1 deleted (was 5a55d9b)

2019-05-29 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a change to annotated tag apache-carbondata-1.5.4-rc1
in repository https://gitbox.apache.org/repos/asf/carbondata.git.


*** WARNING: tag apache-carbondata-1.5.4-rc1 was deleted! ***

   tag was  5a55d9b

The revisions that were on this annotated tag are still contained in
other references; therefore, this change does not discard any commits
from the repository.



svn commit: r34308 - /release/carbondata/1.5.4/

2019-05-29 Thread ravipesala
Author: ravipesala
Date: Wed May 29 11:27:17 2019
New Revision: 34308

Log:
Checkin 1.5.4

Added:
release/carbondata/1.5.4/

release/carbondata/1.5.4/apache-carbondata-1.5.4-bin-spark2.1.0-hadoop2.8.3.jar 
  (with props)

release/carbondata/1.5.4/apache-carbondata-1.5.4-bin-spark2.1.0-hadoop2.8.3.jar.asc

release/carbondata/1.5.4/apache-carbondata-1.5.4-bin-spark2.1.0-hadoop2.8.3.jar.sha512

release/carbondata/1.5.4/apache-carbondata-1.5.4-bin-spark2.2.1-hadoop2.8.3.jar 
  (with props)

release/carbondata/1.5.4/apache-carbondata-1.5.4-bin-spark2.2.1-hadoop2.8.3.jar.asc

release/carbondata/1.5.4/apache-carbondata-1.5.4-bin-spark2.2.1-hadoop2.8.3.jar.sha512

release/carbondata/1.5.4/apache-carbondata-1.5.4-bin-spark2.3.2-hadoop2.8.3.jar 
  (with props)

release/carbondata/1.5.4/apache-carbondata-1.5.4-bin-spark2.3.2-hadoop2.8.3.jar.asc

release/carbondata/1.5.4/apache-carbondata-1.5.4-bin-spark2.3.2-hadoop2.8.3.jar.sha512
release/carbondata/1.5.4/apache-carbondata-1.5.4-source-release.zip   (with 
props)
release/carbondata/1.5.4/apache-carbondata-1.5.4-source-release.zip.asc   
(with props)
release/carbondata/1.5.4/apache-carbondata-1.5.4-source-release.zip.sha512

Added: 
release/carbondata/1.5.4/apache-carbondata-1.5.4-bin-spark2.1.0-hadoop2.8.3.jar
==
Binary file - no diff available.

Propchange: 
release/carbondata/1.5.4/apache-carbondata-1.5.4-bin-spark2.1.0-hadoop2.8.3.jar
--
svn:mime-type = application/octet-stream

Added: 
release/carbondata/1.5.4/apache-carbondata-1.5.4-bin-spark2.1.0-hadoop2.8.3.jar.asc
==
--- 
release/carbondata/1.5.4/apache-carbondata-1.5.4-bin-spark2.1.0-hadoop2.8.3.jar.asc
 (added)
+++ 
release/carbondata/1.5.4/apache-carbondata-1.5.4-bin-spark2.1.0-hadoop2.8.3.jar.asc
 Wed May 29 11:27:17 2019
@@ -0,0 +1,16 @@
+-BEGIN PGP SIGNATURE-
+
+iQIzBAEBCgAdFiEER3EpqJTxH7zLwCVHutcqeKexsu4FAlzuasgACgkQutcqeKex
+su6Hyw//X2dc1cvZRrldHyobNdhqjAqsJHq6rt+YecN4bTQm90gwPKoxClfefDT9
+N0aPmhAqjFFbd+yI8R51aGQyPqyFiew2y/2xsjRUDn8TILUcd040NGfk9HGTep7B
+/1KE6REsxQGGfM1a0tY0tn3yMwciKLUpXimwgi8LNmhcXiqjXQJJa0YSnDyM/U+O
+gUp5Ne0skI7Q5M8hbsknXcqVmbiWIqquncqUM3qNE84VPcgt04bo6IJWJ35A0vmd
+a1zwqtYD7MvP8E2pTuv4F/47u2XqxO/ho+G8qtj6MKp2n/jytw73qkH/N93HIJXc
+0KlFVWPuggxgRp1tTY1p0D68hx2L5aIkJOFISQlAMycslaFIq0YcoF4prtc/CrdN
+JFDjJ7UFUdaOUmmE9n7R+XilD/usjiC2wxiIl2SELFoO2Gf4fnhbzs4Qdm19LLkP
+8ws6tJ3fkXPaRJKC9Vbl0q86UF478/GMHPZN9f7m0P2ulY+GYMxLXTyL/SF2rTt4
+b45S2UEpDwIPPzww4Hq2wvOOZi9eiLGT4+YMfHXGfthI3tCBQMQOP5ccO9SB3PY2
+Ea1/x5WtpxDMVax/zE3/5ZpjInhKXgRo9a0eRW70I0WST+ObxEuXsRoTS5fLPZuZ
+fpHpNQlxBct8EMuOz/DnrPva7HhABOm9VCm4zBlW4Zd9XgaCK5k=
+=OJKq
+-END PGP SIGNATURE-

Added: 
release/carbondata/1.5.4/apache-carbondata-1.5.4-bin-spark2.1.0-hadoop2.8.3.jar.sha512
==
--- 
release/carbondata/1.5.4/apache-carbondata-1.5.4-bin-spark2.1.0-hadoop2.8.3.jar.sha512
 (added)
+++ 
release/carbondata/1.5.4/apache-carbondata-1.5.4-bin-spark2.1.0-hadoop2.8.3.jar.sha512
 Wed May 29 11:27:17 2019
@@ -0,0 +1 @@
+c107e1d21aaaf2d50c8ef765dfdd99ff62e93cdad942c3598f4d110712ae931dfeab7f0de22090eb6afc8bf6f25af7d174456b7c9749201e9c1c83afd38fe90d
  apache-carbondata-1.5.4-bin-spark2.1.0-hadoop2.8.3.jar

Added: 
release/carbondata/1.5.4/apache-carbondata-1.5.4-bin-spark2.2.1-hadoop2.8.3.jar
==
Binary file - no diff available.

Propchange: 
release/carbondata/1.5.4/apache-carbondata-1.5.4-bin-spark2.2.1-hadoop2.8.3.jar
--
svn:mime-type = application/octet-stream

Added: 
release/carbondata/1.5.4/apache-carbondata-1.5.4-bin-spark2.2.1-hadoop2.8.3.jar.asc
==
--- 
release/carbondata/1.5.4/apache-carbondata-1.5.4-bin-spark2.2.1-hadoop2.8.3.jar.asc
 (added)
+++ 
release/carbondata/1.5.4/apache-carbondata-1.5.4-bin-spark2.2.1-hadoop2.8.3.jar.asc
 Wed May 29 11:27:17 2019
@@ -0,0 +1,16 @@
+-BEGIN PGP SIGNATURE-
+
+iQIzBAEBCgAdFiEER3EpqJTxH7zLwCVHutcqeKexsu4FAlzuas8ACgkQutcqeKex
+su6nnRAArLDfJCuNiXcrzzJSNxfoNx2WrgBFYsYs+UsQlvmY01/604TvU3jXBwUL
+9eg0OK0iDO3Qp7oD0XE/Pzq4ZR8pOwYBgPucEXm9UYBp43cdIAUa+MUZhsYowMJn
+hyY99cvT6krS0N+Y6VQAdiC4QWODlUPtD/blqkuEQehHRtUOJvpuQXleg2aBtEVB
+rB5E9zLsJ1bSepXyMRXM86BYEUHrN/E037OGMdLrjt+mRK2kc0wwtAKsrD8qfckN
+TAFr9vYZbv3EgA+5p+8dGKIbYJrTYJyFxHoDtm9/COwcaGML+6y2aJPZwmxL7B4G
+V+xmhKnQ2JandQYu80Gdy94QVKhJ3juG2K6Q+RJza6ZKUhmsdVpZhWewPKrxPlS9
+/0kSE9cxyEYm5ERhlN95xNK/B37LvGZNuSVfFD1IywHR/8CnJj8QJTutFMfpwxM6
+jhNthmGLgpux+ut4wdMruwfV3/rodNYtKjPfJBDMW/z21LG1ory16b+HZfIZBAL1
+DzGsWX8DeN+ZiZpX9+pjETEtUeAyuP2fmrZwWBRwf8gGh9mTe

[carbondata] branch master updated: [CARBONDATA-3402] Fix block complex data type and validate dmproperties for MV

2019-05-29 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new 1023ba9  [CARBONDATA-3402] Fix block complex data type and validate 
dmproperties for MV
1023ba9 is described below

commit 1023ba951cc623b8f312e66fa288744705a928de
Author: Indhumathi27 
AuthorDate: Mon May 27 18:44:33 2019 +0530

[CARBONDATA-3402] Fix block complex data type and validate dmproperties for 
MV

This PR includes:

Block complex data types with MV
Fix the to_date function while creating an MV datamap
Inherit the global dictionary from the parent table to the child table for 
preaggregate & MV
Validate DMProperties for MV

This closes #3241
---
 .../apache/carbondata/mv/datamap/MVHelper.scala|  25 +++-
 .../org/apache/carbondata/mv/datamap/MVUtil.scala  |  34 +++--
 .../carbondata/mv/rewrite/MVCoalesceTestCase.scala |  16 +--
 .../mv/rewrite/MVCountAndCaseTestCase.scala|   9 +-
 .../carbondata/mv/rewrite/MVCreateTestCase.scala   | 137 ++---
 .../mv/rewrite/MVIncrementalLoadingTestcase.scala  |  37 +++---
 .../mv/rewrite/MVMultiJoinTestCase.scala   |  11 +-
 .../carbondata/mv/rewrite/MVRewriteTestCase.scala  |   9 +-
 .../carbondata/mv/rewrite/MVSampleTestCase.scala   |  25 ++--
 .../carbondata/mv/rewrite/MVTPCDSTestCase.scala|  28 ++---
 .../carbondata/mv/rewrite/MVTpchTestCase.scala |  35 +++---
 .../mv/rewrite/TestAllOperationsOnMV.scala |  61 +
 .../mv/rewrite/TestPartitionWithMV.scala   |  11 +-
 .../preaggregate/TestPreAggCreateCommand.scala |   8 +-
 .../spark/sql/catalyst/CarbonDDLSqlParser.scala|   3 +-
 .../command/management/CarbonLoadDataCommand.scala |  12 +-
 .../scala/org/apache/spark/util/DataMapUtil.scala  |  58 ++---
 17 files changed, 282 insertions(+), 237 deletions(-)

diff --git 
a/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala
 
b/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala
index 8d60a06..57082d7 100644
--- 
a/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala
+++ 
b/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala
@@ -32,6 +32,7 @@ import org.apache.spark.sql.execution.command.{Field, 
PartitionerField, TableMod
 import org.apache.spark.sql.execution.command.table.{CarbonCreateTableCommand, 
CarbonDropTableCommand}
 import org.apache.spark.sql.execution.datasources.LogicalRelation
 import org.apache.spark.sql.parser.CarbonSpark2SqlParser
+import org.apache.spark.sql.types.{ArrayType, MapType, StructType}
 import org.apache.spark.util.{DataMapUtil, PartitionUtils}
 
 import 
org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException
@@ -60,6 +61,7 @@ object MVHelper {
 s"MV datamap does not support streaming"
   )
 }
+MVUtil.validateDMProperty(dmProperties)
 val updatedQuery = new 
CarbonSpark2SqlParser().addPreAggFunction(queryString)
 val query = sparkSession.sql(updatedQuery)
 val logicalPlan = MVHelper.dropDummFuc(query.queryExecution.analyzed)
@@ -71,6 +73,11 @@ object MVHelper {
 val updatedQueryWithDb = validateMVQuery(sparkSession, logicalPlan)
 val fullRebuild = isFullReload(logicalPlan)
 val fields = logicalPlan.output.map { attr =>
+  if (attr.dataType.isInstanceOf[ArrayType] || 
attr.dataType.isInstanceOf[StructType] ||
+  attr.dataType.isInstanceOf[MapType]) {
+throw new UnsupportedOperationException(
+  s"MV datamap is unsupported for ComplexData type column: " + 
attr.name)
+  }
   val name = updateColumnName(attr)
   val rawSchema = '`' + name + '`' + ' ' + attr.dataType.typeName
   if (attr.dataType.typeName.startsWith("decimal")) {
@@ -312,13 +319,19 @@ object MVHelper {
 modularPlan.asCompactSQL
   }
 
+  def getUpdatedName(name: String): String = {
+val updatedName = name.replace("(", "_")
+  .replace(")", "")
+  .replace(" ", "_")
+  .replace("=", "")
+  .replace(",", "")
+  .replace(".", "_")
+  .replace("`", "")
+updatedName
+  }
+
   def updateColumnName(attr: Attribute): String = {
-val name =
-  attr.name.replace("(", "_")
-.replace(")", "")
-.replace(" ", "_")
-.replace("=", "")
-.replace(",", "")
+val name = getUpdatedName(attr.name)
 attr.qualifier.map(qualifier => qualifier + "_" + name).getOrElse(name)
   }
 
diff --git 
a/datamap/mv/core/src/main/scala/org/apache/carbondat

[carbondata] branch master updated: [CARBONDATA-3393] Merge Index Job Failure should not trigger the merge index job again. Exception should be propagated to the caller.

2019-05-29 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new 706e8d3  [CARBONDATA-3393] Merge Index Job Failure should not trigger 
the merge index job again. Exception should be propagated to the caller.
706e8d3 is described below

commit 706e8d34c40da97e0d123f58eac3f6da3953f4d0
Author: dhatchayani 
AuthorDate: Tue May 28 19:29:46 2019 +0530

[CARBONDATA-3393] Merge Index Job Failure should not trigger the merge 
index job again. Exception should be propagated to the caller.

Problem:
If the merge index job fails, the same job is triggered again.

Solution:
The merge index job exception has to be propagated to the caller. It should not 
trigger the same job again.

Changes:
(1) Merge index job failure is either propagated to the caller or only logged, 
controlled by the new carbon.merge.index.failure.throw.exception property 
(default: throw).
(2) Implement a new method to write the SegmentFile based on the current 
load timestamp. This helps in case of merge index failures and when writing 
the merge index for an old store.

This closes #3226
---
 .../core/constants/CarbonCommonConstants.java  | 12 +++
 .../carbondata/core/metadata/SegmentFileStore.java | 21 +++
 .../org/apache/spark/rdd/CarbonMergeFilesRDD.scala | 41 +++---
 3 files changed, 62 insertions(+), 12 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
index aa9dd05..311019c 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
@@ -346,6 +346,18 @@ public final class CarbonCommonConstants {
   public static final String CARBON_MERGE_INDEX_IN_SEGMENT_DEFAULT = "true";
 
   /**
+   * It is the user defined property to specify whether to throw exception or 
not in case
+   * if the MERGE INDEX JOB is failed. Default value - TRUE
+   * TRUE - throws exception and fails the corresponding LOAD job
+   * FALSE - Logs the exception and continue with the LOAD
+   */
+  @CarbonProperty
+  public static final String CARBON_MERGE_INDEX_FAILURE_THROW_EXCEPTION =
+  "carbon.merge.index.failure.throw.exception";
+
+  public static final String 
CARBON_MERGE_INDEX_FAILURE_THROW_EXCEPTION_DEFAULT = "true";
+
+  /**
* property to be used for specifying the max byte limit for string/varchar 
data type till
* where storing min/max in data file will be considered
*/
diff --git 
a/core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java 
b/core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java
index 69e5dc3..cbf58c7 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java
@@ -139,12 +139,32 @@ public class SegmentFileStore {
*/
   public static String writeSegmentFile(CarbonTable carbonTable, String 
segmentId, String UUID)
   throws IOException {
+return writeSegmentFile(carbonTable, segmentId, UUID, null);
+  }
+
+  /**
+   * Write segment file to the metadata folder of the table selecting only the 
current load files
+   *
+   * @param carbonTable
+   * @param segmentId
+   * @param UUID
+   * @param currentLoadTimeStamp
+   * @return
+   * @throws IOException
+   */
+  public static String writeSegmentFile(CarbonTable carbonTable, String 
segmentId, String UUID,
+  final String currentLoadTimeStamp) throws IOException {
 String tablePath = carbonTable.getTablePath();
 boolean supportFlatFolder = carbonTable.isSupportFlatFolder();
 String segmentPath = CarbonTablePath.getSegmentPath(tablePath, segmentId);
 CarbonFile segmentFolder = FileFactory.getCarbonFile(segmentPath);
 CarbonFile[] indexFiles = segmentFolder.listFiles(new CarbonFileFilter() {
   @Override public boolean accept(CarbonFile file) {
+if (null != currentLoadTimeStamp) {
+  return file.getName().contains(currentLoadTimeStamp) && (
+  file.getName().endsWith(CarbonTablePath.INDEX_FILE_EXT) || 
file.getName()
+  .endsWith(CarbonTablePath.MERGE_INDEX_FILE_EXT));
+}
 return (file.getName().endsWith(CarbonTablePath.INDEX_FILE_EXT) || 
file.getName()
 .endsWith(CarbonTablePath.MERGE_INDEX_FILE_EXT));
   }
@@ -185,6 +205,7 @@ public class SegmentFileStore {
 return null;
   }
 
+
   /**
* Move the loaded data from source folder to destination folder.
*/
diff --git 
a/integration/spark-common/src/main/scala/org/apache/spark/rdd/CarbonMergeFilesRDD.scala
 
b/integration/spark-common/src/main/scala/org/apac
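
A minimal sketch (e.g. in spark-shell) of opting into log-and-continue behaviour with the property added above; per the constant's javadoc, "true" (the default) throws the exception and fails the corresponding LOAD job:

    import org.apache.carbondata.core.constants.CarbonCommonConstants
    import org.apache.carbondata.core.util.CarbonProperties

    // "false": log the merge index failure and continue with the LOAD.
    CarbonProperties.getInstance().addProperty(
      CarbonCommonConstants.CARBON_MERGE_INDEX_FAILURE_THROW_EXCEPTION, "false")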

[carbondata] branch master updated: [DOCUMENTATION] Document change for GLOBAL_SORT_PARTITIONS

2019-05-29 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new 10cbf4e  [DOCUMENTATION] Document change for GLOBAL_SORT_PARTITIONS
10cbf4e is described below

commit 10cbf4ec018de4671284e9f6974d05b22609f3a0
Author: manishnalla1994 
AuthorDate: Mon May 27 12:09:04 2019 +0530

[DOCUMENTATION] Document change for GLOBAL_SORT_PARTITIONS

Documentation change for GLOBAL_SORT_PARTITIONS during Range Column data 
load/compaction.

This closes #3234
---
 docs/dml-of-carbondata.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/docs/dml-of-carbondata.md b/docs/dml-of-carbondata.md
index 6ec0520..3e2a22d 100644
--- a/docs/dml-of-carbondata.md
+++ b/docs/dml-of-carbondata.md
@@ -281,6 +281,8 @@ CarbonData DML statements are documented here,which 
includes:
 
 If the SORT_SCOPE is defined as GLOBAL_SORT, then user can specify the 
number of partitions to use while shuffling data for sort using 
GLOBAL_SORT_PARTITIONS. If it is not configured, or configured less than 1, 
then it uses the number of map task as reduce task. It is recommended that each 
reduce task deal with 512MB-1GB data.
 For RANGE_COLUMN, GLOBAL_SORT_PARTITIONS is used to specify the number of 
range partitions also.
+GLOBAL_SORT_PARTITIONS should be specified optimally during RANGE_COLUMN 
LOAD because if a higher number is configured then the load time may be less 
but it will result in creation of more files which would degrade the query and 
compaction performance.
+Conversely, if less partitions are configured then the load performance 
may degrade due to less use of parallelism but the query and compaction will 
become faster. Hence the user may choose optimal number depending on the use 
case.
   ```
   OPTIONS('GLOBAL_SORT_PARTITIONS'='2')
   ```
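
A minimal sketch of a load that sets the option explicitly (hypothetical spark session, table and path):

    // Higher values can shorten the load but create more files, hurting query
    // and compaction performance; lower values do the opposite (see doc above).
    spark.sql("""
      LOAD DATA INPATH 'hdfs://host/path/data.csv' INTO TABLE range_tbl
      OPTIONS('DELIMITER'=',', 'GLOBAL_SORT_PARTITIONS'='8')
    """)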



[carbondata] branch master updated: [CARBONDATA-3396] Range Compaction Data Mismatch Fix

2019-05-29 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new ce40c64  [CARBONDATA-3396] Range Compaction Data Mismatch Fix
ce40c64 is described below

commit ce40c64f552d02417400111e9865ff77a05d4fbd
Author: manishnalla1994 
AuthorDate: Mon May 27 11:41:10 2019 +0530

[CARBONDATA-3396] Range Compaction Data Mismatch Fix

Problem: When the data has to be compacted a second time and the ranges made 
the first time have data in more than one file/blocklet, then while compacting 
the second time, if the first blocklet does not contain any record, the other 
files are also skipped. Also, Global Sort and Local Sort with a Range Column 
took different times for the same data load and compaction, because during the 
write step only 1 core is given to Global Sort.

Solution: For the first issue, all the blocklets of a given range are now read, 
breaking only when the batch size is full. For the second issue, in the case of 
a range column both sort scopes now take the same number of cores and behave 
similarly.

Also changed the number of tasks to be launched during compaction; it is now 
based on the number of tasks during load.

This closes #3233
---
 .../core/constants/CarbonCommonConstants.java  |  4 
 .../AbstractDetailQueryResultIterator.java | 14 +
 .../scan/result/iterator/RawResultIterator.java| 11 +--
 .../carbondata/core/util/CarbonProperties.java | 23 --
 .../carbondata/spark/rdd/CarbonMergerRDD.scala | 18 -
 .../processing/merger/CarbonCompactionUtil.java| 11 +++
 .../store/CarbonFactDataHandlerModel.java  |  3 ++-
 7 files changed, 53 insertions(+), 31 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
index e78ea17..aa9dd05 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
@@ -1193,10 +1193,6 @@ public final class CarbonCommonConstants {
 
   public static final String CARBON_RANGE_COLUMN_SCALE_FACTOR_DEFAULT = "3";
 
-  public static final String CARBON_ENABLE_RANGE_COMPACTION = 
"carbon.enable.range.compaction";
-
-  public static final String CARBON_ENABLE_RANGE_COMPACTION_DEFAULT = "false";
-
   
//
   // Query parameter start here
   
//
diff --git 
a/core/src/main/java/org/apache/carbondata/core/scan/result/iterator/AbstractDetailQueryResultIterator.java
 
b/core/src/main/java/org/apache/carbondata/core/scan/result/iterator/AbstractDetailQueryResultIterator.java
index f39e549..d7f2c0b 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/scan/result/iterator/AbstractDetailQueryResultIterator.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/scan/result/iterator/AbstractDetailQueryResultIterator.java
@@ -24,7 +24,6 @@ import java.util.concurrent.ExecutorService;
 
 import org.apache.carbondata.common.CarbonIterator;
 import org.apache.carbondata.common.logging.LogServiceFactory;
-import org.apache.carbondata.core.constants.CarbonCommonConstants;
 import org.apache.carbondata.core.datastore.DataRefNode;
 import org.apache.carbondata.core.datastore.FileReader;
 import org.apache.carbondata.core.datastore.block.AbstractIndex;
@@ -89,18 +88,7 @@ public abstract class AbstractDetailQueryResultIterator 
extends CarbonIterato
 
   AbstractDetailQueryResultIterator(List infos, QueryModel 
queryModel,
   ExecutorService execService) {
-String batchSizeString =
-
CarbonProperties.getInstance().getProperty(CarbonCommonConstants.DETAIL_QUERY_BATCH_SIZE);
-if (null != batchSizeString) {
-  try {
-batchSize = Integer.parseInt(batchSizeString);
-  } catch (NumberFormatException ne) {
-LOGGER.error("Invalid inmemory records size. Using default value");
-batchSize = CarbonCommonConstants.DETAIL_QUERY_BATCH_SIZE_DEFAULT;
-  }
-} else {
-  batchSize = CarbonCommonConstants.DETAIL_QUERY_BATCH_SIZE_DEFAULT;
-}
+batchSize = CarbonProperties.getQueryBatchSize();
 this.recorder = queryModel.getStatisticsRecorder();
 this.blockExecutionInfos = infos;
 this.fileReader = FileFactory.getFileHolder(
diff --git 
a/core/src/main/java/org/apache/carbondata/core/scan/result/iterator/RawResultIterator.java
 
b/core/src/main/java/org/apache/carbondata/core/scan/result/iterator/RawResultIterator.java
index 4d471b6..911a
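
For reference, a minimal sketch (hypothetical schema) of a table that would exercise this range-compaction path; RANGE_COLUMN is the table property that drives range-based load and compaction:

    spark.sql("""
      CREATE TABLE range_tbl (id INT, name STRING)
      STORED BY 'carbondata'
      TBLPROPERTIES('SORT_COLUMNS'='id', 'RANGE_COLUMN'='id')
    """)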

[carbondata] branch master updated: [CARBONDATA-3397]Remove SparkUnknown Expression to Index Server

2019-05-29 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new 15bae6e  [CARBONDATA-3397]Remove SparkUnknown Expression to Index 
Server
15bae6e is described below

commit 15bae6e5848bc83d4a6f65499fe7dacf88f5a67a
Author: BJangir 
AuthorDate: Mon May 27 14:55:39 2019 +0530

[CARBONDATA-3397]Remove SparkUnknown Expression to Index Server

Problem
If a query has a UDF that is registered only on the main driver, the UDF 
function will not be available in the index server, and the query will fail in 
the index server (with NoClassDefFoundError).

Solution
UDFs are SparkUnknown filters (RowLevelFilterExecuterImpl), so remove the 
SparkUnknown expression, because for pruning we select all blocks anyway. See 
org.apache.carbondata.core.scan.filter.executer.RowLevelFilterExecuterImpl#isScanRequired.

An alternative was to supply all the UDF functions and their related lambda 
expressions to the index server as well, but it has the following issues:
a. The Spark FunctionRegistry is not writable.
b. Sending all functions from the main server to the index server is costly 
(in size), and there is no way to distinguish implicit functions from explicit 
user-created functions.

So solution 1 was adopted.

This closes #3238
---
 .../core/datamap/DistributableDataMapFormat.java   |  8 
 .../scan/filter/FilterExpressionProcessor.java | 43 ++
 .../carbondata/indexserver/DataMapJobs.scala   | 39 
 3 files changed, 90 insertions(+)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/datamap/DistributableDataMapFormat.java
 
b/core/src/main/java/org/apache/carbondata/core/datamap/DistributableDataMapFormat.java
index f76cfec..57540e4 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/datamap/DistributableDataMapFormat.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/datamap/DistributableDataMapFormat.java
@@ -334,4 +334,12 @@ public class DistributableDataMapFormat extends 
FileInputFormat
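
A minimal sketch of the failing scenario (hypothetical spark session and table t): the UDF exists only in the driver's FunctionRegistry, so a remote index server cannot evaluate the resulting SparkUnknown expression and must drop it during pruning.

    // Registered only in this application's JVM; an index server JVM has no
    // such class/registration, hence the failure before this fix.
    spark.udf.register("isLong", (s: String) => s != null && s.length > 10)
    spark.sql("SELECT * FROM t WHERE isLong(name)").show()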

[carbondata] branch master updated: [CARBONDATA-3400] Support IndexSever for Spark-Shell in secure Mode(kerberos)

2019-05-29 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new bf096e1  [CARBONDATA-3400] Support IndexSever for Spark-Shell in 
secure Mode(kerberos)
bf096e1 is described below

commit bf096e128f35865c7cd46cd5a5058c8e5227d773
Author: BJangir 
AuthorDate: Mon May 27 15:26:21 2019 +0530

[CARBONDATA-3400] Support IndexSever for Spark-Shell in secure 
Mode(kerberos)

Problem
In spark-shell or spark-submit mode, the application user and the index server 
user are different.
The application user is based on the kinit user or on the spark.yarn.principal 
user, whereas the index server user is based on 
spark.carbon.indexserver.principal. It is possible that both are different, as 
the index server should have its own authentication principal and should not 
depend on the application principal, so that any application's queries 
(Thriftserver, spark-shell, spark-sql, spark-submit) can be served from the 
index server.

Solution
Authenticate the index server with its own principal and keytab.
The keytab is required so that long-running applications (client and index 
server) are not impacted by token expiry.

Note: spark-defaults.conf of the Thriftserver (beeline), spark-submit and 
spark-sql should have both spark.carbon.indexserver.principal and 
spark.carbon.indexserver.keytab.

This closes #3240
---
 .../scala/org/apache/carbondata/indexserver/IndexServer.scala| 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git 
a/integration/spark2/src/main/scala/org/apache/carbondata/indexserver/IndexServer.scala
 
b/integration/spark2/src/main/scala/org/apache/carbondata/indexserver/IndexServer.scala
index e738fb3..f066095 100644
--- 
a/integration/spark2/src/main/scala/org/apache/carbondata/indexserver/IndexServer.scala
+++ 
b/integration/spark2/src/main/scala/org/apache/carbondata/indexserver/IndexServer.scala
@@ -167,9 +167,16 @@ object IndexServer extends ServerInterface {
*/
   def getClient: ServerInterface = {
 import org.apache.hadoop.ipc.RPC
+val indexServerUser = sparkSession.sparkContext.getConf
+  .get("spark.carbon.indexserver.principal", "")
+val indexServerKeyTab = sparkSession.sparkContext.getConf
+  .get("spark.carbon.indexserver.keytab", "")
+val ugi = 
UserGroupInformation.loginUserFromKeytabAndReturnUGI(indexServerUser,
+  indexServerKeyTab)
+LOGGER.info("Login successful for user " + indexServerUser);
 RPC.getProxy(classOf[ServerInterface],
   RPC.getProtocolVersion(classOf[ServerInterface]),
-  new InetSocketAddress(serverIp, serverPort), 
UserGroupInformation.getLoginUser,
+  new InetSocketAddress(serverIp, serverPort), ugi,
   FileFactory.getConfiguration, 
NetUtils.getDefaultSocketFactory(FileFactory.getConfiguration))
   }
 }
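
A minimal sketch of the client-side configuration this change expects (hypothetical principal and keytab path):

    // Both keys must be present for every application (Thriftserver,
    // spark-shell, spark-sql, spark-submit) so getClient can build the UGI.
    val conf = new org.apache.spark.SparkConf()
      .set("spark.carbon.indexserver.principal", "indexserver/_HOST@EXAMPLE.COM")
      .set("spark.carbon.indexserver.keytab", "/etc/security/keytabs/indexserver.keytab")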



[carbondata] branch master updated: [CARBONDATA-3364] Support Read from Hive. Queries are giving empty results from hive.

2019-05-28 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new fcca6c5  [CARBONDATA-3364] Support Read from Hive. Queries are giving 
empty results from hive.
fcca6c5 is described below

commit fcca6c5b661ec02adfa17622e980a0c396bd84c2
Author: dhatchayani 
AuthorDate: Mon Apr 29 18:52:57 2019 +0530

[CARBONDATA-3364] Support Read from Hive. Queries are giving empty results 
from hive.

This closes #3192
---
 .../apache/carbondata/examples/HiveExample.scala   | 99 +-
 .../apache/carbondata/examplesCI/RunExamples.scala |  3 +-
 integration/hive/pom.xml   |  9 +-
 .../carbondata/hive/CarbonHiveInputSplit.java  |  8 +-
 .../apache/carbondata/hive/CarbonHiveSerDe.java|  2 +-
 .../carbondata/hive/MapredCarbonInputFormat.java   | 20 ++---
 .../carbondata/hive/MapredCarbonOutputFormat.java  | 12 ++-
 .../{ => test}/server/HiveEmbeddedServer2.java | 20 ++---
 integration/spark-common-test/pom.xml  |  6 ++
 .../TestCreateHiveTableWithCarbonDS.scala  |  4 +-
 integration/spark-common/pom.xml   |  5 ++
 .../apache/spark/util/CarbonReflectionUtils.scala  | 17 ++--
 .../spark/util/DictionaryLRUCacheTestCase.scala|  1 +
 pom.xml|  1 +
 14 files changed, 123 insertions(+), 84 deletions(-)

diff --git 
a/examples/spark2/src/main/scala/org/apache/carbondata/examples/HiveExample.scala
 
b/examples/spark2/src/main/scala/org/apache/carbondata/examples/HiveExample.scala
index b50e763..c043076 100644
--- 
a/examples/spark2/src/main/scala/org/apache/carbondata/examples/HiveExample.scala
+++ 
b/examples/spark2/src/main/scala/org/apache/carbondata/examples/HiveExample.scala
@@ -19,33 +19,36 @@ package org.apache.carbondata.examples
 import java.io.File
 import java.sql.{DriverManager, ResultSet, Statement}
 
-import org.apache.spark.sql.SparkSession
+import org.apache.hadoop.fs.Path
+import org.apache.hadoop.fs.permission.{FsAction, FsPermission}
 
 import org.apache.carbondata.common.logging.LogServiceFactory
-import org.apache.carbondata.core.constants.CarbonCommonConstants
-import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.carbondata.core.datastore.impl.FileFactory
 import org.apache.carbondata.examples.util.ExampleUtils
-import org.apache.carbondata.hive.server.HiveEmbeddedServer2
+import org.apache.carbondata.hive.test.server.HiveEmbeddedServer2
 
 // scalastyle:off println
 object HiveExample {
 
   private val driverName: String = "org.apache.hive.jdbc.HiveDriver"
 
-  def main(args: Array[String]) {
-val carbonSession = ExampleUtils.createCarbonSession("HiveExample")
-exampleBody(carbonSession, CarbonProperties.getStorePath
-  + CarbonCommonConstants.FILE_SEPARATOR
-  + CarbonCommonConstants.DATABASE_DEFAULT_NAME)
-carbonSession.stop()
+  val rootPath = new File(this.getClass.getResource("/").getPath
+  + "../../../..").getCanonicalPath
+  private val targetLoc = s"$rootPath/examples/spark2/target"
+  val metaStoreLoc = s"$targetLoc/metastore_db"
+  val storeLocation = s"$targetLoc/store"
+  val logger = LogServiceFactory.getLogService(this.getClass.getCanonicalName)
 
+
+  def main(args: Array[String]) {
+createCarbonTable(storeLocation)
+readFromHive
 System.exit(0)
   }
 
-  def exampleBody(carbonSession: SparkSession, store: String): Unit = {
-val logger = 
LogServiceFactory.getLogService(this.getClass.getCanonicalName)
-val rootPath = new File(this.getClass.getResource("/").getPath
-  + "../../../..").getCanonicalPath
+  def createCarbonTable(store: String): Unit = {
+
+val carbonSession = ExampleUtils.createCarbonSession("HiveExample")
 
 carbonSession.sql("""DROP TABLE IF EXISTS 
HIVE_CARBON_EXAMPLE""".stripMargin)
 
@@ -56,14 +59,44 @@ object HiveExample {
  | STORED BY 'carbondata'
""".stripMargin)
 
+val inputPath = FileFactory
+  
.getUpdatedFilePath(s"$rootPath/examples/spark2/src/main/resources/sample.csv")
+
 carbonSession.sql(
   s"""
- | LOAD DATA LOCAL INPATH 
'$rootPath/examples/spark2/src/main/resources/sample.csv'
+ | LOAD DATA LOCAL INPATH '$inputPath'
+ | INTO TABLE HIVE_CARBON_EXAMPLE
+   """.stripMargin)
+
+carbonSession.sql(
+  s"""
+ | LOAD DATA LOCAL INPATH '$inputPath'
  | INTO TABLE HIVE_CARBON_EXAMPLE
""".stripMargin)
 
 carbonSession.sql("SELECT * FROM HIVE_CARBON_EXAMPLE").show()
 
+carbonSession.close()

[carbondata] branch master updated: [CARBONDATA-3395] Fix Exception when concurrent readers built with same split object

2019-05-28 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new 36ee528  [CARBONDATA-3395] Fix Exception when concurrent readers built 
with same split object
36ee528 is described below

commit 36ee52836c7bb7bc8e7a4cc6c294d7b77fdba2ee
Author: ajantha-bhat 
AuthorDate: Fri May 24 19:50:57 2019 +0530

[CARBONDATA-3395] Fix Exception when concurrent readers built with same 
split object

Problem: An exception occurs when concurrent readers are built with the same 
split object.

Cause: In CarbonInputSplit, BlockletDetailInfo and BlockletInfo are made lazy, 
so BlockletInfo is prepared during reader build. When two readers work on the 
same split object, the state of the object is changed concurrently, leading to 
an array-out-of-bounds issue.

Solution:
a) Synchronize BlockletInfo creation.
b) Load BlockletDetailInfo before passing it to the reader, inside the 
getSplits() API itself.
c) In the failure case, get the proper identifier to clean up the datamaps.
d) In build-with-splits, handle default projection filling if not configured.

This closes #3232
---
 .../carbondata/core/indexstore/BlockletDetailInfo.java   |  6 +-
 .../carbondata/hadoop/api/CarbonFileInputFormat.java | 16 ++--
 .../apache/carbondata/sdk/file/CarbonReaderBuilder.java  | 14 ++
 3 files changed, 25 insertions(+), 11 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/indexstore/BlockletDetailInfo.java
 
b/core/src/main/java/org/apache/carbondata/core/indexstore/BlockletDetailInfo.java
index a5aa899..af07f09 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/indexstore/BlockletDetailInfo.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/indexstore/BlockletDetailInfo.java
@@ -108,7 +108,11 @@ public class BlockletDetailInfo implements Serializable, 
Writable {
   public BlockletInfo getBlockletInfo() {
 if (null == blockletInfo) {
   try {
-setBlockletInfoFromBinary();
+synchronized (this) {
+  if (null == blockletInfo) {
+setBlockletInfoFromBinary();
+  }
+}
   } catch (IOException e) {
 throw new RuntimeException(e);
   }
diff --git 
a/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonFileInputFormat.java
 
b/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonFileInputFormat.java
index e83f898..1f34c4f 100644
--- 
a/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonFileInputFormat.java
+++ 
b/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonFileInputFormat.java
@@ -200,17 +200,21 @@ public class CarbonFileInputFormat extends 
CarbonInputFormat implements Se
   }
 });
   }
-  if (getColumnProjection(job.getConfiguration()) == null) {
-// If the user projection is empty, use default all columns as 
projections.
-// All column name will be filled inside getSplits, so can update only 
here.
-String[]  projectionColumns = projectAllColumns(carbonTable);
-setColumnProjection(job.getConfiguration(), projectionColumns);
-  }
+  setAllColumnProjectionIfNotConfigured(job, carbonTable);
   return splits;
 }
 return null;
   }
 
+  public void setAllColumnProjectionIfNotConfigured(JobContext job, 
CarbonTable carbonTable) {
+if (getColumnProjection(job.getConfiguration()) == null) {
+  // If the user projection is empty, use default all columns as 
projections.
+  // All column name will be filled inside getSplits, so can update only 
here.
+  String[]  projectionColumns = projectAllColumns(carbonTable);
+  setColumnProjection(job.getConfiguration(), projectionColumns);
+}
+  }
+
   private List getAllCarbonDataFiles(String tablePath) {
 List carbonFiles;
 try {
diff --git 
a/store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonReaderBuilder.java
 
b/store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonReaderBuilder.java
index 6ead50d..2db92ea 100644
--- 
a/store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonReaderBuilder.java
+++ 
b/store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonReaderBuilder.java
@@ -358,8 +358,8 @@ public class CarbonReaderBuilder {
   }
 } catch (Exception ex) {
   // Clear the datamap cache as it can get added in getSplits() method
-  DataMapStoreManager.getInstance()
-  .clearDataMaps(format.getAbsoluteTableIdentifier(hadoopConf));
+  DataMapStoreManager.getInstance().clearDataMaps(
+  
format.getOrCreateCarbonTable((job.getConfiguration())).getAbsoluteTableIdentifier());
   throw ex;
 }
   }
@@ -372,6 +372,8 @@ public class CarbonReaderBuilder {
 }
 final Job job = new Job(new JobConf(hadoopConf));
 CarbonFileInputFormat format
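
The fix guards lazy deserialization with a double-checked synchronized block; a generic Scala sketch of the same pattern (hypothetical Detail type, not the CarbonData class):

    final case class Detail(payload: Array[Byte])
    object Detail { def fromBinary(): Detail = Detail(Array.emptyByteArray) }

    class LazyHolder {
      @volatile private var detail: Detail = _
      def getDetail: Detail = {
        if (detail == null) {
          synchronized {
            // re-check under the lock so the field is built at most once,
            // even when two readers race on the same split object
            if (detail == null) detail = Detail.fromBinary()
          }
        }
        detail
      }
    }

In plain Scala a lazy val gives equivalent thread-safe semantics; the explicit form above mirrors the Java change in this commit.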

[carbondata] branch master updated: [HOTFIX]Fix select * failure when MV datamap is enabled

2019-05-28 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new faba657  [HOTFIX]Fix select * failure when MV datamap is enabled
faba657 is described below

commit faba657becafe3b68fe73af875385c57384dbc8f
Author: akashrn5 
AuthorDate: Mon May 27 12:28:00 2019 +0530

[HOTFIX]Fix select * failure when MV datamap is enabled

Problem:
When select * is executed with limit, the ColumnPruning rule removes the 
project node from the plan during optimization, so the child of the limit node 
is a relation and modular plan generation fails.

Solution:
If the child of Limit is a relation, make the select node and then build the 
modular plan.

This closes #3235
---
 .../carbondata/mv/rewrite/MVCreateTestCase.scala   | 18 ++
 .../carbondata/mv/plans/modular/ModularPatterns.scala  | 10 ++
 .../mv/plans/util/Logical2ModularExtractions.scala |  7 +++
 3 files changed, 35 insertions(+)

diff --git 
a/datamap/mv/core/src/test/scala/org/apache/carbondata/mv/rewrite/MVCreateTestCase.scala
 
b/datamap/mv/core/src/test/scala/org/apache/carbondata/mv/rewrite/MVCreateTestCase.scala
index 4f5423e..48f967f 100644
--- 
a/datamap/mv/core/src/test/scala/org/apache/carbondata/mv/rewrite/MVCreateTestCase.scala
+++ 
b/datamap/mv/core/src/test/scala/org/apache/carbondata/mv/rewrite/MVCreateTestCase.scala
@@ -953,6 +953,23 @@ class MVCreateTestCase extends QueryTest with 
BeforeAndAfterAll {
 sql("drop table if exists all_table")
   }
 
+  test("test select * and distinct when MV is enabled") {
+sql("drop table if exists limit_fail")
+sql("CREATE TABLE limit_fail (empname String, designation String, doj 
Timestamp,workgroupcategory int, workgroupcategoryname String, deptno int, 
deptname String,projectcode int, projectjoindate Timestamp, projectenddate 
Timestamp,attendance int,utilization int,salary int)STORED BY 
'org.apache.carbondata.format'")
+sql(s"LOAD DATA local inpath '$resourcesPath/data_big.csv' INTO TABLE 
limit_fail  OPTIONS" +
+"('DELIMITER'= ',', 'QUOTECHAR'= '\"')")
+sql("create datamap limit_fail_dm1 using 'mv' as select 
empname,designation from limit_fail")
+try {
+  val df = sql("select distinct(empname) from limit_fail limit 10")
+  sql("select * from limit_fail limit 10").show()
+  val analyzed = df.queryExecution.analyzed
+  assert(verifyMVDataMap(analyzed, "limit_fail_dm1"))
+} catch {
+  case ex: Exception =>
+assert(false)
+}
+  }
+
   def verifyMVDataMap(logicalPlan: LogicalPlan, dataMapName: String): Boolean 
= {
 val tables = logicalPlan collect {
   case l: LogicalRelation => l.catalogTable.get
@@ -970,6 +987,7 @@ class MVCreateTestCase extends QueryTest with 
BeforeAndAfterAll {
 sql("drop table IF EXISTS fact_streaming_table1")
 sql("drop table IF EXISTS fact_streaming_table2")
 sql("drop table IF EXISTS fact_table_parquet")
+sql("drop table if exists limit_fail")
   }
 
   override def afterAll {
diff --git 
a/datamap/mv/plan/src/main/scala/org/apache/carbondata/mv/plans/modular/ModularPatterns.scala
 
b/datamap/mv/plan/src/main/scala/org/apache/carbondata/mv/plans/modular/ModularPatterns.scala
index a4116d9..30857c8 100644
--- 
a/datamap/mv/plan/src/main/scala/org/apache/carbondata/mv/plans/modular/ModularPatterns.scala
+++ 
b/datamap/mv/plan/src/main/scala/org/apache/carbondata/mv/plans/modular/ModularPatterns.scala
@@ -19,6 +19,7 @@ package org.apache.carbondata.mv.plans.modular
 
 import org.apache.spark.sql.catalyst.expressions.{Expression, NamedExpression, 
PredicateHelper, _}
 import org.apache.spark.sql.catalyst.plans.logical._
+import org.apache.spark.sql.execution.datasources.LogicalRelation
 
 import org.apache.carbondata.mv.plans.{Pattern, _}
 import org.apache.carbondata.mv.plans.modular.Flags._
@@ -118,6 +119,15 @@ abstract class ModularPatterns extends 
Modularizer[ModularPlan] {
   makeSelectModule(output, input, predicate, aliasmap, joinedge, flags,
 children.map(modularizeLater), Seq(Seq(limitExpr)) ++ fspec1, 
wspec)
 
+// if select * is with limit, then projection is removed from plan, so 
send the parent plan
+// to ExtractSelectModule to make the select node
+case limit@Limit(limitExpr, lr: LogicalRelation) =>
+  val (output, input, predicate, aliasmap, joinedge, children, flags1,
+  fspec1, wspec) = ExtractSelectModule.unapply(limit).get
+  val flags = flags1.setFlag(LIMIT)
+  makeSelectModule(output, input, predicate, aliasmap, joinedge, flags,
+children.map(modularizeLater), Seq(Seq(limitExpr)) ++ f

[carbondata] branch master updated: [CARBONDATA-3387] Support Partition with MV datamap & Show DataMap Status

2019-05-28 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new 51235d4  [CARBONDATA-3387] Support Partition with MV datamap & Show 
DataMap Status
51235d4 is described below

commit 51235d4cf239ea0d167623fed5ae339796d56eae
Author: Indhumathi27 
AuthorDate: Mon May 13 11:08:31 2019 +0530

[CARBONDATA-3387] Support Partition with MV datamap & Show DataMap Status

This PR includes:

Support Partition with MV datamap [datamap with a single parent table]

Show DataMap status and parent-table-to-datamap-table segment sync 
information with the SHOW DATAMAP DDL

Optimization for incremental data load.
In the below scenario we can avoid reloading the MV:
Maintable segments: 0,1,2
MV: 0 => 0,1,2
After maintable compaction, segment 0.1 of the maintable would be reloaded to 
the MV; this is avoided by changing the mapping to {0,1,2} => {0.1}, as 
sketched below.
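
A minimal sketch of that remapping (plain Scala; the method and map shapes are
illustrative, not the actual CarbonData API):

    import scala.collection.mutable

    // MV segment "0" was built from main-table segments 0, 1 and 2.
    val mvToMain = mutable.Map("0" -> Set("0", "1", "2"))

    // When the main table compacts those segments into "0.1", remap instead of reloading:
    def onMainTableCompaction(compacted: Set[String], merged: String): Unit = {
      for ((mvSeg, mainSegs) <- mvToMain if compacted.subsetOf(mainSegs)) {
        // the MV data built from 0,1,2 is still valid for the merged segment 0.1
        mvToMain(mvSeg) = (mainSegs -- compacted) + merged
      }
    }

    onMainTableCompaction(Set("0", "1", "2"), "0.1")
    // mvToMain is now Map("0" -> Set("0.1")); no MV reload is triggered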

This closes #3216
---
 .../core/constants/CarbonCommonConstants.java  |   2 +
 .../carbondata/core/datamap/DataMapProvider.java   |  64 +-
 .../core/metadata/schema/table/DataMapSchema.java  |  13 +
 datamap/mv/core/pom.xml|   2 +-
 .../carbondata/mv/datamap/MVDataMapProvider.scala  |  12 +-
 .../apache/carbondata/mv/datamap/MVHelper.scala|  75 ++-
 .../org/apache/carbondata/mv/datamap/MVUtil.scala  |   3 +-
 .../mv/rewrite/MVIncrementalLoadingTestcase.scala  |  23 +
 .../mv/rewrite/TestAllOperationsOnMV.scala | 138 -
 .../mv/rewrite/TestPartitionWithMV.scala   | 688 +
 datamap/mv/plan/pom.xml|   2 +-
 .../mv/plans/util/BirdcageOptimizer.scala  |   4 +-
 .../testsuite/datamap/TestDataMapCommand.scala |  10 +-
 ...StandardPartitionWithPreaggregateTestCase.scala |  10 +
 .../scala/org/apache/spark/sql/CarbonEnv.scala |   5 +-
 .../datamap/CarbonCreateDataMapCommand.scala   |  36 +-
 .../command/datamap/CarbonDataMapShowCommand.scala |  54 +-
 .../command/management/CarbonLoadDataCommand.scala |  10 +-
 .../execution/command/mv/DataMapListeners.scala| 113 +++-
 .../CarbonAlterTableDropHivePartitionCommand.scala |   4 -
 .../preaaggregate/PreAggregateListeners.scala  |   2 +-
 .../command/table/CarbonDropTableCommand.scala |  14 +-
 .../spark/sql/execution/strategy/DDLStrategy.scala |   4 +
 .../processing/util/CarbonLoaderUtil.java  |  43 ++
 24 files changed, 1280 insertions(+), 51 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
index 9375414..e78ea17 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
@@ -2174,4 +2174,6 @@ public final class CarbonCommonConstants {
*/
   public static final String PARENT_TABLES = "parent_tables";
 
+  public static final String LOAD_SYNC_TIME = "load_sync_time";
+
 }
diff --git 
a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapProvider.java 
b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapProvider.java
index fe2e7dd..c4ee49b 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapProvider.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapProvider.java
@@ -264,23 +264,52 @@ public abstract class DataMapProvider {
 } else {
   for (RelationIdentifier relationIdentifier : relationIdentifiers) {
List<String> dataMapTableSegmentList = new ArrayList<>();
+// Get all segments for parent relationIdentifier
+List<String> mainTableSegmentList =
+DataMapUtil.getMainTableValidSegmentList(relationIdentifier);
+boolean ifTableStatusUpdateRequired = false;
for (LoadMetadataDetails loadMetaDetail : listOfLoadFolderDetails) {
  if (loadMetaDetail.getSegmentStatus() == SegmentStatus.SUCCESS
  || loadMetaDetail.getSegmentStatus() == SegmentStatus.INSERT_IN_PROGRESS) {
Map<String, List<String>> segmentMaps =
DataMapSegmentStatusUtil.getSegmentMap(loadMetaDetail.getExtraInfo());
-dataMapTableSegmentList.addAll(segmentMaps.get(
-relationIdentifier.getDatabaseName() + CarbonCommonConstants.POINT
-+ relationIdentifier.getTableName()));
+String mainTableMetaDataPath =
+CarbonTablePath.getMetadataPath(relationIdentifier.getTablePath());
+LoadMetadataDetails[] parentTableLoadMetaDataDetails =
+SegmentStatusManager.readLoadMetadata(mainTableMetaDataPath);
+String table = relationIdentifier.getDa

[carbondata] branch master updated: [CARBONDATA-3392] Make LRU mandatory for index server

2019-05-28 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new df7339c  [CARBONDATA-3392] Make LRU mandatory for index server
df7339c is described below

commit df7339ce005be48dfb440e4cd02f640d6555e887
Author: kunal642 
AuthorDate: Wed May 15 16:40:28 2019 +0530

[CARBONDATA-3392] Make LRU mandatory for index server

Background:
Currently the LRU cache size is optional for the user to configure, but this 
raises some concerns in the case of the index server, because invalid segments 
have to be constantly removed from the cache in update/delete/compaction 
scenarios.

If the clear segment job fails, the query job itself should not fail, but 
there has to be a mechanism to prevent that segment from staying in the cache 
forever.

To prevent the above mentioned scenario, the LRU cache size for executors is 
made a mandatory property for the index server application.
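
A minimal sketch of the resulting startup guard, assuming the executor LRU size
is configured through the carbon.max.executor.lru.cache.size property (the guard
itself is illustrative, not the actual IndexServer code):

    import org.apache.carbondata.core.util.CarbonProperties

    // Refuse to start the index server when no executor LRU size is configured,
    // so invalid segments can always be evicted from the cache.
    def validateLruConfigured(): Unit = {
      val size = CarbonProperties.getInstance()
        .getProperty("carbon.max.executor.lru.cache.size", "-1")
      if (size.toLong <= 0) {
        throw new RuntimeException(
          "carbon.max.executor.lru.cache.size must be configured to start the index server")
      }
    }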

This closes #3222
---
 .../carbondata/core/datamap/DataMapUtil.java   | 10 +-
 .../carbondata/core/util/BlockletDataMapUtil.java  |  2 +-
 .../hadoop/api/CarbonTableInputFormat.java | 39 +-
 .../carbondata/indexserver/DataMapJobs.scala   | 18 --
 .../indexserver/DistributedPruneRDD.scala  | 12 +--
 .../carbondata/indexserver/IndexServer.scala   | 19 +--
 .../spark/rdd/CarbonDataRDDFactory.scala   | 10 --
 .../sql/execution/command/cache/CacheUtil.scala| 15 +++--
 .../command/cache/CarbonShowCacheCommand.scala | 23 -
 9 files changed, 86 insertions(+), 62 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapUtil.java 
b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapUtil.java
index e20f19a..2371a10 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapUtil.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapUtil.java
@@ -115,7 +115,15 @@ public class DataMapUtil {
 DistributableDataMapFormat dataMapFormat = new 
DistributableDataMapFormat(carbonTable,
 validAndInvalidSegmentsInfo.getValidSegments(), invalidSegment, true,
 dataMapToClear);
-dataMapJob.execute(dataMapFormat);
+try {
+  dataMapJob.execute(dataMapFormat);
+} catch (Exception e) {
+  if 
(dataMapJob.getClass().getName().equalsIgnoreCase(DISTRIBUTED_JOB_NAME)) {
+LOGGER.warn("Failed to clear distributed cache.", e);
+  } else {
+throw e;
+  }
+}
   }
 
   public static void executeClearDataMapJob(CarbonTable carbonTable, String 
jobClassName)
diff --git 
a/core/src/main/java/org/apache/carbondata/core/util/BlockletDataMapUtil.java 
b/core/src/main/java/org/apache/carbondata/core/util/BlockletDataMapUtil.java
index c90c3dc..68aad72 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/util/BlockletDataMapUtil.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/util/BlockletDataMapUtil.java
@@ -228,7 +228,7 @@ public class BlockletDataMapUtil {
 List<TableBlockIndexUniqueIdentifier> tableBlockIndexUniqueIdentifiers = new ArrayList<>();
 String mergeFilePath =
 identifier.getIndexFilePath() + CarbonCommonConstants.FILE_SEPARATOR + identifier
-.getMergeIndexFileName();
+.getIndexFileName();
 segmentIndexFileStore.readMergeFile(mergeFilePath);
 List<String> indexFiles =
 segmentIndexFileStore.getCarbonMergeFileToIndexFilesMap().get(mergeFilePath);
diff --git 
a/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
 
b/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
index dd86dcb..274c7ef 100644
--- 
a/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
+++ 
b/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
@@ -557,22 +557,31 @@ public class CarbonTableInputFormat<T> extends CarbonInputFormat<T> {
 }
 if (isIUDTable || isUpdateFlow) {
   Map<String, Long> blockletToRowCountMap = new HashMap<>();
-  if (CarbonProperties.getInstance().isDistributedPruningEnabled(table.getDatabaseName(),
-  table.getTableName())) {
-List<InputSplit> extendedBlocklets = CarbonTableInputFormat.convertToCarbonInputSplit(
-getDistributedSplit(table, null, partitions, filteredSegment,
-allSegments.getInvalidSegments(), toBeCleanedSegments));
-for (InputSplit extendedBlocklet : extendedBlocklets) {
-  CarbonInputSplit blocklet = (CarbonInputSplit) extendedBlocklet;
-  String filePath = blocklet.getFilePath();
-  String blockName = filePath.substring(filePath.lastIndexOf("/") + 1);
-  blockletToRowCountMap.put(blocklet.getSegmentId() + "," + blockName,
-  (

[carbondata] branch master updated: [CARBONDATA-3357] Support TableProperties from single parent table and restrict alter/delete/partition on mv

2019-05-27 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new 2a28dba  [CARBONDATA-3357] Support TableProperties from single parent 
table and restrict alter/delete/partition on mv
2a28dba is described below

commit 2a28dba04236ce976984d9cbc398eb8fa517d6f5
Author: Indhumathi27 
AuthorDate: Wed Apr 24 01:04:21 2019 +0530

[CARBONDATA-3357] Support TableProperties from single parent table and 
restrict alter/delete/partition on mv

Inherit table properties from the main table to the MV datamap table if the 
datamap has a single parent table; otherwise use default table properties.
Restrict Alter/Delete/Partition operations on MV tables.

This closes #3184
---
 .../core/datamap/DataMapStoreManager.java  |  27 +-
 .../carbondata/core/datamap/DataMapUtil.java   |   1 +
 .../core/metadata/schema/table/CarbonTable.java|  17 --
 .../core/metadata/schema/table/DataMapSchema.java  |  14 +
 .../carbondata/mv/datamap/MVDataMapProvider.scala  |  19 +-
 .../apache/carbondata/mv/datamap/MVHelper.scala| 110 ++--
 .../org/apache/carbondata/mv/datamap/MVUtil.scala  | 287 +
 .../mv/rewrite/MVCountAndCaseTestCase.scala|   2 -
 .../carbondata/mv/rewrite/MVCreateTestCase.scala   |  29 +--
 .../mv/rewrite/MVIncrementalLoadingTestcase.scala  |   1 -
 .../mv/rewrite/MVMultiJoinTestCase.scala   |   8 +-
 .../carbondata/mv/rewrite/MVTpchTestCase.scala |  10 +-
 .../mv/rewrite/TestAllOperationsOnMV.scala | 255 ++
 .../mv/rewrite/matching/TestSQLBatch.scala |   4 +-
 .../preaggregate/TestPreAggregateLoad.scala|   2 +-
 .../TestTimeSeriesUnsupportedSuite.scala   |   8 +-
 .../scala/org/apache/spark/sql/CarbonEnv.scala |   9 +-
 .../command/datamap/CarbonDropDataMapCommand.scala |   9 +
 .../management/CarbonCleanFilesCommand.scala   |   3 +-
 .../execution/command/mv/DataMapListeners.scala| 146 ++-
 .../CarbonAlterTableDropHivePartitionCommand.scala |   7 +-
 .../preaaggregate/PreAggregateListeners.scala  |   6 +-
 .../preaaggregate/PreAggregateTableHelper.scala| 102 +---
 .../schema/CarbonAlterTableRenameCommand.scala |   7 +-
 .../spark/sql/execution/strategy/DDLStrategy.scala |   4 +-
 .../spark/sql/hive/CarbonAnalysisRules.scala   |  10 +-
 .../scala/org/apache/spark/util/DataMapUtil.scala  | 160 
 27 files changed, 1054 insertions(+), 203 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java
 
b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java
index 81b1fb2..89402c2 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapStoreManager.java
@@ -281,19 +281,22 @@ public final class DataMapStoreManager {
   dataMapCatalogs = new ConcurrentHashMap<>();
   List dataMapSchemas = getAllDataMapSchemas();
   for (DataMapSchema schema : dataMapSchemas) {
-DataMapCatalog dataMapCatalog = 
dataMapCatalogs.get(schema.getProviderName());
-if (dataMapCatalog == null) {
-  dataMapCatalog = dataMapProvider.createDataMapCatalog();
-  if (null == dataMapCatalog) {
-throw new RuntimeException("Internal Error.");
+if (schema.getProviderName()
+
.equalsIgnoreCase(dataMapProvider.getDataMapSchema().getProviderName())) {
+  DataMapCatalog dataMapCatalog = 
dataMapCatalogs.get(schema.getProviderName());
+  if (dataMapCatalog == null) {
+dataMapCatalog = dataMapProvider.createDataMapCatalog();
+if (null == dataMapCatalog) {
+  throw new RuntimeException("Internal Error.");
+}
+dataMapCatalogs.put(schema.getProviderName(), dataMapCatalog);
+  }
+  try {
+dataMapCatalog.registerSchema(schema);
+  } catch (Exception e) {
+// Ignore the schema
+LOGGER.error("Error while registering schema", e);
   }
-  dataMapCatalogs.put(schema.getProviderName(), dataMapCatalog);
-}
-try {
-  dataMapCatalog.registerSchema(schema);
-} catch (Exception e) {
-  // Ignore the schema
-  LOGGER.error("Error while registering schema", e);
 }
   }
 }
diff --git 
a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapUtil.java 
b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapUtil.java
index 0a604fb..e20f19a 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapUtil.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapUtil.java
@@ -270,4 +270,5 @@

[carbondata] branch master updated: [CARBONDATA-3384] Fix NullPointerException for update/delete using index server

2019-05-27 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new bd16325  [CARBONDATA-3384] Fix NullPointerException for update/delete 
using index server
bd16325 is described below

commit bd1632564acb248db7080b9fd5f76b8e8da79101
Author: kunal642 
AuthorDate: Wed May 15 11:35:18 2019 +0530

[CARBONDATA-3384] Fix NullPointerException for update/delete using index 
server

Problem:
After an update, the segment cache is cleared from the executor; in any 
subsequent query only one index file is considered when creating the 
BlockUniqueIdentifier. The query therefore throws a NullPointerException when 
accessing the segmentProperties.

Solution:
Consider all index files of the segment when creating the identifiers.

This closes #3218
---
 .../indexstore/blockletindex/BlockletDataMapFactory.java |  4 ++--
 .../carbondata/hadoop/api/CarbonTableInputFormat.java|  4 +++-
 .../indexserver/InvalidateSegmentCacheRDD.scala  | 16 ++--
 3 files changed, 15 insertions(+), 9 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
 
b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
index e4a3ad8..446507f 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
@@ -344,6 +344,7 @@ public class BlockletDataMapFactory extends 
CoarseGrainDataMapFactory
 Set<TableBlockIndexUniqueIdentifier> tableBlockIndexUniqueIdentifiers =
 segmentMap.get(distributable.getSegment().getSegmentNo());
 if (tableBlockIndexUniqueIdentifiers == null) {
+  tableBlockIndexUniqueIdentifiers = new HashSet<>();
   Set<String> indexFiles = distributable.getSegment().getCommittedIndexFile().keySet();
   for (String indexFile : indexFiles) {
 CarbonFile carbonFile = FileFactory.getCarbonFile(indexFile);
@@ -363,10 +364,9 @@ public class BlockletDataMapFactory extends 
CoarseGrainDataMapFactory
 identifiersWrapper.add(
 new 
TableBlockIndexUniqueIdentifierWrapper(tableBlockIndexUniqueIdentifier,
 this.getCarbonTable()));
-tableBlockIndexUniqueIdentifiers = new HashSet<>();
 tableBlockIndexUniqueIdentifiers.add(tableBlockIndexUniqueIdentifier);
-segmentMap.put(distributable.getSegment().getSegmentNo(), 
tableBlockIndexUniqueIdentifiers);
   }
+  segmentMap.put(distributable.getSegment().getSegmentNo(), 
tableBlockIndexUniqueIdentifiers);
 } else {
   for (TableBlockIndexUniqueIdentifier tableBlockIndexUniqueIdentifier :
   tableBlockIndexUniqueIdentifiers) {
diff --git 
a/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
 
b/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
index 458c95e..dd86dcb 100644
--- 
a/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
+++ 
b/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
@@ -564,7 +564,9 @@ public class CarbonTableInputFormat extends 
CarbonInputFormat {
 allSegments.getInvalidSegments(), toBeCleanedSegments));
 for (InputSplit extendedBlocklet : extendedBlocklets) {
   CarbonInputSplit blocklet = (CarbonInputSplit) extendedBlocklet;
-  blockletToRowCountMap.put(blocklet.getSegmentId() + "," + 
blocklet.getFilePath(),
+  String filePath = blocklet.getFilePath();
+  String blockName = filePath.substring(filePath.lastIndexOf("/") + 1);
+  blockletToRowCountMap.put(blocklet.getSegmentId() + "," + blockName,
   (long) blocklet.getDetailInfo().getRowCount());
 }
   } else {
diff --git 
a/integration/spark2/src/main/scala/org/apache/carbondata/indexserver/InvalidateSegmentCacheRDD.scala
 
b/integration/spark2/src/main/scala/org/apache/carbondata/indexserver/InvalidateSegmentCacheRDD.scala
index 1aa8cd9..bc83d2f 100644
--- 
a/integration/spark2/src/main/scala/org/apache/carbondata/indexserver/InvalidateSegmentCacheRDD.scala
+++ 
b/integration/spark2/src/main/scala/org/apache/carbondata/indexserver/InvalidateSegmentCacheRDD.scala
@@ -43,12 +43,16 @@ class InvalidateSegmentCacheRDD(@transient private val ss: 
SparkSession, databas
   }
 
   override protected def internalGetPartitions: Array[Partition] = {
-executorsList.zipWithIndex.map {
-  case (executor, idx) =>
-// create a dummy split for each executor to accumulate the cache size.
-val dummySplit = new CarbonInputSplit()
-dummySplit.setLocation(Array(executor))
-  

[carbondata] branch master updated: [HOTFIX] exclude logback from arrow dependency

2019-05-21 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new df71291  [HOTFIX] exclude logback from arrow dependency
df71291 is described below

commit df71291ec6a87cbf1c3e03cf728959abf2990faf
Author: ajantha-bhat 
AuthorDate: Tue May 21 14:46:20 2019 +0530

[HOTFIX] exclude logback from arrow dependency

[HOTFIX] exclude logback from arrow dependency
logback is a similar logging framework with a default DEBUG log level; arrow 
was importing it as a transitive dependency. Because of this, all library log 
levels were set to debug, causing huge logs.
It is now excluded from the dependency.

This closes #3228
---
 store/sdk/pom.xml | 48 
 1 file changed, 48 insertions(+)

diff --git a/store/sdk/pom.xml b/store/sdk/pom.xml
index a1d594d..6f04a58 100644
--- a/store/sdk/pom.xml
+++ b/store/sdk/pom.xml
@@ -49,6 +49,12 @@
   <groupId>org.apache.arrow</groupId>
   <artifactId>arrow-format</artifactId>
   <version>0.12.0</version>
+  <exclusions>
+    <exclusion>
+      <groupId>ch.qos.logback</groupId>
+      <artifactId>logback-classic</artifactId>
+    </exclusion>
+  </exclusions>
 </dependency>
 <dependency>
   <groupId>org.apache.arrow</groupId>
@@ -56,6 +62,10 @@
   <version>0.12.0</version>
   <exclusions>
     <exclusion>
+      <groupId>ch.qos.logback</groupId>
+      <artifactId>logback-classic</artifactId>
+    </exclusion>
+    <exclusion>
       <groupId>io.netty</groupId>
       <artifactId>netty-common</artifactId>
     </exclusion>
@@ -71,6 +81,10 @@
   <version>0.12.0</version>
   <exclusions>
     <exclusion>
+      <groupId>ch.qos.logback</groupId>
+      <artifactId>logback-classic</artifactId>
+    </exclusion>
+    <exclusion>
       <groupId>io.netty</groupId>
       <artifactId>netty-common</artifactId>
     </exclusion>
@@ -84,6 +98,12 @@
   <groupId>org.apache.arrow</groupId>
   <artifactId>arrow-plasma</artifactId>
   <version>0.12.0</version>
+  <exclusions>
+    <exclusion>
+      <groupId>ch.qos.logback</groupId>
+      <artifactId>logback-classic</artifactId>
+    </exclusion>
+  </exclusions>
 </dependency>
 <dependency>
   <groupId>org.apache.arrow</groupId>
@@ -91,6 +111,10 @@
   <version>0.12.0</version>
   <exclusions>
     <exclusion>
+      <groupId>ch.qos.logback</groupId>
+      <artifactId>logback-classic</artifactId>
+    </exclusion>
+    <exclusion>
       <groupId>io.netty</groupId>
       <artifactId>netty-buffer</artifactId>
     </exclusion>
@@ -100,21 +124,45 @@
   <groupId>org.apache.arrow</groupId>
   <artifactId>arrow-tools</artifactId>
   <version>0.12.0</version>
+  <exclusions>
+    <exclusion>
+      <groupId>ch.qos.logback</groupId>
+      <artifactId>logback-classic</artifactId>
+    </exclusion>
+  </exclusions>
 </dependency>
 <dependency>
   <groupId>com.fasterxml.jackson.core</groupId>
   <artifactId>jackson-core</artifactId>
   <version>${dep.jackson.version}</version>
+  <exclusions>
+    <exclusion>
+      <groupId>ch.qos.logback</groupId>
+      <artifactId>logback-classic</artifactId>
+    </exclusion>
+  </exclusions>
 </dependency>
 <dependency>
   <groupId>com.fasterxml.jackson.core</groupId>
   <artifactId>jackson-annotations</artifactId>
   <version>${dep.jackson.version}</version>
+  <exclusions>
+    <exclusion>
+      <groupId>ch.qos.logback</groupId>
+      <artifactId>logback-classic</artifactId>
+    </exclusion>
+  </exclusions>
 </dependency>
 <dependency>
   <groupId>com.fasterxml.jackson.core</groupId>
   <artifactId>jackson-databind</artifactId>
   <version>${dep.jackson.version}</version>
+  <exclusions>
+    <exclusion>
+      <groupId>ch.qos.logback</groupId>
+      <artifactId>logback-classic</artifactId>
+    </exclusion>
+  </exclusions>
 </dependency>
 </dependencies>
 



[carbondata] branch master updated: [CARBONDATA-3303] Fix that MV datamap return wrong results when using coalesce and less groupby columns

2019-05-21 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new a2b7d20  [CARBONDATA-3303] Fix that MV datamap return wrong results 
when using coalesce and less groupby columns
a2b7d20 is described below

commit a2b7d20339a8ee28e1695e8eac9e1afa2c3a5b03
Author: qiuchenjian <807169...@qq.com>
AuthorDate: Tue Feb 26 14:50:26 2019 +0800

[CARBONDATA-3303] Fix that MV datamap return wrong results when using 
coalesce and less groupby columns

Problem
MV datamap returns wrong results when using coalesce and the query SQL's 
group-by columns are fewer than the MV SQL's:
create table coalesce_test_main(id int,name string,height int,weight int) 
using carbondata
insert into coalesce_test_main select 1,'tom',170,130
insert into coalesce_test_main select 2,'tom',170,120
insert into coalesce_test_main select 3,'lily',160,100
create datamap coalesce_test_main_mv using 'mv' as select 
coalesce(sum(id),0) as sum_id,name as myname,weight from coalesce_test_main 
group by name,weight
select coalesce(sum(id),0) as sumid,name from coalesce_test_main group by 
name
The query results:
1 tom
2 tom
3 lily

Solution
When the query SQL's group-by columns are fewer than the MV SQL's and the MV 
SQL has a coalesce expression, the MV table can't compute the right result, 
so the MV shouldn't take effect in this scene.
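
A short worked sketch of why the rewrite is unsafe here, using the inserted
rows above (plain Scala, just to show the arithmetic):

    // MV rows, grouped by (name, weight): sum_id is 1, 2 and 3
    val mvRows = Seq(("tom", 130, 1L), ("tom", 120, 2L), ("lily", 100, 3L))
    // The correct answer for "group by name" re-aggregates the MV rows:
    val expected = mvRows.groupBy(_._1).map { case (n, rs) => n -> rs.map(_._3).sum }
    // expected == Map("tom" -> 3, "lily" -> 3). The faulty rewrite returned the MV
    // rows as-is (1 tom / 2 tom / 3 lily): coalesce(sum(id),0) hides that a further
    // SUM over sum_id is required, hence the MV must be rejected for such queries.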

This closes #3135
---
 .../apache/carbondata/mv/datamap/MVHelper.scala| 14 +++-
 .../carbondata/mv/rewrite/MVCoalesceTestCase.scala | 91 ++
 .../carbondata/mv/rewrite/MVRewriteTestCase.scala  |  4 +-
 3 files changed, 106 insertions(+), 3 deletions(-)

diff --git 
a/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala
 
b/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala
index 6d0b2d3..810449c 100644
--- 
a/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala
+++ 
b/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala
@@ -25,7 +25,7 @@ import scala.collection.mutable.ArrayBuffer
 import org.apache.spark.sql.{CarbonEnv, CarbonToSparkAdapter, SparkSession}
 import org.apache.spark.sql.catalyst.TableIdentifier
 import org.apache.spark.sql.catalyst.catalog.CatalogTable
-import org.apache.spark.sql.catalyst.expressions.{Alias, Attribute, 
AttributeReference, Cast, Expression, NamedExpression, ScalaUDF, SortOrder}
+import org.apache.spark.sql.catalyst.expressions.{Alias, Attribute, 
AttributeReference, Cast, Coalesce, Expression, NamedExpression, ScalaUDF, 
SortOrder}
 import org.apache.spark.sql.catalyst.expressions.aggregate._
 import org.apache.spark.sql.catalyst.plans.logical.{Aggregate, Join, 
LogicalPlan, Project}
 import org.apache.spark.sql.execution.command.{Field, TableModel, 
TableNewProcessor}
@@ -184,6 +184,18 @@ object MVHelper {
 if (catalog.isMVWithSameQueryPresent(logicalPlan)) {
   throw new UnsupportedOperationException("MV with same query present")
 }
+
+var expressionValid = true
+modularPlan.transformExpressions {
+  case coal@Coalesce(_) if coal.children.exists(
+exp => exp.isInstanceOf[AggregateExpression]) =>
+expressionValid = false
+coal
+}
+
+if (!expressionValid) {
+  throw new UnsupportedOperationException("MV doesn't support Coalesce")
+}
   }
 
   def updateColumnName(attr: Attribute): String = {
diff --git 
a/datamap/mv/core/src/test/scala/org/apache/carbondata/mv/rewrite/MVCoalesceTestCase.scala
 
b/datamap/mv/core/src/test/scala/org/apache/carbondata/mv/rewrite/MVCoalesceTestCase.scala
new file mode 100644
index 000..f2a27c7
--- /dev/null
+++ 
b/datamap/mv/core/src/test/scala/org/apache/carbondata/mv/rewrite/MVCoalesceTestCase.scala
@@ -0,0 +1,91 @@
+/*
+* Licensed to the Apache Software Foundation (ASF) under one or more
+* contributor license agreements.  See the NOTICE file distributed with
+* this work for additional information regarding copyright ownership.
+* The ASF licenses this file to You under the Apache License, Version 
2.0
+* (the "License"); you may not use this file except in compliance with
+* the License.  You may obtain a copy of the License at
+*
+*http://www.apache.org/licenses/LICENSE-2.0
+*
+* Unless required by applicable law or agreed to in writing, software
+* distributed under the License is distributed on an "AS IS" BASIS,
+* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 
implied.
+* See the License for the specific language governing permissions and
+* limitations under the License.
+*/
+package org.apache.carbondata.m

[carbondata] branch master updated: [CARBONDATA-3309] MV datamap supports Spark 2.1

2019-05-20 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new 4d7c8ad  [CARBONDATA-3309] MV datamap supports Spark 2.1
4d7c8ad is described below

commit 4d7c8ada98ed15511d0abff349b64522f047344b
Author: qiuchenjian <807169...@qq.com>
AuthorDate: Sun Mar 17 20:04:48 2019 +0800

[CARBONDATA-3309] MV datamap supports Spark 2.1

[Problem]
MV datamap doesn't support the Spark 2.1 version, so we need to support it.

[Solution]
The following are the modification points; all MV test cases pass on the 
Spark 2.1 version.

Classes we can't access in the Spark 2.1 version:
(1). org.apache.spark.internal.Logging
(2). org.apache.spark.sql.internal.SQLConf
Solution: Create classes extending the above classes.

Classes that the Spark 2.1 version doesn't have:
(1). org.apache.spark.sql.catalyst.plans.logical.Subquery
(2). org.apache.spark.sql.catalyst.catalog.interface.HiveTableRelation
Solution: Use CatalogRelation instead and don't use it in 
LogicalPlanSignatureGenerator; move the Subquery code to the carbon project.

The method that we can't access in the Spark 2.1 version:
(1). sparkSession.sessionState.catalog.lookupRelation
Solution: Add this method to SparkSQLUtil.

Classes whose interfaces changed:
(1). org.apache.spark.sql.catalyst.expressions.SortOrder
(2). org.apache.spark.sql.catalyst.expressions.Cast
(3). org.apache.spark.sql.catalyst.plans.Statistics
Solution: Adapt to the new interfaces.

Methods and flags that the Spark 2.1 version doesn't have:
(1). normalizeExprId, canonicalized of 
org.apache.spark.sql.catalyst.plans.QueryPlan
(2). CASE_SENSITIVE of SQLConf
(3). STARSCHEMA_DETECTION of SQLConf
Solution: Don't use normalize, canonicalize, or the CASE_SENSITIVE and 
STARSCHEMA_DETECTION flags.

Some logical-plan optimization rules that the Spark 2.1 version doesn't have:
(1). SimplifyCreateMapOps
(2). SimplifyCreateArrayOps
(3). SimplifyCreateStructOps
(4). RemoveRedundantProject
(5). RemoveRedundantAliases
(6). PullupCorrelatedPredicates
(7). ReplaceDeduplicateWithAggregate
(8). EliminateView
Solution: Delete them or move the code to the carbon project.

Generate the instances in SparkSQLUtil to adapt to the Spark 2.1 version.

Query SQL passes the MV check in the Spark 2.1 version (CarbonSessionState).
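
A minimal sketch of the version-gating pattern this implies (illustrative; the
real adaptations live in SparkSQLUtil, CarbonReflectionUtils and the spark2.1
adapter sources):

    import org.apache.spark.SPARK_VERSION

    // Skip features backed by APIs that Spark 2.1 lacks (e.g. SQLConf.STARSCHEMA_DETECTION)
    // instead of referencing them directly, so the code still compiles and runs on 2.1.
    def starSchemaDetectionSupported: Boolean = !SPARK_VERSION.startsWith("2.1")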

This closes #3150
---
 .../carbondata/mv/datamap/MVDataMapProvider.scala  |   2 +-
 .../apache/carbondata/mv/datamap/MVHelper.scala|   2 +-
 .../apache/carbondata/mv/rewrite/MatchMaker.scala  |   2 +-
 .../mv/rewrite/SummaryDatasetCatalog.scala |   5 +-
 .../carbondata/mv/rewrite/TestSQLSuite.scala   |   4 +-
 .../carbondata/mv/rewrite/Tpcds_1_4_Suite.scala|   4 +-
 .../mv/expressions/modular/subquery.scala  |  13 ++-
 .../mv/plans/modular/AggregatePushDown.scala   |   8 +-
 .../carbondata/mv/plans/modular/Harmonizer.scala   |   2 +-
 .../carbondata/mv/plans/modular/ModularPlan.scala  |   8 +-
 .../mv/plans/modular/ModularRelation.scala |  22 +---
 .../carbondata/mv/plans/modular/Modularizer.scala  |   2 +-
 .../mv/plans/util/BirdcageOptimizer.scala  |  10 +-
 .../mv/plans/util/Logical2ModularExtractions.scala |  19 +--
 .../carbondata/mv/plans/util/SQLBuildDSL.scala |   5 +-
 .../carbondata/mv/plans/util/SQLBuilder.scala  |   9 --
 .../carbondata/mv/plans/util/Signature.scala   |   2 +-
 .../carbondata/mv/testutil/Tpcds_1_4_Tables.scala  |   4 +-
 .../carbondata/mv/plans/ModularToSQLSuite.scala|   4 +-
 .../carbondata/mv/plans/SignatureSuite.scala   |   4 +-
 .../spark/sql/catalyst/analysis/EmptyRule.scala|  26 +
 .../org/apache/spark/sql/util/SparkSQLUtil.scala   | 113 +-
 .../apache/spark/util/CarbonReflectionUtils.scala  |   7 ++
 .../src/main/scala/org/apache/spark/Logging.scala  |  22 
 .../main/scala/org/apache/spark/sql/SQLConf.scala  |  23 
 .../apache/spark/sql/CarbonToSparkAdapater.scala   |   8 +-
 .../sql/catalyst/catalog/HiveTableRelation.scala   |  56 +
 .../sql/catalyst/optimizer/MigrateOptimizer.scala  | 129 +
 .../sql/catalyst/plans/logical/Subquery.scala  |  28 +
 .../apache/spark/sql/hive/CarbonSessionState.scala |  19 ++-
 30 files changed, 481 insertions(+), 81 deletions(-)

diff --git 
a/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVDataMapProvider.scala
 
b/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVDataMapProvider.scala
index 7108bf8..5ffc46a 100644
--- 
a/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVDataMapProvider.scala
+++ 
b/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVDataMapProvider.scala
@@ -81,7 +81,7 @@ class MVDataMapProvider(
   val iden

[carbondata] branch master updated: [CARBONDATA-3295] Fix that MV datamap throw exception because its rewrite algorithm when query SQL has multiply subquery

2019-05-19 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new 789c97e  [CARBONDATA-3295] Fix that MV datamap throw exception because 
its rewrite algorithm when query SQL has multiply subquery
789c97e is described below

commit 789c97e4196edbe454bd83730b36cd21f72ce0cd
Author: qiuchenjian <807169...@qq.com>
AuthorDate: Sun Feb 17 18:52:09 2019 +0800

[CARBONDATA-3295] Fix that MV datamap throw exception because its rewrite 
algorithm when query SQL has multiply subquery

[Problem]
Error:

java.lang.UnsupportedOperationException was thrown.
java.lang.UnsupportedOperationException
at 
org.apache.carbondata.mv.plans.util.SQLBuildDSL.productArity(SQLBuildDSL.scala:36)
at scala.runtime.ScalaRunTime$$anon$1.<init>(ScalaRunTime.scala:174)
at scala.runtime.ScalaRunTime$.typedProductIterator(ScalaRunTime.scala:172)

create datamap data_table_mv using 'mv' as
 SELECT STARTTIME,LAYER4ID,
 COALESCE (SUM(seq),0) AS seq_c,
 COALESCE (SUM(succ),0)  AS succ_c
 FROM data_table
 GROUP BY STARTTIME,LAYER4ID

SELECT MT.`3600` AS `3600`,
 MT.`2250410101` AS `2250410101`,
 (CASE WHEN (SUM(COALESCE(seq_c, 0))) = 0 THEN NULL
   ELSE
   (CASE WHEN (CAST((SUM(COALESCE(seq_c, 0))) AS int)) = 0 THEN 0
 ELSE ((CAST((SUM(COALESCE(succ_c, 0))) AS double))
 / (CAST((SUM(COALESCE(seq_c, 0))) AS double)))
 END) * 100
   END) AS rate
 FROM (
   SELECT sum_result.*, H_REGION.`2250410101` FROM
   (SELECT cast(floor((starttime + 28800) / 3600) * 3600 - 28800 as int) AS 
`3600`,
 LAYER4ID,
 COALESCE(SUM(seq), 0) AS seq_c,
 COALESCE(SUM(succ), 0) AS succ_c
   FROM data_table
   WHERE STARTTIME >= 1549866600 AND STARTTIME < 1549899900
   GROUP BY cast(floor((STARTTIME + 28800) / 3600) * 3600 - 28800 as 
int),LAYER4ID
   )sum_result
   LEFT JOIN
   (SELECT l4id AS `2250410101`,
 l4name AS ,
 l4name AS NAME_2250410101
   FROM region
   GROUP BY l4id, l4name) H_REGION
   ON sum_result.LAYER4ID = H_REGION.`2250410101`
 WHERE H_REGION.NAME_2250410101 IS NOT NULL
 ) MT
 GROUP BY MT.`3600`, MT.`2250410101`
 ORDER BY  ASC LIMIT 5000

[Root Cause]
  // TODO Find a better way to set the rewritten flag, it may fail in 
some conditions.
  val mapping =
rewrittenPlan.collect { case m: ModularPlan => m } zip
updatedDataMapTablePlan.collect { case m: ModularPlan => m }
  mapping.foreach(f => if (f._1.rewritten) f._2.setRewritten())
This rewrite algorithm has a bug: the nodes are not sequential in some scenes.

[Solution]
Fix the MV rewrite algorithm: we can compare the Select and GroupBy objects 
between the original plan and the rewritten plan, but the ModularPlan tree has 
been changed, so we can't compare their children; this PR adds coarseEqual to 
do the comparison.
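
A minimal sketch of the coarse-equality idea (illustrative; the real coarseEqual
lives on the modular plan nodes in basicOperators.scala): compare a node's own
fields while deliberately ignoring its rewritten children.

    // Illustrative stand-in for a modular Select node: coarse equality matches a
    // node against its counterpart even after the subtree below it was rewritten.
    case class SelectNode(outputs: Seq[String], predicates: Seq[String], children: Seq[SelectNode]) {
      def coarseEqual(other: SelectNode): Boolean =
        outputs == other.outputs && predicates == other.predicates // children ignored on purpose
    }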

This closes #3129
---
 .../apache/carbondata/mv/datamap/MVHelper.scala|  6 --
 .../carbondata/mv/rewrite/MVRewriteTestCase.scala  | 96 ++
 .../mv/plans/modular/basicOperators.scala  | 12 +++
 3 files changed, 108 insertions(+), 6 deletions(-)

diff --git 
a/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala
 
b/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala
index 4c7fbc4..8baa924 100644
--- 
a/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala
+++ 
b/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/datamap/MVHelper.scala
@@ -596,12 +596,6 @@ object MVHelper {
 case g: GroupBy =>
   MVHelper.updateDataMap(g, rewrite)
   }
-  // TODO Find a better way to set the rewritten flag, it may fail in some 
conditions.
-  val mapping =
-rewrittenPlan.collect { case m: ModularPlan => m } zip
-updatedDataMapTablePlan.collect { case m: ModularPlan => m }
-  mapping.foreach(f => if (f._1.rewritten) f._2.setRewritten())
-
   updatedDataMapTablePlan
 } else {
   rewrittenPlan
diff --git 
a/datamap/mv/core/src/test/scala/org/apache/carbondata/mv/rewrite/MVRewriteTestCase.scala
 
b/datamap/mv/core/src/test/scala/org/apache/carbondata/mv/rewrite/MVRewriteTestCase.scala
new file mode 100644
index 000..3f5164f
--- /dev/null
+++ 
b/datamap/mv/core/src/test/scala/org/apache/carbondata/mv/rewrite/MVRewriteTestCase.scala
@@ -0,0 +1,96 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance w

[carbondata] branch master updated: [CARBONDATA-3294] Fix that MV datamap throw error when using count(1) and case when expression

2019-05-19 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new 85ec206  [CARBONDATA-3294] Fix that MV datamap throw error when using 
count(1) and case when expression
85ec206 is described below

commit 85ec206e670f769f6d7875c527941346924eff43
Author: qiuchenjian <807169...@qq.com>
AuthorDate: Sat Feb 16 21:45:05 2019 +0800

[CARBONDATA-3294] Fix that MV datamap throw error when using count(1) and 
case when expression

[Problem]
MV datamap throws an error when using count(1) and a case when expression; 
the error is:
mismatched input 'FROM' expecting {<EOF>, 'WHERE', 'GROUP', 'ORDER', 'HAVING', 
'LIMIT', 'LATERAL', 'WINDOW', 'UNION', 'EXCEPT', 'MINUS', 'INTERSECT', 'SORT', 
'CLUSTER', 'DISTRIBUTE'}(line 2, pos 0)
== SQL ==
SELECT MT.`3600`, MT.`2250410101`, countNum, rate
FROM
^^^

[Solution]
The compacted SQL has an extra 'case when' expression that causes this error, 
because the window operator has a bug when transforming the logical plan to a 
modular plan.

This closes #3128
---
 .../mv/rewrite/MVCountAndCaseTestCase.scala| 97 ++
 .../mv/plans/modular/ModularPatterns.scala | 11 ++-
 2 files changed, 104 insertions(+), 4 deletions(-)

diff --git 
a/datamap/mv/core/src/test/scala/org/apache/carbondata/mv/rewrite/MVCountAndCaseTestCase.scala
 
b/datamap/mv/core/src/test/scala/org/apache/carbondata/mv/rewrite/MVCountAndCaseTestCase.scala
new file mode 100644
index 000..567d6a9
--- /dev/null
+++ 
b/datamap/mv/core/src/test/scala/org/apache/carbondata/mv/rewrite/MVCountAndCaseTestCase.scala
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.mv.rewrite
+
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.execution.datasources.LogicalRelation
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.BeforeAndAfterAll
+
+class MVCountAndCaseTestCase extends QueryTest with BeforeAndAfterAll {
+
+
+  override def beforeAll(): Unit = {
+drop
+sql("create table region(l4id string,l4name string) using carbondata")
+sql(
+  s"""create table data_table(
+ |starttime int, seq long,succ long,LAYER4ID string,tmp int)
+ |using carbondata""".stripMargin)
+  }
+
+  def drop(): Unit ={
+sql("drop table if exists region")
+sql("drop table if exists data_table")
+  }
+
+  test("test mv count and case when expression") {
+sql("drop datamap if exists data_table_mv")
+sql(s"""create datamap data_table_mv using 'mv' as
+   | SELECT STARTTIME,LAYER4ID,
+   | SUM(seq) AS seq_c,
+   | SUM(succ) AS succ_c
+   | FROM data_table
+   | GROUP BY STARTTIME,LAYER4ID""".stripMargin)
+
+sql("rebuild datamap data_table_mv")
+
+var frame = sql(s"""SELECT  MT.`3600` AS `3600`,
+   | MT.`2250410101` AS `2250410101`,
+   | count(1) over() as countNum,
+   | (CASE WHEN (SUM(COALESCE(seq_c, 0))) = 0 THEN NULL
+   |   ELSE
+   |   (CASE WHEN (CAST((SUM(COALESCE(seq_c, 0))) AS int)) 
= 0 THEN 0
+   | ELSE ((CAST((SUM(COALESCE(succ_c, 0))) AS double))
+   | / (CAST((SUM(COALESCE(seq_c, 0))) AS double)))
+   | END) * 100
+   |   END) AS rate
+   | FROM (
+   |   SELECT sum_result.*, H_REGION.`2250410101` FROM
+   |   (SELECT cast(floor((starttime + 28800) / 3600) * 
3600 - 28800 as int) AS `3600`,
+   | LAYER4ID,
+   | COALESCE(SUM(seq), 0) AS seq_c,
+   | COALESCE(SUM(succ), 0) AS succ_c
+   |   FROM data_table
+   |   WHERE STA

[carbondata] branch master updated: [CARBONDATA-3291] Fix that MV datamap doesn't take effect 
when the same table join
2019-05-19 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new 0482983  [CARBONDATA-3291] Fix that MV datamap doesn't take effect 
when the same table join
0482983 is described below

commit 04829839452d7f56954219b56a6e515239effe61
Author: qiuchenjian <807169...@qq.com>
AuthorDate: Wed Feb 13 20:32:42 2019 +0800

[CARBONDATA-3291] Fix that MV datamap doesn't take effect when the same 
table is joined

[Problem]
MV datamap doesn't take effect when the same table is joined with itself; 
for the error scene, see the test case.
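
A minimal sketch of the failing shape (hypothetical table and datamap names),
mirroring what MVMultiJoinTestCase covers:

    // Before the fix, the matcher assumed a 1-1 correspondence between join children,
    // which broke when both children were the same relation; fineEquals now
    // disambiguates the two occurrences of the table.
    sql("create datamap mv_selfjoin using 'mv' as " +
      "select t1.name, t2.name as parent_name " +
      "from person t1 join person t2 on t1.parentid = t2.id")
    sql("select t1.name, t2.name as parent_name " +
      "from person t1 join person t2 on t1.parentid = t2.id").show() // now rewritten against mv_selfjoin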

This closes #3125
---
 .../carbondata/mv/rewrite/DefaultMatchMaker.scala  | 15 +++-
 .../apache/carbondata/mv/rewrite/Navigator.scala   | 51 +---
 .../mv/rewrite/MVMultiJoinTestCase.scala   | 94 ++
 .../mv/plans/modular/ModularRelation.scala | 15 
 4 files changed, 160 insertions(+), 15 deletions(-)

diff --git 
a/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/rewrite/DefaultMatchMaker.scala
 
b/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/rewrite/DefaultMatchMaker.scala
index cc5cc7b..59d72f8 100644
--- 
a/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/rewrite/DefaultMatchMaker.scala
+++ 
b/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/rewrite/DefaultMatchMaker.scala
@@ -162,8 +162,14 @@ object SelectSelectNoChildDelta extends 
DefaultMatchPattern with PredicateHelper
 // are 1-1 correspondence.
 // Change the following two conditions to more complicated ones if we 
want to
 // consider things that combine extrajoin, rejoin, and harmonized 
relations
-val isUniqueRmE = subsumer.children.filter { x => 
subsumee.children.count(_ == x) != 1 }
-val isUniqueEmR = subsumee.children.filter { x => 
subsumer.children.count(_ == x) != 1 }
+val isUniqueRmE = subsumer.children.filter { x => 
subsumee.children.count{
+  case relation: ModularRelation => relation.fineEquals(x)
+  case other => other == x
+} != 1 }
+val isUniqueEmR = subsumee.children.filter { x => 
subsumer.children.count{
+  case relation: ModularRelation => relation.fineEquals(x)
+  case other => other == x
+} != 1 }
 
 val extrajoin = sel_1a.children.filterNot { child => 
sel_1q.children.contains(child) }
 val rejoin = sel_1q.children.filterNot { child => 
sel_1a.children.contains(child) }
@@ -180,7 +186,10 @@ object SelectSelectNoChildDelta extends 
DefaultMatchPattern with PredicateHelper
 isPredicateEmdR && isOutputEdR) {
   val mappings = sel_1a.children.zipWithIndex.map {
 case (childr, fromIdx) if sel_1q.children.contains(childr) =>
-  val toIndx = sel_1q.children.indexWhere(_ == childr)
+  val toIndx = sel_1q.children.indexWhere{
+case relation: ModularRelation => relation.fineEquals(childr)
+case other => other == childr
+  }
   (toIndx -> fromIdx)
 
   }
diff --git 
a/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/rewrite/Navigator.scala
 
b/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/rewrite/Navigator.scala
index 76df4c2..905cd17 100644
--- 
a/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/rewrite/Navigator.scala
+++ 
b/datamap/mv/core/src/main/scala/org/apache/carbondata/mv/rewrite/Navigator.scala
@@ -17,11 +17,11 @@
 
 package org.apache.carbondata.mv.rewrite
 
-import org.apache.spark.sql.catalyst.expressions.{Attribute, AttributeMap, 
AttributeSet}
+import org.apache.spark.sql.catalyst.expressions.{Attribute, AttributeMap}
 
 import org.apache.carbondata.mv.expressions.modular._
-import org.apache.carbondata.mv.plans.modular.{GroupBy, ModularPlan, Select}
 import org.apache.carbondata.mv.plans.modular
+import org.apache.carbondata.mv.plans.modular._
 import org.apache.carbondata.mv.session.MVSession
 
 private[mv] class Navigator(catalog: SummaryDatasetCatalog, session: 
MVSession) {
@@ -146,21 +146,27 @@ private[mv] class Navigator(catalog: 
SummaryDatasetCatalog, session: MVSession)
 val rtables = subsumer.collect { case n: modular.LeafNode => n }
 val etables = subsumee.collect { case n: modular.LeafNode => n }
 val pairs = for {
-  rtable <- rtables
-  etable <- etables
-  if rtable == etable
-} yield (rtable, etable)
+  i <- rtables.indices
+  j <- etables.indices
+  if rtables(i) == etables(j) && reTablesJoinMatched(
+rtables(i), etables(j), subsumer, subsumee, i, j
+  )
+} yield (rtables(i), etables(j))
 
 pairs.foldLeft(subsumer) {
   case (curSubsumer, pair) =>
 val mappedOperator =
-  if (pa

[carbondata] branch master updated: [CARBONDATA-3367][CARBONDATA-3368] Fix multiple issues in SDK reader

2019-05-19 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new bd1d774  [CARBONDATA-3367][CARBONDATA-3368] Fix multiple issues in SDK 
reader
bd1d774 is described below

commit bd1d7745c1f62caedddbc519afaffd354e535b62
Author: ajantha-bhat 
AuthorDate: Wed Mar 6 16:44:52 2019 +0800

[CARBONDATA-3367][CARBONDATA-3368] Fix multiple issues in SDK reader

Problem:
[CARBONDATA-3367] OOM when a huge number of carbondata files are read from 
the SDK reader

Cause:
Currently, one CarbonRecordReader is created for each carbondata file, and the 
list of CarbonRecordReader instances is maintained in the CarbonReader. So even 
when a CarbonRecordReader is closed, that reader cannot be garbage collected, 
as the list still refers to the object; each CarbonRecordReader therefore needs 
separate memory instead of reusing the previous reader's memory.

Solution: Once CarbonRecordReader.close is done, remove it from the list.
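
A minimal sketch of that fix (simplified Scala model, not the actual 
CarbonReader code):

    // Drop a reader from the list before closing it, so the closed reader holds no
    // remaining reference and becomes eligible for garbage collection.
    class RecordReader(val file: String) { def close(): Unit = () }

    class MultiFileReader(files: Seq[String]) {
      private val readers = scala.collection.mutable.ListBuffer(files.map(new RecordReader(_)): _*)
      def closeCurrentReader(): Unit = {
        val reader = readers.remove(0) // remove the reference first
        reader.close()                 // then close; nothing keeps the reader alive
      }
    }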

Problem:
[CARBONDATA-3368] Infer schema from the data file instead of the index file

Cause:
In the SDK, when multiple readers were created on the same folder location 
with different file lists, all the readers referred to the same index file for 
schema inference, which caused a bottleneck and a JVM crash in the case of JNI 
calls.

Solution: Infer schema from the data file mentioned while building the reader.

Problem:
Support a list interface for projection: when the SDK is called from other 
languages, the JNI interface supports only lists, so a list interface needs to 
be added for projections.
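
A sketch of the added list-style projection (the List overload on
CarbonReaderBuilder is what this PR describes; treat the exact signature as an
assumption):

    import java.util.Arrays
    import org.apache.carbondata.sdk.file.CarbonReader

    // Hypothetical usage: a JNI binding passes a java.util.List instead of a
    // native String[] for the projection columns.
    val reader = CarbonReader.builder("/path/to/carbon/files", "_temp")
      .projection(Arrays.asList("name", "age")) // assumed List overload added here
      .build()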

This closes #3197
---
 .../core/metadata/schema/table/CarbonTable.java| 43 ++
 .../apache/carbondata/sdk/file/CarbonReader.java   |  7 +++-
 .../carbondata/sdk/file/CarbonReaderBuilder.java   | 15 
 3 files changed, 24 insertions(+), 41 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java
 
b/core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java
index c66d1fc..f9ba6f5 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java
@@ -37,8 +37,6 @@ import org.apache.carbondata.core.datamap.DataMapStoreManager;
 import org.apache.carbondata.core.datamap.TableDataMap;
 import org.apache.carbondata.core.datamap.dev.DataMapFactory;
 import org.apache.carbondata.core.datastore.block.SegmentProperties;
-import org.apache.carbondata.core.datastore.filesystem.CarbonFile;
-import org.apache.carbondata.core.datastore.impl.FileFactory;
 import org.apache.carbondata.core.features.TableOperation;
 import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
 import org.apache.carbondata.core.metadata.CarbonTableIdentifier;
@@ -252,12 +250,9 @@ public class CarbonTable implements Serializable {
   String tableName,
   Configuration configuration) throws IOException {
 TableInfo tableInfoInfer = CarbonUtil.buildDummyTableInfo(tablePath, 
"null", "null");
-CarbonFile carbonFile = 
getLatestIndexFile(FileFactory.getCarbonFile(tablePath, configuration));
-if (carbonFile == null) {
-  throw new RuntimeException("Carbon index file not exists.");
-}
-org.apache.carbondata.format.TableInfo tableInfo = CarbonUtil
-.inferSchemaFromIndexFile(carbonFile.getPath(), tableName);
+// InferSchema from data file
+org.apache.carbondata.format.TableInfo tableInfo =
+CarbonUtil.inferSchema(tablePath, tableName, false, configuration);
 List<ColumnSchema> columnSchemaList = new ArrayList<>();
 for (org.apache.carbondata.format.ColumnSchema thriftColumnSchema : 
tableInfo
 .getFact_table().getTable_columns()) {
@@ -271,38 +266,6 @@ public class CarbonTable implements Serializable {
 return CarbonTable.buildFromTableInfo(tableInfoInfer);
   }
 
-  private static CarbonFile getLatestIndexFile(CarbonFile tablePath) {
-CarbonFile[] carbonFiles = tablePath.listFiles();
-CarbonFile latestCarbonIndexFile = null;
-long latestIndexFileTimestamp = 0L;
-for (CarbonFile carbonFile : carbonFiles) {
-  if (carbonFile.getName().endsWith(CarbonTablePath.INDEX_FILE_EXT)
-  && carbonFile.getLastModifiedTime() > latestIndexFileTimestamp) {
-latestCarbonIndexFile = carbonFile;
-latestIndexFileTimestamp = carbonFile.getLastModifiedTime();
-  } else if (carbonFile.isDirectory()) {
-// if the list has directories that doesn't contain index files,
-// continue checking other files/directories in the list.
-if (getLatestIndexFile(carbonFile) == null) {
-  continue;
-} else {
- 

[carbondata] 02/02: [CARBONDATA-3365] Integrate apache arrow vector filling to carbon SDK

2019-05-19 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit f5cc9b748830c0251ee70a86aa62d8533762bb87
Author: ajantha-bhat 
AuthorDate: Tue Feb 26 20:03:41 2019 +0800

[CARBONDATA-3365] Integrate apache arrow vector filling to carbon SDK

By integrating carbon to support filling arrow vectors, the contents of
carbondata files can be used for analytics in any programming language: an
arrow vector filled from the carbon Java SDK can be read by Python, C, C++ and
many other languages supported by arrow.
This will also increase the scope for carbondata use-cases, and carbondata can
be used for various applications, as arrow is already integrated with many
query engines.
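
A sketch of how this integration is typically consumed (buildArrowReader and
readArrowBatch come from this PR's ArrowCarbonReader; treat the exact
signatures as assumptions):

    import org.apache.carbondata.sdk.file.{CarbonReader, Schema}

    // Hypothetical usage: read rows serialized as an arrow record batch, which any
    // arrow-capable language (Python, C, C++ ...) can then deserialize.
    val schema: Schema = ??? // schema of the carbondata files being read
    val arrowReader = CarbonReader.builder("/path/to/carbon/files", "_temp")
      .buildArrowReader()    // assumed builder method from this PR
    val batchBytes: Array[Byte] = arrowReader.readArrowBatch(schema) // serialized arrow batch
    arrowReader.close()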

This closes #3193
---
 .../carbondata/examples/CarbonSessionExample.scala | 180 ++---
 .../hadoop/api/CarbonFileInputFormat.java  |  20 +--
 .../carbondata/hadoop/api/CarbonInputFormat.java   |   3 -
 store/sdk/pom.xml  |  31 +++-
 .../carbondata/sdk/file/ArrowCarbonReader.java | 106 
 .../apache/carbondata/sdk/file/CarbonReader.java   |  10 --
 .../carbondata/sdk/file/CarbonReaderBuilder.java   |  67 +++-
 .../carbondata/sdk/file/CarbonSchemaReader.java|  16 ++
 .../carbondata/sdk/file/arrow/ArrowConverter.java  |  80 +++--
 .../sdk/file/arrow/ArrowFieldWriter.java   |  45 +-
 .../carbondata/sdk/file/arrow/ArrowUtils.java  |  29 ++--
 .../carbondata/sdk/file/arrow/ArrowWriter.java |   6 +
 .../file/arrow/ExtendedByteArrayOutputStream.java  |  39 +
 .../carbondata/sdk/file/CarbonReaderTest.java  | 135 
 14 files changed, 563 insertions(+), 204 deletions(-)

diff --git 
a/examples/spark2/src/main/scala/org/apache/carbondata/examples/CarbonSessionExample.scala
 
b/examples/spark2/src/main/scala/org/apache/carbondata/examples/CarbonSessionExample.scala
index 3aa761e..b6921f2 100644
--- 
a/examples/spark2/src/main/scala/org/apache/carbondata/examples/CarbonSessionExample.scala
+++ 
b/examples/spark2/src/main/scala/org/apache/carbondata/examples/CarbonSessionExample.scala
@@ -37,7 +37,7 @@ object CarbonSessionExample {
   s"$rootPath/examples/spark2/src/main/resources/log4j.properties")
 
 CarbonProperties.getInstance()
-  .addProperty(CarbonCommonConstants.ENABLE_QUERY_STATISTICS, "false")
+  .addProperty(CarbonCommonConstants.ENABLE_QUERY_STATISTICS, "true")
 val spark = ExampleUtils.createCarbonSession("CarbonSessionExample")
 spark.sparkContext.setLogLevel("INFO")
 exampleBody(spark)
@@ -49,96 +49,92 @@ object CarbonSessionExample {
 val rootPath = new File(this.getClass.getResource("/").getPath
 + "../../../..").getCanonicalPath
 
-//spark.sql("DROP TABLE IF EXISTS source")
-//
-//// Create table
-//spark.sql(
-//  s"""
-// | CREATE TABLE source(
-// | shortField SHORT,
-// | intField INT,
-// | bigintField LONG,
-// | doubleField DOUBLE,
-// | stringField STRING,
-// | timestampField TIMESTAMP,
-// | decimalField DECIMAL(18,2),
-// | dateField DATE,
-// | charField CHAR(5),
-// | floatField FLOAT
-// | )
-// | STORED AS carbondata
-//   """.stripMargin)
-//
-//val path = s"$rootPath/examples/spark2/src/main/resources/data.csv"
-//
-//// scalastyle:off
-//spark.sql(
-//  s"""
-// | LOAD DATA LOCAL INPATH '$path'
-// | INTO TABLE source
-// | OPTIONS('HEADER'='true', 'COMPLEX_DELIMITER_LEVEL_1'='#')
-//   """.stripMargin)
-//// scalastyle:on
-//
-//spark.sql(
-//  s"""
-// | SELECT charField, stringField, intField
-// | FROM source
-// | WHERE stringfield = 'spark' AND decimalField > 40
-//  """.stripMargin).show()
-//
-//spark.sql(
-//  s"""
-// | SELECT *
-// | FROM source WHERE length(stringField) = 5
-//   """.stripMargin).show()
-//
-//spark.sql(
-//  s"""
-// | SELECT *
-// | FROM source WHERE date_format(dateField, "-MM-dd") = 
"2015-07-23"
-//   """.stripMargin).show()
-//
-//spark.sql("SELECT count(stringField) FROM source").show()
-//
-//spark.sql(
-//  s"""
-// | SELECT sum(intField), stringField
-// | FROM source
-// | GROUP BY stringField
-//   """.stripMargin).show()
-//
-//spark.sql(
-//  s"""
-// | SELECT t1.*, t2.*
-// | FROM source t1, sou

[carbondata] branch master updated (894216e -> f5cc9b7)

2019-05-19 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git.


from 894216e  [CARBONDATA-3386] Concurrent Merge index and query is failing
 new c85a11f  [CARBONDATA-3365] Integrate apache arrow vector filling to 
carbon SDK
 new f5cc9b7  [CARBONDATA-3365] Integrate apache arrow vector filling to 
carbon SDK

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 store/sdk/pom.xml  |  71 
 .../carbondata/sdk/file/ArrowCarbonReader.java | 106 ++
 .../carbondata/sdk/file/CarbonReaderBuilder.java   |  20 +-
 .../carbondata/sdk/file/CarbonSchemaReader.java|  16 +
 .../carbondata/sdk/file/arrow/ArrowConverter.java  | 135 
 .../sdk/file/arrow/ArrowFieldWriter.java   | 367 +
 .../carbondata/sdk/file/arrow/ArrowUtils.java  | 112 +++
 .../carbondata/sdk/file/arrow/ArrowWriter.java | 144 
 .../file/arrow/ExtendedByteArrayOutputStream.java  |  36 +-
 .../carbondata/sdk/file/CarbonReaderTest.java  | 135 
 10 files changed, 1119 insertions(+), 23 deletions(-)
 create mode 100644 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/ArrowCarbonReader.java
 create mode 100644 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/arrow/ArrowConverter.java
 create mode 100644 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/arrow/ArrowFieldWriter.java
 create mode 100644 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/arrow/ArrowUtils.java
 create mode 100644 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/arrow/ArrowWriter.java
 copy 
core/src/main/java/org/apache/carbondata/core/util/ReUsableByteArrayDataOutputStream.java
 => 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/arrow/ExtendedByteArrayOutputStream.java
 (54%)



[carbondata] 01/02: [CARBONDATA-3365] Integrate apache arrow vector filling to carbon SDK

2019-05-19 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git

commit c85a11f0f900180dcf36976809c3d244fb24c161
Author: kumarvishal09 
AuthorDate: Wed Feb 6 18:10:43 2019 +0530

[CARBONDATA-3365] Integrate apache arrow vector filling to carbon SDK

By integrating carbon to support filling arrow vectors, the contents of
carbondata files can be used for analytics in any programming language: an
arrow vector filled from the carbon Java SDK can be read by Python, C, C++ and
many other languages supported by arrow.
This will also increase the scope for carbondata use-cases, and carbondata can
be used for various applications, as arrow is already integrated with many
query engines.

This closes #3193
---
 .../carbondata/examples/CarbonSessionExample.scala | 180 +--
 .../hadoop/api/CarbonFileInputFormat.java  |  20 +-
 .../carbondata/hadoop/api/CarbonInputFormat.java   |   3 +
 store/sdk/pom.xml  |  50 
 .../apache/carbondata/sdk/file/CarbonReader.java   |  10 +
 .../carbondata/sdk/file/CarbonReaderBuilder.java   |  49 +++
 .../carbondata/sdk/file/arrow/ArrowConverter.java  |  73 +
 .../sdk/file/arrow/ArrowFieldWriter.java   | 328 +
 .../carbondata/sdk/file/arrow/ArrowUtils.java  | 111 +++
 .../carbondata/sdk/file/arrow/ArrowWriter.java | 138 +
 10 files changed, 873 insertions(+), 89 deletions(-)

diff --git 
a/examples/spark2/src/main/scala/org/apache/carbondata/examples/CarbonSessionExample.scala
 
b/examples/spark2/src/main/scala/org/apache/carbondata/examples/CarbonSessionExample.scala
index b6921f2..3aa761e 100644
--- 
a/examples/spark2/src/main/scala/org/apache/carbondata/examples/CarbonSessionExample.scala
+++ 
b/examples/spark2/src/main/scala/org/apache/carbondata/examples/CarbonSessionExample.scala
@@ -37,7 +37,7 @@ object CarbonSessionExample {
   s"$rootPath/examples/spark2/src/main/resources/log4j.properties")
 
 CarbonProperties.getInstance()
-  .addProperty(CarbonCommonConstants.ENABLE_QUERY_STATISTICS, "true")
+  .addProperty(CarbonCommonConstants.ENABLE_QUERY_STATISTICS, "false")
 val spark = ExampleUtils.createCarbonSession("CarbonSessionExample")
 spark.sparkContext.setLogLevel("INFO")
 exampleBody(spark)
@@ -49,92 +49,96 @@ object CarbonSessionExample {
 val rootPath = new File(this.getClass.getResource("/").getPath
 + "../../../..").getCanonicalPath
 
-spark.sql("DROP TABLE IF EXISTS source")
-
-// Create table
-spark.sql(
-  s"""
- | CREATE TABLE source(
- | shortField SHORT,
- | intField INT,
- | bigintField LONG,
- | doubleField DOUBLE,
- | stringField STRING,
- | timestampField TIMESTAMP,
- | decimalField DECIMAL(18,2),
- | dateField DATE,
- | charField CHAR(5),
- | floatField FLOAT
- | )
- | STORED AS carbondata
-   """.stripMargin)
-
-val path = s"$rootPath/examples/spark2/src/main/resources/data.csv"
-
-// scalastyle:off
-spark.sql(
-  s"""
- | LOAD DATA LOCAL INPATH '$path'
- | INTO TABLE source
- | OPTIONS('HEADER'='true', 'COMPLEX_DELIMITER_LEVEL_1'='#')
-   """.stripMargin)
-// scalastyle:on
-
-spark.sql(
-  s"""
- | SELECT charField, stringField, intField
- | FROM source
- | WHERE stringfield = 'spark' AND decimalField > 40
-  """.stripMargin).show()
-
-spark.sql(
-  s"""
- | SELECT *
- | FROM source WHERE length(stringField) = 5
-   """.stripMargin).show()
-
-spark.sql(
-  s"""
- | SELECT *
- | FROM source WHERE date_format(dateField, "yyyy-MM-dd") = "2015-07-23"
-   """.stripMargin).show()
-
-spark.sql("SELECT count(stringField) FROM source").show()
-
-spark.sql(
-  s"""
- | SELECT sum(intField), stringField
- | FROM source
- | GROUP BY stringField
-   """.stripMargin).show()
-
-spark.sql(
-  s"""
- | SELECT t1.*, t2.*
- | FROM source t1, source t2
- | WHERE t1.stringField = t2.stringField
-  """.stripMargin).show()
-
-spark.sql(
-  s"""
- | WITH t1 AS (
- | SELECT * FROM source
- | UNION ALL
- | SELECT * FROM source
- | )
- | SELECT t1.*, t2.*
- | FROM t1, source t2
- | WHERE t1.stringField = t2.stringField

svn commit: r34085 - in /dev/carbondata/1.5.4-rc1: ./ apache-carbondata-1.5.4-source-release.zip apache-carbondata-1.5.4-source-release.zip.asc apache-carbondata-1.5.4-source-release.zip.sha512

2019-05-17 Thread ravipesala
Author: ravipesala
Date: Fri May 17 13:40:24 2019
New Revision: 34085

Log:
Upload 1.5.4-rc1

Added:
dev/carbondata/1.5.4-rc1/
dev/carbondata/1.5.4-rc1/apache-carbondata-1.5.4-source-release.zip   (with 
props)
dev/carbondata/1.5.4-rc1/apache-carbondata-1.5.4-source-release.zip.asc   
(with props)
dev/carbondata/1.5.4-rc1/apache-carbondata-1.5.4-source-release.zip.sha512

Added: dev/carbondata/1.5.4-rc1/apache-carbondata-1.5.4-source-release.zip
==============================================================================
Binary file - no diff available.

Propchange: dev/carbondata/1.5.4-rc1/apache-carbondata-1.5.4-source-release.zip
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: dev/carbondata/1.5.4-rc1/apache-carbondata-1.5.4-source-release.zip.asc
==============================================================================
Binary file - no diff available.

Propchange: dev/carbondata/1.5.4-rc1/apache-carbondata-1.5.4-source-release.zip.asc
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: dev/carbondata/1.5.4-rc1/apache-carbondata-1.5.4-source-release.zip.sha512
==============================================================================
--- dev/carbondata/1.5.4-rc1/apache-carbondata-1.5.4-source-release.zip.sha512 (added)
+++ dev/carbondata/1.5.4-rc1/apache-carbondata-1.5.4-source-release.zip.sha512 Fri May 17 13:40:24 2019
@@ -0,0 +1 @@
+505d02818bae28b2cad475b49960c4c28fcf8cbbe171b0cc79139005db0de167395ed92e3db011cfdf6598a5a5f13a81cc085501c762d32bbe35a430e7b9fee8  apache-carbondata-1.5.4-source-release.zip
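
To verify a downloaded artifact against this hash, one option is a small
Java program (Java, to match the codebase; the file name is the artifact
uploaded above) that recomputes the SHA-512 for comparison with the
published .sha512 content:

import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;

public class VerifySha512 {
  public static void main(String[] args) throws Exception {
    // Hash the release zip exactly as a sha512 tool would.
    byte[] zip = Files.readAllBytes(Paths.get("apache-carbondata-1.5.4-source-release.zip"));
    byte[] digest = MessageDigest.getInstance("SHA-512").digest(zip);
    StringBuilder hex = new StringBuilder();
    for (byte b : digest) {
      hex.append(String.format("%02x", b));  // lower-case hex, two chars per byte
    }
    // Compare this output with the hash in the published .sha512 file.
    System.out.println(hex);
  }
}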




[carbondata] branch branch-1.5 updated: [maven-release-plugin] prepare for next development iteration

2019-05-17 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a commit to branch branch-1.5
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/branch-1.5 by this push:
 new 31a2d14  [maven-release-plugin] prepare for next development iteration
31a2d14 is described below

commit 31a2d1432b23e8350f319d0ac88bfffada3a74d4
Author: ravipesala 
AuthorDate: Fri May 17 14:27:36 2019 +0530

[maven-release-plugin] prepare for next development iteration
---
 assembly/pom.xml  | 2 +-
 common/pom.xml| 2 +-
 core/pom.xml  | 2 +-
 datamap/bloom/pom.xml | 2 +-
 datamap/examples/pom.xml  | 2 +-
 datamap/lucene/pom.xml| 2 +-
 datamap/mv/core/pom.xml   | 2 +-
 datamap/mv/plan/pom.xml   | 2 +-
 examples/spark2/pom.xml   | 2 +-
 format/pom.xml| 2 +-
 hadoop/pom.xml| 2 +-
 integration/hive/pom.xml  | 2 +-
 integration/presto/pom.xml| 2 +-
 integration/spark-common-test/pom.xml | 2 +-
 integration/spark-common/pom.xml  | 2 +-
 integration/spark-datasource/pom.xml  | 2 +-
 integration/spark2/pom.xml| 2 +-
 pom.xml   | 4 ++--
 processing/pom.xml| 2 +-
 store/sdk/pom.xml | 2 +-
 streaming/pom.xml | 2 +-
 tools/cli/pom.xml | 2 +-
 22 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/assembly/pom.xml b/assembly/pom.xml
index 7206b0d..f1586cb 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4</version>
+    <version>1.5.5-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
diff --git a/common/pom.xml b/common/pom.xml
index 5fa7df8..4844768 100644
--- a/common/pom.xml
+++ b/common/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4</version>
+    <version>1.5.5-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
diff --git a/core/pom.xml b/core/pom.xml
index 56cfaf5..ec55faf 100644
--- a/core/pom.xml
+++ b/core/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4</version>
+    <version>1.5.5-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
diff --git a/datamap/bloom/pom.xml b/datamap/bloom/pom.xml
index fdc2f62..ab5e29c 100644
--- a/datamap/bloom/pom.xml
+++ b/datamap/bloom/pom.xml
@@ -4,7 +4,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4</version>
+    <version>1.5.5-SNAPSHOT</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/datamap/examples/pom.xml b/datamap/examples/pom.xml
index 08693f0..0c9d804 100644
--- a/datamap/examples/pom.xml
+++ b/datamap/examples/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4</version>
+    <version>1.5.5-SNAPSHOT</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/datamap/lucene/pom.xml b/datamap/lucene/pom.xml
index dfd09f6..ee06416 100644
--- a/datamap/lucene/pom.xml
+++ b/datamap/lucene/pom.xml
@@ -4,7 +4,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4</version>
+    <version>1.5.5-SNAPSHOT</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/datamap/mv/core/pom.xml b/datamap/mv/core/pom.xml
index 9ee517c..b92dc0e 100644
--- a/datamap/mv/core/pom.xml
+++ b/datamap/mv/core/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4</version>
+    <version>1.5.5-SNAPSHOT</version>
     <relativePath>../../../pom.xml</relativePath>
   </parent>
 
diff --git a/datamap/mv/plan/pom.xml b/datamap/mv/plan/pom.xml
index 4ee274e..3d18384 100644
--- a/datamap/mv/plan/pom.xml
+++ b/datamap/mv/plan/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4</version>
+    <version>1.5.5-SNAPSHOT</version>
     <relativePath>../../../pom.xml</relativePath>
   </parent>
 
diff --git a/examples/spark2/pom.xml b/examples/spark2/pom.xml
index 1bc9247..88a99f6 100644
--- a/examples/spark2/pom.xml
+++ b/examples/spark2/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4</version>
+    <version>1.5.5-SNAPSHOT</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/format/pom.xml b/format/pom.xml
index 3b4bcee..45875d7 100644
--- a/format/pom.xml
+++ b/format/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4</version>
+    <version>1.5.5-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
diff --git a/hadoop/pom.xml b/hadoop/pom.xml
index 6780f07..9bfc789 100644
--- a/hadoop/pom.xml
+++ b/hadoop/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4</version>
+    <version>1.5.5-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
diff --git a/integration/hive/pom.xml b/integration/hive/pom.xml
index 649eec4..a990b44 100644
--- a/integration/hive/pom.xml
+++ b/integration/hive/pom.xml
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.carbondata</groupId>
     <artifactId>carbondata-parent</artifactId>
-    <version>1.5.4</version>
+    <version>1.5.5-SNAPSHOT</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 
diff --git a/integration/presto/pom.xml b/integration/presto/pom.xml
index a4a9aba..4dacce1 100644
--- a/integration/presto/pom.xml
+++ b/integration/presto/pom.xml
@@ -22,7 +22,7 @@

[carbondata] annotated tag apache-carbondata-1.5.4-rc1 created (now 5a55d9b)

2019-05-17 Thread ravipesala
This is an automated email from the ASF dual-hosted git repository.

ravipesala pushed a change to annotated tag apache-carbondata-1.5.4-rc1
in repository https://gitbox.apache.org/repos/asf/carbondata.git.


  at 5a55d9b  (tag)
 tagging 1f2e184b81bef4e861b4dd32be94dc50bada6b68 (commit)
 replaces apache-carbondata-1.5.3-rc1
  by ravipesala
  on Fri May 17 14:27:20 2019 +0530

- Log -
[maven-release-plugin] copy for tag apache-carbondata-1.5.4-rc1
---

No new revisions were added by this update.


