[carbondata] branch master updated: [DOC] Fix the spell mistake of enable.unsafe.in.query.processing

2019-03-24 Thread xuchuanyin
This is an automated email from the ASF dual-hosted git repository.

xuchuanyin pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new 9854f20  [DOC] Fix the spell mistake of 
enable.unsafe.in.query.processing
9854f20 is described below

commit 9854f2033b4bbac0698f8cadd6c5872ab0423c2d
Author: qiuchenjian <807169...@qq.com>
AuthorDate: Sat Mar 23 09:28:32 2019 +0800

[DOC] Fix the spell mistake of enable.unsafe.in.query.processing

Fix the spell mistake of enable.unsafe.in.query.processing

This closes #3160
---
 docs/usecases.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/usecases.md b/docs/usecases.md
index 0cfcf85..8ff4975 100644
--- a/docs/usecases.md
+++ b/docs/usecases.md
@@ -148,8 +148,8 @@ Use all columns are no-dictionary as the cardinality is high.
 | Compaction | carbon.number.of.cores.while.compacting | 12 | Higher number of cores can improve the compaction speed.Data size is huge.Compaction need to use more threads to speed up the process |
 | Compaction | carbon.enable.auto.load.merge | FALSE | Doing auto minor compaction is costly process as data size is huge.Perform manual compaction when the cluster is less loaded |
 | Query | carbon.enable.vector.reader | true | To fetch results faster, supporting spark vector processing will speed up the query |
-| Query | enable.unsafe.in.query.procressing | true | Data that needs to be scanned in huge which in turn generates more short lived Java objects. This cause pressure of GC.using unsafe and offheap will reduce the GC overhead |
-| Query | use.offheap.in.query.processing | true | Data that needs to be scanned in huge which in turn generates more short lived Java objects. This cause pressure of GC.using unsafe and offheap will reduce the GC overhead.offheap can be accessed through java unsafe.hence enable.unsafe.in.query.procressing needs to be true |
+| Query | enable.unsafe.in.query.processing | true | Data that needs to be scanned in huge which in turn generates more short lived Java objects. This cause pressure of GC.using unsafe and offheap will reduce the GC overhead |
+| Query | use.offheap.in.query.processing | true | Data that needs to be scanned in huge which in turn generates more short lived Java objects. This cause pressure of GC.using unsafe and offheap will reduce the GC overhead.offheap can be accessed through java unsafe.hence enable.unsafe.in.query.processing needs to be true |
 | Query | enable.unsafe.columnpage | TRUE | Keep the column pages in offheap memory so that the memory overhead due to java object is less and also reduces GC pressure. |
 | Query | carbon.unsafe.working.memory.in.mb | 10240 | Amount of memory to use for offheap operations, you can increase this memory based on the data size |
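
For reference, the four query-side settings in this table can be applied together in carbon.properties; a minimal sketch, using only the property names and values listed in the table above:

    # query-side unsafe/off-heap settings (names and values from the usecases.md table)
    enable.unsafe.in.query.processing=true
    use.offheap.in.query.processing=true
    enable.unsafe.columnpage=true
    carbon.unsafe.working.memory.in.mb=10240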
 



[carbondata] branch master updated: [DOC] Update the doc of "Show DataMap"

2019-03-12 Thread xuchuanyin
This is an automated email from the ASF dual-hosted git repository.

xuchuanyin pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new 1825861  [DOC] Update the doc of "Show DataMap"
1825861 is described below

commit 182586164283c6f5dc32a1c9b488fbc6b25d44d7
Author: qiuchenjian <807169...@qq.com>
AuthorDate: Wed Jan 30 09:23:21 2019 +0800

[DOC] Update the doc of "Show DataMap"

update the doc of showing datamap

This closes #3117
---
 docs/datamap/datamap-management.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/datamap/datamap-management.md 
b/docs/datamap/datamap-management.md
index 0dc4718..087c70a 100644
--- a/docs/datamap/datamap-management.md
+++ b/docs/datamap/datamap-management.md
@@ -141,6 +141,7 @@ There is a SHOW DATAMAPS command, when this is issued, system will read all data
 - DataMapName
 - DataMapProviderName like mv, preaggregate, timeseries, etc
 - Associated Table
+- DataMap Properties
 
 ### Compaction on DataMap
 



[carbondata] branch master updated: [CARBONDATA-3281] Add validation for the size of the LRU cache

2019-03-07 Thread xuchuanyin
This is an automated email from the ASF dual-hosted git repository.

xuchuanyin pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new e443a94  [CARBONDATA-3281] Add validation for the size of the LRU cache
e443a94 is described below

commit e443a943c63b0981603179d75b07d4006a925109
Author: litt 
AuthorDate: Tue Jan 29 16:34:40 2019 +0800

[CARBONDATA-3281] Add validation for the size of the LRU cache

If the configured LRU size is bigger than the JVM Xmx size, use 
CARBON_MAX_LRU_CACHE_SIZE_DEFAULT instead. If the LRU size is set beyond the 
Xmx size and we query a big table with too many carbon files, it may cause 
"Error: java.io.IOException: Problem in loading segment blocks: GC overhead
limit exceeded (state=,code=0)" and the JDBC server will restart.

This closes #3118
---
 .../carbondata/core/cache/CarbonLRUCache.java  | 32 ++
 .../core/constants/CarbonCommonConstants.java  |  5 
 .../carbondata/core/cache/CarbonLRUCacheTest.java  |  7 +
 3 files changed, 44 insertions(+)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/cache/CarbonLRUCache.java 
b/core/src/main/java/org/apache/carbondata/core/cache/CarbonLRUCache.java
index 0c75173..3371d0d 100644
--- a/core/src/main/java/org/apache/carbondata/core/cache/CarbonLRUCache.java
+++ b/core/src/main/java/org/apache/carbondata/core/cache/CarbonLRUCache.java
@@ -70,6 +70,16 @@ public final class CarbonLRUCache {
           + CarbonCommonConstants.CARBON_MAX_LRU_CACHE_SIZE_DEFAULT);
       lruCacheMemorySize = Long.parseLong(defaultPropertyName);
     }
+
+    // if lru cache is bigger than jvm max heap then set part size of max heap (60% default)
+    if (isBeyondMaxMemory()) {
+      double changeSize = getPartOfXmx();
+      LOGGER.warn("Configured LRU size " + lruCacheMemorySize
+          + "MB exceeds the max size of JVM heap. Carbon will fallback to use "
+          + changeSize + " MB instead");
+      lruCacheMemorySize = (long) changeSize;
+    }
+
     initCache();
     if (lruCacheMemorySize > 0) {
       LOGGER.info("Configured LRU cache size is " + lruCacheMemorySize + " MB");
@@ -326,4 +336,26 @@ public final class CarbonLRUCache {
   public Map getCacheMap() {
     return lruCacheMap;
   }
+
+  /**
+   * Check whether the configured LRU cache size is bigger than the JVM max memory.
+   * If it is, querying a table with many segments may crash the JDBC server.
+   * @return true if the LRU cache is bigger than the JVM max memory, false otherwise
+   */
+  private boolean isBeyondMaxMemory() {
+    long mSize = Runtime.getRuntime().maxMemory();
+    long lruSize = lruCacheMemorySize * BYTE_CONVERSION_CONSTANT;
+    return lruSize >= mSize;
+  }
+
+  /**
+   * When the configured LRU cache is bigger than the JVM max heap, fall back to a
+   * fraction of the heap given by CARBON_LRU_CACHE_PERCENT_OVER_MAX_SIZE (60% by default).
+   * @return the fallback LRU cache size in MB
+   */
+  private double getPartOfXmx() {
+    long mSizeMB = Runtime.getRuntime().maxMemory() / BYTE_CONVERSION_CONSTANT;
+    return mSizeMB * CarbonCommonConstants.CARBON_LRU_CACHE_PERCENT_OVER_MAX_SIZE;
+  }
 }
diff --git 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
index f5c07a4..69374ad 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
@@ -1257,6 +1257,11 @@ public final class CarbonCommonConstants {
   public static final String CARBON_MAX_LRU_CACHE_SIZE_DEFAULT = "-1";
 
   /**
+   * when the LRU cache size is beyond the JVM max memory size, fall back to 60% of the max size
+   */
+  public static final double CARBON_LRU_CACHE_PERCENT_OVER_MAX_SIZE = 0.6d;
+
+  /**
    * property to enable min max during filter query
    */
   @CarbonProperty
diff --git 
a/core/src/test/java/org/apache/carbondata/core/cache/CarbonLRUCacheTest.java 
b/core/src/test/java/org/apache/carbondata/core/cache/CarbonLRUCacheTest.java
index 0493655..8ef6684 100644
--- 
a/core/src/test/java/org/apache/carbondata/core/cache/CarbonLRUCacheTest.java
+++ 
b/core/src/test/java/org/apache/carbondata/core/cache/CarbonLRUCacheTest.java
@@ -60,6 +60,13 @@ public class CarbonLRUCacheTest {
 assertNull(carbonLRUCache.get("Column2"));
   }
 
+  @Test public void testBiggerThanMaxSizeConfiguration() {
+    CarbonLRUCache carbonLRUCacheForConfig =
+        new CarbonLRUCache("prop2", "20"); // 200GB
+    assertTrue(carbonLRUCacheForConfig.put("Column1", cacheable, 10L));
+    assertFalse(carbonLRUCacheForConfig.put("Column2", cacheable, 107374182400L));
+  }
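
To make the fallback concrete, here is a standalone sketch of the rule the patch implements (the 0.6 factor and the MB conversion come from the diff above; the configured value is illustrative):

    long xmxBytes = Runtime.getRuntime().maxMemory();      // e.g. -Xmx4g
    long configuredLruMb = 20480L;                         // illustrative 20 GB setting
    long effectiveLruMb = configuredLruMb;
    if (configuredLruMb * 1024L * 1024L >= xmxBytes) {     // isBeyondMaxMemory()
      // getPartOfXmx(): fall back to 60% of the max heap, expressed in MB
      effectiveLruMb = (long) ((xmxBytes / (1024L * 1024L)) * 0.6d);
    }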

[carbondata] branch master updated: [CARBONDATA-3305] Support show metacache command to list the cache sizes for all tables

2019-03-05 Thread xuchuanyin
This is an automated email from the ASF dual-hosted git repository.

xuchuanyin pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new 718be37  [CARBONDATA-3305] Support show metacache command to list the 
cache sizes for all tables
718be37 is described below

commit 718be37295a55de3317191118bf74720e4de800f
Author: QiangCai 
AuthorDate: Thu Jan 17 15:39:33 2019 +0800

[CARBONDATA-3305] Support show metacache command to list the cache sizes 
for all tables

>>> SHOW METACACHE
+--------+--------+----------+------------+---------------+
|Database|Table   |Index size|Datamap size|Dictionary size|
+--------+--------+----------+------------+---------------+
|ALL     |ALL     |842 Bytes |982 Bytes   |80.34 KB       |
|default |ALL     |842 Bytes |982 Bytes   |80.34 KB       |
|default |t1      |225 Bytes |982 Bytes   |0              |
|default |t1_dpagg|259 Bytes |0           |0              |
|default |t2      |358 Bytes |0           |80.34 KB       |
+--------+--------+----------+------------+---------------+

>>> SHOW METACACHE FOR TABLE t1
+----------+---------+----------------------+
|Field     |Size     |Comment               |
+----------+---------+----------------------+
|Index     |225 Bytes|1/1 index files cached|
|Dictionary|0        |                      |
|dpagg     |259 Bytes|preaggregate          |
|dblom     |982 Bytes|bloomfilter           |
+----------+---------+----------------------+

>>> SHOW METACACHE FOR TABLE t2
+----------+---------+----------------------+
|Field     |Size     |Comment               |
+----------+---------+----------------------+
|Index     |358 Bytes|2/2 index files cached|
|Dictionary|80.34 KB |                      |
+----------+---------+----------------------+

This closes #3078
---
 .../carbondata/core/cache/CacheProvider.java   |   4 +
 .../carbondata/core/cache/CarbonLRUCache.java  |   4 +
 docs/ddl-of-carbondata.md  |  21 ++
 .../sql/commands/TestCarbonShowCacheCommand.scala  | 163 +++
 .../spark/sql/catalyst/CarbonDDLSqlParser.scala|   1 +
 .../command/cache/CarbonShowCacheCommand.scala | 312 +
 .../spark/sql/parser/CarbonSpark2SqlParser.scala   |  12 +-
 7 files changed, 515 insertions(+), 2 deletions(-)

diff --git 
a/core/src/main/java/org/apache/carbondata/core/cache/CacheProvider.java 
b/core/src/main/java/org/apache/carbondata/core/cache/CacheProvider.java
index 99b1693..deb48e2 100644
--- a/core/src/main/java/org/apache/carbondata/core/cache/CacheProvider.java
+++ b/core/src/main/java/org/apache/carbondata/core/cache/CacheProvider.java
@@ -195,4 +195,8 @@ public class CacheProvider {
 }
 cacheTypeToCacheMap.clear();
   }
+
+  public CarbonLRUCache getCarbonCache() {
+return carbonLRUCache;
+  }
 }
diff --git 
a/core/src/main/java/org/apache/carbondata/core/cache/CarbonLRUCache.java 
b/core/src/main/java/org/apache/carbondata/core/cache/CarbonLRUCache.java
index 87254e3..74ff8a0 100644
--- a/core/src/main/java/org/apache/carbondata/core/cache/CarbonLRUCache.java
+++ b/core/src/main/java/org/apache/carbondata/core/cache/CarbonLRUCache.java
@@ -305,4 +305,8 @@ public final class CarbonLRUCache {
   lruCacheMap.clear();
 }
   }
+
+  public Map getCacheMap() {
+return lruCacheMap;
+  }
 }
diff --git a/docs/ddl-of-carbondata.md b/docs/ddl-of-carbondata.md
index 0d0e5bd..3476475 100644
--- a/docs/ddl-of-carbondata.md
+++ b/docs/ddl-of-carbondata.md
@@ -67,6 +67,7 @@ CarbonData DDL statements are documented here,which includes:
   * [SPLIT PARTITION](#split-a-partition)
   * [DROP PARTITION](#drop-a-partition)
 * [BUCKETING](#bucketing)
+* [CACHE](#cache)
 
 ## CREATE TABLE
 
@@ -1088,4 +1089,24 @@ Users can specify which columns to include and exclude 
for local dictionary gene
   TBLPROPERTIES ('BUCKETNUMBER'='4', 'BUCKETCOLUMNS'='productName')
   ```
 
+## CACHE
 
+  CarbonData internally uses LRU caching to improve the performance. The user can get information
+  about the current cache usage in memory through the following command:
+
+  ```sql
+  SHOW METACACHE
+  ```
+
+  This shows the overall memory consumed in the cache by categories - index files, dictionary and
+  datamaps. This also shows the cache usage by all the tables and children tables in the current
+  database.
+
+  ```sql
+  SHOW METACACHE ON TABLE tableName
+  ```
+
+  This shows detailed information on cache usage by the table `tableName` and its carbonindex files,
+  its dictionary files, its datamaps and children tables.
+
+  This command is not allowed on child tables.
diff --git 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/sql/commands/TestCarbonShowCacheCommand.scala
 
b/integr

[carbondata] branch master updated: [CARBONDATA-2447] Block update operation on range/list/hash partition table

2019-02-22 Thread xuchuanyin
This is an automated email from the ASF dual-hosted git repository.

xuchuanyin pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new 1470f78  [CARBONDATA-2447] Block update operation on range/list/hash 
partition table
1470f78 is described below

commit 1470f786451c841841cee21cd7d4a3bdf95ecc76
Author: qiuchenjian <807169...@qq.com>
AuthorDate: Tue Jan 22 10:17:00 2019 +0800

[CARBONDATA-2447] Block update operation on range/list/hash partition table

[Problem]
When updating data on a range partition table, data may be lost or the update may fail; see the JIRA or the new test case.

[Cause]
A range partition table takes the taskNo in the file name as the partitionId. During an update the taskNo keeps increasing, so it no longer matches the partitionId.

[Solution]
(1) When querying the range partition table, do not match the partitionId --- this loses the meaning of partitioning
(2) Make the range partition table use a directory or a separate part as the partitionId --- this is not necessary; using standard partitioning is suggested instead
(3) Range partition tables do not support the update operation --- this PR uses this method

This closes #3091
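
A hedged sketch of the guard this change adds in CarbonProjectForUpdateCommand (the enum and accessor names here are assumptions for illustration; the message string is the one asserted in the new test below):

    // reject update on range/hash/list partitioned tables
    PartitionType partitionType = carbonTable.getPartitionInfo().getPartitionType();
    if (partitionType == PartitionType.RANGE
        || partitionType == PartitionType.HASH
        || partitionType == PartitionType.LIST) {
      throw new UnsupportedOperationException(
          "Unsupported update operation for range/hash/list partition table");
    }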
---
 .../partition/TestUpdateForPartitionTable.scala| 71 ++
 .../mutation/CarbonProjectForUpdateCommand.scala   |  9 ++-
 2 files changed, 79 insertions(+), 1 deletion(-)

diff --git 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/partition/TestUpdateForPartitionTable.scala
 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/partition/TestUpdateForPartitionTable.scala
new file mode 100644
index 000..14dab1e
--- /dev/null
+++ 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/partition/TestUpdateForPartitionTable.scala
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.carbondata.spark.testsuite.partition
+
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.BeforeAndAfterAll
+
+class TestUpdateForPartitionTable extends QueryTest with BeforeAndAfterAll {
+
+  override def beforeAll: Unit = {
+    dropTable
+
+    sql("create table test_range_partition_table (id int) partitioned by (name string) " +
+      "stored by 'carbondata' TBLPROPERTIES('PARTITION_TYPE' = 'RANGE','RANGE_INFO' = 'a,e,f')")
+    sql("create table test_hive_partition_table (id int) partitioned by (name string) " +
+      "stored by 'carbondata'")
+    sql("create table test_hash_partition_table (id int) partitioned by (name string) " +
+      "stored by 'carbondata' TBLPROPERTIES('PARTITION_TYPE' = 'HASH','NUM_PARTITIONS' = '2')")
+    sql("create table test_list_partition_table (id int) partitioned by (name string) " +
+      "stored by 'carbondata' TBLPROPERTIES('PARTITION_TYPE' = 'LIST','LIST_INFO' = 'a,e,f')")
+  }
+
+  def dropTable = {
+    sql("drop table if exists test_hash_partition_table")
+    sql("drop table if exists test_list_partition_table")
+    sql("drop table if exists test_range_partition_table")
+    sql("drop table if exists test_hive_partition_table")
+  }
+
+
+  test ("test update for unsupported partition table") {
+    val updateTables = Array(
+      "test_range_partition_table",
+      "test_list_partition_table",
+      "test_hash_partition_table")
+
+    updateTables.foreach(table => {
+      sql("insert into " + table + " select 1,'b' ")
+      val ex = intercept[UnsupportedOperationException] {
+        sql("update " + table + " set (name) = ('c') where id = 1").collect()
+      }
+      assertResult("Unsupported update operation for range/hash/list partition table")(ex.getMessage)
+    })
+
+  }
+
+  test ("test update for hive(standard) partition table") {
+
+    sql("insert into test_hive_partition_table select 1,'b' ")
+    sql("

[carbondata] branch master updated: [CARBONDATA-3107] Optimize error/exception coding for better debugging

2019-02-22 Thread xuchuanyin
This is an automated email from the ASF dual-hosted git repository.

xuchuanyin pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new 9672a10  [CARBONDATA-3107] Optimize error/exception coding for better 
debugging
9672a10 is described below

commit 9672a1032a77502f67dbc81e6e2758bfe2eaeebf
Author: Manhua 
AuthorDate: Tue Oct 30 09:44:59 2018 +0800

[CARBONDATA-3107] Optimize error/exception coding for better debugging

Some error logs in carbon consist of only a single concluding message
(like "Dataload failed"); when we look into the code we may find a newly
created exception that drops the original exception, so extra work is
needed to find the root cause.

To better locate the root cause when carbon fails, this PR proposes to
keep the original throwable when wrapping it in another exception, and
to log the stack trace along with the error message.

    Changes in this PR follow these rules (`e` is an exception):

    | Code Sample | Problem | Suggested Modification |
    | --- | --- | --- |
    | `LOGGER.error(e);` | no stack trace (e is taken as message instead of throwable) | `LOGGER.error(e.getMessage(), e);` |
    | `LOGGER.error(e.getMessage());` | no stack trace | `LOGGER.error(e.getMessage(), e);` |
    | `catch ... throw new Exception("Error occur")` | useless message, no stack trace | `throw new Exception(e)` |
    | `catch ... throw new Exception(e.getMessage())` | no stack trace | `throw new Exception(e)` |
    | `catch ... throw new Exception(e.getMessage(), e)` | no need to call `getMessage()` | `throw new Exception(e)` |
    | `catch ... throw new Exception("Error occur: " + e.getMessage(), e)` | useless message | `throw new Exception(e)` |
    | `catch ... throw new Exception("DataLoad fail: " + e.getMessage())` | no stack trace | `throw new Exception("DataLoad fail: " + e.getMessage(), e)` |
    Some exceptions, such as MalformedCarbonCommandException,
    InterruptException, NumberFormatException, InvalidLoadOptionException
    and NoRetryException, do not have a constructor taking a `Throwable`,
    so this PR does not change them.
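
    A compact before/after illustration of the rules in the table above
    (`loadData` and the exception types are placeholders, not carbondata APIs):

        try {
          loadData();
        } catch (IOException e) {
          // before: LOGGER.error(e.getMessage());  -- message only, stack trace lost
          // after: keep the original throwable so the full stack trace is logged
          LOGGER.error(e.getMessage(), e);
          throw new RuntimeException("DataLoad fail: " + e.getMessage(), e);
        }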

This closes #2878
---
 .../cache/dictionary/ForwardDictionaryCache.java   |  2 +-
 .../cache/dictionary/ReverseDictionaryCache.java   |  2 +-
 .../core/datamap/DataMapStoreManager.java  |  2 +-
 .../carbondata/core/datamap/DataMapUtil.java   |  2 +-
 .../filesystem/AbstractDFSCarbonFile.java  | 14 ++---
 .../datastore/filesystem/AlluxioCarbonFile.java|  2 +-
 .../core/datastore/filesystem/HDFSCarbonFile.java  |  2 +-
 .../core/datastore/filesystem/LocalCarbonFile.java |  2 +-
 .../core/datastore/filesystem/S3CarbonFile.java|  2 +-
 .../datastore/filesystem/ViewFSCarbonFile.java |  2 +-
 .../core/datastore/impl/FileFactory.java   |  4 ++--
 .../client/NonSecureDictionaryClient.java  |  2 +-
 .../client/NonSecureDictionaryClientHandler.java   |  4 ++--
 .../generator/TableDictionaryGenerator.java|  2 +-
 .../server/NonSecureDictionaryServerHandler.java   |  2 +-
 .../service/AbstractDictionaryServer.java  |  8 
 .../core/indexstore/BlockletDataMapIndexStore.java |  4 ++--
 .../core/indexstore/BlockletDetailInfo.java|  2 +-
 .../timestamp/DateDirectDictionaryGenerator.java   |  4 ++--
 .../carbondata/core/locks/ZookeeperInit.java   |  2 +-
 .../core/metadata/schema/table/CarbonTable.java|  2 +-
 .../carbondata/core/mutate/CarbonUpdateUtil.java   |  4 ++--
 .../core/reader/CarbonDeleteFilesDataReader.java   | 14 ++---
 .../carbondata/core/scan/filter/FilterUtil.java| 16 +++
 .../core/scan/result/BlockletScannedResult.java|  6 +++---
 .../AbstractDetailQueryResultIterator.java |  2 +-
 .../core/statusmanager/LoadMetadataDetails.java|  6 +++---
 .../core/statusmanager/SegmentStatusManager.java   |  6 +++---
 .../carbondata/core/util/CarbonProperties.java |  2 +-
 .../apache/carbondata/core/util/CarbonUtil.java|  8 
 .../apache/carbondata/core/util/DataTypeUtil.java  |  8 
 .../core/util/ObjectSerializationUtil.java |  4 ++--
 .../carbondata/core/util/path/HDFSLeaseUtils.java  |  2 +-
 .../datastore/filesystem/HDFSCarbonFileTest.java   |  2 +-
 .../core/load/LoadMetadataDetailsUnitTest.java |  2 +-
 .../bloom/BloomCoarseGrainDataMapFactory.java  |  2 +-
 .../datamap/lucene/LuceneFineGrainDataMap.java |  6 +++---
 .../lucene/LuceneFineGrainDataMapFactory.java  |  2 +-
 .../hive/CarbonDictionaryDecodeReadSupport.java|  2 +-
 .../carbondata/hive/MapredCarbonInputFormat.java   |  2 +-
 .../carbondata/presto/impl/CarbonTableReader.java  | 12 +--
 .../client/SecureDicti

[carbondata] branch master updated: [CARBONDATA-3278] Remove duplicate code to get filter string of date/timestamp

2019-02-22 Thread xuchuanyin
This is an automated email from the ASF dual-hosted git repository.

xuchuanyin pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
 new d84f721  [CARBONDATA-3278] Remove duplicate code to get filter string 
of date/timestamp
d84f721 is described below

commit d84f721e847ed33ece605213b2c71971f215b733
Author: Manhua 
AuthorDate: Mon Jan 28 15:55:45 2019 +0800

[CARBONDATA-3278] Remove duplicate code to get filter string of 
date/timestamp

Remove duplicated code to get filter string of date/timestamp by method
`ExpressionResult.getString()` instead.

This closes #3109
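
The replacement call, distilled from the diff below (the surrounding names `le` and `FilterIllegalMemberException` appear in the patch):

    String strFilterValue;
    try {
      // getString() already handles NULL and DATE/TIMESTAMP formatting that
      // the removed getLiteralExpValue() used to hand-roll
      strFilterValue = le.getExpressionResult().getString();
    } catch (FilterIllegalMemberException e) {
      throw new RuntimeException("Error while resolving filter expression", e);
    }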
---
 .../datamap/bloom/BloomCoarseGrainDataMap.java | 41 +++---
 1 file changed, 5 insertions(+), 36 deletions(-)

diff --git 
a/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMap.java
 
b/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMap.java
index 4459fc5..fea48c3 100644
--- 
a/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMap.java
+++ 
b/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMap.java
@@ -19,17 +19,13 @@ package org.apache.carbondata.datamap.bloom;
 
 import java.io.IOException;
 import java.io.UnsupportedEncodingException;
-import java.text.DateFormat;
-import java.text.SimpleDateFormat;
 import java.util.ArrayList;
 import java.util.Arrays;
-import java.util.Date;
 import java.util.HashMap;
 import java.util.HashSet;
 import java.util.List;
 import java.util.Map;
 import java.util.Set;
-import java.util.TimeZone;
 import java.util.concurrent.ConcurrentHashMap;
 
 import org.apache.carbondata.common.annotations.InterfaceAudience;
@@ -57,6 +53,7 @@ import 
org.apache.carbondata.core.scan.expression.LiteralExpression;
 import 
org.apache.carbondata.core.scan.expression.conditional.EqualToExpression;
 import org.apache.carbondata.core.scan.expression.conditional.InExpression;
 import org.apache.carbondata.core.scan.expression.conditional.ListExpression;
+import 
org.apache.carbondata.core.scan.expression.exception.FilterIllegalMemberException;
 import org.apache.carbondata.core.scan.expression.logical.AndExpression;
 import org.apache.carbondata.core.scan.filter.resolver.FilterResolverIntf;
 import org.apache.carbondata.core.util.CarbonProperties;
@@ -303,35 +300,6 @@ public class BloomCoarseGrainDataMap extends 
CoarseGrainDataMap {
 return queryModels;
   }
 
-  /**
-   * Here preprocessed NULL and date/timestamp data type.
-   *
-   * Note that if the datatype is date/timestamp, the expressionValue is long 
type.
-   */
-  private Object getLiteralExpValue(LiteralExpression le) {
-Object expressionValue = le.getLiteralExpValue();
-Object literalValue;
-
-if (null == expressionValue) {
-  literalValue = null;
-} else if (le.getLiteralExpDataType() == DataTypes.DATE) {
-  DateFormat format = new 
SimpleDateFormat(CarbonCommonConstants.CARBON_DATE_DEFAULT_FORMAT);
-  // the below settings are set statically according to 
DateDirectDirectionaryGenerator
-  format.setLenient(false);
-  format.setTimeZone(TimeZone.getTimeZone("GMT"));
-  literalValue = format.format(new Date((long) expressionValue / 1000));
-} else if (le.getLiteralExpDataType() == DataTypes.TIMESTAMP) {
-  DateFormat format =
-  new 
SimpleDateFormat(CarbonCommonConstants.CARBON_TIMESTAMP_DEFAULT_FORMAT);
-  // the below settings are set statically according to 
TimeStampDirectDirectionaryGenerator
-  format.setLenient(false);
-  literalValue = format.format(new Date((long) expressionValue / 1000));
-} else {
-  literalValue = expressionValue;
-}
-return literalValue;
-  }
-
 
   private BloomQueryModel buildQueryModelForEqual(ColumnExpression ce,
   LiteralExpression le) throws DictionaryGenerationException, 
UnsupportedEncodingException {
@@ -358,11 +326,12 @@ public class BloomCoarseGrainDataMap extends 
CoarseGrainDataMap {
 
   private byte[] getInternalFilterValue(CarbonColumn carbonColumn, 
LiteralExpression le) throws
   DictionaryGenerationException, UnsupportedEncodingException {
-Object filterLiteralValue = getLiteralExpValue(le);
 // convert the filter value to string and apply converters on it to get 
carbon internal value
 String strFilterValue = null;
-if (null != filterLiteralValue) {
-  strFilterValue = String.valueOf(filterLiteralValue);
+try {
+  strFilterValue = le.getExpressionResult().getString();
+} catch (FilterIllegalMemberException e) {
+  throw new RuntimeException("Error while resolving filter expression", e);
 }
 
 Object convertedValue = 
this.name2Converters.get(carbonColumn.getColName()).convert(



carbondata git commit: [CARBONDATA-3181][BloomDataMap] Fix access field error for BitSet in bloom filter

2018-12-20 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master d7ff3e688 -> 96ce00758


[CARBONDATA-3181][BloomDataMap] Fix access field error for BitSet in bloom 
filter

Problem
java.lang.IllegalAccessError is thrown when querying on a bloom filter without 
compression on CarbonThriftServer.

Analysis
A similar problem occurred before when getting/setting the BitSet in 
CarbonBloomFilter, and it was solved with reflection. We could do the same 
here. But since the BitSet is already set, an easier way is to call the super 
class method so that CarbonBloomFilter does not access the field directly.

Solution
If the bloom filter is not compressed, call the super method to test membership.

This closes #3000


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/96ce0075
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/96ce0075
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/96ce0075

Branch: refs/heads/master
Commit: 96ce00758c3f1153f7ed5dccb43533be55dd1e5e
Parents: d7ff3e6
Author: Manhua 
Authored: Wed Dec 19 11:30:46 2018 +0800
Committer: xuchuanyin 
Committed: Thu Dec 20 19:08:37 2018 +0800

--
 .../hadoop/util/bloom/CarbonBloomFilter.java| 20 
 1 file changed, 8 insertions(+), 12 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/96ce0075/datamap/bloom/src/main/java/org/apache/hadoop/util/bloom/CarbonBloomFilter.java
--
diff --git 
a/datamap/bloom/src/main/java/org/apache/hadoop/util/bloom/CarbonBloomFilter.java
 
b/datamap/bloom/src/main/java/org/apache/hadoop/util/bloom/CarbonBloomFilter.java
index 4b111df..fb0c779 100644
--- 
a/datamap/bloom/src/main/java/org/apache/hadoop/util/bloom/CarbonBloomFilter.java
+++ 
b/datamap/bloom/src/main/java/org/apache/hadoop/util/bloom/CarbonBloomFilter.java
@@ -49,27 +49,23 @@ public class CarbonBloomFilter extends BloomFilter {
 
   @Override
   public boolean membershipTest(Key key) {
-if (key == null) {
-  throw new NullPointerException("key cannot be null");
-}
-
-int[] h = hash.hash(key);
-hash.clear();
 if (compress) {
   // If it is compressed check in roaring bitmap
+  if (key == null) {
+throw new NullPointerException("key cannot be null");
+  }
+  int[] h = hash.hash(key);
+  hash.clear();
   for (int i = 0; i < nbHash; i++) {
 if (!bitmap.contains(h[i])) {
   return false;
 }
   }
+  return true;
 } else {
-  for (int i = 0; i < nbHash; i++) {
-if (!bits.get(h[i])) {
-  return false;
-}
-  }
+  // call super method to avoid IllegalAccessError for `bits` field
+  return super.membershipTest(key);
 }
-return true;
   }
 
   @Override



carbondata git commit: [HOTFIX] replace apache common log with carbondata log4j

2018-12-19 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master c0d858139 -> e01916b6b


[HOTFIX] replace apache common log with carbondata log4j

replace apache common log with carbondata log4j

This closes #2999
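
The mechanical pattern applied in each of the touched files, distilled from the diff below:

    // before (apache commons-logging):
    //   private static final Log LOG = LogFactory.getLog(TableDataMap.class);

    // after (carbondata's log4j-backed factory):
    private static final Logger LOG =
        LogServiceFactory.getLogService(TableDataMap.class.getName());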


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/e01916b6
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/e01916b6
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/e01916b6

Branch: refs/heads/master
Commit: e01916b6bff6daf6ccbc359d44baeabfadf6a0bf
Parents: c0d8581
Author: brijoobopanna 
Authored: Tue Dec 18 16:15:13 2018 +0530
Committer: xuchuanyin 
Committed: Thu Dec 20 14:23:15 2018 +0800

--
 .../org/apache/carbondata/core/datamap/TableDataMap.java | 7 ---
 .../indexstore/blockletindex/BlockletDataMapFactory.java | 3 ---
 .../org/apache/carbondata/core/util/BlockletDataMapUtil.java | 7 ---
 .../apache/carbondata/core/util/ObjectSerializationUtil.java | 8 +---
 .../org/apache/carbondata/hadoop/api/CarbonInputFormat.java  | 7 ---
 .../apache/carbondata/hadoop/api/CarbonTableInputFormat.java | 7 ---
 .../carbondata/hadoop/api/CarbonTableOutputFormat.java   | 7 ---
 .../apache/carbondata/hive/server/HiveEmbeddedServer2.java   | 7 ---
 .../loading/iterator/CarbonOutputIteratorWrapper.java| 7 ---
 9 files changed, 33 insertions(+), 27 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/e01916b6/core/src/main/java/org/apache/carbondata/core/datamap/TableDataMap.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/datamap/TableDataMap.java 
b/core/src/main/java/org/apache/carbondata/core/datamap/TableDataMap.java
index 4de7449..86390e8 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/TableDataMap.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/TableDataMap.java
@@ -29,6 +29,7 @@ import java.util.concurrent.Future;
 import java.util.concurrent.TimeUnit;
 
 import org.apache.carbondata.common.annotations.InterfaceAudience;
+import org.apache.carbondata.common.logging.LogServiceFactory;
 import org.apache.carbondata.core.constants.CarbonCommonConstants;
 import org.apache.carbondata.core.datamap.dev.BlockletSerializer;
 import org.apache.carbondata.core.datamap.dev.DataMap;
@@ -50,8 +51,7 @@ import org.apache.carbondata.events.Event;
 import org.apache.carbondata.events.OperationContext;
 import org.apache.carbondata.events.OperationEventListener;
 
-import org.apache.commons.logging.Log;
-import org.apache.commons.logging.LogFactory;
+import org.apache.log4j.Logger;
 
 /**
  * Index at the table level, user can add any number of DataMap for one table, 
by
@@ -74,7 +74,8 @@ public final class TableDataMap extends 
OperationEventListener {
 
   private SegmentPropertiesFetcher segmentPropertiesFetcher;
 
-  private static final Log LOG = LogFactory.getLog(TableDataMap.class);
+  private static final Logger LOG =
+  LogServiceFactory.getLogService(TableDataMap.class.getName());
 
   /**
* It is called to initialize and load the required table datamap metadata.

http://git-wip-us.apache.org/repos/asf/carbondata/blob/e01916b6/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
 
b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
index 096a5e3..5892f78 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
@@ -54,8 +54,6 @@ import org.apache.carbondata.core.util.BlockletDataMapUtil;
 import org.apache.carbondata.core.util.path.CarbonTablePath;
 import org.apache.carbondata.events.Event;
 
-import org.apache.commons.logging.Log;
-import org.apache.commons.logging.LogFactory;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.fs.LocatedFileStatus;
 import org.apache.hadoop.fs.Path;
@@ -67,7 +65,6 @@ import org.apache.hadoop.fs.RemoteIterator;
 public class BlockletDataMapFactory extends CoarseGrainDataMapFactory
 implements BlockletDetailsFetcher, SegmentPropertiesFetcher, 
CacheableDataMap {
 
-  private static final Log LOG = 
LogFactory.getLog(BlockletDataMapFactory.class);
   private static final String NAME = "clustered.btree.blocklet";
   /**
* variable for cache level BLOCKLET

http://git-wip-us.apache.org/repos/asf/carbondata/blob/e01916b6/core/src/main/java/org/apache/carbond

carbondata git commit: [CARBONDATA-3166]Updated Document and added Column Compressor used in Describe Formatted

2018-12-14 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master 82adc50e7 -> ebdd5486e


[CARBONDATA-3166]Updated Document and added Column Compressor used in Describe 
Formatted

Updated Document and added column compressor used in Describe Formatted Command

This closes #2986


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/ebdd5486
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/ebdd5486
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/ebdd5486

Branch: refs/heads/master
Commit: ebdd5486e309e606efa1ddeb7d2b6d62f2315541
Parents: 82adc50
Author: shardul-cr7 
Authored: Thu Dec 13 14:12:18 2018 +0530
Committer: xuchuanyin 
Committed: Fri Dec 14 19:52:34 2018 +0800

--
 docs/configuration-parameters.md | 2 +-
 .../execution/command/table/CarbonDescribeFormattedCommand.scala | 4 +++-
 2 files changed, 4 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/ebdd5486/docs/configuration-parameters.md
--
diff --git a/docs/configuration-parameters.md b/docs/configuration-parameters.md
index 4aa2929..db21c6a 100644
--- a/docs/configuration-parameters.md
+++ b/docs/configuration-parameters.md
@@ -91,7 +91,7 @@ This section provides the details of all the configurations required for the Car
 | carbon.dictionary.server.port | 2030 | Single Pass Loading enables single job to finish data loading with dictionary generation on the fly. It enhances performance in the scenarios where the subsequent data loading after initial load involves fewer incremental updates on the dictionary. Single pass loading can be enabled using the option ***carbon.options.single.pass***. When this option is specified, a dictionary server will be internally started to handle the dictionary generation and query requests. This configuration specifies the port on which the server need to listen for incoming requests. Port value ranges between 0-65535 |
 | carbon.load.directWriteToStorePath.enabled | false | During data load, all the carbondata files are written to local disk and finally copied to the target store location in HDFS/S3. Enabling this parameter will make carbondata files to be written directly onto target HDFS/S3 location bypassing the local disk.**NOTE:** Writing directly to HDFS/S3 saves local disk IO(once for writing the files and again for copying to HDFS/S3) there by improving the performance. But the drawback is when data loading fails or the application crashes, unwanted carbondata files will remain in the target HDFS/S3 location until it is cleared during next data load or by running *CLEAN FILES* DDL command |
 | carbon.options.serialization.null.format | \N | Based on the business scenarios, some columns might need to be loaded with null values. As null value cannot be written in csv files, some special characters might be adopted to specify null values. This configuration can be used to specify the null values format in the data being loaded. |
-| carbon.column.compressor | snappy | CarbonData will compress the column values using the compressor specified by this configuration. Currently CarbonData supports 'snappy' and 'zstd' compressors. |
+| carbon.column.compressor | snappy | CarbonData will compress the column values using the compressor specified by this configuration. Currently CarbonData supports 'snappy', 'zstd' and 'gzip' compressors. |
 | carbon.minmax.allowed.byte.count | 200 | CarbonData will write the min max values for string/varchar types column using the byte count specified by this configuration. Max value is 1000 bytes(500 characters) and Min value is 10 bytes(5 characters). **NOTE:** This property is useful for reducing the store size thereby improving the query performance but can lead to query degradation if value is not configured properly. | |
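
A hedged usage sketch for the updated row above: the property key comes from the table, and setting it programmatically via CarbonProperties (or in carbon.properties) is assumed to work the same way:

    // choose 'gzip' (newly documented), 'zstd', or the default 'snappy'
    CarbonProperties.getInstance().addProperty("carbon.column.compressor", "zstd");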
 
 ## Compaction Configuration

http://git-wip-us.apache.org/repos/asf/carbondata/blob/ebdd5486/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonDescribeFormattedCommand.scala
--
diff --git 
a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonDescribeFormattedCommand.scala
 
b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonDescribeFormattedCommand.scala
index 151359e..2d560df 100644
--- 
a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonDescribeFormattedCommand.scala
+++ 
b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonDescribeFormattedCommand.scala
@@ -92,7 +92,9 @@ private[sql] case cl

carbondata git commit: [CARBONDATA-3133] Update the document to add spark 2.3.2 and datamap mv compiling method

2018-11-27 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master 7584bf7d5 -> 295734cc8


[CARBONDATA-3133] Update the document to add spark 2.3.2 and datamap mv 
compiling method

Update the document to add Spark 2.3.2 and the datamap mv compiling method.

This closes #2955


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/295734cc
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/295734cc
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/295734cc

Branch: refs/heads/master
Commit: 295734cc89dc5bae5538b500d674938cbfb75e9f
Parents: 7584bf7
Author: Jonathan.Wei <252637...@qq.com>
Authored: Tue Nov 27 15:20:48 2018 +0800
Committer: xuchuanyin 
Committed: Wed Nov 28 14:49:45 2018 +0800

--
 build/README.md | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/295734cc/build/README.md
--
diff --git a/build/README.md b/build/README.md
index 028c03a..f361a6e 100644
--- a/build/README.md
+++ b/build/README.md
@@ -29,9 +29,12 @@ Build with different supported versions of Spark, by default 
using Spark 2.2.1 t
 ```
 mvn -DskipTests -Pspark-2.1 -Dspark.version=2.1.0 clean package
 mvn -DskipTests -Pspark-2.2 -Dspark.version=2.2.1 clean package
+mvn -DskipTests -Pspark-2.3 -Dspark.version=2.3.2 clean package
 ```
 
-Note: If you are working in Windows environment, remember to add `-Pwindows` while building the project.
+Note:
+ - If you are working in Windows environment, remember to add `-Pwindows` while building the project.
+ - The mv feature is not compiled by default. If you want to use this feature, remember to add `-Pmv` while building the project.
 
 ## For contributors : To build the format code after any changes, please follow the below command.
 Note:Need install Apache Thrift 0.9.3
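
Putting the two notes together, a combined invocation for Spark 2.3.2 with the mv datamap might look like this (a sketch; the profile combination is assumed, not taken from the doc):

    mvn -DskipTests -Pspark-2.3 -Dspark.version=2.3.2 -Pmv clean package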



carbondata git commit: [HOTFIX] Improve log message in CarbonWriterBuilder

2018-11-21 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master c2ae98744 -> 3b8de320d


[HOTFIX] Improve log message in CarbonWriterBuilder

In master the log message is not proper (note the missing space before "which", 
produced by the string concatenation fixed below):
"AppName is not set, please use writtenBy() API to set the App Namewhich is using SDK"

This closes #2920
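
For context, a minimal SDK usage sketch that satisfies both checks in the code below (the output path and `schema` value are illustrative; the builder method names are assumed from the carbondata SDK of this era):

    CarbonWriter writer = CarbonWriter.builder()
        .outputPath("/tmp/carbon-out")   // illustrative path
        .withCsvInput(schema)            // sets writerType
        .writtenBy("MyApp")              // sets writtenByApp
        .build();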


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/3b8de320
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/3b8de320
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/3b8de320

Branch: refs/heads/master
Commit: 3b8de320d7092470e6d58ad3dcee594e3ae7ecc8
Parents: c2ae987
Author: Jacky Li 
Authored: Wed Nov 14 20:54:33 2018 +0800
Committer: xuchuanyin 
Committed: Wed Nov 21 18:32:26 2018 +0800

--
 .../org/apache/carbondata/sdk/file/CarbonWriterBuilder.java| 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/3b8de320/store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonWriterBuilder.java
--
diff --git 
a/store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonWriterBuilder.java
 
b/store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonWriterBuilder.java
index 917d4dc..1ca5b74 100644
--- 
a/store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonWriterBuilder.java
+++ 
b/store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonWriterBuilder.java
@@ -438,13 +438,13 @@ public class CarbonWriterBuilder {
     Objects.requireNonNull(path, "path should not be null");
     if (this.writerType == null) {
       throw new IOException(
-          "Writer type is not set, use withCsvInput() or withAvroInput() or withJsonInput()  "
+          "'writerType' must be set, use withCsvInput() or withAvroInput() or withJsonInput()  "
               + "API based on input");
     }
     if (this.writtenByApp == null || this.writtenByApp.isEmpty()) {
       throw new RuntimeException(
-          "AppName is not set, please use writtenBy() API to set the App Name"
-              + "which is using SDK");
+          "'writtenBy' must be set when writing carbon files, use writtenBy() API to "
+              + "set it, it can be the name of the application which is using the SDK");
     }
     CarbonLoadModel loadModel = buildLoadModel(schema);
     loadModel.setSdkWriterCores(numOfThreads);
 loadModel.setSdkWriterCores(numOfThreads);



carbondata git commit: [CARBONDATA-3031] refining usage of numberofcores in CarbonProperties

2018-11-16 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master 0c02e9801 -> 39e8e3da5


[CARBONDATA-3031] refining usage of numberofcores in CarbonProperties

1. Many places use the CarbonProperties function 'getNumOfCores', which returns the number of cores configured for data loading.
2. If we keep reusing that value in scenarios like 'query' or 'compaction', it is confusing.

This closes #2907
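
The distilled change, using only names that appear in the diff below:

    // before: one accessor served loading, query and compaction code paths
    int cores = CarbonProperties.getInstance().getNumberOfCores();

    // after: the loading intent is explicit where loading cores are really meant
    int loadingCores = CarbonProperties.getInstance().getNumberOfLoadingCores();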


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/39e8e3da
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/39e8e3da
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/39e8e3da

Branch: refs/heads/master
Commit: 39e8e3da5c4cca7ff022991c4b5f8808988356d5
Parents: 0c02e98
Author: Sssan520 
Authored: Mon Jul 2 19:12:24 2018 +0800
Committer: xuchuanyin 
Committed: Sat Nov 17 13:49:48 2018 +0800

--
 .../dictionary/AbstractDictionaryCache.java |  2 +-
 .../generator/TableDictionaryGenerator.java |  2 +-
 .../reader/CarbonDeleteFilesDataReader.java |  6 +++-
 .../carbondata/core/util/CarbonProperties.java  | 35 +++-
 .../CarbonAlterTableDropPartitionCommand.scala  |  4 +--
 .../CarbonAlterTableSplitPartitionCommand.scala |  6 ++--
 .../loading/CarbonDataLoadConfiguration.java| 10 ++
 .../loading/DataLoadProcessBuilder.java |  2 ++
 .../processing/merger/CarbonDataMergerUtil.java |  3 +-
 .../merger/CompactionResultSortProcessor.java   |  4 ++-
 .../sort/sortdata/SortParameters.java   |  8 +++--
 .../store/CarbonFactDataHandlerModel.java   | 13 ++--
 .../util/CarbonDataProcessorUtil.java   |  2 +-
 13 files changed, 62 insertions(+), 35 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/39e8e3da/core/src/main/java/org/apache/carbondata/core/cache/dictionary/AbstractDictionaryCache.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/cache/dictionary/AbstractDictionaryCache.java
 
b/core/src/main/java/org/apache/carbondata/core/cache/dictionary/AbstractDictionaryCache.java
index 83c7237..36d5f98 100644
--- a/core/src/main/java/org/apache/carbondata/core/cache/dictionary/AbstractDictionaryCache.java
+++ b/core/src/main/java/org/apache/carbondata/core/cache/dictionary/AbstractDictionaryCache.java
@@ -70,7 +70,7 @@ public abstract class AbstractDictionaryCache

http://git-wip-us.apache.org/repos/asf/carbondata/blob/39e8e3da/core/src/main/java/org/apache/carbondata/core/dictionary/generator/TableDictionaryGenerator.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/dictionary/generator/TableDictionaryGenerator.java
 
b/core/src/main/java/org/apache/carbondata/core/dictionary/generator/TableDictionaryGenerator.java
index 33a91d8..003ab5a 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/dictionary/generator/TableDictionaryGenerator.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/dictionary/generator/TableDictionaryGenerator.java
@@ -78,7 +78,7 @@ public class TableDictionaryGenerator
   }
 
   @Override public void writeDictionaryData() {
-int numOfCores = CarbonProperties.getInstance().getNumberOfCores();
+int numOfCores = CarbonProperties.getInstance().getNumberOfLoadingCores();
 long start = System.currentTimeMillis();
 ExecutorService executorService = Executors.newFixedThreadPool(numOfCores);
 for (final DictionaryGenerator generator : columnMap.values()) {

http://git-wip-us.apache.org/repos/asf/carbondata/blob/39e8e3da/core/src/main/java/org/apache/carbondata/core/reader/CarbonDeleteFilesDataReader.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/reader/CarbonDeleteFilesDataReader.java
 
b/core/src/main/java/org/apache/carbondata/core/reader/CarbonDeleteFilesDataReader.java
index 32eb60d..ee87a75 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/reader/CarbonDeleteFilesDataReader.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/reader/CarbonDeleteFilesDataReader.java
@@ -60,12 +60,16 @@ public class CarbonDeleteFilesDataReader {
     initThreadPoolSize();
   }
 
+  public CarbonDeleteFilesDataReader(int thread_pool_size) {
+    this.thread_pool_size = thread_pool_size;
+  }
+
   /**
    * This method will initialize the thread pool size to be used for creating the
    * max number of threads for a job
    */
   private void initThreadPoolSize() {
-    thread_pool_size = CarbonProperties.getInstance().getNumberOfCores();
+    thread_pool_size = CarbonProperties.getInstance().getNumberOfLoadingCores();
   }
 
   /**

http://git-wip-us.apache.org/repos/asf/carbondata/blob/39e8e3da/core/src/main/java/org/apa

[1/2] carbondata git commit: [CARBONDATA-3087] Improve DESC FORMATTED output

2018-11-15 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master ceb135175 -> 851dd2c88


http://git-wip-us.apache.org/repos/asf/carbondata/blob/851dd2c8/integration/spark2/src/test/scala/org/apache/spark/sql/GetDataSizeAndIndexSizeTest.scala
--
diff --git 
a/integration/spark2/src/test/scala/org/apache/spark/sql/GetDataSizeAndIndexSizeTest.scala
 
b/integration/spark2/src/test/scala/org/apache/spark/sql/GetDataSizeAndIndexSizeTest.scala
index 03ec3a1..563206f 100644
--- 
a/integration/spark2/src/test/scala/org/apache/spark/sql/GetDataSizeAndIndexSizeTest.scala
+++ 
b/integration/spark2/src/test/scala/org/apache/spark/sql/GetDataSizeAndIndexSizeTest.scala
@@ -17,7 +17,10 @@
 
 package org.apache.spark.sql
 
+import java.util.Date
+
 import org.apache.spark.sql.test.util.QueryTest
+
 import org.apache.carbondata.core.constants.CarbonCommonConstants
 import org.scalatest.BeforeAndAfterAll
 
@@ -59,7 +62,7 @@ class GetDataSizeAndIndexSizeTest extends QueryTest with 
BeforeAndAfterAll {
   .filter(row => 
row.getString(0).contains(CarbonCommonConstants.TABLE_DATA_SIZE) ||
   row.getString(0).contains(CarbonCommonConstants.TABLE_INDEX_SIZE))
 assert(res1.length == 2)
-res1.foreach(row => assert(row.getString(1).trim.toLong > 0))
+res1.foreach(row => assert(row.getString(1).trim.substring(0, 2).toDouble 
> 0))
   }
 
   test("get data size and index size after major compaction") {
@@ -73,7 +76,7 @@ class GetDataSizeAndIndexSizeTest extends QueryTest with 
BeforeAndAfterAll {
   .filter(row => 
row.getString(0).contains(CarbonCommonConstants.TABLE_DATA_SIZE) ||
 row.getString(0).contains(CarbonCommonConstants.TABLE_INDEX_SIZE))
 assert(res2.length == 2)
-res2.foreach(row => assert(row.getString(1).trim.toLong > 0))
+res2.foreach(row => assert(row.getString(1).trim.substring(0, 2).toDouble 
> 0))
   }
 
   test("get data size and index size after minor compaction") {
@@ -91,7 +94,7 @@ class GetDataSizeAndIndexSizeTest extends QueryTest with 
BeforeAndAfterAll {
   .filter(row => 
row.getString(0).contains(CarbonCommonConstants.TABLE_DATA_SIZE) ||
 row.getString(0).contains(CarbonCommonConstants.TABLE_INDEX_SIZE))
 assert(res3.length == 2)
-res3.foreach(row => assert(row.getString(1).trim.toLong > 0))
+res3.foreach(row => assert(row.getString(1).trim.substring(0, 2).toDouble 
> 0))
   }
 
   test("get data size and index size after insert into") {
@@ -105,7 +108,7 @@ class GetDataSizeAndIndexSizeTest extends QueryTest with 
BeforeAndAfterAll {
   .filter(row => 
row.getString(0).contains(CarbonCommonConstants.TABLE_DATA_SIZE) ||
 row.getString(0).contains(CarbonCommonConstants.TABLE_INDEX_SIZE))
 assert(res4.length == 2)
-res4.foreach(row => assert(row.getString(1).trim.toLong > 0))
+res4.foreach(row => assert(row.getString(1).trim.substring(0, 2).toDouble 
> 0))
   }
 
   test("get data size and index size after insert overwrite") {
@@ -119,7 +122,7 @@ class GetDataSizeAndIndexSizeTest extends QueryTest with 
BeforeAndAfterAll {
   .filter(row => 
row.getString(0).contains(CarbonCommonConstants.TABLE_DATA_SIZE) ||
 row.getString(0).contains(CarbonCommonConstants.TABLE_INDEX_SIZE))
 assert(res5.length == 2)
-res5.foreach(row => assert(row.getString(1).trim.toLong > 0))
+res5.foreach(row => assert(row.getString(1).trim.substring(0, 2).toDouble 
> 0))
   }
 
   test("get data size and index size for empty table") {
@@ -128,15 +131,14 @@ class GetDataSizeAndIndexSizeTest extends QueryTest with 
BeforeAndAfterAll {
   .filter(row => 
row.getString(0).contains(CarbonCommonConstants.TABLE_DATA_SIZE) ||
 row.getString(0).contains(CarbonCommonConstants.TABLE_INDEX_SIZE))
 assert(res6.length == 2)
-res6.foreach(row => assert(row.getString(1).trim.toLong == 0))
+res6.foreach(row => assert(row.getString(1).trim.substring(0, 2).toDouble 
== 0))
   }
 
   test("get last update time for empty table") {
 sql("CREATE TABLE tableSize9 (empno int, workgroupcategory string, deptno 
int, projectcode int, attendance int) STORED BY 'org.apache.carbondata.format'")
 val res7 = sql("DESCRIBE FORMATTED tableSize9").collect()
-  .filter(row => 
row.getString(0).contains(CarbonCommonConstants.LAST_UPDATE_TIME))
+  .filter(row => row.getString(0).contains("Last Update"))
 assert(res7.length == 1)
-res7.foreach(row => assert(row.getString(1).trim.toLong == 0))
   }
 
   test("get last update time for unempty table") {
@@ -144,9 +146,8 @@ class GetDataSizeAndIndexSizeTest extends QueryTest with 
BeforeAndAfterAll {
 sql(s"""LOAD DATA local inpath '$resourcesPath/data.csv' INTO TABLE 
tableSize10 OPTIONS ('DELIMITER'= ',', 'QUOTECHAR'= '\"', 'FILEHEADER'='')""")
 
 val res8 = sql("DESCRIBE FORMATTED tableSize10").collect()
-  .filter(row => 
row.getString(0).contains(CarbonCommonConstants.LAST_UPDATE_TIME))
+ 

[2/2] carbondata git commit: [CARBONDATA-3087] Improve DESC FORMATTED output

2018-11-15 Thread xuchuanyin
[CARBONDATA-3087] Improve DESC FORMATTED output

Change output of DESC FORMATTED

This closes #2908


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/851dd2c8
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/851dd2c8
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/851dd2c8

Branch: refs/heads/master
Commit: 851dd2c884895762af305fa668ad80c151ba89bd
Parents: ceb1351
Author: Jacky Li 
Authored: Thu Nov 8 20:11:28 2018 +0800
Committer: xuchuanyin 
Committed: Thu Nov 15 16:48:04 2018 +0800

--
 .../core/constants/CarbonCommonConstants.java   |  13 +-
 .../core/constants/SortScopeOptions.java|  49 +++
 .../core/metadata/schema/table/CarbonTable.java |  75 -
 .../core/metadata/schema/table/TableInfo.java   |   4 +-
 .../apache/carbondata/core/util/CarbonUtil.java |   3 +
 .../TestNoInvertedIndexLoadAndQuery.scala   |   7 +-
 .../preaggregate/TestPreAggCreateCommand.scala  |   7 +-
 ...ithColumnMetCacheAndCacheLevelProperty.scala |  12 +-
 ...ithColumnMetCacheAndCacheLevelProperty.scala |   8 +-
 .../TestCreateTableWithCompactionOptions.scala  |  20 --
 .../TestNonTransactionalCarbonTable.scala   |  52 +--
 .../testsuite/dataload/TestLoadDataFrame.scala  |   4 +-
 .../describeTable/TestDescribeTable.scala   |  12 +-
 .../LocalDictionarySupportCreateTableTest.scala |   8 +-
 .../testsuite/sortcolumns/TestSortColumns.scala |   3 +-
 .../sql/commands/StoredAsCarbondataSuite.scala  |   2 +-
 .../sql/commands/UsingCarbondataSuite.scala |   2 +-
 .../spark/rdd/NewCarbonDataLoadRDD.scala|   2 +-
 .../apache/spark/sql/test/util/QueryTest.scala  |   2 +-
 .../spark/rdd/CarbonDataRDDFactory.scala|   3 +-
 .../management/CarbonLoadDataCommand.scala  |   3 +-
 .../table/CarbonDescribeFormattedCommand.scala  | 329 ++-
 .../carbondata/TestStreamingTableOpName.scala   |   2 +-
 .../AlterTableValidationTestCase.scala  |   6 +-
 .../vectorreader/AddColumnTestCases.scala   |  38 +--
 .../vectorreader/ChangeDataTypeTestCases.scala  |   6 +-
 .../vectorreader/DropColumnTestCases.scala  |   6 +-
 .../spark/sql/GetDataSizeAndIndexSizeTest.scala |  25 +-
 .../loading/DataLoadProcessBuilder.java |   2 +-
 .../loading/model/CarbonLoadModelBuilder.java   |   2 +-
 .../loading/sort/SortScopeOptions.java  |  49 ---
 .../processing/loading/sort/SorterFactory.java  |   1 +
 .../store/CarbonFactDataHandlerModel.java   |   2 +-
 .../writer/v3/CarbonFactDataWriterImplV3.java   |   2 +-
 .../util/CarbonDataProcessorUtil.java   |   2 +-
 35 files changed, 390 insertions(+), 373 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/851dd2c8/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
index 259f84e..b75648e 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
@@ -62,11 +62,6 @@ public final class CarbonCommonConstants {
   public static final int BLOCKLET_SIZE_MAX_VAL = 1200;
 
   /**
-   * default block size in MB
-   */
-  public static final String BLOCK_SIZE_DEFAULT_VAL = "1024";
-
-  /**
* min block size in MB
*/
   public static final int BLOCK_SIZE_MIN_VAL = 1;
@@ -438,8 +433,16 @@ public final class CarbonCommonConstants {
   public static final String COLUMN_PROPERTIES = "columnproperties";
   // table block size in MB
   public static final String TABLE_BLOCKSIZE = "table_blocksize";
+
+  // default block size in MB
+  public static final String TABLE_BLOCK_SIZE_DEFAULT = "1024";
+
   // table blocklet size in MB
   public static final String TABLE_BLOCKLET_SIZE = "table_blocklet_size";
+
+  // default blocklet size value in MB
+  public static final String TABLE_BLOCKLET_SIZE_DEFAULT = "64";
+
   /**
* set in column level to disable inverted index
* @Deprecated :This property is deprecated, it is kept just for 
compatibility

http://git-wip-us.apache.org/repos/asf/carbondata/blob/851dd2c8/core/src/main/java/org/apache/carbondata/core/constants/SortScopeOptions.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/constants/SortScopeOptions.java 
b/core/src/main/java/org/apache/carbondata/core/constants/SortScopeOptions.java
new file mode 100644
index 000..281a27e
--- /dev/null
++

[2/2] carbondata git commit: [HOTFIX] Remove search mode module

2018-11-13 Thread xuchuanyin
[HOTFIX] Remove search mode module

Remove search mode module

This closes #2904


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/10918484
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/10918484
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/10918484

Branch: refs/heads/master
Commit: 10918484853953528f5756b1b6bdb4a3a2d3b068
Parents: 6f3b9d3
Author: Jacky Li 
Authored: Fri Nov 9 11:45:50 2018 +0800
Committer: xuchuanyin 
Committed: Tue Nov 13 19:59:28 2018 +0800

--
 .../core/constants/CarbonCommonConstants.java   |  65 
 .../scan/executor/QueryExecutorFactory.java |  17 +-
 .../impl/SearchModeDetailQueryExecutor.java |  86 -
 .../SearchModeVectorDetailQueryExecutor.java|  91 --
 .../AbstractSearchModeResultIterator.java   | 139 
 .../iterator/SearchModeResultIterator.java  |  53 ---
 .../SearchModeVectorResultIterator.java |  49 ---
 .../carbondata/core/util/CarbonProperties.java  |  57 
 .../carbondata/core/util/SessionParams.java |   1 -
 .../benchmark/ConcurrentQueryBenchmark.scala|  59 +---
 .../carbondata/examples/S3UsingSDkExample.scala |   3 +-
 .../carbondata/examples/SearchModeExample.scala | 194 ---
 integration/spark-common-test/pom.xml   |  10 -
 ...eneFineGrainDataMapWithSearchModeSuite.scala | 325 ---
 .../detailquery/SearchModeTestCase.scala| 154 -
 .../carbondata/spark/rdd/CarbonScanRDD.scala|  15 +-
 integration/spark2/pom.xml  |   5 -
 .../carbondata/store/SparkCarbonStore.scala |  63 
 .../org/apache/spark/sql/CarbonSession.scala|  95 --
 .../execution/command/CarbonHiveCommands.scala  |  12 +-
 .../bloom/BloomCoarseGrainDataMapSuite.scala|  15 -
 pom.xml |   7 -
 store/search/pom.xml| 112 ---
 .../store/worker/SearchRequestHandler.java  | 244 --
 .../apache/carbondata/store/worker/Status.java  |  28 --
 .../scala/org/apache/spark/rpc/Master.scala | 291 -
 .../scala/org/apache/spark/rpc/RpcUtil.scala|  56 
 .../scala/org/apache/spark/rpc/Scheduler.scala  | 139 
 .../scala/org/apache/spark/rpc/Worker.scala | 118 ---
 .../org/apache/spark/search/Registry.scala  |  51 ---
 .../org/apache/spark/search/Searcher.scala  |  79 -
 .../carbondata/store/SearchServiceTest.java |  37 ---
 .../org/apache/spark/rpc/SchedulerSuite.scala   | 154 -
 33 files changed, 25 insertions(+), 2799 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/10918484/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
index fc26404..6edfd66 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
@@ -1380,71 +1380,6 @@ public final class CarbonCommonConstants {
   public static final String 
CARBON_HEAP_MEMORY_POOLING_THRESHOLD_BYTES_DEFAULT = "1048576";
 
   /**
-   * If set to true, will use CarbonReader to do distributed scan directly 
instead of using
-   * compute framework like spark, thus avoiding limitation of compute 
framework like SQL
-   * optimizer and task scheduling overhead.
-   */
-  @InterfaceStability.Unstable
-  @CarbonProperty(dynamicConfigurable = true)
-  public static final String CARBON_SEARCH_MODE_ENABLE = 
"carbon.search.enabled";
-
-  public static final String CARBON_SEARCH_MODE_ENABLE_DEFAULT = "false";
-
-  /**
-   * It's timeout threshold of carbon search query
-   */
-  @InterfaceStability.Unstable
-  @CarbonProperty
-  public static final String CARBON_SEARCH_QUERY_TIMEOUT = 
"carbon.search.query.timeout";
-
-  /**
-   * Default value is 10 seconds
-   */
-  public static final String CARBON_SEARCH_QUERY_TIMEOUT_DEFAULT = "10s";
-
-  /**
-   * The size of thread pool used for reading files in Work for search mode. 
By default,
-   * it is number of cores in Worker
-   */
-  @InterfaceStability.Unstable
-  @CarbonProperty
-  public static final String CARBON_SEARCH_MODE_SCAN_THREAD = 
"carbon.search.scan.thread";
-
-  /**
-   * In search mode, Master will listen on this port for worker registration.
-   * If Master failed to start service with this port, it will try to 
increment the port number
-   * and try to bind again, until it is success
-   */
-  @Int

[1/2] carbondata git commit: [HOTFIX] Remove search mode module

2018-11-13 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master 6f3b9d3b9 -> 109184848


http://git-wip-us.apache.org/repos/asf/carbondata/blob/10918484/integration/spark2/src/test/scala/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapSuite.scala
--
diff --git 
a/integration/spark2/src/test/scala/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapSuite.scala
 
b/integration/spark2/src/test/scala/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapSuite.scala
index 3b5b5ca..4985718 100644
--- 
a/integration/spark2/src/test/scala/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapSuite.scala
+++ 
b/integration/spark2/src/test/scala/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapSuite.scala
@@ -294,17 +294,6 @@ class BloomCoarseGrainDataMapSuite extends QueryTest with 
BeforeAndAfterAll with
 val expectedAnswer1 = sql(s"select * from $normalTable where id = 
1").collect()
 val expectedAnswer2 = sql(s"select * from $normalTable where city in 
('city_999')").collect()
 
-carbonSession.startSearchMode()
-assert(carbonSession.isSearchModeEnabled)
-
-checkAnswer(
-  sql(s"select * from $bloomDMSampleTable where id = 1"), expectedAnswer1)
-checkAnswer(
-  sql(s"select * from $bloomDMSampleTable where city in ('city_999')"), 
expectedAnswer2)
-
-carbonSession.stopSearchMode()
-assert(!carbonSession.isSearchModeEnabled)
-
 sql(s"DROP TABLE IF EXISTS $normalTable")
 sql(s"DROP TABLE IF EXISTS $bloomDMSampleTable")
   }
@@ -975,10 +964,6 @@ class BloomCoarseGrainDataMapSuite extends QueryTest with 
BeforeAndAfterAll with
   }
 
   override protected def afterAll(): Unit = {
-// in case of search mode test case failed, stop search mode again
-if (carbonSession.isSearchModeEnabled) {
-  carbonSession.stopSearchMode()
-}
 deleteFile(bigFile)
 deleteFile(smallFile)
 sql(s"DROP TABLE IF EXISTS $normalTable")

http://git-wip-us.apache.org/repos/asf/carbondata/blob/10918484/pom.xml
--
diff --git a/pom.xml b/pom.xml
index b61c59e..8e67c26 100644
--- a/pom.xml
+++ b/pom.xml
@@ -104,7 +104,6 @@
     <module>integration/spark-common-test</module>
     <module>datamap/examples</module>
     <module>store/sdk</module>
-    <module>store/search</module>
     <module>assembly</module>
     <module>tools/cli</module>
   </modules>
@@ -536,8 +535,6 @@
 ${basedir}/streaming/src/main/java
 ${basedir}/streaming/src/main/scala
 ${basedir}/store/sdk/src/main/java
-${basedir}/store/search/src/main/scala
-${basedir}/store/search/src/main/java
 ${basedir}/datamap/bloom/src/main/java
 ${basedir}/datamap/lucene/src/main/java
   
@@ -599,8 +596,6 @@
 ${basedir}/streaming/src/main/java
 ${basedir}/streaming/src/main/scala
 ${basedir}/store/sdk/src/main/java
-${basedir}/store/search/src/main/scala
-${basedir}/store/search/src/main/java
 ${basedir}/datamap/bloom/src/main/java
 ${basedir}/datamap/lucene/src/main/java
   
@@ -658,8 +653,6 @@
 ${basedir}/streaming/src/main/java
 ${basedir}/streaming/src/main/scala
 ${basedir}/store/sdk/src/main/java
-${basedir}/store/search/src/main/scala
-${basedir}/store/search/src/main/java
 ${basedir}/datamap/bloom/src/main/java
 ${basedir}/datamap/lucene/src/main/java
   

http://git-wip-us.apache.org/repos/asf/carbondata/blob/10918484/store/search/pom.xml
--
diff --git a/store/search/pom.xml b/store/search/pom.xml
deleted file mode 100644
index 6b84be9..000
--- a/store/search/pom.xml
+++ /dev/null
@@ -1,112 +0,0 @@
-<project xmlns="http://maven.apache.org/POM/4.0.0"
-         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
-         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
-
-  <modelVersion>4.0.0</modelVersion>
-
-  <parent>
-    <groupId>org.apache.carbondata</groupId>
-    <artifactId>carbondata-parent</artifactId>
-    <version>1.6.0-SNAPSHOT</version>
-    <relativePath>../../pom.xml</relativePath>
-  </parent>
-
-  <artifactId>carbondata-search</artifactId>
-  <name>Apache CarbonData :: Search</name>
-
-  <properties>
-    <dev.path>${basedir}/../../dev</dev.path>
-  </properties>
-
-  <dependencies>
-    <dependency>
-      <groupId>org.apache.carbondata</groupId>
-      <artifactId>carbondata-hadoop</artifactId>
-      <version>${project.version}</version>
-    </dependency>
-    <dependency>
-      <groupId>org.apache.spark</groupId>
-      <artifactId>spark-core_${scala.binary.version}</artifactId>
-      <version>${spark.version}</version>
-    </dependency>
-    <dependency>
-      <groupId>junit</groupId>
-      <artifactId>junit</artifactId>
-      <scope>test</scope>
-    </dependency>
-    <dependency>
-      <groupId>org.scalatest</groupId>
-      <artifactId>scalatest_${scala.binary.version}</artifactId>
-      <scope>test</scope>
-    </dependency>
-  </dependencies>
-
-  <build>
-    <testSourceDirectory>src/test/scala</testSourceDirectory>
-    <plugins>
-      <plugin>
-        <groupId>org.apache.maven.plugins</groupId>
-        <artifactId>maven-compiler-plugin</artifactId>
-        <configuration>
-          <source>1.7</source>
-          <target>1.7</target>
-        </configuration>
-      </plugin>
-      <plugin>
-        <groupId>org.scala-tools</groupId>
-        <artifactId>maven-scala-plugin</artifactId>
-        <version>2.15.2</version>
-
-  
-  

carbondata git commit: [HOTFIX] change log level for data loading

2018-11-09 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master 6707db689 -> 6f3b9d3b9


[HOTFIX] change log level for data loading

In the current data loading flow, many logs meant for debugging are written at INFO level. To reduce the volume of such entries, this PR changes them to DEBUG level.

This closes #2911
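
The pattern applied throughout the patch guards debug messages behind isDebugEnabled so the message string is only built when DEBUG logging is on. A minimal Scala sketch of the idiom; the logger name and table value are assumptions:

    import org.apache.log4j.Logger

    val logger = Logger.getLogger("CarbonDataLoadExample") // hypothetical logger name
    val tableName = "db1.t1"                               // hypothetical table
    if (logger.isDebugEnabled) {
      // the interpolated string is built only when DEBUG is enabled
      logger.debug(s"Table block size not specified for $tableName, using the default value")
    }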


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/6f3b9d3b
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/6f3b9d3b
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/6f3b9d3b

Branch: refs/heads/master
Commit: 6f3b9d3b99ca55a6b29e951d89e17bb60eee1f03
Parents: 6707db6
Author: Jacky Li 
Authored: Fri Nov 9 14:49:15 2018 +0800
Committer: xuchuanyin 
Committed: Fri Nov 9 17:06:13 2018 +0800

--
 .../core/metadata/schema/table/TableInfo.java   |  9 +++--
 .../apache/carbondata/core/util/CarbonUtil.java | 39 +++-
 .../loading/AbstractDataLoadProcessorStep.java  |  4 +-
 .../processing/loading/DataLoadExecutor.java|  2 -
 .../CarbonRowDataWriterProcessorStepImpl.java   | 12 +++---
 .../steps/DataWriterProcessorStepImpl.java  | 12 +++---
 .../store/CarbonFactDataHandlerColumnar.java| 35 --
 .../store/writer/AbstractFactDataWriter.java|  2 +-
 .../writer/v3/CarbonFactDataWriterImplV3.java   |  7 +++-
 .../util/CarbonDataProcessorUtil.java   |  1 -
 10 files changed, 73 insertions(+), 50 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/6f3b9d3b/core/src/main/java/org/apache/carbondata/core/metadata/schema/table/TableInfo.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/metadata/schema/table/TableInfo.java
 
b/core/src/main/java/org/apache/carbondata/core/metadata/schema/table/TableInfo.java
index b3e9e7e..3e50586 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/metadata/schema/table/TableInfo.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/metadata/schema/table/TableInfo.java
@@ -258,9 +258,12 @@ public class TableInfo implements Serializable, Writable {
 }
 if (null == tableBlockSize) {
   tableBlockSize = CarbonCommonConstants.BLOCK_SIZE_DEFAULT_VAL;
-  LOGGER.info("Table block size not specified for " + getTableUniqueName()
-  + ". Therefore considering the default value "
-  + CarbonCommonConstants.BLOCK_SIZE_DEFAULT_VAL + " MB");
+  if (LOGGER.isDebugEnabled()) {
+LOGGER.debug(
+"Table block size not specified for " + getTableUniqueName() +
+". Therefore considering the default value " +
+CarbonCommonConstants.BLOCK_SIZE_DEFAULT_VAL + " MB");
+  }
 }
 return Integer.parseInt(tableBlockSize);
   }

http://git-wip-us.apache.org/repos/asf/carbondata/blob/6f3b9d3b/core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java
--
diff --git a/core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java 
b/core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java
index 1840ba0..2fa6260 100644
--- a/core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java
+++ b/core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java
@@ -2474,7 +2474,7 @@ public final class CarbonUtil {
   lockAcquired = carbonLock.lockWithRetries();
 }
 if (lockAcquired) {
-  LOGGER.info("Acquired lock for table for table status updation");
+  LOGGER.debug("Acquired lock for table for table status updation");
   String metadataPath = carbonTable.getMetadataPath();
   LoadMetadataDetails[] loadMetadataDetails =
   SegmentStatusManager.readLoadMetadata(metadataPath);
@@ -2488,7 +2488,7 @@ public final class CarbonUtil {
   // If it is old segment, need to calculate data size and index 
size again
   if (null == dsize || null == isize) {
 needUpdate = true;
-LOGGER.info("It is an old segment, need calculate data size 
and index size again");
+LOGGER.debug("It is an old segment, need calculate data size 
and index size again");
HashMap<String, Long> map = CarbonUtil.getDataSizeAndIndexSize(
identifier.getTablePath(), loadMetadataDetail.getLoadName());
 dsize = 
String.valueOf(map.get(CarbonCommonConstants.CARBON_TOTAL_DATA_SIZE));
@@ -2524,7 +2524,7 @@ public final class CarbonUtil {
 }
   } finally {
 if (carbonLock.unlock()) {
-  LOGGER.info("Table unlocked successfully after table status 
upda

carbondata git commit: [DOC] Update streaming-guide.md

2018-11-09 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master aa2747722 -> 5344b781e


[DOC] Update streaming-guide.md

Correct StreamSQL description

This closes #2910


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/5344b781
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/5344b781
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/5344b781

Branch: refs/heads/master
Commit: 5344b781e818be6285143d9af5343f391a4042d4
Parents: aa27477
Author: Jacky Li 
Authored: Fri Nov 9 12:00:07 2018 +0800
Committer: xuchuanyin 
Committed: Fri Nov 9 17:01:14 2018 +0800

--
 docs/streaming-guide.md | 40 ++--
 1 file changed, 30 insertions(+), 10 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/5344b781/docs/streaming-guide.md
--
diff --git a/docs/streaming-guide.md b/docs/streaming-guide.md
index 714b07a..0987ed2 100644
--- a/docs/streaming-guide.md
+++ b/docs/streaming-guide.md
@@ -31,9 +31,10 @@
 - [StreamSQL](#streamsql)
   - [Defining Streaming Table](#streaming-table)
   - [Streaming Job Management](#streaming-job-management)
-- [START STREAM](#start-stream)
-- [STOP STREAM](#stop-stream)
+- [CREATE STREAM](#create-stream)
+- [DROP STREAM](#drop-stream)
 - [SHOW STREAMS](#show-streams)
+- [CLOSE STREAM](#close-stream)
 
 ## Quick example
 Download and unzip spark-2.2.0-bin-hadoop2.7.tgz, and export $SPARK_HOME
@@ -333,7 +334,7 @@ Following example shows how to start a streaming ingest job
 
 sql(
   """
-|START STREAM job123 ON TABLE sink
+|CREATE STREAM job123 ON TABLE sink
 |STMPROPERTIES(
 |  'trigger'='ProcessingTime',
 |  'interval'='1 seconds')
@@ -343,7 +344,7 @@ Following example shows how to start a streaming ingest job
 |  WHERE id % 2 = 1
   """.stripMargin)
 
-sql("STOP STREAM job123")
+sql("DROP STREAM job123")
 
 sql("SHOW STREAMS [ON TABLE tableName]")
 ```
@@ -360,13 +361,13 @@ These two tables are normal carbon tables, they can be 
queried independently.
 
 As above example shown:
 
-- `START STREAM jobName ON TABLE tableName` is used to start a streaming 
ingest job. 
-- `STOP STREAM jobName` is used to stop a streaming job by its name
+- `CREATE STREAM jobName ON TABLE tableName` is used to start a streaming 
ingest job. 
+- `DROP STREAM jobName` is used to stop a streaming job by its name
 - `SHOW STREAMS [ON TABLE tableName]` is used to print streaming job 
information
 
 
 
-# START STREAM
+# CREATE STREAM
 
 When this is issued, carbon will start a structured streaming job to do the 
streaming ingestion. Before launching the job, system will validate:
 
@@ -424,11 +425,25 @@ For Kafka data source, create the source table by:
   )
   ```
 
+- Then CREATE STREAM can be used to start the streaming ingest job from source 
table to sink table
+```
+CREATE STREAM job123 ON TABLE sink
+STMPROPERTIES(
+'trigger'='ProcessingTime',
+ 'interval'='10 seconds'
+) 
+AS
+   SELECT *
+   FROM source
+   WHERE id % 2 = 1
+```
 
-# STOP STREAM
-
-When this is issued, the streaming job will be stopped immediately. It will 
fail if the jobName specified is not exist.
+# DROP STREAM
 
+When `DROP STREAM` is issued, the streaming job will be stopped immediately. It will fail if the specified jobName does not exist.
+```
+DROP STREAM job123
+```
 
 
 # SHOW STREAMS
@@ -441,4 +456,9 @@ When this is issued, the streaming job will be stopped 
immediately. It will fail
 
 `SHOW STREAMS` command will show all stream jobs in the system.
 
+# ALTER TABLE CLOSE STREAM
+
+When the streaming application is stopped and the user wants to manually trigger data conversion from carbon streaming files to columnar files, one can use
+`ALTER TABLE sink COMPACT 'CLOSE_STREAMING';`
+
 



[4/6] carbondata git commit: [CARBONDATA-3064] Support separate audit log

2018-11-08 Thread xuchuanyin
http://git-wip-us.apache.org/repos/asf/carbondata/blob/aa274772/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableRenameCommand.scala
--
diff --git 
a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableRenameCommand.scala
 
b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableRenameCommand.scala
index a1c68a3..c64f50b 100644
--- 
a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableRenameCommand.scala
+++ 
b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableRenameCommand.scala
@@ -28,7 +28,6 @@ import org.apache.spark.util.AlterTableUtil
 
 import 
org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException
 import org.apache.carbondata.common.logging.LogServiceFactory
-import org.apache.carbondata.common.logging.impl.Audit
 import org.apache.carbondata.core.datamap.DataMapStoreManager
 import org.apache.carbondata.core.exception.ConcurrentOperationException
 import org.apache.carbondata.core.features.TableOperation
@@ -48,6 +47,8 @@ private[sql] case class CarbonAlterTableRenameCommand(
 val newTableIdentifier = alterTableRenameModel.newTableIdentifier
 val oldDatabaseName = oldTableIdentifier.database
   .getOrElse(sparkSession.catalog.currentDatabase)
+setAuditTable(oldDatabaseName, oldTableIdentifier.table)
+setAuditInfo(Map("newName" -> 
alterTableRenameModel.newTableIdentifier.table))
 val newDatabaseName = newTableIdentifier.database
   .getOrElse(sparkSession.catalog.currentDatabase)
 if (!oldDatabaseName.equalsIgnoreCase(newDatabaseName)) {
@@ -60,15 +61,12 @@ private[sql] case class CarbonAlterTableRenameCommand(
 }
 val oldTableName = oldTableIdentifier.table.toLowerCase
 val newTableName = newTableIdentifier.table.toLowerCase
-Audit.log(LOGGER, s"Rename table request has been received for 
$oldDatabaseName.$oldTableName")
 LOGGER.info(s"Rename table request has been received for 
$oldDatabaseName.$oldTableName")
 val metastore = CarbonEnv.getInstance(sparkSession).carbonMetastore
 val relation: CarbonRelation =
   metastore.lookupRelation(oldTableIdentifier.database, 
oldTableName)(sparkSession)
 .asInstanceOf[CarbonRelation]
 if (relation == null) {
-  Audit.log(LOGGER, s"Rename table request has failed. " +
-   s"Table $oldDatabaseName.$oldTableName does not exist")
   throwMetadataException(oldDatabaseName, oldTableName, "Table does not 
exist")
 }
 
@@ -162,13 +160,11 @@ private[sql] case class CarbonAlterTableRenameCommand(
   OperationListenerBus.getInstance().fireEvent(alterTableRenamePostEvent, 
operationContext)
 
   sparkSession.catalog.refreshTable(newIdentifier.quotedString)
-  Audit.log(LOGGER, s"Table $oldTableName has been successfully renamed to 
$newTableName")
   LOGGER.info(s"Table $oldTableName has been successfully renamed to 
$newTableName")
 } catch {
   case e: ConcurrentOperationException =>
 throw e
   case e: Exception =>
-LOGGER.error("Rename table failed: " + e.getMessage, e)
 if (carbonTable != null) {
   AlterTableUtil.revertRenameTableChanges(
 newTableName,
@@ -182,4 +178,5 @@ private[sql] case class CarbonAlterTableRenameCommand(
 Seq.empty
   }
 
+  override protected def opName: String = "ALTER TABLE RENAME TABLE"
 }

http://git-wip-us.apache.org/repos/asf/carbondata/blob/aa274772/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableSetCommand.scala
--
diff --git 
a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableSetCommand.scala
 
b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableSetCommand.scala
index 51c0e6e..b1e7e33 100644
--- 
a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableSetCommand.scala
+++ 
b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableSetCommand.scala
@@ -29,17 +29,18 @@ private[sql] case class CarbonAlterTableSetCommand(
 isView: Boolean)
   extends MetadataCommand {
 
-  override def run(sparkSession: SparkSession): Seq[Row] = {
-processMetadata(sparkSession)
-  }
-
   override def processMetadata(sparkSession: SparkSession): Seq[Row] = {
+
setAuditTable(tableIdentifier.database.getOrElse(sparkSession.catalog.currentDatabase),
+  tableIdentifier.table)
 AlterTableUtil.modifyTableProperties(
   tableIdentifier,
   properties,
   Nil,
   set = true)(sparkSession,
   

[3/6] carbondata git commit: [CARBONDATA-3064] Support separate audit log

2018-11-08 Thread xuchuanyin
http://git-wip-us.apache.org/repos/asf/carbondata/blob/aa274772/integration/spark2/src/test/scala/org/apache/spark/carbondata/TestStreamingTableOpName.scala
--
diff --git 
a/integration/spark2/src/test/scala/org/apache/spark/carbondata/TestStreamingTableOpName.scala
 
b/integration/spark2/src/test/scala/org/apache/spark/carbondata/TestStreamingTableOpName.scala
new file mode 100644
index 000..d789f5c
--- /dev/null
+++ 
b/integration/spark2/src/test/scala/org/apache/spark/carbondata/TestStreamingTableOpName.scala
@@ -0,0 +1,2647 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.carbondata
+
+import java.io.{File, PrintWriter}
+import java.math.BigDecimal
+import java.net.{BindException, ServerSocket}
+import java.sql.{Date, Timestamp}
+import java.util.concurrent.Executors
+
+import scala.collection.mutable
+
+import org.apache.spark.rdd.RDD
+import org.apache.spark.sql._
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.catalyst.analysis.NoSuchTableException
+import org.apache.spark.sql.hive.CarbonRelation
+import org.apache.spark.sql.streaming.{ProcessingTime, StreamingQuery}
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.BeforeAndAfterAll
+
+import org.apache.carbondata.common.exceptions.NoSuchStreamException
+import 
org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.datastore.impl.FileFactory
+import 
org.apache.carbondata.core.metadata.schema.datamap.DataMapClassProvider.TIMESERIES
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable
+import org.apache.carbondata.core.statusmanager.{FileFormat, SegmentStatus}
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.carbondata.core.util.path.CarbonTablePath
+import org.apache.carbondata.spark.exception.ProcessMetaDataException
+import org.apache.carbondata.spark.rdd.CarbonScanRDD
+import org.apache.carbondata.streaming.parser.CarbonStreamParser
+
+class TestStreamingTableOpName extends QueryTest with BeforeAndAfterAll {
+
+  private val spark = sqlContext.sparkSession
+  private val dataFilePath = s"$resourcesPath/streamSample.csv"
+  def currentPath: String = new File(this.getClass.getResource("/").getPath + 
"../../")
+.getCanonicalPath
+  val badRecordFilePath: File =new File(currentPath + 
"/target/test/badRecords")
+
+  override def beforeAll {
+badRecordFilePath.delete()
+badRecordFilePath.mkdirs()
+CarbonProperties.getInstance().addProperty(
+  CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT,
+  CarbonCommonConstants.CARBON_TIMESTAMP_DEFAULT_FORMAT)
+CarbonProperties.getInstance().addProperty(
+  CarbonCommonConstants.CARBON_DATE_FORMAT,
+  CarbonCommonConstants.CARBON_DATE_DEFAULT_FORMAT)
+sql("DROP DATABASE IF EXISTS streaming CASCADE")
+sql("CREATE DATABASE streaming")
+sql("USE streaming")
+sql(
+  """
+| CREATE TABLE source(
+|c1 string,
+|c2 int,
+|c3 string,
+|c5 string
+| ) STORED BY 'org.apache.carbondata.format'
+| TBLPROPERTIES ('streaming' = 'true')
+  """.stripMargin)
+sql(s"""LOAD DATA LOCAL INPATH '$resourcesPath/IUD/dest.csv' INTO TABLE 
source""")
+
+dropTable()
+
+// 1. normal table not support streaming ingest
+createTable(tableName = "batch_table", streaming = false, withBatchLoad = 
true)
+
+// 2. streaming table with different input source
+// file source
+createTable(tableName = "stream_table_file", streaming = true, 
withBatchLoad = true)
+
+// 3. streaming table with bad records
+createTable(tableName = "bad_record_fail", streaming = true, withBatchLoad 
= true)
+
+// 4. streaming frequency check
+createTable(tableName = "stream_table_1s", streaming = true, withBatchLoad 
= true)
+
+// 5. streaming table execute batch loading
+// 6. detail query
+// 8. compaction
+// full scan + filter scan + aggregate query
+createTable(tableName = "stream_table_filter", 

[6/6] carbondata git commit: [CARBONDATA-3064] Support separate audit log

2018-11-08 Thread xuchuanyin
[CARBONDATA-3064] Support separate audit log

A new audit log is implemented as follows:
1. A framework is added so that carbon commands record the audit log automatically; see command/package.scala
2. Audit logs are written by Auditor.java; a log4j config example is provided in the comment of that file
3. The old audit log is removed

This closes #2885
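
For orientation, a hedged Scala sketch of what the framework does around a command, based on Auditor's API as it appears in this patch; the operation name, id, table and timing values are made up for illustration:

    import java.util.UUID
    import org.apache.carbondata.processing.util.Auditor

    // One JSON record is logged at operation start and one at operation end.
    val opId = UUID.randomUUID().toString // how opId is generated is an assumption
    Auditor.logOperationStart("ALTER TABLE RENAME TABLE", opId)
    // ... command body runs here ...
    Auditor.logOperationEnd("ALTER TABLE RENAME TABLE", opId, true,
      "db1.t1", "128ms", new java.util.HashMap[String, String]())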


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/aa274772
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/aa274772
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/aa274772

Branch: refs/heads/master
Commit: aa27477228d4fd0c51da3f2f8f588e41e06a9519
Parents: 6093a32
Author: Jacky Li 
Authored: Wed Oct 31 14:49:38 2018 +0800
Committer: xuchuanyin 
Committed: Fri Nov 9 08:47:55 2018 +0800

--
 .../carbondata/common/logging/impl/Audit.java   |   49 -
 .../logging/ft/LoggingServiceTest_FT.java   |   93 -
 .../status/DiskBasedDataMapStatusProvider.java  |2 -
 .../client/NonSecureDictionaryClient.java   |3 +-
 .../NonSecureDictionaryClientHandler.java   |3 +-
 .../IncrementalColumnDictionaryGenerator.java   |3 +-
 .../statusmanager/SegmentStatusManager.java |   14 +-
 .../carbondata/core/util/SessionParams.java |   18 +-
 .../carbondata/mv/datamap/MVAnalyzerRule.scala  |2 -
 .../examples/sql/CarbonSessionExample.java  |  137 -
 .../examples/sql/JavaCarbonSessionExample.java  |   94 +
 .../examples/CarbonSessionExample.scala |   44 +-
 .../carbondata/examplesCI/RunExamples.scala |5 +
 .../client/SecureDictionaryClient.java  |3 +-
 .../server/SecureDictionaryServer.java  |3 +-
 .../org/apache/carbondata/api/CarbonStore.scala |9 +-
 .../carbondata/spark/rdd/PartitionDropper.scala |8 -
 .../spark/rdd/PartitionSplitter.scala   |6 -
 .../carbondata/spark/rdd/StreamHandoffRDD.scala |   12 +-
 .../command/carbonTableSchemaCommon.scala   |   21 +-
 .../sql/test/ResourceRegisterAndCopier.scala|4 +-
 ...CreateCarbonSourceTableAsSelectCommand.scala |   13 +-
 .../datamap/CarbonMergeBloomIndexFilesRDD.scala |2 +-
 .../spark/rdd/AggregateDataMapCompactor.scala   |4 -
 .../spark/rdd/CarbonDataRDDFactory.scala|   71 +-
 .../spark/rdd/CarbonTableCompactor.scala|   12 -
 .../carbondata/stream/StreamJobManager.scala|   13 +-
 .../events/MergeBloomIndexEventListener.scala   |3 +-
 .../sql/events/MergeIndexEventListener.scala|   17 +-
 .../datamap/CarbonCreateDataMapCommand.scala|   12 +-
 .../datamap/CarbonDataMapRebuildCommand.scala   |3 +
 .../datamap/CarbonDataMapShowCommand.scala  |3 +
 .../datamap/CarbonDropDataMapCommand.scala  |6 +-
 .../CarbonAlterTableCompactionCommand.scala |   13 +-
 .../CarbonAlterTableFinishStreaming.scala   |3 +
 .../management/CarbonCleanFilesCommand.scala|7 +-
 .../command/management/CarbonCliCommand.scala   |4 +
 .../CarbonDeleteLoadByIdCommand.scala   |5 +-
 .../CarbonDeleteLoadByLoadDateCommand.scala |5 +-
 .../management/CarbonInsertIntoCommand.scala|   13 +-
 .../management/CarbonLoadDataCommand.scala  |  355 +--
 .../management/CarbonShowLoadsCommand.scala |3 +
 .../management/RefreshCarbonTableCommand.scala  |   18 +-
 .../CarbonProjectForDeleteCommand.scala |9 +-
 .../CarbonProjectForUpdateCommand.scala |4 +
 .../command/mutation/DeleteExecution.scala  |   12 +-
 .../command/mutation/HorizontalCompaction.scala |6 -
 .../spark/sql/execution/command/package.scala   |   82 +-
 ...arbonAlterTableAddHivePartitionCommand.scala |3 +
 ...rbonAlterTableDropHivePartitionCommand.scala |3 +
 .../CarbonAlterTableDropPartitionCommand.scala  |8 +-
 .../CarbonAlterTableSplitPartitionCommand.scala |9 +-
 .../CarbonShowCarbonPartitionsCommand.scala |3 +
 .../CarbonAlterTableAddColumnCommand.scala  |9 +-
 .../CarbonAlterTableDataTypeChangeCommand.scala |   16 +-
 .../CarbonAlterTableDropColumnCommand.scala |8 +-
 .../schema/CarbonAlterTableRenameCommand.scala  |9 +-
 .../schema/CarbonAlterTableSetCommand.scala |9 +-
 .../schema/CarbonAlterTableUnsetCommand.scala   |   11 +-
 .../schema/CarbonGetTableDetailCommand.scala|2 +
 .../stream/CarbonCreateStreamCommand.scala  |3 +
 .../stream/CarbonDropStreamCommand.scala|3 +
 .../stream/CarbonShowStreamsCommand.scala   |3 +
 .../CarbonCreateTableAsSelectCommand.scala  |   17 +-
 .../table/CarbonCreateTableCommand.scala|   17 +-
 .../table/CarbonDescribeFormattedCommand.scala  |3 +
 .../command/table/CarbonDropTableCommand.scala  |6 +-
 .../command/table/CarbonExplainCommand.scala|4 +-
 .../command/table/CarbonShowTablesCommand.scala |1 +
 .../strategy

[2/6] carbondata git commit: [CARBONDATA-3064] Support separate audit log

2018-11-08 Thread xuchuanyin
http://git-wip-us.apache.org/repos/asf/carbondata/blob/aa274772/integration/spark2/src/test/scala/org/apache/spark/carbondata/TestStreamingTableOperation.scala
--
diff --git 
a/integration/spark2/src/test/scala/org/apache/spark/carbondata/TestStreamingTableOperation.scala
 
b/integration/spark2/src/test/scala/org/apache/spark/carbondata/TestStreamingTableOperation.scala
deleted file mode 100644
index 62c0221..000
--- 
a/integration/spark2/src/test/scala/org/apache/spark/carbondata/TestStreamingTableOperation.scala
+++ /dev/null
@@ -1,2647 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.spark.carbondata
-
-import java.io.{File, PrintWriter}
-import java.math.BigDecimal
-import java.net.{BindException, ServerSocket}
-import java.sql.{Date, Timestamp}
-import java.util.concurrent.Executors
-
-import scala.collection.mutable
-
-import org.apache.spark.rdd.RDD
-import org.apache.spark.sql._
-import org.apache.spark.sql.catalyst.TableIdentifier
-import org.apache.spark.sql.catalyst.analysis.NoSuchTableException
-import org.apache.spark.sql.hive.CarbonRelation
-import org.apache.spark.sql.streaming.{ProcessingTime, StreamingQuery}
-import org.apache.spark.sql.test.util.QueryTest
-import org.scalatest.BeforeAndAfterAll
-
-import org.apache.carbondata.common.exceptions.NoSuchStreamException
-import 
org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException
-import org.apache.carbondata.core.constants.CarbonCommonConstants
-import org.apache.carbondata.core.datastore.impl.FileFactory
-import 
org.apache.carbondata.core.metadata.schema.datamap.DataMapClassProvider.TIMESERIES
-import org.apache.carbondata.core.metadata.schema.table.CarbonTable
-import org.apache.carbondata.core.statusmanager.{FileFormat, SegmentStatus}
-import org.apache.carbondata.core.util.CarbonProperties
-import org.apache.carbondata.core.util.path.CarbonTablePath
-import org.apache.carbondata.spark.exception.ProcessMetaDataException
-import org.apache.carbondata.spark.rdd.CarbonScanRDD
-import org.apache.carbondata.streaming.parser.CarbonStreamParser
-
-class TestStreamingTableOperation extends QueryTest with BeforeAndAfterAll {
-
-  private val spark = sqlContext.sparkSession
-  private val dataFilePath = s"$resourcesPath/streamSample.csv"
-  def currentPath: String = new File(this.getClass.getResource("/").getPath + 
"../../")
-.getCanonicalPath
-  val badRecordFilePath: File =new File(currentPath + 
"/target/test/badRecords")
-
-  override def beforeAll {
-badRecordFilePath.delete()
-badRecordFilePath.mkdirs()
-CarbonProperties.getInstance().addProperty(
-  CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT,
-  CarbonCommonConstants.CARBON_TIMESTAMP_DEFAULT_FORMAT)
-CarbonProperties.getInstance().addProperty(
-  CarbonCommonConstants.CARBON_DATE_FORMAT,
-  CarbonCommonConstants.CARBON_DATE_DEFAULT_FORMAT)
-sql("DROP DATABASE IF EXISTS streaming CASCADE")
-sql("CREATE DATABASE streaming")
-sql("USE streaming")
-sql(
-  """
-| CREATE TABLE source(
-|c1 string,
-|c2 int,
-|c3 string,
-|c5 string
-| ) STORED BY 'org.apache.carbondata.format'
-| TBLPROPERTIES ('streaming' = 'true')
-  """.stripMargin)
-sql(s"""LOAD DATA LOCAL INPATH '$resourcesPath/IUD/dest.csv' INTO TABLE 
source""")
-
-dropTable()
-
-// 1. normal table not support streaming ingest
-createTable(tableName = "batch_table", streaming = false, withBatchLoad = 
true)
-
-// 2. streaming table with different input source
-// file source
-createTable(tableName = "stream_table_file", streaming = true, 
withBatchLoad = true)
-
-// 3. streaming table with bad records
-createTable(tableName = "bad_record_fail", streaming = true, withBatchLoad 
= true)
-
-// 4. streaming frequency check
-createTable(tableName = "stream_table_1s", streaming = true, withBatchLoad 
= true)
-
-// 5. streaming table execute batch loading
-// 6. detail query
-// 8. compaction
-// full scan + filter scan + aggregate query
-createTable(tableName = 

[1/6] carbondata git commit: [CARBONDATA-3064] Support separate audit log

2018-11-08 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master 6093a32d7 -> aa2747722


http://git-wip-us.apache.org/repos/asf/carbondata/blob/aa274772/processing/src/main/java/org/apache/carbondata/processing/util/Auditor.java
--
diff --git 
a/processing/src/main/java/org/apache/carbondata/processing/util/Auditor.java 
b/processing/src/main/java/org/apache/carbondata/processing/util/Auditor.java
new file mode 100644
index 000..e811c59
--- /dev/null
+++ 
b/processing/src/main/java/org/apache/carbondata/processing/util/Auditor.java
@@ -0,0 +1,222 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.processing.util;
+
+import java.io.IOException;
+import java.util.Date;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Objects;
+
+import org.apache.carbondata.common.annotations.InterfaceAudience;
+import org.apache.carbondata.common.logging.impl.AuditLevel;
+
+import com.google.gson.Gson;
+import com.google.gson.GsonBuilder;
+import org.apache.commons.lang3.time.FastDateFormat;
+import org.apache.hadoop.security.UserGroupInformation;
+import org.apache.log4j.Logger;
+
+/**
+ * Audit logger.
+ * User can configure log4j to log to a separate file. For example
+ *
+ *  log4j.logger.carbon.audit=DEBUG, audit
+ *  log4j.appender.audit=org.apache.log4j.FileAppender
+ *  log4j.appender.audit.File=/opt/logs/audit.out
+ *  log4j.appender.audit.Threshold=AUDIT
+ *  log4j.appender.audit.Append=false
+ *  log4j.appender.audit.layout=org.apache.log4j.PatternLayout
+ *  log4j.appender.audit.layout.ConversionPattern=%m%n
+ */
+@InterfaceAudience.Internal
+public class Auditor {
+  private static final Logger LOGGER = Logger.getLogger("carbon.audit");
+  private static final Gson gson = new 
GsonBuilder().disableHtmlEscaping().create();
+  private static String username;
+
+  static {
+try {
+  username = UserGroupInformation.getCurrentUser().getShortUserName();
+} catch (IOException e) {
+  username = "unknown";
+}
+  }
+
+  /**
+   * call this method to record audit log when operation is triggered
+   * @param opName operation name
+   * @param opId operation unique id
+   */
+  public static void logOperationStart(String opName, String opId) {
+Objects.requireNonNull(opName);
+Objects.requireNonNull(opId);
+OpStartMessage message = new OpStartMessage(opName, opId);
+Gson gson = new GsonBuilder().disableHtmlEscaping().create();
+String json = gson.toJson(message);
+LOGGER.log(AuditLevel.AUDIT, json);
+  }
+
+  /**
+   * call this method to record audit log when operation finished
+   * @param opName operation name
+   * @param opId operation unique id
+   * @param success true if operation success
+   * @param table carbon dbName and tableName
+   * @param opTime elapse time in Ms for this operation
+   * @param extraInfo extra information to include in the audit log
+   */
+  public static void logOperationEnd(String opName, String opId, boolean success, String table,
+  String opTime, Map<String, String> extraInfo) {
+Objects.requireNonNull(opName);
+Objects.requireNonNull(opId);
+Objects.requireNonNull(opTime);
+OpEndMessage message = new OpEndMessage(opName, opId, table, opTime,
+success ? OpStatus.SUCCESS : OpStatus.FAILED,
+extraInfo != null ? extraInfo : new HashMap<String, String>());
+String json = gson.toJson(message);
+LOGGER.log(AuditLevel.AUDIT, json);
+  }
+
+  private enum OpStatus {
+// operation started
+START,
+
+// operation succeed
+SUCCESS,
+
+// operation failed
+FAILED
+  }
+
+  // log message for operation start, it is written as a JSON record in the 
audit log
+  private static class OpStartMessage {
+private String time;
+private String username;
+private String opName;
+private String opId;
+private OpStatus opStatus;
+
+OpStartMessage(String opName, String opId) {
+  FastDateFormat format =
+  FastDateFormat.getDateTimeInstance(FastDateFormat.LONG, 
FastDateFormat.LONG);
+  this.time = format.format(new Date());
+  this.username = Auditor.username;
+  this.opName = opName;
+  this.opId = 

[5/6] carbondata git commit: [CARBONDATA-3064] Support separate audit log

2018-11-08 Thread xuchuanyin
http://git-wip-us.apache.org/repos/asf/carbondata/blob/aa274772/integration/spark2/src/main/scala/org/apache/spark/sql/events/MergeIndexEventListener.scala
--
diff --git 
a/integration/spark2/src/main/scala/org/apache/spark/sql/events/MergeIndexEventListener.scala
 
b/integration/spark2/src/main/scala/org/apache/spark/sql/events/MergeIndexEventListener.scala
index c8c9a47..35b73d6 100644
--- 
a/integration/spark2/src/main/scala/org/apache/spark/sql/events/MergeIndexEventListener.scala
+++ 
b/integration/spark2/src/main/scala/org/apache/spark/sql/events/MergeIndexEventListener.scala
@@ -29,7 +29,6 @@ import org.apache.spark.sql.SparkSession
 import org.apache.spark.sql.util.CarbonException
 
 import org.apache.carbondata.common.logging.LogServiceFactory
-import org.apache.carbondata.common.logging.impl.Audit
 import org.apache.carbondata.core.constants.CarbonCommonConstants
 import org.apache.carbondata.core.datamap.Segment
 import org.apache.carbondata.core.locks.{CarbonLockFactory, LockUsage}
@@ -39,7 +38,6 @@ import 
org.apache.carbondata.core.statusmanager.SegmentStatusManager
 import org.apache.carbondata.events.{AlterTableCompactionPostEvent, 
AlterTableMergeIndexEvent, Event, OperationContext, OperationEventListener}
 import 
org.apache.carbondata.processing.loading.events.LoadEvents.LoadTablePostExecutionEvent
 import org.apache.carbondata.processing.merger.CarbonDataMergerUtil
-import org.apache.carbondata.spark.util.CommonUtil
 
 class MergeIndexEventListener extends OperationEventListener with Logging {
   val LOGGER = LogServiceFactory.getLogService(this.getClass.getCanonicalName)
@@ -47,7 +45,7 @@ class MergeIndexEventListener extends OperationEventListener 
with Logging {
   override def onEvent(event: Event, operationContext: OperationContext): Unit 
= {
 event match {
   case preStatusUpdateEvent: LoadTablePostExecutionEvent =>
-Audit.log(LOGGER, "Load post status event-listener called for merge 
index")
+LOGGER.info("Load post status event-listener called for merge index")
 val loadModel = preStatusUpdateEvent.getCarbonLoadModel
 val carbonTable = loadModel.getCarbonDataLoadSchema.getCarbonTable
 val compactedSegments = loadModel.getMergedSegmentIds
@@ -73,7 +71,7 @@ class MergeIndexEventListener extends OperationEventListener 
with Logging {
   }
 }
   case alterTableCompactionPostEvent: AlterTableCompactionPostEvent =>
-Audit.log(LOGGER, "Merge index for compaction called")
+LOGGER.info("Merge index for compaction called")
 val carbonTable = alterTableCompactionPostEvent.carbonTable
 val mergedLoads = alterTableCompactionPostEvent.compactedLoads
 val sparkSession = alterTableCompactionPostEvent.sparkSession
@@ -84,8 +82,6 @@ class MergeIndexEventListener extends OperationEventListener 
with Logging {
 val carbonMainTable = alterTableMergeIndexEvent.carbonTable
 val sparkSession = alterTableMergeIndexEvent.sparkSession
 if (!carbonMainTable.isStreamingSink) {
-  Audit.log(LOGGER, s"Compaction request received for table " +
-   s"${ carbonMainTable.getDatabaseName }.${ 
carbonMainTable.getTableName }")
   LOGGER.info(s"Merge Index request received for table " +
   s"${ carbonMainTable.getDatabaseName }.${ 
carbonMainTable.getTableName }")
   val lock = CarbonLockFactory.getCarbonLockObj(
@@ -130,16 +126,11 @@ class MergeIndexEventListener extends 
OperationEventListener with Logging {
   clearBlockDataMapCache(carbonMainTable, validSegmentIds)
   val requestMessage = "Compaction request completed for table " +
 s"${ carbonMainTable.getDatabaseName }.${ 
carbonMainTable.getTableName }"
-  Audit.log(LOGGER, requestMessage)
   LOGGER.info(requestMessage)
 } else {
   val lockMessage = "Not able to acquire the compaction lock for 
table " +
-s"${ carbonMainTable.getDatabaseName }.${
-  carbonMainTable
-.getTableName
-}"
-
-  Audit.log(LOGGER, lockMessage)
+s"${ carbonMainTable.getDatabaseName }." +
+s"${ carbonMainTable.getTableName}"
   LOGGER.error(lockMessage)
   CarbonException.analysisException(
 "Table is already locked for compaction. Please try after some 
time.")

http://git-wip-us.apache.org/repos/asf/carbondata/blob/aa274772/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/datamap/CarbonCreateDataMapCommand.scala
--
diff --git 

carbondata git commit: [CARBONDATA-3078] Disable explain collector for count star query without filter

2018-11-06 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master a3a83dcad -> b6ff4672b


[CARBONDATA-3078] Disable explain collector for count star query without filter

An issue was found with the explain command on a count(*) query without filter. Such a query is a special case and uses a different plan.
Considering that there is no useful block/blocklet pruning information for a count(*) query without filter, the explain collector is disabled, which avoids the exception reported in
https://issues.apache.org/jira/browse/CARBONDATA-3078

This closes #2900
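
A hedged Scala sketch of the scenario the new test covers; `spark` and the table name `t1` are placeholders, the patch's own test uses its own table:

    import org.apache.carbondata.core.constants.CarbonCommonConstants
    import org.apache.carbondata.core.util.CarbonProperties

    // With query statistics enabled, EXPLAIN of an unfiltered count(*) used to
    // hit the exception; with this patch the explain collector is disabled instead.
    CarbonProperties.getInstance()
      .addProperty(CarbonCommonConstants.ENABLE_QUERY_STATISTICS, "true")
    spark.sql("EXPLAIN SELECT COUNT(*) FROM t1").collect()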


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/b6ff4672
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/b6ff4672
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/b6ff4672

Branch: refs/heads/master
Commit: b6ff4672be7bd25ab40144feb801be9e20069244
Parents: a3a83dc
Author: Manhua 
Authored: Mon Nov 5 20:17:59 2018 +0800
Committer: xuchuanyin 
Committed: Tue Nov 6 19:58:24 2018 +0800

--
 .../carbondata/hadoop/api/CarbonTableInputFormat.java|  9 +
 .../spark/testsuite/filterexpr/CountStarTestCase.scala   | 11 +++
 2 files changed, 20 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/b6ff4672/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
--
diff --git 
a/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
 
b/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
index ba3accf..86cbfec 100644
--- 
a/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
+++ 
b/hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java
@@ -43,6 +43,7 @@ import org.apache.carbondata.core.mutate.CarbonUpdateUtil;
 import org.apache.carbondata.core.mutate.SegmentUpdateDetails;
 import org.apache.carbondata.core.mutate.UpdateVO;
 import org.apache.carbondata.core.mutate.data.BlockMappingVO;
+import org.apache.carbondata.core.profiler.ExplainCollector;
 import org.apache.carbondata.core.readcommitter.LatestFilesReadCommittedScope;
 import org.apache.carbondata.core.readcommitter.ReadCommittedScope;
 import org.apache.carbondata.core.readcommitter.TableStatusReadCommittedScope;
@@ -575,6 +576,14 @@ public class CarbonTableInputFormat<T> extends CarbonInputFormat<T> {
    */
   public BlockMappingVO getBlockRowCount(Job job, CarbonTable table,
       List<PartitionSpec> partitions) throws IOException {
+// Normal query flow goes to CarbonInputFormat#getPrunedBlocklets and 
initialize the
+// pruning info for table we queried. But here count star query without 
filter uses a different
+// query plan, and no pruning info is initialized. When it calls default 
data map to
+// prune(with a null filter), exception will occur during setting pruning 
info.
+// Considering no useful information about block/blocklet pruning for such 
query
+// (actually no pruning), so we disable explain collector here
+ExplainCollector.remove();
+
 AbsoluteTableIdentifier identifier = table.getAbsoluteTableIdentifier();
 TableDataMap blockletMap = 
DataMapStoreManager.getInstance().getDefaultDataMap(table);
 

http://git-wip-us.apache.org/repos/asf/carbondata/blob/b6ff4672/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/filterexpr/CountStarTestCase.scala
--
diff --git 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/filterexpr/CountStarTestCase.scala
 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/filterexpr/CountStarTestCase.scala
index f26d0e7..18ad1d7 100644
--- 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/filterexpr/CountStarTestCase.scala
+++ 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/filterexpr/CountStarTestCase.scala
@@ -54,6 +54,17 @@ class CountStarTestCase extends QueryTest with 
BeforeAndAfterAll {
 )
   }
 
+  test("explain select count star without filter") {
+CarbonProperties.getInstance()
+  .addProperty(CarbonCommonConstants.ENABLE_QUERY_STATISTICS, "true")
+
+sql("explain select count(*) from filterTimestampDataType").collect()
+
+CarbonProperties.getInstance()
+  .addProperty(CarbonCommonConstants.ENABLE_QUERY_STATISTICS,
+CarbonCommonConstants.ENABLE_QUERY_STATISTICS_DEFAULT)
+  }
+
   override def afterAll {
 sql("drop table if exists filtertestTables")
 sql("drop table if exists filterTimestampDataType")



carbondata git commit: [CARBONDATA-3074] Change default sort temp compressor to snappy

2018-11-05 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master 789055e19 -> c9fb4bc06


[CARBONDATA-3074] Change default sort temp compressor to snappy

The sort temp compressor used to be set to empty by default, which means that CarbonData would not compress the sort temp files. This PR changes the default value to snappy.
Experiments in a local cluster show that setting the compressor to 'snappy' slightly improves loading performance and greatly reduces disk IO during data loading.

This closes #2894
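
For users who relied on the old behaviour, a hedged Scala sketch of restoring it, or picking another codec; the property name and codec list come from the hunks below:

    import org.apache.carbondata.core.constants.CarbonCommonConstants
    import org.apache.carbondata.core.util.CarbonProperties

    // An empty value restores the old default: no compression of sort temp files.
    CarbonProperties.getInstance()
      .addProperty(CarbonCommonConstants.CARBON_SORT_TEMP_COMPRESSOR, "")
    // Any of 'SNAPPY','GZIP','BZIP2','LZ4','ZSTD' is also accepted, for example:
    CarbonProperties.getInstance()
      .addProperty(CarbonCommonConstants.CARBON_SORT_TEMP_COMPRESSOR, "ZSTD")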


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/c9fb4bc0
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/c9fb4bc0
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/c9fb4bc0

Branch: refs/heads/master
Commit: c9fb4bc064c96649fc200ed13bd2db0bcda6ed56
Parents: 789055e
Author: Manhua 
Authored: Mon Nov 5 17:28:07 2018 +0800
Committer: xuchuanyin 
Committed: Mon Nov 5 20:35:25 2018 +0800

--
 .../core/constants/CarbonCommonConstants.java   |  6 +++---
 docs/configuration-parameters.md|  3 +--
 docs/performance-tuning.md  |  2 +-
 .../dataload/TestLoadWithSortTempCompressed.scala   | 16 
 4 files changed, 21 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/c9fb4bc0/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
index bf4f7e5..9484bb4 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
@@ -873,10 +873,10 @@ public final class CarbonCommonConstants {
   public static final String CARBON_SORT_TEMP_COMPRESSOR = 
"carbon.sort.temp.compressor";
 
   /**
-   * The optional values are 'SNAPPY','GZIP','BZIP2','LZ4','ZSTD'.
-   * By default, empty means that Carbondata will not compress the sort temp 
files.
+   * The optional values are 'SNAPPY','GZIP','BZIP2','LZ4','ZSTD' and empty.
+   * Specially, empty means that Carbondata will not compress the sort temp 
files.
*/
-  public static final String CARBON_SORT_TEMP_COMPRESSOR_DEFAULT = "";
+  public static final String CARBON_SORT_TEMP_COMPRESSOR_DEFAULT = "SNAPPY";
   /**
* Which storage level to persist rdd when sort_scope=global_sort
*/

http://git-wip-us.apache.org/repos/asf/carbondata/blob/c9fb4bc0/docs/configuration-parameters.md
--
diff --git a/docs/configuration-parameters.md b/docs/configuration-parameters.md
index 2a3748c..5a4dea6 100644
--- a/docs/configuration-parameters.md
+++ b/docs/configuration-parameters.md
@@ -79,12 +79,11 @@ This section provides the details of all the configurations 
required for the Car
 | enable.inmemory.merge.sort | false | CarbonData sorts and writes data to 
intermediate files to limit the memory usage. These intermediate files needs to 
be sorted again using merge sort before writing to the final carbondata 
file.Performing merge sort in memory would increase the sorting performance at 
the cost of increased memory footprint. This Configuration specifies to do 
in-memory merge sort or to do file based merge sort. |
 | carbon.sort.storage.inmemory.size.inmb | 512 | CarbonData writes every 
***carbon.sort.size*** number of records to intermediate temp files during data 
loading to ensure memory footprint is within limits. When 
***enable.unsafe.sort*** configuration is enabled, instead of using 
***carbon.sort.size*** which is based on rows count, size occupied in memory is 
used to determine when to flush data pages to intermediate temp files. This 
configuration determines the memory to be used for storing data pages in 
memory. **NOTE:** Configuring a higher value ensures more data is maintained in 
memory and hence increases data loading performance due to reduced or no 
IO.Based on the memory availability in the nodes of the cluster, configure the 
values accordingly. |
 | carbon.load.sortmemory.spill.percentage | 0 | During data loading, some data 
pages are kept in memory upto memory configured in 
***carbon.sort.storage.inmemory.size.inmb*** beyond which they are spilled to 
disk as intermediate temporary sort files. This configuration determines after 
what percentage data needs to be spilled to disk. **NOTE:** Without this 
configuration, when the data pages occupy upto configured memory, new data 
pages would be dumped to disk and old p

carbondata git commit: [HOTFIX] Throw original exception in thread pool

2018-11-04 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master 934216de1 -> 469c52f5d


[HOTFIX] Throw original exception in thread pool

If an exception occurs in Callable.run in the thread pool, the original exception should be rethrown instead of a newly created one; dropping the original cause makes debugging hard.

This closes #2887
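
The fix follows the standard wrap-with-cause idiom. A minimal Scala sketch of the before/after; the helper `deleteOriginalCarbonFiles` is a hypothetical stand-in for the real call:

    import java.io.IOException

    def deleteOriginalCarbonFiles(): Unit = ??? // hypothetical stand-in for the real call

    try {
      deleteOriginalCarbonFiles()
    } catch {
      case e: IOException =>
        // Before: sys.error("Exception while delete original carbon files " + e.getMessage)
        // After: keep the original exception as the cause so the full stack trace survives.
        throw new IOException("Exception while delete original carbon files", e)
    }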


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/469c52f5
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/469c52f5
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/469c52f5

Branch: refs/heads/master
Commit: 469c52f5d4c18579e2a6ed4c3bb35691cf01937b
Parents: 934216d
Author: Jacky Li 
Authored: Wed Oct 31 20:16:11 2018 +0800
Committer: xuchuanyin 
Committed: Mon Nov 5 09:15:53 2018 +0800

--
 .../spark/rdd/NewCarbonDataLoadRDD.scala|  2 +
 .../carbondata/spark/rdd/PartitionDropper.scala |  7 +-
 .../spark/rdd/PartitionSplitter.scala   |  4 +-
 .../steps/DataWriterProcessorStepImpl.java  | 68 ++--
 4 files changed, 27 insertions(+), 54 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/469c52f5/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/NewCarbonDataLoadRDD.scala
--
diff --git 
a/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/NewCarbonDataLoadRDD.scala
 
b/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/NewCarbonDataLoadRDD.scala
index 041dc1c..0b6a2a9 100644
--- 
a/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/NewCarbonDataLoadRDD.scala
+++ 
b/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/NewCarbonDataLoadRDD.scala
@@ -156,6 +156,8 @@ class NewCarbonDataLoadRDD[K, V](
   logInfo("Bad Record Found")
 case e: Exception =>
   loadMetadataDetails.setSegmentStatus(SegmentStatus.LOAD_FAILURE)
+  executionErrors.failureCauses = FailureCauses.EXECUTOR_FAILURE
+  executionErrors.errorMsg = e.getMessage
   logInfo("DataLoad failure", e)
   LOGGER.error(e)
   throw e

http://git-wip-us.apache.org/repos/asf/carbondata/blob/469c52f5/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/PartitionDropper.scala
--
diff --git 
a/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/PartitionDropper.scala
 
b/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/PartitionDropper.scala
index 6911b0b..353a478 100644
--- 
a/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/PartitionDropper.scala
+++ 
b/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/PartitionDropper.scala
@@ -102,8 +102,8 @@ object PartitionDropper {
   Seq(partitionId, targetPartitionId).toList, dbName,
   tableName, partitionInfo)
   } catch {
-        case e: IOException => sys.error(s"Exception while delete original carbon files " +
-                                         e.getMessage)
+        case e: IOException =>
+          throw new IOException("Exception while delete original carbon files ", e)
   }
   Audit.log(logger, s"Drop Partition request completed for table " +
s"${ dbName }.${ tableName }")
@@ -111,7 +111,8 @@ object PartitionDropper {
   s"${ dbName }.${ tableName }")
 }
   } catch {
-        case e: Exception => sys.error(s"Exception in dropping partition action: ${ e.getMessage }")
+        case e: Exception =>
+          throw new RuntimeException("Exception in dropping partition action", e)
   }
 } else {
   PartitionUtils.deleteOriginalCarbonFile(alterPartitionModel, 
absoluteTableIdentifier,

http://git-wip-us.apache.org/repos/asf/carbondata/blob/469c52f5/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/PartitionSplitter.scala
--
diff --git 
a/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/PartitionSplitter.scala
 
b/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/PartitionSplitter.scala
index ca9f049..369ad51 100644
--- 
a/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/PartitionSplitter.scala
+++ 
b/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/PartitionSplitter.scala
@@ -87,8 +87,8 @@ object PartitionSplitter {
deleteOriginalCarbonFile(alterPartitionMo

carbondata git commit: [CARBONDATA-3068] fixed cannot load data from hdfs files without hdfs prefix

2018-11-01 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master ac94dbadc -> 269f4c378


[CARBONDATA-3068] fixed cannot load data from hdfs files without hdfs prefix

sql:
LOAD DATA INPATH '/tmp/test.csv' INTO TABLE test
OPTIONS('QUOTECHAR'='"','TIMESTAMPFORMAT'='yyyy/MM/dd HH:mm:ss');
error:
org.apache.carbondata.processing.exception.DataLoadingException: The input file
does not exist: /tmp/test.csv (state=,code=0)
but the file "test.csv" is in the HDFS path, and the hadoop conf "core-site.xml"
has the property:

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://master:9000</value>
</property>

This closes #2891
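
To see why the one-line fix below matters, here is a sketch using plain Hadoop
APIs (CarbonData does the equivalent via CarbonUtil.checkAndAppendHDFSUrl): a
bare path is only resolved against fs.defaultFS once it is qualified, and the
existence check must use the qualified path.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DefaultFsResolution {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration(); // loads core-site.xml, incl. fs.defaultFS
    Path raw = new Path("/tmp/test.csv");
    FileSystem fs = raw.getFileSystem(conf);
    // qualify the bare path, e.g. /tmp/test.csv -> hdfs://master:9000/tmp/test.csv
    Path qualified = raw.makeQualified(fs.getUri(), fs.getWorkingDirectory());
    System.out.println("resolved to: " + qualified);
    // the fix checks existence on the resolved path, not the raw input string
    System.out.println("exists: " + fs.exists(qualified));
  }
}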


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/269f4c37
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/269f4c37
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/269f4c37

Branch: refs/heads/master
Commit: 269f4c378bad1294430b7d1662469e300f0e04ed
Parents: ac94dba
Author: Sssan520 
Authored: Mon Jul 2 19:12:24 2018 +0800
Committer: xuchuanyin 
Committed: Fri Nov 2 09:10:35 2018 +0800

--
 .../src/main/scala/org/apache/spark/util/FileUtils.scala   | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/269f4c37/integration/spark-common/src/main/scala/org/apache/spark/util/FileUtils.scala
--
diff --git 
a/integration/spark-common/src/main/scala/org/apache/spark/util/FileUtils.scala 
b/integration/spark-common/src/main/scala/org/apache/spark/util/FileUtils.scala
index db63f6e..3c7fbbb 100644
--- 
a/integration/spark-common/src/main/scala/org/apache/spark/util/FileUtils.scala
+++ 
b/integration/spark-common/src/main/scala/org/apache/spark/util/FileUtils.scala
@@ -75,7 +75,7 @@ object FileUtils {
   val filePaths = inputPath.split(",")
   for (i <- 0 until filePaths.size) {
 val filePath = CarbonUtil.checkAndAppendHDFSUrl(filePaths(i))
-val carbonFile = FileFactory.getCarbonFile(filePaths(i), hadoopConf)
+val carbonFile = FileFactory.getCarbonFile(filePath, hadoopConf)
 if (!carbonFile.exists()) {
   throw new DataLoadingException(
 s"The input file does not exist: 
${CarbonUtil.removeAKSK(filePaths(i))}" )



carbondata git commit: [CARBONDATA-3041]Optimize load minimum size strategy for data loading

2018-10-29 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master db5da530e -> e2c517e3f


[CARBONDATA-3041]Optimize load minimum size strategy for data loading

this PR modifies the following points:
  1. Delete the system property carbon.load.min.size.enabled, change the
property load_min_size_inmb to a table property, and allow this property to
also be specified in the load option.
  2. Support alter table xxx set TBLPROPERTIES('load_min_size_inmb'='256')
  3. If a table is created with the property load_min_size_inmb, display this
property via the desc formatted command.

This closes #2864
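
A hypothetical sketch of the resolution order these points imply for
load_min_size_inmb; the precedence (load option over table property over the
default "0") and the method names here are assumptions for illustration, not
CarbonData's actual code.

import java.util.HashMap;
import java.util.Map;

public class LoadMinSizeResolution {
  static String resolveLoadMinSize(Map<String, String> loadOptions,
                                   Map<String, String> tableProperties) {
    String v = loadOptions.get("load_min_size_inmb");   // assumed to win if present
    if (v == null) {
      v = tableProperties.get("load_min_size_inmb");    // else fall back to table property
    }
    return v == null ? "0" : v;                         // "0" disables the strategy
  }

  public static void main(String[] args) {
    Map<String, String> tblProps = new HashMap<String, String>();
    tblProps.put("load_min_size_inmb", "256");
    // prints 256: the table property applies when no load option is given
    System.out.println(resolveLoadMinSize(new HashMap<String, String>(), tblProps));
  }
}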


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/e2c517e3
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/e2c517e3
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/e2c517e3

Branch: refs/heads/master
Commit: e2c517e3f9225b3b3ff9e55bcfee0d73fe943f01
Parents: db5da53
Author: ndwangsen 
Authored: Sat Oct 27 10:38:48 2018 +0800
Committer: xuchuanyin 
Committed: Mon Oct 29 16:20:56 2018 +0800

--
 .../core/constants/CarbonCommonConstants.java   |   3 +-
 .../constants/CarbonLoadOptionConstants.java|  10 -
 .../carbondata/core/util/CarbonProperties.java  |  11 -
 docs/configuration-parameters.md|   1 -
 docs/ddl-of-carbondata.md   |  19 +-
 .../dataload/TestTableLoadMinSize.scala |  62 +-
 .../carbondata/spark/util/CommonUtil.scala  |  28 +
 .../spark/sql/catalyst/CarbonDDLSqlParser.scala |   3 +
 .../spark/rdd/CarbonDataRDDFactory.scala|  20 +-
 .../table/CarbonDescribeFormattedCommand.scala  |   6 +
 .../org/apache/spark/util/AlterTableUtil.scala  |  20 +-
 .../loading/model/CarbonLoadModelBuilder.java   | 829 ++-
 .../processing/loading/model/LoadOption.java|   4 +-
 .../processing/util/CarbonLoaderUtil.java   |   4 +-
 14 files changed, 555 insertions(+), 465 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/e2c517e3/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
index 37e2aa1..bf4f7e5 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
@@ -980,8 +980,7 @@ public final class CarbonCommonConstants {
*/
   @CarbonProperty
   public static final String CARBON_LOAD_MIN_SIZE_INMB = "load_min_size_inmb";
-  public static final String CARBON_LOAD_MIN_NODE_SIZE_INMB_DEFAULT = "256";
-
+  public static final String CARBON_LOAD_MIN_SIZE_INMB_DEFAULT = "0";
   /**
*  the node minimum load data default value
*/

http://git-wip-us.apache.org/repos/asf/carbondata/blob/e2c517e3/core/src/main/java/org/apache/carbondata/core/constants/CarbonLoadOptionConstants.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonLoadOptionConstants.java
 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonLoadOptionConstants.java
index 82485ca..0d98cf4 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/constants/CarbonLoadOptionConstants.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/constants/CarbonLoadOptionConstants.java
@@ -151,14 +151,4 @@ public final class CarbonLoadOptionConstants {
   public static final String CARBON_LOAD_SORT_MEMORY_SPILL_PERCENTAGE
   = "carbon.load.sortmemory.spill.percentage";
   public static final String CARBON_LOAD_SORT_MEMORY_SPILL_PERCENTAGE_DEFAULT 
= "0";
-
-  /**
-   *  if loading data is too small, the original loading method will produce 
many small files.
-   *  enable set the node load minimum amount of data,avoid producing many 
small files.
-   *  This option is especially useful when you encounter a lot of small 
amounts of data.
-   */
-  @CarbonProperty
-  public static final String ENABLE_CARBON_LOAD_NODE_DATA_MIN_SIZE
-  = "carbon.load.min.size.enabled";
-  public static final String ENABLE_CARBON_LOAD_NODE_DATA_MIN_SIZE_DEFAULT = 
"false";
 }

http://git-wip-us.apache.org/repos/asf/carbondata/blob/e2c517e3/core/src/main/java/org/apache/carbondata/core/util/CarbonProperties.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/util/CarbonProperties.java 
b/core/src/main/java/org/apache/carbondata/core/util/CarbonProperties.java
index 49d89e7

carbondata git commit: [CARBONDATA-3050][Doc] Remove unused parameter doc

2018-10-29 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master 9de946673 -> 46d7595f4


[CARBONDATA-3050][Doc] Remove unused parameter doc

Remove documentation of parameter 'carbon.use.multiple.temp.dir'
Related PR:
#2824 - removed the parameter 'carbon.use.multiple.temp.dir'

This closes #2866


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/46d7595f
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/46d7595f
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/46d7595f

Branch: refs/heads/master
Commit: 46d7595f4bd8583cfc533820a92477f20765859c
Parents: 9de9466
Author: Manhua 
Authored: Mon Oct 29 10:08:15 2018 +0800
Committer: xuchuanyin 
Committed: Mon Oct 29 14:15:29 2018 +0800

--
 docs/configuration-parameters.md | 1 -
 1 file changed, 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/46d7595f/docs/configuration-parameters.md
--
diff --git a/docs/configuration-parameters.md b/docs/configuration-parameters.md
index 7a6dcab..de98c8d 100644
--- a/docs/configuration-parameters.md
+++ b/docs/configuration-parameters.md
@@ -84,7 +84,6 @@ This section provides the details of all the configurations 
required for the Car
 | carbon.cutOffTimestamp | (none) | CarbonData has capability to generate the 
Dictionary values for the timestamp columns from the data itself without the 
need to store the computed dictionary values. This configuration sets the start 
date for calculating the timestamp. Java counts the number of milliseconds from 
start of "1970-01-01 00:00:00". This property is used to customize the start of 
position. For example "2000-01-01 00:00:00". **NOTE:** The date must be in the 
form ***carbon.timestamp.format***. CarbonData supports storing data for upto 
68 years.For example, if the cut-off time is 1970-01-01 05:30:00, then data 
upto 2038-01-01 05:30:00 will be supported by CarbonData. |
 | carbon.timegranularity | SECOND | The configuration is used to specify the 
data granularity level such as DAY, HOUR, MINUTE, or SECOND. This helps to 
store more than 68 years of data into CarbonData. |
 | carbon.use.local.dir | true | CarbonData,during data loading, writes files 
to local temp directories before copying the files to HDFS. This configuration 
is used to specify whether CarbonData can write locally to tmp directory of the 
container or to the YARN application directory. |
-| carbon.use.multiple.temp.dir | false | When multiple disks are present in 
the system, YARN is generally configured with multiple disks to be used as temp 
directories for managing the containers. This configuration specifies whether 
to use multiple YARN local directories during data loading for disk IO load 
balancing.Enable ***carbon.use.local.dir*** for this configuration to take 
effect. **NOTE:** Data Loading is an IO intensive operation whose performance 
can be limited by the disk IO threshold, particularly during multi table 
concurrent data load.Configuring this parameter, balances the disk IO across 
multiple disks there by improving the over all load performance. |
 | carbon.sort.temp.compressor | (none) | CarbonData writes every 
***carbon.sort.size*** number of records to intermediate temp files during data 
loading to ensure memory footprint is within limits. These temporary files can 
be compressed and written in order to save the storage space. This 
configuration specifies the name of compressor to be used to compress the 
intermediate sort temp files during sort procedure in data loading. The valid 
values are 'SNAPPY','GZIP','BZIP2','LZ4','ZSTD' and empty. By default, empty 
means that Carbondata will not compress the sort temp files. **NOTE:** 
Compressor will be useful if you encounter disk bottleneck.Since the data needs 
to be compressed and decompressed,it involves additional CPU cycles,but is 
compensated by the high IO throughput due to less data to be written or read 
from the disks. |
 | carbon.load.skewedDataOptimization.enabled | false | During data 
loading,CarbonData would divide the number of blocks equally so as to ensure 
all executors process same number of blocks. This mechanism satisfies most of 
the scenarios and ensures maximum parallel processing for optimal data loading 
performance.In some business scenarios, there might be scenarios where the size 
of blocks vary significantly and hence some executors would have to do more 
work if they get blocks containing more data. This configuration enables size 
based block allocation strategy for data loading. When loading, carbondata will 
use file size based block allocation strategy for task distribution. It will 
make sure that all the executors process the same size of data.**NOTE

carbondata git commit: [CARBONDATA_3025]make cli compilable with java 1.7

2018-10-26 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master e6d15da74 -> b62b0fd9c


[CARBONDATA_3025]make cli compilable with java 1.7

This commit rewrites some JDK 1.8-style code (lambdas) into JDK 1.7-compatible
equivalents so the CLI module compiles with JDK 1.7.

This closes #2853
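
As an illustration of the rewrite, a minimal sketch of the
lambda-to-anonymous-class translation this commit applies (generic example, not
CarbonData code):

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class Jdk7Sort {
  public static void main(String[] args) {
    List<String> names = new ArrayList<String>(Arrays.asList("b", "a", "c"));

    // JDK 1.8 style, rejected by a 1.7 compiler:
    //   names.sort((o1, o2) -> o1.compareTo(o2));

    // JDK 1.7 equivalent, the form this commit switches to:
    Collections.sort(names, new Comparator<String>() {
      @Override
      public int compare(String o1, String o2) {
        return o1.compareTo(o2);
      }
    });
    System.out.println(names); // prints [a, b, c]
  }
}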


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/b62b0fd9
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/b62b0fd9
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/b62b0fd9

Branch: refs/heads/master
Commit: b62b0fd9ce0122735a24b1359e601af1e5ccb6b9
Parents: e6d15da
Author: akashrn5 
Authored: Wed Oct 24 17:35:00 2018 +0530
Committer: xuchuanyin 
Committed: Fri Oct 26 19:19:06 2018 +0800

--
 .../apache/carbondata/tool/FileCollector.java   | 20 +++---
 .../apache/carbondata/tool/ScanBenchmark.java   | 64 
 2 files changed, 48 insertions(+), 36 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/b62b0fd9/tools/cli/src/main/java/org/apache/carbondata/tool/FileCollector.java
--
diff --git 
a/tools/cli/src/main/java/org/apache/carbondata/tool/FileCollector.java 
b/tools/cli/src/main/java/org/apache/carbondata/tool/FileCollector.java
index aa48b93..b2ff061 100644
--- a/tools/cli/src/main/java/org/apache/carbondata/tool/FileCollector.java
+++ b/tools/cli/src/main/java/org/apache/carbondata/tool/FileCollector.java
@@ -18,11 +18,7 @@
 package org.apache.carbondata.tool;
 
 import java.io.IOException;
-import java.util.ArrayList;
-import java.util.HashSet;
-import java.util.LinkedHashMap;
-import java.util.List;
-import java.util.Set;
+import java.util.*;
 
 import org.apache.carbondata.common.Strings;
 import org.apache.carbondata.core.datastore.filesystem.CarbonFile;
@@ -77,13 +73,17 @@ class FileCollector {
 
   }
 }
-    unsortedFiles.sort((o1, o2) -> {
-      if (o1.getShardName().equalsIgnoreCase(o2.getShardName())) {
-        return Integer.parseInt(o1.getPartNo()) - Integer.parseInt(o2.getPartNo());
-      } else {
-        return o1.getShardName().compareTo(o2.getShardName());
+
+    Collections.sort(unsortedFiles, new Comparator<DataFile>() {
+      @Override public int compare(DataFile o1, DataFile o2) {
+        if (o1.getShardName().equalsIgnoreCase(o2.getShardName())) {
+          return Integer.parseInt(o1.getPartNo()) - Integer.parseInt(o2.getPartNo());
+        } else {
+          return o1.getShardName().compareTo(o2.getShardName());
+        }
       }
     });
+
 for (DataFile collectedFile : unsortedFiles) {
   this.dataFiles.put(collectedFile.getFilePath(), collectedFile);
 }

http://git-wip-us.apache.org/repos/asf/carbondata/blob/b62b0fd9/tools/cli/src/main/java/org/apache/carbondata/tool/ScanBenchmark.java
--
diff --git 
a/tools/cli/src/main/java/org/apache/carbondata/tool/ScanBenchmark.java 
b/tools/cli/src/main/java/org/apache/carbondata/tool/ScanBenchmark.java
index af5bdb3..ddb9652 100644
--- a/tools/cli/src/main/java/org/apache/carbondata/tool/ScanBenchmark.java
+++ b/tools/cli/src/main/java/org/apache/carbondata/tool/ScanBenchmark.java
@@ -72,57 +72,69 @@ class ScanBenchmark implements Command {
 }
 
 outPuts.add("\n## Benchmark");
-    AtomicReference<FileHeader> fileHeaderRef = new AtomicReference<>();
-    AtomicReference<FileFooter3> fileFoorterRef = new AtomicReference<>();
-    AtomicReference<DataFileFooter> convertedFooterRef = new AtomicReference<>();
+    final AtomicReference<FileHeader> fileHeaderRef = new AtomicReference<>();
+    final AtomicReference<FileFooter3> fileFoorterRef = new AtomicReference<>();
+    final AtomicReference<DataFileFooter> convertedFooterRef = new AtomicReference<>();

     // benchmark read header and footer time
-    benchmarkOperation("ReadHeaderAndFooter", () -> {
-      fileHeaderRef.set(file.readHeader());
-      fileFoorterRef.set(file.readFooter());
+    benchmarkOperation("ReadHeaderAndFooter", new Operation() {
+      @Override public void run() throws IOException, MemoryException {
+        fileHeaderRef.set(file.readHeader());
+        fileFoorterRef.set(file.readFooter());
+      }
     });
-    FileHeader fileHeader = fileHeaderRef.get();
-    FileFooter3 fileFooter = fileFoorterRef.get();
+    final FileHeader fileHeader = fileHeaderRef.get();
+    final FileFooter3 fileFooter = fileFoorterRef.get();

     // benchmark convert footer
-    benchmarkOperation("ConvertFooter", () -> {
-      convertFooter(fileHeader, fileFooter);
+    benchmarkOperation("ConvertFooter", new Operation() {
+      @Override public void run() throws IOException, MemoryException {
+        convertFooter(fileHeader, fileFooter);
+      }
     });
 
 

carbondata git commit: [CARBONDATA-3040][BloomDataMap] Fix bug for merging bloom index

2018-10-25 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master e4806b9a0 -> 33a6dc2ac


[CARBONDATA-3040][BloomDataMap] Fix bug for merging bloom index

Problem
There is a bug which causes query failure when we create two bloom datamaps on
the same table that already contains data.

Analyze
Since the table already has data, each create datamap triggers a rebuild
datamap task, which in turn triggers bloom index file merging. By debugging, we
found the first datamap's bloom index files would be merged twice, and the
second merge left the bloom index files empty.

The procedure goes as follows:

  1. create table
  2. load data
  3. create bloom datamap1: rebuild datamap1 for existing data, event listener
is triggered to merge index files for all bloom datamaps (currently only
datamap1)
  4. create bloom datamap2: rebuild datamap2 for existing data, event listener
is triggered to merge index files for all bloom datamaps (currently datamap1
and datamap2)

Because the event does not carry the information of which datamap was rebuilt,
it always merges all bloom datamaps. So datamap1's bloom index files would be
merged twice, but after the first merge only a mergeShard folder remains, so
the second merge has no input files and produces empty merged bloom index
files.

Solution
Send the datamap name in the rebuild event so the listener can filter and only
merge bloom index files for the specific datamap. Also add a file check on
whether mergeShard already exists before merging.

This closes #2851
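
A hypothetical sketch of the fail-safe steps described above, using plain
java.io.File rather than CarbonData's FileFactory; the marker file name and
helper names are illustrative, not the project's actual identifiers.

import java.io.File;
import java.io.IOException;

public class FailSafeMergeSketch {

  static void mergeBloomIndex(File segmentDir) throws IOException {
    File mergeShard = new File(segmentDir, "mergeShard");
    File inProgress = new File(segmentDir, "mergeBloomIndex.inprogress");

    // merged already and no merge in flight: skip; without this guard a
    // second rebuild event would re-merge an already-merged shard and empty it
    if (mergeShard.exists() && !inProgress.exists()) {
      return;
    }
    // clean up any half-finished attempt before starting over
    if (mergeShard.exists()) {
      deleteRecursively(mergeShard);
    }
    if (!inProgress.exists() && !inProgress.createNewFile()) {
      throw new IOException("Failed to create " + inProgress);
    }
    if (!mergeShard.mkdirs()) {
      throw new IOException("Failed to create directory " + mergeShard);
    }
    // ... merge per-shard bloom index files into mergeShard here ...
    // deleting the marker last is what publishes the merged shard to queries
    if (!inProgress.delete()) {
      throw new IOException("Failed to delete " + inProgress);
    }
  }

  private static void deleteRecursively(File f) throws IOException {
    File[] children = f.listFiles();
    if (children != null) {
      for (File c : children) {
        deleteRecursively(c);
      }
    }
    if (!f.delete()) {
      throw new IOException("Failed to delete " + f);
    }
  }
}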


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/33a6dc2a
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/33a6dc2a
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/33a6dc2a

Branch: refs/heads/master
Commit: 33a6dc2ac996cbb0bfb4f354d7fc80b297d652bb
Parents: e4806b9
Author: Manhua 
Authored: Wed Oct 24 16:20:13 2018 +0800
Committer: xuchuanyin 
Committed: Thu Oct 25 14:17:29 2018 +0800

--
 .../datamap/bloom/BloomIndexFileStore.java  | 20 +++-
 .../carbondata/events/DataMapEvents.scala   | 10 +-
 .../datamap/IndexDataMapRebuildRDD.scala|  2 +-
 .../spark/rdd/CarbonTableCompactor.scala|  2 +-
 .../events/MergeBloomIndexEventListener.scala   | 10 --
 .../management/CarbonLoadDataCommand.scala  |  2 +-
 6 files changed, 35 insertions(+), 11 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/33a6dc2a/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomIndexFileStore.java
--
diff --git 
a/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomIndexFileStore.java
 
b/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomIndexFileStore.java
index 17813ba..3d6ad9b 100644
--- 
a/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomIndexFileStore.java
+++ 
b/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomIndexFileStore.java
@@ -60,6 +60,9 @@ public class BloomIndexFileStore {
 
 
  public static void mergeBloomIndexFile(String dmSegmentPathString, List<String> indexCols) {
+
+// Step 1. check current folders
+
 // get all shard paths of old store
 CarbonFile segmentPath = FileFactory.getCarbonFile(dmSegmentPathString,
 FileFactory.getFileType(dmSegmentPathString));
@@ -72,6 +75,9 @@ public class BloomIndexFileStore {
 
 String mergeShardPath = dmSegmentPathString + File.separator + 
MERGE_BLOOM_INDEX_SHARD_NAME;
 String mergeInprogressFile = dmSegmentPathString + File.separator + 
MERGE_INPROGRESS_FILE;
+
+// Step 2. prepare for fail-safe merging
+
 try {
   // delete mergeShard folder if exists
   if (FileFactory.isFileExist(mergeShardPath)) {
@@ -87,10 +93,12 @@ public class BloomIndexFileStore {
 throw new RuntimeException("Failed to create directory " + 
mergeShardPath);
   }
 } catch (IOException e) {
-  LOGGER.error("Error occurs while create directory " + mergeShardPath, e);
-  throw new RuntimeException("Error occurs while create directory " + 
mergeShardPath);
+  throw new RuntimeException(e);
 }
 
+// Step 3. merge index files
+// Query won't use mergeShard until MERGE_INPROGRESS_FILE is deleted
+
 // for each index column, merge the bloomindex files from all shards into 
one
 for (String indexCol: indexCols) {
   String mergeIndexFile = getMergeBloomIndexFile(mergeShardPath, indexCol);
@@ -115,15 +123,17 @@ public class BloomIndexFileStore {
 }
   } catch (IOException e) {
 LOGGER.error("Error occurs while merge bloom index file of column: " + 
indexCol, e);
-// delete merge shard of bloom index for this segment when failed
+

carbondata git commit: [CARBONDATA-3002] Fix some spell error and remove the data after test case finished running

2018-10-19 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master 1be990f66 -> f810389fa


[CARBONDATA-3002] Fix some spell error and remove the data after test case 
finished running

1. fix spelling errors -- change 'retrive' to 'retrieve'
2. remove the data after each test case finishes by dropping the table and database
3. name dummy tables with a UUID to avoid errors when there are multiple CarbonReaders

This closes #2811
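
For point 3, a minimal sketch of deriving a per-reader dummy table name from a
UUID; the prefix and helper are illustrative, not CarbonData's actual naming.

import java.util.UUID;

public class UniqueTableNameExample {
  static String dummyTableName() {
    // each call yields a distinct name, so concurrent readers cannot collide
    return "_tempTable_" + UUID.randomUUID().toString().replace("-", "_");
  }

  public static void main(String[] args) {
    System.out.println(dummyTableName());
    System.out.println(dummyTableName()); // different every call
  }
}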


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/f810389f
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/f810389f
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/f810389f

Branch: refs/heads/master
Commit: f810389faf5ddf938b40e2c9a7be35d4a12d901e
Parents: 1be990f
Author: xubo245 
Authored: Thu Oct 11 18:13:14 2018 +0800
Committer: xuchuanyin 
Committed: Fri Oct 19 16:37:27 2018 +0800

--
 .../java/org/apache/carbondata/hadoop/CarbonRecordReader.java | 2 +-
 .../carbondata/presto/PrestoCarbonVectorizedRecordReader.java | 2 +-
 .../spark/vectorreader/VectorizedCarbonRecordReader.java  | 2 +-
 .../spark/testsuite/partition/TestAlterPartitionTable.scala   | 2 ++
 .../java/org/apache/carbondata/sdk/file/CarbonReader.java | 7 +++
 5 files changed, 8 insertions(+), 7 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/f810389f/hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonRecordReader.java
--
diff --git 
a/hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonRecordReader.java 
b/hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonRecordReader.java
index 0d38906..d447320 100644
--- a/hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonRecordReader.java
+++ b/hadoop/src/main/java/org/apache/carbondata/hadoop/CarbonRecordReader.java
@@ -117,7 +117,7 @@ public class CarbonRecordReader extends 
AbstractRecordReader {
   }
 
   @Override public float getProgress() throws IOException, 
InterruptedException {
-// TODO : Implement it based on total number of rows it is going to 
retrive.
+// TODO : Implement it based on total number of rows it is going to 
retrieve.
 return 0;
   }
 

http://git-wip-us.apache.org/repos/asf/carbondata/blob/f810389f/integration/presto/src/main/java/org/apache/carbondata/presto/PrestoCarbonVectorizedRecordReader.java
--
diff --git 
a/integration/presto/src/main/java/org/apache/carbondata/presto/PrestoCarbonVectorizedRecordReader.java
 
b/integration/presto/src/main/java/org/apache/carbondata/presto/PrestoCarbonVectorizedRecordReader.java
index 9935b54..4e2d36c 100644
--- 
a/integration/presto/src/main/java/org/apache/carbondata/presto/PrestoCarbonVectorizedRecordReader.java
+++ 
b/integration/presto/src/main/java/org/apache/carbondata/presto/PrestoCarbonVectorizedRecordReader.java
@@ -169,7 +169,7 @@ class PrestoCarbonVectorizedRecordReader extends 
AbstractRecordReader {
   }
 
   @Override public float getProgress() throws IOException, 
InterruptedException {
-// TODO : Implement it based on total number of rows it is going to 
retrive.
+// TODO : Implement it based on total number of rows it is going to 
retrieve.
 return 0;
   }
 

http://git-wip-us.apache.org/repos/asf/carbondata/blob/f810389f/integration/spark-datasource/src/main/scala/org/apache/carbondata/spark/vectorreader/VectorizedCarbonRecordReader.java
--
diff --git 
a/integration/spark-datasource/src/main/scala/org/apache/carbondata/spark/vectorreader/VectorizedCarbonRecordReader.java
 
b/integration/spark-datasource/src/main/scala/org/apache/carbondata/spark/vectorreader/VectorizedCarbonRecordReader.java
index 779c62f..839a8a0 100644
--- 
a/integration/spark-datasource/src/main/scala/org/apache/carbondata/spark/vectorreader/VectorizedCarbonRecordReader.java
+++ 
b/integration/spark-datasource/src/main/scala/org/apache/carbondata/spark/vectorreader/VectorizedCarbonRecordReader.java
@@ -215,7 +215,7 @@ public class VectorizedCarbonRecordReader extends 
AbstractRecordReader {
 
   @Override
   public float getProgress() throws IOException, InterruptedException {
-// TODO : Implement it based on total number of rows it is going to 
retrive.
+// TODO : Implement it based on total number of rows it is going to 
retrieve.
 return 0;
   }
 

http://git-wip-us.apache.org/repos/asf/carbondata/blob/f810389f/integration/spark2/src/test/scala/org/apache/carbondata/spark/testsuite/partition/TestAlterPartitionTable.scala
--
diff --git 
a/integration/spark2/src/test/scala/org/apache/carbondata/spark/testsuite/partition/TestAlterPartitionTable.scala

[1/6] carbondata git commit: [CARBONDATA-3024] Refactor to use log4j Logger directly

2018-10-17 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master 15d38260c -> 06adb5a03


http://git-wip-us.apache.org/repos/asf/carbondata/blob/06adb5a0/processing/src/main/java/org/apache/carbondata/processing/loading/steps/DataWriterBatchProcessorStepImpl.java
--
diff --git 
a/processing/src/main/java/org/apache/carbondata/processing/loading/steps/DataWriterBatchProcessorStepImpl.java
 
b/processing/src/main/java/org/apache/carbondata/processing/loading/steps/DataWriterBatchProcessorStepImpl.java
index 26ae2d7..694b345 100644
--- 
a/processing/src/main/java/org/apache/carbondata/processing/loading/steps/DataWriterBatchProcessorStepImpl.java
+++ 
b/processing/src/main/java/org/apache/carbondata/processing/loading/steps/DataWriterBatchProcessorStepImpl.java
@@ -20,7 +20,6 @@ import java.io.IOException;
 import java.util.Iterator;
 import java.util.Map;
 
-import org.apache.carbondata.common.logging.LogService;
 import org.apache.carbondata.common.logging.LogServiceFactory;
 import 
org.apache.carbondata.core.datastore.exception.CarbonDataWriterException;
 import org.apache.carbondata.core.datastore.row.CarbonRow;
@@ -40,6 +39,8 @@ import 
org.apache.carbondata.processing.store.CarbonFactHandler;
 import org.apache.carbondata.processing.store.CarbonFactHandlerFactory;
 import org.apache.carbondata.processing.util.CarbonDataProcessorUtil;
 
+import org.apache.log4j.Logger;
+
 /**
  * It reads data from batch of sorted files(it could be in-memory/disk based 
files)
  * which are generated in previous sort step. And it writes data to carbondata 
file.
@@ -47,7 +48,7 @@ import 
org.apache.carbondata.processing.util.CarbonDataProcessorUtil;
  */
 public class DataWriterBatchProcessorStepImpl extends 
AbstractDataLoadProcessorStep {
 
-  private static final LogService LOGGER =
+  private static final Logger LOGGER =
   
LogServiceFactory.getLogService(DataWriterBatchProcessorStepImpl.class.getName());
 
   private Map localDictionaryGeneratorMap;
@@ -112,7 +113,7 @@ public class DataWriterBatchProcessorStepImpl extends 
AbstractDataLoadProcessorS
 i++;
   }
 } catch (Exception e) {
-  LOGGER.error(e, "Failed for table: " + tableName + " in 
DataWriterBatchProcessorStepImpl");
+  LOGGER.error("Failed for table: " + tableName + " in 
DataWriterBatchProcessorStepImpl", e);
   if (e.getCause() instanceof BadRecordFoundException) {
 throw new BadRecordFoundException(e.getCause().getMessage());
   }
@@ -132,7 +133,7 @@ public class DataWriterBatchProcessorStepImpl extends 
AbstractDataLoadProcessorS
 } catch (Exception e) {
   // if throw exception from here dataHandler will not be closed.
   // so just holding exception and later throwing exception
-  LOGGER.error(e, "Failed for table: " + tableName + " in  finishing data 
handler");
+  LOGGER.error("Failed for table: " + tableName + " in  finishing data 
handler", e);
   exception = new CarbonDataWriterException(
   "Failed for table: " + tableName + " in  finishing data handler", e);
 }

http://git-wip-us.apache.org/repos/asf/carbondata/blob/06adb5a0/processing/src/main/java/org/apache/carbondata/processing/loading/steps/DataWriterProcessorStepImpl.java
--
diff --git 
a/processing/src/main/java/org/apache/carbondata/processing/loading/steps/DataWriterProcessorStepImpl.java
 
b/processing/src/main/java/org/apache/carbondata/processing/loading/steps/DataWriterProcessorStepImpl.java
index cc038b9..3d704c9 100644
--- 
a/processing/src/main/java/org/apache/carbondata/processing/loading/steps/DataWriterProcessorStepImpl.java
+++ 
b/processing/src/main/java/org/apache/carbondata/processing/loading/steps/DataWriterProcessorStepImpl.java
@@ -29,7 +29,6 @@ import java.util.concurrent.Executors;
 import java.util.concurrent.Future;
 import java.util.concurrent.TimeUnit;
 
-import org.apache.carbondata.common.logging.LogService;
 import org.apache.carbondata.common.logging.LogServiceFactory;
 import 
org.apache.carbondata.core.datastore.exception.CarbonDataWriterException;
 import org.apache.carbondata.core.datastore.row.CarbonRow;
@@ -50,13 +49,15 @@ import 
org.apache.carbondata.processing.store.CarbonFactHandler;
 import org.apache.carbondata.processing.store.CarbonFactHandlerFactory;
 import org.apache.carbondata.processing.util.CarbonDataProcessorUtil;
 
+import org.apache.log4j.Logger;
+
 /**
  * It reads data from sorted files which are generated in previous sort step.
  * And it writes data to carbondata file. It also generates mdk key while 
writing to carbondata file
  */
 public class DataWriterProcessorStepImpl extends AbstractDataLoadProcessorStep 
{
 
-  private static final LogService LOGGER =
+  private static final Logger LOGGER =
   
LogServiceFactory.getLogService(DataWriterProcessorStepImpl.class.getName());
 
   private long readCounter;
@@ 

[5/6] carbondata git commit: [CARBONDATA-3024] Refactor to use log4j Logger directly

2018-10-17 Thread xuchuanyin
http://git-wip-us.apache.org/repos/asf/carbondata/blob/06adb5a0/core/src/main/java/org/apache/carbondata/core/dictionary/server/NonSecureDictionaryServerHandler.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/dictionary/server/NonSecureDictionaryServerHandler.java
 
b/core/src/main/java/org/apache/carbondata/core/dictionary/server/NonSecureDictionaryServerHandler.java
index 82efe80..0f076a4 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/dictionary/server/NonSecureDictionaryServerHandler.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/dictionary/server/NonSecureDictionaryServerHandler.java
@@ -16,7 +16,6 @@
  */
 package org.apache.carbondata.core.dictionary.server;
 
-import org.apache.carbondata.common.logging.LogService;
 import org.apache.carbondata.common.logging.LogServiceFactory;
 import 
org.apache.carbondata.core.dictionary.generator.ServerDictionaryGenerator;
 import org.apache.carbondata.core.dictionary.generator.key.DictionaryMessage;
@@ -26,6 +25,7 @@ import io.netty.buffer.ByteBuf;
 import io.netty.channel.ChannelHandler;
 import io.netty.channel.ChannelHandlerContext;
 import io.netty.channel.ChannelInboundHandlerAdapter;
+import org.apache.log4j.Logger;
 
 /**
  * Handler for Dictionary server.
@@ -33,7 +33,7 @@ import io.netty.channel.ChannelInboundHandlerAdapter;
 @ChannelHandler.Sharable public class NonSecureDictionaryServerHandler
 extends ChannelInboundHandlerAdapter {
 
-  private static final LogService LOGGER =
+  private static final Logger LOGGER =
   
LogServiceFactory.getLogService(NonSecureDictionaryServerHandler.class.getName());
 
   /**
@@ -77,7 +77,7 @@ import io.netty.channel.ChannelInboundHandlerAdapter;
* @param cause
*/
   @Override public void exceptionCaught(ChannelHandlerContext ctx, Throwable 
cause) {
-LOGGER.error(cause, "exceptionCaught");
+LOGGER.error("exceptionCaught", cause);
 ctx.close();
   }
 

http://git-wip-us.apache.org/repos/asf/carbondata/blob/06adb5a0/core/src/main/java/org/apache/carbondata/core/dictionary/service/AbstractDictionaryServer.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/dictionary/service/AbstractDictionaryServer.java
 
b/core/src/main/java/org/apache/carbondata/core/dictionary/service/AbstractDictionaryServer.java
index 754f253..5703051 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/dictionary/service/AbstractDictionaryServer.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/dictionary/service/AbstractDictionaryServer.java
@@ -27,13 +27,12 @@ import java.util.Collections;
 import java.util.Enumeration;
 import java.util.List;
 
-import org.apache.carbondata.common.logging.LogService;
-
 import org.apache.commons.lang3.SystemUtils;
+import org.apache.log4j.Logger;
 
 public abstract class AbstractDictionaryServer {
 
-  public String findLocalIpAddress(LogService LOGGER) {
+  public String findLocalIpAddress(Logger LOGGER) {
 try {
   String defaultIpOverride = System.getenv("SPARK_LOCAL_IP");
   if (defaultIpOverride != null) {

http://git-wip-us.apache.org/repos/asf/carbondata/blob/06adb5a0/core/src/main/java/org/apache/carbondata/core/fileoperations/AtomicFileOperationsImpl.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/fileoperations/AtomicFileOperationsImpl.java
 
b/core/src/main/java/org/apache/carbondata/core/fileoperations/AtomicFileOperationsImpl.java
index f9f8647..13f10d7 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/fileoperations/AtomicFileOperationsImpl.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/fileoperations/AtomicFileOperationsImpl.java
@@ -21,7 +21,6 @@ import java.io.DataInputStream;
 import java.io.DataOutputStream;
 import java.io.IOException;
 
-import org.apache.carbondata.common.logging.LogService;
 import org.apache.carbondata.common.logging.LogServiceFactory;
 import org.apache.carbondata.core.constants.CarbonCommonConstants;
 import org.apache.carbondata.core.datastore.filesystem.CarbonFile;
@@ -29,12 +28,14 @@ import 
org.apache.carbondata.core.datastore.impl.FileFactory;
 import org.apache.carbondata.core.datastore.impl.FileFactory.FileType;
 import org.apache.carbondata.core.util.CarbonUtil;
 
+import org.apache.log4j.Logger;
+
 class AtomicFileOperationsImpl implements AtomicFileOperations {
 
   /**
* Logger instance
*/
-  private static final LogService LOGGER =
+  private static final Logger LOGGER =
   
LogServiceFactory.getLogService(AtomicFileOperationsImpl.class.getName());
   private String filePath;
 

http://git-wip-us.apache.org/repos/asf/carbondata/blob/06adb5a0/core/src/main/java/org/apache/carbondata/core/indexstore/BlockletDataMapIndexStore.java

[2/6] carbondata git commit: [CARBONDATA-3024] Refactor to use log4j Logger directly

2018-10-17 Thread xuchuanyin
http://git-wip-us.apache.org/repos/asf/carbondata/blob/06adb5a0/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableAddColumnCommand.scala
--
diff --git 
a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableAddColumnCommand.scala
 
b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableAddColumnCommand.scala
index 22ff5c4..1f1e7bd 100644
--- 
a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableAddColumnCommand.scala
+++ 
b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableAddColumnCommand.scala
@@ -25,7 +25,8 @@ import org.apache.spark.sql.hive.CarbonSessionCatalog
 import org.apache.spark.util.AlterTableUtil
 
 import 
org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException
-import org.apache.carbondata.common.logging.{LogService, LogServiceFactory}
+import org.apache.carbondata.common.logging.LogServiceFactory
+import org.apache.carbondata.common.logging.impl.Audit
 import org.apache.carbondata.core.features.TableOperation
 import org.apache.carbondata.core.locks.{ICarbonLock, LockUsage}
 import 
org.apache.carbondata.core.metadata.converter.ThriftWrapperSchemaConverterImpl
@@ -39,11 +40,11 @@ private[sql] case class CarbonAlterTableAddColumnCommand(
   extends MetadataCommand {
 
   override def processMetadata(sparkSession: SparkSession): Seq[Row] = {
-    val LOGGER: LogService = LogServiceFactory.getLogService(this.getClass.getCanonicalName)
+    val LOGGER = LogServiceFactory.getLogService(this.getClass.getCanonicalName)
 val tableName = alterTableAddColumnsModel.tableName
 val dbName = alterTableAddColumnsModel.databaseName
   .getOrElse(sparkSession.catalog.currentDatabase)
-LOGGER.audit(s"Alter table add columns request has been received for 
$dbName.$tableName")
+Audit.log(LOGGER, s"Alter table add columns request has been received for 
$dbName.$tableName")
 val locksToBeAcquired = List(LockUsage.METADATA_LOCK, 
LockUsage.COMPACTION_LOCK)
 var locks = List.empty[ICarbonLock]
 var timeStamp = 0L
@@ -104,10 +105,10 @@ private[sql] case class CarbonAlterTableAddColumnCommand(
   carbonTable, alterTableAddColumnsModel)
   OperationListenerBus.getInstance.fireEvent(alterTablePostExecutionEvent, 
operationContext)
   LOGGER.info(s"Alter table for add columns is successful for table 
$dbName.$tableName")
-  LOGGER.audit(s"Alter table for add columns is successful for table 
$dbName.$tableName")
+  Audit.log(LOGGER, s"Alter table for add columns is successful for table 
$dbName.$tableName")
 } catch {
   case e: Exception =>
-LOGGER.error(e, "Alter table add columns failed")
+LOGGER.error("Alter table add columns failed", e)
 if (newCols.nonEmpty) {
   LOGGER.info("Cleaning up the dictionary files as alter table add 
operation failed")
   new AlterTableDropColumnRDD(sparkSession,

http://git-wip-us.apache.org/repos/asf/carbondata/blob/06adb5a0/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableDataTypeChangeCommand.scala
--
diff --git 
a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableDataTypeChangeCommand.scala
 
b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableDataTypeChangeCommand.scala
index 9ce79e9..716b9c9 100644
--- 
a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableDataTypeChangeCommand.scala
+++ 
b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableDataTypeChangeCommand.scala
@@ -25,7 +25,8 @@ import org.apache.spark.sql.hive.CarbonSessionCatalog
 import org.apache.spark.util.AlterTableUtil
 
 import 
org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException
-import org.apache.carbondata.common.logging.{LogService, LogServiceFactory}
+import org.apache.carbondata.common.logging.LogServiceFactory
+import org.apache.carbondata.common.logging.impl.Audit
 import org.apache.carbondata.core.features.TableOperation
 import org.apache.carbondata.core.locks.{ICarbonLock, LockUsage}
 import 
org.apache.carbondata.core.metadata.converter.ThriftWrapperSchemaConverterImpl
@@ -40,11 +41,12 @@ private[sql] case class 
CarbonAlterTableDataTypeChangeCommand(
   extends MetadataCommand {
 
   override def processMetadata(sparkSession: SparkSession): Seq[Row] = {
-    val LOGGER: LogService = LogServiceFactory.getLogService(this.getClass.getCanonicalName)
+    val LOGGER = LogServiceFactory.getLogService(this.getClass.getCanonicalName)
 val tableName = 

[6/6] carbondata git commit: [CARBONDATA-3024] Refactor to use log4j Logger directly

2018-10-17 Thread xuchuanyin
[CARBONDATA-3024] Refactor to use log4j Logger directly

Currently CarbonData's logs print the line number inside StandardLogService,
which is bad for maintainability. A better way is to use the log4j Logger
directly, so the printed line number points to the place where the logging
call is actually made.

This closes #2827
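
A minimal sketch of the pattern after the refactor; CarbonData obtains the
Logger via LogServiceFactory.getLogService as shown in the diffs below, while
plain Logger.getLogger keeps this example self-contained. Note the Throwable
now comes last, matching log4j's native error(Object, Throwable) signature.

import org.apache.log4j.Logger;

public class LoggerExample {
  private static final Logger LOGGER =
      Logger.getLogger(LoggerExample.class.getName());

  public static void main(String[] args) {
    try {
      throw new IllegalStateException("boom");
    } catch (Exception e) {
      // old LogService style was LOGGER.error(e, msg); log4j takes the
      // message first and the Throwable second
      LOGGER.error("Failed for table: t1 in DataWriterBatchProcessorStepImpl", e);
    }
  }
}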


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/06adb5a0
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/06adb5a0
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/06adb5a0

Branch: refs/heads/master
Commit: 06adb5a0376255c15f8c393257d6db0736e35f31
Parents: 15d3826
Author: Jacky Li 
Authored: Wed Oct 17 20:27:15 2018 +0800
Committer: xuchuanyin 
Committed: Thu Oct 18 09:56:12 2018 +0800

--
 .../carbondata/common/logging/LogService.java   |  80 ++-
 .../common/logging/LogServiceFactory.java   |   6 +-
 .../carbondata/common/logging/impl/Audit.java   |  49 
 .../common/logging/impl/StandardLogService.java | 236 ---
 .../logging/LogServiceFactoryTest_UT.java   |   7 +-
 .../logging/ft/LoggingServiceTest_FT.java   |   9 +-
 .../logging/impl/StandardLogServiceTest_UT.java | 140 ---
 .../carbondata/core/cache/CacheProvider.java|   5 +-
 .../carbondata/core/cache/CarbonLRUCache.java   |   5 +-
 .../dictionary/ForwardDictionaryCache.java  |   5 +-
 .../dictionary/ManageDictionaryAndBTree.java|   5 +-
 .../dictionary/ReverseDictionaryCache.java  |   5 +-
 .../core/constants/CarbonVersionConstants.java  |   9 +-
 .../core/datamap/DataMapStoreManager.java   |  10 +-
 .../carbondata/core/datamap/DataMapUtil.java|   4 +-
 .../status/DiskBasedDataMapStatusProvider.java  |  17 +-
 .../block/SegmentPropertiesAndSchemaHolder.java |   5 +-
 .../blocklet/BlockletEncodedColumnPage.java |   5 +-
 .../datastore/compression/SnappyCompressor.java |  32 +--
 .../filesystem/AbstractDFSCarbonFile.java   |   4 +-
 .../datastore/filesystem/AlluxioCarbonFile.java |   6 +-
 .../datastore/filesystem/HDFSCarbonFile.java|   4 +-
 .../datastore/filesystem/LocalCarbonFile.java   |   7 +-
 .../core/datastore/filesystem/S3CarbonFile.java |   4 +-
 .../datastore/filesystem/ViewFSCarbonFile.java  |   4 +-
 .../core/datastore/impl/FileFactory.java|   4 +-
 .../datastore/page/LocalDictColumnPage.java |   9 +-
 .../page/encoding/ColumnPageEncoder.java|   5 +-
 .../client/NonSecureDictionaryClient.java   |   7 +-
 .../NonSecureDictionaryClientHandler.java   |  13 +-
 .../IncrementalColumnDictionaryGenerator.java   |   7 +-
 .../generator/TableDictionaryGenerator.java |   5 +-
 .../server/NonSecureDictionaryServer.java   |   6 +-
 .../NonSecureDictionaryServerHandler.java   |   6 +-
 .../service/AbstractDictionaryServer.java   |   5 +-
 .../AtomicFileOperationsImpl.java   |   5 +-
 .../indexstore/BlockletDataMapIndexStore.java   |   4 +-
 .../core/indexstore/BlockletDetailInfo.java |   8 +-
 .../indexstore/blockletindex/BlockDataMap.java  |  13 +-
 .../blockletindex/SegmentIndexFileStore.java|   4 +-
 .../DateDirectDictionaryGenerator.java  |   5 +-
 .../TimeStampDirectDictionaryGenerator.java |   5 +-
 .../core/locks/CarbonLockFactory.java   |   5 +-
 .../carbondata/core/locks/CarbonLockUtil.java   |   4 +-
 .../carbondata/core/locks/HdfsFileLock.java |   5 +-
 .../carbondata/core/locks/LocalFileLock.java|   5 +-
 .../carbondata/core/locks/S3FileLock.java   |   7 +-
 .../carbondata/core/locks/ZooKeeperLocking.java |  10 +-
 .../carbondata/core/locks/ZookeeperInit.java|   4 +-
 .../core/memory/IntPointerBuffer.java   |   5 +-
 .../core/memory/UnsafeMemoryManager.java|   5 +-
 .../core/memory/UnsafeSortMemoryManager.java|   5 +-
 .../core/metadata/SegmentFileStore.java |   4 +-
 .../core/metadata/schema/table/CarbonTable.java |   4 +-
 .../core/metadata/schema/table/TableInfo.java   |   5 +-
 .../core/mutate/CarbonUpdateUtil.java   |   4 +-
 .../core/mutate/DeleteDeltaBlockDetails.java|   5 +-
 .../core/mutate/SegmentUpdateDetails.java   |   5 +-
 .../reader/CarbonDeleteDeltaFileReaderImpl.java |   8 -
 .../reader/CarbonDeleteFilesDataReader.java |   4 +-
 .../CarbonDictionarySortIndexReaderImpl.java|   6 +-
 .../scan/collector/ResultCollectorFactory.java  |  16 +-
 .../RestructureBasedRawResultCollector.java |   8 -
 .../executor/impl/AbstractQueryExecutor.java|   8 +-
 .../impl/SearchModeDetailQueryExecutor.java |   4 +-
 .../SearchModeVectorDetailQueryExecutor.java|   4 +-
 .../expression/RangeExpressionEvaluator.java|   5 +-
 .../scan/filter/FilterExpressionProcessor.java  |   5 +-
 .../carbondata/core/scan/filter/FilterUtil.java |   6 +-
 .../executer/RowLevelFilterExecuterImpl.java|   5 +-
 .../core/scan/model

[4/6] carbondata git commit: [CARBONDATA-3024] Refactor to use log4j Logger directly

2018-10-17 Thread xuchuanyin
http://git-wip-us.apache.org/repos/asf/carbondata/blob/06adb5a0/core/src/main/java/org/apache/carbondata/core/util/ObjectSizeCalculator.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/util/ObjectSizeCalculator.java 
b/core/src/main/java/org/apache/carbondata/core/util/ObjectSizeCalculator.java
index 513e786..3d63560 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/util/ObjectSizeCalculator.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/util/ObjectSizeCalculator.java
@@ -19,9 +19,10 @@ package org.apache.carbondata.core.util;
 
 import java.lang.reflect.Method;
 
-import org.apache.carbondata.common.logging.LogService;
 import org.apache.carbondata.common.logging.LogServiceFactory;
 
+import org.apache.log4j.Logger;
+
 /**
  * This wrapper class is created so that core doesnt have direct dependency on 
spark
  * TODO: Need to have carbon implementation if carbon needs to be used without 
spark
@@ -30,7 +31,7 @@ public final class ObjectSizeCalculator {
   /**
* Logger object for the class
*/
-  private static final LogService LOGGER =
+  private static final Logger LOGGER =
   LogServiceFactory.getLogService(ObjectSizeCalculator.class.getName());
 
   /**
@@ -63,7 +64,7 @@ public final class ObjectSizeCalculator {
 } catch (Throwable ex) {
   // throwable is being caught as external interface is being invoked 
through reflection
   // and runtime exceptions might get thrown
-  LOGGER.error(ex, "Could not access method 
SizeEstimator:estimate.Returning default value");
+  LOGGER.error("Could not access method SizeEstimator:estimate.Returning 
default value", ex);
   methodAccessible = false;
   return defValue;
 }

http://git-wip-us.apache.org/repos/asf/carbondata/blob/06adb5a0/core/src/main/java/org/apache/carbondata/core/util/SessionParams.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/util/SessionParams.java 
b/core/src/main/java/org/apache/carbondata/core/util/SessionParams.java
index 931e106..027e6cb 100644
--- a/core/src/main/java/org/apache/carbondata/core/util/SessionParams.java
+++ b/core/src/main/java/org/apache/carbondata/core/util/SessionParams.java
@@ -23,8 +23,8 @@ import java.util.Map;
 import java.util.concurrent.ConcurrentHashMap;
 
 import org.apache.carbondata.common.constants.LoggerAction;
-import org.apache.carbondata.common.logging.LogService;
 import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.common.logging.impl.Audit;
 import org.apache.carbondata.core.cache.CacheProvider;
 import org.apache.carbondata.core.constants.CarbonCommonConstants;
 import org.apache.carbondata.core.constants.CarbonCommonConstantsInternal;
@@ -56,12 +56,14 @@ import static 
org.apache.carbondata.core.constants.CarbonLoadOptionConstants.CAR
 import static 
org.apache.carbondata.core.constants.CarbonLoadOptionConstants.CARBON_OPTIONS_TIMESTAMPFORMAT;
 import static 
org.apache.carbondata.core.constants.CarbonV3DataFormatConstants.BLOCKLET_SIZE_IN_MB;
 
+import org.apache.log4j.Logger;
+
 /**
  * This class maintains carbon session params
  */
 public class SessionParams implements Serializable, Cloneable {
 
-  private static final LogService LOGGER =
+  private static final Logger LOGGER =
   LogServiceFactory.getLogService(CacheProvider.class.getName());
   private static final long serialVersionUID = -7801994600594915264L;
 
@@ -124,7 +126,8 @@ public class SessionParams implements Serializable, 
Cloneable {
 value = value.toUpperCase();
   }
   if (doAuditing) {
-LOGGER.audit("The key " + key + " with value " + value + " added in 
the session param");
+Audit.log(LOGGER,
+"The key " + key + " with value " + value + " added in the session 
param");
   }
   sProps.put(key, value);
 }

http://git-wip-us.apache.org/repos/asf/carbondata/blob/06adb5a0/core/src/main/java/org/apache/carbondata/core/util/TaskMetricsMap.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/util/TaskMetricsMap.java 
b/core/src/main/java/org/apache/carbondata/core/util/TaskMetricsMap.java
index 16dacb2..196fd64 100644
--- a/core/src/main/java/org/apache/carbondata/core/util/TaskMetricsMap.java
+++ b/core/src/main/java/org/apache/carbondata/core/util/TaskMetricsMap.java
@@ -23,17 +23,17 @@ import java.util.Map;
 import java.util.concurrent.ConcurrentHashMap;
 import java.util.concurrent.CopyOnWriteArrayList;
 
-import org.apache.carbondata.common.logging.LogService;
 import org.apache.carbondata.common.logging.LogServiceFactory;
 
 import org.apache.hadoop.fs.FileSystem;
+import org.apache.log4j.Logger;
 
 /**
  * This class maintains task level metrics info for all spawned child threads 
and parent task thread
  */
 

[3/6] carbondata git commit: [CARBONDATA-3024] Refactor to use log4j Logger directly

2018-10-17 Thread xuchuanyin
http://git-wip-us.apache.org/repos/asf/carbondata/blob/06adb5a0/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/StreamHandoffRDD.scala
--
diff --git 
a/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/StreamHandoffRDD.scala
 
b/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/StreamHandoffRDD.scala
index 57b2e44..3f0eb71 100644
--- 
a/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/StreamHandoffRDD.scala
+++ 
b/integration/spark-common/src/main/scala/org/apache/carbondata/spark/rdd/StreamHandoffRDD.scala
@@ -28,7 +28,9 @@ import org.apache.spark.{Partition, SerializableWritable, 
SparkContext, TaskCont
 import org.apache.spark.sql.SparkSession
 import org.apache.spark.sql.catalyst.expressions.GenericInternalRow
 
+import org.apache.carbondata.api.CarbonStore.LOGGER
 import org.apache.carbondata.common.logging.LogServiceFactory
+import org.apache.carbondata.common.logging.impl.Audit
 import org.apache.carbondata.converter.SparkDataTypeConverterImpl
 import org.apache.carbondata.core.datamap.Segment
 import org.apache.carbondata.core.datastore.block.SegmentProperties
@@ -335,7 +337,7 @@ object StreamHandoffRDD {
 } catch {
   case ex: Exception =>
 loadStatus = SegmentStatus.LOAD_FAILURE
-LOGGER.error(ex, s"Handoff failed on streaming segment 
$handoffSegmenId")
+LOGGER.error(s"Handoff failed on streaming segment $handoffSegmenId", 
ex)
 errorMessage = errorMessage + ": " + ex.getCause.getMessage
 LOGGER.error(errorMessage)
 }
@@ -345,7 +347,7 @@ object StreamHandoffRDD {
   LOGGER.info("starting clean up**")
   CarbonLoaderUtil.deleteSegment(carbonLoadModel, 
carbonLoadModel.getSegmentId.toInt)
   LOGGER.info("clean up done**")
-  LOGGER.audit(s"Handoff is failed for " +
+  Audit.log(LOGGER, s"Handoff is failed for " +
s"${ carbonLoadModel.getDatabaseName }.${ 
carbonLoadModel.getTableName }")
   LOGGER.error("Cannot write load metadata file as handoff failed")
   throw new Exception(errorMessage)
@@ -367,7 +369,7 @@ object StreamHandoffRDD {
 .fireEvent(loadTablePostStatusUpdateEvent, operationContext)
   if (!done) {
 val errorMessage = "Handoff failed due to failure in table status 
updation."
-LOGGER.audit("Handoff is failed for " +
+Audit.log(LOGGER, "Handoff is failed for " +
  s"${ carbonLoadModel.getDatabaseName }.${ 
carbonLoadModel.getTableName }")
 LOGGER.error("Handoff failed due to failure in table status updation.")
 throw new Exception(errorMessage)

http://git-wip-us.apache.org/repos/asf/carbondata/blob/06adb5a0/integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/CarbonScalaUtil.scala
--
diff --git 
a/integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/CarbonScalaUtil.scala
 
b/integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/CarbonScalaUtil.scala
index 2cc2a5b..1e8d148 100644
--- 
a/integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/CarbonScalaUtil.scala
+++ 
b/integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/CarbonScalaUtil.scala
@@ -27,6 +27,7 @@ import scala.collection.mutable
 import scala.util.Try
 
 import com.univocity.parsers.common.TextParsingException
+import org.apache.log4j.Logger
 import org.apache.spark.SparkException
 import org.apache.spark.sql._
 import 
org.apache.spark.sql.carbondata.execution.datasources.CarbonSparkDataSourceUtil
@@ -363,7 +364,7 @@ object CarbonScalaUtil {
   /**
* Retrieve error message from exception
*/
-  def retrieveAndLogErrorMsg(ex: Throwable, logger: LogService): (String, String) = {
+  def retrieveAndLogErrorMsg(ex: Throwable, logger: Logger): (String, String) = {
 var errorMessage = "DataLoad failure"
 var executorMessage = ""
 if (ex != null) {

http://git-wip-us.apache.org/repos/asf/carbondata/blob/06adb5a0/integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/GlobalDictionaryUtil.scala
--
diff --git 
a/integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/GlobalDictionaryUtil.scala
 
b/integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/GlobalDictionaryUtil.scala
index 67c4c9b..704382f 100644
--- 
a/integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/GlobalDictionaryUtil.scala
+++ 
b/integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/GlobalDictionaryUtil.scala
@@ -696,17 +696,17 @@ object GlobalDictionaryUtil {
 } catch {
   case ex: Exception =>
 if (ex.getCause != null && 

carbondata git commit: [CARBONDATA-2974] Fixed multiple expressions issue on datamap chooser and bloom datamap

2018-09-28 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master 8427771fc -> 8284d9ed1


[CARBONDATA-2974] Fixed multiple expressions issue on datamap chooser and bloom 
datamap

The DataMap framework provides a mechanism to composite expressions and
forward them to the corresponding datamap; in this way, the datamap can
handle the pruning in batch. But currently the expressions the framework
forwards include ones that the datamap cannot support, so here we optimize
the datamap chooser.

We now composite the supported expressions and wrap them into an
AndExpression; these are exactly the expressions the datamap wants. The
BloomFilter datamap is changed accordingly to handle the AndExpression.

This closes #2767
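
A minimal Scala sketch of the compositing step described above, assuming the
CarbonData core classes on the classpath; the authoritative Java change is in
the diff below.

```scala
import org.apache.carbondata.core.scan.expression.Expression
import org.apache.carbondata.core.scan.expression.logical.AndExpression
import org.apache.carbondata.core.scan.filter.resolver.resolverinfo.TrueConditionalResolverImpl

// Wrap the two supported child expressions of an AND node into a single
// AndExpression, so the chosen datamap prunes with exactly these expressions
// and nothing else.
def compositeForDataMap(left: Expression, right: Expression): TrueConditionalResolverImpl =
  new TrueConditionalResolverImpl(new AndExpression(left, right), false, true)
```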


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/8284d9ed
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/8284d9ed
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/8284d9ed

Branch: refs/heads/master
Commit: 8284d9ed1fe60d8881788656b7f78c055f76e453
Parents: 8427771
Author: ravipesala 
Authored: Wed Sep 26 16:56:03 2018 +0530
Committer: xuchuanyin 
Committed: Fri Sep 28 16:46:49 2018 +0800

--
 .../carbondata/core/datamap/DataMapChooser.java | 76 ++--
 .../datamap/bloom/BloomCoarseGrainDataMap.java  |  8 ++-
 .../bloom/BloomCoarseGrainDataMapSuite.scala| 62 +++-
 3 files changed, 106 insertions(+), 40 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/8284d9ed/core/src/main/java/org/apache/carbondata/core/datamap/DataMapChooser.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapChooser.java 
b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapChooser.java
index 68696cf..3b6537c 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapChooser.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapChooser.java
@@ -39,6 +39,7 @@ import 
org.apache.carbondata.core.scan.expression.logical.AndExpression;
 import org.apache.carbondata.core.scan.expression.logical.OrExpression;
 import org.apache.carbondata.core.scan.filter.intf.ExpressionType;
 import org.apache.carbondata.core.scan.filter.resolver.FilterResolverIntf;
+import 
org.apache.carbondata.core.scan.filter.resolver.resolverinfo.TrueConditionalResolverImpl;
 
 /**
  * This chooser does 2 jobs.
@@ -123,9 +124,11 @@ public class DataMapChooser {
 if (resolverIntf != null) {
   Expression expression = resolverIntf.getFilterExpression();
   List datamaps = level == DataMapLevel.CG ? cgDataMaps : 
fgDataMaps;
-  ExpressionTuple tuple = selectDataMap(expression, datamaps, 
resolverIntf);
-  if (tuple.dataMapExprWrapper != null) {
-return tuple.dataMapExprWrapper;
+  if (datamaps.size() > 0) {
+ExpressionTuple tuple = selectDataMap(expression, datamaps, 
resolverIntf);
+if (tuple.dataMapExprWrapper != null) {
+  return tuple.dataMapExprWrapper;
+}
   }
 }
 return null;
@@ -177,34 +180,35 @@ public class DataMapChooser {
   // If both left and right has datamap then we can either merge both 
datamaps to single
   // datamap if possible. Otherwise apply AND expression.
   if (left.dataMapExprWrapper != null && right.dataMapExprWrapper != 
null) {
-filterExpressionTypes.add(
-
left.dataMapExprWrapper.getFilterResolverIntf().getFilterExpression()
-.getFilterExpressionType());
-filterExpressionTypes.add(
-
right.dataMapExprWrapper.getFilterResolverIntf().getFilterExpression()
-.getFilterExpressionType());
+filterExpressionTypes.addAll(left.filterExpressionTypes);
+filterExpressionTypes.addAll(right.filterExpressionTypes);
 List columnExpressions = new ArrayList<>();
 columnExpressions.addAll(left.columnExpressions);
 columnExpressions.addAll(right.columnExpressions);
 // Check if we can merge them to single datamap.
 TableDataMap dataMap =
 chooseDataMap(allDataMap, columnExpressions, 
filterExpressionTypes);
+TrueConditionalResolverImpl resolver = new 
TrueConditionalResolverImpl(
+new AndExpression(left.expression, right.expression), false,
+true);
 if (dataMap != null) {
   ExpressionTuple tuple = new ExpressionTuple();
   tuple.columnExpressions = columnExpressions;
-  tuple.dataMapExprWrapper = new DataMapExprWrapperImpl(dataMap, 
filterResolverIntf);
+  tuple.dataMapExprWrapper = new DataMapExprWrapperImp

carbondata git commit: [CARBONDATA-2971] Add shard info of blocklet for debugging

2018-09-26 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master 3cd8b947c -> 5c0da31a5


[CARBONDATA-2971] Add shard info of blocklet for debugging

Add a toString method that prints both the shard name and the blocklet id for debugging (illustrated below).

This closes #2765
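
A hypothetical illustration of the new debug output; it assumes Blocklet's
two-argument (filePath, blockletId) constructor, and the file path and
blocklet id here are made up.

```scala
import org.apache.carbondata.core.indexstore.Blocklet

// Hypothetical values; the output format matches the toString added in the diff below.
val blocklet = new Blocklet("part-0-0_batchno0-0-123.carbondata", "2")
println("Blocklet not found: " + blocklet)
// prints: Blocklet not found: Blocklet{filePath='part-0-0_batchno0-0-123.carbondata', blockletId='2'}
```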


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/5c0da31a
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/5c0da31a
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/5c0da31a

Branch: refs/heads/master
Commit: 5c0da31a5a0afaf707455fa80ac431a082a57ec9
Parents: 3cd8b94
Author: Manhua 
Authored: Wed Sep 26 10:34:54 2018 +0800
Committer: xuchuanyin 
Committed: Thu Sep 27 11:37:56 2018 +0800

--
 .../carbondata/core/indexstore/Blocklet.java| 21 
 .../blockletindex/BlockletDataMapFactory.java   |  2 +-
 2 files changed, 18 insertions(+), 5 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/5c0da31a/core/src/main/java/org/apache/carbondata/core/indexstore/Blocklet.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/indexstore/Blocklet.java 
b/core/src/main/java/org/apache/carbondata/core/indexstore/Blocklet.java
index c6e1681..3270d08 100644
--- a/core/src/main/java/org/apache/carbondata/core/indexstore/Blocklet.java
+++ b/core/src/main/java/org/apache/carbondata/core/indexstore/Blocklet.java
@@ -65,17 +65,20 @@ public class Blocklet implements Writable,Serializable {
 return filePath;
   }
 
-  @Override public void write(DataOutput out) throws IOException {
+  @Override
+  public void write(DataOutput out) throws IOException {
 out.writeUTF(filePath);
 out.writeUTF(blockletId);
   }
 
-  @Override public void readFields(DataInput in) throws IOException {
+  @Override
+  public void readFields(DataInput in) throws IOException {
 filePath = in.readUTF();
 blockletId = in.readUTF();
   }
 
-  @Override public boolean equals(Object o) {
+  @Override
+  public boolean equals(Object o) {
 if (this == o) return true;
 if (o == null || getClass() != o.getClass()) return false;
 
@@ -92,7 +95,17 @@ public class Blocklet implements Writable,Serializable {
 blocklet.blockletId == null;
   }
 
-  @Override public int hashCode() {
+  @Override
+  public String toString() {
+final StringBuffer sb = new StringBuffer("Blocklet{");
+sb.append("filePath='").append(filePath).append('\'');
+sb.append(", blockletId='").append(blockletId).append('\'');
+sb.append('}');
+return sb.toString();
+  }
+
+  @Override
+  public int hashCode() {
 int result = filePath != null ? filePath.hashCode() : 0;
 result = 31 * result;
 if (compareBlockletIdForObjectMatching) {

http://git-wip-us.apache.org/repos/asf/carbondata/blob/5c0da31a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
 
b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
index e16c3cd..096a5e3 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMapFactory.java
@@ -252,7 +252,7 @@ public class BlockletDataMapFactory extends 
CoarseGrainDataMapFactory
 }
   }
 }
-throw new IOException("Blocklet with blockid " + blocklet.getBlockletId() 
+ " not found ");
+throw new IOException("Blocklet not found: " + blocklet.toString());
   }
 
 



carbondata git commit: [CARBONDATA-2965] support Benchmark command in CarbonCli [Forced Update!]

2018-09-26 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master 23a9e7c5f -> e07df44a1 (forced update)


[CARBONDATA-2965] support Benchmark command in CarbonCli

A new command called "benchmark" is added to the CarbonCli tool to output the
scan performance of a specified file and column.
Example usage:
```bash
shell>java -jar carbondata-cli.jar org.apache.carbondata.CarbonCli -cmd 
benchmark -p hdfs://carbon1:9000/carbon.store/tpchcarbon_base/lineitem/ -a -c 
l_comment
```
will output the scan time of the l_comment column for the first file in the
input folder and print the following (or use the -f option to point to a data
file instead of a folder):

```
ReadHeaderAndFooter takes 12,598 us
ConvertFooter takes 4,712 us
ReadAllMetaAndConvertFooter takes 8,039 us

Scan column 'l_comment'
Blocklet#0: ColumnChunkIO takes 222,609 us
Blocklet#0: DecompressPage takes 111,985 us
Blocklet#1: ColumnChunkIO takes 186,522 us
Blocklet#1: DecompressPage takes 89,132 us
Blocklet#2: ColumnChunkIO takes 209,129 us
Blocklet#2: DecompressPage takes 84,051 us
```
This closes #2755


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/e07df44a
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/e07df44a
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/e07df44a

Branch: refs/heads/master
Commit: e07df44a1db52304c54ab4e379f28b0f026449fd
Parents: 49f6715
Author: Jacky Li 
Authored: Sun Sep 23 00:01:04 2018 +0800
Committer: xuchuanyin 
Committed: Wed Sep 26 15:47:37 2018 +0800

--
 .../core/util/DataFileFooterConverterV3.java|   6 +-
 pom.xml |   7 +-
 tools/cli/pom.xml   |   5 +
 .../org/apache/carbondata/tool/CarbonCli.java   |  90 
 .../org/apache/carbondata/tool/Command.java |  28 +++
 .../org/apache/carbondata/tool/DataFile.java|  94 +++--
 .../org/apache/carbondata/tool/DataSummary.java | 188 ++---
 .../apache/carbondata/tool/FileCollector.java   | 147 +
 .../apache/carbondata/tool/ScanBenchmark.java   | 205 +++
 .../apache/carbondata/tool/CarbonCliTest.java   |  94 +
 10 files changed, 622 insertions(+), 242 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/e07df44a/core/src/main/java/org/apache/carbondata/core/util/DataFileFooterConverterV3.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/util/DataFileFooterConverterV3.java
 
b/core/src/main/java/org/apache/carbondata/core/util/DataFileFooterConverterV3.java
index 41e22fd..438e3e3 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/util/DataFileFooterConverterV3.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/util/DataFileFooterConverterV3.java
@@ -59,12 +59,16 @@ public class DataFileFooterConverterV3 extends 
AbstractDataFileFooterConverter {
*/
   @Override public DataFileFooter readDataFileFooter(TableBlockInfo 
tableBlockInfo)
   throws IOException {
-DataFileFooter dataFileFooter = new DataFileFooter();
 CarbonHeaderReader carbonHeaderReader = new 
CarbonHeaderReader(tableBlockInfo.getFilePath());
 FileHeader fileHeader = carbonHeaderReader.readHeader();
 CarbonFooterReaderV3 reader =
 new CarbonFooterReaderV3(tableBlockInfo.getFilePath(), 
tableBlockInfo.getBlockOffset());
 FileFooter3 footer = reader.readFooterVersion3();
+return convertDataFileFooter(fileHeader, footer);
+  }
+
+  public DataFileFooter convertDataFileFooter(FileHeader fileHeader, 
FileFooter3 footer) {
+DataFileFooter dataFileFooter = new DataFileFooter();
 dataFileFooter.setVersionId(ColumnarFormatVersion.valueOf((short) 
fileHeader.getVersion()));
 dataFileFooter.setNumberOfRows(footer.getNum_rows());
 dataFileFooter.setSegmentInfo(getSegmentInfo(footer.getSegment_info()));

http://git-wip-us.apache.org/repos/asf/carbondata/blob/e07df44a/pom.xml
--
diff --git a/pom.xml b/pom.xml
index eff438b..00a5287 100644
--- a/pom.xml
+++ b/pom.xml
@@ -106,6 +106,7 @@
 store/sdk
 store/search
 assembly
+tools/cli
   
 
   
@@ -718,12 +719,6 @@
 datamap/mv/core
   
 
-
-  tools
-  
-tools/cli
-  
-
   
 
 

http://git-wip-us.apache.org/repos/asf/carbondata/blob/e07df44a/tools/cli/pom.xml
--
diff --git a/tools/cli/pom.xml b/tools/cli/pom.xml
index 0d00438..60e69dc 100644
--- a/tools/cli/pom.xml
+++ b/tools/cli/pom.xml
@@ -25,6 +25,11 @@
   ${project.version}
 
 
+  javax.servlet
+  servlet-api
+  2.5
+
+
   junit
   junit
   test

carbondata git commit: [CARBONDATA-2965] support Benchmark command in CarbonCli [Forced Update!]

2018-09-26 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master 83f28800b -> 23a9e7c5f (forced update)


[CARBONDATA-2965] support Benchmark command in CarbonCli

A new command called "benchmark" is added to the CarbonCli tool to output the
scan performance of a specified file and column.
Example usage:
```bash
shell>java -jar carbondata-cli.jar org.apache.carbondata.CarbonCli -cmd 
benchmark -p hdfs://carbon1:9000/carbon.store/tpchcarbon_base/lineitem/ -a -c 
l_comment
```
will output the scan time of the l_comment column for the first file in the
input folder and print the following (or use the -f option to point to a data
file instead of a folder):

```
ReadHeaderAndFooter takes 12,598 us
ConvertFooter takes 4,712 us
ReadAllMetaAndConvertFooter takes 8,039 us

Scan column 'l_comment'
Blocklet#0: ColumnChunkIO takes 222,609 us
Blocklet#0: DecompressPage takes 111,985 us
Blocklet#1: ColumnChunkIO takes 186,522 us
Blocklet#1: DecompressPage takes 89,132 us
Blocklet#2: ColumnChunkIO takes 209,129 us
Blocklet#2: DecompressPage takes 84,051 us
```
This closes #2755


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/23a9e7c5
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/23a9e7c5
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/23a9e7c5

Branch: refs/heads/master
Commit: 23a9e7c5fde0b801b6c475949f26e2bbc05e9255
Parents: 49f6715
Author: Jacky Li 
Authored: Sun Sep 23 00:01:04 2018 +0800
Committer: xuchuanyin 
Committed: Wed Sep 26 15:45:52 2018 +0800

--
 .../core/util/DataFileFooterConverterV3.java|   6 +-
 pom.xml |   7 +-
 tools/cli/pom.xml   |   5 +
 .../org/apache/carbondata/tool/CarbonCli.java   |  90 
 .../org/apache/carbondata/tool/Command.java |  28 +++
 .../org/apache/carbondata/tool/DataFile.java|  94 +++--
 .../org/apache/carbondata/tool/DataSummary.java | 188 ++---
 .../apache/carbondata/tool/FileCollector.java   | 147 +
 .../apache/carbondata/tool/ScanBenchmark.java   | 205 +++
 .../apache/carbondata/tool/CarbonCliTest.java   |  94 +
 10 files changed, 622 insertions(+), 242 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/23a9e7c5/core/src/main/java/org/apache/carbondata/core/util/DataFileFooterConverterV3.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/util/DataFileFooterConverterV3.java
 
b/core/src/main/java/org/apache/carbondata/core/util/DataFileFooterConverterV3.java
index 41e22fd..438e3e3 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/util/DataFileFooterConverterV3.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/util/DataFileFooterConverterV3.java
@@ -59,12 +59,16 @@ public class DataFileFooterConverterV3 extends 
AbstractDataFileFooterConverter {
*/
   @Override public DataFileFooter readDataFileFooter(TableBlockInfo 
tableBlockInfo)
   throws IOException {
-DataFileFooter dataFileFooter = new DataFileFooter();
 CarbonHeaderReader carbonHeaderReader = new 
CarbonHeaderReader(tableBlockInfo.getFilePath());
 FileHeader fileHeader = carbonHeaderReader.readHeader();
 CarbonFooterReaderV3 reader =
 new CarbonFooterReaderV3(tableBlockInfo.getFilePath(), 
tableBlockInfo.getBlockOffset());
 FileFooter3 footer = reader.readFooterVersion3();
+return convertDataFileFooter(fileHeader, footer);
+  }
+
+  public DataFileFooter convertDataFileFooter(FileHeader fileHeader, 
FileFooter3 footer) {
+DataFileFooter dataFileFooter = new DataFileFooter();
 dataFileFooter.setVersionId(ColumnarFormatVersion.valueOf((short) 
fileHeader.getVersion()));
 dataFileFooter.setNumberOfRows(footer.getNum_rows());
 dataFileFooter.setSegmentInfo(getSegmentInfo(footer.getSegment_info()));

http://git-wip-us.apache.org/repos/asf/carbondata/blob/23a9e7c5/pom.xml
--
diff --git a/pom.xml b/pom.xml
index eff438b..00a5287 100644
--- a/pom.xml
+++ b/pom.xml
@@ -106,6 +106,7 @@
 store/sdk
 store/search
 assembly
+tools/cli
   
 
   
@@ -718,12 +719,6 @@
 datamap/mv/core
   
 
-
-  tools
-  
-tools/cli
-  
-
   
 
 

http://git-wip-us.apache.org/repos/asf/carbondata/blob/23a9e7c5/tools/cli/pom.xml
--
diff --git a/tools/cli/pom.xml b/tools/cli/pom.xml
index 0d00438..60e69dc 100644
--- a/tools/cli/pom.xml
+++ b/tools/cli/pom.xml
@@ -25,6 +25,11 @@
   ${project.version}
 
 
+  javax.servlet
+  servlet-api
+  2.5
+
+
   junit
   junit
   test

carbondata git commit: support Benchmark command in CarbonCli

2018-09-26 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master 49f67153a -> 83f28800b


support Benchmark command in CarbonCli

A new command called "benchmark" is added to the CarbonCli tool to output the
scan performance of a specified file and column.
Example usage:
```bash
shell>java -jar carbondata-cli.jar org.apache.carbondata.CarbonCli -cmd 
benchmark -p hdfs://carbon1:9000/carbon.store/tpchcarbon_base/lineitem/ -a -c 
l_comment
```
will output the scan time of the l_comment column for the first file in the
input folder and print the following (or use the -f option to point to a data
file instead of a folder):

```
ReadHeaderAndFooter takes 12,598 us
ConvertFooter takes 4,712 us
ReadAllMetaAndConvertFooter takes 8,039 us

Scan column 'l_comment'
Blocklet#0: ColumnChunkIO takes 222,609 us
Blocklet#0: DecompressPage takes 111,985 us
Blocklet#1: ColumnChunkIO takes 186,522 us
Blocklet#1: DecompressPage takes 89,132 us
Blocklet#2: ColumnChunkIO takes 209,129 us
Blocklet#2: DecompressPage takes 84,051 us
```


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/83f28800
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/83f28800
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/83f28800

Branch: refs/heads/master
Commit: 83f28800bfc152fd1d9bfebef758ee4f4a8659db
Parents: 49f6715
Author: Jacky Li 
Authored: Sun Sep 23 00:01:04 2018 +0800
Committer: xuchuanyin 
Committed: Wed Sep 26 15:44:00 2018 +0800

--
 .../core/util/DataFileFooterConverterV3.java|   6 +-
 pom.xml |   7 +-
 tools/cli/pom.xml   |   5 +
 .../org/apache/carbondata/tool/CarbonCli.java   |  90 
 .../org/apache/carbondata/tool/Command.java |  28 +++
 .../org/apache/carbondata/tool/DataFile.java|  94 +++--
 .../org/apache/carbondata/tool/DataSummary.java | 188 ++---
 .../apache/carbondata/tool/FileCollector.java   | 147 +
 .../apache/carbondata/tool/ScanBenchmark.java   | 205 +++
 .../apache/carbondata/tool/CarbonCliTest.java   |  94 +
 10 files changed, 622 insertions(+), 242 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/83f28800/core/src/main/java/org/apache/carbondata/core/util/DataFileFooterConverterV3.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/util/DataFileFooterConverterV3.java
 
b/core/src/main/java/org/apache/carbondata/core/util/DataFileFooterConverterV3.java
index 41e22fd..438e3e3 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/util/DataFileFooterConverterV3.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/util/DataFileFooterConverterV3.java
@@ -59,12 +59,16 @@ public class DataFileFooterConverterV3 extends 
AbstractDataFileFooterConverter {
*/
   @Override public DataFileFooter readDataFileFooter(TableBlockInfo 
tableBlockInfo)
   throws IOException {
-DataFileFooter dataFileFooter = new DataFileFooter();
 CarbonHeaderReader carbonHeaderReader = new 
CarbonHeaderReader(tableBlockInfo.getFilePath());
 FileHeader fileHeader = carbonHeaderReader.readHeader();
 CarbonFooterReaderV3 reader =
 new CarbonFooterReaderV3(tableBlockInfo.getFilePath(), 
tableBlockInfo.getBlockOffset());
 FileFooter3 footer = reader.readFooterVersion3();
+return convertDataFileFooter(fileHeader, footer);
+  }
+
+  public DataFileFooter convertDataFileFooter(FileHeader fileHeader, 
FileFooter3 footer) {
+DataFileFooter dataFileFooter = new DataFileFooter();
 dataFileFooter.setVersionId(ColumnarFormatVersion.valueOf((short) 
fileHeader.getVersion()));
 dataFileFooter.setNumberOfRows(footer.getNum_rows());
 dataFileFooter.setSegmentInfo(getSegmentInfo(footer.getSegment_info()));

http://git-wip-us.apache.org/repos/asf/carbondata/blob/83f28800/pom.xml
--
diff --git a/pom.xml b/pom.xml
index eff438b..00a5287 100644
--- a/pom.xml
+++ b/pom.xml
@@ -106,6 +106,7 @@
 store/sdk
 store/search
 assembly
+tools/cli
   
 
   
@@ -718,12 +719,6 @@
 datamap/mv/core
   
 
-
-  tools
-  
-tools/cli
-  
-
   
 
 

http://git-wip-us.apache.org/repos/asf/carbondata/blob/83f28800/tools/cli/pom.xml
--
diff --git a/tools/cli/pom.xml b/tools/cli/pom.xml
index 0d00438..60e69dc 100644
--- a/tools/cli/pom.xml
+++ b/tools/cli/pom.xml
@@ -25,6 +25,11 @@
   ${project.version}
 
 
+  javax.servlet
+  servlet-api
+  2.5
+
+
   junit
   junit
   test

http://git-wip-us.apache.org/repos/asf/carbonda

carbondata git commit: [CARBONDATA-2783][BloomDataMap][Doc] Update document for bloom filter datamap

2018-07-25 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master 316e9de65 -> 232f6ced3


[CARBONDATA-2783][BloomDataMap][Doc] Update document for bloom filter datamap

Add an example of enabling/disabling a datamap.

This closes #2554


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/232f6ced
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/232f6ced
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/232f6ced

Branch: refs/heads/master
Commit: 232f6ced3944b437539ae9f1ad05c62441d44f46
Parents: 316e9de
Author: Manhua 
Authored: Wed Jul 25 16:51:49 2018 +0800
Committer: xuchuanyin 
Committed: Thu Jul 26 09:15:17 2018 +0800

--
 docs/datamap/bloomfilter-datamap-guide.md | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/232f6ced/docs/datamap/bloomfilter-datamap-guide.md
--
diff --git a/docs/datamap/bloomfilter-datamap-guide.md 
b/docs/datamap/bloomfilter-datamap-guide.md
index 0fb3ba2..325a508 100644
--- a/docs/datamap/bloomfilter-datamap-guide.md
+++ b/docs/datamap/bloomfilter-datamap-guide.md
@@ -26,7 +26,16 @@ Showing all DataMaps on this table
   SHOW DATAMAP
   ON TABLE main_table
   ```
-It will show all DataMaps created on main table.
+
+Disable Datamap
+> The datamap by default is enabled. To support tuning on query, we can 
disable a specific datamap during query to observe whether we can gain 
performance enhancement from it. This will only take effect current session.
+
+  ```
+  // disable the datamap
+  SET carbon.datamap.visible.dbName.tableName.dataMapName = false
+  // enable the datamap
+  SET carbon.datamap.visible.dbName.tableName.dataMapName = true
+  ```
 
 
 ## BloomFilter DataMap Introduction



carbondata git commit: [CARBONDATA-2774][BloomDataMap] Exception should be thrown if the expression does not satisfy the BloomFilter's requirement

2018-07-24 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master 895239490 -> b853963b2


[CARBONDATA-2774][BloomDataMap] Exception should be thrown if the expression
does not satisfy the BloomFilter's requirement

If we query a string column that has a bloom index using a number as the
filter value, we get a wrong result: the DDL wraps the filter with a datatype
conversion, and DataMapChooser chooses a datamap by searching for columns deep
down through all children of the filter expression. However, the bloom filter
requires the expression to be simply 'column = value' or 'column in (values)',
so bloom fails to build any query model, hits no blocklets, and the query
silently returns an empty result.

To avoid returning a wrong answer silently, we now throw an exception instead
of only writing an error log.

Users who want the correct result can fix the SQL to use the corresponding
data type, or disable the bloom datamap.

This closes #2545
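
A hedged sketch of the behavior change, assuming a Spark-style sql helper as
used in CarbonData's test suites; the table and column names are hypothetical,
and `name` is assumed to be a STRING column covered by a bloom index datamap.

```scala
// Before this fix: the implicit numeric cast hid the column from the bloom
// datamap, no blocklet was hit, and the query silently returned an empty result.
// After this fix: a RuntimeException is thrown instead.
sql("SELECT * FROM main_table WHERE name = 100")

// The correct query: the filter value matches the column type, so the bloom
// datamap can prune normally.
sql("SELECT * FROM main_table WHERE name = '100'")
```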


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/b853963b
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/b853963b
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/b853963b

Branch: refs/heads/master
Commit: b853963b262689491c3eafb81654c72b35966165
Parents: 8952394
Author: Manhua 
Authored: Tue Jul 24 14:51:18 2018 +0800
Committer: xuchuanyin 
Committed: Tue Jul 24 16:45:26 2018 +0800

--
 .../carbondata/datamap/bloom/BloomCoarseGrainDataMap.java| 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/b853963b/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMap.java
--
diff --git 
a/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMap.java
 
b/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMap.java
index 26db300..be531d6 100644
--- 
a/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMap.java
+++ 
b/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMap.java
@@ -203,7 +203,9 @@ public class BloomCoarseGrainDataMap extends 
CoarseGrainDataMap {
 }
 return queryModels;
   } else {
-LOGGER.warn("BloomFilter can only support the 'equal' filter like 'Col 
= PlainValue'");
+String errorMsg = "BloomFilter can only support the 'equal' filter 
like 'Col = PlainValue'";
+LOGGER.warn(errorMsg);
+throw new RuntimeException(errorMsg);
   }
 } else if (expression instanceof InExpression) {
   Expression left = ((InExpression) expression).getLeft();
@@ -226,7 +228,9 @@ public class BloomCoarseGrainDataMap extends 
CoarseGrainDataMap {
 }
 return queryModels;
   } else {
-LOGGER.warn("BloomFilter can only support the 'in' filter like 'Col in 
(PlainValues)'");
+String errorMsg = "BloomFilter can only support the 'in' filter like 
'Col in PlainValue'";
+LOGGER.warn(errorMsg);
+throw new RuntimeException(errorMsg);
   }
 }
 



carbondata git commit: [CARBONDATA-2618][32K] Split to multiple pages if varchar column page exceeds 2GB/snappy limits

2018-07-24 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master 42a80564c -> 6f1767b5a


[CARBONDATA-2618][32K] Split to multiple pages if varchar column page exceeds 
2GB/snappy limits

The dynamic column page size is decided by the long string columns.

A varchar column page uses SafeVarLengthColumnPage/UnsafeVarLengthColumnPage to
store data and is encoded using HighCardDictDimensionIndexCodec, which calls
getByteArrayPage() on the column page and flattens it into a byte[] for
compression. Limited by the array index, a page can hold at most
Integer.MAX_VALUE bytes.

Another limitation comes from the Compressor. Currently we use snappy as the
default compressor, and it calls the maxCompressedLength method to estimate the
result size when preparing the output. For safety the estimate is oversized,
32 + source_len + source_len/6, so the maximum number of bytes snappy can
compress is (2GB - 32) * 6/7 ≈ 1.71GB (checked in the sketch below).

The size of a single row does not exceed 2MB, since UnsafeSortDataRows uses a
2MB byte[] as its row buffer, so we can stop adding rows to a page as soon as
any long string column reaches these limits.

This closes #2464
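
A self-contained Scala check of the arithmetic above; the constant mirrors
MAX_BYTE_TO_COMPRESS from the diff below.

```scala
// Snappy estimates the compressed length as 32 + n + n/6, and that estimate
// must fit the int-indexed output array, so the largest safe input is
// (Integer.MAX_VALUE - 32) * 6 / 7.
object SnappyLimitCheck extends App {
  val maxByteToCompress = ((Integer.MAX_VALUE - 32) / 7.0 * 6).toInt
  val gb = maxByteToCompress / (1024.0 * 1024 * 1024)
  println(f"max bytes snappy may safely compress: $maxByteToCompress%,d (~$gb%.2f GB)")
}
```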


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/6f1767b5
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/6f1767b5
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/6f1767b5

Branch: refs/heads/master
Commit: 6f1767b5a3bf7d13053210623efdeb7512bcdedb
Parents: 42a8056
Author: Manhua 
Authored: Mon Jul 9 16:42:08 2018 +0800
Committer: xuchuanyin 
Committed: Tue Jul 24 14:40:24 2018 +0800

--
 .../datastore/compression/SnappyCompressor.java |  3 ++
 .../store/CarbonFactDataHandlerColumnar.java| 53 ++--
 .../store/CarbonFactDataHandlerModel.java   | 44 
 3 files changed, 97 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/6f1767b5/core/src/main/java/org/apache/carbondata/core/datastore/compression/SnappyCompressor.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/datastore/compression/SnappyCompressor.java
 
b/core/src/main/java/org/apache/carbondata/core/datastore/compression/SnappyCompressor.java
index 65244d2..bd740b2 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/datastore/compression/SnappyCompressor.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/datastore/compression/SnappyCompressor.java
@@ -31,6 +31,9 @@ public class SnappyCompressor implements Compressor {
   private static final LogService LOGGER =
   LogServiceFactory.getLogService(SnappyCompressor.class.getName());
 
+  // snappy estimate max compressed length as 32 + source_len + source_len/6
+  public static final int MAX_BYTE_TO_COMPRESS = (int)((Integer.MAX_VALUE - 
32) / 7.0 * 6);
+
   private final SnappyNative snappyNative;
 
   public SnappyCompressor() {

http://git-wip-us.apache.org/repos/asf/carbondata/blob/6f1767b5/processing/src/main/java/org/apache/carbondata/processing/store/CarbonFactDataHandlerColumnar.java
--
diff --git 
a/processing/src/main/java/org/apache/carbondata/processing/store/CarbonFactDataHandlerColumnar.java
 
b/processing/src/main/java/org/apache/carbondata/processing/store/CarbonFactDataHandlerColumnar.java
index 97737d0..94511cf 100644
--- 
a/processing/src/main/java/org/apache/carbondata/processing/store/CarbonFactDataHandlerColumnar.java
+++ 
b/processing/src/main/java/org/apache/carbondata/processing/store/CarbonFactDataHandlerColumnar.java
@@ -34,8 +34,10 @@ import org.apache.carbondata.common.logging.LogService;
 import org.apache.carbondata.common.logging.LogServiceFactory;
 import org.apache.carbondata.core.constants.CarbonCommonConstants;
 import org.apache.carbondata.core.constants.CarbonV3DataFormatConstants;
+import org.apache.carbondata.core.datastore.compression.SnappyCompressor;
 import 
org.apache.carbondata.core.datastore.exception.CarbonDataWriterException;
 import org.apache.carbondata.core.datastore.row.CarbonRow;
+import org.apache.carbondata.core.datastore.row.WriteStepRowUtil;
 import org.apache.carbondata.core.keygenerator.KeyGenException;
 import 
org.apache.carbondata.core.keygenerator.columnar.impl.MultiDimKeyVarLengthEquiSplitGenerator;
 import org.apache.carbondata.core.memory.MemoryException;
@@ -84,6 +86,7 @@ public class CarbonFactDataHandlerColumnar implements 
CarbonFactHandler {
   private ExecutorService consumerExecutorService;
   private List> consumerExecutorServiceTaskList;
   private List dataRows;
+  private int[] varcharColumnSizeInByte;
   /**
* semaphore which will used for managing node holder objects
*/
@@ -191,7 +194,7 @@ public class CarbonFactDataHandlerColumnar impl

carbondata git commit: [CARBONDATA-2682][32K] fix create table with long_string_columns properties bugs

2018-07-23 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master 43285bbd1 -> 45960f4a8


[CARBONDATA-2682][32K] fix create table with long_string_columns properties bugs

This PR fixes the following CREATE TABLE bugs related to long_string_columns
(hedged examples follow below):
1. Creating a table with a column that is both in long_string_columns and in
the partition or no_inverted_index property should be blocked.
2. Creating a table with duplicate columns in the long_string_columns property
should be blocked.

This closes #2506
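
Hedged examples of the DDL that is now rejected with a
MalformedCarbonCommandException, in the style of the test suite in the diff
below; the table name and columns are hypothetical.

```scala
// 1. A column may not be both a long string column and a no-inverted-index
//    (or partition) column:
sql(
  """CREATE TABLE IF NOT EXISTS long_string_table(
    | id INT, note STRING
    | ) STORED BY 'carbondata'
    | TBLPROPERTIES('LONG_STRING_COLUMNS'='note', 'NO_INVERTED_INDEX'='note')
    |""".stripMargin)

// 2. Duplicate columns in LONG_STRING_COLUMNS are rejected:
sql(
  """CREATE TABLE IF NOT EXISTS long_string_table(
    | id INT, note STRING
    | ) STORED BY 'carbondata'
    | TBLPROPERTIES('LONG_STRING_COLUMNS'='note, note')
    |""".stripMargin)
```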


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/45960f4a
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/45960f4a
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/45960f4a

Branch: refs/heads/master
Commit: 45960f4a8d45c99d917ea9aec3579dc3b2345af4
Parents: 43285bb
Author: Sssan520 
Authored: Thu Jul 12 19:58:23 2018 +0800
Committer: xuchuanyin 
Committed: Mon Jul 23 14:53:27 2018 +0800

--
 .../VarcharDataTypesBasicTestCase.scala | 66 +--
 .../spark/sql/catalyst/CarbonDDLSqlParser.scala | 88 +---
 2 files changed, 137 insertions(+), 17 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/45960f4a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/longstring/VarcharDataTypesBasicTestCase.scala
--
diff --git 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/longstring/VarcharDataTypesBasicTestCase.scala
 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/longstring/VarcharDataTypesBasicTestCase.scala
index cb7cd81..4b8187f 100644
--- 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/longstring/VarcharDataTypesBasicTestCase.scala
+++ 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/longstring/VarcharDataTypesBasicTestCase.scala
@@ -25,6 +25,7 @@ import org.apache.spark.sql.test.util.QueryTest
 import org.apache.spark.sql.types.{IntegerType, StringType, StructField, 
StructType}
 import org.scalatest.{BeforeAndAfterAll, BeforeAndAfterEach}
 
+import 
org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException
 import org.apache.carbondata.core.constants.CarbonCommonConstants
 import org.apache.carbondata.core.metadata.CarbonMetadata
 import org.apache.carbondata.core.metadata.datatype.DataTypes
@@ -79,7 +80,7 @@ class VarcharDataTypesBasicTestCase extends QueryTest with 
BeforeAndAfterEach wi
   }
 
   test("long string columns cannot be dictionary include") {
-val exceptionCaught = intercept[Exception] {
+val exceptionCaught = intercept[MalformedCarbonCommandException] {
   sql(
 s"""
| CREATE TABLE if not exists $longStringTable(
@@ -93,7 +94,7 @@ class VarcharDataTypesBasicTestCase extends QueryTest with 
BeforeAndAfterEach wi
   }
 
   test("long string columns cannot be dictionay exclude") {
-val exceptionCaught = intercept[Exception] {
+val exceptionCaught = intercept[MalformedCarbonCommandException] {
   sql(
 s"""
| CREATE TABLE if not exists $longStringTable(
@@ -107,7 +108,7 @@ class VarcharDataTypesBasicTestCase extends QueryTest with 
BeforeAndAfterEach wi
   }
 
   test("long string columns cannot be sort_columns") {
-val exceptionCaught = intercept[Exception] {
+val exceptionCaught = intercept[MalformedCarbonCommandException] {
   sql(
 s"""
| CREATE TABLE if not exists $longStringTable(
@@ -121,20 +122,73 @@ class VarcharDataTypesBasicTestCase extends QueryTest 
with BeforeAndAfterEach wi
   }
 
   test("long string columns can only be string columns") {
-val exceptionCaught = intercept[Exception] {
+val exceptionCaught = intercept[MalformedCarbonCommandException] {
   sql(
 s"""
| CREATE TABLE if not exists $longStringTable(
| id INT, name STRING, description STRING, address STRING, note 
STRING
| ) STORED BY 'carbondata'
| TBLPROPERTIES('LONG_STRING_COLUMNS'='id, note')
-   |""".
-  stripMargin)
+   |""".stripMargin)
 }
 assert(exceptionCaught.getMessage.contains("long_string_columns: id"))
 assert(exceptionCaught.getMessage.contains("its data type is not string"))
   }
 
+  test("long string columns cannot contain duplicate columns") {
+val exceptionCaught = intercept[MalformedCarbonCommandException] {
+  sql(
+s"""
+   | CREATE TABLE if not exists $longStringTable(
+   | i

carbondata git commit: [CARBONDATA-2757][BloomDataMap] Fix bug when building bloomfilter on measure column

2018-07-22 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master 9f42fbf33 -> 7551cc696


[CARBONDATA-2757][BloomDataMap] Fix bug when building bloomfilter on measure 
column

1. Support getting raw data from a decimal column page when building the
datamap during the loading process.
2. Convert decimal columns to the corresponding Java data type when rebuilding
the bloom datamap from query results.
3. Convert boolean values to bytes, as carbon expects.
4. Fix bugs for the case where a measure column is null.

This closes #2526


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/7551cc69
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/7551cc69
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/7551cc69

Branch: refs/heads/master
Commit: 7551cc69657b725d2c5566e9718f9bebb42d14f5
Parents: 9f42fbf
Author: Manhua 
Authored: Thu Jul 19 16:26:18 2018 +0800
Committer: xuchuanyin 
Committed: Sun Jul 22 14:01:47 2018 +0800

--
 .../core/datastore/page/ColumnPage.java |   3 +
 .../core/datastore/page/DecimalColumnPage.java  |  48 ++
 .../datastore/page/SafeDecimalColumnPage.java   |  21 ---
 .../datastore/page/UnsafeDecimalColumnPage.java |  23 ---
 .../bloom/AbstractBloomDataMapWriter.java   |  10 +-
 .../datamap/bloom/BloomCoarseGrainDataMap.java  |  12 +-
 .../datamap/bloom/DataConvertUtil.java  |  22 ++-
 .../datamap/IndexDataMapRebuildRDD.scala|  14 +-
 .../bloom/BloomCoarseGrainDataMapSuite.scala| 169 +++
 9 files changed, 270 insertions(+), 52 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/7551cc69/core/src/main/java/org/apache/carbondata/core/datastore/page/ColumnPage.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/datastore/page/ColumnPage.java 
b/core/src/main/java/org/apache/carbondata/core/datastore/page/ColumnPage.java
index ea250cf..75e47de 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/datastore/page/ColumnPage.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/datastore/page/ColumnPage.java
@@ -525,6 +525,9 @@ public abstract class ColumnPage {
   result = getBoolean(rowId);
 } else if (dataType == DataTypes.BYTE) {
   result = getByte(rowId);
+  if (columnSpec.getSchemaDataType() == DataTypes.BOOLEAN) {
+result = BooleanConvert.byte2Boolean((byte)result);
+  }
 } else if (dataType == DataTypes.SHORT) {
   result = getShort(rowId);
 } else if (dataType == DataTypes.INT) {

http://git-wip-us.apache.org/repos/asf/carbondata/blob/7551cc69/core/src/main/java/org/apache/carbondata/core/datastore/page/DecimalColumnPage.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/datastore/page/DecimalColumnPage.java
 
b/core/src/main/java/org/apache/carbondata/core/datastore/page/DecimalColumnPage.java
index 2624223..368a289 100644
--- 
a/core/src/main/java/org/apache/carbondata/core/datastore/page/DecimalColumnPage.java
+++ 
b/core/src/main/java/org/apache/carbondata/core/datastore/page/DecimalColumnPage.java
@@ -17,8 +17,11 @@
 
 package org.apache.carbondata.core.datastore.page;
 
+import java.math.BigDecimal;
+
 import org.apache.carbondata.core.datastore.TableSpec;
 import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.metadata.datatype.DataTypes;
 import org.apache.carbondata.core.metadata.datatype.DecimalConverterFactory;
 
 /**
@@ -106,4 +109,49 @@ public abstract class DecimalColumnPage extends 
VarLengthColumnPageBase {
 throw new UnsupportedOperationException("invalid data type: " + dataType);
   }
 
+  // used for building datamap in loading process
+  private BigDecimal getDecimalFromRawData(int rowId) {
+long value;
+switch (decimalConverter.getDecimalConverterType()) {
+  case DECIMAL_INT:
+value = getInt(rowId);
+break;
+  case DECIMAL_LONG:
+value = getLong(rowId);
+break;
+  default:
+value = getByte(rowId);
+}
+return decimalConverter.getDecimal(value);
+  }
+
+  private BigDecimal getDecimalFromDecompressData(int rowId) {
+long value;
+if (dataType == DataTypes.BYTE) {
+  value = getByte(rowId);
+} else if (dataType == DataTypes.SHORT) {
+  value = getShort(rowId);
+} else if (dataType == DataTypes.SHORT_INT) {
+  value = getShortInt(rowId);
+} else if (dataType == DataTypes.INT) {
+  value = getInt(rowId);
+} else if (dataType == DataTypes.LONG) {
+  value = getLong(rowId);
+} else {
+  return decimalConverter.getDecimal(getBytes(rowId));
+}
+return decimalConverter.getDecimal(value);
+  }
+
+  @Override
+  public BigDecimal getDeci

carbondata git commit: [CARBONDATA-2747][Lucene] Fix Lucene datamap choosing and DataMapDistributable building

2018-07-20 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master 4a37e05ca -> 46f0c8517


[CARBONDATA-2747][Lucene] Fix Lucene datamap choosing and DataMapDistributable 
building

1. Choose the Lucene datamap according to the column in the query (a sketch of
the column extraction follows below).
2. Build a DataMapDistributable only for the target datamap.

This closes #2519
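
A self-contained sketch of the column extraction the chooser now performs for
a MatchExpression (see the diff below); the query string is a hypothetical
TEXT_MATCH argument.

```scala
// A Lucene query string has the shape "column:query term", so splitting on
// the first ':' recovers the indexed column name.
val queryString = "name:john*" // hypothetical TEXT_MATCH argument
val items = queryString.split(":", 2)
if (items.length == 2) {
  val indexedColumn = items(0) // "name": used to keep only datamaps covering this column
  println(s"choose datamaps indexing column '$indexedColumn'")
}
```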


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/46f0c851
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/46f0c851
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/46f0c851

Branch: refs/heads/master
Commit: 46f0c8517d4e79a402ff6dc8a077f3d3955f39b5
Parents: 4a37e05
Author: Manhua 
Authored: Wed Jul 18 10:14:40 2018 +0800
Committer: xuchuanyin 
Committed: Fri Jul 20 15:04:25 2018 +0800

--
 .../carbondata/core/datamap/DataMapChooser.java | 14 +++
 .../bloom/BloomCoarseGrainDataMapFactory.java   |  1 -
 .../lucene/LuceneDataMapFactoryBase.java| 25 +++-
 .../lucene/LuceneFineGrainDataMapSuite.scala|  9 +--
 4 files changed, 30 insertions(+), 19 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/46f0c851/core/src/main/java/org/apache/carbondata/core/datamap/DataMapChooser.java
--
diff --git 
a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapChooser.java 
b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapChooser.java
index cf5dffd..68696cf 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/DataMapChooser.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/DataMapChooser.java
@@ -34,6 +34,7 @@ import 
org.apache.carbondata.core.datamap.status.DataMapStatusManager;
 import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
 import org.apache.carbondata.core.scan.expression.ColumnExpression;
 import org.apache.carbondata.core.scan.expression.Expression;
+import org.apache.carbondata.core.scan.expression.MatchExpression;
 import org.apache.carbondata.core.scan.expression.logical.AndExpression;
 import org.apache.carbondata.core.scan.expression.logical.OrExpression;
 import org.apache.carbondata.core.scan.filter.intf.ExpressionType;
@@ -269,6 +270,14 @@ public class DataMapChooser {
   List columnExpressions) {
 if (expression instanceof ColumnExpression) {
   columnExpressions.add((ColumnExpression) expression);
+} else if (expression instanceof MatchExpression) {
+  // this is a special case for lucene
+  // build a fake ColumnExpression to filter datamaps which contain target 
column
+  // a Lucene query string is alike "column:query term"
+  String[] queryItems = expression.getString().split(":", 2);
+  if (queryItems.length == 2) {
+columnExpressions.add(new ColumnExpression(queryItems[0], null));
+  }
 } else if (expression != null) {
   List children = expression.getChildren();
   if (children != null && children.size() > 0) {
@@ -303,11 +312,6 @@ public class DataMapChooser {
*/
   private boolean contains(DataMapMeta mapMeta, List 
columnExpressions,
   Set expressionTypes) {
-if (mapMeta.getOptimizedOperation().contains(ExpressionType.TEXT_MATCH) &&
-expressionTypes.contains(ExpressionType.TEXT_MATCH)) {
-  // TODO: fix it with right logic
-  return true;
-}
 if (mapMeta.getIndexedColumns().size() == 0 || columnExpressions.size() == 
0) {
   return false;
 }

http://git-wip-us.apache.org/repos/asf/carbondata/blob/46f0c851/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapFactory.java
--
diff --git 
a/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapFactory.java
 
b/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapFactory.java
index 4b5bc7c..652e1fc 100644
--- 
a/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapFactory.java
+++ 
b/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapFactory.java
@@ -278,7 +278,6 @@ public class BloomCoarseGrainDataMapFactory extends 
DataMapFactory 0) {
   for (TableDataMap dataMap : dataMaps) {
-// different from lucene, bloom only get corresponding directory of 
current datamap
 if 
(dataMap.getDataMapSchema().getDataMapName().equals(this.dataMapName)) {
   List indexFiles;
   String dmPath = CarbonTablePath.getDataMapStorePath(tablePath, 
segmentId,

http://git-wip-us.apache.org/repos/asf/carbondata/blob/46f0c851/datamap/lucene/src/main/java/org/apache/carbondata/da

carbondata git commit: [CARBONDATA-2746][BloomDataMap] Fix bug for getting datamap file when table has multiple datamaps

2018-07-17 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master a16289786 -> 4612e0031


[CARBONDATA-2746][BloomDataMap] Fix bug for getting datamap file when table has 
multiple datamaps

Currently, if a table has multiple bloom datamaps and carbon is set to use
distributed datamaps, a query throws an exception when accessing the index
file, because carbon gets all the datamaps but sets them all with the same
datamap schema. The error appears when building the full path of the bloom
index by concatenating the index directory and the index column. This PR fixes
the problem by filtering for the index directories of the target datamap when
distributed datamaps are used.

Tests show that Lucene is not affected by this; on the other hand, Lucene gets
wrong results if we apply this filter to it.

This closes #2512


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/4612e003
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/4612e003
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/4612e003

Branch: refs/heads/master
Commit: 4612e003186ccc6bae89443043bd0db3463f8fc1
Parents: a162897
Author: Manhua 
Authored: Mon Jul 16 19:29:07 2018 +0800
Committer: xuchuanyin 
Committed: Wed Jul 18 09:10:22 2018 +0800

--
 .../bloom/BloomCoarseGrainDataMapFactory.java   | 27 +++--
 .../lucene/LuceneFineGrainDataMapSuite.scala|  7 
 .../bloom/BloomCoarseGrainDataMapSuite.scala| 40 
 3 files changed, 62 insertions(+), 12 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/4612e003/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapFactory.java
--
diff --git 
a/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapFactory.java
 
b/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapFactory.java
index 35ebd20..4b5bc7c 100644
--- 
a/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapFactory.java
+++ 
b/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapFactory.java
@@ -278,18 +278,21 @@ public class BloomCoarseGrainDataMapFactory extends 
DataMapFactory 0) {
   for (TableDataMap dataMap : dataMaps) {
-List indexFiles;
-String dmPath = CarbonTablePath
-.getDataMapStorePath(tablePath, segmentId, 
dataMap.getDataMapSchema().getDataMapName());
-FileFactory.FileType fileType = FileFactory.getFileType(dmPath);
-final CarbonFile dirPath = FileFactory.getCarbonFile(dmPath, fileType);
-indexFiles = Arrays.asList(dirPath.listFiles(new CarbonFileFilter() {
-  @Override
-  public boolean accept(CarbonFile file) {
-return file.isDirectory();
-  }
-}));
-indexDirs.addAll(indexFiles);
+// different from lucene, bloom only get corresponding directory of 
current datamap
+if 
(dataMap.getDataMapSchema().getDataMapName().equals(this.dataMapName)) {
+  List indexFiles;
+  String dmPath = CarbonTablePath.getDataMapStorePath(tablePath, 
segmentId,
+  dataMap.getDataMapSchema().getDataMapName());
+  FileFactory.FileType fileType = FileFactory.getFileType(dmPath);
+  final CarbonFile dirPath = FileFactory.getCarbonFile(dmPath, 
fileType);
+  indexFiles = Arrays.asList(dirPath.listFiles(new CarbonFileFilter() {
+@Override
+public boolean accept(CarbonFile file) {
+  return file.isDirectory();
+}
+  }));
+  indexDirs.addAll(indexFiles);
+}
   }
 }
 return indexDirs.toArray(new CarbonFile[0]);

http://git-wip-us.apache.org/repos/asf/carbondata/blob/4612e003/integration/spark-common-test/src/test/scala/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMapSuite.scala
--
diff --git 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMapSuite.scala
 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMapSuite.scala
index 657a3eb..aebbde4 100644
--- 
a/integration/spark-common-test/src/test/scala/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMapSuite.scala
+++ 
b/integration/spark-common-test/src/test/scala/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMapSuite.scala
@@ -34,6 +34,10 @@ import 
org.apache.carbondata.core.datamap.status.DataMapStatusManager
 
 class LuceneFineGrainDataMapSuite extends QueryTest with BeforeAndAfterAll {
 
+  val originDistributedDatamapStatus = 
CarbonProperties.getInstance().getPrope

carbondata git commit: [CARBONDATA-2724][DataMap]Unsupported create datamap on table with V1 or V2 format data

2018-07-17 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master 81038f55e -> a16289786


[CARBONDATA-2724][DataMap]Unsupported create datamap on table with V1 or V2 
format data

Block creating a datamap on a carbon table whose data is in the V1 or V2
format. Currently the version info is read from a carbon data file (a sketch
follows below).

This closes #2488
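
A minimal Scala sketch of reading the format version from a data file's
header, following the CarbonUtil.getFormatVersion change in the diff below;
the file path is hypothetical and the CarbonData core classes are assumed on
the classpath.

```scala
import org.apache.carbondata.core.metadata.ColumnarFormatVersion
import org.apache.carbondata.core.reader.CarbonHeaderReader

val filePath = "/store/db/t/Fact/Part0/Segment_0/part-0-0.carbondata" // hypothetical
// Read the thrift file header and map its version field to the enum.
val fileHeader = new CarbonHeaderReader(filePath).readHeader()
val version = ColumnarFormatVersion.valueOf(fileHeader.getVersion.toShort)
// creating a datamap is rejected when version is V1 or V2
```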


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/a1628978
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/a1628978
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/a1628978

Branch: refs/heads/master
Commit: a162897862c92947ea8fd63713b7dbe6098f3b13
Parents: 81038f5
Author: ndwangsen 
Authored: Wed Jul 11 17:41:25 2018 +0800
Committer: xuchuanyin 
Committed: Tue Jul 17 23:35:50 2018 +0800

--
 .../apache/carbondata/core/util/CarbonUtil.java | 51 
 .../datamap/CarbonCreateDataMapCommand.scala|  8 ++-
 2 files changed, 58 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/a1628978/core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java
--
diff --git a/core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java 
b/core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java
index 9796696..642fe8e 100644
--- a/core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java
+++ b/core/src/main/java/org/apache/carbondata/core/util/CarbonUtil.java
@@ -88,6 +88,7 @@ import org.apache.carbondata.core.util.path.CarbonTablePath;
 import org.apache.carbondata.format.BlockletHeader;
 import org.apache.carbondata.format.DataChunk2;
 import org.apache.carbondata.format.DataChunk3;
+import org.apache.carbondata.format.FileHeader;
 
 import com.google.gson.Gson;
 import com.google.gson.GsonBuilder;
@@ -3184,4 +3185,54 @@ public final class CarbonUtil {
 }
 return columnLocalDictGenMap;
   }
+
+  /**
+   * This method get the carbon file format version
+   *
+   * @param carbonTable
+   * carbon Table
+   */
+  public static ColumnarFormatVersion getFormatVersion(CarbonTable carbonTable)
+  throws IOException {
+String storePath = null;
+// if the carbontable is support flat folder
+boolean supportFlatFolder = carbonTable.isSupportFlatFolder();
+if (supportFlatFolder) {
+  storePath = carbonTable.getTablePath();
+} else {
+  // get the valid segments
+  SegmentStatusManager segmentStatusManager =
+  new SegmentStatusManager(carbonTable.getAbsoluteTableIdentifier());
+  SegmentStatusManager.ValidAndInvalidSegmentsInfo 
validAndInvalidSegmentsInfo =
+  segmentStatusManager.getValidAndInvalidSegments();
+  List validSegments = 
validAndInvalidSegmentsInfo.getValidSegments();
+  CarbonProperties carbonProperties = CarbonProperties.getInstance();
+  if (validSegments.isEmpty()) {
+return carbonProperties.getFormatVersion();
+  }
+  storePath = 
carbonTable.getSegmentPath(validSegments.get(0).getSegmentNo());
+}
+
+CarbonFile[] carbonFiles = FileFactory
+.getCarbonFile(storePath)
+.listFiles(new CarbonFileFilter() {
+  @Override
+  public boolean accept(CarbonFile file) {
+if (file == null) {
+  return false;
+}
+return file.getName().endsWith("carbondata");
+  }
+});
+if (carbonFiles == null || carbonFiles.length < 1) {
+  return CarbonProperties.getInstance().getFormatVersion();
+}
+
+CarbonFile carbonFile = carbonFiles[0];
+// get the carbon file header
+CarbonHeaderReader headerReader = new 
CarbonHeaderReader(carbonFile.getCanonicalPath());
+FileHeader fileHeader = headerReader.readHeader();
+int version = fileHeader.getVersion();
+return ColumnarFormatVersion.valueOf((short)version);
+  }
 }

http://git-wip-us.apache.org/repos/asf/carbondata/blob/a1628978/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/datamap/CarbonCreateDataMapCommand.scala
--
diff --git 
a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/datamap/CarbonCreateDataMapCommand.scala
 
b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/datamap/CarbonCreateDataMapCommand.scala
index 7600160..336793e 100644
--- 
a/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/datamap/CarbonCreateDataMapCommand.scala
+++ 
b/integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/datamap/CarbonCreateDataMapCommand.scala
@@ -26,9 +26,10 @@ import 
org.apache.carbondata.common.exceptions.sql.{MalformedCarbonComma

carbondata git commit: [CARBONDATA-2727][BloomDataMap] Support create bloom datamap on newly added column

2018-07-17 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master aec47e06f -> 81038f55e


[CARBONDATA-2727][BloomDataMap] Support create bloom datamap on newly added 
column

Add a result collector with rowId information for datamap rebuild when the
table schema has changed; use the key generator to retrieve the surrogate
value of the dictionary index column from the query result.

This closes #2490


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/81038f55
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/81038f55
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/81038f55

Branch: refs/heads/master
Commit: 81038f55ef9a582f82305378988f603ded76e524
Parents: aec47e0
Author: Manhua 
Authored: Wed Jul 11 19:39:31 2018 +0800
Committer: xuchuanyin 
Committed: Tue Jul 17 23:31:43 2018 +0800

--
 .../scan/collector/ResultCollectorFactory.java  |  31 ++---
 ...RowIdRestructureBasedRawResultCollector.java | 138 +++
 .../bloom/AbstractBloomDataMapWriter.java   |  72 +-
 .../bloom/BloomCoarseGrainDataMapFactory.java   |   2 +-
 .../datamap/bloom/BloomDataMapBuilder.java  |   8 ++
 .../datamap/bloom/BloomDataMapWriter.java   |  72 ++
 .../datamap/IndexDataMapRebuildRDD.scala| 131 +++---
 .../bloom/BloomCoarseGrainDataMapSuite.scala|  96 +
 8 files changed, 413 insertions(+), 137 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/81038f55/core/src/main/java/org/apache/carbondata/core/scan/collector/ResultCollectorFactory.java
--
diff --git a/core/src/main/java/org/apache/carbondata/core/scan/collector/ResultCollectorFactory.java b/core/src/main/java/org/apache/carbondata/core/scan/collector/ResultCollectorFactory.java
index ea4afd1..e0a0b90 100644
--- a/core/src/main/java/org/apache/carbondata/core/scan/collector/ResultCollectorFactory.java
+++ b/core/src/main/java/org/apache/carbondata/core/scan/collector/ResultCollectorFactory.java
@@ -18,15 +18,7 @@ package org.apache.carbondata.core.scan.collector;
 
 import org.apache.carbondata.common.logging.LogService;
 import org.apache.carbondata.common.logging.LogServiceFactory;
-import org.apache.carbondata.core.scan.collector.impl.AbstractScannedResultCollector;
-import org.apache.carbondata.core.scan.collector.impl.DictionaryBasedResultCollector;
-import org.apache.carbondata.core.scan.collector.impl.DictionaryBasedVectorResultCollector;
-import org.apache.carbondata.core.scan.collector.impl.RawBasedResultCollector;
-import org.apache.carbondata.core.scan.collector.impl.RestructureBasedDictionaryResultCollector;
-import org.apache.carbondata.core.scan.collector.impl.RestructureBasedRawResultCollector;
-import org.apache.carbondata.core.scan.collector.impl.RestructureBasedVectorResultCollector;
-import org.apache.carbondata.core.scan.collector.impl.RowIdBasedResultCollector;
-import org.apache.carbondata.core.scan.collector.impl.RowIdRawBasedResultCollector;
+import org.apache.carbondata.core.scan.collector.impl.*;
 import org.apache.carbondata.core.scan.executor.infos.BlockExecutionInfo;
 
 /**
@@ -51,14 +43,21 @@ public class ResultCollectorFactory {
 AbstractScannedResultCollector scannerResultAggregator = null;
 if (blockExecutionInfo.isRawRecordDetailQuery()) {
   if (blockExecutionInfo.isRestructuredBlock()) {
-LOGGER.info("Restructure based raw collector is used to scan and collect the data");
-scannerResultAggregator = new RestructureBasedRawResultCollector(blockExecutionInfo);
-  } else if (blockExecutionInfo.isRequiredRowId()) {
-LOGGER.info("RowId based raw collector is used to scan and collect the data");
-scannerResultAggregator = new RowIdRawBasedResultCollector(blockExecutionInfo);
+if (blockExecutionInfo.isRequiredRowId()) {
+  LOGGER.info("RowId Restructure based raw collector is used to scan and collect the data");
+  scannerResultAggregator = new RowIdRestructureBasedRawResultCollector(blockExecutionInfo);
+} else {
+  LOGGER.info("Restructure based raw collector is used to scan and collect the data");
+  scannerResultAggregator = new RestructureBasedRawResultCollector(blockExecutionInfo);
+}
   } else {
-LOGGER.info("Row based raw collector is used to scan and collect the data");
-scannerResultAggregator = new RawBasedResultCollector(blockExecutionInfo);
+if (blockExecutionInfo.isRequiredRowId()) {
+  LOGGER.info("RowId based raw collector is used to scan and collect the data");
+  scannerResultAggregator = new RowIdRawBasedResultCollector(blockExecutionInfo);

carbondata git commit: [CARBONDATA-2698][CARBONDATA-2700][CARBONDATA-2732][BloomDataMap] block some operations of bloomfilter datamap

2018-07-17 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master 8e7895715 -> 1c4358e89


[CARBONDATA-2698][CARBONDATA-2700][CARBONDATA-2732][BloomDataMap] Block some operations of bloomfilter datamap

1. Block creating a bloomfilter datamap index on columns whose datatype is a complex type;
2. Block changing the datatype of a bloomfilter index column;
3. Block dropping index columns of a bloomfilter index datamap

This closes #2505


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/1c4358e8
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/1c4358e8
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/1c4358e8

Branch: refs/heads/master
Commit: 1c4358e89f5cba1132e9512107d3a0cb22087b7b
Parents: 8e78957
Author: Sssan520 
Authored: Mon Jul 16 10:59:43 2018 +0800
Committer: xuchuanyin 
Committed: Tue Jul 17 16:34:14 2018 +0800

--
 .../core/datamap/dev/DataMapFactory.java| 13 +++
 .../core/metadata/schema/table/CarbonTable.java | 13 ++-
 .../bloom/BloomCoarseGrainDataMapFactory.java   | 37 +++-
 .../datamap/CarbonCreateDataMapCommand.scala| 10 ++
 .../CarbonAlterTableDataTypeChangeCommand.scala |  3 +-
 .../CarbonAlterTableDropColumnCommand.scala |  3 +-
 .../bloom/BloomCoarseGrainDataMapSuite.scala| 99 
 7 files changed, 171 insertions(+), 7 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/1c4358e8/core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMapFactory.java
--
diff --git a/core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMapFactory.java b/core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMapFactory.java
index 0889f8b..ab0f8ea 100644
--- a/core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMapFactory.java
+++ b/core/src/main/java/org/apache/carbondata/core/datamap/dev/DataMapFactory.java
@@ -144,4 +144,17 @@ public abstract class DataMapFactory {
 }
   }
 
+  /**
+   * whether to block an operation on the corresponding table or column.
+   * For example, bloomfilter datamap will block changing datatype for a bloomindex column.
+   * By default it will not block any operation.
+   *
+   * @param operation table operation
+   * @param targets objects which the operation impacts, such as columns
+   * @return true if the operation will be blocked; false if it will not be blocked
+   */
+  public boolean isOperationBlocked(TableOperation operation, Object... targets) {
+return false;
+  }
+
 }
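
A concrete factory can veto table operations through this new hook. The hypothetical override below sketches the bloom use case; the TableOperation.ALTER_CHANGE_DATATYPE constant and the indexedColumns field are assumed names for illustration, not taken from this diff.

@Override
public boolean isOperationBlocked(TableOperation operation, Object... targets) {
  // Block datatype changes that touch one of this datamap's index columns,
  // since they would silently invalidate the bloom index.
  if (operation == TableOperation.ALTER_CHANGE_DATATYPE) {
    for (Object target : targets) {
      if (indexedColumns.contains(String.valueOf(target))) {
        return true;
      }
    }
  }
  return false;
}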

http://git-wip-us.apache.org/repos/asf/carbondata/blob/1c4358e8/core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java
--
diff --git a/core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java b/core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java
index 71256d4..995f943 100644
--- a/core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java
+++ b/core/src/main/java/org/apache/carbondata/core/metadata/schema/table/CarbonTable.java
@@ -1054,11 +1054,12 @@ public class CarbonTable implements Serializable {
   /**
* method returns true if the operation is allowed for the corresponding datamap or not;
* if this operation makes the datamap stale it is not allowed
-   * @param carbonTable
-   * @param operation
-   * @return
+   * @param carbonTable carbontable to be operated on
+   * @param operation which operation on the table, such as drop column or change datatype.
+   * @param targets objects which the operation impacts, such as columns
+   * @return true if allowed; false if not allowed
*/
-  public boolean canAllow(CarbonTable carbonTable, TableOperation operation) {
+  public boolean canAllow(CarbonTable carbonTable, TableOperation operation, Object... targets) {
 try {
   List<TableDataMap> datamaps = DataMapStoreManager.getInstance().getAllDataMap(carbonTable);
   if (!datamaps.isEmpty()) {
@@ -1069,6 +1070,10 @@ public class CarbonTable implements Serializable {
   if (factoryClass.willBecomeStale(operation)) {
 return false;
   }
+  // check whether the operation is blocked for the datamap
+  if (factoryClass.isOperationBlocked(operation, targets)) {
+return false;
+  }
 }
   }
 } catch (Exception e) {
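
Commands can consult canAllow before mutating the table. A hypothetical call site, where the operation constant and error message are illustrative only:

// Illustrative pre-check inside an alter-column command.
if (!carbonTable.canAllow(carbonTable, TableOperation.ALTER_CHANGE_DATATYPE, columnName)) {
  throw new UnsupportedOperationException(
      "alter table change datatype is not supported for index datamap");
}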

http://git-wip-us.apache.org/repos/asf/carbondata/blob/1c4358e8/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapFactory.java
--
diff --git a/datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapFactory.java b/datamap/bloom/src/m

carbondata git commit: [CARBONDATA-2719] Block update and delete on table having datamaps

2018-07-12 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master 84102a22a -> 56e7dad7b


[CARBONDATA-2719] Block update and delete on table having datamaps

Update and delete operations need to be blocked on tables which have datamaps.

This closes #2483
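
In effect, the analysis rules now reject update and delete plans when the target table carries datamaps. A minimal sketch of the guard, where hasIndexDataMaps is a hypothetical helper; only the exception type and message prefix mirror the updated tests below.

// Hypothetical guard; the tests assert an UnsupportedOperationException whose
// message contains "Update operation is not supported" or
// "Delete operation is not supported".
static void checkUpdateAllowed(CarbonTable table) {
  if (hasIndexDataMaps(table)) {
    throw new UnsupportedOperationException("Update operation is not supported");
  }
}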


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/56e7dad7
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/56e7dad7
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/56e7dad7

Branch: refs/heads/master
Commit: 56e7dad7b18b6d5946ccdc49c0d264384225d231
Parents: 84102a2
Author: ndwangsen 
Authored: Wed Jul 11 11:52:09 2018 +0800
Committer: xuchuanyin 
Committed: Fri Jul 13 09:50:56 2018 +0800

--
 .../lucene/LuceneFineGrainDataMapSuite.scala|  8 ++--
 .../iud/DeleteCarbonTableTestCase.scala | 44 +++
 .../TestInsertAndOtherCommandConcurrent.scala   | 12 +++--
 .../iud/UpdateCarbonTableTestCase.scala | 46 
 .../spark/sql/hive/CarbonAnalysisRules.scala| 37 +++-
 5 files changed, 138 insertions(+), 9 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/56e7dad7/integration/spark-common-test/src/test/scala/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMapSuite.scala
--
diff --git a/integration/spark-common-test/src/test/scala/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMapSuite.scala b/integration/spark-common-test/src/test/scala/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMapSuite.scala
index fd55145..657a3eb 100644
--- a/integration/spark-common-test/src/test/scala/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMapSuite.scala
+++ b/integration/spark-common-test/src/test/scala/org/apache/carbondata/datamap/lucene/LuceneFineGrainDataMapSuite.scala
@@ -641,15 +641,15 @@ class LuceneFineGrainDataMapSuite extends QueryTest with BeforeAndAfterAll {
 assert(ex4.getMessage.contains("alter table drop column is not supported"))
 
 sql(s"LOAD DATA LOCAL INPATH '$file2' INTO TABLE datamap_test7 
OPTIONS('header'='false')")
-val ex5 = intercept[MalformedCarbonCommandException] {
+val ex5 = intercept[UnsupportedOperationException] {
   sql("UPDATE datamap_test7 d set(d.city)=('luc') where 
d.name='n10'").show()
 }
-assert(ex5.getMessage.contains("update operation is not supported"))
+assert(ex5.getMessage.contains("Update operation is not supported"))
 
-val ex6 = intercept[MalformedCarbonCommandException] {
+val ex6 = intercept[UnsupportedOperationException] {
   sql("delete from datamap_test7 where name = 'n10'").show()
 }
-assert(ex6.getMessage.contains("delete operation is not supported"))
+assert(ex6.getMessage.contains("Delete operation is not supported"))
   }
 
   test("test lucene fine grain multiple data map on table") {

http://git-wip-us.apache.org/repos/asf/carbondata/blob/56e7dad7/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/iud/DeleteCarbonTableTestCase.scala
--
diff --git a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/iud/DeleteCarbonTableTestCase.scala b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/iud/DeleteCarbonTableTestCase.scala
index 64aae1d..de93229 100644
--- a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/iud/DeleteCarbonTableTestCase.scala
+++ b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/iud/DeleteCarbonTableTestCase.scala
@@ -298,6 +298,50 @@ class DeleteCarbonTableTestCase extends QueryTest with BeforeAndAfterAll {
 
   }
 
+  test("block deleting records from table which has preaggregate datamap") {
+sql("drop table if exists test_dm_main")
+sql("drop table if exists test_dm_main_preagg1")
+
+sql("create table test_dm_main (a string, b string, c string) stored by 
'carbondata'")
+sql("insert into test_dm_main select 'aaa','bbb','ccc'")
+sql("insert into test_dm_main select 'bbb','bbb','ccc'")
+sql("insert into test_dm_main select 'ccc','bbb','ccc'")
+
+sql(
+  "create datamap preagg1 on table test_dm_main using 'preaggregate' as 
select" +
+  " a,sum(b) from test_dm_main group by a")
+
+assert(intercept[UnsupportedOperationException] {
+  sql("delete from test_dm_main_preagg1 where test_dm_main_a = 'bbb'")
+}.getMessage.contains("Delete operation is not supported"))

carbondata git commit: [HOTFIX][CARBONDATA-2716][DataMap] fix bug for loading datamap [Forced Update!]

2018-07-12 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master 0fb1e02a9 -> 84102a22a (forced update)


[HOTFIX][CARBONDATA-2716][DataMap] fix bug for loading datamap

In some scenarios, the carbonTable input parameter of getCarbonFactDataHandlerModel may differ from the table held by the load model.

This closes #2497
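
The one-line diff below swaps which table is used when registering datamap writers. A simplified stand-in, with every carbondata type reduced to a string, to illustrate the bug class being fixed:

// The method must honor its own carbonTable argument rather than re-deriving
// the table from the load model, which may point at a different (e.g. parent) table.
static void registerWriters(String carbonTableName, String loadModelTableName) {
  // before the fix: writers were registered for loadModelTableName
  // after the fix: writers are registered for the table actually passed in
  System.out.println("registering datamap writers for table " + carbonTableName);
}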


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/84102a22
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/84102a22
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/84102a22

Branch: refs/heads/master
Commit: 84102a22accd3d24d52faccc747a62887c13d502
Parents: 9d7a9a2
Author: Manhua 
Authored: Thu Jul 12 16:47:15 2018 +0800
Committer: xuchuanyin 
Committed: Fri Jul 13 09:25:25 2018 +0800

--
 .../carbondata/processing/store/CarbonFactDataHandlerModel.java| 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/84102a22/processing/src/main/java/org/apache/carbondata/processing/store/CarbonFactDataHandlerModel.java
--
diff --git a/processing/src/main/java/org/apache/carbondata/processing/store/CarbonFactDataHandlerModel.java b/processing/src/main/java/org/apache/carbondata/processing/store/CarbonFactDataHandlerModel.java
index 63e47f0..ca75b8c 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/store/CarbonFactDataHandlerModel.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/store/CarbonFactDataHandlerModel.java
@@ -332,7 +332,7 @@ public class CarbonFactDataHandlerModel {
 new TableSpec(loadModel.getCarbonDataLoadSchema().getCarbonTable());
 DataMapWriterListener listener = new DataMapWriterListener();
 listener.registerAllWriter(
-loadModel.getCarbonDataLoadSchema().getCarbonTable(),
+carbonTable,
 loadModel.getSegmentId(),
 CarbonTablePath.getShardName(
 
CarbonTablePath.DataFileUtil.getTaskIdFromTaskNo(loadModel.getTaskNo()),



carbondata git commit: [HOTFIX][CARBONDATA-2716][DataMap] fix bug for loading datamap

2018-07-12 Thread xuchuanyin
Repository: carbondata
Updated Branches:
  refs/heads/master 9d7a9a2a9 -> 0fb1e02a9


[HOTFIX][CARBONDATA-2716][DataMap] fix bug for loading datamap

In some scenarios, the carbonTable input parameter of getCarbonFactDataHandlerModel may differ from the table held by the load model.


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/0fb1e02a
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/0fb1e02a
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/0fb1e02a

Branch: refs/heads/master
Commit: 0fb1e02a94ed1032598c99807516e34e0ed63e00
Parents: 9d7a9a2
Author: Manhua 
Authored: Thu Jul 12 16:47:15 2018 +0800
Committer: xuchuanyin 
Committed: Fri Jul 13 09:22:20 2018 +0800

--
 .../carbondata/processing/store/CarbonFactDataHandlerModel.java| 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/carbondata/blob/0fb1e02a/processing/src/main/java/org/apache/carbondata/processing/store/CarbonFactDataHandlerModel.java
--
diff --git a/processing/src/main/java/org/apache/carbondata/processing/store/CarbonFactDataHandlerModel.java b/processing/src/main/java/org/apache/carbondata/processing/store/CarbonFactDataHandlerModel.java
index 63e47f0..ca75b8c 100644
--- a/processing/src/main/java/org/apache/carbondata/processing/store/CarbonFactDataHandlerModel.java
+++ b/processing/src/main/java/org/apache/carbondata/processing/store/CarbonFactDataHandlerModel.java
@@ -332,7 +332,7 @@ public class CarbonFactDataHandlerModel {
 new TableSpec(loadModel.getCarbonDataLoadSchema().getCarbonTable());
 DataMapWriterListener listener = new DataMapWriterListener();
 listener.registerAllWriter(
-loadModel.getCarbonDataLoadSchema().getCarbonTable(),
+carbonTable,
 loadModel.getSegmentId(),
 CarbonTablePath.getShardName(
 
CarbonTablePath.DataFileUtil.getTaskIdFromTaskNo(loadModel.getTaskNo()),