[jira] [Comment Edited] (CARBONDATA-3327) Errors lies in query with small blocklet size

2019-03-24 Thread xuchuanyin (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-3327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16799944#comment-16799944
 ] 

xuchuanyin edited comment on CARBONDATA-3327 at 3/24/19 8:34 AM:
-

Besides, I noticed that if we do not filter on the sort_columns, the problem 
will appear.

The content of the diff can also be accessed 
[here|https://gist.github.com/xuchuanyin/e5ffa3cca7c0ad62128fbf8dc1844a10]


was (Author: xuchuanyin):
Besides, I noticed that if we do not filter on the sort_columns, the problem 
will appear.

The content of the diff can also be accessed here:
[diff|https://gist.github.com/xuchuanyin/e5ffa3cca7c0ad62128fbf8dc1844a10]


[jira] [Comment Edited] (CARBONDATA-3327) Errors lies in query with small blocklet size

2019-03-24 Thread xuchuanyin (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-3327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16799944#comment-16799944
 ] 

xuchuanyin edited comment on CARBONDATA-3327 at 3/24/19 8:33 AM:
-

Besides, I noticed that if we do not filter on the sort_columns, the problem 
will appear.

The content of the diff can also be accessed here:
[diff|https://gist.github.com/xuchuanyin/e5ffa3cca7c0ad62128fbf8dc1844a10]


was (Author: xuchuanyin):
Besides, I noticed that if we do not filter on the sort_columns, the problem 
will appear.


[jira] [Commented] (CARBONDATA-3327) Errors lies in query with small blocklet size

2019-03-24 Thread xuchuanyin (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-3327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16799944#comment-16799944
 ] 

xuchuanyin commented on CARBONDATA-3327:


Besides, I noticed that if we do not filter on the sort_columns, the problem 
will appear.


[jira] [Created] (CARBONDATA-3327) Errors lies in query with small blocklet size

2019-03-24 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-3327:
--

 Summary: Errors lies in query with small blocklet size
 Key: CARBONDATA-3327
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3327
 Project: CarbonData
  Issue Type: Bug
Reporter: xuchuanyin


While applying the following patch:
```diff
diff --git a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
index 69374ad..c6b63a4 100644
--- a/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
+++ b/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
@@ -54,7 +54,7 @@ public final class CarbonCommonConstants {
   /**
    * min blocklet size
    */
-  public static final int BLOCKLET_SIZE_MIN_VAL = 2000;
+  public static final int BLOCKLET_SIZE_MIN_VAL = 1;
 
   /**
    * max blocklet size
diff --git a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/sortcolumns/TestSortColumns.scala b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/sortcolumns/TestSortColumns.scala
index df97d0f..ace9fd5 100644
--- a/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/sortcolumns/TestSortColumns.scala
+++ b/integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/sortcolumns/TestSortColumns.scala
@@ -29,6 +29,7 @@ import org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandExcepti
 class TestSortColumns extends QueryTest with BeforeAndAfterAll {
 
   override def beforeAll {
+    CarbonProperties.getInstance().addProperty(CarbonCommonConstants.BLOCKLET_SIZE, "2")
     CarbonProperties.getInstance().addProperty(
       CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "dd-MM-yyyy")
 
```
I found that some of the tests in `TestSortColumns` fail with an NPE; the error log shows:
```
19/03/23 20:54:30 ERROR Executor: Exception in task 0.0 in stage 104.0 (TID 173)
java.lang.NullPointerException
at org.apache.parquet.io.api.Binary$ByteArrayBackedBinary.getBytes(Binary.java:294)
at org.apache.spark.sql.execution.vectorized.ColumnVector.getUTF8String(ColumnVector.java:646)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:234)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:228)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:827)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:827)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
19/03/23 20:54:30 ERROR TaskSetManager: Task 0 in stage 104.0 failed 1 times; aborting job
19/03/23 20:54:30 INFO TestSortColumns: 
= FINISHED org.apache.carbondata.spark.testsuite.sortcolumns.TestSortColumns: 'filter on sort_columns include no-dictionary, direct-dictionary and dictioanry' =
19/03/23 20:54:30 INFO TestSortColumns: 
= TEST OUTPUT FOR org.apache.carbondata.spark.testsuite.sortcolumns.TestSortColumns: 'unsorted table creation, query data loading with heap and safe sort config' =
Job aborted due to stage failure: Task 0 in stage 104.0 failed 1 times, most recent failure: Lost task 0.0 in stage 104.0 (TID 173, localhost, executor driver): java.lang.NullPointerException
at org.apache.parquet.io.api.Binary$ByteArrayBackedBinary.getBytes(Binary.java:294)
at org.apache.spark.sql.execution.vectorized.ColumnVector.getUTF8String(ColumnVector.java:646)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
at 
```

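For reference, a minimal reproduction sketch of this setup (assuming the patch above is applied so that a blocklet size of "2" passes validation, and a CarbonSession named `spark`; the table name, schema and data are illustrative):

```scala
// Hypothetical end-to-end reproduction; not taken from the test suite verbatim.
import org.apache.carbondata.core.constants.CarbonCommonConstants
import org.apache.carbondata.core.util.CarbonProperties

// Force very small blocklets, as in the patch above.
CarbonProperties.getInstance().addProperty(CarbonCommonConstants.BLOCKLET_SIZE, "2")

spark.sql("DROP TABLE IF EXISTS small_blocklet")
spark.sql(
  """CREATE TABLE small_blocklet (name string, id int)
    |STORED BY 'carbondata'
    |TBLPROPERTIES('sort_columns'='name')""".stripMargin)
spark.sql("INSERT INTO small_blocklet SELECT 'a', 1")
spark.sql("INSERT INTO small_blocklet SELECT 'b', 2")
// Per the report, filtering on a column outside sort_columns hits the NPE.
spark.sql("SELECT name FROM small_blocklet WHERE id = 2").show()
```
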
[jira] [Resolved] (CARBONDATA-3281) Limit the LRU cache size

2019-03-07 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin resolved CARBONDATA-3281.

Resolution: Fixed

> Limit the LRU cache size
> 
>
> Key: CARBONDATA-3281
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3281
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: TaoLi
>Priority: Minor
>  Time Spent: 12h
>  Remaining Estimate: 0h
>
> If the configured LRU cache size is bigger than the JVM -Xmx size, fall back to
> CARBON_MAX_LRU_CACHE_SIZE_DEFAULT instead, because if the LRU cache is set
> bigger than -Xmx and we query a big table with many carbon files, it may cause
> "Error: java.io.IOException: Problem in loading segment blocks: GC overhead
> limit exceeded (state=,code=0)" and the JDBC server will restart.
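A sketch of the guard this issue describes (the property key is CarbonData's `carbon.max.driver.lru.cache.size`; the surrounding logic and variable names are illustrative, not the actual fix):

```scala
// Illustrative: fall back when the configured LRU size exceeds the JVM heap.
val configuredLruMb = CarbonProperties.getInstance()
  .getProperty("carbon.max.driver.lru.cache.size", "-1").toLong
val xmxMb = Runtime.getRuntime.maxMemory() / (1024 * 1024)
val effectiveLruMb =
  if (configuredLruMb > xmxMb) -1L // assume -1 stands in for CARBON_MAX_LRU_CACHE_SIZE_DEFAULT
  else configuredLruMb
```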





[jira] [Assigned] (CARBONDATA-3281) Limit the LRU cache size

2019-03-07 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin reassigned CARBONDATA-3281:
--

Assignee: (was: xuchuanyin)



[jira] [Assigned] (CARBONDATA-3281) Limit the LRU cache size

2019-03-07 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin reassigned CARBONDATA-3281:
--

Assignee: xuchuanyin



[jira] [Resolved] (CARBONDATA-2447) Range Partition Table。When the update operation is performed, the data will be lost.

2019-02-22 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin resolved CARBONDATA-2447.

   Resolution: Fixed
Fix Version/s: (was: NONE)

> Range Partition Table。When the update operation is performed, the data will 
> be lost.
> 
>
> Key: CARBONDATA-2447
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2447
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 1.3.1
> Environment: centos6.5
> java8
> Spark2.1.0
> CarbonData1.3.1
>Reporter: duweike
>Priority: Blocker
> Attachments: 微信图片_20180507113738.jpg, 微信图片_20180507113748.jpg
>
>   Original Estimate: 72h
>  Time Spent: 7h 10m
>  Remaining Estimate: 64h 50m
>
> Range partition table: when the update operation is performed, the data will
> be lost. As shown in the attached pictures, the data loss always occurs.





[jira] [Resolved] (CARBONDATA-3107) Optimize error/exception coding for better debugging

2019-02-22 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin resolved CARBONDATA-3107.

Resolution: Fixed

> Optimize error/exception coding for better debugging
> 
>
> Key: CARBONDATA-3107
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3107
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: jiangmanhua
>Priority: Major
>  Time Spent: 4h
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-3278) Remove duplicate code to get filter string of date/timestamp

2019-02-22 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin resolved CARBONDATA-3278.

Resolution: Fixed

> Remove duplicate code to get filter string of date/timestamp
> 
>
> Key: CARBONDATA-3278
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3278
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: jiangmanhua
>Assignee: jiangmanhua
>Priority: Major
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-3181) IllegalAccessError for BloomFilter.bits when bloom_compress is false

2018-12-20 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin resolved CARBONDATA-3181.

   Resolution: Fixed
Fix Version/s: 1.5.2

> IllegalAccessError for BloomFilter.bits when bloom_compress is false
> 
>
> Key: CARBONDATA-3181
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3181
> Project: CarbonData
>  Issue Type: Bug
>Reporter: jiangmanhua
>Priority: Major
> Fix For: 1.5.2
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> ```
> 18/12/19 11:16:07 ERROR thriftserver.SparkExecuteStatementOperation: Error executing query, currentState RUNNING,
> java.lang.IllegalAccessError: tried to access field org.apache.hadoop.util.bloom.BloomFilter.bits from class org.apache.hadoop.util.bloom.CarbonBloomFilter
>  at org.apache.hadoop.util.bloom.CarbonBloomFilter.membershipTest(CarbonBloomFilter.java:70)
>  at org.apache.carbondata.datamap.bloom.BloomCoarseGrainDataMap.prune(BloomCoarseGrainDataMap.java:202)
>  at org.apache.carbondata.core.datamap.TableDataMap.pruneWithFilter(TableDataMap.java:185)
>  at org.apache.carbondata.core.datamap.TableDataMap.prune(TableDataMap.java:160)
>  at org.apache.carbondata.core.datamap.dev.expr.DataMapExprWrapperImpl.prune(DataMapExprWrapperImpl.java:53)
>  at org.apache.carbondata.hadoop.api.CarbonInputFormat.getPrunedBlocklets(CarbonInputFormat.java:517)
>  at org.apache.carbondata.hadoop.api.CarbonInputFormat.getDataBlocksOfSegment(CarbonInputFormat.java:412)
>  at org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getSplits(CarbonTableInputFormat.java:529)
>  at org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getSplits(CarbonTableInputFormat.java:220)
>  at org.apache.carbondata.spark.rdd.CarbonScanRDD.internalGetPartitions(CarbonScanRDD.scala:127)
>  at org.apache.carbondata.spark.rdd.CarbonRDD.getPartitions(CarbonRDD.scala:66)
>  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252)
>  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250)
> ```
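For context, the failing scenario can be set up like this (a sketch using the bloomfilter datamap DDL; table and column names are placeholders):

```scala
// A bloom datamap with compression disabled, which exercises
// CarbonBloomFilter.membershipTest in the trace above.
spark.sql(
  """CREATE DATAMAP dm_bloom ON TABLE t
    |USING 'bloomfilter'
    |DMPROPERTIES('index_columns'='id', 'bloom_compress'='false')""".stripMargin)
spark.sql("SELECT * FROM t WHERE id = 1").show() // threw IllegalAccessError before the fix
```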





[jira] [Resolved] (CARBONDATA-3166) Changes in Document and Displaying Carbon Column Compressor used in Describe Formatted Command

2018-12-14 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin resolved CARBONDATA-3166.

   Resolution: Fixed
Fix Version/s: 1.5.2

> Changes in Document and Displaying Carbon Column Compressor used in Describe 
> Formatted Command
> --
>
> Key: CARBONDATA-3166
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3166
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Shardul Singh
>Assignee: Shardul Singh
>Priority: Minor
> Fix For: 1.5.2
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Changes in Document and Displaying Carbon Column Compressor used in Describe 
> Formatted Command





[jira] [Created] (CARBONDATA-3139) Fix bugs in datamap example

2018-11-28 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-3139:
--

 Summary: Fix bugs in datamap example
 Key: CARBONDATA-3139
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3139
 Project: CarbonData
  Issue Type: Bug
Reporter: xuchuanyin
Assignee: xuchuanyin








[jira] [Resolved] (CARBONDATA-3133) Update carbondata build document

2018-11-27 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin resolved CARBONDATA-3133.

Resolution: Fixed

> Update carbondata build document
> 
>
> Key: CARBONDATA-3133
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3133
> Project: CarbonData
>  Issue Type: Improvement
>  Components: build
>Affects Versions: NONE
>Reporter: Jonathan.Wei
>Assignee: Jonathan.Wei
>Priority: Major
> Fix For: 1.5.2
>
>   Original Estimate: 1h
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Update the document to add Spark 2.3.2 and the datamap mv compiling method.





[jira] [Resolved] (CARBONDATA-3031) Find wrong description in the document for 'carbon.number.of.cores.while.loading'

2018-11-16 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin resolved CARBONDATA-3031.

   Resolution: Fixed
Fix Version/s: 1.5.1

> Find wrong description in the document for 
> 'carbon.number.of.cores.while.loading'
> -
>
> Key: CARBONDATA-3031
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3031
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: xuchuanyin
>Assignee: lianganping
>Priority: Major
> Fix For: 1.5.1
>
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>


[jira] [Resolved] (CARBONDATA-3087) Prettify DESC FORMATTED output

2018-11-15 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin resolved CARBONDATA-3087.

   Resolution: Fixed
 Assignee: Jacky Li
Fix Version/s: 1.5.1

> Prettify DESC FORMATTED output
> --
>
> Key: CARBONDATA-3087
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3087
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Jacky Li
>Assignee: Jacky Li
>Priority: Major
> Fix For: 1.5.1
>
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> Change output of DESC FORMATTED to:
> {noformat}
> +----------------------------+----------------------------------------------+-------+
> |col_name                    |data_type                                     |comment|
> +----------------------------+----------------------------------------------+-------+
> |shortfield                  |smallint                                      |null   |
> |intfield                    |int                                           |null   |
> |bigintfield                 |bigint                                        |null   |
> |doublefield                 |double                                        |null   |
> |stringfield                 |string                                        |null   |
> |timestampfield              |timestamp                                     |null   |
> |decimalfield                |decimal(18,2)                                 |null   |
> |datefield                   |date                                          |null   |
> |charfield                   |string                                        |null   |
> |floatfield                  |double                                        |null   |
> |                            |                                              |       |
> |## Table Basic Information  |                                              |       |
> |Comment                     |                                              |       |
> |Path                        |/Users/jacky/code/carbondata/examples/spark2/target/store/default/carbonsession_table|       |
> |Table Block Size            |1024 MB                                       |       |
> |Table Blocklet Size         |64 MB                                         |       |
> |Streaming                   |false                                         |       |
> |Flat Folder                 |false                                         |       |
> |Bad Record Path             |                                              |       |
> |Min Input Per Node          |0.0B                                          |       |
> |                            |                                              |       |
> |## Index Information        |                                              |       |
> |Sort Scope                  |LOCAL_SORT                                    |       |
> |Sort Columns                |stringfield,timestampfield,datefield,charfield|       |
> |Index Cache Level           |BLOCK                                         |       |
> |Cached Index Columns        |All columns                                   |       |
> |                            |                                              |       |
> |## Encoding Information     |                                              |       |
> |Local Dictionary Enabled    |true                                          |       |
> |Local Dictionary Threshold  |1

[jira] [Updated] (CARBONDATA-3088) enhance compaction performance by using prefetch

2018-11-07 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin updated CARBONDATA-3088:
---
Issue Type: Improvement  (was: Bug)

> enhance compaction performance by using prefetch
> 
>
> Key: CARBONDATA-3088
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3088
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: xuchuanyin
>Assignee: xuchuanyin
>Priority: Major
>






[jira] [Created] (CARBONDATA-3088) enhance compaction performance by using prefetch

2018-11-07 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-3088:
--

 Summary: enhance compaction performance by using prefetch
 Key: CARBONDATA-3088
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3088
 Project: CarbonData
  Issue Type: Bug
Reporter: xuchuanyin
Assignee: xuchuanyin








[jira] [Commented] (CARBONDATA-3086) Unable build project due maven error

2018-11-06 Thread xuchuanyin (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-3086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677796#comment-16677796
 ] 

xuchuanyin commented on CARBONDATA-3086:


not sure about this error, can you try maven 3.5.0 instead?

> Unable build project due maven error
> 
>
> Key: CARBONDATA-3086
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3086
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Almaz Murzabekov
>Priority: Major
> Attachments: packaging.log
>
>
> Hi, guys!
> Can you help me, please?
> I am trying to build the project after cloning it from GitHub, with the command
> {code:java}
> mvn -DskipTests -Pspark-2.1 -Dspark.version=2.1.0 clean package -X >> packaging_log.log
> {code}
> but I got an error in the Core module (No such file or directory; see the
> attached file for the full log).
> Environment:
> OS: CentOS (Docker image on Windows 10)
> Java: OpenJDK 1.8.0_191
> Maven: 3.0.5 (Red Hat 3.0.5-17)
> Git: 1.8.3.1







[jira] [Resolved] (CARBONDATA-3078) Exception caused by explain command for count star query without filter

2018-11-06 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin resolved CARBONDATA-3078.

   Resolution: Fixed
 Assignee: jiangmanhua
Fix Version/s: 1.5.1

> Exception caused by explain command for count star query without filter
> ---
>
> Key: CARBONDATA-3078
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3078
> Project: CarbonData
>  Issue Type: Bug
>Reporter: jiangmanhua
>Assignee: jiangmanhua
>Priority: Major
> Fix For: 1.5.1
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Procedure to reproduce the problem:
> - create the table test_tbl;
> - load some data into the table;
> - run the query "explain select count(*) from test_tbl"
> 
> ```
> Exception in thread "main" java.lang.IllegalStateException
>  at org.apache.carbondata.core.profiler.ExplainCollector.getCurrentTablePruningInfo(ExplainCollector.java:162)
>  at org.apache.carbondata.core.profiler.ExplainCollector.setShowPruningInfo(ExplainCollector.java:106)
>  at org.apache.carbondata.core.indexstore.blockletindex.BlockDataMap.prune(BlockDataMap.java:696)
>  at org.apache.carbondata.core.indexstore.blockletindex.BlockDataMap.prune(BlockDataMap.java:743)
>  at org.apache.carbondata.core.indexstore.blockletindex.BlockletDataMapFactory.getAllBlocklets(BlockletDataMapFactory.java:391)
>  at org.apache.carbondata.core.datamap.TableDataMap.prune(TableDataMap.java:132)
>  at org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getBlockRowCount(CarbonTableInputFormat.java:618)
>  at org.apache.spark.sql.CarbonCountStar.doExecute(CarbonCountStar.scala:59)
>  at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
>  at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
>  at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
>  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>  at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135)
>  at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116)
>  at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
>  at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92)
>  at org.apache.spark.sql.execution.command.table.CarbonExplainCommand.collectProfiler(CarbonExplainCommand.scala:54)
>  at org.apache.spark.sql.execution.command.table.CarbonExplainCommand.processMetadata(CarbonExplainCommand.scala:45)
>  at org.apache.spark.sql.execution.command.MetadataCommand.run(package.scala:68)
>  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
>  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
>  at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:67)
>  at org.apache.spark.sql.Dataset.<init>(Dataset.scala:183)
>  at org.apache.spark.sql.CarbonSession$$anonfun$sql$1.apply(CarbonSession.scala:106)
>  at org.apache.spark.sql.CarbonSession$$anonfun$sql$1.apply(CarbonSession.scala:95)
>  at org.apache.spark.sql.CarbonSession.withProfiler(CarbonSession.scala:154)
>  at org.apache.spark.sql.CarbonSession.sql(CarbonSession.scala:93)
>  at org.apache.carbondata.examples.SQL_Prune$.main(Test.scala:101)
>  at org.apache.carbondata.examples.SQL_Prune.main(Test.scala)
> Process finished with exit code 1
> ```
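The steps above as runnable statements (a sketch; the schema and CSV path are placeholders):

```scala
// Reproduction per the quoted procedure.
spark.sql("CREATE TABLE test_tbl (id int, name string) STORED BY 'carbondata'")
spark.sql("LOAD DATA INPATH '/tmp/data.csv' INTO TABLE test_tbl")
// Hit the IllegalStateException in ExplainCollector before the fix:
spark.sql("EXPLAIN SELECT COUNT(*) FROM test_tbl").show(false)
```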





[jira] [Resolved] (CARBONDATA-3074) Change default sort temp compressor to SNAPPY

2018-11-05 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin resolved CARBONDATA-3074.

   Resolution: Fixed
Fix Version/s: 1.5.1

> Change default sort temp  compressor to SNAPPY
> --
>
> Key: CARBONDATA-3074
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3074
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: jiangmanhua
>Assignee: jiangmanhua
>Priority: Major
> Fix For: 1.5.1
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>






[jira] [Created] (CARBONDATA-3069) fix bugs in setting cores for compaction

2018-11-02 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-3069:
--

 Summary: fix bugs in setting cores for compaction
 Key: CARBONDATA-3069
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3069
 Project: CarbonData
  Issue Type: Bug
Reporter: xuchuanyin
Assignee: xuchuanyin








[jira] [Created] (CARBONDATA-3067) Add check for debug to avoid string concat

2018-10-31 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-3067:
--

 Summary: Add check for debug to avoid string concat
 Key: CARBONDATA-3067
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3067
 Project: CarbonData
  Issue Type: Improvement
Reporter: xuchuanyin
Assignee: xuchuanyin


For debug logs, we should check whether debug logging is enabled before calling
the log method, to avoid unnecessary string concatenation.
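A sketch of the pattern (`LOGGER` stands for a class's log4j logger; the message fields are made up):

```scala
// Guard the call so the message string is only built when debug is enabled.
if (LOGGER.isDebugEnabled) {
  LOGGER.debug("Processed blocklet " + blockletId + " of block " + blockId)
}
```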





[jira] [Created] (CARBONDATA-3053) Un-closed file stream found in cli

2018-10-29 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-3053:
--

 Summary: Un-closed file stream found in cli
 Key: CARBONDATA-3053
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3053
 Project: CarbonData
  Issue Type: Improvement
Reporter: xuchuanyin
Assignee: xuchuanyin








[jira] [Resolved] (CARBONDATA-3041) Optimize load minimum size strategy for data loading

2018-10-29 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin resolved CARBONDATA-3041.

Resolution: Fixed

> Optimize load minimum size strategy for data loading
> 
>
> Key: CARBONDATA-3041
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3041
> Project: CarbonData
>  Issue Type: Improvement
>  Components: data-load
>Affects Versions: 1.5.0
>Reporter: wangsen
>Assignee: wangsen
>Priority: Minor
> Fix For: 1.5.1
>
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> 1. Remove the system property carbon.load.min.size.enabled; load_min_size_inmb
> becomes a table property and can also be specified in the load option.
> 2. Support "ALTER TABLE xxx SET TBLPROPERTIES('load_min_size_inmb'='256')".
> 3. If a table was created with the load_min_size_inmb property, display it via
> the DESC FORMATTED command.
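The three ways of setting the property after this change, per the list above (a sketch; the table schema and load path are placeholders):

```scala
// 1. As a table property at creation time:
spark.sql(
  """CREATE TABLE t (id int, name string)
    |STORED BY 'carbondata'
    |TBLPROPERTIES('load_min_size_inmb'='256')""".stripMargin)
// 2. Altered later:
spark.sql("ALTER TABLE t SET TBLPROPERTIES('load_min_size_inmb'='256')")
// 3. Per load, as a load option:
spark.sql("LOAD DATA INPATH '/tmp/data.csv' INTO TABLE t OPTIONS('load_min_size_inmb'='256')")
```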





[jira] [Created] (CARBONDATA-3051) unclosed streams cause tests failure in windows env

2018-10-29 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-3051:
--

 Summary: unclosed streams cause tests failure in windows env
 Key: CARBONDATA-3051
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3051
 Project: CarbonData
  Issue Type: Improvement
Reporter: xuchuanyin
Assignee: xuchuanyin








[jira] [Resolved] (CARBONDATA-3050) Remove unused parameter doc

2018-10-29 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin resolved CARBONDATA-3050.

   Resolution: Fixed
Fix Version/s: 1.5.1

> Remove unused parameter doc
> ---
>
> Key: CARBONDATA-3050
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3050
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: jiangmanhua
>Assignee: jiangmanhua
>Priority: Major
> Fix For: 1.5.1
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>






[jira] [Created] (CARBONDATA-3046) remove outdated configurations in template properties

2018-10-26 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-3046:
--

 Summary: remove outdated configurations in template properties
 Key: CARBONDATA-3046
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3046
 Project: CarbonData
  Issue Type: Improvement
Reporter: xuchuanyin
Assignee: xuchuanyin








[jira] [Resolved] (CARBONDATA-3040) Fix bug for merging bloom index

2018-10-25 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin resolved CARBONDATA-3040.

   Resolution: Fixed
Fix Version/s: 1.5.1

> Fix bug for merging bloom index
> ---
>
> Key: CARBONDATA-3040
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3040
> Project: CarbonData
>  Issue Type: Bug
>Reporter: jiangmanhua
>Assignee: jiangmanhua
>Priority: Major
> Fix For: 1.5.1
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>






[jira] [Created] (CARBONDATA-3035) Optimize parameters for unsafe working and sort memory

2018-10-23 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-3035:
--

 Summary: Optimize parameters for unsafe working and sort memory
 Key: CARBONDATA-3035
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3035
 Project: CarbonData
  Issue Type: Improvement
Reporter: xuchuanyin
Assignee: xuchuanyin








[jira] [Created] (CARBONDATA-3033) Fix errors for parameter description in documents

2018-10-22 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-3033:
--

 Summary: Fix errors for parameter description in documents
 Key: CARBONDATA-3033
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3033
 Project: CarbonData
  Issue Type: Improvement
Reporter: xuchuanyin
Assignee: xuchuanyin








[jira] [Assigned] (CARBONDATA-3031) Find wrong description in the document for 'carbon.number.of.cores.while.loading'

2018-10-22 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin reassigned CARBONDATA-3031:
--

Assignee: lianganping  (was: xuchuanyin)

> Find wrong description in the document for 
> 'carbon.number.of.cores.while.loading'
> -
>
> Key: CARBONDATA-3031
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3031
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: xuchuanyin
>Assignee: lianganping
>Priority: Major
>


[jira] [Created] (CARBONDATA-3031) Find wrong description in the document for 'carbon.number.of.cores.while.loading'

2018-10-22 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-3031:
--

 Summary: Find wrong description in the document for 
'carbon.number.of.cores.while.loading'
 Key: CARBONDATA-3031
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3031
 Project: CarbonData
  Issue Type: Improvement
Reporter: xuchuanyin
Assignee: xuchuanyin


The document says that the default value of
'carbon.number.of.cores.while.loading' is 2. But actually, during data loading,
CarbonData uses the value of 'spark.executor.cores', which means that the
description in the document is incorrect.

But this doesn't mean that the default value of
'carbon.number.of.cores.while.loading' is useless: in compaction and the SDK,
CarbonData still uses this default value.

In short, we need to fix the implementation as well as the document; some
refactoring may be needed.
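A simplified sketch of the behavior described (the real resolution lives in CarbonData's load and compaction code; this only illustrates the precedence):

```scala
// During data loading the executor cores win; compaction and the SDK
// fall back to carbon.number.of.cores.while.loading (documented default 2).
def loadingCores(sparkConf: org.apache.spark.SparkConf): Int =
  sparkConf.getOption("spark.executor.cores") match {
    case Some(cores) => cores.toInt
    case None => CarbonProperties.getInstance()
      .getProperty("carbon.number.of.cores.while.loading", "2").toInt
  }
```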






[jira] [Updated] (CARBONDATA-3024) Use Log4j directly

2018-10-19 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin updated CARBONDATA-3024:
---
Fix Version/s: 1.5.1

> Use Log4j directly
> --
>
> Key: CARBONDATA-3024
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3024
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Jacky Li
>Assignee: Jacky Li
>Priority: Major
> Fix For: 1.5.1
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Currently CarbonData's log prints the line number inside StandardLogService,
> which is not good for maintainability; a better way is to use the log4j Logger
> directly so that it prints the line number of the actual logging call site.





[jira] [Resolved] (CARBONDATA-3002) Fix some spell error and remove the data after test case finished running

2018-10-19 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin resolved CARBONDATA-3002.

   Resolution: Fixed
Fix Version/s: 1.5.1

> Fix some spell error and remove the data after test case finished running
> -
>
> Key: CARBONDATA-3002
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3002
> Project: CarbonData
>  Issue Type: Bug
>Affects Versions: 1.5.0
>Reporter: xubo245
>Assignee: xubo245
>Priority: Major
> Fix For: 1.5.1
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Fix some spelling errors (e.g. "retrive") and remove the data after the test
> cases finish running.





[jira] [Created] (CARBONDATA-3029) Failed to run spark data source test cases in windows env

2018-10-18 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-3029:
--

 Summary: Failed to run spark data source test cases in windows env
 Key: CARBONDATA-3029
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3029
 Project: CarbonData
  Issue Type: Bug
Reporter: xuchuanyin
Assignee: xuchuanyin








[jira] [Created] (CARBONDATA-3028) failed query spark file format table when there are blanks in long_string_columns

2018-10-18 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-3028:
--

 Summary: failed query spark file format table when there are 
blanks in long_string_columns
 Key: CARBONDATA-3028
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3028
 Project: CarbonData
  Issue Type: Bug
Reporter: xuchuanyin
Assignee: xuchuanyin








[jira] [Created] (CARBONDATA-3026) clear expired property that may cause GC problem

2018-10-18 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-3026:
--

 Summary: clear expired property that may cause GC problem
 Key: CARBONDATA-3026
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3026
 Project: CarbonData
  Issue Type: Bug
  Components: data-load
Reporter: xuchuanyin
Assignee: xuchuanyin


During data loading, we write some temp files (sort temp files and temp fact
data files) to certain locations. In the current implementation, we add these
locations to CarbonProperties, associated with a special key that refers to
that data load.

After data loading, the temp locations are cleared, but the added property
still remains in CarbonProperties and is never cleared.

This causes the CarbonProperties object to grow bigger and bigger and leads to
OOM problems if the thrift-server is a long-running service. A local test shows
that after adding different properties 11 billion times, the OOM happens.
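A sketch of the fix direction (the per-load key scheme and the removeProperty-style cleanup are assumptions for illustration, not the actual CarbonData API):

```scala
// Pair every per-load addProperty with a cleanup so the singleton cannot grow.
val tempLocationKey = "carbon.load.temp.location." + loadId // hypothetical key scheme
val props = CarbonProperties.getInstance()
props.addProperty(tempLocationKey, tempDirs.mkString(","))
try {
  // ... perform the data load using the registered temp locations ...
} finally {
  props.removeProperty(tempLocationKey) // assumed cleanup helper
}
```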





[jira] [Resolved] (CARBONDATA-3024) Use Log4j directly

2018-10-17 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin resolved CARBONDATA-3024.

Resolution: Fixed

merged into 1.5.1

> Use Log4j directly
> --
>
> Key: CARBONDATA-3024
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3024
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Jacky Li
>Assignee: Jacky Li
>Priority: Major
>  Time Spent: 2h
>  Remaining Estimate: 0h
>


[jira] [Created] (CARBONDATA-3009) Optimize the entry point of code for MergeIndex

2018-10-16 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-3009:
--

 Summary: Optimize the entry point of code for MergeIndex
 Key: CARBONDATA-3009
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3009
 Project: CarbonData
  Issue Type: Improvement
Reporter: xuchuanyin
Assignee: xuchuanyin








[jira] [Updated] (CARBONDATA-3008) make yarn-local and multiple dir for temp data enable by default

2018-10-15 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin updated CARBONDATA-3008:
---
Priority: Minor  (was: Major)

> make yarn-local and multiple dir for temp data enable by default
> 
>
> Key: CARBONDATA-3008
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3008
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: xuchuanyin
>Priority: Minor
>


[jira] [Created] (CARBONDATA-3008) make yarn-local and multiple dir for temp data enable by default

2018-10-15 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-3008:
--

 Summary: make yarn-local and multiple dir for temp data enable by 
default
 Key: CARBONDATA-3008
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3008
 Project: CarbonData
  Issue Type: Improvement
Reporter: xuchuanyin


About a year ago, we introduced 'multiple dirs for temp data during data
loading' to solve the disk hotspot problem. After about one year's usage in
production environments, this feature has proven effective and correct, so I
propose enabling the related parameters by default. The related parameters are:

`carbon.use.local.dir`: currently `false` by default; we will turn it to `true`
by default;
`carbon.user.multiple.dir`: currently `false` by default; we will turn it to
`true` by default.
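Until the defaults change, the proposal can be tried out by setting the two keys explicitly (property names as given above):

```scala
val props = CarbonProperties.getInstance()
props.addProperty("carbon.use.local.dir", "true")
props.addProperty("carbon.user.multiple.dir", "true")
```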





[jira] [Assigned] (CARBONDATA-3007) Fix error in document

2018-10-15 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin reassigned CARBONDATA-3007:
--

Assignee: xuchuanyin

> Fix error in document
> -
>
> Key: CARBONDATA-3007
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3007
> Project: CarbonData
>  Issue Type: Bug
>Reporter: xuchuanyin
>Assignee: xuchuanyin
>Priority: Trivial
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Created] (CARBONDATA-3007) Fix error in document

2018-10-15 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-3007:
--

 Summary: Fix error in document
 Key: CARBONDATA-3007
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3007
 Project: CarbonData
  Issue Type: Bug
Reporter: xuchuanyin








[jira] [Closed] (CARBONDATA-2988) use unsafe for query model based on system property

2018-10-12 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin closed CARBONDATA-2988.
--
Resolution: Not A Problem

> use unsafe for query model based on system property
> ---
>
> Key: CARBONDATA-2988
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2988
> Project: CarbonData
>  Issue Type: Bug
>Reporter: xuchuanyin
>Assignee: xuchuanyin
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-3004) Fix bug in writing dataframe to carbon table while the field order is different

2018-10-12 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin updated CARBONDATA-3004:
---
Issue Type: Sub-task  (was: Bug)
Parent: CARBONDATA-2420

> Fix bug in writing dataframe to carbon table while the field order is 
> different
> ---
>
> Key: CARBONDATA-3004
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3004
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: xuchuanyin
>Assignee: xuchuanyin
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> More information about this issue can be found in this link: 
> http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Issue-Long-string-columns-config-for-big-strings-not-work-td64876.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-3004) Fix bug in writing dataframe to carbon table while the field order is different

2018-10-12 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-3004:
--

 Summary: Fix bug in writing dataframe to carbon table while the 
field order is different
 Key: CARBONDATA-3004
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3004
 Project: CarbonData
  Issue Type: Bug
Reporter: xuchuanyin
Assignee: xuchuanyin


More information about this issue can be found in this link: 
http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Issue-Long-string-columns-config-for-big-strings-not-work-td64876.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2988) use unsafe for query model based on system property

2018-09-28 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-2988:
--

 Summary: use unsafe for query model based on system property
 Key: CARBONDATA-2988
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2988
 Project: CarbonData
  Issue Type: Bug
Reporter: xuchuanyin
Assignee: xuchuanyin






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-2974) Bloomfilter not working when created bloom on multiple columns and queried

2018-09-28 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin resolved CARBONDATA-2974.

   Resolution: Fixed
Fix Version/s: 1.5.0

> Bloomfilter not working when created bloom on multiple columns and queried
> --
>
> Key: CARBONDATA-2974
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2974
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ravindra Pesala
>Priority: Major
> Fix For: 1.5.0
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> Please check the link for more information
> http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Issue-Bloomfilter-datamap-td63254.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2980) clear bloomindex cache when dropping datamap

2018-09-27 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-2980:
--

 Summary: clear bloomindex cache when dropping datamap
 Key: CARBONDATA-2980
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2980
 Project: CarbonData
  Issue Type: Bug
Reporter: xuchuanyin
Assignee: xuchuanyin


We should clear the bloom index cache when dropping a datamap; otherwise, 
queries will fail if we drop and recreate a brand new table and datamap while 
the stale cache still exists.
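
A minimal repro sketch of that scenario (table and datamap names are 
illustrative):

```
// Sketch of the stale-cache scenario; names are illustrative.
sql("CREATE TABLE t1 (id INT, name STRING) STORED BY 'carbondata'")
sql("CREATE DATAMAP dm ON TABLE t1 USING 'bloomfilter' DMPROPERTIES('index_columns'='name')")
sql("INSERT INTO t1 SELECT 1, 'a'")
sql("DROP TABLE t1")
// Recreate a brand new table and datamap under the same names:
sql("CREATE TABLE t1 (id INT, name STRING) STORED BY 'carbondata'")
sql("CREATE DATAMAP dm ON TABLE t1 USING 'bloomfilter' DMPROPERTIES('index_columns'='name')")
sql("INSERT INTO t1 SELECT 2, 'b'")
// Fails if the stale bloom index cache from the dropped table is still used:
sql("SELECT * FROM t1 WHERE name = 'b'").show()
```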



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-2971) Add shard info of blocklet for debugging

2018-09-27 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin resolved CARBONDATA-2971.

Resolution: Fixed

> Add shard info of blocklet for debugging
> 
>
> Key: CARBONDATA-2971
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2971
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: jiangmanhua
>Assignee: jiangmanhua
>Priority: Major
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-2965) Support scan performance benchmark tool

2018-09-26 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin resolved CARBONDATA-2965.

   Resolution: Fixed
Fix Version/s: 1.5.0

> Support scan performance benchmark tool
> ---
>
> Key: CARBONDATA-2965
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2965
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Jacky Li
>Assignee: Jacky Li
>Priority: Major
> Fix For: 1.5.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2957) update document about zstd support in carbondata

2018-09-21 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-2957:
--

 Summary: update document about zstd support in carbondata
 Key: CARBONDATA-2957
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2957
 Project: CarbonData
  Issue Type: Sub-task
Reporter: xuchuanyin






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (CARBONDATA-2955) bug for legacy store and compaction with zstd compressor and adaptiveDeltaIntegralCodec

2018-09-20 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin reassigned CARBONDATA-2955:
--

   Assignee: xuchuanyin
Description: If the table is configured with the zstd compressor, compaction 
will fail when the adaptiveDeltaIntegralCodec is used.
Summary: bug for legacy store and compaction with zstd compressor and 
adaptiveDeltaIntegralCodec  (was: bug for legacy store andwith zstd compressor)

> bug for legacy store and compaction with zstd compressor and 
> adaptiveDeltaIntegralCodec
> ---
>
> Key: CARBONDATA-2955
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2955
> Project: CarbonData
>  Issue Type: Bug
>Reporter: xuchuanyin
>Assignee: xuchuanyin
>Priority: Major
>
> If the table is configured with the zstd compressor, compaction will fail 
> when the adaptiveDeltaIntegralCodec is used.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2955) bug for legacy store andwith zstd compressor

2018-09-20 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-2955:
--

 Summary: bug for legacy store andwith zstd compressor
 Key: CARBONDATA-2955
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2955
 Project: CarbonData
  Issue Type: Bug
Reporter: xuchuanyin






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2944) optimize compress/decompress related procedure

2018-09-17 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-2944:
--

 Summary: optimize compress/decompress related procedure
 Key: CARBONDATA-2944
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2944
 Project: CarbonData
  Issue Type: Sub-task
Reporter: xuchuanyin
Assignee: xuchuanyin


While implementing a customized compressor, I found that the carbon compressor 
deals with primitive objects while compressing/decompressing, which I think 
causes inefficiency because:
1. many compressors do not provide compress/decompress interfaces for primitive 
objects, so we need to handle them ourselves, which may cause unnecessary 
conversions from primitives to bytes and from bytes back to primitives;
2. for querying, we need to decompress the content. I think it's better to keep 
the data in bytes and not convert it to primitives until it is needed.
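
A small illustration of the conversion overhead being described; the helper 
names below are hypothetical, not CarbonData APIs:

```
// Illustration of the primitive<->bytes round trip a byte[]-only
// compressor forces on both the compress and decompress paths.
import java.nio.ByteBuffer

def intsToBytes(values: Array[Int]): Array[Byte] = {
  val buf = ByteBuffer.allocate(values.length * java.lang.Integer.BYTES)
  values.foreach(buf.putInt)
  buf.array()
}

def bytesToInts(bytes: Array[Byte]): Array[Int] = {
  val buf = ByteBuffer.wrap(bytes)
  Array.fill(bytes.length / java.lang.Integer.BYTES)(buf.getInt)
}
// If the compressor only accepts byte[], every compress of an int[] page pays
// intsToBytes, and every decompress pays bytesToInts even for values that are
// never projected; keeping bytes and converting lazily avoids the second cost.
```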



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2933) Fix errors in spelling

2018-09-13 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-2933:
--

 Summary: Fix errors in spelling
 Key: CARBONDATA-2933
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2933
 Project: CarbonData
  Issue Type: Bug
Reporter: xuchuanyin
Assignee: xuchuanyin






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2930) Support customize column compressor

2018-09-12 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-2930:
--

 Summary: Support customize column compressor
 Key: CARBONDATA-2930
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2930
 Project: CarbonData
  Issue Type: Sub-task
Reporter: xuchuanyin
Assignee: xuchuanyin


Support a customized compressor to compress the final store.

Users can create their own compressor and specify it when creating a table or 
loading data.
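
A sketch of what specifying it at table creation could look like; the property 
key 'carbon.column.compressor' is the compressor-selection property and the 
class name is a made-up placeholder:

```
// Sketch only: the compressor class is a placeholder, and the property
// key is an assumption; check the docs for your CarbonData version.
sql(
  """CREATE TABLE t_custom (id INT, name STRING)
    |STORED BY 'carbondata'
    |TBLPROPERTIES('carbon.column.compressor'='org.example.MyCompressor')
  """.stripMargin)
```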



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2850) Support configurable column compressor for final store

2018-09-12 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin updated CARBONDATA-2850:
---
Summary: Support configurable column compressor for final store  (was: 
Support zstd as column compressor in final store)

> Support configurable column compressor for final store
> --
>
> Key: CARBONDATA-2850
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2850
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: xuchuanyin
>Assignee: xuchuanyin
>Priority: Major
> Attachments: Tests on Zstd as column compressor.pdf
>
>
> ZSTD has a better compression ratio than snappy, and its compress/decompress 
> rate is acceptable compared with snappy.
> After we introduce zstd as the column compressor, the size of the carbondata 
> final store will be reduced.
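
For reference, a sketch of switching the compressor for new loads (property 
name per the CarbonData configuration docs; treat it as an assumption for 
other versions):

```
// Sketch: selecting zstd as the column compressor for new loads.
import org.apache.carbondata.core.util.CarbonProperties

CarbonProperties.getInstance().addProperty("carbon.column.compressor", "zstd")
```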



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-2852) support zstd on legacy store

2018-09-12 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin resolved CARBONDATA-2852.

   Resolution: Fixed
Fix Version/s: 1.5.0

> support zstd on legacy store
> 
>
> Key: CARBONDATA-2852
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2852
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: xuchuanyin
>Assignee: xuchuanyin
>Priority: Major
> Fix For: 1.5.0
>
>
> Currently carbondata reads the column compressor from the system property. 
> This will cause problems on legacy stores if we have changed the compressor.
> It should read that information from the metadata in the data files.
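
The fix direction, as a hedged sketch (the helper below is hypothetical, not 
the real reader code):

```
// Hypothetical sketch: when reading, trust the compressor name recorded in
// the file's metadata; fall back to the configured system property only as
// the writer-side choice for new files.
def compressorForRead(fromFileMeta: Option[String], configured: String): String =
  fromFileMeta.getOrElse(configured)
```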



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2850) Support zstd as column compressor in final store

2018-09-10 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin updated CARBONDATA-2850:
---
Attachment: Tests on Zstd as column compressor.pdf

> Support zstd as column compressor in final store
> 
>
> Key: CARBONDATA-2850
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2850
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: xuchuanyin
>Assignee: xuchuanyin
>Priority: Major
> Attachments: Tests on Zstd as column compressor.pdf
>
>
> ZSTD has a better compression ratio than snappy, and its compress/decompress 
> rate is acceptable compared with snappy.
> After we introduce zstd as the column compressor, the size of the carbondata 
> final store will be reduced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2904) Support minmax datamap for external format

2018-08-31 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-2904:
--

 Summary: Support minmax datamap for external format
 Key: CARBONDATA-2904
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2904
 Project: CarbonData
  Issue Type: Sub-task
Reporter: xuchuanyin
Assignee: xuchuanyin






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2890) Use CarbonLoadModelBuilder instead of new CarbonLoadModel instance

2018-08-27 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-2890:
--

 Summary: Use CarbonLoadModelBuilder instead of new CarbonLoadModel 
instance
 Key: CARBONDATA-2890
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2890
 Project: CarbonData
  Issue Type: Sub-task
Reporter: xuchuanyin


Currently, to get an instance of CarbonLoadModel, we can:
1. directly new an instance and set the members one by one;
2. or use the CarbonLoadModelBuilder to build an instance.

However, some of the members of CarbonLoadModel (such as ColumnCompressor, 
tableName) are required in the subsequent procedure.

With the 1st method, we may forget to initialize these members. With the 2nd 
method, we can validate these members in the build method to ensure that they 
are initialized.

So here I propose to only use the CarbonLoadModelBuilder to instantiate a 
CarbonLoadModel, as sketched below.
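
A generic sketch of the validation argument; the field names mirror the issue 
but this is not the real CarbonLoadModel API:

```
// Not the real API: a minimal builder that refuses to produce a
// half-initialized model, unlike a bare constructor plus setters.
case class LoadModel(tableName: String, columnCompressor: String)

class LoadModelBuilder {
  private var tableName: Option[String] = None
  private var columnCompressor: Option[String] = None

  def withTableName(name: String): this.type = { tableName = Some(name); this }
  def withColumnCompressor(c: String): this.type = { columnCompressor = Some(c); this }

  // build() validates every required member before handing out a model
  def build(): LoadModel = LoadModel(
    tableName.getOrElse(throw new IllegalStateException("tableName not set")),
    columnCompressor.getOrElse(throw new IllegalStateException("columnCompressor not set")))
}
```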



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (CARBONDATA-2420) Support string longer than 32000 characters

2018-08-23 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin reopened CARBONDATA-2420:

  Assignee: (was: xuchuanyin)

> Support string longer than 32000 characters
> ---
>
> Key: CARBONDATA-2420
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2420
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: xuchuanyin
>Priority: Major
> Fix For: 1.4.1
>
>  Time Spent: 19h 40m
>  Remaining Estimate: 0h
>
> Add a property 'long_string_columns' in table creation to support string 
> columns that will contain more than 32000 characters.
> Inside carbondata, it uses an integer instead of a short to store the length 
> of the bytes content.
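
Usage, with the same syntax as the reports later in this thread:

```
// Declaring a long string column at table creation.
sql(
  """CREATE TABLE t_long (c1 STRING, c2 STRING)
    |STORED BY 'carbondata'
    |TBLPROPERTIES('long_string_columns'='c2')
  """.stripMargin)
```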



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2881) Tests in TestStreamingOperation should be independent but actually have interrelationship

2018-08-23 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-2881:
--

 Summary: Tests in TestStreamingOperation should be independent but 
actually have interrelationship
 Key: CARBONDATA-2881
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2881
 Project: CarbonData
  Issue Type: Sub-task
Reporter: xuchuanyin


All the test cases in TestStreamingOperation use exactly the same table and do 
data loading. Once one test fails, the following test cases will fail too.
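
The usual remedy, sketched under the assumption of a ScalaTest suite (table 
name and schema are illustrative):

```
// Give every test its own fresh table so one failure cannot cascade.
override def beforeEach(): Unit = {
  sql("DROP TABLE IF EXISTS stream_t")
  sql("CREATE TABLE stream_t (id INT, name STRING) STORED BY 'carbondata' " +
    "TBLPROPERTIES('streaming'='true')")
}
```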



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2878) Umbrella issue for minor modifications

2018-08-23 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-2878:
--

 Summary: Umbrella issue for minor modifications
 Key: CARBONDATA-2878
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2878
 Project: CarbonData
  Issue Type: Bug
Reporter: xuchuanyin


This umbrella issue covers minor issues: small defects and optimizations for 
docs, tests and code.

The sub-issues are simple and labeled as 'newbie'.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2873) Support query through index datamap for external CSV format

2018-08-22 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-2873:
--

 Summary: Support query through index datamap for external CSV 
format
 Key: CARBONDATA-2873
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2873
 Project: CarbonData
  Issue Type: Sub-task
Reporter: xuchuanyin
Assignee: xuchuanyin






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2562) Support create and build index datamaps on external CSV format

2018-08-22 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin updated CARBONDATA-2562:
---
Summary: Support create and build index datamaps on external CSV format  
(was: Support index datamaps on external CSV format)

> Support create and build index datamaps on external CSV format
> --
>
> Key: CARBONDATA-2562
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2562
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: xuchuanyin
>Assignee: xuchuanyin
>Priority: Major
>
> Support creating indexed datamap on external CSV datasource.
> Support rebuilding the indexed datamap for the external CSV datasource.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2562) Support index datamaps on external CSV format

2018-08-22 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin updated CARBONDATA-2562:
---
Description: 
Support creating indexed datamap on external CSV datasource.

Support rebuilding the indexed datamap for the external CSV datasource.

 

  was:
Support creating indexed datamap on external CSV datasource.

Support rebuilding the indexed datamap for the external CSV datasource.

Query on external datasource make use of datamap if it is available.


> Support index datamaps on external CSV format
> -
>
> Key: CARBONDATA-2562
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2562
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: xuchuanyin
>Assignee: xuchuanyin
>Priority: Major
>
> Support creating indexed datamap on external CSV datasource.
> Support rebuilding the indexed datamap for the external CSV datasource.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2562) Support index datamaps on external CSV format

2018-08-16 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin updated CARBONDATA-2562:
---
Summary: Support index datamaps on external CSV format  (was: Support 
datamaps on external CSV format)

> Support index datamaps on external CSV format
> -
>
> Key: CARBONDATA-2562
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2562
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: xuchuanyin
>Assignee: xuchuanyin
>Priority: Major
>
> Support creating indexed datamap on external CSV datasource.
> Support rebuilding the indexed datamap for the external CSV datasource.
> Query on external datasource makes use of the datamap if it is available.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2768) Fix error test for external format

2018-08-16 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin updated CARBONDATA-2768:
---
Issue Type: Sub-task  (was: Bug)
Parent: CARBONDATA-2561

> Fix error test for external format
> --
>
> Key: CARBONDATA-2768
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2768
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: xuchuanyin
>Assignee: xuchuanyin
>Priority: Major
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2859) add sdv test case for bloomfilter datamap

2018-08-15 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin updated CARBONDATA-2859:
---
Issue Type: Sub-task  (was: Bug)
Parent: CARBONDATA-2632

> add sdv test case for bloomfilter datamap
> -
>
> Key: CARBONDATA-2859
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2859
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: xuchuanyin
>Assignee: xuchuanyin
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> add sdv test case for bloomfilter datamap



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2859) add sdv test case for bloomfilter datamap

2018-08-15 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-2859:
--

 Summary: add sdv test case for bloomfilter datamap
 Key: CARBONDATA-2859
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2859
 Project: CarbonData
  Issue Type: Bug
Reporter: xuchuanyin
Assignee: xuchuanyin


add sdv test case for bloomfilter datamap



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2856) Fix bug in bloom index on multiple dictionary columns

2018-08-14 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-2856:
--

 Summary: Fix bug in bloom index on multiple dictionary columns
 Key: CARBONDATA-2856
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2856
 Project: CarbonData
  Issue Type: Sub-task
Affects Versions: 1.4.1
Reporter: xuchuanyin
Assignee: xuchuanyin
 Fix For: 1.4.2


Create a bloom index on a table which has date and string columns, where the 
string column uses global dictionary.

The data loading procedure will fail.
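
A repro sketch per that description (names illustrative):

```
// A global-dictionary string column plus a date column, both bloom-indexed.
sql(
  """CREATE TABLE t_bloom (d DATE, s STRING)
    |STORED BY 'carbondata'
    |TBLPROPERTIES('dictionary_include'='s')
  """.stripMargin)
sql("CREATE DATAMAP dm_bloom ON TABLE t_bloom USING 'bloomfilter' " +
  "DMPROPERTIES('index_columns'='d,s')")
// Before the fix, loading data into t_bloom fails at this point.
sql("INSERT INTO t_bloom SELECT current_date(), 'a'")
```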



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2852) support zstd on legacy store

2018-08-13 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-2852:
--

 Summary: support zstd on legacy store
 Key: CARBONDATA-2852
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2852
 Project: CarbonData
  Issue Type: Sub-task
Reporter: xuchuanyin
Assignee: xuchuanyin


Currently carbondata reads the column compressor from the system property. This 
will cause problems on legacy stores if we have changed the compressor.
It should read that information from the metadata in the data files.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2850) Support zstd as column compressor in final store

2018-08-13 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin updated CARBONDATA-2850:
---
Description: 
ZSTD has a better compression ratio than snappy, and its compress/decompress 
rate is acceptable compared with snappy.

After we introduce zstd as the column compressor, the size of the carbondata 
final store will be reduced.

> Support zstd as column compressor in final store
> 
>
> Key: CARBONDATA-2850
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2850
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: xuchuanyin
>Assignee: xuchuanyin
>Priority: Major
>
> ZSTD has a better compression ratio than snappy, and its compress/decompress 
> rate is acceptable compared with snappy.
> After we introduce zstd as the column compressor, the size of the carbondata 
> final store will be reduced.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2851) support zstd as column compressor

2018-08-13 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-2851:
--

 Summary: support zstd as column compressor
 Key: CARBONDATA-2851
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2851
 Project: CarbonData
  Issue Type: Sub-task
Reporter: xuchuanyin
Assignee: xuchuanyin






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2850) Support zstd as column compressor in final store

2018-08-13 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-2850:
--

 Summary: Support zstd as column compressor in final store
 Key: CARBONDATA-2850
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2850
 Project: CarbonData
  Issue Type: Improvement
Reporter: xuchuanyin
Assignee: xuchuanyin






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2835) Block MV datamap on streaming table

2018-08-07 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin updated CARBONDATA-2835:
---
Issue Type: Sub-task  (was: Bug)
Parent: CARBONDATA-2628

> Block MV datamap on streaming table
> ---
>
> Key: CARBONDATA-2835
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2835
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: xuchuanyin
>Assignee: wangsen
>Priority: Major
>
> We should block creating an MV datamap on a streaming table;
> also we should block setting the streaming property for a table which has an 
> MV datamap.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2835) Block MV datamap on streaming table

2018-08-07 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-2835:
--

 Summary: Block MV datamap on streaming table
 Key: CARBONDATA-2835
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2835
 Project: CarbonData
  Issue Type: Bug
Reporter: xuchuanyin
Assignee: wangsen


We should block creating an MV datamap on a streaming table.

Also, we should block setting the streaming property for a table which has an 
MV datamap.
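
The two operations to reject, sketched with illustrative names (the MV datamap 
syntax below is an assumption based on the datamap DDL of this era):

```
// Both statements marked below should be rejected with a clear error.
sql("CREATE TABLE s_t (id INT, name STRING) STORED BY 'carbondata' " +
  "TBLPROPERTIES('streaming'='true')")
sql("CREATE DATAMAP mv1 ON TABLE s_t USING 'mv' " +
  "AS SELECT name, COUNT(*) FROM s_t GROUP BY name")        // MV on streaming table

sql("CREATE TABLE plain_t (id INT, name STRING) STORED BY 'carbondata'")
sql("CREATE DATAMAP mv2 ON TABLE plain_t USING 'mv' " +
  "AS SELECT name, COUNT(*) FROM plain_t GROUP BY name")
sql("ALTER TABLE plain_t SET TBLPROPERTIES('streaming'='true')")  // streaming on MV table
```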



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (CARBONDATA-2809) Manually rebuilding non-lazy datamap cause error

2018-08-06 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin closed CARBONDATA-2809.
--
Resolution: Duplicate

duplicated with CARBONDATA-2821

> Manually rebuilding non-lazy datamap cause error
> 
>
> Key: CARBONDATA-2809
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2809
> Project: CarbonData
>  Issue Type: Bug
>Reporter: xuchuanyin
>Assignee: xuchuanyin
>Priority: Minor
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Steps to reproduce:
> 1. create base table
> 2. load data to base table
> 3. create index datamap (such as bloomfilter datamap) on base table
> 4. rebuild datamap: this will give an error
> In step 3, the datamap data has already been generated; if we trigger a 
> rebuild, the procedure does not clean the files properly, thus causing the 
> error.
> Actually, the rebuild is not required. We can fix this issue by skipping the 
> rebuild procedure.
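
In SQL terms, a sketch of those steps (the REBUILD DATAMAP syntax is an 
assumption based on the datamap docs of this period; names are illustrative):

```
sql("CREATE TABLE base (id INT, name STRING) STORED BY 'carbondata'")
sql("INSERT INTO base SELECT 1, 'a'")   // step 2: load data
// step 3: creating the non-lazy datamap already builds its data
sql("CREATE DATAMAP bf ON TABLE base USING 'bloomfilter' " +
  "DMPROPERTIES('index_columns'='name')")
// step 4: an explicit rebuild then fails before the fix
sql("REBUILD DATAMAP bf ON TABLE base")
```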



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (CARBONDATA-2820) Block rebuilding for preagg, bloom and lucene datamap

2018-08-06 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin reopened CARBONDATA-2820:


> Block rebuilding for preagg, bloom and lucene datamap
> -
>
> Key: CARBONDATA-2820
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2820
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: xuchuanyin
>Assignee: xuchuanyin
>Priority: Major
>
> Currently we will block rebuilding these datamaps.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (CARBONDATA-2820) Block rebuilding for preagg, bloom and lucene datamap

2018-08-06 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin closed CARBONDATA-2820.
--
Resolution: Duplicate

> Block rebuilding for preagg, bloom and lucene datamap
> -
>
> Key: CARBONDATA-2820
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2820
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: xuchuanyin
>Assignee: xuchuanyin
>Priority: Major
>
> Currently we will block rebuilding these datamaps.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (CARBONDATA-2820) Block rebuilding for preagg, bloom and lucene datamap

2018-08-06 Thread xuchuanyin (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-2820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571049#comment-16571049
 ] 

xuchuanyin edited comment on CARBONDATA-2820 at 8/7/18 2:40 AM:


duplicated with CARBONDATA-2821


was (Author: xuchuanyin):
duplicated with CARBONDATA-2823

> Block rebuilding for preagg, bloom and lucene datamap
> -
>
> Key: CARBONDATA-2820
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2820
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: xuchuanyin
>Assignee: xuchuanyin
>Priority: Major
>
> Currently we will block rebuilding these datamaps.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (CARBONDATA-2820) Block rebuilding for preagg, bloom and lucene datamap

2018-08-06 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin closed CARBONDATA-2820.
--
Resolution: Duplicate

duplicated with CARBONDATA-2823

> Block rebuilding for preagg, bloom and lucene datamap
> -
>
> Key: CARBONDATA-2820
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2820
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: xuchuanyin
>Assignee: xuchuanyin
>Priority: Major
>
> Currently we will block rebuilding these datamaps.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CARBONDATA-2833) NPE when we do a insert over a insert failure operation

2018-08-06 Thread xuchuanyin (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-2833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16571028#comment-16571028
 ] 

xuchuanyin commented on CARBONDATA-2833:


The steps in the issue description cannot reproduce the problem. I've also 
tried with other steps, but still cannot reproduce it:

```
test("test") {
  CarbonProperties.getInstance()
    .addProperty("bad_records_logger_enable", "true")
  CarbonProperties.getInstance()
    .addProperty(CarbonCommonConstants.CARBON_BAD_RECORDS_ACTION, "FAIL")
  sql("CREATE DATABASE test1")
  sql("use test1")
  sql("DROP TABLE IF EXISTS ab")
  sql("CREATE TABLE ab (a integer, b string) stored by 'carbondata'")
  sql("CREATE DATAMAP dm ON TABLE ab using 'bloomfilter' " +
    "DMPROPERTIES('index_columns'='a,b')")
  try {
    // first attempt: expected to fail with a bad-record error
    sql("insert into ab select 'berb', 'abc', 'ggg', '1'")
  } catch {
    case e: Exception => LOGGER.error(e)
  }
  LOGGER.error("XU second run")
  try {
    // second insert over the failed one: the issue claims this throws an NPE
    sql("insert into ab select 'berb', 'abc', 'ggg', '1'")
  } catch {
    case e: Exception => LOGGER.error(e)
  }
  sql("select * from ab").show(false)
  sql("DROP TABLE IF EXISTS ab")
  sql("DROP DATABASE IF EXISTS test1")
  sql("use default")
  CarbonProperties.getInstance().addProperty("bad_records_logger_enable",
    CarbonLoadOptionConstants.CARBON_OPTIONS_BAD_RECORDS_LOGGER_ENABLE_DEFAULT)
  CarbonProperties.getInstance()
    .addProperty(CarbonCommonConstants.CARBON_BAD_RECORDS_ACTION, "FAIL")
}
```

The load statement complains about the bad-record error; no NPE is reported.

> NPE when we do a insert over a insert failure operation
> ---
>
> Key: CARBONDATA-2833
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2833
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Brijoo Bopanna
>Priority: Major
>
> jdbc:hive2://10.18.5.188:23040/default> CREATE TABLE
> 0: jdbc:hive2://10.18.5.188:23040/default> IF NOT EXISTS test_table(
> 0: jdbc:hive2://10.18.5.188:23040/default> id string,
> 0: jdbc:hive2://10.18.5.188:23040/default> name string,
> 0: jdbc:hive2://10.18.5.188:23040/default> city string,
> 0: jdbc:hive2://10.18.5.188:23040/default> age Int)
> 0: jdbc:hive2://10.18.5.188:23040/default> STORED BY 'carbondata';
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (0.191 seconds)
> 0: jdbc:hive2://10.18.5.188:23040/default>
> 0: jdbc:hive2://10.18.5.188:23040/default>
> 0: jdbc:hive2://10.18.5.188:23040/default>
> 0: jdbc:hive2://10.18.5.188:23040/default> desc test_table
> 0: jdbc:hive2://10.18.5.188:23040/default> ;
> +---++--+--+
> | col_name  | data_type  | comment  |
> +---++--+--+
> | id    | string | NULL |
> | name  | string | NULL |
> | city  | string | NULL |
> | age   | int    | NULL |
> +---++--+--+
> 4 rows selected (0.081 seconds)
> 0: jdbc:hive2://10.18.5.188:23040/default> insert into ab select 
> 'berb','abc','ggg','1';
> Error: java.lang.Exception: Data load failed due to bad record: The value 
> with column name a and column data type INT is not a valid INT type.Please 
> enable bad record logger to know the detail reason. (state=,code=0)
> 0: jdbc:hive2://10.18.5.188:23040/default> insert into ab select 
> 'berb','abc','ggg','1';
> *Error: java.lang.NullPointerException (state=,code=0)*
> 0: jdbc:hive2://10.18.5.188:23040/default> insert into test_table select 
> 'berb','abc','ggg',1;
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (1.127 seconds)
> 0: jdbc:hive2://10.18.5.188:23040/default> show tables
> 0: jdbc:hive2://10.18.5.188:23040/default> ;
> +---+-+--+--+
> | database  |  tableName  | isTemporary  |
> +---+-+--+--+
> | praveen   | a   | false    |
> | praveen   | ab      | false    |
> | praveen   | bbc | false    |
> | praveen   | test_table  | false    |
> +---+-+--+--+
> 4 rows selected (0.041 seconds)
> 0: jdbc:hive2://10.18.5.188:23040/default>
> 0: jdbc:hive2://10.18.5.188:23040/default> desc ab
> 0: jdbc:hive2://10.18.5.188:23040/default> ;
> +---++--+--+
> | col_name  | data_type  | comment  |
> +---++--+--+
> | a | int    | NULL |
> | b | string | NULL |
> +---++--+--+
> 2 rows selected (0.074 seconds)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-2763) Create table with partition and no_inverted_index on long_string column is not blocked

2018-08-06 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin resolved CARBONDATA-2763.

   Resolution: Fixed
Fix Version/s: 1.4.1

> Create table with partition and no_inverted_index on long_string column is 
> not blocked
> --
>
> Key: CARBONDATA-2763
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2763
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.4.1
> Environment: Spark 2.1, 2.2
>Reporter: Chetan Bhat
>Priority: Minor
> Fix For: 1.4.1
>
>
> Steps :
>  # Create table with partition using long_string column 
>  CREATE TABLE local_no_inverted_index(id int, name string, description 
> string,address string, note string) STORED BY 'org.apache.carbondata.format' 
> tblproperties('no_inverted_index'='note','long_string_columns'='note');
>  2. Create table with no_inverted_index 
>   CREATE TABLE local1_partition(id int,name string, description 
> string,address string)  partitioned by (note string) STORED BY 
> 'org.apache.carbondata.format' tblproperties('long_string_columns'='note');
>  
> Actual Output : The Create table with partition and no_inverted_index on 
> long_string column is successful.
> 0: jdbc:hive2://10.18.98.101:22550/default> CREATE TABLE 
> local_no_inverted_index(id int, name string, description string,address 
> string, note string) STORED BY 'org.apache.carbondata.format' 
> tblproperties('no_inverted_index'='note','long_string_columns'='note');
> +-+--+
> | Result |
> +-+--+
> +-+--+
> No rows selected (2.604 seconds)
> 0: jdbc:hive2://10.18.98.101:22550/default> CREATE TABLE local1_partition(id 
> int,name string, description string,address string) partitioned by (note 
> string) STORED BY 'org.apache.carbondata.format' 
> tblproperties('long_string_columns'='note');
> +-+--+
> | Result |
> +-+--+
> +-+--+
> No rows selected (1.989 seconds)
> Expected Output - The Create table with partition and no_inverted_index on 
> long_string column should be blocked.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-2762) Long string column displayed as string in describe formatted

2018-08-06 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin resolved CARBONDATA-2762.

   Resolution: Fixed
Fix Version/s: 1.4.1

> Long string column displayed as string in describe formatted
> 
>
> Key: CARBONDATA-2762
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2762
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.4.1
>Reporter: Chetan Bhat
>Priority: Minor
> Fix For: 1.4.1
>
>
> Steps :
> User creates a table with long string column and executes the describe 
> formatted table command.
> 0: jdbc:hive2://10.18.98.101:22550/default> create table t2(c1 string, c2 
> string) stored by 'carbondata' tblproperties('long_string_columns' = 'c2');
> +-+--+
> | Result |
> +-+--+
> +-+--+
> No rows selected (3.034 seconds)
> 0: jdbc:hive2://10.18.98.101:22550/default> desc formatted t2;
> Actual Output : The describe formatted displays the c2 column as string 
> instead of long string.
> 0: jdbc:hive2://10.18.98.101:22550/default> desc formatted t2;
> +---+---+---+--+
> | col_name | data_type | comment |
> +---+---+---+--+
> | c1 | string | KEY COLUMN,null |
> *| c2 | string | KEY COLUMN,null |*
> | | | |
> | ##Detailed Table Information | | |
> | Database Name | default | |
> | Table Name | t2 | |
> | CARBON Store Path | 
> hdfs://hacluster/user/hive/warehouse/carbon.store/default/t2 | |
> | Comment | | |
> | Table Block Size | 1024 MB | |
> | Table Data Size | 0 | |
> | Table Index Size | 0 | |
> | Last Update Time | 0 | |
> | SORT_SCOPE | LOCAL_SORT | LOCAL_SORT |
> | CACHE_LEVEL | BLOCK | |
> | Streaming | false | |
> | Local Dictionary Enabled | true | |
> | Local Dictionary Threshold | 1 | |
> | Local Dictionary Include | c1,c2 | |
> | | | |
> | ##Detailed Column property | | |
> | ADAPTIVE | | |
> | SORT_COLUMNS | c1 | |
> +---+---+---+--+
> 22 rows selected (2.847 seconds)
>  
> Expected Output : The describe formatted should display the c2 column as long 
> string.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-2796) Fix data loading problem when table has complex column and long string column

2018-08-06 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin resolved CARBONDATA-2796.

   Resolution: Fixed
Fix Version/s: 1.4.1

> Fix data loading problem when table has  complex column and long string column
> --
>
> Key: CARBONDATA-2796
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2796
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: jiangmanhua
>Assignee: jiangmanhua
>Priority: Major
> Fix For: 1.4.1
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Currently both the varchar column and the complex column believe they are the 
> last member in the noDictionary group when converting a carbon row from raw 
> format to the 3-parted format. Since they need to be processed in different 
> ways, an exception will occur if we deal with the column in the wrong way.
> To fix this, we mark the info of complex columns explicitly, like varchar 
> columns, and keep the order of the noDictionary group as: normal dim & 
> varchar & complex
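
A toy sketch of that ordering invariant (the kind names are illustrative, not 
the real carbon classes):

```
// Order the noDictionary group as: normal dims, then varchar, then complex,
// so only the true last member handles the tail-of-group logic.
sealed trait NoDictKind
case object NormalDim  extends NoDictKind
case object VarcharCol extends NoDictKind
case object ComplexCol extends NoDictKind

def rank(k: NoDictKind): Int = k match {
  case NormalDim  => 0
  case VarcharCol => 1
  case ComplexCol => 2
}

val ordered = Seq[NoDictKind](ComplexCol, NormalDim, VarcharCol).sortBy(rank)
// ordered == Seq(NormalDim, VarcharCol, ComplexCol)
```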



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CARBONDATA-2823) Alter table set local dictionary include after bloom creation and merge index on old V3 store fails throwing incorrect error

2018-08-06 Thread xuchuanyin (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569825#comment-16569825
 ] 

xuchuanyin commented on CARBONDATA-2823:


Since we get the splits from the streaming segment and the columnar segments 
respectively, we can support streaming with index datamaps.

> Alter table set local dictionary include after bloom creation and merge index 
> on old V3 store fails throwing incorrect error
> 
>
> Key: CARBONDATA-2823
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2823
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.4.1
> Environment: Spark 2.1
>Reporter: Chetan Bhat
>Assignee: xuchuanyin
>Priority: Minor
>
> Steps :
> In old version V3 store create table and load data.
> CREATE TABLE uniqdata_load (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
> string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,36),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) STORED BY 'org.apache.carbondata.format';
> LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table 
> uniqdata_load OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> In 1.4.1 version refresh the table of old V3 store.
> refresh table uniqdata_load;
> Create bloom filter and merge index.
> CREATE DATAMAP dm_uniqdata1_tmstmp ON TABLE uniqdata_load USING 'bloomfilter' 
> DMPROPERTIES ('INDEX_COLUMNS' = 'DOJ', 'BLOOM_SIZE'='64', 
> 'BLOOM_FPP'='0.1');
> Alter table set local dictionary include.
>  alter table uniqdata_load set 
> tblproperties('local_dictionary_include'='CUST_NAME');
>  
> Issue : Alter table set local dictionary include fails with incorrect error.
> 0: jdbc:hive2://10.18.98.101:22550/default> alter table uniqdata_load set 
> tblproperties('local_dictionary_include'='CUST_NAME');
> *Error: 
> org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: 
> streaming is not supported for index datamap (state=,code=0)*
>  
> Expected : Operation should be success. If the operation is unsupported it 
> should throw correct error message.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (CARBONDATA-2823) Alter table set local dictionary include after bloom creation and merge index on old V3 store fails throwing incorrect error

2018-08-06 Thread xuchuanyin (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569823#comment-16569823
 ] 

xuchuanyin edited comment on CARBONDATA-2823 at 8/6/18 7:22 AM:


As for CARBONDATA-2823, it can simply be reproduced by
1. create table
2. create bloom/lucene datamap
3. load data
4. alter table set tblProperties


was (Author: xuchuanyin):
As for CARBONDATA-2823, it can simply reproduced by
1. create table
2. create bloom/lucene datamap
3. load data
4. alter table set tblProperties

> Alter table set local dictionary include after bloom creation and merge index 
> on old V3 store fails throwing incorrect error
> 
>
> Key: CARBONDATA-2823
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2823
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.4.1
> Environment: Spark 2.1
>Reporter: Chetan Bhat
>Priority: Minor
>
> Steps :
> In old version V3 store create table and load data.
> CREATE TABLE uniqdata_load (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
> string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,36),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) STORED BY 'org.apache.carbondata.format';
> LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table 
> uniqdata_load OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> In 1.4.1 version refresh the table of old V3 store.
> refresh table uniqdata_load;
> Create bloom filter and merge index.
> CREATE DATAMAP dm_uniqdata1_tmstmp ON TABLE uniqdata_load USING 'bloomfilter' 
> DMPROPERTIES ('INDEX_COLUMNS' = 'DOJ', 'BLOOM_SIZE'='64', 
> 'BLOOM_FPP'='0.1');
> Alter table set local dictionary include.
>  alter table uniqdata_load set 
> tblproperties('local_dictionary_include'='CUST_NAME');
>  
> Issue : Alter table set local dictionary include fails with incorrect error.
> 0: jdbc:hive2://10.18.98.101:22550/default> alter table uniqdata_load set 
> tblproperties('local_dictionary_include'='CUST_NAME');
> *Error: 
> org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: 
> streaming is not supported for index datamap (state=,code=0)*
>  
> Expected : Operation should be success. If the operation is unsupported it 
> should throw correct error message.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (CARBONDATA-2823) Alter table set local dictionary include after bloom creation and merge index on old V3 store fails throwing incorrect error

2018-08-06 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin reassigned CARBONDATA-2823:
--

Assignee: xuchuanyin

> Alter table set local dictionary include after bloom creation and merge index 
> on old V3 store fails throwing incorrect error
> 
>
> Key: CARBONDATA-2823
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2823
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.4.1
> Environment: Spark 2.1
>Reporter: Chetan Bhat
>Assignee: xuchuanyin
>Priority: Minor
>
> Steps :
> In old version V3 store create table and load data.
> CREATE TABLE uniqdata_load (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
> string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,36),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) STORED BY 'org.apache.carbondata.format';
> LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table 
> uniqdata_load OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> In 1.4.1 version refresh the table of old V3 store.
> refresh table uniqdata_load;
> Create bloom filter and merge index.
> CREATE DATAMAP dm_uniqdata1_tmstmp ON TABLE uniqdata_load USING 'bloomfilter' 
> DMPROPERTIES ('INDEX_COLUMNS' = 'DOJ', 'BLOOM_SIZE'='64', 
> 'BLOOM_FPP'='0.1');
> Alter table set local dictionary include.
>  alter table uniqdata_load set 
> tblproperties('local_dictionary_include'='CUST_NAME');
>  
> Issue : Alter table set local dictionary include fails with incorrect error.
> 0: jdbc:hive2://10.18.98.101:22550/default> alter table uniqdata_load set 
> tblproperties('local_dictionary_include'='CUST_NAME');
> *Error: 
> org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: 
> streaming is not supported for index datamap (state=,code=0)*
>  
> Expected : Operation should be success. If the operation is unsupported it 
> should throw correct error message.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CARBONDATA-2823) Alter table set local dictionary include after bloom creation and merge index on old V3 store fails throwing incorrect error

2018-08-06 Thread xuchuanyin (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-2823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569823#comment-16569823
 ] 

xuchuanyin commented on CARBONDATA-2823:


As for CARBONDATA-2823, it can simply reproduced by
1. create table
2. create bloom/lucene datamap
3. load data
4. alter table set tblProperties

> Alter table set local dictionary include after bloom creation and merge index 
> on old V3 store fails throwing incorrect error
> 
>
> Key: CARBONDATA-2823
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2823
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.4.1
> Environment: Spark 2.1
>Reporter: Chetan Bhat
>Priority: Minor
>
> Steps :
> In old version V3 store create table and load data.
> CREATE TABLE uniqdata_load (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
> string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,36),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) STORED BY 'org.apache.carbondata.format';
> LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table 
> uniqdata_load OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> In 1.4.1 version refresh the table of old V3 store.
> refresh table uniqdata_load;
> Create bloom filter and merge index.
> CREATE DATAMAP dm_uniqdata1_tmstmp ON TABLE uniqdata_load USING 'bloomfilter' 
> DMPROPERTIES ('INDEX_COLUMNS' = 'DOJ', 'BLOOM_SIZE'='64', 
> 'BLOOM_FPP'='0.1');
> Alter table set local dictionary include.
>  alter table uniqdata_load set 
> tblproperties('local_dictionary_include'='CUST_NAME');
>  
> Issue : Alter table set local dictionary include fails with incorrect error.
> 0: jdbc:hive2://10.18.98.101:22550/default> alter table uniqdata_load set 
> tblproperties('local_dictionary_include'='CUST_NAME');
> *Error: 
> org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: 
> streaming is not supported for index datamap (state=,code=0)*
>  
> Expected : Operation should be success. If the operation is unsupported it 
> should throw correct error message.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (CARBONDATA-2420) Support string longer than 32000 characters

2018-08-05 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin closed CARBONDATA-2420.
--
   Resolution: Resolved
Fix Version/s: 1.4.1

1.4.1 introduced the 32k feature (alpha) to support this

> Support string longer than 32000 characters
> ---
>
> Key: CARBONDATA-2420
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2420
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: xuchuanyin
>Assignee: xuchuanyin
>Priority: Major
> Fix For: 1.4.1
>
>  Time Spent: 19h 40m
>  Remaining Estimate: 0h
>
> Add a property 'long_string_columns' in table creation to support string 
> columns that will contain more than 32000 characters.
> Inside carbondata, it uses an integer instead of a short to store the length 
> of the bytes content.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (CARBONDATA-2340) load data exceeds 32000 bytes

2018-08-05 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin closed CARBONDATA-2340.
--
   Resolution: Fixed
Fix Version/s: 1.4.1

1.4.1 introduced the 32k feature (alpha) to support this

> load data exceeds 32000 bytes
> -
>
> Key: CARBONDATA-2340
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2340
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.3.0
>Reporter: niaoshu
>Assignee: xuchuanyin
>Priority: Blocker
> Fix For: 1.4.1
>
>   Original Estimate: 12h
>  Remaining Estimate: 12h
>
> INFO storage.BlockManagerMasterEndpoint: Registering block manager 
> spark1:12603 with 5.2 GB RAM, BlockManagerId(1, spark1, 12603, None)
> 18/04/11 14:24:23 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in 
> memory on spark1:12603 (size: 34.9 KB, free: 5.2 GB)
> 18/04/11 14:24:34 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0 
> (TID 0, spark1, executor 1): 
> org.apache.carbondata.processing.loading.exception.CarbonDataLoadingException:
>  Dataload failed, String size cannot exceed 32000 bytes
>  at 
> org.apache.carbondata.processing.loading.converter.impl.NonDictionaryFieldConverterImpl.convert(NonDictionaryFieldConverterImpl.java:75)
>  at 
> org.apache.carbondata.processing.loading.converter.impl.RowConverterImpl.convert(RowConverterImpl.java:162)
>  at 
> org.apache.carbondata.processing.loading.steps.DataConverterProcessorStepImpl.processRowBatch(DataConverterProcessorStepImpl.java:104)
>  at 
> org.apache.carbondata.processing.loading.steps.DataConverterProcessorStepImpl$1.next(DataConverterProcessorStepImpl.java:91)
>  at 
> org.apache.carbondata.processing.loading.steps.DataConverterProcessorStepImpl$1.next(DataConverterProcessorStepImpl.java:77)
>  at 
> org.apache.carbondata.processing.loading.sort.impl.ParallelReadMergeSorterImpl$SortIteratorThread.run(ParallelReadMergeSorterImpl.java:214)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:748)
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (CARBONDATA-2339) Array index out of bounds

2018-08-05 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin closed CARBONDATA-2339.
--
   Resolution: Fixed
Fix Version/s: (was: NONE)
   1.4.1

1.4.1 introduced the 32k feature (alpha) to support this

> Array index out of bounds
> 
>
> Key: CARBONDATA-2339
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2339
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.3.0
>Reporter: niaoshu
>Assignee: xuchuanyin
>Priority: Blocker
> Fix For: 1.4.1
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> java.lang.ArrayIndexOutOfBoundsException



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (CARBONDATA-2202) Introduce local dictionary encoding for dimensions

2018-08-05 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin closed CARBONDATA-2202.
--
   Resolution: Fixed
Fix Version/s: 1.4.1

1.4.1 introduced local dictionary to support this

> Introduce local dictionary encoding for dimensions
> --
>
> Key: CARBONDATA-2202
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2202
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: xuchuanyin
>Assignee: xuchuanyin
>Priority: Major
> Fix For: 1.4.1
>
>
> Currently Carbondata will generate a global dictionary for columns with the 
> 'dictionary_include' attribute.
> A dimension column without that attribute will only be stored after some 
> simple compression. These columns can also be dictionary encoded at file 
> level (called 'local dictionary') to reduce the data size.
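
Usage sketch, with property names as they appear in the describe-formatted 
output earlier in this thread (treat exact keys as assumptions for your 
version):

```
// Enabling local dictionary for selected dimension columns.
sql(
  """CREATE TABLE t_ld (c1 STRING, c2 STRING)
    |STORED BY 'carbondata'
    |TBLPROPERTIES('local_dictionary_enable'='true',
    |  'local_dictionary_include'='c1,c2')
  """.stripMargin)
```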



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (CARBONDATA-2166) Default value of cutoff timestamp is wrong

2018-08-05 Thread xuchuanyin (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xuchuanyin closed CARBONDATA-2166.
--
Resolution: Not A Problem

> Default value of cutoff timestamp is wrong
> --
>
> Key: CARBONDATA-2166
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2166
> Project: CarbonData
>  Issue Type: Bug
>Reporter: xuchuanyin
>Assignee: xuchuanyin
>Priority: Major
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> In the configuration-parameters.md, it says that the default value of 
> `carbon.cutoffTimestamp` is `1970-01-01 05:30:00`. But actually 
> `TimeStampDirectDictionaryGenerator` uses empty as the default value.
>  
> As a result, some tests in the module `SDVTests` failed on my local machine.
> For example, the test case `BadRecord_Dataload_006` failed in maven but ran 
> successfully in the IDE.
>  
> Besides, the TimeZone should also be set accordingly to make the tests right.
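
A sketch of pinning both settings for deterministic tests (the property key's 
capitalization follows the docs; verify it for your version, and the values 
are illustrative):

```
import java.util.TimeZone
import org.apache.carbondata.core.util.CarbonProperties

// Pin the time zone and the cutoff so direct-dictionary timestamps encode
// identically in maven runs and in the IDE.
TimeZone.setDefault(TimeZone.getTimeZone("GMT+05:30"))
CarbonProperties.getInstance()
  .addProperty("carbon.cutOffTimestamp", "1970-01-01 05:30:00")
```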



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2820) Block rebuilding for preagg, bloom and lucene datamap

2018-08-03 Thread xuchuanyin (JIRA)
xuchuanyin created CARBONDATA-2820:
--

 Summary: Block rebuilding for preagg, bloom and lucene datamap
 Key: CARBONDATA-2820
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2820
 Project: CarbonData
  Issue Type: Improvement
Reporter: xuchuanyin
Assignee: xuchuanyin


Currently we will block rebuilding these datamaps.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

