[jira] [Commented] (CARBONDATA-3631) StringIndexOutOfBoundsException When Inserting Select From a Parquet Table with Empty array/map

2019-12-30 Thread Zhichao Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-3631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17005201#comment-17005201
 ] 

Zhichao  Zhang commented on CARBONDATA-3631:


[~shenhong] please raise a pr to fix this, thanks.

> StringIndexOutOfBoundsException When Inserting Select From a Parquet Table 
> with Empty array/map
> ---
>
> Key: CARBONDATA-3631
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3631
> Project: CarbonData
>  Issue Type: Bug
>Affects Versions: 1.6.1, 2.0.0
>Reporter: Xingjun Hao
>Priority: Minor
> Fix For: 2.0.0
>
>
> sql("insert into datatype_array_parquet values(array())")
> sql("insert into datatype_array_carbondata select f from 
> datatype_array_parquet")
>  
> {code:java}
> java.lang.StringIndexOutOfBoundsException: String index out of range: -1
> at java.lang.AbstractStringBuilder.substring(AbstractStringBuilder.java:935)
> at java.lang.StringBuilder.substring(StringBuilder.java:76)
> at scala.collection.mutable.StringBuilder.substring(StringBuilder.scala:166)
> at 
> org.apache.carbondata.streaming.parser.FieldConverter$.objectToString(FieldConverter.scala:77)
> at 
> org.apache.carbondata.spark.util.CarbonScalaUtil$.getString(CarbonScalaUtil.scala:71)
> {code}
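The trace points at FieldConverter.objectToString calling StringBuilder.substring with a negative index. The actual CarbonData code is not reproduced here, but the failure mode can be sketched: joining collection elements with a delimiter and then trimming the trailing delimiter via substring breaks when the collection is empty. A minimal sketch, assuming that delimiter-trimming pattern:

```java
// Minimal sketch (not the actual CarbonData code): joining array elements
// with a delimiter and trimming the trailing delimiter via substring throws
// StringIndexOutOfBoundsException when the array is empty.
public class EmptyArrayToString {
    static final String DELIMITER = ",";

    // Buggy variant: mirrors the failure mode seen in the trace.
    static String unsafeJoin(Object[] values) {
        StringBuilder builder = new StringBuilder();
        for (Object v : values) {
            builder.append(v).append(DELIMITER);
        }
        // For an empty array, length() - 1 is -1 -> StringIndexOutOfBoundsException.
        return builder.substring(0, builder.length() - DELIMITER.length());
    }

    // Guarded variant: return an empty string for an empty collection.
    static String safeJoin(Object[] values) {
        if (values.length == 0) {
            return "";
        }
        StringBuilder builder = new StringBuilder();
        for (Object v : values) {
            builder.append(v).append(DELIMITER);
        }
        return builder.substring(0, builder.length() - DELIMITER.length());
    }

    public static void main(String[] args) {
        System.out.println(safeJoin(new Object[]{1, 2, 3})); // 1,2,3
        try {
            unsafeJoin(new Object[]{});
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("empty array triggers: " + e);
        }
    }
}
```

The guard matches the shape of the reported fix: treat an empty array/map as an empty string instead of assuming at least one trailing delimiter exists.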



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CARBONDATA-3619) NoSuchMethodError(registerCurrentOperationLog) While Creating Table

2019-12-18 Thread Zhichao Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang resolved CARBONDATA-3619.

Resolution: Fixed

> NoSuchMethodError(registerCurrentOperationLog) While Creating Table
> ---
>
> Key: CARBONDATA-3619
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3619
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 1.6.1, 2.0.0
>Reporter: Xingjun Hao
>Priority: Minor
> Fix For: 2.0.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> ExecuteStatementOperation.java exists in both the hive-service module and the 
> spark-hive-thriftserver module, leading to "NoSuchMethodError: 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.registerCurrentOperationLog()V"
> {code:java}
> Caused by: java.lang.NoSuchMethodError: 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.registerCurrentOperationLog()V
>  
> at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.protected$registerCurrentOperationLog(SparkExecuteStatementOperation.scala:173)
>  
> at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:173)
>  
> at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:171)
>  
> at java.security.AccessController.doPrivileged(Native Method) at 
> javax.security.auth.Subject.doAs(Subject.java:422) 
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>  
> at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:185)
>  
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266) ... 3 more
> {code}
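The root cause above is a duplicate-class conflict: the same class exists in two modules, so which copy the JVM resolves depends on classpath ordering. As a hedged diagnostic sketch (the helper below is illustrative, not part of CarbonData), you can ask the JVM where it actually loaded a class from:

```java
import java.security.CodeSource;

public class ClassOriginCheck {
    // Report where a class was actually loaded from. A NoSuchMethodError at
    // run time usually means the JVM resolved the class from a different jar
    // than the one the caller was compiled against, so the load location is
    // the first thing to check when a class ships in two modules.
    static String locate(String className) {
        try {
            Class<?> clazz = Class.forName(className);
            CodeSource source = clazz.getProtectionDomain().getCodeSource();
            // JDK bootstrap classes report no code source.
            return source == null ? "(bootstrap classpath)"
                                  : source.getLocation().toString();
        } catch (ClassNotFoundException e) {
            return "(not found)";
        }
    }

    public static void main(String[] args) {
        // In a real Spark deployment you would pass the conflicting class
        // name from the stack trace instead of these placeholders.
        System.out.println(locate("java.lang.String"));
        System.out.println(locate("no.such.Clazz"));
    }
}
```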



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CARBONDATA-3612) Caused by: java.io.IOException: Problem in loading segment blocks: null

2019-12-15 Thread Zhichao Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-3612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16996914#comment-16996914
 ] 

Zhichao  Zhang commented on CARBONDATA-3612:


[~SeaAndHill] you can add me on WeChat (xm_zzc) and I can help look into this problem.

> Caused by: java.io.IOException: Problem in loading segment blocks: null
> ---
>
> Key: CARBONDATA-3612
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3612
> Project: CarbonData
>  Issue Type: Bug
>  Components: core, data-load
>Affects Versions: 1.5.1
>Reporter: SeaAndHill
>Priority: Major
>
> at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:56) 
> at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:56) 
> at 
> org.apache.spark.sql.execution.exchange.ShuffleExchange.doExecute(ShuffleExchange.scala:115)
>  at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
>  at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
>  at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
>  at 
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>  at 
> org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135) at 
> org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116) at 
> org.apache.spark.sql.execution.InputAdapter.inputRDDs(WholeStageCodegenExec.scala:252)
>  at 
> org.apache.spark.sql.execution.aggregate.HashAggregateExec.inputRDDs(HashAggregateExec.scala:141)
>  at 
> org.apache.spark.sql.execution.aggregate.HashAggregateExec.inputRDDs(HashAggregateExec.scala:141)
>  at 
> org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:386)
>  at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
>  at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
>  at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
>  at 
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>  at 
> org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135) at 
> org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116) at 
> org.apache.spark.sql.execution.exchange.ShuffleExchange.prepareShuffleDependency(ShuffleExchange.scala:88)
>  at 
> org.apache.spark.sql.execution.exchange.ShuffleExchange$$anonfun$doExecute$1.apply(ShuffleExchange.scala:124)
>  at 
> org.apache.spark.sql.execution.exchange.ShuffleExchange$$anonfun$doExecute$1.apply(ShuffleExchange.scala:115)
>  at 
> org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:52) 
> ... 35 more
> Caused by: java.io.IOException: Problem in loading segment blocks: null 
> at 
> org.apache.carbondata.core.indexstore.BlockletDataMapIndexStore.getAll(BlockletDataMapIndexStore.java:193)
>  at 
> org.apache.carbondata.core.indexstore.blockletindex.BlockletDataMapFactory.getDataMaps(BlockletDataMapFactory.java:144)
>  at 
> org.apache.carbondata.core.datamap.TableDataMap.prune(TableDataMap.java:139) 
> at 
> org.apache.carbondata.hadoop.api.CarbonInputFormat.getPrunedBlocklets(CarbonInputFormat.java:493)
>  at 
> org.apache.carbondata.hadoop.api.CarbonInputFormat.getDataBlocksOfSegment(CarbonInputFormat.java:412)
>  at 
> org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getSplits(CarbonTableInputFormat.java:529)
>  at 
> org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getSplits(CarbonTableInputFormat.java:220)
>  at 
> org.apache.carbondata.spark.rdd.CarbonScanRDD.internalGetPartitions(CarbonScanRDD.scala:127)
>  at 
> org.apache.carbondata.spark.rdd.CarbonRDD.getPartitions(CarbonRDD.scala:66) 
> at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252) at 
> org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250) at 
> scala.Option.getOrElse(Option.scala:121) at 
> org.apache.spark.rdd.RDD.partitions(RDD.scala:250) at 
> org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
>  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252) at 
> org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250) at 
> scala.Option.getOrElse(Option.scala:121) at 
> org.apache.spark.rdd.RDD.partitions(RDD.scala:250) at 
> org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
>  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252) at 
> org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250) at 
> scala.Option.getOrElse(Option.scala:121) at 
> org.apache.spark.rdd.RDD.partitions(RDD.scala:250) at 
> 

[jira] [Created] (CARBONDATA-3611) Filter failed with measure columns on stream table when this stream table includes complex columns

2019-12-06 Thread Zhichao Zhang (Jira)
Zhichao  Zhang created CARBONDATA-3611:
--

 Summary: Filter failed with measure columns on stream table when 
this stream table includes complex columns
 Key: CARBONDATA-3611
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3611
 Project: CarbonData
  Issue Type: Bug
Reporter: Zhichao  Zhang
Assignee: Zhichao  Zhang


Filtering with measure columns on a stream table fails when the stream table 
includes complex columns.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CARBONDATA-3596) After execute 'add column' command, it will throw exception when execute load data command or select sql on a table which includes complex columns

2019-11-26 Thread Zhichao Zhang (Jira)
Zhichao  Zhang created CARBONDATA-3596:
--

 Summary: After execute 'add column' command, it will throw 
exception when execute load data command or select sql on a table which 
includes complex columns
 Key: CARBONDATA-3596
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3596
 Project: CarbonData
  Issue Type: Bug
Reporter: Zhichao  Zhang
Assignee: Zhichao  Zhang


After executing the 'add column' command, an exception is thrown when executing 
a load data command or a select SQL statement on a table which includes complex 
columns.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CARBONDATA-3591) optimize java code checkstyle for NoWhitespaceAfter rule

2019-11-20 Thread Zhichao Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang updated CARBONDATA-3591:
---
Fix Version/s: 2.0.0

> optimize java code checkstyle for NoWhitespaceAfter rule
> 
>
> Key: CARBONDATA-3591
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3591
> Project: CarbonData
>  Issue Type: Improvement
>  Components: core
>Reporter: lamber-ken
>Priority: Major
> Fix For: 2.0.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> optimize java code checkstyle for NoWhitespaceAfter rule



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CARBONDATA-3591) optimize java code checkstyle for NoWhitespaceAfter rule

2019-11-20 Thread Zhichao Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang resolved CARBONDATA-3591.

Resolution: Fixed

> optimize java code checkstyle for NoWhitespaceAfter rule
> 
>
> Key: CARBONDATA-3591
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3591
> Project: CarbonData
>  Issue Type: Improvement
>  Components: core
>Reporter: lamber-ken
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> optimize java code checkstyle for NoWhitespaceAfter rule



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CARBONDATA-3576) optimize java code checkstyle for EmptyLineSeparator rule

2019-11-20 Thread Zhichao Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang resolved CARBONDATA-3576.

Resolution: Fixed

> optimize java code checkstyle for EmptyLineSeparator rule
> -
>
> Key: CARBONDATA-3576
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3576
> Project: CarbonData
>  Issue Type: Improvement
>  Components: core, other
>Affects Versions: 1.6.1
>Reporter: lamber-ken
>Priority: Minor
> Fix For: 2.0.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> optimize java code checkstyle for EmptyLineSeparator rule



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CARBONDATA-3583) Upgrade default JDK version 1.7 to 1.8

2019-11-14 Thread Zhichao Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang updated CARBONDATA-3583:
---
Fix Version/s: 2.0.0

> Upgrade default JDK version 1.7 to 1.8
> --
>
> Key: CARBONDATA-3583
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3583
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Zhichao  Zhang
>Assignee: Zhichao  Zhang
>Priority: Minor
> Fix For: 2.0.0
>
>
> Upgrade the default JDK version from 1.7 to 1.8, which provides features that 
> make the code cleaner.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CARBONDATA-3583) Upgrade default JDK version 1.7 to 1.8

2019-11-14 Thread Zhichao Zhang (Jira)
Zhichao  Zhang created CARBONDATA-3583:
--

 Summary: Upgrade default JDK version 1.7 to 1.8
 Key: CARBONDATA-3583
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3583
 Project: CarbonData
  Issue Type: Improvement
Reporter: Zhichao  Zhang
Assignee: Zhichao  Zhang


Upgrade the default JDK version from 1.7 to 1.8, which provides features that 
make the code cleaner.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CARBONDATA-3561) After delete/update the data, the query results become incorrect

2019-11-14 Thread Zhichao Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang updated CARBONDATA-3561:
---
Fix Version/s: 2.0.0

> After delete/update the data, the query results become incorrect
> 
>
> Key: CARBONDATA-3561
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3561
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.6.0, 1.5.4
> Environment: SUSE12 CDH5
>Reporter: Xigua
>Assignee: Zhichao  Zhang
>Priority: Blocker
> Fix For: 2.0.0
>
> Attachments: image-2019-10-30-14-42-49-871.png, 
> image-2019-10-30-14-43-21-972.png
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
>  
> {quote}CREATE TABLE `workqueue_zyy02` (`queuecode` STRING, `channelflag` 
> STRING, `dt` STRING)
>  USING org.apache.spark.sql.CarbonSource
>  PARTITIONED BY (dt);
>   
>  delete from workqueue_zyy02 m where m.queuecode ='1';
> {quote}
> *before*
>  * !image-2019-10-30-14-42-49-871.png!
>  * 
>  * *after*
>  * !image-2019-10-30-14-43-21-972.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (CARBONDATA-3569) spark ui open exception

2019-11-13 Thread Zhichao Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang reassigned CARBONDATA-3569:
--

Assignee: Zhichao  Zhang

> spark ui open exception
> ---
>
> Key: CARBONDATA-3569
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3569
> Project: CarbonData
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 1.6.1
>Reporter: tianyou
>Assignee: Zhichao  Zhang
>Priority: Critical
> Fix For: 2.0.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The Spark 2.3.2 jars contain javax packages, and the CarbonData package 
> references a different javax version, which causes an exception in the Spark UI.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CARBONDATA-3577) Use Spark 2.3 as default version and upgrade Spark 2.3.2 to 2.3.4

2019-11-13 Thread Zhichao Zhang (Jira)
Zhichao  Zhang created CARBONDATA-3577:
--

 Summary: Use Spark 2.3 as default version and upgrade Spark 2.3.2 
to 2.3.4
 Key: CARBONDATA-3577
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3577
 Project: CarbonData
  Issue Type: Improvement
  Components: spark-integration
Reporter: Zhichao  Zhang
Assignee: Zhichao  Zhang
 Fix For: 2.0.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CARBONDATA-3561) After delete/update the data, the query results become incorrect

2019-11-11 Thread Zhichao Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang updated CARBONDATA-3561:
---
Summary: After delete/update the data, the query results become incorrect  
(was: After deletor updat the data, the query results become incorrect)

> After delete/update the data, the query results become incorrect
> 
>
> Key: CARBONDATA-3561
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3561
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.6.0, 1.5.4
> Environment: SUSE12 CDH5
>Reporter: Xigua
>Assignee: Zhichao  Zhang
>Priority: Blocker
> Attachments: image-2019-10-30-14-42-49-871.png, 
> image-2019-10-30-14-43-21-972.png
>
>
>  
> {quote}CREATE TABLE `workqueue_zyy02` (`queuecode` STRING, `channelflag` 
> STRING, `dt` STRING)
>  USING org.apache.spark.sql.CarbonSource
>  PARTITIONED BY (dt);
>   
>  delete from workqueue_zyy02 m where m.queuecode ='1';
> {quote}
> *before*
>  * !image-2019-10-30-14-42-49-871.png!
>  * 
>  * *after*
>  * !image-2019-10-30-14-43-21-972.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Reopened] (CARBONDATA-3561) After deletor updat the data, the query results become incorrect

2019-11-07 Thread Zhichao Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang reopened CARBONDATA-3561:

  Assignee: Zhichao  Zhang

This issue is not resolved.

> After deletor updat the data, the query results become incorrect
> 
>
> Key: CARBONDATA-3561
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3561
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.6.0, 1.5.4
> Environment: SUSE12 CDH5
>Reporter: Xigua
>Assignee: Zhichao  Zhang
>Priority: Blocker
> Attachments: image-2019-10-30-14-42-49-871.png, 
> image-2019-10-30-14-43-21-972.png
>
>
>  
> {quote}CREATE TABLE `workqueue_zyy02` (`queuecode` STRING, `channelflag` 
> STRING, `dt` STRING)
>  USING org.apache.spark.sql.CarbonSource
>  PARTITIONED BY (dt);
>   
>  delete from workqueue_zyy02 m where m.queuecode ='1';
> {quote}
> *before*
>  * !image-2019-10-30-14-42-49-871.png!
>  * 
>  * *after*
>  * !image-2019-10-30-14-43-21-972.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CARBONDATA-3527) Throw 'String length cannot exceed 32000 characters' exception when load data with 'GLOBAL_SORT' from csv which include big complex type data

2019-09-25 Thread Zhichao Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang updated CARBONDATA-3527:
---
Description: 
*Problem:*

When complex type data requires more than 32000 characters to represent in a 
CSV file, loading that data with 'GLOBAL_SORT' throws a 'String length cannot 
exceed 32000 characters' exception.

*Cause:*

When 'GLOBAL_SORT' is used to load data from CSV files, the files are read and 
the data is first stored in a StringArrayRow, where every field is a string. 
When 'CarbonScalaUtil.getString' is called in 'NewRddIterator.next', it checks 
the length of every field and throws the 'String length cannot exceed 32000 
characters' exception, even for complex type data that legitimately takes more 
than 32000 characters in the CSV files.

*Solution:*

In 'FieldConverter.objectToString' (called from 'CarbonScalaUtil.getString'), 
skip the length check when the field's data type is a complex type.

> Throw 'String length cannot exceed 32000 characters' exception when load data 
> with 'GLOBAL_SORT' from csv which include big complex type data
> -
>
> Key: CARBONDATA-3527
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3527
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 1.6.0
>Reporter: Zhichao  Zhang
>Assignee: Zhichao  Zhang
>Priority: Major
> Fix For: 1.6.1
>
>
> *Problem:*
> When complex type data requires more than 32000 characters to represent in a 
> CSV file, loading that data with 'GLOBAL_SORT' throws a 'String length cannot 
> exceed 32000 characters' exception.
> *Cause:*
> When 'GLOBAL_SORT' is used to load data from CSV files, the files are read 
> and the data is first stored in a StringArrayRow, where every field is a 
> string. When 'CarbonScalaUtil.getString' is called in 'NewRddIterator.next', 
> it checks the length of every field and throws the exception even for complex 
> type data that legitimately takes more than 32000 characters in the CSV files.
> *Solution:*
> In 'FieldConverter.objectToString' (called from 'CarbonScalaUtil.getString'), 
> skip the length check when the field's data type is a complex type.
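The proposed solution can be sketched as follows. This is an illustrative stand-in, not the actual FieldConverter API: only non-complex string fields are subject to the 32000-character limit, while complex (array/struct/map) values, serialized as one long delimited string at this stage, skip the check.

```java
// Illustrative sketch of the fix (not CarbonData's real FieldConverter):
// exempt complex-type payloads from the 32000-character string limit.
public class FieldLengthCheck {
    static final int MAX_STRING_LENGTH = 32000;

    static String objectToString(String value, boolean isComplexType) {
        if (!isComplexType && value.length() > MAX_STRING_LENGTH) {
            throw new IllegalArgumentException(
                "String length cannot exceed " + MAX_STRING_LENGTH + " characters");
        }
        return value;
    }

    public static void main(String[] args) {
        // A 40000-character payload, as a large array/map would serialize to.
        String big = new String(new char[40000]).replace('\0', 'x');
        // Complex-type payloads larger than the limit pass through.
        System.out.println(objectToString(big, true).length());
        try {
            objectToString(big, false);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```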



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CARBONDATA-3527) Throw 'String length cannot exceed 32000 characters' exception when load data with 'GLOBAL_SORT' from csv which include big complex type data

2019-09-24 Thread Zhichao Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang updated CARBONDATA-3527:
---
Description: (was: Problem:

when load data with 'GLOBAL_SORT' from csv, and these csv files include some 
big complex type data, which are used long string to )

> Throw 'String length cannot exceed 32000 characters' exception when load data 
> with 'GLOBAL_SORT' from csv which include big complex type data
> -
>
> Key: CARBONDATA-3527
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3527
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 1.6.0
>Reporter: Zhichao  Zhang
>Assignee: Zhichao  Zhang
>Priority: Major
> Fix For: 1.6.1
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CARBONDATA-3527) Throw 'String length cannot exceed 32000 characters' exception when load data with 'GLOBAL_SORT' from csv which include big complex type data

2019-09-24 Thread Zhichao Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang updated CARBONDATA-3527:
---
Description: 
Problem:

when load data with 'GLOBAL_SORT' from csv, and these csv files include some 
big complex type data, which are used long string to 

> Throw 'String length cannot exceed 32000 characters' exception when load data 
> with 'GLOBAL_SORT' from csv which include big complex type data
> -
>
> Key: CARBONDATA-3527
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3527
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 1.6.0
>Reporter: Zhichao  Zhang
>Assignee: Zhichao  Zhang
>Priority: Major
> Fix For: 1.6.1
>
>
> Problem:
> when load data with 'GLOBAL_SORT' from csv, and these csv files include some 
> big complex type data, which are used long string to 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CARBONDATA-3527) Throw 'String length cannot exceed 32000 characters' exception when load data with 'GLOBAL_SORT' from csv which include big complex type data

2019-09-24 Thread Zhichao Zhang (Jira)
Zhichao  Zhang created CARBONDATA-3527:
--

 Summary: Throw 'String length cannot exceed 32000 characters' 
exception when load data with 'GLOBAL_SORT' from csv which include big complex 
type data
 Key: CARBONDATA-3527
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3527
 Project: CarbonData
  Issue Type: Bug
  Components: spark-integration
Affects Versions: 1.6.0
Reporter: Zhichao  Zhang
Assignee: Zhichao  Zhang
 Fix For: 1.6.1






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CARBONDATA-3513) can not run major compaction when using hive partition table

2019-09-05 Thread Zhichao Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang updated CARBONDATA-3513:
---
Description: 
The major compaction command fails. Error information:
{code:java}
2019-09-03 13:35:49 INFO  BlockManagerInfo:54 - Added broadcast_0_piece0 in 
memory on czh-yhfx-redis1:41430 (size: 26.4 KB, free: 5.2 GB)2019-09-03 
13:35:49 INFO  BlockManagerInfo:54 - Added broadcast_0_piece0 in memory on 
czh-yhfx-redis1:41430 (size: 26.4 KB, free: 5.2 GB)2019-09-03 13:35:52 WARN  
TaskSetManager:66 - Lost task 1.0 in stage 0.0 (TID 1, czh-yhfx-redis1, 
executor 1): java.lang.NumberFormatException: For input string: 
"328812001110" at 
java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) 
at java.lang.Long.parseLong(Long.java:592) at 
java.lang.Long.parseLong(Long.java:631) at 
org.apache.carbondata.core.util.path.CarbonTablePath$DataFileUtil.getTaskIdFromTaskNo(CarbonTablePath.java:503)
 at 
org.apache.carbondata.processing.store.CarbonFactDataHandlerModel.getCarbonFactDataHandlerModel(CarbonFactDataHandlerModel.java:396)
 at 
org.apache.carbondata.processing.merger.RowResultMergerProcessor.(RowResultMergerProcessor.java:86)
 at 
org.apache.carbondata.spark.rdd.CarbonMergerRDD$$anon$1.(CarbonMergerRDD.scala:213)
 at 
org.apache.carbondata.spark.rdd.CarbonMergerRDD.internalCompute(CarbonMergerRDD.scala:86)
 at org.apache.carbondata.spark.rdd.CarbonRDD.compute(CarbonRDD.scala:82) at 
org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at 
org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at 
org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) at 
org.apache.spark.scheduler.Task.run(Task.scala:108) at 
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338) at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
at java.lang.Thread.run(Thread.java:748){code}
 

version: apache-carbondata-1.5.1-bin-spark2.2.1-hadoop2.7.2.jar

The table is a hive partition table.

 

  was:
Major compaction command runs error.ERROR information:

2019-09-03 13:35:49 INFO  BlockManagerInfo:54 - Added broadcast_0_piece0 in 
memory on czh-yhfx-redis1:41430 (size: 26.4 KB, free: 5.2 GB)2019-09-03 
13:35:49 INFO  BlockManagerInfo:54 - Added broadcast_0_piece0 in memory on 
czh-yhfx-redis1:41430 (size: 26.4 KB, free: 5.2 GB)2019-09-03 13:35:52 WARN  
TaskSetManager:66 - Lost task 1.0 in stage 0.0 (TID 1, czh-yhfx-redis1, 
executor 1): java.lang.NumberFormatException: For input string: 
"328812001110" at 
java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) 
at java.lang.Long.parseLong(Long.java:592) at 
java.lang.Long.parseLong(Long.java:631) at 
org.apache.carbondata.core.util.path.CarbonTablePath$DataFileUtil.getTaskIdFromTaskNo(CarbonTablePath.java:503)
 at 
org.apache.carbondata.processing.store.CarbonFactDataHandlerModel.getCarbonFactDataHandlerModel(CarbonFactDataHandlerModel.java:396)
 at 
org.apache.carbondata.processing.merger.RowResultMergerProcessor.(RowResultMergerProcessor.java:86)
 at 
org.apache.carbondata.spark.rdd.CarbonMergerRDD$$anon$1.(CarbonMergerRDD.scala:213)
 at 
org.apache.carbondata.spark.rdd.CarbonMergerRDD.internalCompute(CarbonMergerRDD.scala:86)
 at org.apache.carbondata.spark.rdd.CarbonRDD.compute(CarbonRDD.scala:82) at 
org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at 
org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at 
org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) at 
org.apache.spark.scheduler.Task.run(Task.scala:108) at 
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338) at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
at java.lang.Thread.run(Thread.java:748)

 

version:apache-carbondata-1.5.1-bin-spark2.2.1-hadoop2.7.2.jar

table is a hive partition table.
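The trace above shows Long.parseLong rejecting the task number. Notably, the literal "328812001110" fits comfortably in a long, so the parsed input presumably carried additional non-numeric characters (hive partition tables produce a different task-number layout, and the mail wrapping may have truncated the message). A hedged illustration; the "task-" prefix below is hypothetical, not CarbonData's actual task-number format:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TaskIdParse {
    // Long.parseLong rejects any input containing a non-digit character, so a
    // parser that assumes a fixed task-number layout fails as soon as the
    // layout differs. Extracting the first purely numeric run is one defensive
    // alternative.
    static long firstNumericRun(String taskNo) {
        Matcher m = Pattern.compile("\\d+").matcher(taskNo);
        if (m.find()) {
            return Long.parseLong(m.group());
        }
        throw new NumberFormatException("no numeric segment in: " + taskNo);
    }

    public static void main(String[] args) {
        System.out.println(Long.parseLong("328812001110")); // parses fine on its own
        System.out.println(firstNumericRun("task-328812001110"));
    }
}
```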

 


> can not run major compaction when using hive partition table
> 
>
> Key: CARBONDATA-3513
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3513
> Project: CarbonData
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.6.0
>Reporter: ocean
>Priority: Major
> Fix For: 1.6.1
>
> Attachments: 33cb0d98561f8b7f505eb7b2ff9f72e0.jpg
>
>
> The major compaction command fails. Error information:
> {code:java}
> 2019-09-03 13:35:49 INFO  BlockManagerInfo:54 - Added broadcast_0_piece0 in 
> memory on czh-yhfx-redis1:41430 (size: 26.4 KB, free: 5.2 GB)2019-09-03 
> 13:35:49 INFO  BlockManagerInfo:54 - Added broadcast_0_piece0 in memory on 
> czh-yhfx-redis1:41430 (size: 

[jira] [Created] (CARBONDATA-3501) Support to execute update sql on table with long_string field (Not update long_string field)

2019-08-26 Thread Zhichao Zhang (Jira)
Zhichao  Zhang created CARBONDATA-3501:
--

 Summary: Support to execute update sql on table with long_string 
field (Not update long_string field)
 Key: CARBONDATA-3501
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3501
 Project: CarbonData
  Issue Type: Improvement
Reporter: Zhichao  Zhang
Assignee: Zhichao  Zhang


When executing an update SQL statement (which does not update the long_string 
field) on a table with a long_string field, it fails.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (CARBONDATA-3497) Support to write long string for streaming table

2019-08-21 Thread Zhichao Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang updated CARBONDATA-3497:
---
Description: Support to write long string for streaming table

> Support to write long string for streaming table
> 
>
> Key: CARBONDATA-3497
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3497
> Project: CarbonData
>  Issue Type: Improvement
>  Components: spark-integration
>Reporter: Zhichao  Zhang
>Assignee: Zhichao  Zhang
>Priority: Minor
>
> Support to write long string for streaming table



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (CARBONDATA-3498) Support to alter sort_column for streaming table

2019-08-21 Thread Zhichao Zhang (Jira)
Zhichao  Zhang created CARBONDATA-3498:
--

 Summary: Support to alter sort_column for streaming table
 Key: CARBONDATA-3498
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3498
 Project: CarbonData
  Issue Type: Improvement
  Components: spark-integration
Reporter: Zhichao  Zhang
Assignee: Zhichao  Zhang


Support to alter sort_column for streaming table



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (CARBONDATA-3497) Support to write long string for streaming table

2019-08-21 Thread Zhichao Zhang (Jira)
Zhichao  Zhang created CARBONDATA-3497:
--

 Summary: Support to write long string for streaming table
 Key: CARBONDATA-3497
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3497
 Project: CarbonData
  Issue Type: Improvement
  Components: spark-integration
Reporter: Zhichao  Zhang
Assignee: Zhichao  Zhang






--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (CARBONDATA-3491) Return updated/deleted rows count when execute update/delete sql

2019-08-11 Thread Zhichao Zhang (JIRA)
Zhichao  Zhang created CARBONDATA-3491:
--

 Summary: Return updated/deleted rows count when execute 
update/delete sql
 Key: CARBONDATA-3491
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3491
 Project: CarbonData
  Issue Type: Improvement
Reporter: Zhichao  Zhang
Assignee: Zhichao  Zhang


Return the count of updated/deleted rows when executing an update/delete sql.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Resolved] (CARBONDATA-3488) Check the file size after move local file to carbon path

2019-08-10 Thread Zhichao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang resolved CARBONDATA-3488.

   Resolution: Fixed
Fix Version/s: 1.6.0

> Check the file size after move local file to carbon path
> 
>
> Key: CARBONDATA-3488
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3488
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Zhichao  Zhang
>Assignee: Zhichao  Zhang
>Priority: Minor
> Fix For: 1.6.0
>
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> *Problem:*
> One user hit an issue where the row count saved in the carbonindex file was non-zero but the file size of the corresponding carbondata file was 0.
>  
> *Solution:*
> In CarbonUtil.copyCarbonDataFileToCarbonStorePath, check whether the file size of the carbon file is the same as the size of the local file after moving the local file to the carbon path.
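
The check described above can be sketched as follows. This is a minimal illustration only, not CarbonData's actual implementation; the method name `copyAndVerify` and the temp-file setup are hypothetical, while the idea (compare source and destination sizes after the copy) is the one the issue proposes:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class CopyWithSizeCheck {
    // Copy a local file to the target store path and verify the copied
    // size matches the source size, conceptually what the fix adds to
    // copyCarbonDataFileToCarbonStorePath.
    static void copyAndVerify(Path local, Path target) throws IOException {
        long expected = Files.size(local);
        Files.copy(local, target, StandardCopyOption.REPLACE_EXISTING);
        long actual = Files.size(target);
        if (actual != expected) {
            throw new IOException("Copy verification failed: expected "
                + expected + " bytes but found " + actual);
        }
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("src", ".carbondata");
        Files.write(src, new byte[]{1, 2, 3, 4});
        Path dst = Files.createTempFile("dst", ".carbondata");
        copyAndVerify(src, dst);
        System.out.println(Files.size(dst));
    }
}
```

Such a check turns a silent zero-byte copy into a loud load failure instead of a corrupt segment discovered later at query time.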





[jira] [Resolved] (CARBONDATA-3477) Throw out exception when use sql: 'update table select\n...'

2019-08-08 Thread Zhichao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang resolved CARBONDATA-3477.

Resolution: Fixed

> Throw out exception when use sql: 'update table select\n...'
> 
>
> Key: CARBONDATA-3477
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3477
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Zhichao  Zhang
>Assignee: Zhichao  Zhang
>Priority: Minor
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> When using the sql below to update a table:
> {code:java}
> UPDATE IUD_table2 a
>  SET (a.IUD_table2_country, a.IUD_table2_salary) = (select
>  b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where 
> b.IUD_table1_id = 8)
>  WHERE a.IUD_table2_id < 6 or a.IUD_table2_id > 15{code}
> *It throws the following exception:* 
> {code:java}
> Exception in thread "main" org.apache.spark.sql.AnalysisException: == Parse1 
> ==
> mismatched input '.' expecting (line 2, pos 1)
> == SQL ==
>  select select
>  b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where 
> b.IUD_table1_id = 8 from iud_table2 a
>  -^^^
> == Parse2 ==
>  [1.1] failure: identifier matching regex (?i)ALTER expected
> select select
> {code}
>  
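
The failure can be illustrated with a small sketch. This is an illustration only: the real rewrite lives inside CarbonData's update-command parsing, and the helper `spliceSelect` below is hypothetical. The update statement is internally rewritten into a SELECT over the target table; if the sub-query text is spliced in verbatim, a newline directly after the SELECT keyword produces the invalid "select select\n..." seen in the report, and collapsing whitespace before splicing avoids it:

```java
public class UpdateRewriteSketch {
    // Normalize the extracted sub-query, drop its leading SELECT keyword,
    // then splice it into the rewritten statement so SELECT appears once.
    static String spliceSelect(String subQuery, String table) {
        String normalized = subQuery.trim().replaceAll("\\s+", " ");
        String body = normalized.replaceFirst("(?i)^select\\s+", "");
        return "select " + body + " from " + table;
    }

    public static void main(String[] args) {
        // Sub-query with a newline right after SELECT, as in the bug report.
        String sub = "select\n b.country, b.salary from t1 b where b.id = 8";
        System.out.println(spliceSelect(sub, "t2 a"));
    }
}
```

Note the rewritten form still ends with "... from t2 a" appended after the sub-query, matching the shape of the failing statement in the error message above.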





[jira] [Updated] (CARBONDATA-3477) Throw out exception when use sql: 'update table select\n...'

2019-08-08 Thread Zhichao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang updated CARBONDATA-3477:
---
Fix Version/s: 1.6.0

> Throw out exception when use sql: 'update table select\n...'
> 
>
> Key: CARBONDATA-3477
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3477
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Zhichao  Zhang
>Assignee: Zhichao  Zhang
>Priority: Minor
> Fix For: 1.6.0
>
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> When using the sql below to update a table:
> {code:java}
> UPDATE IUD_table2 a
>  SET (a.IUD_table2_country, a.IUD_table2_salary) = (select
>  b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where 
> b.IUD_table1_id = 8)
>  WHERE a.IUD_table2_id < 6 or a.IUD_table2_id > 15{code}
> *It throws the following exception:* 
> {code:java}
> Exception in thread "main" org.apache.spark.sql.AnalysisException: == Parse1 
> ==
> mismatched input '.' expecting (line 2, pos 1)
> == SQL ==
>  select select
>  b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where 
> b.IUD_table1_id = 8 from iud_table2 a
>  -^^^
> == Parse2 ==
>  [1.1] failure: identifier matching regex (?i)ALTER expected
> select select
> {code}
>  





[jira] [Resolved] (CARBONDATA-3483) Can not run horizontal compaction when execute update sql

2019-08-08 Thread Zhichao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang resolved CARBONDATA-3483.

   Resolution: Fixed
Fix Version/s: 1.6.0

> Can not run horizontal compaction when execute update sql
> -
>
> Key: CARBONDATA-3483
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3483
> Project: CarbonData
>  Issue Type: Bug
>Affects Versions: 1.5.3, 1.6.0, 1.5.4
>Reporter: Zhichao  Zhang
>Assignee: Zhichao  Zhang
>Priority: Major
> Fix For: 1.6.0
>
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> After PR#3166, horizontal compaction does not actually run when an update sql is executed.
> When an update sql runs and horizontal compaction is needed, the compaction tries to acquire update.lock and compaction.lock in CarbonAlterTableCompactionCommand.alterTableForCompaction, but both locks are already held by the update sql itself, so acquiring them fails and the compaction never executes.
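
A simplified model of the deadlock-by-self described above (illustration only; CarbonData's real locks are file based, and the class and method names below are hypothetical). The locks are not reentrant, so a second acquire of an already-held lock simply fails, and the fix amounts to telling the nested compaction that the locks are already held instead of re-acquiring them:

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class NonReentrantLockSketch {
    // A non-reentrant lock: a second acquire while held just fails.
    static class FileLock {
        private final AtomicBoolean held = new AtomicBoolean(false);
        boolean lockWithRetries() { return held.compareAndSet(false, true); }
        void unlock() { held.set(false); }
    }

    // Before the fix: always re-acquire, so the nested call fails.
    // After the fix: the caller signals the lock is already held.
    static boolean runCompaction(FileLock compactionLock, boolean alreadyLocked) {
        if (alreadyLocked || compactionLock.lockWithRetries()) {
            return true;   // compaction can proceed
        }
        return false;      // compaction silently skipped
    }

    public static void main(String[] args) {
        FileLock compactionLock = new FileLock();
        compactionLock.lockWithRetries();                         // held by the update command
        System.out.println(runCompaction(compactionLock, false)); // old behaviour: skipped
        System.out.println(runCompaction(compactionLock, true));  // fixed behaviour: runs
    }
}
```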





[jira] [Updated] (CARBONDATA-3488) Check the file size after move local file to carbon path

2019-08-07 Thread Zhichao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang updated CARBONDATA-3488:
---
Description: 
*Problem:*

One user hit an issue where the row count saved in the carbonindex file was non-zero but the file size of the corresponding carbondata file was 0.

 

*Solution:*

In CarbonUtil.copyCarbonDataFileToCarbonStorePath, check whether the file size of the carbon file is the same as the size of the local file after moving the local file to the carbon path.

  was:
One user met an issue

 

CarbonUtil.copyCarbonDataFileToCarbonStorePath


> Check the file size after move local file to carbon path
> 
>
> Key: CARBONDATA-3488
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3488
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Zhichao  Zhang
>Assignee: Zhichao  Zhang
>Priority: Minor
>
> *Problem:*
> One user hit an issue where the row count saved in the carbonindex file was non-zero but the file size of the corresponding carbondata file was 0.
>  
> *Solution:*
> In CarbonUtil.copyCarbonDataFileToCarbonStorePath, check whether the file size of the carbon file is the same as the size of the local file after moving the local file to the carbon path.





[jira] [Updated] (CARBONDATA-3488) Check the file size after move local file to carbon path

2019-08-07 Thread Zhichao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang updated CARBONDATA-3488:
---
Description: 
One user met an issue

 

CarbonUtil.copyCarbonDataFileToCarbonStorePath

> Check the file size after move local file to carbon path
> 
>
> Key: CARBONDATA-3488
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3488
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Zhichao  Zhang
>Assignee: Zhichao  Zhang
>Priority: Minor
>
> One user met an issue
>  
> CarbonUtil.copyCarbonDataFileToCarbonStorePath





[jira] [Created] (CARBONDATA-3488) Check the file size after move local file to carbon path

2019-08-07 Thread Zhichao Zhang (JIRA)
Zhichao  Zhang created CARBONDATA-3488:
--

 Summary: Check the file size after move local file to carbon path
 Key: CARBONDATA-3488
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3488
 Project: CarbonData
  Issue Type: Improvement
Reporter: Zhichao  Zhang
Assignee: Zhichao  Zhang








[jira] [Commented] (CARBONDATA-3476) Read time and scan time stats shown wrong in executor log for filter query

2019-08-06 Thread Zhichao Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-3476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16900718#comment-16900718
 ] 

Zhichao  Zhang commented on CARBONDATA-3476:


What's your account? I cannot find a user called 'Vikram Ahuja'.

> Read time and scan time stats shown wrong in executor log for filter query
> --
>
> Key: CARBONDATA-3476
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3476
> Project: CarbonData
>  Issue Type: Bug
>  Components: core
>Reporter: Vikram Ahuja
>Priority: Minor
> Fix For: 1.6.0
>
>  Time Spent: 8h 20m
>  Remaining Estimate: 0h
>
> Problem: Read time and scan time stats shown wrong in executor log for filter 
> query
> Root cause: projection read time was added into scan time, so the scan time and read time shown in the stats were incorrect.
> Solution: account the projection read time for both measure and dimension columns in the read stats.





[jira] [Updated] (CARBONDATA-3483) Can not run horizontal compaction when execute update sql

2019-08-02 Thread Zhichao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang updated CARBONDATA-3483:
---
Priority: Major  (was: Minor)

> Can not run horizontal compaction when execute update sql
> -
>
> Key: CARBONDATA-3483
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3483
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Zhichao  Zhang
>Assignee: Zhichao  Zhang
>Priority: Major
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> After PR#3166, horizontal compaction does not actually run when an update sql is executed.
> When an update sql runs and horizontal compaction is needed, the compaction tries to acquire update.lock and compaction.lock in CarbonAlterTableCompactionCommand.alterTableForCompaction, but both locks are already held by the update sql itself, so acquiring them fails and the compaction never executes.





[jira] [Updated] (CARBONDATA-3483) Can not run horizontal compaction when execute update sql

2019-08-02 Thread Zhichao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang updated CARBONDATA-3483:
---
Affects Version/s: 1.6.0
   1.5.3
   1.5.4

> Can not run horizontal compaction when execute update sql
> -
>
> Key: CARBONDATA-3483
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3483
> Project: CarbonData
>  Issue Type: Bug
>Affects Versions: 1.5.3, 1.6.0, 1.5.4
>Reporter: Zhichao  Zhang
>Assignee: Zhichao  Zhang
>Priority: Major
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> After PR#3166, horizontal compaction does not actually run when an update sql is executed.
> When an update sql runs and horizontal compaction is needed, the compaction tries to acquire update.lock and compaction.lock in CarbonAlterTableCompactionCommand.alterTableForCompaction, but both locks are already held by the update sql itself, so acquiring them fails and the compaction never executes.





[jira] [Updated] (CARBONDATA-3483) Can not run horizontal compaction when execute update sql

2019-07-31 Thread Zhichao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang updated CARBONDATA-3483:
---
Description: 
After PR#3166, horizontal compaction does not actually run when an update sql is executed.

When an update sql runs and horizontal compaction is needed, the compaction tries to acquire update.lock and compaction.lock in CarbonAlterTableCompactionCommand.alterTableForCompaction, but both locks are already held by the update sql itself, so acquiring them fails and the compaction never executes.

  was:
After PR#3166, horizontal compaction will not actually run when execute update 
sql.

When it runs update sql and will run horizontal compaction if needs, it will 
require update.lock and compaction.lock when execute 
CarbonAlterTableCompactionCommand.alterTableForCompaction, but these two locks 
already are locked when it starts to execute update sql. so it will require 
locks failed and don't execute compaction.


> Can not run horizontal compaction when execute update sql
> -
>
> Key: CARBONDATA-3483
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3483
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Zhichao  Zhang
>Assignee: Zhichao  Zhang
>Priority: Minor
>
> After PR#3166, horizontal compaction does not actually run when an update sql is executed.
> When an update sql runs and horizontal compaction is needed, the compaction tries to acquire update.lock and compaction.lock in CarbonAlterTableCompactionCommand.alterTableForCompaction, but both locks are already held by the update sql itself, so acquiring them fails and the compaction never executes.





[jira] [Updated] (CARBONDATA-3483) Can not run horizontal compaction when execute update sql

2019-07-31 Thread Zhichao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang updated CARBONDATA-3483:
---
Summary: Can not run horizontal compaction when execute update sql  (was: 
Do not run horizontal compaction when execute update sql)

> Can not run horizontal compaction when execute update sql
> -
>
> Key: CARBONDATA-3483
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3483
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Zhichao  Zhang
>Assignee: Zhichao  Zhang
>Priority: Minor
>
> After PR#3166, horizontal compaction does not actually run when an update sql is executed.
> When an update sql runs and horizontal compaction is needed, the compaction tries to acquire update.lock and compaction.lock in CarbonAlterTableCompactionCommand.alterTableForCompaction, but both locks are already held by the update sql itself, so acquiring them fails and the compaction never executes.





[jira] [Created] (CARBONDATA-3483) Do not run horizontal compaction when execute update sql

2019-07-31 Thread Zhichao Zhang (JIRA)
Zhichao  Zhang created CARBONDATA-3483:
--

 Summary: Do not run horizontal compaction when execute update sql
 Key: CARBONDATA-3483
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3483
 Project: CarbonData
  Issue Type: Bug
Reporter: Zhichao  Zhang
Assignee: Zhichao  Zhang


After PR#3166, horizontal compaction does not actually run when an update sql is executed.

When an update sql runs and horizontal compaction is needed, the compaction tries to acquire update.lock and compaction.lock in CarbonAlterTableCompactionCommand.alterTableForCompaction, but both locks are already held by the update sql itself, so acquiring them fails and the compaction never executes.





[jira] [Closed] (CARBONDATA-1625) Introduce new datatype of varchar(size) to store column length more than short limit.

2019-07-26 Thread Zhichao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang closed CARBONDATA-1625.
--
Resolution: Duplicate

This feature is already supported.

> Introduce new datatype of  varchar(size) to store column length more than 
> short limit.
> --
>
> Key: CARBONDATA-1625
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1625
> Project: CarbonData
>  Issue Type: New Feature
>  Components: file-format
>Reporter: Zhichao  Zhang
>Priority: Minor
>
> I am using Spark 2.1 + CarbonData 1.2 and found that if enable.unsafe.sort=true and the byte length of a column exceeds 32768, data loading fails.
> My test code: 
> 
> {code:java}
> val longStr = sb.toString()  // the getBytes length of longStr exceeds 32768 
> println(longStr.length()) 
> println(longStr.getBytes("UTF-8").length) 
> 
> import spark.implicits._ 
> val df1 = spark.sparkContext.parallelize(0 to 1000) 
>   .map(x => ("a", x.toString(), longStr, x, x.toLong, x * 2)) 
>   .toDF("stringField1", "stringField2", "stringField3", "intField", 
> "longField", "int2Field") 
>   
> val df2 = spark.sparkContext.parallelize(1001 to 2000) 
>   .map(x => ("b", x.toString(), (x % 2).toString(), x, x.toLong, x * 2)) 
>   .toDF("stringField1", "stringField2", "stringField3", "intField", 
> "longField", "int2Field") 
>   
> val df3 = df1.union(df2) 
> val tableName = "study_carbondata_test" 
> spark.sql(s"DROP TABLE IF EXISTS ${tableName} ").show() 
> val sortScope = "LOCAL_SORT"   // LOCAL_SORT   GLOBAL_SORT 
> spark.sql(s""" 
> |  CREATE TABLE IF NOT EXISTS ${tableName} ( 
> |stringField1  string, 
> |stringField2  string, 
> |stringField3  string, 
> |intField  int, 
> |longField bigint, 
> |int2Field int 
> |  ) 
> |  STORED BY 'carbondata' 
> |  TBLPROPERTIES('DICTIONARY_INCLUDE'='stringField1, stringField2', 
> |'SORT_COLUMNS'='stringField1, stringField2, intField, 
> longField', 
> |'SORT_SCOPE'='${sortScope}', 
> |'NO_INVERTED_INDEX'='stringField3, int2Field', 
> |'TABLE_BLOCKSIZE'='64' 
> |  ) 
>""".stripMargin) 
> df3.write 
>   .format("carbondata")   
>   .option("tableName", "study_carbondata_test") 
>   .option("compress", "true")  // just valid when tempCSV is true 
>   .option("tempCSV", "false") 
>   .option("single_pass", "true") 
>   .mode(SaveMode.Append) 
>   .save()
> {code}
> The error message: 
> {code:java}
> *java.lang.NegativeArraySizeException 
> at 
> org.apache.carbondata.processing.newflow.sort.unsafe.UnsafeCarbonRowPage.getRow(UnsafeCarbonRowPage.java:182)
>  
> at 
> org.apache.carbondata.processing.newflow.sort.unsafe.holder.UnsafeInmemoryHolder.readRow(UnsafeInmemoryHolder.java:63)
>  
> at 
> org.apache.carbondata.processing.newflow.sort.unsafe.merger.UnsafeSingleThreadFinalSortFilesMerger.startSorting(UnsafeSingleThreadFinalSortFilesMerger.java:114)
>  
> at 
> org.apache.carbondata.processing.newflow.sort.unsafe.merger.UnsafeSingleThreadFinalSortFilesMerger.startFinalMerge(UnsafeSingleThreadFinalSortFilesMerger.java:81)
>  
> at 
> org.apache.carbondata.processing.newflow.sort.impl.UnsafeParallelReadMergeSorterImpl.sort(UnsafeParallelReadMergeSorterImpl.java:105)
>  
> at 
> org.apache.carbondata.processing.newflow.steps.SortProcessorStepImpl.execute(SortProcessorStepImpl.java:62)
>  
> at 
> org.apache.carbondata.processing.newflow.steps.DataWriterProcessorStepImpl.execute(DataWriterProcessorStepImpl.java:87)
>  
> at 
> org.apache.carbondata.processing.newflow.DataLoadExecutor.execute(DataLoadExecutor.java:51)
>  
> at 
> org.apache.carbondata.spark.rdd.NewDataFrameLoaderRDD$$anon$2.(NewCarbonDataLoadRDD.scala:442)
>  
> at 
> org.apache.carbondata.spark.rdd.NewDataFrameLoaderRDD.internalCompute(NewCarbonDataLoadRDD.scala:405)
>  
> at 
> org.apache.carbondata.spark.rdd.CarbonRDD.compute(CarbonRDD.scala:62) 
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) 
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)* 
> {code}
> Currently the column length is stored as a short.
> Introduce a new datatype, varchar(size), to store columns whose length exceeds the short limit.
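
The NegativeArraySizeException above follows directly from that 2-byte length prefix: any byte length above Short.MAX_VALUE (32767) overflows into a negative value when narrowed to a short, and that negative value is later used as an array size on read. A minimal demonstration of the overflow (the serialization context is simplified; only the narrowing conversion is shown):

```java
public class ShortLengthLimit {
    public static void main(String[] args) {
        int actualLength = 40000;              // column byte length > 32767
        short stored = (short) actualLength;   // narrowed when written as the length prefix
        int readBack = stored;                 // sign-extended when read back
        System.out.println(readBack);          // negative => NegativeArraySizeException downstream
    }
}
```

A wider length field, as the later long-string/varchar support provides, removes the limit.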





[jira] [Updated] (CARBONDATA-3477) Throw out exception when use sql: 'update table select\n...'

2019-07-26 Thread Zhichao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang updated CARBONDATA-3477:
---
Description: 
When using the sql below to update a table:
{code:java}
UPDATE IUD_table2 a
 SET (a.IUD_table2_country, a.IUD_table2_salary) = (select
 b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where 
b.IUD_table1_id = 8)
 WHERE a.IUD_table2_id < 6 or a.IUD_table2_id > 15{code}
*It throws the following exception:* 
{code:java}
Exception in thread "main" org.apache.spark.sql.AnalysisException: == Parse1 ==
mismatched input '.' expecting (line 2, pos 1)
== SQL ==
 select select
 b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where 
b.IUD_table1_id = 8 from iud_table2 a
 -^^^
== Parse2 ==
 [1.1] failure: identifier matching regex (?i)ALTER expected
select select
{code}
 

  was:
When use below sql to update table:
{code:java}
UPDATE IUD_table2 a
 SET (a.IUD_table2_country, a.IUD_table2_salary) = (select
 b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where 
b.IUD_table1_id = 8)
 WHERE a.IUD_table2_id < 6 or a.IUD_table2_id > 15{code}
*It will throw out exception:*

 
{code:java}
Exception in thread "main" org.apache.spark.sql.AnalysisException: == Parse1 ==
mismatched input '.' expecting (line 2, pos 1)
== SQL ==
 select select
 b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where 
b.IUD_table1_id = 8 from iud_table2 a
 -^^^
== Parse2 ==
 [1.1] failure: identifier matching regex (?i)ALTER expected
select select
{code}
 


> Throw out exception when use sql: 'update table select\n...'
> 
>
> Key: CARBONDATA-3477
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3477
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Zhichao  Zhang
>Assignee: Zhichao  Zhang
>Priority: Minor
>
> When using the sql below to update a table:
> {code:java}
> UPDATE IUD_table2 a
>  SET (a.IUD_table2_country, a.IUD_table2_salary) = (select
>  b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where 
> b.IUD_table1_id = 8)
>  WHERE a.IUD_table2_id < 6 or a.IUD_table2_id > 15{code}
> *It throws the following exception:* 
> {code:java}
> Exception in thread "main" org.apache.spark.sql.AnalysisException: == Parse1 
> ==
> mismatched input '.' expecting (line 2, pos 1)
> == SQL ==
>  select select
>  b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where 
> b.IUD_table1_id = 8 from iud_table2 a
>  -^^^
> == Parse2 ==
>  [1.1] failure: identifier matching regex (?i)ALTER expected
> select select
> {code}
>  





[jira] [Created] (CARBONDATA-3477) Throw out exception when use sql: 'update table select\n...'

2019-07-26 Thread Zhichao Zhang (JIRA)
Zhichao  Zhang created CARBONDATA-3477:
--

 Summary: Throw out exception when use sql: 'update table 
select\n...'
 Key: CARBONDATA-3477
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3477
 Project: CarbonData
  Issue Type: Bug
Reporter: Zhichao  Zhang
Assignee: Zhichao  Zhang


When using the sql below to update a table:

UPDATE IUD_table2 a
 SET (a.IUD_table2_country, a.IUD_table2_salary) = (select
b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where 
b.IUD_table1_id = 8)
WHERE a.IUD_table2_id < 6 or a.IUD_table2_id > 15

It throws the following exception:

Exception in thread "main" org.apache.spark.sql.AnalysisException: == Parse1 ==

mismatched input '.' expecting (line 2, pos 1)

== SQL ==
select select
b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where 
b.IUD_table1_id = 8 from iud_table2 a
-^^^

== Parse2 ==
[1.1] failure: identifier matching regex (?i)ALTER expected

select select





[jira] [Updated] (CARBONDATA-3477) Throw out exception when use sql: 'update table select\n...'

2019-07-26 Thread Zhichao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang updated CARBONDATA-3477:
---
Description: 
When using the sql below to update a table:
{code:java}
UPDATE IUD_table2 a
 SET (a.IUD_table2_country, a.IUD_table2_salary) = (select
 b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where 
b.IUD_table1_id = 8)
 WHERE a.IUD_table2_id < 6 or a.IUD_table2_id > 15{code}
*It throws the following exception:*

 
{code:java}
Exception in thread "main" org.apache.spark.sql.AnalysisException: == Parse1 ==
mismatched input '.' expecting (line 2, pos 1)
== SQL ==
 select select
 b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where 
b.IUD_table1_id = 8 from iud_table2 a
 -^^^
== Parse2 ==
 [1.1] failure: identifier matching regex (?i)ALTER expected
select select
{code}
 

  was:
When use below sql to update table:

UPDATE IUD_table2 a
 SET (a.IUD_table2_country, a.IUD_table2_salary) = (select
b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where 
b.IUD_table1_id = 8)
WHERE a.IUD_table2_id < 6 or a.IUD_table2_id > 15

It will throw out exception:

Exception in thread "main" org.apache.spark.sql.AnalysisException: == Parse1 ==

mismatched input '.' expecting (line 2, pos 1)

== SQL ==
select select
b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where 
b.IUD_table1_id = 8 from iud_table2 a
-^^^

== Parse2 ==
[1.1] failure: identifier matching regex (?i)ALTER expected

select select


> Throw out exception when use sql: 'update table select\n...'
> 
>
> Key: CARBONDATA-3477
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3477
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Zhichao  Zhang
>Assignee: Zhichao  Zhang
>Priority: Minor
>
> When using the sql below to update a table:
> {code:java}
> UPDATE IUD_table2 a
>  SET (a.IUD_table2_country, a.IUD_table2_salary) = (select
>  b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where 
> b.IUD_table1_id = 8)
>  WHERE a.IUD_table2_id < 6 or a.IUD_table2_id > 15{code}
> *It throws the following exception:*
>  
> {code:java}
> Exception in thread "main" org.apache.spark.sql.AnalysisException: == Parse1 
> ==
> mismatched input '.' expecting (line 2, pos 1)
> == SQL ==
>  select select
>  b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where 
> b.IUD_table1_id = 8 from iud_table2 a
>  -^^^
> == Parse2 ==
>  [1.1] failure: identifier matching regex (?i)ALTER expected
> select select
> {code}
>  





[jira] [Commented] (CARBONDATA-3469) CarbonData with 2.3.2 can not run on CDH spark 2.4

2019-07-26 Thread Zhichao Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-3469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893380#comment-16893380
 ] 

Zhichao  Zhang commented on CARBONDATA-3469:


[~imperio] I think it can't work with Spark 2.4; some Spark interfaces may have changed. The community will integrate with Spark 2.4 later.

> CarbonData with 2.3.2 can not run on CDH spark 2.4
> --
>
> Key: CARBONDATA-3469
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3469
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 1.5.3
>Reporter: wxmimperio
>Priority: Major
>
> spark2-shell --jars apache-carbondata-1.5.3-bin-spark2.3.2-hadoop2.7.2.jar
> (jar from https://dist.apache.org/repos/dist/release/carbondata/1.5.3/apache-carbondata-1.5.3-bin-spark2.3.2-hadoop2.7.2.jar)
>  
> {code:java}
> java.lang.NoSuchMethodError: 
> org.apache.spark.sql.internal.SharedState.externalCatalog()Lorg/apache/spark/sql/catalyst/catalog/ExternalCatalog;{code}
> {code:java}
> scala> carbon.sql(
> | s"""
> | | CREATE TABLE IF NOT EXISTS test_table(
> | | id string,
> | | name string,
> | | city string,
> | | age Int)
> | | STORED AS carbondata
> | """.stripMargin)
> java.lang.NoSuchMethodError: 
> org.apache.spark.sql.internal.SharedState.externalCatalog()Lorg/apache/spark/sql/catalyst/catalog/ExternalCatalog;
> at 
> org.apache.spark.sql.hive.CarbonSessionStateBuilder.externalCatalog(CarbonSessionState.scala:227)
> at 
> org.apache.spark.sql.hive.CarbonSessionStateBuilder.catalog$lzycompute(CarbonSessionState.scala:214)
> at 
> org.apache.spark.sql.hive.CarbonSessionStateBuilder.catalog(CarbonSessionState.scala:212)
> at 
> org.apache.spark.sql.hive.CarbonSessionStateBuilder.catalog(CarbonSessionState.scala:191)
> at 
> org.apache.spark.sql.internal.BaseSessionStateBuilder$$anonfun$build$1.apply(BaseSessionStateBuilder.scala:291)
> at 
> org.apache.spark.sql.internal.BaseSessionStateBuilder$$anonfun$build$1.apply(BaseSessionStateBuilder.scala:291)
> at 
> org.apache.spark.sql.internal.SessionState.catalog$lzycompute(SessionState.scala:77)
> at org.apache.spark.sql.internal.SessionState.catalog(SessionState.scala:77)
> at org.apache.spark.sql.CarbonEnv$.getInstance(CarbonEnv.scala:135)
> at 
> org.apache.spark.sql.CarbonSession$.updateSessionInfoToCurrentThread(CarbonSession.scala:326)
> at 
> org.apache.spark.sql.parser.CarbonSparkSqlParser.parsePlan(CarbonSparkSqlParser.scala:47)
> at org.apache.spark.sql.CarbonSession.withProfiler(CarbonSession.scala:125)
> at org.apache.spark.sql.CarbonSession.sql(CarbonSession.scala:88)
> ... 59 elided
> {code}
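
The NoSuchMethodError above is the classic symptom of binary incompatibility: the jar was compiled against Spark 2.3, where SharedState.externalCatalog() had the descriptor shown in the stack trace, and Spark 2.4 changed that signature, so the JVM cannot link the call even though a same-named method still exists. A hedged sketch of checking such a descriptor up front via reflection (the helper is hypothetical, and since Spark is not on this classpath the demo runs against a JDK class instead):

```java
import java.lang.reflect.Method;

public class BinaryCompatCheck {
    // Returns true if `className` declares a no-arg method `methodName`
    // whose return type name equals `expectedReturn`. Compiled call
    // sites bind to the exact descriptor, so a changed return type
    // fails at link time with NoSuchMethodError.
    static boolean hasCompatibleMethod(String className, String methodName,
                                       String expectedReturn) {
        try {
            Method m = Class.forName(className).getMethod(methodName);
            return m.getReturnType().getName().equals(expectedReturn);
        } catch (ReflectiveOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasCompatibleMethod(
            "java.lang.String", "length", "int"));   // descriptor matches
        System.out.println(hasCompatibleMethod(
            "java.lang.String", "length", "long"));  // descriptor mismatch
    }
}
```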





[jira] [Commented] (CARBONDATA-3471) Spark query carbondata error reporting

2019-07-17 Thread Zhichao Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-3471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886819#comment-16886819
 ] 

Zhichao  Zhang commented on CARBONDATA-3471:


[~tianyouyangying], you found that the content of the file 'Metadata/segments/1471_time.segment' is as below:

{"locationMap":{"/Fact/Part0/Segment_1471":{"files":[],"partitions":[],"status":"Success","mergeFileName":"1471_1562963281071.carbonindexmerge","isRelative":true}}}

 

but in the directory Fact/Part0/Segment_1471 there is only the file '1471_1562963281071.carbonindexmerge' and no carbondata file, right?

 

> Spark query carbondata error reporting
> --
>
> Key: CARBONDATA-3471
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3471
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.5.3
> Environment: cdh5.14.x spark2.3.2 hadoop2.6
>Reporter: tianyou
>Priority: Major
>
> Data tables are stored every hour ,delete segment clean file for this table 
> every night.
> It has been running steadily for more than a month.
> But:Now query for error reporting.
> error:
>      caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
>  at java.util.ArrayList.rangeCheck(ArrayList.java:657)
>  at java.util.ArrayList.get(ArrayList.java:433)
>  at 
> org.apache.carbondata.core.indexstore.blockletindex.BlockletDataMapFactory.getSegmentProperties(BlockletDataMapFactory.java:376)
>  at 
> org.apache.carbondata.core.datamap.TableDataMap.pruneWithFilter(TableDataMap.java:195)
>  at 
> org.apache.carbondata.core.datamap.TableDataMap.prune(TableDataMap.java:171)
>  at 
> org.apache.carbondata.hadoop.api.CarbonInputFormat.getPrunedBlocklets(CarbonInputFormat.java:491)
>  at 
> org.apache.carbondata.hadoop.api.CarbonInputFormat.getDataBlocksOfSegment(CarbonInputFormat.java:414)
>  at 
> org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getSplits(CarbonTableInputFormat.java:494)
>  at 
> org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getSplits(CarbonTableInputFormat.java:218)
>  at 
> org.apache.carbondata.spark.rdd.CarbonScanRDD.internalGetPartitions(CarbonScanRDD.scala:129)
>  at 
> org.apache.carbondata.spark.rdd.CarbonRDD.getPartitions(CarbonRDD.scala:66)
>  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
>  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
>  at scala.Option.getOrElse(Option.scala:121)
>  at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
>  at 
> org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:46)
>  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
>  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
>  at scala.Option.getOrElse(Option.scala:121)
>  at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
>  at 
> org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:46)
>  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
>  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
>  at scala.Option.g



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Closed] (CARBONDATA-3324) The parameter CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT and CarbonCommonConstants.CARBON_DATE_FORMAT don't have effect in streaming table writing

2019-03-25 Thread Zhichao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang closed CARBONDATA-3324.
--
Resolution: Duplicate

duplicate with CARBONDATA-3325

> The parameter CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT and 
> CarbonCommonConstants.CARBON_DATE_FORMAT don't have effect in streaming table 
> writing
> -
>
> Key: CARBONDATA-3324
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3324
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 1.5.2
>Reporter: jocean.shi
>Priority: Major
> Fix For: 1.5.2
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> When I write data into streaming table. 
> I use such code:
> CarbonProperties.getInstance()
>  .addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "yyyy/MM/dd 
> HH:mm:ss")
>  .addProperty(CarbonCommonConstants.CARBON_DATE_FORMAT, "yyyy/MM/dd")
>  
> but don't have effect



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CARBONDATA-3325) The parameter CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT and CarbonCommonConstants.CARBON_DATE_FORMAT don't have effect in streaming table writing

2019-03-25 Thread Zhichao Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-3325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800734#comment-16800734
 ] 

Zhichao  Zhang commented on CARBONDATA-3325:


We still cannot add the permission for you :), you can raise a PR to fix this 
issue first.

> The parameter CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT and 
> CarbonCommonConstants.CARBON_DATE_FORMAT don't have effect in streaming table 
> writing
> -
>
> Key: CARBONDATA-3325
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3325
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 1.5.2
>Reporter: jocean.shi
>Priority: Major
> Fix For: 1.5.2
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> When I write data into streaming table. 
> I use such code:
> CarbonProperties.getInstance()
>  .addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "yyyy/MM/dd 
> HH:mm:ss")
>  .addProperty(CarbonCommonConstants.CARBON_DATE_FORMAT, "yyyy/MM/dd")
>  
> but don't have effect



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CARBONDATA-3325) The parameter CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT and CarbonCommonConstants.CARBON_DATE_FORMAT don't have effect in streaming table writing

2019-03-25 Thread Zhichao Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-3325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800457#comment-16800457
 ] 

Zhichao  Zhang commented on CARBONDATA-3325:


[~Jocean], please confirm that your account and email are correct; we can't 
assign the issue to your account.

> The parameter CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT and 
> CarbonCommonConstants.CARBON_DATE_FORMAT don't have effect in streaming table 
> writing
> -
>
> Key: CARBONDATA-3325
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3325
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 1.5.2
>Reporter: jocean@gmail.com
>Priority: Major
> Fix For: 1.5.2
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> When I write data into streaming table. 
> I use such code:
> CarbonProperties.getInstance()
>  .addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "yyyy/MM/dd 
> HH:mm:ss")
>  .addProperty(CarbonCommonConstants.CARBON_DATE_FORMAT, "yyyy/MM/dd")
>  
> but don't have effect



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (CARBONDATA-3325) The parameter CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT and CarbonCommonConstants.CARBON_DATE_FORMAT don't have effect in streaming table writing

2019-03-24 Thread Zhichao Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-3325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800339#comment-16800339
 ] 

Zhichao  Zhang edited comment on CARBONDATA-3325 at 3/25/19 1:36 AM:
-

[~Jocean] are you sure your email address is correct? gmail or gamil?  I can't 
assign it to you either.


was (Author: zzcclp):
[~Jocean] are you sure your email address is correct? gmail or gamil?

> The parameter CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT and 
> CarbonCommonConstants.CARBON_DATE_FORMAT don't have effect in streaming table 
> writing
> -
>
> Key: CARBONDATA-3325
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3325
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 1.5.2
>Reporter: jocean@gamil.com
>Priority: Major
> Fix For: 1.5.2
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> When I write data into streaming table. 
> I use such code:
> CarbonProperties.getInstance()
>  .addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "yyyy/MM/dd 
> HH:mm:ss")
>  .addProperty(CarbonCommonConstants.CARBON_DATE_FORMAT, "yyyy/MM/dd")
>  
> but don't have effect



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CARBONDATA-3325) The parameter CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT and CarbonCommonConstants.CARBON_DATE_FORMAT don't have effect in streaming table writing

2019-03-24 Thread Zhichao Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-3325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800339#comment-16800339
 ] 

Zhichao  Zhang commented on CARBONDATA-3325:


[~Jocean] are you sure your email address is correct? gmail or gamil?

> The parameter CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT and 
> CarbonCommonConstants.CARBON_DATE_FORMAT don't have effect in streaming table 
> writing
> -
>
> Key: CARBONDATA-3325
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3325
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 1.5.2
>Reporter: jocean@gamil.com
>Priority: Major
> Fix For: 1.5.2
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> When I write data into streaming table. 
> I use such code:
> CarbonProperties.getInstance()
>  .addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "yyyy/MM/dd 
> HH:mm:ss")
>  .addProperty(CarbonCommonConstants.CARBON_DATE_FORMAT, "yyyy/MM/dd")
>  
> but don't have effect



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CARBONDATA-3325) The parameter CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT and CarbonCommonConstants.CARBON_DATE_FORMAT don't have effect in streaming table writing

2019-03-23 Thread Zhichao Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-3325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16799868#comment-16799868
 ] 

Zhichao  Zhang commented on CARBONDATA-3325:


Yeah, you can submit a PR to fix this first, and please give me the email which 
is registered for your Jira account.

> The parameter CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT and 
> CarbonCommonConstants.CARBON_DATE_FORMAT don't have effect in streaming table 
> writing
> -
>
> Key: CARBONDATA-3325
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3325
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 1.5.2
>Reporter: jocean@gamil.com
>Priority: Major
> Fix For: 1.5.2
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> When I write data into streaming table. 
> I use such code:
> CarbonProperties.getInstance()
>  .addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "yyyy/MM/dd 
> HH:mm:ss")
>  .addProperty(CarbonCommonConstants.CARBON_DATE_FORMAT, "yyyy/MM/dd")
>  
> but don't have effect



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CARBONDATA-3325) The parameter CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT and CarbonCommonConstants.CARBON_DATE_FORMAT don't have effect in streaming table writing

2019-03-22 Thread Zhichao Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-3325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16799548#comment-16799548
 ] 

Zhichao  Zhang commented on CARBONDATA-3325:


You can try this way: 

spark.readStream 
   .format("socket") 
   .option("host", "localhost") 
   .option("port", 9099) 
   .option("timestampformat", "yyyy-MM-dd HH:mm:ss") 
   .option("dateformat", "yyyy-MM-dd HH:mm:ss") 
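
As a quick sanity check on the Java date-pattern strings these options take, here is a minimal, self-contained sketch using plain java.text.SimpleDateFormat; the pattern values and sample dates are illustrative, not taken from Carbon's defaults (Spark/Carbon use the same Java pattern letters for these options):

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;

public class PatternCheck {
    public static void main(String[] args) throws ParseException {
        // Timestamp and date patterns in standard Java pattern syntax.
        SimpleDateFormat ts = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        SimpleDateFormat dt = new SimpleDateFormat("yyyy/MM/dd");

        // Round-trip a sample value through parse/format to confirm the
        // pattern matches the literal layout of the input string.
        System.out.println(ts.format(ts.parse("2019-03-22 10:30:00")));
        System.out.println(dt.format(dt.parse("2019/03/22")));
    }
}
```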

> The parameter CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT and 
> CarbonCommonConstants.CARBON_DATE_FORMAT don't have effect in streaming table 
> writing
> -
>
> Key: CARBONDATA-3325
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3325
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 1.5.2
>Reporter: jocean@gamil.com
>Priority: Major
> Fix For: 1.5.2
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> When I write data into streaming table. 
> I use such code:
> CarbonProperties.getInstance()
>  .addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "yyyy/MM/dd 
> HH:mm:ss")
>  .addProperty(CarbonCommonConstants.CARBON_DATE_FORMAT, "yyyy/MM/dd")
>  
> but don't have effect



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-3317) Executing 'show segments' command throws NPE when spark streaming app write data to new stream segment.

2019-03-16 Thread Zhichao Zhang (JIRA)
Zhichao  Zhang created CARBONDATA-3317:
--

 Summary: Executing 'show segments' command throws NPE when spark 
streaming app write data to new stream segment.
 Key: CARBONDATA-3317
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3317
 Project: CarbonData
  Issue Type: Bug
  Components: spark-integration
Affects Versions: 1.6.0
Reporter: Zhichao  Zhang
Assignee: Zhichao  Zhang
 Fix For: 1.6.0


When a Spark streaming app starts to create a new stream segment, it does not 
create the carbonindex file until data has been written successfully; if the 
'show segments' command is executed at that point, it throws an NPE.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (CARBONDATA-2595) Reformat the output of command 'desc formatted table_name'

2019-03-16 Thread Zhichao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang closed CARBONDATA-2595.
--
Resolution: Fixed

> Reformat the output of command 'desc formatted table_name'
> --
>
> Key: CARBONDATA-2595
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2595
> Project: CarbonData
>  Issue Type: Improvement
>  Components: sql
>Reporter: Zhichao  Zhang
>Assignee: Zhichao  Zhang
>Priority: Minor
> Attachments: desc_formatted.txt, desc_formatted_external.txt
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> According to the discussion in 
> [topic|http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Change-the-comment-content-for-column-when-execute-command-desc-formatted-table-name-td46848.html],
>  reformat the output of command 'desc formatted table_name'. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3120) apache-carbondata-1.5.1-rc1.tar.gz Datamap's core and plan project, pom.xml, is version 1.5.0, which results in an inability to compile properly

2018-11-26 Thread Zhichao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang resolved CARBONDATA-3120.

Resolution: Fixed

> apache-carbondata-1.5.1-rc1.tar.gz Datamap's core and plan project, pom.xml, 
> is version 1.5.0, which results in an inability to compile properly
> 
>
> Key: CARBONDATA-3120
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3120
> Project: CarbonData
>  Issue Type: Bug
>  Components: build
>Affects Versions: 1.5.1
> Environment: MacOS
> apache-carbondata-1.5.1-rc1
>Reporter: Jonathan.Wei
>Priority: Major
> Fix For: 1.5.1
>
>   Original Estimate: 1h
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> Hi, guys!
>        I downloaded apache-carbondata-1.5.1-rc1.tar.gz.
>        After decompression, the datamap mv/core and mv/plan projects were added to 
> the main pom for compilation.
>        But the compilation failed.
>  
> LOG:
> {code:java}
> [ERROR] [ERROR] Some problems were encountered while processing the POMs:
> [FATAL] Non-resolvable parent POM for 
> org.apache.carbondata:carbondata-mv-core:[unknown-version]: Could not find 
> artifact org.apache.carbondata:carbondata-parent:pom:1.5.0-SNAPSHOT and 
> 'parent.relativePath' points at wrong local POM @ line 22, column 11
> [FATAL] Non-resolvable parent POM for 
> org.apache.carbondata:carbondata-mv-plan:[unknown-version]: Could not find 
> artifact org.apache.carbondata:carbondata-parent:pom:1.5.0-SNAPSHOT and 
> 'parent.relativePath' points at wrong local POM @ line 22, column 11
> [WARNING] 'build.plugins.plugin.version' for 
> com.ning.maven.plugins:maven-duplicate-finder-plugin is missing. @ 
> org.apache.carbondata:carbondata-presto:[unknown-version], 
> /Users/jonathanwei/summary/carbondata/carbondata-apache-carbondata-1.5.1-rc1/integration/presto/pom.xml,
>  line 620, column 15
> [WARNING] 'build.plugins.plugin.version' for 
> pl.project13.maven:git-commit-id-plugin is missing. @ 
> org.apache.carbondata:carbondata-presto:[unknown-version], 
> /Users/jonathanwei/summary/carbondata/carbondata-apache-carbondata-1.5.1-rc1/integration/presto/pom.xml,
>  line 633, column 15
> [WARNING] 'build.plugins.plugin.version' for 
> com.ning.maven.plugins:maven-duplicate-finder-plugin is missing. @ 
> org.apache.carbondata:carbondata-examples-spark2:[unknown-version], 
> /Users/jonathanwei/summary/carbondata/carbondata-apache-carbondata-1.5.1-rc1/examples/spark2/pom.xml,
>  line 184, column 15
>  @
> [ERROR] The build could not read 2 projects -> [Help 1]
> [ERROR]
> [ERROR]   The project 
> org.apache.carbondata:carbondata-mv-core:[unknown-version] 
> (/Users/jonathanwei/summary/carbondata/carbondata-apache-carbondata-1.5.1-rc1/datamap/mv/core/pom.xml)
>  has 1 error
> [ERROR]     Non-resolvable parent POM for 
> org.apache.carbondata:carbondata-mv-core:[unknown-version]: Could not find 
> artifact org.apache.carbondata:carbondata-parent:pom:1.5.0-SNAPSHOT and 
> 'parent.relativePath' points at wrong local POM @ line 22, column 11 -> [Help 
> 2]
> [ERROR]
> [ERROR]   The project 
> org.apache.carbondata:carbondata-mv-plan:[unknown-version] 
> (/Users/jonathanwei/summary/carbondata/carbondata-apache-carbondata-1.5.1-rc1/datamap/mv/plan/pom.xml)
>  has 1 error
> [ERROR]     Non-resolvable parent POM for 
> org.apache.carbondata:carbondata-mv-plan:[unknown-version]: Could not find 
> artifact org.apache.carbondata:carbondata-parent:pom:1.5.0-SNAPSHOT and 
> 'parent.relativePath' points at wrong local POM @ line 22, column 11 -> [Help 
> 2]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/ProjectBuildingException
> [ERROR] [Help 2] 
> http://cwiki.apache.org/confluence/display/MAVEN/UnresolvableModelException
> {code}
> I checked the pom file: parent.version is 1.5.0-SNAPSHOT, but 
> apache-carbondata-1.5.1-rc1.tar.gz is 1.5.1.
> mv/core pom.xml
> {code:java}
> <parent>
>   <groupId>org.apache.carbondata</groupId>
>   <artifactId>carbondata-parent</artifactId>
>   <version>1.5.0-SNAPSHOT</version>
>   <relativePath>../../../pom.xml</relativePath>
> </parent>
> <artifactId>carbondata-mv-core</artifactId>
> <name>Apache CarbonData :: Materialized View Core</name>
> {code}
> mv/plan pom.xml
> {code:java}
> <parent>
>   <groupId>org.apache.carbondata</groupId>
>   <artifactId>carbondata-parent</artifactId>
>   <version>1.5.0-SNAPSHOT</version>
>   <relativePath>../../../pom.xml</relativePath>
> </parent>
> <artifactId>carbondata-mv-plan</artifactId>
> <name>Apache CarbonData :: Materialized View Plan</name>
> {code}
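
The fix implied by the report is simply aligning the parent version with the release tag; a hedged sketch of the corrected parent section for datamap/mv/core/pom.xml (mv/plan needs the same change):

```xml
<!-- Sketch of the corrected parent section; 1.5.1 matches the release
     tarball version reported above. -->
<parent>
  <groupId>org.apache.carbondata</groupId>
  <artifactId>carbondata-parent</artifactId>
  <version>1.5.1</version>
  <relativePath>../../../pom.xml</relativePath>
</parent>
```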



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CARBONDATA-3087) Prettify DESC FORMATTED output

2018-11-08 Thread Zhichao Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-3087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16679858#comment-16679858
 ] 

Zhichao  Zhang commented on CARBONDATA-3087:


Hi Jacky:

  I have some questions, shown below:

  1. How should external info be shown if it's an external table?

  2. It needs to show the partition info if the table has partition columns;

  3. missing property: DICTIONARY_INCLUDE;

  4. shouldn't it show the database and table name?

  5. NO_INVERTED_INDEX will be removed, right? 

  6. shouldn't it show the default value for each property? And, as we discussed, 
we need to save the default value to the schema info even if the user doesn't set 
the value; is this done?

> Prettify DESC FORMATTED output
> --
>
> Key: CARBONDATA-3087
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3087
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Jacky Li
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Change output of DESC FORMATTED to:
> {noformat}
> ++-+---+
> |col_name|data_type   
>  |comment|
> ++-+---+
> |shortfield  |smallint
>  |null   |
> |intfield|int 
>  |null   |
> |bigintfield |bigint  
>  |null   |
> |doublefield |double  
>  |null   |
> |stringfield |string  
>  |null   |
> |timestampfield  |timestamp   
>  |null   |
> |decimalfield|decimal(18,2)   
>  |null   |
> |datefield   |date
>  |null   |
> |charfield   |string  
>  |null   |
> |floatfield  |double  
>  |null   |
> ||
>  |   |
> |## Table Basic Information  |
>  |   |
> |Comment |
>  |   |
> |Path
> |/Users/jacky/code/carbondata/examples/spark2/target/store/default/carbonsession_table|
>|
> |Table Block Size|1024 MB 
>  |   |
> |Table Blocklet Size |64 MB   
>  |   |
> |Streaming   |false   
>  |   |
> |Flat Folder |false   
>  |   |
> |Bad Record Path |
>  |   |
> |Min Input Per Node  |0.0B
>  |   |
> ||
>  |   |
> |## Index Information|
>  |   |
> |Sort Scope  |LOCAL_SORT  
>  |   |
> |Sort Columns|stringfield,timestampfield,datefield,charfield  
>  |   |
> |Index Cache Level   |BLOCK   
>  |   |
> |Cached Index Columns|All columns 
>  |   |
> || 

[jira] [Created] (CARBONDATA-3019) Add error log in catch block to avoid to abort the exception which is thrown from catch block when there is an exception thrown in finally block

2018-10-17 Thread Zhichao Zhang (JIRA)
Zhichao  Zhang created CARBONDATA-3019:
--

 Summary: Add error log in catch block to avoid to abort the 
exception which is thrown from catch block when there is an exception thrown in 
finally block
 Key: CARBONDATA-3019
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3019
 Project: CarbonData
  Issue Type: Improvement
Reporter: Zhichao  Zhang
Assignee: Zhichao  Zhang


# Add an error log in the catch block so that the exception thrown from the 
catch block is not lost when another exception is thrown in the finally block.
 # Enhance log output.
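
The failure mode motivating this change can be shown with a minimal, self-contained Java sketch (names are illustrative, not Carbon code): when the finally block throws, its exception replaces the one in flight from the catch block, so without a log line in the catch block the original failure leaves no trace.

```java
public class FinallyMasking {
    static void work() {
        try {
            throw new IllegalStateException("original failure");
        } catch (IllegalStateException e) {
            // Logging here preserves a record of the original failure even if
            // the finally block below throws and replaces it.
            System.err.println("ERROR: " + e.getMessage());
            throw e;
        } finally {
            // This exception propagates instead of the one re-thrown above.
            throw new RuntimeException("failure during cleanup");
        }
    }

    public static void main(String[] args) {
        try {
            work();
        } catch (RuntimeException e) {
            // Only the finally-block exception reaches the caller.
            System.out.println(e.getMessage());  // prints "failure during cleanup"
        }
    }
}
```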



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2595) Reformat the output of command 'desc formatted table_name'

2018-09-30 Thread Zhichao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang updated CARBONDATA-2595:
---
Fix Version/s: (was: 1.5.0)

> Reformat the output of command 'desc formatted table_name'
> --
>
> Key: CARBONDATA-2595
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2595
> Project: CarbonData
>  Issue Type: Improvement
>  Components: sql
>Reporter: Zhichao  Zhang
>Assignee: Zhichao  Zhang
>Priority: Minor
> Attachments: desc_formatted.txt, desc_formatted_external.txt
>
>
> According to the discussion in 
> [topic|http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Change-the-comment-content-for-column-when-execute-command-desc-formatted-table-name-td46848.html],
>  reformat the output of command 'desc formatted table_name'. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2597) Support deleting historical data from non-stream segments

2018-09-30 Thread Zhichao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang updated CARBONDATA-2597:
---
Fix Version/s: (was: NONE)

> Support deleting historical data from non-stream segments
> -
>
> Key: CARBONDATA-2597
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2597
> Project: CarbonData
>  Issue Type: Sub-task
>  Components: spark-integration
>Reporter: Zhichao  Zhang
>Assignee: Zhichao  Zhang
>Priority: Minor
>
> Delete historical data from non-stream segments, do not support deleting from 
> stream segments.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2597) Support deleting historical data from non-stream segments

2018-09-30 Thread Zhichao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang updated CARBONDATA-2597:
---
Fix Version/s: (was: 1.5.0)
   NONE

> Support deleting historical data from non-stream segments
> -
>
> Key: CARBONDATA-2597
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2597
> Project: CarbonData
>  Issue Type: Sub-task
>  Components: spark-integration
>Reporter: Zhichao  Zhang
>Assignee: Zhichao  Zhang
>Priority: Minor
>
> Delete historical data from non-stream segments, do not support deleting from 
> stream segments.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (CARBONDATA-2989) Upgrade spark integration version to 2.3.2

2018-09-28 Thread Zhichao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang reassigned CARBONDATA-2989:
--

Assignee: Zhichao  Zhang

> Upgrade spark integration version to 2.3.2
> --
>
> Key: CARBONDATA-2989
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2989
> Project: CarbonData
>  Issue Type: Improvement
>  Components: spark-integration
>Reporter: Zhichao  Zhang
>Assignee: Zhichao  Zhang
>Priority: Minor
> Fix For: 1.5.0
>
>
> h1. Upgrade spark integration version to 2.3.2



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2989) Upgrade spark integration version to 2.3.2

2018-09-28 Thread Zhichao Zhang (JIRA)
Zhichao  Zhang created CARBONDATA-2989:
--

 Summary: Upgrade spark integration version to 2.3.2
 Key: CARBONDATA-2989
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2989
 Project: CarbonData
  Issue Type: Improvement
  Components: spark-integration
Reporter: Zhichao  Zhang
 Fix For: 1.5.0


h1. Upgrade spark integration version to 2.3.2



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (CARBONDATA-2594) Incorrect logic when set 'Encoding.INVERTED_INDEX' for each dimension column

2018-09-26 Thread Zhichao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang reassigned CARBONDATA-2594:
--

Assignee: Jacky Li  (was: Zhichao  Zhang)

> Incorrect logic when set 'Encoding.INVERTED_INDEX' for each dimension column
> 
>
> Key: CARBONDATA-2594
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2594
> Project: CarbonData
>  Issue Type: Improvement
>  Components: spark-integration
>Reporter: Zhichao  Zhang
>Assignee: Jacky Li
>Priority: Minor
> Fix For: 1.5.0
>
>
> All of non-sort dimension columns are set as 'Encoding.INVERTED_INDEX' 
> column, this is wrong, only the columns defined in 'SORT_COLUMN' and not in 
> 'NO_INVERTED_INDEX' need to be set as  'Encoding.INVERTED_INDEX' column.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (CARBONDATA-2600) Add a command to show detailed index information for a segment

2018-09-25 Thread Zhichao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang closed CARBONDATA-2600.
--
   Resolution: Duplicate
Fix Version/s: (was: 1.5.0)

please see CARBONDATA-2916.

> Add a command to show detailed index information for a segment
> --
>
> Key: CARBONDATA-2600
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2600
> Project: CarbonData
>  Issue Type: New Feature
>  Components: spark-integration
>Reporter: Zhichao  Zhang
>Assignee: Zhichao  Zhang
>Priority: Minor
>
> Add a command to show detailed index information for a segment, for example:
> {code:java}
> show index for table table_name where segment_id = 0;{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2859) add sdv test case for bloomfilter datamap

2018-09-08 Thread Zhichao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang updated CARBONDATA-2859:
---
Priority: Minor  (was: Major)

> add sdv test case for bloomfilter datamap
> -
>
> Key: CARBONDATA-2859
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2859
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: xuchuanyin
>Assignee: xuchuanyin
>Priority: Minor
> Fix For: 1.5.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> add sdv test case for bloomfilter datamap



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-2859) add sdv test case for bloomfilter datamap

2018-09-08 Thread Zhichao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang resolved CARBONDATA-2859.

   Resolution: Fixed
Fix Version/s: 1.5.0

> add sdv test case for bloomfilter datamap
> -
>
> Key: CARBONDATA-2859
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2859
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: xuchuanyin
>Assignee: xuchuanyin
>Priority: Minor
> Fix For: 1.5.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> add sdv test case for bloomfilter datamap



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CARBONDATA-2595) Reformat the output of command 'desc formatted table_name'

2018-08-21 Thread Zhichao Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587328#comment-16587328
 ] 

Zhichao  Zhang commented on CARBONDATA-2595:


[~sraghunandan], it's the same when the table is an external table; otherwise there 
is no 'Location Path' property.

> Reformat the output of command 'desc formatted table_name'
> --
>
> Key: CARBONDATA-2595
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2595
> Project: CarbonData
>  Issue Type: Improvement
>  Components: sql
>Reporter: Zhichao  Zhang
>Assignee: Zhichao  Zhang
>Priority: Minor
> Fix For: 1.5.0
>
> Attachments: desc_formatted.txt, desc_formatted_external.txt
>
>
> According to the discussion in 
> [topic|http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Change-the-comment-content-for-column-when-execute-command-desc-formatted-table-name-td46848.html],
>  reformat the output of command 'desc formatted table_name'. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CARBONDATA-2595) Reformat the output of command 'desc formatted table_name'

2018-08-20 Thread Zhichao Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585858#comment-16585858
 ] 

Zhichao  Zhang commented on CARBONDATA-2595:


  Now I am working on this; the new format is shown in the attachment, please 
give me some feedback. 
  There is one question: if the user uses CTAS to create a table, do we need to 
show the 'select sql' in the result of 'desc formatted table'? If yes, how 
to get the 'select sql'? For now I can only get a non-formatted sql from 
'CarbonSparkSqlParser.scala' (Jacky mentioned), for example: 

create table sql:
{code:java}
CREATE TABLE IF NOT EXISTS test_table 
STORED BY 'carbondata' 
TBLPROPERTIES( 
'streaming'='false', 'sort_columns'='id,city', 'dictionary_include'='name') 
AS SELECT * from source_test ;{code}

The non-formatted sql I get is: 
{code:java}
SELECT*fromsource_test{code}

any suggestion for this?

> Reformat the output of command 'desc formatted table_name'
> --
>
> Key: CARBONDATA-2595
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2595
> Project: CarbonData
>  Issue Type: Improvement
>  Components: sql
>Reporter: Zhichao  Zhang
>Assignee: Zhichao  Zhang
>Priority: Minor
> Fix For: 1.5.0
>
> Attachments: desc_formatted.txt, desc_formatted_external.txt
>
>
> According to the discussion in 
> [topic|http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Change-the-comment-content-for-column-when-execute-command-desc-formatted-table-name-td46848.html],
>  reformat the output of command 'desc formatted table_name'. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (CARBONDATA-2595) Reformat the output of command 'desc formatted table_name'

2018-08-20 Thread Zhichao Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585858#comment-16585858
 ] 

Zhichao  Zhang edited comment on CARBONDATA-2595 at 8/20/18 12:27 PM:
--

  Now I am working on this; the new format is shown in the attachment, please 
give me some feedback. 
  There is one question: if the user uses CTAS to create a table, do we need to 
show the 'select sql' in the result of 'desc formatted table'? If yes, how 
to get the 'select sql'? For now I can only get a non-formatted sql from 
'CarbonSparkSqlParser.scala' (Jacky mentioned), for example: 

create table sql:
{code:java}
CREATE TABLE IF NOT EXISTS test_table 
STORED BY 'carbondata' 
TBLPROPERTIES( 
'streaming'='false', 'sort_columns'='id,city', 'dictionary_include'='name') 
AS SELECT * from source_test ;{code}

The non-formatted sql I get is: 
{code:java}
SELECT*fromsource_test{code}

any suggestion for this?


was (Author: zzcclp):
  Now I am working on this, the new format is shown in attachment, please 
give me some feedback. 
  There is one question: if user uses CTAS to create table, do we need to 
show the 'select sql' in the result of 'desc formatted table'? If yes, how 
to get 'select sql'? now I just can get a non-formatted sql from 
'CarbonSparkSqlParser.scala' (Jacky mentioned), for example: 

create table sql:
{code:java}
CREATE TABLE IF NOT EXISTS test_table 
STORED BY 'carbondata' 
TBLPROPERTIES( 
'streaming'='false', 'sort_columns'='id,city', 'dictionary_include'='name') 
AS SELECT * from source_test ;{code}


The non-formatted sql I get is : 

 
{code:java}
SELECT*fromsource_test{code}
 

 

 

any suggestion for this?

 
{code:java}
 {code}
 

> Reformat the output of command 'desc formatted table_name'
> --
>
> Key: CARBONDATA-2595
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2595
> Project: CarbonData
>  Issue Type: Improvement
>  Components: sql
>Reporter: Zhichao  Zhang
>Assignee: Zhichao  Zhang
>Priority: Minor
> Fix For: 1.5.0
>
> Attachments: desc_formatted.txt, desc_formatted_external.txt
>
>
> According to the discussion in 
> [topic|http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Change-the-comment-content-for-column-when-execute-command-desc-formatted-table-name-td46848.html],
>  reformat the output of command 'desc formatted table_name'. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2595) Reformat the output of command 'desc formatted table_name'

2018-08-20 Thread Zhichao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang updated CARBONDATA-2595:
---
Attachment: desc_formatted_external.txt

> Reformat the output of command 'desc formatted table_name'
> --
>
> Key: CARBONDATA-2595
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2595
> Project: CarbonData
>  Issue Type: Improvement
>  Components: sql
>Reporter: Zhichao  Zhang
>Assignee: Zhichao  Zhang
>Priority: Minor
> Fix For: 1.5.0
>
> Attachments: desc_formatted.txt, desc_formatted_external.txt
>
>
> According to the discussion in 
> [topic|http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Change-the-comment-content-for-column-when-execute-command-desc-formatted-table-name-td46848.html],
>  reformat the output of command 'desc formatted table_name'. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2595) Reformat the output of command 'desc formatted table_name'

2018-08-20 Thread Zhichao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang updated CARBONDATA-2595:
---
Attachment: desc_formatted.txt

> Reformat the output of command 'desc formatted table_name'
> --
>
> Key: CARBONDATA-2595
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2595
> Project: CarbonData
>  Issue Type: Improvement
>  Components: sql
>Reporter: Zhichao  Zhang
>Assignee: Zhichao  Zhang
>Priority: Minor
> Fix For: 1.5.0
>
> Attachments: desc_formatted.txt, desc_formatted_external.txt
>
>
> According to the discussion in 
> [topic|http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Change-the-comment-content-for-column-when-execute-command-desc-formatted-table-name-td46848.html],
>  reformat the output of command 'desc formatted table_name'. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2854) Release table status file lock before delete physical files when execute 'clean files' command

2018-08-13 Thread Zhichao Zhang (JIRA)
Zhichao  Zhang created CARBONDATA-2854:
--

 Summary: Release table status file lock before delete physical 
files when execute 'clean files' command
 Key: CARBONDATA-2854
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2854
 Project: CarbonData
  Issue Type: Bug
  Components: spark-integration
Affects Versions: 1.4.0, 1.5.0
Reporter: Zhichao  Zhang
Assignee: Zhichao  Zhang
 Fix For: 1.5.0


Release the table status file lock before deleting physical files when executing 
the 'clean files' command; otherwise the table status file stays locked while 
physical files are deleted, which may take a long time, and other operations 
will fail to acquire the table status file lock.
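The ordering described above can be sketched as a generic lock pattern (an illustrative Java sketch only; the real code uses CarbonData's table status lock, and all names below are hypothetical):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Sketch: hold the (hypothetical) table status lock only while reading and
// rewriting the status metadata, then release it BEFORE the potentially
// long-running physical file deletion, so concurrent operations are not
// blocked while files are being removed.
public class CleanFilesSketch {
    private static final ReentrantLock tableStatusLock = new ReentrantLock();
    static final List<String> events = new ArrayList<>();

    static void cleanFiles() {
        List<String> staleSegments;
        tableStatusLock.lock();
        try {
            // short critical section: decide what to delete, update metadata
            staleSegments = List.of("Segment_0", "Segment_1");
            events.add("status-updated");
        } finally {
            tableStatusLock.unlock();  // release before the slow I/O below
        }
        // long-running deletion runs without holding the lock
        for (String segment : staleSegments) {
            events.add("deleted:" + segment);
        }
    }

    public static void main(String[] args) {
        cleanFiles();
        System.out.println(events);
    }
}
```

The key point is only the try/finally ordering: metadata update under the lock, file deletion after the unlock.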



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2600) Add a command to show detailed index information for a segment

2018-06-10 Thread Zhichao Zhang (JIRA)
Zhichao  Zhang created CARBONDATA-2600:
--

 Summary: Add a command to show detailed index information for a 
segment
 Key: CARBONDATA-2600
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2600
 Project: CarbonData
  Issue Type: New Feature
  Components: spark-integration
Reporter: Zhichao  Zhang
Assignee: Zhichao  Zhang
 Fix For: 1.5.0


Add a command to show detailed index information for a segment, for example:
{code:java}
show index for table table_name where segment_id = 0;{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2599) Use RowStreamParserImp as default value of config 'carbon.stream.parser'

2018-06-10 Thread Zhichao Zhang (JIRA)
Zhichao  Zhang created CARBONDATA-2599:
--

 Summary: Use RowStreamParserImp as default value of config 
'carbon.stream.parser'
 Key: CARBONDATA-2599
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2599
 Project: CarbonData
  Issue Type: Improvement
  Components: spark-integration
Reporter: Zhichao  Zhang
Assignee: Zhichao  Zhang
 Fix For: 1.5.0


See the detailed info in 
[topic|http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Use-RowStreamParserImp-as-default-value-of-config-carbon-stream-parser-td51565.html]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2598) Support updating/deleting data from stream segments

2018-06-07 Thread Zhichao Zhang (JIRA)
Zhichao  Zhang created CARBONDATA-2598:
--

 Summary: Support updating/deleting data from stream segments
 Key: CARBONDATA-2598
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2598
 Project: CarbonData
  Issue Type: Sub-task
  Components: spark-integration
Reporter: Zhichao  Zhang






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2597) Support deleting historical data from non-stream segments

2018-06-07 Thread Zhichao Zhang (JIRA)
Zhichao  Zhang created CARBONDATA-2597:
--

 Summary: Support deleting historical data from non-stream segments
 Key: CARBONDATA-2597
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2597
 Project: CarbonData
  Issue Type: Sub-task
  Components: spark-integration
Reporter: Zhichao  Zhang
Assignee: Zhichao  Zhang
 Fix For: 1.5.0


Support deleting historical data from non-stream segments; deleting from stream 
segments is not supported.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2596) Support updating/deleting data on stream table

2018-06-07 Thread Zhichao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang updated CARBONDATA-2596:
---
Issue Type: Improvement  (was: Task)

> Support updating/deleting data on stream table
> --
>
> Key: CARBONDATA-2596
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2596
> Project: CarbonData
>  Issue Type: Improvement
>  Components: spark-integration
>Reporter: Zhichao  Zhang
>Priority: Minor
> Fix For: 1.5.0
>
>
> According to the discussion in 
> [topic|http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Support-updating-deleting-data-for-stream-table-td51060.html],
>  there are 2 steps to implement this feature.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2596) Support updating/deleting data on stream table

2018-06-07 Thread Zhichao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang updated CARBONDATA-2596:
---
Issue Type: New Feature  (was: Improvement)

> Support updating/deleting data on stream table
> --
>
> Key: CARBONDATA-2596
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2596
> Project: CarbonData
>  Issue Type: New Feature
>  Components: spark-integration
>Reporter: Zhichao  Zhang
>Priority: Minor
> Fix For: 1.5.0
>
>
> According to the discussion in 
> [topic|http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Support-updating-deleting-data-for-stream-table-td51060.html],
>  there are 2 steps to implement this feature.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2596) Support updating/deleting data on stream table

2018-06-07 Thread Zhichao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang updated CARBONDATA-2596:
---
Issue Type: Task  (was: Improvement)

> Support updating/deleting data on stream table
> --
>
> Key: CARBONDATA-2596
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2596
> Project: CarbonData
>  Issue Type: Task
>  Components: spark-integration
>Reporter: Zhichao  Zhang
>Priority: Minor
> Fix For: 1.5.0
>
>
> According to the discussion in 
> [topic|http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Support-updating-deleting-data-for-stream-table-td51060.html],
>  there are 2 steps to implement this feature.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2596) Support updating/deleting data on stream table

2018-06-07 Thread Zhichao Zhang (JIRA)
Zhichao  Zhang created CARBONDATA-2596:
--

 Summary: Support updating/deleting data on stream table
 Key: CARBONDATA-2596
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2596
 Project: CarbonData
  Issue Type: Improvement
  Components: spark-integration
Reporter: Zhichao  Zhang
 Fix For: 1.5.0


According to the discussion in 
[topic|http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Support-updating-deleting-data-for-stream-table-td51060.html],
 there are 2 steps to implement this feature.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2594) Incorrect logic when set 'Encoding.INVERTED_INDEX' for each dimension column

2018-06-07 Thread Zhichao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang updated CARBONDATA-2594:
---
Affects Version/s: (was: 1.5.0)

> Incorrect logic when set 'Encoding.INVERTED_INDEX' for each dimension column
> 
>
> Key: CARBONDATA-2594
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2594
> Project: CarbonData
>  Issue Type: Improvement
>  Components: spark-integration
>Reporter: Zhichao  Zhang
>Assignee: Zhichao  Zhang
>Priority: Minor
> Fix For: 1.5.0
>
>
> All non-sort dimension columns are set as 'Encoding.INVERTED_INDEX' columns; 
> this is wrong. Only the columns defined in 'SORT_COLUMN' and not in 
> 'NO_INVERTED_INDEX' need to be set as 'Encoding.INVERTED_INDEX' columns.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2593) Add an option 'carbon.insert.storage.level' to support configuring the storage level when insert into data with 'carbon.insert.persist.enable'='true'

2018-06-07 Thread Zhichao Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang updated CARBONDATA-2593:
---
Affects Version/s: (was: 1.5.0)

> Add an option 'carbon.insert.storage.level' to support configuring the 
> storage level when insert into data with 'carbon.insert.persist.enable'='true'
> -
>
> Key: CARBONDATA-2593
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2593
> Project: CarbonData
>  Issue Type: Improvement
>  Components: data-load, spark-integration
>Reporter: Zhichao  Zhang
>Assignee: Zhichao  Zhang
>Priority: Minor
> Fix For: 1.5.0
>
>
> When inserting data with 'carbon.insert.persist.enable'='true', the storage 
> level of the dataset is fixed at 'MEMORY_AND_DISK'; it should be possible to 
> configure the storage level to suit different environments.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2595) Reformat the output of command 'desc formatted table_name'

2018-06-07 Thread Zhichao Zhang (JIRA)
Zhichao  Zhang created CARBONDATA-2595:
--

 Summary: Reformat the output of command 'desc formatted table_name'
 Key: CARBONDATA-2595
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2595
 Project: CarbonData
  Issue Type: Improvement
  Components: sql
Reporter: Zhichao  Zhang
Assignee: Zhichao  Zhang
 Fix For: 1.5.0


According to the discussion in 
[topic|http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Change-the-comment-content-for-column-when-execute-command-desc-formatted-table-name-td46848.html],
 reformat the output of command 'desc formatted table_name'. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2594) Incorrect logic when set 'Encoding.INVERTED_INDEX' for each dimension column

2018-06-07 Thread Zhichao Zhang (JIRA)
Zhichao  Zhang created CARBONDATA-2594:
--

 Summary: Incorrect logic when set 'Encoding.INVERTED_INDEX' for 
each dimension column
 Key: CARBONDATA-2594
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2594
 Project: CarbonData
  Issue Type: Improvement
  Components: spark-integration
Affects Versions: 1.5.0
Reporter: Zhichao  Zhang
Assignee: Zhichao  Zhang
 Fix For: 1.5.0


All non-sort dimension columns are set as 'Encoding.INVERTED_INDEX' columns; 
this is wrong. Only the columns defined in 'SORT_COLUMN' and not in 
'NO_INVERTED_INDEX' need to be set as 'Encoding.INVERTED_INDEX' columns.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2593) Add an option 'carbon.insert.storage.level' to support configuring the storage level when insert into data with 'carbon.insert.persist.enable'='true'

2018-06-07 Thread Zhichao Zhang (JIRA)
Zhichao  Zhang created CARBONDATA-2593:
--

 Summary: Add an option 'carbon.insert.storage.level' to support 
configuring the storage level when insert into data with 
'carbon.insert.persist.enable'='true'
 Key: CARBONDATA-2593
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2593
 Project: CarbonData
  Issue Type: Improvement
  Components: data-load, spark-integration
Affects Versions: 1.5.0
Reporter: Zhichao  Zhang
Assignee: Zhichao  Zhang
 Fix For: 1.5.0


When inserting data with 'carbon.insert.persist.enable'='true', the storage 
level of the dataset is fixed at 'MEMORY_AND_DISK'; it should be possible to 
configure the storage level to suit different environments.
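The proposed option could work as sketched below (an illustrative, self-contained Java sketch; the option name 'carbon.insert.storage.level', the method name, and the fallback default are taken from this issue, while the level names mirror Spark's standard storage levels — in real code the resolved name would be passed to Spark's StorageLevel.fromString(...)):

```java
import java.util.Set;

// Sketch: resolve a user-configured storage level name, falling back to the
// current hard-coded default MEMORY_AND_DISK when the value is missing or
// not a recognized level name.
public class StorageLevelOption {
    private static final Set<String> VALID = Set.of(
        "NONE", "DISK_ONLY", "MEMORY_ONLY", "MEMORY_ONLY_SER",
        "MEMORY_AND_DISK", "MEMORY_AND_DISK_SER", "OFF_HEAP");
    static final String DEFAULT = "MEMORY_AND_DISK";

    static String resolveStorageLevel(String configured) {
        if (configured == null) {
            return DEFAULT;
        }
        String normalized = configured.trim().toUpperCase();
        return VALID.contains(normalized) ? normalized : DEFAULT;
    }

    public static void main(String[] args) {
        System.out.println(resolveStorageLevel("memory_only"));
        System.out.println(resolveStorageLevel("bogus"));
    }
}
```

An unrecognized or absent value silently falls back to the default, matching the current behavior, so adding the option is backward compatible.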



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CARBONDATA-2351) CarbonData Select null

2018-04-16 Thread Zhichao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-2351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16440263#comment-16440263
 ] 

Zhichao  Zhang commented on CARBONDATA-2351:


I ran your code at 23:00 yesterday, and the select sql works fine when I run it 
now. I can't reproduce your issue.

> CarbonData Select null
> --
>
> Key: CARBONDATA-2351
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2351
> Project: CarbonData
>  Issue Type: Bug
>Affects Versions: 1.3.1
> Environment: Carbondata1.3.1 with Spark2.1.2
>Reporter: xhnqccf
>Priority: Major
>  Labels: SELECT, null
>
> CarbonData 1.3.1 with Spark 2.1.2: right after 'insert into values', SELECT 
> returns the correct data, but the next day SELECT returns nulls even though 
> the row count is correct. 
> create table carbon01(id int,name int,age int,sex int) stored by 'carbondata';
> insert into carbon01 values(1,1,1,1);
> select * from carbon01;
> 1 1 1 1
> Then I exit spark-sql.
> But the next day:
> select * from carbon01;
> null null null null



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CARBONDATA-2351) CarbonData Select null

2018-04-16 Thread Zhichao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-2351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16439029#comment-16439029
 ] 

Zhichao  Zhang commented on CARBONDATA-2351:


[~xhnqccf] did you run this test case in local mode or yarn-client mode? Does it 
reproduce every time?

> CarbonData Select null
> --
>
> Key: CARBONDATA-2351
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2351
> Project: CarbonData
>  Issue Type: Bug
>Affects Versions: 1.3.1
> Environment: Carbondata1.3.1 with Spark2.1.2
>Reporter: xhnqccf
>Priority: Major
>  Labels: SELECT, null
>
> CarbonData 1.3.1 with Spark 2.1.2: right after 'insert into values', SELECT 
> returns the correct data, but the next day SELECT returns nulls even though 
> the row count is correct. 
> create table carbon01(id int,name int,age int,sex int) stored by 'carbondata';
> insert into carbon01 values(1,1,1,1);
> select * from carbon01;
> 1 1 1 1
> Then I exit spark-sql.
> But the next day:
> select * from carbon01;
> null null null null



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (CARBONDATA-2345) "Task failed while writing rows" error occurs when streaming ingest into carbondata table

2018-04-15 Thread Zhichao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-2345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16438969#comment-16438969
 ] 

Zhichao  Zhang edited comment on CARBONDATA-2345 at 4/16/18 4:11 AM:
-

 

[~oceaneast], you can see the doc [Stream data 
parser|https://github.com/apache/carbondata/blob/branch-1.3/docs/streaming-guide.md#stream-data-parser].

There is also an 
[example|https://github.com/apache/carbondata/blob/branch-1.3/examples/spark2/src/main/scala/org/apache/carbondata/examples/CarbonStructuredStreamingWithRowParser.scala]
 showing how to use Stream Data Parser.


was (Author: zzcclp):
[~oceaneast], you can see the doc [Stream data 
parser|https://github.com/apache/carbondata/blob/branch-1.3/docs/streaming-guide.md#stream-data-parser]

> "Task failed while writing rows" error occurs when streaming ingest into 
> carbondata table
> --
>
> Key: CARBONDATA-2345
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2345
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.3.1
>Reporter: ocean
>Priority: Major
>
> carbondata version: 1.3.1, spark: 2.2.1
> When using Spark Structured Streaming to ingest data into a CarbonData table, 
> the following error occurs:
> warning: there was one deprecation warning; re-run with -deprecation for 
> details
> qry: org.apache.spark.sql.streaming.StreamingQuery = 
> org.apache.spark.sql.execution.streaming.StreamingQueryWrapper@7ddf193a
> [Stage 1:> (0 + 2) / 5]18/04/13 18:03:56 WARN TaskSetManager: Lost task 1.0 
> in stage 1.0 (TID 2, sz-pg-entanalytics-research-004.tendcloud.com, executor 
> 1): org.apache.carbondata.streaming.CarbonStreamException: Task failed while 
> writing rows
>  at 
> org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$.writeDataFileTask(CarbonAppendableStreamSink.scala:345)
>  at 
> org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileJob$1$$anonfun$apply$mcV$sp$1.apply(CarbonAppendableStreamSink.scala:247)
>  at 
> org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileJob$1$$anonfun$apply$mcV$sp$1.apply(CarbonAppendableStreamSink.scala:246)
>  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>  at org.apache.spark.scheduler.Task.run(Task.scala:108)
>  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
>  at 
> org.apache.carbondata.processing.loading.BadRecordsLogger.addBadRecordsToBuilder(BadRecordsLogger.java:126)
>  at 
> org.apache.carbondata.processing.loading.converter.impl.RowConverterImpl.convert(RowConverterImpl.java:164)
>  at 
> org.apache.carbondata.hadoop.streaming.CarbonStreamRecordWriter.write(CarbonStreamRecordWriter.java:186)
>  at 
> org.apache.carbondata.streaming.segment.StreamSegment.appendBatchData(StreamSegment.java:244)
>  at 
> org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileTask$1.apply$mcV$sp(CarbonAppendableStreamSink.scala:336)
>  at 
> org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileTask$1.apply(CarbonAppendableStreamSink.scala:326)
>  at 
> org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileTask$1.apply(CarbonAppendableStreamSink.scala:326)
>  at 
> org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1371)
>  at 
> org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$.writeDataFileTask(CarbonAppendableStreamSink.scala:338)
>  ... 8 more
> [Stage 1:===> (1 + 2) / 5]18/04/13 18:03:57 ERROR TaskSetManager: 
> Task 0 in stage 1.0 failed 4 times; aborting job
> 18/04/13 18:03:57 ERROR CarbonAppendableStreamSink$: stream execution thread 
> for [id = 3abdadea-65f6-4d94-8686-306fccae4559, runId = 
> 689adf7e-a617-41d9-96bc-de075ce4dd73] Aborting job job_20180413180354_.
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in 
> stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 
> (TID 11, sz-pg-entanalytics-research-004.tendcloud.com, executor 1): 
> org.apache.carbondata.streaming.CarbonStreamException: Task failed while 
> writing rows
>  at 
> org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$.writeDataFileTask(CarbonAppendableStreamSink.scala:345)
>  at 
> 

[jira] [Commented] (CARBONDATA-2345) "Task failed while writing rows" error occurs when streaming ingest into carbondata table

2018-04-15 Thread Zhichao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-2345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16438969#comment-16438969
 ] 

Zhichao  Zhang commented on CARBONDATA-2345:


[~oceaneast], you can see the doc [Stream data 
parser|https://github.com/apache/carbondata/blob/branch-1.3/docs/streaming-guide.md#stream-data-parser]

> "Task failed while writing rows" error occurs when streaming ingest into 
> carbondata table
> --
>
> Key: CARBONDATA-2345
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2345
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.3.1
>Reporter: ocean
>Priority: Major
>
> carbondata version: 1.3.1, spark: 2.2.1
> When using Spark Structured Streaming to ingest data into a CarbonData table, 
> the following error occurs:
> warning: there was one deprecation warning; re-run with -deprecation for 
> details
> qry: org.apache.spark.sql.streaming.StreamingQuery = 
> org.apache.spark.sql.execution.streaming.StreamingQueryWrapper@7ddf193a
> [Stage 1:> (0 + 2) / 5]18/04/13 18:03:56 WARN TaskSetManager: Lost task 1.0 
> in stage 1.0 (TID 2, sz-pg-entanalytics-research-004.tendcloud.com, executor 
> 1): org.apache.carbondata.streaming.CarbonStreamException: Task failed while 
> writing rows
>  at 
> org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$.writeDataFileTask(CarbonAppendableStreamSink.scala:345)
>  at 
> org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileJob$1$$anonfun$apply$mcV$sp$1.apply(CarbonAppendableStreamSink.scala:247)
>  at 
> org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileJob$1$$anonfun$apply$mcV$sp$1.apply(CarbonAppendableStreamSink.scala:246)
>  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>  at org.apache.spark.scheduler.Task.run(Task.scala:108)
>  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
>  at 
> org.apache.carbondata.processing.loading.BadRecordsLogger.addBadRecordsToBuilder(BadRecordsLogger.java:126)
>  at 
> org.apache.carbondata.processing.loading.converter.impl.RowConverterImpl.convert(RowConverterImpl.java:164)
>  at 
> org.apache.carbondata.hadoop.streaming.CarbonStreamRecordWriter.write(CarbonStreamRecordWriter.java:186)
>  at 
> org.apache.carbondata.streaming.segment.StreamSegment.appendBatchData(StreamSegment.java:244)
>  at 
> org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileTask$1.apply$mcV$sp(CarbonAppendableStreamSink.scala:336)
>  at 
> org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileTask$1.apply(CarbonAppendableStreamSink.scala:326)
>  at 
> org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileTask$1.apply(CarbonAppendableStreamSink.scala:326)
>  at 
> org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1371)
>  at 
> org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$.writeDataFileTask(CarbonAppendableStreamSink.scala:338)
>  ... 8 more
> [Stage 1:===> (1 + 2) / 5]18/04/13 18:03:57 ERROR TaskSetManager: 
> Task 0 in stage 1.0 failed 4 times; aborting job
> 18/04/13 18:03:57 ERROR CarbonAppendableStreamSink$: stream execution thread 
> for [id = 3abdadea-65f6-4d94-8686-306fccae4559, runId = 
> 689adf7e-a617-41d9-96bc-de075ce4dd73] Aborting job job_20180413180354_.
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in 
> stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 
> (TID 11, sz-pg-entanalytics-research-004.tendcloud.com, executor 1): 
> org.apache.carbondata.streaming.CarbonStreamException: Task failed while 
> writing rows
>  at 
> org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$.writeDataFileTask(CarbonAppendableStreamSink.scala:345)
>  at 
> org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileJob$1$$anonfun$apply$mcV$sp$1.apply(CarbonAppendableStreamSink.scala:247)
>  at 
> org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileJob$1$$anonfun$apply$mcV$sp$1.apply(CarbonAppendableStreamSink.scala:246)
>  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>  at org.apache.spark.scheduler.Task.run(Task.scala:108)
>  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
>  at 
> 

[jira] [Commented] (CARBONDATA-2345) "Task failed while writing rows" error occurs when streaming ingest into carbondata table

2018-04-13 Thread Zhichao Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-2345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16437420#comment-16437420
 ] 

Zhichao  Zhang commented on CARBONDATA-2345:


[~oceaneast], you need to add the option below to the 'writeStream' block:

 
{code:java}
.option(CarbonStreamParser.CARBON_STREAM_PARSER,
 CarbonStreamParser.CARBON_STREAM_PARSER_ROW_PARSER)
 
{code}
 

for example:

 
{code:java}
qry = readSocketDF.writeStream
.format("carbondata")
.trigger(ProcessingTime("20 seconds"))
.option("checkpointLocation", tablePath.getStreamingCheckpointDir)
.option("dbName", "default")
.option("tableName", tableName)
.option(CarbonStreamParser.CARBON_STREAM_PARSER,
CarbonStreamParser.CARBON_STREAM_PARSER_ROW_PARSER)
.outputMode("append")
.start()
{code}
 

 

Please try again.

> "Task failed while writing rows" error occurs when streaming ingest into 
> carbondata table
> --
>
> Key: CARBONDATA-2345
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2345
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.3.1
>Reporter: ocean
>Priority: Major
>
> carbondata version: 1.3.1, spark: 2.2.1
> When using Spark Structured Streaming to ingest data into a CarbonData table, 
> the following error occurs:
> warning: there was one deprecation warning; re-run with -deprecation for 
> details
> qry: org.apache.spark.sql.streaming.StreamingQuery = 
> org.apache.spark.sql.execution.streaming.StreamingQueryWrapper@7ddf193a
> [Stage 1:> (0 + 2) / 5]18/04/13 18:03:56 WARN TaskSetManager: Lost task 1.0 
> in stage 1.0 (TID 2, sz-pg-entanalytics-research-004.tendcloud.com, executor 
> 1): org.apache.carbondata.streaming.CarbonStreamException: Task failed while 
> writing rows
>  at 
> org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$.writeDataFileTask(CarbonAppendableStreamSink.scala:345)
>  at 
> org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileJob$1$$anonfun$apply$mcV$sp$1.apply(CarbonAppendableStreamSink.scala:247)
>  at 
> org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileJob$1$$anonfun$apply$mcV$sp$1.apply(CarbonAppendableStreamSink.scala:246)
>  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>  at org.apache.spark.scheduler.Task.run(Task.scala:108)
>  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
>  at 
> org.apache.carbondata.processing.loading.BadRecordsLogger.addBadRecordsToBuilder(BadRecordsLogger.java:126)
>  at 
> org.apache.carbondata.processing.loading.converter.impl.RowConverterImpl.convert(RowConverterImpl.java:164)
>  at 
> org.apache.carbondata.hadoop.streaming.CarbonStreamRecordWriter.write(CarbonStreamRecordWriter.java:186)
>  at 
> org.apache.carbondata.streaming.segment.StreamSegment.appendBatchData(StreamSegment.java:244)
>  at 
> org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileTask$1.apply$mcV$sp(CarbonAppendableStreamSink.scala:336)
>  at 
> org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileTask$1.apply(CarbonAppendableStreamSink.scala:326)
>  at 
> org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileTask$1.apply(CarbonAppendableStreamSink.scala:326)
>  at 
> org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1371)
>  at 
> org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$.writeDataFileTask(CarbonAppendableStreamSink.scala:338)
>  ... 8 more
> [Stage 1:===> (1 + 2) / 5]18/04/13 18:03:57 ERROR TaskSetManager: 
> Task 0 in stage 1.0 failed 4 times; aborting job
> 18/04/13 18:03:57 ERROR CarbonAppendableStreamSink$: stream execution thread 
> for [id = 3abdadea-65f6-4d94-8686-306fccae4559, runId = 
> 689adf7e-a617-41d9-96bc-de075ce4dd73] Aborting job job_20180413180354_.
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in 
> stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 
> (TID 11, sz-pg-entanalytics-research-004.tendcloud.com, executor 1): 
> org.apache.carbondata.streaming.CarbonStreamException: Task failed while 
> writing rows
>  at 
> org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$.writeDataFileTask(CarbonAppendableStreamSink.scala:345)
>  at 
> 

[jira] [Updated] (CARBONDATA-2337) Fix duplicately acquiring 'streaming.lock' error when integrating with spark-streaming

2018-04-11 Thread Zhichao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang updated CARBONDATA-2337:
---
Description: After merged 
[PR2135|[https://github.com/apache/carbondata/pull/2135]] it will acquire 
'streaming.lock' duplicately when integrating with spark-streaming.  (was: 
After merged [PR2135|[https://github.com/apache/carbondata/pull/2135],] it will 
acquire 'streaming.lock' duplicately when integrating with spark-streaming.)

> Fix duplicately acquiring 'streaming.lock' error when integrating with 
> spark-streaming
> --
>
> Key: CARBONDATA-2337
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2337
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 1.4.0
>Reporter: Zhichao  Zhang
>Assignee: Zhichao  Zhang
>Priority: Minor
> Fix For: 1.4.0
>
>
> After [PR2135|https://github.com/apache/carbondata/pull/2135] was merged, it 
> acquires 'streaming.lock' twice when integrating with spark-streaming.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2337) Fix duplicately acquiring 'streaming.lock' error when integrating with spark-streaming

2018-04-11 Thread Zhichao Zhang (JIRA)
Zhichao  Zhang created CARBONDATA-2337:
--

 Summary: Fix duplicately acquiring 'streaming.lock' error when 
integrating with spark-streaming
 Key: CARBONDATA-2337
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2337
 Project: CarbonData
  Issue Type: Bug
  Components: spark-integration
Affects Versions: 1.4.0
Reporter: Zhichao  Zhang
Assignee: Zhichao  Zhang
 Fix For: 1.4.0


After [PR2135|https://github.com/apache/carbondata/pull/2135] was merged, it 
acquires 'streaming.lock' twice when integrating with spark-streaming.
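The duplicate acquisition described above can be avoided by tracking which locks the current process already holds, so a nested acquire of 'streaming.lock' becomes a no-op instead of a failure. A minimal sketch of that idea; `ReentrantGuard` and its methods are illustrative assumptions, not CarbonData's actual locking API:

```scala
import scala.collection.mutable

// Hypothetical guard that remembers which named locks are already held,
// so the same lock is never acquired twice by nested code paths.
class ReentrantGuard {
  private val held = mutable.Set[String]()

  // Returns true only on the first acquisition of a given lock name;
  // a nested attempt on an already-held lock returns false instead of
  // re-acquiring (and later duplicately releasing) it.
  def tryAcquire(name: String): Boolean = synchronized {
    if (held.contains(name)) false
    else { held += name; true }
  }

  def release(name: String): Unit = synchronized { held -= name }
}
```

The outermost caller that gets `true` from `tryAcquire` is the one responsible for calling `release`.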





[jira] [Updated] (CARBONDATA-2302) Fix some bugs when separate visible and invisible segments info into two files

2018-04-01 Thread Zhichao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang updated CARBONDATA-2302:
---
Description: 
There are some bugs when separating visible and invisible segment info into 
two files:
 # It will not delete the physical data of history segments after separating
 # It generates duplicate segment IDs.

  was:
There are some bugs where separate visible and invisible segments info into two 
files:
 # It will not delete physical data of history segments after separating
 # Generate duplicated segment id.


> Fix some bugs when separate visible and invisible segments info into two files
> --
>
> Key: CARBONDATA-2302
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2302
> Project: CarbonData
>  Issue Type: Bug
>  Components: core, data-load
>Affects Versions: 1.4.0
>Reporter: Zhichao  Zhang
>Assignee: Zhichao  Zhang
>Priority: Major
> Fix For: 1.4.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> There are some bugs when separating visible and invisible segment info into 
> two files:
>  # It will not delete the physical data of history segments after separating
>  # It generates duplicate segment IDs.





[jira] [Created] (CARBONDATA-2302) Fix some bugs when separate visible and invisible segments info into two files

2018-04-01 Thread Zhichao Zhang (JIRA)
Zhichao  Zhang created CARBONDATA-2302:
--

 Summary: Fix some bugs when separate visible and invisible 
segments info into two files
 Key: CARBONDATA-2302
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2302
 Project: CarbonData
  Issue Type: Bug
  Components: core, data-load
Affects Versions: 1.4.0
Reporter: Zhichao  Zhang
Assignee: Zhichao  Zhang
 Fix For: 1.4.0


There are some bugs when separating visible and invisible segment info into two 
files:
 # It will not delete the physical data of history segments after separating
 # It generates duplicate segment IDs.





[jira] [Created] (CARBONDATA-2299) Support showing all segment information(include visible and invisible segments)

2018-03-31 Thread Zhichao Zhang (JIRA)
Zhichao  Zhang created CARBONDATA-2299:
--

 Summary: Support showing all segment information(include visible 
and invisible segments)
 Key: CARBONDATA-2299
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2299
 Project: CarbonData
  Issue Type: Improvement
Affects Versions: 1.4.0
Reporter: Zhichao  Zhang
Assignee: Zhichao  Zhang
 Fix For: 1.4.0


Use the command 'SHOW HISTORY SEGMENTS' to show all segment information 
(including visible and invisible segments).





[jira] [Created] (CARBONDATA-2298) Delete segment lock files before update metadata

2018-03-31 Thread Zhichao Zhang (JIRA)
Zhichao  Zhang created CARBONDATA-2298:
--

 Summary: Delete segment lock files before update metadata
 Key: CARBONDATA-2298
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2298
 Project: CarbonData
  Issue Type: Improvement
Affects Versions: 1.4.0, 1.3.2
Reporter: Zhichao  Zhang
Assignee: Zhichao  Zhang
 Fix For: 1.4.0, 1.3.2


If there are COMPACTED segments whose last modified time is within the past 
hour, the segment lock file deletion operation will not be executed.
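The retention rule above reduces to a simple predicate over segment metadata. A minimal sketch, assuming each segment exposes a status string and a last-modified timestamp; `SegmentInfo`, the field names, and the status literal are hypothetical illustrations, not CarbonData's actual classes:

```scala
import java.util.concurrent.TimeUnit

// Hypothetical segment metadata; not CarbonData's actual load-metadata class.
case class SegmentInfo(id: String, status: String, lastModifiedTime: Long)

object LockFileCleaner {
  // Lock files of segments compacted within the last hour are kept,
  // matching the one-hour window described in the issue.
  val RetentionMillis: Long = TimeUnit.HOURS.toMillis(1)

  // A segment's lock file may be deleted unless the segment was
  // COMPACTED recently (within the retention window).
  def canDeleteLock(seg: SegmentInfo, now: Long): Boolean =
    !(seg.status == "COMPACTED" && (now - seg.lastModifiedTime) <= RetentionMillis)
}
```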





[jira] [Created] (CARBONDATA-2258) Separate visible and invisible segments info into two files to reduce the size of tablestatus file.

2018-03-15 Thread Zhichao Zhang (JIRA)
Zhichao  Zhang created CARBONDATA-2258:
--

 Summary: Separate visible and invisible segments info into two 
files to reduce the size of tablestatus file.
 Key: CARBONDATA-2258
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2258
 Project: CarbonData
  Issue Type: Improvement
  Components: core
Affects Versions: 1.4.0, 1.3.2
Reporter: Zhichao  Zhang
Assignee: Zhichao  Zhang
 Fix For: 1.4.0, 1.3.2


The tablestatus file keeps growing larger; many code paths scan this file, and 
its size impacts the performance of reading it.
According to the discussion on 
[thread|http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/The-size-of-the-tablestatus-file-is-getting-larger-does-it-impact-the-performance-of-reading-this-fi-td41941.html],
 we can *append* the 
invisible segment list to a file called 'tablestatus.history' every time the 
command 'CLEAN FILES FOR TABLE' is executed (in method 
'SegmentStatusManager.deleteLoadsAndUpdateMetadata'), separating visible and 
invisible segments into two files (the tablestatus file and the 
tablestatus.history file). 
If later it needs to support listing all segments (both visible and invisible) 
when executing 'SHOW SEGMENTS FOR TABLE', it just needs to read from the two 
files.
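The separation described above amounts to partitioning the current segment list by visibility, appending the invisible part to the history list, and unioning the two lists when a full listing is needed. A minimal sketch under those assumptions; `LoadMetadata` and its fields are illustrative, not the real tablestatus schema:

```scala
// Hypothetical per-segment entry; not CarbonData's actual tablestatus format.
case class LoadMetadata(segmentId: String, visible: Boolean)

object TableStatusSeparator {
  // Visible segments stay in the tablestatus file; invisible ones are
  // appended to the tablestatus.history file, keeping tablestatus small.
  def separate(current: Seq[LoadMetadata],
               history: Seq[LoadMetadata]): (Seq[LoadMetadata], Seq[LoadMetadata]) = {
    val (visible, invisible) = current.partition(_.visible)
    (visible, history ++ invisible)
  }

  // A full listing (e.g. for 'SHOW HISTORY SEGMENTS') reads both files.
  def allSegments(status: Seq[LoadMetadata],
                  history: Seq[LoadMetadata]): Seq[LoadMetadata] =
    status ++ history
}
```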





[jira] [Created] (CARBONDATA-2244) When there are some invisibility INSERT_IN_PROGRESS/INSERT_OVERWRITE_IN_PROGRESS segments on main table, it can not create preaggregate table on it.

2018-03-09 Thread Zhichao Zhang (JIRA)
Zhichao  Zhang created CARBONDATA-2244:
--

 Summary: When there are some invisibility 
INSERT_IN_PROGRESS/INSERT_OVERWRITE_IN_PROGRESS segments on main table, it can 
not create preaggregate table on it.
 Key: CARBONDATA-2244
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2244
 Project: CarbonData
  Issue Type: Bug
Affects Versions: 1.3.0, 1.4.0
Reporter: Zhichao  Zhang
Assignee: Zhichao  Zhang
 Fix For: 1.4.0, 1.3.2


When there are some invisible 
INSERT_IN_PROGRESS/INSERT_OVERWRITE_IN_PROGRESS segments on the main table, a 
preaggregate table cannot be created on it.





[jira] [Updated] (CARBONDATA-2230) Add a path into table path to store lock files and delete useless segment lock files before loading

2018-03-06 Thread Zhichao Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhichao  Zhang updated CARBONDATA-2230:
---
Description: 
After [PR1984|https://github.com/apache/carbondata/pull/1984] was merged, lock 
files are no longer deleted on unlock, so many useless lock files accumulate in 
the table path, especially segment lock files, which grow after every batch 
load.

Solution:
1. Add a child path, called 'Locks', into the table path; all lock files will 
be stored in this path.
2. Before loading, find all useless segment lock files and delete them; only 
segment lock files keep growing, other lock files don't.

> Add a path into table path to store lock files and delete useless segment 
> lock files before loading
> ---
>
> Key: CARBONDATA-2230
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2230
> Project: CarbonData
>  Issue Type: Improvement
>  Components: data-load
>Affects Versions: 1.3.0, 1.4.0
>Reporter: Zhichao  Zhang
>Assignee: Zhichao  Zhang
>Priority: Minor
> Fix For: 1.3.0, 1.4.0
>
>
> After [PR1984|https://github.com/apache/carbondata/pull/1984] was merged, 
> lock files are no longer deleted on unlock, so many useless lock files 
> accumulate in the table path, especially segment lock files, which grow 
> after every batch load.
> Solution:
> 1. Add a child path, called 'Locks', into the table path; all lock files 
> will be stored in this path.
> 2. Before loading, find all useless segment lock files and delete them; 
> only segment lock files keep growing, other lock files don't.
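The two steps above can be sketched as a pair of small helpers. The 'Locks' directory name comes from the description itself, while the '&lt;segmentId&gt;.lock' file-naming convention is an assumption for illustration, not CarbonData's actual scheme:

```scala
import java.nio.file.Paths

object LockPathUtil {
  // Step 1: every lock file lives under a 'Locks' child directory
  // of the table path instead of being scattered across it.
  def lockFilePath(tablePath: String, lockName: String): String =
    Paths.get(tablePath, "Locks", lockName).toString

  // Step 2: assuming segment lock files are named '<segmentId>.lock',
  // a lock file is useless once its segment is no longer loading and
  // can be deleted before the next load starts.
  def uselessSegmentLocks(lockFiles: Seq[String],
                          loadingSegments: Set[String]): Seq[String] =
    lockFiles.filter(f => !loadingSegments.contains(f.stripSuffix(".lock")))
}
```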





[jira] [Created] (CARBONDATA-2230) Add a path into table path to store lock files and delete useless segment lock files before loading

2018-03-06 Thread Zhichao Zhang (JIRA)
Zhichao  Zhang created CARBONDATA-2230:
--

 Summary: Add a path into table path to store lock files and delete 
useless segment lock files before loading
 Key: CARBONDATA-2230
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2230
 Project: CarbonData
  Issue Type: Improvement
  Components: data-load
Affects Versions: 1.3.0, 1.4.0
Reporter: Zhichao  Zhang
Assignee: Zhichao  Zhang
 Fix For: 1.4.0, 1.3.0








[jira] [Created] (CARBONDATA-2215) Add the description of Carbon Stream Parser into streaming-guide.md

2018-02-27 Thread Zhichao Zhang (JIRA)
Zhichao  Zhang created CARBONDATA-2215:
--

 Summary: Add the description of Carbon Stream Parser into 
streaming-guide.md
 Key: CARBONDATA-2215
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2215
 Project: CarbonData
  Issue Type: Task
  Components: docs
Reporter: Zhichao  Zhang
Assignee: Zhichao  Zhang


Add the description of Carbon Stream Parser into streaming-guide.md




