[jira] [Commented] (CARBONDATA-3631) StringIndexOutOfBoundsException When Inserting Select From a Parquet Table with Empty array/map
[ https://issues.apache.org/jira/browse/CARBONDATA-3631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17005201#comment-17005201 ] Zhichao Zhang commented on CARBONDATA-3631: [~shenhong] please raise a pr to fix this, thanks. > StringIndexOutOfBoundsException When Inserting Select From a Parquet Table > with Empty array/map > --- > > Key: CARBONDATA-3631 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3631 > Project: CarbonData > Issue Type: Bug >Affects Versions: 1.6.1, 2.0.0 >Reporter: Xingjun Hao >Priority: Minor > Fix For: 2.0.0 > > > sql("insert into datatype_array_parquet values(array())") > sql("insert into datatype_array_carbondata select f from > datatype_array_parquet") > > {code:java} > java.lang.StringIndexOutOfBoundsException: String index out of range: -1 > at java.lang.AbstractStringBuilder.substring(AbstractStringBuilder.java:935) > at java.lang.StringBuilder.substring(StringBuilder.java:76) > at scala.collection.mutable.StringBuilder.substring(StringBuilder.scala:166) > at > org.apache.carbondata.streaming.parser.FieldConverter$.objectToString(FieldConverter.scala:77) > at > org.apache.carbondata.spark.util.CarbonScalaUtil$.getString(CarbonScalaUtil.scala:71) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
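The trace above points at the trailing-delimiter trim in FieldConverter.objectToString. Below is a minimal sketch of the likely failure mode, not CarbonData's actual code (the class and method names are made up; only the substring pattern comes from the trace): joining zero elements leaves the builder empty, so the end index becomes -1 and substring throws. Guarding the empty case avoids it.

```java
import java.util.List;

public class EmptyCollectionJoin {

    // Buggy pattern: append each element plus a delimiter, then strip the
    // trailing delimiter with substring. For an empty collection the end
    // index is -1, so substring throws StringIndexOutOfBoundsException.
    static String joinAndTrim(List<String> values, String delimiter) {
        StringBuilder builder = new StringBuilder();
        for (String value : values) {
            builder.append(value).append(delimiter);
        }
        return builder.substring(0, builder.length() - delimiter.length());
    }

    // Guarded variant: return an empty string before touching substring.
    static String joinAndTrimSafe(List<String> values, String delimiter) {
        if (values.isEmpty()) {
            return "";
        }
        return joinAndTrim(values, delimiter);
    }
}
```

An empty array/map read from the Parquet table would hit the unguarded path above.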
[jira] [Resolved] (CARBONDATA-3619) NoSuchMethodError(registerCurrentOperationLog) While Creating Table
[ https://issues.apache.org/jira/browse/CARBONDATA-3619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang resolved CARBONDATA-3619. Resolution: Fixed > NoSuchMethodError(registerCurrentOperationLog) While Creating Table > --- > > Key: CARBONDATA-3619 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3619 > Project: CarbonData > Issue Type: Bug > Components: spark-integration >Affects Versions: 1.6.1, 2.0.0 >Reporter: Xingjun Hao >Priority: Minor > Fix For: 2.0.0 > > Time Spent: 1.5h > Remaining Estimate: 0h > > ExecuteStatementOperation.java exists in both the hive-service module and the > spark-hive-thriftserver module, leading to "NoSuchMethodError: > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.registerCurrentOperationLog()V" > {code:java} > Caused by: java.lang.NoSuchMethodError: > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.registerCurrentOperationLog()V > > at > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.protected$registerCurrentOperationLog(SparkExecuteStatementOperation.scala:173) > > at > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:173) > > at > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:171) > > at java.security.AccessController.doPrivileged(Native Method) at > javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > > at > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:185) > > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at > java.util.concurrent.FutureTask.run(FutureTask.java:266) ... 3 more > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
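A NoSuchMethodError like the one above usually means two different copies of the same class are on the classpath and the JVM loaded the copy without the method. A hypothetical diagnostic helper (the class and method names are illustrative, not part of CarbonData) that lists every jar carrying a given class resource:

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipFile;

public class DuplicateClassFinder {

    // Return the names of every jar that contains the given class resource,
    // e.g. "org/apache/hive/service/cli/operation/ExecuteStatementOperation.class".
    // More than one hit means the JVM may load an unexpected copy.
    static List<String> jarsContaining(List<File> jars, String classResource) {
        List<String> hits = new ArrayList<>();
        for (File jar : jars) {
            try (ZipFile zipFile = new ZipFile(jar)) {
                if (zipFile.getEntry(classResource) != null) {
                    hits.add(jar.getName());
                }
            } catch (IOException ignored) {
                // skip unreadable or non-zip files
            }
        }
        return hits;
    }
}
```

Running this over the Spark jars directory would show which jars shadow each other for the conflicting class.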
[jira] [Commented] (CARBONDATA-3612) Caused by: java.io.IOException: Problem in loading segment blocks: null
[ https://issues.apache.org/jira/browse/CARBONDATA-3612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16996914#comment-16996914 ] Zhichao Zhang commented on CARBONDATA-3612: [~SeaAndHill] you can add my WeChat: xm_zzc, I can help to view this problem. > Caused by: java.io.IOException: Problem in loading segment blocks: null > --- > > Key: CARBONDATA-3612 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3612 > Project: CarbonData > Issue Type: Bug > Components: core, data-load >Affects Versions: 1.5.1 >Reporter: SeaAndHill >Priority: Major > > at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:56) > at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:56) > at > org.apache.spark.sql.execution.exchange.ShuffleExchange.doExecute(ShuffleExchange.scala:115) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138) > at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) > at > org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135) at > org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116) at > org.apache.spark.sql.execution.InputAdapter.inputRDDs(WholeStageCodegenExec.scala:252) > at > org.apache.spark.sql.execution.aggregate.HashAggregateExec.inputRDDs(HashAggregateExec.scala:141) > at > org.apache.spark.sql.execution.aggregate.HashAggregateExec.inputRDDs(HashAggregateExec.scala:141) > at > org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:386) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117) > at > 
org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138) > at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) > at > org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135) at > org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116) at > org.apache.spark.sql.execution.exchange.ShuffleExchange.prepareShuffleDependency(ShuffleExchange.scala:88) > at > org.apache.spark.sql.execution.exchange.ShuffleExchange$$anonfun$doExecute$1.apply(ShuffleExchange.scala:124) > at > org.apache.spark.sql.execution.exchange.ShuffleExchange$$anonfun$doExecute$1.apply(ShuffleExchange.scala:115) > at > org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:52) > ... 35 moreCaused by: java.io.IOException: Problem in loading segment blocks: > null at > org.apache.carbondata.core.indexstore.BlockletDataMapIndexStore.getAll(BlockletDataMapIndexStore.java:193) > at > org.apache.carbondata.core.indexstore.blockletindex.BlockletDataMapFactory.getDataMaps(BlockletDataMapFactory.java:144) > at > org.apache.carbondata.core.datamap.TableDataMap.prune(TableDataMap.java:139) > at > org.apache.carbondata.hadoop.api.CarbonInputFormat.getPrunedBlocklets(CarbonInputFormat.java:493) > at > org.apache.carbondata.hadoop.api.CarbonInputFormat.getDataBlocksOfSegment(CarbonInputFormat.java:412) > at > org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getSplits(CarbonTableInputFormat.java:529) > at > org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getSplits(CarbonTableInputFormat.java:220) > at > org.apache.carbondata.spark.rdd.CarbonScanRDD.internalGetPartitions(CarbonScanRDD.scala:127) > at > org.apache.carbondata.spark.rdd.CarbonRDD.getPartitions(CarbonRDD.scala:66) > at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252) at > org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250) at > scala.Option.getOrElse(Option.scala:121) at > 
org.apache.spark.rdd.RDD.partitions(RDD.scala:250) at > org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35) > at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252) at > org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250) at > scala.Option.getOrElse(Option.scala:121) at > org.apache.spark.rdd.RDD.partitions(RDD.scala:250) at > org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35) > at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:252) at > org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:250) at > scala.Option.getOrElse(Option.scala:121) at > org.apache.spark.rdd.RDD.partitions(RDD.scala:250) at >
[jira] [Created] (CARBONDATA-3611) Filter failed with measure columns on stream table when this stream table includes complex columns
Zhichao Zhang created CARBONDATA-3611: -- Summary: Filter failed with measure columns on stream table when this stream table includes complex columns Key: CARBONDATA-3611 URL: https://issues.apache.org/jira/browse/CARBONDATA-3611 Project: CarbonData Issue Type: Bug Reporter: Zhichao Zhang Assignee: Zhichao Zhang Filter failed with measure columns on stream table when this stream table includes complex columns -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CARBONDATA-3596) After executing the 'add column' command, a load data command or select sql on a table which includes complex columns throws an exception
Zhichao Zhang created CARBONDATA-3596: -- Summary: After executing the 'add column' command, a load data command or select sql on a table which includes complex columns throws an exception Key: CARBONDATA-3596 URL: https://issues.apache.org/jira/browse/CARBONDATA-3596 Project: CarbonData Issue Type: Bug Reporter: Zhichao Zhang Assignee: Zhichao Zhang After executing the 'add column' command, a load data command or select sql on a table which includes complex columns throws an exception -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CARBONDATA-3591) optimize java code checkstyle for NoWhitespaceAfter rule
[ https://issues.apache.org/jira/browse/CARBONDATA-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang updated CARBONDATA-3591: --- Fix Version/s: 2.0.0 > optimize java code checkstyle for NoWhitespaceAfter rule > > > Key: CARBONDATA-3591 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3591 > Project: CarbonData > Issue Type: Improvement > Components: core >Reporter: lamber-ken >Priority: Major > Fix For: 2.0.0 > > Time Spent: 1h > Remaining Estimate: 0h > > optimize java code checkstyle for NoWhitespaceAfter rule -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (CARBONDATA-3591) optimize java code checkstyle for NoWhitespaceAfter rule
[ https://issues.apache.org/jira/browse/CARBONDATA-3591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang resolved CARBONDATA-3591. Resolution: Fixed > optimize java code checkstyle for NoWhitespaceAfter rule > > > Key: CARBONDATA-3591 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3591 > Project: CarbonData > Issue Type: Improvement > Components: core >Reporter: lamber-ken >Priority: Major > Time Spent: 1h > Remaining Estimate: 0h > > optimize java code checkstyle for NoWhitespaceAfter rule -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (CARBONDATA-3576) optimize java code checkstyle for EmptyLineSeparator rule
[ https://issues.apache.org/jira/browse/CARBONDATA-3576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang resolved CARBONDATA-3576. Resolution: Fixed > optimize java code checkstyle for EmptyLineSeparator rule > - > > Key: CARBONDATA-3576 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3576 > Project: CarbonData > Issue Type: Improvement > Components: core, other >Affects Versions: 1.6.1 >Reporter: lamber-ken >Priority: Minor > Fix For: 2.0.0 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > optimize java code checkstyle for EmptyLineSeparator rule -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CARBONDATA-3583) Upgrade default JDK version 1.7 to 1.8
[ https://issues.apache.org/jira/browse/CARBONDATA-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang updated CARBONDATA-3583: --- Fix Version/s: 2.0.0 > Upgrade default JDK version 1.7 to 1.8 > -- > > Key: CARBONDATA-3583 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3583 > Project: CarbonData > Issue Type: Improvement >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang >Priority: Minor > Fix For: 2.0.0 > > > Upgrade default JDK version 1.7 to 1.8 which provides some good features that > makes code cleaner. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CARBONDATA-3583) Upgrade default JDK version 1.7 to 1.8
Zhichao Zhang created CARBONDATA-3583: -- Summary: Upgrade default JDK version 1.7 to 1.8 Key: CARBONDATA-3583 URL: https://issues.apache.org/jira/browse/CARBONDATA-3583 Project: CarbonData Issue Type: Improvement Reporter: Zhichao Zhang Assignee: Zhichao Zhang Upgrade default JDK version 1.7 to 1.8 which provides some good features that makes code cleaner. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CARBONDATA-3561) After delete/update the data, the query results become incorrect
[ https://issues.apache.org/jira/browse/CARBONDATA-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang updated CARBONDATA-3561: --- Fix Version/s: 2.0.0 > After delete/update the data, the query results become incorrect > > > Key: CARBONDATA-3561 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3561 > Project: CarbonData > Issue Type: Bug > Components: data-query >Affects Versions: 1.6.0, 1.5.4 > Environment: SUSE12 CDH5 >Reporter: Xigua >Assignee: Zhichao Zhang >Priority: Blocker > Fix For: 2.0.0 > > Attachments: image-2019-10-30-14-42-49-871.png, > image-2019-10-30-14-43-21-972.png > > Time Spent: 3h 10m > Remaining Estimate: 0h > > > {quote}CREATE TABLE `workqueue_zyy02` (`queuecode` STRING, `channelflag` > STRING, `dt` STRING) > USING org.apache.spark.sql.CarbonSource > PARTITIONED BY (dt); > > delete from workqueue_zyy02 m where m.queuecode ='1'; > {quote} > *before* > * !image-2019-10-30-14-42-49-871.png! > * > * *after* > * !image-2019-10-30-14-43-21-972.png! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (CARBONDATA-3569) spark ui open exception
[ https://issues.apache.org/jira/browse/CARBONDATA-3569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang reassigned CARBONDATA-3569: -- Assignee: Zhichao Zhang > spark ui open exception > --- > > Key: CARBONDATA-3569 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3569 > Project: CarbonData > Issue Type: Improvement > Components: build >Affects Versions: 1.6.1 >Reporter: tianyou >Assignee: Zhichao Zhang >Priority: Critical > Fix For: 2.0.0 > > Time Spent: 1h > Remaining Estimate: 0h > > The Spark 2.3.2 jars contain javax packages, and the carbondata > package references another javax version, which causes an exception in the Spark UI. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CARBONDATA-3577) Use Spark 2.3 as default version and upgrade Spark 2.3.2 to 2.3.4
Zhichao Zhang created CARBONDATA-3577: -- Summary: Use Spark 2.3 as default version and upgrade Spark 2.3.2 to 2.3.4 Key: CARBONDATA-3577 URL: https://issues.apache.org/jira/browse/CARBONDATA-3577 Project: CarbonData Issue Type: Improvement Components: spark-integration Reporter: Zhichao Zhang Assignee: Zhichao Zhang Fix For: 2.0.0 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CARBONDATA-3561) After delete/update the data, the query results become incorrect
[ https://issues.apache.org/jira/browse/CARBONDATA-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang updated CARBONDATA-3561: --- Summary: After delete/update the data, the query results become incorrect (was: After deletor updat the data, the query results become incorrect) > After delete/update the data, the query results become incorrect > > > Key: CARBONDATA-3561 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3561 > Project: CarbonData > Issue Type: Bug > Components: data-query >Affects Versions: 1.6.0, 1.5.4 > Environment: SUSE12 CDH5 >Reporter: Xigua >Assignee: Zhichao Zhang >Priority: Blocker > Attachments: image-2019-10-30-14-42-49-871.png, > image-2019-10-30-14-43-21-972.png > > > > {quote}CREATE TABLE `workqueue_zyy02` (`queuecode` STRING, `channelflag` > STRING, `dt` STRING) > USING org.apache.spark.sql.CarbonSource > PARTITIONED BY (dt); > > delete from workqueue_zyy02 m where m.queuecode ='1'; > {quote} > *before* > * !image-2019-10-30-14-42-49-871.png! > * > * *after* > * !image-2019-10-30-14-43-21-972.png! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Reopened] (CARBONDATA-3561) After deletor updat the data, the query results become incorrect
[ https://issues.apache.org/jira/browse/CARBONDATA-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang reopened CARBONDATA-3561: Assignee: Zhichao Zhang This issue is not resolved. > After deletor updat the data, the query results become incorrect > > > Key: CARBONDATA-3561 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3561 > Project: CarbonData > Issue Type: Bug > Components: data-query >Affects Versions: 1.6.0, 1.5.4 > Environment: SUSE12 CDH5 >Reporter: Xigua >Assignee: Zhichao Zhang >Priority: Blocker > Attachments: image-2019-10-30-14-42-49-871.png, > image-2019-10-30-14-43-21-972.png > > > > {quote}CREATE TABLE `workqueue_zyy02` (`queuecode` STRING, `channelflag` > STRING, `dt` STRING) > USING org.apache.spark.sql.CarbonSource > PARTITIONED BY (dt); > > delete from workqueue_zyy02 m where m.queuecode ='1'; > {quote} > *before* > * !image-2019-10-30-14-42-49-871.png! > * > * *after* > * !image-2019-10-30-14-43-21-972.png! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CARBONDATA-3527) Throw 'String length cannot exceed 32000 characters' exception when load data with 'GLOBAL_SORT' from csv which include big complex type data
[ https://issues.apache.org/jira/browse/CARBONDATA-3527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang updated CARBONDATA-3527: --- Description: *Problem:* When complex type data takes more than 32000 characters to represent in a csv file, and data is loaded with 'GLOBAL_SORT' from these csv files, it throws a 'String length cannot exceed 32000 characters' exception. *Cause:* When 'GLOBAL_SORT' is used to load data from csv files, the files are read and the data is first stored in StringArrayRow, where all values are strings. When 'CarbonScalaUtil.getString' is called in 'NewRddIterator.next', it checks the length of every value and throws the 'String length cannot exceed 32000 characters' exception even for complex type data that is stored as more than 32000 characters in the csv files. *Solution:* In 'FieldConverter.objectToString' (called in 'CarbonScalaUtil.getString'), don't check the length if the data type of the field is a complex type. > Throw 'String length cannot exceed 32000 characters' exception when load data > with 'GLOBAL_SORT' from csv which include big complex type data > - > > Key: CARBONDATA-3527 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3527 > Project: CarbonData > Issue Type: Bug > Components: spark-integration >Affects Versions: 1.6.0 >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang >Priority: Major > Fix For: 1.6.1 > > > *Problem:* > When complex type data takes more than 32000 characters to represent in a csv > file, and data is loaded with 'GLOBAL_SORT' from these csv files, it throws a > 'String length cannot exceed 32000 characters' exception. 
> *Cause:* > When 'GLOBAL_SORT' is used to load data from csv files, the files are read and > the data is first stored in StringArrayRow, where all values are strings. When > 'CarbonScalaUtil.getString' is called in 'NewRddIterator.next', it checks the > length of every value and throws the 'String length cannot exceed 32000 characters' > exception even for complex type data that is stored as more than 32000 > characters in the csv files. > *Solution:* > In 'FieldConverter.objectToString' (called in 'CarbonScalaUtil.getString'), > don't check the length if the data type of the field is a complex type. -- This message was sent by Atlassian Jira (v8.3.4#803005)
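The fix described for CARBONDATA-3527 can be sketched as a length check that exempts complex types. This is an illustration only, built on a simplified field model; the class, method, and enum names are hypothetical, while the 32000-character limit comes from the issue:

```java
public class StringLengthCheck {

    static final int MAX_STRING_LENGTH = 32000;

    // Simplified field model: array/map/struct fields count as COMPLEX.
    enum FieldKind { PRIMITIVE, COMPLEX }

    // Only primitive string fields are bounded by the 32000-character limit;
    // a complex field may serialize to a longer intermediate string because
    // it is split into elements later in the load pipeline.
    static String checkedValue(String value, FieldKind kind) {
        if (kind == FieldKind.PRIMITIVE && value.length() > MAX_STRING_LENGTH) {
            throw new IllegalArgumentException(
                "String length cannot exceed " + MAX_STRING_LENGTH + " characters");
        }
        return value;
    }
}
```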
[jira] [Updated] (CARBONDATA-3527) Throw 'String length cannot exceed 32000 characters' exception when load data with 'GLOBAL_SORT' from csv which include big complex type data
[ https://issues.apache.org/jira/browse/CARBONDATA-3527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang updated CARBONDATA-3527: --- Description: (was: Problem: when load data with 'GLOBAL_SORT' from csv, and these csv files include some big complex type data, which are used long string to ) > Throw 'String length cannot exceed 32000 characters' exception when load data > with 'GLOBAL_SORT' from csv which include big complex type data > - > > Key: CARBONDATA-3527 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3527 > Project: CarbonData > Issue Type: Bug > Components: spark-integration >Affects Versions: 1.6.0 >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang >Priority: Major > Fix For: 1.6.1 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CARBONDATA-3527) Throw 'String length cannot exceed 32000 characters' exception when load data with 'GLOBAL_SORT' from csv which include big complex type data
[ https://issues.apache.org/jira/browse/CARBONDATA-3527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang updated CARBONDATA-3527: --- Description: Problem: when load data with 'GLOBAL_SORT' from csv, and these csv files include some big complex type data, which are used long string to > Throw 'String length cannot exceed 32000 characters' exception when load data > with 'GLOBAL_SORT' from csv which include big complex type data > - > > Key: CARBONDATA-3527 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3527 > Project: CarbonData > Issue Type: Bug > Components: spark-integration >Affects Versions: 1.6.0 >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang >Priority: Major > Fix For: 1.6.1 > > > Problem: > when load data with 'GLOBAL_SORT' from csv, and these csv files include some > big complex type data, which are used long string to -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CARBONDATA-3527) Throw 'String length cannot exceed 32000 characters' exception when load data with 'GLOBAL_SORT' from csv which include big complex type data
Zhichao Zhang created CARBONDATA-3527: -- Summary: Throw 'String length cannot exceed 32000 characters' exception when load data with 'GLOBAL_SORT' from csv which include big complex type data Key: CARBONDATA-3527 URL: https://issues.apache.org/jira/browse/CARBONDATA-3527 Project: CarbonData Issue Type: Bug Components: spark-integration Affects Versions: 1.6.0 Reporter: Zhichao Zhang Assignee: Zhichao Zhang Fix For: 1.6.1 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CARBONDATA-3513) can not run major compaction when using hive partition table
[ https://issues.apache.org/jira/browse/CARBONDATA-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang updated CARBONDATA-3513: --- Description: Major compaction command runs error.ERROR information: {code:java} 2019-09-03 13:35:49 INFO BlockManagerInfo:54 - Added broadcast_0_piece0 in memory on czh-yhfx-redis1:41430 (size: 26.4 KB, free: 5.2 GB)2019-09-03 13:35:49 INFO BlockManagerInfo:54 - Added broadcast_0_piece0 in memory on czh-yhfx-redis1:41430 (size: 26.4 KB, free: 5.2 GB)2019-09-03 13:35:52 WARN TaskSetManager:66 - Lost task 1.0 in stage 0.0 (TID 1, czh-yhfx-redis1, executor 1): java.lang.NumberFormatException: For input string: "328812001110" at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) at java.lang.Long.parseLong(Long.java:592) at java.lang.Long.parseLong(Long.java:631) at org.apache.carbondata.core.util.path.CarbonTablePath$DataFileUtil.getTaskIdFromTaskNo(CarbonTablePath.java:503) at org.apache.carbondata.processing.store.CarbonFactDataHandlerModel.getCarbonFactDataHandlerModel(CarbonFactDataHandlerModel.java:396) at org.apache.carbondata.processing.merger.RowResultMergerProcessor.(RowResultMergerProcessor.java:86) at org.apache.carbondata.spark.rdd.CarbonMergerRDD$$anon$1.(CarbonMergerRDD.scala:213) at org.apache.carbondata.spark.rdd.CarbonMergerRDD.internalCompute(CarbonMergerRDD.scala:86) at org.apache.carbondata.spark.rdd.CarbonRDD.compute(CarbonRDD.scala:82) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) at org.apache.spark.scheduler.Task.run(Task.scala:108) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:748){code} 
version:apache-carbondata-1.5.1-bin-spark2.2.1-hadoop2.7.2.jar table is a hive partition table. was: Major compaction command runs error.ERROR information: 2019-09-03 13:35:49 INFO BlockManagerInfo:54 - Added broadcast_0_piece0 in memory on czh-yhfx-redis1:41430 (size: 26.4 KB, free: 5.2 GB)2019-09-03 13:35:49 INFO BlockManagerInfo:54 - Added broadcast_0_piece0 in memory on czh-yhfx-redis1:41430 (size: 26.4 KB, free: 5.2 GB)2019-09-03 13:35:52 WARN TaskSetManager:66 - Lost task 1.0 in stage 0.0 (TID 1, czh-yhfx-redis1, executor 1): java.lang.NumberFormatException: For input string: "328812001110" at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) at java.lang.Long.parseLong(Long.java:592) at java.lang.Long.parseLong(Long.java:631) at org.apache.carbondata.core.util.path.CarbonTablePath$DataFileUtil.getTaskIdFromTaskNo(CarbonTablePath.java:503) at org.apache.carbondata.processing.store.CarbonFactDataHandlerModel.getCarbonFactDataHandlerModel(CarbonFactDataHandlerModel.java:396) at org.apache.carbondata.processing.merger.RowResultMergerProcessor.(RowResultMergerProcessor.java:86) at org.apache.carbondata.spark.rdd.CarbonMergerRDD$$anon$1.(CarbonMergerRDD.scala:213) at org.apache.carbondata.spark.rdd.CarbonMergerRDD.internalCompute(CarbonMergerRDD.scala:86) at org.apache.carbondata.spark.rdd.CarbonRDD.compute(CarbonRDD.scala:82) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) at org.apache.spark.rdd.RDD.iterator(RDD.scala:287) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) at org.apache.spark.scheduler.Task.run(Task.scala:108) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:748) version:apache-carbondata-1.5.1-bin-spark2.2.1-hadoop2.7.2.jar table is a hive partition table. 
> can not run major compaction when using hive partition table > > > Key: CARBONDATA-3513 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3513 > Project: CarbonData > Issue Type: Bug > Components: core >Affects Versions: 1.6.0 >Reporter: ocean >Priority: Major > Fix For: 1.6.1 > > Attachments: 33cb0d98561f8b7f505eb7b2ff9f72e0.jpg > > > Major compaction command fails. Error information: > {code:java} > 2019-09-03 13:35:49 INFO BlockManagerInfo:54 - Added broadcast_0_piece0 in > memory on czh-yhfx-redis1:41430 (size: 26.4 KB, free: 5.2 GB)2019-09-03 > 13:35:49 INFO BlockManagerInfo:54 - Added broadcast_0_piece0 in memory on > czh-yhfx-redis1:41430 (size:
[jira] [Created] (CARBONDATA-3501) Support to execute update sql on table with long_string field (Not update long_string field)
Zhichao Zhang created CARBONDATA-3501: -- Summary: Support to execute update sql on table with long_string field (Not update long_string field) Key: CARBONDATA-3501 URL: https://issues.apache.org/jira/browse/CARBONDATA-3501 Project: CarbonData Issue Type: Improvement Reporter: Zhichao Zhang Assignee: Zhichao Zhang When executing an update sql (not updating the long_string field) on a table with a long_string field, it fails. -- This message was sent by Atlassian JIRA (v8.3.2#803003)
[jira] [Updated] (CARBONDATA-3497) Support to write long string for streaming table
[ https://issues.apache.org/jira/browse/CARBONDATA-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang updated CARBONDATA-3497: --- Description: Support to write long string for streaming table > Support to write long string for streaming table > > > Key: CARBONDATA-3497 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3497 > Project: CarbonData > Issue Type: Improvement > Components: spark-integration >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang >Priority: Minor > > Support to write long string for streaming table -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (CARBONDATA-3498) Support to alter sort_column for streaming table
Zhichao Zhang created CARBONDATA-3498: -- Summary: Support to alter sort_column for streaming table Key: CARBONDATA-3498 URL: https://issues.apache.org/jira/browse/CARBONDATA-3498 Project: CarbonData Issue Type: Improvement Components: spark-integration Reporter: Zhichao Zhang Assignee: Zhichao Zhang Support to alter sort_column for streaming table -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (CARBONDATA-3497) Support to write long string for streaming table
Zhichao Zhang created CARBONDATA-3497: -- Summary: Support to write long string for streaming table Key: CARBONDATA-3497 URL: https://issues.apache.org/jira/browse/CARBONDATA-3497 Project: CarbonData Issue Type: Improvement Components: spark-integration Reporter: Zhichao Zhang Assignee: Zhichao Zhang -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (CARBONDATA-3491) Return updated/deleted rows count when execute update/delete sql
Zhichao Zhang created CARBONDATA-3491: -- Summary: Return updated/deleted rows count when execute update/delete sql Key: CARBONDATA-3491 URL: https://issues.apache.org/jira/browse/CARBONDATA-3491 Project: CarbonData Issue Type: Improvement Reporter: Zhichao Zhang Assignee: Zhichao Zhang Return updated/deleted rows count when execute update/delete sql. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Resolved] (CARBONDATA-3488) Check the file size after move local file to carbon path
[ https://issues.apache.org/jira/browse/CARBONDATA-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang resolved CARBONDATA-3488. Resolution: Fixed Fix Version/s: 1.6.0 > Check the file size after move local file to carbon path > > > Key: CARBONDATA-3488 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3488 > Project: CarbonData > Issue Type: Improvement >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang >Priority: Minor > Fix For: 1.6.0 > > Time Spent: 7h > Remaining Estimate: 0h > > *Problem:* > One user hit an issue: the row count saved in the carbonindex file is non-zero but > the file size of the relevant carbondata file is 0. > > *Solution:* > In CarbonUtil.copyCarbonDataFileToCarbonStorePath, check whether the file size of the > carbon file is the same as the size of the local file after moving the local > file to the carbon path. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
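The solution described for CARBONDATA-3488 can be sketched as: copy the local file to the store path, then compare sizes and fail the load rather than silently keeping a 0-byte carbondata file. The class and method names below are illustrative, not CarbonData's API:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class CopyWithSizeCheck {

    // Copy a local file to the target store path, then verify the copied
    // size matches the source. A mismatch (e.g. a 0-byte target) indicates
    // an incomplete copy and should fail the load instead of being ignored.
    static void copyAndVerify(Path local, Path store) throws IOException {
        Files.copy(local, store, StandardCopyOption.REPLACE_EXISTING);
        long expected = Files.size(local);
        long actual = Files.size(store);
        if (expected != actual) {
            throw new IOException("Copy verification failed: expected "
                + expected + " bytes but found " + actual);
        }
    }
}
```

On HDFS the same idea applies with the FileSystem API instead of java.nio.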
[jira] [Resolved] (CARBONDATA-3477) Throw out exception when use sql: 'update table select\n...'
[ https://issues.apache.org/jira/browse/CARBONDATA-3477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang resolved CARBONDATA-3477. Resolution: Fixed > Throw out exception when use sql: 'update table select\n...' > > > Key: CARBONDATA-3477 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3477 > Project: CarbonData > Issue Type: Bug >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang >Priority: Minor > Time Spent: 4h 50m > Remaining Estimate: 0h > > When use below sql to update table: > {code:java} > UPDATE IUD_table2 a > SET (a.IUD_table2_country, a.IUD_table2_salary) = (select > b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where > b.IUD_table1_id = 8) > WHERE a.IUD_table2_id < 6 or a.IUD_table2_id > 15{code} > *It will throw out exception:* > {code:java} > Exception in thread "main" org.apache.spark.sql.AnalysisException: == Parse1 > == > mismatched input '.' expecting (line 2, pos 1) > == SQL == > select select > b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where > b.IUD_table1_id = 8 from iud_table2 a > -^^^ > == Parse2 == > [1.1] failure: identifier matching regex (?i)ALTER expected > select select > {code} > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
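The garbled rewrite in the error above ("select select ...") together with the newline in the issue title suggests the update command extracts the inner subquery with pattern matching that mishandles line breaks. A generic illustration of that pitfall with java.util.regex, entirely hypothetical relative to CarbonData's actual parser:

```java
import java.util.regex.Pattern;

public class MultilineMatch {

    // Extracting "= ( select ... )" with a regex: without DOTALL the dot
    // does not match '\n', so a subquery that spans lines is missed and a
    // later rewrite step can then produce garbage like "select select ...".
    static boolean containsSubquery(String sql, boolean dotAll) {
        int flags = Pattern.CASE_INSENSITIVE | (dotAll ? Pattern.DOTALL : 0);
        Pattern pattern = Pattern.compile("=\\s*\\((select.*)\\)", flags);
        return pattern.matcher(sql).find();
    }
}
```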
[jira] [Updated] (CARBONDATA-3477) Throw out exception when use sql: 'update table select\n...'
[ https://issues.apache.org/jira/browse/CARBONDATA-3477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang updated CARBONDATA-3477: --- Fix Version/s: 1.6.0 > Throw out exception when use sql: 'update table select\n...' > > > Key: CARBONDATA-3477 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3477 > Project: CarbonData > Issue Type: Bug >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang >Priority: Minor > Fix For: 1.6.0 > > Time Spent: 4h 50m > Remaining Estimate: 0h > > When use below sql to update table: > {code:java} > UPDATE IUD_table2 a > SET (a.IUD_table2_country, a.IUD_table2_salary) = (select > b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where > b.IUD_table1_id = 8) > WHERE a.IUD_table2_id < 6 or a.IUD_table2_id > 15{code} > *It will throw out exception:* > {code:java} > Exception in thread "main" org.apache.spark.sql.AnalysisException: == Parse1 > == > mismatched input '.' expecting (line 2, pos 1) > == SQL == > select select > b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where > b.IUD_table1_id = 8 from iud_table2 a > -^^^ > == Parse2 == > [1.1] failure: identifier matching regex (?i)ALTER expected > select select > {code} > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Resolved] (CARBONDATA-3483) Can not run horizontal compaction when execute update sql
[ https://issues.apache.org/jira/browse/CARBONDATA-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang resolved CARBONDATA-3483. Resolution: Fixed Fix Version/s: 1.6.0 > Can not run horizontal compaction when execute update sql > - > > Key: CARBONDATA-3483 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3483 > Project: CarbonData > Issue Type: Bug >Affects Versions: 1.5.3, 1.6.0, 1.5.4 >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang >Priority: Major > Fix For: 1.6.0 > > Time Spent: 5h 50m > Remaining Estimate: 0h > > After PR#3166, horizontal compaction does not actually run when executing an > update SQL. > When an update SQL runs and horizontal compaction is needed, it tries to > acquire update.lock and compaction.lock in > CarbonAlterTableCompactionCommand.alterTableForCompaction, but these two > locks are already held when the update SQL starts, so acquiring them fails > and compaction cannot run. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
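The lock interaction behind CARBONDATA-3483 can be sketched as follows; all class and method names here are hypothetical, and java.util.concurrent locks stand in for CarbonData's own table locks. The point is that a nested compaction step re-acquiring locks already held by the enclosing update only succeeds if the locks are reentrant, or if the "already locked" state is passed down:

```java
import java.util.concurrent.locks.ReentrantLock;

// Sketch of the deadlock-avoidance pattern (hypothetical names, not
// CarbonData's lock API). The update flow holds updateLock and
// compactionLock, then the nested compaction step tries to take the same
// locks again. With a non-reentrant tryLock that second acquisition fails
// and compaction is silently skipped; ReentrantLock counts per-thread
// acquisitions, so the nested step succeeds here.
public class NestedLockSketch {
    private final ReentrantLock updateLock = new ReentrantLock();
    private final ReentrantLock compactionLock = new ReentrantLock();

    public boolean runUpdateThenCompact() {
        updateLock.lock();
        compactionLock.lock();
        try {
            // Nested call re-acquires both locks on the same thread.
            return tryHorizontalCompaction();
        } finally {
            compactionLock.unlock();
            updateLock.unlock();
        }
    }

    private boolean tryHorizontalCompaction() {
        if (!updateLock.tryLock()) {
            return false; // would be the silently-skipped-compaction case
        }
        try {
            if (!compactionLock.tryLock()) {
                return false;
            }
            try {
                return true; // compaction would run here
            } finally {
                compactionLock.unlock();
            }
        } finally {
            updateLock.unlock();
        }
    }
}
```

With file-based (non-reentrant) table locks the equivalent fix is to skip re-acquisition when the caller already holds the locks.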
[jira] [Updated] (CARBONDATA-3488) Check the file size after move local file to carbon path
[ https://issues.apache.org/jira/browse/CARBONDATA-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang updated CARBONDATA-3488: --- Description: *Problem:* One user hit an issue: the row count saved in the carbonindex file is non-zero, but the file size of the corresponding carbondata file is 0. *Solution:* In CarbonUtil.copyCarbonDataFileToCarbonStorePath, check whether the file size of the carbon file is the same as the size of the local file after moving the local file to the carbon path. was: One user met an issue CarbonUtil.copyCarbonDataFileToCarbonStorePath > Check the file size after move local file to carbon path > > > Key: CARBONDATA-3488 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3488 > Project: CarbonData > Issue Type: Improvement >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang >Priority: Minor > > *Problem:* > One user hit an issue: the row count saved in the carbonindex file is non-zero, but > the file size of the corresponding carbondata file is 0. > > *Solution:* > In CarbonUtil.copyCarbonDataFileToCarbonStorePath, check whether the file size of the > carbon file is the same as the size of the local file after moving the local > file to the carbon path. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (CARBONDATA-3488) Check the file size after move local file to carbon path
[ https://issues.apache.org/jira/browse/CARBONDATA-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang updated CARBONDATA-3488: --- Description: One user met an issue CarbonUtil.copyCarbonDataFileToCarbonStorePath > Check the file size after move local file to carbon path > > > Key: CARBONDATA-3488 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3488 > Project: CarbonData > Issue Type: Improvement >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang >Priority: Minor > > One user met an issue > > CarbonUtil.copyCarbonDataFileToCarbonStorePath -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (CARBONDATA-3488) Check the file size after move local file to carbon path
Zhichao Zhang created CARBONDATA-3488: -- Summary: Check the file size after move local file to carbon path Key: CARBONDATA-3488 URL: https://issues.apache.org/jira/browse/CARBONDATA-3488 Project: CarbonData Issue Type: Improvement Reporter: Zhichao Zhang Assignee: Zhichao Zhang -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (CARBONDATA-3476) Read time and scan time stats shown wrong in executor log for filter query
[ https://issues.apache.org/jira/browse/CARBONDATA-3476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16900718#comment-16900718 ] Zhichao Zhang commented on CARBONDATA-3476: What's your account? I cannot find a user called 'Vikram Ahuja'. > Read time and scan time stats shown wrong in executor log for filter query > -- > > Key: CARBONDATA-3476 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3476 > Project: CarbonData > Issue Type: Bug > Components: core >Reporter: Vikram Ahuja >Priority: Minor > Fix For: 1.6.0 > > Time Spent: 8h 20m > Remaining Estimate: 0h > > Problem: Read time and scan time stats are shown wrong in the executor log for filter > queries. > Root cause: Projection read time was added into scan time; because of this, the scan > time and read time in the stats are not correct. > Solution: Add the projection read time for both measure and dimension columns to the > read stats. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (CARBONDATA-3483) Can not run horizontal compaction when execute update sql
[ https://issues.apache.org/jira/browse/CARBONDATA-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang updated CARBONDATA-3483: --- Priority: Major (was: Minor) > Can not run horizontal compaction when execute update sql > - > > Key: CARBONDATA-3483 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3483 > Project: CarbonData > Issue Type: Bug >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang >Priority: Major > Time Spent: 3h 20m > Remaining Estimate: 0h > > After PR#3166, horizontal compaction does not actually run when executing an > update SQL. > When an update SQL runs and horizontal compaction is needed, it tries to > acquire update.lock and compaction.lock in > CarbonAlterTableCompactionCommand.alterTableForCompaction, but these two > locks are already held when the update SQL starts, so acquiring them fails > and compaction cannot run. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (CARBONDATA-3483) Can not run horizontal compaction when execute update sql
[ https://issues.apache.org/jira/browse/CARBONDATA-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang updated CARBONDATA-3483: --- Affects Version/s: 1.6.0 1.5.3 1.5.4 > Can not run horizontal compaction when execute update sql > - > > Key: CARBONDATA-3483 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3483 > Project: CarbonData > Issue Type: Bug >Affects Versions: 1.5.3, 1.6.0, 1.5.4 >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang >Priority: Major > Time Spent: 3h 20m > Remaining Estimate: 0h > > After PR#3166, horizontal compaction does not actually run when executing an > update SQL. > When an update SQL runs and horizontal compaction is needed, it tries to > acquire update.lock and compaction.lock in > CarbonAlterTableCompactionCommand.alterTableForCompaction, but these two > locks are already held when the update SQL starts, so acquiring them fails > and compaction cannot run. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (CARBONDATA-3483) Can not run horizontal compaction when execute update sql
[ https://issues.apache.org/jira/browse/CARBONDATA-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang updated CARBONDATA-3483: --- Description: After PR#3166, horizontal compaction does not actually run when executing an update SQL. When an update SQL runs and horizontal compaction is needed, it tries to acquire update.lock and compaction.lock in CarbonAlterTableCompactionCommand.alterTableForCompaction, but these two locks are already held when the update SQL starts, so acquiring them fails and compaction cannot run. was: After PR#3166, horizontal compaction does not actually run when executing an update SQL. When an update SQL runs and horizontal compaction is needed, it tries to acquire update.lock and compaction.lock in CarbonAlterTableCompactionCommand.alterTableForCompaction, but these two locks are already held when the update SQL starts, so acquiring them fails and compaction does not run. > Can not run horizontal compaction when execute update sql > - > > Key: CARBONDATA-3483 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3483 > Project: CarbonData > Issue Type: Bug >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang >Priority: Minor > > After PR#3166, horizontal compaction does not actually run when executing an > update SQL. > When an update SQL runs and horizontal compaction is needed, it tries to > acquire update.lock and compaction.lock in > CarbonAlterTableCompactionCommand.alterTableForCompaction, but these two > locks are already held when the update SQL starts, so acquiring them fails > and compaction cannot run. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (CARBONDATA-3483) Can not run horizontal compaction when execute update sql
[ https://issues.apache.org/jira/browse/CARBONDATA-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang updated CARBONDATA-3483: --- Summary: Can not run horizontal compaction when execute update sql (was: Do not run horizontal compaction when execute update sql) > Can not run horizontal compaction when execute update sql > - > > Key: CARBONDATA-3483 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3483 > Project: CarbonData > Issue Type: Bug >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang >Priority: Minor > > After PR#3166, horizontal compaction will not actually run when execute > update sql. > When it runs update sql and will run horizontal compaction if needs, it will > require update.lock and compaction.lock when execute > CarbonAlterTableCompactionCommand.alterTableForCompaction, but these two > locks already are locked when it starts to execute update sql. so it will > require locks failed and don't execute compaction. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (CARBONDATA-3483) Do not run horizontal compaction when execute update sql
Zhichao Zhang created CARBONDATA-3483: -- Summary: Do not run horizontal compaction when execute update sql Key: CARBONDATA-3483 URL: https://issues.apache.org/jira/browse/CARBONDATA-3483 Project: CarbonData Issue Type: Bug Reporter: Zhichao Zhang Assignee: Zhichao Zhang After PR#3166, horizontal compaction does not actually run when executing an update SQL. When an update SQL runs and horizontal compaction is needed, it tries to acquire update.lock and compaction.lock in CarbonAlterTableCompactionCommand.alterTableForCompaction, but these two locks are already held when the update SQL starts, so acquiring them fails and compaction does not run. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Closed] (CARBONDATA-1625) Introduce new datatype of varchar(size) to store column length more than short limit.
[ https://issues.apache.org/jira/browse/CARBONDATA-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang closed CARBONDATA-1625. -- Resolution: Duplicate This feature is already supported > Introduce new datatype of varchar(size) to store column length more than > short limit. > -- > > Key: CARBONDATA-1625 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1625 > Project: CarbonData > Issue Type: New Feature > Components: file-format >Reporter: Zhichao Zhang >Priority: Minor > > I am using Spark 2.1 + CarbonData 1.2, and find that if > enable.unsafe.sort=true and the byte length of a column exceeds 32768, data > loading fails. > My test code: > > {code:java} > val longStr = sb.toString() // the getBytes length of longStr exceeds 32768 > println(longStr.length()) > println(longStr.getBytes("UTF-8").length) > > import spark.implicits._ > val df1 = spark.sparkContext.parallelize(0 to 1000) > .map(x => ("a", x.toString(), longStr, x, x.toLong, x * 2)) > .toDF("stringField1", "stringField2", "stringField3", "intField", > "longField", "int2Field") > > val df2 = spark.sparkContext.parallelize(1001 to 2000) > .map(x => ("b", x.toString(), (x % 2).toString(), x, x.toLong, x * 2)) > .toDF("stringField1", "stringField2", "stringField3", "intField", > "longField", "int2Field") > > val df3 = df1.union(df2) > val tableName = "study_carbondata_test" > spark.sql(s"DROP TABLE IF EXISTS ${tableName} ").show() > val sortScope = "LOCAL_SORT" // LOCAL_SORT GLOBAL_SORT > spark.sql(s""" > | CREATE TABLE IF NOT EXISTS ${tableName} ( > |stringField1 string, > |stringField2 string, > |stringField3 string, > |intField int, > |longField bigint, > |int2Field int > | ) > | STORED BY 'carbondata' > | TBLPROPERTIES('DICTIONARY_INCLUDE'='stringField1, stringField2', > |'SORT_COLUMNS'='stringField1, stringField2, intField, > longField', > |'SORT_SCOPE'='${sortScope}', > |'NO_INVERTED_INDEX'='stringField3, int2Field', > |'TABLE_BLOCKSIZE'='64' > | ) 
>""".stripMargin) > df3.write > .format("carbondata") > .option("tableName", "study_carbondata_test") > .option("compress", "true") // just valid when tempCSV is true > .option("tempCSV", "false") > .option("single_pass", "true") > .mode(SaveMode.Append) > .save() > {code} > The error message: > {code:java} > *java.lang.NegativeArraySizeException > at > org.apache.carbondata.processing.newflow.sort.unsafe.UnsafeCarbonRowPage.getRow(UnsafeCarbonRowPage.java:182) > > at > org.apache.carbondata.processing.newflow.sort.unsafe.holder.UnsafeInmemoryHolder.readRow(UnsafeInmemoryHolder.java:63) > > at > org.apache.carbondata.processing.newflow.sort.unsafe.merger.UnsafeSingleThreadFinalSortFilesMerger.startSorting(UnsafeSingleThreadFinalSortFilesMerger.java:114) > > at > org.apache.carbondata.processing.newflow.sort.unsafe.merger.UnsafeSingleThreadFinalSortFilesMerger.startFinalMerge(UnsafeSingleThreadFinalSortFilesMerger.java:81) > > at > org.apache.carbondata.processing.newflow.sort.impl.UnsafeParallelReadMergeSorterImpl.sort(UnsafeParallelReadMergeSorterImpl.java:105) > > at > org.apache.carbondata.processing.newflow.steps.SortProcessorStepImpl.execute(SortProcessorStepImpl.java:62) > > at > org.apache.carbondata.processing.newflow.steps.DataWriterProcessorStepImpl.execute(DataWriterProcessorStepImpl.java:87) > > at > org.apache.carbondata.processing.newflow.DataLoadExecutor.execute(DataLoadExecutor.java:51) > > at > org.apache.carbondata.spark.rdd.NewDataFrameLoaderRDD$$anon$2.(NewCarbonDataLoadRDD.scala:442) > > at > org.apache.carbondata.spark.rdd.NewDataFrameLoaderRDD.internalCompute(NewCarbonDataLoadRDD.scala:405) > > at > org.apache.carbondata.spark.rdd.CarbonRDD.compute(CarbonRDD.scala:62) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323) > at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)* > {code} > Currently, the length of column was stored by short type. 
> Introduce new datatype of varchar(size) to store column length more than > short limit. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
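The short-type limit described in CARBONDATA-1625 can be illustrated with a small self-contained check (fitsInShortLength and utf8Length are hypothetical helpers for illustration, not CarbonData API): a value whose UTF-8 encoding exceeds Short.MAX_VALUE (32767) bytes cannot have its length stored in a 2-byte short, which is consistent with the NegativeArraySizeException seen during load.

```java
import java.nio.charset.StandardCharsets;

// Illustration of the limit behind the report above: when a column value's
// length is stored as a 2-byte short, any value whose UTF-8 encoding
// exceeds Short.MAX_VALUE (32767) bytes overflows the length field to a
// negative number. These helpers are hypothetical, not CarbonData API.
public class VarcharLimit {

    // Byte length of the string's UTF-8 encoding (what getBytes("UTF-8")
    // measures in the reporter's repro).
    public static int utf8Length(String value) {
        return value.getBytes(StandardCharsets.UTF_8).length;
    }

    // True if the length still fits in a signed 16-bit short.
    public static boolean fitsInShortLength(String value) {
        return utf8Length(value) <= Short.MAX_VALUE;
    }
}
```

Note that char count and byte count differ for non-ASCII text, so a string well under 32768 characters can still overflow the limit once encoded.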
[jira] [Updated] (CARBONDATA-3477) Throw out exception when use sql: 'update table select\n...'
[ https://issues.apache.org/jira/browse/CARBONDATA-3477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang updated CARBONDATA-3477: --- Description: When use below sql to update table: {code:java} UPDATE IUD_table2 a SET (a.IUD_table2_country, a.IUD_table2_salary) = (select b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where b.IUD_table1_id = 8) WHERE a.IUD_table2_id < 6 or a.IUD_table2_id > 15{code} *It will throw out exception:* {code:java} Exception in thread "main" org.apache.spark.sql.AnalysisException: == Parse1 == mismatched input '.' expecting (line 2, pos 1) == SQL == select select b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where b.IUD_table1_id = 8 from iud_table2 a -^^^ == Parse2 == [1.1] failure: identifier matching regex (?i)ALTER expected select select {code} was: When use below sql to update table: {code:java} UPDATE IUD_table2 a SET (a.IUD_table2_country, a.IUD_table2_salary) = (select b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where b.IUD_table1_id = 8) WHERE a.IUD_table2_id < 6 or a.IUD_table2_id > 15{code} *It will throw out exception:* {code:java} Exception in thread "main" org.apache.spark.sql.AnalysisException: == Parse1 == mismatched input '.' expecting (line 2, pos 1) == SQL == select select b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where b.IUD_table1_id = 8 from iud_table2 a -^^^ == Parse2 == [1.1] failure: identifier matching regex (?i)ALTER expected select select {code} > Throw out exception when use sql: 'update table select\n...' 
> > > Key: CARBONDATA-3477 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3477 > Project: CarbonData > Issue Type: Bug >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang >Priority: Minor > > When use below sql to update table: > {code:java} > UPDATE IUD_table2 a > SET (a.IUD_table2_country, a.IUD_table2_salary) = (select > b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where > b.IUD_table1_id = 8) > WHERE a.IUD_table2_id < 6 or a.IUD_table2_id > 15{code} > *It will throw out exception:* > {code:java} > Exception in thread "main" org.apache.spark.sql.AnalysisException: == Parse1 > == > mismatched input '.' expecting (line 2, pos 1) > == SQL == > select select > b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where > b.IUD_table1_id = 8 from iud_table2 a > -^^^ > == Parse2 == > [1.1] failure: identifier matching regex (?i)ALTER expected > select select > {code} > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (CARBONDATA-3477) Throw out exception when use sql: 'update table select\n...'
Zhichao Zhang created CARBONDATA-3477: -- Summary: Throw out exception when use sql: 'update table select\n...' Key: CARBONDATA-3477 URL: https://issues.apache.org/jira/browse/CARBONDATA-3477 Project: CarbonData Issue Type: Bug Reporter: Zhichao Zhang Assignee: Zhichao Zhang When use below sql to update table: UPDATE IUD_table2 a SET (a.IUD_table2_country, a.IUD_table2_salary) = (select b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where b.IUD_table1_id = 8) WHERE a.IUD_table2_id < 6 or a.IUD_table2_id > 15 It will throw out exception: Exception in thread "main" org.apache.spark.sql.AnalysisException: == Parse1 == mismatched input '.' expecting (line 2, pos 1) == SQL == select select b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where b.IUD_table1_id = 8 from iud_table2 a -^^^ == Parse2 == [1.1] failure: identifier matching regex (?i)ALTER expected select select -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Updated] (CARBONDATA-3477) Throw out exception when use sql: 'update table select\n...'
[ https://issues.apache.org/jira/browse/CARBONDATA-3477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang updated CARBONDATA-3477: --- Description: When use below sql to update table: {code:java} UPDATE IUD_table2 a SET (a.IUD_table2_country, a.IUD_table2_salary) = (select b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where b.IUD_table1_id = 8) WHERE a.IUD_table2_id < 6 or a.IUD_table2_id > 15{code} *It will throw out exception:* {code:java} Exception in thread "main" org.apache.spark.sql.AnalysisException: == Parse1 == mismatched input '.' expecting (line 2, pos 1) == SQL == select select b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where b.IUD_table1_id = 8 from iud_table2 a -^^^ == Parse2 == [1.1] failure: identifier matching regex (?i)ALTER expected select select {code} was: When use below sql to update table: UPDATE IUD_table2 a SET (a.IUD_table2_country, a.IUD_table2_salary) = (select b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where b.IUD_table1_id = 8) WHERE a.IUD_table2_id < 6 or a.IUD_table2_id > 15 It will throw out exception: Exception in thread "main" org.apache.spark.sql.AnalysisException: == Parse1 == mismatched input '.' expecting (line 2, pos 1) == SQL == select select b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where b.IUD_table1_id = 8 from iud_table2 a -^^^ == Parse2 == [1.1] failure: identifier matching regex (?i)ALTER expected select select > Throw out exception when use sql: 'update table select\n...' 
> > > Key: CARBONDATA-3477 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3477 > Project: CarbonData > Issue Type: Bug >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang >Priority: Minor > > When use below sql to update table: > {code:java} > UPDATE IUD_table2 a > SET (a.IUD_table2_country, a.IUD_table2_salary) = (select > b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where > b.IUD_table1_id = 8) > WHERE a.IUD_table2_id < 6 or a.IUD_table2_id > 15{code} > *It will throw out exception:* > > {code:java} > Exception in thread "main" org.apache.spark.sql.AnalysisException: == Parse1 > == > mismatched input '.' expecting (line 2, pos 1) > == SQL == > select select > b.IUD_table1_country, b.IUD_table1_salary from IUD_table1 b where > b.IUD_table1_id = 8 from iud_table2 a > -^^^ > == Parse2 == > [1.1] failure: identifier matching regex (?i)ALTER expected > select select > {code} > -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (CARBONDATA-3469) CarbonData with 2.3.2 can not run on CDH spark 2.4
[ https://issues.apache.org/jira/browse/CARBONDATA-3469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16893380#comment-16893380 ] Zhichao Zhang commented on CARBONDATA-3469: [~imperio] I think it can't work with Spark 2.4; some Spark interfaces may have changed. The community will integrate with Spark 2.4 later. > CarbonData with 2.3.2 can not run on CDH spark 2.4 > -- > > Key: CARBONDATA-3469 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3469 > Project: CarbonData > Issue Type: Bug > Components: spark-integration >Affects Versions: 1.5.3 >Reporter: wxmimperio >Priority: Major > > *{color:#33}spark2-shell --jars > [apache-carbondata-1.5.3-bin-spark2.3.2-hadoop2.7.2.jar|https://dist.apache.org/repos/dist/release/carbondata/1.5.3/apache-carbondata-1.5.3-bin-spark2.3.2-hadoop2.7.2.jar]{color}* > > {code:java} > java.lang.NoSuchMethodError: > org.apache.spark.sql.internal.SharedState.externalCatalog()Lorg/apache/spark/sql/catalyst/catalog/ExternalCatalog;{code} > {code:java} > scala> carbon.sql( > | s""" > | | CREATE TABLE IF NOT EXISTS test_table( > | | id string, > | | name string, > | | city string, > | | age Int) > | | STORED AS carbondata > | """.stripMargin) > java.lang.NoSuchMethodError: > org.apache.spark.sql.internal.SharedState.externalCatalog()Lorg/apache/spark/sql/catalyst/catalog/ExternalCatalog; > at > org.apache.spark.sql.hive.CarbonSessionStateBuilder.externalCatalog(CarbonSessionState.scala:227) > at > org.apache.spark.sql.hive.CarbonSessionStateBuilder.catalog$lzycompute(CarbonSessionState.scala:214) > at > org.apache.spark.sql.hive.CarbonSessionStateBuilder.catalog(CarbonSessionState.scala:212) > at > org.apache.spark.sql.hive.CarbonSessionStateBuilder.catalog(CarbonSessionState.scala:191) > at > org.apache.spark.sql.internal.BaseSessionStateBuilder$$anonfun$build$1.apply(BaseSessionStateBuilder.scala:291) > at > 
org.apache.spark.sql.internal.BaseSessionStateBuilder$$anonfun$build$1.apply(BaseSessionStateBuilder.scala:291) > at > org.apache.spark.sql.internal.SessionState.catalog$lzycompute(SessionState.scala:77) > at org.apache.spark.sql.internal.SessionState.catalog(SessionState.scala:77) > at org.apache.spark.sql.CarbonEnv$.getInstance(CarbonEnv.scala:135) > at > org.apache.spark.sql.CarbonSession$.updateSessionInfoToCurrentThread(CarbonSession.scala:326) > at > org.apache.spark.sql.parser.CarbonSparkSqlParser.parsePlan(CarbonSparkSqlParser.scala:47) > at org.apache.spark.sql.CarbonSession.withProfiler(CarbonSession.scala:125) > at org.apache.spark.sql.CarbonSession.sql(CarbonSession.scala:88) > ... 59 elided > {code} -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Commented] (CARBONDATA-3471) Spark query carbondata error reporting
[ https://issues.apache.org/jira/browse/CARBONDATA-3471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886819#comment-16886819 ] Zhichao Zhang commented on CARBONDATA-3471: [~tianyouyangying], you found that the content of file 'Metadata/segments/1471_time.segment' is below: {"locationMap":\{"/Fact/Part0/Segment_1471":{"files":[],"partitions":[],"status":"Success","mergeFileName":"1471_1562963281071.carbonindexmerge","isRelative":true}}} but in the dir Fact/Part0/Segment_1471, there is just a file '1471_1562963281071.carbonindexmerge', no carbondata file, right? > Spark query carbondata error reporting > -- > > Key: CARBONDATA-3471 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3471 > Project: CarbonData > Issue Type: Bug > Components: data-query >Affects Versions: 1.5.3 > Environment: cdh5.14.x spark2.3.2 hadoop2.6 >Reporter: tianyou >Priority: Major > > Data tables are stored every hour ,delete segment clean file for this table > every night. > It has been running steadily for more than a month. > But:Now query for error reporting. 
> error: > caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0 > at java.util.ArrayList.rangeCheck(ArrayList.java:657) > at java.util.ArrayList.get(ArrayList.java:433) > at > org.apache.carbondata.core.indexstore.blockletindex.BlockletDataMapFactory.getSegmentProperties(BlockletDataMapFactory.java:376) > at > org.apache.carbondata.core.datamap.TableDataMap.pruneWithFilter(TableDataMap.java:195) > at > org.apache.carbondata.core.datamap.TableDataMap.prune(TableDataMap.java:171) > at > org.apache.carbondata.hadoop.api.CarbonInputFormat.getPrunedBlocklets(CarbonInputFormat.java:491) > at > org.apache.carbondata.hadoop.api.CarbonInputFormat.getDataBlocksOfSegment(CarbonInputFormat.java:414) > at > org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getSplits(CarbonTableInputFormat.java:494) > at > org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getSplits(CarbonTableInputFormat.java:218) > at > org.apache.carbondata.spark.rdd.CarbonScanRDD.internalGetPartitions(CarbonScanRDD.scala:129) > at > org.apache.carbondata.spark.rdd.CarbonRDD.getPartitions(CarbonRDD.scala:66) > at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253) > at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251) > at scala.Option.getOrElse(Option.scala:121) > at org.apache.spark.rdd.RDD.partitions(RDD.scala:251) > at > org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:46) > at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253) > at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251) > at scala.Option.getOrElse(Option.scala:121) > at org.apache.spark.rdd.RDD.partitions(RDD.scala:251) > at > org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:46) > at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253) > at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251) > at scala.Option.g -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Closed] (CARBONDATA-3324) The parameter CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT and CarbonCommonConstants.CARBON_DATE_FORMAT don't have effect in streaming table writing
[ https://issues.apache.org/jira/browse/CARBONDATA-3324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang closed CARBONDATA-3324. -- Resolution: Duplicate duplicate with CARBONDATA-3325 > The parameter CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT and > CarbonCommonConstants.CARBON_DATE_FORMAT don't have effect in streaming table > writing > - > > Key: CARBONDATA-3324 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3324 > Project: CarbonData > Issue Type: Bug > Components: spark-integration >Affects Versions: 1.5.2 >Reporter: jocean.shi >Priority: Major > Fix For: 1.5.2 > > Original Estimate: 96h > Remaining Estimate: 96h > > When I write data into streaming table. > I use such code: > CarbonProperties.getInstance() > .addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "/MM/dd > HH:mm:ss") > .addProperty(CarbonCommonConstants.CARBON_DATE_FORMAT, "/MM/dd") > > but don't have effect -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CARBONDATA-3325) The parameter CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT and CarbonCommonConstants.CARBON_DATE_FORMAT don't have effect in streaming table writing
[ https://issues.apache.org/jira/browse/CARBONDATA-3325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800734#comment-16800734 ] Zhichao Zhang commented on CARBONDATA-3325: We still can not add permission for you :), you can raise a pr to fix this issue first. > The parameter CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT and > CarbonCommonConstants.CARBON_DATE_FORMAT don't have effect in streaming table > writing > - > > Key: CARBONDATA-3325 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3325 > Project: CarbonData > Issue Type: Bug > Components: spark-integration >Affects Versions: 1.5.2 >Reporter: jocean.shi >Priority: Major > Fix For: 1.5.2 > > Original Estimate: 96h > Remaining Estimate: 96h > > When I write data into streaming table. > I use such code: > CarbonProperties.getInstance() > .addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "/MM/dd > HH:mm:ss") > .addProperty(CarbonCommonConstants.CARBON_DATE_FORMAT, "/MM/dd") > > but don't have effect -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CARBONDATA-3325) The parameter CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT and CarbonCommonConstants.CARBON_DATE_FORMAT don't have effect in streaming table writing
[ https://issues.apache.org/jira/browse/CARBONDATA-3325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800457#comment-16800457 ] Zhichao Zhang commented on CARBONDATA-3325: [~Jocean], please confirm that your account and email are correct; we can't assign the issue to your account. > The parameter CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT and > CarbonCommonConstants.CARBON_DATE_FORMAT don't have effect in streaming table > writing > - > > Key: CARBONDATA-3325 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3325 > Project: CarbonData > Issue Type: Bug > Components: spark-integration >Affects Versions: 1.5.2 >Reporter: jocean@gmail.com >Priority: Major > Fix For: 1.5.2 > > Original Estimate: 96h > Remaining Estimate: 96h > > When I write data into streaming table. > I use such code: > CarbonProperties.getInstance() > .addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "/MM/dd > HH:mm:ss") > .addProperty(CarbonCommonConstants.CARBON_DATE_FORMAT, "/MM/dd") > > but don't have effect -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (CARBONDATA-3325) The parameter CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT and CarbonCommonConstants.CARBON_DATE_FORMAT don't have effect in streaming table writing
[ https://issues.apache.org/jira/browse/CARBONDATA-3325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800339#comment-16800339 ] Zhichao Zhang edited comment on CARBONDATA-3325 at 3/25/19 1:36 AM: - [~Jocean] are you sure your email address is correct? gmail or gamil? I can't assign it to you either. was (Author: zzcclp): [~Jocean] are you sure your email address is correct? gmail or gamil? > The parameter CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT and > CarbonCommonConstants.CARBON_DATE_FORMAT don't have effect in streaming table > writing > - > > Key: CARBONDATA-3325 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3325 > Project: CarbonData > Issue Type: Bug > Components: spark-integration >Affects Versions: 1.5.2 >Reporter: jocean@gamil.com >Priority: Major > Fix For: 1.5.2 > > Original Estimate: 96h > Remaining Estimate: 96h > > When I write data into streaming table. > I use such code: > CarbonProperties.getInstance() > .addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "/MM/dd > HH:mm:ss") > .addProperty(CarbonCommonConstants.CARBON_DATE_FORMAT, "/MM/dd") > > but don't have effect -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CARBONDATA-3325) The parameter CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT and CarbonCommonConstants.CARBON_DATE_FORMAT don't have effect in streaming table writing
[ https://issues.apache.org/jira/browse/CARBONDATA-3325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800339#comment-16800339 ] Zhichao Zhang commented on CARBONDATA-3325: [~Jocean] are you sure your email address is correct? gmail or gamil? > The parameter CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT and > CarbonCommonConstants.CARBON_DATE_FORMAT don't have effect in streaming table > writing > - > > Key: CARBONDATA-3325 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3325 > Project: CarbonData > Issue Type: Bug > Components: spark-integration >Affects Versions: 1.5.2 >Reporter: jocean@gamil.com >Priority: Major > Fix For: 1.5.2 > > Original Estimate: 96h > Remaining Estimate: 96h > > When I write data into streaming table. > I use such code: > CarbonProperties.getInstance() > .addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "/MM/dd > HH:mm:ss") > .addProperty(CarbonCommonConstants.CARBON_DATE_FORMAT, "/MM/dd") > > but don't have effect -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CARBONDATA-3325) The parameter CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT and CarbonCommonConstants.CARBON_DATE_FORMAT don't have effect in streaming table writing
[ https://issues.apache.org/jira/browse/CARBONDATA-3325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16799868#comment-16799868 ] Zhichao Zhang commented on CARBONDATA-3325: Yes, you can submit a PR to fix this first, and please give me the email address registered for your Jira account. > The parameter CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT and > CarbonCommonConstants.CARBON_DATE_FORMAT don't have effect in streaming table > writing > - > > Key: CARBONDATA-3325 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3325 > Project: CarbonData > Issue Type: Bug > Components: spark-integration >Affects Versions: 1.5.2 >Reporter: jocean@gamil.com >Priority: Major > Fix For: 1.5.2 > > Original Estimate: 96h > Remaining Estimate: 96h > > When I write data into streaming table. > I use such code: > CarbonProperties.getInstance() > .addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "/MM/dd > HH:mm:ss") > .addProperty(CarbonCommonConstants.CARBON_DATE_FORMAT, "/MM/dd") > > but don't have effect -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CARBONDATA-3325) The parameter CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT and CarbonCommonConstants.CARBON_DATE_FORMAT don't have effect in streaming table writing
[ https://issues.apache.org/jira/browse/CARBONDATA-3325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16799548#comment-16799548 ] Zhichao Zhang commented on CARBONDATA-3325: You can try to use this way: spark.readStream .format("socket") .option("host", "localhost") .option("port", 9099) .option("timestampformat", "-MM-dd HH:mm:ss") .option("dateformat", "-MM-dd HH:mm:ss") > The parameter CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT and > CarbonCommonConstants.CARBON_DATE_FORMAT don't have effect in streaming table > writing > - > > Key: CARBONDATA-3325 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3325 > Project: CarbonData > Issue Type: Bug > Components: spark-integration >Affects Versions: 1.5.2 >Reporter: jocean@gamil.com >Priority: Major > Fix For: 1.5.2 > > Original Estimate: 96h > Remaining Estimate: 96h > > When I write data into streaming table. > I use such code: > CarbonProperties.getInstance() > .addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, "/MM/dd > HH:mm:ss") > .addProperty(CarbonCommonConstants.CARBON_DATE_FORMAT, "/MM/dd") > > but don't have effect -- This message was sent by Atlassian JIRA (v7.6.3#76005)
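The date patterns quoted in this thread appear to have lost their leading year token in the mail archive; a timestamp format such as "/MM/dd HH:mm:ss" only makes sense with a "yyyy" prefix. As a hedged illustration (plain JDK `SimpleDateFormat`, independent of CarbonData; the full patterns below are an assumption, not the thread's exact values), the presumed formats parse and round-trip like this:

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class Main {
    // Assumed full patterns: the archived comments show them without the
    // leading "yyyy", so these are a best guess for illustration only.
    static final String TIMESTAMP_FORMAT = "yyyy-MM-dd HH:mm:ss";
    static final String DATE_FORMAT = "yyyy-MM-dd";

    // Parse a value with the given pattern and format it back, so a
    // mismatched pattern fails loudly instead of being silently ignored.
    static String roundTrip(String value, String pattern) throws ParseException {
        SimpleDateFormat fmt = new SimpleDateFormat(pattern);
        fmt.setLenient(false); // reject values that don't match the pattern exactly
        Date parsed = fmt.parse(value);
        return fmt.format(parsed);
    }

    public static void main(String[] args) throws ParseException {
        System.out.println(roundTrip("2019-03-25 01:36:00", TIMESTAMP_FORMAT)); // 2019-03-25 01:36:00
        System.out.println(roundTrip("2019-03-25", DATE_FORMAT));               // 2019-03-25
    }
}
```

A mistyped pattern (the bug reported here) would surface as a `ParseException` with `setLenient(false)` rather than producing wrong data.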
[jira] [Created] (CARBONDATA-3317) Executing 'show segments' command throws NPE when spark streaming app writes data to a new stream segment.
Zhichao Zhang created CARBONDATA-3317: -- Summary: Executing 'show segments' command throws NPE when spark streaming app writes data to a new stream segment. Key: CARBONDATA-3317 URL: https://issues.apache.org/jira/browse/CARBONDATA-3317 Project: CarbonData Issue Type: Bug Components: spark-integration Affects Versions: 1.6.0 Reporter: Zhichao Zhang Assignee: Zhichao Zhang Fix For: 1.6.0 When a Spark streaming app starts to create a new stream segment, it does not create the carbondata index file until data has been written successfully; if the 'show segments' command is executed at that point, it throws an NPE. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (CARBONDATA-2595) Reformat the output of command 'desc formatted table_name'
[ https://issues.apache.org/jira/browse/CARBONDATA-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang closed CARBONDATA-2595. -- Resolution: Fixed > Reformat the output of command 'desc formatted table_name' > -- > > Key: CARBONDATA-2595 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2595 > Project: CarbonData > Issue Type: Improvement > Components: sql >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang >Priority: Minor > Attachments: desc_formatted.txt, desc_formatted_external.txt > > Time Spent: 2h > Remaining Estimate: 0h > > According to the discussion in > [topic|http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Change-the-comment-content-for-column-when-execute-command-desc-formatted-table-name-td46848.html], > reformat the output of command 'desc formatted table_name'. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-3120) apache-carbondata-1.5.1-rc1.tar.gz Datamap's core and plan project, pom.xml, is version 1.5.0, which results in an inability to compile properly
[ https://issues.apache.org/jira/browse/CARBONDATA-3120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang resolved CARBONDATA-3120. Resolution: Fixed > apache-carbondata-1.5.1-rc1.tar.gz Datamap's core and plan project, pom.xml, > is version 1.5.0, which results in an inability to compile properly > > > Key: CARBONDATA-3120 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3120 > Project: CarbonData > Issue Type: Bug > Components: build >Affects Versions: 1.5.1 > Environment: MacOS > apache-carbondata-1.5.1-rc1 >Reporter: Jonathan.Wei >Priority: Major > Fix For: 1.5.1 > > Original Estimate: 1h > Time Spent: 4h 40m > Remaining Estimate: 0h > > Hi, guys! > I downloaded apache-carbondata-1.5.1-rc1.tar.gz. > After decompression, the datamap mv/core and mv/plan projects were added to > the main pom for compilation, but the compilation failed. > > LOG: > {code:java} > [ERROR] [ERROR] Some problems were encountered while processing the POMs: > [FATAL] Non-resolvable parent POM for > org.apache.carbondata:carbondata-mv-core:[unknown-version]: Could not find > artifact org.apache.carbondata:carbondata-parent:pom:1.5.0-SNAPSHOT and > 'parent.relativePath' points at wrong local POM @ line 22, column 11 > [FATAL] Non-resolvable parent POM for > org.apache.carbondata:carbondata-mv-plan:[unknown-version]: Could not find > artifact org.apache.carbondata:carbondata-parent:pom:1.5.0-SNAPSHOT and > 'parent.relativePath' points at wrong local POM @ line 22, column 11 > [WARNING] 'build.plugins.plugin.version' for > com.ning.maven.plugins:maven-duplicate-finder-plugin is missing. @ > org.apache.carbondata:carbondata-presto:[unknown-version], > /Users/jonathanwei/summary/carbondata/carbondata-apache-carbondata-1.5.1-rc1/integration/presto/pom.xml, > line 620, column 15 > [WARNING] 'build.plugins.plugin.version' for > pl.project13.maven:git-commit-id-plugin is missing. 
@ > org.apache.carbondata:carbondata-presto:[unknown-version], > /Users/jonathanwei/summary/carbondata/carbondata-apache-carbondata-1.5.1-rc1/integration/presto/pom.xml, > line 633, column 15 > [WARNING] 'build.plugins.plugin.version' for > com.ning.maven.plugins:maven-duplicate-finder-plugin is missing. @ > org.apache.carbondata:carbondata-examples-spark2:[unknown-version], > /Users/jonathanwei/summary/carbondata/carbondata-apache-carbondata-1.5.1-rc1/examples/spark2/pom.xml, > line 184, column 15 > @ > [ERROR] The build could not read 2 projects -> [Help 1] > [ERROR] > [ERROR] The project > org.apache.carbondata:carbondata-mv-core:[unknown-version] > (/Users/jonathanwei/summary/carbondata/carbondata-apache-carbondata-1.5.1-rc1/datamap/mv/core/pom.xml) > has 1 error > [ERROR] Non-resolvable parent POM for > org.apache.carbondata:carbondata-mv-core:[unknown-version]: Could not find > artifact org.apache.carbondata:carbondata-parent:pom:1.5.0-SNAPSHOT and > 'parent.relativePath' points at wrong local POM @ line 22, column 11 -> [Help > 2] > [ERROR] > [ERROR] The project > org.apache.carbondata:carbondata-mv-plan:[unknown-version] > (/Users/jonathanwei/summary/carbondata/carbondata-apache-carbondata-1.5.1-rc1/datamap/mv/plan/pom.xml) > has 1 error > [ERROR] Non-resolvable parent POM for > org.apache.carbondata:carbondata-mv-plan:[unknown-version]: Could not find > artifact org.apache.carbondata:carbondata-parent:pom:1.5.0-SNAPSHOT and > 'parent.relativePath' points at wrong local POM @ line 22, column 11 -> [Help > 2] > [ERROR] > [ERROR] To see the full stack trace of the errors, re-run Maven with the -e > switch. > [ERROR] Re-run Maven using the -X switch to enable full debug logging. 
> [ERROR] > [ERROR] For more information about the errors and possible solutions, please > read the following articles: > [ERROR] [Help 1] > http://cwiki.apache.org/confluence/display/MAVEN/ProjectBuildingException > [ERROR] [Help 2] > http://cwiki.apache.org/confluence/display/MAVEN/UnresolvableModelException > {code} > I checked the pom files: parent.version is 1.5.0-SNAPSHOT, but > apache-carbondata-1.5.1-rc1.tar.gz is 1.5.1. > mv/core pom.xml:
> {code:xml}
> <parent>
>   <groupId>org.apache.carbondata</groupId>
>   <artifactId>carbondata-parent</artifactId>
>   <version>1.5.0-SNAPSHOT</version>
>   <relativePath>../../../pom.xml</relativePath>
> </parent>
> <artifactId>carbondata-mv-core</artifactId>
> <name>Apache CarbonData :: Materialized View Core</name>
> {code}
> mv/plan pom.xml:
> {code:xml}
> <parent>
>   <groupId>org.apache.carbondata</groupId>
>   <artifactId>carbondata-parent</artifactId>
>   <version>1.5.0-SNAPSHOT</version>
>   <relativePath>../../../pom.xml</relativePath>
> </parent>
> <artifactId>carbondata-mv-plan</artifactId>
> <name>Apache CarbonData :: Materialized View Plan</name>
> {code}
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
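The fix for this build failure is to align the `<parent><version>` in `datamap/mv/core/pom.xml` and `datamap/mv/plan/pom.xml` with the rest of the 1.5.1 release. A corrected parent block would presumably read as follows (whether the release branch uses `1.5.1` or `1.5.1-SNAPSHOT` here depends on the project's release process, so treat the exact version string as an assumption):

```xml
<parent>
  <groupId>org.apache.carbondata</groupId>
  <artifactId>carbondata-parent</artifactId>
  <version>1.5.1</version>
  <relativePath>../../../pom.xml</relativePath>
</parent>
```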
[jira] [Commented] (CARBONDATA-3087) Prettify DESC FORMATTED output
[ https://issues.apache.org/jira/browse/CARBONDATA-3087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16679858#comment-16679858 ] Zhichao Zhang commented on CARBONDATA-3087: Hi Jacky, I have some questions, shown below: 1. How should the external info be shown if it's an external table? 2. It needs to show the partition info if the table has partition columns; 3. Missing property: DICTIONARY_INCLUDE; 4. Shouldn't it show the database and table name? 5. NO_INVERTED_INDEX will be removed, right? 6. Shouldn't it show the default value for each property? And, as per our discussion, we need to save the default value to the schema info even if the user doesn't set it. Is this done? > Prettify DESC FORMATTED output > -- > > Key: CARBONDATA-3087 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3087 > Project: CarbonData > Issue Type: Improvement >Reporter: Jacky Li >Priority: Major > Time Spent: 1h 10m > Remaining Estimate: 0h > > Change output of DESC FORMATTED to:
> {noformat}
> +--------------------+--------------------------------------------------------------------------------------+-------+
> |col_name            |data_type                                                                             |comment|
> +--------------------+--------------------------------------------------------------------------------------+-------+
> |shortfield          |smallint                                                                              |null   |
> |intfield            |int                                                                                   |null   |
> |bigintfield         |bigint                                                                                |null   |
> |doublefield         |double                                                                                |null   |
> |stringfield         |string                                                                                |null   |
> |timestampfield      |timestamp                                                                             |null   |
> |decimalfield        |decimal(18,2)                                                                         |null   |
> |datefield           |date                                                                                  |null   |
> |charfield           |string                                                                                |null   |
> |floatfield          |double                                                                                |null   |
> |                    |                                                                                      |       |
> |## Table Basic Information|                                                                                |       |
> |Comment             |                                                                                      |       |
> |Path                |/Users/jacky/code/carbondata/examples/spark2/target/store/default/carbonsession_table |       |
> |Table Block Size    |1024 MB                                                                               |       |
> |Table Blocklet Size |64 MB                                                                                 |       |
> |Streaming           |false                                                                                 |       |
> |Flat Folder         |false                                                                                 |       |
> |Bad Record Path     |                                                                                      |       |
> |Min Input Per Node  |0.0B                                                                                  |       |
> |                    |                                                                                      |       |
> |## Index Information|                                                                                      |       |
> |Sort Scope          |LOCAL_SORT                                                                            |       |
> |Sort Columns        |stringfield,timestampfield,datefield,charfield                                        |       |
> |Index Cache Level   |BLOCK                                                                                 |       |
> |Cached Index Columns|All columns                                                                           |       |
> {noformat}
[jira] [Created] (CARBONDATA-3019) Add an error log in the catch block to avoid losing the exception thrown from the catch block when an exception is also thrown in the finally block
Zhichao Zhang created CARBONDATA-3019: -- Summary: Add an error log in the catch block to avoid losing the exception thrown from the catch block when an exception is also thrown in the finally block Key: CARBONDATA-3019 URL: https://issues.apache.org/jira/browse/CARBONDATA-3019 Project: CarbonData Issue Type: Improvement Reporter: Zhichao Zhang Assignee: Zhichao Zhang # Add an error log in the catch block to avoid losing the exception thrown from the catch block when an exception is also thrown in the finally block. # Enhance log output. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
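Context for this improvement: on the JVM, an exception thrown from a finally block silently replaces an exception already propagating out of the try or catch block, so without an explicit log the original failure disappears. A self-contained sketch of the behavior (generic Java, not CarbonData's actual code paths):

```java
public class Main {
    static String lastLog;

    static void doWork() {
        try {
            throw new IllegalStateException("original failure");
        } catch (IllegalStateException e) {
            // Logging here is the point of the fix: once finally throws,
            // the exception rethrown below never reaches the caller.
            lastLog = "ERROR: " + e.getMessage();
            throw e;
        } finally {
            // A failing cleanup step (e.g. closing a resource) masks the
            // exception rethrown from the catch block.
            cleanupThatFails();
        }
    }

    static void cleanupThatFails() {
        throw new RuntimeException("cleanup failed");
    }

    public static void main(String[] args) {
        try {
            doWork();
        } catch (RuntimeException e) {
            // The caller only ever sees the finally-block exception.
            System.out.println("caught: " + e.getMessage());   // caught: cleanup failed
        }
        System.out.println("logged: " + lastLog);              // logged: ERROR: original failure
    }
}
```

Without the `lastLog` line, "original failure" would vanish entirely, which is exactly the diagnosability gap the issue describes.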
[jira] [Updated] (CARBONDATA-2595) Reformat the output of command 'desc formatted table_name'
[ https://issues.apache.org/jira/browse/CARBONDATA-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang updated CARBONDATA-2595: --- Fix Version/s: (was: 1.5.0) > Reformat the output of command 'desc formatted table_name' > -- > > Key: CARBONDATA-2595 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2595 > Project: CarbonData > Issue Type: Improvement > Components: sql >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang >Priority: Minor > Attachments: desc_formatted.txt, desc_formatted_external.txt > > > According to the discussion in > [topic|http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Change-the-comment-content-for-column-when-execute-command-desc-formatted-table-name-td46848.html], > reformat the output of command 'desc formatted table_name'. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-2597) Support deleting historical data from non-stream segments
[ https://issues.apache.org/jira/browse/CARBONDATA-2597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang updated CARBONDATA-2597: --- Fix Version/s: (was: NONE) > Support deleting historical data from non-stream segments > - > > Key: CARBONDATA-2597 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2597 > Project: CarbonData > Issue Type: Sub-task > Components: spark-integration >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang >Priority: Minor > > Delete historical data from non-stream segments; deleting from > stream segments is not supported. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-2597) Support deleting historical data from non-stream segments
[ https://issues.apache.org/jira/browse/CARBONDATA-2597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang updated CARBONDATA-2597: --- Fix Version/s: (was: 1.5.0) NONE > Support deleting historical data from non-stream segments > - > > Key: CARBONDATA-2597 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2597 > Project: CarbonData > Issue Type: Sub-task > Components: spark-integration >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang >Priority: Minor > > Delete historical data from non-stream segments; deleting from > stream segments is not supported. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (CARBONDATA-2989) Upgrade spark integration version to 2.3.2
[ https://issues.apache.org/jira/browse/CARBONDATA-2989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang reassigned CARBONDATA-2989: -- Assignee: Zhichao Zhang > Upgrade spark integration version to 2.3.2 > -- > > Key: CARBONDATA-2989 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2989 > Project: CarbonData > Issue Type: Improvement > Components: spark-integration >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang >Priority: Minor > Fix For: 1.5.0 > > > h1. Upgrade spark integration version to 2.3.2 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CARBONDATA-2989) Upgrade spark integration version to 2.3.2
Zhichao Zhang created CARBONDATA-2989: -- Summary: Upgrade spark integration version to 2.3.2 Key: CARBONDATA-2989 URL: https://issues.apache.org/jira/browse/CARBONDATA-2989 Project: CarbonData Issue Type: Improvement Components: spark-integration Reporter: Zhichao Zhang Fix For: 1.5.0 h1. Upgrade spark integration version to 2.3.2 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (CARBONDATA-2594) Incorrect logic when set 'Encoding.INVERTED_INDEX' for each dimension column
[ https://issues.apache.org/jira/browse/CARBONDATA-2594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang reassigned CARBONDATA-2594: -- Assignee: Jacky Li (was: Zhichao Zhang) > Incorrect logic when set 'Encoding.INVERTED_INDEX' for each dimension column > > > Key: CARBONDATA-2594 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2594 > Project: CarbonData > Issue Type: Improvement > Components: spark-integration >Reporter: Zhichao Zhang >Assignee: Jacky Li >Priority: Minor > Fix For: 1.5.0 > > > All of non-sort dimension columns are set as 'Encoding.INVERTED_INDEX' > column, this is wrong, only the columns defined in 'SORT_COLUMN' and not in > 'NO_INVERTED_INDEX' need to be set as 'Encoding.INVERTED_INDEX' column. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (CARBONDATA-2600) Add a command to show detailed index information for a segment
[ https://issues.apache.org/jira/browse/CARBONDATA-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang closed CARBONDATA-2600. -- Resolution: Duplicate Fix Version/s: (was: 1.5.0) please see CARBONDATA-2916. > Add a command to show detailed index information for a segment > -- > > Key: CARBONDATA-2600 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2600 > Project: CarbonData > Issue Type: New Feature > Components: spark-integration >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang >Priority: Minor > > Add a command to show detailed index information for a segment, for example: > {code:java} > show index for table table_name where segment_id = 0;{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-2859) add sdv test case for bloomfilter datamap
[ https://issues.apache.org/jira/browse/CARBONDATA-2859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang updated CARBONDATA-2859: --- Priority: Minor (was: Major) > add sdv test case for bloomfilter datamap > - > > Key: CARBONDATA-2859 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2859 > Project: CarbonData > Issue Type: Sub-task >Reporter: xuchuanyin >Assignee: xuchuanyin >Priority: Minor > Fix For: 1.5.0 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > add sdv test case for bloomfilter datamap -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-2859) add sdv test case for bloomfilter datamap
[ https://issues.apache.org/jira/browse/CARBONDATA-2859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang resolved CARBONDATA-2859. Resolution: Fixed Fix Version/s: 1.5.0 > add sdv test case for bloomfilter datamap > - > > Key: CARBONDATA-2859 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2859 > Project: CarbonData > Issue Type: Sub-task >Reporter: xuchuanyin >Assignee: xuchuanyin >Priority: Minor > Fix For: 1.5.0 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > add sdv test case for bloomfilter datamap -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CARBONDATA-2595) Reformat the output of command 'desc formatted table_name'
[ https://issues.apache.org/jira/browse/CARBONDATA-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587328#comment-16587328 ] Zhichao Zhang commented on CARBONDATA-2595: [~sraghunandan], it's the same when the table is an external table; otherwise there is no 'Location Path' property. > Reformat the output of command 'desc formatted table_name' > -- > > Key: CARBONDATA-2595 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2595 > Project: CarbonData > Issue Type: Improvement > Components: sql >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang >Priority: Minor > Fix For: 1.5.0 > > Attachments: desc_formatted.txt, desc_formatted_external.txt > > > According to the discussion in > [topic|http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Change-the-comment-content-for-column-when-execute-command-desc-formatted-table-name-td46848.html], > reformat the output of command 'desc formatted table_name'. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CARBONDATA-2595) Reformat the output of command 'desc formatted table_name'
[ https://issues.apache.org/jira/browse/CARBONDATA-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585858#comment-16585858 ] Zhichao Zhang commented on CARBONDATA-2595: Now I am working on this; the new format is shown in the attachment, please give me some feedback. There is one question: if a user uses CTAS to create a table, do we need to show the 'select sql' in the result of 'desc formatted table'? If yes, how do we get the 'select sql'? Currently I can only get a non-formatted SQL statement from 'CarbonSparkSqlParser.scala' (as Jacky mentioned), for example: create table sql: {code:java} CREATE TABLE IF NOT EXISTS test_table STORED BY 'carbondata' TBLPROPERTIES( 'streaming'='false', 'sort_columns'='id,city', 'dictionary_include'='name') AS SELECT * from source_test ;{code} The non-formatted sql I get is: {code:java} SELECT*fromsource_test{code} Any suggestions for this? > Reformat the output of command 'desc formatted table_name' > -- > > Key: CARBONDATA-2595 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2595 > Project: CarbonData > Issue Type: Improvement > Components: sql >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang >Priority: Minor > Fix For: 1.5.0 > > Attachments: desc_formatted.txt, desc_formatted_external.txt > > > According to the discussion in > [topic|http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Change-the-comment-content-for-column-when-execute-command-desc-formatted-table-name-td46848.html], > reformat the output of command 'desc formatted table_name'. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (CARBONDATA-2595) Reformat the output of command 'desc formatted table_name'
[ https://issues.apache.org/jira/browse/CARBONDATA-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585858#comment-16585858 ] Zhichao Zhang edited comment on CARBONDATA-2595 at 8/20/18 12:27 PM: -- Now I am working on this; the new format is shown in the attachment, please give me some feedback. There is one question: if a user uses CTAS to create a table, do we need to show the 'select sql' in the result of 'desc formatted table'? If yes, how do we get the 'select sql'? Currently I can only get a non-formatted SQL statement from 'CarbonSparkSqlParser.scala' (as Jacky mentioned), for example: create table sql: {code:java} CREATE TABLE IF NOT EXISTS test_table STORED BY 'carbondata' TBLPROPERTIES( 'streaming'='false', 'sort_columns'='id,city', 'dictionary_include'='name') AS SELECT * from source_test ;{code} The non-formatted sql I get is: {code:java} SELECT*fromsource_test{code} Any suggestions for this? was (Author: zzcclp): Now I am working on this; the new format is shown in the attachment, please give me some feedback. There is one question: if a user uses CTAS to create a table, do we need to show the 'select sql' in the result of 'desc formatted table'? If yes, how do we get the 'select sql'? Currently I can only get a non-formatted SQL statement from 'CarbonSparkSqlParser.scala' (as Jacky mentioned), for example: create table sql: {code:java} CREATE TABLE IF NOT EXISTS test_table STORED BY 'carbondata' TBLPROPERTIES( 'streaming'='false', 'sort_columns'='id,city', 'dictionary_include'='name') AS SELECT * from source_test ;{code} The non-formatted sql I get is: {code:java} SELECT*fromsource_test{code} Any suggestions for this?
> Reformat the output of command 'desc formatted table_name' > -- > > Key: CARBONDATA-2595 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2595 > Project: CarbonData > Issue Type: Improvement > Components: sql >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang >Priority: Minor > Fix For: 1.5.0 > > Attachments: desc_formatted.txt, desc_formatted_external.txt > > > According to the discussion in > [topic|http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Change-the-comment-content-for-column-when-execute-command-desc-formatted-table-name-td46848.html], > reformat the output of command 'desc formatted table_name'. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-2595) Reformat the output of command 'desc formatted table_name'
[ https://issues.apache.org/jira/browse/CARBONDATA-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang updated CARBONDATA-2595: --- Attachment: desc_formatted_external.txt > Reformat the output of command 'desc formatted table_name' > -- > > Key: CARBONDATA-2595 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2595 > Project: CarbonData > Issue Type: Improvement > Components: sql >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang >Priority: Minor > Fix For: 1.5.0 > > Attachments: desc_formatted.txt, desc_formatted_external.txt > > > According to the discussion in > [topic|http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Change-the-comment-content-for-column-when-execute-command-desc-formatted-table-name-td46848.html], > reformat the output of command 'desc formatted table_name'. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-2595) Reformat the output of command 'desc formatted table_name'
[ https://issues.apache.org/jira/browse/CARBONDATA-2595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang updated CARBONDATA-2595: --- Attachment: desc_formatted.txt > Reformat the output of command 'desc formatted table_name' > -- > > Key: CARBONDATA-2595 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2595 > Project: CarbonData > Issue Type: Improvement > Components: sql >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang >Priority: Minor > Fix For: 1.5.0 > > Attachments: desc_formatted.txt, desc_formatted_external.txt > > > According to the discussion in > [topic|http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Change-the-comment-content-for-column-when-execute-command-desc-formatted-table-name-td46848.html], > reformat the output of command 'desc formatted table_name'. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CARBONDATA-2854) Release the table status file lock before deleting physical files when executing the 'clean files' command
Zhichao Zhang created CARBONDATA-2854: -- Summary: Release the table status file lock before deleting physical files when executing the 'clean files' command Key: CARBONDATA-2854 URL: https://issues.apache.org/jira/browse/CARBONDATA-2854 Project: CarbonData Issue Type: Bug Components: spark-integration Affects Versions: 1.4.0, 1.5.0 Reporter: Zhichao Zhang Assignee: Zhichao Zhang Fix For: 1.5.0 Release the table status file lock before deleting physical files when executing the 'clean files' command; otherwise the table status file stays locked while the physical files are deleted, which may take a long time, and other operations will fail to acquire the table status file lock. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
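The locking pattern described above, doing only the fast metadata update under the lock and releasing it before the slow physical deletion, can be sketched generically. The names below (`tableStatusLock`, `cleanFiles`, the event list) are illustrative stand-ins, not CarbonData's actual lock API:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

public class Main {
    static final ReentrantLock tableStatusLock = new ReentrantLock();
    static final List<String> events = new ArrayList<>();

    static void cleanFiles() {
        events.clear();
        List<String> staleFiles;
        tableStatusLock.lock();
        try {
            // Keep the critical section small: only the table status update
            // and the snapshot of files to delete happen under the lock.
            events.add("update table status (locked=" + tableStatusLock.isLocked() + ")");
            staleFiles = Arrays.asList("part-0001", "part-0002");
        } finally {
            tableStatusLock.unlock(); // release BEFORE the slow physical deletion
        }
        for (String f : staleFiles) {
            // Deletion may take a long time; other operations can now
            // acquire the table status lock concurrently.
            events.add("delete " + f + " (locked=" + tableStatusLock.isLocked() + ")");
        }
    }

    public static void main(String[] args) {
        cleanFiles();
        events.forEach(System.out::println);
    }
}
```

The snapshot-then-release shape is the key design choice: the list of files to delete is computed while the metadata is consistent, so deletion can safely proceed without the lock.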
[jira] [Created] (CARBONDATA-2600) Add a command to show detailed index information for a segment
Zhichao Zhang created CARBONDATA-2600: -- Summary: Add a command to show detailed index information for a segment Key: CARBONDATA-2600 URL: https://issues.apache.org/jira/browse/CARBONDATA-2600 Project: CarbonData Issue Type: New Feature Components: spark-integration Reporter: Zhichao Zhang Assignee: Zhichao Zhang Fix For: 1.5.0 Add a command to show detailed index information for a segment, for example: {code:java} show index for table table_name where segment_id = 0;{code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CARBONDATA-2599) Use RowStreamParserImp as default value of config 'carbon.stream.parser'
Zhichao Zhang created CARBONDATA-2599: -- Summary: Use RowStreamParserImp as default value of config 'carbon.stream.parser' Key: CARBONDATA-2599 URL: https://issues.apache.org/jira/browse/CARBONDATA-2599 Project: CarbonData Issue Type: Improvement Components: spark-integration Reporter: Zhichao Zhang Assignee: Zhichao Zhang Fix For: 1.5.0 See the detailed info in [topic|http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Use-RowStreamParserImp-as-default-value-of-config-carbon-stream-parser-td51565.html] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CARBONDATA-2598) Support updating/deleting data from stream segments
Zhichao Zhang created CARBONDATA-2598: -- Summary: Support updating/deleting data from stream segments Key: CARBONDATA-2598 URL: https://issues.apache.org/jira/browse/CARBONDATA-2598 Project: CarbonData Issue Type: Sub-task Components: spark-integration Reporter: Zhichao Zhang -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CARBONDATA-2597) Support deleting historical data from non-stream segments
Zhichao Zhang created CARBONDATA-2597: -- Summary: Support deleting historical data from non-stream segments Key: CARBONDATA-2597 URL: https://issues.apache.org/jira/browse/CARBONDATA-2597 Project: CarbonData Issue Type: Sub-task Components: spark-integration Reporter: Zhichao Zhang Assignee: Zhichao Zhang Fix For: 1.5.0 Delete historical data from non-stream segments; deleting from stream segments is not supported. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-2596) Support updating/deleting data on stream table
[ https://issues.apache.org/jira/browse/CARBONDATA-2596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang updated CARBONDATA-2596: --- Issue Type: Improvement (was: Task) > Support updating/deleting data on stream table > -- > > Key: CARBONDATA-2596 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2596 > Project: CarbonData > Issue Type: Improvement > Components: spark-integration >Reporter: Zhichao Zhang >Priority: Minor > Fix For: 1.5.0 > > > According to the discussion in > [topic|http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Support-updating-deleting-data-for-stream-table-td51060.html], > there are 2 steps to implement this feature. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-2596) Support updating/deleting data on stream table
[ https://issues.apache.org/jira/browse/CARBONDATA-2596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang updated CARBONDATA-2596: --- Issue Type: New Feature (was: Improvement) > Support updating/deleting data on stream table > -- > > Key: CARBONDATA-2596 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2596 > Project: CarbonData > Issue Type: New Feature > Components: spark-integration >Reporter: Zhichao Zhang >Priority: Minor > Fix For: 1.5.0 > > > According to the discussion in > [topic|http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Support-updating-deleting-data-for-stream-table-td51060.html], > there are 2 steps to implement this feature. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-2596) Support updating/deleting data on stream table
[ https://issues.apache.org/jira/browse/CARBONDATA-2596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang updated CARBONDATA-2596: --- Issue Type: Task (was: Improvement) > Support updating/deleting data on stream table > -- > > Key: CARBONDATA-2596 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2596 > Project: CarbonData > Issue Type: Task > Components: spark-integration >Reporter: Zhichao Zhang >Priority: Minor > Fix For: 1.5.0 > > > According to the discussion in > [topic|http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Support-updating-deleting-data-for-stream-table-td51060.html], > there are 2 steps to implement this feature. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CARBONDATA-2596) Support updating/deleting data on stream table
Zhichao Zhang created CARBONDATA-2596: -- Summary: Support updating/deleting data on stream table Key: CARBONDATA-2596 URL: https://issues.apache.org/jira/browse/CARBONDATA-2596 Project: CarbonData Issue Type: Improvement Components: spark-integration Reporter: Zhichao Zhang Fix For: 1.5.0 According to the discussion in [topic|http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Support-updating-deleting-data-for-stream-table-td51060.html], there are 2 steps to implement this feature. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-2594) Incorrect logic when set 'Encoding.INVERTED_INDEX' for each dimension column
[ https://issues.apache.org/jira/browse/CARBONDATA-2594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang updated CARBONDATA-2594: --- Affects Version/s: (was: 1.5.0) > Incorrect logic when set 'Encoding.INVERTED_INDEX' for each dimension column > > > Key: CARBONDATA-2594 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2594 > Project: CarbonData > Issue Type: Improvement > Components: spark-integration >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang >Priority: Minor > Fix For: 1.5.0 > > > All non-sort dimension columns are set as 'Encoding.INVERTED_INDEX' > columns; this is wrong: only the columns defined in 'SORT_COLUMN' and not in > 'NO_INVERTED_INDEX' need to be set as 'Encoding.INVERTED_INDEX' columns. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-2593) Add an option 'carbon.insert.storage.level' to support configuring the storage level when insert into data with 'carbon.insert.persist.enable'='true'
[ https://issues.apache.org/jira/browse/CARBONDATA-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang updated CARBONDATA-2593: --- Affects Version/s: (was: 1.5.0) > Add an option 'carbon.insert.storage.level' to support configuring the > storage level when insert into data with 'carbon.insert.persist.enable'='true' > - > > Key: CARBONDATA-2593 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2593 > Project: CarbonData > Issue Type: Improvement > Components: data-load, spark-integration >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang >Priority: Minor > Fix For: 1.5.0 > > > When inserting data with 'carbon.insert.persist.enable'='true', the storage > level of the dataset is 'MEMORY_AND_DISK'; it should support configuring the > storage level to suit different environments. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CARBONDATA-2595) Reformat the output of command 'desc formatted table_name'
Zhichao Zhang created CARBONDATA-2595: -- Summary: Reformat the output of command 'desc formatted table_name' Key: CARBONDATA-2595 URL: https://issues.apache.org/jira/browse/CARBONDATA-2595 Project: CarbonData Issue Type: Improvement Components: sql Reporter: Zhichao Zhang Assignee: Zhichao Zhang Fix For: 1.5.0 According to the discussion in [topic|http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Change-the-comment-content-for-column-when-execute-command-desc-formatted-table-name-td46848.html], reformat the output of command 'desc formatted table_name'. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CARBONDATA-2594) Incorrect logic when set 'Encoding.INVERTED_INDEX' for each dimension column
Zhichao Zhang created CARBONDATA-2594: -- Summary: Incorrect logic when set 'Encoding.INVERTED_INDEX' for each dimension column Key: CARBONDATA-2594 URL: https://issues.apache.org/jira/browse/CARBONDATA-2594 Project: CarbonData Issue Type: Improvement Components: spark-integration Affects Versions: 1.5.0 Reporter: Zhichao Zhang Assignee: Zhichao Zhang Fix For: 1.5.0 All non-sort dimension columns are set as 'Encoding.INVERTED_INDEX' columns; this is wrong: only the columns defined in 'SORT_COLUMN' and not in 'NO_INVERTED_INDEX' need to be set as 'Encoding.INVERTED_INDEX' columns. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
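The rule in the description above reduces to a simple set computation. A minimal sketch of that logic (illustrative Python, not CarbonData's actual Java implementation; the column names are hypothetical):

```python
# Model of the intended rule from CARBONDATA-2594: a dimension column gets
# Encoding.INVERTED_INDEX only if it is listed in SORT_COLUMN and is NOT
# excluded via NO_INVERTED_INDEX.

def inverted_index_columns(dimension_cols, sort_cols, no_inverted_index_cols):
    sort_set = set(sort_cols)
    excluded = set(no_inverted_index_cols)
    return [c for c in dimension_cols if c in sort_set and c not in excluded]

# Hypothetical schema: 'city' and 'name' are sort columns, 'name' opted out.
cols = inverted_index_columns(
    dimension_cols=["city", "name", "country"],
    sort_cols=["city", "name"],
    no_inverted_index_cols=["name"])
print(cols)  # ['city']
```

Under the old (buggy) logic, 'country' would also have received the inverted-index encoding despite not being a sort column.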
[jira] [Created] (CARBONDATA-2593) Add an option 'carbon.insert.storage.level' to support configuring the storage level when insert into data with 'carbon.insert.persist.enable'='true'
Zhichao Zhang created CARBONDATA-2593: -- Summary: Add an option 'carbon.insert.storage.level' to support configuring the storage level when insert into data with 'carbon.insert.persist.enable'='true' Key: CARBONDATA-2593 URL: https://issues.apache.org/jira/browse/CARBONDATA-2593 Project: CarbonData Issue Type: Improvement Components: data-load, spark-integration Affects Versions: 1.5.0 Reporter: Zhichao Zhang Assignee: Zhichao Zhang Fix For: 1.5.0 When inserting data with 'carbon.insert.persist.enable'='true', the storage level of the dataset is 'MEMORY_AND_DISK'; it should support configuring the storage level to suit different environments. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
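The proposal above amounts to reading an optional property with a safe fallback. A hedged sketch of that behaviour (plain Python; the level names follow Spark's StorageLevel naming, but this is not the actual CarbonProperties API):

```python
# Sketch of CARBONDATA-2593: resolve 'carbon.insert.storage.level' from a
# property map, falling back to the current hard-coded MEMORY_AND_DISK when
# the property is absent or invalid.

VALID_LEVELS = {"MEMORY_ONLY", "MEMORY_AND_DISK", "DISK_ONLY",
                "MEMORY_ONLY_SER", "MEMORY_AND_DISK_SER"}
DEFAULT_LEVEL = "MEMORY_AND_DISK"

def resolve_storage_level(props):
    level = props.get("carbon.insert.storage.level", DEFAULT_LEVEL).upper()
    # An unrecognized value falls back to the default instead of failing.
    return level if level in VALID_LEVELS else DEFAULT_LEVEL

print(resolve_storage_level({}))                                          # MEMORY_AND_DISK
print(resolve_storage_level({"carbon.insert.storage.level": "disk_only"}))  # DISK_ONLY
```

Falling back silently (rather than raising) matches the spirit of a tuning knob: a bad value should not break the load.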
[jira] [Commented] (CARBONDATA-2351) CarbonData Select null
[ https://issues.apache.org/jira/browse/CARBONDATA-2351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16440263#comment-16440263 ] Zhichao Zhang commented on CARBONDATA-2351: I ran your code at 23:00 yesterday, and now when I run the select SQL it works fine. I can't reproduce your issue. > CarbonData Select null > -- > > Key: CARBONDATA-2351 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2351 > Project: CarbonData > Issue Type: Bug >Affects Versions: 1.3.1 > Environment: Carbondata1.3.1 with Spark2.1.2 >Reporter: xhnqccf >Priority: Major > Labels: SELECT, null > > Carbondata 1.3.1 with Spark 2.1.2: after inserting values, SELECT is right, but > the next day the data from SELECT is null while the row count is right. > create table carbon01(id int,name int,age int,sex int) stored by 'carbondata'; > insert into carbon01 values(1,1,1,1); > select * from carbon01; > 1 1 1 1 > Then I exit spark-sql. > But the next day: > select * from carbon01; > null null null null -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CARBONDATA-2351) CarbonData Select null
[ https://issues.apache.org/jira/browse/CARBONDATA-2351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16439029#comment-16439029 ] Zhichao Zhang commented on CARBONDATA-2351: [~xhnqccf] Did you run this test case in local mode or yarn-client mode? Does it reproduce every time? > CarbonData Select null > -- > > Key: CARBONDATA-2351 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2351 > Project: CarbonData > Issue Type: Bug >Affects Versions: 1.3.1 > Environment: Carbondata1.3.1 with Spark2.1.2 >Reporter: xhnqccf >Priority: Major > Labels: SELECT, null > > Carbondata 1.3.1 with Spark 2.1.2: after inserting values, SELECT is right, but > the next day the data from SELECT is null while the row count is right. > create table carbon01(id int,name int,age int,sex int) stored by 'carbondata'; > insert into carbon01 values(1,1,1,1); > select * from carbon01; > 1 1 1 1 > Then I exit spark-sql. > But the next day: > select * from carbon01; > null null null null -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (CARBONDATA-2345) "Task failed while writing rows" error occurs when streaming ingest into carbondata table
[ https://issues.apache.org/jira/browse/CARBONDATA-2345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16438969#comment-16438969 ] Zhichao Zhang edited comment on CARBONDATA-2345 at 4/16/18 4:11 AM: - [~oceaneast], you can see the doc [Stream data parser|https://github.com/apache/carbondata/blob/branch-1.3/docs/streaming-guide.md#stream-data-parser]. There is also an [example|https://github.com/apache/carbondata/blob/branch-1.3/examples/spark2/src/main/scala/org/apache/carbondata/examples/CarbonStructuredStreamingWithRowParser.scala] showing how to use Stream Data Parser. was (Author: zzcclp): [~oceaneast], you can see the doc [Stream data parser|https://github.com/apache/carbondata/blob/branch-1.3/docs/streaming-guide.md#stream-data-parser] > "Task failed while writing rows" error occuers when streaming ingest into > carbondata table > -- > > Key: CARBONDATA-2345 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2345 > Project: CarbonData > Issue Type: Bug > Components: data-load >Affects Versions: 1.3.1 >Reporter: ocean >Priority: Major > > carbondata version:1.3.1。spark:2.2.1 > When using spark structured streaming ingest data into carbondata table , > such error occurs: > warning: there was one deprecation warning; re-run with -deprecation for > details > qry: org.apache.spark.sql.streaming.StreamingQuery = > org.apache.spark.sql.execution.streaming.StreamingQueryWrapper@7ddf193a > [Stage 1:> (0 + 2) / 5]18/04/13 18:03:56 WARN TaskSetManager: Lost task 1.0 > in stage 1.0 (TID 2, sz-pg-entanalytics-research-004.tendcloud.com, executor > 1): org.apache.carbondata.streaming.CarbonStreamException: Task failed while > writing rows > at > org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$.writeDataFileTask(CarbonAppendableStreamSink.scala:345) > at > org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileJob$1$$anonfun$apply$mcV$sp$1.apply(CarbonAppendableStreamSink.scala:247) > at > 
org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileJob$1$$anonfun$apply$mcV$sp$1.apply(CarbonAppendableStreamSink.scala:246) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) > at org.apache.spark.scheduler.Task.run(Task.scala:108) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.NullPointerException > at > org.apache.carbondata.processing.loading.BadRecordsLogger.addBadRecordsToBuilder(BadRecordsLogger.java:126) > at > org.apache.carbondata.processing.loading.converter.impl.RowConverterImpl.convert(RowConverterImpl.java:164) > at > org.apache.carbondata.hadoop.streaming.CarbonStreamRecordWriter.write(CarbonStreamRecordWriter.java:186) > at > org.apache.carbondata.streaming.segment.StreamSegment.appendBatchData(StreamSegment.java:244) > at > org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileTask$1.apply$mcV$sp(CarbonAppendableStreamSink.scala:336) > at > org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileTask$1.apply(CarbonAppendableStreamSink.scala:326) > at > org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileTask$1.apply(CarbonAppendableStreamSink.scala:326) > at > org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1371) > at > org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$.writeDataFileTask(CarbonAppendableStreamSink.scala:338) > ... 
8 more > [Stage 1:===> (1 + 2) / 5]18/04/13 18:03:57 ERROR TaskSetManager: > Task 0 in stage 1.0 failed 4 times; aborting job > 18/04/13 18:03:57 ERROR CarbonAppendableStreamSink$: stream execution thread > for [id = 3abdadea-65f6-4d94-8686-306fccae4559, runId = > 689adf7e-a617-41d9-96bc-de075ce4dd73] Aborting job job_20180413180354_. > org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in > stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 > (TID 11, sz-pg-entanalytics-research-004.tendcloud.com, executor 1): > org.apache.carbondata.streaming.CarbonStreamException: Task failed while > writing rows > at > org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$.writeDataFileTask(CarbonAppendableStreamSink.scala:345) > at >
[jira] [Commented] (CARBONDATA-2345) "Task failed while writing rows" error occurs when streaming ingest into carbondata table
[ https://issues.apache.org/jira/browse/CARBONDATA-2345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16438969#comment-16438969 ] Zhichao Zhang commented on CARBONDATA-2345: [~oceaneast], you can see the doc [Stream data parser|https://github.com/apache/carbondata/blob/branch-1.3/docs/streaming-guide.md#stream-data-parser] > "Task failed while writing rows" error occuers when streaming ingest into > carbondata table > -- > > Key: CARBONDATA-2345 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2345 > Project: CarbonData > Issue Type: Bug > Components: data-load >Affects Versions: 1.3.1 >Reporter: ocean >Priority: Major > > carbondata version:1.3.1。spark:2.2.1 > When using spark structured streaming ingest data into carbondata table , > such error occurs: > warning: there was one deprecation warning; re-run with -deprecation for > details > qry: org.apache.spark.sql.streaming.StreamingQuery = > org.apache.spark.sql.execution.streaming.StreamingQueryWrapper@7ddf193a > [Stage 1:> (0 + 2) / 5]18/04/13 18:03:56 WARN TaskSetManager: Lost task 1.0 > in stage 1.0 (TID 2, sz-pg-entanalytics-research-004.tendcloud.com, executor > 1): org.apache.carbondata.streaming.CarbonStreamException: Task failed while > writing rows > at > org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$.writeDataFileTask(CarbonAppendableStreamSink.scala:345) > at > org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileJob$1$$anonfun$apply$mcV$sp$1.apply(CarbonAppendableStreamSink.scala:247) > at > org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileJob$1$$anonfun$apply$mcV$sp$1.apply(CarbonAppendableStreamSink.scala:246) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) > at org.apache.spark.scheduler.Task.run(Task.scala:108) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338) > at > 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.NullPointerException > at > org.apache.carbondata.processing.loading.BadRecordsLogger.addBadRecordsToBuilder(BadRecordsLogger.java:126) > at > org.apache.carbondata.processing.loading.converter.impl.RowConverterImpl.convert(RowConverterImpl.java:164) > at > org.apache.carbondata.hadoop.streaming.CarbonStreamRecordWriter.write(CarbonStreamRecordWriter.java:186) > at > org.apache.carbondata.streaming.segment.StreamSegment.appendBatchData(StreamSegment.java:244) > at > org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileTask$1.apply$mcV$sp(CarbonAppendableStreamSink.scala:336) > at > org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileTask$1.apply(CarbonAppendableStreamSink.scala:326) > at > org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileTask$1.apply(CarbonAppendableStreamSink.scala:326) > at > org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1371) > at > org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$.writeDataFileTask(CarbonAppendableStreamSink.scala:338) > ... 8 more > [Stage 1:===> (1 + 2) / 5]18/04/13 18:03:57 ERROR TaskSetManager: > Task 0 in stage 1.0 failed 4 times; aborting job > 18/04/13 18:03:57 ERROR CarbonAppendableStreamSink$: stream execution thread > for [id = 3abdadea-65f6-4d94-8686-306fccae4559, runId = > 689adf7e-a617-41d9-96bc-de075ce4dd73] Aborting job job_20180413180354_. 
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in > stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 > (TID 11, sz-pg-entanalytics-research-004.tendcloud.com, executor 1): > org.apache.carbondata.streaming.CarbonStreamException: Task failed while > writing rows > at > org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$.writeDataFileTask(CarbonAppendableStreamSink.scala:345) > at > org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileJob$1$$anonfun$apply$mcV$sp$1.apply(CarbonAppendableStreamSink.scala:247) > at > org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileJob$1$$anonfun$apply$mcV$sp$1.apply(CarbonAppendableStreamSink.scala:246) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) > at org.apache.spark.scheduler.Task.run(Task.scala:108) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338) > at >
[jira] [Commented] (CARBONDATA-2345) "Task failed while writing rows" error occurs when streaming ingest into carbondata table
[ https://issues.apache.org/jira/browse/CARBONDATA-2345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16437420#comment-16437420 ] Zhichao Zhang commented on CARBONDATA-2345: [~oceaneast], you need to add below option into 'writeStream' block: {code:java} .option(CarbonStreamParser.CARBON_STREAM_PARSER, CarbonStreamParser.CARBON_STREAM_PARSER_ROW_PARSER) {code} for example: {code:java} qry = readSocketDF.writeStream .format("carbondata") .trigger(ProcessingTime("20 seconds")) .option("checkpointLocation", tablePath.getStreamingCheckpointDir) .option("dbName", "default") .option("tableName", tableName) .option(CarbonStreamParser.CARBON_STREAM_PARSER, CarbonStreamParser.CARBON_STREAM_PARSER_ROW_PARSER) .outputMode("append") .start() {code} Please try again. > "Task failed while writing rows" error occuers when streaming ingest into > carbondata table > -- > > Key: CARBONDATA-2345 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2345 > Project: CarbonData > Issue Type: Bug > Components: data-load >Affects Versions: 1.3.1 >Reporter: ocean >Priority: Major > > carbondata version:1.3.1。spark:2.2.1 > When using spark structured streaming ingest data into carbondata table , > such error occurs: > warning: there was one deprecation warning; re-run with -deprecation for > details > qry: org.apache.spark.sql.streaming.StreamingQuery = > org.apache.spark.sql.execution.streaming.StreamingQueryWrapper@7ddf193a > [Stage 1:> (0 + 2) / 5]18/04/13 18:03:56 WARN TaskSetManager: Lost task 1.0 > in stage 1.0 (TID 2, sz-pg-entanalytics-research-004.tendcloud.com, executor > 1): org.apache.carbondata.streaming.CarbonStreamException: Task failed while > writing rows > at > org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$.writeDataFileTask(CarbonAppendableStreamSink.scala:345) > at > 
org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileJob$1$$anonfun$apply$mcV$sp$1.apply(CarbonAppendableStreamSink.scala:247) > at > org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileJob$1$$anonfun$apply$mcV$sp$1.apply(CarbonAppendableStreamSink.scala:246) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) > at org.apache.spark.scheduler.Task.run(Task.scala:108) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.NullPointerException > at > org.apache.carbondata.processing.loading.BadRecordsLogger.addBadRecordsToBuilder(BadRecordsLogger.java:126) > at > org.apache.carbondata.processing.loading.converter.impl.RowConverterImpl.convert(RowConverterImpl.java:164) > at > org.apache.carbondata.hadoop.streaming.CarbonStreamRecordWriter.write(CarbonStreamRecordWriter.java:186) > at > org.apache.carbondata.streaming.segment.StreamSegment.appendBatchData(StreamSegment.java:244) > at > org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileTask$1.apply$mcV$sp(CarbonAppendableStreamSink.scala:336) > at > org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileTask$1.apply(CarbonAppendableStreamSink.scala:326) > at > org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$$anonfun$writeDataFileTask$1.apply(CarbonAppendableStreamSink.scala:326) > at > org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1371) > at > org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$.writeDataFileTask(CarbonAppendableStreamSink.scala:338) > ... 
8 more > [Stage 1:===> (1 + 2) / 5]18/04/13 18:03:57 ERROR TaskSetManager: > Task 0 in stage 1.0 failed 4 times; aborting job > 18/04/13 18:03:57 ERROR CarbonAppendableStreamSink$: stream execution thread > for [id = 3abdadea-65f6-4d94-8686-306fccae4559, runId = > 689adf7e-a617-41d9-96bc-de075ce4dd73] Aborting job job_20180413180354_. > org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in > stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 > (TID 11, sz-pg-entanalytics-research-004.tendcloud.com, executor 1): > org.apache.carbondata.streaming.CarbonStreamException: Task failed while > writing rows > at > org.apache.spark.sql.execution.streaming.CarbonAppendableStreamSink$.writeDataFileTask(CarbonAppendableStreamSink.scala:345) > at >
[jira] [Updated] (CARBONDATA-2337) Fix duplicately acquiring 'streaming.lock' error when integrating with spark-streaming
[ https://issues.apache.org/jira/browse/CARBONDATA-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang updated CARBONDATA-2337: --- Description: After merged [PR2135|[https://github.com/apache/carbondata/pull/2135]] it will acquire 'streaming.lock' duplicately when integrating with spark-streaming. (was: After merged [PR2135|[https://github.com/apache/carbondata/pull/2135],] it will acquire 'streaming.lock' duplicately when integrating with spark-streaming.) > Fix duplicately acquiring 'streaming.lock' error when integrating with > spark-streaming > -- > > Key: CARBONDATA-2337 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2337 > Project: CarbonData > Issue Type: Bug > Components: spark-integration >Affects Versions: 1.4.0 >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang >Priority: Minor > Fix For: 1.4.0 > > > After merged [PR2135|[https://github.com/apache/carbondata/pull/2135]] it > will acquire 'streaming.lock' duplicately when integrating with > spark-streaming. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CARBONDATA-2337) Fix duplicately acquiring 'streaming.lock' error when integrating with spark-streaming
Zhichao Zhang created CARBONDATA-2337: -- Summary: Fix duplicately acquiring 'streaming.lock' error when integrating with spark-streaming Key: CARBONDATA-2337 URL: https://issues.apache.org/jira/browse/CARBONDATA-2337 Project: CarbonData Issue Type: Bug Components: spark-integration Affects Versions: 1.4.0 Reporter: Zhichao Zhang Assignee: Zhichao Zhang Fix For: 1.4.0 After [PR2135|https://github.com/apache/carbondata/pull/2135] was merged, 'streaming.lock' is acquired twice when integrating with spark-streaming. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-2302) Fix some bugs when separate visible and invisible segments info into two files
[ https://issues.apache.org/jira/browse/CARBONDATA-2302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang updated CARBONDATA-2302: --- Description: There are some bugs when separate visible and invisible segments info into two files: # It will not delete physical data of history segments after separating # Generate duplicated segment id. was: There are some bugs where separate visible and invisible segments info into two files: # It will not delete physical data of history segments after separating # Generate duplicated segment id. > Fix some bugs when separate visible and invisible segments info into two files > -- > > Key: CARBONDATA-2302 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2302 > Project: CarbonData > Issue Type: Bug > Components: core, data-load >Affects Versions: 1.4.0 >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang >Priority: Major > Fix For: 1.4.0 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > There are some bugs when separate visible and invisible segments info into > two files: > # It will not delete physical data of history segments after separating > # Generate duplicated segment id. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CARBONDATA-2302) Fix some bugs when separate visible and invisible segments info into two files
Zhichao Zhang created CARBONDATA-2302: -- Summary: Fix some bugs when separate visible and invisible segments info into two files Key: CARBONDATA-2302 URL: https://issues.apache.org/jira/browse/CARBONDATA-2302 Project: CarbonData Issue Type: Bug Components: core, data-load Affects Versions: 1.4.0 Reporter: Zhichao Zhang Assignee: Zhichao Zhang Fix For: 1.4.0 There are some bugs when separating visible and invisible segments info into two files: # It does not delete the physical data of history segments after separating # It generates duplicated segment ids. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CARBONDATA-2299) Support showing all segment information (including visible and invisible segments)
Zhichao Zhang created CARBONDATA-2299: -- Summary: Support showing all segment information (including visible and invisible segments) Key: CARBONDATA-2299 URL: https://issues.apache.org/jira/browse/CARBONDATA-2299 Project: CarbonData Issue Type: Improvement Affects Versions: 1.4.0 Reporter: Zhichao Zhang Assignee: Zhichao Zhang Fix For: 1.4.0 Use the command 'SHOW HISTORY SEGMENTS' to show all segment information (including visible and invisible segments). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CARBONDATA-2298) Delete segment lock files before updating metadata
Zhichao Zhang created CARBONDATA-2298: -- Summary: Delete segment lock files before updating metadata Key: CARBONDATA-2298 URL: https://issues.apache.org/jira/browse/CARBONDATA-2298 Project: CarbonData Issue Type: Improvement Affects Versions: 1.4.0, 1.3.2 Reporter: Zhichao Zhang Assignee: Zhichao Zhang Fix For: 1.4.0, 1.3.2 If there are some COMPACTED segments whose last modified time is within the last hour, the segment lock file deletion operation will not be executed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
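The one-hour condition described above can be modelled in a few lines. A hedged sketch (illustrative Python; the function name and parameters are hypothetical, not CarbonData's API):

```python
# Model of the behaviour reported in CARBONDATA-2298: a COMPACTED segment's
# lock file is not eligible for deletion while the segment's last-modified
# time is within the last hour.
import time

ONE_HOUR_MS = 60 * 60 * 1000

def can_delete_lock(status, last_modified_ms, now_ms=None):
    now_ms = int(time.time() * 1000) if now_ms is None else now_ms
    # Recently-compacted segments keep their lock file for one hour.
    if status == "COMPACTED" and now_ms - last_modified_ms < ONE_HOUR_MS:
        return False
    return True
```

The improvement moves lock-file deletion to before the metadata update, so cleanup no longer depends on this timing window alone.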
[jira] [Created] (CARBONDATA-2258) Separate visible and invisible segments info into two files to reduce the size of tablestatus file.
Zhichao Zhang created CARBONDATA-2258: -- Summary: Separate visible and invisible segments info into two files to reduce the size of tablestatus file. Key: CARBONDATA-2258 URL: https://issues.apache.org/jira/browse/CARBONDATA-2258 Project: CarbonData Issue Type: Improvement Components: core Affects Versions: 1.4.0, 1.3.2 Reporter: Zhichao Zhang Assignee: Zhichao Zhang Fix For: 1.4.0, 1.3.2 The tablestatus file keeps getting larger; many places scan this file, and its size impacts the performance of reading it. According to the discussion on [thread|http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/The-size-of-the-tablestatus-file-is-getting-larger-does-it-impact-the-performance-of-reading-this-fi-td41941.html], it can *append* the invisible segment list to a file called 'tablestatus.history' every time the command 'CLEAN FILES FOR TABLE' is executed (in method 'SegmentStatusManager.deleteLoadsAndUpdateMetadata'), separating visible and invisible segments into two files (the tablestatus file and the tablestatus.history file). If listing all segments (visible and invisible) needs to be supported later when executing 'SHOW SEGMENTS FOR TABLE', it just needs to read from both files. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
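The separation proposed above can be sketched as a partition-and-append step. A minimal model (illustrative Python; CarbonData stores these entries as JSON in the table's metadata directory, and the set of invisible statuses here is an assumption):

```python
# Sketch of CARBONDATA-2258: keep visible segment entries in 'tablestatus'
# and append invisible ones to 'tablestatus.history' during clean-files.
# Segment records are modelled as dicts.

INVISIBLE = {"MARKED_FOR_DELETE", "COMPACTED"}

def separate(segments, history):
    visible = [s for s in segments if s["status"] not in INVISIBLE]
    invisible = [s for s in segments if s["status"] in INVISIBLE]
    # History is append-only: never rewrite earlier entries.
    return visible, history + invisible

segs = [{"id": "0", "status": "SUCCESS"},
        {"id": "1", "status": "COMPACTED"},
        {"id": "0.1", "status": "SUCCESS"}]
tablestatus, history = separate(segs, history=[])
print([s["id"] for s in tablestatus])  # ['0', '0.1']
print([s["id"] for s in history])      # ['1']
```

Because invisible entries are only ever appended to the history file, the main tablestatus file stays bounded by the number of live segments, which is the point of the proposal.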
[jira] [Created] (CARBONDATA-2244) When there are some invisible INSERT_IN_PROGRESS/INSERT_OVERWRITE_IN_PROGRESS segments on the main table, a preaggregate table cannot be created on it.
Zhichao Zhang created CARBONDATA-2244: -- Summary: When there are some invisible INSERT_IN_PROGRESS/INSERT_OVERWRITE_IN_PROGRESS segments on the main table, a preaggregate table cannot be created on it. Key: CARBONDATA-2244 URL: https://issues.apache.org/jira/browse/CARBONDATA-2244 Project: CarbonData Issue Type: Bug Affects Versions: 1.3.0, 1.4.0 Reporter: Zhichao Zhang Assignee: Zhichao Zhang Fix For: 1.4.0, 1.3.2 When there are some invisible INSERT_IN_PROGRESS/INSERT_OVERWRITE_IN_PROGRESS segments on the main table, a preaggregate table cannot be created on it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-2230) Add a path into table path to store lock files and delete useless segment lock files before loading
[ https://issues.apache.org/jira/browse/CARBONDATA-2230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhichao Zhang updated CARBONDATA-2230: --- Description: After [PR1984|https://github.com/apache/carbondata/pull/1984] was merged, lock files are not deleted on unlock, so many useless lock files accumulate in the table path, especially segment lock files, which grow with every batch load. Solution: 1. Add a child path into the table path, called Locks; all lock files will be stored in this path. 2. Before loading, find all useless segment lock files and delete them, because only segment lock files grow; other lock files don't. > Add a path into table path to store lock files and delete useless segment > lock files before loading > --- > > Key: CARBONDATA-2230 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2230 > Project: CarbonData > Issue Type: Improvement > Components: data-load >Affects Versions: 1.3.0, 1.4.0 >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang >Priority: Minor > Fix For: 1.3.0, 1.4.0 > > > After [PR1984|https://github.com/apache/carbondata/pull/1984] was merged, lock > files are not deleted on unlock, so many useless lock files accumulate in the > table path, especially segment lock files, which grow with every batch load. > Solution: > 1. Add a child path into the table path, called Locks; all lock files will be > stored in this path. > 2. Before loading, find all useless segment lock files and delete them, > because only segment lock files grow; other lock files don't. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
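Step 2 of the solution above (finding stale segment lock files under the Locks path) can be sketched as a pure filter. A hedged model (illustrative Python; the 'segment_&lt;id&gt;.lock' naming is an assumption for this sketch, not CarbonData's actual file layout):

```python
# Sketch of CARBONDATA-2230 step 2: given the file names under the Locks
# directory and the set of segment ids still in progress, return the segment
# lock files that are safe to delete. Non-segment locks are left alone.

def stale_segment_locks(lock_file_names, in_progress_segments):
    stale = []
    for name in lock_file_names:
        if name.startswith("segment_") and name.endswith(".lock"):
            seg_id = name[len("segment_"):-len(".lock")]
            if seg_id not in in_progress_segments:
                stale.append(name)
    return stale

names = ["segment_1.lock", "segment_2.lock", "tablestatus.lock"]
print(stale_segment_locks(names, in_progress_segments={"2"}))  # ['segment_1.lock']
```

Only segment locks are considered because, as the issue notes, they are the only category that grows with every load; table-level locks are a fixed, small set.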
[jira] [Created] (CARBONDATA-2230) Add a path into table path to store lock files and delete useless segment lock files before loading
Zhichao Zhang created CARBONDATA-2230: -- Summary: Add a path into table path to store lock files and delete useless segment lock files before loading Key: CARBONDATA-2230 URL: https://issues.apache.org/jira/browse/CARBONDATA-2230 Project: CarbonData Issue Type: Improvement Components: data-load Affects Versions: 1.3.0, 1.4.0 Reporter: Zhichao Zhang Assignee: Zhichao Zhang Fix For: 1.4.0, 1.3.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CARBONDATA-2215) Add the description of Carbon Stream Parser into streaming-guide.md
Zhichao Zhang created CARBONDATA-2215: -- Summary: Add the description of Carbon Stream Parser into streaming-guide.md Key: CARBONDATA-2215 URL: https://issues.apache.org/jira/browse/CARBONDATA-2215 Project: CarbonData Issue Type: Task Components: docs Reporter: Zhichao Zhang Assignee: Zhichao Zhang Add the description of Carbon Stream Parser into streaming-guide.md -- This message was sent by Atlassian JIRA (v7.6.3#76005)