[jira] [Created] (CARBONDATA-4300) Clean files command supports specify segment ids
Yahui Liu created CARBONDATA-4300:

Summary: Clean files command supports specifying segment ids
Key: CARBONDATA-4300
URL: https://issues.apache.org/jira/browse/CARBONDATA-4300
Project: CarbonData
Issue Type: New Feature
Components: sql
Reporter: Yahui Liu

The clean files command should support specifying segment ids; the syntax is "clean files for table table_name options("segment_ids"="id1,id2,id3...")". If segment ids are specified, only the segments with those ids will be deleted physically.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CARBONDATA-4291) Carbon hive table supports float datatype
[ https://issues.apache.org/jira/browse/CARBONDATA-4291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yahui Liu updated CARBONDATA-4291:
Affects Version/s: (was: 2.1.1) 2.2.0

> Carbon hive table supports float datatype
>
> Key: CARBONDATA-4291
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4291
> Project: CarbonData
> Issue Type: New Feature
> Components: sql
> Affects Versions: 2.2.0
> Reporter: Yahui Liu
> Priority: Major
> Time Spent: 40m
> Remaining Estimate: 0h
>
> Currently, when a carbon hive table is created with a float column, the column
> is converted to double, so all float data is stored as double.
> In the CTAS scenario, if a source table column is of float type, the data in
> the newly created carbon table will be incorrect.
> Reproduce steps:
> CREATE TABLE p1(f float) stored as parquet;
> insert into table p1 select 12.36;
> create table carbon1 stored as carbondata as select * from p1;
> select * from carbon1;
> Result:
> 5.410467587E-315
> Carbon should support storing the float datatype directly.
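The garbage value above is consistent with a raw 32-bit IEEE-754 float bit pattern being reinterpreted as part of a 64-bit double. A minimal Python sketch of that bit-level effect (the byte layout, with the float bits landing in the low 4 bytes of a zero-padded double, is an assumption for illustration, not Carbon's actual storage code):

```python
import struct

# Pack 12.36 as a 4-byte little-endian IEEE-754 float.
float_bytes = struct.pack('<f', 12.36)

# Reinterpret those 4 bytes as the low half of an 8-byte double,
# with the high 4 bytes (sign and exponent) left as zero.
corrupted = struct.unpack('<d', float_bytes + b'\x00' * 4)[0]

print(corrupted)  # a subnormal double on the order of 1e-315, not 12.36
```

Because the double's exponent field ends up all zeros, the result is a tiny subnormal value, matching the magnitude of the corrupted query result.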
[jira] [Resolved] (CARBONDATA-4215) When carbon.enable.vector.reader=false and upon adding a parquet segment through alter add segments in a carbon table , we are getting error in count(*)
[ https://issues.apache.org/jira/browse/CARBONDATA-4215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Indhumathi resolved CARBONDATA-4215.
Fix Version/s: 2.3.0
Resolution: Fixed

> When carbon.enable.vector.reader=false and upon adding a parquet segment
> through alter add segments in a carbon table, we are getting error in count(*)
>
> Key: CARBONDATA-4215
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4215
> Project: CarbonData
> Issue Type: Bug
> Affects Versions: 2.1.1
> Environment: 3 node FI
> Reporter: Prasanna Ravichandran
> Priority: Minor
> Fix For: 2.3.0
> Time Spent: 11h 50m
> Remaining Estimate: 0h
>
> When carbon.enable.vector.reader=false and a parquet segment is added
> through "alter table ... add segment" to a carbon table, count(*) fails with a
> ClassCastException.
>
> Test queries:
> -- set carbon.enable.vector.reader=false in carbon.properties
> use default;
> drop table if exists uniqdata;
> CREATE TABLE uniqdata (cust_id int, cust_name String, active_emui_version string, dob timestamp, doj timestamp, bigint_column1 bigint, bigint_column2 bigint, decimal_column1 decimal(30,10), decimal_column2 decimal(36,36), double_column1 double, double_column2 double, integer_column1 int) stored as carbondata;
> load data inpath 'hdfs://hacluster/user/prasanna/2000_UniqData.csv' into table uniqdata options('fileheader'='cust_id,cust_name,active_emui_version,dob,doj,bigint_column1,bigint_column2,decimal_column1,decimal_column2,double_column1,double_column2,integer_column1','bad_records_action'='force');
> drop table if exists uniqdata_parquet;
> CREATE TABLE uniqdata_parquet (cust_id int, cust_name String, active_emui_version string, dob timestamp, doj timestamp, bigint_column1 bigint, bigint_column2 bigint, decimal_column1 decimal(30,10), decimal_column2 decimal(36,36), double_column1 double, double_column2 double, integer_column1 int) stored as parquet;
> insert into uniqdata_parquet select * from uniqdata;
> create database if not exists test;
> use test;
> CREATE TABLE uniqdata (cust_id int, cust_name String, active_emui_version string, dob timestamp, doj timestamp, bigint_column1 bigint, bigint_column2 bigint, decimal_column1 decimal(30,10), decimal_column2 decimal(36,36), double_column1 double, double_column2 double, integer_column1 int) stored as carbondata;
> load data inpath 'hdfs://hacluster/user/prasanna/2000_UniqData.csv' into table uniqdata options('fileheader'='cust_id,cust_name,active_emui_version,dob,doj,bigint_column1,bigint_column2,decimal_column1,decimal_column2,double_column1,double_column2,integer_column1','bad_records_action'='force');
> Alter table uniqdata add segment options ('path'='hdfs://hacluster/user/hive/warehouse/uniqdata_parquet','format'='parquet');
> select count(*) from uniqdata; -- throws a ClassCastException
>
> Error log traces:
> java.lang.ClassCastException: org.apache.spark.sql.vectorized.ColumnarBatch cannot be cast to org.apache.spark.sql.catalyst.InternalRow
> at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.agg_doAggregateWithoutKey_0$(Unknown Source)
> at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
> at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
> at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$13$$anon$1.hasNext(WholeStageCodegenExec.scala:584)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
> at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:132)
> at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:58)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:102)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
> at org.apache.spark.scheduler.Task.run(Task.scala:123)
> at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:413)
> at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1551)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:419)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> 2021-06-19 13:50:59,035 | WARN | task-result-getter-2 | Lost task 0.0 in stage 4.0 (TID 28, localhost, executor driver): java.lang.ClassCastException:
[jira] [Updated] (CARBONDATA-4297) Create table(Carbon and Parquet) with combination of partitioned by, Clustered by, Sorted by and with options parameter and insert overwrite fails with parser errors
[ https://issues.apache.org/jira/browse/CARBONDATA-4297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chetan Bhat updated CARBONDATA-4297:
Description:

*Issue 1: Create table (Carbon and Parquet) with a combination of PARTITIONED BY, CLUSTERED BY and SORTED BY fails.*

*Queries-*
CREATE TABLE t (a STRING, b INT, c STRING, d STRING) stored as carbondata OPTIONS (a '1', b '2') PARTITIONED BY (c, d) CLUSTERED BY (a) SORTED BY (b ASC) INTO 2 BUCKETS COMMENT 'table_comment' TBLPROPERTIES (t 'test');
CREATE TABLE t (a STRING, b INT, c STRING, d STRING) stored as parquet OPTIONS (a '1', b '2') PARTITIONED BY (c, d) CLUSTERED BY (a) SORTED BY (b ASC) INTO 2 BUCKETS COMMENT 'table_comment' TBLPROPERTIES (t 'test');

0: jdbc:hive2://7.187.185.158:23040/default> CREATE TABLE t (a STRING, b INT, c STRING, d STRING) stored as carbondata
0: jdbc:hive2://7.187.185.158:23040/default> OPTIONS (a '1', b '2')
0: jdbc:hive2://7.187.185.158:23040/default> PARTITIONED BY (c, d) CLUSTERED BY (a) SORTED BY (b ASC) INTO 2 BUCKETS
0: jdbc:hive2://7.187.185.158:23040/default> COMMENT 'table_comment'
0: jdbc:hive2://7.187.185.158:23040/default> TBLPROPERTIES (t 'test');
Error: org.apache.spark.sql.AnalysisException:
== Spark Parser: org.apache.spark.sql.execution.SparkSqlParser ==
mismatched input 'OPTIONS' expecting (line 2, pos 0)
== SQL ==
CREATE TABLE t (a STRING, b INT, c STRING, d STRING) stored as carbondata
OPTIONS (a '1', b '2')
^^^
PARTITIONED BY (c, d) CLUSTERED BY (a) SORTED BY (b ASC) INTO 2 BUCKETS
COMMENT 'table_comment'
TBLPROPERTIES (t 'test')
== Carbon Parser: org.apache.spark.sql.parser.CarbonExtensionSpark2SqlParser ==
[1.8] failure: identifier matching regex (?i)MATERIALIZED expected
CREATE TABLE t (a STRING, b INT, c STRING, d STRING) stored as carbondata
^;
== Antlr Parser: org.apache.spark.sql.parser.CarbonAntlrParser ==
Antlr SQL Parser will only deal with Merge Into SQL Command; (state=,code=0)

0: jdbc:hive2://7.187.185.158:23040/default> CREATE TABLE t (a STRING, b INT, c STRING, d STRING) stored as parquet
0: jdbc:hive2://7.187.185.158:23040/default> OPTIONS (a '1', b '2')
0: jdbc:hive2://7.187.185.158:23040/default> PARTITIONED BY (c, d) CLUSTERED BY (a) SORTED BY (b ASC) INTO 2 BUCKETS
0: jdbc:hive2://7.187.185.158:23040/default> COMMENT 'table_comment'
0: jdbc:hive2://7.187.185.158:23040/default> TBLPROPERTIES (t 'test');
Error: org.apache.spark.sql.AnalysisException:
== Spark Parser: org.apache.spark.sql.execution.SparkSqlParser ==
mismatched input 'OPTIONS' expecting (line 2, pos 0)
== SQL ==
CREATE TABLE t (a STRING, b INT, c STRING, d STRING) stored as parquet
OPTIONS (a '1', b '2')
^^^
PARTITIONED BY (c, d) CLUSTERED BY (a) SORTED BY (b ASC) INTO 2 BUCKETS
COMMENT 'table_comment'
TBLPROPERTIES (t 'test')
== Carbon Parser: org.apache.spark.sql.parser.CarbonExtensionSpark2SqlParser ==
[1.8] failure: identifier matching regex (?i)MATERIALIZED expected
CREATE TABLE t (a STRING, b INT, c STRING, d STRING) stored as parquet
^;
== Antlr Parser: org.apache.spark.sql.parser.CarbonAntlrParser ==
Antlr SQL Parser will only deal with Merge Into SQL Command; (state=,code=0)

*Issue 2: Create table with the OPTIONS parameter fails.*

*Queries-*
CREATE TABLE tbl (a INT, b STRING, c INT) stored as carbondata OPTIONS ('a' 1);
CREATE TABLE tbl1 (a INT, b STRING, c INT) stored as parquet OPTIONS ('a' 1);

0: jdbc:hive2://7.187.185.158:23040/default> CREATE TABLE tbl (a INT, b STRING, c INT) stored as carbondata OPTIONS ('a' 1);
Error: org.apache.spark.sql.AnalysisException:
== Spark Parser: org.apache.spark.sql.execution.SparkSqlParser ==
mismatched input 'OPTIONS' expecting (line 1, pos 63)
== SQL ==
CREATE TABLE tbl (a INT, b STRING, c INT) stored as carbondata OPTIONS ('a' 1)
---^^^
== Carbon Parser: org.apache.spark.sql.parser.CarbonExtensionSpark2SqlParser ==
[1.8] failure: identifier matching regex (?i)MATERIALIZED expected
CREATE TABLE tbl (a INT, b STRING, c INT) stored as carbondata OPTIONS ('a' 1)
^;
== Antlr Parser: org.apache.spark.sql.parser.CarbonAntlrParser ==
Antlr SQL Parser will only deal with Merge Into SQL Command; (state=,code=0)

0: jdbc:hive2://7.187.185.158:23040/default> CREATE TABLE tbl1 (a INT, b STRING, c INT) stored as parquet OPTIONS ('a' 1);
Error: org.apache.spark.sql.AnalysisException:
== Spark Parser: org.apache.spark.sql.execution.SparkSqlParser ==
mismatched input 'OPTIONS' expecting (line 1, pos 61)
== SQL ==
CREATE TABLE tbl1 (a INT, b STRING, c INT) stored as parquet OPTIONS ('a' 1)
-^^^
== Carbon Parser: org.apache.spark.sql.parser.CarbonExtensionSpark2SqlParser ==
[1.8] failure: identifier matching regex (?i)MATERIALIZED expected
CREATE TABLE tbl1 (a INT, b
[jira] [Updated] (CARBONDATA-4297) Create table(Carbon and Parquet) with combination of partitioned by, Clustered by, Sorted by and with options parameter and insert overwrite fails with parser errors
[ https://issues.apache.org/jira/browse/CARBONDATA-4297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chetan Bhat updated CARBONDATA-4297:
Attachment: image-2021-10-08-12-51-14-837.png