[jira] [Closed] (CARBONDATA-4240) Properties present in https://github.com/apache/carbondata/blob/master/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java which are not present in open source doc
[ https://issues.apache.org/jira/browse/CARBONDATA-4240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat closed CARBONDATA-4240.
---
The mentioned properties are now updated in the documentation; see [https://github.com/apache/carbondata/blob/master/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java]. Hence the bug is closed.

> Properties present in
> https://github.com/apache/carbondata/blob/master/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
> which are not present in open source doc
> ---
>
> Key: CARBONDATA-4240
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4240
> Project: CarbonData
> Issue Type: Bug
> Components: docs
> Affects Versions: 2.2.0
> Environment: Open source docs
> Reporter: Chetan Bhat
> Priority: Minor
> Fix For: 2.3.0
>
> Time Spent: 9.5h
> Remaining Estimate: 0h
>
> Properties present in
> https://github.com/apache/carbondata/blob/master/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
> which are not present in open source doc as mentioned below. These
> properties need to be updated in open source doc.
> carbon.storelocation
> carbon.blocklet.size
> carbon.properties.filepath
> carbon.date.format
> carbon.complex.delimiter.level.1
> carbon.complex.delimiter.level.2
> carbon.complex.delimiter.level.3
> carbon.complex.delimiter.level.4
> carbon.lock.class
> carbon.local.dictionary.enable
> carbon.local.dictionary.decoder.fallback
> spark.deploy.zookeeper.url
> carbon.data.file.version
> spark.carbon.hive.schema.store
> spark.carbon.datamanagement.driver
> spark.carbon.sessionstate.classname
> spark.carbon.sqlastbuilder.classname
> carbon.lease.recovery.retry.count
> carbon.lease.recovery.retry.interval
> carbon.index.schema.storage
> carbon.merge.index.in.segment
> carbon.number.of.cores.while.altPartition
> carbon.minor.compaction.size
> enable.unsafe.columnpage
> carbon.lucene.compression.mode
> sort.inmemory.size.inmb
> is.driver.instance
> carbon.input.metrics.update.interval
> carbon.use.bitset.pipe.line
> is.internal.load.call
> carbon.lucene.index.stop.words
> carbon.load.dateformat.setlenient.enable
> carbon.infilter.subquery.pushdown.enable
> broadcast.record.size
> carbon.indexserver.tempfolder.deletetime

-- This message was sent by Atlassian Jira (v8.20.1#820001)
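A documentation gap like this can be caught mechanically by cross-checking the constant keys against the doc text. The sketch below is illustrative only; the helper name and sample inputs are assumptions, not CarbonData tooling.

```python
# Minimal sketch (not CarbonData tooling): given property keys collected from
# CarbonCommonConstants.java and the text of configuration-parameters.md,
# report the keys the documentation never mentions.
def find_undocumented(property_keys, docs_text):
    """Return the property keys that do not appear in the documentation text."""
    return [key for key in property_keys if key not in docs_text]

# Illustrative inputs; a real audit would read the Java source and the doc file.
keys = ["carbon.storelocation", "carbon.blocklet.size", "carbon.date.format"]
docs = "carbon.blocklet.size controls the blocklet size of a carbondata file."
print(find_undocumented(keys, docs))  # the two keys missing from the doc text
```

Running such a check in CI would flag newly added constants before they drift out of sync with the open source doc.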
[jira] [Created] (CARBONDATA-4321) Major Compaction of a table with multiple big data loads each having different sort scopes fails
Chetan Bhat created CARBONDATA-4321:
---
Summary: Major Compaction of a table with multiple big data loads each having different sort scopes fails
Key: CARBONDATA-4321
URL: https://issues.apache.org/jira/browse/CARBONDATA-4321
Project: CarbonData
Issue Type: Bug
Components: data-load
Affects Versions: 2.3.0
Environment: SUSE/CentOS, Spark 3.1.1
Reporter: Chetan Bhat
Attachments: Failure_Logs.txt

Test Steps:
From Spark beeline a table is created with compression format gzip, the table having more than 100 columns. 3 big data loads, each with a different sort scope, are loaded into the table. Major compaction is then executed on the table.

create table JL_r3 ( p_cap_time String, city String, product_code String, user_base_station String, user_belong_area_code String, user_num String, user_imsi String, user_id String, user_msisdn String, dim1 String, dim2 String, dim3 String, dim4 String, dim5 String, dim6 String, dim7 String, dim8 String, dim9 String, dim10 String, dim11 String, dim12 String, dim13 String, dim14 String, dim15 String, dim16 String, dim17 String, dim18 String, dim19 String, dim20 String, dim21 String, dim22 String, dim23 String, dim24 String, dim25 String, dim26 String, dim27 String, dim28 String, dim29 String, dim30 String, dim31 String, dim32 String, dim33 String, dim34 String, dim35 String, dim36 String, dim37 String, dim38 String, dim39 String, dim40 String, dim41 String, dim42 String, dim43 String, dim44 String, dim45 String, dim46 String, dim47 String, dim48 String, dim49 String, dim50 String, dim51 String, dim52 String, dim53 String, dim54 String, dim55 String, dim56 String, dim57 String, dim58 String, dim59 String, dim60 String, dim61 String, dim62 String, dim63 String, dim64 String, dim65 String, dim66 String, dim67 String, dim68 String, dim69 String, dim70 String, dim71 String, dim72 String, dim73 String, dim74 String, dim75 String, dim76 String, dim77 String, dim78 String, dim79 String, dim80 String, dim81 String, M1 double, M2 double, M3 double, M4 double, M5 double, M6 double, M7 double, M8 double, M9 double, M10 double ) stored as carbondata TBLPROPERTIES('table_blocksize'='256','sort_columns'='dim81','carbon.column.compressor'='gzip');

0: jdbc:hive2://10.21.19.14:23040/default> LOAD DATA inpath 'hdfs://hacluster/chetan/Bigdata_bulk.csv' into table JL_r3 options('sort_scope'='global_sort','DELIMITER'=',', 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','BAD_RECORDS_LOGGER_ENABLE'='TRUE','IS_EMPTY_DATA_BAD_RECORD'='TRUE','FILEHEADER'='p_cap_time,city,product_code,user_base_station,user_belong_area_code,user_num,user_imsi,user_id,user_msisdn,dim1,dim2,dim3,dim4,dim5,dim6,dim7,dim8,dim9,dim10,dim11,dim12,dim13,dim14,dim15,dim16,dim17,dim18,dim19,dim20,dim21,dim22,dim23,dim24,dim25,dim26,dim27,dim28,dim29,dim30,dim31,dim32,dim33,dim34,dim35,dim36,dim37,dim38,dim39,dim40,dim41,dim42,dim43,dim44,dim45,dim46,dim47,dim48,dim49,dim50,dim51,dim52,dim53,dim54,dim55,dim56,dim57,dim58,dim59,dim60,dim61,dim62,dim63,dim64,dim65,dim66,dim67,dim68,dim69,dim70,dim71,dim72,dim73,dim74,dim75,dim76,dim77,dim78,dim79,dim80,dim81,M1,M2,M3,M4,M5,M6,M7,M8,M9,M10');
+-------------+
| Segment ID  |
+-------------+
| 0           |
+-------------+
1 row selected (41.011 seconds)
0: jdbc:hive2://10.21.19.14:23040/default> LOAD DATA inpath 'hdfs://hacluster/chetan/Bigdata_bulk.csv' into table JL_r3 options('sort_scope'='local_sort','DELIMITER'=',', 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','BAD_RECORDS_LOGGER_ENABLE'='TRUE','IS_EMPTY_DATA_BAD_RECORD'='TRUE','FILEHEADER'='p_cap_time,city,product_code,user_base_station,user_belong_area_code,user_num,user_imsi,user_id,user_msisdn,dim1,dim2,dim3,dim4,dim5,dim6,dim7,dim8,dim9,dim10,dim11,dim12,dim13,dim14,dim15,dim16,dim17,dim18,dim19,dim20,dim21,dim22,dim23,dim24,dim25,dim26,dim27,dim28,dim29,dim30,dim31,dim32,dim33,dim34,dim35,dim36,dim37,dim38,dim39,dim40,dim41,dim42,dim43,dim44,dim45,dim46,dim47,dim48,dim49,dim50,dim51,dim52,dim53,dim54,dim55,dim56,dim57,dim58,dim59,dim60,dim61,dim62,dim63,dim64,dim65,dim66,dim67,dim68,dim69,dim70,dim71,dim72,dim73,dim74,dim75,dim76,dim77,dim78,dim79,dim80,dim81,M1,M2,M3,M4,M5,M6,M7,M8,M9,M10');
+-------------+
| Segment ID  |
+-------------+
| 1           |
+-------------+
1 row selected (17.094 seconds)
0: jdbc:hive2://10.21.19.14:23040/default> LOAD DATA inpath 'hdfs://hacluster/chetan/Bigdata_bulk.csv' into table JL_r3 options('sort_scope'='no_sort','DELIMITER'=',',
[jira] [Updated] (CARBONDATA-4297) Create table(Carbon and Parquet) with combination of partitioned by, Clustered by, Sorted by and with options parameter and insert overwrite fails with parser errors
[ https://issues.apache.org/jira/browse/CARBONDATA-4297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-4297: Description: *Issue 1 : Create table* *(Carbon and Parquet) with combination of partitioned by, Clustered by, Sorted by fails -* *Queries-* CREATE TABLE t (a STRING, b INT, c STRING, d STRING) stored as carbondata OPTIONS (a '1', b '2') PARTITIONED BY (c, d) CLUSTERED BY (a) SORTED BY (b ASC) INTO 2 BUCKETS COMMENT 'table_comment' TBLPROPERTIES (t 'test'); CREATE TABLE t (a STRING, b INT, c STRING, d STRING) stored as parquet OPTIONS (a '1', b '2') PARTITIONED BY (c, d) CLUSTERED BY (a) SORTED BY (b ASC) INTO 2 BUCKETS COMMENT 'table_comment' TBLPROPERTIES (t 'test'); 0: jdbc:hive2://7.187.185.158:23040/default> CREATE TABLE t (a STRING, b INT, c STRING, d STRING) stored as carbondata 0: jdbc:hive2://7.187.185.158:23040/default> OPTIONS (a '1', b '2') 0: jdbc:hive2://7.187.185.158:23040/default> PARTITIONED BY (c, d) CLUSTERED BY (a) SORTED BY (b ASC) INTO 2 BUCKETS 0: jdbc:hive2://7.187.185.158:23040/default> COMMENT 'table_comment' 0: jdbc:hive2://7.187.185.158:23040/default> TBLPROPERTIES (t 'test'); Error: org.apache.spark.sql.AnalysisException: == Spark Parser: org.apache.spark.sql.execution.SparkSqlParser == mismatched input 'OPTIONS' expecting (line 2, pos 0) == SQL == CREATE TABLE t (a STRING, b INT, c STRING, d STRING) stored as carbondata OPTIONS (a '1', b '2') ^^^ PARTITIONED BY (c, d) CLUSTERED BY (a) SORTED BY (b ASC) INTO 2 BUCKETS COMMENT 'table_comment' TBLPROPERTIES (t 'test') == Carbon Parser: org.apache.spark.sql.parser.CarbonExtensionSpark2SqlParser == [1.8] failure: identifier matching regex (?i)MATERIALIZED expected CREATE TABLE t (a STRING, b INT, c STRING, d STRING) stored as carbondata ^; == Antlr Parser: org.apache.spark.sql.parser.CarbonAntlrParser == Antlr SQL Parser will only deal with Merge Into SQL Command; (state=,code=0) 0: jdbc:hive2://7.187.185.158:23040/default> CREATE 
TABLE t (a STRING, b INT, c STRING, d STRING) stored as parquet 0: jdbc:hive2://7.187.185.158:23040/default> OPTIONS (a '1', b '2') 0: jdbc:hive2://7.187.185.158:23040/default> PARTITIONED BY (c, d) CLUSTERED BY (a) SORTED BY (b ASC) INTO 2 BUCKETS 0: jdbc:hive2://7.187.185.158:23040/default> COMMENT 'table_comment' 0: jdbc:hive2://7.187.185.158:23040/default> TBLPROPERTIES (t 'test'); Error: org.apache.spark.sql.AnalysisException: == Spark Parser: org.apache.spark.sql.execution.SparkSqlParser == mismatched input 'OPTIONS' expecting (line 2, pos 0) == SQL == CREATE TABLE t (a STRING, b INT, c STRING, d STRING) stored as parquet OPTIONS (a '1', b '2') ^^^ PARTITIONED BY (c, d) CLUSTERED BY (a) SORTED BY (b ASC) INTO 2 BUCKETS COMMENT 'table_comment' TBLPROPERTIES (t 'test') == Carbon Parser: org.apache.spark.sql.parser.CarbonExtensionSpark2SqlParser == [1.8] failure: identifier matching regex (?i)MATERIALIZED expected CREATE TABLE t (a STRING, b INT, c STRING, d STRING) stored as parquet ^; == Antlr Parser: org.apache.spark.sql.parser.CarbonAntlrParser == Antlr SQL Parser will only deal with Merge Into SQL Command; (state=,code=0) 0: jdbc:hive2://7.187.185.158:23040/default> *Issue 2 : Create table with options parameter fails-* *Queries-* CREATE TABLE tbl (a INT, b STRING, c INT) stored as carbondata OPTIONS ('a' 1); CREATE TABLE tbl1 (a INT, b STRING, c INT) stored as parquet OPTIONS ('a' 1); 0: jdbc:hive2://7.187.185.158:23040/default> CREATE TABLE tbl (a INT, b STRING, c INT) stored as carbondata OPTIONS ('a' 1); Error: org.apache.spark.sql.AnalysisException: == Spark Parser: org.apache.spark.sql.execution.SparkSqlParser == mismatched input 'OPTIONS' expecting (line 1, pos 63) == SQL == CREATE TABLE tbl (a INT, b STRING, c INT) stored as carbondata OPTIONS ('a' 1) ---^^^ == Carbon Parser: org.apache.spark.sql.parser.CarbonExtensionSpark2SqlParser == [1.8] failure: identifier matching regex (?i)MATERIALIZED expected CREATE TABLE tbl (a INT, b STRING, c INT) 
stored as carbondata OPTIONS ('a' 1) ^; == Antlr Parser: org.apache.spark.sql.parser.CarbonAntlrParser == Antlr SQL Parser will only deal with Merge Into SQL Command; (state=,code=0) 0: jdbc:hive2://7.187.185.158:23040/default> CREATE TABLE tbl1 (a INT, b STRING, c INT) stored as parquet OPTIONS ('a' 1); Error: org.apache.spark.sql.AnalysisException: == Spark Parser: org.apache.spark.sql.execution.SparkSqlParser == mismatched input 'OPTIONS' expecting (line 1, pos 61) == SQL == CREATE TABLE tbl1 (a INT, b STRING, c INT) stored as parquet OPTIONS ('a' 1) -^^^ == Carbon Parser: org.apache.spark.sql.parser.CarbonExtensionSpark2SqlParser == [1.8] failure: identifier matching regex (?i)MATERIALIZED expected CREATE TABLE tbl1 (a INT, b
[jira] [Updated] (CARBONDATA-4297) Create table(Carbon and Parquet) with combination of partitioned by, Clustered by, Sorted by and with options parameter and insert overwrite fails with parser errors
[ https://issues.apache.org/jira/browse/CARBONDATA-4297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-4297:
Attachment: image-2021-10-08-12-51-14-837.png
Description: (verbatim duplicate of the description quoted in the update above)
[jira] [Created] (CARBONDATA-4297) Create table(Carbon and Parquet) with combination of partitioned by, Clustered by, Sorted by and with options parameter fails with parser errors in Carbon session in Spark 2.4.5
Chetan Bhat created CARBONDATA-4297:
---
Summary: Create table(Carbon and Parquet) with combination of partitioned by, Clustered by, Sorted by and with options parameter fails with parser errors in Carbon session in Spark 2.4.5
Key: CARBONDATA-4297
URL: https://issues.apache.org/jira/browse/CARBONDATA-4297
Project: CarbonData
Issue Type: Bug
Components: sql
Affects Versions: 2.3.0
Environment: Spark 2.4.5
Reporter: Chetan Bhat

(The issue description is a verbatim duplicate of the one quoted in the update above.)
[jira] [Created] (CARBONDATA-4294) Some Carbondata github docs links not working
Chetan Bhat created CARBONDATA-4294:
---
Summary: Some Carbondata github docs links not working
Key: CARBONDATA-4294
URL: https://issues.apache.org/jira/browse/CARBONDATA-4294
Project: CarbonData
Issue Type: Bug
Components: docs
Affects Versions: 2.3.0
Environment: Carbondata github links
Reporter: Chetan Bhat

1. On the https://github.com/apache/carbondata/blob/master/docs/how-to-contribute-to-apache-carbondata.md#engage page, clicking the "Apache CarbonData Dev Mailing List archive" link does not open the target. Likewise, in "If you do not already have an account, sign up here", the "here" link target does not open.
2. On the https://github.com/apache/carbondata/blob/master/docs/how-to-contribute-to-apache-carbondata.md#deleting-your-branch--optional- page, clicking the "Deleting your branch (optional)" link does not open the target page.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
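Broken links like these can be found by extracting every markdown link from a docs page and probing each target. A minimal sketch of the extraction step follows; the regex, function name, and sample link are illustrative assumptions, not an official CarbonData script.

```python
import re

# Matches inline markdown links of the form [label](target).
LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")

def extract_links(markdown_text):
    """Return (label, target) pairs for inline markdown links."""
    return LINK_RE.findall(markdown_text)

sample = "If you do not already have an account, [sign up here](https://example.org/signup)."
print(extract_links(sample))  # [('sign up here', 'https://example.org/signup')]
```

Each extracted target could then be fetched (for example with `urllib.request`) to verify that it resolves, turning this manual audit into a repeatable check.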
[jira] [Closed] (CARBONDATA-4235) after alter add column when user does rename operation, the select operation on struct type gives null value and children of struct gives error
[ https://issues.apache.org/jira/browse/CARBONDATA-4235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat closed CARBONDATA-4235.
---
Fix Version/s: 2.3.0
Resolution: Fixed
Issue fixed in 2.3.0.

> after alter add column when user does rename operation, the select operation
> on struct type gives null value and children of struct gives error
>
> Key: CARBONDATA-4235
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4235
> Project: CarbonData
> Issue Type: Bug
> Components: sql
> Affects Versions: 2.2.0
> Environment: Spark 3.1.1, Spark 2.4.5
> Reporter: Chetan Bhat
> Priority: Minor
> Fix For: 2.3.0
>
> *Queries –*
> drop table if exists test_rename;
> CREATE TABLE test_rename (str1 struct<a:int>, str2 struct<a:struct<b:int>>,
> str3 struct<a:struct<b:struct<c:int>>> comment 'struct', intfield int, arr1
> array<int>, arr2 array<array<int>>, arr3 array<string>, arr4
> array<struct<a:int>> comment 'array') STORED AS carbondata;
> insert into test_rename values (named_struct('a', 2), named_struct('a',
> named_struct('b', 2)), named_struct('a', named_struct('b',named_struct('c',
> 2))), 1, array(1,2,3), array(array(1,2),array(3,4)), array('hello','world'),
> array(named_struct('a',45)));
> ALTER TABLE test_rename ADD COLUMNS(arr_1 ARRAY);
> alter table test_rename change str2 str22 struct<a:struct<b:int>>;
> select str22 from test_rename;
> select str22.a from test_rename;
> select str22.a.b from test_rename;
>
> Issue: after alter add column, when the user does a rename operation the select
> operation on struct type gives null value and children of struct gives error.
>
> *Issue 1 : Exception trace on executing query –*
> 0: jdbc:hive2://vm2:22550/> select str22.a.b from test_rename;
> INFO : Execution ID: 2465
> Error: org.apache.hive.service.cli.HiveSQLException: Error running query:
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in
> stage 1100.0 failed 4 times, most recent failure: Lost task 0.3 in stage
> 1100.0 (TID 10353) (vm1 executor 5): java.nio.BufferUnderflowException
> at
java.nio.HeapByteBuffer.get(HeapByteBuffer.java:155) > at > org.apache.carbondata.core.scan.complextypes.PrimitiveQueryType.getDataObject(PrimitiveQueryType.java:166) > at > org.apache.carbondata.core.scan.complextypes.PrimitiveQueryType.getDataObject(PrimitiveQueryType.java:147) > at > org.apache.carbondata.core.scan.complextypes.PrimitiveQueryType.getDataBasedOnColumn(PrimitiveQueryType.java:141) > at > org.apache.carbondata.core.scan.complextypes.StructQueryType.getDataBasedOnColumn(StructQueryType.java:160) > at > org.apache.carbondata.core.scan.complextypes.StructQueryType.getDataBasedOnColumn(StructQueryType.java:160) > at > org.apache.carbondata.core.scan.collector.impl.DictionaryBasedResultCollector.fillRow(DictionaryBasedResultCollector.java:316) > at > org.apache.carbondata.core.scan.collector.impl.DictionaryBasedResultCollector.fillDimensionData(DictionaryBasedResultCollector.java:288) > at > org.apache.carbondata.core.scan.collector.impl.DictionaryBasedResultCollector.collectResultInRow(DictionaryBasedResultCollector.java:159) > at > org.apache.carbondata.core.scan.processor.DataBlockIterator.next(DataBlockIterator.java:110) > at > org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.getBatchResult(DetailQueryResultIterator.java:58) > at > org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:50) > at > org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:32) > at > org.apache.carbondata.core.scan.result.iterator.ChunkRowIterator.hasNext(ChunkRowIterator.java:56) > at > org.apache.carbondata.hadoop.CarbonRecordReader.nextKeyValue(CarbonRecordReader.java:127) > at > org.apache.carbondata.spark.rdd.CarbonScanRDD$$anon$1.hasNext(CarbonScanRDD.scala:557) > at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458) > at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458) > at > 
org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown > Source) > at > org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) > at > org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:755) > at > org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:345) > at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:897) > at > org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:897) > at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) > at
[jira] [Closed] (CARBONDATA-4241) if the sort scope is changed to global sort and data loaded, major compaction fails
[ https://issues.apache.org/jira/browse/CARBONDATA-4241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat closed CARBONDATA-4241.
---
Fix Version/s: 2.3.0
Resolution: Fixed
The 2 scenarios in the bug are fixed.

Scenario 1:
0: jdbc:hive2://10.21.19.14:23040> CREATE TABLE uniqdata_pagesize (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,36),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED as carbondata TBLPROPERTIES('table_page_size_inmb'='1');
+---------+
| Result  |
+---------+
+---------+
No rows selected (1.637 seconds)
0: jdbc:hive2://10.21.19.14:23040> LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata_pagesize OPTIONS('DELIMITER'=',' , 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
+-------------+
| Segment ID  |
+-------------+
| 0           |
+-------------+
1 row selected (5.361 seconds)
0: jdbc:hive2://10.21.19.14:23040> alter table uniqdata_pagesize set tblproperties('sort_columns'='CUST_ID','sort_scope'='global_sort');
+---------+
| Result  |
+---------+
+---------+
No rows selected (0.883 seconds)
0: jdbc:hive2://10.21.19.14:23040> LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata_pagesize OPTIONS('DELIMITER'=',' , 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
+-------------+
| Segment ID  |
+-------------+
| 1           |
+-------------+
1 row selected (2.104 seconds)
0: jdbc:hive2://10.21.19.14:23040> alter table uniqdata_pagesize compact 'major';
+---------+
| Result  |
+---------+
+---------+
No rows selected (5.737 seconds)

Scenario 2:
0: jdbc:hive2://10.21.19.14:23040> CREATE TABLE uniqdata_sortcol_bloom_locdic (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED as carbondata tblproperties('sort_columns'='cust_id,cust_name,dob,doj,bigint_column1');
+---------+
| Result  |
+---------+
+---------+
No rows selected (0.31 seconds)
0: jdbc:hive2://10.21.19.14:23040> LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata_sortcol_bloom_locdic OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
+-------------+
| Segment ID  |
+-------------+
| 0           |
+-------------+
1 row selected (1.613 seconds)
0: jdbc:hive2://10.21.19.14:23040> alter table uniqdata_sortcol_bloom_locdic set tblproperties('local_dictionary_enable'='true','local_dictionary_threshold'='1000');
+---------+
| Result  |
+---------+
+---------+
No rows selected (0.711 seconds)
0: jdbc:hive2://10.21.19.14:23040> alter table uniqdata_sortcol_bloom_locdic set tblproperties('sort_columns'='CUST_ID','sort_scope'='global_sort');
+---------+
| Result  |
+---------+
+---------+
No rows selected (0.638 seconds)
0: jdbc:hive2://10.21.19.14:23040> LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata_sortcol_bloom_locdic partition(active_emui_version='xyz') OPTIONS('FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1','BAD_RECORDS_ACTION'='FORCE');
+-------------+
| Segment ID  |
+-------------+
| 1           |
+-------------+
1 row selected (0.929 seconds)
0: jdbc:hive2://10.21.19.14:23040> alter table uniqdata_sortcol_bloom_locdic compact 'major';
+---------+
| Result  |
+---------+
+---------+
No rows selected (1.581 seconds)

> if the sort scope is changed to global sort and data loaded, major compaction
> fails
> ---
>
> Key: CARBONDATA-4241
> URL:
https://issues.apache.org/jira/browse/CARBONDATA-4241 > Project: CarbonData > Issue Type: Bug > Components: data-load >Affects Versions: 2.2.0 > Environment: Spark 2.3.2 Carbon 1.6.1 , Spark 3.1.1 Carbon 2.2.0 >Reporter: Chetan Bhat >Assignee: Indhumathi Muthumurugesh >Priority: Major > Fix For: 2.3.0, 2.2.0 > > > *Scenario 1 : create table with table_page_size_inmb'='1', load data ,* *set
[jira] [Closed] (CARBONDATA-4236) Documentation correctness and link issues in https://github.com/apache/carbondata/blob/master/docs/
[ https://issues.apache.org/jira/browse/CARBONDATA-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat closed CARBONDATA-4236. --- Issue is fixed now.
> Documentation correctness and link issues in
> https://github.com/apache/carbondata/blob/master/docs/
> ---
>
> Key: CARBONDATA-4236
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4236
> Project: CarbonData
> Issue Type: Bug
> Components: docs
>Affects Versions: 2.2.0
> Environment: docs with content and examples verified on Spark 2.4.5 and Spark 3.1.1 compatible carbon.
>Reporter: Chetan Bhat
>Priority: Minor
> Fix For: 2.3.0
>
>
> In the documentation link https://github.com/apache/carbondata/blob/master/docs/
> Issue 1 :-
> In link -> https://github.com/apache/carbondata/blob/master/docs/configuration-parameters.md the "See detail" links do not open the target "http://spark.apache.org/docs/latest/rdd-programming-guide.html#rdd-persistence"
> In link --> https://github.com/apache/carbondata/blob/master/docs/documentation.md the link "Apache CarbonData wiki" when clicked tries to open the link "https://cwiki.apache.org/confluence/display/CARBONDATA/CarbonData+Home" but the target page can't be opened. Similarly the other links in the "External Resources" section can't be opened due to the same error.
> In link https://github.com/apache/carbondata/blob/master/docs/faq.md#what-are-bad-records the link "https://thrift.apache.org/docs/install" when clicked does not open the target page.
> In link https://github.com/apache/carbondata/blob/master/docs/quick-start-guide.md when the "Spark website" link is clicked the https://spark.apache.org/downloads.html page is not opened. Also on the same page when the "Apache Spark Documentation" link is clicked the "http://spark.apache.org/docs/latest/" page is not opened.
> In the link https://github.com/apache/carbondata/blob/master/docs/release-guide.md the "Product Release Policy link", "release signing guidelines", "Apache Nexus repository" and "repository.apache.org" links do not open the target pages when clicked.
> Issue 2:-
> In link --> https://github.com/apache/carbondata/blob/master/docs/configuration-parameters.md the "To configure Ranges-based Compaction" to be changed to "To configure Range-based Compaction"
> Issue 3:-
> In link --> https://github.com/apache/carbondata/blob/master/docs/configuration-parameters.md the "Making this true degrade the LOAD performance" to be changed to "Making this true degrades the LOAD performance"
> Issue 4 :-
> In link --> https://github.com/apache/carbondata/blob/master/docs/configuration-parameters.md the "user an either set to true" to be changed to "user can either set to true"
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CARBONDATA-4276) writestream fail when csv is copied to readstream hdfs path in Spark 2.4.5
[ https://issues.apache.org/jira/browse/CARBONDATA-4276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-4276: Description:
*With Carbon 2.2.0 Spark 2.4.5 cluster*
*steps :*
*+In hdfs execute following command :+*
cd /opt/HA/C10/install/hadoop/datanode/bin/
./hdfs dfs -rm -r /tmp/stream_test/checkpoint_all_data
./hdfs dfs -mkdir -p /tmp/stream_test/\{checkpoint_all_data,bad_records_all_data}
./hdfs dfs -mkdir -p /Priyesh/streaming/csv/
./hdfs dfs -cp /chetan/100_olap_C20.csv /Priyesh/streaming/csv/
./hdfs dfs -cp /Priyesh/streaming/csv/100_olap_C20.csv /Priyesh/streaming/csv/100_olap_C21.csv
*+From Spark-beeline /Spark-sql /Spark-shell, execute :+*
DROP TABLE IF EXISTS all_datatypes_2048;
create table all_datatypes_2048 (imei string,deviceInformationId int,MAC string,deviceColor string,device_backColor string,modelId string,marketName string,AMSize string,ROMSize string,CUPAudit string,CPIClocked string,series string,productionDate timestamp,bomCode string,internalModels string, deliveryTime string, channelsId string, channelsName string , deliveryAreaId string, deliveryCountry string, deliveryProvince string, deliveryCity string,deliveryDistrict string, deliveryStreet string, oxSingleNumber string, ActiveCheckTime string, ActiveAreaId string, ActiveCountry string, ActiveProvince string, Activecity string, ActiveDistrict string, ActiveStreet string, ActiveOperatorId string, Active_releaseId string, Active_EMUIVersion string, Active_operaSysVersion string, Active_BacVerNumber string, Active_BacFlashVer string, Active_webUIVersion string, Active_webUITypeCarrVer string,Active_webTypeDataVerNumber string, Active_operatorsVersion string, Active_phonePADPartitionedVersions string, Latest_YEAR int, Latest_MONTH int, Latest_DAY Decimal(30,10), Latest_HOUR string, Latest_areaId string, Latest_country string, Latest_province string, Latest_city string, Latest_district string, Latest_street string, Latest_releaseId string,
Latest_EMUIVersion string, Latest_operaSysVersion string, Latest_BacVerNumber string, Latest_BacFlashVer string, Latest_webUIVersion string, Latest_webUITypeCarrVer string, Latest_webTypeDataVerNumber string, Latest_operatorsVersion string, Latest_phonePADPartitionedVersions string, Latest_operatorId string, gamePointDescription string,gamePointId double,contractNumber BigInt) stored as carbondata TBLPROPERTIES('table_blocksize'='2048','streaming'='true', 'sort_columns'='imei');
*+From Spark-shell, execute :+*
import org.apache.spark.sql.streaming._
import org.apache.spark.sql.streaming.Trigger.ProcessingTime
val df_j=spark.readStream.text("hdfs://hacluster/Priyesh/streaming/csv/*.csv")
df_j.writeStream.format("carbondata").option("dbName","ranjan").option("carbon.stream.parser","org.apache.carbondata.streaming.parser.CSVStreamParserImp").option("checkpointLocation", "hdfs://hacluster/tmp/stream_test/checkpoint_all_data").option("bad_records_action","hdfs://hacluster/tmp/stream_test/bad_records_all_data").option("tableName","all_datatypes_2048").trigger(ProcessingTime(6000)).option("carbon.streaming.auto.handoff.enabled","true").option("carbon.streaming.segment.max.size",102400).start
show segments for table all_datatypes_2048;
*issue 1 :*
*+when the csv file is copied to the hdfs folder for the first time after streaming has started, writestream fails with the following error:+*
scala> df_j.writeStream.format("carbondata").option("dbName","ranjan").option("carbon.stream.parser","org.apache.carbondata.streaming.parser.CSVStreamParserImp").option("checkpointLocation", "hdfs://hacluster/tmp/stream_test/checkpoint_all_data").option("bad_records_action","hdfs://hacluster/tmp/stream_test/bad_records_all_data").option("tableName","all_datatypes_2048").trigger(ProcessingTime(6000)).option("carbon.streaming.auto.handoff.enabled","true").option("carbon.streaming.segment.max.size",102400).start
21/08/26 12:53:11 WARN CarbonProperties: The enable mv value "null" is invalid. Using the default value "true"
21/08/26 12:53:11 WARN CarbonProperties: The value "LOCALLOCK" configured for key carbon.lock.type is invalid for current file system. Use the default value HDFSLOCK instead.
21/08/26 12:53:12 WARN HiveConf: HiveConf of name hive.metastore.rdb.password.decode.enable does not exist
21/08/26 12:53:12 WARN HiveConf: HiveConf of name hive.metastore.db.ssl.enabled does not exist
21/08/26 12:53:13 WARN HiveConf: HiveConf of name hive.metastore.rdb.password.decode.enable does not exist
21/08/26 12:53:13 WARN HiveConf: HiveConf of name hive.metastore.db.ssl.enabled does not exist
21/08/26 12:53:14 WARN ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
res0: org.apache.spark.sql.streaming.StreamingQuery = org.apache.spark.sql.execution.streaming.StreamingQueryWrapper@ad038f8
scala> 21/08/26 13:00:49 WARN DFSClient: DataStreamer Exception
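For readability, the long single-line `writeStream` chain above can be thought of as one base call plus a map of options. The helper below is a hypothetical sketch: the option keys and values are copied verbatim from the repro, but the function name and the idea of collecting them into a dict are illustrative additions, not part of the report.

```python
# Hypothetical helper: gathers the writeStream options used in the repro into
# one dict so the call chain stays readable. Keys/values are taken verbatim
# from the bug report; the helper itself is illustrative only.
def carbon_stream_options(db_name, table_name, checkpoint_dir, bad_records_dir):
    return {
        "dbName": db_name,
        "tableName": table_name,
        "carbon.stream.parser":
            "org.apache.carbondata.streaming.parser.CSVStreamParserImp",
        "checkpointLocation": checkpoint_dir,
        "bad_records_action": bad_records_dir,
        "carbon.streaming.auto.handoff.enabled": "true",
        # the repro passes this one as a bare number, not a string
        "carbon.streaming.segment.max.size": 102400,
    }

opts = carbon_stream_options(
    "ranjan",
    "all_datatypes_2048",
    "hdfs://hacluster/tmp/stream_test/checkpoint_all_data",
    "hdfs://hacluster/tmp/stream_test/bad_records_all_data",
)
```

In PySpark such a map could then be applied in one call (e.g. `df.writeStream.format("carbondata").options(**opts)`), rather than chaining seven `.option(...)` calls as in the Scala repro.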
[jira] [Updated] (CARBONDATA-4276) writestream fail when csv is copied to readstream hdfs path in Spark 2.4.5
[ https://issues.apache.org/jira/browse/CARBONDATA-4276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-4276: Summary: writestream fail when csv is copied to readstream hdfs path in Spark 2.4.5 (was: writestream fail when csv is copied to readstream hdfs path) > writestream fail when csv is copied to readstream hdfs path in Spark 2.4.5 > -- > > Key: CARBONDATA-4276 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4276 > Project: CarbonData > Issue Type: Bug > Components: data-load >Affects Versions: 2.2.0 > Environment: Spark 2.4.5 >Reporter: PRIYESH RANJAN >Priority: Minor > > *steps :* > *+In hdfs execute following command :+* > cd /opt/HA/C10/install/hadoop/datanode/bin/ > ./hdfs dfs -rm -r /tmp/stream_test/checkpoint_all_data > ./hdfs dfs -mkdir -p > /tmp/stream_test/\{checkpoint_all_data,bad_records_all_data} > ./hdfs dfs -mkdir -p /Priyesh/streaming/csv/ > ./hdfs dfs -cp /chetan/100_olap_C20.csv /Priyesh/streaming/csv/ > ./hdfs dfs -cp /Priyesh/streaming/csv/100_olap_C20.csv > /Priyesh/streaming/csv/100_olap_C21.csv > > *+From Spark-beeline /Spark-sql /Spark-shell, execute :+* > DROP TABLE IF EXISTS all_datatypes_2048; > create table all_datatypes_2048 (imei string,deviceInformationId int,MAC > string,deviceColor string,device_backColor string,modelId string,marketName > string,AMSize string,ROMSize string,CUPAudit string,CPIClocked string,series > string,productionDate timestamp,bomCode string,internalModels string, > deliveryTime string, channelsId string, channelsName string , deliveryAreaId > string, deliveryCountry string, deliveryProvince string, deliveryCity > string,deliveryDistrict string, deliveryStreet string, oxSingleNumber string, > ActiveCheckTime string, ActiveAreaId string, ActiveCountry string, > ActiveProvince string, Activecity string, ActiveDistrict string, ActiveStreet > string, ActiveOperatorId string, Active_releaseId string, Active_EMUIVersion > string, Active_operaSysVersion 
string, Active_BacVerNumber string, > Active_BacFlashVer string, Active_webUIVersion string, > Active_webUITypeCarrVer string,Active_webTypeDataVerNumber string, > Active_operatorsVersion string, Active_phonePADPartitionedVersions string, > Latest_YEAR int, Latest_MONTH int, Latest_DAY Decimal(30,10), Latest_HOUR > string, Latest_areaId string, Latest_country string, Latest_province string, > Latest_city string, Latest_district string, Latest_street string, > Latest_releaseId string, Latest_EMUIVersion string, Latest_operaSysVersion > string, Latest_BacVerNumber string, Latest_BacFlashVer string, > Latest_webUIVersion string, Latest_webUITypeCarrVer string, > Latest_webTypeDataVerNumber string, Latest_operatorsVersion string, > Latest_phonePADPartitionedVersions string, Latest_operatorId string, > gamePointDescription string,gamePointId double,contractNumber BigInt) stored > as carbondata TBLPROPERTIES('table_blocksize'='2048','streaming'='true', > 'sort_columns'='imei'); > > *+From Spark-shell ,execute :+* > import org.apache.spark.sql.streaming._ > import org.apache.spark.sql.streaming.Trigger.ProcessingTime > val df_j=spark.readStream.text("hdfs://hacluster/Priyesh/streaming/csv/*.csv") > df_j.writeStream.format("carbondata").option("dbName","ranjan").option("carbon.stream.parser","org.apache.carbondata.streaming.parser.CSVStreamParserImp").option("checkpointLocation", > > "hdfs://hacluster/tmp/stream_test/checkpoint_all_data").option("bad_records_action","hdfs://hacluster/tmp/stream_test/bad_records_all_data").option("tableName","all_datatypes_2048").trigger(ProcessingTime(6000)).option("carbon.streaming.auto.handoff.enabled","true").option("carbon.streaming.segment.max.size",102400).start > show segments for table all_datatypes_2048; > > *issue 1 :* > *+when copy csv file in hdfs folder for 1st time after streaming started > ,writestream fails with error:+* > scala> > 
df_j.writeStream.format("carbondata").option("dbName","ranjan").option("carbon.stream.parser","org.apache.carbondata.streaming.parser.CSVStreamParserImp").option("checkpointLocation", > > "hdfs://hacluster/tmp/stream_test/checkpoint_all_data").option("bad_records_action","hdfs://hacluster/tmp/stream_test/bad_records_all_data").option("tableName","all_datatypes_2048").trigger(ProcessingTime(6000)).option("carbon.streaming.auto.handoff.enabled","true").option("carbon.streaming.segment.max.size",102400).start > 21/08/26 12:53:11 WARN CarbonProperties: The enable mv value "null" is > invalid. Using the default value "true" > 21/08/26 12:53:11 WARN CarbonProperties: The value "LOCALLOCK" configured for > key carbon.lock.type is invalid for current file system. Use the default > value HDFSLOCK instead. > 21/08/26 12:53:12
[jira] [Updated] (CARBONDATA-4243) Select filter query with to_date in filter fails for table with column_meta_cache configured also having SI
[ https://issues.apache.org/jira/browse/CARBONDATA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-4243: Environment: Spark 3.1.1, Spark 2.4.5 (was: Spark 3.1.1) > Select filter query with to_date in filter fails for table with > column_meta_cache configured also having SI > --- > > Key: CARBONDATA-4243 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4243 > Project: CarbonData > Issue Type: Bug > Components: sql >Affects Versions: 2.2.0 > Environment: Spark 3.1.1, Spark 2.4.5 >Reporter: Chetan Bhat >Priority: Minor > > Create table with column_meta_cache, create secondary indexes and load data > to table. > Execute the Select filter query with to_date in filter. > CREATE TABLE uniqdata (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION > string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 > bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 > decimal(36,36),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 > int) stored as carbondata > TBLPROPERTIES('COLUMN_META_CACHE'='CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ'); > CREATE INDEX indextable2 ON TABLE uniqdata (DOB) AS 'carbondata'; > CREATE INDEX indextable3 ON TABLE uniqdata (DOJ) AS 'carbondata'; > LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table > uniqdata OPTIONS('DELIMITER'=',' , > 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1'); > > *Issue: Select filter query with to_date in filter fails for table with > column_meta_cache configured also having SI* > 0: jdbc:hive2://10.21.19.14:23040/default> select > max(to_date(DOB)),min(to_date(DOB)),count(to_date(DOB)) from uniqdata where > to_date(DOB)='1975-06-11' or to_date(Dn select > max(to_date(DOB)),min(to_date(DOB)),count(to_date(DOB)) from uniqdata where > to_date(DOB)='1975-06-11' 
or to_date(DOB)='1975-06-23'; > Error: org.apache.hive.service.cli.HiveSQLException: Error running query: > org.apache.spark.sql.catalyst.errors.package$TreeNodeException: makeCopy, > tree: > !BroadCastSIFilterPushJoin [none#0|#0], [none#1|#1], Inner, BuildRight > :- *(6) ColumnarToRow > : +- Scan CarbonDatasourceHadoopRelation chetan.uniqdata[dob#847024|#847024] > Batched: true, DirectScan: false, PushedFilters: [((cast(input[0] as date) = > 1987) or (cast(in9))], ReadSchema: [dob] > +- *(8) HashAggregate(keys=[positionReference#847161|#847161], functions=[], > output=[positionReference#847161|#847161]) > +- ReusedExchange [positionReference#847161|#847161], Exchange > hashpartitioning(positionReference#847161, 200), ENSURE_REQUIREMENTS, > [id=#195473|#195473] > at > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(Sparation.scala:361) > at > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:263) > at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) > at > org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:78) > at > org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:62) > at > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:43) > at > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:263) > at > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:258) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746) > at > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:272) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: > makeCopy, tree: > !BroadCastSIFilterPushJoin [none#0|#0], [none#1|#1], Inner, BuildRight > :-
[jira] [Reopened] (CARBONDATA-4241) if the sort scope is changed to global sort and data loaded, major compaction fails
[ https://issues.apache.org/jira/browse/CARBONDATA-4241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat reopened CARBONDATA-4241: - Issue still exists with Spark 3.1.1 jars in https://dist.apache.org/repos/dist/release/carbondata/2.2.0/ > if the sort scope is changed to global sort and data loaded, major compaction > fails > --- > > Key: CARBONDATA-4241 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4241 > Project: CarbonData > Issue Type: Bug > Components: data-load >Affects Versions: 2.2.0 > Environment: Spark 2.3.2 Carbon 1.6.1 , Spark 3.1.1 Carbon 2.2.0 >Reporter: Chetan Bhat >Assignee: Indhumathi Muthumurugesh >Priority: Major > Fix For: 2.2.0 > > > *Scenario 1 : create table with table_page_size_inmb'='1', load data ,* *set > sortscope as global sort , load data and do major compaction.*** > 0: jdbc:hive2://10.21.19.14:23040/default> CREATE TABLE uniqdata_pagesize > (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ > timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 > decimal(30,10), DECIMAL_COLUMN2 decimal(36,36),Double_COLUMN1 double, > Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED as carbondata > TBLPROPERTIES('table_page_size_inmb'='1'); > +-+ > | Result | > +-+ > +-+ > No rows selected (0.229 seconds) > 0: jdbc:hive2://10.21.19.14:23040/default> LOAD DATA INPATH > 'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata_pagesize > OPTIONS('DELIMITER'=',' , > 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1'); > +-+ > | Segment ID | > +-+ > | 0 | > +-+ > 1 row selected (1.016 seconds) > 0: jdbc:hive2://10.21.19.14:23040/default> alter table uniqdata_pagesize set > tblproperties('sort_columns'='CUST_ID','sort_scope'='global_sort'); > +-+ > | Result | > +-+ > +-+ > No rows selected (0.446 seconds) 
> 0: jdbc:hive2://10.21.19.14:23040/default> LOAD DATA INPATH > 'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata_pagesize > OPTIONS('DELIMITER'=',' , > 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1'); > +-+ > | Segment ID | > +-+ > | 1 | > +-+ > 1 row selected (0.767 seconds) > 0: jdbc:hive2://10.21.19.14:23040/default> alter table uniqdata_pagesize > compact 'major'; > Error: org.apache.hive.service.cli.HiveSQLException: Error running query: > org.apache.spark.sql.AnalysisException: Compaction failed. Please check logs > for more info. Exception in compaction Compaction Failure in Merger Rdd. > at > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:361) > at > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:263) > at > scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) > at > org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:78) > at > org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:62) > at > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:43) > at > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:263) > at > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:258) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746) > at > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:272) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at >
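The failing sequence above (load under the default sort scope, switch `sort_scope` to `global_sort`, load again, then major compaction) can be made explicit as an ordered script. The snippet below is illustrative only: it just assembles the SQL strings in the order the report executes them (no Spark connection is made); the table, CSV path, and option values are the ones from the repro.

```python
# Illustrative only: the ordered SQL statements from the compaction repro,
# so the failing sequence (load -> alter sort scope -> load -> major compact)
# is explicit. Statements are copied from the bug report.
TABLE = "uniqdata_pagesize"
LOAD = (f"LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' "
        f"into table {TABLE} OPTIONS('DELIMITER'=',', 'QUOTECHAR'='\"', "
        f"'BAD_RECORDS_ACTION'='FORCE')")

repro_steps = [
    LOAD,  # first load, under the table's original sort scope
    f"alter table {TABLE} set tblproperties("
    f"'sort_columns'='CUST_ID','sort_scope'='global_sort')",
    LOAD,  # second load, now under global_sort
    f"alter table {TABLE} compact 'major'",  # this is the step that fails
]
```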
[jira] [Updated] (CARBONDATA-4243) Select filter query with to_date in filter fails for table with column_meta_cache configured also having SI
[ https://issues.apache.org/jira/browse/CARBONDATA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-4243: Description:
Create table with column_meta_cache, create secondary indexes and load data to table. Execute the Select filter query with to_date in filter.
CREATE TABLE uniqdata (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,36),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) stored as carbondata TBLPROPERTIES('COLUMN_META_CACHE'='CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ');
CREATE INDEX indextable2 ON TABLE uniqdata (DOB) AS 'carbondata';
CREATE INDEX indextable3 ON TABLE uniqdata (DOJ) AS 'carbondata';
LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata OPTIONS('DELIMITER'=',' , 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
*Issue: Select filter query with to_date in filter fails for table with column_meta_cache configured also having SI*
0: jdbc:hive2://10.21.19.14:23040/default> select max(to_date(DOB)),min(to_date(DOB)),count(to_date(DOB)) from uniqdata where to_date(DOB)='1975-06-11' or to_date(DOB)='1975-06-23';
Error: org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: makeCopy, tree:
!BroadCastSIFilterPushJoin [none#0|#0], [none#1|#1], Inner, BuildRight
:- *(6) ColumnarToRow
: +- Scan CarbonDatasourceHadoopRelation chetan.uniqdata[dob#847024|#847024] Batched: true, DirectScan: false, PushedFilters: [((cast(input[0] as date) = 1987) or (cast(in9))],
ReadSchema: [dob] +- *(8) HashAggregate(keys=[positionReference#847161|#847161], functions=[], output=[positionReference#847161|#847161]) +- ReusedExchange [positionReference#847161|#847161], Exchange hashpartitioning(positionReference#847161, 200), ENSURE_REQUIREMENTS, [id=#195473|#195473] at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(Sparation.scala:361) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:263) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:78) at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:62) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:43) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:263) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:258) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:272) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: 
org.apache.spark.sql.catalyst.errors.package$TreeNodeException: makeCopy, tree: !BroadCastSIFilterPushJoin [none#0|#0], [none#1|#1], Inner, BuildRight :- *(6) ColumnarToRow : +- Scan CarbonDatasourceHadoopRelation chetan.uniqdata[dob#847024|#847024] Batched: true, DirectScan: false, PushedFilters: [((cast(input[0] as date) = 1987) or (cast(in9))], ReadSchema: [dob] +- *(8) HashAggregate(keys=[positionReference#847161|#847161], functions=[], output=[positionReference#847161|#847161]) +- ReusedExchange [positionReference#847161|#847161], Exchange hashpartitioning(positionReference#847161, 200), ENSURE_REQUIREMENTS, [id=#195473|#195473] at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:56) at org.apache.spark.sql.catalyst.trees.TreeNode.makeCopy(TreeNode.scala:468) at org.apache.spark.sql.catalyst.trees.TreeNode.makeCopy(TreeNode.scala:457) at
[jira] [Created] (CARBONDATA-4243) Select filter query with to_date in filter fails for table with column_meta_cache configured also having SI
Chetan Bhat created CARBONDATA-4243:
---
Summary: Select filter query with to_date in filter fails for table with column_meta_cache configured also having SI
Key: CARBONDATA-4243
URL: https://issues.apache.org/jira/browse/CARBONDATA-4243
Project: CarbonData
Issue Type: Bug
Components: sql
Affects Versions: 2.2.0
Environment: Spark 3.1.1
Reporter: Chetan Bhat

Create table with column_meta_cache, create secondary indexes and load data to table. Execute the Select filter query with to_date in filter.
CREATE TABLE uniqdata (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,36),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) stored as carbondata TBLPROPERTIES('COLUMN_META_CACHE'='CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ');
CREATE INDEX indextable2 ON TABLE uniqdata (DOB) AS 'carbondata';
CREATE INDEX indextable3 ON TABLE uniqdata (DOJ) AS 'carbondata';
LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata OPTIONS('DELIMITER'=',' , 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
Issue: Select filter query with to_date in filter fails for table with column_meta_cache configured also having SI
0: jdbc:hive2://10.21.19.14:23040/default> select max(to_date(DOB)),min(to_date(DOB)),count(to_date(DOB)) from uniqdata where to_date(DOB)='1975-06-11' or to_date(DOB)='1975-06-23';
Error: org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: makeCopy, tree:
!BroadCastSIFilterPushJoin [none#0], [none#1], Inner, BuildRight
:- *(6) ColumnarToRow
: +- Scan CarbonDatasourceHadoopRelation chetan.uniqdata[dob#847024] Batched: true, DirectScan: false, PushedFilters: [((cast(input[0] as date) = 1987) or (cast(in9))], ReadSchema: [dob]
+- *(8) HashAggregate(keys=[positionReference#847161], functions=[], output=[positionReference#847161])
+- ReusedExchange [positionReference#847161], Exchange hashpartitioning(positionReference#847161, 200), ENSURE_REQUIREMENTS, [id=#195473]
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(Sparation.scala:361)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:263)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:78)
at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:62)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:43)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:263)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:258)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:272)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: makeCopy, tree:
!BroadCastSIFilterPushJoin [none#0], [none#1], Inner, BuildRight
:- *(6) ColumnarToRow
: +- Scan CarbonDatasourceHadoopRelation chetan.uniqdata[dob#847024] Batched: true, DirectScan: false, PushedFilters: [((cast(input[0] as date) = 1987) or (cast(in9))], ReadSchema: [dob]
+- *(8) HashAggregate(keys=[positionReference#847161], functions=[], output=[positionReference#847161])
+- ReusedExchange [positionReference#847161], Exchange hashpartitioning(positionReference#847161, 200), ENSURE_REQUIREMENTS, [id=#195473]
at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:56)
[jira] [Updated] (CARBONDATA-4241) if the sort scope is changed to global sort and data loaded, major compaction fails
[ https://issues.apache.org/jira/browse/CARBONDATA-4241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-4241: Description: *Scenario 1 : create table with table_page_size_inmb'='1', load data ,* *set sortscope as global sort , load data and do major compaction.*** 0: jdbc:hive2://10.21.19.14:23040/default> CREATE TABLE uniqdata_pagesize (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,36),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED as carbondata TBLPROPERTIES('table_page_size_inmb'='1'); +-+ | Result | +-+ +-+ No rows selected (0.229 seconds) 0: jdbc:hive2://10.21.19.14:23040/default> LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata_pagesize OPTIONS('DELIMITER'=',' , 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1'); +-+ | Segment ID | +-+ | 0 | +-+ 1 row selected (1.016 seconds) 0: jdbc:hive2://10.21.19.14:23040/default> alter table uniqdata_pagesize set tblproperties('sort_columns'='CUST_ID','sort_scope'='global_sort'); +-+ | Result | +-+ +-+ No rows selected (0.446 seconds) 0: jdbc:hive2://10.21.19.14:23040/default> LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata_pagesize OPTIONS('DELIMITER'=',' , 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1'); +-+ | Segment ID | +-+ | 1 | +-+ 1 row selected (0.767 seconds) 0: jdbc:hive2://10.21.19.14:23040/default> alter table uniqdata_pagesize compact 'major'; Error: org.apache.hive.service.cli.HiveSQLException: Error 
running query: org.apache.spark.sql.AnalysisException: Compaction failed. Please check logs for more info. Exception in compaction Compaction Failure in Merger Rdd. at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:361) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:263) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:78) at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:62) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:43) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:263) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:258) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:272) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.spark.sql.AnalysisException: Compaction failed. Please check logs for more info. 
Exception in compaction Compaction Failure in Merger Rdd. at org.apache.spark.sql.util.CarbonException$.analysisException(CarbonException.scala:23) at org.apache.spark.sql.execution.command.management.CarbonAlterTableCompactionCommand.$anonfun$processData$3(CarbonAlterTableCompactionCommand.scala:197) at org.apache.carbondata.events.package$.withEvents(package.scala:27) at org.apache.spark.sql.execution.command.management.CarbonAlterTableCompactionCommand.processData(CarbonAlterTableCompactionCommand.scala:185) at
[jira] [Updated] (CARBONDATA-4241) if the sort scope is changed to global sort and data loaded, major compaction fails
[ https://issues.apache.org/jira/browse/CARBONDATA-4241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-4241: Description: *create table and alter table set local dictionary.* *set sortscope as global sort , load and do major compaction.* CREATE TABLE uniqdata_sortcol_bloom_locdic (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED as carbondata tblproperties('sort_columns'='cust_id,cust_name,dob,doj,bigint_column1'); LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata_sortcol_bloom_locdic OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1'); alter table uniqdata_sortcol_bloom_locdic set tblproperties('local_dictionary_enable'='true','local_dictionary_threshold'='1000'); alter table uniqdata_sortcol_bloom_locdic set tblproperties('sort_columns'='CUST_ID','sort_scope'='global_sort'); LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata_sortcol_bloom_locdic partition(active_emui_version='xyz') OPTIONS('FILEHEADER'='CUST_ID,CUST_NAME ,ACTIVE_EMUI_VERSION,DOB,DOJ, BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1, Double_COLUMN2,INTEGER_COLUMN1','BAD_RECORDS_ACTION'='FORCE'); *Compaction fails* *0: jdbc:hive2://10.21.19.14:23040/default> alter table uniqdata_sortcol_bloom_locdic compact 'major';* Error: org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.spark.sql.AnalysisException: Compaction failed. Please check logs for more info. Exception in compaction Compaction Failure in Merger Rdd. 
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:361) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:263) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:78) at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:62) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:43) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:263) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:258) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:272) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.spark.sql.AnalysisException: Compaction failed. Please check logs for more info. Exception in compaction Compaction Failure in Merger Rdd. 
at org.apache.spark.sql.util.CarbonException$.analysisException(CarbonException.scala:23) at org.apache.spark.sql.execution.command.management.CarbonAlterTableCompactionCommand.$anonfun$processData$3(CarbonAlterTableCompactionCommand.scala:197) at org.apache.carbondata.events.package$.withEvents(package.scala:27) at org.apache.spark.sql.execution.command.management.CarbonAlterTableCompactionCommand.processData(CarbonAlterTableCompactionCommand.scala:185) at org.apache.spark.sql.execution.command.AtomicRunnableCommand.$anonfun$run$3(package.scala:162) at org.apache.spark.sql.execution.command.Auditable.runWithAudit(package.scala:118) at org.apache.spark.sql.execution.command.Auditable.runWithAudit$(package.scala:114) at org.apache.spark.sql.execution.command.AtomicRunnableCommand.runWithAudit(package.scala:155) at org.apache.spark.sql.execution.command.AtomicRunnableCommand.run(package.scala:168) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70) at
[jira] [Updated] (CARBONDATA-4241) if the sort scope is changed to global sort and data loaded, major compaction fails
[ https://issues.apache.org/jira/browse/CARBONDATA-4241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-4241: Summary: if the sort scope is changed to global sort and data loaded, major compaction fails (was: in 1.6.1 version table with local dictionary is created and in 2.2.0 if the sort scope is changed to global sort and data loaded, major compaction fails) > if the sort scope is changed to global sort and data loaded, major compaction > fails > --- > > Key: CARBONDATA-4241 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4241 > Project: CarbonData > Issue Type: Bug > Components: data-load >Affects Versions: 2.2.0 > Environment: Spark 2.3.2 Carbon 1.6.1 , Spark 3.1.1 Carbon 2.2.0 >Reporter: Chetan Bhat >Priority: Major > > *In 1.6.1 version create table and alter table set local dictionary.* > CREATE TABLE uniqdata_sortcol_bloom_locdic (CUST_ID int,CUST_NAME > String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, > BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), > DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 > double,INTEGER_COLUMN1 int) STORED as carbondata > tblproperties('sort_columns'='cust_id,cust_name,dob,doj,bigint_column1'); > LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table > uniqdata_sortcol_bloom_locdic OPTIONS('DELIMITER'=',', > 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1'); > alter table uniqdata_sortcol_bloom_locdic set > tblproperties('local_dictionary_enable'='true','local_dictionary_threshold'='1000'); > *In 2.2.0 set sortscope as global sort , load and do major compaction.* > alter table uniqdata_sortcol_bloom_locdic set > tblproperties('sort_columns'='CUST_ID','sort_scope'='global_sort'); > LOAD DATA INPATH 
'hdfs://hacluster/chetan/2000_UniqData.csv' into table > uniqdata_sortcol_bloom_locdic partition(active_emui_version='xyz') > OPTIONS('FILEHEADER'='CUST_ID,CUST_NAME ,ACTIVE_EMUI_VERSION,DOB,DOJ, > BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1, > Double_COLUMN2,INTEGER_COLUMN1','BAD_RECORDS_ACTION'='FORCE'); > *0: jdbc:hive2://10.21.19.14:23040/default> alter table > uniqdata_sortcol_bloom_locdic compact 'major';* > Error: org.apache.hive.service.cli.HiveSQLException: Error running query: > org.apache.spark.sql.AnalysisException: Compaction failed. Please check logs > for more info. Exception in compaction Compaction Failure in Merger Rdd. > at > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:361) > at > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:263) > at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) > at > org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:78) > at > org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:62) > at > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:43) > at > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:263) > at > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:258) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746) > at > 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:272) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: org.apache.spark.sql.AnalysisException: Compaction failed. Please > check logs for more info. Exception in compaction Compaction Failure in > Merger Rdd. > at > org.apache.spark.sql.util.CarbonException$.analysisException(CarbonException.scala:23) > at >
[jira] [Updated] (CARBONDATA-4241) in 1.6.1 version table with local dictionary is created and in 2.2.0 if the sort scope is changed to global sort and data loaded, major compaction fails
[ https://issues.apache.org/jira/browse/CARBONDATA-4241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-4241: Description: *In 1.6.1 version create table and alter table set local dictionary.* CREATE TABLE uniqdata_sortcol_bloom_locdic (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED as carbondata tblproperties('sort_columns'='cust_id,cust_name,dob,doj,bigint_column1'); LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata_sortcol_bloom_locdic OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1'); alter table uniqdata_sortcol_bloom_locdic set tblproperties('local_dictionary_enable'='true','local_dictionary_threshold'='1000'); *In 2.2.0 set sortscope as global sort , load and do major compaction.* alter table uniqdata_sortcol_bloom_locdic set tblproperties('sort_columns'='CUST_ID','sort_scope'='global_sort'); LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata_sortcol_bloom_locdic partition(active_emui_version='xyz') OPTIONS('FILEHEADER'='CUST_ID,CUST_NAME ,ACTIVE_EMUI_VERSION,DOB,DOJ, BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1, Double_COLUMN2,INTEGER_COLUMN1','BAD_RECORDS_ACTION'='FORCE'); *0: jdbc:hive2://10.21.19.14:23040/default> alter table uniqdata_sortcol_bloom_locdic compact 'major';* Error: org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.spark.sql.AnalysisException: Compaction failed. Please check logs for more info. Exception in compaction Compaction Failure in Merger Rdd. 
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:361) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:263) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:78) at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:62) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:43) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:263) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:258) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:272) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.spark.sql.AnalysisException: Compaction failed. Please check logs for more info. Exception in compaction Compaction Failure in Merger Rdd. 
at org.apache.spark.sql.util.CarbonException$.analysisException(CarbonException.scala:23) at org.apache.spark.sql.execution.command.management.CarbonAlterTableCompactionCommand.$anonfun$processData$3(CarbonAlterTableCompactionCommand.scala:197) at org.apache.carbondata.events.package$.withEvents(package.scala:27) at org.apache.spark.sql.execution.command.management.CarbonAlterTableCompactionCommand.processData(CarbonAlterTableCompactionCommand.scala:185) at org.apache.spark.sql.execution.command.AtomicRunnableCommand.$anonfun$run$3(package.scala:162) at org.apache.spark.sql.execution.command.Auditable.runWithAudit(package.scala:118) at org.apache.spark.sql.execution.command.Auditable.runWithAudit$(package.scala:114) at org.apache.spark.sql.execution.command.AtomicRunnableCommand.runWithAudit(package.scala:155) at org.apache.spark.sql.execution.command.AtomicRunnableCommand.run(package.scala:168) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70) at
[jira] [Created] (CARBONDATA-4241) in 1.6.1 version table with local dictionary is created and in 2.2.0 if the sort scope is changed to global sort and data loaded, major compaction fails
Chetan Bhat created CARBONDATA-4241: --- Summary: in 1.6.1 version table with local dictionary is created and in 2.2.0 if the sort scope is changed to global sort and data loaded, major compaction fails Key: CARBONDATA-4241 URL: https://issues.apache.org/jira/browse/CARBONDATA-4241 Project: CarbonData Issue Type: Bug Components: data-load Affects Versions: 2.2.0 Environment: Spark 2.3.2 Carbon 1.6.1 , Spark 3.1.1 Carbon 2.2.0 Reporter: Chetan Bhat *In 1.6.1 version create table and alter table set local dictionary.* CREATE TABLE uniqdata_sortcol_bloom_locdic (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED as carbondata tblproperties('sort_columns'='cust_id,cust_name,dob,doj,bigint_column1'); LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata_sortcol_bloom_locdic OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1'); alter table uniqdata_sortcol_bloom_locdic set tblproperties('local_dictionary_enable'='true','local_dictionary_threshold'='1000'); *In 2.2.0 set sortscope as global sort , load and do major compaction.* alter table uniqdata_sortcol_bloom_locdic set tblproperties('sort_columns'='CUST_ID','sort_scope'='global_sort'); LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata_sortcol_bloom_locdic partition(active_emui_version='xyz') OPTIONS('FILEHEADER'='CUST_ID,CUST_NAME ,ACTIVE_EMUI_VERSION,DOB,DOJ, BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1, Double_COLUMN2,INTEGER_COLUMN1','BAD_RECORDS_ACTION'='FORCE'); *0: jdbc:hive2://10.21.19.14:23040/default> alter table 
uniqdata_sortcol_bloom_locdic compact 'major';* Error: org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.spark.sql.AnalysisException: Compaction failed. Please check logs for more info. Exception in compaction Compaction Failure in Merger Rdd. at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:361) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:263) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:78) at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:62) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:43) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:263) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:258) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:272) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: 
org.apache.spark.sql.AnalysisException: Compaction failed. Please check logs for more info. Exception in compaction Compaction Failure in Merger Rdd. at org.apache.spark.sql.util.CarbonException$.analysisException(CarbonException.scala:23) at org.apache.spark.sql.execution.command.management.CarbonAlterTableCompactionCommand.$anonfun$processData$3(CarbonAlterTableCompactionCommand.scala:197) at org.apache.carbondata.events.package$.withEvents(package.scala:27) at org.apache.spark.sql.execution.command.management.CarbonAlterTableCompactionCommand.processData(CarbonAlterTableCompactionCommand.scala:185) at org.apache.spark.sql.execution.command.AtomicRunnableCommand.$anonfun$run$3(package.scala:162) at org.apache.spark.sql.execution.command.Auditable.runWithAudit(package.scala:118) at
[jira] [Created] (CARBONDATA-4240) Properties present in https://github.com/apache/carbondata/blob/master/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java which are
Chetan Bhat created CARBONDATA-4240: --- Summary: Properties present in https://github.com/apache/carbondata/blob/master/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java which are not present in open source doc Key: CARBONDATA-4240 URL: https://issues.apache.org/jira/browse/CARBONDATA-4240 Project: CarbonData Issue Type: Bug Components: docs Affects Versions: 2.2.0 Environment: Open source docs Reporter: Chetan Bhat Properties present in https://github.com/apache/carbondata/blob/master/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java which are not present in open source doc as mentioned below. These properties need to be updated in open source doc. carbon.storelocation carbon.blocklet.size carbon.properties.filepath carbon.date.format carbon.complex.delimiter.level.1 carbon.complex.delimiter.level.2 carbon.complex.delimiter.level.3 carbon.complex.delimiter.level.4 carbon.lock.class carbon.local.dictionary.enable carbon.local.dictionary.decoder.fallback spark.deploy.zookeeper.url carbon.data.file.version spark.carbon.hive.schema.store spark.carbon.datamanagement.driver spark.carbon.sessionstate.classname spark.carbon.sqlastbuilder.classname carbon.lease.recovery.retry.count carbon.lease.recovery.retry.interval carbon.index.schema.storage carbon.merge.index.in.segment carbon.number.of.cores.while.altPartition carbon.minor.compaction.size enable.unsafe.columnpage carbon.lucene.compression.mode sort.inmemory.size.inmb is.driver.instance carbon.input.metrics.update.interval carbon.use.bitset.pipe.line is.internal.load.call carbon.lucene.index.stop.words carbon.load.dateformat.setlenient.enable carbon.infilter.subquery.pushdown.enable broadcast.record.size carbon.indexserver.tempfolder.deletetime -- This message was sent by Atlassian Jira (v8.3.4#803005)
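For context on the listed keys: carbon.* system properties of this kind are normally read from the carbon.properties file on the driver and executor nodes (session-level ones can be set via SQL SET). A minimal sketch of what documenting-by-example might look like; the values below are illustrative assumptions, not verified defaults:

```properties
# carbon.properties -- illustrative values only (assumptions, not verified defaults)
carbon.date.format=yyyy-MM-dd
carbon.merge.index.in.segment=true
carbon.local.dictionary.enable=true
# size threshold (MB) for minor compaction; value here is hypothetical
carbon.minor.compaction.size=512
```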
[jira] [Created] (CARBONDATA-4236) Documentation correctness and link issues in https://github.com/apache/carbondata/blob/master/docs/
Chetan Bhat created CARBONDATA-4236: --- Summary: Documentation correctness and link issues in https://github.com/apache/carbondata/blob/master/docs/ Key: CARBONDATA-4236 URL: https://issues.apache.org/jira/browse/CARBONDATA-4236 Project: CarbonData Issue Type: Bug Components: docs Affects Versions: 2.2.0 Environment: docs with content and examples verified on Spark 2.4.5 and Spark 3.1.1 compatible carbon. Reporter: Chetan Bhat In the documentation link https://github.com/apache/carbondata/blob/master/docs/ Issue 1 :- In link -> https://github.com/apache/carbondata/blob/master/docs/configuration-parameters.md the "See detail" links do not open the target "http://spark.apache.org/docs/latest/rdd-programming-guide.html#rdd-persistence" In link --> https://github.com/apache/carbondata/blob/master/docs/documentation.md the link "Apache CarbonData wiki" when clicked tries to open link "https://cwiki.apache.org/confluence/display/CARBONDATA/CarbonData+Home" but the target page can't be opened. Similarly the other links in the "External Resources" section can't be opened due to the same error. In link https://github.com/apache/carbondata/blob/master/docs/faq.md#what-are-bad-records the link "https://thrift.apache.org/docs/install" when clicked does not open the target page. In link https://github.com/apache/carbondata/blob/master/docs/quick-start-guide.md when the "Spark website" link is clicked the https://spark.apache.org/downloads.html page is not opened. Also in the same page when the "Apache Spark Documentation" link is clicked the "http://spark.apache.org/docs/latest/" page is not opened. In the link https://github.com/apache/carbondata/blob/master/docs/release-guide.md the "Product Release Policy", "release signing guidelines", "Apache Nexus repository" and "repository.apache.org" links when clicked do not open the target pages. 
Issue 2:- In link --> https://github.com/apache/carbondata/blob/master/docs/configuration-parameters.md the "To configure Ranges-based Compaction" to be changed to "To configure Range-based Compaction" Issue 3:- In link --> https://github.com/apache/carbondata/blob/master/docs/configuration-parameters.md the "Making this true degrade the LOAD performance" to be changed to "Making this true degrades the LOAD performance" Issue 4 :- In link --> https://github.com/apache/carbondata/blob/master/docs/configuration-parameters.md the "user an either set to true" to be changed to "user can either set to true" -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CARBONDATA-4235) after alter add column when user does rename operation, the select operation on struct type gives null value and children of struct gives error
[ https://issues.apache.org/jira/browse/CARBONDATA-4235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-4235: Description: *Queries –* drop table if exists test_rename; CREATE TABLE test_rename (str1 struct, str2 struct>, str3 struct>> comment 'struct', intfield int,arr1 array, arr2 array>, arr3 array, arr4 array> comment 'array') STORED AS carbondata; insert into test_rename values (named_struct('a', 2), named_struct('a', named_struct('b', 2)), named_struct('a', named_struct('b',named_struct('c', 2))), 1,array(1,2,3), array(array(1,2),array(3,4)), array('hello','world'), array(named_struct('a',45))); ALTER TABLE test_rename ADD COLUMNS(arr_1 ARRAY); alter table test_rename change str2 str22 struct>; select str22 from test_rename; select str22.a from test_rename; select str22.a.b from test_rename; Issue : after alter add column when user does rename operation, the select operation on struct type gives null value and children of struct gives error *Issue 1 : Exception trace on executing query –* 0: jdbc:hive2://vm2:22550/> select str22.a.b from test_rename; INFO : Execution ID: 2465 Error: org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1100.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1100.0 (TID 10353) (vm1 executor 5): java.nio.BufferUnderflowException at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:155) at org.apache.carbondata.core.scan.complextypes.PrimitiveQueryType.getDataObject(PrimitiveQueryType.java:166) at org.apache.carbondata.core.scan.complextypes.PrimitiveQueryType.getDataObject(PrimitiveQueryType.java:147) at org.apache.carbondata.core.scan.complextypes.PrimitiveQueryType.getDataBasedOnColumn(PrimitiveQueryType.java:141) at org.apache.carbondata.core.scan.complextypes.StructQueryType.getDataBasedOnColumn(StructQueryType.java:160) at 
org.apache.carbondata.core.scan.complextypes.StructQueryType.getDataBasedOnColumn(StructQueryType.java:160) at org.apache.carbondata.core.scan.collector.impl.DictionaryBasedResultCollector.fillRow(DictionaryBasedResultCollector.java:316) at org.apache.carbondata.core.scan.collector.impl.DictionaryBasedResultCollector.fillDimensionData(DictionaryBasedResultCollector.java:288) at org.apache.carbondata.core.scan.collector.impl.DictionaryBasedResultCollector.collectResultInRow(DictionaryBasedResultCollector.java:159) at org.apache.carbondata.core.scan.processor.DataBlockIterator.next(DataBlockIterator.java:110) at org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.getBatchResult(DetailQueryResultIterator.java:58) at org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:50) at org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:32) at org.apache.carbondata.core.scan.result.iterator.ChunkRowIterator.hasNext(ChunkRowIterator.java:56) at org.apache.carbondata.hadoop.CarbonRecordReader.nextKeyValue(CarbonRecordReader.java:127) at org.apache.carbondata.spark.rdd.CarbonScanRDD$$anon$1.hasNext(CarbonScanRDD.scala:557) at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458) at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458) at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source) at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:755) at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:345) at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:897) at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:897) at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373) at org.apache.spark.rdd.RDD.iterator(RDD.scala:337) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) at org.apache.spark.scheduler.Task.run(Task.scala:131) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:499) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1554) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:502) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Driver stacktrace: at
[jira] [Created] (CARBONDATA-4235) after alter add column, when user does rename operation, the select operation on struct type gives null value and children of struct give error
Chetan Bhat created CARBONDATA-4235: --- Summary: after alter add column, when user does rename operation, the select operation on struct type gives null value and children of struct give error Key: CARBONDATA-4235 URL: https://issues.apache.org/jira/browse/CARBONDATA-4235 Project: CarbonData Issue Type: Bug Components: sql Affects Versions: 2.2.0 Environment: Spark 3.1.1, Spark 2.4.5 Reporter: Chetan Bhat *Queries –*
drop table if exists test_rename;
CREATE TABLE test_rename (str1 struct<a:int>, str2 struct<a:struct<b:int>>, str3 struct<a:struct<b:struct<c:int>>> comment 'struct', intfield int, arr1 array<int>, arr2 array<array<int>>, arr3 array<string>, arr4 array<struct<a:int>> comment 'array') STORED AS carbondata;
insert into test_rename values (named_struct('a', 2), named_struct('a', named_struct('b', 2)), named_struct('a', named_struct('b',named_struct('c', 2))), 1, array(1,2,3), array(array(1,2),array(3,4)), array('hello','world'), array(named_struct('a',45)));
ALTER TABLE test_rename ADD COLUMNS(arr_1 ARRAY);
alter table test_rename change str2 str22 struct<a:struct<b:int>>;
select str22 from test_rename;
select str22.a from test_rename;
select str22.a.b from test_rename;
Issue: after an alter add column, when the user performs a rename operation, the select on the struct type returns a null value and selecting children of the struct gives an error.
*Exception trace on executing query –* 0: jdbc:hive2://vm2:22550/> select str22.a.b from test_rename; INFO : Execution ID: 2465 Error: org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1100.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1100.0 (TID 10353) (vm1 executor 5): java.nio.BufferUnderflowException at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:155) at org.apache.carbondata.core.scan.complextypes.PrimitiveQueryType.getDataObject(PrimitiveQueryType.java:166) at org.apache.carbondata.core.scan.complextypes.PrimitiveQueryType.getDataObject(PrimitiveQueryType.java:147) at
org.apache.carbondata.core.scan.complextypes.PrimitiveQueryType.getDataBasedOnColumn(PrimitiveQueryType.java:141) at org.apache.carbondata.core.scan.complextypes.StructQueryType.getDataBasedOnColumn(StructQueryType.java:160) at org.apache.carbondata.core.scan.complextypes.StructQueryType.getDataBasedOnColumn(StructQueryType.java:160) at org.apache.carbondata.core.scan.collector.impl.DictionaryBasedResultCollector.fillRow(DictionaryBasedResultCollector.java:316) at org.apache.carbondata.core.scan.collector.impl.DictionaryBasedResultCollector.fillDimensionData(DictionaryBasedResultCollector.java:288) at org.apache.carbondata.core.scan.collector.impl.DictionaryBasedResultCollector.collectResultInRow(DictionaryBasedResultCollector.java:159) at org.apache.carbondata.core.scan.processor.DataBlockIterator.next(DataBlockIterator.java:110) at org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.getBatchResult(DetailQueryResultIterator.java:58) at org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:50) at org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:32) at org.apache.carbondata.core.scan.result.iterator.ChunkRowIterator.hasNext(ChunkRowIterator.java:56) at org.apache.carbondata.hadoop.CarbonRecordReader.nextKeyValue(CarbonRecordReader.java:127) at org.apache.carbondata.spark.rdd.CarbonScanRDD$$anon$1.hasNext(CarbonScanRDD.scala:557) at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458) at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458) at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source) at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43) at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:755) at 
org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:345) at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:897) at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:897) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373) at org.apache.spark.rdd.RDD.iterator(RDD.scala:337) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) at org.apache.spark.scheduler.Task.run(Task.scala:131) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:499) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1554) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:502) at
[jira] [Created] (CARBONDATA-4209) import processing class in https://carbondata.apache.org/streaming-guide.html is wrong
Chetan Bhat created CARBONDATA-4209: --- Summary: import processing class in https://carbondata.apache.org/streaming-guide.html is wrong Key: CARBONDATA-4209 URL: https://issues.apache.org/jira/browse/CARBONDATA-4209 Project: CarbonData Issue Type: Bug Components: docs Affects Versions: 2.1.1 Environment: Spark 3.1.1 Reporter: Chetan Bhat In the open source doc link [https://carbondata.apache.org/streaming-guide.html] import org.apache.spark.sql.streaming.{ProcessingTime, StreamingQuery} --> this import statement can be replaced by --> import org.apache.spark.sql.streaming.Trigger.ProcessingTime as the earlier import does not work scala> import org.apache.spark.sql.streaming.ProcessingTime :23: error: object ProcessingTime is not a member of package org.apache.spark.sql.streaming import org.apache.spark.sql.streaming.ProcessingTime ^ scala> import org.apache.spark.sql.streaming.Trigger.ProcessingTime import org.apache.spark.sql.streaming.Trigger.ProcessingTime -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CARBONDATA-4200) query with AND filter and query on partition table not hitting SI from presto cli
Chetan Bhat created CARBONDATA-4200: --- Summary: query with AND filter and query on partition table not hitting SI from presto cli Key: CARBONDATA-4200 URL: https://issues.apache.org/jira/browse/CARBONDATA-4200 Project: CarbonData Issue Type: Bug Components: presto-integration Affects Versions: 2.1.1 Environment: Spark 2.4.5 , Presto 316/333 Reporter: Chetan Bhat [Steps] :- Issue 1 : - query with AND filter not hitting SI from presto cli. Queries executed from spark sql/beeline - drop table if exists uniqdata; CREATE TABLE uniqdata (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,36),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) stored as carbondata ; create index uniqdata_index on table uniqdata(cust_name) as 'carbondata' properties ('sort_scope'='global_sort', 'Global_sort_partitions'='3'); create index uniqdata_index1 on table uniqdata(ACTIVE_EMUI_VERSION) as 'carbondata' properties ('sort_scope'='global_sort', 'Global_sort_partitions'='3'); LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata OPTIONS('DELIMITER'=',' , 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1'); Queries executed from presto cli - select * from uniqdata where cust_name='CUST_NAME_2' AND ACTIVE_EMUI_VERSION='ACTIVE_EMUI_VERSION_2'; show metacache on table uniqdata; Issue 2 - query on partition table does not hit SI Queries executed from spark sql/beeline - drop table carbon_test; create table carbon_test(id int,name string)PARTITIONED BY(record_date int) stored as carbondata TBLPROPERTIES('SORT_COLUMNS'='id','SORT_SCOPE'='NO_SORT'); create index uniq1 on table carbon_test (name) as 'carbondata'; insert into table carbon_test 
partition(record_date) select 1,'kim',unix_timestamp('2018-02-05','yyyy-MM-dd') as record_date ; Queries executed from presto cli - select * from carbon_test where name='kim'; show metacache on table carbon_test; -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CARBONDATA-4194) read from presto session throws error after delete operation from complex table from spark session
Chetan Bhat created CARBONDATA-4194: --- Summary: read from presto session throws error after delete operation from complex table from spark session Key: CARBONDATA-4194 URL: https://issues.apache.org/jira/browse/CARBONDATA-4194 Project: CarbonData Issue Type: Bug Components: presto-integration Affects Versions: 2.1.1 Environment: Spark 2.4.5, Presto SQL 316 Reporter: Chetan Bhat *Queries executed -* From Spark session create table with complex types, load data to table and delete data from table create table Struct_com19_PR4031_009 (CUST_ID string, YEAR int, MONTH int, AGE int, GENDER string, EDUCATED string, IS_MARRIED string, STRUCT_INT_DOUBLE_STRING_DATE struct,CARD_COUNT int,DEBIT_COUNT int, CREDIT_COUNT int, DEPOSIT double, HQ_DEPOSIT decimal(20,3)) stored as carbondata; LOAD DATA INPATH 'hdfs://hacluster/chetan/Struct.csv' INTO table Struct_com19_PR4031_009 options ('DELIMITER'=',', 'QUOTECHAR'='"', 'FILEHEADER'='CUST_ID,YEAR,MONTH,AGE,GENDER,EDUCATED,IS_MARRIED,STRUCT_INT_DOUBLE_STRING_DATE,CARD_COUNT,DEBIT_COUNT,CREDIT_COUNT,DEPOSIT,HQ_DEPOSIT','COMPLEX_DELIMITER_LEVEL_1'='$'); delete from Struct_com19_PR4031_009 where EDUCATED='MS'; From Presto CLI execute the select queries.
select * from Struct_com19_PR4031_009 limit 1; select count(*) from Struct_com19_PR4031_009; *Issue : -* read from presto session throws error after delete operation from complex table from spark session presto:ranjan> select * from Struct_com19_PR4031_009 limit 1; Query 20210528_075917_1_swzys, FAILED, 1 node Splits: 18 total, 0 done (0.00%) 0:00 [0 rows, 0B] [0 rows/s, 0B/s] Query 20210528_075917_1_swzys failed: Error in Reading Data from Carbondata *Log -* org.apache.carbondata.processing.loading.exception.CarbonDataLoadingException: Error in Reading Data from Carbondata at org.apache.carbondata.presto.CarbondataPageSource$CarbondataBlockLoader.load(CarbondataPageSource.java:491) at org.apache.carbondata.presto.CarbondataPageSource$CarbondataBlockLoader.load(CarbondataPageSource.java:467) at io.prestosql.spi.block.LazyBlock.assureLoaded(LazyBlock.java:276) at io.prestosql.spi.block.LazyBlock.getLoadedBlock(LazyBlock.java:267) at io.prestosql.spi.Page.getLoadedPage(Page.java:261) at io.prestosql.operator.TableScanOperator.getOutput(TableScanOperator.java:283) at io.prestosql.operator.Driver.processInternal(Driver.java:379) at io.prestosql.operator.Driver.lambda$processFor$8(Driver.java:283) at io.prestosql.operator.Driver.tryWithLock(Driver.java:675) at io.prestosql.operator.Driver.processFor(Driver.java:276) at io.prestosql.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1075) at io.prestosql.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163) at io.prestosql.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:484) at io.prestosql.$gen.Presto_31620210526_073226_1.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassCastException at 
org.apache.carbondata.core.datastore.chunk.impl.DimensionRawColumnChunk.convertToDimColDataChunkAndFillVector(DimensionRawColumnChunk.java:140) at org.apache.carbondata.core.scan.scanner.LazyPageLoader.loadPage(LazyPageLoader.java:75) at org.apache.carbondata.core.scan.result.vector.impl.CarbonColumnVectorImpl.loadPage(CarbonColumnVectorImpl.java:531) at org.apache.carbondata.presto.CarbondataPageSource$CarbondataBlockLoader.load(CarbondataPageSource.java:483) ... 16 more Caused by: java.lang.RuntimeException: java.lang.ClassCastException at org.apache.carbondata.core.datastore.chunk.impl.DimensionRawColumnChunk.convertToDimColDataChunkAndFillVector(DimensionRawColumnChunk.java:140) at org.apache.carbondata.core.scan.scanner.LazyPageLoader.loadPage(LazyPageLoader.java:75) at org.apache.carbondata.core.scan.result.vector.impl.CarbonColumnVectorImpl.loadPage(CarbonColumnVectorImpl.java:531) at org.apache.carbondata.core.datastore.page.encoding.compress.DirectCompressCodec$3.decodeAndFillVector(DirectCompressCodec.java:277) at org.apache.carbondata.core.datastore.page.encoding.compress.DirectCompressCodec$2.decodeAndFillVector(DirectCompressCodec.java:158) at org.apache.carbondata.core.datastore.chunk.reader.dimension.v3.DimensionChunkReaderV3.decodeDimensionByMeta(DimensionChunkReaderV3.java:260) at org.apache.carbondata.core.datastore.chunk.reader.dimension.v3.DimensionChunkReaderV3.decodeDimension(DimensionChunkReaderV3.java:307) at
[jira] [Created] (CARBONDATA-4184) alter table Set TBLPROPERTIES for RANGE_COLUMN sets unsupported datatype(complex_datatypes/Binary/Boolean/Decimal) as RANGE_COLUMN
Chetan Bhat created CARBONDATA-4184: --- Summary: alter table Set TBLPROPERTIES for RANGE_COLUMN sets unsupported datatype(complex_datatypes/Binary/Boolean/Decimal) as RANGE_COLUMN Key: CARBONDATA-4184 URL: https://issues.apache.org/jira/browse/CARBONDATA-4184 Project: CarbonData Issue Type: Bug Components: data-query Affects Versions: 2.1.1 Environment: Spark 2.4.5 Reporter: Chetan Bhat [Steps] :- From Spark Beeline/SQL/Submit/Shell the queries are executed
DROP TABLE IF EXISTS alter_array; CREATE TABLE alter_array(intField INT, arr1 array) STORED AS carbondata; ALTER TABLE alter_array SET TBLPROPERTIES('RANGE_COLUMN'='arr1'); desc formatted alter_array;
DROP TABLE IF EXISTS alter_struct; create table alter_struct(roll int, struct1 struct) STORED AS carbondata; ALTER TABLE alter_struct SET TBLPROPERTIES('RANGE_COLUMN'='struct1'); desc formatted alter_struct;
DROP TABLE IF EXISTS alter_map; create table alter_map(roll int, map1 map) STORED AS carbondata; ALTER TABLE alter_map SET TBLPROPERTIES('RANGE_COLUMN'='map1'); desc formatted alter_map;
DROP TABLE IF EXISTS alter_boolean; create table alter_boolean(roll int, bool1 boolean) STORED AS carbondata; ALTER TABLE alter_boolean SET TBLPROPERTIES('RANGE_COLUMN'='bool1'); desc formatted alter_boolean;
DROP TABLE IF EXISTS alter_binary; create table alter_binary(roll int, bin1 binary) STORED AS carbondata; ALTER TABLE alter_binary SET TBLPROPERTIES('RANGE_COLUMN'='bin1'); desc formatted alter_binary;
DROP TABLE IF EXISTS alter_decimal; create table alter_decimal(roll int, dec1 decimal(10,5)) STORED AS carbondata; ALTER TABLE alter_decimal SET TBLPROPERTIES('RANGE_COLUMN'='dec1'); desc formatted alter_decimal;
[Actual Issue] :- alter table Set TBLPROPERTIES for RANGE_COLUMN sets an unsupported datatype (complex_datatypes/Binary/Boolean/Decimal) as RANGE_COLUMN. [Expected Result] :- Validation should be provided when alter table Set TBLPROPERTIES tries to set RANGE_COLUMN to an unsupported datatype (complex_datatypes/Binary/Boolean/Decimal). -- This message was sent by Atlassian Jira (v8.3.4#803005)
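The expected validation described above could look roughly like the following standalone sketch. This is illustrative only, not CarbonData's actual implementation; the function and set names are invented for the example, and only the property name RANGE_COLUMN and the unsupported datatypes come from the report.

```python
# Illustrative sketch of the requested validation (not CarbonData code):
# reject RANGE_COLUMN when the column's datatype is unsupported.
UNSUPPORTED_RANGE_COLUMN_TYPES = {"array", "struct", "map", "binary", "boolean", "decimal"}

def validate_range_column(column_name: str, datatype: str) -> None:
    """Raise ValueError if RANGE_COLUMN is being set on an unsupported datatype."""
    # Normalize e.g. "decimal(10,5)" -> "decimal" and "struct<a:int>" -> "struct".
    base_type = datatype.split("(")[0].split("<")[0].strip().lower()
    if base_type in UNSUPPORTED_RANGE_COLUMN_TYPES:
        raise ValueError(
            f"RANGE_COLUMN cannot be set on column '{column_name}' "
            f"of unsupported datatype '{datatype}'"
        )

validate_range_column("intField", "int")  # supported type: no error
try:
    validate_range_column("dec1", "decimal(10,5)")
except ValueError as e:
    print(e)  # unsupported type: rejected with a clear message
```

With such a check in the ALTER TABLE SET TBLPROPERTIES path, each of the six queries above would fail fast instead of silently accepting an invalid RANGE_COLUMN.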
[jira] [Updated] (CARBONDATA-4178) if we insert a different number of elements into an array complex datatype, then a query filter on a higher or last index gives an error from presto
[ https://issues.apache.org/jira/browse/CARBONDATA-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-4178: Priority: Minor (was: Major) > if we insert different no of data in array complex datatype. then query > filter on increment or last index gives error from presto > - > > Key: CARBONDATA-4178 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4178 > Project: CarbonData > Issue Type: Bug > Components: presto-integration >Affects Versions: 2.1.1 > Environment: Spark 2.4.5. Presto 316 >Reporter: Chetan Bhat >Priority: Minor > > Steps - > From Spark session the table creation and insert operations are executed as > shown below > > drop table if exists complextable; > create table complextable (id string, country array, name string) > stored as carbondata; > insert into complextable select 1, array('china', 'us'), 'b' union all select > 2, array ('pak', 'india', 'china'), 'v'; > From presto cli the below queries are executed. > select * from complextable ; > select * from complextable where country[3]='china'; > > Issue - if we insert different no of data in array complex datatype. 
then > query filter on increment or last index gives error from presto > presto:ranjan> select * from complextable ; > id | country | name > ++--- > 1 | [china, us] | b > 2 | [pak, india, china] | v > (2 rows) > Query 20210507_072934_2_d2cjp, FINISHED, 1 node > Splits: 18 total, 18 done (100.00%) > 0:07 [2 rows, 95B] [0 rows/s, 12B/s] > presto:ranjan> select * from complextable where country[1]='pak'; > id | country | name > ++--- > 2 | [pak, india, china] | v > (1 row) > Query 20210507_072948_3_d2cjp, FINISHED, 1 node > Splits: 18 total, 18 done (100.00%) > 0:06 [2 rows, 0B] [0 rows/s, 0B/s] > presto:ranjan> select * from complextable where country[3]='china'; > id | country | name > ++--- > 2 | [pak, india, china] | v > (1 row) > Query 20210507_073007_4_d2cjp, FAILED, 1 node > Splits: 18 total, 1 done (5.56%) > 0:05 [1 rows, 0B] [0 rows/s, 0B/s] > Query 20210507_073007_4_d2cjp failed: Array subscript out of bounds > > Expected - Error should not be thrown for the query executed in hetu cli and > it show correct resultset as in spark session. > 0: jdbc:hive2://10.20.254.208:23040/default> select * from complextable where > country[2]='china'; > +--+-++--- > |id|country|name| > +--+-++--- > |2|["pak","india","china"]|v| > +--+-++--- > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CARBONDATA-4178) if we insert a different number of elements into an array complex datatype, then a query filter on a higher or last index gives an error from presto
[ https://issues.apache.org/jira/browse/CARBONDATA-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-4178: Description: Steps - >From Spark session the table creation and insert operations are executed as >shown below drop table if exists complextable; create table complextable (id string, country array, name string) stored as carbondata; insert into complextable select 1, array('china', 'us'), 'b' union all select 2, array ('pak', 'india', 'china'), 'v'; >From presto cli the below queries are executed. select * from complextable ; select * from complextable where country[3]='china'; Issue - if we insert different no of data in array complex datatype. then query filter on increment or last index gives error from presto presto:ranjan> select * from complextable ; id | country | name ++--- 1 | [china, us] | b 2 | [pak, india, china] | v (2 rows) Query 20210507_072934_2_d2cjp, FINISHED, 1 node Splits: 18 total, 18 done (100.00%) 0:07 [2 rows, 95B] [0 rows/s, 12B/s] presto:ranjan> select * from complextable where country[1]='pak'; id | country | name ++--- 2 | [pak, india, china] | v (1 row) Query 20210507_072948_3_d2cjp, FINISHED, 1 node Splits: 18 total, 18 done (100.00%) 0:06 [2 rows, 0B] [0 rows/s, 0B/s] presto:ranjan> select * from complextable where country[3]='china'; id | country | name ++--- 2 | [pak, india, china] | v (1 row) Query 20210507_073007_4_d2cjp, FAILED, 1 node Splits: 18 total, 1 done (5.56%) 0:05 [1 rows, 0B] [0 rows/s, 0B/s] Query 20210507_073007_4_d2cjp failed: Array subscript out of bounds Expected - Error should not be thrown for the query executed in hetu cli and it show correct resultset as in spark session. 0: jdbc:hive2://10.20.254.208:23040/default> select * from complextable where country[2]='china'; +--+-++--- |id|country|name| +--+-++--- |2|["pak","india","china"]|v| +--+-++--- was: Steps - >From presto cli the below queries are executed. 
drop table if exists complextable; create table complextable (id string, country array, name string) stored as carbondata; insert into complextable select 1, array('china', 'us'), 'b' union all select 2, array ('pak', 'india', 'china'), 'v'; select * from complextable ; select * from complextable where country[3]='china'; Issue - if we insert different no of data in array complex datatype. then query filter on increment or last index gives error from presto presto:ranjan> select * from complextable ; id | country | name +-+-- 1 | [china, us] | b 2 | [pak, india, china] | v (2 rows) Query 20210507_072934_2_d2cjp, FINISHED, 1 node Splits: 18 total, 18 done (100.00%) 0:07 [2 rows, 95B] [0 rows/s, 12B/s] presto:ranjan> select * from complextable where country[1]='pak'; id | country | name +-+-- 2 | [pak, india, china] | v (1 row) Query 20210507_072948_3_d2cjp, FINISHED, 1 node Splits: 18 total, 18 done (100.00%) 0:06 [2 rows, 0B] [0 rows/s, 0B/s] presto:ranjan> select * from complextable where country[3]='china'; id | country | name +-+-- 2 | [pak, india, china] | v (1 row) Query 20210507_073007_4_d2cjp, FAILED, 1 node Splits: 18 total, 1 done (5.56%) 0:05 [1 rows, 0B] [0 rows/s, 0B/s] Query 20210507_073007_4_d2cjp failed: Array subscript out of bounds Expected - Error should not be thrown for the query executed in hetu cli and it show correct resultset as in spark session. 0: jdbc:hive2://10.20.254.208:23040/default> select * from complextable where country[2]='china'; +-+--+---+ | id | country | name | +-+--+---+ | 2 | ["pak","india","china"] | v | +-+--+---+ > if we insert different no of data in array complex datatype. then query > filter on increment or last index gives error from presto > - > > Key: CARBONDATA-4178 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4178 > Project: CarbonData > Issue Type: Bug > Components: presto-integration >Affects Versions: 2.1.1 > Environment: Spark 2.4.5. 
Presto 316 >Reporter: Chetan Bhat >Priority: Major > > Steps - > From Spark session the table creation and insert operations are executed as > shown below > > drop table if exists complextable; > create table complextable (id string, country array, name string) > stored as carbondata; > insert into complextable select 1,
[jira] [Created] (CARBONDATA-4178) if we insert a different number of elements into an array complex datatype, then a query filter on a higher or last index gives an error from presto
Chetan Bhat created CARBONDATA-4178: --- Summary: if we insert a different number of elements into an array complex datatype, then a query filter on a higher or last index gives an error from presto Key: CARBONDATA-4178 URL: https://issues.apache.org/jira/browse/CARBONDATA-4178 Project: CarbonData Issue Type: Bug Components: presto-integration Affects Versions: 2.1.1 Environment: Spark 2.4.5. Presto 316 Reporter: Chetan Bhat Steps - From presto cli the below queries are executed.
drop table if exists complextable;
create table complextable (id string, country array<string>, name string) stored as carbondata;
insert into complextable select 1, array('china', 'us'), 'b' union all select 2, array ('pak', 'india', 'china'), 'v';
select * from complextable ;
select * from complextable where country[3]='china';
Issue - if arrays of different lengths are inserted into an array complex datatype column, a query filtering on a higher or last index gives an error from presto.
presto:ranjan> select * from complextable ;
 id |       country       | name
----+---------------------+------
 1  | [china, us]         | b
 2  | [pak, india, china] | v
(2 rows)
Query 20210507_072934_2_d2cjp, FINISHED, 1 node
Splits: 18 total, 18 done (100.00%)
0:07 [2 rows, 95B] [0 rows/s, 12B/s]
presto:ranjan> select * from complextable where country[1]='pak';
 id |       country       | name
----+---------------------+------
 2  | [pak, india, china] | v
(1 row)
Query 20210507_072948_3_d2cjp, FINISHED, 1 node
Splits: 18 total, 18 done (100.00%)
0:06 [2 rows, 0B] [0 rows/s, 0B/s]
presto:ranjan> select * from complextable where country[3]='china';
 id |       country       | name
----+---------------------+------
 2  | [pak, india, china] | v
(1 row)
Query 20210507_073007_4_d2cjp, FAILED, 1 node
Splits: 18 total, 1 done (5.56%)
0:05 [1 rows, 0B] [0 rows/s, 0B/s]
Query 20210507_073007_4_d2cjp failed: Array subscript out of bounds
Expected - No error should be thrown for the query executed in the hetu cli, and it should show the correct resultset, as in the spark session.
0: jdbc:hive2://10.20.254.208:23040/default> select * from complextable where country[2]='china';
+-----+-------------------------+-------+
| id  | country                 | name  |
+-----+-------------------------+-------+
| 2   | ["pak","india","china"] | v     |
+-----+-------------------------+-------+
-- This message was sent by Atlassian Jira (v8.3.4#803005)
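The semantic gap the reporter describes can be illustrated with a small standalone sketch (illustrative only, not engine code): Spark SQL's 0-based array subscript returns NULL when the index is out of range, so the filter simply excludes short rows, while the Presto output above shows the same query failing with "Array subscript out of bounds" on its 1-based subscript. The function names below are invented for the example.

```python
# Illustrative only: contrasting out-of-bounds array-subscript semantics.

def spark_style_subscript(arr, index):
    """0-based access; returns None (SQL NULL) when the index is out of range,
    which is why Spark's filter just skips rows with shorter arrays."""
    return arr[index] if 0 <= index < len(arr) else None

def presto_style_subscript(arr, index):
    """1-based access that raises, mirroring 'Array subscript out of bounds'."""
    if not 1 <= index <= len(arr):
        raise IndexError("Array subscript out of bounds")
    return arr[index - 1]

rows = [("1", ["china", "us"]), ("2", ["pak", "india", "china"])]

# Spark-style filter country[2]='china': the 2-element row yields NULL and is skipped.
matches = [rid for rid, country in rows if spark_style_subscript(country, 2) == "china"]
print(matches)  # ['2']
```

Under these semantics the Spark query returns row 2, whereas a strict subscript aborts the whole query as soon as a short array is scanned, which matches the FAILED Presto query above.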
[jira] [Updated] (CARBONDATA-4159) Insert into select in presto session throws error after delete data and alter add table executed from spark session
[ https://issues.apache.org/jira/browse/CARBONDATA-4159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-4159: Description: Steps - create a table in spark session . load ,delete and alter add column in spark session CREATE TABLE lsc1(id int, name string, description string,address string, note string) stored as carbondata tblproperties('sort_columns'='id,name','long_string_columns'='description,note'); load data inpath 'hdfs://hacluster/chetan/longStringData_100rec.csv' into table lsc1 options('DELIMITER'=',', 'QUOTECHAR'= '"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='id,name,description,address,note'); delete from lsc1 where id=99; alter table lsc1 add columns(id2 int); Insert into select from presto session. insert into lsc1 select * from lsc1; Issue :- Insert into select in presto session throws error presto:ranjan> insert into lsc1 select * from lsc1; Query 20210401_135306_00013_fnva9, FAILED, 1 node Splits: 35 total, 0 done (0.00%) 0:01 [100 rows, 6.38MB] [90 rows/s, 5.75MB/s] *Query 20210401_135306_00013_fnva9 failed: Invalid position 50 and length 50 in block with 99 positions* presto:ranjan> *Log -* java.lang.IndexOutOfBoundsException: Invalid position 50 and length 50 in block with 99 positionsjava.lang.IndexOutOfBoundsException: Invalid position 50 and length 50 in block with 99 positions at io.prestosql.spi.block.BlockUtil.checkValidRegion(BlockUtil.java:48) at io.prestosql.spi.block.DictionaryBlock.getRegion(DictionaryBlock.java:325) at io.prestosql.spi.Page.getRegion(Page.java:128) at io.prestosql.execution.buffer.PageSplitterUtil.splitPage(PageSplitterUtil.java:53) at io.prestosql.execution.buffer.PageSplitterUtil.splitPage(PageSplitterUtil.java:29) at io.prestosql.operator.TaskOutputOperator.addInput(TaskOutputOperator.java:145) at io.prestosql.operator.Driver.processInternal(Driver.java:384) at io.prestosql.operator.Driver.lambda$processFor$8(Driver.java:283) at 
io.prestosql.operator.Driver.tryWithLock(Driver.java:675) at io.prestosql.operator.Driver.processFor(Driver.java:276) at io.prestosql.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1075) at io.prestosql.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163) at io.prestosql.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:484) at io.prestosql.$gen.Presto_31620210401_102926_1.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) was: Steps - creating a table in spark session . load ,delete and alter add column in spark session CREATE TABLE lsc1(id int, name string, description string,address string, note string) stored as carbondata tblproperties('sort_columns'='id,name','long_string_columns'='description,note'); load data inpath 'hdfs://hacluster/chetan/longStringData_100rec.csv' into table lsc1 options('DELIMITER'=',', 'QUOTECHAR'= '"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='id,name,description,address,note'); delete from lsc1 where id=99; alter table lsc1 add columns(id2 int); Insert into select from presto session. 
insert into lsc1 select * from lsc1; Issue :- Insert into select in presto session throws error presto:ranjan> insert into lsc1 select * from lsc1; Query 20210401_135306_00013_fnva9, FAILED, 1 node Splits: 35 total, 0 done (0.00%) 0:01 [100 rows, 6.38MB] [90 rows/s, 5.75MB/s] *Query 20210401_135306_00013_fnva9 failed: Invalid position 50 and length 50 in block with 99 positions* presto:ranjan> *Log -* java.lang.IndexOutOfBoundsException: Invalid position 50 and length 50 in block with 99 positionsjava.lang.IndexOutOfBoundsException: Invalid position 50 and length 50 in block with 99 positions at io.prestosql.spi.block.BlockUtil.checkValidRegion(BlockUtil.java:48) at io.prestosql.spi.block.DictionaryBlock.getRegion(DictionaryBlock.java:325) at io.prestosql.spi.Page.getRegion(Page.java:128) at io.prestosql.execution.buffer.PageSplitterUtil.splitPage(PageSplitterUtil.java:53) at io.prestosql.execution.buffer.PageSplitterUtil.splitPage(PageSplitterUtil.java:29) at io.prestosql.operator.TaskOutputOperator.addInput(TaskOutputOperator.java:145) at io.prestosql.operator.Driver.processInternal(Driver.java:384) at io.prestosql.operator.Driver.lambda$processFor$8(Driver.java:283) at io.prestosql.operator.Driver.tryWithLock(Driver.java:675) at io.prestosql.operator.Driver.processFor(Driver.java:276) at io.prestosql.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1075) at io.prestosql.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163) at
[jira] [Created] (CARBONDATA-4159) Insert into select in presto session throws error after delete data and alter add table executed from spark session
Chetan Bhat created CARBONDATA-4159: --- Summary: Insert into select in presto session throws error after delete data and alter add table executed from spark session Key: CARBONDATA-4159 URL: https://issues.apache.org/jira/browse/CARBONDATA-4159 Project: CarbonData Issue Type: Bug Components: presto-integration Affects Versions: 2.1.0 Environment: Spark 2.4.5. Presto-SQL 316 Reporter: Chetan Bhat Steps - creating a table in spark session . load ,delete and alter add column in spark session CREATE TABLE lsc1(id int, name string, description string,address string, note string) stored as carbondata tblproperties('sort_columns'='id,name','long_string_columns'='description,note'); load data inpath 'hdfs://hacluster/chetan/longStringData_100rec.csv' into table lsc1 options('DELIMITER'=',', 'QUOTECHAR'= '"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='id,name,description,address,note'); delete from lsc1 where id=99; alter table lsc1 add columns(id2 int); Insert into select from presto session. insert into lsc1 select * from lsc1; Issue :- Insert into select in presto session throws error presto:ranjan> insert into lsc1 select * from lsc1; Query 20210401_135306_00013_fnva9, FAILED, 1 node Splits: 35 total, 0 done (0.00%) 0:01 [100 rows, 6.38MB] [90 rows/s, 5.75MB/s] *Query 20210401_135306_00013_fnva9 failed: Invalid position 50 and length 50 in block with 99 positions* presto:ranjan> *Log -* java.lang.IndexOutOfBoundsException: Invalid position 50 and length 50 in block with 99 positionsjava.lang.IndexOutOfBoundsException: Invalid position 50 and length 50 in block with 99 positions at io.prestosql.spi.block.BlockUtil.checkValidRegion(BlockUtil.java:48) at io.prestosql.spi.block.DictionaryBlock.getRegion(DictionaryBlock.java:325) at io.prestosql.spi.Page.getRegion(Page.java:128) at io.prestosql.execution.buffer.PageSplitterUtil.splitPage(PageSplitterUtil.java:53) at io.prestosql.execution.buffer.PageSplitterUtil.splitPage(PageSplitterUtil.java:29) at 
io.prestosql.operator.TaskOutputOperator.addInput(TaskOutputOperator.java:145) at io.prestosql.operator.Driver.processInternal(Driver.java:384) at io.prestosql.operator.Driver.lambda$processFor$8(Driver.java:283) at io.prestosql.operator.Driver.tryWithLock(Driver.java:675) at io.prestosql.operator.Driver.processFor(Driver.java:276) at io.prestosql.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1075) at io.prestosql.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163) at io.prestosql.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:484) at io.prestosql.$gen.Presto_31620210401_102926_1.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) -- This message was sent by Atlassian Jira (v8.3.4#803005)
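The failure above reports position 50 and length 50 against a block with only 99 positions: after the delete, the block holds 99 rows, but the page splitter still requests a region sized as if all 100 loaded rows were present. A minimal Python sketch of the kind of bounds check Presto's BlockUtil.checkValidRegion performs (the message format mirrors the log; this is illustrative, not Presto's actual code):

```python
def check_valid_region(position_count: int, position: int, length: int) -> None:
    # Reject any region that does not fit inside the block, matching the
    # IndexOutOfBoundsException seen in the log (illustrative sketch).
    if position < 0 or length < 0 or position + length > position_count:
        raise IndexError(
            f"Invalid position {position} and length {length} "
            f"in block with {position_count} positions"
        )

check_valid_region(99, 50, 49)  # fits: positions 50..98 of a 99-row block
# check_valid_region(99, 50, 50) would raise, matching the reported error
```

With 99 positions, a region of (50, 49) is the largest that fits; the splitter asking for (50, 50) is what trips the check.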
[jira] [Created] (CARBONDATA-4135) Insert into a partition table should fail from the Presto side, but insert into select * passes on a single-column partition table from the Presto side
Chetan Bhat created CARBONDATA-4135: --- Summary: insert in partition table should fail from presto side but insert into select * in passing in partition table with single column partition table from presto side Key: CARBONDATA-4135 URL: https://issues.apache.org/jira/browse/CARBONDATA-4135 Project: CarbonData Issue Type: Bug Components: presto-integration Affects Versions: 2.1.0 Environment: Spark 2.4.5, Presto 316 Reporter: Chetan Bhat Presto 316 version used. *Steps :-* >From Spark beeline execute the queries - 0: jdbc:hive2://10.20.254.208:23040/default> drop table uniqdata_Partition_single; +-+ | Result | +-+ +-+ No rows selected (0.454 seconds) 0: jdbc:hive2://10.20.254.208:23040/default> CREATE TABLE uniqdata_Partition_single (CUST_ID int,CUST_NAME String, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,36),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) partitioned by (ACTIVE_EMUI_VERSION string)stored as carbondata tblproperties('COLUMN_META_CACHE'='CUST_ID,CUST_NAME,DECIMAL_COLUMN2,DOJ,Double_COLUMN2,BIGINT_COLUMN2','local_dictionary_enable'='true','local_dictionary_threshold'='1000','local_dictionary_include'='ACTIVE_EMUI_VERSION') ; +-+ | Result | +-+ +-+ No rows selected (0.202 seconds) 0: jdbc:hive2://10.20.254.208:23040/default> LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData_partition.csv' into table uniqdata_Partition_single OPTIONS('FILEHEADER'='CUST_ID,CUST_NAME ,ACTIVE_EMUI_VERSION,DOB,DOJ, BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1, Double_COLUMN2,INTEGER_COLUMN1','BAD_RECORDS_ACTION'='FORCE'); +-+ | Result | +-+ +-+ No rows selected (3.471 seconds) 0: jdbc:hive2://10.20.254.208:23040/default> >From prestocli the query is executed - presto:ranjan> insert into uniqdata_Partition_single select * from uniqdata_Partition_single; *Issue : - insert in partition table should fail from presto side but insert 
into select * in passing in partition table with single column partition table from presto side.* presto:ranjan> insert into uniqdata_Partition_single select * from uniqdata_Partition_single; INSERT: 2002 rows Query 20210223_044320_0_ggkxh, FINISHED, 1 node Splits: 45 total, 45 done (100.00%) 0:05 [2K rows, 206KB] [431 rows/s, 44.4KB/s] presto:ranjan> desc uniqdata_Partition_single; Column | Type | Extra | Comment -++---+- cust_id | integer | | cust_name | varchar | | dob | timestamp | | doj | timestamp | | bigint_column1 | bigint | | bigint_column2 | bigint | | decimal_column1 | decimal(30,10) | | decimal_column2 | decimal(36,36) | | double_column1 | double | | double_column2 | double | | integer_column1 | integer | | active_emui_version | varchar | partition key | (12 rows) Query 20210223_044344_1_ggkxh, FINISHED, 1 node Splits: 19 total, 19 done (100.00%) 0:00 [12 rows, 1.07KB] [50 rows/s, 4.53KB/s] presto:ranjan> -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CARBONDATA-4128) Merge SQL command fails with different case for column name and also when the command is entered in a different case
[ https://issues.apache.org/jira/browse/CARBONDATA-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-4128: Summary: Merge SQL command fails with different case for column name and also if the command is input with different case (was: Merge SQL command fails with different case for column name) > Merge SQL command fails with different case for column name and also if the > command is input with different case > > > Key: CARBONDATA-4128 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4128 > Project: CarbonData > Issue Type: Bug > Components: data-query >Affects Versions: 2.1.0 > Environment: Spark 2.4.5 >Reporter: Chetan Bhat >Priority: Minor > > Steps:- > > *Issue 1 : Merge command fails for case insensitive column name.* > drop table if exists A; > drop table if exists B; > CREATE TABLE A(id Int, name string, description string,address string, note > string) stored as carbondata > tblproperties('long_string_columns'='description,note','table_blocksize'='1','SORT_SCOPE'='global_sort','table_page_size_inmb'='1'); > > insert into A select > 1,"name1","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; > insert into A select > 2,"name2","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; > insert into A select > 3,"name3","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; > insert into A select > 4,"name4","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; > insert into A select > 5,"name5","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; > CREATE TABLE B(id Int, name string, description string,address string, note > string) stored as carbondata > tblproperties('long_string_columns'='description,note','table_blocksize'='1','SORT_SCOPE'='global_sort','table_page_size_inmb'='1'); > > insert into B select > 
1,"name1","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; > insert into B select > 2,"name2","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; > insert into B select > 3,"name3","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; > insert into B select > 6,"name4","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; > insert into B select > 7,"name5","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; > --merge > MERGE INTO A USING B ON A.id=B.id WHEN MATCHED THEN DELETE; > > Issue :- Merge SQL command fails with different case for column name > 0: jdbc:hive2://linux-63:22550/> MERGE INTO A USING B ON A.id=B.id WHEN > MATCHED THEN DELETE; > Error: org.apache.spark.sql.AnalysisException: == Spark Parser: > org.apache.spark.sql.hive.FISqlParser == > mismatched input 'MERGE' expecting \{'(', 'SELECT', 'FROM', 'ADD', 'DESC', > 'EMPOWER', 'WITH', 'VALUES', 'CREATE', 'TABLE', 'INSERT', 'DELETE', > 'DESCRIBE', 'EXPLAIN', 'SHOW', 'USE', 'DROP', 'ALTER', 'MAP', 'SET', 'RESET', > 'START', 'COMMIT', 'ROLLBACK', 'REDUCE', 'REFRESH', 'CLEAR', 'CACHE', > 'UNCACHE', 'DFS', 'TRUNCATE', 'ANALYZE', 'LIST', 'REVOKE', 'GRANT', 'LOCK', > 'UNLOCK', 'MSCK', 'EXPORT', 'IMPORT', 'LOAD', 'HEALTHCHECK'}(line 1, pos 0) > == SQL == > MERGE INTO A USING B ON A.id=B.id WHEN MATCHED THEN DELETE > ^^^ > == Carbon Parser: org.apache.spark.sql.parser.CarbonExtensionSpark2SqlParser > == > [1.1] failure: identifier matching regex (?i)EXPLAIN expected > MERGE INTO A USING B ON A.id=B.id WHEN MATCHED THEN DELETE > ^; > == Antlr Parser: org.apache.spark.sql.parser.CarbonAntlrParser == > > org.apache.spark.sql.parser.CarbonSqlBaseParser$ValueExpressionDefaultContext > cannot be cast to > org.apache.spark.sql.parser.CarbonSqlBaseParser$ComparisonContext; > (state=,code=0) > 0: jdbc:hive2://linux-63:22550/> > > *Issue 2 : merge into command is not working as case sensitive 
and fails as > mentioned below.* > 0: jdbc:hive2://linux1:22550/> merge into a using b on a.ID=b.ID when matched > then delete; > Error: > org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: > Parse failed! (state=,code=0) > 0: jdbc:hive2://linux1:22550/> merge into a using b on A.ID=B.ID when > matched then delete; > Error: > org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: > Parse failed! (state=,code=0) > 0: jdbc:hive2://linux1:22550/> merge into A using B on A.ID=B.ID when > matched then delete; > Error: > org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: > Parse failed! (state=,code=0) -- This
[jira] [Updated] (CARBONDATA-4128) Merge SQL command fails with different case for column name
[ https://issues.apache.org/jira/browse/CARBONDATA-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-4128: Description: Steps:- Issue 1 : Merge command fails for case insensitive column. drop table if exists A; drop table if exists B; CREATE TABLE A(id Int, name string, description string,address string, note string) stored as carbondata tblproperties('long_string_columns'='description,note','table_blocksize'='1','SORT_SCOPE'='global_sort','table_page_size_inmb'='1'); insert into A select 1,"name1","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into A select 2,"name2","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into A select 3,"name3","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into A select 4,"name4","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into A select 5,"name5","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; CREATE TABLE B(id Int, name string, description string,address string, note string) stored as carbondata tblproperties('long_string_columns'='description,note','table_blocksize'='1','SORT_SCOPE'='global_sort','table_page_size_inmb'='1'); insert into B select 1,"name1","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into B select 2,"name2","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into B select 3,"name3","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into B select 6,"name4","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into B select 7,"name5","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; --merge MERGE INTO A USING B ON A.id=B.id WHEN MATCHED THEN DELETE; Issue :- Merge SQL command fails with different case for column name 0: 
jdbc:hive2://linux-63:22550/> MERGE INTO A USING B ON A.id=B.id WHEN MATCHED THEN DELETE; Error: org.apache.spark.sql.AnalysisException: == Spark Parser: org.apache.spark.sql.hive.FISqlParser == mismatched input 'MERGE' expecting \{'(', 'SELECT', 'FROM', 'ADD', 'DESC', 'EMPOWER', 'WITH', 'VALUES', 'CREATE', 'TABLE', 'INSERT', 'DELETE', 'DESCRIBE', 'EXPLAIN', 'SHOW', 'USE', 'DROP', 'ALTER', 'MAP', 'SET', 'RESET', 'START', 'COMMIT', 'ROLLBACK', 'REDUCE', 'REFRESH', 'CLEAR', 'CACHE', 'UNCACHE', 'DFS', 'TRUNCATE', 'ANALYZE', 'LIST', 'REVOKE', 'GRANT', 'LOCK', 'UNLOCK', 'MSCK', 'EXPORT', 'IMPORT', 'LOAD', 'HEALTHCHECK'}(line 1, pos 0) == SQL == MERGE INTO A USING B ON A.id=B.id WHEN MATCHED THEN DELETE ^^^ == Carbon Parser: org.apache.spark.sql.parser.CarbonExtensionSpark2SqlParser == [1.1] failure: identifier matching regex (?i)EXPLAIN expected MERGE INTO A USING B ON A.id=B.id WHEN MATCHED THEN DELETE ^; == Antlr Parser: org.apache.spark.sql.parser.CarbonAntlrParser == org.apache.spark.sql.parser.CarbonSqlBaseParser$ValueExpressionDefaultContext cannot be cast to org.apache.spark.sql.parser.CarbonSqlBaseParser$ComparisonContext; (state=,code=0) 0: jdbc:hive2://linux-63:22550/> *Issue 2 : merge into command is not working as case sensitive and fails as mentioned below.* 0: jdbc:hive2://linux1:22550/> merge into a using b on a.ID=b.ID when matched then delete; Error: org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: Parse failed! (state=,code=0) 0: jdbc:hive2://linux1:22550/> merge into a using b on A.ID=B.ID when matched then delete; Error: org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: Parse failed! (state=,code=0) 0: jdbc:hive2://linux1:22550/> merge into A using B on A.ID=B.ID when matched then delete; Error: org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: Parse failed! 
(state=,code=0) was: Steps:- drop table if exists A; drop table if exists B; CREATE TABLE A(id Int, name string, description string,address string, note string) stored as carbondata tblproperties('long_string_columns'='description,note','table_blocksize'='1','SORT_SCOPE'='global_sort','table_page_size_inmb'='1'); insert into A select 1,"name1","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into A select 2,"name2","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into A select 3,"name3","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into A select 4,"name4","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into A select 5,"name5","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; CREATE TABLE B(id Int, name string, description
[jira] [Updated] (CARBONDATA-4128) Merge SQL command fails with different case for column name
[ https://issues.apache.org/jira/browse/CARBONDATA-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-4128: Description: Steps:- *Issue 1 : Merge command fails for case insensitive column name.* drop table if exists A; drop table if exists B; CREATE TABLE A(id Int, name string, description string,address string, note string) stored as carbondata tblproperties('long_string_columns'='description,note','table_blocksize'='1','SORT_SCOPE'='global_sort','table_page_size_inmb'='1'); insert into A select 1,"name1","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into A select 2,"name2","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into A select 3,"name3","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into A select 4,"name4","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into A select 5,"name5","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; CREATE TABLE B(id Int, name string, description string,address string, note string) stored as carbondata tblproperties('long_string_columns'='description,note','table_blocksize'='1','SORT_SCOPE'='global_sort','table_page_size_inmb'='1'); insert into B select 1,"name1","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into B select 2,"name2","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into B select 3,"name3","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into B select 6,"name4","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into B select 7,"name5","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; --merge MERGE INTO A USING B ON A.id=B.id WHEN MATCHED THEN DELETE; Issue :- Merge SQL command fails with different case for column name 
0: jdbc:hive2://linux-63:22550/> MERGE INTO A USING B ON A.id=B.id WHEN MATCHED THEN DELETE; Error: org.apache.spark.sql.AnalysisException: == Spark Parser: org.apache.spark.sql.hive.FISqlParser == mismatched input 'MERGE' expecting \{'(', 'SELECT', 'FROM', 'ADD', 'DESC', 'EMPOWER', 'WITH', 'VALUES', 'CREATE', 'TABLE', 'INSERT', 'DELETE', 'DESCRIBE', 'EXPLAIN', 'SHOW', 'USE', 'DROP', 'ALTER', 'MAP', 'SET', 'RESET', 'START', 'COMMIT', 'ROLLBACK', 'REDUCE', 'REFRESH', 'CLEAR', 'CACHE', 'UNCACHE', 'DFS', 'TRUNCATE', 'ANALYZE', 'LIST', 'REVOKE', 'GRANT', 'LOCK', 'UNLOCK', 'MSCK', 'EXPORT', 'IMPORT', 'LOAD', 'HEALTHCHECK'}(line 1, pos 0) == SQL == MERGE INTO A USING B ON A.id=B.id WHEN MATCHED THEN DELETE ^^^ == Carbon Parser: org.apache.spark.sql.parser.CarbonExtensionSpark2SqlParser == [1.1] failure: identifier matching regex (?i)EXPLAIN expected MERGE INTO A USING B ON A.id=B.id WHEN MATCHED THEN DELETE ^; == Antlr Parser: org.apache.spark.sql.parser.CarbonAntlrParser == org.apache.spark.sql.parser.CarbonSqlBaseParser$ValueExpressionDefaultContext cannot be cast to org.apache.spark.sql.parser.CarbonSqlBaseParser$ComparisonContext; (state=,code=0) 0: jdbc:hive2://linux-63:22550/> *Issue 2 : merge into command is not working as case sensitive and fails as mentioned below.* 0: jdbc:hive2://linux1:22550/> merge into a using b on a.ID=b.ID when matched then delete; Error: org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: Parse failed! (state=,code=0) 0: jdbc:hive2://linux1:22550/> merge into a using b on A.ID=B.ID when matched then delete; Error: org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: Parse failed! (state=,code=0) 0: jdbc:hive2://linux1:22550/> merge into A using B on A.ID=B.ID when matched then delete; Error: org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: Parse failed! (state=,code=0) was: Steps:- Issue 1 : Merge command fails for case insensitive column. 
drop table if exists A; drop table if exists B; CREATE TABLE A(id Int, name string, description string,address string, note string) stored as carbondata tblproperties('long_string_columns'='description,note','table_blocksize'='1','SORT_SCOPE'='global_sort','table_page_size_inmb'='1'); insert into A select 1,"name1","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into A select 2,"name2","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into A select 3,"name3","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into A select 4,"name4","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into A select
[jira] [Updated] (CARBONDATA-4129) Class cast exception when array, struct, binary and string type data are merged using the merge SQL command
[ https://issues.apache.org/jira/browse/CARBONDATA-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-4129: Summary: Class cast exception when array, struct , binary and string type data tried to be merged using merge SQL command (was: Class cast exception when array, struct , binary and string type data tried to be merged) > Class cast exception when array, struct , binary and string type data tried > to be merged using merge SQL command > > > Key: CARBONDATA-4129 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4129 > Project: CarbonData > Issue Type: Bug > Components: data-query >Affects Versions: 2.1.0 > Environment: Spark 2.4.5 >Reporter: Chetan Bhat >Priority: Major > > > *Scenario 1 : - merge command with insertion on string with expression > **throws error.**. Also insert into binary with expression throws error.* > drop table if exists A; > drop table if exists B; > CREATE TABLE A(id Int, name string, description string,address string, note > string) stored as carbondata > tblproperties('long_string_columns'='description,note','table_blocksize'='1','SORT_SCOPE'='global_sort','table_page_size_inmb'='1'); > > CREATE TABLE B(id Int, name string, description string,address string, note > string) stored as carbondata > tblproperties('long_string_columns'='description,note','table_blocksize'='1','SORT_SCOPE'='global_sort','table_page_size_inmb'='1'); > > insert into A select > 1,"name1A","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; > insert into A select > 2,"name2A","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; > insert into A select > 3,"name3A","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; > insert into A select > 4,"name4A","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; > insert into A select > 
5,"name5A","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; > insert into B select > 1,"name1B","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; > insert into B select > 2,"name2B","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; > insert into B select > 3,"name3B","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; > insert into B select > 6,"name4B","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; > insert into B select > 7,"name5B","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; > MERGE INTO A USING B ON A.ID=B.ID WHEN NOT MATCHED AND B.ID=7 THEN INSERT > (A.ID,A.name,A.description ,A.address, A.note) VALUES > (B.ID,B.name+'10',B.description ,B.address,'test-string'); > 0: jdbc:hive2://linux-63:22550/> MERGE INTO A USING B ON A.ID=B.ID WHEN NOT > MATCHED AND B.ID=7 THEN INSERT (A.ID,A.name,A.description ,A.address, A.note) > VALUES (B.ID,B.name+'10',B.description ,B.address,'test-string'); > Error: org.apache.spark.SparkException: Job aborted due to stage failure: > Task 4 in stage 3813.0 failed 4 times, most recent failure: Lost task 4.3 in > stage 3813.0 (TID 23528, linux-63, executor 5): java.lang.ClassCastException: > org.apache.spark.sql.types.StringType$ cannot be cast to > org.apache.spark.sql.types.NumericType > at > org.apache.spark.sql.catalyst.util.TypeUtils$.getNumeric(TypeUtils.scala:58) > at > org.apache.spark.sql.catalyst.expressions.Add.numeric$lzycompute(arithmetic.scala:166) > at > org.apache.spark.sql.catalyst.expressions.Add.numeric(arithmetic.scala:166) > at > org.apache.spark.sql.catalyst.expressions.Add.nullSafeEval(arithmetic.scala:172) > at > org.apache.spark.sql.catalyst.expressions.BinaryExpression.eval(Expression.scala:486) > at > org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:92) > at > 
org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:66) > at > org.apache.spark.sql.execution.command.mutation.merge.MergeProjection.apply(MergeProjection.scala:54) > at > org.apache.spark.sql.execution.command.mutation.merge.CarbonMergeDataSetCommand$$anonfun$processIUD$1$$anon$1$$anonfun$next$1.apply(CarbonMergeDataSetCommand.scala:341) > at > org.apache.spark.sql.execution.command.mutation.merge.CarbonMergeDataSetCommand$$anonfun$processIUD$1$$anon$1$$anonfun$next$1.apply(CarbonMergeDataSetCommand.scala:338) > at scala.collection.immutable.List.foreach(List.scala:392) > at >
[jira] [Created] (CARBONDATA-4129) Class cast exception when array, struct, binary and string type data are merged
Chetan Bhat created CARBONDATA-4129: --- Summary: Class cast exception when array, struct , binary and string type data tried to be merged Key: CARBONDATA-4129 URL: https://issues.apache.org/jira/browse/CARBONDATA-4129 Project: CarbonData Issue Type: Bug Components: data-query Affects Versions: 2.1.0 Environment: Spark 2.4.5 Reporter: Chetan Bhat *Scenario 1 : - merge command with insertion on string with expression **throws error.**. Also insert into binary with expression throws error.* drop table if exists A; drop table if exists B; CREATE TABLE A(id Int, name string, description string,address string, note string) stored as carbondata tblproperties('long_string_columns'='description,note','table_blocksize'='1','SORT_SCOPE'='global_sort','table_page_size_inmb'='1'); CREATE TABLE B(id Int, name string, description string,address string, note string) stored as carbondata tblproperties('long_string_columns'='description,note','table_blocksize'='1','SORT_SCOPE'='global_sort','table_page_size_inmb'='1'); insert into A select 1,"name1A","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into A select 2,"name2A","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into A select 3,"name3A","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into A select 4,"name4A","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into A select 5,"name5A","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into B select 1,"name1B","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into B select 2,"name2B","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into B select 3,"name3B","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into B select 
6,"name4B","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into B select 7,"name5B","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; MERGE INTO A USING B ON A.ID=B.ID WHEN NOT MATCHED AND B.ID=7 THEN INSERT (A.ID,A.name,A.description ,A.address, A.note) VALUES (B.ID,B.name+'10',B.description ,B.address,'test-string'); 0: jdbc:hive2://linux-63:22550/> MERGE INTO A USING B ON A.ID=B.ID WHEN NOT MATCHED AND B.ID=7 THEN INSERT (A.ID,A.name,A.description ,A.address, A.note) VALUES (B.ID,B.name+'10',B.description ,B.address,'test-string'); Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 4 in stage 3813.0 failed 4 times, most recent failure: Lost task 4.3 in stage 3813.0 (TID 23528, linux-63, executor 5): java.lang.ClassCastException: org.apache.spark.sql.types.StringType$ cannot be cast to org.apache.spark.sql.types.NumericType at org.apache.spark.sql.catalyst.util.TypeUtils$.getNumeric(TypeUtils.scala:58) at org.apache.spark.sql.catalyst.expressions.Add.numeric$lzycompute(arithmetic.scala:166) at org.apache.spark.sql.catalyst.expressions.Add.numeric(arithmetic.scala:166) at org.apache.spark.sql.catalyst.expressions.Add.nullSafeEval(arithmetic.scala:172) at org.apache.spark.sql.catalyst.expressions.BinaryExpression.eval(Expression.scala:486) at org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:92) at org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:66) at org.apache.spark.sql.execution.command.mutation.merge.MergeProjection.apply(MergeProjection.scala:54) at org.apache.spark.sql.execution.command.mutation.merge.CarbonMergeDataSetCommand$$anonfun$processIUD$1$$anon$1$$anonfun$next$1.apply(CarbonMergeDataSetCommand.scala:341) at 
org.apache.spark.sql.execution.command.mutation.merge.CarbonMergeDataSetCommand$$anonfun$processIUD$1$$anon$1$$anonfun$next$1.apply(CarbonMergeDataSetCommand.scala:338) at scala.collection.immutable.List.foreach(List.scala:392) at org.apache.spark.sql.execution.command.mutation.merge.CarbonMergeDataSetCommand$$anonfun$processIUD$1$$anon$1.next(CarbonMergeDataSetCommand.scala:338) at org.apache.spark.sql.execution.command.mutation.merge.CarbonMergeDataSetCommand$$anonfun$processIUD$1$$anon$1.next(CarbonMergeDataSetCommand.scala:319) at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:463) at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409) at org.apache.spark.sql.execution.columnar.CachedRDDBuilder$$anonfun$1$$anon$1.hasNext(InMemoryRelation.scala:125) at org.apache.spark.storage.memory.MemoryStore.putIterator(MemoryStore.scala:221) at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:299) at
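The ClassCastException arises because `B.name+'10'` builds an `Add` expression, and the interpreted merge projection evaluates it with a numeric evaluator that rejects string operands, hence "StringType$ cannot be cast to NumericType". A toy Python sketch of that operand check (illustrative only, not Spark's code); in SQL, string concatenation would normally use `concat(B.name, '10')` rather than `+`:

```python
from numbers import Number

def add_operands(left, right):
    # Mirrors the failing check: Add's numeric evaluator requires numeric
    # operands, so a string operand triggers a cast error (toy sketch).
    if not isinstance(left, Number) or not isinstance(right, Number):
        raise TypeError("StringType$ cannot be cast to NumericType")
    return left + right

assert add_operands(2, 3) == 5
# add_operands("name5B", "10") would raise, as in the reported stack trace
```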
[jira] [Created] (CARBONDATA-4128) Merge SQL command fails with different case for column name
Chetan Bhat created CARBONDATA-4128: --- Summary: Merge SQL command fails with different case for column name Key: CARBONDATA-4128 URL: https://issues.apache.org/jira/browse/CARBONDATA-4128 Project: CarbonData Issue Type: Bug Components: data-query Affects Versions: 2.1.0 Environment: Spark 2.4.5 Reporter: Chetan Bhat Steps:- drop table if exists A; drop table if exists B; CREATE TABLE A(id Int, name string, description string,address string, note string) stored as carbondata tblproperties('long_string_columns'='description,note','table_blocksize'='1','SORT_SCOPE'='global_sort','table_page_size_inmb'='1'); insert into A select 1,"name1","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into A select 2,"name2","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into A select 3,"name3","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into A select 4,"name4","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into A select 5,"name5","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; CREATE TABLE B(id Int, name string, description string,address string, note string) stored as carbondata tblproperties('long_string_columns'='description,note','table_blocksize'='1','SORT_SCOPE'='global_sort','table_page_size_inmb'='1'); insert into B select 1,"name1","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into B select 2,"name2","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into B select 3,"name3","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into B select 6,"name4","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into B select 7,"name5","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; --merge MERGE INTO A USING B ON A.id=B.id WHEN MATCHED 
THEN DELETE; Issue :- Merge SQL command fails with different case for column name 0: jdbc:hive2://linux-63:22550/> MERGE INTO A USING B ON A.id=B.id WHEN MATCHED THEN DELETE; Error: org.apache.spark.sql.AnalysisException: == Spark Parser: org.apache.spark.sql.hive.FISqlParser == mismatched input 'MERGE' expecting \{'(', 'SELECT', 'FROM', 'ADD', 'DESC', 'EMPOWER', 'WITH', 'VALUES', 'CREATE', 'TABLE', 'INSERT', 'DELETE', 'DESCRIBE', 'EXPLAIN', 'SHOW', 'USE', 'DROP', 'ALTER', 'MAP', 'SET', 'RESET', 'START', 'COMMIT', 'ROLLBACK', 'REDUCE', 'REFRESH', 'CLEAR', 'CACHE', 'UNCACHE', 'DFS', 'TRUNCATE', 'ANALYZE', 'LIST', 'REVOKE', 'GRANT', 'LOCK', 'UNLOCK', 'MSCK', 'EXPORT', 'IMPORT', 'LOAD', 'HEALTHCHECK'}(line 1, pos 0) == SQL == MERGE INTO A USING B ON A.id=B.id WHEN MATCHED THEN DELETE ^^^ == Carbon Parser: org.apache.spark.sql.parser.CarbonExtensionSpark2SqlParser == [1.1] failure: identifier matching regex (?i)EXPLAIN expected MERGE INTO A USING B ON A.id=B.id WHEN MATCHED THEN DELETE ^; == Antlr Parser: org.apache.spark.sql.parser.CarbonAntlrParser == org.apache.spark.sql.parser.CarbonSqlBaseParser$ValueExpressionDefaultContext cannot be cast to org.apache.spark.sql.parser.CarbonSqlBaseParser$ComparisonContext; (state=,code=0) 0: jdbc:hive2://linux-63:22550/> -- This message was sent by Atlassian Jira (v8.3.4#803005)
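Since SQL column identifiers are case-insensitive, `A.id` and `A.ID` should resolve to the same column in the merge condition. A hypothetical sketch of the normalization a parser could apply before comparing the join-condition column references (this is not CarbonData's actual fix, just an illustration of the expected semantics):

```python
def same_column(left: str, right: str) -> bool:
    # SQL identifiers are case-insensitive by default, so compare
    # table.column references after lower-casing (illustrative sketch).
    return left.lower() == right.lower()

assert same_column("A.id", "a.ID")       # same column, different case
assert not same_column("A.id", "B.id")   # different tables
```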
[jira] [Created] (CARBONDATA-4127) Merge SQL command not working with different table names
Chetan Bhat created CARBONDATA-4127: --- Summary: Merge SQL command not working with different table names Key: CARBONDATA-4127 URL: https://issues.apache.org/jira/browse/CARBONDATA-4127 Project: CarbonData Issue Type: Bug Components: data-query Affects Versions: 2.1.0 Environment: Spark 2.4.5 Reporter: Chetan Bhat Steps:- CREATE TABLE lsc1(id Int, name string, description string,address string, note string) stored as carbondata tblproperties('long_string_columns'='description,note','table_blocksize'='1','SORT_SCOPE'='global_sort','table_page_size_inmb'='1'); insert into lsc1 select 1,"name1","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into lsc1 select 2,"name2","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into lsc1 select 3,"name3","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into lsc1 select 4,"name4","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into lsc1 select 5,"name5","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; CREATE TABLE lsc2(id Int, name string, description string,address string, note string) stored as carbondata tblproperties('long_string_columns'='description,note','table_blocksize'='1','SORT_SCOPE'='global_sort','table_page_size_inmb'='1'); insert into lsc2 select 1,"name1","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into lsc2 select 2,"name2","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into lsc2 select 3,"name3","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into lsc2 select 6,"name4","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; insert into lsc2 select 7,"name5","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798"; Issue :- merge fails with parse error 0: jdbc:hive2://linux-63:22550/> 
MERGE INTO lsc1 USING lsc2 ON lsc1.ID=lsc2.ID WHEN MATCHED THEN DELETE; *Error: org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: Parse failed! (state=,code=0)* *0: jdbc:hive2://linux-63:22550/>* -- This message was sent by Atlassian Jira (v8.3.4#803005)
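For context on what the failing statement should have done, here is a minimal Python sketch (not CarbonData code) of the semantics requested by `MERGE INTO lsc1 USING lsc2 ON lsc1.ID=lsc2.ID WHEN MATCHED THEN DELETE`: every row of the target whose key also exists in the source is deleted.

```python
# Illustrative sketch of MERGE ... WHEN MATCHED THEN DELETE semantics over
# lists of dict rows; the function name and row shapes are assumptions.
def merge_matched_delete(target, source, key="id"):
    """Delete from target every row whose key value appears in source."""
    source_keys = {row[key] for row in source}
    return [row for row in target if row[key] not in source_keys]

# Mirrors the data loaded into lsc1 (ids 1-5) and lsc2 (ids 1,2,3,6,7).
lsc1 = [{"id": i, "name": f"name{i}"} for i in (1, 2, 3, 4, 5)]
lsc2 = [{"id": i} for i in (1, 2, 3, 6, 7)]
print(merge_matched_delete(lsc1, lsc2))  # only the rows with ids 4 and 5 remain
```

Per the report, the parse error occurs before these semantics are ever reached.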
[jira] [Created] (CARBONDATA-4126) Concurrent Compaction fails with Load on table with SI
Chetan Bhat created CARBONDATA-4126: --- Summary: Concurrent Compaction fails with Load on table with SI Key: CARBONDATA-4126 URL: https://issues.apache.org/jira/browse/CARBONDATA-4126 Project: CarbonData Issue Type: Bug Components: data-load Affects Versions: 2.1.0 Environment: Spark 2.4.5 Reporter: Chetan Bhat [Steps] :- Create table, load data and create SI. create table brinjal (imei string,AMSize string,channelsId string,ActiveCountry string, Activecity string,gamePointId double,deviceInformationId double,productionDate Timestamp,deliveryDate timestamp,deliverycharge double) stored as carbondata TBLPROPERTIES('table_blocksize'='1'); LOAD DATA INPATH 'hdfs://hacluster/chetan/vardhandaterestruct.csv' INTO TABLE brinjal OPTIONS('DELIMITER'=',', 'QUOTECHAR'= '"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge'); create index indextable1 ON TABLE brinjal (AMSize) AS 'carbondata'; >From one terminal load data to table and other terminal perform minor and >major compaction on the table concurrently for some time. LOAD DATA INPATH 'hdfs://hacluster/chetan/vardhandaterestruct.csv' INTO TABLE brinjal OPTIONS('DELIMITER'=',', 'QUOTECHAR'= '"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge'); alter table brinjal compact 'minor'; alter table brinjal compact 'major'; [Expected Result] :- Concurrent Compaction should be success with Load on table with SI [Actual Issue] : - Concurrent Compaction fails with Load on table with SI *0: jdbc:hive2://linux-32:22550/> alter table brinjal compact 'major';* *Error: org.apache.spark.sql.AnalysisException: Compaction failed. Please check logs for more info. 
Exception in compaction Failed to acquire lock on segment 2, during compaction of table test.brinjal; (state=,code=0)*
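The failure mode above is lock contention: compaction must acquire a per-segment lock that the concurrent load can already hold. A minimal Python sketch of that pattern (class and function names are illustrative, not CarbonData's API):

```python
import threading
import time

class SegmentLock:
    """Toy stand-in for a per-segment lock guarding a table segment."""
    def __init__(self):
        self._lock = threading.Lock()
    def try_acquire(self):
        return self._lock.acquire(blocking=False)
    def release(self):
        self._lock.release()

def acquire_with_retry(lock, retries=3, interval=0.01):
    """Attempt the lock a few times before giving up, as compaction would."""
    for _ in range(retries):
        if lock.try_acquire():
            return True
        time.sleep(interval)
    return False  # caller surfaces "Failed to acquire lock on segment"

lock = SegmentLock()
assert acquire_with_retry(lock)       # free lock: compaction proceeds
assert not acquire_with_retry(lock)   # lock held (e.g. by a load): compaction fails
lock.release()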
[jira] [Closed] (CARBONDATA-4049) Sometimes refresh table fails with "table not found in database" error
[ https://issues.apache.org/jira/browse/CARBONDATA-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat closed CARBONDATA-4049. --- Resolution: Won't Fix Issue is identified as that of refresh which is handled by Spark. As issue is not a carbon issue its closed. > Sometimes refresh table fails with error "table not found in database" error > > > Key: CARBONDATA-4049 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4049 > Project: CarbonData > Issue Type: Bug > Components: data-query >Affects Versions: 2.1.0 > Environment: Spark 2.4.5 >Reporter: Chetan Bhat >Priority: Minor > > In Carbon 2.1 version user creates a database. > user copies a old version store such as 1.6.1 to HDFS folder of the database > in the In Carbon 2.1 version > In Spark-SQL or beeline the user accesses the database using the use db > command. > Refresh table command is executed on the old version store table and then the > subsequent operations on the table are performed. > Next refresh table command is tried to be executed on another old version > store table . > > Issue : Sometimes refresh table fails with error "table not found in > database" error. 
> spark-sql> refresh table brinjal_deleteseg; > *Error in query: Table or view 'brinjal_deleteseg' not found in database > '1_6_1';* > > **Log - > 2020-11-12 18:55:46,922 | INFO | [main] | Created broadcast 171 from > broadCastHadoopConf at CarbonRDD.scala:58 | > org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)2020-11-12 > 18:55:46,922 | INFO | [main] | Created broadcast 171 from > broadCastHadoopConf at CarbonRDD.scala:58 | > org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)2020-11-12 > 18:55:46,924 | INFO | [main] | Pushed Filters: | > org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)2020-11-12 > 18:55:46,939 | INFO | [main] | Distributed Index server is enabled for > 1_6_1.brinjal_update | > org.apache.carbondata.core.util.CarbonProperties.isDistributedPruningEnabled(CarbonProperties.java:1742)2020-11-12 > 18:55:46,939 | INFO | [main] | Started block pruning ... | > org.apache.carbondata.hadoop.api.CarbonInputFormat.getPrunedBlocklets(CarbonInputFormat.java:526)2020-11-12 > 18:55:46,940 | INFO | [main] | Distributed Index server is enabled for > 1_6_1.brinjal_update | > org.apache.carbondata.core.util.CarbonProperties.isDistributedPruningEnabled(CarbonProperties.java:1742)2020-11-12 > 18:55:46,945 | INFO | [main] | Successfully Created directory: > hdfs://hacluster/tmp/indexservertmp/4b6353d4-65d7-4856-b3cd-b3bc11d15c55 | > org.apache.carbondata.core.util.CarbonUtil.createTempFolderForIndexServer(CarbonUtil.java:3273)2020-11-12 > 18:55:46,945 | INFO | [main] | Temp folder path for Query ID: > 4b6353d4-65d7-4856-b3cd-b3bc11d15c55 is > org.apache.carbondata.core.datastore.filesystem.HDFSCarbonFile@b8f2e1bf | > org.apache.carbondata.indexserver.DistributedIndexJob.execute(IndexJobs.scala:57)2020-11-12 > 18:55:46,946 | ERROR | [main] | Configured port for index server is not a > valid number | > 
org.apache.carbondata.core.util.CarbonProperties.getIndexServerPort(CarbonProperties.java:1779)java.lang.NumberFormatException: > null at java.lang.Integer.parseInt(Integer.java:542) at > java.lang.Integer.parseInt(Integer.java:615) at > org.apache.carbondata.core.util.CarbonProperties.getIndexServerPort(CarbonProperties.java:1777) > at > org.apache.carbondata.indexserver.IndexServer$.serverPort$lzycompute(IndexServer.scala:88) > at > org.apache.carbondata.indexserver.IndexServer$.serverPort(IndexServer.scala:88) > at > org.apache.carbondata.indexserver.IndexServer$.getClient(IndexServer.scala:312) > at > org.apache.carbondata.indexserver.IndexServer$.getClient(IndexServer.scala:301) > at > org.apache.carbondata.indexserver.DistributedIndexJob$$anonfun$1.apply(IndexJobs.scala:83) > at > org.apache.carbondata.indexserver.DistributedIndexJob$$anonfun$1.apply(IndexJobs.scala:59) > at > org.apache.carbondata.spark.util.CarbonScalaUtil$.logTime(CarbonScalaUtil.scala:769) > at > org.apache.carbondata.indexserver.DistributedIndexJob.execute(IndexJobs.scala:58) > at > org.apache.carbondata.core.index.IndexUtil.executeIndexJob(IndexUtil.java:304) > at > org.apache.carbondata.hadoop.api.CarbonInputFormat.getDistributedSplit(CarbonInputFormat.java:431) > at > org.apache.carbondata.hadoop.api.CarbonInputFormat.getPrunedBlocklets(CarbonInputFormat.java:532) > at > org.apache.carbondata.hadoop.api.CarbonInputFormat.getDataBlocksOfSegment(CarbonInputFormat.java:477) > at > org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getSplits(CarbonTableInputFormat.java:356) > at >
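The stack trace shows `Integer.parseInt(null)` throwing `NumberFormatException` because no index-server port was configured. A defensive parse with a fallback avoids that failure mode; this is an illustrative Python sketch, and the property key and default value here are assumptions, not CarbonData's actual configuration:

```python
# Null-safe port lookup: an unset or malformed value falls back to a default
# instead of raising, unlike the Integer.parseInt(null) path in the log above.
def get_index_server_port(props, default=9090):
    raw = props.get("carbon.index.server.port")  # assumed property key
    try:
        return int(raw)
    except (TypeError, ValueError):  # None (unset) or non-numeric string
        return default

print(get_index_server_port({}))                                    # falls back to default
print(get_index_server_port({"carbon.index.server.port": "10020"}))  # parses configured value
```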
[jira] [Closed] (CARBONDATA-4048) Update fails after continuous update operations with error
[ https://issues.apache.org/jira/browse/CARBONDATA-4048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat closed CARBONDATA-4048. --- Resolution: Won't Fix subquery used within update query fetches more than 1 row. Hence the exception was thrown. Exception message is handled properly and carbon is printing the message from spark directly. > Update fails after continous update operations with error > - > > Key: CARBONDATA-4048 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4048 > Project: CarbonData > Issue Type: Bug > Components: data-query >Affects Versions: 2.1.0 > Environment: Spark 2.3.2 >Reporter: Chetan Bhat >Priority: Minor > > Create table , load data and perform continous update operation on the table. > 0: jdbc:hive2://10.20.255.171:23040> CREATE TABLE uniqdata (CUST_ID > int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ > timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 > decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, > Double_COLUMN2 double,INTEGER_COLUMN1 int) stored as carbondata TBLPROPERTIES > ("TABLE_BLOCKSIZE"= "256 MB",'flat_folder'='true'); > +-+--+ > | Result | > +-+--+ > +-+--+ > No rows selected (0.177 seconds) > 0: jdbc:hive2://10.20.255.171:23040> LOAD DATA inpath > 'hdfs://hacluster/chetan/2000_UniqData.csv' INTO table uniqdata > options('DELIMITER'=',', 'FILEHEADER'='CUST_ID, CUST_NAME, > ACTIVE_EMUI_VERSION, DOB, DOJ, BIGINT_COLUMN1, BIGINT_COLUMN2, > DECIMAL_COLUMN1, DECIMAL_COLUMN2, Double_COLUMN1, Double_COLUMN2, > INTEGER_COLUMN1','TIMESTAMPFORMAT'='-MM-dd > HH:mm:ss','BAD_RECORDS_ACTION'='FORCE'); > +-+--+ > | Result | > +-+--+ > +-+--+ > No rows selected (1.484 seconds) > 0: jdbc:hive2://10.20.255.171:23040> update uniqdata set > (bigint_column1)=(100) where bigint_column1<123372036856; > ++--+ > | Updated Row Count | > ++--+ > | 2 | > ++--+ > 1 row selected (3.294 seconds) > 0: jdbc:hive2://10.20.255.171:23040> update uniqdata set 
> (bigint_column1)=(100) where bigint_column1>123372038852; > ++--+ > | Updated Row Count | > ++--+ > | 1 | > ++--+ > 1 row selected (3.467 seconds) > 0: jdbc:hive2://10.20.255.171:23040> update uniqdata set > (bigint_column1)=(100) where bigint_column1<=123372036859; > ++--+ > | Updated Row Count | > ++--+ > | 9 | > ++--+ > 1 row selected (3.349 seconds) > 0: jdbc:hive2://10.20.255.171:23040> update uniqdata set > (bigint_column1)=(100) where bigint_column1>=123372038846; > ++--+ > | Updated Row Count | > ++--+ > | 8 | > ++--+ > 1 row selected (3.259 seconds) > 0: jdbc:hive2://10.20.255.171:23040> update uniqdata set > (bigint_column1)=(100) where bigint_column1 == 123372038845; > ++--+ > | Updated Row Count | > ++--+ > | 1 | > ++--+ > 1 row selected (4.164 seconds) > 0: jdbc:hive2://10.20.255.171:23040> update uniqdata set > (bigint_column1)=(100) where bigint_column1 like '123%'; > ++--+ > | Updated Row Count | > ++--+ > | 2000 | > ++--+ > 1 row selected (3.695 seconds) > 0: jdbc:hive2://10.20.255.171:23040> update uniqdata set > (bigint_column1)=(100) where bigint_column1 between 123372038849 AND > 123372038855; > ++--+ > | Updated Row Count | > ++--+ > | 5 | > ++--+ > 1 row selected (3.228 seconds) > 0: jdbc:hive2://10.20.255.171:23040> update uniqdata set > (bigint_column1)=(100) where bigint_column1 = 123372038845 OR false; > ++--+ > | Updated Row Count | > ++--+ > | 1 | > ++--+ > 1 row selected (3.548 seconds) > 0: jdbc:hive2://10.20.255.171:23040> update uniqdata set > (bigint_column1)=(100) where bigint_column1 = 123372038849 AND true; > ++--+ > | Updated Row Count | > ++--+ > | 1 | > ++--+ > 1 row selected (3.321 seconds) > 0: jdbc:hive2://10.20.255.171:23040> update uniqdata set > (bigint_column1)=(100) where bigint_column1 not between (123372038849) AND > (12337203885); > ++--+ > | Updated Row Count | > ++--+ > | 4025 | > ++--+ > 1 row selected (3.718 seconds) > 0: jdbc:hive2://10.20.255.171:23040> update uniqdata set > (cust_name)=('deepti') where 
cust_name<'CUST_NAME_01990'; > ++--+ > | Updated Row Count | > ++--+ > | 5978 | >
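Per the resolution above, the update failed because a subquery used in the SET clause returned more than one row. A scalar subquery must yield exactly one value; a minimal Python sketch of the cardinality check an engine applies (names are illustrative):

```python
# A scalar subquery result must contain exactly one row; anything else is an
# error surfaced to the UPDATE statement, as in the closed issue above.
def scalar_subquery(rows):
    if len(rows) != 1:
        raise ValueError(
            f"scalar subquery returned {len(rows)} rows, expected exactly 1")
    return rows[0]

print(scalar_subquery([100]))  # a single row is accepted
try:
    scalar_subquery([100, 200])
except ValueError as e:
    print(e)  # more than one row: the update is rejected
```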
[jira] [Closed] (CARBONDATA-4061) Empty values for date and timestamp columns are read as null when using SDK. If we pass an empty value to date and timestamp columns, it gives a null pointer exception
[ https://issues.apache.org/jira/browse/CARBONDATA-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat closed CARBONDATA-4061. --- Resolution: Fixed Fixed in 2.1.0 version > Empty value for date and timestamp columns are reading as null when using > SDK. if we pass empty value to data and timestamp columns ,it gives null > pointer exception > > > Key: CARBONDATA-4061 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4061 > Project: CarbonData > Issue Type: Bug > Components: data-query, other >Affects Versions: 2.1.0 > Environment: Spark 2.4.5 >Reporter: Chetan Bhat >Priority: Minor > > Empty value for date and timestamp columns are reading as null when using > SDK. if we pass empty value to data and timestamp columns ,it gives null > pointer exception > > 2020-11-27 13:44:20 INFO ResultCollectorFactory:78 - Vector based dictionary > collector is used to scan and collect the data > 2020-11-27 13:44:20 INFO DictionaryBasedVectorResultCollector:73 - Direct > page-wise vector fill collector is used to scan and collect the data > java.lang.NullPointerException > at > org.apache.carbondata.sdk.file.CarbonReader.formatDateAndTimeStamp(CarbonReader.java:153) > at > org.apache.carbondata.sdk.file.CarbonReader.readNextRow(CarbonReader.java:126) > at > com.apache.spark.SDKReaderTest.testSDKRederAll_data_types2(SDKReaderTest.java:239) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at org.junit.runner.JUnitCore.run(JUnitCore.java:137) > at > com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69) > at > com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33) > at > com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:220) > at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:53) > 2020-11-27 13:44:20 INFO SparkContext:54 - Invoking stop() from shutdown hook > 2020-11-27 13:44:20 INFO AbstractConnector:343 - Stopped Spark@22175d4f{HTTP/1.1,[http/1.1]}{10.19.36.215:4040} > 2020-11-27 13:44:20 INFO session:158 - node0 Stopped scavenging
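The NullPointerException above originates in `CarbonReader.formatDateAndTimeStamp` when a date or timestamp value is empty. A null-safe formatter returns None for empty input instead of raising; the function name mirrors the trace, but this Python implementation is only an illustrative sketch:

```python
from datetime import datetime

# Null-safe date/timestamp formatting: empty or missing input maps to None
# rather than triggering an exception, unlike the trace above.
def format_date_and_timestamp(value, fmt="%Y-%m-%d %H:%M:%S"):
    if value is None or value == "":
        return None  # empty value -> null, no exception
    return value.strftime(fmt)

print(format_date_and_timestamp(""))                                  # None
print(format_date_and_timestamp(datetime(2020, 11, 27, 13, 44, 20)))  # formatted timestamp
```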
[jira] [Updated] (CARBONDATA-4061) Empty values for date and timestamp columns are read as null when using SDK. If we pass an empty value to date and timestamp columns, it gives a null pointer exception
[ https://issues.apache.org/jira/browse/CARBONDATA-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-4061: Description: Empty value for date and timestamp columns are reading as null when using SDK. if we pass empty value to data and timestamp columns ,it gives null pointer exception 2020-11-27 13:44:20 INFO ResultCollectorFactory:78 - Vector based dictionary collector is used to scan and collect the data 2020-11-27 13:44:20 INFO DictionaryBasedVectorResultCollector:73 - Direct page-wise vector fill collector is used to scan and collect the data java.lang.NullPointerException at org.apache.carbondata.sdk.file.CarbonReader.formatDateAndTimeStamp(CarbonReader.java:153) at org.apache.carbondata.sdk.file.CarbonReader.readNextRow(CarbonReader.java:126) at com.apache.spark.SDKReaderTest.testSDKRederAll_data_types2(SDKReaderTest.java:239) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at 
org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.runners.ParentRunner.run(ParentRunner.java:363) at org.junit.runner.JUnitCore.run(JUnitCore.java:137) at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69) at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33) at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:220) at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:53) 2020-11-27 13:44:20 INFO SparkContext:54 - Invoking stop() from shutdown hook 2020-11-27 13:44:20 INFO AbstractConnector:343 - Stopped [Spark@22175d4f {HTTP/1.1|mailto:Spark@22175d4f%7BHTTP/1.1],[http/1.1]} {10.19.36.215:4040} 2020-11-27 13:44:20 INFO session:158 - node0 Stopped scavenging was: Empty value for date and timestamp columns are reading as null . 
if we pass empty value to data and timestamp columns ,it gives null pointer exception 2020-11-27 13:44:20 INFO ResultCollectorFactory:78 - Vector based dictionary collector is used to scan and collect the data 2020-11-27 13:44:20 INFO DictionaryBasedVectorResultCollector:73 - Direct page-wise vector fill collector is used to scan and collect the data java.lang.NullPointerException at org.apache.carbondata.sdk.file.CarbonReader.formatDateAndTimeStamp(CarbonReader.java:153) at org.apache.carbondata.sdk.file.CarbonReader.readNextRow(CarbonReader.java:126) at com.apache.spark.SDKReaderTest.testSDKRederAll_data_types2(SDKReaderTest.java:239) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) at
[jira] [Updated] (CARBONDATA-4061) Empty values for date and timestamp columns are read as null when using SDK. If we pass an empty value to date and timestamp columns, it gives a null pointer exception
[ https://issues.apache.org/jira/browse/CARBONDATA-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-4061: Summary: Empty value for date and timestamp columns are reading as null when using SDK. if we pass empty value to data and timestamp columns ,it gives null pointer exception (was: Empty value for date and timestamp columns are reading as null . if we pass empty value to data and timestamp columns ,it gives null pointer exception) > Empty value for date and timestamp columns are reading as null when using > SDK. if we pass empty value to data and timestamp columns ,it gives null > pointer exception > > > Key: CARBONDATA-4061 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4061 > Project: CarbonData > Issue Type: Bug > Components: data-query, other >Affects Versions: 2.1.0 > Environment: Spark 2.4.5 >Reporter: Chetan Bhat >Priority: Minor > > Empty value for date and timestamp columns are reading as null . if we pass > empty value to data and timestamp columns ,it gives null pointer exception > > 2020-11-27 13:44:20 INFO ResultCollectorFactory:78 - Vector based dictionary > collector is used to scan and collect the data > 2020-11-27 13:44:20 INFO DictionaryBasedVectorResultCollector:73 - Direct > page-wise vector fill collector is used to scan and collect the data > java.lang.NullPointerException > at > org.apache.carbondata.sdk.file.CarbonReader.formatDateAndTimeStamp(CarbonReader.java:153) > at > org.apache.carbondata.sdk.file.CarbonReader.readNextRow(CarbonReader.java:126) > at > com.apache.spark.SDKReaderTest.testSDKRederAll_data_types2(SDKReaderTest.java:239) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at org.junit.runner.JUnitCore.run(JUnitCore.java:137) > at > com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69) > at > com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33) > at > com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:220) > at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:53) > 2020-11-27 13:44:20 INFO SparkContext:54 - Invoking stop() from shutdown hook > 2020-11-27 13:44:20 INFO AbstractConnector:343 - Stopped > [Spark@22175d4f{HTTP/1.1|mailto:Spark@22175d4f%7BHTTP/1.1],[http/1.1]}\{10.19.36.215:4040} > 2020-11-27 13:44:20 INFO session:158 - node0 Stopped 
scavenging
[jira] [Created] (CARBONDATA-4061) Empty values for date and timestamp columns are read as null. If we pass an empty value to date and timestamp columns, it gives a null pointer exception
Chetan Bhat created CARBONDATA-4061: --- Summary: Empty value for date and timestamp columns are reading as null . if we pass empty value to data and timestamp columns ,it gives null pointer exception Key: CARBONDATA-4061 URL: https://issues.apache.org/jira/browse/CARBONDATA-4061 Project: CarbonData Issue Type: Bug Components: data-query, other Affects Versions: 2.1.0 Environment: Spark 2.4.5 Reporter: Chetan Bhat Empty value for date and timestamp columns are reading as null . if we pass empty value to data and timestamp columns ,it gives null pointer exception 2020-11-27 13:44:20 INFO ResultCollectorFactory:78 - Vector based dictionary collector is used to scan and collect the data 2020-11-27 13:44:20 INFO DictionaryBasedVectorResultCollector:73 - Direct page-wise vector fill collector is used to scan and collect the data java.lang.NullPointerException at org.apache.carbondata.sdk.file.CarbonReader.formatDateAndTimeStamp(CarbonReader.java:153) at org.apache.carbondata.sdk.file.CarbonReader.readNextRow(CarbonReader.java:126) at com.apache.spark.SDKReaderTest.testSDKRederAll_data_types2(SDKReaderTest.java:239) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.runners.ParentRunner.run(ParentRunner.java:363) at org.junit.runner.JUnitCore.run(JUnitCore.java:137) at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69) at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33) at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:220) at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:53) 2020-11-27 13:44:20 INFO SparkContext:54 - Invoking stop() from shutdown hook 2020-11-27 13:44:20 INFO AbstractConnector:343 - Stopped Spark@22175d4f{HTTP/1.1,[http/1.1]}{10.19.36.215:4040} 2020-11-27 13:44:20 INFO session:158 - node0 Stopped scavenging
[jira] [Created] (CARBONDATA-4049) Sometimes refresh table fails with "table not found in database" error
Chetan Bhat created CARBONDATA-4049: --- Summary: Sometimes refresh table fails with error "table not found in database" error Key: CARBONDATA-4049 URL: https://issues.apache.org/jira/browse/CARBONDATA-4049 Project: CarbonData Issue Type: Bug Components: data-query Affects Versions: 2.1.0 Environment: Spark 2.4.5 Reporter: Chetan Bhat In Carbon 2.1 version user creates a database. user copies a old version store such as 1.6.1 to HDFS folder of the database in the In Carbon 2.1 version In Spark-SQL or beeline the user accesses the database using the use db command. Refresh table command is executed on the old version store table and then the subsequent operations on the table are performed. Next refresh table command is tried to be executed on another old version store table . Issue : Sometimes refresh table fails with error "table not found in database" error. spark-sql> refresh table brinjal_deleteseg; *Error in query: Table or view 'brinjal_deleteseg' not found in database '1_6_1';* **Log - 2020-11-12 18:55:46,922 | INFO | [main] | Created broadcast 171 from broadCastHadoopConf at CarbonRDD.scala:58 | org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)2020-11-12 18:55:46,922 | INFO | [main] | Created broadcast 171 from broadCastHadoopConf at CarbonRDD.scala:58 | org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)2020-11-12 18:55:46,924 | INFO | [main] | Pushed Filters: | org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)2020-11-12 18:55:46,939 | INFO | [main] | Distributed Index server is enabled for 1_6_1.brinjal_update | org.apache.carbondata.core.util.CarbonProperties.isDistributedPruningEnabled(CarbonProperties.java:1742)2020-11-12 18:55:46,939 | INFO | [main] | Started block pruning ... 
| org.apache.carbondata.hadoop.api.CarbonInputFormat.getPrunedBlocklets(CarbonInputFormat.java:526)2020-11-12 18:55:46,940 | INFO | [main] | Distributed Index server is enabled for 1_6_1.brinjal_update | org.apache.carbondata.core.util.CarbonProperties.isDistributedPruningEnabled(CarbonProperties.java:1742)2020-11-12 18:55:46,945 | INFO | [main] | Successfully Created directory: hdfs://hacluster/tmp/indexservertmp/4b6353d4-65d7-4856-b3cd-b3bc11d15c55 | org.apache.carbondata.core.util.CarbonUtil.createTempFolderForIndexServer(CarbonUtil.java:3273)2020-11-12 18:55:46,945 | INFO | [main] | Temp folder path for Query ID: 4b6353d4-65d7-4856-b3cd-b3bc11d15c55 is org.apache.carbondata.core.datastore.filesystem.HDFSCarbonFile@b8f2e1bf | org.apache.carbondata.indexserver.DistributedIndexJob.execute(IndexJobs.scala:57)2020-11-12 18:55:46,946 | ERROR | [main] | Configured port for index server is not a valid number | org.apache.carbondata.core.util.CarbonProperties.getIndexServerPort(CarbonProperties.java:1779)java.lang.NumberFormatException: null at java.lang.Integer.parseInt(Integer.java:542) at java.lang.Integer.parseInt(Integer.java:615) at org.apache.carbondata.core.util.CarbonProperties.getIndexServerPort(CarbonProperties.java:1777) at org.apache.carbondata.indexserver.IndexServer$.serverPort$lzycompute(IndexServer.scala:88) at org.apache.carbondata.indexserver.IndexServer$.serverPort(IndexServer.scala:88) at org.apache.carbondata.indexserver.IndexServer$.getClient(IndexServer.scala:312) at org.apache.carbondata.indexserver.IndexServer$.getClient(IndexServer.scala:301) at org.apache.carbondata.indexserver.DistributedIndexJob$$anonfun$1.apply(IndexJobs.scala:83) at org.apache.carbondata.indexserver.DistributedIndexJob$$anonfun$1.apply(IndexJobs.scala:59) at org.apache.carbondata.spark.util.CarbonScalaUtil$.logTime(CarbonScalaUtil.scala:769) at org.apache.carbondata.indexserver.DistributedIndexJob.execute(IndexJobs.scala:58) at 
org.apache.carbondata.core.index.IndexUtil.executeIndexJob(IndexUtil.java:304) at org.apache.carbondata.hadoop.api.CarbonInputFormat.getDistributedSplit(CarbonInputFormat.java:431) at org.apache.carbondata.hadoop.api.CarbonInputFormat.getPrunedBlocklets(CarbonInputFormat.java:532) at org.apache.carbondata.hadoop.api.CarbonInputFormat.getDataBlocksOfSegment(CarbonInputFormat.java:477) at org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getSplits(CarbonTableInputFormat.java:356) at org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getSplits(CarbonTableInputFormat.java:204) at org.apache.carbondata.spark.rdd.CarbonScanRDD.internalGetPartitions(CarbonScanRDD.scala:159) at org.apache.carbondata.spark.rdd.CarbonRDD.getPartitions(CarbonRDD.scala:68) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:273) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:269) at scala.Option.getOrElse(Option.scala:121) at org.apache.spark.rdd.RDD.partitions(RDD.scala:269) at
[jira] [Created] (CARBONDATA-4048) Update fails after continuous update operations with error
Chetan Bhat created CARBONDATA-4048:
---
Summary: Update fails after continuous update operations with error
Key: CARBONDATA-4048
URL: https://issues.apache.org/jira/browse/CARBONDATA-4048
Project: CarbonData
Issue Type: Bug
Components: data-query
Affects Versions: 2.1.0
Environment: Spark 2.3.2
Reporter: Chetan Bhat

Create a table, load data, and perform continuous update operations on the table.

0: jdbc:hive2://10.20.255.171:23040> CREATE TABLE uniqdata (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) stored as carbondata TBLPROPERTIES ("TABLE_BLOCKSIZE"= "256 MB",'flat_folder'='true');
No rows selected (0.177 seconds)
0: jdbc:hive2://10.20.255.171:23040> LOAD DATA inpath 'hdfs://hacluster/chetan/2000_UniqData.csv' INTO table uniqdata options('DELIMITER'=',', 'FILEHEADER'='CUST_ID, CUST_NAME, ACTIVE_EMUI_VERSION, DOB, DOJ, BIGINT_COLUMN1, BIGINT_COLUMN2, DECIMAL_COLUMN1, DECIMAL_COLUMN2, Double_COLUMN1, Double_COLUMN2, INTEGER_COLUMN1','TIMESTAMPFORMAT'='yyyy-MM-dd HH:mm:ss','BAD_RECORDS_ACTION'='FORCE');
No rows selected (1.484 seconds)
0: jdbc:hive2://10.20.255.171:23040> update uniqdata set (bigint_column1)=(100) where bigint_column1<123372036856;
Updated Row Count: 2 (3.294 seconds)
0: jdbc:hive2://10.20.255.171:23040> update uniqdata set (bigint_column1)=(100) where bigint_column1>123372038852;
Updated Row Count: 1 (3.467 seconds)
0: jdbc:hive2://10.20.255.171:23040> update uniqdata set (bigint_column1)=(100) where bigint_column1<=123372036859;
Updated Row Count: 9 (3.349 seconds)
0: jdbc:hive2://10.20.255.171:23040> update uniqdata set (bigint_column1)=(100) where bigint_column1>=123372038846;
Updated Row Count: 8 (3.259 seconds)
0: jdbc:hive2://10.20.255.171:23040> update uniqdata set (bigint_column1)=(100) where bigint_column1 == 123372038845;
Updated Row Count: 1 (4.164 seconds)
0: jdbc:hive2://10.20.255.171:23040> update uniqdata set (bigint_column1)=(100) where bigint_column1 like '123%';
Updated Row Count: 2000 (3.695 seconds)
0: jdbc:hive2://10.20.255.171:23040> update uniqdata set (bigint_column1)=(100) where bigint_column1 between 123372038849 AND 123372038855;
Updated Row Count: 5 (3.228 seconds)
0: jdbc:hive2://10.20.255.171:23040> update uniqdata set (bigint_column1)=(100) where bigint_column1 = 123372038845 OR false;
Updated Row Count: 1 (3.548 seconds)
0: jdbc:hive2://10.20.255.171:23040> update uniqdata set (bigint_column1)=(100) where bigint_column1 = 123372038849 AND true;
Updated Row Count: 1 (3.321 seconds)
0: jdbc:hive2://10.20.255.171:23040> update uniqdata set (bigint_column1)=(100) where bigint_column1 not between (123372038849) AND (12337203885);
Updated Row Count: 4025 (3.718 seconds)
0: jdbc:hive2://10.20.255.171:23040> update uniqdata set (cust_name)=('deepti') where cust_name<'CUST_NAME_01990';
Updated Row Count: 5978 (4.109 seconds)
0: jdbc:hive2://10.20.255.171:23040> update uniqdata set (cust_name)=('deepti') where cust_name>'CUST_NAME_01990';
Updated Row Count: 6022 (3.643 seconds)
0: jdbc:hive2://10.20.255.171:23040> update uniqdata set (cust_name)=('deepti') where cust_name<='CUST_NAME_01990';
Updated Row Count: 5981 (3.713 seconds)
0: jdbc:hive2://10.20.255.171:23040> update uniqdata set (cust_name)=('deepti')
[jira] [Closed] (CARBONDATA-3838) Select filter query fails on SI columns of different SI tables.
[ https://issues.apache.org/jira/browse/CARBONDATA-3838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat closed CARBONDATA-3838. --- Fix Version/s: 2.1.0 Resolution: Fixed > Select filter query fails on SI columns of different SI tables. > --- > > Key: CARBONDATA-3838 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3838 > Project: CarbonData > Issue Type: Bug > Components: data-query >Affects Versions: 2.0.0 > Environment: Spark 2.3.2 >Reporter: Chetan Bhat >Priority: Major > Fix For: 2.1.0 > > > Select filter query fails on SI columns of different SI tables. > *Steps :-* > 0: jdbc:hive2://10.20.255.171:23040/default> create table brinjal (imei > string,AMSize string,channelsId string,ActiveCountry string, Activecity > string,gamePointId double,deviceInformationId double,productionDate > Timestamp,deliveryDate timestamp,deliverycharge double) stored as carbondata > TBLPROPERTIES('inverted_index'='imei,AMSize,channelsId,ActiveCountry,Activecity,productionDate,deliveryDate','sort_columns'='imei,AMSize,channelsId,ActiveCountry,Activecity,productionDate,deliveryDate','table_blocksize'='1','SORT_SCOPE'='GLOBAL_SORT','carbon.column.compressor'='zstd'); > +-+--+ > | Result | > +-+--+ > +-+--+ > No rows selected (0.153 seconds) > 0: jdbc:hive2://10.20.255.171:23040/default> LOAD DATA INPATH > 'hdfs://hacluster/chetan/vardhandaterestruct.csv' INTO TABLE brinjal > OPTIONS('DELIMITER'=',', 'QUOTECHAR'= > '"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= > 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge'); > +-+--+ > | Result | > +-+--+ > +-+--+ > No rows selected (2.357 seconds) > 0: jdbc:hive2://10.20.255.171:23040/default> CREATE INDEX indextable1 ON > TABLE brinjal (channelsId) AS 'carbondata' > PROPERTIES('carbon.column.compressor'='zstd'); > +-+--+ > | Result | > +-+--+ > +-+--+ > No rows selected (1.048 seconds) > 0: jdbc:hive2://10.20.255.171:23040/default> 
CREATE INDEX indextable2 ON > TABLE brinjal (ActiveCountry) AS 'carbondata' > PROPERTIES('carbon.column.compressor'='zstd'); > +-+--+ > | Result | > +-+--+ > +-+--+ > No rows selected (1.895 seconds) > 0: jdbc:hive2://10.20.255.171:23040/default> select * from brinjal where > ActiveCountry ='Chinese' or channelsId =4; > Error: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: > execute, tree: > Exchange hashpartitioning(positionReference#6440, 200) > +- *(6) HashAggregate(keys=[positionReference#6440], functions=[], > output=[positionReference#6440]) > +- Union > :- *(3) HashAggregate(keys=[positionReference#6440], functions=[], > output=[positionReference#6440]) > : +- Exchange hashpartitioning(positionReference#6440, 200) > : +- *(2) HashAggregate(keys=[positionReference#6440], functions=[], > output=[positionReference#6440]) > : +- *(2) Project [positionReference#6440] > : +- *(2) Filter (cast(channelsid#6439 as int) = 4) > : +- *(2) FileScan carbondata > 2_0.indextable1[positionReference#6440,channelsid#6439] PushedFilters: > [CastExpr((cast(channelsid#6439 as int) = 4))], ReadSchema: > struct > +- *(5) HashAggregate(keys=[positionReference#6442], functions=[], > output=[positionReference#6442]) > +- Exchange hashpartitioning(positionReference#6442, 200) > +- *(4) HashAggregate(keys=[positionReference#6442], functions=[], > output=[positionReference#6442]) > +- *(4) Project [positionReference#6442|#6442] > +- *(4) Filter (activecountry#6441 = Chinese) > +- *(4) FileScan carbondata > 2_0.indextable2[positionReference#6442,activecountry#6441] PushedFilters: > [EqualTo(activecountry,Chinese)], ReadSchema: > struct (state=,code=0) > > *Log -* > org.apache.carbondata.core.datastore.block.SegmentPropertiesAndSchemaHolder.addSegmentProperties(SegmentPropertiesAndSchemaHolder.java:117)org.apache.carbondata.core.datastore.block.SegmentPropertiesAndSchemaHolder.addSegmentProperties(SegmentPropertiesAndSchemaHolder.java:117)2020-06-01 > 12:19:28,058 | ERROR 
| [HiveServer2-Background-Pool: Thread-1150] | Error > executing query, currentState RUNNING, | > org.apache.spark.internal.Logging$class.logError(Logging.scala:91)org.apache.spark.sql.catalyst.errors.package$TreeNodeException: > execute, tree:Exchange hashpartitioning(positionReference#6440, 200)+- *(6) > HashAggregate(keys=[positionReference#6440], functions=[], > output=[positionReference#6440]) +- Union :- *(3) > HashAggregate(keys=[positionReference#6440], functions=[], > output=[positionReference#6440]) : +- Exchange > hashpartitioning(positionReference#6440, 200) : +- *(2) >
[jira] [Closed] (CARBONDATA-3971) Session level dynamic properties for repair(carbon.load.si.repair and carbon.si.repair.limit) are not updated in https://github.com/apache/carbondata/blob/master/docs
[ https://issues.apache.org/jira/browse/CARBONDATA-3971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat closed CARBONDATA-3971. --- Fix Version/s: 2.1.0 Resolution: Fixed Issue is fixed. > Session level dynamic properties for repair(carbon.load.si.repair and > carbon.si.repair.limit) are not updated in > https://github.com/apache/carbondata/blob/master/docs/configuration-parameters.md > -- > > Key: CARBONDATA-3971 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3971 > Project: CarbonData > Issue Type: Bug > Components: docs >Affects Versions: 2.1.0 >Reporter: Chetan Bhat >Priority: Minor > Fix For: 2.1.0 > > > Session level dynamic properties for repair(carbon.load.si.repair and > carbon.si.repair.limit) are not mentioned in github link - > https://github.com/apache/carbondata/blob/master/docs/configuration-parameters.md -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (CARBONDATA-4010) "Alter table set tblproperties should support long string columns" and bad record handling of long string data for string columns need to be updated in https://github
[ https://issues.apache.org/jira/browse/CARBONDATA-4010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat closed CARBONDATA-4010. --- Issue is fixed in Carbon 2.1 version. > "Alter table set tblproperties should support long string columns" and bad > record handling of long string data for string columns need to be updated in > https://github.com/apache/carbondata/blob/master/docs > - > > Key: CARBONDATA-4010 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4010 > Project: CarbonData > Issue Type: Bug > Components: docs >Affects Versions: 2.1.0 > Environment: https://github.com/apache/carbondata/blob/master/docs >Reporter: Chetan Bhat >Priority: Minor > Fix For: 2.1.0 > > Time Spent: 6.5h > Remaining Estimate: 0h > > "Alter table set tblproperties should support long string columns" and bad > record handling of long string data for string columns need to be updated in > https://github.com/apache/carbondata/blob/master/docs -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (CARBONDATA-3932) need to change discovery.uri and add hive.metastore.uri,hive.config.resources in https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md#presto-mul
[ https://issues.apache.org/jira/browse/CARBONDATA-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat closed CARBONDATA-3932. --- Issue is fixed in Carbon 2.1 version. > need to change discovery.uri and add > hive.metastore.uri,hive.config.resources in > https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md#presto-multinode-cluster-setup-for-carbondata > - > > Key: CARBONDATA-3932 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3932 > Project: CarbonData > Issue Type: Bug > Components: docs, presto-integration >Affects Versions: 2.0.1 > Environment: Documentation >Reporter: Chetan Bhat >Priority: Minor > Fix For: 2.1.0 > > Time Spent: 3h 20m > Remaining Estimate: 0h > > Need to change discovery.uri=:8086 to > discovery.uri=http://:8086 in > [https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md#presto-multinode-cluster-setup-for-carbondata] > Need to add these configurations as well in carbondata.properties and to be > updated in carbondata-presto opensource doc . > 1.hive.metastore.uri > 2.hive.config.resources > Ex : - > connector.name=carbondata > hive.metastore.uri=thrift://10.21.18.106:9083 > hive.config.resources=/opt/HA/C10/install/hadoop/datanode/etc/hadoop/core-site.xml,/opt/HA/C10/install/hadoop/datanode/etc/hadoop/hdfs-site.xml > -- This message was sent by Atlassian Jira (v8.3.4#803005)
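Pulling the pieces of CARBONDATA-3932 together: the two hive.* settings belong in the Presto CarbonData catalog file, while discovery.uri is a Presto node setting in config.properties. A sketch using the report's own example values; the coordinator host is a placeholder, and file locations follow the usual Presto etc/ layout (an assumption, not stated in the report):

```properties
# etc/catalog/carbondata.properties (metastore URI and XML paths are the report's examples)
connector.name=carbondata
hive.metastore.uri=thrift://10.21.18.106:9083
hive.config.resources=/opt/HA/C10/install/hadoop/datanode/etc/hadoop/core-site.xml,/opt/HA/C10/install/hadoop/datanode/etc/hadoop/hdfs-site.xml

# etc/config.properties (per the report, discovery.uri must include the http:// scheme)
discovery.uri=http://<coordinator-host>:8086
```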
[jira] [Closed] (CARBONDATA-3901) Documentation issues in https://github.com/apache/carbondata/tree/master/docs
[ https://issues.apache.org/jira/browse/CARBONDATA-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat closed CARBONDATA-3901. --- Issue is fixed in Carbon 2.1 version > Documentation issues in https://github.com/apache/carbondata/tree/master/docs > - > > Key: CARBONDATA-3901 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3901 > Project: CarbonData > Issue Type: Bug > Components: docs >Affects Versions: 2.0.1 > Environment: https://github.com/apache/carbondata/tree/master/docs >Reporter: Chetan Bhat >Priority: Minor > Fix For: 2.1.0 > > Time Spent: 2h 40m > Remaining Estimate: 0h > > *Issue 1 -* > [https://github.com/apache/carbondata/blob/master/docs/ddl-of-carbondata.mdSORT_SCOPE] > Sort scope of the load.Options include no sort, local sort ,batch sort and > global sort --> Batch sort to be removed as its not supported. > *Issue 2 -* > [https://github.com/apache/carbondata/blob/master/docs/streaming-guide.md#close-stream] > CLOSE STREAM link is not working. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (CARBONDATA-3824) Error when Secondary index tried to be created on table that does not exist is not correct.
[ https://issues.apache.org/jira/browse/CARBONDATA-3824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat closed CARBONDATA-3824. --- Issue is fixed in Carbon 2.1 version. > Error when Secondary index tried to be created on table that does not exist > is not correct. > --- > > Key: CARBONDATA-3824 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3824 > Project: CarbonData > Issue Type: Bug > Components: data-query >Affects Versions: 2.0.0 > Environment: Spark 2.3.2, Spark 2.4.5 >Reporter: Chetan Bhat >Priority: Minor > Fix For: 2.1.0 > > > *Issue :-* > Table uniqdata_double does not exist. > Secondary index tried to be created on table. Error message is incorrect. > CREATE INDEX indextable2 ON TABLE uniqdata_double (DOB) AS 'carbondata' > PROPERTIES('carbon.column.compressor'='zstd'); > *Error: java.lang.RuntimeException: Operation not allowed on non-carbon table > (state=,code=0)* > > *Expected :-* > *Error: java.lang.RuntimeException: Table does not exist* *(state=,code=0)*** -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CARBONDATA-3901) Documentation issues in https://github.com/apache/carbondata/tree/master/docs
[ https://issues.apache.org/jira/browse/CARBONDATA-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-3901: Description: *Issue 1 -* [https://github.com/apache/carbondata/blob/master/docs/ddl-of-carbondata.mdSORT_SCOPE] Sort scope of the load.Options include no sort, local sort ,batch sort and global sort --> Batch sort to be removed as its not supported. *Issue 2 -* [https://github.com/apache/carbondata/blob/master/docs/streaming-guide.md#close-stream] CLOSE STREAM link is not working. was: *Issue 1 :* [https://github.com/apache/carbondata/blob/master/docs/alluxio-guide.md] getOrCreateCarbonSession not used in Carbon 2.0 version and should be removed.Issue 1 : [https://github.com/apache/carbondata/blob/master/docs/alluxio-guide.md] getOrCreateCarbonSession not used in Carbon 2.0 version and should be removed. Testing use alluxio by CarbonSessionimport org.apache.spark.sql.CarbonSession._import org.apache.spark.sql.SparkSession val carbon = SparkSession.builder().master("local").appName("test").getOrCreateCarbonSession("alluxio://localhost:19998/carbondata");carbon.sql("CREATE TABLE carbon_alluxio(id String,name String, city String,age Int) STORED as carbondata");carbon.sql(s"LOAD DATA LOCAL INPATH '${CARBONDATA_PATH}/integration/spark/src/test/resources/sample.csv' into table carbon_alluxio");carbon.sql("select * from carbon_alluxio").show *Issue 2 -* [https://github.com/apache/carbondata/blob/master/docs/ddl-of-carbondata.mdSORT_SCOPE] Sort scope of the load.Options include no sort, local sort ,batch sort and global sort --> Batch sort to be removed as its not supported. *Issue 3 -* [https://github.com/apache/carbondata/blob/master/docs/streaming-guide.md#close-stream] CLOSE STREAM link is not working. *Issue 4 -* [https://github.com/apache/carbondata/blob/master/docs/index/bloomfilter-index-guide.md] Explain query does not hit the bloom. 
Hence the line "User can verify whether a query can leverage BloomFilter Index by executing {{EXPLAIN}} command, which will show the transformed logical plan, and thus user can check whether the BloomFilter Index can skip blocklets during the scan." needs to be removed. > Documentation issues in https://github.com/apache/carbondata/tree/master/docs > - > > Key: CARBONDATA-3901 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3901 > Project: CarbonData > Issue Type: Bug > Components: docs >Affects Versions: 2.0.1 > Environment: https://github.com/apache/carbondata/tree/master/docs >Reporter: Chetan Bhat >Priority: Minor > Fix For: 2.1.0 > > Time Spent: 2h 40m > Remaining Estimate: 0h > > *Issue 1 -* > [https://github.com/apache/carbondata/blob/master/docs/ddl-of-carbondata.mdSORT_SCOPE] > Sort scope of the load.Options include no sort, local sort ,batch sort and > global sort --> Batch sort to be removed as its not supported. > *Issue 2 -* > [https://github.com/apache/carbondata/blob/master/docs/streaming-guide.md#close-stream] > CLOSE STREAM link is not working. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CARBONDATA-3901) Documentation issues in https://github.com/apache/carbondata/tree/master/docs
[ https://issues.apache.org/jira/browse/CARBONDATA-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-3901: Description: *Issue 1 :* [https://github.com/apache/carbondata/blob/master/docs/alluxio-guide.md] getOrCreateCarbonSession not used in Carbon 2.0 version and should be removed.Issue 1 : [https://github.com/apache/carbondata/blob/master/docs/alluxio-guide.md] getOrCreateCarbonSession not used in Carbon 2.0 version and should be removed. Testing use alluxio by CarbonSessionimport org.apache.spark.sql.CarbonSession._import org.apache.spark.sql.SparkSession val carbon = SparkSession.builder().master("local").appName("test").getOrCreateCarbonSession("alluxio://localhost:19998/carbondata");carbon.sql("CREATE TABLE carbon_alluxio(id String,name String, city String,age Int) STORED as carbondata");carbon.sql(s"LOAD DATA LOCAL INPATH '${CARBONDATA_PATH}/integration/spark/src/test/resources/sample.csv' into table carbon_alluxio");carbon.sql("select * from carbon_alluxio").show *Issue 2 -* [https://github.com/apache/carbondata/blob/master/docs/ddl-of-carbondata.mdSORT_SCOPE] Sort scope of the load.Options include no sort, local sort ,batch sort and global sort --> Batch sort to be removed as its not supported. *Issue 3 -* [https://github.com/apache/carbondata/blob/master/docs/streaming-guide.md#close-stream] CLOSE STREAM link is not working. *Issue 4 -* [https://github.com/apache/carbondata/blob/master/docs/index/bloomfilter-index-guide.md] Explain query does not hit the bloom. Hence the line "User can verify whether a query can leverage BloomFilter Index by executing {{EXPLAIN}} command, which will show the transformed logical plan, and thus user can check whether the BloomFilter Index can skip blocklets during the scan." needs to be removed. 
was: *Issue 1 :* [https://github.com/apache/carbondata/blob/master/docs/alluxio-guide.md] getOrCreateCarbonSession not used in Carbon 2.0 version and should be removed.Issue 1 : [https://github.com/apache/carbondata/blob/master/docs/alluxio-guide.md] getOrCreateCarbonSession not used in Carbon 2.0 version and should be removed. Testing use alluxio by CarbonSessionimport org.apache.spark.sql.CarbonSession._import org.apache.spark.sql.SparkSession val carbon = SparkSession.builder().master("local").appName("test").getOrCreateCarbonSession("alluxio://localhost:19998/carbondata");carbon.sql("CREATE TABLE carbon_alluxio(id String,name String, city String,age Int) STORED as carbondata");carbon.sql(s"LOAD DATA LOCAL INPATH '${CARBONDATA_PATH}/integration/spark/src/test/resources/sample.csv' into table carbon_alluxio");carbon.sql("select * from carbon_alluxio").show *Issue 2 -* [https://github.com/apache/carbondata/blob/master/docs/ddl-of-carbondata.mdSORT_SCOPE] Sort scope of the load.Options include no sort, local sort ,batch sort and global sort --> Batch sort to be removed as its not supported. *Issue 3 -* [https://github.com/apache/carbondata/blob/master/docs/streaming-guide.md#close-stream] CLOSE STREAM link is not working. *Issue 4 -* [https://github.com/apache/carbondata/blob/master/docs/index/bloomfilter-index-guide.md] Explain query does not hit the MV. Hence the line "User can verify whether a query can leverage BloomFilter Index by executing {{EXPLAIN}} command, which will show the transformed logical plan, and thus user can check whether the BloomFilter Index can skip blocklets during the scan." needs to be removed. 
> Documentation issues in https://github.com/apache/carbondata/tree/master/docs > - > > Key: CARBONDATA-3901 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3901 > Project: CarbonData > Issue Type: Bug > Components: docs >Affects Versions: 2.0.1 > Environment: https://github.com/apache/carbondata/tree/master/docs >Reporter: Chetan Bhat >Priority: Minor > > *Issue 1 :* > [https://github.com/apache/carbondata/blob/master/docs/alluxio-guide.md] > getOrCreateCarbonSession not used in Carbon 2.0 version and should be > removed.Issue 1 : > [https://github.com/apache/carbondata/blob/master/docs/alluxio-guide.md] > getOrCreateCarbonSession not used in Carbon 2.0 version and should be removed. > Testing use alluxio by CarbonSessionimport > org.apache.spark.sql.CarbonSession._import org.apache.spark.sql.SparkSession > val carbon = > SparkSession.builder().master("local").appName("test").getOrCreateCarbonSession("alluxio://localhost:19998/carbondata");carbon.sql("CREATE > TABLE carbon_alluxio(id String,name String, city String,age Int) STORED as > carbondata");carbon.sql(s"LOAD DATA LOCAL INPATH > '${CARBONDATA_PATH}/integration/spark/src/test/resources/sample.csv' into > table
[jira] [Closed] (CARBONDATA-3825) Refresh table in carbonsession using carbonextension fails for a table created in sparkfile format
[ https://issues.apache.org/jira/browse/CARBONDATA-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat closed CARBONDATA-3825. --- Resolution: Invalid Issue is analyzed as invalid and closed. > Refresh table in carbonsession using carbonextension fails for a table > created in sparkfile format > --- > > Key: CARBONDATA-3825 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3825 > Project: CarbonData > Issue Type: Bug > Components: data-load >Affects Versions: 2.0.0 > Environment: Spark 2.3.2, 2.4.5 >Reporter: Chetan Bhat >Priority: Major > > In 1.6.1 or 2.0 version create a table in a db in spark file format and > insert records in the table. > Take a backup of the table store, drop database > In carbonsession using carbonextension create a database with same name as > the db in sparkfileformat and copy table store of sparkfileformat to db path > in hdfs. > Execute the refresh table command. > Refresh table fails with error "Table or view not found in database" -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (CARBONDATA-3795) Create external carbon table fails if the schema is not provided
[ https://issues.apache.org/jira/browse/CARBONDATA-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat closed CARBONDATA-3795. --- Fix Version/s: 2.0.1 Resolution: Fixed Issue fixed in 2.0.1 > Create external carbon table fails if the schema is not provided > > > Key: CARBONDATA-3795 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3795 > Project: CarbonData > Issue Type: Bug > Components: data-query >Affects Versions: 2.0.0 > Environment: Spark 2.4.5 compatible carbon jars >Reporter: Chetan Bhat >Priority: Major > Fix For: 2.0.1 > > > Create external carbon table fails if the schema is not provided. > Example command - > create external table test1 stored as carbondata location > '/user/sparkhive/warehouse/1_6_1.db/brinjal/'; > *Error: org.apache.spark.sql.AnalysisException: Unable to infer the schema. > The schema specification is required to create the table `1_6_1`.`test1`.; > (state=,code=0)* > > *Logs -* > 2020-05-05 22:57:25,638 | ERROR | [HiveServer2-Background-Pool: Thread-371] | > Error executing query, currentState RUNNING, | > org.apache.spark.internal.Logging$class.logError(Logging.scala:91) > org.apache.spark.sql.AnalysisException: Unable to infer the schema. 
The > schema specification is required to create the table `1_6_1`.`test1`.; > at > org.apache.spark.sql.hive.ResolveHiveSerdeTable$$anonfun$apply$1.applyOrElse(HiveStrategies.scala:104) > at > org.apache.spark.sql.hive.ResolveHiveSerdeTable$$anonfun$apply$1.applyOrElse(HiveStrategies.scala:90) > at > org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsDown$1$$anonfun$2.apply(AnalysisHelper.scala:108) > at > org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsDown$1$$anonfun$2.apply(AnalysisHelper.scala:108) > at > org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:69) > at > org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsDown$1.apply(AnalysisHelper.scala:107) > at > org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsDown$1.apply(AnalysisHelper.scala:106) > at > org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:194) > at > org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$class.resolveOperatorsDown(AnalysisHelper.scala:106) > at > org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsDown(LogicalPlan.scala:29) > at > org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$class.resolveOperators(AnalysisHelper.scala:73) > at > org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperators(LogicalPlan.scala:29) > at > org.apache.spark.sql.hive.ResolveHiveSerdeTable.apply(HiveStrategies.scala:90) > at > org.apache.spark.sql.hive.ResolveHiveSerdeTable.apply(HiveStrategies.scala:44) > at > org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1.apply(RuleExecutor.scala:87) > at > org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1.apply(RuleExecutor.scala:84) > at > scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:124) > 
at scala.collection.immutable.List.foldLeft(List.scala:84) > at > org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1.apply(RuleExecutor.scala:84) > at > org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1.apply(RuleExecutor.scala:76) > at scala.collection.immutable.List.foreach(List.scala:392) > at > org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:76) > at > org.apache.spark.sql.catalyst.analysis.Analyzer.org$apache$spark$sql$catalyst$analysis$Analyzer$$executeSameContext(Analyzer.scala:127) > at > org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:121) > at > org.apache.spark.sql.catalyst.analysis.Analyzer$$anonfun$executeAndCheck$1.apply(Analyzer.scala:106) > at > org.apache.spark.sql.catalyst.analysis.Analyzer$$anonfun$executeAndCheck$1.apply(Analyzer.scala:105) > at > org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:201) > at > org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:105) > at > org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:58) > at > org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:56) > at > org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:48) > at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:78) > at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642) > at
[jira] [Closed] (CARBONDATA-3794) show metacache command takes significantly more time 1st time when compared to 2nd time.
[ https://issues.apache.org/jira/browse/CARBONDATA-3794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat closed CARBONDATA-3794. --- Fix Version/s: 2.1.0 Resolution: Fixed The issue is resolved in Carbon 2.1.0 version. 0: jdbc:hive2://10.20.251.163:23040/default> show metacache; +-+---+-+-+ | Identifier | Table Index size | CgAndFg Index size | Cache Location | +-+---+-+-+ +-+---+-+-+ No rows selected (7.059 seconds) 0: jdbc:hive2://10.20.251.163:23040/default> show metacache; +-+---+-+-+ | Identifier | Table Index size | CgAndFg Index size | Cache Location | +-+---+-+-+ +-+---+-+-+ No rows selected (6.52 seconds) > show metacache command takes significantly more time 1st time when compared > to 2nd time. > > > Key: CARBONDATA-3794 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3794 > Project: CarbonData > Issue Type: Bug > Components: data-query >Affects Versions: 2.0.0 > Environment: Spark 2.3.2 compatible carbon >Reporter: Chetan Bhat >Priority: Minor > Fix For: 2.1.0 > > > show metacache command takes significantly more time 1st time when compared > to 2nd time. > *1st time* > 0: jdbc:hive2://10.20.255.171:23040/show metacache; > > +--+++++-- > |Identifier|Index size|Datamap size|Cache Location| > +--+++++-- > |TOTAL|745 B|0 B|DRIVER| > |1_6_1.uniqdata_comp_nosort|745 B|0 B|DRIVER| > +--+++++-- > *2 rows selected (8.233 seconds)* > > *2nd time* > 0: jdbc:hive2://10.20.255.171:23040/default> show metacache; > > +--+++++-- > |Identifier|Index size|Datamap size|Cache Location| > +--+++++-- > |TOTAL|745 B|0 B|DRIVER| > |1_6_1.uniqdata_comp_nosort|745 B|0 B|DRIVER| > +--+++++-- > *2 rows selected (1.46 seconds)* > > *Sometimes the 1st time show metacache takes upto 25 seconds compared to 3-4 > seconds for 2nd time show metacache.* -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CARBONDATA-3794) show metacache command takes significantly more time 1st time when compared to 2nd time.
[ https://issues.apache.org/jira/browse/CARBONDATA-3794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-3794:
Description:
show metacache command takes significantly more time 1st time when compared to 2nd time.
*1st time*
0: jdbc:hive2://10.20.255.171:23040/show metacache;
| Identifier | Index size | Datamap size | Cache Location |
| TOTAL | 745 B | 0 B | DRIVER |
| 1_6_1.uniqdata_comp_nosort | 745 B | 0 B | DRIVER |
*2 rows selected (8.233 seconds)*
*2nd time*
0: jdbc:hive2://10.20.255.171:23040/default> show metacache;
| Identifier | Index size | Datamap size | Cache Location |
| TOTAL | 745 B | 0 B | DRIVER |
| 1_6_1.uniqdata_comp_nosort | 745 B | 0 B | DRIVER |
*2 rows selected (1.46 seconds)*
*Sometimes the 1st time show metacache takes upto 25 seconds compared to 3-4 seconds for 2nd time show metacache.*

was:
show metacache command takes significantly more time 1st time when compared to 2nd time.
*1st time*
0: jdbc:hive2://10.20.255.171:23040/show metcshow metacache;
| Identifier | Index size | Datamap size | Cache Location |
| TOTAL | 745 B | 0 B | DRIVER |
| 1_6_1.uniqdata_comp_nosort | 745 B | 0 B | DRIVER |
*2 rows selected (8.233 seconds)*
*2nd time*
0: jdbc:hive2://10.20.255.171:23040/default> show metacache;
| Identifier | Index size | Datamap size | Cache Location |
| TOTAL | 745 B | 0 B | DRIVER |
| 1_6_1.uniqdata_comp_nosort | 745 B | 0 B | DRIVER |
*2 rows selected (1.46 seconds)*
*Sometimes the 1st time show metacache takes upto 25 seconds compared to 3-4 seconds for 2nd time show metacache.*

> show metacache command takes significantly more time 1st time when compared
> to 2nd time.
>
> Key: CARBONDATA-3794
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3794
> Project: CarbonData
> Issue Type: Bug
> Components: data-query
> Affects Versions: 2.0.0
> Environment: Spark 2.3.2 compatible carbon
> Reporter: Chetan Bhat
> Priority: Minor
>
> show metacache command takes significantly more time 1st time when compared
> to 2nd time.
> *1st time*
> 0: jdbc:hive2://10.20.255.171:23040/show metacache;
> | Identifier | Index size | Datamap size | Cache Location |
> | TOTAL | 745 B | 0 B | DRIVER |
> | 1_6_1.uniqdata_comp_nosort | 745 B | 0 B | DRIVER |
> *2 rows selected (8.233 seconds)*
>
> *2nd time*
> 0: jdbc:hive2://10.20.255.171:23040/default> show metacache;
> | Identifier | Index size | Datamap size | Cache Location |
> | TOTAL | 745 B | 0 B | DRIVER |
> | 1_6_1.uniqdata_comp_nosort | 745 B | 0 B | DRIVER |
> *2 rows selected (1.46 seconds)*
>
> *Sometimes the 1st time show metacache takes upto 25 seconds compared to 3-4
> seconds for 2nd time show metacache.*
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (CARBONDATA-3950) Alter table drop column for non partition column throws error
[ https://issues.apache.org/jira/browse/CARBONDATA-3950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat closed CARBONDATA-3950. --- Fix Version/s: 2.1.0 Resolution: Fixed Issue is fixed in latest Carbon 2.1.0 build > Alter table drop column for non partition column throws error > - > > Key: CARBONDATA-3950 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3950 > Project: CarbonData > Issue Type: Bug > Components: data-query >Affects Versions: 2.0.1 > Environment: Spark 2.4.5 >Reporter: Chetan Bhat >Priority: Minor > Fix For: 2.1.0 > > > From spark-sql the queries are executed as mentioned below- > drop table if exists uniqdata_int; > CREATE TABLE uniqdata_int (CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB > timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 > bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 > decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double, INTEGER_COLUMN1 > int) Partitioned by (cust_id int) stored as carbondata TBLPROPERTIES > ("TABLE_BLOCKSIZE"= "256 MB"); > LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table > uniqdata_int partition(cust_id='1') OPTIONS ('FILEHEADER'='CUST_ID,CUST_NAME > ,ACTIVE_EMUI_VERSION,DOB,DOJ, > BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1, > Double_COLUMN2,INTEGER_COLUMN1','BAD_RECORDS_ACTION'='FORCE'); > show partitions uniqdata_int; > select * from uniqdata_int order by cust_id; > alter table uniqdata_int add columns(id int); > desc uniqdata_int; > *alter table uniqdata_int drop columns(CUST_NAME);* > desc uniqdata_int; > Issue : Alter table drop column for non partition column throws error even > though the operation is success. > org.apache.carbondata.spark.exception.ProcessMetaDataException: operation > failed for priyesh.uniqdata_int: Alterion failed: > org.apache.hadoop.hive.ql.metadata.HiveException: Unable to alter table. 
The > following columns have he existing columns in their respective positions : > col; > at > org.apache.spark.sql.execution.command.MetadataProcessOperation$class.throwMetadataException(package. > at > org.apache.spark.sql.execution.command.MetadataCommand.throwMetadataException(package.scala:120) > at > org.apache.spark.sql.execution.command.schema.CarbonAlterTableDropColumnCommand.processMetadata(Carboand.scala:201) > at > org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:123) > at > org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:123) > at > org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:104) > at > org.apache.spark.sql.execution.command.MetadataCommand.runWithAudit(package.scala:120) > at > org.apache.spark.sql.execution.command.MetadataCommand.run(package.scala:123) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:69) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:80) > at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:196) > at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:196) > at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3379) > at > org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:95 > at > org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:144) > at > org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:86) > at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3378) > at org.apache.spark.sql.Dataset.(Dataset.scala:196) > at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:79) > at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:651) > at 
org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:67) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:387) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:279) > at > org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at
[jira] [Closed] (CARBONDATA-3867) Show materialized views command not documented in https://github.com/apache/carbondata/blob/master/docs/mv-guide.md
[ https://issues.apache.org/jira/browse/CARBONDATA-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat closed CARBONDATA-3867. --- Fix Version/s: 2.1.0 Resolution: Fixed Updated in [https://github.com/apache/carbondata/blob/master/docs/mv-guide.md.] Defect closed. > Show materialized views command not documented in > https://github.com/apache/carbondata/blob/master/docs/mv-guide.md > --- > > Key: CARBONDATA-3867 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3867 > Project: CarbonData > Issue Type: Bug > Components: docs >Affects Versions: 2.0.0 > Environment: > https://github.com/apache/carbondata/blob/master/docs/mv-guide.md >Reporter: Chetan Bhat >Priority: Minor > Fix For: 2.1.0 > > > Show materialized views command not documented in > https://github.com/apache/carbondata/blob/master/docs/mv-guide.md -- This message was sent by Atlassian Jira (v8.3.4#803005)
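[Editor's note] The command whose documentation this issue tracks can be exercised as below. This is a minimal sketch only: it assumes a SparkSession (`spark`) already configured with CarbonData, the table and MV names are illustrative, and the `ON TABLE` variant should be verified against the running CarbonData version.

```scala
// Illustrative names only; assumes CarbonData is already wired into `spark`.
spark.sql("CREATE MATERIALIZED VIEW mv1 AS SELECT cust_id, count(*) FROM uniqdata GROUP BY cust_id")

// The command this issue asks to document in mv-guide.md:
spark.sql("SHOW MATERIALIZED VIEWS").show(truncate = false)

// Table-scoped variant, if supported by the installed version:
spark.sql("SHOW MATERIALIZED VIEWS ON TABLE uniqdata").show(truncate = false)
```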
[jira] [Closed] (CARBONDATA-3949) Select filter query fails from presto-cli on MV table
[ https://issues.apache.org/jira/browse/CARBONDATA-3949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat closed CARBONDATA-3949. --- Fix Version/s: 2.1.0 Resolution: Fixed Limitation Updated in docs - [https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md] Defect is closed. > Select filter query fails from presto-cli on MV table > - > > Key: CARBONDATA-3949 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3949 > Project: CarbonData > Issue Type: Bug > Components: presto-integration >Affects Versions: 2.0.1 > Environment: Spark 2.4.5. PrestoSQL 316 >Reporter: Chetan Bhat >Priority: Major > Fix For: 2.1.0 > > > From sparksql create table , load data and create MV > spark-sql> CREATE TABLE uniqdata(CUST_ID int,CUST_NAME > String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, > BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), > DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 > double,INTEGER_COLUMN1 int) STORED as carbondata > TBLPROPERTIES('local_dictionary_enable'='true','local_dictionary_threshold'='1000'); > Time taken: 0.753 seconds > spark-sql> LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into > table uniqdata OPTIONS('DELIMITER'=',', > 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1'); > OK > OK > Time taken: 1.992 seconds > spark-sql> CREATE MATERIALIZED VIEW mv1 as select cust_id, cust_name, > count(cust_id) from uniqdata group by cust_id, cust_name; > OK > Time taken: 4.336 seconds > > From presto cli select filter query on table with MV fails. 
> presto:chetan> select * from uniqdata where CUST_ID IS NULL or BIGINT_COLUMN1 > =1233720368578 or DECIMAL_COLUMN1 = 12345678901.123458 or Double_COLUMN1 > = 1.12345674897976E10 or INTEGER_COLUMN1 IS NULL ; > Query 20200804_092703_00253_ed34h failed: Unable to get file status: > *Log-* > 2020-08-04T18:09:55.975+0800 INFO Query-20200804_100955_00300_ed34h-2642 > stdout 2020-08-04 18:09:55 WARN AbstractDFSCarbonFile:458 - Exception > occurred: File > hdfs://hacluster/user/sparkhive/warehouse/chetan.db/uniqdata_string/Metadata > does not exist. > java.io.FileNotFoundException: File > hdfs://hacluster/user/sparkhive/warehouse/chetan.db/uniqdata_string/Metadata > does not exist. > at > org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:1058) > at > org.apache.hadoop.hdfs.DistributedFileSystem.access$1000(DistributedFileSystem.java:131) > at > org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1118) > at > org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1115) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:1125) > at > org.apache.hadoop.fs.FilterFileSystem.listStatus(FilterFileSystem.java:270) > at > org.apache.carbondata.core.datastore.filesystem.AbstractDFSCarbonFile.listFiles(AbstractDFSCarbonFile.java:456) > at > org.apache.carbondata.core.datastore.filesystem.AbstractDFSCarbonFile.listFiles(AbstractDFSCarbonFile.java:559) > at > org.apache.carbondata.core.util.path.CarbonTablePath.getActualSchemaFilePath(CarbonTablePath.java:189) > at > org.apache.carbondata.core.util.path.CarbonTablePath.getSchemaFilePath(CarbonTablePath.java:168) > at > org.apache.carbondata.presto.impl.CarbonTableReader.updateSchemaTables(CarbonTableReader.java:147) > at > 
org.apache.carbondata.presto.impl.CarbonTableReader.getCarbonCache(CarbonTableReader.java:128) > at > org.apache.carbondata.presto.CarbondataSplitManager.getSplits(CarbondataSplitManager.java:145) > at > io.prestosql.spi.connector.classloader.ClassLoaderSafeConnectorSplitManager.getSplits(ClassLoaderSafeConnectorSplitManager.java:50) > at io.prestosql.split.SplitManager.getSplits(SplitManager.java:85) > at > io.prestosql.sql.planner.DistributedExecutionPlanner$Visitor.visitScanAndFilter(DistributedExecutionPlanner.java:189) > at > io.prestosql.sql.planner.DistributedExecutionPlanner$Visitor.visitFilter(DistributedExecutionPlanner.java:257) > at > io.prestosql.sql.planner.DistributedExecutionPlanner$Visitor.visitFilter(DistributedExecutionPlanner.java:149) > at io.prestosql.sql.planner.plan.FilterNode.accept(FilterNode.java:72) > at > io.prestosql.sql.planner.DistributedExecutionPlanner.doPlan(DistributedExecutionPlanner.java:119) > at >
[jira] [Closed] (CARBONDATA-3806) Create bloom datamap fails with null pointer exception
[ https://issues.apache.org/jira/browse/CARBONDATA-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat closed CARBONDATA-3806. --- Fix Version/s: 2.1.0 Resolution: Fixed fixed in latest Carbon 2.1 B06 build > Create bloom datamap fails with null pointer exception > -- > > Key: CARBONDATA-3806 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3806 > Project: CarbonData > Issue Type: Bug > Components: data-query >Affects Versions: 1.6.1 > Environment: Spark 2.3.2 >Reporter: Chetan Bhat >Priority: Major > Fix For: 2.1.0 > > > Create bloom datamap fails with null pointer exception > create table brinjal_bloom (imei string,AMSize string,channelsId > string,ActiveCountry string, Activecity string,gamePointId > double,deviceInformationId double,productionDate Timestamp,deliveryDate > timestamp,deliverycharge double) STORED BY 'carbondata' > TBLPROPERTIES('table_blocksize'='1'); > LOAD DATA INPATH 'hdfs://hacluster/chetan/vardhandaterestruct.csv' INTO TABLE > brinjal_bloom OPTIONS('DELIMITER'=',', 'QUOTECHAR'= > '"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= > 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge'); > 0: jdbc:hive2://10.20.255.171:23040/default> CREATE DATAMAP dm_brinjal4 ON > TABLE brinjal_bloom USING 'bloomfilter' DMPROPERTIES ('INDEX_COLUMNS' = > 'AMSize', 'BLOOM_SIZE'='64', 'BLOOM_FPP'='0.1'); > Error: org.apache.spark.SparkException: Job aborted due to stage failure: > Task 0 in stage 210.0 failed 4 times, most recent failure: Lost task 0.3 in > stage 210.0 (TID 1477, vm2, executor 2): java.lang.NullPointerException > at > org.apache.carbondata.core.datamap.Segment.getCommittedIndexFile(Segment.java:150) > at > org.apache.carbondata.core.util.BlockletDataMapUtil.getTableBlockUniqueIdentifiers(BlockletDataMapUtil.java:198) > at > 
org.apache.carbondata.core.indexstore.blockletindex.BlockletDataMapFactory.getTableBlockIndexUniqueIdentifiers(BlockletDataMapFactory.java:176) > at > org.apache.carbondata.core.indexstore.blockletindex.BlockletDataMapFactory.getDataMaps(BlockletDataMapFactory.java:154) > at > org.apache.carbondata.core.indexstore.blockletindex.BlockletDataMapFactory.getSegmentProperties(BlockletDataMapFactory.java:425) > at > org.apache.carbondata.datamap.IndexDataMapRebuildRDD.internalCompute(IndexDataMapRebuildRDD.scala:359) > at org.apache.carbondata.spark.rdd.CarbonRDD.compute(CarbonRDD.scala:84) > at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324) > at org.apache.spark.rdd.RDD.iterator(RDD.scala:288) > at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) > at org.apache.spark.scheduler.Task.run(Task.scala:109) > at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Driver stacktrace: (state=,code=0) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (CARBONDATA-3797) Refresh materialized view command throws null pointer exception
[ https://issues.apache.org/jira/browse/CARBONDATA-3797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat closed CARBONDATA-3797. --- Fix Version/s: 2.1.0 Resolution: Fixed Issue fixed in Carbon 2.1.0 > Refresh materialized view command throws null pointer exception > --- > > Key: CARBONDATA-3797 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3797 > Project: CarbonData > Issue Type: Bug > Components: data-query >Affects Versions: 2.0.0 > Environment: Spark 2.3.2, Spark 2.4.5 >Reporter: Chetan Bhat >Priority: Major > Fix For: 2.1.0 > > > Refresh materialized view command throws null pointer exception > CREATE TABLE uniqdata_mv(CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION > string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 > bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 > decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 > int) STORED as carbondata > TBLPROPERTIES('local_dictionary_enable'='true','local_dictionary_threshold'='1000'); > LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table > uniqdata_mv OPTIONS('DELIMITER'=',', > 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1'); > CREATE MATERIALIZED VIEW mv1 as select cust_id, cust_name, count(cust_id) > from uniqdata_mv group by cust_id, cust_name; > refresh MATERIALIZED VIEW mv1; > Error: java.lang.NullPointerException (state=,code=0) > > *Exception-* > 2020-05-06 00:50:59,941 | ERROR | [HiveServer2-Background-Pool: Thread-1822] > | Error executing query, currentState RUNNING, | > org.apache.spark.internal.Logging$class.logError(Logging.scala:91) > java.lang.NullPointerException > at org.apache.carbondata.view.MVRefresher$.refresh(MVRefresher.scala:62) > at > 
org.apache.spark.sql.execution.command.view.CarbonRefreshMVCommand.processData(CarbonRefreshMVCommand.scala:52) > at > org.apache.spark.sql.execution.command.DataCommand$$anonfun$run$2.apply(package.scala:132) > at > org.apache.spark.sql.execution.command.DataCommand$$anonfun$run$2.apply(package.scala:132) > at > org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:104) > at > org.apache.spark.sql.execution.command.DataCommand.runWithAudit(package.scala:130) > at org.apache.spark.sql.execution.command.DataCommand.run(package.scala:132) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79) > at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194) > at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194) > at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3370) > at > org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:80) > at > org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:127) > at > org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:75) > at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3369) > at org.apache.spark.sql.Dataset.(Dataset.scala:194) > at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:79) > at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642) > at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694) > at > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:232) > at > 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:175) > at > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:171) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) > at > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:185) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at >
[jira] [Created] (CARBONDATA-4020) Drop bloom index for single index of table having multiple index drops all indexes
Chetan Bhat created CARBONDATA-4020: --- Summary: Drop bloom index for single index of table having multiple index drops all indexes
Key: CARBONDATA-4020
URL: https://issues.apache.org/jira/browse/CARBONDATA-4020
Project: CarbonData
Issue Type: Bug
Components: data-query
Affects Versions: 2.1.0
Environment: Spark 2.4.5
Reporter: Chetan Bhat

Create multiple bloom indexes on the table. Try to drop a single bloom index:

drop table if exists datamap_test_1;
CREATE TABLE datamap_test_1 (id int,name string,salary float,dob date) STORED as carbondata TBLPROPERTIES('SORT_COLUMNS'='id');
CREATE index dm_datamap_test_1_2 ON TABLE datamap_test_1(id) as 'bloomfilter' PROPERTIES ('BLOOM_SIZE'='64', 'BLOOM_FPP'='0.1', 'BLOOM_COMPRESS'='true');
CREATE index dm_datamap_test3 ON TABLE datamap_test_1 (name) as 'bloomfilter' PROPERTIES ('BLOOM_SIZE'='64', 'BLOOM_FPP'='0.1', 'BLOOM_COMPRESS'='true');
show indexes on table datamap_test_1;
drop index dm_datamap_test_1_2 on datamap_test_1;
show indexes on table datamap_test_1;

Issue : Drop bloom index for single index of table having multiple index drops all indexes

0: jdbc:hive2://linux-32:22550/> show indexes on table datamap_test_1;
| Name | Provider | Indexed Columns | Properties | Status | Sync In
| dm_datamap_test_1_2 | bloomfilter | id | 'INDEX_COLUMNS'='id','bloom_compress'='true','bloom_fpp'='0.1','blo
| dm_datamap_test3 | bloomfilter | name | 'INDEX_COLUMNS'='name','bloom_compress'='true','bloom_fpp'='0.1','b
2 rows selected (0.315 seconds)

0: jdbc:hive2://linux-32:22550/> drop index dm_datamap_test_1_2 on datamap_test_1;
| Result |
No rows selected (1.232 seconds)

0: jdbc:hive2://linux-32:22550/> show indexes on table datamap_test_1;
| Name | Provider | Indexed Columns | Properties | Status | Sync Info |
No rows selected (0.21 seconds)
0: jdbc:hive2://linux-32:22550/>
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CARBONDATA-3901) Documentation issues in https://github.com/apache/carbondata/tree/master/docs
[ https://issues.apache.org/jira/browse/CARBONDATA-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-3901:
Description:
*Issue 1 :* [https://github.com/apache/carbondata/blob/master/docs/alluxio-guide.md] getOrCreateCarbonSession is not used in the Carbon 2.0 version and should be removed.
Testing use alluxio by CarbonSession:
import org.apache.spark.sql.CarbonSession._
import org.apache.spark.sql.SparkSession
val carbon = SparkSession.builder().master("local").appName("test").getOrCreateCarbonSession("alluxio://localhost:19998/carbondata");
carbon.sql("CREATE TABLE carbon_alluxio(id String,name String, city String,age Int) STORED as carbondata");
carbon.sql(s"LOAD DATA LOCAL INPATH '${CARBONDATA_PATH}/integration/spark/src/test/resources/sample.csv' into table carbon_alluxio");
carbon.sql("select * from carbon_alluxio").show
*Issue 2 -* [https://github.com/apache/carbondata/blob/master/docs/ddl-of-carbondata.md SORT_SCOPE] Sort scope of the load. Options include no sort, local sort, batch sort and global sort --> Batch sort is to be removed as it is not supported.
*Issue 3 -* [https://github.com/apache/carbondata/blob/master/docs/streaming-guide.md#close-stream] The CLOSE STREAM link is not working.
*Issue 4 -* [https://github.com/apache/carbondata/blob/master/docs/index/bloomfilter-index-guide.md] The explain query does not hit the MV. Hence the line "User can verify whether a query can leverage BloomFilter Index by executing {{EXPLAIN}} command, which will show the transformed logical plan, and thus user can check whether the BloomFilter Index can skip blocklets during the scan." needs to be removed.
was: *Issue 1 :* https://github.com/apache/carbondata/blob/master/docs/alluxio-guide.md getOrCreateCarbonSession not used in Carbon 2.0 version and should be removed.Issue 1 : https://github.com/apache/carbondata/blob/master/docs/alluxio-guide.md getOrCreateCarbonSession not used in Carbon 2.0 version and should be removed. Testing use alluxio by CarbonSessionimport org.apache.spark.sql.CarbonSession._import org.apache.spark.sql.SparkSession val carbon = SparkSession.builder().master("local").appName("test").getOrCreateCarbonSession("alluxio://localhost:19998/carbondata");carbon.sql("CREATE TABLE carbon_alluxio(id String,name String, city String,age Int) STORED as carbondata");carbon.sql(s"LOAD DATA LOCAL INPATH '${CARBONDATA_PATH}/integration/spark/src/test/resources/sample.csv' into table carbon_alluxio");carbon.sql("select * from carbon_alluxio").show *Issue 2 -* https://github.com/apache/carbondata/blob/master/docs/ddl-of-carbondata.mdSORT_SCOPE Sort scope of the load.Options include no sort, local sort ,batch sort and global sort --> Batch sort to be removed as its not supported. *Issue 3 -* https://github.com/apache/carbondata/blob/master/docs/streaming-guide.md#close-stream CLOSE STREAM link is not working. > Documentation issues in https://github.com/apache/carbondata/tree/master/docs > - > > Key: CARBONDATA-3901 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3901 > Project: CarbonData > Issue Type: Bug > Components: docs >Affects Versions: 2.0.1 > Environment: https://github.com/apache/carbondata/tree/master/docs >Reporter: Chetan Bhat >Priority: Minor > > *Issue 1 :* > [https://github.com/apache/carbondata/blob/master/docs/alluxio-guide.md] > getOrCreateCarbonSession not used in Carbon 2.0 version and should be > removed.Issue 1 : > [https://github.com/apache/carbondata/blob/master/docs/alluxio-guide.md] > getOrCreateCarbonSession not used in Carbon 2.0 version and should be removed. 
> Testing use alluxio by CarbonSessionimport > org.apache.spark.sql.CarbonSession._import org.apache.spark.sql.SparkSession > val carbon = > SparkSession.builder().master("local").appName("test").getOrCreateCarbonSession("alluxio://localhost:19998/carbondata");carbon.sql("CREATE > TABLE carbon_alluxio(id String,name String, city String,age Int) STORED as > carbondata");carbon.sql(s"LOAD DATA LOCAL INPATH > '${CARBONDATA_PATH}/integration/spark/src/test/resources/sample.csv' into > table carbon_alluxio");carbon.sql("select * from carbon_alluxio").show > *Issue 2 -* > [https://github.com/apache/carbondata/blob/master/docs/ddl-of-carbondata.mdSORT_SCOPE] > Sort scope of the load.Options include no sort, local sort ,batch sort and > global sort --> Batch sort to be removed as its not supported. > *Issue 3 -* > [https://github.com/apache/carbondata/blob/master/docs/streaming-guide.md#close-stream] > CLOSE STREAM link is
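[Editor's note] For Issue 1 above, the Carbon 2.x replacement for the removed getOrCreateCarbonSession pattern is a plain SparkSession with the Carbon SQL extensions registered. A hedged sketch follows: the Alluxio URL mirrors the guide's example, and the table name and LOCATION clause are illustrative assumptions to be verified against the installed version.

```scala
import org.apache.spark.sql.SparkSession

// Sketch assuming Carbon 2.x: CarbonSession is gone; register the
// Carbon SQL extensions on a regular SparkSession instead.
val spark = SparkSession.builder()
  .master("local")
  .appName("test")
  .config("spark.sql.extensions", "org.apache.spark.sql.CarbonExtensions")
  .getOrCreate()

// Illustrative: place the table on Alluxio via LOCATION, mirroring
// the guide's alluxio://localhost:19998/carbondata example URL.
spark.sql(
  "CREATE TABLE carbon_alluxio(id STRING, name STRING, city STRING, age INT) " +
  "STORED AS carbondata LOCATION 'alluxio://localhost:19998/carbondata/carbon_alluxio'")
spark.sql("SELECT * FROM carbon_alluxio").show()
```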
[jira] [Closed] (CARBONDATA-4013) NullPointerException when use ArrowCarbonReader to read carbondata created using orc ,parquet and avro files in SDK
[ https://issues.apache.org/jira/browse/CARBONDATA-4013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat closed CARBONDATA-4013. --- Resolution: Invalid https://issues.apache.org/jira/browse/CARBONDATA-3365 *Stage2:* Deep integration with carbon vector; for this, currently carbon SDK vector doesn't support filling complex columns. As mentioned in this Jira the arrow reader SDK interfaces does not support complex type. > NullPointerException when use ArrowCarbonReader to read carbondata created > using orc ,parquet and avro files in SDK > --- > > Key: CARBONDATA-4013 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4013 > Project: CarbonData > Issue Type: Bug > Components: other >Affects Versions: 2.1.0 > Environment: Spark 2.4.5 compiled jars >Reporter: Chetan Bhat >Priority: Major > > when use ArrowCarbonReader to read carbondata created using orc files in SDK- > java.lang.NullPointerException > at > org.apache.carbondata.sdk.file.arrow.ArrowUtils.toArrowSchema(ArrowUtils.java:109) > at > org.apache.carbondata.sdk.file.arrow.ArrowConverter.(ArrowConverter.java:45) > at > org.apache.carbondata.sdk.file.ArrowCarbonReader.readArrowBatch(ArrowCarbonReader.java:54) > at > com.apache.spark.LoadFromFiles.testORCFileLoadWithComplexSchemaArrowReader(LoadFromFiles.java:1401) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at 
org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) > at org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at org.junit.runner.JUnitCore.run(JUnitCore.java:137) > at > com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68) > at > com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33) > at > com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:230) > at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:58) > > when use ArrowCarbonReader to read carbondata created using parquet and avro > files in SDK > java.lang.ClassCastException: java.lang.String cannot be cast to > [Ljava.lang.Object; > at > org.apache.carbondata.sdk.file.arrow.StructWriter.setValue(ArrowFieldWriter.java:374) > at > org.apache.carbondata.sdk.file.arrow.ArrowFieldWriter.write(ArrowFieldWriter.java:60) > at > org.apache.carbondata.sdk.file.arrow.StructWriter.setValue(ArrowFieldWriter.java:377) > at > org.apache.carbondata.sdk.file.arrow.ArrowFieldWriter.write(ArrowFieldWriter.java:60) > at > org.apache.carbondata.sdk.file.arrow.ArrowWriter.write(ArrowWriter.java:56) > at > 
org.apache.carbondata.sdk.file.arrow.ArrowConverter.addToArrowBuffer(ArrowConverter.java:63) > at > org.apache.carbondata.sdk.file.ArrowCarbonReader.readArrowBatch(ArrowCarbonReader.java:56) > at > com.apache.spark.LoadFromFiles.testParquetLoadAndCarbonArrowReader(LoadFromFiles.java:1775) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CARBONDATA-4013) NullPointerException when use ArrowCarbonReader to read carbondata created using orc ,parquet and avro files in SDK
[ https://issues.apache.org/jira/browse/CARBONDATA-4013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-4013: Summary: NullPointerException when use ArrowCarbonReader to read carbondata created using orc ,parquet and avro files in SDK (was: NullPointerException when use ArrowCarbonReader to read carbondata created using orc ,parquet and avro files) > NullPointerException when use ArrowCarbonReader to read carbondata created > using orc ,parquet and avro files in SDK > --- > > Key: CARBONDATA-4013 > URL: https://issues.apache.org/jira/browse/CARBONDATA-4013 > Project: CarbonData > Issue Type: Bug > Components: other >Affects Versions: 2.1.0 > Environment: Spark 2.4.5 compiled jars >Reporter: Chetan Bhat >Priority: Major > > when use ArrowCarbonReader to read carbondata created using orc files- > java.lang.NullPointerException > at > org.apache.carbondata.sdk.file.arrow.ArrowUtils.toArrowSchema(ArrowUtils.java:109) > at > org.apache.carbondata.sdk.file.arrow.ArrowConverter.(ArrowConverter.java:45) > at > org.apache.carbondata.sdk.file.ArrowCarbonReader.readArrowBatch(ArrowCarbonReader.java:54) > at > com.apache.spark.LoadFromFiles.testORCFileLoadWithComplexSchemaArrowReader(LoadFromFiles.java:1401) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) > at 
org.junit.rules.RunRules.evaluate(RunRules.java:20) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner.run(ParentRunner.java:363) > at org.junit.runner.JUnitCore.run(JUnitCore.java:137) > at > com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68) > at > com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33) > at > com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:230) > at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:58) > > when use ArrowCarbonReader to read carbondata created using parquet and avro > files > java.lang.ClassCastException: java.lang.String cannot be cast to > [Ljava.lang.Object; > at > org.apache.carbondata.sdk.file.arrow.StructWriter.setValue(ArrowFieldWriter.java:374) > at > org.apache.carbondata.sdk.file.arrow.ArrowFieldWriter.write(ArrowFieldWriter.java:60) > at > org.apache.carbondata.sdk.file.arrow.StructWriter.setValue(ArrowFieldWriter.java:377) > at > org.apache.carbondata.sdk.file.arrow.ArrowFieldWriter.write(ArrowFieldWriter.java:60) > at > org.apache.carbondata.sdk.file.arrow.ArrowWriter.write(ArrowWriter.java:56) > at > 
org.apache.carbondata.sdk.file.arrow.ArrowConverter.addToArrowBuffer(ArrowConverter.java:63) > at > org.apache.carbondata.sdk.file.ArrowCarbonReader.readArrowBatch(ArrowCarbonReader.java:56) > at > com.apache.spark.LoadFromFiles.testParquetLoadAndCarbonArrowReader(LoadFromFiles.java:1775) -- This message was sent by Atlassian Jira (v8.3.4#803005)
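[Editor's note] The ClassCastException in the parquet/avro case — "java.lang.String cannot be cast to [Ljava.lang.Object;" — is the JVM's notation for a failed cast of a String to Object[], i.e. StructWriter.setValue receives a scalar where it expects the array form of a struct row. A minimal, SDK-independent sketch of that failure mode (the class and method names below are illustrative only, not CarbonData code):

```java
// Reproduces the cast that fails inside StructWriter.setValue: a value whose
// runtime type is String is cast to Object[] (the struct-row shape the writer
// expects), which throws ClassCastException at runtime.
public class CastDemo {
    // Returns a short description: the exception name if the cast fails,
    // else "ok (<n> fields)".
    public static String castToObjectArray(Object value) {
        try {
            // Same cast that appears as [Ljava.lang.Object; in the stack trace.
            Object[] struct = (Object[]) value;
            return "ok (" + struct.length + " fields)";
        } catch (ClassCastException e) {
            return e.getClass().getSimpleName();
        }
    }
}
```

The cast compiles because the static type of the parameter is Object; the failure only surfaces at runtime, which is why the reader builds successfully and the error appears first in readArrowBatch.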
[jira] [Updated] (CARBONDATA-4013) NullPointerException when using ArrowCarbonReader to read carbondata created using orc, parquet and avro files in SDK
[ https://issues.apache.org/jira/browse/CARBONDATA-4013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-4013: Description: when use ArrowCarbonReader to read carbondata created using orc files in SDK- java.lang.NullPointerException at org.apache.carbondata.sdk.file.arrow.ArrowUtils.toArrowSchema(ArrowUtils.java:109) at org.apache.carbondata.sdk.file.arrow.ArrowConverter.(ArrowConverter.java:45) at org.apache.carbondata.sdk.file.ArrowCarbonReader.readArrowBatch(ArrowCarbonReader.java:54) at com.apache.spark.LoadFromFiles.testORCFileLoadWithComplexSchemaArrowReader(LoadFromFiles.java:1401) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.runners.ParentRunner.run(ParentRunner.java:363) at org.junit.runner.JUnitCore.run(JUnitCore.java:137) at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68) at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33) at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:230) at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:58) when use ArrowCarbonReader to read carbondata created using parquet and avro files in SDK java.lang.ClassCastException: java.lang.String cannot be cast to [Ljava.lang.Object; at org.apache.carbondata.sdk.file.arrow.StructWriter.setValue(ArrowFieldWriter.java:374) at org.apache.carbondata.sdk.file.arrow.ArrowFieldWriter.write(ArrowFieldWriter.java:60) at org.apache.carbondata.sdk.file.arrow.StructWriter.setValue(ArrowFieldWriter.java:377) at org.apache.carbondata.sdk.file.arrow.ArrowFieldWriter.write(ArrowFieldWriter.java:60) at org.apache.carbondata.sdk.file.arrow.ArrowWriter.write(ArrowWriter.java:56) at org.apache.carbondata.sdk.file.arrow.ArrowConverter.addToArrowBuffer(ArrowConverter.java:63) at org.apache.carbondata.sdk.file.ArrowCarbonReader.readArrowBatch(ArrowCarbonReader.java:56) at com.apache.spark.LoadFromFiles.testParquetLoadAndCarbonArrowReader(LoadFromFiles.java:1775) was: when use ArrowCarbonReader to read carbondata created using orc files- java.lang.NullPointerException at org.apache.carbondata.sdk.file.arrow.ArrowUtils.toArrowSchema(ArrowUtils.java:109) at org.apache.carbondata.sdk.file.arrow.ArrowConverter.(ArrowConverter.java:45) at org.apache.carbondata.sdk.file.ArrowCarbonReader.readArrowBatch(ArrowCarbonReader.java:54) at com.apache.spark.LoadFromFiles.testORCFileLoadWithComplexSchemaArrowReader(LoadFromFiles.java:1401) at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) at
[jira] [Created] (CARBONDATA-4013) NullPointerException when using ArrowCarbonReader to read carbondata created using orc, parquet and avro files
Chetan Bhat created CARBONDATA-4013: --- Summary: NullPointerException when use ArrowCarbonReader to read carbondata created using orc ,parquet and avro files Key: CARBONDATA-4013 URL: https://issues.apache.org/jira/browse/CARBONDATA-4013 Project: CarbonData Issue Type: Bug Components: other Affects Versions: 2.1.0 Environment: Spark 2.4.5 compiled jars Reporter: Chetan Bhat when use ArrowCarbonReader to read carbondata created using orc files- java.lang.NullPointerException at org.apache.carbondata.sdk.file.arrow.ArrowUtils.toArrowSchema(ArrowUtils.java:109) at org.apache.carbondata.sdk.file.arrow.ArrowConverter.(ArrowConverter.java:45) at org.apache.carbondata.sdk.file.ArrowCarbonReader.readArrowBatch(ArrowCarbonReader.java:54) at com.apache.spark.LoadFromFiles.testORCFileLoadWithComplexSchemaArrowReader(LoadFromFiles.java:1401) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at 
org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.runners.ParentRunner.run(ParentRunner.java:363) at org.junit.runner.JUnitCore.run(JUnitCore.java:137) at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68) at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33) at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:230) at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:58) when use ArrowCarbonReader to read carbondata created using parquet and avro files java.lang.ClassCastException: java.lang.String cannot be cast to [Ljava.lang.Object; at org.apache.carbondata.sdk.file.arrow.StructWriter.setValue(ArrowFieldWriter.java:374) at org.apache.carbondata.sdk.file.arrow.ArrowFieldWriter.write(ArrowFieldWriter.java:60) at org.apache.carbondata.sdk.file.arrow.StructWriter.setValue(ArrowFieldWriter.java:377) at org.apache.carbondata.sdk.file.arrow.ArrowFieldWriter.write(ArrowFieldWriter.java:60) at org.apache.carbondata.sdk.file.arrow.ArrowWriter.write(ArrowWriter.java:56) at org.apache.carbondata.sdk.file.arrow.ArrowConverter.addToArrowBuffer(ArrowConverter.java:63) at org.apache.carbondata.sdk.file.ArrowCarbonReader.readArrowBatch(ArrowCarbonReader.java:56) at com.apache.spark.LoadFromFiles.testParquetLoadAndCarbonArrowReader(LoadFromFiles.java:1775) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CARBONDATA-4010) "Alter table set tblproperties should support long string columns" and bad record handling of long string data for string columns need to be updated in https://githu
Chetan Bhat created CARBONDATA-4010: --- Summary: "Alter table set tblproperties should support long string columns" and bad record handling of long string data for string columns need to be updated in https://github.com/apache/carbondata/blob/master/docs Key: CARBONDATA-4010 URL: https://issues.apache.org/jira/browse/CARBONDATA-4010 Project: CarbonData Issue Type: Bug Components: docs Affects Versions: 2.1.0 Environment: https://github.com/apache/carbondata/blob/master/docs Reporter: Chetan Bhat "Alter table set tblproperties should support long string columns" and bad record handling of long string data for string columns need to be updated in https://github.com/apache/carbondata/blob/master/docs
[jira] [Created] (CARBONDATA-4007) ArrayIndexOutOfBoundsException when IUD operations are performed using SDK
Chetan Bhat created CARBONDATA-4007: --- Summary: ArrayIndexOutofBoundsException when IUD operations performed using SDK Key: CARBONDATA-4007 URL: https://issues.apache.org/jira/browse/CARBONDATA-4007 Project: CarbonData Issue Type: Bug Components: data-load Affects Versions: 2.1.0 Environment: Spark 2.4.5 jars used for compilation of SDK Reporter: Chetan Bhat Issue - ArrayIndexOutofBoundsException when IUD operations performed using SDK. Exception - java.lang.ArrayIndexOutOfBoundsException: 1 at org.apache.carbondata.hadoop.api.CarbonTableOutputFormat$1.close(CarbonTableOutputFormat.java:579) at org.apache.carbondata.sdk.file.CarbonIUD.delete(CarbonIUD.java:110) at org.apache.carbondata.sdk.file.CarbonIUD.deleteExecution(CarbonIUD.java:238) at org.apache.carbondata.sdk.file.CarbonIUD.closeDelete(CarbonIUD.java:123) at org.apache.carbondata.sdk.file.CarbonIUD.commit(CarbonIUD.java:221) at com.apache.spark.SdkIUD_Test.testDelete(SdkIUD_Test.java:130) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) at 
org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.runners.ParentRunner.run(ParentRunner.java:363) at org.junit.runner.JUnitCore.run(JUnitCore.java:137) at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68) at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33) at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:230) at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:58) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CARBONDATA-3987) Issues in SDK Pagination reader (2 issues)
Chetan Bhat created CARBONDATA-3987: --- Summary: Issues in SDK Pagination reader (2 issues) Key: CARBONDATA-3987 URL: https://issues.apache.org/jira/browse/CARBONDATA-3987 Project: CarbonData Issue Type: Bug Components: other Affects Versions: 2.1.0 Reporter: Chetan Bhat
Issue 1: data is written to a table and one more row is inserted; an error is thrown when trying to read the newly added row, whereas getTotalRows() is incremented by 1.
Test code-
/**
 * Carbon Files are written using CarbonWriter in outputpath
 *
 * Carbon Files are read using paginationCarbonReader object
 * Checking pagination with insert on large data with 8 split
 */
@Test
public void testSDKPaginationInsertData()
    throws IOException, InvalidLoadOptionException, InterruptedException {
  System.out.println("___" + name.getMethodName() + " TestCase Execution is started");
  // String outputPath1 = getOutputPath(outputDir, name.getMethodName() + "large");
  // long uid = 123456;
  // TimeZone.setDefault(TimeZone.getTimeZone("Asia/Shanghai"));
  // writeMultipleCarbonFiles("id int,name string,rank short,salary double,active boolean,dob date,doj timestamp,city string,dept string", getDatas(), outputPath1, uid, null, null);
  // System.out.println("Data is written");
  List data1 = new ArrayList();
  String[] row1 = {"1", "AAA", "3", "3444345.66", "true", "1979-12-09", "2011-2-10 1:00:20", "Pune", "IT"};
  String[] row2 = {"2", "BBB", "2", "543124.66", "false", "1987-2-19", "2017-1-1 12:00:20", "Bangalore", "DATA"};
  String[] row3 = {"3", "CCC", "1", "787878.888", "false", "1982-05-12", "2015-12-1 2:20:20", "Pune", "DATA"};
  String[] row4 = {"4", "DDD", "1", "9.24", "true", "1981-04-09", "2000-1-15 7:00:20", "Delhi", "MAINS"};
  String[] row5 = {"5", "EEE", "3", "545656.99", "true", "1987-12-09", "2017-11-25 04:00:20", "Delhi", "IT"};
  data1.add(row1);
  data1.add(row2);
  data1.add(row3);
  data1.add(row4);
  data1.add(row5);
  String outputPath1 = getOutputPath(outputDir, name.getMethodName() + "large");
  long uid = 123456;
  TimeZone.setDefault(TimeZone.getTimeZone("Asia/Shanghai"));
  writeMultipleCarbonFiles("id int,name string,rank short,salary double,active boolean,dob date,doj timestamp,city string,dept string", data1, outputPath1, uid, null, null);
  System.out.println("Data is written");
  String hdfsPath1 = moveFiles(outputPath1, outputPath1);
  String datapath1 = hdfsPath1.concat("/" + name.getMethodName() + "large");
  System.out.println("HDFS Data Path is: " + datapath1);
  runSQL("create table " + name.getMethodName() + "large" + " using carbon location '" + datapath1 + "'");
  System.out.println("Table " + name.getMethodName() + " is created Successfully");
  runSQL("select count(*) from " + name.getMethodName() + "large");
  long uid1 = 123;
  String outputPath = getOutputPath(outputDir, name.getMethodName());
  List data = new ArrayList();
  String[] row = {"222", "Daisy", "3", "334.456", "true", "1956-11-08", "2013-12-10 12:00:20", "Pune", "IT"};
  data.add(row);
  writeData("id int,name string,rank short,salary double,active boolean,dob date,doj timestamp,city string,dept string", data, outputPath, uid, null, null);
  String hdfsPath = moveFiles(outputPath, outputPath);
  String datapath = hdfsPath.concat("/" + name.getMethodName());
  runSQL("create table " + name.getMethodName() + " using carbon location '" + datapath + "'");
  runSQL("select count(*) from " + name.getMethodName());
  System.out.println("Insert--");
  runSQL("insert into table " + name.getMethodName() + " select * from " + name.getMethodName() + "large");
  System.out.println("Inserted");
  System.out.println("--After Insert--");
  System.out.println("Query 1");
  runSQL("select count(*) from " + name.getMethodName());
  // configure cache size = 4 blocklet
  CarbonProperties.getInstance()
      .addProperty(CarbonCommonConstants.CARBON_MAX_PAGINATION_LRU_CACHE_SIZE_IN_MB, "4");
  CarbonReaderBuilder carbonReaderBuilder = CarbonReader.builder(datapath, "_temp")
      .withPaginationSupport()
      .projection(new String[]{"id","name","rank","salary","active","dob","doj","city","dept"});
  PaginationCarbonReader paginationCarbonReader = (PaginationCarbonReader) carbonReaderBuilder.build();
  File[] dataFiles1 = new File(datapath).listFiles(new FilenameFilter() {
    @Override
    public boolean accept(File dir, String name) {
      return name.endsWith("carbondata");
    }
  });
  String version = CarbonSchemaReader.getVersionDetails(dataFiles1[0].getAbsolutePath());
  System.out.println("version " + version);
  System.out.println("Total no of rows is : " + paginationCarbonReader.getTotalRows());
  assertTrue(paginationCarbonReader.getTotalRows() == 6);
  Object[] rows = paginationCarbonReader.read(1, 6);
  // assertTrue(rows.length==5);
  for (Object rowss
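[Editor's note] The assertions in the test above imply that PaginationCarbonReader.read(from, to) takes a 1-based, inclusive range: 5 written rows plus 1 inserted row give getTotalRows() == 6, and read(1, 6) should return 6 rows. A self-contained sketch of that row accounting, assuming those 1-based inclusive semantics (it does not use the SDK, and the helper name is illustrative):

```java
// Row accounting for a 1-based, inclusive pagination range, as the test's
// assertions on getTotalRows() and read(1, 6) imply.
public class PaginationMath {
    // Expected length of the batch returned for read(from, to).
    public static int expectedBatchSize(int from, int to) {
        if (from < 1 || to < from) {
            throw new IllegalArgumentException("range must be 1-based and non-empty");
        }
        return to - from + 1;
    }
}
```

Under this reading, the issue is that the reader fails on the row added by the insert even though it is counted in getTotalRows(), i.e. the cache and the total disagree.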
[jira] [Updated] (CARBONDATA-3949) Select filter query fails from presto-cli on MV table
[ https://issues.apache.org/jira/browse/CARBONDATA-3949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-3949: Description:
From sparksql create table, load data and create MV
spark-sql> CREATE TABLE uniqdata(CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED as carbondata TBLPROPERTIES('local_dictionary_enable'='true','local_dictionary_threshold'='1000');
Time taken: 0.753 seconds
spark-sql> LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
OK
OK
Time taken: 1.992 seconds
spark-sql> CREATE MATERIALIZED VIEW mv1 as select cust_id, cust_name, count(cust_id) from uniqdata group by cust_id, cust_name;
OK
Time taken: 4.336 seconds
From presto cli select filter query on table with MV fails.
presto:chetan> select * from uniqdata where CUST_ID IS NULL or BIGINT_COLUMN1 =1233720368578 or DECIMAL_COLUMN1 = 12345678901.123458 or Double_COLUMN1 = 1.12345674897976E10 or INTEGER_COLUMN1 IS NULL ;
Query 20200804_092703_00253_ed34h failed: Unable to get file status:
*Log-*
2020-08-04T18:09:55.975+0800 INFO Query-20200804_100955_00300_ed34h-2642 stdout 2020-08-04 18:09:55 WARN AbstractDFSCarbonFile:458 - Exception occurred: File hdfs://hacluster/user/sparkhive/warehouse/chetan.db/uniqdata_string/Metadata does not exist.
java.io.FileNotFoundException: File hdfs://hacluster/user/sparkhive/warehouse/chetan.db/uniqdata_string/Metadata does not exist.
at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:1058) at org.apache.hadoop.hdfs.DistributedFileSystem.access$1000(DistributedFileSystem.java:131) at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1118) at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1115) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:1125) at org.apache.hadoop.fs.FilterFileSystem.listStatus(FilterFileSystem.java:270) at org.apache.carbondata.core.datastore.filesystem.AbstractDFSCarbonFile.listFiles(AbstractDFSCarbonFile.java:456) at org.apache.carbondata.core.datastore.filesystem.AbstractDFSCarbonFile.listFiles(AbstractDFSCarbonFile.java:559) at org.apache.carbondata.core.util.path.CarbonTablePath.getActualSchemaFilePath(CarbonTablePath.java:189) at org.apache.carbondata.core.util.path.CarbonTablePath.getSchemaFilePath(CarbonTablePath.java:168) at org.apache.carbondata.presto.impl.CarbonTableReader.updateSchemaTables(CarbonTableReader.java:147) at org.apache.carbondata.presto.impl.CarbonTableReader.getCarbonCache(CarbonTableReader.java:128) at org.apache.carbondata.presto.CarbondataSplitManager.getSplits(CarbondataSplitManager.java:145) at io.prestosql.spi.connector.classloader.ClassLoaderSafeConnectorSplitManager.getSplits(ClassLoaderSafeConnectorSplitManager.java:50) at io.prestosql.split.SplitManager.getSplits(SplitManager.java:85) at io.prestosql.sql.planner.DistributedExecutionPlanner$Visitor.visitScanAndFilter(DistributedExecutionPlanner.java:189) at io.prestosql.sql.planner.DistributedExecutionPlanner$Visitor.visitFilter(DistributedExecutionPlanner.java:257) at io.prestosql.sql.planner.DistributedExecutionPlanner$Visitor.visitFilter(DistributedExecutionPlanner.java:149) at io.prestosql.sql.planner.plan.FilterNode.accept(FilterNode.java:72) at 
io.prestosql.sql.planner.DistributedExecutionPlanner.doPlan(DistributedExecutionPlanner.java:119) at io.prestosql.sql.planner.DistributedExecutionPlanner.doPlan(DistributedExecutionPlanner.java:124) at io.prestosql.sql.planner.DistributedExecutionPlanner.plan(DistributedExecutionPlanner.java:96) at io.prestosql.execution.SqlQueryExecution.planDistribution(SqlQueryExecution.java:425) at io.prestosql.execution.SqlQueryExecution.start(SqlQueryExecution.java:321) at io.prestosql.$gen.Presto_31620200804_042858_1.run(Unknown Source) at io.prestosql.execution.SqlQueryManager.createQuery(SqlQueryManager.java:239) at io.prestosql.dispatcher.LocalDispatchQuery.lambda$startExecution$4(LocalDispatchQuery.java:105) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Expected : If the Carbon indexes are not
[jira] [Closed] (CARBONDATA-3845) Bucket table creation fails with exception for empty BUCKET_NUMBER and BUCKET_COLUMNS
[ https://issues.apache.org/jira/browse/CARBONDATA-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat closed CARBONDATA-3845. --- Issue fixed in Carbon 2.1 version. > Bucket table creation fails with exception for empty BUCKET_NUMBER and > BUCKET_COLUMNS > - > > Key: CARBONDATA-3845 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3845 > Project: CarbonData > Issue Type: Bug > Components: data-query >Affects Versions: 2.0.0 > Environment: Spark 2.3.2 >Reporter: Chetan Bhat >Priority: Minor > Fix For: 2.1.0 > > Time Spent: 4h > Remaining Estimate: 0h > > *Steps and Issue-* > 0: jdbc:hive2://10.20.251.163:23040/default> create table if not exists > all_data_types1(bool_1 boolean,bool_2 boolean,chinese string,Number > int,smallNumber smallint,BigNumber bigint,LargeDecimal double,smalldecimal > float,customdecimal decimal(38,15),words string,smallwords char(8),varwords > varchar(20),time timestamp,day date,emptyNumber int,emptysmallNumber > smallint,emptyBigNumber bigint,emptyLargeDecimal double,emptysmalldecimal > float,emptycustomdecimal decimal(38,38),emptywords string,emptysmallwords > char(8),emptyvarwords varchar(20)) stored as carbondata TBLPROPERTIES > (*'BUCKET_NUMBER'='', 'BUCKET_COLUMNS'=''*); > *Error: java.lang.NumberFormatException: For input string: "" > (state=,code=0)* > Same issue present if bucket_number is empty. 
> 0: jdbc:hive2://10.20.251.163:23040/default> create table if not exists > all_data_types1(bool_1 boolean,bool_2 boolean,chinese string,Number > int,smallNumber smallint,BigNumber bigint,LargeDecimal double,smalldecimal > float,customdecimal decimal(38,15),words string,smallwords char(8),varwords > varchar(20),time timestamp,day date,emptyNumber int,emptysmallNumber > smallint,emptyBigNumber bigint,emptyLargeDecimal double,emptysmalldecimal > float,emptycustomdecimal decimal(38,38),emptywords string,emptysmallwords > char(8),emptyvarwords varchar(20)) stored as carbondata TBLPROPERTIES > (*'BUCKET_NUMBER'=''*, 'BUCKET_COLUMNS'='test'); > *Error: java.lang.NumberFormatException: For input string: "" > (state=,code=0)* > *Log-* > 2020-06-05 01:52:32,633 | ERROR | [HiveServer2-Background-Pool: Thread-102] | > Error executing query, currentState RUNNING, | > org.apache.spark.internal.Logging$class.logError(Logging.scala:91)2020-06-05 > 01:52:32,633 | ERROR | [HiveServer2-Background-Pool: Thread-102] | Error > executing query, currentState RUNNING, | > org.apache.spark.internal.Logging$class.logError(Logging.scala:91)java.lang.NumberFormatException: > For input string: "" at > java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) > at java.lang.Integer.parseInt(Integer.java:592) at > java.lang.Integer.parseInt(Integer.java:615) at > scala.collection.immutable.StringLike$class.toInt(StringLike.scala:272) at > scala.collection.immutable.StringOps.toInt(StringOps.scala:29) at > org.apache.carbondata.spark.CarbonOption.bucketNumber$lzycompute(CarbonOption.scala:61) > at > org.apache.carbondata.spark.CarbonOption.bucketNumber(CarbonOption.scala:61) > at > org.apache.spark.sql.parser.CarbonSpark2SqlParser.getBucketFields(CarbonSpark2SqlParser.scala:765) > at > org.apache.spark.sql.parser.CarbonSparkSqlParserUtil$.buildTableInfoFromCatalogTable(CarbonSparkSqlParserUtil.scala:382) > at > 
org.apache.spark.sql.CarbonSource$.createTableInfo(CarbonSource.scala:235) at > org.apache.spark.sql.CarbonSource$.createTableMeta(CarbonSource.scala:382) at > org.apache.spark.sql.execution.command.table.CarbonCreateDataSourceTableCommand.processMetadata(CarbonCreateDataSourceTableCommand.scala:69) > at > org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:123) > at > org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:123) > at > org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:104) > at > org.apache.spark.sql.execution.command.MetadataCommand.runWithAudit(package.scala:120) > at > org.apache.spark.sql.execution.command.MetadataCommand.run(package.scala:123) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79) > at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190) at > org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190) at > org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3259) at >
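[Editor's note] The trace above bottoms out in Integer.parseInt via StringOps.toInt, i.e. the empty BUCKET_NUMBER string reaches parseInt unvalidated, and Integer.parseInt("") throws the raw NumberFormatException seen by the user. A self-contained sketch of the failure and of a guarded parse that produces an actionable message instead (the helper below is illustrative, not CarbonData code):

```java
// Integer.parseInt("") throws NumberFormatException: For input string: "".
// A guarded parse rejects the empty value up front with a clearer message.
public class BucketNumberParse {
    public static int parseBucketNumber(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            throw new IllegalArgumentException(
                "BUCKET_NUMBER must be a non-empty integer, got: '" + raw + "'");
        }
        return Integer.parseInt(raw.trim());
    }

    // Classifies the outcome for a given raw property value.
    // NumberFormatException is a subclass of IllegalArgumentException,
    // so non-numeric input is reported under its own name.
    public static String classify(String raw) {
        try {
            return "ok:" + parseBucketNumber(raw);
        } catch (NumberFormatException e) {
            return "NumberFormatException";
        } catch (IllegalArgumentException e) {
            return "IllegalArgumentException";
        }
    }
}
```

This mirrors the expectation in the bug: an empty 'BUCKET_NUMBER'='' should be rejected with a validation error rather than surfacing java.lang.NumberFormatException: For input string: "".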
[jira] [Created] (CARBONDATA-3971) Session level dynamic properties for repair (carbon.load.si.repair and carbon.si.repair.limit) are not updated in https://github.com/apache/carbondata/blob/master/doc
Chetan Bhat created CARBONDATA-3971: --- Summary: Session level dynamic properties for repair (carbon.load.si.repair and carbon.si.repair.limit) are not updated in https://github.com/apache/carbondata/blob/master/docs/configuration-parameters.md Key: CARBONDATA-3971 URL: https://issues.apache.org/jira/browse/CARBONDATA-3971 Project: CarbonData Issue Type: Bug Components: docs Affects Versions: 2.1.0 Reporter: Chetan Bhat Session level dynamic properties for repair (carbon.load.si.repair and carbon.si.repair.limit) are not mentioned in github link - https://github.com/apache/carbondata/blob/master/docs/configuration-parameters.md
[jira] [Updated] (CARBONDATA-3949) Select filter query fails from presto-cli on MV table
[ https://issues.apache.org/jira/browse/CARBONDATA-3949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-3949: Affects Version/s: (was: 2.0.0) 2.0.1 > Select filter query fails from presto-cli on MV table > - > > Key: CARBONDATA-3949 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3949 > Project: CarbonData > Issue Type: Bug > Components: presto-integration >Affects Versions: 2.0.1 > Environment: Spark 2.4.5. PrestoSQL 316 >Reporter: Chetan Bhat >Priority: Major > > From sparksql create table , load data and create MV > spark-sql> CREATE TABLE uniqdata(CUST_ID int,CUST_NAME > String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, > BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), > DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 > double,INTEGER_COLUMN1 int) STORED as carbondata > TBLPROPERTIES('local_dictionary_enable'='true','local_dictionary_threshold'='1000'); > Time taken: 0.753 seconds > spark-sql> LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into > table uniqdata OPTIONS('DELIMITER'=',', > 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1'); > OK > OK > Time taken: 1.992 seconds > spark-sql> CREATE MATERIALIZED VIEW mv1 as select cust_id, cust_name, > count(cust_id) from uniqdata group by cust_id, cust_name; > OK > Time taken: 4.336 seconds > > From presto cli select filter query on table with MV fails. 
> presto:chetan> select * from uniqdata where CUST_ID IS NULL or BIGINT_COLUMN1 > =1233720368578 or DECIMAL_COLUMN1 = 12345678901.123458 or Double_COLUMN1 > = 1.12345674897976E10 or INTEGER_COLUMN1 IS NULL ; > Query 20200804_092703_00253_ed34h failed: Unable to get file status: > *Log-* > 2020-08-04T18:09:55.975+0800 INFO Query-20200804_100955_00300_ed34h-2642 > stdout 2020-08-04 18:09:55 WARN AbstractDFSCarbonFile:458 - Exception > occurred: File > hdfs://hacluster/user/sparkhive/warehouse/chetan.db/uniqdata_string/Metadata > does not exist. > java.io.FileNotFoundException: File > hdfs://hacluster/user/sparkhive/warehouse/chetan.db/uniqdata_string/Metadata > does not exist. > at > org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:1058) > at > org.apache.hadoop.hdfs.DistributedFileSystem.access$1000(DistributedFileSystem.java:131) > at > org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1118) > at > org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1115) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:1125) > at > org.apache.hadoop.fs.FilterFileSystem.listStatus(FilterFileSystem.java:270) > at > org.apache.carbondata.core.datastore.filesystem.AbstractDFSCarbonFile.listFiles(AbstractDFSCarbonFile.java:456) > at > org.apache.carbondata.core.datastore.filesystem.AbstractDFSCarbonFile.listFiles(AbstractDFSCarbonFile.java:559) > at > org.apache.carbondata.core.util.path.CarbonTablePath.getActualSchemaFilePath(CarbonTablePath.java:189) > at > org.apache.carbondata.core.util.path.CarbonTablePath.getSchemaFilePath(CarbonTablePath.java:168) > at > org.apache.carbondata.presto.impl.CarbonTableReader.updateSchemaTables(CarbonTableReader.java:147) > at > 
org.apache.carbondata.presto.impl.CarbonTableReader.getCarbonCache(CarbonTableReader.java:128) > at > org.apache.carbondata.presto.CarbondataSplitManager.getSplits(CarbondataSplitManager.java:145) > at > io.prestosql.spi.connector.classloader.ClassLoaderSafeConnectorSplitManager.getSplits(ClassLoaderSafeConnectorSplitManager.java:50) > at io.prestosql.split.SplitManager.getSplits(SplitManager.java:85) > at > io.prestosql.sql.planner.DistributedExecutionPlanner$Visitor.visitScanAndFilter(DistributedExecutionPlanner.java:189) > at > io.prestosql.sql.planner.DistributedExecutionPlanner$Visitor.visitFilter(DistributedExecutionPlanner.java:257) > at > io.prestosql.sql.planner.DistributedExecutionPlanner$Visitor.visitFilter(DistributedExecutionPlanner.java:149) > at io.prestosql.sql.planner.plan.FilterNode.accept(FilterNode.java:72) > at > io.prestosql.sql.planner.DistributedExecutionPlanner.doPlan(DistributedExecutionPlanner.java:119) > at > io.prestosql.sql.planner.DistributedExecutionPlanner.doPlan(DistributedExecutionPlanner.java:124) > at > io.prestosql.sql.planner.DistributedExecutionPlanner.plan(DistributedExecutionPlanner.java:96) > at >
[jira] [Updated] (CARBONDATA-3932) need to change discovery.uri and add hive.metastore.uri,hive.config.resources in https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md#presto-mu
[ https://issues.apache.org/jira/browse/CARBONDATA-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-3932: Affects Version/s: (was: 2.0.0) 2.0.1 > need to change discovery.uri and add > hive.metastore.uri,hive.config.resources in > https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md#presto-multinode-cluster-setup-for-carbondata > - > > Key: CARBONDATA-3932 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3932 > Project: CarbonData > Issue Type: Bug > Components: docs, presto-integration >Affects Versions: 2.0.1 > Environment: Documentation >Reporter: Chetan Bhat >Priority: Minor > > Need to change discovery.uri=:8086 to > discovery.uri=http://:8086 in > [https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md#presto-multinode-cluster-setup-for-carbondata] > Need to add these configurations as well in carbondata.properties and to be > updated in carbondata-presto opensource doc . > 1.hive.metastore.uri > 2.hive.config.resources > Ex : - > connector.name=carbondata > hive.metastore.uri=thrift://10.21.18.106:9083 > hive.config.resources=/opt/HA/C10/install/hadoop/datanode/etc/hadoop/core-site.xml,/opt/HA/C10/install/hadoop/datanode/etc/hadoop/hdfs-site.xml > -- This message was sent by Atlassian Jira (v8.3.4#803005)
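Assembled from the example quoted in the issue, the resulting Presto catalog file would look roughly like this; the metastore host and Hadoop config paths are the environment-specific values from the issue's own example, and the `etc/catalog/` location is the usual Presto convention rather than something the issue states:

```properties
# etc/catalog/carbondata.properties — sketch based on the issue's example;
# host and file paths are environment-specific.
connector.name=carbondata
hive.metastore.uri=thrift://10.21.18.106:9083
hive.config.resources=/opt/HA/C10/install/hadoop/datanode/etc/hadoop/core-site.xml,/opt/HA/C10/install/hadoop/datanode/etc/hadoop/hdfs-site.xml
```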
[jira] [Created] (CARBONDATA-3950) Alter table drop column for non partition column throws error
Chetan Bhat created CARBONDATA-3950: --- Summary: Alter table drop column for non partition column throws error Key: CARBONDATA-3950 URL: https://issues.apache.org/jira/browse/CARBONDATA-3950 Project: CarbonData Issue Type: Bug Components: data-query Affects Versions: 2.0.1 Environment: Spark 2.4.5 Reporter: Chetan Bhat From spark-sql the queries are executed as mentioned below: drop table if exists uniqdata_int; CREATE TABLE uniqdata_int (CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double, INTEGER_COLUMN1 int) Partitioned by (cust_id int) stored as carbondata TBLPROPERTIES ("TABLE_BLOCKSIZE"= "256 MB"); LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata_int partition(cust_id='1') OPTIONS ('FILEHEADER'='CUST_ID,CUST_NAME ,ACTIVE_EMUI_VERSION,DOB,DOJ, BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1, Double_COLUMN2,INTEGER_COLUMN1','BAD_RECORDS_ACTION'='FORCE'); show partitions uniqdata_int; select * from uniqdata_int order by cust_id; alter table uniqdata_int add columns(id int); desc uniqdata_int; *alter table uniqdata_int drop columns(CUST_NAME);* desc uniqdata_int; Issue: Alter table drop column for non partition column throws an error even though the operation succeeds. org.apache.carbondata.spark.exception.ProcessMetaDataException: operation failed for priyesh.uniqdata_int: Alterion failed: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to alter table. The following columns have he existing columns in their respective positions : col; at org.apache.spark.sql.execution.command.MetadataProcessOperation$class.throwMetadataException(package.
at org.apache.spark.sql.execution.command.MetadataCommand.throwMetadataException(package.scala:120) at org.apache.spark.sql.execution.command.schema.CarbonAlterTableDropColumnCommand.processMetadata(Carboand.scala:201) at org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:123) at org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:123) at org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:104) at org.apache.spark.sql.execution.command.MetadataCommand.runWithAudit(package.scala:120) at org.apache.spark.sql.execution.command.MetadataCommand.run(package.scala:123) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:69) at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:80) at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:196) at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:196) at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3379) at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:95 at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:144) at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:86) at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3378) at org.apache.spark.sql.Dataset.(Dataset.scala:196) at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:79) at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:651) at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694) at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:67) at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:387) at 
org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406) at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:279) at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52) at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:87 at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:164) at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:187) at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:89) at
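Distilled from the repro steps above, the statement that reports an error despite actually succeeding is the drop of a non-partition column from a partitioned table:

```sql
-- Sketch of the failing step from the issue: uniqdata_int is partitioned
-- by cust_id, and CUST_NAME is an ordinary (non-partition) column. Per the
-- issue, the drop completes (desc no longer shows CUST_NAME) but the
-- command still surfaces a ProcessMetaDataException.
ALTER TABLE uniqdata_int DROP COLUMNS(CUST_NAME);
```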
[jira] [Created] (CARBONDATA-3949) Select filter query fails from presto-cli on MV table
Chetan Bhat created CARBONDATA-3949: --- Summary: Select filter query fails from presto-cli on MV table Key: CARBONDATA-3949 URL: https://issues.apache.org/jira/browse/CARBONDATA-3949 Project: CarbonData Issue Type: Bug Components: presto-integration Affects Versions: 2.0.0 Environment: Spark 2.4.5. PrestoSQL 316 Reporter: Chetan Bhat >From sparksql create table , load data and create MV spark-sql> CREATE TABLE uniqdata(CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED as carbondata TBLPROPERTIES('local_dictionary_enable'='true','local_dictionary_threshold'='1000'); Time taken: 0.753 seconds spark-sql> LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1'); OK OK Time taken: 1.992 seconds spark-sql> CREATE MATERIALIZED VIEW mv1 as select cust_id, cust_name, count(cust_id) from uniqdata group by cust_id, cust_name; OK Time taken: 4.336 seconds >From presto cli select filter query on table with MV fails. presto:chetan> select * from uniqdata where CUST_ID IS NULL or BIGINT_COLUMN1 =1233720368578 or DECIMAL_COLUMN1 = 12345678901.123458 or Double_COLUMN1 = 1.12345674897976E10 or INTEGER_COLUMN1 IS NULL ; Query 20200804_092703_00253_ed34h failed: Unable to get file status: *Log-* 2020-08-04T18:09:55.975+0800 INFO Query-20200804_100955_00300_ed34h-2642 stdout 2020-08-04 18:09:55 WARN AbstractDFSCarbonFile:458 - Exception occurred: File hdfs://hacluster/user/sparkhive/warehouse/chetan.db/uniqdata_string/Metadata does not exist. 
java.io.FileNotFoundException: File hdfs://hacluster/user/sparkhive/warehouse/chetan.db/uniqdata_string/Metadata does not exist. at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:1058) at org.apache.hadoop.hdfs.DistributedFileSystem.access$1000(DistributedFileSystem.java:131) at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1118) at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1115) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:1125) at org.apache.hadoop.fs.FilterFileSystem.listStatus(FilterFileSystem.java:270) at org.apache.carbondata.core.datastore.filesystem.AbstractDFSCarbonFile.listFiles(AbstractDFSCarbonFile.java:456) at org.apache.carbondata.core.datastore.filesystem.AbstractDFSCarbonFile.listFiles(AbstractDFSCarbonFile.java:559) at org.apache.carbondata.core.util.path.CarbonTablePath.getActualSchemaFilePath(CarbonTablePath.java:189) at org.apache.carbondata.core.util.path.CarbonTablePath.getSchemaFilePath(CarbonTablePath.java:168) at org.apache.carbondata.presto.impl.CarbonTableReader.updateSchemaTables(CarbonTableReader.java:147) at org.apache.carbondata.presto.impl.CarbonTableReader.getCarbonCache(CarbonTableReader.java:128) at org.apache.carbondata.presto.CarbondataSplitManager.getSplits(CarbondataSplitManager.java:145) at io.prestosql.spi.connector.classloader.ClassLoaderSafeConnectorSplitManager.getSplits(ClassLoaderSafeConnectorSplitManager.java:50) at io.prestosql.split.SplitManager.getSplits(SplitManager.java:85) at io.prestosql.sql.planner.DistributedExecutionPlanner$Visitor.visitScanAndFilter(DistributedExecutionPlanner.java:189) at io.prestosql.sql.planner.DistributedExecutionPlanner$Visitor.visitFilter(DistributedExecutionPlanner.java:257) at 
io.prestosql.sql.planner.DistributedExecutionPlanner$Visitor.visitFilter(DistributedExecutionPlanner.java:149) at io.prestosql.sql.planner.plan.FilterNode.accept(FilterNode.java:72) at io.prestosql.sql.planner.DistributedExecutionPlanner.doPlan(DistributedExecutionPlanner.java:119) at io.prestosql.sql.planner.DistributedExecutionPlanner.doPlan(DistributedExecutionPlanner.java:124) at io.prestosql.sql.planner.DistributedExecutionPlanner.plan(DistributedExecutionPlanner.java:96) at io.prestosql.execution.SqlQueryExecution.planDistribution(SqlQueryExecution.java:425) at io.prestosql.execution.SqlQueryExecution.start(SqlQueryExecution.java:321) at io.prestosql.$gen.Presto_31620200804_042858_1.run(Unknown Source) at io.prestosql.execution.SqlQueryManager.createQuery(SqlQueryManager.java:239) at io.prestosql.dispatcher.LocalDispatchQuery.lambda$startExecution$4(LocalDispatchQuery.java:105) at
[jira] [Updated] (CARBONDATA-3932) need to change discovery.uri and add hive.metastore.uri,hive.config.resources in https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md#presto-mu
[ https://issues.apache.org/jira/browse/CARBONDATA-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-3932: Description: Need to change discovery.uri=:8086 to discovery.uri=http://:8086 in [https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md#presto-multinode-cluster-setup-for-carbondata] Need to add these configurations as well in carbondata.properties and to be updated in carbondata-presto opensource doc . 1.hive.metastore.uri 2.hive.config.resources Ex : - connector.name=carbondata hive.metastore.uri=thrift://10.21.18.106:9083 hive.config.resources=/opt/HA/C10/install/hadoop/datanode/etc/hadoop/core-site.xml,/opt/HA/C10/install/hadoop/datanode/etc/hadoop/hdfs-site.xml was:Need to change discovery.uri=:8086 to discovery.uri=http://:8086 in https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md#presto-multinode-cluster-setup-for-carbondata Summary: need to change discovery.uri and add hive.metastore.uri,hive.config.resources in https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md#presto-multinode-cluster-setup-for-carbondata (was: need to change discovery.uri=:8086 to discovery.uri=http://:8086 in https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md#presto-multinode-cluster-setup-for-carbondata) > need to change discovery.uri and add > hive.metastore.uri,hive.config.resources in > https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md#presto-multinode-cluster-setup-for-carbondata > - > > Key: CARBONDATA-3932 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3932 > Project: CarbonData > Issue Type: Bug > Components: docs, presto-integration >Affects Versions: 2.0.0 > Environment: Documentation >Reporter: Chetan Bhat >Priority: Minor > > Need to change discovery.uri=:8086 to > discovery.uri=http://:8086 in > 
[https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md#presto-multinode-cluster-setup-for-carbondata] > Need to add these configurations as well in carbondata.properties and to be > updated in carbondata-presto opensource doc . > 1.hive.metastore.uri > 2.hive.config.resources > Ex : - > connector.name=carbondata > hive.metastore.uri=thrift://10.21.18.106:9083 > hive.config.resources=/opt/HA/C10/install/hadoop/datanode/etc/hadoop/core-site.xml,/opt/HA/C10/install/hadoop/datanode/etc/hadoop/hdfs-site.xml > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (CARBONDATA-3932) need to change discovery.uri=:8086 to discovery.uri=http://:8086 in https://github.com/apache/carbondata/blob/master/docs/prestos
Chetan Bhat created CARBONDATA-3932: --- Summary: need to change discovery.uri=:8086 to discovery.uri=http://:8086 in https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md#presto-multinode-cluster-setup-for-carbondata Key: CARBONDATA-3932 URL: https://issues.apache.org/jira/browse/CARBONDATA-3932 Project: CarbonData Issue Type: Bug Components: docs, presto-integration Affects Versions: 2.0.0 Environment: Documentation Reporter: Chetan Bhat Need to change discovery.uri=:8086 to discovery.uri=http://:8086 in https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md#presto-multinode-cluster-setup-for-carbondata -- This message was sent by Atlassian Jira (v8.3.4#803005)
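The requested fix, shown as a config fragment for the coordinator's config.properties; `<coordinator-host>` is a placeholder for the coordinator's address:

```properties
# Presto coordinator config.properties — per the issue, discovery.uri must
# carry an explicit http:// scheme. <coordinator-host> is a placeholder.
discovery.uri=http://<coordinator-host>:8086
```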
[jira] [Created] (CARBONDATA-3909) Insert into select fails after insert decimal value as null and set sort scope to global sort
Chetan Bhat created CARBONDATA-3909: --- Summary: Insert into select fails after insert decimal value as null and set sort scope to global sort Key: CARBONDATA-3909 URL: https://issues.apache.org/jira/browse/CARBONDATA-3909 Project: CarbonData Issue Type: Bug Components: data-load Affects Versions: 2.0.1 Environment: Spark 2.3.2, 2.4.5 Reporter: Chetan Bhat Steps: insert a decimal value as null, set the sort scope to global sort, and then do an insert-into-select. Issue: Insert into select fails. Expected: Insert into select should succeed. -- This message was sent by Atlassian Jira (v8.3.4#803005)
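Since the issue describes the steps only in words, a hedged repro sketch follows; the table and column names are illustrative inventions, and setting SORT_SCOPE via ALTER TABLE SET TBLPROPERTIES is one of several ways CarbonData allows the sort scope to be changed:

```sql
-- Hedged repro sketch for the described steps (names illustrative).
CREATE TABLE dec_src (c1 int, d1 decimal(10,2)) STORED AS carbondata;
-- Insert a decimal value as null.
INSERT INTO dec_src SELECT 1, cast(null AS decimal(10,2));
-- Set sort scope to global sort (assumed mechanism: table properties).
ALTER TABLE dec_src SET TBLPROPERTIES ('SORT_COLUMNS'='c1', 'SORT_SCOPE'='GLOBAL_SORT');
CREATE TABLE dec_tgt (c1 int, d1 decimal(10,2)) STORED AS carbondata;
-- Per the issue, this insert-into-select fails.
INSERT INTO dec_tgt SELECT * FROM dec_src;
```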
[jira] [Created] (CARBONDATA-3901) Documentation issues in https://github.com/apache/carbondata/tree/master/docs
Chetan Bhat created CARBONDATA-3901: --- Summary: Documentation issues in https://github.com/apache/carbondata/tree/master/docs Key: CARBONDATA-3901 URL: https://issues.apache.org/jira/browse/CARBONDATA-3901 Project: CarbonData Issue Type: Bug Components: docs Affects Versions: 2.0.1 Environment: https://github.com/apache/carbondata/tree/master/docs Reporter: Chetan Bhat *Issue 1 :* https://github.com/apache/carbondata/blob/master/docs/alluxio-guide.md getOrCreateCarbonSession is not used in the Carbon 2.0 version and should be removed. Testing use of alluxio by CarbonSession: import org.apache.spark.sql.CarbonSession._; import org.apache.spark.sql.SparkSession; val carbon = SparkSession.builder().master("local").appName("test").getOrCreateCarbonSession("alluxio://localhost:19998/carbondata"); carbon.sql("CREATE TABLE carbon_alluxio(id String,name String, city String,age Int) STORED as carbondata"); carbon.sql(s"LOAD DATA LOCAL INPATH '${CARBONDATA_PATH}/integration/spark/src/test/resources/sample.csv' into table carbon_alluxio"); carbon.sql("select * from carbon_alluxio").show *Issue 2 -* https://github.com/apache/carbondata/blob/master/docs/ddl-of-carbondata.md SORT_SCOPE: Sort scope of the load. Options include no sort, local sort, batch sort and global sort --> Batch sort to be removed as it is not supported. *Issue 3 -* https://github.com/apache/carbondata/blob/master/docs/streaming-guide.md#close-stream CLOSE STREAM link is not working. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Closed] (CARBONDATA-3847) Dataload fails for table with data of 10 records having string type bucket column for if number of buckets exceed large no (300).
[ https://issues.apache.org/jira/browse/CARBONDATA-3847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat closed CARBONDATA-3847. --- Resolution: Cannot Reproduce Cant reproduce this more than once thereafter. Might be related to cluster configuration. Hence closing the issue. > Dataload fails for table with data of 10 records having string type bucket > column for if number of buckets exceed large no (300). > - > > Key: CARBONDATA-3847 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3847 > Project: CarbonData > Issue Type: Bug > Components: data-query >Affects Versions: 2.0.0 > Environment: Spark 2.3.2, Spark 2.4.5 >Reporter: Chetan Bhat >Priority: Minor > > *Steps -* > 0: jdbc:hive2://10.20.251.163:23040/default> create table if not exists > all_data_types1(bool_1 boolean,bool_2 boolean,chinese string,Number > int,smallNumber smallint,BigNumber bigint,LargeDecimal double,smalldecimal > float,customdecimal decimal(38,15),words string,smallwords char(8),varwords > varchar(20),time timestamp,day date,emptyNumber int,emptysmallNumber > smallint,emptyBigNumber bigint,emptyLargeDecimal double,emptysmalldecimal > float,emptycustomdecimal decimal(38,38),emptywords string,emptysmallwords > char(8),emptyvarwords varchar(20)) stored as carbondata TBLPROPERTIES > (*'BUCKET_NUMBER'='300'*, 'BUCKET_COLUMNS'='chinese'); > +-+--+ > | Result | > +-+--+ > +-+--+ > No rows selected (0.241 seconds) > 0: jdbc:hive2://10.20.251.163:23040/default> LOAD DATA INPATH > 'hdfs://hacluster/chetan/datafile_0.csv' into table all_data_types1 > OPTIONS('DELIMITER'=',' , > 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='bool_1 ,bool_2 > ,chinese ,Number ,smallNumber ,BigNumber ,LargeDecimal ,smalldecimal > ,customdecimal,words ,smallwords ,varwords ,time ,day ,emptyNumber > ,emptysmallNumber ,emptyBigNumber ,emptyLargeDecimal > ,emptysmalldecimal,emptycustomdecimal ,emptywords ,emptysmallwords > ,emptyvarwords'); > *Error: java.lang.Exception: 
DataLoad failure (state=,code=0)* > > *Log -* > java.lang.Exception: DataLoad failure > at > org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$.loadCarbonData(CarbonDataRDDFactory.scala:565) > at > org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.loadData(CarbonLoadDataCommand.scala:207) > at > org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.processData(CarbonLoadDataCommand.scala:168) > at > org.apache.spark.sql.execution.command.AtomicRunnableCommand$$anonfun$run$3.apply(package.scala:148) > at > org.apache.spark.sql.execution.command.AtomicRunnableCommand$$anonfun$run$3.apply(package.scala:145) > at > org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:104) > at > org.apache.spark.sql.execution.command.AtomicRunnableCommand.runWithAudit(package.scala:141) > at > org.apache.spark.sql.execution.command.AtomicRunnableCommand.run(package.scala:145) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:71) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:69) > at > org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:80) > at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:196) > at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:196) > at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3379) > at > org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:90) > at > org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:137) > at > org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:85) > at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3378) > at org.apache.spark.sql.Dataset.(Dataset.scala:196) > at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:79) > at 
org.apache.spark.sql.SparkSession.sql(SparkSession.scala:651) > at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694) > at > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:248) > at > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:178) > at > org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:174) > at java.security.AccessController.doPrivileged(Native Method) > at
[jira] [Updated] (CARBONDATA-3846) Dataload fails for boolean column configured as BUCKET_COLUMNS
[ https://issues.apache.org/jira/browse/CARBONDATA-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-3846: Description: *Steps-* 0: jdbc:hive2://10.20.255.171:23040/default> create table if not exists all_data_types1(*bool_1 boolean*,bool_2 boolean,chinese string,Number int,smallNumber smallint,BigNumber bigint,LargeDecimal double,smalldecimal float,customdecimal decimal(38,15),words string,smallwords char(8),varwords varchar(20),time timestamp,day date,emptyNumber int,emptysmallNumber smallint,emptyBigNumber bigint,emptyLargeDecimal double,emptysmalldecimal float,emptycustomdecimal decimal(38,38),emptywords string,emptysmallwords char(8),emptyvarwords varchar(20)) stored as carbondata TBLPROPERTIES ('BUCKET_NUMBER'='1', '*BUCKET_COLUMNS'='bool_1*'); +--+-+ |Result| +--+-+ +--+-+ No rows selected (0.939 seconds) 0: jdbc:hive2://10.20.255.171:23040/default> LOAD DATA INPATH 'hdfs://hacluster/chetan/datafile_0.csv' into table all_data_types1 OPTIONS('DELIMITER'=',' , 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='bool_1 ,bool_2 ,chinese ,Number ,smallNumber ,BigNumber ,LargeDecimal ,smalldecimal ,customdecimal,words ,smallwords ,varwords ,time ,day ,emptyNumber ,emptysmallNumber ,emptyBigNumber ,emptyLargeDecimal ,emptysmalldecimal,emptycustomdecimal ,emptywords ,emptysmallwords ,emptyvarwords'); *Error: java.lang.Exception: DataLoad failure: (state=,code=0)* *Log-* java.lang.Exception: DataLoad failure: at org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$.loadCarbonData(CarbonDataRDDFactory.scala:560) at org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.loadData(CarbonLoadDataCommand.scala:207) at org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.processData(CarbonLoadDataCommand.scala:168) at org.apache.spark.sql.execution.command.AtomicRunnableCommand$$anonfun$run$3.apply(package.scala:148) at 
org.apache.spark.sql.execution.command.AtomicRunnableCommand$$anonfun$run$3.apply(package.scala:145) at org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:104) at org.apache.spark.sql.execution.command.AtomicRunnableCommand.runWithAudit(package.scala:141) at org.apache.spark.sql.execution.command.AtomicRunnableCommand.run(package.scala:145) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68) at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79) at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190) at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190) at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3259) at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77) at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3258) at org.apache.spark.sql.Dataset.(Dataset.scala:190) at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75) at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642) at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:232) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:175) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:171) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:185) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) 2020-06-05 02:05:56,789 | ERROR | [HiveServer2-Background-Pool: Thread-138] | Error running hive query: | org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:179) org.apache.hive.service.cli.HiveSQLException: java.lang.Exception: DataLoad failure: at
[jira] [Created] (CARBONDATA-3867) Show materialized views command not documented in https://github.com/apache/carbondata/blob/master/docs/mv-guide.md
Chetan Bhat created CARBONDATA-3867: --- Summary: Show materialized views command not documented in https://github.com/apache/carbondata/blob/master/docs/mv-guide.md Key: CARBONDATA-3867 URL: https://issues.apache.org/jira/browse/CARBONDATA-3867 Project: CarbonData Issue Type: Bug Components: docs Affects Versions: 2.0.0 Environment: https://github.com/apache/carbondata/blob/master/docs/mv-guide.md Reporter: Chetan Bhat Show materialized views command not documented in https://github.com/apache/carbondata/blob/master/docs/mv-guide.md -- This message was sent by Atlassian Jira (v8.3.4#803005)
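The undocumented command, as a sketch; the plain form and the ON TABLE form are both understood to exist in CarbonData 2.x MV support, though exact syntax should be confirmed against the code, and the table name below is illustrative:

```sql
-- Sketch of the command the issue asks to have documented in mv-guide.md.
SHOW MATERIALIZED VIEWS;
SHOW MATERIALIZED VIEWS ON TABLE uniqdata;  -- table name illustrative
```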
[jira] [Created] (CARBONDATA-3853) Dataload fails for date column configured as BUCKET_COLUMNS
Chetan Bhat created CARBONDATA-3853: --- Summary: Dataload fails for date column configured as BUCKET_COLUMNS Key: CARBONDATA-3853 URL: https://issues.apache.org/jira/browse/CARBONDATA-3853 Project: CarbonData Issue Type: Bug Components: data-load Affects Versions: 2.0.0 Reporter: Chetan Bhat Steps and Issue 0: jdbc:hive2://10.20.255.171:23040/> create table if not exists all_data_types1(bool_1 boolean,bool_2 boolean,chinese string,Number int,smallNumber smallint,BigNumber bigint,LargeDecimal double,smalldecimal float,customdecimal decimal(38,15),words string,smallwords char(8),varwords varchar(20),time timestamp,day date,emptyNumber int,emptysmallNumber smallint,emptyBigNumber bigint,emptyLargeDecimal double,emptysmalldecimal float,emptycustomdecimal decimal(38,38),emptywords string,emptysmallwords char(8),emptyvarwords varchar(20)) stored as carbondata TBLPROPERTIES ('BUCKET_NUMBER'='2', *'BUCKET_COLUMNS'='day'*); +-+--+ | Result | +-+--+ +-+--+ No rows selected (0.494 seconds) 0: jdbc:hive2://10.20.255.171:23040/> LOAD DATA INPATH 'hdfs://hacluster/chetan/datafile_0.csv' into table all_data_types1 OPTIONS('DELIMITER'=',' , 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='bool_1 ,bool_2 ,chinese ,Number ,smallNumber ,BigNumber ,LargeDecimal ,smalldecimal ,customdecimal,words ,smallwords ,varwords ,time ,day ,emptyNumber ,emptysmallNumber ,emptyBigNumber ,emptyLargeDecimal ,emptysmalldecimal,emptycustomdecimal ,emptywords ,emptysmallwords ,emptyvarwords'); *Error: java.lang.Exception: DataLoad failure (state=,code=0)* *Log-* -- This message was sent by Atlassian Jira (v8.3.4#803005)
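Distilled from the steps above, the minimal failing configuration is a DATE column declared as the bucket column; the table and column names below are trimmed from the issue's much wider schema:

```sql
-- Minimal sketch of the configuration the issue reports as failing:
-- a date column used as BUCKET_COLUMNS (names illustrative).
CREATE TABLE bucket_by_day (id int, day date)
STORED AS carbondata
TBLPROPERTIES ('BUCKET_NUMBER'='2', 'BUCKET_COLUMNS'='day');
```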
[jira] [Updated] (CARBONDATA-3853) Dataload fails for date column configured as BUCKET_COLUMNS
[ https://issues.apache.org/jira/browse/CARBONDATA-3853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-3853: Description: Steps and Issue 0: jdbc:hive2://10.20.255.171:23040/> create table if not exists all_data_types1(bool_1 boolean,bool_2 boolean,chinese string,Number int,smallNumber smallint,BigNumber bigint,LargeDecimal double,smalldecimal float,customdecimal decimal(38,15),words string,smallwords char(8),varwords varchar(20),time timestamp,day date,emptyNumber int,emptysmallNumber smallint,emptyBigNumber bigint,emptyLargeDecimal double,emptysmalldecimal float,emptycustomdecimal decimal(38,38),emptywords string,emptysmallwords char(8),emptyvarwords varchar(20)) stored as carbondata TBLPROPERTIES ('BUCKET_NUMBER'='2', *'BUCKET_COLUMNS'='day'*); +--+-+ |Result| +--+-+ +--+-+ No rows selected (0.494 seconds) 0: jdbc:hive2://10.20.255.171:23040/> LOAD DATA INPATH 'hdfs://hacluster/chetan/datafile_0.csv' into table all_data_types1 OPTIONS('DELIMITER'=',' , 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='bool_1 ,bool_2 ,chinese ,Number ,smallNumber ,BigNumber ,LargeDecimal ,smalldecimal ,customdecimal,words ,smallwords ,varwords ,time ,day ,emptyNumber ,emptysmallNumber ,emptyBigNumber ,emptyLargeDecimal ,emptysmalldecimal,emptycustomdecimal ,emptywords ,emptysmallwords ,emptyvarwords'); *Error: java.lang.Exception: DataLoad failure (state=,code=0)* *Log-* java.lang.Exception: DataLoad failure at org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$.loadCarbonData(CarbonDataRDDFactory.scala:560) at org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.loadData(CarbonLoadDataCommand.scala:207) at org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.processData(CarbonLoadDataCommand.scala:168) at org.apache.spark.sql.execution.command.AtomicRunnableCommand$$anonfun$run$3.apply(package.scala:148) at 
org.apache.spark.sql.execution.command.AtomicRunnableCommand$$anonfun$run$3.apply(package.scala:145) at org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:104) at org.apache.spark.sql.execution.command.AtomicRunnableCommand.runWithAudit(package.scala:141) at org.apache.spark.sql.execution.command.AtomicRunnableCommand.run(package.scala:145) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68) at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79) at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190) at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190) at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3259) at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77) at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3258) at org.apache.spark.sql.Dataset.(Dataset.scala:190) at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75) at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642) at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:232) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:175) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:171) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:185) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) 2020-06-11 23:47:24,973 | ERROR | [HiveServer2-Background-Pool: Thread-104] | Error running hive query: | org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:179) org.apache.hive.service.cli.HiveSQLException: java.lang.Exception: DataLoad failure at
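The 23-column table in the CARBONDATA-3853 report makes the failing ingredient hard to see. A reduced reproduction (hypothetical and untested; the table name date_bucket_repro and the two-column FILEHEADER are invented for illustration, the data path is the one from the report) that isolates the date bucket column would look like:

```sql
-- Table creation succeeds; only the subsequent load fails per the report.
CREATE TABLE IF NOT EXISTS date_bucket_repro (day DATE, words STRING)
STORED AS carbondata
TBLPROPERTIES ('BUCKET_NUMBER'='2', 'BUCKET_COLUMNS'='day');

-- Expected to fail with "java.lang.Exception: DataLoad failure"
-- if the date bucket column alone triggers the bug.
LOAD DATA INPATH 'hdfs://hacluster/chetan/datafile_0.csv'
INTO TABLE date_bucket_repro
OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"',
        'BAD_RECORDS_ACTION'='FORCE', 'FILEHEADER'='day,words');
```

If this minimal form still fails, the bug lies in bucketing on DATE columns rather than in the wide schema.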
[jira] [Created] (CARBONDATA-3847) Dataload fails for a table with 10 records having a string type bucket column if the number of buckets is large (300).
Chetan Bhat created CARBONDATA-3847:
---
Summary: Dataload fails for a table with 10 records having a string type bucket column if the number of buckets is large (300)
Key: CARBONDATA-3847
URL: https://issues.apache.org/jira/browse/CARBONDATA-3847
Project: CarbonData
Issue Type: Bug
Components: data-query
Affects Versions: 2.0.0
Environment: Spark 2.3.2, Spark 2.4.5
Reporter: Chetan Bhat

*Steps -*
0: jdbc:hive2://10.20.251.163:23040/default> create table if not exists all_data_types1(bool_1 boolean,bool_2 boolean,chinese string,Number int,smallNumber smallint,BigNumber bigint,LargeDecimal double,smalldecimal float,customdecimal decimal(38,15),words string,smallwords char(8),varwords varchar(20),time timestamp,day date,emptyNumber int,emptysmallNumber smallint,emptyBigNumber bigint,emptyLargeDecimal double,emptysmalldecimal float,emptycustomdecimal decimal(38,38),emptywords string,emptysmallwords char(8),emptyvarwords varchar(20)) stored as carbondata TBLPROPERTIES (*'BUCKET_NUMBER'='300'*, 'BUCKET_COLUMNS'='chinese');
No rows selected (0.241 seconds)
0: jdbc:hive2://10.20.251.163:23040/default> LOAD DATA INPATH 'hdfs://hacluster/chetan/datafile_0.csv' into table all_data_types1 OPTIONS('DELIMITER'=',' , 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='bool_1 ,bool_2 ,chinese ,Number ,smallNumber ,BigNumber ,LargeDecimal ,smalldecimal ,customdecimal,words ,smallwords ,varwords ,time ,day ,emptyNumber ,emptysmallNumber ,emptyBigNumber ,emptyLargeDecimal ,emptysmalldecimal,emptycustomdecimal ,emptywords ,emptysmallwords ,emptyvarwords');
*Error: java.lang.Exception: DataLoad failure (state=,code=0)*
*Log -*
java.lang.Exception: DataLoad failure at org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$.loadCarbonData(CarbonDataRDDFactory.scala:565) at org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.loadData(CarbonLoadDataCommand.scala:207) at 
org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.processData(CarbonLoadDataCommand.scala:168) at org.apache.spark.sql.execution.command.AtomicRunnableCommand$$anonfun$run$3.apply(package.scala:148) at org.apache.spark.sql.execution.command.AtomicRunnableCommand$$anonfun$run$3.apply(package.scala:145) at org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:104) at org.apache.spark.sql.execution.command.AtomicRunnableCommand.runWithAudit(package.scala:141) at org.apache.spark.sql.execution.command.AtomicRunnableCommand.run(package.scala:145) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:71) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:69) at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:80) at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:196) at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:196) at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3379) at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:90) at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:137) at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:85) at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3378) at org.apache.spark.sql.Dataset.(Dataset.scala:196) at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:79) at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:651) at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:248) at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:178) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:174) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:188) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at
[jira] [Created] (CARBONDATA-3846) Dataload fails for boolean column configured as BUCKET_COLUMNS
Chetan Bhat created CARBONDATA-3846:
---
Summary: Dataload fails for boolean column configured as BUCKET_COLUMNS
Key: CARBONDATA-3846
URL: https://issues.apache.org/jira/browse/CARBONDATA-3846
Project: CarbonData
Issue Type: Bug
Components: data-query
Affects Versions: 2.0.0
Environment: Spark 2.3.2, Spark 2.4.5
Reporter: Chetan Bhat

*Steps-*
0: jdbc:hive2://10.20.255.171:23040/default> create table if not exists all_data_types1(*bool_1 boolean*,bool_2 boolean,chinese string,Number int,smallNumber smallint,BigNumber bigint,LargeDecimal double,smalldecimal float,customdecimal decimal(38,15),words string,smallwords char(8),varwords varchar(20),time timestamp,day date,emptyNumber int,emptysmallNumber smallint,emptyBigNumber bigint,emptyLargeDecimal double,emptysmalldecimal float,emptycustomdecimal decimal(38,38),emptywords string,emptysmallwords char(8),emptyvarwords varchar(20)) stored as carbondata TBLPROPERTIES ('BUCKET_NUMBER'='1', '*BUCKET_COLUMNS'='bool_1*');
No rows selected (0.939 seconds)
0: jdbc:hive2://10.20.255.171:23040/default> LOAD DATA INPATH 'hdfs://hacluster/chetan/datafile_0.csv' into table all_data_types1 OPTIONS('DELIMITER'=',' , 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='bool_1 ,bool_2 ,chinese ,Number ,smallNumber ,BigNumber ,LargeDecimal ,smalldecimal ,customdecimal,words ,smallwords ,varwords ,time ,day ,emptyNumber ,emptysmallNumber ,emptyBigNumber ,emptyLargeDecimal ,emptysmalldecimal,emptycustomdecimal ,emptywords ,emptysmallwords ,emptyvarwords');
*Error: java.lang.Exception: DataLoad failure: (state=,code=0)*
*Log-*
java.lang.Exception: DataLoad failure: at org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$.loadCarbonData(CarbonDataRDDFactory.scala:560) at org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.loadData(CarbonLoadDataCommand.scala:207) at org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.processData(CarbonLoadDataCommand.scala:168) at 
org.apache.spark.sql.execution.command.AtomicRunnableCommand$$anonfun$run$3.apply(package.scala:148) at org.apache.spark.sql.execution.command.AtomicRunnableCommand$$anonfun$run$3.apply(package.scala:145) at org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:104) at org.apache.spark.sql.execution.command.AtomicRunnableCommand.runWithAudit(package.scala:141) at org.apache.spark.sql.execution.command.AtomicRunnableCommand.run(package.scala:145) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68) at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79) at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190) at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190) at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3259) at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77) at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3258) at org.apache.spark.sql.Dataset.(Dataset.scala:190) at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75) at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642) at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:232) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:175) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:171) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:185) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) 2020-06-05 02:05:56,789 | ERROR | [HiveServer2-Background-Pool: Thread-138] | Error running hive query: | org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:179)
[jira] [Updated] (CARBONDATA-3845) Bucket table creation fails with exception for empty BUCKET_NUMBER and BUCKET_COLUMNS
[ https://issues.apache.org/jira/browse/CARBONDATA-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Bhat updated CARBONDATA-3845: Description: *Steps and Issue-* 0: jdbc:hive2://10.20.251.163:23040/default> create table if not exists all_data_types1(bool_1 boolean,bool_2 boolean,chinese string,Number int,smallNumber smallint,BigNumber bigint,LargeDecimal double,smalldecimal float,customdecimal decimal(38,15),words string,smallwords char(8),varwords varchar(20),time timestamp,day date,emptyNumber int,emptysmallNumber smallint,emptyBigNumber bigint,emptyLargeDecimal double,emptysmalldecimal float,emptycustomdecimal decimal(38,38),emptywords string,emptysmallwords char(8),emptyvarwords varchar(20)) stored as carbondata TBLPROPERTIES (*'BUCKET_NUMBER'='', 'BUCKET_COLUMNS'=''*); *Error: java.lang.NumberFormatException: For input string: "" (state=,code=0)* Same issue present if bucket_number is empty. 0: jdbc:hive2://10.20.251.163:23040/default> create table if not exists all_data_types1(bool_1 boolean,bool_2 boolean,chinese string,Number int,smallNumber smallint,BigNumber bigint,LargeDecimal double,smalldecimal float,customdecimal decimal(38,15),words string,smallwords char(8),varwords varchar(20),time timestamp,day date,emptyNumber int,emptysmallNumber smallint,emptyBigNumber bigint,emptyLargeDecimal double,emptysmalldecimal float,emptycustomdecimal decimal(38,38),emptywords string,emptysmallwords char(8),emptyvarwords varchar(20)) stored as carbondata TBLPROPERTIES (*'BUCKET_NUMBER'=''*, 'BUCKET_COLUMNS'='test'); *Error: java.lang.NumberFormatException: For input string: "" (state=,code=0)* *Log-* 2020-06-05 01:52:32,633 | ERROR | [HiveServer2-Background-Pool: Thread-102] | Error executing query, currentState RUNNING, | org.apache.spark.internal.Logging$class.logError(Logging.scala:91)2020-06-05 01:52:32,633 | ERROR | [HiveServer2-Background-Pool: Thread-102] | Error executing query, currentState RUNNING, | 
org.apache.spark.internal.Logging$class.logError(Logging.scala:91)java.lang.NumberFormatException: For input string: "" at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) at java.lang.Integer.parseInt(Integer.java:592) at java.lang.Integer.parseInt(Integer.java:615) at scala.collection.immutable.StringLike$class.toInt(StringLike.scala:272) at scala.collection.immutable.StringOps.toInt(StringOps.scala:29) at org.apache.carbondata.spark.CarbonOption.bucketNumber$lzycompute(CarbonOption.scala:61) at org.apache.carbondata.spark.CarbonOption.bucketNumber(CarbonOption.scala:61) at org.apache.spark.sql.parser.CarbonSpark2SqlParser.getBucketFields(CarbonSpark2SqlParser.scala:765) at org.apache.spark.sql.parser.CarbonSparkSqlParserUtil$.buildTableInfoFromCatalogTable(CarbonSparkSqlParserUtil.scala:382) at org.apache.spark.sql.CarbonSource$.createTableInfo(CarbonSource.scala:235) at org.apache.spark.sql.CarbonSource$.createTableMeta(CarbonSource.scala:382) at org.apache.spark.sql.execution.command.table.CarbonCreateDataSourceTableCommand.processMetadata(CarbonCreateDataSourceTableCommand.scala:69) at org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:123) at org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:123) at org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:104) at org.apache.spark.sql.execution.command.MetadataCommand.runWithAudit(package.scala:120) at org.apache.spark.sql.execution.command.MetadataCommand.run(package.scala:123) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68) at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79) at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190) at 
org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190) at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3259) at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77) at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3258) at org.apache.spark.sql.Dataset.(Dataset.scala:190) at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75) at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642) at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:232) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:175) at
[jira] [Created] (CARBONDATA-3845) Bucket table creation fails with exception for empty BUCKET_NUMBER and BUCKET_COLUMNS
Chetan Bhat created CARBONDATA-3845:
---
Summary: Bucket table creation fails with exception for empty BUCKET_NUMBER and BUCKET_COLUMNS
Key: CARBONDATA-3845
URL: https://issues.apache.org/jira/browse/CARBONDATA-3845
Project: CarbonData
Issue Type: Bug
Components: data-query
Affects Versions: 2.0.0
Environment: Spark 2.3.2
Reporter: Chetan Bhat

*Steps and Issue-*
0: jdbc:hive2://10.20.251.163:23040/default> create table if not exists all_data_types1(bool_1 boolean,bool_2 boolean,chinese string,Number int,smallNumber smallint,BigNumber bigint,LargeDecimal double,smalldecimal float,customdecimal decimal(38,15),words string,smallwords char(8),varwords varchar(20),time timestamp,day date,emptyNumber int,emptysmallNumber smallint,emptyBigNumber bigint,emptyLargeDecimal double,emptysmalldecimal float,emptycustomdecimal decimal(38,38),emptywords string,emptysmallwords char(8),emptyvarwords varchar(20)) stored as carbondata TBLPROPERTIES ('BUCKET_NUMBER'='', 'BUCKET_COLUMNS'='');
*Error: java.lang.NumberFormatException: For input string: "" (state=,code=0)*
*Log-*
2020-06-05 01:52:32,633 | ERROR | [HiveServer2-Background-Pool: Thread-102] | Error executing query, currentState RUNNING, | org.apache.spark.internal.Logging$class.logError(Logging.scala:91)
java.lang.NumberFormatException: For input string: "" at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) at java.lang.Integer.parseInt(Integer.java:592) at java.lang.Integer.parseInt(Integer.java:615) at scala.collection.immutable.StringLike$class.toInt(StringLike.scala:272) at scala.collection.immutable.StringOps.toInt(StringOps.scala:29) at org.apache.carbondata.spark.CarbonOption.bucketNumber$lzycompute(CarbonOption.scala:61) at org.apache.carbondata.spark.CarbonOption.bucketNumber(CarbonOption.scala:61) at 
org.apache.spark.sql.parser.CarbonSpark2SqlParser.getBucketFields(CarbonSpark2SqlParser.scala:765) at org.apache.spark.sql.parser.CarbonSparkSqlParserUtil$.buildTableInfoFromCatalogTable(CarbonSparkSqlParserUtil.scala:382) at org.apache.spark.sql.CarbonSource$.createTableInfo(CarbonSource.scala:235) at org.apache.spark.sql.CarbonSource$.createTableMeta(CarbonSource.scala:382) at org.apache.spark.sql.execution.command.table.CarbonCreateDataSourceTableCommand.processMetadata(CarbonCreateDataSourceTableCommand.scala:69) at org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:123) at org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:123) at org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:104) at org.apache.spark.sql.execution.command.MetadataCommand.runWithAudit(package.scala:120) at org.apache.spark.sql.execution.command.MetadataCommand.run(package.scala:123) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68) at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79) at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190) at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190) at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3259) at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77) at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3258) at org.apache.spark.sql.Dataset.(Dataset.scala:190) at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75) at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642) at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694) at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:232) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:175) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:171) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698) at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:185) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at
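The trace shows CarbonOption.bucketNumber calling toInt on the raw property value, so any non-numeric BUCKET_NUMBER, not just '', would fail the same way. Until the parser validates the property, the exception can be avoided by supplying a positive integer for BUCKET_NUMBER together with a non-empty BUCKET_COLUMNS, or by omitting both properties. A sketch (the table names bucketed_demo and plain_demo are invented for illustration):

```sql
-- Bucketed table: both properties non-empty.
CREATE TABLE IF NOT EXISTS bucketed_demo (id INT, name STRING)
STORED AS carbondata
TBLPROPERTIES ('BUCKET_NUMBER'='2', 'BUCKET_COLUMNS'='name');

-- Unbucketed table: omit the properties instead of passing ''.
CREATE TABLE IF NOT EXISTS plain_demo (id INT, name STRING)
STORED AS carbondata;
```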
[jira] [Created] (CARBONDATA-3842) Select with limit displays incorrect resultset after datamap creation
Chetan Bhat created CARBONDATA-3842:
---
Summary: Select with limit displays incorrect resultset after datamap creation
Key: CARBONDATA-3842
URL: https://issues.apache.org/jira/browse/CARBONDATA-3842
Project: CarbonData
Issue Type: Bug
Components: data-query
Affects Versions: 2.0.1
Environment: Spark 2.3.2
Reporter: Chetan Bhat

*Steps :-*
create table tab1(id int, name string, dept string) STORED as carbondata;
create materialized view datamap31 as select a.id, a.name from tab1 a;
insert into tab1 select 1,'ram','cs';
insert into tab1 select 2,'shyam','it';
select a.id, a.name from tab1 a order by a.id limit 1;

*Issue :* Select with limit displays incorrect resultset (2 records instead of 1) after datamap creation.

0: jdbc:hive2://10.20.251.163:23040/default> select a.id, a.name from tab1 a order by a.id limit 1;
INFO : Execution ID: 558
+-----+--------+
| id  | name   |
+-----+--------+
| 2   | shyam  |
| 1   | ram    |
+-----+--------+
*2 rows selected (0.601 seconds)*
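One way to check whether the materialized view rewrite is responsible for the extra row is to compare the plan and the result with and without the MV. A diagnostic sketch (assuming Spark SQL's EXPLAIN statement and CarbonData's DROP MATERIALIZED VIEW syntax; table and MV names are taken from the steps above):

```sql
-- If the plan scans datamap31, the query was rewritten against the MV;
-- a LIMIT dropped from the rewritten plan would explain the two rows.
EXPLAIN EXTENDED
SELECT a.id, a.name FROM tab1 a ORDER BY a.id LIMIT 1;

-- Re-running after dropping the MV confirms whether the rewrite is at fault.
DROP MATERIALIZED VIEW IF EXISTS datamap31;
SELECT a.id, a.name FROM tab1 a ORDER BY a.id LIMIT 1;
```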
[jira] [Created] (CARBONDATA-3838) Select filter query fails on SI columns of different SI tables.
Chetan Bhat created CARBONDATA-3838:
---
Summary: Select filter query fails on SI columns of different SI tables.
Key: CARBONDATA-3838
URL: https://issues.apache.org/jira/browse/CARBONDATA-3838
Project: CarbonData
Issue Type: Bug
Components: data-query
Affects Versions: 2.0.0
Environment: Spark 2.3.2
Reporter: Chetan Bhat

Select filter query fails on SI columns of different SI tables.

*Steps :-*
0: jdbc:hive2://10.20.255.171:23040/default> create table brinjal (imei string,AMSize string,channelsId string,ActiveCountry string, Activecity string,gamePointId double,deviceInformationId double,productionDate Timestamp,deliveryDate timestamp,deliverycharge double) stored as carbondata TBLPROPERTIES('inverted_index'='imei,AMSize,channelsId,ActiveCountry,Activecity,productionDate,deliveryDate','sort_columns'='imei,AMSize,channelsId,ActiveCountry,Activecity,productionDate,deliveryDate','table_blocksize'='1','SORT_SCOPE'='GLOBAL_SORT','carbon.column.compressor'='zstd');
No rows selected (0.153 seconds)
0: jdbc:hive2://10.20.255.171:23040/default> LOAD DATA INPATH 'hdfs://hacluster/chetan/vardhandaterestruct.csv' INTO TABLE brinjal OPTIONS('DELIMITER'=',', 'QUOTECHAR'= '"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');
No rows selected (2.357 seconds)
0: jdbc:hive2://10.20.255.171:23040/default> CREATE INDEX indextable1 ON TABLE brinjal (channelsId) AS 'carbondata' PROPERTIES('carbon.column.compressor'='zstd');
No rows selected (1.048 seconds)
0: jdbc:hive2://10.20.255.171:23040/default> CREATE INDEX indextable2 ON TABLE brinjal (ActiveCountry) AS 'carbondata' PROPERTIES('carbon.column.compressor'='zstd');
No rows selected (1.895 seconds)
0: jdbc:hive2://10.20.255.171:23040/default> select * from brinjal where ActiveCountry 
='Chinese' or channelsId =4; Error: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: execute, tree: Exchange hashpartitioning(positionReference#6440, 200) +- *(6) HashAggregate(keys=[positionReference#6440], functions=[], output=[positionReference#6440]) +- Union :- *(3) HashAggregate(keys=[positionReference#6440], functions=[], output=[positionReference#6440]) : +- Exchange hashpartitioning(positionReference#6440, 200) : +- *(2) HashAggregate(keys=[positionReference#6440], functions=[], output=[positionReference#6440]) : +- *(2) Project [positionReference#6440] : +- *(2) Filter (cast(channelsid#6439 as int) = 4) : +- *(2) FileScan carbondata 2_0.indextable1[positionReference#6440,channelsid#6439] PushedFilters: [CastExpr((cast(channelsid#6439 as int) = 4))], ReadSchema: struct +- *(5) HashAggregate(keys=[positionReference#6442], functions=[], output=[positionReference#6442]) +- Exchange hashpartitioning(positionReference#6442, 200) +- *(4) HashAggregate(keys=[positionReference#6442], functions=[], output=[positionReference#6442]) +- *(4) Project [positionReference#6442|#6442] +- *(4) Filter (activecountry#6441 = Chinese) +- *(4) FileScan carbondata 2_0.indextable2[positionReference#6442,activecountry#6441] PushedFilters: [EqualTo(activecountry,Chinese)], ReadSchema: struct (state=,code=0) *Log -* org.apache.carbondata.core.datastore.block.SegmentPropertiesAndSchemaHolder.addSegmentProperties(SegmentPropertiesAndSchemaHolder.java:117)org.apache.carbondata.core.datastore.block.SegmentPropertiesAndSchemaHolder.addSegmentProperties(SegmentPropertiesAndSchemaHolder.java:117)2020-06-01 12:19:28,058 | ERROR | [HiveServer2-Background-Pool: Thread-1150] | Error executing query, currentState RUNNING, | org.apache.spark.internal.Logging$class.logError(Logging.scala:91)org.apache.spark.sql.catalyst.errors.package$TreeNodeException: execute, tree:Exchange hashpartitioning(positionReference#6440, 200)+- *(6) HashAggregate(keys=[positionReference#6440], 
functions=[], output=[positionReference#6440]) +- Union :- *(3) HashAggregate(keys=[positionReference#6440], functions=[], output=[positionReference#6440]) : +- Exchange hashpartitioning(positionReference#6440, 200) : +- *(2) HashAggregate(keys=[positionReference#6440], functions=[], output=[positionReference#6440]) : +- *(2) Project [positionReference#6440] : +- *(2) Filter (cast(channelsid#6439 as int) = 4) : +- *(2) FileScan carbondata 2_0.indextable1[positionReference#6440,channelsid#6439] PushedFilters: [CastExpr((cast(channelsid#6439 as int) = 4))], ReadSchema: struct +- *(5) HashAggregate(keys=[positionReference#6442],
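The plan above shows how the OR filter over two secondary-index columns is rewritten: each SI table is scanned for matching positions, the two position sets are combined with a Union, and a HashAggregate on positionReference de-duplicates them before the main-table lookup. A toy model of that rewrite in plain Python (dicts standing in for the SI tables; the sample rows are made up and none of this is CarbonData API):

```python
# Toy model of an OR filter rewritten through two secondary indexes:
# look up matching row positions in each index table, union them
# (the plan's Union + HashAggregate on positionReference), then
# fetch those positions from the main table.

main_table = [
    {"pos": 0, "channelsId": "4", "ActiveCountry": "Chinese"},
    {"pos": 1, "channelsId": "7", "ActiveCountry": "Chinese"},
    {"pos": 2, "channelsId": "4", "ActiveCountry": "Japanese"},
    {"pos": 3, "channelsId": "5", "ActiveCountry": "Indian"},
]

# Each SI table maps an indexed column value to the set of positions holding it.
indextable1 = {}  # index on channelsId
indextable2 = {}  # index on ActiveCountry
for row in main_table:
    indextable1.setdefault(row["channelsId"], set()).add(row["pos"])
    indextable2.setdefault(row["ActiveCountry"], set()).add(row["pos"])

# WHERE ActiveCountry = 'Chinese' OR channelsId = 4
positions = indextable2.get("Chinese", set()) | indextable1.get("4", set())
result = [row for row in main_table if row["pos"] in positions]
print(sorted(r["pos"] for r in result))  # → [0, 1, 2]
```

The union happens on positions, not rows, which is why the failure surfaces in the Exchange/HashAggregate on positionReference rather than in the main-table scan.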
[jira] [Commented] (CARBONDATA-3797) Refresh materialized view command throws null pointer exception
[ https://issues.apache.org/jira/browse/CARBONDATA-3797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17110901#comment-17110901 ]

Chetan Bhat commented on CARBONDATA-3797:
-

Added other steps/queries.

> Refresh materialized view command throws null pointer exception
> ---
>
> Key: CARBONDATA-3797
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3797
> Project: CarbonData
> Issue Type: Bug
> Components: data-query
> Affects Versions: 2.0.0
> Environment: Spark 2.3.2, Spark 2.4.5
> Reporter: Chetan Bhat
> Priority: Major
>
> Refresh materialized view command throws null pointer exception.
>
> CREATE TABLE uniqdata_mv(CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED as carbondata TBLPROPERTIES('local_dictionary_enable'='true','local_dictionary_threshold'='1000');
>
> LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata_mv OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
>
> CREATE MATERIALIZED VIEW mv1 as select cust_id, cust_name, count(cust_id) from uniqdata_mv group by cust_id, cust_name;
>
> refresh MATERIALIZED VIEW mv1;
> Error: java.lang.NullPointerException (state=,code=0)
>
> *Exception-*
> 2020-05-06 00:50:59,941 | ERROR | [HiveServer2-Background-Pool: Thread-1822] | Error executing query, currentState RUNNING, | org.apache.spark.internal.Logging$class.logError(Logging.scala:91)
> java.lang.NullPointerException
> at org.apache.carbondata.view.MVRefresher$.refresh(MVRefresher.scala:62)
> at org.apache.spark.sql.execution.command.view.CarbonRefreshMVCommand.processData(CarbonRefreshMVCommand.scala:52)
> at org.apache.spark.sql.execution.command.DataCommand$$anonfun$run$2.apply(package.scala:132)
> at org.apache.spark.sql.execution.command.DataCommand$$anonfun$run$2.apply(package.scala:132)
> at org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:104)
> at org.apache.spark.sql.execution.command.DataCommand.runWithAudit(package.scala:130)
> at org.apache.spark.sql.execution.command.DataCommand.run(package.scala:132)
> at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
> at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
> at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
> at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)
> at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)
> at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3370)
> at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:80)
> at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:127)
> at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:75)
> at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3369)
> at org.apache.spark.sql.Dataset.(Dataset.scala:194)
> at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:79)
> at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
> at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694)
> at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:232)
> at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:175)
> at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:171)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
> at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:185)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at
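For context, refreshing a materialized view of this shape simply recomputes the grouped aggregate from the base table. A toy Python sketch of that recomputation (plain dicts, not the CarbonData MVRefresher; the sample rows are made up):

```python
# Toy model of refreshing a materialized view defined as
#   SELECT cust_id, cust_name, count(cust_id) FROM uniqdata_mv
#   GROUP BY cust_id, cust_name
# A refresh recomputes the grouped counts from the current base table.

base_table = [
    {"cust_id": 9000, "cust_name": "CUST_NAME_00000"},
    {"cust_id": 9001, "cust_name": "CUST_NAME_00001"},
    {"cust_id": 9000, "cust_name": "CUST_NAME_00000"},
]

def refresh_mv(rows):
    """Rebuild the (cust_id, cust_name) -> count(cust_id) map from scratch."""
    mv = {}
    for row in rows:
        key = (row["cust_id"], row["cust_name"])
        mv[key] = mv.get(key, 0) + 1
    return mv

mv1 = refresh_mv(base_table)
print(mv1[(9000, "CUST_NAME_00000")])  # → 2
```

The NPE above is thrown before any such recomputation starts, inside MVRefresher.scala:62, so the failure is in the refresh command's setup rather than in the aggregation itself.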