[jira] [Closed] (CARBONDATA-4240) Properties present in https://github.com/apache/carbondata/blob/master/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java which are not present in open source doc

2022-02-17 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat closed CARBONDATA-4240.
---

The mentioned properties are now updated in the documentation link - 
[https://github.com/apache/carbondata/blob/master/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java].

Hence the bug is closed.

> Properties present in 
> https://github.com/apache/carbondata/blob/master/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
>   which are not present in open source doc
> ---
>
> Key: CARBONDATA-4240
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4240
> Project: CarbonData
>  Issue Type: Bug
>  Components: docs
>Affects Versions: 2.2.0
> Environment: Open source docs
>Reporter: Chetan Bhat
>Priority: Minor
> Fix For: 2.3.0
>
>  Time Spent: 9.5h
>  Remaining Estimate: 0h
>
> Properties present in 
> https://github.com/apache/carbondata/blob/master/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
> which are not present in the open source doc are listed below (a short usage 
> sketch follows the list). These properties need to be documented in the open 
> source doc.
> carbon.storelocation
> carbon.blocklet.size
> carbon.properties.filepath
> carbon.date.format
> carbon.complex.delimiter.level.1
> carbon.complex.delimiter.level.2
> carbon.complex.delimiter.level.3
> carbon.complex.delimiter.level.4
> carbon.lock.class
> carbon.local.dictionary.enable
> carbon.local.dictionary.decoder.fallback
> spark.deploy.zookeeper.url
> carbon.data.file.version
> spark.carbon.hive.schema.store
> spark.carbon.datamanagement.driver
> spark.carbon.sessionstate.classname
> spark.carbon.sqlastbuilder.classname
> carbon.lease.recovery.retry.count
> carbon.lease.recovery.retry.interval
> carbon.index.schema.storage
> carbon.merge.index.in.segment
> carbon.number.of.cores.while.altPartition
> carbon.minor.compaction.size
> enable.unsafe.columnpage
> carbon.lucene.compression.mode
> sort.inmemory.size.inmb
> is.driver.instance
> carbon.input.metrics.update.interval
> carbon.use.bitset.pipe.line
> is.internal.load.call
> carbon.lucene.index.stop.words
> carbon.load.dateformat.setlenient.enable
> carbon.infilter.subquery.pushdown.enable
> broadcast.record.size
> carbon.indexserver.tempfolder.deletetime
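
For context, the keys listed above are runtime CarbonData properties that can be placed in the carbon.properties file (whose location is itself controlled by carbon.properties.filepath) or set programmatically. A minimal usage sketch from spark-shell, assuming illustrative values that are not documented defaults:

import org.apache.carbondata.core.util.CarbonProperties

// Minimal sketch: setting two of the keys listed above at runtime.
// The values chosen here are illustrative assumptions, not recommended defaults.
val carbonProps = CarbonProperties.getInstance()
carbonProps.addProperty("carbon.date.format", "yyyy-MM-dd")
carbonProps.addProperty("carbon.merge.index.in.segment", "true")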





[jira] [Created] (CARBONDATA-4321) Major Compaction of a table with multiple big data loads each having different sort scopes fails

2022-01-06 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-4321:
---

 Summary: Major Compaction of a table with multiple big data loads 
each having different sort scopes fails
 Key: CARBONDATA-4321
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4321
 Project: CarbonData
  Issue Type: Bug
  Components: data-load
Affects Versions: 2.3.0
 Environment: SUSE/Cent OS, Spark 3.1.1
Reporter: Chetan Bhat
 Attachments: Failure_Logs.txt

Test Steps :

From Spark beeline, a table is created with compression format gzip, the table 
having more than 100 columns.

Three big data loads, each with a different sort scope, are loaded into the table.

Major compaction is executed on the table.

create table JL_r3
(
p_cap_time String,
city String,
product_code String,
user_base_station String,
user_belong_area_code String,
user_num String,
user_imsi String,
user_id String,
user_msisdn String,
dim1 String,
dim2 String,
dim3 String,
dim4 String,
dim5 String,
dim6 String,
dim7 String,
dim8 String,
dim9 String,
dim10 String,
dim11 String,
dim12 String,
dim13 String,
dim14 String,
dim15 String,
dim16 String,
dim17 String,
dim18 String,
dim19 String,
dim20 String,
dim21 String,
dim22 String,
dim23 String,
dim24 String,
dim25 String,
dim26 String,
dim27 String,
dim28 String,
dim29 String,
dim30 String,
dim31 String,
dim32 String,
dim33 String,
dim34 String,
dim35 String,
dim36 String,
dim37 String,
dim38 String,
dim39 String,
dim40 String,
dim41 String,
dim42 String,
dim43 String,
dim44 String,
dim45 String,
dim46 String,
dim47 String,
dim48 String,
dim49 String,
dim50 String,
dim51 String,
dim52 String,
dim53 String,
dim54 String,
dim55 String,
dim56 String,
dim57 String,
dim58 String,
dim59 String,
dim60 String,
dim61 String,
dim62 String,
dim63 String,
dim64 String,
dim65 String,
dim66 String,
dim67 String,
dim68 String,
dim69 String,
dim70 String,
dim71 String,
dim72 String,
dim73 String,
dim74 String,
dim75 String,
dim76 String,
dim77 String,
dim78 String,
dim79 String,
dim80 String,
dim81 String,
M1 double,
M2 double,
M3 double,
M4 double,
M5 double,
M6 double,
M7 double,
M8 double,
M9 double,
M10 double )
stored as carbondata
TBLPROPERTIES('table_blocksize'='256','sort_columns'='dim81','carbon.column.compressor'='gzip');

0: jdbc:hive2://10.21.19.14:23040/default> LOAD DATA inpath 
'hdfs://hacluster/chetan/Bigdata_bulk.csv' into table JL_r3 
options('sort_scope'='global_sort','DELIMITER'=',', 
'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','BAD_RECORDS_LOGGER_ENABLE'='TRUE','IS_EMPTY_DATA_BAD_RECORD'='TRUE','FILEHEADER'='p_cap_time,city,product_code,user_base_station,user_belong_area_code,user_num,user_imsi,user_id,user_msisdn,dim1,dim2,dim3,dim4,dim5,dim6,dim7,dim8,dim9,dim10,dim11,dim12,dim13,dim14,dim15,dim16,dim17,dim18,dim19,dim20,dim21,dim22,dim23,dim24,dim25,dim26,dim27,dim28,dim29,dim30,dim31,dim32,dim33,dim34,dim35,dim36,dim37,dim38,dim39,dim40,dim41,dim42,dim43,dim44,dim45,dim46,dim47,dim48,dim49,dim50,dim51,dim52,dim53,dim54,dim55,dim56,dim57,dim58,dim59,dim60,dim61,dim62,dim63,dim64,dim65,dim66,dim67,dim68,dim69,dim70,dim71,dim72,dim73,dim74,dim75,dim76,dim77,dim78,dim79,dim80,dim81,M1,M2,M3,M4,M5,M6,M7,M8,M9,M10');
+-------------+
| Segment ID  |
+-------------+
| 0           |
+-------------+
1 row selected (41.011 seconds)
0: jdbc:hive2://10.21.19.14:23040/default> LOAD DATA inpath 
'hdfs://hacluster/chetan/Bigdata_bulk.csv' into table JL_r3 
options('sort_scope'='local_sort','DELIMITER'=',', 
'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','BAD_RECORDS_LOGGER_ENABLE'='TRUE','IS_EMPTY_DATA_BAD_RECORD'='TRUE','FILEHEADER'='p_cap_time,city,product_code,user_base_station,user_belong_area_code,user_num,user_imsi,user_id,user_msisdn,dim1,dim2,dim3,dim4,dim5,dim6,dim7,dim8,dim9,dim10,dim11,dim12,dim13,dim14,dim15,dim16,dim17,dim18,dim19,dim20,dim21,dim22,dim23,dim24,dim25,dim26,dim27,dim28,dim29,dim30,dim31,dim32,dim33,dim34,dim35,dim36,dim37,dim38,dim39,dim40,dim41,dim42,dim43,dim44,dim45,dim46,dim47,dim48,dim49,dim50,dim51,dim52,dim53,dim54,dim55,dim56,dim57,dim58,dim59,dim60,dim61,dim62,dim63,dim64,dim65,dim66,dim67,dim68,dim69,dim70,dim71,dim72,dim73,dim74,dim75,dim76,dim77,dim78,dim79,dim80,dim81,M1,M2,M3,M4,M5,M6,M7,M8,M9,M10');
+-------------+
| Segment ID  |
+-------------+
| 1           |
+-------------+
1 row selected (17.094 seconds)
0: jdbc:hive2://10.21.19.14:23040/default> LOAD DATA inpath 
'hdfs://hacluster/chetan/Bigdata_bulk.csv' into table JL_r3 
options('sort_scope'='no_sort','DELIMITER'=',', 
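
The digest is truncated above; for reference, the major compaction step described in the test steps uses the standard CarbonData compaction command. A minimal sketch from spark-shell, with only the table name taken from the repro:

// Sketch only: the major-compaction step described in the test steps above,
// issued from spark-shell rather than beeline. Not the original (truncated) log.
spark.sql("ALTER TABLE JL_r3 COMPACT 'MAJOR'")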

[jira] [Updated] (CARBONDATA-4297) Create table(Carbon and Parquet) with combination of partitioned by, Clustered by, Sorted by and with options parameter and insert overwrite fails with parser errors

2021-10-08 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-4297:

Description: 
*Issue 1 : Create table* *(Carbon and Parquet) with combination of partitioned 
by, Clustered by, Sorted by fails -*

*Queries-*

CREATE TABLE t (a STRING, b INT, c STRING, d STRING) stored as carbondata
 OPTIONS (a '1', b '2')
 PARTITIONED BY (c, d) CLUSTERED BY (a) SORTED BY (b ASC) INTO 2 BUCKETS
 COMMENT 'table_comment'
 TBLPROPERTIES (t 'test');
 CREATE TABLE t (a STRING, b INT, c STRING, d STRING) stored as parquet
 OPTIONS (a '1', b '2')
 PARTITIONED BY (c, d) CLUSTERED BY (a) SORTED BY (b ASC) INTO 2 BUCKETS
 COMMENT 'table_comment'
 TBLPROPERTIES (t 'test');

0: jdbc:hive2://7.187.185.158:23040/default> CREATE TABLE t (a STRING, b INT, c 
STRING, d STRING) stored as carbondata
 0: jdbc:hive2://7.187.185.158:23040/default> OPTIONS (a '1', b '2')
 0: jdbc:hive2://7.187.185.158:23040/default> PARTITIONED BY (c, d) CLUSTERED 
BY (a) SORTED BY (b ASC) INTO 2 BUCKETS
 0: jdbc:hive2://7.187.185.158:23040/default> COMMENT 'table_comment'
 0: jdbc:hive2://7.187.185.158:23040/default> TBLPROPERTIES (t 'test');
 Error: org.apache.spark.sql.AnalysisException: == Spark Parser: 
org.apache.spark.sql.execution.SparkSqlParser ==

mismatched input 'OPTIONS' expecting (line 2, pos 0)

== SQL ==
 CREATE TABLE t (a STRING, b INT, c STRING, d STRING) stored as carbondata
 OPTIONS (a '1', b '2')
 ^^^
 PARTITIONED BY (c, d) CLUSTERED BY (a) SORTED BY (b ASC) INTO 2 BUCKETS
 COMMENT 'table_comment'
 TBLPROPERTIES (t 'test')

== Carbon Parser: org.apache.spark.sql.parser.CarbonExtensionSpark2SqlParser ==
 [1.8] failure: identifier matching regex (?i)MATERIALIZED expected

CREATE TABLE t (a STRING, b INT, c STRING, d STRING) stored as carbondata
 ^;
 == Antlr Parser: org.apache.spark.sql.parser.CarbonAntlrParser ==
 Antlr SQL Parser will only deal with Merge Into SQL Command; (state=,code=0)
 0: jdbc:hive2://7.187.185.158:23040/default> CREATE TABLE t (a STRING, b INT, 
c STRING, d STRING) stored as parquet
 0: jdbc:hive2://7.187.185.158:23040/default> OPTIONS (a '1', b '2')
 0: jdbc:hive2://7.187.185.158:23040/default> PARTITIONED BY (c, d) CLUSTERED 
BY (a) SORTED BY (b ASC) INTO 2 BUCKETS
 0: jdbc:hive2://7.187.185.158:23040/default> COMMENT 'table_comment'
 0: jdbc:hive2://7.187.185.158:23040/default> TBLPROPERTIES (t 'test');
 Error: org.apache.spark.sql.AnalysisException: == Spark Parser: 
org.apache.spark.sql.execution.SparkSqlParser ==

mismatched input 'OPTIONS' expecting (line 2, pos 0)

== SQL ==
 CREATE TABLE t (a STRING, b INT, c STRING, d STRING) stored as parquet
 OPTIONS (a '1', b '2')
 ^^^
 PARTITIONED BY (c, d) CLUSTERED BY (a) SORTED BY (b ASC) INTO 2 BUCKETS
 COMMENT 'table_comment'
 TBLPROPERTIES (t 'test')

== Carbon Parser: org.apache.spark.sql.parser.CarbonExtensionSpark2SqlParser ==
 [1.8] failure: identifier matching regex (?i)MATERIALIZED expected

CREATE TABLE t (a STRING, b INT, c STRING, d STRING) stored as parquet
 ^;
 == Antlr Parser: org.apache.spark.sql.parser.CarbonAntlrParser ==
 Antlr SQL Parser will only deal with Merge Into SQL Command; (state=,code=0)
 0: jdbc:hive2://7.187.185.158:23040/default>
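
For reference, a variant of this DDL that the parsers shown above do accept: drop the OPTIONS/CLUSTERED BY/SORTED BY clauses, declare the partition columns only in PARTITIONED BY, and move the key/value pair into TBLPROPERTIES. A minimal sketch from spark-shell, under those assumptions (a syntax workaround only, not a confirmed fix for the reported parser gap):

// Hypothetical sketch, not a confirmed fix: key/value pairs expressed through
// TBLPROPERTIES, and Hive-style partition columns declared only in PARTITIONED BY,
// which the STORED AS syntax shown above does accept.
spark.sql(
  """CREATE TABLE t2 (a STRING, b INT)
    |COMMENT 'table_comment'
    |PARTITIONED BY (c STRING, d STRING)
    |STORED AS carbondata
    |TBLPROPERTIES ('t' = 'test')""".stripMargin)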

 

*Issue 2 : Create table with options parameter fails-*

*Queries-*

CREATE TABLE tbl (a INT, b STRING, c INT) stored as carbondata OPTIONS ('a' 1);
 CREATE TABLE tbl1 (a INT, b STRING, c INT) stored as parquet OPTIONS ('a' 1);

 

0: jdbc:hive2://7.187.185.158:23040/default> CREATE TABLE tbl (a INT, b STRING, 
c INT) stored as carbondata OPTIONS ('a' 1);
 Error: org.apache.spark.sql.AnalysisException: == Spark Parser: 
org.apache.spark.sql.execution.SparkSqlParser ==

mismatched input 'OPTIONS' expecting (line 1, pos 63)

== SQL ==
 CREATE TABLE tbl (a INT, b STRING, c INT) stored as carbondata OPTIONS ('a' 1)
 ---^^^

== Carbon Parser: org.apache.spark.sql.parser.CarbonExtensionSpark2SqlParser ==
 [1.8] failure: identifier matching regex (?i)MATERIALIZED expected

CREATE TABLE tbl (a INT, b STRING, c INT) stored as carbondata OPTIONS ('a' 1)
 ^;
 == Antlr Parser: org.apache.spark.sql.parser.CarbonAntlrParser ==
 Antlr SQL Parser will only deal with Merge Into SQL Command; (state=,code=0)
 0: jdbc:hive2://7.187.185.158:23040/default> CREATE TABLE tbl1 (a INT, b 
STRING, c INT) stored as parquet OPTIONS ('a' 1);
 Error: org.apache.spark.sql.AnalysisException: == Spark Parser: 
org.apache.spark.sql.execution.SparkSqlParser ==

mismatched input 'OPTIONS' expecting (line 1, pos 61)

== SQL ==
 CREATE TABLE tbl1 (a INT, b STRING, c INT) stored as parquet OPTIONS ('a' 1)
 -^^^

== Carbon Parser: org.apache.spark.sql.parser.CarbonExtensionSpark2SqlParser ==
 [1.8] failure: identifier matching regex (?i)MATERIALIZED expected

CREATE TABLE tbl1 (a INT, b 

[jira] [Updated] (CARBONDATA-4297) Create table(Carbon and Parquet) with combination of partitioned by, Clustered by, Sorted by and with options parameter and insert overwrite fails with parser errors

2021-10-08 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-4297:

 Attachment: image-2021-10-08-12-51-14-837.png
Description: 
*Issue 1 : Create table* *(Carbon and Parquet) with combination of partitioned 
by, Clustered by, Sorted by fails -*

*Queries-*

CREATE TABLE t (a STRING, b INT, c STRING, d STRING) stored as carbondata
 OPTIONS (a '1', b '2')
 PARTITIONED BY (c, d) CLUSTERED BY (a) SORTED BY (b ASC) INTO 2 BUCKETS
 COMMENT 'table_comment'
 TBLPROPERTIES (t 'test');
 CREATE TABLE t (a STRING, b INT, c STRING, d STRING) stored as parquet
 OPTIONS (a '1', b '2')
 PARTITIONED BY (c, d) CLUSTERED BY (a) SORTED BY (b ASC) INTO 2 BUCKETS
 COMMENT 'table_comment'
 TBLPROPERTIES (t 'test');

0: jdbc:hive2://7.187.185.158:23040/default> CREATE TABLE t (a STRING, b INT, c 
STRING, d STRING) stored as carbondata
 0: jdbc:hive2://7.187.185.158:23040/default> OPTIONS (a '1', b '2')
 0: jdbc:hive2://7.187.185.158:23040/default> PARTITIONED BY (c, d) CLUSTERED 
BY (a) SORTED BY (b ASC) INTO 2 BUCKETS
 0: jdbc:hive2://7.187.185.158:23040/default> COMMENT 'table_comment'
 0: jdbc:hive2://7.187.185.158:23040/default> TBLPROPERTIES (t 'test');
 Error: org.apache.spark.sql.AnalysisException: == Spark Parser: 
org.apache.spark.sql.execution.SparkSqlParser ==

mismatched input 'OPTIONS' expecting (line 2, pos 0)

== SQL ==
 CREATE TABLE t (a STRING, b INT, c STRING, d STRING) stored as carbondata
 OPTIONS (a '1', b '2')
 ^^^
 PARTITIONED BY (c, d) CLUSTERED BY (a) SORTED BY (b ASC) INTO 2 BUCKETS
 COMMENT 'table_comment'
 TBLPROPERTIES (t 'test')

== Carbon Parser: org.apache.spark.sql.parser.CarbonExtensionSpark2SqlParser ==
 [1.8] failure: identifier matching regex (?i)MATERIALIZED expected

CREATE TABLE t (a STRING, b INT, c STRING, d STRING) stored as carbondata
 ^;
 == Antlr Parser: org.apache.spark.sql.parser.CarbonAntlrParser ==
 Antlr SQL Parser will only deal with Merge Into SQL Command; (state=,code=0)
 0: jdbc:hive2://7.187.185.158:23040/default> CREATE TABLE t (a STRING, b INT, 
c STRING, d STRING) stored as parquet
 0: jdbc:hive2://7.187.185.158:23040/default> OPTIONS (a '1', b '2')
 0: jdbc:hive2://7.187.185.158:23040/default> PARTITIONED BY (c, d) CLUSTERED 
BY (a) SORTED BY (b ASC) INTO 2 BUCKETS
 0: jdbc:hive2://7.187.185.158:23040/default> COMMENT 'table_comment'
 0: jdbc:hive2://7.187.185.158:23040/default> TBLPROPERTIES (t 'test');
 Error: org.apache.spark.sql.AnalysisException: == Spark Parser: 
org.apache.spark.sql.execution.SparkSqlParser ==

mismatched input 'OPTIONS' expecting (line 2, pos 0)

== SQL ==
 CREATE TABLE t (a STRING, b INT, c STRING, d STRING) stored as parquet
 OPTIONS (a '1', b '2')
 ^^^
 PARTITIONED BY (c, d) CLUSTERED BY (a) SORTED BY (b ASC) INTO 2 BUCKETS
 COMMENT 'table_comment'
 TBLPROPERTIES (t 'test')

== Carbon Parser: org.apache.spark.sql.parser.CarbonExtensionSpark2SqlParser ==
 [1.8] failure: identifier matching regex (?i)MATERIALIZED expected

CREATE TABLE t (a STRING, b INT, c STRING, d STRING) stored as parquet
 ^;
 == Antlr Parser: org.apache.spark.sql.parser.CarbonAntlrParser ==
 Antlr SQL Parser will only deal with Merge Into SQL Command; (state=,code=0)
 0: jdbc:hive2://7.187.185.158:23040/default>

 

*Issue 2 : Create table with options parameter fails-*

*Queries-*

CREATE TABLE tbl (a INT, b STRING, c INT) stored as carbondata OPTIONS ('a' 1);
 CREATE TABLE tbl1 (a INT, b STRING, c INT) stored as parquet OPTIONS ('a' 1);

 

0: jdbc:hive2://7.187.185.158:23040/default> CREATE TABLE tbl (a INT, b STRING, 
c INT) stored as carbondata OPTIONS ('a' 1);
 Error: org.apache.spark.sql.AnalysisException: == Spark Parser: 
org.apache.spark.sql.execution.SparkSqlParser ==

mismatched input 'OPTIONS' expecting (line 1, pos 63)

== SQL ==
 CREATE TABLE tbl (a INT, b STRING, c INT) stored as carbondata OPTIONS ('a' 1)
 ---^^^

== Carbon Parser: org.apache.spark.sql.parser.CarbonExtensionSpark2SqlParser ==
 [1.8] failure: identifier matching regex (?i)MATERIALIZED expected

CREATE TABLE tbl (a INT, b STRING, c INT) stored as carbondata OPTIONS ('a' 1)
 ^;
 == Antlr Parser: org.apache.spark.sql.parser.CarbonAntlrParser ==
 Antlr SQL Parser will only deal with Merge Into SQL Command; (state=,code=0)
 0: jdbc:hive2://7.187.185.158:23040/default> CREATE TABLE tbl1 (a INT, b 
STRING, c INT) stored as parquet OPTIONS ('a' 1);
 Error: org.apache.spark.sql.AnalysisException: == Spark Parser: 
org.apache.spark.sql.execution.SparkSqlParser ==

mismatched input 'OPTIONS' expecting (line 1, pos 61)

== SQL ==
 CREATE TABLE tbl1 (a INT, b STRING, c INT) stored as parquet OPTIONS ('a' 1)
 -^^^

== Carbon Parser: org.apache.spark.sql.parser.CarbonExtensionSpark2SqlParser ==
 [1.8] failure: identifier matching regex 

[jira] [Created] (CARBONDATA-4297) Create table(Carbon and Parquet) with combination of partitioned by, Clustered by, Sorted by and with options parameter fails with parser errors in Carbon session in Spark 2.4.5

2021-10-04 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-4297:
---

 Summary: Create table(Carbon and Parquet) with combination of 
partitioned by, Clustered by, Sorted by and with options parameter fails with 
parser errors in Carbon session in Spark 2.4.5
 Key: CARBONDATA-4297
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4297
 Project: CarbonData
  Issue Type: Bug
  Components: sql
Affects Versions: 2.3.0
 Environment: Spark 2.4.5
Reporter: Chetan Bhat


*Issue 1 : Create table* *(Carbon and Parquet) with combination of partitioned 
by, Clustered by, Sorted by fails -*

*Queries-*

CREATE TABLE t (a STRING, b INT, c STRING, d STRING) stored as carbondata
 OPTIONS (a '1', b '2')
 PARTITIONED BY (c, d) CLUSTERED BY (a) SORTED BY (b ASC) INTO 2 BUCKETS
 COMMENT 'table_comment'
 TBLPROPERTIES (t 'test');
 CREATE TABLE t (a STRING, b INT, c STRING, d STRING) stored as parquet
 OPTIONS (a '1', b '2')
 PARTITIONED BY (c, d) CLUSTERED BY (a) SORTED BY (b ASC) INTO 2 BUCKETS
 COMMENT 'table_comment'
 TBLPROPERTIES (t 'test');

0: jdbc:hive2://7.187.185.158:23040/default> CREATE TABLE t (a STRING, b INT, c 
STRING, d STRING) stored as carbondata
0: jdbc:hive2://7.187.185.158:23040/default> OPTIONS (a '1', b '2')
0: jdbc:hive2://7.187.185.158:23040/default> PARTITIONED BY (c, d) CLUSTERED BY 
(a) SORTED BY (b ASC) INTO 2 BUCKETS
0: jdbc:hive2://7.187.185.158:23040/default> COMMENT 'table_comment'
0: jdbc:hive2://7.187.185.158:23040/default> TBLPROPERTIES (t 'test');
Error: org.apache.spark.sql.AnalysisException: == Spark Parser: 
org.apache.spark.sql.execution.SparkSqlParser ==

mismatched input 'OPTIONS' expecting (line 2, pos 0)

== SQL ==
CREATE TABLE t (a STRING, b INT, c STRING, d STRING) stored as carbondata
OPTIONS (a '1', b '2')
^^^
PARTITIONED BY (c, d) CLUSTERED BY (a) SORTED BY (b ASC) INTO 2 BUCKETS
COMMENT 'table_comment'
TBLPROPERTIES (t 'test')

== Carbon Parser: org.apache.spark.sql.parser.CarbonExtensionSpark2SqlParser ==
[1.8] failure: identifier matching regex (?i)MATERIALIZED expected

CREATE TABLE t (a STRING, b INT, c STRING, d STRING) stored as carbondata
 ^;
== Antlr Parser: org.apache.spark.sql.parser.CarbonAntlrParser ==
Antlr SQL Parser will only deal with Merge Into SQL Command; (state=,code=0)
0: jdbc:hive2://7.187.185.158:23040/default> CREATE TABLE t (a STRING, b INT, c 
STRING, d STRING) stored as parquet
0: jdbc:hive2://7.187.185.158:23040/default> OPTIONS (a '1', b '2')
0: jdbc:hive2://7.187.185.158:23040/default> PARTITIONED BY (c, d) CLUSTERED BY 
(a) SORTED BY (b ASC) INTO 2 BUCKETS
0: jdbc:hive2://7.187.185.158:23040/default> COMMENT 'table_comment'
0: jdbc:hive2://7.187.185.158:23040/default> TBLPROPERTIES (t 'test');
Error: org.apache.spark.sql.AnalysisException: == Spark Parser: 
org.apache.spark.sql.execution.SparkSqlParser ==

mismatched input 'OPTIONS' expecting (line 2, pos 0)

== SQL ==
CREATE TABLE t (a STRING, b INT, c STRING, d STRING) stored as parquet
OPTIONS (a '1', b '2')
^^^
PARTITIONED BY (c, d) CLUSTERED BY (a) SORTED BY (b ASC) INTO 2 BUCKETS
COMMENT 'table_comment'
TBLPROPERTIES (t 'test')

== Carbon Parser: org.apache.spark.sql.parser.CarbonExtensionSpark2SqlParser ==
[1.8] failure: identifier matching regex (?i)MATERIALIZED expected

CREATE TABLE t (a STRING, b INT, c STRING, d STRING) stored as parquet
 ^;
== Antlr Parser: org.apache.spark.sql.parser.CarbonAntlrParser ==
Antlr SQL Parser will only deal with Merge Into SQL Command; (state=,code=0)
0: jdbc:hive2://7.187.185.158:23040/default>

 

*Issue 2 : Create table with options parameter fails-*

*Queries-*

CREATE TABLE tbl (a INT, b STRING, c INT) stored as carbondata OPTIONS ('a' 1);
 CREATE TABLE tbl1 (a INT, b STRING, c INT) stored as parquet OPTIONS ('a' 1);

 

0: jdbc:hive2://7.187.185.158:23040/default> CREATE TABLE tbl (a INT, b STRING, 
c INT) stored as carbondata OPTIONS ('a' 1);
Error: org.apache.spark.sql.AnalysisException: == Spark Parser: 
org.apache.spark.sql.execution.SparkSqlParser ==

mismatched input 'OPTIONS' expecting (line 1, pos 63)

== SQL ==
CREATE TABLE tbl (a INT, b STRING, c INT) stored as carbondata OPTIONS ('a' 1)
---^^^

== Carbon Parser: org.apache.spark.sql.parser.CarbonExtensionSpark2SqlParser ==
[1.8] failure: identifier matching regex (?i)MATERIALIZED expected

CREATE TABLE tbl (a INT, b STRING, c INT) stored as carbondata OPTIONS ('a' 1)
 ^;
== Antlr Parser: org.apache.spark.sql.parser.CarbonAntlrParser ==
Antlr SQL Parser will only deal with Merge Into SQL Command; (state=,code=0)
0: jdbc:hive2://7.187.185.158:23040/default> CREATE TABLE tbl1 (a INT, b 
STRING, c INT) stored as parquet OPTIONS ('a' 1);
Error: org.apache.spark.sql.AnalysisException: == Spark Parser: 
org.apache.spark.sql.execution.SparkSqlParser ==

mismatched input 'OPTIONS' expecting (line 1, pos 61)

== SQL ==
CREATE 

[jira] [Created] (CARBONDATA-4294) Some Carbondata github docs links not working

2021-10-01 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-4294:
---

 Summary: Some Carbondata github docs links not working
 Key: CARBONDATA-4294
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4294
 Project: CarbonData
  Issue Type: Bug
  Components: docs
Affects Versions: 2.3.0
 Environment: Carbondata github links
Reporter: Chetan Bhat


1. In the 
https://github.com/apache/carbondata/blob/master/docs/how-to-contribute-to-apache-carbondata.md#engage
page, the "Apache CarbonData Dev Mailing List archive" link does not open its 
target when clicked. Also, in "If you do not already have an account, sign up 
here", the "here" link does not open its target.
2. In the 
https://github.com/apache/carbondata/blob/master/docs/how-to-contribute-to-apache-carbondata.md#deleting-your-branch--optional-
page, the "Deleting your branch (optional)" link does not open the target page 
when clicked.

 





[jira] [Closed] (CARBONDATA-4235) after alter add column when user does rename operation, the select operation on struct type gives null value and children of struct gives error

2021-09-20 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat closed CARBONDATA-4235.
---
Fix Version/s: 2.3.0
   Resolution: Fixed

Issue fixed in 2.3.0.

> after alter add column when user does rename operation, the select operation 
> on struct type gives null value and children of struct gives error
> 
>
> Key: CARBONDATA-4235
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4235
> Project: CarbonData
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 2.2.0
> Environment: Spark 3.1.1, Spark 2.4.5
>Reporter: Chetan Bhat
>Priority: Minor
> Fix For: 2.3.0
>
>
> *Queries –*
> drop table if exists test_rename;
> CREATE TABLE test_rename (str1 struct<a:int>, str2 struct<a:struct<b:int>>, 
> str3 struct<a:struct<b:struct<c:int>>> comment 'struct', intfield int, arr1 
> array<int>, arr2 array<array<int>>, arr3 array<string>, arr4 
> array<struct<a:int>> comment 'array') STORED AS carbondata;
> insert into test_rename values (named_struct('a', 2), named_struct('a', 
> named_struct('b', 2)), named_struct('a', named_struct('b',named_struct('c', 
> 2))), 1,array(1,2,3), array(array(1,2),array(3,4)), array('hello','world'), 
> array(named_struct('a',45)));
> ALTER TABLE test_rename ADD COLUMNS(arr_1 ARRAY);
> alter table test_rename change str2 str22 struct<a:struct<b:int>>;
> select str22 from test_rename;
> select str22.a from test_rename;
> select str22.a.b from test_rename;
>  
> Issue : after alter add column when user does rename operation, the select 
> operation on struct type gives null value and children of struct gives error
>  
> *Issue 1 : Exception trace on executing query –*
> 0: jdbc:hive2://vm2:22550/> select str22.a.b from test_rename;
>  INFO : Execution ID: 2465
>  Error: org.apache.hive.service.cli.HiveSQLException: Error running query: 
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in 
> stage 1100.0 failed 4 times, most recent failure: Lost task 0.3 in stage 
> 1100.0 (TID 10353) (vm1 executor 5): java.nio.BufferUnderflowException
>  at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:155)
>  at 
> org.apache.carbondata.core.scan.complextypes.PrimitiveQueryType.getDataObject(PrimitiveQueryType.java:166)
>  at 
> org.apache.carbondata.core.scan.complextypes.PrimitiveQueryType.getDataObject(PrimitiveQueryType.java:147)
>  at 
> org.apache.carbondata.core.scan.complextypes.PrimitiveQueryType.getDataBasedOnColumn(PrimitiveQueryType.java:141)
>  at 
> org.apache.carbondata.core.scan.complextypes.StructQueryType.getDataBasedOnColumn(StructQueryType.java:160)
>  at 
> org.apache.carbondata.core.scan.complextypes.StructQueryType.getDataBasedOnColumn(StructQueryType.java:160)
>  at 
> org.apache.carbondata.core.scan.collector.impl.DictionaryBasedResultCollector.fillRow(DictionaryBasedResultCollector.java:316)
>  at 
> org.apache.carbondata.core.scan.collector.impl.DictionaryBasedResultCollector.fillDimensionData(DictionaryBasedResultCollector.java:288)
>  at 
> org.apache.carbondata.core.scan.collector.impl.DictionaryBasedResultCollector.collectResultInRow(DictionaryBasedResultCollector.java:159)
>  at 
> org.apache.carbondata.core.scan.processor.DataBlockIterator.next(DataBlockIterator.java:110)
>  at 
> org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.getBatchResult(DetailQueryResultIterator.java:58)
>  at 
> org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:50)
>  at 
> org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:32)
>  at 
> org.apache.carbondata.core.scan.result.iterator.ChunkRowIterator.hasNext(ChunkRowIterator.java:56)
>  at 
> org.apache.carbondata.hadoop.CarbonRecordReader.nextKeyValue(CarbonRecordReader.java:127)
>  at 
> org.apache.carbondata.spark.rdd.CarbonScanRDD$$anon$1.hasNext(CarbonScanRDD.scala:557)
>  at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
>  at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
>  at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown
>  Source)
>  at 
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>  at 
> org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:755)
>  at 
> org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:345)
>  at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:897)
>  at 
> org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:897)
>  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
>  at 

[jira] [Closed] (CARBONDATA-4241) if the sort scope is changed to global sort and data loaded, major compaction fails

2021-09-14 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat closed CARBONDATA-4241.
---
Fix Version/s: 2.3.0
   Resolution: Fixed

The 2 scenarios in the bug are fixed.

Scenario 1 :-

0: jdbc:hive2://10.21.19.14:23040> CREATE TABLE uniqdata_pagesize (CUST_ID 
int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, 
BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), 
DECIMAL_COLUMN2 decimal(36,36),Double_COLUMN1 double, Double_COLUMN2 
double,INTEGER_COLUMN1 int) STORED as carbondata 
TBLPROPERTIES('table_page_size_inmb'='1');
+---------+
| Result  |
+---------+
+---------+
No rows selected (1.637 seconds)
0: jdbc:hive2://10.21.19.14:23040> LOAD DATA INPATH 
'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata_pagesize 
OPTIONS('DELIMITER'=',' , 
'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
+-------------+
| Segment ID  |
+-------------+
| 0           |
+-------------+
1 row selected (5.361 seconds)
0: jdbc:hive2://10.21.19.14:23040> alter table uniqdata_pagesize set 
tblproperties('sort_columns'='CUST_ID','sort_scope'='global_sort');
+---------+
| Result  |
+---------+
+---------+
No rows selected (0.883 seconds)
0: jdbc:hive2://10.21.19.14:23040> LOAD DATA INPATH 
'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata_pagesize 
OPTIONS('DELIMITER'=',' , 
'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
+-------------+
| Segment ID  |
+-------------+
| 1           |
+-------------+
1 row selected (2.104 seconds)
0: jdbc:hive2://10.21.19.14:23040> alter table uniqdata_pagesize compact 
'major';
+---------+
| Result  |
+---------+
+---------+
No rows selected (5.737 seconds)

 

Scenario 2 :- 

0: jdbc:hive2://10.21.19.14:23040> CREATE TABLE uniqdata_sortcol_bloom_locdic 
(CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ 
timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 
decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, 
Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED as carbondata 
tblproperties('sort_columns'='cust_id,cust_name,dob,doj,bigint_column1');
+---------+
| Result  |
+---------+
+---------+
No rows selected (0.31 seconds)
0: jdbc:hive2://10.21.19.14:23040> LOAD DATA INPATH 
'hdfs://hacluster/chetan/2000_UniqData.csv' into table 
uniqdata_sortcol_bloom_locdic OPTIONS('DELIMITER'=',', 
'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
+-------------+
| Segment ID  |
+-------------+
| 0           |
+-------------+
1 row selected (1.613 seconds)
0: jdbc:hive2://10.21.19.14:23040> alter table uniqdata_sortcol_bloom_locdic 
set 
tblproperties('local_dictionary_enable'='true','local_dictionary_threshold'='1000');
+---------+
| Result  |
+---------+
+---------+
No rows selected (0.711 seconds)
0: jdbc:hive2://10.21.19.14:23040> alter table uniqdata_sortcol_bloom_locdic 
set tblproperties('sort_columns'='CUST_ID','sort_scope'='global_sort');
+---------+
| Result  |
+---------+
+---------+
No rows selected (0.638 seconds)
0: jdbc:hive2://10.21.19.14:23040> LOAD DATA INPATH 
'hdfs://hacluster/chetan/2000_UniqData.csv' into table 
uniqdata_sortcol_bloom_locdic partition(active_emui_version='xyz') 
OPTIONS('FILEHEADER'='CUST_ID,CUST_NAME ,ACTIVE_EMUI_VERSION,DOB,DOJ, 
BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1, 
Double_COLUMN2,INTEGER_COLUMN1','BAD_RECORDS_ACTION'='FORCE');
+-------------+
| Segment ID  |
+-------------+
| 1           |
+-------------+
1 row selected (0.929 seconds)
0: jdbc:hive2://10.21.19.14:23040> alter table uniqdata_sortcol_bloom_locdic 
compact 'major';
+---------+
| Result  |
+---------+
+---------+
No rows selected (1.581 seconds)

> if the sort scope is changed to global sort and data loaded, major compaction 
> fails
> ---
>
> Key: CARBONDATA-4241
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4241
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 2.2.0
> Environment: Spark 2.3.2 Carbon 1.6.1 , Spark 3.1.1 Carbon 2.2.0
>Reporter: Chetan Bhat
>Assignee: Indhumathi Muthumurugesh
>Priority: Major
> Fix For: 2.3.0, 2.2.0
>
>
> *Scenario 1 :  create table with table_page_size_inmb'='1', load data ,* *set 

[jira] [Closed] (CARBONDATA-4236) Documentation correctness and link issues in https://github.com/apache/carbondata/blob/master/docs/

2021-09-07 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat closed CARBONDATA-4236.
---

Issue is fixed now.

> Documentation correctness and link issues in 
> https://github.com/apache/carbondata/blob/master/docs/
> ---
>
> Key: CARBONDATA-4236
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4236
> Project: CarbonData
>  Issue Type: Bug
>  Components: docs
>Affects Versions: 2.2.0
> Environment: docs with content and examples verified on Spark 2.4.5 
> and Spark 3.1.1 compatible carbon.
>Reporter: Chetan Bhat
>Priority: Minor
> Fix For: 2.3.0
>
>
> In the documentation link 
> https://github.com/apache/carbondata/blob/master/docs/
> Issue 1 :- 
> In link -> 
> https://github.com/apache/carbondata/blob/master/docs/configuration-parameters.md
> the "See detail" links do not open the target 
> "http://spark.apache.org/docs/latest/rdd-programming-guide.html#rdd-persistence".
> In link --> 
> https://github.com/apache/carbondata/blob/master/docs/documentation.md the 
> link "Apache CarbonData wiki" when clicked tries to open 
> "https://cwiki.apache.org/confluence/display/CARBONDATA/CarbonData+Home" but the 
> target page can't be opened. Similarly the other links in the "External 
> Resources" section can't be opened due to the same error.
> In link 
> https://github.com/apache/carbondata/blob/master/docs/faq.md#what-are-bad-records
> the link "https://thrift.apache.org/docs/install" when clicked does not open 
> the target page.
> In link 
> https://github.com/apache/carbondata/blob/master/docs/quick-start-guide.md 
> when the "Spark website" link is clicked the 
> https://spark.apache.org/downloads.html page is not opened. Also in the same 
> page when the "Apache Spark Documentation" link is clicked the 
> "http://spark.apache.org/docs/latest/" page is not opened.
> In the link 
> https://github.com/apache/carbondata/blob/master/docs/release-guide.md 
> the "Product Release Policy", "release signing guidelines", "Apache Nexus 
> repository" and "repository.apache.org" links do not open the target pages 
> when clicked.
> Issue 2:-
> In link --> 
> https://github.com/apache/carbondata/blob/master/docs/configuration-parameters.md
>  the "To configure Ranges-based Compaction" to be changed to "To configure 
> Range-based Compaction"
> Issue 3:-
> In link --> 
> https://github.com/apache/carbondata/blob/master/docs/configuration-parameters.md
>  the "Making this true degrade the LOAD performance" to be changed to "Making 
> this true degrades the LOAD performance"
> Issue 4 :-
> In link --> 
> https://github.com/apache/carbondata/blob/master/docs/configuration-parameters.md
>  the "user an either set to true" to be changed to "user can either set to 
> true"





[jira] [Updated] (CARBONDATA-4276) writestream fail when csv is copied to readstream hdfs path in Spark 2.4.5

2021-08-26 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-4276:

Description: 
*With Carbon 2.2.0 Spark 2.4.5 cluster* 

*steps :*

*+In hdfs execute following command :+*

 cd /opt/HA/C10/install/hadoop/datanode/bin/
 ./hdfs dfs -rm -r /tmp/stream_test/checkpoint_all_data
 ./hdfs dfs -mkdir -p /tmp/stream_test/{checkpoint_all_data,bad_records_all_data}
 ./hdfs dfs -mkdir -p /Priyesh/streaming/csv/
 ./hdfs dfs -cp /chetan/100_olap_C20.csv /Priyesh/streaming/csv/

./hdfs dfs -cp /Priyesh/streaming/csv/100_olap_C20.csv 
/Priyesh/streaming/csv/100_olap_C21.csv

 

*+From Spark-beeline /Spark-sql /Spark-shell, execute :+*

DROP TABLE IF EXISTS all_datatypes_2048;
 create table all_datatypes_2048 (imei string,deviceInformationId int,MAC 
string,deviceColor string,device_backColor string,modelId string,marketName 
string,AMSize string,ROMSize string,CUPAudit string,CPIClocked string,series 
string,productionDate timestamp,bomCode string,internalModels string, 
deliveryTime string, channelsId string, channelsName string , deliveryAreaId 
string, deliveryCountry string, deliveryProvince string, deliveryCity 
string,deliveryDistrict string, deliveryStreet string, oxSingleNumber string, 
ActiveCheckTime string, ActiveAreaId string, ActiveCountry string, 
ActiveProvince string, Activecity string, ActiveDistrict string, ActiveStreet 
string, ActiveOperatorId string, Active_releaseId string, Active_EMUIVersion 
string, Active_operaSysVersion string, Active_BacVerNumber string, 
Active_BacFlashVer string, Active_webUIVersion string, Active_webUITypeCarrVer 
string,Active_webTypeDataVerNumber string, Active_operatorsVersion string, 
Active_phonePADPartitionedVersions string, Latest_YEAR int, Latest_MONTH int, 
Latest_DAY Decimal(30,10), Latest_HOUR string, Latest_areaId string, 
Latest_country string, Latest_province string, Latest_city string, 
Latest_district string, Latest_street string, Latest_releaseId string, 
Latest_EMUIVersion string, Latest_operaSysVersion string, Latest_BacVerNumber 
string, Latest_BacFlashVer string, Latest_webUIVersion string, 
Latest_webUITypeCarrVer string, Latest_webTypeDataVerNumber string, 
Latest_operatorsVersion string, Latest_phonePADPartitionedVersions string, 
Latest_operatorId string, gamePointDescription string,gamePointId 
double,contractNumber BigInt) stored as carbondata 
TBLPROPERTIES('table_blocksize'='2048','streaming'='true', 
'sort_columns'='imei');

 

*+From Spark-shell ,execute :+*

import org.apache.spark.sql.streaming._
 import org.apache.spark.sql.streaming.Trigger.ProcessingTime

val df_j=spark.readStream.text("hdfs://hacluster/Priyesh/streaming/csv/*.csv")

df_j.writeStream.format("carbondata").option("dbName","ranjan").option("carbon.stream.parser","org.apache.carbondata.streaming.parser.CSVStreamParserImp").option("checkpointLocation",
 
"hdfs://hacluster/tmp/stream_test/checkpoint_all_data").option("bad_records_action","hdfs://hacluster/tmp/stream_test/bad_records_all_data").option("tableName","all_datatypes_2048").trigger(ProcessingTime(6000)).option("carbon.streaming.auto.handoff.enabled","true").option("carbon.streaming.segment.max.size",102400).start

show segments for table all_datatypes_2048;
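
A side note on the writeStream call above: start() returns a StreamingQuery handle (visible later as res0), which can be kept so the stream is monitored and stopped explicitly. A minimal sketch assuming the same options as the repro, abbreviated here:

// Sketch only: keep the StreamingQuery returned by start() instead of
// discarding it, so the stream can be awaited and stopped cleanly.
val query = df_j.writeStream
  .format("carbondata")
  .option("dbName", "ranjan")
  .option("tableName", "all_datatypes_2048")
  .option("carbon.stream.parser",
    "org.apache.carbondata.streaming.parser.CSVStreamParserImp")
  .option("checkpointLocation",
    "hdfs://hacluster/tmp/stream_test/checkpoint_all_data")
  .trigger(ProcessingTime(6000))
  .start()

query.awaitTermination(60000)  // wait up to 60 seconds for micro-batches to run
query.stop()                   // stop the streaming query cleanly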

 

*issue 1 :*

*+when the csv file is copied to the hdfs folder for the 1st time after streaming 
has started, writestream fails with error:+*

scala> 
df_j.writeStream.format("carbondata").option("dbName","ranjan").option("carbon.stream.parser","org.apache.carbondata.streaming.parser.CSVStreamParserImp").option("checkpointLocation",
 
"hdfs://hacluster/tmp/stream_test/checkpoint_all_data").option("bad_records_action","hdfs://hacluster/tmp/stream_test/bad_records_all_data").option("tableName","all_datatypes_2048").trigger(ProcessingTime(6000)).option("carbon.streaming.auto.handoff.enabled","true").option("carbon.streaming.segment.max.size",102400).start
 21/08/26 12:53:11 WARN CarbonProperties: The enable mv value "null" is 
invalid. Using the default value "true"
 21/08/26 12:53:11 WARN CarbonProperties: The value "LOCALLOCK" configured for 
key carbon.lock.type is invalid for current file system. Use the default value 
HDFSLOCK instead.
 21/08/26 12:53:12 WARN HiveConf: HiveConf of name 
hive.metastore.rdb.password.decode.enable does not exist
 21/08/26 12:53:12 WARN HiveConf: HiveConf of name 
hive.metastore.db.ssl.enabled does not exist
 21/08/26 12:53:13 WARN HiveConf: HiveConf of name 
hive.metastore.rdb.password.decode.enable does not exist
 21/08/26 12:53:13 WARN HiveConf: HiveConf of name 
hive.metastore.db.ssl.enabled does not exist
 21/08/26 12:53:14 WARN ObjectStore: Failed to get database global_temp, 
returning NoSuchObjectException
 res0: org.apache.spark.sql.streaming.StreamingQuery = 
org.apache.spark.sql.execution.streaming.StreamingQueryWrapper@ad038f8

scala> 21/08/26 13:00:49 WARN DFSClient: DataStreamer Exception
 

[jira] [Updated] (CARBONDATA-4276) writestream fail when csv is copied to readstream hdfs path in Spark 2.4.5

2021-08-26 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-4276:

Summary: writestream fail when csv is copied to readstream hdfs path in 
Spark 2.4.5  (was: writestream fail when csv is copied to readstream hdfs path)

> writestream fail when csv is copied to readstream hdfs path in Spark 2.4.5
> --
>
> Key: CARBONDATA-4276
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4276
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 2.2.0
> Environment: Spark 2.4.5
>Reporter: PRIYESH RANJAN
>Priority: Minor
>
> *steps :*
> *+In hdfs execute following command :+*
>  cd /opt/HA/C10/install/hadoop/datanode/bin/
> ./hdfs dfs -rm -r /tmp/stream_test/checkpoint_all_data
> ./hdfs dfs -mkdir -p /tmp/stream_test/{checkpoint_all_data,bad_records_all_data}
> ./hdfs dfs -mkdir -p /Priyesh/streaming/csv/
> ./hdfs dfs -cp /chetan/100_olap_C20.csv /Priyesh/streaming/csv/
> ./hdfs dfs -cp /Priyesh/streaming/csv/100_olap_C20.csv 
> /Priyesh/streaming/csv/100_olap_C21.csv
>  
> *+From Spark-beeline /Spark-sql /Spark-shell, execute :+*
> DROP TABLE IF EXISTS all_datatypes_2048;
> create table all_datatypes_2048 (imei string,deviceInformationId int,MAC 
> string,deviceColor string,device_backColor string,modelId string,marketName 
> string,AMSize string,ROMSize string,CUPAudit string,CPIClocked string,series 
> string,productionDate timestamp,bomCode string,internalModels string, 
> deliveryTime string, channelsId string, channelsName string , deliveryAreaId 
> string, deliveryCountry string, deliveryProvince string, deliveryCity 
> string,deliveryDistrict string, deliveryStreet string, oxSingleNumber string, 
> ActiveCheckTime string, ActiveAreaId string, ActiveCountry string, 
> ActiveProvince string, Activecity string, ActiveDistrict string, ActiveStreet 
> string, ActiveOperatorId string, Active_releaseId string, Active_EMUIVersion 
> string, Active_operaSysVersion string, Active_BacVerNumber string, 
> Active_BacFlashVer string, Active_webUIVersion string, 
> Active_webUITypeCarrVer string,Active_webTypeDataVerNumber string, 
> Active_operatorsVersion string, Active_phonePADPartitionedVersions string, 
> Latest_YEAR int, Latest_MONTH int, Latest_DAY Decimal(30,10), Latest_HOUR 
> string, Latest_areaId string, Latest_country string, Latest_province string, 
> Latest_city string, Latest_district string, Latest_street string, 
> Latest_releaseId string, Latest_EMUIVersion string, Latest_operaSysVersion 
> string, Latest_BacVerNumber string, Latest_BacFlashVer string, 
> Latest_webUIVersion string, Latest_webUITypeCarrVer string, 
> Latest_webTypeDataVerNumber string, Latest_operatorsVersion string, 
> Latest_phonePADPartitionedVersions string, Latest_operatorId string, 
> gamePointDescription string,gamePointId double,contractNumber BigInt) stored 
> as carbondata TBLPROPERTIES('table_blocksize'='2048','streaming'='true', 
> 'sort_columns'='imei');
>  
> *+From Spark-shell ,execute :+*
> import org.apache.spark.sql.streaming._
> import org.apache.spark.sql.streaming.Trigger.ProcessingTime
> val df_j=spark.readStream.text("hdfs://hacluster/Priyesh/streaming/csv/*.csv")
> df_j.writeStream.format("carbondata").option("dbName","ranjan").option("carbon.stream.parser","org.apache.carbondata.streaming.parser.CSVStreamParserImp").option("checkpointLocation",
>  
> "hdfs://hacluster/tmp/stream_test/checkpoint_all_data").option("bad_records_action","hdfs://hacluster/tmp/stream_test/bad_records_all_data").option("tableName","all_datatypes_2048").trigger(ProcessingTime(6000)).option("carbon.streaming.auto.handoff.enabled","true").option("carbon.streaming.segment.max.size",102400).start
> show segments for table all_datatypes_2048;
>  
> *issue 1 :*
> *+when the csv file is copied to the hdfs folder for the 1st time after streaming 
> has started, writestream fails with error:+*
> scala> 
> df_j.writeStream.format("carbondata").option("dbName","ranjan").option("carbon.stream.parser","org.apache.carbondata.streaming.parser.CSVStreamParserImp").option("checkpointLocation",
>  
> "hdfs://hacluster/tmp/stream_test/checkpoint_all_data").option("bad_records_action","hdfs://hacluster/tmp/stream_test/bad_records_all_data").option("tableName","all_datatypes_2048").trigger(ProcessingTime(6000)).option("carbon.streaming.auto.handoff.enabled","true").option("carbon.streaming.segment.max.size",102400).start
> 21/08/26 12:53:11 WARN CarbonProperties: The enable mv value "null" is 
> invalid. Using the default value "true"
> 21/08/26 12:53:11 WARN CarbonProperties: The value "LOCALLOCK" configured for 
> key carbon.lock.type is invalid for current file system. Use the default 
> value HDFSLOCK instead.
> 21/08/26 12:53:12 

[jira] [Updated] (CARBONDATA-4243) Select filter query with to_date in filter fails for table with column_meta_cache configured also having SI

2021-08-24 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-4243:

Environment: Spark 3.1.1, Spark 2.4.5  (was: Spark 3.1.1)

> Select filter query with to_date in filter fails for table with 
> column_meta_cache configured also having SI
> ---
>
> Key: CARBONDATA-4243
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4243
> Project: CarbonData
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 2.2.0
> Environment: Spark 3.1.1, Spark 2.4.5
>Reporter: Chetan Bhat
>Priority: Minor
>
> Create table with column_meta_cache, create secondary indexes and load data 
> to table. 
> Execute the Select filter query with to_date in filter.
> CREATE TABLE uniqdata (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
> string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,36),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) stored as carbondata 
> TBLPROPERTIES('COLUMN_META_CACHE'='CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ');
>  CREATE INDEX indextable2 ON TABLE uniqdata (DOB) AS 'carbondata';
>  CREATE INDEX indextable3 ON TABLE uniqdata (DOJ) AS 'carbondata';
>  LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table 
> uniqdata OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
>  
> *Issue: Select filter query with to_date in filter fails for table with 
> column_meta_cache configured also having SI*
> 0: jdbc:hive2://10.21.19.14:23040/default> select 
> max(to_date(DOB)),min(to_date(DOB)),count(to_date(DOB)) from uniqdata where 
> to_date(DOB)='1975-06-11' or to_date(Dn select 
> max(to_date(DOB)),min(to_date(DOB)),count(to_date(DOB)) from uniqdata where 
> to_date(DOB)='1975-06-11' or to_date(DOB)='1975-06-23';
>  Error: org.apache.hive.service.cli.HiveSQLException: Error running query: 
> org.apache.spark.sql.catalyst.errors.package$TreeNodeException: makeCopy, 
> tree:
>  !BroadCastSIFilterPushJoin [none#0|#0], [none#1|#1], Inner, BuildRight
>  :- *(6) ColumnarToRow
>  : +- Scan CarbonDatasourceHadoopRelation chetan.uniqdata[dob#847024|#847024] 
> Batched: true, DirectScan: false, PushedFilters: [((cast(input[0] as date) = 
> 1987) or (cast(in9))], ReadSchema: [dob]
>  +- *(8) HashAggregate(keys=[positionReference#847161|#847161], functions=[], 
> output=[positionReference#847161|#847161])
>  +- ReusedExchange [positionReference#847161|#847161], Exchange 
> hashpartitioning(positionReference#847161, 200), ENSURE_REQUIREMENTS, 
> [id=#195473|#195473]
> at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(Sparation.scala:361)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:263)
>  at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:78)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:62)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:43)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:263)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:258)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:272)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
>  Caused by: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: 
> makeCopy, tree:
>  !BroadCastSIFilterPushJoin [none#0|#0], [none#1|#1], Inner, BuildRight
>  :- 

[jira] [Reopened] (CARBONDATA-4241) if the sort scope is changed to global sort and data loaded, major compaction fails

2021-08-24 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat reopened CARBONDATA-4241:
-

Issue still exists with Spark 3.1.1 jars in 
https://dist.apache.org/repos/dist/release/carbondata/2.2.0/

> if the sort scope is changed to global sort and data loaded, major compaction 
> fails
> ---
>
> Key: CARBONDATA-4241
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4241
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 2.2.0
> Environment: Spark 2.3.2 Carbon 1.6.1 , Spark 3.1.1 Carbon 2.2.0
>Reporter: Chetan Bhat
>Assignee: Indhumathi Muthumurugesh
>Priority: Major
> Fix For: 2.2.0
>
>
> *Scenario 1 : create table with 'table_page_size_inmb'='1', load data, set 
> sort scope as global sort, load data and do major compaction.*
> 0: jdbc:hive2://10.21.19.14:23040/default> CREATE TABLE uniqdata_pagesize 
> (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ 
> timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 
> decimal(30,10), DECIMAL_COLUMN2 decimal(36,36),Double_COLUMN1 double, 
> Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED as carbondata 
> TBLPROPERTIES('table_page_size_inmb'='1');
> +---------+
> | Result  |
> +---------+
> +---------+
> No rows selected (0.229 seconds)
> 0: jdbc:hive2://10.21.19.14:23040/default> LOAD DATA INPATH 
> 'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata_pagesize 
> OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> +-------------+
> | Segment ID  |
> +-------------+
> | 0           |
> +-------------+
> 1 row selected (1.016 seconds)
> 0: jdbc:hive2://10.21.19.14:23040/default> alter table uniqdata_pagesize set 
> tblproperties('sort_columns'='CUST_ID','sort_scope'='global_sort');
> +---------+
> | Result  |
> +---------+
> +---------+
> No rows selected (0.446 seconds)
> 0: jdbc:hive2://10.21.19.14:23040/default> LOAD DATA INPATH 
> 'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata_pagesize 
> OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> +-------------+
> | Segment ID  |
> +-------------+
> | 1           |
> +-------------+
> 1 row selected (0.767 seconds)
> 0: jdbc:hive2://10.21.19.14:23040/default> alter table uniqdata_pagesize 
> compact 'major';
> Error: org.apache.hive.service.cli.HiveSQLException: Error running query: 
> org.apache.spark.sql.AnalysisException: Compaction failed. Please check logs 
> for more info. Exception in compaction Compaction Failure in Merger Rdd.
>     at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:361)
>     at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:263)
>     at 
> scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
>     at 
> org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:78)
>     at 
> org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:62)
>     at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:43)
>     at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:263)
>     at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:258)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
>     at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:272)
>     at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at 
> 

[jira] [Updated] (CARBONDATA-4243) Select filter query with to_date in filter fails for table with column_meta_cache configured also having SI

2021-07-09 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-4243:

Description: 
Create table with column_meta_cache, create secondary indexes and load data to 
table. 

Execute the Select filter query with to_date in filter.

CREATE TABLE uniqdata (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, 
DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
decimal(36,36),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
int) stored as carbondata 
TBLPROPERTIES('COLUMN_META_CACHE'='CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ');
 CREATE INDEX indextable2 ON TABLE uniqdata (DOB) AS 'carbondata';
 CREATE INDEX indextable3 ON TABLE uniqdata (DOJ) AS 'carbondata';
 LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table 
uniqdata OPTIONS('DELIMITER'=',' , 
'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');

 

*Issue: Select filter query with to_date in the filter fails for a table that has 
column_meta_cache configured and also has SI*

0: jdbc:hive2://10.21.19.14:23040/default> select 
max(to_date(DOB)),min(to_date(DOB)),count(to_date(DOB)) from uniqdata where 
to_date(DOB)='1975-06-11' or to_date(DOB)='1975-06-23';
 Error: org.apache.hive.service.cli.HiveSQLException: Error running query: 
org.apache.spark.sql.catalyst.errors.package$TreeNodeException: makeCopy, tree:
 !BroadCastSIFilterPushJoin [none#0|#0], [none#1|#1], Inner, BuildRight
 :- *(6) ColumnarToRow
 : +- Scan CarbonDatasourceHadoopRelation chetan.uniqdata[dob#847024|#847024] 
Batched: true, DirectScan: false, PushedFilters: [((cast(input[0] as date) = 
1987) or (cast(in9))], ReadSchema: [dob]
 +- *(8) HashAggregate(keys=[positionReference#847161|#847161], functions=[], 
output=[positionReference#847161|#847161])
 +- ReusedExchange [positionReference#847161|#847161], Exchange 
hashpartitioning(positionReference#847161, 200), ENSURE_REQUIREMENTS, 
[id=#195473|#195473]

at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:361)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:263)
 at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
 at 
org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:78)
 at 
org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:62)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:43)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:263)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:258)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:272)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)
 Caused by: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: 
makeCopy, tree:
 !BroadCastSIFilterPushJoin [none#0|#0], [none#1|#1], Inner, BuildRight
 :- *(6) ColumnarToRow
 : +- Scan CarbonDatasourceHadoopRelation chetan.uniqdata[dob#847024|#847024] 
Batched: true, DirectScan: false, PushedFilters: [((cast(input[0] as date) = 
1987) or (cast(in9))], ReadSchema: [dob]
 +- *(8) HashAggregate(keys=[positionReference#847161|#847161], functions=[], 
output=[positionReference#847161|#847161])
 +- ReusedExchange [positionReference#847161|#847161], Exchange 
hashpartitioning(positionReference#847161, 200), ENSURE_REQUIREMENTS, 
[id=#195473|#195473]

at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:56)
 at org.apache.spark.sql.catalyst.trees.TreeNode.makeCopy(TreeNode.scala:468)
 at org.apache.spark.sql.catalyst.trees.TreeNode.makeCopy(TreeNode.scala:457)
 at 
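
For reference, the failing predicate can be expressed without to_date(): each 
to_date(DOB)='date' condition is equivalent to a one-day timestamp range on DOB, which 
avoids the cast that appears in the pushed filter above. A workaround sketch only (not a 
fix), assuming a CarbonData-enabled SparkSession named spark in spark-shell:

// Equivalent filter written as timestamp ranges instead of to_date() comparisons.
spark.sql("""
  select max(to_date(DOB)), min(to_date(DOB)), count(to_date(DOB)) from uniqdata
  where (DOB >= '1975-06-11 00:00:00' and DOB < '1975-06-12 00:00:00')
     or (DOB >= '1975-06-23 00:00:00' and DOB < '1975-06-24 00:00:00')
""").show(false)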

[jira] [Created] (CARBONDATA-4243) Select filter query with to_date in filter fails for table with column_meta_cache configured also having SI

2021-07-09 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-4243:
---

 Summary: Select filter query with to_date in filter fails for 
table with column_meta_cache configured also having SI
 Key: CARBONDATA-4243
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4243
 Project: CarbonData
  Issue Type: Bug
  Components: sql
Affects Versions: 2.2.0
 Environment: Spark 3.1.1
Reporter: Chetan Bhat


Create table with column_meta_cache, create secondary indexes and load data to 
table. 

Execute the Select filter query with to_date in filter.

CREATE TABLE uniqdata (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, 
DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
decimal(36,36),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
int) stored as carbondata 
TBLPROPERTIES('COLUMN_META_CACHE'='CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ');
CREATE INDEX indextable2 ON TABLE uniqdata (DOB) AS 'carbondata';
CREATE INDEX indextable3 ON TABLE uniqdata (DOJ) AS 'carbondata';
LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table 
uniqdata OPTIONS('DELIMITER'=',' , 
'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');

 

Issue: Select filter query with to_date in the filter fails for a table that has 
column_meta_cache configured and also has SI

0: jdbc:hive2://10.21.19.14:23040/default> select 
max(to_date(DOB)),min(to_date(DOB)),count(to_date(DOB)) from uniqdata where 
to_date(DOB)='1975-06-11' or to_date(DOB)='1975-06-23';
Error: org.apache.hive.service.cli.HiveSQLException: Error running query: 
org.apache.spark.sql.catalyst.errors.package$TreeNodeException: makeCopy, tree:
!BroadCastSIFilterPushJoin [none#0], [none#1], Inner, BuildRight
:- *(6) ColumnarToRow
: +- Scan CarbonDatasourceHadoopRelation chetan.uniqdata[dob#847024] Batched: 
true, DirectScan: false, PushedFilters: [((cast(input[0] as date) = 1987) or 
(cast(in9))], ReadSchema: [dob]
+- *(8) HashAggregate(keys=[positionReference#847161], functions=[], 
output=[positionReference#847161])
 +- ReusedExchange [positionReference#847161], Exchange 
hashpartitioning(positionReference#847161, 200), ENSURE_REQUIREMENTS, 
[id=#195473]

at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:361)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:263)
 at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
 at 
org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:78)
 at 
org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:62)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:43)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:263)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:258)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:272)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: 
makeCopy, tree:
!BroadCastSIFilterPushJoin [none#0], [none#1], Inner, BuildRight
:- *(6) ColumnarToRow
: +- Scan CarbonDatasourceHadoopRelation chetan.uniqdata[dob#847024] Batched: 
true, DirectScan: false, PushedFilters: [((cast(input[0] as date) = 1987) or 
(cast(in9))], ReadSchema: [dob]
+- *(8) HashAggregate(keys=[positionReference#847161], functions=[], 
output=[positionReference#847161])
 +- ReusedExchange [positionReference#847161], Exchange 
hashpartitioning(positionReference#847161, 200), ENSURE_REQUIREMENTS, 
[id=#195473]

at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:56)
 

[jira] [Updated] (CARBONDATA-4241) if the sort scope is changed to global sort and data loaded, major compaction fails

2021-07-08 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-4241:

Description: 
*Scenario 1: create table with 'table_page_size_inmb'='1', load data, set sort scope to 
global_sort, load data again and do major compaction.*

0: jdbc:hive2://10.21.19.14:23040/default> CREATE TABLE uniqdata_pagesize 
(CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ 
timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 
decimal(30,10), DECIMAL_COLUMN2 decimal(36,36),Double_COLUMN1 double, 
Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED as carbondata 
TBLPROPERTIES('table_page_size_inmb'='1');

+-+

| Result  |

+-+

+-+

No rows selected (0.229 seconds)

0: jdbc:hive2://10.21.19.14:23040/default> LOAD DATA INPATH 
'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata_pagesize 
OPTIONS('DELIMITER'=',' , 
'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');

+-+

| Segment ID  |

+-+

| 0   |

+-+

1 row selected (1.016 seconds)

0: jdbc:hive2://10.21.19.14:23040/default> alter table uniqdata_pagesize set 
tblproperties('sort_columns'='CUST_ID','sort_scope'='global_sort');

+-+

| Result  |

+-+

+-+

No rows selected (0.446 seconds)

0: jdbc:hive2://10.21.19.14:23040/default> LOAD DATA INPATH 
'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata_pagesize 
OPTIONS('DELIMITER'=',' , 
'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');

+-+

| Segment ID  |

+-+

| 1   |

+-+

1 row selected (0.767 seconds)

0: jdbc:hive2://10.21.19.14:23040/default> alter table uniqdata_pagesize 
compact 'major';

Error: org.apache.hive.service.cli.HiveSQLException: Error running query: 
org.apache.spark.sql.AnalysisException: Compaction failed. Please check logs 
for more info. Exception in compaction Compaction Failure in Merger Rdd.

    at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:361)

    at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:263)

    at 
scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)

    at 
org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:78)

    at 
org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:62)

    at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:43)

    at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:263)

    at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:258)

    at java.security.AccessController.doPrivileged(Native Method)

    at javax.security.auth.Subject.doAs(Subject.java:422)

    at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)

    at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:272)

    at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)

    at java.util.concurrent.FutureTask.run(FutureTask.java:266)

    at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)

    at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

    at java.lang.Thread.run(Thread.java:748)

Caused by: org.apache.spark.sql.AnalysisException: Compaction failed. Please 
check logs for more info. Exception in compaction Compaction Failure in Merger 
Rdd.

    at 
org.apache.spark.sql.util.CarbonException$.analysisException(CarbonException.scala:23)

    at 
org.apache.spark.sql.execution.command.management.CarbonAlterTableCompactionCommand.$anonfun$processData$3(CarbonAlterTableCompactionCommand.scala:197)

    at org.apache.carbondata.events.package$.withEvents(package.scala:27)

    at 
org.apache.spark.sql.execution.command.management.CarbonAlterTableCompactionCommand.processData(CarbonAlterTableCompactionCommand.scala:185)

    at 

[jira] [Updated] (CARBONDATA-4241) if the sort scope is changed to global sort and data loaded, major compaction fails

2021-07-08 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-4241:

Description: 
*create table and alter table set local dictionary.*

*Set sort scope to global_sort, load data and do major compaction.*
 CREATE TABLE uniqdata_sortcol_bloom_locdic (CUST_ID int,CUST_NAME 
String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 
bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
int) STORED as carbondata 
tblproperties('sort_columns'='cust_id,cust_name,dob,doj,bigint_column1');
 LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table 
uniqdata_sortcol_bloom_locdic OPTIONS('DELIMITER'=',', 
'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
 alter table uniqdata_sortcol_bloom_locdic set 
tblproperties('local_dictionary_enable'='true','local_dictionary_threshold'='1000');
 alter table uniqdata_sortcol_bloom_locdic set 
tblproperties('sort_columns'='CUST_ID','sort_scope'='global_sort');
 LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table 
uniqdata_sortcol_bloom_locdic partition(active_emui_version='xyz') 
OPTIONS('FILEHEADER'='CUST_ID,CUST_NAME ,ACTIVE_EMUI_VERSION,DOB,DOJ, 
BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1, 
Double_COLUMN2,INTEGER_COLUMN1','BAD_RECORDS_ACTION'='FORCE');
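
A consolidated spark-shell equivalent of the steps above (a sketch, assuming a 
CarbonData-enabled SparkSession named spark; the CREATE TABLE and LOAD statements are the 
ones listed in this report, and DESCRIBE FORMATTED should list the table-level sort scope 
after the alter):

// Apply the two ALTER statements from the report, then check the effective sort scope.
Seq(
  "alter table uniqdata_sortcol_bloom_locdic set tblproperties(" +
    "'local_dictionary_enable'='true','local_dictionary_threshold'='1000')",
  "alter table uniqdata_sortcol_bloom_locdic set tblproperties(" +
    "'sort_columns'='CUST_ID','sort_scope'='global_sort')"
).foreach(stmt => spark.sql(stmt))

spark.sql("describe formatted uniqdata_sortcol_bloom_locdic").show(200, false)

// The step that fails per this report, once segments exist with different sort scopes:
spark.sql("alter table uniqdata_sortcol_bloom_locdic compact 'major'")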

*Compaction fails*

*0: jdbc:hive2://10.21.19.14:23040/default> alter table 
uniqdata_sortcol_bloom_locdic compact 'major';*
 Error: org.apache.hive.service.cli.HiveSQLException: Error running query: 
org.apache.spark.sql.AnalysisException: Compaction failed. Please check logs 
for more info. Exception in compaction Compaction Failure in Merger Rdd.
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:361)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:263)
 at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
 at 
org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:78)
 at 
org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:62)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:43)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:263)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:258)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:272)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)
 Caused by: org.apache.spark.sql.AnalysisException: Compaction failed. Please 
check logs for more info. Exception in compaction Compaction Failure in Merger 
Rdd.
 at 
org.apache.spark.sql.util.CarbonException$.analysisException(CarbonException.scala:23)
 at 
org.apache.spark.sql.execution.command.management.CarbonAlterTableCompactionCommand.$anonfun$processData$3(CarbonAlterTableCompactionCommand.scala:197)
 at org.apache.carbondata.events.package$.withEvents(package.scala:27)
 at 
org.apache.spark.sql.execution.command.management.CarbonAlterTableCompactionCommand.processData(CarbonAlterTableCompactionCommand.scala:185)
 at 
org.apache.spark.sql.execution.command.AtomicRunnableCommand.$anonfun$run$3(package.scala:162)
 at 
org.apache.spark.sql.execution.command.Auditable.runWithAudit(package.scala:118)
 at 
org.apache.spark.sql.execution.command.Auditable.runWithAudit$(package.scala:114)
 at 
org.apache.spark.sql.execution.command.AtomicRunnableCommand.runWithAudit(package.scala:155)
 at 
org.apache.spark.sql.execution.command.AtomicRunnableCommand.run(package.scala:168)
 at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
 at 

[jira] [Updated] (CARBONDATA-4241) if the sort scope is changed to global sort and data loaded, major compaction fails

2021-07-08 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-4241:

Summary: if the sort scope is changed to global sort and data loaded, major 
compaction fails  (was: in 1.6.1 version table with local dictionary is created 
and in 2.2.0 if the sort scope is changed to global sort and data loaded, major 
compaction fails)

> if the sort scope is changed to global sort and data loaded, major compaction 
> fails
> ---
>
> Key: CARBONDATA-4241
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4241
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 2.2.0
> Environment: Spark 2.3.2 Carbon 1.6.1 , Spark 3.1.1 Carbon 2.2.0
>Reporter: Chetan Bhat
>Priority: Major
>
> *In 1.6.1 version create table and alter table set local dictionary.*
>  CREATE TABLE uniqdata_sortcol_bloom_locdic (CUST_ID int,CUST_NAME 
> String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, 
> BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), 
> DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 
> double,INTEGER_COLUMN1 int) STORED as carbondata 
> tblproperties('sort_columns'='cust_id,cust_name,dob,doj,bigint_column1');
>  LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table 
> uniqdata_sortcol_bloom_locdic OPTIONS('DELIMITER'=',', 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
>  alter table uniqdata_sortcol_bloom_locdic set 
> tblproperties('local_dictionary_enable'='true','local_dictionary_threshold'='1000');
> *In 2.2.0 set sort scope to global_sort, load data and do major compaction.*
>  alter table uniqdata_sortcol_bloom_locdic set 
> tblproperties('sort_columns'='CUST_ID','sort_scope'='global_sort');
>  LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table 
> uniqdata_sortcol_bloom_locdic partition(active_emui_version='xyz') 
> OPTIONS('FILEHEADER'='CUST_ID,CUST_NAME ,ACTIVE_EMUI_VERSION,DOB,DOJ, 
> BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1, 
> Double_COLUMN2,INTEGER_COLUMN1','BAD_RECORDS_ACTION'='FORCE');
> *0: jdbc:hive2://10.21.19.14:23040/default> alter table 
> uniqdata_sortcol_bloom_locdic compact 'major';*
>  Error: org.apache.hive.service.cli.HiveSQLException: Error running query: 
> org.apache.spark.sql.AnalysisException: Compaction failed. Please check logs 
> for more info. Exception in compaction Compaction Failure in Merger Rdd.
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:361)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:263)
>  at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:78)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:62)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:43)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:263)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:258)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:272)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
>  Caused by: org.apache.spark.sql.AnalysisException: Compaction failed. Please 
> check logs for more info. Exception in compaction Compaction Failure in 
> Merger Rdd.
>  at 
> org.apache.spark.sql.util.CarbonException$.analysisException(CarbonException.scala:23)
>  at 
> 

[jira] [Updated] (CARBONDATA-4241) in 1.6.1 version table with local dictionary is created and in 2.2.0 if the sort scope is changed to global sort and data loaded, major compaction fails

2021-07-08 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-4241:

Description: 
*In 1.6.1 version create table and alter table set local dictionary.*
 CREATE TABLE uniqdata_sortcol_bloom_locdic (CUST_ID int,CUST_NAME 
String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 
bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
int) STORED as carbondata 
tblproperties('sort_columns'='cust_id,cust_name,dob,doj,bigint_column1');
 LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table 
uniqdata_sortcol_bloom_locdic OPTIONS('DELIMITER'=',', 
'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
 alter table uniqdata_sortcol_bloom_locdic set 
tblproperties('local_dictionary_enable'='true','local_dictionary_threshold'='1000');

*In 2.2.0 set sort scope to global_sort, load data and do major compaction.*
 alter table uniqdata_sortcol_bloom_locdic set 
tblproperties('sort_columns'='CUST_ID','sort_scope'='global_sort');
 LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table 
uniqdata_sortcol_bloom_locdic partition(active_emui_version='xyz') 
OPTIONS('FILEHEADER'='CUST_ID,CUST_NAME ,ACTIVE_EMUI_VERSION,DOB,DOJ, 
BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1, 
Double_COLUMN2,INTEGER_COLUMN1','BAD_RECORDS_ACTION'='FORCE');

*0: jdbc:hive2://10.21.19.14:23040/default> alter table 
uniqdata_sortcol_bloom_locdic compact 'major';*
 Error: org.apache.hive.service.cli.HiveSQLException: Error running query: 
org.apache.spark.sql.AnalysisException: Compaction failed. Please check logs 
for more info. Exception in compaction Compaction Failure in Merger Rdd.
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:361)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:263)
 at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
 at 
org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:78)
 at 
org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:62)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:43)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:263)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:258)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:272)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)
 Caused by: org.apache.spark.sql.AnalysisException: Compaction failed. Please 
check logs for more info. Exception in compaction Compaction Failure in Merger 
Rdd.
 at 
org.apache.spark.sql.util.CarbonException$.analysisException(CarbonException.scala:23)
 at 
org.apache.spark.sql.execution.command.management.CarbonAlterTableCompactionCommand.$anonfun$processData$3(CarbonAlterTableCompactionCommand.scala:197)
 at org.apache.carbondata.events.package$.withEvents(package.scala:27)
 at 
org.apache.spark.sql.execution.command.management.CarbonAlterTableCompactionCommand.processData(CarbonAlterTableCompactionCommand.scala:185)
 at 
org.apache.spark.sql.execution.command.AtomicRunnableCommand.$anonfun$run$3(package.scala:162)
 at 
org.apache.spark.sql.execution.command.Auditable.runWithAudit(package.scala:118)
 at 
org.apache.spark.sql.execution.command.Auditable.runWithAudit$(package.scala:114)
 at 
org.apache.spark.sql.execution.command.AtomicRunnableCommand.runWithAudit(package.scala:155)
 at 
org.apache.spark.sql.execution.command.AtomicRunnableCommand.run(package.scala:168)
 at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
 at 

[jira] [Created] (CARBONDATA-4241) in 1.6.1 version table with local dictionary is created and in 2.2.0 if the sort scope is changed to global sort and data loaded, major compaction fails

2021-07-08 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-4241:
---

 Summary: in 1.6.1 version table with local dictionary is created 
and in 2.2.0 if the sort scope is changed to global sort and data loaded, major 
compaction fails
 Key: CARBONDATA-4241
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4241
 Project: CarbonData
  Issue Type: Bug
  Components: data-load
Affects Versions: 2.2.0
 Environment: Spark 2.3.2 Carbon 1.6.1 , Spark 3.1.1 Carbon 2.2.0
Reporter: Chetan Bhat


*In 1.6.1 version create table and alter table set local dictionary.*
 CREATE TABLE uniqdata_sortcol_bloom_locdic (CUST_ID int,CUST_NAME 
String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 
bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
int) STORED as carbondata 
tblproperties('sort_columns'='cust_id,cust_name,dob,doj,bigint_column1');
 LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table 
uniqdata_sortcol_bloom_locdic OPTIONS('DELIMITER'=',', 
'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
 alter table uniqdata_sortcol_bloom_locdic set 
tblproperties('local_dictionary_enable'='true','local_dictionary_threshold'='1000');
 
 *In 2.2.0 set sort scope to global_sort, load data and do major compaction.*
 alter table uniqdata_sortcol_bloom_locdic set 
tblproperties('sort_columns'='CUST_ID','sort_scope'='global_sort');
 LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table 
uniqdata_sortcol_bloom_locdic partition(active_emui_version='xyz') 
OPTIONS('FILEHEADER'='CUST_ID,CUST_NAME ,ACTIVE_EMUI_VERSION,DOB,DOJ, 
BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1, 
Double_COLUMN2,INTEGER_COLUMN1','BAD_RECORDS_ACTION'='FORCE');
 
 *0: jdbc:hive2://10.21.19.14:23040/default> alter table 
uniqdata_sortcol_bloom_locdic compact 'major';*
 Error: org.apache.hive.service.cli.HiveSQLException: Error running query: 
org.apache.spark.sql.AnalysisException: Compaction failed. Please check logs 
for more info. Exception in compaction Compaction Failure in Merger Rdd.
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:361)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:263)
 at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
 at 
org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:78)
 at 
org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:62)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:43)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:263)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:258)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:272)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)
 Caused by: org.apache.spark.sql.AnalysisException: Compaction failed. Please 
check logs for more info. Exception in compaction Compaction Failure in Merger 
Rdd.
 at 
org.apache.spark.sql.util.CarbonException$.analysisException(CarbonException.scala:23)
 at 
org.apache.spark.sql.execution.command.management.CarbonAlterTableCompactionCommand.$anonfun$processData$3(CarbonAlterTableCompactionCommand.scala:197)
 at org.apache.carbondata.events.package$.withEvents(package.scala:27)
 at 
org.apache.spark.sql.execution.command.management.CarbonAlterTableCompactionCommand.processData(CarbonAlterTableCompactionCommand.scala:185)
 at 
org.apache.spark.sql.execution.command.AtomicRunnableCommand.$anonfun$run$3(package.scala:162)
 at 
org.apache.spark.sql.execution.command.Auditable.runWithAudit(package.scala:118)
 at 

[jira] [Created] (CARBONDATA-4240) Properties present in https://github.com/apache/carbondata/blob/master/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java which are

2021-06-29 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-4240:
---

 Summary: Properties present in 
https://github.com/apache/carbondata/blob/master/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
  which are not present in open source doc
 Key: CARBONDATA-4240
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4240
 Project: CarbonData
  Issue Type: Bug
  Components: docs
Affects Versions: 2.2.0
 Environment: Open source docs
Reporter: Chetan Bhat


The properties listed below are present in 
https://github.com/apache/carbondata/blob/master/core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 but are missing from the open source doc. These properties need to be added to the open 
source documentation (a short sketch of how such properties are typically set follows the 
list).


carbon.storelocation
carbon.blocklet.size
carbon.properties.filepath
carbon.date.format
carbon.complex.delimiter.level.1
carbon.complex.delimiter.level.2
carbon.complex.delimiter.level.3
carbon.complex.delimiter.level.4
carbon.lock.class
carbon.local.dictionary.enable
carbon.local.dictionary.decoder.fallback
spark.deploy.zookeeper.url
carbon.data.file.version
spark.carbon.hive.schema.store
spark.carbon.datamanagement.driver
spark.carbon.sessionstate.classname
spark.carbon.sqlastbuilder.classname
carbon.lease.recovery.retry.count
carbon.lease.recovery.retry.interval
carbon.index.schema.storage
carbon.merge.index.in.segment
carbon.number.of.cores.while.altPartition
carbon.minor.compaction.size
enable.unsafe.columnpage
carbon.lucene.compression.mode
sort.inmemory.size.inmb
is.driver.instance
carbon.input.metrics.update.interval
carbon.use.bitset.pipe.line
is.internal.load.call
carbon.lucene.index.stop.words
carbon.load.dateformat.setlenient.enable
carbon.infilter.subquery.pushdown.enable
broadcast.record.size
carbon.indexserver.tempfolder.deletetime
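
For reference, a sketch of how such properties are typically supplied once documented: 
most of the keys above go into carbon.properties, or are set programmatically before the 
session is created; some are also dynamically settable per session with SET/RESET from 
beeline, where supported. The values below are illustrative only, not the defaults:

import org.apache.carbondata.core.util.CarbonProperties

// Keys are from the list above; values are illustrative only.
val carbonProps = CarbonProperties.getInstance()
carbonProps.addProperty("carbon.date.format", "yyyy-MM-dd")
carbonProps.addProperty("carbon.merge.index.in.segment", "true")
carbonProps.addProperty("carbon.complex.delimiter.level.1", "$")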



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CARBONDATA-4236) Documentation correctness and link issues in https://github.com/apache/carbondata/blob/master/docs/

2021-06-25 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-4236:
---

 Summary: Documentation correctness and link issues in 
https://github.com/apache/carbondata/blob/master/docs/
 Key: CARBONDATA-4236
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4236
 Project: CarbonData
  Issue Type: Bug
  Components: docs
Affects Versions: 2.2.0
 Environment: docs with content and examples verified on Spark 2.4.5 
and Spark 3.1.1 compatible carbon.
Reporter: Chetan Bhat


In the documentation link https://github.com/apache/carbondata/blob/master/docs/

Issue 1 :- 
In link -> 
https://github.com/apache/carbondata/blob/master/docs/configuration-parameters.md
 the "See detail" links does not open the target 
"http://spark.apache.org/docs/latest/rdd-programming-guide.html#rdd-persistence;
In link --> 
https://github.com/apache/carbondata/blob/master/docs/documentation.md the link 
"Apache CarbonData wiki" when clicked tries to open link 
"https://cwiki.apache.org/confluence/display/CARBONDATA/CarbonData+Home; the 
target page cant be opened. Similarly the other links in the "External 
Resources" section cant be opened due to the same error.
In link 
https://github.com/apache/carbondata/blob/master/docs/faq.md#what-are-bad-records
 the link "https://thrift.apache.org/docs/install; when clicked does not open 
the target page.
In link 
https://github.com/apache/carbondata/blob/master/docs/quick-start-guide.md when 
the "Spark website" link is clicked https://spark.apache.org/downloads.html 
page is not opened. Also in same page when the "Apache Spark Documentation" 
link is clicked the "http://spark.apache.org/docs/latest/; page is not opened.
In the link 
https://github.com/apache/carbondata/blob/master/docs/release-guide.md "Product 
Release Policy link" , "release signing guidelines" , "Apache Nexus repository" 
and "repository.apache.org" when clicked the target pages are not opening.


Issue 2:-
In link --> 
https://github.com/apache/carbondata/blob/master/docs/configuration-parameters.md
 the "To configure Ranges-based Compaction" to be changed to "To configure 
Range-based Compaction"

Issue 3:-
In link --> 
https://github.com/apache/carbondata/blob/master/docs/configuration-parameters.md
 the "Making this true degrade the LOAD performance" to be changed to "Making 
this true degrades the LOAD performance"

Issue 4 :-
In link --> 
https://github.com/apache/carbondata/blob/master/docs/configuration-parameters.md
 the "user an either set to true" to be changed to "user can either set to true"



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CARBONDATA-4235) after alter add column when user does rename operation, the select operation on struct type gives null value and children of struct gives error

2021-06-24 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-4235:

Description: 
*Queries –*

drop table if exists test_rename;

CREATE TABLE test_rename (str1 struct<a:int>, str2 struct<a:struct<b:int>>, 
str3 struct<a:struct<b:struct<c:int>>> comment 'struct', intfield int, arr1 
array<int>, arr2 array<array<int>>, arr3 array<string>, arr4 
array<struct<a:int>> comment 'array') STORED AS carbondata;

insert into test_rename values (named_struct('a', 2), named_struct('a', 
named_struct('b', 2)), named_struct('a', named_struct('b',named_struct('c', 
2))), 1,array(1,2,3), array(array(1,2),array(3,4)), array('hello','world'), 
array(named_struct('a',45)));

ALTER TABLE test_rename ADD COLUMNS(arr_1 ARRAY);

alter table test_rename change str2 str22 struct<a:struct<b:int>>;

select str22 from test_rename;

select str22.a from test_rename;

select str22.a.b from test_rename;

 

Issue: after alter add column, when the user does a rename operation, the select 
operation on the struct type gives a null value and selecting children of the struct 
gives an error.
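
A spark-shell sketch of the same sequence with the observed behaviour marked inline 
(assumes a CarbonData-enabled SparkSession named spark; the element type of arr_1 is an 
assumption since it is not shown above):

spark.sql("drop table if exists test_rename")
spark.sql("CREATE TABLE test_rename (str1 struct<a:int>, str2 struct<a:struct<b:int>>, " +
  "str3 struct<a:struct<b:struct<c:int>>> comment 'struct', intfield int, arr1 array<int>, " +
  "arr2 array<array<int>>, arr3 array<string>, arr4 array<struct<a:int>> comment 'array') " +
  "STORED AS carbondata")
spark.sql("insert into test_rename values (named_struct('a', 2), named_struct('a', named_struct('b', 2)), " +
  "named_struct('a', named_struct('b', named_struct('c', 2))), 1, array(1,2,3), " +
  "array(array(1,2), array(3,4)), array('hello','world'), array(named_struct('a',45)))")
spark.sql("ALTER TABLE test_rename ADD COLUMNS(arr_1 ARRAY<int>)")  // element type assumed
spark.sql("alter table test_rename change str2 str22 struct<a:struct<b:int>>")

spark.sql("select str22 from test_rename").show(false)      // returns null per this report
spark.sql("select str22.a from test_rename").show(false)    // children of the struct error out per this report
spark.sql("select str22.a.b from test_rename").show(false)  // fails with java.nio.BufferUnderflowException (trace below)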

 

*Issue 1 : Exception trace on executing query –*

0: jdbc:hive2://vm2:22550/> select str22.a.b from test_rename;
 INFO : Execution ID: 2465
 Error: org.apache.hive.service.cli.HiveSQLException: Error running query: 
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in 
stage 1100.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1100.0 
(TID 10353) (vm1 executor 5): java.nio.BufferUnderflowException
 at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:155)
 at 
org.apache.carbondata.core.scan.complextypes.PrimitiveQueryType.getDataObject(PrimitiveQueryType.java:166)
 at 
org.apache.carbondata.core.scan.complextypes.PrimitiveQueryType.getDataObject(PrimitiveQueryType.java:147)
 at 
org.apache.carbondata.core.scan.complextypes.PrimitiveQueryType.getDataBasedOnColumn(PrimitiveQueryType.java:141)
 at 
org.apache.carbondata.core.scan.complextypes.StructQueryType.getDataBasedOnColumn(StructQueryType.java:160)
 at 
org.apache.carbondata.core.scan.complextypes.StructQueryType.getDataBasedOnColumn(StructQueryType.java:160)
 at 
org.apache.carbondata.core.scan.collector.impl.DictionaryBasedResultCollector.fillRow(DictionaryBasedResultCollector.java:316)
 at 
org.apache.carbondata.core.scan.collector.impl.DictionaryBasedResultCollector.fillDimensionData(DictionaryBasedResultCollector.java:288)
 at 
org.apache.carbondata.core.scan.collector.impl.DictionaryBasedResultCollector.collectResultInRow(DictionaryBasedResultCollector.java:159)
 at 
org.apache.carbondata.core.scan.processor.DataBlockIterator.next(DataBlockIterator.java:110)
 at 
org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.getBatchResult(DetailQueryResultIterator.java:58)
 at 
org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:50)
 at 
org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:32)
 at 
org.apache.carbondata.core.scan.result.iterator.ChunkRowIterator.hasNext(ChunkRowIterator.java:56)
 at 
org.apache.carbondata.hadoop.CarbonRecordReader.nextKeyValue(CarbonRecordReader.java:127)
 at 
org.apache.carbondata.spark.rdd.CarbonScanRDD$$anon$1.hasNext(CarbonScanRDD.scala:557)
 at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
 at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
 at 
org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown
 Source)
 at 
org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
 at 
org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:755)
 at 
org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:345)
 at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:897)
 at 
org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:897)
 at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
 at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
 at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
 at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
 at org.apache.spark.scheduler.Task.run(Task.scala:131)
 at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:499)
 at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1554)
 at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:502)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)

Driver stacktrace:
 at 

[jira] [Created] (CARBONDATA-4235) after alter add column when user does rename operation, the select operation on struct type gives null value and children of struct gives error

2021-06-24 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-4235:
---

 Summary: after alter add column when user does rename operation, 
the select operation on struct type gives null value and children of struct 
gives error
 Key: CARBONDATA-4235
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4235
 Project: CarbonData
  Issue Type: Bug
  Components: sql
Affects Versions: 2.2.0
 Environment: Spark 3.1.1, Spark 2.4.5
Reporter: Chetan Bhat


*Queries –*

drop table if exists test_rename;

CREATE TABLE test_rename (str1 struct<a:int>, str2 struct<a:struct<b:int>>, 
str3 struct<a:struct<b:struct<c:int>>> comment 'struct', intfield int, arr1 
array<int>, arr2 array<array<int>>, arr3 array<string>, arr4 
array<struct<a:int>> comment 'array') STORED AS carbondata;

insert into test_rename values (named_struct('a', 2), named_struct('a', 
named_struct('b', 2)), named_struct('a', named_struct('b',named_struct('c', 
2))), 1,array(1,2,3), array(array(1,2),array(3,4)), array('hello','world'), 
array(named_struct('a',45)));

ALTER TABLE test_rename ADD COLUMNS(arr_1 ARRAY);

alter table test_rename change str2 str22 struct<a:struct<b:int>>;

select str22 from test_rename;

select str22.a from test_rename;

select str22.a.b from test_rename;

 

Issue: after alter add column, when the user does a rename operation, the select 
operation on the struct type gives a null value and selecting children of the struct 
gives an error.

 

*Exception trace on executing query –*

0: jdbc:hive2://vm2:22550/> select str22.a.b from test_rename;
 INFO : Execution ID: 2465
 Error: org.apache.hive.service.cli.HiveSQLException: Error running query: 
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in 
stage 1100.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1100.0 
(TID 10353) (vm1 executor 5): java.nio.BufferUnderflowException
 at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:155)
 at 
org.apache.carbondata.core.scan.complextypes.PrimitiveQueryType.getDataObject(PrimitiveQueryType.java:166)
 at 
org.apache.carbondata.core.scan.complextypes.PrimitiveQueryType.getDataObject(PrimitiveQueryType.java:147)
 at 
org.apache.carbondata.core.scan.complextypes.PrimitiveQueryType.getDataBasedOnColumn(PrimitiveQueryType.java:141)
 at 
org.apache.carbondata.core.scan.complextypes.StructQueryType.getDataBasedOnColumn(StructQueryType.java:160)
 at 
org.apache.carbondata.core.scan.complextypes.StructQueryType.getDataBasedOnColumn(StructQueryType.java:160)
 at 
org.apache.carbondata.core.scan.collector.impl.DictionaryBasedResultCollector.fillRow(DictionaryBasedResultCollector.java:316)
 at 
org.apache.carbondata.core.scan.collector.impl.DictionaryBasedResultCollector.fillDimensionData(DictionaryBasedResultCollector.java:288)
 at 
org.apache.carbondata.core.scan.collector.impl.DictionaryBasedResultCollector.collectResultInRow(DictionaryBasedResultCollector.java:159)
 at 
org.apache.carbondata.core.scan.processor.DataBlockIterator.next(DataBlockIterator.java:110)
 at 
org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.getBatchResult(DetailQueryResultIterator.java:58)
 at 
org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:50)
 at 
org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:32)
 at 
org.apache.carbondata.core.scan.result.iterator.ChunkRowIterator.hasNext(ChunkRowIterator.java:56)
 at 
org.apache.carbondata.hadoop.CarbonRecordReader.nextKeyValue(CarbonRecordReader.java:127)
 at 
org.apache.carbondata.spark.rdd.CarbonScanRDD$$anon$1.hasNext(CarbonScanRDD.scala:557)
 at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
 at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
 at 
org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown
 Source)
 at 
org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
 at 
org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:755)
 at 
org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:345)
 at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:897)
 at 
org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:897)
 at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
 at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
 at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
 at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
 at org.apache.spark.scheduler.Task.run(Task.scala:131)
 at 
org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:499)
 at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1554)
 at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:502)
 at 

[jira] [Created] (CARBONDATA-4209) import processing class in https://carbondata.apache.org/streaming-guide.html is wrong

2021-06-14 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-4209:
---

 Summary: import processing class in 
https://carbondata.apache.org/streaming-guide.html is wrong
 Key: CARBONDATA-4209
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4209
 Project: CarbonData
  Issue Type: Bug
  Components: docs
Affects Versions: 2.1.1
 Environment: Spark 3.1.1
Reporter: Chetan Bhat


In the open source doc link  
[https://carbondata.apache.org/streaming-guide.html]

The import statement import org.apache.spark.sql.streaming.{ProcessingTime, StreamingQuery} 
should be replaced by import org.apache.spark.sql.streaming.Trigger.ProcessingTime, 
as the earlier import of ProcessingTime does not compile on Spark 3.1.1:

 

scala> import org.apache.spark.sql.streaming.ProcessingTime
:23: error: object ProcessingTime is not a member of package 
org.apache.spark.sql.streaming
 import org.apache.spark.sql.streaming.ProcessingTime
 ^

scala> import org.apache.spark.sql.streaming.Trigger.ProcessingTime
import org.apache.spark.sql.streaming.Trigger.ProcessingTime
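
A minimal spark-shell sketch of the corrected import in use. The built-in rate source and 
console sink are used here only so the snippet runs without a Carbon table, and the trigger 
interval is illustrative:

import org.apache.spark.sql.streaming.{StreamingQuery, Trigger}

// Trigger.ProcessingTime replaces the removed ProcessingTime class.
val df = spark.readStream.format("rate").option("rowsPerSecond", "1").load()

val query: StreamingQuery = df.writeStream
  .format("console")
  .trigger(Trigger.ProcessingTime("5 seconds"))
  .start()

query.awaitTermination(20000)  // let it run for about 20 seconds
query.stop()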

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CARBONDATA-4200) query with AND filter and query on partition table not hitting SI from presto cli

2021-06-06 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-4200:
---

 Summary: query with AND filter and query on partition table not 
hitting SI from presto cli
 Key: CARBONDATA-4200
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4200
 Project: CarbonData
  Issue Type: Bug
  Components: presto-integration
Affects Versions: 2.1.1
 Environment: Spark 2.4.5 , Presto 316/333
Reporter: Chetan Bhat


[Steps] :-

Issue 1 : - query with AND filter not hitting SI from presto cli.

Queries executed from spark sql/beeline - 

drop table if exists uniqdata;

CREATE TABLE uniqdata (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, 
DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
decimal(36,36),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
int) stored as carbondata ;

create index uniqdata_index on table uniqdata(cust_name) as 'carbondata' 
properties ('sort_scope'='global_sort', 'Global_sort_partitions'='3');

create index uniqdata_index1 on table uniqdata(ACTIVE_EMUI_VERSION) as 
'carbondata' properties ('sort_scope'='global_sort', 
'Global_sort_partitions'='3');

LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table 
uniqdata OPTIONS('DELIMITER'=',' , 
'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');

Queries executed from presto cli - 

select * from uniqdata where cust_name='CUST_NAME_2' AND 
ACTIVE_EMUI_VERSION='ACTIVE_EMUI_VERSION_2'; show metacache on table 
uniqdata;

 

Issue 2 - query on partition table does not hit SI

Queries executed from spark sql/beeline - 

drop table carbon_test;

create table carbon_test(id int,name string)PARTITIONED BY(record_date int) 
stored as carbondata TBLPROPERTIES('SORT_COLUMNS'='id','SORT_SCOPE'='NO_SORT');

create index uniq1 on table carbon_test (name) as 'carbondata';

insert into table carbon_test partition(record_date) select 
1,'kim',unix_timestamp('2018-02-05','-MM-dd') as record_date ;

Queries executed from presto cli - 

select * from carbon_test where name='kim'; show metacache on table carbon_test;
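
As a cross-check from the Spark side, EXPLAIN can be used to see whether the secondary 
index is picked up there; when it is, the plan contains an SI join node (for example 
BroadCastSIFilterPushJoin, as seen in other reports above). A sketch, assuming a 
CarbonData-enabled SparkSession named spark:

// Issue 1: filter on the two indexed columns
spark.sql("explain extended select * from uniqdata where cust_name='CUST_NAME_2' " +
  "AND ACTIVE_EMUI_VERSION='ACTIVE_EMUI_VERSION_2'").show(false)

// Issue 2: filter on the indexed column of the partitioned table
spark.sql("explain extended select * from carbon_test where name='kim'").show(false)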



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CARBONDATA-4194) read from presto session throws error after delete operation from complex table from spark session

2021-05-28 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-4194:
---

 Summary: read from presto session throws error after delete 
operation from complex table from spark session
 Key: CARBONDATA-4194
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4194
 Project: CarbonData
  Issue Type: Bug
  Components: presto-integration
Affects Versions: 2.1.1
 Environment: Spark 2.4.5, Presto SQL 316
Reporter: Chetan Bhat


*Queries executed -* 

From Spark session create table with complex types, load data to table and 
delete data from table.

create table Struct_com19_PR4031_009 (CUST_ID string, YEAR int, MONTH int, AGE 
int, GENDER string, EDUCATED string, IS_MARRIED string, 
STRUCT_INT_DOUBLE_STRING_DATE 
struct,CARD_COUNT
 int,DEBIT_COUNT int, CREDIT_COUNT int, DEPOSIT double, HQ_DEPOSIT 
decimal(20,3)) stored as carbondata;
LOAD DATA INPATH 'hdfs://hacluster/chetan/Struct.csv' INTO table 
Struct_com19_PR4031_009 options ('DELIMITER'=',', 'QUOTECHAR'='"', 
'FILEHEADER'='CUST_ID,YEAR,MONTH,AGE,GENDER,EDUCATED,IS_MARRIED,STRUCT_INT_DOUBLE_STRING_DATE,CARD_COUNT,DEBIT_COUNT,CREDIT_COUNT,DEPOSIT,HQ_DEPOSIT','COMPLEX_DELIMITER_LEVEL_1'='$');
delete from Struct_com19_PR4031_009 where EDUCATED='MS';
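
After the delete, the data can be cross-checked from the Spark session itself before 
reading from Presto (a sketch, assuming a CarbonData-enabled SparkSession named spark):

spark.sql("select count(*) from Struct_com19_PR4031_009").show()
// Rows matching the delete predicate; expected to be 0 after the delete above.
spark.sql("select count(*) from Struct_com19_PR4031_009 where EDUCATED='MS'").show()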

 

From Presto CLI execute the select queries.

select * from Struct_com19_PR4031_009 limit 1;
select count(*) from Struct_com19_PR4031_009;

 

*Issue:* read from presto session throws an error after a delete operation on the 
complex-type table from the spark session.

presto:ranjan> select * from Struct_com19_PR4031_009 limit 1;

Query 20210528_075917_1_swzys, FAILED, 1 node
Splits: 18 total, 0 done (0.00%)
0:00 [0 rows, 0B] [0 rows/s, 0B/s]

Query 20210528_075917_1_swzys failed: Error in Reading Data from Carbondata

 

*Log -*

org.apache.carbondata.processing.loading.exception.CarbonDataLoadingException: 
Error in Reading Data from Carbondata 
 at 
org.apache.carbondata.presto.CarbondataPageSource$CarbondataBlockLoader.load(CarbondataPageSource.java:491)
 at 
org.apache.carbondata.presto.CarbondataPageSource$CarbondataBlockLoader.load(CarbondataPageSource.java:467)
 at io.prestosql.spi.block.LazyBlock.assureLoaded(LazyBlock.java:276)
 at io.prestosql.spi.block.LazyBlock.getLoadedBlock(LazyBlock.java:267)
 at io.prestosql.spi.Page.getLoadedPage(Page.java:261)
 at 
io.prestosql.operator.TableScanOperator.getOutput(TableScanOperator.java:283)
 at io.prestosql.operator.Driver.processInternal(Driver.java:379)
 at io.prestosql.operator.Driver.lambda$processFor$8(Driver.java:283)
 at io.prestosql.operator.Driver.tryWithLock(Driver.java:675)
 at io.prestosql.operator.Driver.processFor(Driver.java:276)
 at 
io.prestosql.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1075)
 at 
io.prestosql.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163)
 at 
io.prestosql.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:484)
 at io.prestosql.$gen.Presto_31620210526_073226_1.run(Unknown Source)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: 
java.lang.ClassCastException
 at 
org.apache.carbondata.core.datastore.chunk.impl.DimensionRawColumnChunk.convertToDimColDataChunkAndFillVector(DimensionRawColumnChunk.java:140)
 at 
org.apache.carbondata.core.scan.scanner.LazyPageLoader.loadPage(LazyPageLoader.java:75)
 at 
org.apache.carbondata.core.scan.result.vector.impl.CarbonColumnVectorImpl.loadPage(CarbonColumnVectorImpl.java:531)
 at 
org.apache.carbondata.presto.CarbondataPageSource$CarbondataBlockLoader.load(CarbondataPageSource.java:483)
 ... 16 more
Caused by: java.lang.RuntimeException: java.lang.ClassCastException
 at 
org.apache.carbondata.core.datastore.chunk.impl.DimensionRawColumnChunk.convertToDimColDataChunkAndFillVector(DimensionRawColumnChunk.java:140)
 at 
org.apache.carbondata.core.scan.scanner.LazyPageLoader.loadPage(LazyPageLoader.java:75)
 at 
org.apache.carbondata.core.scan.result.vector.impl.CarbonColumnVectorImpl.loadPage(CarbonColumnVectorImpl.java:531)
 at 
org.apache.carbondata.core.datastore.page.encoding.compress.DirectCompressCodec$3.decodeAndFillVector(DirectCompressCodec.java:277)
 at 
org.apache.carbondata.core.datastore.page.encoding.compress.DirectCompressCodec$2.decodeAndFillVector(DirectCompressCodec.java:158)
 at 
org.apache.carbondata.core.datastore.chunk.reader.dimension.v3.DimensionChunkReaderV3.decodeDimensionByMeta(DimensionChunkReaderV3.java:260)
 at 
org.apache.carbondata.core.datastore.chunk.reader.dimension.v3.DimensionChunkReaderV3.decodeDimension(DimensionChunkReaderV3.java:307)
 at 

[jira] [Created] (CARBONDATA-4184) alter table Set TBLPROPERTIES for RANGE_COLUMN sets unsupported datatype(complex_datatypes/Binary/Boolean/Decimal) as RANGE_COLUMN

2021-05-12 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-4184:
---

 Summary: alter table Set TBLPROPERTIES for RANGE_COLUMN sets 
unsupported datatype(complex_datatypes/Binary/Boolean/Decimal) as RANGE_COLUMN
 Key: CARBONDATA-4184
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4184
 Project: CarbonData
  Issue Type: Bug
  Components: data-query
Affects Versions: 2.1.1
 Environment: Spark 2.4.5
Reporter: Chetan Bhat


[Steps] :-

From Spark Beeline/SQL/Submit/Shell the queries are executed

DROP TABLE IF EXISTS alter_array;
CREATE TABLE alter_array(intField INT, arr1 array) STORED AS carbondata;
ALTER TABLE alter_array SET TBLPROPERTIES('RANGE_COLUMN'='arr1');
desc formatted alter_array;

DROP TABLE IF EXISTS alter_struct;
create table alter_struct(roll int, struct1 struct) 
STORED AS carbondata;
ALTER TABLE alter_struct SET TBLPROPERTIES('RANGE_COLUMN'='struct1');
desc formatted alter_struct;

DROP TABLE IF EXISTS alter_map;
create table alter_map(roll int, map1 map) STORED AS carbondata;
ALTER TABLE alter_map SET TBLPROPERTIES('RANGE_COLUMN'='map1');
desc formatted alter_map;

DROP TABLE IF EXISTS alter_boolean;
create table alter_boolean(roll int, bool1 boolean) STORED AS carbondata;
ALTER TABLE alter_boolean SET TBLPROPERTIES('RANGE_COLUMN'='bool1');
desc formatted alter_boolean;

DROP TABLE IF EXISTS alter_binary;
create table alter_binary(roll int, bin1 binary) STORED AS carbondata;
ALTER TABLE alter_binary SET TBLPROPERTIES('RANGE_COLUMN'='bin1');
desc formatted alter_binary;

DROP TABLE IF EXISTS alter_decimal;
create table alter_decimal(roll int, dec1 decimal(10,5)) STORED AS carbondata;
ALTER TABLE alter_decimal SET TBLPROPERTIES('RANGE_COLUMN'='dec1');
desc formatted alter_decimal;

 [Actual Issue] : - alter table Set TBLPROPERTIES for RANGE_COLUMN sets 
unsupported datatype(complex_datatypes/Binary/Boolean/Decimal) as RANGE_COLUMN

 

[Expected Result] :- Validation should be provided when alter table Set TBLPROPERTIES tries to set RANGE_COLUMN to an unsupported datatype (complex_datatypes/Binary/Boolean/Decimal)
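For contrast, a minimal sketch of the usage that should continue to be accepted (table name alter_int is illustrative; assuming primitive types such as INT and STRING remain supported for RANGE_COLUMN):

DROP TABLE IF EXISTS alter_int;
create table alter_int(roll int, name string) STORED AS carbondata;
ALTER TABLE alter_int SET TBLPROPERTIES('RANGE_COLUMN'='roll');
desc formatted alter_int;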

 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CARBONDATA-4178) if we insert different no of data in array complex datatype. then query filter on increment or last index gives error from presto

2021-05-07 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-4178:

Priority: Minor  (was: Major)

> if we insert different no of data in array complex datatype. then query 
> filter on increment or last index gives error from presto
> -
>
> Key: CARBONDATA-4178
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4178
> Project: CarbonData
>  Issue Type: Bug
>  Components: presto-integration
>Affects Versions: 2.1.1
> Environment: Spark 2.4.5. Presto 316
>Reporter: Chetan Bhat
>Priority: Minor
>
> Steps -
> From Spark session the table creation and insert operations are executed as 
> shown below
>  
> drop table if exists complextable;
>  create table complextable (id string, country array, name string) 
> stored as carbondata;
> insert into complextable select 1, array('china', 'us'), 'b' union all select 
> 2, array ('pak', 'india', 'china'), 'v';
> From presto cli the below  queries are executed.
>  select * from complextable ;
>  select * from complextable where country[3]='china';
>  
> Issue - if we insert different no of data in array complex datatype. then 
> query filter on increment or last index gives error from presto
> presto:ranjan> select * from complextable ;
>  id | country | name
>  ----+---------------------+------
>  1 | [china, us] | b
>  2 | [pak, india, china] | v
>  (2 rows)
> Query 20210507_072934_2_d2cjp, FINISHED, 1 node
>  Splits: 18 total, 18 done (100.00%)
>  0:07 [2 rows, 95B] [0 rows/s, 12B/s]
> presto:ranjan> select * from complextable where country[1]='pak';
>  id | country | name
>  ----+---------------------+------
>  2 | [pak, india, china] | v
>  (1 row)
> Query 20210507_072948_3_d2cjp, FINISHED, 1 node
>  Splits: 18 total, 18 done (100.00%)
>  0:06 [2 rows, 0B] [0 rows/s, 0B/s]
> presto:ranjan> select * from complextable where country[3]='china';
>  id | country | name
>  ----+---------------------+------
>  2 | [pak, india, china] | v
>  (1 row)
> Query 20210507_073007_4_d2cjp, FAILED, 1 node
>  Splits: 18 total, 1 done (5.56%)
>  0:05 [1 rows, 0B] [0 rows/s, 0B/s]
> Query 20210507_073007_4_d2cjp failed: Array subscript out of bounds
>  
> Expected - Error should not be thrown for the query executed in hetu cli and 
> it show correct resultset as in spark session.
> 0: jdbc:hive2://10.20.254.208:23040/default> select * from complextable where 
> country[2]='china';
> +----+--------------------------+------+
> | id | country                  | name |
> +----+--------------------------+------+
> | 2  | ["pak","india","china"]  | v    |
> +----+--------------------------+------+
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CARBONDATA-4178) if we insert different no of data in array complex datatype. then query filter on increment or last index gives error from presto

2021-05-07 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-4178:

Description: 
Steps -

From Spark session the table creation and insert operations are executed as 
shown below

 

drop table if exists complextable;
create table complextable (id string, country array<string>, name string) 
stored as carbondata;

insert into complextable select 1, array('china', 'us'), 'b' union all select 
2, array ('pak', 'india', 'china'), 'v';

From presto cli the below queries are executed.


 select * from complextable ;
 select * from complextable where country[3]='china';

 

Issue - if we insert arrays with different numbers of elements into an array complex datatype column, then a query filter on a higher or the last index gives an error from presto

presto:ranjan> select * from complextable ;
 id | country | name
 ----+---------------------+------
 1 | [china, us] | b
 2 | [pak, india, china] | v
 (2 rows)

Query 20210507_072934_2_d2cjp, FINISHED, 1 node
 Splits: 18 total, 18 done (100.00%)
 0:07 [2 rows, 95B] [0 rows/s, 12B/s]

presto:ranjan> select * from complextable where country[1]='pak';
 id | country | name
 ----+---------------------+------
 2 | [pak, india, china] | v
 (1 row)

Query 20210507_072948_3_d2cjp, FINISHED, 1 node
 Splits: 18 total, 18 done (100.00%)
 0:06 [2 rows, 0B] [0 rows/s, 0B/s]

presto:ranjan> select * from complextable where country[3]='china';
 id | country | name
 ----+---------------------+------
 2 | [pak, india, china] | v
 (1 row)

Query 20210507_073007_4_d2cjp, FAILED, 1 node
 Splits: 18 total, 1 done (5.56%)
 0:05 [1 rows, 0B] [0 rows/s, 0B/s]

Query 20210507_073007_4_d2cjp failed: Array subscript out of bounds

 

Expected - An error should not be thrown for the query executed in the hetu cli, and it should show the correct result set as in the spark session.

0: jdbc:hive2://10.20.254.208:23040/default> select * from complextable where 
country[2]='china';
+----+--------------------------+------+
| id | country                  | name |
+----+--------------------------+------+
| 2  | ["pak","india","china"]  | v    |
+----+--------------------------+------+

 

  was:
Steps -

From presto cli the below queries are executed.

drop table if exists complextable;
create table complextable (id string, country array, name string) 
stored as carbondata;
insert into complextable select 1, array('china', 'us'), 'b' union all select 
2, array ('pak', 'india', 'china'), 'v';
select * from complextable ;
select * from complextable where country[3]='china';

 

Issue - if we insert different no of data in array complex datatype. then query 
filter on increment or last index gives error from presto

presto:ranjan> select * from complextable ;
 id | country | name
----+---------------------+------
 1 | [china, us] | b
 2 | [pak, india, china] | v
(2 rows)

Query 20210507_072934_2_d2cjp, FINISHED, 1 node
Splits: 18 total, 18 done (100.00%)
0:07 [2 rows, 95B] [0 rows/s, 12B/s]

presto:ranjan> select * from complextable where country[1]='pak';
 id | country | name
----+---------------------+------
 2 | [pak, india, china] | v
(1 row)

Query 20210507_072948_3_d2cjp, FINISHED, 1 node
Splits: 18 total, 18 done (100.00%)
0:06 [2 rows, 0B] [0 rows/s, 0B/s]

presto:ranjan> select * from complextable where country[3]='china';
 id | country | name
----+---------------------+------
 2 | [pak, india, china] | v
(1 row)

Query 20210507_073007_4_d2cjp, FAILED, 1 node
Splits: 18 total, 1 done (5.56%)
0:05 [1 rows, 0B] [0 rows/s, 0B/s]

Query 20210507_073007_4_d2cjp failed: Array subscript out of bounds

 

Expected - Error should not be thrown for the query executed in hetu cli and it 
show correct resultset as in spark session.

0: jdbc:hive2://10.20.254.208:23040/default> select * from complextable where 
country[2]='china';
+----+--------------------------+------+
| id | country                  | name |
+----+--------------------------+------+
| 2  | ["pak","india","china"]  | v    |
+----+--------------------------+------+

 


> if we insert different no of data in array complex datatype. then query 
> filter on increment or last index gives error from presto
> -
>
> Key: CARBONDATA-4178
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4178
> Project: CarbonData
>  Issue Type: Bug
>  Components: presto-integration
>Affects Versions: 2.1.1
> Environment: Spark 2.4.5. Presto 316
>Reporter: Chetan Bhat
>Priority: Major
>
> Steps -
> From Spark session the table creation and insert operations are executed as 
> shown below
>  
> drop table if exists complextable;
>  create table complextable (id string, country array, name string) 
> stored as carbondata;
> insert into complextable select 1, 

[jira] [Created] (CARBONDATA-4178) if we insert different no of data in array complex datatype. then query filter on increment or last index gives error from presto

2021-05-07 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-4178:
---

 Summary: if we insert different no of data in array complex 
datatype. then query filter on increment or last index gives error from presto
 Key: CARBONDATA-4178
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4178
 Project: CarbonData
  Issue Type: Bug
  Components: presto-integration
Affects Versions: 2.1.1
 Environment: Spark 2.4.5. Presto 316
Reporter: Chetan Bhat


Steps -

From presto cli the below queries are executed.

drop table if exists complextable;
create table complextable (id string, country array<string>, name string) 
stored as carbondata;
insert into complextable select 1, array('china', 'us'), 'b' union all select 
2, array ('pak', 'india', 'china'), 'v';
select * from complextable ;
select * from complextable where country[3]='china';

 

Issue - if we insert arrays with different numbers of elements into an array complex datatype column, then a query filter on a higher or the last index gives an error from presto

presto:ranjan> select * from complextable ;
 id | country | name
----+---------------------+------
 1 | [china, us] | b
 2 | [pak, india, china] | v
(2 rows)

Query 20210507_072934_2_d2cjp, FINISHED, 1 node
Splits: 18 total, 18 done (100.00%)
0:07 [2 rows, 95B] [0 rows/s, 12B/s]

presto:ranjan> select * from complextable where country[1]='pak';
 id | country | name
----+---------------------+------
 2 | [pak, india, china] | v
(1 row)

Query 20210507_072948_3_d2cjp, FINISHED, 1 node
Splits: 18 total, 18 done (100.00%)
0:06 [2 rows, 0B] [0 rows/s, 0B/s]

presto:ranjan> select * from complextable where country[3]='china';
 id | country | name
----+---------------------+------
 2 | [pak, india, china] | v
(1 row)

Query 20210507_073007_4_d2cjp, FAILED, 1 node
Splits: 18 total, 1 done (5.56%)
0:05 [1 rows, 0B] [0 rows/s, 0B/s]

Query 20210507_073007_4_d2cjp failed: Array subscript out of bounds

 

Expected - An error should not be thrown for the query executed in the hetu cli, and it should show the correct result set as in the spark session.

0: jdbc:hive2://10.20.254.208:23040/default> select * from complextable where 
country[2]='china';
+----+--------------------------+------+
| id | country                  | name |
+----+--------------------------+------+
| 2  | ["pak","india","china"]  | v    |
+----+--------------------------+------+
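A hedged Presto-side workaround (element_at is a standard Presto array function that returns NULL instead of failing when the index is out of range; it only avoids the subscript error and does not address the underlying behaviour):

select * from complextable where element_at(country, 3) = 'china';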

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CARBONDATA-4159) Insert into select in presto session throws error after delete data and alter add table executed from spark session

2021-04-01 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-4159:

Description: 
Steps -

create a table in the spark session.

load, delete and alter add column in the spark session

CREATE TABLE lsc1(id int, name string, description string,address string, note 
string) stored as carbondata 
tblproperties('sort_columns'='id,name','long_string_columns'='description,note');
 load data inpath 'hdfs://hacluster/chetan/longStringData_100rec.csv' into 
table lsc1 options('DELIMITER'=',', 'QUOTECHAR'= 
'"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='id,name,description,address,note');
 delete from lsc1 where id=99;
 alter table lsc1 add columns(id2 int);

 

Insert into select from presto session.

insert into lsc1 select * from lsc1;

 

Issue :-  Insert into select in presto session throws error 

presto:ranjan> insert into lsc1 select * from lsc1;

Query 20210401_135306_00013_fnva9, FAILED, 1 node
 Splits: 35 total, 0 done (0.00%)
 0:01 [100 rows, 6.38MB] [90 rows/s, 5.75MB/s]

*Query 20210401_135306_00013_fnva9 failed: Invalid position 50 and length 50 in 
block with 99 positions*

presto:ranjan>

 

*Log -*

java.lang.IndexOutOfBoundsException: Invalid position 50 and length 50 in block 
with 99 positionsjava.lang.IndexOutOfBoundsException: Invalid position 50 and 
length 50 in block with 99 positions at 
io.prestosql.spi.block.BlockUtil.checkValidRegion(BlockUtil.java:48) at 
io.prestosql.spi.block.DictionaryBlock.getRegion(DictionaryBlock.java:325) at 
io.prestosql.spi.Page.getRegion(Page.java:128) at 
io.prestosql.execution.buffer.PageSplitterUtil.splitPage(PageSplitterUtil.java:53)
 at 
io.prestosql.execution.buffer.PageSplitterUtil.splitPage(PageSplitterUtil.java:29)
 at 
io.prestosql.operator.TaskOutputOperator.addInput(TaskOutputOperator.java:145) 
at io.prestosql.operator.Driver.processInternal(Driver.java:384) at 
io.prestosql.operator.Driver.lambda$processFor$8(Driver.java:283) at 
io.prestosql.operator.Driver.tryWithLock(Driver.java:675) at 
io.prestosql.operator.Driver.processFor(Driver.java:276) at 
io.prestosql.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1075)
 at 
io.prestosql.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163)
 at 
io.prestosql.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:484)
 at io.prestosql.$gen.Presto_31620210401_102926_1.run(Unknown Source) at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
at java.lang.Thread.run(Thread.java:748)

  was:
Steps -

creating a table in spark session .

load ,delete and alter add column in spark session

CREATE TABLE lsc1(id int, name string, description string,address string, note 
string) stored as carbondata 
tblproperties('sort_columns'='id,name','long_string_columns'='description,note');
load data inpath 'hdfs://hacluster/chetan/longStringData_100rec.csv' into table 
lsc1 options('DELIMITER'=',', 'QUOTECHAR'= 
'"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='id,name,description,address,note');
delete from lsc1 where id=99;
alter table lsc1 add columns(id2 int);

 

Insert into select from presto session.

insert into lsc1 select * from lsc1;

 

Issue :-  Insert into select in presto session throws error 

presto:ranjan> insert into lsc1 select * from lsc1;

Query 20210401_135306_00013_fnva9, FAILED, 1 node
Splits: 35 total, 0 done (0.00%)
0:01 [100 rows, 6.38MB] [90 rows/s, 5.75MB/s]

*Query 20210401_135306_00013_fnva9 failed: Invalid position 50 and length 50 in 
block with 99 positions*

presto:ranjan>

 

*Log -*

java.lang.IndexOutOfBoundsException: Invalid position 50 and length 50 in block 
with 99 positionsjava.lang.IndexOutOfBoundsException: Invalid position 50 and 
length 50 in block with 99 positions at 
io.prestosql.spi.block.BlockUtil.checkValidRegion(BlockUtil.java:48) at 
io.prestosql.spi.block.DictionaryBlock.getRegion(DictionaryBlock.java:325) at 
io.prestosql.spi.Page.getRegion(Page.java:128) at 
io.prestosql.execution.buffer.PageSplitterUtil.splitPage(PageSplitterUtil.java:53)
 at 
io.prestosql.execution.buffer.PageSplitterUtil.splitPage(PageSplitterUtil.java:29)
 at 
io.prestosql.operator.TaskOutputOperator.addInput(TaskOutputOperator.java:145) 
at io.prestosql.operator.Driver.processInternal(Driver.java:384) at 
io.prestosql.operator.Driver.lambda$processFor$8(Driver.java:283) at 
io.prestosql.operator.Driver.tryWithLock(Driver.java:675) at 
io.prestosql.operator.Driver.processFor(Driver.java:276) at 
io.prestosql.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1075)
 at 
io.prestosql.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163)
 at 

[jira] [Created] (CARBONDATA-4159) Insert into select in presto session throws error after delete data and alter add table executed from spark session

2021-04-01 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-4159:
---

 Summary: Insert into select in presto session throws error after 
delete data and alter add table executed from spark session
 Key: CARBONDATA-4159
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4159
 Project: CarbonData
  Issue Type: Bug
  Components: presto-integration
Affects Versions: 2.1.0
 Environment: Spark 2.4.5. Presto-SQL 316
Reporter: Chetan Bhat


Steps -

create a table in the spark session.

load, delete and alter add column in the spark session

CREATE TABLE lsc1(id int, name string, description string,address string, note 
string) stored as carbondata 
tblproperties('sort_columns'='id,name','long_string_columns'='description,note');
load data inpath 'hdfs://hacluster/chetan/longStringData_100rec.csv' into table 
lsc1 options('DELIMITER'=',', 'QUOTECHAR'= 
'"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='id,name,description,address,note');
delete from lsc1 where id=99;
alter table lsc1 add columns(id2 int);

 

Insert into select from presto session.

insert into lsc1 select * from lsc1;

 

Issue :-  Insert into select in presto session throws error 

presto:ranjan> insert into lsc1 select * from lsc1;

Query 20210401_135306_00013_fnva9, FAILED, 1 node
Splits: 35 total, 0 done (0.00%)
0:01 [100 rows, 6.38MB] [90 rows/s, 5.75MB/s]

*Query 20210401_135306_00013_fnva9 failed: Invalid position 50 and length 50 in 
block with 99 positions*

presto:ranjan>

 

*Log -*

java.lang.IndexOutOfBoundsException: Invalid position 50 and length 50 in block 
with 99 positionsjava.lang.IndexOutOfBoundsException: Invalid position 50 and 
length 50 in block with 99 positions at 
io.prestosql.spi.block.BlockUtil.checkValidRegion(BlockUtil.java:48) at 
io.prestosql.spi.block.DictionaryBlock.getRegion(DictionaryBlock.java:325) at 
io.prestosql.spi.Page.getRegion(Page.java:128) at 
io.prestosql.execution.buffer.PageSplitterUtil.splitPage(PageSplitterUtil.java:53)
 at 
io.prestosql.execution.buffer.PageSplitterUtil.splitPage(PageSplitterUtil.java:29)
 at 
io.prestosql.operator.TaskOutputOperator.addInput(TaskOutputOperator.java:145) 
at io.prestosql.operator.Driver.processInternal(Driver.java:384) at 
io.prestosql.operator.Driver.lambda$processFor$8(Driver.java:283) at 
io.prestosql.operator.Driver.tryWithLock(Driver.java:675) at 
io.prestosql.operator.Driver.processFor(Driver.java:276) at 
io.prestosql.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1075)
 at 
io.prestosql.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163)
 at 
io.prestosql.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:484)
 at io.prestosql.$gen.Presto_31620210401_102926_1.run(Unknown Source) at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
at java.lang.Thread.run(Thread.java:748)
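A hedged way to narrow this down (illustrative only) is to run the read half of the statement on its own from the same Presto session; if the plain selects also fail, the problem is in the read path after the delete and alter add column rather than in the insert itself:

select count(*) from lsc1;
select * from lsc1;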



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CARBONDATA-4135) insert in partition table should fail from presto side but insert into select * in passing in partition table with single column partition table from presto side

2021-02-22 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-4135:
---

 Summary: insert in partition table should fail from presto side  
but insert into select * in passing in partition table with single column 
partition table from presto side
 Key: CARBONDATA-4135
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4135
 Project: CarbonData
  Issue Type: Bug
  Components: presto-integration
Affects Versions: 2.1.0
 Environment: Spark 2.4.5, Presto 316
Reporter: Chetan Bhat


Presto 316 version used.

*Steps :-* 

From Spark beeline execute the queries - 

0: jdbc:hive2://10.20.254.208:23040/default> drop table 
uniqdata_Partition_single;
+-+
| Result |
+-+
+-+
No rows selected (0.454 seconds)
0: jdbc:hive2://10.20.254.208:23040/default> CREATE TABLE 
uniqdata_Partition_single (CUST_ID int,CUST_NAME String, DOB timestamp, DOJ 
timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 
decimal(30,10), DECIMAL_COLUMN2 decimal(36,36),Double_COLUMN1 double, 
Double_COLUMN2 double,INTEGER_COLUMN1 int) partitioned by (ACTIVE_EMUI_VERSION 
string)stored as carbondata 
tblproperties('COLUMN_META_CACHE'='CUST_ID,CUST_NAME,DECIMAL_COLUMN2,DOJ,Double_COLUMN2,BIGINT_COLUMN2','local_dictionary_enable'='true','local_dictionary_threshold'='1000','local_dictionary_include'='ACTIVE_EMUI_VERSION')
 ;
+-+
| Result |
+-+
+-+
No rows selected (0.202 seconds)
0: jdbc:hive2://10.20.254.208:23040/default> LOAD DATA INPATH 
'hdfs://hacluster/chetan/2000_UniqData_partition.csv' into table 
uniqdata_Partition_single OPTIONS('FILEHEADER'='CUST_ID,CUST_NAME 
,ACTIVE_EMUI_VERSION,DOB,DOJ, 
BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1, 
Double_COLUMN2,INTEGER_COLUMN1','BAD_RECORDS_ACTION'='FORCE');
+-+
| Result |
+-+
+-+
No rows selected (3.471 seconds)
0: jdbc:hive2://10.20.254.208:23040/default>

 

From the presto cli the query is executed - 

presto:ranjan> insert into uniqdata_Partition_single select * from 
uniqdata_Partition_single;

 

*Issue : - insert into a partition table should fail from the presto side, but insert into select * is passing on a partition table with a single partition column from the presto side.*

presto:ranjan> insert into uniqdata_Partition_single select * from 
uniqdata_Partition_single;
INSERT: 2002 rows

Query 20210223_044320_0_ggkxh, FINISHED, 1 node
Splits: 45 total, 45 done (100.00%)
0:05 [2K rows, 206KB] [431 rows/s, 44.4KB/s]

presto:ranjan> desc uniqdata_Partition_single;
 Column | Type | Extra | Comment
---------------------+----------------+----------------+----------
 cust_id | integer | |
 cust_name | varchar | |
 dob | timestamp | |
 doj | timestamp | |
 bigint_column1 | bigint | |
 bigint_column2 | bigint | |
 decimal_column1 | decimal(30,10) | |
 decimal_column2 | decimal(36,36) | |
 double_column1 | double | |
 double_column2 | double | |
 integer_column1 | integer | |
 active_emui_version | varchar | partition key |
(12 rows)

Query 20210223_044344_1_ggkxh, FINISHED, 1 node
Splits: 19 total, 19 done (100.00%)
0:00 [12 rows, 1.07KB] [50 rows/s, 4.53KB/s]

presto:ranjan>
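A hedged follow-up check (column names follow the desc output above) to confirm that the Presto-side insert really duplicated the rows in the partitioned table:

select active_emui_version, count(*) from uniqdata_partition_single group by active_emui_version;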



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CARBONDATA-4128) Merge SQL command fails with different case for column name and also if the command is input with different case

2021-02-19 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-4128:

Summary: Merge SQL command fails with different case for column name and 
also if the command is input with different case  (was: Merge SQL command fails 
with different case for column name)

> Merge SQL command fails with different case for column name and also if the 
> command is input with different case
> 
>
> Key: CARBONDATA-4128
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4128
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 2.1.0
> Environment: Spark 2.4.5
>Reporter: Chetan Bhat
>Priority: Minor
>
> Steps:-
>  
> *Issue 1 : Merge command fails for case insensitive column name.*
> drop table if exists A;
>  drop table if exists B;
>  CREATE TABLE A(id Int, name string, description string,address string, note 
> string) stored as carbondata 
> tblproperties('long_string_columns'='description,note','table_blocksize'='1','SORT_SCOPE'='global_sort','table_page_size_inmb'='1');
>  
>  insert into A select 
> 1,"name1","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
>  insert into A select 
> 2,"name2","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
>  insert into A select 
> 3,"name3","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
>  insert into A select 
> 4,"name4","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
>  insert into A select 
> 5,"name5","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
> CREATE TABLE B(id Int, name string, description string,address string, note 
> string) stored as carbondata 
> tblproperties('long_string_columns'='description,note','table_blocksize'='1','SORT_SCOPE'='global_sort','table_page_size_inmb'='1');
>  
>  insert into B select 
> 1,"name1","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
>  insert into B select 
> 2,"name2","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
>  insert into B select 
> 3,"name3","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
>  insert into B select 
> 6,"name4","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
>  insert into B select 
> 7,"name5","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
>  --merge
>  MERGE INTO A USING B ON A.id=B.id WHEN MATCHED THEN DELETE;
>  
> Issue :- Merge SQL command fails with different case for column name
> 0: jdbc:hive2://linux-63:22550/> MERGE INTO A USING B ON A.id=B.id WHEN 
> MATCHED THEN DELETE;
>  Error: org.apache.spark.sql.AnalysisException: == Spark Parser: 
> org.apache.spark.sql.hive.FISqlParser ==
> mismatched input 'MERGE' expecting \{'(', 'SELECT', 'FROM', 'ADD', 'DESC', 
> 'EMPOWER', 'WITH', 'VALUES', 'CREATE', 'TABLE', 'INSERT', 'DELETE', 
> 'DESCRIBE', 'EXPLAIN', 'SHOW', 'USE', 'DROP', 'ALTER', 'MAP', 'SET', 'RESET', 
> 'START', 'COMMIT', 'ROLLBACK', 'REDUCE', 'REFRESH', 'CLEAR', 'CACHE', 
> 'UNCACHE', 'DFS', 'TRUNCATE', 'ANALYZE', 'LIST', 'REVOKE', 'GRANT', 'LOCK', 
> 'UNLOCK', 'MSCK', 'EXPORT', 'IMPORT', 'LOAD', 'HEALTHCHECK'}(line 1, pos 0)
> == SQL ==
>  MERGE INTO A USING B ON A.id=B.id WHEN MATCHED THEN DELETE
>  ^^^
> == Carbon Parser: org.apache.spark.sql.parser.CarbonExtensionSpark2SqlParser 
> ==
>  [1.1] failure: identifier matching regex (?i)EXPLAIN expected
> MERGE INTO A USING B ON A.id=B.id WHEN MATCHED THEN DELETE
>  ^;
>  == Antlr Parser: org.apache.spark.sql.parser.CarbonAntlrParser ==
>  
> org.apache.spark.sql.parser.CarbonSqlBaseParser$ValueExpressionDefaultContext 
> cannot be cast to 
> org.apache.spark.sql.parser.CarbonSqlBaseParser$ComparisonContext; 
> (state=,code=0)
>  0: jdbc:hive2://linux-63:22550/>
>  
> *Issue 2 : merge into command is not working as case sensitive and fails as 
> mentioned below.*
> 0: jdbc:hive2://linux1:22550/> merge into a using b on a.ID=b.ID when matched 
> then delete;
>  Error: 
> org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: 
> Parse failed! (state=,code=0)
>  0: jdbc:hive2://linux1:22550/> merge into a using b on A.ID=B.ID when 
> matched then delete;
>  Error: 
> org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: 
> Parse failed! (state=,code=0)
>  0: jdbc:hive2://linux1:22550/> merge into A using B on A.ID=B.ID when 
> matched then delete;
>  Error: 
> org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: 
> Parse failed! (state=,code=0)



--
This 

[jira] [Updated] (CARBONDATA-4128) Merge SQL command fails with different case for column name

2021-02-19 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-4128:

Description: 
Steps:-

 

Issue 1 : Merge command fails for case insensitive column.

drop table if exists A;
 drop table if exists B;
 CREATE TABLE A(id Int, name string, description string,address string, note 
string) stored as carbondata 
tblproperties('long_string_columns'='description,note','table_blocksize'='1','SORT_SCOPE'='global_sort','table_page_size_inmb'='1');
 
 insert into A select 
1,"name1","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
 insert into A select 
2,"name2","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
 insert into A select 
3,"name3","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
 insert into A select 
4,"name4","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
 insert into A select 
5,"name5","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";

CREATE TABLE B(id Int, name string, description string,address string, note 
string) stored as carbondata 
tblproperties('long_string_columns'='description,note','table_blocksize'='1','SORT_SCOPE'='global_sort','table_page_size_inmb'='1');
 
 insert into B select 
1,"name1","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
 insert into B select 
2,"name2","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
 insert into B select 
3,"name3","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
 insert into B select 
6,"name4","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
 insert into B select 
7,"name5","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
 --merge
 MERGE INTO A USING B ON A.id=B.id WHEN MATCHED THEN DELETE;

 

Issue :- Merge SQL command fails with different case for column name

0: jdbc:hive2://linux-63:22550/> MERGE INTO A USING B ON A.id=B.id WHEN MATCHED 
THEN DELETE;
 Error: org.apache.spark.sql.AnalysisException: == Spark Parser: 
org.apache.spark.sql.hive.FISqlParser ==

mismatched input 'MERGE' expecting \{'(', 'SELECT', 'FROM', 'ADD', 'DESC', 
'EMPOWER', 'WITH', 'VALUES', 'CREATE', 'TABLE', 'INSERT', 'DELETE', 'DESCRIBE', 
'EXPLAIN', 'SHOW', 'USE', 'DROP', 'ALTER', 'MAP', 'SET', 'RESET', 'START', 
'COMMIT', 'ROLLBACK', 'REDUCE', 'REFRESH', 'CLEAR', 'CACHE', 'UNCACHE', 'DFS', 
'TRUNCATE', 'ANALYZE', 'LIST', 'REVOKE', 'GRANT', 'LOCK', 'UNLOCK', 'MSCK', 
'EXPORT', 'IMPORT', 'LOAD', 'HEALTHCHECK'}(line 1, pos 0)

== SQL ==
 MERGE INTO A USING B ON A.id=B.id WHEN MATCHED THEN DELETE
 ^^^

== Carbon Parser: org.apache.spark.sql.parser.CarbonExtensionSpark2SqlParser ==
 [1.1] failure: identifier matching regex (?i)EXPLAIN expected

MERGE INTO A USING B ON A.id=B.id WHEN MATCHED THEN DELETE
 ^;
 == Antlr Parser: org.apache.spark.sql.parser.CarbonAntlrParser ==
 org.apache.spark.sql.parser.CarbonSqlBaseParser$ValueExpressionDefaultContext 
cannot be cast to 
org.apache.spark.sql.parser.CarbonSqlBaseParser$ComparisonContext; 
(state=,code=0)
 0: jdbc:hive2://linux-63:22550/>

 

*Issue 2 : the merge into command is not working with any of the different cases tried and fails as mentioned below.*

0: jdbc:hive2://linux1:22550/> merge into a using b on a.ID=b.ID when matched 
then delete;
Error: 
org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: 
Parse failed! (state=,code=0)
0: jdbc:hive2://linux1:22550/> merge into a using b on A.ID=B.ID when matched 
then delete;
Error: 
org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: 
Parse failed! (state=,code=0)
0: jdbc:hive2://linux1:22550/> merge into A using B on A.ID=B.ID when matched 
then delete;
Error: 
org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: 
Parse failed! (state=,code=0)

  was:
Steps:-

drop table if exists A;
drop table if exists B;
CREATE TABLE A(id Int, name string, description string,address string, note 
string) stored as carbondata 
tblproperties('long_string_columns'='description,note','table_blocksize'='1','SORT_SCOPE'='global_sort','table_page_size_inmb'='1');
 
insert into A select 
1,"name1","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
insert into A select 
2,"name2","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
insert into A select 
3,"name3","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
insert into A select 
4,"name4","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
insert into A select 
5,"name5","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";


CREATE TABLE B(id Int, name string, description 

[jira] [Updated] (CARBONDATA-4128) Merge SQL command fails with different case for column name

2021-02-19 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-4128:

Description: 
Steps:-

 

*Issue 1 : Merge command fails for case insensitive column name.*

drop table if exists A;
 drop table if exists B;
 CREATE TABLE A(id Int, name string, description string,address string, note 
string) stored as carbondata 
tblproperties('long_string_columns'='description,note','table_blocksize'='1','SORT_SCOPE'='global_sort','table_page_size_inmb'='1');
 
 insert into A select 
1,"name1","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
 insert into A select 
2,"name2","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
 insert into A select 
3,"name3","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
 insert into A select 
4,"name4","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
 insert into A select 
5,"name5","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";

CREATE TABLE B(id Int, name string, description string,address string, note 
string) stored as carbondata 
tblproperties('long_string_columns'='description,note','table_blocksize'='1','SORT_SCOPE'='global_sort','table_page_size_inmb'='1');
 
 insert into B select 
1,"name1","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
 insert into B select 
2,"name2","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
 insert into B select 
3,"name3","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
 insert into B select 
6,"name4","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
 insert into B select 
7,"name5","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
 --merge
 MERGE INTO A USING B ON A.id=B.id WHEN MATCHED THEN DELETE;

 

Issue :- Merge SQL command fails with different case for column name

0: jdbc:hive2://linux-63:22550/> MERGE INTO A USING B ON A.id=B.id WHEN MATCHED 
THEN DELETE;
 Error: org.apache.spark.sql.AnalysisException: == Spark Parser: 
org.apache.spark.sql.hive.FISqlParser ==

mismatched input 'MERGE' expecting \{'(', 'SELECT', 'FROM', 'ADD', 'DESC', 
'EMPOWER', 'WITH', 'VALUES', 'CREATE', 'TABLE', 'INSERT', 'DELETE', 'DESCRIBE', 
'EXPLAIN', 'SHOW', 'USE', 'DROP', 'ALTER', 'MAP', 'SET', 'RESET', 'START', 
'COMMIT', 'ROLLBACK', 'REDUCE', 'REFRESH', 'CLEAR', 'CACHE', 'UNCACHE', 'DFS', 
'TRUNCATE', 'ANALYZE', 'LIST', 'REVOKE', 'GRANT', 'LOCK', 'UNLOCK', 'MSCK', 
'EXPORT', 'IMPORT', 'LOAD', 'HEALTHCHECK'}(line 1, pos 0)

== SQL ==
 MERGE INTO A USING B ON A.id=B.id WHEN MATCHED THEN DELETE
 ^^^

== Carbon Parser: org.apache.spark.sql.parser.CarbonExtensionSpark2SqlParser ==
 [1.1] failure: identifier matching regex (?i)EXPLAIN expected

MERGE INTO A USING B ON A.id=B.id WHEN MATCHED THEN DELETE
 ^;
 == Antlr Parser: org.apache.spark.sql.parser.CarbonAntlrParser ==
 org.apache.spark.sql.parser.CarbonSqlBaseParser$ValueExpressionDefaultContext 
cannot be cast to 
org.apache.spark.sql.parser.CarbonSqlBaseParser$ComparisonContext; 
(state=,code=0)
 0: jdbc:hive2://linux-63:22550/>

 

*Issue 2 : the merge into command is not working with any of the different cases tried and fails as mentioned below.*

0: jdbc:hive2://linux1:22550/> merge into a using b on a.ID=b.ID when matched 
then delete;
 Error: 
org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: 
Parse failed! (state=,code=0)
 0: jdbc:hive2://linux1:22550/> merge into a using b on A.ID=B.ID when matched 
then delete;
 Error: 
org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: 
Parse failed! (state=,code=0)
 0: jdbc:hive2://linux1:22550/> merge into A using B on A.ID=B.ID when matched 
then delete;
 Error: 
org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: 
Parse failed! (state=,code=0)

  was:
Steps:-

 

Issue 1 : Merge command fails for case insensitive column.

drop table if exists A;
 drop table if exists B;
 CREATE TABLE A(id Int, name string, description string,address string, note 
string) stored as carbondata 
tblproperties('long_string_columns'='description,note','table_blocksize'='1','SORT_SCOPE'='global_sort','table_page_size_inmb'='1');
 
 insert into A select 
1,"name1","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
 insert into A select 
2,"name2","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
 insert into A select 
3,"name3","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
 insert into A select 
4,"name4","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
 insert into A select 

[jira] [Updated] (CARBONDATA-4129) Class cast exception when array, struct , binary and string type data tried to be merged using merge SQL command

2021-02-10 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-4129:

Summary: Class cast exception when array, struct , binary and string type 
data tried to be merged using merge SQL command  (was: Class cast exception 
when array, struct , binary and string type data tried to be merged)

> Class cast exception when array, struct , binary and string type data tried 
> to be merged using merge SQL command
> 
>
> Key: CARBONDATA-4129
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4129
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 2.1.0
> Environment: Spark 2.4.5
>Reporter: Chetan Bhat
>Priority: Major
>
>  
> *Scenario 1 : - merge command with insertion on string with expression 
> **throws error.**. Also insert into binary with expression throws error.*
> drop table if exists A;
> drop table if exists B;
> CREATE TABLE A(id Int, name string, description string,address string, note 
> string) stored as carbondata 
> tblproperties('long_string_columns'='description,note','table_blocksize'='1','SORT_SCOPE'='global_sort','table_page_size_inmb'='1');
>  
> CREATE TABLE B(id Int, name string, description string,address string, note 
> string) stored as carbondata 
> tblproperties('long_string_columns'='description,note','table_blocksize'='1','SORT_SCOPE'='global_sort','table_page_size_inmb'='1');
>  
> insert into A select 
> 1,"name1A","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
> insert into A select 
> 2,"name2A","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
> insert into A select 
> 3,"name3A","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
> insert into A select 
> 4,"name4A","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
> insert into A select 
> 5,"name5A","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
> insert into B select 
> 1,"name1B","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
> insert into B select 
> 2,"name2B","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
> insert into B select 
> 3,"name3B","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
> insert into B select 
> 6,"name4B","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
> insert into B select 
> 7,"name5B","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
> MERGE INTO A USING B ON A.ID=B.ID WHEN NOT MATCHED AND B.ID=7 THEN INSERT 
> (A.ID,A.name,A.description ,A.address, A.note) VALUES 
> (B.ID,B.name+'10',B.description ,B.address,'test-string');
> 0: jdbc:hive2://linux-63:22550/> MERGE INTO A USING B ON A.ID=B.ID WHEN NOT 
> MATCHED AND B.ID=7 THEN INSERT (A.ID,A.name,A.description ,A.address, A.note) 
> VALUES (B.ID,B.name+'10',B.description ,B.address,'test-string');
> Error: org.apache.spark.SparkException: Job aborted due to stage failure: 
> Task 4 in stage 3813.0 failed 4 times, most recent failure: Lost task 4.3 in 
> stage 3813.0 (TID 23528, linux-63, executor 5): java.lang.ClassCastException: 
> org.apache.spark.sql.types.StringType$ cannot be cast to 
> org.apache.spark.sql.types.NumericType
>  at 
> org.apache.spark.sql.catalyst.util.TypeUtils$.getNumeric(TypeUtils.scala:58)
>  at 
> org.apache.spark.sql.catalyst.expressions.Add.numeric$lzycompute(arithmetic.scala:166)
>  at 
> org.apache.spark.sql.catalyst.expressions.Add.numeric(arithmetic.scala:166)
>  at 
> org.apache.spark.sql.catalyst.expressions.Add.nullSafeEval(arithmetic.scala:172)
>  at 
> org.apache.spark.sql.catalyst.expressions.BinaryExpression.eval(Expression.scala:486)
>  at 
> org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:92)
>  at 
> org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:66)
>  at 
> org.apache.spark.sql.execution.command.mutation.merge.MergeProjection.apply(MergeProjection.scala:54)
>  at 
> org.apache.spark.sql.execution.command.mutation.merge.CarbonMergeDataSetCommand$$anonfun$processIUD$1$$anon$1$$anonfun$next$1.apply(CarbonMergeDataSetCommand.scala:341)
>  at 
> org.apache.spark.sql.execution.command.mutation.merge.CarbonMergeDataSetCommand$$anonfun$processIUD$1$$anon$1$$anonfun$next$1.apply(CarbonMergeDataSetCommand.scala:338)
>  at scala.collection.immutable.List.foreach(List.scala:392)
>  at 
> 

[jira] [Created] (CARBONDATA-4129) Class cast exception when array, struct , binary and string type data tried to be merged

2021-02-10 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-4129:
---

 Summary: Class cast exception when array, struct , binary and 
string type data tried to be merged
 Key: CARBONDATA-4129
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4129
 Project: CarbonData
  Issue Type: Bug
  Components: data-query
Affects Versions: 2.1.0
 Environment: Spark 2.4.5
Reporter: Chetan Bhat


 

*Scenario 1 : - merge command with an insert on a string column using an expression throws an error. Also, insert into binary with an expression throws an error.*

drop table if exists A;
drop table if exists B;
CREATE TABLE A(id Int, name string, description string,address string, note 
string) stored as carbondata 
tblproperties('long_string_columns'='description,note','table_blocksize'='1','SORT_SCOPE'='global_sort','table_page_size_inmb'='1');
 
CREATE TABLE B(id Int, name string, description string,address string, note 
string) stored as carbondata 
tblproperties('long_string_columns'='description,note','table_blocksize'='1','SORT_SCOPE'='global_sort','table_page_size_inmb'='1');
 

insert into A select 
1,"name1A","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
insert into A select 
2,"name2A","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
insert into A select 
3,"name3A","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
insert into A select 
4,"name4A","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
insert into A select 
5,"name5A","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";

insert into B select 
1,"name1B","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
insert into B select 
2,"name2B","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
insert into B select 
3,"name3B","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
insert into B select 
6,"name4B","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
insert into B select 
7,"name5B","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";

MERGE INTO A USING B ON A.ID=B.ID WHEN NOT MATCHED AND B.ID=7 THEN INSERT 
(A.ID,A.name,A.description ,A.address, A.note) VALUES 
(B.ID,B.name+'10',B.description ,B.address,'test-string');

0: jdbc:hive2://linux-63:22550/> MERGE INTO A USING B ON A.ID=B.ID WHEN NOT 
MATCHED AND B.ID=7 THEN INSERT (A.ID,A.name,A.description ,A.address, A.note) 
VALUES (B.ID,B.name+'10',B.description ,B.address,'test-string');
Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 
4 in stage 3813.0 failed 4 times, most recent failure: Lost task 4.3 in stage 
3813.0 (TID 23528, linux-63, executor 5): java.lang.ClassCastException: 
org.apache.spark.sql.types.StringType$ cannot be cast to 
org.apache.spark.sql.types.NumericType
 at org.apache.spark.sql.catalyst.util.TypeUtils$.getNumeric(TypeUtils.scala:58)
 at 
org.apache.spark.sql.catalyst.expressions.Add.numeric$lzycompute(arithmetic.scala:166)
 at org.apache.spark.sql.catalyst.expressions.Add.numeric(arithmetic.scala:166)
 at 
org.apache.spark.sql.catalyst.expressions.Add.nullSafeEval(arithmetic.scala:172)
 at 
org.apache.spark.sql.catalyst.expressions.BinaryExpression.eval(Expression.scala:486)
 at 
org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:92)
 at 
org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:66)
 at 
org.apache.spark.sql.execution.command.mutation.merge.MergeProjection.apply(MergeProjection.scala:54)
 at 
org.apache.spark.sql.execution.command.mutation.merge.CarbonMergeDataSetCommand$$anonfun$processIUD$1$$anon$1$$anonfun$next$1.apply(CarbonMergeDataSetCommand.scala:341)
 at 
org.apache.spark.sql.execution.command.mutation.merge.CarbonMergeDataSetCommand$$anonfun$processIUD$1$$anon$1$$anonfun$next$1.apply(CarbonMergeDataSetCommand.scala:338)
 at scala.collection.immutable.List.foreach(List.scala:392)
 at 
org.apache.spark.sql.execution.command.mutation.merge.CarbonMergeDataSetCommand$$anonfun$processIUD$1$$anon$1.next(CarbonMergeDataSetCommand.scala:338)
 at 
org.apache.spark.sql.execution.command.mutation.merge.CarbonMergeDataSetCommand$$anonfun$processIUD$1$$anon$1.next(CarbonMergeDataSetCommand.scala:319)
 at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:463)
 at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
 at 
org.apache.spark.sql.execution.columnar.CachedRDDBuilder$$anonfun$1$$anon$1.hasNext(InMemoryRelation.scala:125)
 at 
org.apache.spark.storage.memory.MemoryStore.putIterator(MemoryStore.scala:221)
 at 
org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:299)
 at 
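The ClassCastException above comes from applying the arithmetic '+' operator to the string column B.name; a hedged rewrite of the failing clause using Spark SQL's concat() keeps the value a string and should avoid the StringType to NumericType cast (whether the Carbon merge parser accepts a function call in the VALUES list has not been verified):

MERGE INTO A USING B ON A.ID=B.ID WHEN NOT MATCHED AND B.ID=7 THEN INSERT (A.ID,A.name,A.description ,A.address, A.note) VALUES (B.ID,concat(B.name,'10'),B.description ,B.address,'test-string');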

[jira] [Created] (CARBONDATA-4128) Merge SQL command fails with different case for column name

2021-02-10 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-4128:
---

 Summary: Merge SQL command fails with different case for column 
name
 Key: CARBONDATA-4128
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4128
 Project: CarbonData
  Issue Type: Bug
  Components: data-query
Affects Versions: 2.1.0
 Environment: Spark 2.4.5
Reporter: Chetan Bhat


Steps:-

drop table if exists A;
drop table if exists B;
CREATE TABLE A(id Int, name string, description string,address string, note 
string) stored as carbondata 
tblproperties('long_string_columns'='description,note','table_blocksize'='1','SORT_SCOPE'='global_sort','table_page_size_inmb'='1');
 
insert into A select 
1,"name1","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
insert into A select 
2,"name2","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
insert into A select 
3,"name3","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
insert into A select 
4,"name4","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
insert into A select 
5,"name5","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";


CREATE TABLE B(id Int, name string, description string,address string, note 
string) stored as carbondata 
tblproperties('long_string_columns'='description,note','table_blocksize'='1','SORT_SCOPE'='global_sort','table_page_size_inmb'='1');
 
insert into B select 
1,"name1","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
insert into B select 
2,"name2","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
insert into B select 
3,"name3","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
insert into B select 
6,"name4","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
insert into B select 
7,"name5","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
--merge
MERGE INTO A USING B ON A.id=B.id WHEN MATCHED THEN DELETE;

 

Issue :- Merge SQL command fails with different case for column name

0: jdbc:hive2://linux-63:22550/> MERGE INTO A USING B ON A.id=B.id WHEN MATCHED 
THEN DELETE;
Error: org.apache.spark.sql.AnalysisException: == Spark Parser: 
org.apache.spark.sql.hive.FISqlParser ==

mismatched input 'MERGE' expecting \{'(', 'SELECT', 'FROM', 'ADD', 'DESC', 
'EMPOWER', 'WITH', 'VALUES', 'CREATE', 'TABLE', 'INSERT', 'DELETE', 'DESCRIBE', 
'EXPLAIN', 'SHOW', 'USE', 'DROP', 'ALTER', 'MAP', 'SET', 'RESET', 'START', 
'COMMIT', 'ROLLBACK', 'REDUCE', 'REFRESH', 'CLEAR', 'CACHE', 'UNCACHE', 'DFS', 
'TRUNCATE', 'ANALYZE', 'LIST', 'REVOKE', 'GRANT', 'LOCK', 'UNLOCK', 'MSCK', 
'EXPORT', 'IMPORT', 'LOAD', 'HEALTHCHECK'}(line 1, pos 0)

== SQL ==
MERGE INTO A USING B ON A.id=B.id WHEN MATCHED THEN DELETE
^^^

== Carbon Parser: org.apache.spark.sql.parser.CarbonExtensionSpark2SqlParser ==
[1.1] failure: identifier matching regex (?i)EXPLAIN expected

MERGE INTO A USING B ON A.id=B.id WHEN MATCHED THEN DELETE
^;
== Antlr Parser: org.apache.spark.sql.parser.CarbonAntlrParser ==
org.apache.spark.sql.parser.CarbonSqlBaseParser$ValueExpressionDefaultContext 
cannot be cast to 
org.apache.spark.sql.parser.CarbonSqlBaseParser$ComparisonContext; 
(state=,code=0)
0: jdbc:hive2://linux-63:22550/>
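Until the MERGE parser accepts this statement, a hedged equivalent for the matched-then-delete case (CarbonData's DELETE with an IN subquery, as described in its DML documentation; not verified against this exact build) would be:

DELETE FROM A WHERE id IN (SELECT id FROM B);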



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CARBONDATA-4127) Merge SQL command not working with different table names

2021-02-10 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-4127:
---

 Summary: Merge SQL command not working with different table names
 Key: CARBONDATA-4127
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4127
 Project: CarbonData
  Issue Type: Bug
  Components: data-query
Affects Versions: 2.1.0
 Environment: Spark 2.4.5
Reporter: Chetan Bhat


Steps:-

CREATE TABLE lsc1(id Int, name string, description string,address string, note 
string) stored as carbondata 
tblproperties('long_string_columns'='description,note','table_blocksize'='1','SORT_SCOPE'='global_sort','table_page_size_inmb'='1');
 
insert into lsc1 select 
1,"name1","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
insert into lsc1 select 
2,"name2","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
insert into lsc1 select 
3,"name3","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
insert into lsc1 select 
4,"name4","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
insert into lsc1 select 
5,"name5","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";


CREATE TABLE lsc2(id Int, name string, description string,address string, note 
string) stored as carbondata 
tblproperties('long_string_columns'='description,note','table_blocksize'='1','SORT_SCOPE'='global_sort','table_page_size_inmb'='1');
 
insert into lsc2 select 
1,"name1","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
insert into lsc2 select 
2,"name2","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
insert into lsc2 select 
3,"name3","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
insert into lsc2 select 
6,"name4","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";
insert into lsc2 select 
7,"name5","asasfdfdfdsf","tutyutyuty","6867898980909099-0-0-0878676565454545465768798";

 

Issue :- merge fails with parse error

0: jdbc:hive2://linux-63:22550/> MERGE INTO lsc1 USING lsc2 ON lsc1.ID=lsc2.ID 
WHEN MATCHED THEN DELETE;
*Error: 
org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException: 
Parse failed! (state=,code=0)*
*0: jdbc:hive2://linux-63:22550/>*



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CARBONDATA-4126) Concurrent Compaction fails with Load on table with SI

2021-02-10 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-4126:
---

 Summary: Concurrent Compaction fails with Load on table with SI
 Key: CARBONDATA-4126
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4126
 Project: CarbonData
  Issue Type: Bug
  Components: data-load
Affects Versions: 2.1.0
 Environment: Spark 2.4.5
Reporter: Chetan Bhat


[Steps] :-

Create table, load data and create SI.

create table brinjal (imei string,AMSize string,channelsId string,ActiveCountry 
string, Activecity string,gamePointId double,deviceInformationId 
double,productionDate Timestamp,deliveryDate timestamp,deliverycharge double) 
stored as carbondata TBLPROPERTIES('table_blocksize'='1');

LOAD DATA INPATH 'hdfs://hacluster/chetan/vardhandaterestruct.csv' INTO TABLE 
brinjal OPTIONS('DELIMITER'=',', 'QUOTECHAR'= 
'"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 
'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');

create index indextable1 ON TABLE brinjal (AMSize) AS 'carbondata';

 

From one terminal load data into the table and from another terminal perform minor and major compaction on the table concurrently for some time.

LOAD DATA INPATH 'hdfs://hacluster/chetan/vardhandaterestruct.csv' INTO TABLE 
brinjal OPTIONS('DELIMITER'=',', 'QUOTECHAR'= 
'"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 
'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');

alter table brinjal compact 'minor';

alter table brinjal compact 'major';

 

[Expected Result] :- Concurrent compaction should succeed with load on a table with SI

 

[Actual Issue] : - Concurrent Compaction fails with Load on table with SI

*0: jdbc:hive2://linux-32:22550/> alter table brinjal compact 'major';*

*Error: org.apache.spark.sql.AnalysisException: Compaction failed. Please check 
logs for more info. Exception in compaction Failed to acquire lock on segment 
2, during compaction of table test.brinjal; (state=,code=0)*
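A hedged follow-up after the failed compaction (SHOW SEGMENTS is standard CarbonData DDL; the table name follows the error message above) is to check the segment states and retry once the concurrent load has finished:

SHOW SEGMENTS FOR TABLE brinjal;
alter table brinjal compact 'major';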



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (CARBONDATA-4049) Sometimes refresh table fails with error "table not found in database" error

2020-12-31 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat closed CARBONDATA-4049.
---
Resolution: Won't Fix

The issue is identified as a refresh problem, which is handled by Spark. As it is 
not a Carbon issue, it is closed.

> Sometimes refresh table fails with error "table not found in database" error
> 
>
> Key: CARBONDATA-4049
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4049
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 2.1.0
> Environment: Spark 2.4.5
>Reporter: Chetan Bhat
>Priority: Minor
>
> In the Carbon 2.1 version the user creates a database.
> The user copies an old version store, such as 1.6.1, to the HDFS folder of the 
> database in the Carbon 2.1 version.
> In Spark-SQL or beeline the user accesses the database using the use db 
> command.
> The refresh table command is executed on the old version store table and then 
> the subsequent operations on the table are performed.
> Next, the refresh table command is executed on another old version store 
> table.
>  
> Issue : Sometimes the refresh table command fails with the error "table not 
> found in database".
> spark-sql> refresh table brinjal_deleteseg;
> *Error in query: Table or view 'brinjal_deleteseg' not found in database 
> '1_6_1';*
>  
> **Log -
> 2020-11-12 18:55:46,922 | INFO  | [main] | Created broadcast 171 from 
> broadCastHadoopConf at CarbonRDD.scala:58 | 
> org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)2020-11-12 
> 18:55:46,922 | INFO  | [main] | Created broadcast 171 from 
> broadCastHadoopConf at CarbonRDD.scala:58 | 
> org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)2020-11-12 
> 18:55:46,924 | INFO  | [main] | Pushed Filters:  | 
> org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)2020-11-12 
> 18:55:46,939 | INFO  | [main] | Distributed Index server is enabled for 
> 1_6_1.brinjal_update | 
> org.apache.carbondata.core.util.CarbonProperties.isDistributedPruningEnabled(CarbonProperties.java:1742)2020-11-12
>  18:55:46,939 | INFO  | [main] | Started block pruning ... | 
> org.apache.carbondata.hadoop.api.CarbonInputFormat.getPrunedBlocklets(CarbonInputFormat.java:526)2020-11-12
>  18:55:46,940 | INFO  | [main] | Distributed Index server is enabled for 
> 1_6_1.brinjal_update | 
> org.apache.carbondata.core.util.CarbonProperties.isDistributedPruningEnabled(CarbonProperties.java:1742)2020-11-12
>  18:55:46,945 | INFO  | [main] | Successfully Created directory: 
> hdfs://hacluster/tmp/indexservertmp/4b6353d4-65d7-4856-b3cd-b3bc11d15c55 | 
> org.apache.carbondata.core.util.CarbonUtil.createTempFolderForIndexServer(CarbonUtil.java:3273)2020-11-12
>  18:55:46,945 | INFO  | [main] | Temp folder path for Query ID: 
> 4b6353d4-65d7-4856-b3cd-b3bc11d15c55 is 
> org.apache.carbondata.core.datastore.filesystem.HDFSCarbonFile@b8f2e1bf | 
> org.apache.carbondata.indexserver.DistributedIndexJob.execute(IndexJobs.scala:57)2020-11-12
>  18:55:46,946 | ERROR | [main] | Configured port for index server is not a 
> valid number | 
> org.apache.carbondata.core.util.CarbonProperties.getIndexServerPort(CarbonProperties.java:1779)java.lang.NumberFormatException:
>  null at java.lang.Integer.parseInt(Integer.java:542) at 
> java.lang.Integer.parseInt(Integer.java:615) at 
> org.apache.carbondata.core.util.CarbonProperties.getIndexServerPort(CarbonProperties.java:1777)
>  at 
> org.apache.carbondata.indexserver.IndexServer$.serverPort$lzycompute(IndexServer.scala:88)
>  at 
> org.apache.carbondata.indexserver.IndexServer$.serverPort(IndexServer.scala:88)
>  at 
> org.apache.carbondata.indexserver.IndexServer$.getClient(IndexServer.scala:312)
>  at 
> org.apache.carbondata.indexserver.IndexServer$.getClient(IndexServer.scala:301)
>  at 
> org.apache.carbondata.indexserver.DistributedIndexJob$$anonfun$1.apply(IndexJobs.scala:83)
>  at 
> org.apache.carbondata.indexserver.DistributedIndexJob$$anonfun$1.apply(IndexJobs.scala:59)
>  at 
> org.apache.carbondata.spark.util.CarbonScalaUtil$.logTime(CarbonScalaUtil.scala:769)
>  at 
> org.apache.carbondata.indexserver.DistributedIndexJob.execute(IndexJobs.scala:58)
>  at 
> org.apache.carbondata.core.index.IndexUtil.executeIndexJob(IndexUtil.java:304)
>  at 
> org.apache.carbondata.hadoop.api.CarbonInputFormat.getDistributedSplit(CarbonInputFormat.java:431)
>  at 
> org.apache.carbondata.hadoop.api.CarbonInputFormat.getPrunedBlocklets(CarbonInputFormat.java:532)
>  at 
> org.apache.carbondata.hadoop.api.CarbonInputFormat.getDataBlocksOfSegment(CarbonInputFormat.java:477)
>  at 
> org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getSplits(CarbonTableInputFormat.java:356)
>  at 
> 

[jira] [Closed] (CARBONDATA-4048) Update fails after continous update operations with error

2020-12-29 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat closed CARBONDATA-4048.
---
Resolution: Won't Fix

The subquery used within the update query fetches more than one row, hence the 
exception was thrown. The exception message is handled properly and Carbon prints 
the message from Spark directly.
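
A minimal sketch of the failing pattern, assuming the uniqdata schema from the 
issue below; it is illustrative only, not the exact query from the issue. The 
scalar subquery matches more than one row, so Spark raises the error that Carbon 
surfaces as-is; constraining the subquery to a single row (for example with an 
aggregate) avoids it:

update uniqdata set (bigint_column1)=
  (select bigint_column1 from uniqdata where cust_name like 'CUST_NAME_0199%')
  where integer_column1 = 1;   -- fails: subquery returns more than one row
update uniqdata set (bigint_column1)=
  (select max(bigint_column1) from uniqdata where cust_name like 'CUST_NAME_0199%')
  where integer_column1 = 1;   -- succeeds: subquery returns exactly one row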

> Update fails after continous update operations with error
> -
>
> Key: CARBONDATA-4048
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4048
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 2.1.0
> Environment: Spark 2.3.2
>Reporter: Chetan Bhat
>Priority: Minor
>
> Create a table, load data and perform continuous update operations on the table.
> 0: jdbc:hive2://10.20.255.171:23040> CREATE TABLE uniqdata (CUST_ID 
> int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ 
> timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 
> decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, 
> Double_COLUMN2 double,INTEGER_COLUMN1 int) stored as carbondata TBLPROPERTIES 
> ("TABLE_BLOCKSIZE"= "256 MB",'flat_folder'='true');
> +-+--+
> | Result |
> +-+--+
> +-+--+
> No rows selected (0.177 seconds)
> 0: jdbc:hive2://10.20.255.171:23040> LOAD DATA inpath 
> 'hdfs://hacluster/chetan/2000_UniqData.csv' INTO table uniqdata 
> options('DELIMITER'=',', 'FILEHEADER'='CUST_ID, CUST_NAME, 
> ACTIVE_EMUI_VERSION, DOB, DOJ, BIGINT_COLUMN1, BIGINT_COLUMN2, 
> DECIMAL_COLUMN1, DECIMAL_COLUMN2, Double_COLUMN1, Double_COLUMN2, 
> INTEGER_COLUMN1','TIMESTAMPFORMAT'='-MM-dd 
> HH:mm:ss','BAD_RECORDS_ACTION'='FORCE');
> +-+--+
> | Result |
> +-+--+
> +-+--+
> No rows selected (1.484 seconds)
> 0: jdbc:hive2://10.20.255.171:23040> update uniqdata set 
> (bigint_column1)=(100) where bigint_column1<123372036856;
> ++--+
> | Updated Row Count |
> ++--+
> | 2 |
> ++--+
> 1 row selected (3.294 seconds)
> 0: jdbc:hive2://10.20.255.171:23040> update uniqdata set 
> (bigint_column1)=(100) where bigint_column1>123372038852;
> ++--+
> | Updated Row Count |
> ++--+
> | 1 |
> ++--+
> 1 row selected (3.467 seconds)
> 0: jdbc:hive2://10.20.255.171:23040> update uniqdata set 
> (bigint_column1)=(100) where bigint_column1<=123372036859;
> ++--+
> | Updated Row Count |
> ++--+
> | 9 |
> ++--+
> 1 row selected (3.349 seconds)
> 0: jdbc:hive2://10.20.255.171:23040> update uniqdata set 
> (bigint_column1)=(100) where bigint_column1>=123372038846;
> ++--+
> | Updated Row Count |
> ++--+
> | 8 |
> ++--+
> 1 row selected (3.259 seconds)
> 0: jdbc:hive2://10.20.255.171:23040> update uniqdata set 
> (bigint_column1)=(100) where bigint_column1 == 123372038845;
> ++--+
> | Updated Row Count |
> ++--+
> | 1 |
> ++--+
> 1 row selected (4.164 seconds)
> 0: jdbc:hive2://10.20.255.171:23040> update uniqdata set 
> (bigint_column1)=(100) where bigint_column1 like '123%';
> ++--+
> | Updated Row Count |
> ++--+
> | 2000 |
> ++--+
> 1 row selected (3.695 seconds)
> 0: jdbc:hive2://10.20.255.171:23040> update uniqdata set 
> (bigint_column1)=(100) where bigint_column1 between 123372038849 AND 
> 123372038855;
> ++--+
> | Updated Row Count |
> ++--+
> | 5 |
> ++--+
> 1 row selected (3.228 seconds)
> 0: jdbc:hive2://10.20.255.171:23040> update uniqdata set 
> (bigint_column1)=(100) where bigint_column1 = 123372038845 OR false;
> ++--+
> | Updated Row Count |
> ++--+
> | 1 |
> ++--+
> 1 row selected (3.548 seconds)
> 0: jdbc:hive2://10.20.255.171:23040> update uniqdata set 
> (bigint_column1)=(100) where bigint_column1 = 123372038849 AND true;
> ++--+
> | Updated Row Count |
> ++--+
> | 1 |
> ++--+
> 1 row selected (3.321 seconds)
> 0: jdbc:hive2://10.20.255.171:23040> update uniqdata set 
> (bigint_column1)=(100) where bigint_column1 not between (123372038849) AND 
> (12337203885);
> ++--+
> | Updated Row Count |
> ++--+
> | 4025 |
> ++--+
> 1 row selected (3.718 seconds)
> 0: jdbc:hive2://10.20.255.171:23040> update uniqdata set 
> (cust_name)=('deepti') where cust_name<'CUST_NAME_01990';
> ++--+
> | Updated Row Count |
> ++--+
> | 5978 |
> 

[jira] [Closed] (CARBONDATA-4061) Empty value for date and timestamp columns are reading as null when using SDK. if we pass empty value to data and timestamp columns ,it gives null pointer exception

2020-12-27 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat closed CARBONDATA-4061.
---
Resolution: Fixed

Fixed in 2.1.0 version

> Empty value for date and timestamp columns are reading as null when using 
> SDK. if we pass empty value to data and timestamp columns ,it gives null 
> pointer exception
> 
>
> Key: CARBONDATA-4061
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4061
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query, other
>Affects Versions: 2.1.0
> Environment: Spark 2.4.5
>Reporter: Chetan Bhat
>Priority: Minor
>
> Empty values for date and timestamp columns are read as null when using the 
> SDK. If an empty value is passed to date and timestamp columns, it gives a 
> null pointer exception
>  
> 2020-11-27 13:44:20 INFO ResultCollectorFactory:78 - Vector based dictionary 
> collector is used to scan and collect the data
>  2020-11-27 13:44:20 INFO DictionaryBasedVectorResultCollector:73 - Direct 
> page-wise vector fill collector is used to scan and collect the data
> java.lang.NullPointerException
>  at 
> org.apache.carbondata.sdk.file.CarbonReader.formatDateAndTimeStamp(CarbonReader.java:153)
>  at 
> org.apache.carbondata.sdk.file.CarbonReader.readNextRow(CarbonReader.java:126)
>  at 
> com.apache.spark.SDKReaderTest.testSDKRederAll_data_types2(SDKReaderTest.java:239)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>  at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>  at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>  at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>  at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>  at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>  at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>  at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>  at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>  at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>  at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>  at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>  at 
> com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69)
>  at 
> com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
>  at 
> com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:220)
>  at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:53)
> 2020-11-27 13:44:20 INFO SparkContext:54 - Invoking stop() from shutdown hook
>  2020-11-27 13:44:20 INFO AbstractConnector:343 - Stopped Spark@22175d4f{HTTP/1.1,[http/1.1]}{10.19.36.215:4040}
>  2020-11-27 13:44:20 INFO session:158 - node0 Stopped scavenging



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CARBONDATA-4061) Empty value for date and timestamp columns are reading as null when using SDK. if we pass empty value to data and timestamp columns ,it gives null pointer exception

2020-11-27 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-4061:

Description: 
Empty value for date and timestamp columns are reading as null when using SDK. 
if we pass empty value to data and timestamp columns ,it gives null pointer 
exception

 

2020-11-27 13:44:20 INFO ResultCollectorFactory:78 - Vector based dictionary 
collector is used to scan and collect the data
 2020-11-27 13:44:20 INFO DictionaryBasedVectorResultCollector:73 - Direct 
page-wise vector fill collector is used to scan and collect the data

java.lang.NullPointerException
 at 
org.apache.carbondata.sdk.file.CarbonReader.formatDateAndTimeStamp(CarbonReader.java:153)
 at 
org.apache.carbondata.sdk.file.CarbonReader.readNextRow(CarbonReader.java:126)
 at 
com.apache.spark.SDKReaderTest.testSDKRederAll_data_types2(SDKReaderTest.java:239)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
 at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
 at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
 at org.junit.rules.RunRules.evaluate(RunRules.java:20)
 at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
 at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
 at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
 at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
 at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
 at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
 at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
 at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
 at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
 at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
 at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
 at 
com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69)
 at 
com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
 at 
com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:220)
 at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:53)

2020-11-27 13:44:20 INFO SparkContext:54 - Invoking stop() from shutdown hook
 2020-11-27 13:44:20 INFO AbstractConnector:343 - Stopped Spark@22175d4f{HTTP/1.1,[http/1.1]}{10.19.36.215:4040}
 2020-11-27 13:44:20 INFO session:158 - node0 Stopped scavenging

  was:
Empty value for date and timestamp columns are reading as null . if we pass 
empty value to data and timestamp columns ,it gives null pointer exception

 

2020-11-27 13:44:20 INFO ResultCollectorFactory:78 - Vector based dictionary 
collector is used to scan and collect the data
2020-11-27 13:44:20 INFO DictionaryBasedVectorResultCollector:73 - Direct 
page-wise vector fill collector is used to scan and collect the data

java.lang.NullPointerException
 at 
org.apache.carbondata.sdk.file.CarbonReader.formatDateAndTimeStamp(CarbonReader.java:153)
 at 
org.apache.carbondata.sdk.file.CarbonReader.readNextRow(CarbonReader.java:126)
 at 
com.apache.spark.SDKReaderTest.testSDKRederAll_data_types2(SDKReaderTest.java:239)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
 at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
 at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
 at org.junit.rules.RunRules.evaluate(RunRules.java:20)
 at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
 at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
 at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
 at 

[jira] [Updated] (CARBONDATA-4061) Empty value for date and timestamp columns are reading as null when using SDK. if we pass empty value to data and timestamp columns ,it gives null pointer exception

2020-11-27 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-4061:

Summary: Empty value for date and timestamp columns are reading as null 
when using SDK. if we pass empty value to data and timestamp columns ,it gives 
null pointer exception  (was: Empty value for date and timestamp columns are 
reading as null . if we pass empty value to data and timestamp columns ,it 
gives null pointer exception)

> Empty value for date and timestamp columns are reading as null when using 
> SDK. if we pass empty value to data and timestamp columns ,it gives null 
> pointer exception
> 
>
> Key: CARBONDATA-4061
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4061
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query, other
>Affects Versions: 2.1.0
> Environment: Spark 2.4.5
>Reporter: Chetan Bhat
>Priority: Minor
>
> Empty value for date and timestamp columns are reading as null . if we pass 
> empty value to data and timestamp columns ,it gives null pointer exception
>  
> 2020-11-27 13:44:20 INFO ResultCollectorFactory:78 - Vector based dictionary 
> collector is used to scan and collect the data
> 2020-11-27 13:44:20 INFO DictionaryBasedVectorResultCollector:73 - Direct 
> page-wise vector fill collector is used to scan and collect the data
> java.lang.NullPointerException
>  at 
> org.apache.carbondata.sdk.file.CarbonReader.formatDateAndTimeStamp(CarbonReader.java:153)
>  at 
> org.apache.carbondata.sdk.file.CarbonReader.readNextRow(CarbonReader.java:126)
>  at 
> com.apache.spark.SDKReaderTest.testSDKRederAll_data_types2(SDKReaderTest.java:239)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>  at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>  at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>  at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>  at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>  at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>  at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>  at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>  at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>  at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>  at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>  at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>  at 
> com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69)
>  at 
> com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
>  at 
> com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:220)
>  at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:53)
> 2020-11-27 13:44:20 INFO SparkContext:54 - Invoking stop() from shutdown hook
> 2020-11-27 13:44:20 INFO AbstractConnector:343 - Stopped 
> Spark@22175d4f{HTTP/1.1,[http/1.1]}{10.19.36.215:4040}
> 2020-11-27 13:44:20 INFO session:158 - node0 Stopped scavenging



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CARBONDATA-4061) Empty value for date and timestamp columns are reading as null . if we pass empty value to data and timestamp columns ,it gives null pointer exception

2020-11-27 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-4061:
---

 Summary: Empty value for date and timestamp columns are reading as 
null . if we pass empty value to data and timestamp columns ,it gives null 
pointer exception
 Key: CARBONDATA-4061
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4061
 Project: CarbonData
  Issue Type: Bug
  Components: data-query, other
Affects Versions: 2.1.0
 Environment: Spark 2.4.5
Reporter: Chetan Bhat


Empty values for date and timestamp columns are read as null. If an empty value is 
passed to date and timestamp columns, it gives a null pointer exception.

 

2020-11-27 13:44:20 INFO ResultCollectorFactory:78 - Vector based dictionary 
collector is used to scan and collect the data
2020-11-27 13:44:20 INFO DictionaryBasedVectorResultCollector:73 - Direct 
page-wise vector fill collector is used to scan and collect the data

java.lang.NullPointerException
 at 
org.apache.carbondata.sdk.file.CarbonReader.formatDateAndTimeStamp(CarbonReader.java:153)
 at 
org.apache.carbondata.sdk.file.CarbonReader.readNextRow(CarbonReader.java:126)
 at 
com.apache.spark.SDKReaderTest.testSDKRederAll_data_types2(SDKReaderTest.java:239)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
 at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
 at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
 at org.junit.rules.RunRules.evaluate(RunRules.java:20)
 at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
 at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
 at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
 at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
 at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
 at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
 at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
 at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
 at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
 at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
 at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
 at 
com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69)
 at 
com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
 at 
com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:220)
 at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:53)

2020-11-27 13:44:20 INFO SparkContext:54 - Invoking stop() from shutdown hook
2020-11-27 13:44:20 INFO AbstractConnector:343 - Stopped 
Spark@22175d4f{HTTP/1.1,[http/1.1]}{10.19.36.215:4040}
2020-11-27 13:44:20 INFO session:158 - node0 Stopped scavenging



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CARBONDATA-4049) Sometimes refresh table fails with error "table not found in database" error

2020-11-12 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-4049:
---

 Summary: Sometimes refresh table fails with error "table not found 
in database" error
 Key: CARBONDATA-4049
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4049
 Project: CarbonData
  Issue Type: Bug
  Components: data-query
Affects Versions: 2.1.0
 Environment: Spark 2.4.5
Reporter: Chetan Bhat


In the Carbon 2.1 version the user creates a database.

The user copies an old version store, such as 1.6.1, to the HDFS folder of that 
database in the Carbon 2.1 version.

In Spark-SQL or beeline the user accesses the database using the use db command.

The refresh table command is executed on the old version store table and then the 
subsequent operations on the table are performed.

Next, the refresh table command is executed on another old version store table.
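
A minimal sketch of this sequence, assuming the database and table names that 
appear in the error and log below (1_6_1, brinjal_update, brinjal_deleteseg):

use 1_6_1;
refresh table brinjal_update;          -- first old-store table: succeeds
select count(*) from brinjal_update;   -- subsequent operations on the table
refresh table brinjal_deleteseg;       -- second old-store table: intermittently fails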

 

Issue : Sometimes the refresh table command fails with the error "table not found 
in database".

spark-sql> refresh table brinjal_deleteseg;
*Error in query: Table or view 'brinjal_deleteseg' not found in database 
'1_6_1';*

 

**Log -

2020-11-12 18:55:46,922 | INFO  | [main] | Created broadcast 171 from 
broadCastHadoopConf at CarbonRDD.scala:58 | 
org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)2020-11-12 
18:55:46,922 | INFO  | [main] | Created broadcast 171 from broadCastHadoopConf 
at CarbonRDD.scala:58 | 
org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)2020-11-12 
18:55:46,924 | INFO  | [main] | Pushed Filters:  | 
org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)2020-11-12 
18:55:46,939 | INFO  | [main] | Distributed Index server is enabled for 
1_6_1.brinjal_update | 
org.apache.carbondata.core.util.CarbonProperties.isDistributedPruningEnabled(CarbonProperties.java:1742)2020-11-12
 18:55:46,939 | INFO  | [main] | Started block pruning ... | 
org.apache.carbondata.hadoop.api.CarbonInputFormat.getPrunedBlocklets(CarbonInputFormat.java:526)2020-11-12
 18:55:46,940 | INFO  | [main] | Distributed Index server is enabled for 
1_6_1.brinjal_update | 
org.apache.carbondata.core.util.CarbonProperties.isDistributedPruningEnabled(CarbonProperties.java:1742)2020-11-12
 18:55:46,945 | INFO  | [main] | Successfully Created directory: 
hdfs://hacluster/tmp/indexservertmp/4b6353d4-65d7-4856-b3cd-b3bc11d15c55 | 
org.apache.carbondata.core.util.CarbonUtil.createTempFolderForIndexServer(CarbonUtil.java:3273)2020-11-12
 18:55:46,945 | INFO  | [main] | Temp folder path for Query ID: 
4b6353d4-65d7-4856-b3cd-b3bc11d15c55 is 
org.apache.carbondata.core.datastore.filesystem.HDFSCarbonFile@b8f2e1bf | 
org.apache.carbondata.indexserver.DistributedIndexJob.execute(IndexJobs.scala:57)2020-11-12
 18:55:46,946 | ERROR | [main] | Configured port for index server is not a 
valid number | 
org.apache.carbondata.core.util.CarbonProperties.getIndexServerPort(CarbonProperties.java:1779)java.lang.NumberFormatException:
 null at java.lang.Integer.parseInt(Integer.java:542) at 
java.lang.Integer.parseInt(Integer.java:615) at 
org.apache.carbondata.core.util.CarbonProperties.getIndexServerPort(CarbonProperties.java:1777)
 at 
org.apache.carbondata.indexserver.IndexServer$.serverPort$lzycompute(IndexServer.scala:88)
 at 
org.apache.carbondata.indexserver.IndexServer$.serverPort(IndexServer.scala:88) 
at 
org.apache.carbondata.indexserver.IndexServer$.getClient(IndexServer.scala:312) 
at 
org.apache.carbondata.indexserver.IndexServer$.getClient(IndexServer.scala:301) 
at 
org.apache.carbondata.indexserver.DistributedIndexJob$$anonfun$1.apply(IndexJobs.scala:83)
 at 
org.apache.carbondata.indexserver.DistributedIndexJob$$anonfun$1.apply(IndexJobs.scala:59)
 at 
org.apache.carbondata.spark.util.CarbonScalaUtil$.logTime(CarbonScalaUtil.scala:769)
 at 
org.apache.carbondata.indexserver.DistributedIndexJob.execute(IndexJobs.scala:58)
 at 
org.apache.carbondata.core.index.IndexUtil.executeIndexJob(IndexUtil.java:304) 
at 
org.apache.carbondata.hadoop.api.CarbonInputFormat.getDistributedSplit(CarbonInputFormat.java:431)
 at 
org.apache.carbondata.hadoop.api.CarbonInputFormat.getPrunedBlocklets(CarbonInputFormat.java:532)
 at 
org.apache.carbondata.hadoop.api.CarbonInputFormat.getDataBlocksOfSegment(CarbonInputFormat.java:477)
 at 
org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getSplits(CarbonTableInputFormat.java:356)
 at 
org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getSplits(CarbonTableInputFormat.java:204)
 at 
org.apache.carbondata.spark.rdd.CarbonScanRDD.internalGetPartitions(CarbonScanRDD.scala:159)
 at org.apache.carbondata.spark.rdd.CarbonRDD.getPartitions(CarbonRDD.scala:68) 
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:273) at 
org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:269) at 
scala.Option.getOrElse(Option.scala:121) at 
org.apache.spark.rdd.RDD.partitions(RDD.scala:269) at 

[jira] [Created] (CARBONDATA-4048) Update fails after continous update operations with error

2020-11-06 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-4048:
---

 Summary: Update fails after continous update operations with error
 Key: CARBONDATA-4048
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4048
 Project: CarbonData
  Issue Type: Bug
  Components: data-query
Affects Versions: 2.1.0
 Environment: Spark 2.3.2
Reporter: Chetan Bhat


Create a table, load data and perform continuous update operations on the table.

0: jdbc:hive2://10.20.255.171:23040> CREATE TABLE uniqdata (CUST_ID 
int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, 
BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), 
DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 
double,INTEGER_COLUMN1 int) stored as carbondata TBLPROPERTIES 
("TABLE_BLOCKSIZE"= "256 MB",'flat_folder'='true');
+-+--+
| Result |
+-+--+
+-+--+
No rows selected (0.177 seconds)
0: jdbc:hive2://10.20.255.171:23040> LOAD DATA inpath 
'hdfs://hacluster/chetan/2000_UniqData.csv' INTO table uniqdata 
options('DELIMITER'=',', 'FILEHEADER'='CUST_ID, CUST_NAME, ACTIVE_EMUI_VERSION, 
DOB, DOJ, BIGINT_COLUMN1, BIGINT_COLUMN2, DECIMAL_COLUMN1, DECIMAL_COLUMN2, 
Double_COLUMN1, Double_COLUMN2, INTEGER_COLUMN1','TIMESTAMPFORMAT'='-MM-dd 
HH:mm:ss','BAD_RECORDS_ACTION'='FORCE');
+-+--+
| Result |
+-+--+
+-+--+
No rows selected (1.484 seconds)
0: jdbc:hive2://10.20.255.171:23040> update uniqdata set (bigint_column1)=(100) 
where bigint_column1<123372036856;
++--+
| Updated Row Count |
++--+
| 2 |
++--+
1 row selected (3.294 seconds)
0: jdbc:hive2://10.20.255.171:23040> update uniqdata set (bigint_column1)=(100) 
where bigint_column1>123372038852;
++--+
| Updated Row Count |
++--+
| 1 |
++--+
1 row selected (3.467 seconds)
0: jdbc:hive2://10.20.255.171:23040> update uniqdata set (bigint_column1)=(100) 
where bigint_column1<=123372036859;
++--+
| Updated Row Count |
++--+
| 9 |
++--+
1 row selected (3.349 seconds)
0: jdbc:hive2://10.20.255.171:23040> update uniqdata set (bigint_column1)=(100) 
where bigint_column1>=123372038846;
++--+
| Updated Row Count |
++--+
| 8 |
++--+
1 row selected (3.259 seconds)
0: jdbc:hive2://10.20.255.171:23040> update uniqdata set (bigint_column1)=(100) 
where bigint_column1 == 123372038845;
++--+
| Updated Row Count |
++--+
| 1 |
++--+
1 row selected (4.164 seconds)
0: jdbc:hive2://10.20.255.171:23040> update uniqdata set (bigint_column1)=(100) 
where bigint_column1 like '123%';
++--+
| Updated Row Count |
++--+
| 2000 |
++--+
1 row selected (3.695 seconds)
0: jdbc:hive2://10.20.255.171:23040> update uniqdata set (bigint_column1)=(100) 
where bigint_column1 between 123372038849 AND 123372038855;
++--+
| Updated Row Count |
++--+
| 5 |
++--+
1 row selected (3.228 seconds)
0: jdbc:hive2://10.20.255.171:23040> update uniqdata set (bigint_column1)=(100) 
where bigint_column1 = 123372038845 OR false;
++--+
| Updated Row Count |
++--+
| 1 |
++--+
1 row selected (3.548 seconds)
0: jdbc:hive2://10.20.255.171:23040> update uniqdata set (bigint_column1)=(100) 
where bigint_column1 = 123372038849 AND true;
++--+
| Updated Row Count |
++--+
| 1 |
++--+
1 row selected (3.321 seconds)
0: jdbc:hive2://10.20.255.171:23040> update uniqdata set (bigint_column1)=(100) 
where bigint_column1 not between (123372038849) AND (12337203885);
++--+
| Updated Row Count |
++--+
| 4025 |
++--+
1 row selected (3.718 seconds)
0: jdbc:hive2://10.20.255.171:23040> update uniqdata set (cust_name)=('deepti') 
where cust_name<'CUST_NAME_01990';
++--+
| Updated Row Count |
++--+
| 5978 |
++--+
1 row selected (4.109 seconds)
0: jdbc:hive2://10.20.255.171:23040> update uniqdata set (cust_name)=('deepti') 
where cust_name>'CUST_NAME_01990';
++--+
| Updated Row Count |
++--+
| 6022 |
++--+
1 row selected (3.643 seconds)
0: jdbc:hive2://10.20.255.171:23040> update uniqdata set (cust_name)=('deepti') 
where cust_name<='CUST_NAME_01990';
++--+
| Updated Row Count |
++--+
| 5981 |
++--+
1 row selected (3.713 seconds)
0: jdbc:hive2://10.20.255.171:23040> update uniqdata set (cust_name)=('deepti') 

[jira] [Closed] (CARBONDATA-3838) Select filter query fails on SI columns of different SI tables.

2020-11-05 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat closed CARBONDATA-3838.
---
Fix Version/s: 2.1.0
   Resolution: Fixed

> Select filter query fails on SI columns of different SI tables.
> ---
>
> Key: CARBONDATA-3838
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3838
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 2.0.0
> Environment: Spark 2.3.2
>Reporter: Chetan Bhat
>Priority: Major
> Fix For: 2.1.0
>
>
> Select filter query fails on SI columns of different SI tables.
> *Steps :-*
> 0: jdbc:hive2://10.20.255.171:23040/default> create table brinjal (imei 
> string,AMSize string,channelsId string,ActiveCountry string, Activecity 
> string,gamePointId double,deviceInformationId double,productionDate 
> Timestamp,deliveryDate timestamp,deliverycharge double) stored as carbondata 
> TBLPROPERTIES('inverted_index'='imei,AMSize,channelsId,ActiveCountry,Activecity,productionDate,deliveryDate','sort_columns'='imei,AMSize,channelsId,ActiveCountry,Activecity,productionDate,deliveryDate','table_blocksize'='1','SORT_SCOPE'='GLOBAL_SORT','carbon.column.compressor'='zstd');
> +-+--+
> | Result |
> +-+--+
> +-+--+
> No rows selected (0.153 seconds)
> 0: jdbc:hive2://10.20.255.171:23040/default> LOAD DATA INPATH 
> 'hdfs://hacluster/chetan/vardhandaterestruct.csv' INTO TABLE brinjal 
> OPTIONS('DELIMITER'=',', 'QUOTECHAR'= 
> '"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 
> 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');
> +-+--+
> | Result |
> +-+--+
> +-+--+
> No rows selected (2.357 seconds)
> 0: jdbc:hive2://10.20.255.171:23040/default> CREATE INDEX indextable1 ON 
> TABLE brinjal (channelsId) AS 'carbondata' 
> PROPERTIES('carbon.column.compressor'='zstd');
> +-+--+
> | Result |
> +-+--+
> +-+--+
> No rows selected (1.048 seconds)
> 0: jdbc:hive2://10.20.255.171:23040/default> CREATE INDEX indextable2 ON 
> TABLE brinjal (ActiveCountry) AS 'carbondata' 
> PROPERTIES('carbon.column.compressor'='zstd');
> +-+--+
> | Result |
> +-+--+
> +-+--+
> No rows selected (1.895 seconds)
> 0: jdbc:hive2://10.20.255.171:23040/default> select * from brinjal where 
> ActiveCountry ='Chinese' or channelsId =4;
> Error: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: 
> execute, tree:
> Exchange hashpartitioning(positionReference#6440, 200)
> +- *(6) HashAggregate(keys=[positionReference#6440], functions=[], 
> output=[positionReference#6440])
>  +- Union
>  :- *(3) HashAggregate(keys=[positionReference#6440], functions=[], 
> output=[positionReference#6440])
>  : +- Exchange hashpartitioning(positionReference#6440, 200)
>  : +- *(2) HashAggregate(keys=[positionReference#6440], functions=[], 
> output=[positionReference#6440])
>  : +- *(2) Project [positionReference#6440]
>  : +- *(2) Filter (cast(channelsid#6439 as int) = 4)
>  : +- *(2) FileScan carbondata 
> 2_0.indextable1[positionReference#6440,channelsid#6439] PushedFilters: 
> [CastExpr((cast(channelsid#6439 as int) = 4))], ReadSchema: 
> struct
>  +- *(5) HashAggregate(keys=[positionReference#6442], functions=[], 
> output=[positionReference#6442])
>  +- Exchange hashpartitioning(positionReference#6442, 200)
>  +- *(4) HashAggregate(keys=[positionReference#6442], functions=[], 
> output=[positionReference#6442])
>  +- *(4) Project [positionReference#6442|#6442]
>  +- *(4) Filter (activecountry#6441 = Chinese)
>  +- *(4) FileScan carbondata 
> 2_0.indextable2[positionReference#6442,activecountry#6441] PushedFilters: 
> [EqualTo(activecountry,Chinese)], ReadSchema: 
> struct (state=,code=0)
>  
> *Log -*
> org.apache.carbondata.core.datastore.block.SegmentPropertiesAndSchemaHolder.addSegmentProperties(SegmentPropertiesAndSchemaHolder.java:117)org.apache.carbondata.core.datastore.block.SegmentPropertiesAndSchemaHolder.addSegmentProperties(SegmentPropertiesAndSchemaHolder.java:117)2020-06-01
>  12:19:28,058 | ERROR | [HiveServer2-Background-Pool: Thread-1150] | Error 
> executing query, currentState RUNNING,  | 
> org.apache.spark.internal.Logging$class.logError(Logging.scala:91)org.apache.spark.sql.catalyst.errors.package$TreeNodeException:
>  execute, tree:Exchange hashpartitioning(positionReference#6440, 200)+- *(6) 
> HashAggregate(keys=[positionReference#6440], functions=[], 
> output=[positionReference#6440])   +- Union      :- *(3) 
> HashAggregate(keys=[positionReference#6440], functions=[], 
> output=[positionReference#6440])      :  +- Exchange 
> hashpartitioning(positionReference#6440, 200)      :     +- *(2) 
> 

[jira] [Closed] (CARBONDATA-3971) Session level dynamic properties for repair(carbon.load.si.repair and carbon.si.repair.limit) are not updated in https://github.com/apache/carbondata/blob/master/docs

2020-10-29 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat closed CARBONDATA-3971.
---
Fix Version/s: 2.1.0
   Resolution: Fixed

Issue is fixed.

> Session level dynamic properties for repair(carbon.load.si.repair and 
> carbon.si.repair.limit) are not updated in 
> https://github.com/apache/carbondata/blob/master/docs/configuration-parameters.md
> --
>
> Key: CARBONDATA-3971
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3971
> Project: CarbonData
>  Issue Type: Bug
>  Components: docs
>Affects Versions: 2.1.0
>Reporter: Chetan Bhat
>Priority: Minor
> Fix For: 2.1.0
>
>
> Session level dynamic properties for repair (carbon.load.si.repair and 
> carbon.si.repair.limit) are not mentioned in the github link - 
> https://github.com/apache/carbondata/blob/master/docs/configuration-parameters.md
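
For reference, a minimal sketch of setting these dynamic properties at session 
level (assuming they follow the usual SET syntax for carbon dynamic properties; 
the values shown are illustrative only):

SET carbon.load.si.repair=true;
SET carbon.si.repair.limit=2;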



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (CARBONDATA-4010) "Alter table set tblproperties should support long string columns" and bad record handling of long string data for string columns need to be updated in https://github

2020-10-27 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat closed CARBONDATA-4010.
---

Issue is fixed in Carbon 2.1 version.

> "Alter table set tblproperties should support long string columns" and bad 
> record handling of long string data for string columns need to be updated in 
> https://github.com/apache/carbondata/blob/master/docs
> -
>
> Key: CARBONDATA-4010
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4010
> Project: CarbonData
>  Issue Type: Bug
>  Components: docs
>Affects Versions: 2.1.0
> Environment: https://github.com/apache/carbondata/blob/master/docs
>Reporter: Chetan Bhat
>Priority: Minor
> Fix For: 2.1.0
>
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> "Alter table set tblproperties should support long string columns" and bad 
> record handling of long string data for string columns need to be updated in 
> https://github.com/apache/carbondata/blob/master/docs
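
For reference, a minimal sketch of the two behaviours this issue asks to document, 
assuming the usual 32000-character limit for regular string columns (table and 
column names are illustrative only):

ALTER TABLE t SET TBLPROPERTIES('long_string_columns'='description,note');
-- a string column value longer than 32000 characters is handled as a bad record
-- according to the configured bad records action, rather than failing the load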



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (CARBONDATA-3932) need to change discovery.uri and add hive.metastore.uri,hive.config.resources in https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md#presto-mul

2020-10-27 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat closed CARBONDATA-3932.
---

Issue is fixed in Carbon 2.1 version.

> need to change discovery.uri and add  
> hive.metastore.uri,hive.config.resources  in 
> https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md#presto-multinode-cluster-setup-for-carbondata
> -
>
> Key: CARBONDATA-3932
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3932
> Project: CarbonData
>  Issue Type: Bug
>  Components: docs, presto-integration
>Affects Versions: 2.0.1
> Environment: Documentation
>Reporter: Chetan Bhat
>Priority: Minor
> Fix For: 2.1.0
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Need to change discovery.uri=:8086 to 
> discovery.uri=http://:8086 in 
> [https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md#presto-multinode-cluster-setup-for-carbondata]
> These configurations also need to be added in carbondata.properties and updated 
> in the carbondata-presto open source doc.
> 1.hive.metastore.uri
> 2.hive.config.resources
> Ex : -
> connector.name=carbondata
> hive.metastore.uri=thrift://10.21.18.106:9083
> hive.config.resources=/opt/HA/C10/install/hadoop/datanode/etc/hadoop/core-site.xml,/opt/HA/C10/install/hadoop/datanode/etc/hadoop/hdfs-site.xml
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (CARBONDATA-3901) Documentation issues in https://github.com/apache/carbondata/tree/master/docs

2020-10-27 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat closed CARBONDATA-3901.
---

Issue is fixed in Carbon 2.1 version

> Documentation issues in https://github.com/apache/carbondata/tree/master/docs
> -
>
> Key: CARBONDATA-3901
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3901
> Project: CarbonData
>  Issue Type: Bug
>  Components: docs
>Affects Versions: 2.0.1
> Environment: https://github.com/apache/carbondata/tree/master/docs
>Reporter: Chetan Bhat
>Priority: Minor
> Fix For: 2.1.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> *Issue 1  -* 
> [https://github.com/apache/carbondata/blob/master/docs/ddl-of-carbondata.mdSORT_SCOPE]
>  Sort scope of the load. Options include no sort, local sort, batch sort and 
> global sort --> batch sort is to be removed as it is not supported.
> *Issue 2 -* 
> [https://github.com/apache/carbondata/blob/master/docs/streaming-guide.md#close-stream]
>    CLOSE STREAM link is not working.
>  
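
Regarding Issue 1, a minimal sketch of the sort scopes that remain supported 
(table name and columns are illustrative only):

CREATE TABLE sort_scope_demo (id INT, name STRING)
STORED AS carbondata
TBLPROPERTIES('SORT_COLUMNS'='name', 'SORT_SCOPE'='LOCAL_SORT');
-- supported SORT_SCOPE values: NO_SORT, LOCAL_SORT, GLOBAL_SORT (batch sort is not supported)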



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (CARBONDATA-3824) Error when Secondary index tried to be created on table that does not exist is not correct.

2020-10-27 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat closed CARBONDATA-3824.
---

Issue is fixed in Carbon 2.1 version.

> Error when Secondary index tried to be created on table that does not exist 
> is not correct.
> ---
>
> Key: CARBONDATA-3824
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3824
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 2.0.0
> Environment: Spark 2.3.2, Spark 2.4.5
>Reporter: Chetan Bhat
>Priority: Minor
> Fix For: 2.1.0
>
>
> *Issue :-*
> Table uniqdata_double does not exist.
> Secondary index tried to be created on table. Error message is incorrect.
> CREATE INDEX indextable2 ON TABLE uniqdata_double (DOB) AS 'carbondata' 
> PROPERTIES('carbon.column.compressor'='zstd');
> *Error: java.lang.RuntimeException: Operation not allowed on non-carbon table 
> (state=,code=0)*
>  
> *Expected :-*
> *Error: java.lang.RuntimeException: Table does not exist (state=,code=0)*



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CARBONDATA-3901) Documentation issues in https://github.com/apache/carbondata/tree/master/docs

2020-10-19 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-3901:

Description: 
*Issue 1  -* 
[https://github.com/apache/carbondata/blob/master/docs/ddl-of-carbondata.mdSORT_SCOPE]
 Sort scope of the load.Options include no sort, local sort ,batch sort and 
global sort  --> Batch sort to be removed as its not supported.

*Issue 2 -* 
[https://github.com/apache/carbondata/blob/master/docs/streaming-guide.md#close-stream]
   CLOSE STREAM link is not working.

 

  was:
*Issue 1 :* 
[https://github.com/apache/carbondata/blob/master/docs/alluxio-guide.md] 
getOrCreateCarbonSession not used in Carbon 2.0 version and should be 
removed.Issue 1 : 
[https://github.com/apache/carbondata/blob/master/docs/alluxio-guide.md] 
getOrCreateCarbonSession not used in Carbon 2.0 version and should be removed.
 Testing use alluxio by CarbonSessionimport 
org.apache.spark.sql.CarbonSession._import org.apache.spark.sql.SparkSession   
val carbon = 
SparkSession.builder().master("local").appName("test").getOrCreateCarbonSession("alluxio://localhost:19998/carbondata");carbon.sql("CREATE
 TABLE carbon_alluxio(id String,name String, city String,age Int) STORED as 
carbondata");carbon.sql(s"LOAD DATA LOCAL INPATH 
'${CARBONDATA_PATH}/integration/spark/src/test/resources/sample.csv' into table 
carbon_alluxio");carbon.sql("select * from carbon_alluxio").show

*Issue 2  -* 
[https://github.com/apache/carbondata/blob/master/docs/ddl-of-carbondata.mdSORT_SCOPE]
 Sort scope of the load.Options include no sort, local sort ,batch sort and 
global sort  --> Batch sort to be removed as its not supported.

*Issue 3 -* 
[https://github.com/apache/carbondata/blob/master/docs/streaming-guide.md#close-stream]
   CLOSE STREAM link is not working.

*Issue 4 -*  
[https://github.com/apache/carbondata/blob/master/docs/index/bloomfilter-index-guide.md]
  Explain query does not hit the bloom. Hence the line "User can verify whether 
a query can leverage BloomFilter Index by executing {{EXPLAIN}} command, which 
will show the transformed logical plan, and thus user can check whether the 
BloomFilter Index can skip blocklets during the scan."  needs to be removed.


> Documentation issues in https://github.com/apache/carbondata/tree/master/docs
> -
>
> Key: CARBONDATA-3901
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3901
> Project: CarbonData
>  Issue Type: Bug
>  Components: docs
>Affects Versions: 2.0.1
> Environment: https://github.com/apache/carbondata/tree/master/docs
>Reporter: Chetan Bhat
>Priority: Minor
> Fix For: 2.1.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> *Issue 1  -* 
> [https://github.com/apache/carbondata/blob/master/docs/ddl-of-carbondata.mdSORT_SCOPE]
>  Sort scope of the load.Options include no sort, local sort ,batch sort and 
> global sort  --> Batch sort to be removed as its not supported.
> *Issue 2 -* 
> [https://github.com/apache/carbondata/blob/master/docs/streaming-guide.md#close-stream]
>    CLOSE STREAM link is not working.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CARBONDATA-3901) Documentation issues in https://github.com/apache/carbondata/tree/master/docs

2020-10-13 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-3901:

Description: 
*Issue 1 :* 
[https://github.com/apache/carbondata/blob/master/docs/alluxio-guide.md] 
getOrCreateCarbonSession not used in Carbon 2.0 version and should be 
removed.Issue 1 : 
[https://github.com/apache/carbondata/blob/master/docs/alluxio-guide.md] 
getOrCreateCarbonSession not used in Carbon 2.0 version and should be removed.
 Testing use alluxio by CarbonSessionimport 
org.apache.spark.sql.CarbonSession._import org.apache.spark.sql.SparkSession   
val carbon = 
SparkSession.builder().master("local").appName("test").getOrCreateCarbonSession("alluxio://localhost:19998/carbondata");carbon.sql("CREATE
 TABLE carbon_alluxio(id String,name String, city String,age Int) STORED as 
carbondata");carbon.sql(s"LOAD DATA LOCAL INPATH 
'${CARBONDATA_PATH}/integration/spark/src/test/resources/sample.csv' into table 
carbon_alluxio");carbon.sql("select * from carbon_alluxio").show

*Issue 2  -* 
[https://github.com/apache/carbondata/blob/master/docs/ddl-of-carbondata.mdSORT_SCOPE]
 Sort scope of the load.Options include no sort, local sort ,batch sort and 
global sort  --> Batch sort to be removed as its not supported.

*Issue 3 -* 
[https://github.com/apache/carbondata/blob/master/docs/streaming-guide.md#close-stream]
   CLOSE STREAM link is not working.

*Issue 4 -*  
[https://github.com/apache/carbondata/blob/master/docs/index/bloomfilter-index-guide.md]
  Explain query does not hit the bloom. Hence the line "User can verify whether 
a query can leverage BloomFilter Index by executing {{EXPLAIN}} command, which 
will show the transformed logical plan, and thus user can check whether the 
BloomFilter Index can skip blocklets during the scan."  needs to be removed.

  was:
*Issue 1 :* 
[https://github.com/apache/carbondata/blob/master/docs/alluxio-guide.md] 
getOrCreateCarbonSession not used in Carbon 2.0 version and should be 
removed.Issue 1 : 
[https://github.com/apache/carbondata/blob/master/docs/alluxio-guide.md] 
getOrCreateCarbonSession not used in Carbon 2.0 version and should be removed.
 Testing use alluxio by CarbonSessionimport 
org.apache.spark.sql.CarbonSession._import org.apache.spark.sql.SparkSession   
val carbon = 
SparkSession.builder().master("local").appName("test").getOrCreateCarbonSession("alluxio://localhost:19998/carbondata");carbon.sql("CREATE
 TABLE carbon_alluxio(id String,name String, city String,age Int) STORED as 
carbondata");carbon.sql(s"LOAD DATA LOCAL INPATH 
'${CARBONDATA_PATH}/integration/spark/src/test/resources/sample.csv' into table 
carbon_alluxio");carbon.sql("select * from carbon_alluxio").show

*Issue 2  -* 
[https://github.com/apache/carbondata/blob/master/docs/ddl-of-carbondata.mdSORT_SCOPE]
 Sort scope of the load.Options include no sort, local sort ,batch sort and 
global sort  --> Batch sort to be removed as its not supported.

*Issue 3 -* 
[https://github.com/apache/carbondata/blob/master/docs/streaming-guide.md#close-stream]
   CLOSE STREAM link is not working.

*Issue 4 -*  
[https://github.com/apache/carbondata/blob/master/docs/index/bloomfilter-index-guide.md]
  Explain query does not hit the MV. Hence the line "User can verify whether a 
query can leverage BloomFilter Index by executing {{EXPLAIN}} command, which 
will show the transformed logical plan, and thus user can check whether the 
BloomFilter Index can skip blocklets during the scan."  needs to be removed.


> Documentation issues in https://github.com/apache/carbondata/tree/master/docs
> -
>
> Key: CARBONDATA-3901
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3901
> Project: CarbonData
>  Issue Type: Bug
>  Components: docs
>Affects Versions: 2.0.1
> Environment: https://github.com/apache/carbondata/tree/master/docs
>Reporter: Chetan Bhat
>Priority: Minor
>
> *Issue 1 :* 
> [https://github.com/apache/carbondata/blob/master/docs/alluxio-guide.md] 
> getOrCreateCarbonSession not used in Carbon 2.0 version and should be 
> removed.Issue 1 : 
> [https://github.com/apache/carbondata/blob/master/docs/alluxio-guide.md] 
> getOrCreateCarbonSession not used in Carbon 2.0 version and should be removed.
>  Testing use alluxio by CarbonSessionimport 
> org.apache.spark.sql.CarbonSession._import org.apache.spark.sql.SparkSession  
>  val carbon = 
> SparkSession.builder().master("local").appName("test").getOrCreateCarbonSession("alluxio://localhost:19998/carbondata");carbon.sql("CREATE
>  TABLE carbon_alluxio(id String,name String, city String,age Int) STORED as 
> carbondata");carbon.sql(s"LOAD DATA LOCAL INPATH 
> '${CARBONDATA_PATH}/integration/spark/src/test/resources/sample.csv' into 
> table 

[jira] [Closed] (CARBONDATA-3825) Refresh table in carbonsession using carbonextension fails for a table created in sparkfile format

2020-10-09 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat closed CARBONDATA-3825.
---
Resolution: Invalid

Issue is analyzed as invalid and closed.

> Refresh table in carbonsession using carbonextension fails  for a table 
> created in sparkfile format
> ---
>
> Key: CARBONDATA-3825
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3825
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 2.0.0
> Environment: Spark 2.3.2, 2.4.5
>Reporter: Chetan Bhat
>Priority: Major
>
> In the 1.6.1 or 2.0 version, create a table in a db in spark file format and 
> insert records into the table.
> Take a backup of the table store and drop the database.
> In a carbonsession using carbonextension, create a database with the same name 
> as the db in sparkfileformat and copy the table store of sparkfileformat to the 
> db path in hdfs.
> Execute the refresh table command.
> Refresh table fails with error "Table or view not found in database"



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (CARBONDATA-3795) Create external carbon table fails if the schema is not provided

2020-10-09 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat closed CARBONDATA-3795.
---
Fix Version/s: 2.0.1
   Resolution: Fixed

Issue fixed in 2.0.1

> Create external carbon table fails if the schema is not provided
> 
>
> Key: CARBONDATA-3795
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3795
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 2.0.0
> Environment: Spark 2.4.5 compatible carbon jars
>Reporter: Chetan Bhat
>Priority: Major
> Fix For: 2.0.1
>
>
> Create external carbon table fails if the schema is not provided.
> Example command - 
> create external table test1 stored as carbondata location 
> '/user/sparkhive/warehouse/1_6_1.db/brinjal/';
> *Error: org.apache.spark.sql.AnalysisException: Unable to infer the schema. 
> The schema specification is required to create the table `1_6_1`.`test1`.; 
> (state=,code=0)*
>  
> *Logs -*
> 2020-05-05 22:57:25,638 | ERROR | [HiveServer2-Background-Pool: Thread-371] | 
> Error executing query, currentState RUNNING, | 
> org.apache.spark.internal.Logging$class.logError(Logging.scala:91)
> org.apache.spark.sql.AnalysisException: Unable to infer the schema. The 
> schema specification is required to create the table `1_6_1`.`test1`.;
>  at 
> org.apache.spark.sql.hive.ResolveHiveSerdeTable$$anonfun$apply$1.applyOrElse(HiveStrategies.scala:104)
>  at 
> org.apache.spark.sql.hive.ResolveHiveSerdeTable$$anonfun$apply$1.applyOrElse(HiveStrategies.scala:90)
>  at 
> org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsDown$1$$anonfun$2.apply(AnalysisHelper.scala:108)
>  at 
> org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsDown$1$$anonfun$2.apply(AnalysisHelper.scala:108)
>  at 
> org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:69)
>  at 
> org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsDown$1.apply(AnalysisHelper.scala:107)
>  at 
> org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$$anonfun$resolveOperatorsDown$1.apply(AnalysisHelper.scala:106)
>  at 
> org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:194)
>  at 
> org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$class.resolveOperatorsDown(AnalysisHelper.scala:106)
>  at 
> org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsDown(LogicalPlan.scala:29)
>  at 
> org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$class.resolveOperators(AnalysisHelper.scala:73)
>  at 
> org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperators(LogicalPlan.scala:29)
>  at 
> org.apache.spark.sql.hive.ResolveHiveSerdeTable.apply(HiveStrategies.scala:90)
>  at 
> org.apache.spark.sql.hive.ResolveHiveSerdeTable.apply(HiveStrategies.scala:44)
>  at 
> org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1.apply(RuleExecutor.scala:87)
>  at 
> org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1.apply(RuleExecutor.scala:84)
>  at 
> scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:124)
>  at scala.collection.immutable.List.foldLeft(List.scala:84)
>  at 
> org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1.apply(RuleExecutor.scala:84)
>  at 
> org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1.apply(RuleExecutor.scala:76)
>  at scala.collection.immutable.List.foreach(List.scala:392)
>  at 
> org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:76)
>  at 
> org.apache.spark.sql.catalyst.analysis.Analyzer.org$apache$spark$sql$catalyst$analysis$Analyzer$$executeSameContext(Analyzer.scala:127)
>  at 
> org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:121)
>  at 
> org.apache.spark.sql.catalyst.analysis.Analyzer$$anonfun$executeAndCheck$1.apply(Analyzer.scala:106)
>  at 
> org.apache.spark.sql.catalyst.analysis.Analyzer$$anonfun$executeAndCheck$1.apply(Analyzer.scala:105)
>  at 
> org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:201)
>  at 
> org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:105)
>  at 
> org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:58)
>  at 
> org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:56)
>  at 
> org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:48)
>  at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:78)
>  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
>  at 

[jira] [Closed] (CARBONDATA-3794) show metacache command takes significantly more time 1st time when compared to 2nd time.

2020-10-07 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat closed CARBONDATA-3794.
---
Fix Version/s: 2.1.0
   Resolution: Fixed

The issue is resolved in Carbon 2.1.0 version.

0: jdbc:hive2://10.20.251.163:23040/default> show metacache;
+-+---+-+-+
| Identifier | Table Index size | CgAndFg Index size | Cache Location |
+-+---+-+-+
+-+---+-+-+
No rows selected (7.059 seconds)
0: jdbc:hive2://10.20.251.163:23040/default> show metacache;
+-+---+-+-+
| Identifier | Table Index size | CgAndFg Index size | Cache Location |
+-+---+-+-+
+-+---+-+-+
No rows selected (6.52 seconds)

> show metacache command takes significantly more time 1st time when compared 
> to 2nd time.
> 
>
> Key: CARBONDATA-3794
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3794
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 2.0.0
> Environment: Spark 2.3.2 compatible carbon
>Reporter: Chetan Bhat
>Priority: Minor
> Fix For: 2.1.0
>
>
> show metacache command takes significantly more time 1st time when compared 
> to 2nd time.
> *1st time*
> 0: jdbc:hive2://10.20.255.171:23040/default> show metacache;
> +-----------------------------+-------------+---------------+-----------------+
> | Identifier                  | Index size  | Datamap size  | Cache Location  |
> +-----------------------------+-------------+---------------+-----------------+
> | TOTAL                       | 745 B       | 0 B           | DRIVER          |
> | 1_6_1.uniqdata_comp_nosort  | 745 B       | 0 B           | DRIVER          |
> +-----------------------------+-------------+---------------+-----------------+
>  *2 rows selected (8.233 seconds)*
>  
> *2nd time*
> 0: jdbc:hive2://10.20.255.171:23040/default> show metacache;
> +-----------------------------+-------------+---------------+-----------------+
> | Identifier                  | Index size  | Datamap size  | Cache Location  |
> +-----------------------------+-------------+---------------+-----------------+
> | TOTAL                       | 745 B       | 0 B           | DRIVER          |
> | 1_6_1.uniqdata_comp_nosort  | 745 B       | 0 B           | DRIVER          |
> +-----------------------------+-------------+---------------+-----------------+
>  *2 rows selected (1.46 seconds)*
>  
> *Sometimes the 1st time show metacache takes upto 25 seconds compared to 3-4 
> seconds for 2nd time show metacache.*



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CARBONDATA-3794) show metacache command takes significantly more time 1st time when compared to 2nd time.

2020-10-07 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-3794:

Description: 
show metacache command takes significantly more time 1st time when compared to 
2nd time.

*1st time*

0: jdbc:hive2://10.20.255.171:23040/default> show metacache;
+-----------------------------+-------------+---------------+-----------------+
| Identifier                  | Index size  | Datamap size  | Cache Location  |
+-----------------------------+-------------+---------------+-----------------+
| TOTAL                       | 745 B       | 0 B           | DRIVER          |
| 1_6_1.uniqdata_comp_nosort  | 745 B       | 0 B           | DRIVER          |
+-----------------------------+-------------+---------------+-----------------+
*2 rows selected (8.233 seconds)*

*2nd time*
0: jdbc:hive2://10.20.255.171:23040/default> show metacache;
+-----------------------------+-------------+---------------+-----------------+
| Identifier                  | Index size  | Datamap size  | Cache Location  |
+-----------------------------+-------------+---------------+-----------------+
| TOTAL                       | 745 B       | 0 B           | DRIVER          |
| 1_6_1.uniqdata_comp_nosort  | 745 B       | 0 B           | DRIVER          |
+-----------------------------+-------------+---------------+-----------------+
*2 rows selected (1.46 seconds)*

 

*Sometimes the 1st time show metacache takes upto 25 seconds compared to 3-4 
seconds for 2nd time show metacache.*

  was:
show metacache command takes significantly more time 1st time when compared to 
2nd time.

*1st time*

0: jdbc:hive2://10.20.255.171:23040/show metcshow metacache;
+-+-+---+-+--+
| Identifier | Index size | Datamap size | Cache Location |
+-+-+---+-+--+
| TOTAL | 745 B | 0 B | DRIVER |
| 1_6_1.uniqdata_comp_nosort | 745 B | 0 B | DRIVER |
+-+-+---+-+--+
*2 rows selected (8.233 seconds)*

 

*2nd time*
0: jdbc:hive2://10.20.255.171:23040/default> show metacache;
+-+-+---+-+--+
| Identifier | Index size | Datamap size | Cache Location |
+-+-+---+-+--+
| TOTAL | 745 B | 0 B | DRIVER |
| 1_6_1.uniqdata_comp_nosort | 745 B | 0 B | DRIVER |
+-+-+---+-+--+
*2 rows selected (1.46 seconds)*

 

*Sometimes the 1st time show metacache takes upto 25 seconds compared to 3-4 
seconds for 2nd time show metacache.*


> show metacache command takes significantly more time 1st time when compared 
> to 2nd time.
> 
>
> Key: CARBONDATA-3794
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3794
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 2.0.0
> Environment: Spark 2.3.2 compatible carbon
>Reporter: Chetan Bhat
>Priority: Minor
>
> show metacache command takes significantly more time 1st time when compared 
> to 2nd time.
> *1st time*
> 0: jdbc:hive2://10.20.255.171:23040/default> show metacache;
> +-----------------------------+-------------+---------------+-----------------+
> | Identifier                  | Index size  | Datamap size  | Cache Location  |
> +-----------------------------+-------------+---------------+-----------------+
> | TOTAL                       | 745 B       | 0 B           | DRIVER          |
> | 1_6_1.uniqdata_comp_nosort  | 745 B       | 0 B           | DRIVER          |
> +-----------------------------+-------------+---------------+-----------------+
>  *2 rows selected (8.233 seconds)*
>  
> *2nd time*
> 0: jdbc:hive2://10.20.255.171:23040/default> show metacache;
> +-----------------------------+-------------+---------------+-----------------+
> | Identifier                  | Index size  | Datamap size  | Cache Location  |
> +-----------------------------+-------------+---------------+-----------------+
> | TOTAL                       | 745 B       | 0 B           | DRIVER          |
> | 1_6_1.uniqdata_comp_nosort  | 745 B       | 0 B           | DRIVER          |
> +-----------------------------+-------------+---------------+-----------------+
>  *2 rows selected (1.46 seconds)*
>  
> *Sometimes the 1st time show metacache takes upto 25 seconds compared to 3-4 
> seconds for 2nd time show metacache.*



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (CARBONDATA-3950) Alter table drop column for non partition column throws error

2020-10-07 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat closed CARBONDATA-3950.
---
Fix Version/s: 2.1.0
   Resolution: Fixed

Issue is fixed in latest Carbon 2.1.0 build

> Alter table drop column for non partition column throws error
> -
>
> Key: CARBONDATA-3950
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3950
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 2.0.1
> Environment: Spark 2.4.5
>Reporter: Chetan Bhat
>Priority: Minor
> Fix For: 2.1.0
>
>
> From spark-sql the queries are executed as mentioned below-
> drop table if exists uniqdata_int;
> CREATE TABLE uniqdata_int (CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB 
> timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double, INTEGER_COLUMN1 
> int) Partitioned by (cust_id int) stored as carbondata TBLPROPERTIES 
> ("TABLE_BLOCKSIZE"= "256 MB");
> LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table 
> uniqdata_int partition(cust_id='1') OPTIONS ('FILEHEADER'='CUST_ID,CUST_NAME 
> ,ACTIVE_EMUI_VERSION,DOB,DOJ, 
> BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1, 
> Double_COLUMN2,INTEGER_COLUMN1','BAD_RECORDS_ACTION'='FORCE');
> show partitions uniqdata_int;
> select * from uniqdata_int order by cust_id;
> alter table uniqdata_int add columns(id int);
>  desc uniqdata_int;
>  *alter table uniqdata_int drop columns(CUST_NAME);*
>  desc uniqdata_int;
> Issue : Alter table drop column for a non-partition column throws an error even 
> though the operation is successful.
> org.apache.carbondata.spark.exception.ProcessMetaDataException: operation 
> failed for priyesh.uniqdata_int: Alterion failed: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Unable to alter table. The 
> following columns have he existing columns in their respective positions :
> col;
>  at 
> org.apache.spark.sql.execution.command.MetadataProcessOperation$class.throwMetadataException(package.
>  at 
> org.apache.spark.sql.execution.command.MetadataCommand.throwMetadataException(package.scala:120)
>  at 
> org.apache.spark.sql.execution.command.schema.CarbonAlterTableDropColumnCommand.processMetadata(Carboand.scala:201)
>  at 
> org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:123)
>  at 
> org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:123)
>  at 
> org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:104)
>  at 
> org.apache.spark.sql.execution.command.MetadataCommand.runWithAudit(package.scala:120)
>  at 
> org.apache.spark.sql.execution.command.MetadataCommand.run(package.scala:123)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:69)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:80)
>  at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:196)
>  at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:196)
>  at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3379)
>  at 
> org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:95
>  at 
> org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:144)
>  at 
> org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:86)
>  at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3378)
>  at org.apache.spark.sql.Dataset.(Dataset.scala:196)
>  at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:79)
>  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:651)
>  at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:67)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:387)
>  at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:279)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at 

[jira] [Closed] (CARBONDATA-3867) Show materialized views command not documented in https://github.com/apache/carbondata/blob/master/docs/mv-guide.md

2020-10-06 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat closed CARBONDATA-3867.
---
Fix Version/s: 2.1.0
   Resolution: Fixed

Updated in [https://github.com/apache/carbondata/blob/master/docs/mv-guide.md]. 
Defect closed.
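
For reference, a minimal sketch of the command that the guide now documents (syntax as per the mv-guide; the table name is hypothetical):

SHOW MATERIALIZED VIEWS;
SHOW MATERIALIZED VIEWS ON TABLE uniqdata;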

> Show materialized views command not documented in 
> https://github.com/apache/carbondata/blob/master/docs/mv-guide.md
> ---
>
> Key: CARBONDATA-3867
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3867
> Project: CarbonData
>  Issue Type: Bug
>  Components: docs
>Affects Versions: 2.0.0
> Environment: 
> https://github.com/apache/carbondata/blob/master/docs/mv-guide.md
>Reporter: Chetan Bhat
>Priority: Minor
> Fix For: 2.1.0
>
>
> Show materialized views command not documented in 
> https://github.com/apache/carbondata/blob/master/docs/mv-guide.md



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (CARBONDATA-3949) Select filter query fails from presto-cli on MV table

2020-10-06 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat closed CARBONDATA-3949.
---
Fix Version/s: 2.1.0
   Resolution: Fixed

Limitation Updated in docs - 
[https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md]

Defect is closed.

> Select filter query fails from presto-cli on MV table
> -
>
> Key: CARBONDATA-3949
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3949
> Project: CarbonData
>  Issue Type: Bug
>  Components: presto-integration
>Affects Versions: 2.0.1
> Environment: Spark 2.4.5. PrestoSQL 316
>Reporter: Chetan Bhat
>Priority: Major
> Fix For: 2.1.0
>
>
> From sparksql create table , load data and create MV
> spark-sql> CREATE TABLE uniqdata(CUST_ID int,CUST_NAME 
> String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, 
> BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), 
> DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 
> double,INTEGER_COLUMN1 int) STORED as carbondata 
> TBLPROPERTIES('local_dictionary_enable'='true','local_dictionary_threshold'='1000');
>  Time taken: 0.753 seconds
>  spark-sql> LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into 
> table uniqdata OPTIONS('DELIMITER'=',', 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
>  OK
>  OK
>  Time taken: 1.992 seconds
>  spark-sql> CREATE MATERIALIZED VIEW mv1 as select cust_id, cust_name, 
> count(cust_id) from uniqdata group by cust_id, cust_name;
>  OK
>  Time taken: 4.336 seconds
>  
> From presto cli select filter query on table with MV fails.
> presto:chetan> select * from uniqdata where CUST_ID IS NULL or BIGINT_COLUMN1 
> =1233720368578 or DECIMAL_COLUMN1 = 12345678901.123458 or Double_COLUMN1 
> = 1.12345674897976E10 or INTEGER_COLUMN1 IS NULL ;
>  Query 20200804_092703_00253_ed34h failed: Unable to get file status:
> *Log-*
>  2020-08-04T18:09:55.975+0800 INFO Query-20200804_100955_00300_ed34h-2642 
> stdout 2020-08-04 18:09:55 WARN AbstractDFSCarbonFile:458 - Exception 
> occurred: File 
> hdfs://hacluster/user/sparkhive/warehouse/chetan.db/uniqdata_string/Metadata 
> does not exist.
>  java.io.FileNotFoundException: File 
> hdfs://hacluster/user/sparkhive/warehouse/chetan.db/uniqdata_string/Metadata 
> does not exist.
>  at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:1058)
>  at 
> org.apache.hadoop.hdfs.DistributedFileSystem.access$1000(DistributedFileSystem.java:131)
>  at 
> org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1118)
>  at 
> org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1115)
>  at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>  at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:1125)
>  at 
> org.apache.hadoop.fs.FilterFileSystem.listStatus(FilterFileSystem.java:270)
>  at 
> org.apache.carbondata.core.datastore.filesystem.AbstractDFSCarbonFile.listFiles(AbstractDFSCarbonFile.java:456)
>  at 
> org.apache.carbondata.core.datastore.filesystem.AbstractDFSCarbonFile.listFiles(AbstractDFSCarbonFile.java:559)
>  at 
> org.apache.carbondata.core.util.path.CarbonTablePath.getActualSchemaFilePath(CarbonTablePath.java:189)
>  at 
> org.apache.carbondata.core.util.path.CarbonTablePath.getSchemaFilePath(CarbonTablePath.java:168)
>  at 
> org.apache.carbondata.presto.impl.CarbonTableReader.updateSchemaTables(CarbonTableReader.java:147)
>  at 
> org.apache.carbondata.presto.impl.CarbonTableReader.getCarbonCache(CarbonTableReader.java:128)
>  at 
> org.apache.carbondata.presto.CarbondataSplitManager.getSplits(CarbondataSplitManager.java:145)
>  at 
> io.prestosql.spi.connector.classloader.ClassLoaderSafeConnectorSplitManager.getSplits(ClassLoaderSafeConnectorSplitManager.java:50)
>  at io.prestosql.split.SplitManager.getSplits(SplitManager.java:85)
>  at 
> io.prestosql.sql.planner.DistributedExecutionPlanner$Visitor.visitScanAndFilter(DistributedExecutionPlanner.java:189)
>  at 
> io.prestosql.sql.planner.DistributedExecutionPlanner$Visitor.visitFilter(DistributedExecutionPlanner.java:257)
>  at 
> io.prestosql.sql.planner.DistributedExecutionPlanner$Visitor.visitFilter(DistributedExecutionPlanner.java:149)
>  at io.prestosql.sql.planner.plan.FilterNode.accept(FilterNode.java:72)
>  at 
> io.prestosql.sql.planner.DistributedExecutionPlanner.doPlan(DistributedExecutionPlanner.java:119)
>  at 
> 

[jira] [Closed] (CARBONDATA-3806) Create bloom datamap fails with null pointer exception

2020-10-06 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat closed CARBONDATA-3806.
---
Fix Version/s: 2.1.0
   Resolution: Fixed

fixed in latest Carbon 2.1 B06 build

> Create bloom datamap fails with null pointer exception
> --
>
> Key: CARBONDATA-3806
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3806
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.6.1
> Environment: Spark 2.3.2
>Reporter: Chetan Bhat
>Priority: Major
> Fix For: 2.1.0
>
>
> Create bloom datamap fails with null pointer exception
> create table brinjal_bloom (imei string,AMSize string,channelsId 
> string,ActiveCountry string, Activecity string,gamePointId 
> double,deviceInformationId double,productionDate Timestamp,deliveryDate 
> timestamp,deliverycharge double) STORED BY 'carbondata' 
> TBLPROPERTIES('table_blocksize'='1');
> LOAD DATA INPATH 'hdfs://hacluster/chetan/vardhandaterestruct.csv' INTO TABLE 
> brinjal_bloom OPTIONS('DELIMITER'=',', 'QUOTECHAR'= 
> '"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 
> 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');
> 0: jdbc:hive2://10.20.255.171:23040/default> CREATE DATAMAP dm_brinjal4 ON 
> TABLE brinjal_bloom USING 'bloomfilter' DMPROPERTIES ('INDEX_COLUMNS' = 
> 'AMSize', 'BLOOM_SIZE'='64', 'BLOOM_FPP'='0.1');
> Error: org.apache.spark.SparkException: Job aborted due to stage failure: 
> Task 0 in stage 210.0 failed 4 times, most recent failure: Lost task 0.3 in 
> stage 210.0 (TID 1477, vm2, executor 2): java.lang.NullPointerException
>  at 
> org.apache.carbondata.core.datamap.Segment.getCommittedIndexFile(Segment.java:150)
>  at 
> org.apache.carbondata.core.util.BlockletDataMapUtil.getTableBlockUniqueIdentifiers(BlockletDataMapUtil.java:198)
>  at 
> org.apache.carbondata.core.indexstore.blockletindex.BlockletDataMapFactory.getTableBlockIndexUniqueIdentifiers(BlockletDataMapFactory.java:176)
>  at 
> org.apache.carbondata.core.indexstore.blockletindex.BlockletDataMapFactory.getDataMaps(BlockletDataMapFactory.java:154)
>  at 
> org.apache.carbondata.core.indexstore.blockletindex.BlockletDataMapFactory.getSegmentProperties(BlockletDataMapFactory.java:425)
>  at 
> org.apache.carbondata.datamap.IndexDataMapRebuildRDD.internalCompute(IndexDataMapRebuildRDD.scala:359)
>  at org.apache.carbondata.spark.rdd.CarbonRDD.compute(CarbonRDD.scala:84)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>  at org.apache.spark.scheduler.Task.run(Task.scala:109)
>  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
> Driver stacktrace: (state=,code=0)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (CARBONDATA-3797) Refresh materialized view command throws null pointer exception

2020-10-06 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat closed CARBONDATA-3797.
---
Fix Version/s: 2.1.0
   Resolution: Fixed

Issue fixed in Carbon 2.1.0

> Refresh materialized view command throws null pointer exception
> ---
>
> Key: CARBONDATA-3797
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3797
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 2.0.0
> Environment: Spark 2.3.2, Spark 2.4.5
>Reporter: Chetan Bhat
>Priority: Major
> Fix For: 2.1.0
>
>
> Refresh materialized view command throws null pointer exception
> CREATE TABLE uniqdata_mv(CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
> string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) STORED as carbondata 
> TBLPROPERTIES('local_dictionary_enable'='true','local_dictionary_threshold'='1000');
> LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table 
> uniqdata_mv OPTIONS('DELIMITER'=',', 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> CREATE MATERIALIZED VIEW mv1 as select cust_id, cust_name, count(cust_id) 
> from uniqdata_mv group by cust_id, cust_name;
> refresh MATERIALIZED VIEW mv1;
>  Error: java.lang.NullPointerException (state=,code=0)
>  
> *Exception-*
> 2020-05-06 00:50:59,941 | ERROR | [HiveServer2-Background-Pool: Thread-1822] 
> | Error executing query, currentState RUNNING, | 
> org.apache.spark.internal.Logging$class.logError(Logging.scala:91)
>  java.lang.NullPointerException
>  at org.apache.carbondata.view.MVRefresher$.refresh(MVRefresher.scala:62)
>  at 
> org.apache.spark.sql.execution.command.view.CarbonRefreshMVCommand.processData(CarbonRefreshMVCommand.scala:52)
>  at 
> org.apache.spark.sql.execution.command.DataCommand$$anonfun$run$2.apply(package.scala:132)
>  at 
> org.apache.spark.sql.execution.command.DataCommand$$anonfun$run$2.apply(package.scala:132)
>  at 
> org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:104)
>  at 
> org.apache.spark.sql.execution.command.DataCommand.runWithAudit(package.scala:130)
>  at org.apache.spark.sql.execution.command.DataCommand.run(package.scala:132)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
>  at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)
>  at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)
>  at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3370)
>  at 
> org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:80)
>  at 
> org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:127)
>  at 
> org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:75)
>  at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3369)
>  at org.apache.spark.sql.Dataset.(Dataset.scala:194)
>  at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:79)
>  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
>  at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:232)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:175)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:171)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:185)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> 

[jira] [Created] (CARBONDATA-4020) Drop bloom index for single index of table having multiple index drops all indexes

2020-10-01 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-4020:
---

 Summary: Drop bloom index for single index of table having 
multiple index drops all indexes
 Key: CARBONDATA-4020
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4020
 Project: CarbonData
  Issue Type: Bug
  Components: data-query
Affects Versions: 2.1.0
 Environment: Spark 2.4.5
Reporter: Chetan Bhat


Create multiple bloom indexes on the table. Try to drop single bloom index

drop table if exists datamap_test_1;
 CREATE TABLE datamap_test_1 (id int,name string,salary float,dob date)STORED 
as carbondata TBLPROPERTIES('SORT_COLUMNS'='id');
 
 CREATE index dm_datamap_test_1_2 ON TABLE datamap_test_1(id) as 'bloomfilter' 
PROPERTIES ( 'BLOOM_SIZE'='64', 'BLOOM_FPP'='0.1', 
'BLOOM_COMPRESS'='true');
 
 CREATE index dm_datamap_test3 ON TABLE datamap_test_1 (name) as 'bloomfilter' 
PROPERTIES ('BLOOM_SIZE'='64', 'BLOOM_FPP'='0.1', 
'BLOOM_COMPRESS'='true');

show indexes on table datamap_test_1;
drop index dm_datamap_test_1_2 on datamap_test_1;
show indexes on table datamap_test_1;

 

Issue : Drop bloom index for single index of table having multiple index drops 
all indexes

0: jdbc:hive2://linux-32:22550/> show indexes on table datamap_test_1;
+--+--+--++--+
| Name | Provider | Indexed Columns | Properties | Status | Sync In
+--+--+--++--+
| dm_datamap_test_1_2 | bloomfilter | id | 
'INDEX_COLUMNS'='id','bloom_compress'='true','bloom_fpp'='0.1','blo
| dm_datamap_test3 | bloomfilter | name | 
'INDEX_COLUMNS'='name','bloom_compress'='true','bloom_fpp'='0.1','b
+--+--+--++--+
2 rows selected (0.315 seconds)
0: jdbc:hive2://linux-32:22550/> drop index dm_datamap_test_1_2 on 
datamap_test_1;
+-+
| Result |
+-+
+-+
No rows selected (1.232 seconds)
0: jdbc:hive2://linux-32:22550/> show indexes on table datamap_test_1;
+---+---+--+-+-++
| Name | Provider | Indexed Columns | Properties | Status | Sync Info |
+---+---+--+-+-++
+---+---+--+-+-++
No rows selected (0.21 seconds)
0: jdbc:hive2://linux-32:22550/>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CARBONDATA-3901) Documentation issues in https://github.com/apache/carbondata/tree/master/docs

2020-09-30 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-3901:

Description: 
*Issue 1 :* 
[https://github.com/apache/carbondata/blob/master/docs/alluxio-guide.md] 
getOrCreateCarbonSession is not used in the Carbon 2.0 version and should be removed.
 Testing use alluxio by CarbonSession:
 import org.apache.spark.sql.CarbonSession._
 import org.apache.spark.sql.SparkSession
 val carbon = SparkSession.builder().master("local").appName("test").getOrCreateCarbonSession("alluxio://localhost:19998/carbondata");
 carbon.sql("CREATE TABLE carbon_alluxio(id String,name String, city String,age Int) STORED as carbondata");
 carbon.sql(s"LOAD DATA LOCAL INPATH '${CARBONDATA_PATH}/integration/spark/src/test/resources/sample.csv' into table carbon_alluxio");
 carbon.sql("select * from carbon_alluxio").show

*Issue 2  -* 
[https://github.com/apache/carbondata/blob/master/docs/ddl-of-carbondata.md] SORT_SCOPE - 
Sort scope of the load. Options include no sort, local sort, batch sort and 
global sort  --> Batch sort to be removed as it's not supported.

*Issue 3 -* 
[https://github.com/apache/carbondata/blob/master/docs/streaming-guide.md#close-stream]
   CLOSE STREAM link is not working.

*Issue 4 -*  
[https://github.com/apache/carbondata/blob/master/docs/index/bloomfilter-index-guide.md]
  Explain query does not hit the MV. Hence the line "User can verify whether a 
query can leverage BloomFilter Index by executing {{EXPLAIN}} command, which 
will show the transformed logical plan, and thus user can check whether the 
BloomFilter Index can skip blocklets during the scan."  needs to be removed.

  was:
*Issue 1 :* 
https://github.com/apache/carbondata/blob/master/docs/alluxio-guide.md 
getOrCreateCarbonSession not used in Carbon 2.0 version and should be 
removed.Issue 1 : 
https://github.com/apache/carbondata/blob/master/docs/alluxio-guide.md 
getOrCreateCarbonSession not used in Carbon 2.0 version and should be removed.
Testing use alluxio by CarbonSessionimport 
org.apache.spark.sql.CarbonSession._import org.apache.spark.sql.SparkSession   
val carbon = 
SparkSession.builder().master("local").appName("test").getOrCreateCarbonSession("alluxio://localhost:19998/carbondata");carbon.sql("CREATE
 TABLE carbon_alluxio(id String,name String, city String,age Int) STORED as 
carbondata");carbon.sql(s"LOAD DATA LOCAL INPATH 
'${CARBONDATA_PATH}/integration/spark/src/test/resources/sample.csv' into table 
carbon_alluxio");carbon.sql("select * from carbon_alluxio").show

*Issue 2  -* 
https://github.com/apache/carbondata/blob/master/docs/ddl-of-carbondata.mdSORT_SCOPE
 Sort scope of the load.Options include no sort, local sort ,batch sort and 
global sort  --> Batch sort to be removed as its not supported.

*Issue 3 -* 
https://github.com/apache/carbondata/blob/master/docs/streaming-guide.md#close-stream
   CLOSE STREAM link is not working.
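
Regarding Issue 2 above, a minimal sketch of how SORT_SCOPE is specified once batch sort is dropped from the documentation (only NO_SORT, LOCAL_SORT and GLOBAL_SORT remain; table and path names are hypothetical):

CREATE TABLE sort_demo (id INT, name STRING) STORED AS carbondata TBLPROPERTIES ('SORT_COLUMNS'='id', 'SORT_SCOPE'='LOCAL_SORT');
LOAD DATA INPATH 'hdfs://hacluster/demo/sort_demo.csv' INTO TABLE sort_demo OPTIONS ('SORT_SCOPE'='GLOBAL_SORT');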


> Documentation issues in https://github.com/apache/carbondata/tree/master/docs
> -
>
> Key: CARBONDATA-3901
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3901
> Project: CarbonData
>  Issue Type: Bug
>  Components: docs
>Affects Versions: 2.0.1
> Environment: https://github.com/apache/carbondata/tree/master/docs
>Reporter: Chetan Bhat
>Priority: Minor
>
> *Issue 1 :* 
> [https://github.com/apache/carbondata/blob/master/docs/alluxio-guide.md] 
> getOrCreateCarbonSession is not used in the Carbon 2.0 version and should be removed.
>  Testing use alluxio by CarbonSession:
>  import org.apache.spark.sql.CarbonSession._
>  import org.apache.spark.sql.SparkSession
>  val carbon = SparkSession.builder().master("local").appName("test").getOrCreateCarbonSession("alluxio://localhost:19998/carbondata");
>  carbon.sql("CREATE TABLE carbon_alluxio(id String,name String, city String,age Int) STORED as carbondata");
>  carbon.sql(s"LOAD DATA LOCAL INPATH '${CARBONDATA_PATH}/integration/spark/src/test/resources/sample.csv' into table carbon_alluxio");
>  carbon.sql("select * from carbon_alluxio").show
> *Issue 2  -* 
> [https://github.com/apache/carbondata/blob/master/docs/ddl-of-carbondata.md] SORT_SCOPE - 
> Sort scope of the load. Options include no sort, local sort, batch sort and 
> global sort  --> Batch sort to be removed as it's not supported.
> *Issue 3 -* 
> [https://github.com/apache/carbondata/blob/master/docs/streaming-guide.md#close-stream]
>    CLOSE STREAM link is 

[jira] [Closed] (CARBONDATA-4013) NullPointerException when use ArrowCarbonReader to read carbondata created using orc ,parquet and avro files in SDK

2020-09-28 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat closed CARBONDATA-4013.
---
Resolution: Invalid

https://issues.apache.org/jira/browse/CARBONDATA-3365

*Stage2:* 
Deep integration with carbon vector; for this, currently carbon SDK vector 
doesn't support filling complex columns. 

As mentioned in that Jira, the Arrow reader SDK interfaces do not support 
complex types.

 

 

> NullPointerException when use ArrowCarbonReader to read carbondata created 
> using orc ,parquet and avro files in SDK
> ---
>
> Key: CARBONDATA-4013
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4013
> Project: CarbonData
>  Issue Type: Bug
>  Components: other
>Affects Versions: 2.1.0
> Environment: Spark 2.4.5 compiled jars
>Reporter: Chetan Bhat
>Priority: Major
>
> when use ArrowCarbonReader to read carbondata created using orc files in SDK-
> java.lang.NullPointerException
>  at 
> org.apache.carbondata.sdk.file.arrow.ArrowUtils.toArrowSchema(ArrowUtils.java:109)
>  at 
> org.apache.carbondata.sdk.file.arrow.ArrowConverter.(ArrowConverter.java:45)
>  at 
> org.apache.carbondata.sdk.file.ArrowCarbonReader.readArrowBatch(ArrowCarbonReader.java:54)
>  at 
> com.apache.spark.LoadFromFiles.testORCFileLoadWithComplexSchemaArrowReader(LoadFromFiles.java:1401)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>  at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>  at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>  at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>  at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>  at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>  at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>  at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>  at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>  at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>  at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>  at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>  at 
> com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
>  at 
> com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
>  at 
> com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:230)
>  at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:58)
>  
> when use ArrowCarbonReader to read carbondata created using parquet and avro 
> files in SDK
> java.lang.ClassCastException: java.lang.String cannot be cast to 
> [Ljava.lang.Object;
> at 
> org.apache.carbondata.sdk.file.arrow.StructWriter.setValue(ArrowFieldWriter.java:374)
>  at 
> org.apache.carbondata.sdk.file.arrow.ArrowFieldWriter.write(ArrowFieldWriter.java:60)
>  at 
> org.apache.carbondata.sdk.file.arrow.StructWriter.setValue(ArrowFieldWriter.java:377)
>  at 
> org.apache.carbondata.sdk.file.arrow.ArrowFieldWriter.write(ArrowFieldWriter.java:60)
>  at 
> org.apache.carbondata.sdk.file.arrow.ArrowWriter.write(ArrowWriter.java:56)
>  at 
> org.apache.carbondata.sdk.file.arrow.ArrowConverter.addToArrowBuffer(ArrowConverter.java:63)
>  at 
> org.apache.carbondata.sdk.file.ArrowCarbonReader.readArrowBatch(ArrowCarbonReader.java:56)
>  at 
> com.apache.spark.LoadFromFiles.testParquetLoadAndCarbonArrowReader(LoadFromFiles.java:1775)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CARBONDATA-4013) NullPointerException when use ArrowCarbonReader to read carbondata created using orc ,parquet and avro files in SDK

2020-09-25 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-4013:

Summary: NullPointerException when use ArrowCarbonReader to read carbondata 
created using orc ,parquet and avro files in SDK  (was: NullPointerException 
when use ArrowCarbonReader to read carbondata created using orc ,parquet and 
avro files)

> NullPointerException when use ArrowCarbonReader to read carbondata created 
> using orc ,parquet and avro files in SDK
> ---
>
> Key: CARBONDATA-4013
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4013
> Project: CarbonData
>  Issue Type: Bug
>  Components: other
>Affects Versions: 2.1.0
> Environment: Spark 2.4.5 compiled jars
>Reporter: Chetan Bhat
>Priority: Major
>
> when use ArrowCarbonReader to read carbondata created using orc files-
> java.lang.NullPointerException
>  at 
> org.apache.carbondata.sdk.file.arrow.ArrowUtils.toArrowSchema(ArrowUtils.java:109)
>  at 
> org.apache.carbondata.sdk.file.arrow.ArrowConverter.(ArrowConverter.java:45)
>  at 
> org.apache.carbondata.sdk.file.ArrowCarbonReader.readArrowBatch(ArrowCarbonReader.java:54)
>  at 
> com.apache.spark.LoadFromFiles.testORCFileLoadWithComplexSchemaArrowReader(LoadFromFiles.java:1401)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>  at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>  at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>  at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>  at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>  at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>  at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>  at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>  at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>  at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>  at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>  at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>  at 
> com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
>  at 
> com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
>  at 
> com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:230)
>  at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:58)
>  
> when use ArrowCarbonReader to read carbondata created using parquet and avro 
> files
> java.lang.ClassCastException: java.lang.String cannot be cast to 
> [Ljava.lang.Object;
>  at 
> org.apache.carbondata.sdk.file.arrow.StructWriter.setValue(ArrowFieldWriter.java:374)
>  at 
> org.apache.carbondata.sdk.file.arrow.ArrowFieldWriter.write(ArrowFieldWriter.java:60)
>  at 
> org.apache.carbondata.sdk.file.arrow.StructWriter.setValue(ArrowFieldWriter.java:377)
>  at 
> org.apache.carbondata.sdk.file.arrow.ArrowFieldWriter.write(ArrowFieldWriter.java:60)
>  at 
> org.apache.carbondata.sdk.file.arrow.ArrowWriter.write(ArrowWriter.java:56)
>  at 
> org.apache.carbondata.sdk.file.arrow.ArrowConverter.addToArrowBuffer(ArrowConverter.java:63)
>  at 
> org.apache.carbondata.sdk.file.ArrowCarbonReader.readArrowBatch(ArrowCarbonReader.java:56)
>  at 
> com.apache.spark.LoadFromFiles.testParquetLoadAndCarbonArrowReader(LoadFromFiles.java:1775)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CARBONDATA-4013) NullPointerException when use ArrowCarbonReader to read carbondata created using orc ,parquet and avro files in SDK

2020-09-25 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-4013:

Description: 
when use ArrowCarbonReader to read carbondata created using orc files in SDK-

java.lang.NullPointerException
 at 
org.apache.carbondata.sdk.file.arrow.ArrowUtils.toArrowSchema(ArrowUtils.java:109)
 at 
org.apache.carbondata.sdk.file.arrow.ArrowConverter.(ArrowConverter.java:45)
 at 
org.apache.carbondata.sdk.file.ArrowCarbonReader.readArrowBatch(ArrowCarbonReader.java:54)
 at 
com.apache.spark.LoadFromFiles.testORCFileLoadWithComplexSchemaArrowReader(LoadFromFiles.java:1401)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
 at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
 at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
 at org.junit.rules.RunRules.evaluate(RunRules.java:20)
 at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
 at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
 at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
 at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
 at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
 at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
 at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
 at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
 at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
 at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
 at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
 at 
com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
 at 
com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
 at 
com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:230)
 at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:58)

 

when use ArrowCarbonReader to read carbondata created using parquet and avro 
files in SDK

java.lang.ClassCastException: java.lang.String cannot be cast to 
[Ljava.lang.Object;

at 
org.apache.carbondata.sdk.file.arrow.StructWriter.setValue(ArrowFieldWriter.java:374)
 at 
org.apache.carbondata.sdk.file.arrow.ArrowFieldWriter.write(ArrowFieldWriter.java:60)
 at 
org.apache.carbondata.sdk.file.arrow.StructWriter.setValue(ArrowFieldWriter.java:377)
 at 
org.apache.carbondata.sdk.file.arrow.ArrowFieldWriter.write(ArrowFieldWriter.java:60)
 at org.apache.carbondata.sdk.file.arrow.ArrowWriter.write(ArrowWriter.java:56)
 at 
org.apache.carbondata.sdk.file.arrow.ArrowConverter.addToArrowBuffer(ArrowConverter.java:63)
 at 
org.apache.carbondata.sdk.file.ArrowCarbonReader.readArrowBatch(ArrowCarbonReader.java:56)
 at 
com.apache.spark.LoadFromFiles.testParquetLoadAndCarbonArrowReader(LoadFromFiles.java:1775)

  was:
when use ArrowCarbonReader to read carbondata created using orc files-

java.lang.NullPointerException
 at 
org.apache.carbondata.sdk.file.arrow.ArrowUtils.toArrowSchema(ArrowUtils.java:109)
 at 
org.apache.carbondata.sdk.file.arrow.ArrowConverter.(ArrowConverter.java:45)
 at 
org.apache.carbondata.sdk.file.ArrowCarbonReader.readArrowBatch(ArrowCarbonReader.java:54)
 at 
com.apache.spark.LoadFromFiles.testORCFileLoadWithComplexSchemaArrowReader(LoadFromFiles.java:1401)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
 at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
 at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
 at org.junit.rules.RunRules.evaluate(RunRules.java:20)
 at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
 at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
 at 

[jira] [Created] (CARBONDATA-4013) NullPointerException when use ArrowCarbonReader to read carbondata created using orc ,parquet and avro files

2020-09-25 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-4013:
---

 Summary: NullPointerException when use ArrowCarbonReader to read 
carbondata created using orc ,parquet and avro files
 Key: CARBONDATA-4013
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4013
 Project: CarbonData
  Issue Type: Bug
  Components: other
Affects Versions: 2.1.0
 Environment: Spark 2.4.5 compiled jars
Reporter: Chetan Bhat


when use ArrowCarbonReader to read carbondata created using orc files-

java.lang.NullPointerException
 at 
org.apache.carbondata.sdk.file.arrow.ArrowUtils.toArrowSchema(ArrowUtils.java:109)
 at 
org.apache.carbondata.sdk.file.arrow.ArrowConverter.(ArrowConverter.java:45)
 at 
org.apache.carbondata.sdk.file.ArrowCarbonReader.readArrowBatch(ArrowCarbonReader.java:54)
 at 
com.apache.spark.LoadFromFiles.testORCFileLoadWithComplexSchemaArrowReader(LoadFromFiles.java:1401)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
 at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
 at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
 at org.junit.rules.RunRules.evaluate(RunRules.java:20)
 at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
 at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
 at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
 at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
 at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
 at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
 at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
 at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
 at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
 at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
 at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
 at 
com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
 at 
com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
 at 
com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:230)
 at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:58)

 

when use ArrowCarbonReader to read carbondata created using parquet and avro 
files

java.lang.ClassCastException: java.lang.String cannot be cast to 
[Ljava.lang.Object;

 at 
org.apache.carbondata.sdk.file.arrow.StructWriter.setValue(ArrowFieldWriter.java:374)
 at 
org.apache.carbondata.sdk.file.arrow.ArrowFieldWriter.write(ArrowFieldWriter.java:60)
 at 
org.apache.carbondata.sdk.file.arrow.StructWriter.setValue(ArrowFieldWriter.java:377)
 at 
org.apache.carbondata.sdk.file.arrow.ArrowFieldWriter.write(ArrowFieldWriter.java:60)
 at org.apache.carbondata.sdk.file.arrow.ArrowWriter.write(ArrowWriter.java:56)
 at 
org.apache.carbondata.sdk.file.arrow.ArrowConverter.addToArrowBuffer(ArrowConverter.java:63)
 at 
org.apache.carbondata.sdk.file.ArrowCarbonReader.readArrowBatch(ArrowCarbonReader.java:56)
 at 
com.apache.spark.LoadFromFiles.testParquetLoadAndCarbonArrowReader(LoadFromFiles.java:1775)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CARBONDATA-4010) "Alter table set tblproperties should support long string columns" and bad record handling of long string data for string columns need to be updated in https://githu

2020-09-25 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-4010:
---

 Summary: "Alter table set tblproperties should support long string 
columns" and bad record handling of long string data for string columns need to 
be updated in https://github.com/apache/carbondata/blob/master/docs
 Key: CARBONDATA-4010
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4010
 Project: CarbonData
  Issue Type: Bug
  Components: docs
Affects Versions: 2.1.0
 Environment: https://github.com/apache/carbondata/blob/master/docs
Reporter: Chetan Bhat


"Alter table set tblproperties should support long string columns" and bad 
record handling of long string data for string columns need to be updated in 
https://github.com/apache/carbondata/blob/master/docs
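
A minimal sketch of the two behaviours the documentation should cover (table, column and path names are hypothetical; the character limit for ordinary string columns is stated as understood, not quoted from the docs):

-- setting long string columns through ALTER TABLE SET TBLPROPERTIES
CREATE TABLE long_str_demo (id INT, notes STRING) STORED AS carbondata;
ALTER TABLE long_str_demo SET TBLPROPERTIES ('LONG_STRING_COLUMNS'='notes');
-- bad record handling when a string value exceeds the limit for an ordinary string column
LOAD DATA INPATH 'hdfs://hacluster/demo/long_str.csv' INTO TABLE long_str_demo OPTIONS ('BAD_RECORDS_ACTION'='FORCE');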



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CARBONDATA-4007) ArrayIndexOutofBoundsException when IUD operations performed using SDK

2020-09-23 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-4007:
---

 Summary: ArrayIndexOutofBoundsException when IUD operations 
performed using SDK
 Key: CARBONDATA-4007
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4007
 Project: CarbonData
  Issue Type: Bug
  Components: data-load
Affects Versions: 2.1.0
 Environment: Spark 2.4.5 jars used for compilation of SDK 
Reporter: Chetan Bhat


Issue -

ArrayIndexOutofBoundsException when IUD operations performed using SDK.

Exception -

java.lang.ArrayIndexOutOfBoundsException: 1

 at 
org.apache.carbondata.hadoop.api.CarbonTableOutputFormat$1.close(CarbonTableOutputFormat.java:579)
 at org.apache.carbondata.sdk.file.CarbonIUD.delete(CarbonIUD.java:110)
 at org.apache.carbondata.sdk.file.CarbonIUD.deleteExecution(CarbonIUD.java:238)
 at org.apache.carbondata.sdk.file.CarbonIUD.closeDelete(CarbonIUD.java:123)
 at org.apache.carbondata.sdk.file.CarbonIUD.commit(CarbonIUD.java:221)
 at com.apache.spark.SdkIUD_Test.testDelete(SdkIUD_Test.java:130)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
 at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
 at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
 at org.junit.rules.RunRules.evaluate(RunRules.java:20)
 at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
 at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
 at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
 at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
 at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
 at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
 at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
 at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
 at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
 at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
 at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
 at 
com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
 at 
com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
 at 
com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:230)
 at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:58)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CARBONDATA-3987) Issues in SDK Pagination reader (2 issues)

2020-09-14 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-3987:
---

 Summary: Issues in SDK Pagination reader (2 issues)
 Key: CARBONDATA-3987
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3987
 Project: CarbonData
  Issue Type: Bug
  Components: other
Affects Versions: 2.1.0
Reporter: Chetan Bhat


Issue 1 : 
After writing data to the table and inserting one more row, an error is thrown when
trying to read the newly added row, even though getTotalRows() is incremented by 1.

Test code-
/**
 * Carbon Files are written using CarbonWriter in outputpath
 *
 * Carbon Files are read using paginationCarbonReader object
 * Checking pagination with insert on large data with 8 split
 */
 @Test
 public void testSDKPaginationInsertData() throws IOException, 
InvalidLoadOptionException, InterruptedException {
 System.out.println("___" + 
name.getMethodName() + " TestCase Execution is 
started");

//
// String outputPath1 = getOutputPath(outputDir, name.getMethodName() + 
"large");
//
// long uid = 123456;
// TimeZone.setDefault(TimeZone.getTimeZone("Asia/Shanghai"));
// writeMultipleCarbonFiles("id int,name string,rank short,salary double,active 
boolean,dob date,doj timestamp,city string,dept string", getDatas(), 
outputPath1, uid, null, null);
//
// System.out.println("Data is written");

List<String[]> data1 = new ArrayList<>();
 String[] row1 = {"1", "AAA", "3", "3444345.66", "true", "1979-12-09", "2011-2-10 1:00:20", "Pune", "IT"};
 String[] row2 = {"2", "BBB", "2", "543124.66", "false", "1987-2-19", "2017-1-1 12:00:20", "Bangalore", "DATA"};
 String[] row3 = {"3", "CCC", "1", "787878.888", "false", "1982-05-12", "2015-12-1 2:20:20", "Pune", "DATA"};
 String[] row4 = {"4", "DDD", "1", "9.24", "true", "1981-04-09", "2000-1-15 7:00:20", "Delhi", "MAINS"};
 String[] row5 = {"5", "EEE", "3", "545656.99", "true", "1987-12-09", "2017-11-25 04:00:20", "Delhi", "IT"};

data1.add(row1);
 data1.add(row2);
 data1.add(row3);
 data1.add(row4);
 data1.add(row5);

String outputPath1 = getOutputPath(outputDir, name.getMethodName() + "large");

long uid = 123456;
 TimeZone.setDefault(TimeZone.getTimeZone("Asia/Shanghai"));
 writeMultipleCarbonFiles("id int,name string,rank short,salary double,active 
boolean,dob date,doj timestamp,city string,dept string", data1, outputPath1, 
uid, null, null);

System.out.println("Data is written");

String hdfsPath1 = moveFiles(outputPath1, outputPath1);
 String datapath1 = hdfsPath1.concat("/" + name.getMethodName() + "large");
 System.out.println("HDFS Data Path is: " + datapath1);

runSQL("create table " + name.getMethodName() + "large" + " using carbon 
location '" + datapath1 + "'");
 System.out.println("Table " + name.getMethodName() + " is created 
Successfully");
 runSQL("select count(*) from " + name.getMethodName() + "large");


 long uid1 = 123;
 String outputPath = getOutputPath(outputDir, name.getMethodName());
 List<String[]> data = new ArrayList<>();
 String[] row = {"222", "Daisy", "3", "334.456", "true", "1956-11-08", "2013-12-10 12:00:20", "Pune", "IT"};
 data.add(row);
 writeData("id int,name string,rank short,salary double,active boolean,dob 
date,doj timestamp,city string,dept string", data, outputPath, uid, null, null);
 String hdfsPath = moveFiles(outputPath, outputPath);
 String datapath = hdfsPath.concat("/" + name.getMethodName());

runSQL("create table " + name.getMethodName() + " using carbon location '" + 
datapath + "'");
 runSQL("select count(*) from " + name.getMethodName());
 System.out.println("Insert--");
 runSQL("insert into table " + name.getMethodName() + " select * from " + 
name.getMethodName() + "large");
 System.out.println("Inserted");
 System.out.println("--After Insert--");
 System.out.println("Query 1");
 runSQL("select count(*) from " + name.getMethodName());


 // configure cache size = 4 blocklet
 CarbonProperties.getInstance()
 .addProperty(CarbonCommonConstants.CARBON_MAX_PAGINATION_LRU_CACHE_SIZE_IN_MB, 
"4");

CarbonReaderBuilder carbonReaderBuilder = CarbonReader.builder(datapath, 
"_temp").withPaginationSupport().projection(new 
String[]\{"id","name","rank","salary","active","dob","doj","city","dept"});
 PaginationCarbonReader paginationCarbonReader =
 (PaginationCarbonReader) carbonReaderBuilder.build();


 File[] dataFiles1 = new File(datapath).listFiles(new FilenameFilter() {
 @Override public boolean accept(File dir, String name) {
 return name.endsWith("carbondata");
 }
 });
 String 
version=CarbonSchemaReader.getVersionDetails(dataFiles1[0].getAbsolutePath());
 System.out.println("version "+version);

System.out.println("Total no of rows is : 
"+paginationCarbonReader.getTotalRows() );
 assertTrue(paginationCarbonReader.getTotalRows() == 6);

Object[] rows=paginationCarbonReader.read(1,6);
 //assertTrue(rows.length==5);
 for (Object rowss 

[jira] [Updated] (CARBONDATA-3949) Select filter query fails from presto-cli on MV table

2020-09-07 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-3949:

Description: 
From sparksql create table, load data and create MV

spark-sql> CREATE TABLE uniqdata(CUST_ID int,CUST_NAME 
String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 
bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
int) STORED as carbondata 
TBLPROPERTIES('local_dictionary_enable'='true','local_dictionary_threshold'='1000');
 Time taken: 0.753 seconds
 spark-sql> LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into 
table uniqdata OPTIONS('DELIMITER'=',', 
'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
 OK
 OK
 Time taken: 1.992 seconds
 spark-sql> CREATE MATERIALIZED VIEW mv1 as select cust_id, cust_name, 
count(cust_id) from uniqdata group by cust_id, cust_name;
 OK
 Time taken: 4.336 seconds

 

From presto cli select filter query on table with MV fails.

presto:chetan> select * from uniqdata where CUST_ID IS NULL or BIGINT_COLUMN1 
=1233720368578 or DECIMAL_COLUMN1 = 12345678901.123458 or Double_COLUMN1 = 
1.12345674897976E10 or INTEGER_COLUMN1 IS NULL ;
 Query 20200804_092703_00253_ed34h failed: Unable to get file status:

*Log-*
 2020-08-04T18:09:55.975+0800 INFO Query-20200804_100955_00300_ed34h-2642 
stdout 2020-08-04 18:09:55 WARN AbstractDFSCarbonFile:458 - Exception occurred: 
File 
hdfs://hacluster/user/sparkhive/warehouse/chetan.db/uniqdata_string/Metadata 
does not exist.
 java.io.FileNotFoundException: File 
hdfs://hacluster/user/sparkhive/warehouse/chetan.db/uniqdata_string/Metadata 
does not exist.
 at 
org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:1058)
 at 
org.apache.hadoop.hdfs.DistributedFileSystem.access$1000(DistributedFileSystem.java:131)
 at 
org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1118)
 at 
org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1115)
 at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
 at 
org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:1125)
 at org.apache.hadoop.fs.FilterFileSystem.listStatus(FilterFileSystem.java:270)
 at 
org.apache.carbondata.core.datastore.filesystem.AbstractDFSCarbonFile.listFiles(AbstractDFSCarbonFile.java:456)
 at 
org.apache.carbondata.core.datastore.filesystem.AbstractDFSCarbonFile.listFiles(AbstractDFSCarbonFile.java:559)
 at 
org.apache.carbondata.core.util.path.CarbonTablePath.getActualSchemaFilePath(CarbonTablePath.java:189)
 at 
org.apache.carbondata.core.util.path.CarbonTablePath.getSchemaFilePath(CarbonTablePath.java:168)
 at 
org.apache.carbondata.presto.impl.CarbonTableReader.updateSchemaTables(CarbonTableReader.java:147)
 at 
org.apache.carbondata.presto.impl.CarbonTableReader.getCarbonCache(CarbonTableReader.java:128)
 at 
org.apache.carbondata.presto.CarbondataSplitManager.getSplits(CarbondataSplitManager.java:145)
 at 
io.prestosql.spi.connector.classloader.ClassLoaderSafeConnectorSplitManager.getSplits(ClassLoaderSafeConnectorSplitManager.java:50)
 at io.prestosql.split.SplitManager.getSplits(SplitManager.java:85)
 at 
io.prestosql.sql.planner.DistributedExecutionPlanner$Visitor.visitScanAndFilter(DistributedExecutionPlanner.java:189)
 at 
io.prestosql.sql.planner.DistributedExecutionPlanner$Visitor.visitFilter(DistributedExecutionPlanner.java:257)
 at 
io.prestosql.sql.planner.DistributedExecutionPlanner$Visitor.visitFilter(DistributedExecutionPlanner.java:149)
 at io.prestosql.sql.planner.plan.FilterNode.accept(FilterNode.java:72)
 at 
io.prestosql.sql.planner.DistributedExecutionPlanner.doPlan(DistributedExecutionPlanner.java:119)
 at 
io.prestosql.sql.planner.DistributedExecutionPlanner.doPlan(DistributedExecutionPlanner.java:124)
 at 
io.prestosql.sql.planner.DistributedExecutionPlanner.plan(DistributedExecutionPlanner.java:96)
 at 
io.prestosql.execution.SqlQueryExecution.planDistribution(SqlQueryExecution.java:425)
 at io.prestosql.execution.SqlQueryExecution.start(SqlQueryExecution.java:321)
 at io.prestosql.$gen.Presto_31620200804_042858_1.run(Unknown Source)
 at io.prestosql.execution.SqlQueryManager.createQuery(SqlQueryManager.java:239)
 at 
io.prestosql.dispatcher.LocalDispatchQuery.lambda$startExecution$4(LocalDispatchQuery.java:105)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)

 

Expected : If the Carbon indexes are not 

[jira] [Closed] (CARBONDATA-3845) Bucket table creation fails with exception for empty BUCKET_NUMBER and BUCKET_COLUMNS

2020-09-07 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat closed CARBONDATA-3845.
---

Issue fixed in Carbon 2.1 version.

> Bucket table creation fails with exception for empty BUCKET_NUMBER and 
> BUCKET_COLUMNS
> -
>
> Key: CARBONDATA-3845
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3845
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 2.0.0
> Environment: Spark 2.3.2
>Reporter: Chetan Bhat
>Priority: Minor
> Fix For: 2.1.0
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> *Steps and Issue-*
> 0: jdbc:hive2://10.20.251.163:23040/default> create table if not exists 
> all_data_types1(bool_1 boolean,bool_2 boolean,chinese string,Number 
> int,smallNumber smallint,BigNumber bigint,LargeDecimal double,smalldecimal 
> float,customdecimal decimal(38,15),words string,smallwords char(8),varwords 
> varchar(20),time timestamp,day date,emptyNumber int,emptysmallNumber 
> smallint,emptyBigNumber bigint,emptyLargeDecimal double,emptysmalldecimal 
> float,emptycustomdecimal decimal(38,38),emptywords string,emptysmallwords 
> char(8),emptyvarwords varchar(20)) stored as carbondata TBLPROPERTIES 
> (*'BUCKET_NUMBER'='', 'BUCKET_COLUMNS'=''*);
>  *Error: java.lang.NumberFormatException: For input string: "" 
> (state=,code=0)*
>  Same issue present if bucket_number is empty.
> 0: jdbc:hive2://10.20.251.163:23040/default> create table if not exists 
> all_data_types1(bool_1 boolean,bool_2 boolean,chinese string,Number 
> int,smallNumber smallint,BigNumber bigint,LargeDecimal double,smalldecimal 
> float,customdecimal decimal(38,15),words string,smallwords char(8),varwords 
> varchar(20),time timestamp,day date,emptyNumber int,emptysmallNumber 
> smallint,emptyBigNumber bigint,emptyLargeDecimal double,emptysmalldecimal 
> float,emptycustomdecimal decimal(38,38),emptywords string,emptysmallwords 
> char(8),emptyvarwords varchar(20)) stored as carbondata TBLPROPERTIES 
> (*'BUCKET_NUMBER'=''*, 'BUCKET_COLUMNS'='test');
>  *Error: java.lang.NumberFormatException: For input string: "" 
> (state=,code=0)*
> *Log-*
> 2020-06-05 01:52:32,633 | ERROR | [HiveServer2-Background-Pool: Thread-102] | 
> Error executing query, currentState RUNNING,  | 
> org.apache.spark.internal.Logging$class.logError(Logging.scala:91)2020-06-05 
> 01:52:32,633 | ERROR | [HiveServer2-Background-Pool: Thread-102] | Error 
> executing query, currentState RUNNING,  | 
> org.apache.spark.internal.Logging$class.logError(Logging.scala:91)java.lang.NumberFormatException:
>  For input string: "" at 
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) 
> at java.lang.Integer.parseInt(Integer.java:592) at 
> java.lang.Integer.parseInt(Integer.java:615) at 
> scala.collection.immutable.StringLike$class.toInt(StringLike.scala:272) at 
> scala.collection.immutable.StringOps.toInt(StringOps.scala:29) at 
> org.apache.carbondata.spark.CarbonOption.bucketNumber$lzycompute(CarbonOption.scala:61)
>  at 
> org.apache.carbondata.spark.CarbonOption.bucketNumber(CarbonOption.scala:61) 
> at 
> org.apache.spark.sql.parser.CarbonSpark2SqlParser.getBucketFields(CarbonSpark2SqlParser.scala:765)
>  at 
> org.apache.spark.sql.parser.CarbonSparkSqlParserUtil$.buildTableInfoFromCatalogTable(CarbonSparkSqlParserUtil.scala:382)
>  at 
> org.apache.spark.sql.CarbonSource$.createTableInfo(CarbonSource.scala:235) at 
> org.apache.spark.sql.CarbonSource$.createTableMeta(CarbonSource.scala:382) at 
> org.apache.spark.sql.execution.command.table.CarbonCreateDataSourceTableCommand.processMetadata(CarbonCreateDataSourceTableCommand.scala:69)
>  at 
> org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:123)
>  at 
> org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:123)
>  at 
> org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:104)
>  at 
> org.apache.spark.sql.execution.command.MetadataCommand.runWithAudit(package.scala:120)
>  at 
> org.apache.spark.sql.execution.command.MetadataCommand.run(package.scala:123) 
> at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
>  at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190) at 
> org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190) at 
> org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3259) at 
> 

[jira] [Created] (CARBONDATA-3971) Session level dynamic properties for repair(carbon.load.si.repair and carbon.si.repair.limit) are not updated in https://github.com/apache/carbondata/blob/master/doc

2020-09-04 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-3971:
---

 Summary: Session level dynamic properties for 
repair(carbon.load.si.repair and carbon.si.repair.limit) are not updated in 
https://github.com/apache/carbondata/blob/master/docs/configuration-parameters.md
 Key: CARBONDATA-3971
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3971
 Project: CarbonData
  Issue Type: Bug
  Components: docs
Affects Versions: 2.1.0
Reporter: Chetan Bhat


Session-level dynamic properties for repair (carbon.load.si.repair and
carbon.si.repair.limit) are not mentioned in the GitHub link -
https://github.com/apache/carbondata/blob/master/docs/configuration-parameters.md
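
A minimal sketch of how such dynamic properties are typically set per session from
spark-sql/beeline, assuming the SET command also applies to these two keys (the values
shown are only illustrative):

SET carbon.load.si.repair=true;
SET carbon.si.repair.limit=1;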



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CARBONDATA-3949) Select filter query fails from presto-cli on MV table

2020-08-11 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-3949:

Affects Version/s: (was: 2.0.0)
   2.0.1

> Select filter query fails from presto-cli on MV table
> -
>
> Key: CARBONDATA-3949
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3949
> Project: CarbonData
>  Issue Type: Bug
>  Components: presto-integration
>Affects Versions: 2.0.1
> Environment: Spark 2.4.5. PrestoSQL 316
>Reporter: Chetan Bhat
>Priority: Major
>
> From sparksql create table , load data and create MV
> spark-sql> CREATE TABLE uniqdata(CUST_ID int,CUST_NAME 
> String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, 
> BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), 
> DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 
> double,INTEGER_COLUMN1 int) STORED as carbondata 
> TBLPROPERTIES('local_dictionary_enable'='true','local_dictionary_threshold'='1000');
> Time taken: 0.753 seconds
> spark-sql> LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into 
> table uniqdata OPTIONS('DELIMITER'=',', 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> OK
> OK
> Time taken: 1.992 seconds
> spark-sql> CREATE MATERIALIZED VIEW mv1 as select cust_id, cust_name, 
> count(cust_id) from uniqdata group by cust_id, cust_name;
> OK
> Time taken: 4.336 seconds
>  
> From presto cli select filter query on table with MV fails.
> presto:chetan> select * from uniqdata where CUST_ID IS NULL or BIGINT_COLUMN1 
> =1233720368578 or DECIMAL_COLUMN1 = 12345678901.123458 or Double_COLUMN1 
> = 1.12345674897976E10 or INTEGER_COLUMN1 IS NULL ;
> Query 20200804_092703_00253_ed34h failed: Unable to get file status:
> *Log-*
> 2020-08-04T18:09:55.975+0800 INFO Query-20200804_100955_00300_ed34h-2642 
> stdout 2020-08-04 18:09:55 WARN AbstractDFSCarbonFile:458 - Exception 
> occurred: File 
> hdfs://hacluster/user/sparkhive/warehouse/chetan.db/uniqdata_string/Metadata 
> does not exist.
> java.io.FileNotFoundException: File 
> hdfs://hacluster/user/sparkhive/warehouse/chetan.db/uniqdata_string/Metadata 
> does not exist.
>  at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:1058)
>  at 
> org.apache.hadoop.hdfs.DistributedFileSystem.access$1000(DistributedFileSystem.java:131)
>  at 
> org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1118)
>  at 
> org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1115)
>  at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>  at 
> org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:1125)
>  at 
> org.apache.hadoop.fs.FilterFileSystem.listStatus(FilterFileSystem.java:270)
>  at 
> org.apache.carbondata.core.datastore.filesystem.AbstractDFSCarbonFile.listFiles(AbstractDFSCarbonFile.java:456)
>  at 
> org.apache.carbondata.core.datastore.filesystem.AbstractDFSCarbonFile.listFiles(AbstractDFSCarbonFile.java:559)
>  at 
> org.apache.carbondata.core.util.path.CarbonTablePath.getActualSchemaFilePath(CarbonTablePath.java:189)
>  at 
> org.apache.carbondata.core.util.path.CarbonTablePath.getSchemaFilePath(CarbonTablePath.java:168)
>  at 
> org.apache.carbondata.presto.impl.CarbonTableReader.updateSchemaTables(CarbonTableReader.java:147)
>  at 
> org.apache.carbondata.presto.impl.CarbonTableReader.getCarbonCache(CarbonTableReader.java:128)
>  at 
> org.apache.carbondata.presto.CarbondataSplitManager.getSplits(CarbondataSplitManager.java:145)
>  at 
> io.prestosql.spi.connector.classloader.ClassLoaderSafeConnectorSplitManager.getSplits(ClassLoaderSafeConnectorSplitManager.java:50)
>  at io.prestosql.split.SplitManager.getSplits(SplitManager.java:85)
>  at 
> io.prestosql.sql.planner.DistributedExecutionPlanner$Visitor.visitScanAndFilter(DistributedExecutionPlanner.java:189)
>  at 
> io.prestosql.sql.planner.DistributedExecutionPlanner$Visitor.visitFilter(DistributedExecutionPlanner.java:257)
>  at 
> io.prestosql.sql.planner.DistributedExecutionPlanner$Visitor.visitFilter(DistributedExecutionPlanner.java:149)
>  at io.prestosql.sql.planner.plan.FilterNode.accept(FilterNode.java:72)
>  at 
> io.prestosql.sql.planner.DistributedExecutionPlanner.doPlan(DistributedExecutionPlanner.java:119)
>  at 
> io.prestosql.sql.planner.DistributedExecutionPlanner.doPlan(DistributedExecutionPlanner.java:124)
>  at 
> io.prestosql.sql.planner.DistributedExecutionPlanner.plan(DistributedExecutionPlanner.java:96)
>  at 
> 

[jira] [Updated] (CARBONDATA-3932) need to change discovery.uri and add hive.metastore.uri,hive.config.resources in https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md#presto-mu

2020-08-11 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-3932:

Affects Version/s: (was: 2.0.0)
   2.0.1

> need to change discovery.uri and add  
> hive.metastore.uri,hive.config.resources  in 
> https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md#presto-multinode-cluster-setup-for-carbondata
> -
>
> Key: CARBONDATA-3932
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3932
> Project: CarbonData
>  Issue Type: Bug
>  Components: docs, presto-integration
>Affects Versions: 2.0.1
> Environment: Documentation
>Reporter: Chetan Bhat
>Priority: Minor
>
> Need to change discovery.uri=:8086 to 
> discovery.uri=http://:8086 in 
> [https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md#presto-multinode-cluster-setup-for-carbondata]
> Need to add these configurations as well in carbondata.properties and to be 
> updated in carbondata-presto opensource doc .
> 1.hive.metastore.uri
> 2.hive.config.resources
> Ex : -
> connector.name=carbondata
> hive.metastore.uri=thrift://10.21.18.106:9083
> hive.config.resources=/opt/HA/C10/install/hadoop/datanode/etc/hadoop/core-site.xml,/opt/HA/C10/install/hadoop/datanode/etc/hadoop/hdfs-site.xml
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CARBONDATA-3950) Alter table drop column for non partition column throws error

2020-08-11 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-3950:
---

 Summary: Alter table drop column for non partition column throws 
error
 Key: CARBONDATA-3950
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3950
 Project: CarbonData
  Issue Type: Bug
  Components: data-query
Affects Versions: 2.0.1
 Environment: Spark 2.4.5
Reporter: Chetan Bhat


From spark-sql the queries are executed as mentioned below-

drop table if exists uniqdata_int;
CREATE TABLE uniqdata_int (CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB 
timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double, INTEGER_COLUMN1 
int) Partitioned by (cust_id int) stored as carbondata TBLPROPERTIES 
("TABLE_BLOCKSIZE"= "256 MB");

LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table 
uniqdata_int partition(cust_id='1') OPTIONS ('FILEHEADER'='CUST_ID,CUST_NAME 
,ACTIVE_EMUI_VERSION,DOB,DOJ, 
BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1, 
Double_COLUMN2,INTEGER_COLUMN1','BAD_RECORDS_ACTION'='FORCE');
show partitions uniqdata_int;
select * from uniqdata_int order by cust_id;

alter table uniqdata_int add columns(id int);
 desc uniqdata_int;
 *alter table uniqdata_int drop columns(CUST_NAME);*
 desc uniqdata_int;

Issue : Alter table drop column for a non-partition column throws an error even though
the operation succeeds.

org.apache.carbondata.spark.exception.ProcessMetaDataException: operation 
failed for priyesh.uniqdata_int: Alterion failed: 
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to alter table. The 
following columns have he existing columns in their respective positions :
col;
 at 
org.apache.spark.sql.execution.command.MetadataProcessOperation$class.throwMetadataException(package.
 at 
org.apache.spark.sql.execution.command.MetadataCommand.throwMetadataException(package.scala:120)
 at 
org.apache.spark.sql.execution.command.schema.CarbonAlterTableDropColumnCommand.processMetadata(Carboand.scala:201)
 at 
org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:123)
 at 
org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:123)
 at 
org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:104)
 at 
org.apache.spark.sql.execution.command.MetadataCommand.runWithAudit(package.scala:120)
 at 
org.apache.spark.sql.execution.command.MetadataCommand.run(package.scala:123)
 at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala
 at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:69)
 at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:80)
 at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:196)
 at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:196)
 at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3379)
 at 
org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:95
 at 
org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:144)
 at 
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:86)
 at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3378)
 at org.apache.spark.sql.Dataset.(Dataset.scala:196)
 at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:79)
 at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:651)
 at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694)
 at 
org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:67)
 at 
org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:387)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
 at 
org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:279)
 at 
org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
 at 
org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:87
 at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:164)
 at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:187)
 at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:89)
 at 

[jira] [Created] (CARBONDATA-3949) Select filter query fails from presto-cli on MV table

2020-08-10 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-3949:
---

 Summary: Select filter query fails from presto-cli on MV table
 Key: CARBONDATA-3949
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3949
 Project: CarbonData
  Issue Type: Bug
  Components: presto-integration
Affects Versions: 2.0.0
 Environment: Spark 2.4.5. PrestoSQL 316
Reporter: Chetan Bhat


From sparksql create table, load data and create MV

spark-sql> CREATE TABLE uniqdata(CUST_ID int,CUST_NAME 
String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 
bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
int) STORED as carbondata 
TBLPROPERTIES('local_dictionary_enable'='true','local_dictionary_threshold'='1000');
Time taken: 0.753 seconds
spark-sql> LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into 
table uniqdata OPTIONS('DELIMITER'=',', 
'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
OK
OK
Time taken: 1.992 seconds
spark-sql> CREATE MATERIALIZED VIEW mv1 as select cust_id, cust_name, 
count(cust_id) from uniqdata group by cust_id, cust_name;
OK
Time taken: 4.336 seconds

 

From presto cli select filter query on table with MV fails.

presto:chetan> select * from uniqdata where CUST_ID IS NULL or BIGINT_COLUMN1 
=1233720368578 or DECIMAL_COLUMN1 = 12345678901.123458 or Double_COLUMN1 = 
1.12345674897976E10 or INTEGER_COLUMN1 IS NULL ;
Query 20200804_092703_00253_ed34h failed: Unable to get file status:

*Log-*
2020-08-04T18:09:55.975+0800 INFO Query-20200804_100955_00300_ed34h-2642 stdout 
2020-08-04 18:09:55 WARN AbstractDFSCarbonFile:458 - Exception occurred: File 
hdfs://hacluster/user/sparkhive/warehouse/chetan.db/uniqdata_string/Metadata 
does not exist.
java.io.FileNotFoundException: File 
hdfs://hacluster/user/sparkhive/warehouse/chetan.db/uniqdata_string/Metadata 
does not exist.
 at 
org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:1058)
 at 
org.apache.hadoop.hdfs.DistributedFileSystem.access$1000(DistributedFileSystem.java:131)
 at 
org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1118)
 at 
org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1115)
 at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
 at 
org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:1125)
 at org.apache.hadoop.fs.FilterFileSystem.listStatus(FilterFileSystem.java:270)
 at 
org.apache.carbondata.core.datastore.filesystem.AbstractDFSCarbonFile.listFiles(AbstractDFSCarbonFile.java:456)
 at 
org.apache.carbondata.core.datastore.filesystem.AbstractDFSCarbonFile.listFiles(AbstractDFSCarbonFile.java:559)
 at 
org.apache.carbondata.core.util.path.CarbonTablePath.getActualSchemaFilePath(CarbonTablePath.java:189)
 at 
org.apache.carbondata.core.util.path.CarbonTablePath.getSchemaFilePath(CarbonTablePath.java:168)
 at 
org.apache.carbondata.presto.impl.CarbonTableReader.updateSchemaTables(CarbonTableReader.java:147)
 at 
org.apache.carbondata.presto.impl.CarbonTableReader.getCarbonCache(CarbonTableReader.java:128)
 at 
org.apache.carbondata.presto.CarbondataSplitManager.getSplits(CarbondataSplitManager.java:145)
 at 
io.prestosql.spi.connector.classloader.ClassLoaderSafeConnectorSplitManager.getSplits(ClassLoaderSafeConnectorSplitManager.java:50)
 at io.prestosql.split.SplitManager.getSplits(SplitManager.java:85)
 at 
io.prestosql.sql.planner.DistributedExecutionPlanner$Visitor.visitScanAndFilter(DistributedExecutionPlanner.java:189)
 at 
io.prestosql.sql.planner.DistributedExecutionPlanner$Visitor.visitFilter(DistributedExecutionPlanner.java:257)
 at 
io.prestosql.sql.planner.DistributedExecutionPlanner$Visitor.visitFilter(DistributedExecutionPlanner.java:149)
 at io.prestosql.sql.planner.plan.FilterNode.accept(FilterNode.java:72)
 at 
io.prestosql.sql.planner.DistributedExecutionPlanner.doPlan(DistributedExecutionPlanner.java:119)
 at 
io.prestosql.sql.planner.DistributedExecutionPlanner.doPlan(DistributedExecutionPlanner.java:124)
 at 
io.prestosql.sql.planner.DistributedExecutionPlanner.plan(DistributedExecutionPlanner.java:96)
 at 
io.prestosql.execution.SqlQueryExecution.planDistribution(SqlQueryExecution.java:425)
 at io.prestosql.execution.SqlQueryExecution.start(SqlQueryExecution.java:321)
 at io.prestosql.$gen.Presto_31620200804_042858_1.run(Unknown Source)
 at io.prestosql.execution.SqlQueryManager.createQuery(SqlQueryManager.java:239)
 at 
io.prestosql.dispatcher.LocalDispatchQuery.lambda$startExecution$4(LocalDispatchQuery.java:105)
 at 

[jira] [Updated] (CARBONDATA-3932) need to change discovery.uri and add hive.metastore.uri,hive.config.resources in https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md#presto-mu

2020-07-31 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-3932:

Description: 
Need to change discovery.uri=:8086 to 
discovery.uri=http://:8086 in 
[https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md#presto-multinode-cluster-setup-for-carbondata]

These configurations also need to be added in carbondata.properties and updated in the
carbondata-presto open source doc.
1.hive.metastore.uri
2.hive.config.resources

Ex : -

connector.name=carbondata
hive.metastore.uri=thrift://10.21.18.106:9083
hive.config.resources=/opt/HA/C10/install/hadoop/datanode/etc/hadoop/core-site.xml,/opt/HA/C10/install/hadoop/datanode/etc/hadoop/hdfs-site.xml




 

  was:Need to change discovery.uri=:8086 to 
discovery.uri=http://:8086 in 
https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md#presto-multinode-cluster-setup-for-carbondata

Summary: need to change discovery.uri and add  
hive.metastore.uri,hive.config.resources  in 
https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md#presto-multinode-cluster-setup-for-carbondata
  (was: need to change discovery.uri=:8086   to 
discovery.uri=http://:8086   in 
https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md#presto-multinode-cluster-setup-for-carbondata)

> need to change discovery.uri and add  
> hive.metastore.uri,hive.config.resources  in 
> https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md#presto-multinode-cluster-setup-for-carbondata
> -
>
> Key: CARBONDATA-3932
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3932
> Project: CarbonData
>  Issue Type: Bug
>  Components: docs, presto-integration
>Affects Versions: 2.0.0
> Environment: Documentation
>Reporter: Chetan Bhat
>Priority: Minor
>
> Need to change discovery.uri=:8086 to 
> discovery.uri=http://:8086 in 
> [https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md#presto-multinode-cluster-setup-for-carbondata]
> Need to add these configurations as well in carbondata.properties and to be 
> updated in carbondata-presto opensource doc .
> 1.hive.metastore.uri
> 2.hive.config.resources
> Ex : -
> connector.name=carbondata
> hive.metastore.uri=thrift://10.21.18.106:9083
> hive.config.resources=/opt/HA/C10/install/hadoop/datanode/etc/hadoop/core-site.xml,/opt/HA/C10/install/hadoop/datanode/etc/hadoop/hdfs-site.xml
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CARBONDATA-3932) need to change discovery.uri=:8086 to discovery.uri=http://:8086 in https://github.com/apache/carbondata/blob/master/docs/prestos

2020-07-30 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-3932:
---

 Summary: need to change discovery.uri=:8086   to 
discovery.uri=http://:8086   in 
https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md#presto-multinode-cluster-setup-for-carbondata
 Key: CARBONDATA-3932
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3932
 Project: CarbonData
  Issue Type: Bug
  Components: docs, presto-integration
Affects Versions: 2.0.0
 Environment: Documentation
Reporter: Chetan Bhat


Need to change discovery.uri=:8086 to 
discovery.uri=http://:8086 in 
https://github.com/apache/carbondata/blob/master/docs/prestosql-guide.md#presto-multinode-cluster-setup-for-carbondata



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CARBONDATA-3909) Insert into select fails after insert decimal value as null and set sort scope to global sort

2020-07-16 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-3909:
---

 Summary: Insert into select fails after insert decimal value as 
null and set sort scope to global sort
 Key: CARBONDATA-3909
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3909
 Project: CarbonData
  Issue Type: Bug
  Components: data-load
Affects Versions: 2.0.1
 Environment: Spark 2.3.2, 2.4.5
Reporter: Chetan Bhat


Steps -

Insert a decimal value as null, set the sort scope to global sort, and then do an insert
into select.

 

Issue : - Insert into select fails.

 

Expected : - Insert into select should succeed.
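
A minimal spark-sql sketch of the steps above (the table name, column types and values
are hypothetical, and the sort scope is switched via SET TBLPROPERTIES here although the
report does not state how it was set):

CREATE TABLE dec_t (id INT, amount DECIMAL(10,2)) STORED AS carbondata
TBLPROPERTIES('sort_columns'='id');
INSERT INTO dec_t SELECT 1, CAST(null AS DECIMAL(10,2));         -- decimal value inserted as null
ALTER TABLE dec_t SET TBLPROPERTIES('sort_scope'='global_sort'); -- set sort scope to global sort
INSERT INTO dec_t SELECT * FROM dec_t;                           -- insert into select, reported to fail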

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CARBONDATA-3901) Documentation issues in https://github.com/apache/carbondata/tree/master/docs

2020-07-15 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-3901:
---

 Summary: Documentation issues in 
https://github.com/apache/carbondata/tree/master/docs
 Key: CARBONDATA-3901
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3901
 Project: CarbonData
  Issue Type: Bug
  Components: docs
Affects Versions: 2.0.1
 Environment: https://github.com/apache/carbondata/tree/master/docs
Reporter: Chetan Bhat


*Issue 1 :* 
https://github.com/apache/carbondata/blob/master/docs/alluxio-guide.md
getOrCreateCarbonSession is not used in the Carbon 2.0 version and should be removed.
The "Testing use alluxio by CarbonSession" snippet in the guide that still relies on it:

import org.apache.spark.sql.CarbonSession._
import org.apache.spark.sql.SparkSession

val carbon = SparkSession.builder().master("local").appName("test").getOrCreateCarbonSession("alluxio://localhost:19998/carbondata");
carbon.sql("CREATE TABLE carbon_alluxio(id String,name String, city String,age Int) STORED as carbondata");
carbon.sql(s"LOAD DATA LOCAL INPATH '${CARBONDATA_PATH}/integration/spark/src/test/resources/sample.csv' into table carbon_alluxio");
carbon.sql("select * from carbon_alluxio").show

*Issue 2  -* 
https://github.com/apache/carbondata/blob/master/docs/ddl-of-carbondata.md
SORT_SCOPE: Sort scope of the load. Options include no sort, local sort, batch sort and
global sort --> batch sort is to be removed as it is not supported (a valid usage sketch
follows after this list).

*Issue 3 -* 
https://github.com/apache/carbondata/blob/master/docs/streaming-guide.md#close-stream
The CLOSE STREAM link is not working.
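
For Issue 2, a minimal sketch of the SORT_SCOPE usage that should remain documented
(the table name and CSV path are hypothetical):

CREATE TABLE sort_demo (id INT, name STRING) STORED AS carbondata
TBLPROPERTIES('sort_columns'='id', 'sort_scope'='local_sort');
-- per-load override; supported scopes are NO_SORT, LOCAL_SORT and GLOBAL_SORT
LOAD DATA INPATH 'hdfs://hacluster/chetan/sample.csv' INTO TABLE sort_demo
OPTIONS('SORT_SCOPE'='GLOBAL_SORT');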



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (CARBONDATA-3847) Dataload fails for table with data of 10 records having string type bucket column for if number of buckets exceed large no (300).

2020-07-09 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat closed CARBONDATA-3847.
---
Resolution: Cannot Reproduce

Can't reproduce this more than once thereafter. It might be related to cluster
configuration. Hence closing the issue.

> Dataload fails for a table with data of 10 records having a string type bucket 
> column if the number of buckets exceeds a large number (300).
> -
>
> Key: CARBONDATA-3847
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3847
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 2.0.0
> Environment: Spark 2.3.2, Spark 2.4.5
>Reporter: Chetan Bhat
>Priority: Minor
>
> *Steps -*
> 0: jdbc:hive2://10.20.251.163:23040/default> create table if not exists 
> all_data_types1(bool_1 boolean,bool_2 boolean,chinese string,Number 
> int,smallNumber smallint,BigNumber bigint,LargeDecimal double,smalldecimal 
> float,customdecimal decimal(38,15),words string,smallwords char(8),varwords 
> varchar(20),time timestamp,day date,emptyNumber int,emptysmallNumber 
> smallint,emptyBigNumber bigint,emptyLargeDecimal double,emptysmalldecimal 
> float,emptycustomdecimal decimal(38,38),emptywords string,emptysmallwords 
> char(8),emptyvarwords varchar(20)) stored as carbondata TBLPROPERTIES 
> (*'BUCKET_NUMBER'='300'*, 'BUCKET_COLUMNS'='chinese');
> +-+--+
> | Result |
> +-+--+
> +-+--+
> No rows selected (0.241 seconds)
> 0: jdbc:hive2://10.20.251.163:23040/default> LOAD DATA INPATH 
> 'hdfs://hacluster/chetan/datafile_0.csv' into table all_data_types1 
> OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='bool_1 ,bool_2 
> ,chinese ,Number ,smallNumber ,BigNumber ,LargeDecimal ,smalldecimal 
> ,customdecimal,words ,smallwords ,varwords ,time ,day ,emptyNumber 
> ,emptysmallNumber ,emptyBigNumber ,emptyLargeDecimal 
> ,emptysmalldecimal,emptycustomdecimal ,emptywords ,emptysmallwords 
> ,emptyvarwords');
> *Error: java.lang.Exception: DataLoad failure (state=,code=0)*
>  
> *Log -*
> java.lang.Exception: DataLoad failure
>  at 
> org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$.loadCarbonData(CarbonDataRDDFactory.scala:565)
>  at 
> org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.loadData(CarbonLoadDataCommand.scala:207)
>  at 
> org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.processData(CarbonLoadDataCommand.scala:168)
>  at 
> org.apache.spark.sql.execution.command.AtomicRunnableCommand$$anonfun$run$3.apply(package.scala:148)
>  at 
> org.apache.spark.sql.execution.command.AtomicRunnableCommand$$anonfun$run$3.apply(package.scala:145)
>  at 
> org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:104)
>  at 
> org.apache.spark.sql.execution.command.AtomicRunnableCommand.runWithAudit(package.scala:141)
>  at 
> org.apache.spark.sql.execution.command.AtomicRunnableCommand.run(package.scala:145)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:71)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:69)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:80)
>  at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:196)
>  at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:196)
>  at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3379)
>  at 
> org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:90)
>  at 
> org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:137)
>  at 
> org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:85)
>  at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3378)
>  at org.apache.spark.sql.Dataset.(Dataset.scala:196)
>  at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:79)
>  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:651)
>  at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:248)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:178)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:174)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at 

[jira] [Updated] (CARBONDATA-3846) Dataload fails for boolean column configured as BUCKET_COLUMNS

2020-07-08 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-3846:

Description: 
*Steps-*

0: jdbc:hive2://10.20.255.171:23040/default> create table if not exists 
all_data_types1(*bool_1 boolean*,bool_2 boolean,chinese string,Number 
int,smallNumber smallint,BigNumber bigint,LargeDecimal double,smalldecimal 
float,customdecimal decimal(38,15),words string,smallwords char(8),varwords 
varchar(20),time timestamp,day date,emptyNumber int,emptysmallNumber 
smallint,emptyBigNumber bigint,emptyLargeDecimal double,emptysmalldecimal 
float,emptycustomdecimal decimal(38,38),emptywords string,emptysmallwords 
char(8),emptyvarwords varchar(20)) stored as carbondata TBLPROPERTIES 
('BUCKET_NUMBER'='1', '*BUCKET_COLUMNS'='bool_1*');
+--+-+
| Result |
+--+-+
+--+-+
No rows selected (0.939 seconds)
 0: jdbc:hive2://10.20.255.171:23040/default> LOAD DATA INPATH 
'hdfs://hacluster/chetan/datafile_0.csv' into table all_data_types1 
OPTIONS('DELIMITER'=',' , 
'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='bool_1 ,bool_2 
,chinese ,Number ,smallNumber ,BigNumber ,LargeDecimal ,smalldecimal 
,customdecimal,words ,smallwords ,varwords ,time ,day ,emptyNumber 
,emptysmallNumber ,emptyBigNumber ,emptyLargeDecimal 
,emptysmalldecimal,emptycustomdecimal ,emptywords ,emptysmallwords 
,emptyvarwords');
 *Error: java.lang.Exception: DataLoad failure: (state=,code=0)*

 

*Log-*

java.lang.Exception: DataLoad failure: 
 at 
org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$.loadCarbonData(CarbonDataRDDFactory.scala:560)
 at 
org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.loadData(CarbonLoadDataCommand.scala:207)
 at 
org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.processData(CarbonLoadDataCommand.scala:168)
 at 
org.apache.spark.sql.execution.command.AtomicRunnableCommand$$anonfun$run$3.apply(package.scala:148)
 at 
org.apache.spark.sql.execution.command.AtomicRunnableCommand$$anonfun$run$3.apply(package.scala:145)
 at 
org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:104)
 at 
org.apache.spark.sql.execution.command.AtomicRunnableCommand.runWithAudit(package.scala:141)
 at 
org.apache.spark.sql.execution.command.AtomicRunnableCommand.run(package.scala:145)
 at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
 at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
 at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
 at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
 at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
 at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3259)
 at 
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
 at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3258)
 at org.apache.spark.sql.Dataset.(Dataset.scala:190)
 at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75)
 at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
 at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:232)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:175)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:171)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:185)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)
 2020-06-05 02:05:56,789 | ERROR | [HiveServer2-Background-Pool: Thread-138] | 
Error running hive query: | 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:179)
 org.apache.hive.service.cli.HiveSQLException: java.lang.Exception: DataLoad 
failure: 
 at 

[jira] [Created] (CARBONDATA-3867) Show materialized views command not documented in https://github.com/apache/carbondata/blob/master/docs/mv-guide.md

2020-06-23 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-3867:
---

 Summary: Show materialized views command not documented in 
https://github.com/apache/carbondata/blob/master/docs/mv-guide.md
 Key: CARBONDATA-3867
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3867
 Project: CarbonData
  Issue Type: Bug
  Components: docs
Affects Versions: 2.0.0
 Environment: 
https://github.com/apache/carbondata/blob/master/docs/mv-guide.md
Reporter: Chetan Bhat


Show materialized views command not documented in 
https://github.com/apache/carbondata/blob/master/docs/mv-guide.md
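
A minimal sketch of the command that needs documenting (syntax as understood from the MV
feature; uniqdata is only a placeholder table name):

SHOW MATERIALIZED VIEWS;
SHOW MATERIALIZED VIEWS ON TABLE uniqdata;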



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CARBONDATA-3853) Dataload fails for date column configured as BUCKET_COLUMNS

2020-06-11 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-3853:
---

 Summary: Dataload fails for date column configured as 
BUCKET_COLUMNS
 Key: CARBONDATA-3853
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3853
 Project: CarbonData
  Issue Type: Bug
  Components: data-load
Affects Versions: 2.0.0
Reporter: Chetan Bhat


Steps and Issue

0: jdbc:hive2://10.20.255.171:23040/> create table if not exists 
all_data_types1(bool_1 boolean,bool_2 boolean,chinese string,Number 
int,smallNumber smallint,BigNumber bigint,LargeDecimal double,smalldecimal 
float,customdecimal decimal(38,15),words string,smallwords char(8),varwords 
varchar(20),time timestamp,day date,emptyNumber int,emptysmallNumber 
smallint,emptyBigNumber bigint,emptyLargeDecimal double,emptysmalldecimal 
float,emptycustomdecimal decimal(38,38),emptywords string,emptysmallwords 
char(8),emptyvarwords varchar(20)) stored as carbondata TBLPROPERTIES 
('BUCKET_NUMBER'='2', *'BUCKET_COLUMNS'='day'*);
+-+--+
| Result |
+-+--+
+-+--+
No rows selected (0.494 seconds)
0: jdbc:hive2://10.20.255.171:23040/> LOAD DATA INPATH 
'hdfs://hacluster/chetan/datafile_0.csv' into table all_data_types1 
OPTIONS('DELIMITER'=',' , 
'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='bool_1 ,bool_2 
,chinese ,Number ,smallNumber ,BigNumber ,LargeDecimal ,smalldecimal 
,customdecimal,words ,smallwords ,varwords ,time ,day ,emptyNumber 
,emptysmallNumber ,emptyBigNumber ,emptyLargeDecimal 
,emptysmalldecimal,emptycustomdecimal ,emptywords ,emptysmallwords 
,emptyvarwords');
*Error: java.lang.Exception: DataLoad failure (state=,code=0)*

 

*Log-*



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CARBONDATA-3853) Dataload fails for date column configured as BUCKET_COLUMNS

2020-06-11 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-3853:

Description: 
Steps and Issue

0: jdbc:hive2://10.20.255.171:23040/> create table if not exists 
all_data_types1(bool_1 boolean,bool_2 boolean,chinese string,Number 
int,smallNumber smallint,BigNumber bigint,LargeDecimal double,smalldecimal 
float,customdecimal decimal(38,15),words string,smallwords char(8),varwords 
varchar(20),time timestamp,day date,emptyNumber int,emptysmallNumber 
smallint,emptyBigNumber bigint,emptyLargeDecimal double,emptysmalldecimal 
float,emptycustomdecimal decimal(38,38),emptywords string,emptysmallwords 
char(8),emptyvarwords varchar(20)) stored as carbondata TBLPROPERTIES 
('BUCKET_NUMBER'='2', *'BUCKET_COLUMNS'='day'*);
+--+-+
| Result |
+--+-+
+--+-+
No rows selected (0.494 seconds)
 0: jdbc:hive2://10.20.255.171:23040/> LOAD DATA INPATH 
'hdfs://hacluster/chetan/datafile_0.csv' into table all_data_types1 
OPTIONS('DELIMITER'=',' , 
'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='bool_1 ,bool_2 
,chinese ,Number ,smallNumber ,BigNumber ,LargeDecimal ,smalldecimal 
,customdecimal,words ,smallwords ,varwords ,time ,day ,emptyNumber 
,emptysmallNumber ,emptyBigNumber ,emptyLargeDecimal 
,emptysmalldecimal,emptycustomdecimal ,emptywords ,emptysmallwords 
,emptyvarwords');
 *Error: java.lang.Exception: DataLoad failure (state=,code=0)*

 

*Log-*

java.lang.Exception: DataLoad failure
 at 
org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$.loadCarbonData(CarbonDataRDDFactory.scala:560)
 at 
org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.loadData(CarbonLoadDataCommand.scala:207)
 at 
org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.processData(CarbonLoadDataCommand.scala:168)
 at 
org.apache.spark.sql.execution.command.AtomicRunnableCommand$$anonfun$run$3.apply(package.scala:148)
 at 
org.apache.spark.sql.execution.command.AtomicRunnableCommand$$anonfun$run$3.apply(package.scala:145)
 at 
org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:104)
 at 
org.apache.spark.sql.execution.command.AtomicRunnableCommand.runWithAudit(package.scala:141)
 at 
org.apache.spark.sql.execution.command.AtomicRunnableCommand.run(package.scala:145)
 at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
 at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
 at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
 at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
 at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
 at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3259)
 at 
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
 at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3258)
 at org.apache.spark.sql.Dataset.(Dataset.scala:190)
 at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75)
 at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
 at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:232)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:175)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:171)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:185)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)
2020-06-11 23:47:24,973 | ERROR | [HiveServer2-Background-Pool: Thread-104] | 
Error running hive query: | 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:179)
org.apache.hive.service.cli.HiveSQLException: java.lang.Exception: DataLoad 
failure
 at 

[jira] [Created] (CARBONDATA-3847) Dataload fails for a table with 10 records having a string type bucket column if the number of buckets is large (300).

2020-06-04 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-3847:
---

 Summary: Dataload fails for a table with 10 records having a string 
type bucket column if the number of buckets is large (300).
 Key: CARBONDATA-3847
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3847
 Project: CarbonData
  Issue Type: Bug
  Components: data-query
Affects Versions: 2.0.0
 Environment: Spark 2.3.2, Spark 2.4.5
Reporter: Chetan Bhat


*Steps -*

0: jdbc:hive2://10.20.251.163:23040/default> create table if not exists 
all_data_types1(bool_1 boolean,bool_2 boolean,chinese string,Number 
int,smallNumber smallint,BigNumber bigint,LargeDecimal double,smalldecimal 
float,customdecimal decimal(38,15),words string,smallwords char(8),varwords 
varchar(20),time timestamp,day date,emptyNumber int,emptysmallNumber 
smallint,emptyBigNumber bigint,emptyLargeDecimal double,emptysmalldecimal 
float,emptycustomdecimal decimal(38,38),emptywords string,emptysmallwords 
char(8),emptyvarwords varchar(20)) stored as carbondata TBLPROPERTIES 
(*'BUCKET_NUMBER'='300'*, 'BUCKET_COLUMNS'='chinese');
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (0.241 seconds)
0: jdbc:hive2://10.20.251.163:23040/default> LOAD DATA INPATH 
'hdfs://hacluster/chetan/datafile_0.csv' into table all_data_types1 
OPTIONS('DELIMITER'=',' , 
'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='bool_1 ,bool_2 
,chinese ,Number ,smallNumber ,BigNumber ,LargeDecimal ,smalldecimal 
,customdecimal,words ,smallwords ,varwords ,time ,day ,emptyNumber 
,emptysmallNumber ,emptyBigNumber ,emptyLargeDecimal 
,emptysmalldecimal,emptycustomdecimal ,emptywords ,emptysmallwords 
,emptyvarwords');
*Error: java.lang.Exception: DataLoad failure (state=,code=0)*
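A minimal sketch (hypothetical table names and data, not part of the original report) that isolates the variable: the same string-typed bucket column over a tiny data set, created once with a small BUCKET_NUMBER and once with 300 buckets. If the failure is driven purely by the high bucket count, only the second load is expected to fail; insert is used here only as a compact stand-in for LOAD DATA.

-- small bucket count; expected to work
create table if not exists bucket_small(chinese string, number int) stored as carbondata TBLPROPERTIES ('BUCKET_NUMBER'='8', 'BUCKET_COLUMNS'='chinese');
insert into bucket_small select 'abc', 1;

-- 300 buckets over the same tiny data set; expected to reproduce the DataLoad failure if the bucket count is the trigger
create table if not exists bucket_large(chinese string, number int) stored as carbondata TBLPROPERTIES ('BUCKET_NUMBER'='300', 'BUCKET_COLUMNS'='chinese');
insert into bucket_large select 'abc', 1;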

 

*Log -*

java.lang.Exception: DataLoad failure
 at 
org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$.loadCarbonData(CarbonDataRDDFactory.scala:565)
 at 
org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.loadData(CarbonLoadDataCommand.scala:207)
 at 
org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.processData(CarbonLoadDataCommand.scala:168)
 at 
org.apache.spark.sql.execution.command.AtomicRunnableCommand$$anonfun$run$3.apply(package.scala:148)
 at 
org.apache.spark.sql.execution.command.AtomicRunnableCommand$$anonfun$run$3.apply(package.scala:145)
 at 
org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:104)
 at 
org.apache.spark.sql.execution.command.AtomicRunnableCommand.runWithAudit(package.scala:141)
 at 
org.apache.spark.sql.execution.command.AtomicRunnableCommand.run(package.scala:145)
 at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:71)
 at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:69)
 at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:80)
 at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:196)
 at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:196)
 at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3379)
 at 
org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:90)
 at 
org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:137)
 at 
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:85)
 at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3378)
 at org.apache.spark.sql.Dataset.(Dataset.scala:196)
 at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:79)
 at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:651)
 at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:248)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:178)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:174)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:188)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at 

[jira] [Created] (CARBONDATA-3846) Dataload fails for boolean column configured as BUCKET_COLUMNS

2020-06-04 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-3846:
---

 Summary: Dataload fails for boolean column configured as 
BUCKET_COLUMNS
 Key: CARBONDATA-3846
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3846
 Project: CarbonData
  Issue Type: Bug
  Components: data-query
Affects Versions: 2.0.0
 Environment: Spark 2.3.2, Spark 2.4.5
Reporter: Chetan Bhat


*Steps-*

0: jdbc:hive2://10.20.255.171:23040/default> create table if not exists 
all_data_types1(*bool_1 boolean*,bool_2 boolean,chinese string,Number 
int,smallNumber smallint,BigNumber bigint,LargeDecimal double,smalldecimal 
float,customdecimal decimal(38,15),words string,smallwords char(8),varwords 
varchar(20),time timestamp,day date,emptyNumber int,emptysmallNumber 
smallint,emptyBigNumber bigint,emptyLargeDecimal double,emptysmalldecimal 
float,emptycustomdecimal decimal(38,38),emptywords string,emptysmallwords 
char(8),emptyvarwords varchar(20)) stored as carbondata TBLPROPERTIES 
('BUCKET_NUMBER'='1', '*BUCKET_COLUMNS'='bool_1*');
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (0.939 seconds)
0: jdbc:hive2://10.20.255.171:23040/default> LOAD DATA INPATH 
'hdfs://hacluster/chetan/datafile_0.csv' into table all_data_types1 
OPTIONS('DELIMITER'=',' , 
'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='bool_1 ,bool_2 
,chinese ,Number ,smallNumber ,BigNumber ,LargeDecimal ,smalldecimal 
,customdecimal,words ,smallwords ,varwords ,time ,day ,emptyNumber 
,emptysmallNumber ,emptyBigNumber ,emptyLargeDecimal 
,emptysmalldecimal,emptycustomdecimal ,emptywords ,emptysmallwords 
,emptyvarwords');
*Error: java.lang.Exception: DataLoad failure: (state=,code=0)*
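A minimal side-by-side sketch (hypothetical table names and data, not from the original report): the same two-column table bucketed on a string column versus on the boolean column. If only the boolean case fails, it narrows the problem to boolean BUCKET_COLUMNS rather than bucketing in general.

-- bucketing on the string column; expected to work
create table if not exists bucket_on_string(flag boolean, name string) stored as carbondata TBLPROPERTIES ('BUCKET_NUMBER'='1', 'BUCKET_COLUMNS'='name');
insert into bucket_on_string select true, 'ram';

-- bucketing on the boolean column; expected to reproduce the DataLoad failure if the issue is boolean-specific
create table if not exists bucket_on_bool(flag boolean, name string) stored as carbondata TBLPROPERTIES ('BUCKET_NUMBER'='1', 'BUCKET_COLUMNS'='flag');
insert into bucket_on_bool select true, 'ram';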

 

*Log-*

java.lang.Exception: DataLoad failure: 
 at 
org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$.loadCarbonData(CarbonDataRDDFactory.scala:560)
 at 
org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.loadData(CarbonLoadDataCommand.scala:207)
 at 
org.apache.spark.sql.execution.command.management.CarbonLoadDataCommand.processData(CarbonLoadDataCommand.scala:168)
 at 
org.apache.spark.sql.execution.command.AtomicRunnableCommand$$anonfun$run$3.apply(package.scala:148)
 at 
org.apache.spark.sql.execution.command.AtomicRunnableCommand$$anonfun$run$3.apply(package.scala:145)
 at 
org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:104)
 at 
org.apache.spark.sql.execution.command.AtomicRunnableCommand.runWithAudit(package.scala:141)
 at 
org.apache.spark.sql.execution.command.AtomicRunnableCommand.run(package.scala:145)
 at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
 at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
 at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
 at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
 at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
 at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3259)
 at 
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
 at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3258)
 at org.apache.spark.sql.Dataset.(Dataset.scala:190)
 at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75)
 at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
 at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:232)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:175)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:171)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:185)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)
2020-06-05 02:05:56,789 | ERROR | [HiveServer2-Background-Pool: Thread-138] | 
Error running hive query: | 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:179)

[jira] [Updated] (CARBONDATA-3845) Bucket table creation fails with exception for empty BUCKET_NUMBER and BUCKET_COLUMNS

2020-06-04 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-3845:

Description: 
*Steps and Issue-*

0: jdbc:hive2://10.20.251.163:23040/default> create table if not exists 
all_data_types1(bool_1 boolean,bool_2 boolean,chinese string,Number 
int,smallNumber smallint,BigNumber bigint,LargeDecimal double,smalldecimal 
float,customdecimal decimal(38,15),words string,smallwords char(8),varwords 
varchar(20),time timestamp,day date,emptyNumber int,emptysmallNumber 
smallint,emptyBigNumber bigint,emptyLargeDecimal double,emptysmalldecimal 
float,emptycustomdecimal decimal(38,38),emptywords string,emptysmallwords 
char(8),emptyvarwords varchar(20)) stored as carbondata TBLPROPERTIES 
(*'BUCKET_NUMBER'='', 'BUCKET_COLUMNS'=''*);
 *Error: java.lang.NumberFormatException: For input string: "" (state=,code=0)*

 The same issue occurs if only BUCKET_NUMBER is empty.

0: jdbc:hive2://10.20.251.163:23040/default> create table if not exists 
all_data_types1(bool_1 boolean,bool_2 boolean,chinese string,Number 
int,smallNumber smallint,BigNumber bigint,LargeDecimal double,smalldecimal 
float,customdecimal decimal(38,15),words string,smallwords char(8),varwords 
varchar(20),time timestamp,day date,emptyNumber int,emptysmallNumber 
smallint,emptyBigNumber bigint,emptyLargeDecimal double,emptysmalldecimal 
float,emptycustomdecimal decimal(38,38),emptywords string,emptysmallwords 
char(8),emptyvarwords varchar(20)) stored as carbondata TBLPROPERTIES 
(*'BUCKET_NUMBER'=''*, 'BUCKET_COLUMNS'='test');
 *Error: java.lang.NumberFormatException: For input string: "" (state=,code=0)*

*Log-*

2020-06-05 01:52:32,633 | ERROR | [HiveServer2-Background-Pool: Thread-102] | 
Error executing query, currentState RUNNING,  | 
org.apache.spark.internal.Logging$class.logError(Logging.scala:91)
java.lang.NumberFormatException:
 For input string: "" at 
java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) 
at java.lang.Integer.parseInt(Integer.java:592) at 
java.lang.Integer.parseInt(Integer.java:615) at 
scala.collection.immutable.StringLike$class.toInt(StringLike.scala:272) at 
scala.collection.immutable.StringOps.toInt(StringOps.scala:29) at 
org.apache.carbondata.spark.CarbonOption.bucketNumber$lzycompute(CarbonOption.scala:61)
 at 
org.apache.carbondata.spark.CarbonOption.bucketNumber(CarbonOption.scala:61) at 
org.apache.spark.sql.parser.CarbonSpark2SqlParser.getBucketFields(CarbonSpark2SqlParser.scala:765)
 at 
org.apache.spark.sql.parser.CarbonSparkSqlParserUtil$.buildTableInfoFromCatalogTable(CarbonSparkSqlParserUtil.scala:382)
 at org.apache.spark.sql.CarbonSource$.createTableInfo(CarbonSource.scala:235) 
at org.apache.spark.sql.CarbonSource$.createTableMeta(CarbonSource.scala:382) 
at 
org.apache.spark.sql.execution.command.table.CarbonCreateDataSourceTableCommand.processMetadata(CarbonCreateDataSourceTableCommand.scala:69)
 at 
org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:123)
 at 
org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:123)
 at 
org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:104)
 at 
org.apache.spark.sql.execution.command.MetadataCommand.runWithAudit(package.scala:120)
 at 
org.apache.spark.sql.execution.command.MetadataCommand.run(package.scala:123) 
at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
 at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
 at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
 at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190) at 
org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190) at 
org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3259) at 
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
 at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3258) at 
org.apache.spark.sql.Dataset.(Dataset.scala:190) at 
org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75) at 
org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642) at 
org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694) at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:232)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:175)
 at 

[jira] [Updated] (CARBONDATA-3845) Bucket table creation fails with exception for empty BUCKET_NUMBER and BUCKET_COLUMNS

2020-06-04 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-3845:

Description: 
*Steps and Issue-*

0: jdbc:hive2://10.20.251.163:23040/default> create table if not exists 
all_data_types1(bool_1 boolean,bool_2 boolean,chinese string,Number 
int,smallNumber smallint,BigNumber bigint,LargeDecimal double,smalldecimal 
float,customdecimal decimal(38,15),words string,smallwords char(8),varwords 
varchar(20),time timestamp,day date,emptyNumber int,emptysmallNumber 
smallint,emptyBigNumber bigint,emptyLargeDecimal double,emptysmalldecimal 
float,emptycustomdecimal decimal(38,38),emptywords string,emptysmallwords 
char(8),emptyvarwords varchar(20)) stored as carbondata TBLPROPERTIES 
('BUCKET_NUMBER'='', 'BUCKET_COLUMNS'='');
 *Error: java.lang.NumberFormatException: For input string: "" (state=,code=0)*

 The same issue occurs if only BUCKET_NUMBER is empty.

0: jdbc:hive2://10.20.251.163:23040/default> create table if not exists 
all_data_types1(bool_1 boolean,bool_2 boolean,chinese string,Number 
int,smallNumber smallint,BigNumber bigint,LargeDecimal double,smalldecimal 
float,customdecimal decimal(38,15),words string,smallwords char(8),varwords 
varchar(20),time timestamp,day date,emptyNumber int,emptysmallNumber 
smallint,emptyBigNumber bigint,emptyLargeDecimal double,emptysmalldecimal 
float,emptycustomdecimal decimal(38,38),emptywords string,emptysmallwords 
char(8),emptyvarwords varchar(20)) stored as carbondata TBLPROPERTIES 
('BUCKET_NUMBER'='', 'BUCKET_COLUMNS'='test');
*Error: java.lang.NumberFormatException: For input string: "" (state=,code=0)*

*Log-*

2020-06-05 01:52:32,633 | ERROR | [HiveServer2-Background-Pool: Thread-102] | 
Error executing query, currentState RUNNING,  | 
org.apache.spark.internal.Logging$class.logError(Logging.scala:91)
java.lang.NumberFormatException:
 For input string: "" at 
java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) 
at java.lang.Integer.parseInt(Integer.java:592) at 
java.lang.Integer.parseInt(Integer.java:615) at 
scala.collection.immutable.StringLike$class.toInt(StringLike.scala:272) at 
scala.collection.immutable.StringOps.toInt(StringOps.scala:29) at 
org.apache.carbondata.spark.CarbonOption.bucketNumber$lzycompute(CarbonOption.scala:61)
 at 
org.apache.carbondata.spark.CarbonOption.bucketNumber(CarbonOption.scala:61) at 
org.apache.spark.sql.parser.CarbonSpark2SqlParser.getBucketFields(CarbonSpark2SqlParser.scala:765)
 at 
org.apache.spark.sql.parser.CarbonSparkSqlParserUtil$.buildTableInfoFromCatalogTable(CarbonSparkSqlParserUtil.scala:382)
 at org.apache.spark.sql.CarbonSource$.createTableInfo(CarbonSource.scala:235) 
at org.apache.spark.sql.CarbonSource$.createTableMeta(CarbonSource.scala:382) 
at 
org.apache.spark.sql.execution.command.table.CarbonCreateDataSourceTableCommand.processMetadata(CarbonCreateDataSourceTableCommand.scala:69)
 at 
org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:123)
 at 
org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:123)
 at 
org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:104)
 at 
org.apache.spark.sql.execution.command.MetadataCommand.runWithAudit(package.scala:120)
 at 
org.apache.spark.sql.execution.command.MetadataCommand.run(package.scala:123) 
at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
 at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
 at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
 at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190) at 
org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190) at 
org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3259) at 
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
 at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3258) at 
org.apache.spark.sql.Dataset.(Dataset.scala:190) at 
org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75) at 
org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642) at 
org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694) at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:232)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:175)
 at 

[jira] [Created] (CARBONDATA-3845) Bucket table creation fails with exception for empty BUCKET_NUMBER and BUCKET_COLUMNS

2020-06-04 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-3845:
---

 Summary: Bucket table creation fails with exception for empty 
BUCKET_NUMBER and BUCKET_COLUMNS
 Key: CARBONDATA-3845
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3845
 Project: CarbonData
  Issue Type: Bug
  Components: data-query
Affects Versions: 2.0.0
 Environment: Spark 2.3.2
Reporter: Chetan Bhat


*Steps and Issue-*

0: jdbc:hive2://10.20.251.163:23040/default> create table if not exists 
all_data_types1(bool_1 boolean,bool_2 boolean,chinese string,Number 
int,smallNumber smallint,BigNumber bigint,LargeDecimal double,smalldecimal 
float,customdecimal decimal(38,15),words string,smallwords char(8),varwords 
varchar(20),time timestamp,day date,emptyNumber int,emptysmallNumber 
smallint,emptyBigNumber bigint,emptyLargeDecimal double,emptysmalldecimal 
float,emptycustomdecimal decimal(38,38),emptywords string,emptysmallwords 
char(8),emptyvarwords varchar(20)) stored as carbondata TBLPROPERTIES 
('BUCKET_NUMBER'='', 'BUCKET_COLUMNS'='');
*Error: java.lang.NumberFormatException: For input string: "" (state=,code=0)*
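For reference, a sketch of the intended non-empty usage, which parses cleanly (the table name and values below are illustrative only). The expectation in this report is that empty strings should be rejected with a clear validation message instead of surfacing a raw NumberFormatException.

create table if not exists bucket_props_ok(name string, id int) stored as carbondata TBLPROPERTIES ('BUCKET_NUMBER'='4', 'BUCKET_COLUMNS'='name');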

 

*Log-*

2020-06-05 01:52:32,633 | ERROR | [HiveServer2-Background-Pool: Thread-102] | 
Error executing query, currentState RUNNING,  | 
org.apache.spark.internal.Logging$class.logError(Logging.scala:91)
java.lang.NumberFormatException:
 For input string: "" at 
java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) 
at java.lang.Integer.parseInt(Integer.java:592) at 
java.lang.Integer.parseInt(Integer.java:615) at 
scala.collection.immutable.StringLike$class.toInt(StringLike.scala:272) at 
scala.collection.immutable.StringOps.toInt(StringOps.scala:29) at 
org.apache.carbondata.spark.CarbonOption.bucketNumber$lzycompute(CarbonOption.scala:61)
 at 
org.apache.carbondata.spark.CarbonOption.bucketNumber(CarbonOption.scala:61) at 
org.apache.spark.sql.parser.CarbonSpark2SqlParser.getBucketFields(CarbonSpark2SqlParser.scala:765)
 at 
org.apache.spark.sql.parser.CarbonSparkSqlParserUtil$.buildTableInfoFromCatalogTable(CarbonSparkSqlParserUtil.scala:382)
 at org.apache.spark.sql.CarbonSource$.createTableInfo(CarbonSource.scala:235) 
at org.apache.spark.sql.CarbonSource$.createTableMeta(CarbonSource.scala:382) 
at 
org.apache.spark.sql.execution.command.table.CarbonCreateDataSourceTableCommand.processMetadata(CarbonCreateDataSourceTableCommand.scala:69)
 at 
org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:123)
 at 
org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:123)
 at 
org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:104)
 at 
org.apache.spark.sql.execution.command.MetadataCommand.runWithAudit(package.scala:120)
 at 
org.apache.spark.sql.execution.command.MetadataCommand.run(package.scala:123) 
at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
 at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
 at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
 at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190) at 
org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190) at 
org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3259) at 
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
 at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3258) at 
org.apache.spark.sql.Dataset.(Dataset.scala:190) at 
org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75) at 
org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642) at 
org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694) at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:232)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:175)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:171)
 at java.security.AccessController.doPrivileged(Native Method) at 
javax.security.auth.Subject.doAs(Subject.java:422) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
 at 
org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:185)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at 

[jira] [Created] (CARBONDATA-3842) Select with limit displays incorrect resultset after datamap creation

2020-06-02 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-3842:
---

 Summary: Select with limit displays incorrect resultset after 
datamap creation
 Key: CARBONDATA-3842
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3842
 Project: CarbonData
  Issue Type: Bug
  Components: data-query
Affects Versions: 2.0.1
 Environment: Spark 2.3.2
Reporter: Chetan Bhat


*Steps :-*

create table tab1(id int, name string, dept string) STORED as carbondata;
create materialized view datamap31 as select a.id, a.name from tab1 a;
insert into tab1 select 1,'ram','cs';
insert into tab1 select 2,'shyam','it';
select a.id, a.name from tab1 a order by a.id limit 1;

*Issue :*  

Select with limit displays incorrect resultset (2 records instead of 1) after 
datamap creation.

0: jdbc:hive2://10.20.251.163:23040/default> select a.id, a.name from tab1 a 
order by a.id limit 1;
INFO : Execution ID: 558
+-----+--------+--+
| id  | name   |
+-----+--------+--+
| 2   | shyam  |
| 1   | ram    |
+-----+--------+--+
*2 rows selected (0.601 seconds)*
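One way to check whether the query is being rewritten against the materialized view (a diagnostic sketch, not part of the original report) is to inspect the plan; the correct result for the data above would be the single row (1, ram).

explain select a.id, a.name from tab1 a order by a.id limit 1;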

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CARBONDATA-3838) Select filter query fails on SI columns of different SI tables.

2020-05-31 Thread Chetan Bhat (Jira)
Chetan Bhat created CARBONDATA-3838:
---

 Summary: Select filter query fails on SI columns of different SI 
tables.
 Key: CARBONDATA-3838
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3838
 Project: CarbonData
  Issue Type: Bug
  Components: data-query
Affects Versions: 2.0.0
 Environment: Spark 2.3.2
Reporter: Chetan Bhat


Select filter query fails on SI columns of different SI tables.

*Steps :-*

0: jdbc:hive2://10.20.255.171:23040/default> create table brinjal (imei 
string,AMSize string,channelsId string,ActiveCountry string, Activecity 
string,gamePointId double,deviceInformationId double,productionDate 
Timestamp,deliveryDate timestamp,deliverycharge double) stored as carbondata 
TBLPROPERTIES('inverted_index'='imei,AMSize,channelsId,ActiveCountry,Activecity,productionDate,deliveryDate','sort_columns'='imei,AMSize,channelsId,ActiveCountry,Activecity,productionDate,deliveryDate','table_blocksize'='1','SORT_SCOPE'='GLOBAL_SORT','carbon.column.compressor'='zstd');
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (0.153 seconds)
0: jdbc:hive2://10.20.255.171:23040/default> LOAD DATA INPATH 
'hdfs://hacluster/chetan/vardhandaterestruct.csv' INTO TABLE brinjal 
OPTIONS('DELIMITER'=',', 'QUOTECHAR'= 
'"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 
'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (2.357 seconds)
0: jdbc:hive2://10.20.255.171:23040/default> CREATE INDEX indextable1 ON TABLE 
brinjal (channelsId) AS 'carbondata' 
PROPERTIES('carbon.column.compressor'='zstd');
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (1.048 seconds)
0: jdbc:hive2://10.20.255.171:23040/default> CREATE INDEX indextable2 ON TABLE 
brinjal (ActiveCountry) AS 'carbondata' 
PROPERTIES('carbon.column.compressor'='zstd');
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (1.895 seconds)
0: jdbc:hive2://10.20.255.171:23040/default> select * from brinjal where 
ActiveCountry ='Chinese' or channelsId =4;
Error: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: execute, 
tree:
Exchange hashpartitioning(positionReference#6440, 200)
+- *(6) HashAggregate(keys=[positionReference#6440], functions=[], output=[positionReference#6440])
   +- Union
      :- *(3) HashAggregate(keys=[positionReference#6440], functions=[], output=[positionReference#6440])
      :  +- Exchange hashpartitioning(positionReference#6440, 200)
      :     +- *(2) HashAggregate(keys=[positionReference#6440], functions=[], output=[positionReference#6440])
      :        +- *(2) Project [positionReference#6440]
      :           +- *(2) Filter (cast(channelsid#6439 as int) = 4)
      :              +- *(2) FileScan carbondata 2_0.indextable1[positionReference#6440,channelsid#6439] PushedFilters: [CastExpr((cast(channelsid#6439 as int) = 4))], ReadSchema: struct
      +- *(5) HashAggregate(keys=[positionReference#6442], functions=[], output=[positionReference#6442])
         +- Exchange hashpartitioning(positionReference#6442, 200)
            +- *(4) HashAggregate(keys=[positionReference#6442], functions=[], output=[positionReference#6442])
               +- *(4) Project [positionReference#6442]
                  +- *(4) Filter (activecountry#6441 = Chinese)
                     +- *(4) FileScan carbondata 2_0.indextable2[positionReference#6442,activecountry#6441] PushedFilters: [EqualTo(activecountry,Chinese)], ReadSchema: struct (state=,code=0)
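As a possible untested workaround (not part of the original report) while the OR predicate over two different SI columns fails, the filter can be split so each secondary index is consulted separately and the results combined. Note that UNION removes duplicate rows, so this matches the OR filter only when the table has no fully duplicated rows.

select * from brinjal where ActiveCountry = 'Chinese'
union
select * from brinjal where channelsId = 4;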

 

*Log -*

org.apache.carbondata.core.datastore.block.SegmentPropertiesAndSchemaHolder.addSegmentProperties(SegmentPropertiesAndSchemaHolder.java:117)
2020-06-01 12:19:28,058 | ERROR | [HiveServer2-Background-Pool: Thread-1150] | Error 
executing query, currentState RUNNING,  | 
org.apache.spark.internal.Logging$class.logError(Logging.scala:91)
org.apache.spark.sql.catalyst.errors.package$TreeNodeException: execute, tree:
Exchange hashpartitioning(positionReference#6440, 200)
+- *(6) HashAggregate(keys=[positionReference#6440], functions=[], output=[positionReference#6440])
   +- Union
      :- *(3) HashAggregate(keys=[positionReference#6440], functions=[], output=[positionReference#6440])
      :  +- Exchange hashpartitioning(positionReference#6440, 200)
      :     +- *(2) HashAggregate(keys=[positionReference#6440], functions=[], output=[positionReference#6440])
      :        +- *(2) Project [positionReference#6440]
      :           +- *(2) Filter (cast(channelsid#6439 as int) = 4)
      :              +- *(2) FileScan carbondata 2_0.indextable1[positionReference#6440,channelsid#6439] PushedFilters: [CastExpr((cast(channelsid#6439 as int) = 4))], ReadSchema: struct
      +- *(5) HashAggregate(keys=[positionReference#6442],

[jira] [Commented] (CARBONDATA-3797) Refresh materialized view command throws null pointer exception

2020-05-19 Thread Chetan Bhat (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-3797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17110901#comment-17110901
 ] 

Chetan Bhat commented on CARBONDATA-3797:
-

Added other steps-queries

> Refresh materialized view command throws null pointer exception
> ---
>
> Key: CARBONDATA-3797
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3797
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 2.0.0
> Environment: Spark 2.3.2, Spark 2.4.5
>Reporter: Chetan Bhat
>Priority: Major
>
> Refresh materialized view command throws null pointer exception
> CREATE TABLE uniqdata_mv(CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
> string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) STORED as carbondata 
> TBLPROPERTIES('local_dictionary_enable'='true','local_dictionary_threshold'='1000');
> LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table 
> uniqdata_mv OPTIONS('DELIMITER'=',', 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> CREATE MATERIALIZED VIEW mv1 as select cust_id, cust_name, count(cust_id) 
> from uniqdata_mv group by cust_id, cust_name;
> refresh MATERIALIZED VIEW mv1;
>  Error: java.lang.NullPointerException (state=,code=0)
>  
> *Exception-*
> 2020-05-06 00:50:59,941 | ERROR | [HiveServer2-Background-Pool: Thread-1822] 
> | Error executing query, currentState RUNNING, | 
> org.apache.spark.internal.Logging$class.logError(Logging.scala:91)
>  java.lang.NullPointerException
>  at org.apache.carbondata.view.MVRefresher$.refresh(MVRefresher.scala:62)
>  at 
> org.apache.spark.sql.execution.command.view.CarbonRefreshMVCommand.processData(CarbonRefreshMVCommand.scala:52)
>  at 
> org.apache.spark.sql.execution.command.DataCommand$$anonfun$run$2.apply(package.scala:132)
>  at 
> org.apache.spark.sql.execution.command.DataCommand$$anonfun$run$2.apply(package.scala:132)
>  at 
> org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:104)
>  at 
> org.apache.spark.sql.execution.command.DataCommand.runWithAudit(package.scala:130)
>  at org.apache.spark.sql.execution.command.DataCommand.run(package.scala:132)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
>  at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)
>  at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)
>  at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3370)
>  at 
> org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:80)
>  at 
> org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:127)
>  at 
> org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:75)
>  at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3369)
>  at org.apache.spark.sql.Dataset.(Dataset.scala:194)
>  at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:79)
>  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
>  at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:232)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:175)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:171)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:185)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> 
