[jira] [Created] (CARBONDATA-4298) IS_EMPTY_DATA_BAD_RECORD property not supported for complex types.

2021-10-05 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-4298:
-

 Summary: IS_EMPTY_DATA_BAD_RECORD property not supported for 
complex types.
 Key: CARBONDATA-4298
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4298
 Project: CarbonData
  Issue Type: Bug
Reporter: SHREELEKHYA GAMPA


The {{IS_EMPTY_DATA_BAD_RECORD}} property is not supported for complex types. 
This property is a flag that determines whether an empty record is treated as 
a bad record.
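
As a rough illustration of what applying the flag to complex types could mean, 
the sketch below walks nested array/struct/map values and reports a row as bad 
when the flag is on and any leaf is an empty string. The function names and the 
Python representation of complex values are assumptions made for this example; 
this is not CarbonData's implementation.

```python
from typing import Any


def has_empty_leaf(value: Any) -> bool:
    """Return True if value, or any value nested inside it, is an empty string."""
    if isinstance(value, str):
        return value == ""
    if isinstance(value, dict):  # stands in for a map or struct
        return any(has_empty_leaf(k) or has_empty_leaf(v) for k, v in value.items())
    if isinstance(value, (list, tuple)):  # stands in for an array
        return any(has_empty_leaf(item) for item in value)
    return False


def is_bad_record(row: list, is_empty_data_bad_record: bool) -> bool:
    """Hypothetical check: empty data only counts as bad when the flag is set."""
    return is_empty_data_bad_record and any(has_empty_leaf(col) for col in row)
```

With the flag off, a row containing an empty nested value is accepted; with the 
flag on, the same row is reported as a bad record.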



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CARBONDATA-4292) Support spatial index creation using data frame

2021-09-26 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-4292:
-

 Summary: Support spatial index creation using data frame
 Key: CARBONDATA-4292
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4292
 Project: CarbonData
  Issue Type: New Feature
Reporter: SHREELEKHYA GAMPA


Add support for creating a spatial index using a data frame.





[jira] [Created] (CARBONDATA-4284) Load/insert after alter add column on partition table with complex column fails

2021-09-14 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-4284:
-

 Summary: Load/insert after alter add column on partition table 
with complex column fails 
 Key: CARBONDATA-4284
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4284
 Project: CarbonData
  Issue Type: Bug
Reporter: SHREELEKHYA GAMPA


Insert after alter add column on a partition table with a complex column fails 
with BufferUnderflowException.

[Steps] :-

drop table if exists strarmap1;

create table strarmap1(id int,str struct>,arr array>) PARTITIONED BY(name string) 
stored as carbondata 
tblproperties('local_dictionary_enable'='true','local_dictionary_include'='name,str,arr');

load data inpath 'hdfs://hacluster/chetan/strarmap1.csv' into table strarmap1 
partition(name='name0') 
options('fileheader'='id,name,str,arr','COMPLEX_DELIMITER_LEVEL_3'='#','COMPLEX_DELIMITER_LEVEL_2'='$','COMPLEX_DELIMITER_LEVEL_1'='&','BAD_RECORDS_ACTION'='FORCE');

select * from strarmap1 limit 1;

show partitions strarmap1;

ALTER TABLE strarmap1 ADD COLUMNS(map1 Map, map2 Map, map3 Map, map4 Map, map5 
Map,map6 Map,map7 map>, map8 map>>);

load data inpath 'hdfs://hacluster/chetan/strarmap1.csv' into table strarmap1 
partition(name='name0') 
options('fileheader'='id,name,str,arr,map1,map2,map3,map4,map5,map6,map7,map8','COMPLEX_DELIMITER_LEVEL_3'='#','COMPLEX_DELIMITER_LEVEL_2'='$','COMPLEX_DELIMITER_LEVEL_1'='&','BAD_RECORDS_ACTION'='FORCE');

[Expected Result] :- Load after adding map columns on a partition table should 
succeed.

[Actual Issue]:- Load after adding map columns on a partition table fails.





[jira] [Created] (CARBONDATA-4282) Issues with table having complex columns related to long string, SI, local dictionary

2021-09-07 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-4282:
-

 Summary: Issues with table having complex columns related to long 
string, SI, local dictionary
 Key: CARBONDATA-4282
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4282
 Project: CarbonData
  Issue Type: Bug
Reporter: SHREELEKHYA GAMPA


*1. Insert/load fails after alter add complex column if table contains long 
string columns.*

  [Steps] :-  

DROP TABLE IF EXISTS alter_com;

CREATE TABLE alter_com(intfield int,EDUCATED string ,rankk string ) STORED AS 
carbondata 
TBLPROPERTIES('inverted_index'='intField','sort_columns'='intField','TABLE_BLOCKSIZE'='256 MB','TABLE_BLOCKLET_SIZE'='8','SORT_SCOPE'='no_sort','COLUMN_META_CACHE'='rankk','carbon.column.compressor'='gzip','long_string_columns'='rankk','table_page_size_inmb'='1');

insert into alter_com values(1,'cse','xi'); select * from alter_com limit 1;

ALTER TABLE alter_com ADD COLUMNS(map1 Map, map2 Map, 
map3 Map, map4 Map, map5 
Map,map6 Map,map7 map>, 
map8 map>>); 

ALTER TABLE alter_com SET TBLPROPERTIES('long_string_columns'='EDUCATED');

insert into alter_com values(1,'ece','x', map(1,2),map(3,2.34), 
map(1.23,'hello'),map('abc','def'), 
map(true,'2017-02-01'),map('time','2018-02-01 
02:00:00.0'),map('ph',array(1,2)), 
map('a',named_struct('d',23,'s',named_struct('im','sh'))));

[Expected Result] :- Insert/load should succeed after alter add map column 
if the table contains long string columns.

*2. create index on array of complex column (map/struct) throws null pointer 
exception instead of correct error message.*

[Steps] :-

drop table if exists strarmap1;

create table strarmap1(id int,name string,str struct>,arr array>) stored as 
carbondata 
tblproperties('inverted_index'='name','sort_columns'='name','TABLE_BLOCKSIZE'='256 MB','TABLE_BLOCKLET_SIZE'='8','CACHE_LEVEL'='BLOCKLET');

load data inpath 'hdfs://hacluster/chetan/strarmap1.csv' into table strarmap1 
options('fileheader'='id,name,str,arr','COMPLEX_DELIMITER_LEVEL_3'='#','COMPLEX_DELIMITER_LEVEL_2'='$','COMPLEX_DELIMITER_LEVEL_1'='&','BAD_RECORDS_ACTION'='FORCE');

CREATE INDEX index2 ON TABLE strarmap1 (arr) as 'carbondata' 
properties('sort_scope'='global_sort','global_sort_partitions'='3');

[Expected Result] :- create index on array of map(string,timestamp) should 
throw a correct validation error message.

[Actual Issue]:- create index on array of map(string,timestamp) throws a null 
pointer exception instead of a correct error message.

*3. alter table property local dictionary include/exclude with newly added map 
column is failing.*

[Steps] :-

drop table if exists strarmap1;

create table strarmap1(id int,name string,str struct>,arr array>) stored as 
carbondata 
tblproperties('inverted_index'='name','sort_columns'='name','local_dictionary_enable'='false','local_dictionary_include'='map1','local_dictionary_exclude'='str,arr','local_dictionary_threshold'='1000');

ALTER TABLE strarmap1 ADD COLUMNS(map1 Map, map2 Map, map3 Map, map4 Map, map5 
Map,map6 Map,map7 map>, map8 map>>);

ALTER TABLE strarmap1 SET 
TBLPROPERTIES('LOCAL_DICTIONARY_ENABLE'='true','local_dictionary_include'='map4','local_dictionary_threshold'='1000');

[Expected Result] :- alter table property local dictionary include/exclude with 
a newly added map column should succeed.

[Actual Issue]:- alter table property local dictionary include/exclude with 
a newly added map column fails.





[jira] [Created] (CARBONDATA-4274) Create partition table error with spark 3.1

2021-08-23 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-4274:
-

 Summary:  Create partition table error with spark 3.1
 Key: CARBONDATA-4274
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4274
 Project: CarbonData
  Issue Type: Bug
Reporter: SHREELEKHYA GAMPA


With Spark 3.1, we can create a partition table by specifying partition columns 
from the schema, as in the example below:
{{create table partitionTable(c1 int, c2 int, v1 string, v2 string) stored as 
carbondata partitioned by (v2,c2)}}

When the table is created by a SparkSession with CarbonExtension, the catalog 
table is created with the specified partitions.
But in a cluster, or with a carbon session, creating a partition table with the 
above syntax creates a normal table with no partitions.
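
To make the expected behaviour concrete, the sketch below shows one way 
partition columns named in {{partitioned by (v2,c2)}} could be resolved against 
the declared schema. The function name and the list-of-pairs schema 
representation are assumptions for illustration only; this is not how Spark or 
CarbonData implements the resolution.

```python
def split_partition_columns(schema, partition_names):
    """Split (name, type) pairs into data columns and partition columns.

    Partition columns are looked up by name in the declared schema, so they
    carry no type of their own in the PARTITIONED BY clause.
    """
    by_name = {name.lower(): (name, dtype) for name, dtype in schema}
    missing = [p for p in partition_names if p.lower() not in by_name]
    if missing:
        raise ValueError(f"partition columns not found in schema: {missing}")
    wanted = {p.lower() for p in partition_names}
    partition_cols = [by_name[p.lower()] for p in partition_names]
    data_cols = [(n, t) for n, t in schema if n.lower() not in wanted]
    return data_cols, partition_cols


schema = [("c1", "int"), ("c2", "int"), ("v1", "string"), ("v2", "string")]
data_cols, partition_cols = split_partition_columns(schema, ["v2", "c2"])
```

For the example table, this leaves c1 and v1 as data columns and v2 and c2 as 
partition columns, which is the result expected from the statement above.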





[jira] [Commented] (CARBONDATA-4119) User Input for GeoID column not validated.

2021-08-17 Thread SHREELEKHYA GAMPA (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-4119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17400186#comment-17400186
 ] 

SHREELEKHYA GAMPA commented on CARBONDATA-4119:
---

As part of an enhancement, changes were made to support insert with a 
customized geoID. The documentation will be updated accordingly.

> User Input for GeoID column not validated.
> --
>
> Key: CARBONDATA-4119
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4119
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 2.1.0
>Reporter: PURUJIT CHAUGULE
>Priority: Minor
>
> * User Input for geoId column can be paired to multiple pairs of source 
> columns values (correct internally calculated geoID values are different for 
> such above source columns values).
>  * The advantage of using geoID is not applicable when taking user input for 
> GeoId column is not validated and user input values may differ from actual 
> internally calculated values. GeoID value is only generated internally if 
> user does not input the geoID column.





[jira] [Commented] (CARBONDATA-4165) Carbondata summing up two values of same timestamp.

2021-07-20 Thread SHREELEKHYA GAMPA (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-4165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17383867#comment-17383867
 ] 

SHREELEKHYA GAMPA commented on CARBONDATA-4165:
---

Hi Suyash,

Can you please share more details of the problem, with some example queries?

 

> Carbondata summing up two values of same timestamp.
> ---
>
> Key: CARBONDATA-4165
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4165
> Project: CarbonData
>  Issue Type: Wish
>  Components: core
>Affects Versions: 2.0.1
> Environment: apache carbondata 2.0.1, apache spark 2.4.5 hadoop 2.7.2
>Reporter: suyash yadav
>Priority: Major
> Fix For: 2.0.1
>
>
> Hi Team,
>  
> We have seen a behaviour while using Carbondata 2.0.1 that if we get 2 values 
> for same timestamp then it tries to sum both the values and put it as one 
> value. Instead we need that it should discard previous  value and use the 
> latest one.
>  
> Please let us know if there is any functionality already available in 
> carbondata to handle duplicate values by it self or if there is any plan to 
> implement such a functionality.
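
The behaviour requested above (discard the previous value and keep the latest 
one for a repeated timestamp) can be sketched as follows. This only illustrates 
the requested semantics on plain Python data, with a hypothetical function 
name; it does not describe an existing CarbonData feature.

```python
def keep_latest(rows):
    """Keep only the last value seen for each timestamp.

    rows is an iterable of (timestamp, value) pairs in arrival order; a later
    pair with the same timestamp replaces the earlier one instead of being
    summed with it.
    """
    latest = {}
    for timestamp, value in rows:
        latest[timestamp] = value
    return sorted(latest.items())


deduplicated = keep_latest([("10:00", 10), ("10:05", 5), ("10:00", 7)])
```

Here the two values for "10:00" collapse to the later value 7 rather than the 
sum 17.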





[jira] [Created] (CARBONDATA-4231) On update operation with 3.1v, cloned spark session is used and set properties are lost.

2021-06-23 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-4231:
-

 Summary: On update operation with 3.1v, cloned spark session is 
used and set properties are lost.
 Key: CARBONDATA-4231
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4231
 Project: CarbonData
  Issue Type: Bug
Reporter: SHREELEKHYA GAMPA


*Update operation with bad records property fails with 3.1v.* 

*[Steps to reproduce]:*

0: jdbc:hive2://linux-221:22550/> set carbon.options.bad.records.action=force;
+------------------------------------+--------+
| key                                | value  |
+------------------------------------+--------+
| carbon.options.bad.records.action  | force  |
+------------------------------------+--------+
1 row selected (0.04 seconds)
0: jdbc:hive2://linux-221:22550/> create table t_carbn1(item_type_cd int, 
sell_price bigint, profit decimal(10,4), item_name string, update_time 
timestamp) stored a
+---------+
| Result  |
+---------+
+---------+
No rows selected (2.117 seconds)
0: jdbc:hive2://linux-221:22550/> insert into t_carbn1 select 2, 
10,23.3,'Apple','2012-11-11 11:11:11';
INFO : Execution ID: 858
+-------------+
| Segment ID  |
+-------------+
| 0           |
+-------------+
1 row selected (4.278 seconds)
0: jdbc:hive2://linux-221:22550/> update t_carbn1 set (item_type_cd) = 
(item_type_cd/1);
Error: org.apache.hive.service.cli.HiveSQLException: Error running query: 
java.lang.RuntimeException: Update operation failed. DataLoad failure

*[Root cause]:*

On the update command, persist is called. With the latest Spark 3.1 changes, 
Spark returns a cloned SparkSession from the cacheManager with all specified 
configurations disabled. Because 3.1 now uses a different SparkSession that is 
not initialized in CarbonEnv, CarbonEnv.init is called, and a new 
CarbonSessionInfo is created with no sessionParams. As a result, the properties 
that were set are not accessible.

Spark creates the cloned SparkSession based on the following properties:

1. spark.sql.optimizer.canChangeCachedPlanOutputPartitioning

2. spark.sql.sources.bucketing.autoBucketedScan.enabled

3.  spark.sql.adaptive.enabled
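
The root cause can be modelled with a small sketch: a registry that keeps 
per-session parameters keyed by session object hands a cloned session an empty 
parameter set unless it copies the parent's values. The classes Session and 
EnvRegistry and the inherit_from_parent switch are hypothetical stand-ins for 
SparkSession and CarbonEnv; the sketch shows the failure mode and one possible 
remedy, not the actual fix.

```python
class Session:
    """Stand-in for a SparkSession; clone() returns a new, distinct object."""

    def __init__(self, parent=None):
        self.parent = parent

    def clone(self):
        return Session(parent=self)


class EnvRegistry:
    """Stand-in for per-session state that is initialised lazily."""

    def __init__(self, inherit_from_parent):
        self.inherit_from_parent = inherit_from_parent
        self._params = {}  # session object -> dict of session parameters

    def params_for(self, session):
        if session not in self._params:
            inherited = {}
            # Assumption: copying the parent's parameters is one way to keep
            # properties set on the original session visible to its clone.
            if self.inherit_from_parent and session.parent in self._params:
                inherited = dict(self._params[session.parent])
            self._params[session] = inherited
        return self._params[session]


KEY = "carbon.options.bad.records.action"
original = Session()

without_inherit = EnvRegistry(inherit_from_parent=False)
without_inherit.params_for(original)[KEY] = "force"
lost = KEY not in without_inherit.params_for(original.clone())

with_inherit = EnvRegistry(inherit_from_parent=True)
with_inherit.params_for(original)[KEY] = "force"
kept = with_inherit.params_for(original.clone()).get(KEY)
```

In the first registry the clone starts with no parameters, which mirrors the 
lost bad records setting; in the second the clone still sees the value force.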

 





[jira] [Commented] (CARBONDATA-4205) MINOR compaction getting triggered by it self while inserting data to a table

2021-06-18 Thread SHREELEKHYA GAMPA (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-4205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17365535#comment-17365535
 ] 

SHREELEKHYA GAMPA commented on CARBONDATA-4205:
---

Hi, can you share the carbon configuration that is set?

Please check the carbon.enable.auto.load.merge and 
carbon.compaction.level.threshold properties. When 
carbon.enable.auto.load.merge is set to true, compaction is automatically 
triggered once the data load completes.
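
As a simplified sketch of how the two properties interact, the check below 
treats the first number of carbon.compaction.level.threshold as the count of 
uncompacted segments at which a minor compaction becomes due after a load. The 
function names are hypothetical, the value "4,3" is assumed here as the 
threshold, and the real trigger logic in CarbonData has more conditions than 
this.

```python
def parse_level_threshold(value):
    """Parse a 'level1,level2' threshold string into two integers."""
    level1, level2 = (int(part) for part in value.split(","))
    return level1, level2


def minor_compaction_due(uncompacted_segments, auto_load_merge, level_threshold="4,3"):
    """Return True if a load would be followed by an automatic minor compaction.

    Simplification: only the first-level threshold is considered.
    """
    if not auto_load_merge:
        return False
    level1, _ = parse_level_threshold(level_threshold)
    return uncompacted_segments >= level1
```

Under these assumptions, the fourth load into a table with auto load merge 
enabled is the one followed by a compaction, which would look like compaction 
starting by itself on some inserts but not others.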

 

> MINOR compaction getting triggered by it self while inserting data to a table
> -
>
> Key: CARBONDATA-4205
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4205
> Project: CarbonData
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 2.0.1
> Environment: apache carbondata 2.0.1, hadoop 2.7.2, spark 2.4.5
>Reporter: suyash yadav
>Priority: Major
>
> Hi Team, we have created a table and also created a timeseries MV on it. Later 
> we tried to insert some data from another table into this newly created table, 
> but we observed that while inserting, MINOR compaction on the MV is 
> triggered by itself. It doesn't happen for every insert, but 
> whenever we insert 6th to 7th hour data and then 14th to 15th hour data, MINOR 
> compaction gets triggered. Could you tell us why MINOR compaction is 
> getting triggered by itself?





[jira] [Created] (CARBONDATA-4211) from xx Insert into select fails if an SQL statement contains multiple inserts

2021-06-15 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-4211:
-

 Summary: from xx Insert into select fails if an SQL statement 
contains multiple inserts
 Key: CARBONDATA-4211
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4211
 Project: CarbonData
  Issue Type: Bug
Reporter: SHREELEKHYA GAMPA


When a single query contains multiple inserts, it fails in SparkPlan with: 
{{java.lang.ClassCastException: GenericInternalRow cannot be cast to 
UnsafeRow}}.

[Steps] :-

From Spark SQL, execute the following queries:

1. Create tables:

create table catalog_returns_5(cr_returned_date_sk int,cr_returned_time_sk 
int,cr_item_sk int)ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES 
TERMINATED BY '\n' ;
create table catalog_returns_6(cr_returned_time_sk int,cr_item_sk int) 
partitioned by (cr_returned_date_sk int) STORED BY 
'org.apache.carbondata.format' TBLPROPERTIES ( 'table_blocksize'='64');
2. Insert into the table:
from catalog_returns_5 insert overwrite table catalog_returns_6 partition 
(cr_returned_date_sk) select cr_returned_time_sk, cr_item_sk, 
cr_returned_date_sk where cr_returned_date_sk is not null distribute by 
cr_returned_date_sk insert overwrite table catalog_returns_6 partition 
(cr_returned_date_sk) select cr_returned_time_sk, cr_item_sk, 
cr_returned_date_sk where cr_returned_date_sk is null distribute by 
cr_returned_date_sk;

 





[jira] [Created] (CARBONDATA-4202) Fix issue when refresh main table with MV

2021-06-08 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-4202:
-

 Summary: Fix issue when refresh main table with MV
 Key: CARBONDATA-4202
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4202
 Project: CarbonData
  Issue Type: Bug
Reporter: SHREELEKHYA GAMPA


[Problem phenomenon] :- Refreshing a main table that has an MV fails in 2.1.1 
when the store for the main table and MV was created in 2.1.0.

[Steps] :-

CREATE TABLE originTable_mv (empno int, empname String, designation String, doj 
Timestamp,workgroupcategory int, workgroupcategoryname String, deptno int, 
deptname String,projectcode int, projectjoindate Timestamp, projectenddate 
Timestamp,attendance int,utilization int,salary int)STORED AS carbondata;

LOAD DATA local inpath 'hdfs://hacluster/BabuStore/Data/data.csv' INTO TABLE 
originTable_mv OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= 
'"','timestampformat'='dd-MM-');

create MATERIALIZED VIEW datamap_comp_mv as select empno,sum(attendance) 
,min(projectjoindate) ,max(projectenddate) ,avg(attendance) 
,count(empno),count(distinct workgroupcategoryname) from originTable_mv group 
by empno;

LOAD DATA local inpath 'hdfs://hacluster/BabuStore/Data/data.csv' INTO TABLE 
originTable_mv OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= 
'"','timestampformat'='dd-MM-');

Back up the store and copy the store.

Execute the refresh table command on the main table that has the MV.

[Expected Result] :- Refreshing a main table that has an MV in 2.1.1, with the 
store created in 2.1.0, should succeed.

[Actual Issue] :- Refreshing a main table that has an MV in 2.1.1, with the 
store created in 2.1.0, fails with the error below.

0: jdbc:hive2://linux-221:22550/> refresh table originTable_mv;

Error: org.apache.hive.service.cli.HiveSQLException: Error running query: 
org.apache.spark.sql.AnalysisException: == Spark Parser: 
org.apache.spark.sql.hive.FISqlParser ==

extraneous input '2_1' expecting \{')', ','}(line 8, pos 25)

== SQL ==

CREATE TABLE 2_1.origintable_mv

({{empno}} int,{{empname}} string,{{designation}} string,{{doj}} 
timestamp,{{workgroupcategory}} int,{{workgroupcategoryname}} string,{{deptno}} 
int,{{deptname}} string,{{projectcode}} int,{{projectjoindate}} 
timestamp,{{projectenddate}} timestamp,{{attendance}} int,{{utilization}} 
int,{{salary}} int)

USING carbondata

OPTIONS (

indexexists "false",

sort_columns "",

comment "",

relatedmvtablesmap "\{"2_1":["datamap_comp_mv"]}",

-^^^

bad_record_path "",

local_dictionary_enable "true",

indextableexists "false",

tableName "origintable_mv",

dbName "2_1",

tablePath 
"hdfs://hacluster/user/hive/warehouse/carbon.store/2_1/origintable_mv",

path "hdfs://hacluster/user/hive/warehouse/carbon.store/2_1/origintable_mv",

isExternal "false",

isTransactional "true",

isVisible "true"

,carbonSchemaPartsNo '2',carbonSchema0 

[jira] [Updated] (CARBONDATA-4143) UT with index server

2021-06-03 Thread SHREELEKHYA GAMPA (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SHREELEKHYA GAMPA updated CARBONDATA-4143:
--
Description: 
Enable running UTs with the index server using the flag {{useIndexServer}}.

Excluded some of the test cases so that they do not run with the index server.

Fixes the issues below:
 1. With the index server enabled, a select query gives an incorrect result 
with SI when parent and child table segments are not in sync.

queries to execute:

0: jdbc:hive2://dggphisprb50622:22550/> create table test (c1 string,c2 int,c3 
string,c5 string) STORED AS carbondata;
+---------+
| Result  |
+---------+
+---------+
No rows selected (0.564 seconds)
0: jdbc:hive2://dggphisprb50622:22550/> load data inpath 
'hdfs://hacluster/chetan/dest.csv' into table test;
+-------------+
| Segment ID  |
+-------------+
| 0           |
+-------------+
1 row selected (1.764 seconds)
0: jdbc:hive2://dggphisprb50622:22550/> create index index_test on table test 
(c3) AS 'carbondata';
+---------+
| Result  |
+---------+
+---------+
No rows selected (2.412 seconds)
0: jdbc:hive2://dggphisprb50622:22550/> load data inpath 
'hdfs://hacluster/chetan/dest.csv' into table test;
+-------------+
| Segment ID  |
+-------------+
| 1           |
+-------------+
1 row selected (2.839 seconds)
0: jdbc:hive2://dggphisprb50622:22550/> select * from test where c3='dd';
+-----+-----+-----+------+
| c1  | c2  | c3  | c5   |
+-----+-----+-----+------+
| d   | 4   | dd  | ddd  |
| d   | 4   | dd  | ddd  |
+-----+-----+-----+------+
2 rows selected (3.452 seconds)
0: jdbc:hive2://dggphisprb50622:22550/> delete from table index_test where 
segment.ID in(1);
+---------+
| Result  |
+---------+
+---------+
No rows selected (0.413 seconds)
0: jdbc:hive2://dggphisprb50622:22550/> select * from test where c3='dd';
+-----+-----+-----+------+
| c1  | c2  | c3  | c5   |
+-----+-----+-----+------+
| d   | 4   | dd  | ddd  |
+-----+-----+-----+------+
1 row selected (3.262 seconds)
0: jdbc:hive2://dggphisprb50622:22550/>

Expected: to return 2 rows.

2. When reindex is triggered and stale files are present in the segment 
directory, the segment file is written with incorrect file names (both valid 
index and stale mergeindex file names). As a result, duplicate data is present 
in the SI table, but there is no error and no incorrect query result.

  was:
To enable to run UT with index server using flag {{useIndexServer.}}

excluded some of the test cases to not run with index server.

added test case with prepriming.

To Fix below issues:
1. With index server enabled, select query gives incorrect result with SI when 
parent and child table segments are not in sync.

queries to execute:


0: jdbc:hive2://dggphisprb50622:22550/> create table test (c1 string,c2 int,c3 
string,c5 string) STORED AS carbondata;
+---------+
| Result  |
+---------+
+---------+
No rows selected (0.564 seconds)
0: jdbc:hive2://dggphisprb50622:22550/> load data inpath 
'hdfs://hacluster/chetan/dest.csv' into table test;
+-------------+
| Segment ID  |
+-------------+
| 0           |
+-------------+
1 row selected (1.764 seconds)
0: jdbc:hive2://dggphisprb50622:22550/> create index index_test on table test 
(c3) AS 'carbondata';
+---------+
| Result  |
+---------+
+---------+
No rows selected (2.412 seconds)
0: jdbc:hive2://dggphisprb50622:22550/> load data inpath 
'hdfs://hacluster/chetan/dest.csv' into table test;
+-------------+
| Segment ID  |
+-------------+
| 1           |
+-------------+
1 row selected (2.839 seconds)
0: jdbc:hive2://dggphisprb50622:22550/> select * from test where c3='dd';
+-----+-----+-----+------+
| c1  | c2  | c3  | c5   |
+-----+-----+-----+------+
| d   | 4   | dd  | ddd  |
| d   | 4   | dd  | ddd  |
+-----+-----+-----+------+
2 rows selected (3.452 seconds)
0: jdbc:hive2://dggphisprb50622:22550/> delete from table index_test where 
segment.ID in(1);
+---------+
| Result  |
+---------+
+---------+
No rows selected (0.413 seconds)
0: jdbc:hive2://dggphisprb50622:22550/> select * from test where c3='dd';
+-----+-----+-----+------+
| c1  | c2  | c3  | c5   |
+-----+-----+-----+------+
| d   | 4   | dd  | ddd  |
+-----+-----+-----+------+
1 row selected (3.262 seconds)
0: jdbc:hive2://dggphisprb50622:22550/>

Expected: to return 2 rows.


2. When reindex is triggered, if stale files are present in the segment 
directory the segment file is being written with incorrect file names. (both 
valid index and stale mergeindex file names). As a result, duplicate data is 
present in SI table but there is no error/incorrect query results.


> UT with index server
> 
>
> Key: CARBONDATA-4143
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4143
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: SHREELEKHYA GAMPA
>Priority: Major
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> To enable to run UT with index server using flag {{useIndexServer.}}
> excluded some of the test cases to not run 

[jira] [Created] (CARBONDATA-4193) Fix compaction failure after alter add complex column.

2021-05-28 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-4193:
-

 Summary:  Fix compaction failure after alter add complex column.
 Key: CARBONDATA-4193
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4193
 Project: CarbonData
  Issue Type: Bug
Reporter: SHREELEKHYA GAMPA


[Steps] :-

From Spark beeline/SQL/Shell/Submit, execute the following queries:

drop table if exists alter_complex;

create table alter_complex (a int, b string) stored as carbondata;

insert into alter_complex select 1,'a';

insert into alter_complex select 1,'a';

insert into alter_complex select 1,'a';

insert into alter_complex select 1,'a';

insert into alter_complex select 1,'a';

select * from alter_complex;

ALTER TABLE alter_complex ADD COLUMNS(struct1 STRUCT);

insert into alter_complex select 3,'c',named_struct('s1',4,'s2','d');

insert into alter_complex select 3,'c',named_struct('s1',4,'s2','d');

insert into alter_complex select 3,'c',named_struct('s1',4,'s2','d');

insert into alter_complex select 3,'c',named_struct('s1',4,'s2','d');

insert into alter_complex select 3,'c',named_struct('s1',4,'s2','d');

select * from alter_complex;

alter table alter_complex compact 'minor'; OR alter table alter_complex compact 
'major'; OR alter table alter_complex compact 'custom' where segment.id In 
(3,4,5,6);

[Expected Result] :- Compaction should succeed after alter add complex column.

[Actual Issue] :- Compaction fails after alter add complex column.

 

!https://clouddevops.huawei.com/vision-file-storage/api/file/download/upload-v2/2021/3/26/c71035/ec9486ee659c4374a211db588b2f6b2a/image.png!





[jira] [Updated] (CARBONDATA-4143) UT with index server

2021-05-20 Thread SHREELEKHYA GAMPA (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SHREELEKHYA GAMPA updated CARBONDATA-4143:
--
Description: 
Enable running UTs with the index server using the flag {{useIndexServer}}.

Excluded some of the test cases so that they do not run with the index server.

Added a test case with prepriming.

Fixes the issues below:
1. With the index server enabled, a select query gives an incorrect result with 
SI when parent and child table segments are not in sync.

queries to execute:


0: jdbc:hive2://dggphisprb50622:22550/> create table test (c1 string,c2 int,c3 
string,c5 string) STORED AS carbondata;
+---------+
| Result  |
+---------+
+---------+
No rows selected (0.564 seconds)
0: jdbc:hive2://dggphisprb50622:22550/> load data inpath 
'hdfs://hacluster/chetan/dest.csv' into table test;
+-------------+
| Segment ID  |
+-------------+
| 0           |
+-------------+
1 row selected (1.764 seconds)
0: jdbc:hive2://dggphisprb50622:22550/> create index index_test on table test 
(c3) AS 'carbondata';
+---------+
| Result  |
+---------+
+---------+
No rows selected (2.412 seconds)
0: jdbc:hive2://dggphisprb50622:22550/> load data inpath 
'hdfs://hacluster/chetan/dest.csv' into table test;
+-------------+
| Segment ID  |
+-------------+
| 1           |
+-------------+
1 row selected (2.839 seconds)
0: jdbc:hive2://dggphisprb50622:22550/> select * from test where c3='dd';
+-----+-----+-----+------+
| c1  | c2  | c3  | c5   |
+-----+-----+-----+------+
| d   | 4   | dd  | ddd  |
| d   | 4   | dd  | ddd  |
+-----+-----+-----+------+
2 rows selected (3.452 seconds)
0: jdbc:hive2://dggphisprb50622:22550/> delete from table index_test where 
segment.ID in(1);
+---------+
| Result  |
+---------+
+---------+
No rows selected (0.413 seconds)
0: jdbc:hive2://dggphisprb50622:22550/> select * from test where c3='dd';
+-----+-----+-----+------+
| c1  | c2  | c3  | c5   |
+-----+-----+-----+------+
| d   | 4   | dd  | ddd  |
+-----+-----+-----+------+
1 row selected (3.262 seconds)
0: jdbc:hive2://dggphisprb50622:22550/>

Expected: to return 2 rows.


2. When reindex is triggered and stale files are present in the segment 
directory, the segment file is written with incorrect file names (both valid 
index and stale mergeindex file names). As a result, duplicate data is present 
in the SI table, but there is no error and no incorrect query result.

  was:
To enable to run UT with index server using flag {{useIndexServer.}}

excluded some of the test cases to not run with index server.

added test case with prepriming.


> UT with index server
> 
>
> Key: CARBONDATA-4143
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4143
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: SHREELEKHYA GAMPA
>Priority: Major
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>

[jira] [Created] (CARBONDATA-4174) Handle exception for desc column

2021-04-22 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-4174:
-

 Summary: Handle exception for desc column
 Key: CARBONDATA-4174
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4174
 Project: CarbonData
  Issue Type: Bug
Reporter: SHREELEKHYA GAMPA


DESCRIBE COLUMN does not validate a child column specified on a primitive 
datatype, or a non-existent higher-level child column specified on a complex 
datatype.



drop table if exists complexcarbontable;

create table complexcarbontable (deviceInformationId int,channelsId 
string,ROMSize string,purchasedate string,mobile struct,MAC array,gamePointId 
map,contractNumber double) STORED AS carbondata;

describe column deviceInformationId.x on complexcarbontable;

describe column channelsId.x on complexcarbontable;

describe column mobile.imei.x on complexcarbontable;

describe column MAC.item.x on complexcarbontable;

describe column gamePointId.key.x on complexcarbontable;

[Expected Result] :- DESCRIBE COLUMN should validate a child column specified 
on a primitive datatype and a non-existent higher-level child column specified 
on a complex datatype. Command execution should fail.

[Actual Issue] :- DESCRIBE COLUMN performs no such validation for a child 
column on a primitive datatype or a non-existent higher-level child column on a 
complex datatype. As a result, the command execution is successful.
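
The expected validation can be sketched as a walk over a dotted column path 
that fails as soon as a primitive type is asked for a child, or a child name 
does not exist. The nested-dict schema, the function name resolve_column and 
the child types shown are assumptions for illustration (the element types are 
not given in the report); the child names item and key follow the paths used in 
the queries above.

```python
def resolve_column(schema, path):
    """Return the type at a dotted column path, or raise ValueError."""
    node = schema
    walked = []
    for part in path.split("."):
        parent = ".".join(walked) or "table"
        if not isinstance(node, dict):
            raise ValueError(f"'{parent}' is a primitive type and has no child '{part}'")
        if part not in node:
            raise ValueError(f"column '{part}' does not exist under '{parent}'")
        node = node[part]
        walked.append(part)
    return node


# Hypothetical schema: primitives are strings, complex types are dicts of children.
schema = {
    "deviceInformationId": "int",
    "channelsId": "string",
    "mobile": {"imei": "string"},
    "MAC": {"item": "int"},
    "gamePointId": {"key": "string", "value": "double"},
}
```

With this check, a valid path such as mobile.imei resolves to its type, while 
each of the five paths from the reproduction steps raises an error instead of 
succeeding.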

[!https://clouddevops.huawei.com/vision-file-storage/api/file/download/upload-v2/2021/3/21/c71035/7a3b04d78ceb4a489e6c038f4bb257db/image.png!|https://clouddevops.huawei.com/vision-file-storage/api/file/download/upload-v2/2021/3/21/c71035/7a3b04d78ceb4a489e6c038f4bb257db/image.png]





[jira] [Created] (CARBONDATA-4173) Fix inverted index query issue

2021-04-22 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-4173:
-

 Summary: Fix inverted index query issue
 Key: CARBONDATA-4173
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4173
 Project: CarbonData
  Issue Type: Bug
Reporter: SHREELEKHYA GAMPA


A select query that filters on a column listed in inverted_index does not 
return any value.

From Spark beeline/SQL/Shell, execute the following queries:

drop table if exists uniqdata6;

CREATE TABLE uniqdata6(cust_id int,cust_name string,ACTIVE_EMUI_VERSION string, 
DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double, INTEGER_COLUMN1 
int)stored as carbondata TBLPROPERTIES ('sort_columns'='CUST_ID,CUST_NAME', 
'inverted_index'='CUST_ID,CUST_NAME','sort_scope'='global_sort');

LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table 
uniqdata6 OPTIONS ('FILEHEADER'='CUST_ID,CUST_NAME 
,ACTIVE_EMUI_VERSION,DOB,DOJ, 
BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1, 
Double_COLUMN2,INTEGER_COLUMN1','BAD_RECORDS_ACTION'='FORCE');

select cust_name from uniqdata6 limit 5;

select * from uniqdata6 where CUST_NAME='CUST_NAME_2';

select * from uniqdata6 where CUST_NAME='CUST_NAME_3';

 

[Expected Result] :- A select query that filters on a column listed in the 
inverted_index property should return the matching rows.

[Actual Issue] :- A select query that filters on a column listed in the 
inverted_index property does not return any rows.

[!https://clouddevops.huawei.com/vision-file-storage/api/file/download/upload-v2/2021/3/15/c71035/05443c9a9c11457e947645f1cf0ad347/image.png!|https://clouddevops.huawei.com/vision-file-storage/api/file/download/upload-v2/2021/3/15/c71035/05443c9a9c11457e947645f1cf0ad347/image.png]





[jira] [Created] (CARBONDATA-4168) UDF validation Issues related to Geospatial support

2021-04-19 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-4168:
-

 Summary: UDF  validation Issues related to Geospatial support
 Key: CARBONDATA-4168
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4168
 Project: CarbonData
  Issue Type: Bug
Reporter: SHREELEKHYA GAMPA


Scenario 1:

--Gives a wrong error message for buffer size less than or equal to 0.

drop table if exists source_index;

create table source_index(TIMEVALUE BIGINT,LONGITUDE long,LATITUDE long) STORED 
AS carbondata TBLPROPERTIES ('SPATIAL_INDEX'='mygeohash',
'SPATIAL_INDEX.mygeohash.type'='geohash',
'SPATIAL_INDEX.mygeohash.sourcecolumns'='longitude, latitude',
'SPATIAL_INDEX.mygeohash.originLatitude'='39.832277',
'SPATIAL_INDEX.mygeohash.gridSize'='50',
'SPATIAL_INDEX.mygeohash.conversionRatio'='100');

LOAD DATA inpath '/geodata/geodata2.csv' INTO TABLE source_index OPTIONS 
('DELIMITER'= ',');

select longitude, latitude from source_index where IN_POLYLINE_LIST('LINESTRING 
(120.184179 30.327465, 120.191603 30.328946, 120.199242 30.324464, 120.190359 
30.315388)', -65);

select longitude, latitude from source_index where IN_POLYLINE_LIST('LINESTRING 
(120.184179 30.327465, 120.191603 30.328946, 120.199242 30.324464, 120.190359 
30.315388)', 0);

Scenario 2:

--Accepts Invalid Buffer Size

select longitude, latitude from source_index where IN_POLYLINE_LIST('LINESTRING 
(120.184179 30.327465, 120.191603 30.328946, 120.199242 30.324464), LINESTRING 
(120.199242 30.324464, 120.190359 30.315388)', 'X');

Scenario 3:

--Accepts negative and 0 gridSize

select LatLngToGeoId(39930753, 116302895, 39.832277, -50) as geoId;

select GeoIdToLatLng(855279270226, 39.832277, -50) as LatitudeAndLongitude;

select ToRangeList('116.321011 40.123503, 116.320311 40.122503,116.32 
40.121503, 116.321011 40.123503', 39.832277, -50) as rangeList;

select LatLngToGeoId(39930753, 116302895, 39.832277, 0) as geoId;

select GeoIdToLatLng(855279270226, 39.832277, 0) as LatitudeAndLongitude;

--Gives wrong error message for gridSize 0

select ToRangeList('116.321011 40.123503, 116.320311 40.122503,116.32 
40.121503, 116.321011 40.123503', 39.832277, 0) as rangeList;

Scenario 4:

--Accepting Double values for GeoId

select GeoIdToLatLng(8.55279270226, 39.832277, -50) as LatitudeAndLongitude;

select ToUpperLayerGeoId(8.55279270226) as upperLayerGeoId;

select GeoIdToGridXy(8.55279270226) as GridXY;

Scenario 5:

--Accepting Invalid Values in All UDFs

select GeoIdToGridXy('X') as GridXY;

select LatLngToGeoId('X', 'X', 'X', 'X') as geoId;

select GeoIdToLatLng('X', 'X', 'X') as LatitudeAndLongitude;

select ToUpperLayerGeoId('X') as upperLayerGeoId;

select ToRangeList('116.321011 40.123503, 116.320311 40.122503,116.32 
40.121503, 116.321011 40.123503', 39.832277, 'X') as rangeList;

select ToRangeList('116.321011 40.123503, 116.320311 40.122503,116.32 
40.121503, 116.321011 40.123503', 'X', 50) as rangeList;
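
The scenarios above all come down to missing argument checks. A minimal sketch 
of the kind of validation they call for, written in Python for brevity. The 
function names, accepted types, and error messages are assumptions made for 
illustration; they are not CarbonData's actual implementation.

```python
from numbers import Integral, Real


def validate_grid_size(grid_size):
    """Reject non-integer, zero, or negative gridSize values (Scenarios 3 and 5)."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(grid_size, bool) or not isinstance(grid_size, Integral):
        raise ValueError(f"gridSize must be an integer, got {grid_size!r}")
    if grid_size <= 0:
        raise ValueError(f"gridSize must be greater than 0, got {grid_size}")
    return int(grid_size)


def validate_buffer_size(buffer_size):
    """Reject non-numeric, zero, or negative buffer sizes (Scenarios 1 and 2)."""
    if isinstance(buffer_size, bool) or not isinstance(buffer_size, Real):
        raise ValueError(f"buffer size must be numeric, got {buffer_size!r}")
    if buffer_size <= 0:
        raise ValueError(f"buffer size must be greater than 0, got {buffer_size}")
    return float(buffer_size)


def validate_geo_id(geo_id):
    """Reject anything that is not a whole number, such as 8.55279270226 or 'X'."""
    if isinstance(geo_id, bool) or not isinstance(geo_id, Integral):
        raise ValueError(f"geoId must be an integer, got {geo_id!r}")
    return int(geo_id)
```

With checks like these, each query in Scenarios 1 to 5 would fail early with a 
message naming the offending argument instead of being accepted.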





[jira] [Created] (CARBONDATA-4167) Case sensitive issues in Geospatial index support

2021-04-19 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-4167:
-

 Summary: Case sensitive issues in Geospatial index support
 Key: CARBONDATA-4167
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4167
 Project: CarbonData
  Issue Type: Bug
Reporter: SHREELEKHYA GAMPA


1) create table source_index(TIMEVALUE BIGINT,LONGITUDE long,LATITUDE long) 
STORED AS carbondata TBLPROPERTIES ('SPATIAL_INDEX.MYGEOHASH.type'='geohash',
'SPATIAL_INDEX.MYGEOHASH.sourcecolumns'='longitude, latitude',
'SPATIAL_INDEX.MYGEOHASH.originLatitude'='39.930753',
'SPATIAL_INDEX.MYGEOHASH.gridSize'='50','SPATIAL_INDEX'='MYGEOHASH',
'SPATIAL_INDEX.MYGEOHASH.conversionRatio'='100');

The spatial index properties are treated as case sensitive:

[!https://clouddevops.huawei.com/vision-file-storage/api/file/download/upload-v2/2021/3/5/c71035/9d1b20f836a048909679864f0c0fb4d8/image.png!|https://clouddevops.huawei.com/vision-file-storage/api/file/download/upload-v2/2021/3/5/c71035/9d1b20f836a048909679864f0c0fb4d8/image.png]

 

2) A select query that uses lower case in the query UDFs fails:

[!https://clouddevops.huawei.com/vision-file-storage/api/file/download/upload-v2/2021/3/9/c71035/d28803c9cbc54db997b9d28015d59bee/image.png!|https://clouddevops.huawei.com/vision-file-storage/api/file/download/upload-v2/2021/3/9/c71035/d28803c9cbc54db997b9d28015d59bee/image.png]

 

A select query with a lower-case linestring in the polyline UDF does not return 
any value:

[!https://clouddevops.huawei.com/vision-file-storage/api/file/download/upload-v2/2021/3/9/c71035/0b1fe6bdd8454079bd52047f2c72bda4/image.png!|https://clouddevops.huawei.com/vision-file-storage/api/file/download/upload-v2/2021/3/9/c71035/0b1fe6bdd8454079bd52047f2c72bda4/image.png]

 

A select query with a lower-case rangelist in the IN_POLYGON_RANGE_LIST UDF 
returns no value:

[!https://clouddevops.huawei.com/vision-file-storage/api/file/download/upload-v2/2021/3/9/c71035/d4d19f80d4d740b99bc5a0b6859f/image.png!|https://clouddevops.huawei.com/vision-file-storage/api/file/download/upload-v2/2021/3/9/c71035/d4d19f80d4d740b99bc5a0b6859f/image.png]
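
One common way to make such lookups case insensitive is to normalize keys and 
keywords before comparing them. The sketch below illustrates that idea in 
Python. The helper names are hypothetical, and whether CarbonData should 
normalize to lower or upper case is an assumption here, not something the 
issue states.

```python
import re


def normalize_properties(props):
    """Return a copy of props with lower-cased keys; values are kept as written."""
    normalized = {}
    for key, value in props.items():
        lowered = key.lower()
        if lowered in normalized:
            raise ValueError(f"duplicate property (ignoring case): {key}")
        normalized[lowered] = value
    return normalized


def spatial_index_property(props, index_name, suffix):
    """Look up SPATIAL_INDEX.<index_name>.<suffix> without regard to case."""
    key = f"spatial_index.{index_name}.{suffix}".lower()
    return normalize_properties(props)[key]


def normalize_wkt_keywords(text):
    """Upper-case keywords that precede '(' (e.g. linestring), leaving coordinates alone."""
    return re.sub(r"[A-Za-z]+(?=\s*\()", lambda m: m.group(0).upper(), text)
```

For example, spatial_index_property(props, 'mygeohash', 'GRIDSIZE') finds the 
value stored under 'SPATIAL_INDEX.MYGEOHASH.gridSize', and 
normalize_wkt_keywords turns 'linestring (...)' into 'LINESTRING (...)' before 
it is parsed.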

 

 





[jira] [Created] (CARBONDATA-4161) Describe columns

2021-04-02 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-4161:
-

 Summary: Describe columns
 Key: CARBONDATA-4161
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4161
 Project: CarbonData
  Issue Type: New Feature
Reporter: SHREELEKHYA GAMPA


 

The DESCRIBE output can be formatted to avoid long lines for multiple fields. 
We can pass the column name to the command and visualize its structure with 
child fields.

DESCRIBE COLUMN fieldname ON [db_name.]table_name;
DESCRIBE short [db_name.]table_name;
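
To illustrate what "visualize its structure with child fields" could look like, 
here is a small Python sketch that prints a nested column one field per line. 
The (name, type, children) tuple layout and the indented output format are 
assumptions made for this example; the actual DESCRIBE COLUMN output may 
differ.

```python
def describe_column(name, dtype, children=(), indent=0):
    """Return one line per field, indenting child fields under their parent."""
    lines = [f"{'  ' * indent}{name}: {dtype}"]
    for child_name, child_type, grandchildren in children:
        lines.extend(describe_column(child_name, child_type, grandchildren, indent + 1))
    return lines


# Hypothetical complex column used only for illustration.
person = ("person", "struct", [
    ("name", "string", []),
    ("address", "struct", [
        ("city", "string", []),
        ("zip", "int", []),
    ]),
])
```

Calling "\n".join(describe_column(*person)) yields five short lines, with the 
children of address indented one level deeper than address itself, instead of 
one long struct<...> type string.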





[jira] [Created] (CARBONDATA-4149) Query with SI after add partition based on location on partition table gives incorrect results

2021-03-15 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-4149:
-

 Summary: Query with SI after add partition based on location on 
partition table gives incorrect results
 Key: CARBONDATA-4149
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4149
 Project: CarbonData
  Issue Type: Bug
Reporter: SHREELEKHYA GAMPA


Queries to execute:
 * drop table if exists partitionTable;
 * create table partitionTable (id int,name String) partitioned by(email 
string) stored as carbondata;
 * insert into partitionTable select 1,'blue','abc';
 * CREATE INDEX maintable_si112 on table partitionTable (name) as 'carbondata';
 * alter table partitionTable add partition (email='def') location 
'$sdkWritePath';
 * select *from partitionTable where name = 'red';   ---> returns empty result
 * select *from partitionTable where ni(name = 'red');
 * alter table partitionTable compact 'major';
 * select *from partitionTable where name = 'red';

spark-sql> create table partitionTable (id int,name String) partitioned 
by(email string) STORED AS carbondata;
Time taken: 1.962 seconds
spark-sql> CREATE INDEX maintable_si112 on table partitionTable (name) as 
'carbondata';
Time taken: 2.759 seconds
spark-sql> insert into partitionTable select 1,'huawei','abc';
0
Time taken: 5.808 seconds, Fetched 1 row(s)
spark-sql> alter table partitionTable add partition (email='def') location 
'hdfs://hacluster/datastore';
Time taken: 1.108 seconds
spark-sql> insert into partitionTable select 1,'huawei','def';
1
Time taken: 2.707 seconds, Fetched 1 row(s)
spark-sql> select *from partitionTable where name='huawei';
1 huawei abc
Time taken: 0.75 seconds, Fetched 1 row(s)
spark-sql> select *from partitionTable where ni(name='huawei');
1 huawei def
1 huawei abc
Time taken: 0.507 seconds, Fetched 2 row(s)
spark-sql>

 





[jira] [Created] (CARBONDATA-4143) UT with index server

2021-03-05 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-4143:
-

 Summary: UT with index server
 Key: CARBONDATA-4143
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4143
 Project: CarbonData
  Issue Type: Improvement
Reporter: SHREELEKHYA GAMPA


Enable running unit tests (UT) with the index server using the flag 
useIndexServer.

Excluded some of the test cases so that they do not run with the index server.

Added a test case with prepriming.





[jira] [Commented] (CARBONDATA-4074) Should clean stale data in success segments

2021-03-05 Thread SHREELEKHYA GAMPA (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17295933#comment-17295933
 ] 

SHREELEKHYA GAMPA commented on CARBONDATA-4074:
---

This can also include:

4. clean stale index files

5. clean stale segment files

both based on a retention time.
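
A minimal sketch of retention-based selection, assuming a file counts as stale 
when no table metadata references it and it is older than the retention time. 
The function name, the inputs, and that definition of "stale" are assumptions 
for illustration, not CarbonData's clean files logic.

```python
import time


def select_stale_files(files, referenced, retention_seconds, now=None):
    """Return paths that are unreferenced and older than the retention time.

    files: dict mapping path -> last-modified time (seconds since the epoch)
    referenced: set of paths still referenced by table metadata
    """
    now = time.time() if now is None else now
    return sorted(
        path
        for path, mtime in files.items()
        if path not in referenced and now - mtime > retention_seconds
    )
```

Keeping recently written files, even if unreferenced, leaves room for 
in-progress loads and for queries that started before the cleanup.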

> Should clean stale data in success segments
> ---
>
> Key: CARBONDATA-4074
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4074
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: David Cai
>Priority: Major
>
> Cleaning stale data in success segments includes the following parts:
> 1. clean stale delete delta (when force is true)
> 2. clean stale small files for index table
> 3. clean stale data files for loading/compaction





[jira] [Updated] (CARBONDATA-4037) Improve the table status and segment file writing

2021-03-03 Thread SHREELEKHYA GAMPA (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SHREELEKHYA GAMPA updated CARBONDATA-4037:
--
Attachment: Improve table status and segment file writing_1.docx

> Improve the table status and segment file writing
> -
>
> Key: CARBONDATA-4037
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4037
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: SHREELEKHYA GAMPA
>Priority: Minor
> Attachments: Improve table status and segment file writing_1.docx
>
>  Time Spent: 15h 40m
>  Remaining Estimate: 0h
>
> Currently, we update the table status and segment files multiple times for a 
> single IUD/merge/compact operation and delete the index files immediately 
> after merge. When concurrent queries run, a user query may try to access 
> segment index files that are no longer present, which is an availability 
> issue.
>  * To solve this, we can make mergeindex file generation mandatory and fail 
> the load/compaction if mergeindex fails. If the merge index succeeds, update 
> the table status file; the index files can then be deleted immediately. 
> However, in legacy stores, when alter segment merge is called, do not delete 
> the index files immediately after the merge index succeeds, as that may 
> cause issues for parallel queries.





[jira] [Updated] (CARBONDATA-4037) Improve the table status and segment file writing

2021-03-03 Thread SHREELEKHYA GAMPA (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SHREELEKHYA GAMPA updated CARBONDATA-4037:
--
Attachment: (was: Improve table status and segment file writing_1.docx)

> Improve the table status and segment file writing
> -
>
> Key: CARBONDATA-4037
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4037
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: SHREELEKHYA GAMPA
>Priority: Minor
>  Time Spent: 15h 40m
>  Remaining Estimate: 0h
>
> Currently, we update the table status and segment files multiple times for a 
> single IUD/merge/compact operation and delete the index files immediately 
> after merge. When concurrent queries run, a user query may try to access 
> segment index files that are no longer present, which is an availability 
> issue.
>  * To solve this, we can make mergeindex file generation mandatory and fail 
> the load/compaction if mergeindex fails. If the merge index succeeds, update 
> the table status file; the index files can then be deleted immediately. 
> However, in legacy stores, when alter segment merge is called, do not delete 
> the index files immediately after the merge index succeeds, as that may 
> cause issues for parallel queries.





[jira] [Updated] (CARBONDATA-4037) Improve the table status and segment file writing

2021-03-03 Thread SHREELEKHYA GAMPA (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SHREELEKHYA GAMPA updated CARBONDATA-4037:
--
Attachment: Improve table status and segment file writing_1.docx

> Improve the table status and segment file writing
> -
>
> Key: CARBONDATA-4037
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4037
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: SHREELEKHYA GAMPA
>Priority: Minor
> Attachments: Improve table status and segment file writing_1.docx
>
>  Time Spent: 15h 40m
>  Remaining Estimate: 0h
>
> Currently, we update the table status and segment files multiple times for a 
> single IUD/merge/compact operation and delete the index files immediately 
> after merge. When concurrent queries run, a user query may try to access 
> segment index files that are no longer present, which is an availability 
> issue.
>  * To solve this, we can make mergeindex file generation mandatory and fail 
> the load/compaction if mergeindex fails. If the merge index succeeds, update 
> the table status file; the index files can then be deleted immediately. 
> However, in legacy stores, when alter segment merge is called, do not delete 
> the index files immediately after the merge index succeeds, as that may 
> cause issues for parallel queries.





[jira] [Updated] (CARBONDATA-4037) Improve the table status and segment file writing

2021-03-03 Thread SHREELEKHYA GAMPA (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SHREELEKHYA GAMPA updated CARBONDATA-4037:
--
Description: 
Currently, we update the table status and segment files multiple times for a 
single IUD/merge/compact operation and delete the index files immediately after 
merge. When concurrent queries run, a user query may try to access segment 
index files that are no longer present, which is an availability issue.
 * To solve this, we can make mergeindex file generation mandatory and fail the 
load/compaction if mergeindex fails. If the merge index succeeds, update the 
table status file; the index files can then be deleted immediately. However, 
in legacy stores, when alter segment merge is called, do not delete the index 
files immediately after the merge index succeeds, as that may cause issues for 
parallel queries.

  was:
Currently, we update the table status and segment files multiple times for a 
single IUD/merge/compact operation and delete the index files immediately after 
merge. When concurrent queries run, a user query may try to access segment 
index files that are no longer present, which is an availability issue.
 * Instead of deleting carbon index files immediately after merge, delete index 
files only when the clean files command is executed, and delete only those that 
have existed for more than 1 hour.
 * Generate the segment file after merge index, and update the table status at 
the beginning and after merge index.
Order:
create table status file => index files => merge index => generate segment file 
=> update table status


> Improve the table status and segment file writing
> -
>
> Key: CARBONDATA-4037
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4037
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: SHREELEKHYA GAMPA
>Priority: Minor
>  Time Spent: 15h 40m
>  Remaining Estimate: 0h
>
> Currently, we update the table status and segment files multiple times for a 
> single IUD/merge/compact operation and delete the index files immediately 
> after merge. When concurrent queries run, a user query may try to access 
> segment index files that are no longer present, which is an availability 
> issue.
>  * To solve this, we can make mergeindex file generation mandatory and fail 
> the load/compaction if mergeindex fails. If the merge index succeeds, update 
> the table status file; the index files can then be deleted immediately. 
> However, in legacy stores, when alter segment merge is called, do not delete 
> the index files immediately after the merge index succeeds, as that may 
> cause issues for parallel queries.
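
The ordering in the revised description can be sketched as follows. This is an 
illustrative Python model only: the three callables and the legacy_alter_merge 
flag are hypothetical stand-ins for the real load/compaction steps, and the 
sketch makes no claim about how CarbonData implements them.

```python
class MergeIndexError(Exception):
    """Raised when merge index generation fails; the load/compaction fails too."""


def finish_segment(segment, write_merge_index, update_table_status,
                   delete_index_files, legacy_alter_merge=False):
    """Merge index first, then table status, then (optionally) index cleanup."""
    try:
        write_merge_index(segment)
    except Exception as exc:
        # Mandatory step: never update table status without a merge index.
        raise MergeIndexError(f"merge index failed for segment {segment}") from exc
    update_table_status(segment)
    if not legacy_alter_merge:
        # Per the description, index files can be deleted immediately here.
        delete_index_files(segment)
    # For legacy stores the index files are kept, because parallel queries
    # may still be reading them.
```

The point of the ordering is that the table status is written once, and only 
after the merge index exists, so a concurrent query does not see a status that 
points at files which have already been removed.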





[jira] [Created] (CARBONDATA-4133) Concurrent Insert Overwrite with static partition on Index server fails

2021-02-18 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-4133:
-

 Summary: Concurrent Insert Overwrite with static partition on 
Index server fails
 Key: CARBONDATA-4133
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4133
 Project: CarbonData
  Issue Type: Bug
Reporter: SHREELEKHYA GAMPA


[Steps] :-

With the Index Server running, execute concurrent insert overwrite statements 
with a static partition.

 

Set 0:
CREATE TABLE if not exists uniqdata_string(CUST_ID int,CUST_NAME String,DOB 
timestamp,DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
bigint,DECIMAL_COLUMN1 decimal(30,10),DECIMAL_COLUMN2 
decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
int) PARTITIONED BY(ACTIVE_EMUI_VERSION string) STORED AS carbondata 
TBLPROPERTIES ('TABLE_BLOCKSIZE'= '256 MB');

Set 1:
LOAD DATA INPATH 'hdfs://hacluster/BabuStore/Data/2000_UniqData.csv' into table 
uniqdata_string partition(active_emui_version='abc') 
OPTIONS('FILEHEADER'='CUST_ID,CUST_NAME ,ACTIVE_EMUI_VERSION,DOB,DOJ, 
BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1, 
Double_COLUMN2,INTEGER_COLUMN1','BAD_RECORDS_ACTION'='FORCE');
LOAD DATA INPATH 'hdfs://hacluster/datasets/2000_UniqData.csv' into table 
uniqdata_string partition(active_emui_version='abc') 
OPTIONS('FILEHEADER'='CUST_ID,CUST_NAME ,ACTIVE_EMUI_VERSION,DOB,DOJ, 
BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1, 
Double_COLUMN2,INTEGER_COLUMN1','BAD_RECORDS_ACTION'='FORCE');

Set 2:
CREATE TABLE if not exists uniqdata_hive (CUST_ID int,CUST_NAME 
String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 
bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double, INTEGER_COLUMN1 
int)ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
load data local inpath "/opt/csv/2000_UniqData.csv" into table uniqdata_hive;

Set 3: (concurrent)
insert overwrite table uniqdata_string partition(active_emui_version='abc') 
select CUST_ID, CUST_NAME,DOB,doj, bigint_column1, bigint_column2, 
decimal_column1, decimal_column2,double_column1, double_column2,integer_column1 
from uniqdata_hive limit 10;
insert overwrite table uniqdata_string partition(active_emui_version='abc') 
select CUST_ID, CUST_NAME,DOB,doj, bigint_column1, bigint_column2, 
decimal_column1, decimal_column2,double_column1, double_column2,integer_column1 
from uniqdata_hive limit 10;

[Expected Result] :- Insert should succeed for timestamp data in the Hive Carbon 
partition table

 

[Actual Issue] : - Concurrent Insert Overwrite with static partition on Index 
server fails

[!https://clouddevops.huawei.com/vision-file-storage/api/file/download/upload-v2/2021/1/17/c71035/a40a6d6be1434b1db8e8c1c6f5a2e97b/image.png!|https://clouddevops.huawei.com/vision-file-storage/api/file/download/upload-v2/2021/1/17/c71035/a40a6d6be1434b1db8e8c1c6f5a2e97b/image.png]
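
One way to drive the two Set 3 statements at the same time is a small thread 
pool. The Python sketch below assumes a caller-supplied run_sql callable that 
executes one SQL string (for example a wrapper around a beeline or JDBC 
session); that callable and the helper name are hypothetical and not part of 
CarbonData.

```python
from concurrent.futures import ThreadPoolExecutor


def run_concurrently(run_sql, statements):
    """Run each statement on its own thread; return (statement, error) pairs.

    error is None when the statement succeeded. Results keep the input order.
    """
    def attempt(statement):
        try:
            run_sql(statement)
            return statement, None
        except Exception as exc:  # keep going so every failure is reported
            return statement, exc

    with ThreadPoolExecutor(max_workers=max(1, len(statements))) as pool:
        return list(pool.map(attempt, statements))
```

Passing the two insert overwrite statements from Set 3 and checking that every 
returned error is None gives a repeatable check for this issue.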





[jira] [Created] (CARBONDATA-4123) Bloom index query with Index server giving incorrect results

2021-02-08 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-4123:
-

 Summary: Bloom index query with Index server giving incorrect 
results
 Key: CARBONDATA-4123
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4123
 Project: CarbonData
  Issue Type: Bug
Reporter: SHREELEKHYA GAMPA


Queries: create a table and load data such that more than one blocklet is 
created.

 

spark-sql> select count(*) from test_rcd where city = 'city40';
2021-02-04 22:13:29,759 | WARN | pool-24-thread-1 | It is not recommended to 
set off-heap working memory size less than 512MB, so setting default value to 
512 | 
org.apache.carbondata.core.memory.UnsafeMemoryManager.(UnsafeMemoryManager.java:83)
10
Time taken: 2.417 seconds, Fetched 1 row(s)
spark-sql> CREATE INDEX dm_rcd ON TABLE test_rcd (city) AS 'bloomfilter' 
properties ('BLOOM_SIZE'='64', 'BLOOM_FPP'='0.1');
2021-02-04 22:13:58,683 | AUDIT | main | {"time":"February 4, 2021 10:13:58 PM 
CST","username":"carbon","opName":"CREATE 
INDEX","opId":"15148202700230273","opStatus":"START"} | 
carbon.audit.logOperationStart(Auditor.java:74)
2021-02-04 22:13:58,759 | WARN | main | Bloom compress is not configured for 
index dm_rcd, use default value true | 
org.apache.carbondata.index.bloom.BloomCoarseGrainIndexFactory.validateAndGetBloomCompress(BloomCoarseGrainIndexFactory.java:202)
2021-02-04 22:13:59,292 | WARN | Executor task launch worker for task 2 | Bloom 
compress is not configured for index dm_rcd, use default value true | 
org.apache.carbondata.index.bloom.BloomCoarseGrainIndexFactory.validateAndGetBloomCompress(BloomCoarseGrainIndexFactory.java:202)
2021-02-04 22:13:59,629 | WARN | main | Bloom compress is not configured for 
index dm_rcd, use default value true | 
org.apache.carbondata.index.bloom.BloomCoarseGrainIndexFactory.validateAndGetBloomCompress(BloomCoarseGrainIndexFactory.java:202)
2021-02-04 22:14:00,331 | AUDIT | main | {"time":"February 4, 2021 10:14:00 PM 
CST","username":"carbon","opName":"CREATE 
INDEX","opId":"15148202700230273","opStatus":"SUCCESS","opTime":"1648 
ms","table":"default.test_rcd","extraInfo":{"provider":"bloomfilter","indexName":"dm_rcd","bloom_size":"64","bloom_fpp":"0.1"}}
 | carbon.audit.logOperationEnd(Auditor.java:97)
Time taken: 1.818 seconds
spark-sql> select count(*) from test_rcd where city = 'city40';
30
Time taken: 0.556 seconds, Fetched 1 row(s)
spark-sql>
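
For reference, a Bloom filter can report false positives but never false 
negatives, so pruning with it may only skip blocklets that contain no match, 
and the exact row filter still decides the final count. The count should 
therefore be the same before and after the index is created (10 above, not 
30). The Python sketch below shows that invariant on toy data; the class, the 
hash choice, and the blocklet layout are illustrative assumptions and do not 
mirror CarbonData's bloom index code.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits=64, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, value):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value):
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


def build_blocklet(rows):
    """Pair a list of rows with a Bloom filter built over those rows."""
    bloom = BloomFilter()
    for row in rows:
        bloom.add(row)
    return rows, bloom


def count_matches(blocklets, value, use_bloom):
    """Count rows equal to value, optionally skipping blocklets the filter rules out."""
    total = 0
    for rows, bloom in blocklets:
        if use_bloom and not bloom.might_contain(value):
            continue  # definite miss: no row in this blocklet equals value
        total += sum(1 for row in rows if row == value)  # exact filter still applies
    return total
```

Any difference between count_matches(..., use_bloom=True) and 
count_matches(..., use_bloom=False) would point to a bug in pruning or in how 
pruned results are combined, not to expected Bloom filter behaviour.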





[jira] [Created] (CARBONDATA-4117) Test cg index query with Index server fails with NPE

2021-02-02 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-4117:
-

 Summary: Test cg index query with Index server fails with NPE
 Key: CARBONDATA-4117
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4117
 Project: CarbonData
  Issue Type: Bug
Reporter: SHREELEKHYA GAMPA


Test queries to execute:


spark-sql> CREATE TABLE index_test_cg(id INT, name STRING, city STRING, age 
INT) STORED AS carbondata TBLPROPERTIES('SORT_COLUMNS'='city,name', 
'SORT_SCOPE'='LOCAL_SORT');

spark-sql> create index cgindex on table index_test_cg (name) as 
'org.apache.carbondata.spark.testsuite.index.CGIndexFactory';

LOAD DATA LOCAL INPATH '$file2' INTO TABLE index_test_cg 
OPTIONS('header'='false')

spark-sql> select * from index_test_cg where name='n502670';
2021-01-29 15:09:25,881 | ERROR | main | Exception occurred while getting 
splits using index server. Initiating Fallback to embedded mode | 
org.apache.carbondata.hadoop.api.CarbonInputFormat.getDistributedSplit(CarbonInputFormat.java:454)
java.lang.reflect.UndeclaredThrowableException
at com.sun.proxy.$Proxy69.getSplits(Unknown Source)
at 
org.apache.carbondata.indexserver.DistributedIndexJob$$anonfun$1.apply(IndexJobs.scala:85)
at 
org.apache.carbondata.indexserver.DistributedIndexJob$$anonfun$1.apply(IndexJobs.scala:59)
at 
org.apache.carbondata.spark.util.CarbonScalaUtil$.logTime(CarbonScalaUtil.scala:769)
at 
org.apache.carbondata.indexserver.DistributedIndexJob.execute(IndexJobs.scala:58)
at 
org.apache.carbondata.core.index.IndexUtil.executeIndexJob(IndexUtil.java:307)
at 
org.apache.carbondata.hadoop.api.CarbonInputFormat.getDistributedSplit(CarbonInputFormat.java:443)
at 
org.apache.carbondata.hadoop.api.CarbonInputFormat.getPrunedBlocklets(CarbonInputFormat.java:555)
at 
org.apache.carbondata.hadoop.api.CarbonInputFormat.getDataBlocksOfSegment(CarbonInputFormat.java:500)
at 
org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getSplits(CarbonTableInputFormat.java:357)
at 
org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getSplits(CarbonTableInputFormat.java:205)
at 
org.apache.carbondata.spark.rdd.CarbonScanRDD.internalGetPartitions(CarbonScanRDD.scala:159)
at org.apache.carbondata.spark.rdd.CarbonRDD.getPartitions(CarbonRDD.scala:68)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:273)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:269)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:269)
at 
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:273)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:269)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:269)
at 
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:49)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:273)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:269)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:269)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2299)
at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:989)
at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:384)
at org.apache.spark.rdd.RDD.collect(RDD.scala:988)
at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:345)
at 
org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:372)
at 
org.apache.spark.sql.execution.QueryExecution.hiveResultString(QueryExecution.scala:127)
at 
org.apache.spark.sql.hive.thriftserver.SparkSQLDriver$$anonfun$run$1.apply(SparkSQLDriver.scala:66)
at 
org.apache.spark.sql.hive.thriftserver.SparkSQLDriver$$anonfun$run$1.apply(SparkSQLDriver.scala:66)
at 
org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1$$anonfun$apply$1.apply(SQLExecution.scala:95)
at 
org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:144)
at 
org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:86)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:789)
at 
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:63)
at 
org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:65)
at 
org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:383)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
at 
org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:277)
at 

[jira] [Created] (CARBONDATA-4113) Partition query results invalid when carbon.read.partition.hive.direct is disabled

2021-01-28 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-4113:
-

 Summary: Partition query results invalid when 
carbon.read.partition.hive.direct is disabled
 Key: CARBONDATA-4113
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4113
 Project: CarbonData
  Issue Type: Bug
Reporter: SHREELEKHYA GAMPA


Set 'carbon.read.partition.hive.direct' to false.

Queries to execute:

create table partition_cache(a string) partitioned by(b int) stored as 
carbondata;

insert into partition_cache select 'k',1;

insert into partition_cache select 'k',1;

insert into partition_cache select 'k',2;

insert into partition_cache select 'k',2;

alter table partition_cache compact 'minor';

select *from partition_cache; => no results





[jira] [Updated] (CARBONDATA-4111) Filter query having invalid results after add segment to table having SI with Indexserver

2021-01-26 Thread SHREELEKHYA GAMPA (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SHREELEKHYA GAMPA updated CARBONDATA-4111:
--
Description: 
queries to execute:

create table maintable_sdk(a string, b int, c string) stored as carbondata;
 insert into maintable_sdk select 'k',1,'k';
 insert into maintable_sdk select 'l',2,'l';
 CREATE INDEX maintable_si_sdk on table maintable_sdk (c) as 'carbondata';
 alter table maintable_sdk add segment 
options('path'='hdfs://hacluster/sdkfiles/newsegment/', 'format'='carbon');

spark-sql> select *from maintable_sdk where c='m';
2021-01-27 12:10:54,326 | WARN | IPC Client (653337757) connection to 
linux-30/10.19.90.30:22900 from car...@hadoop.com | Unexpected error reading 
responses on connection Thread[IPC Client (653337757) connection to 
linux-30/10.19.90.30:22900 from car...@hadoop.com,5,main] | 
org.apache.hadoop.ipc.Client.run(Client.java:1113)
java.lang.RuntimeException: java.lang.NoSuchMethodException: 
org.apache.carbondata.core.indexstore.SegmentWrapperContainer.<init>()
 at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:135)
 at 
org.apache.hadoop.io.WritableFactories.newInstance(WritableFactories.java:58)
 at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:284)
 at org.apache.hadoop.io.ObjectWritable.readFields(ObjectWritable.java:77)
 at 
org.apache.hadoop.ipc.RpcWritable$WritableWrapper.readFrom(RpcWritable.java:85)
 at org.apache.hadoop.ipc.RpcWritable$Buffer.getValue(RpcWritable.java:187)
 at org.apache.hadoop.ipc.RpcWritable$Buffer.newInstance(RpcWritable.java:183)
 at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1223)
 at org.apache.hadoop.ipc.Client$Connection.run(Client.java:1107)
Caused by: java.lang.NoSuchMethodException: 
org.apache.carbondata.core.indexstore.SegmentWrapperContainer.<init>()
 at java.lang.Class.getConstructor0(Class.java:3082)
 at java.lang.Class.getDeclaredConstructor(Class.java:2178)
 at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:129)
 ... 8 more
2021-01-27 12:10:54,330 | WARN | main | Distributed Segment Pruning failed, 
initiating embedded pruning | 
org.apache.spark.sql.secondaryindex.joins.BroadCastSIFilterPushJoin$.getFilteredSegments(BroadCastSIFilterPushJoin.scala:349)
java.lang.reflect.UndeclaredThrowableException
 at com.sun.proxy.$Proxy59.getPrunedSegments(Unknown Source)
 at 
org.apache.spark.sql.secondaryindex.joins.BroadCastSIFilterPushJoin$.getFilteredSegments(BroadCastSIFilterPushJoin.scala:341)
 at 
org.apache.spark.sql.secondaryindex.joins.BroadCastSIFilterPushJoin$.getFilteredSegments(BroadCastSIFilterPushJoin.scala:426)
 at 
org.apache.spark.sql.secondaryindex.joins.BroadCastSIFilterPushJoin.partitions$lzycompute(BroadCastSIFilterPushJoin.scala:80)
 at 
org.apache.spark.sql.secondaryindex.joins.BroadCastSIFilterPushJoin.partitions(BroadCastSIFilterPushJoin.scala:78)
 at 
org.apache.spark.sql.secondaryindex.joins.BroadCastSIFilterPushJoin.inputCopy$lzycompute(BroadCastSIFilterPushJoin.scala:94)
 at 
org.apache.spark.sql.secondaryindex.joins.BroadCastSIFilterPushJoin.inputCopy(BroadCastSIFilterPushJoin.scala:93)
 at 
org.apache.spark.sql.secondaryindex.joins.BroadCastSIFilterPushJoin.doExecute(BroadCastSIFilterPushJoin.scala:132)
 at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:177)
 at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:173)
 at 
org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:201)
 at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
 at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:198)
 at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:173)
 at 
org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:293)
 at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:342)
 at 
org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:372)
 at 
org.apache.spark.sql.execution.QueryExecution.hiveResultString(QueryExecution.scala:127)
 at 
org.apache.spark.sql.hive.thriftserver.SparkSQLDriver$$anonfun$run$1.apply(SparkSQLDriver.scala:66)
 at 
org.apache.spark.sql.hive.thriftserver.SparkSQLDriver$$anonfun$run$1.apply(SparkSQLDriver.scala:66)
 at 
org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1$$anonfun$apply$1.apply(SQLExecution.scala:95)
 at 
org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:144)
 at 
org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:86)
 at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:789)
 at 
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:63)
 at 

[jira] [Updated] (CARBONDATA-4111) Filter query having invalid results after add segment to table having SI with Indexserver

2021-01-26 Thread SHREELEKHYA GAMPA (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SHREELEKHYA GAMPA updated CARBONDATA-4111:
--
Summary: Filter query having invalid results after add segment to table 
having SI with Indexserver  (was: Filter query having invalid results when add 
segment to SI with Indexserver)

> Filter query having invalid results after add segment to table having SI with 
> Indexserver
> -
>
> Key: CARBONDATA-4111
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4111
> Project: CarbonData
>  Issue Type: Bug
>Reporter: SHREELEKHYA GAMPA
>Priority: Minor
> Attachments: addseg_si_is.png
>
>
> queries to execute:
> create table maintable_sdk(a string, b int, c string) stored as carbondata;
>  insert into maintable_sdk select 'k',1,'k';
>  insert into maintable_sdk select 'l',2,'l';
>  CREATE INDEX maintable_si_sdk on table maintable_sdk (c) as 'carbondata';
>  alter table maintable_sdk add segment 
> options('path'='hdfs://hacluster/sdkfiles/newsegment/', 'format'='carbon');





[jira] [Created] (CARBONDATA-4111) Filter query having invalid results when add segment to SI with Indexserver

2021-01-26 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-4111:
-

 Summary: Filter query having invalid results when add segment to 
SI with Indexserver
 Key: CARBONDATA-4111
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4111
 Project: CarbonData
  Issue Type: Bug
Reporter: SHREELEKHYA GAMPA
 Attachments: addseg_si_is.png

queries to execute:

create table maintable_sdk(a string, b int, c string) stored as carbondata;
 insert into maintable_sdk select 'k',1,'k';
 insert into maintable_sdk select 'l',2,'l';
 CREATE INDEX maintable_si_sdk on table maintable_sdk (c) as 'carbondata';
 alter table maintable_sdk add segment 
options('path'='hdfs://hacluster/sdkfiles/newsegment/', 'format'='carbon');







[jira] [Updated] (CARBONDATA-4111) Filter query having invalid results when add segment to SI with Indexserver

2021-01-26 Thread SHREELEKHYA GAMPA (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SHREELEKHYA GAMPA updated CARBONDATA-4111:
--
Attachment: addseg_si_is.png

> Filter query having invalid results when add segment to SI with Indexserver
> ---
>
> Key: CARBONDATA-4111
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4111
> Project: CarbonData
>  Issue Type: Bug
>Reporter: SHREELEKHYA GAMPA
>Priority: Minor
> Attachments: addseg_si_is.png
>
>
> queries to execute:
> create table maintable_sdk(a string, b int, c string) stored as carbondata;
>  insert into maintable_sdk select 'k',1,'k';
>  insert into maintable_sdk select 'l',2,'l';
>  CREATE INDEX maintable_si_sdk on table maintable_sdk (c) as 'carbondata';
>  alter table maintable_sdk add segment 
> options('path'='hdfs://hacluster/sdkfiles/newsegment/', 'format'='carbon');





[jira] [Created] (CARBONDATA-4096) SDK read fails from cluster and sdk read filter query on sort column giving wrong result with IndexServer

2020-12-22 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-4096:
-

 Summary: SDK read fails from cluster and sdk read filter query on 
sort column giving wrong result with IndexServer
 Key: CARBONDATA-4096
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4096
 Project: CarbonData
  Issue Type: Bug
Reporter: SHREELEKHYA GAMPA
 Attachments: image-2020-12-22-18-54-52-361.png, 
wrongresults_with_IS.PNG

Test: write with the SDK and read with Spark.

Queries to reproduce:

Put the files written by the SDK in the $warehouse/sdk path; it contains 
.carbondata and .index files.

+From spark-sql:+ 

create table sdkout using carbon options(path='$warehouse/sdk');

select * from sdkout where salary = 100; 

!image-2020-12-22-18-54-52-361.png|width=744,height=279!

 





[jira] [Updated] (CARBONDATA-4078) add external segment and query with index server fails

2020-12-07 Thread SHREELEKHYA GAMPA (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SHREELEKHYA GAMPA updated CARBONDATA-4078:
--
Attachment: is_noncarbonsegments stacktrace

> add external segment and query with index server fails
> --
>
> Key: CARBONDATA-4078
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4078
> Project: CarbonData
>  Issue Type: Bug
>Reporter: SHREELEKHYA GAMPA
>Priority: Minor
> Attachments: is_noncarbonsegments stacktrace
>
>
> The index server tries to cache parquet/orc segments and fails, as it cannot 
> read the file format when the fallback mode is disabled.
> Ex:  'test parquet table' test case
>  
>  





[jira] [Created] (CARBONDATA-4078) add external segment and query with index server fails

2020-12-07 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-4078:
-

 Summary: add external segment and query with index server fails
 Key: CARBONDATA-4078
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4078
 Project: CarbonData
  Issue Type: Bug
Reporter: SHREELEKHYA GAMPA
 Attachments: is_noncarbonsegments stacktrace

The index server tries to cache parquet/orc segments and fails, as it cannot 
read the file format when the fallback mode is disabled.

Ex:  'test parquet table' test case

 

 





[jira] [Comment Edited] (CARBONDATA-3970) Carbondata 2.0.1 MV ERROR CarbonInternalMetastore$: Adding/Modifying tableProperties operation failed

2020-10-21 Thread SHREELEKHYA GAMPA (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-3970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209365#comment-17209365
 ] 

SHREELEKHYA GAMPA edited comment on CARBONDATA-3970 at 10/21/20, 8:07 AM:
--

Hi [~sushantsam], could you provide the Spark configurations you have set, 
particularly those related to the metastore, and the complete stack trace of 
the error?

Please also ensure that the Carbon extensions are configured in 
spark-defaults.conf, for example:

'spark.sql.extensions=org.apache.spark.sql.CarbonExtensions'


was (Author: shreelekhya):
Hi [~sushantsam], Could you please provide the spark configurations set 
particularly related to metastore, and complete stacktrace of error.

> Carbondata 2.0.1 MV  ERROR CarbonInternalMetastore$: Adding/Modifying 
> tableProperties operation failed
> --
>
> Key: CARBONDATA-3970
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3970
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query, hive-integration
>Affects Versions: 2.0.1
> Environment: CarbonData 2.0.1 with Spark 2.4.5
>Reporter: Sushant Sammanwar
>Priority: Major
>
> Hi ,
>  
> I am facing issues with materialized views: the query is not hitting the 
> view in the explain plan. I would really appreciate it if you could help me 
> with this.
> Below are the details: 
> I am using Spark shell to connect to Carbon 2.0.1 using Spark 2.4.5.
> The underlying table has data loaded.
> I think the problem occurs while creating the materialized view, as I am 
> getting an error related to the metastore.
>  
>  
> scala> carbon.sql("create MATERIALIZED VIEW agg_sales_mv as select country, 
> sex,sum(quantity),avg(price) from sales group by country,sex").show()
> 20/08/26 01:04:41 AUDIT audit: \{"time":"August 26, 2020 1:04:41 AM 
> IST","username":"root","opName":"CREATE MATERIALIZED 
> VIEW","opId":"16462372696035311","opStatus":"START"}
> 20/08/26 01:04:45 AUDIT audit: \{"time":"August 26, 2020 1:04:45 AM 
> IST","username":"root","opName":"CREATE 
> TABLE","opId":"16462377160819798","opStatus":"START"}
> 20/08/26 01:04:46 AUDIT audit: \{"time":"August 26, 2020 1:04:46 AM 
> IST","username":"root","opName":"CREATE 
> TABLE","opId":"16462377696791275","opStatus":"START"}
> 20/08/26 01:04:48 AUDIT audit: \{"time":"August 26, 2020 1:04:48 AM 
> IST","username":"root","opName":"CREATE 
> TABLE","opId":"16462377696791275","opStatus":"SUCCESS","opTime":"2326 
> ms","table":"NA","extraInfo":{}}
> 20/08/26 01:04:48 AUDIT audit: \{"time":"August 26, 2020 1:04:48 AM 
> IST","username":"root","opName":"CREATE 
> TABLE","opId":"16462377160819798","opStatus":"SUCCESS","opTime":"2955 
> ms","table":"default.agg_sales_mv","extraInfo":{"local_dictionary_threshold":"1","bad_record_path":"","table_blocksize":"1024","local_dictionary_enable":"true","flat_folder":"false","external":"false","sort_columns":"","comment":"","carbon.column.compressor":"snappy","mv_related_tables":"sales"}}
> 20/08/26 01:04:50 ERROR CarbonInternalMetastore$: Adding/Modifying 
> tableProperties operation failed: 
> org.apache.spark.sql.hive.HiveExternalCatalog cannot be cast to 
> org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener
> 20/08/26 01:04:50 ERROR CarbonInternalMetastore$: Adding/Modifying 
> tableProperties operation failed: 
> org.apache.spark.sql.hive.HiveExternalCatalog cannot be cast to 
> org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener
> 20/08/26 01:04:51 AUDIT audit: \{"time":"August 26, 2020 1:04:51 AM 
> IST","username":"root","opName":"CREATE MATERIALIZED 
> VIEW","opId":"16462372696035311","opStatus":"SUCCESS","opTime":"10551 
> ms","table":"NA","extraInfo":{"mvName":"agg_sales_mv"}}
> ++
> ||
> ++
> ++
>  





[jira] [Created] (CARBONDATA-4037) Improve the table status and segment file writing

2020-10-19 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-4037:
-

 Summary: Improve the table status and segment file writing
 Key: CARBONDATA-4037
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4037
 Project: CarbonData
  Issue Type: Improvement
Reporter: SHREELEKHYA GAMPA


Currently, we update the table status and segment files multiple times for a 
single IUD/merge/compaction operation, and we delete the index files 
immediately after merge. When concurrent queries are running, a user query may 
try to access segment index files that are no longer present, which is an 
availability issue.
 * Instead of deleting carbon index files immediately after merge, delete them 
only when the clean files command is executed, and delete only those that have 
existed for more than 1 hour.
 * Generate the segment file after merge index, and update the table status at 
the beginning and after merge index.

Order:
create table status file => index files => merge index => generate segment file 
=> update table status
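
The retention rule described above (delete merged index files only during clean files, and only those older than 1 hour) can be sketched as follows. This is a minimal illustration in Python, not CarbonData's implementation: the function name `stale_index_files`, the `(path, mtime)` tuple layout, and the use of modification time as the file's age are all assumptions. Only the one-hour threshold comes from the description.

```python
from typing import Iterable, List, Tuple

# One hour, per the issue description; the constant name is hypothetical.
RETENTION_SECONDS = 60 * 60


def stale_index_files(
    files: Iterable[Tuple[str, float]], now: float
) -> List[str]:
    """Return paths of already-merged index files that are old enough to delete.

    `files` holds (path, last_modified_epoch_seconds) pairs. Files inside the
    retention window are kept so that in-flight queries can still read them.
    """
    return [path for path, mtime in files if now - mtime > RETENTION_SECONDS]


if __name__ == "__main__":
    now = 10_000.0
    candidates = [
        ("seg0/a.index", now - 7200),  # 2 hours old: eligible for deletion
        ("seg0/b.index", now - 600),   # 10 minutes old: kept
    ]
    print(stale_index_files(candidates, now))
```

Keeping the age check separate from the actual deletion keeps the rule easy to test and lets the clean files step decide when to apply it.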





[jira] [Commented] (CARBONDATA-3903) Documentation Issue in Github Docs Link https://github.com/apache/carbondata/tree/master/docs

2020-10-13 Thread SHREELEKHYA GAMPA (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-3903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212872#comment-17212872
 ] 

SHREELEKHYA GAMPA commented on CARBONDATA-3903:
---

Made changes in the UPDATE/DELETE section as suggested and added a compaction 
hyperlink in the DML section of the language manual. The other information is 
either already present in other documents or not necessary.

> Documentation Issue in Github  Docs Link 
> https://github.com/apache/carbondata/tree/master/docs
> --
>
> Key: CARBONDATA-3903
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3903
> Project: CarbonData
>  Issue Type: Bug
>  Components: docs
>Affects Versions: 2.0.1
> Environment: https://github.com/apache/carbondata/tree/master/docs
>Reporter: PURUJIT CHAUGULE
>Priority: Minor
>
> dml-of-carbondata.md
> LOAD DATA:
>  * Mention Each Load is considered as a Segment.
>  * Give all possible options for SORT_SCOPE like 
> GLOBAL_SORT/LOCAL_SORT/NO_SORT (with explanation of difference between each 
> type).
>  * Add Example Of complete Load query with/without use of OPTIONS.
> INSERT DATA:
>  * Mention each insert is a Segment.
> LOAD Using Static/Dynamic Partitioning:
>  * Can give a hyperlink to Static/Dynamic partitioning.
> UPDATE/DELETE:
>  * Mention about delta files concept in update and delete.
> DELETE:
>  * Add example for deletion of all records from a table (delete from 
> tablename).
> COMPACTION:
>  * Can mention Minor compaction of two types Auto and Manual( 
> carbon.auto.load.merge =true/false), and that if 
> carbon.auto.load.merge=false, trigger should be done manually.
>  * Hyperlink to Configurable properties of Compaction.
>  * Mention that compacted segments do not get cleaned automatically and 
> should be triggered manually using clean files.
>  
> flink-integration-guide.md
>  * Mention what are stages, how is it used.
>  * Process of insertion, deletion of stages in carbontable. (How is it stored 
> in carbontable).
>  
> language-manual.md
>  * Mention Compaction Hyperlink in DML section.
>  
> spatial-index-guide.md
>  * Mention the TBLPROPERTIES supported / not supported for Geo table.
>  * Mention Spatial Index does not make a new column.
>  * CTAS from one geo table to another does not create another Geo table can 
> be mentioned.
>  * Mention that a certain combination of Spatial Index table properties need 
> to be added in create table, without which a geo table does not get created.
>  * Mention that we cannot alter columns (change datatype, change name, drop) 
> mentioned in spatial_index.
>  





[jira] [Commented] (CARBONDATA-3970) Carbondata 2.0.1 MV ERROR CarbonInternalMetastore$: Adding/Modifying tableProperties operation failed

2020-10-07 Thread SHREELEKHYA GAMPA (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-3970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209365#comment-17209365
 ] 

SHREELEKHYA GAMPA commented on CARBONDATA-3970:
---

Hi [~sushantsam], could you please provide the Spark configurations you have 
set, particularly those related to the metastore, and the complete stack trace 
of the error?

> Carbondata 2.0.1 MV  ERROR CarbonInternalMetastore$: Adding/Modifying 
> tableProperties operation failed
> --
>
> Key: CARBONDATA-3970
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3970
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query, hive-integration
>Affects Versions: 2.0.1
> Environment: CarbonData 2.0.1 with Spark 2.4.5
>Reporter: Sushant Sammanwar
>Priority: Major
>
> Hi ,
>  
> I am facing issues with materialized views: the query is not hitting the 
> view in the explain plan. I would really appreciate it if you could help me 
> with this.
> Below are the details: 
> I am using Spark shell to connect to Carbon 2.0.1 using Spark 2.4.5.
> The underlying table has data loaded.
> I think the problem occurs while creating the materialized view, as I am 
> getting an error related to the metastore.
>  
>  
> scala> carbon.sql("create MATERIALIZED VIEW agg_sales_mv as select country, 
> sex,sum(quantity),avg(price) from sales group by country,sex").show()
> 20/08/26 01:04:41 AUDIT audit: \{"time":"August 26, 2020 1:04:41 AM 
> IST","username":"root","opName":"CREATE MATERIALIZED 
> VIEW","opId":"16462372696035311","opStatus":"START"}
> 20/08/26 01:04:45 AUDIT audit: \{"time":"August 26, 2020 1:04:45 AM 
> IST","username":"root","opName":"CREATE 
> TABLE","opId":"16462377160819798","opStatus":"START"}
> 20/08/26 01:04:46 AUDIT audit: \{"time":"August 26, 2020 1:04:46 AM 
> IST","username":"root","opName":"CREATE 
> TABLE","opId":"16462377696791275","opStatus":"START"}
> 20/08/26 01:04:48 AUDIT audit: \{"time":"August 26, 2020 1:04:48 AM 
> IST","username":"root","opName":"CREATE 
> TABLE","opId":"16462377696791275","opStatus":"SUCCESS","opTime":"2326 
> ms","table":"NA","extraInfo":{}}
> 20/08/26 01:04:48 AUDIT audit: \{"time":"August 26, 2020 1:04:48 AM 
> IST","username":"root","opName":"CREATE 
> TABLE","opId":"16462377160819798","opStatus":"SUCCESS","opTime":"2955 
> ms","table":"default.agg_sales_mv","extraInfo":{"local_dictionary_threshold":"1","bad_record_path":"","table_blocksize":"1024","local_dictionary_enable":"true","flat_folder":"false","external":"false","sort_columns":"","comment":"","carbon.column.compressor":"snappy","mv_related_tables":"sales"}}
> 20/08/26 01:04:50 ERROR CarbonInternalMetastore$: Adding/Modifying 
> tableProperties operation failed: 
> org.apache.spark.sql.hive.HiveExternalCatalog cannot be cast to 
> org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener
> 20/08/26 01:04:50 ERROR CarbonInternalMetastore$: Adding/Modifying 
> tableProperties operation failed: 
> org.apache.spark.sql.hive.HiveExternalCatalog cannot be cast to 
> org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener
> 20/08/26 01:04:51 AUDIT audit: \{"time":"August 26, 2020 1:04:51 AM 
> IST","username":"root","opName":"CREATE MATERIALIZED 
> VIEW","opId":"16462372696035311","opStatus":"SUCCESS","opTime":"10551 
> ms","table":"NA","extraInfo":{"mvName":"agg_sales_mv"}}
> ++
> ||
> ++
> ++
>  





[jira] [Closed] (CARBONDATA-3972) Date/timestamp compatability between hive and carbon

2020-10-06 Thread SHREELEKHYA GAMPA (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SHREELEKHYA GAMPA closed CARBONDATA-3972.
-
Resolution: Invalid

>  Date/timestamp compatability between hive and carbon
> -
>
> Key: CARBONDATA-3972
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3972
> Project: CarbonData
>  Issue Type: Bug
>Reporter: SHREELEKHYA GAMPA
>Priority: Minor
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Ensure that date/timestamp values supported by Hive are also supported by 
> Carbon.
> Ex: -01-01 is accepted by hive as a valid record and converted to 
> 0001-01-01.





[jira] [Created] (CARBONDATA-4023) Create MV failed on table with geospatial index

2020-10-05 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-4023:
-

 Summary: Create MV failed on table with geospatial index
 Key: CARBONDATA-4023
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4023
 Project: CarbonData
  Issue Type: Bug
Reporter: SHREELEKHYA GAMPA


Creating an MV on a table with a geospatial index fails when using 
CarbonSession. It fails with java.lang.ClassNotFoundException: 
org.apache.carbondata.geo.geohashindex





[jira] [Created] (CARBONDATA-4005) SI with cache level blocklet issue

2020-09-22 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-4005:
-

 Summary: SI with cache level blocklet issue
 Key: CARBONDATA-4005
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4005
 Project: CarbonData
  Issue Type: Bug
Reporter: SHREELEKHYA GAMPA


A select query on an SI column returns an empty result set after the cache 
level is changed to blocklet.
PR: https://github.com/apache/carbondata/pull/3951





[jira] [Closed] (CARBONDATA-3952) After reset query not hitting MV

2020-09-15 Thread SHREELEKHYA GAMPA (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SHREELEKHYA GAMPA closed CARBONDATA-3952.
-
Resolution: Fixed

> After reset query not hitting MV
> 
>
> Key: CARBONDATA-3952
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3952
> Project: CarbonData
>  Issue Type: Bug
>Reporter: SHREELEKHYA GAMPA
>Priority: Minor
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> After reset, queries do not hit the MV.
> With the reset, spark.sql.warehouse.dir and carbonStorePath no longer match, 
> and the databaseLocation changes to the old table path format. So new tables 
> created after the reset take a different path in case of default.
> Closing this, as it is identified as a Spark bug. More details can be found at 
> https://issues.apache.org/jira/browse/SPARK-31234





[jira] [Updated] (CARBONDATA-3952) After reset query not hitting MV

2020-09-14 Thread SHREELEKHYA GAMPA (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SHREELEKHYA GAMPA updated CARBONDATA-3952:
--
Description: 
After reset, queries do not hit the MV.
With the reset, spark.sql.warehouse.dir and carbonStorePath no longer match, 
and the databaseLocation changes to the old table path format. So new tables 
created after the reset take a different path in case of default.

Closing this, as it is identified as a Spark bug. More details can be found at 
https://issues.apache.org/jira/browse/SPARK-31234

  was:
After reset query not hitting MV.
With the reset, spark.sql.warehouse.dir and carbonStorePath don't match and the 
databaseLocation will change to old table path format. So, new tables that are 
created after reset, take a different path incase of default.

Closing this PR, as it is identified as spark bug. More details can be found at 
https://issues.apache.org/jira/browse/SPARK-31234


> After reset query not hitting MV
> 
>
> Key: CARBONDATA-3952
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3952
> Project: CarbonData
>  Issue Type: Bug
>Reporter: SHREELEKHYA GAMPA
>Priority: Minor
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> After reset, queries do not hit the MV.
> With the reset, spark.sql.warehouse.dir and carbonStorePath no longer match, 
> and the databaseLocation changes to the old table path format. So new tables 
> created after the reset take a different path in case of default.
> Closing this, as it is identified as a Spark bug. More details can be found at 
> https://issues.apache.org/jira/browse/SPARK-31234





[jira] [Updated] (CARBONDATA-3952) After reset query not hitting MV

2020-09-14 Thread SHREELEKHYA GAMPA (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SHREELEKHYA GAMPA updated CARBONDATA-3952:
--
Description: 
After reset, queries do not hit the MV.
With the reset, spark.sql.warehouse.dir and carbonStorePath no longer match, 
and the databaseLocation changes to the old table path format. So new tables 
created after the reset take a different path in case of default.

Closing this PR, as it is identified as a Spark bug. More details can be found 
at https://issues.apache.org/jira/browse/SPARK-31234

  was:After reset query not hitting MV


> After reset query not hitting MV
> 
>
> Key: CARBONDATA-3952
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3952
> Project: CarbonData
>  Issue Type: Bug
>Reporter: SHREELEKHYA GAMPA
>Priority: Minor
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> After reset, queries do not hit the MV.
> With the reset, spark.sql.warehouse.dir and carbonStorePath no longer match, 
> and the databaseLocation changes to the old table path format. So new tables 
> created after the reset take a different path in case of default.
> Closing this PR, as it is identified as a Spark bug. More details can be found 
> at https://issues.apache.org/jira/browse/SPARK-31234





[jira] [Created] (CARBONDATA-3983) SI compatability issue

2020-09-11 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-3983:
-

 Summary: SI compatability issue
 Key: CARBONDATA-3983
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3983
 Project: CarbonData
  Issue Type: Bug
Reporter: SHREELEKHYA GAMPA


Reading from a main table that has an SI returns an empty result set when the 
SI is stored with the old tuple ID storage format.

Bug id: BUG2020090205414
PR link: https://github.com/apache/carbondata/pull/3922





[jira] [Updated] (CARBONDATA-3980) Load fails with aborted exception when Bad records action is unspecified

2020-09-10 Thread SHREELEKHYA GAMPA (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SHREELEKHYA GAMPA updated CARBONDATA-3980:
--
   Description: 
When the partition column is loaded with a bad record value, the load fails 
with a 'Job aborted' message in the cluster. However, the complete stack trace 
shows the actual error message ('Data load failed due to bad record: The value 
with column name projectjoindate and column data type TIMESTAMP is not a valid 
TIMESTAMP type').

Bug id: BUG2020082802430
PR link: https://github.com/apache/carbondata/pull/3919

  was:
When the partition column is loaded with a bad record value, load fails with 
'Job aborted' message in cluster. However in complete stack trace we can see 
the actual error message. ('Data load failed due to bad record: The value with 
column name projectjoindate and column data type TIMESTAMP is not a valid 
TIMESTAMP type') 

Bug id: BUG2020082802430


Remaining Estimate: (was: 0h)

> Load fails with aborted exception when Bad records action is unspecified
> 
>
> Key: CARBONDATA-3980
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3980
> Project: CarbonData
>  Issue Type: Bug
>Reporter: SHREELEKHYA GAMPA
>Priority: Minor
>  Time Spent: 10m
>
> When the partition column is loaded with a bad record value, the load fails 
> with a 'Job aborted' message in the cluster. However, the complete stack 
> trace shows the actual error message ('Data load failed due to bad record: 
> The value with column name projectjoindate and column data type TIMESTAMP is 
> not a valid TIMESTAMP type').
> Bug id: BUG2020082802430
> PR link: https://github.com/apache/carbondata/pull/3919





[jira] [Created] (CARBONDATA-3980) Load fails with aborted exception when Bad records action is unspecified

2020-09-10 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-3980:
-

 Summary: Load fails with aborted exception when Bad records action 
is unspecified
 Key: CARBONDATA-3980
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3980
 Project: CarbonData
  Issue Type: Bug
Reporter: SHREELEKHYA GAMPA


When the partition column is loaded with a bad record value, the load fails 
with a 'Job aborted' message in the cluster. However, the complete stack trace 
shows the actual error message ('Data load failed due to bad record: The value 
with column name projectjoindate and column data type TIMESTAMP is not a valid 
TIMESTAMP type').

Bug id: BUG2020082802430
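
The gap described above, where the top-level message says only 'Job aborted' while the real reason sits deeper in the stack trace, is the usual wrapped-exception problem. A minimal sketch of surfacing the innermost cause follows. It is written in Python purely for illustration and is not the change made for this issue; `root_cause_message` is a hypothetical helper.

```python
def root_cause_message(exc: BaseException) -> str:
    """Follow the chain of wrapped exceptions and return the innermost message."""
    seen = set()
    current = exc
    while True:
        # Prefer an explicit "raise ... from ..." link, else the implicit context.
        nxt = current.__cause__ or current.__context__
        seen.add(id(current))
        if nxt is None or id(nxt) in seen:  # end of chain, or a cycle
            return str(current)
        current = nxt


if __name__ == "__main__":
    try:
        try:
            raise ValueError("Data load failed due to bad record")
        except ValueError as inner:
            raise RuntimeError("Job aborted") from inner
    except RuntimeError as outer:
        # The wrapper alone hides the reason; the chain still carries it.
        print(str(outer), "->", root_cause_message(outer))
```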






[jira] [Created] (CARBONDATA-3979) Added Hive local dictionary support example

2020-09-10 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-3979:
-

 Summary: Added Hive local dictionary support example
 Key: CARBONDATA-3979
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3979
 Project: CarbonData
  Issue Type: Bug
Reporter: SHREELEKHYA GAMPA


 To verify local dictionary support in hive for the carbon tables created from 
spark.





[jira] [Created] (CARBONDATA-3972) Date/timestamp compatability between hive and carbon

2020-09-06 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-3972:
-

 Summary:  Date/timestamp compatability between hive and carbon
 Key: CARBONDATA-3972
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3972
 Project: CarbonData
  Issue Type: Bug
Reporter: SHREELEKHYA GAMPA


Ensure that date/timestamp values supported by Hive are also supported by 
Carbon.
Ex: -01-01 is accepted by hive as a valid record and converted to 
0001-01-01.





[jira] [Created] (CARBONDATA-3955) Fix load failures due to daylight saving time changes

2020-08-19 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-3955:
-

 Summary: Fix load failures due to daylight saving time changes
 Key: CARBONDATA-3955
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3955
 Project: CarbonData
  Issue Type: Bug
Reporter: SHREELEKHYA GAMPA


1) Fix load failures due to daylight saving time changes.
2) During load, date/timestamp values whose year has more than 4 digits should 
fail or be set to null, according to the bad records action property.
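
Point 2 can be illustrated with a small sketch. It is written in Python for illustration only and is not CarbonData's loader code: `check_year_width` and `BadRecordError` are hypothetical names, the input is assumed to be a year-first, dash-separated string, and every action other than FAIL is simplified to "store null".

```python
from typing import Optional


class BadRecordError(ValueError):
    """Raised when a bad record must fail the load (hypothetical type)."""


def check_year_width(value: str, bad_records_action: str) -> Optional[str]:
    """Apply the 'year has more than 4 digits' rule to one date string.

    Only that single rule is modelled; all other validation is out of scope.
    """
    year = value.split("-", 1)[0]
    if not (year.isdigit() and len(year) > 4):
        return value  # not covered by this rule, pass through unchanged
    if bad_records_action.upper() == "FAIL":
        raise BadRecordError(f"year has more than 4 digits: {value!r}")
    return None  # simplification: any other action stores null


if __name__ == "__main__":
    print(check_year_width("2020-08-19", "FAIL"))
    print(check_year_width("12345-01-01", "FORCE"))
```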





[jira] [Created] (CARBONDATA-3952) After reset query not hitting MV

2020-08-16 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-3952:
-

 Summary: After reset query not hitting MV
 Key: CARBONDATA-3952
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3952
 Project: CarbonData
  Issue Type: Bug
Reporter: SHREELEKHYA GAMPA


After reset query not hitting MV





[jira] [Created] (CARBONDATA-3943) Handling the addition of geo column to hive at the time of table creation

2020-08-06 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-3943:
-

 Summary:  Handling the addition of geo column to hive at the time 
of table creation
 Key: CARBONDATA-3943
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3943
 Project: CarbonData
  Issue Type: Bug
Reporter: SHREELEKHYA GAMPA


 Handling the addition of geo column to hive at the time of table creation





[jira] [Updated] (CARBONDATA-3943) Handling the addition of geo column to hive at the time of table creation

2020-08-06 Thread SHREELEKHYA GAMPA (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SHREELEKHYA GAMPA updated CARBONDATA-3943:
--
Priority: Minor  (was: Major)

>  Handling the addition of geo column to hive at the time of table creation
> --
>
> Key: CARBONDATA-3943
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3943
> Project: CarbonData
>  Issue Type: Bug
>Reporter: SHREELEKHYA GAMPA
>Priority: Minor
>
>  Handling the addition of geo column to hive at the time of table creation





[jira] [Created] (CARBONDATA-3913) Table level timestamp support

2020-07-17 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-3913:
-

 Summary: Table level timestamp support
 Key: CARBONDATA-3913
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3913
 Project: CarbonData
  Issue Type: New Feature
Reporter: SHREELEKHYA GAMPA


Support specifying the timestamp format at the table level.
The timestamp format is resolved in the following order of priority:

1. Load command options
2. Table level properties
3. Configurable properties (carbon.timestamp.format)
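
The priority order above amounts to a lookup chain. The following is a minimal sketch of that idea, not CarbonData's implementation: the function name `resolve_timestamp_format`, the option key `timestampformat`, and the fallback format string are assumptions made for this example; only `carbon.timestamp.format` comes from the issue text.

```python
from typing import Mapping

# Hypothetical fallback, used only when no level supplies a format.
DEFAULT_FORMAT = "yyyy-MM-dd HH:mm:ss"


def resolve_timestamp_format(
    load_options: Mapping[str, str],
    table_properties: Mapping[str, str],
    configured_properties: Mapping[str, str],
) -> str:
    """Pick the timestamp format from the highest-priority level that sets one."""
    candidates = (
        load_options.get("timestampformat"),                    # 1. load command options
        table_properties.get("timestampformat"),                # 2. table level properties
        configured_properties.get("carbon.timestamp.format"),   # 3. configurable properties
    )
    for fmt in candidates:
        if fmt:
            return fmt
    return DEFAULT_FORMAT


print(resolve_timestamp_format(
    {},
    {"timestampformat": "dd-MM-yyyy"},
    {"carbon.timestamp.format": "yyyy/MM/dd HH:mm:ss"},
))
```

Here the load command sets nothing, so the table level property is chosen over the configured value.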





[jira] [Updated] (CARBONDATA-3899) drop materialized view when executed concurrently from 4 concurrent client fails in all 4 clients.

2020-07-14 Thread SHREELEKHYA GAMPA (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SHREELEKHYA GAMPA updated CARBONDATA-3899:
--
Description: 
Drop materialized view fails in all 4 clients when executed concurrently from 4 
Beeline clients.
 !screenshot-1.png! 

  was:drop materialized view when executed concurrently from 4 concurrent 
client fails in all 4 clients.


> drop materialized view when executed concurrently from 4 concurrent client 
> fails in all 4 clients.
> --
>
> Key: CARBONDATA-3899
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3899
> Project: CarbonData
>  Issue Type: Bug
>Reporter: SHREELEKHYA GAMPA
>Priority: Major
> Attachments: screenshot-1.png
>
>
> Drop materialized view fails in all 4 clients when executed concurrently from 4 
> Beeline clients.
>  !screenshot-1.png! 





[jira] [Updated] (CARBONDATA-3899) drop materialized view when executed concurrently from 4 concurrent client fails in all 4 clients.

2020-07-14 Thread SHREELEKHYA GAMPA (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SHREELEKHYA GAMPA updated CARBONDATA-3899:
--
Attachment: screenshot-1.png

> drop materialized view when executed concurrently from 4 concurrent client 
> fails in all 4 clients.
> --
>
> Key: CARBONDATA-3899
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3899
> Project: CarbonData
>  Issue Type: Bug
>Reporter: SHREELEKHYA GAMPA
>Priority: Major
> Attachments: screenshot-1.png
>
>
> drop materialized view when executed concurrently from 4 concurrent client 
> fails in all 4 clients.





[jira] [Created] (CARBONDATA-3899) drop materialized view when executed concurrently from 4 concurrent client fails in all 4 clients.

2020-07-14 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-3899:
-

 Summary: drop materialized view when executed concurrently from 4 
concurrent client fails in all 4 clients.
 Key: CARBONDATA-3899
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3899
 Project: CarbonData
  Issue Type: Bug
Reporter: SHREELEKHYA GAMPA


Drop materialized view fails in all 4 clients when executed concurrently from 4 
clients.





[jira] [Updated] (CARBONDATA-3833) Make GeoID visible to the user

2020-05-22 Thread SHREELEKHYA GAMPA (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SHREELEKHYA GAMPA updated CARBONDATA-3833:
--
Description: GeoID is a column created internally for spatial tables and 
currently it is not visible to the users while querying. This feature is to 
make GeoID visible to the user.  (was: Make GeoID visible to the user)

> Make GeoID visible to the user
> --
>
> Key: CARBONDATA-3833
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3833
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: SHREELEKHYA GAMPA
>Priority: Minor
>
> GeoID is a column created internally for spatial tables and currently it is 
> not visible to the users while querying. This feature is to make GeoID 
> visible to the user.





[jira] [Created] (CARBONDATA-3833) Make GeoID visible to the user

2020-05-22 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-3833:
-

 Summary: Make GeoID visible to the user
 Key: CARBONDATA-3833
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3833
 Project: CarbonData
  Issue Type: New Feature
Reporter: SHREELEKHYA GAMPA


Make GeoID visible to the user





[jira] [Created] (CARBONDATA-3772) Update index documents

2020-04-16 Thread SHREELEKHYA GAMPA (Jira)
SHREELEKHYA GAMPA created CARBONDATA-3772:
-

 Summary: Update index documents
 Key: CARBONDATA-3772
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3772
 Project: CarbonData
  Issue Type: Bug
Reporter: SHREELEKHYA GAMPA


PR: [https://github.com/apache/carbondata/pull/3708]


