[jira] [Resolved] (CARBONDATA-4185) Heterogeneous format segments in carbondata documentation

2021-05-20 Thread Kunal Kapoor (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kunal Kapoor resolved CARBONDATA-4185.
--
Fix Version/s: 2.2.0
   Resolution: Fixed

> Heterogeneous format segments in carbondata documentation
> 
>
> Key: CARBONDATA-4185
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4185
> Project: CarbonData
> Issue Type: Bug
> Reporter: Mahesh Raju Somalaraju
> Priority: Major
> Fix For: 2.2.0
>
> Time Spent: 2h 10m
> Remaining Estimate: 0h
>
> Heterogeneous format segments in carbondata documentation



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CARBONDATA-4188) Select query fails for longstring data with small table page size after alter add columns

2021-05-20 Thread Kunal Kapoor (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kunal Kapoor resolved CARBONDATA-4188.
--
Fix Version/s: 2.2.0
   Resolution: Fixed

> Select query fails for longstring data with small table page size after alter 
> add columns
> -
>
> Key: CARBONDATA-4188
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4188
> Project: CarbonData
> Issue Type: Bug
> Reporter: Nihal kumar ojha
> Priority: Major
> Fix For: 2.2.0
>
> Time Spent: 3h 50m
> Remaining Estimate: 0h
>
> Steps to reproduce:
>  # Create a table with a small page size and a longstring column.
>  # Load a large amount of data (enough that more than one page is created).
>  # Add an int column to the same table using alter add columns.
>  # Run a select query with a filter on the newly added column; it fails with ArrayIndexOutOfBoundsException.
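
The steps above could look roughly like the following in Spark SQL. This is a sketch only: the table name, column names, filter value, and CSV path are hypothetical, and the table property values are assumptions chosen to match "small page size" and "longstring". The original report does not give the exact statements.

```sql
-- Step 1: small page size plus a long string column
-- (table/column names and the 1 MB page size are illustrative assumptions).
CREATE TABLE longstring_repro (id INT, name STRING, description STRING)
STORED AS carbondata
TBLPROPERTIES ('LONG_STRING_COLUMNS'='description', 'TABLE_PAGE_SIZE_INMB'='1');

-- Step 2: load enough rows that more than one page is created
-- (the path is hypothetical).
LOAD DATA INPATH 'hdfs://hacluster/path/to/large.csv' INTO TABLE longstring_repro;

-- Step 3: add an int column to the existing table.
ALTER TABLE longstring_repro ADD COLUMNS (age INT);

-- Step 4: filter on the newly added column; per the report, this is the
-- query that failed before the fix.
SELECT * FROM longstring_repro WHERE age = 10;
```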





[jira] [Updated] (CARBONDATA-4143) UT with index server

2021-05-20 Thread SHREELEKHYA GAMPA (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SHREELEKHYA GAMPA updated CARBONDATA-4143:
--
Description: 
Enables running UTs with the index server using the flag {{useIndexServer}}.

Excluded some test cases from running with the index server.

Added a test case with prepriming.

Fixes the following issues:
1. With the index server enabled, a select query gives an incorrect result with SI (secondary index) when the parent and child table segments are not in sync.

Queries to execute:


0: jdbc:hive2://dggphisprb50622:22550/> create table test (c1 string,c2 int,c3 string,c5 string) STORED AS carbondata;
+---------+
| Result  |
+---------+
+---------+
No rows selected (0.564 seconds)
0: jdbc:hive2://dggphisprb50622:22550/> load data inpath 'hdfs://hacluster/chetan/dest.csv' into table test;
+-------------+
| Segment ID  |
+-------------+
| 0           |
+-------------+
1 row selected (1.764 seconds)
0: jdbc:hive2://dggphisprb50622:22550/> create index index_test on table test (c3) AS 'carbondata';
+---------+
| Result  |
+---------+
+---------+
No rows selected (2.412 seconds)
0: jdbc:hive2://dggphisprb50622:22550/> load data inpath 'hdfs://hacluster/chetan/dest.csv' into table test;
+-------------+
| Segment ID  |
+-------------+
| 1           |
+-------------+
1 row selected (2.839 seconds)
0: jdbc:hive2://dggphisprb50622:22550/> select * from test where c3='dd';
+-----+-----+-----+------+
| c1  | c2  | c3  | c5   |
+-----+-----+-----+------+
| d   | 4   | dd  | ddd  |
| d   | 4   | dd  | ddd  |
+-----+-----+-----+------+
2 rows selected (3.452 seconds)
0: jdbc:hive2://dggphisprb50622:22550/> delete from table index_test where segment.ID in(1);
+---------+
| Result  |
+---------+
+---------+
No rows selected (0.413 seconds)
0: jdbc:hive2://dggphisprb50622:22550/> select * from test where c3='dd';
+-----+-----+-----+------+
| c1  | c2  | c3  | c5   |
+-----+-----+-----+------+
| d   | 4   | dd  | ddd  |
+-----+-----+-----+------+
1 row selected (3.262 seconds)
0: jdbc:hive2://dggphisprb50622:22550/>

Expected: the second select query should also return 2 rows.
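
One way to confirm that the parent table and the SI table are out of sync in the session above is to compare their segment lists. The sketch below reuses the table names from the transcript; the comments describe what the transcript implies, not captured output, and the exact status text shown for the deleted segment is an assumption.

```sql
-- Parent table: segments 0 and 1 were both loaded in the transcript.
SHOW SEGMENTS FOR TABLE test;

-- SI table: segment 1 was deleted by ID, so it should no longer be listed
-- as an active segment (for example, it may appear as marked for delete).
SHOW SEGMENTS FOR TABLE index_test;
```

If the two lists differ as described, the select on c3 goes through an SI table that is missing segment 1, which matches the single row returned above.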


2. When reindex is triggered and stale files are present in the segment 
directory, the segment file is written with incorrect file names (both the 
valid index and the stale mergeindex file names). As a result, duplicate data 
is present in the SI table, although no error or incorrect query result is reported.

  was:
Enables running UTs with the index server using the flag {{useIndexServer}}.

Excluded some test cases from running with the index server.

Added a test case with prepriming.


> UT with index server
> 
>
> Key: CARBONDATA-4143
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4143
> Project: CarbonData
> Issue Type: Improvement
> Reporter: SHREELEKHYA GAMPA
> Priority: Major
> Time Spent: 3h 20m
> Remaining Estimate: 0h