[jira] [Created] (CARBONDATA-3542) Support Map data type reading through Hive

2019-10-05 Thread dhatchayani (Jira)
dhatchayani created CARBONDATA-3542:
---

 Summary: Support Map data type reading through Hive
 Key: CARBONDATA-3542
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3542
 Project: CarbonData
  Issue Type: Improvement
Reporter: dhatchayani






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CARBONDATA-3511) Query time improvement by reducing the number of NameNode calls while having carbonindex files in the store

2019-09-03 Thread dhatchayani (Jira)
dhatchayani created CARBONDATA-3511:
---

 Summary: Query time improvement by reducing the number of NameNode 
calls while having carbonindex files in the store
 Key: CARBONDATA-3511
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3511
 Project: CarbonData
  Issue Type: Improvement
Reporter: dhatchayani






--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (CARBONDATA-3451) Select aggregation query with filter fails on hive table with decimal type using CarbonHiveSerDe in Spark 2.1

2019-07-08 Thread dhatchayani (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-3451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16880140#comment-16880140
 ] 

dhatchayani commented on CARBONDATA-3451:
-

Please check this again. It is already fixed in 
[CARBONDATA-3441|https://issues.apache.org/jira/browse/CARBONDATA-3441]

> Select aggregation query with filter fails on hive table with decimal type 
> using CarbonHiveSerDe in Spark 2.1
> -
>
> Key: CARBONDATA-3451
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3451
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.6.0
> Environment: Spark 2.1
>Reporter: Chetan Bhat
>Priority: Minor
>
> Test steps :
> In Spark 2.1 beeline, the user creates a carbon table and loads data.
>  create table Test_Boundary (c1_int int,c2_Bigint Bigint,c3_Decimal 
> Decimal(38,38),c4_double double,c5_string string,c6_Timestamp 
> Timestamp,c7_Datatype_Desc string) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('inverted_index'='c1_int,c2_Bigint,c5_string,c6_Timestamp','sort_columns'='c1_int,c2_Bigint,c5_string,c6_Timestamp');
> LOAD DATA INPATH 'hdfs://hacluster/chetan/Test_Data1.csv' INTO table 
> Test_Boundary 
> OPTIONS('DELIMITER'=',','QUOTECHAR'='','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='');
> From hive beeline, the user creates a hive table on top of the already created 
> carbon table using CarbonHiveSerDe.
> CREATE TABLE IF NOT EXISTS Test_Boundary1 (c1_int int,c2_Bigint 
> Bigint,c3_Decimal Decimal(38,38),c4_double double,c5_string 
> string,c6_Timestamp Timestamp,c7_Datatype_Desc string) ROW FORMAT SERDE 
> 'org.apache.carbondata.hive.CarbonHiveSerDe' WITH SERDEPROPERTIES 
> ('mapreduce.input.carboninputformat.databaseName'='default','mapreduce.input.carboninputformat.tableName'='Test_Boundary')
>  STORED AS INPUTFORMAT 'org.apache.carbondata.hive.MapredCarbonInputFormat' 
> OUTPUTFORMAT 'org.apache.carbondata.hive.MapredCarbonOutputFormat' LOCATION 
> 'hdfs://hacluster//user/hive/warehouse/carbon.store/default/test_boundary';
> The user executes the below select aggregation queries on the hive table.
> select min(c3_Decimal),max(c3_Decimal),sum(c3_Decimal),avg(c3_Decimal) , 
> count(c3_Decimal), variance(c3_Decimal) from test_boundary1 where 
> exp(c1_int)=0.0 or exp(c1_int)=1.0;
> select min(c3_Decimal),max(c3_Decimal),sum(c3_Decimal),avg(c3_Decimal) , 
> count(c3_Decimal), variance(c3_Decimal) from test_boundary1 where 
> log(c1_int,1)=0.0 or log(c1_int,1) IS NULL;
> select min(c3_Decimal),max(c3_Decimal),sum(c3_Decimal),avg(c3_Decimal) , 
> count(c3_Decimal), variance(c3_Decimal) from test_boundary1 where 
> pmod(c1_int,1)=0 or pmod(c1_int,1)IS NULL;
>  
> Actual Result : Select aggregation query with filter fails on hive table with 
> decimal type using CarbonHiveSerDe in Spark 2.1
> Expected Result : Select aggregation query with filter should succeed on the 
> hive table with decimal type using CarbonHiveSerDe in Spark 2.1
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-3455) Job Group ID is not displayed in the IndexServer

2019-06-27 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-3455:
---

 Summary: Job Group ID is not displayed in the IndexServer
 Key: CARBONDATA-3455
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3455
 Project: CarbonData
  Issue Type: Bug
Reporter: dhatchayani


Job Group ID is not displayed in the IndexServer



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-3443) Update hive guide with Read from hive

2019-06-19 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-3443:
---

 Summary: Update hive guide with Read from hive
 Key: CARBONDATA-3443
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3443
 Project: CarbonData
  Issue Type: Improvement
Reporter: dhatchayani






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-3441) Aggregate queries are failing on Reading from Hive

2019-06-17 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-3441:
---

 Summary: Aggregate queries are failing on Reading from Hive
 Key: CARBONDATA-3441
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3441
 Project: CarbonData
  Issue Type: Bug
Reporter: dhatchayani


Aggregate queries are failing on Reading from Hive



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-3415) Merge index is not working for partition table. Merge index for partition table is taking significantly longer time than normal table.

2019-06-06 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-3415:
---

 Summary: Merge index is not working for partition table. Merge 
index for partition table is taking significantly longer time than normal table.
 Key: CARBONDATA-3415
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3415
 Project: CarbonData
  Issue Type: Bug
Reporter: dhatchayani


Issues:

(1) Merge index is not working on the partition table.

(2) Time taken for merge index on a partition table is significantly more than 
on a normal carbon table.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-3406) Support Binary, Boolean,Varchar, Complex data types read and Dictionary columns read

2019-05-30 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-3406:
---

 Summary: Support Binary, Boolean,Varchar, Complex data types read 
and Dictionary columns read
 Key: CARBONDATA-3406
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3406
 Project: CarbonData
  Issue Type: Bug
Reporter: dhatchayani


(1) Support all data types read

(2) Support dictionary columns read



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-3393) Merge Index Job Failure should not trigger the merge index job again. Exception propagation should be decided by the User.

2019-05-28 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-3393:

Summary: Merge Index Job Failure should not trigger the merge index job 
again. Exception propagation should be decided by the User.  (was: Merge Index 
Job Failure should not trigger the merge index job again. Exception should be 
propagated to the caller.)

> Merge Index Job Failure should not trigger the merge index job again. 
> Exception propagation should be decided by the User.
> --
>
> Key: CARBONDATA-3393
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3393
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Priority: Minor
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> If the merge index job fails, the LOAD also fails. The load should not 
> consider the merge index job status when deciding the LOAD status.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-3393) Merge Index Job Failure should not trigger the merge index job again. Exception should be propagated to the caller.

2019-05-20 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-3393:

Summary: Merge Index Job Failure should not trigger the merge index job 
again. Exception should be propagated to the caller.  (was: Merge Index Job 
Failure should not fail the LOAD. Load status should not consider the merge 
index job status.)

> Merge Index Job Failure should not trigger the merge index job again. 
> Exception should be propagated to the caller.
> ---
>
> Key: CARBONDATA-3393
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3393
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If the merge index job fails, the LOAD also fails. The load should not 
> consider the merge index job status when deciding the LOAD status.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-3393) Merge Index Job Failure should not fail the LOAD. Load status should not consider the merge index job status.

2019-05-20 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-3393:
---

 Summary: Merge Index Job Failure should not fail the LOAD. Load 
status should not consider the merge index job status.
 Key: CARBONDATA-3393
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3393
 Project: CarbonData
  Issue Type: Bug
Reporter: dhatchayani


If the merge index job fails, the LOAD also fails. The load should not 
consider the merge index job status when deciding the LOAD status.
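
A minimal Scala sketch of the intended behaviour (helper names here are 
invented, not CarbonData's actual API): the merge index step runs best-effort 
after the load, and its failure is logged instead of being folded into the 
LOAD status.

import scala.util.{Failure, Success, Try}

object LoadStatusSketch {
  // Hypothetical stand-in for the merge index Spark job.
  def runMergeIndexJob(): Unit = throw new RuntimeException("merge index failed")

  def main(args: Array[String]): Unit = {
    val loadStatus = "SUCCESS" // the data load itself has already succeeded
    // Best-effort: a merge index failure must not change loadStatus.
    Try(runMergeIndexJob()) match {
      case Success(_) => println("merge index done")
      case Failure(e) => println(s"merge index failed, load unaffected: ${e.getMessage}")
    }
    println(s"LOAD status: $loadStatus")
  }
}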



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-3386) Concurrent Merge index and query is failing

2019-05-15 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-3386:
---

 Summary: Concurrent Merge index and query is failing
 Key: CARBONDATA-3386
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3386
 Project: CarbonData
  Issue Type: Bug
Reporter: dhatchayani


Concurrent merge index and query is failing. When a load is triggered on a 
table, merge index is triggered at the end of the load. But it runs after the 
table status has already been updated to SUCCESS/PARTIAL SUCCESS for those 
segments, so the segment is already visible to concurrent queries. Once the 
merge index completes, it deletes the original index files, which are still 
referred to by the running query, and this leads to the query failure.
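
The failure shape, reduced to a minimal runnable Scala sketch (file names are 
hypothetical; this is not CarbonData code). The query thread snapshots the 
segment's index files once the segment is marked SUCCESS; the merge index step 
then deletes those files, and the query's read of the stale listing fails.

import java.nio.file.{Files, Path}

object MergeIndexRaceSketch {
  def main(args: Array[String]): Unit = {
    // Load thread: segment written, tablestatus already marked SUCCESS.
    val indexFile: Path = Files.createTempFile("segment-0-part-0", ".carbonindex")

    // Query thread: lists the segment's index files (segment looks queryable).
    val snapshot: Seq[Path] = Seq(indexFile)

    // Load thread: merge index writes the merged file and deletes the originals.
    Files.delete(indexFile)

    // Query thread: dereferences the stale listing -> NoSuchFileException,
    // i.e. the query failure described above.
    snapshot.foreach(p => Files.readAllBytes(p))
  }
}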



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-3364) Support Read from Hive. Queries are giving empty results from hive.

2019-04-29 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-3364:

Summary: Support Read from Hive. Queries are giving empty results from 
hive.  (was: Support Read from Hive. Queries on carbon table are giving empty 
results from hive.)

> Support Read from Hive. Queries are giving empty results from hive.
> ---
>
> Key: CARBONDATA-3364
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3364
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-3364) Support Read from Hive. Queries on carbon table are giving empty results from hive.

2019-04-29 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-3364:
---

 Summary: Support Read from Hive. Queries on carbon table are 
giving empty results from hive.
 Key: CARBONDATA-3364
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3364
 Project: CarbonData
  Issue Type: Bug
Reporter: dhatchayani






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-3293) Prune datamaps improvement for count(*)

2019-03-15 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-3293:

Summary: Prune datamaps improvement for count(*)  (was: Prune datamaps 
improvement)

> Prune datamaps improvement for count(*)
> ---
>
> Key: CARBONDATA-3293
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3293
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Major
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> +*Problem:*+
> (1) Currently for count ( *) , the prune is the same as for a select * query. 
> Blocklet and ExtendedBlocklet are formed from the DataMapRow, which is 
> unnecessary and time-consuming.
> (2) Pruning in a select * query spends time in convertToSafeRow(): to get the 
> position of data in an unsafe row, we need to traverse the whole row, so the 
> DataMapRow is converted to a safe row.
> (3) In case of filter queries, whether the blocklet is valid or invalid, we 
> convert the DataMapRow to a safe row. This conversion is time consuming, and 
> the cost grows with the number of blocklets.
>  
> +*Solution:*+
> (1) The blocklet row count is available in the DataMapRow itself, so it is 
> enough to read just the count. With this, count ( *) query performance can be 
> improved.
> (2) Maintain the data length also in the DataMapRow so that traversing the 
> whole row can be avoided; with the length we can directly hit the data 
> position.
> (3) Read only the MinMax from the DataMapRow to decide whether a scan is 
> required on that blocklet; only if required convert it to a safe row.
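
A hypothetical Scala sketch of solution (1), with invented names rather than 
the real DataMap API: count(*) is answered from the per-blocklet row counts 
already stored in the DataMapRow, without forming Blocklet/ExtendedBlocklet 
objects or converting to safe rows.

// Stand-in for the row-count field carried by each DataMapRow.
final case class DataMapRowStub(blockletRowCount: Long)

object CountStarPruneSketch {
  // count(*) only needs the stored counts: no safe-row conversion needed.
  def countStar(dataMapRows: Seq[DataMapRowStub]): Long =
    dataMapRows.map(_.blockletRowCount).sum

  def main(args: Array[String]): Unit = {
    val rows = Seq(DataMapRowStub(32000), DataMapRowStub(32000), DataMapRowStub(12875))
    println(countStar(rows)) // 76875
  }
}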



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-3313) count(*) is not invalidating the invalid segments cache

2019-03-12 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-3313:
---

 Summary: count(*) is not invalidating the invalid segments cache
 Key: CARBONDATA-3313
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3313
 Project: CarbonData
  Issue Type: Bug
Reporter: dhatchayani






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-3293) Prune datamaps improvement

2019-02-14 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-3293:

Summary: Prune datamaps improvement  (was: Prune for count(*) improvement)

> Prune datamaps improvement
> --
>
> Key: CARBONDATA-3293
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3293
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Major
>
> Problem:
> (1) Currently for count(*), the prune is the same as for a select * query. 
> Blocklet and ExtendedBlocklet are formed from the DataMapRow, which is 
> unnecessary.
>  
> Solution:
> The blocklet row count is available in the DataMapRow itself, so it is enough 
> to read just the count. With this, count(*) query performance can be improved.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-3293) Prune datamaps improvement

2019-02-14 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-3293:

Description: 
+*Problem:*+

(1) Currently for count ( *) , the prune is the same as for a select * query. 
Blocklet and ExtendedBlocklet are formed from the DataMapRow, which is 
unnecessary and time-consuming.

(2) Pruning in a select * query spends time in convertToSafeRow(): to get the 
position of data in an unsafe row, we need to traverse the whole row, so the 
DataMapRow is converted to a safe row.

(3) In case of filter queries, whether the blocklet is valid or invalid, we 
convert the DataMapRow to a safe row. This conversion is time consuming, and 
the cost grows with the number of blocklets.

 

+*Solution:*+

(1) The blocklet row count is available in the DataMapRow itself, so it is 
enough to read just the count. With this, count ( *) query performance can be 
improved.

(2) Maintain the data length also in the DataMapRow so that traversing the 
whole row can be avoided; with the length we can directly hit the data position.

(3) Read only the MinMax from the DataMapRow to decide whether a scan is 
required on that blocklet; only if required convert it to a safe row.

  was:
Problem:

(1) Currently for count ( *) , the prune is the same as for a select * query. 
Blocklet and ExtendedBlocklet are formed from the DataMapRow, which is 
unnecessary and time-consuming.

(2) Pruning in a select * query spends time in convertToSafeRow(): to get the 
position of data in an unsafe row, we need to traverse the whole row, so the 
DataMapRow is converted to a safe row.

(3) In case of filter queries, whether the blocklet is valid or invalid, we 
convert the DataMapRow to a safe row. This conversion is time consuming, and 
the cost grows with the number of blocklets.

 

Solution:

(1) The blocklet row count is available in the DataMapRow itself, so it is 
enough to read just the count. With this, count ( *) query performance can be 
improved.

(2) Maintain the data length also in the DataMapRow so that traversing the 
whole row can be avoided; with the length we can directly hit the data position.

(3) Read only the MinMax from the DataMapRow to decide whether a scan is 
required on that blocklet; only if required convert it to a safe row.


> Prune datamaps improvement
> --
>
> Key: CARBONDATA-3293
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3293
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Major
>
> +*Problem:*+
> (1) Currently for count ( *) , the prune is the same as for a select * query. 
> Blocklet and ExtendedBlocklet are formed from the DataMapRow, which is 
> unnecessary and time-consuming.
> (2) Pruning in a select * query spends time in convertToSafeRow(): to get the 
> position of data in an unsafe row, we need to traverse the whole row, so the 
> DataMapRow is converted to a safe row.
> (3) In case of filter queries, whether the blocklet is valid or invalid, we 
> convert the DataMapRow to a safe row. This conversion is time consuming, and 
> the cost grows with the number of blocklets.
>  
> +*Solution:*+
> (1) The blocklet row count is available in the DataMapRow itself, so it is 
> enough to read just the count. With this, count ( *) query performance can be 
> improved.
> (2) Maintain the data length also in the DataMapRow so that traversing the 
> whole row can be avoided; with the length we can directly hit the data 
> position.
> (3) Read only the MinMax from the DataMapRow to decide whether a scan is 
> required on that blocklet; only if required convert it to a safe row.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-3293) Prune datamaps improvement

2019-02-14 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-3293:

Description: 
Problem:

(1) Currently for count ( *) , the prune is the same as for a select * query. 
Blocklet and ExtendedBlocklet are formed from the DataMapRow, which is 
unnecessary and time-consuming.

(2) Pruning in a select * query spends time in convertToSafeRow(): to get the 
position of data in an unsafe row, we need to traverse the whole row, so the 
DataMapRow is converted to a safe row.

(3) In case of filter queries, whether the blocklet is valid or invalid, we 
convert the DataMapRow to a safe row. This conversion is time consuming, and 
the cost grows with the number of blocklets.

 

Solution:

(1) The blocklet row count is available in the DataMapRow itself, so it is 
enough to read just the count. With this, count ( *) query performance can be 
improved.

(2) Maintain the data length also in the DataMapRow so that traversing the 
whole row can be avoided; with the length we can directly hit the data position.

(3) Read only the MinMax from the DataMapRow to decide whether a scan is 
required on that blocklet; only if required convert it to a safe row.

  was:
Problem:

(1) Currently for count ( *) , the prune is the same as for a select * query. 
Blocklet and ExtendedBlocklet are formed from the DataMapRow, which is 
unnecessary.

 

Solution:

The blocklet row count is available in the DataMapRow itself, so it is enough 
to read just the count. With this, count(*) query performance can be improved.


> Prune datamaps improvement
> --
>
> Key: CARBONDATA-3293
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3293
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Major
>
> Problem:
> (1) Currently for count ( *) , the prune is the same as for a select * query. 
> Blocklet and ExtendedBlocklet are formed from the DataMapRow, which is 
> unnecessary and time-consuming.
> (2) Pruning in a select * query spends time in convertToSafeRow(): to get the 
> position of data in an unsafe row, we need to traverse the whole row, so the 
> DataMapRow is converted to a safe row.
> (3) In case of filter queries, whether the blocklet is valid or invalid, we 
> convert the DataMapRow to a safe row. This conversion is time consuming, and 
> the cost grows with the number of blocklets.
>  
> Solution:
> (1) The blocklet row count is available in the DataMapRow itself, so it is 
> enough to read just the count. With this, count ( *) query performance can be 
> improved.
> (2) Maintain the data length also in the DataMapRow so that traversing the 
> whole row can be avoided; with the length we can directly hit the data 
> position.
> (3) Read only the MinMax from the DataMapRow to decide whether a scan is 
> required on that blocklet; only if required convert it to a safe row.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-3293) Prune datamaps improvement

2019-02-14 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-3293:

Description: 
Problem:

(1) Currently for count * , the prune is the same as for a select * query. 
Blocklet and ExtendedBlocklet are formed from the DataMapRow, which is 
unnecessary.

 

Solution:

The blocklet row count is available in the DataMapRow itself, so it is enough 
to read just the count. With this, count(*) query performance can be improved.

  was:
Problem:

(1) Currently for count(*), the prune is the same as for a select * query. 
Blocklet and ExtendedBlocklet are formed from the DataMapRow, which is 
unnecessary.

 

Solution:

The blocklet row count is available in the DataMapRow itself, so it is enough 
to read just the count. With this, count(*) query performance can be improved.


> Prune datamaps improvement
> --
>
> Key: CARBONDATA-3293
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3293
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Major
>
> Problem:
> (1) Currently for count * , the prune is the same as for a select * query. 
> Blocklet and ExtendedBlocklet are formed from the DataMapRow, which is 
> unnecessary.
>  
> Solution:
> The blocklet row count is available in the DataMapRow itself, so it is enough 
> to read just the count. With this, count(*) query performance can be improved.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-3293) Prune datamaps improvement

2019-02-14 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-3293:

Description: 
Problem:

(1) Currently for count ( *) , the prune is the same as for a select * query. 
Blocklet and ExtendedBlocklet are formed from the DataMapRow, which is 
unnecessary.

 

Solution:

The blocklet row count is available in the DataMapRow itself, so it is enough 
to read just the count. With this, count(*) query performance can be improved.

  was:
Problem:

(1) Currently for count * , the prune is the same as for a select * query. 
Blocklet and ExtendedBlocklet are formed from the DataMapRow, which is 
unnecessary.

 

Solution:

The blocklet row count is available in the DataMapRow itself, so it is enough 
to read just the count. With this, count(*) query performance can be improved.


> Prune datamaps improvement
> --
>
> Key: CARBONDATA-3293
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3293
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Major
>
> Problem:
> (1) Currently for count ( *) , the prune is the same as for a select * query. 
> Blocklet and ExtendedBlocklet are formed from the DataMapRow, which is 
> unnecessary.
>  
> Solution:
> The blocklet row count is available in the DataMapRow itself, so it is enough 
> to read just the count. With this, count(*) query performance can be improved.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-3293) Prune for count(*) improvement

2019-02-14 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-3293:

Description: 
Problem:

(1) Currently for count(*), the prune is the same as for a select * query. 
Blocklet and ExtendedBlocklet are formed from the DataMapRow, which is 
unnecessary.

 

Solution:

The blocklet row count is available in the DataMapRow itself, so it is enough 
to read just the count. With this, count(*) query performance can be improved.

> Prune for count(*) improvement
> --
>
> Key: CARBONDATA-3293
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3293
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Major
>
> Problem:
> (1) Currently for count(*), the prune is the same as for a select * query. 
> Blocklet and ExtendedBlocklet are formed from the DataMapRow, which is 
> unnecessary.
>  
> Solution:
> The blocklet row count is available in the DataMapRow itself, so it is enough 
> to read just the count. With this, count(*) query performance can be improved.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-3293) Prune for count(*) improvement

2019-02-14 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-3293:
---

 Summary: Prune for count(*) improvement
 Key: CARBONDATA-3293
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3293
 Project: CarbonData
  Issue Type: Improvement
Reporter: dhatchayani
Assignee: dhatchayani






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-3241) Refactor the requested scan columns and the projection columns

2019-01-10 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-3241:
---

 Summary: Refactor the requested scan columns and the projection 
columns
 Key: CARBONDATA-3241
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3241
 Project: CarbonData
  Issue Type: Improvement
Reporter: dhatchayani
Assignee: dhatchayani






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2755) Compaction of Complex DataType (STRUCT AND ARRAY)

2018-12-10 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-2755:

Summary: Compaction of Complex DataType (STRUCT AND ARRAY)  (was: 
Compaction of Complex DataType)

> Compaction of Complex DataType (STRUCT AND ARRAY)
> -
>
> Key: CARBONDATA-2755
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2755
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: sounak chakraborty
>Assignee: dhatchayani
>Priority: Major
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Complex Type Enhancements - Compaction of Complex DataType



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CARBONDATA-2755) Compaction of Complex DataType

2018-12-10 Thread dhatchayani (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16714748#comment-16714748
 ] 

dhatchayani commented on CARBONDATA-2755:
-

Jira raised to extend compaction support with MAP type: 
https://issues.apache.org/jira/browse/CARBONDATA-3160

> Compaction of Complex DataType
> --
>
> Key: CARBONDATA-2755
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2755
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: sounak chakraborty
>Assignee: dhatchayani
>Priority: Major
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Complex Type Enhancements - Compaction of Complex DataType



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-3160) Compaction support with MAP data type

2018-12-10 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-3160:

Description: Support compaction with MAP type

> Compaction support with MAP data type
> -
>
> Key: CARBONDATA-3160
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3160
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Minor
>
> Support compaction with MAP type



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-3160) Compaction support with MAP data type

2018-12-10 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-3160:
---

 Summary: Compaction support with MAP data type
 Key: CARBONDATA-3160
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3160
 Project: CarbonData
  Issue Type: Sub-task
Reporter: dhatchayani
Assignee: dhatchayani






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (CARBONDATA-2605) Complex DataType Enhancements

2018-12-10 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani reassigned CARBONDATA-2605:
---

Assignee: dhatchayani

> Complex DataType Enhancements
> -
>
> Key: CARBONDATA-2605
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2605
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: sounak chakraborty
>Assignee: dhatchayani
>Priority: Major
> Attachments: Complex Data Type Enhancements.pdf
>
>
> Umbrella Jira to implement enhancements in Complex Data Type for Carbon.
>  * Projection push down for struct data type.
>  * Provide adaptive encoding and decoding for all data types.
>  * Support JSON data loading directly into Carbon table.
>  
> Please access the Design Document through this link. 
>  
> [https://docs.google.com/document/d/12EZwUlLs53Vro7pMeLnFd0lCjeKOakKY-60e3cryJb4/edit#|https://docs.google.com/document/d/12EZwUlLs53Vro7pMeLnFd0lCjeKOakKY-60e3cryJb4/edit]
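
An illustrative spark.sql session for the first two enhancements (table and 
column names are invented; a CarbonData-enabled SparkSession named spark is 
assumed, as in the other snippets in this archive). With projection push-down, 
selecting one struct child should read and decode only that child.

spark.sql("DROP TABLE IF EXISTS people")
spark.sql("create table people (id int, info struct<name: string, age: int>) stored by 'carbondata'")
spark.sql("insert into people select 1, named_struct('name', 'a', 'age', 20)")
// Only info.name needs to be read from the store, not the whole struct.
spark.sql("select info.name from people").show(false)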



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CARBONDATA-2755) Compaction of Complex DataType

2018-12-10 Thread dhatchayani (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16714745#comment-16714745
 ] 

dhatchayani commented on CARBONDATA-2755:
-

This Jira is to support compaction with STRUCT and ARRAY type. For MAP type, a 
new Jira will be raised.

> Compaction of Complex DataType
> --
>
> Key: CARBONDATA-2755
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2755
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: sounak chakraborty
>Assignee: dhatchayani
>Priority: Major
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Complex Type Enhancements - Compaction of Complex DataType



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (CARBONDATA-2755) Compaction of Complex DataType

2018-12-10 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani reassigned CARBONDATA-2755:
---

Assignee: dhatchayani  (was: sounak chakraborty)

> Compaction of Complex DataType
> --
>
> Key: CARBONDATA-2755
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2755
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: sounak chakraborty
>Assignee: dhatchayani
>Priority: Major
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Complex Type Enhancements - Compaction of Complex DataType



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-3145) Avoid duplicate decoding for complex column pages while querying

2018-12-10 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-3145:

Summary: Avoid duplicate decoding for complex column pages while querying  
(was: Read improvement for complex column pages while querying)

> Avoid duplicate decoding for complex column pages while querying
> 
>
> Key: CARBONDATA-3145
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3145
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Major
>  Time Spent: 3h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-3145) Read improvement for complex column pages while querying

2018-12-03 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-3145:
---

 Summary: Read improvement for complex column pages while querying
 Key: CARBONDATA-3145
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3145
 Project: CarbonData
  Issue Type: Sub-task
Reporter: dhatchayani
Assignee: dhatchayani






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-3131) Update the requested columns to the Scan

2018-11-26 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-3131:
---

 Summary: Update the requested columns to the Scan
 Key: CARBONDATA-3131
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3131
 Project: CarbonData
  Issue Type: Sub-task
Reporter: dhatchayani
Assignee: dhatchayani






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-3117) Rearrange the projection list in the Scan

2018-11-21 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-3117:
---

 Summary: Rearrange the projection list in the Scan
 Key: CARBONDATA-3117
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3117
 Project: CarbonData
  Issue Type: Bug
Reporter: dhatchayani
Assignee: dhatchayani






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-3096) Wrong records size on the input metrics

2018-11-16 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-3096:

Description: 
(1) The scanned record result size is taken from the default batch size. It 
should be taken from the records actually scanned.
h1. +*Steps to reproduce:*+

spark.sql("DROP TABLE IF EXISTS person")
 spark.sql("create table person (id int, name string) stored by 'carbondata'")
 spark.sql("insert into person select 1,'a'")
 spark.sql("select * from person").show(false)

!3096.PNG!

 

(2) The intermediate page used to sort in adaptive encoding should be freed.

  was:
(1) The scanned record result size is taken from the default batch size. It 
should be taken from the records actually scanned.

Steps to reproduce:

spark.sql("DROP TABLE IF EXISTS person")
 spark.sql("create table person (id int, name string) stored by 'carbondata'")
 spark.sql("insert into person select 1,'a'")
 spark.sql("select * from person").show(false)

!3096.PNG!

 

(2) The intermediate page used to sort in adaptive encoding should be freed.


> Wrong records size on the input metrics
> ---
>
> Key: CARBONDATA-3096
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3096
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Minor
> Attachments: 3096.PNG
>
>  Time Spent: 9h 20m
>  Remaining Estimate: 0h
>
> (1) The scanned record result size is taken from the default batch size. It 
> should be taken from the records actually scanned.
> h1. +*Steps to reproduce:*+
> spark.sql("DROP TABLE IF EXISTS person")
>  spark.sql("create table person (id int, name string) stored by 'carbondata'")
>  spark.sql("insert into person select 1,'a'")
>  spark.sql("select * from person").show(false)
> !3096.PNG!
>  
> (2) The intermediate page used to sort in adaptive encoding should be freed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-3096) Wrong records size on the input metrics

2018-11-16 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-3096:

Summary: Wrong records size on the input metrics  (was: Wrong records size 
on the input metrics & Free the intermediate page used while adaptive encoding)

> Wrong records size on the input metrics
> ---
>
> Key: CARBONDATA-3096
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3096
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Minor
> Attachments: 3096.PNG
>
>  Time Spent: 9h 20m
>  Remaining Estimate: 0h
>
> (1) The scanned record result size is taken from the default batch size. It 
> should be taken from the records actually scanned.
> Steps to reproduce:
> spark.sql("DROP TABLE IF EXISTS person")
>  spark.sql("create table person (id int, name string) stored by 'carbondata'")
>  spark.sql("insert into person select 1,'a'")
>  spark.sql("select * from person").show(false)
> !3096.PNG!
>  
> (2) The intermediate page used to sort in adaptive encoding should be freed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-3096) Wrong records size on the input metrics & Free the intermediate page used while adaptive encoding

2018-11-16 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-3096:

Description: 
(1) The scanned record result size is taken from the default batch size. It 
should be taken from the records actually scanned.

Steps to reproduce:

spark.sql("DROP TABLE IF EXISTS person")
 spark.sql("create table person (id int, name string) stored by 'carbondata'")
 spark.sql("insert into person select 1,'a'")
 spark.sql("select * from person").show(false)

!3096.PNG!

 

(2) The intermediate page used to sort in adaptive encoding should be freed.

  was:
(1) The scanned record result size is taken from the default batch size. It 
should be taken from the records actually scanned.

Steps to reproduce:

spark.sql("DROP TABLE IF EXISTS person")
spark.sql("create table person (id int, name string) stored by 'carbondata'")
spark.sql("insert into person select 1,'a'")
spark.sql("select * from person").show(false)

 

query_id: 29127036821854 | task_id: 0 | start_time: 2018-11-16 20:22:56.573 | 
total_time: 1430ms | load_blocks_time: 100ms | load_dictionary_time: 0ms | 
carbon_scan_time: 13 | carbon_IO_time: 102 | scan_blocks_num: 1 | 
total_blocklets: 1 | valid_blocklets: 1 | total_pages: 1 | scanned_pages: 0 | 
valid_pages: 1 | result_size: +*64000*+ | key_column_filling_time: 0 | 
measure_filling_time: 0 | page_uncompress_time: 927 | result_preparation_time: 0

(2) The intermediate page used to sort in adaptive encoding should be freed.


> Wrong records size on the input metrics & Free the intermediate page used 
> while adaptive encoding
> -
>
> Key: CARBONDATA-3096
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3096
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Minor
> Attachments: 3096.PNG
>
>  Time Spent: 9h 20m
>  Remaining Estimate: 0h
>
> (1) The scanned record result size is taken from the default batch size. It 
> should be taken from the records actually scanned.
> Steps to reproduce:
> spark.sql("DROP TABLE IF EXISTS person")
>  spark.sql("create table person (id int, name string) stored by 'carbondata'")
>  spark.sql("insert into person select 1,'a'")
>  spark.sql("select * from person").show(false)
> !3096.PNG!
>  
> (2) The intermediate page used to sort in adaptive encoding should be freed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-3096) Wrong records size on the input metrics & Free the intermediate page used while adaptive encoding

2018-11-16 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-3096:

Attachment: 3096.PNG

> Wrong records size on the input metrics & Free the intermediate page used 
> while adaptive encoding
> -
>
> Key: CARBONDATA-3096
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3096
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Minor
> Attachments: 3096.PNG
>
>  Time Spent: 9h 20m
>  Remaining Estimate: 0h
>
> (1) The scanned record result size is taken from the default batch size. It 
> should be taken from the records actually scanned.
> Steps to reproduce:
> spark.sql("DROP TABLE IF EXISTS person")
> spark.sql("create table person (id int, name string) stored by 'carbondata'")
> spark.sql("insert into person select 1,'a'")
> spark.sql("select * from person").show(false)
>  
> query_id: 29127036821854 | task_id: 0 | start_time: 2018-11-16 20:22:56.573 | 
> total_time: 1430ms | load_blocks_time: 100ms | load_dictionary_time: 0ms | 
> carbon_scan_time: 13 | carbon_IO_time: 102 | scan_blocks_num: 1 | 
> total_blocklets: 1 | valid_blocklets: 1 | total_pages: 1 | scanned_pages: 0 | 
> valid_pages: 1 | result_size: +*64000*+ | key_column_filling_time: 0 | 
> measure_filling_time: 0 | page_uncompress_time: 927 | result_preparation_time: 0
> (2) The intermediate page used to sort in adaptive encoding should be freed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-3096) Wrong records size on the input metrics & Free the intermediate page used while adaptive encoding

2018-11-16 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-3096:

Description: 
(1) The scanned record result size is taken from the default batch size. It 
should be taken from the records actually scanned.

Steps to reproduce:

spark.sql("DROP TABLE IF EXISTS person")
spark.sql("create table person (id int, name string) stored by 'carbondata'")
spark.sql("insert into person select 1,'a'")
spark.sql("select * from person").show(false)

 

query_id: 29127036821854 | task_id: 0 | start_time: 2018-11-16 20:22:56.573 | 
total_time: 1430ms | load_blocks_time: 100ms | load_dictionary_time: 0ms | 
carbon_scan_time: 13 | carbon_IO_time: 102 | scan_blocks_num: 1 | 
total_blocklets: 1 | valid_blocklets: 1 | total_pages: 1 | scanned_pages: 0 | 
valid_pages: 1 | result_size: +*64000*+ | key_column_filling_time: 0 | 
measure_filling_time: 0 | page_uncompress_time: 927 | result_preparation_time: 0

(2) The intermediate page used to sort in adaptive encoding should be freed.

  was:
(1) The scanned record result size is taken from the default batch size. It 
should be taken from the records actually scanned.

(2) The intermediate page used to sort in adaptive encoding should be freed.


> Wrong records size on the input metrics & Free the intermediate page used 
> while adaptive encoding
> -
>
> Key: CARBONDATA-3096
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3096
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Minor
>  Time Spent: 9h 20m
>  Remaining Estimate: 0h
>
> (1) The scanned record result size is taken from the default batch size. It 
> should be taken from the records actually scanned.
> Steps to reproduce:
> spark.sql("DROP TABLE IF EXISTS person")
> spark.sql("create table person (id int, name string) stored by 'carbondata'")
> spark.sql("insert into person select 1,'a'")
> spark.sql("select * from person").show(false)
>  
> query_id: 29127036821854 | task_id: 0 | start_time: 2018-11-16 20:22:56.573 | 
> total_time: 1430ms | load_blocks_time: 100ms | load_dictionary_time: 0ms | 
> carbon_scan_time: 13 | carbon_IO_time: 102 | scan_blocks_num: 1 | 
> total_blocklets: 1 | valid_blocklets: 1 | total_pages: 1 | scanned_pages: 0 | 
> valid_pages: 1 | result_size: +*64000*+ | key_column_filling_time: 0 | 
> measure_filling_time: 0 | page_uncompress_time: 927 | result_preparation_time: 0
> (2) The intermediate page used to sort in adaptive encoding should be freed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-3096) Wrong records size on the input metrics & Free the intermediate page used while adaptive encoding

2018-11-13 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-3096:
---

 Summary: Wrong records size on the input metrics & Free the 
intermediate page used while adaptive encoding
 Key: CARBONDATA-3096
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3096
 Project: CarbonData
  Issue Type: Bug
Reporter: dhatchayani
Assignee: dhatchayani


(1) The scanned record result size is taken from the default batch size. It 
should be taken from the records actually scanned.

(2) The intermediate page used to sort in adaptive encoding should be freed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-3023) Alter add column issue with SORT_COLUMNS

2018-10-17 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-3023:

Summary: Alter add column issue with SORT_COLUMNS  (was: Alter add column 
issue with reading a row)

> Alter add column issue with SORT_COLUMNS
> 
>
> Key: CARBONDATA-3023
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3023
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Minor
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-3023) Alter add column issue with reading a row

2018-10-17 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-3023:
---

 Summary: Alter add column issue with reading a row
 Key: CARBONDATA-3023
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3023
 Project: CarbonData
  Issue Type: Improvement
Reporter: dhatchayani
Assignee: dhatchayani






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-3022) Refactor ColumnPageWrapper

2018-10-17 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-3022:
---

 Summary: Refactor ColumnPageWrapper
 Key: CARBONDATA-3022
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3022
 Project: CarbonData
  Issue Type: Improvement
Reporter: dhatchayani
Assignee: dhatchayani






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2998) Refresh column schema for old store(before V3) for SORT_COLUMNS option

2018-10-09 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-2998:

Summary: Refresh column schema for old store(before V3) for SORT_COLUMNS 
option  (was: Refresh column schema for old store for SORT_COLUMNS option)

> Refresh column schema for old store(before V3) for SORT_COLUMNS option
> --
>
> Key: CARBONDATA-2998
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2998
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2998) Refresh column schema for old store for SORT_COLUMNS option

2018-10-09 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-2998:
---

 Summary: Refresh column schema for old store for SORT_COLUMNS 
option
 Key: CARBONDATA-2998
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2998
 Project: CarbonData
  Issue Type: Bug
Reporter: dhatchayani
Assignee: dhatchayani






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2975) DefaultValue choosing and removeNullValues on range filters is incorrect

2018-09-26 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-2975:

Summary: DefaultValue choosing and removeNullValues on range filters is 
incorrect  (was: DefaultValue choosing and removeNullValues on range filters is 
inorrect)

> DefaultValue choosing and removeNullValues on range filters is incorrect
> 
>
> Key: CARBONDATA-2975
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2975
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2975) DefaultValue choosing and removeNullValues on range filters is inorrect

2018-09-26 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-2975:
---

 Summary: DefaultValue choosing and removeNullValues on range 
filters is inorrect
 Key: CARBONDATA-2975
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2975
 Project: CarbonData
  Issue Type: Bug
Reporter: dhatchayani






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2947) Adaptive encoding support for timestamp no dictionary and Refactor ColumnPageWrapper

2018-09-24 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-2947:

Summary: Adaptive encoding support for timestamp no dictionary and Refactor 
ColumnPageWrapper  (was: Adaptive encoding support for timestamp no dictionary)

> Adaptive encoding support for timestamp no dictionary and Refactor 
> ColumnPageWrapper
> 
>
> Key: CARBONDATA-2947
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2947
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Minor
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2946) Unify conversion while writing to Bloom

2018-09-24 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-2946:

Summary: Unify conversion while writing to Bloom  (was: Unify conversion 
while writing to Bloom and Refactor ColumnPageWrapper)

> Unify conversion while writing to Bloom
> ---
>
> Key: CARBONDATA-2946
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2946
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Major
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2946) Unify conversion while writing to Bloom and Refactor ColumnPageWrapper

2018-09-24 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-2946:

Summary: Unify conversion while writing to Bloom and Refactor 
ColumnPageWrapper  (was: Bloom filter backward compatibility with adaptive 
encoding and Refactor)

> Unify conversion while writing to Bloom and Refactor ColumnPageWrapper
> --
>
> Key: CARBONDATA-2946
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2946
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Major
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2947) Adaptive encoding support for timestamp no dictionary

2018-09-19 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-2947:
---

 Summary: Adaptive encoding support for timestamp no dictionary
 Key: CARBONDATA-2947
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2947
 Project: CarbonData
  Issue Type: Improvement
Reporter: dhatchayani
Assignee: dhatchayani






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2946) Bloom filter backward compatibility with adaptive encoding and Refactor

2018-09-19 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-2946:
---

 Summary: Bloom filter backward compatibility with adaptive 
encoding and Refactor
 Key: CARBONDATA-2946
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2946
 Project: CarbonData
  Issue Type: Bug
Reporter: dhatchayani
Assignee: dhatchayani






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2896) Adaptive encoding for primitive data types

2018-08-27 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-2896:

Description: 
Currently, Encoding and Decoding are present only for Dictionary and Measure 
Columns, but for no-dictionary Primitive types encoding is *absent.*

*Encoding is a technique used to reduce the storage size; after encoding, the 
result is compressed with snappy compression to further reduce the storage 
size.*

*With this feature, we support encoding on no-dictionary primitive data types 
as well.*
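
A self-contained Scala sketch of the general adaptive-encoding idea 
(illustrative only, not CarbonData's actual codec code): store each value as 
value - min(values) in the narrowest integral type that fits, then compress 
the result.

object AdaptiveEncodingSketch {
  // Choose the narrowest stored type that can hold (value - min) for all values.
  def chooseStoredType(values: Array[Long]): String = {
    val maxDelta = values.max - values.min // values stored as (value - min)
    if (maxDelta <= Byte.MaxValue) "BYTE"
    else if (maxDelta <= Short.MaxValue) "SHORT"
    else if (maxDelta <= Int.MaxValue) "INT"
    else "LONG"
  }

  def main(args: Array[String]): Unit = {
    // Nearby timestamps: 8 bytes each raw, but their deltas fit in one byte.
    val ts = Array(1535328000000L, 1535328000005L, 1535328000120L)
    println(chooseStoredType(ts)) // BYTE
  }
}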

> Adaptive encoding for primitive data types
> --
>
> Key: CARBONDATA-2896
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2896
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Major
>
> Currently, Encoding and Decoding are present only for Dictionary and Measure 
> Columns, but for no-dictionary Primitive types encoding is *absent.*
> *Encoding is a technique used to reduce the storage size; after encoding, the 
> result is compressed with snappy compression to further reduce the storage 
> size.*
> *With this feature, we support encoding on no-dictionary primitive data types 
> as well.*



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2896) Adaptive encoding for primitive data types

2018-08-27 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-2896:

Description: 
Currently, Encoding and Decoding are present only for Dictionary and Measure 
Columns, but for no-dictionary Primitive types encoding is *absent.*

Encoding is a technique used to reduce the storage size; after encoding, the 
result is compressed with snappy compression to further reduce the storage 
size.

With this feature, we support encoding on no-dictionary primitive data types 
as well.

  was:
Currently, encoding and decoding are present only for dictionary and measure 
columns; for no-dictionary primitive types, encoding is *absent.*

*Encoding is a technique used to reduce the storage size; after encoding, the 
result is compressed with snappy compression to reduce the storage size 
further.*

*With this feature, encoding is also supported for the no-dictionary primitive 
data types.*


> Adaptive encoding for primitive data types
> --
>
> Key: CARBONDATA-2896
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2896
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Major
>
> Currently, encoding and decoding are present only for dictionary and measure 
> columns; for no-dictionary primitive types, encoding is *absent.*
> Encoding is a technique used to reduce the storage size; after encoding, the 
> result is compressed with snappy compression to reduce the storage size 
> further.
> With this feature, encoding is also supported for the no-dictionary 
> primitive data types.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2896) Adaptive encoding for primitive data types

2018-08-27 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-2896:
---

 Summary: Adaptive encoding for primitive data types
 Key: CARBONDATA-2896
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2896
 Project: CarbonData
  Issue Type: New Feature
Reporter: dhatchayani
Assignee: dhatchayani






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2829) Fix creating merge index on older V1 V2 store

2018-08-06 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-2829:
---

 Summary: Fix creating merge index on older V1 V2 store
 Key: CARBONDATA-2829
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2829
 Project: CarbonData
  Issue Type: Improvement
Reporter: dhatchayani
Assignee: dhatchayani


Block creating merge index on older V1/V2 store versions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2812) Implement freeMemory for complex pages

2018-08-01 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-2812:

Summary: Implement freeMemory for complex pages   (was: Implement free 
memory for complex pages )

> Implement freeMemory for complex pages 
> ---
>
> Key: CARBONDATA-2812
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2812
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2812) Implement free memory for complex pages

2018-08-01 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-2812:
---

 Summary: Implement free memory for complex pages 
 Key: CARBONDATA-2812
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2812
 Project: CarbonData
  Issue Type: Bug
Reporter: dhatchayani
Assignee: dhatchayani






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2808) Insert into select is crashing as both are sharing the same task context

2018-07-31 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-2808:
---

 Summary: Insert into select is crashing as both are sharing the 
same task context
 Key: CARBONDATA-2808
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2808
 Project: CarbonData
  Issue Type: Bug
Reporter: dhatchayani
Assignee: dhatchayani


Insert into select is failing because both the insert and the select run as 
the same task, sharing the same TaskContext, and resources are cleared once 
either RDD's task completes.
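A minimal sketch of the failure mode in plain Java, with a shared listener 
list standing in for Spark's TaskContext (all names here are illustrative 
assumptions, not the actual Spark or CarbonData API):

// When the select (read) and insert (write) sides register cleanup on the
// same task context, whichever side completes first fires all listeners and
// frees resources the other side is still using.
import java.util.ArrayList;
import java.util.List;

final class SharedTaskContextSketch {
  static final List<Runnable> completionListeners = new ArrayList<>();
  static boolean readerBuffersFreed = false;

  public static void main(String[] args) {
    completionListeners.add(() -> readerBuffersFreed = true); // reader cleanup
    completionListeners.forEach(Runnable::run); // write side completes first
    if (readerBuffersFreed) {
      // The still-running read side now sees freed buffers and crashes.
      System.out.println("reader's resources were freed mid-scan");
    }
  }
}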



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2714) Support merge index files for the segment

2018-07-10 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-2714:

Description: 
We have already discussed the advantages of merge index in the community.  
Please find the link below.

[http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Discussion-Merging-carbonindex-files-for-each-segments-and-across-segments-td24441.html]

But the feature is not complete and has gaps; for example, it is not supported 
for some features such as pre-aggregate tables and streaming tables.

In this JIRA, the merge index feature will be completed by supporting all the 
impacted existing features.
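A rough sketch of the mechanics in plain Java (the merge-file name and record 
layout are assumptions for illustration, not the actual CarbonData index 
format):

// Merge the per-block .carbonindex files of a segment into one file, so a
// query driver opens a single file instead of one call per block.
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class MergeIndexSketch {
  public static void main(String[] args) throws IOException {
    Path segmentDir = Paths.get(args[0]);
    Path merged = segmentDir.resolve("mergeFile.carbonindexmerge"); // assumed name
    try (DataOutputStream out =
             new DataOutputStream(Files.newOutputStream(merged));
         DirectoryStream<Path> indexes =
             Files.newDirectoryStream(segmentDir, "*.carbonindex")) {
      for (Path index : indexes) {
        byte[] payload = Files.readAllBytes(index);
        out.writeUTF(index.getFileName().toString()); // keep origin for lookup
        out.writeInt(payload.length);
        out.write(payload);                           // raw index bytes
      }
    }
    // A real implementation would update the segment file and delete the
    // originals only after the merge commits.
  }
}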

> Support merge index files for the segment
> -
>
> Key: CARBONDATA-2714
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2714
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Major
>
> We have already discussed the advantages of merge index in the community.  
> Please find the link below.
> [http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Discussion-Merging-carbonindex-files-for-each-segments-and-across-segments-td24441.html]
> But the feature is not complete and has gaps; for example, it is not 
> supported for some features such as pre-aggregate tables and streaming tables.
> In this JIRA, the merge index feature will be completed by supporting all 
> the impacted existing features.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2714) Support merge index files for the segment

2018-07-10 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-2714:
---

 Summary: Support merge index files for the segment
 Key: CARBONDATA-2714
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2714
 Project: CarbonData
  Issue Type: Sub-task
Reporter: dhatchayani
Assignee: dhatchayani






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2704) Index file size in describe formatted command is not updated correctly with the segment file

2018-07-06 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-2704:

Summary: Index file size in describe formatted command is not updated 
correctly with the segment file  (was: Index file size in describe formatted 
command is not updated correctly according to the segment file)

> Index file size in describe formatted command is not updated correctly with 
> the segment file
> 
>
> Key: CARBONDATA-2704
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2704
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2704) Index file size in describe formatted command is not updated correctly according to the segment file

2018-07-06 Thread dhatchayani (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-2704:

Summary: Index file size in describe formatted command is not updated 
correctly according to the segment file  (was: Index file size in describe 
formatted command is not updated correctly)

> Index file size in describe formatted command is not updated correctly 
> according to the segment file
> 
>
> Key: CARBONDATA-2704
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2704
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2704) Index file size in describe formatted command is not updated correctly

2018-07-06 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-2704:
---

 Summary: Index file size in describe formatted command is not 
updated correctly
 Key: CARBONDATA-2704
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2704
 Project: CarbonData
  Issue Type: Bug
Reporter: dhatchayani
Assignee: dhatchayani






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2571) Calculating the carbonindex and carbondata file size of a table is wrong

2018-06-01 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-2571:
---

 Summary: Calculating the carbonindex and carbondata file size of a 
table is wrong
 Key: CARBONDATA-2571
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2571
 Project: CarbonData
  Issue Type: Bug
Reporter: dhatchayani
Assignee: dhatchayani






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2482) Pass uuid while writing segment file if possible

2018-05-15 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-2482:
---

 Summary: Pass uuid while writing segment file if possible
 Key: CARBONDATA-2482
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2482
 Project: CarbonData
  Issue Type: Improvement
Reporter: dhatchayani
Assignee: dhatchayani






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2470) Refactor AlterTableCompactionPostStatusUpdateEvent usage in compaction flow

2018-05-10 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-2470:
---

 Summary: Refactor AlterTableCompactionPostStatusUpdateEvent usage 
in compaction flow
 Key: CARBONDATA-2470
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2470
 Project: CarbonData
  Issue Type: Improvement
Reporter: dhatchayani
Assignee: dhatchayani


AlterTableCompactionPostStatusUpdateEvent in the compaction flow is controlled 
only by the pre-aggregate listener. If the CommitPreAggregateListener sets the 
commitComplete property to true, this event will not be fired for the next 
iteration.
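A rough sketch of the suspected behaviour in plain Java (the operation-context 
map and property handling are modeled on the description above, not the actual 
listener API):

// A per-operation property that is never reset leaks into the next
// compaction iteration and suppresses the event from then on.
import java.util.HashMap;
import java.util.Map;

final class EventGateSketch {
  public static void main(String[] args) {
    Map<String, String> operationContext = new HashMap<>();
    for (int iteration = 1; iteration <= 2; iteration++) {
      boolean commitComplete = Boolean.parseBoolean(
          operationContext.getOrDefault("commitComplete", "false"));
      if (!commitComplete) {
        System.out.println("firing PostStatusUpdateEvent, iteration " + iteration);
        operationContext.put("commitComplete", "true"); // listener sets it...
      }
      // ...and nothing clears it, so iteration 2 never fires the event.
    }
  }
}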



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2467) Null is printed in the SDK writer logs for operations logged

2018-05-10 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-2467:
---

 Summary: Null is printed in the SDK writer logs for operations 
logged
 Key: CARBONDATA-2467
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2467
 Project: CarbonData
  Issue Type: Improvement
Reporter: dhatchayani
Assignee: Rahul Kumar


Expected Output: Null should not be printed in the SDK writer logs.

Actual Output: Null is printed in the SDK writer logs for operations logged as 
shown below. This is confusing for the user.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2458) Remove unnecessary TableProvider interface

2018-05-08 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-2458:
---

 Summary: Remove unnecessary TableProvider interface
 Key: CARBONDATA-2458
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2458
 Project: CarbonData
  Issue Type: Improvement
Reporter: dhatchayani
Assignee: dhatchayani






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2448) Adding compacted segments to load and alter events

2018-05-06 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-2448:
---

 Summary: Adding compacted segments to load and alter events
 Key: CARBONDATA-2448
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2448
 Project: CarbonData
  Issue Type: Improvement
Reporter: dhatchayani






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2362) Changing the Cacheable object from DataMap to Wrapper

2018-04-19 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-2362:
---

 Summary: Changing the Cacheable object from DataMap to Wrapper
 Key: CARBONDATA-2362
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2362
 Project: CarbonData
  Issue Type: Improvement
Reporter: dhatchayani
Assignee: dhatchayani






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2310) Refactored code to improve Distributable interface

2018-04-02 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-2310:
---

 Summary: Refactored code to improve Distributable interface
 Key: CARBONDATA-2310
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2310
 Project: CarbonData
  Issue Type: Improvement
Reporter: dhatchayani
Assignee: dhatchayani






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (CARBONDATA-2265) [DFX]-Load]: Load job fails if 1 folder contains 1000 files

2018-03-20 Thread dhatchayani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani reassigned CARBONDATA-2265:
---

Assignee: dhatchayani

> [DFX]-Load]: Load job fails if 1 folder contains 1000 files 
> 
>
> Key: CARBONDATA-2265
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2265
> Project: CarbonData
>  Issue Type: Bug
> Environment: 3 node ant cluster
>Reporter: Ajeet Rai
>Assignee: dhatchayani
>Priority: Major
>  Labels: DFX
>
> Load job fails if 1 folder contains 1000 files. 
>  【Precondition】:Thrift server should be running
>  【Test step】: 
>  1: Create a carbon table
>  2: Start a load where 1 folder contains 1000 files
>  3: Observe that load fails
>  
> Observe that Out of Memory exception is thrown.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2223) Adding Listener Support for Partition

2018-03-19 Thread dhatchayani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-2223:

Summary: Adding Listener Support for Partition  (was: Remove unused 
listeners)

> Adding Listener Support for Partition
> -
>
> Key: CARBONDATA-2223
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2223
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Minor
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> Remove unused listeners



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2223) Remove unused listeners

2018-03-05 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-2223:
---

 Summary: Remove unused listeners
 Key: CARBONDATA-2223
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2223
 Project: CarbonData
  Issue Type: Improvement
Reporter: dhatchayani
Assignee: dhatchayani


Remove unused listeners



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-2125) like% filter is giving ArrayIndexOutOfBoundException in case of table having more pages

2018-03-05 Thread dhatchayani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani resolved CARBONDATA-2125.
-
   Resolution: Fixed
Fix Version/s: 1.3.0

> like% filter is giving ArrayIndexOutOfBoundException in case of table having 
> more pages
> ---
>
> Key: CARBONDATA-2125
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2125
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Major
> Fix For: 1.3.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
> java.lang.ArrayIndexOutOfBoundsException: 1
>  at 
> org.apache.carbondata.core.scan.processor.AbstractDataBlockIterator.close(AbstractDataBlockIterator.java:247)
>  at 
> org.apache.carbondata.core.scan.result.iterator.AbstractDetailQueryResultIterator.close(AbstractDetailQueryResultIterator.java:307)
>  at 
> org.apache.carbondata.core.scan.executor.impl.AbstractQueryExecutor.finish(AbstractQueryExecutor.java:590)
>  at 
> org.apache.carbondata.spark.vectorreader.VectorizedCarbonRecordReader.close(VectorizedCarbonRecordReader.java:162)
>  at 
> org.apache.carbondata.spark.rdd.CarbonScanRDD$$anon$1$$anonfun$17.apply(CarbonScanRDD.scala:385)
>  at 
> org.apache.carbondata.spark.rdd.CarbonScanRDD$$anon$1$$anonfun$17.apply(CarbonScanRDD.scala:384)
>  at 
> org.apache.spark.TaskContext$$anon$1.onTaskCompletion(TaskContext.scala:128)
>  at 
> org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:117)
>  at 
> org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:117)
>  at 
> org.apache.spark.TaskContextImpl$$anonfun$invokeListeners$1.apply(TaskContextImpl.scala:130)
>  at 
> org.apache.spark.TaskContextImpl$$anonfun$invokeListeners$1.apply(TaskContextImpl.scala:128)
>  at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>  at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>  at 
> org.apache.spark.TaskContextImpl.invokeListeners(TaskContextImpl.scala:128)
>  at 
> org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:116)
>  at org.apache.spark.scheduler.Task.run(Task.scala:109)
>  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:325)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.concurrent.ExecutionException: 
> java.lang.ArrayIndexOutOfBoundsException: 1
>  at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>  at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>  at 
> org.apache.carbondata.core.scan.processor.AbstractDataBlockIterator.close(AbstractDataBlockIterator.java:242)
>  ... 19 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
>  at 
> org.apache.carbondata.core.scan.filter.executer.RowLevelFilterExecuterImpl.applyFilter(RowLevelFilterExecuterImpl.java:225)
>  at 
> org.apache.carbondata.core.scan.scanner.impl.FilterScanner.fillScannedResult(FilterScanner.java:168)
>  at 
> org.apache.carbondata.core.scan.scanner.impl.FilterScanner.scanBlocklet(FilterScanner.java:100)
>  at 
> org.apache.carbondata.core.scan.processor.AbstractDataBlockIterator$1.call(AbstractDataBlockIterator.java:201)
>  at 
> org.apache.carbondata.core.scan.processor.AbstractDataBlockIterator$1.call(AbstractDataBlockIterator.java:188)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  ... 3 more



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2131) Alter table adding long datatype is failing but Create table with long type is successful, in Spark 2.1

2018-02-05 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-2131:
---

 Summary: Alter table adding long datatype is failing but Create 
table with long type is successful, in Spark 2.1
 Key: CARBONDATA-2131
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2131
 Project: CarbonData
  Issue Type: Bug
Reporter: dhatchayani
Assignee: dhatchayani


create table test4(a1 int) stored by 'carbondata';
 +-+--+
 | Result  |
 +-+--+
 +-+--+
 No rows selected (1.757 seconds)

 

*alter table test4 add columns (a6 long);*

 *Error: java.lang.RuntimeException*:
 BaseSqlParser == Parse1 ==

 

Operation not allowed: alter table add columns(line 1, pos 0)

 

== SQL ==
 alter table test4 add columns (a6 long)
 ^^^

 

== Parse2 ==
 [1.35] failure: identifier matching regex (?i)VARCHAR expected

 

alter table test4 add columns (a6 long)
   ^;
 CarbonSqlParser [1.35] failure: identifier matching regex (?i)VARCHAR 
expected

 

alter table test4 add columns (a6 long)
   ^ (state=,code=0)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2125) like% filter is giving ArrayIndexOutOfBoundException in case of table having more pages

2018-02-03 Thread dhatchayani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-2125:

Description: 
java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
java.lang.ArrayIndexOutOfBoundsException: 1
 at 
org.apache.carbondata.core.scan.processor.AbstractDataBlockIterator.close(AbstractDataBlockIterator.java:247)
 at 
org.apache.carbondata.core.scan.result.iterator.AbstractDetailQueryResultIterator.close(AbstractDetailQueryResultIterator.java:307)
 at 
org.apache.carbondata.core.scan.executor.impl.AbstractQueryExecutor.finish(AbstractQueryExecutor.java:590)
 at 
org.apache.carbondata.spark.vectorreader.VectorizedCarbonRecordReader.close(VectorizedCarbonRecordReader.java:162)
 at 
org.apache.carbondata.spark.rdd.CarbonScanRDD$$anon$1$$anonfun$17.apply(CarbonScanRDD.scala:385)
 at 
org.apache.carbondata.spark.rdd.CarbonScanRDD$$anon$1$$anonfun$17.apply(CarbonScanRDD.scala:384)
 at org.apache.spark.TaskContext$$anon$1.onTaskCompletion(TaskContext.scala:128)
 at 
org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:117)
 at 
org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:117)
 at 
org.apache.spark.TaskContextImpl$$anonfun$invokeListeners$1.apply(TaskContextImpl.scala:130)
 at 
org.apache.spark.TaskContextImpl$$anonfun$invokeListeners$1.apply(TaskContextImpl.scala:128)
 at 
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
 at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
 at org.apache.spark.TaskContextImpl.invokeListeners(TaskContextImpl.scala:128)
 at 
org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:116)
 at org.apache.spark.scheduler.Task.run(Task.scala:109)
 at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:325)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.ExecutionException: 
java.lang.ArrayIndexOutOfBoundsException: 1
 at java.util.concurrent.FutureTask.report(FutureTask.java:122)
 at java.util.concurrent.FutureTask.get(FutureTask.java:192)
 at 
org.apache.carbondata.core.scan.processor.AbstractDataBlockIterator.close(AbstractDataBlockIterator.java:242)
 ... 19 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 1
 at 
org.apache.carbondata.core.scan.filter.executer.RowLevelFilterExecuterImpl.applyFilter(RowLevelFilterExecuterImpl.java:225)
 at 
org.apache.carbondata.core.scan.scanner.impl.FilterScanner.fillScannedResult(FilterScanner.java:168)
 at 
org.apache.carbondata.core.scan.scanner.impl.FilterScanner.scanBlocklet(FilterScanner.java:100)
 at 
org.apache.carbondata.core.scan.processor.AbstractDataBlockIterator$1.call(AbstractDataBlockIterator.java:201)
 at 
org.apache.carbondata.core.scan.processor.AbstractDataBlockIterator$1.call(AbstractDataBlockIterator.java:188)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 ... 3 more

> like% filter is giving ArrayIndexOutOfBoundException in case of table having 
> more pages
> ---
>
> Key: CARBONDATA-2125
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2125
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Major
>
> java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
> java.lang.ArrayIndexOutOfBoundsException: 1
>  at 
> org.apache.carbondata.core.scan.processor.AbstractDataBlockIterator.close(AbstractDataBlockIterator.java:247)
>  at 
> org.apache.carbondata.core.scan.result.iterator.AbstractDetailQueryResultIterator.close(AbstractDetailQueryResultIterator.java:307)
>  at 
> org.apache.carbondata.core.scan.executor.impl.AbstractQueryExecutor.finish(AbstractQueryExecutor.java:590)
>  at 
> org.apache.carbondata.spark.vectorreader.VectorizedCarbonRecordReader.close(VectorizedCarbonRecordReader.java:162)
>  at 
> org.apache.carbondata.spark.rdd.CarbonScanRDD$$anon$1$$anonfun$17.apply(CarbonScanRDD.scala:385)
>  at 
> org.apache.carbondata.spark.rdd.CarbonScanRDD$$anon$1$$anonfun$17.apply(CarbonScanRDD.scala:384)
>  at 
> org.apache.spark.TaskContext$$anon$1.onTaskCompletion(TaskContext.scala:128)
>  at 
> org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:117)
>  at 
> org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:117)
>  at 
> org.apache.spark.TaskContextImpl$$anonfun$invokeListeners$1.apply(TaskContextImpl.scala:130)
>  at 
> 

[jira] [Created] (CARBONDATA-2125) like% filter is giving ArrayIndexOutOfBoundException in case of table having more pages

2018-02-03 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-2125:
---

 Summary: like% filter is giving ArrayIndexOutOfBoundException in 
case of table having more pages
 Key: CARBONDATA-2125
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2125
 Project: CarbonData
  Issue Type: Bug
Reporter: dhatchayani
Assignee: dhatchayani






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-1918) Incorrect data is displayed when String is updated using Sentences

2018-01-31 Thread dhatchayani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-1918:

Description: 
update t_carbn01 set (active_status)= (sentences('Hello there! How are you?'));
+-+--+
| Result  |
+-+--+
+-+--+
No rows selected (2.784 seconds)
select active_status from t_carbn01;
+-+--+
|  active_status  |
+-+--+
*| Hello\:there\\$How\:are\:you\\  |*
*| Hello\:there\\$How\:are\:you\\  |*
*| Hello\:there\\$How\:are\:you\\  |*
*| Hello\:there\\$How\:are\:you\\  |*
*| Hello\:there\\$How\:are\:you\\  |*
*| Hello\:there\\$How\:are\:you\\  |*
*| Hello\:there\\$How\:are\:you\\  |*
*| Hello\:there\\$How\:are\:you\\  |*
*| Hello\:there\\$How\:are\:you\\  |*
*| Hello\:there\\$How\:are\:you\\  |*
+-+--+

 

The same issue seen with the sentences function also occurs when the below 
update is performed.
  update t_carbn01 set (active_status)= (split('ab', 'a'));

> Incorrect data is displayed when String is updated using Sentences
> --
>
> Key: CARBONDATA-1918
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1918
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Minor
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> update t_carbn01 set (active_status)= (sentences('Hello there! How are 
> you?'));
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (2.784 seconds)
> select active_status from t_carbn01;
> +-+--+
> |  active_status  |
> +-+--+
> *| Hello\:there\\$How\:are\:you\\  |*
> *| Hello\:there\\$How\:are\:you\\  |*
> *| Hello\:there\\$How\:are\:you\\  |*
> *| Hello\:there\\$How\:are\:you\\  |*
> *| Hello\:there\\$How\:are\:you\\  |*
> *| Hello\:there\\$How\:are\:you\\  |*
> *| Hello\:there\\$How\:are\:you\\  |*
> *| Hello\:there\\$How\:are\:you\\  |*
> *| Hello\:there\\$How\:are\:you\\  |*
> *| Hello\:there\\$How\:are\:you\\  |*
> +-+--+
>  
> The same issue seen with the sentences function also occurs when the below 
> update is performed.
>   update t_carbn01 set (active_status)= (split('ab', 'a'));



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (CARBONDATA-1917) While loading, check for stale dictionary files

2018-01-31 Thread dhatchayani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani closed CARBONDATA-1917.
---
Resolution: Invalid

> While loading, check for stale dictionary files
> ---
>
> Key: CARBONDATA-1917
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1917
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Minor
>  Time Spent: 6h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2106) Update product document with page level reader property

2018-01-30 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-2106:
---

 Summary: Update product document with page level reader property
 Key: CARBONDATA-2106
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2106
 Project: CarbonData
  Issue Type: Task
Reporter: dhatchayani
Assignee: Gururaj Shetty






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2064) Add compaction listener

2018-01-22 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-2064:
---

 Summary: Add compaction listener
 Key: CARBONDATA-2064
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2064
 Project: CarbonData
  Issue Type: Improvement
Reporter: dhatchayani
Assignee: dhatchayani






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2061) Check for only valid IN_PROGRESS segments

2018-01-22 Thread dhatchayani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-2061:

Description: While checking for IN_PROGRESS segments of a table during 
other operations, we should check only for valid IN_PROGRESS segments. Some 
segments may be invalid (for example, cancelled loads) yet still be in the 
IN_PROGRESS state; those segments should be considered stale.  (was: While 
checking for IN_PROGRESS segments of a table during other operations, we 
should check only for valid IN_PROGRESS segments. Some segments may be invalid 
and still in the IN_PROGRESS state; they should be considered stale segments.)
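A minimal sketch of the intended check in plain Java (the lock probe is a 
hypothetical helper; the assumed rule is that a live load still holds its 
SEGMENT_LOCK):

// An IN_PROGRESS entry in tablestatus is only valid while the loading
// session still holds the segment lock; otherwise the segment is stale.
final class InProgressCheckSketch {
  enum Status { SUCCESS, IN_PROGRESS, MARKED_FOR_DELETE }

  interface LockProbe { boolean isHeld(String segmentId); } // assumed helper

  static boolean isValidInProgress(String segmentId, Status status, LockProbe locks) {
    return status == Status.IN_PROGRESS && locks.isHeld(segmentId);
  }

  public static void main(String[] args) {
    LockProbe noLocksHeld = segmentId -> false; // e.g. the load was cancelled
    System.out.println(
        isValidInProgress("3", Status.IN_PROGRESS, noLocksHeld)); // false -> stale
  }
}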

> Check for only valid IN_PROGRESS segments
> -
>
> Key: CARBONDATA-2061
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2061
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Major
>
> While checking for IN_PROGRESS segments of a table during other operations, 
> we should check only for valid IN_PROGRESS segments. Some segments may be 
> invalid (for example, cancelled loads) yet still be in the IN_PROGRESS 
> state; those segments should be considered stale.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2061) Check for only valid IN_PROGRESS segments

2018-01-21 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-2061:
---

 Summary: Check for only valid IN_PROGRESS segments
 Key: CARBONDATA-2061
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2061
 Project: CarbonData
  Issue Type: Bug
Reporter: dhatchayani
Assignee: dhatchayani


While checking for IN_PROGRESS segments of a table during other operations, we 
should check only for valid IN_PROGRESS segments. Some segments may be invalid 
and still in the IN_PROGRESS state; they should be considered stale segments.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2015) Restricted maximum length of bytes per column

2018-01-10 Thread dhatchayani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-2015:

Description: 
Validation for the number of bytes per column is added.

We have limited the number of characters per column to 32,000.
For example, a single unicode character can take 3 bytes in UTF-8. So if a 
column has 30,000 such characters, that is 30,000 * 3 = 90,000 bytes, which 
exceeds the signed short range (32,767), and the load will fail.
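A worked sketch of the validation in plain Java (the signed-short length field 
is an assumption drawn from the description; String.repeat requires Java 11+):

// Reject a column value whose UTF-8 byte length overflows a signed short.
import java.nio.charset.StandardCharsets;

final class ColumnLengthCheckSketch {
  static void validate(String value) {
    int bytes = value.getBytes(StandardCharsets.UTF_8).length;
    if (bytes > Short.MAX_VALUE) { // 32,767
      throw new IllegalArgumentException(
          "column value needs " + bytes + " bytes, limit is " + Short.MAX_VALUE);
    }
  }

  public static void main(String[] args) {
    try {
      // 30,000 three-byte characters -> 90,000 bytes, well past 32,767.
      validate("\u4e2d".repeat(30_000));
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage()); // the load would fail with this reason
    }
  }
}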

> Restricted maximum length of bytes per column
> -
>
> Key: CARBONDATA-2015
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2015
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Validation for the number of bytes per column is added.
> We have limited the number of characters per column to 32,000.
> For example, a single unicode character can take 3 bytes in UTF-8. So if a 
> column has 30,000 such characters, that is 30,000 * 3 = 90,000 bytes, which 
> exceeds the signed short range (32,767), and the load will fail.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (CARBONDATA-2015) Restricted maximum length of bytes per column

2018-01-10 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-2015:
---

 Summary: Restricted maximum length of bytes per column
 Key: CARBONDATA-2015
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2015
 Project: CarbonData
  Issue Type: Bug
Reporter: dhatchayani
Assignee: dhatchayani
Priority: Minor






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (CARBONDATA-1975) Wrong input metrics displayed for carbon

2018-01-04 Thread dhatchayani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-1975:

Attachment: Corrected_Data.JPG

> Wrong input metrics displayed for carbon
> 
>
> Key: CARBONDATA-1975
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1975
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Minor
> Fix For: 1.3.0
>
> Attachments: Corrected_Data.JPG, Wrong_Data.JPG, beeline.JPG
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Input metrics are updated twice: the record count is counted twice and is 
> therefore displayed wrongly in the Spark UI.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (CARBONDATA-1975) Wrong input metrics displayed for carbon

2018-01-04 Thread dhatchayani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-1975:

Attachment: Wrong_Data.JPG

> Wrong input metrics displayed for carbon
> 
>
> Key: CARBONDATA-1975
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1975
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Minor
> Fix For: 1.3.0
>
> Attachments: Wrong_Data.JPG, beeline.JPG
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Input metrics are updated twice: the record count is counted twice and is 
> therefore displayed wrongly in the Spark UI.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (CARBONDATA-1975) Wrong input metrics displayed for carbon

2018-01-03 Thread dhatchayani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-1975:

 Attachment: beeline.JPG
Description: 
Input metrics are updated twice: the record count is counted twice and is 
therefore displayed wrongly in the Spark UI.



  was:Input metrics are updated twice: the record count is counted twice and 
is therefore displayed wrongly in the Spark UI.


> Wrong input metrics displayed for carbon
> 
>
> Key: CARBONDATA-1975
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1975
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Minor
> Attachments: beeline.JPG
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Input metrics are updated twice: the record count is counted twice and is 
> therefore displayed wrongly in the Spark UI.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (CARBONDATA-1975) Wrong input metrics displayed for carbon

2018-01-03 Thread dhatchayani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-1975:

Description: Input metrics are updated twice: the record count is counted 
twice and is therefore displayed wrongly in the Spark UI.  (was: Input metrics 
is updated twice)

> Wrong input metrics displayed for carbon
> 
>
> Key: CARBONDATA-1975
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1975
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Minor
>
> Input metrics are updated twice: the record count is counted twice and is 
> therefore displayed wrongly in the Spark UI.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (CARBONDATA-1975) Wrong input metrics displayed for carbon

2018-01-03 Thread dhatchayani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-1975:

Attachment: (was: metrics2.JPG)

> Wrong input metrics displayed for carbon
> 
>
> Key: CARBONDATA-1975
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1975
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Minor
>
> Input metrics are updated twice



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (CARBONDATA-1975) Wrong input metrics displayed for carbon

2018-01-03 Thread dhatchayani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-1975:

Description: Input metrics are updated twice

> Wrong input metrics displayed for carbon
> 
>
> Key: CARBONDATA-1975
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1975
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Minor
> Attachments: metrics2.JPG
>
>
> Input metrics are updated twice



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (CARBONDATA-1975) Wrong input metrics displayed for carbon

2018-01-03 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-1975:
---

 Summary: Wrong input metrics displayed for carbon
 Key: CARBONDATA-1975
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1975
 Project: CarbonData
  Issue Type: Bug
Reporter: dhatchayani
Assignee: dhatchayani
Priority: Minor
 Attachments: metrics2.JPG





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (CARBONDATA-1939) Added show segments validation test case

2017-12-25 Thread dhatchayani (JIRA)
dhatchayani created CARBONDATA-1939:
---

 Summary: Added show segments validation test case
 Key: CARBONDATA-1939
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1939
 Project: CarbonData
  Issue Type: Improvement
Reporter: dhatchayani
Assignee: dhatchayani
Priority: Minor


(1) Modified the headers of show segments
(2) Modified SDV test cases to validate the headers and results



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (CARBONDATA-1824) Carbon 1.3.0 - Spark 2.2-Residual segment files left over when load failure happens

2017-12-22 Thread dhatchayani (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16301084#comment-16301084
 ] 

dhatchayani commented on CARBONDATA-1824:
-

Please resolve this issue, as it is already fixed by CARBONDATA-1759.

> Carbon 1.3.0 - Spark 2.2-Residual segment files left over when load failure 
> happens
> ---
>
> Key: CARBONDATA-1824
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1824
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.3.0
> Environment: Test - 3 node ant cluster
>Reporter: Ramakrishna S
>Assignee: dhatchayani
>Priority: Minor
>  Labels: DFX
> Fix For: 1.3.0
>
>
> Steps:
> Beeline:
> 1. Create a table with batch sort as sort type, keep block size small
> 2. Run Load/Insert/Compaction the table
> 3. Bring down thrift server when carbon data is being written to the segment
> 4. Do show segments on the table
> *+Expected:+* It should not show the residual segments  
> *+Actual:+* The segment intended for load is shown as marked for delete and 
> it does not get deleted with clean file. No impact on the table as such.
> *+Query:+*
> create table if not exists lineitem1(L_SHIPDATE string,L_SHIPMODE 
> string,L_SHIPINSTRUCT string,L_RETURNFLAG string,L_RECEIPTDATE 
> string,L_ORDERKEY string,L_PARTKEY string,L_SUPPKEY   string,L_LINENUMBER 
> int,L_QUANTITY double,L_EXTENDEDPRICE double,L_DISCOUNT double,L_TAX 
> double,L_LINESTATUS string,L_COMMITDATE string,L_COMMENT  string) STORED BY 
> 'org.apache.carbondata.format' TBLPROPERTIES 
> ('table_blocksize'='1','sort_scope'='BATCH_SORT','batch_sort_size_inmb'='5000');
> load data inpath "hdfs://hacluster/user/test/lineitem.tbl.1" into table 
> lineitem 
> options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT');
> 0: jdbc:hive2://10.18.98.34:23040> select count(*) from t_carbn0161;
> +---+--+
> | count(1)  |
> +---+--+
> | 0 |
> +---+--+
> 1 row selected (13.011 seconds)
> 0: jdbc:hive2://10.18.98.34:23040> show segments for table lineitem1;
> +++--+--++--+--+
> | SegmentSequenceId  |   Status   | Load Start Time  |  
> Load End Time   | Merged To  | File Format  |
> +++--+--++--+--+
> | 1  | Marked for Delete  | 2017-11-28 19:14:46.265  | 
> 2017-11-28 19:15:28.396  | NA | COLUMNAR_V3  |
> | 0  | Marked for Delete  | 2017-11-28 19:12:58.269  | 
> 2017-11-28 19:13:37.26   | NA | COLUMNAR_V3  |
> +++--+--++--+--+
> 0: jdbc:hive2://10.18.98.34:23040> clean files for table t_carbn0161;
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (7.473 seconds)
> 0: jdbc:hive2://10.18.98.34:23040> show segments for table lineitem1;
> +++--+--++--+--+
> | SegmentSequenceId  |   Status   | Load Start Time  |  
> Load End Time   | Merged To  | File Format  |
> +++--+--++--+--+
> | 1  | Marked for Delete  | 2017-11-28 19:14:46.265  | 
> 2017-11-28 19:15:28.396  | NA | COLUMNAR_V3  |
> | 0  | Marked for Delete  | 2017-11-28 19:12:58.269  | 
> 2017-11-28 19:13:37.26   | NA | COLUMNAR_V3  |
> +++--+--++--+--+



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (CARBONDATA-1824) Carbon 1.3.0 - Spark 2.2-Residual segment files left over when load failure happens

2017-12-22 Thread dhatchayani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani reassigned CARBONDATA-1824:
---

Assignee: dhatchayani  (was: kumar vishal)

> Carbon 1.3.0 - Spark 2.2-Residual segment files left over when load failure 
> happens
> ---
>
> Key: CARBONDATA-1824
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1824
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.3.0
> Environment: Test - 3 node ant cluster
>Reporter: Ramakrishna S
>Assignee: dhatchayani
>Priority: Minor
>  Labels: DFX
> Fix For: 1.3.0
>
>
> Steps:
> Beeline:
> 1. Create a table with batch sort as sort type, keep block size small
> 2. Run Load/Insert/Compaction the table
> 3. Bring down thrift server when carbon data is being written to the segment
> 4. Do show segments on the table
> *+Expected:+* It should not show the residual segments  
> *+Actual:+* The segment intended for load is shown as marked for delete and 
> it does not get deleted with clean file. No impact on the table as such.
> *+Query:+*
> create table if not exists lineitem1(L_SHIPDATE string,L_SHIPMODE 
> string,L_SHIPINSTRUCT string,L_RETURNFLAG string,L_RECEIPTDATE 
> string,L_ORDERKEY string,L_PARTKEY string,L_SUPPKEY   string,L_LINENUMBER 
> int,L_QUANTITY double,L_EXTENDEDPRICE double,L_DISCOUNT double,L_TAX 
> double,L_LINESTATUS string,L_COMMITDATE string,L_COMMENT  string) STORED BY 
> 'org.apache.carbondata.format' TBLPROPERTIES 
> ('table_blocksize'='1','sort_scope'='BATCH_SORT','batch_sort_size_inmb'='5000');
> load data inpath "hdfs://hacluster/user/test/lineitem.tbl.1" into table 
> lineitem 
> options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT');
> 0: jdbc:hive2://10.18.98.34:23040> select count(*) from t_carbn0161;
> +---+--+
> | count(1)  |
> +---+--+
> | 0 |
> +---+--+
> 1 row selected (13.011 seconds)
> 0: jdbc:hive2://10.18.98.34:23040> show segments for table lineitem1;
> +++--+--++--+--+
> | SegmentSequenceId  |   Status   | Load Start Time  |  
> Load End Time   | Merged To  | File Format  |
> +++--+--++--+--+
> | 1  | Marked for Delete  | 2017-11-28 19:14:46.265  | 
> 2017-11-28 19:15:28.396  | NA | COLUMNAR_V3  |
> | 0  | Marked for Delete  | 2017-11-28 19:12:58.269  | 
> 2017-11-28 19:13:37.26   | NA | COLUMNAR_V3  |
> +++--+--++--+--+
> 0: jdbc:hive2://10.18.98.34:23040> clean files for table t_carbn0161;
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (7.473 seconds)
> 0: jdbc:hive2://10.18.98.34:23040> show segments for table lineitem1;
> +++--+--++--+--+
> | SegmentSequenceId  |   Status   | Load Start Time  |  
> Load End Time   | Merged To  | File Format  |
> +++--+--++--+--+
> | 1  | Marked for Delete  | 2017-11-28 19:14:46.265  | 
> 2017-11-28 19:15:28.396  | NA | COLUMNAR_V3  |
> | 0  | Marked for Delete  | 2017-11-28 19:12:58.269  | 
> 2017-11-28 19:13:37.26   | NA | COLUMNAR_V3  |
> +++--+--++--+--+



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (CARBONDATA-1896) Clean files operation improvement

2017-12-21 Thread dhatchayani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-1896:

Description: 
+*Problem:*+
When bringing up a session, the clean operation marks all 
INSERT_OVERWRITE_IN_PROGRESS or INSERT_IN_PROGRESS segments as 
MARKED_FOR_DELETE in the tablestatus file. This clean operation does not 
consider other parallel sessions. If another session's data load is 
IN_PROGRESS while one session is being brought up, the executing load will 
also be changed to MARKED_FOR_DELETE, irrespective of the actual load status. 
Handling stale-segment cleaning during session bring-up also increases session 
startup time.


+*Solution:*+
SEGMENT_LOCK should be taken on the new segment while loading.
While cleaning segments, both the tablestatus file and SEGMENT_LOCK should be 
considered.
Cleaning stale files during session bring-up should be removed; it can either 
be done manually on the needed tables through the already existing CLEAN FILES 
DDL, or the next load will clean them automatically.
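A minimal sketch of the proposed cleaner in plain Java (the in-memory maps 
stand in for the tablestatus file and the lock files; requires Java 9+ for 
Set.of):

// Only IN_PROGRESS segments whose SEGMENT_LOCK is no longer held are stale;
// a live load in another session keeps its lock and is left untouched.
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

final class CleanFilesSketch {
  enum Status { SUCCESS, IN_PROGRESS, MARKED_FOR_DELETE }

  public static void main(String[] args) {
    Map<String, Status> tablestatus = new LinkedHashMap<>();
    tablestatus.put("0", Status.SUCCESS);
    tablestatus.put("1", Status.IN_PROGRESS); // live load in another session
    tablestatus.put("2", Status.IN_PROGRESS); // crashed load, lock released
    Set<String> heldSegmentLocks = Set.of("1");

    tablestatus.replaceAll((segment, status) ->
        status == Status.IN_PROGRESS && !heldSegmentLocks.contains(segment)
            ? Status.MARKED_FOR_DELETE
            : status);

    System.out.println(tablestatus); // {0=SUCCESS, 1=IN_PROGRESS, 2=MARKED_FOR_DELETE}
  }
}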

  was:
+*Problem:*+
When bringing up a session, the clean operation marks all 
INSERT_OVERWRITE_IN_PROGRESS or INSERT_IN_PROGRESS segments as 
MARKED_FOR_DELETE in the tablestatus file. This clean operation does not 
consider other parallel sessions. If another session's data load is 
IN_PROGRESS while one session is being brought up, the executing load will 
also be changed to MARKED_FOR_DELETE, irrespective of the actual load status. 
Handling stale-segment cleaning during session bring-up also increases session 
startup time.


+*Solution:*+
SEGMENT_LOCK should be taken on the new segment while loading.
While cleaning segments, both the tablestatus file and SEGMENT_LOCK should be 
considered.
Cleaning stale files during session bring-up should be removed; it can either 
be done manually on the needed tables through the already existing CLEAN FILES 
DDL, or the next load will clean them automatically.

*Impact analysis on the solution will be updated soon.*

> Clean files operation improvement
> -
>
> Key: CARBONDATA-1896
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1896
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Assignee: dhatchayani
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> +*Problem:*+
> When bringing up a session, the clean operation marks all 
> INSERT_OVERWRITE_IN_PROGRESS or INSERT_IN_PROGRESS segments as 
> MARKED_FOR_DELETE in the tablestatus file. This clean operation does not 
> consider other parallel sessions. If another session's data load is 
> IN_PROGRESS while one session is being brought up, the executing load will 
> also be changed to MARKED_FOR_DELETE, irrespective of the actual load 
> status. Handling stale-segment cleaning during session bring-up also 
> increases session startup time.
> +*Solution:*+
> SEGMENT_LOCK should be taken on the new segment while loading.
> While cleaning segments, both the tablestatus file and SEGMENT_LOCK should 
> be considered.
> Cleaning stale files during session bring-up should be removed; it can 
> either be done manually on the needed tables through the already existing 
> CLEAN FILES DDL, or the next load will clean them automatically.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (CARBONDATA-1896) Clean files operation improvement

2017-12-21 Thread dhatchayani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhatchayani updated CARBONDATA-1896:

Description: 
+*Problem:*+
When bringing up a session, the clean operation marks all 
INSERT_OVERWRITE_IN_PROGRESS or INSERT_IN_PROGRESS segments as 
MARKED_FOR_DELETE in the tablestatus file. This clean operation does not 
consider other parallel sessions. If another session's data load is 
IN_PROGRESS while one session is being brought up, the executing load will 
also be changed to MARKED_FOR_DELETE, irrespective of the actual load status. 
Handling stale-segment cleaning during session bring-up also increases session 
startup time.


+*Solution:*+
SEGMENT_LOCK should be taken on the new segment while loading.
While cleaning segments, both the tablestatus file and SEGMENT_LOCK should be 
considered.
Cleaning stale files during session bring-up should be removed; it can either 
be done manually on the needed tables through the already existing CLEAN FILES 
DDL, or the next load will clean them automatically.

*Impact analysis on the solution will be updated soon.*

  was:
+*Problem:*+
When bringing up a session, the clean operation marks all 
INSERT_OVERWRITE_IN_PROGRESS or INSERT_IN_PROGRESS segments as 
MARKED_FOR_DELETE in the tablestatus file. This clean operation does not 
consider other parallel sessions. If another session's data load is 
IN_PROGRESS while one session is being brought up, the executing load will 
also be changed to MARKED_FOR_DELETE, irrespective of the actual load status. 
Handling stale-segment cleaning during session bring-up also increases session 
startup time.


+*Solution:*+
SEGMENT_LOCK should be taken on the new segment while loading.
While cleaning segments, both the tablestatus file and SEGMENT_LOCK should be 
considered.
Cleaning stale files during session bring-up should be removed; it should be 
done manually on the needed tables through the already existing CLEAN FILES 
DDL.

*Impact analysis on the solution will be updated soon.*

> Clean files operation improvement
> -
>
> Key: CARBONDATA-1896
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1896
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Assignee: dhatchayani
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> +*Problem:*+
> When bringing up a session, the clean operation marks all 
> INSERT_OVERWRITE_IN_PROGRESS or INSERT_IN_PROGRESS segments as 
> MARKED_FOR_DELETE in the tablestatus file. This clean operation does not 
> consider other parallel sessions. If another session's data load is 
> IN_PROGRESS while one session is being brought up, the executing load will 
> also be changed to MARKED_FOR_DELETE, irrespective of the actual load 
> status. Handling stale-segment cleaning during session bring-up also 
> increases session startup time.
> +*Solution:*+
> SEGMENT_LOCK should be taken on the new segment while loading.
> While cleaning segments, both the tablestatus file and SEGMENT_LOCK should 
> be considered.
> Cleaning stale files during session bring-up should be removed; it can 
> either be done manually on the needed tables through the already existing 
> CLEAN FILES DDL, or the next load will clean them automatically.
> *Impact analysis on the solution will be updated soon.*



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

