[jira] [Resolved] (CARBONDATA-4146) Query fails and the error message "unable to get file status" is displayed. Query is normal after the "drop metacache on table" command is executed.

2021-03-23 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-4146.
-
Fix Version/s: 2.1.1
   Resolution: Fixed

>  Query fails and the error message "unable to get file status" is displayed. 
> Query is normal after the "drop metacache on table" command is executed.
> -
>
> Key: CARBONDATA-4146
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4146
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.6.1, 2.0.0, 2.1.0
>Reporter: liuhe0702
>Priority: Major
> Fix For: 2.1.1
>
>  Time Spent: 8h 40m
>  Remaining Estimate: 0h
>
> During compact execution, the status of the new segment is set to success 
> before the index files are merged. After the index files are merged, the 
> original carbonindex files are deleted. As a result, the query task cannot 
> find the cached carbonindex files.
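The ordering described above can be sketched generically. This is a Python illustration with made-up names (CarbonData's actual compaction path is Scala/Java), showing why publishing the segment as SUCCESS before the index merge lets a concurrent reader cache file names that the merge then deletes:

```python
# Illustrative sketch, NOT CarbonData's real API: a segment is a dict, a
# concurrent query's cache is a list of index-file names.

def compact(segment, query_cache, publish_before_merge):
    if publish_before_merge:                          # buggy ordering
        segment["status"] = "SUCCESS"                 # segment visible too early
        query_cache.extend(segment["index_files"])    # reader caches old .carbonindex names
    # merge step: old .carbonindex files are replaced (i.e. deleted)
    segment["index_files"] = ["0.carbonindexmerge"]
    segment["status"] = "SUCCESS"                     # safe ordering publishes only here
    return query_cache
```

With the buggy ordering the cache ends up holding names of deleted files, which matches the "unable to get file status" symptom; with the fixed ordering a reader only ever sees the merged index.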



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CARBONDATA-4147) Carbondata 2.1.0 MV ERROR inserting data into table with MV

2021-03-21 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-4147.
-
Fix Version/s: (was: 2.1.0)
   2.1.1
 Assignee: Indhumathi Muthumurugesh
   Resolution: Fixed

> Carbondata 2.1.0 MV  ERROR inserting data into table with MV
> 
>
> Key: CARBONDATA-4147
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4147
> Project: CarbonData
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.1.0
> Environment: Apache carbondata 2.1.0
>Reporter: Sushant Sammanwar
>Assignee: Indhumathi Muthumurugesh
>Priority: Major
>  Labels: datatype,double, materializedviews
> Fix For: 2.1.1
>
> Attachments: carbondata_210_insert_error_stack-trace
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Hi Team ,
>  
> We are working on a POC where we are using carbon 2.1.0.
> We have created the below table and MV:
> create table if not exists fact_365_1_eutrancell_21 (ts timestamp, metric 
> STRING, tags_id STRING, value DOUBLE) partitioned by (ts2 timestamp) stored 
> as carbondata TBLPROPERTIES ('SORT_COLUMNS'='metric')
> create materialized view if not exists fact_365_1_eutrancell_21_30_minute as 
> select tags_id ,metric ,ts2, timeseries(ts,'thirty_minute') as 
> ts,sum(value),avg(value),min(value),max(value) from fact_365_1_eutrancell_21 
> group by metric, tags_id, timeseries(ts,'thirty_minute') ,ts2
>  
> When I try to insert data into the above table, the below error is thrown:
> scala> carbon.sql("insert into fact_365_1_eutrancell_21 values ('2020-09-25 
> 05:30:00','eUtranCell.HHO.X2.InterFreq.PrepAttOut','ff6cb0f7-fba0-4134-81ee-55e820574627',392.2345,'2020-09-25
>  05:30:00')").show()
>  21/03/10 22:32:20 AUDIT audit: \{"time":"March 10, 2021 10:32:20 PM 
> IST","username":"root","opName":"INSERT 
> INTO","opId":"33474031950342736","opStatus":"START"}
>  [Stage 0:> (0 + 1) / 1]21/03/10 22:32:32 WARN CarbonOutputIteratorWrapper: 
> try to poll a row batch one more time.
>  21/03/10 22:32:32 WARN CarbonOutputIteratorWrapper: try to poll a row batch 
> one more time.
>  21/03/10 22:32:32 WARN CarbonOutputIteratorWrapper: try to poll a row batch 
> one more time.
>  21/03/10 22:32:36 WARN log: Updating partition stats fast for: 
> fact_365_1_eutrancell_21
>  21/03/10 22:32:36 WARN log: Updated size to 2699
>  21/03/10 22:32:38 AUDIT audit: \{"time":"March 10, 2021 10:32:38 PM 
> IST","username":"root","opName":"INSERT 
> OVERWRITE","opId":"33474049863830951","opStatus":"START"}
>  [Stage 3:==>(199 + 1) / 
> 200]21/03/10 22:33:07 WARN CarbonOutputIteratorWrapper: try to poll a row 
> batch one more time.
>  21/03/10 22:33:07 WARN CarbonOutputIteratorWrapper: try to poll a row batch 
> one more time.
>  21/03/10 22:33:07 WARN CarbonOutputIteratorWrapper: try to poll a row batch 
> one more time.
>  21/03/10 22:33:07 ERROR CarbonFactDataHandlerColumnar: Error in producer
>  java.lang.ClassCastException: java.lang.Double cannot be cast to 
> java.lang.Long
>  at 
> org.apache.carbondata.core.datastore.page.ColumnPage.putData(ColumnPage.java:402)
>  at 
> org.apache.carbondata.processing.store.TablePage.convertToColumnarAndAddToPages(TablePage.java:239)
>  at 
> org.apache.carbondata.processing.store.TablePage.addRow(TablePage.java:201)
>  at 
> org.apache.carbondata.processing.store.CarbonFactDataHandlerColumnar.processDataRows(CarbonFactDataHandlerColumnar.java:397)
>  at 
> org.apache.carbondata.processing.store.CarbonFactDataHandlerColumnar.access$500(CarbonFactDataHandlerColumnar.java:60)
>  at 
> org.apache.carbondata.processing.store.CarbonFactDataHandlerColumnar$Producer.call(CarbonFactDataHandlerColumnar.java:637)
>  at 
> org.apache.carbondata.processing.store.CarbonFactDataHandlerColumnar$Producer.call(CarbonFactDataHandlerColumnar.java:614)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
>  
>  
> It seems the method is converting the table's "decimal" data type to a "long" 
> data type for the MV.
> During value conversion it throws the error.
> Could you please check if this is a defect/bug, or let me know if I have 
> missed something?
> Note: This was working in carbon 2.0.1
>  
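The ClassCastException in the trace above can be illustrated with a generic sketch. The types and function below are hypothetical (CarbonData's actual ColumnPage.putData is Java): the MV's measure column page behaves as if typed "long", so putting a double value is rejected.

```python
# Hypothetical type-checked column page, mirroring the failure above:
# a page created for "long" rejects a double value, while a "double"
# page accepts it.

def put_data(page_type, value):
    """Store a value into a column page of the given type (illustrative)."""
    if page_type == "long":
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"{type(value).__name__} cannot be cast to long")
        return value
    if page_type == "double":
        return float(value)
    raise ValueError(f"unsupported page type: {page_type}")
```

Against a "double" page the inserted value 392.2345 goes through; against a "long" page the same value raises, which is the shape of the producer error in the stack trace.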



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CARBONDATA-4144) After the alter table xxx compact command is executed, the index size of the segment is 0, and an error is reported while querying

2021-03-21 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal updated CARBONDATA-4144:

Fix Version/s: (was: 2.2.0)
   2.1.1

> After the alter table xxx compact command is executed, the index size of the 
> segment is 0, and an error is reported while querying
> -
>
> Key: CARBONDATA-4144
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4144
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.6.1, 2.0.0, 2.1.0
>Reporter: liuhe0702
>Priority: Major
> Fix For: 2.1.1
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> When the 'alter table xxx compact ...' command is executed, the value of 
> segmentFile is 13010.1_null.segment, and the values of indexSize and dataSize 
> are 0 in the tablestatus file of the secondary index table. The query fails 
> and the log displays java.lang.IndexOutOfBoundsException: Index:0, Size:0.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CARBONDATA-4145) Query fails and the message "File does not exist: xxxx.carbondata" is displayed

2021-03-21 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal updated CARBONDATA-4145:

Fix Version/s: (was: 2.2.0)
   2.1.1

> Query fails and the message "File does not exist: .carbondata" is 
> displayed
> ---
>
> Key: CARBONDATA-4145
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4145
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.6.1, 2.0.0, 2.1.0
>Reporter: liuhe0702
>Priority: Major
> Fix For: 2.1.1
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> An exception occurs when the rebuild/refresh index command is executed. After 
> that, the query command fails to be executed, and the message "File does not 
> exist: 
> /user/hive/warehouse/carbon.store/sys/idx_tbl_data_event_carbon_user_num/Fact/Part0/Segment_27670/part-1-28_batchno0-0-x.carbondata"
>  is displayed. The idx_tbl_data_event_carbon_user_num table is a secondary 
> index table.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CARBONDATA-4145) Query fails and the message "File does not exist: xxxx.carbondata" is displayed

2021-03-19 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-4145.
-
Fix Version/s: 2.2.0
   Resolution: Fixed

> Query fails and the message "File does not exist: .carbondata" is 
> displayed
> ---
>
> Key: CARBONDATA-4145
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4145
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.6.1, 2.0.0, 2.1.0
>Reporter: liuhe0702
>Priority: Major
> Fix For: 2.2.0
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> An exception occurs when the rebuild/refresh index command is executed. After 
> that, the query command fails to be executed, and the message "File does not 
> exist: 
> /user/hive/warehouse/carbon.store/sys/idx_tbl_data_event_carbon_user_num/Fact/Part0/Segment_27670/part-1-28_batchno0-0-x.carbondata"
>  is displayed. The idx_tbl_data_event_carbon_user_num table is a secondary 
> index table.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CARBONDATA-4144) After the alter table xxx compact command is executed, the index size of the segment is 0, and an error is reported while querying

2021-03-19 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-4144.
-
Fix Version/s: 2.2.0
   Resolution: Fixed

> After the alter table xxx compact command is executed, the index size of the 
> segment is 0, and an error is reported while querying
> -
>
> Key: CARBONDATA-4144
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4144
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.6.1, 2.0.0, 2.1.0
>Reporter: liuhe0702
>Priority: Major
> Fix For: 2.2.0
>
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> When the 'alter table xxx compact ...' command is executed, the value of 
> segmentFile is 13010.1_null.segment, and the values of indexSize and dataSize 
> are 0 in the tablestatus file of the secondary index table. The query fails 
> and the log displays java.lang.IndexOutOfBoundsException: Index:0, Size:0.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CARBONDATA-4145) Query fails and the message "File does not exist: xxxx.carbondata" is displayed

2021-03-16 Thread Akash R Nilugal (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17302309#comment-17302309
 ] 

Akash R Nilugal commented on CARBONDATA-4145:
-

This is a duplicate JIRA; the issue is already being handled in 
[https://github.com/apache/carbondata/pull/3988]

refer CARBONDATA-4037

> Query fails and the message "File does not exist: .carbondata" is 
> displayed
> ---
>
> Key: CARBONDATA-4145
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4145
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.6.1, 2.0.0, 2.1.0
>Reporter: liuhe0702
>Priority: Major
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> An exception occurs when the rebuild/refresh index command is executed. After 
> that, the query command fails to be executed, and the message "File does not 
> exist: 
> /user/hive/warehouse/carbon.store/sys/idx_tbl_data_event_carbon_user_num/Fact/Part0/Segment_27670/part-1-28_batchno0-0-x.carbondata"
>  is displayed. The idx_tbl_data_event_carbon_user_num table is a secondary 
> index table.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CARBONDATA-4146) Query fails and the error message "unable to get file status" is displayed. Query is normal after the "drop metacache on table" command is executed.

2021-03-16 Thread Akash R Nilugal (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-4146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17302291#comment-17302291
 ] 

Akash R Nilugal commented on CARBONDATA-4146:
-

This is a duplicate JIRA; the issue is already being handled in 
[https://github.com/apache/carbondata/pull/3988|http://example.com]

refer CARBONDATA-4037


>  Query fails and the error message "unable to get file status" is displayed. 
> Query is normal after the "drop metacache on table" command is executed.
> -
>
> Key: CARBONDATA-4146
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4146
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.6.1, 2.0.0, 2.1.0
>Reporter: liuhe0702
>Priority: Major
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> During compact execution, the status of the new segment is set to success 
> before the index files are merged. After the index files are merged, the 
> original carbonindex files are deleted. As a result, the query task cannot 
> find the cached carbonindex files.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (CARBONDATA-4146) Query fails and the error message "unable to get file status" is displayed. Query is normal after the "drop metacache on table" command is executed.

2021-03-16 Thread Akash R Nilugal (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-4146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17302291#comment-17302291
 ] 

Akash R Nilugal edited comment on CARBONDATA-4146 at 3/16/21, 7:43 AM:
---

This is a duplicate JIRA; the issue is already being handled in 
[https://github.com/apache/carbondata/pull/3988]

refer CARBONDATA-4037



was (Author: akashrn5):
This is a duplicate JIRA; the issue is already being handled in 
[https://github.com/apache/carbondata/pull/3988|http://example.com]

refer CARBONDATA-4037


>  Query fails and the error message "unable to get file status" is displayed. 
> Query is normal after the "drop metacache on table" command is executed.
> -
>
> Key: CARBONDATA-4146
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4146
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.6.1, 2.0.0, 2.1.0
>Reporter: liuhe0702
>Priority: Major
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> During compact execution, the status of the new segment is set to success 
> before the index files are merged. After the index files are merged, the 
> original carbonindex files are deleted. As a result, the query task cannot 
> find the cached carbonindex files.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CARBONDATA-4110) Support clean files dry run and show statistics after clean files operation

2021-03-15 Thread Akash R Nilugal (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-4110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17301587#comment-17301587
 ] 

Akash R Nilugal commented on CARBONDATA-4110:
-

https://github.com/apache/carbondata/pull/4072

> Support clean files dry run and show statistics after clean files operation
> ---
>
> Key: CARBONDATA-4110
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4110
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Vikram Ahuja
>Priority: Minor
>  Time Spent: 26h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CARBONDATA-4110) Support clean files dry run and show statistics after clean files operation

2021-03-15 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-4110.
-
Fix Version/s: 2.2.0
   Resolution: Fixed

> Support clean files dry run and show statistics after clean files operation
> ---
>
> Key: CARBONDATA-4110
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4110
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Vikram Ahuja
>Priority: Minor
> Fix For: 2.2.0
>
>  Time Spent: 26h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CARBONDATA-4110) Support clean files dry run and show statistics after clean files operation

2021-03-15 Thread Akash R Nilugal (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-4110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17301586#comment-17301586
 ] 

Akash R Nilugal commented on CARBONDATA-4110:
-

Why is this PR needed?
Currently, in the clean files operation, the user does not know how much space 
will be freed. The idea is to add support for a dry run in clean files, which 
can tell the user how much space the clean files operation will free without 
cleaning the actual data.

What changes were proposed in this PR?
This PR has the following changes:

Support dry run in clean files: shows the user how much space will be freed 
by the clean files operation and how much space is left (which can be 
released after the expiration time) after the operation.
Clean files output: the total size released during the clean files operation.
Disable clean files statistics: an option in case the user does not want 
clean files statistics.
Clean files log: enhance the clean files log to print the name of every file 
being deleted at the INFO level.
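The dry-run idea above can be sketched generically. This Python illustration is not the actual CarbonData implementation (the real feature works on segment metadata, and the `.stale` marker here is an assumption purely for the example):

```python
import os

def clean_files(root, dry_run=True):
    """Return the number of bytes the operation would free.

    With dry_run=True nothing is deleted; the caller only learns how much
    space a real run would reclaim."""
    freed = 0
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            if name.endswith(".stale"):   # hypothetical marker for removable files
                path = os.path.join(dirpath, name)
                freed += os.path.getsize(path)
                if not dry_run:
                    os.remove(path)
    return freed
```

A dry run reports the reclaimable size without touching the data; a second call with dry_run=False performs the deletion and returns the same total.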

> Support clean files dry run and show statistics after clean files operation
> ---
>
> Key: CARBONDATA-4110
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4110
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Vikram Ahuja
>Priority: Minor
>  Time Spent: 26h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CARBONDATA-4124) Refresh MV which does not exist is not throwing proper message

2021-02-17 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-4124.
-
Fix Version/s: 2.2.0
 Assignee: Indhumathi Muthu Murugesh
   Resolution: Fixed

> Refresh MV which does not exist is not throwing proper message
> --
>
> Key: CARBONDATA-4124
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4124
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Indhumathi Muthu Murugesh
>Assignee: Indhumathi Muthu Murugesh
>Priority: Minor
> Fix For: 2.2.0
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CARBONDATA-4125) SI compatibility issue fix

2021-02-17 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-4125.
-
Fix Version/s: 2.2.0
 Assignee: Indhumathi Muthu Murugesh
   Resolution: Fixed

> SI compatibility issue fix
> --
>
> Key: CARBONDATA-4125
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4125
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Indhumathi Muthu Murugesh
>Assignee: Indhumathi Muthu Murugesh
>Priority: Major
> Fix For: 2.2.0
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Refer 
> [http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Bug-SI-Compatibility-Issue-td105485.html]
>  for this issue



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CARBONDATA-4107) MV Performance and Lock issues

2021-02-04 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-4107.
-
Fix Version/s: 2.2.0
   Resolution: Fixed

> MV Performance and Lock issues
> --
>
> Key: CARBONDATA-4107
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4107
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Indhumathi Muthu Murugesh
>Priority: Major
> Fix For: 2.2.0
>
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> # After the MV multi-tenancy support PR, the mv system folder is moved to the 
> database level. Hence, during each operation (insert/load/IUD/show MV/query), 
> we list all the databases in the system, collect the MV schemas, and check 
> whether any MV is mapped to the table. This degrades query performance, 
> because MV schemas are collected from all databases regardless of whether the 
> table has an MV.
>  # When different JVM processes call the touchMDTFile method, file creation 
> and deletion can happen at the same time. This may fail the operation.
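Point 2 is the classic non-atomic "touch" race: delete-then-recreate leaves a window in which another process sees no file at all. A generic sketch of the safer write-then-rename pattern, in Python purely for illustration (CarbonData's touchMDTFile is Scala/Java, and the project's actual fix may differ):

```python
import os
import tempfile
import time

def touch_atomic(path):
    """Update a marker file atomically: write a temp file in the same
    directory, then rename it over the target. Readers never observe a
    missing file, unlike a delete-then-recreate sequence."""
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        f.write(str(time.time()))      # new "last modified" marker content
    os.replace(tmp, path)              # atomic rename on POSIX and NTFS
```

Because the rename is atomic within one filesystem, concurrent callers can race freely: the file always exists and always holds one writer's complete content.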



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CARBONDATA-4033) Error when using merge API with hive table

2021-01-03 Thread Akash R Nilugal (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-4033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17258008#comment-17258008
 ] 

Akash R Nilugal commented on CARBONDATA-4033:
-

Can you give more details of the queries? I cannot see table A here, so I 
cannot run them to check the error.



> Error when using merge API with hive table
> --
>
> Key: CARBONDATA-4033
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4033
> Project: CarbonData
>  Issue Type: Bug
>Affects Versions: 2.0.0, 2.0.1
>Reporter: Nguyen Dinh Huynh
>Priority: Major
>  Labels: easyfix, features, newbie
>
> I always get this error when trying to upsert a hive table. I'm using CDH 6.3.1 
> with spark 2.4.3. Is this a bug?
> {code:java}
> 2020-10-14 14:59:25 WARN BlockManager:66 - Putting block rdd_21_1 failed due 
> to exception java.lang.RuntimeException: Store location not set for the key 
> __temptable-7bdfc88b-e5b7-46d5-8492-dfbb98b9a1b0_1602662359786_null_389ec940-ed27-41d1-9038-72ed1cd162e90x0.
>  2020-10-14 14:59:25 WARN BlockManager:66 - Block rdd_21_1 could not be 
> removed as it was not found on disk or in memory 2020-10-14 14:59:25 ERROR 
> Executor:91 - Exception in task 1.0 in stage 0.0 (TID 1) 
> java.lang.RuntimeException: Store location not set for the key 
> __temptable-7bdfc88b-e5b7-46d5-8492-dfbb98b9a1b0_1602662359786_null_389ec940-ed27-41d1-9038-72ed1cd162e90x0
> {code}
>  My code is:
> {code:java}
> val map = Map(
>   col("_external_op") -> col("A._external_op"),
>   col("_external_ts_sec") -> col("A._external_ts_sec"),
>   col("_external_row") -> col("A._external_row"),
>   col("_external_pos") -> col("A._external_pos"),
>   col("id") -> col("A.id"),
>   col("order") -> col("A.order"),
>   col("shop_code") -> col("A.shop_code"),
>   col("customer_tel") -> col("A.customer_tel"),
>   col("channel") -> col("A.channel"),
>   col("batch_session_id") -> col("A.batch_session_id"),
>   col("deleted_at") -> col("A.deleted_at"),
>   col("created") -> col("A.created"))
>   .asInstanceOf[Map[Any, Any]]
> val testDf =
>   spark.sqlContext.read.format("carbondata")
> .option("tableName", "package_drafts")
> .option("schemaName", "db")
> .option("dbName", "db")
> .option("databaseName", "db")
> .load()
> .as("B")
> testDf.printSchema()
> testDf.merge(package_draft_view, col("A.id").equalTo(col("B.id")))
>   .whenMatched(col("A._external_op") === "u")
>   .updateExpr(map)
>   .whenMatched(col("A._external_op") === "c")
>   .insertExpr(map)
>   .whenMatched(col("A._external_op") === "d")
>   .delete()
>   .execute()
> {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CARBONDATA-4047) Datediff datatype is not working with spark-2.4.5. Even in spark-sql it's showing as null. Either the query might be wrong or the versions don't support datatypes.

2021-01-03 Thread Akash R Nilugal (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-4047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17258005#comment-17258005
 ] 

Akash R Nilugal commented on CARBONDATA-4047:
-

Also, join our Slack channel; you can ask questions there directly instead of 
raising issues, as issues take longer to get responses.

https://join.slack.com/t/carbondataworkspace/shared_invite/zt-g8sv1g92-pr3GTvjrW5H9DVvNl6H2dg

> Datediff datatype is not working with spark-2.4.5. Even in spark-sql it's 
> showing as null. Either the query might be wrong or the versions don't 
> support datatypes.
> ---
>
> Key: CARBONDATA-4047
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4047
> Project: CarbonData
>  Issue Type: Task
>  Components: spark-integration
>Affects Versions: 2.0.0
> Environment: Hadoop - 3.2.1
> Hive - 3.1.2
> Spark - 2.4.5
> carbon data-2.0
> mysql connector jar - mysql-connector-java-8.0.19.jar
>Reporter: sravya
>Priority: Major
>  Labels: CarbonData, hadoop, hive, spark2.4
> Attachments: carbon error.PNG, sparksql.PNG
>
>
> 1.scala> carbon.sql("CREATE TABLE IF NOT EXISTS vestedlogs3 as (select * , 
> intck ("Hours", 'StartTimestamp', 'CompleteTimestamp') as Hours FROM 
> vestedlogs) STORED AS carbondata").show()
> :1: error: ')' expected but string literal found.
> carbon.sql("CREATE TABLE IF NOT EXISTS vestedlogs3 as (select * , intck 
> ("Hours", 'StartTimestamp', 'CompleteTimestamp') as Hours FROM vestedlogs) 
> STORED AS carbondata").show()
>  
> 2.scala> carbon.sql("CREATE TABLE IF NOT EXISTS vestedlogs3 (SELECT 
> *,DATEDIFF(HOUR,StartTimestamp GETDATE(),CompleteTimestamp GETDATE() AS 
> "Hours" FROM vestedlogs) STORED AS carbondata").show()
> :1: error: ')' expected but string literal found.
> carbon.sql("CREATE TABLE IF NOT EXISTS vestedlogs3 (SELECT 
> *,DATEDIFF(HOUR,StartTimestamp GETDATE(),CompleteTimestamp GETDATE() AS 
> "Hours" FROM vestedlogs) STORED AS carbondata").show()
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (CARBONDATA-4047) Datediff datatype is not working with spark-2.4.5. Even in spark-sql it's showing as null. Either the query might be wrong or the versions don't support datat

2021-01-03 Thread Akash R Nilugal (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-4047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17258004#comment-17258004
 ] 

Akash R Nilugal edited comment on CARBONDATA-4047 at 1/4/21, 6:22 AM:
--

The query you have given is wrong; STORED AS should come first.

The correct query would be:
1. CREATE TABLE IF NOT EXISTS vestedlogs3 STORED AS carbondata as (select * , 
intck ("Hours", 'StartTimestamp', 'CompleteTimestamp') as Hours FROM 
vestedlogs) 

This will still fail, as Spark does not have an intck function; it fails for 
parquet as well, as you can check.

2. Rewrite the query:
CREATE TABLE IF NOT EXISTS vestedlogs3 STORED AS carbondata as (SELECT 
*,DATEDIFF(HOUR,StartTimestamp GETDATE(),CompleteTimestamp GETDATE()) AS 
"Hours" FROM vestedlogs)

If it still fails to parse after this, please check your query again, and 
refer to https://stackoverflow.com/questions/52527571/datediff-in-spark-sql
Spark does not support everything; you can check for an alternative.


was (Author: akashrn5):
The query you have given is wrong; STORED AS should come first.

The correct query would be:
1. CREATE TABLE IF NOT EXISTS vestedlogs3 STORED AS carbondata as (select * , 
intck ("Hours", 'StartTimestamp', 'CompleteTimestamp') as Hours FROM 
vestedlogs) 

This will still fail, as Spark does not have an intck function; it fails for 
parquet as well, as you can check.

2. Rewrite the query:
CREATE TABLE IF NOT EXISTS vestedlogs3 STORED AS carbondata as (SELECT 
*,DATEDIFF(HOUR,StartTimestamp GETDATE(),CompleteTimestamp GETDATE()) AS 
"Hours" FROM vestedlogs)

If it still fails to parse after this, please check your query again, and 
refer to 
[this post|https://stackoverflow.com/questions/52527571/datediff-in-spark-sql]
Spark does not support everything; you can check for an alternative.

> Datediff datatype is not working with spark-2.4.5. Even in spark-sql it's 
> showing as null. Either the query might be wrong or the versions don't 
> support datatypes.
> ---
>
> Key: CARBONDATA-4047
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4047
> Project: CarbonData
>  Issue Type: Task
>  Components: spark-integration
>Affects Versions: 2.0.0
> Environment: Hadoop - 3.2.1
> Hive - 3.1.2
> Spark - 2.4.5
> carbon data-2.0
> mysql connector jar - mysql-connector-java-8.0.19.jar
>Reporter: sravya
>Priority: Major
>  Labels: CarbonData, hadoop, hive, spark2.4
> Attachments: carbon error.PNG, sparksql.PNG
>
>
> 1.scala> carbon.sql("CREATE TABLE IF NOT EXISTS vestedlogs3 as (select * , 
> intck ("Hours", 'StartTimestamp', 'CompleteTimestamp') as Hours FROM 
> vestedlogs) STORED AS carbondata").show()
> :1: error: ')' expected but string literal found.
> carbon.sql("CREATE TABLE IF NOT EXISTS vestedlogs3 as (select * , intck 
> ("Hours", 'StartTimestamp', 'CompleteTimestamp') as Hours FROM vestedlogs) 
> STORED AS carbondata").show()
>  
> 2.scala> carbon.sql("CREATE TABLE IF NOT EXISTS vestedlogs3 (SELECT 
> *,DATEDIFF(HOUR,StartTimestamp GETDATE(),CompleteTimestamp GETDATE() AS 
> "Hours" FROM vestedlogs) STORED AS carbondata").show()
> :1: error: ')' expected but string literal found.
> carbon.sql("CREATE TABLE IF NOT EXISTS vestedlogs3 (SELECT 
> *,DATEDIFF(HOUR,StartTimestamp GETDATE(),CompleteTimestamp GETDATE() AS 
> "Hours" FROM vestedlogs) STORED AS carbondata").show()
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CARBONDATA-4047) Datediff datatype is not working with spark-2.4.5. Even in spark-sql it's showing as null. Either the query might be wrong or the versions don't support datatypes.

2021-01-03 Thread Akash R Nilugal (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-4047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17258004#comment-17258004
 ] 

Akash R Nilugal commented on CARBONDATA-4047:
-

The query you have given is wrong; STORED AS should come first.

The correct query would be:
1. CREATE TABLE IF NOT EXISTS vestedlogs3 STORED AS carbondata as (select * , 
intck ("Hours", 'StartTimestamp', 'CompleteTimestamp') as Hours FROM 
vestedlogs) 

This will still fail, as Spark does not have an intck function; it fails for 
parquet as well, as you can check.

2. Rewrite the query:
CREATE TABLE IF NOT EXISTS vestedlogs3 STORED AS carbondata as (SELECT 
*,DATEDIFF(HOUR,StartTimestamp GETDATE(),CompleteTimestamp GETDATE()) AS 
"Hours" FROM vestedlogs)

If it still fails to parse after this, please check your query again, and 
refer to 
[this post|https://stackoverflow.com/questions/52527571/datediff-in-spark-sql]
Spark does not support everything; you can check for an alternative.

> Datediff datatype is not working with spark-2.4.5. Even in spark-sql it's 
> showing as null. Either the query might be wrong or the versions don't 
> support datatypes.
> ---
>
> Key: CARBONDATA-4047
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4047
> Project: CarbonData
>  Issue Type: Task
>  Components: spark-integration
>Affects Versions: 2.0.0
> Environment: Hadoop - 3.2.1
> Hive - 3.1.2
> Spark - 2.4.5
> carbon data-2.0
> mysql connector jar - mysql-connector-java-8.0.19.jar
>Reporter: sravya
>Priority: Major
>  Labels: CarbonData, hadoop, hive, spark2.4
> Attachments: carbon error.PNG, sparksql.PNG
>
>
> 1.scala> carbon.sql("CREATE TABLE IF NOT EXISTS vestedlogs3 as (select * , 
> intck ("Hours", 'StartTimestamp', 'CompleteTimestamp') as Hours FROM 
> vestedlogs) STORED AS carbondata").show()
> :1: error: ')' expected but string literal found.
> carbon.sql("CREATE TABLE IF NOT EXISTS vestedlogs3 as (select * , intck 
> ("Hours", 'StartTimestamp', 'CompleteTimestamp') as Hours FROM vestedlogs) 
> STORED AS carbondata").show()
>  
> 2.scala> carbon.sql("CREATE TABLE IF NOT EXISTS vestedlogs3 (SELECT 
> *,DATEDIFF(HOUR,StartTimestamp GETDATE(),CompleteTimestamp GETDATE() AS 
> "Hours" FROM vestedlogs) STORED AS carbondata").show()
> :1: error: ')' expected but string literal found.
> carbon.sql("CREATE TABLE IF NOT EXISTS vestedlogs3 (SELECT 
> *,DATEDIFF(HOUR,StartTimestamp GETDATE(),CompleteTimestamp GETDATE() AS 
> "Hours" FROM vestedlogs) STORED AS carbondata").show()
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CARBONDATA-4099) Fix Concurrent issues with clean files post event listener

2020-12-29 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-4099.
-
Fix Version/s: 2.2.0
   Resolution: Fixed

> Fix Concurrent issues with clean files post event listener
> --
>
> Key: CARBONDATA-4099
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4099
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Vikram Ahuja
>Priority: Major
> Fix For: 2.2.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CARBONDATA-4088) Drop metacache didn't clear some cache information which leads to memory leak

2020-12-23 Thread Akash R Nilugal (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-4088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17254093#comment-17254093
 ] 

Akash R Nilugal commented on CARBONDATA-4088:
-

please handle 
https://issues.apache.org/jira/browse/CARBONDATA-4098

> Drop metacache didn't clear some cache information which leads to memory leak
> -
>
> Key: CARBONDATA-4088
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4088
> Project: CarbonData
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 2.1.0
>Reporter: Yahui Liu
>Priority: Minor
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> When there are two spark applications and one drops a table, some cache 
> information of that table stays in the other application and cannot be removed 
> with any method, including the "Drop metacache" command. This leads to a memory 
> leak. With the passage of time, the leak accumulates, which finally 
> leads to driver OOM. Following are the leak points: 1) tableModifiedTimeStore 
> in CarbonFileMetastore; 2) segmentLockMap in BlockletDataMapIndexStore; 3) 
> absoluteTableIdentifierByteMap in SegmentPropertiesAndSchemaHolder; 4) 
> tableInfoMap in CarbonMetadata.
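For reference, the metacache commands discussed in this thread are documented CarbonData DDL; a sketch with an illustrative table name:

```sql
-- Inspect the driver-side index/dictionary cache.
SHOW METACACHE;
-- Show the cache occupied by one table.
SHOW METACACHE ON TABLE tablename;
-- Drop that table's cache entries. Note this does not help with the
-- cross-application leak points listed above, which is the point of this issue.
DROP METACACHE ON TABLE tablename;
```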



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CARBONDATA-4095) Select Query with SI filter fails, when columnDrift is enabled

2020-12-23 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-4095.
-
Fix Version/s: 2.2.0
 Assignee: Indhumathi Muthu Murugesh
   Resolution: Fixed

> Select Query with SI filter fails, when columnDrift is enabled
> --
>
> Key: CARBONDATA-4095
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4095
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Indhumathi Muthu Murugesh
>Assignee: Indhumathi Muthu Murugesh
>Priority: Major
> Fix For: 2.2.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> sql("drop table if exists maintable")
>  sql("create table maintable (a string,b string,c int,d int) STORED AS carbondata")
>  sql("insert into maintable values('k','d',2,3)")
>  sql("alter table maintable set 
> tblproperties('sort_columns'='c,d','sort_scope'='local_sort')")
>  sql("create index indextable on table maintable(b) AS 'carbondata'")
>  sql("insert into maintable values('k','x',2,4)")
>  sql("select * from maintable where b='x'").show(false)
>  
>  
>  
>  
> 2020-12-22 18:58:37 ERROR Executor:91 - Exception in task 0.0 in stage 40.0 
> (TID 422)
> java.lang.RuntimeException: Error while resolving filter expression
>  at 
> org.apache.carbondata.core.index.IndexFilter.resolveFilter(IndexFilter.java:283)
>  at 
> org.apache.carbondata.core.index.IndexFilter.getResolver(IndexFilter.java:203)
>  at 
> org.apache.carbondata.core.scan.executor.impl.AbstractQueryExecutor.initQuery(AbstractQueryExecutor.java:152)
>  at 
> org.apache.carbondata.core.scan.executor.impl.AbstractQueryExecutor.getBlockExecutionInfos(AbstractQueryExecutor.java:382)
>  at 
> org.apache.carbondata.core.scan.executor.impl.VectorDetailQueryExecutor.execute(VectorDetailQueryExecutor.java:43)
>  at 
> org.apache.carbondata.spark.vectorreader.VectorizedCarbonRecordReader.initialize(VectorizedCarbonRecordReader.java:141)
>  at 
> org.apache.carbondata.spark.rdd.CarbonScanRDD$$anon$1.hasNext(CarbonScanRDD.scala:540)
>  at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.scan_nextBatch_0$(Unknown
>  Source)
>  at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown
>  Source)
>  at 
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>  at 
> org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$12$$anon$1.hasNext(WholeStageCodegenExec.scala:631)
>  at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:253)
>  at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:247)
>  at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:836)
>  at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:836)
>  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:49)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>  at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:49)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>  at org.apache.spark.scheduler.Task.run(Task.scala:109)
>  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NullPointerException
>  at 
> org.apache.carbondata.core.scan.filter.FilterExpressionProcessor.getFilterResolverBasedOnExpressionType(FilterExpressionProcessor.java:190)
>  at 
> org.apache.carbondata.core.scan.filter.FilterExpressionProcessor.createFilterResolverTree(FilterExpressionProcessor.java:128)
>  at 
> org.apache.carbondata.core.scan.filter.FilterExpressionProcessor.createFilterResolverTree(FilterExpressionProcessor.java:121)
>  at 
> org.apache.carbondata.core.scan.filter.FilterExpressionProcessor.getFilterResolverTree(FilterExpressionProcessor.java:77)
>  at 
> org.apache.carbondata.core.scan.filter.FilterExpressionProcessor.getFilterResolver(FilterExpressionProcessor.java:61)
>  at 
> org.apache.carbondata.core.index.IndexFilter.resolveFilter(IndexFilter.java:281)
>  ... 26 more
> 2020-12-22 18:58:37 ERROR 

[jira] [Resolved] (CARBONDATA-4093) Add logs for MV and method to verify if mv is in Sync during query

2020-12-21 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-4093.
-
Fix Version/s: 2.2.0
 Assignee: Indhumathi Muthu Murugesh
   Resolution: Fixed

> Add logs for MV and method to verify if mv is in Sync during query
> --
>
> Key: CARBONDATA-4093
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4093
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Indhumathi Muthu Murugesh
>Assignee: Indhumathi Muthu Murugesh
>Priority: Minor
> Fix For: 2.2.0
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CARBONDATA-4076) Query having Subquery alias used in query projection doesnot hit mv after creation

2020-12-18 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-4076.
-
Resolution: Fixed

> Query having Subquery alias used in query projection doesnot hit mv after 
> creation
> --
>
> Key: CARBONDATA-4076
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4076
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Indhumathi Muthu Murugesh
>Priority: Minor
> Fix For: 2.2.0
>
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> CREATE TABLE fact_table1 (empname String, designation String, doj Timestamp,
> workgroupcategory int, workgroupcategoryname String, deptno int, deptname String,
> projectcode int, projectjoindate Timestamp, projectenddate Timestamp, attendance int,
> utilization int, salary int)
> STORED AS carbondata;
> create materialized view mv_sub as select empname, sum(result) 
> sum_ut from (select empname, utilization result from fact_table1) fact_table1 
> group by empname;
>  
> select empname, sum(result) sum_ut from (select empname, 
> utilization result from fact_table1) fact_table1 group by empname;
>  
> explain select empname, sum(result) sum_ut from (select 
> empname, utilization result from fact_table1) fact_table1 group by 
> empname;
>  
> Expected: Query should hit MV
> Actual: Query is not hitting MV
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CARBONDATA-4052) Select query on SI table after insert overwrite is giving wrong result.

2020-12-01 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-4052.
-
Fix Version/s: 2.2.0
   Resolution: Fixed

> Select query on SI table after insert overwrite is giving wrong result.
> ---
>
> Key: CARBONDATA-4052
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4052
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Nihal kumar ojha
>Priority: Major
> Fix For: 2.2.0
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> # Create a carbon table.
>  # Create an SI table on the same carbon table.
>  # Do a load or insert operation.
>  # Run an insert overwrite query on the main table.
>  # Now a select query on the SI table shows old as well as new data, when it 
> should show only the new data.
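The steps above can be sketched as follows (illustrative table and column names; the exact schema is not given in the report):

```sql
CREATE TABLE maintable (id INT, name STRING) STORED AS carbondata;
CREATE INDEX si_name ON TABLE maintable (name) AS 'carbondata';
INSERT INTO maintable VALUES (1, 'old');
-- Overwrite should replace the data in both the main table and the SI table.
INSERT OVERWRITE TABLE maintable SELECT 2, 'new';
-- Before the fix, a filter routed through the SI returned the old row
-- in addition to the new one.
SELECT * FROM maintable WHERE name = 'new';
```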



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CARBONDATA-4055) Empty segment created and unnecessary entry to table status in update

2020-11-24 Thread Akash R Nilugal (Jira)
Akash R Nilugal created CARBONDATA-4055:
---

 Summary: Empty segment created and unnecessary entry to table 
status in update
 Key: CARBONDATA-4055
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4055
 Project: CarbonData
  Issue Type: Bug
Reporter: Akash R Nilugal
Assignee: Akash R Nilugal


When the update command is executed and no data is updated, empty segment 
directories are created and an in-progress stale entry is added to the table 
status; the segment dirs are not cleaned even during clean files.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CARBONDATA-4042) Insert into select and CTAS launches fewer tasks(task count limited to number of nodes in cluster) even when target table is of no_sort

2020-10-29 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-4042.
-
Fix Version/s: 2.1.0
   Resolution: Fixed

> Insert into select and CTAS launches fewer tasks(task count limited to number 
> of nodes in cluster) even when target table is of no_sort
> ---
>
> Key: CARBONDATA-4042
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4042
> Project: CarbonData
>  Issue Type: Improvement
>  Components: data-load, spark-integration
>Reporter: Venugopal Reddy K
>Priority: Major
> Fix For: 2.1.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> *Issue:*
> At present, when we do insert into table select from or create table as 
> select from, we launch a single task per node. Whereas when we do a simple 
> select * from table query, the tasks launched are equal to the number of carbondata 
> files (CARBON_TASK_DISTRIBUTION default is CARBON_TASK_DISTRIBUTION_BLOCK). 
> This slows down the load performance of the insert into select and CTAS cases.
> Refer [Community discussion regd. task 
> lauch|http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Discussion-Query-Regarding-Task-launch-mechanism-for-data-load-operations-tt98711.html]
>  
> *Suggestion:*
> Launch the same number of tasks as in select query for insert into select and 
> ctas cases when the target table is of no-sort.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (CARBONDATA-3354) how to use filiters in datamaps

2020-10-21 Thread Akash R Nilugal (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17218178#comment-17218178
 ] 

Akash R Nilugal edited comment on CARBONDATA-3354 at 10/21/20, 8:06 AM:


Hi [~imsuyash]

Sorry for the late reply. CarbonData now has many improved features in the latest 
version, so you can check our documentation and try your scenarios; any doubt 
you can ask directly in the Slack channel, as all dev members are present there.

https://join.slack.com/t/carbondataworkspace/shared_invite/zt-g8sv1g92-pr3GTvjrW5H9DVvNl6H2dg

Also, the above error is a valid one, as we support timeseries only on Timestamp columns.

Also, in our latest version we support all the granularities from year to 
second. You can refer here:

https://github.com/apache/carbondata/blob/master/docs/mv-guide.md#time-series-support
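The time-series support referenced in that guide can be sketched as follows (assuming an event_time column of Timestamp type; the table and measure names are taken from the question above and are illustrative):

```sql
-- Hedged sketch of the MV-based timeseries syntax: aggregate at minute
-- granularity over a Timestamp column, not an epoch-valued long column.
CREATE MATERIALIZED VIEW agg_octets AS
SELECT timeseries(event_time, 'minute'), SUM(inOctets)
FROM carbon_RT_test
GROUP BY timeseries(event_time, 'minute');
```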

Thanks


was (Author: akashrn5):
Hi [~imsuyash]

Sorry for late reply. Now carbondata has good improved features in latest 
version. So you can check our documentation and try your scenarios, any doubt 
you can directly ask in slack channel as all dev members are present there.

https://join.slack.com/t/carbondataworkspace/shared_invite/zt-g8sv1g92-pr3GTvjrW5H9DVvNl6H2dg

Also, the above error, is valid one, as we support timeseries on time columns.

Thanks

> how to use filiters in datamaps
> ---
>
> Key: CARBONDATA-3354
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3354
> Project: CarbonData
>  Issue Type: Task
>  Components: core
>Affects Versions: 1.5.2
> Environment: apache carbon data 1.5.x
>Reporter: suyash yadav
>Priority: Major
>
> Hi Team,
>  
> We are doing a POC on Apache CarbonData so that we can verify whether this 
> database is capable of handling the amount of data we are collecting from network 
> devices.
>  
> We are stuck on a few of our datamap related activities and have the below queries: 
>  
>  # How to use time-based filters while creating a datamap. We tried a time-based 
> condition while creating a datamap but it didn't work.
>  # How to create a timeseries datamap with a column holding epoch time 
> values. Our query is like below:  *carbon.sql("CREATE DATAMAP test ON 
> TABLE carbon_RT_test USING 'timeseries' DMPROPERTIES 
> ('event_time'='endMs','minute_granularity'='1',) AS SELECT sum(inOctets) FROM 
> carbon_RT_test GROUP BY inIfId")*
>  # *In the above query, endMs holds an epoch time value.*
>  # We got an error like below: "Timeseries event time is only supported on 
> Timestamp column"
>  # Also we need to know if we can have a time granularity other than 1; like 
> in the above query, can we have minute_granularity='5'?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CARBONDATA-3354) how to use filiters in datamaps

2020-10-21 Thread Akash R Nilugal (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17218178#comment-17218178
 ] 

Akash R Nilugal commented on CARBONDATA-3354:
-

Hi [~imsuyash]

Sorry for the late reply. CarbonData now has many improved features in the latest 
version, so you can check our documentation and try your scenarios; any doubt 
you can ask directly in the Slack channel, as all dev members are present there.

https://join.slack.com/t/carbondataworkspace/shared_invite/zt-g8sv1g92-pr3GTvjrW5H9DVvNl6H2dg

Also, the above error is a valid one, as we support timeseries only on Timestamp columns.

Thanks

> how to use filiters in datamaps
> ---
>
> Key: CARBONDATA-3354
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3354
> Project: CarbonData
>  Issue Type: Task
>  Components: core
>Affects Versions: 1.5.2
> Environment: apache carbon data 1.5.x
>Reporter: suyash yadav
>Priority: Major
>
> Hi Team,
>  
> We are doing a POC on Apache CarbonData so that we can verify whether this 
> database is capable of handling the amount of data we are collecting from network 
> devices.
>  
> We are stuck on a few of our datamap related activities and have the below queries: 
>  
>  # How to use time-based filters while creating a datamap. We tried a time-based 
> condition while creating a datamap but it didn't work.
>  # How to create a timeseries datamap with a column holding epoch time 
> values. Our query is like below:  *carbon.sql("CREATE DATAMAP test ON 
> TABLE carbon_RT_test USING 'timeseries' DMPROPERTIES 
> ('event_time'='endMs','minute_granularity'='1',) AS SELECT sum(inOctets) FROM 
> carbon_RT_test GROUP BY inIfId")*
>  # *In the above query, endMs holds an epoch time value.*
>  # We got an error like below: "Timeseries event time is only supported on 
> Timestamp column"
>  # Also we need to know if we can have a time granularity other than 1; like 
> in the above query, can we have minute_granularity='5'?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CARBONDATA-3970) Carbondata 2.0.1 MV ERROR CarbonInternalMetastore$: Adding/Modifying tableProperties operation failed

2020-10-21 Thread Akash R Nilugal (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-3970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17218172#comment-17218172
 ] 

Akash R Nilugal commented on CARBONDATA-3970:
-

[~sushantsam] please join our Slack channel. It will be easier to discuss all the 
issues there, as JIRA does not notify all the users.

https://join.slack.com/t/carbondataworkspace/shared_invite/zt-g8sv1g92-pr3GTvjrW5H9DVvNl6H2dg

> Carbondata 2.0.1 MV  ERROR CarbonInternalMetastore$: Adding/Modifying 
> tableProperties operation failed
> --
>
> Key: CARBONDATA-3970
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3970
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query, hive-integration
>Affects Versions: 2.0.1
> Environment: CarbonData 2.0.1 with Spark 2.4.5
>Reporter: Sushant Sammanwar
>Priority: Major
>
> Hi,
>  
> I am facing issues with materialized views: the query is not hitting the 
> view in the explain plan. I would really appreciate it if you could help me with 
> this.
> Below are the details: 
> I am using the Spark shell to connect to Carbon 2.0.1 using Spark 2.4.5.
> The underlying table has data loaded.
> I think the problem is while creating the materialized view, as I am getting an 
> error related to the metastore.
>  
>  
> scala> carbon.sql("create MATERIALIZED VIEW agg_sales_mv as select country, 
> sex,sum(quantity),avg(price) from sales group by country,sex").show()
> 20/08/26 01:04:41 AUDIT audit: \{"time":"August 26, 2020 1:04:41 AM 
> IST","username":"root","opName":"CREATE MATERIALIZED 
> VIEW","opId":"16462372696035311","opStatus":"START"}
> 20/08/26 01:04:45 AUDIT audit: \{"time":"August 26, 2020 1:04:45 AM 
> IST","username":"root","opName":"CREATE 
> TABLE","opId":"16462377160819798","opStatus":"START"}
> 20/08/26 01:04:46 AUDIT audit: \{"time":"August 26, 2020 1:04:46 AM 
> IST","username":"root","opName":"CREATE 
> TABLE","opId":"16462377696791275","opStatus":"START"}
> 20/08/26 01:04:48 AUDIT audit: \{"time":"August 26, 2020 1:04:48 AM 
> IST","username":"root","opName":"CREATE 
> TABLE","opId":"16462377696791275","opStatus":"SUCCESS","opTime":"2326 
> ms","table":"NA","extraInfo":{}}
> 20/08/26 01:04:48 AUDIT audit: \{"time":"August 26, 2020 1:04:48 AM 
> IST","username":"root","opName":"CREATE 
> TABLE","opId":"16462377160819798","opStatus":"SUCCESS","opTime":"2955 
> ms","table":"default.agg_sales_mv","extraInfo":{"local_dictionary_threshold":"1","bad_record_path":"","table_blocksize":"1024","local_dictionary_enable":"true","flat_folder":"false","external":"false","sort_columns":"","comment":"","carbon.column.compressor":"snappy","mv_related_tables":"sales"}}
> 20/08/26 01:04:50 ERROR CarbonInternalMetastore$: Adding/Modifying 
> tableProperties operation failed: 
> org.apache.spark.sql.hive.HiveExternalCatalog cannot be cast to 
> org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener
> 20/08/26 01:04:50 ERROR CarbonInternalMetastore$: Adding/Modifying 
> tableProperties operation failed: 
> org.apache.spark.sql.hive.HiveExternalCatalog cannot be cast to 
> org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener
> 20/08/26 01:04:51 AUDIT audit: \{"time":"August 26, 2020 1:04:51 AM 
> IST","username":"root","opName":"CREATE MATERIALIZED 
> VIEW","opId":"16462372696035311","opStatus":"SUCCESS","opTime":"10551 
> ms","table":"NA","extraInfo":{"mvName":"agg_sales_mv"}}
> ++
> ||
> ++
> ++
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CARBONDATA-3970) Carbondata 2.0.1 MV ERROR CarbonInternalMetastore$: Adding/Modifying tableProperties operation failed

2020-10-21 Thread Akash R Nilugal (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-3970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17218169#comment-17218169
 ] 

Akash R Nilugal commented on CARBONDATA-3970:
-

I think the issue is with the carbon configurations. [~sushantsam], have you 
configured CarbonExtensions and are you using the SparkSession itself for the queries? 
We support and have integrated CarbonExtensions. Could you tell us what 
your configuration for the integration is?
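For reference, a minimal way to enable CarbonExtensions is sketched below (an assumption based on the CarbonData Spark integration docs; the jar path is illustrative and must be adjusted to your deployment):

```shell
# Start spark-shell with CarbonData's Spark session extensions enabled, so
# plain spark.sql(...) goes through Carbon's parser and MV rewrite rules.
spark-shell \
  --jars /path/to/apache-carbondata-spark-assembly.jar \
  --conf spark.sql.extensions=org.apache.spark.sql.CarbonExtensions
```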

> Carbondata 2.0.1 MV  ERROR CarbonInternalMetastore$: Adding/Modifying 
> tableProperties operation failed
> --
>
> Key: CARBONDATA-3970
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3970
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query, hive-integration
>Affects Versions: 2.0.1
> Environment: CarbonData 2.0.1 with Spark 2.4.5
>Reporter: Sushant Sammanwar
>Priority: Major
>
> Hi,
>  
> I am facing issues with materialized views: the query is not hitting the 
> view in the explain plan. I would really appreciate it if you could help me with 
> this.
> Below are the details: 
> I am using the Spark shell to connect to Carbon 2.0.1 using Spark 2.4.5.
> The underlying table has data loaded.
> I think the problem is while creating the materialized view, as I am getting an 
> error related to the metastore.
>  
>  
> scala> carbon.sql("create MATERIALIZED VIEW agg_sales_mv as select country, 
> sex,sum(quantity),avg(price) from sales group by country,sex").show()
> 20/08/26 01:04:41 AUDIT audit: \{"time":"August 26, 2020 1:04:41 AM 
> IST","username":"root","opName":"CREATE MATERIALIZED 
> VIEW","opId":"16462372696035311","opStatus":"START"}
> 20/08/26 01:04:45 AUDIT audit: \{"time":"August 26, 2020 1:04:45 AM 
> IST","username":"root","opName":"CREATE 
> TABLE","opId":"16462377160819798","opStatus":"START"}
> 20/08/26 01:04:46 AUDIT audit: \{"time":"August 26, 2020 1:04:46 AM 
> IST","username":"root","opName":"CREATE 
> TABLE","opId":"16462377696791275","opStatus":"START"}
> 20/08/26 01:04:48 AUDIT audit: \{"time":"August 26, 2020 1:04:48 AM 
> IST","username":"root","opName":"CREATE 
> TABLE","opId":"16462377696791275","opStatus":"SUCCESS","opTime":"2326 
> ms","table":"NA","extraInfo":{}}
> 20/08/26 01:04:48 AUDIT audit: \{"time":"August 26, 2020 1:04:48 AM 
> IST","username":"root","opName":"CREATE 
> TABLE","opId":"16462377160819798","opStatus":"SUCCESS","opTime":"2955 
> ms","table":"default.agg_sales_mv","extraInfo":{"local_dictionary_threshold":"1","bad_record_path":"","table_blocksize":"1024","local_dictionary_enable":"true","flat_folder":"false","external":"false","sort_columns":"","comment":"","carbon.column.compressor":"snappy","mv_related_tables":"sales"}}
> 20/08/26 01:04:50 ERROR CarbonInternalMetastore$: Adding/Modifying 
> tableProperties operation failed: 
> org.apache.spark.sql.hive.HiveExternalCatalog cannot be cast to 
> org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener
> 20/08/26 01:04:50 ERROR CarbonInternalMetastore$: Adding/Modifying 
> tableProperties operation failed: 
> org.apache.spark.sql.hive.HiveExternalCatalog cannot be cast to 
> org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener
> 20/08/26 01:04:51 AUDIT audit: \{"time":"August 26, 2020 1:04:51 AM 
> IST","username":"root","opName":"CREATE MATERIALIZED 
> VIEW","opId":"16462372696035311","opStatus":"SUCCESS","opTime":"10551 
> ms","table":"NA","extraInfo":{"mvName":"agg_sales_mv"}}
> ++
> ||
> ++
> ++
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CARBONDATA-4025) storage space for MV is double to that of a table on which MV has been created.

2020-10-21 Thread Akash R Nilugal (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-4025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17218159#comment-17218159
 ] 

Akash R Nilugal commented on CARBONDATA-4025:
-

Hi,

An MV stores aggregated data, so how is the number of rows the same in the MV as well?

Can you give further details like the test queries and which granularity you tried? 
That would help us find the problem, if any, or suggest the proper approach.

Also, please join and discuss in the Slack channel, as JIRA will not notify everyone.

https://join.slack.com/t/carbondataworkspace/shared_invite/zt-g8sv1g92-pr3GTvjrW5H9DVvNl6H2dg

Thanks

> storage space for MV is double to that of a table on which MV has been 
> created.
> ---
>
> Key: CARBONDATA-4025
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4025
> Project: CarbonData
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 2.0.1
> Environment: Apcahe carbondata 2.0.1
> Apache spark 2.4.5
> Hadoop 2.7.2
>Reporter: suyash yadav
>Priority: Major
>
> We are doing a POC based on CarbonData, but we have observed that when we 
> create an MV on a table with a timeseries function of the same granularity, the MV 
> takes double the space of the table.
>  
> In my scenario, my table has 1.3 million records and the MV also has the same 
> number of records, but the size of the table is 3.6 MB while the size of the MV 
> is around 6.5 MB.
> This is really important for us, as critical business decisions are getting 
> affected by this behaviour.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CARBONDATA-3934) Support insert into command for transactional support

2020-10-20 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal updated CARBONDATA-3934:

Attachment: Presto_write_flow.pdf

> Support insert into command for transactional support
> -
>
> Key: CARBONDATA-3934
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3934
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Akash R Nilugal
>Assignee: Akash R Nilugal
>Priority: Major
> Attachments: Presto_write_flow.pdf
>
>  Time Spent: 8h 40m
>  Remaining Estimate: 0h
>
> Support insert into command for transactional support.
> Should support writing table status file, segment files, all the folder 
> structure similar to transactional carbon table.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CARBONDATA-3831) Support write carbon files with presto.

2020-10-20 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal updated CARBONDATA-3831:

Attachment: Presto_write_flow.pdf

> Support write carbon files with presto.
> ---
>
> Key: CARBONDATA-3831
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3831
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Akash R Nilugal
>Assignee: Akash R Nilugal
>Priority: Major
> Attachments: Presto_write_flow.pdf, carbon_presto_write_transactional 
> SUpport.pdf
>
>
> As we know, CarbonData is an indexed columnar data format for fast analytics 
> on big data platforms, so we have already integrated with query engines 
> like Spark and even Presto. Currently with Presto we only support 
> querying of carbondata files, but we don't yet support writing carbondata files
> through the Presto engine.
>   Currently Presto is integrated with carbondata for reading 
> carbondata files via Presto. For this, the store must already be 
> ready, which may have been written by Carbon in Spark, and the table
> should be in the Hive metastore. So using the carbondata connector we are able to 
> read the carbondata files, but we cannot create a table or load data into a table 
> in Presto. So it is a somewhat hectic job to read the carbon files, having to write 
> them first with another engine.
> So here I will be trying to support transactional load in the Presto 
> integration for Carbon. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CARBONDATA-4038) Support metrics during presto write

2020-10-20 Thread Akash R Nilugal (Jira)
Akash R Nilugal created CARBONDATA-4038:
---

 Summary: Support metrics during presto write 
 Key: CARBONDATA-4038
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4038
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Akash R Nilugal


Support metrics during presto write, such as getSystemMemoryUsage() and 
getValidationCpuNanos().



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CARBONDATA-3824) Error when Secondary index tried to be created on table that does not exist is not correct.

2020-10-19 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-3824.
-
Fix Version/s: 2.1.0
   Resolution: Fixed

> Error when Secondary index tried to be created on table that does not exist 
> is not correct.
> ---
>
> Key: CARBONDATA-3824
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3824
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 2.0.0
> Environment: Spark 2.3.2, Spark 2.4.5
>Reporter: Chetan Bhat
>Priority: Minor
> Fix For: 2.1.0
>
>
> *Issue :-*
> Table uniqdata_double does not exist.
> Secondary index tried to be created on table. Error message is incorrect.
> CREATE INDEX indextable2 ON TABLE uniqdata_double (DOB) AS 'carbondata' 
> PROPERTIES('carbon.column.compressor'='zstd');
> *Error: java.lang.RuntimeException: Operation not allowed on non-carbon table 
> (state=,code=0)*
>  
> *Expected :-*
> *Error: java.lang.RuntimeException: Table does not exist (state=,code=0)*





[jira] [Resolved] (CARBONDATA-3903) Documentation Issue in Github Docs Link https://github.com/apache/carbondata/tree/master/docs

2020-10-19 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-3903.
-
Fix Version/s: 2.1.0
   Resolution: Fixed

> Documentation Issue in Github  Docs Link 
> https://github.com/apache/carbondata/tree/master/docs
> --
>
> Key: CARBONDATA-3903
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3903
> Project: CarbonData
>  Issue Type: Bug
>  Components: docs
>Affects Versions: 2.0.1
> Environment: https://github.com/apache/carbondata/tree/master/docs
>Reporter: PURUJIT CHAUGULE
>Priority: Minor
> Fix For: 2.1.0
>
>
> dml-of-carbondata.md
> LOAD DATA:
>  * Mention Each Load is considered as a Segment.
>  * Give all possible options for SORT_SCOPE like 
> GLOBAL_SORT/LOCAL_SORT/NO_SORT (with explanation of difference between each 
> type).
>  * Add Example Of complete Load query with/without use of OPTIONS.
> INSERT DATA:
>  * Mention each insert is a Segment.
> LOAD Using Static/Dynamic Partitioning:
>  * Can give a hyperlink to Static/Dynamic partitioning.
> UPDATE/DELETE:
>  * Mention the delta files concept in update and delete.
> DELETE:
>  * Add example for deletion of all records from a table (delete from 
> tablename).
> COMPACTION:
>  * Can mention that minor compaction is of two types, auto and manual 
> (carbon.auto.load.merge=true/false), and that if 
> carbon.auto.load.merge=false, the trigger should be done manually.
>  * Hyperlink to Configurable properties of Compaction.
>  * Mention that compacted segments do not get cleaned automatically and 
> should be triggered manually using clean files.
>  
> flink-integration-guide.md
>  * Mention what stages are and how they are used.
>  * Process of insertion, deletion of stages in carbontable. (How is it stored 
> in carbontable).
>  
> language-manual.md
>  * Mention Compaction Hyperlink in DML section.
>  
> spatial-index-guide.md
>  * Mention the TBLPROPERTIES supported / not supported for Geo table.
>  * Mention Spatial Index does not make a new column.
>  * It can be mentioned that CTAS from one geo table to another does not 
> create another geo table.
>  * Mention that a certain combination of Spatial Index table properties need 
> to be added in create table, without which a geo table does not get created.
>  * Mention that we cannot alter columns (change datatype, change name, drop) 
> mentioned in spatial_index.
>  





[jira] [Resolved] (CARBONDATA-3901) Documentation issues in https://github.com/apache/carbondata/tree/master/docs

2020-10-19 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-3901.
-
Fix Version/s: 2.1.0
   Resolution: Fixed

> Documentation issues in https://github.com/apache/carbondata/tree/master/docs
> -
>
> Key: CARBONDATA-3901
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3901
> Project: CarbonData
>  Issue Type: Bug
>  Components: docs
>Affects Versions: 2.0.1
> Environment: https://github.com/apache/carbondata/tree/master/docs
>Reporter: Chetan Bhat
>Priority: Minor
> Fix For: 2.1.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> *Issue 1 :* 
> [https://github.com/apache/carbondata/blob/master/docs/alluxio-guide.md] 
> getOrCreateCarbonSession is not used in the Carbon 2.0 version and should be 
> removed.
> Testing use of alluxio by CarbonSession:
> import org.apache.spark.sql.CarbonSession._
> import org.apache.spark.sql.SparkSession
> val carbon = SparkSession.builder().master("local").appName("test").getOrCreateCarbonSession("alluxio://localhost:19998/carbondata")
> carbon.sql("CREATE TABLE carbon_alluxio(id String, name String, city String, age Int) STORED as carbondata")
> carbon.sql(s"LOAD DATA LOCAL INPATH '${CARBONDATA_PATH}/integration/spark/src/test/resources/sample.csv' into table carbon_alluxio")
> carbon.sql("select * from carbon_alluxio").show
> *Issue 2 -* 
> [https://github.com/apache/carbondata/blob/master/docs/ddl-of-carbondata.md] 
> SORT_SCOPE: Sort scope of the load. Options include no sort, local sort, batch 
> sort and global sort --> Batch sort should be removed as it is not supported.
> *Issue 3 -* 
> [https://github.com/apache/carbondata/blob/master/docs/streaming-guide.md#close-stream]
>    CLOSE STREAM link is not working.
> *Issue 4 -*  
> [https://github.com/apache/carbondata/blob/master/docs/index/bloomfilter-index-guide.md]
>   The explain query does not hit the bloom index. Hence the line "User can verify 
> whether a query can leverage BloomFilter Index by executing {{EXPLAIN}} 
> command, which will show the transformed logical plan, and thus user can 
> check whether the BloomFilter Index can skip blocklets during the scan." 
> needs to be removed.





[jira] [Updated] (CARBONDATA-4036) When the ` character is present in column name, the table creation fails

2020-10-16 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal updated CARBONDATA-4036:

Description: 
When the ` character is present in column name, the table creation fails

sql("create table special_char(`i#d` string, `nam(e` 

[jira] [Issue Comment Deleted] (CARBONDATA-4035) MV table is not hit when sum() is applied on decimal column.

2020-10-16 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal updated CARBONDATA-4035:

Comment: was deleted

(was:  sql("drop table if exists special_char")
sql("create table special_char(`i#d` string, `nam(e` 

[jira] [Created] (CARBONDATA-4036) When the ` character is present in column name, the table creation fails

2020-10-16 Thread Akash R Nilugal (Jira)
Akash R Nilugal created CARBONDATA-4036:
---

 Summary: When the ` character is present in column name, the table 
creation fails
 Key: CARBONDATA-4036
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4036
 Project: CarbonData
  Issue Type: Bug
Reporter: Akash R Nilugal
Assignee: Akash R Nilugal


When the ` character is present in column name, the table creation fails





[jira] [Commented] (CARBONDATA-4035) MV table is not hit when sum() is applied on decimal column.

2020-10-16 Thread Akash R Nilugal (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-4035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17215334#comment-17215334
 ] 

Akash R Nilugal commented on CARBONDATA-4035:
-

 sql("drop table if exists special_char")
sql("create table special_char(`i#d` string, `nam(e` 

[jira] [Created] (CARBONDATA-4035) MV table is not hit when sum() is applied on decimal column.

2020-10-16 Thread Akash R Nilugal (Jira)
Akash R Nilugal created CARBONDATA-4035:
---

 Summary: MV table is not hit when sum() is applied on decimal 
column.
 Key: CARBONDATA-4035
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4035
 Project: CarbonData
  Issue Type: Bug
Reporter: Akash R Nilugal
Assignee: Akash R Nilugal


MV table is not hit when sum() is applied on decimal column.

sql("drop table if exists sum_agg_decimal")
sql("create table sum_agg_decimal(salary1 decimal(7,2),salary2 
decimal(7,2),salary3 decimal(7,2),salary4 decimal(7,2),empname string) stored 
as carbondata")
sql("drop materialized view if exists decimal_mv")
sql("create materialized view decimal_mv as select empname, sum(salary1 - 
salary2) from sum_agg_decimal group by empname")
sql("explain select empname, sum( salary1 - salary2) from sum_agg_decimal 
group by empname").show(false)







[jira] [Resolved] (CARBONDATA-4017) insert fails when column name has back slash and Si creation fails

2020-10-07 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-4017.
-
Resolution: Fixed

> insert fails when column name has back slash and Si creation fails
> --
>
> Key: CARBONDATA-4017
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4017
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Akash R Nilugal
>Assignee: Akash R Nilugal
>Priority: Minor
> Fix For: 2.1.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> 1. When the column name contains the backslash character and the table is 
> created with a carbon session, the second insert fails.
> 2. When the column name has special characters, SI creation fails in parsing.





[jira] [Resolved] (CARBONDATA-4018) CSV header validation is not considering the dimension columns

2020-10-07 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-4018.
-
Resolution: Fixed

> CSV header validation is not considering the dimension columns
> ---
>
> Key: CARBONDATA-4018
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4018
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Akash R Nilugal
>Assignee: Akash R Nilugal
>Priority: Minor
> Fix For: 2.1.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> CSV header validation does not consider the dimension columns in the schema.





[jira] [Closed] (CARBONDATA-3769) Upgrade hadoop version to 3.1.1 and add maven profile for 2.7.2

2020-10-01 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal closed CARBONDATA-3769.
---
Resolution: Not A Problem

> Upgrade hadoop version to 3.1.1 and add maven profile for 2.7.2
> ---
>
> Key: CARBONDATA-3769
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3769
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Akash R Nilugal
>Assignee: Akash R Nilugal
>Priority: Major
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Upgrade hadoop version to 3.1.1 and add maven profile for 2.7.2





[jira] [Resolved] (CARBONDATA-3911) NullPointerException is thrown when clean files is executed after two updates

2020-10-01 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-3911.
-
Fix Version/s: 2.1.0
   Resolution: Fixed

> NullPointerException is thrown when clean files is executed after two updates
> -
>
> Key: CARBONDATA-3911
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3911
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Akash R Nilugal
>Assignee: Akash R Nilugal
>Priority: Major
> Fix For: 2.1.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> * create table
> * load data
> * load one more data
> * update1
> * update2
> * clean files
> fails with NullPointer





[jira] [Created] (CARBONDATA-4019) CDC fails when the join expression contains the AND or any logical expression

2020-09-30 Thread Akash R Nilugal (Jira)
Akash R Nilugal created CARBONDATA-4019:
---

 Summary: CDC fails when the join expression contains the AND or 
any logical expression
 Key: CARBONDATA-4019
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4019
 Project: CarbonData
  Issue Type: Bug
Reporter: Akash R Nilugal
Assignee: Akash R Nilugal


CDC fails when the join expression contains AND or any other logical expression.

Fails with a cast expression error.





[jira] [Created] (CARBONDATA-4018) CSV header validation is not considering the dimension columns

2020-09-30 Thread Akash R Nilugal (Jira)
Akash R Nilugal created CARBONDATA-4018:
---

 Summary: CSV header validation is not considering the dimension 
columns
 Key: CARBONDATA-4018
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4018
 Project: CarbonData
  Issue Type: Bug
Reporter: Akash R Nilugal
Assignee: Akash R Nilugal
 Fix For: 2.1.0


CSV header validation does not consider the dimension columns in the schema.
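As a language-agnostic illustration only (a hypothetical sketch, not CarbonData's actual code), a CSV header check that validates against the full table schema must include the dimension columns as well as the measures; the function name and shapes below are invented for this example:

```python
def validate_csv_header(header_line, schema_columns):
    """Return the schema columns missing from the CSV header.

    header_line: first line of the CSV, e.g. "id,name,salary"
    schema_columns: ALL table columns -- dimensions and measures alike,
    so a header missing a dimension column is also reported.
    """
    header_cols = {c.strip().lower() for c in header_line.split(",")}
    return [c for c in schema_columns if c.lower() not in header_cols]

# A header that omits the dimension column 'name' is reported as invalid:
missing = validate_csv_header("id,salary", ["id", "name", "salary"])
```

The point of the sketch is only that the validation set is built from the complete schema, not from the measure columns alone.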





[jira] [Created] (CARBONDATA-4017) insert fails when column name has back slash and Si creation fails

2020-09-30 Thread Akash R Nilugal (Jira)
Akash R Nilugal created CARBONDATA-4017:
---

 Summary: insert fails when column name has back slash and Si 
creation fails
 Key: CARBONDATA-4017
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4017
 Project: CarbonData
  Issue Type: Bug
Reporter: Akash R Nilugal
Assignee: Akash R Nilugal
 Fix For: 2.1.0


1. When the column name contains the backslash character and the table is 
created with a carbon session, the second insert fails.
2. When the column name has special characters, SI creation fails in parsing.





[jira] [Resolved] (CARBONDATA-4009) PartialQuery not hitting mv

2020-09-25 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-4009.
-
Fix Version/s: 2.1.0
   Resolution: Fixed

> PartialQuery not hitting mv
> ---
>
> Key: CARBONDATA-4009
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4009
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Indhumathi Muthumurugesh
>Priority: Minor
> Fix For: 2.1.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-4005) SI with cache level blocklet issue

2020-09-23 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-4005.
-
Fix Version/s: 2.1.0
   Resolution: Fixed

> SI with cache level blocklet issue
> --
>
> Key: CARBONDATA-4005
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4005
> Project: CarbonData
>  Issue Type: Bug
>Reporter: SHREELEKHYA GAMPA
>Priority: Minor
> Fix For: 2.1.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Select query on an SI column returns a blank result set after changing the 
> cache level to blocklet.
> PR: https://github.com/apache/carbondata/pull/3951





[jira] [Resolved] (CARBONDATA-4002) Altering the value of sort columns and unsetting the longStringColumns results in deletion of columns from table schema.

2020-09-23 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-4002.
-
Fix Version/s: (was: 2.0.0)
   2.1.0
   Resolution: Fixed

> Altering the value of sort columns and unsetting the longStringColumns 
> results in deletion of columns from table schema. 
> -
>
> Key: CARBONDATA-4002
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4002
> Project: CarbonData
>  Issue Type: Bug
>  Components: core
>Reporter: Karan
>Priority: Major
> Fix For: 2.1.0
>
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> When we change the value of sort_columns with an alter table query and then 
> unset long_string_columns, some columns are removed from the table schema.
> CREATE TABLE if not exists $longStringTable(id INT, name STRING, description 
> STRING, address STRING, note STRING) STORED AS 
> carbondata TBLPROPERTIES('sort_columns'='id,name');
> alter table long_string_table set 
> tblproperties('sort_columns'='ID','sort_scope'='no_sort');
> alter table long_string_table unset tblproperties('long_string_columns');
> These queries will remove the name column from the schema because it was 
> initially a sort column and the value of sort_columns was later changed.
>  





[jira] [Resolved] (CARBONDATA-3996) Show table extended like command throws java.lang.ArrayIndexOutOfBoundsException

2020-09-22 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-3996.
-
Resolution: Fixed

> Show table extended like command throws 
> java.lang.ArrayIndexOutOfBoundsException
> 
>
> Key: CARBONDATA-3996
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3996
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 2.0.0
>Reporter: Venugopal Reddy K
>Priority: Minor
> Fix For: 2.1.0
>
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> *Issue:*
> Show table extended like command throws 
> java.lang.ArrayIndexOutOfBoundsException
> *Steps to reproduce:*
> spark.sql("create table employee(id string, name string) stored as 
> carbondata")
> spark.sql("show table extended like 'emp*'").show(100, false)
> *Exception stack:*
>  
> {code:java}
> Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 3 at 
> org.apache.spark.sql.catalyst.expressions.GenericInternalRow.genericGet(rows.scala:201)
>  at 
> org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow$class.getAs(rows.scala:35)
>  at 
> org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow$class.getUTF8String(rows.scala:46)
>  at 
> org.apache.spark.sql.catalyst.expressions.GenericInternalRow.getUTF8String(rows.scala:195)
>  at 
> org.apache.spark.sql.catalyst.InternalRow$$anonfun$getAccessor$8.apply(InternalRow.scala:136)
>  at 
> org.apache.spark.sql.catalyst.InternalRow$$anonfun$getAccessor$8.apply(InternalRow.scala:136)
>  at 
> org.apache.spark.sql.catalyst.expressions.BoundReference.eval(BoundAttribute.scala:44)
>  at 
> org.apache.spark.sql.catalyst.expressions.UnaryExpression.eval(Expression.scala:389)
>  at 
> org.apache.spark.sql.catalyst.expressions.Alias.eval(namedExpressions.scala:152)
>  at 
> org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:92)
>  at 
> org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation$$anonfun$apply$24$$anonfun$applyOrElse$23.apply(Optimizer.scala:1364)
>  at 
> org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation$$anonfun$apply$24$$anonfun$applyOrElse$23.apply(Optimizer.scala:1364)
>  at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>  at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>  at 
> scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
>  at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:35) at 
> scala.collection.TraversableLike$class.map(TraversableLike.scala:234) at 
> scala.collection.AbstractTraversable.map(Traversable.scala:104) at 
> org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation$$anonfun$apply$24.applyOrElse(Optimizer.scala:1364)
>  at 
> org.apache.spark.sql.catalyst.optimizer.ConvertToLocalRelation$$anonfun$apply$24.applyOrElse(Optimizer.scala:1359)
>  at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$2.apply(TreeNode.scala:258)
>  at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$2.apply(TreeNode.scala:258)
>  at 
> org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:69)
>  at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:257)
>  at 
> org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDown(LogicalPlan.scala:29)
>  at 
> org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$class.transformDown(AnalysisHelper.scala:149)
>  at 
> org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDown(LogicalPlan.scala:29)
>  at 
> org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDown(LogicalPlan.scala:29)
>  at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:263)
>  at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:263)
>  at 
> org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:328)
>  at 
> org.apache.spark.sql.catalyst.trees.TreeNode.mapProductIterator(TreeNode.scala:186)
>  at 
> org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:326) 
> at 
> org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:263)
>  at 
> org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDown(LogicalPlan.scala:29)
>  at 
> org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$class.transformDown(AnalysisHelper.scala:149)
>  at 
> 

[jira] [Resolved] (CARBONDATA-3998) FileNotFoundException being thrown in hive during insert.

2020-09-22 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-3998.
-
Fix Version/s: 2.1.0
   Resolution: Fixed

>  FileNotFoundException being thrown in hive during insert.
> --
>
> Key: CARBONDATA-3998
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3998
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
>Priority: Major
> Fix For: 2.1.0
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-3990) Fix DropCache log error when indexmap is null

2020-09-17 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-3990.
-
Fix Version/s: 2.1.0
   Resolution: Fixed

> Fix DropCache log error  when indexmap is null
> --
>
> Key: CARBONDATA-3990
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3990
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Indhumathi Muthumurugesh
>Priority: Minor
> Fix For: 2.1.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-3984) compaction on table having range column after altering data type from string to long string fails.

2020-09-16 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-3984.
-
Fix Version/s: 2.1.0
   Resolution: Fixed

> compaction on table having range column after altering data type from string 
> to long string fails.
> --
>
> Key: CARBONDATA-3984
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3984
> Project: CarbonData
>  Issue Type: Bug
>  Components: core, spark-integration
>Affects Versions: 2.0.0
>Reporter: Karan
>Priority: Major
> Fix For: 2.1.0
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> When the data type of a String column that is also provided as the range 
> column in table properties is altered to a long string column, the following 
> error occurs while performing compaction on the table:
>  
> VARCHAR not supported for the filter expression; at 
> org.apache.spark.sql.util.CarbonException$.analysisException(CarbonException.scala:23) at 
> org.apache.carbondata.spark.rdd.CarbonMergerRDD$$anon$1.(CarbonMergerRDD.scala:227) at 
> org.apache.carbondata.spark.rdd.CarbonMergerRDD.internalCompute(CarbonMergerRDD.scala:104) at 
> org.apache.carbondata.spark.rdd.CarbonRDD.compute





[jira] [Resolved] (CARBONDATA-3986) multiple issues during compaction and concurrent scenarios

2020-09-15 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-3986.
-
Fix Version/s: 2.1.0
   Resolution: Fixed

> multiple issues during compaction and concurrent scenarios
> --
>
> Key: CARBONDATA-3986
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3986
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ajantha Bhat
>Assignee: Ajantha Bhat
>Priority: Major
> Fix For: 2.1.0
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> Multiple issues during compaction and concurrent scenarios:
> a) When auto compaction or minor compaction is called multiple times, it was 
> considering already-compacted segments, compacting them again and overwriting 
> the files and segments.
> b) Minor/auto compaction should skip segments at level >= 2; currently only 
> level-2 segments are skipped.
> c) When compaction fails, there is no need to call merge index.
> d) At the executor, when the segment file or table status file fails to be 
> written during the merge index event, the stale files need to be removed.
> e) During partial load cleanup, segment folders are removed but segment 
> metadata files were not removed.
> f) Some table status retry issues.
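Point (b) above can be sketched generically. This is an illustration only, with an invented segment-naming scheme (not CarbonData's actual implementation): assume a raw load is level 0, a segment named "0.1" has been minor-compacted once (level 1), "0.2" twice (level 2), and so on. The fixed selection must exclude every segment at level >= 2, not only those at exactly level 2:

```python
def segment_level(segment_name: str) -> int:
    # Hypothetical naming scheme: "2" -> level 0 (raw load),
    # "0.1" -> level 1 (compacted once), "0.2" -> level 2, "0.3" -> level 3.
    if "." not in segment_name:
        return 0
    return int(segment_name.split(".")[1])

def segments_for_minor_compaction(segment_names):
    # Fixed behaviour: skip every segment already at level >= 2,
    # not just those at exactly level 2.
    return [s for s in segment_names if segment_level(s) < 2]

# Levels 0 and 1 remain eligible; "0.2" and "0.3" are skipped.
eligible = segments_for_minor_compaction(["0", "1", "0.1", "0.2", "0.3"])
```

With an `== 2` check, a level-3 segment such as "0.3" would wrongly be picked up again, which is the class of bug described in (b).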





[jira] [Resolved] (CARBONDATA-3983) SI compatability issue

2020-09-15 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-3983.
-
Fix Version/s: 2.1.0
   Resolution: Fixed

> SI compatability issue
> --
>
> Key: CARBONDATA-3983
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3983
> Project: CarbonData
>  Issue Type: Bug
>Reporter: SHREELEKHYA GAMPA
>Priority: Minor
> Fix For: 2.1.0
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Read from a main table having SI returns an empty result set when the SI is 
> stored with the old tuple id storage format. 
> Bug id: BUG2020090205414
> PR link: https://github.com/apache/carbondata/pull/3922





[jira] [Commented] (CARBONDATA-3793) Data load with partition columns fail with InvalidLoadOptionException when load option 'header' is set to 'true'

2020-09-14 Thread Akash R Nilugal (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-3793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17195631#comment-17195631
 ] 

Akash R Nilugal commented on CARBONDATA-3793:
-

PR https://github.com/apache/carbondata/pull/3911 is wrongly linked to this 
Jira. The actual Jira is CARBONDATA-3973

> Data load with partition columns fail with InvalidLoadOptionException when 
> load option 'header' is set to 'true'
> 
>
> Key: CARBONDATA-3793
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3793
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 2.0.0
>Reporter: Venugopal Reddy K
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: Selection_001.png
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> *Issue:*
> Data load with partition fails with `InvalidLoadOptionException` when load 
> option `header` is set to `true`.
>  
> *CallStack:*
> 2020-05-05 21:49:35 AUDIT audit:97 - {"time":"5 May, 2020 9:49:35 PM 
> IST","username":"root1","opName":"LOAD 
> DATA","opId":"199081091980878","opStatus":"FAILED","opTime":"1734 
> ms","table":"default.source","extraInfo":{"Exception":"org.apache.carbondata.common.exceptions.sql.InvalidLoadOptionException","Message":"When
>  'header' option is true, 'fileheader' option is not required."}}
> Exception in thread "main" 
> org.apache.carbondata.common.exceptions.sql.InvalidLoadOptionException: When 
> 'header' option is true, 'fileheader' option is not required.
> at 
> org.apache.carbondata.processing.loading.model.CarbonLoadModelBuilder.build(CarbonLoadModelBuilder.java:203)
> at 
> org.apache.carbondata.processing.loading.model.CarbonLoadModelBuilder.build(CarbonLoadModelBuilder.java:126)
> at 
> org.apache.spark.sql.execution.datasources.SparkCarbonTableFormat.prepareWrite(SparkCarbonTableFormat.scala:132)
> at 
> org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:103)
> at 
> org.apache.spark.sql.execution.command.management.CarbonInsertIntoHadoopFsRelationCommand.run(CarbonInsertIntoHadoopFsRelationCommand.scala:160)
> at 
> org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:104)
> at 
> org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:102)





[jira] [Created] (CARBONDATA-3989) Unnecessary segment files are created even when the segments are neither updated nor deleted

2020-09-14 Thread Akash R Nilugal (Jira)
Akash R Nilugal created CARBONDATA-3989:
---

 Summary: Unnecessary segment files are created even when the 
segments are neither updated nor deleted
 Key: CARBONDATA-3989
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3989
 Project: CarbonData
  Issue Type: Bug
Reporter: Akash R Nilugal
Assignee: Akash R Nilugal


Unnecessary segment files are created even when the segments are neither 
updated nor deleted





[jira] [Resolved] (CARBONDATA-3973) update and delete does not happen for partition table with multiple partition columns and clean files issue

2020-09-14 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-3973.
-
Fix Version/s: 2.1.0
   Resolution: Fixed

> update and delete does not happen for partition table with multiple partition 
> columns and clean files issue
> ---
>
> Key: CARBONDATA-3973
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3973
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Akash R Nilugal
>Assignee: Akash R Nilugal
>Priority: Major
> Fix For: 2.1.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> 1. Update and delete do not work for a partition table with multiple 
> partition columns, and 2. clean files deletes the segment files of a 
> non-updated segment, considering them stale.





[jira] [Commented] (CARBONDATA-3973) update and delete does not happen for partition table with multiple partition columns and clean files issue

2020-09-14 Thread Akash R Nilugal (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17195511#comment-17195511
 ] 

Akash R Nilugal commented on CARBONDATA-3973:
-

Linked to https://github.com/apache/carbondata/pull/3911

> update and delete does not happen for partition table with multiple partition 
> columns and clean files issue
> ---
>
> Key: CARBONDATA-3973
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3973
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Akash R Nilugal
>Assignee: Akash R Nilugal
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> 1. Update and delete do not work for a partition table with multiple 
> partition columns, and 2. clean files deletes the segment files of a 
> non-updated segment, considering them stale.





[jira] [Resolved] (CARBONDATA-3964) Select * from table or select count(*) without filter is throwing null pointer exception.

2020-09-08 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-3964.
-
Fix Version/s: 2.1.0
   Resolution: Fixed

> Select * from table or select count(*) without filter is throwing null 
> pointer exception.
> -
>
> Key: CARBONDATA-3964
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3964
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Nihal kumar ojha
>Priority: Minor
> Fix For: 2.1.0
>
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> Steps to reproduce.
> 1. Create a table.
> 2. Load around 500 segments and more than 1 million records.
> 3. Running a select * or select count(*) query without a filter throws a 
> null pointer exception.
> File: TableIndex.java
> Method: pruneWithMultiThread
> line: 447
> Reason: filter.getresolver() is null.





[jira] [Created] (CARBONDATA-3975) Data mismatch when the binary data is read via hive in carbon.

2020-09-08 Thread Akash R Nilugal (Jira)
Akash R Nilugal created CARBONDATA-3975:
---

 Summary: Data mismatch when the binary data is read via hive in 
carbon.
 Key: CARBONDATA-3975
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3975
 Project: CarbonData
  Issue Type: Bug
Reporter: Akash R Nilugal
Assignee: Akash R Nilugal


Data mismatch when binary data is read via hive in carbon: carbon returns 
wrong data compared to the hive table for the same input data.





[jira] [Created] (CARBONDATA-3973) update and delete does not happen for partition table with multiple partition columns and clean files issue

2020-09-07 Thread Akash R Nilugal (Jira)
Akash R Nilugal created CARBONDATA-3973:
---

 Summary: update and delete does not happen for partition table 
with multiple partition columns and clean files issue
 Key: CARBONDATA-3973
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3973
 Project: CarbonData
  Issue Type: Bug
Reporter: Akash R Nilugal
Assignee: Akash R Nilugal


1. update and delete do not work on a partition table with multiple 
partition columns, and 2. clean files deletes the segment files of a 
non-updated segment, considering them stale





[jira] [Resolved] (CARBONDATA-3960) Column comment should be null by default when adding column

2020-09-02 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-3960.
-
Fix Version/s: 2.1.0
   Resolution: Fixed

> Column comment should be null by default when adding column
> ---
>
> Key: CARBONDATA-3960
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3960
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: David Cai
>Priority: Minor
> Fix For: 2.1.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> 1. create table
> create table test_add_column_with_comment(
>  col1 string comment 'col1 comment',
>  col2 int,
>  col3 string)
>  stored as carbondata
> 2 . alter table
> alter table test_add_column_with_comment add columns(
> col4 string comment "col4 comment",
> col5 int,
> col6 string comment "")
> 3. describe table
> describe test_add_column_with_comment
> +--------+---------+------------+
> |col_name|data_type|comment     |
> +--------+---------+------------+
> |col1    |string   |col1 comment|
> |col2    |int      |null        |
> |col3    |string   |null        |
> |col4    |string   |col4 comment|
> |col5    |int      |            |
> |col6    |string   |            |
> +--------+---------+------------+
> the comment of col5 is "" by default





[jira] [Closed] (CARBONDATA-3966) NullPointerException is thrown in case of reliability testing of load, compaction and query

2020-09-02 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal closed CARBONDATA-3966.
---
Resolution: Not A Problem

> NullPointerException is thrown in case of reliability testing of load, 
> compaction and query
> ---
>
> Key: CARBONDATA-3966
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3966
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Akash R Nilugal
>Assignee: Akash R Nilugal
>Priority: Minor
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Sometimes a NullPointerException is thrown during reliability testing of 
> load, compaction and query





[jira] [Commented] (CARBONDATA-3966) NullPointerException is thrown in case of reliability testing of load, compaction and query

2020-09-02 Thread Akash R Nilugal (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-3966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17189011#comment-17189011
 ] 

Akash R Nilugal commented on CARBONDATA-3966:
-

This issue might not occur with the recent community change of getting the 
modified time from the metadata details, so closing the jira for now. Once 
the segment refactoring is done, we can check again.

> NullPointerException is thrown in case of reliability testing of load, 
> compaction and query
> ---
>
> Key: CARBONDATA-3966
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3966
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Akash R Nilugal
>Assignee: Akash R Nilugal
>Priority: Minor
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Sometimes a NullPointerException is thrown during reliability testing of 
> load, compaction and query





[jira] [Created] (CARBONDATA-3966) NullPointerException is thrown in case of reliability testing of load, compaction and query

2020-08-31 Thread Akash R Nilugal (Jira)
Akash R Nilugal created CARBONDATA-3966:
---

 Summary: NullPointerException is thrown in case of reliability 
testing of load, compaction and query
 Key: CARBONDATA-3966
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3966
 Project: CarbonData
  Issue Type: Bug
Reporter: Akash R Nilugal
Assignee: Akash R Nilugal


Sometimes a NullPointerException is thrown during reliability testing of 
load, compaction and query.





[jira] [Resolved] (CARBONDATA-3955) Fix load failures due to daylight saving time changes

2020-08-27 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-3955.
-
Fix Version/s: 2.1.0
   Resolution: Fixed

> Fix load failures due to daylight saving time changes
> -
>
> Key: CARBONDATA-3955
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3955
> Project: CarbonData
>  Issue Type: Bug
>Reporter: SHREELEKHYA GAMPA
>Priority: Major
> Fix For: 2.1.0
>
>  Time Spent: 10h 20m
>  Remaining Estimate: 0h
>
> 1) Fix load failures due to daylight saving time changes.
> 2) During load, date/timestamp values whose year has more than 4 digits 
> should fail or be treated as null, according to the bad records action 
> property.
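The DST failure mode can be illustrated with plain java.time (this is an illustration, not CarbonData code): wall-clock times that fall into a spring-forward gap do not exist, so java.time shifts them forward, while stricter parsers reject them, and a loader must decide how to treat such rows (for example, via a bad-records action). The timezone and date below are chosen only as an example.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

// 2020-03-29 02:30 does not exist in Europe/Berlin: clocks jump from
// 02:00 straight to 03:00. java.time resolves a gap value by shifting it
// forward by the length of the gap.
public class DstGap {
    static ZonedDateTime resolveInBerlin(LocalDateTime local) {
        return local.atZone(ZoneId.of("Europe/Berlin"));
    }

    public static void main(String[] args) {
        ZonedDateTime resolved = resolveInBerlin(LocalDateTime.of(2020, 3, 29, 2, 30));
        // The non-existent 02:30 is shifted across the one-hour gap to 03:30.
        System.out.println(resolved);
    }
}
```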





[jira] [Created] (CARBONDATA-3963) timestamp data is wrong in case of reading carbon via hive and other issue

2020-08-26 Thread Akash R Nilugal (Jira)
Akash R Nilugal created CARBONDATA-3963:
---

 Summary: timestamp data is wrong in case of reading carbon via 
hive and other issue
 Key: CARBONDATA-3963
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3963
 Project: CarbonData
  Issue Type: Bug
Reporter: Akash R Nilugal
Assignee: Akash R Nilugal


1. timestamp data is wrong when carbon table is read via hive
2. carbon is not giving any data in beeline when queried via hive





[jira] [Created] (CARBONDATA-3962) Empty fact dirs are present in case of flat folder, which are unnecessary

2020-08-26 Thread Akash R Nilugal (Jira)
Akash R Nilugal created CARBONDATA-3962:
---

 Summary: Empty fact dirs are present in case of flat folder, which 
are unnecessary
 Key: CARBONDATA-3962
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3962
 Project: CarbonData
  Issue Type: Bug
Reporter: Akash R Nilugal
Assignee: Akash R Nilugal


Empty fact dirs are present in case of flat folder, which are unnecessary





[jira] [Resolved] (CARBONDATA-3958) CDC Merge task can't finish

2020-08-25 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-3958.
-
Fix Version/s: 2.1.0
   Resolution: Fixed

> CDC Merge task can't finish
> ---
>
> Key: CARBONDATA-3958
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3958
> Project: CarbonData
>  Issue Type: Bug
>Reporter: David Cai
>Priority: Major
> Fix For: 2.1.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> # The merge tasks take a long time and can't finish in some cases.
>  # We find the warning "This scenario should not happen" in the log





[jira] [Resolved] (CARBONDATA-3928) Handle the Strings which length is greater than 32000 as a bad record.

2020-08-21 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-3928.
-
Fix Version/s: 2.1.0
   Resolution: Fixed

> Handle the Strings which length is greater than 32000 as a bad record.
> --
>
> Key: CARBONDATA-3928
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3928
> Project: CarbonData
>  Issue Type: Task
>Reporter: Nihal kumar ojha
>Priority: Major
> Fix For: 2.1.0
>
>  Time Spent: 12.5h
>  Remaining Estimate: 0h
>
> Currently, when the string length exceeds 32000, the load fails.
> Suggestion:
> 1. Bad record handling can cover strings longer than 32000, and the load 
> should not fail when only a few records exceed that length.
> 2. Include more information in the log message, such as which record and 
> column have the problem.
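The suggested behavior can be sketched as follows. This is a hypothetical Java sketch, not CarbonData source: rows with an overlong string column are diverted to a bad-records list, with a message identifying the row and column, instead of failing the whole load.

```java
import java.util.ArrayList;
import java.util.List;

public class StringLengthCheck {
    static final int MAX_STRING_LENGTH = 32000;

    // Returns the rows that pass the length check; overlong rows go to
    // badRecords with a message naming the offending row and column, as
    // the suggestion above asks.
    static List<String[]> filterRows(List<String[]> rows, List<String> badRecords) {
        List<String[]> good = new ArrayList<>();
        for (int r = 0; r < rows.size(); r++) {
            String[] row = rows.get(r);
            boolean bad = false;
            for (int c = 0; c < row.length; c++) {
                if (row[c] != null && row[c].length() > MAX_STRING_LENGTH) {
                    badRecords.add("row " + r + ", column " + c
                        + ": string length " + row[c].length()
                        + " exceeds " + MAX_STRING_LENGTH);
                    bad = true;
                    break;  // one bad column is enough to divert the row
                }
            }
            if (!bad) good.add(row);
        }
        return good;  // the load continues with only the good rows
    }
}
```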





[jira] [Resolved] (CARBONDATA-3943) Handling the addition of geo column to hive at the time of table creation

2020-08-18 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-3943.
-
Fix Version/s: 2.1.0
   Resolution: Fixed

>  Handling the addition of geo column to hive at the time of table creation
> --
>
> Key: CARBONDATA-3943
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3943
> Project: CarbonData
>  Issue Type: Bug
>Reporter: SHREELEKHYA GAMPA
>Priority: Minor
> Fix For: 2.1.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
>  Handling the addition of geo column to hive at the time of table creation





[jira] [Resolved] (CARBONDATA-3919) Improve concurrent query performance

2020-08-18 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-3919.
-
Fix Version/s: 2.1.0
   Resolution: Fixed

> Improve concurrent query performance
> 
>
> Key: CARBONDATA-3919
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3919
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Ajantha Bhat
>Assignee: Ajantha Bhat
>Priority: Major
> Fix For: 2.1.0
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> problem1: when 500 queries were executed concurrently, the 
> checkIfRefreshIsNeeded method was synchronized, so only one thread could 
> work at a time. But synchronization is actually required only when the 
> schema is modified to drop tables, not for the whole function.
>  
> solution: synchronize only the remove-table part. With this, the total 
> time for 500 concurrent queries improved from 10 seconds to 3 seconds in 
> the cluster.
>  
> problem2:  
> TokenCache.obtainTokensForNamenodes was causing a performance bottleneck 
> for concurrent queries, so it was removed.
>  
>  
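The narrowing of the lock described above can be sketched like this. This is a hypothetical Java sketch, not CarbonData source; the class and method names only mirror the report.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Before the fix, the whole refresh check would have been declared
// 'synchronized', serializing all concurrent queries. After the fix,
// only the step that mutates shared state holds a lock.
public class SchemaCache {
    private final Map<String, Long> cachedModifiedTimes = new ConcurrentHashMap<>();
    private final Object removalLock = new Object();

    public boolean checkIfRefreshIsNeeded(String table, long latestModifiedTime) {
        Long cached = cachedModifiedTimes.get(table);
        if (cached != null && cached >= latestModifiedTime) {
            return false;  // cache is fresh: the common path takes no lock
        }
        // Only the remove/re-register of the table is synchronized.
        synchronized (removalLock) {
            cachedModifiedTimes.remove(table);                   // drop stale entry
            cachedModifiedTimes.put(table, latestModifiedTime);  // re-register
        }
        return true;
    }
}
```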





[jira] [Resolved] (CARBONDATA-3841) Remove useless string in create and alter command

2020-08-04 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-3841.
-
Fix Version/s: 2.1.0
   Resolution: Fixed

> Remove useless string in create and alter command
> -
>
> Key: CARBONDATA-3841
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3841
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Manhua Jiang
>Priority: Major
> Fix For: 2.1.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-3939) Exception added for index creation on long string columns

2020-08-04 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-3939.
-
Fix Version/s: 2.1.0
   Resolution: Fixed

> Exception added for index creation on long string columns
> -
>
> Key: CARBONDATA-3939
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3939
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Akshay
>Priority: Major
> Fix For: 2.1.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Index creation on long string columns is not yet supported.
> A user-understandable exception is thrown if a user tries to create one.
> https://github.com/apache/carbondata/pull/3869





[jira] [Created] (CARBONDATA-3936) Support writing complex datatype data in presto integration

2020-07-31 Thread Akash R Nilugal (Jira)
Akash R Nilugal created CARBONDATA-3936:
---

 Summary: Support writing complex datatype data in presto 
integration
 Key: CARBONDATA-3936
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3936
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Akash R Nilugal
Assignee: Akash R Nilugal








[jira] [Created] (CARBONDATA-3935) Support writing partition data in presto

2020-07-31 Thread Akash R Nilugal (Jira)
Akash R Nilugal created CARBONDATA-3935:
---

 Summary: Support writing partition data in presto
 Key: CARBONDATA-3935
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3935
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Akash R Nilugal
Assignee: Akash R Nilugal


Support writing partition data in presto





[jira] [Created] (CARBONDATA-3934) Support insert into command for transactional support

2020-07-31 Thread Akash R Nilugal (Jira)
Akash R Nilugal created CARBONDATA-3934:
---

 Summary: Support insert into command for transactional support
 Key: CARBONDATA-3934
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3934
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Akash R Nilugal
Assignee: Akash R Nilugal


Support the insert into command for transactional support.
It should support writing the table status file, segment files, and the 
whole folder structure, similar to a transactional carbon table.





[jira] [Updated] (CARBONDATA-3933) insert, desc throws error when the column name contains special character

2020-07-31 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal updated CARBONDATA-3933:

Description: 
insert, desc throws error when the column name contains special character

sql("drop table if exists special_char")
sql("create table special_char(`i#d` string, `nam(e` 

[jira] [Created] (CARBONDATA-3933) insert, desc throws error when the column name contains special character

2020-07-31 Thread Akash R Nilugal (Jira)
Akash R Nilugal created CARBONDATA-3933:
---

 Summary: insert, desc throws error when the column name contains 
special character
 Key: CARBONDATA-3933
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3933
 Project: CarbonData
  Issue Type: Bug
Reporter: Akash R Nilugal


insert, desc throws error when the column name contains special character





[jira] [Resolved] (CARBONDATA-3914) We are getting the below error when executing select query on a carbon table when no data is returned from hive beeline.

2020-07-29 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-3914.
-
Fix Version/s: 2.1.0
   Resolution: Fixed

> We are getting the below error when executing select query on a carbon table 
> when no data is returned from hive beeline.
> 
>
> Key: CARBONDATA-3914
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3914
> Project: CarbonData
>  Issue Type: Bug
>  Components: hive-integration
>Affects Versions: 2.0.0
> Environment: 3 node One track ANT cluster
>Reporter: Prasanna Ravichandran
>Priority: Minor
> Fix For: 2.1.0
>
> Attachments: Nodatareturnedfromcarbontable-IOexception.png
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> If no data is present in the table, carbon throws the IOException below 
> while running select queries on that empty table. In hive, select queries 
> work even if the table holds no data.
> Expected result: even if the table holds no records, the query should 
> return 0 rows ("no rows returned"); it should not throw an error/exception.
> Actual result: it throws an IOException - unable to read carbon schema.
>  
> Attached the screenshot for your reference.





[jira] [Updated] (CARBONDATA-3927) TupleID/Position reference is long , make it short

2020-07-27 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal updated CARBONDATA-3927:

Issue Type: Improvement  (was: Bug)

> TupleID/Position reference is long , make it short
> --
>
> Key: CARBONDATA-3927
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3927
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Akash R Nilugal
>Assignee: Akash R Nilugal
>Priority: Minor
>
> the current tuple id is long; some parts of it can be avoided to improve 
> performance.





[jira] [Created] (CARBONDATA-3929) Improve the CDC merge feature time

2020-07-27 Thread Akash R Nilugal (Jira)
Akash R Nilugal created CARBONDATA-3929:
---

 Summary: Improve the CDC merge feature time
 Key: CARBONDATA-3929
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3929
 Project: CarbonData
  Issue Type: Improvement
Reporter: Akash R Nilugal
Assignee: Akash R Nilugal


Improve the CDC merge feature time





[jira] [Created] (CARBONDATA-3927) TupleID/Position reference is long , make it short

2020-07-27 Thread Akash R Nilugal (Jira)
Akash R Nilugal created CARBONDATA-3927:
---

 Summary: TupleID/Position reference is long , make it short
 Key: CARBONDATA-3927
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3927
 Project: CarbonData
  Issue Type: Bug
Reporter: Akash R Nilugal
Assignee: Akash R Nilugal


the current tuple id is long; some parts of it can be avoided to improve 
performance.





[jira] [Resolved] (CARBONDATA-3899) drop materialized view when executed concurrently from 4 concurrent client fails in all 4 clients.

2020-07-23 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-3899.
-
Fix Version/s: 2.1.0
   Resolution: Fixed

> drop materialized view when executed concurrently from 4 concurrent client 
> fails in all 4 clients.
> --
>
> Key: CARBONDATA-3899
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3899
> Project: CarbonData
>  Issue Type: Bug
>Reporter: SHREELEKHYA GAMPA
>Priority: Major
> Fix For: 2.1.0
>
> Attachments: screenshot-1.png
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> drop materialized view when executed concurrently from 4 concurrent client 
> fails in all 4 clients from beeline.
>  !screenshot-1.png! 





[jira] [Created] (CARBONDATA-3920) compaction failure issue for SI table and metadata mismatch in concurrency

2020-07-22 Thread Akash R Nilugal (Jira)
Akash R Nilugal created CARBONDATA-3920:
---

 Summary: compaction failure issue for SI table and metadata 
mismatch in concurrency
 Key: CARBONDATA-3920
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3920
 Project: CarbonData
  Issue Type: Bug
Reporter: Akash R Nilugal
Assignee: Akash R Nilugal


when load and compaction run concurrently, sometimes the data files or 
segment folders go missing, and sometimes the SI metadata is overwritten, 
which leads to metadata inconsistency





[jira] [Created] (CARBONDATA-3918) Select count(*) gives extra data after multiple updates with index server running

2020-07-22 Thread Akash R Nilugal (Jira)
Akash R Nilugal created CARBONDATA-3918:
---

 Summary: Select count(*) gives extra data after multiple updates 
with index server running
 Key: CARBONDATA-3918
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3918
 Project: CarbonData
  Issue Type: Bug
Reporter: Akash R Nilugal
Assignee: Akash R Nilugal
 Fix For: 2.1.0


Select count(*) gives extra data after multiple updates with index server 
running

start index server.

create a table and load data, then perform two updates and run count(*), 
which gives extra data.





[jira] [Resolved] (CARBONDATA-3902) Query on partition table gives incorrect results after Delete records using CDC

2020-07-21 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-3902.
-
Fix Version/s: 2.1.0
   Resolution: Fixed

> Query on partition table gives incorrect results after Delete records using 
> CDC
> ---
>
> Key: CARBONDATA-3902
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3902
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Indhumathi Muthumurugesh
>Priority: Major
> Fix For: 2.1.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Steps to Reproduce Issue :
> {code:java}
> import scala.collection.JavaConverters._
> import java.sql.Date
> import org.apache.spark.sql._
> import org.apache.spark.sql.CarbonSession._
> import org.apache.spark.sql.catalyst.TableIdentifier
> import 
> org.apache.spark.sql.execution.command.mutation.merge.{CarbonMergeDataSetCommand,
>  DeleteAction, InsertAction, InsertInHistoryTableAction, MergeDataSetMatches, 
> MergeMatch, UpdateAction, WhenMatched, WhenNotMatched, 
> WhenNotMatchedAndExistsOnlyOnTarget}
> import org.apache.spark.sql.functions._
> import org.apache.spark.sql.test.util.QueryTest
> import org.apache.spark.sql.types.{BooleanType, DateType, IntegerType, 
> StringType, StructField, StructType}
> import spark.implicits._
> sql("drop table if exists target").show()
> val initframe = spark.createDataFrame(Seq(
> Row("a", "0"),
> Row("b", "1"),
> Row("c", "2"),
> Row("d", "3")
> ).asJava, StructType(Seq(StructField("key", StringType), StructField("value", 
> StringType))))
> initframe.write
> .format("carbondata")
> .option("tableName", "target")
> .option("partitionColumns", "value")
> .mode(SaveMode.Overwrite)
> .save()
> val target = spark.read.format("carbondata").option("tableName", 
> "target").load()
> var ccd =
> spark.createDataFrame(Seq(
> Row("a", "10", false, 0),
> Row("a", null, true, 1),
> Row("b", null, true, 2),
> Row("c", null, true, 3),
> Row("c", "20", false, 4),
> Row("c", "200", false, 5),
> Row("e", "100", false, 6)
> ).asJava,
> StructType(Seq(StructField("key", StringType),
> StructField("newValue", StringType),
> StructField("deleted", BooleanType), StructField("time", IntegerType))))
> ccd.createOrReplaceTempView("changes")
> ccd = sql("SELECT key, latest.newValue as newValue, latest.deleted as deleted 
> FROM ( SELECT key, max(struct(time, newValue, deleted)) as latest FROM 
> changes GROUP BY key)")
> val updateMap = Map("key" -> "B.key", "value" -> 
> "B.newValue").asInstanceOf[Map[Any, Any]]
> val insertMap = Map("key" -> "B.key", "value" -> 
> "B.newValue").asInstanceOf[Map[Any, Any]]
> target.as("A").merge(ccd.as("B"), "A.key=B.key").
> whenMatched("B.deleted=true").
> delete().execute(){code}
>  
> After this delete operation, partitions 0, 1 and 2 should have been 
> deleted.
> Actual:
> {color:#067d17}select * from target order by key;{color}
> {color:#067d17}+---+-----+
> |key|value|
> +---+-----+
> |a  |0    |
> |b  |1    |
> |c  |2    |
> |d  |3    |
> +---+-----+{color}
> {color:#067d17}Expected:{color}
> {color:#067d17}+---+-----+
> |key|value|
> +---+-----+
> |d  |3    |
> +---+-----+{color}





[jira] [Resolved] (CARBONDATA-3907) Reuse firePreLoadEvents and firePostLoadEvents methods from CommonLoadUtils to trigger LoadTablePreExecutionEvent and LoadTablePostExecutionEvent respectively in al

2020-07-21 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-3907.
-
Resolution: Fixed

> Reuse firePreLoadEvents and firePostLoadEvents methods from CommonLoadUtils 
> to trigger LoadTablePreExecutionEvent and LoadTablePostExecutionEvent 
> respectively in alter table add segment flow
> --
>
> Key: CARBONDATA-3907
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3907
> Project: CarbonData
>  Issue Type: Improvement
>  Components: spark-integration
>Affects Versions: 2.0.0
>Reporter: Venugopal Reddy K
>Priority: Minor
> Fix For: 2.1.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> *[Issue]*
> Currently we have 2 different ways of firing LoadTablePreExecutionEvent and 
> LoadTablePostExecutionEvent. We can reuse firePreLoadEvents and 
> firePostLoadEvents methods from CommonLoadUtils to trigger 
> LoadTablePreExecutionEvent and LoadTablePostExecutionEvent respectively in 
> alter table add segment flow as well. 
> *[Suggestion]*
> Reuse firePreLoadEvents and firePostLoadEvents methods from CommonLoadUtils 
> to trigger LoadTablePreExecutionEvent and LoadTablePostExecutionEvent 
> respectively in alter table add segment flow.





[jira] [Resolved] (CARBONDATA-3909) Insert into select fails after insert decimal value as null and set sort scope to global sort

2020-07-17 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-3909.
-
Fix Version/s: 2.1.0
   Resolution: Fixed

> Insert into select fails after insert decimal value as null and set sort 
> scope to global sort
> -
>
> Key: CARBONDATA-3909
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3909
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 2.0.1
> Environment: Spark 2.3.2, 2.4.5
>Reporter: Chetan Bhat
>Priority: Minor
> Fix For: 2.1.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Steps -
> insert decimal value as null and set sort scope to global sort and do insert 
> into select.
>  
> Issue : - Insert into select fails.
>  
> Expected : - Insert into select should be success.
>  





[jira] [Created] (CARBONDATA-3911) NullPointerException is thrown when clean files is executed after two updates

2020-07-16 Thread Akash R Nilugal (Jira)
Akash R Nilugal created CARBONDATA-3911:
---

 Summary: NullPointerException is thrown when clean files is 
executed after two updates
 Key: CARBONDATA-3911
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3911
 Project: CarbonData
  Issue Type: Bug
Reporter: Akash R Nilugal
Assignee: Akash R Nilugal


* create table
* load data
* load one more data
* update1
* update2
* clean files

fails with NullPointer





[jira] [Created] (CARBONDATA-3910) load fails when csv file present in local and loading to cluster

2020-07-16 Thread Akash R Nilugal (Jira)
Akash R Nilugal created CARBONDATA-3910:
---

 Summary: load fails when csv file present in local and loading to 
cluster
 Key: CARBONDATA-3910
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3910
 Project: CarbonData
  Issue Type: Bug
Reporter: Akash R Nilugal
Assignee: Akash R Nilugal


load fails when the csv file is present on the local file system and the 
load runs on a cluster





[jira] [Updated] (CARBONDATA-3831) Support write carbon files with presto.

2020-07-14 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal updated CARBONDATA-3831:

Attachment: carbon_presto_write_transactional SUpport.pdf

> Support write carbon files with presto.
> ---
>
> Key: CARBONDATA-3831
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3831
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Akash R Nilugal
>Assignee: Akash R Nilugal
>Priority: Major
> Attachments: carbon_presto_write_transactional SUpport.pdf
>
>
> As we know, CarbonData is an indexed columnar data format for fast analytics 
> on big data platforms, and it is already integrated with query engines such 
> as spark and presto. Currently, with presto we only support querying 
> carbondata files; we do not yet support writing carbondata files through the 
> presto engine.
>   Currently, presto is integrated with carbondata for reading carbondata 
> files. For this, the store must already exist (typically written by carbon 
> in spark), and the table must be in the hive metastore. Using the carbondata 
> connector we can read the carbondata files, but we cannot create a table or 
> load data into a table from presto. So reading carbon files is a somewhat 
> hectic job, since they must first be written with another engine.
> Here, I will be adding transactional load support in the presto integration 
> for carbon. 





[jira] [Resolved] (CARBONDATA-3852) CCD Merge with Partition Table is giving different results in different spark deploy modes

2020-07-13 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-3852.
-
Fix Version/s: 2.1.0
   Resolution: Fixed

> CCD Merge with Partition Table is giving different results in different spark 
> deploy modes
> --
>
> Key: CARBONDATA-3852
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3852
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 2.0.0
>Reporter: Sachin Ramachandra Setty
>Priority: Major
> Fix For: 2.1.0
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> The result sets differ when the same SQL queries are run in spark-shell
> --master local and spark-shell --master yarn (two different Spark deploy
> modes).
> {code}
> import scala.collection.JavaConverters._
> import java.sql.Date
> import org.apache.spark.sql._
> import org.apache.spark.sql.CarbonSession._
> import org.apache.spark.sql.catalyst.TableIdentifier
> import org.apache.spark.sql.execution.command.mutation.merge.{CarbonMergeDataSetCommand, DeleteAction, InsertAction, InsertInHistoryTableAction, MergeDataSetMatches, MergeMatch, UpdateAction, WhenMatched, WhenNotMatched, WhenNotMatchedAndExistsOnlyOnTarget}
> import org.apache.spark.sql.functions._
> import org.apache.spark.sql.test.util.QueryTest
> import org.apache.spark.sql.types.{BooleanType, DateType, IntegerType, 
> StringType, StructField, StructType}
> import spark.implicits._
> val df1 = sc.parallelize(1 to 10, 4).map{ x => ("id"+x, 
> s"order$x",s"customer$x", x*10, x*75, 1)}.toDF("id", "name", "c_name", 
> "quantity", "price", "state")
> df1.write.format("carbondata").option("tableName", 
> "order").mode(SaveMode.Overwrite).save()
> val dwframe = spark.read.format("carbondata").option("tableName", 
> "order").load()
> val dwSelframe = dwframe.as("A")
> val ds1 = sc.parallelize(3 to 10, 4)
>   .map {x =>
> if (x <= 4) {
>   ("id"+x, s"order$x",s"customer$x", x*10, x*75, 2)
> } else {
>   ("id"+x, s"order$x",s"customer$x", x*10, x*75, 1)
> }
>   }.toDF("id", "name", "c_name", "quantity", "price", "state")
> 
> val ds2 = sc.parallelize(1 to 2, 4).map {x => ("newid"+x, 
> s"order$x",s"customer$x", x*10, x*75, 1)}.toDS().toDF()
> val ds3 = ds1.union(ds2)  
> val odsframe = ds3.as("B")
>   
> sql("drop table if exists target").show()
> val initframe = spark.createDataFrame(Seq(
>   Row("a", "0"),
>   Row("b", "1"),
>   Row("c", "2"),
>   Row("d", "3")
> ).asJava, StructType(Seq(StructField("key", StringType), StructField("value", StringType))))
> initframe.write
>   .format("carbondata")
>   .option("tableName", "target")
>   .option("partitionColumns", "value")
>   .mode(SaveMode.Overwrite)
>   .save()
>   
> val target = spark.read.format("carbondata").option("tableName", 
> "target").load()
> var ccd =
>   spark.createDataFrame(Seq(
> Row("a", "10", false,  0),
> Row("a", null, true, 1),   
> Row("b", null, true, 2),   
> Row("c", null, true, 3),   
> Row("c", "20", false, 4),
> Row("c", "200", false, 5),
> Row("e", "100", false, 6) 
>   ).asJava,
> StructType(Seq(StructField("key", StringType),
>   StructField("newValue", StringType),
>   StructField("deleted", BooleanType), StructField("time", IntegerType))))
> 
> ccd.createOrReplaceTempView("changes")
> ccd = sql("SELECT key, latest.newValue as newValue, latest.deleted as deleted 
> FROM ( SELECT key, max(struct(time, newValue, deleted)) as latest FROM 
> changes GROUP BY key)")
> val updateMap = Map("key" -> "B.key", "value" -> 
> "B.newValue").asInstanceOf[Map[Any, Any]]
> val insertMap = Map("key" -> "B.key", "value" -> 
> "B.newValue").asInstanceOf[Map[Any, Any]]
> target.as("A").merge(ccd.as("B"), "A.key=B.key").
>   whenMatched("B.deleted=false").
>   updateExpr(updateMap).
>   whenNotMatched("B.deleted=false").
>   insertExpr(insertMap).
>   whenMatched("B.deleted=true").
>   delete().execute()
>   
> {code}
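The `max(struct(time, newValue, deleted)) ... GROUP BY key` query in the repro above collapses the change feed to the latest change per key before merging (a struct compares field by field, so `time` decides the winner). A minimal Python sketch of that dedup logic, as an illustration only and not CarbonData code:

```python
# Change feed rows: (key, newValue, deleted, time), mirroring the `ccd` frame above.
changes = [
    ("a", "10", False, 0),
    ("a", None, True, 1),
    ("b", None, True, 2),
    ("c", None, True, 3),
    ("c", "20", False, 4),
    ("c", "200", False, 5),
    ("e", "100", False, 6),
]

def latest_per_key(rows):
    """For each key, keep the (newValue, deleted) of the row with the greatest time."""
    latest = {}
    for key, new_value, deleted, time in rows:
        # max(struct(time, newValue, deleted)) compares time first,
        # so the change with the largest time always wins for a key.
        if key not in latest or time > latest[key][2]:
            latest[key] = (new_value, deleted, time)
    return {k: (v[0], v[1]) for k, v in latest.items()}
```

With the feed above, key `c` resolves to its time-5 change `("200", False)` and keys `a`/`b` resolve to their tombstone rows, matching what the merge then applies.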
> SQL Queries to run :
> {code} 
> sql("select count(*) from target").show()
> sql("select * from target order by key").show()
> {code}
> Results in spark-shell --master yarn
> {code}
> scala> sql("select count(*) from target").show()
> +--------+
> |count(1)|
> +--------+
> |       4|
> +--------+
> scala> sql("select * from target order by key").show()
> +---+-----+
> |key|value|
> +---+-----+
> |  a|    0|
> |  b|    1|
> |  c|    2|
> |  d|    3|
> +---+-----+
> {code}
> Results in spark-shell --master local
> {code}
> scala> sql("select count(*) from target").show()
> +--------+
> |count(1)|
> +--------+
> |       3|
> +--------+
> scala> sql("select * from target order by key").show()
> 

[jira] [Resolved] (CARBONDATA-3851) Merge Update and Insert with Partition Table is giving different results in different spark deploy modes

2020-07-13 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-3851.
-
Fix Version/s: 2.1.0
   Resolution: Fixed

> Merge Update and Insert with Partition Table is giving different results in 
> different spark deploy modes
> 
>
> Key: CARBONDATA-3851
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3851
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 2.0.0
>Reporter: Sachin Ramachandra Setty
>Priority: Major
> Fix For: 2.1.0
>
>
> The result sets differ when the same queries are run in spark-shell --master
> local and spark-shell --master yarn (two different Spark deploy modes).
> Steps to Reproduce Issue :
> {code}
> import scala.collection.JavaConverters._
> import java.sql.Date
> import org.apache.spark.sql._
> import org.apache.spark.sql.CarbonSession._
> import org.apache.spark.sql.catalyst.TableIdentifier
> import org.apache.spark.sql.execution.command.mutation.merge.{CarbonMergeDataSetCommand, DeleteAction, InsertAction, InsertInHistoryTableAction, MergeDataSetMatches, MergeMatch, UpdateAction, WhenMatched, WhenNotMatched, WhenNotMatchedAndExistsOnlyOnTarget}
> import org.apache.spark.sql.functions._
> import org.apache.spark.sql.test.util.QueryTest
> import org.apache.spark.sql.types.{BooleanType, DateType, IntegerType, StringType, StructField, StructType}
> import spark.implicits._
> sql("drop table if exists order").show()
> sql("drop table if exists order_hist").show()
> sql("create table order_hist(id string, name string, quantity int, price int, 
> state int) PARTITIONED BY (c_name String) STORED AS carbondata").show()
> val initframe = sc.parallelize(1 to 10, 4)
>   .map { x => ("id"+x, s"order$x", s"customer$x", x*10, x*75, 1) }
>   .toDF("id", "name", "c_name", "quantity", "price", "state")
> initframe.write
>  .format("carbondata")
>  .option("tableName", "order")
>  .option("partitionColumns", "c_name")
>  .mode(SaveMode.Overwrite)
>  .save()
> val dwframe = spark.read.format("carbondata").option("tableName", 
> "order").load()
> val dwSelframe = dwframe.as("A")
> val ds1 = sc.parallelize(3 to 10, 4)
>   .map { x =>
>     if (x <= 4) {
>       ("id"+x, s"order$x", s"customer$x", x*10, x*75, 2)
>     } else {
>       ("id"+x, s"order$x", s"customer$x", x*10, x*75, 1)
>     }
>   }.toDF("id", "name", "c_name", "quantity", "price", "state")
> ds1.show()
> val ds2 = sc.parallelize(1 to 2, 4)
>   .map { x => ("newid"+x, s"order$x", s"customer$x", x*10, x*75, 1) }
>   .toDS().toDF()
>  ds2.show()
>  val ds3 = ds1.union(ds2)
>  ds3.show()
> val odsframe = ds3.as("B")
> var matches = Seq.empty[MergeMatch]
>  val updateMap = Map(col("id") -> col("A.id"),
>  col("price") -> expr("B.price + 1"),
>  col("state") -> col("B.state"))
> val insertMap = Map(col("id") -> col("B.id"),
>  col("name") -> col("B.name"),
>  col("c_name") -> col("B.c_name"),
>  col("quantity") -> col("B.quantity"),
>  col("price") -> expr("B.price * 100"),
>  col("state") -> col("B.state"))
> val insertMap_u = Map(col("id") -> col("id"),
>  col("name") -> col("name"),
>  col("c_name") -> lit("insert"),
>  col("quantity") -> col("quantity"),
>  col("price") -> expr("price"),
>  col("state") -> col("state"))
> val insertMap_d = Map(col("id") -> col("id"),
>  col("name") -> col("name"),
>  col("c_name") -> lit("delete"),
>  col("quantity") -> col("quantity"),
>  col("price") -> expr("price"),
>  col("state") -> col("state"))
> matches ++= Seq(WhenMatched(Some(col("A.state") =!= col("B.state")))
>   .addAction(UpdateAction(updateMap))
>   .addAction(InsertInHistoryTableAction(insertMap_u, TableIdentifier("order_hist"))))
> matches ++= Seq(WhenNotMatched().addAction(InsertAction(insertMap)))
> matches ++= Seq(WhenNotMatchedAndExistsOnlyOnTarget()
>   .addAction(DeleteAction())
>   .addAction(InsertInHistoryTableAction(insertMap_d, TableIdentifier("order_hist"))))
> {code}
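The three match clauses built above say: update matched rows whose `state` changed (mirroring them into `order_hist` as 'insert'), insert rows missing from the target, and delete target rows absent from the source (mirroring them as 'delete'). A hypothetical Python sketch of those merge semantics, not CarbonData's implementation; the price arithmetic follows the `updateMap`/`insertMap` expressions above:

```python
def merge(target, source):
    """target/source: dict id -> (price, state). Returns updated target and history rows."""
    history = []  # stand-in for the order_hist table: (id, c_name)
    for key, (s_price, s_state) in source.items():
        if key in target:
            t_price, t_state = target[key]
            if t_state != s_state:                    # WhenMatched(A.state =!= B.state)
                target[key] = (s_price + 1, s_state)  # UpdateAction: price = B.price + 1
                history.append((key, "insert"))       # InsertInHistoryTableAction(insertMap_u)
        else:                                         # WhenNotMatched
            target[key] = (s_price * 100, s_state)    # InsertAction: price = B.price * 100
    for key in list(target):
        if key not in source:                         # WhenNotMatchedAndExistsOnlyOnTarget
            del target[key]                           # DeleteAction
            history.append((key, "delete"))           # InsertInHistoryTableAction(insertMap_d)
    return target, history
```

For example, merging source `{"id2": (150, 2), "newid1": (75, 1)}` into target `{"id1": (75, 1), "id2": (150, 1)}` updates `id2`, inserts `newid1`, deletes `id1`, and records one 'insert' and one 'delete' history row.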
>  
> SQL Queries :
> {code}
> sql("select count(*) from order").show()
>  sql("select count(*) from order where state = 2").show()
>  sql("select price from order where id = 'newid1'").show()
>  sql("select count(*) from order_hist where c_name = 'delete'").show()
>  sql("select count(*) from order_hist where c_name = 'insert'").show()
> {code}
> Results in spark-shell --master yarn
> {code}
> scala> sql("select count(*) from order").show()
> +--------+
> |count(1)|
> +--------+
> |      10|
> +--------+
> scala> sql("select count(*) from order where state = 2").show()
> +--------+
> |count(1)|
> +--------+
> |       0|
> +--------+
> scala> sql("select price from order where id = 'newid1'").show()
> +-----+
> |price|
> +-----+
> +-----+
> scala> sql("select count(*) from order_hist where c_name = 

[jira] [Resolved] (CARBONDATA-3884) During Concurrent loads in main table with SI table with isSITableEnabled = false, one of the concurrent load fails

2020-07-10 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-3884.
-
Fix Version/s: (was: 2.0.0)
   2.1.0
   Resolution: Fixed

> During Concurrent loads in main table with SI table with isSITableEnabled = 
> false, one of the concurrent load fails
> ---
>
> Key: CARBONDATA-3884
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3884
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 2.0.0
>Reporter: Vikram Ahuja
>Priority: Minor
> Fix For: 2.1.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> During concurrent loads into a main table that has an SI table with
> isSITableEnabled = false, one of the concurrent loads fails.
> The steps are as follows:
> 1. Create a main table
> 2. Create an SI table
> 3. Load data into the main table
> 4. Alter table set 'isSITableEnabled'='false'
> 5. Modify SILoadEventListener so that it sleeps for a few minutes after
> fetching the main table details
> 6. When concurrent loads are fired, each load fetches the main table details,
> then sleeps, and one of the loads fails after some time
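The race in steps 5 and 6 is a classic check-then-act problem: one load snapshots the main table details, sleeps, and then acts on a snapshot that a concurrent load has already invalidated. A hypothetical, deterministic Python simulation of that interleaving (illustration only, not CarbonData code; events replace the listener's sleep to force the ordering):

```python
import threading

table = {"segments": 1}        # stand-in for the main table details
a_read = threading.Event()
b_done = threading.Event()
errors = []

def load_a():
    snapshot = dict(table)     # step 5/6: fetch main table details
    a_read.set()
    b_done.wait()              # simulated sleep in SILoadEventListener
    if snapshot != table:      # acts on stale details -> this load fails
        errors.append("load A failed: table changed underneath")

def load_b():
    a_read.wait()
    table["segments"] += 1     # concurrent load commits a new segment
    b_done.set()

ta = threading.Thread(target=load_a)
tb = threading.Thread(target=load_b)
ta.start(); tb.start(); ta.join(); tb.join()
```

Here load A always fails, because load B is forced to commit between A's read and A's use of the snapshot, which is the interleaving the sleep in step 5 makes likely in practice.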





  1   2   3   4   >