[jira] [Updated] (HIVE-25638) Select returns deleted records in Hive ACID table

2021-10-29 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-25638:
---
Description: Hive stores stripe statistics in the ORC files. During a select, 
these statistics are used to create the SARG (search argument), which reduces 
the number of records read from the delete-delta files. Currently, when a file 
has more than one stripe, the generated SARG is incorrect because it uses the 
first stripe index for both the min and the max of the key interval. The max of 
the key interval should be obtained from the last stripe index. This causes 
some valid delete records to be skipped, so the deleted rows are returned to 
the user. The last stripe is needed here instead of the first one because the 
keys are ordered in the file.  (was: 
Hive stores stripe statistics in the ORC files. During a select, these 
statistics are used to create the SARG (search argument), which reduces the 
number of records read from the delete-delta files. Currently, when a file has 
more than one stripe, the generated SARG is incorrect because it uses the first 
stripe index for both the min and the max of the key interval. The max of the 
key interval should be obtained from the last stripe index. This causes some 
valid delete records to be skipped, so the deleted rows are returned to the 
user.)

> Select returns deleted records in Hive ACID table
> -------------------------------------------------
>
> Key: HIVE-25638
> URL: https://issues.apache.org/jira/browse/HIVE-25638
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Reporter: mahesh kumar behera
> Assignee: mahesh kumar behera
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Hive stores stripe statistics in the ORC files. During a select, these 
> statistics are used to create the SARG (search argument), which reduces the 
> number of records read from the delete-delta files. Currently, when a file 
> has more than one stripe, the generated SARG is incorrect because it uses 
> the first stripe index for both the min and the max of the key interval. The 
> max of the key interval should be obtained from the last stripe index. This 
> causes some valid delete records to be skipped, so the deleted rows are 
> returned to the user. The last stripe is needed here instead of the first 
> one because the keys are ordered in the file.
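
The interval problem described above can be illustrated with a small sketch. This is not Hive code: `StripeStats`, `key_interval_buggy`, `key_interval_fixed` and `delete_keys_read` are hypothetical names, and plain integers stand in for the real row keys. It only assumes what the description states, namely that keys are ordered in the file and that each stripe carries min/max key statistics.

```python
from typing import List, NamedTuple, Tuple


class StripeStats(NamedTuple):
    """Hypothetical stand-in for per-stripe key statistics (not a Hive class)."""
    min_key: int
    max_key: int


def key_interval_buggy(stripes: List[StripeStats]) -> Tuple[int, int]:
    # Behaviour described in the report: the first stripe supplies both bounds.
    first = stripes[0]
    return first.min_key, first.max_key


def key_interval_fixed(stripes: List[StripeStats]) -> Tuple[int, int]:
    # Keys are ordered in the file, so the first stripe holds the smallest key
    # and the last stripe holds the largest key.
    return stripes[0].min_key, stripes[-1].max_key


def delete_keys_read(delete_keys: List[int], interval: Tuple[int, int]) -> List[int]:
    # Keep only the delete records whose key falls inside the interval,
    # mimicking how the SARG prunes the delete-delta read.
    low, high = interval
    return [k for k in delete_keys if low <= k <= high]


# Three stripes with ordered, non-overlapping key ranges (illustrative values).
stripes = [StripeStats(1, 100), StripeStats(101, 200), StripeStats(201, 300)]
delete_keys = [50, 150, 250]

buggy = delete_keys_read(delete_keys, key_interval_buggy(stripes))
fixed = delete_keys_read(delete_keys, key_interval_fixed(stripes))
print(buggy)  # [50]
print(fixed)  # [50, 150, 250]
```

With the first-stripe interval, the delete records for keys 150 and 250 are never read, so those rows would still appear in the select result. Taking the max from the last stripe covers the whole file.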



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-25638) Select returns deleted records in Hive ACID table

2021-10-26 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25638:
--
Labels: pull-request-available  (was: )

> Select returns deleted records in Hive ACID table
> -------------------------------------------------
>
> Key: HIVE-25638
> URL: https://issues.apache.org/jira/browse/HIVE-25638
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Reporter: mahesh kumar behera
> Assignee: mahesh kumar behera
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Hive stores stripe statistics in the ORC files. During a select, these 
> statistics are used to create the SARG (search argument), which reduces the 
> number of records read from the delete-delta files. Currently, when a file 
> has more than one stripe, the generated SARG is incorrect because it uses 
> the first stripe index for both the min and the max of the key interval. The 
> max of the key interval should be obtained from the last stripe index. This 
> causes some valid delete records to be skipped, so the deleted rows are 
> returned to the user.





[jira] [Updated] (HIVE-25638) Select returns deleted records in Hive ACID table

2021-10-24 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-25638:
---
Description: Hive stores stripe statistics in the ORC files. During a select, 
these statistics are used to create the SARG (search argument), which reduces 
the number of records read from the delete-delta files. Currently, when a file 
has more than one stripe, the generated SARG is incorrect because it uses the 
first stripe index for both the min and the max of the key interval. The max of 
the key interval should be obtained from the last stripe index. This causes 
some valid delete records to be skipped, so the deleted rows are returned to 
the user.  (was: Hive stores stripe statistics in the ORC files. During a 
select, these statistics are used to create the SARG (search argument), which 
reduces the number of records read from the delete-delta files. Currently, 
when a file has more than one stripe, the generated SARG is incorrect because 
it uses the first stripe index for both the min and the max of the key 
interval. The max of the key interval should be obtained from the last stripe 
index.)

> Select returns deleted records in Hive ACID table
> -------------------------------------------------
>
> Key: HIVE-25638
> URL: https://issues.apache.org/jira/browse/HIVE-25638
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Reporter: mahesh kumar behera
> Assignee: mahesh kumar behera
> Priority: Major
>
> Hive stores stripe statistics in the ORC files. During a select, these 
> statistics are used to create the SARG (search argument), which reduces the 
> number of records read from the delete-delta files. Currently, when a file 
> has more than one stripe, the generated SARG is incorrect because it uses 
> the first stripe index for both the min and the max of the key interval. The 
> max of the key interval should be obtained from the last stripe index. This 
> causes some valid delete records to be skipped, so the deleted rows are 
> returned to the user.





[jira] [Updated] (HIVE-25638) Select returns deleted records in Hive ACID table

2021-10-24 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera updated HIVE-25638:
---
Summary: Select returns deleted records in Hive ACID table  (was: Select 
returns the deleted records in Hive ACID table)

> Select returns deleted records in Hive ACID table
> -------------------------------------------------
>
> Key: HIVE-25638
> URL: https://issues.apache.org/jira/browse/HIVE-25638
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Reporter: mahesh kumar behera
> Assignee: mahesh kumar behera
> Priority: Major
>
> Hive stores stripe statistics in the ORC files. During a select, these 
> statistics are used to create the SARG (search argument), which reduces the 
> number of records read from the delete-delta files. Currently, when a file 
> has more than one stripe, the generated SARG is incorrect because it uses 
> the first stripe index for both the min and the max of the key interval. The 
> max of the key interval should be obtained from the last stripe index.


