[jira] [Updated] (HIVE-15398) change metadata-only queries to still read the original table (in some cases?)

Sergey Shelukhin (JIRA) Thu, 08 Dec 2016 16:16:04 -0800

     [ https://issues.apache.org/jira/browse/HIVE-15398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Sergey Shelukhin updated HIVE-15398:
------------------------------------
    Description: 
See HIVE-15397.
There are multiple complementary ways to handle this properly:
1) Enhance MetadataOnly to recognize when table emptiness matters, and only
optimize safe query patterns (or apply the approaches below only in the
unsafe cases).
2) Create the original InputFormat (IF) during compilation, get a record
reader, and check whether it is empty. This seems to be the only bulletproof
method in terms of correctness, but it may break because setup and access
differ between tasks and compilation. It may also have security implications,
e.g. if compilation runs in HiveServer2 (HS2) and its permissions differ from
those of the tasks.
3) Inject a limit into the table scan (either as a limit in the plan, or
hacked into the TableScan (TS) operator itself specifically for this feature)
and keep the original InputFormat. Instead of 0 or 1 null rows, the scan would
then return 0 or 1 rows from the original split while still avoiding large
scans, which is the goal.
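
To make the difference between approaches 2 and 3 concrete, here is a minimal
toy sketch in Python. It is not Hive code: CountingSource, is_empty and
limited_scan are hypothetical names invented for illustration, and a plain
Python iterable stands in for a split read through the original InputFormat.
It only shows that probing for emptiness, or capping the scan at one row,
touches at most one row of the underlying data.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Optional, Tuple

Row = Tuple[Optional[str], ...]


class CountingSource:
    """Hypothetical stand-in for one split of the original table.

    Counts how many rows were actually pulled, so the cost of a scan
    can be observed.
    """

    def __init__(self, rows: Iterable[Row]) -> None:
        self._rows = list(rows)
        self.rows_read = 0

    def __iter__(self) -> Iterator[Row]:
        for row in self._rows:
            self.rows_read += 1
            yield row


_NO_ROW = object()  # sentinel: distinguishes "no row" from any real row


def is_empty(source: Iterable[Row]) -> bool:
    """Approach 2 (sketch): open a reader and check whether it yields a row."""
    return next(iter(source), _NO_ROW) is _NO_ROW


def limited_scan(source: Iterable[Row], limit: int = 1) -> List[Row]:
    """Approach 3 (sketch): keep the original source, stop after `limit` rows."""
    return list(islice(source, limit))


big = CountingSource((f"k{i}",) for i in range(1000))
print(limited_scan(big), big.rows_read)  # [('k0',)] 1
print(limited_scan(CountingSource([])))  # []
```

An empty source yields 0 rows and a non-empty one yields exactly 1 real row,
which is the 0-or-1 behaviour described in approach 3. How such a limit would
be wired into the plan or the table scan operator is left open here, as it is
in the description above.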


> change metadata-only queries to still read the original table (in some cases?)
> ------------------------------------------------------------------------------
>
>                 Key: HIVE-15398
>                 URL: https://issues.apache.org/jira/browse/HIVE-15398
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
