Illya Yalovyy created HIVE-9225:
-----------------------------------

             Summary: Windowing functions are not executing efficiently when 
the window is identical
                 Key: HIVE-9225
                 URL: https://issues.apache.org/jira/browse/HIVE-9225
             Project: Hive
          Issue Type: Improvement
          Components: PTF-Windowing
    Affects Versions: 0.13.0
         Environment: Linux
            Reporter: Illya Yalovyy


The Hive optimizer and runtime do not recognize when several windowing 
functions share the same window specification. Even when the windows are 
identical, the windowing is re-executed for each function, so the runtime 
increases in proportion to the number of windows.

Example:
{code:sql}
select code, min(emp) over (partition by code order by emp range between 
current row and 300000000 following) from sample_big limit 10;
{code}
*Time taken: 1h:36m:12s*

{code:sql}
select code,
min(emp) over (partition by code order by emp  range between current row and 
300000000 following),
max(emp) over (partition by code order by emp  range between current row and 
300000000 following),
min(salary) over (partition by code order by emp  range between current row and 
300000000 following),
max(salary) over (partition by code order by emp  range between current row and 
300000000 following)
from sample_big limit 10;
{code}
*Time taken: 4h:0m:37s*
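
For reference, the four identical window specifications in the second query can 
be declared once with a named WINDOW clause. This is only a restatement of the 
same query, assuming the Hive version in use accepts the WINDOW clause; the 
name w is illustrative. It makes the shared window explicit, but it has not 
been measured here and is not claimed to change the runtime.

{code:sql}
-- Same query as above, with the shared window declared once as w.
select code,
min(emp) over w,
max(emp) over w,
min(salary) over w,
max(salary) over w
from sample_big
window w as (partition by code order by emp range between current row and 
300000000 following)
limit 10;
{code}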



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
