Feng Yuan created HIVE-11930:
--------------------------------

             Summary: How to prevent predicate pushdown (PPD) of the topN(a) UDF predicate in the WHERE clause?
                 Key: HIVE-11930
                 URL: https://issues.apache.org/jira/browse/HIVE-11930
             Project: Hive
          Issue Type: Bug
          Components: Hive
    Affects Versions: 0.14.0
            Reporter: Feng Yuan
            Priority: Blocker
             Fix For: 0.14.1


select a.state_date, a.customer, a.taskid, a.step_id, a.exit_title, a.pv,
       top1000(a.only_id)
from (
    select t1.state_date, t1.customer, t1.taskid, t1.step_id, t1.exit_title,
           t1.pv, t1.only_id
    from (
        select t11.state_date,
               t11.customer,
               t11.taskid,
               t11.step_id,
               t11.exit_title,
               t11.pv,
               concat(t11.customer, t11.taskid, t11.step_id) as only_id
        from (
            select state_date, customer, taskid, step_id, exit_title,
                   count(*) as pv
            from bdi_fact2.mid_url_step
            where exit_url != '-1'
              and exit_title != '-1'
              and l_date = '2015-08-31'
            group by state_date, customer, taskid, step_id, exit_title
        ) t11
    ) t1
    order by t1.only_id, t1.pv desc
) a
where a.customer = 'Cdianyingwang'
  and a.taskid = '33'
  and a.step_id = '0'
  and top1000(a.only_id) <= 10;

In the example above, the outer predicate top1000(a.only_id) <= 10 is pushed
down (PPD) into the following stage:

Stage 1:
(
    select t11.state_date,
           t11.customer,
           t11.taskid,
           t11.step_id,
           t11.exit_title,
           t11.pv,
           concat(t11.customer, t11.taskid, t11.step_id) as only_id
    from (
        select state_date, customer, taskid, step_id, exit_title,
               count(*) as pv
        from bdi_fact2.mid_url_step
        where exit_url != '-1'
          and exit_title != '-1'
          and l_date = '2015-08-31'
        group by state_date, customer, taskid, step_id, exit_title
    ) t11
) t1

This stage runs with 2 reducers, so it outputs 20 records (presumably 10 from
each reducer). The outer stage then returns exactly those 20 records as the
final result, instead of the 10 that the predicate asks for.

So I would like to know: is there any way to hint that this topN UDF predicate should not be pushed down?
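
For context, one switch that does exist is the session-level setting
hive.optimize.ppd, which turns predicate pushdown off for every predicate in
the session rather than for a single UDF. A minimal sketch follows, assuming
that this blanket setting is acceptable for the query and that it also keeps
the top1000 predicate in the outer stage (not verified on 0.14.0):

```sql
-- Assumption: disabling PPD for the whole session is acceptable.
-- This is a global switch, not the per-predicate hint asked about above.
set hive.optimize.ppd=false;

-- ... run the original query unchanged here ...

-- Restore the default afterwards so later queries keep the optimization.
set hive.optimize.ppd=true;
```

The cost is that all other predicates in the session lose pushdown as well,
so this is a workaround rather than a targeted hint.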

Thanks




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
