Feng Yuan created HIVE-11930:
--------------------------------

             Summary: how to prevent ppd the topN(a) udf predication in where clause?
                 Key: HIVE-11930
                 URL: https://issues.apache.org/jira/browse/HIVE-11930
             Project: Hive
          Issue Type: Bug
          Components: Hive
    Affects Versions: 0.14.0
            Reporter: Feng Yuan
            Priority: Blocker
             Fix For: 0.14.1

select a.state_date, a.customer, a.taskid, a.step_id, a.exit_title, a.pv,
       top1000(a.only_id)
from (
  select t1.state_date, t1.customer, t1.taskid, t1.step_id, t1.exit_title,
         t1.pv, t1.only_id
  from (
    select t11.state_date, t11.customer, t11.taskid, t11.step_id,
           t11.exit_title, t11.pv,
           concat(t11.customer, t11.taskid, t11.step_id) as only_id
    from (
      select state_date, customer, taskid, step_id, exit_title,
             count(*) as pv
      from bdi_fact2.mid_url_step
      where exit_url != '-1'
        and exit_title != '-1'
        and l_date = '2015-08-31'
      group by state_date, customer, taskid, step_id, exit_title
    ) t11
  ) t1
  order by t1.only_id, t1.pv desc
) a
where a.customer = 'Cdianyingwang'
  and a.taskid = '33'
  and a.step_id = '0'
  and top1000(a.only_id) <= 10;

In the example above, the outer predicate top1000(a.only_id) <= 10 is pushed down (PPD) into stage 1:

(
  select t11.state_date, t11.customer, t11.taskid, t11.step_id,
         t11.exit_title, t11.pv,
         concat(t11.customer, t11.taskid, t11.step_id) as only_id
  from (
    select state_date, customer, taskid, step_id, exit_title,
           count(*) as pv
    from bdi_fact2.mid_url_step
    where exit_url != '-1'
      and exit_title != '-1'
      and l_date = '2015-08-31'
    group by state_date, customer, taskid, step_id, exit_title
  ) t11
) t1

This stage runs with 2 reducers, so it outputs 20 records instead of 10. When those records reach the outer stage, the final result is exactly these 20 records.

Is there any way to hint that this topN UDF predicate should not be pushed down?

Thanks
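
One possible workaround, offered as a sketch rather than a confirmed fix: Hive has a session property, hive.optimize.ppd, that controls predicate pushdown. Assuming the extra rows come only from the pushdown described above, turning it off for this query may keep the top1000 filter in the outer stage. Note that it disables pushdown for every predicate in the session, so the filters on customer, taskid and step_id would no longer be pushed down either.

```sql
-- Assumption: the 20-row result is caused only by top1000(...) <= 10
-- being pushed below the ORDER BY into stage 1.
-- hive.optimize.ppd is Hive's switch for predicate pushdown (default: true).
set hive.optimize.ppd=false;

-- Run the original query here, unchanged.

-- Restore the default afterwards so later queries still use pushdown.
set hive.optimize.ppd=true;
```

Comparing the EXPLAIN output of the query before and after the setting should show whether the top1000 filter still appears in stage 1.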