-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19549/
-----------------------------------------------------------

Review request for hive.


Repository: hive-git


Description
-------

In this scenario, PPD (predicate pushdown) on the script (transform) operator
performed the following incorrect pushdown:

script --> filter (state=1)
             --> select, insert into test1
       --> filter (state=2)
             --> select, insert into test2

into:

script --> filter (state=1 and state=2)   //not possible.
         --> select, insert into test1
         --> select, insert into test2
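
To make the problem concrete, here is a minimal Java sketch of why the merged
filter is wrong.  It does not use any Hive classes; the class name
SiblingFilterDemo, the helper apply(), and the sample 'state' values are all
made up for illustration.  Each sibling filter on its own keeps its rows, while
the AND of the two keeps nothing, so both test1 and test2 would end up empty.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class SiblingFilterDemo {
    // Rows are reduced to the value of a hypothetical 'state' column.
    static List<Integer> apply(List<Integer> states, Predicate<Integer> pred) {
        return states.stream().filter(pred).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up output of the script operator.
        List<Integer> scriptOutput = Arrays.asList(1, 2, 2, 3);

        Predicate<Integer> forTest1 = s -> s == 1;  // filter (state=1)
        Predicate<Integer> forTest2 = s -> s == 2;  // filter (state=2)

        // Correct plan: each branch applies only its own filter.
        System.out.println(apply(scriptOutput, forTest1));  // [1]
        System.out.println(apply(scriptOutput, forTest2));  // [2, 2]

        // Incorrect plan: the sibling predicates are ANDed into one filter.
        Predicate<Integer> merged = forTest1.and(forTest2);
        System.out.println(apply(scriptOutput, merged));    // []
    }
}
```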


The bug was a combination of two things: first, these filters were chosen by
FilterPPD; second, ScriptPPD called the sequence "mergeWithChildrenPred /
createFilters (pred)", which performed the above transformation.  ScriptPPD was
one of the few simple operators that did this.  I also tried some other
combinations, such as extract (see my added test in transform_ppr2.q) and just
a select operator.

The fix is to skip marking a predicate as a 'candidate' for pushdown if its
filter is a sibling of another filter.  We still want to push down the children
of a select transform with grandchildren, etc.
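
The sibling check can be sketched roughly as below.  This is only an
illustration of the idea, not the code in OpProcFactory.java: the Op class and
the methods hasSiblingFilter() and isPushdownCandidate() are hypothetical names
invented for this example and do not correspond to Hive's operator or PPD APIs.

```java
import java.util.ArrayList;
import java.util.List;

public class SiblingFilterCheck {
    // Hypothetical, minimal operator node; not Hive's Operator class.
    static final class Op {
        final String kind;                         // e.g. "script", "filter"
        final List<Op> children = new ArrayList<>();
        Op parent;

        Op(String kind) { this.kind = kind; }

        // Attach a child and return this node so calls can be chained.
        Op add(Op child) {
            child.parent = this;
            children.add(child);
            return this;
        }
    }

    // True if another child of this operator's parent is also a filter.
    static boolean hasSiblingFilter(Op op) {
        if (op.parent == null) {
            return false;
        }
        for (Op sibling : op.parent.children) {
            if (sibling != op && "filter".equals(sibling.kind)) {
                return true;
            }
        }
        return false;
    }

    // A filter's predicate is a pushdown candidate only if it has no sibling filter.
    static boolean isPushdownCandidate(Op op) {
        return "filter".equals(op.kind) && !hasSiblingFilter(op);
    }

    public static void main(String[] args) {
        // script with two sibling filters: neither should be a candidate.
        Op script = new Op("script");
        Op f1 = new Op("filter");
        Op f2 = new Op("filter");
        script.add(f1).add(f2);
        System.out.println(isPushdownCandidate(f1));   // false
        System.out.println(isPushdownCandidate(f2));   // false

        // script with a single filter child: still a candidate.
        Op script2 = new Op("script");
        Op only = new Op("filter");
        script2.add(only);
        System.out.println(isPushdownCandidate(only)); // true
    }
}
```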


Diffs
-----

  ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java 40298e1 
  ql/src/test/queries/clientpositive/transform_ppd_multi.q PRE-CREATION 
  ql/src/test/queries/clientpositive/transform_ppr2.q 85ef3ac 
  ql/src/test/results/clientpositive/transform_ppd_multi.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/transform_ppr2.q.out 4bddc69 

Diff: https://reviews.apache.org/r/19549/diff/


Testing
-------

Reproduced the issue in transform_ppd_multi.q, and also reproduced a similar
issue with an extract (cluster) operator in transform_ppr2.q.  Ran the other
transform_ppd and general ppd tests to ensure there is no regression.


Thanks,

Szehon Ho
