Xiao Li created SPARK-13936:
-------------------------------

             Summary: PushPredicateThroughProject using Constraints
                 Key: SPARK-13936
                 URL: https://issues.apache.org/jira/browse/SPARK-13936
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.0.0
            Reporter: Xiao Li


In the query 
{code}
sql("""
  SELECT unionsrc1.key, unionsrc1.value, unionsrc2.key, unionsrc2.value
  FROM (select 'tst1' as key, cast(count(1) as string) as value from parquet_t1 s1
        UNION ALL
        select s2.key as key, s2.value as value from parquet_t1 s2 where s2.key < 10) unionsrc1
  JOIN (select 'tst1' as key, cast(count(1) as string) as value from parquet_t1 s3
        UNION ALL
        select s4.key as key, s4.value as value from parquet_t1 s4 where s4.key < 10) unionsrc2
  ON (unionsrc1.key = unionsrc2.key)
""").explain(true)
{code}

The optimizer generates many duplicate constraints in the Filter's constraints 
when it applies the rule {{PushPredicateThroughProject}}. Because of this, the 
optimizer also hits the maximum number of iterations. We should use constraints 
to avoid pushing down any predicate that already exists in the child's constraints.
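
The proposed check could look roughly like the sketch below. It is a toy model under stated assumptions, not Spark's Catalyst classes: predicates are plain strings, a child's constraints are a set of such strings, and {{splitByChildConstraints}} is a hypothetical helper name. It only illustrates skipping predicates that the child already guarantees.

```scala
// Toy model, not Spark's Catalyst API: predicates are plain strings, and a
// plan node's constraints are the set of predicates known to hold on its output.
object ConstraintPruningSketch {
  type Predicate = String

  // Hypothetical helper: split the conjuncts of a Filter condition into
  // (predicates that still need to be pushed down,
  //  predicates the child already guarantees).
  // Duplicates among the conjuncts are dropped first.
  def splitByChildConstraints(
      conjuncts: Seq[Predicate],
      childConstraints: Set[Predicate]): (Seq[Predicate], Seq[Predicate]) =
    conjuncts.distinct.partition(p => !childConstraints.contains(p))

  def main(args: Array[String]): Unit = {
    // Assumed example values, loosely modeled on the query in this report.
    val filterConjuncts = Seq("key < 10", "isnotnull(key)", "key < 10")
    val childConstraints = Set("key < 10")

    val (toPush, redundant) =
      splitByChildConstraints(filterConjuncts, childConstraints)

    println(s"push down: $toPush")
    println(s"already guaranteed by child: $redundant")
  }
}
```

If the rule only pushes the first group, re-applying it to the same plan adds nothing new, so the optimizer should reach a fixed point rather than running until the iteration limit.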



