Liang-Chi Hsieh created SPARK-19665:
---------------------------------------
Summary: Improve constraint propagation
Key: SPARK-19665
URL: https://issues.apache.org/jira/browse/SPARK-19665
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 2.1.0
Reporter: Liang-Chi Hsieh
If the projection contains aliased expressions, we propagate constraints by
fully expanding the original constraints with every alias.
This expansion costs a lot of computation time as the number of aliases increases.
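To make the cost concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark's actual implementation: the tuple representation of a constraint and the helper names `expand_constraint` and `expanded_count` are assumptions made for illustration only. It shows how a single binary constraint multiplies when each side has several aliases.

```python
from itertools import product


def expand_constraint(constraint, aliases):
    """Return every variant of a constraint obtained by substituting aliases.

    constraint: a tuple such as ("a", ">", "b") (hypothetical representation)
    aliases: dict mapping an attribute name to the list of its alias names
    """
    left, op, right = constraint
    # Each side may appear under its original name or under any of its aliases.
    left_names = [left] + aliases.get(left, [])
    right_names = [right] + aliases.get(right, [])
    return {(l, op, r) for l, r in product(left_names, right_names)}


def expanded_count(num_aliases):
    """Count the variants of "a > b" when both sides have num_aliases aliases."""
    aliases = {
        "a": [f"a{i}" for i in range(num_aliases)],
        "b": [f"b{i}" for i in range(num_aliases)],
    }
    return len(expand_constraint(("a", ">", "b"), aliases))


if __name__ == "__main__":
    for n in (1, 2, 10, 100):
        print(n, expanded_count(n))
```

In this model one constraint over two attributes with n aliases each becomes (n + 1)^2 constraints, and that happens for every constraint at every projection.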
Another issue is that we don't actually need the additional constraints most of
the time. For example, suppose there is a constraint "a > b", and "a" is aliased
to "c" and "d". When we use this constraint in filtering, we don't need all of the
constraints "a > b", "c > b", "d > b". We only need "a > b", because if it is
false, all the other constraints are guaranteed to be false too.
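The redundancy in this example can be checked with a small plain-Python sketch. It is only an illustration under stated assumptions: rows are modelled as dicts, and `project`, `filter_minimal`, and `filter_expanded` are hypothetical helpers, not Spark APIs.

```python
def project(row):
    """Model a projection that aliases "a" to both "c" and "d"."""
    return {**row, "c": row["a"], "d": row["a"]}


def filter_minimal(rows):
    """Filter using only the original constraint a > b."""
    return [r for r in rows if r["a"] > r["b"]]


def filter_expanded(rows):
    """Filter using the fully expanded set: a > b, c > b, d > b."""
    return [
        r for r in rows
        if r["a"] > r["b"] and r["c"] > r["b"] and r["d"] > r["b"]
    ]


rows = [project(r) for r in (
    {"a": 3, "b": 1},
    {"a": 0, "b": 5},
    {"a": 2, "b": 2},
)]

# Because c and d always equal a, the two extra predicates never change the result.
same = filter_minimal(rows) == filter_expanded(rows)
```

Since "c" and "d" carry the same value as "a" in every row, the expanded filter evaluates three predicates to reach the same answer as the single one.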
Fully expanding all constraints every time makes iterative ML algorithms,
where an ML pipeline has many stages, run very slowly.