Zhenhua Wang created SPARK-20718:
------------------------------------

             Summary: FileSourceScanExec with different filter orders should have the same result
                 Key: SPARK-20718
                 URL: https://issues.apache.org/jira/browse/SPARK-20718
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.2.0
            Reporter: Zhenhua Wang

Since `constraints` in `QueryPlan` is a set, the order of filters can differ between otherwise identical plans. Usually this is fine because plans are canonicalized before they are compared. However, `FileSourceScanExec` holds its data filters and partition filters as sequences, and their order is not canonicalized. As a result, `sameResult` can give different answers for two scans that differ only in the order of their data or partition filters. This leads to, for example, different decisions in `ReuseExchange`, and thus to unstable performance.
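
The order sensitivity described above can be illustrated with a minimal, self-contained Scala sketch. It does not use Spark: `Expr`, `Scan`, `naiveSameResult`, and `canonicalized` here are simplified, hypothetical stand-ins rather than Spark's actual classes, and sorting filters by their string form is only one assumed way to obtain a deterministic order, not necessarily the fix adopted for this issue.

```scala
// Simplified stand-ins for filter expressions; NOT Spark's Catalyst classes.
sealed trait Expr
final case class Attr(name: String) extends Expr
final case class Lit(value: Int) extends Expr
final case class GreaterThan(left: Expr, right: Expr) extends Expr
final case class IsNotNull(child: Expr) extends Expr

// Hypothetical scan node that keeps its filters as ordered sequences,
// mirroring the situation described in the report.
final case class Scan(
    table: String,
    partitionFilters: Seq[Expr],
    dataFilters: Seq[Expr]) {

  // Naive comparison: Seq equality is order-sensitive, so the same
  // filters in a different order compare as unequal.
  def naiveSameResult(other: Scan): Boolean = this == other

  // Assumed canonical form: sort each filter list by a deterministic key
  // (the case-class string form) so that ordering no longer matters.
  def canonicalized: Scan =
    copy(
      partitionFilters = partitionFilters.sortBy(_.toString),
      dataFilters = dataFilters.sortBy(_.toString))

  // Order-insensitive comparison built on the canonical form.
  def sameResult(other: Scan): Boolean =
    canonicalized == other.canonicalized
}

object FilterOrderDemo {
  val notNull: Expr = IsNotNull(Attr("x"))
  val greater: Expr = GreaterThan(Attr("x"), Lit(1))

  // Two scans with the same data filters in different orders.
  val scan1: Scan = Scan("t", Nil, Seq(notNull, greater))
  val scan2: Scan = Scan("t", Nil, Seq(greater, notNull))

  def main(args: Array[String]): Unit = {
    println(scan1.naiveSameResult(scan2)) // false: order differs
    println(scan1.sameResult(scan2))      // true: order is canonicalized
  }
}
```

In this model, a reuse rule that relies on the naive comparison would treat `scan1` and `scan2` as different and miss the reuse opportunity, whereas the canonicalized comparison treats them as producing the same result.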