[jira] [Resolved] (SPARK-33861) Simplify conditional in predicate

     [ https://issues.apache.org/jira/browse/SPARK-33861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuming Wang resolved SPARK-33861.
---------------------------------
    Resolution: Won't Fix

Note that only 3.2.0, 3.2.1 and 3.3.0 include this optimization. We recovered it via https://github.com/apache/spark/commit/43cbdc6ec9dbcf9ebe0b48e14852cec4af18b4ec

> Simplify conditional in predicate
> ---------------------------------
>
>                 Key: SPARK-33861
>                 URL: https://issues.apache.org/jira/browse/SPARK-33861
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 3.2.0
>            Reporter: Yuming Wang
>            Priority: Major
>
> The use case is:
> {noformat}
> spark.sql("create table t1 using parquet as select id as a, id as b from range(10)")
> spark.sql("select * from t1 where CASE WHEN a > 2 THEN b + 10 END > 5").explain()
> {noformat}
> Before this pr:
> {noformat}
> == Physical Plan ==
> *(1) Filter CASE WHEN (a#3L > 2) THEN ((b#4L + 10) > 5) END
> +- *(1) ColumnarToRow
>    +- FileScan parquet default.t1[a#3L,b#4L] Batched: true, DataFilters: [CASE WHEN (a#3L > 2) THEN ((b#4L + 10) > 5) END], Format: Parquet, Location: InMemoryFileIndex[file:/Users/yumwang/opensource/spark/spark-warehouse/org.apache.spark.sql.DataF..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct
> {noformat}
> After this pr:
> {noformat}
> == Physical Plan ==
> *(1) Filter (((isnotnull(a#3L) AND isnotnull(b#4L)) AND (a#3L > 2)) AND ((b#4L + 10) > 5))
> +- *(1) ColumnarToRow
>    +- FileScan parquet default.t1[a#3L,b#4L] Batched: true, DataFilters: [isnotnull(a#3L), isnotnull(b#4L), (a#3L > 2), ((b#4L + 10) > 5)], Format: Parquet, Location: InMemoryFileIndex[file:/Users/yumwang/opensource/spark/spark-warehouse/org.apache.spark.sql.DataF..., PartitionFilters: [], PushedFilters: [IsNotNull(a), IsNotNull(b), GreaterThan(a,2)], ReadSchema: struct
> {noformat}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-33861) Simplify conditional in predicate

     [ https://issues.apache.org/jira/browse/SPARK-33861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan resolved SPARK-33861.
--------------------------------
    Fix Version/s: 3.2.0
       Resolution: Fixed

Issue resolved by pull request 30865
[https://github.com/apache/spark/pull/30865]

> Simplify conditional in predicate
> ---------------------------------
>
>                 Key: SPARK-33861
>                 URL: https://issues.apache.org/jira/browse/SPARK-33861
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.2.0
>            Reporter: Yuming Wang
>            Assignee: Yuming Wang
>            Priority: Major
>             Fix For: 3.2.0
>
>
> The use case is:
> {noformat}
> spark.sql("create table t1 using parquet as select id as a, id as b from range(10)")
> spark.sql("select * from t1 where CASE WHEN a > 2 THEN b + 10 END > 5").explain()
> {noformat}
> Before this pr:
> {noformat}
> == Physical Plan ==
> *(1) Filter CASE WHEN (a#3L > 2) THEN ((b#4L + 10) > 5) END
> +- *(1) ColumnarToRow
>    +- FileScan parquet default.t1[a#3L,b#4L] Batched: true, DataFilters: [CASE WHEN (a#3L > 2) THEN ((b#4L + 10) > 5) END], Format: Parquet, Location: InMemoryFileIndex[file:/Users/yumwang/opensource/spark/spark-warehouse/org.apache.spark.sql.DataF..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct
> {noformat}
> After this pr:
> {noformat}
> == Physical Plan ==
> *(1) Filter (((isnotnull(a#3L) AND isnotnull(b#4L)) AND (a#3L > 2)) AND ((b#4L + 10) > 5))
> +- *(1) ColumnarToRow
>    +- FileScan parquet default.t1[a#3L,b#4L] Batched: true, DataFilters: [isnotnull(a#3L), isnotnull(b#4L), (a#3L > 2), ((b#4L + 10) > 5)], Format: Parquet, Location: InMemoryFileIndex[file:/Users/yumwang/opensource/spark/spark-warehouse/org.apache.spark.sql.DataF..., PartitionFilters: [], PushedFilters: [IsNotNull(a), IsNotNull(b), GreaterThan(a,2)], ReadSchema: struct
> {noformat}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
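
The before/after plans in the issue rely on the fact that a WHERE clause keeps a row only when its predicate evaluates to TRUE, so a predicate that yields NULL and one that yields FALSE filter identically. The sketch below is a toy model of that reasoning in plain Python, not Spark's optimizer rule: `None` stands in for SQL NULL, and every helper name (`gt`, `add`, `and3`, `original_predicate`, `simplified_predicate`, `keeps_row`) is hypothetical and introduced only for illustration. It compares the original `CASE WHEN a > 2 THEN b + 10 END > 5` against the conjunction `a > 2 AND b + 10 > 5` over a small value range that includes NULL.

```python
from itertools import product


def gt(x, y):
    """SQL-style '>': the result is NULL (None) if either operand is NULL."""
    if x is None or y is None:
        return None
    return x > y


def add(x, y):
    """SQL-style '+': the result is NULL (None) if either operand is NULL."""
    if x is None or y is None:
        return None
    return x + y


def and3(p, q):
    """SQL AND under three-valued logic (TRUE, FALSE, NULL)."""
    if p is False or q is False:
        return False
    if p is None or q is None:
        return None
    return True


def original_predicate(a, b):
    # CASE WHEN a > 2 THEN b + 10 END > 5
    # A CASE without ELSE yields NULL when no branch matches.
    case_value = add(b, 10) if gt(a, 2) is True else None
    return gt(case_value, 5)


def simplified_predicate(a, b):
    # a > 2 AND b + 10 > 5
    return and3(gt(a, 2), gt(add(b, 10), 5))


def keeps_row(result):
    # A filter keeps the row only when the predicate is TRUE.
    return result is True


# Compare both forms over NULL and 0..9 for each column.
values = [None] + list(range(10))
mismatches = [
    (a, b)
    for a, b in product(values, values)
    if keeps_row(original_predicate(a, b)) != keeps_row(simplified_predicate(a, b))
]

# Rows of t1 (a = b = id, id in 0..9) that pass the simplified filter.
kept_ids = [i for i in range(10) if keeps_row(simplified_predicate(i, i))]

print("mismatches:", mismatches)
print("kept ids:", kept_ids)
```

The two forms can disagree as values (for `a = 0, b = 0` the CASE form gives NULL while the conjunction gives FALSE) but agree on which rows survive the filter. That is why the rewrite is only safe in predicate position, and the flat conjunction is what lets `GreaterThan(a,2)` appear in `PushedFilters` in the second plan.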