Yuming Wang created SPARK-33368:
-----------------------------------

             Summary: SimplifyConditionals simplifies non-deterministic expressions
                 Key: SPARK-33368
                 URL: https://issues.apache.org/jira/browse/SPARK-33368
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.0.1, 2.4.7, 3.1.0
            Reporter: Yuming Wang

It seems that we simplify non-deterministic expressions when they are referenced through an alias. For example:

{code:sql}
CREATE TABLE t(a int, b int, c int) using parquet
{code}

{code:sql}
SELECT CASE
         WHEN rand(100) > 1 THEN 1
         WHEN rand(100) + 1 > 1000 THEN 1
         WHEN rand(100) + 2 < 100 THEN 1
         ELSE 1
       END AS x
FROM t
{code}

The plan keeps the CASE WHEN:

{noformat}
== Physical Plan ==
*(1) Project [CASE WHEN (rand(100) > 1.0) THEN 1 WHEN ((rand(100) + 1.0) > 1000.0) THEN 1 WHEN ((rand(100) + 2.0) < 100.0) THEN 1 ELSE 1 END AS x#6]
+- *(1) ColumnarToRow
   +- FileScan parquet default.t[] Batched: true, DataFilters: [], Format: Parquet, Location: InMemoryFileIndex[file:/Users/yumwang/opensource/spark/sql/core/spark-warehouse/org.apache.spark...., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<>
{noformat}

{code:sql}
SELECT CASE
         WHEN rd > 1 THEN 1
         WHEN rd + 1 > 1000 THEN 1
         WHEN rd + 2 < 100 THEN 1
         ELSE 1
       END AS x
FROM (SELECT *, rand(100) as rd FROM t) t1
{code}

Here the whole CASE WHEN is simplified to a literal. The plan is:

{noformat}
== Physical Plan ==
*(1) Project [1 AS x#1]
+- *(1) ColumnarToRow
   +- FileScan parquet default.t[] Batched: true, DataFilters: [], Format: Parquet, Location: InMemoryFileIndex[file:/Users/yumwang/opensource/spark/sql/core/spark-warehouse/org.apache.spark...., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<>
{noformat}
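
One possible reading of the two plans can be sketched with a toy model in Python. This is not Spark's Catalyst code: the `Expr` and `CaseWhen` classes and the `simplify` function are hypothetical, and the idea that a reference to the column `rd` is flagged as deterministic is an assumption made for illustration. The sketch only shows how a per-expression determinism check could keep the CASE WHEN when `rand(100)` is inline and collapse it when the same value arrives through an alias.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Expr:
    """Hypothetical leaf or predicate: its SQL text plus a determinism flag."""
    sql: str
    deterministic: bool = True


@dataclass(frozen=True)
class CaseWhen:
    """Hypothetical CASE WHEN cond THEN value ... ELSE else_value END."""
    branches: Tuple[Tuple[Expr, Expr], ...]
    else_value: Expr


def simplify(case: CaseWhen) -> Union[CaseWhen, Expr]:
    """Collapse the CASE to its ELSE value when every branch yields that value
    and no condition is flagged non-deterministic; otherwise return it as is."""
    same_value = all(value == case.else_value for _, value in case.branches)
    safe = all(cond.deterministic for cond, _ in case.branches)
    return case.else_value if same_value and safe else case


one = Expr("1")

# First query: rand(100) appears directly in each condition.
inline = CaseWhen(
    branches=(
        (Expr("rand(100) > 1", deterministic=False), one),
        (Expr("rand(100) + 1 > 1000", deterministic=False), one),
        (Expr("rand(100) + 2 < 100", deterministic=False), one),
    ),
    else_value=one,
)

# Second query: the conditions only mention the column rd, so this toy model
# (by assumption) flags them as deterministic even though rd is rand(100).
aliased = CaseWhen(
    branches=(
        (Expr("rd > 1"), one),
        (Expr("rd + 1 > 1000"), one),
        (Expr("rd + 2 < 100"), one),
    ),
    else_value=one,
)

print(simplify(inline) is inline)  # whether the inline CASE was left unchanged
print(simplify(aliased).sql)       # SQL text of the simplified aliased CASE
```

In this model the determinism flag is lost as soon as the expression is hidden behind a column name, which matches the difference between the two plans above. Whether that is what actually happens inside SimplifyConditionals would need to be confirmed against the optimizer source.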