wangyum commented on pull request #21852:
URL: https://github.com/apache/spark/pull/21852#issuecomment-722770930
It seems we simplify non-deterministic expressions once they are hidden behind an alias. For
example:
```sql
CREATE TABLE t(a int, b int, c int) using parquet
```
```sql
SELECT CASE
WHEN rand(100) > 1 THEN 1
WHEN rand(100) + 1 > 1000 THEN 1
WHEN rand(100) + 2 < 100 THEN 1
ELSE 1
END AS x
FROM t
```
The plan is:
```
== Physical Plan ==
*(1) Project [CASE WHEN (rand(100) > 1.0) THEN 1 WHEN ((rand(100) + 1.0) >
1000.0) THEN 1 WHEN ((rand(100) + 2.0) < 100.0) THEN 1 ELSE 1 END AS x#6]
+- *(1) ColumnarToRow
+- FileScan parquet default.t[] Batched: true, DataFilters: [], Format:
Parquet, Location:
InMemoryFileIndex[file:/Users/yumwang/opensource/spark/sql/core/spark-warehouse/org.apache.spark....,
PartitionFilters: [], PushedFilters: [], ReadSchema: struct<>
```
---
```sql
SELECT CASE
WHEN rd > 1 THEN 1
WHEN rd + 1 > 1000 THEN 1
WHEN rd + 2 < 100 THEN 1
ELSE 1
END AS x
FROM (SELECT *, rand(100) as rd FROM t) t1
```
The plan is:
```
== Physical Plan ==
*(1) Project [1 AS x#1]
+- *(1) ColumnarToRow
+- FileScan parquet default.t[] Batched: true, DataFilters: [], Format:
Parquet, Location:
InMemoryFileIndex[file:/Users/yumwang/opensource/spark/sql/core/spark-warehouse/org.apache.spark....,
PartitionFilters: [], PushedFilters: [], ReadSchema: struct<>
```
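A minimal sketch (not Spark's actual Catalyst code) of the behavior that appears to be at play: a `CASE WHEN` whose branches all produce the same value as the `ELSE` can be folded to that value, but the fold is presumably guarded on the predicates being deterministic. Aliasing `rand(100)` as `rd` turns the predicates into plain column references, so the guard passes and the whole `CASE` collapses to `1 AS x`, while the inline `rand(100)` version survives. The `CaseWhen`/`simplify_case` names below are hypothetical, chosen only to illustrate the guard:

```python
# Hypothetical model of the simplification; class/function names are
# illustrative, not Spark's actual SimplifyConditionals implementation.
from dataclasses import dataclass

@dataclass
class CaseWhen:
    branches: list            # list of (condition, value) pairs
    else_value: object
    nondeterministic: bool    # do the predicates reference rand() etc. directly?

def simplify_case(case):
    """Fold CASE to its ELSE value when every branch yields that same value.

    The fold is only applied when the predicates are deterministic, which
    would explain why the aliased query collapses to a literal while the
    inline-rand(100) query does not.
    """
    all_same = all(v == case.else_value for _, v in case.branches)
    if all_same and not case.nondeterministic:
        return case.else_value
    return case

# Inline rand(100): predicates are non-deterministic -> CASE is kept.
inline = CaseWhen([("rand(100) > 1", 1), ("rand(100) + 1 > 1000", 1)],
                  else_value=1, nondeterministic=True)

# Aliased rd: predicates look like column references -> folded to 1.
aliased = CaseWhen([("rd > 1", 1), ("rd + 1 > 1000", 1)],
                   else_value=1, nondeterministic=False)

print(simplify_case(aliased))                       # 1
print(isinstance(simplify_case(inline), CaseWhen))  # True
```

In this particular query every branch and the `ELSE` return `1`, so both plans are semantically equivalent; the point is that the optimizer applies the simplification in one form of the query but not the other.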