Github user heary-cao commented on the issue:

    https://github.com/apache/spark/pull/20541
  
    Oh yeah, there is a small difference between `a + 1` and `a + b`.
    **for a + 1**:
    ```
    PushProjectionThroughUnion rule handles:
    Union
    :- Project [(a#0 + 1) AS aa#10]
    :  +- LocalRelation <empty>, [a#0, b#1, c#2]
    :- Project [(d#3 + 1) AS aa#11]
    :  +- LocalRelation <empty>, [d#3, e#4, f#5]
    +- Project [(g#6 + 1) AS aa#12]
       +- LocalRelation <empty>, [g#6, h#7, i#8]
    
    ColumnPruning rule handles:
    Project [(a#0 + 1) AS aa#9]
    +- Union
       :- Project [a#0]
       :  +- LocalRelation <empty>, [a#0, b#1, c#2]
       :- Project [d#3]
       :  +- LocalRelation <empty>, [d#3, e#4, f#5]
       +- Project [g#6]
          +- LocalRelation <empty>, [g#6, h#7, i#8]
    ```
          
    **for a + b**:
    ```
    PushProjectionThroughUnion rule handles:
    Union
    :- Project [(a#0 + b#1) AS ab#10]
    :  +- LocalRelation <empty>, [a#0, b#1, c#2]
    :- Project [(d#3 + e#4) AS ab#11]
    :  +- LocalRelation <empty>, [d#3, e#4, f#5]
    +- Project [(g#6 + h#7) AS ab#12]
       +- LocalRelation <empty>, [g#6, h#7, i#8]        
    
    ColumnPruning rule handles:
    Project [(a#0 + b#1) AS ab#9]
    +- Union
       :- Project [a#0, b#1]
       :  +- LocalRelation <empty>, [a#0, b#1, c#2]
       :- Project [d#3, e#4]
       :  +- LocalRelation <empty>, [d#3, e#4, f#5]
       +- Project [g#6, h#7]
          +- LocalRelation <empty>, [g#6, h#7, i#8]
    ```
          
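    In both cases `ColumnPruning` only narrows the columns read by each child and leaves the expression above the `Union`, whereas `PushProjectionThroughUnion` moves the expression itself into every child. So I think this may be the reason the `PushProjectionThroughUnion` rule is needed, including for non-deterministic expressions.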
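
    The two plans for `a + 1` can probably be reproduced with something like the sketch below. It assumes `spark-catalyst` is on the classpath and uses the Catalyst test DSL; `SingleRule` is a hypothetical helper that is not part of Spark, and the exact plan output may differ between Spark versions.

    ```scala
    import org.apache.spark.sql.catalyst.dsl.expressions._
    import org.apache.spark.sql.catalyst.dsl.plans._
    import org.apache.spark.sql.catalyst.optimizer.{ColumnPruning, PushProjectionThroughUnion}
    import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan, Project, Union}
    import org.apache.spark.sql.catalyst.rules.{Rule, RuleExecutor}

    // Hypothetical helper (not in Spark): an executor that applies one rule, once.
    class SingleRule(rule: Rule[LogicalPlan]) extends RuleExecutor[LogicalPlan] {
      val batches = Batch("single rule", Once, rule) :: Nil
    }

    // The three empty relations from the plans above.
    val r1 = LocalRelation('a.int, 'b.int, 'c.int)
    val r2 = LocalRelation('d.int, 'e.int, 'f.int)
    val r3 = LocalRelation('g.int, 'h.int, 'i.int)

    // Project [(a + 1) AS aa] on top of the Union, resolved by the test analyzer.
    val query = Union(Seq(r1, r2, r3)).select(('a + 1).as("aa")).analyze

    // Apply each rule in isolation to the same analyzed plan.
    val pushed = new SingleRule(PushProjectionThroughUnion).execute(query)
    val pruned = new SingleRule(ColumnPruning).execute(query)
    ```

    Printing `pushed.treeString` and `pruned.treeString` should make the difference visible: the first plan has the `Union` at the root, the second keeps the `Project` there.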


