[GitHub] spark pull request #21737: [SPARK-24208][SQL] Fix attribute deduplication fo...

mgaido91 Tue, 10 Jul 2018 01:30:01 -0700

Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21737#discussion_r201256679
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -738,6 +738,10 @@ class Analyzer(
                 if findAliases(aggregateExpressions).intersect(conflictingAttributes).nonEmpty =>
               (oldVersion, oldVersion.copy(aggregateExpressions = newAliases(aggregateExpressions)))
     
    +        case oldVersion @ FlatMapGroupsInPandas(_, _, output, _)
    +            if AttributeSet(output).intersect(conflictingAttributes).nonEmpty =>
    --- End diff --
    
    @gatorsmile I agree with you. Moreover, having the same expressions (with
    the same exprId) in different parts of a tree can cause other problems
    (please see SPARK-24051). So in the long term we could probably add a
    dedicated rule to address this problem, extending and generalizing what I
    tried to do in SPARK-24051. What do you think?
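
To make the concern concrete, here is a minimal toy sketch of what "the same exprId in different parts of a tree" means and how a deduplication step could give the conflicting side fresh ids. This is not Spark's Analyzer code: `ExprId`, `Attr`, `Relation`, `Join`, and `Dedup` are hypothetical names made up for illustration, and the real rule handles many more plan node types.

```scala
import java.util.concurrent.atomic.AtomicLong

// Toy model only: these types imitate the idea of attributes carrying a
// unique expression id; they are NOT Spark's Catalyst classes.
final case class ExprId(id: Long)
final case class Attr(name: String, exprId: ExprId)

sealed trait Plan { def output: Seq[Attr] }
final case class Relation(output: Seq[Attr]) extends Plan
final case class Join(left: Plan, right: Plan) extends Plan {
  def output: Seq[Attr] = left.output ++ right.output
}

object Dedup {
  // Hypothetical id generator; starts high to avoid clashing with test ids.
  private val nextId = new AtomicLong(1000L)
  private def freshId(): ExprId = ExprId(nextId.getAndIncrement())

  private def ids(plan: Plan): Set[ExprId] = plan.output.map(_.exprId).toSet

  // Rewrite every attribute whose id appears in `mapping`.
  private def rewrite(plan: Plan, mapping: Map[ExprId, ExprId]): Plan = plan match {
    case Relation(out) =>
      Relation(out.map(a => a.copy(exprId = mapping.getOrElse(a.exprId, a.exprId))))
    case Join(l, r) =>
      Join(rewrite(l, mapping), rewrite(r, mapping))
  }

  // If both sides expose attributes with the same id, give the right side
  // new ids so the two branches can be told apart; the left side is kept.
  def dedupRight(join: Join): Join = {
    val conflicts = ids(join.left).intersect(ids(join.right))
    if (conflicts.isEmpty) join
    else {
      val mapping = conflicts.map(id => id -> freshId()).toMap
      join.copy(right = rewrite(join.right, mapping))
    }
  }
}
```

For a self-join such as `Join(t, t)`, both branches initially expose identical ids; after `Dedup.dedupRight` the left branch is unchanged and every attribute on the right has a new id. A generalized rule of the kind suggested above would presumably apply the same idea to every operator, rather than adding one `case` per plan node as the diff does.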



