[GitHub] spark pull request #19872: [SPARK-22274][PYTHON][SQL] User-defined aggregati...

yhuai Wed, 31 Jan 2018 15:29:18 -0800

Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19872#discussion_r165220142
  
    --- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/planning/patterns.scala
 ---
    @@ -199,7 +200,7 @@ object ExtractFiltersAndInnerJoins extends 
PredicateHelper {
     object PhysicalAggregation {
       // groupingExpressions, aggregateExpressions, resultExpressions, child
       type ReturnType =
    -    (Seq[NamedExpression], Seq[AggregateExpression], Seq[NamedExpression], 
LogicalPlan)
    +    (Seq[NamedExpression], Seq[Expression], Seq[NamedExpression], 
LogicalPlan)
    --- End diff --
    
    @icexelloss Thank you for this contribution! I just came across the change 
in this file. I am not sure if changing the type at here is the best option. 
The reason is that whenever we use this PhysicalAggregation rule, we have to 
check the instance type of those aggregate expressions and do casting. To me, 
it seems better to leave this rule untouched and create a new rule just for 
Python UDAF. What do you think?
    
    (maybe you and reviewers already discussed it. If so, can you point me to 
the discussion?)
    
    Thank you!



---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] spark pull request #19872: [SPARK-22274][PYTHON][SQL] User-defined aggregati...

Reply via email to