srowen commented on a change in pull request #24779: [SPARK-27929][SQL] Make 
percentile function receive frq of double
URL: https://github.com/apache/spark/pull/24779#discussion_r314346114
 
 

 ##########
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/Percentile.scala
 ##########
 @@ -67,16 +67,23 @@ case class Percentile(
     child: Expression,
     percentageExpression: Expression,
     frequencyExpression : Expression,
+    isIntFreqExpression: Expression,
 
 Review comment:
   At the least, I think you can switch logic based on the type of the column, 
rather than add a new parameter.
   
   I don't know the logic here, but I'd imagine that anything that works for 
integer weights (1, 2) should work identically for continuous ones (1.0, 2.0). 
It's possible that the current tests are actually expecting the 'wrong' value 
if the code is using an unsuitable approximation for integer values or 
something.
   
   I'd be interested in knowing what fails, as I don't expect Hive-related 
tests to fail - it doesn't implement weighted percentiles, I thought? 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to