[GitHub] [spark] shahidki31 commented on a change in pull request #32498: [SPARK-35368][SQL] Update histogram statistics for RANGE operator for stats estimation

GitBox Tue, 18 May 2021 14:03:37 -0700


shahidki31 commented on a change in pull request #32498:
URL: https://github.com/apache/spark/pull/32498#discussion_r634746436




##########
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala
##########
@@ -789,6 +797,39 @@ case class Range(
     }
   }
 
+  private def computeHistogramStatistics() = {

Review comment:
       > Trivial: you could put a return type here.Yes, I added the return type
   
   Yes, I updated with the return type.
   
   > Is there any concern with the overhead of computing the histogram? It only 
happens when enabled of course. I just wonder if we need to micro-optimize the 
loop below with something more performant if this is critical, but maybe it 
isn't.
   
   This is not a performance intensive path. This will execute during planning 
phase. Also the loop size is numBins, which is 254 by default. So I am not sure 
we need to micro-optimise the loop?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [spark] shahidki31 commented on a change in pull request #32498: [SPARK-35368][SQL] Update histogram statistics for RANGE operator for stats estimation

Reply via email to