[GitHub] spark pull request #20913: [SPARK-23799] FilterEstimation.evaluateInSet prod...

apache-hivemall Wed, 04 Apr 2018 17:35:49 -0700

Github user apache-hivemall commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20913#discussion_r179322297
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala ---
    @@ -427,7 +427,11 @@ case class FilterEstimation(plan: Filter) extends Logging {
     
         // return the filter selectivity.  Without advanced statistics such as histograms,
         // we have to assume uniform distribution.
    -    Some(math.min(newNdv.toDouble / ndv.toDouble, 1.0))
    +    if (ndv.toDouble != 0) {
    --- End diff --
    
    Can you add a test for the empty table case?
    I think we also need to fix any other places that have the same issue. cc: @wzhfy
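
    For reference, the empty-table case could be exercised with something like the minimal sketch below. It is not the actual Spark code: `SelectivitySketch` and `inSetSelectivity` are hypothetical names, and returning `Some(0.0)` when `ndv` is zero is an assumption, since the else branch is not visible in the diff above. The point is only that `0.0 / 0.0` on doubles yields `NaN` rather than throwing, and `math.min(NaN, 1.0)` is still `NaN`, so the unguarded expression would return a `NaN` selectivity.

```scala
// Hypothetical stand-alone helper; it mirrors the shape of the guarded
// expression in the diff but is not part of FilterEstimation.
object SelectivitySketch {
  def inSetSelectivity(newNdv: BigInt, ndv: BigInt): Option[Double] = {
    if (ndv.toDouble != 0) {
      // Uniform-distribution assumption: fraction of distinct values kept,
      // capped at 1.0.
      Some(math.min(newNdv.toDouble / ndv.toDouble, 1.0))
    } else {
      // Assumption: an empty column has nothing to select, so use 0.0
      // instead of letting 0.0 / 0.0 produce NaN.
      Some(0.0)
    }
  }
}
```

    A test for the empty table would then check that a zero `ndv` gives a well-defined selectivity instead of `NaN`.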



