Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20913#discussion_r179037665

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala ---
    @@ -427,7 +427,11 @@ case class FilterEstimation(plan: Filter) extends Logging {
           // return the filter selectivity. Without advanced statistics such as histograms,
           // we have to assume uniform distribution.
    -      Some(math.min(newNdv.toDouble / ndv.toDouble, 1.0))
    +      if (ndv.toDouble != 0) {
    --- End diff --

    What's the concrete case in which `ndv.toDouble == 0`? Also, is this the only place where we need this check? For example, we don't check here: https://github.com/apache/spark/blob/5cfd5fabcdbd77a806b98a6dd59b02772d2f6dee/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/EstimationUtils.scala#L166
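For illustration, here is a minimal, self-contained sketch of the guard being discussed. This is not the actual Spark implementation — the object name, `BigInt` parameter types, and `None` fallback are assumptions for the example — but it shows the pattern the diff introduces: divide `newNdv` by `ndv` only when `ndv` is nonzero, otherwise skip the estimate.

```scala
// Hypothetical sketch of the selectivity computation under review.
// Assumption: ndv values are modeled as BigInt, and a zero ndv yields
// None rather than dividing by zero.
object SelectivitySketch {
  def selectivity(newNdv: BigInt, ndv: BigInt): Option[Double] = {
    if (ndv.toDouble != 0) {
      // Without histograms, assume a uniform distribution over distinct
      // values; cap the ratio at 1.0 since selectivity cannot exceed 1.
      Some(math.min(newNdv.toDouble / ndv.toDouble, 1.0))
    } else {
      // ndv == 0 (e.g. statistics report no distinct values): no
      // meaningful estimate is possible, so return None.
      None
    }
  }
}
```

Under this sketch, `selectivity(5, 10)` yields `Some(0.5)`, an oversized `newNdv` is capped at `Some(1.0)`, and a zero `ndv` safely yields `None` instead of dividing by zero.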