Github user juliuszsompolski commented on the issue:

    https://github.com/apache/spark/pull/23152
  
    While at it, could we kill one more potential for a bug?
    In `FilterEstimation.evaluateBinaryForTwoColumns` there is a
    ```
        attrLeft.dataType match {
          case StringType | BinaryType =>
            // TODO: It is difficult to support other binary comparisons for String/Binary
            // type without min/max and advanced statistics like histogram.
            logDebug("[CBO] No range comparison statistics for String/Binary type " + attrLeft)
            return None
          case _ =>
        }
    ```
    Could we change the empty `case _ =>` to:
    ```
          case _ =>
            if (!colStatsMap.hasMinMaxStats(attrLeft)) {
              logDebug("[CBO] No min/max statistics " + attrLeft)
              return None
            }
            if (!colStatsMap.hasMinMaxStats(attrRight)) {
              logDebug("[CBO] No min/max statistics " + attrRight)
              return None
            }
    ```
    This is one more place that later does
    ```
        val statsIntervalLeft = ValueInterval(colStatLeft.min, colStatLeft.max, attrLeft.dataType)
          .asInstanceOf[NumericValueInterval]
    ...
        val statsIntervalRight = ValueInterval(colStatRight.min, colStatRight.max, attrRight.dataType)
          .asInstanceOf[NumericValueInterval]
    ```
    
    assuming that min/max values are present, and could therefore also hit a ClassCastException.
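
    A minimal, hypothetical sketch of the failure mode (the `*Sketch` names are made up for illustration, not Spark's actual classes): when min/max stats are absent, a factory in the style of `ValueInterval` can return a non-numeric interval, and an unconditional `asInstanceOf[NumericValueInterval]` then throws at runtime:

    ```scala
    // Stand-in for Spark's ValueInterval hierarchy (names are hypothetical).
    sealed trait ValueIntervalSketch
    case class NumericValueIntervalSketch(min: Double, max: Double) extends ValueIntervalSketch
    case object DefaultValueIntervalSketch extends ValueIntervalSketch

    object ValueIntervalSketch {
      // Mirrors the pattern: only build a numeric interval when both stats exist.
      def apply(min: Option[Double], max: Option[Double]): ValueIntervalSketch =
        (min, max) match {
          case (Some(lo), Some(hi)) => NumericValueIntervalSketch(lo, hi)
          case _                    => DefaultValueIntervalSketch // no min/max stats
        }
    }

    // With stats present the cast succeeds.
    val ok = ValueIntervalSketch(Some(1.0), Some(10.0)).asInstanceOf[NumericValueIntervalSketch]

    // Without stats the same cast throws ClassCastException,
    // which is what a hasMinMaxStats guard before the cast would prevent.
    val castFailed =
      try {
        ValueIntervalSketch(None, None).asInstanceOf[NumericValueIntervalSketch]
        false
      } catch {
        case _: ClassCastException => true
      }
    ```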

