GitHub user mshtelma opened a pull request:

    https://github.com/apache/spark/pull/21052

    [SPARK-23799] FilterEstimation.evaluateInSet produces division by zero in 
the case of an empty table with analyzed statistics

    When IN conditions are evaluated against a data frame whose plan reads a 
Hive table with previously analyzed columns, and the plan also contains 
conditions on those columns that cannot be satisfied (so the data frame is 
empty), the FilterEstimation.evaluateInSet method throws 
NumberFormatException and ClassCastException. 
    This PR fixes both bugs and adds tests for them. 
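
    The division-by-zero case described above can be illustrated with a 
minimal sketch. This is not the code from the patch: the object name 
InSetSelectivitySketch, the function inSetSelectivity, and both parameters are 
hypothetical stand-ins for the selectivity arithmetic, assuming that 
selectivity is estimated as the number of matching IN-list values divided by 
the column's distinct-value count.

```scala
// Hypothetical, simplified illustration; NOT Spark's actual FilterEstimation code.
object InSetSelectivitySketch {

  /** Estimated fraction of rows that match an IN list.
    *
    * @param matchingValues number of IN-list values inside the column's value range (assumed input)
    * @param distinctCount  number of distinct values recorded for the column (0 for an empty table)
    */
  def inSetSelectivity(matchingValues: BigInt, distinctCount: BigInt): Double = {
    if (distinctCount.signum <= 0) {
      // Empty column: no row can match, and the division below must be skipped.
      0.0
    } else {
      // Cap at 1.0 because a filter cannot select more rows than exist.
      val ratio = BigDecimal(matchingValues) / BigDecimal(distinctCount)
      math.min(1.0, ratio.toDouble)
    }
  }
}
```

    With this guard, a column whose analyzed statistics report zero distinct 
values yields a selectivity of 0.0 instead of attempting the division.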

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mshtelma/spark filter_estimation_evaluateInSet_Bugs

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21052.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21052
    
----
commit 297395effef8df279f11545d803b051cc2234c6e
Author: Mykhailo Shtelma <mykhailo.shtelma@...>
Date:   2018-03-26T13:09:39Z

    During evaluation of IN conditions, division by zero can occur if the 
source table is empty. A check was added to prevent this.

commit d634ddaec88d0511334ec6c021255094f697b31d
Author: Mykhailo Shtelma <mykhailo.shtelma@...>
Date:   2018-04-03T15:44:33Z

    Added a test case for the following situation: during evaluation of IN 
conditions, division by zero can occur if the source table is empty. A check 
was added to prevent this.

commit 74b6ebdc2cd8a91944cc6159946f560ba7212a6a
Author: Mykhailo Shtelma <mykhailo.shtelma@...>
Date:   2018-04-12T12:00:55Z

    If an empty dataframe (empty because conditions in the parent query were 
not satisfied) is queried with CBO turned on, wrong statistics are used, 
which leads to a ClassCastException in FilterEstimation.evaluateInSet.

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
