[GitHub] spark pull request #19357: support histogram in filter cardinality estimatio...

ron8hu Tue, 26 Sep 2017 17:28:10 -0700

GitHub user ron8hu opened a pull request:

    https://github.com/apache/spark/pull/19357


    support histogram in filter cardinality estimation

    ## What changes were proposed in this pull request?
    
    Histograms are effective for handling skewed data distributions. Once 
histogram information is generated for column statistics, filter cardinality 
estimation must be adjusted to use the histogram data structure.
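
As a rough illustration of why a histogram helps with skew, here is a minimal sketch of estimating the selectivity of a `column < value` filter. It is not the code in this patch. It assumes an equi-height histogram (every bin holds the same share of rows) and a uniform spread of values inside each bin; the names `Bin` and `less_than_selectivity` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Bin:
    """One histogram bin (hypothetical structure for illustration)."""
    lo: float  # smallest value in the bin
    hi: float  # largest value in the bin


def less_than_selectivity(bins: List[Bin], value: float) -> float:
    """Estimate the fraction of rows satisfying `column < value`.

    Assumes an equi-height histogram, so each bin contributes 1/len(bins)
    of the rows, and assumes values are uniform within a bin.
    """
    if not bins:
        raise ValueError("histogram has no bins")
    per_bin = 1.0 / len(bins)
    selectivity = 0.0
    for b in bins:
        if value <= b.lo:
            # The whole bin is at or above the value: contributes nothing.
            continue
        if value >= b.hi:
            # The whole bin lies below the value.
            selectivity += per_bin
        else:
            # The value falls inside the bin: take a proportional share.
            selectivity += per_bin * (value - b.lo) / (b.hi - b.lo)
    return selectivity


# Skewed column: half of the rows fall in [1, 3], the rest stretch to 100.
bins = [Bin(1, 2), Bin(2, 3), Bin(3, 10), Bin(10, 100)]
estimate = less_than_selectivity(bins, 3)
```

With these bins the histogram-based estimate for `column < 3` is half of the rows, whereas an estimate that only knows the minimum (1) and maximum (100) and assumes a uniform spread would give (3 - 1) / (100 - 1), roughly 2%.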
    
    ## How was this patch tested?
    
    We revised all the unit test cases to include the histogram data structure.
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ron8hu/spark createhistogram

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19357.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19357
    
----
commit 46af54d9c86fa1e5322fdd92ed47fe3d419dd966
Author: Ron Hu <[email protected]>
Date:   2017-09-26T23:33:49Z

    support histogram in filter cardinality estimation

----

