GitHub user gatorsmile opened a pull request:

    https://github.com/apache/spark/pull/22731

    [SPARK-25674][FOLLOW-UP] Update the stats for each ColumnarBatch

    ## What changes were proposed in this pull request?
    This PR is a follow-up to https://github.com/apache/spark/pull/22594 and
    takes an alternative approach that avoids unneeded computation in the hot
    code path.
    
    - For the row-based scan, we keep the original behavior.
    - For the columnar scan, we only need to update the stats after each batch.
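
The two cases above can be modeled with a minimal sketch. This is not the code from the patch: it is a simplified, self-contained Python model, and every name in it (`InputMetrics`, `ColumnarBatch`, `record_element`, `UPDATE_INTERVAL_RECORDS`) as well as the interval value of 1000 are assumptions made for illustration. It assumes that the "original" row-based behavior refreshes the byte count only once per fixed number of records, and that refreshing the byte count is the costly step to keep out of the per-record path.

```python
from dataclasses import dataclass

# Hypothetical refresh interval for the row-based path (assumed value).
UPDATE_INTERVAL_RECORDS = 1000


@dataclass
class ColumnarBatch:
    """Hypothetical stand-in for a batch that carries many rows at once."""
    num_rows: int


class InputMetrics:
    """Hypothetical metrics holder; bytes_read_source is a callable that
    returns the number of bytes read so far (assumed to be costly)."""

    def __init__(self, bytes_read_source):
        self._bytes_read_source = bytes_read_source
        self.records_read = 0
        self.bytes_read = 0
        self.bytes_updates = 0  # how many times the costly refresh ran

    def refresh_bytes_read(self):
        self.bytes_read = self._bytes_read_source()
        self.bytes_updates += 1


def record_element(metrics, element):
    """Update the metrics for one element returned by the scan."""
    if isinstance(element, ColumnarBatch):
        # Columnar scan: refresh the byte count once per batch, then
        # count every row in the batch.
        metrics.refresh_bytes_read()
        metrics.records_read += element.num_rows
    else:
        # Row-based scan: refresh only every UPDATE_INTERVAL_RECORDS
        # records, because refreshing on every row is too costly.
        if metrics.records_read % UPDATE_INTERVAL_RECORDS == 0:
            metrics.refresh_bytes_read()
        metrics.records_read += 1


if __name__ == "__main__":
    columnar = InputMetrics(lambda: 0)
    for _ in range(3):
        record_element(columnar, ColumnarBatch(num_rows=4096))
    print(columnar.records_read, columnar.bytes_updates)
```

In this model the columnar path refreshes the byte count exactly once per batch regardless of the batch size, while the row-based path keeps the periodic check, so neither path performs the refresh for every record.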
    
    ## How was this patch tested?
    N/A

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gatorsmile/spark udpateStatsFileScanRDD

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22731.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22731
    
----
commit 8731588d28f302e51095f7ed1a4331edc5233958
Author: gatorsmile <gatorsmile@...>
Date:   2018-10-15T17:44:39Z

    fix

----
