Repository: spark
Updated Branches:
  refs/heads/master 2aaed0a4d -> 9b2c877be

[SPARK-21039][SPARK CORE] Use treeAggregate instead of aggregate in DataFrame.stat.bloomFilter

## What changes were proposed in this pull request?

Use treeAggregate instead of aggregate in DataFrame.stat.bloomFilter to parallelize the merging of the bloom filters.

## How was this patch tested?

Unit tests passed.

Author: Rishabh Bhardwaj <[email protected]>

Closes #18263 from rishabhbhardwaj/SPARK-21039.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/9b2c877b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/9b2c877b
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/9b2c877b

Branch: refs/heads/master
Commit: 9b2c877beccf34fc7c063574496be7e6281227ad
Parents: 2aaed0a
Author: Rishabh Bhardwaj <[email protected]>
Authored: Tue Jun 13 15:09:12 2017 +0100
Committer: Sean Owen <[email protected]>
Committed: Tue Jun 13 15:09:12 2017 +0100

----------------------------------------------------------------------
 .../main/scala/org/apache/spark/sql/DataFrameStatFunctions.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/9b2c877b/sql/core/src/main/scala/org/apache/spark/sql/DataFrameStatFunctions.scala
----------------------------------------------------------------------
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/DataFrameStatFunctions.scala b/sql/core/src/main/scala/org/apache/spark/sql/DataFrameStatFunctions.scala
index c856d30..531c613 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/DataFrameStatFunctions.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/DataFrameStatFunctions.scala
@@ -551,7 +551,7 @@ final class DataFrameStatFunctions private[sql](df: DataFrame) {
       )
     }
 
-    singleCol.queryExecution.toRdd.aggregate(zero)(
+    singleCol.queryExecution.toRdd.treeAggregate(zero)(
       (filter: BloomFilter, row: InternalRow) => {
         updater(filter, row)
         filter
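
As a rough illustration of why the one-word change helps, the following is a minimal, Spark-free sketch. It is not Spark's implementation: `TreeMergeSketch`, `flatMerge`, `treeMerge`, and `groupSize` are hypothetical names, and a `BitSet` stands in for a Bloom filter (merging two filters of the same shape is a bitwise OR). It assumes a single intermediate merge level and simply counts how many merges the final node has to perform in each style.

```scala
import scala.collection.immutable.BitSet

object TreeMergeSketch {
  // Hypothetical stand-in for a Bloom filter: a set of bit positions.
  // Merging two filters is modeled as a bitwise OR.
  type Filter = BitSet

  // aggregate-style: the final node merges every partial result itself.
  // Returns the merged filter and the number of merges done on that node.
  def flatMerge(partials: Seq[Filter]): (Filter, Int) =
    (partials.reduce(_ | _), partials.size - 1)

  // treeAggregate-style, simplified to one intermediate level: partial
  // results are first merged in groups (work a cluster could do in
  // parallel), so the final node only merges one filter per group.
  def treeMerge(partials: Seq[Filter], groupSize: Int): (Filter, Int) = {
    val groups = partials.grouped(groupSize).map(_.reduce(_ | _)).toSeq
    (groups.reduce(_ | _), groups.size - 1)
  }

  def main(args: Array[String]): Unit = {
    // 16 partial filters, each contributing one distinct bit.
    val partials: Seq[Filter] = (0 until 16).map(i => BitSet(i))
    val (flat, flatFinalMerges) = flatMerge(partials)
    val (tree, treeFinalMerges) = treeMerge(partials, groupSize = 4)
    assert(flat == tree) // both strategies produce the same filter
    // prints "final-node merges: flat=15, tree=3"
    println(s"final-node merges: flat=$flatFinalMerges, tree=$treeFinalMerges")
  }
}
```

In this model the merged filter is identical either way; only where the merging happens changes, which is the effect the patch description attributes to treeAggregate.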
