[ https://issues.apache.org/jira/browse/SPARK-15748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Reynold Xin updated SPARK-15748:
--------------------------------
Issue Type: Sub-task (was: Improvement)
Parent: SPARK-15852
> Replace inefficient foldLeft() call in PartitionStatistics
> ----------------------------------------------------------
>
> Key: SPARK-15748
> URL: https://issues.apache.org/jira/browse/SPARK-15748
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Reporter: Josh Rosen
> Assignee: Josh Rosen
> Fix For: 2.0.0
>
>
> PartitionStatistics uses foldLeft and list concatenation to flatten an
> iterator of lists. This is far less efficient than simply calling
> flatMap or flatten, because it performs many unnecessary object allocations.
> Replacing the foldLeft with a flatMap yields fair performance gains
> when constructing PartitionStatistics instances for tables with many columns.
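
For illustration only, here is a minimal sketch of the two flattening styles the description contrasts. It is not the actual PartitionStatistics code: the object name FlattenSketch, both method names, and the sample column names are hypothetical. The foldLeft version copies the accumulated list on every concatenation, which is the likely source of the extra allocations; the flatMap version walks each inner list once.

```scala
object FlattenSketch {
  // Hypothetical slow variant: `acc ++ group` copies the whole accumulator
  // each time, so the total work grows roughly quadratically with the
  // number of elements produced so far.
  def flattenWithFoldLeft(groups: Iterator[List[String]]): List[String] =
    groups.foldLeft(List.empty[String])((acc, group) => acc ++ group)

  // Hypothetical fast variant: flatMap visits each inner list once and
  // builds the result in a single pass.
  def flattenWithFlatMap(groups: Iterator[List[String]]): List[String] =
    groups.flatMap(group => group).toList

  def main(args: Array[String]): Unit = {
    // Made-up per-column attribute names, standing in for column statistics.
    def sample: Iterator[List[String]] =
      Iterator(List("a.min", "a.max"), List("b.min", "b.max"))

    // Both variants produce the same flattened list, in the same order.
    println(flattenWithFoldLeft(sample))
    println(flattenWithFlatMap(sample))
  }
}
```

Both methods return the same result, so under these assumptions the swap changes only the allocation behavior, not the output.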
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)