[jira] [Commented] (SPARK-2590) Add config property to disable incremental collection used in Thrift server

Apache Spark (JIRA) Fri, 08 Aug 2014 01:57:27 -0700

    [ https://issues.apache.org/jira/browse/SPARK-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14090492#comment-14090492 ]


Apache Spark commented on SPARK-2590:
-------------------------------------

User 'liancheng' has created a pull request for this issue:
https://github.com/apache/spark/pull/1853

> Add config property to disable incremental collection used in Thrift server
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-2590
>                 URL: https://issues.apache.org/jira/browse/SPARK-2590
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Cheng Lian
>            Assignee: Cheng Lian
>            Priority: Blocker
>
> {{SparkSQLOperationManager}} uses {{RDD.toLocalIterator}} to collect the 
> result set one partition at a time. This is useful to avoid OOM when the 
> result is large, but introduces extra job scheduling costs as each partition 
> is collected with a separate job. Users may want to disable this when the 
> result set is expected to be small.
>
> *UPDATE* Incremental collection hurts performance because the tasks of the 
> last stage of the RDD DAG generated from the SQL query plan are executed 
> sequentially rather than in parallel. We therefore decided to disable it 
> by default.
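
The trade-off described above can be illustrated with a toy model. This is a minimal sketch in plain Python, not Spark code and not the implementation in the pull request: partitions are modelled as nested lists, a "job" is just a counter increment, and the names `JobCounter`, `fetch_results`, and `incremental_collect` are hypothetical, chosen only for illustration.

```python
from typing import Iterator, List

# Toy stand-in for an RDD: a list of partitions, each holding some rows.
Partitions = List[List[int]]


class JobCounter:
    """Counts how many simulated jobs have been scheduled."""

    def __init__(self) -> None:
        self.jobs = 0


def collect_all(partitions: Partitions, counter: JobCounter) -> List[int]:
    """Model a single job that fetches every partition at once.

    Only one job is scheduled, but the whole result set is held in memory.
    """
    counter.jobs += 1
    return [row for part in partitions for row in part]


def collect_incrementally(partitions: Partitions, counter: JobCounter) -> Iterator[int]:
    """Model one job per partition, yielding rows lazily.

    Only one partition is materialized at a time, at the cost of
    scheduling a separate job for each partition, one after another.
    """
    for part in partitions:
        counter.jobs += 1
        yield from part


def fetch_results(
    partitions: Partitions,
    counter: JobCounter,
    incremental_collect: bool = False,  # hypothetical flag, off by default
) -> Iterator[int]:
    """Pick a collection strategy based on the (hypothetical) flag."""
    if incremental_collect:
        return collect_incrementally(partitions, counter)
    return iter(collect_all(partitions, counter))
```

In this model both strategies return the same rows; they differ only in how many jobs are scheduled and how much of the result is held in memory at once, which is the cost the issue proposes to make configurable.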



--
This message was sent by Atlassian JIRA
(v6.2#6252)

