[ 
https://issues.apache.org/jira/browse/SPARK-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheng Lian updated SPARK-2590:
------------------------------

    Description: 
{{SparkSQLOperationManager}} uses {{RDD.toLocalIterator}} to collect the result 
set one partition at a time. This is useful to avoid OOM when the result is 
large, but introduces extra job scheduling costs as each partition is collected 
with a separate job. Users may want to disable this when the result set is 
expected to be small.

*UPDATE* Incremental collection hurts performance because tasks of the last 
stage of the RDD DAG generated from the SQL query plan are executed 
sequentially. Thus we decided to disable it by default.

  was:{{SparkSQLOperationManager}} uses {{RDD.toLocalIterator}} to collect the 
result set one partition at a time. This is useful to avoid OOM when the result 
is large, but introduces extra job scheduling costs as each partition is 
collected with a separate job. Users may want to disable this when the result 
set is expected to be small.


> Add config property to disable incremental collection used in Thrift server
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-2590
>                 URL: https://issues.apache.org/jira/browse/SPARK-2590
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Cheng Lian
>            Assignee: Cheng Lian
>            Priority: Blocker
>
> {{SparkSQLOperationManager}} uses {{RDD.toLocalIterator}} to collect the 
> result set one partition at a time. This is useful to avoid OOM when the 
> result is large, but introduces extra job scheduling costs as each partition 
> is collected with a separate job. Users may want to disable this when the 
> result set is expected to be small.
> *UPDATE* Incremental collection hurts performance because tasks of the last 
> stage of the RDD DAG generated from the SQL query plan are executed 
> sequentially. Thus we decided to disable it by default.
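
The trade-off described above can be sketched with a toy Python model. This is not Spark code: ToyRDD, fetch_rows, and the incremental flag are hypothetical names made up for this illustration, and the job counter only mimics the one-job-per-partition behaviour that the description attributes to RDD.toLocalIterator.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyRDD:
    """Toy stand-in for an RDD: a list of partitions plus a job counter."""

    def __init__(self, partitions):
        self.partitions = partitions
        self.jobs_run = 0  # number of simulated jobs scheduled so far

    def _compute(self, index):
        # Stand-in for running the last-stage task of one partition.
        return list(self.partitions[index])

    def collect(self):
        # One job: all partition tasks run in parallel, and the whole
        # result set is held in memory at once.
        self.jobs_run += 1
        with ThreadPoolExecutor() as pool:
            parts = list(pool.map(self._compute, range(len(self.partitions))))
        return [row for part in parts for row in part]

    def to_local_iterator(self):
        # One job per partition, run strictly one after another; only
        # one partition's rows are materialized at a time.
        for index in range(len(self.partitions)):
            self.jobs_run += 1
            yield from self._compute(index)


def fetch_rows(rdd, incremental):
    # Hypothetical switch standing in for the proposed config property.
    return rdd.to_local_iterator() if incremental else iter(rdd.collect())
```

With three partitions, fetch_rows(rdd, incremental=False) schedules a single simulated job, while incremental=True schedules three jobs in sequence as the rows are consumed. That is the extra scheduling cost, and the loss of parallelism in the last stage, that motivated disabling incremental collection by default.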



