[jira] [Comment Edited] (SPARK-9850) Adaptive execution in Spark

JIRA Wed, 13 Jan 2016 13:14:06 -0800

    [ https://issues.apache.org/jira/browse/SPARK-9850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15096985#comment-15096985 ]


Maciej Bryński edited comment on SPARK-9850 at 1/13/16 9:13 PM:
----------------------------------------------------------------

[~matei]
Hi,
I'm not sure whether my issue is related to this JIRA.

In 1.6.0, when a SQL query uses LIMIT, Spark does the following:
- executes the limit on every partition
- then takes the result

Is it possible to stop scanning partitions once enough rows have been 
collected to satisfy the limit?
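
To make the difference concrete, here is a minimal sketch in plain Python. It is not Spark code and does not reflect Spark's internals: partitions are modelled as lazy generators, they are scanned one after another rather than in parallel, and all names (make_partition, limit_every_partition, limit_with_early_stop) are hypothetical. It only contrasts how many rows get scanned under each strategy.

```python
from itertools import islice


def make_partition(pid, size, scanned):
    """Lazily yield rows of one partition, recording every row actually scanned."""
    for i in range(size):
        scanned.append((pid, i))
        yield (pid, i)


def limit_every_partition(partitions, n):
    """Apply the limit to every partition, then take n rows from the merged result."""
    per_partition = [list(islice(p, n)) for p in partitions]
    merged = [row for rows in per_partition for row in rows]
    return merged[:n]


def limit_with_early_stop(partitions, n):
    """Stop scanning as soon as n rows have been collected."""
    result = []
    for p in partitions:
        remaining = n - len(result)
        if remaining <= 0:
            break  # enough rows: later partitions are never touched
        result.extend(islice(p, remaining))
    return result


if __name__ == "__main__":
    for strategy in (limit_every_partition, limit_with_early_stop):
        scanned = []
        parts = [make_partition(pid, 100, scanned) for pid in range(4)]
        rows = strategy(parts, 5)
        print(strategy.__name__, "returned", len(rows), "rows, scanned", len(scanned))
```

In this toy model, with 4 partitions and a limit of 5, the first strategy reads 5 rows from each partition while the second reads only the 5 rows it returns. How much of that saving a real engine could achieve depends on how it schedules partitions.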



> Adaptive execution in Spark
> ---------------------------
>
>                 Key: SPARK-9850
>                 URL: https://issues.apache.org/jira/browse/SPARK-9850
>             Project: Spark
>          Issue Type: Epic
>          Components: Spark Core, SQL
>            Reporter: Matei Zaharia
>            Assignee: Yin Huai
>         Attachments: AdaptiveExecutionInSpark.pdf
>
>
> Query planning is one of the main factors in high performance, but the 
> current Spark engine requires the execution DAG for a job to be set in 
> advance. Even with cost-based optimization, it is hard to know the behavior 
> of data and user-defined functions well enough to always get great execution 
> plans. This JIRA proposes to add adaptive query execution, so that the engine 
> can change the plan for each query as it sees what data earlier stages 
> produced.
> We propose adding this to Spark SQL / DataFrames first, using a new API in 
> the Spark engine that lets libraries run DAGs adaptively. In future JIRAs, 
> the functionality could be extended to other libraries or the RDD API, but 
> that is more difficult than adding it in SQL.
> I've attached a design doc by Yin Huai and myself explaining how it would 
> work in more detail.



