[jira] [Comment Edited] (PIG-1734) Pig needs a more efficient DAG execution

Rohini Palaniswamy (JIRA) Tue, 10 Feb 2015 12:30:51 -0800

    [ https://issues.apache.org/jira/browse/PIG-1734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14314858#comment-14314858 ]


Rohini Palaniswamy edited comment on PIG-1734 at 2/10/15 8:29 PM:
------------------------------------------------------------------

Closing this jira as PIG-3446 (Pig on Tez) and Pig on Spark (PIG-4059) address 
this problem.


was (Author: rohini):
Closing this jira as PIG-3444 (Pig on Tez) and Pig on Spark (PIG-4059) address 
this problem.

> Pig needs a more efficient DAG execution
> ----------------------------------------
>
>                 Key: PIG-1734
>                 URL: https://issues.apache.org/jira/browse/PIG-1734
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Olga Natkovich
>
> The current code uses Hadoop's JobControl to execute one stage at a time.
> The first stage includes all jobs with no dependencies, the second stage
> contains jobs that depend only on jobs completed in the first stage, the
> third stage contains jobs that depend on jobs from stages 1 and 2, etc.
> The problem with this simplistic approach is that each stage starts only
> when the previous stage is over, which means that some branches of the
> DAG are unnecessarily blocked.
> We would need to do our own DAG management to solve this issue, which would
> be a pretty significant undertaking. Something we should look at in the future.
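
To illustrate the blocking described in the issue, here is a minimal sketch
that compares stage-at-a-time execution with dependency-driven execution.
It is not Pig or Hadoop code: the job graph, the durations, and all function
names are hypothetical, and it assumes the cluster can run any number of
ready jobs in parallel.

```python
# Hypothetical job graph: job -> set of jobs it depends on.
deps = {
    "A": set(),
    "B": set(),
    "C": {"A"},
    "D": {"B"},
    "E": {"C", "D"},
}
# Hypothetical run times in minutes.
duration = {"A": 1, "B": 10, "C": 10, "D": 1, "E": 1}


def stage_of(job):
    """Stage 1 for jobs with no dependencies, else 1 + the deepest dependency's stage."""
    return 1 + max((stage_of(d) for d in deps[job]), default=0)


def staged_finish_times():
    """Stage-at-a-time: a stage starts only after the whole previous stage is done."""
    finish = {}
    stage_start = 0
    for stage in sorted({stage_of(j) for j in deps}):
        jobs = [j for j in deps if stage_of(j) == stage]
        for j in jobs:
            finish[j] = stage_start + duration[j]
        # Barrier: the next stage waits for the slowest job in this stage.
        stage_start = max(finish[j] for j in jobs)
    return finish


def dag_finish_times():
    """Dependency-driven: a job starts as soon as its own dependencies are done."""
    finish = {}

    def finish_of(job):
        if job not in finish:
            start = max((finish_of(d) for d in deps[job]), default=0)
            finish[job] = start + duration[job]
        return finish[job]

    for j in deps:
        finish_of(j)
    return finish


if __name__ == "__main__":
    print("staged:", staged_finish_times())
    print("dag:   ", dag_finish_times())
```

In this made-up graph, job C depends only on A (done after 1 minute), but
under staged execution it waits for B as well and finishes at minute 20
instead of 11, so the final job E finishes at minute 21 instead of 12.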



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
