Ioan Marius Curelariu created CRUNCH-390:
--------------------------------------------
Summary: Planner is not adding dependencies between jobs when
planning is done in more than one stage.
Key: CRUNCH-390
URL: https://issues.apache.org/jira/browse/CRUNCH-390
Project: Crunch
Issue Type: Bug
Components: Core
Affects Versions: 0.8.2
Reporter: Ioan Marius Curelariu
Assignee: Josh Wills
The planner does the planning in multiple stages when it finds job
dependencies on ReadableData. One example of this case is when using the
BloomFilterJoinStrategy.
While the generated plan dot file looks good, the planner actually does not add
dependencies between jobs that are created in different planning stages.
I have a pipeline that reads 3 input sources. It joins 2 of them using a bloom
filter join strategy. Later on, it joins this with the output of a job coming
from the third source path.
If the jobs on the branch using the bloom filter finish before the one
reading the third source, the executor attempts to start the 4th job, which is
supposed to join everything, before the 3rd one finishes, resulting in an input
path not found exception.
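
To illustrate the race, here is a minimal, self-contained sketch. It does not use any Crunch classes. The job names, the mapping of jobs to the steps described above, and the readyJobs/plan helpers are all hypothetical. It only models an executor that starts a job once every declared dependency has finished, and compares a plan that is missing the edge from the 3rd job to the 4th with one that has it.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class CrossStageDependencySketch {

    // Jobs an executor would start now: not yet finished, and with
    // every *declared* dependency already finished.
    static Set<String> readyJobs(Map<String, Set<String>> deps, Set<String> finished) {
        Set<String> ready = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : deps.entrySet()) {
            if (!finished.contains(e.getKey()) && finished.containsAll(e.getValue())) {
                ready.add(e.getKey());
            }
        }
        return ready;
    }

    // Hypothetical four-job plan (assumed layout, not taken from the planner):
    //   job1, job2: the branch using the bloom filter join
    //   job3:       processes the third source
    //   job4:       joins the output of job2 with the output of job3
    static Map<String, Set<String>> plan(boolean withCrossStageEdge) {
        Map<String, Set<String>> deps = new LinkedHashMap<>();
        deps.put("job1", new HashSet<>());
        deps.put("job2", new HashSet<>(Arrays.asList("job1")));
        deps.put("job3", new HashSet<>());
        Set<String> job4Deps = new HashSet<>(Arrays.asList("job2"));
        if (withCrossStageEdge) {
            job4Deps.add("job3"); // the edge this report says is missing
        }
        deps.put("job4", job4Deps);
        return deps;
    }

    public static void main(String[] args) {
        // The bloom filter branch has finished; job3 is still running.
        Set<String> finished = new HashSet<>(Arrays.asList("job1", "job2"));
        System.out.println(readyJobs(plan(false), finished)); // [job3, job4]
        System.out.println(readyJobs(plan(true), finished));  // [job3]
    }
}
```

Without the edge, job4 is considered ready while job3 has not finished, which would match the missing input path I am seeing. With the edge, job4 waits for job3.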
--
This message was sent by Atlassian JIRA
(v6.2#6252)