[ 
https://issues.apache.org/jira/browse/CRUNCH-361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Micah Whitacre updated CRUNCH-361:
----------------------------------

    Summary: Adjust the planner to handle non-existent SourceTargets  (was: 
Illegal State Exception)

> Adjust the planner to handle non-existent SourceTargets
> -------------------------------------------------------
>
>                 Key: CRUNCH-361
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-361
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.9.0, 0.8.2
>            Reporter: Jinal Shah
>            Assignee: Josh Wills
>            Priority: Minor
>
> I was using ParallelDoOptions to influence how the planner handles a step. 
> If you pass a SourceTarget to it and then perform a union or co-group on the 
> resulting PCollection in a later step, the planner tries to find the size of 
> the parent source, which has not been generated yet. Here are the steps to 
> reproduce it:
> {code}
> PCollection<U> collection = afterSomeOperation();
> // this could be any SourceTarget implementation
> SourceTarget<U> marker = new SourceTarget<U>(pathThatDoesNotExist);
> pipeline.write(collection, marker);
> PCollection<U> collection2 = pipeline.read(marker);
> PCollection<V> collection3 = 
> collection2.parallelDo(DoFn,PType,ParallelDoOptions.builder().sources(marker).build());
> doSomeMoreOperation();
> PCollection<V> union = collection3.union(SomePCollectionOfV);
> {code}
> This throws the exception because the union cannot find the size of the 
> marker, which has not been generated yet. The planner should recognize that 
> the Source does not exist yet and that a job in the pipeline will generate 
> it.
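
The requested behavior can be illustrated with a small standalone sketch. It does not use Crunch's planner code or API: `PlannerSizeSketch`, `MarkerSource`, `estimateSize`, and the fallback size are hypothetical names introduced only for illustration, and the use of `IllegalStateException` is an assumption based on the issue's earlier summary. The idea is that a size lookup first checks whether the path is scheduled to be written by an earlier job and, if so, returns a fallback estimate instead of failing.

```java
import java.util.Set;

// Hypothetical illustration only; none of these types come from Crunch.
final class PlannerSizeSketch {

    /** Stand-in for a source whose backing path may not exist yet. */
    static final class MarkerSource {
        final String path;
        final boolean exists;
        final long sizeInBytes;

        MarkerSource(String path, boolean exists, long sizeInBytes) {
            this.path = path;
            this.exists = exists;
            this.sizeInBytes = sizeInBytes;
        }

        /** Mimics a size lookup that fails when the path is missing. */
        long getSize() {
            if (!exists) {
                throw new IllegalStateException("Path does not exist: " + path);
            }
            return sizeInBytes;
        }
    }

    /**
     * Returns the source size, or the fallback when the source is listed
     * as an output that an earlier job in the pipeline has yet to write.
     */
    static long estimateSize(MarkerSource source,
                             Set<String> pendingOutputs,
                             long fallbackBytes) {
        if (pendingOutputs.contains(source.path)) {
            // The path will be generated later, so do not touch the filesystem.
            return fallbackBytes;
        }
        return source.getSize();
    }
}
```

With this shape, a missing marker that is registered as a pending output yields the fallback estimate, while a missing marker that no job will produce still fails loudly.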



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
