[ https://issues.apache.org/jira/browse/CRUNCH-88?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabriel Reid updated CRUNCH-88:
-------------------------------

    Attachment: CRUNCH-88.patch

Attached a patch that resolves the issue, including an integration test that 
reproduces it. 

I had hoped to correct this in the planner, but that is non-trivial: 
dependencies between PCollectionImpls are set up while the pipeline is being 
built, so fixing this in the planner appears to require substantial rework of 
the pipeline graph structure. However, I think it would be a good candidate to 
revisit once we start working on fusion optimizations in the planner.

[~jwills] can you take a look at this? If you see a quick way to correct this 
in the planner instead, that would be better, but I couldn't spot one.
                
> Multiple parallelDos on a PGroupedTableImpl does not work
> ---------------------------------------------------------
>
>                 Key: CRUNCH-88
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-88
>             Project: Crunch
>          Issue Type: Bug
>    Affects Versions: 0.3.0
>            Reporter: Gabriel Reid
>            Assignee: Gabriel Reid
>         Attachments: CRUNCH-88.patch
>
>
> Creating multiple distinct PCollections based on a single PGroupedTableImpl 
> does not work correctly: the content of the PGroupedTableImpl is only sent 
> to a single outgoing PCollection, and all other PCollections that stem from 
> the grouped table receive no data.
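
For illustration, the behaviour the report expects can be modelled in plain 
Java. This is only a sketch and not Crunch code: the class 
GroupedFanOutSketch and the methods groupByKey and derive are hypothetical 
stand-ins for a grouped table and a parallelDo, and the sketch says nothing 
about how the planner or the patch works. It only shows that two collections 
derived from the same grouped data should each see every group.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;

// Plain-Java model of the expected semantics; all names here are hypothetical.
final class GroupedFanOutSketch {

    // Stand-in for grouping a table by key: key -> all values seen for that key.
    public static Map<String, List<Integer>> groupByKey(
            List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Stand-in for one parallelDo over the grouped data: applies fn to each group.
    public static <T> Map<String, T> derive(
            Map<String, List<Integer>> grouped,
            Function<List<Integer>, T> fn) {
        Map<String, T> out = new TreeMap<>();
        grouped.forEach((key, values) -> out.put(key, fn.apply(values)));
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> grouped = groupByKey(
                List.of(Map.entry("a", 1), Map.entry("b", 2), Map.entry("a", 3)));

        // Two distinct collections derived from the same grouped data.
        Map<String, Integer> sums =
                derive(grouped, vs -> vs.stream().mapToInt(Integer::intValue).sum());
        Map<String, Integer> counts = derive(grouped, List::size);

        // Both derived collections should be populated, not just the first one.
        System.out.println("sums   = " + sums);
        System.out.println("counts = " + counts);
    }
}
```

In terms of this model, the bug described above would correspond to one of the 
two derived maps coming back empty.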

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira