[
https://issues.apache.org/jira/browse/CRUNCH-264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761139#comment-13761139
]
Micah Whitacre commented on CRUNCH-264:
---------------------------------------
The test I wrote is a little janky, but it was mostly set up to represent the
DAG of the real code. In the real code, the text file is written correctly.
The text file write also shows up in the job name as
"Text(/some/test/first)"; however, it does not show up in the DOT diagram.
Although not explicitly called out, the "S1" processing step is also missing
from the diagram (but I believe that is an internal detail of writing to a
text file).
> Writing to TextFileTarget map side does not show up in plan
> -----------------------------------------------------------
>
> Key: CRUNCH-264
> URL: https://issues.apache.org/jira/browse/CRUNCH-264
> Project: Crunch
> Issue Type: Bug
> Components: Core
> Affects Versions: 0.7.0
> Reporter: Micah Whitacre
> Assignee: Josh Wills
> Priority: Minor
> Attachments: CRUNCH-264.png, CRUNCH-264.txt
>
>
> Creating a pipeline that writes data out to a TextFile (map side) and then to
> Avro (reduce side) causes the text-side write, and any processing that
> happens on that branch, to not show up in the plan.
> Specifically, the name of the pipeline is:
> "Text(/simple.txt)+S0+[[S1+Text(/some/test/first)]/[S3]]+GBK+ungroup+PTables.values+Avro(/some/test/path)"
> However the generated DOT is:
> digraph G {
> "Text(/simple.txt)" [label="Text(/simple.txt)" shape=folder];
> "Avro(/some/test/path)" [label="Avro(/some/test/path)" shape=folder];
> subgraph "cluster-job1" {
> subgraph "cluster-job1-map" {
> label = Map; color = blue;
> "S3@2118275672@1822883541" [label="S3" shape=box];
> "S0@875319338@1822883541" [label="S0" shape=box];
> }
> subgraph "cluster-job1-reduce" {
> label = Reduce; color = red;
> "GBK@221482301@1822883541" [label="GBK" shape=box];
> "PTables.values@1156570456@1822883541" [label="PTables.values" shape=box];
> "ungroup@1830236047@1822883541" [label="ungroup" shape=box];
> }
> }
> "ungroup@1830236047@1822883541" -> "PTables.values@1156570456@1822883541";
> "GBK@221482301@1822883541" -> "ungroup@1830236047@1822883541";
> "PTables.values@1156570456@1822883541" -> "Avro(/some/test/path)";
> "Text(/simple.txt)" -> "S0@875319338@1822883541";
> "S3@2118275672@1822883541" -> "GBK@221482301@1822883541";
> "S0@875319338@1822883541" -> "S3@2118275672@1822883541";
> }
> This DOT is missing the "S1" step and the write to '/some/test/first'.
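
The mismatch between the job name and the DOT can be checked mechanically. The
following is a rough sketch, not part of Crunch: it assumes that stages in the
job name are joined by "+" and that "[...]/[...]" groups parallel branches, and
it reads only the quoted label="..." attributes from the DOT. The helper names
(stages_in_job_name, labels_in_dot, missing_stages) are made up for this
illustration.

```python
import re

# Job name copied from the issue description.
JOB_NAME = ("Text(/simple.txt)+S0+[[S1+Text(/some/test/first)]/[S3]]"
            "+GBK+ungroup+PTables.values+Avro(/some/test/path)")

# Node declarations copied from the DOT in the issue (edges omitted,
# since only the labels matter for this check).
DOT = '''
digraph G {
"Text(/simple.txt)" [label="Text(/simple.txt)" shape=folder];
"Avro(/some/test/path)" [label="Avro(/some/test/path)" shape=folder];
"S3@2118275672@1822883541" [label="S3" shape=box];
"S0@875319338@1822883541" [label="S0" shape=box];
"GBK@221482301@1822883541" [label="GBK" shape=box];
"PTables.values@1156570456@1822883541" [label="PTables.values" shape=box];
"ungroup@1830236047@1822883541" [label="ungroup" shape=box];
}
'''


def stages_in_job_name(name):
    """Flatten the job name into an ordered list of stage names.

    Assumption: '+' joins stages and ']/[' separates parallel branches.
    """
    flat = name.replace("]/[", "+").replace("[", "").replace("]", "")
    return [stage for stage in flat.split("+") if stage]


def labels_in_dot(dot):
    """Collect every quoted label="..." value that appears in the DOT text."""
    return set(re.findall(r'label="([^"]*)"', dot))


def missing_stages(name, dot):
    """Return the stages named in the job name that have no node in the DOT."""
    labels = labels_in_dot(dot)
    return [stage for stage in stages_in_job_name(name) if stage not in labels]


print(missing_stages(JOB_NAME, DOT))
# prints ['S1', 'Text(/some/test/first)']
```

A check along these lines could back an assertion in a regression test for the
plan output, provided the job-name format assumed above holds.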