Micah Whitacre created CRUNCH-264:
-------------------------------------
Summary: Writing to TextFileTarget map side does not show up in plan
Key: CRUNCH-264
URL: https://issues.apache.org/jira/browse/CRUNCH-264
Project: Crunch
Issue Type: Bug
Components: Core
Affects Versions: 0.7.0
Reporter: Micah Whitacre
Assignee: Josh Wills
Priority: Minor
Creating a pipeline that writes data to a text file (map side) and then to Avro
(reduce side) causes the text-side write, and any processing that happens on
that branch, to be omitted from the plan.
Specifically, the name of the pipeline is:
Text(/simple.txt)+S0+[[S1+Text(/some/test/first)]/[S3]]+GBK+ungroup+PTables.values+Avro(/some/test/path)
However, the generated DOT is:
digraph G {
  "Text(/simple.txt)" [label="Text(/simple.txt)" shape=folder];
  "Avro(/some/test/path)" [label="Avro(/some/test/path)" shape=folder];
  subgraph "cluster-job1" {
    subgraph "cluster-job1-map" {
      label = Map; color = blue;
      "S3@2118275672@1822883541" [label="S3" shape=box];
      "S0@875319338@1822883541" [label="S0" shape=box];
    }
    subgraph "cluster-job1-reduce" {
      label = Reduce; color = red;
      "GBK@221482301@1822883541" [label="GBK" shape=box];
      "PTables.values@1156570456@1822883541" [label="PTables.values" shape=box];
      "ungroup@1830236047@1822883541" [label="ungroup" shape=box];
    }
  }
  "ungroup@1830236047@1822883541" -> "PTables.values@1156570456@1822883541";
  "GBK@221482301@1822883541" -> "ungroup@1830236047@1822883541";
  "PTables.values@1156570456@1822883541" -> "Avro(/some/test/path)";
  "Text(/simple.txt)" -> "S0@875319338@1822883541";
  "S3@2118275672@1822883541" -> "GBK@221482301@1822883541";
  "S0@875319338@1822883541" -> "S3@2118275672@1822883541";
}
This output is missing the "S1" node and the write to '/some/test/first'.
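The discrepancy can be checked mechanically by comparing the stage names in
the pipeline name with the node labels in the DOT. The following is only a
rough sketch and is not part of Crunch: the class PlanDiff and its methods are
hypothetical helpers. It assumes that stages in the pipeline name are separated
by '+', '/', '[' and ']' outside of parentheses, and that every DOT node is
declared with a label="..." attribute, as in the output above.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a Crunch class.
public class PlanDiff {

    // Assumption: '+', '/', '[' and ']' separate stages, except inside
    // parentheses, where '/' belongs to a file path.
    public static Set<String> stagesInName(String name) {
        Set<String> stages = new LinkedHashSet<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;
        for (char c : name.toCharArray()) {
            if (c == '(') depth++;
            if (c == ')') depth--;
            boolean delimiter =
                depth == 0 && (c == '+' || c == '/' || c == '[' || c == ']');
            if (!delimiter) {
                current.append(c);
            } else if (current.length() > 0) {
                stages.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) stages.add(current.toString());
        return stages;
    }

    // Collect the value of every [label="..."] node attribute in the DOT text.
    // Cluster captions such as "label = Map;" do not match and are skipped.
    public static Set<String> labelsInDot(String dot) {
        Set<String> labels = new LinkedHashSet<>();
        Matcher m = Pattern.compile("\\[label=\"([^\"]*)\"").matcher(dot);
        while (m.find()) labels.add(m.group(1));
        return labels;
    }

    // Stages that appear in the pipeline name but have no node in the DOT.
    public static Set<String> missingStages(String name, String dot) {
        Set<String> missing = new LinkedHashSet<>(stagesInName(name));
        missing.removeAll(labelsInDot(dot));
        return missing;
    }

    public static void main(String[] args) {
        String name = "Text(/simple.txt)+S0+[[S1+Text(/some/test/first)]/[S3]]"
            + "+GBK+ungroup+PTables.values+Avro(/some/test/path)";
        // Node declarations copied from the DOT above; edges are omitted
        // because they carry no labels.
        String dot =
            "\"Text(/simple.txt)\" [label=\"Text(/simple.txt)\" shape=folder];\n"
            + "\"Avro(/some/test/path)\" [label=\"Avro(/some/test/path)\" shape=folder];\n"
            + "label = Map; color = blue;\n"
            + "\"S3@2118275672@1822883541\" [label=\"S3\" shape=box];\n"
            + "\"S0@875319338@1822883541\" [label=\"S0\" shape=box];\n"
            + "label = Reduce; color = red;\n"
            + "\"GBK@221482301@1822883541\" [label=\"GBK\" shape=box];\n"
            + "\"PTables.values@1156570456@1822883541\" [label=\"PTables.values\" shape=box];\n"
            + "\"ungroup@1830236047@1822883541\" [label=\"ungroup\" shape=box];\n";
        // prints [S1, Text(/some/test/first)]
        System.out.println(missingStages(name, dot));
    }
}
```

Run against the name and DOT from this report, the sketch lists exactly the two
items noted above: the S1 stage and the Text(/some/test/first) target.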