[ 
https://issues.apache.org/jira/browse/CRUNCH-438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christian Tzolov updated CRUNCH-438:
------------------------------------

    Attachment: CRUNCH-438.2.patch

Updated patch that writes the content of each dotfile into the Configuration 
under the following keys:
PlanningParameters.PIPELINE_PLAN_DOTFILE   (the original one)
PlanningParameters.PCOLLECTION_LINEAGE_DOTFILE
PlanningParameters.BASE_GRAPH_PLANE_DOTFILE
PlanningParameters.SPLIT_GRAPH_PLANE_DOTFILE
PlanningParameters.RTNODES_PLAN_DOTFILE

The content of any of these dotfiles can be printed like this (XXX stands for 
one of the keys above):
System.out.println(pipeline.getConfiguration().get(PlanningParameters.XXX));
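
Note that Hadoop's Configuration.get(String) returns null when a key is unset, 
so the one-liner above prints "null" if the plan has not been stored yet. Below 
is a minimal sketch of a null-safe lookup. It uses a plain Map as a stand-in for 
the Configuration so it runs without Crunch on the classpath; the key strings 
and the dotfileOrPlaceholder helper are hypothetical and are not part of the 
patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DotfileLookupSketch {
    // Hypothetical key strings, for illustration only. The real values are
    // whatever the PlanningParameters constants are defined as.
    static final String PIPELINE_PLAN_DOTFILE = "example.pipeline.plan.dotfile";
    static final String RTNODES_PLAN_DOTFILE = "example.rtnodes.plan.dotfile";

    /** Returns the dotfile text stored under key, or a placeholder if the key is unset. */
    static String dotfileOrPlaceholder(Map<String, String> conf, String key) {
        String dot = conf.get(key);
        return dot != null ? dot : "// no dotfile stored under " + key;
    }

    public static void main(String[] args) {
        // Stand-in for pipeline.getConfiguration()
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put(PIPELINE_PLAN_DOTFILE, "digraph G { a -> b; }");

        System.out.println(dotfileOrPlaceholder(conf, PIPELINE_PLAN_DOTFILE));
        System.out.println(dotfileOrPlaceholder(conf, RTNODES_PLAN_DOTFILE));
    }
}
```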

As an experiment, I've also integrated this with 
PlanningParameters.PIPELINE_DOTFILE_OUTPUT_DIR (CRUNCH-418). If the 
PIPELINE_DOTFILE_OUTPUT_DIR path is set, then 5 dotfiles will be produced. 
I agree with Gabriel Reid that those diagrams are more of a debug tool. If 
PIPELINE_DOTFILE_OUTPUT_DIR is not meant for debugging purposes, then perhaps I 
should revert this integration?
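
To make the behaviour concrete, here is a rough sketch of the idea, not the 
code from the patch: when an output directory is configured, every dotfile found 
in the configuration is written out as its own file. It assumes a plain Map in 
place of the Configuration and the local filesystem in place of whatever the 
patch writes to; the key strings, the ".dot" file naming, and the dumpDotfiles 
helper are all hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DotfileDumpSketch {
    // Hypothetical key strings standing in for the PlanningParameters constants.
    static final String OUTPUT_DIR_KEY = "example.dotfile.output.dir";
    static final List<String> DOTFILE_KEYS = Arrays.asList(
        "example.pipeline.plan", "example.pcollection.lineage",
        "example.base.graph", "example.split.graph", "example.rtnodes.plan");

    /** Writes each stored dotfile to the configured directory; does nothing if no directory is set. */
    static List<Path> dumpDotfiles(Map<String, String> conf) throws IOException {
        List<Path> written = new ArrayList<>();
        String dir = conf.get(OUTPUT_DIR_KEY);
        if (dir == null || dir.isEmpty()) {
            return written; // output directory not set: produce no files
        }
        Path outDir = Files.createDirectories(Paths.get(dir));
        for (String key : DOTFILE_KEYS) {
            String dot = conf.get(key);
            if (dot == null) {
                continue; // this diagram was not stored, skip it
            }
            Path file = outDir.resolve(key + ".dot");
            Files.write(file, dot.getBytes(StandardCharsets.UTF_8));
            written.add(file);
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> conf = new HashMap<>();
        conf.put(OUTPUT_DIR_KEY, Files.createTempDirectory("dotfiles").toString());
        for (String key : DOTFILE_KEYS) {
            conf.put(key, "digraph G { a -> b; }");
        }
        for (Path p : dumpDotfiles(conf)) {
            System.out.println("wrote " + p);
        }
    }
}
```

Gating everything on a single directory setting is what makes the question above 
matter: anyone who sets the directory for another reason gets all five files.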

- I've fixed the RTNode#getEmitter() method name. The reason it isn't called is 
that the emitters are created during the configuration stage, so this field is 
empty during the planning stage. Maybe we can find a use for it if we decide to 
create a live/run-time diagram representation ;) 


> Visualizations of some important internal/intermediate pipeline planning 
> states
> -------------------------------------------------------------------------------
>
>                 Key: CRUNCH-438
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-438
>             Project: Crunch
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.10.0, 0.8.3
>            Reporter: Christian Tzolov
>            Assignee: Christian Tzolov
>         Attachments: CRUNCH-438.2.patch, CRUNCH-438.patch
>
>
> To improve the understandability of the pipeline planning stages it would help 
> to visualize some intermediate planning states like:
> - PCollection lineage (visualizing the output-pcollection-targets structure).
> - MSCRPlanner's planning graphs before and after the split-up of dependent 
> GBK nodes.
> - RTNode hierarchy along with the Input and Output configurations as 
> persisted in the Configuration before the execution of the pipeline. 
> Most of the information can be intercepted in the MSCRPlanner#plan() method.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
