Author: buildbot
Date: Sat Jan 10 01:40:42 2015
New Revision: 935663
Log:
Staging update by buildbot for crunch
Modified:
websites/staging/crunch/trunk/content/ (props changed)
websites/staging/crunch/trunk/content/user-guide.html
Propchange: websites/staging/crunch/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Sat Jan 10 01:40:42 2015
@@ -1 +1 @@
-1650540
+1650710
Modified: websites/staging/crunch/trunk/content/user-guide.html
==============================================================================
--- websites/staging/crunch/trunk/content/user-guide.html (original)
+++ websites/staging/crunch/trunk/content/user-guide.html Sat Jan 10 01:40:42 2015
@@ -1578,6 +1578,17 @@ is taken from one of Crunch's integratio
computations that combine custom DoFns with Crunch's built-in
<code>cogroup</code> operation by using the <a
href="#mempipeline">MemPipeline</a>
implementation to create test data sets that we can easily verify by hand, and
then this same logic can be executed on
a distributed data set using either the <a href="#mrpipeline">MRPipeline</a>
or <a href="#sparkpipeline">SparkPipeline</a> implementations.</p>
+<h3 id="pipeline-execution-plan-visualizations">Pipeline execution plan visualizations</h3>
+<p>Crunch provides tools for visualizing pipeline execution plans. The <a
href="apidocs/0.10.0/org/apache/crunch/PipelineExecution.html">PipelineExecution</a>
+<code>String getPlanDotFile()</code> method returns a visualization of the
execution plan in DOT format. If the DOT file output folder property is set,
Crunch also writes a DOT file after each pipeline run.</p>
+<p>Enabling the DOT file debug mode exposes additional aspects of the
execution plan. In this mode Crunch produces four additional DOT diagrams that
visualize internal stages of the plan: the PCollection lineage, the base graph
plan, the split graph plans, and the run-time nodes.
+Note: the debug mode requires an output folder, so set one first. The
following snippet sets the output folder and then enables the DOT file debug
mode. As a result, five DOT diagrams are generated in the output folder after
each Pipeline execution:</p>
+<div class="codehilite"><pre> <span class="n">Configuration</span> <span
class="n">conf</span> <span class="p">=</span> <span class="p">...</span>
+ <span class="n">String</span> <span class="n">dotfileDir</span> <span
class="p">=</span> <span class="p">...</span>
+
+ <span class="n">DotfileUtills</span><span class="p">.</span><span
class="n">setPipelineDotfileOutputDir</span><span class="p">(</span><span
class="n">conf</span><span class="p">,</span> <span
class="n">dotfileDir</span><span class="p">);</span>
+ <span class="n">DotfileUtills</span><span class="p">.</span><span
class="n">enableDebugDotfiles</span><span class="p">(</span><span
class="n">conf</span><span class="p">);</span>
+</pre></div>
</div> <!-- /span -->
</div> <!-- /row-fluid -->