Author: buildbot
Date: Tue Jan 14 17:47:23 2014
New Revision: 894121
Log:
Staging update by buildbot for crunch
Modified:
websites/staging/crunch/trunk/content/ (props changed)
websites/staging/crunch/trunk/content/user-guide.html
Propchange: websites/staging/crunch/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Tue Jan 14 17:47:23 2014
@@ -1 +1 @@
-1557897
+1558124
Modified: websites/staging/crunch/trunk/content/user-guide.html
==============================================================================
--- websites/staging/crunch/trunk/content/user-guide.html (original)
+++ websites/staging/crunch/trunk/content/user-guide.html Tue Jan 14 17:47:23 2014
@@ -240,7 +240,7 @@ functions (UDFs), then Crunch is most li
of patterns. The following table illustrates the relationship between these
patterns across the various data pipeline projects that run on
top of Apache Hadoop:</p>
<p><a name="rels"></a>
-<table>
+<table border="1">
<tr>
<td>Concept</td>
<td><a href="http://hadoop.apache.org">Apache Hadoop MapReduce</a></td>
@@ -758,7 +758,7 @@ can set the value of this parameter via
<p>Because we specified this parameter on the Source instance and not the
Configuration object directly, we can process multiple
different files using the NLineInputFormat and not have their different
settings conflict with one another.</p>
<p>Here is a table of commonly used Sources and their associated usage
information:</p>
-<table>
+<table border="1">
<tr>
<td>Input Type</td>
<td>Source</td>
@@ -835,7 +835,7 @@ parameters that this Target needs:</p>
</pre>
<p>Here is a table of commonly used Targets:</p>
-<table>
+<table border="1">
<tr>
<td>Output Type</td>
<td>Target</td>
@@ -1308,39 +1308,39 @@ your jobs on the JobTracker or Applicati
</ol>
<p>There are a number of handy configuration parameters that can be used to
adjust the behavior of MRPipeline that you should be
aware of:</p>
-<table>
+<table border="1">
<tr>
<td><b>Name</b></td>
<td><b>Type</b></td>
<td><b>Usage Notes</b></td>
</tr>
<tr>
- <td><pre>crunch.debug</pre></td>
+ <td>crunch.debug</td>
<td>boolean</td>
- <td>Enables debug mode, which traps and logs any runtime exceptions and input data. Can also be enabled via <pre>enableDebug()</pre> on the <pre>Pipeline</pre> interface. False by default, because it introduces a fair amount of overhead.</td>
+ <td>Enables debug mode, which traps and logs any runtime exceptions and input data. Can also be enabled via enableDebug() on the Pipeline interface. False by default, because it introduces a fair amount of overhead.</td>
</tr>
<tr>
- <td><pre>crunch.job.name.max.stack.length</pre></td>
+ <td>crunch.job.name.max.stack.length</td>
<td>integer</td>
<td>Controls the length of the name of the job that Crunch generates for each phase of the pipeline. Default is 60 chars.</td>
</tr>
<tr>
- <td><pre>crunch.log.job.progress</pre></td>
+ <td>crunch.log.job.progress</td>
<td>boolean</td>
<td>If true, Crunch will print the "Map %P Reduce %P" data to stdout as the jobs run. False by default.</td>
</tr>
<tr>
- <td><pre>crunch.disable.combine.file</pre></td>
+ <td>crunch.disable.combine.file</td>
<td>boolean</td>
- <td>By default, Crunch will use <pre>CombineFileInputFormat</pre> for subclasses of `FileInputFormat`. This can be disabled on a per-source basis or globally.</td>
+ <td>By default, Crunch will use CombineFileInputFormat for subclasses of FileInputFormat. This can be disabled on a per-source basis or globally.</td>
</tr>
<tr>
- <td><pre>crunch.combine.file.block.size</pre></td>
+ <td>crunch.combine.file.block.size</td>
<td>integer</td>
- <td>The block size to use for the <pre>CombineFileInputFormat</pre>. Default is the <pre>dfs.block.size</pre> for the cluster.</td>
+ <td>The block size to use for the CombineFileInputFormat. Default is the dfs.block.size for the cluster.</td>
</tr>
<tr>
- <td><pre>crunch.max.running.jobs</pre></td>
+ <td>crunch.max.running.jobs</td>
<td>integer</td>
<td>Controls the maximum number of MapReduce jobs that will be executed simultaneously. Default is 5.</td>
</tr>