http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/basics/collocations.html
----------------------------------------------------------------------
diff --git a/users/basics/collocations.html b/users/basics/collocations.html
index 5c6cd24..875b720 100644
--- a/users/basics/collocations.html
+++ b/users/basics/collocations.html
@@ -369,7 +369,7 @@ specified LLR score from being emitted, and the 
--minSupport argument can
 be used to filter out collocations that appear below a certain number of
 times.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>bin/mahout seq2sparse
+<pre><code>bin/mahout seq2sparse
 
 Usage:                                                                     
      [--minSupport &lt;minSupport&gt; --analyzerName &lt;analyzerName&gt; 
--chunkSize &lt;chunkSize&gt;
@@ -418,12 +418,12 @@ Options
   --sequentialAccessVector (-seq)     (Optional) Whether output vectors should 
                                      be SequentialAccessVectors If set true    
                                      else false 
-</code></pre></div></div>
+</code></pre>
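+
+<p>For example, a hedged sketch of a seq2sparse run that emits bigram
+collocations, keeping only those with an LLR score of at least 50 that occur
+at least 5 times (paths are placeholders):</p>
+
+<pre><code>bin/mahout seq2sparse 
+    -i ${WORK_DIR}/reuters-seqdir 
+    -o ${WORK_DIR}/reuters-sparse 
+    -ng 2 
+    -ml 50 
+    -s 5
+</code></pre>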
 
 <p><a name="Collocations-CollocDriver"></a></p>
 <h3 id="collocdriver">CollocDriver</h3>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>bin/mahout 
org.apache.mahout.vectorizer.collocations.llr.CollocDriver
+<pre><code>bin/mahout 
org.apache.mahout.vectorizer.collocations.llr.CollocDriver
 
 Usage:                                                                     
  [--input &lt;input&gt; --output &lt;output&gt; --maxNGramSize 
&lt;ngramSize&gt; --overwrite    
@@ -462,7 +462,7 @@ Options
                                      final output alongside collocations
    
   --help (-h)                        Print out help          
-</code></pre></div></div>
+</code></pre>
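+
+<p>A sketch of a direct CollocDriver invocation over tokenized documents
+(paths are placeholders; -ng and -ml are assumed to mirror the seq2sparse
+options of the same names):</p>
+
+<pre><code>bin/mahout org.apache.mahout.vectorizer.collocations.llr.CollocDriver 
+    -i ${WORK_DIR}/tokenized-documents 
+    -o ${WORK_DIR}/collocations 
+    -ng 2 
+    -ml 50 
+    --overwrite
+</code></pre>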
 
 <p><a name="Collocations-Algorithmdetails"></a></p>
 <h2 id="algorithm-details">Algorithm details</h2>
@@ -494,14 +494,14 @@ frequencies are collected across the entire document.</p>
 <p>Once this is done, ngrams are split into head and tail portions. A key of 
type GramKey is generated which is used later to join ngrams with their heads 
and tails in the reducer phase. The GramKey is a composite key made up of a 
string n-gram fragment as the primary key and a secondary key used for 
grouping and sorting in the reduce phase. The secondary key is either EMPTY, 
when we are collecting the head or tail of an ngram as the value, or it 
contains the byte form of the ngram when we are collecting an ngram as the 
value.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>head_key(EMPTY) -&gt; (head subgram, head frequency)
+<pre><code>head_key(EMPTY) -&gt; (head subgram, head frequency)
 
 head_key(ngram) -&gt; (ngram, ngram frequency) 
 
 tail_key(EMPTY) -&gt; (tail subgram, tail frequency)
 
 tail_key(ngram) -&gt; (ngram, ngram frequency)
-</code></pre></div></div>
+</code></pre>
 
 <p>subgram and ngram values are packaged in Gram objects.</p>
 
@@ -543,7 +543,7 @@ or (subgram_key, ngram) tuple; one from each map task 
executed in which the
 particular subgram was found.
 The input will be traversed in the following order:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>(head subgram, frequency 1)
+<pre><code>(head subgram, frequency 1)
 (head subgram, frequency 2)
 ... 
 (head subgram, frequency N)
@@ -560,7 +560,7 @@ The input will be traversed in the following order:</p>
 (ngram N, frequency 2)
 ...
 (ngram N, frequency N)
-</code></pre></div></div>
+</code></pre>
 
 <p>Where all of the ngrams above share the same head. Data is presented in the
 same manner for the tail subgrams.</p>
@@ -574,18 +574,18 @@ be incremented.</p>
 
 <p>Pairs are passed to the collector in the following format:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>ngram, ngram frequency -&gt; subgram subgram frequency
-</code></pre></div></div>
+<pre><code>ngram, ngram frequency -&gt; subgram, subgram frequency
+</code></pre>
 
 <p>In this manner, the output becomes an unsorted version of the following:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>ngram 1, frequency -&gt; ngram 1 head, head frequency
+<pre><code>ngram 1, frequency -&gt; ngram 1 head, head frequency
 ngram 1, frequency -&gt; ngram 1 tail, tail frequency
 ngram 2, frequency -&gt; ngram 2 head, head frequency
 ngram 2, frequency -&gt; ngram 2 tail, tail frequency
 ngram N, frequency -&gt; ngram N head, head frequency
 ngram N, frequency -&gt; ngram N tail, tail frequency
-</code></pre></div></div>
+</code></pre>
 
 <p>Output is in the format k:Gram (ngram, frequency), v:Gram (subgram,
 frequency).</p>
@@ -610,11 +610,11 @@ the work for llr calculation is done in the reduce 
phase.</p>
 <p>This phase receives the head and tail subgrams and their frequencies for
 each ngram (with frequency) produced for the input:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>ngram 1, frequency -&gt; ngram 1 head, frequency; ngram 
1 tail, frequency
+<pre><code>ngram 1, frequency -&gt; ngram 1 head, frequency; ngram 1 tail, 
frequency
 ngram 2, frequency -&gt; ngram 2 head, frequency; ngram 2 tail, frequency
 ...
 ngram N, frequency -&gt; ngram N head, frequency; ngram N tail, frequency
-</code></pre></div></div>
+</code></pre>
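+
+<p>For orientation, a sketch (our reading of the standard LLR construction,
+not quoted from this page) of how the 2x2 contingency counts are typically
+formed from the frequencies collected above:</p>
+
+<pre><code>k11 = freq(ngram)
+k12 = freq(head) - freq(ngram)
+k21 = freq(tail) - freq(ngram)
+k22 = totalNGrams - freq(head) - freq(tail) + freq(ngram)
+</code></pre>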
 
 <p>It also reads the full ngram count obtained from the first pass, passed in
 as a configuration option. The parameters to the llr calculation are

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/basics/creating-vectors-from-text.html
----------------------------------------------------------------------
diff --git a/users/basics/creating-vectors-from-text.html 
b/users/basics/creating-vectors-from-text.html
index ecd9b1e..1dfb217 100644
--- a/users/basics/creating-vectors-from-text.html
+++ b/users/basics/creating-vectors-from-text.html
@@ -310,7 +310,7 @@ option.  Examples of running the driver are included 
below:</p>
 <p><a 
name="CreatingVectorsfromText-GeneratinganoutputfilefromaLuceneIndex"></a></p>
 <h4 id="generating-an-output-file-from-a-lucene-index">Generating an output 
file from a Lucene Index</h4>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>$MAHOUT_HOME/bin/mahout lucene.vector 
+<pre><code>$MAHOUT_HOME/bin/mahout lucene.vector 
     --dir (-d) dir                     The Lucene directory      
     --idField idField                  The field in the index    
                                            containing the index.  If 
@@ -362,17 +362,17 @@ option.  Examples of running the driver are included 
below:</p>
                                            percentage is expressed   
                                            as a value between 0 and  
                                            1. The default is 0.  
-</code></pre></div></div>
+</code></pre>
 
 <h4 id="example-create-50-vectors-from-an-index">Example: Create 50 Vectors 
from an Index</h4>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>$MAHOUT_HOME/bin/mahout lucene.vector
+<pre><code>$MAHOUT_HOME/bin/mahout lucene.vector
     --dir $WORK_DIR/wikipedia/solr/data/index 
     --field body 
     --dictOut $WORK_DIR/solr/wikipedia/dict.txt
     --output $WORK_DIR/solr/wikipedia/out.txt 
     --max 50
-</code></pre></div></div>
+</code></pre>
 
 <p>This uses the index specified by --dir and the body field in it, and writes 
 out the info to the output dir and the dictionary to dict.txt. It only
@@ -382,14 +382,14 @@ the index are output.</p>
 <p><a name="CreatingVectorsfromText-50VectorsFromLuceneL2Norm"></a></p>
 <h4 
id="example-creating-50-normalized-vectors-from-a-lucene-index-using-the-l_2-norm">Example:
 Creating 50 Normalized Vectors from a Lucene Index using the <a 
href="http://en.wikipedia.org/wiki/Lp_space";>L_2 Norm</a></h4>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>$MAHOUT_HOME/bin/mahout lucene.vector 
+<pre><code>$MAHOUT_HOME/bin/mahout lucene.vector 
     --dir $WORK_DIR/wikipedia/solr/data/index 
     --field body 
     --dictOut $WORK_DIR/solr/wikipedia/dict.txt
     --output $WORK_DIR/solr/wikipedia/out.txt 
     --max 50 
     --norm 2
-</code></pre></div></div>
+</code></pre>
 
 <p><a name="CreatingVectorsfromText-FromDirectoryofTextdocuments"></a></p>
 <h2 id="from-a-directory-of-text-documents">From A Directory of Text 
documents</h2>
@@ -408,7 +408,7 @@ binary documents to text.</p>
 <p>Mahout has a nifty utility which reads a directory path including its
 sub-directories and creates the SequenceFile in a chunked manner for us.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>$MAHOUT_HOME/bin/mahout seqdirectory 
+<pre><code>$MAHOUT_HOME/bin/mahout seqdirectory 
     --input (-i) input                       Path to job input directory.   
     --output (-o) output                     The directory pathname for     
                                                  output.                       
 
@@ -438,7 +438,7 @@ sub-directories and creates the SequenceFile in a chunked 
manner for us.</p>
     --tempDir tempDir                        Intermediate output directory  
     --startPhase startPhase                  First phase to run             
     --endPhase endPhase                      Last phase to run  
-</code></pre></div></div>
+</code></pre>
 
 <p>The output of seqdirectory will be a SequenceFile &lt; Text, Text &gt; of 
 all documents (/sub-directory-path/documentFileName, documentText).</p>
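+
+<p>To sanity-check the result, Mahout's seqdumper utility can print a few
+records (a sketch; the chunk file name is a placeholder):</p>
+
+<pre><code>$MAHOUT_HOME/bin/mahout seqdumper 
+    -i /path/to/output/chunk-0
+</code></pre>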
 
@@ -448,7 +448,7 @@ sub-directories and creates the SequenceFile in a chunked 
manner for us.</p>
 <p>From the sequence file generated from the above step run the following to
 generate vectors.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>$MAHOUT_HOME/bin/mahout seq2sparse
+<pre><code>$MAHOUT_HOME/bin/mahout seq2sparse
     --minSupport (-s) minSupport      (Optional) Minimum Support. Default      
 
                                           Value: 2                             
     
     --analyzerName (-a) analyzerName  The class name of the analyzer           
 
@@ -497,7 +497,7 @@ generate vectors.</p>
                                           be NamedVectors. If set true else 
false   
     --logNormalize (-lnorm)           (Optional) Whether output vectors should 
 
                                           be logNormalize. If set true else 
false
-</code></pre></div></div>
+</code></pre>
 
 <p>This will create SequenceFiles of tokenized documents &lt; Text, 
StringTuple &gt;  (docID, tokenizedDoc) and vectorized documents &lt; Text, 
VectorWritable &gt; (docID, TF-IDF Vector).</p>
 
@@ -510,17 +510,17 @@ generate vectors.</p>
 <h4 
id="example-creating-normalized-tf-idf-vectors-from-a-directory-of-text-documents-using-trigrams-and-the-l_2-norm">Example:
 Creating Normalized <a 
href="http://en.wikipedia.org/wiki/Tf%E2%80%93idf";>TF-IDF</a> Vectors from a 
directory of text documents using <a 
href="http://en.wikipedia.org/wiki/N-gram";>trigrams</a> and the <a 
href="http://en.wikipedia.org/wiki/Lp_space";>L_2 Norm</a></h4>
 <p>Create sequence files from the directory of text documents:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>$MAHOUT_HOME/bin/mahout seqdirectory 
+<pre><code>$MAHOUT_HOME/bin/mahout seqdirectory 
     -i $WORK_DIR/reuters 
     -o $WORK_DIR/reuters-seqdir 
     -c UTF-8
     -chunk 64
     -xm sequential
-</code></pre></div></div>
+</code></pre>
 
 <p>Vectorize the documents using trigrams, L_2 length normalization and a 
maximum document frequency cutoff of 85%.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>$MAHOUT_HOME/bin/mahout seq2sparse 
+<pre><code>$MAHOUT_HOME/bin/mahout seq2sparse 
     -i $WORK_DIR/reuters-out-seqdir/ 
     -o $WORK_DIR/reuters-out-seqdir-sparse-kmeans 
     --namedVec
@@ -528,7 +528,7 @@ generate vectors.</p>
     -ng 3
     -n 2
     --maxDFPercent 85 
-</code></pre></div></div>
+</code></pre>
 
 <p>The sequence file in the 
$WORK_DIR/reuters-out-seqdir-sparse-kmeans/tfidf-vectors directory can now be 
used as input to the Mahout <a 
href="http://mahout.apache.org/users/clustering/k-means-clustering.html";>k-Means</a>
 clustering algorithm.</p>
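+
+<p>For instance, a hedged sketch of such a k-Means run (the cluster count,
+iteration cap and distance measure are illustrative choices, not prescribed
+by this page):</p>
+
+<pre><code>$MAHOUT_HOME/bin/mahout kmeans 
+    -i $WORK_DIR/reuters-out-seqdir-sparse-kmeans/tfidf-vectors 
+    -c $WORK_DIR/reuters-kmeans-clusters 
+    -o $WORK_DIR/reuters-kmeans 
+    -dm org.apache.mahout.common.distance.CosineDistanceMeasure 
+    -k 20 -x 10 -ow -cl
+</code></pre>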
 
@@ -549,14 +549,14 @@ format. Probably the easiest way to go would be to 
implement your own
 Iterable<Vector> (called VectorIterable in the example below) and then
 reuse the existing VectorWriter classes:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>VectorWriter vectorWriter = 
SequenceFile.createWriter(filesystem,
+<pre><code>// SequenceFile.createWriter returns a SequenceFile.Writer, so wrap it in a
// VectorWriter implementation (a sketch, assuming Mahout's
// org.apache.mahout.utils.vectors.io.SequenceFileVectorWriter):
VectorWriter vectorWriter = new SequenceFileVectorWriter(
    SequenceFile.createWriter(filesystem,
                              configuration,
                              outfile,
                              LongWritable.class,
                              VectorWritable.class));
 
 long numDocs = vectorWriter.write(new VectorIterable(), Long.MAX_VALUE);
-</code></pre></div></div>
+</code></pre>
 
 
    </div>

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/basics/quickstart.html
----------------------------------------------------------------------
diff --git a/users/basics/quickstart.html b/users/basics/quickstart.html
index 6d8a4c0..b6f689d 100644
--- a/users/basics/quickstart.html
+++ b/users/basics/quickstart.html
@@ -287,12 +287,12 @@
 <p>Mahout is also available via a <a 
href="http://mvnrepository.com/artifact/org.apache.mahout">Maven repository</a> 
under the group id <em>org.apache.mahout</em>.
 If you would like to import the latest release of Mahout into a Java project, 
add the following dependency to your <em>pom.xml</em>:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>&lt;dependency&gt;
+<pre><code>&lt;dependency&gt;
     &lt;groupId&gt;org.apache.mahout&lt;/groupId&gt;
     &lt;artifactId&gt;mahout-mr&lt;/artifactId&gt;
     &lt;version&gt;0.10.0&lt;/version&gt;
 &lt;/dependency&gt;
-</code></pre></div></div>
+</code></pre>
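+
+<p>A quick way to confirm the artifact resolves (assuming a standard Maven
+setup):</p>
+
+<pre><code>mvn dependency:resolve | grep mahout-mr
+</code></pre>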
 
 <h2 id="features">Features</h2>
 

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/classification/bayesian-commandline.html
----------------------------------------------------------------------
diff --git a/users/classification/bayesian-commandline.html 
b/users/classification/bayesian-commandline.html
index 6039cfd..ffeea8b 100644
--- a/users/classification/bayesian-commandline.html
+++ b/users/classification/bayesian-commandline.html
@@ -288,14 +288,14 @@ complementary naive bayesian classification algorithms on 
a Hadoop cluster.</p>
 
 <p>In the examples directory type:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>mvn -q exec:java
+<pre><code>mvn -q exec:java
     
-Dexec.mainClass="org.apache.mahout.classifier.bayes.mapreduce.bayes.&lt;JOB&gt;"
     -Dexec.args="&lt;OPTIONS&gt;"
 
 mvn -q exec:java
     
-Dexec.mainClass="org.apache.mahout.classifier.bayes.mapreduce.cbayes.&lt;JOB&gt;"
     -Dexec.args="&lt;OPTIONS&gt;"
-</code></pre></div></div>
+</code></pre>
 
 <p><a name="bayesian-commandline-Runningitonthecluster"></a></p>
 <h3 id="running-it-on-the-cluster">Running it on the cluster</h3>
@@ -328,7 +328,7 @@ to view all outputs.</p>
 <p><a name="bayesian-commandline-Commandlineoptions"></a></p>
 <h2 id="command-line-options">Command line options</h2>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>BayesDriver, BayesThetaNormalizerDriver, 
CBayesNormalizedWeightDriver, CBayesDriver, CBayesThetaDriver, 
CBayesThetaNormalizerDriver, BayesWeightSummerDriver, BayesFeatureDriver, 
BayesTfIdfDriver Usage:
+<pre><code>BayesDriver, BayesThetaNormalizerDriver, 
CBayesNormalizedWeightDriver, CBayesDriver, CBayesThetaDriver, 
CBayesThetaNormalizerDriver, BayesWeightSummerDriver, BayesFeatureDriver, 
BayesTfIdfDriver Usage:
     [--input &lt;input&gt; --output &lt;output&gt; --help]
   
 Options
@@ -336,7 +336,7 @@ Options
   --input (-i) input     The Path for input Vectors. Must be a SequenceFile of 
Writable, Vector.
   --output (-o) output   The directory pathname for output points.
   --help (-h)            Print out help.
-</code></pre></div></div>
+</code></pre>
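+
+<p>For example, a filled-in sketch of the exec:java template above (the job
+class is one of those listed; the paths are illustrative placeholders):</p>
+
+<pre><code>mvn -q exec:java
+    -Dexec.mainClass="org.apache.mahout.classifier.bayes.mapreduce.bayes.BayesDriver"
+    -Dexec.args="--input vectors --output output/bayes"
+</code></pre>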
 
 
    </div>

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/classification/bayesian.html
----------------------------------------------------------------------
diff --git a/users/classification/bayesian.html 
b/users/classification/bayesian.html
index 128e658..22c48df 100644
--- a/users/classification/bayesian.html
+++ b/users/classification/bayesian.html
@@ -288,38 +288,38 @@
 <p>As described in <a 
href="http://people.csail.mit.edu/jrennie/papers/icml03-nb.pdf";>[1]</a> Mahout 
Naive Bayes is broken down into the following steps (assignments are over all 
possible index values):</p>
 
 <ul>
-  <li>Let <code 
class="highlighter-rouge">\(\vec{d}=(\vec{d_1},...,\vec{d_n})\)</code> be a set 
of documents; <code class="highlighter-rouge">\(d_{ij}\)</code> is the count of 
word <code class="highlighter-rouge">\(i\)</code> in document <code 
class="highlighter-rouge">\(j\)</code>.</li>
-  <li>Let <code class="highlighter-rouge">\(\vec{y}=(y_1,...,y_n)\)</code> be 
their labels.</li>
-  <li>Let <code class="highlighter-rouge">\(\alpha_i\)</code> be a smoothing 
parameter for all words in the vocabulary; let <code 
class="highlighter-rouge">\(\alpha=\sum_i{\alpha_i}\)</code>.</li>
-  <li><strong>Preprocessing</strong>(via seq2Sparse) TF-IDF transformation and 
L2 length normalization of <code class="highlighter-rouge">\(\vec{d}\)</code>
+  <li>Let <code>\(\vec{d}=(\vec{d_1},...,\vec{d_n})\)</code> be a set of 
documents; <code>\(d_{ij}\)</code> is the count of word <code>\(i\)</code> in 
document <code>\(j\)</code>.</li>
+  <li>Let <code>\(\vec{y}=(y_1,...,y_n)\)</code> be their labels.</li>
+  <li>Let <code>\(\alpha_i\)</code> be a smoothing parameter for all words in 
the vocabulary; let <code>\(\alpha=\sum_i{\alpha_i}\)</code>.</li>
+  <li><strong>Preprocessing</strong> (via seq2sparse): TF-IDF transformation and 
L2 length normalization of <code>\(\vec{d}\)</code>
     <ol>
-      <li><code class="highlighter-rouge">\(d_{ij} = 
\sqrt{d_{ij}}\)</code></li>
-      <li><code class="highlighter-rouge">\(d_{ij} = 
d_{ij}\left(\log{\frac{\sum_k1}{\sum_k\delta_{ik}+1}}+1\right)\)</code></li>
-      <li><code class="highlighter-rouge">\(d_{ij} 
=\frac{d_{ij}}{\sqrt{\sum_k{d_{kj}^2}}}\)</code></li>
+      <li><code>\(d_{ij} = \sqrt{d_{ij}}\)</code></li>
+      <li><code>\(d_{ij} = 
d_{ij}\left(\log{\frac{\sum_k1}{\sum_k\delta_{ik}+1}}+1\right)\)</code></li>
+      <li><code>\(d_{ij} =\frac{d_{ij}}{\sqrt{\sum_k{d_{kj}^2}}}\)</code></li>
     </ol>
   </li>
-  <li><strong>Training: Bayes</strong><code 
class="highlighter-rouge">\((\vec{d},\vec{y})\)</code> calculate term weights 
<code class="highlighter-rouge">\(w_{ci}\)</code> as:
+  <li><strong>Training: Bayes</strong>: given <code>\((\vec{d},\vec{y})\)</code>, 
calculate term weights <code>\(w_{ci}\)</code> as:
     <ol>
-      <li><code 
class="highlighter-rouge">\(\hat\theta_{ci}=\frac{d_{ic}+\alpha_i}{\sum_k{d_{kc}}+\alpha}\)</code></li>
-      <li><code 
class="highlighter-rouge">\(w_{ci}=\log{\hat\theta_{ci}}\)</code></li>
+      
<li><code>\(\hat\theta_{ci}=\frac{d_{ic}+\alpha_i}{\sum_k{d_{kc}}+\alpha}\)</code></li>
+      <li><code>\(w_{ci}=\log{\hat\theta_{ci}}\)</code></li>
     </ol>
   </li>
-  <li><strong>Training: CBayes</strong><code 
class="highlighter-rouge">\((\vec{d},\vec{y})\)</code> calculate term weights 
<code class="highlighter-rouge">\(w_{ci}\)</code> as:
+  <li><strong>Training: CBayes</strong>: given <code>\((\vec{d},\vec{y})\)</code>, 
calculate term weights <code>\(w_{ci}\)</code> as:
     <ol>
-      <li><code class="highlighter-rouge">\(\hat\theta_{ci} = 
\frac{\sum_{j:y_j\neq c}d_{ij}+\alpha_i}{\sum_{j:y_j\neq 
c}{\sum_k{d_{kj}}}+\alpha}\)</code></li>
-      <li><code 
class="highlighter-rouge">\(w_{ci}=-\log{\hat\theta_{ci}}\)</code></li>
-      <li><code class="highlighter-rouge">\(w_{ci}=\frac{w_{ci}}{\sum_i \lvert 
w_{ci}\rvert}\)</code></li>
+      <li><code>\(\hat\theta_{ci} = \frac{\sum_{j:y_j\neq 
c}d_{ij}+\alpha_i}{\sum_{j:y_j\neq c}{\sum_k{d_{kj}}}+\alpha}\)</code></li>
+      <li><code>\(w_{ci}=-\log{\hat\theta_{ci}}\)</code></li>
+      <li><code>\(w_{ci}=\frac{w_{ci}}{\sum_i \lvert 
w_{ci}\rvert}\)</code></li>
     </ol>
   </li>
   <li><strong>Label Assignment/Testing:</strong>
     <ol>
-      <li>Let <code class="highlighter-rouge">\(\vec{t}= 
(t_1,...,t_n)\)</code> be a test document; let <code 
class="highlighter-rouge">\(t_i\)</code> be the count of the word <code 
class="highlighter-rouge">\(t\)</code>.</li>
-      <li>Label the document according to <code 
class="highlighter-rouge">\(l(t)=\arg\max_c \sum\limits_{i} t_i 
w_{ci}\)</code></li>
+      <li>Let <code>\(\vec{t}= (t_1,...,t_n)\)</code> be a test document; let 
<code>\(t_i\)</code> be the count of word <code>\(i\)</code>.</li>
+      <li>Label the document according to <code>\(l(t)=\arg\max_c 
\sum\limits_{i} t_i w_{ci}\)</code></li>
     </ol>
   </li>
 </ul>
 
-<p>As we can see, the main difference between Bayes and CBayes is the weight 
calculation step.  Where Bayes weighs terms more heavily based on the 
likelihood that they belong to class <code 
class="highlighter-rouge">\(c\)</code>, CBayes seeks to maximize term weights 
on the likelihood that they do not belong to any other class.</p>
+<p>As we can see, the main difference between Bayes and CBayes is the weight 
calculation step.  Where Bayes weighs terms more heavily based on the 
likelihood that they belong to class <code>\(c\)</code>, CBayes weighs terms 
based on the likelihood that they do not belong to any other class.</p>
 
 <h2 id="running-from-the-command-line">Running from the command line</h2>
 
@@ -330,31 +330,31 @@
     <p><strong>Preprocessing:</strong>
 For a set of SequenceFile formatted documents in PATH_TO_SEQUENCE_FILES, the 
 <a 
href="https://mahout.apache.org/users/basics/creating-vectors-from-text.html">mahout
 seq2sparse</a> command performs the TF-IDF transformations (-wt tfidf option) 
and L2 length normalization (-n 2 option) as follows:</p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>  mahout seq2sparse 
+    <pre><code>  mahout seq2sparse 
     -i ${PATH_TO_SEQUENCE_FILES} 
     -o ${PATH_TO_TFIDF_VECTORS} 
     -nv 
     -n 2
     -wt tfidf
-</code></pre></div>    </div>
+</code></pre>
   </li>
   <li>
     <p><strong>Training:</strong>
-The model is then trained using <code class="highlighter-rouge">mahout 
trainnb</code> .  The default is to train a Bayes model. The -c option is given 
to train a CBayes model:</p>
+The model is then trained using <code>mahout trainnb</code>.  The default is 
to train a Bayes model. The -c option is given to train a CBayes model:</p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>  mahout trainnb
+    <pre><code>  mahout trainnb
     -i ${PATH_TO_TFIDF_VECTORS} 
     -o ${PATH_TO_MODEL}/model 
     -li ${PATH_TO_MODEL}/labelindex 
     -ow 
     -c
-</code></pre></div>    </div>
+</code></pre>
   </li>
   <li>
     <p><strong>Label Assignment/Testing:</strong>
-Classification and testing on a holdout set can then be performed via <code 
class="highlighter-rouge">mahout testnb</code>. Again, the -c option indicates 
that the model is CBayes.  The -seq option tells <code 
class="highlighter-rouge">mahout testnb</code> to run sequentially:</p>
+Classification and testing on a holdout set can then be performed via 
<code>mahout testnb</code>. Again, the -c option indicates that the model is 
CBayes.  The -seq option tells <code>mahout testnb</code> to run 
sequentially:</p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>  mahout testnb 
+    <pre><code>  mahout testnb 
     -i ${PATH_TO_TFIDF_TEST_VECTORS}
     -m ${PATH_TO_MODEL}/model 
     -l ${PATH_TO_MODEL}/labelindex 
@@ -362,7 +362,7 @@ Classification and testing on a holdout set can then be 
performed via <code clas
     -o ${PATH_TO_OUTPUT} 
     -c 
     -seq
-</code></pre></div>    </div>
+</code></pre>
   </li>
 </ul>
 
@@ -372,9 +372,9 @@ Classification and testing on a holdout set can then be 
performed via <code clas
   <li>
     <p><strong>Preprocessing:</strong></p>
 
-    <p>Only relevant parameters used for Bayes/CBayes as detailed above are 
shown. Several other transformations can be performed by <code 
class="highlighter-rouge">mahout seq2sparse</code> and used as input to 
Bayes/CBayes.  For a full list of <code class="highlighter-rouge">mahout 
seq2Sparse</code> options see the <a 
href="https://mahout.apache.org/users/basics/creating-vectors-from-text.html";>Creating
 vectors from text</a> page.</p>
+    <p>Only the parameters relevant to Bayes/CBayes, as detailed above, are 
shown. Several other transformations can be performed by <code>mahout 
seq2sparse</code> and used as input to Bayes/CBayes.  For a full list of 
<code>mahout seq2sparse</code> options see the <a 
href="https://mahout.apache.org/users/basics/creating-vectors-from-text.html">Creating
 vectors from text</a> page.</p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>  mahout seq2sparse                         
+    <pre><code>  mahout seq2sparse                         
     --output (-o) output             The directory pathname for output.        
     --input (-i) input               Path to job input directory.              
     --weight (-wt) weight            The kind of weight to use. Currently TF   
@@ -389,12 +389,12 @@ Classification and testing on a holdout set can then be 
performed via <code clas
                                          else false                            
    
     --namedVector (-nv)              (Optional) Whether output vectors should  
                                          be NamedVectors. If set true else 
false   
-</code></pre></div>    </div>
+</code></pre>
   </li>
   <li>
     <p><strong>Training:</strong></p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>  mahout trainnb
+    <pre><code>  mahout trainnb
     --input (-i) input               Path to job input directory.              
   
     --output (-o) output             The directory pathname for output.        
            
     --alphaI (-a) alphaI             Smoothing parameter. Default is 1.0
@@ -406,12 +406,12 @@ Classification and testing on a holdout set can then be 
performed via <code clas
     --tempDir tempDir                Intermediate output directory             
   
     --startPhase startPhase          First phase to run                        
   
     --endPhase endPhase              Last phase to run
-</code></pre></div>    </div>
+</code></pre>
   </li>
   <li>
     <p><strong>Testing:</strong></p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>  mahout testnb   
+    <pre><code>  mahout testnb   
     --input (-i) input               Path to job input directory.              
    
     --output (-o) output             The directory pathname for output.        
    
     --overwrite (-ow)                If present, overwrite the output 
directory    
@@ -426,7 +426,7 @@ Classification and testing on a holdout set can then be 
performed via <code clas
     --tempDir tempDir                Intermediate output directory             
    
     --startPhase startPhase          First phase to run                        
    
     --endPhase endPhase              Last phase to run  
-</code></pre></div>    </div>
+</code></pre>
   </li>
 </ul>
 

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/classification/breiman-example.html
----------------------------------------------------------------------
diff --git a/users/classification/breiman-example.html 
b/users/classification/breiman-example.html
index 8d1a60f..c239bd7 100644
--- a/users/classification/breiman-example.html
+++ b/users/classification/breiman-example.html
@@ -300,8 +300,8 @@ results to greater values of <em>m</em></li>
 
 <p>First, we deal with <a 
href="http://archive.ics.uci.edu/ml/datasets/Glass+Identification";>Glass 
Identification</a>: download the <a 
href="http://archive.ics.uci.edu/ml/machine-learning-databases/glass/glass.data";>dataset</a>
 file called <strong>glass.data</strong> and store it onto your local machine. 
Next, we must generate the descriptor file <strong>glass.info</strong> for this 
dataset with the following command:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>bin/mahout 
org.apache.mahout.classifier.df.tools.Describe -p /path/to/glass.data -f 
/path/to/glass.info -d I 9 N L
-</code></pre></div></div>
+<pre><code>bin/mahout org.apache.mahout.classifier.df.tools.Describe -p 
/path/to/glass.data -f /path/to/glass.info -d I 9 N L
+</code></pre>
 
 <p>Substitute <em>/path/to/</em> with the folder where you downloaded the 
dataset; the argument “I 9 N L” indicates the nature of the variables. Here 
it means 1
 ignored (I) attribute, followed by 9 numerical (N) attributes, followed by
@@ -309,8 +309,8 @@ the label (L).</p>
 
 <p>Finally, we build and evaluate our random forest classifier as follows:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>bin/mahout 
org.apache.mahout.classifier.df.BreimanExample -d /path/to/glass.data -ds 
/path/to/glass.info -i 10 -t 100 which builds 100 trees (-t argument) and 
repeats the test 10 iterations (-i argument) 
-</code></pre></div></div>
+<pre><code>bin/mahout org.apache.mahout.classifier.df.BreimanExample -d 
/path/to/glass.data -ds /path/to/glass.info -i 10 -t 100
+</code></pre>
+
+<p>This builds 100 trees (-t argument) and repeats the test for 10 iterations 
(-i argument).</p>
 
 <p>The example outputs the following results:</p>
 
@@ -327,13 +327,13 @@ iterations</li>
 
 <p>We can repeat this for a <a 
href="http://archive.ics.uci.edu/ml/datasets/Connectionist+Bench+%28Sonar,+Mines+vs.+Rocks%29";>Sonar</a>
 usecase: download the <a 
href="http://archive.ics.uci.edu/ml/machine-learning-databases/undocumented/connectionist-bench/sonar/sonar.all-data";>dataset</a>
 file called <strong>sonar.all-data</strong> and store it onto your local 
machine. Generate the descriptor file <strong>sonar.info</strong> for this 
dataset with the following command:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>bin/mahout 
org.apache.mahout.classifier.df.tools.Describe -p /path/to/sonar.all-data -f 
/path/to/sonar.info -d 60 N L
-</code></pre></div></div>
+<pre><code>bin/mahout org.apache.mahout.classifier.df.tools.Describe -p 
/path/to/sonar.all-data -f /path/to/sonar.info -d 60 N L
+</code></pre>
 
 <p>The argument “60 N L” means 60 numerical (N) attributes, followed by the 
label (L). Analogous to the previous case, we run the evaluation as follows:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>bin/mahout 
org.apache.mahout.classifier.df.BreimanExample -d /path/to/sonar.all-data -ds 
/path/to/sonar.info -i 10 -t 100
-</code></pre></div></div>
+<pre><code>bin/mahout org.apache.mahout.classifier.df.BreimanExample -d 
/path/to/sonar.all-data -ds /path/to/sonar.info -i 10 -t 100
+</code></pre>
 
 
    </div>

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/classification/class-discovery.html
----------------------------------------------------------------------
diff --git a/users/classification/class-discovery.html 
b/users/classification/class-discovery.html
index 9dcfe83..20f30fc 100644
--- a/users/classification/class-discovery.html
+++ b/users/classification/class-discovery.html
@@ -304,13 +304,13 @@ A classification rule can be represented as follows:</p>
 <p>For a given <em>target</em> class and a weight <em>threshold</em>, the 
classification
 rule can be read:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>for each row of the dataset
+<pre><code>for each row of the dataset
   if (rule.w1 &lt; threshold || (rule.w1 &gt;= threshold &amp;&amp; row.value1 
rule.op1 rule.value1)) &amp;&amp;
      (rule.w2 &lt; threshold || (rule.w2 &gt;= threshold &amp;&amp; row.value2 
rule.op2 rule.value2)) &amp;&amp;
      ...
      (rule.wN &lt; threshold || (rule.wN &gt;= threshold &amp;&amp; row.valueN 
rule.opN rule.valueN)) then
     row is part of the target class
-</code></pre></div></div>
+</code></pre>
 
 <p><em>Important:</em> The label attribute is not evaluated by the rule.</p>
 
@@ -344,11 +344,11 @@ and the following parameters: threshold = 1 and target = 
0 (brown).
 
 <p>This rule can be read as follows:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>for each row of the dataset
+<pre><code>for each row of the dataset
   if (0 &lt; 1 || (0 &gt;= 1 &amp;&amp; row.value1 &lt; 20)) &amp;&amp;
      (1 &lt; 1 || (1 &gt;= 1 &amp;&amp; row.value2 != light)) then
     row is part of the "brown Eye Color" class
-</code></pre></div></div>
+</code></pre>
 
 <p>Please note how the rule skips the label attribute (Eye Color), and how
 the first condition is ignored because its weight is &lt; threshold.</p>

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/classification/hidden-markov-models.html
----------------------------------------------------------------------
diff --git a/users/classification/hidden-markov-models.html 
b/users/classification/hidden-markov-models.html
index 1a84234..6f4fe33 100644
--- a/users/classification/hidden-markov-models.html
+++ b/users/classification/hidden-markov-models.html
@@ -330,18 +330,18 @@ can be efficiently solved using the Baum-Welch 
algorithm.</li>
 
 <p>Create an input file to train the model.  Here we have a sequence drawn 
from the set of states 0, 1, 2, and 3, separated by space characters.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>$ echo "0 1 2 2 2 1 1 0 0 3 3 3 2 1 2 1 1 1 1 2 2 2 0 0 
0 0 0 0 2 2 2 0 0 0 0 0 0 2 2 2 3 3 3 3 3 3 2 3 2 3 2 3 2 1 3 0 0 0 1 0 1 0 2 1 
2 1 2 1 2 3 3 3 3 2 2 3 2 1 1 0" &gt; hmm-input
-</code></pre></div></div>
+<pre><code>$ echo "0 1 2 2 2 1 1 0 0 3 3 3 2 1 2 1 1 1 1 2 2 2 0 0 0 0 0 0 2 2 
2 0 0 0 0 0 0 2 2 2 3 3 3 3 3 3 2 3 2 3 2 3 2 1 3 0 0 0 1 0 1 0 2 1 2 1 2 1 2 3 
3 3 3 2 2 3 2 1 1 0" &gt; hmm-input
+</code></pre>
 
 <p>Now run the baumwelch job to train your model, after first setting 
MAHOUT_LOCAL to true, to use your local file system.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>$ export MAHOUT_LOCAL=true
+<pre><code>$ export MAHOUT_LOCAL=true
 $ $MAHOUT_HOME/bin/mahout baumwelch -i hmm-input -o hmm-model -nh 3 -no 4 -e 
.0001 -m 1000
-</code></pre></div></div>
+</code></pre>
 
 <p>Output like the following should appear in the console.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>Initial probabilities: 
+<pre><code>Initial probabilities: 
 0 1 2 
 1.0 0.0 3.5659361683006626E-251 
 Transition matrix:
@@ -355,18 +355,18 @@ Emission matrix:
 1 7.495656581383351E-34 0.2241269055449904 0.4510889999455847 
0.32478409450942497 
 2 0.815051477991782 0.18494852200821799 8.465660634827592E-33 
2.8603899591778015E-36 
 14/03/22 09:52:21 INFO driver.MahoutDriver: Program took 180 ms (Minutes: 
0.003)
-</code></pre></div></div>
+</code></pre>
 
 <p>The model trained with the input set now is in the file ‘hmm-model’, 
which we can use to build a predicted sequence.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>$ $MAHOUT_HOME/bin/mahout hmmpredict -m hmm-model -o 
hmm-predictions -l 10
-</code></pre></div></div>
+<pre><code>$ $MAHOUT_HOME/bin/mahout hmmpredict -m hmm-model -o 
hmm-predictions -l 10
+</code></pre>
 
 <p>To see the predictions:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>$ cat hmm-predictions 
+<pre><code>$ cat hmm-predictions 
 0 1 3 3 2 2 2 2 1 2
-</code></pre></div></div>
+</code></pre>
 
 <p><a name="HiddenMarkovModels-Resources"></a></p>
 <h2 id="resources">Resources</h2>

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/classification/mlp.html
----------------------------------------------------------------------
diff --git a/users/classification/mlp.html b/users/classification/mlp.html
index 5283911..4983775 100644
--- a/users/classification/mlp.html
+++ b/users/classification/mlp.html
@@ -285,9 +285,9 @@ can be used for classification and regression tasks in a 
supervised learning app
 can be used with the following commands:</p>
 
 <h1 id="model-training">model training</h1>
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>$ bin/mahout 
org.apache.mahout.classifier.mlp.TrainMultilayerPerceptron  # model usage
+<pre><code>$ bin/mahout 
org.apache.mahout.classifier.mlp.TrainMultilayerPerceptron   # model training
 $ bin/mahout org.apache.mahout.classifier.mlp.RunMultilayerPerceptron   # model usage
-</code></pre></div></div>
+</code></pre>
 
 <p>To train and use the model, a number of parameters can be specified. 
Parameters without default values have to be specified by the user. Note that 
not all parameters can be used both for training and running the model. We 
give an example of the usage below.</p>
 
@@ -336,7 +336,7 @@ $ bin/mahout 
org.apache.mahout.classifier.mlp.RunMultilayerPerceptron
     <tr>
       <td style="text-align: left">–layerSize -ls</td>
       <td style="text-align: right"> </td>
-      <td style="text-align: left">Number of units per layer, including input, 
hidden and ouput layers. This parameter specifies the topology of the network 
(see <a href="mlperceptron_structure.png" title="Architecture of a three-layer 
MLP">this image</a> for an example specified by <code 
class="highlighter-rouge">-ls 4 8 3</code>).</td>
+      <td style="text-align: left">Number of units per layer, including input, 
hidden and ouput layers. This parameter specifies the topology of the network 
(see <a href="mlperceptron_structure.png" title="Architecture of a three-layer 
MLP">this image</a> for an example specified by <code>-ls 4 8 3</code>).</td>
       <td style="text-align: left">training</td>
     </tr>
     <tr>
@@ -372,7 +372,7 @@ $ bin/mahout 
org.apache.mahout.classifier.mlp.RunMultilayerPerceptron
     <tr>
       <td style="text-align: left">–columnRange -cr</td>
       <td style="text-align: right"> </td>
-      <td style="text-align: left">Range of the columns to use from the input 
file, starting with 0 (i.e. <code class="highlighter-rouge">-cr 0 5</code> for 
including the first six columns only)</td>
+      <td style="text-align: left">Range of the columns to use from the input 
file, starting with 0 (i.e. <code>-cr 0 5</code> for including the first six 
columns only)</td>
       <td style="text-align: left">testing</td>
     </tr>
     <tr>
@@ -393,23 +393,23 @@ The dimensions of the data set are given through some 
flower parameters (sepal l
 
 <p>To train our multilayer perceptron model from the command line, we call the 
following command:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>$ bin/mahout 
org.apache.mahout.classifier.mlp.TrainMultilayerPerceptron \
+<pre><code>$ bin/mahout 
org.apache.mahout.classifier.mlp.TrainMultilayerPerceptron \
             -i ./mrlegacy/src/test/resources/iris.csv -sh \
             -labels setosa versicolor virginica \
             -mo /tmp/model.model -ls 4 8 3 -l 0.2 -m 0.35 -r 0.0001
-</code></pre></div></div>
+</code></pre>
 
 <p>The individual parameters are explained below.</p>
 
 <ul>
-  <li><code class="highlighter-rouge">-i 
./mrlegacy/src/test/resources/iris.csv</code> use the iris data set as input 
data</li>
-  <li><code class="highlighter-rouge">-sh</code> since the file <code 
class="highlighter-rouge">iris.csv</code> contains a header row, this row needs 
to be skipped</li>
-  <li><code class="highlighter-rouge">-labels setosa versicolor 
virginica</code> we specify, which class labels should be learnt (which are the 
flower species in this case)</li>
-  <li><code class="highlighter-rouge">-mo /tmp/model.model</code> specify 
where to store the model file</li>
-  <li><code class="highlighter-rouge">-ls 4 8 3</code> we specify the 
structure and depth of our layers. The actual network structure can be seen in 
the figure below.</li>
-  <li><code class="highlighter-rouge">-l 0.2</code> we set the learning rate 
to <code class="highlighter-rouge">0.2</code></li>
-  <li><code class="highlighter-rouge">-m 0.35</code> momemtum weight is set to 
<code class="highlighter-rouge">0.35</code></li>
-  <li><code class="highlighter-rouge">-r 0.0001</code> regularization weight 
is set to <code class="highlighter-rouge">0.0001</code></li>
+  <li><code>-i ./mrlegacy/src/test/resources/iris.csv</code> use the iris data 
set as input data</li>
+  <li><code>-sh</code> since the file <code>iris.csv</code> contains a header 
row, this row needs to be skipped</li>
+  <li><code>-labels setosa versicolor virginica</code> we specify which class 
labels should be learned (the flower species in this case)</li>
+  <li><code>-mo /tmp/model.model</code> specify where to store the model 
file</li>
+  <li><code>-ls 4 8 3</code> we specify the structure and depth of our layers. 
The actual network structure can be seen in the figure below.</li>
+  <li><code>-l 0.2</code> we set the learning rate to <code>0.2</code></li>
+  <li><code>-m 0.35</code> momentum weight is set to <code>0.35</code></li>
+  <li><code>-r 0.0001</code> regularization weight is set to 
<code>0.0001</code></li>
 </ul>
 
 <table>
@@ -431,19 +431,19 @@ The dimensions of the data set are given through some 
flower parameters (sepal l
 
 <p>To test / run the multilayer perceptron classification on the trained 
model, we can use the following command:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>$ bin/mahout 
org.apache.mahout.classifier.mlp.RunMultilayerPerceptron \
+<pre><code>$ bin/mahout 
org.apache.mahout.classifier.mlp.RunMultilayerPerceptron \
             -i ./mrlegacy/src/test/resources/iris.csv -sh -cr 0 3 \
             -mo /tmp/model.model -o /tmp/labelResult.txt
-</code></pre></div></div>
+</code></pre>
 
 <p>The individual parameters are explained below.</p>
 
 <ul>
-  <li><code class="highlighter-rouge">-i 
./mrlegacy/src/test/resources/iris.csv</code> use the iris data set as input 
data</li>
-  <li><code class="highlighter-rouge">-sh</code> since the file <code 
class="highlighter-rouge">iris.csv</code> contains a header row, this row needs 
to be skipped</li>
-  <li><code class="highlighter-rouge">-cr 0 3</code> we specify the column 
range of the input file</li>
-  <li><code class="highlighter-rouge">-mo /tmp/model.model</code> specify 
where the model file is stored</li>
-  <li><code class="highlighter-rouge">-o /tmp/labelResult.txt</code> specify 
where the labeled output file will be stored</li>
+  <li><code>-i ./mrlegacy/src/test/resources/iris.csv</code> use the iris data 
set as input data</li>
+  <li><code>-sh</code> since the file <code>iris.csv</code> contains a header 
row, this row needs to be skipped</li>
+  <li><code>-cr 0 3</code> we specify the column range of the input file</li>
+  <li><code>-mo /tmp/model.model</code> specify where the model file is 
stored</li>
+  <li><code>-o /tmp/labelResult.txt</code> specify where the labeled output 
file will be stored</li>
 </ul>
 
 <h2 id="implementation">Implementation</h2>
@@ -460,7 +460,7 @@ Currently, the logistic sigmoid is used as a squashing 
function in every hidden
 
 <p>The command line version <strong>does not perform iterations</strong>, which 
leads to bad results on small datasets. Another restriction is that the CLI 
version of the MLP only supports classification, since the labels have to be 
given explicitly when executing on the command line.</p>
 
-<p>A learned model can be stored and updated with new training instanced using 
the <code class="highlighter-rouge">--update</code> flag. Output of 
classification reults is saved as a .txt-file and only consists of the assigned 
labels. Apart from the command-line interface, it is possible to construct and 
compile more specialized neural networks using the API and interfaces in the 
mrlegacy package.</p>
+<p>A learned model can be stored and updated with new training instances using 
the <code>--update</code> flag. Output of classification results is saved as a 
.txt file and only consists of the assigned labels. Apart from the command-line 
interface, it is possible to construct and compile more specialized neural 
networks using the API and interfaces in the mrlegacy package.</p>
 
 <h2 id="theoretical-background">Theoretical Background</h2>
 

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/classification/partial-implementation.html
----------------------------------------------------------------------
diff --git a/users/classification/partial-implementation.html 
b/users/classification/partial-implementation.html
index 5028896..6310eca 100644
--- a/users/classification/partial-implementation.html
+++ b/users/classification/partial-implementation.html
@@ -316,8 +316,8 @@ $HADOOP_HOME/bin/hadoop fs -put <PATH TO="" DATA=""> 
testdata{code}</PATH></li>
 <h2 id="generate-a-file-descriptor-for-the-dataset">Generate a file descriptor 
for the dataset:</h2>
 <p>Run the following command:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>$HADOOP_HOME/bin/hadoop jar 
$MAHOUT_HOME/core/target/mahout-core-&lt;VERSION&gt;-job.jar 
org.apache.mahout.classifier.df.tools.Describe -p testdata/KDDTrain+.arff -f 
testdata/KDDTrain+.info -d N 3 C 2 N C 4 N C 8 N 2 C 19 N L
-</code></pre></div></div>
+<pre><code>$HADOOP_HOME/bin/hadoop jar 
$MAHOUT_HOME/core/target/mahout-core-&lt;VERSION&gt;-job.jar 
org.apache.mahout.classifier.df.tools.Describe -p testdata/KDDTrain+.arff -f 
testdata/KDDTrain+.info -d N 3 C 2 N C 4 N C 8 N 2 C 19 N L
+</code></pre>
 
 <p>The “N 3 C 2 N C 4 N C 8 N 2 C 19 N L” string describes all the 
attributes
 of the data. In this case, it means 1 numerical (N) attribute, followed by
@@ -327,8 +327,8 @@ to ignore some attributes</p>
 <p><a name="PartialImplementation-Runtheexample"></a></p>
 <h2 id="run-the-example">Run the example</h2>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>$HADOOP_HOME/bin/hadoop jar 
$MAHOUT_HOME/examples/target/mahout-examples-&lt;version&gt;-job.jar 
org.apache.mahout.classifier.df.mapreduce.BuildForest 
-Dmapred.max.split.size=1874231 -d testdata/KDDTrain+.arff -ds 
testdata/KDDTrain+.info -sl 5 -p -t 100 -o nsl-forest
-</code></pre></div></div>
+<pre><code>$HADOOP_HOME/bin/hadoop jar 
$MAHOUT_HOME/examples/target/mahout-examples-&lt;version&gt;-job.jar 
org.apache.mahout.classifier.df.mapreduce.BuildForest 
-Dmapred.max.split.size=1874231 -d testdata/KDDTrain+.arff -ds 
testdata/KDDTrain+.info -sl 5 -p -t 100 -o nsl-forest
+</code></pre>
 
 <p>which builds 100 trees (-t argument) using the partial implementation (-p).
 Each tree is built using 5 randomly selected attributes per node (-sl
@@ -356,8 +356,8 @@ nsl-forest/forest.seq</p>
 <h2 id="using-the-decision-forest-to-classify-new-data">Using the Decision 
Forest to Classify new data</h2>
 <p>Run the following command:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>$HADOOP_HOME/bin/hadoop jar 
$MAHOUT_HOME/examples/target/mahout-examples-&lt;version&gt;-job.jar 
org.apache.mahout.classifier.df.mapreduce.TestForest -i nsl-kdd/KDDTest+.arff 
-ds nsl-kdd/KDDTrain+.info -m nsl-forest -a -mr -o predictions
-</code></pre></div></div>
+<pre><code>$HADOOP_HOME/bin/hadoop jar 
$MAHOUT_HOME/examples/target/mahout-examples-&lt;version&gt;-job.jar 
org.apache.mahout.classifier.df.mapreduce.TestForest -i nsl-kdd/KDDTest+.arff 
-ds nsl-kdd/KDDTrain+.info -m nsl-forest -a -mr -o predictions
+</code></pre>
 
 <p>This will compute the predictions of the “KDDTest+.arff” dataset (-i 
argument)
 using the same data descriptor generated for the training dataset (-ds) and

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/classification/twenty-newsgroups.html
----------------------------------------------------------------------
diff --git a/users/classification/twenty-newsgroups.html 
b/users/classification/twenty-newsgroups.html
index 291719f..c671aab 100644
--- a/users/classification/twenty-newsgroups.html
+++ b/users/classification/twenty-newsgroups.html
@@ -307,35 +307,35 @@ the 20 newsgroups.</p>
   <li>
     <p>If running Hadoop in cluster mode, start the hadoop daemons by 
executing the following commands:</p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>     $ cd $HADOOP_HOME/bin
+    <pre><code>     $ cd $HADOOP_HOME/bin
      $ ./start-all.sh
-</code></pre></div>    </div>
+</code></pre>
 
     <p>Otherwise:</p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>     $ export MAHOUT_LOCAL=true
-</code></pre></div>    </div>
+    <pre><code>     $ export MAHOUT_LOCAL=true
+</code></pre>
   </li>
   <li>
     <p>In the trunk directory of Mahout, compile and install Mahout:</p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>     $ cd $MAHOUT_HOME
+    <pre><code>     $ cd $MAHOUT_HOME
      $ mvn -DskipTests clean install
-</code></pre></div>    </div>
+</code></pre>
   </li>
   <li>
     <p>Run the <a 
href="https://github.com/apache/mahout/blob/master/examples/bin/classify-20newsgroups.sh";>20
 newsgroups example script</a> by executing:</p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>     $ ./examples/bin/classify-20newsgroups.sh
-</code></pre></div>    </div>
+    <pre><code>     $ ./examples/bin/classify-20newsgroups.sh
+</code></pre>
   </li>
   <li>
    <p>You will be prompted to select a classification algorithm:</p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>     1. Complement Naive Bayes
+    <pre><code>     1. Complement Naive Bayes
      2. Naive Bayes
      3. Stochastic Gradient Descent
-</code></pre></div>    </div>
+</code></pre>
   </li>
 </ol>
 
@@ -353,7 +353,7 @@ the 20 newsgroups.</p>
 
 <p>Output should look something like:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>=======================================================
+<pre><code>=======================================================
 Confusion Matrix
 -------------------------------------------------------
  a  b  c  d  e  f  g  h  i  j  k  l  m  n  o  p  q  r  s  t &lt;--Classified as
@@ -384,7 +384,7 @@ Kappa                                       0.8808
 Accuracy                                   90.8596%
 Reliability                                86.3632%
 Reliability (standard deviation)            0.2131
-</code></pre></div></div>
+</code></pre>
 
 <p><a name="TwentyNewsgroups-ComplementaryNaiveBayes"></a></p>
 <h2 id="end-to-end-commands-to-build-a-cbayes-model-for-20-newsgroups">End to 
end commands to build a CBayes model for 20 newsgroups</h2>
@@ -396,14 +396,14 @@ Reliability (standard deviation)            0.2131
   <li>
     <p>Create a working directory for the dataset and all input/output.</p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>     $ export WORK_DIR=/tmp/mahout-work-${USER}
+    <pre><code>     $ export WORK_DIR=/tmp/mahout-work-${USER}
      $ mkdir -p ${WORK_DIR}
-</code></pre></div>    </div>
+</code></pre>
   </li>
   <li>
     <p>Download and extract the <em>20news-bydate.tar.gz</em> from the <a 
href="http://people.csail.mit.edu/jrennie/20Newsgroups/20news-bydate.tar.gz";>20newsgroups
 dataset</a> to the working directory.</p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>     $ curl 
http://people.csail.mit.edu/jrennie/20Newsgroups/20news-bydate.tar.gz 
+    <pre><code>     $ curl 
http://people.csail.mit.edu/jrennie/20Newsgroups/20news-bydate.tar.gz 
          -o ${WORK_DIR}/20news-bydate.tar.gz
      $ mkdir -p ${WORK_DIR}/20news-bydate
      $ cd ${WORK_DIR}/20news-bydate &amp;&amp; tar xzf ../20news-bydate.tar.gz 
&amp;&amp; cd .. &amp;&amp; cd ..
@@ -411,62 +411,62 @@ Reliability (standard deviation)            0.2131
     $ cp -R ${WORK_DIR}/20news-bydate/*/* ${WORK_DIR}/20news-all
 
     # If you're running on a Hadoop cluster:
      $ hadoop dfs -put ${WORK_DIR}/20news-all ${WORK_DIR}/20news-all
-</code></pre></div>    </div>
+</code></pre>
   </li>
   <li>
     <p>Convert the full 20 newsgroups dataset into a &lt; Text, Text &gt; 
SequenceFile.</p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>     $ mahout seqdirectory 
+    <pre><code>     $ mahout seqdirectory 
          -i ${WORK_DIR}/20news-all 
          -o ${WORK_DIR}/20news-seq 
          -ow
-</code></pre></div>    </div>
+</code></pre>
   </li>
   <li>
    <p>Convert and preprocess the dataset into a &lt; Text, VectorWritable 
&gt; SequenceFile containing term frequencies for each document.</p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>     $ mahout seq2sparse 
+    <pre><code>     $ mahout seq2sparse 
          -i ${WORK_DIR}/20news-seq 
          -o ${WORK_DIR}/20news-vectors
          -lnorm 
          -nv 
         -wt tfidf
-</code></pre></div>    </div>
+</code></pre>
+
+    <p>If we wanted to use different parsing methods or transformations on the 
term frequency vectors we could supply different options here, e.g. -ng 2 for 
bigrams or -n 2 for L2 length normalization.  See the <a 
href="http://mahout.apache.org/users/basics/creating-vectors-from-text.html">Creating 
vectors from text</a> page for a list of all seq2sparse options.</p>
   </li>
   <li>
     <p>Split the preprocessed dataset into training and testing sets.</p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>     $ mahout split 
+    <pre><code>     $ mahout split 
          -i ${WORK_DIR}/20news-vectors/tfidf-vectors 
          --trainingOutput ${WORK_DIR}/20news-train-vectors 
          --testOutput ${WORK_DIR}/20news-test-vectors  
          --randomSelectionPct 40 
          --overwrite --sequenceFiles -xm sequential
-</code></pre></div>    </div>
+</code></pre>
   </li>
   <li>
     <p>Train the classifier.</p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>     $ mahout trainnb 
+    <pre><code>     $ mahout trainnb 
          -i ${WORK_DIR}/20news-train-vectors
          -el  
          -o ${WORK_DIR}/model 
          -li ${WORK_DIR}/labelindex 
          -ow 
          -c
-</code></pre></div>    </div>
+</code></pre>
   </li>
   <li>
     <p>Test the classifier.</p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>     $ mahout testnb 
+    <pre><code>     $ mahout testnb 
          -i ${WORK_DIR}/20news-test-vectors
          -m ${WORK_DIR}/model 
          -l ${WORK_DIR}/labelindex 
          -ow 
          -o ${WORK_DIR}/20news-testing 
          -c
-</code></pre></div>    </div>
+</code></pre>
   </li>
 </ol>
 

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/classification/wikipedia-classifier-example.html
----------------------------------------------------------------------
diff --git a/users/classification/wikipedia-classifier-example.html 
b/users/classification/wikipedia-classifier-example.html
index bda386c..0d10dd1 100644
--- a/users/classification/wikipedia-classifier-example.html
+++ b/users/classification/wikipedia-classifier-example.html
@@ -281,32 +281,32 @@
 
 <h2 id="oververview">Oververview</h2>
 
-<p>Tou run the example simply execute the <code 
class="highlighter-rouge">$MAHOUT_HOME/examples/bin/classify-wikipedia.sh</code>
 script.</p>
+<p>To run the example, simply execute the 
<code>$MAHOUT_HOME/examples/bin/classify-wikipedia.sh</code> script.</p>
 
 <p>By default the script is set to run on a medium-sized Wikipedia XML dump.  
To run on the full set (the entire English Wikipedia) you can change the 
download by commenting out line 78 and uncommenting line 80 of <a 
href="https://github.com/apache/mahout/blob/master/examples/bin/classify-wikipedia.sh">classify-wikipedia.sh</a>
 [1]. However, this is not recommended unless you have the resources to do so. 
<em>Be sure to clean your work directory when changing datasets (option 
3).</em></p>
 
-<p>The step by step process for Creating a Naive Bayes Classifier for the 
Wikipedia XML dump is very similar to that for <a 
href="http://mahout.apache.org/users/classification/twenty-newsgroups.html";>creating
 a 20 Newsgroups Classifier</a> [4].  The only difference being that instead of 
running <code class="highlighter-rouge">$mahout seqdirectory</code> on the 
unzipped 20 Newsgroups file, you’ll run <code 
class="highlighter-rouge">$mahout seqwiki</code> on the unzipped Wikipedia xml 
dump.</p>
+<p>The step-by-step process for creating a Naive Bayes classifier for the 
Wikipedia XML dump is very similar to that for <a 
href="http://mahout.apache.org/users/classification/twenty-newsgroups.html">creating
 a 20 Newsgroups Classifier</a> [4].  The only difference is that instead of 
running <code>$mahout seqdirectory</code> on the unzipped 20 Newsgroups file, 
you’ll run <code>$mahout seqwiki</code> on the unzipped Wikipedia XML 
dump.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>$ mahout seqwiki 
-</code></pre></div></div>
+<pre><code>$ mahout seqwiki 
+</code></pre>
 
-<p>The above command launches <code 
class="highlighter-rouge">WikipediaToSequenceFile.java</code> which accepts a 
text file of categories [3] and starts an MR job to parse the each document in 
the XML file.  This process will seek to extract documents with a wikipedia 
category tag which (exactly, if the <code 
class="highlighter-rouge">-exactMatchOnly</code> option is set) matches a line 
in the category file.  If no match is found and the <code 
class="highlighter-rouge">-all</code> option is set, the document will be 
dumped into an “unknown” category. The documents will then be written out 
as a <code class="highlighter-rouge">&lt;Text,Text&gt;</code> sequence file of 
the form (K:/category/document_title , V: document).</p>
+<p>The above command launches <code>WikipediaToSequenceFile.java</code>, which 
accepts a text file of categories [3] and starts an MR job to parse each 
document in the XML file.  This process will seek to extract documents with a 
Wikipedia category tag which matches (exactly, if the 
<code>-exactMatchOnly</code> option is set) a line in the category file.  If no 
match is found and the <code>-all</code> option is set, the document will be 
dumped into an “unknown” category. The documents will then be written out 
as a <code>&lt;Text,Text&gt;</code> sequence file of the form 
(K: /category/document_title, V: document).</p>
 
 <p>There are 3 different example category files available in the 
/examples/src/test/resources
 directory:  country.txt, country10.txt and country2.txt.  You can edit these 
categories to extract a different corpus from the Wikipedia dataset.</p>
 
-<p>The CLI options for <code class="highlighter-rouge">seqwiki</code> are as 
follows:</p>
+<p>The CLI options for <code>seqwiki</code> are as follows:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>--input          (-i)         input pathname String
+<pre><code>--input          (-i)         input pathname String
 --output         (-o)         the output pathname String
 --categories     (-c)         the file containing the Wikipedia categories
 --exactMatchOnly (-e)         if set, then the Wikipedia category must match
                                 exactly instead of simply containing the 
category string
 --all            (-all)       if set select all categories
 --removeLabels   (-rl)        if set, remove [[Category:labels]] from document 
text after extracting label.
-</code></pre></div></div>
+</code></pre>
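
<p>As an illustrative sketch (the <code>${WORK_DIR}</code> paths here are 
assumptions for this example, not values taken from the script), an invocation 
that extracts the country10.txt categories might look like:</p>

<pre><code>$ mahout seqwiki \
    -i ${WORK_DIR}/enwiki-pages-articles.xml \
    -o ${WORK_DIR}/wikipedia-seqfiles \
    -c ${MAHOUT_HOME}/examples/src/test/resources/country10.txt \
    -e
</code></pre>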
 
-<p>After <code class="highlighter-rouge">seqwiki</code>, the script runs <code 
class="highlighter-rouge">seq2sparse</code>, <code 
class="highlighter-rouge">split</code>, <code 
class="highlighter-rouge">trainnb</code> and <code 
class="highlighter-rouge">testnb</code> as in the <a 
href="http://mahout.apache.org/users/classification/twenty-newsgroups.html";>step
 by step 20newsgroups example</a>.  When all of the jobs have finished, a 
confusion matrix will be displayed.</p>
+<p>After <code>seqwiki</code>, the script runs <code>seq2sparse</code>, 
<code>split</code>, <code>trainnb</code> and <code>testnb</code> as in the <a 
href="http://mahout.apache.org/users/classification/twenty-newsgroups.html">step
 by step 20newsgroups example</a>.  When all of the jobs have finished, a 
confusion matrix will be displayed.</p>
 
 <h2 id="resources">Resources</h2>
 

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/clustering/canopy-clustering.html
----------------------------------------------------------------------
diff --git a/users/clustering/canopy-clustering.html 
b/users/clustering/canopy-clustering.html
index 1b17ff2..06d0a13 100644
--- a/users/clustering/canopy-clustering.html
+++ b/users/clustering/canopy-clustering.html
@@ -361,7 +361,7 @@ Both require several arguments:</p>
 
 <p>Invocation using the command line takes the form:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>bin/mahout canopy \
+<pre><code>bin/mahout canopy \
     -i &lt;input vectors directory&gt; \
     -o &lt;output working directory&gt; \
     -dm &lt;DistanceMeasure&gt; \
@@ -373,7 +373,7 @@ Both require several arguments:</p>
     -ow &lt;overwrite output directory if present&gt;
     -cl &lt;run input vector clustering after computing Canopies&gt;
     -xm &lt;execution method: sequential or mapreduce&gt;
-</code></pre></div></div>
+</code></pre>
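
<p>For example, a concrete run might look like the sketch below; the paths and 
the T1/T2 thresholds are illustrative assumptions, not prescribed defaults 
(T1 must be larger than T2):</p>

<pre><code>bin/mahout canopy \
    -i ${WORK_DIR}/vectors \
    -o ${WORK_DIR}/canopy-output \
    -dm org.apache.mahout.common.distance.EuclideanDistanceMeasure \
    -t1 3.0 \
    -t2 1.5 \
    -ow \
    -cl \
    -xm mapreduce
</code></pre>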
 
 <p>Invocation using Java involves supplying the following arguments:</p>
 

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/clustering/canopy-commandline.html
----------------------------------------------------------------------
diff --git a/users/clustering/canopy-commandline.html 
b/users/clustering/canopy-commandline.html
index e878275..fb7f2eb 100644
--- a/users/clustering/canopy-commandline.html
+++ b/users/clustering/canopy-commandline.html
@@ -282,8 +282,8 @@ an operating Hadoop cluster on the target machine then the 
invocation will
 run Canopy on that cluster. If either of the environment variables is
 missing, the stand-alone Hadoop configuration will be invoked instead.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>./bin/mahout canopy &lt;OPTIONS&gt;
-</code></pre></div></div>
+<pre><code>./bin/mahout canopy &lt;OPTIONS&gt;
+</code></pre>
 
 <ul>
  <li>In $MAHOUT_HOME/, build the jar containing the job (mvn install). The job
@@ -326,7 +326,7 @@ to view all outputs.</li>
 <p><a name="canopy-commandline-Commandlineoptions"></a></p>
 <h1 id="command-line-options">Command line options</h1>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>  --input (-i) input                             Path 
to job input directory.Must  
+<pre><code>  --input (-i) input                             Path to job input 
directory. Must  
                                             be a SequenceFile of           
                                             VectorWritable                 
   --output (-o) output                      The directory pathname for output. 
@@ -340,7 +340,7 @@ to view all outputs.</li>
   --clustering (-cl)                        If present, run clustering after   
                                             the iterations have taken place    
 
   --help (-h)                               Print out help                 
-</code></pre></div></div>
+</code></pre>
 
 
    </div>

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/clustering/cluster-dumper.html
----------------------------------------------------------------------
diff --git a/users/clustering/cluster-dumper.html 
b/users/clustering/cluster-dumper.html
index ba4e841..2fe2421 100644
--- a/users/clustering/cluster-dumper.html
+++ b/users/clustering/cluster-dumper.html
@@ -295,15 +295,15 @@ you can run clusterdumper in 2 modes:</p>
 <h3 id="hadoop-environment">Hadoop Environment</h3>
 
 <p>If you have set up your HADOOP_HOME environment variable, you can use the
-command line utility <code class="highlighter-rouge">mahout</code> to execute 
the ClusterDumper on Hadoop. In
+command line utility <code>mahout</code> to execute the ClusterDumper on 
Hadoop. In
this case we won’t need to copy the output clusters to our local machine.
 The utility will read the output clusters present in HDFS and output the
 human-readable cluster values into our local file system. Say you’ve just
 executed the <a href="clustering-of-synthetic-control-data.html">synthetic 
control example </a>
- and want to analyze the output, you can execute the <code 
class="highlighter-rouge">mahout clusterdumper</code> utility from the command 
line.</p>
+ and want to analyze the output, you can execute the <code>mahout 
clusterdump</code> utility from the command line.</p>
 
 <h4 id="cli-options">CLI options:</h4>
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>--help                               Print out help 
+<pre><code>--help                               Print out help 
 --input (-i) input                   The directory containing Sequence
                                        Files for the Clusters      
 --output (-o) output                 The output file.  If not specified,
@@ -329,7 +329,7 @@ executed the <a 
href="clustering-of-synthetic-control-data.html">synthetic contr
 --evaluate (-e)                      Run ClusterEvaluator and CDbwEvaluator 
over the
                                       input. The output will be appended to 
the rest of
                                       the output at the end.   
-</code></pre></div></div>
+</code></pre>
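
<p>For instance, to dump the final clusters of the synthetic control example 
into a local text file (both paths are illustrative assumptions):</p>

<pre><code>mahout clusterdump \
    -i examples/output/clusters-10 \
    -o clusteranalyze.txt
</code></pre>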
 
 <h3 id="standalone-java-program">Standalone Java Program</h3>
 
@@ -350,11 +350,11 @@ executed the <a 
href="clustering-of-synthetic-control-data.html">synthetic contr
 
 <p>In the arguments tab, specify the below arguments</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>--seqFileDir 
&lt;MAHOUT_HOME&gt;/examples/output/clusters-10 
+<pre><code>--seqFileDir &lt;MAHOUT_HOME&gt;/examples/output/clusters-10 
 --pointsDir &lt;MAHOUT_HOME&gt;/examples/output/clusteredPoints 
 --output &lt;MAHOUT_HOME&gt;/examples/output/clusteranalyze.txt
 replace &lt;MAHOUT_HOME&gt; with the actual path of your $MAHOUT_HOME
-</code></pre></div></div>
+</code></pre>
 
 <ul>
  <li>Hit run to execute the ClusterDumper using Eclipse. Setting breakpoints 
etc. should just work fine.</li>

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/clustering/clustering-of-synthetic-control-data.html
----------------------------------------------------------------------
diff --git a/users/clustering/clustering-of-synthetic-control-data.html 
b/users/clustering/clustering-of-synthetic-control-data.html
index 2441536..ec32638 100644
--- a/users/clustering/clustering-of-synthetic-control-data.html
+++ b/users/clustering/clustering-of-synthetic-control-data.html
@@ -312,22 +312,22 @@
   <li><a href="/users/clustering/canopy-clustering.html">Canopy 
Clustering</a></li>
 </ul>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>bin/mahout 
org.apache.mahout.clustering.syntheticcontrol.canopy.Job
-</code></pre></div></div>
+<pre><code>bin/mahout org.apache.mahout.clustering.syntheticcontrol.canopy.Job
+</code></pre>
 
 <ul>
   <li><a href="/users/clustering/k-means-clustering.html">k-Means 
Clustering</a></li>
 </ul>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>bin/mahout 
org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
-</code></pre></div></div>
+<pre><code>bin/mahout org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
+</code></pre>
 
 <ul>
   <li><a href="/users/clustering/fuzzy-k-means.html">Fuzzy k-Means 
Clustering</a></li>
 </ul>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>bin/mahout 
org.apache.mahout.clustering.syntheticcontrol.fuzzykmeans.Job
-</code></pre></div></div>
+<pre><code>bin/mahout 
org.apache.mahout.clustering.syntheticcontrol.fuzzykmeans.Job
+</code></pre>
 
 <p>The clustering output will be produced in the <em>output</em> directory. 
The output data points are in vector format. In order to read/analyze the 
output, you can use the <a 
href="/users/clustering/cluster-dumper.html">clusterdump</a> utility provided 
by Mahout.</p>
 

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/clustering/clusteringyourdata.html
----------------------------------------------------------------------
diff --git a/users/clustering/clusteringyourdata.html 
b/users/clustering/clusteringyourdata.html
index 695ed10..6dbe65c 100644
--- a/users/clustering/clusteringyourdata.html
+++ b/users/clustering/clusteringyourdata.html
@@ -315,13 +315,13 @@ In particular for text preparation check out <a 
href="../basics/creating-vectors
 
 <p>Mahout has a cluster dumper utility that can be used to retrieve and 
evaluate your clustering data.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>./bin/mahout clusterdump &lt;OPTIONS&gt;
-</code></pre></div></div>
+<pre><code>./bin/mahout clusterdump &lt;OPTIONS&gt;
+</code></pre>
 
 <p><a name="ClusteringYourData-Theclusterdumperoptionsare:"></a></p>
 <h2 id="the-cluster-dumper-options-are">The cluster dumper options are:</h2>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>  --help (-h)                                  Print 
out help       
+<pre><code>  --help (-h)                                  Print out help       
     
   --input (-i) input                      The directory containing Sequence    
                                           Files for the Clusters           
@@ -359,7 +359,7 @@ In particular for text preparation check out <a 
href="../basics/creating-vectors
   --evaluate (-e)                         Run ClusterEvaluator and 
CDbwEvaluator over the
                                           input. The output will be appended 
to the rest of
                                           the output at the end.   
-</code></pre></div></div>
+</code></pre>
 
 <p>More information on using the clusterdump utility can be found <a 
href="cluster-dumper.html">here</a>.</p>
 

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/clustering/fuzzy-k-means-commandline.html
----------------------------------------------------------------------
diff --git a/users/clustering/fuzzy-k-means-commandline.html 
b/users/clustering/fuzzy-k-means-commandline.html
index 4b8cb3d..7be184e 100644
--- a/users/clustering/fuzzy-k-means-commandline.html
+++ b/users/clustering/fuzzy-k-means-commandline.html
@@ -282,8 +282,8 @@ an operating Hadoop cluster on the target machine then the 
invocation will
 run FuzzyK on that cluster. If either of the environment variables is
 missing, the stand-alone Hadoop configuration will be invoked instead.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>./bin/mahout fkmeans &lt;OPTIONS&gt;
-</code></pre></div></div>
+<pre><code>./bin/mahout fkmeans &lt;OPTIONS&gt;
+</code></pre>
 
 <ul>
  <li>In $MAHOUT_HOME/, build the jar containing the job (mvn install). The job
@@ -324,7 +324,7 @@ to view all outputs.</li>
 <p><a name="fuzzy-k-means-commandline-Commandlineoptions"></a></p>
 <h1 id="command-line-options">Command line options</h1>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>  --input (-i) input                               Path 
to job input directory. 
+<pre><code>  --input (-i) input                               Path to job 
input directory. 
                                               Must be a SequenceFile of    
                                               VectorWritable               
   --clusters (-c) clusters                    The input centroids, as Vectors. 
@@ -366,7 +366,7 @@ to view all outputs.</li>
                                               is 0 
   --clustering (-cl)                          If present, run clustering after 
                                               the iterations have taken place  
-</code></pre></div></div>
+</code></pre>
 
 
    </div>

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/clustering/fuzzy-k-means.html
----------------------------------------------------------------------
diff --git a/users/clustering/fuzzy-k-means.html 
b/users/clustering/fuzzy-k-means.html
index 44f5c14..648c188 100644
--- a/users/clustering/fuzzy-k-means.html
+++ b/users/clustering/fuzzy-k-means.html
@@ -351,7 +351,7 @@ FuzzyKMeansDriver.run().</p>
 
 <p>Invocation using the command line takes the form:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>bin/mahout fkmeans \
+<pre><code>bin/mahout fkmeans \
     -i &lt;input vectors directory&gt; \
     -c &lt;input clusters directory&gt; \
     -o &lt;output working directory&gt; \
@@ -365,7 +365,7 @@ FuzzyKMeansDriver.run().</p>
     -e &lt;emit vectors to most likely cluster during clustering&gt;
     -t &lt;threshold to use for clustering if -e is false&gt;
     -xm &lt;execution method: sequential or mapreduce&gt;
-</code></pre></div></div>
+</code></pre>
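
<p>A concrete invocation might look like the following sketch; the paths, the 
fuzziness factor -m and the value of -k are illustrative assumptions (see the 
note below on how -k interacts with the -c directory):</p>

<pre><code>bin/mahout fkmeans \
    -i ${WORK_DIR}/vectors \
    -c ${WORK_DIR}/initial-clusters \
    -o ${WORK_DIR}/fkmeans-output \
    -dm org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure \
    -m 2.0 \
    -k 20 \
    -x 10 \
    -ow \
    -cl \
    -xm mapreduce
</code></pre>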
 
 <p><em>Note:</em> if the -k argument is supplied, any clusters in the -c 
directory
 will be overwritten and -k random points will be sampled from the input

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/clustering/k-means-clustering.html
----------------------------------------------------------------------
diff --git a/users/clustering/k-means-clustering.html 
b/users/clustering/k-means-clustering.html
index 21f9e2f..431aaa7 100644
--- a/users/clustering/k-means-clustering.html
+++ b/users/clustering/k-means-clustering.html
@@ -331,14 +331,14 @@ clustering and convergence values.</p>
 
 <p>Canopy clustering can be used to compute the initial clusters for 
k-Means:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>// run the CanopyDriver job
+<pre><code>// run the CanopyDriver job
 CanopyDriver.runJob("testdata", "output",
 ManhattanDistanceMeasure.class.getName(), (float) 3.1, (float) 2.1, false);
 
 // now run the KMeansDriver job
 KMeansDriver.runJob("testdata", "output/clusters-0", "output",
 EuclideanDistanceMeasure.class.getName(), "0.001", "10", true);
-</code></pre></div></div>
+</code></pre>
 
 <p>In the above example, the input data points are stored in ‘testdata’ and
 the CanopyDriver is configured to output to the ‘output/clusters-0’
@@ -359,7 +359,7 @@ on KMeansDriver.main or by making a Java call to 
KMeansDriver.runJob().</p>
 
 <p>Invocation using the command line takes the form:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>bin/mahout kmeans \
+<pre><code>bin/mahout kmeans \
     -i &lt;input vectors directory&gt; \
     -c &lt;input clusters directory&gt; \
     -o &lt;output working directory&gt; \
@@ -370,7 +370,7 @@ on KMeansDriver.main or by making a Java call to 
KMeansDriver.runJob().</p>
     -ow &lt;overwrite output directory if present&gt;
     -cl &lt;run input vector clustering after computing Canopies&gt;
     -xm &lt;execution method: sequential or mapreduce&gt;
-</code></pre></div></div>
+</code></pre>
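
<p>For example (the paths and parameter values are illustrative assumptions; 
see the note below on how -k interacts with the -c directory):</p>

<pre><code>bin/mahout kmeans \
    -i ${WORK_DIR}/vectors \
    -c ${WORK_DIR}/initial-clusters \
    -o ${WORK_DIR}/kmeans-output \
    -dm org.apache.mahout.common.distance.EuclideanDistanceMeasure \
    -k 10 \
    -x 20 \
    -ow \
    -cl \
    -xm mapreduce
</code></pre>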
 
 <p>Note: if the -k argument is supplied, any clusters in the -c directory
 will be overwritten and -k random points will be sampled from the input

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/clustering/k-means-commandline.html
----------------------------------------------------------------------
diff --git a/users/clustering/k-means-commandline.html 
b/users/clustering/k-means-commandline.html
index 318b847..cf7de7a 100644
--- a/users/clustering/k-means-commandline.html
+++ b/users/clustering/k-means-commandline.html
@@ -289,8 +289,8 @@ an operating Hadoop cluster on the target machine then the 
invocation will
 run k-Means on that cluster. If either of the environment variables is
 missing, the stand-alone Hadoop configuration will be invoked instead.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>./bin/mahout kmeans &lt;OPTIONS&gt;
-</code></pre></div></div>
+<pre><code>./bin/mahout kmeans &lt;OPTIONS&gt;
+</code></pre>
 
 <p>In $MAHOUT_HOME/, build the jar containing the job (mvn install). The job
 will be generated in $MAHOUT_HOME/core/target/ and its name will contain
@@ -331,7 +331,7 @@ to view all outputs.</li>
 <p><a name="k-means-commandline-Commandlineoptions"></a></p>
 <h1 id="command-line-options">Command line options</h1>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>  --input (-i) input                               Path 
to job input directory. 
+<pre><code>  --input (-i) input                               Path to job 
input directory. 
                                               Must be a SequenceFile of    
                                               VectorWritable               
   --clusters (-c) clusters                    The input centroids, as Vectors. 
@@ -362,7 +362,7 @@ to view all outputs.</li>
   --help (-h)                                 Print out help               
   --clustering (-cl)                          If present, run clustering after 
                                               the iterations have taken place  
-</code></pre></div></div>
+</code></pre>
 
 
    </div>

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/clustering/latent-dirichlet-allocation.html
----------------------------------------------------------------------
diff --git a/users/clustering/latent-dirichlet-allocation.html 
b/users/clustering/latent-dirichlet-allocation.html
index 78a8e4f..e857424 100644
--- a/users/clustering/latent-dirichlet-allocation.html
+++ b/users/clustering/latent-dirichlet-allocation.html
@@ -343,7 +343,7 @@ vectors, it’s recommended that you follow the 
instructions in <a href="../basi
 
 <p>Invocation takes the form:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>bin/mahout cvb \
+<pre><code>bin/mahout cvb \
     -i &lt;input path for document vectors&gt; \
    -dict &lt;path to term-dictionary file(s), glob expression supported&gt; \
    -o &lt;output path for topic-term distributions&gt; \
@@ -358,7 +358,7 @@ vectors, it’s recommended that you follow the 
instructions in <a href="../basi
     -seed &lt;random seed&gt; \
     -tf &lt;fraction of data to hold for testing&gt; \
    -block &lt;number of iterations per perplexity check, ignored unless 
test_set_percentage&gt;0&gt;
-</code></pre></div></div>
+</code></pre>
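
<p>As a hedged sketch, a typical run over the term-frequency vectors produced 
by seq2sparse might look like the following; all paths are illustrative 
assumptions:</p>

<pre><code>bin/mahout cvb \
    -i ${WORK_DIR}/20news-vectors/tf-vectors \
    -dict ${WORK_DIR}/20news-vectors/dictionary.file-0 \
    -o ${WORK_DIR}/lda-topics \
    -dt ${WORK_DIR}/lda-doc-topics \
    -k 20 \
    -x 20
</code></pre>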
 
 <p>Topic smoothing should generally be about 50/K, where K is the number of
 topics. The number of words in the vocabulary can be an upper bound, though
@@ -370,14 +370,14 @@ recommended that you try several values.</p>
 <p>After running LDA you can obtain an output of the computed topics using the
 LDAPrintTopics utility:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>bin/mahout ldatopics \
+<pre><code>bin/mahout ldatopics \
     -i &lt;input vectors directory&gt; \
     -d &lt;input dictionary file&gt; \
     -w &lt;optional number of words to print&gt; \
     -o &lt;optional output working directory. Default is to console&gt; \
     -h &lt;print out help&gt; \
     -dt &lt;optional dictionary type (text|sequencefile). Default is text&gt;
-</code></pre></div></div>
+</code></pre>
 
 <p><a name="LatentDirichletAllocation-Example"></a></p>
 <h1 id="example">Example</h1>

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/clustering/lda-commandline.html
----------------------------------------------------------------------
diff --git a/users/clustering/lda-commandline.html 
b/users/clustering/lda-commandline.html
index d3f4c67..729c061 100644
--- a/users/clustering/lda-commandline.html
+++ b/users/clustering/lda-commandline.html
@@ -285,8 +285,8 @@ Hadoop cluster on the target machine then the invocation 
will run the LDA
 algorithm on that cluster. If either of the environment variables is
 missing, the stand-alone Hadoop configuration will be invoked instead.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>./bin/mahout cvb &lt;OPTIONS&gt;
-</code></pre></div></div>
+<pre><code>./bin/mahout cvb &lt;OPTIONS&gt;
+</code></pre>
 
 <ul>
  <li>In $MAHOUT_HOME/, build the jar containing the job (mvn install). The job
@@ -327,7 +327,7 @@ to view all outputs.</li>
 <p><a name="lda-commandline-CommandlineoptionsfromMahoutcvbversion0.8"></a></p>
 <h1 id="command-line-options-from-mahout-cvb-version-08">Command line options 
from Mahout cvb version 0.8</h1>
 
-<div class="highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>mahout cvb -h 
+<pre><code>mahout cvb -h 
   --input (-i) input                                     Path to job input 
directory.        
   --output (-o) output                                   The directory 
pathname for output.  
   --maxIter (-x) maxIter                                 The maximum number of 
iterations.             
@@ -352,7 +352,7 @@ to view all outputs.</li>
   --tempDir tempDir                                      Intermediate output 
directory      
   --startPhase startPhase                                First phase to run    
   --endPhase endPhase                                    Last phase to run
-</code></pre></div></div>
+</code></pre>
 
 
    </div>
