Repository: mahout Updated Branches: refs/heads/asf-site f4961201e -> d9686c8ba
http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/environment/out-of-core-reference.html ---------------------------------------------------------------------- diff --git a/users/environment/out-of-core-reference.html b/users/environment/out-of-core-reference.html index 8db77d0..aed6f88 100644 --- a/users/environment/out-of-core-reference.html +++ b/users/environment/out-of-core-reference.html @@ -278,29 +278,29 @@ <p>The subjects of this reference are solely applicable to Mahout-Samsaraâs <strong>DRM</strong> (distributed row matrix).</p> -<p>In this reference, DRMs will be denoted as e.g. <code class="highlighter-rouge">A</code>, and in-core matrices as e.g. <code class="highlighter-rouge">inCoreA</code>.</p> +<p>In this reference, DRMs will be denoted as e.g. <code>A</code>, and in-core matrices as e.g. <code>inCoreA</code>.</p> <h4 id="imports">Imports</h4> <p>The following imports are used to enable seamless in-core and distributed algebraic DSL operations:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import org.apache.mahout.math._ +<pre><code>import org.apache.mahout.math._ import scalabindings._ import RLikeOps._ import drm._ import RLikeDRMOps._ -</code></pre></div></div> +</code></pre> <p>If working with mixed scala/java code:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import collection._ +<pre><code>import collection._ import JavaConversions._ -</code></pre></div></div> +</code></pre> <p>If you are working with Mahout-Samsaraâs Spark-specific operations e.g. for context creation:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import org.apache.mahout.sparkbindings._ -</code></pre></div></div> +<pre><code>import org.apache.mahout.sparkbindings._ +</code></pre> <p>The Mahout shell does all of these imports automatically.</p> @@ -310,38 +310,38 @@ import JavaConversions._ <p>Loading a DRM from (HD)FS:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>drmDfsRead(path = hdfsPath) -</code></pre></div></div> +<pre><code>drmDfsRead(path = hdfsPath) +</code></pre> <p>Parallelizing from an in-core matrix:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>val inCoreA = (dense(1, 2, 3), (3, 4, 5)) +<pre><code>val inCoreA = (dense(1, 2, 3), (3, 4, 5)) val A = drmParallelize(inCoreA) -</code></pre></div></div> +</code></pre> <p>Creating an empty DRM:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>val A = drmParallelizeEmpty(100, 50) -</code></pre></div></div> +<pre><code>val A = drmParallelizeEmpty(100, 50) +</code></pre> <p>Collecting to driverâs jvm in-core:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>val inCoreA = A.collect -</code></pre></div></div> +<pre><code>val inCoreA = A.collect +</code></pre> <p><strong>Warning: The collection of distributed matrices happens implicitly whenever conversion to an in-core (o.a.m.math.Matrix) type is required. E.g.:</strong></p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>val inCoreA: Matrix = ... +<pre><code>val inCoreA: Matrix = ... val drmB: DrmLike[Int] =... 
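// declaring the result as an in-core Matrix below forces an implicit collect of the distributed operand to the driver JVM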
val inCoreC: Matrix = inCoreA %*%: drmB -</code></pre></div></div> +</code></pre> <p><strong>implies (incoreA %*%: drmB).collect</strong></p> <p>Collecting to (HD)FS as a Mahoutâs DRM formatted file:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>A.dfsWrite(path = hdfsPath) -</code></pre></div></div> +<pre><code>A.dfsWrite(path = hdfsPath) +</code></pre> <h4 id="logical-algebraic-operators-on-drm-matrices">Logical algebraic operators on DRM matrices:</h4> @@ -350,12 +350,12 @@ val inCoreC: Matrix = inCoreA %*%: drmB <p>Cache a DRM and trigger an optimized physical plan:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>drmA.checkpoint(CacheHint.MEMORY_AND_DISK) -</code></pre></div></div> +<pre><code>drmA.checkpoint(CacheHint.MEMORY_AND_DISK) +</code></pre> <p>Other valid caching Instructions:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>drmA.checkpoint(CacheHint.NONE) +<pre><code>drmA.checkpoint(CacheHint.NONE) drmA.checkpoint(CacheHint.DISK_ONLY) drmA.checkpoint(CacheHint.DISK_ONLY_2) drmA.checkpoint(CacheHint.MEMORY_ONLY) @@ -365,38 +365,38 @@ drmA.checkpoint(CacheHint.MEMORY_ONLY_SER_2) drmA.checkpoint(CacheHint.MEMORY_AND_DISK_2) drmA.checkpoint(CacheHint.MEMORY_AND_DISK_SER) drmA.checkpoint(CacheHint.MEMORY_AND_DISK_SER_2) -</code></pre></div></div> +</code></pre> <p><em>Note: Logical DRM operations are lazily computed. Currently the actual computations and optional caching will be triggered by dfsWrite(â¦), collect(â¦) and blockify(â¦).</em></p> <p>Transposition:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>A.t -</code></pre></div></div> +<pre><code>A.t +</code></pre> <p>Elementwise addition <em>(Matrices of identical geometry and row key types)</em>:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>A + B -</code></pre></div></div> +<pre><code>A + B +</code></pre> <p>Elementwise subtraction <em>(Matrices of identical geometry and row key types)</em>:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>A - B -</code></pre></div></div> +<pre><code>A - B +</code></pre> <p>Elementwise multiplication (Hadamard) <em>(Matrices of identical geometry and row key types)</em>:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>A * B -</code></pre></div></div> +<pre><code>A * B +</code></pre> <p>Elementwise division <em>(Matrices of identical geometry and row key types)</em>:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>A / B -</code></pre></div></div> +<pre><code>A / B +</code></pre> <p><strong>Elementwise operations involving one in-core argument (int-keyed DRMs only)</strong>:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>A + inCoreB +<pre><code>A + inCoreB A - inCoreB A * inCoreB A / inCoreB @@ -408,73 +408,73 @@ inCoreA +: B inCoreA -: B inCoreA *: B inCoreA /: B -</code></pre></div></div> +</code></pre> -<p>Note the Spark associativity change (e.g. <code class="highlighter-rouge">A *: inCoreB</code> means <code class="highlighter-rouge">B.leftMultiply(A</code>), same as when both arguments are in core). 
Whenever operator arguments include both in-core and out-of-core arguments, the operator can only be associated with the out-of-core (DRM) argument to support the distributed implementation.</p> +<p>Note the Spark associativity change (e.g. <code>A *: inCoreB</code> means <code>B.leftMultiply(A</code>), same as when both arguments are in core). Whenever operator arguments include both in-core and out-of-core arguments, the operator can only be associated with the out-of-core (DRM) argument to support the distributed implementation.</p> <p><strong>Matrix-matrix multiplication %*%</strong>:</p> -<p><code class="highlighter-rouge">\(\mathbf{M}=\mathbf{AB}\)</code></p> +<p><code>\(\mathbf{M}=\mathbf{AB}\)</code></p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>A %*% B +<pre><code>A %*% B A %*% inCoreB A %*% inCoreDiagonal A %*%: B -</code></pre></div></div> +</code></pre> <p><em>Note: same as above, whenever operator arguments include both in-core and out-of-core arguments, the operator can only be associated with the out-of-core (DRM) argument to support the distributed implementation.</em></p> <p><strong>Matrix-vector multiplication %*%</strong> -Currently we support a right multiply product of a DRM and an in-core Vector(<code class="highlighter-rouge">\(\mathbf{Ax}\)</code>) resulting in a single column DRM, which then can be collected in front (usually the desired outcome):</p> +Currently we support a right multiply product of a DRM and an in-core Vector(<code>\(\mathbf{Ax}\)</code>) resulting in a single column DRM, which then can be collected in front (usually the desired outcome):</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>val Ax = A %*% x +<pre><code>val Ax = A %*% x val inCoreX = Ax.collect(::, 0) -</code></pre></div></div> +</code></pre> <p><strong>Matrix-scalar +,-,*,/</strong> Elementwise operations of every matrix element and a scalar:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>A + 5.0 +<pre><code>A + 5.0 A - 5.0 A :- 5.0 5.0 -: A A * 5.0 A / 5.0 5.0 /: a -</code></pre></div></div> +</code></pre> -<p>Note that <code class="highlighter-rouge">5.0 -: A</code> means <code class="highlighter-rouge">\(m_{ij} = 5 - a_{ij}\)</code> and <code class="highlighter-rouge">5.0 /: A</code> means <code class="highlighter-rouge">\(m_{ij} = \frac{5}{a{ij}}\)</code> for all elements of the result.</p> +<p>Note that <code>5.0 -: A</code> means <code>\(m_{ij} = 5 - a_{ij}\)</code> and <code>5.0 /: A</code> means <code>\(m_{ij} = \frac{5}{a{ij}}\)</code> for all elements of the result.</p> <h4 id="slicing">Slicing</h4> <p>General slice:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>A(100 to 200, 100 to 200) -</code></pre></div></div> +<pre><code>A(100 to 200, 100 to 200) +</code></pre> <p>Horizontal Block:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>A(::, 100 to 200) -</code></pre></div></div> +<pre><code>A(::, 100 to 200) +</code></pre> <p>Vertical Block:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>A(100 to 200, ::) -</code></pre></div></div> +<pre><code>A(100 to 200, ::) +</code></pre> -<p><em>Note: if row range is not all-range (::) the the DRM must be <code class="highlighter-rouge">Int</code>-keyed. 
General case row slicing is not supported by DRMs with key types other than <code class="highlighter-rouge">Int</code></em>.</p> +<p><em>Note: if row range is not all-range (::) the the DRM must be <code>Int</code>-keyed. General case row slicing is not supported by DRMs with key types other than <code>Int</code></em>.</p> <h4 id="stitching">Stitching</h4> <p>Stitch side by side (cbind R semantics):</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>val drmAnextToB = drmA cbind drmB -</code></pre></div></div> +<pre><code>val drmAnextToB = drmA cbind drmB +</code></pre> <p>Stitch side by side (Scala):</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>val drmAnextToB = drmA.cbind(drmB) -</code></pre></div></div> +<pre><code>val drmAnextToB = drmA.cbind(drmB) +</code></pre> <p>Analogously, vertical concatenation is available via <strong>rbind</strong></p> @@ -483,126 +483,126 @@ A / 5.0 <p><strong>drm.mapBlock(â¦)</strong>:</p> -<p>The DRM operator <code class="highlighter-rouge">mapBlock</code> provides transformational access to the distributed vertical blockified tuples of a matrix (Row-Keys, Vertical-Matrix-Block).</p> +<p>The DRM operator <code>mapBlock</code> provides transformational access to the distributed vertical blockified tuples of a matrix (Row-Keys, Vertical-Matrix-Block).</p> -<p>Using <code class="highlighter-rouge">mapBlock</code> to add 1.0 to a DRM:</p> +<p>Using <code>mapBlock</code> to add 1.0 to a DRM:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>val inCoreA = dense((1, 2, 3), (2, 3 , 4), (3, 4, 5)) +<pre><code>val inCoreA = dense((1, 2, 3), (2, 3 , 4), (3, 4, 5)) val drmA = drmParallelize(inCoreA) val B = A.mapBlock() { case (keys, block) => keys -> (block += 1.0) } -</code></pre></div></div> +</code></pre> <h4 id="broadcasting-vectors-and-matrices-to-closures">Broadcasting Vectors and matrices to closures</h4> <p>Generally we can create and use one-way closure attributes to be used on the back end.</p> <p>Scalar matrix multiplication:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>val factor: Int = 15 +<pre><code>val factor: Int = 15 val drm2 = drm1.mapBlock() { case (keys, block) => block *= factor keys -> block } -</code></pre></div></div> +</code></pre> -<p><strong>Closure attributes must be java-serializable. Currently Mahoutâs in-core Vectors and Matrices are not java-serializable, and must be broadcast to the closure using <code class="highlighter-rouge">drmBroadcast(...)</code></strong>:</p> +<p><strong>Closure attributes must be java-serializable. Currently Mahoutâs in-core Vectors and Matrices are not java-serializable, and must be broadcast to the closure using <code>drmBroadcast(...)</code></strong>:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>val v: Vector ... +<pre><code>val v: Vector ... 
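// in-core Vectors and Matrices are not Java-serializable, so hand them to worker closures via drmBroadcast(...) instead of capturing them directly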
val bcastV = drmBroadcast(v) val drm2 = drm1.mapBlock() { case (keys, block) => for(row <- 0 until block.nrow) block(row, ::) -= bcastV keys -> block } -</code></pre></div></div> +</code></pre> <h4 id="computations-providing-ad-hoc-summaries">Computations providing ad-hoc summaries</h4> <p>Matrix cardinality:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>drmA.nrow +<pre><code>drmA.nrow drmA.ncol -</code></pre></div></div> +</code></pre> -<p><em>Note: depending on the stage of optimization, these may trigger a computational action. I.e. if one calls <code class="highlighter-rouge">nrow()</code> n times, then the back end will actually recompute <code class="highlighter-rouge">nrow</code> n times.</em></p> +<p><em>Note: depending on the stage of optimization, these may trigger a computational action. I.e. if one calls <code>nrow()</code> n times, then the back end will actually recompute <code>nrow</code> n times.</em></p> <p>Means and sums:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>drmA.colSums +<pre><code>drmA.colSums drmA.colMeans drmA.rowSums drmA.rowMeans -</code></pre></div></div> +</code></pre> -<p><em>Note: These will always trigger a computational action. I.e. if one calls <code class="highlighter-rouge">colSums()</code> n times, then the back end will actually recompute <code class="highlighter-rouge">colSums</code> n times.</em></p> +<p><em>Note: These will always trigger a computational action. I.e. if one calls <code>colSums()</code> n times, then the back end will actually recompute <code>colSums</code> n times.</em></p> <h4 id="distributed-matrix-decompositions">Distributed Matrix Decompositions</h4> <p>To import the decomposition package:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import org.apache.mahout.math._ +<pre><code>import org.apache.mahout.math._ import decompositions._ -</code></pre></div></div> +</code></pre> <p>Distributed thin QR:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>val (drmQ, incoreR) = dqrThin(drmA) -</code></pre></div></div> +<pre><code>val (drmQ, incoreR) = dqrThin(drmA) +</code></pre> <p>Distributed SSVD:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>val (drmU, drmV, s) = dssvd(drmA, k = 40, q = 1) -</code></pre></div></div> +<pre><code>val (drmU, drmV, s) = dssvd(drmA, k = 40, q = 1) +</code></pre> <p>Distributed SPCA:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>val (drmU, drmV, s) = dspca(drmA, k = 30, q = 1) -</code></pre></div></div> +<pre><code>val (drmU, drmV, s) = dspca(drmA, k = 30, q = 1) +</code></pre> <p>Distributed regularized ALS:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>val (drmU, drmV, i) = dals(drmA, +<pre><code>val (drmU, drmV, i) = dals(drmA, k = 50, lambda = 0.0, maxIterations = 10, convergenceThreshold = 0.10)) -</code></pre></div></div> +</code></pre> <h4 id="adjusting-parallelism-of-computations">Adjusting parallelism of computations</h4> -<p>Set the minimum parallelism to 100 for computations on <code class="highlighter-rouge">drmA</code>:</p> +<p>Set the minimum parallelism to 100 for computations on <code>drmA</code>:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>drmA.par(min = 100) -</code></pre></div></div> +<pre><code>drmA.par(min = 100) +</code></pre> -<p>Set the exact 
parallelism to 100 for computations on <code class="highlighter-rouge">drmA</code>:</p> +<p>Set the exact parallelism to 100 for computations on <code>drmA</code>:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>drmA.par(exact = 100) -</code></pre></div></div> +<pre><code>drmA.par(exact = 100) +</code></pre> -<p>Set the engine specific automatic parallelism adjustment for computations on <code class="highlighter-rouge">drmA</code>:</p> +<p>Set the engine specific automatic parallelism adjustment for computations on <code>drmA</code>:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>drmA.par(auto = true) -</code></pre></div></div> +<pre><code>drmA.par(auto = true) +</code></pre> <h4 id="retrieving-the-engine-specific-data-structure-backing-the-drm">Retrieving the engine specific data structure backing the DRM:</h4> <p><strong>A Spark RDD:</strong></p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>val myRDD = drmA.checkpoint().rdd -</code></pre></div></div> +<pre><code>val myRDD = drmA.checkpoint().rdd +</code></pre> <p><strong>An H2O Frame and Key Vec:</strong></p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>val myFrame = drmA.frame +<pre><code>val myFrame = drmA.frame val myKeys = drmA.keys -</code></pre></div></div> +</code></pre> <p><strong>A Flink DataSet:</strong></p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>val myDataSet = drmA.ds -</code></pre></div></div> +<pre><code>val myDataSet = drmA.ds +</code></pre> <p>For more information including information on Mahout-Samsaraâs Algebraic Optimizer and in-core Linear algebra bindings see: <a href="http://mahout.apache.org/users/sparkbindings/ScalaSparkBindings.pdf">Mahout Scala Bindings and Mahout Spark Bindings for Linear Algebra Subroutines</a></p> http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/flinkbindings/playing-with-samsara-flink.html ---------------------------------------------------------------------- diff --git a/users/flinkbindings/playing-with-samsara-flink.html b/users/flinkbindings/playing-with-samsara-flink.html index 8e45d9e..000d052 100644 --- a/users/flinkbindings/playing-with-samsara-flink.html +++ b/users/flinkbindings/playing-with-samsara-flink.html @@ -276,16 +276,16 @@ <p>To get started, add the following dependency to the pom:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code><dependency> +<pre><code><dependency> <groupId>org.apache.mahout</groupId> <artifactId>mahout-flink_2.10</artifactId> <version>0.12.0</version> </dependency> -</code></pre></div></div> +</code></pre> <p>Here is how to use the Flink backend:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import org.apache.flink.api.scala._ +<pre><code>import org.apache.flink.api.scala._ import org.apache.mahout.math.drm._ import org.apache.mahout.math.drm.RLikeDrmOps._ import org.apache.mahout.flinkbindings._ @@ -304,7 +304,7 @@ object ReadCsvExample { } } -</code></pre></div></div> +</code></pre> <h2 id="current-status">Current Status</h2> @@ -314,18 +314,18 @@ object ReadCsvExample { <ul> <li><a href="https://issues.apache.org/jira/browse/MAHOUT-1701">MAHOUT-1701</a> Mahout DSL for Flink: implement AtB ABt and AtA operators</li> - <li><a href="https://issues.apache.org/jira/browse/MAHOUT-1702">MAHOUT-1702</a> implement element-wise operators (like <code 
class="highlighter-rouge">A + 2</code> or <code class="highlighter-rouge">A + B</code>)</li> - <li><a href="https://issues.apache.org/jira/browse/MAHOUT-1703">MAHOUT-1703</a> implement <code class="highlighter-rouge">cbind</code> and <code class="highlighter-rouge">rbind</code></li> - <li><a href="https://issues.apache.org/jira/browse/MAHOUT-1709">MAHOUT-1709</a> implement slicing (like <code class="highlighter-rouge">A(1 to 10, ::)</code>)</li> - <li><a href="https://issues.apache.org/jira/browse/MAHOUT-1710">MAHOUT-1710</a> implement right in-core matrix multiplication (<code class="highlighter-rouge">A %*% B</code> when <code class="highlighter-rouge">B</code> is in-core)</li> + <li><a href="https://issues.apache.org/jira/browse/MAHOUT-1702">MAHOUT-1702</a> implement element-wise operators (like <code>A + 2</code> or <code>A + B</code>)</li> + <li><a href="https://issues.apache.org/jira/browse/MAHOUT-1703">MAHOUT-1703</a> implement <code>cbind</code> and <code>rbind</code></li> + <li><a href="https://issues.apache.org/jira/browse/MAHOUT-1709">MAHOUT-1709</a> implement slicing (like <code>A(1 to 10, ::)</code>)</li> + <li><a href="https://issues.apache.org/jira/browse/MAHOUT-1710">MAHOUT-1710</a> implement right in-core matrix multiplication (<code>A %*% B</code> when <code>B</code> is in-core)</li> <li><a href="https://issues.apache.org/jira/browse/MAHOUT-1711">MAHOUT-1711</a> implement broadcasting</li> - <li><a href="https://issues.apache.org/jira/browse/MAHOUT-1712">MAHOUT-1712</a> implement operators <code class="highlighter-rouge">At</code>, <code class="highlighter-rouge">Ax</code>, <code class="highlighter-rouge">Atx</code> - <code class="highlighter-rouge">Ax</code> and <code class="highlighter-rouge">At</code> are implemented</li> + <li><a href="https://issues.apache.org/jira/browse/MAHOUT-1712">MAHOUT-1712</a> implement operators <code>At</code>, <code>Ax</code>, <code>Atx</code> - <code>Ax</code> and <code>At</code> are implemented</li> <li><a href="https://issues.apache.org/jira/browse/MAHOUT-1734">MAHOUT-1734</a> implement I/O - should be able to read results of Flink bindings</li> - <li><a href="https://issues.apache.org/jira/browse/MAHOUT-1747">MAHOUT-1747</a> add support for different types of indexes (String, long, etc) - now supports <code class="highlighter-rouge">Int</code>, <code class="highlighter-rouge">Long</code> and <code class="highlighter-rouge">String</code></li> + <li><a href="https://issues.apache.org/jira/browse/MAHOUT-1747">MAHOUT-1747</a> add support for different types of indexes (String, long, etc) - now supports <code>Int</code>, <code>Long</code> and <code>String</code></li> <li><a href="https://issues.apache.org/jira/browse/MAHOUT-1748">MAHOUT-1748</a> switch to Flink Scala API</li> - <li><a href="https://issues.apache.org/jira/browse/MAHOUT-1749">MAHOUT-1749</a> Implement <code class="highlighter-rouge">Atx</code></li> - <li><a href="https://issues.apache.org/jira/browse/MAHOUT-1750">MAHOUT-1750</a> Implement <code class="highlighter-rouge">ABt</code></li> - <li><a href="https://issues.apache.org/jira/browse/MAHOUT-1751">MAHOUT-1751</a> Implement <code class="highlighter-rouge">AtA</code></li> + <li><a href="https://issues.apache.org/jira/browse/MAHOUT-1749">MAHOUT-1749</a> Implement <code>Atx</code></li> + <li><a href="https://issues.apache.org/jira/browse/MAHOUT-1750">MAHOUT-1750</a> Implement <code>ABt</code></li> + <li><a href="https://issues.apache.org/jira/browse/MAHOUT-1751">MAHOUT-1751</a> Implement <code>AtA</code></li> <li><a 
href="https://issues.apache.org/jira/browse/MAHOUT-1755">MAHOUT-1755</a> Flush intermediate results to FS - Flink, unlike Spark, does not store intermediate results in memory.</li> <li><a href="https://issues.apache.org/jira/browse/MAHOUT-1764">MAHOUT-1764</a> Add standard backend tests for Flink</li> <li><a href="https://issues.apache.org/jira/browse/MAHOUT-1765">MAHOUT-1765</a> Add documentation about Flink backend</li> @@ -355,19 +355,19 @@ object ReadCsvExample { <p>There is a set of standard tests that all engines should pass (see <a href="https://issues.apache.org/jira/browse/MAHOUT-1764">MAHOUT-1764</a>).</p> <ul> - <li><code class="highlighter-rouge">DistributedDecompositionsSuite</code></li> - <li><code class="highlighter-rouge">DrmLikeOpsSuite</code></li> - <li><code class="highlighter-rouge">DrmLikeSuite</code></li> - <li><code class="highlighter-rouge">RLikeDrmOpsSuite</code></li> + <li><code>DistributedDecompositionsSuite</code></li> + <li><code>DrmLikeOpsSuite</code></li> + <li><code>DrmLikeSuite</code></li> + <li><code>RLikeDrmOpsSuite</code></li> </ul> <p>These are Flink-backend specific tests, e.g.</p> <ul> - <li><code class="highlighter-rouge">DrmLikeOpsSuite</code> for operations like <code class="highlighter-rouge">norm</code>, <code class="highlighter-rouge">rowSums</code>, <code class="highlighter-rouge">rowMeans</code></li> - <li><code class="highlighter-rouge">RLikeOpsSuite</code> for basic LA like <code class="highlighter-rouge">A.t %*% A</code>, <code class="highlighter-rouge">A.t %*% x</code>, etc</li> - <li><code class="highlighter-rouge">LATestSuite</code> tests for specific operators like <code class="highlighter-rouge">AtB</code>, <code class="highlighter-rouge">Ax</code>, etc</li> - <li><code class="highlighter-rouge">UseCasesSuite</code> has more complex examples, like power iteration, ridge regression, etc</li> + <li><code>DrmLikeOpsSuite</code> for operations like <code>norm</code>, <code>rowSums</code>, <code>rowMeans</code></li> + <li><code>RLikeOpsSuite</code> for basic LA like <code>A.t %*% A</code>, <code>A.t %*% x</code>, etc</li> + <li><code>LATestSuite</code> tests for specific operators like <code>AtB</code>, <code>Ax</code>, etc</li> + <li><code>UseCasesSuite</code> has more complex examples, like power iteration, ridge regression, etc</li> </ul> <h2 id="environment">Environment</h2> @@ -382,9 +382,9 @@ object ReadCsvExample { <p>When using mahout, please import the following modules:</p> <ul> - <li><code class="highlighter-rouge">mahout-math</code></li> - <li><code class="highlighter-rouge">mahout-math-scala</code></li> - <li><code class="highlighter-rouge">mahout-flink_2.10</code> + <li><code>mahout-math</code></li> + <li><code>mahout-math-scala</code></li> + <li><code>mahout-flink_2.10</code> *</li> </ul> http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/misc/parallel-frequent-pattern-mining.html ---------------------------------------------------------------------- diff --git a/users/misc/parallel-frequent-pattern-mining.html b/users/misc/parallel-frequent-pattern-mining.html index c02586a..1ec3df8 100644 --- a/users/misc/parallel-frequent-pattern-mining.html +++ b/users/misc/parallel-frequent-pattern-mining.html @@ -286,7 +286,7 @@ creating Iterators, Convertors and TopKPatternWritable for that particular object. 
For more information please refer the package org.apache.mahout.fpm.pfpgrowth.convertors.string</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>e.g: +<pre><code>e.g: FPGrowth<String> fp = new FPGrowth<String>(); Set<String> features = new HashSet<String>(); fp.generateTopKStringFrequentPatterns( @@ -298,7 +298,7 @@ org.apache.mahout.fpm.pfpgrowth.convertors.string</p> features, new StringOutputConvertor(new SequenceFileOutputCollector<Text, TopKStringPatterns>(writer)) ); -</code></pre></div></div> +</code></pre> <ul> <li>The first argument is the iterator of transaction in this case its http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/misc/using-mahout-with-python-via-jpype.html ---------------------------------------------------------------------- diff --git a/users/misc/using-mahout-with-python-via-jpype.html b/users/misc/using-mahout-with-python-via-jpype.html index 5290f63..d9e72c6 100644 --- a/users/misc/using-mahout-with-python-via-jpype.html +++ b/users/misc/using-mahout-with-python-via-jpype.html @@ -305,14 +305,14 @@ python script. The result for me looks like the following</p> <p>Now we can create a function to start the jvm in python using jype</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>jvm=None +<pre><code>jvm=None def start_jpype(): global jvm if (jvm is None): cpopt="-Djava.class.path={cp}".format(cp=classpath) startJVM(jvmlib,"-ea",cpopt) jvm="started" -</code></pre></div></div> +</code></pre> <p><a name="UsingMahoutwithPythonviaJPype-WritingNamedVectorstoSequenceFilesfromPython"></a></p> <h1 id="writing-named-vectors-to-sequence-files-from-python">Writing Named Vectors to Sequence Files from Python</h1> @@ -320,7 +320,7 @@ jvm="started" be used by Mahout for kmeans. The example below is a function which creates vectors from two Gaussian distributions with unit variance.</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def create_inputs(ifile,*args,**param): +<pre><code>def create_inputs(ifile,*args,**param): """Create a sequence file containing some normally distributed ifile - path to the sequence file to create """ @@ -383,14 +383,14 @@ vectors from two Gaussian distributions with unit variance.</p> writer.append(wrapkey,wrapval) writer.close() -</code></pre></div></div> +</code></pre> <p><a name="UsingMahoutwithPythonviaJPype-ReadingtheKMeansClusteredPointsfromPython"></a></p> <h1 id="reading-the-kmeans-clustered-points-from-python">Reading the KMeans Clustered Points from Python</h1> <p>Similarly we can use JPype to easily read the clustered points outputted by mahout.</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def read_clustered_pts(ifile,*args,**param): +<pre><code>def read_clustered_pts(ifile,*args,**param): """Read the clustered points ifile - path to the sequence file containing the clustered points """ @@ -433,14 +433,14 @@ mahout.</p> print "cluster={key} Name={name} x={x}y={y}".format(key=key.toString(),name=nvec.getName(),x=nvec.get(0),y=nvec.get(1)) else: raise NotImplementedError("Vector isn't a NamedVector. 
Need tomodify/test the code to handle this case.") -</code></pre></div></div> +</code></pre> <p><a name="UsingMahoutwithPythonviaJPype-ReadingtheKMeansCentroids"></a></p> <h1 id="reading-the-kmeans-centroids">Reading the KMeans Centroids</h1> <p>Finally we can create a function to print out the actual cluster centers found by mahout,</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def getClusters(ifile,*args,**param): +<pre><code>def getClusters(ifile,*args,**param): """Read the centroids from the clusters outputted by kmenas ifile - Path to the sequence file containing the centroids """ @@ -479,7 +479,7 @@ found by mahout,</p> print "id={cid}center={center}".format(cid=vecwritable.getId(),center=center.values) pass -</code></pre></div></div> +</code></pre> </div> http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/recommender/intro-als-hadoop.html ---------------------------------------------------------------------- diff --git a/users/recommender/intro-als-hadoop.html b/users/recommender/intro-als-hadoop.html index e2920c2..db3700a 100644 --- a/users/recommender/intro-als-hadoop.html +++ b/users/recommender/intro-als-hadoop.html @@ -350,8 +350,8 @@ for item IDs. Then after recommendations are calculated you will have to transla <p>Assuming your <em>JAVA_HOME</em> is appropriately set and Mahout was installed properly weâre ready to configure our syntax. Enter the following command:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ mahout parallelALS --input $als_input --output $als_output --lambda 0.1 --implicitFeedback true --alpha 0.8 --numFeatures 2 --numIterations 5 --numThreadsPerSolver 1 --tempDir tmp -</code></pre></div></div> +<pre><code>$ mahout parallelALS --input $als_input --output $als_output --lambda 0.1 --implicitFeedback true --alpha 0.8 --numFeatures 2 --numIterations 5 --numThreadsPerSolver 1 --tempDir tmp +</code></pre> <p>Running the command will execute a series of jobs the final product of which will be an output file deposited to the output directory specified in the command syntax. The output directory contains 3 sub-directories: <em>M</em> stores the item to feature matrix, <em>U</em> stores the user to feature matrix and userRatings stores the userâs ratings on the items. The <em>tempDir</em> parameter specifies the directory to store the intermediate output of the job, such as the matrix output in each iteration and each itemâs average rating. Using the <em>tempDir</em> will help on debugging.</p> @@ -359,8 +359,8 @@ for item IDs. Then after recommendations are calculated you will have to transla <p>Based on the output feature matrices from step 3, we could make recommendations for users. Enter the following command:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> $ mahout recommendfactorized --input $als_recommender_input --userFeatures $als_output/U/ --itemFeatures $als_output/M/ --numRecommendations 1 --output recommendations --maxRating 1 -</code></pre></div></div> +<pre><code> $ mahout recommendfactorized --input $als_recommender_input --userFeatures $als_output/U/ --itemFeatures $als_output/M/ --numRecommendations 1 --output recommendations --maxRating 1 +</code></pre> <p>The input user file is a sequence file, the sequence record key is user id and value is the userâs rated item ids which will be removed from recommendation. 
The output file generated in our simple example will be a text file giving the recommended item ids for each user. Remember to translate the Mahout ids back into your application specific ids.</p> http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/recommender/intro-cooccurrence-spark.html ---------------------------------------------------------------------- diff --git a/users/recommender/intro-cooccurrence-spark.html b/users/recommender/intro-cooccurrence-spark.html index 6aaccd5..fbeb1cd 100644 --- a/users/recommender/intro-cooccurrence-spark.html +++ b/users/recommender/intro-cooccurrence-spark.html @@ -304,7 +304,7 @@ For instance they might say an item-view is 0.2 of an item purchase. In practice cross-cooccurrence is a more principled way to handle this case. In effect it scrubs secondary actions with the action you want to recommend.</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>spark-itemsimilarity Mahout 1.0 +<pre><code>spark-itemsimilarity Mahout 1.0 Usage: spark-itemsimilarity [options] Disconnected from the target VM, address: '127.0.0.1:64676', transport: 'socket' @@ -368,7 +368,7 @@ Spark config options: -h | --help prints this usage text -</code></pre></div></div> +</code></pre> <p>This looks daunting but defaults to simple fairly sane values to take exactly the same input as legacy code and is pretty flexible. It allows the user to point to a single text file, a directory full of files, or a tree of directories to be traversed recursively. The files included can be specified with either a regex-style pattern or filename. The schema for the file is defined by column numbers, which map to the important bits of data including IDs and values. The files can even contain filters, which allow unneeded rows to be discarded or used for cross-cooccurrence calculations.</p> @@ -378,20 +378,20 @@ Spark config options: <p>If all defaults are used the input can be as simple as:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>userID1,itemID1 +<pre><code>userID1,itemID1 userID2,itemID2 ... -</code></pre></div></div> +</code></pre> <p>With the command line:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bash$ mahout spark-itemsimilarity --input in-file --output out-dir -</code></pre></div></div> +<pre><code>bash$ mahout spark-itemsimilarity --input in-file --output out-dir +</code></pre> <p>This will use the âlocalâ Spark context and will output the standard text version of a DRM</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>itemID1<tab>itemID2:value2<space>itemID10:value10... -</code></pre></div></div> +<pre><code>itemID1<tab>itemID2:value2<space>itemID10:value10... +</code></pre> <p>###<a name="multiple-actions">How To Use Multiple User Actions</a></p> @@ -407,7 +407,7 @@ to calculate the cross-cooccurrence indicator matrix.</p> <p><em>spark-itemsimilarity</em> can read separate actions from separate files or from a mixed action log by filtering certain lines. 
For a mixed action log of the form:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>u1,purchase,iphone +<pre><code>u1,purchase,iphone u1,purchase,ipad u2,purchase,nexus u2,purchase,galaxy @@ -427,13 +427,13 @@ u3,view,nexus u4,view,iphone u4,view,ipad u4,view,galaxy -</code></pre></div></div> +</code></pre> <p>###Command Line</p> <p>Use the following options:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bash$ mahout spark-itemsimilarity \ +<pre><code>bash$ mahout spark-itemsimilarity \ --input in-file \ # where to look for data --output out-path \ # root dir for output --master masterUrl \ # URL of the Spark master server @@ -442,35 +442,35 @@ u4,view,galaxy --itemIDPosition 2 \ # column that has the item ID --rowIDPosition 0 \ # column that has the user ID --filterPosition 1 # column that has the filter word -</code></pre></div></div> +</code></pre> <p>###Output</p> <p>The output of the job will be the standard text version of two Mahout DRMs. This is a case where we are calculating cross-cooccurrence so a primary indicator matrix and cross-cooccurrence indicator matrix will be created</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>out-path +<pre><code>out-path |-- similarity-matrix - TDF part files \-- cross-similarity-matrix - TDF part-files -</code></pre></div></div> +</code></pre> <p>The indicator matrix will contain the lines:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>galaxy\tnexus:1.7260924347106847 +<pre><code>galaxy\tnexus:1.7260924347106847 ipad\tiphone:1.7260924347106847 nexus\tgalaxy:1.7260924347106847 iphone\tipad:1.7260924347106847 surface -</code></pre></div></div> +</code></pre> <p>The cross-cooccurrence indicator matrix will contain:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>iphone\tnexus:1.7260924347106847 iphone:1.7260924347106847 ipad:1.7260924347106847 galaxy:1.7260924347106847 +<pre><code>iphone\tnexus:1.7260924347106847 iphone:1.7260924347106847 ipad:1.7260924347106847 galaxy:1.7260924347106847 ipad\tnexus:0.6795961471815897 iphone:0.6795961471815897 ipad:0.6795961471815897 galaxy:0.6795961471815897 nexus\tnexus:0.6795961471815897 iphone:0.6795961471815897 ipad:0.6795961471815897 galaxy:0.6795961471815897 galaxy\tnexus:1.7260924347106847 iphone:1.7260924347106847 ipad:1.7260924347106847 galaxy:1.7260924347106847 surface\tsurface:4.498681156950466 nexus:0.6795961471815897 -</code></pre></div></div> +</code></pre> <p><strong>Note:</strong> You can run this multiple times to use more than two actions or you can use the underlying SimilarityAnalysis.cooccurrence API, which will more efficiently calculate any number of cross-cooccurrence indicators.</p> @@ -479,7 +479,7 @@ SimilarityAnalysis.cooccurrence API, which will more efficiently calculate any n <p>A common method of storing data is in log files. If they are written using some delimiter they can be consumed directly by spark-itemsimilarity. 
For instance input of the form:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>2014-06-23 14:46:53.115\tu1\tpurchase\trandom text\tiphone +<pre><code>2014-06-23 14:46:53.115\tu1\tpurchase\trandom text\tiphone 2014-06-23 14:46:53.115\tu1\tpurchase\trandom text\tipad 2014-06-23 14:46:53.115\tu2\tpurchase\trandom text\tnexus 2014-06-23 14:46:53.115\tu2\tpurchase\trandom text\tgalaxy @@ -499,11 +499,11 @@ SimilarityAnalysis.cooccurrence API, which will more efficiently calculate any n 2014-06-23 14:46:53.115\tu4\tview\trandom text\tiphone 2014-06-23 14:46:53.115\tu4\tview\trandom text\tipad 2014-06-23 14:46:53.115\tu4\tview\trandom text\tgalaxy -</code></pre></div></div> +</code></pre> <p>Can be parsed with the following CLI and run on the cluster producing the same output as the above example.</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bash$ mahout spark-itemsimilarity \ +<pre><code>bash$ mahout spark-itemsimilarity \ --input in-file \ --output out-path \ --master spark://sparkmaster:4044 \ @@ -513,7 +513,7 @@ SimilarityAnalysis.cooccurrence API, which will more efficiently calculate any n --itemIDPosition 4 \ --rowIDPosition 1 \ --filterPosition 2 -</code></pre></div></div> +</code></pre> <p>##2. spark-rowsimilarity</p> @@ -528,7 +528,7 @@ by a list of the most similar rows.</p> <p>The command line interface is:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>spark-rowsimilarity Mahout 1.0 +<pre><code>spark-rowsimilarity Mahout 1.0 Usage: spark-rowsimilarity [options] Input, output options @@ -574,7 +574,7 @@ Spark config options: -h | --help prints this usage text -</code></pre></div></div> +</code></pre> <p>See RowSimilarityDriver.scala in Mahoutâs spark module if you want to customize the code.</p> @@ -654,32 +654,32 @@ content or metadata, not by which users interacted with them.</p> <p>For this we need input of the form:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>itemID<tab>list-of-tags +<pre><code>itemID<tab>list-of-tags ... -</code></pre></div></div> +</code></pre> <p>The full collection will look like the tags column from a catalog DB. For our ecom example it might be:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>3459860b<tab>men long-sleeve chambray clothing casual +<pre><code>3459860b<tab>men long-sleeve chambray clothing casual 9446577d<tab>women tops chambray clothing casual ... -</code></pre></div></div> +</code></pre> <p>Weâll use <em>spark-rowimilairity</em> because we are looking for similar rows, which encode items in this case. As with the collaborative filtering indicator and cross-cooccurrence indicator we use the âomitStrength option. The strengths created are probabilistic log-likelihood ratios and so are used to filter unimportant similarities. Once the filtering or downsampling is finished we no longer need the strengths. We will get an indicator matrix of the form:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>itemID<tab>list-of-item IDs +<pre><code>itemID<tab>list-of-item IDs ... 
-</code></pre></div></div> +</code></pre> <p>This is a content indicator since it has found other items with similar content or metadata.</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>3459860b<tab>3459860b 3459860b 6749860c 5959860a 3434860a 3477860a +<pre><code>3459860b<tab>3459860b 3459860b 6749860c 5959860a 3434860a 3477860a 9446577d<tab>9446577d 9496577d 0943577d 8346577d 9442277d 9446577e ... -</code></pre></div></div> +</code></pre> <p>We now have three indicators, two collaborative filtering type and one content type.</p> @@ -691,11 +691,11 @@ For a given user, map their history of an action or content to the correct indic <p>We have 3 indicators, these are indexed by the search engine into 3 fields, weâll call them âpurchaseâ, âviewâ, and âtagsâ. We take the userâs history that corresponds to each indicator and create a query of the form:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Query: +<pre><code>Query: field: purchase; q:user's-purchase-history field: view; q:user's view-history field: tags; q:user's-tags-associated-with-purchases -</code></pre></div></div> +</code></pre> <p>The query will result in an ordered list of items recommended for purchase but skewed towards items with similar tags to the ones the user has already purchased.</p> @@ -707,11 +707,11 @@ by tagging items with some category of popularity (hot, warm, cold for instance) index that as a new indicator field and include the corresponding value in a query on the popularity field. If we use the ecom example but use the query to get âhotâ recommendations it might look like this:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Query: +<pre><code>Query: field: purchase; q:user's-purchase-history field: view; q:user's view-history field: popularity; q:"hot" -</code></pre></div></div> +</code></pre> <p>This will return recommendations favoring ones that have the intrinsic indicator âhotâ.</p> http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/recommender/intro-itembased-hadoop.html ---------------------------------------------------------------------- diff --git a/users/recommender/intro-itembased-hadoop.html b/users/recommender/intro-itembased-hadoop.html index 887b260..bb9b618 100644 --- a/users/recommender/intro-itembased-hadoop.html +++ b/users/recommender/intro-itembased-hadoop.html @@ -317,8 +317,8 @@ <p>Assuming your <em>JAVA_HOME</em> is appropriately set and Mahout was installed properly weâre ready to configure our syntax. Enter the following command:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ mahout recommenditembased -s SIMILARITY_LOGLIKELIHOOD -i /path/to/input/file -o /path/to/desired/output --numRecommendations 25 -</code></pre></div></div> +<pre><code>$ mahout recommenditembased -s SIMILARITY_LOGLIKELIHOOD -i /path/to/input/file -o /path/to/desired/output --numRecommendations 25 +</code></pre> <p>Running the command will execute a series of jobs the final product of which will be an output file deposited to the directory specified in the command syntax. 
The output file will contain two columns: the <em>userID</em> and an array of <em>itemIDs</em> and scores.</p> http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/recommender/matrix-factorization.html ---------------------------------------------------------------------- diff --git a/users/recommender/matrix-factorization.html b/users/recommender/matrix-factorization.html index f15fa8d..a1c28d6 100644 --- a/users/recommender/matrix-factorization.html +++ b/users/recommender/matrix-factorization.html @@ -282,39 +282,39 @@ There are many different matrix decompositions, each finds use among a particula <p>In Mahout, the SVDRecommender provides an interface to build a recommender based on matrix factorization. The idea behind it is to project the users and items onto a feature space and try to optimize U and M so that U * (M^t) is as close to R as possible:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> U is n * p user feature matrix, +<pre><code> U is n * p user feature matrix, M is m * p item feature matrix, M^t is the conjugate transpose of M, R is n * m rating matrix, n is the number of users, m is the number of items, p is the number of features -</code></pre></div></div> +</code></pre> <p>We usually use RMSE to represent the deviations between predictions and actual ratings. RMSE is defined as the square root of the sum of squared errors over all known user-item ratings. So our matrix factorization target could be mathematically defined as:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> find U and M, (U, M) = argmin(RMSE) = argmin(pow(SSE / K, 0.5)) +<pre><code> find U and M, (U, M) = argmin(RMSE) = argmin(pow(SSE / K, 0.5)) SSE = sum(e(u,i)^2) e(u,i) = r(u, i) - U[u,] * (M[i,]^t) = r(u,i) - sum(U[u,f] * M[i,f]), f = 0, 1, .. p - 1 K is the number of known user item ratings. -</code></pre></div></div> +</code></pre> <p><a name="MatrixFactorization-Factorizers"></a></p> <p>Mahout has implemented matrix factorization based on</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(1) SGD(Stochastic Gradient Descent) +<pre><code>(1) SGD(Stochastic Gradient Descent) (2) ALSWR(Alternating-Least-Squares with Weighted-λ-Regularization). -</code></pre></div></div> +</code></pre> <h2 id="sgd">SGD</h2> <p>Stochastic gradient descent is a gradient descent optimization method for minimizing an objective function that is written as a sum of differentiable functions.</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> Q(w) = sum(Q_i(w)), -</code></pre></div></div> +<pre><code> Q(w) = sum(Q_i(w)), +</code></pre> <p>where w is the parameter vector to be estimated, Q(w) is the objective function that could be expressed as a sum of differentiable functions, @@ -322,14 +322,14 @@ So our matrix factorization target could be mathematically defined as:</p> <p>In practice, w is estimated using an iterative method, one sample at a time, until an approximate minimum is obtained,</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> w = w - alpha * (d(Q_i(w))/dw), where alpha is the learning rate, +<pre><code> w = w - alpha * (d(Q_i(w))/dw), where alpha is the learning rate, (d(Q_i(w))/dw) is the first derivative of Q_i(w) with respect to w. 
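     e.g. (an illustrative case only): if Q_i(w) = (r_i - w * x_i)^2 / 2 for a single observed value r_i, then d(Q_i(w))/dw = -(r_i - w * x_i) * x_i and the update becomes w = w + alpha * (r_i - w * x_i) * x_i.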
-</code></pre></div></div> +</code></pre> <p>In matrix factorization, the RatingSGDFactorizer class implements the SGD with w = (U, M) and objective function Q(w) = sum(Q(u,i)),</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> Q(u,i) = sum(e(u,i) * e(u,i)) / 2 + lambda * [(U[u,] * (U[u,]^t)) + (M[i,] * (M[i,]^t))] / 2 -</code></pre></div></div> +<pre><code> Q(u,i) = sum(e(u,i) * e(u,i)) / 2 + lambda * [(U[u,] * (U[u,]^t)) + (M[i,] * (M[i,]^t))] / 2 +</code></pre> <p>where Q(u, i) is the objecive function for user u and item i, e(u, i) is the error between predicted rating and actual rating, @@ -339,7 +339,7 @@ So our matrix factorization target could be mathmatically defined as:</p> <p>The algorithm is sketched as follows:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> init U and M with randomized value between 0.0 and 1.0 with standard Gaussian distribution +<pre><code> init U and M with randomized value between 0.0 and 1.0 with standard Gaussian distribution for(iter = 0; iter < numIterations; iter++) { @@ -360,7 +360,7 @@ So our matrix factorization target could be mathmatically defined as:</p> U[u,] = NU[u,] } } -</code></pre></div></div> +</code></pre> <h2 id="svd">SVD++</h2> @@ -370,18 +370,18 @@ So our matrix factorization target could be mathmatically defined as:</p> <p>The complete model is a sum of 3 sub-models with complete prediction formula as follows:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pr(u,i) = b[u,i] + fm + nm //user u and item i +<pre><code>pr(u,i) = b[u,i] + fm + nm //user u and item i pr(u,i) is the predicted rating of user u on item i, b[u,i] = U + b(u) + b(i) fm = (q[i,]) * (p[u,] + pow(|N(u)|, -0.5) * sum(y[j,])), j is an item in N(u) nm = pow(|R(i;u;k)|, -0.5) * sum((r[u,j0] - b[u,j0]) * w[i,j0]) + pow(|N(i;u;k)|, -0.5) * sum(c[i,j1]), j0 is an item in R(i;u;k), j1 is an item in N(i;u;k) -</code></pre></div></div> +</code></pre> <p>The associated regularized squared error function to be minimized is:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{sum((r[u,i] - pr[u,i]) * (r[u,i] - pr[u,i])) - lambda * (b(u) * b(u) + b(i) * b(i) + ||q[i,]||^2 + ||p[u,]||^2 + sum(||y[j,]||^2) + sum(w[i,j0] * w[i,j0]) + sum(c[i,j1] * c[i,j1]))} -</code></pre></div></div> +<pre><code>{sum((r[u,i] - pr[u,i]) * (r[u,i] - pr[u,i])) - lambda * (b(u) * b(u) + b(i) * b(i) + ||q[i,]||^2 + ||p[u,]||^2 + sum(||y[j,]||^2) + sum(w[i,j0] * w[i,j0]) + sum(c[i,j1] * c[i,j1]))} +</code></pre> <p>b[u,i] is the baseline estimate of user uâs predicted rating on item i. 
U is usersâ overall average rating and b(u) and b(i) indicate the observed deviations of user u and item iâs ratings from average.</p> @@ -420,12 +420,12 @@ please refer to the paper <a href="http://research.yahoo.com/files/kdd08koren.pd <p>The update to q, p, y in each gradient descent step is:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> err(u,i) = r[u,i] - pr[u,i] +<pre><code> err(u,i) = r[u,i] - pr[u,i] q[i,] = q[i,] + alpha * (err(u,i) * (p[u,] + pow(|N(u)|, -0.5) * sum(y[j,])) - lamda * q[i,]) p[u,] = p[u,] + alpha * (err(u,i) * q[i,] - lambda * p[u,]) for j that is an item in N(u): y[j,] = y[j,] + alpha * (err(u,i) * pow(|N(u)|, -0.5) * q[i,] - lambda * y[j,]) -</code></pre></div></div> +</code></pre> <p>where alpha is the learning rate of gradient descent, N(u) is the items that user u has expressed preference.</p> @@ -443,8 +443,8 @@ vanilla SGD.</p> <p>ALSWR is an iterative algorithm to solve the low rank factorization of user feature matrix U and item feature matrix M.<br /> The loss function to be minimized is formulated as the sum of squared errors plus <a href="http://en.wikipedia.org/wiki/Tikhonov_regularization">Tikhonov regularization</a>:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> L(R, U, M) = sum(pow((R[u,i] - U[u,]* (M[i,]^t)), 2)) + lambda * (sum(n(u) * ||U[u,]||^2) + sum(n(i) * ||M[i,]||^2)) -</code></pre></div></div> +<pre><code> L(R, U, M) = sum(pow((R[u,i] - U[u,]* (M[i,]^t)), 2)) + lambda * (sum(n(u) * ||U[u,]||^2) + sum(n(i) * ||M[i,]||^2)) +</code></pre> <p>At the beginning of the algorithm, M is initialized with the average item ratings as its first row and random numbers for the rest row.</p> @@ -454,14 +454,14 @@ the cost function similarly. The iteration stops until a certain stopping criter <p>To solve the matrix U when M is given, each userâs feature vector is calculated by resolving a regularized linear least square error function using the items the user has rated and their feature vectors:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> 1/2 * d(L(R,U,M)) / d(U[u,f]) = 0 -</code></pre></div></div> +<pre><code> 1/2 * d(L(R,U,M)) / d(U[u,f]) = 0 +</code></pre> <p>Similary, when M is updated, we resolve a regularized linear least square error function using feature vectors of the users that have rated the item and their feature vectors:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> 1/2 * d(L(R,U,M)) / d(M[i,f]) = 0 -</code></pre></div></div> +<pre><code> 1/2 * d(L(R,U,M)) / d(M[i,f]) = 0 +</code></pre> <p>The ALSWRFactorizer class is a non-distributed implementation of ALSWR using multi-threading to dispatch the computation among several threads. Mahout also offers a <a href="https://mahout.apache.org/users/recommender/intro-als-hadoop.html">parallel map-reduce implementation</a>.</p> http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/recommender/recommender-documentation.html ---------------------------------------------------------------------- diff --git a/users/recommender/recommender-documentation.html b/users/recommender/recommender-documentation.html index d1a4b44..55a508b 100644 --- a/users/recommender/recommender-documentation.html +++ b/users/recommender/recommender-documentation.html @@ -375,34 +375,34 @@ Weâll start with an example of this.</p> on data in a file. The file should be in CSV format, with lines of the form âuserID,itemID,prefValueâ (e.g. 
â39505,290002,3.5â):</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>DataModel model = new FileDataModel(new File("data.txt")); -</code></pre></div></div> +<pre><code>DataModel model = new FileDataModel(new File("data.txt")); +</code></pre> <p>Weâll use the <strong>PearsonCorrelationSimilarity</strong> implementation of <strong>UserSimilarity</strong> as our user correlation algorithm, and add an optional preference inference algorithm:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>UserSimilarity userSimilarity = new PearsonCorrelationSimilarity(model); -</code></pre></div></div> +<pre><code>UserSimilarity userSimilarity = new PearsonCorrelationSimilarity(model); +</code></pre> <p>Now we create a <strong>UserNeighborhood</strong> algorithm. Here we use nearest-3:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>UserNeighborhood neighborhood = +<pre><code>UserNeighborhood neighborhood = new NearestNUserNeighborhood(3, userSimilarity, model);{code} -</code></pre></div></div> +</code></pre> <p>Now we can create our <strong>Recommender</strong>, and add a caching decorator:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Recommender recommender = +<pre><code>Recommender recommender = new GenericUserBasedRecommender(model, neighborhood, userSimilarity); Recommender cachingRecommender = new CachingRecommender(recommender); -</code></pre></div></div> +</code></pre> <p>Now we can get 10 recommendations for user ID â1234â â done!</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>List<RecommendedItem> recommendations = +<pre><code>List<RecommendedItem> recommendations = cachingRecommender.recommend(1234, 10); -</code></pre></div></div> +</code></pre> <h2 id="item-based-recommender">Item-based Recommender</h2> @@ -417,8 +417,8 @@ are more appropriate.</p> <p>Letâs start over, again with a <strong>FileDataModel</strong> to start:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>DataModel model = new FileDataModel(new File("data.txt")); -</code></pre></div></div> +<pre><code>DataModel model = new FileDataModel(new File("data.txt")); +</code></pre> <p>Weâll also need an <strong>ItemSimilarity</strong>. We could use <strong>PearsonCorrelationSimilarity</strong>, which computes item similarity in realtime, @@ -426,22 +426,22 @@ but, this is generally too slow to be useful. Instead, in a real application, you would feed a list of pre-computed correlations to a <strong>GenericItemSimilarity</strong>:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Construct the list of pre-computed correlations +<pre><code>// Construct the list of pre-computed correlations Collection<GenericItemSimilarity.ItemItemSimilarity> correlations = ...; ItemSimilarity itemSimilarity = new GenericItemSimilarity(correlations); -</code></pre></div></div> +</code></pre> <p>Then we can finish as before to produce recommendations:</p> -<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Recommender recommender = +<pre><code>Recommender recommender = new GenericItemBasedRecommender(model, itemSimilarity); Recommender cachingRecommender = new CachingRecommender(recommender); ... 
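// the CachingRecommender decorator reuses previously computed recommendations for a user across repeated calls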
List<RecommendedItem> recommendations = cachingRecommender.recommend(1234, 10);
-</code></pre></div></div>
+</code></pre>
<p><a name="RecommenderDocumentation-Integrationwithyourapplication"></a></p>
<h2 id="integration-with-your-application">Integration with your application</h2>
@@ -499,7 +499,7 @@ provides a good starting point.</p>
<p>Fortunately, Mahout provides a way to evaluate the accuracy of your Recommender on your own data, in org.apache.mahout.cf.taste.eval.</p>
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>DataModel myModel = ...;
+<pre><code>DataModel myModel = ...;
RecommenderBuilder builder = new RecommenderBuilder() {
  public Recommender buildRecommender(DataModel model) {
    // build and return the Recommender to evaluate here
@@ -508,7 +508,7 @@ RecommenderBuilder builder = new RecommenderBuilder() {
RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator();
double evaluation = evaluator.evaluate(builder, myModel, 0.9, 1.0);
-</code></pre></div></div>
+</code></pre>
<p>For "boolean" data model situations, where there are no notions of preference value, the above evaluation based on estimated preference does
http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/sparkbindings/faq.html
----------------------------------------------------------------------
diff --git a/users/sparkbindings/faq.html b/users/sparkbindings/faq.html
index 6f3627c..4665f3a 100644
--- a/users/sparkbindings/faq.html
+++ b/users/sparkbindings/faq.html
@@ -289,10 +289,10 @@ the classpath is sane and is made available to Mahout:</p>
<ol>
  <li>Check that Spark is the correct version (same as in Mahout's poms), is compiled and SPARK_HOME is set.</li>
  <li>Check that Mahout is compiled and MAHOUT_HOME is set.</li>
-  <li>Run <code class="highlighter-rouge">$SPARK_HOME/bin/compute-classpath.sh</code> and make sure it produces a sane result with no errors.
If it outputs something other than a straightforward classpath string, most likely Spark is not compiled/set correctly (later Spark versions require
-<code class="highlighter-rouge">sbt/sbt assembly</code> to be run; simply running <code class="highlighter-rouge">sbt/sbt publish-local</code> is not enough any longer).</li>
-  <li>Run <code class="highlighter-rouge">$MAHOUT_HOME/bin/mahout -spark classpath</code> and check that the path reported in step (3) is included.</li>
+  <li>Run <code>$SPARK_HOME/bin/compute-classpath.sh</code> and make sure it produces a sane result with no errors. If it outputs something other than a straightforward classpath string, most likely Spark is not compiled/set correctly (later Spark versions require
+<code>sbt/sbt assembly</code> to be run; simply running <code>sbt/sbt publish-local</code> is not enough any longer).</li>
+  <li>Run <code>$MAHOUT_HOME/bin/mahout -spark classpath</code> and check that the path reported in step (3) is included.</li>
</ol>
<p><strong>Q: I am using the command line Mahout jobs that run on Spark or am writing my own application that uses
http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/sparkbindings/home.html
----------------------------------------------------------------------
diff --git a/users/sparkbindings/home.html b/users/sparkbindings/home.html
index 234f4c2..c85c2ec 100644
--- a/users/sparkbindings/home.html
+++ b/users/sparkbindings/home.html
@@ -279,14 +279,14 @@
<p>In short, Scala & Spark Bindings for Mahout is a Scala DSL and algebraic optimizer for something like this (an actual formula from <strong>(d)spca</strong>)</p>
-<p><code class="highlighter-rouge">\[\mathbf{G}=\mathbf{B}\mathbf{B}^{\top}-\mathbf{C}-\mathbf{C}^{\top}+\mathbf{s}_{q}\mathbf{s}_{q}^{\top}\boldsymbol{\xi}^{\top}\boldsymbol{\xi}\]</code></p>
+<p><code>\[\mathbf{G}=\mathbf{B}\mathbf{B}^{\top}-\mathbf{C}-\mathbf{C}^{\top}+\mathbf{s}_{q}\mathbf{s}_{q}^{\top}\boldsymbol{\xi}^{\top}\boldsymbol{\xi}\]</code></p>
<p>bound to in-core and distributed computations (currently, on Apache Spark).</p>
<p>The Mahout Scala & Spark Bindings expression of the above:</p>
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> val g = bt.t %*% bt - c - c.t + (s_q cross s_q) * (xi dot xi)
-</code></pre></div></div>
+<pre><code> val g = bt.t %*% bt - c - c.t + (s_q cross s_q) * (xi dot xi)
+</code></pre>
<p>The main idea is that a scientist writing algebraic expressions could not care less about distributed operation plans and works <strong>entirely on the logical level</strong> just like he or she would do with R.</p>
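<p>As a side note (an illustrative sketch, not taken from this page), the same R-like expression can first be prototyped on small in-core matrices in the driver JVM and later be pointed at DRMs without rewriting it:</p>
<pre><code>import org.apache.mahout.math._
import scalabindings._
import RLikeOps._

// small, made-up in-core stand-ins for the distributed operands
val bt  = dense((1.0, 2.0), (3.0, 4.0))
val c   = dense((0.5, 0.5), (0.5, 0.5))
val s_q = dvec(1.0, 2.0)
val xi  = dvec(0.25, 0.75)

// the very same formula as above, evaluated entirely in-core
val g = bt.t %*% bt - c - c.t + (s_q cross s_q) * (xi dot xi)
</code></pre>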
http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/sparkbindings/play-with-shell.html
----------------------------------------------------------------------
diff --git a/users/sparkbindings/play-with-shell.html b/users/sparkbindings/play-with-shell.html
index 694e3e1..a05e764 100644
--- a/users/sparkbindings/play-with-shell.html
+++ b/users/sparkbindings/play-with-shell.html
@@ -375,15 +375,15 @@
<ol>
  <li>Download <a href="http://d3kbcqa49mib13.cloudfront.net/spark-1.6.2-bin-hadoop2.6.tgz">Apache Spark 1.6.2</a> and unpack the archive file</li>
-  <li>Change to the directory where you unpacked Spark and type <code class="highlighter-rouge">sbt/sbt assembly</code> to build it</li>
-  <li>Create a directory for Mahout somewhere on your machine, change into it and check out the master branch of Apache Mahout from GitHub <code class="highlighter-rouge">git clone https://github.com/apache/mahout mahout</code></li>
-  <li>Change to the <code class="highlighter-rouge">mahout</code> directory and build Mahout using <code class="highlighter-rouge">mvn -DskipTests clean install</code></li>
+  <li>Change to the directory where you unpacked Spark and type <code>sbt/sbt assembly</code> to build it</li>
+  <li>Create a directory for Mahout somewhere on your machine, change into it and check out the master branch of Apache Mahout from GitHub <code>git clone https://github.com/apache/mahout mahout</code></li>
+  <li>Change to the <code>mahout</code> directory and build Mahout using <code>mvn -DskipTests clean install</code></li>
</ol>
<h2 id="starting-mahouts-spark-shell">Starting Mahout's Spark shell</h2>
<ol>
-  <li>Go to the directory where you unpacked Spark and type <code class="highlighter-rouge">sbin/start-all.sh</code> to locally start Spark</li>
+  <li>Go to the directory where you unpacked Spark and type <code>sbin/start-all.sh</code> to locally start Spark</li>
  <li>Open a browser, point it to <a href="http://localhost:8080/">http://localhost:8080/</a> to check whether Spark successfully started. Copy the URL of the Spark master at the top of the page (it starts with <strong>spark://</strong>)</li>
  <li>Define the following environment variables: <pre class="codehilite">export MAHOUT_HOME=[directory into which you checked out Mahout]
export SPARK_HOME=[directory where you unpacked Spark]
@@ -391,8 +391,8 @@ export MASTER=[url of the Spark master]</li>
</ol>
<p></pre></p>
<ol>
-  <li>Finally, change to the directory where you unpacked Mahout and type <code class="highlighter-rouge">bin/mahout spark-shell</code>;
-you should see the shell starting and get the prompt <code class="highlighter-rouge">mahout></code>. Check
+  <li>Finally, change to the directory where you unpacked Mahout and type <code>bin/mahout spark-shell</code>;
+you should see the shell starting and get the prompt <code>mahout></code>. Check
<a href="http://mahout.apache.org/users/sparkbindings/faq.html">FAQ</a> for further troubleshooting.</li>
</ol>
@@ -402,7 +402,7 @@ you should see the shell starting and get the prompt <code class="highlighter-ro
<p><em>Note: You can incrementally follow the example by copy-and-pasting the code into your running Mahout shell.</em></p>
-<p>Mahout's linear algebra DSL has an abstraction called <em>DistributedRowMatrix (DRM)</em> which models a matrix that is partitioned by rows and stored in the memory of a cluster of machines. We use <code class="highlighter-rouge">dense()</code> to create a dense in-memory matrix from our toy dataset and use <code class="highlighter-rouge">drmParallelize</code> to load it into the cluster, "mimicking" a large, partitioned dataset.</p>
+<p>Mahout's linear algebra DSL has an abstraction called <em>DistributedRowMatrix (DRM)</em> which models a matrix that is partitioned by rows and stored in the memory of a cluster of machines. We use <code>dense()</code> to create a dense in-memory matrix from our toy dataset and use <code>drmParallelize</code> to load it into the cluster, "mimicking" a large, partitioned dataset.</p>
<div class="codehilite"><pre>
val drmData = drmParallelize(dense(
@@ -421,47 +421,47 @@ val drmData = drmParallelize(dense(
<p>Have a look at this matrix. The first four columns represent the ingredients (our features) and the last column (the rating) is the target variable for our regression.
<a href="https://en.wikipedia.org/wiki/Linear_regression">Linear regression</a>
-assumes that the <strong>target variable</strong> <code class="highlighter-rouge">\(\mathbf{y}\)</code> is generated by the
-linear combination of <strong>the feature matrix</strong> <code class="highlighter-rouge">\(\mathbf{X}\)</code> with the
-<strong>parameter vector</strong> <code class="highlighter-rouge">\(\boldsymbol{\beta}\)</code> plus the
- <strong>noise</strong> <code class="highlighter-rouge">\(\boldsymbol{\varepsilon}\)</code>, summarized in the formula
-<code class="highlighter-rouge">\(\mathbf{y}=\mathbf{X}\boldsymbol{\beta}+\boldsymbol{\varepsilon}\)</code>.
+assumes that the <strong>target variable</strong> <code>\(\mathbf{y}\)</code> is generated by the
+linear combination of <strong>the feature matrix</strong> <code>\(\mathbf{X}\)</code> with the
+<strong>parameter vector</strong> <code>\(\boldsymbol{\beta}\)</code> plus the
+ <strong>noise</strong> <code>\(\boldsymbol{\varepsilon}\)</code>, summarized in the formula
+<code>\(\mathbf{y}=\mathbf{X}\boldsymbol{\beta}+\boldsymbol{\varepsilon}\)</code>.
Our goal is to find an estimate of the parameter vector
-<code class="highlighter-rouge">\(\boldsymbol{\beta}\)</code> that explains the data very well.</p>
+<code>\(\boldsymbol{\beta}\)</code> that explains the data very well.</p>
-<p>As a first step, we extract <code class="highlighter-rouge">\(\mathbf{X}\)</code> and <code class="highlighter-rouge">\(\mathbf{y}\)</code> from our data matrix. We get <em>X</em> by slicing: we take all rows (denoted by <code class="highlighter-rouge">::</code>) and the first four columns, which have the ingredients in milligrams as content. Note that the result is again a DRM. The shell will not execute this code yet; it saves the history of operations and defers the execution until we really access a result. <strong>Mahout's DSL automatically optimizes and parallelizes all operations on DRMs and runs them on Apache Spark.</strong></p>
+<p>As a first step, we extract <code>\(\mathbf{X}\)</code> and <code>\(\mathbf{y}\)</code> from our data matrix. We get <em>X</em> by slicing: we take all rows (denoted by <code>::</code>) and the first four columns, which have the ingredients in milligrams as content. Note that the result is again a DRM. The shell will not execute this code yet; it saves the history of operations and defers the execution until we really access a result. <strong>Mahout's DSL automatically optimizes and parallelizes all operations on DRMs and runs them on Apache Spark.</strong></p>
<div class="codehilite"><pre>
val drmX = drmData(::, 0 until 4)
</pre></div>
-<p>Next, we extract the target variable vector <em>y</em>, the fifth column of the data matrix. We assume this one fits into our driver machine, so we fetch it into memory using <code class="highlighter-rouge">collect</code>:</p>
+<p>Next, we extract the target variable vector <em>y</em>, the fifth column of the data matrix. We assume this one fits into our driver machine, so we fetch it into memory using <code>collect</code>:</p>
<div class="codehilite"><pre>
val y = drmData.collect(::, 4)
</pre></div>
-<p>Now we are ready to think about a mathematical way to estimate the parameter vector <em>β</em>. A simple textbook approach is <a href="https://en.wikipedia.org/wiki/Ordinary_least_squares">ordinary least squares (OLS)</a>, which minimizes the sum of residual squares between the true target variable and the prediction of the target variable.
In OLS, there is even a closed form expression for estimating <code class="highlighter-rouge">\(\boldsymbol{\beta}\)</code> as
-<code class="highlighter-rouge">\(\left(\mathbf{X}^{\top}\mathbf{X}\right)^{-1}\mathbf{X}^{\top}\mathbf{y}\)</code>.</p>
+<p>Now we are ready to think about a mathematical way to estimate the parameter vector <em>β</em>. A simple textbook approach is <a href="https://en.wikipedia.org/wiki/Ordinary_least_squares">ordinary least squares (OLS)</a>, which minimizes the sum of residual squares between the true target variable and the prediction of the target variable. In OLS, there is even a closed form expression for estimating <code>\(\boldsymbol{\beta}\)</code> as
+<code>\(\left(\mathbf{X}^{\top}\mathbf{X}\right)^{-1}\mathbf{X}^{\top}\mathbf{y}\)</code>.</p>
-<p>The first thing we compute for this is <code class="highlighter-rouge">\(\mathbf{X}^{\top}\mathbf{X}\)</code>. The code for doing this in Mahout's Scala DSL maps directly to the mathematical formula. The operation <code class="highlighter-rouge">.t()</code> transposes a matrix and, analogous to R, <code class="highlighter-rouge">%*%</code> denotes matrix multiplication.</p>
+<p>The first thing we compute for this is <code>\(\mathbf{X}^{\top}\mathbf{X}\)</code>. The code for doing this in Mahout's Scala DSL maps directly to the mathematical formula. The operation <code>.t()</code> transposes a matrix and, analogous to R, <code>%*%</code> denotes matrix multiplication.</p>
<div class="codehilite"><pre>
val drmXtX = drmX.t %*% drmX
</pre></div>
-<p>The same is true for computing <code class="highlighter-rouge">\(\mathbf{X}^{\top}\mathbf{y}\)</code>. We can simply type the math as Scala expressions into the shell. Here, <em>X</em> lives in the cluster, while <em>y</em> is in the memory of the driver, and the result is a DRM again.</p>
+<p>The same is true for computing <code>\(\mathbf{X}^{\top}\mathbf{y}\)</code>. We can simply type the math as Scala expressions into the shell. Here, <em>X</em> lives in the cluster, while <em>y</em> is in the memory of the driver, and the result is a DRM again.</p>
<div class="codehilite"><pre>
val drmXty = drmX.t %*% y
</pre></div>
-<p>We're nearly done. The next step we take is to fetch <code class="highlighter-rouge">\(\mathbf{X}^{\top}\mathbf{X}\)</code> and
-<code class="highlighter-rouge">\(\mathbf{X}^{\top}\mathbf{y}\)</code> into the memory of our driver machine (we are targeting
+<p>We're nearly done. The next step we take is to fetch <code>\(\mathbf{X}^{\top}\mathbf{X}\)</code> and
+<code>\(\mathbf{X}^{\top}\mathbf{y}\)</code> into the memory of our driver machine (we are targeting
feature matrices that are tall and skinny,
-so we can assume that <code class="highlighter-rouge">\(\mathbf{X}^{\top}\mathbf{X}\)</code> is small enough
+so we can assume that <code>\(\mathbf{X}^{\top}\mathbf{X}\)</code> is small enough
to fit in).
Then, we provide them to an in-memory solver (Mahout provides
-an analog to R's <code class="highlighter-rouge">solve()</code> for that) which computes <code class="highlighter-rouge">beta</code>, our
-OLS estimate of the parameter vector <code class="highlighter-rouge">\(\boldsymbol{\beta}\)</code>.</p>
+an analog to R's <code>solve()</code> for that) which computes <code>beta</code>, our
+OLS estimate of the parameter vector <code>\(\boldsymbol{\beta}\)</code>.</p>
<div class="codehilite"><pre>
val XtX = drmXtX.collect
@@ -478,9 +478,9 @@ as much as possible, while still retaining decent performance and
scalability.</p>
<p>We can now check how well our model fits its training data.
-First, we multiply the feature matrix <code class="highlighter-rouge">\(\mathbf{X}\)</code> by our estimate of
-<code class="highlighter-rouge">\(\boldsymbol{\beta}\)</code>. Then, we look at the difference (via L2-norm) of
-the target variable <code class="highlighter-rouge">\(\mathbf{y}\)</code> to the fitted target variable:</p>
+First, we multiply the feature matrix <code>\(\mathbf{X}\)</code> by our estimate of
+<code>\(\boldsymbol{\beta}\)</code>. Then, we look at the difference (via L2-norm) of
+the target variable <code>\(\mathbf{y}\)</code> to the fitted target variable:</p>
<div class="codehilite"><pre>
val yFitted = (drmX %*% beta).collect(::, 0)
@@ -489,7 +489,7 @@ val yFitted = (drmX %*% beta).collect(::, 0)
<p>We hope that we could show that Mahout's shell allows people to interactively and incrementally write algorithms. We have entered a lot of individual commands, one-by-one, until we got the desired results. We can now refactor a little by wrapping our statements into easy-to-use functions. The definition of functions follows standard Scala syntax.</p>
-<p>We put all the commands for ordinary least squares into a function <code class="highlighter-rouge">ols</code>.</p>
+<p>We put all the commands for ordinary least squares into a function <code>ols</code>.</p>
<div class="codehilite"><pre>
def ols(drmX: DrmLike[Int], y: Vector) =
</pre></div>
-<p>Note that the DSL declares an implicit <code class="highlighter-rouge">collect</code> if coercion rules require an in-core argument. Hence, we can simply
-skip explicit <code class="highlighter-rouge">collect</code>s.</p>
+<p>Note that the DSL declares an implicit <code>collect</code> if coercion rules require an in-core argument. Hence, we can simply
+skip explicit <code>collect</code>s.</p>
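<p>To illustrate that implicit collection (a hypothetical snippet, not part of the page; <code>beta2</code> is just an illustrative name), the distributed products from above can be handed directly to the in-core solver:</p>
<div class="codehilite"><pre>
// drmXtX and drmXty are DRMs; the implicit coercion collects them into
// in-core matrices on the driver before solve() runs
val beta2 = solve(drmXtX, drmXty)(::, 0)
</pre></div>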
-<p>Next, we define a function <code class="highlighter-rouge">goodnessOfFit</code> that tells how well a model fits the target variable:</p>
+<p>Next, we define a function <code>goodnessOfFit</code> that tells how well a model fits the target variable:</p>
<div class="codehilite"><pre>
def goodnessOfFit(drmX: DrmLike[Int], beta: Vector, y: Vector) = {
@@ -513,7 +513,7 @@ def goodnessOfFit(drmX: DrmLike[Int], beta: Vector, y: Vector) = {
model. Usually there is a constant bias term added to the model. Without that, our model always crosses through the origin and we only learn the right angle. An easy way to add such a bias term to our model is to add a
-column of ones to the feature matrix <code class="highlighter-rouge">\(\mathbf{X}\)</code>.
+column of ones to the feature matrix <code>\(\mathbf{X}\)</code>.
The corresponding weight in the parameter vector will then be the bias term.</p>
<p>Here is how we add a bias column:</p>
@@ -522,14 +522,14 @@
val drmXwithBiasColumn = drmX cbind 1
</pre></div>
-<p>Now we can give the newly created DRM <code class="highlighter-rouge">drmXwithBiasColumn</code> to our model fitting method <code class="highlighter-rouge">ols</code> and see how well the resulting model fits the training data with <code class="highlighter-rouge">goodnessOfFit</code>. You should see a large improvement in the result.</p>
+<p>Now we can give the newly created DRM <code>drmXwithBiasColumn</code> to our model fitting method <code>ols</code> and see how well the resulting model fits the training data with <code>goodnessOfFit</code>. You should see a large improvement in the result.</p>
<div class="codehilite"><pre>
val betaWithBiasTerm = ols(drmXwithBiasColumn, y)
goodnessOfFit(drmXwithBiasColumn, betaWithBiasTerm, y)
</pre></div>
-<p>As a further optimization, we can make use of the DSL's caching functionality. We use <code class="highlighter-rouge">drmXwithBiasColumn</code> repeatedly as input to a computation, so it might be beneficial to cache it in memory. This is achieved by calling <code class="highlighter-rouge">checkpoint()</code>. In the end, we remove it from the cache with uncache:</p>
+<p>As a further optimization, we can make use of the DSL's caching functionality. We use <code>drmXwithBiasColumn</code> repeatedly as input to a computation, so it might be beneficial to cache it in memory. This is achieved by calling <code>checkpoint()</code>. In the end, we remove it from the cache with uncache:</p>
<div class="codehilite"><pre>
val cachedDrmX = drmXwithBiasColumn.checkpoint()