Author: akm
Date: Thu Feb  2 23:36:41 2017
New Revision: 1781487

URL: http://svn.apache.org/viewvc?rev=1781487&view=rev
Log:
MAHOUT-1682 and 1686: SPCA and ALS pages.

Added:
    mahout/site/mahout_cms/trunk/content/users/algorithms/d-als.mdtext
      - copied unchanged from r1781457, 
mahout/site/mahout_cms/trunk/content/users/algorithms/d-qr.mdtext
    mahout/site/mahout_cms/trunk/content/users/algorithms/d-spca.mdtext
      - copied, changed from r1781457, 
mahout/site/mahout_cms/trunk/content/users/algorithms/d-qr.mdtext
Modified:
    mahout/site/mahout_cms/trunk/content/users/algorithms/d-qr.mdtext

Modified: mahout/site/mahout_cms/trunk/content/users/algorithms/d-qr.mdtext
URL: 
http://svn.apache.org/viewvc/mahout/site/mahout_cms/trunk/content/users/algorithms/d-qr.mdtext?rev=1781487&r1=1781486&r2=1781487&view=diff
==============================================================================
--- mahout/site/mahout_cms/trunk/content/users/algorithms/d-qr.mdtext (original)
+++ mahout/site/mahout_cms/trunk/content/users/algorithms/d-qr.mdtext Thu Feb  
2 23:36:41 2017
@@ -3,11 +3,11 @@
 
 ## Intro
 
-Mahout has a distributed implementation of QR decomposition for tall thin 
matricies[1].
+Mahout has a distributed implementation of QR decomposition for tall thin 
matrices[1].
 
 ## Algorithm 
 
-For the classic QR decomposition of the form 
`\(\mathbf{A}=\mathbf{QR},\mathbf{A}\in\mathbb{R}^{m\times n}\)` a distributed 
version is fairly easily achieved if `\(\mathbf{A}\)` is tall and thin such 
that `\(\mathbf{A}^{\top}\mathbf{A}\)` fits in memory, i.e. *m* is large but 
*n* < ~5000 Under such circumstances, only `\(\mathbf{A}\)` and 
`\(\mathbf{Q}\)` are distributed matricies and `\(\mathbf{A^{\top}A}\)` and 
`\(\mathbf{R}\)` are in-core products. We just compute the in-core version of 
the Cholesky decomposition in the form of `\(\mathbf{LL}^{\top}= 
\mathbf{A}^{\top}\mathbf{A}\)`.  After that we take `\(\mathbf{R}= 
\mathbf{L}^{\top}\)` and 
`\(\mathbf{Q}=\mathbf{A}\left(\mathbf{L}^{\top}\right)^{-1}\)`.  The latter is 
easily achieved by multiplying each verticle block of `\(\mathbf{A}\)` by 
`\(\left(\mathbf{L}^{\top}\right)^{-1}\)`.  (There is no actual matrix 
inversion happening). 
+For the classic QR decomposition of the form 
`\(\mathbf{A}=\mathbf{QR},\mathbf{A}\in\mathbb{R}^{m\times n}\)` a distributed 
version is fairly easily achieved if `\(\mathbf{A}\)` is tall and thin such 
that `\(\mathbf{A}^{\top}\mathbf{A}\)` fits in memory, i.e. *m* is large but 
*n* < ~5000. Under such circumstances, only `\(\mathbf{A}\)` and 
`\(\mathbf{Q}\)` are distributed matrices and `\(\mathbf{A^{\top}A}\)` and 
`\(\mathbf{R}\)` are in-core products. We just compute the in-core version of 
the Cholesky decomposition in the form of `\(\mathbf{LL}^{\top}= 
\mathbf{A}^{\top}\mathbf{A}\)`.  After that we take `\(\mathbf{R}= 
\mathbf{L}^{\top}\)` and 
`\(\mathbf{Q}=\mathbf{A}\left(\mathbf{L}^{\top}\right)^{-1}\)`.  The latter is 
easily achieved by multiplying each vertical block of `\(\mathbf{A}\)` by 
`\(\left(\mathbf{L}^{\top}\right)^{-1}\)`.  (There is no actual matrix 
inversion happening). 
 
 
 

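The Cholesky QR steps described in the d-qr.mdtext hunk above can be sketched in a few lines. This is a minimal single-machine illustration in plain Python, not Mahout's implementation: the function names (`transpose_times_self`, `cholesky`, `solve_row`, `cholesky_qr`) and the representation of `\(\mathbf{A}\)` as a list of vertical blocks are assumptions made for this example. It accumulates `\(\mathbf{A}^{\top}\mathbf{A}\)` block by block, factors it in-core, and recovers each block of `\(\mathbf{Q}\)` by forward substitution, so no matrix is explicitly inverted.

```python
import math


def transpose_times_self(rows):
    """Return A^T A (n x n) for a block of A given as a list of rows."""
    n = len(rows[0])
    gram = [[0.0] * n for _ in range(n)]
    for row in rows:
        for i in range(n):
            for j in range(n):
                gram[i][j] += row[i] * row[j]
    return gram


def cholesky(s):
    """Return lower-triangular L with L L^T = s (s symmetric positive definite)."""
    n = len(s)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = s[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(acc) if i == j else acc / low[j][j]
    return low


def solve_row(low, a_row):
    """Solve q L^T = a_row for the row vector q by forward substitution."""
    q = []
    for j, a_j in enumerate(a_row):
        acc = a_j - sum(low[j][i] * q[i] for i in range(j))
        q.append(acc / low[j][j])
    return q


def cholesky_qr(blocks):
    """Thin QR of a tall matrix supplied as a list of vertical blocks.

    Returns (q_blocks, r), where r = L^T is small and held in memory and
    q_blocks has the same block layout as the input.
    """
    n = len(blocks[0][0])
    gram = [[0.0] * n for _ in range(n)]
    for block in blocks:
        # Each block contributes an n x n partial sum to A^T A.
        partial = transpose_times_self(block)
        for i in range(n):
            for j in range(n):
                gram[i][j] += partial[i][j]
    low = cholesky(gram)
    r = [[low[j][i] for j in range(n)] for i in range(n)]
    # Q = A (L^T)^{-1}, computed row by row within each vertical block.
    q_blocks = [[solve_row(low, row) for row in block] for block in blocks]
    return q_blocks, r
```

Calling `cholesky_qr([[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0]]])` treats the 3 x 2 input as two vertical blocks. The sketch assumes `\(\mathbf{A}\)` has full column rank; otherwise the Cholesky step fails on a zero or negative pivot.
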
Copied: mahout/site/mahout_cms/trunk/content/users/algorithms/d-spca.mdtext 
(from r1781457, 
mahout/site/mahout_cms/trunk/content/users/algorithms/d-qr.mdtext)
URL: 
http://svn.apache.org/viewvc/mahout/site/mahout_cms/trunk/content/users/algorithms/d-spca.mdtext?p2=mahout/site/mahout_cms/trunk/content/users/algorithms/d-spca.mdtext&p1=mahout/site/mahout_cms/trunk/content/users/algorithms/d-qr.mdtext&r1=1781457&r2=1781487&rev=1781487&view=diff
==============================================================================
--- mahout/site/mahout_cms/trunk/content/users/algorithms/d-qr.mdtext (original)
+++ mahout/site/mahout_cms/trunk/content/users/algorithms/d-spca.mdtext Thu Feb 
 2 23:36:41 2017
@@ -1,14 +1,13 @@
-# Distributed Cholesky QR
+# Distributed Stochastic PCA
 
 
 ## Intro
 
-Mahout has a distributed implementation of QR decomposition for tall thin 
matricies[1].
+Mahout has a distributed implementation of Stochastic PCA.
 
-## Algorithm 
-
-For the classic QR decomposition of the form 
`\(\mathbf{A}=\mathbf{QR},\mathbf{A}\in\mathbb{R}^{m\times n}\)` a distributed 
version is fairly easily achieved if `\(\mathbf{A}\)` is tall and thin such 
that `\(\mathbf{A}^{\top}\mathbf{A}\)` fits in memory, i.e. *m* is large but 
*n* < ~5000 Under such circumstances, only `\(\mathbf{A}\)` and 
`\(\mathbf{Q}\)` are distributed matricies and `\(\mathbf{A^{\top}A}\)` and 
`\(\mathbf{R}\)` are in-core products. We just compute the in-core version of 
the Cholesky decomposition in the form of `\(\mathbf{LL}^{\top}= 
\mathbf{A}^{\top}\mathbf{A}\)`.  After that we take `\(\mathbf{R}= 
\mathbf{L}^{\top}\)` and 
`\(\mathbf{Q}=\mathbf{A}\left(\mathbf{L}^{\top}\right)^{-1}\)`.  The latter is 
easily achieved by multiplying each verticle block of `\(\mathbf{A}\)` by 
`\(\left(\mathbf{L}^{\top}\right)^{-1}\)`.  (There is no actual matrix 
inversion happening). 
+## Motivation
 
+The Stochastic SVD method in Mahout produces reduced-rank Singular Value 
Decomposition output in its strict mathematical definition: 
`\(\mathbf{A}\approx\mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^{\top}\)`, i.e. it 
creates outputs for matrices `\(\mathbf{U}\)`, `\(\mathbf{V}\)`, and 
`\(\boldsymbol{\Sigma}\)`, each of which may be requested individually. The 
desired rank of decomposition, henceforth denoted as 
*k*`\(\in\mathbb{N}_1\)`, is a parameter of the algorithm. The singular values 
inside the diagonal matrix `\(\boldsymbol{\Sigma}\)` satisfy 
`\(\sigma_{i+1}\leq\sigma_{i}\ \forall i\in[1,k-1]\)`, i.e. they are sorted 
from largest to smallest. Cases of rank deficiency, 
`\(\mathrm{rank}(\mathbf{A})<k\)`, are handled by producing 0s in the singular 
value positions once deficiency takes place.
 
 
 ## Implementation

