[ 
https://issues.apache.org/jira/browse/MAHOUT-376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12920369#action_12920369
 ] 

Ted Dunning commented on MAHOUT-376:
------------------------------------

Dima,

I just attached an update to the original outline document I posted.  The gist 
of it is that the Q_i need to be arranged in block diagonal form in order to 
form a basis for the range of A\Omega.  When that is done, my experiments show complete 
agreement with the original algorithm.  
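
To spell out why the block diagonal arrangement works, here is a short 
derivation for the 2 way case.  The notation (Y, R_i, B) is assumed for 
illustration and may not match the attached outline: A_1 and A_2 are the row 
blocks of A, and A_i\Omega = Q_i R_i is the thin QR decomposition of each block.

\[
Y = A\Omega
  = \begin{pmatrix} A_1\Omega \\ A_2\Omega \end{pmatrix}
  = \begin{pmatrix} Q_1 R_1 \\ Q_2 R_2 \end{pmatrix}
  = \begin{pmatrix} Q_1 & 0 \\ 0 & Q_2 \end{pmatrix}
    \begin{pmatrix} R_1 \\ R_2 \end{pmatrix}
\]

\[
Q = \begin{pmatrix} Q_1 & 0 \\ 0 & Q_2 \end{pmatrix}, \qquad
Q^T Q = \begin{pmatrix} Q_1^T Q_1 & 0 \\ 0 & Q_2^T Q_2 \end{pmatrix} = I, \qquad
B = Q^T A = \begin{pmatrix} Q_1^T A_1 \\ Q_2^T A_2 \end{pmatrix}
\]

So the columns of Q are orthonormal, their span contains the range of A\Omega, 
and B can be assembled one block at a time.  Note that in this sketch Q has 
2(k+p) columns rather than k+p.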

Here is R code that demonstrates decomposition without blocking and a 2 way 
block decomposition:
{code}
# SVD decompose a matrix, extracting the first k singular values/vectors
# using k+p random projection
svd.rp = function(A, k=10, p=5) {
  # the random matrix needs one row per column of A
  n = ncol(A)
  y = A %*% matrix(rnorm(n * (k+p)), nrow=n)
  q = qr.Q(qr(y))
  b = t(q) %*% A
  svd = svd(b)
  list(u=q%*%svd$u, d=svd$d, v=svd$v)
}

# block-wise SVD decompose a matrix, extracting the first k singular
# values/vectors using k+p random projection
svd.rpx = function(A, k=10, p=5) {
  n = nrow(A)
  # block sizes
  n1 = floor(n/2)
  n2 = n-n1

  # the random matrix needs one row per column of A, shared by both blocks
  r = matrix(rnorm(ncol(A) * (k+p)), nrow=ncol(A))
  A1 = A[1:n1,]
  A2 = A[(n1+1):n,]

  # block-wise multiplication and basis
  y1 = A1 %*% r
  q1 = qr.Q(qr(y1))

  y2 = A2 %*% r
  q2 = qr.Q(qr(y2))

  # construction of full q (not really necessary)
  z1 = diag(0, nrow=nrow(q1), ncol=(k+p))
  z2 = diag(0, nrow=nrow(q2), ncol=(k+p))
  q = rbind(cbind(q1, z1), cbind(z2, q2))
  b = t(q) %*% A

  # we can compute b without forming the block diagonal Q
  bx = rbind(t(q1)%*%A1, t(q2)%*%A2)

  # now the decomposition continues
  svd = svd(bx)

  # return all the pieces for checking
  list(u=q%*%svd$u, d=svd$d, v=svd$v, q1=q1, q2=q2, q=q, b=b, bx=bx)
}
{code}
Note that this code has a fair bit of fat in it for debugging or illustrative 
purposes.
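
For anyone who wants to reproduce the agreement, the following is a minimal 
self-contained sketch of one possible check, not the experiment I actually 
ran.  The test matrix, its sizes, the seed and the names omega, d.block, 
d.exact and max.rel.err are all assumptions made for illustration.  It builds 
a matrix of exact rank k, forms B block by block as above, and compares the 
leading singular values with a direct SVD.
{code}
# hypothetical check; sizes and seed are arbitrary choices
set.seed(42)
m = 60; n = 40; k = 5; p = 5

# test matrix of exact rank k, so k+p random projections capture its range
A = matrix(rnorm(m * k), nrow=m) %*% matrix(rnorm(k * n), nrow=k)

# the same random matrix is shared by both row blocks
omega = matrix(rnorm(n * (k+p)), nrow=n)
m1 = floor(m/2)
A1 = A[1:m1,]
A2 = A[(m1+1):m,]

# per-block orthonormal bases
q1 = qr.Q(qr(A1 %*% omega))
q2 = qr.Q(qr(A2 %*% omega))

# B for the block diagonal Q, computed without forming Q
bx = rbind(t(q1) %*% A1, t(q2) %*% A2)

# the leading k singular values should match a direct SVD of A
d.block = svd(bx)$d[1:k]
d.exact = svd(A)$d[1:k]
max.rel.err = max(abs(d.block - d.exact) / d.exact)
print(max.rel.err)
{code}
For an exactly low rank input like this one the relative error should be at 
the level of floating point roundoff.  For a general matrix the agreement is 
only as good as the random projection itself.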

> Implement Map-reduce version of stochastic SVD
> ----------------------------------------------
>
>                 Key: MAHOUT-376
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-376
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Math
>            Reporter: Ted Dunning
>            Assignee: Ted Dunning
>             Fix For: 0.5
>
>         Attachments: MAHOUT-376.patch, Modified stochastic svd algorithm for 
> mapreduce.pdf, sd-bib.bib, sd.pdf, sd.pdf, sd.tex, sd.tex, Stochastic SVD 
> using eigensolver trick.pdf
>
>
> See attached pdf for outline of proposed method.
> All comments are welcome.

