[
https://issues.apache.org/jira/browse/MAHOUT-792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13114286#comment-13114286
]
Ted Dunning commented on MAHOUT-792:
------------------------------------
{quote}
Yes... and that's what I mean. I was implementing it as Y-on-the-fly. But that
implies a full pass over A every time we need access to Y, and we need it 3
times without the option to parallelize. That's why I think I need to save it.
{quote}
Without the power iterations, A is scanned anyway every time Y is needed, so
computing Y on the fly is fine.
But I think that saving Y is a fine idea. It should usually (but, surprisingly,
by no means always) be smaller than A: Y is dense, so a very sparse A can take
less space. Y is the same size as Q.
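
To make the size comparison concrete, here is a rough back-of-the-envelope
sketch. It assumes Y is a dense matrix with one column per requested singular
vector plus oversampling (k + p), stored as 8-byte doubles, and that A is stored
sparsely as (4-byte index, 8-byte value) pairs. The function names and the
example dimensions are made up for illustration and are not taken from the patch.

```python
def dense_y_bytes(n_rows, k, p, bytes_per_value=8):
    """Approximate storage for a dense n_rows x (k + p) matrix of doubles."""
    return n_rows * (k + p) * bytes_per_value


def sparse_a_bytes(nnz, bytes_per_value=8, bytes_per_index=4):
    """Approximate storage for a sparse matrix kept as (index, value) pairs."""
    return nnz * (bytes_per_value + bytes_per_index)


# Hypothetical problem size: one million rows, rank 100 with oversampling 10.
n_rows = 1_000_000
k, p = 100, 10

y_size = dense_y_bytes(n_rows, k, p)

# A with 1,000 non-zeros per row: Y is much smaller than A.
moderately_sparse = sparse_a_bytes(nnz=n_rows * 1000)

# A with 20 non-zeros per row: Y is larger than A.
very_sparse = sparse_a_bytes(nnz=n_rows * 20)

print(y_size, moderately_sparse, very_sparse)
```

Under these assumptions Y stops being smaller than A once A has fewer non-zeros
per row than roughly (k + p) * 8 / 12.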
> Add new stochastic decomposition code
> -------------------------------------
>
> Key: MAHOUT-792
> URL: https://issues.apache.org/jira/browse/MAHOUT-792
> Project: Mahout
> Issue Type: New Feature
> Reporter: Ted Dunning
> Attachments: MAHOUT-792.patch, MAHOUT-792.patch, sd-2.pdf
>
>
> I have figured out some simplification for our SSVD algorithms. This
> eliminates the QR decomposition and makes life easier.
> I will produce a patch that contains the following:
> - a CholeskyDecomposition implementation that can optionally do pivoting
> (which makes it rank-revealing). This should actually be useful for solving
> large out-of-core least squares problems.
> - an in-memory SSVD implementation that should work for matrices up to
> about 1/3 of available memory.
> - an out-of-core SSVD threaded implementation that should work for very
> large matrices. It should take time about equal to the cost of reading the
> input matrix 4 times and will require working disk roughly equal to the size
> of the input.
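
One way a Cholesky factorization can stand in for QR is sketched below. This is
an assumption about the general idea, not the code in the attached patch: form
the small Gram matrix G = Y^T Y, factor it as G = L L^T, and take Q = Y L^{-T},
which has orthonormal columns. The sketch is in-memory, does no pivoting, and
uses a random matrix as a stand-in for Y; all function names are hypothetical.

```python
import math
import random


def gram(y):
    """Return G = Y^T Y for Y given as a list of rows."""
    k = len(y[0])
    g = [[0.0] * k for _ in range(k)]
    for row in y:
        for i in range(k):
            for j in range(k):
                g[i][j] += row[i] * row[j]
    return g


def cholesky(g):
    """Return lower-triangular L with G = L L^T (no pivoting, G must be positive definite)."""
    k = len(g)
    l = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            s = g[i][j] - sum(l[i][m] * l[j][m] for m in range(j))
            l[i][j] = math.sqrt(s) if i == j else s / l[j][j]
    return l


def orthonormalize(y):
    """Return Q = Y L^{-T} without running a QR decomposition."""
    l = cholesky(gram(y))
    k = len(l)
    q = []
    for row in y:
        # Each row of Q solves L q = y_row by forward substitution.
        out = [0.0] * k
        for i in range(k):
            out[i] = (row[i] - sum(l[i][m] * out[m] for m in range(i))) / l[i][i]
        q.append(out)
    return q


# Stand-in for Y: a random 50 x 4 matrix with a fixed seed.
random.seed(0)
y = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(50)]
q = orthonormalize(y)

# Q^T Q should be close to the identity matrix.
qtq = gram(q)
```

Only the k x k Gram matrix needs to fit in memory, and both the accumulation of
Y^T Y and the row-wise solve can be done while streaming over the rows of Y.
The price is that forming Y^T Y squares the condition number, which is where a
pivoted, rank-revealing Cholesky would help.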