[ 
https://issues.apache.org/jira/browse/MAHOUT-792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13114286#comment-13114286
 ] 

Ted Dunning commented on MAHOUT-792:
------------------------------------

{quote}
Yes... and that's what I mean. I was implementing it as Y-on-the-fly. But that 
implies full pass over A every time we need access to Y, and we need it 3 times 
without option to parallelize. That's why I think I need to save it.
{quote}

Without the power iterations, every time that Y is needed, A is also scanned.  
That means that Y-on-the-fly is fine.

But I think that saving Y is a fine idea.  It should usually (but surprisingly, 
not at all always) be smaller than A.  It is the same size as Q.

> Add new stochastic decomposition code
> -------------------------------------
>
>                 Key: MAHOUT-792
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-792
>             Project: Mahout
>          Issue Type: New Feature
>            Reporter: Ted Dunning
>         Attachments: MAHOUT-792.patch, MAHOUT-792.patch, sd-2.pdf
>
>
> I have figured out some simplification for our SSVD algorithms.  This 
> eliminates the QR decomposition and makes life easier.
> I will produce a patch that contains the following:
>   - a CholeskyDecomposition implementation that does pivoting (and thus 
> rank-revealing) or not.  This should actually be useful for solution of large 
> out-of-core least squares problems.
>   - an in-memory SSVD implementation that should work for matrices up to 
> about 1/3 of available memory.
>   - an out-of-core SSVD threaded implementation that should work for very 
> large matrices.  It should take time about equal to the cost of reading the 
> input matrix 4 times and will require working disk roughly equal to the size 
> of the input.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to