[ 
https://issues.apache.org/jira/browse/MAHOUT-796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091538#comment-13091538
 ] 

Ted Dunning commented on MAHOUT-796:
------------------------------------

For the in-memory implementations, I think that this is a non-issue.  Power 
iteration should simply be implemented.  In that case, the original form using 
Y = (AA')^q A \Omega seems fine, and I don't yet quite see how the iteration 
that Dmitriy proposes will get the right result.  Whichever method is used, it 
is a good thing to do.
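
For reference, a minimal in-memory sketch of the two variants under discussion,
written with NumPy. It assumes A is m x n and \Omega is an n x k Gaussian
matrix, so the plain form is Y = (AA')^q A \Omega. The function name
range_basis and its parameters are illustrative only; this is not the Mahout
code, and the re-orthogonalized branch is one reading of "orthogonalize after
each iteration", not necessarily the exact scheme on Dmitriy's wiki page.

```python
import numpy as np

def range_basis(A, k, q, orthogonalize=True, seed=0):
    """Orthonormal basis Q (m x k) for the range of (A A')^q A Omega.

    With orthogonalize=True, Y is replaced by its QR factor Q before every
    multiplication by A A' (assumed variant). With orthogonalize=False, the
    plain power form is computed and orthogonalized once at the end.
    """
    rng = np.random.default_rng(seed)
    omega = rng.standard_normal((A.shape[1], k))  # n x k random test matrix
    Y = A @ omega                                 # m x k
    for _ in range(q):
        if orthogonalize:
            Y, _ = np.linalg.qr(Y)                # keep the columns well conditioned
        Y = A @ (A.T @ Y)                         # apply A A' without forming it
    Q, _ = np.linalg.qr(Y)
    return Q
```

In exact arithmetic both branches span the same column space, because
replacing Y by its QR factor Q only multiplies Y on the right by an invertible
k x k matrix (when Y has full column rank). The difference is numerical: without the
intermediate orthogonalization, the columns of Y tend to align with the
dominant singular directions and the smaller ones are lost to rounding.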

The problems that I see are in the out-of-core case.  There, computing A'A 
can often give pathologically bad results if the sparsity pattern is highly 
skewed.  That approach also leads to significant fill-in, which is not a good 
thing.  On the other hand, multiplying A by anything too large to store in 
memory, as B typically is, may be horribly bad as well. 

The orthogonalization is no big deal since it requires only a single pass 
through the data to accumulate the small matrix required for the Cholesky trick.
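
A minimal sketch of that Cholesky trick, in plain Python so that the single
pass over the rows is explicit. It assumes Y is tall (m x k, with k small) and
has full column rank; the function names are hypothetical. One pass
accumulates G = Y'Y, the Cholesky factor G = LL' gives R = L', and each row of
Q = Y R^{-1} is then recovered by solving L q = y.

```python
import math

def gram(rows, k):
    """One pass over the rows of Y, accumulating the k x k matrix G = Y'Y."""
    G = [[0.0] * k for _ in range(k)]
    for y in rows:
        for i in range(k):
            for j in range(k):
                G[i][j] += y[i] * y[j]
    return G

def cholesky(G):
    """Lower-triangular L with G = L L'. Assumes G is positive definite."""
    k = len(G)
    L = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            s = G[i][j] - sum(L[i][p] * L[j][p] for p in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def orthonormal_row(L, y):
    """Row of Q for a row y of Y: solve L q = y by forward substitution."""
    q = []
    for i in range(len(y)):
        s = y[i] - sum(L[i][p] * q[p] for p in range(i))
        q.append(s / L[i][i])
    return q

# Toy usage: only the 2 x 2 matrices G and L ever need to be held in memory.
Y = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
L = cholesky(gram(Y, 2))
Q = [orthonormal_row(L, y) for y in Y]
```

Note that forming Y'Y squares the condition number of Y, so this sketch is
only appropriate when Y is reasonably well conditioned.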

> Modified power iterations in existing SSVD code
> -----------------------------------------------
>
>                 Key: MAHOUT-796
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-796
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Math
>    Affects Versions: 0.5
>            Reporter: Dmitriy Lyubimov
>            Assignee: Dmitriy Lyubimov
>              Labels: SSVD
>             Fix For: 0.6
>
>
> Nathan Halko contacted me and pointed out the importance of making power 
> iterations available and their significant effect on the accuracy of smaller 
> eigenvalues and on noise attenuation. 
> Essentially, we would like to introduce yet another job parameter, q, that 
> governs the number of optional power iterations. A suggestion for how to 
> modify the algorithm is outlined here: 
> https://github.com/dlyubimov/ssvd-lsi/wiki/Power-iterations-scratchpad .
> Note that it differs from the original power iteration formula in the paper 
> in that an additional orthogonalization is performed after each 
> iteration. Nathan points out that this reduces errors in the smaller 
> eigenvalues a lot (if I interpret it right). 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
