[ 
https://issues.apache.org/jira/browse/MAHOUT-796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093388#comment-13093388
 ] 

Dmitriy Lyubimov commented on MAHOUT-796:
-----------------------------------------

I re-did the dense test to construct a 20,000x1,000 dense matrix (20 million 
non-zero elements) with random singular vectors.

The way I construct the input: I generate random singular vector matrices, 
orthogonalize them using stable Gram-Schmidt, multiply one of them (whichever 
is shorter) by Sigma, and then produce the surrogate input row-wise. 
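
A minimal sketch of that construction, in plain Python rather than the actual 
Mahout test code. The function names (mgs_columns, random_orthonormal, 
surrogate_rows) and the small demo sizes are assumptions made for illustration 
only; "stable Gram-Schmidt" is taken here to mean modified Gram-Schmidt.

```python
import random

def mgs_columns(cols):
    """Orthonormalize a list of vectors with modified Gram-Schmidt."""
    basis = []
    for v in cols:
        w = list(v)
        for q in basis:
            # Project against the already-updated w, not the original v;
            # this is what distinguishes the modified (stable) variant.
            dot = sum(a * b for a, b in zip(q, w))
            w = [a - dot * b for a, b in zip(w, q)]
        norm = sum(a * a for a in w) ** 0.5
        basis.append([a / norm for a in w])
    return basis

def random_orthonormal(length, count, rng):
    """Return `count` orthonormal vectors, each with `length` entries."""
    raw = [[rng.gauss(0.0, 1.0) for _ in range(length)] for _ in range(count)]
    return mgs_columns(raw)

def surrogate_rows(m, n, sigma, seed=0):
    """Yield the rows of A = U * diag(sigma) * V^T one at a time."""
    rng = random.Random(seed)
    u_cols = random_orthonormal(m, n, rng)  # left singular vectors (m x n)
    v_cols = random_orthonormal(n, n, rng)  # right singular vectors (n x n)
    # Scale the shorter factor (V) by Sigma: column j times sigma[j].
    sv_cols = [[sigma[j] * x for x in v_cols[j]] for j in range(n)]
    for i in range(m):
        yield [sum(u_cols[j][i] * sv_cols[j][c] for j in range(n))
               for c in range(n)]

# Hypothetical small-scale demo (the real test used m=20000, n=1000).
sigma = [10.0, 4.0, 1.0] + [0.1] * 7
rows = list(surrogate_rows(40, 10, sigma))
```

Since U and V have orthonormal columns, the squared Frobenius norm of the 
generated matrix should match the sum of the squared singular values, which 
gives a cheap sanity check on the generator.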

For predefined singular values = 10,4,1,(0.1...), n=1000, m=20000, k=3, p=10 I 
get the following stochastic values:
--SSVD solver singular values:
svs: 9.998401  3.998322  0.972622  0.100000  0.100000  0.100000  0.100000  
0.100000  0.100000  0.100000  0.100000  0.100000  0.100000  


So you can see that if the decay is good, the precision loss with 1 pass in my 
case doesn't exceed 2.8% in the worst case (3rd value), and the time is quite 
good. (Same branch as for MAHOUT-797 in my github.) 
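
For reference, the relative errors can be recomputed from the numbers quoted 
above. This is just arithmetic on the printed values, not solver output; the 
variable names are made up for the example.

```python
# Predefined singular values vs. the values printed by the solver (k + p = 13).
expected = [10.0, 4.0, 1.0] + [0.1] * 10
computed = [9.998401, 3.998322, 0.972622] + [0.100000] * 10

# Relative error of each computed value against its predefined counterpart.
rel_err = [abs(e - c) / e for e, c in zip(expected, computed)]

# Index of the largest relative error.
worst = max(range(len(rel_err)), key=rel_err.__getitem__)
print(f"worst relative error: {rel_err[worst]:.4%} at value #{worst + 1}")
```

The largest error lands on the 3rd value at roughly 2.74%, consistent with the 
2.8% bound stated above; the first two values are off by well under 0.1%.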

Keep in mind that this precision loss also includes loss generated during 
simulated input construction; it's not all the solver's.

I also got rid of the BBt job and fixed problems with sparse input on that branch.



> Modified power iterations in existing SSVD code
> -----------------------------------------------
>
>                 Key: MAHOUT-796
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-796
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Math
>    Affects Versions: 0.5
>            Reporter: Dmitriy Lyubimov
>            Assignee: Dmitriy Lyubimov
>              Labels: SSVD
>             Fix For: 0.6
>
>
> Nathan Halko contacted me and pointed out the importance of making power 
> iterations available, and their significant effect on the accuracy of smaller 
> eigenvalues and on noise attenuation. 
> Essentially, we would like to introduce yet another job parameter, q, that 
> governs the number of optional power iterations. The suggestion for how to 
> modify the algorithm is outlined here: 
> https://github.com/dlyubimov/ssvd-lsi/wiki/Power-iterations-scratchpad .
> Note that it differs from the original power iterations formula in the paper 
> in that an additional orthogonalization is performed after each 
> iteration. Nathan points out that this greatly reduces errors in the smaller 
> eigenvalues (if I interpret it right). 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
