[jira] [Issue Comment Edited] (MAHOUT-796) Modified power iterations in existing SSVD code

Dmitriy Lyubimov (JIRA) Sat, 27 Aug 2011 14:47:04 -0700

    [ https://issues.apache.org/jira/browse/MAHOUT-796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092385#comment-13092385 ]


Dmitriy Lyubimov edited comment on MAHOUT-796 at 8/27/11 9:45 PM:
------------------------------------------------------------------

PS it also just occurred to me that the full B' never has to be written out 
either, because BB' can be accumulated in the reducers of the 2nd step. The 
front end then only has to sum, directly, the few triangular partial BB' 
products produced by however many reducers there are (we don't save the full 
BB' since it's symmetric, nor do we compute the full B). That saves the full 
B' I/O and avoids the startup cost of a separate BB' job. 
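
A minimal sketch of that accumulation, in plain Python rather than the actual 
Mahout code. It assumes each reducer sees some rows of B' (each row is a vector 
of length n = k+p) and emits only a packed upper triangle of its partial BB'. 
The function names (packed_index, partial_bbt, aggregate) are hypothetical and 
chosen for illustration only.

```python
def packed_index(i, j, n):
    """Offset of element (i, j), with i <= j, in a row-major packed upper triangle."""
    return i * n - i * (i - 1) // 2 + (j - i)


def partial_bbt(rows_of_bt, n):
    """Reducer side (assumed): sum the outer products b * b' over the rows of B'
    seen by this reducer, keeping only the upper triangle."""
    acc = [0.0] * (n * (n + 1) // 2)
    for b in rows_of_bt:
        for i in range(n):
            for j in range(i, n):
                acc[packed_index(i, j, n)] += b[i] * b[j]
    return acc


def aggregate(partials, n):
    """Front end (assumed): add the packed partial triangles element-wise and
    mirror the result into a full symmetric n x n matrix."""
    total = [sum(vals) for vals in zip(*partials)]
    full = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            v = total[packed_index(i, j, n)]
            full[i][j] = v
            full[j][i] = v
    return full


# Two hypothetical reducers, each holding a slice of the rows of B'.
reducer_1 = partial_bbt([[1.0, 2.0], [3.0, 4.0]], 2)
reducer_2 = partial_bbt([[5.0, 6.0]], 2)
bbt = aggregate([reducer_1, reducer_2], 2)
```

Since BB' is only (k+p) x (k+p), each packed triangle is small, so summing 
them in the front end should be cheap compared with writing out B' itself.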

Thus, the full job is 2 MR passes with q=0, 4 MR passes with q=1 and 6 MR 
passes with q=2. If I understand Nathan's point right, 3 orthogonalizations 
(which correspond to q=2) are quite enough.

The V and U jobs are optional and run in parallel with each other, so 
together they count as one more iteration.

So with q=2 (the maximum case) and U, V output requested, we end up with 7 
*sequential* MR iterations.
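
The counts above follow a simple pattern: 2 passes for the base job, 2 more per 
power iteration, and 1 more if U and/or V are requested. A small sketch of that 
arithmetic (the function name and the compute_uv flag are hypothetical, not 
actual SSVD job options):

```python
def sequential_mr_passes(q, compute_uv=False):
    """Number of sequential MR passes, per the counts given in this comment."""
    if q < 0:
        raise ValueError("q must be non-negative")
    passes = 2 + 2 * q   # 2 base passes plus 2 per power iteration
    if compute_uv:
        passes += 1      # U and V jobs run in parallel: one extra sequential step
    return passes


counts = [sequential_mr_passes(q) for q in range(3)]
with_uv = sequential_mr_passes(2, compute_uv=True)
```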


      was (Author: dlyubimov):
    PS it also just occurred to me that the full B' never has to be 
written out either and can be computed in the reducers of the 2nd step. The 
front end then only has to sum, directly, the few triangular partial BB' 
products produced by however many reducers there are (we don't save the full 
BB' since it's symmetric, nor do we compute the full B). That saves the full 
B' I/O and avoids the startup cost of a separate BB' job. 

Thus, the full job is 2 MR passes with q=0, 4 MR passes with q=1 and 6 MR 
passes with q=2. If I understand Nathan's point right, 3 orthogonalizations 
(which correspond to q=2) are quite enough.

The V and U jobs are optional and run in parallel with each other, so 
together they count as one more iteration.

So with q=2 (the maximum case) and U, V output requested, we end up with 7 
*sequential* MR iterations.

  
> Modified power iterations in existing SSVD code
> -----------------------------------------------
>
>                 Key: MAHOUT-796
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-796
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Math
>    Affects Versions: 0.5
>            Reporter: Dmitriy Lyubimov
>            Assignee: Dmitriy Lyubimov
>              Labels: SSVD
>             Fix For: 0.6
>
>
> Nathan Halko contacted me and pointed out the importance of making power 
> iterations available, given their significant effect on the accuracy of 
> smaller eigenvalues and on noise attenuation. 
> Essentially, we would like to introduce yet another job parameter, q, that 
> governs the number of optional power iterations. A suggestion for how to 
> modify the algorithm is outlined here: 
> https://github.com/dlyubimov/ssvd-lsi/wiki/Power-iterations-scratchpad .
> Note that it differs from the original power iterations formula in the paper 
> in that an additional orthogonalization is performed after each 
> iteration. Nathan points out that this reduces errors in smaller eigenvalues 
> a lot (if I interpret it right). 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
