Space: Apache Mahout (https://cwiki.apache.org/confluence/display/MAHOUT)
Page: Stochastic Singular Value Decomposition (https://cwiki.apache.org/confluence/display/MAHOUT/Stochastic+Singular+Value+Decomposition)
Edited by Dmitriy Lyubimov:
---------------------------------------------------------------------
The Stochastic SVD (SSVD) method in Mahout produces a reduced-rank Singular Value Decomposition in its strict mathematical definition: A=USV'.

h5. The benefits over other methods are:

* fewer flops required than with Krylov subspace methods
* in the map-reduce world, a fixed number of MR iterations is required regardless of the rank requested
* the precision/speed balance can be tuned with options
* A is a Distributed Row Matrix whose rows may be identified by any Writable (such as a document path). As such, it works directly on the output of seq2sparse.

h5. Map-reduce characteristics:

SSVD uses at most 3 MR steps (map-only + map-reduce + optional map-reduce) to produce a reduced-rank approximation of the U, V and S matrices. Two more map-reduce steps are added for each power iteration requested, so a run with q power iterations takes at most 3 + 2q MR steps.

h5. Potential drawbacks:

* potentially less precise (but adding even one power iteration seems to improve precision quite a bit).

h5. Documentation

[Overview and Usage|^SSVD-CLI.pdf]

(Todo: add a tutorial example.)
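The reduced-rank decomposition A=USV' and the effect of power iterations can be illustrated with a small in-memory sketch. This is not Mahout's implementation, which runs as distributed map-reduce jobs over a Distributed Row Matrix; it only shows the general random-projection idea behind stochastic SVD methods, using NumPy. The function name {{stochastic_svd}} and the parameters {{oversampling}} and {{power_iters}} are hypothetical names chosen for this example and do not correspond to Mahout options.

```python
import numpy as np


def stochastic_svd(a, rank, oversampling=10, power_iters=0, seed=0):
    """Illustrative in-memory randomized SVD: A ~= U diag(s) V'.

    Hypothetical helper for explanation only; not the Mahout API.
    """
    rng = np.random.default_rng(seed)
    m, n = a.shape
    # Sample a few extra directions beyond the requested rank.
    width = min(rank + oversampling, m, n)

    # Project A onto a random subspace and orthonormalize the result.
    omega = rng.standard_normal((n, width))
    q, _ = np.linalg.qr(a @ omega)

    # Each optional power iteration makes two more passes over A
    # and sharpens the captured subspace.
    for _ in range(power_iters):
        z, _ = np.linalg.qr(a.T @ q)
        q, _ = np.linalg.qr(a @ z)

    # Decompose the small projected matrix B = Q'A, then map U back.
    b = q.T @ a
    u_small, s, vt = np.linalg.svd(b, full_matrices=False)
    u = q @ u_small

    # Truncate to the requested rank; V is returned untransposed.
    return u[:, :rank], s[:rank], vt[:rank, :].T


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    # Synthetic 60 x 40 matrix of exact rank 5.
    a = rng.standard_normal((60, 5)) @ rng.standard_normal((5, 40))
    u, s, v = stochastic_svd(a, rank=5, power_iters=1)
    err = np.linalg.norm(a - (u * s) @ v.T) / np.linalg.norm(a)
    print("relative reconstruction error:", err)
```

In this sketch each power iteration costs two additional multiplications by A, which mirrors the two extra map-reduce steps per power iteration described above.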
