[ https://issues.apache.org/jira/browse/MAHOUT-836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13123798#comment-13123798 ]
Sujit Nair commented on MAHOUT-836:
-----------------------------------
The main RPCA algorithm is iterative, and each iteration requires an
(expensive) SVD, so the computation scales as O(N^3 * num_iter). I blogged
about it at
http://nairanalytics.com/Research_Blog/2011/09/05/the-low-rank-and-sparsity-technique-for-data-recovery/
to explain the main idea.
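
For readers new to RPCA, the decomposition is often posed as the convex program below (commonly called principal component pursuit). The comment does not say which formulation the donated code uses, so this is an assumed, illustrative form; the symbols M, L, S and lambda do not appear in the original text.

```latex
\[
\begin{aligned}
\min_{L,\,S}\quad & \|L\|_{*} + \lambda\,\|S\|_{1} \\
\text{subject to}\quad & M = L + S
\end{aligned}
\]
```

Here M is the observed data matrix, L the low-rank part, S the sparse part, \|L\|_{*} the nuclear norm (sum of singular values), \|S\|_{1} the entrywise l1 norm, and lambda > 0 a weight. Typical solvers for this program alternate a singular value thresholding step on L, which needs an SVD, with an elementwise shrinkage step on S. That SVD is the source of the O(N^3) cost per iteration mentioned above.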
The difference between RPCA and PCA is as follows:
1. PCA is robust to small, i.i.d. Gaussian noise.
2. RPCA is robust to large, sparse i.i.d. noise.
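
As a rough illustration of item 2: in many RPCA solvers the sparse component is estimated by elementwise soft-thresholding, which zeroes small entries and shrinks large ones toward zero. The sketch below shows only that step. The class name SoftThreshold, the method shrink and the parameter tau are hypothetical names chosen for this example; this is not the code offered in this issue and not a Mahout API.

```java
// Minimal sketch of elementwise soft-thresholding (shrinkage).
// Assumption: this mirrors the sparse-update step used by common RPCA
// solvers; it is not taken from the code attached to MAHOUT-836.
final class SoftThreshold {

    private SoftThreshold() {
    }

    // Shrinks a single value toward zero by tau; values with |x| <= tau become 0.
    static double shrink(double x, double tau) {
        if (x > tau) {
            return x - tau;
        }
        if (x < -tau) {
            return x + tau;
        }
        return 0.0;
    }

    // Applies the scalar shrinkage to every entry and returns a new matrix.
    static double[][] shrink(double[][] m, double tau) {
        double[][] out = new double[m.length][];
        for (int i = 0; i < m.length; i++) {
            out[i] = new double[m[i].length];
            for (int j = 0; j < m[i].length; j++) {
                out[i][j] = shrink(m[i][j], tau);
            }
        }
        return out;
    }
}
```

With tau = 1.0, entries such as -0.5 and 0.2 are set to zero while 3.0 and -2.0 survive with reduced magnitude, which is why only large, isolated corruptions end up in the sparse part.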
It has applications in surveillance, LSI, CF, etc., so I assume it will be of
great interest here in the Mahout community. I have also done some (rather
limited) experiments where I replace the SVD in each iteration with randomized
SVD (a.k.a. SSVD in the Mahout community). The results are not that bad.
Thanks,
Sujit
> On donating my Robust PCA Java code to Mahout
> ---------------------------------------------
>
> Key: MAHOUT-836
> URL: https://issues.apache.org/jira/browse/MAHOUT-836
> Project: Mahout
> Issue Type: New JIRA Project
> Components: Classification
> Environment: Platform independent
> Reporter: Sujit Nair
> Labels: newbie
> Original Estimate: 672h
> Remaining Estimate: 672h
>
> Hi All,
> I have an implementation of Robust PCA (a.k.a. low-rank and sparse
> decomposition) in Java which I would like to donate to Mahout. I am a MATLAB
> expert, comfortable with C++ and have just started with Java. I am completely
> new to Mahout but am very excited to participate and contribute.
> I have tested my code exhaustively and there do not seem to be any issues.
> The results are very good but the code definitely needs some optimization.
> Please let me know if there is interest.
> Thanks,
> Sujit