[ https://issues.apache.org/jira/browse/MAHOUT-817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210548#comment-13210548 ]
[email protected] commented on MAHOUT-817:
------------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3863/
-----------------------------------------------------------
(Updated 2012-02-17 20:50:01.339012)
Review request for mahout.
Changes
-------
commit 95d5934405d1ca51e13439a43e0fc793418e5d37
Author: Dmitriy Lyubimov <[email protected]>
Date: Fri Feb 17 12:48:37 2012 -0800
Fixing option recovery based on new api changes
Summary
-------
2d542fd4dfcc6e01577bddc28600632a88e358ee Merge remote-tracking branch 'apache/trunk' into MAHOUT-817
1f245bb5cc1354e7495ec62fbc5f41ed6d590210 Merge branch 'trunk' into MAHOUT-817
458d8112de180c93d5194d67ccfc00442ed1d460 Merge remote-tracking branch 'apache/trunk' into MAHOUT-817
3fea9bd981043e268dd003d4c6c3943bb570c0f7 added test, bug fixes
2725c1061c167126238d288039f0f68baafa7dc8 adding --pca and --pcaOffset options, minor fixes
48c7b425241afff42ce52d3bb005a87aeb68386d fixing front end to factor in the median data.
4e072615ac2b8a256d037aaf00db21820abb91e2 tweaking B' job to produce necessary correctors s_q and s_b
b10fefd8d4aa5a0ed2f60902904d551afbbdf57e cosmetic fixes
849171d3af75117a2ee1115e6d5fc8e4a1fff5ce comment
6c196ea9606b3ca05d401fa1474ee9262a6c0303 retrofitting V job to do pca correction
e6fbe7cdb606698f180127302c33d30fffc6c4d7 adding pca options to Q,ABt jobs. still need to work on B'-job, V-job and front-end pca corrections.
ecf5dd21c5d5805d70715a78abd07246d171536c Computing s_b0
b9b33cf72af85ade16fcfbf4e13a036877489afb comments
9bb6e971c68e0674b087b8c5d64f4967878f1834 More cleanup in favor of standard functions, unit tests pass but need to verify the 2G benchmark.
39faa70158b52e50d31aca2abc4006874a9ea8fd cleanup I
780b291eb902e0e832d41748d45bf6d2163f9537 cosmetic changes, adding api without redundant parameters
02daf0024489305032320c578ac546c16bda31c1 current MAHOUT-923 patch from Raphael
This addresses bug MAHOUT-817.
https://issues.apache.org/jira/browse/MAHOUT-817
Diffs (updated)
-----
core/src/main/java/org/apache/mahout/math/hadoop/DistributedRowMatrix.java 3e0dd5e
core/src/main/java/org/apache/mahout/math/hadoop/MatrixColumnMeansJob.java PRE-CREATION
core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/ABtDenseOutJob.java c52fe2a
core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/BtJob.java 0c3a996
core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/Omega.java 0fa8707
core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/PartialRowEmitter.java 59bdedb
core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/QJob.java 703c420
core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/SSVDCli.java d314186
core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/SSVDHelper.java PRE-CREATION
core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/SSVDPrototype.java 98c8c59
core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/SSVDSolver.java b1a8b56
core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/UJob.java 53f26f4
core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/VJob.java d58789e
core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/YtYJob.java bd8c6b1
core/src/test/java/org/apache/mahout/math/hadoop/TestDistributedRowMatrix.java 0ef8622
core/src/test/java/org/apache/mahout/math/hadoop/stochasticsvd/LocalSSVDPCADenseTest.java PRE-CREATION
core/src/test/java/org/apache/mahout/math/hadoop/stochasticsvd/LocalSSVDSolverDenseTest.java 59f79c5
core/src/test/java/org/apache/mahout/math/hadoop/stochasticsvd/LocalSSVDSolverSparseSequentialTest.java beb0102
core/src/test/java/org/apache/mahout/math/hadoop/stochasticsvd/SSVDCommonTest.java PRE-CREATION
core/src/test/java/org/apache/mahout/math/hadoop/stochasticsvd/SSVDPrototypeTest.java 503433f
core/src/test/java/org/apache/mahout/math/hadoop/stochasticsvd/SSVDTestsHelper.java 32342c1
Diff: https://reviews.apache.org/r/3863/diff
Testing
-------
Additional unit tests for PCA
Thanks,
Dmitriy
> Add PCA options to SSVD code
> ----------------------------
>
> Key: MAHOUT-817
> URL: https://issues.apache.org/jira/browse/MAHOUT-817
> Project: Mahout
> Issue Type: New Feature
> Affects Versions: 0.6
> Reporter: Dmitriy Lyubimov
> Assignee: Dmitriy Lyubimov
> Fix For: 0.7
>
> Attachments: MAHOUT-817.patch, MAHOUT-817.patch, MAHOUT-817.patch,
> SSVD-PCA options.pdf, ssvd-tests.R, ssvd.R, ssvd.m
>
>
> It seems that a simple solution should exist to integrate PCA mean
> subtraction into the SSVD algorithm without making it a prerequisite step
> and without densifying the big input.
> Several approaches were suggested:
> 1) subtract the mean from B
> 2) propagate the mean vector deeper into the algorithm algebraically, to
> where the data is already collapsed to smaller matrices
> 3) --?
> It needs some math done first. I'll take a stab at 1 and 2, but thoughts and
> math are welcome.
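
The second approach in the description above can be illustrated with a small
sketch. It assumes the first SSVD step is the random projection Y = A*Omega and
relies on the identity (A - 1*xi') * Omega = A*Omega - 1*(xi'*Omega), where xi
is the vector of column means: the sparse product A*Omega is computed as usual
and a single small correction row is subtracted afterwards, so A itself is
never mean-centered or densified. The class and method names below are
hypothetical and are not taken from the patch; dense arrays are used only for
brevity, with the zero check standing in for iteration over a sparse row.

```java
import java.util.Arrays;

// Hypothetical illustration only; none of these names come from the Mahout patch.
class MeanShiftSketch {

  // Column means xi of A (length n for an m x n input).
  static double[] columnMeans(double[][] a) {
    int n = a[0].length;
    double[] xi = new double[n];
    for (double[] row : a) {
      for (int j = 0; j < n; j++) {
        xi[j] += row[j];
      }
    }
    for (int j = 0; j < n; j++) {
      xi[j] /= a.length;
    }
    return xi;
  }

  // Y = (A - 1*xi') * Omega, computed as A*Omega - 1*(xi'*Omega).
  // Zero entries of A are skipped, so the centered matrix is never formed.
  static double[][] projectCentered(double[][] a, double[][] omega) {
    int n = omega.length;
    int k = omega[0].length;
    double[] xi = columnMeans(a);

    // Small 1 x k correction row: xi' * Omega.
    double[] shift = new double[k];
    for (int j = 0; j < n; j++) {
      for (int c = 0; c < k; c++) {
        shift[c] += xi[j] * omega[j][c];
      }
    }

    double[][] y = new double[a.length][k];
    for (int i = 0; i < a.length; i++) {
      for (int j = 0; j < n; j++) {
        if (a[i][j] == 0.0) {
          continue; // only nonzero entries contribute to A*Omega
        }
        for (int c = 0; c < k; c++) {
          y[i][c] += a[i][j] * omega[j][c];
        }
      }
      // Apply the same correction row to every output row.
      for (int c = 0; c < k; c++) {
        y[i][c] -= shift[c];
      }
    }
    return y;
  }

  public static void main(String[] args) {
    double[][] a = {{1, 0}, {0, 2}, {3, 0}};
    double[][] omega = {{1}, {2}};
    System.out.println(Arrays.deepToString(projectCentered(a, omega)));
  }
}
```

The correction costs O(n*k) once plus O(k) per row, independent of how sparse A
is, which is the appeal of pushing the mean into the already-collapsed
matrices rather than subtracting it from the input.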