[ https://issues.apache.org/jira/browse/MAHOUT-817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210536#comment-13210536 ]

[email protected] commented on MAHOUT-817:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3863/
-----------------------------------------------------------

(Updated 2012-02-17 20:43:22.593328)


Review request for mahout.


Changes
-------

commit 996464eb600400745baf25498606aca115cb7e96
Merge: cd48627 aa7e1d8
Author: Dmitriy Lyubimov <[email protected]>
Date:   Fri Feb 17 12:40:26 2012 -0800

    Merge remote-tracking branch 'apache/trunk' into MAHOUT-817
    
    Conflicts:
        core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/SSVDCli.java


Summary
-------


2d542fd4dfcc6e01577bddc28600632a88e358ee Merge remote-tracking branch 'apache/trunk' into MAHOUT-817
1f245bb5cc1354e7495ec62fbc5f41ed6d590210 Merge branch 'trunk' into MAHOUT-817
458d8112de180c93d5194d67ccfc00442ed1d460 Merge remote-tracking branch 'apache/trunk' into MAHOUT-817
3fea9bd981043e268dd003d4c6c3943bb570c0f7 added test, bug fixes
2725c1061c167126238d288039f0f68baafa7dc8 adding --pca and --pcaOffset options, minor fixes
48c7b425241afff42ce52d3bb005a87aeb68386d fixing front end to factor in the median data.
4e072615ac2b8a256d037aaf00db21820abb91e2 tweaking B' job to produce necessary correctors s_q and s_b
b10fefd8d4aa5a0ed2f60902904d551afbbdf57e cosmetic fixes
849171d3af75117a2ee1115e6d5fc8e4a1fff5ce comment
6c196ea9606b3ca05d401fa1474ee9262a6c0303 retrofitting V job to do pca correction
e6fbe7cdb606698f180127302c33d30fffc6c4d7 adding pca options to Q,ABt jobs. still need to work on B'-job, V-job and front-end pca corrections.
ecf5dd21c5d5805d70715a78abd07246d171536c Computing s_b0
b9b33cf72af85ade16fcfbf4e13a036877489afb comments
9bb6e971c68e0674b087b8c5d64f4967878f1834 More cleanup in favor of standard functions, unit tests pass but need to verify the 2G benchmark.
39faa70158b52e50d31aca2abc4006874a9ea8fd cleanup I
780b291eb902e0e832d41748d45bf6d2163f9537 cosmetic changes, adding api with out redundant parameters
02daf0024489305032320c578ac546c16bda31c1 current MAHOUT-923 patch from Raphael


This addresses bug MAHOUT-817.
    https://issues.apache.org/jira/browse/MAHOUT-817


Diffs (updated)
-----

  core/src/main/java/org/apache/mahout/math/hadoop/DistributedRowMatrix.java 3e0dd5e 
  core/src/main/java/org/apache/mahout/math/hadoop/MatrixColumnMeansJob.java PRE-CREATION 
  core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/ABtDenseOutJob.java c52fe2a 
  core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/BtJob.java 0c3a996 
  core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/Omega.java 0fa8707 
  core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/PartialRowEmitter.java 59bdedb 
  core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/QJob.java 703c420 
  core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/SSVDCli.java d314186 
  core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/SSVDHelper.java PRE-CREATION 
  core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/SSVDPrototype.java 98c8c59 
  core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/SSVDSolver.java b1a8b56 
  core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/UJob.java 53f26f4 
  core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/VJob.java d58789e 
  core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/YtYJob.java bd8c6b1 
  core/src/test/java/org/apache/mahout/math/hadoop/TestDistributedRowMatrix.java 0ef8622 
  core/src/test/java/org/apache/mahout/math/hadoop/stochasticsvd/LocalSSVDPCADenseTest.java PRE-CREATION 
  core/src/test/java/org/apache/mahout/math/hadoop/stochasticsvd/LocalSSVDSolverDenseTest.java 59f79c5 
  core/src/test/java/org/apache/mahout/math/hadoop/stochasticsvd/LocalSSVDSolverSparseSequentialTest.java beb0102 
  core/src/test/java/org/apache/mahout/math/hadoop/stochasticsvd/SSVDCommonTest.java PRE-CREATION 
  core/src/test/java/org/apache/mahout/math/hadoop/stochasticsvd/SSVDPrototypeTest.java 503433f 
  core/src/test/java/org/apache/mahout/math/hadoop/stochasticsvd/SSVDTestsHelper.java 32342c1 

Diff: https://reviews.apache.org/r/3863/diff


Testing
-------

Additional unit tests for PCA


Thanks,

Dmitriy


                
> Add PCA options to SSVD code
> ----------------------------
>
>                 Key: MAHOUT-817
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-817
>             Project: Mahout
>          Issue Type: New Feature
>    Affects Versions: 0.6
>            Reporter: Dmitriy Lyubimov
>            Assignee: Dmitriy Lyubimov
>             Fix For: 0.7
>
>         Attachments: MAHOUT-817.patch, MAHOUT-817.patch, MAHOUT-817.patch, 
> SSVD-PCA options.pdf, ssvd-tests.R, ssvd.R, ssvd.m
>
>
> It seems that a simple solution should exist to integrate PCA mean 
> subtraction into the SSVD algorithm without making it a prerequisite step 
> and without densifying the big input. 
> Several approaches were suggested:
> 1) subtract the mean off B
> 2) propagate the mean vector deeper into the algorithm algebraically, to 
> where the data is already collapsed to smaller matrices
> 3) --?
> It needs some math done first. I'll take a stab at 1 and 2, but thoughts and 
> math are welcome.
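
To illustrate approach 2 from the issue description, here is a minimal sketch
of the underlying identity. It is not the code in this patch. If xi is the
vector of column means of A and Omega is the projection matrix, then
(A - 1*xi^T)*Omega = A*Omega - 1*(xi^T*Omega), so the mean only has to be
applied to the small k-wide product and the sparse input stays sparse. The
class name, the method name, and the map-based sparse row representation are
assumptions made for this example and do not correspond to Mahout APIs.

```java
import java.util.List;
import java.util.Map;

public class CenteredProjectionSketch {

    // Computes Y = (A - 1 * xi^T) * Omega without forming the dense matrix A - 1 * xi^T.
    // rows:  sparse rows of A, each a map from column index to nonzero value (hypothetical layout)
    // xi:    column means of A, length n
    // omega: n x k projection matrix
    public static double[][] project(List<Map<Integer, Double>> rows,
                                     double[] xi,
                                     double[][] omega) {
        int k = omega[0].length;

        // s = xi^T * Omega: one k-vector, computed once and shared by every row.
        double[] s = new double[k];
        for (int j = 0; j < xi.length; j++) {
            for (int c = 0; c < k; c++) {
                s[c] += xi[j] * omega[j][c];
            }
        }

        double[][] y = new double[rows.size()][k];
        for (int i = 0; i < rows.size(); i++) {
            // Sparse part: only the nonzero entries of row i touch Omega.
            for (Map.Entry<Integer, Double> e : rows.get(i).entrySet()) {
                double[] omegaRow = omega[e.getKey()];
                for (int c = 0; c < k; c++) {
                    y[i][c] += e.getValue() * omegaRow[c];
                }
            }
            // Mean correction applied to the already collapsed k-wide row.
            for (int c = 0; c < k; c++) {
                y[i][c] -= s[c];
            }
        }
        return y;
    }

    public static void main(String[] args) {
        // A = [[2, 0], [0, 4]] stored sparsely; its column means are (1, 2).
        List<Map<Integer, Double>> a = List.of(Map.of(0, 2.0), Map.of(1, 4.0));
        double[] xi = {1.0, 2.0};
        double[][] omega = {{1.0}, {1.0}}; // fixed stand-in for a random projection
        double[][] y = project(a, xi, omega);
        System.out.println(y[0][0] + " " + y[1][0]);
    }
}
```

The per-row cost stays proportional to the number of nonzeros in the row plus
k, which is the point of pushing the mean past the projection rather than
subtracting it from the input up front.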

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
