[ 
https://issues.apache.org/jira/browse/MAHOUT-817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13113007#comment-13113007
 ] 

Dmitriy Lyubimov commented on MAHOUT-817:
-----------------------------------------

Why would we want to support both row and column mean subtraction? I need to 
re-read the motivation for this.

I think a lot also hinges on the question of whether we actually want to 
_output_ the mean as well.

The next question is whether we want to spend one additional pass just to 
find the mean. If yes, then the rest is easy: we will just do the mean 
subtraction as part of the Y computation. That should be OK flops-wise.
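
To make the "subtract as part of Y" idea concrete, here is a rough sketch, not 
the actual SSVD code. It assumes we subtract the column mean xi (one value per 
column of A), that the first pass over the input has already produced xi, and 
that Y = A*Omega as in the usual SSVD flow. The names column_means and 
centered_y and the dict-per-row sparse layout are made up for illustration. 
The point is that (A - 1*xi')*Omega = A*Omega - 1*(xi'*Omega), so the 
correction is a single k-vector subtracted from every row of Y and the sparse 
input is never densified.

```python
import random


def column_means(rows, n_cols):
    """One pass over sparse rows ({col: value} dicts) to get the column mean xi."""
    sums = [0.0] * n_cols
    for row in rows:
        for j, v in row.items():
            sums[j] += v
    m = len(rows)
    return [s / m for s in sums]


def centered_y(rows, omega, xi):
    """Compute Y = (A - 1 xi^T) Omega without forming the dense centered A.

    Uses Y = A Omega - 1 (xi^T Omega): the correction xi^T Omega is one
    k-vector that is subtracted from every row of A Omega.
    """
    k = len(omega[0])
    # xi^T Omega, computed once up front (k values)
    shift = [sum(xi[j] * omega[j][c] for j in range(len(xi))) for c in range(k)]
    y = []
    for row in rows:
        # start from the negated shift, then add the sparse row times Omega;
        # only the non-zero entries of the row are touched
        y_row = [-s for s in shift]
        for j, v in row.items():
            for c in range(k):
                y_row[c] += v * omega[j][c]
        y.append(y_row)
    return y


# Tiny illustrative input: 3 sparse rows, 3 columns, k = 2 (all values made up)
rows = [{0: 1.0, 2: 2.0}, {1: 3.0}, {0: 4.0, 1: 1.0, 2: 1.0}]
n_cols, k = 3, 2
rng = random.Random(0)
omega = [[rng.gauss(0, 1) for _ in range(k)] for _ in range(n_cols)]

xi = column_means(rows, n_cols)   # the extra pass
y = centered_y(rows, omega, xi)   # mean subtraction folded into Y
```

The extra work per row is k subtractions, which is why this should be fine 
flops-wise once the mean is known; the cost that matters is the additional 
pass that produces xi.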

But if we think we shouldn't wait for the mean computation as a separate 
pass, and we don't want to output it either, then that's where it becomes a 
little tricky.


> Add PCA options to SSVD code
> ----------------------------
>
>                 Key: MAHOUT-817
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-817
>             Project: Mahout
>          Issue Type: New Feature
>    Affects Versions: 0.6
>            Reporter: Dmitriy Lyubimov
>            Assignee: Dmitriy Lyubimov
>             Fix For: 0.6
>
>
> It seems that a simple solution should exist to integrate PCA mean 
> subtraction into the SSVD algorithm without making it a prerequisite step 
> and while also avoiding densifying the big input. 
> Several approaches were suggested:
> 1) subtract the mean off B
> 2) propagate the mean vector deeper into the algorithm algebraically, to 
> where the data is already collapsed to smaller matrices
> 3) --?
> It needs some math done first. I'll take a stab at 1 and 2, but thoughts and 
> math are welcome.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira