[ 
https://issues.apache.org/jira/browse/MADLIB-948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15126277#comment-15126277
 ] 

Frank McQuillan commented on MADLIB-948:
----------------------------------------

The plan is to refactor the SVD code so that the computation of the singular 
values is separated from the computation of the left and right singular 
matrices. This would allow PCA to call just the singular-values part and find 
the right value of 'k' for the given variance_proportion. We believe the 
additional performance cost is not high, since the original method computed 
all singular values anyway.
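
As a rough illustration of the "find the right value of 'k'" step, here is a 
minimal Python sketch. It assumes the singular values of the centered data 
matrix are already available, so that the variance along each principal 
component is proportional to the squared singular value. The name choose_k 
is hypothetical and is not part of the MADlib API.

```python
def choose_k(singular_values, variance_proportion):
    """Return the smallest k whose leading components reach the target.

    Hypothetical helper for illustration only, not a MADlib function.
    Assumes singular_values come from the centered (or standardized)
    data matrix, so squared singular values are proportional to the
    eigenvalues of the covariance (or correlation) matrix.
    """
    if not 0.0 < variance_proportion <= 1.0:
        raise ValueError("variance_proportion must be in (0, 1]")

    # Variance explained by each component, largest first.
    variances = sorted((s * s for s in singular_values), reverse=True)
    total = sum(variances)
    if total <= 0.0:
        raise ValueError("total variance must be positive")

    # Accumulate variance until the requested share of the total is reached.
    target = variance_proportion * total
    cumulative = 0.0
    for k, variance in enumerate(variances, start=1):
        cumulative += variance
        if cumulative >= target:
            return k
    return len(variances)
```

For example, choose_k([3.0, 2.0, 1.0], 0.9) returns 2, because the first two 
squared singular values (9 + 4 = 13) account for 13/14 of the total. Only the 
singular values are needed for this decision, which is why separating them 
from the singular vectors would let PCA pick 'k' before computing anything 
else.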

> Proportion of variance for PCA training function
> ------------------------------------------------
>
>                 Key: MADLIB-948
>                 URL: https://issues.apache.org/jira/browse/MADLIB-948
>             Project: Apache MADlib
>          Issue Type: New Feature
>            Reporter: Frank McQuillan
>            Priority: Minor
>             Fix For: v2.0
>
>
> In future iterations of the pca_train command, is it feasible to add 
> another optional parameter called variance_proportion? Instead of specifying 
> k principal components to compute, you would specify the proportion of 
> variance that you want your PCA vectors to account for. The number of 
> principal vectors generated would depend on the covariance matrix/correlation 
> matrix (depending on whether you normalized or not) and variance_proportion. 
> So if I specified that variance_proportion = .8, the algorithm would 
> terminate after obtaining enough principal vectors so that the ratio of the 
> sum of the eigenvalues collected thus far to the trace of the covariance 
> matrix/correlation matrix (the sum of all of the eigenvalues of the 
> covariance matrix/correlation matrix) is greater than or equal to .8. That 
> is, the algorithm would terminate after collecting enough vectors to account 
> for 80% of the total variance in the set of observations.
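
The stopping rule described in the request can be written compactly. The 
symbols here are assumptions for illustration: \lambda_1 \ge \dots \ge 
\lambda_d are the eigenvalues of the covariance (or correlation) matrix 
\Sigma, and p is variance_proportion.

```latex
k \;=\; \min\left\{ j \in \{1,\dots,d\} \;:\;
      \frac{\sum_{i=1}^{j} \lambda_i}{\operatorname{tr}(\Sigma)} \,\ge\, p \right\},
\qquad
\operatorname{tr}(\Sigma) \;=\; \sum_{i=1}^{d} \lambda_i .

% Worked example with made-up eigenvalues 4, 3, 2, 1 (trace 10) and p = 0.8:
% cumulative ratios are 0.4, 0.7, 0.9, 1.0, so k = 3.
```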



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
