[ 
https://issues.apache.org/jira/browse/MADLIB-947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15786369#comment-15786369
 ] 

ASF GitHub Bot commented on MADLIB-947:
---------------------------------------

GitHub user njayaram2 opened a pull request:

    https://github.com/apache/incubator-madlib/pull/84

    PCA: Add grouping support to PCA

    JIRA: MADLIB-947
    
    - PCA can now handle grouping columns. When the grouping_cols
    parameter is specified, pca_train() learns an independent model
    for each group in the input table. Columns corresponding to those
    named in grouping_cols are added to the output, mean, and summary
    tables.
    - If pca_project() is called on an input table that contains the
    grouping_cols, the pc_table passed to it must be a PCA model table
    that was learned with grouping_cols. If the input table has
    grouping columns but the pc_table does not support grouping_cols,
    or vice versa, an error is thrown.
    - The 'row_id' column in the input tables previously had to be
    serially increasing, starting from 1. That requirement is now
    relaxed: this commit converts the given 'row_id' to a new column
    that follows the rules laid out by the sparse and dense matrix
    formats.
    - Both the online and user docs are improved with more examples.
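
The row_id relaxation described in the list above can be illustrated with a small sketch. This is not the MADlib implementation; it only shows the general idea of mapping arbitrary row ids to a serial 1..N index, assuming the mapping is done independently within each group and ordered by the original row_id. The function name remap_row_ids and the output key serial_row_id are hypothetical.

```python
from collections import defaultdict


def remap_row_ids(rows, group_key=None):
    """Assign serial ids 1..N within each group, ordered by the original row_id.

    rows: iterable of dicts, each with a 'row_id' key (distinct, orderable values).
    group_key: optional name of a grouping column; None treats all rows as one group.
    Hypothetical helper for illustration only; not part of MADlib.
    """
    groups = defaultdict(list)
    for row in rows:
        key = row[group_key] if group_key is not None else None
        groups[key].append(row)

    remapped = []
    for members in groups.values():
        # Sort by the user-supplied row_id, then number the rows 1, 2, 3, ...
        ordered = sorted(members, key=lambda r: r["row_id"])
        for new_id, row in enumerate(ordered, start=1):
            remapped.append({**row, "serial_row_id": new_id})
    return remapped
```

With rows whose row_id values are 10, 3 (group "a") and 7 (group "b"), grouping on that column numbers the rows 1 and 2 within "a" and 1 within "b", whereas leaving group_key unset numbers all three rows 1 to 3 in row_id order.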

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/njayaram2/incubator-madlib features/pca-grouping-simple

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-madlib/pull/84.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #84
    
----
commit cfdddb490695782a38a56aa0c9a635c063fd916b
Author: Nandish Jayaram <[email protected]>
Date:   2016-12-21T22:18:38Z

    PCA: Add grouping support to PCA
    
    JIRA: MADLIB-947
    

----


> Support grouping for PCA
> ------------------------
>
>                 Key: MADLIB-947
>                 URL: https://issues.apache.org/jira/browse/MADLIB-947
>             Project: Apache MADlib
>          Issue Type: New Feature
>            Reporter: Frank McQuillan
>            Assignee: Nandish Jayaram
>             Fix For: v1.10
>
>
> Implement grouping support in PCA
> http://doc.madlib.net/latest/group__grp__pca__train.html#train



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
