GitHub user iyerr3 commented on a diff in the pull request:
https://github.com/apache/incubator-madlib/pull/33#discussion_r57504745
--- Diff: src/ports/postgres/modules/pca/pca.sql_in ---
@@ -324,13 +341,20 @@ string should be double-quoted; in this case the
input would be '"MyTable"').
\ref background_pca "Technical Background"), sparse matrices almost always
become dense during the training process. Thus, this implementation
automatically densifies sparse matrix input, and there should be no expected
- performance improvement in using sparse matrix input over dense matrix input.
-
-- If both <em>lanczos_iter</em> and proportion of variance (via the
-<em>grouping_cols</em>) is defined, <em>lanczos_iter</em> will
-take precedence in determining the number of principal components (i.e. the
-number of principal components will not be greater than <em>lanczos_iter</em>
-even if the target proportion is not reached).
+performance improvement in using sparse matrix input over dense matrix input.
+
+- For the parameter 'components_param', INTEGER and FLOAT are
+interpreted differently. A special case to be aware of:
+'components_param' = 1 (INTEGER) will return 1 principle
--- End diff --
again, principle -> principal