Github user sethah commented on the pull request:
https://github.com/apache/spark/pull/12419#issuecomment-215110515
@psuszyns This introduces a breaking change to the MLlib API, which we
should avoid since it is not strictly necessary. Looking at this more
carefully, the simplest approach seems to be to add this only in spark.ml:
request the full PCA from MLlib, then trim the components according to the
retained variance inside the spark.ml fit method. I'm not sure we ought to
expose this in MLlib at all, given that skipping it would avoid some of the
complexity. If we do, we need to do it in a way that does not break the APIs.
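To illustrate the "fit the full PCA, then trim" idea, here is a small numpy sketch of the logic (not Spark's actual API; the helper name `components_for_variance` and the toy data are made up for this example): compute all principal components, then keep only the smallest prefix whose cumulative explained-variance ratio reaches the requested threshold.

```python
import numpy as np

def components_for_variance(explained_variance, threshold):
    """Smallest k such that the top-k components retain >= threshold of total variance."""
    ratios = np.asarray(explained_variance, dtype=float)
    ratios = ratios / ratios.sum()
    cumulative = np.cumsum(ratios)
    # searchsorted finds the first index where cumulative >= threshold
    return int(np.searchsorted(cumulative, threshold) + 1)

# Toy data: 3 features, with feature 1 made strongly correlated to feature 0,
# so most variance concentrates in the first principal direction.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 3))
x[:, 1] = 0.9 * x[:, 0] + 0.1 * x[:, 1]

# "Full PCA" via eigen-decomposition of the covariance matrix.
cov = np.cov(x, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]  # sort descending by variance

# Trim to the retained-variance threshold after the fact.
k = components_for_variance(eigvals, 0.95)
trimmed = eigvecs[:, :k]  # keep only the first k principal directions
```

The point of this structure is that the (unchanged) full decomposition is computed once, and the retained-variance parameter only decides how many columns of the projection matrix to keep afterwards, so no existing API needs to change.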
Also, please do run the style checker, and see [Contributing to
Spark](https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark)
for Spark-specific style guidelines.
@srowen @mengxr What do you think about this change?