[ https://issues.apache.org/jira/browse/SPARK-6227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648192#comment-14648192 ]
Joseph K. Bradley commented on SPARK-6227:
------------------------------------------
That's great that you're interested. Please read this page for lots of helpful info:
[https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark]
I would download the original source code from the Apache Spark website and
install it natively, without using the VM. There are instructions for that in
the Spark docs and READMEs. To get started, I recommend finding some small
JIRAs which have been resolved already and looking at the PRs which solved
them. Those will give you an idea of the code structure. Good luck!
> PCA and SVD for PySpark
> -----------------------
>
> Key: SPARK-6227
> URL: https://issues.apache.org/jira/browse/SPARK-6227
> Project: Spark
> Issue Type: Sub-task
> Components: MLlib, PySpark
> Affects Versions: 1.2.1
> Reporter: Julien Amelot
>
> The dimensionality reduction techniques below are not available via Python (Scala and
> Java only):
> * Principal component analysis (PCA)
> * Singular value decomposition (SVD)
> Doc:
> http://spark.apache.org/docs/1.2.1/mllib-dimensionality-reduction.html
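For anyone picking this up, a minimal NumPy sketch of the linear algebra behind the two requested techniques may help. This is illustrative only: it is not the MLlib or PySpark API (the function name `pca` and its shape are my own), just the standard PCA-via-SVD computation that the eventual Python wrappers would expose over MLlib's distributed implementation.

```python
# Illustrative sketch only -- not the MLlib API. Shows PCA computed via SVD.
import numpy as np

def pca(X, k):
    """Project the rows of X onto the top-k principal components via SVD."""
    Xc = X - X.mean(axis=0)                            # center each feature
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)  # thin SVD of centered data
    return Xc @ Vt[:k].T                               # scores: (n_samples, k)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
scores = pca(X, 2)
print(scores.shape)  # (100, 2)
```

The distributed version in MLlib (RowMatrix.computeSVD / computePCA in Scala) does the same factorization over an RDD of rows; the Python work here would mainly be wrapping those calls.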
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)