[ https://issues.apache.org/jira/browse/SPARK-6227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648192#comment-14648192 ]
Joseph K. Bradley commented on SPARK-6227:
------------------------------------------
That's great that you're interested. Please read this page for lots of helpful info:
[https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark]
I would download the original source code from the Apache Spark website and
install it natively, without using the VM. There are instructions for that in
the Spark docs and READMEs. To get started, I recommend finding some small
JIRAs which have been resolved already and looking at the PRs which solved
them. Those will give you an idea of the code structure. Good luck!
> PCA and SVD for PySpark
> -----------------------
>
> Key: SPARK-6227
> URL: https://issues.apache.org/jira/browse/SPARK-6227
> Project: Spark
> Issue Type: Sub-task
> Components: MLlib, PySpark
> Affects Versions: 1.2.1
> Reporter: Julien Amelot
>
> The dimensionality reduction techniques below are not available via Python (Scala and
> Java only):
> * Principal component analysis (PCA)
> * Singular value decomposition (SVD)
> Doc:
> http://spark.apache.org/docs/1.2.1/mllib-dimensionality-reduction.html
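For anyone picking this up, a minimal NumPy sketch of the linear algebra behind the two requested techniques may help. This is illustrative only: it is not the MLlib or PySpark API (the function name `pca` and its shape are my own), just the standard PCA-via-SVD computation that the eventual Python wrappers would expose over MLlib's distributed implementation.

```python
# Illustrative sketch only -- not the MLlib API. Shows PCA computed via SVD.
import numpy as np

def pca(X, k):
    """Project the rows of X onto the top-k principal components via SVD."""
    Xc = X - X.mean(axis=0)                            # center each feature
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)  # thin SVD of centered data
    return Xc @ Vt[:k].T                               # scores: (n_samples, k)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
scores = pca(X, 2)
print(scores.shape)  # (100, 2)
```

The distributed version in MLlib (RowMatrix.computeSVD / computePCA in Scala) does the same factorization over an RDD of rows; the Python work here would mainly be wrapping those calls.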
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)