[ 
https://issues.apache.org/jira/browse/SPARK-6192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14351030#comment-14351030
 ] 

David J. Manglano edited comment on SPARK-6192 at 3/6/15 10:23 PM:
-------------------------------------------------------------------

Hello,

I am experienced in Python, have an interest in machine learning, and have some 
knowledge of the graph and probability theory involved in machine learning 
methods. I am also interested in the use of cluster computing in scientific 
data analysis.

I would like to work on this project for GSoC 2015. What skills would be 
required, and what would be the next step?

Thanks!


was (Author: manglano):
Hello,

I am experienced in Python, have an interest in machine learning, and have some 
knowledge of the graph and probability theory involved. I am also interested in 
the use of cluster computing in scientific data analysis.

I would like to work on this project for GSoC 2015. What skills would be 
required, and what would be the next step?

Thanks!

> Enhance MLlib's Python API (GSoC 2015)
> --------------------------------------
>
>                 Key: SPARK-6192
>                 URL: https://issues.apache.org/jira/browse/SPARK-6192
>             Project: Spark
>          Issue Type: Umbrella
>          Components: ML, MLlib, PySpark
>            Reporter: Xiangrui Meng
>            Assignee: Manoj Kumar
>              Labels: gsoc, gsoc2015, mentor
>
> This is an umbrella JIRA for [~MechCoder]'s GSoC 2015 project. The main theme 
> is to enhance MLlib's Python API, to make it on par with the Scala/Java API. 
> The main tasks are:
> 1. For all models in MLlib, provide save/load methods. This also
> includes save/load in Scala.
> 2. Python API for evaluation metrics.
> 3. Python API for streaming ML algorithms.
> 4. Python API for distributed linear algebra.
> 5. Simplify MLLibPythonAPI using DataFrames. Currently, we use
> customized serialization, making MLLibPythonAPI hard to maintain. It
> would be nice to use DataFrames for serialization.
> I'll link the JIRAs for each of the tasks.
> Note that this doesn't mean all these JIRAs are pre-assigned to [~MechCoder]. 
> The TODO list will be dynamic based on the backlog.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

