[ 
https://issues.apache.org/jira/browse/SPARK-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiangrui Meng updated SPARK-2510:
---------------------------------

    Assignee: Liquan Pei

> word2vec: Distributed Representation of Words
> ---------------------------------------------
>
>                 Key: SPARK-2510
>                 URL: https://issues.apache.org/jira/browse/SPARK-2510
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib
>            Reporter: Liquan Pei
>            Assignee: Liquan Pei
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> We would like to add a parallel implementation of word2vec to MLlib. word2vec 
> learns distributed representations of words by training on large data 
> sets. The Spark programming model fits word2vec well because its training 
> algorithm is embarrassingly parallel. Our initial implementation will focus 
> on the skip-gram model with negative sampling. 
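
For readers unfamiliar with the terms in the description, here is a minimal sketch of how skip-gram training examples and negative samples might be generated. It is not the proposed MLlib code. The function names `skipgram_pairs` and `sample_negatives` are hypothetical, and drawing negatives uniformly is a simplifying assumption (the original word2vec draws them from a smoothed unigram distribution).

```python
import random


def skipgram_pairs(tokens, window):
    """Return (center, context) pairs for every token within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs


def sample_negatives(vocab, exclude, k, rng):
    """Draw k words uniformly from vocab, skipping any word in `exclude`.

    Assumption: uniform sampling, chosen only to keep the sketch short.
    """
    candidates = [w for w in vocab if w not in exclude]
    return [rng.choice(candidates) for _ in range(k)]


if __name__ == "__main__":
    sentence = ["spark", "fits", "word2vec"]
    for center, context in skipgram_pairs(sentence, window=1):
        negatives = sample_negatives(sentence, {center, context}, 1, random.Random(0))
        print(center, context, negatives)
```

Each sentence can be turned into pairs independently of the others, which is the part of the work that maps most directly onto a per-partition transformation. Combining the resulting updates to the shared word vectors is a separate concern that this sketch does not address.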



--
This message was sent by Atlassian JIRA
(v6.2#6252)
