[
https://issues.apache.org/jira/browse/SPARK-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xiangrui Meng updated SPARK-2510:
---------------------------------
Assignee: Liquan Pei
> word2vec: Distributed Representation of Words
> ---------------------------------------------
>
> Key: SPARK-2510
> URL: https://issues.apache.org/jira/browse/SPARK-2510
> Project: Spark
> Issue Type: New Feature
> Components: MLlib
> Reporter: Liquan Pei
> Assignee: Liquan Pei
> Original Estimate: 672h
> Remaining Estimate: 672h
>
> We would like to add a parallel implementation of word2vec to MLlib. word2vec
> learns distributed representations of words by training on large data sets.
> The Spark programming model fits word2vec well because its training
> algorithm is embarrassingly parallel. Our initial implementation will focus
> on the skip-gram model and negative sampling.
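
For readers unfamiliar with the terms in the description, here is a minimal single-machine sketch of skip-gram with negative sampling. It is not the proposed MLlib implementation and says nothing about how the work would be distributed on Spark. The function names (`skipgram_pairs`, `sgns_update`), the dict-of-lists vector layout, and the learning rate are all assumptions made for illustration.

```python
import math


def skipgram_pairs(tokens, window):
    """Yield (center, context) pairs for every token within `window` positions."""
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield center, tokens[j]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sgns_update(w_in, w_out, center, context, negatives, lr):
    """Apply one SGD step for a (center, context) pair plus sampled negatives.

    w_in and w_out map each word to a list of floats (hypothetical layout).
    Returns the negative-sampling loss measured before the update.
    """
    v = w_in[center]
    grad_v = [0.0] * len(v)
    loss = 0.0
    # The true context word gets label 1, each sampled negative gets label 0.
    for word, label in [(context, 1.0)] + [(n, 0.0) for n in negatives]:
        u = w_out[word]
        score = sigmoid(sum(a * b for a, b in zip(v, u)))
        loss -= math.log(score if label == 1.0 else 1.0 - score)
        g = lr * (label - score)
        for k in range(len(v)):
            grad_v[k] += g * u[k]  # accumulate using the pre-update output vector
            u[k] += g * v[k]       # update the output vector in place
    # Update the input (center) vector once all output vectors are processed.
    for k in range(len(v)):
        v[k] += grad_v[k]
    return loss
```

In this sketch the caller draws the negatives; word2vec implementations typically sample them from a smoothed unigram distribution, which is omitted here to keep the example short.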
--
This message was sent by Atlassian JIRA
(v6.2#6252)