Shubham Chopra created SPARK-20903:
--------------------------------------
Summary: Word2Vec Skip-Gram + Negative Sampling
Key: SPARK-20903
URL: https://issues.apache.org/jira/browse/SPARK-20903
Project: Spark
Issue Type: Sub-task
Components: ML, MLlib
Affects Versions: 2.1.1
Reporter: Shubham Chopra
Skip-Gram + Negative Sampling has been shown to be comparable to, or to
outperform, the hierarchical-softmax-based approach currently implemented in
Spark. Since word2vec is largely a pre-processing step, its performance often
depends on the application it is used for and the corpus it is estimated on.
Offering both implementations gives users the choice of picking the one that
works best for their use case.
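
For readers unfamiliar with the technique, here is a minimal sketch of a single
skip-gram + negative sampling update in plain Python. It is illustrative only
and is not the Spark implementation proposed in this issue. The names
`init_vectors` and `sgns_step`, the learning rate, and the dictionary-based
vector storage are all assumptions made for the example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_vectors(vocab, dim, seed=0):
    """Hypothetical helper: small random input vectors, zero output vectors."""
    rng = random.Random(seed)
    w_in = {w: [(rng.random() - 0.5) / dim for _ in range(dim)] for w in vocab}
    w_out = {w: [0.0] * dim for w in vocab}
    return w_in, w_out


def sgns_step(center, context, negatives, w_in, w_out, lr=0.025):
    """One SGD step for a (center, context) pair plus sampled negatives.

    Instead of a softmax over the whole vocabulary (or a hierarchical
    softmax tree), the true context word is treated as a positive example
    and each sampled negative word as a negative example of a binary
    logistic classifier. Returns the loss measured before the update.
    """
    v = w_in[center]
    grad_v = [0.0] * len(v)
    loss = 0.0
    targets = [(context, 1.0)] + [(n, 0.0) for n in negatives]
    for word, label in targets:
        u = w_out[word]
        score = sigmoid(sum(a * b for a, b in zip(v, u)))
        # -log(sigma) for the positive word, -log(1 - sigma) for a negative
        loss -= math.log(score if label == 1.0 else 1.0 - score)
        g = score - label  # derivative of the loss w.r.t. the dot product
        for i in range(len(v)):
            grad_v[i] += g * u[i]      # accumulate gradient for the input vector
            u[i] -= lr * g * v[i]      # update the output vector in place
    for i in range(len(v)):
        v[i] -= lr * grad_v[i]         # apply the accumulated input-vector update
    return loss
```

Each step touches only the context word and the sampled negatives, so its cost
scales with the number of negatives rather than with the vocabulary size.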