[GitHub] spark issue #18636: added support word2vec training with additional data

MLnick Tue, 19 Sep 2017 08:13:58 -0700

Github user MLnick commented on the issue:

    https://github.com/apache/spark/pull/18636
  
    Hi there - I don't see the value in adding a few words in a String 
array to the training. You're effectively adding a second (non-distributed, 
therefore limited in size) corpus to the training.
    
    Word2Vec is aimed more at training on a larger corpus of text. If you want 
more accuracy, train on a larger training set.
    
    Could you close this PR please?



