[GitHub] spark issue #18636: added support word2vec training with additional data

MLnick Tue, 19 Sep 2017 09:07:31 -0700

Github user MLnick commented on the issue:

    https://github.com/apache/spark/pull/18636
  
    I'm sorry, but I still don't understand the intention here. You can already 
train on a Wikipedia dump (or any other dataset) by passing that dataset as the 
input DataFrame to `Word2Vec`.
    
    If you want to "incorporate additional data", why not just `union` the 
additional sentences / documents with your existing training set?


