[GitHub] spark issue #18636: added support word2vec training with additional data

LeoIV Tue, 19 Sep 2017 08:21:45 -0700

Github user LeoIV commented on the issue:

    https://github.com/apache/spark/pull/18636
  
    At the moment, it is not possible to improve a model's accuracy by 
incorporating additional data. I think this should be supported, since it can 
increase a classifier's performance significantly. With this implementation, I 
was able to train unsupervised on a Wikipedia dump, which is pretty large. 
However, distributing the set is a good point.



---

