GitHub user MLnick commented on the issue:
https://github.com/apache/spark/pull/18636
Hi there - I don't see the value in adding a few words from a String
array to the training. You're effectively adding a second (non-distributed,
and therefore limited in size) corpus to the training.
Word2Vec is aimed at training on a large corpus of text. If you want
more accuracy, train on a larger training set.
Could you close this PR please?