GitHub user shubhamchopra commented on a diff in the pull request:
https://github.com/apache/spark/pull/17673#discussion_r143516384
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2Vec.scala
---
@@ -106,6 +106,45 @@ private[feature] trait Word2VecBase extends Params
/** @group getParam */
def getMaxSentenceLength: Int = $(maxSentenceLength)
+ /**
+ * Number of negative samples to use with CBOW based estimation.
+ * This parameter is ignored for SkipGram based estimation.
+ * Default: 15
+ * @group param
+ */
+ final val negativeSamples = new IntParam(this, "negativeSamples",
"Number of negative samples " +
--- End diff --
`negativeSamples` is only used when negative sampling is used, either with
CBOW or (as in #18123) with SkipGram.
SkipGram with hierarchical softmax, which was the initial version, is not
affected by this parameter.
In principle, we now have three possible techniques that can be used (the
third is implemented in #18123). I'm open to suggestions on possible values
for `solver`.
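
To make the three options concrete, here is a rough, standalone sketch of how a `solver` value could determine whether `negativeSamples` applies. Everything in it is hypothetical: the `Solver` trait, the string values `"sg-hs"`, `"sg-ns"` and `"cbow-ns"`, and the helper `effectiveNegativeSamples` are placeholders for discussion, not names from this PR, from #18123, or from Spark.

```scala
// Hypothetical sketch only: none of these names exist in Spark.
sealed trait Solver {
  def name: String
  // True when the technique draws negative samples during training.
  def usesNegativeSamples: Boolean
}

object Solver {
  // The initial version: SkipGram with hierarchical softmax.
  case object SkipGramHierarchicalSoftmax extends Solver {
    val name = "sg-hs"
    val usesNegativeSamples = false
  }

  // SkipGram with negative sampling (the technique discussed in #18123).
  case object SkipGramNegativeSampling extends Solver {
    val name = "sg-ns"
    val usesNegativeSamples = true
  }

  // CBOW with negative sampling.
  case object CbowNegativeSampling extends Solver {
    val name = "cbow-ns"
    val usesNegativeSamples = true
  }

  val all: Seq[Solver] =
    Seq(SkipGramHierarchicalSoftmax, SkipGramNegativeSampling, CbowNegativeSampling)

  // Case-insensitive lookup of a solver by its (placeholder) string value.
  def fromString(s: String): Solver = {
    val key = s.toLowerCase(java.util.Locale.ROOT)
    all.find(_.name == key).getOrElse(
      throw new IllegalArgumentException(
        s"Unknown solver '$s'; expected one of: ${all.map(_.name).mkString(", ")}"))
  }

  // Returns the sample count only for solvers that actually use it.
  def effectiveNegativeSamples(solver: Solver, negativeSamples: Int): Option[Int] =
    if (solver.usesNegativeSamples) Some(negativeSamples) else None
}
```

With something along these lines, the param doc could say that `negativeSamples` is ignored whenever the chosen solver does not use negative sampling, rather than tying it to CBOW versus SkipGram.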