Github user hhbyyh commented on a diff in the pull request:
https://github.com/apache/spark/pull/17673#discussion_r148966706
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2Vec.scala
---
@@ -105,6 +106,56 @@ private[feature] trait Word2VecBase extends Params
/** @group getParam */
def getMaxSentenceLength: Int = $(maxSentenceLength)
+ /**
+ * Number of negative samples to use with CBOW based estimation.
+ * This parameter is ignored for SkipGram-Hierarchical Softmax based estimation.
+ * Default: 15
+ * @group param
+ */
+ final val numNegativeSamples = new IntParam(this, "numNegativeSamples", "Number of negative" +
+ " samples to use with CBOW estimation", ParamValidators.gt(0))
+ setDefault(numNegativeSamples -> 15)
--- End diff --
In other implementations, the default value is 5.
Is 15 taken from the original C implementation?
---