[GitHub] spark pull request #17673: [SPARK-20372] [ML] Word2Vec Continuous Bag of Wor...

shubhamchopra Mon, 09 Oct 2017 09:23:59 -0700

Github user shubhamchopra commented on a diff in the pull request:

    https://github.com/apache/spark/pull/17673#discussion_r143516384
  
    --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Word2Vec.scala ---
    @@ -106,6 +106,45 @@ private[feature] trait Word2VecBase extends Params
       /** @group getParam */
       def getMaxSentenceLength: Int = $(maxSentenceLength)
     
    +  /**
    +   * Number of negative samples to use with CBOW based estimation.
    +   * This parameter is ignored for SkipGram based estimation.
    +   * Default: 15
    +   * @group param
    +   */
    +  final val negativeSamples = new IntParam(this, "negativeSamples", "Number of negative samples " +
    --- End diff --
    
    `negativeSamples` only takes effect when Negative Sampling is used, either 
    with CBOW or, as in #18123, with SkipGram. SkipGram with Hierarchical 
    Softmax, which was the initial version, is not affected by this parameter.
    
    In principle, we now have 3 possible techniques that can be used (one of 
    them implemented in #18123). I am open to suggestions for possible values 
    of `solver`.
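
To make the three combinations concrete, here is a minimal sketch in plain Scala (no Spark dependency) of how a `solver` value could decide whether `negativeSamples` is consulted. Every identifier and every string value below (`Solver`, `"sg-hs"`, `"sg-ns"`, `"cbow-ns"`, `usesNegativeSamples`) is hypothetical and chosen only for illustration; none of them comes from this pull request or from #18123.

```scala
// Hypothetical sketch: these names and string values are NOT taken from the PR.
sealed trait Solver { def name: String }

// SkipGram with Hierarchical Softmax (the initial version in Spark).
case object SkipGramHierarchicalSoftmax extends Solver { val name = "sg-hs" }
// SkipGram with Negative Sampling (the variant discussed in #18123).
case object SkipGramNegativeSampling extends Solver { val name = "sg-ns" }
// CBOW with Negative Sampling (the variant added by this pull request).
case object CbowNegativeSampling extends Solver { val name = "cbow-ns" }

object Solver {
  val all: Seq[Solver] =
    Seq(SkipGramHierarchicalSoftmax, SkipGramNegativeSampling, CbowNegativeSampling)

  // Case-insensitive lookup; returns None for an unknown value.
  def fromString(s: String): Option[Solver] =
    all.find(_.name == s.toLowerCase)

  // negativeSamples matters only for the two Negative Sampling variants.
  def usesNegativeSamples(solver: Solver): Boolean = solver match {
    case SkipGramHierarchicalSoftmax => false
    case SkipGramNegativeSampling | CbowNegativeSampling => true
  }
}
```

With a mapping like this, the parameter documentation could say that `negativeSamples` is ignored only for the Hierarchical Softmax variant, rather than for SkipGram as a whole.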



---
