[GitHub] spark pull request #19340: [SPARK-22119][ML] Add cosine distance to KMeans

srowen Thu, 18 Jan 2018 10:13:44 -0800

GitHub user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19340#discussion_r162396772
  
    --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeans.scala ---
    @@ -546,10 +577,111 @@ object KMeans {
           .run(data)
       }
     
    +  private[spark] def validateInitMode(initMode: String): Boolean = {
    +    initMode match {
    +      case KMeans.RANDOM => true
    +      case KMeans.K_MEANS_PARALLEL => true
    +      case _ => false
    +    }
    +  }
    +  private[spark] def validateDistanceMeasure(distanceMeasure: String): Boolean = {
    +    distanceMeasure match {
    +      case DistanceMeasure.EUCLIDEAN => true
    +      case DistanceMeasure.COSINE => true
    +      case _ => false
    +    }
    +  }
    +}
    +
    +/**
    + * A vector with its norm for fast distance computation.
    + *
    + * @see [[org.apache.spark.mllib.clustering.KMeans#fastSquaredDistance]]
    + */
    +private[clustering]
    +class VectorWithNorm(val vector: Vector, val norm: Double) extends Serializable {
    --- End diff --
    
    Nit: I think we usually break long lines like this at the first argument, or at "extends".
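
    To illustrate the two options named in the nit, here is a minimal sketch. It is not code from the pull request: the class names `VectorWithNormA` and `VectorWithNormB` are hypothetical, and `Array[Double]` stands in for Spark's `Vector` so the snippet compiles without Spark on the classpath. The indentation widths (four spaces for wrapped parameters, two before `extends`) are an assumption based on common Scala style conventions.

    ```scala
    // Option 1: break at the first constructor argument,
    // one parameter per line, indented four spaces.
    class VectorWithNormA(
        val vector: Array[Double], // stand-in for Spark's Vector (assumption)
        val norm: Double) extends Serializable

    // Option 2: keep the parameters on one line and break before "extends",
    // indented two spaces.
    class VectorWithNormB(val vector: Array[Double], val norm: Double)
      extends Serializable
    ```

    Either form keeps the visibility modifier on the same line as `class`, so `private[clustering]` would not need a line of its own.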


