[GitHub] spark pull request #21493: [SPARK-15784] Add Power Iteration Clustering to s...

mengxr Mon, 04 Jun 2018 16:41:09 -0700

Github user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21493#discussion_r192910578
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/clustering/PowerIterationClusteringSuite.scala ---
    @@ -222,17 +167,13 @@ object PowerIterationClusteringSuite {
         val n = n1 + n2
         val points = genCircle(r1, n1) ++ genCircle(r2, n2)
     
    -    val rows = for (i <- 1 until n) yield {
    -      val neighbors = for (j <- 0 until i) yield {
    -        j.toLong
    +    val rows = (for (i <- 1 until n) yield {
    +      for (j <- 0 until i) yield {
    +        (i.toLong, j.toLong, sim(points(i), points(j)))
           }
    -      val similarities = for (j <- 0 until i) yield {
    -        sim(points(i), points(j))
    -      }
    -      (i.toLong, neighbors.toArray, similarities.toArray)
    -    }
    +    }).flatMap(_.iterator)
     
    -    spark.createDataFrame(rows).toDF("id", "neighbors", "similarities")
    +    spark.createDataFrame(rows).toDF("src", "dst", "weight")
       }
     
     }
    --- End diff --
    
    We should test the default weight.
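
For reference, the edge-list construction in the diff above can be sketched in plain Scala, without a SparkSession. A single for-comprehension with two generators yields the same flat sequence of (src, dst, weight) triples as the nested loops followed by `.flatMap(_.iterator)`. The bodies of `genCircle` and `sim` below are hypothetical stand-ins, since the diff does not show the suite's real helpers. `unweightedEdges` is likewise an assumption: it only illustrates the kind of weight-free input a default-weight test might start from, assuming the estimator falls back to a default weight when no weight column is supplied.

```scala
object EdgeListSketch {
  // Hypothetical stand-in: n points evenly spaced on a circle of radius r.
  def genCircle(r: Double, n: Int): Seq[(Double, Double)] =
    Seq.tabulate(n) { i =>
      val theta = 2.0 * math.Pi * i / n
      (r * math.cos(theta), r * math.sin(theta))
    }

  // Hypothetical stand-in: Gaussian similarity, assumed for illustration only.
  def sim(p1: (Double, Double), p2: (Double, Double)): Double = {
    val dx = p1._1 - p2._1
    val dy = p1._2 - p2._2
    math.exp(-(dx * dx + dy * dy) / 2.0)
  }

  // One (src, dst, weight) triple per pair with src > dst, already flat.
  def weightedEdges(points: Seq[(Double, Double)]): Seq[(Long, Long, Double)] =
    for (i <- 1 until points.length; j <- 0 until i)
      yield (i.toLong, j.toLong, sim(points(i), points(j)))

  // Same pairs with the weight dropped (assumed input shape for a default-weight test).
  def unweightedEdges(points: Seq[(Double, Double)]): Seq[(Long, Long)] =
    weightedEdges(points).map { case (src, dst, _) => (src, dst) }

  def main(args: Array[String]): Unit = {
    val points = genCircle(1.0, 3) ++ genCircle(4.0, 2)
    val edges = weightedEdges(points)
    // n points yield n * (n - 1) / 2 edges.
    assert(edges.length == 5 * 4 / 2)
    assert(edges.forall { case (src, dst, w) => src > dst && w > 0.0 && w <= 1.0 })
    assert(unweightedEdges(points).length == edges.length)
  }
}
```

In the suite itself, either sequence could be passed to `spark.createDataFrame(...)` and renamed with `toDF`, as the diff does for the weighted case.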



---
