GitHub user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14940#discussion_r205839792
  
    --- Diff: graphx/src/main/scala/org/apache/spark/graphx/lib/LabelPropagation.scala ---
    @@ -58,7 +58,7 @@ object LabelPropagation {
           }.toMap
         }
         def vertexProgram(vid: VertexId, attr: Long, message: Map[VertexId, Long]): VertexId = {
    -      if (message.isEmpty) attr else message.maxBy(_._2)._1
    +      (Map(attr -> 1L) ++ message).maxBy(m => (m._2, m._1))._1
    --- End diff --
    
    Nit: is `message :+ (attr -> 1L)` simpler for the first expression?
    I don't know enough to evaluate the implications of this change. It sounds like the current behavior is intentional, or follows some paper, but I'm not sure. Is there a reference showing that this is the more correct thing to do?
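
To make the question concrete, here is a minimal sketch that compares the two expressions from the diff with a `Map`-based form of the suggested rewrite. The object name `LabelChoice` and the function names `before`, `after`, and `variant` are hypothetical and used only for illustration; the bodies of `before` and `after` are copied from the diff. Since `:+` is defined on sequences rather than on `Map`, the sketch assumes that `message + (attr -> 1L)` is the intended form, which may not be what was meant.

```scala
object LabelChoice {
  type VertexId = Long

  // Removed line from the diff: the vertex keeps its own label only when
  // no messages arrive; ties between labels are not broken explicitly.
  def before(attr: Long, message: Map[VertexId, Long]): VertexId =
    if (message.isEmpty) attr else message.maxBy(_._2)._1

  // Added line from the diff: the vertex's own label enters with a count
  // of 1 unless `message` already has a count for it, and ties on count
  // are broken by the larger label id.
  def after(attr: Long, message: Map[VertexId, Long]): VertexId =
    (Map(attr -> 1L) ++ message).maxBy(m => (m._2, m._1))._1

  // Assumed reading of the suggestion, using `+` because Map has no `:+`.
  // Here the right-hand entry wins, so an existing count for `attr` is
  // overwritten with 1.
  def variant(attr: Long, message: Map[VertexId, Long]): VertexId =
    (message + (attr -> 1L)).maxBy(m => (m._2, m._1))._1
}
```

With `attr = 5L` and `message = Map(5L -> 3L, 7L -> 2L)`, `after` returns 5 while `variant` returns 7, because `+` replaces the count of 3 for label 5 with 1. Under that assumption the two forms are not interchangeable whenever the vertex's own label already appears in `message`.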

