Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14137#discussion_r70638869
  
    --- Diff: graphx/src/main/scala/org/apache/spark/graphx/lib/StronglyConnectedComponents.scala ---
    @@ -44,6 +44,11 @@ object StronglyConnectedComponents {
         // graph we are going to work with in our iterations
         var sccWorkGraph = graph.mapVertices { case (vid, _) => (vid, false) }.cache()
     
    +    // helper variables to unpersist cached graphs
    +    var prevSccGraph1 = sccGraph
    +    var prevSccGraph2 = sccGraph
    --- End diff --
    
    I guess my logic is that it's easier to have one variable following this
    single graph throughout rather than two. Which of the two refers to which
    previous RDD? Are they different? The other effect of this is that you
    unpersist and then re-persist the RDD once during each loop.

