RJ Nowling created SPARK-3263:
---------------------------------

             Summary: PR #720 broke GraphGenerator.logNormal
                 Key: SPARK-3263
                 URL: https://issues.apache.org/jira/browse/SPARK-3263
             Project: Spark
          Issue Type: Bug
          Components: GraphX
            Reporter: RJ Nowling


PR #720 made multiple changes to GraphGenerator.logNormalGraph including:

* Replacing the calls to the functions that generate random vertices and edges 
with in-line implementations that use different equations
* Hard-coding the RNG seeds, so that the method now generates the same graph for 
a given number of vertices, edges, mu, and sigma. The user cannot override the 
seed or request that it be randomly generated.
* Making a backwards-incompatible change to the logNormalGraph signature by 
introducing a new required parameter
* Failing to update the Scaladoc and programming guide for the API changes

I also see that PR #720 added a Synthetic Benchmark in the examples.

Based on my reading of the Pregel paper, I believe the in-line functions are 
incorrect.  I propose:

* Removing the in-line implementations
* Adding a seed parameter for deterministic behavior (when desired)
* Keeping the number-of-partitions parameter
* Updating the synthetic benchmark example
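
As a rough illustration of the seed proposal above, here is a minimal, 
Spark-free sketch in plain Scala. It is not the GraphX implementation: the 
object name, method names, signatures, and the convention that seed = -1 means 
"choose a random seed" are all assumptions made for this example. It relies 
only on the standard fact that exp(mu + sigma * Z) is log-normally distributed 
when Z is standard normal. The default mu and sigma values are illustrative.

```scala
import scala.util.Random

// Hypothetical sketch, not GraphX code.
object LogNormalSketch {

  // Draws one out-degree from a log-normal distribution, rejecting any draw
  // at or above maxVal so the result always lies in [0, maxVal).
  def sampleLogNormal(mu: Double, sigma: Double, maxVal: Int, rng: Random): Int = {
    require(maxVal > 0, "maxVal must be positive")
    var x = maxVal.toDouble
    while (x >= maxVal) {
      // If Z ~ N(0, 1), then exp(mu + sigma * Z) is log-normal.
      x = math.exp(mu + sigma * rng.nextGaussian())
    }
    math.floor(x).toInt
  }

  // Assumed convention: seed == -1 requests a randomly chosen seed;
  // any other value makes the output reproducible.
  def outDegrees(
      numVertices: Int,
      mu: Double = 4.0,
      sigma: Double = 1.3,
      seed: Long = -1L): Vector[Int] = {
    val rng = if (seed == -1L) new Random() else new Random(seed)
    Vector.fill(numVertices)(sampleLogNormal(mu, sigma, numVertices, rng))
  }
}
```

With this shape, outDegrees(100, seed = 42L) returns the same degrees on every 
call, while outDegrees(100) varies between calls. A distributed version would 
presumably need to derive a distinct seed per vertex or partition from the 
user-supplied seed; that part is not shown here.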



--
This message was sent by Atlassian JIRA
(v6.2#6252)

