Hi all, PR #720 <https://github.com/apache/spark/pull/720> made multiple changes to GraphGenerator.logNormalGraph including:
- Replacing the calls to the functions that generate random vertices and edges with in-line implementations that use different equations. Based on my reading of the Pregel paper, I believe the in-line implementations are incorrect.
- Hard-coding the RNG seeds, so that the method now generates the same graph for a given number of vertices, edges, mu, and sigma. The user cannot override the seed or specify that the seed should be randomly generated.
- Making a backwards-incompatible change to the logNormalGraph signature by introducing a new required parameter.
- Failing to update the Scaladocs and programming guide for the API changes.
- Adding a Synthetic Benchmark to the examples.

I submitted JIRA SPARK-3263 <https://issues.apache.org/jira/browse/SPARK-3263> and PR #2168 <https://github.com/apache/spark/pull/2168> to revert some of these changes and fix the usage of the RNGs. The PR:

- Removes the in-line implementations and calls the original vertex / edge generation functions again
- Adds an optional seed parameter for deterministic behavior (when desired)
- Keeps the number-of-partitions parameter that was added
- Keeps compatibility with the synthetic benchmark example
- Maintains a backwards-compatible API

I would appreciate feedback and people taking a look. :)

Thanks!
RJ

--
em rnowl...@gmail.com
c 954.496.2314
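
P.S. To make the "optional seed" idea above concrete, here is a minimal standalone sketch. It is not the code from PR #2168 and does not use Spark at all. The object name `SeededLogNormalSketch`, the method names, the `Option[Long]` seed type, and the default values for mu and sigma are all assumptions chosen for illustration. It only shows how an optional, defaulted seed can give reproducible out-degrees when supplied and random ones otherwise, without breaking existing callers.

```scala
import scala.util.Random

// Hypothetical illustration only; names and defaults are assumptions, not the GraphX API.
object SeededLogNormalSketch {

  /** Draws exp(mu + sigma * Z) with Z ~ N(0, 1), rejecting any value >= maxVal. */
  def sampleLogNormal(mu: Double, sigma: Double, maxVal: Int, rand: Random): Int = {
    require(maxVal > 0, "maxVal must be positive")
    var x = maxVal.toDouble
    while (x >= maxVal) {
      x = math.exp(mu + sigma * rand.nextGaussian())
    }
    math.floor(x).toInt
  }

  /**
   * Generates one out-degree per vertex.
   * seed = None     -> a fresh, randomly seeded generator (different result per call)
   * seed = Some(s)  -> deterministic output for a given s
   * The parameter has a default, so callers that never pass a seed keep compiling.
   */
  def outDegrees(
      numVertices: Int,
      mu: Double = 4.0,
      sigma: Double = 1.3,
      seed: Option[Long] = None): Vector[Int] = {
    val rand = seed match {
      case Some(s) => new Random(s)
      case None    => new Random()
    }
    Vector.fill(numVertices)(sampleLogNormal(mu, sigma, numVertices, rand))
  }
}
```

Calling `SeededLogNormalSketch.outDegrees(100, seed = Some(42L))` twice returns the same vector, while `SeededLogNormalSketch.outDegrees(100)` is free to differ between calls. A distributed version would also need to derive a distinct seed per vertex or partition, which this sketch leaves out.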