GitHub user rnowling opened a pull request:
https://github.com/apache/spark/pull/2168
[SPARK-3263][GraphX] Fix changed
PR #720 made several changes to GraphGenerators.logNormalGraph, including:
* Replacing the calls to the functions that generate random vertices and
edges with in-line implementations that use different equations. Based on
reading the Pregel paper, I believe the in-line implementations are incorrect.
* Hard-coding the RNG seeds, so the method now generates the same graph for
a given number of vertices, edges, mu, and sigma; the user cannot override
the seed or request a randomly generated one.
* Making a backwards-incompatible change to the logNormalGraph signature by
introducing a new required parameter.
* Failing to update the Scaladoc and programming guide for the API changes.
* Adding a synthetic benchmark to the examples.
This PR:
* Removes the in-line implementations and restores the calls to the original
vertex / edge generation functions.
* Adds an optional seed parameter for deterministic behavior (when desired).
* Keeps the number of partitions parameter that was added.
* Keeps compatibility with the synthetic benchmark example.
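For reviewers, the optional-seed behavior described above can be sketched roughly as follows. This is a minimal, Spark-free illustration and not the code in this PR: the object name, the helper names, the default values for mu and sigma, and the use of -1 as a "no seed given" sentinel are all assumptions made for the example.

```scala
import scala.util.Random

// Hypothetical sketch; names, defaults, and the -1 sentinel are assumptions.
object SeededLogNormalSketch {

  // Draw one out-degree from a log-normal distribution using the supplied RNG.
  // Draws that are >= maxVal are rejected and re-sampled.
  def sampleLogNormal(mu: Double, sigma: Double, maxVal: Int, rand: Random): Int = {
    require(maxVal > 0, "maxVal must be positive")
    var x: Double = maxVal.toDouble
    while (x >= maxVal) {
      val z = rand.nextGaussian()   // Z ~ N(0, 1)
      x = math.exp(mu + sigma * z)  // exp(mu + sigma * Z) is log-normally distributed
    }
    math.round(x).toInt
  }

  // Generate one out-degree per vertex.
  // seed == -1 (assumed sentinel) means "choose a random seed";
  // any other value makes the output reproducible.
  def outDegrees(numVertices: Int,
                 mu: Double = 4.0,
                 sigma: Double = 1.3,
                 seed: Long = -1L): Vector[Int] = {
    require(numVertices > 0, "numVertices must be positive")
    val baseSeed = if (seed == -1L) new Random().nextLong() else seed
    val seedRand = new Random(baseSeed)
    // Derive a separate seed for each vertex from the base seed.
    Vector.fill(numVertices) {
      sampleLogNormal(mu, sigma, numVertices, new Random(seedRand.nextLong()))
    }
  }
}
```

Calling SeededLogNormalSketch.outDegrees(100, seed = 42L) twice yields the same vector, while omitting the seed gives a different graph on each run. One trade-off of a sentinel default is that -1 itself cannot be used as a seed; an Option[Long] parameter would avoid that at the cost of a slightly heavier signature.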
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/rnowling/spark graphgenrand
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/2168.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2168
----
commit 015010c4213594be5c9086f82743adebfd88a936
Author: RJ Nowling <[email protected]>
Date: 2014-08-26T20:17:59Z
Fixed GraphGenerator logNormalGraph API to make backward-incompatible
change in commit 894ecde04
commit c1831368ce36bbac48c2733ffa3077be453aedae
Author: RJ Nowling <[email protected]>
Date: 2014-08-27T20:03:45Z
Fix to deterministic GraphGenerators.logNormalGraph that allows generating
graphs randomly using optional seed.
commit 684804d4c5c33bd4d50a179b15e3b1b0375ff1d8
Author: RJ Nowling <[email protected]>
Date: 2014-08-27T21:06:51Z
revert PR #720 which introduce errors in logNormalGraph and messed up
seeding of RNGs. Add user-defined optional seed for deterministic behavior
----