GitHub user dorx opened a pull request:
https://github.com/apache/spark/pull/1520
[SPARK-2514] [mllib] Random RDD generator
Utilities for generating random RDDs.
`RandomRDD` and `RandomVectorRDD` are introduced instead of using
`sc.parallelize(range: Range)` because a Scala `Range` can hold at most
`Int.MaxValue` elements.
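To illustrate the size limit above, here is a minimal sketch of how a total
element count held in a `Long` could be divided among partitions so that no
single partition needs more than `Int.MaxValue` elements. The object and
method names (`PartitionSizes`, `partitionSizes`) are hypothetical and are not
taken from this pull request; the sketch uses plain Scala with no Spark
dependency.

```scala
object PartitionSizes {
  /**
   * Hypothetical helper: splits `size` elements into `numPartitions`
   * nearly equal chunks. The total is a Long, so it may exceed
   * Int.MaxValue even though each chunk is expected to fit in an Int.
   */
  def partitionSizes(size: Long, numPartitions: Int): Array[Long] = {
    require(size >= 0, "size must be non-negative")
    require(numPartitions > 0, "numPartitions must be positive")
    val base = size / numPartitions       // elements every partition gets
    val remainder = size % numPartitions  // leftover elements to spread out
    // The first `remainder` partitions receive one extra element each.
    Array.tabulate(numPartitions)(i => if (i < remainder) base + 1 else base)
  }
}
```

For example, `PartitionSizes.partitionSizes(3000000000L, 4)` describes three
billion elements, more than a single `Range` can hold, as four partitions of
750,000,000 elements each.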
The object `RandomRDDGenerators` could be turned into a generator class
to reduce the number of auxiliary methods needed for optional arguments.
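One way the generator-class idea might look is sketched below. Everything in
it is an assumption made for illustration: the class name
`RandomDataGenerator`, its default values, and the `withNumPartitions` and
`withSeed` methods do not come from this pull request. To stay runnable
without Spark, it returns one local array per partition instead of an RDD.

```scala
import scala.util.Random

// Hypothetical generator class: the optional settings (partition count and
// seed) live in the instance, so each distribution needs only one method
// rather than one overload per combination of optional arguments.
final case class RandomDataGenerator(numPartitions: Int = 2, seed: Long = 0L) {
  require(numPartitions > 0, "numPartitions must be positive")

  def withNumPartitions(n: Int): RandomDataGenerator = copy(numPartitions = n)
  def withSeed(s: Long): RandomDataGenerator = copy(seed = s)

  /** i.i.d. samples from U[0, 1), returned as one array per partition. */
  def uniform(size: Int): Vector[Array[Double]] = {
    val seeds = new Random(seed) // derives one seed per partition
    Vector.tabulate(numPartitions) { i =>
      // Long arithmetic avoids overflow when computing chunk boundaries.
      val start = i.toLong * size / numPartitions
      val end = (i + 1).toLong * size / numPartitions
      val rng = new Random(seeds.nextLong())
      Array.fill((end - start).toInt)(rng.nextDouble())
    }
  }
}
```

A caller would then write
`RandomDataGenerator().withNumPartitions(3).withSeed(7L).uniform(10)`, and
adding another distribution would mean adding a single method to the class.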
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/dorx/spark randomRDD
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/1520.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1520
----
commit 888144416ced2b6d4c4839ac95b8a3feb2b3aba1
Author: Doris Xin <[email protected]>
Date: 2014-07-12T01:02:01Z
RandomRDDGenerator: initial design
Looking for feedback on design decisions. Very rough draft and untested.
commit 7cb0e406793db493cee72cb91ec02475c95c8de7
Author: Doris Xin <[email protected]>
Date: 2014-07-12T01:15:56Z
fix for data inconsistency
commit 49ed20d9a30b0ba5d809974bbcf48cc76a45d68e
Author: Doris Xin <[email protected]>
Date: 2014-07-12T01:30:15Z
alternative poisson distribution generator
commit f46d928c4e3e71ced4ede9295ef645fb714c9a69
Author: Doris Xin <[email protected]>
Date: 2014-07-19T02:13:58Z
WIP
commit df5bcffc320bab85f6c5925b244fe9885d6d0eb5
Author: Doris Xin <[email protected]>
Date: 2014-07-21T07:47:07Z
Merge branch 'generator' into randomRDD
commit 92d6f1c3ca0f22371f7f0387b875ac16d5030ffb
Author: Doris Xin <[email protected]>
Date: 2014-07-21T07:48:12Z
solution for Cloneable
commit d56cacbde7a0550f53b59696ad7c7014c827f3f7
Author: Doris Xin <[email protected]>
Date: 2014-07-22T01:23:19Z
impl with RandomRDD
commit bc90234c9639bfb3f4581af63cf4bf370c61e18b
Author: Doris Xin <[email protected]>
Date: 2014-07-22T03:37:40Z
units passed.
commit aec68eb167ac9f11c64d95c698009cbf8919bd4b
Author: Doris Xin <[email protected]>
Date: 2014-07-22T03:42:31Z
newline
commit 063ea0b48b769f7f8477ca2364f8e676f93c297e
Author: Doris Xin <[email protected]>
Date: 2014-07-22T03:43:57Z
Merge branch 'master' into randomRDD
----