not find a way to efficiently shuffle the
random sample so far.
Thanks a lot!
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Random-pairs-RDD-order-tp22529.html
Sent from the Apache Spark User List mailing list archive at Nabble.com
-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h
Hi Aurelien,
Sean's solution is nice, but maybe not completely order-free, since
pairs will come from the same partition.
The easiest / fastest way to do it, in my opinion, is to use a random key
instead of zipWithIndex. Of course, you won't be able to ensure
uniqueness of each element of
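
To make the random-key idea concrete, here is a rough sketch in plain Python rather than Spark. It only illustrates the principle (tag every element with a random key, order by that key, then pair neighbours); the function name random_pairs and the seed parameter are made up for this example, and how this maps onto actual RDD operations is an assumption, since the message above is cut off.

```python
import random


def random_pairs(items, seed=None):
    """Pair up items at random by sorting them on a random key.

    Hypothetical helper for illustration only; it is not part of Spark.
    """
    rng = random.Random(seed)
    # Attach an independent random key to every element
    # (this plays the role the index from zipWithIndex would play).
    keyed = [(rng.random(), x) for x in items]
    # Ordering by the random key gives a random permutation of the items.
    # Keys are not guaranteed to be unique, but ties only affect the
    # relative order of the tied elements.
    keyed.sort(key=lambda kv: kv[0])
    shuffled = [x for _, x in keyed]
    # Pair consecutive elements; a trailing odd element is dropped.
    return list(zip(shuffled[0::2], shuffled[1::2]))


if __name__ == "__main__":
    print(random_pairs(range(10), seed=42))
```

Because the keys are drawn independently of where an element sits, neighbours after sorting are not tied to the original order, which is the property the random key is meant to buy over an index.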
(Indeed, though the OP said it was a requirement that the pairs are
drawn from the same partition.)
On Thu, Apr 16, 2015 at 11:14 PM, Guillaume Pitel
guillaume.pi...@exensa.com wrote:
Hi Aurelien,
Sean's solution is nice, but maybe not completely order-free, since pairs
will come from the same partition.