> import random
>
> my_seed = 42
> def f(iterator):
>     random.seed(my_seed)
>     yield my_seed
> rdd.mapPartitions(f)
>
> From: ayan guha
> Sent: Thursday, May 14, 2015 2:29 AM
> To: Charles Hayden
> Cc: user
> Subject: Re: how to set random seed
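The mapPartitions snippet quoted above can be checked locally without a cluster. This is only a sketch: the two plain lists stand in for RDD partitions, and `seeded_shuffle` is a hypothetical name; on a real RDD the call would be `rdd.mapPartitions(seeded_shuffle)`.

```python
import random

my_seed = 42  # the seed value used in the thread

def seeded_shuffle(iterator):
    # Runs once per partition: seed the worker-local RNG first,
    # then shuffle that partition's elements deterministically.
    random.seed(my_seed)
    data = list(iterator)
    random.shuffle(data)
    return iter(data)

# Stand-in for two RDD partitions; with Spark this would be
# rdd.mapPartitions(seeded_shuffle).
parts = [list(range(5)), list(range(5, 10))]
run1 = [list(seeded_shuffle(iter(p))) for p in parts]
run2 = [list(seeded_shuffle(iter(p))) for p in parts]
assert run1 == run2  # successive runs produce the same shuffle
```

Because `random.seed` is called at the top of every partition, each partition (and each rerun) starts from the same RNG state, which is exactly the reproducibility the original question asks for.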
Sorry for late reply.
Here is what I was thinking:

import random as r
def main

> … "plant" the seed (call
> random.seed()) once on each worker?
> --
> From: ayan guha
> Sent: Tuesday, May 12, 2015 11:17 PM
> To: Charles Hayden
> Cc: user
> Subject: Re: how to set random seed
>
> Easiest way is to broadcast it.
On 13 May 2015 10:40, "Charles Hayden" <charles.hay...@atigeo.com> wrote:
> In pySpark, I am writing a map with a lambda that calls random.shuffle.
> For testing, I want to be able to give it a seed, so that successive runs
> will produce the same shuffle.
> I am looking for a way to set this same random seed once on each worker. Is
> there any simple way to do it?
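The "broadcast it" suggestion amounts to shipping one driver-side seed value to every worker. A minimal local sketch of that idea, with no SparkContext involved: `shuffle_with_seed` is a made-up name, and `functools.partial` here plays the role of binding the driver's seed into the function that would be shipped to workers (with Spark, the seed could equally go through `sc.broadcast(42)` and be read as `bcast.value` inside the function).

```python
import random
from functools import partial

def shuffle_with_seed(seed, iterator):
    # Seed the worker-local RNG with the driver-supplied value,
    # then shuffle the partition's elements deterministically.
    random.seed(seed)
    data = list(iterator)
    random.shuffle(data)
    return iter(data)

# Bind the driver-side seed into the partition function; with Spark this
# would be rdd.mapPartitions(partial(shuffle_with_seed, 42)).
f = partial(shuffle_with_seed, 42)
out1 = list(f(iter(range(8))))
out2 = list(f(iter(range(8))))
assert out1 == out2  # same seed on every run => same shuffle
```

One caveat worth noting: seeding every partition identically makes each partition's shuffle use the same random sequence, which is fine for reproducibility in tests but correlates the "randomness" across partitions.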