Re: Using same rdd from two threads

2021-01-24 Thread jelmer
Well it is now... The RDD had a repartition call on it. When I removed the repartition it would work. When I did not remove the repartition but called rdd.partitions.length on it, it would also work! I looked into the partitions method and in it some instance variables get initialized, s

Re: Using same rdd from two threads

2021-01-22 Thread Sean Owen
RDDs are immutable, and Spark itself is thread-safe. This should be fine. Something else is going on in your code. On Fri, Jan 22, 2021 at 7:59 AM jelmer wrote: > Hi, > > I have a piece of code in which an RDD is created from a main method. > It then does work on this RDD from 2 different thread

Using same rdd from two threads

2021-01-22 Thread jelmer
Hi, I have a piece of code in which an RDD is created from a main method. It then does work on this RDD from 2 different threads running in parallel. When running this code as part of a test with a local master it will sometimes make Spark hang (1 task will never get completed). If I make a copy