Spark manages memory allocation and release automatically. Can you post the complete program to help check where it goes wrong?
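In the meantime, if anything is being cached across iterations, explicitly unpersisting it at the end of each loop may help. A minimal sketch, assuming `chunks`, `somefunc`, and `make_df` are the names from your message:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("chunked-writes").getOrCreate()
    sc = spark.sparkContext

    for chunk in chunks:
        my_rdd = sc.parallelize(chunk).flatMap(somefunc)
        # ... do some stuff with my_rdd ...

        my_df = make_df(my_rdd)  # make_df is your helper
        # ... do some stuff with my_df ...
        my_df.write.parquet('./some/path')

        # Release any cached blocks before the next iteration;
        # unpersist() is safe to call even if nothing was persisted.
        my_df.unpersist()
        my_rdd.unpersist()

If that doesn't change anything, the leak is probably elsewhere, which is why seeing the full program would help.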
On Wed, Sep 20, 2017 at 8:12 PM, Alexander Czech <
alexander.cz...@googlemail.com> wrote:

> Hello all,
>
> I'm running a pyspark script that uses a for loop to create smaller
> chunks of my main dataset.
>
> Some example code:
>
>     for chunk in chunks:
>         my_rdd = sc.parallelize(chunk).flatMap(somefunc)
>         # do some stuff with my_rdd
>
>         my_df = make_df(my_rdd)
>         # do some stuff with my_df
>         my_df.write.parquet('./some/path')
>
> After a couple of loops I always start to lose executors because of
> out-of-memory errors. Is there a way to free up memory after each loop?
> Do I have to do it in Python or with Spark?
>
> Thanks