Spark manages memory allocation and release automatically. Can you post the
complete program so that we can help check what is going wrong?
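
In the meantime, if the loop body caches anything with cache() or persist(),
releasing it explicitly at the end of each iteration is one thing to try. The
following is only a minimal sketch under assumptions, not a diagnosis: sc is
your existing SparkContext, somefunc and make_df are your own functions, and
the process_chunks wrapper, the out_path argument, and the "append" write mode
are hypothetical additions of mine.

```python
def process_chunks(sc, chunks, somefunc, make_df, out_path):
    """Process each chunk and release cached data before the next one.

    sc is an existing SparkContext; somefunc and make_df are the caller's
    own functions; out_path is a hypothetical output directory.
    """
    for chunk in chunks:
        my_rdd = sc.parallelize(chunk).flatMap(somefunc)
        my_df = None
        try:
            my_df = make_df(my_rdd)
            # Assumption: append, so that later iterations do not fail
            # because the output path already exists.
            my_df.write.mode("append").parquet(out_path)
        finally:
            # Drop any cached blocks for this chunk; this should be
            # harmless if the data was never cached.
            if my_df is not None:
                my_df.unpersist()
            my_rdd.unpersist()
```

If nothing is cached in the loop, this will not change much, and the full
program would still be needed to see where the memory goes.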

On Wed, Sep 20, 2017 at 8:12 PM, Alexander Czech <> wrote:

> Hello all,
> I'm running a PySpark script that uses a for loop to create smaller
> chunks of my main dataset.
> Some example code:
> for chunk in chunks:
>     my_rdd = sc.parallelize(chunk).flatMap(somefunc)
>     # do some stuff with my_rdd
>     my_df = make_df(my_rdd)
>     # do some stuff with my_df
>     my_df.write.parquet('./some/path')
> After a couple of iterations I always start to lose executors because of
> out-of-memory errors. Is there a way to free up memory after each iteration?
> Do I have to do it in Python or with Spark?
> Thanks
