Hi All,

How long are shuffle files and data files kept in the block manager
folder on the workers?

I have a Spark Streaming job with a window duration of 2 hours and a slide
interval of 15 minutes.

When I run the following command in my block manager path:

find . -type f -cmin +150 -name "shuffle*" -exec ls {} \;

I see a lot of files older than 150 minutes, which means they are not being
cleaned up. I was expecting them to be removed by then.

As a result, the total size of these files keeps increasing and takes up
space on the disk.
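
To put a number on the growth, a small helper along these lines could be
run from time to time. This is only a sketch: the function name
stale_shuffle_kib is made up for illustration, and it assumes a find that
supports -cmin (GNU or BSD find) plus standard du and awk.

```shell
# Sum the disk usage (in KiB, as reported by du -k) of files named
# "shuffle*" whose status last changed more than <minutes> minutes ago.
# Usage: stale_shuffle_kib <dir> <minutes>
# NOTE: stale_shuffle_kib is a hypothetical helper, not part of Spark.
stale_shuffle_kib() {
  dir="$1"
  minutes="$2"
  # With no matching files, du is never invoked and awk prints 0.
  find "$dir" -type f -cmin +"$minutes" -name "shuffle*" -exec du -k {} + \
    | awk '{ total += $1 } END { print total + 0 }'
}
```

Calling stale_shuffle_kib . 150 from the block manager path would print the
combined size of the same files the find command above lists.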

Please suggest how to get rid of these files, and help me understand this
behaviour.



Thanks !!!
Abhi
