Ok, that makes sense. I did see some job failures. However, failures can happen occasionally. Is there any option to have the JobManager clean up these directories when the job has failed?
On Mon, Jan 10, 2022 at 8:58 PM Yang Wang <[email protected]> wrote:

> IIRC, the staging directory (/user/{name}/.flink/application_xxx) will be
> deleted automatically if the Flink job reaches a global terminal state (e.g.
> FINISHED, CANCELED, FAILED).
> So I assume you have stopped the YARN application via "yarn application
> -kill", not via "bin/flink cancel".
> If that is the case, then the residual staging directory is expected
> behavior, since the Flink JobManager does not get a chance to do the
> clean-up.
>
> Best,
> Yang
>
> David Clutter <[email protected]> wrote on Tue, Jan 11, 2022 at 10:08:
>
>> I'm seeing files orphaned in HDFS and wondering how to clean them up when
>> the job is completed. The directory is /user/yarn/.flink, so I am assuming
>> this is created by Flink? The HDFS in my cluster eventually fills up.
>>
>> Here is my setup:
>>
>> - Flink 1.13.1 on AWS EMR
>> - Executing Flink in per-job mode
>> - Job is submitted every 5m
>>
>> In HDFS under /user/yarn/.flink I see a directory created for every Flink
>> job submitted/YARN application. Each application directory contains my
>> user jar file, the flink-dist jar, /lib with various Flink jars, and
>> log4j.properties.
>>
>> Is there a property to tell Flink to clean up this directory when the job
>> is completed?
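Until Flink offers such an option for killed applications, a periodic sweep of stale staging directories could bridge the gap. Below is a minimal sketch: only the /user/yarn/.flink path and the application_* directory naming come from this thread; the helper name, the age cutoff, and the overall approach are assumptions, not a Flink feature.

```shell
#!/usr/bin/env bash
# Hypothetical clean-up helper, not part of Flink. It filters
# `hdfs dfs -ls`-style listing lines for entries whose modification date
# (field 6, YYYY-MM-DD) is older than a cutoff date passed as $1, and
# prints the path (field 8). YYYY-MM-DD compares correctly as a plain
# string, so awk's `<` on the date field is enough.
stale_flink_dirs() {
  local cutoff="$1"
  awk -v cutoff="$cutoff" '$6 < cutoff { print $8 }'
}

# Real usage (assumed, untested here) would pipe a live listing through
# the filter and delete what comes out, e.g. from a daily cron job:
#   hdfs dfs -ls /user/yarn/.flink \
#     | stale_flink_dirs "$(date -d '7 days ago' +%F)" \
#     | xargs -r -n1 hdfs dfs -rm -r -skipTrash
```

Keeping the filter separate from the delete makes it easy to dry-run: drop the `xargs` stage and inspect the list before removing anything. An age cutoff longer than your longest-running job avoids deleting the staging directory of an application that is still alive.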
