Ok, that makes sense. I did see some job failures. However, failures can happen occasionally. Is there any option to have the JobManager clean up these directories when the job has failed?
On Mon, Jan 10, 2022 at 8:58 PM Yang Wang <[email protected]> wrote:

> IIRC, the staging directory (/user/{name}/.flink/application_xxx) will be
> deleted automatically if the Flink job reaches a global terminal state (e.g.
> FINISHED, CANCELED, FAILED).
> So I assume you have stopped the YARN application via "yarn application
> -kill", not via "bin/flink cancel".
> If that is the case, then the residual staging directory is expected
> behavior, since the Flink JobManager does not get a chance to do the
> clean-up.
>
> Best,
> Yang
>
> David Clutter <[email protected]> wrote on Tue, Jan 11, 2022 at 10:08:
>
>> I'm seeing files orphaned in HDFS and wondering how to clean them up when
>> the job is completed. The directory is /user/yarn/.flink, so I am assuming
>> this is created by Flink? The HDFS in my cluster eventually fills up.
>>
>> Here is my setup:
>>
>> - Flink 1.13.1 on AWS EMR
>> - Executing Flink in per-job mode
>> - Job is submitted every 5m
>>
>> In HDFS under /user/yarn/.flink I see a directory created for every Flink
>> job submitted/YARN application. Each application directory contains my
>> user jar file, the flink-dist jar, /lib with various Flink jars, and
>> log4j.properties.
>>
>> Is there a property to tell Flink to clean up this directory when the job
>> is completed?
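Until Flink offers such an option for killed applications, a periodic sweep of stale staging directories could bridge the gap. Below is a minimal sketch: only the /user/yarn/.flink path and the application_* directory naming come from this thread; the helper name, the age cutoff, and the overall approach are assumptions, not a Flink feature.

```shell
#!/usr/bin/env bash
# Hypothetical clean-up helper, not part of Flink. It filters
# `hdfs dfs -ls`-style listing lines for entries whose modification date
# (field 6, YYYY-MM-DD) is older than a cutoff date passed as $1, and
# prints the path (field 8). YYYY-MM-DD compares correctly as a plain
# string, so awk's `<` on the date field is enough.
stale_flink_dirs() {
  local cutoff="$1"
  awk -v cutoff="$cutoff" '$6 < cutoff { print $8 }'
}

# Real usage (assumed, untested here) would pipe a live listing through
# the filter and delete what comes out, e.g. from a daily cron job:
#   hdfs dfs -ls /user/yarn/.flink \
#     | stale_flink_dirs "$(date -d '7 days ago' +%F)" \
#     | xargs -r -n1 hdfs dfs -rm -r -skipTrash
```

Keeping the filter separate from the delete makes it easy to dry-run: drop the `xargs` stage and inspect the list before removing anything. An age cutoff longer than your longest-running job avoids deleting the staging directory of an application that is still alive.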
