Here is the corresponding JIRA ticket:
https://issues.apache.org/jira/browse/FLINK-15806

On Wed, Jan 29, 2020 at 3:16 PM Till Rohrmann <trohrm...@apache.org> wrote:

> Hi Theo,
>
> your assumption is correct that Flink won't clean up its files when using
> `yarn application -kill ID`. This should also hold true for other temporary
> files generated by Flink's Blob service, shuffle service and io manager.
> These files are usually stored under /tmp and should be cleaned up
> eventually, though.
>
> I think a better approach is to reconnect to the Flink Yarn session
> cluster and then issue the "stop" command. You can either run
> `bin/yarn-session.sh -id APP_ID` and then type "stop", or run
> `echo "stop" | bin/yarn-session.sh -id APP_ID`.
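
The non-interactive variant could be wrapped in a small helper for use from a CI/CD pipeline. This is only a sketch: the function name `stop_flink_session` and the `YARN_SESSION_BIN` variable are made up for illustration, and the default path simply mirrors the `${OS_ROOT}/flink` layout used further down in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: stop a Flink YARN session through the session client
# instead of `yarn application -kill`.
# YARN_SESSION_BIN is an assumed variable; point it at your yarn-session.sh.
YARN_SESSION_BIN="${YARN_SESSION_BIN:-${OS_ROOT:-.}/flink/bin/yarn-session.sh}"

stop_flink_session() {
  local app_id="${1:-}"
  if [ -z "${app_id}" ]; then
    echo "usage: stop_flink_session APP_ID" >&2
    return 2
  fi
  # Reattach to the running session and send "stop" on stdin.
  echo "stop" | "${YARN_SESSION_BIN}" -id "${app_id}"
}
```

It would be called as `stop_flink_session "${ID}"` in place of the `yarn application -kill "${ID}"` step.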
>
> I think we should also update the logging statements of the
> yarn-session.sh which say that you should use `yarn application -kill` in
> order to stop the process.
>
> Cheers,
> Till
>
> On Tue, Jan 28, 2020 at 6:21 PM Theo Diefenthal <
> theo.diefent...@scoop-software.de> wrote:
>
>> Hi there,
>>
>> Today I realized that we currently have a lot of Flink distribution jar
>> files that never get cleaned up, and I would like to know how to
>> housekeep them properly.
>>
>> In the HDFS home directory of the user submitting the jobs, I find a
>> subdirectory called `.flink` with hundreds of subfolders like
>> `application_1573731655031_0420`, each with the following structure:
>>
>> -rw-r--r--   3 dev dev        861 2020-01-27 21:17 /user/dev/.flink/application_1580155950981_0010/4797ff6e-853b-460c-81b3-34078814c5c9-taskmanager-conf.yaml
>> -rw-r--r--   3 dev dev        691 2020-01-27 21:16 /user/dev/.flink/application_1580155950981_0010/application_1580155950981_0010-flink-conf.yaml2755466919863419496.tmp
>> -rw-r--r--   3 dev dev        861 2020-01-27 21:17 /user/dev/.flink/application_1580155950981_0010/fdb5ef57-c140-4f6d-9791-c226eb1438ce-taskmanager-conf.yaml
>> -rw-r--r--   3 dev dev     92.2 M 2020-01-27 21:16 /user/dev/.flink/application_1580155950981_0010/flink-dist_2.11-1.9.1.jar
>> drwxr-xr-x   - dev dev          0 2020-01-27 21:16 /user/dev/.flink/application_1580155950981_0010/lib
>> -rw-r--r--   3 dev dev      2.6 K 2020-01-27 21:16 /user/dev/.flink/application_1580155950981_0010/log4j.properties
>> -rw-r--r--   3 dev dev      2.3 K 2020-01-27 21:16 /user/dev/.flink/application_1580155950981_0010/logback.xml
>> drwxr-xr-x   - dev dev          0 2020-01-27 21:16 /user/dev/.flink/application_1580155950981_0010/plugins
>>
>> With tons of those folders (one for each Flink session we launched and
>> killed in our CI/CD pipeline), they add up to some terabytes of used
>> space in our HDFS.
>> I suppose I am killing our Flink sessions the wrong way. We start and stop
>> sessions and jobs separately like so:
>>
>> Start:
>>
>> ${OS_ROOT}/flink/bin/yarn-session.sh -jm 4g -tm 32g --name "${FLINK_SESSION_NAME}" -d -Denv.java.opts="-XX:+HeapDumpOnOutOfMemoryError"
>>
>> ${OS_ROOT}/flink/bin/flink run -m ${FLINK_HOST} [..savepoint/checkpoint options...] -d -n "${JOB_JAR}" $*
>>
>> Stop:
>>
>> ${OS_ROOT}/flink/bin/flink stop -p ${SAVEPOINT_BASEDIR}/${FLINK_JOB_NAME} -m ${FLINK_HOST} ${ID}
>>
>> yarn application -kill "${ID}"
>>
>>
>> `yarn application -kill` was the best I could find, as the Flink docs
>> only describe stopping the session from the client process ("Stop the YARN
>> session by stopping the unix process (using CTRL+C) or by entering ‘stop’
>> into the client.").
>>
>> Now my question: Is there a more elegant way to kill a yarn session
>> (remotely from some host in the cluster, not necessarily the one starting
>> the detached session), which also does the housekeeping then? Or should I
>> do the housekeeping myself manually? (Pretty easy to script). Do I need to
>> expect any more side effects when killing the session with "yarn
>> application -kill"?
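
For the directories that have already accumulated, the manual housekeeping mentioned here might look roughly like the sketch below. It rests on several assumptions: `STAGING_ROOT` is taken from the listing above, the function name `stale_staging_dirs` is made up, and the `yarn` and `hdfs` calls appear only as commented-out wiring that has not been run against a real cluster. The function itself only filters a list of paths, so it can be tried as a dry run before anything is deleted.

```shell
#!/usr/bin/env bash
# Sketch: find .flink staging directories whose YARN application is gone.
# STAGING_ROOT is an assumption based on the listing quoted in this thread.
STAGING_ROOT="${STAGING_ROOT:-/user/dev/.flink}"

# Reads directory paths from stdin (one per line) and prints those whose
# last path component is an application ID NOT contained in $1, a
# newline-separated list of application IDs that are still active.
stale_staging_dirs() {
  local active_ids="${1:-}" dir app_id
  while IFS= read -r dir; do
    app_id="${dir##*/}"
    case "${app_id}" in
      application_*) ;;   # looks like a staging directory
      *) continue ;;      # skip headers such as "Found N items"
    esac
    if ! printf '%s\n' "${active_ids}" | grep -qxF "${app_id}"; then
      printf '%s\n' "${dir}"
    fi
  done
}

# Possible wiring, shown as a dry run that only prints candidates:
#   active="$(yarn application -list -appStates NEW,NEW_SAVING,SUBMITTED,ACCEPTED,RUNNING \
#             | awk '$1 ~ /^application_/ {print $1}')"
#   hdfs dfs -ls "${STAGING_ROOT}" | awk '{print $NF}' | stale_staging_dirs "${active}"
# After reviewing the output, each printed directory could be removed with
#   hdfs dfs -rm -r <dir>
```

Note that if the `yarn application -list` call fails and returns nothing, every directory would be reported as stale, so the candidate list should be reviewed before deleting.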
>>
>> Best regards
>> Theo
>>
>> --
>> SCOOP Software GmbH - Gut Maarhausen - Eiler Straße 3 P - D-51107 Köln
>> Theo Diefenthal
>>
>> T +49 221 801916-196 - F +49 221 801916-17 - M +49 160 90506575
>> theo.diefent...@scoop-software.de - www.scoop-software.de
>> Sitz der Gesellschaft: Köln, Handelsregister: Köln,
>> Handelsregisternummer: HRB 36625
>> Geschäftsführung: Dr. Oleg Balovnev, Frank Heinen,
>> Martin Müller-Rohde, Dr. Wolfgang Reddig, Roland Scheel
>>
>
