[ 
https://issues.apache.org/jira/browse/FLINK-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Metzger updated FLINK-10434:
-----------------------------------
    Component/s: Runtime / Coordination

> Blob file not removed after job execution.
> ------------------------------------------
>
>                 Key: FLINK-10434
>                 URL: https://issues.apache.org/jira/browse/FLINK-10434
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Coordination
>    Affects Versions: 1.5.2
>         Environment: Flink container running on Yarn.
>            Reporter: Piotr Szczepanek
>            Priority: Major
>
> We have Flink container running on Yarn, and we're using YarnClusterClient 
> for job submission. After successfully/failed job execution it looks like 
> blob file for that job is deleted, but there is still handle from Flink 
> process to that file. As a result the file is
> not removed from machine, until we restart the whole Flink container.
> From the size comparison it looks like mentioned file is actually our jar 
> with Flink job.
> This is quite big problem for us, as we are submitting many jobs, every 
> submission upload new jar and created new blob file, which is never removed 
> from disc until we restart container. We already faced out of disc space.
> Results of lsof are:
> During job execution:
> lsof /flinkDir | grep job_dbafb671b0d60ed8a8ec2651fe59303b
> java 11883 yarn mem REG 253,2 112384928 109973177
> /flinkDir/yarn/../application_1536668870638_5555/blobStore-a1bcdbd4-5388-4c56-8052-6051f5af38dd/job_dbafb671b0d60ed8a8ec2651fe59303b/blob_p-8771d9ccac35e28d8571ac8957feaaecdebaeadd-7748aec7fe7369ca26181d0f94b1a578
> java 11883 yarn 1837r REG 253,2 112384928 109973177
> /flinkDir/yarn/../application_1536668870638_5555/blobStore-a1bcdbd4-5388-4c56-8052-6051f5af38dd/job_dbafb671b0d60ed8a8ec2651fe59303b/blob_p-8771d9ccac35e28d8571ac8957feaaecdebaeadd-7748aec7fe7369ca26181d0f94b1a578
> After job execution:
> lsof /flinkDir | grep job_dbafb671b0d60ed8a8ec2651fe59303b
> java 11883 yarn DEL REG 253,2 109973177
> /flinkDir/yarn/../application_1536668870638_5555/blobStore-a1bcdbd4-5388-4c56-8052-6051f5af38dd/job_dbafb671b0d60ed8a8ec2651fe59303b/blob_p-8771d9ccac35e28d8571ac8957feaaecdebaeadd-7748aec7fe7369ca26181d0f94b1a578
> java 11883 yarn 1837r REG 253,2 112384928 109973177
> /flinkDir/yarn/../application_1536668870638_5555/blobStore-a1bcdbd4-5388-4c56-8052-6051f5af38dd/job_dbafb671b0d60ed8a8ec2651fe59303b/blob_p-8771d9ccac35e28d8571ac8957feaaecdebaeadd-7748aec7fe7369ca26181d0f94b1a578
> *(deleted)*
> After restarting Flink container this handle disappeared.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to