GitHub user uce opened a pull request:
https://github.com/apache/flink/pull/2256
[FLINK-4150] [runtime] Don't clean up BlobStore on BlobServer shut down
The `BlobServer` acts as a local cache for uploaded BLOBs. The life-cycle
of each BLOB is bound to the life-cycle of the `BlobServer`. If the BlobServer
shuts down (on JobManager shut down), all local files will be removed.
With HA, BLOBs are persisted to another file system (e.g. HDFS) via the
`BlobStore` in order to have BLOBs available after a JobManager failure (or
shut down). These BLOBs are only allowed to be removed when the job that
requires them enters a globally terminal state (`FINISHED`, `CANCELLED`,
`FAILED`).
This commit removes the `BlobStore` clean up call from the `BlobServer`
shutdown. The `BlobStore` files will only be cleaned up via the
`BlobLibraryCacheManager`'s clean up task (periodically or on
`BlobLibraryCacheManager` shutdown). This means that BLOBs can linger
after the job has terminated if the JobManager fails before the clean up
runs.
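The intended split of responsibilities can be sketched roughly as follows.
This is a minimal illustration only, not the code in this pull request: the
class and method names (`SketchBlobStore`, `SketchBlobServer`,
`SketchCacheManager`, `cleanup`) are hypothetical stand-ins and do not match
Flink's actual signatures, and in-memory sets stand in for the local and HA
file systems.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; none of these types are Flink's real classes.
class BlobLifecycleSketch {

    /** Stand-in for the HA BlobStore (e.g. files persisted on HDFS). */
    static class SketchBlobStore {
        final Set<String> persisted = new HashSet<>();

        void put(String key) { persisted.add(key); }

        void delete(String key) { persisted.remove(key); }
    }

    /** Stand-in for the BlobServer: a local cache in front of the store. */
    static class SketchBlobServer {
        final Set<String> localFiles = new HashSet<>();
        final SketchBlobStore store;

        SketchBlobServer(SketchBlobStore store) { this.store = store; }

        void upload(String key) {
            localFiles.add(key); // local cache copy
            store.put(key);      // persisted copy for recovery
        }

        void shutdown() {
            // Only the local cache is removed. The persisted copies are
            // deliberately left alone so a recovering JobManager can use them.
            localFiles.clear();
        }
    }

    /** Stand-in for the library cache manager's clean up task. */
    static class SketchCacheManager {
        final SketchBlobStore store;

        SketchCacheManager(SketchBlobStore store) { this.store = store; }

        /** Deletes a persisted BLOB only if its job is globally terminal. */
        void cleanup(String key, boolean jobGloballyTerminal) {
            if (jobGloballyTerminal) {
                store.delete(key);
            }
        }
    }

    public static void main(String[] args) {
        SketchBlobStore store = new SketchBlobStore();
        SketchBlobServer server = new SketchBlobServer(store);
        SketchCacheManager manager = new SketchCacheManager(store);

        server.upload("job-1.jar");
        server.shutdown();
        System.out.println("local files removed: " + server.localFiles.isEmpty());
        System.out.println("still persisted: " + store.persisted.contains("job-1.jar"));

        manager.cleanup("job-1.jar", true);
        System.out.println("persisted after clean up: " + store.persisted.contains("job-1.jar"));
    }
}
```

In this sketch, shutting down the server empties the local cache but leaves
the persisted BLOB in place. Only the clean up call for a globally terminal
job removes it. If that call never runs, the persisted BLOB stays behind,
which is the lingering case described above.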
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/uce/flink 4150-blobstore
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/2256.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2256
----
commit 0d4522270881dbbb7164130f47f9d4df617c19c5
Author: Ufuk Celebi <[email protected]>
Date: 2016-07-14T14:29:49Z
[FLINK-4150] [runtime] Don't clean up BlobStore on BlobServer shut down
----