danmagyar opened a new pull request #12636:
URL: https://github.com/apache/flink/pull/12636
## What is the purpose of the change
*This pull request introduces a new config option
`historyserver.max.history.size`, and with it the ability to limit the number
of job archives kept in the history server, similar to Spark’s
`spark.history.retainedApplications`.*
* Upon refreshing the archive directories, if the number of job archives
exceeds the configured limit, the least recently modified archive file is
deleted and its content removed from the history server’s cache.
* To provide a straightforward user experience, this config option is
independent of the value set for `historyserver.archive.clean-expired-jobs`.
* The default value of this config option is `50`, similar to the
respective default value in Spark.
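
The cleanup rule described above can be sketched roughly as follows. This is an illustration using plain `java.nio.file` calls on a local directory, not the code in this pull request: the class name `ArchiveRetentionSketch`, the method `enforceLimit`, and the parameter `maxHistorySize` are hypothetical, and the sketch leaves out the history server's cache handling.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Flink: keeps at most maxHistorySize files in a directory.
public class ArchiveRetentionSketch {

    /** Deletes the least recently modified files beyond the limit and returns the deleted paths. */
    public static List<Path> enforceLimit(Path archiveDir, int maxHistorySize) throws IOException {
        List<Path> archives;
        try (Stream<Path> files = Files.list(archiveDir)) {
            // Sort ascending by modification time, so the oldest archives come first.
            archives = files
                    .filter(Files::isRegularFile)
                    .sorted(Comparator.comparingLong(ArchiveRetentionSketch::lastModifiedMillis))
                    .collect(Collectors.toList());
        }

        List<Path> deleted = new ArrayList<>();
        int excess = archives.size() - maxHistorySize;
        for (int i = 0; i < excess; i++) {
            Path oldest = archives.get(i);
            Files.delete(oldest);
            deleted.add(oldest);
        }
        return deleted;
    }

    private static long lastModifiedMillis(Path path) {
        try {
            return Files.getLastModifiedTime(path).toMillis();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch the limit is applied on every call, regardless of any expiry-based cleanup, which mirrors the stated independence from `historyserver.archive.clean-expired-jobs`.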
## Brief change log
- *The `HistoryServer` constructor loads the config option and passes it
to the `HistoryServerArchiveFetcher`*
- *The `HistoryServerArchiveFetcher`’s `JobArchiveFetcherTask`, upon
refreshing the archive directories, removes the least recently modified
archives from the filesystem and the cache*
## Verifying this change
This change added tests and can be verified as follows:
- *Added a test that creates job archive files under the jobmanager’s
directory, with `lastModified` values 1 minute apart from each other, both
before and after the history server starts, and validates that the least
recently modified files are removed upon refresh*
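
A test fixture of the kind described here could look roughly like the sketch below. It is not the test added in this pull request: the class `ArchiveFixtureSketch`, the method `createArchives`, and the `job-archive-N` file names are hypothetical, and it only shows how to create files whose modification times are one minute apart.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical test helper, not part of Flink.
public class ArchiveFixtureSketch {

    /**
     * Creates `count` empty files in `dir`, ordered oldest first, with
     * modification times spaced one minute apart and ending at `newest`.
     */
    public static List<Path> createArchives(Path dir, int count, Instant newest) throws IOException {
        List<Path> created = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            Path archive = Files.createFile(dir.resolve("job-archive-" + i));
            // File 0 is the oldest; the last file carries the `newest` timestamp.
            Instant modified = newest.minus(Duration.ofMinutes(count - 1 - i));
            Files.setLastModifiedTime(archive, FileTime.from(modified));
            created.add(archive);
        }
        return created;
    }
}
```

Setting the timestamps explicitly, rather than sleeping between file creations, keeps such a test fast and makes the expected deletion order deterministic.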
## Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with
`@Public(Evolving)`: no
- The serializers: no
- The runtime per-record code paths (performance sensitive): no
- Anything that affects deployment or recovery: JobManager (and its
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: no
- The S3 file system connector: no
## Documentation
- Does this pull request introduce a new feature? yes
- If yes, how is the feature documented? docs