sjwiesman opened a new pull request #13895:
URL: https://github.com/apache/flink/pull/13895
## What is the purpose of the change
This PR adds the new public classes described in FLIP-142:
`JobManagerCheckpointStorage`, `FileSystemCheckpointStorage`, `HashMapStateBackend`,
and `EmbeddedRocksDBStateBackend`. It builds on #13797, so only the last three commits
are relevant to this review.
## Brief change log
4d944f8 Adds `setDefaultSavepointDir` to StreamExecutionEnvironment and
wires it into the StreamConfig
409d151 Adds the two checkpoint storage implementations,
`JobManagerCheckpointStorage` and `FileSystemCheckpointStorage`. To maintain
backward compatibility with existing flink-conf files, the configuration option
`state.checkpoint-storage` needs to be optional. When it is unset, the default is
`FileSystemCheckpointStorage` if a checkpoint directory is provided and
`JobManagerCheckpointStorage` otherwise. I believe this will be the least
surprising behavior for users, as virtually everyone who sets a checkpoint directory
wants the full scalability of filesystem storage and not just externalized
JobManager storage. Most of the tests come in the next commit, where there are
state backends to test these storage types with.
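The defaulting rule described for `state.checkpoint-storage` can be sketched roughly as follows. This is a minimal, self-contained illustration, not the loader code from this PR: the class `CheckpointStorageDefaults`, the enum `StorageKind`, and the accepted names `"jobmanager"` and `"filesystem"` are all assumptions made for the example.

```java
import java.util.Locale;
import java.util.Optional;

/** Hypothetical sketch of the defaulting rule; not the PR's actual loader code. */
public class CheckpointStorageDefaults {

    /** Stand-ins for the two storage implementations named in this PR. */
    public enum StorageKind { JOB_MANAGER, FILE_SYSTEM }

    public static StorageKind resolve(Optional<String> configuredStorage,
                                      Optional<String> checkpointDir) {
        if (configuredStorage.isPresent()) {
            // An explicit setting always wins. The accepted names are assumed.
            String name = configuredStorage.get().trim().toLowerCase(Locale.ROOT);
            switch (name) {
                case "jobmanager":
                    return StorageKind.JOB_MANAGER;
                case "filesystem":
                    return StorageKind.FILE_SYSTEM;
                default:
                    throw new IllegalArgumentException(
                            "Unknown checkpoint storage: " + name);
            }
        }
        // Option unset: a checkpoint directory implies filesystem storage,
        // otherwise fall back to JobManager storage.
        return checkpointDir.isPresent()
                ? StorageKind.FILE_SYSTEM
                : StorageKind.JOB_MANAGER;
    }

    public static void main(String[] args) {
        System.out.println(resolve(Optional.empty(), Optional.of("hdfs:///flink/checkpoints")));
        System.out.println(resolve(Optional.empty(), Optional.empty()));
    }
}
```

Under these assumptions, an existing configuration that only sets a checkpoint directory keeps using filesystem storage without any change.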
15e4b86 Adds the new state backends `HashMapStateBackend` and
`EmbeddedRocksDBStateBackend`. Note to reviewers: I changed the existing
`RocksDBStateBackend` to be a thin wrapper around the new
`EmbeddedRocksDBStateBackend`. The RocksDB backend has so many configuration
options and is complex enough that I did not feel I could reasonably get enough
test coverage over two separate implementations. This way, every existing test
in Flink that uses RocksDB implicitly tests the new implementation, so we can
feel confident in its correctness. This commit also deprecates the now-legacy
state backends.
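The thin-wrapper approach can be illustrated with a small delegation sketch. It is self-contained and deliberately simplified: `NewBackend`, `LegacyBackend`, and their methods are hypothetical stand-ins, not the real `EmbeddedRocksDBStateBackend` or `RocksDBStateBackend` APIs.

```java
import java.util.Objects;

/** Hypothetical sketch of the thin-wrapper approach; all names are illustrative. */
public class ThinWrapperSketch {

    /** Stand-in for the new backend, which owns all configuration and behavior. */
    public static class NewBackend {
        private boolean incremental;

        public void setIncremental(boolean incremental) {
            this.incremental = incremental;
        }

        public boolean isIncremental() {
            return incremental;
        }

        public String describe() {
            return "new-backend(incremental=" + incremental + ")";
        }
    }

    /**
     * Stand-in for the legacy backend. It keeps only what the new backend does
     * not own (here, a checkpoint directory) and forwards everything else.
     */
    public static class LegacyBackend {
        private final NewBackend delegate = new NewBackend();
        private final String checkpointDir;

        public LegacyBackend(String checkpointDir) {
            this.checkpointDir = Objects.requireNonNull(checkpointDir);
        }

        public void setIncremental(boolean incremental) {
            delegate.setIncremental(incremental); // forwarded, no duplicated logic
        }

        public boolean isIncremental() {
            return delegate.isIncremental();
        }

        public String describe() {
            return delegate.describe();
        }

        public String getCheckpointDir() {
            return checkpointDir;
        }
    }
}
```

Because every call on the legacy class lands in the new one, a test written against the legacy class exercises the new code path as well, which is the coverage argument made above.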
## Verifying this change
Unit and integration tests, plus all existing tests that use RocksDB.
## Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): (yes / **no**)
- The public API, i.e., is any changed class annotated with
`@Public(Evolving)`: (**yes** / no)
- The serializers: (yes / **no** / don't know)
- The runtime per-record code paths (performance sensitive): (yes / **no**
/ don't know)
- Anything that affects deployment or recovery: JobManager (and its
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / **no** /
don't know)
- The S3 file system connector: (yes / **no** / don't know)
## Documentation
- Does this pull request introduce a new feature? (**yes** / no)
- If yes, how is the feature documented? (not applicable / docs /
**JavaDocs** / not documented) I plan to add documentation in a follow-up PR.