sjwiesman opened a new pull request #13912:
URL: https://github.com/apache/flink/pull/13912
## What is the purpose of the change
This PR adds the new public classes described in FLIP-142:
`JobManagerCheckpointStorage`, `FileSystemCheckpointStorage`, `HashMapStateBackend`,
and `EmbeddedRocksDBStateBackend`. It builds on #13797, so only the last 4 commits
are relevant.
## Brief changelog
da0eeec Adds `setDefaultSavepointDir` to StreamExecutionEnvironment and
wires it into the StreamConfig
eb6ac01 Adds the two checkpoint storage implementations,
`JobManagerCheckpointStorage` and `FileSystemCheckpointStorage`. To maintain
backward compatibility with existing flink-conf files, the configuration option
`state.checkpoint-storage` needs to be optional. The default is
`FileSystemCheckpointStorage` if a checkpoint directory is provided and
`JobManagerCheckpointStorage` otherwise. I believe this will be the least
surprising behavior for users, as virtually everyone who sets a checkpoint
directory wants the full scalability of filesystem storage and not just
externalized JobManager storage. Most of the tests come in the next commit,
where there are state backends to test these storage types with.
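For reviewers, the defaulting rule described above can be sketched roughly as follows. This is a minimal illustration and not the loader code in this PR: the class `CheckpointStorageDefaults`, the enum `StorageKind`, and the shortcut names `"jobmanager"` and `"filesystem"` are assumptions made up for the example.

```java
import java.util.Optional;

// Hypothetical sketch of the defaulting rule; not the actual Flink loader.
final class CheckpointStorageDefaults {

    enum StorageKind { JOB_MANAGER, FILE_SYSTEM }

    // configuredStorage: value of `state.checkpoint-storage`, if the user set it.
    // checkpointDir: the configured checkpoint directory, if any.
    static StorageKind resolve(Optional<String> configuredStorage,
                               Optional<String> checkpointDir) {
        if (configuredStorage.isPresent()) {
            // An explicit setting always wins (shortcut names are assumed here).
            String name = configuredStorage.get();
            switch (name) {
                case "jobmanager":
                    return StorageKind.JOB_MANAGER;
                case "filesystem":
                    return StorageKind.FILE_SYSTEM;
                default:
                    throw new IllegalArgumentException(
                            "Unknown checkpoint storage: " + name);
            }
        }
        // Option not set: fall back based on whether a checkpoint directory exists.
        return checkpointDir.isPresent()
                ? StorageKind.FILE_SYSTEM
                : StorageKind.JOB_MANAGER;
    }
}
```

Under this sketch, an existing flink-conf that only sets a checkpoint directory resolves to filesystem storage, and one that sets neither resolves to JobManager storage.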
4b6c9a1 Adds the new state backends `HashMapStateBackend` and
`EmbeddedRocksDBStateBackend`. Note to reviewers: I changed the existing
`RocksDBStateBackend` to be a thin wrapper around the new
`EmbeddedRocksDBStateBackend`. It has so many configuration options and is
complex enough that I did not feel I could reasonably achieve enough test
coverage over both implementations. This way, every existing test in Flink that
uses RocksDB implicitly tests the new implementation, so we can feel confident
in its correctness. This commit also deprecates the now-legacy state backends.
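The thin-wrapper approach is plain delegation. The sketch below shows the general shape only: `Backend`, `NewEmbeddedBackend`, and `LegacyBackend` are hypothetical stand-ins and do not mirror the real signatures of `EmbeddedRocksDBStateBackend` or `RocksDBStateBackend`.

```java
// Hypothetical stand-in for a state backend interface.
interface Backend {
    String createKeyedState(String operatorId);
}

// Stand-in for the new implementation that holds all of the real logic.
final class NewEmbeddedBackend implements Backend {
    private boolean incremental;

    void setIncremental(boolean incremental) {
        this.incremental = incremental;
    }

    @Override
    public String createKeyedState(String operatorId) {
        return "rocksdb:" + operatorId + (incremental ? ":incremental" : ":full");
    }
}

// Stand-in for the legacy class: it keeps its public surface but forwards
// every call, so tests written against it exercise the new implementation.
@Deprecated
final class LegacyBackend implements Backend {
    private final NewEmbeddedBackend delegate = new NewEmbeddedBackend();

    void setIncremental(boolean incremental) {
        delegate.setIncremental(incremental);
    }

    @Override
    public String createKeyedState(String operatorId) {
        return delegate.createKeyedState(operatorId);
    }
}
```

Because the legacy class holds no logic of its own, any behavioral difference between the two would have to come from the forwarding code, which is small enough to review by hand.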
28753e5 Adds the new methods and classes to PyFlink
## Verifying this change
UT / IT tests and all existing tests that use RocksDB.
## Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): (yes / **no**)
- The public API, i.e., is any changed class annotated with
`@Public(Evolving)`: (**yes** / no)
- The serializers: (yes / **no** / don't know)
- The runtime per-record code paths (performance sensitive): (yes / **no**
/ don't know)
- Anything that affects deployment or recovery: JobManager (and its
components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: (yes / **no** /
don't know)
- The S3 file system connector: (yes / **no** / don't know)
## Documentation
- Does this pull request introduce a new feature? (**yes** / no)
- If yes, how is the feature documented? (not applicable / docs /
**JavaDocs** / not documented) I plan to add documentation in a follow-up PR.