lgo opened a new pull request #12345: URL: https://github.com/apache/flink/pull/12345
## WIP I was hoping to get a first round of feedback for this implementation. This branch is currently passing tests, but there was additional work to clean it up: - [ ] More testing of the new implementation. - [ ] Add benchmarking to RocksDB for the two implementations used here. - [ ] Add more configuration parameters to pass throguh to the writers. - [ ] Compare performance of save-point recovery on our production instanace, as it has a large state. ## What is the purpose of the change This adds a new mode of batch writing keys into RocksDB, via `RocksDBSSTIngestWriter`, which should provide a considerable performance improvement to some operations such as save-point recovery. This is in reference to the discussion on the users maillist I brought up, [here](http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/RocksDB-savepoint-recovery-performance-improvements-td35238.html), as well as the previously reported ticket: https://issues.apache.org/jira/browse/FLINK-17288. The first commit is also replaces one use of RocksDB with a more efficient operation (`deleteRange`). ## Brief change log (commit 1) - Replaced an iterate and `delete` operation with `RocksDB#deleteRange`. (commit 2) - Refactored the use of `RocksDBWriteBatchWrapper` into using a factory (`RocksDBWriterFactory`) and interface (`RocksDBWriter`), in preperation of adding a second implementation. (commit 3) - Added `RockSDBSSTWriter`, which is a basic wrapper for `SstFileWriter` in order to create `sst` files. - Added `RocksDBSSTIngestWriter`, which uses the `RockSDBSSTWriter`, and provides a write-interface for batch writing k/v into RocksDB. This includes flushing and handling multiple column-families. - Added new configuration for opting into the writer, as well as tuning parameters. This configuration was plumbed into `RocksDBWriterFactory`. ## Verifying this change This change is already covered by existing tests, such as: - RocksDB savepoint and checkpoint tests. This change added tests and can be verified as follows: - Added new tests for the `RocksDBSSTWriter` and `RocksDBSSTIngestWriter`. - [ ] **TODO** Add more rigourous tests for the new implementation. - [ ] **TODO** Extend existing tests to test both writer implementations. - [ ] **TODO** Manually verified the change running on a cluster. - [ ] **TODO** Write benchmarks for https://github.com/facebook/rocksdb to compare the two writing methods. ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): **no** - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: **no** - The serializers: **no** - The runtime per-record code paths (performance sensitive): **yes** - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: **yes** - The S3 file system connector: **no** ## Documentation - Does this pull request introduce a new feature? **yes** - If yes, how is the feature documented? **TODO: not yet documented** ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected]
