kurtostfeld opened a new pull request, #22660: URL: https://github.com/apache/flink/pull/22660
…ith backward compatibility for existing savepoints and checkpoints. ## What is the purpose of the change To upgrade the primary Kryo library used by Flink from v2.x to v5.x, while providing backwards compatibility with existing savepoints and checkponts. This PR adds a new Kryo v5 dependency that is namespaced so that it can coexist with the legacy dependencies that would be kept for compatibility purposes. The existing chill-java library version 0.7.6 would also be deprecated and kept for compatibility purposes only. Some future version of Flink could eventually drop Kryo v2 and chill-java when backwards compatibility with Kryo v2 based state is no longer needed. Why upgrade Kryo: - Kryo v5.x is more compatible with newer versions of Java. I wrote a simple Flink app with Kryo serialized state: When running under Java 17 or Java 21 pre-release builds, the Kryo v2 library fails at runtime, while a prototype build of Flink with this PR can load state serialized with Kryo v5.x running under both Java 17 and Java 21 pre-release builds successfully without failures. - Kryo v5.x supports Java records, while Kryo v2.x does not. - Kryo v2.2.x was released over ten years ago, well before the release of Java 8. Kryo is a maintained project, there have been lots of improvements over the past ten years. Kryo v5 has faster runtime performance, and more memory efficient serialization protocols. Additionally Kryo v5 has fixed lots of bugs. While the Flink project has worked around all Kryo v2 bugs, it would be an improvement to remove the workarounds. Next, Kryo v2 depends on the chill-java library for functionality, where almost all of that functionality is built into Kryo v5.x so that chill-java dependency can be phased out. ## Brief change log This is a large PR with a lot of surface area and risk. I tried to keep the scope of these changes as narrow and simple as possible. I copied all the Kryo v2 code to Kryo v5 equivalents and made necessary adjustments to get everything working. Some Flink serialization code references Flink Java classes by full package name, so I didn't modify the package names or class names of any existing serialization classes. ## Verifying this change There are lots of existing unit tests that cover backward compatibility with existing state and also the serialization framework. This obviously passes all those tests. Additionally, I wrote a Flink application to do a more thorough test of the Kryo upgrade that was difficult to convert into unit test form. https://github.com/kurtostfeld/flink-kryo-upgrade-demo ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): yes. This adds a new Kryo v5 dependency and keeps legacy dependencies for backwards compatibility purposes. - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: yes. Kryo v2 APIs are deprecated. Parallel Kryo v5 APIs are created with PublicEvolving - The serializers: yes. This absolutely affects the serializers. - The runtime per-record code paths (performance sensitive): yes. This should be faster but I haven't done any benchmark testing. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
