This is an automated email from the ASF dual-hosted git repository.

guoweijie pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/flink.git
commit aaa88d4e6b3d9a493e713c48d9ddcef9b2ce3a34
Author: Zhanghao Chen <m...@outlook.com>
AuthorDate: Mon Apr 8 22:07:01 2024 +0800

    [FLINK-34123][docs][type] Add doc for built-in serialization support for Map, List, and Collection
---
 .../serialization/types_serialization.md | 26 +++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/docs/content/docs/dev/datastream/fault-tolerance/serialization/types_serialization.md b/docs/content/docs/dev/datastream/fault-tolerance/serialization/types_serialization.md
index 0d4441f4d3f..766024ccd75 100644
--- a/docs/content/docs/dev/datastream/fault-tolerance/serialization/types_serialization.md
+++ b/docs/content/docs/dev/datastream/fault-tolerance/serialization/types_serialization.md
@@ -37,15 +37,16 @@ Flink places some restrictions on the type of elements that can be in a DataStream.
 The reason for this is that the system analyzes the types to determine
 efficient execution strategies.
 
-There are seven different categories of data types:
+There are eight different categories of data types:
 
 1. **Java Tuples** and **Scala Case Classes**
 2. **Java POJOs**
 3. **Primitive Types**
-4. **Regular Classes**
-5. **Values**
-6. **Hadoop Writables**
-7. **Special Types**
+4. **Common Collection Types**
+5. **Regular Classes**
+6. **Values**
+7. **Hadoop Writables**
+8. **Special Types**
 
 #### Tuples and Case Classes
@@ -167,6 +168,21 @@ input.keyBy(_.word)
 
 Flink supports all Java and Scala primitive types such as `Integer`, `String`, and `Double`.
 
+#### Common Collection Types
+
+Flink comes with dedicated serialization support for common Java collection types, which is more efficient than going
+through a general-purpose serialization framework, as it avoids analyzing and serializing the type metadata.
+Currently, only `Map`, `List`, `Set`, and their super interface `Collection`
+are supported. To utilize it, you need to declare the collection type with:
+
+1. Concrete type arguments: e.g. `List<String>` but not `List`, `List<T>`, or `List<?>`, as Flink needs them to dispatch
+   serialization of the element types.
+2. Interface types: e.g. `List<String>` but not `LinkedList<String>`, as Flink does not preserve the underlying
+   implementation types across serialization.
+
+Collection types that do not meet these requirements will be handled by Flink as general class types. If the
+implementation types also need to be preserved, you need to register them with custom serializers.
+
 #### General Class Types
 
 Flink supports most Java and Scala classes (API and custom).
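The requirement for concrete type arguments in the added doc text comes down to Java type erasure: an element type can only be recovered when it is reified somewhere, such as in a method or field signature. As a rough, non-Flink illustration (the class and method names here are hypothetical, not part of the commit), plain Java reflection shows why `List<String>` carries recoverable element-type information while a raw `List` does not:

```java
import java.lang.reflect.Method;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;

public class TypeArgs {
    // Concrete type argument: the element type survives erasure in the
    // method signature and can be read back via reflection.
    static List<String> qualified() { return List.of(); }

    // Raw type: no element type information is available at all.
    @SuppressWarnings("rawtypes")
    static List raw() { return List.of(); }

    // Returns the name of the first type argument of the given method's
    // return type, or a marker string when it cannot be determined.
    static String elementTypeOf(String methodName) {
        try {
            Method m = TypeArgs.class.getDeclaredMethod(methodName);
            Type t = m.getGenericReturnType();
            if (t instanceof ParameterizedType) {
                ParameterizedType p = (ParameterizedType) t;
                return p.getActualTypeArguments()[0].getTypeName();
            }
            return "unknown (raw type)";
        } catch (NoSuchMethodException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(elementTypeOf("qualified")); // java.lang.String
        System.out.println(elementTypeOf("raw"));       // unknown (raw type)
    }
}
```

A serializer dispatching on element types faces the same situation: for the raw (or wildcard) declaration there is simply no element type to dispatch on, which is why such declarations fall back to general class-type handling.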