dylanhz opened a new pull request, #27778: URL: https://github.com/apache/flink/pull/27778
## What is the purpose of the change

This pull request adds Table API/SQL support for the BITMAP data type introduced in [FLIP-556](https://cwiki.apache.org/confluence/display/FLINK/FLIP-556%3A+Introduce+BITMAP+Data+Type). It integrates BITMAP into Flink's type system, internal data format, planner, and code generation, enabling BITMAP columns to be used in SQL queries and Table API programs.

This is the third PR in the FLIP-556 series:

- PR 1 (`FLINK-39183`): Parser support
- PR 2 (`FLINK-39184`): DataStream API support (`flink-core`)
- **PR 3 (`FLINK-39185`): Table API/SQL support (this PR)**

## Brief change log

Suggested review order:

1. **LogicalType system**: Added `BitmapType`, `LogicalTypeRoot.BITMAP`, `LogicalTypeFamily.EXTENSION`, visitor support, cast rules, and type parsing
2. **DataType / API layer**: Added `DataTypes.BITMAP()`, registered type mappings in `ClassDataTypeConverter`, `TypeInfoDataTypeConverter`, and `ValueDataTypeConverter`
3. **Internal data format**: Extended `RowData`/`ArrayData` with `getBitmap()`, implemented in `BinaryRowData`/`BinaryArrayData`/`GenericRowData`/`GenericArrayData`/`NestedRowData`; added `BinarySegmentUtils.readBitmap()` and `BinaryWriter.writeBitmap()`
4. **Planner integration**: Added `BitmapRelDataType`, integrated into `FlinkTypeFactory` (bidirectional conversion between `BitmapType` and `BitmapRelDataType`), extended `CodeGenUtils` for code generation, and updated `ExpressionReducer`
5. **Cast rules**: Added `BitmapToStringCastRule` and `BitmapToBinaryCastRule` (with trim/pad semantics); restricted `CAST(x AS BITMAP)` in `SqlCastFunction`
6. **Data converters**: Added `BitmapBitmapConverter`, `DataFormatConverters.BitmapConverter`, and JSON serialization/deserialization for `BitmapType`

## Verifying this change

This change added tests and can be verified as follows:

- `BitmapSemanticTest`: End-to-end integration tests for BITMAP in SQL/Table API, covering source/sink roundtrip, projection, filtering, UDF invocation, and UDAF aggregation
- `BinaryRowDataTest` / `BinaryArrayDataTest`: Unit tests for BITMAP read/write in binary row and array formats
- `RowDataTest`: Verifies BITMAP field access and `FieldGetter` in `RowData`
- `DataTypesTest`: Verifies `DataTypes.BITMAP()` resolution and class mapping
- `LogicalTypesTest`: Tests `BitmapType` properties, serialization string, and cast compatibility
- `ProjectionCodeGeneratorTest`: Verifies BITMAP field projection in generated code
- `TypeInferenceExtractorTest`: Tests type inference for UDFs that accept/return BITMAP, including rejection of custom Bitmap implementations

## Does this pull request potentially affect one of the following parts:

- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: yes (`RowData`, `ArrayData`, `DataTypes`, `BinaryWriter`)
- The serializers: no
- The runtime per-record code paths (performance sensitive): yes (new `getBitmap`/`writeBitmap` code paths, but only activated for BITMAP type columns)
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
- The S3 file system connector: no

## Documentation

- Does this pull request introduce a new feature? yes
- If yes, how is the feature documented? not documented (documentation will be added when the full BITMAP type support is complete, including built-in functions)
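The "trim/pad semantics" noted for `BitmapToBinaryCastRule` in the change log are not spelled out in this description. As a minimal sketch, assuming they follow the usual fixed-length rule for a `BINARY(n)` target (longer input is truncated, shorter input is zero-padded), the effect on the serialized bytes might look like the following. The class `BinaryTrimPadSketch` and the method `trimOrPad` are hypothetical names used only for illustration, and the bitmap's serialized form is stood in for by a plain byte array.

```java
import java.util.Arrays;

public class BinaryTrimPadSketch {

    // Hypothetical helper, not part of the PR: fits serialized bytes
    // to a fixed target length, as assumed for a BINARY(n) cast target.
    public static byte[] trimOrPad(byte[] serialized, int targetLength) {
        // Arrays.copyOf truncates when the input is longer than targetLength
        // and pads with zero bytes when it is shorter.
        return Arrays.copyOf(serialized, targetLength);
    }

    public static void main(String[] args) {
        byte[] bytes = {1, 2, 3, 4};
        System.out.println(Arrays.toString(trimOrPad(bytes, 2))); // [1, 2]
        System.out.println(Arrays.toString(trimOrPad(bytes, 6))); // [1, 2, 3, 4, 0, 0]
    }
}
```

Whether the actual rule trims a serialized bitmap this way, or rejects targets that are too short, should be checked against the cast rule's tests in the PR.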
