Hi Biao Geng,

Thanks for driving this FLIP. It addresses a real gap in how
Flink represents multimodal data.

From a connector developer's perspective, I am strongly in favor of
introducing `VECTOR` as a first-class type. Today, the lack of a vector
type in Flink's type system forces connectors that integrate with
vector-capable downstream systems (e.g. Milvus, Paimon) to fall back on
`ARRAY`, which cannot express vector semantics such as a fixed dimension.

Take Paimon as an example: Paimon already provides a native `VectorType`
(fixed element type + length), but since Flink has no equivalent type,
`paimon-flink` has to bridge it through `ARRAY` plus extra table options
(e.g. `vector-field` and `field.<name>.vector-dim`) to carry the dimension
that the schema itself cannot express.

A first-class `VECTOR('float32', 768)` would let connectors map directly to
the native vector fields of downstream systems and validate dtype/dimension
from the schema, without lossy `ARRAY` conversion or out-of-band options.
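
To make the difference concrete, here is a minimal Python sketch of where a
connector would get the dimension from in each case. It is not Flink or
Paimon code: `ArrayType`, `VectorType`, `resolve_dim`, and `validate` are
hypothetical stand-ins, and only the option key pattern
`field.<name>.vector-dim` is taken from the Paimon example above.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class ArrayType:
    """Hypothetical stand-in for ARRAY<FLOAT>: the type carries no dimension."""
    element: str


@dataclass(frozen=True)
class VectorType:
    """Hypothetical stand-in for the proposed VECTOR(dtype, dim)."""
    dtype: str
    dim: int


def resolve_dim(field: str,
                col_type: Union[ArrayType, VectorType],
                options: Dict[str, str]) -> int:
    """Return the vector dimension a connector should enforce for `field`."""
    if isinstance(col_type, VectorType):
        # First-class type: the dimension is part of the schema itself.
        return col_type.dim
    # ARRAY bridge: the dimension must come from an out-of-band table option.
    key = f"field.{field}.vector-dim"
    if key not in options:
        raise ValueError(f"missing table option {key!r} for ARRAY column {field!r}")
    return int(options[key])


def validate(value: List[float], dim: int) -> None:
    """Reject values whose length does not match the expected dimension."""
    if len(value) != dim:
        raise ValueError(f"expected {dim} elements, got {len(value)}")


# With VECTOR, the schema alone is enough.
dim_from_schema = resolve_dim("emb", VectorType("float32", 768), {})

# With ARRAY, the same information has to be supplied separately.
dim_from_option = resolve_dim("emb", ArrayType("float"),
                              {"field.emb.vector-dim": "768"})

validate([0.0] * 768, dim_from_schema)
```

In the `ARRAY` case, a forgotten or mistyped option only surfaces when the
connector looks it up, whereas the `VECTOR` case has nothing extra to forget.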

+1 on the `VECTOR` type.

Best regards,
Yanquan

On Fri, Jun 12, 2026 at 10:25, Geng Biao <[email protected]> wrote:

> Hi everyone,
>
> Dylanhz and I would like to start a discussion on FLIP-590: Introduce
> Multimodal Data Types: Vector, Tensor, and Image [1].
>
> This FLIP follows the direction of FLIP-577 and proposes first-class
> multimodal data types for AI-oriented Flink pipelines.
>
> Today, values such as embeddings, tensors, and decoded images are commonly
> represented as VARBINARY, STRING, ARRAY, or custom ROW structures. These
> encodings can work, but they lose important semantics such as element
> dtype, tensor shape, vector dimension, image mode, and decoded image
> layout. This makes it hard for SQL/Table, DataStream, Java UDFs, PyFlink
> UDFs, and connectors to share a stable contract.
>
> The FLIP proposes three new logical data types:
>
> - TENSOR: a dense n-dimensional tensor with element dtype and optional
> fixed shape.
> - VECTOR: a dense fixed-dimension one-dimensional vector for embeddings,
> feature vectors, and vector database integration.
> - IMAGE: a decoded static image value with mode, height, width, and HWC
> pixel data.
>
> It also introduces ElementDType as a shared element dtype enum for tensor,
> vector, and image payloads. This allows multimodal dtypes such as uint8,
> uint32, uint64, float32, and float64 without adding top-level unsigned SQL
> scalar types to Flink.
>
> The proposal is intentionally scoped to type semantics, runtime
> representation, Java/PyFlink APIs, serialization, and connector/format
> boundaries. It does not aim to turn Flink into a tensor computation or
> image processing framework.
>
> Looking forward to your feedback.
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-590%3A+Introduce+Multimodal+Data+Type%3A++Vector%2C+Tensor%2C+and+Image
>
> Best,
> Biao Geng
