Guowei Ma created FLINK-39807:
---------------------------------
Summary: [umbrella] FLIP-577: AI-Native Flink — Umbrella for
Multimodal Data Processing
Key: FLINK-39807
URL: https://issues.apache.org/jira/browse/FLINK-39807
Project: Flink
Issue Type: New Feature
Reporter: Guowei Ma
This is the umbrella issue tracking FLIP-577: AI-Native Flink — An Umbrella
Proposal for Multimodal Data Processing.
_FLIP:_
[FLIP-577|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=421957275]
_Discussion thread:_
[dev@ thread|https://lists.apache.org/thread/l5obwo4kvjblqnx24btsv4rwjwoflb6d]
User workloads are shifting from BI analytics to multimodal data processing
centered on model inference: data expands from structured records to images,
video, audio, and embeddings; resources move from CPU to mixed CPU/GPU; and
execution moves from row-oriented to vectorized batch processing. This umbrella
proposes evolving Flink from a unified stream-batch compute engine into one
that natively supports AI workloads (AI-Native). The work is decomposed into 11
sub-FLIPs across three layers:
* _Layer 1 — Core Runtime Primitives:_ RpcOperator; multimodal type system and
OBJECT_REF.
* _Layer 2 — Workload Expression and Execution:_ Python DataFrame API;
multimodal Source/Sink connector API; GPU resource declaration and independent
deployment; built-in multimodal operators and AI functions; Arrow columnar
transport.
* _Layer 3 — Production-Grade Operational Guarantees:_ non-disruptive scaling
for CPU and GPU operators; Unaligned Checkpoint enhancements;
Pipeline-Region-based independent checkpoints.
Most sub-FLIPs have no hard dependencies on one another and can be advanced in
parallel. This umbrella seeks consensus on the overall direction only; detailed
design and APIs are deferred to each sub-FLIP. All changes are incremental.
Sub-FLIPs will be tracked as separate issues and linked here.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)