[ 
https://issues.apache.org/jira/browse/FLINK-39807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guowei Ma updated FLINK-39807:
------------------------------
    Description: 
This is the umbrella issue tracking FLIP-577: AI-Native Flink — An Umbrella 
Proposal for Multimodal Data Processing.

_FLIP:_ [FLIP-577|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=421957275]

_Discussion thread:_ [dev@ thread|https://lists.apache.org/thread/l5obwo4kvjblqnx24btsv4rwjwoflb6d]

User workloads are shifting from BI analytics to multimodal data processing 
centered on model inference: data expands from structured records to images, 
video, audio, and embeddings; resources move from CPU-only to mixed CPU/GPU; 
and execution moves from row-at-a-time to vectorized batch processing. This 
umbrella proposes evolving Flink from a unified stream-batch compute engine 
into one that natively supports AI workloads (AI-Native), decomposed into 11 
sub-FLIPs across three layers:
 * _Layer 1 — Core Runtime Primitives:_ RpcOperator; multimodal type system and 
OBJECT_REF.
 * _Layer 2 — Workload Expression and Execution:_ Python DataFrame API; 
multimodal Source/Sink connector API; GPU resource declaration and independent 
deployment; built-in multimodal operators and AI functions; Arrow columnar 
transport.
 * _Layer 3 — Production-Grade Operational Guarantees:_ non-disruptive scaling 
for CPU and GPU operators; Unaligned Checkpoint enhancements; 
Pipeline-Region-based independent checkpoints.
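
The shift from row-oriented to vectorized batch execution mentioned above can be illustrated with a minimal, framework-independent sketch. It is not taken from FLIP-577 and uses no Flink API: `CountingModel`, `batched`, `infer_row_oriented`, and `infer_vectorized` are hypothetical names, and the "model" is a placeholder that only counts invocations. The point it tries to show is that grouping rows amortizes per-call overhead without changing the results.

```python
from typing import Iterable, Iterator, List, Sequence


def batched(rows: Iterable[bytes], batch_size: int) -> Iterator[List[bytes]]:
    """Group an input stream into lists of at most batch_size rows."""
    batch: List[bytes] = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # flush the final partial batch


class CountingModel:
    """Hypothetical stand-in for a model; counts how often it is invoked."""

    def __init__(self) -> None:
        self.calls = 0

    def predict(self, inputs: Sequence[bytes]) -> List[int]:
        self.calls += 1
        # Placeholder "inference": return the payload size of each input.
        return [len(x) for x in inputs]


def infer_row_oriented(model: CountingModel, rows: Iterable[bytes]) -> List[int]:
    """One model invocation per row."""
    out: List[int] = []
    for row in rows:
        out.extend(model.predict([row]))
    return out


def infer_vectorized(
    model: CountingModel, rows: Iterable[bytes], batch_size: int
) -> List[int]:
    """One model invocation per batch of rows."""
    out: List[int] = []
    for batch in batched(rows, batch_size):
        out.extend(model.predict(batch))
    return out
```

With five input rows and a batch size of 2, the row-oriented path invokes the model five times and the vectorized path three times, while both return the same outputs in the same order. On a real GPU-backed model the per-invocation cost is typically what batching is meant to amortize.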

Most sub-FLIPs have no hard dependencies on one another and can be advanced in 
parallel. This umbrella seeks consensus on the overall direction only; detailed 
design and APIs are deferred to each sub-FLIP. All changes are incremental. 
Sub-FLIPs will be tracked as separate issues and linked here.


> [umbrella] FLIP-577: AI-Native Flink — Umbrella for Multimodal Data Processing
> ------------------------------------------------------------------------------
>
>                 Key: FLINK-39807
>                 URL: https://issues.apache.org/jira/browse/FLINK-39807
>             Project: Flink
>          Issue Type: New Feature
>            Reporter: Guowei Ma
>            Assignee: Guowei Ma
>            Priority: Major


